The description of the MPEG workflow highlights the role of quality assessment across the entire MPEG standard life cycle: when issuing a Call for Evidence (CfE) or a Call for Proposals (CfP), when carrying out Core Experiments (CE) and when executing Verification Tests.
We should consider, however, that in 30 years the scope of the word “media” has changed substantially.
Originally (1989-90) the media types tested were Standard Definition (SD) 2D rectangular video and stereo audio. Today the video data types also include High Definition (HD), Ultra High Definition (UHD), Stereoscopic Video, High Dynamic Range (HDR) and Wide Colour Gamut (WCG), and Omnidirectional (Video 360).
Video can now be 2D, but also multiview and 3D: stereoscopic, 3 degrees of freedom + (3DoF+), 6 degrees of freedom (6DoF) and various forms of light field. Audio has evolved to different forms of Multichannel Audio and 6DoF. Recently, Point Clouds have been added to the media types for which MPEG has applied subjective quality measurements to develop compression standards.
This chapter looks inside the work that goes into subjectively assessing the quality of media compression.
Preparing for the tests
Even before MPEG decides to issue a CfE or CfP for compression of some type of media content, viewing or listening sessions may take place to appreciate the value of a proposal. By the time a Call is issued, MPEG has already reached a fairly clear understanding of the use cases and requirements (at the time of a CfE) or their final version (at the time of a CfP).
The first step is securing appropriate test sequences. Sequences may already be in the MPEG Content Repository, may be offered spontaneously by members, or may be obtained from industry representatives by issuing a Call for Content.
Selection of test sequences is a critical step because MPEG needs sequences that are suitable for the media type and representative of the use cases. Moreover, test sequences must be available in sufficient number for MPEG to carry out meaningful and realistic tests.
By CfE or CfP time, MPEG has also decided which standard the responses to the CfE or CfP should be tested against. For example, in the case of HEVC the comparison was with AVC, and in the case of VVC the comparison was with HEVC. In the case of Internet Video Coding (IVC) the comparison was with AVC. When such a reference standard does not exist (as was the case for, e.g., all layers of MPEG-1 Audio and for Point Cloud Compression), MPEG uses the codec built during the exploratory phase, which groups together state-of-the-art tools.
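Although this chapter focuses on subjective assessment, comparisons between a new codec and its reference standard are often summarised objectively with the Bjøntegaard-Delta rate (BD-rate) metric: the average bitrate difference at equal quality, obtained by fitting rate-quality curves and integrating over the common quality range. A minimal sketch follows; the rate-PSNR points are invented for illustration.

```python
import numpy as np

def bd_rate(anchor, test):
    """Bjøntegaard-Delta rate: average bitrate difference (%) of `test`
    vs `anchor` at equal quality. Each argument is a list of
    (bitrate_kbps, psnr_db) points."""
    r1 = np.log(np.array([r for r, _ in anchor]))
    p1 = np.array([p for _, p in anchor])
    r2 = np.log(np.array([r for r, _ in test]))
    p2 = np.array([p for _, p in test])
    # Fit log-rate as a cubic polynomial of PSNR (Bjøntegaard's method)
    f1 = np.polyfit(p1, r1, 3)
    f2 = np.polyfit(p2, r2, 3)
    # Integrate both fits over the overlapping PSNR interval
    lo, hi = max(p1.min(), p2.min()), min(p1.max(), p2.max())
    int1 = np.polyval(np.polyint(f1), hi) - np.polyval(np.polyint(f1), lo)
    int2 = np.polyval(np.polyint(f2), hi) - np.polyval(np.polyint(f2), lo)
    avg_log_diff = (int2 - int1) / (hi - lo)
    return (np.exp(avg_log_diff) - 1) * 100

# Invented example: the test codec reaches the same PSNR at 20% lower rate
anchor = [(1000, 34.0), (2000, 36.5), (4000, 38.5), (8000, 40.0)]
test = [(800, 34.0), (1600, 36.5), (3200, 38.5), (6400, 40.0)]
print(f"{bd_rate(anchor, anchor):.2f}%")  # a curve against itself: ~0%
print(f"{bd_rate(anchor, test):.2f}%")    # -20.00%: same quality, 20% less rate
```

A negative BD-rate means the test codec needs less bitrate than the anchor for the same quality; such figures complement, but do not replace, the subjective scores described in this chapter.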
Once the test sequences have been selected, the experts in charge of the reference software are asked to run the reference software encoder and produce the “anchors”, i.e. the sequences encoded using the “old” standard against which proposals are to be compared. The anchors are made available on an FTP site so that anybody intending to respond to the CfE or CfP can download them.
The set-up used to generate the anchors is documented in “configuration files” for each class of submission; in the case of video, classes are defined by format (SD/HD/UHD, HDR, 360º) and design conditions (low delay or random access). In order to produce comparable data, all proponents must obviously use the same configuration files when they encode the test sequences using their own technologies.
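For illustration, a random-access anchor configuration in the style of the HEVC reference software (HM) contains key-value entries such as the following; the specific values shown here are invented, not taken from an actual Call:

```
#======== Coding structure (illustrative values) =============
FrameRate             : 60     # frames per second of the source
FramesToBeEncoded     : 300    # number of frames to encode
IntraPeriod           : 64     # random-access point interval
GOPSize               : 16     # group-of-pictures size
QP                    : 37     # base quantisation parameter
InternalBitDepth      : 10     # 10-bit internal processing
```

Fixing such parameters for everyone ensures that differences in the decoded sequences can be attributed to the coding technologies rather than to the encoding conditions.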
As logistical considerations play a key role in the preparation of quality tests, would-be proponents must submit formal statements of intention to respond to the CfE or CfP to the Requirements group chair (currently Jörn Ostermann), the Test group chair (currently Vittorio Baroncini) and the chair of the relevant technology group, two months before the submission deadline.
At the meeting preceding the one at which responses to the CfE/CfP are due, an Ad hoc Group (AhG) is established with the mandate of promoting awareness of the Call in the industry, carrying out the tests, drafting a report and submitting conclusions on the quality tests to the following MPEG meeting.
Carrying out the tests
The actual tests are entirely carried out by the AhG under the leadership of AhG chairs (typically the Test chair and a representative of the relevant technology group).
Proponents send the Test chair their files containing encoded data, on hard disk drives or via an FTP site, by the deadline specified in the CfE/CfP.
When all drives are received, the Test chair performs the following tasks:
- Acquire special hardware and displays for the tests (if needed)
- Verify that the submitted files are all on disk and readable
- Assign submitted files to independent test labs (sometimes as many as 10 test labs are concurrently involved in a test run)
- Make copies and distribute the relevant files to the test labs
- Specify the tests or provide the scripts for the test labs to carry out.
The test labs carry out a first run of the tests and provide their results for the Test chair to verify. If necessary, the Test chair requests another test run or even visits the test labs to make sure that the tests will run properly. When this “tuning” phase has been successfully executed, all test labs run the entire set of tests assigned to them using test subjects. Tens of “non-expert” subjects may be involved for days.
Test results undergo a critical review according to the following steps:
- The Test chair collects all results from all test labs, performs a statistical analysis of the data, prepares and submits a final report to the AhG
- The report is discussed in the AhG and may be revised depending on the discussions
- The AhG draws and submits its conclusions to MPEG along with the report
- Report and conclusions are added to all the material submitted by proponents
- The Requirements group and the technology group in charge of the media type evaluate the material and rank the proposals. The material may not be made public because of the sensitivity of some of the data.
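The statistical analysis mentioned above typically reduces the raw ratings for each test point to a Mean Opinion Score (MOS) with a confidence interval, which is what the final report plots for anchors and proposals. A minimal sketch, assuming ratings on a 5-point scale and a normal-approximation 95% confidence interval (the ratings below are invented):

```python
import math

def mos_with_ci(scores):
    """Return (MOS, 95% confidence-interval half-width) for one test point.
    `scores` is the list of ratings (e.g. 1-5) given by the subjects."""
    n = len(scores)
    mean = sum(scores) / n
    # Unbiased sample variance
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    # Normal approximation: 95% CI half-width = 1.96 * s / sqrt(n)
    ci = 1.96 * math.sqrt(var / n)
    return mean, ci

# Invented ratings from 20 subjects for one encoded sequence
scores = [4, 5, 4, 4, 3, 5, 4, 4, 5, 3, 4, 4, 5, 4, 3, 4, 5, 4, 4, 4]
mos, ci = mos_with_ci(scores)
print(f"MOS = {mos:.2f} ± {ci:.2f}")  # prints "MOS = 4.10 ± 0.28"
```

In practice the analysis also includes subject screening (discarding raters whose scores are inconsistent with the panel), following standard subjective-assessment methodologies.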
This signals the end of the competition phase and the beginning of the collaboration phase.
Other media types
The process above has been described with rectangular 2D or 360º video specifically in mind. Most of it applies to other media types as well, with some specific actions for each, e.g.:
- 3DTV: for the 3D HEVC tests, polarised glasses (as in 3D movies) and autostereoscopic displays were used;
- 3DoF+: a common synthesiser was used in the 3DoF+ tests to synthesise views that are not available at the decoder;
- Audio: in general subjects need to be carefully trained for the specific tests;
- Point Clouds: videos generated by a common presentation engine, in which point clouds are animated by a script (rotating the object and viewing it from different viewpoints), were tested for quality as if they were natural videos (NB: no established method to assess the quality of point clouds existed before; the subjective tests were shown to converge to the same results as the objective measurements).
Verification tests are executed with a similar process. Test sequences are selected and compressed by experts running the reference software for the “old” and the “new” standard, and subjective tests are carried out as in the CfE/CfP tests. Test results are made public to provide the industry with guidance on the performance of the new standard. See, as examples, the Verification Test Report for HDR/WCG Video Coding Using HEVC Main 10 Profile and the MPEG-H 3D Audio Verification Test Report.
Quality tests play an enabling role in all phases of development of a media compression standard. For 30 years MPEG has succeeded in mobilising – on a voluntary basis – the necessary organisational and human resources to perform this critical task.
This chapter has provided a window on an aspect of MPEG life that is little known but instrumental in offering industry the best technology, so that users can enjoy the best media experience.