The challenging goals that MPEG set to itself 3 decades ago have been successfully achieved across many generations of digital media standards implemented in countless products, applications and services by many different industries in interoperable ways. How could this happen?
A significant part of the answer lies in the fact that MPEG has developed a philosophy based on the notion of generic multi-industry standards that it has thoroughly applied. The purpose of this chapter is to examine the different aspects of that philosophy:
- Standards need a business model argues that MPEG standardisation is a (non-commercial) business and needs a business model to be successful and long-lasting;
- Common standards for all countries and industries argues that the transition from vertical businesses requires a process change;
- Designing common standards for different users argues that industry-agnostic standardisation needs new rules;
- Technology standards as toolkits highlights the problem that common standards cannot be more burdensome than industry-specific standards;
- Standards for the market, not the other way around argues that industry-agnostic standardisation is incompatible with the imposition of market-crowned standards;
- Standards that anticipate the future argues that industry-agnostic standardisation can only serve its purpose if it looks to standards that satisfy future needs;
- Compete and collaborate argues that industry-agnostic standardisation requires competition to get the best technologies and collaboration to refine and improve them;
- Industry-friendly standards sets some basic rules to make sure that standards do not restrict but enhance the role of individual companies;
- Audio and video come together sets the obvious rule that audio and video compression should be part of the same standardisation effort;
- A glue to keep audio and video together argues that audio-visual systems are made of pieces that need glue to work properly together;
- Integrated standards as toolkits identifies the need for users to have integrated solutions but also to cherry pick individual components of an integrated solution;
- One step at a time sets the strategy to respond to challenges in small steps;
- Separate the wheat from the chaff identifies the need for industry components to test implementation for conformance to an industry-agnostic standard;
- Technology is always on the move identifies the need for a standardisation process where new technology may make some standards obsolete;
- Research for MPEG standards identifies the need for an organic relationship between research and MPEG standardisation;
- Standards as enablers, not disablers argues that industry members have the right to have their legitimate requests satisfied by industry-agnostic standardisation;
- Never stop working together claims that to achieve standards developed collaboratively in a timely fashion, collaborative work should not be limited to official meetings;
- The nature and borders of compression claims that compression is the enabling technology but its nature and borders change with time.
MPEG is not engaged in a commercial business, but it does have a business model that has guided its existence as a group and driven its standard-development work.
When MPEG started, there were committees who developed standards that avoided the use of essential patents. A successful example of this approach is the well-known JPEG standard (ISO/IEC 10918).
The field of video coding, however, was (and is even more so today) very different because industry and academia had worked on video compression technologies for some 3 decades and had filed many patents (at that time they could already be counted by the thousands) covering a wide range of basic video coding aspects. A video coding standard for which only Option 1 declarations were made (what we call here an “Option 1 standard”) was certainly possible but would probably have been unattractive because of its low performance compared with state-of-the-art codecs.
Therefore, MPEG decided that it would develop standards with the best performance, without consideration of the IPR involved. Such standards would be widely adopted by the market and patent holders would get royalties from their use. If a patent holder did not want to allow that to happen, they could make an Option 3 declaration and MPEG would remove the technology.
MPEG’s success in the last 30 years proves that its business model has worked as designed. More than that, most patent holders have been and keep on re-investing the royalties they get from existing standards in more technologies for future standards.
The MPEG “business model” has created a standard-producing machine (MPEG) that feeds itself with new technologies.
In the world of analogue technologies, a combination of scarce availability of broadband communication, deliberate policies and the natural separation between industries that until then had had little in common favoured the definition of country-based or industry-based standards.
In the late 1980’s many industries, regions and countries had realised that the state of digital technologies had made enough progress to enable a switch from analogue to digital.
The first steps toward digital video by countries and industries showed that the logic of the analogue world was still ruling: different countries and industries tried their own ways independently. Several companies had developed prototypes, regional initiatives were attempting to develop formats for specific countries and industries, some companies were planning products and some standards organisations were actually developing standards for their industries.
MPEG jumped onto the scene at a time when the different trials had not had time to solidify, and the epochal analogue-to-digital transition gave MPEG a unique opportunity to execute its plan. The MPEG proposal of a generic standard, i.e. a common technology for all industries, caught the attention because it offered global interoperability, created global markets – in geography and across industries – and placed the burden of developing the costly VLSI technology on an industry accustomed to doing that. Today the landscape has changed beyond recognition, yet what was a revolutionary idea then is now taken for granted.
MPEG knew that it was technically possible to develop generic standards that could be used in all countries of the world and in all industries that needed compressed digital media. MPEG saw that all actors affected – manufacturers, service providers and end users – would gain if such generic standards existed. However, before treading its adventurous path, MPEG did not know whether it was procedurally possible to achieve that goal. But it gambled and succeeded.
Clarifying to oneself the purpose of an undertaking is a good practice that should apply to any human endeavour. This good practice is even more necessary when a group of like-minded people work on a common project – a standard. When the standard is not designed by and for a single industry but by and for many, keeping this rule is vital for the success of the effort. When the standard involves disparate technologies, whose practitioners are not even accustomed to talk to one another, complying with this rule is a prerequisite.
Starting from its early days MPEG has developed a process designed to achieve the common understanding that underpins the technical work: 1) describe the environment (context and objectives), 2) single out a set of exemplary uses of the target standard (use cases), and 3) identify requirements.
MPEG does not have an industry of its own. Therefore, MPEG uses its Requirements subgroup to develop generic requirements and to interact with trade associations and standards organisations of the main industries affected by its standards. Communicating its plans, the progress of its work and the results achieved more actively than other groups is vital to MPEG, which uses a large number of tools to achieve that goal (see MPEG communicates).
The network of liaisons and, sometimes, joint activities (e.g., as in Genome Compression) is the asset that has allowed MPEG to achieve many of its goals, certainly the most challenging ones.
The basic media compression technology should be shared by all industries, but individual industries do not necessarily need the same functions or the same performance. Therefore, industry-agnostic standardisation needs to be aware of all requirements of all the industries the standard will serve, but must have the means to flexibly allocate technology and performance to one industry without encumbering other, unconcerned industries. MPEG has made its standards toolkit-based and has successfully applied Profiles and Levels, a notion that had been developed in the 1980’s in the context of Open System Interconnection (OSI), to create industry-specific standard configurations with a high level of interoperability.
Standards are ethereal entities, their impact may be unpredictable but, when it is there, it can be very concrete. This was true and already well understood in the world of analogue media. At that time a company that had developed a successful product would try to get a “standard” stamp on it, share the technology with its competitors and enjoy the economic benefits of its “standard” technology.
MPEG reshuffled the existing order of steps. Instead of waiting for the market to decide which technology would win – an outcome that very often had little to do with the value of the technology or the product – MPEG offered its standard development process where the collaboratively defined “best standard” is developed and assessed by MPEG experts who decide which individual technology wins on the basis of criteria defined a priori. Then the technology package – the standard – developed by MPEG is taken over by the industry.
At a given time MPEG standards are consistently the best standards. Those who have technologies selected to be part of MPEG standards reap the benefits and most likely will continue investing in new technologies for future standards.
When MPEG was specifying MPEG-1, the technology to implement the standard was not available. MPEG made a bet that industry would be able to design circuits on silicon that could execute the complex operations and build products of which there was no concrete evidence but only guesses: interactive video on CD (not so educated) and digital audio broadcasting (much touted). Ironically, neither product really took off, at least in that time frame, but other products that relied on MPEG-1 technologies – Video CD and MP3 – were (the former) and still are (the latter) extremely successful.
When technology moves fast, or actually accelerates, waiting is a luxury no one can afford. MPEG-1 and MPEG-2 were standards whose enabling technologies were already considered by some industries and MPEG-4 (started in 1993) was a bold and successful attempt to bring media into the IT world (or the other way around). That it is no longer possible to wait is shown by MPEG-I, the current undertaking where MPEG is addressing standards for interfaces that are still shaky or just hypothetical. Having standards that lead – as opposed to trail – technology is a tough trial-and-error game. However, it is the only one possible for digital media. MPEG has practiced this game for the last 30 years.
The alternative is to stop making standards for digital media: if MPEG waits until market needs are clear, the market is already full of incompatible solutions and there is no room left for standards.
Anticipating market needs is in the DNA of MPEG standards. With each of its standards MPEG is betting that a certain standard technology will be adopted. In Standards and uncertainty we show how some MPEG standards are extremely successful and others less so.
MPEG favours competition to the maximum extent possible. This is achieved by calling for solutions that must be comprehensively described, i.e. without black boxes, to qualify for consideration. MPEG’s line-up of aggressive “judges” (meeting participants, especially other proponents) assess the merit of proposed technologies.
Extending competition beyond a certain point, however, is counterproductive and prevents the group from reaching the goal with the best results.
MPEG develops and uses a software platform – called Test Model – that assembles the candidate components selected by the “judges” and on which participants can work on improving the different areas of the Test Model.
Core Experiments is the tool that allows experts to improve the Test Model by adding step by step the software that implements the accepted technologies. Core Experiments were first defined in March 1992 as “a technical experiment where the alternatives considered are fully documented as part of the test model, ensuring that the results of independent experimenters are consistent”. This definition applies unchanged to the work being done today.
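The accept/reject logic behind a core experiment can be pictured with a small sketch. This is not MPEG software: the function names, the bit counts and the acceptance threshold are all hypothetical, and real experiments compare rate-distortion curves rather than single numbers. The sketch only illustrates the principle that a proposed tool is measured against the Test Model anchor under criteria agreed beforehand.

```python
# Illustrative sketch (not MPEG software): comparing a Test Model anchor
# against the anchor plus a proposed coding tool. All names, numbers and
# the threshold are hypothetical; real core experiments use agreed test
# sequences and rate-distortion metrics such as BD-rate.

def average_saving(anchor_bits, test_bits):
    """Average per-sequence bitrate saving (%) at matched quality."""
    savings = [(a - t) / a * 100 for a, t in zip(anchor_bits, test_bits)]
    return sum(savings) / len(savings)

def accept_tool(anchor_bits, test_bits, threshold=0.5):
    """Accept the proposed tool only if its average saving exceeds an
    agreed threshold - e.g. to avoid adopting IP for marginal gains."""
    return average_saving(anchor_bits, test_bits) >= threshold

# Hypothetical bit counts for three test sequences at matched quality
anchor = [1000, 2000, 1500]
with_tool = [980, 1950, 1485]
print(accept_tool(anchor, with_tool))  # → True (saves ~1.8% on average)
```

The threshold parameter also illustrates the idea mentioned later in this chapter: placing a lower limit on the performance improvement a tool must bring before it is adopted.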
If the MPEG mission is to provide the best standards to industry via competition, why should MPEG standards be shielded from it? Probably the earliest example of application of this principle is provided by MPEG-2 part 3 (Audio). When backward compatibility requirements did not allow the standard to yield the performance of algorithms not constrained by compatibility, MPEG issued a Call for Proposals and developed MPEG-2 part 7 (Advanced Audio Coding). Later the algorithms evolved and became the now ubiquitous MPEG-4 AAC. Had MPEG not made this decision, probably we would still have MP3 everywhere, but no other MPEG Audio standards. The latest example is Essential Video Coding (EVC), a standard not designed to offer the best performance, but a good performance with good licensability prospects.
Working on generic standards means that reasonable requests – the best unconstrained multi-channel audio quality in this case – cannot be dismissed. MPEG tried to achieve that with the technology it was working on – backward-compatible multichannel audio coding – and failed. The only way to respond to the request was to work on a new technology.
Making MPEG standards friendly to a set of disparate industries is a difficult task, but a necessity that MPEG has managed in 3 ways:
- Display formats: Since the appearance of television cameras and displays in the 1920’s, industry and governments have created tens of television formats, mostly around the basic NTSC, PAL and SECAM families. Even in the late 1960’s, when the Picturephone service was deployed, AT&T invented a new 267-line format, with no obvious connection with any of the existing video formats. As MPEG wanted to serve all markets, it decided that it would just support any display format, leaving display formats outside of MPEG standards.
- Serving one without encumbering others. Industry may like the idea of sharing the cost of an enabling technology but not at the cost of compromising individual needs. MPEG standards share the basic technology but provide the necessary flexibility to the many different users of MPEG standards with the notion of Profiles and Levels. With Profiles MPEG defines subsets of general interoperability, with Levels it defines different grades of performance within a Profile.
- Standards apply only to decoders; encoders are implicitly defined and have ample margins of implementation freedom. By restricting standardisation to the decoding functionality, MPEG extends the life of its standards and, at the same time, allows industry players to compete on the basis of their constantly improved encoders.
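The Profiles-and-Levels mechanism described above can be sketched as a capability check a decoder might perform. This is an illustrative sketch, not text from any MPEG specification: the profile names, the chosen performance parameters and the numeric limits are assumptions made for the example (loosely patterned on video level tables).

```python
# Illustrative sketch (not from any MPEG specification): a profile is a
# subset of coding tools; a level caps performance parameters within a
# profile. Names and limits below are assumptions for the example.

LEVEL_LIMITS = {  # level -> (max luma samples/picture, max luma samples/sec)
    3.0: (552_960, 16_588_800),
    4.0: (2_228_224, 66_846_720),
    5.0: (8_912_896, 267_386_880),
}

def decoder_supports(dec_profiles, dec_max_level, bs_profile, bs_level,
                     pic_samples, samples_per_sec):
    """A decoder can play a stream if it implements the stream's profile,
    its maximum level is at least the stream's level, and the stream
    actually stays within that level's limits."""
    if bs_profile not in dec_profiles or bs_level > dec_max_level:
        return False
    max_pic, max_rate = LEVEL_LIMITS[bs_level]
    return pic_samples <= max_pic and samples_per_sec <= max_rate

# A "Main"-profile, level-4.0 decoder playing a 1920x1080 @ 30 Hz stream
print(decoder_supports({"Main"}, 4.0, "Main", 4.0,
                       1920 * 1080, 1920 * 1080 * 30))  # → True
```

The design choice this illustrates: one shared toolkit, with each industry picking the profile/level pair that fits its cost and performance constraints while remaining interoperable with everyone who implements the same pair.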
Because of the way audio and video industries had developed – audio for a century and video for half a century – people working on the corresponding technologies tended to operate in almost “watertight” compartments, be they in academia, research or companies. That attitude had some justification in the analogue world because the relevant technologies were indeed different and there was not so much added value in keeping the technologies together, considering the big effort that would be needed to keep the experts together.
However, the digital world no longer justified keeping the two domains separate, because the technologies had so much in common. That is why MPEG, just 6 months after its first meeting, kicked off the Audio subgroup, successfully assembling the best experts in a few months of work.
This injection of new technologies with the experts carrying them was not effortless. When transformed into digital form, audio and video signals are bits and bits and bits, but the sources are still different and influence how they are compressed. Audio experts shared some high-level compression technologies – Subband and Discrete Cosine Transform – but video is (was) a time-changing 2D signal, often with “objects” in it, while audio is (was) a time-changing 1D signal. More importantly, audio experts were driven by concerns other than those of video experts, such as the way the human hearing process handles the data coming out of the frequency analysis performed by the cochlea and other niceties of human hearing.
The audio work has never been “dependent” on the video work. MPEG audio standards can have a stand-alone use (i.e. they do not assume that there is a video associated with them), but there are very few uses of MPEG video standards that do not need an MPEG Audio standard. So it was necessary to keep the two together, and it is even more important to do so now, when video and audio are both time-dependent 3D signals and users are going to actually create their own experiences.
Having audio and video together does not necessarily mean that audio and video will play together in the right way if they are stored or transmitted over a channel.
The fact that MPEG established a Digital Storage Media subgroup and a Systems subgroup 18 months after its foundation signals that MPEG has always been keenly aware that a bitstream composed of MPEG audio and video bitstreams needs to be transported to be played back as intended by the bitstream creator. In MPEG-1 it was a bitstream in a controlled environment, in MPEG-2 a bitstream in a noisy environment, from MPEG-4 times on it was on IP, and in MPEG-DASH it had to deal with the unpredictability of the Internet Protocol in the real world.
Throughout its existence the issue of multiplexing and transport formats has shaped MPEG standards. Without a Systems subgroup, efficiently compressed audio and video bitstreams would have remained floating in space without a standard means to plug them into real systems.
Six months after its inception, MPEG had not only realised that digital media is not just video (although this is the first component that catches attention) but also audio (no less challenging and with special quality requirements). In 12 months, it had realised that bits do not float in the air: a stream of bits needs some means to adapt it to the mechanism that carries it (in MPEG-1 the CD-ROM). If the transport mechanism is analogue (as it was 25 years ago and, to some extent, still is today), the adaptation is even more challenging. Later MPEG also realised that a user interacts with the audio-visual bits they receive (even though it is so difficult to understand what exactly is the interaction that the user wants). With its MPEG-2 standard MPEG was able to provide the industry with a complete Audio-Video-Systems (and DSM-CC) solution whose pieces could also be used independently.
That was possible because MPEG could attract, organise and retain the necessary expertise to address such a broad problem area and provide not just a solution that worked, but the best that technology could offer at the time.
By the time it developed its earliest standards such as MPEG-1 and MPEG-2, MPEG had to assemble disparate technology competences that had probably never worked together in a project. With its example, MPEG has promoted the organisational aggregation of audio and video research in many institutions where the two were separate.
When it developed MPEG-4 (a standard with 34 parts), MPEG assembled its largest ever number of competences, ranging from audio and video to scene description, Hardware Description Languages, fonts, timed text and more. MPEG keeps competences organisationally separate in different MPEG subgroups. However, it retains the flexibility to combine and deploy the needed resources to respond to specific needs.
Most MPEG standards are composed of the 3 key elements – audio, video and systems – that make an audio-visual system and some, such as MPEG-4 and MPEG-I, even include 3D Graphic information and the way to combine all the media. However, the standards allow maximum usage flexibility:
- A standard can be directly used as a complete solution, e.g. in VCD, where Systems, Video and Audio are all used
- The components of the standard can be used individually, e.g. in ATSC A/53, where Systems and Video are from MPEG and Audio is from an external source
- The standard does not specify a technology but only an interface to different implementations of the technology, e.g. in the case of MPEG-I, for which MPEG will likely not standardise a Scene Description but just indicate how externally defined technologies can be plugged into the system
- A standard does not specify the solution but only the components of a solution, e.g. in the case of Reconfigurable Video Coding (RVC), where a non-standard video codec can be assembled using an MPEG standard.
MPEG wants to satisfy the needs of all customers, even those who do not want to use its standards but other specifications. MPEG standards can signal how an external technology can be plugged into a set of other native MPEG technologies. With one caveat: the customer has to take care of the integration of the external technology. That MPEG will not do.
Before MPEG came to the fore many players were trying to be “first” and “impose” their early solutions to other countries, industries or companies. If the newly born MPEG had proposed itself as the developer of an ambitious generic digital media technology standard applicable to all industries, the proposal would have been seen as far-fetched and most likely the initiative would have gone nowhere.
Instead, MPEG started with a moderately ambitious project: a video coding standard for interactive applications on digital storage media (CD-ROM) at a rather low bitrate (1.5 Mbit/s) targeting the market covered by the video cassette (VHS/Beta) with the addition of interactivity.
Moving one step at a time has been MPEG policy for MPEG-1 and all its subsequent standards.
In human societies parliaments make laws and tribunals decide if a specific human action conforms to the law. In certain regulated environments (e.g. terrestrial broadcasting in many countries) there are standards and entities (authorised test laboratories) who decide whether a specific implementation conforms to the standard. MPEG has neither but, in keeping with its “industry-neutral” mission, it provides the technical means – namely, tools for conformance assessment, e.g. bitstreams and reference software – for industries to use in case they want to establish authorised test laboratories for their own purposes.
Technology is always on the move
The Greek philosopher Heraclitus is reported to have said: τὰ πάντα ῥεῖ καὶ οὐδὲν μένει (everything flows and nothing stays). Digital technologies do more than that, they not only do not stay, but move fast and actually accelerate.
MPEG is well aware that the technology landscape is constantly changing, and this awareness informs its standards. Until HEVC – one can even say, including the upcoming Versatile Video Coding (VVC) standard – video meant coding a rectangular area (in MPEG-4, a flat area of any shape, in HEVC it can be a video projected on a sphere). The birth of immersive visual experiences is not without pain, but they are happening and MPEG must be ready with solutions that take this basic assumption into account. This means that, in the technology scenario that is taking shape, the MPEG role of “anticipatory standards” is ever more important and challenging to achieve.
Digital media is one of the fastest-evolving digital technology areas because most of the developers of good technologies incorporated in MPEG standards invest the royalties earned from previous standards to develop new technologies for new standards. As soon as a new technology shows interesting performance – which MPEG assesses by issuing Calls for Evidence (CfE) – or the context changes, offering new opportunities, MPEG swiftly examines the case, develops requirements and issues Calls for Proposals (CfP).
This has happened for most of its video and audio compression standards. A paradigmatic case of a standard addressing a change of context is MPEG Media Transport (MMT) that MPEG designed having in mind a broadcasting system for which the layer below it is IP, unlike MPEG-2 Transport Stream, originally designed for a digitised analogue channel (but also used for transport over IP as in IPTV).
MPEG is not in the research business. However, without research there would be no MPEG. The MPEG work plan is a great promoter of corporate/academic research because it pushes companies to improve their technologies to enable them to make successful responses to CfPs.
One of the reasons of MPEG success, but also of the difficulties highlighted in this book, is that MPEG standardisation is a process closer to research than to product design.
Roughly speaking, in the MPEG standardisation process, research happens in two phases:
- In companies, in preparation for CfEs or CfPs (MPEG calls this competitive phase)
- In MPEG in what is called collaborative phase, i.e. during the development of Core Experiments (of course this research phase is still done by the companies, but in the framework of an MPEG standard under development).
The MPEG collaborative phase offers another opportunity to do more research. This apparently has a more limited scope, because it is in the context of optimising a subset of the entire scope of the standard, but the sum of many small optimisations can provide big gains in performance. The shortcoming of this process is the possible introduction of a large number of IP items for a gain that some may well consider insufficient to justify the added IP onus and complexity. With its MPEG-5 project MPEG is trying to see if a suitably placed lower limit on performance improvements can help solve the problems identified in the HEVC standard.
Rethinking what we are
MPEG started as a “club” of Telecommunication and Consumer Electronics companies. With MPEG-2 the “club” was enlarged to Terrestrial and Satellite Broadcasting, and Cable TV concerns. With MPEG-4, IT companies joined forces. Later, a large number of research institutions and academia joined (today they account for ~25% of the total membership). With MPEG-I, MPEG faces new challenges because the demand for standards for immersive services and applications is there, but technology immaturity deprives MPEG of its usual “anchors”.
Thirty years ago, MPEG invented itself and, subsequently, morphed to adapt to changed conditions while keeping its principles intact. If MPEG is able to continue doing as it has done in the last 30 years, it can continue to support the industry it serves in the future, no matter the changes of context.
MPEG standards are not “owned” by a specific industry. Therefore MPEG, keeping faith to its “generic standards” mission, tries to accommodate all legitimate functional requirements when it develops a new standard. MPEG assesses each requirement for its merit (value of functionality, cost of implementation, possibility to aggregate the functionality with others etc.). Profiles and Levels are then used to partition the application space in response to specific industry needs.
The same happens if an industry comes with a legitimate request to add a functionality to an existing standard. The decision to accept or reject a request is only driven by the value brought by the proposal, as substantiated by use cases, not because an industry gets an advantage or another is penalised.
Development of basic technology is a private job, but collaboration is mandatory when requirements are defined or once a Test Model is available. This happens easily during meetings, but meetings are short events surrounded by periods when official collaboration is not possible. Since its early days, MPEG has made massive use of ad hoc groups to progress collaborative work, with limitations such as that the work of an ad hoc group is confined to preparing recommendations to MPEG.
What is the meaning of compression? Is it “fewer bits is always good” or can it also be “as few meaningful bits as possible is also good”? The former is certainly desirable but, as the nature of information consumption changes and compression digs deeper into the nature of information, compressed representations that offer easier access to the information embedded in the data become more valuable.
What is the scope of application of MPEG compression? When MPEG started the MPEG-1 standards work, the gap between the telecom and the CE industries (the first two industries in attendance at that time) was as wide as that between the media industry and, say, the genomic industry today. Both are digital now and the dialogue gets easier.
With patience and determination MPEG has succeeded in creating a common language and mindset in the media industries. This is an important foundation of MPEG standards. The same amalgamation can continue and be achieved – not in a day – between MPEG and other industries.