5.4 – The ecosystem drives MPEG standards

Standards making changes with time

In days long gone, standardisation in what we would today call the “media industry” followed a rather simple process. A company wishing to attach a “standard” label to a product that had succeeded in the market made a request to a standards committee whose members, typically from companies in the same industry, had an interest in getting an open specification of what had until then been a closed proprietary system. A good example is the video cassette player: two products from two different companies, ostensibly offering the same functionality – VHS and Betamax – were approved by the same standards organisation – the International Electrotechnical Commission (IEC) – and by the same committee, SC 60 B at that time.

Things were a little different in the International Telecommunication Union (ITU) where ITU-T (then called CCITT) had a Study Group where the telecommunication industry – represented by the Post and Telecommunication Administrations of the member countries, at that time the only ones admitted to the committee – requested a standard (called recommendation in the ITU) for digital telephony speech. ITU-T ended up with two different specifications in the same standard: one called A-law and the other called µ-law.

In ITU-R (then called CCIR) National Administrations were operating, or had authorised various entities to operate, television broadcasting services (some had even started services before WW II) and were therefore unable to settle on even a limited number of television systems. The only thing they could do was to produce a document called Report 624 Television Systems that collected the 3 main television systems (NTSC, PAL and SECAM) with tens of pages where country A selected, e.g., a different frequency or a different tolerance of the colour subcarrier than country B or C.

Not unaware of past failures of standardisation, and taking advantage of the radical technology discontinuity, MPEG took a different approach to standardisation, which can be summed up in the expression “one functionality – one tool”. To apply this expression to the example of ITU-T’s A-law – µ-law dichotomy, if MPEG had to decide on a standard for digital speech, it would:

  1. Develop requirements
  2. Select speech samples to be used for tests
  3. Issue a Call for Proposals (CfP)
  4. Run the selected test speech with the proposals
  5. Subjectively assess the quality
  6. Check the proposals for any issue such as complexity etc.
  7. Create a Test Model with the proposals
  8. Create Core Experiments (CE)
  9. Iterate the Test Model with the results of CEs
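  10. Produce the Working Draft (WD), Committee Draft (CD), Draft International Standard (DIS) and Final Draft International Standard (FDIS)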

The process would be long – overkill in this case because a speech digitiser is just a simple analogue-to-digital (A/D) converter – but not necessarily longer than waiting for a committee to decide on competing proposals with the goal of selecting only one. MPEG’s result would be a single standard providing seamless bitstream interoperability without the need to convert speech from one format to another when speech moves from one environment (country, application etc.) to another.
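To make the cost of having two formats concrete, here is a minimal Python sketch of the conversion step that a single standard would avoid. It uses the continuous companding curves commonly associated with µ-law and A-law, with the usually quoted parameters µ = 255 and A = 87.6. This is an illustration under those assumptions, not the segmented 8-bit encodings that the ITU-T recommendation actually specifies, and the function names are made up for the example.

```python
import math

MU = 255.0  # commonly quoted mu-law parameter (assumption for this sketch)
A = 87.6    # commonly quoted A-law parameter (assumption for this sketch)


def mu_law_compress(x: float) -> float:
    """Continuous mu-law companding of a sample x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1.0 + MU) ** abs(y) - 1.0) / MU, y)


def a_law_compress(x: float) -> float:
    """Continuous A-law companding of a sample x in [-1, 1]."""
    ax = abs(x)
    if ax < 1.0 / A:
        y = A * ax / (1.0 + math.log(A))                    # linear segment near zero
    else:
        y = (1.0 + math.log(A * ax)) / (1.0 + math.log(A))  # logarithmic segment
    return math.copysign(y, x)


def a_law_expand(y: float) -> float:
    """Inverse of a_law_compress."""
    ay = abs(y)
    if ay < 1.0 / (1.0 + math.log(A)):
        x = ay * (1.0 + math.log(A)) / A
    else:
        x = math.exp(ay * (1.0 + math.log(A)) - 1.0) / A
    return math.copysign(x, y)


def mu_to_a(y_mu: float) -> float:
    """Transcode a mu-law value to A-law: expand with one law, compress with the other."""
    return a_law_compress(mu_law_expand(y_mu))


sample = 0.1
print("mu-law:", mu_law_compress(sample))
print("A-law: ", a_law_compress(sample))
print("mu-law transcoded to A-law:", mu_to_a(mu_law_compress(sample)))
```

The same input sample maps to different coded values under the two laws, so speech crossing from a µ-law environment to an A-law one has to pass through something like `mu_to_a`. With a single format that step would not exist.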

If the 10 points listed above were all there was to it, the MPEG process would not be much more complex than the ITU’s. The real difference is that MPEG, because it serves a large number of industries, does not have the mindset of the telecom industry that decided on A-law and µ-law digital speech 50+ years ago.

Where the MPEG difference lies

MPEG is different because it would address speech digitisation taking into consideration the needs of a range of other industries who intend to use the standard and hence want to have a say in how it is made: Consumer Electronics (CE), Information Technology (IT), broadcasting, telecommunications and more. Taking into account so many views is a burden for those developing the standard, but the standard eventually produced is abstracted from the small (or big) needs that are specific to individual industries. Profiles and Levels allow an industry not to be overburdened by technologies that were introduced to satisfy requirements from other industries and that are irrelevant (and possibly costly) to that industry. Those who need the functionality, no matter what the cost, can get it with different profiles and levels. Figure 10 depicts how MPEG has succeeded in its role of “abstracting” the needs of the client “digital media” industries it currently serves.
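The role of Profiles and Levels can be sketched in a few lines of Python. The sketch assumes the usual reading of the two terms: a profile is a subset of the standard's tools that a conforming decoder must support, and a level is a set of upper bounds on stream parameters. All profile names, level names, tool names and numeric limits below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    tools: frozenset  # coding tools a decoder of this profile must support


@dataclass(frozen=True)
class Level:
    name: str
    max_samples_per_second: int
    max_bitrate_kbps: int


@dataclass(frozen=True)
class Stream:
    tools_used: frozenset
    samples_per_second: int
    bitrate_kbps: int


def can_decode(profile: Profile, level: Level, stream: Stream) -> bool:
    """True if a decoder conforming to (profile, level) can handle the stream."""
    return (
        stream.tools_used <= profile.tools
        and stream.samples_per_second <= level.max_samples_per_second
        and stream.bitrate_kbps <= level.max_bitrate_kbps
    )


# Hypothetical profiles and levels, not taken from any MPEG standard.
SIMPLE = Profile("simple", frozenset({"intra", "inter"}))
RICH = Profile("rich", frozenset({"intra", "inter", "scalable"}))
LOW = Level("low", max_samples_per_second=10_000_000, max_bitrate_kbps=4_000)
HIGH = Level("high", max_samples_per_second=60_000_000, max_bitrate_kbps=20_000)

basic_stream = Stream(frozenset({"intra", "inter"}), 8_000_000, 3_000)
scalable_stream = Stream(frozenset({"intra", "scalable"}), 40_000_000, 15_000)

print(can_decode(SIMPLE, LOW, basic_stream))
print(can_decode(SIMPLE, LOW, scalable_stream))
print(can_decode(RICH, HIGH, scalable_stream))
```

An industry that only needs the `simple` profile never pays for the `scalable` tool, while an industry that needs it picks the richer profile and a higher level.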

Figure 10: MPEG and its client “digital media” industries

Figure 10, however, does not describe all the ecosystem actors. In MPEG-1 the Consumer Electronics industry was typically able to develop by itself the technology needed to make products that used the MPEG-1 standard. With MPEG-2 this was less the case and independent companies offering encoding and decoding chips sprang up. Today the industry implementing (as opposed to using or selling products based on) MPEG standards has grown to be a very important element of the MPEG ecosystem. This industry typically provides components to companies who actually manufacture a complete product (sometimes this happens inside the same company, but the logic is the same).

MPEG standards can be implemented using various combinations of software, hardware and hybrid software/hardware technologies. The choice for hardware is very wide: from various integrated circuit architectures to analogue technologies. The latter choice is for devices with extremely low power consumption, although with limited compression. Just about to come are devices that use neural networks. Other technologies are likely to find use in the future, such as quantum computing or even genomic technologies.

The MPEG component industries

Figure 11 identifies 3 “layers” in the MPEG ecosystem.

The red arrows show the flow of Requirements and Standards and the violet arrows show the flow of Implementation requests and Implementations.

Figure 11: MPEG, its Client and Implementation Industries

Client industries in need of a standard provide requirements. However, the “Implementation layer” industries, examples of which have been provided above, also provide requirements. The MPEG layer eventually develops standards that are fed to the Client Industry layer that requested them, but also to the Implementation layer. Requests to implement a standard are generated by companies in the Client Industry layer and directed to companies in the Implementation layer, which eventually deliver the implementations to the companies requesting them. Conformance testing typically plays a role in checking that an implementation conforms to the standard.

How the MPEG process takes place

Figure 11 does not fully describe the MPEG ecosystem. More elements are provided by Figure 12 which also describes how the MPEG process actually takes place.

Figure 12: The MPEG process in the MPEG ecosystem

The new elements highlighted by Figure 12 are

  1. The MPEG Toolkit assembling all technologies that have been used in MPEG standards
  2. The MPEG Competence Centres mastering specific technology areas and
  3. The Technology industries providing new technologies to MPEG by responding to CfPs.

In the early days the Technology Industries did not have a clear identity and could be considered part of the Client and Implementation Industries. Today, as highlighted above, the providers of basic technologies are well identified and separate industries that may not implement or use the standards.

Using Figure 12 it is possible to describe how the MPEG process unfolds (the elements of the MPEG ecosystem are in italic).

  1. MPEG receives a request for a standard from a Client Industry (or more than one)
  2. The Requirements Competence Centre develops requirements by interacting with Client Industries and Implementation Industries
  3. MPEG issues CfPs (Calls for technologies in the figure)
  4. Technology Industries respond to CfPs by submitting technologies
  5. MPEG mobilises appropriate Competence Centres
  6. Competence Centres develop standards by selecting/adapting submitted technologies and existing technologies (drawn from the toolkit)
  7. MPEG updates the toolkit with new technologies.

MPEG’s role cannot be described by the simple “Standards Provider – Client Industry” relationship. MPEG is a complex ecosystem that works because all its entities play the role that is proper to them.

MPEG standards are glued together

MPEG started with the idea of a standard for video compression and soon it became clear that an audio compression standard was needed at the same time because in most applications video without audio is not really useful. Then it also became clear that playing audio and video after delivery in the way intended by the creator did not come for free and so came the need for Systems standards. Moving on, a File Format also became a necessity and today, when MPEG develops the MPEG-I standard, a lot more technologies are found necessary to hold the pieces of the system together.

MPEG has devised an organisation of work that allows it to deploy the necessary level of expertise in specific technology areas, e.g. in video and audio coding and file format. At the same time, however, the organisation allows it to identify where interfaces between different media subsystems are needed so that users of the standard do not meet unpleasant surprises when they integrate solutions.

Figure 13 is a good model of how most MPEG standards are developed. Different groups with different competences develop different parts of a standard, say, MPEG-I. Some parts are designed to work together with others in systems identified in the Context-Objectives-Use cases phase. However, many parts are not tightly bound because in general it is possible to use them separately. In other cases, there are parts of different origin that must work tightly together and here is where MPEG provides the “glue” by using ad hoc groups, joint meetings, chairs’ meetings etc.

Figure 13: Structure of MPEG standards

This work method was developed and refined over the years and has served well to provide industry with standards that can be used as individual components or as a system.

Here are examples of how standards for different purposes have achieved the goal:

  1. MPEG-7 – Multimedia content description interface – was driven by the idea of a world rich in audio-video-multimedia descriptors that would allow users to navigate the large amount of media content expected at that time and that we have today. Content descriptors were expressed in verbose XML, a tool at odds with MPEG’s bit-thrifty purposes. So MPEG developed the first standard for XML compression, a technology adopted in many fields which is consistently used by all MPEG-7 descriptors.
  2. MPEG-A – Multimedia Application Formats – is notable for the Common Media Application Format (CMAF) standard. Several technologies drawn from different MPEG standards are restricted and integrated to enable efficient delivery of large scale, possibly protected, video applications, e.g. streaming of televised events. CMAF Segments can be delivered once to edge servers in content delivery networks, then accessed from cache by thousands of streaming video players without additional network backbone traffic or transmission delay.
  3. MPEG-V – Media context and control – is another typical example. The work was initiated in the wake of the success of Second Life, a service with virtual objects that looked like it could take over the world. The purpose of Part 4 of MPEG-V, Virtual world object characteristics, was not to standardise a Second Life-like service but the interfaces that would allow a user to move assets from one virtual space to another. Other parts of MPEG-V concern formats and interfaces to enrich the audio-visual user experience with, say, a breeze when there is a little wind in the movie, a smell when you are in a field of violets etc. So far, this apparently exciting extension of the user experience in a virtual world has not taken off, but MPEG-V provides a very solid communication framework for sensors and actuators that finds use in other standards.
  4. MPEG-H – High efficiency coding and media delivery in heterogeneous environments – is another integrated standard of which a user can decide to take only the video part (HEVC) or the audio part (3D Audio). Another part – MPEG Media Transport (MMT) – shows how it is possible to innovate without destabilising existing markets. MPEG-2 Transport Stream (TS) has been in use for 25 years and will continue to be used for the foreseeable future. But MPEG-2 TS shows its age because it was designed for a one-way channel – an obvious choice 25 years ago – while so much video distribution today happens on two-way channels. Therefore, MMT uses IP transport instead of MPEG-2 TS transport and achieves content delivery unification in both one-way and two-way distribution channels.
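The delivery pattern mentioned for CMAF in item 2, where a segment reaches an edge server once and is then served from cache to many players, can be sketched as a toy Python model. The `Origin` and `EdgeCache` classes, the segment identifier and the player count are all hypothetical. The sketch models only the caching behaviour and says nothing about the CMAF format itself.

```python
class Origin:
    """Stands in for the origin server; counts how often it is contacted."""

    def __init__(self) -> None:
        self.fetches = 0

    def get(self, segment_id: str) -> bytes:
        self.fetches += 1
        return f"payload-of-{segment_id}".encode()


class EdgeCache:
    """Stands in for an edge server that keeps every segment it has fetched."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.store: dict[str, bytes] = {}

    def get(self, segment_id: str) -> bytes:
        if segment_id not in self.store:
            # Only the first request for a segment crosses the backbone.
            self.store[segment_id] = self.origin.get(segment_id)
        return self.store[segment_id]


def serve_players(edge: EdgeCache, segment_id: str, players: int) -> None:
    """Simulate many players requesting the same segment from one edge server."""
    for _ in range(players):
        edge.get(segment_id)


origin = Origin()
edge = EdgeCache(origin)
serve_players(edge, "segment-0001", players=1000)
print("origin fetches:", origin.fetches)
```

In this model the number of origin fetches depends on the number of distinct segments, not on the number of players, which is the property the text attributes to CMAF Segments.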

Conclusions

MPEG is a complex ecosystem that has successfully operated for decades serving the needs of the growing number of its component industries. Everything human is perfectible, but whoever wants to lay their hands on MPEG should remember the following:

  1. Dividing MPEG by Client Industries would mean losing the commonality of technologies;
  2. Dividing MPEG by Implementation Industries would make no sense, because in principle any MPEG standard must be implementable with several different technologies;
  3. Dividing MPEG by Competence Centres would mean losing the interactions between them.