8.1 – Forty years of video compression

Introduction

This chapter tells the story of how compressed digital video standards gave rise to the ever-improving digital video experience that ever more people around the world enjoy.

First Video Coding Standard

The first international standard that used video coding techniques – ITU-T Recommendation H.120 – originated from the European research project called COST 211. H.120 was intended for video-conference services, especially on satellite channels. It was approved in 1984, but only a limited number of units were ever implemented.

Second Video Coding Standard

The second international standard that used video coding techniques – ITU-T Recommendation H.261 – was intended for audio-visual services and was approved in 1988. This project signalled the maturity of video coding standardisation, which left the old and inefficient algorithms behind and entered the DCT/motion compensation age.

For several reasons H.261 was implemented by a limited number of manufacturing companies for a limited number of customers.

Third Video Coding Standard

Television broadcasting has always been a socially important communication tool and, despite challenges, remains so today. Unlike audio-visual services, which were mostly a strategic target of the telecom industry, television broadcasting in the 1980s was a thriving business served by the Consumer Electronics (CE) industry, which provided devices to hundreds of millions of consumers.

ISO MPEG-1, the third international standard that used video coding techniques, was intended for interactive video applications on CD-ROM. The MPEG-1 standard was released by MPEG in November 1992. Besides the declared goal, the intention was to popularise video coding technologies by relying on the manufacturing prowess of the CE industry. MPEG-1 was the first example of a video coding standard developed by two industries that until then had had very little in common: telecom and CE (terminals for the telecom market were developed by a specialised industry with few contacts with the CE industry).

Fourth Video Coding Standard

Even though in the late 1990s MPEG-1 Video eventually reached 1 billion units sold under the name “Video CD”, especially in the Far East, the big game started with the fourth international standard that used video coding techniques – ISO MPEG-2 – whose target was “digital television”. The number of industries interested in it made MPEG a crowded WG. Telecom had always sought a role in television. CE was obviously interested in having existing analogue TV sets replaced by shiny digital TV sets, or at least supplemented by a set top box. Satellite broadcasters and cable operators were very keen on the idea of hundreds of TV programs in their bouquets. Terrestrial broadcasters had different strategies in different regions but eventually joined, as did the package media sector of the CE industry, with its tight contacts with the movie industry.

This explains why the official title of MPEG-2 is “Generic coding of moving pictures and associated audio information”: it signals that MPEG-2 could be used by all the industries that, at that time, had an interest in digital video, a unique feat in the industry.

Fifth and Sixth Video Coding Standards

Remarkably, MPEG-2 Video and Systems were specifications jointly developed by MPEG and ITU-T. The world, however, follows the dictum of the Romance of Three Kingdoms (三國演義): 話說天下大勢.分久必合,合久必分. Adapted to the context this can be translated as “in the world things divided for a long time shall unite, things united for a long time shall divide”. So the MPEG and ITU paths divided in the following phase. ITU-T developed its own H.263 Recommendation “Video coding for low bit rate communication” and MPEG developed its own MPEG-4 Visual standard (Part 2 of MPEG-4, “Coding of audio-visual objects”). The only link between the two standards is a very small code that simply tells the decoder whether a bitstream is H.263 or MPEG-4 Visual. The two share many coding tools, but not at the bitstream level.

H.263 focused on low bitrate video communication, while MPEG-4 Visual continued to realise the vision of extending video coding to more industries: this time Information Technology and Mobile. MPEG-4 Visual was released in 2 versions in 1999 and 2000, while H.263 went through a series of updates documented in a series of Annexes to the H.263 Recommendation. H.263 enjoyed some success thanks to the common belief that it was “royalty free”, while MPEG-4 Visual suffered a devastating blow from a patent pool that decided to impose “content fees” in its licensing terms.

Seventh Video Coding Standard

The year 2001 marked the return to the first half of the Romance of Three Kingdoms’ dictum: 分久必合 (things separated for a long time shall unite), even though it had not been too 久 (long time) since they had divided, certainly not on the scale intended by the Romance of Three Kingdoms. MPEG and ITU-T (through its Video Coding Experts Group – VCEG) joined forces again in 2001 and produced the seventh international video coding standard in 2003. The standard is called Advanced Video Coding by both MPEG and ITU, but is labelled as AVC by MPEG and as H.264 by ITU-T. “Reasonable” licensing terms ensured AVC’s long-lasting success in the marketplace, which continues to this day.

Eighth Video Coding Standard

The eighth international standard dealing with video coding stands by itself because it is not a standard with “new” video coding technologies, but a standard that enables a device to build a decoder matching the bitstream, using standardised tools represented in a standard form available at the decoder. The technique is called Reconfigurable Video Coding (RVC) or, more generally, Reconfigurable Media Coding (RMC), because MPEG has applied the same technology to 3D Graphics Coding as well. It is enabled by two standards: ISO/IEC 23001-4 Codec configuration representation and ISO/IEC 23002-4 Video tool library (VTL). The former defines the methods and general principles to describe codec configurations. The latter describes the MPEG VTL and specifies the Functional Units that are required to build a complete decoder for the following standards: MPEG-4 Simple Profile, AVC Constrained Baseline Profile and Progressive High Profile, MPEG-4 SC3DMC, and HEVC Main Profile.

Ninth Video Coding Standard

In 2010 MPEG and VCEG extended their collaboration to a new project: High Efficiency Video Coding (HEVC). A few months after the HEVC FDIS had been released, the HEVC Verification Tests showed that the standard had achieved a 60% improvement over AVC, 10% more than originally planned. Since then, HEVC has been enriched with a number of features that previous standards did not support at the time of their development, such as High Dynamic Range (HDR) and Wide Colour Gamut (WCG), as well as support for Screen Content and omnidirectional video (video 360). Unfortunately, technical success did not translate into full market success because adoption of HEVC is still hampered – 6 years after its approval by MPEG – by an unclear licensing situation.

Tenth Video Coding Standard

The target of MPEG standards until AVC had always been “best performance no matter what IPR is involved” (provided, of course, that IPR holders allow it). As the use of AVC extended to many domains, however, it was becoming clear that there was so much “old” IP (i.e. more than 20 years old) that it was technically possible to make a standard whose IP components were all Option 1, i.e. available free of charge under the ISO/IEC/ITU patent policy.

In 2013 MPEG released the FDIS of WebVC, strictly speaking not a new standard because MPEG had simply extracted what was the Constrained Baseline Profile of AVC and made it a separate standard with the intention of making it Option 1. The attempt failed because some companies confirmed their Option 2 patent declarations already made against the AVC standard.

Eleventh Video Coding Standard

WebVC has not been the only effort made by MPEG to develop an Option 1 video coding standard (i.e. a standard for which only Option 1 patent declarations have been made). A second effort, called Internet Video Coding (IVC), was concluded in 2017 with the release of the IVC FDIS. Verification Tests showed that IVC exceeded the performance of the best profile of AVC, by then a 14-year-old standard. Three companies made Option 2 patent declarations that did not contain any detail about the allegedly infringed technologies. Therefore, MPEG could not remove the technologies in IVC that the companies claimed infringed their patents.

Twelfth Video Coding Standard

MPEG achieved a different result with its third attempt at developing an Option 1 video coding standard. The proposal made by a company in response to an MPEG Call for Proposals was reviewed by MPEG and achieved FDIS stage with the name of Video Coding for Browsers (VCB). However, a company made an Option 3 patent declaration that, like those made against IVC, did not contain any detail that would enable MPEG to remove the allegedly infringing technologies. Eventually ISO did not publish VCB.

Today ISO and IEC no longer allow companies to make Option 3 patent declarations without details (something that ITU had never allowed). As the VCB approval process has been completed, the study of VCB cannot be resumed unless MPEG restarts the process. Therefore, VCB is likely to remain unpublished and hence not an ISO standard.

Thirteenth Video Coding Standard

For the fourth time MPEG and ITU are collaborating on the development of a new video coding standard, with the target of a 50% reduction of bitrate compared to HEVC. The development of Versatile Video Coding (VVC), as the new standard is called, is still under way and involves some 250 experts attending VVC sessions. MPEG expects VVC to reach the FDIS stage in July 2020 for the key compression engine. Other components, such as high-level syntax or SEI messages, will likely be released later.
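The generation-over-generation targets quoted in this chapter compound. The short Python sketch below illustrates the arithmetic only. It assumes that the “60% improvement” of HEVC over AVC and the 50% target of VVC over HEVC both mean a bitrate reduction at comparable visual quality; the function name and constants are hypothetical and chosen for illustration.

```python
def relative_bitrate(reductions):
    """Return the bitrate relative to the starting codec (1.0) after
    applying a chain of fractional bitrate reductions."""
    rate = 1.0
    for reduction in reductions:
        rate *= 1.0 - reduction
    return rate


# Assumed figures, taken from the text and read as bitrate reductions
# at comparable quality (an interpretation, not a measured result).
HEVC_VS_AVC = 0.60  # HEVC verification tests versus AVC
VVC_VS_HEVC = 0.50  # VVC design target versus HEVC

hevc = relative_bitrate([HEVC_VS_AVC])
vvc = relative_bitrate([HEVC_VS_AVC, VVC_VS_HEVC])

print(f"HEVC needs about {hevc:.0%} of the AVC bitrate")
print(f"VVC would need about {vvc:.0%} of the AVC bitrate")
```

Under these assumptions a stream that needs 10 Mbit/s in AVC would need roughly 4 Mbit/s in HEVC and roughly 2 Mbit/s in VVC if the target is met.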

Fourteenth Video Coding Standard

Thirteen is a large number for video coding standards but this number should be measured against the number of years covered – close to 40. In this long period of time we have gone from 3 initial standards that were mostly application/industry-specific (H.120, MPEG-1 and H.261) to a series of generic (i.e. industry-neutral) standards (MPEG-2, MPEG-4 Visual, MPEG-4 AVC and HEVC) and then to a group of standards that sought to achieve Option 1 status (WebVC, IVC and VCB). Other proprietary video coding formats that have found significant use in the market point to the fact that MPEG cannot stay forever in its ivory tower of “best video coding standards no matter what”. MPEG has to face the reality of a market that becomes more and more diversified and where – unlike the golden age of a single coding standard – there is no longer one size that fits all.

At its 125th meeting MPEG reviewed the responses to its Call for Proposals on a new video coding standard, which sought proposals with a simplified coding structure and an accelerated development time of 12 months from working draft to FDIS. The new standard will be called MPEG-5 Essential Video Coding (EVC) and is expected to reach FDIS in April 2020.

The new video coding project will have a base layer/profile that is expected to be Option 1, with compression performance about 30% better than AVC’s, and a second layer/profile that already performs about 25% better than HEVC. Licensing terms are expected to be published by patent holders within 2 years.

VCEG has decided not to work with MPEG on this coding standard. Are we back to the 合久必分 (things combined for a long time must split) situation? Only partly, because the MPEG-VCEG collaboration on VVC is continuing. In any case, VVC targets compression performance 50% better than HEVC’s.

Fifteenth Video Coding Standard

If proof were needed that there is no longer “one size fits all” in video coding, just look at the new project that MPEG has started working on, called “Low Complexity Enhancement Video Coding (LCEVC)”. It is not a “new video codec”, but a technology capable of extending the capabilities of an existing video codec. A typical usage scenario is the addition of, say, high definition capability to deployed set top boxes that cannot be recalled. LCEVC is expected to reach FDIS in July 2020.

Sixteenth Video Coding Standard

Point Clouds are not really the traditional “video” content as we know it, namely sequences of “frames” at a frequency that is sufficiently high to fool the eye into believing that the motion is natural. In point clouds, motion is given by dynamic point clouds that represent the surface of objects moving in the scene. For the eye, however, the end-result is the same: moving pictures displayed on a 2D surface, whose objects can be manipulated by the viewer (to do this, however, a system layer is required, and MPEG is already working on it).

MPEG is working on two different technologies: the first one uses HEVC to compress projections of portions of a point cloud (and is therefore well-suited for entertainment applications because it can rely on an existing HEVC decoder) and the second one uses computer graphics technologies (and is currently more suited to automotive and similar applications). The former will become FDIS in January 2020 and the latter in April 2020.

Seventeenth and Eighteenth Video Coding Standards

Unfortunately, the crystal ball gets blurred as we move into the future. MPEG tries to face this reality by investigating several technologies capable of providing solutions for alternative immersive experiences. After providing HEVC and OMAF for 3DoF experiences (where the user can only have roll, pitch, and yaw movement of the head), MPEG is working on OMAF v2 for 3DoF+ experiences (where the user can have a limited translation of the head). The video for OMAF v2 will be provided by MPEG-I Part 7 Immersive Media Metadata. As the title says, this is not about compression, but about metadata that enable the viewer not to suffer from parallax errors. The FDIS is expected in July 2020. Other investigations concern 6DoF (where the user can have full translation of the head) and light field.

Conclusions

The last 40 years have seen digital video converted from a dream into a reality that involves billions of users every day, with an incredible lowering of the threshold to access video creation. This long ride is represented in the figure below, which also ventures into the next steps of the ride.

Figure – 30 years of video coding at MPEG

Legend: yellow = ITU-T only, green = MPEG only, turquoise = joint with ITU-T

MPEG keeps working to make sure that manufacturers and content/services providers have access to more and better standard visual technologies for an increasingly diversified market of increasingly demanding users.

 
