The preceding chapter presented the intense 40-year history of ITU and MPEG video compression standards. This chapter focuses on how, over the years, MPEG standards have offered more functionalities in addition to video compression, and how the next generations of standards will add even more.
Table 10 gives an overview of all MPEG video compression standards – past, present and planned. Those in italic have not reached Final Draft International Standard (FDIS) level.
Table 10 – MPEG video coding standards and functionalities
In 1988 MPEG started its first video coding project, for interactive video applications on compact disc (MPEG-1). Input video was assumed to be progressive (25/29.97 Hz, although other frame rates were also supported) and spatial resolution was Source Image Format (SIF), i.e. 240 or 288 lines of 352 pixels each, depending on whether the original video had 525 or 625 lines. The syntax supported spatial resolutions up to 16 Kpixels. Progressive scanning is a feature that all MPEG video coding standards have supported since MPEG-1. The (obvious) exception is point clouds because there are no “frames”.
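The SIF figures above can be checked with a little arithmetic. The following is a minimal sketch, assuming nominal frame rates of 30 and 25 Hz (the 525-line rate is 29.97 Hz in practice) and the 16x16 luma macroblock used by MPEG-1; the names `sif_summary` and `NOMINAL_FRAME_RATE` are illustrative only.

```python
# SIF picture dimensions as stated in the text: 352 pixels per line,
# 240 lines (525-line source) or 288 lines (625-line source).
SIF_WIDTH = 352
SIF_LINES = {"525-line": 240, "625-line": 288}

# Assumption for illustration: nominal frame rates of 30 and 25 Hz.
NOMINAL_FRAME_RATE = {"525-line": 30, "625-line": 25}

MACROBLOCK_SIZE = 16  # MPEG-1 codes luma in 16x16 macroblocks


def sif_summary(system: str) -> dict:
    """Return pixel and macroblock counts for one SIF variant."""
    lines = SIF_LINES[system]
    pixels = SIF_WIDTH * lines
    macroblocks = (SIF_WIDTH // MACROBLOCK_SIZE) * (lines // MACROBLOCK_SIZE)
    return {
        "pixels_per_frame": pixels,
        "macroblocks_per_frame": macroblocks,
        "pixels_per_second": pixels * NOMINAL_FRAME_RATE[system],
    }


for name in SIF_LINES:
    print(name, sif_summary(name))
```

Under these assumptions both variants yield the same luma pixel rate, because the variant with fewer lines runs at the higher frame rate.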
In 1990 MPEG started its second video coding project – MPEG-2 – targeting digital television. Therefore, the input was assumed to be interlaced (frame rate of 50/59.94 Hz, but it also supported other frame rates) and spatial resolution was standard/high definition, and up. The resolution space was quantised by means of levels, the second dimension after profiles. MPEG-4 Visual and Advanced Video Coding (AVC) are the two last standards with specific interlace tools. An attempt was made to introduce interlace tools in High Efficiency Video Coding (HEVC) but the technologies presented did not show appreciable improvements when compared with progressive tools. HEVC does have some indicators (SEI/VUI) to tell the decoder that the video is interlaced.
Scalability, multiview and higher chroma resolution
MPEG-2 was the first standard to tackle scalability (High Profile), multiview (Multiview Profile) and higher chroma resolution (4:2:2 Profile). Several subsequent video coding standards (MPEG-4 Visual, AVC and HEVC) also support these features. Versatile Video Coding (VVC) is expected to do the same, though probably not in version 1.
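To make "higher chroma resolution" concrete, the sketch below counts the samples in one frame for the three common chroma formats. It is an illustration only: the function name `samples_per_frame` and the 720x576 example size are assumptions, not taken from any standard text.

```python
# Horizontal and vertical subsampling factors of each chroma plane
# relative to luma, for the three common chroma formats.
CHROMA_DIVISORS = {
    "4:2:0": (2, 2),  # chroma halved horizontally and vertically
    "4:2:2": (2, 1),  # chroma halved horizontally only
    "4:4:4": (1, 1),  # chroma at full luma resolution
}


def samples_per_frame(width: int, height: int, chroma_format: str) -> int:
    """Total luma plus chroma samples in one frame."""
    h_div, v_div = CHROMA_DIVISORS[chroma_format]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)  # two chroma planes
    return luma + chroma


for fmt in CHROMA_DIVISORS:
    print(fmt, samples_per_frame(720, 576, fmt))
```

Moving from 4:2:0 to 4:2:2 doubles the number of chroma samples, which raises the total sample count by one third.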
Video objects and error resilience
MPEG-4 Visual supports coding of video objects and error resilience. The first feature has remained specific to MPEG-4 Visual. Most video codecs allow for some error resilience (e.g. with the introduction of slices in MPEG-1). However, MPEG-4 Visual – mobile communication being one relevant use case – was the first to specifically consider error resilience as a tool. MPEG-2 first tried to develop 10-bit support and the empty part 8 of MPEG-2 is what is left of that attempt.
Wide Colour Gamut, High Dynamic Range, 3 Degrees of Freedom
WCG, HDR and 3DoF were first introduced in HEVC, later added to AVC, and are planned to be supported in VVC as well. WCG allows a wider gamut of colours to be displayed. HDR allows pictures to be displayed with brighter regions and with more visible detail in dark areas. SCC achieves better compression of non-natural (synthetic) material such as characters and graphics. 3DoF (also called Video 360) allows pictures projected on a sphere to be represented.
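As an illustration of what "pictures projected on a sphere" involves, the sketch below maps a viewing direction to a position in a rectangular picture. It assumes the equirectangular layout, which is one common projection and is not named in the text; the function name and the 3840x1920 picture size are hypothetical.

```python
import math


def sphere_to_equirect(longitude: float, latitude: float,
                       width: int, height: int) -> tuple:
    """Map a direction on the sphere (in radians) to picture coordinates.

    longitude is in [-pi, pi), latitude in [-pi/2, pi/2].
    The picture centre corresponds to longitude 0, latitude 0.
    """
    u = (longitude + math.pi) / (2 * math.pi) * width
    v = (math.pi / 2 - latitude) / math.pi * height
    return u, v


# Looking straight ahead lands in the middle of the picture.
print(sphere_to_equirect(0.0, 0.0, 3840, 1920))
```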
More than 8 quantisation bits
AVC supports more than 8 quantisation bits, up to 14 bits. HEVC supports up to 16 bits. VVC, EVC and LCEVC are also expected to support more than 8 quantisation bits.
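The practical effect of more quantisation bits is a larger number of distinct sample values. A minimal sketch of the arithmetic, with the hypothetical helper name `code_values`:

```python
def code_values(bit_depth: int) -> int:
    """Number of distinct sample values representable with bit_depth bits."""
    return 1 << bit_depth


for bits in (8, 10, 14, 16):
    levels = code_values(bits)
    print(f"{bits:2d} bits: {levels} values, maximum code {levels - 1}")
```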
WebVC was the first MPEG attempt at defining a video coding standard that would not require a licence involving payment of fees (Option 1 in ISO language, although the actual legal language is more complex than this). Strictly speaking, WebVC is not a new standard because MPEG simply extracted what was the Constrained Baseline Profile in AVC (originally, AVC tried to define an Option 1 profile but did not achieve the goal and did not define the profile) with the hope that WebVC could achieve Option 1 status. The attempt failed because some companies confirmed the Option 2 patent declarations (i.e. a licence is required to use the standard) that they had already made against the AVC standard. The brackets in the figure convey this fact.
Video Coding for Browsers (VCB) is the result of a proposal made by a company in response to an MPEG Call for Proposals for Option 1 video coding technology. Another company made an Option 3 patent declaration (i.e. unavailability to license the technology). As the declaration did not contain any detail that could allow MPEG to remove the allegedly infringing technologies, ISO did not publish VCB as a standard. The square brackets in the figure convey this fact.
Internet Video Coding (IVC) is the third video coding standard intended to be Option 1. Three Option 2 patent declarations were received, and MPEG has declared its availability to remove patented technology from the standard if specific technology claims are made. The brackets convey this fact.
Two-layer video coding
Finally, Essential Video Coding (EVC), part 1 of MPEG-5 (the project has not yet been formally approved by ISO), is expected to be a two-layer video coding standard. The EVC Call for Proposals requested that the technologies provided in response to the Call for the first (lower) layer of the standard be Option 1. Technologies for the second (higher) layer are Option 2. The curly brackets in the figure convey this fact.
Low Complexity Enhancement Video Coding (LCEVC) is another two-layer video coding standard. Unlike EVC, however, in LCEVC the lower layer is not tied to any specific technology and can be any video codec. The goal of the second layer is to extend the capability of an existing video codec. A typical usage scenario is to give a large number of already deployed standard definition set top boxes, which cannot be recalled, the ability to decode high definition pictures.
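The layering idea can be sketched with a toy example. This is not the LCEVC algorithm: the "base codec" here is a stand-in (coarse quantisation of a half-resolution signal), the signal is one-dimensional, and the enhancement residual is stored losslessly for simplicity. All function names are hypothetical.

```python
def downsample(signal):
    """Keep every second sample (half resolution)."""
    return signal[::2]


def upsample(signal):
    """Nearest-neighbour upsampling back to full resolution."""
    return [value for value in signal for _ in range(2)]


def base_codec(signal, step=8):
    """Stand-in for any existing codec: coarse quantisation."""
    return [(value // step) * step for value in signal]


def encode(original):
    """Produce a base layer and an enhancement-layer residual."""
    base = base_codec(downsample(original))
    prediction = upsample(base)
    residual = [o - p for o, p in zip(original, prediction)]
    return base, residual


def decode_base_only(base):
    """What a legacy decoder without the enhancement layer can show."""
    return upsample(base)


def decode_enhanced(base, residual):
    """Base-layer prediction corrected by the enhancement layer."""
    return [p + r for p, r in zip(upsample(base), residual)]


original = [10, 12, 50, 52, 90, 91, 30, 28]  # even length assumed
base, residual = encode(original)
print(decode_base_only(base))
print(decode_enhanced(base, residual))
```

The point of the design is that the base layer stays decodable on its own by an unmodified codec, while a device that also understands the second layer recovers the higher-resolution signal.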
Screen Content Coding
SCC achieves better compression of non-natural (synthetic) material such as characters and graphics. It is supported by HEVC and is planned to be supported in VVC and possibly EVC.
3D point clouds
Today technologies are available to capture 3D point clouds, typically with multiple cameras and depth sensors producing up to billions of points for realistically reconstructed scenes. Point clouds can have attributes such as colours, material properties and/or other attributes and are useful for real-time communication, Geographic Information System (GIS), Computer Aided Design (CAD) and cultural heritage applications. MPEG-I part 5 will specify lossy compression of 3D point clouds employing efficient geometry and attribute compression, scalable/progressive coding, and coding of point cloud sequences captured over time with support for random access to subsets of the point cloud.
Other technologies capture point clouds, potentially with a low density of points. These allow users to freely navigate in multi-sensory 3D media spaces. Such representations require a large amount of data, which cannot be transmitted on today’s networks. Therefore MPEG is developing a second, graphics-based Point Cloud Compression (PCC) standard, as opposed to the previous one which is video-based, for efficient compression of sparse point clouds.
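The data being compressed can be pictured as a list of positions, each carrying attributes. The sketch below shows only that data model and a naive form of access to a spatial subset; it says nothing about how MPEG-I part 5 actually codes geometry or attributes, and the names `Point` and `points_in_box` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Point:
    x: float
    y: float
    z: float
    colour: Tuple[int, int, int]  # one possible attribute (RGB)


def points_in_box(cloud: List[Point],
                  lower: Tuple[float, float, float],
                  upper: Tuple[float, float, float]) -> List[Point]:
    """Return the points lying inside an axis-aligned bounding box."""
    return [
        p for p in cloud
        if all(lo <= c <= hi
               for lo, c, hi in zip(lower, (p.x, p.y, p.z), upper))
    ]


cloud = [
    Point(0.0, 0.0, 0.0, (255, 0, 0)),
    Point(1.0, 2.0, 0.5, (0, 255, 0)),
    Point(5.0, 5.0, 5.0, (0, 0, 255)),
]
print(points_in_box(cloud, (0.0, 0.0, 0.0), (2.0, 2.0, 2.0)))
```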
Three Degrees of Freedom +
3DoF+ is a term used by MPEG to indicate a usage scenario where the user can make translational movements of the head. As a user in a 3DoF scenario sees annoying parallax errors if they move their head too much, MPEG is developing a standard that specifies appropriate metadata to help present the best image for the viewer’s position if one is available, or to synthesise the missing image if it is not.
Six Degrees of Freedom and Lightfield
6DoF indicates a usage scenario where the user can freely move in a space and enjoy a 3D virtual experience that matches the one in the real world. Lightfield refers to new devices that can capture, in one shot, a spatially sampled version of a light field that has both spatial and angular light information. The captured data is not only larger than traditional camera images but also different in nature. MPEG is investigating new and compatible compression methods for potential new services.
In 30 years compressed digital video has made a lot of progress, e.g., bigger and brighter pictures with less bitrate and other features. The end point is nowhere in sight.
Thanks to Gary Sullivan and Jens-Rainer Ohm for useful comments to this chapter.