13.19 – MPEG-G – Leonardo Chiariglione

Post author: admin
Post published: May 18, 2019
Post category: MPEG book

Genomic Information Representation (MPEG-G) is a suite of specifications, developed jointly with TC 276 Biotechnology, that reduces the amount of information required to losslessly store and transmit DNA reads from high-speed sequencing machines.

An MPEG-G file can be created with the following sequence of operations:

Put the reads of the input file (aligned or unaligned) into bins corresponding to segments of the reference genome
Classify the reads in each bin into 6 classes: P (perfect match with the reference genome), M (reads with variants), etc.
Convert the reads of each bin into a subset of 18 descriptors specific to the class; for example, one class P descriptor is the start position of the read
Put the descriptors in the columns of a matrix
Compress each descriptor column (MPEG-G uses CABAC, context-adaptive binary arithmetic coding, the efficient entropy coder already present in several video coding standards)
Put the compressed descriptors of each class in a bin into an Access Unit (AU), for a maximum of 6 AUs per bin
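
The sequence of operations above can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the MPEG-G algorithm: it handles aligned reads that fit inside the reference, it distinguishes only two of the six classes (P and a simplified M that allows substitutions only), it uses four made-up descriptors (`pos`, `len`, `mm_offsets`, `mm_bases`) rather than the 18 of the standard, it uses an arbitrary `BIN_SIZE`, and it substitutes `zlib` for CABAC. All function names are hypothetical.

```python
import ast
import zlib
from collections import defaultdict

BIN_SIZE = 8  # assumption: arbitrary bin width, in reference bases


def classify(seq, ref, pos):
    """Toy classifier: 'P' if the read matches the reference exactly, else 'M'."""
    return "P" if ref[pos:pos + len(seq)] == seq else "M"


def descriptors(seq, ref, pos, cls):
    """Turn one read into class-specific descriptors (illustrative names only)."""
    row = {"pos": pos, "len": len(seq)}
    if cls == "M":
        # Class M also needs the substitutions relative to the reference.
        mism = [(i, b) for i, b in enumerate(seq) if ref[pos + i] != b]
        row["mm_offsets"] = [i for i, _ in mism]
        row["mm_bases"] = "".join(b for _, b in mism)
    return row


def build_access_units(reads, ref):
    """reads: list of (start_position, sequence). Returns {(bin, class): AU}."""
    # Bin by start position and classify each read.
    groups = defaultdict(list)
    for pos, seq in reads:
        cls = classify(seq, ref, pos)
        groups[(pos // BIN_SIZE, cls)].append(descriptors(seq, ref, pos, cls))

    access_units = {}
    for key, rows in groups.items():
        # Arrange descriptors as the columns of a matrix, one row per read.
        columns = defaultdict(list)
        for row in rows:
            for name, value in row.items():
                columns[name].append(value)
        # Compress each column independently (zlib stands in for CABAC).
        access_units[key] = {
            name: zlib.compress(repr(values).encode())
            for name, values in columns.items()
        }
    return access_units


def decode_reads(access_units, ref):
    """Rebuild the original reads from the access units and the reference."""
    reads = []
    for (_, cls), au in access_units.items():
        cols = {n: ast.literal_eval(zlib.decompress(b).decode())
                for n, b in au.items()}
        for i, pos in enumerate(cols["pos"]):
            seq = list(ref[pos:pos + cols["len"][i]])
            if cls == "M":
                for off, base in zip(cols["mm_offsets"][i], cols["mm_bases"][i]):
                    seq[off] = base
            reads.append((pos, "".join(seq)))
    return reads


reference = "ACGTACGTACGTACGT"
sample_reads = [(0, "ACGT"), (2, "GTAC"), (9, "CGTT"), (8, "ACGT")]
units = build_access_units(sample_reads, reference)
print(sorted(units))  # the (bin, class) keys of the access units produced
```

Because each access unit holds the descriptors of one class in one bin, a decoder that needs only a region of the genome, or only the reads with variants, can decompress the relevant units and skip the rest. `decode_reads` shows that the representation is lossless in this toy setting.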

MPEG-G currently includes 6 parts:

Part 1 – Transport and Storage of Genomic Information specifies the file and streaming formats
Part 2 – Genomic Information Representation specifies the algorithm to compress DNA reads from high-speed sequencing machines
Part 3 – Genomic Information Metadata and Application Programming Interfaces (APIs) specifies the metadata and the APIs used to access an MPEG-G file
Part 4 – Reference Software and Part 5 – Conformance are the usual components of a standard
Part 6 – Genomic Annotation Representation will specify how to compress annotations
