Media Transcoding

 

Content adaptation is an important technique for maximizing the accessibility and functionality of audio-visual content in pervasive or interactive multimedia environments.

In a typical video streaming system, video sequences are encoded at high quality in advance and stored on the server. Because different networks may have different bandwidths, a gateway can include a transcoder that adapts the video bit rate in order to provide video services to users on different networks. In other words, media transcoding allows multimedia content delivery to adapt to the wide diversity of client device capabilities in communication, processing, storage, and display.

As a result, each user can receive the best possible quality for their network and device.

 

1. Media Transcoding Techniques

 

Transcoding of coded video is regarded as a down conversion process, where the bit rate of a compressed bit stream is reduced according to a given constraint.

Fluctuating quality (quality jitter) is very annoying for video viewers. Therefore, providing constant or smoothly varying quality is an important goal for video vendors.

 

1)      SNR scaling

The objective of bit-rate reduction is to reduce the bit rate while maintaining low complexity and achieving the best possible quality. Typical techniques are requantization and coefficient dropping.
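A minimal sketch of the two techniques named above, assuming a uniform quantizer and a block whose coefficient levels are already listed in scan order. The function names `requantize` and `drop_coefficients` and the step sizes are illustrative assumptions, not part of any codec API.

```python
def requantize(levels, q_in, q_out):
    """Map levels coded with step q_in onto a coarser step q_out.

    Assumes a plain uniform quantizer (no dead zone, no weighting matrix).
    """
    out = []
    for level in levels:
        reconstructed = level * q_in                  # inverse quantization
        sign = -1 if reconstructed < 0 else 1
        magnitude = abs(reconstructed)
        # Integer round-to-nearest of magnitude / q_out.
        out.append(sign * ((2 * magnitude + q_out) // (2 * q_out)))
    return out


def drop_coefficients(levels, keep):
    """Zero every coefficient after the first `keep` in scan order."""
    return [lv if i < keep else 0 for i, lv in enumerate(levels)]


levels = [12, -7, 3, 1, 0, -1]       # hypothetical levels of one block
coarser = requantize(levels, q_in=4, q_out=8)
truncated = drop_coefficients(levels, keep=3)
```

Requantization lowers the precision of every coefficient, while coefficient dropping removes the high-frequency tail; a real transcoder would pick the step size or cut-off from its rate budget.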

2)      Spatial resolution scaling

(e.g., using the Dynamic Resolution Conversion (DRC) tool adopted in Version 2 of the MPEG-4 standard)

Spatial resolution reduction raises two major issues. One is mapping motion vectors from the high resolution to the low resolution. The other is deciding the coding mode of the downscaled macroblocks.
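The two issues can be illustrated with a small sketch for 2:1 downscaling, where four source macroblocks collapse into one output macroblock. Averaging the four vectors and taking a majority vote on the mode are assumptions chosen for simplicity; other mappings (median, weighted average) are equally possible, and the function names are hypothetical.

```python
def map_motion_vector(mvs, factor=2):
    """Average the source motion vectors and scale to the lower resolution.

    mvs: list of (dx, dy) for the source macroblocks covering one
    output macroblock.
    """
    n = len(mvs)
    avg_x = sum(dx for dx, _ in mvs) / n
    avg_y = sum(dy for _, dy in mvs) / n
    return (avg_x / factor, avg_y / factor)


def decide_mode(modes):
    """Choose 'intra' if at least half of the source macroblocks are intra."""
    intra_count = sum(1 for m in modes if m == "intra")
    return "intra" if 2 * intra_count >= len(modes) else "inter"


source_mvs = [(4, 2), (6, 2), (4, 4), (6, 0)]   # hypothetical vectors
mapped = map_motion_vector(source_mvs)           # (2.5, 1.0)
mode = decide_mode(["inter", "inter", "intra", "inter"])
```

The mapped vector is fractional here, so a real transcoder would round it to the precision its output format supports.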

3)      Temporal resolution scaling

By frame skipping

i. Frame rate reduction: (uniform frame dropping)

ii. Time condensation

4)      Heterogeneous transcoding

Video format conversion (e.g., MPEG-2 ↔ MPEG-4)

 

2. Research Topic

 

2.1 Finding optimal transcoding strategy

Given resource constraints, what is the optimal adaptation operation that maximizes the quality of the adapted content? Answering this question also involves investigating more sophisticated quality metrics that model the human visual system and capture visual perception.

  1. Utility (quality) based approach
  2. Rate Control based approach
  3. Heuristic approach

Implement a specific transcoder (FD, CD) → Calculate quality or distortion for all possible transcoding cases → Find the optimal (best) case
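The exhaustive search outlined above can be sketched as follows. The rate and quality models are toy placeholders (assumptions made only so the example runs); in practice they would be measured from the transcoder output. The parameters `fd` and `cd` are hypothetical names for the fractions of frames and coefficients that are dropped.

```python
from itertools import product


def best_operation(fd_options, cd_options, rate_of, quality_of, max_rate):
    """Try every (fd, cd) pair and keep the feasible one with best quality."""
    best = None
    for fd, cd in product(fd_options, cd_options):
        if rate_of(fd, cd) > max_rate:
            continue                      # violates the resource constraint
        q = quality_of(fd, cd)
        if best is None or q > best[0]:
            best = (q, fd, cd)
    return best


# Toy models, NOT measured data: rate shrinks with both kinds of dropping,
# quality falls faster when coefficients are dropped.
def toy_rate(fd, cd):
    return 1000 * (1 - fd) * (1 - cd)


def toy_quality(fd, cd):
    return 40 - 10 * fd - 20 * cd


result = best_operation([0, 0.5], [0, 0.25, 0.5],
                        toy_rate, toy_quality, max_rate=600)
# result is (quality, fd, cd) of the best feasible case
```

Exhaustive search is only practical when the number of candidate operations is small; the utility-based, rate-control, and heuristic approaches listed above are ways of avoiding the full enumeration.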

 

2.2 Transcoder (transcoding methods)

A. Pixel Domain approach: Conventional cascade (full decoding → full re-encoding)

B. Compression Domain approach

   i. Open-loop architecture: simple requantization

   ii. Closed-loop architecture: recomputes motion-compensated prediction (MCP), reuses motion vectors (MV), and compensates for drift

In the closed-loop architecture, full-scale motion estimation is usually not performed, in order to reduce computational complexity.

→ Reuse motion vectors (by bilinear interpolation, forward dominant vector selection, or activity dominant vector selection)

→ Refine motion vectors (by full search or hierarchical search)

→ Reduce computational complexity
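As an illustration of motion vector reuse, the following is a rough sketch of forward dominant vector selection for the case where one frame has been skipped. It assumes 16x16 macroblocks, integer-pel vectors that point from a block to its reference area, and it ignores frame boundaries; the function names and the dictionary layout are hypothetical.

```python
MB = 16  # macroblock size in pixels (assumption)


def dominant_macroblock(x, y):
    """Return (col, row) of the macroblock overlapping most with the
    MB x MB area whose top-left pixel is (x, y)."""
    col0, row0 = x // MB, y // MB
    off_x, off_y = x % MB, y % MB
    best, best_area = None, -1
    for dc, w in ((0, MB - off_x), (1, off_x)):
        for dr, h in ((0, MB - off_y), (1, off_y)):
            area = w * h
            if area > best_area:
                best, best_area = (col0 + dc, row0 + dr), area
    return best


def compose_mv(mb_col, mb_row, mv, dropped_frame_mvs):
    """Extend mv across a dropped frame by adding the dominant vector.

    dropped_frame_mvs maps (col, row) to the (dx, dy) of that macroblock
    in the dropped frame; missing entries are treated as zero motion.
    """
    x = mb_col * MB + mv[0]
    y = mb_row * MB + mv[1]
    dom = dominant_macroblock(x, y)
    ddx, ddy = dropped_frame_mvs.get(dom, (0, 0))
    return (mv[0] + ddx, mv[1] + ddy)


dropped = {(2, 1): (-2, 1), (1, 1): (4, 4)}   # hypothetical vectors
new_mv = compose_mv(2, 1, (-5, 3), dropped)    # (-7, 4)
```

The composed vector is only an estimate, which is why a refinement search over a small window usually follows.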

C. Bitstream Syntax Description (BSD)

Because multimedia content comes in many different binary formats, it is very hard to adapt it to each different environment. When manipulating bitstreams in various binary formats, it is more advantageous to work with a structural description of the bitstream than with the binary data itself. Once a file describing the bitstream structure is available alongside the bitstream itself, content can be adapted by transforming this description rather than by manipulating the bitstream directly.
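The idea can be sketched with a toy description held as a list of dictionaries. This is not the actual BSD syntax; the field names (`offset`, `length`, `layer`) and the byte layout are assumptions made only for illustration.

```python
def adapt_description(description, max_layer):
    """Transform the description: keep only units up to max_layer."""
    return [unit for unit in description if unit["layer"] <= max_layer]


def generate_bitstream(source, description):
    """Rebuild a bitstream by copying the byte ranges the description names."""
    return b"".join(
        source[unit["offset"]:unit["offset"] + unit["length"]]
        for unit in description
    )


# Hypothetical bitstream and its structural description.
source = b"HDRAAAABBBBCCCC"
description = [
    {"name": "header", "offset": 0, "length": 3, "layer": 0},
    {"name": "base", "offset": 3, "length": 4, "layer": 0},
    {"name": "enh1", "offset": 7, "length": 4, "layer": 1},
    {"name": "enh2", "offset": 11, "length": 4, "layer": 2},
]

adapted = generate_bitstream(source, adapt_description(description, 1))
```

The adaptation step never parses the coded data; it only edits the description, so the same transformation logic can serve any format for which a description is available.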

 

<Fig> Transcoding System Architecture

 

D. Scalable Video Coding

Encode the video once → simply truncate the bitstream → obtain the transcoded result

- Without any impact on the coding efficiency

- Layered approach

- FGS (Fine Granularity Scalability) in MPEG-4

- Fully scalable video coding: MCTF (Motion Compensated Temporal Filtering), which is newly being investigated in MPEG SVC
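A minimal sketch of adaptation by truncation, assuming the layers are ordered from base to highest enhancement and that each layer depends on all layers before it. The layer rates, byte strings, and function names are hypothetical.

```python
def select_layers(layer_rates, budget):
    """Layered approach: keep leading layers while they fit the budget.

    Returns (number of layers kept, total rate used).
    """
    total, kept = 0, 0
    for rate in layer_rates:
        if total + rate > budget:
            break
        total += rate
        kept += 1
    return kept, total


def truncate_fgs(base, enhancement, budget_bytes):
    """Fine-granularity case: cut the enhancement layer at any byte."""
    room = max(0, budget_bytes - len(base))
    return base + enhancement[:room]


kept, used = select_layers([256, 128, 128, 256], budget=600)   # (3, 512)
stream = truncate_fgs(b"BASE", b"ENHANCEMENT", budget_bytes=9)
```

In the layered case the usable rates are limited to the layer boundaries, whereas the fine-granularity case can approach the budget much more closely.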