Maximize Perceptual Quality Using Quality Information Table by contents class
2005.02 ~ 2006.02
Multimedia services such as video streaming have become popular, and networks have become more numerous and more heterogeneous. In this environment, quality of service (QoS) has emerged as an important issue for service acceptability. For a multimedia service to be effective, the quality of the delivered video has to be acceptable even under constraints. When video content is delivered to a client, several constraints are encountered, such as limited bandwidth and device capability. The variation of network bandwidth is unpredictable at the encoding stage, so video content needs to be decodable over a dynamic range of bandwidth with the best possible video quality. In such a constrained environment, multi-dimensional scalability is beneficial from a QoS point of view. The main goal of our research is to find the optimal bit-rate allocation strategy across the three scalability layers (spatial, SNR, and temporal) of scalable video coding (SVC). To this end, we performed subjective tests of perceptual quality preference depending on video class. To find an efficient adaptation operation for scalable video, multiple quality tradeoffs (spatio-SNR-temporal quality) are considered.
The goal of this project is the following:
the semantic concepts that represent different video classes and are able to represent the characteristics of video scene

As shown in Fig. 1, the bit-rate required for a given quality preference depends on the original video's bit-rate and characteristics. We used the relative bit-rate, that is, the bit-rate of a spatio-temporal switching point normalized by the bit-rate of the original bit-stream. For example, a quality preference of CIF at 15 f/s with a corresponding relative bit-rate of 0.2 means that the quality of CIF at 15 f/s is preferred at 20% of the original video's bit-rate. If the original video's bit-rate is 1 Mbps, the quality preference for the spatio-temporal quality of CIF at 15 f/s lies at about 200 kbps.
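The normalization above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example (0.2 of 1 Mbps is 200 kbps); the function names are hypothetical and do not come from the project's software.

```python
def relative_bitrate(switching_kbps: float, original_kbps: float) -> float:
    """Bit-rate of a switching point normalized by the original bit-rate."""
    if original_kbps <= 0:
        raise ValueError("original bit-rate must be positive")
    return switching_kbps / original_kbps


def switching_bitrate_kbps(relative: float, original_kbps: float) -> float:
    """Absolute bit-rate (kbps) of a switching point from its relative bit-rate."""
    if not 0.0 <= relative <= 1.0:
        raise ValueError("relative bit-rate is expected to lie in [0, 1]")
    return relative * original_kbps


# Example from the text: relative bit-rate 0.2, original stream at 1 Mbps.
cif_15_switch = switching_bitrate_kbps(0.2, 1000.0)
print(f"CIF at 15 f/s is preferred at about {cif_15_switch:.0f} kbps")
```

Expressing switching points as relative values lets the same preference data be reused for originals coded at different bit-rates.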
Figure 1. Relative bit-rate of video segments versus spatio-temporal switching points: (a) action concept; (b) crowd concept; (c) dialog concept; (d) scenery concept; (e) text & graphic concept
Fig. 2 shows the preference paths in three-dimensional scalability. They were obtained by analyzing the preference characteristics of video segments in the five semantic concepts as the bit-rate decreases. The preference path differs depending on the semantic concept. The line between two switching points represents the extraction path for SNR scalability; for instance, SNR quality decreases from the first marker to the second marker as the relative bit-rate decreases. The markers indicate switching points for spatio-temporal scalability. For instance, the first marker in case (a) indicates the switching point from CIF at 30 f/s to CIF at 15 f/s, and the second marker in case (b) indicates the switching point from CIF at 15 f/s to QCIF at 30 f/s.
Figure 2. Preference path of perceptual quality: (a) scenery concept; (b) action concept; (c) crowd concept; (d) text & graphic concept; (e) dialog concept
SVC bit-stream extraction with the quality information table (QIT) is applied to the extractor in JSVM 2.0. This section discusses the extraction procedure with the QIT and the video quality of the extracted bit-stream. The experiment uses a video segment from the action concept. For spatial scalability, the video is coded with two layers, CIF and QCIF. For temporal scalability, both spatial layers support a maximum frame rate of 30 f/s. For SNR scalability, three-layer fine grain scalability (FGS) is applied. The preference paths for bit-stream extraction were determined from the perceptual quality experiment. Based on one scenario, we applied the QIT corresponding to the segment's semantic concept and extracted video at several bit-rates. The extraction path for the experiment is shown in Figure 3.
Figure 3. Extraction path of perceptual quality preference in experiment
The video versions extracted with the determined extraction parameters are shown in Figure 4. At 448 kbps (case (a)), the extractor allocates 332 kbps to SNR quality and extracts all NAL units for temporal scalability to support 30 f/s. The extractor also extracts all NAL units for spatial scalability to support CIF resolution. The required TLs of the NAL units are less than or equal to 4, namely 0, 1, 2, 3, and 4. At 224 kbps (case (b)), a different tradeoff is applied based on the QIT: the extractor allocates 139 kbps to SNR quality and extracts the NAL units with TLs from 0 to 3, which yield a frame rate of 15 f/s. Both of these extracted versions keep CIF size by extracting the NAL units of all DIs. In contrast to cases (a) and (b), at 128 kbps (case (c)) and at 70 kbps (case (d)) the extractor extracts only the NAL units with a DI of 0, which supports QCIF size. The required TLs of the NAL units for case (c) are less than or equal to 4, namely 0, 1, 2, 3, and 4, and 48 kbps is allocated for SNR quality enhancement over the base layer. In case (d), NAL units with a TL of 4 are not extracted, so that only a frame rate of 15 f/s is supported, and 10 kbps is allocated for SNR quality enhancement.
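The four cases above can be summarized as a small lookup table plus a filter over NAL units. The following Python sketch is illustrative only and is not the JSVM extractor: the `NalUnit` and `OperatingPoint` types, the numbering of DI 1 as the CIF layer, and the rule of picking the largest listed operating point that fits the target bit-rate are all assumptions made for this example. Truncation of FGS data to the SNR budget is not modeled.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class NalUnit:
    di: int  # DI of the unit: 0 = QCIF; 1 = CIF is an assumed numbering
    tl: int  # TL of the unit: 0..4 in the experiment


@dataclass(frozen=True)
class OperatingPoint:
    total_kbps: int  # target bit-rate of the extracted stream
    max_di: int      # highest DI kept
    max_tl: int      # highest TL kept (4 -> 30 f/s, 3 -> 15 f/s)
    snr_kbps: int    # bit-rate allocated to SNR quality


# Values taken from cases (a)-(d) in the text.
OPERATING_POINTS = [
    OperatingPoint(448, max_di=1, max_tl=4, snr_kbps=332),  # (a) CIF, 30 f/s
    OperatingPoint(224, max_di=1, max_tl=3, snr_kbps=139),  # (b) CIF, 15 f/s
    OperatingPoint(128, max_di=0, max_tl=4, snr_kbps=48),   # (c) QCIF, 30 f/s
    OperatingPoint(70, max_di=0, max_tl=3, snr_kbps=10),    # (d) QCIF, 15 f/s
]


def select_operating_point(target_kbps: int) -> OperatingPoint:
    """Pick the largest listed point that fits the target (assumed rule)."""
    fitting = [p for p in OPERATING_POINTS if p.total_kbps <= target_kbps]
    if not fitting:
        raise ValueError("target bit-rate is below the lowest operating point")
    return max(fitting, key=lambda p: p.total_kbps)


def extract(nal_units: Iterable[NalUnit], point: OperatingPoint) -> List[NalUnit]:
    """Keep only NAL units whose DI and TL are within the point's limits."""
    return [n for n in nal_units if n.di <= point.max_di and n.tl <= point.max_tl]


# Toy stream: one NAL unit per (DI, TL) combination.
stream = [NalUnit(di, tl) for di in (0, 1) for tl in range(5)]
for point in OPERATING_POINTS:
    kept = extract(stream, point)
    print(point.total_kbps, "kbps ->", len(kept), "of", len(stream), "toy NAL units")
```

In this sketch the spatio-temporal choice is made purely by the DI and TL limits, which mirrors how the text describes cases (a) through (d).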
Figure 4. Extracted video segments: (a) CIF at 30 f/s (448 kbps); (b) CIF at 15 f/s (224 kbps); (c) QCIF at 30 f/s (128 kbps); (d) QCIF at 15 f/s (70 kbps)

We presented an SVC bit-stream extraction scheme using perceptual quality preference. In our analysis, video segments belonging to the same semantic concept (video class) show consistent quality preference characteristics, and these characteristics differ from those of other semantic concepts. Therefore, a different quality emphasis in multi-dimensional scalability is preferable depending on the semantic concept. For example, more bit-rate is allocated to the temporal scalability layers for an action scene: the extractor extracts more NAL units for temporal scalability to sustain a sufficiently high frame rate, while SNR quality is degraded. In contrast, for a scene in the drama category, more bit-rate is allocated to SNR scalability. Based on the quality preferences obtained for the five semantic concepts, quality information tables (QITs) for SVC bit-stream extraction were determined. The determined QITs were applied to SVC bit-stream extraction, and the detailed extraction procedure for the requested quality was shown in our experiment. We conclude that different tradeoffs in multi-dimensional scalability should be considered depending on the characteristics of the video scene to maximize perceptual quality, since the perceptual quality preference differs between video classes. With the proposed SVC bit-stream extraction scheme, the perceptual quality preference of each video class can be applied efficiently.

International Journal
T. C. Thang, Y. J. Jung, Y. M. Ro, "Dynamic Programming Based Adaptation of Multimedia Contents in UMA", Proc. PCM2004, LNCS vol.3332, Springer-Verlag, pp.347-355, 2004.
T. C. Thang, Y. J. Jung, Y. M. Ro, "Effective adaptation of multimedia documents with modality conversion", EURASIP Signal Processing: Image Communication Journal, Vol. 20, Issue 5, pp.413-434, 2005
T. C. Thang, Y. J. Jung, Y. M. Ro, "Modality conversion for QoS management in Universal Multimedia Access", IEE Proceedings - Vision, Image and Signal Processing, Vol. 152, Issue 03, pp.374-384, Jun. 2005.
International Conference
Published (6)
Yong Ju Jung, Yong Man Ro: Joint control for hybrid transcoding using multidimensional rate distortion modeling. ICIP 2004: 2789-2792
Yong Ju Jung and Yong Man Ro, "Distortion Modeling for Inter-Operation Dependent Video Transcoding", IEEE TENCON2004, CD Version
Y. S. Kim, Y. J. Jung, T. C. Thang, Y. M. Ro, "Bit-stream extraction to maximize perceptual quality using quality information table in SVC", Proc. SPIE Electronic Imaging, Vol. 6077, Jan. 2006.
T. C. Thang, Y. S. Kim, C. S. Kim, Y. M. Ro, "Quality Models for Audiovisual Streaming", Proc. SPIE Electronic Imaging, Vol. 6059, Jan. 2006.
T. C. Thang, Y. J. Jung, Y. M. Ro, "Semantic Quality for Content-Aware Video Adaptation" (in Special Session of Content-Aware Video Coding and Communication), Proc. IEEE MMSP2005, Shanghai, Oct. 2005.
T. C. Thang, Y. J. Jung, Y. M. Ro, "Distortion Measures in MPEG-Compressed Domain for Multidimensional Transcoding", Proc. IEEE MMSP2005, Shanghai, Oct. 2005.
Domestic Journal
Published (2)
Truong Cong Thang, Yong Ju Jung, Yong Man Ro, "Quality Evaluation of Video Summaries", HCI2005. 2004. 10.
Jung Hwa Choi, Dong Jun Seo, Yong Man Ro, "Personalized Content Adaptation System in a Ubiquitous Residential Environment" (in Korean), HCI 2006. 02

