The Scientific World Journal

Volume 2015, Article ID 418437, 10 pages

http://dx.doi.org/10.1155/2015/418437

## H.264 SVC Complexity Reduction Based on Likelihood Mode Decision

^{1}Faculty of Information & Communication, Anna University, Chennai 600025, India

^{2}Department of ECE, Velammal Institute of Technology, Panchetti, Tamil Nadu 601204, India

^{3}RMD Engineering College, Kavaraipettai, Tamil Nadu 601206, India

Received 21 March 2015; Revised 20 May 2015; Accepted 7 June 2015

Academic Editor: Bruno Carpentieri

Copyright © 2015 L. Balaji and K. K. Thyagharajan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

H.264 Advanced Video Coding (AVC) was extended to Scalable Video Coding (SVC). SVC runs on a range of electronic devices, such as personal computers, HDTV, SDTV, IPTV, and full-HDTV, whose users demand different scalings of the same content. The scaling dimensions include resolution, frame rate, quality, heterogeneous networks, bandwidth, and so forth. Scaling increases encoding time and computational complexity during mode selection. In this paper, to reduce encoding time and computational complexity, a fast mode decision algorithm based on likelihood mode decision (LMD) is proposed. LMD is evaluated in both temporal and spatial scaling. From the results, we conclude that LMD performs well compared with previous fast mode decision algorithms. The comparison parameters are time, PSNR, and bit rate. LMD achieves a time saving of 66.65% with a 0.05% reduction in PSNR and a 0.17% increase in bit rate compared with the full search method.

#### 1. Introduction

H.264 Scalable Video Coding (SVC), an extension of H.264 Advanced Video Coding (AVC), permits a single encoding to be decoded in multiple ways [1] to meet the requirements of various devices. SVC retains all the characteristics of AVC and, in addition, provides a multilayered approach, coding efficiency, and so forth. The multilayered approach comprises a base layer and one or more enhancement layers. The base layer carries the most essential information in the form of a bit stream. The bit stream is partitioned into a number of subset bit streams [1], known as enhancement layers. Each subset bit stream comprises only the essential information of the video, with all redundant and less essential information removed. The less essential information is deduced from the base layer and the already coded enhancement layers [1]. The base layer contains a bit stream of low resolution, low frame rate, or low quality. Enhanced resolution, frame rate, or quality is obtained by adding enhancement layer bit streams.

SVC can decode the video content even from a limited portion of the bit stream; this distinguishing feature is referred to as scalability. Scalability in SVC takes three forms: spatial, temporal, and quality. Spatial scalability refers to the resolution, or dimensions, of the video. Temporal scalability refers to the number of frames per second in the video. Quality, or SNR, scalability refers to the PSNR (peak signal to noise ratio) gain in the video. SVC achieves temporal scalability through hierarchical B picture prediction of frames. A frame in a video is divided into macroblocks (MBs). A macroblock can be partitioned into blocks according to several modes, each with its own identity. Temporal scalability performs a mode search for the prime mode in each macroblock (MB). SVC uses three types of frames: I or intraframes, P or prediction frames, and B or bidirectional prediction frames. An I frame carries the most essential information, which requires all the modes in a macroblock to be coded. P frames contain essential information, but less than an I frame, and require few modes to be coded. B frames contain the least essential information and require very few modes to be coded.
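The link between hierarchical B pictures and temporal scalability can be illustrated with a minimal sketch. It assumes a dyadic hierarchy in which the group-of-pictures (GOP) size is a power of two; the function names `temporal_layer` and `frames_up_to_layer` are hypothetical and are not taken from any reference encoder.

```python
def temporal_layer(frame_idx: int, gop_size: int) -> int:
    """Temporal layer of a frame in a dyadic hierarchical-B GOP.

    Layer 0 holds the key pictures (every gop_size-th frame); each higher
    layer holds the B pictures predicted from the layers below it.
    Assumption: gop_size is a power of two.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    levels = gop_size.bit_length() - 1          # log2(gop_size)
    pos = frame_idx % gop_size                  # position inside the GOP
    if pos == 0:
        return 0                                # key picture
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros


def frames_up_to_layer(num_frames: int, gop_size: int, max_layer: int) -> list[int]:
    """Frame indices that remain when only layers 0..max_layer are decoded."""
    return [i for i in range(num_frames)
            if temporal_layer(i, gop_size) <= max_layer]


if __name__ == "__main__":
    gop = 8
    print([temporal_layer(i, gop) for i in range(gop + 1)])
    # Discarding the highest layer keeps every second frame.
    print(frames_up_to_layer(gop + 1, gop, max_layer=2))
```

Under this assumption, discarding the highest temporal layer halves the frame rate, while the remaining frames stay decodable because they never reference a discarded layer.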

In the former standard, frames are divided into fixed size macroblocks of $16 \times 16$. In H.264/AVC, however, variable block sizes of $16 \times 16$, $16 \times 8$, $8 \times 16$, $8 \times 8$, $8 \times 4$, $4 \times 8$, and $4 \times 4$ are available. H.264/AVC also has its own way of estimating the modes that determine how the macroblock is divided. The prime mode for an MB or block is decided on the basis of a rate distortion cost (RDC) function using a Lagrangian parameter. RDC computation includes integer transform, quantization, and entropy coding in both the forward and backward processes. RDC is computed for all the modes in a macroblock, and the mode with the minimum value is chosen as the prime mode for the MB or block. SVC defines nine intramodes for prediction in a block of INTRA $4 \times 4$, four intramodes for prediction in a macroblock of INTRA $16 \times 16$, seven intermodes for prediction in a macroblock (INTER $16 \times 16$, INTER $16 \times 8$, INTER $8 \times 16$, INTER $8 \times 8$, INTER $8 \times 4$, INTER $4 \times 8$, and INTER $4 \times 4$), and one SKIP mode [2]. For motion estimation, the BL_PRED and QPEL_REF modes for the enhancement layer are added to the modes of the base layer. A full search method decides upon the ideal motion vector difference (MVD) using RDC between the current frame and the previous frame. The MVD is the difference between the predicted motion vector and the actual motion vector between the current and previous frames. This computationally intensive rate distortion optimization increases encoding time and complexity, which has led to the development of many fast mode decision algorithms. Many algorithms have been evaluated for AVC, which is less complex than SVC, but only a few that save time in selecting the prime mode have been evaluated for SVC. These algorithms are discussed in the next section.
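The full search decision described above can be sketched as follows, assuming the usual Lagrangian form $J = D + \lambda R$ for the rate distortion cost. The `ModeTrial` record, the function names, and the numbers in the usage example are hypothetical illustrations. In a real encoder, the distortion and rate of each mode come from actually transforming, quantizing, and entropy coding the block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeTrial:
    """Outcome of trial-encoding one MB with one mode (values are illustrative)."""
    mode: str
    distortion: float   # D, e.g. sum of squared differences after reconstruction
    rate_bits: float    # R, bits spent on mode, motion, and residual data


def rd_cost(trial: ModeTrial, lam: float) -> float:
    """Lagrangian rate distortion cost J = D + lambda * R."""
    return trial.distortion + lam * trial.rate_bits


def full_search(trials: list[ModeTrial], lam: float) -> ModeTrial:
    """Evaluate every candidate mode and return the one with minimum cost."""
    if not trials:
        raise ValueError("at least one candidate mode is required")
    return min(trials, key=lambda t: rd_cost(t, lam))


def motion_vector_difference(mv: tuple[int, int],
                             predicted_mv: tuple[int, int]) -> tuple[int, int]:
    """MVD: actual motion vector minus its prediction, per component."""
    return (mv[0] - predicted_mv[0], mv[1] - predicted_mv[1])


if __name__ == "__main__":
    # Hypothetical measurements for a single macroblock.
    trials = [
        ModeTrial("SKIP", distortion=900.0, rate_bits=1.0),
        ModeTrial("INTER16x16", distortion=400.0, rate_bits=40.0),
        ModeTrial("INTER8x8", distortion=250.0, rate_bits=120.0),
        ModeTrial("INTRA4x4", distortion=200.0, rate_bits=300.0),
    ]
    for lam in (0.5, 5.0, 20.0):
        print(lam, full_search(trials, lam).mode)
```

Because every candidate must be trial-encoded before its cost is known, the work grows with the number of modes, which is the expense that fast mode decision algorithms try to avoid. A larger $\lambda$ penalizes rate more strongly and pushes the choice toward cheaper modes such as SKIP.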

#### 2. Related Work

A method for reducing the complexity of determining the prime mode in H.264/AVC, which saves encoding time, is proposed in [3]. The method uses Lagrangian optimization of the rate and distortion cost to decide the prime mode while reducing encoding time and preserving coding efficiency. A motion activity-based mode decision (MAMD) algorithm is proposed in [4] to shorten coding time by minimizing the number of candidate modes. Candidate modes are skipped on the basis of motion vectors, so that they need not be coded, which reduces time. In [5], the candidate modes of the enhancement layer are significantly reduced by exploiting the relation between the base layer and the enhancement layer. However, the base layer modes are still chosen by a full search process. A probability based coding mode decision algorithm for H.264/AVC is presented in [6]. The mode is chosen according to the maximum probability of correlation between the adjacent block and the present block. The probability model saves more encoding time. An early mode decision for the enhancement layer MB is proposed in [7]. If the MB is found to be all zeros, then the previous MB can be chosen and the mode decision can be terminated early.
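The general idea of a probability based mode decision can be sketched as below. This is not the algorithm of [6]; it is a minimal illustration that assumes the probabilities are estimated by counting how often each current-MB mode co-occurs with a given adjacent-MB mode in previously coded data. The function names and the sample counts are hypothetical.

```python
from collections import Counter, defaultdict


def fit_mode_table(pairs):
    """Count (adjacent_mode, current_mode) co-occurrences from coded MBs."""
    table = defaultdict(Counter)
    for adjacent_mode, current_mode in pairs:
        table[adjacent_mode][current_mode] += 1
    return table


def ranked_candidates(table, adjacent_mode):
    """Modes for the current MB, most probable first, with estimated P(mode | adjacent)."""
    counts = table.get(adjacent_mode)
    if not counts:
        return []   # no statistics yet: the caller falls back to a full search
    total = sum(counts.values())
    return [(mode, n / total) for mode, n in counts.most_common()]


if __name__ == "__main__":
    # Hypothetical observations: the adjacent MB was SKIP in all ten cases.
    observed = ([("SKIP", "SKIP")] * 6
                + [("SKIP", "INTER16x16")] * 3
                + [("SKIP", "INTER8x8")])
    table = fit_mode_table(observed)
    for mode, p in ranked_candidates(table, "SKIP"):
        print(f"{mode}: {p:.2f}")
```

An encoder could evaluate the rate distortion cost of the candidates in this order and stop once the remaining modes are too improbable to matter; where to place that cutoff is a design choice that trades encoding time against PSNR and bit rate.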

In [8], the enhancement layer MB mode is determined using Bayes' theorem. The algorithm relies on a Markov process: a likelihood analysis based on this process finds the mode for a macroblock earlier and saves time. In [9], the correlation between adjacent MBs of the base layer and the colocated MBs of the enhancement layer is used to forecast the mode in the enhancement layer. A fast mode decision algorithm with selective interlayer residual prediction based on Lagrangian RDC is proposed in [10]. The Lagrangian parameter involved in this prediction reduces coding time while an appropriate mode is decided. A classification based intra-inter mode decision is presented in [11]. A frame is coded or skipped depending on the intra-inter coding decision made for rate control. This approach was devised for video over networks and then applied to scalable video coding. In [12], the relation between an MB of the enhancement layer and its colocated base layer MB is used for mode decision. Intermode prediction for temporal scalability is proposed in [13]. That method compares the pixel values of the current MB with those of the reference block using statistical analysis. In our previous work [14], a desired mode list is constructed for predicting the mode in the base layer. The mode for the enhancement layer is predicted from the correlation between the current frame and the reference frame. Fast video streaming over the Internet is achieved using a mathematical model in [15]. The model maximizes the information rate from the streaming server to the client, which then plays the video without delay.

Although each of these algorithms shortens the encoding time needed to decide the prime mode, none matches the full search method in terms of PSNR and bit rate. The algorithms have been compared with one another only in terms of encoding time, irrespective of the computational complexity involved. We therefore propose a fast mode decision algorithm with low computational complexity that attains a shorter encoding time. The proposed algorithm uses the likelihood mode decision method, discussed in the next section.

#### 3. Likelihood Mode Decision

In SVC, mode decision is based on the rate distortion cost (RDC). In the full search method, the mode with the minimum RDC is chosen as the prime mode for each MB. However, estimating the RDC for every mode in an MB is tedious and consumes more encoding time. The question addressed here is how to decide the prime mode earlier and avoid encoding unwanted modes. In this section, a likelihood mode decision (LMD) algorithm is proposed that decides the prime mode for the I frame of the enhancement layer. P/B frames, which are derived from I frames, need less attention. Because these frames also hold less essential information, a selective prediction of modes can be used to obtain the prime mode. The I frame of the base layer is of greater importance and follows the standard full search algorithm, while P/B frames use an enhanced selective prediction of certain modes.
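The split between a full search for base layer I frames and a selective search for P/B frames can be sketched as below. The mode names follow the list given in the Introduction. The reduced set `SELECTIVE_MODES` is purely hypothetical, since the modes actually retained are not stated at this point in the text, and the function names are illustrative only.

```python
# Modes listed in the Introduction (seven intermodes, two intramodes, SKIP).
ALL_MODES = (
    "SKIP",
    "INTER16x16", "INTER16x8", "INTER8x16", "INTER8x8",
    "INTER8x4", "INTER4x8", "INTER4x4",
    "INTRA16x16", "INTRA4x4",
)

# Hypothetical reduced candidate set for P/B frames (an assumption,
# not the set used by the proposed algorithm).
SELECTIVE_MODES = ("SKIP", "INTER16x16", "INTER16x8", "INTER8x16")


def candidate_modes(frame_type: str, selective=SELECTIVE_MODES) -> tuple[str, ...]:
    """Modes whose RDC is evaluated for one MB of the given frame type."""
    if frame_type == "I":
        return ALL_MODES                      # full search: every mode is tried
    if frame_type in ("P", "B"):
        return tuple(m for m in ALL_MODES if m in selective)
    raise ValueError(f"unknown frame type: {frame_type!r}")


def rdc_evaluations(frame_types: str, mbs_per_frame: int) -> int:
    """Total number of RDC computations for a sequence such as 'IBBP'."""
    return sum(len(candidate_modes(t)) * mbs_per_frame for t in frame_types)


if __name__ == "__main__":
    sequence = "IBBPBBP"
    mbs = 99                                   # e.g. a QCIF frame (11 x 9 MBs)
    full = len(ALL_MODES) * mbs * len(sequence)
    reduced = rdc_evaluations(sequence, mbs)
    print(f"RDC evaluations: full search {full}, selective {reduced}")
```

Counting RDC evaluations in this way only indicates where the saving comes from; the actual time saving depends on how expensive each mode is to evaluate and on which modes the selective prediction retains.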

The likelihood model is evaluated below to show the importance of likelihood in intermode prediction. An adjacent MB and the current MB are highly likely to have the same mode. The results for the video sequences tested with various quantization parameters for different MBs are given in Table 1.