Advances in Video Coding for Broadcast Applications
Intra-Skip in Inter-Frame Coding of H.264/AVC
In inter-frame coding under the H.264/AVC standard, the encoder evaluates not only the seven inter-partition modes but also the intra-modes when seeking the best coding mode, which maintains high coding efficiency at the cost of encoder speed. This paper proposes a novel mathematical model for intra-skip in inter-frame coding that reduces this complexity: the model cuts encoding time substantially while incurring only a very minor bitrate increase. The key advantage of the proposed scheme is that it can optimize an H.264 encoder in conjunction with any existing fast inter- or intra-method, including fast inter-partition mode decision, fast motion search algorithms, and fast intra-algorithms.
1. Introduction
The latest video compression standard, H.264/AVC, is recognized as a major international standard for next-generation video compression because it offers higher coding efficiency and better performance in various environments than previous video coding standards. In inter-frame coding under H.264/AVC, intra-modes are also evaluated when seeking the best coding mode, in order to obtain low bitrate and high fidelity for images that are better encoded by intra-mode prediction, such as frequently changing backgrounds in a video sequence. Cheng et al. pointed out that the encoder would be accelerated significantly if all intra-mode computations were skipped in inter-frame coding, so that only the seven inter-modes are considered. However, the experimental results show that this choice also degrades picture fidelity and raises the bitrate. Lee and Jeon put forward a method in which designated partition blocks are intra-skipped, together with a new mathematical model for inter-frame coding. The performance gain is dramatic, but in some cases the method misses blocks that should have undergone intra-prediction in inter-frame coding.
Afterwards, Cheng et al. put forth an improved method for P frames in which the intra-skip decision is derived from three fixed adjacent blocks; with this method the encoder is noticeably accelerated in P frames. Pan et al. introduced a new scheme focusing mainly on inter-frames. More recently, Kim et al. proposed a simple method that adopts the minimum RDcost of adjacent blocks as the threshold for intra-skip, thereby reintroducing intra-skip into the coding process.
The remainder of this paper is organized as follows. Section 2 analyzes the whole procedure of inter-frame coding and then presents a new mathematical model for inter-frame coding that addresses not only P frames but also B frames. Section 3 presents the experimental results and a comparison with some well-known algorithms. Finally, Section 4 presents the conclusion and discussion.
2. Inter-Frame Coding Procedure
2.1. Prediction Modes
To achieve higher coding efficiency, H.264/AVC employs rate distortion optimization (RDO) [1, 8] to seek the best coding result, maximizing image quality while minimizing the number of transmitted bits. To achieve this, the encoder has to exhaustively test all possible mode combinations, including the different intra- and inter-prediction modes, and select for each block the mode that minimizes the rate distortion cost (RDcost). The resulting computational load is dramatically increased, which greatly limits practical applications of an H.264 encoder, especially in real-time visual communication.
The mode decision between inter- and intra-modes in inter-frame coding comprises three parts. First, calculate the minimum cost (mincost) of the inter-partition modes. Second, calculate the mincost of the intra-modes. Finally, compare the two to decide the final coding mode: if the mincost of the intra-modes is less than that of the inter-modes, the final coding mode is intra; otherwise it is inter. The following sections specify this procedure.
2.2. Intra-Modes in Inter-Frame Coding
In intra-modes, a prediction block is formed based on the previously encoded and reconstructed blocks and is subtracted from the current block prior to encoding. That means intra-modes only exploit spatial redundancies within the same frame instead of previously encoded frames as in inter-modes. The prediction mode for each block that minimizes the difference (RDcost) between original block and its prediction is selected as the best intra-mode.
In inter-frame coding, intra-modes are also taken into consideration, including Intra 4 × 4, Intra 16 × 16, and Intra 8 × 8 (optional since JM9.3). Intra 16 × 16 has four prediction modes (Intra_16 × 16_Vertical, Intra_16 × 16_Horizontal, Intra_16 × 16_DC, and Intra_16 × 16_Plane), while Intra 4 × 4 has nine prediction modes (Intra_4 × 4_Vertical, Intra_4 × 4_Horizontal, Intra_4 × 4_Diagonal_Down_Left, Intra_4 × 4_Diagonal_Down_Right, Intra_4 × 4_Vertical_Right, Intra_4 × 4_Horizontal_Down, Intra_4 × 4_Vertical_Left, Intra_4 × 4_Horizontal_Up, and Intra_4 × 4_DC).
2.3. Whole Procedure of Inter-Frame Coding
The entire flow of inter-frame coding is shown in Figure 1. First, determine whether the initial SKIP mode is adopted; this differs from the SKIP mode with no coefficients that may be chosen later. If it is not adopted, calculate the seven inter-mode predictions to find the best inter-mode and record the corresponding cost as BEST_INTER_COST. Then, calculate the intra-mode predictions to obtain BEST_INTRA_COST. Finally, compare BEST_INTER_COST and BEST_INTRA_COST to obtain the best mode among all possible modes in inter-frame coding.
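The flow described for Figure 1 can be illustrated with a minimal Python sketch. It assumes the RDcost of every candidate mode has already been computed and is passed in as a dictionary; the function name `decide_mode`, the mode labels, and the cost values are hypothetical and are not taken from the JM reference software.

```python
from typing import Dict, Optional, Tuple


def decide_mode(
    inter_costs: Dict[str, float],
    intra_costs: Dict[str, float],
    initial_skip: bool = False,
) -> Tuple[str, Optional[float]]:
    """Return (mode, RDcost) following the flow described for Figure 1.

    The RDcost values are assumed to be precomputed; a real encoder
    would obtain them from motion search and intra prediction.
    """
    # Step 1: the initial SKIP test short-circuits all other mode checks.
    if initial_skip:
        return ("SKIP", None)

    # Step 2: best of the inter-partition modes gives BEST_INTER_COST.
    best_inter_mode = min(inter_costs, key=inter_costs.get)
    best_inter_cost = inter_costs[best_inter_mode]

    # Step 3: best of the intra-modes gives BEST_INTRA_COST.
    best_intra_mode = min(intra_costs, key=intra_costs.get)
    best_intra_cost = intra_costs[best_intra_mode]

    # Step 4: intra wins only if it is strictly cheaper than inter.
    if best_intra_cost < best_inter_cost:
        return (best_intra_mode, best_intra_cost)
    return (best_inter_mode, best_inter_cost)


# Hypothetical costs for one macroblock.
inter = {"INTER_16x16": 120.0, "INTER_8x8": 95.0}
intra = {"INTRA_16x16": 110.0, "INTRA_4x4": 90.0}
print(decide_mode(inter, intra))
```

In this sketch the intra-mode costs are always evaluated in Step 3, which is exactly the work that intra-skip tries to avoid.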
2.4. The Analysis and Effect of Intra-Skip in Inter-Frame Coding
The purpose of using intra-modes in inter-frame coding is to improve image fidelity and to possibly provide more precise mode prediction so as to reduce the bitrates of coded sequences by sacrificing coding speed. Thus, whether the intra-modes are frequently adopted in inter-frame coding may become an issue.
Admittedly, a considerable number of transmission bits is saved when the RDcost of the best intra-mode is lower than the RDcost of the best inter-mode. An RDcost of the best inter-mode that exceeds that of the best intra-mode suggests that the current block is in rapid or medium motion. In fact, if the background is changing, the complicated match-block search in motion compensation costs many bits for motion vector prediction (MVP), and a nonoptimal prediction requires many extra transmission bits for the residual [9, 10]. In this situation, an intra-mode is a better choice for encoding the current block. However, taking more candidate modes into account in inter-frame coding sacrifices the encoder's coding speed [11, 12]. Hence, whether the intra-mode calculations are unnecessary and can be skipped is largely determined by the motion in the current image to be encoded.
In addition, the intra-skip procedure is independent of motion search and of the seven inter-mode and nine intra-mode predictions. Hence, intra-skip can deliver a further performance increment together with any fast motion search algorithm and any fast inter-/intra-algorithm [9–17].
Consequently, the decision whether to perform intra-skip, which largely depends on the motion range of objects and background in the sequence, plays a key role in inter-frame coding alongside motion search and block matching [13–15]. The purpose of this paper is to present a method that decides, by early estimation and detection, whether intra-modes are to be evaluated or skipped in inter-frame coding.
2.5. The Proposed Mathematical Model for Intra-Skip
Since the adjacent blocks are highly correlated with the current block, the information of already encoded blocks is essential for the current block. Consequently, whether an encoded block was intra-skipped is a strong indication of whether the current block can be intra-skipped. In addition, the analysis in Section 2.4 shows that the RDcost of the best inter-mode largely reflects how fast the background changes and how far objects move in a sequence. This paper therefore uses the BEST_INTER_COST values (RDcost of the best inter-mode) of previously encoded blocks, each multiplied by a weighted coefficient that depends on the block's distance from the current block, to predetermine whether the time-consuming intra-mode predictions are necessary for the current block. The proposed model, which aims to correctly skip the blocks' intra-mode predictions in inter-frame coding, is
$$\bar{C} = \sum_{i=1}^{n} \alpha_i C_i, \qquad \alpha_i = f(i).$$
In the model, $n$ is the number of reference blocks and $C_i$ denotes the RDcost of the best inter-mode of the $i$th previously encoded block. For example, $C_1$ is the inter RDcost of the latest block, encoded immediately before the current block, and $C_2$ is that of the block encoded before it. Likewise, $\alpha_1$ is the weighted coefficient applied to $C_1$ and $\alpha_i$ is the weighted coefficient applied to $C_i$; the values are given by the weighted-coefficient function $f$, which plays a key role in the model and can be an arithmetic or geometric progression. $\bar{C}$, the weighted average RDcost of the reference blocks, is used for the intra-skip decision. The weighted coefficients are subject to constraint conditions derived from experimental results on more than thirty sequences (0.5 covers most situations; in most cases the experimental value lies within (0.2, 0.3)). The implementation procedure is presented in Algorithm 1, and the proposed method is illustrated in Figure 2 (gray parts) for comparison with the original procedure in Figure 1.
In Step (2), all the values $C_i$ used to compute $\bar{C}$ are RDcost values obtained with the best inter-mode, not RDcost values obtained with the best final coding mode.
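As a rough illustration of how such a model could be wired into an encoder loop, the following Python sketch keeps a short history of best inter-mode RDcosts and derives a weighted-average threshold from it. Several points are assumptions rather than details stated in the text: the class and method names are hypothetical, the weights `[0.5, 0.3, 0.2]` are illustrative, the threshold is normalized by the sum of the weights in use, and the rule "skip intra when the current best inter cost does not exceed the threshold" is one plausible reading of the intra-skip decision.

```python
from collections import deque
from typing import Deque, Optional, Sequence


class IntraSkipPredictor:
    """Weighted-average threshold over recent best inter-mode RDcosts."""

    def __init__(self, weights: Sequence[float]) -> None:
        # weights[0] applies to the most recently encoded block.
        self.weights = list(weights)
        self.history: Deque[float] = deque(maxlen=len(self.weights))

    def threshold(self) -> Optional[float]:
        """Weighted average of stored costs, or None without history."""
        if not self.history:
            return None
        pairs = list(zip(self.weights, self.history))
        total_weight = sum(w for w, _ in pairs)
        return sum(w * c for w, c in pairs) / total_weight

    def should_skip_intra(self, best_inter_cost: float) -> bool:
        """Decide for the current block, then record its inter cost.

        Assumed rule: skip intra prediction when the current block's
        best inter cost is no larger than the threshold.
        """
        t = self.threshold()
        skip = t is not None and best_inter_cost <= t
        # Always store the best INTER cost, never the final-mode cost.
        self.history.appendleft(best_inter_cost)
        return skip


predictor = IntraSkipPredictor([0.5, 0.3, 0.2])  # hypothetical weights
for cost in (100.0, 80.0, 200.0):                # hypothetical RDcosts
    print(cost, predictor.should_skip_intra(cost))
```

Because the history always stores the best inter-mode cost, the threshold stays comparable from block to block even when some blocks end up coded in an intra-mode.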
2.6. Analysis of the Proposed Model
Most advanced inter-coding algorithms conceived for speeding up the H.264 encoder concentrate on the computation of BEST_INTER_COST (Step (2)), because partition mode decision and motion search are exhaustively calculated in this step [9–18]. The proposed mathematical model, however, is applied after this step; consequently, it can optimize the H.264 encoder in cooperation with any advanced fast partition mode decision and search algorithm. Hence, the encoder is accelerated considerably further if this mathematical model is adopted together with fast approaches for partition mode decision and motion search.
2.7. Weighted-Coefficient Function
The core of the proposed mathematical model is the weighted-coefficient function. Since this function generates the threshold for intra-skip, it should be optimized so as to gain higher performance and to misclassify fewer blocks that should have been coded in intra-modes. In this paper, we propose a geometric progression for the weighted-coefficient function. A statistical survey of the experimental results of more than thirty sequences, obtained with various weighted-coefficient functions, identifies the conditions under which the best performance increment is obtained, and these conditions suit a geometric progression. The parameters of the progression are obtained by solving the resulting equation with Newton's iteration method. The resulting weighted average RDcost is adopted as the self-adaptive threshold for intra-skip in the procedure of inter-frame coding.
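A geometric progression of weights of the kind described here might be generated as in the following minimal sketch. The function name, the example ratio 0.5, the block count 4, and the normalization that makes the weights sum to one are illustrative assumptions; the progression parameters actually used in the experiments are not reproduced in the text above.

```python
from typing import List


def geometric_weights(n: int, ratio: float) -> List[float]:
    """Return n weights in geometric progression, normalized to sum to 1.

    weights[0] belongs to the most recently encoded block; each older
    block receives the previous weight multiplied by `ratio`.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    raw = [ratio ** i for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical parameters: 4 reference blocks, common ratio 0.5.
weights = geometric_weights(4, 0.5)
print(weights)
```

With a ratio below one, the most recent block dominates the threshold and the influence of older blocks decays quickly, which matches the idea that nearer blocks are more strongly correlated with the current block.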
3. Experimental Results
To verify the performance of the proposed algorithm, several common and typical QCIF (Foreman, Carphone, and Highway) and CIF (Paris, Mobile, and Bus) sequences are used. The experiments are based on the H.264 reference software JM10.1, and the simulation environment is a Pentium 4 1.7 GHz PC with 256 MB of RAM, with the encoder built using Visual C++ 6.0 SP5 under Windows XP SP2.
The experiments were run under the conditions listed in Table 1, which strictly follow the simulation settings suggested by the JVT.
Figure 3 shows the speed increment obtained by comparing the JM standard encoder combined with the proposed mathematical model against the original JM standard encoder. In this figure, the percentage of speed increment is defined as $\Delta\mathrm{Time} = \frac{T_{\mathrm{JM}} - T_{\mathrm{proposed}}}{T_{\mathrm{JM}}} \times 100\%$, where $T_{\mathrm{JM}}$ is the coding time of the JM10.1 standard encoder and $T_{\mathrm{proposed}}$ is the coding time of the proposed optimized encoder. Figure 3 shows that the whole encoder is accelerated by about 35% (QCIF) and 45% (CIF) when QP is 5 and by about 25% when QP is 25. The trend is that the smaller QP is, the larger the coding speed increment.
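The speed-increment percentage can be computed as in the short sketch below, which assumes the common definition of saved time relative to the reference encoder's coding time. The function name and the timing values are hypothetical.

```python
def speed_increment_percent(t_reference: float, t_optimized: float) -> float:
    """Percentage of coding time saved relative to the reference encoder."""
    if t_reference <= 0:
        raise ValueError("reference coding time must be positive")
    return (t_reference - t_optimized) / t_reference * 100.0


# Hypothetical coding times in seconds for one sequence.
print(speed_increment_percent(100.0, 65.0))
```

A positive result means the optimized encoder is faster; a negative result would mean it is slower than the reference.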
The percentage of missed blocks, that is, blocks that were intra-skipped although they should have undergone intra-mode prediction, is shown in Table 2. The percentage is the ratio of the number of missed blocks to the number of all blocks that need intra-calculations. The table shows that almost all blocks in inter-frame coding are intra-skipped correctly. In QCIF sequences, at most 0.206043% of blocks are wrongly skipped for Foreman, and at most 0.816145% and 1.634520% for Carphone and Highway, respectively. In CIF sequences, at most 0.341728% are incorrectly skipped for Paris, and at most 1.113545% and 1.784902% for Mobile and Bus, respectively. In most cases only 0.6% or less is intra-skipped incorrectly. The statistics in Table 2 indicate that the model with the chosen weighted coefficients is effective for intra-skip.
Figure 4 compares PSNR-Y performance of these sequences when QP is 25. From the figure, it is clear that the PSNR-Y degradation is very minor, although sometimes there is some fluctuation, such as frame number 43 in Foreman sequence and number 70 in Bus sequence.
To compare the proposed scheme with recently proposed well-known methods, the IBBPBBP sequence format is also used in our experiments. The experimental results are presented in Table 3. A negative value of ΔPSNR means decreased PSNR, and a negative value of ΔBitrate means decreased bitrate; a negative ΔPSNR together with a positive ΔBitrate therefore corresponds to performance degradation, and vice versa. ΔTime denotes the percentage of saved encoding time. In the table, Lee's, Pan's, and Kim's methods are those of Lee and Jeon, Pan et al., and Kim et al., respectively, and they are evaluated against our proposed method. The experimental environment and the JM configuration are the same as shown in Table 1.
The table shows that, for most sequences, the proposed scheme achieves the best speed acceleration among the four methods with only very minor PSNR deterioration. Coding of QCIF sequences is almost three times faster than with the other three methods, although there is some minor PSNR and bitrate degradation that affects image fidelity. Admittedly, on the Mobile sequence, the proposed scheme only matches Kim's method. However, on the Coastguard and Paris sequences, the performance increase is considerable compared with the other three methods.
4. Conclusion and Discussion
In this paper, we first gave a brief introduction to H.264 and then specified in detail the whole procedure of inter-frame coding, especially the interplay of inter-partition modes and intra-modes, to demonstrate that intra-skip is a very effective way to increase encoder speed in inter-frame coding. We then presented a mathematical model for intra-skip and its critical advantage: it can optimize an H.264 encoder together with any proposed fast inter-partition mode decision, search algorithm, and fast intra-algorithm. Finally, experimental results were provided to substantiate the practical value of this model in inter-frame coding.
The coefficients of the mathematical model proposed in this paper may not yield perfect performance: there is some bitrate increase and PSNR degradation under certain circumstances, which means that a few blocks are intra-skipped incorrectly. These blocks lead to incorrect mode prediction and therefore to extra residual data to be encoded, which is confirmed by our experiments. For example, compared with Kim's method on the CIF sequence Mobile, the proposed method does not show as great a performance improvement as on other sequences, because a few blocks are wrongly intra-skipped and lead to inexact prediction. This drawback could tentatively be addressed by tracing the coefficients with neural networks as applied in the PID control field [20, 21], fuzzy control [22, 23], or similar techniques, so as to seek better performance.
“Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC),” March 2003.
Y. Cheng, Z. Y. Wang, K. Dai, and J. J. Guo, “Analysis of inter-frame coding without intra modes in H.264/AVC,” in Proceedings of the 7th Eurographics Symposium on Multimedia, pp. 77–86, Nanjing, China, October 2004.
Y. Cheng, Z. Wang, J. Guo, and K. Dai, “Research on intra modes for inter-frame coding in H.264,” in Proceedings of the 9th International Conference on Computer Supported Cooperative Work in Design, vol. 2, pp. 740–744, Coventry, UK, May 2005.
B.-G. Kim, J.-H. Kim, and C.-S. Cho, “A fast intra skip detection algorithm for H.264/AVC video encoding,” ETRI Journal, vol. 28, no. 6, pp. 721–731, 2006.
I. E. G. Richardson, H.264 and MPEG-4 Video Compression, John Wiley & Sons, England, UK, 2003.
Z. Chen, P. Zhou, and Y. He, “Fast integer pel and fractional pel motion estimation for JVT,” in 6th Meeting of the Joint Video Team of ISO/IEC MPEG & ITU-T VCEG, Awaji Island, Japan, December 2002, JVT-F017.
J. Bu, S. Lou, C. Chen, and Z. Yang, “A novel fast approach for H.264 inter mode decision,” in Proceedings of the 4th IASTED International Conference on Communications, Internet, and Information Technology (CIIT '05), pp. 220–224, Cambridge, Mass, USA, October-November 2005.
Joint Video Team (JVT) Test Model JM10.1, January 2006, http://iphome.hhi.de/suehring/tml/download.
L. Fausett, Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Prentice-Hall, Upper Saddle River, NJ, USA, 1994.
C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK, 1995.
D. Driankov, H. Hellendoorn, and M. Reinfrank, An Introduction to Fuzzy Control, Springer, New York, NY, USA, 1996.
H. Ying, Fuzzy Control and Modeling: Analytical Foundations and Applications, Wiley-IEEE Press, New York, NY, USA, 2000.