Research Article  Open Access
IntraSkip in InterFrame Coding of H.264/AVC
Abstract
In interframe coding of the H.264/AVC standard, the encoder evaluates not only the seven interpartition modes but also the intramodes when seeking the best coding mode, which maintains high encoding efficiency at the expense of encoder speed. This paper proposes a novel mathematical model for intraskip in interframe coding that reduces the complexity of this process; the model cuts encoding time considerably at the cost of only a very minor bitrate increase. The key advantage of the proposed scheme is that it can optimize an H.264 encoder in conjunction with any fast inter and intra methods that focus on interpartition mode decision, motion search algorithms, and fast intra algorithms.
1. Introduction
The latest established video compression standard H.264/AVC [1] is recognized as a major international standard for next-generation video compression because of its higher coding efficiency and better performance in various environments compared to previous video coding standards. In interframe coding of the H.264/AVC standard, intramodes are also evaluated when seeking the best coding mode, in order to obtain low bitrate and high fidelity; this also serves images that are better encoded by intramode prediction, such as background images that change frequently in a video sequence [2]. Cheng et al. pointed out in [3] that the encoder would be accelerated significantly if all intramode computations were skipped in interframe coding, so that only the seven intermodes are taken into consideration. However, the experimental results show that this choice also results in picture fidelity deterioration and a higher bitrate. Lee and Jeon [4] put forward a method in which designated partition blocks are intraskipped and also presented a new mathematical model for interframe coding. The performance increment is dramatic, although in some cases this method misses blocks that should have undergone intraprediction in interframe coding.
Afterwards, Cheng et al. [5] put forth an advanced method for P frames based on the idea that the intraskip decision is derived from three fixed adjacent blocks; with this method, the encoder is noticeably faster in P frames. Pan et al. [6] introduced a new scheme mainly focusing on interframe coding. Recently, Kim et al. [7] proposed a simple method that adopts the minimum RDcost of adjacent blocks as the threshold for intraskip, and intraskip is thereby reintroduced in the coding process.
The remainder of this paper first analyzes the whole procedure of interframe coding and then presents, in Section 2, a new mathematical model for interframe coding that addresses not only P frames but also B frames. Experimental results and a comparison with some well-known algorithms are presented in Section 3. Finally, Section 4 presents the conclusion and discussion.
2. InterFrame Coding Procedure
2.1. Prediction Modes
To achieve higher coding efficiency, H.264/AVC employs rate distortion optimization (RDO) [1, 8] to seek the best coding result in terms of maximizing image quality and minimizing the number of transmitted bits. That is, to achieve rate distortion optimization, the encoder has to exhaustively test all possible mode combinations, including the different intra and interprediction modes, to find for each block to be encoded the mode that minimizes the difference between the original block and its reconstruction. As a result, the dramatically increased computational load limits the practical applications of an H.264 encoder, especially for real-time visual communication.
The whole procedure of inter and intramodes in interframe coding comprises three parts. First, calculate the minimum cost of the interpartition modes. Second, calculate the minimum cost of the intramodes. Finally, compare the minimum cost of the interpartition modes with the minimum cost of the intramodes to decide the final coding mode. If the minimum cost of the intramodes is less than that of the intermodes, the final coding mode is intra, and vice versa. The following subsections describe this procedure in detail.
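The exhaustive search described above can be sketched as picking the candidate with the smallest Lagrangian cost J = D + λ·R, the usual form of the RDO criterion. This is a minimal illustration and not the JM implementation: the `Candidate` type, the mode labels, the distortion and rate numbers, and the value of `lam` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    mode: str          # illustrative mode label, e.g. "INTER_16x16"
    distortion: float  # distortion between original and reconstruction
    rate_bits: float   # bits needed to code the block in this mode


def rd_cost(candidate: Candidate, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def best_mode(candidates, lam):
    """Evaluate every candidate exhaustively and keep the cheapest one."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Hypothetical numbers, chosen only to exercise the functions.
candidates = [
    Candidate("INTER_16x16", distortion=900.0, rate_bits=40.0),
    Candidate("INTER_8x8", distortion=700.0, rate_bits=95.0),
    Candidate("INTRA_4x4", distortion=650.0, rate_bits=160.0),
]
chosen = best_mode(candidates, lam=4.0)
```

Every candidate is evaluated before the minimum is taken, which is why the number of candidate modes directly drives encoding time.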
2.2. IntraModes in InterFrame Coding
In intramodes, a prediction block is formed based on the previously encoded and reconstructed blocks and is subtracted from the current block prior to encoding. That means intramodes only exploit spatial redundancies within the same frame instead of previously encoded frames as in intermodes. The prediction mode for each block that minimizes the difference (RDcost) between original block and its prediction is selected as the best intramode.
In interframe coding, intramodes are also taken into consideration, including Intra 4 × 4, Intra 16 × 16, and Intra 8 × 8 (optional since JM9.3). Intra 16 × 16 has four directional predictions (Intra_16 × 16_Vertical, Intra_16 × 16_Horizontal, Intra_16 × 16_DC, Intra_16 × 16_Plane), while Intra 4 × 4 has nine different directional predictions (Intra_4 × 4_Vertical, Intra_4 × 4_Horizontal, Intra_4 × 4_Diagonal_Down_Left, Intra_4 × 4_Diagonal_Down_Right, Intra_4 × 4_Vertical_Right, Intra_4 × 4_Horizontal_Down, Intra_4 × 4_Vertical_Left, Intra_4 × 4_Horizontal_Up, Intra_4 × 4_DC).
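To make the idea of intraprediction concrete, the sketch below builds three of the nine Intra 4 × 4 predictions (vertical, horizontal, and DC) from the neighbouring reconstructed pixels and picks the one with the smallest error. It is a simplification under stated assumptions: all eight neighbours are assumed to be available, the six diagonal modes are omitted, and the sum of absolute differences (SAD) stands in for the full RDcost. The function names are hypothetical.

```python
def predict_4x4(mode, above, left):
    """Build a 4x4 prediction (list of rows) from reconstructed neighbours.

    above: the four pixels directly above the block
    left:  the four pixels directly to its left
    """
    if mode == "vertical":      # every row repeats the four pixels above
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # every row is filled with its left neighbour
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # rounded mean of the eight neighbouring pixels
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unsupported mode: {mode}")


def sad(block, pred):
    """Sum of absolute differences between a 4x4 block and its prediction."""
    return sum(abs(block[r][c] - pred[r][c]) for r in range(4) for c in range(4))


def best_intra_4x4(block, above, left, modes=("vertical", "horizontal", "dc")):
    """Return (mode, cost) for the candidate mode with the smallest SAD."""
    costs = {m: sad(block, predict_4x4(m, above, left)) for m in modes}
    best = min(costs, key=costs.get)
    return best, costs[best]


# A block whose columns are constant is predicted exactly by the vertical mode.
above = [10, 20, 30, 40]
left = [12, 12, 12, 12]
block = [list(above) for _ in range(4)]
mode, cost = best_intra_4x4(block, above, left)
```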
2.3. Whole Procedure of InterFrame Coding
The entire flow of interframe coding is shown in Figure 1. First, determine whether the initial SKIP mode is adopted; this mode is different from the SKIP mode with no coefficients that may be chosen later. If it is not adopted, calculate the seven intermode predictions to find the best intermode and take the corresponding cost as BEST_INTER_COST. Then, calculate the intramode predictions to obtain BEST_INTRA_COST. Finally, compare BEST_INTER_COST and BEST_INTRA_COST to obtain the best mode among all possible modes in interframe coding.
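The flow just described can be summarized in a few lines. This is only a sketch of the decision logic: the function name, the dictionaries of precomputed costs, and the mode labels are hypothetical, and a real encoder computes each cost through motion search and prediction rather than receiving it as input.

```python
def decide_mode(initial_skip, inter_costs, intra_costs):
    """Return (mode, cost) following the interframe decision flow.

    initial_skip: True if the initial SKIP test succeeds
    inter_costs:  dict mapping intermode name -> RDcost
    intra_costs:  dict mapping intramode name -> RDcost
    """
    if initial_skip:
        return "SKIP", None

    # Best of the intermodes gives BEST_INTER_COST.
    best_inter = min(inter_costs, key=inter_costs.get)
    best_inter_cost = inter_costs[best_inter]

    # Best of the intramodes gives BEST_INTRA_COST.
    best_intra = min(intra_costs, key=intra_costs.get)
    best_intra_cost = intra_costs[best_intra]

    # Intra wins only if its minimum cost is strictly lower.
    if best_intra_cost < best_inter_cost:
        return best_intra, best_intra_cost
    return best_inter, best_inter_cost


# Hypothetical costs for one macroblock.
inter = {"INTER_16x16": 1200.0, "INTER_8x8": 1100.0}
intra = {"INTRA_4x4": 1500.0, "INTRA_16x16": 1300.0}
result = decide_mode(False, inter, intra)
```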
2.4. The Analysis and Effect of IntraSkip in InterFrame Coding
The purpose of using intramodes in interframe coding is to improve image fidelity and to possibly provide more precise mode prediction so as to reduce the bitrates of coded sequences by sacrificing coding speed. Thus, whether the intramodes are frequently adopted in interframe coding may become an issue.
Admittedly, a considerable number of transmission bits is saved when the RDcost of the best intramode is lower than the RDcost of the best intermode. An RDcost of the best intermode that is larger than the RDcost of the best intramode possibly indicates that the current block is in rapid or medium motion. In fact, if the background images are changing, the complicated and strongly varying match-block search in motion compensation costs many bits for the motion vector prediction (MVP), and a nonoptimal prediction requires many extra transmission bits for the residual [9, 10]. In this situation, intramodes are therefore a better choice for encoding the current block in interframe coding. However, taking more candidate modes into account in interframe coding sacrifices the encoder’s coding speed [11, 12]. Hence, whether the intramode calculations are unnecessary and can be skipped is largely determined by the motion of the current image to be encoded.
In addition, the intraskip procedure is independent of the procedures of motion search and of the seven intermode and nine intramode predictions. Hence, intraskip can achieve a higher performance increment when combined with any fast motion search algorithm and any fast inter/intra algorithm [9–17].
Consequently, performing intraskip or not, which largely depends on the motion range of objects and background in sequences, also plays a key role in interframe coding compared to the procedure of motion search and block match [13–15]. The purpose of this paper is to focus on intraskip and present a method to decide whether intramodes are to be adopted or skipped in interframe coding by early estimation and detection.
2.5. The Proposed Mathematical Model for IntraSkip
Since the adjacent blocks are highly correlated with the current block to be encoded, the information of already encoded blocks is essential for the current block. Consequently, whether an encoded block was intraskipped is a substantial indicator of the possibility of intraskip for the current block. In addition, according to the analysis of intraskip in Section 2.4, the RDcost of the best intermode largely reflects how quickly the image background changes and how far objects move in the sequence. This paper therefore takes the BEST_INTER_COSTs (the RDcosts of the best intermode) of previously encoded blocks, multiplies each by a weighted coefficient that depends on the distance of that block from the current block, and uses the result to predetermine whether the time-consuming intramode predictions are necessary for the current block. A mathematical model is proposed in order to correctly skip the intramode predictions of blocks in interframe coding.
The model takes as inputs the RDcost of the best intermode of each of a fixed number of reference blocks, ordered from the latest encoded block (the block encoded immediately before the current one) backwards through the encoding order. Each reference block has its own weighted coefficient, starting with the coefficient of the latest encoded block, and the coefficient values are decided by the weighted-coefficient function. This function plays a key role in the model and can be an arithmetic or geometric progression. The weighted average RDcost of the reference blocks is the quantity adopted for the intraskip decision. The constraint conditions of the model are placed on the weighted coefficients and are based on the experimental results of more than thirty sequences (here 0.5 covers most situations; in most cases, according to the experimental results, the value lies within (0.2, 0.3)). The implementation procedure is presented in Algorithm 1, and the proposed method is illustrated in Figure 2 (gray parts) for comparison with the original procedure in Figure 1.

In Step (2), all the RDcost values taken into consideration for the weighted average are those obtained with the best intermode rather than those obtained with the best final coding mode.
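The decision described in this section can be sketched as follows. This is not Algorithm 1 itself but an illustration under explicit assumptions: the weights are normalized by their sum, intramode prediction is skipped when the current block's BEST_INTER_COST does not exceed the weighted average of the reference blocks, and intramodes are always evaluated while no reference block exists. The class name, method names, weights, and costs are hypothetical.

```python
from collections import deque


class IntraSkipPredictor:
    """Keeps the BEST_INTER_COST of recent blocks and derives a threshold."""

    def __init__(self, weights):
        # weights[0] applies to the latest encoded block,
        # weights[1] to the block before it, and so on.
        self.weights = list(weights)
        # Newest cost is stored first; the oldest is dropped when full.
        self.history = deque(maxlen=len(self.weights))

    def threshold(self):
        """Weighted average of the stored costs, or None without history."""
        if not self.history:
            return None
        used = self.weights[:len(self.history)]
        return sum(w * c for w, c in zip(used, self.history)) / sum(used)

    def should_skip_intra(self, best_inter_cost):
        """Decide for the current block, then record its best inter cost."""
        t = self.threshold()
        # Assumed rule: a low inter cost suggests intra is unnecessary.
        skip = t is not None and best_inter_cost <= t
        # The best *inter* cost is stored, not the cost of the final mode.
        self.history.appendleft(best_inter_cost)
        return skip


# Hypothetical weights and costs.
predictor = IntraSkipPredictor([0.5, 0.3, 0.2])
decisions = [predictor.should_skip_intra(c) for c in (1000.0, 900.0, 2000.0)]
```

In this run the first block has no reference and is not skipped, the second falls below the threshold and is skipped, and the third exceeds it and receives full intramode prediction.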
2.6. Analysis of the Proposed Model
Most advanced intercoding algorithms conceived for speeding up the H.264 encoder concentrate on the computation of BEST_INTER_COST (Step (2)), because partition mode decision and motion search are exhaustively calculated in this step [9–18]. The proposed mathematical model, however, is carried out after this step; consequently, it can optimize the H.264 encoder in cooperation with any advanced fast partition mode decision and search algorithm. Hence, the encoder will be accelerated further if this mathematical model is adopted together with fast approaches for partition modes and search algorithms.
2.7. WeightedCoefficient Function
The core of the proposed mathematical model is the weighted-coefficient function. Since the threshold for intraskip is generated by this function, it should be optimized so as to gain higher performance and to misjudge fewer blocks that should have been coded in intramodes. In this paper, we propose a geometric progression for the weighted-coefficient function. According to the experimental results obtained with various weighted-coefficient functions on more than thirty sequences, a statistical survey identifies the conditions that yield the best performance increment, and these conditions suit a geometric progression. The function is then obtained by solving the resulting equation with Newton’s iteration method. The weighted average computed with these coefficients is adopted as the self-adaptive threshold for intraskip in the procedure of interframe coding.
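As an illustration of how Newton’s iteration can fix a geometric progression of weights, the sketch below solves for the common ratio under two assumed conditions: the first weight is given, and the weights sum to one. These conditions, the first weight of 0.25, and the choice of five reference blocks are assumptions made for the example, not the paper’s own values.

```python
def geometric_ratio(first_weight, n, q0=0.5, tol=1e-12, max_iter=100):
    """Solve first_weight * (1 + q + ... + q**(n-1)) = 1 for q by Newton's method.

    Requires n >= 2 so that the derivative is nonzero.
    """
    q = q0
    for _ in range(max_iter):
        f = first_weight * sum(q ** i for i in range(n)) - 1.0
        df = first_weight * sum(i * q ** (i - 1) for i in range(1, n))
        step = f / df
        q -= step
        if abs(step) < tol:
            return q
    raise RuntimeError("Newton iteration did not converge")


def geometric_weights(first_weight, n):
    """Weights w_i = first_weight * q**i for i = 0 .. n-1."""
    q = geometric_ratio(first_weight, n)
    return [first_weight * q ** i for i in range(n)]


# Assumed example: five reference blocks, first weight 0.25.
weights = geometric_weights(first_weight=0.25, n=5)
```

With a ratio below one, the latest encoded block receives the largest weight and older blocks contribute progressively less.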
3. Experimental Results
To verify the performance of the algorithm proposed in this paper, several common and typical QCIF (Foreman, Carphone, and Highway) and CIF (Paris, Mobile, and Bus) sequences are used. The experiments are based on JM10.1 [19], the H.264 reference software, and were run on a Pentium 4 1.7 GHz machine with 256 MB of memory, using VC++ 6.0 SP5 under Windows XP SP2.
The experiments are run under the conditions listed in Table 1, which strictly follow the simulation conditions suggested by JVT.

Figure 3 shows the speed increment obtained by comparing the JM standard encoder combined with the mathematical model proposed in this paper to the original JM standard encoder. In this figure, the percentage of speed increment is defined as $\Delta T = (T_{\mathrm{JM}} - T_{\mathrm{proposed}}) / T_{\mathrm{JM}} \times 100\%$, where $T_{\mathrm{JM}}$ is the coding time of the JM10.1 standard encoder and $T_{\mathrm{proposed}}$ is the coding time of the proposed optimized encoder. Figure 3 shows that the whole encoder is accelerated by 35% (QCIF)/45% (CIF) when QP is 5 and by about 25% when QP is 25. The trend is that the smaller the QP, the larger the coding speed increment.
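For reference, a speed increment of this kind can be computed as shown below, using the common relative time-saving form in which the reduction in coding time is divided by the reference encoder’s coding time. The function name and the timings are hypothetical.

```python
def speed_increment_percent(t_reference, t_optimized):
    """Relative reduction in coding time, as a percentage of the reference time."""
    if t_reference <= 0:
        raise ValueError("reference coding time must be positive")
    return (t_reference - t_optimized) / t_reference * 100.0


# Hypothetical timings in seconds for the reference and the optimized encoder.
increment = speed_increment_percent(t_reference=200.0, t_optimized=130.0)
```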
The percentage of missed blocks, that is, blocks that should have undergone intramode prediction but were intraskipped, is shown in Table 2. The percentage is the ratio of the number of missed blocks to the number of all blocks that need intra calculations. The table shows that almost all blocks in interframe coding are intraskipped correctly. In the QCIF sequences, at most 0.206043% of the blocks are wrongly skipped for Foreman, and at most 0.816145% and 1.634520% for Carphone and Highway, respectively. In the CIF sequences, at most 0.341728% of the blocks are incorrectly skipped for Paris, and at most 1.113545% and 1.784902% for Mobile and Bus, respectively. In most cases only 0.6% or less is intraskipped incorrectly. The statistical results of Table 2 indicate that the model with the chosen weighted coefficients is effective for intraskip.
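The ratio defined above can be computed per sequence as in the following sketch. It assumes two per-block flags are available: whether an exhaustive encoder chose an intramode for the block, and whether the fast encoder intraskipped it. The function name and the sample flags are hypothetical.

```python
def missed_block_percent(needs_intra, intra_skipped):
    """Percentage of blocks needing intra calculation that were intraskipped.

    needs_intra[i]:   True if block i should be coded in an intramode
    intra_skipped[i]: True if the fast encoder skipped intra for block i
    """
    total = sum(1 for n in needs_intra if n)
    if total == 0:
        return 0.0
    missed = sum(1 for n, s in zip(needs_intra, intra_skipped) if n and s)
    return 100.0 * missed / total


# Hypothetical flags for six blocks: four need intra, one of them is missed.
needs = [True, False, True, True, False, True]
skipped = [False, True, True, False, True, False]
missed = missed_block_percent(needs, skipped)
```

Blocks that were skipped and did not need intramodes are correct skips, so they count in neither the numerator nor the denominator.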

Figure 4 compares the PSNR-Y performance of these sequences when QP is 25. The figure shows that the PSNR-Y degradation is very minor, although there is occasional fluctuation, such as at frame 43 of the Foreman sequence and frame 70 of the Bus sequence.
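The luma PSNR of a frame is conventionally computed from the mean squared error between the original and reconstructed luma samples. The sketch below uses that standard definition for 8-bit video (peak value 255); the function name and the sample values are hypothetical.

```python
import math


def psnr_y(original, reconstructed, peak=255):
    """PSNR in dB between two equally long sequences of luma samples."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical luma samples.
orig_samples = [100, 120, 140, 160]
recon_samples = [101, 119, 142, 158]
value = psnr_y(orig_samples, recon_samples)
```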
To compare the proposed scheme with recently proposed well-known methods, the IBBPBBP sequence format is also selected in our experiments. The experimental results are presented in Table 3. When the change in PSNR is negative and the change in bitrate is positive (a negative PSNR change means decreased PSNR, and a negative bitrate change means decreased bitrate), the result corresponds to performance degradation, and vice versa. The table also reports the percentage of saved encoding time. In the table, Lee’s method is that of [4], Pan’s method is that of [6], and Kim’s method is that of [7]; they are evaluated against our proposed method. The experimental environment and the JM configuration are the same as shown in Table 1.
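Comparison figures of this kind can be derived from per-sequence measurements as sketched below. The definitions are assumptions, since averaging and normalization conventions vary between papers: the PSNR change is a plain difference in dB, and the bitrate change and time saving are taken relative to the reference encoder. The function name, dictionary keys, and numbers are hypothetical.

```python
def compare_to_reference(ref, test):
    """Return PSNR change (dB), bitrate change (%) and time saving (%).

    ref / test: dicts with keys 'psnr_db', 'bitrate_kbps' and 'time_s'.
    """
    return {
        "delta_psnr_db": test["psnr_db"] - ref["psnr_db"],
        "delta_bitrate_percent":
            100.0 * (test["bitrate_kbps"] - ref["bitrate_kbps"]) / ref["bitrate_kbps"],
        "time_saving_percent":
            100.0 * (ref["time_s"] - test["time_s"]) / ref["time_s"],
    }


# Hypothetical measurements for one sequence.
reference = {"psnr_db": 36.0, "bitrate_kbps": 100.0, "time_s": 200.0}
fast = {"psnr_db": 35.5, "bitrate_kbps": 102.0, "time_s": 150.0}
summary = compare_to_reference(reference, fast)
```

In this example the PSNR change is negative and the bitrate change is positive, which is the degradation case, traded against a shorter encoding time.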

The table shows that, in most sequences, the speed acceleration obtained by the proposed scheme is the best among the four methods, with very minor PSNR deterioration. The coding speed in the QCIF sequences is almost three times faster than that of the other three methods, although there is some minor PSNR and bitrate degradation that affects image fidelity. Admittedly, in the Mobile sequence the proposed scheme only matches Kim’s method. However, in the Coastguard and Paris sequences the performance increase is considerable compared to the other three methods.
4. Conclusion
In this paper, we first give a brief introduction to H.264 and then describe in detail the whole procedure of interframe coding, especially the combination of interpartition modes and intramodes, to demonstrate that intraskip is a very effective way to increase encoder speed when adopted in interframe coding. We then discuss a mathematical model for intraskip and its key advantage: it can optimize an H.264 encoder together with any fast interpartition mode decision, search algorithm, and fast intra algorithm. Finally, experimental results are provided to substantiate the practical value of this model in interframe coding.
The coefficients of the mathematical model proposed in this paper might not yield optimal performance, as there is some bitrate increase and PSNR degradation in certain circumstances, which means that a few blocks are intraskipped incorrectly. These few blocks lead to incorrect mode prediction, which produces extra residual to be encoded; this is also confirmed by our experiments. For example, when the proposed method is compared to Kim’s method on the CIF sequence Mobile, it does not show as great a performance improvement as on other sequences, because a few blocks are wrongly intraskipped and lead to inexact prediction. This drawback could tentatively be addressed by applying techniques such as neural networks, as used in the PID control field [20, 21], and fuzzy control [22, 23], or methods from similar areas, to trace the coefficients and thereby seek better performance.
References
[1] “Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC),” March 2003.
[2] C. Ning and S. Hui, “Research on a novel model for intraskip in inter coding,” in Proceedings of the 9th International Symposium on Signal Processing and Its Applications (ISSPA '07), pp. 1–4, Sharjah, UAE, February 2007.
[3] Y. Cheng, Z. Y. Wang, K. Dai, and J. J. Guo, “Analysis of interframe coding without intra modes in H.264/AVC,” in Proceedings of the 7th Eurographics Symposium on Multimedia, pp. 77–86, Nanjing, China, October 2004.
[4] J. Lee and B. Jeon, “Fast mode decision for H.264,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '04), vol. 2, pp. 1131–1134, Taipei, Taiwan, June 2004.
[5] Y. Cheng, Z. Wang, J. Guo, and K. Dai, “Research on intra modes for interframe coding in H.264,” in Proceedings of the 9th International Conference on Computer Supported Cooperative Work in Design, vol. 2, pp. 740–744, Coventry, UK, May 2005.
[6] F. Pan, X. Lin, S. Rahardja et al., “Fast mode decision algorithm for intraprediction in H.264/AVC video coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 7, pp. 813–822, 2005.
[7] B.G. Kim, J.H. Kim, and C.S. Cho, “A fast intra skip detection algorithm for H.264/AVC video encoding,” ETRI Journal, vol. 28, no. 6, pp. 721–731, 2006.
[8] I. E. G. Richardson, H.264 and MPEG-4 Video Compression, John Wiley & Sons, England, UK, 2003.
[9] T.Y. Kuo and C.H. Chan, “Fast variable block size motion estimation for H.264 using likelihood and correlation of motion field,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, no. 10, pp. 1185–1195, 2006.
[10] Z. Zhou, J. Xin, and M.T. Sun, “Fast motion estimation and intermode decision for H.264/MPEG-4 AVC encoding,” Journal of Visual Communication and Image Representation, vol. 17, no. 2, pp. 243–263, 2006.
[11] S.E. Kim, J.K. Han, and J.G. Kim, “An efficient scheme for motion estimation using multireference frames in H.264/AVC,” IEEE Transactions on Multimedia, vol. 8, no. 3, pp. 457–466, 2006.
[12] C. Grecos and M. Y. Yang, “Fast inter mode prediction for P slices in the H264 video coding standard,” IEEE Transactions on Broadcasting, vol. 51, no. 2, pp. 256–263, 2005.
[13] S. Zhu and K.K. Ma, “Correction to “a new diamond search algorithm for fast block-matching motion estimation”,” IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.
[14] Z. Chen, P. Zhou, and Y. He, “Fast integer pel and fractional pel motion estimation for JVT,” in 6th Meeting of the Joint Video Team of ISO/IEC MPEG & ITU-T VCEG, Awaji Island, Japan, December 2002, JVT-F017.
[15] Y. Nie and K.K. Ma, “Adaptive irregular pattern search with matching prejudgment for fast block-matching motion estimation,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 6, pp. 789–794, 2005.
[16] J. Bu, S. Lou, C. Chen, and Z. Yang, “A novel fast approach for H.264 inter mode decision,” in Proceedings of the 4th IASTED International Conference on Communications, Internet, and Information Technology (CIIT '05), pp. 220–224, Cambridge, Mass, USA, October-November 2005.
[17] T.Y. Kuo and C.H. Chan, “Fast macroblock partition prediction for H.264/AVC,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '04), vol. 1, pp. 675–678, Taipei, Taiwan, June 2004.
[18] D. Wu, F. Pan, K. P. Lim et al., “Fast intermode decision in H.264/AVC video coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 7, pp. 953–958, 2005.
[19] Joint Video Team (JVT) Test Model JM10.1, January 2006, http://iphome.hhi.de/suehring/tml/download.
[20] L. Fausett, Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Prentice-Hall, Upper Saddle River, NJ, USA, 1994.
[21] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK, 1995.
[22] D. Driankov, H. Hellendoorn, and M. Reinfrank, An Introduction to Fuzzy Control, Springer, New York, NY, USA, 1996.
[23] H. Ying, Fuzzy Control and Modeling: Analytical Foundations and Applications, Wiley-IEEE Press, New York, NY, USA, 2000.
Copyright
Copyright © 2009 Hui Su. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.