EURASIP Journal on Image and Video Processing
Volume 2008 (2008), Article ID 524793, 8 pages
doi:10.1155/2008/524793
Research Article

Improved Motion Estimation Using Early Zero-Block Detection

Department of Communication Engineering, National Central University, Chungli 32054, Taiwan

Received 23 December 2007; Revised 13 May 2008; Accepted 24 June 2008

Academic Editor: Jian Zhang

Copyright © 2008 Y. M. Lee et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We incorporate the early zero-block detection technique into the UMHexagonS algorithm, which has already been adopted in the H.264/AVC JM reference software, to speed up the motion estimation process. A nearly sufficient condition is derived for early zero-block detection. Although the conventional early zero-block detection method can significantly reduce computation, the resulting PSNR loss is not negligible, especially for a high quantization parameter (QP) or low bit-rate coding. This paper modifies the UMHexagonS algorithm with the early zero-block detection technique to improve its coding performance. The experimental results reveal that the improved UMHexagonS algorithm greatly reduces computation while maintaining very high coding efficiency.

1. Introduction

The newest international video coding standard H.264/AVC has recently been approved by the ITU-T (as Recommendation H.264) and by ISO/IEC as the international standard MPEG-4 Part 10 Advanced Video Coding (AVC) [1]. H.264/AVC achieves significantly better performance in both PSNR and visual quality at the same bit-rate compared with prior video coding standards such as MPEG-4 Part 2 and H.263. One important contributor is the use of variable block-size motion estimation and rate-distortion optimization; however, the computational complexity of H.264/AVC increases dramatically because of the many variable block-size modes that must be evaluated.

Many fast and efficient methods for motion estimation (ME) have been proposed in recent years to reduce computational cost while maintaining coding performance. In general, there are two ways to reduce computation. One is to speed up the ME algorithms themselves, such as the hybrid unsymmetrical-cross multihexagon-grid search (UMHexagonS) algorithm [2], which has been adopted in the JM reference software. The other is to terminate the ME calculation by early detection of the zero-blocks (ZBs) of discrete cosine transform (DCT) coefficients after quantization. Xie et al. [3] established a zero-block condition based on the following criterion:(1)where and is residual samples between the current macroblock and the reference macroblock. For H.264, the relation between and quantization parameter is . This criterion has been employed in the JM reference software. In [4, 5], the early zero-block detection approach was applied to the motion search process using a threshold of for comparison with the sum of difference of block size () and deciding whether DCT is a zero-block. The motion search stops when all zero-blocks are detected. This results in significant computational savings, especially for low bit-rate coding. However, the threshold of (corresponding to in discrete cosine transform and quantization (DCT/Q)) is not sufficient, and it could improperly detect a great number of zero-blocks, leading to a severe degradation in coding performance.

Some sufficient but not necessary conditions for zero-block detection of DCT coefficients after quantization were derived by examining the sum of absolute differences (SAD) between the current macroblock and the reference macroblock [6, 7]. Although every zero-block detected under these conditions is detected correctly, numerous zero-blocks still remain undetected. Based on Moon’s method [7], a technique using an adaptive threshold was suggested to enhance zero-block detecting capability [8].

In this work, we derive a nearly sufficient condition based on the ensemble average of all DCT coefficients. The nearly sufficient condition for zero-block detection is then applied to both motion search and DCT/Q calculation in the UMHexagonS algorithm. The experimental results reveal that a significant improvement in computation reduction can be achieved compared to methods using the other two sufficient conditions, while high coding efficiency is still maintained.

2. A Nearly Sufficient Condition for Zero-Block Detection

To guarantee integer transform, the DCT in H.264/AVC is approximated to the following form:(2)where , and . The basic quantization operation is given by(3) The value of quantization parameter (QP) varies in the range 0–51. The quantizer step size is used to control bit-rate and video quality. With postscaling factor (PF) considered with the quantizer, the quantized output can be written as (4)where is the entry of the core 2D transform . To avoid any division operation, the factor () is implemented by a multiplication factor and a right shift:(5) with(6) where , % denotes the modular operator, and is the multiplication factor. The quantized coefficient can be implemented using integer arithmetic: (7)where represents a binary shift right, and f is for interblocks or for intrablocks.
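As a rough illustration of the transform-and-quantize path described above, the following Python sketch uses the core transform matrix, the multiplication-factor table, and the rounding offsets as they are commonly documented for H.264/AVC. The function names, the layout of `MF_TABLE`, and the helper `is_zero_block` are illustrative assumptions; they are not taken from this paper or from the JM software.

```python
# Core forward transform matrix of the H.264/AVC 4x4 integer transform.
C = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

# Multiplication factors indexed by QP % 6. The three columns cover
# (both indices even), (both indices odd), and (all other positions).
MF_TABLE = [
    (13107, 5243, 8066),
    (11916, 4660, 7490),
    (10082, 4194, 6554),
    (9362, 3647, 5825),
    (8192, 3355, 5243),
    (7282, 2893, 4559),
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def core_transform(x):
    """Core 2D transform W = C * X * C^T of a 4x4 residual block."""
    c_t = [list(col) for col in zip(*C)]
    return matmul(matmul(C, x), c_t)


def quantize(w, qp, intra=False):
    """Integer quantization: |Z| = (|W| * MF + f) >> qbits, sign kept."""
    qbits = 15 + qp // 6
    # Rounding offset: 2^qbits / 3 for intra blocks, 2^qbits / 6 for inter.
    f = (1 << qbits) // (3 if intra else 6)
    row = MF_TABLE[qp % 6]
    z = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            if i % 2 == 0 and j % 2 == 0:
                mf = row[0]
            elif i % 2 == 1 and j % 2 == 1:
                mf = row[1]
            else:
                mf = row[2]
            mag = (abs(w[i][j]) * mf + f) >> qbits
            z[i][j] = mag if w[i][j] >= 0 else -mag
    return z


def is_zero_block(x, qp, intra=False):
    """True when every quantized coefficient of the residual block is zero."""
    z = quantize(core_transform(x), qp, intra)
    return all(v == 0 for r in z for v in r)
```

For example, a flat residual of 1 in every position is quantized to a zero-block at QP 28 for inter coding, whereas a flat residual of 10 is not. Running the full transform and quantization in this way is exactly the work that an early zero-block test on the SAD is meant to avoid.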

Sousa [6] derived a simple sufficient condition under which each quantized coefficient becomes zero for DCT. To derive the sufficient condition for DCT, the PF factor is absorbed back into the core 2D transform and the DCT coefficients are rewritten as Y:(8)where , and . Each coefficient can be written as(9)(10)for all DCT coefficients. For interblock encoding, the DCT coefficient is quantized as zero when the quantized coefficient satisfies , that is,(11) From (10) and (11), it is easy to show that the DCT is a zero-block if the sum of absolute differences satisfies(12) This is Sousa’s sufficient condition for zero-block detection.

Moon et al. [7] derived a more precise sufficient condition for zero-block detection by examining the integer transform and quantization in H.264/AVC, which is summarized as follows:

(1) if , then DCT is a zero-block, and where(13) (2) if and , then DCT is also a zero-block where the parameters and are, respectively, given by(14) Note that is identical to Sousa’s condition. As can be seen, the condition varies with . An intensive study indicates that this sufficient condition varies within a range , which is slightly higher than Sousa’s condition ().

2.1. A Nearly Sufficient Condition Based on Ensemble Average of DCT Coefficients

In this section, a nearly sufficient condition is derived based upon the ensemble average of all DCT coefficients by summing up all DCT coefficients . The summation over all DCT coefficients can be written as(15) Define , and the ensemble average can be upper-bounded as follows:(16)or(17) After some manipulation, was found to be 3.7975. Instead, using , if the ensemble average of DCT coefficients is applied to (11), the following upper bound for zero-block detection can be obtained:(18) Although the ensemble average condition is not sufficient and might detect a zero-block incorrectly, the experiment indicates that only a very small portion of DCT blocks is incorrectly detected as zero-blocks. Moreover, compared with both Sousa’s and Moon’s conditions, more zero-blocks can be detected correctly using the derived condition.

2.2. Zero-Block Detection Capability and Computation Reduction in DCT/Quantization

The various thresholds for zero-block detection as a function of QP are plotted in Figure 1. Note that both Sousa’s and Moon’s conditions are theoretically sufficient, whereas the thresholds and are not. The zero-block detecting capability of the various thresholds, measured on the news and paris sequences, is plotted in Figure 2. Fewer zero-blocks can be detected using the two sufficient conditions than with the other two conditions. The threshold offers the best zero-block detecting capability, but it simultaneously detects numerous improper zero-blocks, which could lead to severe performance degradation. The percentage of zero-blocks detected improperly using the two nonsufficient conditions is shown in Figure 3. As can be seen, fewer than 1% of improper zero-blocks were found for the ensemble average threshold , while more than 9% were found for the threshold for .
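The two quantities compared above (how many zero-blocks a threshold detects, and how many of those detections are improper) can be tallied as in the following sketch. It assumes that each block is summarized by its SAD together with a ground-truth flag obtained from the full DCT/Q computation. The function name, the input layout, and the choice of counts returned are hypothetical; the paper does not state the exact denominators used in its figures.

```python
def detection_stats(blocks, threshold):
    """Tally early zero-block detections for one threshold.

    blocks: iterable of (sad, truly_zero) pairs, where truly_zero is the
    ground truth obtained by actually running DCT and quantization.
    """
    total = detected = improper = missed = 0
    for sad, truly_zero in blocks:
        total += 1
        if sad < threshold:
            detected += 1
            if not truly_zero:
                improper += 1  # flagged as zero-block, but it is not one
        elif truly_zero:
            missed += 1  # real zero-block that the threshold failed to catch
    return {
        "total": total,
        "detected": detected,
        "improper": improper,
        "missed": missed,
    }
```

A sufficient condition yields `improper == 0` by construction but a larger `missed` count, while a nonsufficient threshold trades a smaller `missed` count for a nonzero `improper` count.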

Figure 1: Thresholds versus QP.
Figure 2: Zero-block detecting capability.
Figure 3: Percentage of improper zero-blocks detected.

To evaluate the performance of the previously mentioned conditions for early zero-block detection, an experiment was performed on the DCT/Q calculation. Table 1 displays the savings in total encoding time in DCT/Q as well as the PSNR loss, measured on the news sequence, for different QPs. The integer transform and quantization occupy only about 5% of the total encoding time. Note that no loss in either PSNR or bit-rate was found for Sousa’s and Moon’s conditions. As shown, the threshold can achieve a significant reduction in DCT/Q computation with a negligible PSNR loss. Up to 3% of total encoding time can be saved with a PSNR loss of only 0.005 dB for . The threshold [4], however, suffers a severe PSNR degradation due to improper zero-block detection, although computation in DCT/Q can be further reduced. Consequently, the threshold is not analyzed further.

Table 1: Encoding time saving and PSNR loss in DCT/Q.

3. Conventional Methods to Adopt Zero-Block Detection in UMHexagonS Algorithm

In H.264/AVC, interframe motion estimation is performed for 7 different block sizes (denoted as modes): 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, and 4 × 4. The motion estimation involves finding a macroblock in a previously encoded reference frame that best matches the current macroblock using the SAD between current and reference area samples:(19)
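The block-matching cost described above can be sketched as follows. This is a minimal illustration that assumes frames are stored as nested lists of luma samples indexed as `frame[row][column]`; the function name and parameter names are hypothetical and do not reproduce the notation of (19).

```python
def sad(cur, ref, x, y, mvx, mvy, bw, bh):
    """Sum of absolute differences for one candidate motion vector.

    (x, y) is the top-left corner of the current block, (mvx, mvy) the
    candidate displacement into the reference frame, and bw x bh the
    block size (for example 16x16 down to 4x4).
    """
    total = 0
    for j in range(bh):
        for i in range(bw):
            total += abs(cur[y + j][x + i] - ref[y + j + mvy][x + i + mvx])
    return total
```

The caller is responsible for keeping the displaced block inside the reference frame; boundary handling is omitted to keep the sketch short.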

In the early termination method of motion estimation, each in is compared with a threshold; and if all satisfy sufficient or nearly sufficient conditions, the motion search stops. In addition, the DCT/Q calculation need not be done if the DCT is a zero-block. This leads to a great reduction in computation. Since the conventional early zero-block detection method only requires a comparison of with a threshold, this approach can be applied to all kinds of motion searches, such as full search and all other fast search algorithms. This has been investigated in [4, 5].
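The conventional early-termination rule described above can be sketched as follows: every 4 × 4 sub-block SAD of a candidate is compared with a threshold, and the search stops at the first candidate for which all of them pass. The threshold is left as a parameter because its value depends on the chosen condition and on QP. All names are hypothetical, the block is assumed to divide into 4 × 4 sub-blocks, and returning the best candidate seen so far (rather than the triggering candidate) is an assumption of this sketch.

```python
def sub_sads_4x4(cur, ref, x, y, mvx, mvy, bw, bh):
    """SAD of each 4x4 sub-block of a bw x bh block for one candidate."""
    sads = []
    for by in range(0, bh, 4):
        for bx in range(0, bw, 4):
            s = 0
            for j in range(4):
                for i in range(4):
                    cy, cx = y + by + j, x + bx + i
                    s += abs(cur[cy][cx] - ref[cy + mvy][cx + mvx])
            sads.append(s)
    return sads


def search_with_early_stop(cur, ref, x, y, bw, bh, candidates, threshold):
    """Scan candidates in order; stop once all 4x4 SADs fall below threshold.

    Returns (best_mv, best_cost, points_checked).
    """
    best_mv, best_cost = None, None
    for n, (mvx, mvy) in enumerate(candidates, start=1):
        sads = sub_sads_4x4(cur, ref, x, y, mvx, mvy, bw, bh)
        cost = sum(sads)
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (mvx, mvy), cost
        if all(s < threshold for s in sads):
            return best_mv, best_cost, n  # early termination
    return best_mv, best_cost, len(candidates)
```

With a loose threshold the search can stop on the very first candidate even though a better match exists further along the search path, which is the kind of premature termination that motivates a carefully chosen threshold.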

In this section, we apply the various zero-block detection methods to the UMHexagonS algorithm and investigate the performance. The simulation conditions are tabulated in Table 2. Table 3 displays the average search points per block for different QPs on the news sequence, achieved using various zero-block detection thresholds. As shown, the average number of search points decreases with increasing threshold. For the news sequence and , up to 78% of average search points (14.09 reduced to 3.04) in the motion estimation can be saved when utilizing the zero-block detection approach using threshold , a much larger saving than with the other two sufficient conditions (9.11 and 7.26, resp.). The average PSNR loss, bit-rate increment, and motion estimation time saving versus QP are also compared using various thresholds and tabulated in Table 4. As shown, the early zero-block detection using a nearly sufficient condition (i.e., with threshold ) significantly outperforms other thresholds in terms of computation for any bit-rate coding. As much as 56% of motion estimation time can be saved for compared to the UMHexagonS algorithm.

Table 2: Simulation conditions.
Table 3: Average search points per frame achieved by various thresholds.
Table 4: Performance comparison on news sequence.

The PSNR degradation becomes severe for low bit-rate coding or high QP. Table 5 displays the PSNR loss on several video sequences for . As shown, the conventional zero-block detection incurs a PSNR loss of 0.212 dB on the foreman sequence. This phenomenon is illustrated in Figure 4, which shows the SAD error surface and the corresponding search iterations using the UMHexagonS algorithm in mode for a macroblock (42nd MB, 10th frame) in the foreman sequence. As shown, the UMHexagonS algorithm requires 110 search points to find the minimum error ( at the 26th iteration). The search stops at the 26th iteration and the minimum error can also be found when the conventional zero-block detection method is applied to the UMHexagonS algorithm with (threshold ). However, the search stops at the first iteration where as QP is increased to , which corresponds to the threshold ; and this leads to severe performance degradation. As the quantization parameter increases, the degradation becomes more severe.

Table 5: PSNR loss using nearly sufficient condition for .
Figure 4: Foreman (42nd MB, 10th frame): (a) SAD error surface; (b) search iteration using UMHexagonS algorithm.

4. Improved UMHexagonS Algorithm

The conventional early zero-block detection technique cannot give a satisfactory coding performance when applied to the UMHexagonS algorithm for large quantization step sizes. In this section, we modify the UMHexagonS algorithm using the early zero-block detection technique to achieve high coding efficiency. Many commonly used video sequences (4 QCIF sequences: foreman, carphone, football, coastguard and 4 CIF sequences: stefan, mobile, paris, tempete) with different motion contents are simulated using the full search algorithm with a search range . The experimental results indicate that a large proportion of global minima are located near the search center, especially at the zero MV (0,0) (average 38%), in the horizontal direction (average 27%), and in the vertical direction (average 18%). To improve coding performance, the early zero-block detection technique is not employed at these search points. In addition, the motion search does not stop immediately when the nearly sufficient condition is satisfied. Instead, the diamond search is performed to find a smaller SAD. The improved algorithm is illustrated in Figure 5, and summarized as follows.

Figure 5: Early zero-block detection for motion search and DCT/Q.

Step 1. Predict the initial search point.

Step 2. Perform unsymmetrical-cross search.

Step 3. Perform uneven multi-hexagon-grid search. If all satisfy the nearly sufficient condition in (18), the motion search stops in this step and jumps to the diamond search in Step 4.

Step 4. Perform extended hexagon based search. Similarly, if all satisfy the nearly sufficient condition in the hexagon search, the search jumps directly to the diamond search.
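The control flow of the four steps above can be sketched as follows. This is a simplified illustration, not the JM implementation: the search patterns are reduced stand-ins for the real UMHexagonS patterns, `cost` stands for the matching error of a candidate motion vector, and `is_zero_block` stands for the nearly sufficient test applied to that candidate. All of these names and patterns are assumptions made for the sketch.

```python
# Simplified stand-in patterns (assumptions; the real patterns are larger).
CROSS = [(dx, 0) for dx in range(-4, 5) if dx] + \
        [(0, dy) for dy in range(-2, 3) if dy]
MULTI_HEX = [(s * dx, s * dy) for s in (1, 2)
             for dx, dy in [(-4, 0), (4, 0), (-2, 3), (2, 3), (-2, -3), (2, -3)]]
HEXAGON = [(-2, 0), (2, 0), (-1, 2), (1, 2), (-1, -2), (1, -2)]
DIAMOND = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def scan(cost, best, offsets, stop_test):
    """Evaluate offsets around the current best; return (best, stopped)."""
    cx, cy = best[0]
    for dx, dy in offsets:
        mv = (cx + dx, cy + dy)
        c = cost(mv)
        if c < best[1]:
            best = (mv, c)
        if stop_test is not None and stop_test(mv):
            return best, True
    return best, False


def improved_search(cost, is_zero_block, start=(0, 0)):
    """Return (best_mv, best_cost) following Steps 1 to 4."""
    best = (start, cost(start))                     # Step 1: predicted start
    best, _ = scan(cost, best, CROSS, None)         # Step 2: no early stop
    best, hit = scan(cost, best, MULTI_HEX, is_zero_block)  # Step 3
    while not hit:                                  # Step 4: hexagon search
        prev = best
        best, hit = scan(cost, best, HEXAGON, is_zero_block)
        if best[0] == prev[0]:
            break
    while True:                                     # diamond refinement
        prev = best
        best, _ = scan(cost, best, DIAMOND, None)
        if best[0] == prev[0]:
            return best
```

The zero-block test is deliberately not applied in Steps 1 and 2, where the search center and the horizontal and vertical directions are examined, and a positive test in Step 3 or Step 4 only skips ahead to the diamond refinement instead of ending the search outright.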

The average PSNR loss, bit-rate increment, and ME time saving of the improved algorithm versus QP are also compared with the UMHexagonS algorithm and tabulated in Table 6. As shown, the ME computation is greatly reduced, by up to 55%, while a very good rate-distortion performance is maintained. A gain of 0.128 dB in PSNR can be obtained for the improved algorithm on the foreman sequence for with a slight increase in computation, compared to the conventional early zero-block detection method.

Table 6: PSNR loss, bit-rate and ME time saving.

5. Conclusion

In this paper, we modified the early termination of the UMHexagonS algorithm to avoid the severe performance degradation at high QP. In addition, we derived a nearly sufficient condition for zero-block detection of DCT coefficients after quantization, based upon the ensemble average of all DCT coefficients. The nearly sufficient condition for zero-block detection is shown to have excellent zero-block detecting capability, while improper zero-block detection is negligible. The early zero-block detection approach with a nearly sufficient condition (threshold ) was then applied to both motion search and DCT/Q calculation in a fast motion estimation algorithm (the UMHexagonS algorithm). The simulation results reveal that a significant reduction in computation (up to 55%) can be achieved with negligible performance degradation compared to the UMHexagonS algorithm.

References

  1. T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, “Overview of the H.264/AVC video coding standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560–576, 2003.
  2. Z. Chen, J. Xu, Y. He, and J. Zheng, “Fast integer-pel and fractional-pel motion estimation for H.264/AVC,” Journal of Visual Communication and Image Representation, vol. 17, no. 2, pp. 264–290, 2006.
  3. Z. Xie, Y. Liu, J. Liu, and T. Yang, “A general method for detecting all-zero-blocks prior to DCT and quantization,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 2, pp. 237–241, 2007.
  4. J.-F. Yang, S.-C. Chang, and C.-Y. Chen, “Computation reduction for motion search in low rate video coders,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 10, pp. 948–951, 2002.
  5. L. Yang, K. Yu, J. Li, and S. Li, “An effective variable block-size early termination algorithm for H.264 video coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 6, pp. 784–788, 2005.
  6. L. A. Sousa, “General method for eliminating redundant computations in video coding,” Electronics Letters, vol. 36, no. 4, pp. 306–307, 2000.
  7. Y. H. Moon, G. Y. Kim, and J. H. Kim, “An improved early detection algorithm for all-zero blocks in H.264 video encoding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 8, pp. 1053–1057, 2005.
  8. D. Wu, K. P. Lim, T. K. Chiew, J. Y. Tham, and K. H. Goh, “An adaptive thresholding technique for the detection of all-zeros blocks in H.264,” in Proceedings of the IEEE International Conference on Image Processing (ICIP '07), vol. 5, pp. 329–332, San Antonio, Tex, USA, September 2007.