Mathematical Problems in Engineering

Volume 2016 (2016), Article ID 1725051, 11 pages

http://dx.doi.org/10.1155/2016/1725051

## Fractal Video Coding Using Fast Normalized Covariance Based Similarity Measure

Center for VLSI and Nanotechnology, Department of Electronics Engineering, Visvesvaraya National Institute of Technology, Nagpur 440010, India

Received 18 July 2016; Revised 13 October 2016; Accepted 1 November 2016

Academic Editor: Yakov Strelniker

Copyright © 2016 Ravindra E. Chaudhari and Sanjay B. Dhok. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A fast normalized covariance based similarity measure for fractal video compression with quadtree partitioning is proposed in this paper. To increase the speed of fractal encoding, a simplified expression of the covariance between the range block and the overlapped domain blocks within a search window is implemented in the frequency domain. All the covariance coefficients are normalized by the standard deviation of the overlapped domain blocks, and these are efficiently calculated in one computation by using two different approaches, namely, FFT based and sum table based. The results of these two approaches are compared and are almost equal in all aspects except the memory requirement. Based on the proposed simplified similarity measure, the gray level transformation parameters are computationally modified, and isometry transformations are performed using the rotation/reflection properties of the IFFT. Quadtree decomposition is used to partition the largest range blocks, of size 16 × 16, based on a target level of motion compensated prediction error. Experimental results show that the proposed method increases the encoding speed and compression ratio by 66.49% and 9.58%, respectively, compared to the NHEXS method, with an increase in PSNR of 0.41 dB. Compared to H.264, the proposed method saves 20% of the compression time with marginal variation in PSNR and compression ratio.

#### 1. Introduction

Emerging multimedia applications such as video conferencing, video over mobile phones, video email, and wireless communications require an effective video coding standard to achieve a low bit rate with good quality. The performance of a video coding standard depends on parameters such as the quality of the reconstructed video, the compression ratio, and the encoding time. Fractal based video compression [1] is an alternative to the existing video standards (MPEG, H.263, H.264) for achieving a high compression ratio with good quality reconstructed video. Recently, various researchers have proposed different algorithms to improve the fractal encoding speed.

Jacquin [2] proposed an innovative image compression technique based on the fractal theory of iterated function systems. It reduces the affine redundancy of an image by using its self-similarity properties. Video sequences contain temporal redundancies between consecutive frames that can be removed by a fractal based technique. Fractal coding has received the attention of researchers due to its advantages of resolution independence, high decoding speed, and high compression ratio [3, 4]. However, high encoding time is the main drawback of fractal based methods, which makes them unsuitable for real time applications. The motivation for using this method is that it gives a high compression ratio with good quality output, which is useful for the storage and transmission of bulky videos. To increase the encoding speed while preserving these advantages, a fast fractal video coder system is proposed.

Cube-based [5, 6] and frame based [7] fractal compression methods are frequently used for video compression. In cube-based compression, the video is divided into groups of frames, each of which in turn is partitioned into three-dimensional (3D) domain and range blocks; however, it has high computational complexity and a low compression ratio. In frame based compression, each frame is encoded using the previous frame as a domain pool; this can achieve a high compression ratio, but it introduces error that spreads between the frames. Wang proposed a fixed block size hybrid compression algorithm [8] and an adaptive partition instead of a fixed-size partition [9], which merge the advantages of the cube-based and frame based fractal compression methods. Another hybrid coder scheme, which combines neighborhood vector quantization with fractal coding to compress the video as a 3D volume, was proposed by Yao and Wilson [10]. A 3D searchless fractal approach [11], prediction of the error frame for low bit rate video [12], and wavelet transform based video coding approaches [13, 14] have also been considered for video compression.

Circular prediction mapping (CPM) and noncontractive interframe mapping (NCIM) were proposed by Kim et al. [15] to combine the fractal sequence coder with the well-known motion estimation/motion compensation algorithm that exploits the high temporal correlations between the frames. Fractal video coding using a new cross hexagon search (NHEXS) algorithm was proposed in [16] to increase the motion estimation speed when searching stationary and quasi-stationary blocks. According to [17, 18], the regions can be defined by a previously computed segmentation map and are encoded independently using an NHEXS based searching technique. A new object-based method [19] was introduced in the transform domain using shape-adaptive DCT for stereo video compression. Zhu et al. proposed an automatic region-based video coder [20] with an asymmetrical hexagon searching algorithm and a deblocking loop filter to improve the quality of the decompressed video. A high efficiency fractal multiview codec is presented in [21] to encode the anchor viewpoint video using intraprediction modes and a fractal coder with a motion compensation technique. The three-step search algorithm is modified in [22] using two cross search and two cross hexagon search patterns to implement a fractal video coder.

Block based motion estimation and motion compensation algorithms exploit the high temporal correlations between adjacent frames. In frame based fractal video coding, range and domain blocks need to be matched with a proper selection of geometrical transformation, scaling, and luminance factors. Normalized covariance, that is, Zero Mean Normalized Cross Correlation (ZNCC), is a method for determining the structural similarity between two blocks of an image [23]. The best matched domain block, having a high normalized cross correlation [24, 25] value, may still have a large average gray level difference. This difference is reduced to zero or a very small value by selecting proper fractal encoding parameters. However, the direct computation of ZNCC for every range block is computationally very expensive. The sum table based method significantly reduces the computational complexity of ZNCC. A sum table is a precalculated discrete running sum structure over the entire image that acts as a look-up table for calculating the sum over any block of a given size.
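The look-up behaviour of a sum table can be illustrated with a minimal NumPy sketch. The function names `sum_table`, `block_sum`, and `block_variance` are hypothetical and chosen only for illustration; the zero-padded table layout is an assumption, not necessarily the layout used by the authors.

```python
import numpy as np

def sum_table(image):
    """Running-sum table with an extra zero row and column at the top-left."""
    table = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.float64)
    table[1:, 1:] = np.cumsum(np.cumsum(image, axis=0, dtype=np.float64), axis=1)
    return table

def block_sum(table, top, left, height, width):
    """Sum of image[top:top+height, left:left+width] from four table look-ups."""
    bottom, right = top + height, left + width
    return (table[bottom, right] - table[top, right]
            - table[bottom, left] + table[top, left])

def block_variance(table, table_sq, top, left, height, width):
    """Block variance from a table of pixel values and a table of squared values."""
    n = height * width
    s1 = block_sum(table, top, left, height, width)
    s2 = block_sum(table_sq, top, left, height, width)
    return s2 / n - (s1 / n) ** 2
```

Once the two tables are built, the sum and variance of any block cost a constant number of look-ups regardless of block size, which is what makes the normalization term of ZNCC cheap to evaluate for every overlapped block.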

In this paper, a fast fractal based video coder is proposed using the normalized covariance algorithm as a similarity measure. It uses three levels of quadtree partition for motion estimation, which adapts the block size to the variation in picture content and helps to improve the compression ratio. The covariance between the range block and all domain blocks is simplified and computed in one pass using the FFT algorithm. The computational complexity of the scaling factor and brightness factor is also reduced with a new simple expression based on the normalized covariance concept instead of the traditional mean square error (MSE). The speed of the fractal encoding process is further increased by three steps: normalization of the covariance component using either the FFT based or the sum table based method, the eight isometry transformations performed using 2D IFFT properties, and an early search termination technique. The performance of video compression using the FFT based and sum table based methods is verified separately, and the two are nearly equal. These techniques can be used to improve the subjective quality of the video and the coding efficiency.
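The idea of obtaining the covariance for all block positions in one computation can be sketched as follows. Because the zero-mean range block sums to zero, the local mean of the search window drops out of the covariance numerator, which leaves a plain cross-correlation that a single FFT product evaluates for every shift. This is a minimal sketch of that general identity, not the authors' exact formulation; the function names and the use of NumPy's real FFT are assumptions.

```python
import numpy as np

def covariance_surface_fft(window, block):
    """Unnormalized covariance between `block` and every block-sized patch of `window`."""
    wh, ww = window.shape
    bh, bw = block.shape
    # Zero-mean block, zero-padded to the window size.
    kernel = np.zeros((wh, ww))
    kernel[:bh, :bw] = block - block.mean()
    # Circular cross-correlation via the correlation theorem.
    spectrum = np.fft.rfft2(window) * np.conj(np.fft.rfft2(kernel))
    corr = np.fft.irfft2(spectrum, s=window.shape)
    # Keep only the shifts where the block lies fully inside the window.
    return corr[:wh - bh + 1, :ww - bw + 1]

def covariance_surface_direct(window, block):
    """Reference implementation with explicit loops, for comparison only."""
    wh, ww = window.shape
    bh, bw = block.shape
    zero_mean_block = block - block.mean()
    out = np.zeros((wh - bh + 1, ww - bw + 1))
    for u in range(out.shape[0]):
        for v in range(out.shape[1]):
            patch = window[u:u + bh, v:v + bw]
            out[u, v] = np.sum((patch - patch.mean()) * zero_mean_block)
    return out
```

Both functions return the same surface up to floating-point error; the FFT version replaces the nested loops over shifts with two forward transforms and one inverse transform.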

The rest of the paper is organized as follows. The basic fractal block coding for the image is described in Section 2. Normalized covariance based motion estimation and quadtree partition are explained in Section 3. Fast fractal video coding using FFT is presented in Section 4. The experimental results and comparative study of the proposed algorithm with existing algorithms are presented in Section 5. The conclusion is outlined in Section 6.

#### 2. Fractal Image Coding Theory

Fractal image coding is based on the theory of the partitioned iterated function system (PIFS) [2]. A PIFS consists of a set of contractive transformations; when these transformations are applied iteratively to an arbitrary image, the result converges to an approximation of the original image. The image is stored as a collection of transformations, which results in image compression.

The original image is initially partitioned into nonoverlapping range blocks ($R$), each of size $B \times B$. Similarly, the same image is partitioned into overlapping domain blocks ($D$), each of size $2B \times 2B$, which form a domain pool with a one pixel shift in the horizontal and vertical directions. For each range block, the best matching domain block is located in the domain pool, and a contractive mapping is then applied which minimizes the MSE between the range block and the contracted domain block. A range-domain mapping consists of three operations [3] applied sequentially to each domain block of size $2B \times 2B$: (1) spatial contraction of the domain block ($D$) by downsampling or by averaging the four neighboring pixels of each disjoint group, forming a block ($\hat{D}$) of size $B \times B$; (2) taking the 8 geometrical transformations of each block, which include 4 rotations in steps of 90 degrees and 4 mirror reflections; (3) for each geometrically transformed block, performing a contractive affine transformation on the grayscale values and selecting the parameters which give the lowest MSE. The error between the range block ($R$) and one of the contracted domain blocks ($\hat{D}$) is measured by (1), and the scaling factor "$s$" and brightness factor "$o$" of the affine transformation are calculated by (2) and (3), respectively:

$$E(R, \hat{D}) = \frac{1}{n} \sum_{i,j} \left( s\, d_{i,j} + o - r_{i,j} \right)^{2} \tag{1}$$

$$s = \frac{n \sum_{i,j} r_{i,j}\, d_{i,j} - \sum_{i,j} r_{i,j} \sum_{i,j} d_{i,j}}{n \sum_{i,j} d_{i,j}^{2} - \left( \sum_{i,j} d_{i,j} \right)^{2}} \tag{2}$$

$$o = \frac{1}{n} \left( \sum_{i,j} r_{i,j} - s \sum_{i,j} d_{i,j} \right) \tag{3}$$

where $n$ is the number of pixels in the block and $r_{i,j}$, $d_{i,j}$ are the pixel values of the range block and the contracted domain block at coordinates $(i, j)$.

For each range block, the parameters which need to be stored as fractal encoded data are the coordinates of the domain block, $s$, $o$, and the geometric transformation index. The gray level transformation parameters "$s$" and "$o$" should be in the range of −1.2 to 1.2 and −255 to 255, respectively [3], to make sure that the transformation is contractive. At the decoder, these fractal parameters are applied iteratively to an arbitrary initial image according to the encoding block size, which converges to a reconstruction of the original image after a certain number of iterations.
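The three operations of the range-domain mapping can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the function names are hypothetical, the numbering of the 8 isometries is arbitrary, the scaling and brightness factors are obtained by the standard least-squares fit, and clipping them to the stated ranges is an assumed way of enforcing those limits.

```python
import numpy as np

def contract(domain):
    """Average each disjoint 2x2 pixel group, halving the block in each dimension."""
    h, w = domain.shape
    return domain.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def isometries(block):
    """Yield (index, block) for 4 rotations and their 4 mirror reflections."""
    for k in range(4):
        rotated = np.rot90(block, k)
        yield k, rotated
        yield k + 4, np.fliplr(rotated)

def fit_gray_levels(range_block, domain_block):
    """Least-squares scaling s and brightness o, plus the resulting MSE."""
    r = np.asarray(range_block, dtype=np.float64).ravel()
    d = np.asarray(domain_block, dtype=np.float64).ravel()
    n = r.size
    denom = n * np.dot(d, d) - d.sum() ** 2
    s = 0.0 if denom == 0 else (n * np.dot(r, d) - r.sum() * d.sum()) / denom
    s = float(np.clip(s, -1.2, 1.2))          # assumed clipping to the stated range
    o = float(np.clip((r.sum() - s * d.sum()) / n, -255.0, 255.0))
    mse = float(np.mean((s * d + o - r) ** 2))
    return s, o, mse

def best_mapping(range_block, domain_block):
    """Return (isometry index, s, o, mse) with the lowest MSE for one domain block."""
    contracted = contract(domain_block)
    candidates = ((idx,) + fit_gray_levels(range_block, iso)
                  for idx, iso in isometries(contracted))
    return min(candidates, key=lambda c: c[3])
```

A full encoder would call `best_mapping` for every domain block in the pool and keep the overall minimum, which is the exhaustive search that makes fractal encoding slow.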

#### 3. Normalized Covariance Based Motion Estimation

The ZNCC is a recognized similarity measure and is considered one of the most accurate motion estimators in video compression. In fractal video coding, the best matched domain block is decided after applying the affine transformation which gives the least mean square error. The ZNCC method is robust under uniform illumination changes and less sensitive to noise; hence it can help to increase the coding efficiency and improve the subjective visual quality of the output video. For structurally identical regions, it may give a high ZNCC value for the best match but with a large average gray level difference. This difference is minimized by selecting accurate luminance and geometrical transformation parameters. These fractal parameters are interpreted as a kind of motion compensation technique because no error frame is available. In the discrete domain, the range block ($R$) of the current frame is shifted pixel by pixel across the search window ($S$) of the reference frame. The estimation of motion relies on the detection of the maximum of the ZNCC function between $R$ and $S$. The ZNCC is defined as

$$\gamma(u, v) = \frac{\sum_{x,y} \left[ S(x + u, y + v) - \bar{S}_{u,v} \right] \left[ R(x, y) - \bar{R} \right]}{\sqrt{\sum_{x,y} \left[ S(x + u, y + v) - \bar{S}_{u,v} \right]^{2} \sum_{x,y} \left[ R(x, y) - \bar{R} \right]^{2}}} \tag{4}$$

In (4), $\bar{S}_{u,v}$ denotes the mean value of $S$ within the area of the range block shifted to the $(u, v)$ position and, similarly, $\bar{R}$ denotes the mean value of the range block $R$. $\gamma$ represents the ZNCC surface matrix between the current macroblock and the reference search window. If the search range is $\pm p$ pixels in both directions from the same location of $R$, then the size of $\gamma$ will be $(2p + 1) \times (2p + 1)$. Fractal encoding is itself a complex method, and using the ZNCC function to estimate the motion vector can make the combined technique computationally expensive and time consuming. To overcome these complexity problems, an efficient method of ZNCC calculation is proposed, which is based on new optimal gray level transformation parameters (scaling and brightness factors).
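A direct, unoptimized evaluation of the ZNCC surface makes the cost of the search concrete. The sketch below is only a reference implementation of the standard ZNCC definition; the function names, the search range parameter `p`, and the requirement that the search window lies entirely inside the reference frame are assumptions made for illustration.

```python
import numpy as np

def zncc_surface(window, block, eps=1e-12):
    """ZNCC between `block` and every block-sized patch of `window`."""
    wh, ww = window.shape
    bh, bw = block.shape
    r0 = block - block.mean()
    r_norm = np.sqrt(np.sum(r0 ** 2))
    surface = np.zeros((wh - bh + 1, ww - bw + 1))
    for u in range(surface.shape[0]):
        for v in range(surface.shape[1]):
            patch = window[u:u + bh, v:v + bw]
            p0 = patch - patch.mean()
            denom = np.sqrt(np.sum(p0 ** 2)) * r_norm
            if denom > eps:                   # flat patches keep a score of 0
                surface[u, v] = np.sum(p0 * r0) / denom
    return surface

def estimate_motion(reference, current, top, left, size, p):
    """Return ((dy, dx), score) for the block at (top, left) in `current`.

    Assumes the search window of +/- p pixels fits inside `reference`.
    """
    block = current[top:top + size, left:left + size].astype(np.float64)
    window = reference[top - p:top + size + p,
                       left - p:left + size + p].astype(np.float64)
    surface = zncc_surface(window, block)
    u, v = np.unravel_index(np.argmax(surface), surface.shape)
    return (int(u) - p, int(v) - p), float(surface[u, v])
```

Each of the $(2p + 1)^2$ positions requires a pass over the whole block, so the direct form scales with the product of search area and block area; this is the cost that the FFT based and sum table based formulations are meant to avoid. Because the mean is removed and the result is normalized, a match is still found when the current block is a scaled and brightness-shifted copy of the reference.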

##### 3.1. Quadtree Based Partition

The quadtree decomposition method initially partitions the current frame into a set of large (16 × 16 pixel) range blocks at level-1. The motion vector of the matched block is decided after verifying the three highest peak locations of the ZNCC surface matrix. This verification is required because the error between the blocks may vary after the gray level transformation parameters are quantized. If the lowest error of the larger block after quantization is above the prespecified threshold, then the block is partitioned into four quadrants. The partitioning scheme [26] continues recursively until the error becomes smaller than the threshold or the 4 × 4 pixel block size is reached.
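The recursive partitioning rule can be sketched as follows. The callback `block_error` is a hypothetical stand-in for the lowest matching error of a block after quantization; how that error is computed is not part of this sketch.

```python
def quadtree_partition(top, left, size, block_error, threshold, min_size=4):
    """Return the leaf blocks as (top, left, size) tuples.

    `block_error(top, left, size)` is a hypothetical callback that returns
    the best matching error for the given block.
    """
    if size <= min_size or block_error(top, left, size) <= threshold:
        return [(top, left, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_partition(top + dy, left + dx, half,
                                         block_error, threshold, min_size)
    return leaves
```

Starting from a 16 × 16 block with `min_size=4`, the recursion can produce at most three levels (16 × 16, 8 × 8, and 4 × 4), and blocks of the minimum size are accepted regardless of their error.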

Here, three levels of quadtree partitioning are employed, with block sizes from 16 × 16 to 4 × 4 pixels. All subsequent partitions of a 16 × 16 block are represented by one code word, as shown in Figure 1. The code word is either 1 bit or 5 bits long, depending only on the level-1 and level-2 nodes. If the level-1 block is not partitioned, the code word is 1 bit; otherwise it is 5 bits. In both cases, a partitioned block is represented by bit 1 and an unpartitioned block by bit 0, as shown in Figure 1.
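The code word rule can be written out as a small encoder and parser. This is a sketch under one assumption that the text does not settle: the order in which the four level-2 quadrants appear in the code word. The function names are hypothetical.

```python
def partition_codeword(level1_split, level2_splits=(False, False, False, False)):
    """Encode the partition of one 16x16 block as a bit string.

    level1_split: True if the 16x16 block is split into four 8x8 blocks.
    level2_splits: for each 8x8 block (quadrant order is an assumption),
    True if it is further split into four 4x4 blocks.
    """
    if not level1_split:
        return "0"
    if len(level2_splits) != 4:
        raise ValueError("expected one flag per 8x8 quadrant")
    return "1" + "".join("1" if split else "0" for split in level2_splits)

def parse_codeword(bits):
    """Decode a code word at the start of `bits`; return the flags and bits consumed."""
    if bits[0] == "0":
        return False, (False, False, False, False), 1
    return True, tuple(b == "1" for b in bits[1:5]), 5
```

For example, a block whose first and last 8 × 8 quadrants are split again is encoded as `11001`, while an unsplit 16 × 16 block costs a single `0` bit.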