Table of Contents Author Guidelines Submit a Manuscript
Journal of Electrical and Computer Engineering
Volume 2016 (2016), Article ID 5091519, 11 pages
Research Article

Hardware Efficient Architecture with Variable Block Size for Motion Estimation

1Sarvajanik College of Engineering and Technology, Surat, India
2S V National Institute of Technology, Surat, India

Received 27 July 2016; Revised 2 November 2016; Accepted 27 November 2016

Academic Editor: Jar Ferr Yang

Copyright © 2016 Nehal N. Shah et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Video coding standards such as MPEG-x and H.26x incorporate variable block size motion estimation (VBSME) which is highly time consuming and extremely complex from hardware implementation perspective due to huge computation. In this paper, we have discussed basic aspects of video coding and studied and compared existing architectures for VBSME. Various architectures with different pixel scanning pattern give a variety of performance results for motion vector (MV) generation, showing tradeoff between macroblock processed per second and resource requirement for computation. Aim of this paper is to design VBSME architecture which utilizes optimal resources to minimize chip area and offer adequate frame processing rate for real time implementation. Speed of computation can be improved by accessing 16 pixels of base macroblock of size 4 × 4 in single clock cycle using z scanning pattern. Widely adopted cost function for hardware implementation known as sum of absolute differences (SAD) is used for VBSME architecture with multiplexer based absolute difference calculator and partial summation term reduction (PSTR) based multioperand adders. Device utilization of proposed implementation is only 22k gates and it can process 179 HD (1920 × 1080) resolution frames in best case and 47 HD resolution frames in worst case per second. Due to such higher throughput design is well suitable for real time implementation.