International Journal of Reconfigurable Computing
Volume 2010, Article ID 454506, 11 pages
Research Article

Parameterized Hardware Design on Reconfigurable Computers: An Image Processing Case Study

1 Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, AR 72701, USA
2 Department of Electrical and Computer Engineering, The George Washington University, Washington, DC 20052, USA
3 Arctic Region Supercomputing Center, University of Alaska Fairbanks, Fairbanks, AK 99775, USA

Received 1 July 2009; Accepted 10 February 2010

Academic Editor: Elías Todorovich

Copyright © 2010 Miaoqing Huang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Reconfigurable computers (RCs) with hardware (FPGA) co-processors can achieve significant performance improvements over traditional microprocessor-based computers for many scientific applications. The attainable speedup depends on the intrinsic parallelism of the target application as well as on the characteristics of the target platform. In this work, we use image processing applications as a case study to demonstrate how hardware designs are parameterized by the co-processor architecture, particularly its data I/O, that is, the local memory of the FPGA device and the interconnect between the FPGA and the microprocessor. Applications that access data randomly must use the local memory; image registration is a typical example. By contrast, an application such as edge detection can read data directly through the interconnect in a sequential fashion. Two image registration algorithms, the exhaustive search algorithm and the Discrete Wavelet Transform (DWT)-based search algorithm, are implemented in hardware on the Xilinx Virtex-II Pro 50 FPGA of the Cray XD1 reconfigurable computer. The performance improvements of hardware implementations are and , respectively. Regarding the category of applications that directly access the interconnect, the hardware implementation of Canny edge detection can achieve speedup.
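The contrast between the two data-access patterns described in the abstract can be illustrated with a minimal software sketch. It is not the hardware design reported in this article. The function names, the restriction to pure translations, the sum-of-squared-differences metric, and the simple forward-difference gradient are all assumptions made for illustration. The registration routine revisits the whole reference image at shifted addresses for every candidate, whereas the gradient routine consumes pixels once, in raster order, keeping only the previous pixel.

```python
# Illustrative sketch only: names, similarity metric, and search space are
# assumptions for this example, not details taken from the article.

def shift(image, dx, dy, fill=0):
    """Return a copy of `image` translated by (dx, dy); uncovered pixels get `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx  # source address differs from raster order
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = image[sy][sx]
    return out


def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def exhaustive_register(reference, target, max_shift):
    """Try every translation in [-max_shift, max_shift]^2 and return the best (dx, dy).

    Every candidate re-reads the reference image at shifted addresses, so the
    full image has to stay resident in addressable memory.
    """
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = ssd(shift(reference, dx, dy), target)
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best[1], best[2]


def stream_horizontal_gradient(pixels, width):
    """Consume pixels in raster order and yield |p[x] - p[x-1]| for each pixel.

    The first pixel of every row yields 0. Only the previous pixel is stored,
    so the input can arrive as a sequential stream.
    """
    prev = 0
    for i, p in enumerate(pixels):
        yield 0 if i % width == 0 else abs(p - prev)
        prev = p


if __name__ == "__main__":
    ref = [
        [0, 0, 0, 0, 0],
        [0, 9, 3, 0, 0],
        [0, 5, 7, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    tgt = shift(ref, 2, -1)
    print(exhaustive_register(ref, tgt, 2))
    print(list(stream_horizontal_gradient([1, 4, 2, 7, 7, 3], 3)))
```

In this sketch the registration loop touches every reference pixel once per candidate shift, while the streaming routine touches each input pixel exactly once. This mirrors, in simplified form, why the first class of application needs local memory and the second can be fed directly from the interconnect.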