International Journal of Reconfigurable Computing
Volume 2011 (2011), Article ID 514581, 15 pages
http://dx.doi.org/10.1155/2011/514581
Research Article

The Potential for a GPU-Like Overlay Architecture for FPGAs

The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada M5S 3G4

Received 3 August 2010; Accepted 27 December 2010

Academic Editor: Aravind Dasu

Copyright © 2011 Jeffrey Kingyens and J. Gregory Steffan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. J. Koo, D. Fernandez, A. Haddad, and W. Gross, “Evaluation of a high-level-language methodology for high-performance reconfigurable computers,” in Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP '07), pp. 30–35, July 2007.
  2. D. Lau, O. Pritchard, and P. Molson, “Automated generation of hardware accelerators with direct memory access from ANSI/ISO standard C functions,” in Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '06), pp. 45–54, April 2006.
  3. J. L. Tripp, K. D. Peterson, C. Ahrens, J. D. Poznanovic, and M. B. Gokhale, “Trident: an FPGA compiler framework for floating-point algorithms,” in Proceedings of the International Conference on Field Programmable Logic and Applications (FPL '05), pp. 317–322, August 2005.
  4. J. Hensley, “AMD CTM overview,” in Proceedings of the International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '07), ACM, August 2007.
  5. B. Fort, D. Capalija, Z. G. Vranesic, and S. D. Brown, “A multithreaded soft processor for SoPC area reduction,” in Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '06), pp. 131–140, April 2006.
  6. M. Labrecque and J. G. Steffan, “Improving pipelined soft processors with multithreading,” in Proceedings of the International Conference on Field Programmable Logic and Applications (FPL '07), pp. 210–215, August 2007.
  7. R. Moussali, N. Ghanem, and M. A. R. Saghir, “Supporting multithreading in configurable soft processor cores,” in Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES '07), pp. 155–159, October 2007.
  8. P. Yiannacouras, J. G. Steffan, and J. Rose, “Vespa: portable, scalable, and flexible FPGA-based vector processors,” in Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES '08), 2008.
  9. J. Yu, G. Lemieux, and C. Eagleston, “Vector processing as a soft-core CPU accelerator,” in Proceedings of the 16th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA '08), pp. 222–231, February 2008.
  10. M. Labrecque, P. Yiannacouras, and J. G. Steffan, “Scaling soft processor systems,” in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '08), pp. 195–205, April 2008.
  11. W. R. Mark, R. S. Glanville, K. Akeley, and M. J. Kilgard, “Cg: a system for programming graphics hardware in a C-like language,” in Proceedings of the International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '03), pp. 896–907, ACM, New York, NY, USA, 2003.
  12. J. Kingyens and J. G. Steffan, “A GPU-inspired soft processor for high-throughput acceleration,” in Proceedings of the IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum (IPDPSW '10), April 2010.
  13. “Developing FPGA coprocessors for performance-accelerated spacecraft image processing,” Xcell Journal Second Quarter, pp. 22–26, 2008.
  14. O. Mencer, “ASC: a stream compiler for computing with FPGAs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 9, Article ID 1673737, pp. 1603–1617, 2006.
  15. I. Page, “Closing the gap between hardware and software: hardware-software cosynthesis at Oxford,” in Proceedings of the IEE Colloquium on Hardware-Software Cosynthesis for Reconfigurable Systems, pp. 201–211, February 1996, Digest no: 1996/036.
  16. P. Yiannacouras, J. Rose, and J. Gregory Steffan, “The microarchitecture of FPGA-based soft processors,” in Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES '05), pp. 202–212, New York, NY, USA, 2005.
  17. J. Yu, G. Lemieux, and C. Eagleston, “Vector processing as a soft-core CPU accelerator,” in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA '08), pp. 222–231, ACM, New York, NY, USA, 2008.
  18. R. Dimond, O. Mencer, and W. Luk, “Application-specific customisation of multi-threaded soft processors,” IEE Proceedings: Computers and Digital Techniques, vol. 153, no. 3, pp. 173–180, 2006.
  19. P. James-Roxby, P. Schumacher, and C. Ross, “A single program multiple data parallel processing platform for FPGAs,” in Proceedings of the 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '04), pp. 302–303, April 2004.
  20. A. K. Jones, R. Hoare, I. S. Kourtev et al., “A 64-way VLIW/SIMD FPGA architecture and design flow,” in Proceedings of the 11th IEEE International Conference on Electronics, Circuits and Systems (ICECS '04), pp. 499–502, December 2004.
  21. C. E. LaForest and J. G. Steffan, “Efficient multi-ported memories for FPGAs,” in Proceedings of the 18th ACM SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA '10), pp. 41–50, February 2010.
  22. M. A. R. Saghir, M. El-Majzoub, and P. Akl, “Datapath and ISA customization for soft VLIW processors,” in Proceedings of the IEEE International Conference on Reconfigurable Computing and FPGA (ReConFig '06), pp. 1–10, September 2006.
  23. M. Peercy, M. Segal, and D. Gerstmann, “A performance-oriented data parallel virtual machine for GPUs,” in Proceedings of the International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '06), p. 184, ACM, New York, NY, USA, 2006.
  24. W. W. L. Fung, I. Sham, G. Yuan, and T. M. Aamodt, “Dynamic warp formation and scheduling for efficient GPU control flow,” in Proceedings of the 40th Annual International Symposium on Microarchitecture (MICRO '07), pp. 407–418, IEEE Computer Society, Washington, DC, USA, 2007.
  25. D. Slogsnat, A. Giese, and U. Brüning, “A versatile, low latency HyperTransport core,” in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA '07), pp. 45–52, ACM, New York, NY, USA, 2007.
  26. B. Holden, Latency Comparison between HyperTransport and PCI-Express In Communications Systems, World Wide Web Electronic Publication, 2006.
  27. K. Fatahalian, J. Sugerman, and P. Hanrahan, “Understanding the efficiency of GPU algorithms for matrix-matrix multiplication,” in Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, pp. 133–137, ACM, New York, NY, USA, 2004.