Table of Contents Author Guidelines Submit a Manuscript
Journal of Electrical and Computer Engineering
Volume 2015 (2015), Article ID 902591, 20 pages
Research Article

Performance Analysis of Homogeneous On-Chip Large-Scale Parallel Computing Architectures for Data-Parallel Applications

1College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
2Department of Electronic Systems, KTH-Royal Institute of Technology, Kista, 16440 Stockholm, Sweden
3Institute of Computer Technology, Vienna University of Technology, 1040 Vienna, Austria

Received 28 August 2014; Revised 18 January 2015; Accepted 18 January 2015

Academic Editor: Dimitrios Soudris

Copyright © 2015 Xiaowen Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


On-chip computing platforms are evolving from single-core bus-based systems to many-core network-based systems, which are referred to as On-chip Large-scale Parallel Computing Architectures (OLPCs) in the paper. Homogenous OLPCs feature strong regularity and scalability due to its identical cores and routers. Data-parallel applications have their parallel data subsets that are handled individually by the same program running in different cores. Therefore, data-parallel applications are able to obtain good speedup in homogenous OLPCs. The paper addresses modeling the speedup performance of homogeneous OLPCs for data-parallel applications. When establishing the speedup performance model, the network communication latency and the ways of storing data of data-parallel applications are modeled and analyzed in detail. Two abstract concepts (equivalent serial packet and equivalent serial communication) are proposed to construct the network communication latency model. The uniform and hotspot traffic models are adopted to reflect the ways of storing data. Some useful suggestions are presented during the performance model’s analysis. Finally, three data-parallel applications are performed on our cycle-accurate homogenous OLPC experimental platform to validate the analytic results and demonstrate that our study provides a feasible way to estimate and evaluate the performance of data-parallel applications onto homogenous OLPCs.