Scientific Programming

Special Issue

Exploring Languages for Expressing Medium to Massive On-Chip Parallelism


Open Access

Volume 18 | Article ID 646829

Yili Zheng, "Optimizing UPC Programs for Multi-Core Systems", Scientific Programming, vol. 18, Article ID 646829, 9 pages, 2010.

Optimizing UPC Programs for Multi-Core Systems


The Partitioned Global Address Space (PGAS) model of Unified Parallel C (UPC) can help users express and manage application data locality on non-uniform memory access (NUMA) multi-core shared-memory systems and thereby obtain good performance. First, we describe several UPC program optimization techniques that are important for achieving good performance on NUMA multi-core computers, illustrating them with examples and quantitative performance results. Second, we use two numerical computing kernels, parallel matrix–matrix multiplication and parallel 3-D FFT, to demonstrate the end-to-end development and optimization of UPC applications. Our results show that the optimized UPC programs achieve very good and scalable performance on current multi-core systems and can even outperform vendor-optimized libraries in some cases.

Copyright © 2010 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
