Research Article

Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC

Figure 1

Comparison between per-thread predictions and measurements for UPCv3. Test problem 1, 32 threads spread over two nodes, with BLOCKSIZE = 65536.