Research Article

Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC

Figure 2

Test problem 1, 32 threads spread over two nodes. (a) Per-thread communication volumes required by the three transformed UPC implementations with BLOCKSIZE = 65536. (b) Per-thread communication volumes associated with UPCv3 for different values of BLOCKSIZE.