Research Article
Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC
Figure 2
Test problem 1 (), 32 threads spread over two nodes. (a) Per-thread communication volumes required by the three transformed UPC implementations with BLOCKSIZE = 65536. (b) Per-thread communication volumes associated with UPCv3 for different values of BLOCKSIZE.