Figure 1: Comparison between per-thread predictions and measurements of , , and for UPCv3. Test problem 1 (), 32 threads spread over two nodes, with BLOCKSIZE = 65536.