Research Article
Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC
Listing 2
A naive UPC implementation of SpMV using a modified EllPack storage format.
    /* Total number of blocks in every shared array (ceiling of n/BLOCKSIZE).
       The parentheses around the conditional are required: '+' binds more
       tightly than '?:', so without them the expression yields only 0 or 1. */
    int nblks = n / BLOCKSIZE + ((n % BLOCKSIZE) ? 1 : 0);

    /* Allocation of five shared arrays */
    shared [BLOCKSIZE] double *x = upc_all_alloc(nblks, BLOCKSIZE * sizeof(double));
    shared [BLOCKSIZE] double *y = upc_all_alloc(nblks, BLOCKSIZE * sizeof(double));
    shared [BLOCKSIZE] double *D = upc_all_alloc(nblks, BLOCKSIZE * sizeof(double));
    shared [rnz * BLOCKSIZE] double *A = upc_all_alloc(nblks, rnz * BLOCKSIZE * sizeof(double));
    shared [rnz * BLOCKSIZE] int *J = upc_all_alloc(nblks, rnz * BLOCKSIZE * sizeof(int));
    // …

    /* Computation of SpMV involving all threads; iteration i is executed
       by the thread that has affinity to y[i] */
    upc_forall (int i = 0; i < n; i++; &y[i]) {
        double tmp = 0.0;
        for (int j = 0; j < rnz; j++)
            tmp += A[i * rnz + j] * x[J[i * rnz + j]];
        y[i] = D[i] * x[i] + tmp;
    }
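To make the data layout in the listing concrete, the following is a minimal serial sketch in plain C, not part of the original article. It assumes the standard UPC block-cyclic rule, under which element i of an array declared `shared [B]` has affinity to thread (i / B) mod THREADS. The helper names `num_blocks`, `owner_thread`, and `spmv_ellpack_serial` are hypothetical and introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: number of blocks needed to cover n elements,
   i.e. the ceiling of n / blocksize, as computed for nblks in the listing. */
static int num_blocks(int n, int blocksize)
{
    assert(blocksize > 0);
    return n / blocksize + ((n % blocksize) ? 1 : 0);
}

/* Hypothetical helper: thread with affinity to element i of an array
   declared "shared [blocksize]", assuming block-cyclic distribution. */
static int owner_thread(int i, int blocksize, int nthreads)
{
    assert(blocksize > 0 && nthreads > 0);
    return (i / blocksize) % nthreads;
}

/* Serial reference for the loop body of the listing:
   y[i] = D[i] * x[i] + sum_j A[i*rnz + j] * x[J[i*rnz + j]].
   D holds the diagonal; A and J hold rnz off-diagonal values and
   their column indices per row. */
static void spmv_ellpack_serial(int n, int rnz,
                                const double *D, const double *A,
                                const int *J, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double tmp = 0.0;
        for (int j = 0; j < rnz; j++)
            tmp += A[i * rnz + j] * x[J[i * rnz + j]];
        y[i] = D[i] * x[i] + tmp;
    }
}
```

Under this rule, row i of A and J starts at index i * rnz, which lies in block i / BLOCKSIZE of an array with block size rnz * BLOCKSIZE. The accesses to y[i], D[i], x[i], A[i * rnz + j], and J[i * rnz + j] are therefore local to the thread executing iteration i, and only the indirectly indexed x[J[i * rnz + j]] may refer to another thread's memory.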