Research Article
A Novel CSR-Based Sparse Matrix-Vector Multiplication on GPUs
Table 6
Comparison of PCSRI and PCSRII with communication on four GPUs.
| Matrix | ET (GPU) | PCSRI (4 GPUs) ET | PCSRI (4 GPUs) SD | PCSRI (4 GPUs) PE | PCSRII (4 GPUs) ET | PCSRII (4 GPUs) SD | PCSRII (4 GPUs) PE |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2cubes_sphere | 0.4444 | 0.1567 | 0.0052 | 70.89 | 0.1531 | 0.0028 | 72.54 |
| scircuit | 0.3484 | 0.1544 | 0.0204 | 56.39 | 0.1495 | 0.0073 | 58.27 |
| Ga41As41H72 | 4.2387 | 1.7157 | 0.7909 | 61.76 | 1.4154 | 0.2178 | 74.87 |
| F1 | 6.5544 | 2.1149 | 0.3833 | 77.48 | 2.0022 | 0.1941 | 81.84 |
| ASIC_680ks | 0.8196 | 0.3449 | 0.0187 | 59.39 | 0.3423 | 0.0147 | 59.87 |
| ecology2 | 1.2321 | 0.4257 | 0.0048 | 72.35 | 0.4257 | 0.0056 | 72.35 |
| Hamrle3 | 1.7684 | 0.6231 | 0.0087 | 70.95 | 0.6297 | 0.0085 | 70.21 |
| thermal2 | 2.0708 | 0.6922 | 0.0267 | 74.78 | 0.6959 | 0.0269 | 74.39 |
| cage14 | 5.9177 | 1.9339 | 0.3442 | 76.50 | 1.6417 | 0.0067 | 90.12 |
| Transport | 4.7305 | 1.3323 | 0.0279 | 88.77 | 1.3217 | 0.0070 | 89.48 |
| G3_circuit | 1.9731 | 0.7234 | 0.0408 | 68.19 | 0.7458 | 0.0620 | 66.14 |
| kkt_power | 4.3465 | 1.7277 | 0.5495 | 62.89 | 1.3791 | 0.0305 | 78.79 |
| CurlCurl_4 | 5.1605 | 1.5065 | 0.0253 | 85.63 | 1.5004 | 0.8789 | 85.99 |
| memchip | 3.8257 | 1.3804 | 0.1768 | 69.29 | 1.3051 | 0.1029 | 73.28 |
| Freescale1 | 5.0524 | 2.0711 | 0.4342 | 60.98 | 1.8193 | 0.2262 | 69.43 |