Research Article
A Novel CSR-Based Sparse Matrix-Vector Multiplication on GPUs
Table 3
Comparison of PCSRI and PCSRII without communication on two GPUs.
| Matrix | ET (GPU) | PCSRI ET (2 GPUs) | PCSRI SD (2 GPUs) | PCSRI PE (2 GPUs) | PCSRII ET (2 GPUs) | PCSRII SD (2 GPUs) | PCSRII PE (2 GPUs) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2cubes_sphere | 0.4444 | 0.2670 | 0.0178 | 83.21 | 0.2640 | 0.0156 | 84.17 |
| scircuit | 0.3484 | 0.2413 | 0.0322 | 72.20 | 0.2250 | 0.0207 | 77.41 |
| Ga41As41H72 | 4.2387 | 2.3084 | 0.0446 | 91.81 | 2.3018 | 0.0432 | 92.07 |
| F1 | 6.5544 | 3.8865 | 0.7012 | 84.32 | 3.5710 | 0.2484 | 91.77 |
| ASIC_680ks | 0.8196 | 0.4567 | 0.0126 | 89.72 | 0.4566 | 0.0021 | 89.74 |
| ecology2 | 1.2321 | 0.6665 | 0.0140 | 92.42 | 0.6654 | 0.0152 | 92.58 |
| Hamrle3 | 1.7684 | 0.9651 | 0.0478 | 91.61 | 0.9208 | | 96.02 |
| thermal2 | 2.0708 | 1.0559 | 0.0056 | 98.06 | 1.0558 | 0.0045 | 98.06 |
| cage14 | 5.9177 | 3.4757 | 0.5417 | 85.13 | 3.1548 | 0.0458 | 93.78 |
| Transport | 4.7305 | 2.4665 | 0.0391 | 95.89 | 2.4655 | 0.0407 | 95.93 |
| G3_circuit | 1.9731 | 1.0485 | 0.0364 | 94.08 | 1.1061 | 0.1148 | 89.18 |
| kkt_power | 4.3465 | 2.7916 | 0.7454 | 77.85 | 2.2252 | 0.0439 | 97.66 |
| CurlCurl_4 | 5.1605 | 2.7107 | 0.0347 | 95.18 | 2.7075 | 0.0244 | 95.30 |
| memchip | 3.8257 | 2.1905 | 0.3393 | 87.32 | 2.0975 | 0.2175 | 91.19 |
| Freescale1 | 5.0524 | 3.0235 | 0.5719 | 83.55 | 2.8175 | 0.2811 | 89.66 |
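The column abbreviations are not defined in this excerpt. Assuming ET is the execution time and PE is the parallel efficiency in percent, the PE columns appear consistent with PE = 100 × ET(1 GPU) / (2 × ET(2 GPUs)). The following minimal sketch recomputes PE for a few rows of Table 3 under that assumption; the function name `parallel_efficiency` and the choice of rows are illustrative and do not come from the article.

```python
# Sketch only. Assumption (not stated in this excerpt): PE is parallel
# efficiency in percent, PE = 100 * ET_single / (n_gpus * ET_multi).

def parallel_efficiency(et_single, et_multi, n_gpus=2):
    """Return the parallel efficiency in percent under the assumed formula."""
    return 100.0 * et_single / (n_gpus * et_multi)

# (ET on one GPU, PCSRI ET on two GPUs, PCSRII ET on two GPUs), from Table 3.
rows = {
    "2cubes_sphere": (0.4444, 0.2670, 0.2640),
    "thermal2": (2.0708, 1.0559, 1.0558),
    "kkt_power": (4.3465, 2.7916, 2.2252),
}

for name, (et_gpu, et_pcsr1, et_pcsr2) in rows.items():
    pe1 = parallel_efficiency(et_gpu, et_pcsr1)
    pe2 = parallel_efficiency(et_gpu, et_pcsr2)
    print(f"{name}: PCSRI PE = {pe1:.2f}, PCSRII PE = {pe2:.2f}")
```

Because the tabulated ET values are rounded to four decimals, the recomputed figures can differ from the printed PE values in the last digit.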