Research Article
A Novel CSR-Based Sparse Matrix-Vector Multiplication on GPUs
Table 4
Comparison of PCSRI and PCSRII without communication on four GPUs.
| Matrix | ET (GPU) | PCSRI (4 GPUs): ET | PCSRI (4 GPUs): SD | PCSRI (4 GPUs): PE | PCSRII (4 GPUs): ET | PCSRII (4 GPUs): SD | PCSRII (4 GPUs): PE |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2cubes_sphere | 0.4444 | 0.1560 | 0.0132 | 71.23 | 0.1527 | 0.0111 | 72.78 |
| scircuit | 0.3484 | 0.1453 | 0.0262 | 59.94 | 0.1357 | 0.0130 | 64.17 |
| Ga41As41H72 | 4.2387 | 1.6123 | 0.7268 | 65.72 | 1.3410 | 0.1846 | 79.02 |
| F1 | 6.5544 | 2.5240 | 0.6827 | 64.92 | 1.9121 | 0.1900 | 85.69 |
| ASIC_680ks | 0.8196 | 0.2944 | 0.0298 | 69.59 | 0.2887 | 0.0264 | 70.98 |
| ecology2 | 1.2321 | 0.3593 | 0.0160 | 85.72 | 0.3554 | 0.0141 | 86.67 |
| Hamrle3 | 1.7684 | 0.5114 | 0.0307 | 86.45 | 0.4775 | 0.0125 | 92.59 |
| thermal2 | 2.0708 | 0.5553 | 0.0271 | 93.22 | 0.5546 | 0.0255 | 93.33 |
| cage14 | 5.9177 | 1.8126 | 0.3334 | 81.62 | 1.5386 | 0.0188 | 96.15 |
| Transport | 4.7305 | 1.2292 | 0.0270 | 96.21 | 1.2275 | 0.0158 | 96.35 |
| G3_circuit | 1.9731 | 0.5804 | 0.0489 | 84.99 | 0.6195 | 0.0790 | 79.63 |
| kkt_power | 4.3465 | 1.4974 | 0.5147 | 72.57 | 1.1584 | 0.0418 | 93.80 |
| CurlCurl_4 | 5.1605 | 1.3554 | 0.0153 | 95.18 | 1.3501 | 0.0111 | 95.56 |
| memchip | 3.8257 | 1.1439 | 0.1741 | 83.61 | 1.1175 | 0.1223 | 85.59 |
| Freescale1 | 5.0524 | 1.7588 | 0.4039 | 71.81 | 1.4806 | 0.1843 | 85.31 |
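
The table itself does not say how the PE column is derived from the ET columns. The sketch below assumes that PE is the parallel efficiency in percent, computed as the single-GPU time divided by the number of GPUs times the multi-GPU time. This is an assumption, although the tabulated values agree with it to within rounding. The function name `parallel_efficiency` and the selection of rows are illustrative only.

```python
def parallel_efficiency(t_single, t_multi, n_gpus=4):
    """Assumed definition: PE (%) = 100 * t_single / (n_gpus * t_multi)."""
    return 100.0 * t_single / (n_gpus * t_multi)


# A few rows from Table 4: (ET on one GPU, ET for PCSRI, ET for PCSRII).
rows = {
    "2cubes_sphere": (0.4444, 0.1560, 0.1527),
    "scircuit": (0.3484, 0.1453, 0.1357),
    "cage14": (5.9177, 1.8126, 1.5386),
}

for name, (t_gpu, t_pcsr1, t_pcsr2) in rows.items():
    pe1 = parallel_efficiency(t_gpu, t_pcsr1)
    pe2 = parallel_efficiency(t_gpu, t_pcsr2)
    print(f"{name}: PCSRI PE = {pe1:.2f}, PCSRII PE = {pe2:.2f}")
```

Under this reading, a matrix such as cage14 shows a noticeably higher PE for PCSRII than for PCSRI because its four-GPU ET is lower for the same single-GPU baseline.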