Research Article
A Novel CSR-Based Sparse Matrix-Vector Multiplication on GPUs
Table 5
Comparison of PCSRI and PCSRII with communication on two GPUs.
| Matrix | ET (GPU) | PCSRI (2 GPUs) ET | PCSRI (2 GPUs) SD | PCSRI (2 GPUs) PE | PCSRII (2 GPUs) ET | PCSRII (2 GPUs) SD | PCSRII (2 GPUs) PE |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2cubes_sphere | 0.4444 | 0.2494 | | 89.09 | 0.2503 | | 88.75 |
| scircuit | 0.3484 | 0.2234 | 0.0154 | 77.95 | 0.2165 | 0.0070 | 80.44 |
| Ga41As41H72 | 4.2387 | 2.3516 | 0.0030 | 90.12 | 2.3795 | 0.0521 | 89.07 |
| F1 | 6.5544 | 3.9252 | 0.6948 | 83.49 | 3.6076 | 0.2392 | 90.84 |
| ASIC_680ks | 0.8196 | 0.4890 | 0.0113 | 83.80 | 0.4998 | 0.0178 | 81.99 |
| ecology2 | 1.2321 | 0.6865 | | 89.74 | 0.6863 | | 89.76 |
| Hamrle3 | 1.7684 | 1.0221 | 0.0209 | 86.50 | 1.0066 | 0.0170 | 87.84 |
| thermal2 | 2.0708 | 1.1403 | 0.0230 | 90.80 | 1.1402 | 0.0203 | 90.81 |
| cage14 | 5.9177 | 3.5756 | 0.5644 | 82.75 | 3.2244 | 0.0196 | 91.76 |
| Transport | 4.7305 | 2.4623 | 0.0203 | 96.05 | 2.4550 | 0.0183 | 96.34 |
| G3_circuit | 1.9731 | 1.1215 | 0.0189 | 87.96 | 1.1766 | 0.0896 | 83.84 |
| kkt_power | 4.3465 | 2.9539 | 0.6973 | 73.57 | 2.4459 | 0.0356 | 88.85 |
| CurlCurl_4 | 5.1605 | 2.7064 | 0.0092 | 95.34 | 2.7049 | | 95.39 |
| memchip | 3.8257 | 2.3218 | 0.3467 | 82.39 | 2.2243 | 0.1973 | 85.99 |
| Freescale1 | 5.0524 | 3.1216 | 0.5868 | 80.92 | 2.9367 | 0.3199 | 86.02 |
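The PE column in Table 5 appears consistent with a parallel-efficiency figure derived from the two ET columns. The following is a minimal sketch under that assumption: it takes PE to be the single-GPU ET divided by the number of GPUs times the multi-GPU ET, expressed as a percentage. This excerpt does not define PE, so the formula and the function name `parallel_efficiency` are assumptions introduced for illustration.

```python
def parallel_efficiency(t_single: float, t_multi: float, n_gpus: int = 2) -> float:
    """Return parallel efficiency in percent.

    Assumed definition (not stated in this excerpt):
        PE = 100 * ET(1 GPU) / (n_gpus * ET(n_gpus GPUs))
    """
    if t_multi <= 0 or n_gpus <= 0:
        raise ValueError("t_multi and n_gpus must be positive")
    return 100.0 * t_single / (n_gpus * t_multi)


# ET values copied from the 2cubes_sphere row of Table 5.
pe_pcsr1 = parallel_efficiency(0.4444, 0.2494)  # PCSRI on two GPUs
pe_pcsr2 = parallel_efficiency(0.4444, 0.2503)  # PCSRII on two GPUs
print(f"PCSRI: {pe_pcsr1:.2f}%  PCSRII: {pe_pcsr2:.2f}%")
```

Under this assumed definition, the computed values track the tabulated PE entries closely. Small discrepancies are possibly due to the published ET values being rounded to four decimal places.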