Research Article
A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs
Table 3
Performance of heuristics and reference approaches.
| Approach | Speedup (NV) | Speedup (AMD) | Configs (NV) | Configs (AMD) | Compile time, s (NV) | Compile time, s (AMD) | Runtime, s (NV) | Runtime, s (AMD) | Best count (NV) | Best count (AMD) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Baselines | | | | | | | | | | |
| ExpS | 1.07 | 1.21 | 1311 | 1066 | 937.48 | 246.13 | 27.02 | 5.18 | 24 | 22 |
| Rand | 1.00 | 1.00 | 321 | 210 | 344.71 | 52.75 | 37.93 | 9.03 | 2 | 2 |
| Oracles | | | | | | | | | | |
| Dimensions | 1.09 | 1.17 | 720 | 711 | 712.84 | 167.39 | 96.15 | 24.39 | 22 | 11 |
| Optimizations | 1.15 | 1.32 | 1768 | 1526 | 1076.60 | 339.25 | 103.39 | 25.80 | 62 | 42 |
| Hybrid | 1.14 | 1.34 | 992 | 1018 | 1062.82 | 242.38 | 104.85 | 26.73 | 58 | 55 |
| Heuristics | | | | | | | | | | |
| Dimensions | 1.07 | 1.13 | 181 | 191 | 55.24 | 36.01 | 20.85 | 4.23 | 19 | 11 |
| Optimizations | 1.14 | 1.28 | 502 | 357 | 136.28 | 77.45 | 20.64 | 5.26 | 57 | 36 |
| Hybrid | 1.12 | 1.32 | 266 | 266 | 88.35 | 53.67 | 21.51 | 5.45 | 54 | 48 |