Research Article

A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs

Table 4

Breakdown by data loading technique (NVIDIA).

Data loading technique                 Speedup                  Best count
                                Dims    Opts    Hybrid    Dims    Opts    Hybrid

Global without vectorization    0.54    0.54    0.54      0       0       0
Global with vectorization       0.45    0.45    0.45      3       1       2
Image                           1.00    1.05    1.04      13      40      36
Local                           0.76    0.87    0.86      6       21      20