Research Article

A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs

Table 5

Breakdown by data loading technique (AMD).

Data loading technique          Speedup                  Best count
                                Dims   Opts   Hybrid     Dims   Opts   Hybrid

Global without vectorization    0.95   1.02   1.01       4      12     11
Global with vectorization       0.87   0.85   0.92       6      4      17
Image                           0.56   0.58   0.58       0      0      0
Local                           0.93   1.18   1.20       12     6      27