Research Article

A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs

Algorithm 2

Generic 3D OpenCL stencil kernel.
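__kernel void stencil_computation(
    global float (*in)[Y_SIZE][X_SIZE],
    global float (*out)[Y_SIZE][X_SIZE])
{
    /* One work-item per grid point. */
    int x = get_global_id(0);
    int y = get_global_id(1);
    int z = get_global_id(2);
    /* Load the stencil points; (dzk, dyk, dxk) denotes the constant offset of point k. */
    float temp0 = in[z + dz0][y + dy0][x + dx0];
    float temp1 = in[z + dz1][y + dy1][x + dx1];
    ⋮
    float tempN = in[z + dzN][y + dyN][x + dxN];
    out[z][y][x] = f(temp0, temp1, ..., tempN);
}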
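
To make the generic kernel of Algorithm 2 concrete, the following is a minimal sketch of one possible instance, written as serial C rather than OpenCL so that it can be compiled and checked on a host. Everything specific in it is an assumption made for illustration and is not taken from the article: the grid sizes, the 7-point neighbor offsets, the choice of f as a plain average, and the restriction to interior points (the generic kernel does not state how boundaries are handled). The loop nest stands in for the NDRange launch, with the loop indices playing the role of get_global_id(0), get_global_id(1), and get_global_id(2).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative grid dimensions (assumed, not from the article). */
#define Z_SIZE 4
#define Y_SIZE 4
#define X_SIZE 4
#define NUM_POINTS 7

/* Assumed 7-point stencil: center plus one neighbor in each direction.
   Each row is (dz, dy, dx). */
static const int offsets[NUM_POINTS][3] = {
    { 0,  0,  0},
    { 0,  0, -1}, { 0,  0,  1},
    { 0, -1,  0}, { 0,  1,  0},
    {-1,  0,  0}, { 1,  0,  0}
};

/* Hypothetical f(): the arithmetic mean of the loaded points. */
static float f(const float temp[NUM_POINTS])
{
    float sum = 0.0f;
    for (int k = 0; k < NUM_POINTS; ++k) {
        sum += temp[k];
    }
    return sum / (float)NUM_POINTS;
}

/* Body of a single work-item; (x, y, z) replace get_global_id(0..2). */
static void stencil_work_item(float (*in)[Y_SIZE][X_SIZE],
                              float (*out)[Y_SIZE][X_SIZE],
                              int x, int y, int z)
{
    float temp[NUM_POINTS];
    for (int k = 0; k < NUM_POINTS; ++k) {
        temp[k] = in[z + offsets[k][0]][y + offsets[k][1]][x + offsets[k][2]];
    }
    out[z][y][x] = f(temp);
}

/* Serial stand-in for the kernel launch. Only interior points are
   visited so that every neighbor access stays inside the grid. */
void stencil_computation(float (*in)[Y_SIZE][X_SIZE],
                         float (*out)[Y_SIZE][X_SIZE])
{
    assert(in != NULL && out != NULL);
    for (int z = 1; z < Z_SIZE - 1; ++z) {
        for (int y = 1; y < Y_SIZE - 1; ++y) {
            for (int x = 1; x < X_SIZE - 1; ++x) {
                stencil_work_item(in, out, x, y, z);
            }
        }
    }
}
```

Calling stencil_computation(in, out) on two float arrays declared as [Z_SIZE][Y_SIZE][X_SIZE] fills the interior of out and leaves its boundary untouched. Keeping the offsets in a table makes the stencil shape a data parameter, which mirrors how the generic kernel leaves both the offsets and f unspecified.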