Research Article
A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs
Algorithm 2
Generic 3D OpenCL stencil kernel.
    __kernel void stencil_computation(
        __global float (*in)[Y_SIZE][X_SIZE],
        __global float (*out)[Y_SIZE][X_SIZE])
    {
        int x = get_global_id(0);
        int y = get_global_id(1);
        int z = get_global_id(2);
        float temp0 = in[z + z0][y + y0][x + x0];
        float temp1 = in[z + z1][y + y1][x + x1];
        ...
        float tempN = in[z + zN][y + yN][x + xN];
        out[z][y][x] = f(temp0, temp1, ..., tempN);
    }
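To make the generic pattern of Algorithm 2 concrete, the following plain C sketch instantiates it for a 7-point stencil. This instance is an assumption chosen for illustration and is not taken from the article: f is assumed to be the arithmetic mean of the center point and its six axis neighbors, the grid size Z_SIZE and the names stencil_point and stencil_reference are hypothetical, and boundary points are simply skipped because Algorithm 2 does not show boundary handling. The three nested loops stand in for the OpenCL NDRange, so x, y, and z play the roles of get_global_id(0), get_global_id(1), and get_global_id(2).

```c
#include <assert.h>

/* Hypothetical grid dimensions for this sketch. */
#define Z_SIZE 4
#define Y_SIZE 4
#define X_SIZE 4

/* One "work-item" of the generic kernel, specialized to a 7-point stencil.
 * Assumption: f is the mean of the center and its six axis neighbors. */
static void stencil_point(float (*in)[Y_SIZE][X_SIZE],
                          float (*out)[Y_SIZE][X_SIZE],
                          int x, int y, int z)
{
    /* Every neighbor access below must stay inside the grid. */
    assert(x >= 1 && x < X_SIZE - 1);
    assert(y >= 1 && y < Y_SIZE - 1);
    assert(z >= 1 && z < Z_SIZE - 1);

    float temp0 = in[z][y][x];      /* center */
    float temp1 = in[z][y][x - 1];  /* x neighbors */
    float temp2 = in[z][y][x + 1];
    float temp3 = in[z][y - 1][x];  /* y neighbors */
    float temp4 = in[z][y + 1][x];
    float temp5 = in[z - 1][y][x];  /* z neighbors */
    float temp6 = in[z + 1][y][x];

    out[z][y][x] =
        (temp0 + temp1 + temp2 + temp3 + temp4 + temp5 + temp6) / 7.0f;
}

/* Sequential stand-in for the NDRange launch: visits interior points only. */
void stencil_reference(float (*in)[Y_SIZE][X_SIZE],
                       float (*out)[Y_SIZE][X_SIZE])
{
    for (int z = 1; z < Z_SIZE - 1; ++z)
        for (int y = 1; y < Y_SIZE - 1; ++y)
            for (int x = 1; x < X_SIZE - 1; ++x)
                stencil_point(in, out, x, y, z);
}
```

A sequential reference of this kind can serve as a correctness check for tuned GPU variants of the same stencil, since every variant should produce the same interior values up to floating-point rounding.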