Research Article
A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs
Algorithm 1
Generic 3D stencil computation.
    void stencil_computation(
        float (*array1)[Y_SIZE][X_SIZE],
        float (*array2)[Y_SIZE][X_SIZE])
    {
        float (*in)[Y_SIZE][X_SIZE] = array1;
        float (*out)[Y_SIZE][X_SIZE] = array2;
        for (int t = 0; t < T_MAX; ++t) {
            for (int z = 0; z < Z_SIZE; ++z)
                for (int y = 0; y < Y_SIZE; ++y)
                    for (int x = 0; x < X_SIZE; ++x) {
                        float temp0 = in[z + z_0][y + y_0][x + x_0];
                        float temp1 = in[z + z_1][y + y_1][x + x_1];
                        ⋮
                        float tempN = in[z + z_N][y + y_N][x + x_N];
                        out[z][y][x] = f(temp0, temp1, ..., tempN);
                    }
            // Swap in and out pointers
        }
    }
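The generic loop nest in Algorithm 1 can be made concrete with a small example. The sketch below is an illustration rather than code from the article: it assumes an 8 x 8 x 8 grid, a 7-point averaging stencil as the function f, and boundary points that are copied unchanged. The names plane_t and stencil_sweep are hypothetical helpers, and the time-step count is passed as a parameter t_max instead of the constant T_MAX.

```c
#include <assert.h>

/* Hypothetical grid sizes chosen for illustration; not taken from the article. */
#define Z_SIZE 8
#define Y_SIZE 8
#define X_SIZE 8

/* One z-plane of the grid; a grid is addressed through a pointer to planes. */
typedef float plane_t[Y_SIZE][X_SIZE];

/* One sweep over the grid. Interior points receive the average of the
   point and its six face neighbours (an assumed choice of f); boundary
   points are copied unchanged so that no access falls outside the grid. */
void stencil_sweep(plane_t *in, plane_t *out)
{
    for (int z = 0; z < Z_SIZE; ++z)
        for (int y = 0; y < Y_SIZE; ++y)
            for (int x = 0; x < X_SIZE; ++x) {
                int boundary = z == 0 || z == Z_SIZE - 1 ||
                               y == 0 || y == Y_SIZE - 1 ||
                               x == 0 || x == X_SIZE - 1;
                if (boundary) {
                    out[z][y][x] = in[z][y][x];
                    continue;
                }
                float c = in[z][y][x];
                float w = in[z][y][x - 1];
                float e = in[z][y][x + 1];
                float s = in[z][y - 1][x];
                float n = in[z][y + 1][x];
                float d = in[z - 1][y][x];
                float u = in[z + 1][y][x];
                out[z][y][x] = (c + w + e + s + n + d + u) / 7.0f;
            }
}

/* Runs t_max sweeps, swapping the input and output pointers after each
   one. Returns the buffer that holds the result of the last sweep
   (array1 itself when t_max is 0). */
plane_t *stencil_computation(plane_t *array1, plane_t *array2, int t_max)
{
    assert(array1 != array2); /* the two buffers must be distinct */
    plane_t *in = array1;
    plane_t *out = array2;
    for (int t = 0; t < t_max; ++t) {
        stencil_sweep(in, out);
        /* Swap in and out pointers */
        plane_t *tmp = in;
        in = out;
        out = tmp;
    }
    return in;
}
```

Because the pointers are swapped after every sweep, the final result sits in array2 after an odd number of time steps and in array1 after an even number, which is why this sketch returns the pointer to the caller instead of returning void as in Algorithm 1.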