Research Article
3D Data Denoising via Nonlocal Means Filter by Using Parallel GPU Strategies
Algorithm 3
CUDA code of NLM algorithm with partial unrolling strategy.
int const i_1 = threadIdx.x + blockDim.x*blockIdx.x;
int const i_2 = threadIdx.y + blockDim.y*blockIdx.y;
/* local statements */
if ((i_1 >= 0) && (i_1 < X_Dim) && (i_2 >= 0) && (i_2 < Y_Dim)) {
    for (i_3 = 0; i_3 < Z_Dim; i_3++) {
        /* compute out_img[i_1 + i_2*X_Dim + i_3*X_Dim*Y_Dim] using
           in_img[i_1 + i_2*X_Dim + i_3*X_Dim*Y_Dim] */
    }
}
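The thread mapping in Algorithm 3 can be illustrated with a minimal, self-contained sketch: each thread owns one (i_1, i_2) column and loops over i_3 itself, so only the x and y loops are unrolled onto the grid. The listing elides the per-voxel NLM computation, so the helper `voxel_op` below is a hypothetical stand-in (a three-point average along z), not the authors' filter. The kernel name, the host wrapper `run_constant_volume`, and the 16x16 block size are likewise assumptions made for illustration.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-voxel NLM computation elided in the listing:
// averages the voxel with its two z neighbours, clamping at the volume borders.
__device__ float voxel_op(const float* in_img, int i_1, int i_2, int i_3,
                          int X_Dim, int Y_Dim, int Z_Dim)
{
    int const lo = (i_3 > 0) ? i_3 - 1 : i_3;
    int const hi = (i_3 < Z_Dim - 1) ? i_3 + 1 : i_3;
    size_t const plane = (size_t)X_Dim * Y_Dim;
    size_t const base = i_1 + (size_t)i_2 * X_Dim;
    return (in_img[base + lo * plane] + in_img[base + i_3 * plane] +
            in_img[base + hi * plane]) / 3.0f;
}

// Partial unrolling: the x and y loops are mapped to the 2D grid,
// while the z loop stays sequential inside each thread.
__global__ void partial_unroll_kernel(const float* in_img, float* out_img,
                                      int X_Dim, int Y_Dim, int Z_Dim)
{
    int const i_1 = threadIdx.x + blockDim.x * blockIdx.x;
    int const i_2 = threadIdx.y + blockDim.y * blockIdx.y;
    if ((i_1 < X_Dim) && (i_2 < Y_Dim)) {
        for (int i_3 = 0; i_3 < Z_Dim; i_3++) {
            size_t const idx = i_1 + (size_t)i_2 * X_Dim
                             + (size_t)i_3 * X_Dim * Y_Dim;
            out_img[idx] = voxel_op(in_img, i_1, i_2, i_3, X_Dim, Y_Dim, Z_Dim);
        }
    }
}

// Runs the kernel on a constant volume and returns the largest absolute
// deviation of the output from that constant (error checking omitted for brevity).
float run_constant_volume(int X_Dim, int Y_Dim, int Z_Dim, float value)
{
    size_t const n = (size_t)X_Dim * Y_Dim * Z_Dim;
    std::vector<float> h_in(n, value), h_out(n, 0.0f);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    dim3 const block(16, 16);  // assumed block size, not taken from the article
    dim3 const grid((X_Dim + block.x - 1) / block.x,
                    (Y_Dim + block.y - 1) / block.y);
    partial_unroll_kernel<<<grid, block>>>(d_in, d_out, X_Dim, Y_Dim, Z_Dim);

    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    float max_err = 0.0f;
    for (size_t i = 0; i < n; i++)
        max_err = std::fmax(max_err, std::fabs(h_out[i] - value));
    return max_err;
}
```

Calling `run_constant_volume(20, 17, 5, 3.0f)` from a `main` function exercises the bounds check, since 20 and 17 are not multiples of the block size and the surplus threads must skip the loop. The `i_1 >= 0` and `i_2 >= 0` tests from the listing are dropped here because the indices are built from unsigned thread and block coordinates and cannot be negative.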