Research Article
3D Data Denoising via Nonlocal Means Filter by Using Parallel GPU Strategies
Algorithm 4
CUDA code of NLM algorithm with full unrolling strategy.
int const i_1 = threadIdx.x + blockDim.x*blockIdx.x;
int const i_2 = threadIdx.y + blockDim.y*blockIdx.y;
int const i_3 = threadIdx.z + blockDim.z*blockIdx.z;
/* local statements */
if ((i_1 >= 0) && (i_1 < X_Dim) && (i_2 >= 0) && (i_2 < Y_Dim) &&
    (i_3 >= 0) && (i_3 < Z_Dim)) {
    /* compute out_img[i_1 + i_2*X_Dim + i_3*X_Dim*Y_Dim] using
       in_img[i_1 + i_2*X_Dim + i_3*X_Dim*Y_Dim] */
}
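The indexing scheme of Algorithm 4 can be exercised in isolation with a minimal, self-contained sketch such as the one below. It is not the authors' NLM kernel: the kernel name voxel_kernel, the volume size, and the 8x8x4 block shape are assumptions made for illustration, and the body of the guarded branch is a placeholder that only copies each voxel, since the listing leaves the actual NLM computation unspecified. The sketch shows how one thread maps to one voxel, how the linear offset i_1 + i_2*X_Dim + i_3*X_Dim*Y_Dim is formed, and why the upper-bound guard is needed when the volume dimensions are not multiples of the block dimensions.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",        \
                         cudaGetErrorString(err_), __LINE__);         \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel name. One thread handles one voxel (i_1, i_2, i_3).
__global__ void voxel_kernel(const float* in_img, float* out_img,
                             int X_Dim, int Y_Dim, int Z_Dim)
{
    int const i_1 = threadIdx.x + blockDim.x * blockIdx.x;
    int const i_2 = threadIdx.y + blockDim.y * blockIdx.y;
    int const i_3 = threadIdx.z + blockDim.z * blockIdx.z;

    // The indices are built from unsigned built-ins, so the lower-bound
    // tests of the listing always hold; only the upper bounds are tested here.
    if (i_1 < X_Dim && i_2 < Y_Dim && i_3 < Z_Dim) {
        size_t const idx = (size_t)i_1 + (size_t)i_2 * X_Dim
                         + (size_t)i_3 * X_Dim * Y_Dim;
        // Placeholder body (assumption): plain copy instead of the NLM update.
        out_img[idx] = in_img[idx];
    }
}

int main()
{
    // Assumed sizes, deliberately not multiples of the block shape.
    int const X_Dim = 70, Y_Dim = 33, Z_Dim = 17;
    size_t const n = (size_t)X_Dim * Y_Dim * Z_Dim;

    std::vector<float> h_in(n), h_out(n, -1.0f);
    for (size_t k = 0; k < n; ++k) h_in[k] = (float)k;

    float *d_in = nullptr, *d_out = nullptr;
    CUDA_CHECK(cudaMalloc(&d_in, n * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_out, n * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_in, h_in.data(), n * sizeof(float),
                          cudaMemcpyHostToDevice));

    // Round the grid up so that every voxel is covered by some thread.
    dim3 const block(8, 8, 4);
    dim3 const grid((X_Dim + block.x - 1) / block.x,
                    (Y_Dim + block.y - 1) / block.y,
                    (Z_Dim + block.z - 1) / block.z);
    voxel_kernel<<<grid, block>>>(d_in, d_out, X_Dim, Y_Dim, Z_Dim);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());

    CUDA_CHECK(cudaMemcpy(h_out.data(), d_out, n * sizeof(float),
                          cudaMemcpyDeviceToHost));

    // Every voxel must have been written exactly as it was read.
    size_t mismatches = 0;
    for (size_t k = 0; k < n; ++k)
        if (h_out[k] != h_in[k]) ++mismatches;
    std::printf("mismatches: %zu\n", mismatches);

    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));
    return mismatches == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Because the grid is rounded up, some threads in the last block along each axis fall outside the volume; the guard makes them exit without touching memory. Replacing the placeholder copy with the NLM weighting and averaging steps would turn this skeleton into a kernel of the form the listing describes.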