Research Article

3D Data Denoising via Nonlocal Means Filter by Using Parallel GPU Strategies

Algorithm 3

CUDA code of the NLM algorithm with the partial unrolling strategy.
int const i_1 = threadIdx.x + blockDim.x * blockIdx.x;
int const i_2 = threadIdx.y + blockDim.y * blockIdx.y;
/* local statements */
if ((i_1 >= 0) && (i_1 < X_Dim) && (i_2 >= 0) && (i_2 < Y_Dim)) {
    for (i_3 = 0; i_3 < Z_Dim; i_3++) {
        /* compute out_img[i_1 + i_2*X_Dim + i_3*X_Dim*Y_Dim] using
           in_img[i_1 + i_2*X_Dim + i_3*X_Dim*Y_Dim] */
    }
}
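The listing above can be turned into a compilable kernel along the following lines. This is only a minimal sketch of the indexing and launch pattern, not the authors' implementation. The kernel name `nlm_partial_unroll`, the device function `process_voxel`, the host helper `run_partial_unroll`, and the 16x16 block size are all assumptions introduced for illustration. In particular, `process_voxel` is a placeholder that copies the input voxel. The actual NLM weighted average would take its place. Error checking of the CUDA runtime calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical stand-in for the per-voxel NLM computation (not from the article):
// it returns the input voxel unchanged so that the indexing can be checked.
__device__ float process_voxel(const float* in_img, int idx) {
    return in_img[idx];
}

// Partial unrolling: one thread per (i_1, i_2) position in the XY plane;
// the loop over the Z dimension runs sequentially inside each thread.
__global__ void nlm_partial_unroll(const float* in_img, float* out_img,
                                   int X_Dim, int Y_Dim, int Z_Dim) {
    int const i_1 = threadIdx.x + blockDim.x * blockIdx.x;
    int const i_2 = threadIdx.y + blockDim.y * blockIdx.y;
    // Lower-bound checks are omitted here: thread and block indices are never negative.
    if (i_1 < X_Dim && i_2 < Y_Dim) {
        for (int i_3 = 0; i_3 < Z_Dim; i_3++) {
            int const idx = i_1 + i_2 * X_Dim + i_3 * X_Dim * Y_Dim;
            out_img[idx] = process_voxel(in_img, idx);
        }
    }
}

// Hypothetical host helper: copies the volume to the device, launches a 2D grid
// that covers the XY plane, and copies the result back to the host.
std::vector<float> run_partial_unroll(const std::vector<float>& in,
                                      int X_Dim, int Y_Dim, int Z_Dim) {
    size_t const bytes = in.size() * sizeof(float);
    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);

    dim3 const block(16, 16);  // assumed block size, not taken from the article
    dim3 const grid((X_Dim + block.x - 1) / block.x,
                    (Y_Dim + block.y - 1) / block.y);
    nlm_partial_unroll<<<grid, block>>>(d_in, d_out, X_Dim, Y_Dim, Z_Dim);
    cudaDeviceSynchronize();

    std::vector<float> out(in.size());
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}
```

Rounding the grid size up means that some threads fall outside the volume whenever X_Dim or Y_Dim is not a multiple of the block size, which is why the bounds test in the kernel is needed.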