Research Article

A Parallel Nonrigid Registration Algorithm Based on B-Spline for Medical Images

Algorithm 3

The pseudocode listing of the kernel function CalSimilarityGradientKernelFunc().
 Get the block index and the local thread index.
 for to do
    ;
    Get the normalized coordinates for the voxel within the region.
    ;
    for from to do
     ;
     ;
     Reduce and store in
    __syncthreads();
    if then
       ;
    end if
    __syncthreads();
    ;
    end for
  end for
  ;
  ;
Use tree-style reduction to reduce the gradient values in the shared memory.
Store the reduction results in the global memory of GPU.
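
The last two steps of the listing (tree-style reduction in shared memory, then a store to global memory) can be illustrated with a minimal CUDA sketch. This is not the authors' kernel: the names blockReduceSum, reduceOnGpu, cache, and kBlockSize are hypothetical, the block size of 256 threads is an assumption (the loop requires a power of two), and plain float values stand in for the per-thread gradient contributions. Error checking of the CUDA runtime calls is omitted for brevity.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Assumed block size; the halving loop below requires a power of two.
constexpr int kBlockSize = 256;

// Hypothetical kernel: each block sums up to kBlockSize values with a
// tree-style reduction in shared memory and writes one partial sum.
__global__ void blockReduceSum(const float* in, float* blockSums, int n)
{
    __shared__ float cache[kBlockSize];

    const int tid = threadIdx.x;                     // local thread index
    const int gid = blockIdx.x * blockDim.x + tid;   // global element index

    // Each thread loads one value; out-of-range threads contribute zero.
    cache[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();

    // Halve the number of active threads at each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) {
            cache[tid] += cache[tid + stride];
        }
        __syncthreads();  // all partial sums must be visible before the next step
    }

    // Thread 0 stores the block's result in global memory.
    if (tid == 0) {
        blockSums[blockIdx.x] = cache[0];
    }
}

// Hypothetical host helper: launches the kernel and adds up the per-block
// partial sums on the CPU. Runtime error checks are omitted in this sketch.
float reduceOnGpu(const std::vector<float>& values)
{
    const int n = static_cast<int>(values.size());
    if (n == 0) return 0.0f;
    const int blocks = (n + kBlockSize - 1) / kBlockSize;

    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, blocks * sizeof(float));
    cudaMemcpy(dIn, values.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    blockReduceSum<<<blocks, kBlockSize>>>(dIn, dOut, n);

    // This blocking copy also waits for the kernel to finish.
    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), dOut, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);

    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}
```

The barrier inside the loop sits outside the if statement so that every thread in the block reaches it; placing __syncthreads() inside a divergent branch leads to undefined behavior. A call such as reduceOnGpu(std::vector<float>(1000, 1.0f)) spreads the input over four blocks and combines their partial sums on the host.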