Research Article

Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU

Algorithm 5

GPU-based method for normalizing the pixel data.
// i: the ID of the pixel.
// N: the total number of threads in the grid.
// np: the number of pixels.
// SMData: stores the tile in shared memory.
// g: the vector of pixel values to normalize (name assumed).
// r: the second vector in the dot product α (name assumed).
i ← blockId * blockDim + threadId;
v ← 0;
while i < np
    v ← v + g[i];
    i ← i + N;
end while
SMData[threadIdx.x] ← v;
parallel reduction in SMData;
sum ← reduction result;
if threadId is 0
    atomic add: tmp ← tmp + sum;
end if
mean ← tmp/np;
synchronize threads in grid;
i ← blockId * blockDim + threadId; v ← 0;
while i < np
    g[i] ← g[i] - mean;
    v ← v + g[i] × g[i];
    i ← i + N;
end while
SMData[threadIdx.x] ← v;
parallel reduction in SMData;
sum ← reduction result;
if threadId is 0
    atomic add: tmp ← tmp + sum;
end if
norm ← SQRT(tmp);
synchronize threads in grid;
i ← blockId * blockDim + threadId; v ← 0;
while i < np
    g[i] ← g[i]/norm;
    v ← v + g[i] × r[i];
    i ← i + N;
end while
SMData[threadIdx.x] ← v;
parallel reduction in SMData;
sum ← reduction result;
if threadId is 0
    atomic add: tmp ← tmp + sum;
end if
α ← tmp;
synchronize threads in grid;
if α is not 0
    i ← blockId * blockDim + threadId;
    while i < np
        g[i] ← g[i]/α;
        i ← i + N;
    end while
end if
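
The passes of Algorithm 5 can be checked against a sequential CPU emulation. The following is a minimal sketch under stated assumptions, not the authors' kernel: the names gridSum, normalizePixels, g (the pixel vector), and r (the second vector in the dot product α) are hypothetical, the block size is assumed to be a power of two, and the guard against a zero norm is an addition that the listing does not contain. Each simulated thread accumulates a strided partial sum, each block tree-reduces its tile, and the per-block results are added to one accumulator, which stands in for the atomic add.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Emulates one grid-wide sum of term(i) for i in [0, np).
// blocks * threadsPerBlock plays the role of N, the total thread count.
template <typename F>
double gridSum(std::size_t np, int blocks, int threadsPerBlock, F term) {
    // The tree reduction below assumes a power-of-two block size.
    assert(threadsPerBlock > 0 && (threadsPerBlock & (threadsPerBlock - 1)) == 0);
    const std::size_t N = static_cast<std::size_t>(blocks) * threadsPerBlock;
    double tmp = 0.0;                              // global accumulator
    std::vector<double> smData(threadsPerBlock);   // stand-in for shared memory
    for (int b = 0; b < blocks; ++b) {
        // Strided accumulation: thread t of block b visits i, i + N, i + 2N, ...
        for (int t = 0; t < threadsPerBlock; ++t) {
            double v = 0.0;
            std::size_t i = static_cast<std::size_t>(b) * threadsPerBlock + t;
            for (; i < np; i += N) v += term(i);
            smData[t] = v;
        }
        // Tree reduction within the block; the result ends up in smData[0].
        for (int s = threadsPerBlock / 2; s > 0; s /= 2)
            for (int t = 0; t < s; ++t) smData[t] += smData[t + s];
        tmp += smData[0];                          // the "atomic add" of thread 0
    }
    return tmp;
}

// Normalizes g in place: subtract the mean, scale to unit norm, then divide
// by alpha = g . r when alpha is nonzero. Returns alpha.
double normalizePixels(std::vector<double>& g, const std::vector<double>& r,
                       int blocks = 4, int threadsPerBlock = 8) {
    assert(!g.empty() && g.size() == r.size());
    const std::size_t np = g.size();

    const double mean =
        gridSum(np, blocks, threadsPerBlock, [&](std::size_t i) { return g[i]; }) / np;
    for (double& x : g) x -= mean;

    const double norm = std::sqrt(
        gridSum(np, blocks, threadsPerBlock, [&](std::size_t i) { return g[i] * g[i]; }));
    if (norm == 0.0) return 0.0;  // assumption: leave a constant input centered only
    for (double& x : g) x /= norm;

    const double alpha =
        gridSum(np, blocks, threadsPerBlock, [&](std::size_t i) { return g[i] * r[i]; });
    if (alpha != 0.0)
        for (double& x : g) x /= alpha;
    return alpha;
}
```

For example, calling normalizePixels on g = {1, 2, 3, 4} with r = {-3, -1, 1, 3} gives mean 2.5, norm √5, and α = 2√5, so g becomes {-0.15, -0.05, 0.05, 0.15}. On a GPU the three sums need a grid-wide barrier between them, which this sequential version gets for free.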