Research Article
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU
Algorithm 5: GPU-based method for normalizing.
// i: the ID of the pixel.
// N: the number of threads.
// np: the number of pixels.
// SMData: stores the tile in shared memory.
;
;
while
    [i];
    ;
end while
;
parallel reduction in SMData;
;
if threadId is 0
    atomic add: ;
end if
mean ← tmp/np;
synchronize threads in grid;
;
while i < np
    ;
    ;
    ;
end while
;
parallel reduction in SMData;
sum ← reduction result;
if threadId is 0
    atomic add: ;
end if
norm ← SQRT(tmp);
synchronize threads in grid;
i ← blockId × blockDim + threadId;
while i < np
    i i /norm;
    v ← v + i × i ;
    i ;
end while
SMData[threadIdx.x] ← v;
parallel reduction in SMData;
sum ← reduction result;
if threadId is 0
    atomic add: ;
end if
α ← tmp;
synchronize threads in grid;
if α is not 0
    ;
    while
        ;
        ;
    end while
end if
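The reduction pattern that the listing repeats (grid-stride accumulation per thread, a tree reduction over the shared-memory tile, then one atomic add per block) can be illustrated with a sequential CPU emulation in C++. This is a minimal sketch under stated assumptions, not the authors' kernel: the names `gridReduce`, `normalize`, and `img` are hypothetical, the blocks and threads are emulated by ordinary loops, and the zero-norm check is an added safeguard. Only the mean and norm phases are shown; the third accumulation that yields α is omitted because its second operand is not recoverable from the listing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Emulates one reduction pass of the listing on the CPU (hypothetical helper).
// term(i) is evaluated exactly once for every pixel index i in [0, np).
template <typename F>
double gridReduce(std::size_t np, int gridDim, int blockDim, F term) {
    // The tree reduction below assumes a power-of-two block size.
    assert(blockDim > 0 && (blockDim & (blockDim - 1)) == 0);
    const std::size_t N = static_cast<std::size_t>(gridDim) * blockDim;  // total threads
    double tmp = 0.0;                      // global accumulator (target of the atomic add)
    std::vector<double> smData(blockDim);  // stands in for the shared-memory tile

    for (int blockId = 0; blockId < gridDim; ++blockId) {
        // Each emulated thread walks the pixels with a stride of N.
        for (int threadId = 0; threadId < blockDim; ++threadId) {
            double v = 0.0;
            std::size_t i = static_cast<std::size_t>(blockId) * blockDim + threadId;
            for (; i < np; i += N) v += term(i);
            smData[threadId] = v;
        }
        // Tree reduction over the tile, halving the active range each step.
        for (int stride = blockDim / 2; stride > 0; stride /= 2)
            for (int t = 0; t < stride; ++t)
                smData[t] += smData[t + stride];
        tmp += smData[0];  // on a GPU, thread 0 would do this with an atomic add
    }
    return tmp;
}

// Normalizes img in place to zero mean and unit Euclidean norm.
// Returns false (leaving img mean-centered) if the norm is zero.
bool normalize(std::vector<double>& img, int gridDim, int blockDim) {
    const std::size_t np = img.size();
    if (np == 0) return false;

    // Phase 1: mean = (sum of pixels) / np.
    const double mean =
        gridReduce(np, gridDim, blockDim, [&](std::size_t i) { return img[i]; }) /
        static_cast<double>(np);

    // Phase 2: subtract the mean and accumulate squared values in the same pass.
    const double norm = std::sqrt(gridReduce(np, gridDim, blockDim, [&](std::size_t i) {
        img[i] -= mean;
        return img[i] * img[i];
    }));
    if (norm == 0.0) return false;

    // Phase 3 (division only): scale every pixel by 1 / norm.
    for (double& p : img) p /= norm;
    return true;
}
```

Calling `normalize(img, 2, 4)` on a ten-pixel vector emulates two blocks of four threads each, so some threads process two pixels through the stride of N = 8. On a real GPU the returns between phases correspond to the grid-wide synchronization points in the listing.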