Research Article

3D Data Denoising via Nonlocal Means Filter by Using Parallel GPU Strategies

Table 7

Full unrolling algorithm on 2 GPU units. Execution times and speed-up values for a 3D dataset of normally distributed random numbers (size ) for several window configurations.

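Execution time/speed-up

| 2 GPU units, (16, 16, 1) | 2 GPU units, (128, 1, 1) | 2 GPU units, (256, 1, 1) | 2 GPU units, (512, 1, 1) | CPU |
|---|---|---|---|---|
| 30.9/91.1 | 30.3/92.8 | 30.3/92.7 | 32/87.9 | 2814 |
| 196/101 | 192/103 | 193/103 | 205/96.5 | 19790 |
| 107/75.8 | 103/78.9 | 103/78.8 | 113/72 | 8133 |
| 685/85.8 | 654/89.8 | 656/89.7 | 729/80.6 | 58785 |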
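The speed-up values in the table appear to be the ratio of the CPU execution time to the 2-GPU execution time. The short Python sketch below checks that assumption against the first data row. The function name `speed_up` and the dictionary layout are illustrative choices, not part of the original work, and the time unit is whatever the source table uses.

```python
# Values copied from the first data row of Table 7.
cpu_time = 2814.0
gpu_times = {
    "(16, 16, 1)": 30.9,
    "(128, 1, 1)": 30.3,
    "(256, 1, 1)": 30.3,
    "(512, 1, 1)": 32.0,
}


def speed_up(cpu, gpu):
    """Assumed definition: CPU execution time divided by GPU execution time."""
    return cpu / gpu


# Recompute the speed-up for each configuration, rounded to one decimal.
speed_ups = {cfg: round(speed_up(cpu_time, t), 1) for cfg, t in gpu_times.items()}

for cfg, s in speed_ups.items():
    print(cfg, s)
```

Because the tabulated times are themselves rounded, the recomputed ratios may differ from the published speed-ups in the last digit; agreement to within about 0.1 to 0.2 is what this check can be expected to show.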