Research Article

3D Data Denoising via Nonlocal Means Filter by Using Parallel GPU Strategies

Table 10

Full unrolling algorithm with splitting, for different video formats. All splits are processed on the same GPU. Each entry gives the execution time plus the time needed to copy data from CPU to GPU and from GPU to CPU; the copy time is reported in bold.

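| Video size | Splits | Execution time + CPU-GPU copy time | | | |
|---|---|---|---|---|---|
| PAL | 1 | 16.8 + **0.1** | 85.8 + **0.15** | 55 + **0.11** | 276 + **0.16** |
| PAL | 2 | 33.5 + **0.2** | 172 + **0.3** | 110 + **0.22** | 552 + **0.32** |
| PAL | 4 | 67 + **0.4** | 343 + **0.59** | 220 + **0.43** | 1104 + **0.63** |
| PAL | 10 | 168 + **0.99** | 858 + **1.48** | 550 + **1.08** | 2761 + **1.58** |
| HD | 1 | 31.4 + **0.18** | 161 + **0.27** | 103 + **0.2** | 517 + **0.29** |
| HD | 2 | 62.8 + **0.37** | 321 + **0.54** | 206 + **0.4** | 1035 + **0.58** |
| HD | 4 | 126 + **0.73** | 643 + **1.09** | 412 + **0.8** | 2070 + **1.16** |
| HD | 10 | 314 + **1.83** | 1607 + **2** | 1030 + **4.8** | 5175 + **2.9** |
| FULL-HD | 1 | 78.5 + **0.61** | 402 + **1.04** | 257 + **0.64** | 1294 + **1.1** |
| FULL-HD | 2 | 157 + **1.23** | 804 + **2.07** | 515 + **1.28** | 2587 + **2.2** |
| FULL-HD | 4 | 314 + **2.45** | 1607 + **4.14** | 1030 + **2.56** | 5174 + **4.4** |
| FULL-HD | 10 | 785 + **6.14** | 4018 + **10.4** | 2575 + **6.4** | 12935 + **11** |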
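
The entries of Table 10 can be checked with a little arithmetic. The following minimal Python sketch uses only the first timing column of the table and computes, for each row, the execution time per split and the share of the total time spent on CPU-GPU copies. The helper names `copy_share` and `time_per_split` are illustrative and do not come from the article; the time unit is whatever the table uses, since only ratios are computed.

```python
# Rows copied from the first timing column of Table 10:
# (video format, number of splits, execution time, CPU-GPU copy time)
ROWS = [
    ("PAL", 1, 16.8, 0.1),
    ("PAL", 2, 33.5, 0.2),
    ("PAL", 4, 67.0, 0.4),
    ("PAL", 10, 168.0, 0.99),
    ("HD", 1, 31.4, 0.18),
    ("HD", 2, 62.8, 0.37),
    ("HD", 4, 126.0, 0.73),
    ("HD", 10, 314.0, 1.83),
    ("FULL-HD", 1, 78.5, 0.61),
    ("FULL-HD", 2, 157.0, 1.23),
    ("FULL-HD", 4, 314.0, 2.45),
    ("FULL-HD", 10, 785.0, 6.14),
]


def copy_share(exec_time, copy_time):
    """Fraction of the total time (execution + copy) spent on transfers."""
    return copy_time / (exec_time + copy_time)


def time_per_split(exec_time, splits):
    """Average execution time of one split (hypothetical helper)."""
    return exec_time / splits


for fmt, splits, t_exec, t_copy in ROWS:
    print(
        f"{fmt:8s} splits={splits:2d} "
        f"per-split={time_per_split(t_exec, splits):6.2f} "
        f"copy share={copy_share(t_exec, t_copy):.2%}"
    )
```

For this column, the per-split execution time stays essentially constant within each video format, so the total time grows roughly linearly with the number of splits, while the copy time remains a small fraction of the total.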