Mathematical Problems in Engineering

Volume 2016 (2016), Article ID 5483485, 7 pages

http://dx.doi.org/10.1155/2016/5483485

## A Correlation Based Strategy for the Acceleration of Nonlocal Means Filtering Algorithm

^{1}Laboratory of Image Science and Technology, Southeast University, Nanjing 210096, China

^{2}Key Laboratory of Computer Network and Information Integration, Ministry of Education, Southeast University, Nanjing 210096, China

^{3}Centre de Recherche en Information Biomédicale Sino-Francais (LIA CRIBs), 35000 Rennes, France

^{4}INSERM, U1099, 35000 Rennes, France

^{5}Université de Rennes 1, LTSI, 35000 Rennes, France

Received 23 March 2016; Accepted 12 June 2016

Academic Editor: Pasquale Memmolo

Copyright © 2016 Junfeng Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Although the nonlocal means (NLM) algorithm was a significant step forward in the field of image filtering, it suffers from high computational complexity. To deal with this drawback, this paper proposes an acceleration strategy based on a correlation operation. Instead of processing the image pixel by pixel, this approach computes the weights for all image pixels simultaneously with the help of correlation operators. Complexity analysis and experimental results are reported and show the advantage of the proposed algorithm in terms of operation count and running time.

#### 1. Introduction

Digital image denoising has been a fundamental and challenging issue for several decades [1, 2]. Many contributions have been devoted to recovering images degraded by Gaussian noise. Much attention has been paid to Partial Differential Equation (PDE) approaches, among which the total variation (TV) model and the Perona and Malik (PM) model are well known [3–5]. Other methods transform the image from the spatial domain to another domain (such as the Fourier domain, the wavelet domain, or the DCT domain). After the transform coefficients are adjusted, the image is restored by applying the inverse transform [6, 7].

The nonlocal means (NLM) filtering algorithm was proposed by Buades et al. [8]. They suggested that a denoised pixel can be computed as the weighted average of its neighboring pixels, with the weights derived from the normalized Gaussian weighted Euclidean distance between the blocks centred at those pixels. The NLM algorithm has demonstrated better performance than other mainstream filtering methods, such as the bilateral filter and the TV model, in both visual quality and objective measures [9]. Unfortunately, the cost of computing the weights is too high for many applications.

Several solutions have been proposed to alleviate the high computational burden of the weight calculation. Liu et al. proposed an approximation to the similarity of neighborhood windows by employing an efficient Summed Square Image (SSI) combined with the Fast Fourier Transform (FFT) [10, 11]. In [12, 13], the NLM algorithm is accelerated by skipping some weight computations through a preclassification step based on a hard threshold on local block measures (average intensities, gradients, and first- and second-order moments). Probabilistic Early Termination (PET) schemes, such as cluster trees, Singular Value Decomposition (SVD), or dictionaries for image blocks and image edges, have also been employed to speed up the weight calculation [14–17].

Whereas the traditional NLM algorithm calculates the weights pixel by pixel, the fast strategy proposed in this paper operates on the whole image. Specifically, correlation operators are applied to the differential image, which provides a straightforward shortcut for obtaining all the weights. In this way, a large amount of redundant computation is avoided and the computation speed of NLM is improved.

This paper is organized as follows. In Section 2, we briefly review the NLM algorithm. In Section 3, we analyze the repetitive computation that makes the original per-pixel NLM slow and describe our proposed fast NLM algorithm. In Section 4, experiments are conducted which show that our method brings a significant improvement over competitive approaches. Section 5 concludes this paper with some related discussion.

#### 2. The Nonlocal Means Algorithm

##### 2.1. The Principle of NLM

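The image denoising model for an image $I \subset \mathbb{R}^2$, where $\mathbb{R}$ is the real value domain, can be formulated by

$$v(i) = u(i) + n(i), \quad i \in I, \tag{1}$$

where $v(i)$ denotes the noisy image value, $u(i)$ represents the original noise-free image value, and $n(i)$ is the noise value at the pixel $i$. The idea behind the NLM filter is that the denoised value of pixel $i$ is the weighted mean of the values of all pixels of the noisy image (indexed in image $I$) [1, 8]. However, considering the high computational cost, it was suggested in [1, 8] that the denoised pixel value can be estimated using only the pixels within a larger search neighborhood $S_i$ centred at the pixel $i$, where $r_s$ denotes the radius of $S_i$, whose pixel values are scanned column by column and then concatenated into a vector:

$$S_i = \left[v(j_1), v(j_2), \ldots, v(j_{N_s})\right]^T, \tag{2}$$

where $N_s = (2r_s + 1)^2$ is the number of pixels within $S_i$. The superscript “$T$” denotes the transpose and $v(j_k)$, $k = 1, \ldots, N_s$, is the value of the $k$th pixel in the search neighborhood $S_i$. In this situation, the NLM algorithm can be expressed by

$$\hat{u}(i) = \sum_{j \in S_i} w(i, j)\, v(j), \tag{3}$$

where $\hat{u}(i)$ is the filtered image value of the pixel $i$ and $w(i, j)$ represents the weight between $i$ and $j$. The weight is acquired by

$$w(i, j) = \frac{1}{Z(i)} E(i, j), \tag{4}$$

where $Z(i)$ is a normalization factor:

$$Z(i) = \sum_{j \in S_i} E(i, j), \tag{5}$$

and $E(i, j)$ denotes the exponential Gaussian weighted Euclidean distance:

$$E(i, j) = \exp\left(-\frac{D(i, j)}{h^2}\right). \tag{6}$$

In (6), $h$ acts as a filtering parameter and $D(i, j)$ is the Gaussian Euclidean distance between the blocks $B_i$ and $B_j$ centred at the pixels $i$ and $j$, respectively, and is given by

$$D(i, j) = \left\| B_i - B_j \right\|_{2,a}^2. \tag{7}$$

In (7),

$$B_i = \left[v(i_1), v(i_2), \ldots, v(i_{N_b})\right]^T \tag{8}$$

is obtained by scanning the pixel values of the block $B_i$ column by column and then concatenating them, and $N_b$ is the number of pixels within the square block, whose radius is set as $r_b$. That is to say, $N_b = (2r_b + 1)^2$. Similarly,

$$B_j = \left[v(j_1), v(j_2), \ldots, v(j_{N_b})\right]^T \tag{9}$$

is obtained by scanning the pixel values of the block $B_j$ column by column and then concatenating them. Using (8) and (9), (7) can be deduced step by step as follows:

$$D(i, j) = \left\| \left[v(i_1), \ldots, v(i_{N_b})\right]^T - \left[v(j_1), \ldots, v(j_{N_b})\right]^T \right\|_{2,a}^2 \tag{10}$$

$$= \left\| \left[v(i_1) - v(j_1), \ldots, v(i_{N_b}) - v(j_{N_b})\right]^T \right\|_{2,a}^2 \tag{11}$$

$$= \sum_{k=1}^{N_b} G_a(k) \left(v(i_k) - v(j_k)\right)^2, \tag{12}$$

where $G_a(k)$ is the $k$th component of the discrete Gaussian kernel $G_a$, whose standard deviation is $a$. The kernel has the same size as the block $B_i$.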
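The weighting scheme of (3)–(7) can be illustrated with a short NumPy sketch. This is only a minimal per-pixel illustration, not the authors' implementation: the function names `gaussian_kernel` and `nlm_per_pixel`, the reflect padding at the image borders, the normalisation of the Gaussian kernel to unit sum, and the form exp(−D/h²) used for the exponential in (6) are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(r_b, a):
    """Discrete Gaussian kernel with std a on a (2*r_b+1)^2 grid.
    Normalising it to unit sum is an assumption of this sketch."""
    ax = np.arange(-r_b, r_b + 1)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * a ** 2))
    return g / g.sum()

def nlm_per_pixel(v, r_s=2, r_b=1, h=0.1, a=1.0):
    """Per-pixel NLM: for each pixel i, average the pixels j of its search
    neighbourhood with weights exp(-D(i, j) / h^2) / Z(i)."""
    G = gaussian_kernel(r_b, a)
    pad = r_s + r_b
    # Reflect padding so every block is fully defined (an assumption).
    vp = np.pad(np.asarray(v, dtype=float), pad, mode="reflect")
    rows, cols = v.shape
    out = np.zeros((rows, cols), dtype=float)
    for y in range(rows):
        for x in range(cols):
            cy, cx = y + pad, x + pad
            B_i = vp[cy - r_b:cy + r_b + 1, cx - r_b:cx + r_b + 1]
            Z = 0.0    # normalisation factor
            acc = 0.0  # running weighted sum of pixel values
            for dy in range(-r_s, r_s + 1):
                for dx in range(-r_s, r_s + 1):
                    jy, jx = cy + dy, cx + dx
                    B_j = vp[jy - r_b:jy + r_b + 1, jx - r_b:jx + r_b + 1]
                    D = np.sum(G * (B_i - B_j) ** 2)  # Gaussian weighted distance
                    E = np.exp(-D / h ** 2)
                    Z += E
                    acc += E * vp[jy, jx]
            out[y, x] = acc / Z
    return out
```

The four nested loops make the cost visible: every output pixel triggers one full block comparison for each position in its search neighbourhood.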

##### 2.2. Computation Cost Using Per-Pixel Algorithm

From (10)–(12), we can clearly see the cause of the NLM computational complexity. For every pixel $i$, the NLM algorithm must compute the dissimilarity between the block $B_i$ and all blocks $B_j$ in the search neighborhood $S_i$. For instance, to filter a noisy image of size $M \times N$, it needs $2 N_b N_s M N$ multiplications and $(2 N_b - 1) N_s M N$ additions, where $N_b$ is the number of pixels in a block and $N_s$ is the number of pixels in the search neighborhood. Even when the search neighborhood is cut down to 15 × 15, with the block set as 3 × 3, a general 512 × 512 image needs 2 × 3^{2} × 15^{2} × 512^{2} multiplications and 17 × 15^{2} × 512^{2} additions. Hence, a fast NLM algorithm is required. This motivated us to propose a correlation based accelerated nonlocal denoising algorithm.
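To get a feel for the magnitude of these counts, the arithmetic can be reproduced with a few lines of Python. The sketch below is a rough estimate under a stated assumption: each of the block's terms is taken to cost one subtraction, one squaring, and one multiplication by a Gaussian weight, and the terms of one block comparison are then summed. The function name `per_pixel_cost` is hypothetical.

```python
def per_pixel_cost(rows, cols, search_side, block_side):
    """Estimate the operation count of per-pixel NLM weight computation.

    Assumed cost model (not taken from the paper): per block comparison,
    each of the n_b terms needs 1 subtraction and 2 multiplications,
    and summing the n_b terms needs n_b - 1 additions.
    """
    n_s = search_side ** 2          # pixels in the search neighbourhood
    n_b = block_side ** 2           # pixels in a block
    pairs = rows * cols * n_s       # block comparisons for the whole image
    mults = pairs * 2 * n_b
    adds = pairs * (2 * n_b - 1)    # subtractions counted as additions
    return pairs, mults, adds

pairs, mults, adds = per_pixel_cost(512, 512, 15, 3)
print(pairs)  # 58982400 block comparisons
print(mults)  # 1061683200
print(adds)   # 1002700800
```

Under this cost model, a 512 × 512 image with a 15 × 15 search neighbourhood and 3 × 3 blocks already requires on the order of a billion multiplications and a billion additions for the weights alone.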

#### 3. The Fast Nonlocal Means Algorithm

##### 3.1. Repetitive Computation of the Per-Pixel Algorithm

From the above observations, we can see that, for one specific central pixel $i$, the Gaussian weighted Euclidean distance between the block $B_i$ centred at $i$ and the block $B_j$ centred at $j$ in the search neighborhood $S_i$ needs to be computed. However, for any pixel $i'$ adjoining the pixel $i$, the Gaussian weighted Euclidean distance between the block $B_{i'}$ centred at $i'$ and the block $B_{j'}$ centred at $j'$ in the search neighborhood $S_{i'}$ also needs to be computed. Because the two pixels are adjacent, $B_i$ and $B_{i'}$ overlap heavily, and the same holds for $B_j$ and $B_{j'}$. Therefore, when the NLM algorithm is implemented in the *per-pixel* way, the block-based Gaussian weighted Euclidean distance calculation is highly repetitive.

More precisely, for a block of size $(2r_b + 1) \times (2r_b + 1)$, where $r_b$ is the block radius, the number of repeatedly calculated square terms is $2r_b(2r_b + 1)$. The repetition rate is therefore $2r_b/(2r_b + 1)$. To give an intuitive picture, a diagram is shown in Figure 1, in which the block size is set to $N_b = 3 \times 3$. It can be observed that 2/3 of the intensity-difference operations are repeated when calculating the weights for the adjacent centre pixels $i$ and $i'$.
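The repetition rate can be checked by direct enumeration. The sketch below lists every pixel pair whose squared difference enters the block distance for a centre pixel and for its right-hand neighbour, then counts the pairs common to both. The function name `repeated_terms` and the particular pixel coordinates are hypothetical and chosen only for illustration; only the squared differences are counted as repeated, since the Gaussian weight attached to a given pair differs between the two blocks.

```python
def repeated_terms(r_b):
    """Count squared-difference terms shared by two horizontally adjacent
    centre pixels, each compared with the block at the same relative offset."""
    offsets = [(dy, dx)
               for dy in range(-r_b, r_b + 1)
               for dx in range(-r_b, r_b + 1)]

    def terms(i, j):
        # Each term is identified by the two pixel positions it subtracts.
        return {((i[0] + dy, i[1] + dx), (j[0] + dy, j[1] + dx))
                for dy, dx in offsets}

    i, j = (0, 0), (5, 7)      # a centre pixel and one search position
    i2, j2 = (0, 1), (5, 8)    # both shifted by one column
    shared = len(terms(i, j) & terms(i2, j2))
    total = len(offsets)
    return shared, total

shared, total = repeated_terms(1)
print(shared, total)  # 6 9, i.e. a repetition rate of 2/3 for a 3 x 3 block
```

For a 3 × 3 block, six of the nine squared differences are computed twice, and the shared fraction grows towards 1 as the block radius increases.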