Mathematical Problems in Engineering

Volume 2015, Article ID 318341, 17 pages

http://dx.doi.org/10.1155/2015/318341

## Multiscale Hybrid Nonlocal Means Filtering Using Modified Similarity Measure

^{1}Department of Applied Mathematics, Hanyang University (ERICA), Ansan 426-791, Republic of Korea

^{2}Department of Mathematics, University of the Punjab, Lahore 54590, Pakistan

Received 3 September 2014; Accepted 7 November 2014

Academic Editor: Daniel Zaldivar

Copyright © 2015 Zahid Hussain Shamsi and Dai-Gyoung Kim. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A new multiscale implementation of nonlocal means filtering (MHNLM) for image denoising is proposed. The proposed algorithm also introduces a modification of the similarity measure used for patch comparison. Regarding each patch as an oriented surface, the notion of a normal vectors patch is introduced. The inner product of these normal vectors patches is defined and then used as a weight factor in the weighted Euclidean distance between intensity patches. The algorithm involves two steps: the first is a multiscale implementation of accelerated nonlocal means filtering in the discrete stationary wavelet domain, which yields refined versions of the noisy patches for later comparison. The second step applies the proposed modification of standard nonlocal means filtering to the noisy image, using the reference patches obtained in the first step. These refined patches contain less noise, so the computation of normal vectors and partial derivatives is more precise. Experimental results show equivalent or better performance of the proposed algorithm compared with various state-of-the-art algorithms.

#### 1. Introduction

The phenomenon of image degradation is quite natural due to the digitization and quantization processes in signal acquisition devices. Although various image denoising techniques have been extensively studied and effectively employed in the last two decades, the preservation of texture, edges, and fine details during image denoising is an open problem and needs rigorous treatment.

In advance of the nonlocal methodology, a variety of variational [1–6], PDE-based [7–12], and wavelet-based [13–16] approaches were proposed for image denoising that rely on the local features of image data. A major shift towards nonlocal filtering was initiated by the use of bilateral filtering [17], which exploits both the spatial and intensity domains for image denoising. In this approach, spatially proximate pixels are given greater weight in the similarity measure. Later on, the wavelet-based BLS-GSM method [18] provided the best results in terms of the PSNR measure; however, its denoised images contain ringing artifacts and have low visual quality. More recently, Buades et al. [19] introduced nonlocal means filtering for image denoising. Although the PSNR of nonlocal means filtering was found to be lower than that of the wavelet-based BLS-GSM [18], the notion of a patch-based approach combined with the idea of nonlocality has led to an entirely new way of attacking the problem. Earlier, approaches similar to nonlocal means were used for image inpainting [20] and texture synthesis [21]. Since nonlocal means was proposed, rigorous research on better parameter estimation and more suitable similarity measures has improved the performance of nonlocal means filtering for a variety of noise models in various image processing applications.

Kervrann et al. [22] provided a theoretical foundation for an intuitive nonlocal means approach using Bayesian statistics. In this approach, refined adaptive dictionaries of similar patches are obtained using Bayesian estimation, while irrelevant patches are rejected. Afterwards, these learned dictionaries are used for patch-based comparison with a modified similarity measure. Elad and Aharon [23] proposed learning dictionaries of patches using the K-SVD algorithm and then employed sparse representation, using the $\ell^0$ pseudonorm over the learned dictionaries, for construction of the denoised image. Tasdizen [24] proposed using the similarity of patches in the principal component analysis domain. Extensive research in patch-based denoising has resulted in state-of-the-art algorithms [25, 26]. Dabov et al. [25] achieved enhanced sparsity for similar patches in predesigned 3D dictionaries such as the discrete cosine transform (DCT) or the discrete Haar wavelet transform. These dictionaries are separable and can be extended from two-dimensional to three-dimensional form using a tensor product. Furthermore, Mairal et al. [26] extended the dictionary learning approach of [23] using the $\ell^1$ norm for the learning phase and the $\ell^0$ pseudonorm for the final construction phase. The major distinction between this approach and [25] is the use of a learning phase to construct a dictionary rather than the use of a predesigned dictionary. The learned dictionary is developed either from the given noisy image or from a database of patches extracted from a large collection of natural images.

Several attempts have been made to relate nonlocal means filtering to variational and PDE-based image denoising techniques. Gilboa and Osher [27] provided an elegant interpretation of the nonlocal means approach as a generalization of variational and PDE-based formulations. Brox et al. [28] proposed a computationally efficient algorithm for nonlocal filtering that arranges the data in a cluster tree. This special arrangement further helps in the preselection of similar patches. An iterative version of nonlocal filtering based on the variational framework was also suggested in [28]. Pizarro et al. [29] introduced a discrete variational formulation that exploits the nonlocal data and smoothness (NDS) constraints for a variety of generalized dissimilarity measures defined on the space of image patches. This formulation results in a new similarity measure that considers not only the patch similarity of the two selected pixels but also the similarity of their respective neighbors. Moreover, the formulation emphasized the connection between diffusion-based approaches and the NDS formulation. Yang and Jacob [30] proposed a unified variational framework for nonlocal regularization by introducing a robust distance norm to determine interpatch distances. Furthermore, by applying this formulation to inverse problems, they provided theoretical justification for heuristic iterative nonlocal means approaches. Tschumperlé and Brun [31] defined a patch space for implementing a PDE-based diffusion or smoothing process.

In addition to the photometric similarity used for patch-based comparison, patches contain much more information that deserves attention. With this motivation, we propose the notion of a normal vectors patch corresponding to each intensity patch. By employing this new notion in the second step of our algorithm, we achieve remarkably better results than most state-of-the-art algorithms in the presence of moderate or severe noise. Inspired by the special treatment of the central patch in [32], we also employ a slightly higher central weight than in the standard nonlocal means approach. In our approach, the weight value for the central patch is chosen empirically through experiments. This modification has further improved our results.

Earlier, Mahmoudi and Sapiro [33] proposed combining the average gradient orientation difference with nonlocal means filtering, which seems similar to the second step of our proposed method. However, the approach of [33] differs significantly from ours for two main reasons. Firstly, in [33] the average gradient information is employed only for preclassification of the intensity patches, to increase the computational efficiency of standard nonlocal filtering, without modifying the classical similarity measure. In contrast, the proposed approach exploits the interactions of normal vectors patches to modify the similarity weights, without any preclassification step. Secondly, the average gradient orientations of patches are used in [33] on the assumption that the average gradient direction remains similar in the presence of additive white Gaussian noise. This assumption is arguable at severe noise levels.

The rest of the paper is organized as follows. A brief review of methodologies used in this paper is provided in Section 2. The new multiscale algorithm and its implementation are explained in Section 3. Experimental results are described and discussed in Section 4, and conclusions are drawn in Section 5.

#### 2. Preliminaries

##### 2.1. Nonlocal Means Filtering

Consider a noisy gray-scale intensity image

$$v = u + \eta, \qquad \eta \sim \mathcal{N}(0, \sigma^2 I), \tag{1}$$

where $u$ is the true noise-free image to be recovered, $\eta$ represents zero-mean additive white Gaussian noise of known variance $\sigma^2$, and $I$ is the identity matrix with the same dimensions as those of the given image. Instead of fully nonlocal filtering, it is preferable to use the notion of semilocal filtering. It was mentioned in [32] that searching the whole image for similar patches has no major benefit, with the exception of periodic or quasiperiodic images. Also, searching the whole image for each pixel is computationally too expensive. Therefore, in the rest of this paper, the term nonlocal means refers specifically to semilocal filtering. Nonlocal means filtering is defined [19] as follows:

$$NL[v](i) = \sum_{j \in S_i} w(i,j)\, v(j), \tag{2}$$

where $NL[v](i)$ denotes the denoised value at pixel location $i$ and $w(i,j)$ is the weight obtained by measuring the similarity of the noisy intensity patches $v(N_i)$ and $v(N_j)$ around the central pixel $i$ and its neighboring pixel $j$, respectively, within the search window $S_i$. The normalization factor is given by $C(i) = \sum_{j \in S_i} \exp\left(-\|v(N_i) - v(N_j)\|_{2,a}^{2} / h^{2}\right)$.

The weights, $w(i,j)$, are obtained using the Gaussian-weighted $L^2$ norm [19]:

$$w(i,j) = \frac{1}{C(i)} \exp\left(-\frac{\|v(N_i) - v(N_j)\|_{2,a}^{2}}{h^{2}}\right), \tag{3}$$

where $a$ is the standard deviation of the Gaussian kernel applied within the patches, so that pixels closer to the patch center contribute more to the distance than those near the patch boundary. The nonlocal means approach involves three parameters: the size $|N|$ of the patch, the size $|S|$ of the search window, and the filtering or smoothing parameter $h$. Detailed discussions of suitable choices for these parameters can be found in [32, 34].

Note that the similarity measure given in (3) trivially assigns the largest value to the patch $v(N_i)$ when compared with itself, so that the central patch has the weight $w(i,i) \propto \exp(0) = 1$ for $j = i$. However, this self-similarity weight may not be reliable for noisy image data. To avoid this unreliable self-similarity weight, Buades et al. [19] took the maximum of all the weights computed for $j \neq i$ around the central patch as the center self-similarity weight. The standard similarity measure can then be given by

$$w(i,i) = \max_{j \in S_i,\; j \neq i} w(i,j). \tag{4}$$
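The filter described above can be sketched as follows. This is a minimal, unoptimized NumPy illustration of pixel-wise nonlocal means with a Gaussian-weighted patch distance and the max-rule for the central weight; the parameter names, defaults, and reflect padding are our own choices, not the paper's code.

```python
import numpy as np

def nlm_denoise(v, patch=3, search=7, h=10.0, a=1.0):
    """Semilocal nonlocal means sketch: for each pixel, average neighbors in
    the search window, weighted by Gaussian-weighted patch distances."""
    r, s = patch // 2, search // 2
    pad = r + s
    vp = np.pad(v.astype(float), pad, mode="reflect")
    # Gaussian kernel giving pixels near the patch center more influence
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    G = np.exp(-(x**2 + y**2) / (2 * a**2))
    G /= G.sum()
    out = np.zeros_like(v, dtype=float)
    for i in range(v.shape[0]):
        for j in range(v.shape[1]):
            ci, cj = i + pad, j + pad
            P = vp[ci - r:ci + r + 1, cj - r:cj + r + 1]
            wsum, acc, wmax = 0.0, 0.0, 0.0
            for di in range(-s, s + 1):
                for dj in range(-s, s + 1):
                    if di == 0 and dj == 0:
                        continue  # central weight handled by the max-rule
                    Q = vp[ci + di - r:ci + di + r + 1,
                           cj + dj - r:cj + dj + r + 1]
                    d2 = np.sum(G * (P - Q) ** 2)  # Gaussian-weighted distance
                    w = np.exp(-d2 / h**2)
                    wmax = max(wmax, w)
                    wsum += w
                    acc += w * vp[ci + di, cj + dj]
            # self-weight = max of the other weights, not exp(0) = 1
            wsum += wmax
            acc += wmax * vp[ci, cj]
            out[i, j] = acc / wsum if wsum > 0 else v[i, j]
    return out
```

On a constant image every patch distance is zero, so all weights are equal and the filter returns the input unchanged; in practice the double loop would be vectorized or restricted as discussed in [32, 34].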

##### 2.2. Accelerated Nonlocal Means Filtering

To increase the computational efficiency of nonlocal means filtering, several approaches [28, 33, 34, 36] have been introduced. Buades et al. [34] suggested an accelerated (block-based) version of the nonlocal means algorithm that, instead of restoring only the central pixel, replaces the whole patch around the central pixel with a weighted average of the patches around the neighboring pixels in the search window. Mathematically, this filtering process is defined as [34]

$$NL[v](N_i) = \sum_{j \in S_i} w(i,j)\, v(N_j), \tag{5}$$

where the similarity measure $w(i,j)$ is the same as defined in (3). Finally, the pixel value at the central location is recovered by averaging all the resulting estimators (patches) containing that pixel location. However, the cost of this computational improvement is a slight degradation in the visual quality of the denoised image.
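A minimal sketch of this block-based variant follows: whole patches are averaged, and each pixel's final value is the mean of all overlapping patch estimates that contain it. This is illustrative NumPy code; the uniform aggregation, reflect padding, and parameter names are our simplifications, not the authors' implementation.

```python
import numpy as np

def block_nlm(v, patch=3, search=7, h=10.0):
    """Block-based (accelerated) nonlocal means sketch: each restored patch
    is a weighted average of neighboring patches; overlapping patch
    estimates are then averaged per pixel."""
    r, s = patch // 2, search // 2
    pad = r + s
    vp = np.pad(v.astype(float), pad, mode="reflect")
    acc = np.zeros_like(vp)   # sum of patch estimates per pixel
    cnt = np.zeros_like(vp)   # number of estimates covering each pixel
    for i in range(v.shape[0]):
        for j in range(v.shape[1]):
            ci, cj = i + pad, j + pad
            P = vp[ci - r:ci + r + 1, cj - r:cj + r + 1]
            est = np.zeros_like(P)
            wsum = 0.0
            for di in range(-s, s + 1):
                for dj in range(-s, s + 1):
                    Q = vp[ci + di - r:ci + di + r + 1,
                           cj + dj - r:cj + dj + r + 1]
                    w = np.exp(-np.sum((P - Q) ** 2) / h**2)
                    est += w * Q        # average whole patches, not pixels
                    wsum += w
            acc[ci - r:ci + r + 1, cj - r:cj + r + 1] += est / wsum
            cnt[ci - r:ci + r + 1, cj - r:cj + r + 1] += 1.0
    # each pixel is the average of all patch estimates containing it
    return (acc / np.maximum(cnt, 1))[pad:-pad, pad:-pad]
```

In a real accelerated implementation, only a subgrid of reference patches is processed, which is where the speedup over the pixel-wise filter comes from.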

Mahmoudi and Sapiro [33] proposed a fast implementation of nonlocal means filtering through a preclassification step that removes unrelated patches prior to the computation of the similarity weights. This preclassification is based either on comparing the local average gray values computed from the intensity patches or on comparing the average gradients computed from the gradient patches. In either case, due to the presence of noise, the preclassification step is followed by a thresholding process. Finally, the weights are computed only for the preclassified patches. We employ the block-based nonlocal means filtering [34] in the first step of our algorithm, exploiting its relative simplicity compared with other fast implementations.
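The mean-based preselection can be sketched as follows. This is an illustrative NumPy sketch; the function names, the separable box-filter implementation of the patch averages, and the threshold parameter are our own, not taken from [33].

```python
import numpy as np

def patch_means(v, patch=3):
    """Local patch averages via a separable box filter (reflect padding)."""
    r = patch // 2
    vp = np.pad(v.astype(float), r, mode="reflect")
    k = np.ones(patch) / patch
    rows = np.apply_along_axis(lambda x: np.convolve(x, k, mode="valid"), 1, vp)
    return np.apply_along_axis(lambda x: np.convolve(x, k, mode="valid"), 0, rows)

def preselect(v, i, j, tau, patch=3):
    """Keep only pixels whose patch average lies within a threshold tau of
    the central patch's average; similarity weights are then computed over
    this reduced candidate set only."""
    m = patch_means(v, patch)
    return np.abs(m - m[i, j]) <= tau
```

The returned boolean mask replaces the full search window, so the expensive patch distance in the weight computation is evaluated only where the cheap mean test passes.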

##### 2.3. Adaptive Similarity Measure Approaches

In this section, we briefly discuss recent approaches that propose effective modifications to the similarity measure used in classical nonlocal means filtering. For further details of these modifications, we refer to [37, 38]. The weighted patch-wise photometric distance between the patches $v(N_i)$ and $v(N_j)$ can be expressed in vectorial notation as

$$d_W^2\big(v(N_i), v(N_j)\big) = \big(v(N_i) - v(N_j)\big)^{T} W \big(v(N_i) - v(N_j)\big), \tag{6}$$

where $W$ denotes a fixed diagonal matrix containing the Gaussian weights. Although the weighted norm represented by the weight matrix $W$ performs quite well, nonlocal means filtering oversmooths certain areas of a given image because the weight matrix is fixed. One of the most effective ways to incorporate locally adaptive similarity weights is based on the local covariance matrix. The locally adaptive photometric distance is therefore defined as

$$d^2\big(v(N_i), v(N_j)\big) = \big(v(N_i) - v(N_j)\big)^{T} C_i^{-1} \big(v(N_i) - v(N_j)\big), \tag{7}$$

where $C_i$ is the local covariance matrix for the patch centered at pixel $i$. However, a reliable covariance matrix cannot be obtained from a single observation that itself contains noise. To overcome this problem, a certain redundancy in the given image is used as an implicit prior, which generally holds for natural images. Exploiting this assumption, the noisy patches in the observed image similar to the reference patch are sampled and grouped to obtain an estimate of the covariance matrix. Several methods [35, 39, 40] have been proposed to obtain robust and reliable estimates of the locally adaptive covariance matrix; they are briefly described below.
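The locally adaptive distance, with a covariance estimated from a group of patches similar to the reference patch, can be sketched as follows. This is illustrative NumPy code; the regularization term, the grouping being passed in precomputed, and all names are our assumptions.

```python
import numpy as np

def adaptive_patch_distance(p_ref, group):
    """Mahalanobis-type patch distance sketch: estimate the local covariance
    from a group of similar patches (shape: n_patches x patch_dim), then use
    its inverse in place of a fixed diagonal weight matrix."""
    mu = group.mean(axis=0)
    X = group - mu
    # regularized sample covariance; a single noisy patch would not suffice
    C = X.T @ X / max(len(group) - 1, 1) + 1e-3 * np.eye(group.shape[1])
    Cinv = np.linalg.inv(C)
    diffs = group - p_ref
    # quadratic form (p - p_ref)^T C^{-1} (p - p_ref) for every group member
    return np.einsum("nd,de,ne->n", diffs, Cinv, diffs)
```

Because the inverse covariance downweights directions dominated by noise, two patches differing mostly along high-variance (textured) directions are treated as closer than the fixed Gaussian weighting would suggest.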

Dabov et al. [39] proposed a further enhancement of the original BM3D method [25] by introducing a shape adaptive transform basis (BM3D-SAPCA) instead of a fixed DCT or wavelet transform basis. This shape adaptivity is achieved by incorporating the following modifications. Firstly, shape adaptive 3D groups of patches similar to the reference patch are constructed as described in [41]. Secondly, the local covariance matrix is estimated from this group, provided that the number of similar patches is large enough to ensure a reliable estimate. Thirdly, the PCA transform is obtained by eigendecomposition of the estimated covariance matrix, and only those eigenvectors whose eigenvalues exceed a fixed threshold are retained as principal components. Next, the truncated PCA transform is applied to the 3D group of shape adaptive patches, with a 1D orthogonal wavelet transform along the third dimension of the group. The remaining steps are the same as in the original BM3D [25].

Zhang et al. [40] proposed the LPG-PCA approach, in which PCA is performed on the local covariance matrix estimated by local pixel grouping (LPG) of the patches. Afterwards, linear minimum mean square-error estimation (LMMSE) is used for coefficient shrinkage in the PCA transform domain obtained from diagonalization of the covariance matrix. This approach is similar to BM3D-SAPCA [39] in that it involves a two-step denoising process. However, in contrast to BM3D, it does not depend on a basic (oracle) estimate obtained in a first iteration. Recently, Lebrun et al. [35] proposed the NL-Bayes algorithm to merge Fourier-transform-domain-like methods with a Bayesian framework. In this case, a reliable estimate of the covariance matrix is obtained by applying Bayes' rule to 3D groups of noisy patches, followed by MAP (maximum a posteriori) estimation. The LPG-PCA method [40] can be seen as a special case of this more general approach.

As will be described in the next section, the proposed modification to the classical similarity measure significantly differs from the above mentioned approaches. It assigns the adaptive similarity weights based on the correlation of normal vectors patches in addition to the similarity of intensity patches. Also, unlike the above techniques, the proposed approach does not apply 2D or 3D grouping of the similar patches.

##### 2.4. Discrete Stationary Wavelet Transform

Wavelets have numerous applications, particularly in signal and image processing, thanks to their efficient time-frequency localization and multiresolution properties. The wavelet transform is generated by the convolution integral with a wavelet $\psi$ that defines a Riesz basis for $L^2(\mathbb{R})$ through the two-parameter family $\{\psi_{j,k}\}_{j,k \in \mathbb{Z}}$, where

$$\psi_{j,k}(t) = 2^{j/2}\, \psi(2^{j} t - k). \tag{8}$$

The wavelet decomposition of $f \in L^2(\mathbb{R})$ is given by

$$f = \sum_{j,k \in \mathbb{Z}} \langle f, \psi_{j,k} \rangle\, \tilde{\psi}_{j,k}, \tag{9}$$

where $\tilde{\psi}$ is the dual wavelet to $\psi$. In the orthogonal wavelet case, $\tilde{\psi} = \psi$. The wavelet decomposition provides perfect reconstruction and a multiresolution analysis of $L^2(\mathbb{R})$ as well. In practical situations, $f$ is a signal with a finite resolution scale. In this case, the wavelet decomposition has the representation

$$f = \sum_{k} \langle f, \phi_{J_0,k} \rangle\, \phi_{J_0,k} + \sum_{j = J_0}^{J-1} \sum_{k} \langle f, \psi_{j,k} \rangle\, \psi_{j,k}, \tag{10}$$

where $J$ is the finest resolution scale, determined by the size of the signal, and $J_0$ is a chosen coarsest resolution scale. Also, $\phi$ is the scaling function associated with the wavelet $\psi$. The standard discrete wavelet transform (DWT) decomposes a signal into low-pass and high-pass subbands. At a given resolution scale $j$, $\{\langle f, \phi_{j,k} \rangle\}_k$ is the approximate (low-passed) data and $\{\langle f, \psi_{j,k} \rangle\}_k$ is the detail (high-passed) data of $f$. For the properties of the wavelet transform, we refer to [42].

The one-dimensional wavelet can be applied through separable extension to obtain a two-dimensional wavelet. One easy way to construct a two-dimensional wavelet is to use the tensor product of one-dimensional wavelets. In this case, there are one scaling function $\Phi(x,y) = \phi(x)\phi(y)$ and three kinds of wavelets $\Psi^{l}$, $l = 1, 2, 3$: $\Psi^{1}(x,y) = \phi(x)\psi(y)$, $\Psi^{2}(x,y) = \psi(x)\phi(y)$, and $\Psi^{3}(x,y) = \psi(x)\psi(y)$ are the horizontally, vertically, and diagonally oriented wavelets, respectively. For given image data $f$, the wavelet transform provides the subband data through the wavelets and scaling function at resolution scale $j$ by

$$A_j f = \{\langle f, \Phi_{j,k} \rangle\}_k, \qquad D_j^{l} f = \{\langle f, \Psi^{l}_{j,k} \rangle\}_k, \quad l = 1, 2, 3. \tag{11}$$

So, $A_j f$ represents the low-passed data and the $D_j^{l} f$ are the high-passed detail data: $D_j^{1} f$, $D_j^{2} f$, and $D_j^{3} f$ are the horizontal, vertical, and diagonal detail subband data, respectively. Note that the standard DWT generates downsampled data, because the low-pass and high-pass filtering is followed by decimation in the nature of the DWT. Also, even though the DWT has the perfect reconstruction property, it is not shift invariant: the wavelet coefficients of given data and of its shifted version may not be the same [43]. So, the transformed data lose some detail information.
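One decimated level of the separable (tensor product) 2D transform can be sketched as follows, making the halving of the subband size explicit. This is a minimal NumPy sketch assuming even image dimensions; the Haar filters stand in for a generic wavelet, and the subband naming follows the orientation convention above.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the separable 2D Haar DWT: low/high-pass filtering along
    each axis followed by decimation, giving four half-size subbands."""
    a = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2)   # low pass along rows
    d = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2)   # high pass along rows
    A  = (a[0::2, :] + a[1::2, :]) / np.sqrt(2)      # approximation
    DH = (d[0::2, :] + d[1::2, :]) / np.sqrt(2)      # horizontal detail
    DV = (a[0::2, :] - a[1::2, :]) / np.sqrt(2)      # vertical detail
    DD = (d[0::2, :] - d[1::2, :]) / np.sqrt(2)      # diagonal detail
    return A, DH, DV, DD
```

The strided slicing is the decimation step: each subband has half the rows and columns of the input, which is precisely what makes the transform shift variant.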

To overcome this kind of disadvantage, the discrete stationary wavelet transform (DSWT) was proposed [42, 43]. One of the main ideas of the DSWT is to perform the low-pass and high-pass filtering without decimation. In this approach, multiresolution analysis is achieved by simply inserting zeros between every adjacent pair of coefficients in the low-pass and high-pass filters associated with the wavelet. This insertion of zeros doubles the effective length of the filters from one resolution scale to the next, in place of downsampling the data. Consequently, the DSWT provides a redundant representation of the data, so that the resulting wavelet decomposition is overcomplete. We note the following advantages of the DSWT from the viewpoint of image denoising:
(i) the redundant representation of the data contains more information than a nonredundant one;
(ii) the shift-invariant representation has smooth regularity and does not contain errors due to downsampling;
(iii) the size of each subband equals that of the input image, so the spatial information of the subbands can be compared directly.
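For contrast with the decimated transform, one undecimated level can be sketched as below: the zero insertion is realized by striding the filter taps, and every subband keeps the input size. This is illustrative NumPy code; the periodic boundaries (via `np.roll`) and the simple averaging normalization are our choices, not part of the DSWT definition.

```python
import numpy as np

def haar_swt2_level(img, j):
    """One level of an undecimated (stationary) 2D Haar transform: the
    filter taps are spaced 2**j samples apart (zeros inserted 'a trous')
    and no decimation is performed, so subbands keep the input size."""
    step = 2 ** j                       # gap created by the inserted zeros
    lo = lambda x, ax: (x + np.roll(x, -step, axis=ax)) / 2.0
    hi = lambda x, ax: (x - np.roll(x, -step, axis=ax)) / 2.0
    a = lo(img, 1)                      # filter along rows (periodic)
    d = hi(img, 1)
    A  = lo(a, 0)                       # approximation
    DH = lo(d, 0)                       # horizontal detail
    DV = hi(a, 0)                       # vertical detail
    DD = hi(d, 0)                       # diagonal detail
    return A, DH, DV, DD
```

With circular boundaries the transform commutes exactly with image shifts, which is the shift-invariance property (advantage (ii)) that the decimated DWT lacks; advantage (iii) is visible in the unchanged subband shapes.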

In the proposed method, we implicitly exploit these advantages to obtain the similarity measure of normal vectors patches. For notational convenience, we will use the same notation for the DSWT as for the DWT; that is, $A_j f$, $D_j^{1} f$, $D_j^{2} f$, and $D_j^{3} f$ are the subband data at resolution scale $j$ generated by the DSWT.

#### 3. The Proposed Algorithm

In this section, we adopt a hybrid approach to nonlocal means filtering. Our scheme is composed of two steps, as shown in Figure 1. In the first step, we apply the accelerated (block-based) version of nonlocal means filtering in the wavelet transform domain to obtain a predenoised image, as shown in Figure 2. In the second step, we employ a modified version of conventional nonlocal means filtering on the given noisy image. In this step, the weights are computed using the predenoised image obtained in the first step.