Abstract
Images may be corrupted by salt and pepper impulse noise due to noisy sensors or channel transmission errors. A denoising method that detects noise candidates and enforces image sparsity with a patch-based sparse representation is proposed. First, noise candidates are detected and an initial guide image is obtained via adaptive median filtering; second, a patch-based sparse representation is learnt from this guide image; third, a weighted $\ell_1$ regularization method is proposed to penalize the noise candidates more heavily than the rest of the pixels. An alternating direction minimization algorithm is derived to solve the regularization model. Experiments are conducted for 30%∼90% impulse noise levels, and the simulation results demonstrate that the proposed method outperforms total variation and Wavelet in terms of preserving edges and structural similarity to the noise-free images.
1. Introduction
Noise is an inevitable problem in image processing and computer vision. Images may be corrupted by impulse noise due to noisy sensors or channel transmission errors [1]. To improve the image quality, it is important to remove this noise.
Median [1] or adaptive median filtering (AMF) [2] is usually adopted to remove impulse noise. Noise detection and pixel restoration are the two main steps in impulse noise removal. In the noise detection stage, noise candidates may be found in the spatial domain [2] or in a multiscale decomposition domain [3, 4]. In the spatial domain, the size of the local window is set adaptively [5, 6] in a noise-aware way, and a hypergraph can be defined to model the relationship between a central pixel and its neighboring pixels [7]. In the multiscale decomposition domain, the Wavelet transform has been adopted [3]. In the pixel restoration stage, recovery methods can be applied in the spatial domain [2] or in a multiscale decomposition domain [4, 8]. In the spatial domain, pixels are better recovered along the optimal direction if directional edges are taken into account [9, 10]. Fuzzy rules are also applied in both the spatial [11] and Wavelet [8] domains to deal with the uncertainty of inaccurate recovery. When 50% of the pixels are corrupted by impulse noise, the methods in [9, 10] have shown promising results, but they produce unsatisfactory images when the noise level is higher than 50%. The methods in [4, 5] restored reasonable images when 90% of the pixels are contaminated.
Images are recovered much better than with typical impulse denoising methods if appropriate sparsity priors are applied [12–14]. Beyond traditional multiscale decomposition methods, for example, Wavelet [3, 4, 12], sparse representations of images model wider image priors such as geometric directions [15–17] or redundancy [18] among images. In impulse noise removal, noise is detected by a sparse representation in the identity matrix [13], or images are restored by enforcing their sparsity in the Wavelet domain [12] or the finite difference domain [19]. Rather than using a fixed basis or dictionary, data-driven dictionaries have been proposed to provide an adaptive sparse representation for a specific image [20, 21]. To further reduce the reconstruction error in impulse noise removal, adaptive sparse representation has been explored under the framework of dictionary learning [14, 22]. However, iterative dictionary training is time consuming.
Recently, a fast adaptive sparse representation of images, called the patch-based nonlocal operator (PANO) [23], has been proposed to make use of image self-similarity [24–26]. PANO searches for similar patches in a neighborhood; thus, the training phase is relatively fast. In our previous work, PANO was explored to reconstruct images from impulse noise corruption using an $\ell_1$ regularization model. By learning the similarity from images restored by median filtering, both strong edges and textures are recovered much better than with total variation or Wavelet [27]. However, the denoising performance dropped when 50% of the pixels are contaminated with impulse noise.
In our experiments, we found two reasons for the reduction of denoising performance. First, many image structures are lost in the guide image, which hampers learning the proper similarity. Second, noise-free pixels are also changed since the original PANO-based $\ell_1$ regularization model did not distinguish between noisy and noise-free pixels. To address these two issues, adaptive median filtering [2] is first used to obtain a good guide image and to detect the noise candidates, and a new model penalizing the noisy pixels more than the noise-free ones is proposed to preserve the noise-free pixels.
In this paper, a novel denoising framework is proposed for salt and pepper noise removal based on adaptive sparse representation of similar image patches. The contributions of this paper are summarized as follows.
(1) Noise candidates are detected first, and only noisy pixels are heavily penalized in a weighted $\ell_1$ regularization model.
(2) The similarity learning is improved with a proper guide image, thus producing a more accurate adaptive sparse representation of image patches.
(3) A numerical algorithm, alternating direction minimization with continuation, is developed to solve the new model accordingly.
The rest of this paper is organized as follows. The original PANO-based $\ell_1$ regularization model is reviewed in Section 2.1. The new denoising approach is presented in Section 2.2. Experiments and results analysis are given in Section 3. Finally, Section 4 presents the conclusions.
2. Method
2.1. Review of PANO-Based Impulse Noise Removal Method
In the original PANO-based noise removal method [27], the image $x$ is recovered from the noisy observation $y$ by solving an $\ell_1$ minimization model
$$\min_{x} \sum_{j=1}^{J} \|A_j x\|_1 + \lambda \|x - y\|_1, \tag{1}$$
where $\sum_{j=1}^{J} \|A_j x\|_1$ promotes the sparsity of groups of similar patches using PANO $A_j$, $\|x - y\|_1$ removes the outliers, that is, impulse noise in images, and $\lambda$ balances between the sparsity and outliers removal. For heavier noise the image is expected to be smoothed more strongly, so the sparsity term should be given more weight, which in (1) corresponds to a smaller $\lambda$.
However, all the pixels of noisy observation are penalized equally. This may change the pixels uncorrupted by noise. If the noise candidates are detected, the denoising performance is expected to be improved [28].
2.2. PANO-Based Impulse Noise Removal with Noise Detection and a Weighted $\ell_1$ Regularization Model (PANO-ND)
In the proposed method, the salt and pepper noise is detected first and the locations of noise candidates are marked. With the location information, a weighted $\ell_1$ regularization model is proposed to preserve the noise-free pixels while removing noise. We call the proposed method PANO with noise detection (PANO-ND).
2.2.1. Noise Detection
For the salt and pepper noise, the adaptive median filter (AMF) can be used to detect the noise candidates and replace each with the median of the pixels in a local window [2]. As shown in Figure 2, an adaptive structure of the filter ensures that most of the impulse noise is detected even at a high noise level provided that the window size is large enough [28].
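Let $S_{i,j}^{k}$ be a window of size $k \times k$ centered at $(i,j)$, and let $k_{\max} \times k_{\max}$ be the maximum allowable window size. Initialize $k = 3$ and compute $s_{i,j}^{\min,k}$, $s_{i,j}^{\mathrm{med},k}$, and $s_{i,j}^{\max,k}$, which denote the minimum, median, and maximum of the pixel values in $S_{i,j}^{k}$, respectively. When $s_{i,j}^{\min,k} < s_{i,j}^{\mathrm{med},k} < s_{i,j}^{\max,k}$, that is, impulse noise does not dominate the window, we judge that $y_{i,j}$ is not a noise candidate if $s_{i,j}^{\min,k} < y_{i,j} < s_{i,j}^{\max,k}$; otherwise we replace $y_{i,j}$ by $s_{i,j}^{\mathrm{med},k}$ (Median Filtering); that is,
$$\hat{y}_{i,j} = \begin{cases} y_{i,j}, & s_{i,j}^{\min,k} < y_{i,j} < s_{i,j}^{\max,k}, \\ s_{i,j}^{\mathrm{med},k}, & \text{otherwise}. \end{cases} \tag{2}$$
When the median coincides with the minimum or the maximum, that is, impulse noise dominates this window, we set $k = k + 2$ (Adaptive) and repeat the above steps.
When $k > k_{\max}$, we replace $y_{i,j}$ by $s_{i,j}^{\mathrm{med},k_{\max}}$ (the algorithm is terminated when the maximum window size is reached). Notice that except for the noise candidates that are replaced by the median, the remaining pixels are left unchanged. The flowchart of AMF is summarized in Figure 3. In the proposed scheme, the noise candidates are detected using AMF.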
The locations of detected noise candidates are stored in a set $\Omega$ whose complementary set $\Omega^{c}$ stores the locations of the rest of the pixels. For an image $y$, $\Omega$ and $\Omega^{c}$ will be used to generate a diagonal matrix $W$ whose entries stand for weights on pixels in the regularization model.
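The detection rule and the resulting pixel weights can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `adaptive_median_filter` and `weights_from_candidates`, the default maximum window size `k_max=9`, the reflective border padding, and the weight values 0.1 and 1.0 are all choices made for this example.

```python
import numpy as np

def adaptive_median_filter(y, k_max=9):
    """Return (filtered image, boolean mask of noise candidates).

    k_max is the (odd) maximum window size; its default is an assumption.
    """
    y = np.asarray(y, dtype=float)
    pad = k_max // 2
    padded = np.pad(y, pad, mode="reflect")  # border handling is an assumption
    out = y.copy()
    candidates = np.zeros(y.shape, dtype=bool)
    rows, cols = y.shape
    for i in range(rows):
        for j in range(cols):
            ci, cj = i + pad, j + pad  # pixel position in the padded image
            k = 3
            while True:
                r = k // 2
                window = padded[ci - r:ci + r + 1, cj - r:cj + r + 1]
                s_min, s_med, s_max = window.min(), np.median(window), window.max()
                if s_min < s_med < s_max:
                    # Impulse noise does not dominate: the median is trusted.
                    if not (s_min < y[i, j] < s_max):
                        out[i, j] = s_med
                        candidates[i, j] = True
                    break
                k += 2  # impulse noise dominates: enlarge the window
                if k > k_max:
                    # Maximum window reached: use the median of the largest window.
                    out[i, j] = s_med
                    candidates[i, j] = True
                    break
    return out, candidates

def weights_from_candidates(candidates, w_noise=0.1, w_rest=1.0):
    """Diagonal of the weight matrix: small weight on noise candidates."""
    return np.where(candidates, w_noise, w_rest)
```

Because the test uses strict inequalities, a clean pixel that happens to be the minimum or maximum of its window is also flagged as a candidate, so the mask should be read as a set of candidates rather than as confirmed noise.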
2.2.2. Weighted $\ell_1$ Regularization Model
In this paper, a weighted $\ell_1$ regularization model is proposed to solve the image reconstruction problem as follows:
$$\min_{x} \sum_{j=1}^{J} \|A_j x\|_1 + \lambda \|W(x - y)\|_1, \tag{3}$$
where $y$ is the noisy observation and $x$ is the original image to be recovered; $\sum_{j=1}^{J} \|A_j x\|_1$ promotes the sparsity of groups of similar patches using PANO $A_j$; $\|W(x - y)\|_1$ removes the outliers, that is, impulse noise in images; $W$ is a diagonal matrix whose entries stand for weights on pixels; $\lambda$ balances between the sparsity and outliers removal.
Let $w_i$ be the $i$th diagonal entry of $W$; a small weight is assigned for noise candidates and a large weight for the rest of the pixels; that is,
$$w_i = \begin{cases} w_{\mathrm{N}}, & i \in \Omega, \\ w_{\mathrm{R}}, & i \in \Omega^{c}, \end{cases} \qquad 0 < w_{\mathrm{N}} < w_{\mathrm{R}}, \tag{4}$$
where $\Omega$ and $\Omega^{c}$ contain the indexes of noise candidates and the rest of the pixels, respectively. In this paper, $w_{\mathrm{R}}$ is fixed for simplicity. The effect of $w_{\mathrm{N}}$ will be analyzed in Section 3.3.
Compared with (1), noise candidates are distinguished from the other pixels and are expected to be suppressed according to the new model in (3). Therefore, the rest of the pixels are relatively preserved in the image reconstruction.
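To solve the new regularization model, the alternating direction minimization with continuation [23] is modified to solve the minimization problem with two regularization terms. A relaxed unconstrained form of (3) is written as
$$\min_{x,\alpha_j,z} \sum_{j=1}^{J} \left( \|\alpha_j\|_1 + \frac{\beta}{2} \|\alpha_j - A_j x\|_2^2 \right) + \lambda \left( \|z\|_1 + \frac{\beta}{2} \|z - W(x - y)\|_2^2 \right). \tag{5}$$
The solution of (5) approaches that of (3) as $\beta \to \infty$ [7]. For practical implementation, as $\beta$ gradually increases, we use the previous solution as a “warm start” for the next alternating optimization.
When $\beta$ is fixed, (5) can be solved in an alternating fashion as follows.
(1) For a fixed $x$, solve
$$\min_{\alpha_j} \|\alpha_j\|_1 + \frac{\beta}{2} \|\alpha_j - A_j x\|_2^2, \tag{6}$$
whose solution is obtained via soft thresholding
$$\alpha_j = S\!\left(A_j x, \tfrac{1}{\beta}\right), \qquad S(u, \tau) = \operatorname{sign}(u) \max(|u| - \tau, 0), \tag{7}$$
for each $j$, and solve
$$\min_{z} \|z\|_1 + \frac{\beta}{2} \|z - W(x - y)\|_2^2, \tag{8}$$
whose solution is obtained via soft thresholding
$$z = S\!\left(W(x - y), \tfrac{1}{\beta}\right). \tag{9}$$
(2) For fixed $\alpha_j$ and $z$, solve
$$\min_{x} \sum_{j=1}^{J} \frac{\beta}{2} \|\alpha_j - A_j x\|_2^2 + \frac{\lambda\beta}{2} \|z - W(x - y)\|_2^2, \tag{10}$$
which can be written as
$$\min_{x} \sum_{j=1}^{J} \|\alpha_j - A_j x\|_2^2 + \lambda \|(z + W y) - W x\|_2^2. \tag{11}$$
The minimizer of (11) is given by the solution of the normal equation
$$\left( \sum_{j=1}^{J} A_j^{T} A_j + \lambda W^{T} W \right) x = \sum_{j=1}^{J} A_j^{T} \alpha_j + \lambda W^{T} (z + W y), \tag{12}$$
which, since $W$ is diagonal, can be simplified as
$$\left( \sum_{j=1}^{J} A_j^{T} A_j + \lambda W^{2} \right) x = \sum_{j=1}^{J} A_j^{T} \alpha_j + \lambda W z + \lambda W^{2} y, \tag{13}$$
and the term $\sum_{j=1}^{J} A_j^{T} \alpha_j$ is an assembled image reconstructed from patches in all groups. One can use conjugate gradient to solve (13).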
The algorithm is summarized in Algorithm 1 where is updated in the subsequent iterations by following Step 1, 2, and 3.
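The alternating updates with continuation can be illustrated on a toy problem. The sketch below is not the PANO implementation: it assumes a 1-D signal and replaces the PANO groups by a single orthonormal DCT matrix `A`, so that the normal equation reduces to a pixel-wise division instead of a conjugate gradient solve. The function names, the auxiliary variables `alpha` and `z`, the thresholds `1/beta`, and the continuation schedule `betas` are assumptions of this example.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix, used here only as a stand-in for PANO."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    A = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    A[0, :] /= np.sqrt(2.0)
    return A

def soft_threshold(u, tau):
    """Element-wise soft thresholding: sign(u) * max(|u| - tau, 0)."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

def adm_weighted_l1(y, w, A, lam=1.0, betas=(1, 4, 16, 64, 256, 1024), inner=20):
    """Approximately minimize ||A x||_1 + lam * ||w * (x - y)||_1.

    y : noisy signal, w : per-pixel weights (diagonal of W),
    A : orthonormal transform (A.T @ A = I is assumed).
    """
    x = np.array(y, dtype=float)
    for beta in betas:                      # continuation: beta grows
        for _ in range(inner):              # warm start from the previous x
            alpha = soft_threshold(A @ x, 1.0 / beta)
            z = soft_threshold(w * (x - y), 1.0 / beta)
            # Normal equation (I + lam W^2) x = A^T alpha + lam W (z + W y);
            # the left-hand matrix is diagonal, so divide pixel by pixel.
            x = (A.T @ alpha + lam * w * (z + w * y)) / (1.0 + lam * w ** 2)
    return x

# Toy usage: a constant signal with one impulse flagged as a noise candidate.
n = 32
y = np.full(n, 0.5)
y[10] = 1.0
w = np.ones(n)
w[10] = 0.1
x = adm_weighted_l1(y, w, dct_matrix(n))
```

With actual PANO groups the left-hand matrix of the normal equation is no longer the identity plus a diagonal term in general, which is why an iterative solver such as conjugate gradient is mentioned in the text.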

3. Results
The proposed method is compared with AMF [2] and with sparsity-based denoising methods including TV [28, 29] and dual-tree complex Wavelet [12]. A suffix “ND” means that noise detection is applied in denoising. The proposed method PANO-ND is also compared with the original PANO [23]. Typical parameters of PANO-ND, including 16 similar patches in a group, the patch size, and the search region size, are the same as in [23]. The similarity is first learnt from an image denoised by the adaptive median filter, and then learnt twice from the denoised image for further reconstructions.
To quantitatively measure the denoising performance, the peak signal-to-noise ratio (PSNR) and the mean measure of structural similarity (MSSIM) [30] are used. PSNR mainly measures the average pixel difference between the denoised and noise-free images. MSSIM focuses on the consistency of image structures between the denoised image and the original image. Higher PSNR and MSSIM mean better denoising performance. The regularization parameter of all sparsity-based methods is optimized to achieve the highest MSSIMs. The maximum window size of AMF is chosen so that its performance is stable under all noise levels [28].
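For reference, PSNR can be computed as below. This is a minimal sketch using the standard definition; the function name `psnr` and the default peak value of 255 (8-bit images) are assumptions, since the text does not state the intensity range used in the experiments.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)  # mean squared pixel error
    if mse == 0:
        return float("inf")                     # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

MSSIM is more involved because it averages a local structural similarity index over sliding windows; an existing implementation is usually preferable to writing one from scratch.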
3.1. Effect of Noise Detection
Although edges are reconstructed better by exploring the image self-similarity in the original PANO (Figure 1(d)), all sparsity-based denoising methods fail to recover the image structures (Figures 4(d)–4(f)) when 50% or more of the pixels are corrupted and no noise detection is used. By assigning noise candidates a small weight, for example, 0.1 in the experiments, all these methods greatly improve the edge reconstruction (Figures 4(g)–4(i)). Thus, noise detection is necessary for salt and pepper noise removal.
Among these methods, PANO-ND produces the most faithful images (Figure 4(i)) by clearly preserving the edges of the Barbara image. The House image (Figure 5) has more flat regions separated by long strong edges, such as the eave of the house, and the Boat image (Figure 6) contains similar structures, such as the pole. These images yield more groups of similar patches; thus, the proposed method shows a greater visual advantage in restoring these edges.
In Table 1, we compare the performance of AMF, Wavelet, PANO, and these methods with noise detection and weighted $\ell_1$ regularization (ND). The proposed method achieves the highest PSNRs and MSSIMs, indicating that it performs better than the existing ones. Therefore, PANO-ND significantly improves the edges and achieves better quantitative measures than the other methods.
3.2. Different Noise Levels
This section discusses how the performance varies with the noise level. All the compared methods use noise detection since its importance has been demonstrated. The denoising performance is quantitatively compared in Figure 7.
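A test image at a given noise level can be simulated as follows. This is a hedged sketch rather than the authors' protocol: the function name `add_salt_and_pepper`, the equal split between salt and pepper, and the extreme values 0 and 255 are assumptions.

```python
import numpy as np

def add_salt_and_pepper(image, level, rng=None, low=0.0, high=255.0):
    """Corrupt a fraction `level` of the pixels with salt (high) or pepper (low)."""
    rng = np.random.default_rng() if rng is None else rng
    flat = np.array(image, dtype=float).ravel()   # independent copy of the input
    n_corrupt = int(round(level * flat.size))
    idx = rng.choice(flat.size, size=n_corrupt, replace=False)
    salt = rng.random(n_corrupt) < 0.5            # assumed 50/50 salt vs pepper
    flat[idx[salt]] = high
    flat[idx[~salt]] = low
    return flat.reshape(np.shape(image))

# Example: noise levels from 30% to 90%, as in the experiments.
clean = np.full((64, 64), 128.0)
noisy_images = {p: add_salt_and_pepper(clean, p / 100.0, np.random.default_rng(0))
                for p in range(30, 100, 10)}
```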
Under all noise levels, TV-ND achieves higher PSNRs and MSSIMs than AMF. Wavelet-ND obtains higher PSNRs than AMF for less than 50% noise and lower PSNRs when the noise level is further increased. However, the MSSIMs of Wavelet-ND are consistently larger than those of AMF for all noise levels, which implies that the image structures are better preserved. Comparing Wavelet-ND with TV-ND, the former leads to higher MSSIMs for the Barbara image (Figure 7(d)), which contains rich textures, while the latter obtains higher MSSIMs for House (Figure 7(e)) and Boat (Figure 7(f)), which have more flat regions. The repeated small and directional patterns in the clothes of the Barbara image are usually considered as texture [31, 32]. Due to the directional Wavelet functions of the adopted dual-tree complex Wavelet, the Wavelet transform provides a sparse representation of texture [33, 34]. In contrast, total variation is good at sparsifying piecewise constant image features [32, 35] and thus favors recovering flat regions in the House and Boat images. Texture is easily lost when total variation is applied in noise removal [36]. Therefore, TV-ND is more suitable for images with flat regions while Wavelet-ND is better for images with rich textures.
The proposed PANO-ND outperforms both TV-ND and Wavelet-ND in terms of MSSIMs. PANO-ND achieves larger or comparable PSNRs when the noise level is below 90%. Besides, the higher MSSIMs of PANO-ND imply that image structures recovered by PANO-ND are more consistent with those of noise-free images. As shown in Figures 8, 9, and 10, edges are recovered much more clearly using PANO-ND than TV-ND. When the noise level approaches 90%, an extremely heavy noise, the PSNRs (lower than 34 dB) and MSSIMs (lower than 0.75) of all the methods are unsatisfactory since very limited information is available.
3.3. Effect of Weight for Noise Candidates
A smaller weight $w_{\mathrm{N}}$ on the noise candidates should be assigned to achieve better denoising performance. As shown in Figure 11, when $w_{\mathrm{N}}$ becomes larger than 0.1, the quantitative measures drop dramatically. One explanation is that, when $w_{\mathrm{N}}$ and the weight $w_{\mathrm{R}}$ on the rest of the pixels are comparable, the penalization on noise candidates and on the rest of the pixels is of the same order; thus, noise candidates are not heavily suppressed in reconstruction. Therefore, a weight no larger than 0.1 on the noise candidates is suggested.
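To make this explanation concrete, consider the contribution of a single pixel $i$ to the weighted data term. The short calculation below is only an illustration: the symbols $\lambda$, $w_i$, $x_i$, and $y_i$ denote the regularization parameter, the pixel weight, the recovered pixel, and the observed pixel, and the value 1 for the weight on the remaining pixels is assumed here.

```latex
\lambda\, w_i\, |x_i - y_i| =
\begin{cases}
0.1\,\lambda\,|x_i - y_i|, & \text{pixel } i \text{ is a noise candidate } (w_i = 0.1),\\[2pt]
\lambda\,|x_i - y_i|, & \text{otherwise } (w_i = 1).
\end{cases}
```

With these values, changing a noise candidate costs ten times less than changing any other pixel by the same amount, so the sparsity term is free to overwrite the candidates. As the candidate weight approaches 1, the two costs coincide and the model no longer distinguishes detected noise from the remaining pixels.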
3.4. Complexity Analysis
In this section, the computation time of the different methods is reported in Table 2. AMF is the fastest method, while PANO-ND is the most time-consuming. Wavelet-ND runs faster than TV-ND but much slower than AMF.
Figure 12 shows the improvement in the MSSIM scores of these methods with increased complexity. The curve of PANO-ND has 3 peaks because the nonlocal similarity is learnt, followed by reconstruction, 3 times. At the first peak, where the initial guide image is obtained using AMF, PANO-ND achieves MSSIM scores that are higher than or at least comparable to those of the other methods. When this reconstructed image is further adopted as a guide image, PANO-ND achieves even higher MSSIM at the next two peaks for the Barbara image. For the Boat and House images, it seems sufficient to perform PANO-based reconstruction only once.
3.5. Effect of the Regularization Parameter $\lambda$
The effect of the regularization parameter $\lambda$ is discussed in Figure 13. MSSIMs versus $\lambda$ are evaluated in Figure 13(a). As $\lambda$ increases, the MSSIMs increase first and then remain similar over a range of $\lambda$ at the 50% noise level. When $\lambda$ becomes too large, the MSSIMs decrease significantly. Reconstructed images with typical values of $\lambda$ are shown in Figures 13(b)–13(d). The best result is achieved for an intermediate $\lambda$. A too small $\lambda$ leads to the oversmoothed image in Figure 13(c), since a smaller $\lambda$ encourages higher sparsity and fine image structures are lost. With a too large $\lambda$, some noise is not removed since the consistency between a noisy image and its denoised version is highly enforced. How to set an optimal $\lambda$ is still unsolved and would be interesting future work.
4. Conclusion
A new salt and pepper impulse noise removal method is proposed by first detecting noise candidates and then enforcing image sparsity with a patch-based sparse representation. A weighted $\ell_1$ regularization model is proposed to penalize the noise candidates more heavily than the other pixels. The proposed scheme significantly improves the denoising performance over the original PANO-based method under heavy noise. Compared with traditional impulse denoising methods, including adaptive median filtering, total variation, and Wavelet, the new method shows obvious advantages in preserving edges and achieving higher structural similarity to the noise-free images. However, the nonlocal similarity is not accurate when the noise level is high, for example, for a 90% noisy image, since the initial guide image estimated by traditional adaptive median filtering is unsatisfactory. Future work includes the following.
(1) Taking advantage of nonlocal similarity in both noise detection and image restoration. A sparsity-based model that simultaneously extracts impulse noise [13] and recovers the image would avoid traditional adaptive median filtering; thus, it may improve the denoising performance when noise is heavy.
(2) Combining adaptive geometric information [15, 16, 37] with image patch similarity, which may further improve the denoising performance.
(3) Automatically setting the regularization parameters for a specific image to trade data consistency against sparsity, which remains open.
(4) Accelerating the proposed approach with advanced numerical algorithms [38–40] and hardware, for example, graphic processing units.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61302174, 61201045, and 61065007), Scientific Research Foundation for the Introduction of Talent at Xiamen University of Technology (YKJ12021R and YKJ12023R), Open Fund from Key Lab of Digital Signal and Image Processing of Guangdong Province (2013GDDSIPL07 and 54600321), and Fundamental Research Funds for the Central Universities (2013SH002). The authors are grateful to the reviewers for their thorough advice, which made this work more interesting.