Abstract

We define a strictly convex smooth potential function and use it to measure the data fidelity as well as the regularity for image denoising and cartoon-texture decomposition. The new model has several advantages over the well-known ROF (TV-$L^2$) model and the TV-$L^1$ model. First, due to the two-modality property of the new potential function, the new regularity has strong regularizing properties in all directions in smooth areas and thus encourages removing noise there, while, near edges, it smoothes the edge mainly along the tangent direction and thus can well preserve the edges. Second, the new potential function is very close to the $L^1$ norm; thus using it to measure the data fidelity makes the new model perform very well in removing impulse noise and preserving the contrast. Lastly, the proposed fidelity and regularization terms are strictly convex and smooth, so the model admits a unique global minimizer, which can be computed by the steepest descent method. Numerical experiments show that the proposed model outperforms TV-$L^1$ and TV-$L^2$ in removing impulse noise and mixed noise. It also outperforms some state-of-the-art methods specially designed for impulse noise. Tests on cartoon-texture decomposition show that our method is effective and performs better than TV-$L^1$.

1. Introduction

Image denoising aims to recover a clean image from a noisy observation. In this work, we mainly focus on removing the impulse noise, which randomly contaminates a portion of the pixels so that their true values are completely lost. The impulse noise is physically caused by malfunctioning pixels in camera sensors, faulty memory locations in hardware, or transmission in a noisy channel [1]. It can be categorized into two types: one is the random-valued impulse noise, for which the noisy pixels can take any random values between the maximal and the minimal pixel values; and another is the salt-and-pepper noise, for which the noisy pixels can take only the maximal and minimal pixel values. For both types of noise, the noisy pixels are assumed to be randomly distributed in the image.

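Let $u$ be the original clean image defined on its domain $\Omega \subset \mathbb{R}^2$ with Lipschitz boundary, and let $f$ be the observed image corrupted by impulse noise. The corruption can be formulated in the following general form:
$$f = N(u), \tag{1}$$
where $N$ represents an impulse noise. Two main models for the impulse noise are used in a wide variety of applications: salt-and-pepper noise and random-valued impulse noise [2]. Denote the dynamic range of $u$ by $[d_{\min}, d_{\max}]$; that is, $d_{\min} \le u(x) \le d_{\max}$ for every pixel $x \in \Omega$. The model of the salt-and-pepper noise is defined by
$$f(x) = \begin{cases} d_{\min}, & \text{with probability } s/2, \\ d_{\max}, & \text{with probability } s/2, \\ u(x), & \text{with probability } 1 - s, \end{cases} \tag{2}$$
where $u(x)$ denotes the gray level of $u$ at a pixel location $x$ and $s$ determines the level of the salt-and-pepper noise. The model of the random-valued impulse noise is defined by
$$f(x) = \begin{cases} d_x, & \text{with probability } r, \\ u(x), & \text{with probability } 1 - r, \end{cases} \tag{3}$$
where $d_x$ are identically and uniformly distributed random numbers in the range $[d_{\min}, d_{\max}]$ and $r$ defines the level of the random-valued impulse noise.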
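As an illustration of the two noise models described above, the following Python sketch corrupts a small gray-level image. It is a minimal sketch and not the code used in the experiments: the function names, the list-of-lists image representation, the parameter names `d_min` and `d_max` for the dynamic range, and the convention that salt and pepper each occur with half of the noise level are assumptions.

```python
import random


def salt_and_pepper(image, level, d_min=0, d_max=255, rng=None):
    """Set each pixel to d_min or d_max with probability level/2 each (assumed convention)."""
    rng = rng or random.Random()
    noisy = []
    for row in image:
        new_row = []
        for value in row:
            r = rng.random()
            if r < level / 2:
                new_row.append(d_min)      # "pepper"
            elif r < level:
                new_row.append(d_max)      # "salt"
            else:
                new_row.append(value)      # pixel left untouched
        noisy.append(new_row)
    return noisy


def random_valued(image, level, d_min=0, d_max=255, rng=None):
    """Replace each pixel, with probability `level`, by a uniform value in [d_min, d_max]."""
    rng = rng or random.Random()
    return [
        [rng.uniform(d_min, d_max) if rng.random() < level else value for value in row]
        for row in image
    ]


if __name__ == "__main__":
    clean = [[100] * 8 for _ in range(8)]
    print(salt_and_pepper(clean, 0.2, rng=random.Random(0))[0])
    print(random_valued(clean, 0.2, rng=random.Random(0))[0])
```

Under salt-and-pepper noise the corrupted pixels sit at the extremes of the dynamic range, whereas under random-valued noise they can take any value in it, which is why the latter is harder to detect.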

Image denoising is a typical ill-posed inverse problem and one of the most popular approaches is to solve a minimization problem of the form
$$\min_u \; \{ D(u, f) + R(u) \}, \tag{4}$$
where $D(u, f)$ is a data fitting term derived according to the assumed noise type and $R(u)$ is a regularization term that imposes a prior on $u$. Many methods have been proposed by using various a priori knowledge about the image and the noise [3-5]. One of the most influential examples is the Rudin-Osher-Fatemi (ROF, or TV-$L^2$) model [6]:
$$\min_u \; \lambda \int_\Omega |\nabla u| \, dx + \frac{1}{2} \int_\Omega (u - f)^2 \, dx, \tag{5}$$
where $|\nabla u|$ is the modulus of the gradient of $u$ and $\lambda > 0$ weights the regularization term. The ROF model uses the $L^2$ norm to measure the data fidelity under the additive Gaussian noise assumption and uses the total variation (TV) to measure the regularity by assuming the image is piecewise smooth or the gradient of the image is sparse. The total variation regularity allows for reconstruction of images with discontinuities across hypersurfaces and is extensively used in variational image restoration. Nevertheless, the $L^2$ fidelity leads to some limitations. One important issue is the loss of contrast in the restored image even if the observed image is noise-free; another issue is that the $L^2$ fidelity term deals well with Gaussian noise but does not perform well in removing impulse noise. In [7], Chan and Esedoglu use the $L^1$ norm as a measure of fidelity and formulate the following variational problem (TV-$L^1$):
$$\min_u \; \lambda \int_\Omega |\nabla u| \, dx + \int_\Omega |u - f| \, dx. \tag{6}$$
It was shown that the $L^1$ norm better preserves the contrast, and the order in which features disappear in the regularization process is completely determined by their geometry (area and length), rather than the contrast as in the ROF model. This important geometric property is also used for the active contour global minimization problem [8]. Using the $L^1$ fidelity, as analyzed in [7], model (6) implicitly detects the pixels contaminated by impulse noise and it preserves edges very well. Empirically, TV-$L^1$ outperforms TV-$L^2$ in detecting outliers and removing impulse noise [9].
However, in order to detect large noisy connected regions, TV-$L^1$ requires a greater weight of the regularization term in the cost function, which causes distortion of some pixels near edges. Moreover, it has some mathematical limitations: the minimizers of the variational problem (6) need not be unique in general because the fidelity term is not strictly convex; it is not smooth either, so solving the problem needs some regularization tricks. A weighted sum of $L^1$ and $L^2$ fidelities is used as the data fitting term in [10], and it works effectively and robustly for removal of mixed noise or almost any type of unknown noise. But it still suffers from the shortcomings of the $L^1$ fidelity. Huber norms [11] have been used for TV in order to avoid undesirable staircase effects [12]. In [13], the Huber loss is used for both data fidelity and regularization. The advantage of using the Huber loss in comparison to the $L^1$ norm is that geometric features such as edges are better preserved and it has continuous derivatives, in contrast to the $L^1$ norm, which is not differentiable and leads to staircase artifacts. However, the Huber norm involves a parameter that affects the results.
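For readers unfamiliar with the Huber loss mentioned above, a minimal Python sketch follows. It uses one common convention (quadratic for $|s| \le \delta$, linear beyond it); the threshold name `delta` and this particular scaling are assumptions and may differ from the form used in [13].

```python
def huber(s, delta=1.0):
    """Huber loss under an assumed convention: s^2/2 near 0, delta*(|s| - delta/2) elsewhere."""
    a = abs(s)
    if a <= delta:
        return 0.5 * s * s               # quadratic (L2-like) regime
    return delta * (a - 0.5 * delta)     # linear (L1-like) regime


if __name__ == "__main__":
    # The same residual is penalized differently depending on the threshold delta.
    for delta in (0.5, 1.0, 2.0):
        print(delta, [huber(s, delta) for s in (0.1, 1.0, 5.0)])
```

The printed rows differ for the same residuals, which illustrates why the threshold has to be tuned and why a parameter-free potential is attractive.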

Besides the variational methods, some filtering-based methods exist for impulse noise removal, such as the Adaptive Median Filters (AMF) [14] and the Adaptive Center Weighted Median Filters (ACWMF) [15]. The AMF method uses a median filter with variable window size to filter out impulse noise. It is robust in removing mixed impulses with high probability of occurrence while preserving sharpness. But it is ineffective when an image is disturbed by other types of mixed noise, such as Gaussian, Poisson, and impulse noise. The ACWMF further uses a spatially varying central weight to improve on the AMF, and it is better than the AMF in preserving details and in suppressing impulse noise, additive white noise, and signal-dependent noise. However, the ACWMF tends to become an identity filter if impulses exist within a window; in that case, the ACWMF is not effective in suppressing impulses, especially for salt-and-pepper noise.

Cartoon-texture decomposition is an important mathematical tool for image analysis. It aims to decompose an image into a cartoon component and a texture component. Ideally, the cartoon component is a piecewise smooth approximation of the original image and it mainly contains object hues and sharp edges, while the texture component contains repeated small scale patterns. The general framework for cartoon-texture decomposition has the following form:
$$\min_{u, v} \; \{ F_1(u) + F_2(v) : f = u + v \},$$
where $F_1$ and $F_2$ are two functionals, usually norms, measuring the cartoon $u$ and the texture $v$, respectively. Meyer [2] shows that the ROF model is ineffective in cartoon-texture decomposition because the $L^2$ norm is not a good measure of the texture, yet the TV is effective in measuring the cartoon component. To overcome the ineffectiveness of the $L^2$ norm in measuring the texture component, Meyer [2] and Haddad and Meyer [16] proposed using the $G$-norm, Vese and Osher [17] approximated the $G$-norm by the $\mathrm{div}(L^p)$ norm, Osher et al. [18] proposed using the $H^{-1}$ norm, Lieu and Vese [19] proposed using the more general $H^{-s}$ norm, and Le and Vese [20] proposed using the div(BMO) norm to measure the texture component. However, the models involving these norms are difficult to solve. Yin et al. [21] show that the $L^1$ norm is effective in measuring the texture component and proposed the TV-$L^1$ model.

In this work we define a strictly convex smooth potential function and use it to measure the data fidelity as well as the regularity for image restoration and cartoon-texture decomposition. Like the Huber norm, the new potential function has two modalities: it is approximately half the square function (corresponding to the $L^2$ norm) near 0 and approximately a linear function (corresponding to the $L^1$ norm) when its argument is far away from 0. But the Huber norm involves a parameter while our potential function does not. The new model has several advantages over the well-known Rudin-Osher-Fatemi (ROF) or TV-$L^2$ model and the TV-$L^1$ model. First, due to the two-modality property of the new potential function, applying it to the image gradient to measure the regularity makes the regularity work in two ways: in smooth areas of the image, the regularity results in a diffusion term that is uniform and isotropic, having strong regularizing properties in all directions, and thus encourages removing noise in smooth areas, while, near edges, the regularity results in a diffusion process which smoothes the edge mainly along the tangent direction and thus can well preserve the edges. This regularizing role of our regularity term is different from that of the TV; in particular, in smooth areas, the TV regularity causes the staircasing effect while our method does not. Second, the new potential function is very close to the $L^1$ norm; thus using it to measure the data fidelity makes the new model perform very well in removing impulse noise and preserving the contrast. Lastly, the proposed fidelity and regularization terms are strictly convex and smooth; thus the new model admits a unique global minimizer and it can be solved by using the steepest descent method. Mathematical analysis and numerical experiments show that the proposed model outperforms TV-$L^1$ and TV-$L^2$ in removing impulse noise and mixed noise. It also outperforms the Adaptive Median Filters (AMF) and the Adaptive Center Weighted Median Filters (ACWMF) in removing mixed noise.
We also apply this model to cartoon-texture decomposition. Experimental results show it performs better than TV-$L^1$ in cartoon-texture decomposition.

2. The Proposed Model

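The proposed model is as follows:
$$\min_u \; E(u) = \lambda \int_\Omega \varphi(|\nabla u|) \, dx + \int_\Omega \varphi(u - f) \, dx, \tag{9}$$
where $\lambda$ is a nonnegative tuning parameter and $\varphi$ is the proposed potential function. The rationale for this potential function can be explained as follows. First of all, the function $\varphi$ is strictly convex and differentiable, since $\varphi'$ exists everywhere and $\varphi'' > 0$; thus our model (9) admits a unique global minimizer and it can be solved by using the steepest descent method. Secondly, when $\varphi$ is used to measure the data fidelity, similar to the Huber norm, it is also a good approximation of the $L^1$ norm in the sense that $\varphi(s) \approx s^2/2$ as $s \to 0$ and $\varphi(s) \approx |s|$ as $|s| \to \infty$, so it has similar performance to that of the $L^1$ fidelity in removing impulse noise and preserving image contrast. Figure 1 compares $\varphi$, the $L^1$ norm, the $L^2$ norm, and the Huber norm. Lastly, when $\varphi$ is used to measure the regularity as in (9), it induces the following gradient descent flow:
$$\frac{\partial u}{\partial t} = \lambda \, \mathrm{div}\!\left( \frac{\varphi'(|\nabla u|)}{|\nabla u|} \nabla u \right) - \varphi'(u - f), \tag{11}$$
where the right side is the negative gradient of the functional in (9), with the first and the second term being deduced from the regularity term and the fidelity term, respectively. The diffusion term can be decomposed as
$$\mathrm{div}\!\left( \frac{\varphi'(|\nabla u|)}{|\nabla u|} \nabla u \right) = \frac{\varphi'(|\nabla u|)}{|\nabla u|} \, u_{TT} + \varphi''(|\nabla u|) \, u_{NN}, \tag{12}$$
where $T$ and $N$ denote the tangent and normal directions to the isophote lines (lines along which the intensity is constant) and $u_{TT}$ and $u_{NN}$ denote the second derivatives of $u$ in the $T$-direction and $N$-direction. One can see that in a flat or smooth region of the image where the variations of the intensity are weak, that is, $|\nabla u| \to 0$, the coefficient of $u_{TT}$, $\varphi'(|\nabla u|)/|\nabla u|$, and the coefficient of $u_{NN}$, $\varphi''(|\nabla u|)$, satisfy
$$\lim_{s \to 0^+} \frac{\varphi'(s)}{s} = \lim_{s \to 0^+} \varphi''(s) = \varphi''(0) > 0; \tag{13}$$
then, since $u_{TT} + u_{NN} = \Delta u$, (11) becomes
$$\frac{\partial u}{\partial t} = \lambda \, \varphi''(0) \, \Delta u - \varphi'(u - f), \tag{14}$$
where $\Delta$ denotes the Laplacian differential operator. So, at these points, $u$ locally satisfies (14), in which the diffusion term is uniform and isotropic, having strong regularizing properties in all directions, and thus encourages removing noise in smooth areas. Near edges of the image, that is, $|\nabla u| \to \infty$, $\varphi'(|\nabla u|)/|\nabla u|$ and $\varphi''(|\nabla u|)$ satisfy
$$\lim_{s \to \infty} \frac{\varphi'(s)}{s} = \lim_{s \to \infty} \varphi''(s) = 0, \qquad \lim_{s \to \infty} \frac{\varphi''(s)}{\varphi'(s)/s} = 0. \tag{15}$$
This means the coefficient of $u_{TT}$ and the coefficient of $u_{NN}$ both vanish. However, $\varphi''(|\nabla u|)$ vanishes faster than $\varphi'(|\nabla u|)/|\nabla u|$; this allows the diffusion process to smooth the edge a little along the tangent direction; thus our regularity term can well preserve the edge.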
The TV regularity can be regarded as a special case of our regularity term by taking $\varphi(s) = s$; then $\varphi'(s) = 1$ and $\varphi''(s) = 0$. In smooth areas, that is, $|\nabla u| \to 0$, the coefficient of $u_{TT}$, $1/|\nabla u|$, becomes large while the coefficient of $u_{NN}$ is 0; this may be the reason why the TV regularity causes the staircasing effect in smooth areas.
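To make the role of the two diffusion coefficients concrete, the sketch below evaluates the tangential coefficient $\varphi'(s)/s$ and the normal coefficient $\varphi''(s)$ for small and large gradient magnitudes $s$, and contrasts them with the TV case. The potential used here, $\sqrt{1 + s^2} - 1$, is only a stand-in with the same two-modality behavior (quadratic near 0, linear far from 0); it is an assumption chosen for illustration and is not claimed to be the potential function proposed in this paper. All function names are hypothetical.

```python
import math


def dphi(s):
    """First derivative of the stand-in potential sqrt(1 + s^2) - 1 (an assumption)."""
    return s / math.sqrt(1.0 + s * s)


def d2phi(s):
    """Second derivative of the stand-in potential."""
    return (1.0 + s * s) ** -1.5


def tangent_coeff(s):
    """Coefficient of u_TT: phi'(s) / s, for s > 0."""
    return dphi(s) / s


def normal_coeff(s):
    """Coefficient of u_NN: phi''(s)."""
    return d2phi(s)


def tv_tangent_coeff(s):
    """TV case phi(s) = s: coefficient of u_TT is 1/s, coefficient of u_NN is 0."""
    return 1.0 / s


if __name__ == "__main__":
    for s in (1e-3, 1.0, 10.0, 100.0):
        print(f"s={s:g}  c_T={tangent_coeff(s):.6f}  c_N={normal_coeff(s):.6f}  "
              f"TV c_T={tv_tangent_coeff(s):.6f}")
```

For small $s$ both coefficients of the stand-in are close to each other (isotropic diffusion), for large $s$ the normal coefficient decays faster than the tangential one, and in the TV case the tangential coefficient blows up as $s \to 0$.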

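The minimization problem (9) can be iteratively solved by the gradient descent method. Numerically, we use the following forward finite difference scheme to discretize the gradient descent flow (11):
$$u^{n+1}_{i,j} = u^{n}_{i,j} + \Delta t \left[ \lambda D^{n}_{i,j} - \varphi'\!\left( u^{n}_{i,j} - f_{i,j} \right) \right],$$
where $\Delta t$ denotes the time step size and $D^{n}_{i,j}$ is the diffusion term, defined by
$$D^{n} = \mathrm{div}\!\left( \frac{\varphi'(|\nabla u^{n}|_\varepsilon)}{|\nabla u^{n}|_\varepsilon} \nabla u^{n} \right),$$
in which $|\nabla u^{n}|_\varepsilon$ is the gradient modulus regularized by $\varepsilon$, and $\varepsilon$ is a regularizing constant to avoid dividing by 0, which is set by , in our experiment. The spatial derivatives are discretized by central differences.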
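The explicit time stepping described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used in the experiments: the stand-in potential $\sqrt{1 + s^2} - 1$ replaces the paper's potential function, the way $\varepsilon$ enters the gradient modulus, the replicated-boundary handling, and the default values of `lam`, `dt`, `eps`, and `steps` are all hypothetical choices.

```python
import math


def dphi(s):
    """Derivative of the stand-in potential sqrt(1 + s^2) - 1 (an assumption)."""
    return s / math.sqrt(1.0 + s * s)


def _at(img, i, j):
    """Read a pixel, replicating edge values outside the image."""
    i = min(max(i, 0), len(img) - 1)
    j = min(max(j, 0), len(img[0]) - 1)
    return img[i][j]


def descent_step(u, f, lam, dt, eps=1e-3):
    """One forward-Euler step: u + dt * (lam * div(c * grad u) - phi'(u - f))."""
    rows, cols = len(u), len(u[0])
    px = [[0.0] * cols for _ in range(rows)]
    py = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Central differences for the gradient.
            ux = (_at(u, i, j + 1) - _at(u, i, j - 1)) / 2.0
            uy = (_at(u, i + 1, j) - _at(u, i - 1, j)) / 2.0
            # Regularized gradient modulus; eps keeps the division well defined.
            g = math.sqrt(ux * ux + uy * uy + eps * eps)
            c = dphi(g) / g
            px[i][j] = c * ux
            py[i][j] = c * uy
    new_u = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Central differences for the divergence of the flux (px, py).
            div = ((_at(px, i, j + 1) - _at(px, i, j - 1)) / 2.0
                   + (_at(py, i + 1, j) - _at(py, i - 1, j)) / 2.0)
            new_u[i][j] = u[i][j] + dt * (lam * div - dphi(u[i][j] - f[i][j]))
    return new_u


def denoise(f, lam=2.0, dt=0.1, steps=50):
    """Run the explicit scheme starting from the observed image f."""
    u = [row[:] for row in f]
    for _ in range(steps):
        u = descent_step(u, f, lam, dt)
    return u


if __name__ == "__main__":
    noisy = [[10.0] * 9 for _ in range(9)]
    noisy[4][4] = 110.0  # a single impulse on a flat background
    print(denoise(noisy)[4][4])
```

A constant image is a fixed point of this iteration, and an isolated impulse is pulled toward its surroundings while the fidelity term resists the change; pure-Python loops are used only to keep the sketch dependency-free.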

3. Numerical Simulation

This section is mainly devoted to numerical simulation of image denoising in the presence of impulse noise and mixed noise consisting of Gaussian, Poisson, and impulse noise. We also use our model to decompose an image into a cartoon component and a texture component. The simulations are performed using Matlab 8.5.0 (R2015a) in a Windows 7 environment on a PC with a 3.30 GHz Intel Core i5-4590 CPU and 4 GB of RAM. To assess the restoration performance quantitatively, we evaluate the peak signal to noise ratio (PSNR) defined as [22]
$$\mathrm{PSNR} = 10 \log_{10} \frac{I_{\max}^2}{\frac{1}{MN} \sum_{i,j} \left( \hat{u}_{i,j} - u_{i,j} \right)^2},$$
where $\hat{u}_{i,j}$ and $u_{i,j}$ are the pixel values of the restored image and of the original image, respectively, $M \times N$ is the image size, and $I_{\max}$ is the peak intensity. In the presence of Poisson noise, the maximum intensity of the original noise-free image is varied in order to create images with different levels of Poisson noise.
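For reference, the PSNR can be computed as in the following Python sketch. It assumes the standard definition based on the mean squared error; the function name `psnr`, the list-of-lists image representation, and the default peak value of 255 are assumptions and may differ from the exact convention of [22].

```python
import math


def psnr(restored, original, peak=255.0):
    """Peak signal to noise ratio in dB between two equally sized gray-level images."""
    count = 0
    squared_error = 0.0
    for row_r, row_o in zip(restored, original):
        for a, b in zip(row_r, row_o):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    original = [[0, 0], [0, 0]]
    restored = [[1, 1], [1, 1]]
    print(psnr(restored, original))
```

A higher PSNR indicates a restored image closer to the original; when the peak intensity is varied, as in the Poisson experiments, `peak` should be set accordingly.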

3.1. Image Denoising

We first show the effectiveness of our method in removing impulse noise, including the salt-and-pepper noise and the random-valued impulse noise. In all experiments the time step size is set by .

The regularization parameter $\lambda$ plays an important role in denoising because it balances the competition between the data fidelity and the regularization term. When $\lambda$ takes large values, the regularization term dominates the total energy, which tends to force the restored image to be smoother and cleaner. When $\lambda$ takes small values, the fidelity term dominates the total energy, which tends to force the restored image to be closer to the observed noisy image. In the following we analyze through experiments how the PSNR of the restored image depends on the value of $\lambda$. We show the results for the test images “Cameraman” () and “Lena” () with intensity values ranging from 0 to 255. In the experiment, the noisy images are produced by corrupting the test images with salt-and-pepper noise or random-valued impulse noise of levels 10%, 20%, and 30%. Figure 2 plots the PSNRs versus the values of $\lambda$ for the image “Cameraman” with salt-and-pepper noise at different levels. Figure 3 plots the same for random-valued impulse noise. From the plots, one can observe the following. First, in both cases of impulse noise, the PSNR of the three methods increases and reaches a maximum rapidly and then decreases slowly as the value of $\lambda$ increases. Moreover, the numerically optimal value of $\lambda$ (corresponding to the maximum PSNR) depends on the level of impulse noise. Lastly, for all levels of noise, the maximum PSNRs obtained by TV-$L^1$ and our method are comparable, while the maximum PSNRs obtained by TV-$L^2$ are much lower (about 2 dB less). This again indicates that TV-$L^2$ is not fit for impulse noise removal.

To study how the optimal value of $\lambda$ depends on the noise level for TV-$L^1$ and our method, we show the best values of $\lambda$ corresponding to various levels of impulse noise in Figure 4 (salt-and-pepper) and in Figure 5 (random-valued). From Figures 4 and 5, one can see that TV-$L^1$ and our method have similar patterns of the dependency of the best $\lambda$ on the noise level. In general, the higher the noise level, the larger the best value of $\lambda$. To be more specific, for both methods, the best value of $\lambda$ tends to be stable in when the noise level is above 15%. Moreover, Figures 2 and 3 show that the PSNR attenuates slowly if the value of $\lambda$ is a little larger than the optimal value. For convenience, we choose uniformly when the noise level is above 15% and when the noise level is below 15%.

In the following experiments we compare visually and quantitatively the performance of our method with TV-$L^1$, AMF, and ACWMF in removing impulse noise. Figures 6 and 7, respectively, show the results obtained by these methods for salt-and-pepper noise and random-valued impulse noise. The maximum window size used in AMF [14] is 19. The ACWMF [15] is successively performed 4 times with different parameters, which are chosen to be the same as those in [23]. For both salt-and-pepper noise and random-valued impulse noise, TV-$L^1$ and our method perform well in removing noise and preserving the edges. But our method is a little better than TV-$L^1$ in two aspects. Objectively, the PSNR of our method is about 0.2 to 0.3 dB higher than that of TV-$L^1$, and visually, there is less staircasing effect in the smooth areas of the restored images. The ACWMF and AMF are better than TV-$L^1$ and our method in preserving small scale details such as the textured ground in the image “Cameraman,” and the PSNR of the AMF on the image “Cameraman” is even higher than that of our method by 2.65 dB in the case of salt-and-pepper noise. However, the ACWMF and AMF cannot successfully detect all the impulse noise, in that some scattered peak points are visible in the restored images. Moreover, the AMF fails in suppressing the random-valued impulse noise.

As indicated in [10], the weighted sum of $L^1$ and $L^2$ fidelity is robust to any kind of commonly used noise prior, yet empirically, the $L^1$ norm absolutely dominates the fidelity. This motivates us to apply our model to remove Gaussian noise. Table 1 presents some results by TV-$L^1$, TV-$L^2$, and our method. Figure 8 compares the visual effects of these methods. One can see that TV- performs worse than the other two methods in case of higher level noise, and the restored image by TV- exhibits obvious and annoying staircase artifacts. The PSNRs and the restored images show that, even for additive Gaussian noise, where the $L^2$ fitting function is the best choice based on statistical analysis among all possible data fitting terms, our method performs better as well. The better results come from the two-modality property of our potential function.

Now we test the performance of our method in removing mixed noise consisting of additive Gaussian noise, Poisson noise, and impulse noise. We also compare our method with TV-$L^1$, TV-$L^2$, ACWMF, and AMF. The Poisson noise is generated using the “poissrnd” function in Matlab with the input image scaled to the maximum intensity (). For the impulse noise, we only consider the random-valued impulse noise, because a pixel contaminated by such an impulse noise is not as distinctive as an outlier that is contaminated by salt-and-pepper noise and consequently is more difficult to detect. We consider three levels of the random-valued impulse noise: 10%, 20%, and 30%. The standard deviation of the white Gaussian noise is 10. For all cases, impulse noise is the first to be added and Gaussian noise is the last to be added. The PSNRs of different methods are presented in Table 2 and some of the restored images are shown in Figure 9. For all levels of impulse noise, our method obtains the best PSNRs and visual effects. TV-$L^1$ performs comparably in removing impulse noise, but it does not perform as well as our method in removing mixed noise containing Gaussian noise. This may be explained by the two modalities of our potential function. The median filter based methods, especially the AMF, are well suited to salt-and-pepper noise, but they do not perform well in the case of random-valued impulse noise or mixed noise containing random-valued impulse noise. In fact, the AMF is good at detecting salt-and-pepper noise because in that case, most of the noisy pixels are much more dissimilar to regular pixels and hence are easier to detect. However, the AMF is not effective in detecting random-valued impulse noise when the noise ratio is high.

3.2. Cartoon-Texture Decomposition

In this subsection, we show the effectiveness of our method in cartoon-texture decomposition and compare it with the TV-$L^1$ method. Since the function defined in (8) is a good approximation of the $L^1$ norm, our model (9) can be used in cartoon-texture decomposition. We use model (9) to obtain the cartoon component $u$ and finally take the texture component $v = f - u$.

Figure 10 shows some results by the two methods on four test images, each of which contains smooth areas bounded by large scale edges (cartoon) and repeated small scale details (texture). The top row shows the original test images. The other rows show the decomposition results. One can observe that our method can more thoroughly separate the cartoon and the texture. In the cartoon components obtained by TV-$L^1$, some textures are left behind. The cartoon component obtained by our method only contains the main structure of the image, that is, the smoothed objects and their boundaries, and the small scale details are to a large extent separated into the texture part.

Finally we test the robustness of our method for cartoon-texture decomposition in the presence of noise. The results are shown in Figure 11. The first row shows the input images: image (a) is corrupted with salt-and-pepper noise (), Gaussian noise with standard deviation , and Poisson noise; synthetic image (b) is corrupted with random-valued impulse noise (), Gaussian noise with standard deviation , and Poisson noise. Both TV-$L^1$ and our method decompose the noise together with the texture. In the cartoon components obtained by TV-$L^1$, there exist noticeable staircase artifacts, while the cartoon components obtained by our method are visually much better.

4. Conclusions

In this work we define a new potential function and use it to measure the data fidelity as well as the regularity for image denoising and cartoon-texture decomposition. The new potential function has some attractive mathematical properties: it is strictly convex, smooth, and two-modal, which gives the proposed model several advantages over the classical TV-$L^2$ and TV-$L^1$ models. For example, it can well remove wider categories of noise including additive Gaussian noise, impulse noise, Poisson noise, and their mixture; like the TV regularity, it can well preserve important geometric structures such as image edges, but unlike the TV regularity, it does not cause the staircase effect in smooth areas; moreover, the new model admits a unique global minimizer and it can be solved by using the steepest descent method. Numerical experiments show that the proposed model outperforms TV-$L^1$ and TV-$L^2$ in removing commonly used noise. Tests on cartoon-texture decomposition show that our method is effective and performs better than TV-$L^1$.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to acknowledge the National Natural Science Foundation of China under Grants nos. 61472303, 61772389, and 61271294 and the Fundamental Research Funds for the Central Universities under Grant no. NSIY21 for supporting our research works.