Well-Posed Inhomogeneous Nonlinear Diffusion Scheme for Digital Image Denoising
We study an inhomogeneous partial differential equation which includes a separate edge detection part to control smoothing in and around possible discontinuities, under the framework of anisotropic diffusion. By incorporating edges found at multiple scales via an adaptive edge detector-based indicator function, the proposed scheme removes noise while respecting salient boundaries. We create a smooth transition region around probable edges found and reduce the diffusion rate near it by a gradient-based diffusion coefficient. In contrast to the previous anisotropic diffusion schemes, we prove the well-posedness of our scheme in the space of bounded variation. The proposed scheme is general in the sense that it can be used with any of the existing diffusion equations. Numerical simulations on noisy images show the advantages of our scheme when compared to other related schemes.
Anisotropic diffusion-based image denoising  for gray images was started by Perona and Malik  in 1990. It uses a scale parameter-based adaptive image filtering using nonlinear diffusion. Let be a noisy version of the true image with noise field of known variance : Our aim is to recover the noise-free image with edge preservation. To avoid the oversmoothing nature of the linear diffusion, Perona and Malik  introduced an edge indicator function for reducing the diffusion near edges via the following anisotropic diffusion scheme (ADS): with the initial condition . The diffusion function is chosen to be decreasing, such that . If in (1.2), then we recover the linear diffusion PDE. The diffusion functions proposed in  are where is the so-called contrast parameter. Better numerical results are obtained in  using this class of anisotropic PDEs (1.2)-(1.3) instead of the linear diffusion. When the PDE (1.2) turns into inverse diffusion, which is known to be ill-posed . It is easily seen that the role of as an edge indicator function is important in reducing the noise and enhancing true edges. A more robust indicator function based on smoothed gradient, , where the Gaussian kernel of width is given by , can be used for edge identification. This presmoothing also alleviates the instability  associated with the ADS (1.2). However, edge localization is lost and noisy oscillations can remain in this method. Moreover, when the noise level is high, ADS can reduce the contrast of edges and reliance on the instable alone can be a drawback.
There are efforts to remedy the ill-posedness and to use better edge indicators in the diffusion process. Strong  used an adaptive parameter in total variation-based PDE using the relationship between energy minimization and PDEs . Better results are obtained when compared to the classical total variation (TV) scheme of [6, 7], but the use of gradients alone and the noisy edge map provided by can lead to noise amplification along edges. Recently, Ceccarelli et al.  devised well-posed ADS schemes by approximating the TV function. Kusnezow et al.  used a similar weight function of Strong  combined with linear diffusion for fast computations. Yu et al.  considered ADS in terms of kernel-based smoothing and proposed to use a diffusion function based on modified gradients. Barbu et al.  considered the variational-PDE problem in Sobolev space setting with different growth functions. Douiri et al.  use an edge-based weight in the regularization functional for diffuse optical tomography to control smoothing across edges. The exact edge information is used to compute the weight which in real images is not possible, since we do not know, a priori, the exact edge map of the true image. Indeed, this edge estimation problem is closely related to image denoising, and most of the edge detection schemes [13, 14] are based on this observation.
Apart from staircasing artifacts in flat regions and noise amplification along edges, capturing multiscale edges are difficult in previous schemes. To circumvent the drawbacks a more stable and robust edge indicator function using the multiscale edge detectors as discussed in [13–15] can be used. Anisotropic diffusion scheme is usually employed as a preprocessing step, for noise removal and edge detection. Here, we reverse the trend and integrate the edge detection into the anisotropic scheme such as (1.2). The method can, in general, be used in any anisotropic PDE, and it provides a better coupling of edge detection and restoration of images under noise. We prove the well-posedness of the proposed scheme in the space of bounded variation . The analysis is based on monotone operators and approximation schemes. The discrete version is implemented and it satisfies the properties required for an edge preserving scale space. Preliminary numerical results were reported in .
The rest of the paper is organized as follows. In Section 2, we introduce the edge adaptive PDE scheme and in Section 3 its well-posedness is proved in the space of bounded variation functions. We show the numerical results on real and noisy images in Section 4. Finally, Section 5 concludes the paper.
2. Proposed Scheme
The diffusion function in ADS (1.2) uses the information provided by magnitude of the gradient to reduce diffusion near edges. Figure 1 shows the effect of this on test image. Original image (Figure 1(a)) is corrupted with additive Gaussian noise of variance and is shown in Figure 1(b). ADS (1.2) uses only the diffusion coefficient as in (1.3). ADS restores the image in a piecewise constant manner but reduces overall contrast and creates staircasing artifacts near edges; see Figure 4(a). The magnitude of the gradient image (Figure 1(c), where whiter pixels represent maximal value of ) shows that it is not reliable under noise, and loss of spatial coherence such as edge connectivity and continuity is high. Smoothed gradient () method of  reduces this loss but still leaves some spurious pixels due to noise (Figure 1(d)) and true locations of the edges are not preserved. This makes the result lacking clear boundaries; see Figure 4(b). Also this use of isotropic diffusion is against the very principle of ADS (1.2), which is anisotropic in nature.
2.2. Edge Adaptive ADS
Instead of using only the edge indicator function alone to drive the diffusion process, we introduce a spatially adaptive term in the anisotropic scheme (-ADS): where provides a pixelwise edge characterization. As we will see in Section 3 and in the numerical experiments (Section 4) this avoids the stability problems of the gradient magnitude-based schemes and overcomes the localization error of smoothed gradient method. The requirements for are as follows: (1)at edge pixels to reduce the diffusion: on the edges,(2)in flat regions to allow smoothing: beyond the edges,(3)near edge pixels should vary smoothly between and so as to avoid spurious oscillations.
We choose a term of the form with as a robust and smooth edge indicator function.
2.3. Choice of Edge Indicator Function
Various edge detectors [13–15] exist in digital image processing literature. These detectors are based on a principle of gradient maxima. Among the available edge detectors  such as Sobel's, Prewitt's, Laplacian of Gaussian (LOG), and Marr-Hildreth's Zero-crossing that use maxima points of gradient to decide about the edge pixels, Canny's edge detector [13, 15] has been proved to be the best in terms of spatial coherency and edge continuity. The general method of Canny involves the following steps: (1)convolution of the given image by a Gaussian ,(2)estimating the second derivative using finite differences,(3)again using convolution for with a small Gaussian kernel,(4)thresholding the gradient of Step ,(5)zero-crossings of Step displayed if threshold of Step is achieved.
Canny edge map is a binary output with edge pixels marked as and nonedge pixels with . Hence we use where smoothens the transition from detected edges to homogenous regions. Figure 2 shows a closeup of the edge map from the hat region of image given by a rectangle in Figure 1(b). Clearly the Canny edge detector-based smoothing parameter gives a good localization of edges (Figure 2(b)) and is devoid of noisy pixels. Compare this with the or the smoothed version -based edge indicators (Figures 1(c) and 1(d)) where spatial coherency is missing and edge continuity is lost. Also, in the diffusion process based on these measures, the noisy pixels found in homogenous regions are propagated resulting in staircase artifacts.
One can use any of the mentioned edge detectors into the spatially adaptive term in (2.2). We choose Canny detector because of its superiority and its computational efficiency. This trade-off between optimal performance and computation time can be decided according to the problem at hand.
3. Existence and Uniqueness
Let be a bounded domain. The proposed spatially adaptive version -ADS (2.1) can be considered as a gradient descent form of a regularization functional . The proposed scheme (2.1) is equivalent to a descent method for solving the following energy minimization problem: with . For example, for the diffusion functions in (1.3) we get the regularization functions as Since these two functions are nonconvex, the minimization problem (3.1) and ADS (1.2) do not have a unique solution. In these nonconvex regularization function cases the corresponding energy minimization problem (3.1) does not have a solution at all, except when ; see  for an analysis between the PDE (1.2) and minimization problem (3.1). To obtain a well-posed scheme we restrict ourselves to the class of linear growth functions for satisfying the following.
(H1) is a nondecreasing, convex, and even function with . (H2) There exist and such that , for all .
An important example in this class is the minimal surface function , which is related to the total variation (TV) regularization [6, 7]; see Figure 3. We assume the preliminary results about the space of functions of bounded variation ; see . Throughout this paper, we use the notation . We recall the definition of total variation of a function .
(a) PM 
(b) CL 
(c) RO 
(d) ST 
(e) CD 
(f) KH 
(g) YW 
(h) BB 
(i) Our scheme
Definition 3.1. Let ; its total variation is defined as
The total variation is in fact a Radon measure; see [1, page 50] for more details. Making use of the properties of the convex functions of measures from  we will prove that the -ADS (2.1) has a unique weak solution in the following sense.
Definition 3.2. is a weak solution of the PDE (2.1) if for every , and Moreover, it satisfies and .
Lemma 3.3. Let the regularization function satisfy the assumptions . If converges to in , then
Proof. Let be such that for all . Then where the inequality follows from Definition 3.1 of total variation of . Taking supremum over gives the result.
Note that defined in (2.2) is a smooth function, that is, due to the Gaussian presmoothing; see (2.3). The above lemma can also be proved using a general assumption about the adaptive parameter such as ; see also [19, Theorem ]. We next need the following result from nonlinear semigroup theory; see Theorem in .
Theorem 3.4. Let be Hilbert space, and let be its power set. Let be a maximal monotone operator and , where the domain is . Then there exists a unique function such that
Lemma 3.5. Consider the approximation PDE with There exists a unique solution for the approximation PDE. The solution is bounded in by the initial value , that is,
Proof. Since , existence of the sequence is guaranteed. Let . As is the derivative of a convex lower semicontinuous functional, it is a maximal monotone operator. The existence of follows from Theorem 3.4.
To get the bound for the solution we proceed as follows. Let and . Multiply the approximation PDE by the term . Integrating the PDE, we obtainThis implies that since the remaining integrals are all nonnegative. Let Then is decreasing, nonnegative, and . Thus for all . Subsequently, Letting , we get . Similarly multiplying the term with the approximation PDE and integrating, we obtain .
Lemma 3.6. The weak solution of the approximation PDE satisfied the following inequality, for : Moreover, the weak solutions and are uniformly bounded in ϵ with respect to the and norms, respectively.
Proof. The proof of the inequality follows from the identity
and for any .
From Lemma 3.5, it follows that is uniformly bounded and We have
Let be a sequence converging to . Consider the corresponding weak solutions of the approximation PDE . In the following lemma, we prove the convergence of a subsequence of weak solutions as .
Lemma 3.7. There exists a subsequence such that as dddd where is a weak solution of the PDE (2.1) with . Moreover .
Proof. By Lemma 3.6, and are uniformly bounded. When , for fixed , we have a convergent subsequence in for , and in .
Since is the weak solution of the approximation PDE, it satisfies the corresponding weak solution formulation in Definition (3.4). That is, for every and , When , we obtain, using Lemma 3.3, Thus, is a weak solution of the PDE (2.1) with .
Now, we state and prove our main result.
Theorem 3.8. If , then there exists a unique weak solution of (2.1).
Proof. From Lemma 3.7 we get as a weak solution of the PDE (2.1) with . Using Lemma 3.7 for the sequence as , we get Finally we let in the following inequality: to obtain We see that is a weak solution of (2.1). Uniqueness of the solution follows from the weak solution inequality (3.4) above.
As a consequence of Theorem 3.8, we also obtain a bound for the solution , that is, the maximum principle holds for the solution of (2.1); this guarantees that no new structures are created and -ADS (2.1) satisfies the scale space axiom. Note that though (since ) in (2.3), if we use the original ADS functions (1.3) we cannot obtain the well-posedness of the PDE (2.1), since the corresponding functions are not convex.
4. Numerical Experiments
We use finite differences to discretize the proposed edge adaptive PDE (2.1) for images normalized to be in the range . We take as the grid size, and as the intensity value at iteration . Instead of the classical explicit scheme, which severely restricts the step size, we make use of the unconditionally stable semi-implicit scheme. In 1D with matrix-vector notation it reads as where is the time step, , and with and is the discrete neighborhood of the pixel . For n-D images the semi-implicit scheme is written as The matrix corresponds to the derivatives along the th coordinate axis.
We use the Canny edge detector with its default settings for (2.3) in MATLAB7.4, and the computations are done in a desktop computer with Pentium IV, 2.45 GHz processor. It took nearly a minute for 100 iterations of our scheme, since the computation of Canny edge detector is done at each iteration. Further reduction in execution time can be achieved if we are to use other faster edge detectors. The numerical example, given in Figure 4, shows the results obtained for the noisy image given in Figure 1(b). We show results of applying our scheme (2.1) with regularized TV function , in Figure 4(i) and also compare with the following schemes: (I)classical schemes: ADS (1.2) of Perona and Malik  (Figure 4(a)), Catté et al.'s smoothed gradient scheme  (Figure 4(b)), and Rudin et al.'s total variation scheme  (Figure 4(c)),(II) adaptive schemes: Strong's adaptive total variation scheme  (Figure 4(d)), Ceccarelli et al.'s approximate TV scheme  (Figure 4(e)), Kusnezow et al.'s adaptive linear diffusion scheme  (Figure 4(f)), Yu et al.'s kernel-based ADS  (Figure 4(g)), and Barbu et al.'s variational PDE scheme  (Figure 4(h)).
We use the peak signal-to-noise ratio (PSNR) for comparing our scheme with other schemes. The parameters are tuned to get the best possible PSNR value for each of the scheme compared. Figure 5(a) shows the effect of the same against different noise levels () for our scheme. For an image , the PSNR value of an estimated image is given by We see that our scheme produces a stable PSNR value as the noise level increases and attains highest values among other related schemes as well. To show the strong smoothing along flat regions and edge preservation, Figure 5(b) shows a signal line taken across image from Figure 4(i), whose position is indicated in Figure 1(a) by a line. As can be seen our scheme can remove small scale texture details as the edge detector in (2.2) is not sufficient to capture them. Incorporation of textural measures into points at further improvements in this direction.
A class of well-posed inhomogeneous diffusion schemes for image denoising is studied in this paper. By integrating multiscale edge detectors into the divergence term of anisotropic diffusion PDE we obtain edge preserving restoration of noisy images. Unlike the classical anisotropic diffusion schemes, which use only gradients to find edge pixels, we have proposed to include an adaptive parameter computed from Canny edge detector. Using an approximation scheme and the theory of monotone operators, well-posedness of the proposed inhomogeneous PDE is proved in the space of functions of bounded variation. Numerical examples illustrate that the scheme performs well under noise. Comparison with related schemes is undertaken to show that the expected improvement can really be achieved. Similar analysis for higher-order diffusion schemes and comparing various edge detectors for performance evaluation provide future directions.
The authors thank the anonymous reviewers for their comments which improved the content and presentation of the paper and the handling editor Professor Malgorzata Peszynska for her patience during the review period.
G. Aubert and P. Kornprobst, Mathematical Problems in Image Processing: Partial Differential Equations and the Calculus of Variations, vol. 147 of Applied Mathematical Sciences, Springer, New York, NY, USA, 2nd edition, 2006.View at: MathSciNet
Y.-L. You, W. Xu, A. Tannenbaum, and M. Kaveh, “Behavioral analysis of anisotropic diffusion in image processing,” IEEE Transactions on Image Processing, vol. 5, no. 11, pp. 1539–1553, 1996.View at: Google Scholar
D. Strong, Adaptive total variation minimizing image restoration, Ph.D. thesis, UCLA Mathematics Department, Los Angeles, Calif, USA, 1997, ftp://ftp.math.ucla.edu/pub/camreport/cam97-38.ps.gz.
L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D, vol. 60, no. 1–4, pp. 259–268, 1992.View at: Google Scholar
W. Kusnezow, W. Horn, and R. P. Würtz, “Fast image processing with constraints by solving linear PDEs,” Electronic Letters in Computer Vision and Image Analysis, vol. 6, no. 2, pp. 22–35, 2007.View at: Google Scholar
R. C. Gonzelez and R. E. Woods, Digital Image Processing, Pearson, Upper Saddle River, NJ, USA, 2002.
J. F. Canny, “A computational approach to edge detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 6, pp. 679–698, 1986.View at: Google Scholar
E. Giusti, Minimal Surfaces and Functions of Bounded Variation, vol. 80 of Monographs in Mathematics, Birkhäuser, Basel, Switzerland, 1984.View at: MathSciNet
V. B. S. Prasath and A. Singh, “Edge detectors based anisotropic diffusion for enhancement of digital images,” in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), pp. 33–38, IEEE Computer Society Press, Bhubaneswar, India, December 2008.View at: Publisher Site | Google Scholar
L. C. Evans and R. F. Gariepy, Measure Theory and Fine Properties of Functions, Studies in Advanced Mathematics, CRC Press, Boca Raton, Fla, USA, 1992.View at: MathSciNet
H. Brézis, Opérateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Espaces de Hilbert, North-Holland Mathematics Studies, no. 5, North-Holland, Amsterdam, The Netherlands, 1973.View at: MathSciNet