Abstract

This paper presents a method for automatically determining the neighborhoods of image pixels in adaptive denoising. The neighborhood is named the stationary neighborhood (SN). In this method, the noisy image is considered as an observation of a nonlinear time series (NTS), and image denoising must recover the true state of the NTS from the observation. First, the false neighbors (FNs) in a neighborhood of each pixel are removed according to the context. After removing the FNs, we obtain an SN, in which the NTS is stationary and the real state can be estimated using the theory of stationary time series (STS). Since each SN of an image pixel consists of elements with similar context and nearby locations, the proposed method not only adaptively finds neighbors and determines the size of the SN according to the characteristics of a pixel, but also denoises while effectively preserving edges. Finally, to show the advantages of this algorithm, we compare it with existing universal denoising algorithms.

1. Introduction

Image denoising is a very important image preprocessing step. During acquisition, images are more or less affected by noise. Noise reduces image quality, which influences the subsequent processing steps. Considerable effort has long been devoted to recovering the real hidden image from a noisy image.

In 1949, Wiener proposed Wiener filtering using the theory of stationary random processes [1]. In theory, Wiener’s filter achieves the minimum mean-square error (MMSE) among linear filters. However, Wiener filtering is only applicable to stationary time series, which causes edges to be blurred in denoising.

The most effective way to address these problems is adaptive denoising [2-5], which assumes that the image gray levels are piecewise constant or piecewise continuous. However, near singular points, such as edges and textures, the assumption of piecewise continuity or constancy does not hold. As a result, edges and textures are oversmoothed.

An improved form of adaptive denoising is bilateral filtering [6-11]. Bilateral filtering integrates range filtering (gray level) and domain filtering (space), which preserves edges while denoising. However, under noise, the real gray levels are seriously corrupted, so they cannot be correctly estimated from the noisy gray levels. In addition, two window parameters, the variances of the range filter's kernel and the spatial filter's kernel, must be selected by experience. Once they are fixed, they cannot be changed.

Some researchers suggested that the context can be used to distinguish singular points from smooth points [12, 13]. Essentially, the context defined on the local gray level energy is a classifier for image pixels. Smoothing can then be done only among similar points, which maintains singularities in denoising. However, because the context is defined over the whole image, it lacks spatial adaptation. Studies in [14-17] propose different methods to improve it.

Some algorithms combine space and context [14-16]. In these methods, a fixed-size sliding window is first chosen by experience. Then the true value of each pixel is estimated from the points in the window with similar context. These methods perform better than the context alone. Their challenge is that a fixed-size sliding window can be too small to obtain a reliable estimate for singular points.

Another well-known method is the nonlocal denoising algorithm proposed recently [17]. The nonlocal approach determines similarities using a large and a small window together. The small window is used to determine the nature of the local gray energy, while the large window is used to look for similarities. Because the search neighborhood is large, the nonlocal approach overcomes the weakness of most spatial methods, namely unreliable estimates near singular points, and so maintains borders and textures more effectively.

We think that nonlocal is the same as context in that both find similar points using local gray level energy. However, the context defined throughout the image lacks spatial adaptation, while nonlocal searches for similarities in a large window and thus has better spatial adaptability. On smooth regions, however, nonlocal also lacks spatial adaptation. Besides this, the nonlocal method has high computational complexity, and the sizes of its two windows are also fixed and chosen entirely by experience.

As can be seen from the above discussion, the neighborhood sizes of existing adaptive image denoising algorithms are selected by experience. Moreover, these neighborhoods can no longer be changed after being selected, which makes edge-preserving image denoising a very difficult tradeoff problem. That is, denoising needs a large neighborhood to eliminate noise, while maintaining edges requires a small neighborhood to keep singularities. A fixed-size neighborhood cannot satisfy these two requirements simultaneously. Nonlocal can overcome the above shortcomings by taking advantage of its large and small windows, but the coexistence of two windows greatly increases the computational complexity. Besides this, it lacks spatial adaptation on smooth regions. In this paper, we propose a method to adaptively determine the neighborhood of image pixels in denoising using the theory of NTS analysis.

Recently, with the development of the theory of time series analysis, NTS analysis has become a focus of the field [18-31]. Two well-known NTSs are fractal time series and chaotic time series. However, to date, the field of image denoising has made almost no use of these NTSs. The reason is that most researchers believe that image noise is random rather than chaotic [29, 30].

In this paper, we are concerned with how to convert an NTS to a set of STSs. The method first removes the false neighbors dynamically using context to obtain an SN. In an SN, since all of the pixels have similar local gray energy, the time series can be considered stationary. Besides this, the SN is composed of pixels with close spatial locations, which guarantees its spatial adaptability.

The motivation for the SN is that the observation (the noisy image) of an NTS is sampled data from an underlying high-dimensional manifold. Some of these sampling points, which are not neighbors in the manifold, have projections that become neighbors in the one-dimensional projection space. These neighbors are called FNs. To obtain the SN of an image pixel, we must first remove these FNs. The removal gradually increases the embedding dimension, which opens up the folding, wrapping, and twisting orbit. Using this method, FNs can be found and removed [18, 24]. The original neighborhood without FNs becomes an SN. Thus, the real state of an NTS can be estimated from the noisy observation on SNs using the theory of STS.

Note that different image pixels have different SNs and that SNs are sometimes irregular, for example, near edges. In addition, the proposed method also maintains the nonlinearity of the NTS, which coincides with the nature of manifolds. That is, the local structures are simple while the global structure is very complex [32-34].

The neighborhoods are determined by the proposed method fully automatically and give reliable estimates, and their variable sizes and irregular shapes allow them to maintain image edges. The method thus addresses the existing challenges in adaptive image denoising.

Section 2 of this article discusses the NTS and SN, and Section 3 describes the denoising algorithm presented in this paper. Section 4 presents the experimental results and discussion. Section 5 gives conclusions, followed by the acknowledgments.

2. Nonlinear Time Series and Stationary Neighborhood

In this section, we mainly introduce how to find SNs for the pixels of a noisy image. Thus, the NTS will be converted into the STS in SNs.

2.1. Definitions

From the geometric point of view, an NTS is a projection from a high-dimensional phase space to a one-dimensional space. In this projection, some points that are not adjacent in the phase space become neighbors in the one-dimensional projection space; these are called FNs. The most direct way to remove FNs is to increase the embedding dimension of the phase space. As the embedding dimension increases, the folding, wrapping, and twisting orbit gradually opens up. Therefore, FNs can easily be removed from the original neighborhood, and an SN, which is a neighborhood without FNs, is obtained. On this SN, the true state can be restored according to the theory of STS. In order to explain our approach better, the definitions of related terms are given as follows.

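Definition 2.1. A vector of phase space for a time series $\{x(t)\}$ is an $m$-dimensional vector in the phase space, composed of different time delays of the time series:

$$X(t) = \big(x(t),\, x(t + \tau),\, \ldots,\, x(t + (m - 1)\tau)\big),$$

where $\tau$ is the time delay and $m$ is called the embedding dimension of the phase space.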

The method for deleting FNs starts from the smallest embedding dimension $m$ and gradually increases $m$. As $m$ increases, the wrapping and folding orbit of the nonlinear movement gradually opens up. When $m$ reaches a value at which the number of FNs no longer increases, the correct embedding dimension is found.
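The effect of increasing the embedding dimension can be illustrated with a minimal Python sketch. The helper names `delay_embed` and `chebyshev`, the toy series, and the choice of the maximum-coordinate distance are assumptions made for illustration only; they are not taken from the paper.

```python
def delay_embed(series, m, tau=1):
    """Return the m-dimensional delay vectors of a 1-D series.

    Vector i is (series[i], series[i + tau], ..., series[i + (m - 1) * tau]).
    """
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    span = (m - 1) * tau
    return [tuple(series[i + k * tau] for k in range(m))
            for i in range(len(series) - span)]


def chebyshev(u, v):
    """Largest coordinate-wise difference between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


# Toy series: samples 0 and 2 have the same value, so they look like
# neighbors in one dimension, but their successors differ (5 versus 9).
series = [1, 5, 1, 9, 1, 5]
for m in (1, 2):
    vectors = delay_embed(series, m)
    print(m, chebyshev(vectors[0], vectors[2]))
```

With `m = 1` the two samples coincide, while with `m = 2` they are separated by a distance of 4, so the second sample would be flagged as a false neighbor of the first under any threshold below 4.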

Definition 2.2. A manifold is a topological space that is locally Euclidean (i.e., around every point, there is a neighborhood that is topologically the same as the open unit ball in $\mathbb{R}^n$).

A manifold resembles Euclidean space near each point, while its global structure may be very complicated. This nature of a manifold, locally simple and globally complex, coincides with our method. That is, the globally complex NTS can be partitioned into local STSs.

Definition 2.3. The neighborhood $N(p)$ of a pixel $p$ is a collection of pixels. The elements $q$ of this collection satisfy $d(p, q) \le r$, where $d(\cdot, \cdot)$ is the distance and $r$ is a predefined constant. The elements of $N(p)$ are called neighbors of $p$.

Definition 2.4. If two points $X_i$ and $X_j$ in the phase space of an NTS are not neighbors, but their projections $x_i$ and $x_j$ are neighbors on the one-dimensional orbit, then $x_j$ is called a false neighbor (FN) of $x_i$.

Note that, in our method, the size of the SN is determined by the number of FNs.

Definition 2.5. A time series $\{x_t\}$ is stationary if, for all $k$, the joint probability distribution of $(x_t, x_{t+1}, \ldots, x_{t+k})$ is independent of the time index $t$. More specifically, if the expectation and variance of a time series are constant and its correlation coefficients are functions only of the time interval, independent of the time origin, then the time series is called weakly stationary.

Definition 2.6. One neighborhood determined by the method presented in this paper is called a stationary neighborhood.

Theorem 2.7. Pixels in a SN form an STS.

2.2. Some Remarks

In this subsection, we will give some remarks on our method.

Remark 2.8 (The Selection of Time Delay $\tau$). In this paper, the time delay $\tau$ is set to 1. The reason is that most adjacent points in an image are very similar, an assumption that is usually adopted in adaptive denoising. Here, we also follow this assumption.

Remark 2.9 (Context and Embedding Dimension). Firstly, the definition of context is given.

Definition 2.10. The context for an image pixel is defined as a fixed-length vector formed as a multidimensional function of observations.

The context commonly used in image processing can be defined as the $m$-dimensional vector in the phase space of the NTS, where $m$ is the embedding dimension. Since the context has been studied deeply in image coding, it can be used directly to construct the SNs. For example, if we follow a specific definition of context, the embedding dimension can be determined immediately. From the above discussion, we know the following.

Theorem 2.11. An $m$-dimensional vector in phase space is a special form of context.

Remark 2.12 (Size of Neighborhood). It should be explained that the neighborhood size and embedding dimension are two different concepts. Generally, neighborhood is a more global concept than the embedding dimension.

The neighborhood size should satisfy two basic criteria simultaneously. That is, it must be big enough to give reliable estimates and small enough to give spatial adaptation in singularity detection. As discussed in the previous section, a fixed-size neighborhood cannot satisfy these two requirements simultaneously.

The method proposed in this paper meets these two requirements simultaneously by building different neighborhoods for different image points. In order to ensure the reliability of estimates, the least number of pixels should be given first. Here, we select 48. In other words, if a neighborhood still has at least 48 pixels in it after the FNs are deleted, it is a valid SN.

The reason why a least number of neighbors must be given is that a neighborhood on a smooth region differs from one near singular points. In a smooth region, a small neighborhood is enough, while near a singular point few true neighbors remain in the neighborhood, which cannot meet the reliability requirement. Thus, near singular points, the size of the neighborhood must be increased in order to increase the number of neighbors. This is also the reason why the nonlocal method has two different windows.

2.3. Determining Stationary Neighborhood

First, we must determine three parameters: the time delay $\tau$, the embedding dimension $m$, and the size of the neighborhood. We know that $\tau = 1$, that $m$ can be determined by the context, and that the neighborhood size is determined automatically by the least number of neighbors, 48. In this way, we can find an SN for a pixel in accordance with the following steps.

Step 1 (initialization). Give the context and set the time delay $\tau = 1$, the least number of neighbors (48), the threshold of context, and the initial size of the neighborhood.

Step 2 (finding FNs in a neighborhood of a pixel). An FN is a pixel in the neighborhood whose context differs from the context of the center pixel by more than the threshold. Find all FNs in the neighborhood. Then record the indices and the number of FNs.

Step 3. If the number of pixels remaining after excluding the FNs is less than the least number of neighbors, then enlarge the neighborhood and repeat Steps 2-3; otherwise, delete the FNs from the neighborhood; the SN is the remainder of the neighborhood.
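The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the context is taken to be the 3x3 patch of gray levels around a pixel, the context distance is Euclidean, the initial window is 7x7, the window grows by one pixel on each side per iteration, and `max_half` is a safeguard against endless growth. None of these choices is confirmed by the text, and all function and parameter names are hypothetical.

```python
import math


def context(img, r, c):
    """Hypothetical context: the 3x3 patch of gray levels around (r, c).

    Coordinates outside the image are clamped to the nearest border pixel.
    """
    rows, cols = len(img), len(img[0])
    return [img[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def context_distance(a, b):
    """Euclidean distance between two context vectors (an assumed metric)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def find_stationary_neighborhood(img, r, c, threshold,
                                 min_neighbors=48, half=3, max_half=10):
    """Return (neighbors, half) for pixel (r, c).

    `neighbors` lists the coordinates left after false neighbors are removed
    (the center pixel itself is excluded); `half` is the final half-width.
    """
    rows, cols = len(img), len(img[0])
    center = context(img, r, c)
    while True:
        window = [(i, j)
                  for i in range(max(0, r - half), min(rows, r + half + 1))
                  for j in range(max(0, c - half), min(cols, c + half + 1))
                  if (i, j) != (r, c)]
        # Step 2: a false neighbor has a context too far from the center's.
        kept = [p for p in window
                if context_distance(context(img, *p), center) <= threshold]
        # Step 3: accept if enough neighbors remain, otherwise enlarge.
        if len(kept) >= min_neighbors or half >= max_half:
            return kept, half
        half += 1
```

On a flat image the initial 7x7 window already supplies 48 neighbors, so it is returned unchanged; next to an edge the window grows until enough pixels with similar context have been collected, which yields an irregular neighborhood lying on one side of the edge.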

3. The New Framework

In this section, we will discuss the theory of image denoising [35]. And then the framework will be presented.

3.1. Image Denoising

Image denoising studies how to recover an original image $X$ from a noisy observation $Y$, assuming that the noise is $N$. $X$, $Y$, and $N$ are matrices that represent random fields. The relation of their observations can be represented by the following formula:

$$y = x + n, \tag{3.1}$$

where $x$, $y$, and $n$ are realizations of the random fields $X$, $Y$, and $N$, respectively.

The original image $x$ can be estimated under the minimum mean squared error (MMSE) criterion; that is,

$$\hat{x} = E[X \mid Y = y], \tag{3.2}$$

where $\hat{x}$ represents the estimated value of $x$. An uppercase letter represents a random field or a random variable, while a lowercase letter represents one realization of the random field or variable.

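Thus, the estimate of one pixel of $x$ is conditioned on the observation $y$ and optimized by MMSE. If $N$ is Gaussian white noise (GWN) with zero mean and variance $\sigma_n^2$, the optimal estimate of $x$ is

$$\hat{x} = \mu_x + \frac{\sigma_x^2}{\sigma_x^2 + \sigma_n^2}\,(y - \mu_x), \tag{3.3}$$

where $\sigma_x^2$ is the variance of the original image and $\mu_x$ is the mean of $x$.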

Here, two parameters should be estimated in denoising: $\sigma_x^2$ and $\mu_x$. However, we have only one observation for each pixel. Thus we have to share data in a neighborhood, which assumes that all the data in this neighborhood are independent and identically distributed (iid).
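As a quick numerical illustration, assume that the estimate takes the standard linear MMSE form $\hat{x} = \mu_x + \frac{\sigma_x^2}{\sigma_x^2 + \sigma_n^2}(y - \mu_x)$. The values below are hypothetical and chosen only to make the arithmetic easy to follow.

```latex
\begin{align*}
y &= 120, \quad \mu_x = 100, \quad \sigma_x^2 = 300, \quad \sigma_n^2 = 100, \\
\hat{x} &= 100 + \frac{300}{300 + 100}\,(120 - 100)
         = 100 + 0.75 \times 20
         = 115 .
\end{align*}
```

In a flat neighborhood, where $\sigma_x^2$ approaches 0, the estimate tends to the local mean $\mu_x$; when the noise variance $\sigma_n^2$ approaches 0, it tends to the observation $y$.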

3.2. The New Framework

In this subsection, we will give the new framework of our method.

Step 1. For each image pixel, determine an SN (see Section 2.3).

Step 2. Compute and in the SN using the assumption of iid. And is .

Step 3. Estimate the true gray level $\hat{x}$ of the pixel using (3.3).

Step 4. For all image pixels, repeat Steps 1-3.
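Steps 2 and 3 for a single pixel can be sketched as below, given the gray levels already collected in its SN. The sketch assumes that the noise variance is known, that the local mean is the sample mean over the SN, and that the signal variance is estimated as the sample variance minus the noise variance, clipped at zero. This variance estimator is a common choice and an assumption here, not something stated in the text; the function name `estimate_from_sn` is hypothetical.

```python
def estimate_from_sn(y_center, sn_values, noise_var):
    """Estimate the clean gray level of one pixel from its SN.

    y_center  : noisy gray level of the pixel being denoised
    sn_values : noisy gray levels of the pixels in its SN
    noise_var : variance of the zero-mean noise (assumed known)
    """
    n = len(sn_values)
    if n == 0:
        raise ValueError("the stationary neighborhood must not be empty")
    # Step 2: local statistics under the iid assumption.
    mu = sum(sn_values) / n
    var_y = sum((v - mu) ** 2 for v in sn_values) / n
    var_x = max(var_y - noise_var, 0.0)  # assumed estimator, clipped at zero
    # Step 3: linear MMSE shrinkage of the observation toward the local mean.
    if var_x + noise_var == 0:
        return mu
    gain = var_x / (var_x + noise_var)
    return mu + gain * (y_center - mu)
```

For example, with SN values `[90, 110, 90, 110]`, a noise variance of 25, and a center observation of 120, the local mean is 100, the estimated signal variance is 75, and the gain is 0.75, so the estimate is 115. Step 4 would call this function once per pixel with that pixel's own SN.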

4. Experiments and Discussion

In this section, we compare our method with Wiener’s filter, the bilateral filter, context, and nonlocal. Note that these five filters are universal denoising filters. In order to compare them on the same benchmark, the same context is used in our method, nonlocal, and context. The programs are implemented in Matlab by the same designer.

Firstly, we will give some brief comments on these five filters. And then some experimental results will be shown. Finally, we will give discussion.

Wiener’s filter was proposed in 1949 by Wiener [1]. It uses the properties of STS and the frequency properties to filter noise from the signal. The Wiener experiments are carried out with the function “wiener2” in Matlab. It blurs edges and texture while denoising. One example is shown in Figure 1. The image denoised using Wiener’s filter is seriously blurred, especially the mouth of Lena and the decoration on Lena's hat.

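Bilateral Filter. The formula for the bilateral filter is

$$\hat{I}(p) = \frac{1}{W_p} \sum_{q \in N(p)} w_d(p, q)\, w_r(p, q)\, I(q), \tag{4.1}$$

where $p$ and $q$ are two pixels, $N(p)$ is the neighborhood of $p$, and $I(p)$ and $I(q)$ are the gray levels of $p$ and $q$, respectively. $W_p$ is a normalizing constant for the two weights and is defined as

$$W_p = \sum_{q \in N(p)} w_d(p, q)\, w_r(p, q), \tag{4.2}$$

where $w_d(p, q)$ and $w_r(p, q)$ are measures of the spatial and range closeness between the center pixel $p$ and its neighbor $q$, respectively. Usually, these two measures are defined as two Gaussian kernel functions:

$$w_d(p, q) = \exp\left(-\frac{\|p - q\|^2}{2\sigma_d^2}\right), \qquad w_r(p, q) = \exp\left(-\frac{(I(p) - I(q))^2}{2\sigma_r^2}\right). \tag{4.3}$$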
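A bilateral filter with Gaussian spatial and range kernels can be sketched in plain Python as follows. This is a minimal illustration of the standard form of the filter, not the Matlab program used in the experiments; the function name, the square window of half-width `half`, and the parameter names `sigma_d` and `sigma_r` are assumptions.

```python
import math


def bilateral_filter(img, half, sigma_d, sigma_r):
    """Bilateral filtering of a gray-level image given as a list of rows.

    half    : half-width of the square window (window side is 2 * half + 1)
    sigma_d : standard deviation of the spatial (domain) Gaussian kernel
    sigma_r : standard deviation of the gray-level (range) Gaussian kernel
    """
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            num = 0.0
            den = 0.0
            for i in range(max(0, r - half), min(rows, r + half + 1)):
                for j in range(max(0, c - half), min(cols, c + half + 1)):
                    # Spatial closeness between the center and its neighbor.
                    w_d = math.exp(-((i - r) ** 2 + (j - c) ** 2)
                                   / (2.0 * sigma_d ** 2))
                    # Gray-level closeness between the center and its neighbor.
                    w_r = math.exp(-((img[i][j] - img[r][c]) ** 2)
                                   / (2.0 * sigma_r ** 2))
                    num += w_d * w_r * img[i][j]
                    den += w_d * w_r
            # den is at least 1 because the center pixel has weight 1.
            out[r][c] = num / den
    return out
```

Both kernel widths are fixed for the whole image, and the range weight is computed from the noisy gray levels, so strong noise can distort the weights.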

The bilateral filter integrates the domain filter and the range filter. It defines a spatial neighborhood using the variance of the domain filter, $\sigma_d^2$. The range filter is used for selecting the points whose gray levels are similar to that of the center pixel. However, in denoising, the real gray level is hidden in the noisy data. Therefore, the range filter cannot work well. Besides this, the two neighborhood sizes of the bilateral filter are fixed once the two variances $\sigma_d^2$ and $\sigma_r^2$ are defined, where $\sigma_r^2$ is the variance of the range filter. The bilateral filter program is written in Matlab. Figure 2 gives one image and the image after bilateral filtering.

Context. The context of an image pixel is defined as a fixed-length vector formed as a function of observations. In order to ensure that the comparison is on the same benchmark, the context in our method, nonlocal, and context is defined as where is the gray level of pixel .

Using the context defined by (4.4), the image pixels can be classified into several groups according to their local energy. In this paper, we use a threshold parameter (see Section 2.3) to control the difference within each group. Figure 3 gives denoising results of context for different threshold values. Although context gives better denoising results than Wiener’s filter and the bilateral filter, it lacks spatial adaptivity. The context denoising program is also written in Matlab.

Note that designing a more complex context or using tools such as wavelets or the FFT would undoubtedly give better denoising results. However, this is beyond the scope of this paper.

Nonlocal is a well-known and effective denoising algorithm. Its most important mechanism is the simultaneous use of two windows, which at least improves the estimate near singular points. However, on smooth regions, it lacks spatial adaptation, which leads to oversmoothing of these regions.

The nonlocal program is written in Matlab, with a small window used for the context and a large window used for the search. The points in the large window with similar context are used for estimating the real gray level of the center point. Figure 4 shows denoised images for different threshold values (see Section 2.3). Figure 4(d) is clearly oversmoothed on smooth regions but still preserves edges well.

Stationary Neighborhood. This method finds a stationary neighborhood for each image pixel. The neighborhood has at least 48 pixels after the FNs are deleted. Since the two requirements for designing a neighborhood are satisfied both on smooth regions and near singular points, the method performs well on both types of regions. That is, in theory, the method proposed in this paper has the same performance as nonlocal near singular points and better performance on smooth regions. Figure 5 gives the denoised results of SN.

In order to compare the performance of the above five filters, we test some images provided with Matlab and some images from image databases on the internet. These images include lena.jpg and coins.png (in Matlab). For coins, nonlocal and SN outperform the other three methods in both denoising and edge preservation; see Figure 6. Nonlocal also gives visual results very similar to those of SN. This is surprising, since we expected SN to perform clearly better than nonlocal.

After analysis, we believe the reason is that the coins image is too simple to reveal the difference between nonlocal and SN. The most important difference between these two methods should be their different neighborhoods on smooth regions. Thus Lena, a well-known denoising test image with large smooth regions, is used as a test image for comparing nonlocal and SN. In theory, SN has the same performance as nonlocal near singular points and better performance on smooth regions.

From Figure 7, we can see that SN performs better than nonlocal on smooth regions. That is, SN preserves many more gray levels and details in smooth regions, especially at the upper borderline of the hat, where SN preserves the borderline but nonlocal loses it. In addition, SN gives better visual quality.

Besides these, SN also has relatively low computation complexity. The computation of nonlocal is [17]. Since most of image pixels (about ) are smooth pixels [13], SN reduces the computation complexity greatly. That is, the smooth points only need a neighborhood for denoising. Thus the computation complexity is about where 441 is an estimate mean for singular regions according to nonlocal. Its computation complexity is about of nonlocal.

5. Conclusions

In this paper, we propose a new method to determine a neighborhood, named SN, for each image pixel in adaptive image denoising. The motivation for finding the SN is the idea that an NTS can be converted to STSs in some overlapping neighborhoods. An SN is a neighborhood whose false neighbors have been deleted and which has at least 48 neighbors. The SN satisfies the two requirements for designing a neighborhood on both smooth regions and singular regions. It also performs well on both types of regions at a fraction of the computational complexity of nonlocal.

Acknowledgments

This paper is supported by the National Natural Science Foundation of China (nos. 60873102, 60573125, 60973157, and 60873264), National Key Basic Research Program Project of China (no. 2010CB732501), and Open Foundation of Key Laboratory of Land Resources Evaluation and Monitoring of Southwest, Ministry of Education, (no. KLEM2009001).