Abstract

This paper proposes a novel approach for structure-sensitive image processing based on the rigorous mathematical derivation of data-specific anisotropic Mexican hat wavelets (DAM). Our DAM is derived from the negative first-order derivative of the fundamental solution of heat diffusion equation with respect to time, which not only shares similar properties with Mexican hat wavelet but also intrinsically embeds the image-specific properties. Through the scale-aware DAM transform and its inverse transform, we are capable of conducting structure-sensitive image processing. Our key idea is to represent the images as undirected graphs, whose edge weights are governed by the normalized intensity/color differences within the local neighboring pixel window. Based on the rigorous theory of global graph Laplacian and heat diffusion, our original DAM can also encode the local/global structure of images. We employ the Krylov subspace technique to reduce the computational cost of our DAM transform. Furthermore, aiming at various structure-preserving image processing applications such as filtering, detail enhancement, tone manipulation, and stylization, we conduct comprehensive experiments and make quantitative comparisons with other state-of-the-art methods, which demonstrate the versatility and superiority of our method.

1. Introduction

As an attractive property for encoding intrinsic information, structure preserving plays an important role in image processing, which also has been a very intensive research topic over the past two decades. There is a large amount of literature addressing this important topic. Currently, the commonly used ways of image processing can be categorized into linear or nonlinear filters, which can preserve structures. The conventional linear filters such as Gaussian smoothing cannot preserve the local and global structure effectively and oftentimes smooth the image excessively, which results in artifacts for edges easily. In contrast, the nonlinear filters enable us to well preserve the structures in an image while performing the smoothing task. This property gives rise to a great deal of attention from many researchers and results in many excellent prior works, such as anisotropic diffusion [1, 2], weighted least squares [3], the bilateral filter [4], and wavelets [5]. Other sophisticated methods have also been proposed, such as neighborhood filters [6] and edge-preserving optimization [7, 8]. Among these, the anisotropic diffusion (AD) and wavelets appear to be the best with sound theoretical foundation. The AD theory is employed to smooth images with directionally selective diffusion that preserves the structure of images. And the wavelet transforms are used to transform the image processing problem into a frequency space where it can be better solved. This motivates us to explore the data-specific anisotropic Mexican hat wavelets (DAM) in this paper. DAM defined in this paper is relevant to the data of underlying images that will be processed. That is to say, if the data changes, the accompanying wavelet will also be constructed differently. The most significant characteristic of DAM is data-specific, which sets the main difference of our current research in this paper on wavelets from other types of wavelets. We construct the DAM by means of global spectrum decomposition and incorporate the intrinsic image information into DAM. Our DAM is applied to various tasks of image processing (shown in Figure 1). Particularly, it exhibits good structure-preserving behavior in the field of image processing and synthesis.

In this paper, we design a novel approach for effectively preserving structure while performing image processing. Our algorithm is based on data-specific anisotropic Mexican hat wavelets (DAM), derived from the negative first-order derivative of the fundamental solution to heat diffusion with respect to time, which shares similar properties with Mexican hat wavelet (MHW). In addition, based on the rigorous theory of global graph Laplacian and heat diffusion, the local/global structures can be elegantly encoded into the definition of DAM. The global spectrum decomposition enables us to construct DAM exactly with sufficient image structure information. It may be noted that the local information alone may lose the insight of weak structure information and hence make the algorithm less accurate. Therefore, the scale-aware DAM transform and its inverse transform are also defined in this paper. It can naturally facilitate the construction of scale space. They are all applied to a variety of structure-preserving image processing. The construction of Laplacian uses the weighted undirected graph from the image pixel lattice, which plays a fundamental role in the anisotropic diffusion process. Here, the edge weight of the graph characterizes the similarity between pixels in an intrinsic way. The preciseness of the edge weight (similarity) dictates how faithfully the algorithm preserves the structure of the image. The edge weights are governed by the weighted distances between local neighboring windows in this paper. Our DAM, which features the properties of zero mean, Gaussian decay, and convergence, can be deployed into a variety of practical applications of image processing. In order to reduce the computational cost of the wavelet transform, we have also adopted the Krylov subspace technique [9], which is an iterative method for sparse matrix problems.

We demonstrate the effectiveness of our structure-sensitive algorithm on several image processing tasks in the context of image filtering, detail enhancement, tone manipulation, and stylization (Section 4). The contributions of this paper can be summarized as follows:(i)A novel data-specific anisotropic Mexican hat wavelets related to specific data and being constructed by local/global structures. It not only has similar properties to Mexican hat wavelet but also intrinsically embeds the image-specific properties. It is derived from the fundamental solution to heat diffusion (Section 3.1).(ii)A technique to perform anisotropy by recasting the image pixel lattice to the weighted undirected graph (Section 3.2). It enables us to establish a correlation between data-specific anisotropic Mexican hat wavelets and the image to process. DAM is structure-sensitive.(iii)A technique to effectively implement wavelet transform and inverse transform (Section 3.3). The scale-aware DAM wavelet transform is presented as inner products between DAM and the images. And the discrete inverse transform is to reconstruct the images. The Krylov subspace technique is used to accelerate the wavelet transform (Section 3.4).(iv)An experimental system that demonstrates our algorithm can be used to accomplish a variety of effects for gray-scale and color images (Section 4).

A number of techniques for structure-sensitive filter have been recently explored. Most of the existing structure-preserving techniques are able to produce well visual results in many practical applications. In this section, we briefly review the previous work of image processing most related to ours.

In computer vision and image processing, structure-preserving filtering attracts many research efforts. The bilateral filter (BLF), as a kind of nonlinear filters, is an effective structure-preserving filter. Despite its appealing characteristics, it still has some limitations. For example, Fattal et al. [10] improved the bilateral filter by multiscale and iterative decomposition and restoration of image. To better handle the small structures, Farbman et al. [3] computed a multiscale edge-preserving decomposition with a weighted least squares optimization scheme (WLS). Paris and Durand [11] extended the fast bilateral filter and employed trilinear interpolation and division to recast the computations as a higher-dimensional linear convolution. Meanwhile, the bilateral grid proposed by Chen et al. [12] accelerates edge-aware image processing and achieves a better visual effect. Paris and Samuel [8] introduced local Laplacian filters to decompose images into multiple scales, as well as certain wavelet transforms, which only rely on pointwise nonlinearities and renew the viewpoint of the Gaussian kernel.

Another powerful approach for structure-preserving filtering is PDE-based anisotropic diffusion (AD) [1, 2], of which the original Perona and Malik model lacks effectiveness in operation due to hard-to-control parameters and tends to overly sharpen edges and lose the small feature. As a nonlinear iterative process, its convergence rate is also very slow. Most recently, Eduardo and Gastal [4] presented a domain transform approach to conduct edge-preserving filtering of images and videos by measuring the geodesic distance between pixels and achieve better performance in many practical applications. However, this method is not rotationally invariant.

Meanwhile, wavelet transforms also relate to our work, which has been widely used in multiscale image decomposition. In fact, nonlinear wavelet diffusion (NWD) models were proposed for robust image recovery by Feng [13], of which the diffusion coefficients are derived from wavelet coefficients at one or multiple scales and thus can loyally reflect the singularity of images. Particularly, Dorini et al. [14] presented a brief survey of wavelets and scale-space filtering and showed the basic definitions and some possible applications of these approaches in image processing; please refer to [14] for more comprehensive reviews. Recently, Fattal et al. [5] proposed an edge-avoiding wavelet basis (EAW). EAW takes edges factors into its definition, whose image-specific characteristic is similar to our method.

3. Data-Specific Anisotropic Mexican Hat Wavelets (DAM)

Hou and Qin [15] proposed a novel Mexican hat wavelet (MHW) on the 2D manifold mesh by combining the conventional Mexican hat wavelet and heat diffusion. Inspired by it, we are exploring data-specific anisotropic Mexican hat wavelets (DAM) for structure-sensitive image processing. In this section, we will theoretically define the image-specific anisotropic wavelets based on heat diffusion theory and detail them in four aspects: (a) heat diffusion-based wavelets, (b) graph Laplacian-based DAM, (c) DAM transform and inverse transform, and (d) numerical calculation.

3.1. Heat Diffusion-Based Wavelets

For spatial signals, the well-known Mexican hat wavelet is derived from the negative normalized second-order derivative of the Gaussian function [15]

It closely relates to the partial differential equation of heat diffusionwhere is the Laplace operator. In Euclidean space, the Gaussian function is the analytic solution to heat diffusion equation. From this perspective, the Mexican hat wavelet can also be defined as the negative first-order derivative with respect to time:

The generic solution of (2) is known as heat kernel . With the help of eigendecomposition of the global Laplacian matrix on manifold, can be analytically defined aswhere and are, respectively, the -th eigenvalue and its accompanying eigenfunction. The spectrum of the Laplacian matrix consists of an increasing nonnegative eigenvalue sequence , whose corresponding eigenfunctions form an orthonormal basis. Hence, we can define the heat diffusion wavelet using the negative first-order derivative of the heat kernel with respect to time as

And the dilation and scaling of the wavelet can be intrinsically obtained by way of time-relevant heat diffusion, of which parameter is related to “frequency,” which can naturally facilitate the construction of scale space. Small corresponds to high frequencies, while large one relates to low frequencies.

Particularly, Figure 2 illustrates the multiscale-based heat diffusion wavelet of a certain pixel in an image. It oscillates and attenuates, which is very similar to the Mexican hat wavelet.

Directly inherited from heat kernel [16] by negative first-order derivative, the heat diffusion-based wavelet is symmetric, multiscale, and stable. Besides, DAM also has many individual remarkable characteristics, such as zero mean, Gaussian decay, convergent, and informative. The rigorous mathematical derivation of zero mean, Gaussian decay, and convergence is as follows:Zero Mean. The wavelet has zero mean for given ; that is, . This property is directly derived from the heat kernel, since satisfies the conditions for and for all for given . Zero mean indicates that DAM will vanish at zero frequency during Fourier transform, which is an important condition for the admissible condition of wavelets.Gaussian Decay. Since heat kernel has Gaussian decay with respect to and DAM is defined as the negative first-order derivative of the fundamental solution of heat diffusion equation, DAM also has Gaussian decay. The property of exponential decay means that, for the small , heat kernel is mainly determined by a few neighborhoods of . The Gaussian decay implies that DAM is mainly determined by a local supporting area through layered and blocked formulation, and the size of the local supporting area rapidly extends as increases. The properties of zero mean and Gaussian decay guarantee that DAM is oscillated and attenuated and can be accurately approximated by a high-precision local supporting area.Convergence. For the large , the DAM will converge to zero for all , , since when is large, heat kernel has a stable state as , where is the smallest nonzero eigenvalue and is the associated eigenvector.

3.2. Graph Laplacian-Based DAM

Indeed, a gray-scale image can be represented as a 2D manifold embedded in 3D Euclidean space (5D space for color images). We employ edge-weighted undirected graph to map an image to a 2D manifold, whose node set corresponds to the pixels of the original image. For an edge , denotes its edge weight, which can be calculated using the intensity/color differences between corresponding adjacent pixel lattices.

At each node, heat diffuses along its joint edges as time goes by. The edge weight serves as the thermal conductivity, which controls the velocity of the heat flow and plays an important role in DAM definition. The larger the edge weight is, the easier the heat is transmitted, and vice versa. An edge with zero weight means that heat will never flow along it.

To effectively transform the local image structure changes to corresponding edge weights, there has been a large amount of work on the choice of edge weight since it can significantly affect the extent to which discontinuities are preserved and unimportant information will be eliminated. For instance, edge weight is computed by way of pixel window in [1719], which enables high robustness and effectiveness to image noise. Here, we adopt a similar function. First, we encode the intensities/colors of the image as a column vector in sequential row/column raster order, and then we compute the edge weight bywhere denotes the neighborhood of pixel , denotes the intensities within the window centered at pixel , and the intensities within window are organized as a vector . Hence, we can measure the structure distance using (6), while (7) is used to remove additive white noise in advance. Generally speaking, a larger neighborhood window can better preserve image structure. Since the exponential function decreases rapidly, it will lead to significant differences in the weights of intraregion and interregion edges. Therefore, the parameter can be used to adjust the extent of edge-preserving. To make the parameter accommodate the images with different contrast, we normalize the distances to span the interval . We show the denoising performance with and without the use of neighboring windows in Figure 3. Figures 3(c) and 3(f) better preserve the image structure features than that in Figures 3(b) and 3(e), which states that the use of neighboring window can achieve more robust results. And we also use the peak signal-to-noise ratio () to quantitatively measure the image smoothing quality. Table 1 lists the PSNR of the resulting images in Figure 3. Since the large PSNR is expected, the neighboring window method can obviously achieve more robust results.

Besides, we compute the weighted adjacency matrix for graph and construct the Laplacian matrix from and the diagonal matrix by . The Laplacian matrix encodes the local structure of the image. It is a sparse matrix of size , which can be defined as

The spectral decomposition of the Laplacian matrix is , where is the diagonal matrix with ordered eigenvalues () as diagonal elements and is the matrix with corresponding eigenvectors as matrix columns. Since $L$ is symmetric and positive semidefinite, the eigenvalues of the Laplacian matrix are all nonnegative, and the eigenvector associated with the smallest nonzero eigenvalue is referred to as the Fiedler vector.

3.3. DAM Transform and Inverse Transform

Given an image organized as a column vector , its wavelet transform can be defined aswhere are in the space domain and denotes the frequency domain. is called the wavelet coefficient of the image. The wavelet transform can produce several detail levels () and a residual level at different , which can facilitate forming scale space with encoded frequencies.

Assume we have an increasing discrete sampled time sequence . For the simplicity of formulation, the discrete image-specific wavelets can be defined aswhere is the time step. By adding equation (10) to equation (9), we get

The discrete inverse transform is used to restore the image from the wavelet coefficients aswhere is the residual level at , named as coarse base level. Figure 4 illustrates the flowchart of DAM transform () and its inverse transform (). And Figure 5 shows different residual levels of a CT image obtained from inverse transform corresponding to multiple time scales, which can be seen as the smoothed versions of the input image. The detail levels, computed from the residual level, can be used for detail manipulation and reconstruction.

3.4. Numerical Calculation

A direct way to compute image-specific heat diffusion wavelet transform with equations (9) and (10) is to first compute the heat kernel at different times and then perform the inner product with the vectorization-organized image . The heat kernel definition of the graph is exactly similar to that of Riemannian manifolds [20]. To interpret the heat kernel in the discrete setting, we can rewrite the heat equation associated with the Laplacian operator aswhich encapsulates information concerning the average path length distribution on the graph and has a generic solution as

The initial condition of heat kernel is , which is the identity matrix. Particularly, when , .

In practice, the scale of image pixels is huge, so it cannot be afforded to directly calculate the heat kernel through complete eigenspectrum of the Laplacian matrix. To overcome this problem, we employ the Krylov subspace projection technique [9, 19] as an accelerated scheme, which allows us to efficiently compute the matrix exponential in a way like . Its underlying principal is to approximate withwhere is typically small compared to the order of (usually , while the order of the principal matrix can exceed several thousand). Thus, the approximate formula iswhere is the first unit basis vector. and are, respectively, the orthonormal basis of the Krylov subspace , and the upper Hessenberg matrix comes from the well-known Arnoldi process [21]. Thus, the initial large but sparse problem is reduced to a much smaller but dense problem. In our locally connected graphs, since the Laplacian matrix is symmetric, positive-definite, and very sparse with a few nonzero elements in each row, the aforementioned Arnoldi process can be replaced by the Lanczos process to further decrease the computational complexity. Obviously, (11) is the main computational bottleneck; we invoke the MATLAB subroutines from the Exploit package [22] to solve it. A flowchart of our algorithm is shown in Figure 6. Given the input image, we first organize it as an edge-weighted undirected graph and construct the data-specific DAM. Then we can get the detail and residual levels by wavelet transform.

4. Results and Discussion

In this section, we verify the proposed wavelet with kinds of practical image processing applications. In our experiments, the size of neighboring windows is , and all images are organized with eight-neighborhood graphs. Edge-preserving smoothing and detail enhancement applications are addressed first and then followed by tone manipulation and stylization. We mainly conduct qualitative or quantitative evaluation by comparing the produced results with several state-of-the-art methods, including anisotropic diffusion (AD) [1], weighted least squares filter (WLS) [3], edge-avoiding wavelets [5], and domain transform filter (DT) [4].

4.1. Image Smoothing

In Figure 7, we qualitatively show a comparison of structure-sensitive smoothing applied to the input image (a). Noting the white spots on the lamp holder, our technique can preserve their structure while smoothing the small structures on the wall (Figures 7(b) and 7(c)). Although other methods may also preserve the small structure of the white spots when they select very small scale parameters for their kernel functions, they will preserve the small structures on the wall as well; otherwise, they smooth all those structures when selecting large-scale parameters. Besides, these methods tend to be oversensitive to noise and are hard to process the weak structures (such as the wall smoothing), for example, the results produced by the domain transform method (Figures 8(b) and 8(d)). However, even for a relatively larger time scale in our method, these small spots are in fact regarded as strong structure comparing those on the wall due to sharp color differences (Figure 7(c)) because the data-specific structure-sensitive measurement is embedded in DAM. It demonstrates that our technique can well preserve semantic sharp structure while correctly smoothing other flat regions. In contrast, the bilateral filter (e) may incorrectly mix colors for large amounts of smoothing; for instance, the BF result has changed the color of white spots to light yellow (Figure 8(e)). As for the result from NC (Figure 7(f)), the white spots are drastically smoothed. Figure 7(h) shows the result of EAW, and it presents the worse capability of strong structure preserving. While the worst results can be observed from the WLS result (shown in (d)) and EAW (shown in (h)), the white spots have been removed completely. Table 2 summaries the PSNR comparison, the PSNR indicator of our method is high against other methods, even for large-scale smoothing, which is a desired characteristic in image processing.

4.2. Detail Enhancement

Our wavelet transforms can decompose image into several frequency bands, which can be manipulated independently and reconstructed to produce various enhanced effects. Let be the detail levels of image . The enhanced image can be obtained bywhere is the sigmoid function [3], which is used to avoid hard clipping when the detail bands are significantly enhanced.

In fact, detail exaggeration mainly depends on the structure-sensitive capability of the filter at multiscale decomposition processing. Since our DAM is image-specific and encodes the global and local structures of the original image, most of the structural features can be retained in the residual layer when conducting DAM transforms; thus, detail enhancement can be easily achieved through in frequency space.

Figure 9 shows an example of level detail exaggeration of the flower image (Figure 9(a)). The first row shows the enhancement effects, and the second row shows the enlarged effects for the regions determined by the black box in original image. Our result in (b) is created by manipulating using (17) with . Figure 9(c) shows the result produced by the EAW method [5]. The result of domain transform filter [4] is presented in (d). Figure 9(e) shows the result produced by the WLS method. Overall, the results in (c), (d), and (e) present similar visual quality. However, our result (b) shows more texture detail against other algorithms since our multiscale wavelet transform-based algorithm not only enhances the strong structures but also effectively reserves the weak structural features, which can be clearly observed from the locally enlarged subfigures. More particular results of our method are shown in Figure 10, especially for medical image.

4.3. Tone Manipulation

Our method can also be easily used to conduct detail-preserving compression and structure-preserving tone manipulation of HDR images, which should avoid the mild halo artifacts during compression. In tone manipulation, we only adjust the intensity channel while keeping the color unchanged. Here, denotes luminance channel of image, and color ratios () are equal to . We compute the 3-level wavelets transform of the log-luminance channel and scale them. Then, we reconstruct a new log-luminance channel, which must be remapped to the displayable dynamic range. Finally, we multiply the new log-luminance channel with the color ratios () to recover the output RGB channels.

Figure 11 shows three tone manipulation results, respectively, obtained from our method (a), the domain transform method (b), and the WLS method (c). Figures 11(b) and 11(c) show similar quality, while our method can strongly compress the residual level, boast the detail levels, and thus produce a rather smooth image with exaggerated local contrasts. As shown in Figure 11(a), although our whole effect is lightly darker than the other algorithms, it performs better in the aspect of structure awareness. For example, the close-up subfigures emphasize the local contrast ratio and structure-sensitive differences between the three methods; the whole effects are very similar; however, our algorithm shows the structural features more clearly and can effectively stretch the image local contrast ratio, such as the outlines of the frescoes on the dome. It can be deduced from these results that our technique has the characteristics of structural awareness and intrinsic-structure preserving. Table 3 summaries the comparison of the contrast ratio and PSNR. High-contrast ratio as well as high PSNR is a desired aspect in HDR image tone manipulation. Our method obtains a higher contrast ratio than other methods, while the PSNR of three techniques is almost equal. It demonstrates that our algorithm can effectively exaggerate local contrast and ensure the lower distorted rate of image.

4.4. Stylization

Stylization aims to produce cartoon-like digital imagery, which emphasizes smooth low-contrast regions while preserving high-contrast structure features. As discussed in Sections 4.1 and 4.2, the structure-sensitive property makes our algorithm to be more suitable for this task (Figure 12(c)) since the quality of boundary extraction closely relates to the accuracy of edge weights. Our DAM measures the local structure similarity governed by the normalized intensity/color differences; the greater the local neighboring window similarity is, the higher the edge weight is. Thus, the weak structure can be effectively protected due to the low-contrast characteristics of local region. Figure 12 illustrates its stylization results and shows the comparison with that from the domain transform method (Figure 12(f)). Obviously, our method can produce a more detailed edges effect. Figure 12(c) shows that our method can produce a much more coherent abstraction effect.

4.5. Time Consumption

Here, we analyze the time cost of our method. Table 4 documents the time cost statistics of our experiments that are conducted on a computer with 12 GB RAM and Intel Xeon E5630 CPU, GHz. The dominated time cost of our method is mainly expended on the Laplacian construction and wavelet transform operation, which is linear to the number of image pixels. Benefiting from the Krylov subspace projection technique, the cost of wavelet transform can be drastically reduced.

5. Conclusions

In this paper, we have devised a novel theoretic approach for data-specific anisotropic wavelet definition and its application in edge-preserving image processing. The proposed wavelets are derived from the negative first-order derivative of the fundamental solution to heat diffusion with respect to time. Given weighted undirected graph representation of specific image, heat diffusion in image is depicted by the heat equation, which is governed by the global Laplacian matrix. Then the wavelets are constructed locally by further employing the Laplacian eigensystem for the computation of heat kernels and their derivatives. We process images by diffusion-relevant multiscale wavelet transforms. On the computational level, the numerical implementation of the algorithm can be fast accomplished using the Krylov subspace projection technique. We have also demonstrated the effectiveness of our method by qualitatively and quantitatively comparing the experimental results with several state-of-the-art techniques.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported in part by the National Natural Science Foundation of China (no. 61702185), National Key Research and Development Project (nos. 2017YFC0403600 and 2017YFC0403604), Key Scientific and Research Project in University of Henan Province (18A520034), Henan Province Science and Technology Research Program (172102210050), the Open Research Foundation of Key Laboratory of Sediments in Chinese Ministry of Water Resources (2017001), and Science and Technology of the Henan Province (no. 222102210125). The authors thank Shuai Li for the fruitful and insightful discussion and Aimin Hao and Hong Qin for feedback on the paper.