Abstract

Shadows in pavement images reduce the accuracy of road crack recognition and increase the false detection rate. A shadow separation algorithm based on morphological component analysis (MCA) is proposed herein to solve the shadow problem in road imaging. The main assumption of MCA is that the geometric structure and texture components of an image are each sparse under a specific basis or overcomplete dictionary, while the bases or overcomplete dictionaries that sparsely represent the different morphological components are mutually incoherent. The image signal is transformed according to each dictionary to obtain the sparse representation coefficients of each component, and the coefficients are shrunk by soft thresholding to obtain new coefficients. Experimental results show the effectiveness of the proposed shadow separation method.

1. Introduction

With the development of image and signal processing technology, representing a signal or image by its components, such as subcomponents, principal components, independent components, sparse components, and morphological components, has become a research focus in many signal and image processing tasks, including reconstruction, noise suppression, compression, and feature extraction. Starck et al. proposed a separation method based on sparse signal representation, namely, morphological component analysis (MCA) [1, 2]. This method assumes that, for each source signal in the mixed signal, there is a corresponding dictionary that can sparsely represent that source signal but cannot sparsely represent the other source signals; searching for the sparsest representation with a pursuit algorithm then produces an ideal separation. MCA has been used for signal separation in several fields: to separate first-order and second-order cyclostationary signals [3], to enhance textural differences based on wavelet texture features as a preprocessing step for image segmentation [4], to decompose oscillatory plus transient signals [5], to decompose interference hyperspectral images [6], to evaluate retinal images by double-layer adaptive shape morphological analysis [7], and to separate different types of noise in seismic image processing [8–11]. These applications show that MCA is effective for signal separation.

Although shadows are easy for the human visual system to recognize, the task is difficult for computers. Selecting effective shadow features is the key step in shadow separation, and the quality of the features directly affects detection performance. Luo et al., Gomes et al., Qi et al., and Gao et al. [12–15] proposed shadow separation methods. The shadows in road images mainly come from trees, cameras, and vehicles. In many cases, shadows complicate the follow-up processing and greatly increase the error rate of target recognition and classification. However, because of the particular characteristics of road shadows, these shadow removal methods [12–15] cannot be directly applied to road surface images.

The contributions of this work are as follows:
(1) An algorithm for shadow separation based on sparse decomposition is proposed. It can adaptively approximate the shadow and the image background in the road image.
(2) As an image preprocessing step, this work provides a high-quality input image that meets the requirements of high-level image understanding tasks such as image segmentation and target recognition.

2. Road Image Shadow Separation

2.1. Problem Description

In signal and image processing, the observation $X$ is generally considered to be a mixture of different independent source signals $S$, and the simplest mixture model is linear and instantaneous. The model is expressed as

$$X = AS + E.$$

Here, $X$ is the observation signal, $S$ is the source signal, $E$ is the noise, and $A$ is the mixing matrix. The observation signal is known, but the mixing matrix and the source signal are unknown. The problem is to invert the mixing process so as to separate the different source signals.
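The linear instantaneous mixture described above can be illustrated with a minimal sketch. The source count, sample count, sparsity level, noise scale, and the variable names `A`, `S`, and `E` are illustrative assumptions rather than values from this work.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_obs, n_samples = 2, 3, 200

# Sparse sources: about 90% of the samples are set exactly to zero (assumed level).
S = rng.standard_normal((n_sources, n_samples))
S[rng.random((n_sources, n_samples)) > 0.1] = 0.0

# Hypothetical mixing matrix and small additive Gaussian noise.
A = rng.standard_normal((n_obs, n_sources))
E = 0.01 * rng.standard_normal((n_obs, n_samples))

# Linear instantaneous mixture: every observation is a weighted sum of the sources.
X = A @ S + E

# If A were known, least squares would give an estimate of S directly.
# In the separation problem both A and S are unknown, which is why
# additional structure such as sparsity is needed.
S_hat, *_ = np.linalg.lstsq(A, X, rcond=None)
```

The least-squares step is shown only for contrast: it uses the mixing matrix, which is not available in the blind setting considered here.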

When the source signals are highly sparse, each source signal is nonzero (or large) at only a few moments and zero (or close to zero) most of the time [16]. Under the independence assumption, the probability that two source signals are active at the same time is then very low, so the source signals can be represented by different basis functions. Sparse component analysis therefore first transforms the data into a sparse representation, which greatly improves the quality of separation.

2.2. MCA and Dictionary Selection

The MCA method has attracted increasing attention in image processing because its decomposition is close to human visual perception. The core of the method is to find two suitable dictionaries, one for the sparse representation of the smooth part (smooth dictionary) and the other for the sparse representation of the texture part (texture dictionary), each of which provides a sparse representation of one specific type of content. MCA can be regarded as a combination of the basis pursuit (BP) and matching pursuit (MP) algorithms.

Suppose the input image is $X$, where $X_t$ is the texture part of the image and $X_n$ is the smooth part. MCA assumes that $X$ contains $n^2$ pixels and that the image is represented as a one-dimensional vector of length $n^2$ by line scanning.

The MCA framework contains sparsity measures for the texture and structure components. A noisy image is composed of a structure component, a texture component, and a noise component. The dictionaries $T_t$ (texture) and $T_n$ (smooth) play key roles in the MCA algorithm. In many image processing applications, the texture part of the image must be separated from the piecewise smooth part, and common transforms that represent the texture or the smooth part well can be chosen as dictionaries [17]. In this work, the curvelet transform is found to be suitable for the smooth part of the road image, and the local discrete cosine transform is suitable for the texture part.

The curvelet transform (CT) [18–21] can sparsely represent the edges of an image. Its basic idea is that, when a curve is divided finely enough, each small segment can be approximately regarded as a straight segment, and the straight segments are then analyzed by the ridgelet transform [22, 23]. As shown in Figure 1, the frequency plane of the curvelet is divided into blocks, and the shaded part represents a wedge window, which is the support area of the curvelet.

2.3. Road Image Morphology Decomposition and Dictionary Implementation

The road image can be regarded as approximately a linear combination of a smooth gray-level layer of the normal road surface and a layer that contains shadow, noise, or small local texture. The latter layer strongly interferes with the automatic detection of road cracks and imperfections. Therefore, MCA is introduced into the processing of the road image. By selecting subdictionaries that can distinguish the two image layers, the smooth gray-level layer needed for detection is extracted, and the shadow is thereby separated.

In this work, the road image with shadow is represented as $X$, and then

$$X = X_n + X_t + E,$$

which includes the smooth part $X_n$, the texture part $X_t$, and the noise $E$. CT and the local discrete cosine transform (LDCT) are selected as the dictionaries that represent the smooth image layer and the image layer containing shadow and background noise, respectively. In this way, the shadow, which easily interferes with crack and imperfection detection and with road extraction, can be separated from the road image. Road crack detection and other processing can then be carried out on the smooth image layer, which contains the main gray-level distribution of the image.

The corresponding subimages are denoted $X_n$ and $X_t$. According to the MCA algorithm, the shadow problem in this work can be expressed as

$$\{X_n^{\mathrm{opt}}, X_t^{\mathrm{opt}}\} = \arg\min_{X_n,\, X_t} \; \|T_n^{+} X_n\|_1 + \|T_t^{+} X_t\|_1 + \lambda \|X - X_n - X_t\|_2^2 + \gamma\, \mathrm{TV}(X_n),$$

where $T_n^{+}$ and $T_t^{+}$ are the pseudo-inverses (analysis transforms) of the smooth and texture dictionaries $T_n$ and $T_t$, $\lambda$ is the Lagrange multiplier, $\gamma$ weights the total variation term, and the $\ell_2$ norm is chosen to measure the residual. Because the $\ell_2$ norm is closely related to zero-mean Gaussian white noise, we assume that the noise component in a general road image follows a Gaussian distribution. The optimal subimages $X_n$ and $X_t$ of the MCA decomposition can be calculated by block coordinate relaxation [1]. Since the total variation (TV) regularization prior is considered a good prior model for cartoon images, it is introduced to constrain the smooth component: $\mathrm{TV}(X_n)$ is the regularization prior of the smooth component, and $\|X - X_n - X_t\|_2^2$ is the data fidelity term.

The dictionary implementation for the smooth part and texture part of the road image is as follows.

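The implementation of CT includes the following steps:
(1) Apply the 2D Fourier transform to the image $X$ to obtain $\hat{X}[n_1, n_2]$, $-n/2 \le n_1, n_2 < n/2$, where $n$ is the side length of the square image.
(2) For each scale $j$ and angle $l$, form the product $\tilde{U}_{j,l}[n_1, n_2]\,\hat{X}[n_1, n_2]$, where $\tilde{U}_{j,l}$ is the wedge-shaped frequency window.
(3) Wrap the product around the origin to obtain $\tilde{X}_{j,l}[n_1, n_2] = W(\tilde{U}_{j,l}\hat{X})[n_1, n_2]$.
(4) Apply the inverse 2D Fourier transform to each $\tilde{X}_{j,l}$ to obtain the curvelet coefficients $c(j, l, k)$.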

The DCT coefficients can represent the texture direction of the original image: the direction of the spectrum distribution in the DCT frequency domain is perpendicular to the texture direction of the original image. The DCT coefficients therefore describe texture roughness, direction, and other features well. When the DCT is used as the dictionary for the texture image, the requirement of sparse representation is met because the high-frequency part of the transformed coefficients is close to zero and the nonzero part is concentrated at low frequencies. A sparse representation of the texture image can thus be obtained with the DCT.

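For the road image $X$ of size $n \times n$, with pixel values $X(x, y)$, the DCT is

$$F(u, v) = \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} X(x, y)\, g(x, y, u, v),$$

where $u = 0, 1, \ldots, n-1$ and $v = 0, 1, \ldots, n-1$. The inverse transform is

$$X(x, y) = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} F(u, v)\, g(x, y, u, v),$$

where $x = 0, 1, \ldots, n-1$ and $y = 0, 1, \ldots, n-1$, and the transform kernel is

$$g(x, y, u, v) = \frac{2}{n}\, c(u)\, c(v) \cos\frac{(2x+1)u\pi}{2n} \cos\frac{(2y+1)v\pi}{2n},$$

where

$$c(k) = \begin{cases} 1/\sqrt{2}, & k = 0, \\ 1, & k = 1, \ldots, n-1. \end{cases}$$

The discrete cosine transform can be expressed in matrix form as $F = C X C^{T}$, where the cosine transform matrix $C$ satisfies

$$C C^{T} = I.$$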

The coefficients in the discrete transformation matrix can be calculated, and the DCT coefficient matrix after transformation is also an $n \times n$ matrix. After the DCT, the energy of the image signal is relatively concentrated in the frequency domain: most of the image information is concentrated in the low-frequency part, while image edges and details are mainly located in the high-frequency part. High-frequency components correspond to fine textures that change rapidly in the image, and low-frequency components correspond to coarse textures that change slowly; therefore, the shadow can be sparsely approximated by the local discrete cosine transform.
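The matrix form of the DCT and its energy compaction can be checked numerically. The sketch below builds an orthonormal DCT-II matrix, applies it to one small block, and measures how much energy falls into the lowest frequencies. The block size of 8, the synthetic ramp image, and the function name `dct_matrix` are illustrative assumptions; the block only stands in for a slowly varying patch and is not data from this work.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II matrix C, so that F = C @ X @ C.T."""
    u = np.arange(n).reshape(-1, 1)  # frequency index, one per row
    x = np.arange(n).reshape(1, -1)  # spatial index, one per column
    C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    C[0, :] /= np.sqrt(2.0)          # scaling of the zero-frequency row
    return C


n = 8
C = dct_matrix(n)

# Slowly varying block: a constant gray level plus a gentle ramp (assumed example).
yy, xx = np.mgrid[0:n, 0:n]
X = 100.0 + 2.0 * xx + 1.0 * yy

F = C @ X @ C.T       # forward 2D DCT in matrix form
X_back = C.T @ F @ C  # inverse 2D DCT

# Share of the total energy held by the 2 x 2 lowest-frequency coefficients.
low_freq_energy = float(np.sum(F[:2, :2] ** 2) / np.sum(F ** 2))
```

For a local DCT, the same matrix would be applied block by block; a smooth, slowly changing region then needs only a few low-frequency coefficients per block.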

2.4. Optimization Algorithm for Shadow Separation

Using the overcomplete dictionaries for the smooth and texture components of the road image constructed as described in Section 2.3, the road image is separated by MCA as follows:
(1) Initial input: the original road image $X$; initialize the smooth component $X_n$ and the texture component $X_t$.
(2) Initialize the maximum coefficient $L_{\max}$, the number of iterations per layer $N$, and the threshold $\delta = \lambda \cdot L_{\max}$.
(3) If $\delta > \lambda$, repeat the following steps; otherwise, terminate and output the smooth part, shadow part, and noise of the road image.
(4) Carry out the following $N$ iterations:

Part 1: with $X_t$ fixed, update $X_n$:
(1) Calculate the residual: $R = X - X_t - X_n$.
(2) Apply the curvelet transform to $X_n + R$ to obtain the curvelet coefficients $\alpha_n$:

$$\alpha_n = T_n^{+}(X_n + R),$$

where $T_n^{+}$ is the Moore–Penrose pseudo-inverse of $T_n$. The curvelet coefficients $\alpha_n$ are shrunk by soft thresholding with threshold $\delta$ to obtain $\hat{\alpha}_n$, namely,

$$\hat{\alpha}_n = \operatorname{sign}(\alpha_n)\,\max(|\alpha_n| - \delta,\, 0).$$

(3) Reconstruct $X_n$ by $X_n = T_n \hat{\alpha}_n$.

Part 2: with $X_n$ fixed, update $X_t$:
(1) Calculate the residual: $R = X - X_t - X_n$.
(2) Apply the LDCT to $X_t + R$ to obtain the coefficients $\alpha_t$:

$$\alpha_t = T_t^{+}(X_t + R).$$

The coefficients $\alpha_t$ are shrunk by soft thresholding with threshold $\delta$ to obtain $\hat{\alpha}_t$.
(3) Reconstruct $X_t$ by $X_t = T_t \hat{\alpha}_t$.

(5) Modify the threshold $\delta$ and return to step (3).
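The alternating update above can be tried on a one-dimensional toy problem. The sketch below is not the implementation used in this work: it swaps the curvelet and LDCT dictionaries for two simpler orthonormal ones (the identity for isolated spikes and a global DCT for oscillations), starts both components at zero, and lowers the threshold linearly. All of these choices, and the names `soft_threshold`, `dct_matrix`, and `mca_separate`, are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(alpha, delta):
    """Shrink coefficients toward zero by delta; small ones become exactly zero."""
    return np.sign(alpha) * np.maximum(np.abs(alpha) - delta, 0.0)


def dct_matrix(n):
    """Orthonormal DCT-II analysis matrix (each row is one cosine atom)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C


def mca_separate(x, analysis_a, synthesis_a, analysis_b, synthesis_b,
                 n_iter=100, delta_min=1e-3):
    """Alternate soft-thresholded updates of two components of x."""
    part_a = np.zeros_like(x)
    part_b = np.zeros_like(x)
    # Start from the largest coefficient magnitude over both dictionaries.
    delta_max = max(np.max(np.abs(analysis_a(x))),
                    np.max(np.abs(analysis_b(x))))
    # Assumed schedule: the threshold decreases linearly to delta_min.
    for delta in np.linspace(delta_max, delta_min, n_iter):
        # Part 1: keep part_b fixed and update part_a.
        residual = x - part_a - part_b
        part_a = synthesis_a(soft_threshold(analysis_a(part_a + residual), delta))
        # Part 2: keep part_a fixed and update part_b.
        residual = x - part_a - part_b
        part_b = synthesis_b(soft_threshold(analysis_b(part_b + residual), delta))
    return part_a, part_b


n = 128
C = dct_matrix(n)

# Component sparse in the identity dictionary: two isolated spikes.
spikes = np.zeros(n)
spikes[20], spikes[77] = 8.0, -5.0

# Component sparse in the DCT dictionary: two cosine atoms.
coeffs = np.zeros(n)
coeffs[5], coeffs[12] = 10.0, -6.0
oscillation = C.T @ coeffs

x = spikes + oscillation

spikes_hat, oscillation_hat = mca_separate(
    x,
    analysis_a=lambda v: v, synthesis_a=lambda a: a,
    analysis_b=lambda v: C @ v, synthesis_b=lambda a: C.T @ a,
)
```

The separation works here because each component is sparse in its own dictionary and the two dictionaries are mutually incoherent, which is the same assumption that MCA makes for the smooth and texture layers of an image.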

3. Experimental Results and Analysis

3.1. Experiment Process

According to the above analysis, we propose a road shadow separation algorithm based on MCA, and the specific flowchart is shown in Figure 2.

3.2. Result Analysis

The proposed algorithm is tested on a road image database. The results are depicted as follows.

3.2.1. Comparison of Shadow Separation Results Obtained by Different Numbers of Iteration

Figure 3 shows a group of shadow separation results obtained with different numbers of iterations; the separation effect varies with the number of iterations. Figure 3(a) shows the original road images with different shadows, and Figures 3(b)–3(f) show the shadow separation results for the original road images of Figure 3(a) when N (the number of iterations) is 10, 20, 30, 50, and 100, respectively. The four groups of images in Figures 3(b)–3(f) correspond to road images one to four, respectively. Each group contains the three parts of the original image after shadow separation: “part 1” is the shadow part, “part 2” is the part after shadow separation, and “residuals” is the residual part.

Figure 3 includes road images with large shadows, vehicle shadows, and tree shadows. Comparing “part 1” and “part 2” in each panel of Figure 3 shows that, with 10 to 20 iterations, the shadow in “part 1” of Figures 3(b) and 3(c) is relatively complete, but the shadow remaining in “part 2” is still relatively obvious. With 30 to 50 iterations, the shadow in “part 1” of Figures 3(d) and 3(e) is complete, and the shadow removal in “part 2” is relatively complete. With 100 iterations, the shadow in “part 1” of Figure 3(f) is complete, and “part 2” is essentially free of shadow. In addition, the residuals in Figures 3(b) and 3(c) contain obvious shadow edges and cracks.

According to Figure 3, when the number of iterations is less than or equal to 20, the separation effect is not ideal; when the number of iterations is 30–50, the separation effect is good; and when the number of iterations is 100, the separation effect is ideal. However, it is also noted that the larger the number of iterations, the longer the computation time. Therefore, the ideal effect is achieved at the cost of time.

3.2.2. Results of the Road Image with Shadow before and after Separation

In this work, the nonsubsampled contourlet transform (NSCT) enhancement algorithm is used to enhance and transform the road crack image with shadow in Figure 4(a), and the two direction subband images of the third layer shown in Figure 4(b) are obtained. After reconstruction, Figure 4(c) is obtained. In the process of shadow separation, the shadow part of the crack image (Figure 4(d)) is taken out. After the same enhancement is applied to the separated road image, two direction subband images of the third layer are obtained, and Figure 4(e) is the reconstructed crack image.

Figure 4(a) is the original road image. When the NSCT algorithm is applied directly to the original road image for enhancement, without the proposed shadow separation algorithm, the two direction subband images of the third layer shown in Figure 4(b) are obtained. Applying the NSCT reconstruction to Figure 4(b) gives the reconstructed image without shadow separation, shown in Figure 4(c). Figure 4(d) shows the same two direction subband images of the third layer, obtained by first applying the proposed shadow separation algorithm and then applying the NSCT algorithm. Figure 4(e) is the reconstructed image obtained by applying the NSCT reconstruction to Figure 4(d).

For two other original road images, shown in Figures 5(a) and 6(a), Figures 5(b) and 5(c) and Figures 6(b) and 6(c) show the two direction subband images of the third layer and the reconstructed image obtained directly by the NSCT algorithm without the proposed shadow separation algorithm. In Figures 4(b) and 4(c), 5(b) and 5(c), and 6(b) and 6(c), the enhancement result is not ideal because the gray value of the shadow is very low. Part of the cracks is covered by the shadow; hence, the cracks are not obvious in either the direction subband images or the reconstructed images, which is not conducive to crack detection. In Figures 4(d) and 4(e), 5(d) and 5(e), and 6(d) and 6(e), after the proposed shadow separation algorithm is applied, the direction subband images and the reconstructed images show the crack information well, which is conducive to subsequent processing. Therefore, the shadow must be separated before road cracks are detected.

3.2.3. Adaptability of the Algorithm to Shadow Separation of Different Shapes

Figure 7 shows the results of an adaptability experiment in which the algorithm separates shadows of different structures. There are 18 kinds of shadow images ((1)∼(18)), and the number of iterations is 50. In each group, (a) on the left is the original road image with different shadows such as vehicle shadow, tree shadow, railing shadow, large shadow, and vertical shadow, (b) is the shadow part, and (c) is the part after shadow separation.

These adaptability experiments show that the algorithm can separate shadows of different shapes.

3.2.4. Comparison with Other Works

We use the performance evaluation formula proposed in [24]. The evaluation index is the shadow detection rate $R_1$:

$$R_1 = \frac{TP_s}{TP_s + FN_s},$$

where the subscript $s$ represents the shadow, $TP_s$ is the number of correctly recognized shadow pixels, and $FN_s$ is the number of shadow pixels that are not correctly identified. The larger the value of $R_1$, the better the result.
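As a small worked example of this index, the sketch below counts correctly recognized and missed shadow pixels from two flattened binary masks and computes the detection rate. The function names and the eight-pixel masks are hypothetical and serve only to show the arithmetic.

```python
def count_shadow_hits(predicted, truth):
    """Count shadow pixels that were detected (TP) and missed (FN)."""
    tp = sum(1 for p, t in zip(predicted, truth) if t and p)
    fn = sum(1 for p, t in zip(predicted, truth) if t and not p)
    return tp, fn


def shadow_detection_rate(true_positives, false_negatives):
    """Shadow detection rate: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("the ground truth contains no shadow pixels")
    return true_positives / total


# Flattened masks, 1 = shadow pixel, 0 = non-shadow pixel (made-up values).
truth = [1, 1, 1, 1, 0, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 1, 1, 0]

tp, fn = count_shadow_hits(predicted, truth)
r1 = shadow_detection_rate(tp, fn)
print(tp, fn, r1)  # prints "4 1 0.8"
```

Note that the index only looks at ground-truth shadow pixels, so the non-shadow pixel wrongly marked as shadow in this example does not change its value.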

Three different datasets, as shown in Table 1, are used to evaluate our proposed model. The image sequences of highway-1, highway-2, and highway-3 are widely used as the reference image sequences for shadow detection. The shadow size, shadow intensity, vehicle type, size, and speed are different in these image sequences.

For comparison, Table 2 shows the results of the existing models and of our model on the three datasets. As can be seen, our model performs significantly better than the other models.

3.2.5. Shadow Removal in the Road Monitoring Image

This method can also be used as a preprocessing step to remove shadows from road monitoring images. At present, many shadow removal algorithms need to obtain the background image first and then remove the shadow based on the difference between the target image and the background image. In this work, the original image is separated by MCA, and the shadow is then removed by multigradient analysis and morphological operations.

According to the properties of shadows, the gray level of each pixel in the shadow area is $n$ times that of the corresponding point in the background image, and $n$ changes little within the shadow area; that is, the shadow lies in the low-frequency region. In contrast, the ratio of the gray level of the target area to the background gray level usually varies; that is, the target lies in the high-frequency region. The gradient operator highlights gray-level changes: with multigradient analysis, points with large gray-level changes receive higher gradient values, so the gradient values can be used to judge the continuity and uniformity of the image gray level. Generally, the gray level of the shadow area is relatively uniform with small fluctuations, while the boundary between the shadow and the target shows dramatic gray-level changes.

Figure 8 shows the experimental results of shadow removal based on MCA and multigradient analysis. The background image difference method is a simple, effective, and widely used method for target detection, but it requires the background image to be stored in advance. Based on the gray-level difference between the object and the background, the background image is subtracted from the current image, and each pixel of the result is compared with a preset threshold and classified as a foreground point or a background point. In this experiment, both the background image and the current image come from the results of MCA, so the background image does not need to be stored in advance. In the multigradient analysis, the improved Sobel operator is used to analyze the gradient of the corresponding region in the vertical, diagonal, and horizontal directions.
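The two ingredients described above, background differencing and gradient analysis in several directions, can be sketched as follows. The code uses the standard 3 x 3 Sobel kernels together with two common diagonal variants; it is not the improved Sobel operator of this work, whose definition is not given here. The synthetic 10 x 10 images, the threshold of 20, and all function names are illustrative assumptions.

```python
import numpy as np

# Standard Sobel kernels plus two diagonal variants (assumed kernel set).
KERNELS = {
    "horizontal": np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float),
    "vertical": np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float),
    "diagonal_45": np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], dtype=float),
    "diagonal_135": np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], dtype=float),
}


def background_difference(current, background, threshold):
    """Mark pixels whose gray level differs from the background by more than threshold."""
    return np.abs(current - background) > threshold


def correlate3x3(image, kernel):
    """Correlate an image with a 3 x 3 kernel, replicating the border pixels."""
    padded = np.pad(image, 1, mode="edge")
    height, width = image.shape
    out = np.zeros((height, width), dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + height, j:j + width]
    return out


def multidirection_gradient(image):
    """Largest absolute kernel response over the four directions."""
    responses = [np.abs(correlate3x3(image, k)) for k in KERNELS.values()]
    return np.max(responses, axis=0)


# Synthetic scene: uniform background, left half darkened by a uniform shadow.
background = np.full((10, 10), 100.0)
current = background.copy()
current[:, :5] *= 0.5

changed = background_difference(current, background, threshold=20.0)
gradient = multidirection_gradient(current)
```

In this toy scene, plain differencing flags the whole shadow as changed, while the gradient is zero inside the uniformly darkened area and large only along the shadow boundary, which is the property the multigradient analysis relies on.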

4. Conclusion

In this work, an algorithm for shadow separation based on sparse decomposition is studied. It can adaptively approximate the shadow and the image background in the road image and thereby, as an image preprocessing step, provide a high-quality input image that meets the requirements of high-level image understanding tasks such as image segmentation and target recognition. The proposed algorithm builds on sparse representation theory and MCA. MCA is a signal and image decomposition method based on sparse representation. Its main assumption is that the geometric structure and texture components of an image are each sparse under a specific basis or overcomplete dictionary, while the bases or overcomplete dictionaries that sparsely represent the various morphological components are mutually incoherent. To solve the problem of road shadows, a road shadow separation method based on MCA is proposed. First, according to the geometric characteristics of the image, a dictionary is chosen to sparsely represent each part. Thereafter, the corresponding image signal is transformed according to the dictionary to obtain the sparse representation coefficients of each part, and the coefficients are shrunk by soft thresholding to obtain new coefficients. Finally, the coefficients are inverse-transformed, and the process is iterated many times to separate the desired road image from the shadow part. In this way, the morphological differences between the information components of the road image are used for separation. The experimental results verify that the proposed method is effective for shadow separation and achieves better performance than the compared techniques.

Data Availability

The data used to support the findings of this study are available at http://cvrr.ucsd.edu/aton/shadow/.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.