Abstract

Shadows limit many remote sensing applications such as classification, target detection, and change detection. Most current shadow detection methods apply a histogram threshold to spectral characteristics to separate shadows from nonshadows directly, an approach called “hard binary shadow.” The performance of threshold-based methods relies heavily on the selected threshold, and these methods do not take any spatial information into account. To overcome these shortcomings, a soft shadow description method is developed by introducing the concept of opacity into shadow detection, and an MRF-based shadow detection method is proposed in order to make use of neighborhood information. Experiments on remote sensing images show that the proposed method obtains more accurate detection results.

1. Introduction

Remote sensing images are applied in many fields because of the abundant information they carry. However, shadows, which are present in almost all of these images, hinder such applications because radiance is reduced or even lost in the shadow areas. On the other hand, shadows may provide useful information, for example, building height, which makes them valuable in applications such as building extraction and surface deformation analysis. Therefore, shadow detection is an essential preprocessing step in remote sensing image processing. We seek a feasible shadow detection method by analyzing the characteristics of shadows in remote sensing images.

Various shadow detection methods have been explored during the last decades. Most of these methods are based on the histogram threshold of extracted spectral features of shadow areas. As shown in Figure 1, Jaynes et al. [1] observed that pixels in shadow areas have lower intensity and proposed a method using an intensity threshold. Polidorio et al. [2] combined the higher saturation of shadows with their lower intensity to detect shadows in aerial color images. Later, Huang et al. [3] used the property that pixels in a shadow region usually have a large hue value, a low blue color value, and a small difference between green and blue color values. These works have obtained a variety of good results, but some drawbacks limit their detection accuracy. For example, dark objects may be misclassified as shadows, while light ones in a shadow area may be treated as nonshadows. It is also rather difficult to distinguish deep green grass from shadows.

In order to take full advantage of the spectral features of multiband images, invariant color spaces are often used to stress the differences between shadows and nonshadows. Haijian et al. [4] used the special properties of shadows in HSV color space (i.e., high saturation, low value) to detect shadows in high resolution satellite images, and Arévalo et al. [5] proposed a shadow detection method with a region growing procedure in the C1C2C3 color space [6]. Tsai [7] computed the ratio of hue to intensity in several color spaces, including HSI, HSV, HCV, YIQ, and YCbCr, to deshadow color aerial images. These methods did improve the detection precision but are still not robust enough to give satisfactory results. Many attempts have been made to improve the performance of the thresholding method. Yang et al. [8] combined thresholds of different features, and Liu and Xie [9] performed the principal component transformation of the extracted features before using the histogram. These methods have achieved good results in their intended realms.

All the methods mentioned above are threshold-based. Unfortunately, dividing an image by a single threshold is a high-risk strategy because, in reality, the extracted features of shadows are usually not uniformly distributed, which makes the selection of a proper threshold difficult. For example, intensity values in penumbras and on shadow boundaries always lie between those in shadow and nonshadow areas, and thresholding methods can hardly represent such situations. Therefore, a natural way to describe an image is in a soft manner, for example, by the degree to which each pixel belongs to the shadow, called “soft shadow.” Wu et al. [10] used the distance between a pixel and a feature set to measure the soft degree. Liu and Yamazaki [11] divided shadows into three classes according to the histogram: dark-shadow, medium-shadow, and light-shadow. These methods have limits, such as the applicability only to images with a single land surface in [10] and the discrete nature of the classes in [11]. Nevertheless, the notion of describing shadows with a soft degree affords a way to represent the nonuniform distribution of shadows.

Inspired by [10, 11], we present a novel solution for describing the “soft shadow” by introducing the concept of opacity from image matting into shadow detection and combining it with the intensity. Instead of using the thresholding method, we propose an iterative detection method based on the Markov random field (MRF) model [12], which serves as a powerful formal tool for taking advantage of neighborhood interactions. In our method, the soft shadow model is employed to estimate the probability of features, and the MRF is utilized to model the prior distribution of the labels. Experiments on four QuickBird remote sensing images have shown that the proposed method can obtain more accurate detection results.

This paper is organized as follows: the determination of shadow probability is described in Section 2, followed by the proposed MRF-based shadow detection method built in Section 3. Extensive experimental results are reported in Section 4. Section 5 concludes the paper.

2. Shadow Probability

As mentioned above, the shadow is soft rather than hard because of the nonuniform distributions of spectral features in shadow areas. In other words, “soft shadow” is more appropriate for describing the shadows in remote sensing images. To measure the degree to which each pixel belongs to the shadow, the concept of shadow probability is introduced in this study. A shadow probability of 1 means that the pixel is a shadow pixel, whereas the probability value of a nonshadow pixel is 0. For any pixel, a value closer to 1 means a greater chance of being shadow. In order to obtain appropriate probability values, we propose a novel shadow probability model which combines intensity and opacity.

2.1. Get the Probability by Intensity

Shadows have many different spectral characteristics; among them, lower intensity is the primary one and, as stated in [13], the most consistent one. Therefore, intensity is commonly chosen to estimate the distance between shadow and nonshadow. Meanwhile, experiments in [7] showed that the HSI model was the optimal choice for the approach proposed there. In fact, the superiority of the HSI color space mainly lies in the fact that intensity is independent of hue and saturation, which makes the HSI intensity value more accurate. Hence, we transform images into the HSI color space and use the intensity component to describe shadow pixels. To make the distinction between shadow and nonshadow areas more obvious, the inverse tangent transformation [13] is performed on the intensity component.
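As a brief illustration of the intensity component used here, the following minimal sketch computes the HSI intensity as the mean of the red, green, and blue values. The function name `hsi_intensity`, the 8-bit input range, and the scaling to [0, 1] are assumptions made for illustration and are not taken from the original implementation.

```python
def hsi_intensity(rgb_image):
    """Return the HSI intensity plane I = (R + G + B) / 3, scaled to [0, 1].

    rgb_image is a list of rows; each pixel is an (R, G, B) tuple of 8-bit
    values (an assumption of this sketch).
    """
    return [[(r + g + b) / (3 * 255.0) for (r, g, b) in row] for row in rgb_image]


# Tiny 1 x 3 example: a dark (shadow-like) pixel, a mid-gray pixel, a white pixel.
image = [[(30, 30, 60), (128, 128, 128), (255, 255, 255)]]
intensity = hsi_intensity(image)
```

Because the intensity is computed independently of hue and saturation, the dark pixel receives the lowest value regardless of its bluish tint.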

Let $I = \{I_s \mid s \in S\}$ denote an intensity image defined on a rectangular lattice set, where $S$ is the set of pixel sites, and let $\tilde{I}$ denote the transformed intensity image described below. Let the training shadow samples be denoted by $\{S_k\}$, where $S_k \subset S$ is the site set of the $k$th sample. The transformation (1) applies the inverse tangent function to $d_s$, the intensity distance between the pixel at site $s$ and the mean intensity of the shadow samples; this distance is defined in (2), where the mean is taken over all $N$ pixels in the sample areas.

The inverse tangent function in (1) enlarges the difference between shadows and nonshadows. Its sensitivity factor adjusts the rate of change and has an empirical value of 20. An example of this function is shown in Figure 2. It can be seen that the intensity contrast in Figure 2(b) is more obvious than that in Figure 2(a).
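The effect of the inverse tangent transformation can be sketched as follows. The exact form of (1) is not reproduced here, so this sketch assumes a simple signed form, `atan(sensitivity * (intensity - mean_of_shadow_samples))`, with intensities in [0, 1]; only the sensitivity value of 20 comes from the text. The function names are hypothetical.

```python
import math


def shadow_sample_mean(intensity, sample_sites):
    """Mean intensity over all scribbled shadow sample sites (row, col)."""
    values = [intensity[r][c] for (r, c) in sample_sites]
    return sum(values) / len(values)


def arctan_transform(intensity, sample_sites, sensitivity=20.0):
    """Assumed form of the transformation: atan(sensitivity * distance),
    where distance is the signed difference to the shadow sample mean."""
    mu = shadow_sample_mean(intensity, sample_sites)
    return [[math.atan(sensitivity * (v - mu)) for v in row] for row in intensity]


# 2 x 3 intensity image in [0, 1]; three dark pixels are marked as shadow samples.
intensity = [[0.10, 0.12, 0.50],
             [0.11, 0.45, 0.90]]
shadow_sites = [(0, 0), (0, 1), (1, 0)]
transformed = arctan_transform(intensity, shadow_sites)
```

Under this assumed form, values near the shadow mean are stretched apart, while values far from it saturate toward the bound of the inverse tangent, so the gap between shadow-like and clearly nonshadow pixels widens.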

To obtain a shadow probability for each pixel, the distance between the transformed intensity and the mean intensity of the sample shadow areas is used to measure the degree to which each pixel belongs to the shadow. Let $X = \{x_s \mid s \in S\}$ denote the label field, where $x_s = 1$ indicates that the pixel on site $s$ is a shadow pixel. The shadow probability (3) is then defined from the distance between the transformed intensity of the pixel and the mean value of the sample areas in the transformed intensity image $\tilde{I}$, and the nonshadow probability (4) is defined correspondingly.

This probability is applicable when pixel intensities are obviously different. However, in most cases, it is not accurate enough for shadow detection. To improve the accuracy of detection result, we introduce the concept of opacity of image matting into shadow detection, which will be introduced in the next section.

2.2. Introducing the Concept of Opacity

Image matting is the process of extracting a foreground object from an image by estimating the foreground opacity [14]. Shadow detection aims at extracting shadows from an image; therefore, it is reasonable to regard shadows as foreground objects and to introduce the methods used in image matting to the field of shadow detection.

In image matting, the color of the pixel on site $s$ in channel $c$ is assumed to be a linear combination of the corresponding foreground and background colors:

$$I_s^c = \alpha_s F_s^c + (1 - \alpha_s) B_s^c, \tag{5}$$

where $\alpha_s$ is the pixel’s foreground opacity and $F$ and $B$ denote the foreground and the background. It is a demanding task to estimate the foreground and the background colors as well as the opacity. A closed-form solution [13] is presented to extract the alpha matte directly without requiring reliable estimates for $F$ and $B$. In this method, alpha is computed as

$$\alpha = \arg\min_{\alpha} \; \alpha^{T} L \alpha + \lambda (\alpha - b_S)^{T} D_S (\alpha - b_S), \tag{6}$$

where $\lambda$ is a large constant, $L$ is a square matrix defined as the matting Laplacian [14], $D_S$ is a diagonal matrix whose diagonal elements are 1 for sampled pixels and 0 for all other pixels, and $b_S$ is the vector that contains the specified alpha values for the sampled pixels and 0 for all others. To describe shadow in a soft manner, we suppose that shadow pixels belong to the foreground and nonshadow pixels belong to the background. Then, the shadow probability can be denoted by alpha [14].
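The closed-form opacity estimate can be illustrated on a toy problem. Setting the gradient of the quadratic cost to zero shows that its minimizer solves a linear system: (Laplacian + constant × sample indicator matrix) × alpha = constant × specified alpha values. The sketch below solves that system for five pixels in a row. It is only an illustration under stated assumptions: a path-graph Laplacian stands in for the true matting Laplacian, the constant is set to 100, and the helper names are hypothetical.

```python
def path_laplacian(n):
    """Graph Laplacian of n pixels in a row (stand-in for the matting Laplacian)."""
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] -= 1.0
        lap[i + 1][i] -= 1.0
    return lap


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def solve_alpha(lap, scribbles, lam=100.0):
    """Solve (L + lam * D) alpha = lam * b for the scribbled alpha values.

    scribbles maps a pixel index to its specified alpha (1 shadow, 0 nonshadow).
    """
    system = [row[:] for row in lap]
    rhs = [0.0] * len(lap)
    for index, value in scribbles.items():
        system[index][index] += lam   # diagonal entry of D is 1 for sampled pixels
        rhs[index] = lam * value      # b holds the specified alpha value
    return solve_linear(system, rhs)


# Pixel 0 is scribbled as shadow, pixel 4 as nonshadow; pixels 1-3 are unknown.
alpha = solve_alpha(path_laplacian(5), {0: 1.0, 4: 0.0})
```

In this toy case the opacity falls off smoothly from the shadow scribble to the nonshadow scribble, which is the soft description that a hard threshold cannot provide.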

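So, the shadow probability using alpha is denoted by

$$P_{\alpha}(x_s = 1) = \alpha_s, \tag{7}$$

where $\alpha_s$ is the opacity of the pixel on site $s$ and $x_s = 1$ labels that pixel as shadow.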

Finally, taking into account both intensity and opacity, the shadow probability is described by (8), in which a parameter, whose value is decided from a histogram, adjusts the effect of intensity and opacity when determining the shadow probability.

3. MRF Shadow Detection Method

Since the aim is to partition an image into shadow and nonshadow areas, the procedure of shadow detection can be treated as a process of image labeling. Thresholding methods label each pixel individually based on its intensity, which has some obvious defects. The first is that the interaction between neighbors is ignored, which seriously impacts the detection accuracy. Besides, imprecision is inherent in remote sensing images, since a pixel always represents an area of the land surface, so thresholding methods retain less information from the original images than soft methods do. Therefore, in this study, aiming at more accurate detection results, we design a soft shadow detection method that resorts to the MRF to incorporate spatial information.

Let $I = \{I_s \mid s \in S\}$ denote an intensity image defined on a rectangular lattice set $S$ and let $X = \{x_s \mid s \in S\}$ denote the label field, where $x_s \in \{1, 2\}$, with 1 and 2 indicating shadow area and nonshadow area, respectively. In the Bayesian image segmentation framework, the segmented image is obtained by

$$\hat{x}_s = \arg\max_{x_s} P(I_s \mid x_s)\, P(x_s \mid x_{N_s}),$$

where $P(I_s \mid x_s)$ is the class conditional probability, which represents the characteristics of the extracted features of pixels. In our proposed method, the shadow probability described in Section 2 is employed as the class conditional probability. $P(x_s \mid x_{N_s})$ is the prior probability, which defines the interaction between each pixel and its neighbors, where $N_s$ is the set of neighborhood pixels of pixel $s$. The Potts model is a simple and efficient way to represent the interactions of neighboring pixels clearly. Although some improved, more complex Potts models have been proposed, the classical model is sufficient for this study. Therefore, for simplicity and without loss of generality, we employ the classical Potts model to describe the label field in our proposed method.
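The interplay between the class conditional probability and the Potts prior can be sketched for a single pixel. This is a minimal illustration, not the original implementation: the Potts weight `beta = 1.0` is a hypothetical value, the nonshadow probability is assumed to be one minus the shadow probability, and the function names are invented for the example.

```python
import math

SHADOW, NONSHADOW = 1, 2


def potts_prior(label, neighbor_labels, beta=1.0):
    """Classical Potts prior: proportional to exp(beta * number of agreeing neighbors)."""
    weights = {}
    for candidate in (SHADOW, NONSHADOW):
        agree = sum(1 for n in neighbor_labels if n == candidate)
        weights[candidate] = math.exp(beta * agree)
    return weights[label] / sum(weights.values())


def map_label(p_shadow, neighbor_labels, beta=1.0):
    """Pick the label maximizing class conditional probability times Potts prior."""
    post_shadow = p_shadow * potts_prior(SHADOW, neighbor_labels, beta)
    post_nonshadow = (1.0 - p_shadow) * potts_prior(NONSHADOW, neighbor_labels, beta)
    return SHADOW if post_shadow >= post_nonshadow else NONSHADOW


# A slightly shadow-like pixel surrounded mostly by nonshadow neighbors (3 x 3 window).
isolated = map_label(0.6, [NONSHADOW] * 7 + [SHADOW])
# A slightly nonshadow-like pixel fully surrounded by shadow neighbors.
embedded = map_label(0.4, [SHADOW] * 8)
```

In both cases the neighborhood overrides a weak per-pixel preference, which is the behavior a per-pixel threshold cannot reproduce.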

Given a remote sensing image in the visible spectrum, the training samples can be obtained by scribbling different lines on shadow area and nonshadow area, respectively. After that, the opacity can be estimated by (6). Then, the input image is transformed into the HSI color space to get the corresponding transformed intensity image. Based on this, the shadow probabilities are computed and applied in our iterative shadow extraction procedure.

The details of our proposed MRF-based shadow detection algorithm are given in Algorithm 1.

Procedure detectShadow(img, samples)
 alpha ← solveAlpha(img, samples) /* By (6) */
 /* alpha is an array that records the parameter alpha for each pixel site */
 Palpha ← alpha /* By (7) */
 I ← colorTransform(img) /* By (1) */
 Pshadow, Pnonshadow ← shadow and nonshadow probabilities /* By (8) */
 initSeg ← histogramOpacity(Palpha)
 /* Obtain the initial segmentation according to the histogram of the probability computed through the opacity */
 seg ← initSeg; iter ← 1
 while (iter ≤ maxIter)
  /* Apply the class conditional probability iteratively until the algorithm reaches the maximum iteration number maxIter */
  prior ← priorPercent(seg, N)
  /* By the Potts model over the neighborhood N */
  /* In the first iteration, seg = initSeg */
  seg ← MAPcriteria(Pshadow, Pnonshadow, prior)
  /* Segment the image into shadows and nonshadows using the MAP criterion */
  iter ← iter + 1
 Endwhile
 result ← seg
 Return result
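The iterative part of the procedure above can be sketched in a few lines. The sketch is a simplified stand-in rather than the original MATLAB code: it takes a precomputed shadow probability map as input, initializes the segmentation with a fixed 0.5 cut instead of the histogram of the opacity-based probability, assumes the nonshadow probability is one minus the shadow probability, uses a hypothetical Potts weight `beta = 1.0` with a 3 × 3 neighborhood, and stops early once the labels no longer change.

```python
import math

SHADOW, NONSHADOW = 1, 2


def neighbors(seg, r, c):
    """Labels in the 3 x 3 window around (r, c), excluding the center pixel."""
    rows, cols = len(seg), len(seg[0])
    return [seg[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))
            if (i, j) != (r, c)]


def detect_shadow(p_shadow, beta=1.0, max_iter=10, eps=1e-6):
    """Iteratively relabel pixels by the MAP criterion with a Potts prior."""
    # Initial segmentation (assumption: a fixed 0.5 cut on the probability map).
    seg = [[SHADOW if p >= 0.5 else NONSHADOW for p in row] for row in p_shadow]
    for _ in range(max_iter):
        new_seg = []
        for r, row in enumerate(p_shadow):
            new_row = []
            for c, p in enumerate(row):
                nbrs = neighbors(seg, r, c)
                # Log-posterior up to a constant: log-likelihood + beta * agreeing neighbors.
                score_s = math.log(max(p, eps)) + beta * nbrs.count(SHADOW)
                score_n = math.log(max(1.0 - p, eps)) + beta * nbrs.count(NONSHADOW)
                new_row.append(SHADOW if score_s >= score_n else NONSHADOW)
            new_seg.append(new_row)
        if new_seg == seg:   # labels are stable, stop before max_iter
            break
        seg = new_seg
    return seg


# Shadow on the left half; one weak pixel inside the shadow (0.4) and one
# speckle in the nonshadow area (0.6).
probability = [[0.9, 0.9, 0.2, 0.1],
               [0.9, 0.4, 0.1, 0.1],
               [0.9, 0.9, 0.1, 0.6],
               [0.9, 0.9, 0.1, 0.1]]
result = detect_shadow(probability)
```

On this toy map the weak shadow pixel is recovered and the isolated speckle is removed after one relabeling pass, so the left two columns end up as shadow and the right two as nonshadow.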

4. Experiments and Discussion

In this section, we present experimental results that verify the performance of the proposed method. We implemented all the algorithms in MATLAB 2009a. The experiments were performed on a personal computer with an Intel Pentium Dual-Core 2.2 GHz CPU and 8 GB of random access memory, under the Microsoft Windows 7 environment.

4.1. Data Sets

To evaluate the performance of the proposed shadow detection method, four QuickBird urban images were chosen. The first image, shown in Figure 3(a) and acquired in September 2003, is a 280 × 280 pixel image in the visible spectrum. It mainly consists of grass areas, buildings, and their shadows. The next two images show different scenes of Beijing acquired in May 2011 and April 2011, respectively, and their sizes are 300 × 300 and 280 × 280 (see Figures 5(a) and 6(a)). They mainly contain roads, trees, grass, buildings, blue roofs, and shadows of trees and buildings. The fourth image is a 256 × 256 pixel crop of Wuhan University acquired in 2004 (see Figure 7(a)). It contains six land covers: roads, bare land, trees, buildings, water, and shadows.

The main variation among the selected scenes is the complexity of their contents. For example, the first image (Figure 3(a)) contains fewer types of land cover than the others. Moreover, in the first image, the shadows are mainly cast by buildings and are relatively large, whereas in the “Wuhan University” image (Figure 7(a)) there are many fragmented tree shadows in addition to building shadows. This difference in complexity can affect the performance of our proposed shadow detection method, as discussed later.

4.2. Parameter Setting

Some parameters in our experiments were chosen empirically: the sensitivity factor of the intensity transform function was set to 20, and the parameter of the Potts model was likewise fixed. However, the parameter used in the shadow probability function (8) should be set in accordance with the specific image, because its value is determined by a histogram, as mentioned in Section 2.2. In our experiments, 0.98, 0.91, 0.88, and 0.95 are set as its values for Figures 3(a), 5(a), 6(a), and 7(a), respectively.

Another important parameter in the proposed MRF-based method is the size of the neighborhood. Three shadow detection experiments using different neighborhood sizes, namely, 3 × 3, 5 × 5, and 7 × 7, were performed on the first image. The results shown in Figure 3 indicate that the best choice is 3 × 3 because it preserves the image details as much as possible. This neighborhood size has therefore been adopted in all the following experiments.

4.3. Experiment Setup

Before running our detection method, a sample image should be prepared by scribbling nonshadows in black and shadows in white. In order to get enough samples, white scribbles should be placed on all different kinds of land surfaces in shadow areas. Simultaneously, black scribbles for the nonshadow samples should mark all the regions that should not be mixed with shadows.

In order to obtain the shadow probability, two procedures are performed: one computes the probability from the opacity image and the other from the transformed intensity image. Figure 4(b) shows the detection result on the first QB image of the shadow probability model using opacity only, while Figure 4(c) displays the result using transformed intensity only. As seen from Figure 4(b), the detection has lost some small shadow areas and misclassified some nonshadow pixels as shadow pixels. The main reason lies in the insufficiency of samples in the corresponding areas. It is therefore not sufficient to rely on opacity only. Likewise, Figure 4(c) shows that detection using intensity only yields many speckled shadows caused by dark pixels on trees or buildings.

Then, the final shadow probability is obtained by combining the opacity with the transformed intensity. As expected, the result using the final probability, shown in Figure 4(d), has noticeably fewer misclassifications. In summary, the comparison of the results in Figure 4 demonstrates the advantage of our soft shadow model over either cue alone.

4.4. Results and Discussions
4.4.1. Comparative Method

In order to verify the superiority of the proposed shadow detection method over thresholding methods, three methods are employed as competitors: the bithreshold method [8], PCAHSI [9], and the H/I method [7].
(i) Bithreshold method [8]: first, transform the image into the HSI color space, compute the normalized difference of the intensity and saturation components, and obtain the initial detection by thresholding it. Then, obtain a second detection result from a single channel by histogram threshold. The final result is obtained by performing an AND operation on the two detection results.
(ii) PCAHSI [9]: first, compute the shadow index (SI) based on the principal component transformation and the HSI model. Then, the image is divided into the shadow area and the nonshadow area by the SI histogram threshold.
(iii) H/I method [7]: first, compute the ratio H/I in the HSI color space and get an initial result by the H/I threshold. Then, obtain a shape mask by applying a Sobel operator to one image component. Finally, by combining the shape mask with the existing shadow mask of shadowed regions through a logical AND operator, a shape-preserved shadow mask is derived.

4.4.2. Visual Comparison

Figures 5 and 6 present the results of our proposed method and the competing ones on the “Beijing scene” images. In these experiments, the corresponding ground truth images (Figures 5(c) and 6(c)) are obtained by careful photo interpretation. Subfigures (d)–(g) in both Figures 5 and 6 show the detection results of the bithreshold method, PCAHSI, the H/I method, and the proposed method, respectively. As seen from Figure 5, the thresholding methods make serious misclassifications because the intensity of trees is very similar to that of shadows. Using histogram thresholds alone, the thresholding methods cannot distinguish trees from shadows accurately, whereas the proposed method describes each pixel with a shadow probability and decides whether a pixel is a shadow pixel by considering the neighborhood information. Therefore, the proposed method gives the most satisfactory result.

The comparison between Figures 6(d)–6(f) and 6(g) shows that our proposed method is clearly superior to the reference methods. The competing results contain much misclassified speckle noise, some of which is caused by small dark objects in nonshadow areas and some by the fragmented shadows of trees. As the purpose of the present work is to separate the shadows from the objects that cast them, these are considered noise. This problem is solved in Figure 6(g), where small misclassification islands are eliminated by the spatial information between neighboring pixels. Moreover, in Figures 6(d)–6(f), some high intensity objects in shadow areas are misclassified as nonshadows, which shows that the hard shadow models cannot accurately represent the nonuniform distributions of spectral features in shadow areas. In contrast, the misclassifications in Figure 6(g) are greatly reduced, which mainly benefits from the accurate nonuniform description capability of the proposed shadow probability model and the powerful spatial modeling ability of the MRF used in our method.

4.4.3. Quantitative Comparisons

To obtain a quantitative comparison between different algorithms, both recall and precision are employed as indicators. Recall represents how many true shadow pixels have been detected as shadow pixels and is denoted by

$$\text{Recall} = \frac{N_C}{N_T},$$

where $N_T$ is the number of true shadow pixels, counted from a true shadow mask obtained manually by careful photo interpretation, and $N_C$ is the number of pixels correctly detected, computed by performing an AND operation on the detected result and the true shadow mask. Precision indicates how many of the detected shadow pixels are correct and is denoted by

$$\text{Precision} = \frac{N_C}{N_D},$$

where $N_D$ is the number of pixels labeled as shadow.

From the definitions, it follows that recall favors overdetection and precision favors underdetection. That is to say, a high recall combined with a low precision indicates that shadows are overdetected.
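The two indicators can be computed directly from binary masks, as in the following minimal sketch. The function name and the toy masks are illustrative assumptions; masks are represented as lists of rows in which 1 marks shadow and 0 marks nonshadow.

```python
def recall_precision(detected, truth):
    """Return (recall, precision) for two binary masks of equal size."""
    # Correctly detected pixels: logical AND of the detected and true masks.
    correct = sum(1 for d_row, t_row in zip(detected, truth)
                  for d, t in zip(d_row, t_row) if d and t)
    n_true = sum(1 for row in truth for t in row if t)
    n_detected = sum(1 for row in detected for d in row if d)
    recall = correct / n_true if n_true else 0.0
    precision = correct / n_detected if n_detected else 0.0
    return recall, precision


# Overdetection: every true shadow pixel is found, plus three false alarms.
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
detected = [[1, 1, 1, 1],
            [1, 1, 1, 0]]
recall, precision = recall_precision(detected, truth)
```

Here the recall is 1 while the precision is only 4/7, the pattern that signals overdetection.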

We applied all competing algorithms and the proposed method to the “Wuhan University” QB image shown in Figure 7. We then measured the recall and precision of each method and list the values in Table 1.

As seen from Table 1, the recall of the bithreshold and PCAHSI methods is 1, whereas their precision is only 0.4601 and 0.5645, respectively, which shows that both methods overdetect shadows. Similar behavior can also be seen in Figures 7(d) and 7(e). Although the H/I method achieves a more accurate result (shown in Figure 7(f)) than bithreshold and PCAHSI, it still produces serious misdetection islands in the final map. The quantitative results in Table 1 indicate that the proposed method obtains more accurate shadow detection results than the competitors, owing to its shadow probability model.

4.5. Limits

By carefully observing the detection results of the proposed method on different images (Figures 5(g), 6(g), and 7(g)), we find that the more complex the surface is, the less satisfactory the result becomes. In order to give a quantitative evaluation, we compared the results on the first QB image and the “Wuhan University” image (see Figure 8). Their values of recall and precision are listed in Table 2.

As mentioned in Section 4.1, the first QB image contains only three types of land cover, while the “Wuhan University” image consists of six different types. Accordingly, the recall and precision for the first QB image are 1 and 0.8961, respectively, whereas these values drop to 0.9882 and 0.78901 for the “Wuhan University” image. The main reason is that it is hard to provide sufficient samples in a complex image. When samples are scribbled, pixels belonging to different land covers may be selected as training samples. Even worse, pixels of the same land cover but with different intensity, for example, light buildings and dark buildings, may also be selected. A possible way to overcome this limit is to jointly use the shape of the sampled land covers, which we leave for future work.

5. Conclusions

In this paper, we focus on the design of a soft shadow description method, in which both intensity and opacity are employed to estimate, for each pixel, a soft degree of belonging to the shadow. Based on the soft shadow description, an MRF-based iterative detection method is proposed to make full use of the interactions between neighboring pixels. The proposed method is tested on four QuickBird remote sensing images. The experimental results demonstrate that the proposed method obtains more accurate results than the three competing methods.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China under Grant 41001251, the Key Technology Projects of Henan Province of China (no. 132102210212), the Key Technology Projects of the Educational Department of Henan Province of China (no. 13A520011), and the Excellent Youth Teacher of Henan Educational Department of China (2011).