Automatic sea-land segmentation is an essential and challenging task for the practical use of panchromatic satellite imagery. Owing to temporal variations and to the complex, inconsistent intensity contrast in both land and sea areas, conventional thresholding methods rarely produce an accurate segmentation result. In addition, freely available digital elevation models (DEMs) struggle to meet the requirements of high-resolution data in practice because of their low precision and the high storage cost they impose on processing systems. We therefore propose a fully automatic sea-land segmentation approach for practical use, built on a hierarchical coarse-to-fine procedure. We compare our method quantitatively with other state-of-the-art methods on real images with complex backgrounds. The experimental results show that our method outperforms the other methods and is computationally efficient.

1. Introduction

Sea-land segmentation is a promising yet very challenging area of research in image segmentation and is used in a wide range of remote sensing applications such as ship detection, reconnaissance and investigation of water surfaces, and oil spill detection, to name a few [1]. Compared with SAR imagery, panchromatic remote sensing images are free from high-level speckle and retain fine detail even for small targets, which makes them well suited to sea-land segmentation. Moreover, because they are easy to interpret, researchers have gradually turned their attention to high-resolution panchromatic imagery for ship detection and classification [2, 3]. However, sea-land segmentation using panchromatic remote sensing images has so far attracted little attention in the literature, despite the rich information these images contain and their possible impact on a wide range of applications. In this paper, we attempt to narrow this gap and make full use of the advantages of panchromatic remote sensing images.

A digital elevation model (DEM) can plausibly be used to obtain the sea-land mask; however, both the freely available and the paid DEMs are low in quality and resolution (the spatial resolution varies from approx. 1 km to 12 m) compared with the high-resolution imagery acquired by on-orbit panchromatic payloads nowadays [4, 5]. Moreover, integrating a DEM into a practical real-time system consumes a large amount of storage. As a result, automatic sea-land segmentation is an essential and meaningful application of panchromatic satellite images. Work on automatic sea-land segmentation can broadly be divided into three categories: thresholding-based segmentation, edge-based segmentation, and classification-based methods. Based on the theory of 2D maximum entropy (ME) and a genetic algorithm, Li et al. [1] developed an improved thresholding algorithm to extract water areas. Zhang and Li [6] presented a thresholding method (T) based on minimum class mean absolute deviation. In [7], Mao et al. proposed an improved Chan-Vese (CV) model constrained by edge information extracted with the dual-tree complex wavelet transform (DT-CWT). Aktaş et al. [8] put forward an edge-aware segmentation that preserves the shoreline boundaries by using steerable filters. Since both the thresholding- and edge-based segmentation methods are sensitive to noise and to the complicated distribution of intensity and texture on land and along shoreline borders, denoising is commonly used as a preprocessing step prior to segmentation. However, this preprocessing may largely destroy the integrity of the coastline. To address this problem, classification-based methods built on intensity, texture, statistical characteristics, and other low-level features have recently been gaining popularity. A set of gray level cooccurrence matrix (GLCM) features in four directions was employed in [9]. Dai et al. [10] introduced a multilevel local pattern histogram (MLPH) to classify the water area class from TerraSAR-X images. Xia et al. [11] used local binary pattern (LBP) features to obtain the integrated feature map for sea-land segmentation.

Sea-land segmentation provides a rough interpretation of the scene: it is less concerned with minute details and more with processing efficiency. The existing thresholding- and edge-based methods are sensitive to illumination variations, which leads to oversegmentation, whereas the classification-based algorithms usually have a higher computational complexity due to the complex feature extraction and the subsequent classification. To tackle these problems, we propose a fast algorithm for sea-land segmentation that combines a modified Otsu’s method [12] with homogeneous texture and intensity features. As we demonstrate, compared with state-of-the-art methods, our hierarchical method distributes the computational resources reasonably and delivers promising segmentation performance.

The rest of the paper is organized as follows. The outline of the proposed hierarchical segmentation method is given in Section 2. Experimental results are discussed in Section 3. In Section 4, we draw the conclusion.

2. Hierarchical Sea-Land Segmentation

Unlike multispectral and natural-color images, panchromatic images lack the help of color information or spectral metrics, such as the normalized difference water index (NDWI), for sea-land segmentation. In addition, because of the various land cover types, the complicated distributions of intensity and texture make it arduous to segment an integrated land mask. Ships, isles, clouds, and other scattered objects near the shore may also affect the extraction of water information because they partially cover the water area.

In this paper, the overall work is divided into two stages: a coarse segmentation stage and a fine segmentation stage. In the coarse stage, the main body of sea and land is separated by using homogenized features. The subsequent fine stage then refines the shoreline boundaries obtained from the coarse segmentation by means of Otsu’s method. Under this strategy, the computing resources are assigned reasonably and the results are spatially consistent. The overall flowchart is shown in Figure 1. In the following subsections, we discuss our work in more detail.

2.1. Coarse Segmentation Stage

As shown in Figure 1, the coarse segmentation stage comprises three steps: homogeneous feature extraction (HFE), local threshold segmentation (LTS), and fusion and false alarm removal (F-FAR). It is easy to notice the fact that most of the sea surfaces are in a calm and peaceful state and show regular gray values and textures. However, the ships, isolated isles, and waves may destroy the integrity of water, whereas the shadows of mountains and buildings existing on land may act as the complicated disturbances because they become obstructions in getting the information required for land detection. In addition, the commonly existing clouds with various kinds and sizes make it a big challenge for consistent segmentation. From this perspective, we replace the original spatial information with the homogeneous features to characterize the land and sea. The features used here are intensity and texture.

2.1.1. Homogeneous Feature Extraction (HFE)

HFE consists of intensity feature extraction and texture feature extraction, which are defined in detail below.

(a) Intensity Feature Extraction. Let represent the input image. We divide into equal blocks with the size of , that is, . To utilize the correlation between the regions, the overlapped width between adjacent blocks is set as , and the standard deviation of each block can be calculated according to Let denote the intensity value of the pixel in block , whereas represents the mean value of all pixels in block . Then for each block the pixel is labeled uniformly according to where means the center pixel intensity value in block and is an experimental threshold, which is 2.97 in this paper.

Afterwards, a homogeneous intensity value is defined and used to represent each block. First, the numbers of “0” and “1” labels are counted, respectively. Then, compare the label numbers of “0” and “1.” If pixels “1” are in the majority, then assign the mean intensity values of all pixels “1” to the present block; otherwise, the mean intensity values of pixels “0” are used for assigning instead. Last, repeatedly calculate all blocks to get the homogeneous intensity feature vector .
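The labeling and majority vote described above can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation. The labeling equation is not reproduced in the text, so `label_block` assumes that a pixel receives label 1 when it lies within 2.97 standard deviations of the block's center pixel. The function names, the `stride` parameter (block size minus the overlap width), and that labeling rule are all assumptions made for this sketch.

```python
import numpy as np

def label_block(block, t=2.97):
    """ASSUMED labeling rule (the source equation is not reproduced):
    label 1 if the pixel is within t standard deviations of the center pixel."""
    block = np.asarray(block, dtype=np.float64)
    center = block[block.shape[0] // 2, block.shape[1] // 2]
    sigma = block.std()
    return (np.abs(block - center) <= t * sigma).astype(np.uint8)

def homogeneous_intensity(block, labels):
    """Mean intensity of the majority label; ties go to label 0, as in the text."""
    block = np.asarray(block, dtype=np.float64)
    ones = labels == 1
    n_ones = int(ones.sum())
    if n_ones > labels.size - n_ones:
        return float(block[ones].mean())
    return float(block[~ones].mean())

def intensity_features(image, n, stride):
    """One homogeneous intensity value per n-by-n block (hypothetical helper)."""
    image = np.asarray(image, dtype=np.float64)
    rows = range(0, image.shape[0] - n + 1, stride)
    cols = range(0, image.shape[1] - n + 1, stride)
    feats = np.empty((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            block = image[r:r + n, c:c + n]
            feats[i, j] = homogeneous_intensity(block, label_block(block))
    return feats
```

With this rule, a single bright outlier in an otherwise uniform block ends up in the minority class, so the block is represented by the mean of the uniform background rather than by a mean pulled toward the outlier.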

(b) Texture Feature Extraction. Gradient information always offers a basic and direct way to characterize texture features. In this paper, we combine gradient information with calculation of integral image to represent the homogeneous texture feature. For the input image , the gradient map is calculated on the basis of the vertical gradient and the horizontal gradient . The gradient values in image boundaries are set as 0.

The integral image of the gradient map is generated according to

Then, we divide into blocks with the size of . The mean gradient value of block is determined by (5) and (6), where represents the coordinate of the center pixel in block and represents the sum value of gradients. Thus, the homogeneous texture feature vector is generated, when these elements , , are obtained by computing within all blocks. Subsequently, the homogenized -dimensional intensity and texture features are generated and then used in place of the original image information for the subsequent coarse segmentation.
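The role of the integral image here is that any block sum of gradients costs only four lookups, regardless of block size. The sketch below illustrates this in Python with NumPy under stated assumptions: the gradient map is taken as the magnitude of central differences (the source formula is not reproduced), and blocks are addressed by their top-left corner rather than by the center pixel used in (5) and (6). All function names are hypothetical.

```python
import numpy as np

def gradient_map(image):
    """Gradient magnitude from central differences (ASSUMED form);
    values on the image boundary are left at 0, as stated in the text."""
    img = np.asarray(image, dtype=np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[1:-1, 1:-1] = img[1:-1, 2:] - img[1:-1, :-2]   # horizontal gradient
    gy[1:-1, 1:-1] = img[2:, 1:-1] - img[:-2, 1:-1]   # vertical gradient
    return np.hypot(gx, gy)

def integral_image(values):
    """Integral image with a leading row and column of zeros."""
    ii = np.zeros((values.shape[0] + 1, values.shape[1] + 1))
    ii[1:, 1:] = values.cumsum(axis=0).cumsum(axis=1)
    return ii

def block_mean(ii, r, c, n):
    """Mean of the n-by-n block whose top-left pixel is (r, c): four lookups."""
    total = ii[r + n, c + n] - ii[r, c + n] - ii[r + n, c] + ii[r, c]
    return total / (n * n)
```

Because the integral image is computed once, the cost of the texture feature is dominated by the two cumulative sums, and the per-block cost does not grow with the block size.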

2.1.2. Local Threshold Segmentation (LTS)

Traditional global thresholding methods segment images according to the gray histogram using an optimal threshold. In remote sensing applications, however, the aforementioned disturbances in sea and land areas produce a wide dynamic range in gray scale, which makes conventional thresholding sensitive to complex backgrounds. To address this problem, local thresholding segmentation based on homogeneous features is proposed in this section. In contrast with the conventional Otsu’s method, this modified version remains robust to noise interference and to background disturbances. The specific steps are given below.

Step 1. For the previously obtained homogeneous feature vectors , set the binary threshold using Otsu’s method.

Step 2. Label each block initially based on using

Step 3. For one block region, we separate it uniformly into four subblocks. Hence, every subblock region with the size of is overlapped by four labeled blocks from different sides: upper left, upper right, lower left, and lower right.

Step 4. For each subblock, we accumulate all labels of overlapped blocks. If the total sum value is greater than 2, then we consider this half-block region as a land region; otherwise it is considered as a water region.
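Steps 1 to 4 can be sketched as follows in Python with NumPy. The sketch assumes that blocks overlap by half their width, which is what makes each interior subblock covered by exactly four blocks. It also assumes that feature values at or above the Otsu threshold mean land, and that border subblocks, which are covered by fewer than four blocks, are handled by replicating the nearest block labels. The labeling equation is not reproduced in the text, so the polarity and the border handling are illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Histogram-based Otsu threshold: maximizes the between-class variance."""
    values = np.asarray(values, dtype=np.float64).ravel()
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)                  # class-0 pixel count for each split
    w1 = w0[-1] - w0                      # class-1 pixel count for each split
    s0 = np.cumsum(hist * centers)
    valid = (w0 > 0) & (w1 > 0)
    if not valid.any():                   # constant input: no meaningful split
        return float(values[0])
    m0 = s0[valid] / w0[valid]
    m1 = (s0[-1] - s0[valid]) / w1[valid]
    between = w0[valid] * w1[valid] * (m0 - m1) ** 2
    return float(edges[1:][valid][np.argmax(between)])

def label_blocks(features):
    """Steps 1-2: threshold the block features (ASSUMED polarity: 1 = land)."""
    return (features >= otsu_threshold(features)).astype(np.uint8)

def vote_subblocks(block_labels):
    """Steps 3-4: each subblock sums the labels of its four covering blocks
    and is land when the sum exceeds 2. Border handling is an assumption."""
    padded = np.pad(block_labels.astype(np.int64), 1, mode="edge")
    votes = (padded[:-1, :-1] + padded[:-1, 1:] +
             padded[1:, :-1] + padded[1:, 1:])
    return (votes > 2).astype(np.uint8)

def to_pixel_mask(subblock_labels, half):
    """Expand each subblock label to a half-by-half pixel region."""
    return np.repeat(np.repeat(subblock_labels, half, axis=0), half, axis=1)
```

The vote means that an isolated block with the wrong label cannot flip a subblock on its own, since at least three of the four covering blocks must agree on land.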

2.1.3. Fusion and False Alarm Removal (F-FAR)

After the above subsection, the initial binary masks and are generated with the same size as the input image, respectively. To incorporate the complementary information from these two results, we employ a simple strategy. Regions identified with water labels, from both and , are considered as water regions as whole; the rest of areas are defined as land areas. We refer to this fusing result as .

On the one hand, cargo ships and scattered reefs in the sea are outliers that resemble land. On the other hand, green lands and the shadows of clouds, mountains, and buildings on land have properties similar to those of water. This local information is a major obstacle to distinguishing water from land. FAR is used to address these outliers. The detailed steps are given below.

Step 1 (remove false alarms in water areas). Perform morphological opening on with a disk-shaped structuring element whose size is defined in accordance with the size of the biggest merchant vessel as well as the image resolution. Specifically, for the 5-meter resolution images, the radius of is set as 40 according to (7), because it is widely accepted that there is no ship larger in size than 400 meters.

Step 2 (remove false alarms in land areas). Reverse the binary values generated by Step 1, and then process the reversed binary map in the same way as in Step 1. This step aims to counteract the “fading effects” on the coastal boundary caused by the opening operation of Step 1. Hence, the size of the disk-shaped structuring element used in this step should be the same as . Meanwhile, some of the false alarms on land are also removed.

Step 3 (inversion). Invert the binary values of the Step 2 result to obtain the coarse binary mask .

In contrast to a single morphological opening or closing operation, as used in similar applications, this practical and efficient FAR method preserves the sea-land boundary to the maximum extent while still removing false alarms. With the above steps, the coarse segmentation stage is complete.
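The fusion rule and the three FAR steps can be sketched in Python with NumPy as below. The masks are boolean arrays with True for land. Equation (7) is not reproduced in the text, so `disk_radius` assumes the radius is half the largest ship length expressed in pixels, which matches the stated example (400 m at 5 m per pixel gives 80 pixels, hence a radius of 40). The morphology is written out with plain array shifts for self-containment and ignores pixels outside the image. It is meant for small demonstrations, and all names are hypothetical.

```python
import numpy as np

def disk_radius(max_ship_m=400.0, resolution_m=5.0):
    """ASSUMED form of (7): half the largest ship length, in pixels."""
    return int(round(max_ship_m / (2.0 * resolution_m)))

def _neighbors(mask, radius):
    """Views of the mask shifted by every offset inside a disk."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="edge")
    return [padded[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]

def opening(mask, radius):
    """Binary opening: erosion followed by dilation with the same disk."""
    eroded = np.logical_and.reduce(_neighbors(mask, radius))
    return np.logical_or.reduce(_neighbors(eroded, radius))

def fuse(land_intensity, land_texture):
    """Water only where both masks say water; everything else is land."""
    return land_intensity | land_texture

def remove_false_alarms(land, radius):
    step1 = opening(land, radius)      # Step 1: drop small land blobs in water
    step2 = opening(~step1, radius)    # Step 2: drop small water blobs on land
    return ~step2                      # Step 3: invert back to a land mask
```

On a toy scene with a straight coastline, a small ship in the water, and a small lake on land, this sequence removes both outliers and returns the coastline unchanged, which is the behavior the two complementary openings are intended to provide.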

2.2. Fine Segmentation Stage

The sea and land areas can roughly be separated by the coarse stage. However, for the sake of computing efficiency and effectiveness, the block is the smallest processing unit used in the coarse stage, which leads to a roughly defined binary mask. To obtain smoother and more accurate boundaries, we use the fine segmentation stage. In this stage, we first extract the boundary area information from the original image with the help of . Then, the shoreline boundaries are refined by a thresholding method and synthesized with the rough binary mask. We observed that although traditional thresholding methods may lead to excessive false alarms, the boundaries they segment are precise, which is why we use thresholding to refine the boundaries. The details are as follows.

Step 1. Extract the sea-land boundary mask by computing “XOR” between and , which is acquired by eroding with .

Step 2. Dilate the detected boundary mask . Here, the radius size of the structuring element is .

Step 3. Extract according to , the corresponding gray level information, from the input image .

Step 4. Original Otsu’s method is applied to , and the result is in terms of .

Step 5. As a complementary process, a morphological opening step using is adopted upon to prevent the interferences caused by shoals and preserve the detailed harbor outlines to the great extent. Consequently, the refined result is generated.

Step 6. The final segmentation result is calibrated by on the basis of , which is elaborated by
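A compact sketch of Steps 1 to 6 in Python with NumPy is given below. Several details are not reproduced in the text and are filled in here as explicit assumptions: land is taken to be brighter than water when the Otsu threshold is applied, the opening of Step 5 is applied to the coarse mask with the thresholded band pasted in, and the calibration of Step 6 keeps the refined labels inside the boundary band and the coarse labels elsewhere. The three radii are free parameters of the sketch, and all names are hypothetical.

```python
import numpy as np

def _neighbors(mask, radius):
    """Views of the mask shifted by every offset inside a disk."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="edge")
    return [padded[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]

def erode(mask, radius):
    return np.logical_and.reduce(_neighbors(mask, radius))

def dilate(mask, radius):
    return np.logical_or.reduce(_neighbors(mask, radius))

def otsu_threshold(values, bins=256):
    """Histogram-based Otsu threshold: maximizes the between-class variance."""
    values = np.asarray(values, dtype=np.float64).ravel()
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)
    w1 = w0[-1] - w0
    s0 = np.cumsum(hist * centers)
    valid = (w0 > 0) & (w1 > 0)
    if not valid.any():                   # constant input: no meaningful split
        return float(values[0])
    m0 = s0[valid] / w0[valid]
    m1 = (s0[-1] - s0[valid]) / w1[valid]
    between = w0[valid] * w1[valid] * (m0 - m1) ** 2
    return float(edges[1:][valid][np.argmax(between)])

def refine(image, coarse_land, r_erode, r_dilate, r_open):
    boundary = coarse_land ^ erode(coarse_land, r_erode)    # Step 1: XOR
    band = dilate(boundary, r_dilate)                       # Step 2: widen band
    if not band.any():                                      # no shoreline found
        return coarse_land.copy()
    t = otsu_threshold(image[band])                         # Steps 3-4
    candidate = np.where(band, image >= t, coarse_land)     # ASSUMED: land brighter
    opened = dilate(erode(candidate, r_open), r_open)       # Step 5: opening
    return np.where(band, opened, coarse_land)              # Step 6 (ASSUMED rule)
```

Restricting the Otsu threshold to the band is the point of the design: the histogram then contains only pixels near the shoreline, so distant clouds, shadows, or ships cannot shift the threshold.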

3. Experimental Results and Analysis

In this section, two experiments are designed to evaluate our proposed method in terms of segmentation accuracy and computation time. All images were taken by the panchromatic SPOT5 satellite with 5 m spatial resolution and have a size of 4096 × 4096 pixels.

3.1. Parameter Selection

Based on the arguments above, the choice of block size may influence the overall segmentation performance, including the homogeneous feature extraction in the coarse stage and the sea-land boundary mask extraction in the fine stage. Therefore, to select the best block size, we conducted an experiment with dataset (a). This dataset contains 100 images covering various interfering conditions such as partial cloud cover, shadows cast by buildings, diverse vessels, and isles. Performance was evaluated with the widely used precision P, recall R, and F-measure, which are defined as in the literature from the ground truth and the binary mask obtained from the segmented result. Higher values of P, R, and F-measure indicate better results.
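For reference, the three measures can be computed from two binary masks as in the Python sketch below. It uses the standard definitions (precision is the overlap divided by the segmented area, recall is the overlap divided by the ground-truth area) and the balanced harmonic mean for the F-measure. The paper's own equations are not reproduced in the text, so the choice of the balanced form and the function name are assumptions.

```python
import numpy as np

def precision_recall_f(ground_truth, segmented):
    """Precision, recall, and balanced F-measure of a binary mask.
    Standard definitions are ASSUMED; the mask foreground is the positive class."""
    g = np.asarray(ground_truth, dtype=bool)
    s = np.asarray(segmented, dtype=bool)
    overlap = np.count_nonzero(g & s)
    p = overlap / np.count_nonzero(s) if s.any() else 0.0
    r = overlap / np.count_nonzero(g) if g.any() else 0.0
    f = 2.0 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f
```

For example, a segmented mask that covers half of the ground truth and adds an equal amount of false area yields a precision, recall, and F-measure of 0.5 each.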

Figure 2 shows the results when the block size varies from 8 to 512. The conclusions drawn from Figure 2 and the segmentation results are summarized here. First, although smaller blocks might be expected to give more robust recognition, they in fact result in a low recall. In most cases, a small block is the main reason for high false alarm rates when interferences are encountered, because the smaller the block, the more likely it is to be affected by contextual interferences. Second, a large block size leads to a low precision because of the weak capability for detail recognition. Last, considering the overall performance indicated by the F-measure, 256 to 320 is a good range of block sizes for optimal performance.

Figure 3 illustrates the computation time of each substage and of the overall process. In terms of computing efficiency, increasing the block size decreases the processing load of homogeneous feature extraction but increases the computational burden of the morphology-based sea-land boundary extraction in Section 2.2. As expected, in Figure 3 the coarse stage time decreases monotonically as the block size increases, while the fine stage time increases. Taken together, the overall computation time curve is bowl shaped, first decreasing and then increasing.

Based on the discussion above, we set the block size to 288 in this paper to guarantee both effectiveness and efficiency.

3.2. Comparison with State of the Art

This section discusses the experiments that test our method under different complex background conditions and compares its results with those of four approaches: the ME [1], T [6], and LBP [11] based algorithms and the method in [13], which is referred to as ATI for short in this paper. The former three are state-of-the-art methods designed from different perspectives. Although ATI was not originally designed for sea-land segmentation in remote sensing applications, its excellent robustness under various illumination conditions makes it a promising competitor, since handling temporal variations well is valuable in this field. To give a comprehensive comparison, a dataset (b) containing five classes was used. This dataset has a total of 200 panchromatic SPOT images that vary in the ratio of sea-land coverage and in background interference factors. Each class has 40 images. As shown in Figure 4, the main characteristics of the five classes, from class (i) to class (v), can be summarized as less land with calm sea, less land with tides and waves, half land with half water, less water with smooth land, and less water with rough land, respectively. Sample results are shown in Figure 5. The first row in Figure 5 shows the manually labeled ground truth. Rows 2 to 5 show the results of ME, T, LBP, and ATI, respectively, and the last row shows the results of our proposed algorithm.

Additionally, to give a quantitative analysis, we used the average values of the F-measure and the false alarm rate (FA), defined by (10), in the experiment. We observed that all the competing methods obtain high recall values; however, their FA values differ markedly. As shown in Table 1, our method achieves the highest F-measure and the lowest FA among all the methods.

Table 1 also lists the total execution time of the different algorithms on the abovementioned dataset (b). The experiments were run on an i5-3230M CPU with 8 GB RAM using MATLAB R2016a. For images with a size of 4096 × 4096 pixels, the average execution time of our proposed algorithm was 1.93 s. Our method takes slightly longer than the thresholding-based method but drastically less time than the other three methods, and its segmentation performance is better than that of all four. Overall, taking both performance and execution time into consideration, our proposed method outperforms the state of the art.

4. Conclusion

In this paper, a hierarchical sea-land segmentation method for panchromatic remote sensing imagery has been proposed. It provides a practical solution that can be transferred quickly from research to hardware implementation because it requires few computational resources. Our work is divided into two stages, each of which addresses a different scenario, instead of applying one whole procedure to every area. The coarse stage focuses on segmenting the main sea and land bodies and gives an initial result, and the subsequent fine stage refines the sea-land borders and produces the final result. Considering both performance and computing efficiency, our proposed algorithm outperforms the other four state-of-the-art methods.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


This work was supported by the Chang Jiang Scholars Program under Grant no. T2012122 and the Hundred Leading Talent Project of Beijing Science and Technology under Grant no. Z141101001514005.