Abstract

Urban planning depends strongly on information extracted from high-resolution satellite images, such as building and road features. Most available extraction techniques are supervised and require intensive manual labor to remove irrelevant features and to correct shapes and boundaries. In this paper, a new model is implemented to overcome the limitations of conventional urban feature extraction techniques, specifically for road networks. The major steps in the model are the enhancement of the image, the segmentation of the enhanced image, the application of morphological operators, and finally the extraction of the road network. The new model is more accurate in position and requires less effort and time than traditional supervised and semi-supervised urban extraction methods such as simple edge detection techniques or manual digitization. Experiments conducted on high-resolution satellite images demonstrate the accuracy and efficiency of the new model, as measured by the positional accuracy of the extracted road features compared to manually digitized ones, the number of detected road segments, and the percentage of completely closed and partially closed curves.

1. Introduction

Urban areas are the inhabited areas on Earth where the most dynamic changes can be observed. They are characterized by a trend of permanent expansion and growth. For the first time ever, more than 60% of Earth’s entire human population can be found in urban areas.

These urban areas are characterized by a diversity of complex features such as buildings, roads, and bridges. One important urban feature is the road network, which is extracted from up-to-date high-resolution images such as Ikonos, Quickbird, WorldView, and GeoEye. Road networks are used by many modern applications in many disciplines, such as car navigation systems, emergency rescue systems, and urban and environmental planning, which serve a large number of individuals, organizations, and institutes. There are two types of approaches for the extraction and identification of road network features in a remotely sensed image: manual and task-specific automated approaches [1]. In the past, most feature extraction was done by visual interpretation and manual digitizing from aerial photographs or satellite imagery. Although this is still the predominant approach to geospatial data production, it is labor-intensive and time-consuming. Successful development of feature extraction technologies for high-resolution satellite imagery can greatly increase its usability in updating geographic databases and in remote sensing applications. At present, there are many algorithms, semiautomatic [2, 3] and manual, for the extraction of road network features. Automated Feature Extraction (AFE) has been a long-term goal of geospatial data production workflows for the past 30 years; features extracted over small training sets can then be applied to larger areas, reducing the required extraction time by several orders of magnitude [4]. AFE applications use spectral and ancillary information together with feature characteristics such as spatial association, size, shape, texture, pattern, and shadow. Machine-learning algorithms and techniques serve to automate the feature recognition process [1].

Many researchers have concentrated on developing automated systems to detect urban areas. Karathanassi et al. [5] used building density information to classify residential regions, drawing on texture information and segmentation to extract the residential areas. Unfortunately, their method has several parameters that must be adjusted manually. Mathematical morphological operators are widely used in the extraction of urban features within AFE methods. Mura et al. [6], Bellens et al. [7], and Akcay and Aksoy [8] used mathematical morphological operators for automated extraction of multiscale urban features, such as buildings, shadows, roads, and other man-made objects. Benediktsson et al. [9] used mathematical morphological operations to extract structural information in order to detect urban area boundaries in satellite images; this method is based on a neural network architecture. Valero et al. [10] used the morphological operators of opening and closing to extract road features; their method is based on advanced directional morphological operators, namely, path opening and path closing [11]. The above research succeeded in extracting different urban features, but with limitations due to the complex content and structure of high-resolution satellite images: road width can vary considerably, and the images contain lane markings, vehicles, shadows cast by buildings and trees, and changes in surface material.

Although existing AFE techniques offer several advantages, such as saving hours of labor and reducing budgets for heads-up digitizing to create or maintain GIS data, some AFE methods require posttreatment or guidance to overcome their deficiencies. Other AFE methods, such as Feature Analyst (Feature Analyst for ArcMap http://www.esri.com/), require training and supervision before they can automate the process of urban feature extraction. Finally, others require the image to be converted to binary format and to be free of noise and other existing features. In addition, tools such as ArcScan (ArcScan for ArcMap http://www.esri.com/) require the scanned map to be large scale before digitizing, which means that they work for utility and cadastral maps only.

To overcome the above problems and to move toward a robust, unsupervised road feature extraction model, several methods are combined. The new model consists of several processing steps such that each step is a prerequisite for the success of the next one. The initial step is to improve the extraction of information by applying several image enhancement techniques. The next step uses an unsupervised classification technique to separate the road features from other features. Then morphological operators are used to improve the feature extraction. Finally, a reliable edge detection technique is used to extract the features, which are passed to a raster-to-vector conversion algorithm. These steps are explained in detail in the following sections.

The remainder of this paper is divided into two more sections and a conclusion. Section 2 describes the new model in detail, including all its components. Section 3 presents the experimental results of extracting road network features from high-resolution satellite images using the new model. Finally, the conclusion summarizes the completed steps of the model and outlines the planned ones. The reader is encouraged to consult the literature on the important uses of extracted urban features in planning and security.

2. The New Urban Feature Extraction Model

The complexity of high-resolution satellite images requires sophisticated techniques to handle the different tasks that are necessary to extract information such as urban objects. Every satellite image inherits problems such as radiometric distortion, atmospheric effects, and sensor malfunction, and these problems need to be eliminated. Noise, missing information, and atmospheric effects such as haze require preprocessing with restoration and filtering techniques to improve information extraction from the image. Enhancement of the image is therefore the first step in the model. The next step is to classify the image in a semisupervised or unsupervised way using existing or newly implemented segmentation/classification methods.

2.1. Classification/Segmentation of the Enhanced Image

There are several commercially known clustering algorithms, such as Fuzzy C-means (FCM) [12] and ISODATA [13]. Moreover, there are clustering methods which combine Artificial Neural Networks and evolutionary methods, such as Self-Organizing Maps (SOMs) [14] and the Hybrid Dynamic Genetic Algorithm (HDGA) [15]. The selection of the segmentation method is very important in eliminating any extra information which can represent nonroad network features. The literature has been investigated, and several approaches which combine two or more methods and which show high accuracy in satellite image segmentation are examined. These approaches are Self-Organizing Maps and Hybrid Genetic Algorithm (SOMs-HGA) [16], Fuzzy C-Means and Hybrid Dynamic Genetic Algorithm (FCM-HDGA) [17], and the combination of both SOMs and FCM [18]. In this paper the unsupervised SOMs-FCM approach is used because of its efficiency, in both speed and accuracy, in the segmentation of high-resolution images. In brief, the standard Fuzzy C-Means clustering algorithm works as follows. Given a set of data patterns $X = \{x_1, x_2, \ldots, x_N\}$, the FCM algorithm minimizes the weighted within-group sum of squared errors objective function $J_m$:

$$J_m = \sum_{k=1}^{N} \sum_{i=1}^{c} (u_{ik})^{m} \, d^{2}(x_k, v_i),$$

where $x_k$ is the $k$th $p$-dimensional data vector, $v_i$ is the prototype of the center of cluster $i$, $u_{ik}$ is the degree of membership of $x_k$ in the $i$th cluster, and $m$ is a weighting exponent on each fuzzy membership. The function $d(x_k, v_i)$ is a distance measure between object $x_k$ and cluster centre $v_i$, $N$ is the number of objects (pixels of an image), and $c$ is the number of clusters.
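The alternating update that minimizes this kind of objective can be sketched in a few lines. The following is a minimal illustration and not the implementation used in the experiments: it assumes one-dimensional grey-level pixels, the absolute difference as the distance measure, and hypothetical names (`fuzzy_c_means`, `memberships`, `init_centers`). The optional `init_centers` argument is where externally supplied starting centers could be passed in.

```python
import random


def memberships(values, centers, m):
    """Return u[k][i], the membership of values[k] in cluster i."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for x in values:
        dists = [abs(x - v) for v in centers]
        zeros = [d == 0.0 for d in dists]
        if any(zeros):
            # The pixel coincides with one or more centers: share membership among them.
            row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
        else:
            # Standard FCM update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
            row = [1.0 / sum((di / dj) ** exponent for dj in dists) for di in dists]
        table.append(row)
    return table


def fuzzy_c_means(values, c, m=2.0, init_centers=None, max_iter=100, tol=1e-6, seed=0):
    """Cluster 1-D values into c fuzzy clusters; returns (centers, memberships)."""
    if init_centers is None:
        # Assumption: start from c distinct observed values chosen at random.
        rng = random.Random(seed)
        centers = [float(v) for v in rng.sample(sorted(set(values)), c)]
    else:
        centers = [float(v) for v in init_centers]
    for _ in range(max_iter):
        u = memberships(values, centers, m)
        new_centers = []
        for i in range(c):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            if total > 0:
                # Center update: mean of the pixels weighted by u_ik^m.
                new_centers.append(sum(w * x for w, x in zip(weights, values)) / total)
            else:
                new_centers.append(centers[i])  # no pixel belongs to this cluster
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(values, centers, m)


centers, u = fuzzy_c_means([10, 12, 11, 200, 205, 198], c=2, init_centers=[0, 255])
print([round(v, 1) for v in centers])
```

With well-chosen starting centers the loop settles after a few iterations, which is the motivation for seeding FCM rather than starting it at random.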

The Self-Organizing Maps (SOMs) algorithm is used to modify FCM by providing it with the initial cluster centers, which reduces the time required for FCM to converge to a solution and optimizes and stabilizes the final solution. The new FCM objective function is given by the following equation:

$$J_m = \sum_{k=1}^{N} \sum_{i=1}^{c} (u_{ik})^{m} \, d^{2}(x_k, w_i),$$

where $w_i$ is the vector of weights representing the cluster centers obtained from the final SOMs iteration and multiplied by 255 (grey level values). One should notice that $c$ represents the number of neurons, defined as the size of the SOMs network (i.e., 16 × 16), and the final number of clusters (this varies if a threshold is used). In addition, every neuron has three weights for every band, which means that a distance must be computed between every pixel and center in a specific band.

SOMs-FCM is very efficient compared to some commercial segmentation/classification methods such as ISODATA (Iterative Self-Organizing Data Analysis Technique). The ISODATA algorithm adds some further refinements by splitting and merging clusters. Clusters are merged if either the number of pixels in a cluster is less than a certain threshold or the centers of two clusters are closer than a certain threshold. A cluster is split into two different clusters if the cluster standard deviation exceeds a predefined value and the number of members (pixels) is twice the threshold for the minimum number of members. The algorithm is similar to the $k$-means algorithm, with the distinct difference that the ISODATA algorithm allows for a varying number of clusters while the $k$-means algorithm assumes that the number of clusters is known a priori. The selected segmentation algorithm is preferred because existing clustering algorithms such as ISODATA are unstable: they provide different results each time the threshold value and the number of iterations are changed, even when the number of clusters is fixed.
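The ISODATA merge and split rules described above can be written as a small decision function. This is only a sketch of the rules as stated in the text, not a full ISODATA implementation: the function name, the dictionary layout, and the threshold names (`min_size`, `merge_dist`, `max_std`) are hypothetical, cluster centers are one-dimensional, and "twice the threshold" is interpreted as "at least twice".

```python
def isodata_actions(clusters, min_size, merge_dist, max_std):
    """Decide merge/split actions for a list of cluster summaries.

    Each cluster is a dict with 'center' (float), 'size' (pixel count)
    and 'std' (standard deviation of its members).
    """
    actions = []
    for i, cluster in enumerate(clusters):
        if cluster["size"] < min_size:
            # Too few pixels: the cluster should be merged into another one.
            actions.append(("merge_small", i))
        elif cluster["std"] > max_std and cluster["size"] >= 2 * min_size:
            # Spread too wide and enough members to form two valid clusters.
            actions.append(("split", i))
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            if abs(clusters[i]["center"] - clusters[j]["center"]) < merge_dist:
                # Centers closer than the threshold: merge the pair.
                actions.append(("merge_pair", i, j))
    return actions


clusters = [
    {"center": 10.0, "size": 5, "std": 1.0},
    {"center": 12.0, "size": 500, "std": 2.0},
    {"center": 120.0, "size": 900, "std": 40.0},
]
print(isodata_actions(clusters, min_size=50, merge_dist=5.0, max_std=25.0))
```

Because every decision depends on a user-chosen threshold, changing any of the three parameters can change the final number of clusters, which illustrates the sensitivity discussed above.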

The comparison between SOMs-FCM and ISODATA is explained in full in [18], where the superiority of SOMs-FCM over ISODATA is demonstrated empirically.

2.2. Enhancement of the Road Extraction Process Using Morphological Operators

After the segmentation of the image, several classes are obtained which represent urban and nonurban objects. The road network objects are selected and subjected to repeated, equal numbers of the morphological operators of opening and closing [11]. These operators are built on two other operators, erosion and dilation, which are fundamental to morphological image processing. Dilation is an operation that “grows” or “thickens” objects in a binary image. The specific manner and extent of this thickening is controlled by a shape referred to as a structuring element. Mathematically, dilation is defined in terms of set operations. The dilation of $A$ by $B$, denoted $A \oplus B$, is defined as

$$A \oplus B = \{ z \mid (\hat{B})_z \cap A \neq \emptyset \}.$$

Here $\emptyset$ is the empty set, and $B$ is the structuring element. In other words, the dilation of $A$ by $B$ is the set consisting of all the structuring element origin locations, where the reflected and translated $B$ overlaps at least some portion of $A$.

Erosion “shrinks” or “thins” objects in a binary image. As in dilation, the manner and extent of shrinking is controlled by the structuring element.

The mathematical definition of erosion is similar to that of dilation, except that the intersection, taken with the complement of $A$, is equal to the empty set. The erosion of $A$ by $B$, denoted $A \ominus B$, is defined as

$$A \ominus B = \{ z \mid (B)_z \cap A^{c} = \emptyset \}.$$

In other words, the erosion of $A$ by $B$ is the set of all structuring element origin locations, where the translated $B$ has no overlap with the background of $A$.

The opening operator is the combination of erosion followed by dilation such that

$$A \circ B = (A \ominus B) \oplus B.$$

On the other hand, closing is the combination of dilation followed by erosion such that

$$A \bullet B = (A \oplus B) \ominus B.$$
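The four operators can be illustrated directly from their set definitions. The sketch below is an assumption-laden toy rather than the code used in the experiments: a binary image is represented as a set of (row, column) coordinates of white pixels, the structuring element as a set of offsets from its origin, and all function names are hypothetical.

```python
def dilate(a, se):
    """Dilation: every position z where the reflected, translated SE hits A.

    For sets of coordinates this is the same as all sums a + b.
    """
    return {(r + dr, c + dc) for r, c in a for dr, dc in se}


def erode(a, se):
    """Erosion: every position z where the translated SE fits entirely in A."""
    candidates = {(r - dr, c - dc) for r, c in a for dr, dc in se}
    return {
        (r, c)
        for r, c in candidates
        if all((r + dr, c + dc) in a for dr, dc in se)
    }


def opening(a, se):
    """Erosion followed by dilation: removes objects smaller than the SE."""
    return dilate(erode(a, se), se)


def closing(a, se):
    """Dilation followed by erosion: fills gaps smaller than the SE."""
    return erode(dilate(a, se), se)


# A horizontal 3-pixel line as structuring element (origin in the middle).
line = {(0, -1), (0, 0), (0, 1)}

# A road segment with a one-pixel gap at column 2.
road_with_gap = {(0, 0), (0, 1), (0, 3), (0, 4)}
print(sorted(closing(road_with_gap, line)))

# A road segment plus an isolated noise pixel.
road_with_noise = {(0, 0), (0, 1), (0, 2), (5, 5)}
print(sorted(opening(road_with_noise, line)))
```

In this toy case closing bridges the gap in the segment and opening discards the isolated pixel, which are the two effects the model relies on when cleaning the road class.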

The morphological process requires the structuring element to be set before any operation takes place. There are many shapes and sizes for the structuring element, such as disk, ball, square, and line. The selection of the shape and size depends on the characteristics of the information in the image and the speed requirements of the process. In this paper several structuring element shapes have been tested to find the combination that best reduces irrelevant information and closes gaps (missing information) in every road segment shape.
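As a rough illustration of how structuring elements of different shapes and sizes can be generated, the following sketch builds disk, square, and line elements as sets of (row, column) offsets around the origin. The function names and size conventions are assumptions made for this example and do not correspond to any particular software package.

```python
def disk(radius):
    """All offsets within Euclidean distance `radius` of the origin."""
    span = range(-radius, radius + 1)
    return {(r, c) for r in span for c in span if r * r + c * c <= radius * radius}


def square(half_width):
    """A (2 * half_width + 1) x (2 * half_width + 1) square centered on the origin."""
    span = range(-half_width, half_width + 1)
    return {(r, c) for r in span for c in span}


def line(length, horizontal=True):
    """A straight line of `length` pixels, roughly centered on the origin."""
    start = -(length // 2)
    offsets = range(start, start + length)
    return {(0, k) for k in offsets} if horizontal else {(k, 0) for k in offsets}


print(len(disk(1)), len(square(1)), sorted(line(3)))
```

A larger element closes wider gaps but also removes finer detail and costs more time per operation, which is the trade-off between image characteristics and speed mentioned above.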

2.3. Edge Detection of the Road Segments

After the morphological operators are applied, the edges of the urban objects are delineated using an edge detection method. Edge detection is one of many efficient techniques in image segmentation. It significantly reduces the amount of data and filters out information that may be regarded as less relevant, while preserving the important structural properties of an image. There are many methods for edge detection, such as search-based methods, zero-crossing-based methods, active contours, and deformable models [19–22]. However, these methods are sensitive to noise that does not represent edges. In addition, they are inefficient in providing complete edge boundaries.

To overcome these problems, the Canny-Deriche [23] edge detection technique is used to extract the boundaries of different road networks from the processed high-resolution images.

Finally, the raster edges of the road networks are converted to vector format using any available software, such as the ArcMap toolbox (ArcMap is a trademark of ESRI, http://www.esri.com/). Figure 1 shows the hierarchical steps of the model. The number of opening and closing operators depends on how successful the previous steps are. Noise and missing information due to clouds or sensor malfunction may require more work before the morphological operators are applied.

3. Experimental Results

In order to demonstrate the efficiency of the model, QuickBird satellite images are used in the experiments. QuickBird has 5 bands: one panchromatic band with a spatial resolution of 0.6 m, and three true color bands and one near-infrared band with a spatial resolution of 2.4 m, covering spectral wavelengths between 0.45 μm and 0.85 μm. The images are projected using the Universal Transverse Mercator (UTM) projection, zone 36 north.

In the first experiment, Figure 2(a) is used to obtain road networks. The image, of size 573 × 540 pixels, represents a touristic, governmental, and financial area in the center of Beirut, the capital of Lebanon. The image was enhanced to remove noise and to improve the scene illumination. The enhancements include adjusting the contrast and brightness and applying a smoothing filter to remove noise. The same enhancements are applied in the second experiment, which uses a QuickBird image of size 450 × 450 pixels. This image represents a coastal area in Beirut, the “Ramleh El-Baida” (White Sand) area (Figure 2(b)). The images were chosen intentionally for their diversity of themes, which makes road extraction more complex.
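The text does not name the specific contrast adjustment or smoothing filter, so the sketch below simply assumes two common choices, a linear contrast stretch and a 3 × 3 mean filter, applied to a single-band image stored as a list of rows. Function names are hypothetical.

```python
def stretch_contrast(img, lo=0, hi=255):
    """Linearly rescale pixel values so they span the range [lo, hi]."""
    flat = [p for row in img for p in row]
    mn, mx = min(flat), max(flat)
    if mx == mn:
        return [row[:] for row in img]  # flat image: nothing to stretch
    return [[round(lo + (p - mn) * (hi - lo) / (mx - mn)) for p in row] for row in img]


def mean_filter3(img):
    """Smooth with a 3 x 3 mean filter; border windows are clipped to the image."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                img[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


print(stretch_contrast([[50, 100], [150, 200]]))
print(mean_filter3([[0, 0, 0], [0, 9, 0], [0, 0, 0]])[1][1])
```

The mean filter spreads an isolated bright pixel over its neighbourhood, which suppresses noise at the cost of slightly blurring road edges.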

After enhancement, the images are classified using the SOMs-FCM algorithm into several classes, including roads in red and shadow in blue (Figures 3(a) and 3(b)). Sometimes several road network classes exist; the number of road classes depends on the material (type of asphalt) used to pave the roads and on the surrounding environment (dusty or sandy area). Moreover, the classification indicates the status of the road: if the classification is homogeneous (one single class), then the road is made of the same material or does not have any structural problem. If the homogeneity is intermittent, this indicates that the quality of the pavement material is not uniform.

The road network class is extracted from the classified images and converted into a binary image, where black indicates the background and white indicates roads (Figures 4(a) and 4(b)). In the first image the shadow class is included (grey color) because the majority of the roads are covered by the shadows of the buildings.
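Extracting one or more classes from a classified image into a binary mask can be sketched as follows. The class labels and the function name are hypothetical; the example keeps both a road class and a shadow class, mirroring the case where roads are largely covered by building shadows.

```python
def to_binary_mask(class_map, keep_classes):
    """Return 1 for pixels whose class label is in keep_classes, else 0."""
    keep = set(keep_classes)
    return [[1 if label in keep else 0 for label in row] for row in class_map]


class_map = [
    ["road", "road", "shadow", "building"],
    ["vegetation", "road", "shadow", "building"],
]
print(to_binary_mask(class_map, ["road"]))
print(to_binary_mask(class_map, ["road", "shadow"]))
```

Keeping the shadow class recovers shaded road pixels, but it also admits shadows that do not lie on roads, so the later cleaning steps have more to remove.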

One can see clearly that some roads are not completely homogeneous. Some black gaps exist which represent different classes and which indicate the condition or the material properties of the asphalt and pavement.

In order to extract linear and continuous road features, the regions must be homogeneous and continuous; in other words, the partition of the image into road regions with respect to the background must satisfy the following conditions: each road region is a connected region, and each road region is homogeneous.
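One simple way to test the connectivity condition is to count the connected components of the binary road mask: a continuous road forms a single component, while a road broken by gaps forms several. The sketch below uses a breadth-first search with 8-connectivity; the choice of 8-connectivity and the function name are assumptions for illustration.

```python
from collections import deque


def connected_components(mask):
    """Return a list of components, each a list of (row, col) white pixels."""
    h, w = len(mask), len(mask[0])
    seen = set()
    components = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or (r, c) in seen:
                continue
            # Breadth-first search from an unvisited white pixel.
            queue = deque([(r, c)])
            seen.add((r, c))
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (
                            0 <= ny < h
                            and 0 <= nx < w
                            and mask[ny][nx]
                            and (ny, nx) not in seen
                        ):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            components.append(component)
    return components


broken_road = [[1, 1, 0, 1, 1]]
continuous_road = [[1, 1, 1, 1, 1]]
print(len(connected_components(broken_road)), len(connected_components(continuous_road)))
```

A drop in the number of components after a morphological pass is one possible indicator that gaps have been bridged.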

However, if these conditions are not satisfied, then morphological operators must be used to reduce the heterogeneity. The opening and closing operators, which are built on the dilation and erosion operators explained earlier, are applied several times, with the number of repetitions chosen by trial and error. The results of the morphological operators for both images in the two experiments are shown in Figures 4(c) and 4(d). The reader can see that several road segments are repaired by filling gaps with white pixels and removing isolated pixels, while other regions, which may represent parking lots or bare land, remain. After the conversion from raster to vector format in the final step, the user can eliminate irrelevant and false polygons.

In addition to dilation and erosion, the opening and closing operators depend on the type of the Structuring Element (SE), which can have different shapes such as disk, line, square, or diamond. Each parameter of the SE defines the way information is added or removed. In this paper, several lines, disks, and squares were used to add and remove pixels. Some measure is needed to decide whether to continue or halt the sequence of morphological operators. Currently this is done manually, testing several strategies in order to find a suitable combination of opening and closing operators; the process repeatedly applies similar and opposite morphological operators. Future research will seek an automated and optimal process for applying the morphological operators.

The final step is to use Canny-Deriche to detect edges and to filter out unneeded pixels according to a specific threshold. Figures 5(a) and 5(b) show the Canny edge detection results, and Figures 5(c) and 5(d) show the corresponding vector layers (overlaid on the original images) after removing redundant and false information.
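To make the role of the threshold concrete, the sketch below marks a pixel as an edge when its gradient magnitude reaches a threshold. It is deliberately much simpler than Canny-Deriche, which additionally uses recursive smoothing, non-maximum suppression, and hysteresis thresholding; this is a plain central-difference gradient detector with a hypothetical function name, shown only to illustrate thresholding.

```python
def gradient_edges(img, threshold):
    """Mark interior pixels whose central-difference gradient magnitude >= threshold."""
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = (img[r][c + 1] - img[r][c - 1]) / 2.0  # horizontal change
            gy = (img[r + 1][c] - img[r - 1][c]) / 2.0  # vertical change
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[r][c] = 1
    return edges


# A dark background (0) next to a bright road (255), split at column 2.
img = [[0, 0, 255, 255, 255] for _ in range(5)]
for row in gradient_edges(img, threshold=100):
    print(row)
```

Raising the threshold discards weak responses such as noise or faint texture, while lowering it keeps more candidate edge pixels that must be cleaned afterwards.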

The success of this new method depends on the use of a very high-resolution image with little conflict between the spectral signatures of features such as road asphalt, vegetation, and the shadows of other urban objects. In other words, it depends on the success of the segmentation approach.

In general, the method is still in an early phase, and the improvements planned for the future will bring it closer to being completely unsupervised.

In addition to the reduction in the time required to extract road network features compared to the manual process, another criterion is considered: the continuity or discontinuity of the lines representing road networks (percentage of closed lines). The accuracy is computed as the proportion of complete and partially complete extracted road features to the total number of extracted road segments:

$$A = \frac{C + P}{T} \times 100\%,$$

where $A$ stands for accuracy, $C$ is the number of complete road segments, $P$ is the number of partially complete road segments, and $T$ is the total number of road segments.

In the two experiments there are 32 road segments, 30 of which are complete or partially complete. The efficiency of extracting road segments is therefore 30/32, or approximately 93%.
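The accuracy measure above reduces to a one-line computation. In the sketch below the function and argument names are hypothetical; since only the combined count of complete and partially complete segments (30) is reported, the split between the two arguments in the example call is arbitrary and only their sum matters.

```python
def extraction_accuracy(n_complete, n_partial, n_total):
    """Percentage of complete plus partially complete segments among all segments."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * (n_complete + n_partial) / n_total


# Only the sum (30) is reported in the text; the 30/0 split here is a placeholder.
print(extraction_accuracy(30, 0, 32))  # 93.75
```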

To further investigate the accuracy of the model, a computer operator was asked to digitize the road segments following the normal digitizing process used daily by GIS experts. The digitized layer is overlaid on the road features extracted by our model (Figures 6(a) and 6(b)). Several characteristics of the road network extraction process can be compared, such as the spatial position of the roads and the false detection of road features. After taking a few samples (road segments) and measuring the distance between these samples and the accurate position of the road edges, we noticed a shift of several meters from the correct position of the road edges. This shift is caused by the limitations of the human visual system under low image contrast and by many other issues related to image quality, which in turn cause the operator to create incomplete segments; incomplete segments can also be an issue with the new model. Figure 7 shows the shift in meters from the real road edges for some selected segments extracted by the manual method and by the new model. As the graph indicates, the accuracy of the new model is almost 3 times better than that of manual digitization.

4. Conclusion

In this paper an attempt is made to improve the process of extracting road network features from high-resolution satellite images. The model consists of many methods and techniques. Each step has a specific role, such as reducing the effort that the manual process demands of the user or increasing the efficiency of road segment extraction. Although several issues in the model require tuning and adjustment, the method is still much better than the manual one, which requires more time and human effort and in turn a larger budget. The positional accuracy of the lines depicting the roads is much higher than with the manual method. The continuity of the lines is another criterion which indicates the efficiency of the model. The accuracy of the model is almost 3 times better than the manual one with respect to the spatial position of the segments, especially where the area is not affected by the shadows of buildings or covered by vegetation (high-quality satellite image). Human intervention in the new model is kept to a minimum. In addition, when the road networks are extracted by the model, their directions and positions are preserved accurately, which is not the case with manual digitization (supervised or semisupervised). The model can be used not only to extract road networks but also, through the classification phase, to help identify problems in the road network, such as segments that need rehabilitation due to severe weather or extensive use by heavy machinery.

The model can be improved in the future by including a technique that automatically optimizes the quality of all the components of the model, so that it becomes completely unsupervised.