Abstract

In the last decade, object-based image analysis (OBIA) has been extensively recognized as an effective classification method for very high spatial resolution images or integrated data from different sources. In this study, a two-stage optimization strategy for fuzzy object-based analysis using airborne LiDAR was proposed for urban road extraction. The method optimizes the two basic steps of OBIA, namely, segmentation and classification, to realize accurate land cover mapping and urban road extraction. This objective was achieved by selecting the optimum scale parameter to maximize class separability and the optimum shape and compactness parameters to optimize the final image segments. Class separability was maximized using the Bhattacharyya distance algorithm, whereas image segmentation was optimized using the Taguchi method. The proposed fuzzy rules were created based on integrated data and expert knowledge. Spectral, spatial, and texture features were used under fuzzy rules by implementing the particle swarm optimization technique. The proposed fuzzy rules were easy to implement and were transferable to other areas. An overall accuracy of 82% and a kappa index of agreement (KIA) of 0.79 were achieved on the studied area when results were compared with reference objects created via manual digitization in a geographic information system. The accuracy of road extraction using the developed fuzzy rules was 0.76 (producer), 0.85 (user), and 0.72 (KIA). Meanwhile, overall accuracy was decreased by approximately 6% when the rules were applied on a test site. A KIA of 0.70 was achieved on the test site using the same rules without any changes. The accuracy of the extracted urban roads from the test site was 0.72 (KIA), which decreased to approximately 0.16. Spatial information (i.e., elongation) and intensity from LiDAR were the most interesting properties for urban road extraction. The proposed method can be applied to a wide range of real applications through remote sensing by transferring object-based rules to other areas using optimization techniques.

1. Introduction

Urban road network information is essential for several geospatial and geographic modeling applications such as flood simulations [1], environmental studies [2], and traffic accident analysis and prevention [3, 4]. Information on urban roads can also help decision makers make strategic and effective decisions, which are important for sustainable urban planning and management. LiDAR is one of the best approaches to gather 3D terrain information [5]. Recently, LiDAR data have been used for road network extraction and road geometric modeling. LiDAR provides highly accurate elevation data that can be used to extract 3D models of different types of objects, along with their geometric characteristics. Additional information, such as image intensity and aerial orthophotos, collected by LiDAR systems provides considerable opportunities for geometric road modeling.

In the last decade, object-based image analysis (OBIA) has been recognized as one of the most effective classification techniques for very high spatial resolution images or integrated data from different sources (e.g., aerial orthophotos and LiDAR data) [6]. Compared with pixel-based classification techniques, OBIA provides additional information that can be used to improve the discrimination of land cover classes [7]. OBIA has two main steps. First, image segmentation is the process of creating nonoverlapping homogeneous objects from image pixels based on their spectral, spatial, and texture information. In this step, OBIA generally uses the multiresolution segmentation (MRS) algorithm [8]. The main advantage of OBIA is that it processes satellite images in a manner similar to that of the human brain; that is, this technique identifies landscape features as objects [9]. The size and shape of image objects are controlled by three main parameters, namely, scale, shape, and compactness, as in the MRS algorithm [6]. Scale controls the size of image objects, whereas the other two parameters control the shape of image objects in terms of geometry and smoothness. The scale parameter defines the allowed maximum heterogeneity of grid cell values within image objects. Small-scale values create small image objects, which can be used to classify specific image objects with certain rule sets. By contrast, large values of the scale parameter generate relatively large image objects, which typically lead to the loss of detailed information. Therefore, an appropriate scale parameter is critical for accurate classification and information extraction from satellite images. The shape and compactness parameters are also essential. The selection of their values directly affects the final image objects and the classification results.

Selecting the values of segmentation parameters in OBIA classification approach is achieved via a method called trial-and-error optimization [10]. This visual estimation method is subjective, time-consuming, and labor intensive. Recent segmentation parameter selection methods are based on segmentation accuracy measures by comparing machine-created segments with hand-digitized ones. Considering these advances in object-based analysis, the development of transferable rule sets remains a challenge mainly because creating appropriate image segments and identifying multiple optimal scales for extraction for a particular feature are difficult tasks. Knowledge-based feature extraction can only be applied if the features of interest are segmented properly, which require selecting multiple scales. This technique will allow the use of spatial information. As the shape and boundary of the considered features, texture and contextual information for feature extraction are described efficiently by the segmentation process. Therefore, appropriate segmentation by selecting the optimum multiple scales in OBIA is a critical task for accurate feature extraction and for developing knowledge-based and transferable rule sets. In this study, a new strategy for selecting the optimum segmentation parameters by optimizing both image segments and class separability is proposed.

1.1. Related Studies

Among the methods used to extract features from remote sensing data, fuzzy object-based methods are becoming increasingly popular. Compared with that of other approaches, the accuracy of fuzzy-based classification is less sensitive to thresholds [11]. In addition, fuzzy logic is considered an appropriate classification method for urban feature extraction because it models uncertainties among the considered classes; these uncertainties are typically found in complex urban areas [12]. Jabari and Zhang [11] developed fuzzy rules for delineating road networks from GeoEye-1 and QuickBird images in an urban environment; the results showed that the roads could be extracted at 85% user accuracy. Spectral, geometric, and neighborhood information were used for road extraction. Grote et al. [13] developed a region-based method for extracting road networks from LiDAR data in suburban areas. Their method used very high-resolution (VHR) color infrared orthophotos and a digital surface model (DSM). Intensity, normalized difference vegetation index, area and length, width, elongation and convexity, and height from DSM were used to extract roads using the approach proposed by the authors. The accuracy assessment demonstrated that correctness was approximately 90% for extraction with DSM. Sebari and He [14] developed fuzzy-based rules for OBIA classification approach to extract roads in an urban environment. They used spectral and shape (i.e., elongation and size) information in the rule. A correctness of 81% was achieved.

In addition, Li et al. [15] provided a hierarchical method for urban road extraction. Their method encompassed two main steps. The first step involved obtaining the road region of interest from a VHR image, whereas the second step involved hierarchically representing this road region of interest in a binary partition tree and extracting roads at hierarchical levels. Geometric features (i.e., compactness and elongation) and structural features (i.e., histograms and morphological profiles) were used to guide the region merging of the binary partition tree. The method was tested on two different data sets, and correctness rates of 85% and 57% were achieved for the two data sets. The low accuracy of the second study area was attributed to the dense urban areas. Hamedianfar and Shafri [16] developed fuzzy-based parameters for object-oriented classification for various impervious surface and roofing materials. The authors proposed a set of rules for road extraction from WorldView-2 images by combining spectral and spatial features. A user accuracy of 92% was achieved following their road class rules.

The literature review indicated that previous studies used fuzzy object-based analysis without optimizing segmentation parameters. As mentioned earlier, the optimum values of the scale, shape, and compactness parameters are critical for accurate classification and feature extraction. Although several studies have provided clear guidelines for selecting the optimum values of segmentation parameters, they disregarded class separability distance when selecting the optimum segmentation parameters. In the present study, a novel two-stage optimization strategy for OBIA is presented and tested by using it to extract road networks from LiDAR data combined with high-resolution orthophotos. In this strategy, the Taguchi method is used to optimize the segmentation parameters, whereas the Bhattacharyya distance is used to optimize class separability.

1.2. Study Areas

The study areas were selected from some regions of the Universiti Putra Malaysia campus located in Selangor State, Malaysia (Figure 1). Two pilot sites were selected. The first site was for method development, whereas the second site was for method transferability testing. Geographically, the first study area is located in a region that extends from the upper left longitude of 3°00′14.48′N and latitude of 101°42′14.7E to the lower right longitude of 3°00′00.7N and latitude of 101°42′44.12′E of the WGS84 coordinate system. Meanwhile, the second study area is located 4 km away from and to the right side of the first study area. The study and test areas are characterized by a mixture of urban, asphalt road, and vegetation features. Buildings vary in terms of roofing material, shape, and size; asphalt roads vary in width and condition.

2. Methodology

In this study, a two-stage optimization strategy for fuzzy object-based analysis was proposed to extract urban roads from airborne LiDAR data. The two-stage optimization technique was used to optimize the image segmentation and image classification processes simultaneously by selecting the optimum scale parameter that maximized class separability and the optimum shape and compactness parameters that optimized the final image segments. The method proposed in this study was developed through the following steps (Figure 2). First, LiDAR point clouds and aerial orthophotos were preprocessed and analyzed to prepare the input data for urban road extraction. Second, the scale parameter was optimized using the Bhattacharyya distance algorithm [17] and the best ranges of the scale parameters were selected for further analysis. Afterwards, the shape and compactness parameters were optimized using the Taguchi method [18]. After the three parameters of the segmentation algorithm were optimized and selected, the input image was segmented using the optimum values of scale, shape, and compactness. Finally, fuzzy rules were developed using the decision tree and fuzzy -mean algorithms, which enabled the classification of image segments and the extraction of urban roads.

2.1. LiDAR Data Acquisition and Processing

The LiDAR data used in this study were collected on March 8, 2013, using Riegl LM Q5600 and Hassleblad 39 Mp camera. The device has a spatial resolution of 13 cm, a laser-scanning angle of 60°, and a camera angle of approximately −45°. In addition, the posting density of the LiDAR data is 3-4 pts/m2.

The collected LiDAR point clouds (i.e., LAS file) were converted into vector points in ArcGIS 10.3 software. The vector points created from the LiDAR point clouds had 2D coordinates (i.e., and ) and height information as attribute data. To convert these vector points into raster format, the first outliers from the point clouds were removed. In this research, outliers were removed using “Locate Outliers,” a tool in ArcGIS 10.3 software. This tool identifies anomalous elevation measurements from terrain, TIN, or LAS data sets that exceed a defined range of elevation values or have slope characteristics inconsistent with the surrounding surface (ESRI, 2016). Afterwards, the filtered point clouds were converted into a raster surface at 0.13 cm spatial resolution using a TIN-based interpolation method [4]. The produced raster surface represented the DSM (Figure 3) with a spatial resolution the same as the orthophotos (0.13 cm). This allowed using the orthophoto and the produced DSM for creating the image objects.

The point clouds were then filtered to ground and nonground points using multiscale curvature classification (MCC) [19]. MCC is an iterative multiscale algorithm for classifying LiDAR returns into ground and nonground [19]. The MCC algorithm was developed at the Moscow Forestry Sciences Laboratory of the United States Forest Service Rocky Mountain Research Station. It integrates curvature filtering with a scale component and variable curvature tolerance. During this stage, a surface is interpolated at different resolutions using the thin-plate spline method [19], and points are classified based on a progressive curvature threshold parameter, which increases as resolution coarsens to compensate for the slope effect because data are generalized. The same interpolation technique was then used to generate the digital elevation models (DEMs) for the study and test areas (Figure 4). Afterwards, the DSM and DEM rasters were used to produce the nDSM raster by subtracting DEM from DSM. In addition, the intensity raster was generated by interpolating the ground point clouds with the related intensity attribute information of the point clouds (Figure 5).

Meanwhile, the aerial orthophoto was first corrected geometrically to fit exactly into the LiDAR point clouds using the intensity raster (Figure 6). Finally, the LiDAR-derived DEM and DSM, intensity, nDSM, and aerial orthophoto were used as a subset based on the boundary of the study area to prepare the final input data for method development and urban road extraction.

2.2. Feature Subset Selection via Particle Swarm Optimization (PSO)

In object-based or knowledge modeling, a large number of features are present in the data sets; however, not all of these features are useful for classifying and extracting information. Therefore, the first step in object-based analysis or knowledge modeling is selecting features from those available in the data sets. Useful features from the data sets for a particular object extraction process can be selected using optimization techniques. Such techniques aim to select a small number of relevant features to achieve similar or better classification results than using all the available features. The final features to be selected should be relevant. Including irrelevant and redundant features in the analysis may produce poor classification results because of the course of dimensionality. Considering the aforementioned reason, only relevant features should be selected for accurate modeling.

Several optimization methods, such as greedy search-based sequential forward selection and sequential backward selection, can be used for feature selection. However, these techniques suffer from various problems such as stagnation in local optima and high computational cost [20]. An efficient global search technique, such as evolutionary computation (EC), is required to address feature selection problems more effectively. PSO is one of the EC techniques based on swarm intelligence [20]. Compared with other techniques, such as genetic algorithm and simulated annealing, PSO is computationally less costly and converges more rapidly [20]. Therefore, this technique was used in this study.

PSO was proposed by Kennedy and Eberhart in 1995 [21, 22]. This technique is motivated by social behavior, such as birds flocking and fish schooling. In PSO, knowledge is optimized through social interaction in the population, in which thinking is not only personal but also social. The useful features from the available features of eCognition software were selected using PSO optimization implemented via MATLAB. PSO was used to minimize the classification error rate. The fitness function is given in (1), which is used to minimize the classification error rate obtained by the selected features during the evolutionary training process and the number of selected features [20, 23].where is a constant number (the weight of the classification error rate obtained from the training data) and , is the classification error rate obtained from the selected feature subset and the training subset data, is the classification error rate obtained from the selected feature subset and the testing subset data, is a constant number (the weight of the number of selected features), stands for the number of selected features, represents the total number of features available for classification, is the classification error rate obtained from the selected features, and is the classification error rate obtained from all the available features. The error rate of the classification results can be calculated using (2) [23, 24], as follows:where , , , and stand for true positives, true negatives, false positives, and false negatives, respectively.

2.3. Knowledge Modeling and Creation of Fuzzy Rule Sets

Knowledge modeling is the process used to model the knowledge used by an expert to identify land cover/land use objects in aerial photos or satellite images. In this study, spectral, spatial, and texture properties were used to describe land cover/land use objects. For each property, a set of quantifiable attributes was determined using several mathematical formulations. The useful features (quantifiable attributes) selected by the PSO technique were used to create fuzzy rule sets. The details of these attributes are provided in Table 1.

Pure data-driven rules developed for object-based analysis are generally difficult to transfer [24]. Meanwhile, fuzzy logic allows the formulation of knowledge given in a natural language with vague and imprecise expressions [14]. Fuzzy logic also avoids the use of hard boundaries between the considered classes and thus allows easy and efficient selection of thresholds [14]. That is, classification accuracy is less sensitive to thresholds when fuzzy logic is used [11]. In addition, fuzzy logic is considered an appropriate technique for urban feature extraction because it models uncertainties among the considered classes; these uncertainties are typically found in urban areas [12]. The aforementioned reasons justify the use of fuzzy logic in object-based analysis for urban road extraction and to generate thematic maps for the study areas.

In general, fuzzy rules can be developed in two ways. First, fuzzy rules can be developed using training data and clustering algorithms. Second, they can be developed using expert knowledge on object identification. Both approaches have advantages and disadvantages. For example, data-driven fuzzy rules are highly sensitive to the selection of training data, whereas the expert knowledge-based approach is relatively difficult to perform and knowledge may be difficult to convert into quantifiable attributes. In this study, both methods were integrated for fuzzy rule-based creation because rules were initially generated based on training data using two algorithms (decision tree and fuzzy -mean), and then, the rules were assessed and modified based on expert knowledge. This manner of assessment and modification ensures that the rules are logical and can be potentially transferred to other areas. Such approach also ensures that quantifiable attributes are correctly assigned for each class.

First, a small number of training data (10 objects per class) was generated for the considered classes, including urban roads. A small number of training data were selected to avoid including incorrect samples and to ensure that the rules satisfactorily defined the objects. Afterwards, a decision tree algorithm was applied on the training data, and a set of crisp rules was generated. In addition, the fuzzy -mean algorithm was applied in the same training data set to generate fuzzy rules that could recognize the considered classes. Utilizing expert knowledge (i.e., that of the authors of this article in this case), the two sets of crisp and fuzzy rules were analyzed. Finally, by understanding the importance of each quantifiable attribute for each class and by determining the number of attributes required to recognize a particular class, the final version of the fuzzy rules was formulated (Table 2).

2.4. Selection of the Optimum Range Values for the Scale Parameter

In OBIA, the accuracy levels of feature extraction and object classification are generally controlled through segmentation quality [8]. High-quality segmentation (perfectly coinciding with the reference objects) can be achieved by selecting the optimum values for its user-defined parameters (i.e., scale, shape, and compactness). However, the value of the scale parameter directly affects the accuracy of object classification. Therefore, before optimizing the scale parameter for segmentation, selecting a range of scale values that can achieve high classification accuracy for image objects is necessary. The optimum range values of the scale parameter can be selected by choosing the values that maximize class separability distance. Figure 7 illustrates an experiment that analyzes the effect of the scale parameter on classification accuracy. Scale parameters ranging from 10 to 150 were analyzed for the class separability distance, which was measured using the Bhattacharyya distance algorithm. Figure 7 shows three peaks in the graph, which indicate that the scale parameters that can achieve the best classification results on the data set and the training samples are used.

Image object hierarchies that incorporate spatial information describing each of these situations separately frequently provide better classification results than a spatial representation constrained within a single operative scale. Although multiscale representations exhibit advantages over single-scale representations, a corresponding increase in data processing and storage demands occurs. Therefore, the minimum number and configuration of object scales (hierarchical levels) that maximize the potential final land cover classification accuracy when they are combined should be identified while minimizing excess data processing [25]. This analysis also determines the best number of segmentation levels that can be used to classify land cover/land use classes in the data set. Therefore, three segmentation levels with different scale values were used. In this analysis, the scale parameters for the three segmentation levels are determined within a range of values. The exact scale parameter for each segmentation level should be determined by applying another optimization technique. The selection of the exact scale parameters for the three segmentation levels is discussed in the next section.

2.5. Selection of the Optimum Values for the Scale, Shape, and Compactness Parameters

After the range values for the scale parameter were selected, the exact values for the scale, shape, and compactness parameters should be optimized. As mentioned earlier, selecting these parameters using the trial-and-error method is time-consuming and labor intensive. Consequently, Taguchi optimization was used in this study.

The first step in implementing the Taguchi method is to design an orthogonal array to test a set of choices [26]. An orthogonal array is related to the situation in which the columns for independent variables are “orthogonal” to one another. These tables provide an easy and consistent design for experiments. Selecting the orthogonal array depends on the number of levels and the number of parameters [25] (Table 3). This technique is frequently used in cases in which various levels of parameters are available. For example, several combinations, which require a significant amount of time for testing, can be used. However, by using the Taguchi technique, choices will be considerably reduced to a small number of experiments.

Taguchi optimization can be implemented by conducting the following steps. First, the objective of the process should be determined. This step involves defining the possible values of a particular parameter for the process. Second, the parameters that can influence the process are defined. These parameters exhibit variable values that can affect accuracy or performance; thus, the level should be defined by the user depending on the effect of a parameter on the process. In this study, the range values of the scale parameter were determined using the Bhattacharyya distance algorithm, as mentioned earlier, whereas the levels of the shape and compactness parameters were determined through their minimum () and maximum () values. Subsequently, the experiments were performed after the appropriate orthogonal array was designed. The effect of each parameter on performance was then measured.

In the next step, the plateau objective function (POF) was measured for each test to examine segmentation quality by using each testing combination and to determine the optimum segmentation parameters (Table 4). POF is the combination of Moran’s index and a variance indicator. Then, signal-to-noise ratio (SNR) was measured to assess the results of the segmentation tests. A high SNR indicates high segmentation accuracy. Equation (3) is used to calculate SNR, as follows:where is the number of repetitions under similar test situations ( in this study) and denotes the POF values obtained from each segmentation test. The SNR table was then completed, and the optimum condition was determined.

Selecting image segmentation parameters is complicated because multiple spatial scales can convey important information on physical characteristics, which define a specific class or feature [27]. A single image segmentation configuration may not be optimal to distinguish all the features in a typical image. Consequently, optimization techniques were used in this study to select the appropriate combination of user-defined parameters for segmentation. The optimization techniques used also determined the best number of segmentation levels required to classify image objects and to extract the considered information accurately. Once the required number of segmentation levels is identified with their associated user-defined parameters, each segmentation level should be assigned to its corresponding land cover/land use classes. In this research, the visual interpretation of the segmented images was used to determine the appropriate segmentation level for each class. According to the conducted analysis, shadow and urban tree classes were assigned to segmentation level 1; urban road and building classes were assigned to segmentation level 2; water body, grassland, and bare land classes were assigned to segmentation level 3 (Figure 8).

3. Results and Discussion

In this section, the results obtained from the study on a two-stage optimization for fuzzy object-based analysis using airborne LiDAR and orthophotos for urban road extraction are presented and discussed. The selection of the segmentation parameters is first described. Then, the results obtained from the classification of image objects based on the fuzzy rules proposed in this study are provided and explained.

3.1. Results of the Selection of the Segmentation Parameters

User-defined MRS parameters were optimized using the Bhattacharyya distance and Taguchi approaches. Figure 9 shows the SNRs for each segmentation parameter and level. For segmentation level 1, the highest SNR value of the scale parameter suggests that the best value for the scale parameter in segmentation level 1 is 40 (scale level 3). Similarly, Figure 9 indicates that the shape value of 0.5 (shape level 3) is the best for segmentation level 1. Meanwhile, the best value for compactness in segmentation level 1 is 0.3 (compactness level 2).

For segmentation level 2, the analysis shows that the best scale value is 115 (scale level 1). Meanwhile, the best values for shape and compactness are both 0.1 (shape level 1 and compactness level 1). Furthermore, the analysis indicates that the appropriate scale value for segmentation level 3 is 0.5 (scale level 2), whereas the best values for shape and compactness are both 0.1 (shape level 1 and compactness level 1).

In the MRS algorithm, the scale parameter defines the size and heterogeneity of an object. The scale value of 40 was selected using the two-stage optimization method for segmentation level 1, which included two classes (i.e., urban tree and shadow), because this threshold, combined with the other two parameters, was the best for extracting the considered land cover classes. Higher scale values cause oversegmentation in the shadow class, whereas lower scale values result in undersegmentation in the urban tree class. Meanwhile, the two-stage optimization applied in this research suggested a shape parameter value of 0.5 for segmentation level 1. The sum of the weight values of the shape and color parameters is 1. High shape values are recommended for regular objects, whereas small-scale values are suggested for objects with obvious spectral characteristics. In segmentation level 1, urban trees have nearly similar shapes (i.e., compact), whereas shadows have different shapes because they are formed by urban trees and tall buildings. This result led to the selection of the intermediate shape value of 0.5 to balance between shape and color properties. In addition, the compactness parameter refers to the compactness degree in each segmented object; when weight is high, the internal object is compact. Meanwhile, the smooth parameter () represents the smoothness of the segmented object boundary. In this study, a low compactness value was selected for the urban tree and shadow classes. Therefore, the study area has noncompact shadows from tall trees.

Furthermore, a scale value of 110 for segmentation level 2 was selected for the two-stage optimization technique applied in this study (Table 3). This segmentation level is considered suitable for extracting building and road features depending on the scale parameter as determined by the Bhattacharyya distance and Taguchi approaches. However, low values for both shape and compactness were selected , which indicated uncertainties in road boundary detection, because buildings were compact and had smooth shapes. Accordingly, high shape and compactness values were selected. Optimization was applied to select suitable parameters for both classes (building and road) simultaneously, and thus, low values were selected because of the effect of noncompact and nonsmooth roads.

Water bodies, bare lands, and grasslands were assigned to segmentation level 3 based on the visual interpretation of the aerial orthophoto. They have relatively large areas and are mostly homogenous. In terms of segmentation parameters, the scale value of 140 was determined as the best threshold to separate the three classes and to describe the objects. Meanwhile, the shape and compactness values of 0.1 were suitable using the proposed optimization method. The low shape and compactness values can be justified because the three classes have noncompact and rough shapes. This condition is attributed to the presence of urban trees within grasslands and bare lands (Figure 6(a)).

3.2. Segmentation Assessment

The assessment of the segmentation results and the optimization of the segmentation parameters may improve classification results and final road networks. In addition, if image objects are created accurately, then classification rules can be more efficiently optimized and transferred to other study areas. Therefore, although the segmentation parameters were selected using optimization techniques, the segmentation results should still be assessed. In this study, the segmentation results were evaluated by comparing the extracted objects with the reference objects created via manual digitization in a geographic information system (GIS). The geometric quality of segmentation was evaluated using two indices: area ratio () and position error (). Area ratio yields accurate information on the percentage of the area extracted for each object, which is defined as follows [14]:where is the area.

Position error is related to the mean distance between the extracted objects and their corresponding objects in reference data. This error is determined according to the characteristic points in the two data. The characteristic points for buildings correspond to building corners, whereas those for roads are selected along road axes. Position error is determined by the following equation [14]:where is the number of characteristic points; (, ) correspond to the coordinates of an extracted point; and (, ) are the coordinates of the corresponding characteristic point on the reference layer.

3.3. Classification Assessment

Classification results were assessed in eCognition software using its accuracy assessment tools. Results are shown in Tables 5 and 6. The overall accuracy of the produced thematic map was first calculated. Then, producer and user accuracies were calculated for each class. Finally, the kappa index of agreement (KIA) was calculated for the whole classified map and for each class. KIA is calculated using the following equation [28]:where is the number of samples in the accurately predicted class, is the number that will be predicted accurately via random class selection, and is the total number of samples in that class.

3.4. Results of Image Classification

Figure 12 shows the classification results on the study area used for model development. The quality of segmentation used to generate the classification map is presented in Figures 10 and 11. The average area ratio of all the classes was 80.80%. This finding indicates that most of the extracted objects are accurately matched with the reference objects. Meanwhile, the minimum and maximum area ratios were 62% (for grass) and 99% (for water), respectively. The low accuracy of grass was ascribed to some of the grass objects being merged together after the image was segmented. The highest accuracy was obtained for the water class because only a few water objects were present, and all of these were accurately extracted. The spectral information of the aerial orthophoto enabled the accurate extraction of water and the separation of water from surrounding nonwater objects (i.e., bare lands and grasslands). In addition, the minimum and maximum position errors in the segmented objects were 1.02 (water class) and 3.8 (building class), respectively.

In evaluating the classification results, an overall accuracy of 82% and a KIA of 0.79 were achieved when the results were compared with the reference objects created via manual digitization in GIS. The producer, user, and KIA accuracies for each class and the detailed confusion matrix are presented in Tables 5 and 6. The KIA of the building class was 0.82. This value indicated that most of the buildings were extracted accurately and separated from other classes. However, misclassifications between buildings and vehicles on the roads typically occur because cars are rectangular and elevated objects (approximately 1.2 m). These characteristics, which are similar to most of the buildings in the study area, lead to a decrease in the KIA of the building class. To address this problem in the future, spectral information should be considered or more advanced rules should be developed using contextual and geometric information. Shadow objects were extracted at a KIA accuracy of 0.88. This high accuracy was achieved because shadow objects were dark and differed from other natural classes (i.e., water, bare land, and grassland). They were distinguished from the building class because of the difference in height and were differentiated from roads because of their spatial characteristics (roads were elongated, whereas shadows had lower values in this property). Shadows mostly occur because of tall buildings and urban trees. Water bodies were extracted at a KIA accuracy of 0.80. Spectral information, such as the red band of the aerial orthophoto as well as intensity and height information from LiDAR data, is the most important feature used to differentiate water bodies from other classes in the study area. Accuracy was reduced to 0.80 because one of the grass objects was classified as water given the similar texture and spectral information between water and grass objects. The rules developed for the water class indicate that spatial information is not as important as spectral information because water bodies have spatial characteristics similar to those of some grassland and bare land classes. In addition, the shape of water bodies is not unique and depends on the study area. Therefore, developing rules that use only spectral, texture, and height information is important.

Urban trees are easily differentiated from other classes because of the difference in height; they are distinguished from buildings through intensity and rectangular fit properties. Urban trees were extracted at a KIA accuracy of 0.88. Given that urban trees are typically located within bare lands or grasslands, they are differentiated from surrounding objects with different characteristics. Texture (homogeneity) information and intensity information are also useful in differentiating urban trees from some of buildings with specific roofing materials. The results demonstrated that the proposed fuzzy rules enabled the extraction of bare lands and grasslands at an average KIA accuracy of 0.64. After the interpretation of the produced thematic map and the statistical information of the confusion matrix, the low accuracy was ascribed to some of the objects merging during segmentation. This situation led to the loss of the characteristics and properties used to define these classes in rule development. Misclassifications mostly occurred between bare lands and roads, grasslands and roads, and bare and grasslands because some of the small bare land and grassland objects exhibited similar spatial characteristics with roads (i.e., the elongation property of roads was the same as those of small grassland and bare land objects). The interpretation of the results indicated that additional spectral and contextual information would be necessary to improve the accuracy of feature extraction.

3.5. Results of Urban Road Detection

Figure 13 shows the urban roads extracted using the proposed fuzzy rules. Height, spectral, spatial, and texture information were used for road extraction. The accuracy of road extraction using the developed fuzzy rules was 0.76 (producer), 0.85 (user), and 0.72 (KIA). Spatial information (i.e., elongation) and intensity from LiDAR data were the most interesting properties for their extraction. Several omissions were attributed to the heterogeneity of the surface of roads and the shadows projected onto roads by trees, buildings, or vehicles. The most significant problems of road extraction were the presence of trees along the roads and the undetected boundaries of roads. Even spatial information is not useful in solving these problems because these problems occur during segmentation. To address these problems, more advanced rules based on contextual and probability theories can be integrated. To improve the quality of the extracted roads further, postprocessing techniques, such as filtering, removing nonroad features, and smoothing the extracted polygons, are necessary. The quality of the final road features is critical for the applications mentioned in Introduction. Roads can also be converted into polylines that represent their centerlines. However, the aim of this study is to extract roads in their complete geometry to increase their use for a wide range of geospatial applications.

3.6. Evaluation of the Transferability of the Fuzzy Rule Sets

A transferability test is essential to ensure that the proposed rules will extract the considered objects without any changes in the rule base. Figure 14 shows the classified map using the same segmentation parameters and the proposed fuzzy rule sets. The test site used to evaluate the efficiency of the proposed rules included the same classes except for water bodies. Therefore, the test site was suitable for evaluating the rules developed in this research. The test site is a subset of the data set used in this research but has a different urban environment. Assuming that most of the airborne LiDAR data have similar characteristics (i.e., similar features of nDSM, intensity, aerial orthophoto, and spatial resolution), the proposed rules are most probably transferable to any other study areas. However, slight changes in segmentation parameters may be required because these rules depend on the scale of the study area and the resolution of the data sets.

Overall accuracy was decreased by approximately 6% when the rules were applied on the test site. A KIA of 0.70 was achieved on the test site using the same rules without any changes. The first interpretation of this decrease in accuracy suggested that some features in the new test site had different properties, which were disregarded when the rules were developed. Therefore, knowledge modeling should be carefully integrated into data-driven rules as applied in this research. Given the extensive variety of properties for natural and man-made features, these issues remain a challenge. In particular, the KIA accuracies of the building and grassland classes increased from 0.82 to 0.90 and from 0.68 to 0.88, respectively, because only one type of building structures was found in the test site. By contrast, the accuracies of the urban tree and shadow classes were decreased. The KIA accuracy of the urban tree type decreased from 0.88 to 0.79, whereas that of the shadow class decreased from 0.88 to 0.38. The decrease in the urban tree type was normal because the rule was created based on data integration and expert knowledge. By contrast, the decrease in shadow accuracy was abnormal (KIA decreased by approximately 0.50) mainly because of the uncertainties in the testing samples. The low contrast of the shadow features in the orthophoto made interpreting those features difficult, and thus, the creation of testing samples was affected.

Figure 15 shows the extracted urban roads from the test site using the proposed fuzzy rules. The KIA accuracy of the extracted urban road was 0.72, which decreased to nearly 0.16. Misclassifications mostly occurred between urban roads and buildings and between urban roads and bare lands. In the test site, many cars were located on the roads, which were misclassified as buildings because of the similar spatial properties of cars and buildings. In addition, the shadows projected from tall trees and high cars also contributed in decreasing road extraction accuracy. To improve the result of urban road extraction, removing vehicles before the classification process using several filtering techniques is suggested. Further improvement can be achieved by developing advanced rules for complex urban environments.

4. Conclusion

In this study, a two-stage optimization method for segmentation parameter selection was presented. A fuzzy object-based analysis technique for urban road extraction from airborne LiDAR data was also proposed. Suitable segmentation parameters were selected using the two-stage optimization method. In the first stage, the scale parameters were optimized via PSO to improve classification. In the second stage, suitable shape and compactness parameters were selected to improve object creation using the Bhattacharyya distance. The proposed fuzzy rules are easy to implement and are transferable. The fuzzy rules were formulated based on data samples and expert knowledge using the decision tree and fuzzy -mean algorithms. The results showed that the proposed method could extract urban roads from LiDAR data with a KIA accuracy of 0.72 on the test site.

Feature selection using PSO contributed to improving knowledge modeling by selecting relevant and useful features and excluding nontransferable and problematic features. Meanwhile, the two optimization techniques, namely, Bhattacharyya distance and Taguchi, enabled the selection of the best combination of segmentation parameters to improve object creation and classification results. In addition, integrating fuzzy logic into object-based rules contributed to managing the complexity of classification in urban environments by involving uncertainties in the boundaries of the used classes.

The main challenge in urban road extraction was the presence of urban trees along the roads, which created difficulties in accurately extracting road features with complete geometry. The presence of vehicles on urban roads also created problems in urban road extraction. In addition, vehicles on the roads formed shadows on the road surface, which also affected the quality of final road extraction. Furthermore, a variety of shadow shapes that could be present in the scene made the extraction of some classes challenging. To solve these limitations and problems, further research should be conducted by focusing on the factors that created a wide range and variety of problems.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research is supported by research project with UPM Vote no. 6300193.