Abstract

Information regarding the current status of urban green space is crucial for urban land-use planning and management. This study proposes a remote sensing and data-driven solution for urban green space detection at regional scale by employing state-of-the-art metaheuristic and machine learning approaches. Remotely sensed data obtained from the Sentinel 2 satellite for the study area of Da Nang city (Vietnam) are used to construct and verify an intelligent model that hybridizes the Marine Predators Algorithm (MPA) and support vector machines (SVM). SVM are employed to generalize a decision boundary that separates features characterizing statistical measurements of remote sensing data into two categories of “green space” and “nongreen space”. The MPA metaheuristic is used to optimize the SVM training phase by identifying an appropriate set of the SVM’s hyperparameters, namely the penalty coefficient and the kernel function parameter. Experimental results show that the proposed model, which processes information provided by all of the Sentinel 2 satellite’s spectral bands, delivers better performance than the models based on vegetation indices. With a classification accuracy rate of roughly 93%, an F1 score of 0.93, and an area under the receiver operating characteristic curve of 0.98, the newly developed model is a promising tool to help local authorities obtain up-to-date information on urban green space and develop plans for sustainable urban land use.

1. Research Background and Motivation

In many regions around the globe, the fast pace of urbanization leads to various problems including traffic congestion, poor air quality, and noise pollution. As pointed out by Xian et al. [1], urbanization significantly transforms the landscape from natural surface types to impervious surfaces such as housing, commercial buildings, and infrastructure. This transformation is happening over a large spatial extent and at an increasing speed due to burgeoning demand for additional housing and commercial/industrial areas. Moreover, the development of urban land consumes green land and brings about negative impacts on urban environments [2–4]. A worsening living environment caused by a lack of green space is a major issue for human health because roughly 54% of people in the world live in urban areas [5].

Urban green space is generally defined as green infrastructure that contains vegetated spaces including urban parks as well as road and workplace green space [6]. Previous works have recognized the crucial role of green space in reducing the adverse impacts of urbanization on both the urban ecosystem and socioeconomics [7–15]. The World Health Organization has identified green spaces as an innovative means of enhancing the quality of urban environments by improving local resilience and promoting sustainable lifestyles [15]. Turaga et al. [16] state that urban green spaces have become a critical asset because they deliver various benefits including aesthetic enhancement, pollution reduction, positive effects on the physical and mental health of citizens, urban heat island reduction, and groundwater recharge. Therefore, there is rising societal support for the protection and development of green space in urban areas.

Accordingly, up-to-date spatial information regarding the current status of urban green space is crucial for urban land-use planning and management. This information has become increasingly difficult to obtain via conventional landscape surveying approaches since green spaces are constantly modified, fragmented, and dispersed by the fast pace of urbanization. Moreover, surveying tasks at a regional scale are daunting because of the time and labor required for field data acquisition, processing, and reporting. Thus, there is a pressing need for advanced methods to automate the green space surveying task.

Recently, medium-resolution imagery coupled with advanced machine learning methods has provided an effective solution for urban landscape survey [17–22]. Remote sensing data combined with a geographic information system (GIS) can be used to generate thematic maps for assessing green vegetation cover at a regional scale. Data extracted from such maps can support further analysis of the size, shape, and other landscape patterns of urban green space [23–28].

Rafiee et al. [13] rely on Landsat Thematic Mapper images to study the patterns of green areas; their study combines remote sensing image classification, landscape metrics assessment, and vegetation indices. El Garouani et al. [29] employ maximum likelihood supervised classification to analyze data obtained from Landsat bands with a spatial resolution of 30 m; the authors investigate the relationship between urbanization and land-use changes as well as the effect of the increase in impervious surface areas. Urban green space distribution has been modeled in [7] with the use of remote sensing, GIS technology, and the normalized difference vegetation index. Do et al. [3] rely on Landsat 8 OLI (Operational Land Imager) image datasets provided by the United States Geological Survey to study green space patterns; the authors use the support vector machine (SVM) for image data classification and report an overall accuracy of 82.70%.

Li et al. [30] construct land-use and land-cover maps including green spaces using Landsat Operational Land Imager (OLI) and Enhanced Thematic Mapper Plus (ETM+) imagery; convolutional neural network, random forest, and SVM are the machine learning models employed for image data classification; this study reports a classification accuracy of 84.40% on the testing dataset. Dinda et al. [31] construct an integrated model for studying urban growth and the associated green space loss; the model relies on a maximum likelihood classifier, an artificial neural network, and SVM for the pattern recognition task; the SVM model attained the best classification accuracy and area under the receiver operating characteristic curve (0.906).

It is noted that besides SVM, deep neural networks (DNNs) have also been successfully applied in remote sensing–based land-use classification [32–34]. DNNs are highly appropriate for image categorization due to their convolution-based autonomous feature extraction phase [35]. However, successful implementations of DNNs often require a large number of training image samples. The computational expense of DNNs is generally significant due to the time-consuming training process used to fine-tune the networks’ weights. Moreover, because deep learning models have quite a large number of hyperparameters (e.g., the number of hidden layers, the number and size of convolution operations, and the size of the pooling operations), the process of identifying a suitable network architecture can be tedious. In contrast, since the extracted features are represented as numerical data in this study, SVM is highly appropriate because it has been proven to be a capable tool for classifying extracted numerical datasets [19, 36–39].

Based on the literature review, there is an increasing trend of applying machine learning in remote sensing–based urban green space studies. Since the problem of interest is challenging due to the involvement of multivariate and nonlinear data analysis, other advanced machine learning solutions need to be investigated to improve urban green space detection accuracy. Moreover, the current literature shows that individual machine learning methods are the commonly employed approach [3, 13, 29, 31]. Hybrid machine learning models that harness the advantages of several computational intelligence techniques have rarely been used to construct urban green space detection models; in particular, metaheuristic algorithms have rarely been proposed for optimizing machine learning based remote sensing data classification. Therefore, the original contribution of the current work is a hybridization of SVM machine learning and metaheuristic optimization for remote sensing–based urban green space detection.

SVM [40] is considered to be a capable pattern recognizer with excellent generalization capability. This is because the model structure of this machine learning method is learnt via the framework of structural risk minimization, which is resilient to overfitting and noisy data [41]. Nevertheless, the construction phase of an SVM model requires a proper setting of its two hyperparameters: the penalty coefficient and the kernel function parameter. The former specifies the amount of penalty imposed on data samples having classification errors. The latter determines the locality of the employed kernel function, which directly influences the generalization of the constructed model.

The task of determining the hyperparameters of a machine learning model is known as model selection [41] and can be modeled as an optimization problem. For an SVM model, this is a challenging task for several reasons. First, the landscape of the objective function is unknown and not differentiable. Second, the hyperparameters must be searched in continuous space; therefore, there is an infinite number of feasible solutions, and an exhaustive search over the hyperparameters is infeasible. Therefore, various scholars have resorted to metaheuristic algorithms for dealing with model selection problems. These algorithms are used to optimize the performance of a machine learning model so that it achieves a balance between model accuracy and model generalization.

The employed metaheuristic approaches include symbiotic organisms search [42], particle swarm optimization [43, 44], the forensic-based investigation optimization [45], equilibrium optimization [20], Harris hawks optimization [46], simulated annealing [47], social spider optimization [48, 49], gray wolf optimization [38, 50], teaching-learning-based algorithm [51], salp swarm algorithm [52, 53], artificial bee colony [54], pigeon-inspired optimization [55], cuckoo search optimization [56], imperialist competitive algorithm [57], moth flame optimization [58], and cuckoo search algorithm [59]. Those previous works have demonstrated the effectiveness of metaheuristic algorithms in optimizing machine learning models and solving complex tasks in various application domains.

Marine Predators Algorithm (MPA), first introduced in [60], is a recently proposed nature-inspired metaheuristic based on the foraging strategy of marine predators. This metaheuristic is characterized by a novel combination of Lévy and Brownian movements used for enhancing the optimization performance. The capability of MPA has been demonstrated on various optimization tasks [60]. Nevertheless, the performance of this metaheuristic in optimizing machine learning models has rarely been investigated. Hence, this study proposes to hybridize MPA and SVM to form an integrated intelligent model for remote sensing–based urban green space detection.

Remote sensing data obtained from the Sentinel 2 satellite for the study area of Da Nang city are used to train and verify the MPA-SVM hybrid model. In this work, the MPA-optimized SVM model trained on remote sensing data from all of the Sentinel 2 spectral bands is compared with models that use commonly employed vegetation indices including normalized difference vegetation index (NDVI) [61], normalized difference water index (NDWI) [62], soil-adjusted vegetation index (SAVI) [63], and MERIS terrestrial chlorophyll index (MTCI) [64].

The rest of the article is organized as follows: Section 2 reviews the research methodology and material. Section 3 presents the proposed MPA optimized SVM used for remote sensing-based urban green space detection. Experimental results are reported in Section 4. Concluding remarks of the current study are summarized in Section 5.

2. Research Methodology and Material

2.1. General Description of the Study Area and Remote Sensing Data

As mentioned earlier, urban green spaces play a significant role in the urban living environment; they serve a variety of functions including climatic modification, aesthetics, recreation, and physical/mental health improvement. Nevertheless, due to the physical expansion of Da Nang city (Vietnam), certain areas of green spaces have been replaced by impervious surface such as buildings and roads. Therefore, the current status of urban green space in this city needs to be updated in a timely manner and this city has been selected as the study area of this research work.

Da Nang is a major coastal city located in Central Vietnam, at 15°55′ to 16°14′ N and 107°18′ to 108°20′ E [3]. It is the third largest city in the nation, with a population of about 1 million. Da Nang urban center (refer to Figure 1) is located in the eastern section of the area and consists of six districts: Hai Chau, Cam Le, Thanh Khe, Lien Chieu, Ngu Hanh Son, and Son Tra [65]. The rural districts of Hoa Vang and Hoang Sa also belong to Da Nang city but are not part of the Da Nang urban center; therefore, these two rural districts are excluded from the study area.

To survey the urban green space status of Da Nang city, remote sensing data in the form of spectral bands were collected from Sentinel 2 on July 16, 2020. These spectral bands (see Table 1) are provided openly by USGS [66]; they can be processed and analyzed with the Sentinel Application Platform (SNAP) software package [67] as well as the ENVI software package [68]. Using the open-access tools of SNAP, the original Sentinel 2 spectral bands are resampled and converted to TIF format. The map projection of the obtained images is Universal Transverse Mercator (UTM) Zone 48N with the World Geodetic System (WGS) 84 datum. Figure 2 shows the 13 spectral bands obtained from Sentinel 2 on July 16, 2020: coastal aerosol, blue, green, red, red-edge 1, red-edge 2, red-edge 3, near infrared, near infrared narrow, water vapour, shortwave infrared/cirrus, shortwave infrared 1, and shortwave infrared 2. The wavelength range and resolution of each spectral band are provided in Table 1.

2.2. Remote Sensing-Based Vegetation Indices

In the remote sensing field, vegetation indices have been widely used to extract vegetation biophysical information from satellite image data [69]. Previous works have demonstrated the effectiveness of vegetation indices in remote sensing–based green space mapping [7, 13, 70–72]. Therefore, this study relies on such conventional indices as a means of urban green space detection. The employed vegetation indices include normalized difference vegetation index (NDVI) [61], soil-adjusted vegetation index (SAVI) [63], normalized difference water index (NDWI) [62], and MERIS terrestrial chlorophyll index (MTCI) [64]. These indices can be computed as follows [61, 69, 73]:

$$\mathrm{NDVI} = \frac{\rho_{NIR} - \rho_{Red}}{\rho_{NIR} + \rho_{Red}},$$

where $\rho_{Red}$ and $\rho_{NIR}$ represent the red reflected radiant flux (Sentinel 2’s band 4) and near-infrared radiant flux (Sentinel 2’s band 8a), respectively.

$$\mathrm{SAVI} = \frac{(\rho_{NIR} - \rho_{Red})(1 + L)}{\rho_{NIR} + \rho_{Red} + L},$$

where L = 0.5 denotes the soil brightness correction factor [74].

$$\mathrm{NDWI} = \frac{\rho_{NIR} - \rho_{SWIR}}{\rho_{NIR} + \rho_{SWIR}},$$

where $\rho_{NIR}$ and $\rho_{SWIR}$ represent the near-infrared radiant flux (Sentinel 2’s band 8a) and shortwave infrared (Sentinel 2’s band 11), respectively.

$$\mathrm{MTCI} = \frac{R_{B6} - R_{B5}}{R_{B5} - R_{B4}},$$

where $R_{B4}$ is Sentinel 2’s B4; $R_{B5}$ is Sentinel 2’s B5; $R_{B6}$ is Sentinel 2’s B6 [74, 75].
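As an illustration, the four indices can be sketched in a few lines of Python using their standard definitions. The function names and the reflectance values of the sample pixel are hypothetical and chosen only for demonstration; a real workflow would apply the same arithmetic to whole band rasters.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index (NIR = band 8a, red = band 4)."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index with soil brightness correction factor L."""
    return (nir - red) * (1.0 + soil_factor) / (nir + red + soil_factor)


def ndwi(nir, swir):
    """Normalized difference water index (NIR = band 8a, SWIR = band 11)."""
    return (nir - swir) / (nir + swir)


def mtci(b4, b5, b6):
    """MERIS terrestrial chlorophyll index from bands 4, 5, and 6."""
    return (b6 - b5) / (b5 - b4)


# Hypothetical surface reflectances of one vegetated pixel (illustrative only).
pixel = {"B4": 0.05, "B5": 0.10, "B6": 0.30, "B8A": 0.45, "B11": 0.20}

print("NDVI:", ndvi(pixel["B8A"], pixel["B4"]))
print("SAVI:", savi(pixel["B8A"], pixel["B4"]))
print("NDWI:", ndwi(pixel["B8A"], pixel["B11"]))
print("MTCI:", mtci(pixel["B4"], pixel["B5"], pixel["B6"]))
```

Note that MTCI is undefined when bands 4 and 5 have equal values, so a production implementation would need to guard against a zero denominator.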

2.3. The Used Metaheuristic and Machine Learning Approaches
2.3.1. Marine Predators Algorithm

Marine Predators Algorithm (MPA), first introduced in [60], is a stochastic global optimization algorithm inspired by the widespread foraging strategy of marine species such as sharks and tunas. The foraging strategy of these marine species effectively utilizes Lévy and Brownian movements along with an optimal encounter rate policy in the biological interaction between predator and prey [76–78].

The searching process of MPA consists of three phases considering three scenarios: (i) high velocity ratio when a prey is moving faster than a predator, (ii) unit velocity ratio when the rates of movement of a prey and a predator are similar, and (iii) low velocity ratio when the rate of movement of a predator is higher than that of a prey. The searching operation of the MPA metaheuristic is demonstrated in Figure 3. Let $X^E$ be the position of predators and $X^P$ be the position of preys within a marine ecosystem. The 1st phase aims at search space exploration and is applied for the first one-third of the searching iterations; the mathematical equation used to revise the prey position is given by

$$S_i = R_B \otimes (X^E_i - R_B \otimes X^P_i), \qquad X^P_i = X^P_i + P \cdot R \otimes S_i, \qquad i = 1, 2, \ldots, NP,$$

where $\otimes$ is an entry-wise multiplication operator. $S_i$ is the step size of the ith member and P = 0.5 is a constant, as in the original MPA [60]. $R_B$ is a vector of random numbers generated from a normal distribution which mimics the Brownian motion. i denotes the index of population members. R represents a vector of uniform random numbers within [0, 1]. NP is the number of population members.

The 2nd phase serves as an intermediate phase and occurs within the second one-third of the searching iterations. The positions of the first half of the population members are updated as follows:

$$S_i = R_L \otimes (X^E_i - R_L \otimes X^P_i), \qquad X^P_i = X^P_i + P \cdot R \otimes S_i,$$

where $R_L$ denotes a vector of random numbers generated from the Lévy distribution which represents the Lévy movement.

The positions of the second half of the population members are updated as follows:

$$S_i = R_B \otimes (R_B \otimes X^E_i - X^P_i), \qquad X^P_i = X^E_i + P \cdot CF \otimes S_i,$$

where $CF = (1 - Iter/MaxIter)^{2 \cdot Iter/MaxIter}$ is an adaptive parameter that controls the step size; Iter and MaxIter are the current iteration count and the maximum number of iterations, respectively.

The last phase of the optimization process aims at exploitation of the search space. The population members’ positions are updated with the following equation:

$$S_i = R_L \otimes (R_L \otimes X^E_i - X^P_i), \qquad X^P_i = X^E_i + P \cdot CF \otimes S_i.$$

In addition, to model behavior shift in marine predators according to the eddy formation or Fish Aggregating Devices (FADs) effects [79], the MPA metaheuristic employs the following operation:

$$X^P_i = \begin{cases} X^P_i + CF\,[LB + R \otimes (UB - LB)] \otimes U, & r \le FADs, \\ X^P_i + [FADs\,(1 - r) + r]\,(X^P_{r1} - X^P_{r2}), & r > FADs, \end{cases}$$

where FADs = 0.2 denotes the probability of the FADs effect. U represents a random binary vector. r is a random number within [0, 1]. LB and UB are vectors of lower and upper boundaries of the searched variables. r1 and r2 denote two random indices.
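The three phases and the FADs operation can be sketched compactly in Python. This is a simplified illustration and not the reference implementation of [60]. It makes several assumptions: P = 0.5, a 0.05 scaling of the Lévy step, Mantegna's method for drawing Lévy numbers, the way the binary vector U is drawn, and a greedy memory step that keeps a moved member only if it improves. The names `levy_step` and `mpa_minimize` are hypothetical.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed number with Mantegna's method (assumed here)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u, v = rng.gauss(0.0, sigma_u), rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def mpa_minimize(f, lb, ub, n_pop=20, max_iter=100, fads=0.2, p=0.5, seed=0):
    """Minimize f over the box [lb, ub]; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(lb)
    prey = [[lb[j] + rng.random() * (ub[j] - lb[j]) for j in range(dim)]
            for _ in range(n_pop)]
    fit = [f(x) for x in prey]
    b = min(range(n_pop), key=lambda i: fit[i])
    elite, elite_fit = prey[b][:], fit[b]

    for it in range(max_iter):
        cf = (1 - it / max_iter) ** (2 * it / max_iter)  # adaptive factor CF
        moved = []
        for i, x in enumerate(prey):
            y = []
            for j in range(dim):
                rb = rng.gauss(0.0, 1.0)      # Brownian component R_B
                rl = 0.05 * levy_step(rng)    # Levy component R_L (scale assumed)
                if it < max_iter / 3:         # phase 1: exploration
                    y.append(x[j] + p * rng.random() * rb * (elite[j] - rb * x[j]))
                elif it < 2 * max_iter / 3 and i < n_pop // 2:  # phase 2, first half
                    y.append(x[j] + p * rng.random() * rl * (elite[j] - rl * x[j]))
                elif it < 2 * max_iter / 3:   # phase 2, second half
                    y.append(elite[j] + p * cf * rb * (rb * elite[j] - x[j]))
                else:                         # phase 3: exploitation
                    y.append(elite[j] + p * cf * rl * (rl * elite[j] - x[j]))
            if rng.random() <= fads:          # FADs effect: random jump masked by U
                y = [y[j] + cf * (lb[j] + rng.random() * (ub[j] - lb[j]))
                     * (rng.random() > fads) for j in range(dim)]
            else:                             # move along two random members
                r = rng.random()
                r1, r2 = rng.randrange(n_pop), rng.randrange(n_pop)
                y = [y[j] + (fads * (1 - r) + r) * (prey[r1][j] - prey[r2][j])
                     for j in range(dim)]
            moved.append([min(max(y[j], lb[j]), ub[j]) for j in range(dim)])
        for i, y in enumerate(moved):
            fy = f(y)
            if fy < fit[i]:                   # memory: keep the better position
                prey[i], fit[i] = y, fy
            if fit[i] < elite_fit:
                elite, elite_fit = prey[i][:], fit[i]
    return elite, elite_fit


# Usage on a toy problem: minimize the 2-D sphere function.
sphere = lambda x: sum(v * v for v in x)
best_x, best_f = mpa_minimize(sphere, lb=[-5.0, -5.0], ub=[5.0, 5.0])
print(best_x, best_f)
```

Because a moved member is accepted only when it improves, the best objective value never worsens from one iteration to the next.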

2.3.2. Support Vector Machine

Introduced by Vapnik [40], support vector machines (SVM) have gained the attention of the academic community and have become a preeminent pattern recognition approach [55, 80–90]. Given a data sample set S drawn from a data universe $X_U$ and a hidden target function $f: X \to Y$, where Y is the set of class labels, we first create a labeled training dataset $D = \{(x_k, y_k)\}_{k=1}^{N}$, where $y_k = f(x_k)$. SVM can be used to estimate the target function f(x) by constructing a function $\hat{f}$ based on the data samples stored in D so that $\hat{f}(x) \approx f(x)$ for all x in X. Herein, for the task of urban green space detection, the data label can be modeled as “0” = nongreen space (the negative class) and “1” = green space (the positive class). The input data X are properties of the Sentinel 2 spectral bands.

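To construct an SVM model, it is required to solve the following constrained optimization problem [91]:

$$\min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{k=1}^{N} \xi_k \quad \text{subject to} \quad y_k \left( w^T \varphi(x_k) + b \right) \ge 1 - \xi_k, \quad \xi_k \ge 0, \quad k = 1, \ldots, N,$$

where $w \in R^n$ and $b \in R$ are used to construct a classification hyperplane used for pattern classification, N is the number of training samples, and the class labels are encoded as $y_k \in \{-1, +1\}$. $\xi$ is the vector of slack variables. C denotes the penalty coefficient. $\varphi(x)$ denotes a nonlinear data mapping used for dealing with data that cannot be linearly separated.

One advantage of the SVM method is that the explicit formula of $\varphi(x)$ is not required. To construct an SVM model, only the dot product of $\varphi(x)$ is necessary. The dot product of two data samples $x_k$ and $x_l$ is represented as a kernel function $K(x_k, x_l)$:

$$K(x_k, x_l) = \varphi(x_k)^T \varphi(x_l).$$

For multivariate and nonlinear data classification problems, the radial basis kernel function (RBKF) is commonly utilized:

$$K(x_k, x_l) = \exp\left( -\frac{\| x_k - x_l \|^2}{2 \sigma^2} \right),$$

where $\sigma$ denotes a hyperparameter of the RBKF.

By solving a Lagrangian dual of the aforementioned constrained optimization problem and using a quadratic programming solver, the SVM classifier $\hat{f}(x)$ used for data classification can be expressed compactly as follows [91]:

$$\hat{f}(x) = \mathrm{sign}\left( \sum_{k=1}^{SV} \alpha_k y_k K(x_k, x) + b \right),$$

where $\alpha_k$ represents the solution of the dual optimization problem; SV denotes the number of support vectors.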
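To make the kernel-based decision rule concrete, the following Python sketch evaluates an RBKF and the sign of the weighted kernel sum over the support vectors. It assumes the common Gaussian form K(x_k, x_l) = exp(−‖x_k − x_l‖² / (2σ²)) and labels encoded as −1/+1. The support vectors, multipliers, and bias are made-up values, not parameters of the trained model.

```python
import math


def rbf_kernel(xk, xl, sigma):
    """Radial basis kernel between two feature vectors (Gaussian form assumed)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xk, xl))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def svm_predict(x, support_vectors, labels, alphas, bias, sigma):
    """Return +1 or -1 from the sign of the kernel expansion plus bias."""
    score = bias + sum(
        alpha * label * rbf_kernel(sv, x, sigma)
        for sv, label, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hypothetical model with one support vector per class (illustrative values).
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]          # -1 = nongreen space, +1 = green space
alphas = [1.0, 1.0]
bias = 0.0

print(svm_predict([1.8, 2.1], support_vectors, labels, alphas, bias, sigma=1.0))
print(svm_predict([0.2, -0.1], support_vectors, labels, alphas, bias, sigma=1.0))
```

Only the support vectors enter the sum, which is why prediction cost depends on their number rather than on the size of the training set.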

3. The Proposed Marine Predators Algorithm Optimized Machine Learning Approach for Urban Green Space Detection

This section of the article is dedicated to describing the integrated model used for remote sensing–based urban green space detection. The core of the proposed model is a hybridization of the MPA metaheuristic and the SVM machine learning. These two methods work synergistically to analyze patterns hidden in a set of remotely sensed data collected for the study area of Da Nang urban center. In detail, SVM is used to construct a decision boundary that separates the input data space into two distinctive regions of “nongreen space” and “green space”.

To further enhance the performance of the SVM model, MPA is utilized to autonomously fine-tune the SVM training process by identifying a set of appropriate model hyperparameters. The optimized hyperparameters include the penalty coefficient and the RBKF parameter. In this study, the searching range of the penalty coefficient is [1, 100]; the searching range of the RBKF parameter is [0.1, 100].

These two hyperparameters strongly influence the learning and predictive capability of the integrated urban green space detection model. An excessively large penalty coefficient or an excessively small RBKF parameter leads to overfitted models. On the other hand, an excessively small penalty coefficient and an excessively large RBKF parameter tend to produce underfitted models [92]. Therefore, the role of the MPA is to find a combination of the penalty coefficient and the RBKF parameter that balances predictive accuracy and model generalization. Accordingly, it is expected that the constructed model will suffer from neither overfitting nor underfitting.
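The influence of the RBKF parameter on locality can be seen numerically. The short Python sketch below assumes the Gaussian form exp(−d² / (2σ²)) and evaluates the kernel for two samples one unit apart at the two ends and the middle of the search range; the helper name `rbf` is hypothetical.

```python
import math


def rbf(distance, sigma):
    """RBKF value for two samples separated by `distance` (Gaussian form assumed)."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2))


# Compare kernel values across the searched range [0.1, 100] of the parameter.
for sigma in (0.1, 1.0, 100.0):
    print(f"sigma = {sigma:>5}: K = {rbf(1.0, sigma):.6g}")
```

With a very small parameter the kernel value between neighboring samples is practically zero, so each training sample influences only its immediate vicinity, which is the overfitting regime. With a very large parameter all samples look nearly identical to the kernel, which is the underfitting regime.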

The overall model structure is presented in Figure 4. The 13 spectral bands are obtained from the Sentinel 2 satellite and processed via the SNAP [67] and ENVI [68] software packages. To accelerate the computing process, the images of spectral bands are divided into blocks of 5 × 5 pixels. For the purpose of data classification, the average ($\mu_c$) and the standard deviation ($\sigma_c$) of gray intensity of each image block are computed and used as numerical features by the integrated MPA-SVM model.

The average and standard deviation of gray intensity are given by [93]

$$\mu_c = \sum_{i=0}^{NL-1} I_{i,c} \, P_c(I_{i,c}), \qquad \sigma_c = \sqrt{\sum_{i=0}^{NL-1} \left( I_{i,c} - \mu_c \right)^2 P_c(I_{i,c})},$$

where c denotes the image channel (here, a spectral band) and $I_{i,c}$ = 0, 1, 2, …, 255. NL = 256 represents the number of discrete color values. $P_c(I)$ is the first-order histogram of an image block [94].
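A minimal Python sketch of these two statistics is given below. It builds the first-order histogram of a block, i.e., the relative frequency of each gray level, and then takes the histogram-weighted mean and standard deviation. The function name `histogram_stats` and the tiny sample block are illustrative assumptions.

```python
import math
from collections import Counter


def histogram_stats(block, n_levels=256):
    """Mean and standard deviation of gray intensity from the first-order histogram."""
    pixels = [value for row in block for value in row]
    counts = Counter(pixels)          # absent gray levels count as zero
    n = len(pixels)
    mean = sum(level * counts[level] / n for level in range(n_levels))
    variance = sum((level - mean) ** 2 * counts[level] / n
                   for level in range(n_levels))
    return mean, math.sqrt(variance)


# Small 2 x 2 example block; the study uses 5 x 5 blocks.
block = [[0, 2],
         [2, 4]]
print(histogram_stats(block))
```

Since the histogram weights are relative frequencies, the result equals the ordinary population mean and standard deviation of the pixel values in the block.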

To construct the integrated MPA-SVM model for urban green space detection, it is necessary to prepare a training dataset with assigned ground truth labels. This study has performed a sampling process to collect data in the nongreen space and green space areas within the study area (demonstrated in Figure 5). The ground truth label of each image sample has been verified by field trips and Google Earth Engine. Each sampled block has a size of 25 × 25 pixels and is divided into nonoverlapping image patches of 5 × 5 pixels. After the data sampling process, 1,000 image patches are available for the feature extraction operator; each class contains 500 image patches to guarantee a balanced classification. The collected dataset is illustrated in Table 2. Notably, each of the 13 spectral bands yields two statistical measurements (i.e., the mean and standard deviation). Thus, the total number of features used for classification is 13 × 2 = 26.
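The tiling and feature assembly described above can be sketched as follows in Python. The helper names `split_into_patches` and `patch_features` are hypothetical, and the synthetic block and constant patches stand in for real band data.

```python
from statistics import fmean, pstdev


def split_into_patches(block, size=5):
    """Cut a 2-D block into nonoverlapping size x size patches (row-major order)."""
    rows, cols = len(block), len(block[0])
    return [[row[c:c + size] for row in block[r:r + size]]
            for r in range(0, rows - size + 1, size)
            for c in range(0, cols - size + 1, size)]


def patch_features(band_patches):
    """Concatenate (mean, standard deviation) of each band's patch into one vector."""
    features = []
    for patch in band_patches:        # one 5 x 5 patch per spectral band
        pixels = [value for row in patch for value in row]
        features += [fmean(pixels), pstdev(pixels)]
    return features


# A synthetic 25 x 25 block yields 25 patches of 5 x 5 pixels.
block = [[r * 25 + c for c in range(25)] for r in range(25)]
patches = split_into_patches(block)
print(len(patches))

# Thirteen bands (constant dummy patches here) give 13 x 2 = 26 features.
dummy_bands = [[[band] * 5 for _ in range(5)] for band in range(13)]
print(len(patch_features(dummy_bands)))
```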

It is worth noting that the extracted dataset, which includes the input features characterizing statistical properties of the spectral bands and the corresponding class labels, has been randomly separated into a training dataset (70%) and a testing dataset (30%) [95]. The first set is used for model training and the second set is used to inspect the model predictive capability. In addition, to standardize the input data range, the Z-score equation is used as follows [96]:

$$X_Z = \frac{X_D - M_X}{STD_X},$$

where $X_Z$ and $X_D$ denote the normalized and the original features, respectively. $M_X$ and $STD_X$ denote the mean value and the standard deviation of the features, respectively.
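The data partition and the Z-score normalization can be sketched in Python as below. Two choices in the sketch are assumptions not stated in the text: the normalization statistics are estimated from the training set only and then reused for the testing set, and the population standard deviation is used. All function names are hypothetical.

```python
import random
from statistics import fmean, pstdev


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Randomly partition rows into training and testing subsets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    return [rows[i] for i in indices[:cut]], [rows[i] for i in indices[cut:]]


def zscore_fit(train_rows):
    """Per-feature mean and standard deviation (zero deviation replaced by 1)."""
    columns = list(zip(*train_rows))
    return [fmean(c) for c in columns], [pstdev(c) or 1.0 for c in columns]


def zscore_apply(rows, means, stds):
    """Apply X_Z = (X_D - M_X) / STD_X feature by feature."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy feature table with two features per sample.
data = [[float(i), 2.0 * i + 1.0] for i in range(10)]
train, test = train_test_split(data)
means, stds = zscore_fit(train)
train_z = zscore_apply(train, means, stds)
test_z = zscore_apply(test, means, stds)
print(len(train_z), len(test_z))
```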

To optimize the SVM model used for urban green space detection, the objective function of the MPA metaheuristic employs a 5-fold cross validation process and the indices of false negative rate (FNR) and false positive rate (FPR). This objective function (OF) is described as follows [96]:

$$OF = \frac{1}{K} \sum_{k=1}^{K} \left( FNR_k + FPR_k \right), \qquad K = 5,$$

where $FNR_k$ and $FPR_k$ denote FNR and FPR computed in the kth run, respectively.

The FNR and FPR indices are given by [96]

$$FNR = \frac{FN}{FN + TP}, \qquad FPR = \frac{FP}{FP + TN},$$

where FN, FP, TP, and TN are the false negative, false positive, true positive, and true negative data samples, respectively.
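A cross-validation objective of this kind can be sketched in Python as follows. The sketch assumes that the fold-wise sums of FNR and FPR are averaged over the folds; this normalization and the function names are assumptions of the example, and the confusion counts are invented.

```python
def false_negative_rate(fn, tp):
    """FNR = FN / (FN + TP)."""
    return fn / (fn + tp)


def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN)."""
    return fp / (fp + tn)


def objective(fold_counts):
    """Average of FNR + FPR over validation folds.

    fold_counts holds one (TP, TN, FP, FN) tuple per fold.
    """
    total = sum(false_negative_rate(fn, tp) + false_positive_rate(fp, tn)
                for tp, tn, fp, fn in fold_counts)
    return total / len(fold_counts)


# Invented confusion counts for five validation folds.
folds = [(9, 8, 2, 1), (10, 9, 1, 0), (8, 9, 1, 2), (9, 10, 0, 1), (9, 8, 2, 1)]
print(objective(folds))
```

In the hybrid model, each candidate pair of penalty coefficient and RBKF parameter proposed by the optimizer would be scored by training the SVM on each fold and passing the resulting counts to such a function; lower values are better.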

In this study, the source code of the MPA metaheuristic is provided by Faramarzi et al. [60]. For the purpose of model optimization, the integrated MPA-SVM has been constructed in MATLAB. The SVM model with the optimized set of hyperparameters is then developed in Visual C# with .NET Framework 4.7.2 to process and analyze remote sensing data. The SVM model in Visual C# .NET has been built with functions provided by the Accord.NET Framework [97]. Moreover, the program has been run on an ASUS FX705GE-EW165T (Core i7 8750H and 8 GB RAM) platform.

4. Experimental Results and Discussion

In this section, a set of performance measurement indices is used to express the model predictive accuracy. This set includes classification accuracy rate (CAR), precision, recall, negative predictive value (NPV), F1 score, and area under the receiver operating characteristic curve (AUC) [98, 99]. The calculation of AUC is described in [99]. The formulas used to compute CAR, precision, recall, NPV, and F1 score are given by

$$CAR = \frac{N_C}{N_A} \times 100\%,$$

$$Precision = \frac{TP}{TP + FP}, \qquad Recall = \frac{TP}{TP + FN}, \qquad NPV = \frac{TN}{TN + FN},$$

$$F1\ score = \frac{2\,TP}{2\,TP + FP + FN},$$

where $N_C$ and $N_A$ are the number of correctly predicted data and the total number of data, respectively.
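For reference, these indices can be computed from the four confusion counts with a few lines of Python. The function name and the sample counts are illustrative only and do not correspond to the reported experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute CAR (%), precision, recall, NPV, and F1 score from confusion counts."""
    total = tp + tn + fp + fn
    return {
        "CAR": 100.0 * (tp + tn) / total,   # correctly predicted / all samples
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "NPV": tn / (tn + fn),
        "F1": 2 * tp / (2 * tp + fp + fn),
    }


# Invented counts for a 200-sample test set.
for name, value in classification_metrics(tp=90, tn=80, fp=20, fn=10).items():
    print(f"{name}: {value:.3f}")
```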

Besides the MPA-SVM model which utilizes information provided by the 13 spectral bands, this study has employed MPA-SVM models using the aforementioned vegetation indices as benchmark models. The MPA-SVM employing all of the Sentinel 2 bands is denoted as MPA-SVM-13B. The benchmark models that use the NDWI, NDVI, SAVI, and MTCI are denoted as MPA-SVM-NDWI, MPA-SVM-NDVI, MPA-SVM-SAVI, and MPA-SVM-MTCI, respectively. The MPA-SVM-13B utilizes the statistical information obtained from all of the 13 spectral bands (i.e., the mean and the standard deviation of each band), whereas each benchmark model employs the same two statistics (the mean and standard deviation of image patches) computed from its vegetation index. Therefore, the feature extraction phase of the benchmark models is similar to that of the MPA-SVM-13B. The model optimization processes of the constructed models are demonstrated in Figure 6. Herein, the maximum number of searching iterations (MaxIter) of the MPA metaheuristic has been set to 100; the number of population members (NP) is fixed at 20. The detailed optimization results are reported in Table 3, which shows the best penalty coefficient and RBKF parameter for each model used for urban green space detection. In addition, the best-found cost function values of the MPA-SVM-13B, MPA-SVM-NDWI, MPA-SVM-NDVI, MPA-SVM-SAVI, and MPA-SVM-MTCI are provided in Figure 7. It can be observed from Figure 7 that the MPA-SVM using all of the 13 bands results in the lowest value of the cost function.

As stated earlier, the constructed dataset has been randomly divided into a training set (70%) and a testing set (30%). The first set is used for model training and the second set is reserved for model validation. Moreover, in order to reliably evaluate the model predictive performance, this study has repeated the model training and prediction processes 20 times, with the training and testing datasets resampled in each run. The statistical measurements obtained from these repeated model construction and validation phases are used for model assessment. This repeated process aims at diminishing the variation caused by the randomness in data sampling. The model prediction outcomes are summarized in Table 4, which shows the mean and standard deviation (Std) of the employed performance measurement indices. It can be observed that the MPA-SVM-13B, with CAR = 93.100%, precision = 0.916, recall = 0.947, NPV = 0.947, F1 score = 0.931, and AUC = 0.979, has obtained the best urban green space detection performance. Compared with the proposed method, MPA-SVM-NDWI has attained a lower CAR (89.400%) and F1 score (0.896), followed by MPA-SVM-NDVI (CAR = 89.300% and F1 score = 0.894), MPA-SVM-SAVI (CAR = 88.983% and F1 score = 0.889), and MPA-SVM-MTCI (CAR = 83.850% and F1 score = 0.843).

In terms of AUC score, MPA-SVM-13B is the best model (AUC = 0.979), followed by MPA-SVM-NDVI (AUC = 0.946), MPA-SVM-SAVI (AUC = 0.937), MPA-SVM-NDWI (AUC = 0.926), and MPA-SVM-MTCI (AUC = 0.910). The AUC values of the employed models used for urban green space detection are demonstrated in Figure 8. In addition, the model result comparison in terms of CAR, precision, recall, NPV, and AUC is provided in Figures 9 and 10. The box plots of CAR, F1 score, and AUC are illustrated in Figures 11–13.

In addition, to confirm the superiority of the proposed MPA-SVM model that employs all of the Sentinel 2 spectral bands, the Wilcoxon signed-rank test [100] with a significance level (p value) of 0.05 is utilized in this study to assess the statistical significance of the differences in model performance indices. The test outcomes of pair-wise model comparison with respect to CAR, F1 score, and AUC are shown in Tables 5–7, respectively. Observably, with p values < 0.05, the null hypotheses of insignificant differences in model performance can be rejected. Therefore, MPA-SVM-13B is confirmed to be the model that provides the best classification performance on the collected dataset. Accordingly, the MPA-SVM-13B model is employed to construct an urban green space map for the whole study area. The mapping outcome is demonstrated in Figure 14. Based on the constructed map, it can be found that green space occupies roughly 34.40% of the study area. Nevertheless, the green space is not evenly distributed in Da Nang. The majority of the green space is located in the Son Tra peninsula within the Son Tra district.

5. Concluding Remarks

Urban green space plays a crucial role in improving the living quality of the urban environment and has a positive effect on citizens’ physical and mental health. Nevertheless, few studies have been dedicated to detecting, locating, and quantifying green space in the Da Nang urban center. This study is an attempt to fill this knowledge gap by developing a remote sensing and data-driven approach for urban green space detection in the study area. Remotely sensed data obtained from the Sentinel 2 satellite are used to train and validate a hybrid metaheuristic-machine learning approach, MPA-SVM. This hybrid method is employed to construct a decision boundary that separates the input space into two distinctive regions of green space and nongreen space.

The experimental results supported by the Wilcoxon signed-rank test show that the MPA-SVM model employing all of the spectral bands is superior to the models relying on individual vegetation indices. Good green space detection results with CAR = 93.100%, precision = 0.916, recall = 0.947, NPV = 0.947, F1 score = 0.931, and AUC = 0.979 demonstrate that the proposed method is highly suited for the task at hand. Moreover, the MPA metaheuristic is confirmed to be a capable method for optimizing machine learning models. Accordingly, the green space map of the entire study area can be constructed by the proposed hybrid approach. The information provided by the newly developed model can help local authorities evaluate the status of green spaces in Da Nang city.

Although MPA-SVM has attained a good predictive performance in urban green space mapping in the study area, the proposed approach also has several limitations. The first limitation is that the MPA-SVM model has not been integrated with feature selection algorithms used for dimensionality reduction. In addition, although the RBKF is widely used for SVM-based pattern recognition, the effectiveness of other sophisticated kernel functions (e.g., hybrid kernel functions [101, 102]) in urban green space detection should be investigated. Accordingly, future extensions of the current study may include the following:
(i) Investigating other state-of-the-art metaheuristic algorithms used for optimizing data-driven urban green space detection
(ii) Studying the effects of the maximum number of searching iterations and the number of population members on the performance of the SVM-based urban green space detection models
(iii) Employing other advanced texture descriptors to further improve the detection accuracy
(iv) Performing detection tasks at different time periods to inspect changes and trends in urban green space
(v) Performing urban green space detection using high-resolution satellite images
(vi) Incorporating advanced feature selection algorithms and kernel functions into the current model structure

Data Availability

The dataset used to support the findings of this study has been deposited in the repository of GitHub (https://github.com/NDHoangDTU/MPA-SVM-UGSD).

Conflicts of Interest

The authors confirm that there are no conflicts of interest.

Acknowledgments

This research was funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant no. 105.99-2019.339.