Predicting Rainfall-Induced Soil Erosion Based on a Hybridization of Adaptive Differential Evolution and Support Vector Machine Classification

Dinh, Tuan Vu; Nguyen, Hieu; Tran, Xuan-Linh; Hoang, Nhat-Duc

doi:https://doi.org/10.1155/2021/6647829

Mathematical Problems in Engineering

Special Issue

Metaheuristics and Machine Learning: Theory and Applications

Research Article | Open Access

Volume 2021 | Article ID 6647829 | https://doi.org/10.1155/2021/6647829

Tuan Vu Dinh,^1,2 Hieu Nguyen,^2,3 Xuan-Linh Tran,^2,4 and Nhat-Duc Hoang^2,4

Academic Editor: Essam Houssein

Received 18 Dec 2020

Revised 21 Jan 2021

Accepted 28 Jan 2021

Published 20 Feb 2021

Abstract

Soil erosion induced by rainfall is a critical problem in many regions of the world, particularly in tropical areas where the annual rainfall often exceeds 2000 mm. Predicting soil erosion is a challenging task because it is subject to variations in soil characteristics, slope, vegetation cover, land management, and weather conditions. Conventional models based on the mechanism of soil erosion processes generally provide good results but are time-consuming because of the calibration and validation they require. The goal of this study is to develop a machine learning model based on the support vector machine (SVM) for soil erosion prediction. The SVM serves as the main prediction machinery, establishing a nonlinear function that maps the considered influencing factors to erosion status predictions. In addition, in order to improve the accuracy of the model, the history-based adaptive differential evolution with linear population size reduction and population-wide inertia term (L-SHADE-PWI) is employed to find an optimal set of parameters for the SVM. Thus, the proposed method, named L-SHADE-PWI-SVM, is an integration of machine learning and metaheuristic optimization. For the purpose of training and testing the method, a dataset consisting of 236 samples of soil erosion in Northwest Vietnam is collected with 10 influencing factors. The training set includes 90% of the original dataset; the rest of the dataset is reserved for assessing the generalization capability of the model. The experimental results indicate that the newly developed L-SHADE-PWI-SVM method is a competitive soil erosion predictor with superior performance statistics. Most importantly, L-SHADE-PWI-SVM achieves a classification accuracy rate of 92%, which is considerably better than that of the backpropagation artificial neural network (87%) and the radial basis function artificial neural network (78%).

1. Introduction

Soil erosion induced by water is the main culprit of the degradation of upland and mountain ecosystems [1]. The erosion process poses a threat to the capacity of the land to provide ecosystem services that are needed to reach the Sustainable Development Goal (SDG) target 15.3 [2]. Currently, soil erosion is a critical concern in many regions; it must be controlled to achieve sustainable development goals [3, 4]. Soil erosion is a complex and dynamic process involving detachment, transport, and deposition of soil material. Many factors affect erosion magnitude, including climate, soil type, soil structure, vegetation, cropping, and, especially, land management [5].

In tropical areas, such as Northwest Vietnam, soil erosion potential is high due to heavy rainfall and is currently accelerating under maize monocropping in the uplands [6, 7]. A recent increase in maize monocropping in the region is driven by market demand from the rapidly growing livestock and poultry industry. The majority of the maize area expansion has been achieved by encroachment into steep forested mountain watersheds. Land preparation for a maize cropping season involves weeding prior to burning, followed by several rounds of ploughing before sowing. The practice is carried out at the onset of the monsoon rains; at that time, bare fields are exposed to high-intensity rains. This often results in high runoff, leading to severe erosion and longer-term degradation with declining crop production on-site and strong environmental impacts off-site [8, 9]. Erosion control measures are based on the following: covering the soil to protect it from raindrop impact; increasing the infiltration capacity of the soil to reduce runoff; improving the aggregate stability of the soil; and increasing surface roughness to reduce the velocity of runoff [10–12]. Many measures have been tested worldwide: contour ploughing or contour farming using stone, wood, or vegetation barriers/hedgerows; cover crops; minimum tillage or zero tillage; and mulching. Such measures can be effective in different geographic and climatic conditions under various soil characteristics and land management systems. Among these, land management is crucial in controlling erosion [11, 13–16].

Cerdà et al. [17] investigate the hydrological and erosional impact as well as farmer’s perception on catch crops and weeds in citrus organic farming; this study recommends that farmers should be informed and given initial subsidies for implementing new measures for improving soil quality and preventing soil erosion. In the study of Tuan et al. [18], kinetic energy of rainfall is found to be one of the most important factors that cause soil erosion, particularly when heavy rains coincide with poor ground cover at the beginning of the cropping season. However, it should be noted that the importance of a factor varies from case to case and other comprehensive variables affecting erosion should also be studied.

Soil loss studies at the plot scale have been of crucial importance for identifying the mechanisms of the erosion processes. Erosion plot experiments can help to introduce new erosion prevention measures because they provide reliable erosion measurements and the large amounts of data needed to test new models [19]. A number of empirical erosion models such as USLE and RUSLE [20], SWAT [21], and WEPP [22] used data from these types of studies for prediction. However, the performance and accuracy of these models are not as good as expected [23].

In the last decades, machine learning-based models have proved to be a helpful alternative to deal with the multivariate and complex nature of the phenomena in various disciplines of engineering [24–38]. Optimized kernel logistic regression models were employed in [39] for landslide susceptibility assessment. The evidential belief function-based function tree (FT), logistic regression, and logistic model tree were utilized for predicting landslide occurrences in [40]. Ensemble learning methods used for natural hazard risk assessment have been proposed in [41–44].

In addition, ANN is used to estimate the soil erodibility in Malaysia [45] and to conclude that rainfall and runoff are important factors affecting the amount of soil eroded [46]. In [47], Kohonen neural networks (KNNs) are utilized to model runoff erosion with a better outcome than that of the conventional multiple linear regression models. It was also shown that ANN is generally more competitive than the WEPP model in quantitative prediction of soil loss [48]. Albaradeyia et al. [49] confirm that while WEPP underestimated the soil loss, ANN provided results that are in line with observation. Recently, Vu et al. [50] successfully constructed a machine learning model based on multivariate adaptive regression splines for predicting soil loss occurrences.

Based on the literature review, it can be seen that advanced machine learning methods can be a good option for soil erosion modeling. Even though previous research works relied extensively on ANN, other advanced machine learning algorithms are worth exploring because the soil erosion process is both complex and dynamic. We are particularly interested in the support vector machine (SVM) [51–54] because it usually achieves results comparable to those of ANN and has a lower risk of overfitting. In addition, SVM has been used with great success for different but closely related hydrological and geological problems [55–61].

Furthermore, to enhance the predictive performance of the SVM, this study relies on metaheuristic optimization. It is noted that the process of fine-tuning a machine learning model can be formulated as a global optimization problem. In recent years, various metaheuristic approaches have been successfully employed, including monarch butterfly optimization [62, 63], slime mould algorithm [64], moth search algorithm [65, 66], Harris hawks optimization [67–70], differential flower pollination [71], symbiotic organisms search [72], Henry gas solubility optimization [73], and satin bowerbird optimizer [74]. As can be seen from the literature, there is an increasing trend of hybridizing metaheuristics and machine learning to tackle complex problems in the field of engineering.

Therefore, the current study aims at extending the body of knowledge by establishing soil erosion prediction models for tropical hilly regions based on an integration of the history-based adaptive differential evolution with linear population size reduction and population-wide inertia term (L-SHADE-PWI) metaheuristic and the support vector machine pattern classification method. A dataset, featuring ten explanatory variables, collected from plot experiments in Son La province (Vietnam) is employed to construct and verify the prediction models. In this study, the problem of erosion status prediction is formulated as a pattern classification task within which a vector of explanatory variables is assigned to either “erosion” or “nonerosion.” We hypothesize that the structure of the L-SHADE-PWI-SVM, an integration of machine learning and metaheuristic optimization, is capable of predicting rainfall-induced soil erosion.

In summary, the innovative points of the current study can be stated as follows:
(i) A novel integration of the L-SHADE-PWI metaheuristic and the SVM pattern classifier is first proposed to predict the complex phenomenon of soil erosion.
(ii) The effort to fine-tune the machine learning model is eliminated via the utilization of metaheuristic optimization.
(iii) The model construction phase of the proposed L-SHADE-PWI-SVM is entirely data-driven. Therefore, it is convenient to be used by land-use planners and hazard prevention agencies without much domain knowledge in machine learning.
(iv) The superiority of the proposed hybrid approach is validated using experiments based on a repeated subsampling of the collected data.

The rest of the paper is organized as follows. In Section 2, we review the related research methods. In Section 3, we present the proposed L-SHADE-PWI-SVM model. Section 4 is devoted to experiments and result comparisons. We end with the conclusions in Section 5.

2. Material and Methods

2.1. General Description of the Study Area and the Collected Dataset

For this study, data in Son La province (Northwest Vietnam) were collected from 2009 to 2011 at two experimental sites, which are two catchments with bounded plots. Figure 1 shows the location of the study area in the maps of Son La and Vietnam. The area is governed by tropical monsoon climate with mild and generally warm temperature (the annual average is 21°C, and the lowest and highest average temperatures are 16°C in February and 27°C in August, respectively [75]). The winter (from November to March) is relatively dry while the summer (from May to October) can have a lot of rain.

At the experimental sites, erosion plots of 72 m² (4 m × 18 m) were installed with boundaries to prevent the inflow of runoff water from outside. There were 24 plots in total: four treatments with three replicates, arranged in a completely randomized block design. In order to gather deposited sediment from soil erosion at the plots, a system of buckets was employed. Erosion data were collected on a storm basis over the 3 years from 2009 to 2011. Detailed information on soil, crop, and field management can be found in the previous study of Tuan et al. [18].

There are many factors contributing towards rill and interrill soil erosion induced by raindrop impact and surface runoff. They include climate, soil, topography, and land use. The amount and intensity of rainfall affect erosivity, while soil properties, land use, and topography (described via slope length and steepness) can have a big influence on the degree of erodibility. To represent these factors, a set of ten explanatory variables has been chosen. In Table 1, these variables are listed together with their statistics.

First, I30 (the 30 min intensity) represents the prolonged peak rates of detachment and runoff in 30 minutes. The storm energy E is calculated according to the formula given in [76], in which i is the maximum intensity of 30 minutes. The product of E and I30 gives the kinetic rainfall energy (EI), which is the combined interaction of total energy and peak intensity in each particular storm. This quantity represents how particle detachment is combined with transport capacity [77].

The second and third variables are topographic factors, namely, slope length and steepness. Both have a great influence on soil loss. The steeper the slope, the higher the potential for soil loss. Similarly, the longer the slope, the more likely soil erosion becomes because more runoff accumulates. In this study, a Nikon Forestry 550 Inclinometer was used to measure the slope of the plots.

Soil erodibility is heavily influenced by texture, but permeability, structure, and organic matter also play a role. In this study, wet analysis [78] was used to analyze soil texture with two pseudoreplicates per plot. In addition, it is known that aggregate stability is an index closely related to soil erodibility. However, there is currently no efficient method to calculate this index. Therefore, OC and pH, two simple characteristics of the soil, were used instead. Total OC was obtained using a C/N analyzer with carbonate content removed by HCl. pH was measured using a glass electrode with a soil-to-water ratio of 1 : 2.5.

The final variable chosen is ground cover rate as this is one of the most influential factors of soil erosion rate in tropical areas [18]. For more information on how this variable is measured, we refer the reader to Tuan et al. [18].

In our model to predict soil erosion, we aim to classify each sample, which is associated with a vector of values for the explanatory variables, into two classes: “erosion” or “nonerosion.” In order to provide ground truth classification to train the model, we follow the same criterion on soil loss as in [79]: “erosion” label is assigned to any sample with more than 3 tons per hectare soil loss, and other records are classified as “nonerosion.” A total of 236 data samples have been collected within which 118 records were classified as “erosion.”

2.2. Support Vector Machine (SVM) for Pattern Classification

The original SVM algorithm was first introduced in [80] for linear binary classification. The algorithm builds a hyperplane that separates positive and negative samples with the largest possible margin. However, in practice, samples are often not linearly separable, so such a hyperplane does not exist, which can lead to poor performance of the algorithm. Accordingly, the original SVM algorithm was extended to nonlinear classification via the use of kernel functions [81]. Kernel functions, which are often nonlinear, map the space of input variables into a much higher-dimensional space, where separating hyperplanes are more likely to exist [82, 83]. In addition, the soft margin was proposed to handle the case in which a separating hyperplane does not exist even in the higher-dimensional space [53]. With these improvements, the SVM has become one of the most popular and successful classification algorithms [84–86].

Consider a training set $\{(x_k, y_k)\}_{k=1}^{N}$ that consists of N samples, where $x_k \in R^n$ is a feature or input vector and $y_k \in \{-1, +1\}$ is the corresponding response or class label. The training phase of the SVM algorithm can be mathematically formulated as the following optimization problem [53, 87].

Find $w$, $b$, and $\xi$ to minimize

$$\frac{1}{2} w^T w + C \sum_{k=1}^{N} \xi_k \tag{2}$$

subject to

$$y_k \left( w^T \varphi(x_k) + b \right) \ge 1 - \xi_k, \quad \xi_k \ge 0, \quad k = 1, \ldots, N, \tag{3}$$

where the first two outcomes $w \in R^n$ and $b \in R$ define the classification hyperplane (they are a normal vector and the corresponding offset, respectively); the third outcome $\xi$ is the vector of slack variables introduced to handle the case in which the data cannot be separated without error; $C$ is the penalty constant that decides how much weight should be put on classification error; and $\varphi(\cdot)$ denotes the nonlinear mapping from the input space to a much higher-dimensional space.

The explicit formula of $\varphi(x)$ is not normally required. However, one needs to know the dot product of $\varphi(x)$ in the higher-dimensional space. It is referred to as the kernel function given by

$$K(x_k, x_l) = \varphi(x_k)^T \varphi(x_l). \tag{4}$$

There are different choices of kernel functions, such as linear, polynomial, sigmoid, and radial basis functions. In this work, the radial basis function (RBF) is employed due to its good performance in pattern recognition tasks [88]. It is defined as

$$K(x_k, x_l) = \exp\left( -\gamma \left\| x_k - x_l \right\|^2 \right), \tag{5}$$

where $\gamma > 0$ is a parameter that can be tuned.

Once training is finished, the final classification is performed as follows:

$$y(x_l) = \mathrm{sign}\left( \sum_{k=1}^{SV} \alpha_k y_k K(x_k, x_l) + b \right), \tag{6}$$

where $\alpha_k$ is the solution of the dual problem of equations (2) and (3) and SV is the number of support vectors (the number of $\alpha_k > 0$).
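The classification rule described above can be illustrated with a minimal Python sketch. It assumes the RBF kernel is parameterized as exp(-γ‖x − z‖²) and uses hand-picked support vectors, multipliers, and offset; none of these values come from the study, and the function names are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) for two equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svm_decision(x, support_vectors, alphas, labels, b, gamma):
    """Sign of the kernel expansion over the support vectors: +1 or -1."""
    score = sum(
        a * y * rbf_kernel(sv, x, gamma)
        for sv, a, y in zip(support_vectors, alphas, labels)
    ) + b
    return 1 if score >= 0 else -1

# Toy values chosen by hand for illustration (not fitted to any data).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
alphas = [1.0, 1.0]
labels = [-1, 1]

pred_near_positive = svm_decision((1.9, 2.1), support_vectors, alphas, labels, b=0.0, gamma=0.5)
pred_near_negative = svm_decision((0.1, -0.1), support_vectors, alphas, labels, b=0.0, gamma=0.5)
```

A query point close to the positive support vector is dominated by that vector's kernel value and is assigned +1; the same reasoning gives -1 near the negative support vector.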

2.3. The History-Based Adaptive Differential Evolution with Linear Population Size Reduction and Population-Wide Inertia (L-SHADE-PWI)

From Section 2.2, it is obvious that, before using the SVM algorithm, appropriate values of the penalty coefficient (C) and the parameter in the kernel function (γ) need to be determined. These hyperparameters play an important role in the learning phase of the model. As a consequence, they can have essential influences on the predictive capability of the soil erosion prediction model. The step of selecting these hyperparameters is usually referred to as model selection. It is problem-dependent. It is also very challenging because the hyperparameters must be selected from continuous domains, where there is an infinite number of possible choices [33, 89–92].

Traditional approaches for model selection include grid search and random search. In grid search, hyperparameters are exhaustively searched over a specified subset of the hyperparameter space. Random search, on the other hand, randomly selects combinations of hyperparameters for consideration. Both grid search and random search are simple and can be done in parallel. However, they often fail to deliver the desired predictive accuracy because hyperparameter selection is highly data-dependent and complex. For the case of an SVM model, the penalty coefficient (C) and the kernel function parameter (γ) are both real numbers. Hence, there is an infinite number of possible combinations of C and γ. This means that an exhaustive grid search is infeasible, and researchers have turned to metaheuristic methods to address the model selection problem [72, 91, 93, 99].

In this work, we propose to use the history-based adaptive differential evolution with linear population size reduction and population-wide inertia (L-SHADE-PWI) algorithm to select the hyperparameters for the SVM algorithm. The L-SHADE-PWI [100] is a newly proposed metaheuristic with promising optimization capability. With the L-SHADE-PWI algorithm, the model selection of an SVM model is formulated as a global optimization problem where the objective function measures how well the model fits the training and validating sets using a set of the hyperparameters of the penalty coefficient (C) and the kernel function parameter (γ).

The L-SHADE-PWI is essentially an extension of the history-based adaptive differential evolution with linear population size reduction (L-SHADE) algorithm proposed in [101]. This newly developed method inherits the effective mutation-crossover operator of the standard differential evolution (DE), the adaptive tuning strategy of the L-SHADE, the DE/current-to-pbest/1 mutation strategy described in [102], the linear population size reduction proposed in [101], and a population-wide inertia term (PWI) incorporated in the mutation operation [100].

A brief summary of the L-SHADE-PWI algorithm is presented in Algorithm 1. At the heart of the algorithm, there are three main steps: mutation, crossover, and selection. The mutation step generates a trial vector. The mutation operator requires computing the population-wide inertia term $\mathrm{PWI}_g$. The PWI term indicates the averaged direction and size of the successful moves that led to better solutions in the preceding generation. The incorporation of the population-wide inertia aims at enhancing the variations of population members in the direction that, on average, brought about a reduction of the cost function [100]. The PWI term is calculated as follows [100]:

$$\mathrm{PWI}_g = \frac{1}{N_{S,g-1}} \sum_{i=1}^{N_{S,g-1}} \Delta x_{i,g-1}, \tag{7}$$

where $\Delta x_{i,g-1}$ denotes the move of a population member in the search space that occurred in generation $g - 1$. It is worth noticing that individual moves that do not lead to better cost function values are not taken into account. Therefore, the sum described in equation (7) is divided by $N_{S,g-1}$, which represents the number of beneficial moves [100].
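As a rough illustration of the averaging described above, the following Python sketch computes an inertia vector from the moves of the preceding generation, keeping only the moves flagged as beneficial. The function name, the list-based data layout, and the zero-vector fallback when no move was beneficial are assumptions of this sketch and are not taken from [100].

```python
def population_wide_inertia(previous_moves, improved):
    """Average the moves of the previous generation that lowered the cost.

    previous_moves: list of D-dimensional move vectors, one per member.
    improved: list of booleans, True where the move reduced the cost function.
    Returns a D-dimensional list; a zero vector if no move was beneficial
    (this fallback is an assumption of the sketch).
    """
    dim = len(previous_moves[0])
    beneficial = [m for m, ok in zip(previous_moves, improved) if ok]
    if not beneficial:
        return [0.0] * dim
    return [sum(m[d] for m in beneficial) / len(beneficial) for d in range(dim)]

# Three made-up moves in a two-dimensional search space (e.g., C and gamma).
moves = [[1.0, 0.0], [3.0, 2.0], [-5.0, 9.0]]
flags = [True, True, False]  # the third move did not improve the cost
pwi = population_wide_inertia(moves, flags)
```

Only the first two moves enter the average, so the unsuccessful third move has no influence on the resulting direction.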

Algorithm 1: The L-SHADE-PWI algorithm.
	Define PS, D, CF(x), Gmax, and an archive A = ∅
	Define the two archives MF and MCR and the two sets SF and SCR
	Create a population P with PS members randomly
	Identify and record x_best,g, which is currently the best solution
	For each generation g
		For each individual x_i,g
			Select MF_k and CR_k randomly from MF and MCR
			Calculate F_i = Gaussian(MF_k, 0.1)
			Calculate CR_i = Gaussian(CR_k, 0.1)
			Select x_pbest,g randomly from A
			Select two random members x_r1,g and x_r2,g
			Calculate the population-wide inertia term PWI_g
			Create a trial vector using the mutation operator of [100]
			Perform crossover according to [103]
			Carry out the selection operator [103]
		End For
		Collect successful F_i and CR_i
		Update SF, SCR, MF, MCR, and A [104]
		Update PS [101]
	End For
	Return x_best,g

Accordingly, this trial vector is used in the crossover step to create a mutated vector u. The final step compares the mutated vector with its parent and replaces the parent in the next generation if a better cost function value is achieved. In Algorithm 1, PS is the population size; D represents the number of searched variables (two in our case); CF(x) is the cost function; and x denotes the vector of the searched variables. The two vectors MF and MCR, each of length H, are archives containing the mean values of the mutation scale (F) and the crossover probability (CR). The two sets SF and SCR keep the F and CR values that deliver offspring better than their parents. Gmax is the limit (maximum number) of generations. Finally, a scheme for population size reduction is implemented to accelerate the convergence of the metaheuristic algorithm.
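The population size reduction mentioned above can be sketched as a linear schedule in the style of the L-SHADE algorithm [101], in which the population shrinks from its initial size to a minimum size as the evaluation budget is consumed. In the sketch below, the minimum size of 4 and the budget of 2000 evaluations are illustrative assumptions, and the function name is hypothetical; only the initial size of 20 is reported in this article.

```python
def reduced_population_size(evals_used, max_evals, initial_size, minimum_size=4):
    """Linearly shrink the population from initial_size to minimum_size.

    evals_used: number of cost function evaluations consumed so far.
    max_evals: total evaluation budget (an assumed value in this sketch).
    """
    size = round((minimum_size - initial_size) / max_evals * evals_used + initial_size)
    return max(minimum_size, size)

# Population sizes at the start, the midpoint, and the end of the budget.
sizes = [reduced_population_size(e, 2000, 20) for e in (0, 1000, 2000)]
```

After each generation, the worst-ranked members would be removed until the population matches the scheduled size, which concentrates the remaining evaluations on the most promising solutions.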

3. The Proposed L-SHADE-PWI Optimized SVM for Rainfall-Induced Soil Erosion Prediction

This section describes the structure of the L-SHADE-PWI-SVM model used for rainfall-induced soil erosion prediction. The newly developed model is an integration of machine learning and metaheuristic optimization. In detail, the SVM is employed to generalize a decision boundary that separates the input space into two distinctive domains: nonerosion (the negative class) and erosion (the positive class). The factors of EI30, slope, OC topsoil, pH topsoil, bulk density, total pore volume, texture-silt, texture-clay, texture-sand, and soil cover rate are employed as influencing factors for pattern classification. Because the model establishment phase of the SVM necessitates an appropriate specification of the penalty coefficient and the kernel function parameter, the proposed model relies on the L-SHADE-PWI to optimize the hyperparameter selection phase.

Figure 2 demonstrates the overall structure of the proposed L-SHADE-PWI-SVM model. A software program based on the model structure has been developed by the authors in Visual C# .NET (framework 4.6.2) environment and the Accord.NET Framework [105]. The newly established model has been tested on the ASUS FX705GE-EW165T (Core i7 8750H and 8 GB RAM) platform.

It should be noted that the collected dataset has been randomly separated into a training dataset (90%) and a testing dataset (10%). The first set is used for model construction and the second set is reserved for model validation. Moreover, the Z-score equation has been used to standardize the data range since it may enhance the classification performance [106]. The Z-score equation is given by

$$X_Z = \frac{X_D - M_X}{\mathrm{STD}_X},$$

where X_Z and X_D are the normalized and the original variables, respectively. M_X and STD_X denote the mean value and the standard deviation of the soil erosion influencing factors, respectively.
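The Z-score standardization can be illustrated with a short Python sketch using only the standard library. The use of the population standard deviation and the handling of constant columns are assumptions of this sketch; the sample values are made up.

```python
import statistics

def z_score_columns(rows):
    """Standardize each column of a list-of-rows table to zero mean and unit std."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]  # population std (assumption)
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Two made-up features on very different scales.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
standardized = z_score_columns(rows)
```

After the transformation both columns share the same scale, so a distance-based kernel such as the RBF is not dominated by the feature with the largest raw range. In practice, the mean and standard deviation would typically be estimated on the training set and reused for the testing set.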

As described earlier, the model training and soil erosion prediction phases of the SVM necessitate an appropriate set of the penalty coefficient and the kernel function parameter. The first hyperparameter specifies how the SVM’s loss function increases due to misclassified data samples. The second hyperparameter influences the smoothness of the classification boundary. Thus, these two hyperparameters strongly affect the generalization and predictive accuracy of the SVM-based soil erosion prediction model. This research proposes to use the L-SHADE-PWI metaheuristic for optimizing the performance of the SVM model. At the first generation, the L-SHADE-PWI initializes a population of hyperparameters randomly. In each subsequent generation, this metaheuristic approach gradually explores and exploits the search space to identify promising regions that contain high-quality sets of the SVM model’s hyperparameters.

Relying on the hyperparameters found by the L-SHADE-PWI metaheuristic, the SVM model analyzes the training dataset and generalizes a decision boundary that separates input samples associated with nonerosion and erosion classes. It is noted that this study employs the Accord.NET Framework [105] to carry out the SVM model’s training and prediction phases. Furthermore, to optimize the SVM-based soil erosion prediction model, this research relies on a K-fold cross-validation with K = 5. Based on the cross-validation framework, the dataset is separated into 5 mutually exclusive sets. In each of the five runs, one set is utilized for model verification and the other four sets are used as training samples. The average predictive performance obtained from the 5 folds is used to express the generalization capability of the rainfall-induced soil erosion prediction model. Accordingly, the objective function (F) used for the L-SHADE-PWI metaheuristic-based optimization is given by

$$F = \frac{1}{K} \sum_{k=1}^{K} \left( \mathrm{FNR}_k + \mathrm{FPR}_k \right), \quad K = 5,$$

where FNR_k and FPR_k are the false-negative rate (FNR) and the false-positive rate (FPR) obtained from the k-th run, respectively.

The FNR and FPR are calculated according to the following equations:

$$\mathrm{FNR} = \frac{FN}{FN + TP}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN},$$

where FN, FP, TP, and TN represent the false-negative, false-positive, true-positive, and true-negative data samples, respectively.
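The cross-validation-based cost function can be sketched in Python as follows. The sketch assumes that the cost is the fold-average of FNR_k + FPR_k and that labels are coded as 1 (erosion) and 0 (nonerosion). The callable `train_and_predict` is a hypothetical placeholder for training an SVM with one candidate pair (C, γ) and predicting the held-out fold; the fold assignment scheme is also an assumption.

```python
import random

def error_rates(y_true, y_pred):
    """Return (FNR, FPR) for binary labels, with 1 = erosion and 0 = nonerosion."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fnr, fpr

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k mutually exclusive folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]

def cross_validation_cost(X, y, train_and_predict, k=5, seed=0):
    """Average FNR_k + FPR_k over k folds; lower values are better."""
    total = 0.0
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        y_pred = train_and_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
        )
        fnr, fpr = error_rates([y[i] for i in fold], y_pred)
        total += fnr + fpr
    return total / k

# Made-up one-feature data and a stand-in "classifier" that ignores training.
X = [[float(i)] for i in range(20)]
y = [1 if i >= 10 else 0 for i in range(20)]
threshold_rule = lambda X_tr, y_tr, X_te: [1 if row[0] >= 10 else 0 for row in X_te]
cost = cross_validation_cost(X, y, threshold_rule)
```

A metaheuristic would call such a cost function once for every candidate hyperparameter pair, which is why each cost evaluation involves K model trainings.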

4. Model Results and Discussions

4.1. Preliminary Feature Importance Investigation

For the purpose of constructing a robust soil erosion prediction model, a preliminary assessment of the relevance of input features (EI30, slope, OC, pH, bulk density, soil porosity, soil texture (silt, clay, and sand fractions), and soil cover rate) should be carried out. This study relies on the well-known ReliefF method [107] to investigate the importance of each aforementioned soil erosion influencing factor. This step of the study aims at identifying irrelevant influencing factors. Using the ReliefF method, a weighting value is computed for each input factor. A large weight indicates a strong relevance between an influencing factor and the soil erosion status (either erosion or nonerosion). The preliminary assessment of the relevance of input features is reported in Figure 3. In this figure, the ReliefF weight values of all input variables are shown. It is observed that the input factor X₁ (EI30) receives the highest weight value, followed by the input factors X₁₀ (soil cover rate), X₃ (OC topsoil), and X₉ (sand fraction). The factors X₇ (silt fraction) and X₈ (clay fraction) have low weight values. However, these weights are still significantly larger than zero. Therefore, the L-SHADE-PWI metaheuristic-optimized SVM model employs all of the ten factors.

4.2. Model Performance Evaluation Metrics

To quantify and compare performances of soil erosion prediction models, the classification accuracy rate (CAR), which is the percentage of correctly classified cases, is often employed:

$$\mathrm{CAR} = \frac{N_c}{N_a} \times 100\%,$$

where N_c and N_a denote the number of correctly classified samples and the total number of data samples, respectively.

Besides CAR, the true-positive rate (TPR) (the percentage of positive instances correctly classified), false-positive rate (FPR) (the percentage of negative instances misclassified), false-negative rate (FNR) (the percentage of positive instances misclassified), and true-negative rate (TNR) (the percentage of negative instances correctly classified) are also employed [56, 108]. The FPR and FNR indices have been defined in the previous section of the article; the TPR and TNR indices are given by

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{TNR} = \frac{TN}{TN + FP},$$

where TP, TN, FP, and FN represent the true-positive, true-negative, false-positive, and false-negative data samples, respectively.

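In addition, based on the results of the TP, TN, FP, and FN, indices of precision, recall, negative predictive value (NPV), and F1 score are also useful for expressing the model predictive capability [109]:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN},$$

$$\mathrm{NPV} = \frac{TN}{TN + FN}, \qquad \mathrm{F1\ score} = \frac{2\,TP}{2\,TP + FP + FN}.$$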
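For reference, the evaluation metrics of this section can be computed from the four confusion-matrix counts as in the following Python sketch. The function name and the counts in the example are made up for illustration, and returning 0.0 for an undefined ratio is an assumption of the sketch.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute CAR (%), TPR, TNR, precision, recall, NPV, and F1 from counts."""
    def ratio(num, den):
        # Undefined ratios are reported as 0.0 (an assumption of this sketch).
        return num / den if den else 0.0

    return {
        "CAR": 100.0 * ratio(tp + tn, tp + tn + fp + fn),
        "TPR": ratio(tp, tp + fn),
        "TNR": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "NPV": ratio(tn, tn + fn),
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
    }

# Hypothetical counts for a testing set of 22 samples.
metrics = classification_metrics(tp=9, tn=10, fp=1, fn=2)
```

Recall and TPR are the same quantity under two names, which is why they share one expression in the sketch.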

4.3. Experimental Results and Comparison

To evaluate the model performance, the original dataset has been randomly divided into two mutually exclusive sets. Since the size of the collected dataset is moderate, the training/testing data ratio is selected to be 9/1 [110]. This means that 90% of the original dataset is used for model construction; the rest of the dataset is reserved for the model testing phase. The testing set is employed as novel data instances to verify the generalization of the constructed L-SHADE-PWI-SVM model used for rainfall-induced soil erosion prediction. Illustrations of training and testing datasets are provided in Table 2.

In addition, since the model construction phase of the SVM-based model requires the setting of the penalty coefficient and the kernel function parameter, the L-SHADE-PWI has been used to automatically fine-tune these hyperparameters. The L-SHADE-PWI-based optimization process is illustrated in Figure 4. It is noted that the metaheuristic population consists of 20 members and the maximum number of generations is 100. The outcomes of the optimization process are the tuned values of (i) the penalty coefficient and (ii) the kernel function parameter.

To demonstrate the capability of the proposed L-SHADE-PWI-SVM, the backpropagation artificial neural network (BPANN) [106, 111] and radial basis function artificial neural network (RBFANN) models are employed as benchmark methods. The BPANN and RBFANN models have been constructed in the MATLAB environment with the help of built-in functions provided in the Statistics and Machine Learning Toolbox [112].

It is noted that the number of neurons in the hidden layer of the BPANN model is selected as a function of D_X and C_N based on the suggestion in [113], where D_X and C_N denote the numbers of input features and class outputs, respectively. Herein, the numbers of input features and class outputs are 10 and 2, respectively. Moreover, the BPANN is trained using the Levenberg–Marquardt (LM) algorithm [111, 114]. The sigmoidal activation function is used for the BPANN trained with the LM algorithm (denoted as LM-BPANN). The BPANN model is trained with the maximum number of epochs = 1000. Moreover, based on several trial-and-error experiments with the collected dataset, the appropriate RBFANN model [115] consists of 30 neurons and has been trained with a basic width of 1.2.

The L-SHADE-PWI-SVM and the benchmark models are trained on a training set (90%) and evaluated on a testing set (10%), both randomly drawn from the original dataset. Since a one-time evaluation is not sufficient to verify the model capability due to the effect of the random data sampling process, this study has repeated the random splitting of the dataset into training and testing sets 20 times to reduce such an undesired effect. Accordingly, the prediction results of the L-SHADE-PWI-SVM and benchmark models obtained from 20 runs are reported in Table 3 and Figure 5.
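The repeated random subsampling protocol described above can be sketched as follows. This is only an illustration in Python and not the MATLAB code used in the study; `split_indices`, `repeated_holdout`, the `train_and_score` callback, and the seed are hypothetical names introduced here.

```python
import random
import statistics


def split_indices(n_samples, test_ratio, rng):
    """Shuffle sample indices and split them into disjoint train/test lists."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = max(1, round(n_samples * test_ratio))
    return indices[n_test:], indices[:n_test]


def repeated_holdout(n_samples, train_and_score, n_runs=20, test_ratio=0.1, seed=0):
    """Repeat the random train/test split n_runs times and summarize the scores.

    train_and_score(train_idx, test_idx) is a hypothetical callback that fits
    a model on the training indices and returns its score on the test indices.
    Returns the mean score, the sample standard deviation, and all scores.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        train_idx, test_idx = split_indices(n_samples, test_ratio, rng)
        scores.append(train_and_score(train_idx, test_idx))
    return statistics.mean(scores), statistics.stdev(scores), scores
```

With 236 samples and a test ratio of 0.1, each run holds out 24 samples and trains on the remaining 212. Reporting both the mean and the deviation over the runs is what allows accuracy and stability to be compared across models.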

It can be seen that the proposed hybrid machine learning model has achieved the highest accuracy in soil erosion detection with CAR = 92.292%, while LM-BPANN ranks second with CAR = 87.083% and RBFANN ranks third with CAR = 77.917%. L-SHADE-PWI-SVM is also the most reliable predictor, with the smallest CAR deviation (5.1 versus 6.6 and 9.9). This is clearly demonstrated in the box plots in Figure 5.

The proposed model is also superior in the other metrics (precision, recall, NPV, and F1-score), with better values and less variation (smaller deviation). More specifically, L-SHADE-PWI-SVM has precision = 0.944, recall = 0.900, NPV = 0.904, and F1-score = 0.919. LM-BPANN is the second best prediction approach with precision = 0.901, recall = 0.854, NPV = 0.868, and F1-score = 0.869, followed by RBFANN with precision = 0.804, recall = 0.758, NPV = 0.769, and F1-score = 0.776.
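For reference, the metrics above can be computed from the four confusion-matrix counts of a binary classifier using their standard definitions, as in the sketch below. The function name and the example counts are hypothetical and are not taken from the study's tables.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return CAR (%), precision, recall, NPV, and F1-score from confusion counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives, false
    positives, and false negatives. Degenerate cases with empty denominators
    are not handled in this sketch.
    """
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "CAR": 100.0 * (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "NPV": tn / (tn + fn),
        "F1": 2.0 * precision * recall / (precision + recall),
    }


# Hypothetical counts for one 24-sample testing set.
example = classification_metrics(tp=11, tn=11, fp=1, fn=1)
```

Note that the values reported in the text are averages over 20 runs, so the averaged F1-score need not equal the F1-score computed from the averaged precision and recall.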

Moreover, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) [116] are also employed for assessing the prediction model performance [56, 117, 118]. This is because the AUC expresses the overall predictive accuracy of the classifiers used for soil erosion prediction. This study computes the AUC to indicate the generalization capability of the proposed L-SHADE-PWI-SVM as well as the benchmark models. The AUC results are reported in Table 4. As can be seen from the experimental results, the AUC value in the testing phase of the proposed L-SHADE-PWI-SVM (0.908) is higher than those of the LM-BPANN (0.898) and RBFANN (0.787). In addition, the ROC curves of the proposed model are illustrated in Figure 6.
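One way to compute the AUC without tracing the ROC curve is the rank-based formulation: the AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below illustrates this; the function name and the toy labels and scores are hypothetical.

```python
def auc_from_scores(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (erosion) or 0 (nonerosion); scores are classifier outputs
    where a larger value means the positive class is more likely.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_auc = auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a testing set of a few dozen samples.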

4.4. Discussion

To further confirm the superiority of the proposed hybrid L-SHADE-PWI-SVM model, the Wilcoxon signed-rank test [119] with a significance level (p value) of 0.05 is also employed in this study to demonstrate the statistical significance of the difference in model results. This nonparametric hypothesis testing method is widely employed for comparing classification models [120]. The test outcomes of the pairwise model comparisons are reported in Table 5. With p values < 0.05, the null hypothesis of equal means is rejected.
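The pairwise test can be sketched as below for two models evaluated on the same repeated splits. This is a simplified version that assumes the normal approximation without tie or continuity corrections, so its p values may differ slightly from those of an exact test or of a statistics package. The function name and the example accuracy values are hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test using the normal approximation.

    Returns (w_plus, p_value). Zero differences are dropped and tied absolute
    differences share their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Extend the group while the next absolute difference is tied.
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return w_plus, p_value


# Hypothetical per-run accuracies (%) of two models on the same ten splits.
model_a = [92, 95, 88, 96, 91, 90, 97, 93, 89, 94]
model_b = [91, 93, 85, 92, 86, 84, 90, 85, 80, 84]
w_plus, p_value = wilcoxon_signed_rank(model_a, model_b)
```

Because the test is paired, both models must be scored on identical training/testing splits in each run.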

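Moreover, to assess the reliability of the L-SHADE-PWI-SVM-based soil erosion prediction model, the coefficient of variation (COV) [121, 122] is employed in this section of the study. The COV (%) is computed as follows [121, 122]:

$$\mathrm{COV} = \frac{\sigma}{\mu} \times 100\%,$$

where $\sigma$ and $\mu$ denote the standard deviation and the mean of a performance metric over the repeated runs, respectively.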

The COV calculation results of the proposed L-SHADE-PWI-SVM and the benchmark approaches used for rainfall-induced erosion susceptibility prediction are reported in Table 6. It is noted that the COV is the ratio of the standard deviation to the mean and is used to quantify the dispersion of a model's prediction outcomes obtained from a repeated data sampling process [123]. In the particular case of rainfall-induced soil erosion susceptibility evaluation, a small value of COV is desirable since it indicates a stable data-driven model for predicting this phenomenon. As shown in Table 6, the proposed L-SHADE-PWI-SVM has achieved the lowest COV in all of the performance measurement metrics (5.535% for CAR, 7.309% for precision, 9.000% for recall, 8.628% for NPV, and 5.767% for F1-score). These outcomes indicate that the newly developed model is the most reliable tool for rainfall-induced soil erosion susceptibility assessment.
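The COV of a metric over repeated runs can be computed as in the short sketch below. The function name is hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption, since the text does not state which was used.

```python
import statistics


def coefficient_of_variation(values):
    """Return the COV in percent: standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("COV is undefined for a zero mean")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return 100.0 * statistics.stdev(values) / mean


# Toy scores: mean 100 and sample standard deviation 10 give a COV of 10%.
toy_cov = coefficient_of_variation([90, 100, 110])
```

As a rough cross-check against the reported figures, dividing the rounded CAR deviation of 5.1 by the mean CAR of 92.292 gives about 5.5%, which agrees with the 5.535% listed for L-SHADE-PWI-SVM up to rounding of the deviation.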

In conclusion, L-SHADE-PWI-SVM is a more accurate and more reliable model than LM-BPANN (p value = 0.009816) and RBFANN (p value = 0.000631). The statistical tests confirm that the superiority of L-SHADE-PWI-SVM in terms of accuracy is significant, with p values < 0.05. This means that the newly constructed L-SHADE-PWI-SVM is highly appropriate for soil erosion prediction in the study area.

Compared to conventional approaches for soil erosion prediction such as the Revised Universal Soil Loss Equation (RUSLE) [124, 125], which requires significant effort on parameter calibration, the newly proposed method is entirely data-driven: all of the model parameters are determined via the model training process. In addition, recently proposed machine learning methods for soil erosion susceptibility prediction have predominantly relied on individual models or ensembles of models [126–128]. Therefore, trial-and-error processes and modeling experience are required for constructing such machine learning models. The integration of machine learning and metaheuristic optimization for soil erosion prediction has rarely been investigated. Thus, the current study is an attempt to fill this gap in the literature and to show the potential of this hybrid framework for tackling the problem at hand.

Nevertheless, one disadvantage of the proposed approach is that the optimization process used to determine an optimal set of SVM parameters might be costly, especially when the collected dataset is large. This is because the model training and prediction processes of the SVM are embedded into the cost function calculation phase of the employed metaheuristic. Another limitation of the current approach is that automatic feature selection has not been integrated into the hybridization of the L-SHADE-PWI and SVM. Such drawbacks should be resolved in future extensions of the study.

5. Conclusions

In tropical regions, soil erosion is a natural hazard that causes various harmful effects on the land, including loss of soil, soil structure breakdown, and the decline of organic matter and nutrients within the soil. These ultimately lead to critical economic losses for landowners. This study has developed a hybrid intelligent method, named L-SHADE-PWI-SVM, for predicting the status of soil erosion based on an integration of machine learning and metaheuristic optimization. The SVM pattern recognition method is employed to generalize a decision boundary that separates input data into the two categories of erosion and nonerosion.

The ten variables of EI30, slope, OC topsoil, pH topsoil, bulk density, soil porosity, soil texture (silt, clay, and sand fractions), and soil cover rate are used as erosion conditioning factors. In addition, to optimize the SVM model performance, the state-of-the-art L-SHADE-PWI metaheuristic is employed. The newly developed L-SHADE-PWI-SVM has achieved a good predictive performance (CAR = 92.292%, F1-score = 0.919, and AUC = 0.908) obtained from a repeated data sampling process. The experimental results supported by the Wilcoxon signed-rank test demonstrate that the proposed hybrid model is superior to the benchmark methods including the LM-BPANN (CAR = 87.083%, F1-score = 0.869, and AUC = 0.898) and the RBFANN (CAR = 77.917%, F1-score = 0.776, and AUC = 0.787). The proposed L-SHADE-PWI-SVM has outperformed the benchmark approaches in all of the evaluation metrics. These results confirm the efficacy of applying the proposed hybrid machine learning method to the problem of interest. The L-SHADE-PWI-SVM can be a promising tool to assist landowners and managers in quickly identifying potential soil erosion areas and developing preventive measures (Table 7).

Future developments of the current study may include the following: (i) The applications of other advanced metaheuristics in optimizing the machine learning model used for soil erosion prediction (ii) The integration of state-of-the-art feature selection into the machine learning structure to further enhance the prediction accuracy (iii) The investigation of advanced kernel functions for better dealing with nonlinearity in soil erosion data classification

Data Availability

The data used to support the findings of this study can be found in Table 7.

Conflicts of Interest

The authors confirm that there are no conflicts of interest regarding the publication.

Acknowledgments

This research was funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant no. 105.08-2017.302.

References

J. Jiao, H. Zou, Y. Jia, and N. Wang, “Research progress on the effects of soil erosion on vegetation,” Acta Ecologica Sinica, vol. 29, no. 2, pp. 85–91, 2009.
S. Visser, S. Keesstra, G. Maas, and M. De Cleen, “Soil as a basis to create enabling conditions for transitions towards sustainable land management as a key to achieve the SDGs by 2030,” Sustainability, vol. 11, no. 23, p. 6792, 2019.
S. Molenaar, G. Mol, J. De Leeuw et al., “Soil-related sustainable development goals: four concepts to make land degradation neutrality and restoration work,” Land, vol. 7, no. 4, p. 133, 2018.
S. D. Keesstra, J. Bouma, J. Wallinga et al., “The significance of soils and soil science towards realization of the United Nations sustainable development goals,” Soil, vol. 2, no. 2, pp. 111–128, 2016.
S. A. El-Swaify, “Factors affecting soil erosion hazards and conservation needs for tropical steeplands,” Soil Technology, vol. 11, no. 1, pp. 3–16, 1997.
G. Clemens, S. Fiedler, N. D. Cong, N. Van Dung, U. Schuler, and K. Stahr, “Soil fertility affected by land use history, relief position, and parent material under a tropical climate in NW-Vietnam,” Catena, vol. 81, no. 2, pp. 87–96, 2010.
P. Schmitter, H. L. Fröhlich, G. Dercon et al., “Redistribution of carbon and nitrogen through irrigation in intensively cultivated tropical mountainous watersheds,” Biogeochemistry, vol. 109, no. 1–3, pp. 133–150, 2012.
F. Turkelboom, J. Poesen, and G. Trébuil, “The multiple land degradation effects caused by land-use intensification in tropical steeplands: a catchment study from northern Thailand,” Catena, vol. 75, no. 1, pp. 102–116, 2008.
P. Schmitter, G. Dercon, T. Hilger et al., “Sediment induced soil spatial variation in paddy fields of Northwest Vietnam,” Geoderma, vol. 155, no. 3-4, pp. 298–307, 2010.
R. P. C. Morgan, Soil Erosion and Conservation, Blackwell Science Ltd, Oxford, UK, 3rd edition, 2005.
M. López-Vicente, H. Kramer, and S. Keesstra, “Effectiveness of soil erosion barriers to reduce sediment connectivity at small basin scale in a fire-affected forest,” Journal of Environmental Management, vol. 278, Article ID 111510, 2021.
A. Cerdà, A. Novara, P. Dlapa et al., “Rainfall and water yield in Macizo del Caroig, Eastern Iberian Peninsula. Event runoff at plot scale during a rare flash flood at the Barranco de Benacancil,” Cuadernos de Investigación Geográfica Geographical Research Letters, 2021, In press.
A. Novara, A. Cerda, E. Barone, and L. Gristina, “Cover crop management and water conservation in vineyard and olive orchards,” Soil and Tillage Research, vol. 208, Article ID 104896, 2021.
J. Rodrigo-Comino, E. Terol, G. Mora, A. Giménez-Morera, and A. Cerdà, “Vicia sativa roth. can reduce soil and water losses in recently planted vineyards (vitis vinifera L.),” Earth Systems and Environment, vol. 4, no. 4, pp. 827–842, 2020.
S. D. Keesstra, J. Rodrigo-Comino, A. Novara et al., “Straw mulch as a sustainable solution to decrease runoff and erosion in glyphosate-treated clementine plantations in Eastern Spain. An assessment using rainfall simulation experiments,” Catena, vol. 174, pp. 95–103, 2019.
A. Cerdà, J. Rodrigo-Comino, A. Novara et al., “Long-term impact of rainfed agricultural land abandonment on soil erosion in the Western Mediterranean basin,” Progress in Physical Geography: Earth and Environment, vol. 42, no. 2, pp. 202–219, 2018.
A. Cerdà, J. Rodrigo-Comino, A. Giménez-Morera, and S. D. Keesstra, “Hydrological and erosional impact and farmer's perception on catch crops and weeds in citrus organic farming in Canyoles river watershed, Eastern Spain,” Agriculture, Ecosystems & Environment, vol. 258, pp. 49–58, 2018.
V. D. Tuan, T. Hilger, L. MacDonald et al., “Mitigation potential of soil conservation in maize cropping on steep slopes,” Field Crops Research, vol. 156, pp. 91–102, 2014.
M. A. Nearing, G. Govers, and L. D. Norton, “Variability in soil erosion data from replicated plots,” Soil Science Society of America Journal, vol. 63, no. 6, pp. 1829–1835, 1999.
K. G. Renard and J. R. Freimund, “Using monthly precipitation data to estimate the R-factor in the revised USLE,” Journal of Hydrology, vol. 157, no. 1-4, pp. 287–306, 1994.
Z. Kliment, J. Kadlec, and J. Langhammer, “Evaluation of suspended load changes using AnnAGNPS and SWAT semi-empirical erosion models,” Catena, vol. 73, no. 3, pp. 286–299, 2008.
J. M. Laflen, L. J. Lane, and G. R. Foster, “WEPP: a new generation of erosion prediction technology,” Journal of Soil and Water Conservation, vol. 46, pp. 34–38, 1991.
W. S. Merritt, R. A. Letcher, and A. J. Jakeman, “A review of erosion and sediment transport models,” Environmental Modelling & Software, vol. 18, no. 8-9, pp. 761–799, 2003.
Ł. Sadowski, M. Nikoo, M. Shariq, E. Joker, and S. Czarnecki, “The nature-inspired metaheuristic method for predicting the creep strain of green concrete containing ground granulated blast furnace slag,” Materials, vol. 12, no. 2, p. 293, 2019.
A. Goetzke-Pala, A. Hoła, and Ł. Sadowski, “A non-destructive method of the evaluation of the moisture in saline brick walls using artificial neural networks,” Archives of Civil and Mechanical Engineering, vol. 18, no. 4, pp. 1729–1742, 2018.
A. Ashrafian, F. Shokri, M. J. Taheri Amiri, Z. M. Yaseen, and M. Rezaie-Balf, “Compressive strength of foamed cellular lightweight concrete simulation: new development of hybrid artificial intelligence model,” Construction and Building Materials, vol. 230, Article ID 117048, 2020.
A. Malik, A. Kumar, S. Kim et al., “Modeling monthly pan evaporation process over the Indian central Himalayas: application of multiple learning artificial intelligence model,” Engineering Applications of Computational Fluid Mechanics, vol. 14, no. 1, pp. 323–338, 2020.
K. Khosravi, P. Daggupati, M. T. Alami et al., “Meteorological data mining and hybrid data-intelligence models for reference evaporation simulation: a case study in Iraq,” Computers and Electronics in Agriculture, vol. 167, Article ID 105041, 2019.
M. Abedini, B. Ghasemian, A. Shirzadi et al., “A novel hybrid approach of bayesian logistic regression and its ensembles for landslide susceptibility assessment,” Geocarto International, vol. 34, no. 13, pp. 1427–1457, 2019.
M. A. Shahin, “State-of-the-art review of some artificial intelligence applications in pile foundations,” Geoscience Frontiers, vol. 7, no. 1, pp. 33–44, 2016.
M. A. Shahin, “A review of artificial intelligence applications in shallow foundations,” International Journal of Geotechnical Engineering, vol. 9, no. 1, pp. 49–60, 2015.
R. Mohanty, S. Suman, and S. K. Das, “Prediction of vertical pile capacity of driven pile in cohesionless soil using artificial intelligence techniques,” International Journal of Geotechnical Engineering, vol. 12, no. 2, pp. 209–216, 2018.
X. Ding, M. Hasanipanah, H. Nikafshan Rad, and W. Zhou, “Predicting the blast-induced vibration velocity using a bagged support vector regression optimized with firefly algorithm,” Engineering with Computers, 2020.
Q. Fang, B. Yazdani Bejarbaneh, M. Vatandoust, D. Jahed Armaghani, B. Ramesh Murlidhar, and E. Tonnizam Mohamad, “Strength evaluation of granite block samples with different predictive models,” Engineering with Computers, 2019.
H. Luo and S. G. Paal, “A locally weighted machine learning model for generalized prediction of drift capacity in seismic vulnerability assessments,” Computer-Aided Civil and Infrastructure Engineering, vol. 34, no. 11, pp. 935–950, 2019.
S. I. Abba, S. J. Hadi, S. S. Sammen et al., “Evolutionary computational intelligence algorithm coupled with self-tuning predictive model for water quality index determination,” Journal of Hydrology, vol. 587, Article ID 124974, 2020.
O. Kisi, M. Alizamir, S. Trajkovic, J. Shiri, and S. Kim, “Solar radiation estimation in mediterranean climate by weather variables using a novel bayesian model averaging and machine learning methods,” Neural Processing Letters, vol. 52, no. 3, pp. 2297–2318, 2020.
S. Q. Salih, A. Sharafati, K. Khosravi et al., “River suspended sediment load prediction based on river discharge information: application of newly developed data mining models,” Hydrological Sciences Journal, vol. 65, no. 4, pp. 624–637, 2020.
X. Chen and W. Chen, “GIS-based landslide susceptibility assessment using optimized hybrid machine learning methods,” CATENA, vol. 196, Article ID 104833, 2021.
X. Zhao and W. Chen, “Optimization of computational intelligence models for landslide susceptibility evaluation,” Remote Sensing, vol. 12, no. 14, p. 2180, 2020.
W. Chen and Y. Li, “GIS-based evaluation of landslide susceptibility using hybrid computational intelligence models,” CATENA, vol. 195, Article ID 104777, 2020.
K. Pham, D. Kim, S. Park, and H. Choi, “Ensemble learning-based classification models for slope stability analysis,” CATENA, vol. 196, Article ID 104886, 2021.
P.-T. T. Ngo, T. D. Pham, V.-H. Nhu et al., “A novel hybrid quantum-PSO and credal decision tree ensemble for tropical cyclone induced flash flood susceptibility mapping with geospatial data,” Journal of Hydrology, Article ID 125682, 2020.
L. Bragagnolo, R. V. D. Silva, and J. M. V. Grzybowski, “Artificial neural network ensembles applied to the mapping of landslide susceptibility,” CATENA, vol. 184, Article ID 104240, 2020.
M. F. Yusof, H. M. Azamathulla, and R. Abdullah, “Prediction of soil erodibility factor for Peninsular Malaysia soil series using ANN,” Neural Computing and Applications, vol. 24, no. 2, pp. 383–389, 2014.
M. Kim and J. E. Gilley, “Artificial neural network estimation of soil erosion and nutrient concentrations in runoff from land application areas,” Computers and Electronics in Agriculture, vol. 64, no. 2, pp. 268–275, 2008.
C. A. S. de Farias and C. A. G. Santos, “The use of Kohonen neural networks for runoff-erosion modeling,” Journal of Soils and Sediments, vol. 14, no. 7, pp. 1242–1250, 2014.
P. Licznar and M. A. Nearing, “Artificial neural networks of soil erosion and runoff prediction at the plot scale,” CATENA, vol. 51, no. 2, pp. 89–114, 2003.
I. Albaradeyia, A. Hani, and I. Shahrour, “WEPP and ANN models for simulating soil loss and runoff in a semi-arid Mediterranean region,” Environmental Monitoring and Assessment, vol. 180, no. 1–4, pp. 537–556, 2011.
D. T. Vu, X.-L. Tran, M.-T. Cao, T. C. Tran, and N.-D. Hoang, “Machine learning based soil erosion susceptibility prediction using social spider algorithm optimized multivariate adaptive regression spline,” Measurement, vol. 164, Article ID 108066, 2020.
V. N. Vapnik, Statistical Learning Theory, John Wiley & Sons, Hoboken, NJ, USA, 1998.
H. Drucker, C. J. C. Burges, L. Kaufman, A. Smola, and V. Vapnik, “Support vector regression machines,” in Proceedings of the 9th International Conference on Neural Information Processing Systems, Denver, CO, USA, December 1996.
C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
B. E. Boser, I. M. Guyon, and V. N. Vapnik, “A training algorithm for optimal margin classifiers,” in Proceedings of the Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, PA, USA, July 1992.
D. Tien Bui, N.-D. Hoang, H. Nguyen, and X.-L. Tran, “Spatial prediction of shallow landslide using Bat algorithm optimized machine learning approach: a case study in Lang Son Province, Vietnam,” Advanced Engineering Informatics, vol. 42, Article ID 100978, 2019.
B. T. Pham, D. Tien Bui, and I. Prakash, “Bagging based support vector machines for spatial prediction of landslides,” Environmental Earth Sciences, vol. 77, p. 146, 2018.
Y. Huang and L. Zhao, “Review on landslide susceptibility mapping using support vector machines,” CATENA, vol. 165, pp. 520–529, 2018.
W. Chen, H. R. Pourghasemi, and S. A. Naghibi, “A comparative study of landslide susceptibility maps produced using support vector machine with different kernel functions and entropy data mining models in China,” Bulletin of Engineering Geology and the Environment, vol. 77, no. 2, pp. 647–664, 2018.
L. Y. Zhou, F. P. Shan, K. Shimizu, T. Imoto, H. Lateh, and K. S. Peng, “A comparative study of slope failure prediction using logistic regression, support vector machine and least square support vector machine models,” AIP Conference Proceedings, vol. 1870, Article ID 060012, 2017.
D. Tien Bui, T. A. Tuan, N.-D. Hoang et al., “Spatial prediction of rainfall-induced landslides for the Lao Cai area (Vietnam) using a hybrid intelligent approach of least squares support vector machines inference model and artificial bee colony optimization,” Landslides, vol. 14, no. 2, pp. 447–458, 2017.
D. Tien Bui, B. Pradhan, O. Lofman, and I. Revhaug, “Landslide susceptibility assessment in Vietnam using support vector machines, decision tree, and naïve bayes models,” Mathematical Problems in Engineering, vol. 2012, Article ID 974638, 26 pages, 2012.
H. Faris, I. Aljarah, and S. Mirjalili, “Improved monarch butterfly optimization for unconstrained global search and neural network training,” Applied Intelligence, vol. 48, no. 2, pp. 445–464, 2018.
G.-G. Wang, S. Deb, and Z. Cui, “Monarch butterfly optimization,” Neural Computing and Applications, vol. 31, no. 7, pp. 1995–2014, 2019.
S. L. Zubaidi, I. H. Abdulkareem, K. S. Hashim et al., “Hybridised artificial neural network model with slime mould algorithm: a novel methodology for prediction of urban stochastic water demand,” Water, vol. 12, no. 10, p. 2692, 2020.
A. M. Hussein, M. Abd Elaziz, M. S. M. Abdel Wahed, and M. Sillanpää, “A new approach to predict the missing values of algae during water quality monitoring programs based on a hybrid moth search algorithm and the random vector functional link network,” Journal of Hydrology, vol. 575, pp. 852–863, 2019.
D. J. Kalita, V. P. Singh, and V. Kumar, “A dynamic framework for tuning SVM hyper parameters based on moth-flame optimization and knowledge-based-search,” Expert Systems with Applications, vol. 168, Article ID 114139, 2020.
Z. Yu, X. Shi, J. Zhou, X. Chen, and X. Qiu, “Effective assessment of blast-induced ground vibration using an optimized random forest model based on a Harris hawks optimization algorithm,” Applied Sciences, vol. 10, no. 4, p. 1403, 2020.
H. Moayedi, M. Gör, Z. Lyu, and D. T. Bui, “Herding Behaviors of grasshopper and Harris hawk for hybridizing the neural network in predicting the soil compression coefficient,” Measurement, vol. 152, Article ID 107389, 2020.
E. H. Houssein, M. E. Hosney, M. Elhoseny, D. Oliva, W. M. Mohamed, and M. Hassaballah, “Hybrid Harris hawks optimization with cuckoo search for drug design and discovery in chemoinformatics,” Scientific Reports, vol. 10, Article ID 14439, 2020.
E. H. Houssein, M. E. Hosney, D. Oliva, W. M. Mohamed, and M. Hassaballah, “A novel hybrid Harris hawks optimization and support vector machines for drug design and discovery,” Computers & Chemical Engineering, vol. 133, Article ID 106656, 2020.
D. Tien Bui, N.-D. Hoang, and P. Samui, “Spatial pattern analysis and prediction of forest fire using new machine learning approach of multivariate adaptive regression splines and differential flower pollination optimization: a case study at Lao Cai province (Viet Nam),” Journal of Environmental Management, vol. 237, pp. 476–487, 2019.
M.-Y. Cheng, D. Prayogo, and Y.-W. Wu, “Prediction of permanent deformation in asphalt pavements using a novel symbiotic organisms search–least squares support vector regression,” Neural Computing and Applications, vol. 31, no. 10, pp. 6261–6273, 2018.
N. Neggaz, E. H. Houssein, and K. Hussain, “An efficient henry gas solubility optimization for feature selection,” Expert Systems with Applications, vol. 152, Article ID 113364, 2020.
W. Chen, X. Chen, J. Peng, M. Panahi, and S. Lee, “Landslide susceptibility modeling based on ANFIS with teaching-learning-based optimization and Satin bowerbird optimizer,” Geoscience Frontiers, vol. 12, no. 1, pp. 93–107, 2021.
L. B. Thao, Vietnam, the Country and Its Geographical Regions, The Gioi Publishers, Hanoi, Vietnam, 1997.
K. C. McGregor, R. L. Bingner, A. J. Bowie, and G. R. Foster, “Erosivity index values for Northern Mississippi,” Transactions of the ASAE, vol. 38, no. 4, pp. 1039–1047, 1995.
K. G. Renard, G. R. Foster, G. A. Weesies, D. K. McCool, and D. C. Yoder, Predicting Soil Erosion by Water: a Guide to Conservation Planning with the Revised Universal Soil Loss Equation (RUSLE), Agricultural Handbook 703, US Department of Agriculture, Washington, DC, USA, 1997.
A. Klute, Methods of Soil Analysis: Part 1—Physical and Mineralogical Methods, Soil Science Society of America, American Society of Agronomy, Madison, WI, USA, 1986.
C. Valentin, F. Agus, R. Alamban et al., “Runoff and sediment losses from 27 upland catchments in Southeast Asia: impact of rapid land use changes and conservation practices,” Agriculture, Ecosystems & Environment, vol. 128, no. 4, pp. 225–238, 2008.
V. N. Vapnik, Statistical Learning Theory, John Wiley & Sons, Hoboken, NJ, USA, 1998.
C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics), Springer, Berlin, Germany, 2011.
C. Qi, A. Fourie, G. Ma, X. Tang, and X. Du, “Comparative study of hybrid artificial intelligence approaches for predicting hangingwall stability,” Journal of Computing in Civil Engineering, vol. 32, Article ID 04017086, 2018.
A. Malik, Y. Tikhamarine, D. Souag-Gamane, O. Kisi, and Q. B. Pham, “Support vector regression optimized by meta-heuristic algorithms for daily streamflow prediction,” Stochastic Environmental Research and Risk Assessment, vol. 34, no. 11, pp. 1755–1773, 2020.
B. T. Pham, A. Jaafari, I. Prakash, and D. T. Bui, “A novel hybrid intelligent model of support vector machines and the MultiBoost ensemble for landslide susceptibility modeling,” Bulletin of Engineering Geology and the Environment, vol. 78, no. 4, pp. 2865–2886, 2018.
S. A. Naghibi, H. R. Pourghasemi, and K. Abbaspour, “A comparison between ten advanced and soft computing models for groundwater qanat potential assessment in Iran using R and GIS,” Theoretical and Applied Climatology, vol. 131, no. 3-4, pp. 967–984, 2018.
G. M. Hadjidemetriou, P. A. Vela, and S. E. Christodoulou, “Automated pavement patch detection and quantification using support vector machines,” Journal of Computing in Civil Engineering, vol. 32, Article ID 04017073, 2018.
N.-D. Hoang, Q.-L. Nguyen, and D. T. Bui, “Image processing-based classification of asphalt pavement cracks using support vector machine optimized by artificial bee colony,” Journal of Computing in Civil Engineering, vol. 32, Article ID 04018037, 2018.
M. S. Tehrany, B. Pradhan, S. Mansor, and N. Ahmad, “Flood susceptibility assessment using GIS-based support vector machine model with different kernel types,” CATENA, vol. 125, pp. 91–101, 2015.
A. Sina and D. Kaur, “Short term load forecasting model based on kernel-support vector regression with social spider optimization algorithm,” Journal of Electrical Engineering & Technology, vol. 15, no. 1, pp. 393–402, 2020.
H. Nguyen, Y. Choi, X.-N. Bui, and T. Nguyen-Thoi, “Predicting blast-induced ground vibration in open-pit mines using vibration sensors and support vector regression-based optimization algorithms,” Sensors, vol. 20, p. 132, 2020.
N.-D. Hoang and Q.-L. Nguyen, “A novel approach for automatic detection of concrete surface voids using image texture analysis and history-based adaptive differential evolution optimized support vector machine,” Advances in Civil Engineering, vol. 2020, Article ID 4190682, 15 pages, 2020.
Z. Yu, X. Shi, J. Zhou et al., “Prediction of blast-induced rock movement during bench blasting: use of gray wolf optimizer and support vector regression,” Natural Resources Research, vol. 29, no. 2, pp. 843–865, 2020.
A. Jaafari, M. Panahi, B. T. Pham et al., “Meta optimization of an adaptive neuro-fuzzy inference system with grey wolf optimizer and biogeography-based optimization algorithms for spatial prediction of landslide susceptibility,” CATENA, vol. 175, pp. 430–445, 2019.
S. Deng, X. Wang, Y. Zhu, F. Lv, and J. Wang, “Hybrid grey wolf optimization algorithm based support vector machine for groutability prediction of fractured rock mass,” Journal of Computing in Civil Engineering, vol. 33, Article ID 04018065, 2019.
X.-N. Bui, P. Jaroonpattanapong, H. Nguyen, Q.-H. Tran, and N. Q. Long, “A novel hybrid model for predicting blast-induced ground vibration based on k-nearest neighbors and particle swarm optimization,” Scientific Reports, vol. 9, Article ID 13971, 2019.
D. Prayogo and Y. T. T. Susanto, “Optimizing the prediction accuracy of friction capacity of driven piles in cohesive soil using a novel self-tuning least squares support vector machine,” Advances in Civil Engineering, vol. 2018, Article ID 6490169, 9 pages, 2018.
Z. M. Yaseen, H. Faris, and N. Al-Ansari, “Hybridized extreme learning machine model with salp swarm algorithm: a novel predictive model for hydrological application,” Complexity, vol. 2020, Article ID 8206245, 14 pages, 2020.
H. A. Afan, M. F. Allawi, A. El-Shafie et al., “Input attributes optimization using the feasibility of genetic nature inspired algorithm: application of river flow forecasting,” Scientific Reports, vol. 10, p. 4684, 2020.
V.-H. Nhu, P.-T. Thi Ngo, T. D. Pham et al., “A new hybrid firefly-PSO optimized random subspace tree intelligence for torrential rainfall-induced flash flood susceptible mapping,” Remote Sensing, vol. 12, no. 17, p. 2688, 2020.
A. P. Piotrowski, “L-SHADE optimization algorithms with population-wide inertia,” Information Sciences, vol. 468, pp. 117–141, 2018.
R. Tanabe and A. S. Fukunaga, “Improving the search performance of SHADE using linear population size reduction,” in Proceedings of the 2014 IEEE Congress on Evolutionary Computation (CEC), pp. 1658–1665, Beijing, China, July 2014.
J. Zhang and A. C. Sanderson, “JADE: adaptive differential evolution with optional external archive,” IEEE Transactions on Evolutionary Computation, vol. 13, pp. 945–958, 2009.
K. Price, R. M. Storn, and J. A. Lampinen, Differential Evolution - A Practical Approach to Global Optimization, Springer-Verlag Berlin Heidelberg, Berlin, Germany, 2005.
R. Tanabe and A. Fukunaga, “Success-history based parameter adaptation for Differential Evolution,” in Proceedings of the 2013 IEEE Congress on Evolutionary Computation, pp. 71–78, Cancun, Mexico, June 2013.
Accord, “Accord.NET framework,” 2019, http://accord-framework.net/.
G. Montavon, G. Orr, and K.-R. Müller, Neural Networks: Tricks of the Trade, Springer-Verlag Berlin Heidelberg, Berlin, Germany, 2012.
I. Kononenko, Estimating Attributes: Analysis and Extensions of RELIEF, Springer, Berlin, Heidelberg, 1994.
R. C. Prati, G. E. A. P. A. Batista, and D. F. Silva, “Class imbalance revisited: a new experimental setup to assess the performance of treatment methods,” Knowledge and Information Systems, vol. 45, no. 1, pp. 247–270, 2015.
D. Tien Bui, N.-D. Hoang, F. Martínez-Álvarez et al., “A novel deep learning neural network approach for predicting flash flood susceptibility: a case study at a high frequency tropical storm area,” Science of The Total Environment, vol. 701, Article ID 134413, 2020.
T.-D. Nguyen, T.-H. Tran, and N.-D. Hoang, “Prediction of interface yield stress and plastic viscosity of fresh concrete using a hybrid machine learning approach,” Advanced Engineering Informatics, vol. 44, Article ID 101057, 2020.
M. T. Hagan, H. B. Demuth, M. H. Beale, and O. D. Jesús, Neural Network Design, Vikas Publishing House, Noida, India, 2nd edition, 2014.
MathWorks, “Statistics and machine learning toolbox user’s guide: MathWorks Inc.,” 2017, https://www.mathworks.com/help/pdf_doc/stats/stats.pdf.
J. Heaton, Artificial Intelligence for Humans, Volume 3 Deep Learning and Neural Networks, Heaton Research, Inc., St. Louis, MO, USA, 2015.
M. T. Hagan and M. B. Menhaj, “Training feedforward networks with the Marquardt algorithm,” IEEE Transactions on Neural Networks, vol. 5, no. 6, pp. 989–993, 1994.
S. Chen, C. F. N. Cowan, and P. M. Grant, “Orthogonal least squares learning algorithm for radial basis function networks,” IEEE Transactions on Neural Networks, vol. 2, no. 2, pp. 302–309, 1991.
A. R. van Erkel and P. M. T. Pattynama, “Receiver operating characteristic (ROC) analysis: basic principles and applications in radiology,” European Journal of Radiology, vol. 27, no. 2, pp. 88–94, 1998.
W. Chen, Y. Chen, P. Tsangaratos, I. Ilia, and X. Wang, “Combining evolutionary algorithms and machine learning models in landslide susceptibility assessments,” Remote Sensing, vol. 12, no. 23, p. 3854, 2020.
W. Chen, X. Zhao, P. Tsangaratos et al., “Evaluating the usage of tree-based ensemble methods in groundwater spring potential mapping,” Journal of Hydrology, vol. 583, Article ID 124602, 2020.
M. Hollander and D. A. Wolfe, Nonparametric Statistical Methods, John Wiley & Sons, Hoboken, NJ, USA, 1999.
V.-H. Dang, N.-D. Hoang, L.-M.-D. Nguyen, D. T. Bui, and P. Samui, “A novel GIS-based random forest machine algorithm for the spatial prediction of shallow landslide susceptibility,” Forests, vol. 11, no. 1, p. 118, 2020.
B. Everitt, The Cambridge Dictionary of Statistics, Cambridge University Press, Cambridge, UK, 1998.
T. Whelen and P. Siqueira, “Coefficient of variation for use in crop area classification across multiple climates,” International Journal of Applied Earth Observation and Geoinformation, vol. 67, pp. 114–122, 2018.
N.-D. Hoang, “Image processing-based pitting corrosion detection using metaheuristic optimized multilevel image thresholding and machine-learning approaches,” Mathematical Problems in Engineering, vol. 2020, Article ID 6765274, 19 pages, 2020.
M. Kouli, P. Soupios, and F. Vallianatos, “Soil erosion prediction using the revised universal soil loss equation (RUSLE) in a GIS framework, Chania, Northwestern Crete, Greece,” Environmental Geology, vol. 57, no. 3, pp. 483–497, 2009.
I. Gaubi, A. Chaabani, A. Ben Mammou, and M. H. Hamza, “A GIS-based soil erosion prediction using the revised universal soil loss equation (RUSLE) (Lebna watershed, Cap Bon, Tunisia),” Natural Hazards, vol. 86, no. 1, pp. 219–239, 2017.
H. R. Pourghasemi, N. Sadhasivam, N. Kariminejad, and A. L. Collins, “Gully erosion spatial modelling: role of machine learning algorithms in selection of the best controlling factors and modelling process,” Geoscience Frontiers, vol. 11, no. 6, pp. 2207–2219, 2020.
A. Arabameri, W. Chen, M. Loche et al., “Comparison of machine learning models for gully erosion susceptibility mapping,” Geoscience Frontiers, vol. 11, no. 5, pp. 1609–1620, 2020.
O. Ghorbanzadeh, H. Shahabi, F. Mirchooli et al., “Gully erosion susceptibility mapping (GESM) using machine learning methods optimized by the multi-collinearity analysis and K-fold cross-validation,” Geomatics, Natural Hazards and Risk, vol. 11, no. 1, pp. 1653–1678, 2020.

Copyright

Copyright © 2021 Tuan Vu Dinh et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
