#### Abstract

An accurate slope stability prediction model is important for reinforcing slopes before a disaster occurs. The *k*-nearest neighbor (KNN) algorithm, a simple and effective nonparametric machine learning method, is widely applied in classification. In our study, the KNN algorithm is improved to reduce its sample dependence and increase its robustness, and a slope stability prediction model is then proposed based on the improved algorithm. Extensive experimental results show that the proposed model achieves high prediction performance. Moreover, the proposed model is compared with the finite element method, the classical theoretical method for slope stability analysis, which provides an important approach to predicting slope stability in slope engineering. Finally, a shaking table test of a slope model is conducted to evaluate whether the slope is stable, and the experimental results are in good agreement with the predictions of the proposed model, which further demonstrates its effectiveness.

#### 1. Introduction

A landslide is a complex natural phenomenon of slope instability, and it usually causes huge losses of human life and property. It is widely understood that slope stability depends on many parameters, such as cohesion, internal friction angle, rainfall, and earthquakes. At present, numerical analysis is commonly adopted in slope stability analysis. However, numerical analysis alone is not sufficient to analyze slope stability, because a slope is a complex dynamic system affected by many factors [1]. Consequently, the prediction of slope stability is of practical significance. The aim of this work is to propose an approach to slope stability prediction based on machine learning techniques.

Predicting the slope stability is still a challenge. The factors that influence the slope stability are various and complicated, and the main influence factors can be roughly divided into three categories [2], including physical and mechanical properties of the slope soil (unit weight, cohesion, and the angle of internal friction), natural topography (slope height and slope angle), and external factors (rainfall infiltration, groundwater seepage, and earthquake load). It is difficult to predict the slope stability due to various and complicated factors [3]. Lin et al. [4] chose six typical slope parameters—unit weight, cohesion, internal friction angle, slope inclination, slope height, and pore water ratio—to establish the evaluation index system and predicted the slope stability using four supervised learning methods. Zhao et al. [5] chose six input variables—density, friction angle, friction coefficient, slope angle, slope height, and pore water pressure—for the prediction of slope stability using the relevance vector machine method and found that the RVM is a robust tool for the prediction of slope stability. Samui and Kothari [6] chose six input variables—unit weight, cohesion, angle of internal friction, slope angle, height, and pore water pressure coefficient—for the prediction of slope stability using the least square support vector machine method and found that the developed LSSVM is a robust model for slope stability analysis. Hu et al. [7] used the support vector machine method to forecast the slope instance and found that the forecasting results are consistent with the actual states of slope stability. Li and Jiang [8] chose six characteristic parameters—unit weight, cohesion, angle of internal friction, slope angle, height, and pore water pressure coefficient—for the prediction of slope stability using KNN and found that the KNN method was more accurate than the backpropagation neural network algorithm. 
Xiong and Li [9] applied PNN to rock slope stability forecasting, and their case study shows that the analysis results are completely consistent with the actual situation. Consequently, machine learning approaches are increasingly used for slope stability prediction. However, the main external factors influencing slope stability, such as rainfall and earthquakes, which are the main inducing factors of landslides, have been neglected in previous studies.

It is worth noting that the *k*-nearest neighbor (KNN) algorithm [10–12], which is one of the most well-known algorithms in classification recognition, has been proven to be very effective in prediction. To improve prediction accuracy, scholars have carried out studies adjusting its weights [13, 14]. Dudani [15] proposed a weighted voting method named the distance-weighted *k*-nearest neighbor (WKNN) rule, which is the first distance-based vote weighting scheme. Gou et al. [16] presented a dual weighted *k*-nearest neighbor (DWKNN) rule that extended the linear mapping of Dudani. Furthermore, because the KNN algorithm generally requires a preset *k* value and multiple experiments with different *k* values to obtain the best prediction results, some new ideas have been proposed to improve the selection of the *k* value. Zheng [17] proposed a strategy of dynamically setting *k* values. Liu and Zhang [18] proposed a scheme reconstructing points of the test dataset by learning the correlation matrix, in which different *k* values are assigned to different points of test data based on the training data. In addition, Ma et al. [19] proposed a coefficient-weighted KNN classifier and a residual-weighted KNN classifier for making classification decisions on the basis of sparse coefficients in the sparse representation. Gou et al. [20] proposed the two-phase probabilistic collaborative representation-based classification (TPCRC) to enhance the power of pattern discrimination in PCRC. Huang et al. [21, 22] analyzed the factors influencing the rockfall runout distance, predicted the rockfall runout distance based on an improved KNN algorithm, and predicted sand liquefaction using the local mean-based pseudo-nearest neighbor algorithm; however, the accuracy of the prediction still needs to be improved. Previous studies show that the KNN algorithm is widely used in classification prediction, but its performance depends on the number of training samples.
In this study, we improved the KNN algorithm to reduce its sample dependence and improve its robustness, and we built a slope stability prediction model.

#### 2. Establishment of the Prediction Model

##### 2.1. Our Improved KNN Algorithm

KNN, as a simple, effective, and nonparametric prediction method, was first proposed by Cover and Hart to solve text prediction problems [18]. Its principle is to expand the area from the test sample point *x* constantly until *k* training sample points are included. In addition, the test sample point *x* is classified into the category that most frequently appears in the nearest *k* training sample points.
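The basic rule described above can be sketched in a few lines. This is a minimal illustration of plain KNN with Euclidean distance and majority voting, not the improved algorithm of this study; the function name `knn_predict` and the two-feature toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every training sample
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    # Keep the labels of the k closest samples
    dists.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in dists[:k]]
    # The most frequent label among the k neighbors is the prediction
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical, already normalized two-feature samples (not data from this study)
train_x = [(0.0, 0.0), (0.1, 0.2), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
train_y = ["stable", "stable", "unstable", "unstable", "unstable"]
print(knn_predict(train_x, train_y, (0.2, 0.1), k=3))
```

With `k=3`, two of the three nearest neighbors of the query belong to the "stable" class, so the vote goes to that class even though the training set contains more "unstable" samples overall.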

Dudani [15] proposed the distance-weighted *k*-nearest neighbor rule (WKNN). In the WKNN, closer neighbors are weighted more heavily than farther ones by means of a distance-weighted function. The weighted function of the WKNN is shown as follows:
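For reference, Dudani's distance-weighted function [15] is commonly written in the form below. The notation is assumed here and may differ from the paper's own symbols: $d_i$ is the distance from the query to its $i$-th nearest neighbor, so $d_1$ and $d_k$ are the distances to the nearest and the farthest of the $k$ neighbors.

$$
w_i =
\begin{cases}
\dfrac{d_k - d_i}{d_k - d_1}, & d_k \neq d_1, \\[2mm]
1, & d_k = d_1 .
\end{cases}
$$

Under this rule the nearest neighbor receives weight 1 and the $k$-th neighbor receives weight 0, with a linear decrease in between.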

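Accordingly, the prediction result of the query is made by the majority weighted voting as defined in the following:

$$y = \arg\max_{c} \sum_{i=1}^{k} w_i \, \delta\left(c = y_i^{NN}\right), \qquad (3)$$

where $w_i$ is the weight of the *i*-th nearest neighbor, $y_i^{NN}$ is its class label, and $\delta(\cdot)$ equals 1 when its argument is true and 0 otherwise.

DWKNN [16] is based on the WKNN: different weights are given to the *k*-nearest neighbors according to their distances, with closer neighbors having greater weights. The dual distance-weighted function of the DWKNN is defined as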
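For reference, the dual distance-weighted function of Gou et al. [16] is commonly written as follows. The notation is again assumed: $d_i$ is the distance from the query to its $i$-th nearest neighbor, with $d_1 \le d_i \le d_k$.

$$
w_i =
\begin{cases}
\dfrac{d_k - d_i}{d_k - d_1} \cdot \dfrac{d_k + d_1}{d_k + d_i}, & d_k \neq d_1, \\[2mm]
1, & d_k = d_1 .
\end{cases}
$$

The first factor is the linear mapping of Dudani; the second factor is at most 1 and further reduces the weights of the more distant neighbors.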

Then, we label the query by the majority weighted vote of *k*-nearest neighbors, the same as
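The two baseline weighting rules and the weighted vote can be sketched as below. This is an illustration under stated assumptions: the weight functions follow the commonly cited forms of the WKNN [15] and DWKNN [16] rules, the function names are hypothetical, and the improved weighting of this study is not reproduced here.

```python
import math
from collections import defaultdict


def dudani_weights(dists):
    """WKNN weights for neighbor distances sorted in ascending order."""
    d1, dk = dists[0], dists[-1]
    if dk == d1:
        return [1.0] * len(dists)
    return [(dk - d) / (dk - d1) for d in dists]


def dual_weights(dists):
    """DWKNN weights: the WKNN weight scaled by (dk + d1) / (dk + d)."""
    d1, dk = dists[0], dists[-1]
    if dk == d1:
        return [1.0] * len(dists)
    return [(dk - d) / (dk - d1) * (dk + d1) / (dk + d) for d in dists]


def weighted_knn_predict(train_x, train_y, query, k, weight_fn):
    """Label `query` by weighted majority vote of its k nearest neighbors."""
    # k nearest (distance, label) pairs in ascending order of distance
    neighbors = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    weights = weight_fn([d for d, _ in neighbors])
    # Sum the weights per class and return the class with the largest total
    votes = defaultdict(float)
    for (_, label), w in zip(neighbors, weights):
        votes[label] += w
    return max(votes, key=votes.get)
```

Passing the weight function as a parameter keeps the voting step identical for both rules, so any other weighting scheme can be compared without changing the rest of the code.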

Through comparative study, we find that the method in equation (5) for improvement has better robustness and less sample dependence. Thus, in this study, we used this method to predict the slope stability:

Accordingly, we classify the query point *x* into class *c* by majority weighted voting of its neighbors as shown in the following:

##### 2.2. Establishment of the Prediction Model

The prediction model of the slope stability based on our improved algorithm can be expressed as follows.

Let *X* denote the set of slope stability samples, and suppose $X = \{x_1, x_2, \ldots, x_N\}$, where $x_i \in \mathbb{R}^m$ represents the feature vector of the *i*-th slope stability sample, *N* is the total number of samples, and *m* is the feature dimension. In addition, let $c_i$ represent the slope stability level of the *i*-th sample. Therefore, the sample set of the prediction model is shown as follows:

Given the unknown sample , our proposed slope stability prediction model based on our improved KNN algorithm can be expressed aswhere is the nearest neighbors of the unknown sample *x* in class . Hence, the unknown sample *x* is classified into class that has the closest neighbor among all classes.

#### 3. Prediction Model of the Slope Stability Based on Our Improved KNN Algorithm

The prediction model is established using the samples in [3]. Of these, 50 cases are used for training and 14 cases are used for testing.

##### 3.1. Data Information and Predictors

The slope stability prediction is performed to find the nonlinear relationship between the influencing factors and the slope stability. The main influencing factors can be roughly divided into three categories [3], including physical and mechanical properties of the slope soil (unit weight, cohesion, and the angle of internal friction), natural topography of a slope (slope height and slope angle), and external factors (rainfall infiltration, groundwater seepage, and earthquake load). In our study, we chose the representative factors—unit weight, cohesion, internal friction angle, slope height, slope angle, groundwater level, earthquake intensity, and rainfall intensity—as the influencing factors.

By comparing the slope codes of earthquake-prone countries and regions (China, Japan, European countries, and the United States), the methods used in different specifications to evaluate the seismic stability of slopes were identified, as shown in Table 1.

By comparing slope codes in different countries, we used safety factor and permanent displacement to evaluate the slope stability. According to the safety factor of the slope, Xiong [23] classified the slope stability into five grades which are particular instability, instability, potential instability, basic stability, and stability. The five grades are labeled as I, II, III, IV, and V, respectively, as depicted in Table 2.

##### 3.2. Normalization

Since the ranges of the predictors differ significantly and the test results might be dominated by the values of a few predictors, the predictors are preprocessed using normalization [24]. We compute the upper and lower bounds of each predictor, and the normalization is represented as

$$x_{\max} = \max(x), \qquad (9)$$

$$x_{\min} = \min(x), \qquad (10)$$

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \qquad (11)$$

where $x$ is each predictor.

Accordingly, the value of each predictor is normalized to between 0 and 1 based on equations (9)–(11).
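A minimal sketch of this preprocessing step is given below, assuming standard min-max scaling applied to each predictor separately. The function name and the sample values are hypothetical, and the handling of a constant predictor is an assumption that the text does not specify.

```python
def min_max_normalize(samples):
    """Scale each predictor (column) of `samples` to the range [0, 1]."""
    columns = list(zip(*samples))
    lows = [min(col) for col in columns]    # lower bound of each predictor
    highs = [max(col) for col in columns]   # upper bound of each predictor
    normalized = []
    for row in samples:
        normalized.append([
            # Assumption: a constant predictor (hi == lo) is mapped to 0.0
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return normalized, lows, highs


# Hypothetical rows of (unit weight, cohesion); not data from this study
rows = [[18.0, 10.0], [20.0, 30.0], [22.0, 20.0]]
scaled, lows, highs = min_max_normalize(rows)
print(scaled)
```

The bounds are returned together with the scaled data because a test sample should be scaled with the bounds computed from the training samples, so that both sets share one scale.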

##### 3.3. Criteria for Our Prediction Model Performance

The accuracy, defined as the percentage of all test samples that are classified correctly, is used to evaluate the prediction performance for slope stability:

$$\text{Accuracy} = \frac{N_c}{N_t} \times 100\%,$$

where $N_t$ denotes the total number of test samples and $N_c$ is the number of test samples that are predicted correctly.
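The criterion is simple enough to check by hand, and a short sketch makes the arithmetic explicit. The function name and the label lists are illustrative only. As a worked example, 15 correct predictions out of 16 samples give 93.75%, and 13 correct out of 14 give about 92.86%.

```python
def accuracy(predicted, actual):
    """Percentage of samples whose predicted grade equals the actual grade."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Illustrative label lists (not the paper's tables): one mismatch in each
acc_16 = accuracy(["III"] * 15 + ["II"], ["III"] * 16)
acc_14 = accuracy(["IV"] * 13 + ["V"], ["IV"] * 14)
print(acc_16, round(acc_14, 2))
```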

##### 3.4. Procedure Algorithm of Our Proposed Prediction Model

In this study, we improve the KNN algorithm to further overcome the influence of the neighborhood size *k*. Let $T = \{x_n \in \mathbb{R}^m\}_{n=1}^{N}$ denote a training set with *M* classes $c_1, c_2, \ldots, c_M$. The training samples of each class are $T_{c_i} = \{x_{ij} \in \mathbb{R}^m\}_{j=1}^{N_i}$, where $T_{c_i}$ is the subset of the training samples $T$ belonging to class $c_i$, $\mathbb{R}^m$ is the *m*-dimensional feature space, and $x_{ij}$ are the training samples. In our improved KNN algorithm, the class label of a query point $x$ is computed as shown in the following steps.

To find the *k* nearest neighbors of the unknown query point $x$ in the set $T_{c_i}$, let $T_{ki}^{NN}(x) = \{x_{ij}^{NN} \in \mathbb{R}^m\}_{j=1}^{k}$, where $T_{ki}^{NN}(x)$ is the set of *k* nearest neighbors of $x$. The *k*-nearest neighbors $x_{i1}^{NN}, x_{i2}^{NN}, \ldots, x_{ik}^{NN}$ are sorted in ascending order according to their Euclidean distance to $x$. By assigning different weights to the nearest neighbors $x_{ij}^{NN}$, the weight of the *j*-th nearest neighbor is defined as

Accordingly, we classify the query point $x$ into class $c$ by majority weighted voting of its neighbors as shown in the following:

##### 3.5. Slope Stability Prediction

In this section, our proposed prediction model is trained on 50 typical slope stability cases and tested on 14 typical slope stability cases. The neighborhood size *k* ranges from 1 to 7 with an interval of 1, which is inspired by [25]. The 50 training cases are shown in Table 3, and the 14 test cases are shown in Table 4. The prediction experiment is implemented in Java using Eclipse 3.7.2, and the hardware environment is an Intel Core i7-6700 CPU at 3.40 GHz.
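The experimental protocol (train once, then evaluate the test accuracy for each *k* from 1 to 7) can be sketched as follows. This is only an outline under stated assumptions: plain KNN stands in for the improved algorithm, the function names are hypothetical, and the toy samples are not the cases of Tables 3 and 4.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Plain KNN stand-in: majority vote among the k nearest samples."""
    ranked = sorted(zip(train_x, train_y), key=lambda s: math.dist(s[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def sweep_k(train_x, train_y, test_x, test_y, k_values):
    """Return {k: test accuracy in percent} for every k in `k_values`."""
    results = {}
    for k in k_values:
        hits = sum(
            knn_predict(train_x, train_y, q, k) == y
            for q, y in zip(test_x, test_y)
        )
        results[k] = 100.0 * hits / len(test_y)
    return results


# Hypothetical normalized samples with stability grades as labels
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.3, 0.25),
           (0.8, 0.9), (0.9, 0.8), (0.85, 0.7), (0.7, 0.75)]
train_y = ["IV"] * 4 + ["II"] * 4
test_x = [(0.2, 0.2), (0.8, 0.8)]
test_y = ["IV", "II"]

results = sweep_k(train_x, train_y, test_x, test_y, range(1, 8))
for k, acc in results.items():
    print(k, acc)
```

Plotting the resulting accuracy against *k* for each classifier is one way to compare how sensitive each method is to the neighborhood size.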

As shown in Table 4, the prediction results of the proposed prediction model are in good agreement with the actual results, with an accuracy of up to 92.85%. This illustrates that our proposed prediction model is feasible for predicting slope stability and could be used to evaluate slope stability before the design and construction of slope engineering.

Next, the prediction performance of our proposed prediction model is compared with other prediction models based on the KNN algorithm [10], WKNN algorithm [15], and DWKNN algorithm [16]. The following prediction experiments will show whether our proposed prediction model will achieve better prediction performance. The comparison results between different prediction models are shown in Figures 2 and 3.

As can be seen in Figures 2 and 3, the prediction accuracy of our proposed prediction model is somewhat better than that of the prediction models based on the KNN, WKNN, and DWKNN algorithms in almost all of the test cases, which shows that our proposed prediction approach performs better than the other approaches as the neighborhood size *k* increases. The accuracy of our proposed prediction model is highest when the neighborhood size *k* is 4, where it reaches 92.85%. This result suggests that our proposed prediction model based on the improved KNN algorithm is robust to the choice of the neighborhood size *k* and performs well in predicting slope stability.

#### 4. Engineering Application of Our Proposed Prediction Model

To further assess the performance of our proposed prediction approach based on the improved KNN algorithm in engineering applications, we also conducted experiments to evaluate its prediction performance for slope stability along the Sichuan-Tibet railway in China and compared the prediction results with the finite element method and shaking table test results. The intensities of the historical earthquakes were within a radius of 500 km around.

Our research group drilled lots of boreholes in our survey region along the Sichuan-Tibet railway. On the basis of mass borehole data, the values of the influencing factors—unit weight, cohesion, internal friction angle, and groundwater level—are obtained, and we use our proposed prediction approach based on the improved KNN algorithm to predict the stability of the slope along the Sichuan-Tibet railway. The seismic activity around the Sichuan-Tibet railway is relatively frequent.

##### 4.1. Slope Stability Prediction of the Sichuan-Tibet Railway

Cutting slopes along the Sichuan-Tibet railway in China are chosen as the research object. We simplified the slope shape, and the simplified slope models are established with finite element software MIDAS GTS NX. Mohr–Coulomb elastoplastic model is used to model the stress-strain behavior of the soil. And the grid size of the finite element model is 0.5 m, as shown in Figure 4. Furthermore, the bottom is set as the fixed boundary, and the left and right are set as viscoelastic artificial boundaries.

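In the numerical simulation model, the mass damping coefficient and the stiffness damping coefficient are fixed as 0.2 and 0.0019, respectively. The damping matrix of the numerical simulation model is calculated by the Rayleigh damping formula, as illustrated in the following:

$$[C] = \alpha [M] + \beta [K],$$

where $[M]$ and $[K]$ are the mass and stiffness matrices, $\alpha$ denotes the mass damping coefficient, and $\beta$ is the stiffness damping coefficient. Moreover, the mass damping coefficient and the stiffness damping coefficient are computed by

$$\alpha = \frac{2\,\omega_1 \omega_2 \left(\xi_1 \omega_2 - \xi_2 \omega_1\right)}{\omega_2^2 - \omega_1^2}, \qquad \beta = \frac{2\left(\xi_2 \omega_2 - \xi_1 \omega_1\right)}{\omega_2^2 - \omega_1^2},$$

where $\omega_1$ is the natural (circular) frequency of the first mode, $\omega_2$ is the natural (circular) frequency of the second mode, and $\xi_1$ and $\xi_2$ are conventional damping ratios ranging from 2% to 7%.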
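The relation between the two modal damping ratios and the Rayleigh coefficients can be checked numerically. The sketch below assumes the standard two-mode Rayleigh formulation, in which the damping ratio at circular frequency $\omega$ is $\alpha/(2\omega) + \beta\omega/2$. The function names and the input frequencies are hypothetical, not values from this study.

```python
def rayleigh_coefficients(omega1, omega2, xi1, xi2):
    """Mass (alpha) and stiffness (beta) coefficients from two modes.

    omega1, omega2: circular natural frequencies in rad/s (omega1 != omega2)
    xi1, xi2: damping ratios targeted at those two frequencies
    """
    if omega1 <= 0 or omega2 <= 0 or omega1 == omega2:
        raise ValueError("frequencies must be positive and distinct")
    denom = omega2 ** 2 - omega1 ** 2
    alpha = 2.0 * omega1 * omega2 * (xi1 * omega2 - xi2 * omega1) / denom
    beta = 2.0 * (xi2 * omega2 - xi1 * omega1) / denom
    return alpha, beta


def modal_damping_ratio(alpha, beta, omega):
    """Damping ratio implied by Rayleigh damping at circular frequency omega."""
    return alpha / (2.0 * omega) + beta * omega / 2.0


# Hypothetical modes at 10 and 30 rad/s, both with a 5% damping ratio
alpha, beta = rayleigh_coefficients(10.0, 30.0, 0.05, 0.05)
print(alpha, beta)
print(modal_damping_ratio(alpha, beta, 10.0), modal_damping_ratio(alpha, beta, 30.0))
```

Evaluating `modal_damping_ratio` at frequencies between and beyond the two anchor modes shows how the damping varies across the frequency range, which is the usual reason for choosing the first two modes as anchors.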

The engineering geological conditions of the slope along the Sichuan-Tibet railway were investigated, and the influencing factor values were determined based on the indoor experiment. The value ranges of the influencing factors are shown in Table 5.

First, we should know the influence laws of all the factors on slope stability, and we compute the safety factors of the slope under different influencing factors, as shown in Figure 5.

As shown in Figure 5, the safety factors increase with the increasing of the cohesion and internal friction angle, while the safety factors decrease with the increasing of other factors. Also, we found that the slope stability is significantly affected by slope angle, slope height, cohesion, internal friction angle, groundwater level, and peak acceleration.

In order to more directly demonstrate the influence of different influencing factors on slope stability, different influencing factors are normalized. And the safety factors under the normalized influencing factors are shown in Figure 6.

As shown in Figure 6, slope stability is sensitive to all of the factors selected in our study, which supports the choice of influencing factors. Also, we found that the slope is least stable under the influence of the peak acceleration, which shows that the impact of potential future earthquakes on the slope cannot be ignored. Consequently, we could use our proposed prediction model to predict the slope stability under potential future earthquakes, and reinforcement measures can be taken according to the predicted results, which is important and useful for solving realistic engineering problems.

Based on the nonlinear finite element method and the strength reduction methods, the slope damage contour can be obtained under different slope stable-states. We could more intuitively determine the slope failure degree by the contour. Grade V indicates that the slope is in a stable state; thus, we only plot the slope damage contour for slope stability grades I, II, III, and IV. In this section, the groundwater levels vary, and other factor values remain constant. The slope damage contours for different slope stability grades are shown in Figure 7.

As shown in Figure 7, the slope shows different stability states when the slope is at different groundwater levels. The slope stability degree could be determined through the plastic zone distribution. And we could more intuitively determine the slope stability grades by the finite element method. Thus, we could verify the accuracy of our proposed prediction model by comparing the finite element results.
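The parameter reduction at the core of the strength reduction method can be sketched as follows. In the usual formulation, the cohesion and the tangent of the internal friction angle are divided by a trial factor, and the factor at which the finite element model reaches failure is taken as the safety factor. Only the reduction step is shown; the finite element analysis itself (performed in MIDAS GTS NX in this study) is not reproduced, and the function name and input values are hypothetical.

```python
import math


def reduce_strength(cohesion, friction_angle_deg, reduction_factor):
    """Return (reduced cohesion, reduced friction angle in degrees)."""
    if reduction_factor <= 0:
        raise ValueError("reduction factor must be positive")
    # Cohesion is divided directly by the trial factor
    reduced_c = cohesion / reduction_factor
    # The friction angle is reduced through its tangent, not linearly
    tan_phi = math.tan(math.radians(friction_angle_deg))
    reduced_phi = math.degrees(math.atan(tan_phi / reduction_factor))
    return reduced_c, reduced_phi


# Hypothetical soil: cohesion 20 kPa, friction angle 30 degrees
for factor in (1.0, 1.2, 1.5, 2.0):
    c_r, phi_r = reduce_strength(20.0, 30.0, factor)
    print(factor, round(c_r, 2), round(phi_r, 2))
```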

To assess the stability of the slope along the Sichuan-Tibet railway, we chose 16 slope cases to predict the slope stability using our proposed prediction model and the finite element method, respectively. And the predicted results obtained by our proposed prediction model and the finite element method are compared as shown in Table 6.

As shown in Table 6, the predictions of our proposed prediction approach based on the improved KNN algorithm agree closely with the results of the finite element method. The prediction accuracy reaches 93.75%, which demonstrates that our proposed prediction model could be used for slope stability discrimination in engineering geological hazard safety assessment.

##### 4.2. Comparison of Our Prediction Model Results with Shaking Table Test Results

In this section, we mainly examine the effect of the earthquake on slope stability, while the other influencing factors are held constant. Thus, we conduct a shaking table test, which can reproduce the failure process of the slope under real earthquakes, to evaluate the prediction performance of our proposed prediction model. Figure 8 shows the shaking table test equipment used to model the slope failure process under seismic excitation.

As can be seen in Figure 8, the main technical indicators of shaking table test equipment include rated working frequency (40 Hz), the maximum acceleration (20 m/s^{2}), the maximum test load (5000 kg), and dimensions of the shaking table (1.5 m × 1.5 m).

The size of the test model is 1.96 m × 0.96 m × 1.2 m. The slope rate is 1 : 1.5. To keep the sandy soil uniform, the sandy slope is repeatedly stirred. The sponge whose thickness is 20 mm is used to reduce the reflection of seismic waves at the border of the slope. The test model is shown in Figure 9.

The dynamic pore water pressure change of the slope under the earthquake is the main cause of the slope failure. In order to obtain the dynamic pore water pressure, many sensors are deployed at the slope toe. The layout of monitoring points of the test model is shown in Figure 10.

The far-field seismic wave (type I: T1-II-1) and the near-field seismic wave (type II: T2-II-1) are applied to the slope stability analysis. The parameters of the earthquake motions are shown in Table 7, and the acceleration-time histories of the seismic waves are shown in Figure 11. According to the code for seismic design of railway engineering (GB50111-2006) [26] of China, the peak accelerations of the seismic waves are adjusted to 4 degrees, 5 degrees, 6 degrees, and 7 degrees.

In the laboratory test, the effect of different influencing factors on slope stability is investigated. We could determine the damage degree of the slope with different stability levels through the shaking table test. The values of the influencing factors in the shaking table test are shown in Table 8.

The scaling law between our test model and the actual projects follows the Buckingham Pi theorem [27], and the proportional relation for the similarity ratio is developed by Jiang et al. [28]. Poisson’s ratio of the soil in the test is 0.35, the coefficient of the lateral pressure is 0.54, and the dimensionless index . Other similarity coefficients [29] based on the similarity principle are shown in Table 9.

is the horizontal shear strength, and is computed bywhere is the normal pressure stress (i.e., the geostatic stress caused by burial depth), is the internal friction angle, is cohesion, and is the lateral pressure coefficient.

In our experiment, similar material was developed based on the slope material of the Sichuan-Tibet railway. The similarity coefficients simulating the slope are shown in Table 10.

Simulation of underground water level determines the accuracy of test results. Test program of the simulation of groundwater level is shown in Figure 12.

As shown in Figure 12, water is injected at the left side of the slope. The height of the groundwater level on the right side is always controlled at the height of the slope toe by turning on the tap. The stable seepage field inside the slope will be formed after a long time seepage of water. Then, the seismic excitation is input into the model.

Before prediction, we analyze the development laws of the dynamic pore water pressure inside the slope, which is a more intuitive indicator of slope failure. The dynamic pore water pressure of the slope at different positions is calculated at the groundwater levels of 14 m and 20 m (before scaling using the equation in Table 10), as shown in Figures 13 and 14.

As shown in Figures 13 and 14, the dynamic pore water pressure rises sharply within a short time under both near-field and far-field earthquakes. The rising pore water pressure does not have time to dissipate and fluctuates strongly. In particular, the dynamic pore water pressure at monitoring point *G* is greater than at the other monitoring points. The dynamic pore water pressure at the slope toe is more strongly influenced by the seepage force, which shows that the slope toe is the position most prone to plastic damage. The generation of dynamic pore water pressure under the earthquake could decrease the strength of the soil slope, change the effective stress acting on the soil skeleton, limit its deformation, and cause the destruction of the slope. Shear sliding occurs at the slope toe under the action of the dynamic pore water pressure; thus, the slope toe should be treated as the key protection position in actual engineering.

Grade V of the slope stability indicates that the slope is in a stable state; thus, we determined the level of the slope damage for I, II, III, and IV grades, as shown in Figure 15.

As can be seen in Figure 15, the shaking table test allows us to determine more directly the failure degree of the slope at different stability grades. The slope slides as the earthquake intensity increases, and the failure begins at the slope toe. The cracks in the slope spread gradually as the slope stability grade changes from IV to I. Meanwhile, obvious settlement occurs at the top of the slope. The sliding surface is approximately circular in shape. The slope is in a state of particular instability when the slope stability grade is I; thus, the shaking table test could accurately determine the slope stability grades.

Then, we compare the shaking table test results with the prediction results of our proposed prediction model, and the comparison results are shown in Table 11.

As can be seen in Table 11, the predictions of our proposed prediction model based on the improved KNN algorithm agree well with the shaking table test results. The prediction accuracy reaches 92.30%, which demonstrates that our proposed prediction model could be used for slope stability prediction before major project construction near the slope.

#### 5. Conclusions

(1) We improved the KNN algorithm and established a prediction model of slope stability. The performance of our proposed prediction model was evaluated by conducting extensive experiments on slope stability grade prediction, and the experimental results demonstrate the effectiveness of the model.

(2) We used our proposed prediction model to evaluate the stability of actual slope engineering, and the evaluation results obtained using the finite element method match well with the predicted results of our proposed prediction model, which shows that our proposed prediction approach is an effective method for predicting slope stability.

(3) The progressive failure process of the slope was reproduced by the shaking table test, and the failure degree of the slope at different stability grades was determined. The predictions of our proposed prediction model are consistent with the failure degrees observed in the experiment, which further demonstrates the effectiveness of our proposed prediction model of slope stability.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was financially supported by the research grant from the Institute of Crustal Dynamics, China Earthquake Administration (no. ZDJ2019-10), the National Natural Science Foundation of China (Grant no. 51708516), the Young Elite Scientists Sponsorship Program by CAST (2018QNRC001), the Hebei Province Natural Science Fund (no. E2019210126), and the Fundamental Research Funds for the Central Universities (FRF-BD-19-004A).