Abstract

A new response surface method (RSM) for slope reliability analysis was proposed based on Gaussian process (GP) machine learning technology. The method involves the approximation of limit state function by the trained GP model and estimation of failure probability using the first-order reliability method (FORM). A small amount of training samples were firstly built by the limited equilibrium method for training the GP model. Then, the implicit limit state function of slope was approximated by the trained GP model. Thus, the implicit limit state function and its derivatives for slope stability analysis were approximated by the GP model with the explicit formulation. Furthermore, an iterative algorithm was presented to improve the precision of approximation of the limit state function at the region near the design point which contributes significantly to the failure probability. Results of four case studies including one nonslope and three slope problems indicate that the proposed method is more efficient to achieve reasonable accuracy for slope reliability analysis than the traditional RSM.

1. Introduction

Slope stability analysis is an important issue in geotechnical engineering. The application of probabilistic concepts to the slope stability analysis has been pursued over the past decades. On real slope engineering, slope reliability can be obtained via an implicit limit state function with a numerical procedure employing the limit equilibrium methods [1, 2] or finite element method [3, 4]. The classical reliability analysis methods including first-order reliability method (FORM) and second-order reliability method (SORM) [58] that require limit state function gradients with respect to the basic variables may encounter great difficulties when direct or analytical differential is not possible. The Monte Carlo simulation (MCS) can handle the implicit performance function. However, the MCS was notorious for its unendurable computational cost.

The response surface method (RSM), which usually employs a polynomial function to approximate the unknown implicit performance function, has been used to the slope reliability analysis with the implicit performance function [4, 911]. However, the RSM is a rigidly nonadaptive regression technique in the statistical learning perspective [12]. Consequently, the RSM will become computationally impractical for problems involving many random variables, particularly when the involved random variables are mixed or statistically dependent. In addition, there is no guarantee that the fitted surface is a sufficiently close fit in all regions of interest.

In recent decades, machine learning methods such as artificial neural networks (ANN) [1317], radial basis function network (RBFN) [18], and support vector machine (SVM) [19, 20] have been widely used for the reliability analysis. These methods can describe nonlinear and complex interactions among variables in a system and are therefore more suitable for solving reliability problems with highly nonlinear implicit performance. However, when these machine learning methods are used, some difficulties will be encountered. For ANN, it is always difficult to obtain an appropriate network topology and the optimum hyperparameters. In addition, the ANN has some inherent drawbacks such as limitation in solving the small sample problem [21]. The SVM cannot avoid the blindness which is the common phenomenon in the artificial choice of the kernel function and its hyperparameter [21]. Hence, it is desirable to develop an efficient framework for slope reliability analysis.

The Gaussian process (GP), based on statistical learning theory and Bayesian theory, is a newly developed machine learning technology [22]. In recent years, the GP has gained considerable attention in the machine learning community and has been successfully applied in solving nonlinear, small samples, and high-dimensional problems [2328]. The GP has many advantages, such as it does not require a predefined structure, can approximate arbitrary function landscapes including discontinuities and multimodality, has meaningful hyperparameters, and includes a theoretical framework for obtaining the optimum hyperparameter self-adaptively. Furthermore, GP provides an uncertainty measure in the form of a standard deviation for predicted function values.

In the present study, a new GP-based RSM for slope reliability analysis was put forward by combining the GP with the FORM. This paper is constructed as follows. The used machine learning technique, namely, Gaussian process regression, is briefly described in Section 2. The GP-based RSM method is presented in detail in Section 3. Finally, four case studies are used to verify the flexibility and efficiency of the proposed method.

2. Gaussian Process

A Gaussian process is a collection of random variables, any finite set of which has a joint Gaussian distribution. A Gaussian process can be defined by its mean function and the covariance function as

There is a training set of observations, , where is an input vector with n dimensions and is a scalar output or target. By using Bayesian forecasting, the distribution of output given a test input x and a set of training points can be calculated. Using the Bayesian rule, the posterior distribution for the Gaussian process outputs can be obtained. By conditioning on the observed targets in the training set, the predictive distribution is Gaussian:where the mean and variance are defined bywhere the compact forms of the notation setting for the matrix of the covariance functions are , , is the unknown variance of the Gaussian noise.

Assuming , equation (3) can be seen as a linear combination of m covariance functions, each one centered on a training point, by writing

The Gaussian process procedure can handle interesting models by simply using a covariance function with an exponential term:where denotes the characteristic length scale, denotes the signal variance, and denotes a Kronecker delta. This function expresses the idea the cases with nearby inputs will have highly correlated outputs. The GP employs a set of hyperparameters including the length scale , the signal variance , and the noise variance . The hyperparameters can be optimized based on the log-likelihood framework:

The log-likelihood and its derivative with respect to can be described aswhere .

Hyperparameters are initialized to random values in a reasonable range and then optimized using an iterative, such as the conjugate gradient.

More detailed theory of the GP can be found in the literature [24].

3. GP-Based RSM

3.1. Explicit Formation of Reliability Index Using GP

The performance function of a slope can be built as follows:where denotes the random variables with mean and standard deviation . indicates a stable slope, while indicates a failed slope. indicates that the slope is hovering between stable and unstable. is the safety factor.

Assuming that there is a point located on the limit state surface, which is called as the design point. Here, n is the dimension of . is generally a nonlinear function. However, it can be linearized at by neglecting the second- or upper-order term:

Assuming is statistically uncorrelated, the mean value and standard deviation of are defined by

Reliability index can be calculated by

According to the FORM, a tangent hyperplane is used to fit the limit state surface at the design point. Therefore, the most important step in the method is to obtain the design point. Reliability index β is defined as the distance from the origin to the design point. Hence, the coordinate of the design point is denoted aswhere

According to equation (5), the performance function can be approximated by the GP model as

The first-order partial derivatives of the approximate function can be given bywhere .

Substituting equations (15) and (16) into (17), the reliability index β is given as

According to equations (15) and (16), (14) can be rewritten as

Substituting equations (17) and (18) into (13), one can obtain the coordinate of design point x using GP approximation.

The failure probability can be calculated aswhere denotes the cumulative distribution function of the standard normal variable.

3.2. Procedure of GP-Based RSM

The main difference between the GP-based RSM and the traditional RSM is that the former employs the GP to approximate the performance function and its first-order partial derivatives simultaneously. Moreover, from the viewpoint of machine learning theory, more knowledge means a better effect on the result of training or prediction of the machine learning model. Thus, the training samples, namely, knowledge set in the machine learning domain, is generated as a dynamic updating one to improve the performance of the GP model in the procedure of GP-based RSM.

As shown in Figure 1, reliability analysis using the GP-based RSM is described in detail as follows:Step 1. Assume initial values of the design point . Usually, mean values of the random variables can be selected as the coordinates of the design point .Step 2. According to the experimental plan of the RSM developed by Bucher and Bourgund [29]; the training samples are generated according to the intersection of the axis and coordinates of and , where μi and σi are the mean and standard deviation of the random variable xi. f, which is an integer, ranges from 1 to 4 and is usually set to 1. Then, values of the performance function containing and are obtained using the slope analysis code. Thus, the training samples are established, and the number of training samples is 2n × s + 1. s is the number of selected f.Step 3. If the training target y has a nonzero mean, to improve the efficiency of the training process of the model, y is adjusted by the mean of yStep 4. Train the GP model by learning the training samples . Then, we make predictions of performance function on the design point and obtain their predictions according to equation (5). Recall that the training targets were centered, thus we must adjust the predictions by offset:Step 5. Compute the reliability index of the kth iteration step β(k) using equation (17).Step 6. Compute the values of the new design point according to equation (21).Step 7. Check the convergence criterion for (ε is 0.001 in present paper). To improve constantly the reconstructing precision at the important region where the failure probability is contributed significantly, the new design point and its value of the real performance function are taken as a new sample added into the old training sample if the convergence criteria are not satisfied. The number of training samples increases to 2n × s + 1 + k. Go to step 3 and repeat steps 3–6 until convergence is satisfied. If the convergence criterion is satisfied, go to step 8.Step 8. Calculate the probability of failure pf using equation (19).

To apply the GP-Based RSM to slope reliability analysis, a general program package is developed using MATLAB.

4. Case Studies

4.1. Case 1: A Highly Nonlinear Performance Function

The performance function of the first example is defined as [30]where and are independent and obey standard normal distribution with zero mean and unit standard deviation.

The training samples and predicted values of the performance function are listed in Table 1. There are 5 training samples generated using the proposed method, where f is 1. The optimum hyperparameters of trained GP θ = (3.5197, 5.4426, 4.8151, −9.6465). The small prediction error from the GP indicates that the training is successful. If a vector of the basic random variables is input, the GP can generate an adequate accurate value of the corresponding performance function. Consequently, the GP can represent the true performance function well.

The GP approximation of the performance function based on the initial training samples iswhere the covariance function denotes the column of matrix K and the values of vector α and matrix K based on the initial training samples are as follows:

Table 2 lists both calculation processes of the design point using the FORM and the proposed method. It clearly illustrates that the real performance and its first derivatives are both fitted very well by the GP model.

The failure probability and reliability index obtained by various methods in preexisting studies are listed in Table 3. The exact reliability index is 2.9044 obtained by the direct MCS [30]. The reliability index yielded by the FORM, RSM, and GP-based RSM are very close to that by the MCS. The GP-based RSM needs 7 function evaluations, while the RSM and direct MCS require 19 and 106 function evaluations, respectively.

Figure 2 shows the curves of approximated limit state function from the GP based on final training samples composed of 5 initial samples and 2 new samples and that of real limit state function. As can be seen from Figure 3, the GP can capture the whole shape of the real limit state function, and the design point is found quickly by the use of the GP-based RSM. This shows that the dynamic update of the training samples can significantly improve the efficiency of searching the design point in the proposed method.

4.2. Case 2: A Two-Layered Slope

As illustrated in Figure 3, the second example considers a two-layered slope which has been studied by Xu and Low [4]; Cho [14]; Hassan and Wolff [31]; Bhattacharya et al. [32]; and Chowdhury and Rao [33]. Property parameters of the soil related to the slope, including the friction angle and cohesion, are taken as random variables, as listed in Table 4. These random variables are assumed to obey normal distribution variables in the numerical analysis. Unit weight of the soil is assumed to be 19 kN/m3. By numerical analysis using SLIDE [24], the factor of safety of circular slip surface was calculated using the Bishop and Spencer methods, respectively.

There are 7 training samples (f = 1) established for training the GP model as shown in Table 5. The safety factors of the other 10 samples created randomly were predicted using the trained GP model. The safety factors calculated by the Bishop code and Spencer method and values predicted by the GP are shown in Figure 4. The GP model can reflect the relationship between the random variables and the safety factor of the slope. Then, the performance functions in the explicit form using the GP model were established to approximate the implicit real performance functions.

Table 6 listed the results of the probabilistic analyses obtained from the previous literature for the same problem, while Tables 7 and 8 present the results obtained by the GP-based RSM. It can be observed from Tables 7 and 8, the result of reliability index by the early study shows that the exact solution is about 2.2 [33], and the reliability index estimation by the GP-based RSM with 7 initial samples is 2.249 from the Bishop method and 2.269 from the Spencer method, respectively. The results are quite in agreement with each other.

Furthermore, Tables 7 and 8 indicate that the change of the numbers of initial training samples slightly influences the accuracy of estimated values of probability of failure. It is worth noting that a larger number of initial training samples may mean higher robust and computation cost of the proposed method at the same time. The proposed method needs 12 function evaluations, while the ANN [14], MCS, first-order HDMR-based response surface method, and second-order HDMR-based response surface method [33] requires 25, 106, 13, and 61 number of original stability analysis, respectively. The larger the number of the function evaluations, the more the time needed to finish the computation is, the less efficient the used method is. Consequently, the GP-based RSM method is relatively efficient.

The slip surfaces from the Bishop method and Spencer method at the mean point and at the design point are shown in Figures 5 and 6, respectively. The slip surfaces at the mean point are very different from that at the design point. The values of the factor of safety Fs at the design point are both very close to 1; in another words, , which mean the slope just reaches the limited equilibrium status.

4.3. Case 3: A Three-Layered Slope

A complex slope with three different soil layers is studied. The cross section of the nonhomogeneous slope with a height of 10 m and an inclination of 2 : 1 is presented in Figure 7. The unit weight γ = 19.5 kN·m−3. The corresponding property parameters of the soil related to the slope, including the cohesion c and friction angle φ, are considered as random variables. These random variables obey the normal distribution. The values of the mean and standard deviation of the random variables are listed in Table 9.

The calculations of the safety factor of the slope are conducted using the Bishop method and the Spencer method by the SLIDE code [35]. 11 initial training samples (f = 1) are listed in Table 10 while the safety factor of the circular slip surface was calculated using the Bishop and Spencer methods, respectively.

The results of the slope reliability analysis using different methods are listed in Table 11. A sampling size of 106 is considered in the direct MCS by numerical analysis using SLIDE [35] to estimate the failure probability Pf. The failure probability estimation by the GP-based RSM is in good agreement with that of the MCS. The proposed method needs 12 function evaluations, while the number of function evaluations for the SVM-based RSM [20] and MCS are 19 and 106, respectively. This indicates that the proposed GP-based RSM is much more efficient to achieve reasonable accuracy for slope reliability analysis than the SVM-based RSM [20].

In addition, the slip surfaces from the Bishop method and Spencer method at the design point obtained using the proposed method are both presented in Figure 8.

4.4. Example 4: Congress St. Cut Model

In this example, the confessed Congress St. Cut model studied by Chowdhury and Xu [36] is considered. The soil profile is presented in Figure 9. Two sets of circular slip surfaces are considered. The first set consists of potential failure surface tangential to the lower boundary of the Clay 2 layer (failure mode I), while the second considers slip surfaces tangential to the lower boundary of Clay 3 (failure mode II). The property parameters of soil are listed in Table 12 and are assumed to obey the Gaussian distribution in the numerical analysis. The Bishop method is used to estimate the safety factor from the numerical analysis using SLIDE [35]. For evaluating the failure probability Pf, f = 1 is selected, which results in 7 initial training samples (Table 13). Without any iteration, totally 7 actual stability analyses using the Bishop method are needed for the GP-based RSM. It can be observed from Table 14 that the failure probability estimated by the GP-based RSM agrees well with that reported in the literature [33]. However, the total numbers of slope stability analysis for the proposed method is 12, while that for the MCS, first-order HDMR, and second-order HDMR are 5000, 13, and 61, respectively [33] This indicates that the proposed method is very efficient to achieve reasonable accuracy. The slip surface of failure mode I and failure mode II at the design point obtained using the proposed method are both presented in Figure 10.

5. Conclusions

A GP-based RSM was developed to predict failure probability of slope. In this method, the training samples for establishing the GP model are first generated by the design method of the classical RSM method. The limit state function and its first-order partial derivative are then approximated by the trained GP model. Finally, failure probability is estimated using the FORM. In addition, the viewpoint of more knowledge means the better effect on the result of training or prediction of the machine learning model, and the use of the renewed continuously training samples may improve the performance of the GP model for approximating the limit state function around the design point. Thus, the GP significantly reduces the number of required training samples. Otherwise, GP also shows good performance to approximate the limit state function and then provides accurate estimation of the failure probability when connected with the FORM. Four numerical examples including both slope and nonslope problems illustrate the flexibility and efficiency of the proposed method. Compared with the traditional RSM, the GP-based RSM is much more efficient to achieve reasonable accuracy for slope reliability analysis. Moreover, the present method can directly take advantage of existing slope software without modification and thus are convenient to be used for practitioner engineers. However, it should be noted that the proposed algorithm is intended as a possible complement rather than a replacement for existing classical methods.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was financially supported by the National Natural Science Foundation of China under Grant no. 51369007.