This study focuses on the use of a deep neural network (DNN) to predict the soil friction angle, one of the crucial parameters in geotechnical design. In addition, the particle swarm optimization (PSO) algorithm was used to improve the performance of the DNN by selecting its best structural parameters, namely, the optimal number of hidden layers and the number of neurons in each hidden layer. For this aim, a database containing 245 laboratory tests collected from a project in Ho Chi Minh City, Vietnam, was used to develop the proposed hybrid PSO-DNN model. The database includes seven input factors (soil state, standard penetration test value, unit weight of soil, void ratio, thickness of soil layer, top elevation of soil layer, and bottom elevation of soil layer), with the friction angle as the target. The data set was divided into three parts, namely, the training, validation, and testing sets, for the construction, validation, and testing phases of the model. Several quality assessment criteria, namely, the coefficient of determination (R2), mean absolute error (MAE), and root mean square error (RMSE), were used to evaluate the performance of the PSO-DNN models. The PSO algorithm showed a remarkable ability to find an optimal DNN architecture for the prediction process. The results showed that the PSO-DNN model using 10 hidden layers outperformed the DNN model, improving the average R2 by 1.83%, MAE by 5.94%, and RMSE by 8.58%. Moreover, a global sensitivity analysis was used to identify the most important inputs; it showed that, among the seven input variables, the top and bottom elevations of the soil layer played an important role in predicting the friction angle of soil.

1. Introduction

In geotechnical engineering, the internal friction angle of soil is one of the most important parameters [1]. This index is mentioned in many studies in the field of geology, for instance, in the investigation of the foundation bearing capacity [2–7], soil stability [3, 8–10], limit loads [11], and slope stability [9, 12–14]. In problems related to slope stability or retaining walls, the friction angle was shown to affect soil slope stability significantly, as an increase of the friction angle decreases the soil consolidation [3]. Similarly, the retaining wall’s reaction is also significantly affected by the internal friction angle [8]. In another study, You Mo et al. [11] pointed out a positive correlation between friction angle and critical load. These findings show that the friction angle is a vital soil parameter [1] that requires more in-depth research. However, the friction angle is only mentioned indirectly in related geological studies [2–4, 8, 9, 11], and research focused on this index is limited.

The value of the internal friction angle depends on many soil parameters, such as grain size, particle shape, unit weight, specific gravity, and soil moisture [15, 16]. Two main approaches have been proposed for estimating the friction angle of soil: experimental and theoretical. In the experimental approach, the internal friction angle can be obtained from the direct shear test [1, 17–19] and the triaxial shear test [4, 19–22]. However, these experimental methods have some disadvantages, as they are time-consuming and require relatively expensive equipment [23–25]. Theoretical and numerical methods have also been applied to determine the friction angle of soil. As an example, Mo et al. [11] proposed an equation for the friction angle of the soil and confirmed the validity of the theoretical method.

However, the theoretical method is limited by its reliance on assumptions, in this case, the ideal rigid-plastic slip-line field theory. Other studies proposed alternative methods to evaluate the internal friction angle of soil, based on standard penetration test (SPT) values [26–29] and cone penetration test (CPT) results [30, 31]. Those methods provided several empirical formulas that could quickly predict the soil friction angle. However, these formulas contained only one or a few main input variables, which could reduce the prediction accuracy in many cases.

In the last two decades, with the development of computer science, numerous studies have proposed a novel approach for surveying and assessing geological issues, namely, the artificial intelligence (AI) approach [10, 25, 32, 33]. Mikaeil et al. [32] used the genetic algorithm (GA) to study three crucial physical and mechanical characteristics of soil, among which the internal friction angle is an important parameter. Additionally, an adaptive neuro-fuzzy inference system (ANFIS) was used by Murlidhar et al. [25] to study the shear strength of rock based on the internal friction angle. Das and Basudhar [34] and Al-Hamed et al. [35] developed ANN models to predict the internal friction angle of soil. Pham et al. [36] developed a hybrid model using random forest and particle swarm optimization for estimating the undrained shear strength of soil. The obtained results showed the effectiveness and reliability of the AI approach, reflected by the high R2 values achieved [10, 25, 32, 33]. The deep neural network (DNN) is another powerful and efficient AI algorithm. DNNs have become popular and are widely used to solve practical engineering problems and provide reliable results [37–40]. The difficulty with such a machine learning (ML) approach is that it has a large number of critical hyperparameters, which makes it hard to find an optimal model architecture. Among various optimization algorithms, particle swarm optimization (PSO) is an efficient, robust, and straightforward algorithm mainly used to solve problems for which an exact mathematical model is difficult to find [41–43].

Therefore, the main objective of this study is to develop a PSO-DNN hybrid model that can evolve on its own to find the best model architecture, including the number of hidden layers and the number of neurons in each hidden layer, for predicting the internal friction angle of the soil. Seven input factors that might affect the prediction of the friction angle of soil were considered: the soil state, standard penetration test value, unit weight of soil, void ratio, thickness of soil layer, top elevation of the soil layer, and bottom elevation of the soil layer. For this aim, a data set consisting of 245 soil samples was collected from drill holes in Ho Chi Minh City, Vietnam. The database was then divided into training (60% of the data set), validation (20% of the data set), and testing (20% of the data set) sets, used in the training, validation, and testing phases of the hybrid model, respectively. In addition, a global sensitivity analysis using Monte Carlo simulation was conducted to find the most important parameters affecting the prediction of the friction angle of soil.

2. Data Collection and Preparation

2.1. Experimental Measurement of Friction Angle of Soil

In this work, the authors tested 245 different soil samples from Ho Chi Minh City, Vietnam (Figure 1), and the results are summarized in Table 1. To determine the soil properties, we used the laboratory method for determining shear resistance in a shear box apparatus [44, 45]. Test samples with the original structure and natural moisture were prepared by cutting blocks from the original soil samples and taking them into ring knives, following the ring knife method of volume determination [46]. The underside and top of the soil sample should be leveled with the ring knife's edge and covered with dampened paper first. For fast cutting without draining, the absorbent paper must be replaced with tracing paper (or thin plastic) [46]. While cutting the soil samples, it is also necessary to take soil to determine the moisture content.

For sandy soil, the sample was prepared by pouring dry sand onto a cutter installed in a hard-bottom box with many small holes. The sample was then taken into a cylindrical ring knife with two ends, placed in the cutting box, and compressed under pressure σ. If the soil does not preconsolidate, the sample is cut immediately and quickly, at a speed of 1 mm/min, until it fails. If the soil consolidates under pressure σ, the compressive force is maintained until the required degree of consolidation is reached, and the test then proceeds as normal. Slow cutting is required at a rate of 0.01 mm/min (or slower) until the sample breaks. The failure force is the maximum value read on the strain gauges. The vertical compression pressure was applied at four levels: 50 kPa, 100 kPa, 200 kPa, and 300 kPa. For each level of compressive pressure, the corresponding shear strength of soil was recorded. The shear strength was plotted against the vertical pressure, from which the cohesion of soil C and tan ϕ were determined and the internal friction angle (ϕ) was calculated.
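The final step of this procedure amounts to fitting a straight line, shear strength = C + σ·tan ϕ, through the four recorded points. The following is a minimal sketch of that fit, assuming an ordinary least-squares line; the function name `fit_mohr_coulomb` and the shear strength readings are hypothetical illustration values, not data or code from this study.

```python
import math

def fit_mohr_coulomb(normal_stress, shear_strength):
    """Least-squares fit of tau = C + sigma * tan(phi).

    Inputs are in kPa. Returns (C in kPa, phi in degrees).
    """
    n = len(normal_stress)
    mean_s = sum(normal_stress) / n
    mean_t = sum(shear_strength) / n
    sxx = sum((s - mean_s) ** 2 for s in normal_stress)
    sxy = sum((s - mean_s) * (t - mean_t)
              for s, t in zip(normal_stress, shear_strength))
    tan_phi = sxy / sxx                      # slope of the failure line
    cohesion = mean_t - tan_phi * mean_s     # intercept of the failure line
    return cohesion, math.degrees(math.atan(tan_phi))

# The four vertical pressure levels named in the text (kPa).
sigma = [50.0, 100.0, 200.0, 300.0]
# Hypothetical readings generated from C = 10 kPa and phi = 20 degrees.
tau = [10.0 + s * math.tan(math.radians(20.0)) for s in sigma]

c, phi = fit_mohr_coulomb(sigma, tau)
print(round(c, 2), round(phi, 2))
```

With real laboratory readings the points scatter around the line, so the fitted C and ϕ are estimates rather than exact recoveries.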

2.2. Data Preparation

To predict the internal friction angle of the soil, the inputs need to be carefully considered. First, the water content was not included in this study, as its effect is believed to be reflected in the number of SPT blows [47]. In addition, many studies have successfully related the results of in situ tests (CPT and SPT) to the plastic and liquid limit moisture contents [48–51]. Since the SPT is one of the most commonly used tests in practice to indicate the soil's ability to withstand compression and in situ shear, it is believed to relate to the factors that characterize the shear strength of the soil. Many studies have shown the relationship between in situ test results and the internal friction angle of the soil [26–31]. Second, this study aims to provide a tool for quickly estimating the internal friction angle of the soil from basic parameters that can be determined quickly and simply, instead of parameters that are time-consuming and costly to determine. Therefore, this study selected the SPT value (N30) of soil as the main feature, together with the soil state (S), unit weight of soil (G), void ratio (e), thickness of soil layer (H), top elevation of soil layer (Z1), and bottom elevation of soil layer (Z2), as inputs to predict the internal friction angle of soil. The internal friction angle of soil (ϕ) is the single output variable.

One of the most popular methods to determine the internal friction angle of soil (ϕ) is the direct shear test method [44, 45]. There are three direct cutting modes for soil: UU mode (fast cutting, no consolidation, no drainage), CU mode (fast cutting, consolidation, no drainage), and CD mode (slow cutting, consolidation, drainage; this mode usually applies to sandy soils) [1]. The UU mode was chosen here for three reasons. First, the UU cutting mode is suitable for fast construction and difficult drainage, and the soil samples in the present study were all under this type of condition. Second, the UU cutting mode is a quick test, taking under 30 minutes to perform, while the CU and CD modes can take weeks or even months to complete. Third, the shear strength parameters given by the UU cutting mode are generally on the safe side compared with those from the other two modes. Therefore, in this study, the results of the UU cutting mode were used to determine the friction angle of all soil samples. The meaning of the soil state number is explained in Table 2. The meanings of the parameters are illustrated in Figure 2, and the histograms of all the input and output variables are shown in Figure 3.

The data set containing 245 samples is summarized statistically in Table 1, including the number of soil samples and the minimum, maximum, average, and standard deviation of the input and output variables. Among the 245 soil samples, 46 are sandy soil and the rest are clayey soil. The soil state number (S) ranged from 2 to 9, corresponding to soft clay through dense sandy soil. The minimum depth of the soil samples is 2.0 m, and the maximum depth is 79.32 m. The standard penetration test values (N30) ranged from 2 to 85, where 2 corresponds to soft clay soil and 85 corresponds to hard clay soil. The unit weight of soil (G) ranged from 16.2 kN/m3 to 21.57 kN/m3, corresponding to the clay through sandy soil layers. The void ratio (e) varied from 0.416 to 2.194, corresponding to medium sandy soil through soft clay soil. The internal friction angle of soil ranged from 4.23 degrees to 33.4 degrees, reflecting the increase in shear strength from soft clay to medium and dense sandy soil. These input data are consistent with the properties of the soil layers. An example of 50 data samples is given in Table 3.

In this study, we divided the data into three sets: the training set was used to build the model, the validation set was used to estimate model skill while tuning the model’s hyperparameters, and the testing set was used to estimate the skill of the final tuned models in order to choose the best one. The testing set was fixed at 20% of the total data set, hidden during training, and used only to evaluate the performance of the final model after the hyperparameter tuning process.
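The three-way partition described above can be sketched as follows. This is only an illustration, assuming a simple random shuffle of sample indices with a fixed seed; the function name `split_indices`, its parameters, and the seed are hypothetical, and the source does not state how the samples were actually assigned to each set.

```python
import random

def split_indices(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into train/validation/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    n_test = round(n_samples * test_frac)
    n_val = round(n_samples * val_frac)
    test = indices[:n_test]                    # held out, never used for tuning
    val = indices[n_test:n_test + n_val]       # used while tuning hyperparameters
    train = indices[n_test + n_val:]           # used to fit the model
    return train, val, test

train_idx, val_idx, test_idx = split_indices(245)
print(len(train_idx), len(val_idx), len(test_idx))  # 147 49 49
```

Carving out the testing indices first keeps that set unchanged when the validation fraction is varied.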

3. Machine Learning Methods

3.1. Particle Swarm Optimization

Particle swarm optimization (PSO) was presented by Kennedy and Eberhart [52]. It became prevalent because it is a continuous optimization process and allows an analysis of multiple targets. To search continuously for the best solution, the method moves the positions of particles with a velocity calculated in each iteration. Each particle's movement is influenced by its own best position and by the best position found in the entire search space (Figure 4). This is expected to move the swarm toward the best position. The quality of a solution is assessed through a fitness function. In addition, PSO does not use the gradient of the problem to be optimized, which means that, unlike standard optimization methods such as gradient descent and quasi-Newton methods, PSO does not require the optimization problem to be differentiable. PSO is a powerful technique that has been widely used for optimization problems in many fields in general and in geotechnical engineering in particular [53, 54].

The pseudocode of the algorithm is presented as follows. (Algorithm 1)

Initialize parameters: w, w_damp, c1, c2
FOR each particle i in swarm
 FOR each dimension j
  Initialize position xij randomly within permissible range
  Initialize velocity vij randomly within permissible range
Iteration k = 1
WHILE k < maximum_Iteration
 FOR each particle i in swarm
  Calculate fitness value
  IF fitness value > fitness of P_best[i] THEN
   P_best[i] = current position of particle i
  IF fitness value > fitness of G_best THEN
   G_best = current position of particle i
 FOR each particle i in swarm
  FOR each dimension j
   Calculate new velocity:
   vij(k + 1) = w·vij(k) + c1·random(0,1)·(P_best[i][j] − xij(k)) + c2·random(0,1)·(G_best[j] − xij(k))
   Update particle position:
   xij(k + 1) = xij(k) + vij(k + 1)
 w = w·w_damp
 k = k + 1

The parameters in the equation that defines the velocity of the next iteration are as follows: w is the inertia weight; c1 and c2 are the acceleration coefficients, where c1 weights the individual best solution and c2 weights the global best candidate solution; w_damp is the inertia reduction coefficient, which helps the swarm movement converge quickly.
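As an illustration of the update rule described above, the following is a minimal PSO sketch in plain Python. It is not the authors' implementation: the function name `pso_maximize`, the default parameter values, the position clipping to the bounds, and the toy fitness function are all assumptions made for this example.

```python
import random

def pso_maximize(fitness, bounds, n_particles=30, n_iterations=100,
                 w=0.7, w_damp=0.99, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over the box `bounds` = [(low, high), ...]."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.1 * rng.uniform(lo - hi, hi - lo) for lo, hi in bounds]
           for _ in range(n_particles)]
    # Personal bests store positions and their fitness separately.
    p_best = [p[:] for p in pos]
    p_best_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: p_best_fit[i])
    g_best, g_best_fit = p_best[g][:], p_best_fit[g]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for j, (lo, hi) in enumerate(bounds):
                # Inertia term + pull toward personal best + pull toward global best.
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (p_best[i][j] - pos[i][j])
                             + c2 * rng.random() * (g_best[j] - pos[i][j]))
                # Clipping to the permissible range is an assumption of this sketch.
                pos[i][j] = min(max(pos[i][j] + vel[i][j], lo), hi)
            fit = fitness(pos[i])
            if fit > p_best_fit[i]:
                p_best[i], p_best_fit[i] = pos[i][:], fit
            if fit > g_best_fit:
                g_best, g_best_fit = pos[i][:], fit
        w *= w_damp  # damp the inertia weight after every iteration
    return g_best, g_best_fit

# Toy fitness with its maximum at (1, -2); purely illustrative.
def toy_fitness(x):
    return -((x[0] - 1.0) ** 2) - (x[1] + 2.0) ** 2

best, best_fit = pso_maximize(toy_fitness, [(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_fit)
```

Only fitness values are compared, never gradients, which is why the fitness function does not need to be differentiable.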

3.2. Deep Neural Network

The deep neural network (DNN) is one of the most advanced regression methods. A DNN model is analogous to a multistage regression. The main idea is to create a flexible nonlinear statistical model consisting of several layers, each containing several neurons. In a DNN, each node in a layer is connected, with a certain weight denoted as wij, to every node in the adjacent layers, creating a fully connected neural system [55]. Except for the input layer, each node is a neuron that uses a nonlinear activation function [56]. Thanks to its multiple layers and nonlinear activation functions, the DNN model can distinguish data that are not linearly separable. The DNN structure used in this study is a multilayer perceptron (MLP), shown in Figure 5. Further details of the DNN model can be found in [57].
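To make the fully connected structure concrete, the following is a minimal forward pass of an MLP in plain Python. It is a sketch only: the layer sizes, the random weight initialization, and the ReLU activation are assumptions for illustration and are not taken from the model settings of this study.

```python
import math
import random

def init_mlp(layer_sizes, seed=0):
    """Create random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def forward(layers, x):
    """Propagate one input vector through the network."""
    a = list(x)
    for idx, (weights, biases) in enumerate(layers):
        # Weighted sum of all activations from the previous layer, plus bias.
        z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b
             for row, b in zip(weights, biases)]
        last = idx == len(layers) - 1
        # ReLU on hidden layers (assumed), linear output for regression.
        a = z if last else [max(0.0, v) for v in z]
    return a

# Seven inputs, two hidden layers (sizes chosen arbitrarily), one output.
net = init_mlp([7, 16, 8, 1])
y = forward(net, [0.5] * 7)
print(y)
```

Training (adjusting the weights to reduce prediction error) is omitted; the sketch only shows how an input vector flows through the layers.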

3.3. Particle Swarm Optimization-Deep Neural Network Algorithm

The PSO-DNN hybrid model was built on the PSO algorithm, with the fitness of each particle in the swarm given by the performance of the DNN that the particle encodes. The hybrid algorithm is run with the number of hidden layers incremented from 2 to 10. For each number of hidden layers, random particles are generated; the length of the particle vector is the number of hidden layers, and the jth dimension of the particle vector is the number of neurons in the jth hidden layer. The swarm architecture is presented in Figure 6, in which k is the number of hidden layers and N is the number of particles in a swarm. The flow chart of the hybrid PSO-DNN model is shown in Figure 7. For each number of hidden layers, the algorithm automatically looks for the optimal model architecture.
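The particle encoding described above can be sketched as follows. The helper names `decode_particle` and `random_swarm`, the rounding and clipping rule, and the neuron bounds (1 to 50) are assumptions for illustration; the source does not state how continuous particle positions are mapped to integer neuron counts.

```python
import random

def decode_particle(position, min_neurons=1, max_neurons=50):
    """Round and clip a continuous position to integer neuron counts per layer."""
    return [min(max(int(round(p)), min_neurons), max_neurons) for p in position]

def random_swarm(n_hidden_layers, n_particles, min_neurons=1, max_neurons=50, seed=0):
    """One particle per row; the row length equals the number of hidden layers."""
    rng = random.Random(seed)
    return [[rng.uniform(min_neurons, max_neurons) for _ in range(n_hidden_layers)]
            for _ in range(n_particles)]

# Outer loop over the number of hidden layers, 2 to 10 as in the text.
for k in range(2, 11):
    swarm = random_swarm(k, n_particles=5, seed=k)
    architectures = [decode_particle(p) for p in swarm]
    # In a full run, each architecture would be trained as a DNN and its
    # validation score used as the particle's fitness (omitted here).
    print(k, architectures[0])
```

Each decoded list is one candidate architecture, for example `[12, 30]` for a two-hidden-layer network with 12 and 30 neurons.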

3.4. Performance Evaluation

To evaluate the effectiveness of the proposed AI models, three widely used statistical criteria are applied, namely, mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R2). The R2 coefficient is commonly used in regression analysis to estimate the proportion of the variation in the data that is explained by the prediction [58]. To measure the average magnitude of error, both MAE and RMSE were used [59]. A model is more accurate when R2 is closer to 1 and MAE and RMSE are closer to 0. These coefficients are determined by the following formulas:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2},$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2},$$

where $N$ represents the number of samples, $y_i$ and $\hat{y}_i$ are the actual and predicted outputs, respectively, and $\bar{y}$ is the average value of $y_i$.
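The three criteria can be computed directly, as in the sketch below. It assumes the standard definitions (mean of absolute errors, square root of the mean squared error, and one minus the ratio of residual to total sum of squares); the sample values are made up for illustration.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(mae(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```

RMSE penalizes large individual errors more heavily than MAE because the errors are squared before averaging.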

4. Results and Discussion

4.1. Effect of Training Data Size

Because the testing set size was fixed at 20% of the total data set and hidden from the training process, the training set size was varied from 60% down to 20% of the data set, meaning that the validation set size increased correspondingly from 20% to 60%. For each training set size, 300 simulations were run, taking into account the random shuffle in the training order. The initialization parameters of the DNN model used in this section are presented in Table 4.

From Figure 8, it can be seen that training performance progressively increased as the training set size decreased. The average R2 value increased from 0.952 to 0.991 when the training set size decreased from 60% to 20%. On the other hand, the predictions were less accurate on the validation and testing sets as the training set size decreased, which indicates overfitting on the smaller training sets. The average R2 value decreased from 0.889 to 0.602 on the validation set and dropped from 0.919 to 0.570 on the testing set. In addition, the standard deviation of the DNN models increased as the training set size decreased; specifically, it increased from 0.0277 to 0.351 on the testing set. These results show that the model performed better and was more stable with the largest training set size. To increase the prediction accuracy of the model, a training set size of 60% was selected in this study.

4.2. DNN Structure Optimization

The evolutionary results of PSO-DNN models are evaluated in this section. The initialization parameters of PSO-DNN used in this study are given in Table 5. Figure 9 illustrates the evolution of the PSO-DNN model through 30 generations with the number of hidden layers set to values: 2, 4, 6, 8, and 10. A summary of the best predictability of the models is presented in Table 6.

It can be seen that the DNN model with 10 hidden layers evolved to give the best performance; its best generation yielded R2 = 0.938, MAE = 1.395, and RMSE = 1.846. The next best performance came from the DNN model with 8 hidden layers (R2 = 0.931, MAE = 1.528, and RMSE = 1.839). The DNN model with 10 hidden layers gives the best performance for the R2 and MAE criteria, while the model with 8 hidden layers gives the best result for the RMSE cost function. However, the best neuron structure differs for each cost function and is also shown in Table 6. The evolution of the number of neurons in the best DNN models with 10 and 8 hidden layers, with respect to R2, MAE, and RMSE, is presented in Figure 10. The evolution of the model has thus found three good candidates; the best among them is chosen in the next section.

4.3. Predictive Capability of the Models

In this section, the three best models, namely, 10 hidden layers with R2 (10-HD-R2), 10 hidden layers with MAE (10-HD-MAE), and 8 hidden layers with RMSE (8-HD-RMSE), were compared with each other. From a statistical point of view, the random factors in the input variables should be considered. Therefore, 300 simulations were run to take into account the randomness in the ordering of the input data. The data used in this section include the training, validation, and testing data sets. The criteria for selecting the best model are based on the results on the testing data set. The results are shown in Figure 11 and Tables 7–9.

The results showed good accuracy on the training and validation sets, with average R2 ranging from 0.851 to 0.946, MAE ranging from 1.188 to 2.105, and RMSE varying from 1.566 to 2.668. Among the three models, 10-HD-R2 gives the best prediction, with an average R2 = 0.935, MAE = 1.34, and RMSE = 1.77 on the testing data set. The next best model is 8-HD-RMSE (R2 = 0.881, MAE = 1.523, and RMSE = 2.019). Moreover, the standard deviation of the 10-HD-R2 model is smaller than that of the 8-HD-RMSE model: SD = 0.0138, 0.1339, and 0.1835 for the R2, MAE, and RMSE criteria, compared with SD = 0.0186, 0.1923, and 0.3264, respectively.

Table 10 compares the DNN model and the PSO-DNN model. The DNN model received initialized parameters as shown in Table 11. The best PSO-DNN model, 10-HD-R2, was included in the comparison to demonstrate the effectiveness of the algorithm. The average error criteria over 300 simulations were compared to confirm the model’s performance. The 10-HD-R2 PSO-DNN model outperformed the DNN model, improving the average R2 by 1.83%, MAE by 5.94%, and RMSE by 8.58%.

Table 11 presents some research results on ML applications in determining the internal friction angle of soil. The results of the present study, as well as those of other studies, show the effectiveness of the ML technique in predicting the internal friction angle of the soil, with R2 from 0.79 to 0.935 on the testing data set. However, because the data sets and input variables differ between studies, a direct comparison of these results is not meaningful. A project using different data sets as well as various input variables is needed to provide a general model for the prediction of soil shear parameters.

Figure 12 shows a visual comparison of test results and predictions based on the friction angle of soil of the DNN and PSO-DNN models. The performances of two ML models have been tested on all three data sets: training, validation, and testing.

4.4. Sensitivity Analysis

In this section, a global sensitivity analysis was conducted using Monte Carlo methods to evaluate the importance of the input parameters for the model [63]. It is an effective way to investigate the relationship between input and output. The input data set taken from Saltelli’s sampling scheme was used to develop the DNN model [64]. The global sensitivity index is determined by the following formulas:

$$\mathrm{Var}(Y) = \sum_{i=1}^{d}\mathrm{Var}_i + \sum_{i<j}^{d}\mathrm{Var}_{ij} + \cdots + \mathrm{Var}_{12\ldots d},$$

$$S_{Ti} = \frac{\mathrm{Var}_i + \sum_{j \neq i}\mathrm{Var}_{ij} + \cdots}{\mathrm{Var}(Y)},$$

where $\mathrm{Var}(Y)$ denotes the total variance of the model output; $d$ denotes the number of input features; $\mathrm{Var}_i$ denotes the model output variance in response to variation of the $i$th input variable; $\mathrm{Var}_{ij}$ denotes the model output variance in response to the simultaneous variation of the $i$th and the $j$th inputs; and $S_{Ti}$ denotes the total sensitivity index.
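A total sensitivity index of this kind can be estimated by Monte Carlo sampling, as in the sketch below. It is not the procedure used in this study: it assumes independent inputs uniform on [0, 1], uses plain random sampling rather than Saltelli's scheme, applies the Jansen estimator for the total index, and replaces the trained DNN with a hypothetical toy model.

```python
import random

def total_sobol_indices(model, n_inputs, n_samples=10000, seed=0):
    """Estimate total-order Sobol indices with the Jansen estimator."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    f_a = [model(row) for row in a]
    mean = sum(f_a) / n_samples
    var_y = sum((v - mean) ** 2 for v in f_a) / (n_samples - 1)

    indices = []
    for i in range(n_inputs):
        sq_diff = 0.0
        for row_a, row_b, fa in zip(a, b, f_a):
            mixed = row_a[:]        # copy of the A sample ...
            mixed[i] = row_b[i]     # ... with only input i resampled from B
            sq_diff += (fa - model(mixed)) ** 2
        # Mean squared change caused by input i, divided by 2 * Var(Y).
        indices.append(sq_diff / (2.0 * n_samples * var_y))
    return indices

# Toy stand-in for the trained model: the third input has no effect.
def toy_model(x):
    return 2.0 * x[0] + x[1]

st = total_sobol_indices(toy_model, 3)
print([round(v, 3) for v in st])
```

For this toy model the analytical total indices are 0.8, 0.2, and 0, so an input that the model ignores receives an index of zero.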

The result of the total sensitivity analysis is shown in Figure 13. Among the 7 input variables used to predict the friction angle of soil, the bottom elevation of the soil layer (Z2) was the most important feature, with an average sensitivity index of 0.625. The top elevation of the soil layer (Z1) was the second most important variable, with an average sensitivity index of 0.603. From the point of view of soil mechanics, the deeper the soil is, the more compact it is, so the depth of the soil sample plays an important role in determining the friction angle of the soil, since its density, state, and many other mechanical properties change with depth. The variables H, G, and S ranked third to fifth in importance, with average sensitivity indices ranging from 0.203 to 0.05. The other predictor variables included in the model (e, G) had sensitivity indices lower than 0.01, indicating that they did not affect the output prediction. It is important to note that the sensitivity analysis in this section is only relevant for the input data set itself and cannot be confirmed with other data sets.

5. Conclusions

In this study, a PSO-DNN hybrid model, which can evolve on its own to find the best models, was developed to predict the friction angle of soil. The hybrid model found the three best models among those surveyed. A database containing 245 soil samples from geological boreholes was used to develop and evaluate the three proposed DNN models: 10-HD-R2, 10-HD-MAE, and 8-HD-RMSE.

The results show that the models’ performance improved and stabilized, from R2 = 0.602 to 0.889 on the validation set, as the training set size increased from 20% to 60%. The PSO-DNN obtains the best results with 8 to 10 hidden layers. The optimal number of neurons differs from one hidden layer to another and follows no simple pattern. This suggests that a DNN model with 8 to 10 hidden layers might be optimal for predicting the friction angle of soil. However, it is advisable to select the number of neurons in each hidden layer through evolutionary methods to make the DNN model highly efficient. The results also showed that, on the training and validation data sets, all three best models, 10-HD-R2, 10-HD-MAE, and 8-HD-RMSE, gave good predictions, with 10-HD-R2 leading; on the testing data set, the 10-HD-R2 model still gave the best results and outperformed the other two models. Prediction results of 300 simulations show that the 10-HD-R2 model has a smaller standard deviation, indicating that it is more stable than the other two models.

In addition, the sensitivity analysis using the Monte Carlo method was carried out to evaluate the importance of input features in the model study. The results show that the two inputs related to the depth of the soil layers (Z2 and Z1) were considered to be the most important parameters for predicting soil friction angle.

Data Availability

The data used in the manuscript are available in the supplementary materials. Resharing of the data with other researchers is fully allowed, provided that the data are cited in the resulting research.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Supplementary Materials

Data collected in Ho Chi Minh City, Vietnam, including 245 experimental results for determining the internal friction angle of the soil, together with the input parameters required for machine learning modeling. (Supplementary Materials)