#### Abstract

This research presents a novel hybrid prediction technique, the self-tuning least squares support vector machine (ST-LSSVM), to accurately model the friction capacity of driven piles in cohesive soil. The hybrid approach uses LS-SVM as a supervised-learning-based predictor to build an accurate input-output relationship of the dataset and the symbiotic organisms search (SOS) method to optimize the *σ* and *γ* parameters of the LS-SVM. Evaluation and investigation of the ST-LSSVM were conducted on 45 training data and 20 testing data of driven pile load tests compiled from previous studies. The prediction accuracy of the ST-LSSVM was then compared to that of other machine learning methods, namely, LS-SVM and the backpropagation neural network (BPNN), and was benchmarked against the previous neural network (NN) results from Goh using the coefficient of correlation (*R*), mean absolute error (MAE), and root mean square error (RMSE). The comparison showed that the ST-LSSVM performed better than LS-SVM, BPNN, and NN in terms of *R*, RMSE, and MAE. This comprehensive evaluation confirms the capability of the hybrid SOS and LS-SVM approach to accurately model the friction capacity of driven piles in clay, making it a reliable and robust assistance tool for geotechnical engineers estimating friction pile capacity.

#### 1. Introduction

Deep foundations built over the past years have been made of concrete, steel, or timber piles, which are either driven and precast or bored and cast-in-situ. Driven piles are frequently used in developing countries with a vast array of suburban and rural areas as foundations to support heavily loaded structures, for example, high-rise buildings and bridges. Recently, a variation of driven piles, jack-in piles, has also been successfully used as a foundation for high-rise buildings in urban areas due to its lower vibration and noise compared to conventional driven piles [1].

Despite all the developments in driven pile methods, the design of driven piles still relies heavily on semi-empirical methods to estimate shaft resistance (*f*_{s}) and end resistance (*f*_{b}) [1–3]. When there is no bearing layer, as for driven piles in cohesive soil, *f*_{b} is negligible and the large majority of the pile load is carried by *f*_{s}. Consequently, the *f*_{s} that can be provided by the soil is very important in pile design. To date, there has been no comprehensive assessment of *f*_{s} prediction methods [3]. Efforts were limited to comparing *f*_{s} predictions from various semi-empirical methods with results from pile load tests [3–5]. To overcome this limitation, other efforts were dedicated to predicting *f*_{s} through machine learning techniques [6–8].

In civil engineering, machine learning techniques have developed into an important research area. Several studies reveal the advantages of using machine learning techniques in establishing a better predictive model over traditional methods [9–11]. Recently, the least squares support vector machine (LS-SVM) has become one of the most widely used machine learning techniques in handling a variety of complex problems [12–15]. Although acceptable prediction results have been reported, improper parameter tuning may impair the learning process of LS-SVM, resulting in lower accuracy. Building a more accurate predictive model can be achieved by optimizing the LS-SVM parameters: the regularization parameter (*γ*), which handles the trade-off between minimizing model complexity and training error, and the kernel parameter (*σ*) of the radial basis function (RBF), which describes the nonlinear mapping between the input space and the high-dimensional feature space.

Identifying the optimal parameters is an optimization problem; therefore, many recent studies combine a machine learning technique with a metaheuristic-based optimizer instead of using a machine learning technique alone [16–21]. Accordingly, this research presents a new hybrid prediction method called the self-tuning least squares support vector machine (ST-LSSVM) to accurately model the friction capacity of driven piles. The hybrid ST-LSSVM combines the symbiotic organisms search (SOS) algorithm and LS-SVM. While SOS is used to optimize the *σ* and *γ* parameters of the LS-SVM, the LS-SVM builds an accurate input-output relationship of the dataset by performing as a supervised-learning-based predictor. A total of 45 training data and 20 testing data from Goh [6] have been employed to validate the performance of the proposed method. The ST-LSSVM method is further compared with LS-SVM and the backpropagation neural network (BPNN) and is benchmarked against the previous results from Goh [6] using the coefficient of correlation (*R*), mean absolute error (MAE), and root mean square error (RMSE).

#### 2. The Proposed Self-Optimized Machine Learning Framework

The objective of the proposed hybrid method is to improve the learning ability of LS-SVM by automatically searching for an optimized set of LS-SVM parameters. The collaborative integration of LS-SVM-based regression and SOS enables the LS-SVM to accurately determine the complicated relationship between the input variables and the output variable of the given historical data. The LS-SVM and SOS are briefly described below.

##### 2.1. Machine Learning Technique: Least Squares Support Vector Machine (LS-SVM)

LS-SVM was first introduced by Suykens and Vandewalle [12] as a modification of the conventional support vector machine (SVM). LS-SVM uses a least squares loss function, which allows for function estimation while reducing computational cost. Where highly nonlinear spaces occur, the RBF kernel is chosen as the kernel function in LS-SVM because it brings more promising results than other kernels [12, 22]. The following model of interest underlies the functional relationship between one or more independent variables and a response variable [12, 23]:

$$y(x) = w^{T}\varphi(x) + b,$$

where $w \in \mathbb{R}^{n_h}$, $b \in \mathbb{R}$, and $\varphi(\cdot): \mathbb{R}^{n} \rightarrow \mathbb{R}^{n_h}$ is the mapping to the high-dimensional feature space. In LS-SVM for regression analysis, given a training dataset $\{x_k, y_k\}_{k=1}^{N}$, the optimization problem is formulated as follows:

$$\min_{w,b,e} J(w,e) = \frac{1}{2}w^{T}w + \frac{\gamma}{2}\sum_{k=1}^{N} e_{k}^{2},$$

subject to the equality constraints

$$y_k = w^{T}\varphi(x_k) + b + e_k, \quad k = 1, \ldots, N,$$

where $e_k \in \mathbb{R}$ are the error variables and $\gamma \geq 0$ denotes the regularization constant.

In this optimization problem, the objective function consists of a regularization term and a sum of squared fitting errors. The cost function is similar to that used in training feedforward neural networks and is closely related to ridge regression. However, the primal problem becomes intractable when $w$ becomes infinite-dimensional. In this case, the dual problem should be derived after constructing the Lagrangian [12].

The Lagrangian is given by

$$\mathcal{L}(w,b,e;\alpha) = J(w,e) - \sum_{k=1}^{N} \alpha_k \left( w^{T}\varphi(x_k) + b + e_k - y_k \right),$$

where $\alpha_k$ are the Lagrange multipliers. The conditions for optimality are given by

$$\frac{\partial \mathcal{L}}{\partial w} = 0 \rightarrow w = \sum_{k=1}^{N} \alpha_k \varphi(x_k), \qquad
\frac{\partial \mathcal{L}}{\partial b} = 0 \rightarrow \sum_{k=1}^{N} \alpha_k = 0,$$

$$\frac{\partial \mathcal{L}}{\partial e_k} = 0 \rightarrow \alpha_k = \gamma e_k, \qquad
\frac{\partial \mathcal{L}}{\partial \alpha_k} = 0 \rightarrow w^{T}\varphi(x_k) + b + e_k - y_k = 0.$$

After elimination of $e$ and $w$, the following linear system is obtained:

$$\begin{bmatrix} 0 & 1_{N}^{T} \\ 1_{N} & \Omega + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix} =
\begin{bmatrix} 0 \\ y \end{bmatrix},$$

where $y = [y_1; \ldots; y_N]$, $1_N = [1; \ldots; 1]$, and $\alpha = [\alpha_1; \ldots; \alpha_N]$. The kernel function is applied as follows:

$$\Omega_{kl} = \varphi(x_k)^{T}\varphi(x_l) = K(x_k, x_l), \quad k, l = 1, \ldots, N.$$

The resulting LS-SVM model for function estimation is expressed as

$$y(x) = \sum_{k=1}^{N} \alpha_k K(x, x_k) + b,$$

where $\alpha$ and $b$ are the solution to the linear system above.

The kernel function that is most often utilized is the RBF kernel, defined as follows:

$$K(x, x_k) = \exp\left( -\frac{\lVert x - x_k \rVert^{2}}{\sigma^{2}} \right),$$

where $\sigma$ is the kernel function parameter.

The *γ* parameter controls the penalty imposed on data points that deviate from the regression function, while the *σ* parameter directly affects the smoothness of the regression function. Proper setting of these hyperparameters is therefore required to ensure the best performance of the predictive model.
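To make the formulation above concrete, the following minimal sketch fits an RBF-kernel LS-SVM by solving the dual linear system for $\alpha$ and $b$. It uses NumPy; the function names are illustrative, not taken from the paper or the LS-SVMlab toolbox.

```python
import numpy as np

def rbf_kernel(X1, X2, sigma2):
    """RBF kernel: K(x, z) = exp(-||x - z||^2 / sigma^2)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma2)

def lssvm_fit(X, y, gamma, sigma2):
    """Solve the LS-SVM dual linear system for (alpha, b)."""
    N = len(y)
    Omega = rbf_kernel(X, X, sigma2)
    # Assemble [[0, 1^T], [1, Omega + I/gamma]] [b; alpha] = [0; y]
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = Omega + np.eye(N) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]          # alpha, b

def lssvm_predict(X_new, X_train, alpha, b, sigma2):
    """Evaluate y(x) = sum_k alpha_k K(x, x_k) + b."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b
```

Note that, unlike the standard SVM, no quadratic program is needed: training reduces to one dense linear solve of size (N + 1).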

##### 2.2. Metaheuristic Optimization Algorithm: Symbiotic Organisms Search (SOS)

Developed by Cheng and Prayogo [24], SOS is a recently developed metaheuristic algorithm inspired by the dependency-based symbiotic interactions commonly observed among organisms in nature. Like many other metaheuristic algorithms, SOS guides the searching process using special operators acting on candidate solutions; it maintains a population of organisms carrying candidate solutions to find the global solution in the search space; it requires a maximum number of evaluations and other common control parameters; and it preserves the better solutions through a selection mechanism.

Nevertheless, there are some key differences: SOS needs no algorithm-specific parameters, whereas, for example, particle swarm optimization (PSO) relies on an inertia weight, a cognitive factor, and a social factor. SOS therefore requires no extra parameter-tuning effort, which is a significant advantage, since improperly tuned parameters may trap the obtained solutions in local optima regions. Since its introduction in 2014, SOS has been successfully applied to many optimization problems in various research areas [25–31].

At the beginning, SOS creates a random ecosystem matrix (the population) in which each row represents a viable candidate solution to the problem. The number of organisms in the ecosystem, called the ecosystem size, is specified by the user. Each row of the matrix represents an organism, analogous to an individual in other metaheuristic algorithms, and each virtual organism carries a candidate solution along with its corresponding objective value. Once the ecosystem has been generated, the search begins.

The search comprises three distinct phases inspired by the most well-known forms of symbiosis, which organisms use to improve their long-term survival advantage and fitness (Figure 1). Throughout the search, the organisms thus benefit from interacting with one another in three ways. The SOS algorithm adopts a greedy selection scheme: an updated organism replaces the current organism only if its fitness is better. Once an organism has finished all three phases, the best organism is updated. The phases form a continual cycle until the stopping criterion is reached.

###### 2.2.1. Mutualism Phase

In the mutualism phase, both interacting organisms benefit from the relationship; a classic example is flowers and bees. The mathematical formulation for the mutualism phase is as follows:

$$X_{i}^{new} = X_{i} + rand(0,1) \times (X_{best} - MV \times BF_{1}),$$
$$X_{j}^{new} = X_{j} + rand(0,1) \times (X_{best} - MV \times BF_{2}),$$

where $X_i$ and $X_j$ represent the two current organisms engaged in mutualism; $X_{best}$ represents the current best organism; $MV$ (the mutual vector) models the mutualism interaction between the two current organisms; $X_i^{new}$ and $X_j^{new}$ represent the updated organisms after the interaction; and $BF_1$ and $BF_2$ are two random values of either 1 or 2 representing the level of benefit to each organism. Meanwhile, $MV$ can be calculated using the following formulation:

$$MV = \frac{X_{i} + X_{j}}{2}.$$

###### 2.2.2. Commensalism Phase

In this phase, one organism benefits from the relationship while the other is unaffected, as is common between sharks and remora fish. The mathematical formulation for the commensalism phase is as follows:

$$X_{i}^{new} = X_{i} + rand(-1,1) \times (X_{best} - X_{j}),$$

where $rand(-1,1)$ is the uniform random parameter between −1 and 1.

###### 2.2.3. Parasitism Phase

Finally, parasitism is a relationship in which one side benefits and the other is harmed, as when the plasmodium parasite is transferred from one human host to the next by the *Anopheles* mosquito. In this phase, the beneficiary gets fitter while the harmed organism is likely to perish. The mathematical formulation for the parasitism phase is as follows:

$$X_{j} = X_{parasite} \quad \text{if } X_{parasite} \text{ has better fitness than } X_{j},$$

where $X_{parasite}$ is the artificial parasite created from $X_i$ that threatens the existence of $X_j$. Meanwhile, $X_{parasite}$ can be calculated as follows:

$$X_{parasite} = M \otimes X_{i} + \neg M \otimes \left( rand(0,1) \times (ub - lb) + lb \right),$$

where $M$ and $\neg M$ are the binary random matrix and its inverse, respectively, $rand(0,1)$ is the uniform random parameter between 0 and 1, and $lb$ and $ub$ are the lower and upper bounds of the search space.
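The three phases described above can be sketched in a compact Python implementation. This is an illustrative sketch of the standard SOS scheme with greedy selection, not the authors' exact code; the function and variable names are ours.

```python
import numpy as np

def sos_optimize(f, lb, ub, eco_size=30, max_iter=100, seed=0):
    """Minimize f over the box [lb, ub] with Symbiotic Organisms Search."""
    rng = np.random.default_rng(seed)
    d = len(lb)
    eco = rng.uniform(lb, ub, size=(eco_size, d))       # ecosystem matrix
    fit = np.array([f(x) for x in eco])

    def greedy(i, cand):
        """Greedy selection: keep the candidate only if it is fitter."""
        cand = np.clip(cand, lb, ub)
        fc = f(cand)
        if fc < fit[i]:
            eco[i], fit[i] = cand, fc

    for _ in range(max_iter):
        for i in range(eco_size):
            best = eco[np.argmin(fit)]
            # Mutualism: i and a random j both move toward the best organism
            j = rng.choice([k for k in range(eco_size) if k != i])
            mv = (eco[i] + eco[j]) / 2.0                # mutual vector
            bf1, bf2 = rng.integers(1, 3, size=2)       # benefit factors in {1, 2}
            greedy(i, eco[i] + rng.random(d) * (best - mv * bf1))
            greedy(j, eco[j] + rng.random(d) * (best - mv * bf2))
            # Commensalism: i benefits from a random j; j is unaffected
            j = rng.choice([k for k in range(eco_size) if k != i])
            greedy(i, eco[i] + rng.uniform(-1, 1, d) * (best - eco[j]))
            # Parasitism: a mutated copy of i tries to replace a random j
            j = rng.choice([k for k in range(eco_size) if k != i])
            parasite = eco[i].copy()
            mask = rng.integers(0, 2, d).astype(bool)   # binary random matrix M
            parasite[mask] = rng.uniform(lb, ub, d)[mask]
            greedy(j, parasite)
    k = np.argmin(fit)
    return eco[k], fit[k]
```

Note that the only control parameters are the ecosystem size and the maximum number of iterations, in line with the parameter-free character of SOS discussed above.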

##### 2.3. Cross-Validation Technique and Performance Measurement

Training and testing processes are essential in establishing the prediction model. In the training process, a dataset is used to build a prediction model through the machine learning method. In the testing process, the established prediction model is used to make predictions on a new dataset. Using the entire dataset for training might cause "overfitting," that is, the prediction model fits the dataset extremely well but is useless for new, unseen data. Hence, to avoid the overfitting problem, the training dataset is often divided into two subsets: a larger portion used as the "training subset" and a smaller portion used as the "validation subset." The validation subset is used to validate the model built. This approach helps ensure that the established prediction model performs well in predicting the testing dataset.

To eliminate the randomness in partitioning the training dataset, the *k*-fold cross-validation technique is employed [32, 33]. This technique creates *k* nonoverlapping subsets from the historical data. The first (*k* − 1) subsets are used to train the inference model, in this case ST-LSSVM, and the remaining *k*th subset is used to validate the result. The process is repeated *k* times until every subset has been used exactly once as the validating subset. The value of *k* is a free parameter; in the current research, it was set to 10 to keep the computational time reasonable. All data were thus divided randomly into 10 equally sized groups: in each fold, one subset serves as the validating subset while the other nine serve as training subsets. The term "tenfold" means that 10% of the data is used for validation while the remaining 90% is used for training.
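The folding scheme above can be sketched as a short helper that returns the average validation RMSE over *k* nonoverlapping folds. The `fit` and `predict` arguments stand in for any learner (here hypothetical callables, not the paper's code):

```python
import numpy as np

def kfold_rmse(fit, predict, X, y, k=10, seed=0):
    """Average validation RMSE over k nonoverlapping, randomly assigned folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))            # random, nonoverlapping partition
    folds = np.array_split(idx, k)
    rmses = []
    for i in range(k):
        val = folds[i]                        # the k-th subset validates
        trn = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[trn], y[trn])           # train on the other k-1 subsets
        err = predict(model, X[val]) - y[val]
        rmses.append(np.sqrt(np.mean(err ** 2)))
    return float(np.mean(rmses))
```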

The performance measures used to evaluate the predictive methods are further described in Table 1. The performance measures are implemented on the predicted output results of the training and testing data. The lowest RMSE and MAE values, together with the highest *R* value, indicate the best model outcome.
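The three performance measures can be computed with their standard definitions, as in the following NumPy sketch (Table 1 itself is not reproduced here):

```python
import numpy as np

def performance(y_true, y_pred):
    """Coefficient of correlation R, MAE, and RMSE between measured and predicted values."""
    r = np.corrcoef(y_true, y_pred)[0, 1]             # Pearson correlation
    mae = np.mean(np.abs(y_true - y_pred))            # mean absolute error
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))   # root mean square error
    return r, mae, rmse
```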

##### 2.4. Integration of LS-SVM and SOS with Cross-Validation

The procedure of ST-LSSVM explains how the proposed method uses the training data and testing data to deliver the best prediction results. As mentioned earlier, the training dataset was divided into a training subset and a validating subset. In the training process, predictive model sets were built by allowing SOS to determine the optimal LS-SVM parameters. Figure 2 presents the framework of the ST-LSSVM method.

To mitigate overfitting, *k*-fold cross-validation was used for parameter selection, and the statistical performance measures were calculated over all subsets and folds. The best parameters are the parameter set that produces the minimum average RMSE on the validating subsets over the tenfold training simulation. The testing dataset was then used to evaluate the performance of the trained LS-SVM model on unseen data after the optimization finished. The whole optimization process was automated by SOS, which optimized the two LS-SVM parameters (*γ* and *σ*) to reduce prediction errors, while LS-SVM handled the learning and curve fitting.
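The coupling between SOS and LS-SVM can be sketched as a fitness function: each SOS organism encodes a candidate (*γ*, *σ*²) pair, and its fitness is the average cross-validation RMSE of an RBF LS-SVM trained with those values. The sketch below is self-contained and illustrative (our own helper names; organisms are encoded as [log₁₀ *γ*, log₁₀ *σ*²], one plausible way to explore a 10⁻¹⁰–10¹⁰ range, not necessarily the authors' encoding):

```python
import numpy as np

def make_fitness(X, y, k=5, seed=0):
    """Build a fitness function mapping [log10(gamma), log10(sigma2)]
    to the average k-fold validation RMSE of an RBF LS-SVM."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)

    def rbf(A, B, s2):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / s2)

    def fitness(org):
        gamma, s2 = 10.0 ** org[0], 10.0 ** org[1]    # decode the organism
        rmses = []
        for i in range(k):
            v = folds[i]
            t = np.concatenate([folds[j] for j in range(k) if j != i])
            n = len(t)
            # Dual LS-SVM linear system on the training folds
            A = np.zeros((n + 1, n + 1))
            A[0, 1:] = A[1:, 0] = 1.0
            A[1:, 1:] = rbf(X[t], X[t], s2) + np.eye(n) / gamma
            sol = np.linalg.solve(A, np.concatenate(([0.0], y[t])))
            b, alpha = sol[0], sol[1:]
            pred = rbf(X[v], X[t], s2) @ alpha + b    # predict the validation fold
            rmses.append(np.sqrt(np.mean((pred - y[v]) ** 2)))
        return float(np.mean(rmses))

    return fitness
```

Minimizing this fitness with an SOS-style optimizer yields the self-tuned hyperparameter pair; the best organism after the final iteration is the parameter set used for the testing phase.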

#### 3. Experimental Results

##### 3.1. Historical Dataset

This research uses historical data compiled by Goh [6], consisting of 45 training data and 20 testing data of load test records. The data-driven models mentioned here were based on load test records for driven piles in clay from various sources, including Vijayvergiya [34], Flaate and Selnes [35], and Semple and Rigden [36]. The records were scaled down for laboratory conditions, which means the results should be applied only under similar conditions; actual field data may not always fall within the range used in this study. Thus, dimensional analysis and scaling effects must be considered to effectively apply these results in actual field practice.

Each load test record is described by several input variables: effective vertical stress (kPa), pile length (m), undrained shear strength (kPa), and pile diameter (cm). The output of the load test is the friction capacity (kPa). Statistical descriptions of the training and testing datasets of load test records are reported in Table 2.

In the training dataset, the pile length varied from 4.6 m to 96 m, while the diameter ranged from a minimum of 13.5 cm up to 76.7 cm. The effective vertical stress and undrained shear strength ranged from 19 to 718 kPa and from 10 to 335 kPa, respectively, and the friction capacity ranged from 8 to 192.1 kPa. In the testing dataset, pile length and pile diameter ranged from 8 to 66.4 m and from 11.4 to 61 cm, respectively. The undrained shear strength ranged from 9 to 185 kPa, and the effective vertical stress from 21 to 244 kPa. Finally, the measured friction capacity was between 9 and 88.8 kPa.

##### 3.2. Training and Testing Processes

The training process is essential to build the pile capacity prediction model from previous data, and the *k*-fold cross-validation method is used to make the model as accurate as possible. The training process of ST-LSSVM with the given training dataset is simulated 10 times based on cross-validation, with each of the 10 subsets used exactly once as a validating subset. After the training process finishes, the model is ready to predict a new, unseen testing dataset. The complete training result is provided in the Appendix.

Through trial and error, suitable parameter settings for ST-LSSVM were determined: (1) the maximum number of iterations is set to 100, (2) the population size is set to 30, and (3) the search range for the *γ* and *σ*^{2} parameters is varied from 10^{−10} to 10^{10} as suggested in [37]. Using the given training dataset, the training procedure begins with a random initial population of hyperparameters. In every iteration, ST-LSSVM simulates 10-fold cross-validation over the training and validating subsets and stores the average RMSE on the validating subsets across the folds as the fitness value. The fitness value starts from a high RMSE of 7.596 and iteratively decreases, converging to an RMSE of 6.831 as shown in Figure 3. Figure 4 shows the historical record of the hyperparameter selection process. The final set of parameters producing the lowest RMSE on the validation subsets is 81369676 for the *γ* value and 8622 for the *σ*^{2} value. Finally, the complete testing result is provided in the Appendix and evaluated using the three above-mentioned performance measures.

#### 4. Results and Discussions

For comparison purposes, this research applied other machine learning-based predictive methods which are LS-SVM and backpropagation neural network (BPNN). The setting of BPNN follows the default parameter setting of MATLAB neural network toolbox, and Levenberg–Marquardt is chosen to train the BPNN [38]. The hyperparameters of LS-SVM follow the default setting suggested in the publication of Suykens and Vandewalle [12]. Additionally, the previous result from the literature published by Goh [6] is collected for benchmarking the result obtained from the proposed method.

Figure 5 shows the obtained training and testing results of ST-LSSVM. The actual and predicted output of both training and testing phases showed a good fit to a straight line. The *R*-values for training and testing phases reported in this experiment also reflect the high accuracy and the superior performance of the trained ST-LSSVM model.

As mentioned earlier, to give a more complete evaluation of the performing methods, RMSE and MAE have also been utilized besides *R*. Table 3 compiles the prediction results of each method for further analysis. The results show that the ST-LSSVM model effectively facilitated constructing an optimized predictive model over the default LS-SVM method. By implementing the self-optimized framework, the testing performance of LS-SVM was improved by 0.079 in *R*, 5.2019 kPa in RMSE, and 3.0419 kPa in MAE. Additionally, the results of ST-LSSVM are also superior to those of BPNN in terms of *R*, RMSE, and MAE. When compared with the neural network (NN) previously published by Goh [6], ST-LSSVM performs relatively better: it achieves slightly better testing results than NN in two categories (*R* and MAE) while producing significantly better training results in all categories. This comprehensive evaluation confirms the capability of SOS and LS-SVM to accurately model the friction capacity based on load test records.

#### 5. Conclusions

In this study, a new method for predicting friction pile capacity has been established based on load test records. This research extends the current body of knowledge on the capability of LS-SVM for such predictions. Genuine load test records were collected, and the proposed model achieved accurate prediction results. The main purpose of this study was to investigate a hybrid computational intelligence system and its efficacy in optimizing the LS-SVM parameters to improve the accuracy of friction capacity forecasts for driven piles based on the load test records.

Further analysis of the results suggested that LS-SVM, combined with SOS, facilitates the construction of an optimized predictive model for friction capacity. Given that modeling friction pile capacity is complex and highly nonlinear, the obtained *R*, RMSE, and MAE values for both training and testing are highly satisfactory. The new method thus serves as a reliable and robust assistance tool for geotechnical engineers in estimating friction pile capacity.

#### Appendix

Prediction Results by ST-LSSVM for Training and Testing Data

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.