This study examines the potential of two soft computing techniques, namely, support vector machines (SVMs) and genetic programming (GP), to predict ultimate bearing capacity of cohesionless soils beneath shallow foundations. The width of footing (𝐵), depth of footing (𝐷), the length-to-width ratio (𝐿/𝐵) of footings, density of soil (𝛾 or ğ›¾î…ž), angle of internal friction (Φ), and so forth were used as model input parameters to predict ultimate bearing capacity (ğ‘žğ‘¢). The results of present models were compared with those obtained by three theoretical approaches, artificial neural networks (ANNs), and fuzzy inference system (FIS) reported in the literature. The statistical evaluation of results shows that the presently applied paradigms are better than the theoretical approaches and are competing well with the other soft computing techniques. The performance evaluation of GP model results based on multiple error criteria confirms that GP is very efficient in accurate prediction of ultimate bearing capacity cohesionless soils when compared with other models considered in this study.

1. Introduction

Design of foundations is performed based on two criteria: ultimate bearing capacity and limiting settlement. The ultimate bearing capacity is governed by shear strength of the soil and is estimated by theories proposed by Terzaghi [1], Meyerhof [2], Hansen [3], Vesic [4], and others. However, the different bearing capacity formulae shows wide degree of variability while estimating bearing capacity of dense sand on cohesionless soils. Also the bearing capacities are validated through laboratory studies performed on small-scale models. Due to the “scale effect” for the large-scale foundations on dense sand, shearing strain show that considerable variation along the slip line and the average mobilized angle of shearing resistance along the slip line are smaller than the maximum value (Φmax) obtained by plane shear tests [5]. Thus, the use of Φmax may lead to an overestimated bearing capacity value for the calculations based on different formulae [1–4].

In the recent past, the use of soft computing techniques has attracted many researchers and applied quite successfully for solving many complex geotechnical engineering problems. Artificial neural networks (ANNs) may probably be the most popular among these tools, applied for prediction of bearing capacity of cohesionless soils [5], bearing capacity of piles, settlement predictions, liquefaction, and slope stability problems [6]. Support vector machines (SVMs) are recent addition to the soft computing family that uses statistical learning theory as the working principle. SVM and its variants are applied for geotechnical problems such as prediction of pile load capacity [7], settlement of foundations [8], slope stability [9], and liquefaction potential [10].

The evolutionary computational techniques may be a better alternative for solving regression problems as they follow an optimization strategy with progressive improvement towards the global optima. They start with possible trial solutions within a decision space, and the search is guided by genetic operators and the principle of “survival of the fittest” [11]. Genetic Algorithm (GA) is one of the most popular and powerful evolutionary optimization technique [11] explored by [12], but it cannot be used to evolve complex models such as equations. This limitation is overcome by Genetic Programming (GP) introduced by Koza [13], which works on the principle of GA. GP writes expressions or computer programs instead of strings in GA. In this paper, SVM and GP are used as alternate paradigms to predict bearing capacity of cohesionless soils under shallow foundations.

2. Support Vector Machine

Support vector machine (SVM) is a relatively recent addition to the family of soft computing techniques evolved from the concept of statistical learning theory explored by Boser et al. [14]. SVM performs the regression by using a set of nonlinear functions that are defined in a high-dimensional space. SVM has been used to solve nonlinear regression problems by the principle of structural risk minimization (SRM), where the risk is measured using Vapnik’s accuracy intensive loss function (𝜀) [15]. SVM uses a risk function consisting of the empirical error and a regularization term. More details on SRM can be found in Cortes and Vapnik [16]. Considering a set of input-output pairs as training dataset, [(𝑥1,𝑦1),(𝑥2,𝑦2)⋯(𝑥𝑙,𝑦𝑙)]𝑥∈𝑅𝑁,𝑦∈𝑟, where 𝑥 is the input, 𝑦 is the output, 𝑅𝑁 is the 𝑁-dimensional vector space, and 𝑟 is the one-dimensional vector space. In this problem, the width of footing (𝐵), depth of footing (𝐷), the length-to-width ratio (𝐿/𝐵) of footings, density of soil (𝛾 or ğ›¾î…ž) angle of internal friction (Φ), and so forth were used as model input parameters to predict ultimate bearing capacity (ğ‘žğ‘¢). Hence, for this problem, 𝑥=[𝐵,𝐷,(𝐿/𝐵),𝛾,Φ] and 𝑦=[ğ‘žğ‘¢].

The intension of SVM is to fit a function that can approximately predict the value of output on supplying a new set of predictors (input variables).

The 𝜀-intensive loss function can be described as follows:𝐿𝜀||||(𝑦)=0for𝑓(𝑥)−𝑦≤𝜀,(1) otherwise,𝐿𝜀||||(𝑦)=𝑓(𝑥)−𝑦−𝜀.(2) This defines an 𝜀-tube so that if the predicted value is within the tube, the loss is zero; otherwise the loss is equal to the absolute value of the deviation minus 𝜀. This concept is depicted in Figure 1.

SVM attempts to find that a function 𝑓(𝑥) that gives the deviation of “𝜀” from the actual output is as flat as possible.

Consider a linear function of the form,𝑓(𝑥)=(𝑤⋅𝑥)+𝑏,𝑤∈𝑅𝑁,𝑏∈𝑟,(3) where 𝑤 is an adjustable weight vector and 𝑏 is the scalar threshold. Fitness means the search for a small value of “𝑤”. It can be represented as a minimization problem with an objective function comprising the Euclidian norm as follows: 1Minimize∶2‖𝑤‖2Subjectto𝑦𝑖−𝑤⋅𝑥𝑖+𝑏≤𝜀,𝑖=1,2,3,…,𝑙,𝑤⋅𝑥𝑖+𝑏−𝑦𝑖≤𝜀,𝑖=1,2,3,…,𝑙.(4) Some allowance for errors (𝜀) may also be introduced. Two slack parameters 𝜉 and 𝜉∗ have been introduced to penalize the samples with error more than “𝜀”. Thus the infeasible constraints of the optimization problem are eliminated. The modified formulation takes the following form:1Minimize∶2‖𝑤‖2+𝐶𝑙𝑖=1𝜉𝑖+𝜉𝑖∗Subjectto𝑦𝑖−𝑤⋅𝑥𝑖+𝑏≤𝜀+𝜉𝑖,𝑖=1,2,3,…,𝑙,𝑤⋅𝑥𝑖+𝑏−𝑦𝑖≤𝜀+𝜉𝑖∗𝜉,𝑖=1,2,3,…,𝑙.𝑖≥0,𝜉𝑖∗≥0,𝑖=1,2,3,…,𝑙.(5) The constant 0<𝐶<∞ determines the tradeoff between the flatness of 𝑓(𝑥) and the amount up to which the deviations larger than “𝜀” are tolerated [17]. The above optimization problem is solved by Vapnik [15] using Lagrange multiplier method. The solution is given by𝑓(𝑥)=𝑀𝑖=1𝛼𝑖−𝛼𝑖∗𝑥𝑖1⋅𝑥+𝑏,where𝑏=−2𝑥𝑤⋅𝑟+𝑥𝑠,(6) where 𝑥𝑠 and 𝑥𝑟 are known as support vectors and 𝑀 is the number of support vectors.

Some Lagrange multipliers (𝛼𝑖,𝛼𝑖∗) will be zero, which implies that these training solutions are irrelevant to the final solution (known as sparseness of the solution). The training objects with nonzero Lagrange multipliers are called support vectors. When linear regression is not appropriate, input data have to be mapped into a high-dimensional feature space through nonlinear mapping and the linear regression needs to be performed in the high-dimensional feature space [14]. Kernel function 𝐾 is used to transform nonlinear data from the input to the feature space in linear form. Then linear fitting in new space will be equal to nonlinear fitting in original space:𝐾𝑥𝑖,𝑥𝑗𝑥=𝜙𝑖𝑥⋅𝜙𝑗,(7) where 𝐾 is the kernel function, 𝑥𝑖 and 𝑥𝑗 are inputs, and 𝜙(𝑥𝑖)⋅𝜙(𝑥𝑗) is the dot product in the high-dimensional space.

Thus, (6) can be replaced by𝑓(𝑥)=𝑀𝑖=1𝛼𝑖−𝛼𝑖∗𝐾𝑥𝑖⋅𝑥𝑗+𝑏.(8) The concept of nonlinear mapping is depicted in Figure 2.

The functions which satisfy Mercer’s theorem can be used for fitting the data [14]. Polynomial functions, radial basis function (RBF), and splines are the most commonly used kernel functions for data fitting using SVM. The mathematical forms of some popular kernel functions can be found in [18].

3. Genetic Programming

Genetic Programming (GP) is an automatic programming technique for evolving computer programs to solve, or approximately solve, problems introduced by Koza [13]. GP is basically an optimization paradigm that can also be effectively applied to the genetic symbolic regression (GSR). GSR involves finding a mathematical expression in symbolic form relating finite values of set of independent variables (𝑥𝑖) and a set of dependent variables (𝑦𝑖) [19]. GP works on Darwin’s natural selection theory in evolution. Here, a population is progressively improved by selectively discarding the not-so-fit population and breeding new children to form better populations. Like other evolutionary algorithms, the solution is started with a random population of individuals (equations or computer programs). Each possible solution set can be visualized as a “parse tree” comprising the terminal set (input variables) and functions (general operators such as +, −, *, /, logarithmic or trigonometric). The “fitness” is a measure of how closely a trial solution solves the problem. The objective function—the minimization of error between estimated and observed values—is the fitness function. The solution set in a population associated with the “best fit” individuals will be reproduced more often than the less fit solution sets. It iteratively transforms a population of computer programs into a new generation of programs by applying analogs to naturally occurring genetic operators like reproduction, mutation, and crossover. The different genetic operations can be found in detail in [13]. The basic procedure of GP is presented as a flow chart in Figure 3.

In the recent past, GP is effectively applied to solve a wide range of geotechnical engineering problems [20–22]. GP can evolve an explicit equation or equivalent computer program relating the input and output variables which is a more understandable depiction of the cause-effect process. Some literature suggests that the program-based GP approach (i.e., the GP algorithms which give a computer program which helps for estimating the predictant value for a given set of predictors) can perform equally well with an equation-based approach and other soft computing tools like ANN [23–26]. A program-based GP approach is adopted for the present study.

4. Model Development and Results

The primary step in model development for the estimation of bearing capacity of cohesionless soils underneath shallow foundations is identification of parameters that affect the bearing capacity. The basic form of equation for bearing capacity of cohesionless soil is [5]ğ‘žğ‘¢=ğ›¾ğ·ğ‘ğ‘žğ‘†ğ‘žğ‘‘ğ‘ž+0.5𝐵𝛾𝑁𝛾𝑆𝛾𝑑𝛾,(9) where 𝐵 is the width of foundation, 𝐷 is the depth of foundation, 𝛾 is the unit weight of sand, ğ‘ğ‘ž, 𝑁𝛾 is the bearing capacity factors, ğ‘†ğ‘ž, 𝑆𝛾 is the shape factors, and ğ‘‘ğ‘ž, 𝑑𝛾 the depth factors. These factors primarily depend on the angle of shearing resistance, unit weight of the sand, and the geometry of the foundation.

The main factors affecting the bearing capacity are its width (least lateral dimension, 𝐵), length of footing (𝐿), shape (square, rectangular, and circular), and depth of embedment (𝐷). The depth of foundation has the greatest effect on the bearing capacity of all the physical properties of the foundation. There are some other factors such as compressibility and thickness of the soil layer beneath the foundation that contribute to a lesser degree [5]. The effect of compressibility is small, except for loose densities, and is generally less important in bearing capacity computation [5]. Moreover, there are insufficient data to consider compressibility as well as thickness of soil stratum. Therefore, they are not considered in this study.

4.1. Database

The data used in the present study has been adopted from Padmini et al. [5]. The five input parameters used for the model development in this study are width of footing (𝐵), depth of footing (𝐷), footing geometry (𝐿/𝐵), unit weight of sand (𝛾), and angle of shearing resistance (Φ). Ultimate bearing capacity (ğ‘žğ‘¢) is the single output. The data thus compiled comprises a total of 97 data sets, which consists of results of load test data of square, rectangular, and strip footings of different sizes tested in sand beds of various densities. Out of the total 97 sets of data, 78 are used for training and 19 are used for validation in all the experiments considered in this study. The data division is done in such a way that the same 19 sets of data used by Padmini et al. [5] are kept as the validation dataset to enable a comparison of results of the present study with those obtained by ANN and FIS by Padmini et al. [5].

4.2. Development of SVM Model

The data mining software WEKA 3.6.1 proposed by Witten and Frank [27] is used for developing SVM model. In this study an 𝜀-variant of SVM (𝜀-SVM) is used for support vector regression, and the loss function (𝜀) is fixed as 0.001. Initially, a polynomial kernel of degree (𝑑) 2 is used to fit a nonlinear model. The selection of regularization parameter 𝐶 and kernel-specific parameters (𝑑 and ğœŽ for polynomial and RBF kernel, resp.) may influence the results. A large value of 𝐶 indicates that the objective function is only to minimize the empirical risk, which makes the learning machine more complex. On the other hand, a smaller 𝐶 may cause learning errors with poor approximation [28].

A trial and error approach is followed to find the optimal value of 𝐶 for model with polynomial kernel. The 𝐶 parameter of 100 is found to be quite successful in giving satisfactory performance. Then a radial basis function (RBF) kernel is used to fit a nonlinear model in the present study to build an SVM model. The combination of control parameters such as 𝐶=250 and ğœŽ=3 gives very good training performance. The plot between observed and predicted values of training dataset is shown in Figure 4 (polykernel) and Figure 5 (RBF kernel). These plots indicate that the model is well trained.

4.3. Development of GP Model

The genetic programming software DISCIPULUS [29] is used for developing GP model. The models are created in the form of “evolved” computer programs as GP uses Darwinian natural selection to create them. Using this model, the output of statistically similar input data can be predicted with very much accuracy. The initial control parameters used for the problem are population size: 500, crossover probability: 0.95, and mutation probability: 0.5. The basic arithmetical functions (such as addition, multiplication, subtraction, and division (+, *, −, /)) constitute the function set. The fitness function is selected as the root-mean-square error between the measure and predicted values of ultimate bearing capacity. The best program generated by GP software for predicting the UBC of cohesionless soils is given in the appendix. The plot between observed and predicted values of training dataset is shown in Figure 6. This plot indicates that the model is well trained.

5. Results and Discussions

The efficiency of the developed models is analyzed by different statistical performance evaluation criteria such as correlation coefficient (𝑅), coefficient of efficiency (𝐸), root-mean-square error (RMSE), mean bias error (MBE), and mean absolute relative error (MARE). The equations of different performance evaluation measures were presented in Table 1, in which 𝑦𝑜 stands for the observed output value, 𝑦𝑐 represents the computed output value, 𝑦𝑜 is the mean of observed values, 𝑦𝑐 represents the mean of computed values, and 𝑛 represents the number of data points. The different performance evaluation criteria estimated for training dataset are presented in Table 2. The predictions for testing dataset using different models are presented in Table 3, and the performance evaluation for these predictions is presented in Table 4. However, it is to be noted that the ANN and FIS results presented in Table 4 are deduced based on the relative error (RE) values reported by Padmini et al. [5]. From Table 4 it can be inferred that the correlation coefficient and coefficient of efficiency are the highest (0.997 and 0.996) and the error criteria such as RMSE, MBE, and MARE are the least (44.967, 4.01, and 7.69) for the GP-based modeling.

Further the scatter plots between observed and predicted values of UBC for SVM models are presented in Figure 7 (polykernel) and Figure 8 (RBF kernel). The 5% error bar lines are plotted along with these scatter plots. Such a plot can be used to indicate the range of standard deviation and to determine whether the differences are statistically significant [8]. A perusal of plots shows that, for GP-based predictions, all points lie within the specified confidence interval of 95%. Thus, it can be inferred that all the soft computing methods outperform the theoretical approaches in the prediction of bearing capacity. Similar plot for predictions with GP model is presented in Figure 9. Also from Table 4, it is seen that the 𝑅 value and 𝐸 value are closer to unity and different error criteria are much lesser for any of the applied soft computing tools when compared with theoretical models.

A statistical evaluation of the predictions by the different soft computing models for the testing dataset is performed and presented in Table 5. The standard deviation, average deviation, and coefficient of variation values of GP model results (607.91, 454.59, and 1.059) show close agreement with that of observed values (600.02, 459.48, and 1.052) followed by that of SVM (RBF) model. This confirms the robustness of the newly applied paradigms.

The different performance evaluation measures of SVM-based modelling (in Tables 4 and 5) show that the performance of RBF-based SVM is competent with ANN and FIS results. Also SVM involves only lesser number of control parameters (such as 𝐶 and ğœŽ), and ANN involves large number of such parameters and their optimal combination is a tedious process. Thus, the SVM approach is quite simple to implement. Further, the performance evaluation of GP-based results (Tables 4 and 5) shows that the 𝑅, 𝐸 and different error criteria are better for the GP model when compared with the theoretical methods, the SVM, and interpreted results of ANN and FIS. Thus, GP is proven to be a reliable alternative soft computing technique for prediction of ultimate bearing capacity of shallow foundation on cohesionless soil.

6. Conclusions

In this paper the application of two relatively recent soft computing techniques—SVM and GP—is investigated for the prediction of ultimate bearing capacity of cohesionless soils beneath shallow foundations. SVM results are competent and demand the optimal selection of only a few number of control parameters when compared with ANN. Performance evaluation based on multiple error criteria shows that error is the least and correlation coefficient (𝑅) and coefficient of efficiency (𝐸) are the highest for the GP-based modeling than SVMs, ANN, FIS, and the different theoretical models considered in this study. The GP-based modeling is found to be superior in terms of quality, and it gives the output in the form of computer programs which enables the user to apply for a new set of input data to predict the ultimate bearing capacity. Thus, GP can be recommended as a robust soft computing paradigm to predict the ultimate bearing capacity of soil.


A. Note

The C++ Program to predict the ultimate bearing capacity of cohesionless soils is given here. V[0] to V[4] represent the input parameters width of footing (𝐵), the depth of footing (𝐷), the length-to-width ratio (𝐿/𝐵), the field density (𝛾), the angle of shearing resistance (Φ). f[0], f[1], and so forth, are the temporary computation variables that the programs GP software creates. The output of these programs is the value remaining in f[0] after the program executes. This program needs to be run in the DISCIPULUS software environment to get the predictant value for a new set of predictors.

B. Best Program

#define TRUNC(x)(((x)>=0) ? floor(x): ceil(x))
#define C_FPREM (_finite(f[0]/f[1]) ? f[0]-(TRUNC(f[0]/f[1])
 *f[1]): f[0]/f[1])
#define C_F2XM1 (((fabs(f[0])<=1) &&
 (!_isnan(f[0]))) ? (pow(2,f[0])-1):
 ((!_finite(f[0]) && !_isnan(f[0]) &&
 (f[0]<0)) ? -1: f[0]))
float DiscipulusCFunction(float v[])
   long double f[8];
   long double tmp = 0;
   int cflag = 0;
   L0: f[0]/=-1.364008665084839f;
   L1: f[0]+=f[1];
   L2: f[0]=−f[0];
   L3: f[0]−=v[0];
   L4: f[0]+=v[4];
   L5: f[0]+=v[4];
   L6: f[0]=cos(f[0]);
   L7: f[0]+=f[0];
   L8: f[0]+=f[0];
   L9: f[0]+=f[0];
   L10: f[0]*=v[1];
   L11: f[0]+=v[4];
   L12: f[0]−=1.252994060516357f;
   L13: f[0]*=pow(2,TRUNC(f[1]));
   L14: cflag=(f[0] < f[1]);
   L15: f[0]=sqrt(f[0]);
   L16: f[0]*=0.2877938747406006f;
   L17: tmp=f[1]; f[1]=f[0]; f[0]=tmp;
   L18: f[0]−=v[3];
   L19: f[0]*=−0.494312047958374f;
   L20: f[0]*=0.7790718078613281f;
   L21: f[0]*=f[1];
   L22: f[0]=fabs(f[0]);
   L23: f[0]=cos(f[0]);
   L24: f[0]=−f[0];
   L25: f[0]=sqrt(f[0]);
   L26: if (cflag) f[0] = f[1];
   L27: f[0]+=v[4];
   L28: f[0]*=0.9955191612243652f;
   L29: f[0]*=0.4281637668609619f;
   L30: f[0]−=f[0];
   L31: f[0]−=v[3];
   L32: f[0]/=v[0];
   L33: f[0]−=0.9955191612243652f;
   L34: f[0]+=v[4];
   L35: cflag=(f[0] < f[1]);
   L36: f[0]−=f[1];
   L37: f[0]/=f[0];
   L38: f[0]*=pow(2,TRUNC(f[1]));
   L39: f[0]−=f[1];
   L40: f[0]=−f[0];
   L41: f[0]=fabs(f[0]);
   L42: f[0]=−f[0];
   L43: f[0]*=0.4281637668609619f;
   L44: f[0]*=f[0];
   L45: f[0]*=f[0];
   L46: f[0]/=f[0];
   L47: f[0]*=−0.7297487258911133f;
   L48: f[0]/=1.084159851074219f;
   L49: f[0]+=v[4];
   L50: tmp=f[0]; f[0]=f[0]; f[0]=tmp;
   L51: f[0]/=f[1];
   L52: f[0]+=0.7790718078613281f;
   L53: f[0]+=v[4];
   L54: f[0]−=f[1];
   L55: f[0]*=f[1];
   L56: f[0]*=0.4281637668609619f;
   L57: f[0]−=v[3];
   L58: f[0]−=−0.9486191272735596f;
   L59: if (cflag) f[0] = f[1];
   L60: f[0]/=v[2];
   L61: f[0]+=v[4];
   L62: f[0]−=v[2];
   L63: f[0]−=v[2];
   L64: f[0]−=v[2];
   L65: f[0]*=v[1];
   L66: f[0]+=v[4];
   L67: f[0]*=f[1];
   L68: f[0]*=0.4281637668609619f;
   L69: f[0]−=v[3];
   L70: f[1]*=f[0];
   L71: f[0]*=f[0];
   L72: f[0]−=f[1];
   L73: f[1]+=f[0];
   L74: if (!cflag) f[0] = f[1];
   L75: f[0]−=1.987620830535889f;
   L76: f[0]−=v[4];
   L77: f[0]−=1.987620830535889f;
   L78: f[0]−=v[4];
   L79: f[0]−=v[4];
   L80: f[0]−=0.7361507415771484f;
   L81: f[0]−=1.987620830535889f;
   L82: f[0]−=v[4];
  L83: f[0]−=1.987620830535889f;
   L84: f[0]−=v[4];
   L85: f[0]−=1.501374244689941f;
   L86: f[0]+=v[2];
   L87: f[0]−=v[4];
  L88: f[0]−=1.530829906463623f;
   L89: if (!cflag) f[0] = f[1];
   L90: f[0]+=v[3];
   L91: f[0]−=v[4];
   L92: f[0]+=−1.907608032226563f;
   L93: if (!cflag) f[0] = f[1];
   L94: f[0]+=v[3];
   L95: f[0]+=v[3];
   L96: f[0]+=v[3];
   L97: f[0]+=v[3];
   L98: f[0]+=v[3];
   L99: f[0]+=v[3];
   L100: f[0]+=v[3];
   L101: f[0]+=v[3];
   L102: f[0]+=v[3];
   L103: f[0]+=v[3];
   L104: f[0]+=v[3];
   L105: f[0]+=v[3];
   L106: f[0]+=v[3];
   L107: f[0]+=v[3];
   L108: f[0]+=v[2];
   L109: f[0]+=v[3];
   if (!_finite(f[0])) f[0]=0;
   return f[0];


This paper is a part of a research work carried out at the Department of Civil Engineering, TKM College of Engineering Kollam, Kerala, India, in 2010. The authors thank the Department of Civil Engineering, TKM College of Engineering Kollam for providing all necessary help. They also thank the anonymous reviewer/s who helped to improve the quality of the paper.