Abstract

The main objective of site characterization is to predict in situ soil properties at any half-space point at a site from a limited number of tests. In this study, the support vector machine (SVM) is used to develop a three-dimensional site characterization model for Bangalore, India, based on a large number of standard penetration test (SPT) results. SVM is a novel type of learning machine based on statistical learning theory; it performs regression by introducing an $\varepsilon$-insensitive loss function. The database consists of 766 boreholes with more than 2700 field SPT values ($N$) spread over a 220 sq km area of Bangalore. The model is applied to the corrected $N$ values ($N_c$). The three input variables of the SVM model are $x$, $y$, and $z$, the coordinates of a point in Bangalore, and the output is $N_c$. The results presented in this paper show that the SVM is a robust tool for site characterization. A sensitivity analysis of the SVM parameters ($\sigma$, $C$, and $\varepsilon$) is also presented.

1. Introduction

In general, geotechnical engineers characterize a site based on a limited number of tests. They interpret a site in terms of working soil profiles, which are generally prepared from soil properties. They therefore have to predict in situ soil properties at any half-space point at a site from a limited number of tests. The prediction of soil properties is a difficult task because of uncertainty. Spatial variability, measurement “noise,” measurement and model bias, and statistical error due to limited measurements are the sources of uncertainty [1]. Prediction of soil properties using geostatistics has been reported by many researchers [2–6]. However, several reasons appear to hinder the use of geostatistics in geotechnical engineering [7]. In probabilistic site characterization, random field theory has been used by many researchers in geotechnical engineering [8–19]. One of the most important assumptions of random field theory is that the soil property is statistically homogeneous within the chosen layer. In addition, these models assume that the soil property consists of a constant mean or a global mean trend with a stationary stochastic portion. To model the constant mean or global mean trend, the above researchers used regression analysis with polynomial functions. The autocovariance function, autocorrelation function, autoregressive processes, power spectra functions, variance function, and scale of fluctuation are available for modelling the stationary stochastic portion. Statistically homogeneous soil layers have been determined by using “Modified Bartlett Statistics” [20]. However, random field methods and geostatistics have been applied in site characterization modelling with limited success [21]. Recently, artificial neural networks (ANNs) have been used for site characterization [21]. A major disadvantage of ANN models is that they give no information about the relative importance of the various parameters [22]. 
In an ANN, the knowledge acquired during training is stored implicitly, so it is very difficult to arrive at a reasonable interpretation of the overall structure of the network [23]. This leads to the term “black box,” which many researchers use when referring to ANN behavior. In addition, ANNs have some inherent drawbacks such as slow convergence, poorer generalization performance, convergence to local minima, and overfitting.

The support vector machine (SVM), based on statistical learning theory, was developed by Vapnik (1995) [24]. It provides an efficient approach that improves generalization performance and can attain a global minimum. In general, SVMs have been used for pattern recognition problems. More recently, they have been applied to nonlinear regression estimation and time series prediction by introducing an $\varepsilon$-insensitive loss function [25–27]. The SVM implements the structural risk minimization principle (SRMP), which has been shown to be superior to the more traditional empirical risk minimization principle (ERMP) employed by many other modelling techniques [28, 29]. SRMP minimizes an upper bound of the generalization error, whereas ERMP minimizes the training error. In this way, SVM generalizes better than traditional techniques.

The standard penetration test (SPT) is a well-established and unsophisticated method of soil testing, which was developed in the United States around 1925. Despite its limitations, it has become the most popular field testing method for characterizing subsurface soil profiles. The field SPT value ($N$) is used to determine bearing capacity, settlement, and liquefaction potential, and it is also correlated with many soil properties such as shear wave velocity, angle of internal friction, and cone tip resistance. The objective of this paper is to use SVM to develop a three-dimensional (3D) site characterization model for Bangalore, India, based on a large number of $N_c$ values in this area. Further, a sensitivity analysis of the SVM parameters ($\sigma$, $C$, and $\varepsilon$) has been carried out, and the results are presented to highlight their influence on the predictions.

2. Site Description

The city of Bangalore covers an area of over 220 square kilometres, and the ground reduced level (GRL) varies considerably across the city, from 810 m in the north-eastern part to 940 m in the south-western part. Ground reduced levels do not vary much in the other parts of the city. There were once more than 450 lakes, of which more than 340 have dried up due to erosion and encroachment for the construction of layouts and buildings. The population of the greater Bangalore region is over 6 million, and it is the fifth biggest city in India. It is growing very fast and is situated at a latitude of 12°58′ north and a longitude of 77°37′ east.

Geologically, most of Bangalore falls in gneiss complexes, which formed during several tectonic-thermal events with a large influx of sialic material. These events are believed to have occurred between 3400 and 3000 million years ago, giving rise to an extensive group of gray gneisses designated as the “older gneiss complex.” These gneisses act as the basement for a widespread belt of schists. The younger group of gneissic rocks, mostly of granodiorite and granite composition, is found in the eastern part of the state and represents remobilized parts of an older crust with abundant additions of newer granite material, for which the name “younger gneiss complex” has been given [31]. The soil is mostly a residual soil derived from granite gneiss by weathering. In the old tank beds, silty sand/clay is also found as overburden.

3. Geographic Information System (GIS) Model and Geotechnical Data

The Bangalore map forms the base layer for the development of the GIS model (see Figure 1). The map entities have been developed with two aims: first, to locate the borelogs as accurately as possible on a scale of 1 : 20000 and, second, to allow the end user to identify the borelogs. The digitized map has several layers of information. Some of the important layers considered are the boundaries (outer and administrative), highways, major roads, minor roads, streets, railroads, water bodies, drains, ground contours, and borehole locations. A large amount of geotechnical data consisting of 766 boreholes has been collated, along with index and engineering properties of the subsoil layers at different locations in Bangalore (the locations of the boreholes are shown in Figure 1). The geotechnical data come from geotechnical investigations carried out for several major projects in Bangalore. In total, information from 766 borelogs has been entered into the database using a GIS with the ARCINFO package. The latitudes and longitudes were confirmed using global positioning system (GPS) stations at selected locations. In total, 2722 $N$ values are available in the 766 boreholes in the three-dimensional GIS model. The distribution of the collected boreholes in Bangalore is shown in Figure 2, which indicates a very good distribution of boreholes in each quadrant of Bangalore from the city center. Figure 1 depicts a grid of 1 km × 1 km within the corporate boundary of Bangalore, along with the outer boundary circumscribing the ring road and the locations of the boreholes. It gives a clear view of the spatial distribution of boreholes in the Bangalore region. On average, data from about four boreholes are available within each 1 km × 1 km grid cell.

Geotechnical data were collated from the archives of the Torsteel Research Foundation in India and the Indian Institute of Science for geotechnical investigations carried out for several major projects in Bangalore. The data, collected for important projects in Bangalore during the years 1995–2003, are of very high quality. The data in the model extend, on average, to a depth of 30 m below ground level. The borelogs contain information about depth, soil density, total stress, effective stress, fines content, $N$ values, and depth of the ground water table. For the purpose of general identification of soil layers, the Bangalore map area is divided into four parts (four quadrants) in the north-south and east-west directions, as shown in Figure 2. The typical soil profile in the north-western part of Bangalore has three layers. The first layer contains brownish silty sand with clay, or red soil in some locations, up to 3 m; from 3 m to 6 m, medium dense to very dense silty sand is present. The third layer is weathered rock from 6 m to 17 m depth, followed by hard rock. The south-western part contains red soil or reddish silty sand with gravel up to 1.7 m depth, yellowish clayey sand from 1.7 m to 3.5 m, yellowish silty sand with clay from 3.5 m to 8.5 m, and hard rock below 8.5 m. The soil in the south-eastern part can be classified into 4 layers: brownish clayey sand up to 1.5 m, brownish clayey sand with gravel from 1.5 m to 4 m, yellowish silty sand with gravel up to 5.5 m, and weathered rock at different stages from 5.5 m to 17.5 m, with hard rock beneath. The north-eastern part has 4 layers: filled-up soil to 1.5 m, reddish silty clay from 1.5 m to 4.5 m, sandy clay up to 7.5 m, and weathered rock from 7.5 m to 18.5 m, with hard rock below. 
The corrections for field $N$ values (shown in Tables 1 and 2) are applied for overburden pressure ($C_N$), hammer energy ($C_E$), borehole diameter ($C_B$), presence or absence of liner ($C_S$), rod length ($C_R$), and fines content ($C_{\text{fines}}$) as per standard procedures existing in the literature [32–37].

4. Support Vector Machine Model

SVM originated from the concept of statistical learning theory pioneered by Boser et al. (1992) [38]. This section briefly introduces the construction of SVM for regression problems. SVM has three distinct characteristics when it is used to estimate a regression function. First, SVM estimates the regression using a set of linear functions defined in a high-dimensional space. Second, SVM carries out the regression estimation by risk minimization, where the risk is measured using Vapnik’s $\varepsilon$-insensitive loss function. Third, SVM uses a risk function consisting of the empirical error and a regularization term derived from the SRMP. This study uses the SVM as a regression technique by introducing an $\varepsilon$-insensitive loss function. The $\varepsilon$-insensitive loss function ($L_\varepsilon(y)$) can be described as follows:

$$L_\varepsilon(y) = \begin{cases} 0, & \text{for } |f(x) - y| < \varepsilon, \\ |f(x) - y| - \varepsilon, & \text{otherwise.} \end{cases} \tag{1}$$

This defines an $\varepsilon$ tube (Figure 3): if the predicted value is within the tube, the loss is zero, while if the predicted point is outside the tube, the loss is the amount by which the magnitude of the difference between the predicted and actual values exceeds the radius, $\varepsilon$, of the tube. Assume that the training dataset consists of $l$ training samples $\{(x_1, y_1), \ldots, (x_l, y_l)\}$, where $x$ is the input and $y$ is the output. For the site characterization model for Bangalore, the input is the coordinate vector $[x, y, z]$ and the output is $N_c$.
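As a minimal sketch of the loss in (1), the function below evaluates the $\varepsilon$-insensitive loss for a single prediction. The function name and the numerical values are illustrative assumptions and are not taken from the Bangalore database.

```python
def eps_insensitive_loss(predicted, actual, eps):
    """Return 0 inside the epsilon tube, otherwise the excess |f(x) - y| - eps."""
    residual = abs(predicted - actual)
    return 0.0 if residual < eps else residual - eps


# Illustrative normalized values only (hypothetical, not from the study).
inside = eps_insensitive_loss(0.500, 0.501, eps=0.002)   # residual 0.001 lies inside the tube
outside = eps_insensitive_loss(0.500, 0.510, eps=0.002)  # residual 0.010 lies outside the tube
```

A prediction inside the tube contributes nothing to the empirical risk, which is why only samples on or outside the tube end up as support vectors.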

The main aim in SVM is to find a function $f(x)$ that deviates from the actual output by at most $\varepsilon$ and at the same time is as flat as possible. Let us assume a linear function

$$f(x) = (w \cdot x) + b, \quad w \in R^n, \; b \in r, \tag{2}$$

where $w$ is an adjustable weight vector, $b$ is the scalar threshold, $R^n$ is the $n$-dimensional vector space, and $r$ is the one-dimensional vector space.

Flatness in the case of (2) means that one seeks a small $w$. One way of obtaining this is by minimizing the squared Euclidean norm $\|w\|^2$. This is equivalent to the following convex optimization problem:

$$\begin{aligned} \text{minimize:} \quad & \frac{1}{2}\|w\|^2 \\ \text{subject to:} \quad & y_i - \left(\langle w \cdot x_i \rangle + b\right) \le \varepsilon, \quad i = 1, 2, \ldots, l, \\ & \left(\langle w \cdot x_i \rangle + b\right) - y_i \le \varepsilon, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{3}$$

This formulation assumes that the convex optimization problem is feasible. Sometimes, however, this may not be the case, or we may also want to allow for some errors, analogously to the “soft margin” loss function [39], which was used in SVM by Cortes and Vapnik (1995) [40]. As shown in Figure 3, the parameters $\xi_i, \xi_i^*$ are slack variables that determine the degree to which samples with error greater than $\varepsilon$ are penalized. In other words, any error smaller than $\varepsilon$ does not require $\xi_i, \xi_i^*$ and hence does not enter the objective function, because these data points have a value of zero for the loss function. The slack variables ($\xi_i, \xi_i^*$) are introduced to avoid infeasible constraints in the optimization problem (3):

$$\begin{aligned} \text{minimize:} \quad & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} \left(\xi_i + \xi_i^*\right) \\ \text{subject to:} \quad & y_i - \left(\langle w \cdot x_i \rangle + b\right) \le \varepsilon + \xi_i, \quad i = 1, 2, \ldots, l, \\ & \left(\langle w \cdot x_i \rangle + b\right) - y_i \le \varepsilon + \xi_i^*, \quad i = 1, 2, \ldots, l, \\ & \xi_i \ge 0, \; \xi_i^* \ge 0, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{4}$$

The constant $0 < C < \infty$ determines the trade-off between the flatness of $f$ and the amount up to which deviations larger than $\varepsilon$ are tolerated [41]. The optimization problem (4) is solved by Lagrangian multipliers [42], and its solution is given by

$$f(x) = \sum_{i=1}^{\text{nsv}} \left(\alpha_i - \alpha_i^*\right) \left(x_i \cdot x\right) + b, \tag{5}$$

where $b = -(1/2)\, w \cdot [x_r + x_s]$, $\alpha_i, \alpha_i^*$ are the Lagrangian multipliers, and nsv is the number of support vectors. An important aspect is that some Lagrange multipliers ($\alpha_i, \alpha_i^*$) will be zero, implying that the corresponding training objects are irrelevant for the final solution (sparseness). 
The training objects with nonzero Lagrange multipliers are called support vectors.
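The step from (4) to (5) can be sketched as follows. This is the standard textbook argument for $\varepsilon$-insensitive regression rather than a derivation given by the author; the multipliers $\eta_i, \eta_i^*$ attached to the constraints $\xi_i \ge 0$, $\xi_i^* \ge 0$ are notation introduced here for illustration.

```latex
% Lagrangian of problem (4), with multipliers \alpha_i, \alpha_i^*, \eta_i, \eta_i^* \ge 0
\mathcal{L} = \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}(\xi_i + \xi_i^*)
  - \sum_{i=1}^{l}(\eta_i \xi_i + \eta_i^* \xi_i^*)
  - \sum_{i=1}^{l}\alpha_i\bigl(\varepsilon + \xi_i - y_i + \langle w \cdot x_i\rangle + b\bigr)
  - \sum_{i=1}^{l}\alpha_i^*\bigl(\varepsilon + \xi_i^* + y_i - \langle w \cdot x_i\rangle - b\bigr)

% Stationarity with respect to w, b, and the slack variables
\frac{\partial \mathcal{L}}{\partial w} = 0
  \;\Rightarrow\; w = \sum_{i=1}^{l}(\alpha_i - \alpha_i^*)\,x_i ,
\qquad
\frac{\partial \mathcal{L}}{\partial b} = 0
  \;\Rightarrow\; \sum_{i=1}^{l}(\alpha_i - \alpha_i^*) = 0 ,
\qquad
\frac{\partial \mathcal{L}}{\partial \xi_i} = 0
  \;\Rightarrow\; 0 \le \alpha_i \le C

% Substituting w into (2) gives the expansion in (5)
f(x) = \sum_{i=1}^{l}(\alpha_i - \alpha_i^*)\,(x_i \cdot x) + b
```

Because terms with $\alpha_i = \alpha_i^* = 0$ vanish, the sum effectively runs only over the nsv support vectors, which is the sparseness property noted above.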

When linear regression is not appropriate, the input data have to be mapped into a high-dimensional feature space through some nonlinear mapping [38] (see Figure 4). Two steps are involved: first, a fixed nonlinear mapping of the data onto the feature space and, second, a linear regression in the high-dimensional space. The input data are mapped onto the feature space by a map $\Phi$ (see Figure 4). The dot product given by $\Phi(x_i) \cdot \Phi(x_j)$ is computed as a linear combination of the training points. The concept of the kernel function [$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$] has been introduced to reduce the computational demand [40, 43]. Equation (5) can therefore be written as

$$f(x) = \sum_{i=1}^{\text{nsv}} \left(\alpha_i - \alpha_i^*\right) K\left(x_i, x\right) + b. \tag{6}$$

In this study, the radial basis function has been used as the kernel function.
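The kernel expansion in (6) can be illustrated with a short sketch. The paper does not state the exact form of its radial basis function, so the common parameterization $K(u, v) = \exp(-\|u - v\|^2 / (2\sigma^2))$ is assumed here; the support vectors, coefficients, and threshold below are hypothetical values chosen only to exercise the code.

```python
import math


def rbf_kernel(u, v, sigma):
    """Gaussian RBF kernel, assuming the form exp(-||u - v||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b, as in (6).

    dual_coefs[i] holds the difference (alpha_i - alpha_i*) for support vector i.
    """
    return b + sum(
        coef * rbf_kernel(sv, x, sigma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


# Hypothetical support vectors in normalized (x, y, z) coordinates.
support_vectors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
dual_coefs = [0.5, -0.25]
prediction = svr_predict((0.0, 0.0, 0.0), support_vectors, dual_coefs, b=0.1, sigma=1.0)
```

Only kernel evaluations between the query point and the support vectors are needed, so the mapping $\Phi$ never has to be computed explicitly.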

5. SVM Implementation for Site Characterization Model

Figure 5 shows the architecture of the SVM for $N_c$ prediction in the 3D subsurface of Bangalore. Each of the input variables ($x$, $y$, and $z$) is first normalized with respect to its maximum value. The output variable $N_c$ is also normalized with respect to the maximum $N_c$ value. For implementing the SVM, the data have been divided into two subsets:
(1) a training dataset, which is required to train the model. In this study, 90% of the 766 boreholes (0.9 × 766 = 689.4 ≈ 690 boreholes, containing 2429 $N_c$ values) are used as the training dataset;
(2) a testing dataset, which is required to examine the model performance. In this study, the remaining 10% of the boreholes (76 boreholes, containing 293 $N_c$ values) are used as the testing dataset.
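The preprocessing described above can be sketched as follows. The function names are hypothetical, and the seeded random shuffle is only a stand-in for the selection procedure used in the study, which is not specified in enough detail to reproduce. Rounding 689.4 up to 690 follows the borehole counts reported in the text.

```python
import math
import random


def normalize_by_max(values):
    """Scale values by their maximum, assuming the maximum is positive."""
    peak = max(values)
    return [v / peak for v in values]


def split_boreholes(borehole_ids, train_fraction=0.9, seed=0):
    """Split at borehole level so all N_c values of a borehole stay together."""
    ids = list(borehole_ids)
    random.Random(seed).shuffle(ids)  # stand-in for the study's selection method
    n_train = math.ceil(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# 766 boreholes, identified here by hypothetical integer indices.
train_ids, test_ids = split_boreholes(range(766))
scaled_depths = normalize_by_max([2.0, 4.0, 8.0])  # illustrative depths in metres
```

Splitting by borehole rather than by individual $N_c$ value keeps every testing borehole entirely unseen during training.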

The training and testing datasets have been chosen using a sorting method to maintain statistical consistency. The application of SVM in this study requires the proper selection of the design parameters ($C$ and $\varepsilon$). The identification of optimal values of $C$ and $\varepsilon$ is largely a trial and error process. However, there are guidelines that can be used for selecting these parameters. A large $C$ assigns higher penalties to errors, so that the regression is trained to minimize error at the cost of lower generalization, while a small $C$ assigns lower penalties to errors; this allows the margin to be minimized with errors and thus gives higher generalization ability. If $C$ becomes infinitely large, the SVM does not allow any error and results in a complex model, whereas when $C$ goes to zero, the result tolerates a large amount of error and the model is less complex. With regard to the selection of $\varepsilon$, if $\varepsilon$ is too large, too few support vectors are selected, which leads to a decrease in the final prediction performance. If $\varepsilon$ is too small, many support vectors are selected, which leads to the risk of overfitting. The optimum values of $C$ and $\varepsilon$ obtained in this study are presented in Section 6. The SVM program is constructed using MATLAB.
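The trial and error selection described above can be organized as a one-at-a-time sweep, in which one parameter is varied while the others are held fixed. The sketch below is an assumption about how such a sweep might be coded, not the author's MATLAB program; `toy_error` is an artificial error surface standing in for "train the SVM and return the testing error", and its values bear no relation to the results of this study.

```python
def sweep_parameter(evaluate, base_params, name, candidates):
    """Vary one parameter with the others fixed; return (best value, all errors)."""
    errors = {}
    for value in candidates:
        params = dict(base_params, **{name: value})
        errors[value] = evaluate(**params)
    best = min(errors, key=errors.get)
    return best, errors


def toy_error(sigma, C, eps):
    """Artificial error surface (hypothetical), used only to exercise the sweep."""
    return (sigma - 2.0) ** 2 + (C - 10.0) ** 2 + eps


base = {"sigma": 1.0, "C": 1.0, "eps": 0.01}
best_C, errors_C = sweep_parameter(toy_error, base, "C", [1.0, 10.0, 100.0])
```

In practice `evaluate` would wrap model training and testing, and the sweep would be repeated for each parameter in turn.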

6. Results and Discussion

In this analysis, as a first step, the free parameters (the Gaussian kernel parameter $\sigma$, $C$, and $\varepsilon$) were chosen arbitrarily. It is therefore necessary to investigate the impact of these free parameters on the generalization error and the number of support vectors. First, the influence of $\sigma$ on the prediction performance is studied. The prediction accuracy is known to be strongly influenced by the value of $\sigma$: a $\sigma$ that is too small ($\sigma \to 0$) or too large ($\sigma \to \infty$) does not yield a good model. Figure 6 shows the impact of $\sigma$ on the testing results. The mean absolute error (MAE),

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left|a_i - p_i\right|,$$

where $a_i$ is the actual data, $p_i$ is the predicted data, and $n$ is the number of data, achieves a minimum value of 0.0271 at $\sigma = 3$ for the $N_c$ values. It can be seen from Figure 6 that the MAE values change sharply when $\sigma < 40$ and tend to flatten for $\sigma \ge 40$. In this study, a $\sigma$ value of 3 has been used for $N_c$. Figure 7 shows the variation of the MAE with the $C$ values. The MAE has a minimum value of 0.0271 at $C = 150$ for the $N_c$ values. Figure 8 shows the variation of the number of support vectors with the $C$ values. It can be seen from Figure 8 that the number of support vectors changes sharply when $C < 150$ and tends to flatten for $C \ge 150$. In order to make the learning process robust, $C$ has been assigned a value of 150. Figure 9 depicts the variation of the MAE with the $\varepsilon$ values. The MAE has a minimum value at $\varepsilon = 0.002$. Figure 10 shows the relation between the number of support vectors and the $\varepsilon$ values. The number of support vectors decreases with increasing $\varepsilon$. In general, $\varepsilon$ should be set to a small value, specified as $\varepsilon = 0.002$ in this analysis. To produce the best possible result, the $\sigma$ value should be 3. The SVM was found to generalize well with the capacity factor $C$ set to 150 and $\varepsilon$ set to 0.002. 
Figure 11 shows the performance of the SVM model for the training dataset (coefficient of correlation, $R = 0.994$); the predictions are almost identical to the original data. In order to evaluate the capabilities of the SVM model, the model is validated with new $N_c$ data that are not part of the training dataset. Figure 12 shows the performance of the SVM model for the testing dataset ($R = 0.986$). From Figure 12, it is clear that the SVM model has predicted the actual values of $N_c$ very well and that it can be used for the 3D site characterization model of Bangalore. Figures 13 and 14 show the $N_c$ values with depth for boreholes BH 176-2 and BH 276-2, respectively. From Figures 13 and 14, it is clear that the predicted values match the actual values of $N_c$ very well. Figures 15 and 16 show the three-dimensional and two-dimensional surfaces of $N_c$ obtained using the SVM model, respectively.
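The two performance measures used in this section can be computed as in the sketch below. The MAE follows the definition given above; the paper does not define its coefficient of correlation, so the Pearson form is assumed here. The sample values are hypothetical normalized numbers, not data from the study.

```python
import math


def mean_absolute_error(actual, predicted):
    """MAE = (1/n) * sum |a_i - p_i|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R (assumed definition)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical normalized N_c values and predictions.
actual = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.39]
mae = mean_absolute_error(actual, predicted)
r = correlation_coefficient(actual, predicted)
```

Because the inputs and outputs are normalized by their maxima, an MAE computed this way is expressed as a fraction of the maximum $N_c$ value.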

7. Conclusions

A three-dimensional site characterization model has been developed for Bangalore using the SVM technique. The SVM technique has been shown to be a promising tool for site characterization. SVM training consists of solving a uniquely solvable quadratic optimization problem and always finds a global minimum. In this study, the parameters $C$ and $\varepsilon$ are considered in the SVM method, which uses a radial basis kernel function. A detailed parametric analysis of the effect of these parameters on the predictive performance has been carried out. The SVM was found to generalize well with the capacity factor $C$ set to 150 and $\varepsilon$ set to 0.002. The results obtained show that the SVM model is accurate in predicting $N_c$ values. Overall, SVM is shown to provide a general site characterization model of Bangalore. This has potential for seismic hazard analysis, site response, and liquefaction studies for the development of microzonation maps for an area. The predicted $N_c$ values from the developed model can also be used to estimate subsurface information, the allowable bearing pressure of soils, and the elastic modulus of soils.

Acknowledgment

The author thanks T. G. Sitharam for providing the SPT data.