Abstract
The main objective of site characterization is to predict in situ soil properties at any half-space point at a site from a limited number of tests. In this study, the support vector machine (SVM) is used to develop a three-dimensional site characterization model for Bangalore, India, based on a large number of standard penetration test (SPT) results. SVM is a novel type of learning machine based on statistical learning theory that performs regression by introducing an ε-insensitive loss function. The database consists of 766 boreholes with more than 2700 field SPT values ($N$) spread over a 220 sq km area of Bangalore. The model is applied to the corrected $N$ values. Three input variables ($X$, $Y$, and $Z$, the coordinates of a point in Bangalore) were used for the SVM model, and the output of the SVM was the corrected $N$ value. The results presented in this paper highlight that SVM is a robust tool for site characterization. A sensitivity analysis of the SVM parameters (σ, $C$, and ε) is also presented.
1. Introduction
In general, geotechnical engineers characterize a site based on a limited number of tests. They interpret a site in terms of working soil profiles, which are generally prepared from soil properties. They therefore have to predict in situ soil properties at any half-space point at a site from a limited number of tests. Predicting soil properties is difficult because of uncertainty, whose sources are spatial variability, measurement “noise,” measurement and model bias, and statistical error due to limited measurements [1]. Prediction of soil properties using geostatistics has been reported by many researchers [2–6]. However, several reasons appear to hinder the use of geostatistics in geotechnical engineering [7]. In probabilistic site characterization, random field theory has been used by many researchers in geotechnical engineering [8–19]. One of the most important assumptions of random field theory is that the soil property is statistically homogeneous within the chosen layer. In addition, these models assume that the soil property consists of a constant mean or a global mean trend plus a stationary stochastic portion. To model the constant mean or global mean trend, the above researchers used regression analysis with polynomial functions. The autocovariance function, autocorrelation function, autoregressive processes, power spectra functions, variance function, and scale of fluctuation are available for modelling the stationary stochastic portion. Statistically homogeneous soil layers have been identified using “Modified Bartlett Statistics” [20]. However, random field methods and geostatistics have been applied to site characterization modelling with limited success [21]. Recently, artificial neural networks (ANNs) have been used for site characterization [21]. A major disadvantage of ANN models is that they give no information about the relative importance of the various parameters [22].
In an ANN, the knowledge acquired during training is stored implicitly, so it is very difficult to arrive at a reasonable interpretation of the overall structure of the network [23]. This has led to the term “black box,” which many researchers use when referring to ANN behavior. In addition, ANNs have some inherent drawbacks such as slow convergence, poorer generalization performance, convergence to local minima, and overfitting.
The support vector machine (SVM), based on statistical learning theory, was developed by Vapnik (1995) [24]. It provides a new and efficient approach that improves generalization performance and can attain a global minimum. SVMs have mostly been used for pattern recognition problems. More recently, they have been used to solve nonlinear regression estimation and time series prediction problems by introducing an ε-insensitive loss function [25–27]. The SVM implements the structural risk minimization principle (SRMP), which has been shown to be superior to the more traditional empirical risk minimization principle (ERMP) employed by many other modelling techniques [28, 29]. SRMP minimizes an upper bound on the generalization error, whereas ERMP minimizes the training error. In this way, SVM generalizes better than traditional techniques.
The standard penetration test (SPT) is a well-established and unsophisticated soil testing method that was developed in the United States around 1925. Despite its limitations, it has become the most popular field testing method for characterizing subsurface soil profiles. The field SPT value ($N$) is used to determine bearing capacity, settlement, and liquefaction potential, and it is also correlated with many soil properties such as shear wave velocity, angle of internal friction, and cone tip resistance. The objective of this paper is to use SVM to develop a three-dimensional (3D) site characterization model for Bangalore, India, based on the large number of $N$ values available in this area. Further, a sensitivity analysis of the SVM parameters (σ, $C$, and ε) has been carried out, and the results are presented to highlight their influence on the predictions.
2. Site Description
The city of Bangalore covers an area of over 220 square kilometres, and ground reduced levels (GRLs) vary considerably across the city, from 810 m in the north-eastern part to 940 m in the south-western part. Ground reduced levels do not vary much in the other parts of the city. The city once had more than 450 lakes, of which more than 340 have dried up due to erosion and encroachment for the construction of layouts and buildings. The population of the greater Bangalore region is over 6 million, making it the fifth biggest city in India. It is growing very fast and is situated at a latitude of 12°8′ north and a longitude of 77°37′ east.
Geologically, most of Bangalore falls in gneiss complexes, which were formed by several tectonic-thermal events with a large influx of sialic material. These events are believed to have occurred between 3400 and 3000 million years ago, giving rise to an extensive group of gray gneisses designated as the “older gneiss complex.” These gneisses act as the basement for a widespread belt of schists. The younger group of gneissic rocks, mostly of granodiorite and granite composition, is found in the eastern part of the state and represents remobilized parts of an older crust with abundant additions of newer granite material, for which the name “younger gneiss complex” has been given [31]. The soil is mostly residual soil formed by the weathering of granite gneiss. In the old tank beds, silty sand/clay is also found as overburden.
3. Geographic Information System (GIS) Model and Geotechnical Data
The Bangalore map forms the base layer for the development of the GIS model (see Figure 1). The map entities have been developed with two aims: first, to locate the borelogs as accurately as possible on a scale of 1 : 20000, and second, to allow the end user to identify the borelogs. The digitized map has several layers of information. Some of the important layers considered are the boundaries (outer and administrative), highways, major roads, minor roads, streets, railroads, water bodies, drains, ground contours, and borehole locations. A large amount of geotechnical data from 766 boreholes has been collated along with the index and engineering properties of the subsoil layers at different locations in Bangalore (the borehole locations are shown in Figure 1). The geotechnical data were obtained from geotechnical investigations for several major projects in Bangalore. In total, information from 766 borelogs has been entered into the database using a GIS with the ARCINFO package. The latitudes and longitudes were confirmed using global positioning system (GPS) stations at selected locations. In total, 2722 $N$ values are available from the 766 boreholes in the three-dimensional GIS model. The distribution of the collected boreholes is shown in Figure 2, which indicates a very good distribution of boreholes in each quadrant of Bangalore from the city center. Figure 1 depicts a grid of 1 km × 1 km within the corporate boundary of Bangalore, together with the outer boundary circumscribing the ring road and the borehole locations. It gives a clear view of the spatial distribution of boreholes in the Bangalore region. On average, data from about four boreholes are available within each 1 km × 1 km grid cell.
Geotechnical data were collated from the archives of the Torsteel Research Foundation in India and the Indian Institute of Science for geotechnical investigations carried out for several major projects in Bangalore. The data are of very high quality and were collected for important projects in Bangalore during the years 1995–2003. The data in the model extend on average to a depth of 30 m below ground level. The borelogs contain information about depth, soil density, total stress, effective stress, fines content, $N$ values, and depth of the groundwater table. For the general identification of soil layers, the Bangalore map area is divided into four parts (four quadrants) along the north-south and east-west directions, as shown in Figure 2. The typical soil profile in the north-western part of Bangalore has three layers. The first layer contains brownish silty sand with clay, or red soil in some locations, up to 3 m; below it, medium dense to very dense silty sand is present up to 6 m. The third layer is weathered rock from 6 m to 17 m depth, followed by hard rock. The south-western part contains red soil or reddish silty sand with gravel up to 1.7 m depth, yellowish clayey sand from 1.7 m to 3.5 m, yellowish silty sand with clay from 3.5 m to 8.5 m, and hard rock below 8.5 m. The soil in the south-eastern part can be classified into four layers: brownish clayey sand up to 1.5 m, brownish clayey sand with gravel from 1.5 m to 4 m, yellowish silty sand with gravel up to 5.5 m, and weathered rock at different stages of weathering from 5.5 m to 17.5 m, with hard rock beneath. The north-eastern part has four layers: filled-up soil to 1.5 m, reddish silty clay from 1.5 m to 4.5 m, sandy clay up to 7.5 m, and weathered rock from 7.5 m to 18.5 m, with hard rock below.
The corrections to the field $N$ values (shown in Tables 1 and 2) are applied for overburden pressure ($C_N$), hammer energy ($C_E$), borehole diameter ($C_B$), presence or absence of a liner ($C_S$), rod length ($C_R$), and fines content ($C_{\text{fines}}$), following standard procedures in the literature [32–37].
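As a rough illustration of how such corrections are usually combined, the sketch below multiplies a field $N$ value by a set of correction factors. The multiplicative form, the function name `corrected_n`, and all numerical factor values are assumptions made for illustration; they are not the values of Tables 1 and 2. The fines-content correction is left out because its form depends on the procedure adopted.

```python
from math import prod


def corrected_n(n_field, factors):
    """Apply multiplicative correction factors to a field SPT N value.

    `factors` maps a (hypothetical) factor name to its value, e.g. the
    overburden, hammer energy, borehole diameter, liner, and rod length
    corrections.
    """
    return n_field * prod(factors.values())


# Illustrative factor values only; they are NOT taken from the paper's tables.
example_factors = {
    "overburden": 1.10,
    "hammer_energy": 0.90,
    "borehole_diameter": 1.00,
    "liner": 1.00,
    "rod_length": 0.85,
}

n_corrected = corrected_n(20, example_factors)
print(round(n_corrected, 2))
```

Keeping the factors in a dictionary makes it easy to see which corrections were applied to a given borehole record.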
4. Support Vector Machine Model
SVM originated from the concept of statistical learning theory pioneered by Boser et al. (1992) [38]. This section gives a brief introduction to the construction of SVM for regression problems. SVM has three distinct characteristics when it is used to estimate a regression function. First, SVM estimates the regression using a set of linear functions defined in a high-dimensional space. Second, SVM carries out the regression estimation by risk minimization, where the risk is measured using Vapnik's ε-insensitive loss function. Third, SVM uses a risk function consisting of the empirical error and a regularization term derived from the SRMP. This study uses SVM as a regression technique by introducing an ε-insensitive loss function. The ε-insensitive loss function ($L_\varepsilon(y)$) can be described as follows:

$$L_\varepsilon(y) = \begin{cases} 0 & \text{for } |f(x) - y| < \varepsilon, \\ |f(x) - y| - \varepsilon & \text{otherwise}. \end{cases} \tag{1}$$

This defines an ε tube (Figure 3): if the predicted value lies within the tube, the loss is zero, while if the predicted point lies outside the tube, the loss is the amount by which the absolute difference between the predicted and actual values exceeds the radius ε of the tube. Assume that the training dataset consists of $l$ training samples $\{(x_1, y_1), \ldots, (x_l, y_l)\}$, where $x$ is the input and $y$ is the output. For the site characterization model of Bangalore, $x = [X, Y, Z]$ and $y$ is the corrected $N$ value.
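The ε-insensitive loss described above can be written in a few lines. The following is a minimal sketch; the function name `eps_insensitive_loss` and the sample numbers are illustrative assumptions, not part of the original study.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Return 0 inside the eps tube, else the distance beyond the tube."""
    return max(0.0, abs(y_true - y_pred) - eps)


# A prediction inside the tube incurs no loss.
print(eps_insensitive_loss(0.50, 0.501, eps=0.002))
# A prediction outside the tube is penalized only for the excess over eps.
print(eps_insensitive_loss(0.50, 0.60, eps=0.002))
```

Because small residuals cost nothing, only samples on or outside the tube influence the fitted function.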
The main aim in SVM is to find a function $f(x)$ that deviates from the actual output by at most ε and at the same time is as flat as possible. Let us assume a linear function

$$f(x) = \langle w, x \rangle + b, \qquad w \in \mathbb{R}^n,\; b \in \mathbb{R}, \tag{2}$$

where $w$ = an adjustable weight vector, $b$ = the scalar threshold, $\mathbb{R}^n$ = $n$-dimensional vector space, and $\mathbb{R}$ = one-dimensional vector space.
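Flatness in the case of (2) means that one seeks a small $w$. One way of obtaining this is by minimizing the Euclidean norm $\|w\|^2$. This is equivalent to the following convex optimization problem:

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2}\|w\|^2 \\ \text{subject to} \quad & y_i - (\langle w, x_i \rangle + b) \le \varepsilon, \quad i = 1, 2, \ldots, l, \\ & (\langle w, x_i \rangle + b) - y_i \le \varepsilon, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{3}$$

This formulation assumes that the convex optimization problem is feasible. Sometimes, however, this may not be the case, or we may want to allow for some errors, analogously to the “soft margin” loss function [39] that was used in SVM by Cortes and Vapnik (1995) [40]. As shown in Figure 3, the parameters $\xi_i$, $\xi_i^*$ are slack variables that determine the degree to which samples with an error larger than ε are penalized. In other words, any error smaller than ε does not require $\xi_i$, $\xi_i^*$ and hence does not enter the objective function, because these data points have a value of zero for the loss function. The slack variables ($\xi_i$, $\xi_i^*$) are introduced to avoid infeasible constraints in the optimization problem (3):

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \\ \text{subject to} \quad & y_i - (\langle w, x_i \rangle + b) \le \varepsilon + \xi_i, \quad i = 1, 2, \ldots, l, \\ & (\langle w, x_i \rangle + b) - y_i \le \varepsilon + \xi_i^*, \quad i = 1, 2, \ldots, l, \\ & \xi_i \ge 0, \quad \xi_i^* \ge 0, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{4}$$

The constant $C$ determines the trade-off between the flatness of $f$ and the amount up to which deviations larger than ε are tolerated [41]. The optimization problem (4) is solved using Lagrangian multipliers [42], and its solution is given by

$$f(x) = \sum_{i=1}^{\text{nsv}} (\alpha_i - \alpha_i^*) \langle x_i, x \rangle + b, \tag{5}$$

where $\alpha_i$, $\alpha_i^*$ are the Lagrangian multipliers, and nsv is the number of support vectors. An important aspect is that some Lagrange multipliers ($\alpha_i$, $\alpha_i^*$) will be zero, implying that the corresponding training objects are irrelevant for the final solution (sparseness). The training objects with nonzero Lagrange multipliers are called support vectors.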
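To make the trade-off controlled by $C$ concrete, the sketch below evaluates the soft-margin objective for a given linear function. It assumes that, for fixed $w$ and $b$, the smallest feasible slack for each sample equals its ε-insensitive loss; the function name `svr_primal_objective` and the toy data are illustrative assumptions.

```python
def svr_primal_objective(w, b, X, y, C, eps):
    """Evaluate 0.5*||w||^2 + C * (sum of slacks) for a linear model.

    For fixed w and b, the smallest feasible slack of each sample is its
    eps-insensitive loss, which is what is summed here.
    """
    flatness = 0.5 * sum(wj * wj for wj in w)
    slack_total = 0.0
    for xi, yi in zip(X, y):
        f_xi = sum(wj * xj for wj, xj in zip(w, xi)) + b
        slack_total += max(0.0, abs(yi - f_xi) - eps)
    return flatness + C * slack_total


# Toy data (illustrative only): the second sample lies outside the eps tube.
X_toy = [[1.0, 0.0], [2.0, 0.0]]
y_toy = [1.0, 2.5]
objective = svr_primal_objective([1.0, 0.0], 0.0, X_toy, y_toy, C=10.0, eps=0.1)
print(objective)
```

Raising `C` increases the weight of the slack term relative to the flatness term, so errors outside the tube dominate the objective.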
When linear regression is not appropriate, the input data have to be mapped into a high-dimensional feature space through some nonlinear mapping [38] (see Figure 4). Two steps are involved: first, a fixed nonlinear mapping of the data onto the feature space, and then a linear regression in the high-dimensional space. The input data are mapped onto the feature space by a map Φ (see Figure 4). The dot product given by $\Phi(x_i) \cdot \Phi(x)$ is computed as a linear combination of the training points. The concept of the kernel function, $K(x_i, x) = \Phi(x_i) \cdot \Phi(x)$, has been introduced to reduce the computational demand [40, 43]. So, (5) can be written as

$$f(x) = \sum_{i=1}^{\text{nsv}} (\alpha_i - \alpha_i^*) K(x_i, x) + b. \tag{6}$$

In this study, the radial basis function has been used as the kernel function.
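The kernel form of the regression function can be sketched as follows. The text does not give the exact expression of the radial basis function, so the common Gaussian form $\exp(-\|u - v\|^2 / (2\sigma^2))$ is assumed here; the function names, support vectors, and coefficients are hypothetical and stand in for quantities that a trained model would supply.

```python
import math


def rbf_kernel(u, v, sigma):
    """Gaussian RBF kernel (assumed form): exp(-||u - v||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, coeffs, b, sigma):
    """f(x) = sum_i coeff_i * K(x_i, x) + b.

    Each coeff_i stands for the difference (alpha_i - alpha_i*) of the
    Lagrange multipliers of support vector x_i.
    """
    return sum(
        c * rbf_kernel(sv, x, sigma) for sv, c in zip(support_vectors, coeffs)
    ) + b


# Hypothetical trained quantities, for illustration only.
svs = [[0.2, 0.4, 0.1], [0.8, 0.5, 0.3]]
coeffs = [0.5, -0.2]
print(svr_predict([0.2, 0.4, 0.1], svs, coeffs, b=0.1, sigma=3.0))
```

Only the support vectors enter the sum, which is why the number of support vectors governs the cost of each prediction.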
5. SVM Implementation for Site Characterization Model
Figure 5 shows the architecture of the SVM for predicting $N$ in the 3D subsurface of Bangalore. Each of the input variables ($X$, $Y$, and $Z$) is first normalized with respect to its maximum value. The output variable is also normalized with respect to its maximum value. For implementing the SVM, the data have been divided into two subsets: (1) a training dataset, which is required to train the model. In this study, 90% of the total boreholes are used for the training dataset (total number of boreholes = 766; 90% of the total = 689.4 ≈ 690 boreholes; number of $N$ values = 2429). (2) a testing dataset, which is required to examine the model performance. In this study, the remaining 10% of the boreholes are used as the testing dataset, which consists of 76 boreholes with 293 $N$ values.
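The preprocessing described above can be sketched as follows. Normalization by the maximum value follows the text; the splitting rule (holding out every tenth borehole) is only an assumed stand-in that reproduces the 690/76 borehole counts and is not the sorting method used in the study. The function names are hypothetical.

```python
def normalize_by_max(values):
    """Scale values by their maximum (assumes the maximum is positive)."""
    peak = max(values)
    return [v / peak for v in values]


def split_boreholes(borehole_ids, every=10):
    """Hold out every `every`-th borehole for testing.

    This is an assumed rule for illustration, not the paper's sorting method.
    Splitting by borehole keeps all N values of one borehole in the same subset.
    """
    train, test = [], []
    for i, bh in enumerate(borehole_ids):
        (test if i % every == every - 1 else train).append(bh)
    return train, test


train_ids, test_ids = split_boreholes(list(range(1, 767)))
print(len(train_ids), len(test_ids))
print(normalize_by_max([2.0, 4.0, 8.0]))
```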
The training and testing datasets have been chosen using a sorting method to maintain statistical consistency. The application of SVM in this study requires the proper selection of the design parameters ($C$ and ε). Identifying the optimal values of $C$ and ε is largely a trial-and-error process. However, there are guidelines that can be used for selecting these parameters. A large $C$ assigns higher penalties to errors, so that the regression is trained to minimize error at the cost of lower generalization, while a small $C$ assigns smaller penalties to errors; this allows the margin to be minimized with some errors and thus gives higher generalization ability. If $C$ becomes infinitely large, the SVM does not allow any error and produces a complex model, whereas when $C$ goes to zero, the result tolerates a large amount of error and the model is less complex. With regard to the selection of ε, if ε is too large, too few support vectors are selected, which decreases the final prediction performance. If ε is too small, many support vectors are selected, which leads to a risk of overfitting. The optimum values of $C$ and ε obtained in this study are presented in Section 6. The SVM program was written in MATLAB.
6. Result and Discussion
As a first step in this analysis, the free parameters (the width σ of the Gaussian kernel function, $C$, and ε) were chosen arbitrarily. It is therefore necessary to investigate the impact of these free parameters on the generalization error and the number of support vectors. First, the influence of σ on the prediction performance is studied. The prediction accuracy is known to be strongly influenced by the value of σ: a σ that is too small or too large is not well suited to a good model. Figure 6 shows the impact of σ on the testing results. The mean absolute error (MAE) ($\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |a_i - p_i|$, where $a_i$ is the actual data, $p_i$ is the predicted data, and $n$ is the number of data) achieves a minimum value of 0.0271 at σ = 3 for the $N$ values. It can be seen from Figure 6 that the MAE values change sharply when σ < 40 and tend to flatten for σ ≥ 40. In this study, a σ value of 3 has been used for the $N$ values. Figure 7 shows the variation of the MAE with the $C$ values; the minimum MAE for the $N$ values is 0.0271. Figure 8 shows the variation of the number of support vectors with the $C$ values. It can be seen from Figure 8 that the number of support vectors changes sharply below a threshold value of $C$ and tends to flatten beyond it. In order to make the learning process robust, $C$ has been assigned a value of 150. Figure 9 depicts the variation of the MAE with the ε values. The MAE has a minimum value at ε = 0.002. Figure 10 shows the relation between the number of support vectors and the ε values. The number of support vectors decreases with increasing ε. In general, ε should be set to a small value, specified as ε = 0.002 in this analysis. To produce the best possible result, the σ value should be 3. The SVM was found to generalize well with the capacity factor $C$ set to 150 and ε set to 0.002. Figure 11 shows the performance of the SVM model on the training dataset in terms of the coefficient of correlation ($R$); the results are almost identical to the original data.
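The error measure and the selection step of such a sensitivity analysis can be sketched as follows. The function names are hypothetical, and the sweep values are toy numbers chosen for illustration; they are not results from this study.

```python
def mean_absolute_error(actual, predicted):
    """MAE = (1/n) * sum(|actual_i - predicted_i|)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def best_parameter(mae_by_value):
    """Return the parameter value whose testing MAE is smallest."""
    return min(mae_by_value, key=mae_by_value.get)


# Toy sweep (NOT results from the study): parameter value -> testing MAE.
toy_sweep = {0.5: 0.30, 1.0: 0.10, 2.0: 0.20}
print(best_parameter(toy_sweep))
print(mean_absolute_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

In practice, one parameter is varied at a time while the others are held fixed, and the testing MAE is recorded for each value before the minimum is picked.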
In order to evaluate the capabilities of the SVM model, the model is validated with new data that are not part of the training dataset. Figure 12 shows the performance of the SVM model on the testing dataset in terms of the coefficient of correlation ($R$). From Figure 12, it is clear that the SVM model has predicted the actual values of $N$ very well and that it can be used for a 3D site characterization model of Bangalore. Figures 13 and 14 show the variation of the $N$ values with depth for boreholes BH 176-2 and BH 276-2, respectively. From Figures 13 and 14, it is clear that the predicted values match the actual values of $N$ very well. Figures 15 and 16 show three-dimensional and two-dimensional surfaces of $N$ obtained using the SVM model, respectively.
7. Conclusions
A three-dimensional site characterization model has been developed for Bangalore using the SVM technique, which has been shown to be a promising tool for site characterization. SVM training consists of solving a uniquely solvable quadratic optimization problem and always finds a global minimum. In this study, the $C$ and ε factors are considered in the SVM method together with a kernel function. A detailed parametric analysis of the effect of these parameters on the predictive performance has been carried out. The SVM was found to generalize well with the capacity factor $C$ set to 150 and ε set to 0.002. The results show that the SVM model is accurate in predicting $N$ values. In general, SVM is shown to provide a general site characterization model of Bangalore. This has potential for seismic hazard analysis, site response, and liquefaction studies for the development of microzonation maps for an area. The predicted $N$ values from the developed model can also be used to estimate subsurface information, the allowable bearing pressure of soils, and the elastic modulus of soils.
Acknowledgment
The author thanks T. G. Sitharam for providing the SPT data.