Research Article | Open Access
Priori Information Based Support Vector Regression and Its Applications
To exploit the priori information (PI) contained in real monitored values of peak particle velocity (PPV) and to increase the prediction accuracy of PPV, a PI-based support vector regression (SVR) method is established. Firstly, to extract the PI provided by monitored data from a mathematical point of view, the probability density of PPV is estimated with ε-SVR. Secondly, to make full use of the PI about the fluctuation of PPV between its maximal and minimal values in a certain period of time, the probability density estimated with ε-SVR is incorporated into the training data, which increases the dimensionality of the training data. Thirdly, using the higher-dimensional training data, a method of predicting PPV called PI-ε-SVR is proposed. Finally, with PPV values induced by underwater blasting and collected at Dajin Island at the Taishan nuclear power station in China, contrastive experiments are carried out to show the effectiveness of the proposed method.
Underwater blasting is a construction method often used in engineering, for example in hydraulic engineering, port engineering, bridge building, and dam excavation. In engineering practice, PPV is the main basis for measuring the influence of the seismic wave caused by blasting on nearby buildings. However, PPV is influenced by many factors, which leads to a relatively low prediction accuracy.
Recently, Xie et al. [1] analyzed the characteristic parameters of blasting vibration with the Sadovsky formula and predicted the PPV of blasting vibration. Yang et al. [2] predicted the PPV induced by underwater blasting at Dajin Island in the first phase of the Taishan nuclear power station based on the Sadovsky formula. However, the Sadovsky formula [1, 2] relies on a large amount of monitored data and considers only two parameters in the regression analysis, so it cannot reflect the influence of the many complex factors in underwater blasting. Thus, T. N. Singh and V. Singh [3] made an attempt to predict ground vibration using an artificial neural network (ANN) incorporating a large number of parameters. Khandelwal and Singh [4] proposed a method to evaluate and predict blast-induced ground vibration and frequency by incorporating rock properties, blast design, and explosive parameters using the ANN technique; they found that the ANN was more accurate and able to predict the value of blast vibration without increasing error as the number of inputs and the nonlinearity among them increased. Furthermore, Liu et al. [5] introduced grey relational analysis to the prediction of PPV and proposed a genetic neural network model based on grey relational analysis.
However, many complex factors influence PPV and the data are high-dimensional, so ANNs [6–8] face the curse of dimensionality. Moreover, when the data are limited, ANNs are prone to becoming trapped in local minima. Thus, ANNs also cannot predict PPV accurately. Therefore, how to improve the prediction accuracy of PPV caused by underwater blasting vibration with limited monitored data is still a problem worthy of study.
Support vector machines (SVMs), including support vector classifications (SVCs) and support vector regressions (SVRs), were proposed by Vapnik et al. [9, 10] in the 1990s. SVMs address statistical learning problems for small samples by solving a convex quadratic optimization problem, and thereby avoid the local minimum problem that ANNs cannot avoid. SVMs use a kernel function to map the data from the original space to a high-dimensional feature space and then solve the nonlinear decision problem in that space. Thus, SVMs can cope with the curse of dimensionality that troubles ANNs and have good generalization ability. However, standard SVMs focus on monitored data and cannot incorporate prior information into the learning process, which may decrease their generalization ability. Therefore, Guan et al. [11] proposed a modified method that incorporated prior information into cancer classification based on gene expression data to improve accuracy. Zhang et al. [12] proposed a fully Bayesian methodology for generalized kernel mixed models, which are extensions of generalized linear mixed models in the feature space induced by a reproducing kernel. Liu and Xue [13] focused on designing a new class of kernels to incorporate fuzzy prior information into the training process of SVRs. Currently, SVMs have received extensive attention and are being studied by more and more scholars from different points of view [14–29].
However, for the problem of PPV prediction in practice, the unknown probability density of PPV carries much PI about PPV. If this probability density can be estimated from monitored data and incorporated into PPV prediction, the prediction accuracy may be greatly improved. In practice, the measured PPV at a fixed time is the mean of many values monitored during a certain period of time. The larger the difference between the maximal and minimal monitored values in that period, the larger the fluctuation of PPV during the period; conversely, the smaller the difference, the smaller the fluctuation. The mean value alone, however, provides no information about this fluctuation. Therefore, to increase the prediction accuracy of PPV, this paper proposes a new prediction method that incorporates this PI about fluctuation.
This paper is structured as follows. Section 2 estimates the probability density of PPV with monitored data and ε-SVR, incorporates the PI about the fluctuation of PPV into the training data, and then establishes a prediction method for PPV based on priori information and ε-support vector regression (PI-ε-SVR). Section 3 presents the contrastive experiments with real monitored data of PPV from Dajin Island at the Taishan nuclear power station. Section 4 draws the conclusions and outlines future directions.
2. Prediction Method of PPV Based on PI and SVR
In order to extract as much information as possible from the monitored data of PPV, we first estimate the probability density of PPV from the monitored data with ε-SVR. Then, we establish a prediction method for PPV based on PI and ε-SVR.
2.1. Increasing the Dimensionality of Training Data with PI about Fluctuation of PPV
Suppose that the probability density of PPV has been estimated with ε-SVR. In practice, PPV is monitored many times within every fixed period of time, and the mean of the monitored values is output as the PPV for that period. In other words, the PPV recorded at a given time is the mean of the values monitored over the corresponding period. This mean value can serve as an approximation of the PPV at that time, but in some cases it is quite different from the actual PPV. For example, if all the PPVs monitored during a period have the same value, then the statement “the mean value is the PPV” holds with probability 1. Conversely, if the monitored PPVs fluctuate wildly during the period, then the statement “the mean value is the PPV” holds with a very low probability.
Hence, in order to incorporate this PI into the prediction of PPV, each training datum is converted into a datum of higher dimension by attaching to it the probability given by (1) and (2).
Remark 1. From (1) and (2), the larger the difference between the maximal and minimal monitored values in a period is, the smaller the corresponding probability becomes; that is, the possibility that the PPV at that time equals the mean value is very small. On the other hand, a large difference indicates that the PPVs in the period fluctuate wildly and that the statement “the mean value is the PPV” holds with a low probability, which is in accordance with the information provided by the probability. Thus, the probability provides PI about the fluctuation of PPV during a certain period of time, and each converted training datum contains the PI provided by monitored data. The problem of predicting PPV is then as follows.
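The behaviour described in Remark 1 can be illustrated with a small sketch. It assumes one form of the probability that is consistent with the text, namely one minus the mass that the estimated density assigns to the interval between the minimal and maximal monitored values; this form, the function names, the Gaussian stand-in for the ε-SVR density estimate, and the sample readings are all hypothetical and are not taken from the paper.

```python
import math
from statistics import mean


def gaussian_density(x, mu=5.0, sigma=1.0):
    """Stand-in for the density estimated with epsilon-SVR (an assumption)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fluctuation_probability(readings, density, n_steps=1000):
    """Assumed form: 1 - integral of the density over [min, max] of the readings."""
    a, b = min(readings), max(readings)
    if a == b:
        return 1.0  # no fluctuation: the mean certainly equals the PPV
    h = (b - a) / n_steps
    # Composite trapezoidal rule on [a, b].
    area = 0.5 * (density(a) + density(b))
    area += sum(density(a + k * h) for k in range(1, n_steps))
    area *= h
    return max(0.0, 1.0 - area)


def augment(t, readings, density):
    """Convert one period's readings into (time, probability, mean PPV)."""
    return (t, fluctuation_probability(readings, density), mean(readings))


# Hypothetical readings (cm/s) for three monitoring periods.
steady = augment(10.0, [5.0, 5.0, 5.0, 5.0], gaussian_density)
mild = augment(20.0, [4.9, 5.0, 5.1, 5.0], gaussian_density)
wild = augment(30.0, [3.5, 5.0, 6.5, 5.0], gaussian_density)
```

All three periods have the same mean PPV, yet the attached probability decreases from the steady period to the wildly fluctuating one, which is the extra information that the mean alone cannot carry.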
2.1.1. Problem of PPV Prediction
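Let the monitoring times be given, and let the converted data form the training set (3). Combining each monitoring time with its probability into a single input, training set (3) can be rewritten as training set (4), in which each input is paired with the corresponding PPV as output. The problem of PPV prediction is to find a real-valued function on the input space according to training set (4) in order to predict the PPV for any given input.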
In the next subsection, a method of predicting the PPVs from the aspect of mathematics is established.
2.2. Method of Predicting PPV Based on PI and Epsilon-SVR
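In order to solve the above problem of predicting PPV, PI-ε-SVR is constructed on the basis of the standard SVR as follows.
Step 1. Obtain the training set in the form of (4).
Step 2. Select a proper kernel function, the parameter ε > 0, and the punishment parameter.
Step 3. Construct and solve the corresponding convex quadratic programming problem to obtain an optimal solution vector.
Step 4. Choose a component of the optimal solution vector, of either of its two types, that lies in the prescribed interval, and compute the threshold of the decision function with the formula corresponding to the type of the chosen component.
Step 5. Construct the decision function from the optimal solution, the kernel function, and the threshold.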
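For reference, Steps 3 to 5 follow the pattern of the standard ε-SVR. In one common textbook notation, with training inputs $x_i$, outputs $y_i$ ($i = 1, \dots, l$), kernel $K$, and punishment parameter $C$, the standard dual problem, the two threshold formulas, and the decision function can be written as below. The symbols are assumptions made for this sketch and may differ from the notation of this paper, in particular in how the upper bound $C$ is scaled.

```latex
\begin{aligned}
\min_{\alpha^{(*)}}\quad
  & \frac{1}{2}\sum_{i,j=1}^{l}(\alpha_i^{*}-\alpha_i)(\alpha_j^{*}-\alpha_j)K(x_i,x_j)
    + \varepsilon\sum_{i=1}^{l}(\alpha_i^{*}+\alpha_i)
    - \sum_{i=1}^{l} y_i(\alpha_i^{*}-\alpha_i) \\
\text{s.t.}\quad
  & \sum_{i=1}^{l}(\alpha_i-\alpha_i^{*}) = 0,\qquad
    0 \le \alpha_i,\ \alpha_i^{*} \le C,\quad i=1,\dots,l, \\[4pt]
\bar b &= y_j - \sum_{i=1}^{l}(\bar\alpha_i^{*}-\bar\alpha_i)K(x_i,x_j) + \varepsilon
  \quad \text{if } \bar\alpha_j \in (0,C) \text{ is chosen}, \\
\bar b &= y_k - \sum_{i=1}^{l}(\bar\alpha_i^{*}-\bar\alpha_i)K(x_i,x_k) - \varepsilon
  \quad \text{if } \bar\alpha_k^{*} \in (0,C) \text{ is chosen}, \\
f(x) &= \sum_{i=1}^{l}(\bar\alpha_i^{*}-\bar\alpha_i)K(x_i,x) + \bar b .
\end{aligned}
```

In PI-ε-SVR the only change to this pattern is the input: each $x_i$ would be the higher-dimensional datum that carries the fluctuation probability in addition to the monitoring time.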
3. Contrastive Experiments
To predict the PPV (in cm/s) at a given time (in minutes), we recorded the PPV induced by underwater blasting 100 times with an IDTS-3850 blast vibration recorder at Dajin Island at the Taishan nuclear power station in China, which gave 100 monitored datasets (samples). In Section 3.1, the probability density of PPV is estimated by ε-SVR from the first 80 samples (the last 20 samples are used for prediction). Because the probability density contains much information about PPV, and the probability (1) provides the PI about the fluctuation of PPV during a certain period of time, both are incorporated into the training data in Section 3.2. Section 3.3 presents the experiments predicting the PPV at a given point with standard ε-SVR and the proposed PI-ε-SVR, respectively, and the results are analyzed in Section 3.4.
3.1. Estimating Probability Density of PPV with Monitored Data
PPV was monitored ten times during every ten-minute period, and the mean value was output as the PPV for that period. Using ε-SVR, the estimated probability density of PPV is obtained, as shown in Figure 1.
As the probability density provides much PI about PPV, we incorporate it into the training data in the next subsection.
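The estimation procedure itself is not spelled out in this section. One common way to estimate a density with ε-SVR, which may differ from the procedure used here, is to fit the empirical distribution function with an SVR and then differentiate the fit. The sketch below follows that approach with scikit-learn; the function names, the hyperparameter values, and the synthetic samples (which are not the monitored PPVs) are assumptions.

```python
import numpy as np
from sklearn.svm import SVR


def trapezoid_area(y, x):
    """Area under y(x) by the trapezoidal rule."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def estimate_density(samples, grid, C=10.0, epsilon=0.01, gamma=1.0):
    """Fit the empirical CDF with an RBF epsilon-SVR, then differentiate on a grid."""
    x = np.sort(np.asarray(samples, dtype=float))
    ecdf = np.arange(1, x.size + 1) / x.size  # empirical distribution function
    model = SVR(kernel="rbf", C=C, epsilon=epsilon, gamma=gamma)
    model.fit(x.reshape(-1, 1), ecdf)
    cdf = model.predict(grid.reshape(-1, 1))
    # The fitted CDF need not be monotone, so negative slopes are clipped to zero.
    pdf = np.clip(np.gradient(cdf, grid), 0.0, None)
    return pdf / trapezoid_area(pdf, grid)  # renormalise to unit area


# Synthetic stand-in for 80 monitored PPVs (cm/s).
rng = np.random.default_rng(0)
samples = rng.normal(loc=5.0, scale=0.5, size=80)
grid = np.linspace(3.0, 7.0, 401)
pdf = estimate_density(samples, grid)
```

Clipping and renormalising are pragmatic choices of this sketch: they guarantee a nonnegative function with unit area on the grid, at the cost of ignoring any mass outside it.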
3.2. Increasing the Dimensionality of Training Data with PI about Fluctuation of PPV
Consider the ten-minute monitoring period and the first 80 monitored samples. By the estimated probability density and (1), the probability for each monitored datum can be calculated. Then, according to training set (4), the monitored data are converted into the higher-dimensional data (12), which are used to establish a model to predict the PPV at the given test point.
Similarly, to predict the PPV at another given test point, we can estimate the probability density of PPV with the corresponding monitored data and use the correspondingly converted data as training data.
3.3. Method of Predicting PPV Based on PI and Epsilon-SVR
To predict the PPV at the given test point with training data (12), we run experiments with PI-ε-SVR and standard ε-SVR, respectively. A grid search based on 5-fold cross-validation is used to determine the model parameters, and the kernel function is a radial basis function (RBF). The experiment results are shown in Table 1. The predicted PPVs of the training data with PI-ε-SVR and standard ε-SVR are shown in Figures 2 and 3, respectively. The predicted PPVs of the testing data with the two methods are shown in Figures 4 and 5, respectively.
Following the same steps, we repeat the experiment 20 times to predict the PPVs at the remaining test points (namely, the last 20 monitored datasets); the monitored values and predicted values are shown in Table 2, and the average mean squared errors are shown in Table 3 (the numbers after ± are the standard deviations).
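The model selection described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the experimental code: the data are synthetic stand-ins for the monitored PPVs, the parameter grid is hypothetical, and the standardisation step is an addition of this sketch so that the time and probability features are on comparable scales for the RBF kernel.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in data (not the monitored PPVs): time, probability, PPV.
t = np.arange(1, 101, dtype=float) * 10.0             # minutes
p = rng.uniform(0.1, 1.0, size=t.size)                # hypothetical fluctuation PI
y = 5.0 + 0.5 * np.sin(t / 100.0) + 0.3 * (1.0 - p) + rng.normal(0.0, 0.05, t.size)

X_plain = t.reshape(-1, 1)                            # input of standard epsilon-SVR
X_pi = np.column_stack([t, p])                        # PI-augmented input

# Hypothetical search grid over C, epsilon, and the RBF width gamma.
param_grid = {
    "svr__C": [1, 10, 100],
    "svr__epsilon": [0.01, 0.1],
    "svr__gamma": [0.01, 0.1, 1.0],
}


def fit_and_score(X, y, n_train=80):
    """Grid-search an RBF epsilon-SVR on the first n_train rows, test on the rest."""
    pipeline = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
    search = GridSearchCV(pipeline, param_grid, cv=5,
                          scoring="neg_mean_squared_error")
    search.fit(X[:n_train], y[:n_train])
    pred = search.predict(X[n_train:])
    mse = float(np.mean((pred - y[n_train:]) ** 2))
    return search.best_params_, mse


params_plain, mse_plain = fit_and_score(X_plain, y)
params_pi, mse_pi = fit_and_score(X_pi, y)
```

Both models go through the same search, so any difference between `mse_plain` and `mse_pi` comes from the extra probability feature alone; on synthetic data the size and sign of that difference say nothing about the results in Table 1.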
3.4. Results Analysis
In the experiments, the optimal parameters are chosen via a grid search based on 5-fold cross-validation. Table 1 shows that, with the same number of training datasets, the mean squared error of the training data with PI-ε-SVR is 0.0064, which is smaller than the corresponding value (0.0241) with standard ε-SVR. This illustrates that PI-ε-SVR is more accurate than standard ε-SVR in predicting PPV. Figures 2 and 3 show the predicted PPVs of the 80 training datasets with PI-ε-SVR and standard ε-SVR, respectively. Comparing the two figures, the predicted PPVs with PI-ε-SVR are closer to the monitored PPVs than those with standard ε-SVR, which shows that the proposed PI-ε-SVR method is more accurate than standard ε-SVR in predicting the PPVs of the training data.
In Figures 4 and 5, the monitored PPV is 5.49 cm/s, and the predicted PPVs with PI-ε-SVR and standard ε-SVR are 4.73 cm/s and 4.56 cm/s, respectively. That is to say, the predicted PPV with PI-ε-SVR is closer to the monitored PPV than that with standard ε-SVR. The last column of Table 1 shows that the mean squared errors of the testing data with the two methods are 0.0438 and 0.0648, respectively. These results illustrate that the proposed PI-ε-SVR method is more accurate and effective than standard ε-SVR in predicting the PPV at the given test point.
To reduce the influence of randomness, we repeated the experiment 20 times to predict the PPVs at the remaining test points (namely, the last 20 monitored datasets); the real monitored values and predicted values are shown in Table 2, and the average mean squared errors and standard deviations are shown in Table 3.
Table 2 shows that most of the 20 PPVs predicted with PI-ε-SVR (third column) are closer to the monitored PPVs (second column) than those obtained with standard ε-SVR (fourth column), showing that the proposed PI-ε-SVR method is more accurate than standard ε-SVR in predicting the last 20 PPVs. Likewise, most of the 20 mean squared errors obtained with PI-ε-SVR (fifth column) are smaller than the corresponding ones with standard ε-SVR (last column), showing that the PI-ε-SVR method is more stable than standard ε-SVR.
Table 3 shows that the average mean squared error obtained with PI-ε-SVR is 0.0117, which is smaller than that obtained with standard ε-SVR (0.0156). This illustrates that the PI-ε-SVR method is more stable than the standard ε-SVR method in predicting the last 20 PPVs. In addition, the running time of PI-ε-SVR is less than one minute, which shows that the running time of the model can meet the needs of PPV prediction in application.
The experiments show that the PPV prediction method incorporating priori information achieves higher prediction accuracy and better stability than the prediction method without priori information. That is to say, incorporating PI into the prediction of PPV may be a good way of increasing the prediction accuracy.
4. Conclusions and Future Directions
In this paper, a method of estimating the probability density of PPV with real monitored data is proposed, and the estimated probability density is found to provide the PI about the fluctuation of PPV between the maximal value and the minimal value in a certain period of time. The PI provided by the estimated probability density is then incorporated into the training data, and on this basis the PI-ε-SVR method for predicting PPV is proposed. In Table 2, the experiment results, including 20 predicted values and 20 mean squared errors, show that the proposed PI-ε-SVR is more accurate in the prediction of PPV than standard ε-SVR. In Table 3, the average mean squared errors of PI-ε-SVR and standard ε-SVR are 0.0117 and 0.0156, respectively, and the corresponding standard deviations are 0.0177 and 0.0199, respectively, which shows that PI-ε-SVR is more stable than standard ε-SVR. Therefore, incorporating PI into the prediction of PPV may be a good way of increasing the prediction accuracy.
Other factors, such as water pressure, blast design, geotechnical properties, and explosive parameters, also affect the prediction of PPVs. If the PI about these factors can be incorporated into the prediction of PPVs, the prediction accuracy may be further improved. Therefore, establishing a method that includes PI from the aspects of both monitored data and engineering practice is one of our future research directions.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61073121), the Application Basic Research Plan Key Basic Research Project of Hebei Province (15967663D), the Natural Science Foundation of Hebei Province of China (F2015402033), and the Natural Science Foundation of Hebei Education Department (QN2015116). Jiqiang Chen is also supported by the Youth Academic Backbone Project of Hebei University of Engineering. The authors also thank the anonymous reviewers for their constructive comments and suggestions.
References
1. B. Xie, H. Li, Y. Liu, X. Xia, and C. Yu, “Study of safety control of foundation pit excavation by blasting in Ningde nuclear power plant,” Chinese Journal of Rock Mechanics and Engineering, vol. 28, no. 8, pp. 1571–1578, 2009.
2. F.-W. Yang, H.-B. Li, Y.-Q. Liu, F. Zou, X. Xia, and Y.-F. Hao, “Comparative study on vibration characteristics of rock mass induced by bench blasting and pre-splitting blasting,” Journal of the China Coal Society, vol. 37, no. 8, pp. 1285–1291, 2012.
3. T. N. Singh and V. Singh, “An intelligent approach to prediction and control ground vibration in mines,” Geotechnical & Geological Engineering, vol. 23, no. 3, pp. 249–262, 2005.
4. M. Khandelwal and T. N. Singh, “Prediction of blast-induced ground vibration using artificial neural network,” International Journal of Rock Mechanics and Mining Sciences, vol. 46, no. 7, pp. 1214–1222, 2009.
5. Y.-Q. Liu, H.-B. Li, Q.-T. Pei, and W. Zhang, “Prediction of peak particle velocity induced by underwater blasting based on the combination of grey relational analysis and genetic neural network,” Rock and Soil Mechanics, vol. 34, no. 1, pp. 259–264, 2013.
6. T. N. Singh, R. Kanchan, A. K. Verma, and K. Saigal, “A comparative study of ANN and Neuro-fuzzy for the prediction of dynamic constant of rockmass,” Journal of Earth System Science, vol. 114, no. 1, pp. 75–86, 2005.
7. R. Singh, V. Vishal, and T. N. Singh, “Soft computing method for assessment of compressional wave velocity,” Scientia Iranica, vol. 19, no. 4, pp. 1018–1024, 2012.
8. R. Singh, V. Vishal, T. N. Singh, and P. G. Ranjith, “A comparative study of generalized regression neural network approach and adaptive neuro-fuzzy inference systems for prediction of unconfined compressive strength of rocks,” Neural Computing and Applications, vol. 23, no. 2, pp. 499–506, 2013.
9. H. Drucker, C. J. C. Burges, A. S. L. Kaufman, A. Smola, and V. Vapnik, “Support vector regression machines,” in Proceedings of the Advances in Neural Information Processing Systems (NIPS '97), vol. 9, pp. 155–161, May 1997.
10. V. Vapnik, S. E. Golowich, and A. Smola, “Support vector method for function approximation, regression estimation, and signal processing,” in Proceedings of the 10th Annual Conference on Neural Information Processing Systems (NIPS '96), vol. 9, pp. 281–287, December 1997.
11. P. Guan, D. Huang, M. He, and B. Zhou, “Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method,” Journal of Experimental and Clinical Cancer Research, vol. 28, article 103, 2009.
12. Z. Zhang, G. Dai, and M. I. Jordan, “Bayesian generalized kernel mixed models,” Journal of Machine Learning Research, vol. 12, pp. 111–139, 2011.
13. F. Q. Liu and X. P. Xue, “Design of natural classification kernels using prior knowledge,” IEEE Transactions on Fuzzy Systems, vol. 20, no. 1, pp. 135–152, 2012.
14. W. Astuti, R. Akmeliawati, W. Sediono, and M. J. E. Salami, “Hybrid technique using singular value decomposition (SVD) and support vector machine (SVM) approach for earthquake prediction,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 7, no. 5, pp. 1719–1728, 2014.
15. O. Chapelle, V. Sindhwani, and S. S. Keerthi, “Optimization techniques for semi-supervised support vector machines,” Journal of Machine Learning Research, vol. 9, pp. 203–233, 2008.
16. J. Q. Chen, W. Pedrycz, M. H. Ha, and L. T. Ma, “Set-valued samples based support vector regression and its applications,” Expert Systems with Applications, vol. 42, no. 5, pp. 2502–2509, 2015.
17. M. G. De Giorgi, S. Campilongo, A. Ficarella, and P. M. Congedo, “Comparison between wind power prediction models based on wavelet decomposition with least-squares support vector machine (LS-SVM) and artificial neural network (ANN),” Energies, vol. 7, no. 8, pp. 5251–5272, 2014.
18. M. H. Ha, C. Wang, Z. M. Zhang, and D. Z. Tian, Uncertainty Statistical Learning Theory, Science Press, Beijing, China, 1st edition, 2010.
19. M. Hajian, A. A. Foroud, and A. A. Abdoos, “Power transformer protection scheme based on MRA-SSVM,” Journal of Intelligent & Fuzzy Systems, vol. 27, no. 4, pp. 1659–1669, 2014.
20. M. Karasuyama and I. Takeuchi, “Multiple incremental decremental learning of support vector machines,” IEEE Transactions on Neural Networks, vol. 21, no. 7, pp. 1048–1059, 2010.
21. A. Kavousi-Fard, H. Samet, and F. Marzbani, “A new hybrid modified firefly algorithm and support vector regression model for accurate short term load forecasting,” Expert Systems with Applications, vol. 41, no. 13, pp. 6047–6056, 2014.
22. R. V. Maheswari, P. Subburaj, B. Vigneshwaran, and L. Kalaivani, “Non linear support vector machine based partial discharge patterns recognition using fractal features,” Journal of Intelligent and Fuzzy Systems, vol. 27, no. 5, pp. 2649–2664, 2014.
23. M. Narwaria and W. Lin, “Objective image quality assessment based on support vector regression,” IEEE Transactions on Neural Networks, vol. 21, no. 3, pp. 515–519, 2010.
24. C. Orsenigo and C. Vercellis, “Regularization through fuzzy discrete SVM with applications to customer ranking,” Journal of Intelligent and Fuzzy Systems, vol. 23, no. 4, pp. 101–110, 2012.
25. E. G. Ortiz-García, S. Salcedo-Sanz, Á. M. Pérez-Bellido, J. Gascón-Moreno, J. A. Portilla-Figueras, and L. Prieto, “Short-term wind speed prediction in wind farms based on banks of support vector machines,” Wind Energy, vol. 14, no. 2, pp. 193–207, 2011.
26. R. Sałat and S. Osowski, “Support vector machine for soft fault location in electrical circuits,” Journal of Intelligent and Fuzzy Systems, vol. 22, no. 1, pp. 21–31, 2011.
27. V. Srinivasan, G. Rajenderan, J. V. Kuzhali, and M. Aruna, “Fuzzy fast classification algorithm with hybrid of ID3 and SVM,” Journal of Intelligent & Fuzzy Systems, vol. 24, no. 3, pp. 555–561, 2013.
28. C. Suryanarayana, C. Sudheer, V. Mahammood, and B. K. Panigrahi, “An integrated wavelet-support vector machine for groundwater level prediction in Visakhapatnam, India,” Neurocomputing, vol. 145, pp. 324–335, 2014.
29. J. Y. Zhou, J. Shi, and G. Li, “Fine tuning support vector machines for short-term wind speed forecasting,” Energy Conversion and Management, vol. 52, no. 4, pp. 1990–1998, 2011.
Copyright © 2015 Litao Ma and Jiqiang Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.