Abstract and Applied Analysis
Volume 2014 (2014), Article ID 459137, 8 pages
Nonlinear Methodologies for Identifying Seismic Event and Nuclear Explosion Using Random Forest, Support Vector Machine, and Naive Bayes Classification
1School of Resources and Safety Engineering, Central South University, Changsha 410083, China
2School of Mechanical Engineering, Northwestern Polytechnical University, Xi’an, 710129, China
Received 26 December 2013; Accepted 16 January 2014; Published 26 February 2014
Academic Editor: Carlo Cattani
Copyright © 2014 Longjun Dong et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The discrimination of seismic events from nuclear explosions is a complex, nonlinear problem. Nonlinear methodologies, including Random Forests (RF), Support Vector Machines (SVM), and the Naïve Bayes Classifier (NBC), were applied to discriminate between seismic events. Twenty earthquakes and twenty-seven explosions, described by nine ratios of the energies contained within predetermined “velocity windows” and by the calculated distance, were used in the discriminators. The discriminating performances of RF, SVM, and NBC were discussed and compared based on leave-one-out cross-validation, the ROC curve, and the calculated accuracy for training and test samples. The RF method clearly shows the best predictive power, with a maximum area of 0.975 under the ROC curve among RF, SVM, and NBC. The discriminant accuracies of RF, SVM, and NBC for test samples are 92.86%, 85.71%, and 92.86%, respectively. The presented RF model can not only identify seismic events automatically with high accuracy but can also rank the discriminant indicators according to their calculated weights.
1. Introduction
The problems of seismic source location and identification are two of the most important and fundamental issues in earthquake monitoring, microseismic monitoring, analyses of active tectonics, and assessment of seismic hazards [1–4].
Seismic analysts distinguish earthquake signals from those of explosions or blasts by visual inspection and by calculating characteristics of the seismogram. Because recorded quarry blasts or nuclear explosions can mislead scientists interpreting active tectonics and lead to erroneous results in the analysis of seismic hazards in an area, event classification is an important step in seismic signal processing. This task analyzes the data to determine the class to which each recorded event belongs.
Such work imposes a heavy workload on seismic analysts. An automatic classification tool is therefore needed to reduce this arduous task dramatically, to make classification more reliable, and to remove errors associated with tedious evaluations and changes of personnel.
Most discrimination methods are designed for a particular source region and a particular distance of the recording station from the epicenter [5]. Some of them depend heavily on the heterogeneity of the uppermost crust, in the sense that they might be effective only for a given region.
The widely used methods for discriminators include simulating explosion spectra in order to predict spectral details indicative of explosions and not of earthquakes or single-event explosions [6, 7]; examining compressional and shear-wave ratios (amplitude and spectral) between all types of explosions and earthquakes, in an attempt to apply the basic physical conclusion that explosions excite more compressional waves than earthquakes relative to shear waves [8–11]; differences in high frequency S-to-P ratios between all types of explosions and earthquakes [12–14]; analyzing observed spectra of ripple-fired explosions, instantaneous explosions, and earthquakes and contrasting time-independent modulations, path-independent modulations, spectral ratios, spectral slopes, and spectral maxima and minima [15–17]; and examining differences in energy ratios of various wave in velocity windows [18, 19].
However, most of the methods above are based on a single index or on linear discriminant methods, and they seem to fail to capture the discontinuities, the nonlinearities, and the high complexity of the wave series.
Random Forests (RFs), Support Vector Machines (SVMs), and the Naive Bayes Classifier (NBC) provide enough learning capacity and are more likely to capture complex nonlinear models. They are widely used in many areas of science, including medicine, agriculture, and geotechnics.
To the best of our knowledge, RFs and SVMs have not previously been used for seismic classification, and the performance of RFs, SVMs, and NBC in this type of application has not been thoroughly compared.
In the present work, RF, SVM, and NBC were applied to discriminate between earthquakes and nuclear explosions. Their discriminating performances were discussed and compared based on leave-one-out cross-validation, the ROC curve, and test accuracy.
2. Materials and Methods
The measurements or parameters consist of ratios of the “high energies” contained within predetermined “velocity windows” on the seismograms. The choice of velocity windows is guided by the assumption that the earthquake source mechanism is extended in both time and space and generates a larger fraction of energy in shear waves than the explosion source mechanism does.
The different waves of the “velocity windows” are listed as follows:
(i) first arrival to 4.6 km/s;
(ii) arrival to 4.6 to 2.5 km/s;
(iii) first arrival to 4.9 km/s;
(iv) arrival to 4.9 to 2.0 km/s;
(v) arrival to 6.2 to 4.9 km/s;
(vi) arrival to 4.9 to 3.6 km/s;
(vii) arrival to 3.6 to 3.2 km/s;
(viii) arrival to 3.2 to 2.8 km/s; and
(ix) arrival to 2.8 to 2.5 km/s.
The factors, namely the nine energy ratios formed from these windows and the Average Distance, are denoted Ratio1, Ratio2, Ratio3, Ratio4, Ratio5, Ratio6, Ratio7, Ratio8, Ratio9, and AD, respectively.
Nine ratios of energies included within certain velocity windows have been computed for 20 earthquakes and 27 nuclear explosions by Booker and Mitronovas [18]. All seismograms were recorded by the VELA UNIFORM LRSM Network on short-period Benioff instruments. Ratio1, Ratio2, Ratio3, Ratio4, Ratio5, Ratio6, Ratio7, Ratio8, Ratio9, and AD were selected as discriminant indicators. The z-score is used to standardize variables in this work. First, the mean is subtracted from the value for each case, resulting in a mean of zero. Then, the difference between the individual score and the mean is divided by the standard deviation, which results in a standard deviation of one. If we start with a variable $x$ and generate a variable $z$, the process is
$$z = \frac{x - \bar{x}}{s},$$
where $\bar{x}$ is the mean and $s$ is the standard deviation of $x$.
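The standardization step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `z_scores` and the sample values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit standard deviation."""
    mu = mean(values)
    # Assumption: population standard deviation; the paper does not specify.
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]


# Hypothetical values of one indicator, not taken from the paper's data.
example = [0.8, 1.1, 1.5, 2.0, 2.6]
standardized = z_scores(example)
```

After the transformation, `standardized` has mean zero and standard deviation one (up to floating-point error), so indicators measured on different scales become comparable.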
Box plots of the energy ratios and of the distance are shown in Figures 1 and 2, respectively. Each group is represented as a box whose top and bottom are drawn at the upper and lower quartiles, with a small square at the median. Thus, the box contains the middle half of the scores in the distribution. Vertical lines outside the box extend to the largest and the smallest observations within 1.5 interquartile ranges. We conclude that Ratio1, Ratio2, Ratio3, Ratio4, Ratio5, Ratio6, Ratio7, Ratio8, Ratio9, and AD are clearly different for earthquakes and nuclear explosions. It is therefore reasonable to select these ten factors as discriminating indicators.
The first 70% of the earthquake and nuclear explosion records were used to establish the discriminating models, and the remaining 30% were used to test them.
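As a sketch of how such a split might look, the snippet below takes the first 70% of each class for training. Applying the split per class and rounding to the nearest integer are assumptions (the text does not state either), but they reproduce a training set of 14 earthquakes and 19 explosions, which agrees with the prior probabilities of 0.424 and 0.576 reported for the Naive Bayes model. The record identifiers are placeholders.

```python
def split_first(items, train_fraction=0.7):
    """Return (train, test) using the first train_fraction of items for training."""
    n_train = round(train_fraction * len(items))
    return items[:n_train], items[n_train:]


# Placeholder identifiers standing in for the 20 earthquake and 27 explosion records.
earthquakes = [f"eq{i}" for i in range(1, 21)]
explosions = [f"ex{i}" for i in range(1, 28)]

eq_train, eq_test = split_first(earthquakes)
ex_train, ex_test = split_first(explosions)
train = eq_train + ex_train
test = eq_test + ex_test
```

Under these assumptions the training set holds 33 records and the test set 14.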
2.2.1. Overview of Random Forest
Random Forest (RF), a metalearner comprising many individual trees, was first developed by Tin Kam Ho in 1995 and later improved by Breiman in 2001. It was designed to operate quickly over large datasets and to be diverse by using random samples to build each tree in the forest. Each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Comprehensive reviews of applications of Random Forest have been provided by Rodriguez-Galiano et al. [21], Granitto et al. [22], and Genuer et al. [23]. A number of studies have also compared the performance of Random Forest with that of other data mining techniques on different kinds of problems [23–26]. The theory of RF is summarized as follows.
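A Random Forest is a classifier consisting of a collection of tree-structured classifiers $\{h(\mathbf{x}, \Theta_k),\ k = 1, \ldots\}$, where the $\{\Theta_k\}$ are independent identically distributed random vectors and each tree casts a unit vote for the most popular class at input $\mathbf{x}$.
Given an ensemble of classifiers $h_1(\mathbf{x}), h_2(\mathbf{x}), \ldots, h_K(\mathbf{x})$ and with the training set drawn at random from the distribution of the random vector $Y, \mathbf{X}$, define the margin function as
$$mg(\mathbf{X}, Y) = \mathrm{av}_k\, I(h_k(\mathbf{X}) = Y) - \max_{j \neq Y} \mathrm{av}_k\, I(h_k(\mathbf{X}) = j),$$
where $I(\cdot)$ is the indicator function. The margin measures the extent to which the average number of votes at $\mathbf{X}, Y$ for the right class exceeds the average vote for any other class. The larger the margin, the more confidence in the classification. The generalization error is given by
$$PE^{*} = P_{\mathbf{X}, Y}\big(mg(\mathbf{X}, Y) < 0\big),$$
where the subscripts $\mathbf{X}, Y$ indicate that the probability is over the $\mathbf{X}, Y$ space. In Random Forests, $h_k(\mathbf{X}) = h(\mathbf{X}, \Theta_k)$. For a large number of trees, the following result follows from the Strong Law of Large Numbers and the tree structure.
As the number of trees increases, for almost surely all sequences $\Theta_1, \Theta_2, \ldots$, $PE^{*}$ converges to
$$P_{\mathbf{X}, Y}\Big(P_{\Theta}\big(h(\mathbf{X}, \Theta) = Y\big) - \max_{j \neq Y} P_{\Theta}\big(h(\mathbf{X}, \Theta) = j\big) < 0\Big).$$
The margin function for a Random Forest is
$$mr(\mathbf{X}, Y) = P_{\Theta}\big(h(\mathbf{X}, \Theta) = Y\big) - \max_{j \neq Y} P_{\Theta}\big(h(\mathbf{X}, \Theta) = j\big),$$
and the strength of the set of classifiers $\{h(\mathbf{x}, \Theta)\}$ is
$$s = E_{\mathbf{X}, Y}\, mr(\mathbf{X}, Y).$$
Assuming that $s \geq 0$, Chebychev’s inequality gives
$$PE^{*} \leq \frac{\mathrm{var}(mr)}{s^{2}}.$$
A more revealing expression for the variance of $mr$ is derived in the following. Let
$$\hat{j}(\mathbf{X}, Y) = \arg\max_{j \neq Y} P_{\Theta}\big(h(\mathbf{X}, \Theta) = j\big),$$
so
$$mr(\mathbf{X}, Y) = P_{\Theta}\big(h(\mathbf{X}, \Theta) = Y\big) - P_{\Theta}\big(h(\mathbf{X}, \Theta) = \hat{j}(\mathbf{X}, Y)\big) = E_{\Theta}\Big[I\big(h(\mathbf{X}, \Theta) = Y\big) - I\big(h(\mathbf{X}, \Theta) = \hat{j}(\mathbf{X}, Y)\big)\Big].$$
The raw margin function is
$$rmg(\Theta, \mathbf{X}, Y) = I\big(h(\mathbf{X}, \Theta) = Y\big) - I\big(h(\mathbf{X}, \Theta) = \hat{j}(\mathbf{X}, Y)\big).$$
Thus, $mr(\mathbf{X}, Y)$ is the expectation of $rmg(\Theta, \mathbf{X}, Y)$ with respect to $\Theta$. For any function $f$ the identity
$$\big[E_{\Theta} f(\Theta)\big]^{2} = E_{\Theta, \Theta'}\, f(\Theta) f(\Theta')$$
holds, where $\Theta, \Theta'$ are independent with the same distribution, implying that
$$mr(\mathbf{X}, Y)^{2} = E_{\Theta, \Theta'}\, rmg(\Theta, \mathbf{X}, Y)\, rmg(\Theta', \mathbf{X}, Y). \tag{12}$$
Using (12) gives
$$\mathrm{var}(mr) = E_{\Theta, \Theta'}\Big(\mathrm{cov}_{\mathbf{X}, Y}\, rmg(\Theta, \mathbf{X}, Y)\, rmg(\Theta', \mathbf{X}, Y)\Big) = E_{\Theta, \Theta'}\big(\rho(\Theta, \Theta')\, sd(\Theta)\, sd(\Theta')\big),$$
where $\rho(\Theta, \Theta')$ is the correlation between $rmg(\Theta, \mathbf{X}, Y)$ and $rmg(\Theta', \mathbf{X}, Y)$ holding $\Theta, \Theta'$ fixed and $sd(\Theta)$ is the standard deviation of $rmg(\Theta, \mathbf{X}, Y)$ holding $\Theta$ fixed. Then,
$$\mathrm{var}(mr) = \bar{\rho}\, \big(E_{\Theta}\, sd(\Theta)\big)^{2} \leq \bar{\rho}\, E_{\Theta}\, \mathrm{var}(\Theta),$$
where $\bar{\rho}$ is the mean value of the correlation; that is,
$$\bar{\rho} = \frac{E_{\Theta, \Theta'}\big(\rho(\Theta, \Theta')\, sd(\Theta)\, sd(\Theta')\big)}{E_{\Theta, \Theta'}\big(sd(\Theta)\, sd(\Theta')\big)}.$$
Write
$$E_{\Theta}\, \mathrm{var}(\Theta) \leq E_{\Theta}\big(E_{\mathbf{X}, Y}\, rmg(\Theta, \mathbf{X}, Y)\big)^{2} - s^{2} \leq 1 - s^{2}.$$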
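To make the margin concept concrete, the following sketch computes an empirical margin for a single input from a list of tree votes: the share of votes for the true class minus the largest share for any other class. The function name and the vote lists are hypothetical and serve only as an illustration of the definition, not as the authors' implementation.

```python
from collections import Counter


def margin(votes, true_label):
    """Share of votes for the true class minus the largest share for another class."""
    counts = Counter(votes)
    total = len(votes)
    right = counts.get(true_label, 0) / total
    wrong = max(
        (count / total for label, count in counts.items() if label != true_label),
        default=0.0,
    )
    return right - wrong


# Hypothetical votes of five trees for one seismic record.
votes = ["earthquake", "earthquake", "explosion", "earthquake", "explosion"]
m_correct = margin(votes, "earthquake")  # positive: the forest votes for the true class
m_wrong = margin(votes, "explosion")     # negative: the forest votes against the true class
```

A positive margin means the forest classifies the input correctly; averaging the margin over many inputs gives an empirical counterpart of the strength of the forest.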
In this work, an RF model for discriminating between natural earthquakes and nuclear explosions is established with the number of trees (NT) set to 5000 and 8 variables tried at each node. In the developed RF model, the calculated weights of Ratio1, Ratio2, Ratio3, Ratio4, Ratio5, Ratio6, Ratio7, Ratio8, Ratio9, and AD are 1.2713, 0.1034, 0.0759, 0.3093, 0.3432, 0.1782, 0.2536, 0.0943, 0.2463, and 0.1512, respectively.
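The ranking of indicators implied by these weights can be reproduced by sorting them in descending order. The weight values below are those reported in the text; the dictionary layout and variable names are illustrative only.

```python
# Weights of the discriminant indicators as reported for the RF model.
weights = {
    "Ratio1": 1.2713,
    "Ratio2": 0.1034,
    "Ratio3": 0.0759,
    "Ratio4": 0.3093,
    "Ratio5": 0.3432,
    "Ratio6": 0.1782,
    "Ratio7": 0.2536,
    "Ratio8": 0.0943,
    "Ratio9": 0.2463,
    "AD": 0.1512,
}

# Indicators ordered from most to least important.
ranking = sorted(weights, key=weights.get, reverse=True)
```

Sorting gives Ratio1 first, followed by Ratio5, Ratio4, Ratio7, Ratio9, Ratio6, AD, Ratio2, Ratio8, and Ratio3.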
2.2.2. SVM Algorithm
The original SVM algorithm was invented by Vladimir N. Vapnik, and the current standard incarnation (soft margin) was proposed by Cortes and Vapnik in 1995 [27].
SVM models were originally defined for the classification of linearly separable classes of objects. For any separable set of two-class objects, an SVM finds the optimal hyperplane that separates them with the largest margin between the two classes. SVMs can also be used to separate classes that cannot be separated with a linear classifier.
In that case, every object is projected into a high-dimensional feature space in which the two classes can be separated with a linear classifier. The effectiveness of an SVM depends on the selection of the kernel, the kernel's parameters, and the soft margin parameter C.
In the present work we used the Radial Basis Function (RBF) as the kernel function for the SVM models because of its efficiency in providing very high performance classification results. The optimal parameters C and gamma were found to be 9 and 0.6, respectively, chosen so that the model does not overfit.
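For illustration, the RBF kernel value between two standardized feature vectors can be computed as below. The form exp(-gamma * ||x - y||^2) is the common parameterization and is an assumption here, since the text does not spell out the kernel formula; the function name and sample vectors are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.6):
    """RBF kernel exp(-gamma * squared Euclidean distance between x and y)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical standardized feature vectors (two indicators each).
k_same = rbf_kernel([0.5, -1.0], [0.5, -1.0])  # identical inputs give 1.0
k_near = rbf_kernel([0.0, 0.0], [1.0, 0.0])    # exp(-0.6)
k_far = rbf_kernel([0.0, 0.0], [3.0, 0.0])     # much closer to 0
```

The kernel equals one for identical inputs and decays toward zero as the inputs move apart; a larger gamma makes the decay faster and the decision boundary more flexible.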
2.2.3. Naive Bayes Classifier
The Naive Bayes Classifier produces a very efficient probability estimate based on a simple structure, requiring only a small amount of training data to estimate the parameters necessary for classification. Its construction relies on two main assumptions: independence of features and absence of hidden or latent attributes.
An advantage of Naive Bayes is that it only requires a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification. Because independent variables are assumed, only the variances of the variables for each class need to be determined and not the entire covariance matrix.
The aim of the NBC, as with other classifiers, is to assign an object to one of a discrete set of categories based on its observable attributes. The NBC calculates the probability that the object belongs to each category, conditioning on the observed attributes; the object is typically assigned to the category with the greatest probability. This classifier is naive in the sense that it makes the strong assumption that the attributes are mutually conditionally independent; that is, the conditional probability that the object belongs to a particular class given the value of some attribute is independent of the values of all other attributes. Despite this unrealistic assumption, empirical studies demonstrate that it does not necessarily compromise the accuracy of the prediction significantly, and NBCs are used in a variety of applications, including document classification [28], medical diagnosis [29, 30], systems performance management [31], probability classification of rockburst [32], and other fields. Domingos and Pazzani [33] prove the optimality of the NBC under certain conditions even when the conditional independence assumption is violated.
In this paper, the prior probabilities of natural earthquakes and nuclear explosions are calculated according to the size of the data. The prior probabilities of earthquakes and nuclear explosions are 0.424 and 0.576, respectively.
A discriminant function is evaluated for each of the two classes, earthquake and nuclear explosion. If the value of the earthquake function is the larger of the two, the record is classified as an earthquake; otherwise it is classified as a nuclear event.
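The decision rule can be sketched as a Gaussian Naive Bayes classifier: each class receives a score equal to the log prior plus the sum of per-feature Gaussian log-likelihoods, and the record goes to the class with the larger score. This is a sketch under stated assumptions, not the paper's fitted model: the Gaussian likelihood is assumed from the mention of means and variances, the priors 14/33 and 19/33 are inferred from the reported 0.424 and 0.576, and the means, variances, and test vectors are invented for illustration.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def log_posterior(features, prior, means, variances):
    """Unnormalized log posterior under the conditional independence assumption."""
    return math.log(prior) + sum(
        log_gaussian(x, m, v) for x, m, v in zip(features, means, variances)
    )


def classify(features, classes):
    """Return the class name with the largest unnormalized log posterior."""
    return max(classes, key=lambda name: log_posterior(features, *classes[name]))


# (prior, per-feature means, per-feature variances); means and variances are hypothetical.
classes = {
    "earthquake": (14 / 33, [-1.0, 0.5], [1.0, 1.0]),
    "nuclear": (19 / 33, [1.0, -0.5], [1.0, 1.0]),
}

label_a = classify([-0.9, 0.4], classes)
label_b = classify([1.1, -0.4], classes)
```

Working in log space avoids numerical underflow when many indicators are multiplied together, and only per-class means and variances are needed, not a full covariance matrix.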
2.2.4. Classification Performance
The receiver operating characteristic (ROC) curve is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the positives (TPR = true positive rate) versus the fraction of false positives out of the negatives (FPR = false positive rate) at various threshold settings.
ROC analysis provides tools to select possibly optimal models and to discard suboptimal ones independently from (and prior to specifying) the cost context or the class distribution. ROC analysis is related in a direct and natural way to cost/benefit analysis of diagnostic decision making.
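A minimal sketch of how ROC points and the area under the curve can be obtained from classifier scores is given below. The threshold is lowered through the sorted scores, tied scores are processed together, and the area is computed with the trapezoidal rule. The function names, scores, and labels are hypothetical and are not the paper's data.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points as the threshold is lowered; labels are 1 or 0."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process all records sharing this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores (higher means more likely positive) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
area = auc(roc_points(scores, labels))
```

For these hypothetical values, eight of the nine positive-negative pairs are ranked correctly, so the area is 8/9; a perfect ranking gives 1.0 and random guessing about 0.5.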
In this study, the discrimination of seismic events from nuclear explosions was treated as a two-class prediction problem (binary classification), in which the outcomes were labeled either as positive ($p$, events) or negative ($n$, blasts). There are four possible outcomes from a binary classifier. If the outcome from a prediction is $p$ and the actual value is also $p$, then it is called a true positive (TP); however, if the actual value is $n$, then it is said to be a false positive (FP). Conversely, a true negative (TN) has occurred when both the prediction outcome and the actual value are $n$, and a false negative (FN) has occurred when the prediction outcome is $n$ while the actual value is $p$.
Consider, for instance, an experiment with $P$ positive instances and $N$ negative instances. The four outcomes can be formulated in a 2 × 2 contingency table or confusion matrix, as shown in Table 3.
The specificity or true negative rate (TNR) is defined as the percentage of blast records that are correctly identified as blasts:
$$\mathrm{TNR} = \frac{TN}{N} = \frac{TN}{FP + TN}.$$
The quantity 1 − specificity is the false positive rate (FPR), the percentage of blast records that are incorrectly identified as events. The sensitivity or true positive rate (TPR) is defined as the percentage of event records that are correctly identified as events:
$$\mathrm{TPR} = \frac{TP}{P} = \frac{TP}{TP + FN}.$$
The accuracy (ACC) can be expressed as
$$\mathrm{ACC} = \frac{TP + TN}{P + N}.$$
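These rates follow directly from the four cells of the confusion matrix, as the short sketch below shows. The standard definitions TPR = TP/(TP + FN), TNR = TN/(TN + FP), FPR = 1 − TNR, and ACC = (TP + TN)/(P + N) are used; the counts are hypothetical and are not taken from the paper's tables.

```python
def rates(tp, fp, tn, fn):
    """Compute sensitivity, specificity, false positive rate, and accuracy."""
    return {
        "TPR": tp / (tp + fn),                   # sensitivity
        "TNR": tn / (tn + fp),                   # specificity
        "FPR": fp / (fp + tn),                   # 1 - specificity
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # overall accuracy
    }


# Hypothetical confusion-matrix counts for illustration only.
result = rates(tp=8, fp=1, tn=9, fn=2)
```

With these counts, sensitivity is 0.8, specificity 0.9, the false positive rate 0.1, and accuracy 0.85.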
3. Results and Discussions
The back-test classification of the training samples is calculated using the established models. The back-test accuracies of RF, SVM, and NBC for the training samples are 100%, 100%, and 96.97%, respectively. Leave-one-out cross-validation was used to validate the methods. Results show that the accuracies of RF, SVM (RBF), SVM (linear), and NBC are 100%, 96.97%, and 84.88%, respectively.
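The validation procedure can be sketched as follows: each record is held out in turn, the model is fitted on the remaining records, and the held-out record is predicted. To keep the example self-contained, a simple nearest-centroid classifier stands in for RF, SVM, and NBC; that substitution, the function names, and the toy data are all assumptions made for illustration.

```python
def fit_centroids(X, y):
    """Fit a nearest-centroid model: the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(centroids, row):
    """Predict the class whose centroid is closest to row."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def leave_one_out_accuracy(X, y, fit, predict):
    """Hold out each record once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        if predict(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy one-dimensional data, not the seismic dataset.
X = [[0.0], [0.2], [0.1], [1.0], [1.2], [0.9]]
y = [0, 0, 0, 1, 1, 1]
loo_accuracy = leave_one_out_accuracy(X, y, fit_centroids, predict_centroid)
```

Because the held-out record never influences the model that predicts it, the resulting accuracy is a less optimistic estimate than the back-test on the training samples.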
The ROC curve is also used to verify and compare the discriminating performance of established models. The established RF model, SVM model, and NBC model were applied to both the training and test samples. The ROC curve is shown in Figure 3. The area under the curve is listed in Table 4. The classification results of test samples using all developed models are presented in Table 5.
In Figure 3, the closer a result from a contingency table is to the upper left corner, the better it predicts, but the distance from the random guess line in either direction or area under curve is the best indicator of how much predictive power a method has.
As shown in Figure 3 and Table 4, the result of RF method clearly shows the best predictive power with a maximum area of 0.975 among RF, SVM, and NBC. The result of SVM (area: 0.963) is better than that of NBC (area: 0.956).
According to Table 5, the discriminant accuracies of RF, SVM, and NBC for the test samples are 92.86%, 85.71%, and 92.86%, respectively. From the back-test results, leave-one-out cross-validation, the ROC curve, and the test results, we conclude that the RF discriminant model has the best accuracy and discriminant ability. According to the weights from RF, the most important factor is Ratio1, followed by Ratio5, Ratio4, Ratio7, Ratio9, Ratio6, AD, Ratio2, Ratio8, and Ratio3.
4. Conclusions
RF, SVM, and NBC were applied to seismic event identification. A thorough investigation of the discrimination capabilities of the techniques was undertaken using seismograms from 20 earthquakes and 27 nuclear explosions. Nine energy ratios within certain velocity windows, as well as the average distance, were selected as discriminant indicators.
The classification performance of RF, SVM, and NBC was analyzed and compared based on the back test of training samples, leave-one-out cross-validation, and the ROC curve. The RF method clearly shows the best predictive power, with a maximum ROC area of 0.975 among RF, SVM, and NBC. The result of SVM (area: 0.963) is better than that of NBC (area: 0.956). Test results show that the discriminant accuracies of RF, SVM, and NBC are 92.86%, 85.71%, and 92.86%, respectively.
From the back-test results, leave-one-out cross-validation, the ROC curve, and the test results, we conclude that the RF discriminant model has the best accuracy and discriminant ability. The RF discriminant method can not only be applied to seismic identification with high accuracy but can also rank the discriminant indicators by weight. In this study, the most important factor is Ratio1, followed by Ratio5, Ratio4, Ratio7, Ratio9, Ratio6, AD, Ratio2, Ratio8, and Ratio3.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors gratefully acknowledge the financial support of the National Natural Science Foundation of China (50934006 and 41272304), National Basic Research (973) Program of China (2010CB732004), China Scholarship Council (CSC), Scholarship Award for Excellent Doctoral Student from Ministry of Education of China (105501010), and Support Program for Cultivating Excellent Ph.D. Thesis of Central South University.
References
[1] L. Dong and X. Li, “A microseismic/acoustic emission source location method using arrival times of PS waves for unknown velocity system,” International Journal of Distributed Sensor Networks, vol. 2013, Article ID 307489, 8 pages, 2013.
[2] L. Dong and X. Li, “Three-dimensional analytical solution of acoustic emission or microseismic source location under cube monitoring network,” Transactions of Nonferrous Metals Society of China, vol. 22, no. 12, pp. 3087–3094, 2012.
[3] X. B. Li and L. J. Dong, “Comparison of two methods in acoustic emission source location using four sensors without measuring sonic speed,” Sensor Letters, vol. 9, no. 5, pp. 2025–2029, 2011.
[4] L. Dong and X. Li, “Hypocenter relocation for Wenchuan Ms 8.0 and Lushan Ms 7.0 earthquakes using TT and TD methods,” Disaster Advances, vol. 6, no. 13, pp. 304–313, 2013.
[5] Y. Gitterman, V. Pinsky, and A. Shapira, “Spectral discrimination analysis of Eurasian nuclear tests and earthquakes recorded by the Israel Seismic Network and the NORESS array,” Physics of the Earth and Planetary Interiors, vol. 113, no. 1–4, pp. 111–129, 1999.
[6] S. J. Arrowsmith, M. D. Arrowsmith, M. A. H. Hedlin, and B. Stump, “Discrimination of delay-fired mine blasts in Wyoming using an automatic time-frequency discriminant,” Bulletin of the Seismological Society of America, vol. 96, no. 6, pp. 2368–2382, 2006.
[7] A. J. Mendecki, Seismic Monitoring in Mines, Chapman & Hall, 1996.
[8] J. Wuster, “Discrimination of chemical explosions and earthquakes in central Europe—a case study,” Bulletin of the Seismological Society of America, vol. 83, no. 4, pp. 1184–1212, 1993.
[9] R. Blandford, “Discrimination between earthquakes and underground explosions,” Annual Review of Earth and Planetary Sciences, vol. 5, p. 111, 1977.
[10] A. T. Smith, “Discrimination of explosions from simultaneous mining blasts,” Bulletin of the Seismological Society of America, vol. 83, no. 1, pp. 160–179, 1993.
[11] D. R. Baumgardt and G. B. Young, “Regional seismic waveform discriminants and case-based event identification using regional arrays,” Bulletin of the Seismological Society of America, vol. 80, no. 6, pp. 1874–1892, 1990.
[12] S. R. Taylor, M. D. Denny, E. S. Vergino, and R. E. Glaser, “Regional discrimination between NTS explosions and western US earthquakes,” Bulletin of the Seismological Society of America, vol. 79, no. 4, pp. 1142–1176, 1989.
[13] S. Taylor, M. Denny, and E. Vergino, “Regional mb:Ms discrimination of NTS explosions and western United States earthquakes,” Progress Report, Lawrence Livermore National Laboratory, Livermore, Calif, USA, 1986.
[14] S. G. Kim, Y. Park, and W. Kim, “Discrimination of small earthquakes and artificial explosions in the Korean Peninsula using Pg/Lg ratios,” Geophysical Journal International, vol. 134, no. 1, pp. 267–276, 1998.
[15] D. R. Baumgardt and K. A. Ziegler, “Spectral evidence for source multiplicity in explosions: application to regional discrimination of earthquakes and explosions,” Bulletin of the Seismological Society of America, vol. 78, no. 5, pp. 1773–1795, 1988.
[16] M. A. H. Hedlin, J. B. Minster, and J. A. Orcutt, “An automatic means to discriminate between earthquakes and quarry blasts,” Bulletin of the Seismological Society of America, vol. 80, no. 6, pp. 2143–2160, 1990.
[17] Y. Gitterman and T. van Eck, “Spectra of quarry blasts and microearthquakes recorded at local distances in Israel,” Bulletin of the Seismological Society of America, vol. 83, no. 6, pp. 1799–1812, 1993.
[18] A. Booker and W. Mitronovas, “An application of statistical discrimination to classify seismic events,” Bulletin of the Seismological Society of America, vol. 54, no. 3, pp. 961–971, 1964.
[19] L. Dong, X. Li, C. Ma, and W. Zhu, “Comparisons of Logistic regression and Fisher discriminant classifier to seismic event identification,” Disaster Advances, vol. 6, supplement 4, pp. 1–8, 2013.
[20] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[21] V. F. Rodriguez-Galiano, B. Ghimire, J. Rogan, M. Chica-Olmo, and J. P. Rigol-Sanchez, “An assessment of the effectiveness of a random forest classifier for land-cover classification,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 67, no. 1, pp. 93–104, 2012.
[22] P. M. Granitto, F. Gasperi, F. Biasioli, E. Trainotti, and C. Furlanello, “Modern data mining tools in descriptive sensory analysis: a case study with a Random forest approach,” Food Quality and Preference, vol. 18, no. 4, pp. 681–689, 2007.
[23] R. Genuer, J. Poggi, and C. Tuleau-Malot, “Variable selection using random forests,” Pattern Recognition Letters, vol. 31, no. 14, pp. 2225–2236, 2010.
[24] L. J. Dong, X. B. Li, and K. Peng, “Prediction of rockburst classification using Random Forest,” Transactions of Nonferrous Metals Society of China, vol. 23, no. 2, pp. 472–477, 2013.
[25] L. Dong and X. Li, “Comprehensive models for evaluating rockmass stability based on statistical comparisons of multiple classifiers,” Mathematical Problems in Engineering, vol. 2013, Article ID 395096, 10 pages, 2013.
[26] L. J. Dong, X. B. Li, M. Xu, and Q. Li, “Comparisons of random forest and Support Vector Machine for predicting blasting vibration characteristic parameters,” Procedia Engineering, vol. 26, pp. 1772–1781, 2011.
[27] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[28] M. E. Maron, “Automatic indexing: an experimental inquiry,” Journal of the ACM, vol. 8, no. 3, pp. 404–417, 1961.
[29] I. Kononenko, “Inductive and Bayesian learning in medical diagnosis,” Applied Artificial Intelligence, vol. 7, no. 4, pp. 317–337, 1993.
[30] P. Berchialla, F. Foltran, and D. Gregori, “Naïve Bayes classifiers with feature selection to predict hospitalization and complications due to objects swallowing and ingestion among European children,” Safety Science, vol. 51, no. 1, pp. 1–5, 2013.
[31] R. Powers, M. Goldszmidt, and I. Cohen, “Short term performance forecasting in enterprise systems,” in Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '05), pp. 801–807, ACM, August 2005.
[32] Y. Fu and L. Dong, “Bayes discriminant analysis model and its application to the prediction and classification of rockburst,” Journal of China University of Mining and Technology, vol. 38, no. 4, pp. 56–64, 2009.
[33] P. Domingos and M. Pazzani, “On the optimality of the simple Bayesian classifier under zero-one loss,” Machine Learning, vol. 29, no. 2-3, pp. 103–130, 1997.
[34] A. P. Bradley, “The use of the area under the ROC curve in the evaluation of machine learning algorithms,” Pattern Recognition, vol. 30, no. 7, pp. 1145–1159, 1997.