Abstract and Applied Analysis

Special Issue

Scaling, Self-Similarity, and Systems of Fractional Order


Research Article | Open Access

Volume 2014 | Article ID 459137 | https://doi.org/10.1155/2014/459137

Longjun Dong, Xibing Li, Gongnan Xie, "Nonlinear Methodologies for Identifying Seismic Event and Nuclear Explosion Using Random Forest, Support Vector Machine, and Naive Bayes Classification", Abstract and Applied Analysis, vol. 2014, Article ID 459137, 8 pages, 2014. https://doi.org/10.1155/2014/459137

Nonlinear Methodologies for Identifying Seismic Event and Nuclear Explosion Using Random Forest, Support Vector Machine, and Naive Bayes Classification

Academic Editor: Carlo Cattani
Received 26 Dec 2013
Accepted 16 Jan 2014
Published 26 Feb 2014

Abstract

Discriminating between seismic events and nuclear explosions is a complex, nonlinear problem. Three nonlinear methods, Random Forests (RF), Support Vector Machines (SVM), and the Naïve Bayes Classifier (NBC), were applied to discriminate seismic events. Twenty earthquakes and twenty-seven explosions, each described by nine ratios of the energies contained within predetermined “velocity windows” and by the calculated distance, were used to build the discriminators. The discriminating performances of RF, SVM, and NBC were discussed and compared on the basis of leave-one-out cross-validation, ROC curves, and the calculated accuracy on training and test samples. The RF method clearly shows the best predictive power, with the largest area under the ROC curve (0.975) among RF, SVM, and NBC. The discriminant accuracies of RF, SVM, and NBC on the test samples are 92.86%, 85.71%, and 92.86%, respectively. The presented RF model can not only identify seismic events automatically with high accuracy but also rank the discriminant indicators by their calculated weights.

1. Introduction

Seismic source location and identification are two of the most important and fundamental problems in earthquake monitoring, microseismic monitoring, analyses of active tectonics, and assessment of seismic hazards [1–4].

Seismic analysts distinguish earthquake signals from those of explosions or blasts by visual inspection and by calculating characteristics of the seismogram. Recorded quarry blasts or nuclear explosions can mislead scientists interpreting active tectonics and lead to erroneous results in the analysis of seismic hazards in the area, so event classification is an important step in seismic signal processing. The task is to analyse the data in order to determine the class to which each recorded event belongs.

Such work places a heavy workload on seismic analysts. An automatic classification tool is therefore needed to reduce this arduous task dramatically, to make classification more reliable, and to remove errors associated with tedious evaluations and changes of personnel.

Most discrimination methods are designed for a particular source region and a particular distance of the recording station from the epicenter [5]. Some of them heavily depend on the heterogeneity of the uppermost crust in the sense that they might be effective only for a given region.

Widely used discrimination methods include simulating explosion spectra in order to predict spectral details indicative of explosions and not of earthquakes or single-event explosions [6, 7]; examining compressional and shear-wave ratios (amplitude and spectral) between all types of explosions and earthquakes, in an attempt to apply the basic physical conclusion that explosions excite more compressional waves than earthquakes relative to shear waves [8–11]; examining differences in high-frequency S-to-P ratios between all types of explosions and earthquakes [12–14]; analyzing observed spectra of ripple-fired explosions, instantaneous explosions, and earthquakes and contrasting time-independent modulations, path-independent modulations, spectral ratios, spectral slopes, and spectral maxima and minima [15–17]; and examining differences in the energy ratios of various waves in velocity windows [18, 19].

However, most of the methods above are based on a single index or on linear discriminant methods, and they seem to fail to capture the discontinuities, nonlinearities, and high complexity of the wave series.

Random Forests (RFs), Support Vector Machines (SVMs), and the Naive Bayes Classifier (NBC) provide enough learning capacity and are more likely to capture complex nonlinear models. They are widely used in many scientific areas, including medicine, agriculture, and geotechnics.

To our knowledge, RFs and SVMs have not so far been used for seismic classification, and the performance of RFs, SVMs, and NBC in this type of application has not been thoroughly compared.

In the present work, RF, SVM, and NBC were applied to discriminate between earthquakes and nuclear explosions. Their discriminating performances were discussed and compared on the basis of leave-one-out cross-validation, ROC curves, and test accuracy.
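The evaluation procedure named here can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a Gaussian naive Bayes model (the Gaussian likelihood is an assumption, not something the text specifies), scores each sample with a model trained on all the other samples, and computes the area under the ROC curve from ranks. All function names and the toy data are hypothetical; the data are synthetic and are not the paper's measurements.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(X, y):
    """Return {label: (prior, feature means, feature variances)}."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        model[label] = (
            len(rows) / len(X),
            [mean(c) for c in cols],
            [pvariance(c) + 1e-9 for c in cols],  # small floor avoids zero variance
        )
    return model


def log_posterior(model, x, label):
    """Unnormalised log posterior of `label` for feature vector `x`."""
    prior, mus, variances = model[label]
    total = math.log(prior)
    for v, mu, var in zip(x, mus, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
    return total


def leave_one_out_scores(X, y):
    """Score each sample with a model trained on all the other samples.

    Assumes labels 0 and 1 and at least two samples per class.
    """
    scores = []
    for i in range(len(X)):
        model = fit_gaussian_nb(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(log_posterior(model, X[i], 1) - log_posterior(model, X[i], 0))
    return scores


def roc_auc(scores, y):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic, well-separated toy data (0 = one class, 1 = the other).
X_toy = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
y_toy = [0, 0, 0, 1, 1, 1]
toy_scores = leave_one_out_scores(X_toy, y_toy)
toy_auc = roc_auc(toy_scores, y_toy)
```

A positive score favours class 1 and a negative score favours class 0, so thresholding the held-out scores at zero gives the leave-one-out predictions, while the same scores feed the ROC analysis.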

2. Materials and Methods

2.1. Materials

The measurements or parameters consist of ratios of the “high energies” contained within predetermined “velocity windows” on the seismograms [18]. The choice of velocity windows is guided by the assumption that the earthquake source mechanism is extended in both time and space and generates a larger fraction of its energy in shear waves than the explosion source mechanism does.

The “velocity windows” are as follows:
(i) first arrival to 4.6 km/s;
(ii) arrivals from 4.6 to 2.5 km/s;
(iii) first arrival to 4.9 km/s;
(iv) arrivals from 4.9 to 2.0 km/s;
(v) arrivals from 6.2 to 4.9 km/s;
(vi) arrivals from 4.9 to 3.6 km/s;
(vii) arrivals from 3.6 to 3.2 km/s;
(viii) arrivals from 3.2 to 2.8 km/s; and
(ix) arrivals from 2.8 to 2.5 km/s.

The factors, namely the nine energy ratios formed from these windows and the Average Distance, are denoted Ratio1, Ratio2, Ratio3, Ratio4, Ratio5, Ratio6, Ratio7, Ratio8, Ratio9, and AD, respectively.
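As a rough illustration of how a window energy and an energy ratio might be computed, the sketch below converts each velocity bound into an arrival time using time = distance / velocity and sums the squared amplitudes in between. The function names, the assumption that the trace starts at the origin time, and the choice of which two windows to divide are all hypothetical; they are not taken from Booker and Mitronovas [18].

```python
def window_energy(trace, dt, distance_km, v_fast, v_slow, t_first=None):
    """Sum of squared amplitudes between two group-velocity arrivals.

    trace       : list of samples; trace[0] is assumed to be at the origin time
    dt          : sample interval in seconds
    distance_km : epicentral distance in km
    v_fast      : faster velocity bound in km/s, or None to start at t_first
    v_slow      : slower velocity bound in km/s (the later arrival)
    t_first     : first-arrival time in seconds, used when v_fast is None
    """
    t_start = t_first if v_fast is None else distance_km / v_fast
    t_end = distance_km / v_slow
    i0 = max(0, int(round(t_start / dt)))
    i1 = min(len(trace), int(round(t_end / dt)))
    return sum(x * x for x in trace[i0:i1])


def energy_ratio(trace, dt, distance_km, window_a, window_b, t_first=None):
    """Energy in window_a divided by energy in window_b.

    Each window is a (v_fast, v_slow) pair in km/s; which pairs are divided
    to form a given ratio is left to the caller (an illustrative choice).
    """
    e_a = window_energy(trace, dt, distance_km, *window_a, t_first=t_first)
    e_b = window_energy(trace, dt, distance_km, *window_b, t_first=t_first)
    if e_b == 0:
        raise ValueError("denominator window contains no energy")
    return e_a / e_b
```

For example, `energy_ratio(trace, dt, distance_km, (None, 4.9), (4.9, 2.0), t_first=t_p)` would divide the energy in window (iii) by the energy in window (iv), where `t_p` is a picked first-arrival time.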

Nine ratios of the energies within certain velocity windows were computed for 20 earthquakes and 27 nuclear explosions by Booker and Mitronovas [18]. All seismograms were recorded by the VELA UNIFORM LRSM Network on short-period Benioff instruments [18]. Ratio1, Ratio2, Ratio3, Ratio4, Ratio5, Ratio6, Ratio7, Ratio8, Ratio9, and AD were selected as discriminant indicators. The z-score is used to standardize the variables in this work. First, the mean is subtracted from the value for each case, resulting in a mean of zero. Then, the difference between the individual score and the mean is divided by the standard deviation, which results in a standard deviation of one. If we start with a variable $x$ and generate a variable $z$, the process is

$$z = \frac{x - \bar{x}}{s},$$

where $\bar{x}$ is the mean of $x$ and $s$ is the standard deviation of $x$. The z-scores of each ratio and of the distance for the earthquakes and the nuclear explosions are listed in Tables 1 and 2, respectively.
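The standardization step can be written as a short function. This is a minimal sketch: the function name is hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the text does not say whether the sample or the population form was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of numbers to zero mean and unit standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1); an assumption
    return [(v - m) / s for v in values]
```

Applied to one indicator at a time, for instance all values of a single ratio across the events, this puts the ratios and the distance on a common scale before they are passed to a classifier.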


Number  Earthquake              AD

1       Baja California
2       Baja California
3       Box Elder Creek
4       Bridgeport
5       Cache Creek
6       Cache Creek AS
7       Colona
8       Mont.-Wyoming Border
9       Pierre, S. Dakota
10      Red Rock River
11      Sierra De Juarez
12      Teton County
13      Western Maryland
14      Western Vermont
15      Western Vermont
16      Western Vermont
17      Western Vermont
18      Western Vermont
19      Western Vermont
20      Western Vermont


Number  Nuclear explosion       AD

1       Aardvark
2       Agouti
3       Armadillo
4       Chinchilla II
5       Cimarron
6       Codsaw
7       Danny Boy
8       Des Moines
9       Dormouse II
10      Fisher
11      Gnome
12      Hardhat
13      Haymaker
14      Mad
15      Madison
16      Marshmallow
17      Mississippi
18      Packrat
19      Pampas
20      Passaic
21      Platte
22      Sacramento
23      Small Boy
24      Stillwater
25      Stoat
26      Wichita
27      York

Box plot graphs of energy ratios and distance were plotted in Figures