Abstract

The automatic discrimination of rock fracture and blast events is complex and challenging because their waveform characteristics are similar. To solve this problem, a new method based on signal complexity analysis and machine learning is proposed in this paper. First, the permutation entropy values of signals at different scale factors are calculated to reflect the complexity of the signals and assembled into a feature vector set. Second, based on the feature vector set, a back-propagation neural network (BPNN) is applied as the machine learning tool to establish a discriminator for rock fracture and blast events. Then, to evaluate the classification performance of the new method, its classification accuracy is compared with that of a support vector machine (SVM) and a naive Bayes classifier, and the receiver operating characteristic (ROC) curves are also analyzed. The results show that the new method achieves the best classification performance. In addition, the influence of the scale factor and the number of training samples on the discrimination results is discussed. The classification accuracy of the new method is highest when the scale factors are 8–15 or 8–20 and the number of training samples is 140.

1. Introduction

In laboratory rock tests, in situ rock excavation, and many other rock engineering applications, signals of rock fracture events are often mixed with other signals such as environmental noise, impact and vibration, and blast signals. When these signals are monitored by microseismic or acoustic emission equipment [1–4], the presence of interfering signals, especially blast signals, may lead to wrong interpretations, for example, erroneous state evaluation and disaster prediction [5, 6]. Consequently, it is necessary to ensure a clean database of rock fracture signals. Although rock fracture and blast events can be discriminated by experts, manual discrimination is time-consuming and subjective because it depends on the experience of the analyst. Therefore, the discrimination of rock fracture and blast signals, in particular for large quantities of signals, requires a reliable and automatic method.

In recent years, machine learning has been widely applied to the automatic identification and classification of signals. Machine learning [7–10] includes many methods, such as neural networks [11–13], support vector machines [14, 15], and naive Bayes classifiers [16, 17]. Several recognition methods for rock fracture or similar signals have been proposed. For example, Shang et al. [18] classified microseismic events and quarry blasts with artificial neural networks (ANN) based on principal component analysis. Yildirim et al. [12] used the extracted peak amplitude ratio of quarry blasts and earthquakes to compare the classification accuracies of a feedforward neural network (FFNN), a probabilistic neural network (PNN), and an adaptive neuro-fuzzy inference system (ANFIS). Liu et al. [19] proposed a method combining the wavelet transform and ANN to recognize acoustic emission signals for different rocks. Del Pezzo et al. [20] used ANN based on seismogram signatures to classify earthquakes and underwater explosions. Peng et al. [21] used an improved BPNN and a combined feature extraction method to recognize seismic signals.

All the aforementioned methods conduct feature extraction before feature recognition. Waveform parameters of signals, such as amplitude, frequency, and total radiated energy, are extracted as eigenvectors. However, those waveform parameters sometimes cannot fully reflect the characteristics of the whole waveform. In addition, extracting the parameters consumes much time and effort. In order to classify signals more precisely and easily, it is vital to find a classification method that does not depend on waveform parameters of rock fracture and blast signals.

In this study, a new method based on signal complexity analysis and machine learning is proposed to achieve automatic identification of rock fracture and blast signals without waveform parameters. The method calculates signal complexity with multiscale permutation entropy (MPE) and uses a back-propagation neural network (BPNN) as the machine learning tool. To calibrate and validate the proposed method, the signal complexity values of previously detected events were also input into a support vector machine (SVM) and a naive Bayes classifier to classify the signal category. In addition, the influence of the scale factors and the number of training samples on the classification accuracy of the new method was analyzed.

2. Methodology

2.1. Signal Complexity Analysis with Multiscale Permutation Entropy

Feature extraction of signals is usually required before signal discrimination. Almost all previous studies used waveform parameters as discrimination features. For example, Vallejos and McKinnon [22] used 13 parameters of the seismic full waveform as discrimination feature vectors for blast and microseismic events. Mousavi et al. [23] extracted 40 features from the time, frequency, and time–frequency domains to classify deep and shallow microearthquakes. However, the commonly used characteristic parameters are difficult to obtain automatically, which limits the automatic identification of rock fracture events. Furthermore, these waveform parameters are obtained from single-scale analysis, which captures less information about the signals. To solve these problems, this paper extracts feature vectors from the standpoint of signal complexity. Signal complexity is expressed primarily by the degree of correlation and randomness of the time series of a signal, which reflects the overall feature of the signal. The complexity of a signal can be described by many methods, such as permutation entropy (PE) [24, 25], multiscale permutation entropy (MPE) [26, 27], Lempel-Ziv complexity [28], and multiscale Lempel-Ziv complexity [29]. MPE is more robust because it uses only the order of the time series values; meanwhile, MPE captures multiscale signal information. This paper therefore applies MPE to calculate signal complexity as the signal recognition feature. The basic principles are introduced as follows.

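A one-dimensional time series is given as follows:

$$X = \{x_1, x_2, \ldots, x_N\}. \tag{1}$$

Coarse graining of the above time series can be expressed by

$$y_j^{(s)} = \frac{1}{s} \sum_{i=(j-1)s+1}^{js} x_i, \quad 1 \le j \le \lfloor N/s \rfloor, \tag{2}$$

where $s$ stands for the scale factor and $y_j^{(s)}$ stands for the multiscale time series. When $s = 1$, the coarse-grained time series is the original time series.

Phase space reconstruction of the coarse-grained series is performed:

$$Y_l^{(s)} = \left\{ y_l^{(s)}, y_{l+\tau}^{(s)}, \ldots, y_{l+(m-1)\tau}^{(s)} \right\}, \tag{3}$$

where $m$ is the embedded dimension and $\tau$ is the time delay. The $m$ real values contained in each $Y_l^{(s)}$ can be arranged in ascending order as

$$y_{l+(j_1-1)\tau}^{(s)} \le y_{l+(j_2-1)\tau}^{(s)} \le \cdots \le y_{l+(j_m-1)\tau}^{(s)}, \tag{4}$$

and if there exist two or more elements in $Y_l^{(s)}$ that have the same value, for example, $y_{l+(j_1-1)\tau}^{(s)} = y_{l+(j_2-1)\tau}^{(s)}$, their original positions can be sorted such that, for $j_1 < j_2$,

$$y_{l+(j_1-1)\tau}^{(s)} \le y_{l+(j_2-1)\tau}^{(s)}. \tag{5}$$

Accordingly, any vector $Y_l^{(s)}$ can be mapped onto a group of symbols as

$$S(r) = (j_1, j_2, \ldots, j_m), \tag{6}$$

where $r = 1, 2, \ldots, R$ and $R \le m!$; $m!$ is the largest number of permutations. With $P_r$ denoting the relative frequency of the $r$th symbol sequence, the permutation entropy of the time series at scale $s$ is expressed as follows:

$$H_p^{(s)}(m) = -\sum_{r=1}^{R} P_r \ln P_r. \tag{7}$$

If $P_r = 1/m!$, $H_p^{(s)}(m)$ reaches its maximum $\ln(m!)$, so it can be normalized; then

$$0 \le H_p^{(s)} = \frac{H_p^{(s)}(m)}{\ln(m!)} \le 1. \tag{8}$$

Then

$$\mathrm{MPE} = \left\{ H_p^{(1)}, H_p^{(2)}, \ldots, H_p^{(s)} \right\}, \tag{9}$$

where $H_p^{(s)}$ in MPE represents the signal complexity when the scale factor is equal to $s$. The size of the $H_p^{(s)}$ value indicates the degree of randomness of the time series. The smaller the value of $H_p^{(s)}$ is, the more regular the time series is. The greater the value of $H_p^{(s)}$ is, the more random the time series is.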
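The procedure described above (coarse graining, phase space reconstruction, ordinal ranking with ties kept in their original order, and normalized entropy) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation; the function names and the defaults `m=3` and `tau=1` are assumptions chosen for the example and are not values taken from this paper.

```python
import math
from collections import Counter


def coarse_grain(x, s):
    """Average consecutive, non-overlapping windows of length s."""
    n = len(x) // s
    return [sum(x[j * s:(j + 1) * s]) / s for j in range(n)]


def permutation_entropy(y, m=3, tau=1):
    """Normalized permutation entropy of the series y, in [0, 1].

    m (embedding dimension) and tau (time delay) defaults are
    illustrative assumptions, not the paper's settings.
    """
    if m < 2:
        raise ValueError("m must be at least 2")
    n_vectors = len(y) - (m - 1) * tau
    if n_vectors < 1:
        raise ValueError("series too short for this m and tau")
    counts = Counter()
    for l in range(n_vectors):
        window = [y[l + k * tau] for k in range(m)]
        # Ordinal pattern: positions sorted by value. sorted() is stable,
        # so equal values keep their original order.
        pattern = tuple(sorted(range(m), key=window.__getitem__))
        counts[pattern] += 1
    h = sum(-(c / n_vectors) * math.log(c / n_vectors)
            for c in counts.values())
    return h / math.log(math.factorial(m))


def mpe_features(x, scales, m=3, tau=1):
    """One normalized permutation entropy per scale factor."""
    return [permutation_entropy(coarse_grain(x, s), m, tau) for s in scales]
```

For example, `mpe_features(signal, range(8, 16))` returns an eight-element feature vector for one waveform, one entry per scale factor from 8 to 15. A strictly increasing series yields an entropy of 0, because only one ordinal pattern ever occurs.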

2.2. Signal Identification with Back-Propagation Neural Network

After the signal features are extracted by signal complexity analysis, rock fracture and blast signals are discriminated by feature recognition. Manual identification, however, is time-consuming and easily influenced by individual factors. In order to discriminate rock fracture and blast signals reliably and automatically, a BP neural network [30] is applied as the identification tool. It is made up of an input layer, a hidden layer, and an output layer.

Two kinds of signals flow between the layers of a BP neural network. The working signals propagate forward, and the error signals between the actual outputs and the expected outputs are propagated backward. The basic process is as follows.

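The hidden layer input of the $i$th node:

$$\mathrm{net}_i = \sum_{j} w_{ij} x_j + \theta_i, \tag{10}$$

where $\mathrm{net}_i$ represents the hidden layer input of the $i$th node, $w_{ij}$ stands for the weight between the $i$th node of the hidden layer and the $j$th node of the input layer, $x_j$ is the $j$th input of the input layer, and $\theta_i$ is the $i$th threshold of the hidden layer. The hidden layer output of the $i$th node:

$$y_i = \phi(\mathrm{net}_i). \tag{11}$$

In the formula, $y_i$ is the hidden layer output of the $i$th node and $\phi$ stands for the activation function of the hidden layer. The output layer input of the $k$th node:

$$\mathrm{net}_k = \sum_{i} w_{ki} y_i + a_k, \tag{12}$$

where $\mathrm{net}_k$ represents the output layer input of the $k$th node, $w_{ki}$ stands for the weight between the $k$th node of the output layer and the $i$th node of the hidden layer, and $a_k$ is the $k$th threshold of the output layer. The output of the $k$th node of the output layer:

$$o_k = \psi(\mathrm{net}_k). \tag{13}$$

In the formula, $o_k$ is the output of the $k$th node of the output layer and $\psi$ stands for the activation function of the output layer.

The error function is given by (14), and training of the BP network stops when $E < \varepsilon$ is satisfied, where $\varepsilon$ is a given precision:

$$E = \frac{1}{2} \sum_{k} (T_k - o_k)^2, \tag{14}$$

where $T_k$ is the expected value of output node $k$.

A learning process updates each weight $w$ of the network (that is, each $w_{ij}$ and $w_{ki}$) based on the following equation:

$$w(t+1) = w(t) - \eta \frac{\partial E}{\partial w(t)}, \tag{15}$$

where $\eta > 0$ is the learning rate.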
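The forward pass, the squared-error function, and the gradient-descent weight update described above can be sketched as a small three-layer network in plain Python. This is an illustrative sketch rather than the network used in the experiments: the class name `TinyBPNN`, the sigmoid activation in both layers, the single output node, the weight initialization range, and the default `eta`, `epsilon`, and `max_epochs` values are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBPNN:
    """Input layer, one hidden layer, one sigmoid output node (assumed)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                      for _ in range(n_hidden)]
        self.b_hid = [0.0] * n_hidden          # hidden layer thresholds
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_out = 0.0                       # output layer threshold

    def forward(self, x):
        """Working signal: input -> hidden outputs -> network output."""
        hidden = [sigmoid(sum(w * xj for w, xj in zip(row, x)) + b)
                  for row, b in zip(self.w_hid, self.b_hid)]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden))
                      + self.b_out)
        return hidden, out

    def train_step(self, x, target, eta):
        """One gradient-descent update; returns the error before it."""
        hidden, out = self.forward(x)
        # E = 0.5 * (target - out)^2, so dE/d(net_out) is:
        delta_out = (out - target) * out * (1.0 - out)
        # Error signal propagated back to each hidden node.
        delta_hid = [delta_out * w * h * (1.0 - h)
                     for w, h in zip(self.w_out, hidden)]
        for i, h in enumerate(hidden):
            self.w_out[i] -= eta * delta_out * h
        self.b_out -= eta * delta_out
        for i, d in enumerate(delta_hid):
            for j, xj in enumerate(x):
                self.w_hid[i][j] -= eta * d * xj
            self.b_hid[i] -= eta * d
        return 0.5 * (target - out) ** 2

    def fit(self, samples, targets, eta=0.5, epsilon=1e-3, max_epochs=5000):
        """Train until the summed error drops below epsilon."""
        error = float("inf")
        for _ in range(max_epochs):
            error = sum(self.train_step(x, t, eta)
                        for x, t in zip(samples, targets))
            if error < epsilon:
                break
        return error
```

With targets coded as 1 for one class and 0 for the other, the second value returned by `forward` can be thresholded at 0.5 to assign a class. Updating after every sample, as done here, is one of several possible schedules; batch updates would follow the same gradient formula.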

3. Discrimination Process and Performance of the New Method

3.1. Discrimination Process

This section describes the whole process of discriminating rock fracture and blast signals with the proposed method. The process divides the signal waveform data into training, validation, and test sets. The specific steps of the new method are as follows.

Step 1 (sample selection). Choose the training samples from the full set of 200 samples, which is composed of blast and rock fracture signal samples. The remaining data in the set are regarded as test and validation data.

Step 2 (feature extraction). Use MPE to calculate the permutation entropies of the training samples at different scale factors to form the feature vectors of the training set; the remaining data are processed in the same way to form the feature vectors of the test and validation sets.

Step 3 (train machine learning tools). Input the feature vectors of training samples to train BPNN and make it adjust the weight value constantly until the error is below the set error value.

Step 4 (classification of test and validation data). Input the feature vectors of test and validation samples to the BP neural network that has been trained. Through network internal calculation, the accuracies of test and validation data can be derived.

According to the above operation, the classification results are derived. The whole process sketch is shown in Figure 1.
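The sample selection in Step 1 can be sketched as follows. This is a hedged illustration: the function name `split_samples` is hypothetical, the random shuffle is an assumption (the text does not state how the samples are drawn), and the caller decides how the remaining samples are divided between validation and test.

```python
import random


def split_samples(features, labels, n_train, n_val, seed=0):
    """Shuffle sample indices and cut them into train/validation/test.

    Returns three (features, labels) pairs. Whatever is left after the
    training and validation samples are taken becomes the test set.
    """
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)

    def pick(idx):
        return [features[i] for i in idx], [labels[i] for i in idx]

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return pick(train), pick(val), pick(test)
```

For instance, `split_samples(features, labels, 140, 30)` on 200 feature vectors gives 140 training samples and leaves 30 samples each for validation and test; the even division of the remainder is an assumption of this example.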

3.2. Discrimination Performance Evaluation

In order to evaluate the performance of the new method, the receiver operating characteristic (ROC) curve is applied. The ROC curve is a graphical plot that illustrates the performance of a binary classifier as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the positives (TPR = true-positive rate) versus the fraction of false positives out of the negatives (FPR = false-positive rate) at various threshold settings. In this study, the discrimination of rock fracture and blast events is treated as a two-class prediction problem, so there are four possible outcomes from the binary classifier, as shown in Table 1. A true positive (TP) means that a rock fracture event has been identified as a rock fracture event, and a false negative (FN) means that a rock fracture event has been identified as a blast event. A true negative (TN) means that a blast event has been identified as a blast event, and a false positive (FP) means that a blast event has been identified as a rock fracture event. Then

$$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}. \tag{16}$$

The accuracy (ACC) can be expressed as

$$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FN} + \mathrm{FP} + \mathrm{TN}}. \tag{17}$$
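The three measures follow directly from the four confusion-matrix counts, with rock fracture as the positive class. The helper below is a minimal sketch (the name `rates` is hypothetical) using the standard definitions of TPR, FPR, and ACC.

```python
def rates(tp, fn, fp, tn):
    """Return (TPR, FPR, ACC) from confusion-matrix counts."""
    tpr = tp / (tp + fn)                   # rock fractures correctly kept
    fpr = fp / (fp + tn)                   # blasts mistaken for fractures
    acc = (tp + tn) / (tp + fn + fp + tn)  # all correct decisions
    return tpr, fpr, acc
```

As a usage example, the total data set in Section 4.2 contains 100 rock fracture and 100 blast events, of which five rock fracture events and seven blast events are misclassified. Calling `rates(tp=95, fn=5, fp=7, tn=93)` gives a TPR of 0.95, an FPR of 0.07, and an ACC of 0.94, matching the values reported there.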

4. Experiment Verification

4.1. Data Set

The experimental data sets were collected in Hubei Province and include one hundred rock fracture signals and one hundred blast signals. Some of the signals are shown in Figure 2.

4.2. MPE-BPNN Analysis

Before the MPE values are calculated, the parameters of the MPE method itself need to be chosen: the embedding dimension $m$, the time delay $\tau$, and the scale factor $s$. Bandt and Pompe [31] suggested that the embedding dimension should take a value from 3 to 7. Meanwhile, as the time delay increases, small changes in signals become more difficult to detect. The values of $m$ and $\tau$ are fixed in this paper on that basis. The scale factor is determined by comparing two groups of rock fracture and blast events selected from the 200 samples. The permutation entropies of the selected signals are calculated over a range of scale factors; the results are shown in Figure 3. Figure 3 shows that the two kinds of events are better separated when $s$ = 8–15. Thus, the scale factor is chosen from 8 to 15; that is, an eight-dimensional vector is constructed for each signal to describe its characteristics.

After the related coefficients are chosen, permutation entropies of rock fracture and blast signals with different scale factors (8–15) are calculated and shown in Figure 4.

Then the MPE values of 140 waveforms from the sample set, which include both rock fracture and blast signals, are chosen to train the BPNN. The remaining data are used as test data and validation data to evaluate the performance of the MPE-BPNN method. For a BPNN with the typical three-layer structure, Reyes et al. [32] stated that there should be $2n + 1$ neurons in the hidden layer, where $n$ is the number of input neurons. Because the scale factors are $s$ = 8–15, the input layer has $n = 8$ nodes, so the hidden layer has $2 \times 8 + 1 = 17$ neurons. The BPNN is then trained.

The BPNN is trained for 71 iterations on the 140 groups of training data. The cross entropies of the training, validation, and test data are shown in Figure 5, which shows that the best validation performance is 0.01458 at the 65th iteration. The error of each sample is calculated and shown in Figure 6. The errors of most of the 200 eight-dimensional permutation entropy vectors lie within −0.02664 to 0.02664; only a few individual samples have larger errors, which indicates that few events are misclassified.

The classification results for the three data sets and the total data set are shown in Table 2. In the training data, three rock fracture events are falsely regarded as blast events and four blast events are misjudged as rock fracture events; the TPR, FPR, and ACC of the training data are 95.8%, 5.9%, and 95%, respectively. In the validation data, one blast event is falsely regarded as a rock fracture event and one rock fracture event is falsely regarded as a blast event; the TPR, FPR, and ACC of the validation data are 92.9%, 6.3%, and 93.3%, respectively. In the test data, one rock fracture event and two blast events are misjudged; the TPR and FPR are 92.9% and 12.5%, respectively, and the ACC reaches 90%. Overall, five rock fracture events are falsely regarded as blast events and seven blast events are misjudged as rock fracture events in the total data. The TPR and FPR of the total data are 95.0% and 7%, respectively, and the ACC reaches 94%.

In order to display the classification performance of the new method more intuitively, the ROC curves of the different data sets are shown in Figure 7. The corner point of the training set is closest to the top left corner, which means that the TPR reaches its maximum rapidly while the corresponding FPR is still low and indicates accurate classification of the two kinds of signals in training. As a result of the training, the corner points of the test set and the validation set are both close to the top left corner, which means that rock fracture and blast events are classified accurately in the test and validation sets. The ROC curve of the total set is also drawn in Figure 7, and its corner point is likewise close to the top left corner. This illustrates that the proposed method discriminates rock fracture and blast signals with high accuracy.
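The construction of an ROC curve from classifier scores, and the location of the point nearest the top left corner, can be sketched as below. The function names are hypothetical, labels are assumed to be coded as 1 for rock fracture and 0 for blast, and measuring closeness by Euclidean distance to the point (FPR, TPR) = (0, 1) is one plausible reading of the "corner point" comparison, not a definition given in the text.

```python
import math


def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by lowering the threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before adding a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def closest_to_corner(points):
    """ROC point with the smallest distance to (FPR, TPR) = (0, 1)."""
    return min(points, key=lambda p: math.hypot(p[0], 1.0 - p[1]))
```

A curve whose closest point lies nearer to (0, 1) reaches a high TPR at a low FPR, which is the property the comparison of corner points relies on.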

4.3. Comparison of Discrimination Performance for SVM, Naive Bayes, and the Proposed Method

To evaluate the performance of the proposed method, a naive Bayes classifier and a support vector machine [33] have also been implemented for comparison. Of the 200 groups of data in total, the first 70% of the rock fracture and blast events are used as training samples and the remaining 30% are used as test samples. The classification results are shown in Table 3 and Figures 8–10.

As shown in Table 3 and Figures 8 and 9, the numbers of FP (blasts mistakenly regarded as rock fracture events) are 11, 5, and 3 for SVM, naive Bayes, and the new method, respectively. The numbers of FN (rock fracture events falsely regarded as blasts) are 3, 6, and 2 for SVM, naive Bayes, and the new method, respectively. The FP and FN counts show that the proposed method misclassifies fewer events than the other two methods. The accuracies (ACC) are 76.67%, 81.67%, and 91.67% for SVM, naive Bayes, and the new method, respectively, which illustrates that the proposed method obtains the best overall classification accuracy and reflects its highly nonlinear mapping ability.
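As a quick consistency check on these figures, the accuracy of each classifier can be recomputed from its FP and FN counts, assuming that all three classifiers are scored on the same $N = 60$ test samples (30% of the 200 groups):

```latex
\begin{align*}
\mathrm{ACC} &= \frac{N - \mathrm{FP} - \mathrm{FN}}{N}, \qquad N = 60, \\
\mathrm{ACC}_{\text{SVM}} &= \frac{60 - 11 - 3}{60} = \frac{46}{60} \approx 76.67\%, \\
\mathrm{ACC}_{\text{naive Bayes}} &= \frac{60 - 5 - 6}{60} = \frac{49}{60} \approx 81.67\%, \\
\mathrm{ACC}_{\text{new method}} &= \frac{60 - 3 - 2}{60} = \frac{55}{60} \approx 91.67\%.
\end{align*}
```

The recomputed values agree with the reported accuracies.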

In Figure 10, the corner point of the new method is closest to the top left corner of the ROC plot, which shows that the new method has a high TPR when the FPR is low. This indicates that the new method performs better than the other classifiers considered in this paper. As Table 3 shows, the TPR values are 90%, 80%, and 92.86% for SVM, naive Bayes, and the new method, respectively, and the corresponding FPR values are 36.67%, 16.67%, and 9.38%.

In conclusion, the new method obtains the best classification results.

5. Discussion

In order to further evaluate the performance of the proposed method, the influence of different scale factors and training sample numbers is discussed.

5.1. The Influence of Scale Factors on the Identification Results

In Section 4, an eight-dimensional vector is selected to express the characteristics of each waveform. In order to further analyze the influence of the scale factors on the identification results, the new method is run with the scale factors $s$ chosen as 8–10, 8–15, 8–20, 8–25, and 8–30, respectively. The changes in the classification accuracies as the number of scale factors increases are shown in Table 4.

As shown in Table 4, when the feature vector is three-dimensional ($s$ = 8–10), the total classification accuracy of the new method is 90.5%, and the total classification accuracy increases as more scale factors are added. When the scale factors are 8–15, the rock fracture classification accuracy reaches its highest value. When the scale factors are 8–20, the blast classification accuracy shows only modest growth. When the largest scale factor exceeds 20, the classification accuracy declines and then tends to be stable. The reason is that larger scale factors make it more difficult to express the complexity of the signal. Meanwhile, a larger number of scale factors increases the calculation time. Therefore, according to this experiment, the best choices for the scale factors are 8–15 or 8–20.

5.2. The Influence of Training Sample Numbers on the Identification Results

Appropriate numbers of training samples are vital for the proposed method to determine classification accuracy. Here, 100, 120, 140, 160, and 180 samples of 200 groups of data are chosen, respectively, as training sets. The accuracies of different training sample numbers are shown in Table 5.

As shown in Table 5, as the number of training samples increases, the classification accuracies of all events first remain unchanged, then rise, and finally decline. Because rock fracture and blast signals have complex waveform features, an excessive number of training samples may lead to overfitting, which decreases the classification accuracy. According to the above analysis, it is appropriate to select 140 training samples.

6. Conclusion

In this paper, a new method has been proposed for distinguishing rock fracture and blast events. The new method has two main advantages. First, the method is fast and does not require waveform parameters of the detected signals; only the signal time series are needed, which is more convenient and simple because the time series are already recorded by the monitoring equipment on site. Second, relying on the self-learning capacity of the BPNN, it classifies rock fracture and blast signals automatically, which addresses the time-consuming and subjective nature of manual discrimination.

In this study, multiscale permutation entropy (MPE) is applied to calculate complexity values for two hundred signals, including 100 rock fracture and 100 blast signals. The calculated MPE values indicate the complexity and characteristics of the signals and are used as the feature vectors of the rock fracture and blast signals. A back-propagation neural network is then used as the machine learning tool to construct a discriminator for rock fracture and blast signals based on the feature vectors. The accuracies of the training, validation, and test sets from the 200 data sets reach 95%, 93.3%, and 90%, respectively, and the accuracy over all data reaches 94%. The TPR values of the training, validation, and test sets and of all data all reach their maximum rapidly while the corresponding FPR is low, which reveals good accuracy and sensitivity in classifying the two kinds of signals.

To evaluate the performance of the new method, the classification performances of SVM, naive Bayes, and the new method are compared. The accuracies of the three methods are 76.67%, 81.67%, and 91.67%, respectively, so the new method obtains the best classification accuracy. The ROC curves of the three methods are also compared. The corner point of the new method is closest to the top left corner of the ROC plot, which illustrates that the new method has the best specificity and sensitivity.

The scale factors of MPE and the number of training samples for the BPNN are very important for the identification results. For the 200 data sets, the best scale factors are 8–15 or 8–20 and the best number of training samples is 140. An excessive number of training samples may lead to overfitting, which would reduce the classification accuracy.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to acknowledge financial support from the National Basic Research Program of China (973 Program) (no. 2015CB060200), the National Natural Science Foundation of China (nos. 41772313 and 51478479), the Key Research and Development Program of Hunan (2016SK2003), and the Fundamental Research Funds for the Central Universities of Central South University (2017zzts185).