#### Abstract

The microseismic signals in the coal minefield are very complex because of its special environment with a large number of blast vibration signals, and how to effectively identify the microseismic signals is still a big problem. S transform (ST) and Manifold Learning (ML) methods are introduced to extract the characteristics of the microseismic signals, and Gaussian Mixture Model based on the improved Bee Colony optimization algorithm (IBC-GMM) is established to identify the microseismic signals accurately. Firstly, the time-frequency characteristics of microseismic signals in coal mine are extracted by ST analysis. It is found that there are obvious time-frequency differences between rock-fracturing signals and blast vibration signals. Blast vibration signals have short duration, high frequency, and complex frequency spectrum, and their dominant frequencies are mainly over 100 Hz. However, rock-fracturing signals are relatively slow, with low frequency and stable spectrum change, and their dominant frequencies are generally below 100 Hz. Then, combining with the microseismic data of Xiashinjie coal mine in Tongchuan, China, the feature dimension reduction is carried out by Manifold Learning (ML) method, and the processed feature vectors are automatically recognized by IBC-GMM. Field test results show that the method summarizes the characteristics of the microseismic wave which are difficult to emerge as the learning vector, and the features reflect the key features of microseismic signals well. The identification accuracy is as high as 94%, and its recognition effect is superior to other recognition models (such as traditional Gaussian Mixture Model based on Expectation-Maximum (EM-GMM), Backpropagation (BP) neural network, Random Forests (RF), Bayes (Bayes) methods, and Logistic Regression (LR) method). Therefore, IBC-GMM could be used to mine engineering microseismic monitoring waveform recognition to provide the reference.

#### 1. Introduction

With the development of underground engineering in China, the engineering stress level is increasing, and the dynamic disasters such as rockburst are becoming more and more serious. Compared with the traditional monitoring methods, the microseismic monitoring system can effectively monitor the internal microfracture of rock mass before the macroscopic instability failure. As a real-time monitoring method, the microseismic monitoring system can dynamically reveal the damage evolution characteristics of rock mass, which has been widely used in the engineering fields such as the mine [1, 2], the hydropower station [3, 4], and the tunnel [5, 6]. However, the mine site environment is complex and there are many interfering factors. To extract effective microseismic events from a large number of data samples is a fundamental problem for the prediction and early warning of the microseismic monitoring system, which has important research significance [7]. Moreover, in these complex waveforms, the noise signals are easy to distinguish artificially due to their obvious spectral characteristics. Blast events account for a considerable proportion and are easily confused with microseismic events. Therefore, the primary task is to distinguish blast vibration signals from rock fracture signals. At present, the automatic identification effect of microseismic signals is not ideal, which mainly relies on the manual processing of technicians, which is inefficient and susceptible to personal factors. This lagging processing mode limits the further development of microseismic monitoring means, so the identification of microseismic wave has attracted extensive attention [8, 9].

At present, there are a lot of researches on the analysis and recognition of the microseismic wave. Lu et al. [10] and Cao et al. [11] used fast Fourier transform (FFT) to analyze the time-frequency signals of two typical shock pressure and roof blast microseismic signals in Sanhejian coal mine and summarized the signal characteristics of different microseismic events. Zhu et al. [12, 13], Shang et al. [14, 15], and Li [16] used SVM to identify microseismic signals in high precision, respectively, through adaptive microtremor signals decomposed feature vector based on Wavelet Transform (WT), Empirical Mode Decomposition (EMD), Singular Value Decomposition (SVD), and Local Mean Decomposition (LMD). Dong et al. [17, 18] extracted characteristic parameters from the source parameter characteristics and spectrum distribution of microseismic events by statistical analysis and established a statistical model for automatic identification of mine microseismic events and blast events. Zhao et al. [19] proposed a recognition method based on the Empirical Mode Decomposition (EMD) and the morphological fractal dimension in view of the difficulty in automatic identification of mine microseismic monitoring signals. Jiang et al. [20] used the quantitative description method of waveform characteristics to intercept and correct the effective waveform of microseismic signals and proposed the method of mine microseismic wave shape recognition based on the joint identification of multichannel waveform of single event. Li et al. [21] determined the differences between these waveforms by analyzing the dominant frequency, signal duration, following peak, and multifractal parameters. Zhang et al. [22] applied EEMD to decompose microseismic signals into multiple intrinsic mode functions (IMF) which provide and inform SVD to extract eigenvalues from matrices composed of these IMF components; Extreme Learning Machine (ELM) was introduced to establish a classification model. Vallejos and Mckinnon [23] established a high-precision identification model of microseismic events based on focal parameters through Logistic Regression and neural network. Pu et al. [24] investigated the performances of ten frequently used machine learning models for microseismic/blasting event recognition. Although these methods realize the feature extraction of seismic signal in a certain extent, they have also some deficiencies. Fourier transform (FT) is suitable for stationary signal analysis, but rock-fracturing signals and blast vibration signals have the characteristics of nonstationary and sudden transients. Although Wavelet Transform (WT) is suitable for nonstationary signal processing, the decomposition effect of WT depends on the choice of basis function and decomposition scale to a large extent. Empirical Mode Decomposition (EMD) and Local Mean Decomposition (LMD) are adaptive, but there is a problem of mode aliasing.

S transform (ST) combines the advantages of short-time Fourier transform (STFT) and the Wavelet Transform (WT), overcomes STFT’s defect of invariable window time width, and can adjust the time width adaptively according to the change of frequency. The analysis results of ST contain phase information, which can provide intuitive time-frequency features without selecting window function [25]. Ting et al. [26] introduced the optimized S transform (ST) into the analysis of microseismic signal, which has high-quality time domain and frequency domain information to analyze various vibration signals. The result shows that the method can be used to automatically identify and analyze microseismic signals. An optimization algorithm for S transform is proposed to find the optimum Gaussian windows for providing a TF map with maximum energy concentration. And the application of the proposed method to perform nonstationary deconvolution of synthetic and field seismic datasets presents a better performance than short-time Fourier transform (STFT) in improving the temporal resolution of the data [27]. Zheng et al. [28] combined the threshold method, proposed the S transform threshold filtering on the basis of S transform time-frequency filtering, and processed airgun seismic records from temporary stations in “Yangtze Program” (the Anhui experiment). Compared with the results of the bandpass filtering, the S transform threshold filtering can improve the signal-to-noise ratio (SNR) of seismic waves and provide effective help for first arrival pickup and accurate travel time. Dai et al. [29] used S transform (ST) to decompose waveform and process the waveforms in the impoundment and construction periods of the left bank abutment slope to investigate the time-frequency characteristics of microfractures and got some meaningful results based on the microseismic monitoring data at the left bank abutment slope of Jinping first stage hydropower station. The classification of microseismic events is one of the key technologies in microseismic monitoring. The frequency characteristics of rock fracture signals and blast vibration signals in underground powerhouse are studied through S transform (ST), the energy distribution ratio in different frequency bands was used to quantify the frequency characteristics, and a model for classification of microseismic events based on BP neural network optimized by genetic algorithm (GA) was established [25]. The application of the above S transformation has achieved beneficial results in the fields of earthquake, underground waste storage, and microseismic monitoring of hydropower stations, but the microseismic signals of mines are more complex and seldom applied.

There are many kinds of information that can be obtained by microseismic signals. Too much data leads to complex relationships, and the increase of information dimension seriously affects the learning efficiency of machine learning. How to effectively process these complex data and how to transform complex high-dimensional information data into low-dimensional intuitive and effective information have become crucial research topics in the field of machine learning. Manifold learning (ML) is a new unsupervised learning method, which can effectively discover the inner dimension of high-dimensional nonlinear datasets and carry out dimensional reduction, so as to realize dimensional reduction or data visualization in Manifold Learning. One of the most important algorithms in ML is Local Linear Embedding (LLE) algorithm, which is a nonlinear dimensionality reduction algorithm with good generalization and is widely used in image classification and target recognition. Zhao and Zhang [30] presented a novel dimension reduction method for classification based on probability-based distance and the technique of Local Linear Embedding (LLE). And the numerical results show that the proposed methodology performs better, compared with Logistic Discrimination (LD) classifiers applied on the lower-dimensional embedding coordinates computed by LLE or SLLE. Nadal et al. [31] proposed a hip fracture prediction tool for a local region, using clinical data of the population of that region based on LLE technique. Zhao et al. [32] proposed a method to control chart classification based on improved Supervised Locally Linear Embedding (SLLE) and support vector machine (SVM). However, from the research status at home and abroad, the method applied to the microseismic signals analysis and processing of direct correlation research is very few. Therefore, the efficient dimensionality reduction algorithm of ML is applied to the feature extraction of microseismic signals, and the transformation of microseismic signals features from the time domain, frequency domain, or time-frequency domain to the representation in the manifold domain can not only improve the accuracy and rapidity of later pattern recognition but also be a new idea and approach for feature extraction of time-varying microseismic signals.

As a typical method based on statistical theory, Gaussian Mixture Model (GMM) is widely used in pattern recognition and data analysis. Considering the complex scene and the poor video quality of the conveyer belt, Cheng et al. [33] detected and recognized coal gangue by an improved Gaussian Mixture Model (GMM) which extracts and subtracts the background of the video. Zhang et al. [34] combined the method of Discrete Wavelet Transform Mel Frequency Cepstrum Coefficients (DWPTMFCC) with an improved Gaussian Mixture Model (GMM) to classify and recognize the heart sound to improve the precision of feature extraction and efficiency of classification and recognition of heart sound. Raitoharju et al. [35] used GMM for modeling joint distributions of the position and the RSS value. Ding et al. [36] evaluated the real-time anomalies of each univariate sensing time series via LSTM model and then used a Gaussian Mixture Model to give a multidimensional joint detection of possible anomalies. Traditional GMM uses *k*-means algorithm to initialize parameters and then estimates parameters through Expectation-Maximization (EM) algorithm. As EM is a local optimization algorithm, it is sensitive to the setting of initial parameters and easy to fall into the trap of local optimization.

In view of this, a new method combining S transform (ST), Local Linear Embedding (LLE), and the improved Gaussian Mixture Model (GMM) was proposed. The ST-LLE model was used to automatically extract the characteristic value of the signal waveform, and IGMM was used to automatically identify and classify these values. First, ST is introduced into the field of analysis and identification of microseismic waves, and the time-frequency characteristics of blast vibration signals and rock-fracturing signals are compared. Then, the improved LLE is used for dimensionality reduction of signal features. They were used as the feature vector of recognition. Finally, the improved swarm algorithm (IBC) was introduced to optimize the GMM parameters and the classification model was established by IGMM to distinguish the signal. In order to compare the difference between IGMM and other machine learning algorithms in the high efficiency, Expectation-Maximum (EM-GMM), Backpropagation (BP) neural network, Random Forests (RF), Bayes (Bayes) methods, and Logistic Regression (LR) method were also established for waveform recognition.

#### 2. Project Overview and Microseismic Monitoring

##### 2.1. Project Overview

Xiashijie coal mine is located in Yaoqu town, Tongchuan, China, and 54 km away from Tongchuan city as shown in Figure 1. The minefield is 4 kilometers long, with a slope width of 3.3 kilometers and a coal-bearing area of 13.2 square kilometers. The mine design is two levels, which includes “Cross Double U” roadway layout of the mining face and the adit-inclined shaft stage mining. Long wall mining and fully mechanized top coal caving are adopted, and total collapse method manages the roof. The general structure of the mine is an undulatory monocline with a north dip, and the deep part is a syncline of Xinmin village. The fault structure is not developed and simple.

The underground shaft of comprehensive mining face 2305 is located in the lower stage at 950 m. The northeast direction is the dark shaft, and the southwest direction is the boundary of the minefield and adjacent to the Chenjiashan coal mine. The deep northwest direction is unprepared area. The direct roof of the coal seam is argillaceous siltstone, the old roof is middle sandstone and silt-fine sand interbedded, and the direct floor is silt-fine sand interbedded, with an average thickness of 29.1 m. According to the geological report and the measured data, the working face is an interwoven folded structure, and the coal seam fluctuates greatly along the east-west direction. No-fault structure was found during tunneling, and it is expected that a small fault with a drop of about 0.5 m may occur in the surface during mining. The presence of rock joints substantially reduces the rock mass strength [37]; therefore, there is a risk of weak rockburst.

##### 2.2. Installation of Microseismic System

In view of the randomness and multiple problems of mine rockburst, in order to reveal and predict potential risk areas of working face in advance, our research group adopted the advanced high-precision microseismic monitoring system produced by Canadian Engineering Seismology Group (ESG) in October 2019 to carry out real-time monitoring, positioning, and analysis of coal and rock fracture in the affected mining area of working face 2305. The system consists of underground signal acquisition system, ground data processing system, and remote system. Underground signal acquisition system includes microseismic sensors and Paladin downhole digital signal acquisition system. Remote system contains big data processing system and 3D visualization software based on remote network transmission developed by Mechsoft (Dalian) Co., Ltd. The microseismic sensors use seismometers with a response frequency range of 15–2500 Hz and a sensitivity of 43.3 v·s/m. The topological structure of the microseismic monitoring system is shown in Figure 2.

The microseismic monitoring sensors should be arranged in haulage roadway and return roadway on both sides of the working face, and they are staggered along the strike and depth direction of coal mine. Therefore, 4 groups of Paladin substations and 20 microseismic sensors are deployed to constitute the microseismic monitoring system on the working face 2305, as shown in Figure 3. Distributed on both sides of the working face, the sensors can monitor microseismic events which occurred because of deep mining rock mass unloading, microfracture events for 24 h and get spatiotemporal data, error, and the parameters of the magnitude and the energy source. Filtering the collected data, a full waveform and spectrum analysis are provided for users, and the microseismic event types are identified, through the filter processing automatically. The noise is eliminated by filtering and setting threshold and bandwidth detection.

##### 2.3. Microseismic Activity Monitoring of Rockburst

Each fracture of coal and rock mass will produce a microseismic event and sound wave, and the vibration energy, frequency, and density comprehensively reflect the stress and failure degree of coal and rock mass. Therefore, the stress distribution and failure characteristics of coal and rock can be analyzed by monitoring the energy, frequency, and location of microseismic events in the mining process. In order to facilitate analysis and explanation, 275 rock-fracturing events and 40 blasting events were collected during the period of 2019-12-01∼2019-12-07 microseismic data, as shown in Figures 4 and 5. The spatial positioning results of microseismic events were used to describe the spatial and temporal evolution of microseismic events and the movement rules of coal and rock.

Figure 4 is the top view of the spatial distribution of microseismic events. Figure 5 shows the lateral density contours of the spatial distribution of microseismic events along the trend. In Figure 4, the dot represents the position of a microseismic event in the rock layer. The round ball in the microseismic event distribution diagram represents the microseismic event generated by rock mass damage and failure. The color of the sphere represents the moment magnitude of the microseismic event and the size of the sphere represents the energy of the microseismic event. As can be seen from Figure 4, as the working face advances, microseismic events are mainly concentrated in two regions. The first is in front of the working face within 100 million range, but the magnitude and energy are small. The second is in the rear of the working face 100–150 m; the magnitude and energy are relatively large. From the perspective of stratification (Figure 5), microseismic events are mainly concentrated on the hard roof. As can be seen from Figure 6, blast vibration events account for less than 18% of the microseismic events collected every day, while rock-fracturing events are the majority.

According to the monitoring result of coal and rockburst, microseismic events are densely distributed in about 50 m thickness within the low rock caving zone distribution. Under the combined action of periodic pressure, square of working face, and other construction disturbances, the energy and quantity of microseismic events monitored in the return roadway are obviously better than those of the haulage roadway, which can show that the return roadway has a large stopping stress, and protection design should be paid attention to. However, the energy of the total microseismic event is relatively small. Therefore, under the existing mining conditions, rockburst will not occur in the working face 2305. In addition, the monitoring result also shows that the roof fracture depth is within 50 m, and there is no abnormal activity in the roof and floor fractures.

It is not difficult to find that the dynamic development law of microseismic events indicates that there is a close relationship between microseismic activities and mining. During normal activity, microseismic events are mainly concentrated in coal seam and floor or the direct roof. In the intense seismic activity period, microseismic events have obvious trend of the development in the high roof, along with working face advancing, which shows that roof pressure increases, the frequency of coal compression failure and roof shear failure increases, and the fracture height increases until the basic roof breaks, then the microseismic activity decreased significantly.

#### 3. Time-Frequency Feature Extraction Method Based on S Transform

##### 3.1. Data Preprocessing

After several months of field monitoring, it was found that a large number of events occurred, including blast events, which brought great difficulty and intensity to the field microseismic data processing. For this purpose, the frequency characteristics of rock-fracturing signals and blast vibration signals in the process of the working face mining are studied by selecting the field pickup waveform. 50 rock-fracturing signals and 50 blast signals were randomly selected. The waveform data with the strongest signal amplitude were selected from all channels as the initial data for feature analysis.

Due to the influence of mechanical vibration, electromagnetic pulse, and sensor installation conditions, some interference components are included in the data collected on-site, so it is necessary to conduct noise reduction processing on the original signal and eliminate the influence of background noise on the signal characteristics, so as to facilitate subsequent feature extraction. The wavelet coefficients have different distribution characteristics, which are obtained by the multiscale analysis of noise and signal. Therefore, on the basis of fully considering the frequency distribution characteristics of microseismic signals and the wavelet transform principle, the wavelet function db4 is selected to carry out noise reduction pretreatment for microseismic signals [38]. The signal waveforms of coal and rock fracturing and artificial blast vibration before and after pretreatment are shown in Figure 7. As can be seen from Figure 7, the pretreatment method removes part of the noise components, retains the useful signal components, reduces the interference of high-frequency signal waveform such as burr, and has a good denoising effect, which is beneficial to the later time-frequency analysis.

**(a)**

**(b)**

**(c)**

**(d)**

##### 3.2. S Transformation Method

S transform (ST) [39] is a time-spectrum representation method similar to short-time Fourier transform (STFT) proposed by American geophysicist Stockwell in 1996. S transform (ST) is the combination of short-time Fourier transform (STFT) and Wavelet Transform (WT), and its window function is a Gaussian window whose scale varies with frequency. Although ST does not have the high time-frequency resolution of Wigner distribution, it does not introduce cross-term interference and can retain the phase information of each frequency component of the analyzed signal, while ST is completely reversible [40]. The time-frequency resolution of ST is characterized by the following: the low-frequency part has high-frequency resolution, but its temporal resolution is low. The high-frequency part has higher time resolution, but its frequency resolution is lower. Combining with the characteristics of microseismic signals and the purpose of extracting microseismic information, ST is used as a time-frequency analysis tool in this paper. The formula of ST can be expressed aswhere is the ST result of time-domain signal . is the original signal. *t* is the time and *f* is the frequency parameter, Hz. *τ* is the parameters of the Gaussian window position to control in the *t* axis. *k* is the scale parameter, and the size of *k* affects the time-frequency resolution. The larger the *k* value is, the higher the frequency resolution will be, and the correspondingly lower the temporal resolution will be. The Gaussian window function is as follows:

The Gaussian window function must satisfy the formula

The wider the time window is, the lower the time resolution will be and the higher the frequency resolution will be. The narrower the time window is, the higher the time resolution will be and the lower the frequency resolution will be. Therefore, the time window of ST in low-frequency band is wider, so as to obtain higher frequency resolution. As the time window of high-frequency band is narrow, high time resolution can be obtained [41, 42]. ST represents the local spectral characteristics, and the Fourier spectrum can be easily obtained by integrating it in the direction of time:

Therefore, can also be obtained by inversion accurately:

##### 3.3. Feature Extraction of Manifold Learning

The validity and simplicity of the characteristics of microseismic signals are the premise of realizing the recognition of microseismic signals. However, the microseismic signals collected in the coal mining site are nonlinear and nonstationary, which brings great difficulties to effectively extract the time-frequency characteristics of microseismic signals. Compared with the shortcomings of traditional linear dimension reduction methods (such as principal component analysis) in dealing with nonlinear problems, Manifold Learning (ML) can discover the essential structure of nonlinear high-dimensional data and then effectively conduct data dimensionality reduction and extract concise and effective features [43]. As a classical ML algorithm, the basic idea of Local Linear Embedding (LLE) is to retain the nonlinear characteristics of the original data by maintaining the linear reconstruction coefficient of each point in the low-dimensional embedding space [44].

The principle of Local Linear Embedding (LLE) is to maintain the local order relation of data in the embedded space and the essential space [45]. Assume that observation dataset is located in a high-dimensional space of the embedded manifold, and the local data points which are embedded in the inner space and in the field of low-dimensional space corresponding remain the same local neighborhood relationship. Assuming that each sample point in the observation space is obtained by the weighted average of all points in the neighborhood and can form a weight matrix in the multidimensional space, finally several positions by using closed can be described effectively by the low-dimensional structure. Its essence is that, in the high-dimensional space *R*^{L} given dataset *X* = {*x*_{1}, *x*_{2}, …, *x*_{n}}, the purpose is to find a low-dimensional dataset *Y* = {*y*_{1}, *y*_{2}, …, *y*_{m}}, and *m* << *n*. LLE algorithm consists of three steps. At first, search nearest neighbor points of each point according to the Euclidean distance between data points, and construct nearest neighbor matrix. Secondly, the local approximate hyperplane of each data point is calculated according to neighbor matrix, and the adjacency weight matrix is calculated. Finally, the internal low-dimensional coordinates are calculated based on the optimal reconstruction error of local adjacency weight matrix.

##### 3.4. Improved Gaussian Mixture Model Signal Recognition

###### 3.4.1. Gaussian Mixture Model

Gaussian Mixture Model (GMM) is essentially a multidimensional probability density function, which is composed of multiple Gaussian probability distributions [46]. GMM can be used to model arbitrary signals or data because it can approximate the density distribution of any shape smoothly, its modeling calculation is small, and the model performance is not affected by the number of training samples.

GMM is to estimate the probability density distribution of samples, and the estimation model is the weighted sum of several Gaussian models. Each Gaussian model represents a class. GMM is an extension of a single-Gaussian probability density function. If the data in the sample are projected on several Gaussian models, the probabilities of each class will be obtained, respectively. Then, the class with the highest probability can be selected as the decision result. Its specific mathematical expression [35] is that given {*x*_{1}, *x*_{2}, …, *x*_{n}|*x* ∈ *R*^{d}} is the value of *n* random samples of random variable *x*, then the probability density function of *d* variable mixing model with *k* components is expressed as shown in equation (7):

The distribution consists of *k* mixed distributions, each of which corresponds to a Gaussian distribution, where *μ*_{i} and *σ*_{i} are the parameters of the *i*^{th} Gaussian mixture, while is the corresponding mixing coefficient, and . If the random variable represents the Gaussian mixture distribution of the generated sample *x*_{j}, whose value is unknown, the prior probability of *θ*_{j} corresponds to . According to Bayes’ theorem, the posterior distribution of *θ*_{j} corresponds to the formula

According to formula (7), gives the posterior probability of sample *x*_{j} generated by the *i*^{th} Gaussian mixture. It is assumed that all distributions have the same functional form and their properties are determined by the parameter vectors *μ*_{i}, *σ*_{i}, and . GMM usually uses Expectation-Maximization (EM) algorithm for GMM parameters. EM optimizes the nonlinear probability function of GMM. Firstly, the initial value of model parameters is assigned according to the number of classifications. From the initial estimate (*μ*_{0i}, *σ*_{0i}), the parameter (*μ*_{i}, *σ*_{i}) is calculated, that is, the probability of each sample belonging to the class *i*. At last, known from the maximum a posteriori probability criterion, the classification with the greatest a posteriori probability is the sort result [34].

###### 3.4.2. Gaussian Mixture Model Based on Improved Bee Colony Algorithm

GMM is a parameterized model, and its parameter selection directly affects the modeling effect. EM uses the gradient descent method to make the convergence rate unstable. Although the convergence can be guaranteed in theory, sometimes the singular matrix will appear in the calculation and lead to the convergence failure. Meanwhile, EM can only guarantee convergence to local extremum. So the algorithm has a strong dependence on the initial value of each class. Once there is a large deviation between the estimation of the initial value and the real value, EM will easily fall into the local optimization, which will affect the quality of model parameter estimation and lead to a decline in classification accuracy [33]. An improved Bee Colony algorithm is introduced to optimize the estimation of GMM parameters so as to improve the estimation accuracy.

Bee Colony (BC) algorithm [47] solved the optimization problem in practice by imitating the self-organizing behavior of bees in search of good nectar source. In the standard BC, the position of each honey source represents a possible solution to the optimization problem, and the excellent characteristics of the honey source correspond to the fitness value of the problem. The worker bees are mainly divided into two kinds of colonies, namely, scout bees and following bees. The scout bees are responsible for searching the honey source to find the solution of the appropriate spatial range. When the source in some place becomes bad, the other bees in the place immediately become scout bees. When a good source of honey is found, the scout bees also act as guide bees to guide the direction of the following bees. Its search behavior is as follows:where *X*_{i} is an individual of population *N*_{e} and *t* represents the *t*^{th} generation. *j* is a component of a *D*-dimensional solution vector, *k* is a bee of the size *Ne* of a group, and *k* is not equal to *i*. *k* and *j* are generated randomly and are between [−1, 1].

The following bees get the sharing information of the leading bees. Choosing a leading bee according to the adaptive value of the leading bees population, follow it to find the honey source, and search for new positions in its neighborhood. When the search reached a certain stage, the honey source quality would decrease; that is, when the search times *Bas* reached a certain threshold *Limit* and no better position was found, the honey source should be abandoned to obtain a better search range, and the position of bees should be randomly initialized and changed again. Its search formula is as follows:where *X*_{min} and *X*_{max} are the minimum and maximum values of the solution space, respectively, and rand(0, 1) is a random number between [0, 1].

Although the artificial Bee Colony (BC) algorithm has the advantage of globally efficient search compared with other optimization algorithms, it is easy to fall into the local optimization in the face of complex functions due to its many random factors and low convergence accuracy because of the fast convergence in the later stage of optimization. Bacterial chemotaxis idea [48] is to simulate the continuous switching between the attraction and repulsion operations of bacteria in the process of foraging in response to chemical stimuli. To increase the diversity of population and the local optimization ability, the following bees colonies will imitate attraction and exclusion operation of the thoughts of the bacterial chemotaxis optimization behavior to solve the accuracy of defects, namely, when they swarm into local optimum, increase the exclusive operation, quickly change direction, and carry on the random walk. When the optimization speed is too slow, IBC is proposed by increasing the attraction operation, rapidly gathering, and improving the swarm optimization speed:where *P*_{best} is the globally optimal position in the population, *P*_{worst} is the global optimal position in the population, *T* is the maximum number of iterations, and are the minimum and maximum values of inertia weight, respectively, *c*_{1} is the dynamic attraction constant, while *c*_{2} is the dynamic exclusion constant, which is usually between [0, 2]. When the attraction operation is performed, the value of *c*_{1} is larger. Conversely, when the exclusion operation is performed, the value of *c*_{2} is larger. Other parameters are the same as above.

From the above process, it can be seen that GMM mainly has three parameters (*μ*_{i}, *σ*_{i}, and ) that need to be optimized and set. The three parameters are usually set according to experience adjustment, with a great subjectivity, and they are often not the best parameters. In the production site, the parameters need to be adjusted according to the specific situation of different on-site monitoring. However, most coal mine operators do not know the procedures and algorithms, so it is difficult for nonprofessionals to set the parameters of GMM. In order to avoid these problems and obtain the optimal detection effect, the above IBC is used to realize the self-tuning and optimization of parameters, and the optimized parameters were provided to GMM.

##### 3.5. Evaluation Indicators

(1)Accuracy: for a given test dataset, accuracy is the ratio of the number correctly classified to the total number of samples by the classifier. To further analyze the classification ability of discriminators, predictive positive value (PPV), negative predictive value (NPV), and Matthew’s Correlation Coefficient (MCC) have been also calculated. The first three indicators evaluate the effects of positive and negative classifications, and the last one, a relatively balanced indicator, evaluates the reliability of the algorithm [22]. Generally, the larger their values are, the better the classification will be. In this paper, they are used to quantitatively evaluate the performance of the microseismic classification algorithm. These parameters are defined as follows: where a true positive (TP) represents a correctly identified rock-fracturing event, a true negative (TN) represents a correctly identified blast vibration event, a false positive (FP) represents a misidentified rock-fracturing event, and a false negative (FN) represents a misidentified blast vibration event.(2)*F*-measure: the indicators of *Precision* and *Recall* restrict each other, and sometimes there are contradictions, which need to be taken into comprehensive consideration. *F*-*measure* is an evaluation index combining the two, which is defined as the weighted and harmonic average of *Precision* and *Recall*: In the formula, *α* is the weight factor; it is generally considered that the precision rate and the recall rate are equally important; namely, *α* = 1. *P* and *R* are *Precision* ratio and *Recall* ratio, respectively. *F-measure* integrates the results of *Precisio*n ratio and *Recall* ratio. When *F-measure* is high, it indicates that the classification effect is relatively effective. So *F-measure* is also used to evaluate the data classification stability.(3)ROC curve and AUC value: receiver operating characteristic (ROC) is a curve with False Positive Rate (*FPR*, specificity) as the horizontal axis and True Positive Rate (*TPR*, sensitivity) as the vertical axis, reflecting the sensitivity to the same signal stimulus. The more the curve deviates from the 45-degree diagonal, the more accurate the model classification is. The value of AUC (area under curve) is the area under the ROC curve. Obviously, the value of this area cannot be greater than 1, and the value range is [0.5, 1]. Combined with ROC curve, the classifier with larger AUC has better effect. This value can better reflect the actual classification sensitivity of data.

##### 3.6. Signal Recognition Process

###### 3.6.1. Microseismic Signal Acquisition

In this paper, the mine microseismic monitoring system produced by ESG (Engineering Seismology Group) of Canada was used to conduct 24 h microseismic monitoring, and the waveform was picked up and analyzed to study the spectral characteristics of rock-fracturing signals and blast vibration signals in the mining process.

###### 3.6.2. Preprocessing

As mentioned in Section 2.1, microseismic signals will be mingled in the process of acquisition by the interference of background noise, construction, and the environment and background noise can affect the accuracy of feature parameter extraction. In full-consideration characteristics on the basis of seismic signal frequency distribution, this paper selects db4 as the wavelet function of microseismic signal denoising pretreatment, as much as possible to retain useful signal components, reduce the interference of high-frequency signal waveform such as burr, and improve the efficiency of feature to pick up.

###### 3.6.3. Time-Frequency Analysis of ST

Since the Gaussian window whose width varies inversely with the frequency is introduced, ST has better time-frequency resolution and time-frequency focusing, the time-frequency analysis results obtained by ST are more intuitive, and the analysis results of the high-frequency part of the signal are more detailed. Therefore, this paper uses the above ST calculation principle, uses MATLAB programming to realize the above preprocessed microseismic signal to S transformation, obtains the time-frequency characteristics of the waveform, and analyzes the field microseismic activity spectrum rule.

###### 3.6.4. Feature Extraction of ML

Using the time-frequency characteristic parameters obtained by ST time-frequency analysis, the main characteristics of microseismic wave were extracted by means of the dimensionality reduction calculation in ML. In this way, the corresponding feature space is established by using the feature data in the normal state, which can be used as the feature vectors of the improved GMM for the next step of cluster learning.

###### 3.6.5. GMM Signal Recognition

GMM optimized by the IBC was constructed, and the feature vectors are introduced into the model. The classification error rate recognized by the microseismic signals is used as the fitness function, and IBC was used to continuously optimize the parameters of GMM to generate (*μ*_{i}, *σ*_{i}, and ) to obtain the clustering results and realize the classification and recognition of the microseismic signal.

###### 3.6.6. Classification Evaluation and Application

As mentioned above, this paper proposes that the classification results of seismic wave shape of the algorithm can be reflected by accuracy and F-measure. In this paper, the obtained microseismic wave shape will be directly used in the subsequent microseismic source solution and other convenient analyses to save the time of artificial waveform processing. The specific implementation steps are shown in Figure 8.

#### 4. Feature Extraction of Microseismic Signal

##### 4.1. Time-Frequency Analysis of ST

Identifying effective microseismic events from a large number of signals received by the microseismic system is the key to processing microseismic data. Therefore, through observing the signal waveform diagram after the pretreatment in Figure 7, get the typical rock-fracturing signals and blast vibration signals with the following differences. (1) Rock-fracturing signals usually occur due to the shear failure of rock mass, while blast vibration signals are mainly compression wave; therefore, S waves of rock-fracturing signals are more obvious than that of the blasting vibration signals. (2) P wave of blast vibration signals is more obvious than that of rock-fracturing signals, which are caused by the fact that the blast vibration signals are mainly longitudinal wave. (3) The peak amplitude of blast vibration signals is usually larger than that of rock-fracturing signal. However, the above analysis is only the qualitative analysis of feature extraction, which is usually used as the basis for artificial recognition of microseismic signals and cannot be used for quantitative recognition of microseismic signals.

Therefore, the frequency variation characteristics of the signals with time are revealed by analyzing the waveform by ST. The sensor sampling frequency of ESG system is set as 5000 Hz, and according to the sampling theorem, the Nyquist frequency is 2500 Hz. The ST analysis results of two types of typical signals in Figure 7 are shown in Figure 9. As the mine signals are mainly concentrated in the low-frequency part, 0–1000 Hz is selected as the display frequency band, where Figures 9(a) and 9(b) are the spectrum of 2 types of events.

**(a)**

**(b)**

Real-time online monitoring of microseismic monitoring system is mainly signal continuous change over time, the transformation to the study of the signal in frequency domain, analysis of its waveform, and frequency characteristics; as can be seen in Figure 9, the scene to pick up the signals contains multiple frequency components, compared with the rock failure signals; frequency components of blast vibration signals are more complicated. From the range of frequency distribution, blast vibration signals are distributed in the higher frequency band, mainly in 100–200 Hz frequency band. Rock-fracturing signals are distributed in the lower frequency band, which is 0–100 Hz. The signal characteristic is analyzed by ST, and the following characteristic parameters are obtained, respectively.

###### 4.1.1. Duration

Figure 10 shows the statistical and probability distribution results of blast vibration waveform and rock-fracturing waveform in coal mine. As shown in Figure 10(a), on the point of the time, blast vibration signals have short, fast mutation characteristics, so the different frequency components have short duration, and the duration is 8–20 ms. While rock-fracturing signals originate in the internal microcracks initiation, development, and the rock mass, the process is relatively slow, and the duration is usually 18–35 ms. Therefore, it can be seen that ST can not only describe the frequency characteristics of the signal changing with time but also reproduce the dominant frequency duration of the microseismic wave shape very well, without the need to pick up the P wave arrival time and tail time, which reduces the workload.

**(a)**

**(b)**

The probability density for the duration is presented in Figure 10(b). This function is the result of single-Gaussian fitting based on the duration statistical results. The single-Gaussian fitting of the duration distribution provides good separation and the average duration of the rock-fracturing signal is usually greater than the blast vibration signal. The results of the single-Gaussian fitting are basically consistent with the actual distribution. 70% of the duration for the rock-fracturing waveform and the blast vibration waveform are distributed above 30 ms and below 20 ms, respectively, and the rest 30% of the waveforms are distributed within the overlap interval [20–30 ms]. Therefore, the overlap of events is small.

###### 4.1.2. Maximum Amplitude

As can be seen from Figure 9, the duration of blast vibration signals is short, and the vibration amplitude is large. The amplitude of rock-fracturing signals is small, there is no obvious waveform interval, and the signal attenuation process is slow. Figure 11 is the statistical results of 100 groups of microseismic signal amplitude distribution, 66.9% of the blast vibration signals’ amplitude is above 0.01 V, while 75.8% of rock-fracturing signals’ amplitude is under 0.001 V. It can be summarized to the field distribution regularity that the amplitude of blast vibration signals is significantly higher than rock-fracturing signals generally, which can be used as a characteristic of seismic signal recognition. But because not all signals are in accordance with the law, it cannot be used as the basis for a single signal classification vector.

**(a)**

**(b)**

The probability density distribution of the microseismic amplitude is obtained by the single-Gaussian fitting of the amplitude statistics results as shown in Figure 11(b). It can be seen from Figure 11(b) that the average amplitude value of the blast vibration waveform is usually greater than that of rock fracturing. 15% of the blast vibration waveform and rock-fracturing waveform are distributed in the overlap interval (0.0001–0.01 V). The amplitude distribution for 85% of the blast vibration waveform and rock-fracturing waveform can be distinguished well. It can be seen that the single-Gaussian fitting of the amplitude size distribution provides a good separation.

###### 4.1.3. Main Frequency and Dominant Frequency

As can be seen from statistical results of main frequency distribution in Figure 12, the main frequency of blast vibration signals is around 150 Hz, and the frequency distribution is relatively dispersed and the frequency component is relatively complex. The frequency distribution of rock-fracturing signals is relatively concentrated, and the main frequency is relatively low, mainly around 70 Hz. On the whole, the frequency band of blast vibration signals is rich, and the dominant frequency is distributed to a certain extent in 0–500 Hz, mainly between 100 and 200 Hz in the frequency band. However, the dominant frequency of rock-fracturing signals is lower than that of blast vibration, with narrow frequency band distribution and dominant frequency mainly distributed in 0–100 Hz. The waveform frequency generated by blast vibration signals is usually higher than that of rock-fracturing signals, which may be caused by the different source mechanisms. The typical rock-fracturing signal is usually in a continuous and single form with a long duration and a slow attenuation, which is due to the process of propagation and development of cracks in rock mass. It is worth noting that the above selected are blast vibration events and rock-fracturing events with obvious waveform characteristics, and in fact, the signals are often complex and changeable. Through the analysis of the typical events, the characteristics of distinguishing the two types of events can be obtained more effectively.

**(a)**

**(b)**

The single-Gaussian probability density distribution of the main frequency of microseismic is shown in Figure 12(b). It can be seen from Figure 12(b) that the distribution range of the main frequency of blast vibration waveform is relatively wide, with distribution within 300 Hz, and the average main frequency value is usually greater than that of rock fracturing. The dominant frequencies for 75% of blasting vibration waveform and rock-fracturing waveform are, respectively, distributed above 110 Hz and below 100 Hz, and the dominant frequencies for the remaining 25% of the blast vibration waveform and rock-fracturing waveform are in the overlapping interval.

###### 4.1.4. Energy Ratio of Specific Frequency Band

ST can effectively separate different frequency components at the same time, so the signals can be divided into the distributed frequency band, and the frequency characteristics can be accurately quantified through signal reconstruction. Given that *h* represents the original signal, , then the total energy *E*, frequency band energy *E*_{e}, and frequency band energy ratio *e* of signal *h* can be defined as follows:where *x*_{i} is the discrete point amplitude of the signal, *n* is the length of the original signal (the total sampling point), and *m* is the signal length of the frequency band [*f*_{1}, *f*_{2}].

It can be inferred from the analysis that blast vibration signals and rock-fracturing signals have different frequency domain characteristics and energy distribution characteristics. Two sets of typical blast vibration signals and rock-fracturing signals are selected for the above analysis. The frequency range of the microseismic signal [0–1000 Hz] is divided into 10 frequency bands in units of 100 Hz, and S-inverse transformation is carried out to obtain the energy of each frequency band. The resulting frequency band energy distribution is shown in Figure 13. As can be seen from Figure 13(a), the energy distribution of the two types of events shows obvious differences in frequency bands. Compared with the rock-fracturing signals, the frequency distribution range of blast vibration signals is relatively wide, mainly distributed in 0–400 Hz. The energy distribution of rock-fracturing signals is relatively single, mainly distributed below 200 Hz. However, both signals are concentrated in 0–200 Hz, and the energy proportion of rock-fracturing signals reaches 98.79%, and that of blast vibration signals is 82.03%. However, in the frequency range of 0–100 Hz and 100–200 Hz, there is a significant difference between them. The energy proportion of blast vibration signals are 12.14% and 69.89%, respectively, while the energy proportion of rock-fracturing signals is 88.13% and 11.86%, respectively. The above analysis shows that the blast vibration signals and rock-fracturing signals in 0–100 Hz and 100–200 Hz showed obvious difference, so in order to reduce the dimension of feature vector, this paper selects distribution in the frequency band energy ratio between the two as recognition features to analyze. It can reflect 2 kinds of signal frequency band energy distribution differences. But just showing special energy distribution characteristics of the microseismic signals, it cannot fully describe the overall characteristics of the seismic signal, so it is only taken as one of the classification feature vectors.

**(a)**

**(b)**

The single-Gaussian probability density distribution of energy ratio of specific frequency band of microseismic is shown in Figure 13(b). There is a good separation for the body of the distribution of energy ratio of specific frequency band for the rock-fracturing signals and blast vibration signals, with an overlap on the small moment values. Although the mode of the distribution for blast vibration signals is higher than that for rock-fracturing signals, the overlap of the two distributions for larger moments indicates that energy ratio of specific frequency band, as a discriminator parameter, will perform better for larger events and poorer for smaller events. The energy ratio values for the centre 70–80% of rock-fracturing signals and blast vibration signals are focused on the intervals of 0–0.7 and 0.9–2.4, respectively.

###### 4.1.5. Energy Entropy

For nonlinear nonstationary signals, the energy entropy which characterizes the energy distribution complexity of the signals can describe the characteristics of the signals more accurately compared with the total energy of the signal. Combining with Shannon information entropy theorem, Energy Entropy (*SEE*) of S-inverse transformation is proposed, as shown in (14). The total energy *E* of the original signal *h* is equal to the sum of the energy *E*_{e} of each frequency band, as shown in (13). If *ε*_{j} = *E*_{e}/*E*, ∑*ε*_{j} = 1. The greater the energy entropy of S-inverse transformation is, the more evenly the energy of the original signal is distributed among all frequency bands, the greater the complexity of the original signal is, and the stronger the randomness is:where *j* is the serial number of frequency band division and *N* is the total number of frequency bands divided. The band splitting rule is described in the frequency band energy above.

The spectrum [0–1000 Hz] of 100 groups of rock-fracturing signals and blast vibration signals is divided into 10 frequency bands for *SEE*, and the statistical results are obtained as shown in Figure 14. As can be seen from Figure 14(a), there is a certain difference in *SEE* between rock-fracturing signals and blast vibration signals. *SEE* of rock-fracturing signals fluctuates between 0 and 0.31, with an average value of 0.12. *SEE* of blast vibration signals has a wide range of floating values, all of which are greater than 0.12 and floating between 0.12 and 0.5, with an average value of 0.29. Thus, it can be seen that *SEE* of blast vibration signals is mostly greater than that of rock fracture signal, which reflects the instability of blast vibration signals. Due to the fact that the value has crossover in some events and is difficult to be distinguished, *SEE* cannot accurately represent the vibration signals under two working conditions, so it is necessary to increase the time-frequency characteristics to improve the accuracy of the microseismic signals’ classification.

**(a)**

**(b)**

The single-Gaussian probability density distribution of microseismic energy entropy is shown in Figure 14(b). The figure shows that the average energy entropy for blast vibration signals is larger than that for rock-fracturing signals, although the separation is small. The values of energy entropy for the 60% rock-fracturing signals and blast vibration signals are focused on the intervals of 0–0.15 and 0.24–0.45, respectively. The other 40% of values for rock-fracturing signals and blast vibration signals overlap. However, energy entropy can only classify about 60% of blasts from microseismic signals. Due to this separation in the distributions of energy entropy for blast vibration signals and rock-fracturing signals, this parameter contributes to an improved discrimination result.

###### 4.1.6. Corner Frequency

The corner frequency *f*_{c} of the microseismic signal can be obtained from the logarithmic curve of the microseismic displacement spectrum (Figure 15), which is the frequency at the intersection of the high-frequency part and the low-frequency part of the asymptotic line. Source parameters such as corner frequency of microseismic events are given by using the Brune dislocation model and geometric method [49]. Figure 16(a) shows the corner frequency distribution and the probability density of 50 blast vibration signals and 50 rock-fracturing signals. It can be seen that the corner frequency of blast vibration signals is mostly higher than that of rock-fracturing signals, and the average corner frequency is 214.50 Hz. The minimum corner frequency of rock fracture signals is 55.85 Hz, and most of them are over 70 Hz, with an average of 112.79 Hz. The difference in the corner frequency between blast vibration signals and rock-fracturing signals is 101.71 Hz on average. It can be seen that rock-fracturing signals are generally lower than the blast vibration signal at the corner frequency of microseismic signal.

**(a)**

**(b)**

Figure 16(b) shows the values of corner frequency for 70% of rock-fracturing signals which are smaller than that of blast vibration signals and the values of corner frequency for 60–70% of rock-fracturing signals and blast vibration signals that occur within the intervals around 70–120 and 150–280, respectively. The other 30–40% of rock-fracturing signals and blast vibration signals are distributed in the fuzzy intervals. Although there is some separation in the distributions of corner frequency, it is relatively small.

###### 4.1.7. Peak Frequency

Peak frequency is a basic parameter in the study of source parameters. Peak frequency not only is related to the stress state in the earthquake area but also reflects the distribution characteristics of the high and low-frequency energy of seismic waves and is also related to the process of earthquake initiation. To a considerable extent, the spectral characteristics are related to the inhomogeneity of the medium in the ray path between the source and the receiving station as well as to the frequency characteristics of the recording instrument. The peak frequency (excellent frequency) *f*_{0} of the microseismic signal can be obtained from the logarithmic curve of spectrum Figure 15, which corresponds to the maximum spectrum value.

Figure 17 is a comparison of peak frequencies of 100 groups of microseismic events. As can be seen from Figure 17(a), the peak frequency of blast vibration signals is at least 32.14 Hz, and most of them are higher than 40 Hz, with an average of 75.07 Hz. The peak of rock-fracturing signals is 76.03 Hz, and the average is 47.64 Hz. The peak frequency difference between blast vibration signals and rock-fracturing signals is 27.43 Hz on average. It can be seen that the peak frequency of rock-fracturing signals is generally lower than that of blast vibration signal.

**(a)**

**(b)**

Figure 17(b) shows that the average peak frequency of blast vibration signals is larger than that for rock-fracturing signals, although the separation is small. The values of peak frequency for the 60% blast vibration signal and rock-fracturing signals are focused on the intervals of 10–50 Hz and 60–110 Hz, respectively. The other 40% of values for blast vibration signals and rock-fracturing signals overlap. This may be attributed to the complex mechanism of blast vibration signals which may generate higher frequency components.

###### 4.1.8. Frequency Bandwidth

The frequency bandwidth of microseismic *f*_{b} is the difference between the corner frequency and the peak frequency, as shown in Figure 15, which reflects the effective spectrum range of the collected microseismic signal and the stability of the microseismic signals. According to the above calculation, the distribution results of frequency bandwidth of microseismic wave shape can be obtained in Figure 18. It can be seen from Figure 18(a) that the frequency bandwidth of rock-fracturing signals is mostly below 100 Hz, with an average of 65.15 Hz. The lowest frequency bandwidth of blast vibration signals is 27.37 Hz, and most of them are above 80 Hz, which indicates that rock-fracturing signals have a stable change law compared with blast vibration signals, and the variation of blast vibration signals is quite different, and there is a crossover.

**(a)**

**(b)**

Probability distribution of frequency bandwidth distribution of microseismic signals is presented in Figure 18(b) which shows similar features to that of the distributions of peak frequency. The values of frequency bandwidth for the centre 60% blast vibration signals and rock-fracturing signals occur within the intervals 0–70 Hz and 90–210 Hz, respectively. The other 40% of blast vibration signals and rock-fracturing signals are distributed in the overlap intervals. It can be seen that the interval of overlap for events is large. However, frequency bandwidth can only classify about 40% of blast vibration signals from events. Due to this separation in the distributions of frequency bandwidth for blast vibration signals and rock-fracturing signals, other parameters are also needed for classification.

###### 4.1.9. Fractal Dimension

Fractal is a universal feature of complex things in nature. The degree of irregularity of nonstationary random vibration signals can be described by fractal dimension. The common methods to calculate the fractal dimension include box dimension, grid dimension, and correlation dimension. The algorithm based on the idea of “covering” is simple and clear. This method assumes that 1 set of real planes *R*^{2}, *X*, and *Y* is, respectively, the closed set of 1 nonempty subset and 1 scale *ε*; then the fractal dimension of *X* is defined as follows:where is the number of *Y* required to cover *X*.

The fractal dimension of microseismic signals is calculated, and the fractal dimension of 100 groups of rock-fracturing signals and blast vibration signals is shown in Figure 19. As can be seen from Figure 19(a), the morphological dimension of rock-fracturing signals is distributed between 1.238 and 1.419, with an average value of 1.319. However, the fractal dimension of blast vibration signals fluctuates around 1.311–1.538 with an average value of 1.403, which means that the fractal dimension of rock-fracturing signals is basically lower than that of blast vibration signals. When the fractal dimension difference of rock-fracturing waveform and blast vibration waveform is 1.351, the boundary is obvious. The fractal dimension above 1.351 is dominated by blast vibration signals and below 1.351 is dominated by rock-fracturing signals. If the morphological dimension of 1.351 is used as the discriminating factor to identify rock-fracturing signals and blast vibration signals, there are 7 rock-fracturing events with fractal dimension higher than 1.351, and the recognition accuracy is 86%, while there are 8 blast vibration events with fractal dimension lower than 1.351, and the recognition accuracy is 84% in the samples extracted. The accuracy of overall event recognition is 85%. From the perspective of waveform complexity, most of the waveforms generated by rock-fracturing events are single, while the waveforms are relatively complex due to noise doping in underground production blasting. It is not enough to take the fractal dimension 1.351 as the waveform identification factor of the microseismic system in this mine, so it is necessary to analyze and extract the fractal characteristics of the microseismic wave by comprehensively considering the various time-frequency factors analyzed above.

**(a)**

**(b)**

There is a good separation for the body of the distribution of morphological fractal dimension for blast vibration signals and rock-fracturing signals, with an overlap on the small moment values, which is shown in Figure 19(b). 85% of blast vibration signals have values of morphological fractal dimension of approximately 1.35–1.45, whereas rock-fracturing signals have values of approximately 1.22 to 1.33. It can be seen that the interval of overlap for events is small. Although morphological fractal dimension performs well for events, it still produces a lot of errors.

##### 4.2. Feature Extraction of Manifold Learning

Figure 20 provides a box plot to demonstrate the distribution of normalized characteristic data. Obviously, the range of features for blast vibration signals locates above the range of features for rock-fracturing signals except for the duration. Nine data groups contain outliers including the duration, maximum amplitude, main frequency, energy ratio of specific frequency band, energy entropy, corner frequency, peak frequency, frequency bandwidth, and fractal dimension for blast vibration signals and rock-fracturing signals. The medians of nearly all data groups are not at the centre lines of boxes, which means data distribution for each group is asymmetric. The information delivered by Figure 20 demonstrates that the selected features are distinct for differentiating microseismic and blasting events.

Because the waveform features selected above are too complicated and affect the processing efficiency, Manifold Learning (ML) conducts reasonable dimensionality reduction screening of the signal features, randomly selects 50 rock-fracturing signals and 50 blast vibration signals from the mine microseismic system, and extracts the waveform features of each event. The feature vector *Q*, which is composed of the filtering waveform features in Section 4.1, constitutes a high-dimensional dataset, and the low-dimensional embedding of the high-dimensional dataset is obtained by using LLE. Silhouette [44] is used to quantitatively evaluate the dimension reduction effect of LLE.

Suppose a dataset has *N* clustering , is the average distance between sample and all other samples in the class, and is the average distance between sample *x* and all samples in another class *C*_{j}, denoted as ; then the Silhouette index *S*_{b} of sample *x* is as follows:

The average value of Silhouette *S*_{b} for all samples reflects the quality of the clustering results. The higher the value is, the better the clustering quality is and the closer the dimensionality reduction effect is to the topology of the original data. According to LLE’s algorithm, the selection of embedded dimension *d* and domain factor *l* is very important to the effect of dimension reduction of LLE. In this paper, Silhouette index is adopted as the target function value, and IBC is used to select the best embedded dimension *d* and domain factor *l*. The parameters of IBC are as follows: *N*_{e} = 20, maximum iteration times *T* = 100, searching times Limit = 20, *F* = 0.5, CR = 0.5, = 0.9, = 0.1, and *c*_{1} = *c*_{2} = 1.5. The final optimization results are as follows: *d* = 2, *l* = 12, and *S*_{b} = 0.91. The dimensionality reduction effect of LLE optimization is shown in Figure 21. It can be seen from Figure 21 that, after the dimension reduction of LLE, the low-dimensional embedding of rock-fracturing samples and blast vibration samples has an obvious interface, which intersects slightly, and the effect is better.

By the mentioned above, the duration, maximum amplitude, main frequency, energy ratio of specific frequency band, energy entropy, corner frequency, peak frequency, frequency bandwidth, and fractal dimension of two types of microseismic signals are obvious difference. However, the identification of microseismic signals with any indicator alone cannot achieve satisfactory results. The characteristics of nine groups are filtered, and the characteristics of the two kinds of microseismic signals can be well characterized. Too many feature vectors lead to complex computation. LLE dimensionality reduction is adopted to form multiscale feature vectors, and the filtering results are shown in Figure 21, which not only highlights the local characteristics of data but also reduces the computational workload.

#### 5. Improved GMM Signal Recognition

##### 5.1. Feature Input

Feature vectors are constructed to determine training data and test data. Two kinds of microseismic signal data established by the microseismic monitoring system in Xiashijie coal mine fully mechanized working face 2305 are taken as analysis object. Through ST and the time-frequency analysis, the duration, maximum amplitude, main frequency, energy ratio of specific frequency band, energy entropy, corner frequency, peak frequency, frequency bandwidth, and fractal dimension of two types of microseismic signals are constructed to nine time-frequency characteristics vector *Q*, and LLE is used for dimensionality reduction to obtain low-dimensional embedded eigenvector *R*.

From the existing data, 50 groups of rock-fracturing signals and blast vibration signals were selected, 1–80 groups of data were used as training samples, and 81–100 groups of data were used as prediction samples. The identification category of rock-fracturing signal is set as 1, and that of blast vibration signal is 2. The training set and test set are normalized to the interval of [0, 1], and the calculation formula is as follows:where *x* and are the values before and after normalization, respectively. The eigenvector *R*, which is the LLE dimensionality reduction result of the high-dimensional feature vector *Q*, is defined as the input vector of the classification machine learning judgment of the improved GMM.

##### 5.2. Pattern Recognition

Due to the limited experimental data collected, in order to achieve the goal of intelligent classification, machine learning algorithm GMM with high classification accuracy and few required samples was used for classification. Two kinds of microseismic signals were analyzed and processed by the feature extraction method of microseismic signals described in this paper, and 50 sets of multiscale and high-dimensional feature vectors were obtained. GMM uses the Gaussian probability density function to identify and classify the subsamples. Based on the single-Gaussian model (GSM), GMM can adapt to different classification requirements and approximate arbitrary continuous probability density distribution by adjusting the quantity and weight of GSM. IBC is introduced into the parameter estimation of GMM. The search space of bees is *R*, and the position of each individual is , , , *μ*_{j}, *σ*_{j}, and are, respectively, the mean value, variance, and mixed probability of component distribution of the *j*^{th} Gaussian distribution. The classification error of expected value and predicted value is used as the adaptive value function, and the optimization calculation is carried out by IBC. The parameter setting of IBC is the same as in Section 4.2. Finally, the optimal parameters are assigned to GMM for the recognition of microseismic signals. Meanwhile, EM is also used to calculate the parameters of the Gaussian Mixture Model to test the effect of the model. The recognition results are shown in Table 1.

Figure 22 is GMM optimization curve, where EM and IBC are used to solve GMM parameters. The same microseismic signal data was used to train GMM in the experiments, and the results of the adaptive value of each iteration were recorded. In the process of solving, the parameters of each algorithm are set as the values when the best results are obtained, respectively. It can be seen from Figure 22 that EM falls into the local optimum when iterating 29 times, and the resulting function has a poor fitness value and a slow convergence rate. By adding attraction and repulsion operations, IBC not only increases the optimization speed of the algorithm but also jumps out of the local extremum well after 50 iterations. After a series of improvement measures, compared with EM, the global optimal solution of IBC is obtained quickly. The convergence of IBC-GMM is accelerated, its function adaptation is greatly improved, and the classification performance is greatly improved.

##### 5.3. Classification Comparison of Different Methods

Back Propagation (BP) neural network is a kind of multilayer feedforward neural network. In forward transmission, the input signal is processed layer by layer from the input layer through the hidden layer until the output layer. The state of neurons in each layer affects the state of neurons in the next layer. If the output layer cannot obtain the expected output, it will be transferred to the backpropagation. According to the positive weight and threshold of the prediction error body, the predicted output of BP will keep approaching the expected output. This algorithm has been widely used in various fields. So BP network structure is divided into three layers: input layer, output layer, and a hidden layer. The excitation functions of hidden layer and output layer are hyperbolic tangent S-type function (tansig) and logarithmic S-type transfer function (logsig), respectively. Based on this, the sample training and classification of microseismic signal data are carried out.

Random Forest (RF) is a classification machine learning algorithm containing multiple decision trees, and its output category is determined by the mode of the category output of an individual tree. Each decision tree is a classifier (assuming that the classification problem is now addressed), so for an input sample, *N* trees will have *N* classification results. While RF integrates all the classification voting results and specifies the category with the most votes as the final output, this is the simplest Bagging idea. RF is used to divide the training samples of microseismic signals and the prediction samples, and MATLAB toolbox is used to realize RF. The new data is identified and classified, and the classification results are determined by the number of votes of the tree classifier.

Logistic Regression (LR) is a classical classification model, commonly used in dichotomies. The basic idea is to transform the input value into predicted value in linear regression and then map to Sigmoid function. In the regression model, the dependent variable is a qualitative variable, such as 0 or 1. LR is mainly used to study the probability of the occurrence of certain events. The closer the predicted value is to 1, the better the predicted result is. By learning the sample to construct the decision function, the microseismic signal can get classification and prediction.

Bayesian (Bayes) statistical method is a method developed based on Bayes’ theorem to systematically explain and solve statistical problems. The core of Bayesian statistical method is Bayesian formula. Naive Bayes classifier is the simplest and most common classification method in Bayes classifier. This algorithm is a supervised learning algorithm which solves the classification problem. The advantages of the algorithm are simple to understand, have high learning efficiency, and can be compared with decision tree and neural network in some areas of classification problems. In any statistical inference problem about an event, in addition to the information data provided by the sample information, a prior probability distribution of the event must be specified in advance. Bayes adopts the prior probability of 0.5 to carry out the classification and identification test of mine microseismic signal data.

The dimension reduction feature space *R* of LLE is taken as classification feature vector; BP, RF, LR, and Bayes mentioned above, four kinds of commonly used method of pattern recognition, are carried out to classify the microseismic signals to test the superiority of the improved GMM proposed. The recognition results are shown in Table 1. Figure 23 shows the recognition effects of six machine learning methods tested in this paper.

##### 5.4. Quality Evaluation

As shown in Section 3.4, accuracy, *PPV*, *NPV*, *MCC*, *F-measure*, ROC curve, and AUC value are used as the class separability measurement indexes to evaluate the above different microseismic signal recognition algorithms. Figure 24 shows the values of *PPV*, *NPV*, and *F-measure* indexes for different methods of identification and classification of measured data. Figure 25 shows the comparison of ROC curves, AUC, and MCC for different methods of measured data.

**(a)**

**(b)**

As can be seen from Figure 23, under the same machine learning method, IBC-GMM only recognizes 6 wrong signals, while EM-GMM recognizes 12 wrong signals. Overall, the recognition result of IBC-GMM is better than that of EM-GMM. Under the same feature vector, in other methods, the number of LR errors is up to 21 signals, while the number of BP errors is 7. However, under rock-fracturing signals identification, BP identification errors are 5 at least and LR identification errors are 10 at most. Under the condition of identifying blast vibration signals, BP methods have 5 errors at least, and Bays has 12 errors at most. It can be seen that the recognition effect of GMM and BP is better than that of RF, LR, and Bayes, while that of IBC-GMM is better than that of EM-GMM and BP. In summary, the classification of rock-fracturing signals based on ST and ML based on IBC-GMM pattern recognition has a good effect, with an accuracy of 94.0%. Compared with other algorithms, the microseismic types can be identified well, the error rate is low and the antinoise performance is good.

It can be seen from Table 1 that, based on the feature vector after LLE dimensionality reduction, the classification results of each algorithm are higher than 79.0% on the whole, indicating that LLE dimensionality reduction vector is reasonable as the feature vector of microseismic recognition. The recognition effect of GMM, BP, RF, and Bayes is 79.0% higher than that of LR, and they are 88.0%, 90.0%, 86.0%, and 81.0%, respectively. RF method is similar to Bayes method in rock-fracturing identification, and EM-GMM is similar to RF in rock-fracturing identification. However, the method proposed in this paper (IBC-GMM) has a higher comprehensive identification accuracy than any other method.

According to Figure 24, *NPV* (96%) of IBC-GMM are highest in the overall recognition effect. It is slightly better than that of BP and EM-GMM, which are both 90%. RF neural network has the lowest *NPV* (80%). *PPV* shows the same pattern. *PPV* of IBC-GMM is 92%, which is best in all methods. We ranked six classifiers according to the degree to which they account for *F-measur*e. From top to bottom, they were IBC-GMM, BP, EM-GMM, RF, Bayes, and LR. In terms of NPV, PPV, and F-measure, IBC-GMM is comparable to other methods, and IBC-GMM has the best classification performance and the classification index of LR is the worst.

All six models’ ROC curves and AUC were given in Figure 25. We plotted the mean ROC with different color and bold lines to show overall performance for modes (the grey dash line reflects the ROC for random guessing) in Figure 25(a). From the maximum AUC to the minimum, the classifiers were ranked as IBC-GMM, BP, EM-GMM, RF, Bayes, and LR. A histogram reflecting this relationship was drawn in Figure 25(b). MCC of discriminators are also shown in Figure 25(b). It shows that, for the six discriminators, the best results are obtained for IBC-GMM, with values equal to 0.88, 0.80, 0.76, 0.72, 0.62, and 0.58, respectively.

In this study, the applicability and performance of IBC-GMM, BP, EM-GMM, RF, Bayes, and LR for discriminating blast vibration signals and rock-fracturing signals were investigated. Accuracy, *PPV*, *NPV*, *MCC*, *F-measure*, ROC curve, and AUC value have demonstrated that IBC-GMM has a reasonably good discriminating performance, which well reflects the stability and superiority of the proposed method in the classification of microseismic signals. It shows that the improved GMM based on ST and ML is feasible and has high accuracy. Therefore, the method is effective in identifying the rock-fracturing signals and blast vibration signals. And it is still applicable to the discrimination and analysis of other vibration signals in other fields, but due to the complexity of the field environment, it is better to reanalyze the time frequency and summarize the characteristic rules of new vibration signals.

#### 6. Conclusion

In this paper, ST provided an effective time-frequency analysis method, and the characteristics of the coalfield microseismic signals were effectively acquired. The microseismic signal amplitude, frequency, duration, maximum specific frequency band energy ratio, energy entropy, the corner frequency and peak frequency, bandwidth, and the fractal dimension of nine time-frequency characteristics of rock-fracturing signals and blast vibration signals were, respectively, analyzed, which showed a larger difference. Blast vibration signals have short duration, high frequency, and frequency spectrum law which are relatively complex, while rock-fracturing signals are relatively slow with low frequency and spectrum change which are relatively stable. Due to the high-dimension duration of the feature vectors, the constructed learning algorithm is complex, easy to converge to the local optimal solution, and difficult to find the “optimal” interface. After LLE dimensionality reduction of the high-dimensional eigenvectors, it can not only simplify the characteristics of the microseismic signals but also improve the resolution of the features, making the processing of the microseismic nonstationary signals more effective.

Compared with EM-GMM, BP, RF, LR, and Bayes methods, IBC-GMM classification is better than other methods, and the classification and recognition accuracy reached 94.0%. It shows that IBC-GMM based on ST and ML to extract microseismic characteristics is feasible to identify rock-fracturing signals and blast vibration signals and has a high accuracy. The model is applied to the identification of mining microseismic events in coal mine, and the recognition results are consistent with the reality.

#### Data Availability

The detailed data of MS monitoring system are available in the concluding report of the MS monitoring service for the coal mine outburst project implemented at Xiashinjie coal mine in Tongchuan which is completed by Dalian University of Technology and Mechsoft (Dalian) Co., Ltd.

#### Conflicts of Interest

The authors declare no conflicts of interest.

#### Acknowledgments

This research was supported by the National Natural Science Foundation of China (no. 4194-1018), the National Natural Science Foundation of China (Grants nos. 51774064 and 51974055), the Special-Funded Program on National Key Scientific Instruments and Equipment Development (no. 51627804), and the National Key Research and Development Plan of China (Grant no. 2017YFC1503103).