#### Abstract

Seismic events are phenomena which commonly occur in the mining industry. Due to their dangerous character, such information as the energy of the potential event, the location of hazardous regions with higher seismic activity is considered valuable. However, the acquisition of this information is almost impossible without the ability to detect the onset time of the seismic event. The main objectives of algorithms in finding P-wave are high accuracy, reasonable time of operation, and automatic detection of wave arrival. In this paper, an innovative method which incorporates principal component analysis (PCA) with time-frequency representation of the signal is proposed. Due to the significant difference between the spectra of recorded seismic wave and pure noise which precedes the event, time-frequency representation allows for better accuracy of signal change detection. However, with an additional domain, the complexity rises. Thus, the incorporation of PCA (which is known for high efficiency in lowering data dimensions while maintaining original information) seems to be recommended. In order to show the feasibility of the method, it will be tested on real data originating from monitoring system used in underground mine.

#### 1. Introduction

Analysis of mining-induced seismic events, prediction of such events, their evaluation in sense of mechanism, location, and amplitude (energy) identification are crucial in the underground mining industry. One of the basic issues is the problem of P-wave indication. P-wave detection is essential in calculating the energy of the event, description of the event, focal mechanism [1], and so on. One might say that its calculation is fundamental in the evaluation of seismic hazard. The seismic hazard in Poland is relatively high in underground mines, and there are many studies on that topic [2–5]. However, this paper is focused on the P-wave detection techniques only.

The problem of P-wave detection might be solved in many ways. The time of seismic wave arrival can be acquired manually by specialists from seismic station in the mine or automatically by previously developed algorithms. Obviously, these two approaches differ in terms of time consumption and accuracy, as human work is slower but more precise. However, as the specialists may face hundreds of events daily, the utilization of algorithms seems indispensable. The preliminary usage of some algorithms followed by a manual improvement became a common habit in seismic stations.

In the era of Big Data, it seems to be more reasonable to have an automatic, quick, and robust detection algorithm. Indeed, there are plenty of such algorithms developed by various authors. However, their accuracy appears very similar to the manual result. The problem of P-wave detection is well known in scientific literature and in engineering practice. The STA/LTA algorithm [6] is probably the most broadly used algorithm due to its simpleness, fast processing, and the possibility of working online. Its main idea inspects the quotient of previously assumed characteristic function considered in short-term (STA) and long-term (LTA) windows. Other classical methods include multiscale wavelet analysis [7], usage of neural network [8], and autoregressive-based techniques [9, 10] with usage of the AIC criterion [11]. In the literature, one can find also the acoustic emission- (AE-) based analysis [12–14]. Advanced analysis of the spectrogram allows detecting the decay of AE signal frequencies revealing the transition from microcracking to the macrocracking regime during laboratory fracture tests. However, in our application, it was not considered.

The automation of P-wave detection has been highlighted in many publications [7, 15–21]. The comparison of the classical methods can be found in [15, 22, 23]. In recent years, a couple of new algorithms were developed, e.g., [18–21, 24–29]. Further information about their accuracy and computational time can be found in [15, 23].

Due to the importance of onset time acquisition, the problem is still analyzed. Surprisingly, a couple of new solutions have been published in 2018 [19–21, 24–26]. It shows that there is still space for new solutions.

From the signal processing point of view, the problem of P-wave arrival can be considered, e.g., as a simplified segmentation issue, finding the time of regime switching, or rapid change of the value of some particular statistic. Thus, in order to acquire onset time, it seems to be reasonable to utilize tools used to solve the isometric problems. The exemplary algorithms solving these parallel issues have been included in the following articles [30–40]. As one might see, the segmentation problem is a fundamental task in the signal processing, and it might be applied to various types of signals.

In this paper, we propose to use a combination of two well-known techniques in signal processing and in data analysis. The seismic signal due to its impulsive nature is a nonstationary process. For that kind of signals, the most suitable way of analysis and processing is time-frequency representation. The short-time Fourier transform (STFT) is used here as the simplest technique for the time-frequency transformation. It is possible to use more precise representations (Wigner–Ville, scalogram, or ARgram) but more complex, and anyway, the core of the procedure will be the same. The benefits of time-frequency representation are the possibilities to observe the behavior of the signal in the time (as in raw signal) and also in the frequency domain simultaneously. It brings additional opportunity for further signal processing and interpretation. The STFT time-frequency representation could be considered as a set of signals (we call them subsignals) for each frequency bin which is easier to process due to better signal-to-noise ratio. In the case of very noisy signals, there is a possibility to use preprocessing techniques such as denoising [11] or mentioned segmentation, especially if the signal covers multiple events [31, 41].

Obviously, the idea of utilizing time-frequency domain in finding the P-wave arrival is not a new idea, e.g., [29, 32, 42]. The usage of short-time Fourier transform has the ability to present the frequency structure changing over time. As the spectrum of a pure noise shall be flat and in terms of energy greatly differs from a spectrum of a signal containing registered seismic wave, the onset time detection shall be simple. However, time-frequency representation is a two-dimensional map that from the mathematical point of view is just a matrix of data, usually in large sizes (depending on STFT parameters and length of the signal). Due to the high dimension of the time-frequency map, to avoid computational complexity during the detection phase, some data dimensionality reduction is recommended. Again, one of the most popular ways to reduce the dimensionality of high-dimensional data is principal component analysis (PCA). The PCA algorithm seems to be a natural selection due to its high efficiency in lowering data dimensions while maintaining original information [43]. As a potential future development, one might consider other techniques of data dimension reduction, but at this stage, the combination of simplest techniques has been used. The PCA could be seen as weighted linear combination of input data that seem to be reasonable and intuitive for further analysis. It should be mentioned that the combination of a time-frequency approach and the PCA is not a novel idea. It was successfully applied, for instance, in mining machine condition monitoring [44]. According to our knowledge, there is no example of such procedure applied for a seismic signal, so we claim it is original. To summarize this section, we can say that we apply STFT time-frequency analysis to a raw seismic signal (single channel); for high-dimensional data matrix (set of subsignals for each frequency bin ), we use the PCA to extract informative part of the map only; and for selected “subsignals,” we apply the P-wave detection technique based on a simple rule: find the maximum value on the differentiation function (with different window size) of principal component 1 (PC1). The method is very intuitive and provides automatically a result that could be compared to the decision of expert (and it is closer to expert’s opinion than the well-known LTA/STA method).

#### 2. Methodology

In this section, the detailed methodology of the proposed procedure has been described. A general outline of the algorithm is presented in Figure 1.

In the first step, the raw data are transformed into a time-frequency representation. A critical issue while using a spectrogram is establishing parameters of STFT. Due to Heisenberg’s uncertainty principle, the selection of window size and spectral resolution should be a compromise. It was found experimentally for these specific signals (will be discussed later).

After that, the spectrogram matrix is provided to the PCA algorithm to reduce the problem of high dimensionality. The PCA is applied to the spectrogram matrix with respect to the frequency dimension (principal components of time vectors for each frequency bin are derived). Values of the spectrogram matrix are describing the energy of frequency components with respect to time, so principal components are also informative in the energy context. Since the first principal component is the most informative one, it can be used as a time series indicator of the energy variability.

Finally, the differentiation function of the first principal component has been calculated, and its maximum value, obtained for the optimal order , is used as the P-wave arrival time indicator. One may expect that the simple derivatives (the differentiation function with order 1) could be calculated as the good enough to discover the sudden changes in PC1. Nevertheless, a more general approach for detection has been proposed to deal with the signal with very high frequency, where the simple derivative can fail and not always can give a unique solution.

For validation, the P-wave arrival time has been compared with the classical LTA-STA method and time estimated manually by an expert. The main steps of the algorithm will be described in the following subsections.

##### 2.1. Time-Frequency Representation

The short-time Fourier transform of a discrete-time signal with window in time point *t* and frequency point *f* is defined as follows [45]:where *M* is the length of (always an odd number), is the real sampling frequency of signal and *j* is the imaginary unit.

As a window function, the Hamming window is used [45]:

The spectrogram is a time-frequency map described as , where and , where *c* is the minimal desired difference between frequencies included in the spectrogram.

##### 2.2. Principal Component Analysis

Principal component analysis is widely known in statistical analysis [43]. It assumes that dataset consisting of *N* observations, each spanned over *K* variables, can be interpreted as a point cloud in *K*-dimensional space. The goal of PCA is a rotation of local coordinate system towards variance maximization over a new set of dimensions such that the first dimension is characterized with the greatest variance, the second dimension with second greatest variance, and so on. Such a transformed system consists of new values of data over a new set of dimensions. Vectors of data over the new coordinate system are called principal components. Newly created feature space describes the original dataset mostly within several first principal components that carry most of the original information. Since for many types of analyses, the information contained in a small number of first principal components is sufficient, because their information content is very high, PCA is regarded to be a dimensionality reduction method.

Given *n* observations of *m*-dimensional data stacked into a matrix , the principal components can be calculated using singular value decomposition (SVD) as follows:where and are unitary matrices and contains the nonnegative real singular values of nonincreasing magnitude (). Principal components are the orthonormal column vectors of the matrix **V**, and the variance of the *i*-th component is equal to .

##### 2.3. Detection Rule

After the PCA transformation, we use the first PC as the most informative component. We assume that due to the arrival of P-wave, a significant, rapid change in variance will appear. So, we propose to calculate the differentiation function with order from PC1, and a location of the maximum value will indicate sudden changes in PC1, so exactly the P-wave arrival time. However, in case of a high-frequency sampling, the first derivative not always gives a unique solution. In order to deal with that, we can calculate the ordinary differentiation for the given signal with bigger time interspace between given samples, what we called order (), namely,where *t* is the number of the observation, . Using this kind of approach, increasing the window, we allow to obtain the bigger peak in place of growth, and the detection can be much easier, but at the same time, we can lose some precision in time. It may happen that the maximum is not unique, and there are many local maximum values. However, application of different windows gives the possibility to localize the sudden changes more precisely. In our case, we select the differentiation order as small as possible, but such that it highlights the sudden change.

##### 2.4. STA/LTA as a Verification Method

STA/LTA method is often used in the problem of P-wave arrival time detection [28]. The core idea is the evaluation of the characteristic function (CF) of the seismic signal using two sliding windows: one shorter and one longer. As a CF, various functions can be used, e.g., energy, absolute amplitude, and envelope. Regardless of the type of CF, the short-time window (STA) is expected to represent the instantaneous amplitude of the seismic signal, while the long-time window (LTA) tracks the amplitude of seismic noise. When their ratio exceeds certain predefined value (the so-called activation threshold), the following portion of the signal is identified as the event until the described ratio decreases below a certain value (deactivation threshold). Inspection of such ratio called SLR (short-to-long ratio) can allow to identify the P-wave arrival time as the first time where SLR exceeded the value of .

#### 3. Real Data Analysis

##### 3.1. Seismic Data

In order to show the potential of the proposed method, it will be tested on real data originating from the ELOGOR-C seismic monitoring system, which is used for rock mass observation in the underground copper ore mine O/ZG “Rudna”. The system consists of 2 sets of differently located seismometers. Each of those seismometers collects velocity or acceleration with a sampling frequency equal to 500 Hz in the frequency band 0.5–150 Hz. Such a band is sufficient for locating and estimating seismic energy and indicating the focal mechanism by analyzing the first direction of motion, which is the primary goal of the monitoring system. The particular signal was registered in August 2015 (Figure 2).

##### 3.2. Results: Spectrogram

First, the spectrogram of the input signal (Figure 3) has been calculated with the parameters presented in Table 1.

Practically, it means that frequency map, presented graphically in Figure 3, is a matrix size of *M**N* ().

So, we have *N* subsignals, and for each subsignal, P-wave arrival time should be estimated. Instead of doing this, we will subject this matrix for dimensionality reduction via PCA. Instead of *M*-sample-long *N* signals, we expect to receive *M*-sample-long several PCs that will maximize information spread along frequencies.

##### 3.3. Results: PCA

After performed PCA on the spectrogram matrix, we receive *N* vectors, but only some of them should contain the information. As one can see in Figure 4, the distribution of information is uneven. Most of the information is focused on PC1. It means that the spectrogram is redundant, and there is no need to detect P-wave arrival time for each subsignal of the spectrogram; there is only need to analyse the first PCs which handle the most significant information.

In the further analysis, we focus on the first three principal components, presented in Figure 5. As one can see, only the first component carries useful information about energy variability. Hence, it is selected for further analysis.

##### 3.4. Results: Detection

In the next step, the differentiation function for orders has been calculated for PC1, and its maximum value locations have been investigated (Figure 6).

As one can see, the maximum values of for each order indicate the same place of rapid change in values of PC1, therefore, we take the location obtained from the differentiation with order , which maximizes the ability to precise detection and location. In Figure 7, the distances between the points indicated by the proposed method and indicated by the expert for each differentiation order have been presented. Then, the detection can be much easier. However, we lose some precision in time. Consequently, the result can be shifted by several samples (depending on the size of the order ). Finally, according to the description in Section 2, the optimal value for detection and location has been considered as the time of the P-wave arrival (Figures 6 and 8). In our case, we take the optimal value of differentiation order as , which gives the easy detection (the maximum value has a bigger amplitude easier to detect automatically) without loosing precision in the localization. The zooming of Figure 8 with the arrival time matched has been presented in Figure 9. As one might see, indeed, it pointed out the beginning of seismic impulse.

##### 3.5. Results: Verification

As presented in Figure 9, the proposed method returned the timestamp of 1.526 seconds after the beginning of the signal. In comparison, the STA/LTA algorithm, regarded as the classical method, returned the timestamp of 1.534 seconds, and an expert from the mine indicated the point 1.524 seconds from the beginning (Figure 9). One might conclude that all techniques provided similar results. Indeed, all arrival times are similar; however, LTA-STA and the technique proposed in the paper are automatic (expert provided result manually based on experience), and what is more, our result is closer to the expert’s decision, so we might state that results are better than the classical LTA-STA-based method.

It is important to notice that the presented method has a few limitations. Due to properties of the spectrogram, there is a reduction of precision in comparison with methods that operate purely in time domain. Considering a comparison of our method with the classical STA/LTA approach, one can see that the proposed technique provides better result. The benefit is related to better signal-to-noise ratio for the narrow band signal (for the given frequency band).

#### 4. Conclusion

In this paper, the concept of a new method for determining P-wave onset time is described. The method uses a spectrogram altogether with principal component analysis. The spectrogram gives the visible difference between the frequency spectrum of registered seismic wave and pure seismic noise which precedes this wave. The resulting time-frequency map is used as a multidimensional input for PCA, which extracts the information required to identify the P-wave arrival time. The usefulness of the method was proved on real seismic data originating from the underground mine. Obtained results have been compared with the STA/LTA method as well as with the arrival time indicated by the mine expert. The comparison shows that the described method shows greater accuracy than the STA/LTA approach by indicating the point diverging from the point indicated by the expert only by 0.002 s, while STA/LTA shows point distant by 0.01 s.

#### Data Availability

The data used to support the findings of this study have not been made available because of NDA statements.

#### Disclosure

Our work was prepared as part of the employment of the authors at the Wroclaw University of Science and Technology.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.