#### Abstract

Accurately and efficiently detecting the arrival time of seismic waves is a persistent difficulty in the automatic location of microseismic sources. A U-net model to detect the arrival time of seismic waves is constructed based on convolutional neural network (CNN) theory. The arrival times of 1555 segments of original data and 7764 segments of synthetic data were picked using Akaike’s information criterion (AIC) algorithm, the time window energy eigenvalue algorithm, and the U-net model. During uniaxial compression of a test block, acoustic emission equipment is used to collect the vibration waves generated by the rupture of the block. Source images are drawn using the Origin software, the arrival time errors are counted, and the advantages and disadvantages of the three arrival time methods are discussed. The source images are then compared with images of the actual fractures. When the U-net model is used, the source imaging map is highly similar to the physical fracture trajectory. Thus, it is feasible to use the U-net model to detect the arrival time of seismic waves. For high signal-to-noise ratios, its accuracy is greater than that of the time window energy eigenvalue algorithm but lower than that of the AIC algorithm. When the signal-to-noise ratio is reduced, the U-net model detects the arrival time with greater stability and accuracy than the other two algorithms.

#### 1. Introduction

Microseismic monitoring technology is an important means of providing early warning of rock mass disasters and thereby improving safety and stability. It is widely used for safety monitoring of coal mine rockbursts, coal and gas outbursts, high and steep slopes in hydropower projects, and the surrounding rock during tunnel construction. The detection of seismic wave arrivals directly affects the accuracy and reliability of automatically locating microseismic activity. Traditional manual arrival detection has systematic errors and cannot be applied to massive numbers of microseismic events, while automated detection is affected by the critical value, time window length, limit amplitude, and other parameters. Therefore, finding a stable, fast, and precise intelligent seismic wave arrival detection method is a major problem in the field of microseismic monitoring.

To date, automatic detection algorithms for seismic wave arrival in microseismic monitoring have mainly been borrowed from natural earthquake data processing. In 1978, Allen [1] found that there was a significant difference in the energies of seismic waves and noise and proposed the classical short-time window energy average/long-time window energy average (STA/LTA) method and the concept of the characteristic function. In 1985, Maeda [2] improved the AIC pickup time method and proposed the Maeda AIC method, which calculates the AIC directly without calculating the order of the AR model; thus, the AIC could be used for seismic detection with improved operational efficiency. In 2017, Wang [3] proposed the fast AIC method to address the slow operation of the AIC when processing a large amount of microearthquake data. In 2018, Zhu et al. [4] adopted a multistep detection method. In the first step, the STA/LTA was used to determine the approximate position, and in the second step, the AIC was used for accurate detection. This improved AIC method solved the problem that the AIC is greatly affected by the signal-to-noise ratio when processing hydraulic fracturing signals. In 2020, Li Hongli [5] compared the STA/LTA, polarization, and AIC methods and explained their characteristics in detail. In recent years, with the rapid development of microseismic monitoring technologies and the explosive growth of microseismic data, the limitations of automatic detection algorithms that rely too heavily on parameters have been exposed. For example, if the background noise of some seismic waves is relatively large, it is necessary to set a larger critical value. If there is a large difference between the peak values of some seismic waves, it is necessary to set a larger time window length. Thus, the stability of automatic seismic detection is insufficient. A given method is only applicable to a specific signal or to signal data acquired under the same geological conditions and cannot adapt when the signal characteristics change. The detection accuracy is then reduced, which results in deviations in the microseismic positioning.

The rapid development of machine learning has brought advantages for processing big data. Many earthquake scholars have applied machine learning to seismic phase recognition and arrival time picking. At the beginning of 2019, Jiang and Ning [6] applied the support vector machine to identify and pick seismic body wave arrival times. In the same year, Zhao et al. [7, 8] used a CNN model to classify the seismic phases of Wenchuan aftershocks and applied the U-shaped CNN (U-net) to pick earthquake arrival times. In 2020, Wang et al. [9] used a CNN model for microseismic event monitoring. At the end of the same year, Gao et al. [10] used random forests to pick the first arrival of microseismic waves. In 2021, David et al. [11] used several U-nets to pick the first arrival of earthquakes. Picking the first arrival of vibration waves can be transformed into a classification problem for CNNs. U-net is well established in biomedical image segmentation [12], has significant advantages in pixel-level image classification problems, and meets the accuracy requirements of detecting the arrival time of vibration waves.

In this study, U-net is used to determine the first break of a vibration wave. The classical U-net model is adjusted, and the existing 7770 segments of coal and rock fracture vibration wave data are used as inputs for training and testing. The parameters are adjusted to achieve the optimal effect of the network. The loss function curve reflects the difference between all predicted points of a sample and the labelled points, rather than the difference between the single predicted first-arrival point and the single labelled first-arrival point. Therefore, in order to count the error between the predicted and labelled first arrivals, a U-net first-arrival pickup program was written in Python, which uses the saved U-net training parameters to pick up the microseismic wave arrival times. For comparison with other algorithms, a program written in C# containing the AIC algorithm and the time window energy eigenvalue algorithm [13] was also used to pick up the microseismic arrival times. The 1555 segments of the validation set were fed into each of the two programs to pick up the microseismic arrival times and compare the accuracy. To test the generalisation capability of the U-net model, 7764 segments of training data were mixed with white noise and input into the two programs to test the noise immunity of the three methods. Then, uniaxial compression tests with acoustic emission monitoring of cement mortar block fractures were performed. The arrival time of vibration waves in the experimental data was determined using the U-net model, the AIC algorithm, and the time window method, and the similarities between the source imaging and physical trajectory images were compared. The stability of the U-net model was thereby assessed in an experimental environment.

#### 2. U-Net Theory and Process

The U-net is a kind of fully convolutional network (FCN) that can classify at the pixel level and process data of different sizes. The overall flow of picking up vibration wave first arrivals with the U-net is shown in Figure 1.

##### 2.1. U-Net Adjustment

The U-net model used here differs from the classical model in three respects: the data processing, the labelling of the first arrival of the vibration wave, and the layer sizes.

###### 2.1.1. Data Processing

The sample data in this study are vibration waves generated in uniaxial compression fracture experiments on coal and cement mortar test blocks and collected using acoustic emission instrumentation. Single-channel waveforms were extracted through event detection, and the arrival times were marked manually. The data set includes 1242 coal rock fracture vibration waves and 6528 cement mortar test block fracture vibration waves, giving a total of 7770 samples.

Details of the data used for network training are as follows: a segment of 121.24 s with a sampling rate of 3 MHz for the cement mortar test block waveform data, and a segment of 546.57 s with a sampling rate of 2.5 MHz for the continuous coal rock rupture waveform data. The AE software recorded the approximate trigger position of each seismic source, and using these trigger positions, the continuous waveform was decomposed into 6 channels of single-event waveforms and exported in txt format. The number of lines in each exported txt file varied with the duration of the event. To meet the input requirements of the U-net model, the data were cropped or zero-padded: for data with fewer than 1024 sampling points, zeros were appended to the end to bring the length to 1024, and for data with more than 1024 sampling points, the points after the 1024th were deleted. The length 1024 was chosen because the U-net requires five pooling operations, each halving the data length, so the sample length must be divisible by 2^{5} = 32, and because the average length of the exported events is around 1000 samples. Since this study is concerned with the arrival time of the vibration wave, cropping only destroys the integrity of the event and does not affect the pickup. The data were preprocessed by removing the mean and normalising.
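The cropping, zero-padding, and preprocessing steps can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `prepare` is hypothetical, the normalisation is assumed to be division by the peak absolute amplitude, and the mean is assumed to be removed before padding so that the appended zeros stay at zero. The text does not specify either choice.

```python
import numpy as np

TARGET_LEN = 1024  # segment length stated in the text


def prepare(x, n=TARGET_LEN):
    """Crop or zero-pad a single-event waveform to n samples.

    Assumptions (not stated in the paper): the mean is removed before
    padding, and normalisation divides by the peak absolute amplitude.
    """
    x = np.asarray(x, dtype=np.float64)[:n]   # delete samples after the n-th
    if x.size == 0:
        return np.zeros(n)
    x = x - x.mean()                          # remove the mean
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak                          # assumed peak normalisation
    return np.pad(x, (0, n - x.size))         # append zeros up to n samples


short_event = prepare(np.arange(10.0))        # padded from 10 to 1024 samples
long_event = prepare(np.ones(3000))           # cropped from 3000 to 1024 samples
print(short_event.shape, long_event.shape)
```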

The U-net model is generally used to process image data. It can be applied to seismic waveforms either by reducing the dimension of each layer in the U-net structure [7] or by increasing the dimensions of the seismic wave data. The latter approach was taken here: newaxis indexing in TensorFlow was used to change the data into a tensor of 1024 rows, 1 column, and 1 channel to meet the format requirements of the network input data.
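As a small illustration of this dimension increase, the sketch below uses NumPy's `np.newaxis`; TensorFlow exposes the same indexing constant as `tf.newaxis`. The function name `to_network_input` and the leading batch axis are assumptions made for the example.

```python
import numpy as np


def to_network_input(batch):
    """(n_segments, 1024) -> (n_segments, 1024, 1, 1): rows, 1 column, 1 channel."""
    batch = np.asarray(batch, dtype=np.float32)
    return batch[:, :, np.newaxis, np.newaxis]


x = to_network_input(np.zeros((32, 1024)))  # one batch of 32 segments
print(x.shape)  # (32, 1024, 1, 1)
```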

###### 2.1.2. Label First Arrival of Vibration Wave

The problem of determining the first break of the vibration waves can be transformed into a binary classification problem for CNNs, in which the label is placed at a single time point. However, the microseismic signal accounts for a small proportion of the entire waveform, and the arrival time is only one point in the microseismic signal. Such labelling leads to an extremely uneven distribution of positive and negative samples, and the training effect does not meet the requirements. Grouping is one way to address an uneven distribution of positive and negative samples [9], but the shock wave sequence is continuous and neighbouring points are related. If each sample group is too short, sequence information is lost, which makes the result unreliable and impractical. If each sample group is too long, the accuracy requirements cannot be met. Another method is therefore needed for arrival time detection.

U-net performs well in image segmentation. It performs pixel-level classification and can meet the requirements of first-break timing accuracy and practicality. When making labels for U-net, the imbalance between positive and negative samples is handled by assigning all points before the arrival time to one class and all points after the arrival time to a second class. The output characteristic data produced from the input microseismic data are then compared with the labels. The specific operations are shown in Figure 2.
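The labelling scheme can be sketched as a step-shaped mask. The function name `make_label` is hypothetical, and two details are assumptions: points after the arrival are given the value 1 (with points before it 0), and the arrival sample itself is counted in the second class.

```python
import numpy as np


def make_label(arrival_index, n=1024):
    """Step label: 0 before the manually marked arrival, 1 from the arrival on.

    Which class is coded 1, and whether the arrival sample belongs to it,
    are assumptions of this sketch.
    """
    label = np.zeros(n, dtype=np.float32)
    label[arrival_index:] = 1.0
    return label


label = make_label(300)
print(label.sum())  # 724.0 points in the "after arrival" class
```

With an arrival at sample 300, 724 of the 1024 points fall in the second class, whereas a single-point label would leave only 1 positive point in 1024.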

###### 2.1.3. Layer Size

In this study, the sizes of the convolutional, pooling, and upsampling layers have been adjusted. The size of the convolutional kernel is changed from 3 × 3 to 3 × 1, the pooling kernel is changed from 2 × 2 to 2 × 1, and the upsampling is changed to 2 × 1 corresponding to the pooling layer. The reason for this is that the data input to the network is one-dimensional, just the amplitude of the waveform, rather than a two-dimensional picture consisting of time and amplitude. For 1D data, the use of 3 × 1 convolution is an unavoidable choice; it would not be possible to convolve 1024 × 1 data using a 3 × 3 square kernel. For 2D data, 3 × 1 convolution loses edge information compared to 3 × 3 convolutions, degrading the performance of the network and generally requiring upsampling to adjust the recovery performance. 3 × 1 convolution can save 33% of the number of parameters and computations when used properly, resulting in a two-fold increase in speed with minimal performance degradation [14].

As shown in Figure 3, the U-net model is composed of five parts: the input layer, the lower sampling layer, the upper sampling layer, the fusion layer with its skip connection, and the output layer. The lower sampling layer includes the convolutional and pooling layers, some of which are followed by a discard layer. The U-net model first extracts the characteristics of the data step by step through downsampling and then gradually restores the data to its original size through upsampling. The pixels are classified one by one, and the fusion layer and skip connection prevent overfitting.

The shock wave data first enter convolutional layer 1 from the input layer. The convolutional layer uses all-zero filling, the number of convolutional cores is set to 64, and the ReLU activation function is used. The data then pass through a second convolution, so that the characteristics are extracted twice, and the results are sent to pooling layer 1. Maximum pooling is used to reduce the data length by half. Discard layers 4 and 5 are added after convolutional layers 4 and 5, and 50% of the neurons are discarded to prevent overfitting. The lower sampling layer repeats this process five times (the final time does not need pooling) and continually extracts data features. The convolutional operation is as follows:

$$x_j^{l} = f\left(\sum_{i} p\left(x_i^{l-1}\right) * w_{ij}^{l} + b_j^{l}\right), \tag{1}$$

where $x_j^{l}$ represents the output of the $j$-th neuron in layer $l$, $x_i^{l-1}$ is the output of the $i$-th neuron in layer $l-1$, $w_{ij}^{l}$ is the weight, $b_j^{l}$ is the offset term, $p(\cdot)$ is the all-zero filling function, $*$ is the convolution operation, and $f(\cdot)$ is the activation function.
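One lower sampling step (zero-filled convolution, ReLU, then max pooling) can be sketched in plain NumPy to make the shapes concrete. This is an illustrative re-implementation rather than the TensorFlow layers used in the study. The names `conv1d_same` and `max_pool2` are hypothetical, the kernel length is assumed to be odd, and, as in most deep learning frameworks, the "convolution" is computed as a cross-correlation.

```python
import numpy as np


def conv1d_same(x, w, b):
    """Zero-padded 1D convolution followed by ReLU.

    x: (length, c_in), w: (k, c_in, c_out) with odd k, b: (c_out,).
    The output keeps the input length because of the all-zero filling.
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))           # all-zero filling
    out = np.empty((x.shape[0], w.shape[2]))
    for t in range(x.shape[0]):
        window = xp[t:t + k]                       # (k, c_in)
        out[t] = np.tensordot(window, w, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)                    # ReLU activation


def max_pool2(x):
    """Max pooling of size 2 and stride 2 along the length axis."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1]).max(axis=1)


x = np.ones((1024, 1))                             # one waveform, one channel
w = np.ones((3, 1, 64))                            # 3 x 1 kernel, 64 cores
features = conv1d_same(x, w, np.zeros(64))
pooled = max_pool2(features)
print(features.shape, pooled.shape)                # (1024, 64) (512, 64)
```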

The data then enter upper sampling layer 6 from discard layer 5. The function of the upper sampling layer is opposite to that of the pooling layer: it doubles the length of the data, halves the number of convolutional cores, and gradually restores the original appearance of the waveform. The interlayer residual jump operation [15] is used to prevent overfitting. The output of discard layer 4, the lower sampling layer corresponding to upper sampling layer 6, is superimposed as a matrix with the output of upper sampling layer 6 in fusion layer 6 and is then passed through convolutional layer 6. The specific operation is shown in Figure 4 and equation (2). The upper sampling layer repeats this process four times, mirroring the pooling layers, and restores the data to its original length.

$$X^{l} = F\left(X^{k} \oplus X^{l-1}\right), \tag{2}$$

where $X^{l-1}$ is the output of all neurons in layer $l-1$ (the upper sampling layer), $X^{k}$ is the output of all neurons in layer $k$ (the corresponding lower sampling layer), $\oplus$ is matrix superposition, $F(\cdot)$ is the fusion function, and $X^{l}$ is the output of all neurons in layer $l$.

Finally, the data enter the convolutional layer 9, which gradually reduces the number of convolutional cores. They then enter the output layer, which passes through a 1 × 1 convolutional kernel, changing the depth of the characteristic map. Thus, the size of the output data is the same as that of the input data. The output layer uses the sigmoid activation function to produce the predicted shock wave arrival time characteristic map.

The binary cross-entropy function, which pairs with the sigmoid activation function, is used to measure the difference between the predicted values and the labels. When the binary cross-entropy function is differentiated for gradient descent, the gradient of the loss with respect to the weights of the final layer no longer depends on the derivative of the activation function and is proportional only to the difference between the output and the label values. Using this function can therefore accelerate convergence and the rate at which the weights are updated. The binary cross-entropy function is calculated as follows:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + \left(1 - y_i\right)\log\left(1 - \hat{y}_i\right)\right], \tag{3}$$

where $y_i$ is the label and $\hat{y}_i$ is the predicted value.
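The gradient property described above can be checked numerically. The sketch below implements the standard mean binary cross-entropy and its analytic gradient with respect to the pre-sigmoid output, which reduces to (prediction − label)/N without any sigmoid-derivative factor. The function names are hypothetical, and the small clipping constant `eps` is an implementation detail added here for numerical safety.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce(y, p, eps=1e-7):
    """Mean binary cross-entropy between labels y and predictions p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def bce_grad_wrt_logit(y, z):
    """d(mean BCE)/dz for p = sigmoid(z); equals (p - y) / N."""
    return (sigmoid(z) - y) / y.size


y = np.array([0.0, 0.0, 1.0, 1.0])      # labels
z = np.array([-2.0, 0.5, 0.3, 3.0])     # pre-sigmoid network outputs
print(bce(y, sigmoid(z)))
print(bce_grad_wrt_logit(y, z))
```

The gradient is largest where the prediction is furthest from the label, so badly misclassified points are corrected fastest.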

##### 2.2. Training Process

A fixed random seed was set for the pseudorandom number generator, and the 7770 segments of vibration wave sample data were shuffled, which ensured the reproducibility of the data split. The first 6215 segments were selected as the training set, and the final 1555 segments were selected as the verification set. The data were input into the neural network in batches of 32 segments, and there were 40 training rounds. The Adam optimizer was used with a learning rate of 10^{−4}. The accuracy on the verification set was determined after each round, and the loss function and accuracy curves are shown in Figure 5.
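A reproducible shuffle and split of this kind can be sketched as follows. The seed value and the function name `shuffle_and_split` are hypothetical; only the segment counts (6215 for training, 1555 for verification, 7770 in total) come from the text.

```python
import numpy as np


def shuffle_and_split(n_total=7770, n_train=6215, seed=0):
    """Return shuffled index arrays for the training and verification sets.

    The seed is an arbitrary placeholder; fixing it makes the split repeatable.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_total)
    return order[:n_train], order[n_train:]


train_idx, val_idx = shuffle_and_split()
print(len(train_idx), len(val_idx))  # 6215 1555
```

With a batch size of 32, the 6215 training segments give 195 batches per round, the last of which is only partly filled.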

This study used PyCharm to implement and adjust the U-net arrival time model, which was trained and tested with TensorFlow. The U-net training ran on a computer with 8 GB of memory and an NVIDIA GPU. Repeated tuning showed that 40 training rounds gave the best result. As seen in Figure 5, owing to the large amount of sample data, the loss function curve of the training set decreased gradually from 0.16 to about 0.04. The loss function curve for the verification set fluctuated between 0.11 and 0.05 in the first 10 rounds and then decreased steadily to 0.05 over rounds 10–22. It then fluctuated between 0.05 and 0.06 over rounds 22–36 and began to rise over rounds 37–40. At this point, the training was terminated.

The accuracy curve for the training set rose gradually from 93.8% to 98.2% in the first 10 rounds. Over rounds 10–40, the curve fluctuated slightly, and the overall accuracy increased to 98.7%. The accuracy of the verification set fluctuated between 96.5% and 98.5% in the first 10 rounds and then increased from 97% to 98.4% over rounds 10–25. It fluctuated around 98.4% over rounds 26–36 and began to fluctuate with a downward trend over rounds 37–40. The loss function and accuracy curves exhibited a negative correlation: when the loss function decreased, the accuracy increased, indicating that the U-net model trained well.

##### 2.3. Comparative Analysis

Generally, a first-break pick is acceptable if its error is within 50 points, and a pick with an error of more than 50 points is considered incorrect. Here, the error is used to express quantitatively the picking accuracy of the various methods, as reflected by the statistics. The calculation formula is

$$E = \left| T_{p} - T_{m} \right|, \tag{4}$$

where $T_{p}$ is the first arrival picked by each method and $T_{m}$ is the manually labelled first arrival. To meet the needs of practical engineering, the signal-to-noise ratio of the original data was calculated, and the original data were filtered and then mixed with noise to test the stability of the three methods. The data collected in this study are from uniaxial compression tests on rock masses in a laboratory where no other noise sources are present, so there are two main types of noise in the collected waveforms: noise generated by the uniaxial compression machine and other noise generated by the laboratory environment.
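The error statistics can be sketched as below, assuming the error is the absolute difference, in sampling points, between a picked and a manually labelled first arrival. The function name and the returned keys are hypothetical; the 50-point tolerance is the one stated in the text. Converting points to microseconds would additionally require the sampling rate of each record.

```python
import numpy as np


def pick_statistics(picked, labelled, tolerance=50):
    """Summarise pick errors (in sampling points) against manual labels."""
    err = np.abs(np.asarray(picked) - np.asarray(labelled))
    return {
        "mean_error_points": float(err.mean()),
        "within_5_points": int(np.sum(err <= 5)),
        "wrong_pick_rate": float(np.mean(err > tolerance)),  # error above 50 points
    }


stats = pick_statistics(picked=[100, 105, 200, 360], labelled=[100, 100, 150, 300])
print(stats)
```

In this toy example the errors are 0, 5, 50, and 60 points, so only the last pick exceeds the tolerance and the wrong pick rate is 0.25.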

The frequency of the acoustic emission signal collected in the test was between 100 kHz and 400 kHz. Based on experience, the environmental noise was between 10 Hz and 100 Hz. A fifth-order band-pass filter from 100 kHz to 400 kHz, built into the AE software, was therefore applied so that the environmental noise could be effectively filtered out. The data were then imported into a C# program that implements a noise addition algorithm based on pseudorandom numbers and amplitude scaling, which reduced the SNR from 15–50 dB to less than 15 dB. Gaussian white noise was added to each channel of the original data, and the effects of the noise addition are shown in Figure 6. The calculation formula is

$$y_i = x_i + n_i, \tag{5}$$

where $y_i$ is the fusion data point, $x_i$ is the original data point, and $n_i$ is the white noise data point.
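A noise addition step of this kind can be sketched in Python. This is not the authors' C# program: the function names are hypothetical, and the SNR is assumed to be defined as ten times the base-10 logarithm of the signal-to-noise energy ratio, which the text does not state explicitly.

```python
import numpy as np


def snr_db(signal, noise):
    """SNR in dB, assumed here to be 10*log10(signal energy / noise energy)."""
    return 10.0 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))


def add_white_noise(x, target_snr_db, rng):
    """Add Gaussian white noise, scaled in amplitude to reach a target SNR."""
    noise = rng.normal(size=x.shape)
    scale = np.sqrt(np.sum(x ** 2) / (np.sum(noise ** 2) * 10 ** (target_snr_db / 10)))
    noise = noise * scale                       # amplitude scaling
    return x + noise, noise                     # fused data point = original + noise


rng = np.random.default_rng(0)                  # placeholder seed
x = np.sin(np.linspace(0.0, 20.0 * np.pi, 1024))
noisy, noise = add_white_noise(x, target_snr_db=8.0, rng=rng)
print(round(snr_db(x, noise), 2))               # 8.0
```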

###### 2.3.1. Statistical Comparison between Raw and Noisy Data

The signal-to-noise ratio was calculated after adding noise, and the AIC, U-net, and time window energy eigenvalue algorithms were used to determine the arrival time. The number of picks in each error range, the proportion in each error range, the error pickup rate, and the average error were counted for the 1555 original waveforms and the 7764 synthetic waveforms. The statistical results are shown in Figure 7 and Table 1.

Figure 7 shows that for the 1555 segments of raw data, there were 1430 segments with an AIC pickup error within 5 points, which is greater than that of the 1228 segments for U-net and 734 segments for the time window method. At the same time, there were 179 segments of data with errors of more than 50 points in the time window method, which accounts for the largest share. This shows that the accuracy of the AIC method is greater than that of the U-net and time window methods. After adding noise, the data volume for U-net with errors within 5 points was 5739 segments, which exceeded the 4893 segments of AIC and the 3183 segments of the time window method, while only 373 segments had errors exceeding 50 points. Therefore, U-net has strong stability when determining arrival time data with low signal-to-noise ratios.

Table 1 shows that when using the original waveform data, the average error of the U-net model is 6.38 *μ*s with an error pickup rate of 3.34%, those for the AIC algorithm are 2.79 *μ*s and 2.19%, and those for the time window method are 12.63 *μ*s and 11.51%, respectively. The timing precision for the U-net model is slightly lower than that for the AIC algorithm but much higher than that for the time window method. This is because, for high signal-to-noise ratios, the AIC algorithm has an excellent ability to distinguish between noise and signal boundary points. The U-net model uses multiple convolutional cores to extract data features, which also guarantees accuracy.

After noise is added to the original waveform data, the signal-to-noise ratio decreases. The average timing error of the U-net model is then 11.64 *μ*s, and its error pickup rate is 4.80%. These values are 12.16 *μ*s and 9.51% for the AIC algorithm and 31.07 *μ*s and 28.55% for the time window method. In this case, the pickup accuracy of the U-net model is greater than that of both the AIC algorithm and the time window method. Thus, after the signal-to-noise ratio was reduced, the accuracy of all three algorithms decreased to some extent, but the AIC algorithm and time-window method were greatly affected, while the U-net model remained stable and retained a high arrival time accuracy.

###### 2.3.2. Comparison of Algorithm Feature Maps before and after Noise Addition

Illustrations of the waveform timing are shown in Figure 8 for the AIC, U-net, and time-window methods, illustrating that the three methods have different degrees of error. In Figure 8(a), the signal-to-noise ratio of the waveform is 31.98 dB, and Figure 8(b) shows a strong local minimum before the peak of the AIC characteristic curve with an error of 9 points. In Figure 8(c), there are double peaks at the beginning of the characteristic curve for the time-window method, and the coda at the end is continuous. An amplitude limit of 0.001 is set by the algorithm to avoid the coda, but there are still 43 points of error. In Figure 8(d), the place where the probability of the U-net output characteristic diagram exceeds 0.5 for the first time is distinguishable, and the arrival time accuracy improves. The accuracy of the time-window method is lower than those of the AIC and U-net methods.
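The U-net pick described for Figure 8(d), the first point at which the output probability exceeds 0.5, can be sketched as a one-line search. The function name `first_crossing` is hypothetical, and returning `None` when no point exceeds the threshold is a choice made for this example.

```python
import numpy as np


def first_crossing(prob, threshold=0.5):
    """Index of the first sample whose predicted probability exceeds the threshold."""
    above = np.flatnonzero(np.asarray(prob) > threshold)
    return int(above[0]) if above.size else None


# Idealised output map: low probability before sample 300, high afterwards.
prob = np.concatenate([np.full(300, 0.1), np.full(724, 0.9)])
print(first_crossing(prob))  # 300
```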

After noise is added to the waveform data in Figure 8(e), the signal-to-noise ratio is reduced to 8.06 dB. The local minimum of the AIC characteristic curve in Figure 8(f) is then weak, and the timing error exceeds 50 points, which is judged to be an incorrect pick. In Figure 8(g), the characteristic curve of the time-window method still contains considerable interference, and the error is 42 points. In Figure 8(h), the U-net output characteristic diagram is disturbed by noise to some degree, but the time point is still picked accurately. The stability of the U-net and time-window methods is better than that of the AIC method, indicating that the AIC method is strongly affected by the signal-to-noise ratio.
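For readers unfamiliar with the AIC characteristic curve, the sketch below uses the commonly quoted Maeda form, AIC(k) = k·log(var(x[1..k])) + (N − k − 1)·log(var(x[k+1..N])), and takes the global minimum as the pick. It is an illustration only and is not the C# implementation used in this study, whose details (for example any amplitude limit or peak judgement) are not reproduced. The function name `aic_pick` and the synthetic trace are hypothetical.

```python
import numpy as np


def aic_pick(x):
    """Return (pick index, AIC curve) using the Maeda form of the AIC.

    Direct O(N^2) evaluation, which is adequate for 1024-sample segments.
    """
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    aic = np.full(n, np.inf)
    for k in range(2, n - 2):
        v1, v2 = np.var(x[:k]), np.var(x[k:])
        if v1 > 0 and v2 > 0:
            aic[k] = k * np.log(v1) + (n - k - 1) * np.log(v2)
    return int(np.argmin(aic)), aic


rng = np.random.default_rng(0)                         # placeholder seed
trace = np.concatenate([0.01 * rng.normal(size=400),   # background noise
                        rng.normal(size=624)])         # "event" from sample 400
pick, curve = aic_pick(trace)
print(pick)
```

On such a clean synthetic trace the minimum falls at or very near the true onset. As the noise level before the onset approaches the signal level, the minimum becomes shallow, which matches the behaviour described for Figure 8(f).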

##### 2.4. Laboratory Validation

The prefabricated cement mortar test block was subjected to uniaxial compression, and acoustic emission instruments were used to collect the vibration signals during the fracture process. In the data processing stage, three automatic time arrival pickup methods were applied to compare the arrival time accuracy for the proposed U-net model against the AIC and time window methods. The source location was performed using the three different time arrival methods, and the event energy location and scatter images from the software were drawn. According to these fracture location images, various wave arrival time pickup methods are evaluated, which lays a foundation for the practical application of subsequent projects.

As this study is ultimately concerned with the effect of arrival time on localisation accuracy and the magnitude of energy at the localisation point, the concern is really with the quality and energy of the signal. The higher the quality of the signal, the more accurate it will be when picked up and the localisation accuracy will improve, ultimately reflecting the texture of the rock fracture; the energy of the signal will reflect the extent of the rock fracture.

We simply use the signal-to-noise ratio and the energy here to represent the quality and energy of the signal, and Figure 9 illustrates the effect of each stage of filtering on the test results. Effect on the energy at the location point: Parseval’s theorem shows that the energy of a signal is obtained by accumulating the square of the amplitude. Passing the signal through a time-domain or frequency-domain filter changes the amplitude and is therefore expected to cause a loss of energy. The more filters the signal passes through, the greater the energy loss. Effect on the pickup: the test results show that the signal lies between 100 kHz and 400 kHz. The band-pass filter of the sensor is fixed at 50 kHz to 400 kHz; it captures the complete event, filters out noise, and improves the initial pickup accuracy. The filter of the amplifier is set to 100 kHz–400 kHz to further exclude low-frequency noise between 50 kHz and 100 kHz, while the amplifier itself also significantly improves the signal-to-noise ratio and can substantially improve the wave pickup accuracy. Once the signal has been digitised, all subsequent processing is controllable. The filter type of the collector can be selected to improve pickup accuracy, and the software filter is more flexible still, allowing the order and threshold to be selected to restore waveform details and improve the accuracy of the picked wave arrival time.
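The energy argument can be illustrated numerically. The sketch below checks Parseval’s theorem for the discrete Fourier transform and shows that an idealised band-pass filter, implemented here by zeroing FFT bins, can only reduce the energy. The brick-wall filter is an assumption made for simplicity and differs from the hardware and software filters used in the test; the function names are hypothetical, and the 3 MHz sampling rate and 100–400 kHz band are taken from the text.

```python
import numpy as np


def energy_time(x):
    """Signal energy as the accumulated squared amplitude."""
    return float(np.sum(np.abs(x) ** 2))


def energy_freq(x):
    """The same energy computed from the DFT (Parseval's theorem)."""
    X = np.fft.fft(x)
    return float(np.sum(np.abs(X) ** 2) / len(x))


def band_limited_energy(x, fs, f_lo, f_hi):
    """Energy left after an idealised (brick-wall) band-pass filter."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)
    X[(f < f_lo) | (f > f_hi)] = 0.0
    return energy_time(np.fft.irfft(X, n=len(x)))


rng = np.random.default_rng(0)                  # placeholder seed
x = rng.normal(size=1024)
print(energy_time(x), energy_freq(x))           # equal up to rounding
print(band_limited_energy(x, fs=3e6, f_lo=100e3, f_hi=400e3))
```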

###### 2.4.1. Test System and Cement Mortar Test Block

Cement mortar test blocks with an average length, width, and height of 100 mm are used, as shown in Figure 10. Table 2 shows the detailed parameters of each test block. A Yaw6206 microcomputer-controlled electrohydraulic servo pressure testing machine from MetS Industrial Systems Co., Ltd., was used to compress each block uniaxially until failure, at the loading rates shown in Table 2. During the tests, an 8-channel acoustic emission instrument from Beijing soft Island times Technology Co., Ltd., was used to collect the vibration signals; the test environment is shown in Figure 11. Before the uniaxial compression tests, a lead-breaking test was performed to calibrate the sound velocity of the test blocks [16], as shown in Figure 12. The sound velocities of the three test blocks are shown in Table 2.

Lead-break test for sound velocity: On the side of the cement mortar specimen, two sensors are placed at coordinates (100, 50, 10) and (100, 50, 90), and two lead-break points are set at (100, 50, 20) and (100, 50, 80). As configured in the software, the lead is broken three times at each point. The sound velocity is calculated by the software for each break, and the average of the six values is taken as the P-wave sound velocity. The principle is shown in the following equation, and the calculation interface is shown in Figure 13.

$$V = \frac{d_2 - d_1}{t_2 - t_1}, \tag{6}$$

where $d_1$ is the distance from the lead-break point to sensor 1, $d_2$ is the distance from the lead-break point to sensor 2, $t_1$ is the time for the sound wave to propagate to sensor 1, $t_2$ is the time for the sound wave to propagate to sensor 2, and *V* is the speed of sound of the wave.
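As a worked example of the lead-break geometry, the sketch below assumes that the velocity is obtained as the difference in path length divided by the difference in arrival time at the two sensors. The sensor and lead-break coordinates (in mm) are those given in the text; the two arrival times are hypothetical values chosen only to make the arithmetic easy to follow, and the function name is likewise an assumption.

```python
import numpy as np


def sound_velocity(source, sensor1, sensor2, t1, t2):
    """Velocity from the path difference and arrival-time difference (mm/s)."""
    d1 = np.linalg.norm(np.subtract(source, sensor1))
    d2 = np.linalg.norm(np.subtract(source, sensor2))
    return (d2 - d1) / (t2 - t1)


# Lead break at (100, 50, 20): 10 mm from sensor 1 and 70 mm from sensor 2.
v = sound_velocity(
    source=(100, 50, 20),
    sensor1=(100, 50, 10),
    sensor2=(100, 50, 90),
    t1=2.5e-6,    # hypothetical arrival time at sensor 1, in seconds
    t2=17.5e-6,   # hypothetical arrival time at sensor 2, in seconds
)
print(v / 1000.0)  # velocity in m/s, about 4000 for these made-up times
```

Averaging the result over the three breaks at each of the two lead-break points would give the six-value mean described above.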

###### 2.4.2. Data Processing and Analysis

The AIC, time window, and U-net algorithms were used to determine the arrival time of the vibration wave data collected from the tests. The positioning map was compared with the physical trajectory, and the accuracy and stability of each method were analyzed. The specific test results are shown in Figures 14 and 15. The coordinate axis takes O as the origin, OC as the *X*-axis, OA as the *Z*-axis, and OG as the *Y*-axis. It is specified that OABC is the front side, with the number 1 of the test block written on the front side, ABED as the top surface, and OADG as the left side.

The energy spheres were sized by stripping the units from the energy values, importing them into Origin, and plotting each energy value as a diameter in points. However, the resulting diameters were too large to be drawn inside the test block, so a scale factor of 0.05 was applied to the diameter of each energy sphere.

Figure 14(a) shows that the high-energy event points of the acoustic emission software are concentrated in the upper middle part of the test block, and there are high-energy points around the OCFG surface of the test block. Figure 14(b) for the AIC method shows that high-energy events are concentrated in the middle and upper part of the GF line, and there are many positions with high energy in the middle and inner part of the BEFC plane. Figure 14(c) for the time-window method shows that high-energy events are concentrated in the upper part of the test block, and there are scattered low-energy points above the OCFG surface. Figure 14(d) for the U-net model shows that the energy is concentrated in the middle and lower parts of the DEFG plane, and the middle and upper parts of the front face of the test block also contain some high-energy points.

Figure 14 shows that the left side of the DEFG surface of the test block has fallen off over a large area and that the bottom side of the DEFG surface has significant cracking, which indicates that the test block produced high-energy events in this area. The acoustic emission software and the AIC, time window, and U-net methods all locate high-energy event points in this area. Among them, the time-window method locates the high-energy events close to the top of the test block, while the positioning software, AIC, and U-net all show that the high-energy events are concentrated in the middle and lower parts. Comparison with the photographs of the block indicates that the AIC and U-net results are closer to the actual fracture locations, suggesting that their picked arrival times are more accurate and better reflect the specific event locations.

Figure 15 shows that there are four distinct cracks on the DEFG face of the test block, with large areas of exfoliation from cracks 1 to 3. The material beneath crack 4 has been hollowed out, leaving only a few cracks on the surface.

The events in Figure 15(a) were picked up and localised by the acoustic emission software. As can be seen from the figure, the events are concentrated inside the test block and are scattered and irregular. This is because the pickup algorithm of the acoustic emission software is based on threshold triggering, and the 20 mV trigger used in the software acquisition may have been set slightly too low, resulting in earlier wave arrival times and larger overall calculated values. The localisation points are therefore scattered within the test block, and no obvious fracture formation can be seen, which is not consistent with the actual rupture.

In Figure 15(b), cracks 3 and 4 can be clearly seen, which indicates that the AIC-corrected picks locate the event points accurately. The reason is that the AIC algorithm is based on the local stationarity of the time series, and the waveforms acquired by the AE software have a high signal-to-noise ratio, which suits the AIC algorithm well. The test results show that the AIC algorithm picks signals with high signal-to-noise ratios very accurately, and this is also reflected in the localisation.
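As a rough illustration of AIC-based picking, the sketch below uses one common variance-based form of the criterion, AIC(k) = k·ln(var(x[0:k])) + (N − k − 1)·ln(var(x[k:N])), and takes the minimum as the onset. Whether this matches the exact implementation used in this study is an assumption; the function name `aic_pick` and the synthetic trace are hypothetical.

```python
import math
import random


def aic_pick(x):
    """Return the index k that minimises a variance-based AIC, i.e. the
    split point that best separates the trace into two segments."""
    n = len(x)
    # Prefix sums of x and x**2 so that each segment variance costs O(1).
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def seg_var(a, b):
        m = b - a
        mean = (s1[b] - s1[a]) / m
        # Floor the variance to avoid log(0) on flat segments.
        return max((s2[b] - s2[a]) / m - mean * mean, 1e-12)

    best_k, best_aic = None, math.inf
    for k in range(2, n - 2):
        aic = k * math.log(seg_var(0, k)) + (n - k - 1) * math.log(seg_var(k, n))
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k


# Hypothetical high signal-to-noise trace: 100 noise samples, then a sinusoid.
random.seed(0)
onset = 100
trace = [random.gauss(0.0, 0.1) for _ in range(onset)]
trace += [5.0 * math.sin(2 * math.pi * i / 20.0) + random.gauss(0.0, 0.1)
          for i in range(100)]
pick = aic_pick(trace)
```

With a clear contrast between the noise and signal variances, the AIC minimum falls very close to the true onset, which is consistent with the good performance on high signal-to-noise waveforms described above.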

Figure 15(c) shows crack 4, but the event points are scattered. The time-window algorithm is based on energy differences, and the large-area spalling of the DEFG surface does release a great deal of energy. However, because of the algorithm's inherent lack of accuracy and the unavoidable loss of energy when the system processes the data, the localisation results are unsatisfactory.
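The idea of picking from an energy difference between time windows can be sketched as follows. This is a simplified energy-ratio picker written for illustration only, not the time window energy eigenvalue algorithm used in this study; the function name `energy_ratio_pick`, the window length, and the test trace are hypothetical.

```python
def energy_ratio_pick(x, window):
    """Return the index where the energy in the following window is largest
    relative to the energy in the preceding window."""
    n = len(x)
    # Prefix sums of squared amplitude give each window energy in O(1).
    energy = [0.0]
    for v in x:
        energy.append(energy[-1] + v * v)

    best_i, best_ratio = None, -1.0
    for i in range(window, n - window + 1):
        before = energy[i] - energy[i - window]
        after = energy[i + window] - energy[i]
        ratio = after / (before + 1e-12)  # small constant avoids division by zero
        if ratio > best_ratio:
            best_i, best_ratio = i, ratio
    return best_i


# Hypothetical trace: 50 low-amplitude samples followed by 50 high-amplitude ones.
trace = [0.1 * (-1) ** i for i in range(50)] + [5.0 * (-1) ** i for i in range(50)]
pick = energy_ratio_pick(trace, window=10)
```

On this idealised step in energy the picker lands exactly on the onset. On real waveforms the result depends on the chosen window length, which is the tuning burden noted for the time-window method.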

In Figure 15(d), a large number of event points are concentrated on the DEFG face, densely enough that they can be connected to form cracks 3 and 4; this localisation result is ideal. The reason is that the U-net model is a purely data-driven identification model that is theoretically more advanced and, as the feature maps show, has a strong generalisation capability. This advantage is reflected in its high-precision arrival-time picks and in a localisation map that matches the actual rupture process.

#### 3. Conclusion

This study adapts the U-net model by adjusting the input and output data dimensions and the convolutional kernel size and uses it to pick the arrival time of seismic waves. The results are compared with the traditional AIC algorithm and the time-window method to verify its accuracy. The stability of the three methods is compared using synthetic signals with added noise. Laboratory acoustic emission experiments that monitor fractures in cement mortar blocks were performed and analyzed using the three methods to determine the seismic arrival time and to create source images. The results were compared with the actual fracture trajectory of the cement mortar test block, and the advantages and disadvantages of the three methods were determined. The following conclusions are drawn.

(1) It is feasible to use the U-net model to automatically detect the arrival of vibration waveforms.

(2) The data analysis shows that the average error of the U-net model is 6.38 μs without noise; its accuracy is slightly lower than that of the AIC algorithm but higher than that of the time-window method. With added noise, the U-net model is more accurate than both the AIC algorithm and the time-window method, with an average error of 11.64 μs.

(3) Compared with the time-window method, the U-net model does not need the time window length and characteristic function to be adjusted multiple times. Compared with the AIC algorithm, there is no need to judge the peak value and limit the amplitude, which eliminates the influence of subjective factors and gives a higher stability. For the 1555 real signals and the 7764 noisy synthetic signals, the error pickup rates of the U-net model were 3.34% and 4.80%, respectively, a stability better than that of the other two methods. The test results of the three algorithms when monitoring fractures in the cement mortar test block through acoustic emissions show that the similarity between the source images and the physical trajectory images after detection by the U-net and AIC algorithms was high, which further verifies the accuracy and stability of the U-net model.

(4) The timing error of the U-net model increased significantly for noisy signals. Thus, the proposed model still needs further optimization. In the future, the network structure will continue to be adjusted to test the accuracy and stability of the arrival time determination.

#### Data Availability

The (Supplementary Materials.zip) data used to support the findings of this study were supplied by Haoyuan Chang under license and so cannot be made freely available. Requests for access to these data should be made to Haoyuan Chang, [email protected]. Some of the data can be accessed publicly from the download address: https://pan.baidu.com/s/1py_HkpH_RBaUqfVuVazt0w; extraction code: chhy.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (52104157 RMB 300000) and Zhangjiagang Science and Technology Plan Project (zkcxy2112 RMB 100000).

#### Supplementary Materials

The supplementary document contains some pictures of the application of the article in experiments and projects. (Supplementary Materials)