Abstract

An essential step in surface wave exploration is the inversion of dispersion curves. By inverting dispersion curves, we can effectively establish the shear-wave velocity model and obtain reliable subsurface stratigraphic information. The inversion of dispersion curves is an inversion problem with multiple parameters and multiple poles, and obtaining a high precision solution is difficult. Among the methods of inversion of dispersion curves, local search methods are prone to fall into local extremes, and global search methods such as particle swarm optimization (PSO) and genetic algorithm (GA) present the disadvantages of slow convergence speed and low precision. Deep learning models with strong nonlinear mapping capability can effectively solve nonlinear problems. Therefore, we propose a method called PSO-optimized long short-term memory (LSTM) network (PSO-LSTM) to invert the dispersion curves in order to improve the effect of inversion of dispersion curves. The method is based on the LSTM network, and PSO is used to optimize the LSTM network structure and other parameters that need to be given manually to improve the prediction of the network. Two theoretical geological models are used in the paper: Model A and Model B to test the PSO-LSTM. The tests include the noisy data test and noise-free data test. Model A was tested without noise, and Model B was tested with noise. In addition, PSO and LSTM were tested on model A to compare the performance of PSO-LSTM. In Model A, the maximum relative errors of PSO and LSTM are 20.76% and 5.85%, respectively, and the maximum standard deviations of PSO and LSTM are 57.37 and 1.97, respectively. For PSO-LSTM, the maximum relative errors of Model A and Model B in the inverse results are 2.05% and 2.09%, and the maximum standard deviations of Model A and Model B in the inverse results are 1.23 and 3.87, respectively. The test results of Model A show that the inversion performance of PSO-LSTM is better than those of LSTM and PSO, and the performance of the network can be improved after PSO is used to optimize the network parameters. The inverse results from Model B show that the PSO-LSTM is robust and can invert the dispersion curves well even after adding noise to the model. Finally, the PSO-LSTM is used to invert the actual data from Wyoming, USA, which demonstrates that the PSO-LSTM can be used for the quantitative interpretation of Rayleigh wave dispersion curves.

1. Introduction

Rayleigh waves are waves that transmit along the surface or medium partition interface formed by the interference and superposition of P-wave and S-waves. Since the discovery of Rayleigh waves by the British scholar Rayleigh [1], scholars have been investigating theories related to the propagation of Rayleigh waves in the strata. In the early stages of research, Rayleigh waves were treated as noise, and researchers focused on the characteristics of Rayleigh waves to reduce their hazard in earthquakes or to eliminate their impact on valid information in oil seismic exploration [2]. With the deepening of research, scholars found that Rayleigh waves propagate in layered media with dispersion phenomenon. Scholars then began to use Rayleigh waves in the study of stratigraphic structure. Compared with the conventional exploration methods, such as reflection seismic exploration, surface wave exploration has the advantages of nondestructive testing, convenient construction, high resolution at shallow depth and fast detection speed [3] and is broadly used in the fields of soil delineation, engineering nondestructive testing, and geological hazard survey.

Surface wave exploration can be divided into three steps: surface wave data acquisition, dispersion curves picking, and inversion of dispersion curves. As the key step of surface wave exploration, the inversion of dispersion curves directly affect the reliability of the requested stratigraphic information. Currently, there are two main types of methods to invert the dispersion curves: local search methods and global search methods. Local search methods include the least squares method, Levenberg–Marquardt (L-M) algorithm, and Occam algorithm. Dorman and Ewing [4] used the damped least squares method to invert the dispersion curves for the first time. The S-wave velocities obtained by the inversion were consistent with those obtained by the refracted wave method, confirming the validity of the method. Xia et al. [5] combined the L-M algorithm and the singular value decomposition technique to invert the Rayleigh wave phase velocity. The method improved the convergence speed, computational performance, and stability of the inversion. In addition to least squares method, the Occam algorithm has also been applied to the study of inversion of dispersion curves. For example, Ai and Cheng [6] employed the Occam algorithm for the inversion of Rayleigh wave dispersion curves, and the results of inversion showed that the Occam algorithm can balance the accuracy of model and the computational rate of the inversion well. Local search methods are extensively used because of their rapid computational speed. However, such methods rely excessively on the initial model, and reliable results of inversion can be obtained only if the initial model is similar to the real model. In addition, local search methods are prone to fall into local extremes, and the partial derivatives involved in the calculation and the results of inversion are affected by the accuracy of the Jacobi matrix. The development of local search methods is limited by these factors. Therefore, researchers have applied another global research method that can avoid the initial model selection and partial derivative calculation to invert dispersion curves, such as GA and simulated annealing algorithm (SA). Shi and Jin [7] used GA for inversion of dispersion curves. In the inversion, Shi and Jin modified the search range by analyzing the initial search results, and thus improved the search efficiency of the method. Yamanaka and Ishida [8] added an elite screening strategy to GA to promote the convergence of the solution. Dal Moro et al. [9] used GA and a posteriori probability density estimation for inversion of dispersion curves, and the method obtained solutions with higher accuracy compared to those obtained by GA. In addition, some scholars have also used SA for inversion of dispersion curves studies. Lu et al. [10] proposed a heat-bath simulated annealing algorithm based on the SA. The inverse results showed that the heat-bath simulated annealing algorithm is more suitable for inversion of dispersion curves than the L-M algorithm. Compared with local search methods, the above global search methods are superior, but these global search methods with long operation time and low solution accuracy are still needed to improve for high precision surface wave exploration.

Deep learning models are capable of building a good mapping between signal and semantics by building a hierarchical structure similar to the human brain, extracting features layer by layer from the bottom to the top of the input data [11]. Deep learning models are excellent at solving nonlinear problems and making fast predictions, and their applications in the geophysical field are gradually increasing in recent years, such as earthquake detection and localization [12], seismic lithology prediction [13], denoising [14], detecting faults [15], and other directions. Deep learning techniques have also yielded impressive results in the field of surface wave exploration. Dai et al. [16] proposed a network specifically for dispersion curve extraction called dispersion curves network (DCNet). The network can extract the dispersion curves quickly and accurately. Theoretical and practical data test results show that the accuracy of the extracted dispersion curves using DCNet has reached the level of manual pickup and can meet the needs of practical work. Song et al. [17] proposed a neural network Res-Unet++, which can accurately and efficiently extract the dispersion curves. Actual data have verified that using this network to select the dispersion curves is better than that of manual selection. Yablokov et al. [18] developed an artificial neural network for Rayleigh surface wave fundamental mode dispersion curve inversion. The accuracy of the inverse results of this method is better than the Monte Carlo algorithm inverse results and similar to the Gray Wolf optimization inverse results by theoretical model testing. For noisy data, the artificial neural network still works well. Wu et al. [19] proposed a LSTM network to invert surface wave based on the first height last velocity loss function. The test results of synthetic and real data show that the network can be effectively used not only for theoretical data inversion but also cope well with real data. The results of dispersion curves inversion mentioned above were summarized as shown in Table 1.

On this basis, the PSO-LSTM is used in this paper for the study of inversion of dispersion curves. PSO is used in the selection of parameters of the LSTM for the number of neurons in the hidden layer, the learning rate, and the number of training rounds in the LSTM to avoid the low prediction accuracy of the network model due to improper manual tuning of the parameters. In order to evaluate the ability of the PSO-LSTM to invert dispersion curves in detail, the feasibility of the model for inversion of dispersion curves was first verified by using a model without noise; then the stability of the PSO-LSTM was tested by using a model with 10% Gaussian noise; and finally, the ability of the PSO-LSTM to invert the actual data was tested by using seismic data from the Wyoming area.

2. Principle and Implementation of PSO-LSTM

2.1. Long and Short-Term Memory Network

The LSTM introduces a gating mechanism, which is better at handling timing problems than the traditional recurrent neural network. The cell structure of the LSTM is shown in Figure 1. In Figure 1, , , and denote the forget gate, input gate, and output gate, respectively, denotes the neural unit state, denotes the hidden layer state, denotes the input vector of the LSTM unit, and tanh denote the sigmoid and tanh activation functions, respectively.

The core components of the LSTM are the cell state and the gate structure; the cell state can be seen as a channel for information transfer, allowing information to be passed continuously; the gate structure continuously learns during the training process whether to keep or forget information. The input gate determines the important information in the current input, which in turn updates the cell state; the forget gate determines the information that should be discarded or retained; the output gate is used to determine the new hidden , and to pass the new cell state and the new hidden state to the next LSTM cell. The information transfer within the LSTM neural unit follows equation (1)–(6):where denotes the weight and denotes the bias amount of each gating unit.

2.2. Particle Swarm Optimization

PSO can be used to find the optimal solution quickly through the information interaction between particles. The particles in the algorithm are moving simultaneously, and all particles will generate memory and experience in the process of motion. Any individual particle will compare its experience with the experience provided by other particles in the process of finding the optimal solution, so that it will constantly be in the optimal solution. The PSO’s velocity position update equation is shown in equation (7):where is the inertia weight; and is the learning factor; and is the random number between [0,1]; , x, , are the velocity component, position component, individual optimum, and population global optimum of the ith particle in the jth dimension at the tth iteration, respectively.

2.3. PSO-LSTM
2.3.1. Flow of PSO Optimize Network Parameters

The selection of network parameters for LSTM is usually based on researchers’ experience, and the low prediction accuracy of the model caused by artificial selection can be avoided if PSO is used to determine the parameters. The process of PSO to find the optimal network parameters is as follows: the number of hidden layer neurons, learning rate, and number of training rounds of key model parameters in LSTM are used as optimization-seeking variables for particles in different dimensions, and the optimal model parameters are obtained by continuously updating the velocity and position of particles and calculating and comparing the objective function fitness values so as to achieve the global optimum. The PSO-LSTM flow chart is shown in Figure 2. The PSO-LSTM flow is described as follows:Step 1 The data are divided into training and test sets in a 4 : 1 ratio. The input data is only the Rayleigh wave phase velocity; therefore, no normalization of the data is performed.Step 2 The number of hidden neurons, learning rate, and number of training rounds of the LSTM are set as the search parameters to initialize the population, and the search ranges of hidden neurons, learning rate, and number of training rounds are 50–300, 0.05–0.3, and 200–1000, respectively (the search ranges for these parameters are given based on the researcher’s experience).Step 3 The prediction error of the PSO optimized LSTM is the fitness value of the particle, and the fitness value changes with the number of iterations, and the individual particle updates the individual optimal position and the global optimal position according to the fitness value and then updates its own speed and position according to equation (7).Step 4 Stop the iterative update when the fitness value of the particle stabilizes and determine the values of the number of hidden layer units, learning rate, and the number of training rounds.Step 5 Input the optimal parameters into the LSTM for training and prediction.

2.3.2. Evaluation Metrics for PSO-Optimized LSTM Network Parameters

In the process of parameter optimization by PSO, PSO continuously updates the number of hidden layer units, learning rate, and training rounds to build LSTM models with optimized parameters for training and prediction. The particle fitness value is represented by the mean absolute percentage error (MAPE) of the prediction of shear-wave profiles on the test set. The lower the value of MAPE, the better the parameters found in this iteration, and the optimal parameters are determined. The MAPE is calculated aswhere is the sample size, denotes the true value, and denotes the predicted value.

The model is built with optimal parameters and trained after the optimization search is completed. The accuracy of the LSTM solution is tested by the mean squared error (MSE) between the predicted and true values. The lower value of MSE indicates the higher accuracy of the model solution, and the formula for calculating MSE iswhere is the sample size, denotes the true value, and denotes the predicted value.

3. Synthetic Data Tests

The significant influence on the variation of Rayleigh wave dispersion curves characteristics is shear-wave velocity and thickness of the stratum [5], and the remaining parameters have a minor effect on it. In order to reduce calculation cost, only the shear-wave velocity and thickness are inverted. For the network training data, the scalar transfer algorithm is used to generate the data. The frequency band range of the dispersion curves is 5–100 Hz with a frequency interval of 3. The sample data were randomly generated within the upper and lower 50% of the theoretical model parameters [22]. The frequencies in the data are all the same, and to reduce the time cost, only the Rayleigh wave phase velocity in the dispersion data is used as the input data of the network model, and the stratum layer thickness and shear-wave velocity in the model parameters are used as the label data. In the article, the inertia weights, learning factors c1, and c2 of PSO are 0.8, 2.0, and 2.0, respectively. In the inversion of dispersion curves, the number of particles and iterations of PSO are 20 and 50, respectively. In PSO-LSTM, the number of particles and iterations of PSO are 10 and 20, respectively. All tests in the article were performed in the same environment. The software platform is PyCharm, and the programming environment is the Python language using the PyTorch framework. PyTorch and Python version are 1.10.2 and 3.6.13, respectively. The CPU and GPU of the computer used in this article are Intel Core i5-10400F and NVIDIA GeForce RTX 3060, respectively.

The objective function of the PSO to invert dispersion curve iswhere is the measured phase velocity of the Rayleigh wave; is the theoretical phase velocity of the Rayleigh wave; and is the number of points of frequency.

To verify the performance of PSO-LSTM, two typical geological models were designed. Model A is a four-layer geological model with increasing velocity with depth; Model B is a four-layer model with a low-velocity layer in the middle of the model; and the model parameters are shown in Table 2. Model A is tested without noise, and the data contain the dispersion data of the fundamental mode; Model B is tested with noise, and the data contain the dispersion data of the fundamental mode and second mode. The sample data of model A and B are shown in Figure 3. The sample data of model A contain no noise, and the sample data of model B have 10% Gaussian noise added to it.

3.1. Noiseless Synthetic Data Test

To compare the inversion performance of PSO, LSTM, and PSO-LSTM, we tested these three inversion methods using Model A. The number of neurons in the hidden layer 1 and hidden layer 2 in the parameters of the LSTM network without PSO optimization given by experience are 160 and 118, respectively; the learning rate and the number of training rounds of the LSTM network are 0.17 and 719, respectively. The optimal LSTM network parameters by PSO search are as follows: the number of neurons in hidden layers 1 and 2 is 254 and 276, respectively; the learning rate is 0.0890; and the number of training rounds is 719. From Figures 4(a) and 4(b), we can see that using the model parameters searched in 20 iterations to train the network can already get reasonable training results. The same trend of training error and validation error in Figure 4(b) indicates that the model is well trained. From Figure 4(c), it can be seen that PSO, LSTM, and PSO-LSTM inverted dispersion curves fit well with the observed dispersion curves, indicating that they have found the optimal solution. In Figure 4(d), the shear-wave velocity profile of PSO inverted has a large deviation from the shear-wave velocity profile of real model in the second and fourth layers; the shear-wave velocity profile of LSTM inverted agrees better with the shear-wave velocity profile of real model, but there are also some deviations in the layer thicknesses of the second and fourth layers; our proposed PSO-LSTM works the best, and the inverse transverse velocity profile agrees almost perfectly with the shear-wave velocity profile of real model. The fit of the inverted model to the real model shows that the LSTM network is more suitable for inversion of dispersion curves than PSO; the prediction effect of LSTM network is better after PSO optimizes the network parameters of LSTM, and PSO optimizes the network parameters to play a role in improving the network accuracy.

3.2. Noisy Synthetic Data Test

Noise is inevitable when acquiring real seismic data. The noise will make the extracted phase velocity deviate from the true value, making the inversion more difficult and reducing the accuracy of inverse results [10]. Therefore, the capability of PSO-LSTM to invert noisy data is necessary to be examined. In order to simulate the real data, 10% Gaussian noise is added to both Model B and the sample data of Model B. Then, the trained network is used to invert model B. The optimal LSTM network parameters are searched by PSO: the number of neurons in hidden layers 1 and 2, the learning rate, and the number of training rounds are 61, 239, 0.1140, and 747, respectively. From Figure 5(a), we can see that after 20 iterations, the function values have converged. Figure 5(b) indicates that the model is well trained. In Figure 5(c), the dispersion curves fit well, and there is no significant deviation between the inverted dispersion curves and the observed dispersion curves. Figure 5(d) shows that the PSO-LSTM can still reconstruct the shear-wave velocity model well after noise is added to the model. The reconstructed shear-wave velocity model fits well with the shear-wave velocity model of the real model, indicating that the inversion works well.

3.3. Stability Analysis

To further evaluate the performance of the PSO-LSTM to invert the dispersion curves, the theoretical model test was repeated 10 times, respectively, and the mean and standard deviation of the inverse results were calculated. The mean and standard deviation are generally used to reflect the stability of the inverse results, the smaller the mean and standard deviation, the more stable the inverse results [23, 24]. Keeping the network parameters constant, the network is trained 10 times and predicted separately. The inverted results are shown in Figure 6 and Table 3. In addition, 10 inversion tests were also performed separately for PSO and LSTM on Model A, and the inverted results are added in Table 3. In model A, the maximum relative errors in the inverse results of PSO, LSTM, and PSO-LSTM are 20.76%, 5.85%, and 2.05%, respectively. The maximum standard deviations of the inverse results of PSO, LSTM, and PSO-LSTM in Model A are 57.37, 1.97, and 1.23, respectively. These results show that among PSO, LSTM, and PSO-LSTM, PSO-LSTM performs the best and is the most stable.

In Table 3, the maximum relative errors and maximum standard deviations of model B are 2.09% and 3.87. In Figure 6(c), the inverted dispersion curves are close to the real dispersion curves. From Figure 6(d), it can be seen that the reconstructed shear-wave velocity model and the real shear-wave velocity profile are well fitted. These show that PSO-LSTM is stable, and the inverted models from PSO-LSTM can accurately predict the real ones.

4. Real Data Test

The next step will be to test the ability of PSO-LSTM to invert the actual data acquired from the Wyoming area of the United States [25]. The original seismic record is shown in Figure 7(a). Forty-eight 8 Hz vertical component geophones were used to collect data, the interval was 0.9 m, the minimum offset distance was 0.9 m, and the shock source was a hammer shock source. Figure 7(b) shows the dispersion image extracted from the seismic record. The inverse test was performed using the dispersion curves picked up on the fundamental mode (solid dots in Figure 7(b)). The exploration depth was divided into five layers based on the logging data, and the set stratigraphic physical parameters are shown in Table 4.

250 sample data were created according to Table 4 using the fast scalar method, and the data are shown in Figure 8. The parameters of the PSO-LSTM obtained by PSO search are as follows: the number of neurons in hidden layers 1 and 2 are 217 and 192, respectively, the learning rate is 0.1760, and the number of training rounds is 719. Figures 9(a) and 9(c) show the inverse results, and Figure 9(b) shows the network training error. From Figure 9(a), it can be seen that the dispersion curve obtained from the inversion matches well with the measured curve. In Figure 9(c), the shear-wave velocity model obtained from inversion matches well with the logging data. The inverted shear wave velocity model is not only close to the logging data in terms of speed but also roughly consistent with the depth of the real formation. This shows that the results of PSO-LSTM inverted the actual data are reliable.

5. Conclusion

In this paper, we propose a dispersion curve inversion method based on a deep learning model. The method can avoid the manual parameter selection and improve the prediction accuracy of the network. In the specific test work, the optimal network parameters are first selected using PSO, and the prediction of the dispersion data is performed after training the model using the selected network parameters. PSO-LSTM achieved favourable inverse results in the model test. The maximum relative error and maximum standard deviation of PSO-LSTM in Model A and Model B are 2.05%, 2.09%, and 1.23, 3.87, respectively. The inverse results of Model A show that the PSO-LSTM can be successfully applied to the study of inversion of dispersion curves. The test results of Model B confirm the stability of PSO-LSTM. Even if the dispersion curve data contain some noise, the inversion results of PSO-LSTM are still reliable. Finally, the ability of PSO-LSTM to invert actual data is tested. The real data tests show that PSO-LSTM is practical and the inverse results are reliable.

Data Availability

The experimental data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was financially supported by the National Natural Science Foundation of China (42004113 and 42064007), National Science and Technology Supporting Program of Jiangxi Province (20212BAB211003 and 20212BBG73011), Open Fund from Engineering Research Center for Seismic Disaster Prevention and Engineering Geological Disaster Detection of Jiangxi Province (SDG202006 and DGD202106), and Science and Technology Research Project of Education Department of Jiangxi Province (GJJ200728), Doctoral Research Foundation of East China Institute of Technology (DHBK2018069), and Jiangxi University Students Innovation and Entrepreneurship Project (S202210405013 and S202210405016).