#### Abstract

The tunnel vibration level is usually employed as the vibration source intensity in empirical prediction methods. Currently, the analogy test and the database are the two main means of determining the vibration source intensity. To improve prediction accuracy and efficiency, a machine learning (ML) method was introduced to predict the tunnel vibration responses. To acquire model training samples, measurements were performed in 80 different running tunnel sections of Beijing metro lines. Two methods, the back propagation neural network (BPNN) and the generalised regression neural network (GRNN), were employed; both can make full use of the characteristics of the measured samples and reduce the data noise. The results indicate that the prediction efficiency is high and the mean square errors of the two ML methods are acceptable. Accordingly, both ML methods can be used as a reference for the vibration source intensity in the evaluation of metro train-induced environmental impact. GRNN has relatively better predictive ability than BPNN.

#### 1. Introduction

With the development of urban rail transit, the environmental vibration problem arising from metro operation is becoming increasingly prominent [1–4]. Reasonable vibration mitigation design places higher demands on environmental vibration prediction. Various types of prediction model can be employed in the different construction stages of a metro project [5, 6]. In the feasibility study stage of a rail system, scoping or preliminary prediction can be used to identify whether environmental vibration is an issue for potentially sensitive buildings along the rail alignment. Empirical and semi-empirical models are widely used in this stage [7–9]. Recently, the machine learning (ML) method has been introduced in scoping prediction, as in the studies by Paneiro et al. [10], Connolly et al. [11, 12], Chen et al. [13], Yao et al. [14], Paneiro et al. [15], Fang et al. [16] and Liang et al. [17]. In the scheme design stage, determined prediction can be used, including various types of numerical [18–22] and analytical/semi-analytical methods [23–26]. In the construction design stage, detailed prediction methods have been developed, such as the measurement-based transfer function method [27–29] and hybrid methods [30–34].

A chain-type formula based on the assumption of uncoupled sub-systems is a classical empirical prediction method. The idea was originally proposed by Kurzweil [35] and Melke [36] and has been developed in different standards and guidelines [9, 37–39]. In the Chinese code HJ 453-2018 [38], the predicted environmental vibration level *VL*_{z} is calculated by superposing the vibration source level *VL*_{Z,0} and a series of vibration level correction terms; details can be found in reference [40]. *VL*_{Z,0} is defined as the vertical weighted vibration acceleration level on the tunnel wall. Two main approaches can be used to determine *VL*_{Z,0}. One is an analogy test in a similar running tunnel section, which is regarded as the most accurate method. The other is to search a database of tests performed in similar tunnel sections. However, both methods have disadvantages. The analogy test is time consuming, especially when the workload is heavy. Furthermore, the parameters of the test section should be as consistent as possible with those of the predicted section. These parameters include train speed, route radius, tunnel shape and size, track type, soil parameters, etc. It is almost impossible to ensure that all parameters of the two sections are consistent, which introduces errors into the source intensity values. Owing to the large time and labour cost of the analogy test method, the database method is also recommended by standards. However, the database method has the same problem as the analogy test method, and its error is greater because the amount of data available for reference is not large enough.

To solve this problem and improve the prediction efficiency of *VL*_{Z,0}, an ML method was introduced in the present study. In-situ measurements were performed in 80 different tunnel sections of Beijing metro lines, from which the model training samples of *VL*_{Z,0} were obtained. Two types of ML method were then analysed and the prediction results were validated.

#### 2. Measurement in Metro Tunnel

##### 2.1. Measurement Outline

To obtain the *VL*_{Z,0} data for model training, in-situ measurements were performed in 80 different running tunnel sections of the Beijing metro. Several parameters were recorded and considered for each section, namely track type, radius, tunnel shape, train speed and vehicle type:

(i) Track type: regular slab track, steel spring floating slab track (FST), rubber isolator FST, ladder sleeper track, slab track with short sleepers, and slab track with elastic sleepers;

(ii) Radius: from 350 m to infinite (straight line);

(iii) Tunnel shape: horse-shoe tunnel and shield tunnel;

(iv) Train speed: between 15 and 92 km/h;

(v) Vehicle type: types A and B.

Figure 1 illustrates the measuring point location in a tunnel with FST. According to the specification HJ 453-2018 [38], the vibration source intensity is defined at the tunnel wall, 1.25 m above the rail top. In curved tunnels, the sensors were installed on the inner-rail side.

In these tests, the data acquisition equipment INV 3060S was employed, with a maximum sampling frequency of 51.2 kHz. The accelerometer was a Lance 0105T with a measurement range of 20 g and a working frequency range of 0.35 to 6000 Hz.

##### 2.2. Measurement Result

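The vibration source intensity *VL*_{Z,0} is expressed as the maximum Z-vibration level, defined as:

$$
VL_{Z,0} = \max_{t}\left[VL_{z}(t)\right], \qquad VL_{z}(t) = 20\log_{10}\frac{a_{w,\mathrm{rms}}(t)}{a_{0}}, \qquad a_{w,\mathrm{rms}}(t) = \sqrt{\frac{1}{\tau}\int_{t-\tau}^{t} a_{w}^{2}(\xi)\,\mathrm{d}\xi} \tag{1}
$$

where *VL*_{z}(*t*) is the frequency-weighted vertical vibration acceleration level as a function of time *t*, *a*_{w,rms}(*t*) is the running root-mean-square weighted acceleration, *a*_{w}(*ξ*) is the frequency-weighted instantaneous vibration acceleration at time *ξ*, *a*_{0} is the reference acceleration, *τ* is the integration time of the measurement, and *t* is the instantaneous time. The weighting factor suggested by ISO 2631/1 was employed in this study.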
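As an illustration of the definition above, the following minimal sketch computes the maximum running-RMS level from an already frequency-weighted acceleration record. It is not the processing chain used in the measurements: the function name, the reference acceleration of 10^{-6} m/s^{2}, the 1 s integration time and the synthetic sine signal are all assumptions made for the example, and the ISO 2631/1 weighting filter itself is not implemented.

```python
import math

A0 = 1e-6  # assumed reference acceleration, m/s^2


def max_z_vibration_level(a_w, fs, tau=1.0, a0=A0):
    """Maximum running-RMS vibration level in dB.

    a_w : frequency-weighted vertical acceleration samples (m/s^2)
    fs  : sampling frequency (Hz)
    tau : integration time (s), assumed to be 1 s here
    """
    n = int(round(tau * fs))  # samples per integration window
    if n < 1 or len(a_w) < n:
        raise ValueError("record is shorter than one integration window")
    window = sum(x * x for x in a_w[:n])  # sum of squares over the first window
    largest = window
    for k in range(n, len(a_w)):
        # slide the window forward by one sample
        window += a_w[k] ** 2 - a_w[k - n] ** 2
        largest = max(largest, window)
    rms_max = math.sqrt(largest / n)
    # the logarithm is monotonic, so the maximum level occurs at the maximum RMS
    return 20.0 * math.log10(rms_max / a0)


# Synthetic check: a 50 Hz sine of amplitude 0.01 m/s^2 sampled at 1 kHz
fs = 1000
signal = [0.01 * math.sin(2 * math.pi * 50 * k / fs) for k in range(3 * fs)]
level = max_z_vibration_level(signal, fs)
print(round(level, 2))
```

For the synthetic sine, the RMS value is 0.01/√2 m/s^{2}, so the level is about 77 dB under the assumed reference acceleration.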

An illustration of the calculation method for the maximum Z-vibration level is shown in Figure 2.

All values of *VL*_{Z,0} were averaged over five recorded train pass-bys. The averaged *VL*_{Z,0} values of the 80 test sections are listed in Table 1. Figure 3 illustrates how *VL*_{Z,0} varies with train speed for the different track types. In general, *VL*_{Z,0} increases with train speed, especially below 40 km/h, and the vibration reduction effect of the steel spring FST is better than that of the rubber isolator FST. However, sections 10 and 12 were measured in a deep-buried horse-shoe tunnel, where the surrounding rock condition and the tunnel shape affect the measured tunnel responses.

#### 3. Predicting *VL*_{Z,0} Using ML Method

To provide a fast and accurate prediction of *VL*_{Z,0} based on the measured samples, two types of ML method were used: the back propagation neural network (BPNN) and the generalised regression neural network (GRNN).

##### 3.1. BPNN Based Prediction

###### 3.1.1. Method Introduction

BPNN is a type of multilayer feedforward neural network that obtains output vectors by processing input vectors through hidden layers (Figure 4). The output error is evaluated by an error function and is propagated backwards by the gradient descent method. The connection weight *w*_{ij} and threshold *b*_{i} between neurons are then modified until the error of the neural network is reduced to a minimum. In the Bayesian view, the weights and thresholds are first assigned a prior probability distribution, and their posterior probability distribution is then adjusted according to the input data. In this way, the network parameters are modified and the generalisation ability of the network is improved.

The BPNN can be optimised by introducing Bayes’ principle, in which the performance function is modified to

$$
F = \alpha E_{w} + \beta E_{d} \tag{2}
$$

where *α* and *β* are hyper-parameters, *E*_{w} is the network term related to the weights, and *E*_{d} is the conventional error term. *E*_{w} and *E*_{d} can be expressed as

$$
E_{w} = \sum_{j=1}^{N} w_{j}^{2}, \qquad E_{d} = \sum_{i=1}^{m} \left(t_{i} - y_{i}\right)^{2} \tag{3}
$$

where *m* and *N* are the neuron numbers of the output and hidden layers, *w*_{j} is the initial weight, *y*_{i} is the output vector, and *t*_{i} is the corresponding target value.

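When predicting with BPNN, the initial weights and the hyper-parameters *α* and *β* are first randomised. Subsequently, the training set **P** is input as training samples. After training, the weight **w**_{MP} that minimises equation (2) is calculated. Finally, the hyper-parameters can be calculated:

$$
\alpha_{MP} = \frac{\gamma}{2E_{w}(\mathbf{w}_{MP})}, \qquad \beta_{MP} = \frac{n - \gamma}{2E_{d}(\mathbf{w}_{MP})} \tag{4}
$$

where *γ*, the effective number of network parameters, can be calculated from the Hessian matrix of equation (2) at **w**_{MP}, and *n* is the number of training data; the renewed values of *α* and *β* can then be re-determined. In equation (4), *α*_{MP} and *β*_{MP} are the values of *α* and *β* when the weight equals **w**_{MP}. The steps above are repeated until the network converges [41].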

According to the number of input features, six nodes were set in the input layer, and the number of network layers was set to 3 to lower the complexity of the network and prevent over-fitting. One node was set in the output layer, and the output value was the predicted vibration source intensity, i.e. the maximum Z-vibration level.

The mean squared error (MSE) was used to analyse the prediction performance of the network. MSE is defined as

$$
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_{i} - \hat{y}_{i}\right)^{2} \tag{5}
$$

where *y*_{i} and *ŷ*_{i} are the true and predicted values of the test set, respectively, and *N* is the number of output values.

In this study, *VL*_{Z,0} is expressed in dB, so MSE is measured in dB^{2}. Generally, *VL*_{Z,0} lies between 60 and 80 dB. If the average percentage prediction error is 10%, the absolute error is therefore approximately 6 to 8 dB. Taking the lower bound of 6 dB, an MSE below 6^{2} = 36 dB^{2} can be regarded as an acceptable result.
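The acceptance criterion can be checked numerically. The short sketch below evaluates the MSE for a set of measured and predicted levels and derives the dB^{2} threshold from a 10% relative error; the function names and the three sample values are hypothetical and are not taken from Table 1.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_threshold(level_db, relative_error=0.10):
    """Squared absolute error (dB^2) for a given relative error at level_db."""
    return (relative_error * level_db) ** 2


# Hypothetical measured and predicted source intensities in dB
measured = [62.0, 70.5, 78.0]
predicted = [65.0, 68.5, 74.0]

mse = mean_squared_error(measured, predicted)   # (9 + 4 + 16) / 3 dB^2
limit = mse_threshold(60.0)                     # 6 dB squared
print(mse < limit)
```

With a level of 60 dB the threshold is 36 dB^{2}, whereas a level of 80 dB would give 64 dB^{2}, so using 36 dB^{2} is the stricter of the two bounds.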

The node number of the hidden layer is related to the number of input features and the input/output node numbers through the empirical formula given in reference [42] (equation (6)), where *s* and *m* are the node numbers of the hidden layer and input layer, respectively.

###### 3.1.2. Sample Training and Results

According to equation (6), the test begins with three nodes in the hidden layer. With the same training and test sets, the node number of the hidden layer is increased gradually. The training is repeated three times for each node number, and the averaged value is obtained. The training results are illustrated in Figure 5, and the final optimised node number was six.

The model was built using the neural network toolbox in MATLAB. The activation function was the sigmoid function for the hidden layer and the purelin (linear) function for the output layer. The maximum number of training iterations was set to 1000, and the training error goal was set to 0. The Bayesian regularisation method (trainbr) was selected as the training method. Eight groups of data were randomly selected as the test set, and the remaining 72 groups were used as the training set. After the neural network was trained with the training set, the test set was predicted.

During training, a weight matrix was randomly generated and the weights were then modified according to the back-propagated error.
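The training idea described above (random initial weights, followed by weight updates driven by the back-propagated error of a regularised performance function) can be sketched in plain Python. This is only an illustration under stated assumptions, not the MATLAB model of this study: it uses simple batch gradient descent with fixed, hypothetical values of *α* and *β* instead of the trainbr algorithm, it omits the re-estimation of the hyper-parameters, and the 20 training samples are synthetic.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, rng):
    """Random initial weights; thresholds start at zero."""
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Sigmoid hidden layer followed by a single linear output node."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    return h, sum(w * hj for w, hj in zip(net["W2"], h)) + net["b2"]


def objective(net, X, T, alpha, beta):
    """F = alpha * E_w + beta * E_d with sum-of-squares terms."""
    e_w = sum(w * w for row in net["W1"] for w in row) + sum(w * w for w in net["W2"])
    e_d = sum((t - forward(net, x)[1]) ** 2 for x, t in zip(X, T))
    return alpha * e_w + beta * e_d


def train(net, X, T, alpha=0.01, beta=1.0, lr=0.005, epochs=500):
    """Batch gradient descent on F; alpha and beta are held fixed (assumption)."""
    n_hidden, n_in = len(net["W1"]), len(net["W1"][0])
    for _ in range(epochs):
        # gradients start with the weight-penalty part, d(alpha*E_w)/dw = 2*alpha*w
        g_w1 = [[2 * alpha * w for w in row] for row in net["W1"]]
        g_b1 = [0.0] * n_hidden
        g_w2 = [2 * alpha * w for w in net["W2"]]
        g_b2 = 0.0
        for x, t in zip(X, T):
            h, y = forward(net, x)
            d_out = 2 * beta * (y - t)  # output error term for this sample
            g_b2 += d_out
            for j in range(n_hidden):
                # error propagated back through the sigmoid of hidden node j
                d_hid = d_out * net["W2"][j] * h[j] * (1 - h[j])
                g_w2[j] += d_out * h[j]
                g_b1[j] += d_hid
                for i in range(n_in):
                    g_w1[j][i] += d_hid * x[i]
        for j in range(n_hidden):
            net["W2"][j] -= lr * g_w2[j]
            net["b1"][j] -= lr * g_b1[j]
            for i in range(n_in):
                net["W1"][j][i] -= lr * g_w1[j][i]
        net["b2"] -= lr * g_b2
    return net


rng = random.Random(0)
X = [[rng.random() for _ in range(6)] for _ in range(20)]  # synthetic scaled inputs
T = [sum(x) / 6 for x in X]                                # synthetic targets
net = init_network(6, 6, rng)
f_before = objective(net, X, T, 0.01, 1.0)
train(net, X, T)
f_after = objective(net, X, T, 0.01, 1.0)
print(f_after < f_before)
```

The 6-6-1 layout mirrors the six input nodes, six hidden nodes and single output node reported here; in practice the inputs and targets would be the normalised section parameters and measured *VL*_{Z,0} values.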

Figure 6 illustrates the training results by comparing the measured and predicted values. Based on equation (5), the MSE of this test set is 29.98 dB^{2}, which indicates that the trained model is accurate enough to perform the prediction. According to the detailed information in Figure 6, the relative prediction error is within 10%, ranging from −7.57% to 9.86%, which demonstrates an acceptable prediction ability of this model.

###### 3.1.3. Test and Verification

To ensure the generalisation ability of the network, a cross validation was performed. After the data were arranged randomly, they were divided into ten subsets. Each subset was selected in turn as the test set, with the remaining subsets used as the training set, so that training and testing were repeated ten times. Figure 7 shows the MSE of the ten cross validation results; the averaged MSE is 32.28, which demonstrates a good generalisation ability of this method.
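The data partition used in such a ten-fold cross validation can be sketched as follows. The function name and the random seed are hypothetical; only the index bookkeeping is shown, and the model training itself is left out.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # arrange the data randomly
    folds = [indices[i::k] for i in range(k)]   # k subsets of nearly equal size
    for i, test in enumerate(folds):
        # every fold except the i-th one forms the training set
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(k_fold_splits(80, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

With 80 samples and ten subsets, each round trains on 72 samples and tests on the remaining 8, and every sample is used for testing exactly once.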

Furthermore, leave-one-out cross validation (LOOCV) was used to calculate and analyse the errors between the measured and predicted values. Figure 8 illustrates the normal distribution fitted to the BPNN errors obtained using LOOCV.

Based on the verification by LOOCV and cross validation, the prediction error is generally below 10% and the average MSE is less than 35. Accordingly, BPNN can be used to undertake a preliminary prediction of the vibration source intensity of a running metro train.

Moreover, the coefficient of determination *R*^{2} can be used to evaluate the degree of linear correlation of the network fitting result. The coefficient of determination is defined as

$$
R^{2} = 1 - \frac{\sum_{i}\left(y_{i} - \hat{y}_{i}\right)^{2}}{\sum_{i}\left(y_{i} - \bar{y}\right)^{2}} \tag{7}
$$

where *y*_{i} is the true value, *ŷ*_{i} is the predicted value, and *ȳ* is the average of the true values.

For a reasonable fit, *R*^{2} ranges from 0 to 1. The closer *R*^{2} is to 1, the better the degree of linear correlation of the network fitting result. After calculation, *R*^{2} of BPNN is 0.9464, which indicates that the BPNN model can be used to predict the vibration source intensity.
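A minimal sketch of the coefficient of determination, written in the common form based on the residual and total sums of squares, is given below. The function name and the three sample values are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted levels in dB
measured = [60.0, 70.0, 80.0]
predicted = [62.0, 69.0, 79.0]
r2 = r_squared(measured, predicted)   # 1 - 6 / 200
print(round(r2, 2))
```

In this form the value equals 1 for a perfect prediction and can fall below 0 when the predictions are worse than simply using the mean of the measured values.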

##### 3.2. GRNN Based Prediction

###### 3.2.1. Method Introduction

GRNN is a type of radial basis function (RBF) neural network, which has strong non-linear mapping capability, fault tolerance and robustness. It also has advantages in approximation ability and learning speed. The model of a radial basis neural network has two main components: the independent variable and the basis function. The independent variable is the Euclidean distance between the point to be predicted and the sample points, and the basis function is a radial function. GRNN thus transforms a multi-dimensional problem into a one-dimensional problem whose independent variable is the Euclidean distance mentioned above. Any function can be approximated by a weighted combination of basis functions. Figure 9 shows the topology of GRNN.

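In the GRNN calculation, the training set **P** is first input as a learning sample, and then *S*_{D} and *S*_{Ni} are calculated in the summation layer. *S*_{D} and *S*_{Ni} are two types of summation neurons, calculated in different ways. One is the denominator summation, i.e. the direct sum of the outputs of all neurons in the mode (pattern) layer. The other is the numerator summation, i.e. a weighted sum of the outputs of the neurons in the mode layer. *S*_{D} and *S*_{Ni} can be calculated by

$$
S_{D} = \sum_{j=1}^{n} P_{j}, \qquad S_{Ni} = \sum_{j=1}^{n} y_{ij} P_{j}, \qquad P_{j} = \exp\left[-\frac{(\mathbf{X}-\mathbf{X}_{j})^{\mathrm{T}}(\mathbf{X}-\mathbf{X}_{j})}{2\sigma^{2}}\right] \tag{8}
$$

where *P*_{j} is the output of the *j*-th neuron in the mode layer, **X** is the input vector, **X**_{j} is the *j*-th learning sample, *n* is the number of learning samples, *σ* is the network expansion constant, and *y*_{ij} is the connecting weight between the *i*-th neuron in the summation layer and the *j*-th neuron in the mode layer. Finally, the network predicted value is obtained as the ratio of the numerator summation to the denominator summation: *y*_{i} = *S*_{Ni}/*S*_{D} [43].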
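For a single output, the calculation described above reduces to a Gaussian-weighted average of the training targets. The sketch below assumes a Gaussian basis function with one common expansion constant and uses a hypothetical function name and toy data; it is not the MATLAB implementation used in this study.

```python
import math


def grnn_predict(x, train_X, train_y, sigma=1.0):
    """Predict one output value with a generalised regression neural network.

    x       : input vector to be predicted
    train_X : list of training input vectors (one pattern neuron each)
    train_y : list of training target values
    sigma   : network expansion (smoothing) constant
    """
    pattern = []
    for xj in train_X:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xj))        # squared Euclidean distance
        pattern.append(math.exp(-d2 / (2.0 * sigma ** 2)))   # pattern-layer output
    s_d = sum(pattern)                                       # denominator summation
    s_n = sum(p * y for p, y in zip(pattern, train_y))       # numerator summation
    if s_d == 0.0:
        raise ValueError("all pattern outputs underflowed; increase sigma")
    return s_n / s_d


# Toy data: two training points with targets 60 dB and 80 dB
train_X = [[0.0], [2.0]]
train_y = [60.0, 80.0]
midpoint = grnn_predict([1.0], train_X, train_y, sigma=1.0)
near_first = grnn_predict([0.0], train_X, train_y, sigma=0.1)
print(midpoint, round(near_first, 3))
```

At the midpoint both training points receive the same weight, so the prediction is their mean; with a small expansion constant the prediction at a training point is dominated by that point's own target.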

Owing to the characteristics of the RBF neural network, the data should be expressed in scientific notation before normalisation. Accordingly, the route radius was multiplied by 0.01 and the train speed was multiplied by 0.1. Because GRNN learns quickly, it is feasible to build a large number of networks and perform cross validation. A network was therefore built for each of the 80 groups of data in turn: the *i*-th group was selected as the test set and the remaining groups were used as the training set. Thus, a total of 80 networks were established and full use was made of the data.
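The leave-one-out procedure can be sketched as a loop that builds one GRNN per sample. In this illustration the records are hypothetical and only two of the input parameters (radius and speed) are used, with the scaling factors quoted above; the treatment of straight sections with infinite radius and the subsequent normalisation are not covered.

```python
import math


def scale(radius_m, speed_kmh):
    """Scaling quoted in the text: radius x 0.01 and speed x 0.1."""
    return [radius_m * 0.01, speed_kmh * 0.1]


def grnn_predict(x, train_X, train_y, sigma=1.0):
    """Gaussian-weighted average of the training targets (single output)."""
    p = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, xj)) / (2.0 * sigma ** 2))
         for xj in train_X]
    return sum(w * y for w, y in zip(p, train_y)) / sum(p)


def loocv(X, y, sigma=1.0):
    """Predict each sample from a network built on all the other samples."""
    preds = []
    for i in range(len(X)):
        preds.append(grnn_predict(X[i], X[:i] + X[i + 1:], y[:i] + y[i + 1:], sigma))
    return preds


# Hypothetical records: (radius in m, speed in km/h, measured level in dB)
records = [(350, 40, 70.0), (400, 55, 73.5), (600, 60, 74.0),
           (800, 70, 76.0), (1000, 75, 77.5), (1200, 80, 78.0)]
X = [scale(r, v) for r, v, _ in records]
y = [level for _, _, level in records]

preds = loocv(X, y, sigma=1.0)
mse = sum((t - p) ** 2 for t, p in zip(y, preds)) / len(y)
print(len(preds), mse >= 0.0)
```

Repeating the loop for several values of the expansion constant and comparing the resulting MSE is one simple way to select *σ*.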

###### 3.2.2. Results, Test and Verification

After training, 80 groups of predicted results can be obtained using LOOCV. Figure 10 demonstrates the comparison between predicted and measured values.

Figure 11 shows the MSE values for different values of *σ*. According to Figure 11, the value of *σ* has little influence on the result of this experiment. The MSE is at its minimum when *σ* = 1; accordingly, this value was adopted in this experiment.

Both the MSE and the coefficient of determination *R*^{2} were used to evaluate the network fitting result. After calculation, the MSE is 17.9205 and *R*^{2} is 0.8153. Figure 12 shows the error distribution. For its normal fitting curve, the average value is 0.004664 and the standard deviation is 0.07571.

The above results and verification show that GRNN can be used to predict the vibration source intensity while considering different parameters. Compared with BPNN, both the calculation efficiency and the accuracy of GRNN are higher. Accordingly, GRNN is recommended for predicting the vibration source intensity.

#### 4. Conclusion

To improve the prediction efficiency of *VL*_{Z,0} in the empirical prediction formula, an ML method was introduced in the present study. In-situ measurements were performed in 80 different running tunnel sections of the Beijing metro, from which the model training samples of *VL*_{Z,0} were obtained. Two types of ML method were employed and their prediction results were compared. The results indicate that:

(1) Both BPNN and GRNN can be used to predict the tunnel vibration response *VL*_{Z,0}. As verified by LOOCV, prediction by neural network has good generalisation ability. The neural-network-based prediction results can be used as reference values in the preliminary prediction phase.

(2) GRNN has relatively better predictive ability than BPNN.

This study only explores the application of ML for predicting *VL*_{Z,0}. As the number and quality of the test samples used for training determine the accuracy of the prediction results, more tests should be carried out in future to enrich the training samples.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

The study was supported by National Engineering Laboratory for Digital Construction and Evaluation Technology of Urban Rail Transit with the open project fund (No. 2021JZ03) and the Scientific and Technology Research and Development Program of China State Railway Group Co., Ltd. (No. L2021G010).