#### Abstract

Tunnel settlement commonly occurs during tunnel construction in large cities. Existing forecasting methods for tunnel settlement include model-based approaches and artificial intelligence (AI) enhanced approaches. Compared with traditional forecasting methods, artificial neural networks are easy to implement and offer high efficiency and forecasting accuracy. In this study, an extended machine learning framework is proposed that combines particle swarm optimization (PSO) with support vector regression (SVR), back-propagation neural network (BPNN), and extreme learning machine (ELM) to forecast the surface settlement for tunnel construction in two large cities in China. Based on verification with real-world data, the PSO-SVR method shows the highest forecasting accuracy among the three proposed forecasting algorithms.

#### 1. Introduction

Accurate tunnel surface settlement prediction is crucial for construction companies to prevent unexpected disasters, such as tunnel collapse and landslides. For tunnel construction in large cities, such as metro construction, an early warning of tunnel settlement helps reduce the risk of disrupting the activities of nearby residents, damaging buildings, and polluting the environment [1]. For tunneling in rural areas, especially mountain tunneling, tunnel settlement monitoring helps prevent landslides, which can injure or kill construction workers [2].

Two types of tunnel settlement forecasting methods are available in the literature, namely, model-based methods and artificial intelligence (AI) enhanced methods. Model-based methods build physical or mathematical models based on physics theories and verify the model using physical simulations [3, 4]. However, some physical or mathematical theories are hard to apply directly to real-world situations because of various dependent parameters and, in many cases, strong assumptions have to be made before the physical model can be applied, and these assumptions may not always hold.

The fast development of artificial intelligence (AI) provides another option for tunnel settlement prediction. Available AI enhanced methods include pattern sequence forecasting (PSF) [5], support vector regression (SVR) [6], artificial neural networks (ANN) [7], and deep learning neural networks (DLNN) [8]. Among them, PSF, SVR, and ANN are usually applied to small data analysis, whereas DLNN methods are more popular for big data analysis and can provide accurate forecasting results when a long history of the time series data is available.

Two difficulties arise for tunnel settlement forecasting using data-driven methods. First, the time period of tunnel construction is limited, which makes it impossible to collect a long history of data, e.g., up to several years. Moreover, tunnel settlements are usually expected to be fixed within a short period of time, which again keeps the collected time series short. Second, the tunnel construction company usually records only the relative height of the measuring points, so the collected time series data is univariate. Univariate time series forecasting is reported to be more difficult than multivariate forecasting [9]. These two difficulties make DLNN methods, such as the long short-term memory (LSTM) neural network and its extensions, unsuitable for tunnel settlement forecasting, since for small datasets the DLNN methods usually produce less accurate forecasting results than conventional machine learning methods, such as back-propagation neural network (BPNN), SVR, and extreme learning machine (ELM) [10].

In this study, an extended AI enhanced approach that combines traditional machine learning techniques with particle swarm optimization (PSO) is proposed. A real-world tunnel surface settlement dataset is employed to verify the performance of the proposed method. Overall, the work described in this paper contributes to both science and industry in the following three respects:

(1) *Utilizing machine learning techniques for tunnel settlement forecasting*. Tunnel settlement forecasting is a practical problem in real-world civil construction. However, little work has been done in this area; now that AI enhanced techniques have developed rapidly, the importance of fully utilizing the historical data from the tunnel construction process must be emphasized.

(2) *Univariate time series data forecasting with small data size*. The tunnel settlement data employed in this study was recorded by a metro tunnel construction company located in Shanghai. For each measured tunnel surface point, a time series dataset of size 100 is provided. Moreover, the construction company records only the height of each measured point, although the tunnel settlement is evidently affected by multiple external factors, such as environmental elements and civil works. The univariate and small data size properties make the forecasting problem considerably more challenging.

(3) *Extended machine learning approaches are proposed*. The proposed forecasting method modifies traditional machine learning techniques, such as SVR, BPNN, and ELM, to make them more suitable for tunnel settlement forecasting. A PSO process is added to search for the optimal parameters of each model. In the experiment phase, a comparative analysis is performed to justify the effectiveness of the proposed method.

#### 2. Literature Review

In general, there are two approaches to time series data forecasting, namely, model-based methods and data-driven methods. Model-based methods utilize mathematical or physical models to perform simulation and usually require multivariate data to be recorded. The extra variables, beyond the tunnel surface point heights, may include underground water pumping, soil quality measurements, and other assumptions. The forecasting accuracy depends on the validity of the physical assumptions. Shi et al. [11] investigated the soil movement in response to tunnel excavation in clays through simulations. Soil movements are the main cause of tunnel settlements. Chakeri et al. [12] designed a FLAC3D (Fast Lagrangian Analysis of Continua in 3 Dimensions) model to simulate the tunnel excavation process and consequently investigate the ground surface settlement. The FLAC3D model is a finite-difference approach based on a number of mathematical assumptions. Strokova [13] surveyed traditional model-based prediction methods for tunnel settlement during the construction process. A finite-element based software package named “Plaxis” and a mathematical model built on real-world tunnel settlement data from 2007-2008 at Munich Technical University are utilized for simulation and performance comparison [14]. In summary, the model-based methods provide white-box modeling of the tunnel settlement problem. Their forecasting accuracy is comparable to that of data-driven methods when multiple external variables are available and the mathematical assumptions are valid.

Data-driven approaches are grey-box or black-box models that involve a complex internal structure, receive a preprocessed version of the input dataset, and output integrated forecasting results. Conventional data-driven approaches for time series data forecasting include autoregressive (AR) methods [15], artificial neural networks (ANNs) [16], support vector regression (SVR) [17], deep learning neural networks (DLNNs) [18], and wavelet methods [19]. Ji et al. [20] proposed a least square support vector regression (LSSVR) method for ground surface settlement. Wang et al. [16] reported that, by utilizing an adaptive differential evolution (ADE) algorithm to overcome the local extremum problem in the optimal weight search of BPNN, the traditional BPNN can outperform most existing forecasting methods, such as SVR and AR models. Kuremoto et al. [21] proposed using a deep belief network with restricted Boltzmann machines to perform time series data forecasting. Wang et al. [22, 23] proposed using an extended echo state network (ESN) to forecast electricity energy consumption in China. Wu and Gao [24] combined the AdaBoost algorithm with a long short-term memory (LSTM) neural network to forecast financial time series data. Lu et al. [25] introduced another extended LSTM algorithm, combined with the differential evolution (DE) method, for electricity price forecasting. Yan et al. [26] proposed a multistep forecasting algorithm that integrates a convolutional neural network (CNN) with LSTM to forecast single household energy consumption.

#### 3. Proposed Algorithm for Tunnel Settlement Forecasting

##### 3.1. Data Description

Two real-world tunnel settlement datasets were employed for the study of tunnel settlement prediction based on various modern machine learning techniques. Both datasets were collected by a local tunnel construction company in China: one measures the tunnel surface settlement of the metro train line 3 construction in Ningbo city, China, and the other measures the tunnel surface settlement of a subway construction in Zhuhai city, China. Over 700 ground surface sensors were utilized, measuring the overall settlement on each day during the tunnel construction period. The recording frequency is once per day, and the total number of records for each surface point is around 100, depending on the particular construction progress conditions.

In the experiment phase, in total 10 measured surface points were selected and, for each point, 5/6 of the total recorded length was taken as the training dataset for modern machine learning prediction models, including BPNN, SVR, and ELM. The remaining 1/6 of the total recorded length was used for verification purposes, computing classic error measurement metrics, including root mean square error (RMSE), mean square error (MSE), and mean absolute percentage error (MAPE).
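The split and the three error metrics can be sketched as follows. This is a minimal illustration using the standard textbook definitions of MSE, RMSE, and MAPE, not the authors' code; the function names are hypothetical, and MAPE is undefined whenever an actual value is zero.

```python
import math


def split_train_test(series):
    """Use the first 5/6 of the series for training and the last 1/6 for testing."""
    cut = (len(series) * 5) // 6  # integer arithmetic avoids float rounding
    return series[:cut], series[cut:]


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For a surface point with 75 records, this split yields 62 training values and 13 testing values.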

##### 3.2. Back-Propagation Neural Network

BPNN, as one specific form of ANN, represents one of the most classic machine learning techniques and is continuously employed and improved in various application fields [27–29]. The most critical limitation of BPNN probably arises when it is used to deal with big data. For very large datasets, parallelization of the original BPNN is required [29]. However, when the data size is very small, the BPNN usually provides high forecasting accuracy with minimal time required compared with other machine learning techniques. Over the past few decades, many extensions of BPNN have been proposed. With a preprocessing step, such as particle swarm optimization (PSO), the extended BPNN becomes more suitable for forecasting and prediction under various working conditions.

##### 3.3. Support Vector Regression

Support vector regression (SVR) is a state-of-the-art and probably the most commonly applied machine learning technique for various purposes in industrial engineering, including solar energy generation optimization [30], traffic flow forecasting [31], and molecular dynamics forecasting [32]. Inheriting the core idea of the support vector machine (SVM), SVR looks for a hyperplane in a high-dimensional space that best represents the data pattern. Figure 1 shows a simple linear support vector regression plane with the insensitive loss variable *ε*.
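The role of the insensitive loss variable can be illustrated with the standard ε-insensitive loss, under which deviations inside a tube of half-width ε around the regression plane cost nothing. The sketch below uses the textbook definition; the function name is hypothetical and is not part of LibSVM.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard epsilon-insensitive loss: zero inside the tube, linear outside."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

A prediction that misses the target by less than ε contributes zero loss, so only points on or outside the tube influence the fitted plane.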

LibSVM is a toolbox developed by Chang and Lin that provides easy access to SVR and SVM [25]. For a given set of training data *Tr* = $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ is the training input and $y_i$ is the objective output value, LibSVM finds the objective function *f*(*x*) given three important parameters: *K*, *C*, and *γ*. *K* stands for the kernel function that maps the low dimensional input data into a high dimensional feature space. *C* and *γ* can be optimized by the PSO algorithm.

##### 3.4. Extreme Learning Machine

Extreme learning machine (ELM), proposed by Huang et al. in 2004 and 2006 [33, 34], is known for its fast learning speed and low computational cost while still providing competitive classification results [35–37]. ELM is best known as a single-layer feed-forward neural network (NN) and has also been extended to non-NN forms. Compared with other learning methods in the literature, such as BPNN, multilayer neural networks, and SVM, ELM is much faster to train and provides higher generalized classification accuracy in many reported cases.

The traditional ELM algorithm maps the input data samples to the recognized pattern using a single layer of neurons. For any testing sample *x*, the ELM mapping *f*(*x*) is expressed in terms of the parameters *a* and *b*, which are tuned, and the weight vector *w* of the hidden neurons, which is fixed during the training phase. The function *f*(*x*) represents the recognized pattern of the input data samples. The tuning-free feed-forward training strategy of ELM is equivalent to solving a linear equation system, which requires very low computational cost.

The basic ELM implementation can be found at http://www.ntu.edu.sg/home/egbhuang/index.html. To achieve the best result using ELM, two important parameters need to be tuned: the number of hidden neurons and the activation function. Again, these two parameters can be optimized using the PSO algorithm.

##### 3.5. Rolling Window Size Selection

Considering the properties of the real-world tunnel settlement data, such as its short length, its single variable, and its sparse sampling (one sample per day), we select a suitable rolling window size for each machine learning technique in its training process. The univariate training data was reorganized into batches according to the rolling window size and fed into the machine learning models to predict the value at the next time stamp (Figure 2). The rolling window size is another important parameter for each machine learning model: it determines how many of the most recent samples in the training dataset are used for each prediction, since samples that are too old usually have less influence on the prediction results. According to the data description in Section 3.1, the suitable rolling window size usually lies in the range from 1 to 20.
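The reorganization of a univariate series into rolling window batches can be sketched as follows. This is an illustrative helper with a hypothetical name, assuming that each batch of *k* consecutive values is paired with the value that immediately follows it.

```python
def make_rolling_windows(series, window_size):
    """Pair each run of `window_size` consecutive values with the next value."""
    if not 1 <= window_size < len(series):
        raise ValueError("window_size must be at least 1 and shorter than the series")
    n_batches = len(series) - window_size
    inputs = [series[i:i + window_size] for i in range(n_batches)]
    targets = [series[i + window_size] for i in range(n_batches)]
    return inputs, targets
```

With a series of 62 training values and a window size of 5, this produces 57 input batches, each with one target value. A larger window leaves fewer batches, which matters for series this short.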

##### 3.6. Using Particle Swarm Optimization to Find Optimal Parameters for Various Machine Learning Techniques

For all three machine learning techniques that we used in this work, i.e., BPNN, SVR, and ELM, there are important parameters to be tuned, which will seriously impact the final forecasting results [38]. In this study, the PSO is adopted to find the optimal parameters for the three machine learning techniques. The overall algorithms are denoted as PSO-BPNN, PSO-SVR, and PSO-ELM.

Compared with other optimization search algorithms, such as the genetic algorithm (GA), the ant colony algorithm, and the differential evolution (DE) algorithm, the PSO algorithm is more efficient and able to avoid problems of stagnation behavior and premature convergence [39–41]. Moreover, the PSO algorithm has a small number of parameters and adopts real number coding. The PSO algorithm does have shortcomings: for example, it can easily fall into local extrema, and its convergence speed is affected by the inertia weight. These shortcomings can be mitigated by repeated runs and by selecting an appropriate combination of parameters for the algorithm [42].

Taking the PSO-SVR algorithm as an example, the initial parameters of PSO include the number of particles *m*, the inertia weight *w*, and two learning constants $c_1$ and $c_2$. The search for the PSO parameters depends on the mean absolute percentage error (MAPE) evaluation of the SVR results. For a given set of training data *X* = $\{(y_i, \hat{y}_i)\}_{i=1}^{n}$ with *n* samples, where $y_i$ stands for the actual data and $\hat{y}_i$ stands for the forecasting result produced by SVR, the MAPE value is calculated by

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|.$$

First, we set *m* = 200 and run a grid search with step size 0.1 over the inertia weight and the two learning constants, using various combinations of parameters for SVR. Based on the grid search results, we select the inertia weight *w* = 25 and the two learning constants $c_1$ = 1.2 and $c_2$ = 1.6.

Next, after fixing the parameters of PSO, we look for the optimal parameter combination of SVR using PSO (illustrated in Figure 3). Then the optimal values of* C*, *γ*, and* k* (SVR parameters) are obtained when all particles converge (Figure 4). The detailed steps of the PSO-SVR algorithm are listed in Algorithm 1.

Algorithm 1: The PSO-SVR algorithm.

Input: Searching space of vector (*C*, *γ*, *k*), where *C* ranges from 1 to 10000; *γ* ranges from -100 to 100; and *k* is the rolling window size, ranges from 1 to 20.

Output: The optimal values of *C*, *γ*, *k* based on MAPE evaluation of SVR.

Step 1: For each particle *p*, a location vector $l_p$ and a velocity vector $v_p$ are assigned.

Step 2: For each particle *p*, the fitness function is evaluated, which is the MAPE value of SVR using this particular particle’s location vector.

Step 3: At each iteration, if the fitness function is not satisfied, all particles update their historical optimal location *h* and global optimal location *g* according to their current location and velocity.

Step 4: When the maximum iteration is reached, or the MAPE value is less than a pre-defined value, the global optimal location *g* in the search space is outputted.

The same process can be applied to search for the optimal parameter combination of BPNN and ELM.
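The steps of Algorithm 1 can be sketched with a standard PSO loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the fitness used in the paper is the MAPE of a trained SVR, whereas `toy_fitness` below is a hypothetical stand-in with a known minimum so that the example runs without an SVR library. The swarm size, iteration count, and inertia value of 0.7 are also assumptions chosen for the illustration; only the search ranges and the two learning constants come from the text.

```python
import random


def pso_minimize(fitness, bounds, n_particles=30, n_iterations=200,
                 inertia=0.7, c1=1.2, c2=1.6, tolerance=None, seed=0):
    """Standard PSO; returns (global best location, its fitness)."""
    rng = random.Random(seed)
    # Step 1: assign each particle a location vector and a velocity vector.
    locations = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * len(bounds) for _ in range(n_particles)]
    # Step 2: evaluate the fitness function at every particle's location.
    h = [loc[:] for loc in locations]            # historical optimal locations
    h_fit = [fitness(loc) for loc in locations]
    best = min(range(n_particles), key=h_fit.__getitem__)
    g, g_fit = h[best][:], h_fit[best]           # global optimal location
    for _ in range(n_iterations):
        # Step 4: stop early once the fitness is below the pre-defined value.
        if tolerance is not None and g_fit < tolerance:
            break
        # Step 3: move every particle, then update h and g.
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * r1 * (h[i][d] - locations[i][d])
                                    + c2 * r2 * (g[d] - locations[i][d]))
                # Clamp the new location to the search space.
                locations[i][d] = min(hi, max(lo, locations[i][d] + velocities[i][d]))
            fit = fitness(locations[i])
            if fit < h_fit[i]:
                h[i], h_fit[i] = locations[i][:], fit
                if fit < g_fit:
                    g, g_fit = locations[i][:], fit
    return g, g_fit


# Search space of Algorithm 1: C in [1, 10000], gamma in [-100, 100], k in [1, 20].
BOUNDS = [(1.0, 10000.0), (-100.0, 100.0), (1.0, 20.0)]


def toy_fitness(location):
    """Hypothetical stand-in for 'MAPE of SVR at this location' (minimum at C=100,
    gamma=0.5, k=5). The window size k is rounded because it must be an integer."""
    c, gamma, k = location[0], location[1], round(location[2])
    return (((c - 100.0) / 9999.0) ** 2
            + ((gamma - 0.5) / 200.0) ** 2
            + ((k - 5) / 19.0) ** 2)


best_location, best_fitness = pso_minimize(toy_fitness, BOUNDS)
```

To apply the same loop to BPNN or ELM, only the bounds and the fitness function would change, for example a fitness that trains the model with the decoded parameters and returns its MAPE on held-out data.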

#### 4. Experimental Results

The three machine learning techniques, namely, BPNN, SVR, and ELM, combined with the PSO parameter optimization algorithm, are applied to a real-world tunnel settlement prediction problem with two datasets collected by a local tunnel construction company in China: one measures the tunnel surface settlement of the metro train line 3 construction in Ningbo city, China, and the other measures the tunnel surface settlement of a subway construction in Zhuhai city, China. For each tunnel construction project, 5 representative surface points are selected, which are surface point numbers 184, 191, 192, 220, and 230 for the subway construction in Zhuhai city and surface point numbers 554, 569, 570, 571, and 580 for the metro train line 3 construction in Ningbo city. For each surface point, 5/6 of the total recorded length is taken as the training dataset and the remaining 1/6, which contains approximately 10 to 20 points, is used for testing purposes.

Figure 5 shows the tunnel settlement prediction for measuring surface point number 184 for the subway construction in Zhuhai city. The actual surface point height decreases most of the time, from -7.5 mm to -34 mm, with some unstable movements because of the underground tunnel construction. In total, there are 75 data points for this particular measurement point. All three machine learning models with parameters optimized by PSO were tested with this measurement point. The first 5/6 of the total dataset is used for training and for finding the best fits of the machine learning models. With each trained model, each rolling window batch produces a predicted value, and the predictions of each model are shown in a different color. The results of BPNN are shown in blue; the results of SVR are shown in pink; and the results of ELM are shown in green. In most cases, PSO-SVR produces the best RMSE, MSE, and MAPE values according to Table 1, followed by BPNN and ELM. Figures 6–9 show the prediction results of surface point numbers 191, 192, 220, and 230, respectively.

Figure 10 shows the tunnel settlement prediction for measuring surface point number 554 for the metro train line 3 construction in Ningbo city. Most of the measuring surface points of this project go up in the first phase of the construction and drop down in the later phase because of human interference underground. The surface point movement trend of the first phase is of no use for forecasting the testing time period. This is one important reason that we introduce the rolling window in Section 3.5. With proper rolling window sizes selected, the proposed machine learning framework predicts the tunnel surface point movement based on the most recent movement history and ignores the movement history outside the rolling window. Figures 11–14 show the prediction results of surface point numbers 569, 570, 571, and 580, respectively. Experimental results demonstrate that the proposed approach can predict tunnel surface point movement well in the presence of human interference.

For all measurement points shown above, we list the MSE, RMSE, and MAPE results of the three machine learning techniques in Table 1. The experimental results show that PSO-SVR predicts the tunnel settlement more accurately than PSO-BPNN and PSO-ELM. All RMSE values are less than 0.1 with MAPE values less than 2.5%, which suggests that the proposed PSO-SVR method is well suited to real-world tunnel settlement forecasting problems.

#### 5. Conclusion and Limitation

Aiming at preventing serious damage during the tunnel construction process, this study proposes an extended machine learning framework combining different machine learning techniques with PSO to forecast the tunnel surface settlement based on univariate historical data. By evaluating the particular form of the real-world tunnel settlement historical data, three modern machine learning techniques were selected, including BPNN, SVR, and ELM. The PSO algorithm is adopted to select the globally optimized parameters for each machine learning technique.

In the experiment phase, two real-world datasets were used for performance comparisons between different machine learning techniques. One dataset records the tunnel surface settlement of a metro train line construction in Ningbo city, China and the other dataset records the tunnel surface settlement of a subway construction in Zhuhai city, China. A comprehensive comparative study is performed, with MSE, RMSE, MAPE values evaluated for each machine learning technique. The overall result suggests that the SVR is most suitable for tunnel settlement forecasting based on the univariate real-world data, followed by BPNN and ELM.

The current work has the following limitations. First, the tunnel settlement dataset used in this study is relatively small, which makes DLNN methods, such as long short-term memory (LSTM) and gated recurrent unit (GRU), unsuitable for this study. As a result, three representative non-deep learning techniques, i.e., BPNN, SVR, and ELM, were selected to perform the simulations. More machine learning techniques should be tested in future studies. Second, the PSO method is employed to search for the optimal parameter combinations for the three machine learning methods. Other search algorithms, such as the genetic algorithm (GA), the ant colony algorithm, and the differential evolution (DE) algorithm, can be adopted and compared in future studies.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest regarding publishing this paper.

#### Acknowledgments

This work was supported by the Foundation of Zhejiang Provincial Department of Education (No. 1120KZ0416255) and the Foundation of Talent’s Start-Up Project in Zhejiang Gongshang University (1120XJ2116016). This work was also partially supported by National Natural Science Foundation of China (Nos. 61850410531, 61602431, and 61374094), Natural Science Foundation of Zhejiang Province (No. LY18F030025), and Shanghai Science and Technology Commission (16DZ1201704).

#### Supplementary Materials

Two real-world tunnel settlement datasets were employed for the study of tunnel settlement prediction based on various modern machine learning techniques. Both datasets were collected by a local China tunnel construction company with one of them measuring the tunnel surface settlement of the metro train line 3 construction in Ningbo city (section code: NBDT), China, and the other one measuring the tunnel surface settlement of a subway construction in Zhuhai city (section code: ZHSD), China. Over 700 ground surface sensors were utilized, measuring the overall settlement on each day during the tunnel construction period. The recording frequency is once per day and the total number of records for each surface point is around 100, depending on the particular construction progress conditions.

Individual attributes descriptions for “TunnelData.cvs”:

(1) Index.
(2) Point ID: each monitored point has an ID.
(3) Monitor_date: the date when the data was collected.
(4) This_time_change: relative movement (vertical) from the previous record of this point.
(5) All_time_change: total movement (vertical) of this point.
(6) Project_code: PC0001-PC0008 indicates 8 projects, where in this paper, we only use data with PC0001 (Ningbo) and PC0002 (Zhuhai).
(7) Section_code: PC0001 corresponds to NBDT, which stands for the tunnel in Ningbo city, China. PC0002 corresponds to ZHSD, which stands for the tunnel in Zhuhai city, China.
(8) Tunnel_code: one city may have multiple tunnels. These are the IDs for tunnels.
(9) Point_code: the code for point (another ID with alphabets).
(10) Point_type: types of points.
(11) Ring_number: the tunnel was constructed by inserting rings. These are the IDs for Rings.
(12) Depth: this indicates the depth of the center of ring.
(13) Relative_direction: the ring direction.
(14) Min_distance_axis: the minimum distance between the actual ring axis and the predefined ring axis.
(15) Alarm_rule_id_single: a predefined rule that is used to alarm when the min_distance_axis reaches a threshold.
(16) Alarm_rule_id_sum: sum the IDs of all offended rules.
(17) Comments.

*(Supplementary Materials)*
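A per-point settlement series could be extracted from such a file along the following lines. This is a hedged sketch: it assumes that the CSV header row uses exactly the attribute names listed in the description and that Monitor_date is stored in ISO format (YYYY-MM-DD), neither of which is confirmed by the description. The function name and the sample rows are hypothetical and do not come from the dataset.

```python
import csv
import io
from collections import defaultdict


def load_point_series(csv_file, project_code):
    """Return {point ID: [All_time_change values in date order]} for one project."""
    records = defaultdict(list)
    for row in csv.DictReader(csv_file):
        if row["Project_code"] == project_code:
            records[row["Point ID"]].append(
                (row["Monitor_date"], float(row["All_time_change"]))
            )
    # Sorting the (date, value) pairs orders each series by date, assuming
    # ISO-formatted dates so that string order equals chronological order.
    return {pid: [value for _, value in sorted(pairs)]
            for pid, pairs in records.items()}


# Made-up rows for illustration only.
sample = io.StringIO(
    "Point ID,Monitor_date,All_time_change,Project_code\n"
    "A,2000-01-02,-1.5,PC0002\n"
    "A,2000-01-01,-1.0,PC0002\n"
    "B,2000-01-01,0.3,PC0001\n"
)
zhuhai_series = load_point_series(sample, "PC0002")
```

Each resulting list is a univariate series of total vertical movement for one surface point, which is the form of input the forecasting models in this paper operate on.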