#### Abstract

Deformation prediction models are essential for evaluating the health status of concrete dams. Nevertheless, the application of the conventional multiple linear regression model has been limited by the particular structure, random loading, and strongly nonlinear deformation of concrete dams. Conversely, the artificial neural network (ANN) model shows good adaptability to complex and highly nonlinear behaviors. This paper evaluates the performance of multiple linear regression (MLR) and ANN models in characterizing concrete dam deformation under environmental loads. Four models, namely, the multiple linear regression (MLR), stepwise regression (SR), backpropagation (BP) neural network, and extreme learning machine (ELM) models, are employed to simulate dam deformation from two aspects, a single measurement point and multiple measurement points, using approximately 11 years of historical dam operation records. The results show that the prediction accuracy of the multipoint model was higher than that of the single point model for every model except the MLR model. Moreover, the prediction accuracy of the ELM model was always higher than that of the other three models. All discussions are conducted in conjunction with a gravity dam study.

#### 1. Introduction

Deformation models are an important component of dam safety systems, both for daily operation and for long-term behavior evaluation [1]. They are built to calculate the dam response under safe conditions for a given load combination, which is then compared with actual measurements of dam performance with the aim of detecting anomalies and preventing failures. Current predictive models for simulating dam deformation can be classified into three types: deterministic models, statistical models, and hybrid models [2], the last being a mixture of the first two.

Deterministic models, which are based on physical quantities and laws such as loads, material properties, and stress-strain relationships, are often used to design dams and remain in use throughout the life of concrete dams [3]. Anomalies in the operation of concrete dams can be fundamentally explained by deterministic models, whose parameters have specific physical meanings; however, uncertainties in geological conditions and in the material properties of the rock foundation and concrete hinder their implementation.

Statistical models are mathematical equations that quantitatively describe the variation law of dam monitoring values and are an abstraction and simplification of the actual working state of the dam [4]. Without regard to the specific physical mechanism of dam operation, the statistical model is essentially an empirical model based on dam measurement data collected over years. As the most widely used model for characterizing concrete dam deformation, the statistical model usually consists of three parts: a temperature component, a hydrostatic pressure component, and an aging component. The statistical model assumes that the components are completely independent, which may not match the actual situation [5]. At present, the most common statistical models are based on regression methods such as multiple linear regression (MLR) [6], stepwise regression (SR) [7], principal component regression (PCR) [8], and partial least squares regression (PLSR) [9]. By gradually screening the regression factors of MLR models, SR models obtain regression coefficients at a certain significance level, yielding results that are more accurate than those obtained by the general least squares method.

In recent years, more and more scholars have begun to apply machine learning algorithms or intelligent algorithms to dam safety effect prediction or dam safety diagnosis analysis [2, 10, 11]. Mata [12] found that ANN models can be a very powerful tool in evaluating dam behavior by comparing the multiple linear regression model with the multilayer perceptron model for the horizontal displacement of a concrete arch dam. Salazar [2] assessed the potential of some state-of-the-art machine learning techniques, including random forests, boosted regression trees, neural networks, and multivariate adaptive regression splines, to build models for the prediction of dam behavior in terms of displacement and leakage. Kao [13] studied the feasibility of ANN-based approaches for dam health monitoring and set an early warning threshold level for the Fei-Tsui dam based on the analysis results. However, training an artificial neural network by the gradient descent method is still relatively slow and easily gets stuck in local minima. Su [14] proposed a dam safety monitoring model based on the support vector machine, which could overcome the disadvantages of the above artificial neural network, but the selection of kernel function parameters is difficult. The extreme learning machine (ELM) is a new method for training single hidden layer feedforward neural networks proposed by Huang et al. [15, 16]. The extreme learning machine first randomly generates the hidden layer biases and the weights connecting the input layer to the hidden layer and then directly determines the weights between the hidden layer and the output layer by the Moore-Penrose generalized inverse method. While overcoming the shortcomings of the gradient descent learning method, the ELM greatly improves the learning speed of artificial neural networks and ensures good generalization ability. Kang [17] proposed an ELM-based gravity dam deformation prediction model and explored the application of the ELM algorithm from the perspective of prediction accuracy. However, the adaptability of the ELM model to the interpretation of dam deformation has not yet been elucidated.

The dam deformation monitoring model can be divided into two types: the single point deformation monitoring model and the multipoint deformation monitoring model. Current research mainly focuses on the single point deformation monitoring model, which cannot reflect the spatial distribution of deformation. The multipoint deformation monitoring model can better reflect the relationships among the deformation points of the dam body and is therefore more reasonable than the single point model. As a statically indeterminate shell structure, the concrete arch dam is strongly affected by its spatial integrity.

This paper studies the application characteristics and effects of the multiple linear regression (MLR), stepwise regression (SR), backpropagation (BP) neural network, and extreme learning machine (ELM) on concrete dam deformation modelling based on the monitoring data of the Dongjiang arch dam. The similarities and differences between the single point model and the one-dimensional multipoint model are discussed. The work focuses on prediction accuracy and the suitability for interpreting dam behavior. All discussions will be carried out in conjunction with the results of a gravity dam [17] study.

#### 2. Statistical Model

Statistical models are established from the correlation between observed effect quantities and environmental variables. With the environment treated as an independent variable, the structural response of the dam is affected by three effects: the reversible effect of the hydrostatic load, the reversible thermal influence of the temperature, and the irreversible term due to the evolution of the dam response over time [4, 18]. According to the influencing factors, the displacement of an arbitrary point in the direction can be expressed as

where represent the hydraulic displacement component, temperature displacement component, and aging displacement component, respectively. indicates the deformed surface of the dam fixed point under the action of water pressure (), temperature () and aging () alone, where is approximated by multiple power series, and

The hydraulic displacement component, temperature displacement component, and aging displacement component can be expressed as follows [19]:

where is the upstream and downstream water level difference; represents the period, represents the annual period, and represents the half-year period in (4); is the number of days since the initial date. is the average temperature from to days before the observation day; , is the measured date and is the initial date; are the regression coefficients.

More attention should be placed on the choice of two calculation methods for temperature displacement (see (4) and (5)). When the temperature data is complete and continuous, (4) is adopted to consider the influence of the actual temperature. When the temperature data is incomplete or discontinuous, (5) is used.

Substituting (2), (3), (4), (6) or (2), (3), (5), and (6) into (1), using Taylor series expansion, omitting high-order terms, and combining similar items, we can obtain the space-time distribution model of the fixed point in the direction, that is, the one-dimensional multipoint statistical deformation model.

When the coordinate of measuring point remains unchanged, a displacement statistical model of the single measuring point is obtained:

According to the reasons mentioned above, this paper chooses (7) and (9) to study the deformation monitoring model of concrete dams. Therefore, the input variable of the single point deformation prediction model is , the input variable of the one-dimensional multipoint deformation prediction model is and the output variable is the radial displacement of the measuring point.
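As an illustration of how the input variables of a single point model might be assembled, the following Python sketch builds a factor matrix from the water level difference and the day count. The factor set (powers of the water level difference up to the third order, annual and half-year harmonics, and the aging terms θ and ln θ with θ = t/100) is an assumption based on commonly used statistical dam models, and the function name `hst_features` is hypothetical; the factors actually used in (7) and (9) may differ.

```python
import numpy as np

def hst_features(h, t_days):
    """Assemble an assumed hydrostatic-seasonal-time factor matrix.

    h: water level difference for each observation.
    t_days: days elapsed since the initial date (must be > 0).
    """
    h = np.asarray(h, dtype=float)
    t = np.asarray(t_days, dtype=float)
    s = 2.0 * np.pi * t / 365.0   # phase of the annual cycle
    theta = t / 100.0             # assumed aging variable
    return np.column_stack([
        h, h**2, h**3,                  # hydrostatic factors (assumed order 3)
        np.sin(s), np.cos(s),           # annual period
        np.sin(2 * s), np.cos(2 * s),   # half-year period
        theta, np.log(theta),           # aging factors
    ])

# Synthetic example (not monitoring data)
t_demo = np.arange(1, 731)
h_demo = 120.0 + 10.0 * np.sin(2.0 * np.pi * t_demo / 365.0)
features_demo = hst_features(h_demo, t_demo)
```

Each row of `features_demo` corresponds to one observation date; a multipoint variant would additionally need the coordinate of each measuring point, which is not sketched here.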

#### 3. Methodology

##### 3.1. Multiple Linear Regression

Multiple linear regression (MLR) models are based on the linear correlation between dam effect quantities and environmental variables. When considering the relationship between the independent variables $x_1, x_2, \ldots, x_p$ and the dependent variable $y$, a regression equation is established: $y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i$, where $\beta_0, \beta_1, \ldots, \beta_p$ are the regression coefficients to be estimated; $i = 1, 2, \ldots, n$ ($n$ is the sample size); $\varepsilon_i$ is the random error [20].

Assuming that the random errors are normally distributed and independent of each other, the multiple linear regression equation is represented in matrix form: $Y = X\beta + \varepsilon$, where $Y$ is the vector of observations; $\beta$ is the parameter vector; $X$ is the constant matrix of independent variable values; $\varepsilon$ is the random error vector. There is a set of parameter estimates $\hat{\beta}$ such that the residual sum of squares is the smallest; that is, the system of normal equations $X^{T}X\hat{\beta} = X^{T}Y$ is solved. Therefore, the least squares estimate of the parameters is $\hat{\beta} = (X^{T}X)^{-1}X^{T}Y$, the fitted model is $\hat{Y} = X\hat{\beta}$, and the vector of residuals is denoted by $e = Y - \hat{Y}$. The ultimate goal of the overall model is to minimize the sum of the squared deviations between the model predictions and the observations.
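The least squares fit described in this section can be sketched in a few lines of Python with NumPy. This is a minimal illustration on synthetic data rather than the implementation used in the study; the helper names `fit_mlr` and `predict_mlr` are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept.

    X: (n, p) array of independent variables; y: (n,) observations.
    Returns the coefficient vector [b0, b1, ..., bp].
    """
    A = np.column_stack([np.ones(len(X)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes ||A beta - y||^2
    return beta

def predict_mlr(beta, X):
    return beta[0] + X @ beta[1:]

# Synthetic illustration (not dam data): y = 1 + 2*x1 - 3*x2 + small noise
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1] + 0.01 * rng.normal(size=200)
beta_demo = fit_mlr(X_demo, y_demo)
residuals_demo = y_demo - predict_mlr(beta_demo, X_demo)
```

`np.linalg.lstsq` is used instead of forming the inverse of the normal equation matrix explicitly, since it is numerically more robust when the factors are nearly collinear.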

##### 3.2. Stepwise Regression

For the MLR method, the more independent variables are included, the smaller the residual sum of squares, the better the fit of the regression equation, and the higher the prediction accuracy. In the optimal regression equation, it is always desirable to include as many independent variables as possible, especially those that have a significant influence on the dependent variable. Nonetheless, too many independent variables may also bring some disadvantages. Firstly, if more independent variables are required, many quantities must be measured and calculations are inconvenient. Secondly, if the regression equation includes an independent variable that has no effect or only a very small effect on the dependent variable, the residual sum of squares will not decrease, which affects the accuracy of the regression equation. Thirdly, the presence of independent variables that have no significant influence on the dependent variable affects the stability of the regression equation and reduces the prediction accuracy. Thus, in the optimal regression equation, it is desirable to exclude independent variables that have no significant effect on the dependent variable.

Stepwise regression (SR) is a method for selecting the independent variables of a linear regression model [21]. The basic idea is to introduce variables one by one, on the condition that the partial regression sum of squares of each introduced variable is found to be significant by testing. According to this principle, stepwise regression can be used to screen and eliminate the variables causing multicollinearity. The specific steps are as follows: first, regress the dependent variable on each candidate independent variable separately; then, starting from the regression equation of the independent variable that contributes the most to the dependent variable, gradually introduce the remaining independent variables. After stepwise regression, the independent variables finally retained in the model are both important and not heavily multicollinear. The effect of stepwise regression on the improvement of multiple linear regression is still controversial, which is also a focus of this paper.
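The variable screening idea can be illustrated with a forward selection sketch in Python. It is a simplified stand-in, not the procedure used in the study: it only adds variables (no backward elimination), and it uses a fixed partial F threshold `f_enter=4.0` as an assumed entry criterion instead of a significance level looked up from the F distribution. The function names are hypothetical.

```python
import numpy as np

def rss(A, y):
    """Residual sum of squares of the least squares fit of y on the columns of A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def forward_stepwise(X, y, f_enter=4.0):
    """Return the indices of the selected columns of X, in order of entry."""
    n, p = X.shape
    ones = np.ones((n, 1))
    selected = []
    current_rss = rss(ones, y)            # intercept-only model
    while len(selected) < p:
        best_j, best_f, best_rss = None, f_enter, None
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            new_rss = rss(np.hstack([ones, X[:, cols]]), y)
            dof = n - len(cols) - 1       # residual degrees of freedom
            if dof <= 0 or new_rss <= 0.0:
                continue
            # Partial F statistic for adding variable j to the current model
            f_stat = (current_rss - new_rss) / (new_rss / dof)
            if f_stat > best_f:
                best_j, best_f, best_rss = j, f_stat, new_rss
        if best_j is None:                # no candidate passes the threshold
            break
        selected.append(best_j)
        current_rss = best_rss
    return selected

# Synthetic illustration: only columns 0 and 3 influence y
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 5))
y_demo = 2.0 * X_demo[:, 0] + 5.0 * X_demo[:, 3] + 0.5 * rng.normal(size=300)
selected_demo = forward_stepwise(X_demo, y_demo)
```

In this synthetic case the variable with the largest contribution (column 3) enters first, followed by column 0; irrelevant columns enter only if they happen to exceed the threshold by chance.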

##### 3.3. Backpropagation Neural Network

Artificial neural networks are often divided into two categories: one is the recurrent network, which forms loops through feedback connections, and the other is the feedforward neural network [22], in which the network structure has no loops. The typical single hidden layer feedforward neural network structure is shown in Figure 1. Both the ELM and the BP neural network are feedforward neural networks; they differ only in their learning methods. The BP neural network learns by backpropagation with the gradient descent method, which requires repeated iterations to update the weights and thresholds, while the ELM randomly determines the input weights and thresholds and does not adjust them afterwards.

The traditional BP neural network adopts the error backpropagation algorithm, whose guiding idea is that the weights and thresholds should be adjusted along the direction of the negative gradient, along which the error function decreases fastest. The supervised BP neural network learning algorithm usually consists of three stages [10].

The first stage is to feed the data forward, and the computed output of the th node in the output layer is as follows:

where is the connective weight between nodes in the hidden layer and those in the output layer; represents the connective weight between the nodes in the input layer and those in the hidden layer; (or ) are bias terms that represent the threshold of the transfer function ; is the input to the th node in the input layer; , , and are the number of nodes in the input, hidden layer, and output layer, respectively.

The second stage is the backpropagation of the error. The learning process of error backpropagation is the process of propagating errors from the output layer to the input layer and correcting the corresponding network parameters. The goal of learning is to minimize or reduce the total error of the network.

The third and final stage is to adjust the weights and thresholds. In the standard BP algorithm, the training is performed using a gradient descent method with a learning rate and is defined as follows:

where ; is the parameter vector to be determined in the BP neural network. The weight is adjusted as follows:

where is the learning rate; the superscript refers to the learning iteration; and is the system error function. The convergence speed of the BP neural network depends on the learning rate. For computational efficiency, the Levenberg-Marquardt [23] algorithm is applied to obtain , , , and by minimizing the system error function.
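The three stages can be sketched for a single hidden layer network in Python. Note that this sketch uses plain batch gradient descent rather than the Levenberg-Marquardt algorithm applied in the study, and the network size, learning rate, and the name `train_bp` are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, y, n_hidden=8, lr=0.05, epochs=2000, seed=0):
    """Single hidden layer network (sigmoid hidden, linear output), batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))  # input -> hidden weights
    b1 = np.zeros(n_hidden)                                  # hidden thresholds
    W2 = rng.normal(scale=0.5, size=n_hidden)                # hidden -> output weights
    b2 = 0.0                                                 # output threshold
    n = len(y)
    mse_history = []
    for _ in range(epochs):
        # Stage 1: feed the data forward
        H = sigmoid(X @ W1 + b1)
        out = H @ W2 + b2
        err = out - y
        mse_history.append(float(np.mean(err ** 2)))
        # Stage 2: backpropagate the error (gradient of 0.5 * mean squared error)
        g_out = err / n
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum()
        g_hid = np.outer(g_out, W2) * H * (1.0 - H)
        g_W1 = X.T @ g_hid
        g_b1 = g_hid.sum(axis=0)
        # Stage 3: adjust weights and thresholds along the negative gradient
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), mse_history

# Synthetic illustration: fit y = sin(x) on [-2, 2]
x_demo = np.linspace(-2.0, 2.0, 100).reshape(-1, 1)
y_demo = np.sin(x_demo[:, 0])
params_demo, mse_history_demo = train_bp(x_demo, y_demo)
```

The recorded `mse_history_demo` makes the iterative nature of the method visible: the error decreases gradually over many epochs, which is the cost that the ELM avoids.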

##### 3.4. Extreme Learning Machine

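The extreme learning machine is a learning algorithm for single hidden layer feedforward neural networks. Suppose there are $N$ arbitrary samples $(x_j, t_j)$, where $x_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^{T} \in \mathbb{R}^{n}$ and $t_j = [t_{j1}, t_{j2}, \ldots, t_{jm}]^{T} \in \mathbb{R}^{m}$. The output of a standard single hidden layer neural network with $L$ hidden nodes can be mathematically described as follows:

$$\sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, 2, \ldots, N,$$

where $o_j$ is the output vector relative to the input $x_j$; $g(\cdot)$ is the activation function; $w_i$ is the input weight vector; $\beta_i$ is the output weight vector; $b_i$ is the offset of the $i$th hidden node; $w_i \cdot x_j$ is the inner product of $w_i$ and $x_j$.

The learning goal of the single hidden layer neural network is to minimize the output errors, which can be expressed as follows:

$$\sum_{j=1}^{N} \lVert o_j - t_j \rVert = 0.$$

That is, there exist specific $\beta_i$, $w_i$, and $b_i$ such that

$$\sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, 2, \ldots, N. \tag{17}$$

Equation (17) can be simplified as

$$H\beta = T,$$

where $H$ is the hidden layer output matrix with entries $H_{ji} = g(w_i \cdot x_j + b_i)$, $\beta = [\beta_1, \ldots, \beta_L]^{T}$ is the output weight matrix, and $T = [t_1, \ldots, t_N]^{T}$ is the target matrix of training samples.

There are $\hat{w}_i$, $\hat{b}_i$, and $\hat{\beta}$ such that

$$\lVert H(\hat{w}_i, \hat{b}_i)\hat{\beta} - T \rVert = \min_{w_i, b_i, \beta} \lVert H(w_i, b_i)\beta - T \rVert, \tag{21}$$

where $i = 1, 2, \ldots, L$; (21) is equivalent to the following minimization loss function:

$$E = \sum_{j=1}^{N} \left( \sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) - t_j \right)^{2}.$$

Conventional gradient-based learning algorithms require adjustment of all parameters over multiple iterations. In the ELM algorithm, once the input weights $w_i$ and the hidden layer offsets $b_i$ are randomly determined, the output matrix $H$ of the hidden layer is uniquely determined [15]. Training the single hidden layer neural network is thereby transformed into solving the linear system $H\beta = T$, and the output weight can be determined as

$$\hat{\beta} = H^{\dagger} T,$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$.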
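The ELM training procedure is compact enough to sketch directly in Python. The following is a minimal illustration on synthetic data, assuming a sigmoid activation and input weights drawn uniformly from [-1, 1] (the sampling range is an assumption); the names `train_elm` and `predict_elm` are hypothetical.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden layer output matrix H for a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, T, n_hidden=20, seed=0):
    rng = np.random.default_rng(seed)
    # Input weights and hidden offsets are drawn once and never adjusted
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = hidden_output(X, W, b)
    # Output weights from the Moore-Penrose generalized inverse of H
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return hidden_output(X, W, b) @ beta

# Synthetic illustration: fit t = sin(x) on [-2, 2]
x_demo = np.linspace(-2.0, 2.0, 200).reshape(-1, 1)
t_demo = np.sin(x_demo[:, 0])
elm_demo = train_elm(x_demo, t_demo)
elm_mse_demo = float(np.mean((predict_elm(elm_demo, x_demo) - t_demo) ** 2))
```

No iteration is involved: the only training cost is one pseudoinverse, which is why the ELM is typically much faster to train than a gradient-based network of the same size.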

#### 4. Data and Processing

##### 4.1. Case Dam and Operation Data

The Dongjiang (DJ) dam is located in Zixing City, Hunan Province, China. It is a variable-center double-curvature concrete arch dam with a maximum height of 157 m and a center arc length of 438 m. Its left and right banks are basically symmetrical. The designed normal storage level is 285 m above sea level, corresponding to a storage capacity of 8.12 billion m^{3}. The first impoundment began in 1986, and the dam has been in operation for more than 20 years. The DJ dam is equipped with monitoring items such as deformation and seepage. The deformation monitoring items include forward intersection, inverted perpendicular, vertical displacement, and cross-river length monitoring systems. The layout of the dam vertical monitoring system is shown in Figure 3. There are 5 sets of vertical lines, namely, L1, L3, L5, L7, and L9, and each vertical line has a vertical reversal line. The vertical lines are monitored both manually and automatically.

The observation data of the measuring point L5-205Z from February 2003 to December 2013 are the basis of this study. The data include temperature, upstream and downstream water levels, and the radial deformation of the measuring point. The position of the L5-205Z measuring point is shown in Figure 2.

Kang [17] has initially discussed the application of the ELM in deformation prediction of the Fengman (FM) concrete gravity dam, which is presented here as a reference for this research. Table 1 shows the basic parameters of the two dams, and Figure 3 shows a cross-sectional view of the two dams.

Figure 4 shows the time history curves of the environmental variables and the corresponding radial horizontal displacement of the L5-205Z measuring point; the vertical dotted line marks the division between the training and prediction sets. In Figure 4, the sign (+) denotes displacement towards the upstream, and the sign (-) denotes displacement towards the downstream. The radial displacement of the measuring point changes periodically with temperature: when the temperature rises, the measuring point deforms upstream, and when the temperature drops, the measuring point deforms downstream. The change in displacement lags significantly behind the change in temperature. The highest temperature generally occurs between July and October, and the lowest temperature occurs between January and March. However, the displacement generally reaches a maximum value (maximum to the downstream) from April to June and a minimum value (maximum to the upstream) from October to December. The radial displacement of the L5-205Z measuring point accords with the general law of arch dam deformation, which indirectly supports the validity of the data samples.

As shown in Figure 5 [17], the water level of the FM dam also changes periodically with the season. Unlike the DJ dam, the FM dam shows hardly any hysteresis of the displacement change with respect to the temperature change. The correlation among water level changes, temperature changes, and time may affect the prediction accuracy of conventional linear equations.

##### 4.2. Parameter Settings

For the BP neural network, the weights and thresholds were obtained with the widely used Levenberg-Marquardt algorithm. The optimal number of hidden neurons and the optimal learning rate were determined by trial and error, while the transfer functions of the hidden layer and the output layer were the sigmoid function and the linear function, respectively. The number of training epochs was set to 10^{3}, and the training goal for the MSE was set to 10^{−3}. The activation function of the ELM model was also a sigmoid function. Compared with the BP model, the ELM model only needs the number of hidden nodes to be determined to obtain satisfactory results.

Because the weights are randomly initialized, separate runs can produce different results. In order to enhance the reliability of the calculation results, both the ELM and the BP neural network are trained 20 times, and the run with a small difference between the MSE of the training set and that of the prediction set is taken as the final result. Kang [17] used the average of 5 calculations as the final result; the present authors consider that 5 runs may not be sufficient to demonstrate the reliability of the results.
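The run selection rule can be sketched as follows in Python, here applied to an ELM on synthetic data. Choosing the run with the smallest absolute gap between training and prediction MSE is one reading of the rule stated above, and the compact ELM helper, the data, and all names are illustrative assumptions rather than the code used in the study.

```python
import numpy as np

def elm_predict_pair(X_tr, y_tr, X_te, n_hidden, rng):
    """Train one randomly initialized ELM and return train and test predictions."""
    W = rng.uniform(-1.0, 1.0, size=(X_tr.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)

    def hidden(X):
        return 1.0 / (1.0 + np.exp(-(X @ W + b)))

    beta = np.linalg.pinv(hidden(X_tr)) @ y_tr
    return hidden(X_tr) @ beta, hidden(X_te) @ beta

def select_run(X_tr, y_tr, X_te, y_te, n_hidden=17, n_runs=20, seed=0):
    """Repeat training n_runs times; keep the run with the smallest train/test MSE gap."""
    rng = np.random.default_rng(seed)
    best = None
    for run in range(n_runs):
        p_tr, p_te = elm_predict_pair(X_tr, y_tr, X_te, n_hidden, rng)
        mse_tr = float(np.mean((p_tr - y_tr) ** 2))
        mse_te = float(np.mean((p_te - y_te) ** 2))
        gap = abs(mse_tr - mse_te)
        if best is None or gap < best["gap"]:
            best = {"run": run, "mse_train": mse_tr, "mse_test": mse_te, "gap": gap}
    return best

# Synthetic illustration (not monitoring data)
rng_demo = np.random.default_rng(1)
X_all = rng_demo.uniform(-1.0, 1.0, size=(300, 3))
y_all = np.sin(X_all[:, 0]) + X_all[:, 1] * X_all[:, 2] + 0.05 * rng_demo.normal(size=300)
best_demo = select_run(X_all[:240], y_all[:240], X_all[240:], y_all[240:])
```

A small gap alone does not guarantee a small error, so in practice the selected run's `mse_train` and `mse_test` should also be inspected.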

Figures 6 and 7 show the effect of the number of hidden layer neurons on the predictive performance of the ANN models for the DJ dam. For the single point deformation monitoring model, the training and prediction errors of the BP model are relatively small when the number of hidden nodes is 16, and those of the ELM model are relatively small when the number of hidden nodes is 17. For the FM dam, Kang [17] determined that the optimal number of hidden nodes is 15 for the BP neural network model and 22 for the ELM model. For the multipoint deformation monitoring model, the numbers of hidden nodes in the BP network and ELM models are set to 15 and 14, respectively.

Figure 6: (a) Back propagation; (b) Extreme learning machine.

Figure 7: (a) Back propagation; (b) Extreme learning machine.

##### 4.3. Performance Evaluation

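It is important to appropriately estimate the prediction error of a model, since (a) it provides insight into its accuracy, (b) it allows comparison of different models, and (c) it is used to define warning thresholds [24, 25]. In order to facilitate the analysis of the final calculation results, different performance evaluation functions are adopted in this paper, that is, the mean absolute error (MAE), mean square error (MSE), maximum absolute error (S), and correlation coefficient (R), as shown below [17]:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|,$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2},$$

$$S = \max_{1 \le i \le n} \left| y_i - \hat{y}_i \right|,$$

$$R = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^{2} \sum_{i=1}^{n} (y_i - \bar{y})^{2}}},$$

where $\hat{y}_i$ and $\bar{\hat{y}}$ are simulation values and simulation averages; $y_i$ and $\bar{y}$ are observed values and observed average values, respectively; $n$ is the number of measured samples.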
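The four evaluation indices can be computed with a short Python helper. This is a sketch assuming the standard definitions of the mean absolute error, mean square error, maximum absolute error, and Pearson correlation coefficient; the function name `evaluate` and the sample values are hypothetical.

```python
import numpy as np

def evaluate(observed, simulated):
    """Return MAE, MSE, maximum absolute error S, and correlation coefficient R."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    err = sim - obs
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": float(np.mean(err ** 2)),
        "S": float(np.max(np.abs(err))),
        "R": float(np.corrcoef(obs, sim)[0, 1]),
    }

# Small hand-checkable example (values are illustrative only)
metrics_demo = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

For this example the errors are 0.5, 0, -0.5, and 0, so MAE is 0.25, MSE is 0.125, and S is 0.5.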

To estimate the uncertainty associated with model simulations, the residuals of predicting sets are computed and analyzed [26]. The independence analysis, heteroscedastic analysis, and normality analysis of residuals are performed by plotting graphs of residual autocorrelation, residual variation relative to observed values, and residual probability distributions. If the residual sequence is autocorrelated, then the model fails to fully explain the variation rule of the variable. On the other hand, low residual heteroscedasticity and a close approximation to the normal distribution indicate the model is closer to unbiased estimation and has low uncertainty. In this paper, the standardized residual of the model is shown in where represents observed values, represents predicted values, represents standardized residuals, and *σ* represents the standard deviation. , and represents residuals.
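The residual checks described above can be sketched in Python. The standardization used here (residual divided by the sample standard deviation of the residuals) and the approximate 95% bound of 1.96/√n for the autocorrelation function are common conventions assumed for illustration; they are not necessarily the exact definitions used in the study, and the data are synthetic.

```python
import numpy as np

def standardized_residuals(observed, predicted):
    """Residuals (observed - predicted) scaled by their sample standard deviation."""
    e = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return e / e.std(ddof=1)

def acf(x, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = float(x @ x)
    return np.array([float(x[: len(x) - k] @ x[k:]) / denom for k in range(max_lag + 1)])

# Synthetic illustration: predictions with independent errors
rng = np.random.default_rng(0)
obs_demo = rng.normal(size=400)
pred_demo = obs_demo + 0.3 * rng.normal(size=400)
z_demo = standardized_residuals(obs_demo, pred_demo)
acf_demo = acf(z_demo, max_lag=20)
bound_demo = 1.96 / np.sqrt(len(z_demo))   # approximate 95% bound for white noise
share_inside_demo = float(np.mean(np.abs(acf_demo[1:]) <= bound_demo))
```

If most autocorrelation values at nonzero lags fall inside the bound, the residuals show little autocorrelation; plotting `z_demo` against the observed values and as a histogram covers the heteroscedasticity and normality checks.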

#### 5. Results and Discussion

##### 5.1. Comparison of Simulation Results

In this section, the observed DJ dam deformations are compared with the simulated results of the four models, i.e., the MLR, SR, BP neural network, and ELM models. The calculation results of the four models for both the DJ arch dam and the FM gravity dam are shown in Tables 2-3 and Figures 8–10. In the single point deformation monitoring model, the best MAE, MSE, S, and R values are obtained by the ELM models for both the DJ arch dam and the FM gravity dam. The best results are highlighted in bold. According to the comparison among the models, the accuracy ranking for the DJ dam is ELM > BP neural network > SR > MLR, while for the FM dam the ranking is ELM > BP neural network > MLR > SR. Stepwise regression therefore does not always improve on multiple linear regression, and its effect should be analyzed case by case.

It can be seen from Figure 8 that, except for the multipoint MLR model, the fitted and predicted displacements of all the models are consistent with the trend of the measured displacement. In June 2004, all models showed similarly large errors. The reason is that the high temperature generated by the high voltage line affected the measurement accuracy.

With respect to the multipoint deformation monitoring model, the best MAE, MSE, S, and R values are also obtained by the ELM models for the DJ arch dam. The accuracy ranking is the same as for the single point deformation monitoring model, namely, ELM > BP neural network > SR > MLR. Nonetheless, in the multipoint deformation monitoring model, the prediction accuracy of the MLR model drops sharply due to the inclusion of too many redundant independent variables. The SR, BP, and ELM models perform better in multipoint deformation modelling and achieve higher prediction accuracy, which indicates that the multipoint model is more reasonable than the single point model.

In addition to the simulation accuracy, the calculation speed is also an important index of model performance. In this paper, the time consumption is used as an evaluation index to compare the calculation speed of the four models. In general, the time consumption is ranked as BP neural network > ELM > SR > MLR among the models and as multipoint deformation model > single point deformation model among the measuring point configurations. It should be noted that the BP neural network is the longest-running model and that the time consumption of the ELM model is significantly lower than that of the BP neural network. The FM dam results in Table 2 are taken from Kang et al. [17].

##### 5.2. Residuals Analysis

As shown in Figures 11-12, to evaluate the uncertainty of the models, residual analysis is performed on the results of the four models. In the single point deformation monitoring models, almost no autocorrelation of the standardized residuals is found in any of the four models, and the ACF lies mainly within the 95% confidence interval (Figures 11(a)–11(d)). Figures 11(e)–11(h) show the scatter of the standardized residuals as a function of the observed deformations. Except for the ELM model, the standardized residuals do not appear to be randomly distributed over the deformation interval; those of the other three models show a decreasing trend with increasing deformation. Figures 11(i)–11(l) display the probability density distribution of the standardized residuals for all four models. The probability density curve for all four models is unimodal if the influence of two abnormal points is disregarded, and the standardized residuals are mainly distributed between −2 and 2 (Figures 11(e)–11(h)). The two abnormal points were caused by a measurement anomaly in December 2013, when the vertical line was being overhauled.

In the multipoint deformation monitoring models, almost no autocorrelation of the standardized residuals is found in the SR, BP, and ELM models, whereas the standardized residuals of the MLR model show remarkable autocorrelation (Figures 12(a)–12(d)). The standardized residuals of the MLR model exhibit heteroscedasticity as the observed deformation changes. Compared with the MLR, SR, and BP neural network models, the distribution of the standardized residuals against the observed deformation for the ELM model is relatively uniform (Figures 12(e)–12(h)). The probability density of the standardized residuals for the MLR model is multimodal, with four peaks (one high and three low) at −1.4, 0.4, 1, and 2.2, respectively (Figure 12(i)). The standardized residuals of the ELM model present a unimodal distribution with a sharp peak if the influence of two abnormal points is disregarded and are mainly distributed between −1.5 and 1.2 (Figure 12(l)).

#### 6. Conclusion

This paper investigated the usefulness of two traditional multiple regression models (MLR and SR) and two artificial neural network models (ELM and BP neural network) in predicting dam deformation. All four models presented here have the advantages of simple operation and fast application, which increases the confidence in using them.

The artificial neural networks (ELM and BP) can significantly improve on the accuracy of conventional statistical methods (MLR and SR) for predicting the behavior of concrete dams and have good adaptability and generalization ability for deformation prediction of concrete dams. Compared with the BP model, the ELM model has fewer adjustment parameters, faster learning, and higher efficiency. If there is a high accuracy requirement for concrete dam deformation prediction, the ELM model would be optimal.

The one-dimensional multipoint deformation monitoring model can reflect the deformation distribution in the one-dimensional direction of the arch dam, with clear physical concepts and spatial characteristics. Compared with the single point model, it has better anti-interference ability and higher prediction accuracy. In general, for the single point deformation monitoring model, the four models considered in this paper can meet engineering needs. Nonetheless, artificial neural networks are a better choice when the interaction of measuring points is considered. Among them, the ELM model can effectively solve the time consumption problem associated with the BP neural network, and it outperforms the other three models in simulating dam deformation.

Artificial neural network-based models are more suitable for reproducing nonlinear effects and complex interactions between input variables and dam responses. Nonetheless, determining the number of hidden nodes is a key difficulty that artificial neural networks cannot easily avoid. In order to overcome the error caused by randomness and improve the generalization ability of the ELM-based model, evolutionary algorithms, such as the artificial bee colony (ABC) algorithm [27] or the particle swarm algorithm, can be used to optimize the ELM model, which is the next research goal.

#### Data Availability

(1) The initial observation data of Dongjiang dam used to support the findings of this study were supplied by Hunan Electric Power Company Science Research Institute under license and so cannot be made freely available. Requests for access to these data should be made to Tianhaiping, [email protected]. (2) The calculated data used to support the findings of this study are included within the article.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.