#### Abstract

Conventional statistical models provide inaccurate predictions of concrete crack openings because they do not consider the nonlinear temperature response and the residual characteristics of concrete. To address this problem, this study introduces a nonlinear temperature factor and develops an improved statistical model of crack openings. The chaotic characteristics of residual time series of the improved statistical model are analyzed based on chaos theory and phase-space reconstruction theory. These theories are integrated with back-propagation (BP) artificial neural networks and genetic algorithms (GAs) to establish a GA-BP neural network model for predicting residuals. Finally, a hybrid model is developed for predicting the concrete crack opening behavior. The predictions of the conventional statistical model, the statistical model considering nonlinear temperature component, and the hybrid model are compared using the case study on the crack openings of a regulating sluice. The results show that the proposed hybrid model in this study for predicting concrete crack openings is significantly more accurate than the conventional statistical model and the statistical model considering nonlinear temperature component.

#### 1. Introduction

Cracks are a common structural defect in hydraulic concrete. The operational behavior of macroscopic cracks in hydraulic concrete structures is directly related to the safety of the structure and is an important indicator of structural stability. To determine the operational behavior of concrete crack openings, the momentum misplacement is often monitored in real time by installing a crack meter group. Given the conditions of practical projects, concrete cracks are affected by various complex factors, including water pressure, temperature, and time. Hence, it is vital to establish an accurate mathematical model to aid in monitoring concrete crack openings and to quantitatively analyze, evaluate, and predict their behavior.

Concrete cracks fall into the category of concrete deformation. At present, mathematical models based on conventional statistical models of concrete deformations are widely used in engineering monitoring and parameter inversion [1]. Such models usually include temperature component, hydraulic component, and aging component. Experiments demonstrate that temperature is crucial to control concrete deformation [2, 3] and that variations in temperature are an important factor in determining the opening and closing of concrete cracks [4]. The temperature component is thus considerably sensitive to the temperature variations [1]. Although the hydraulic and aging components of conventional statistical deformation models generally fit well to experimental results, the temperature component is underfitted because conventional statistical models [5, 6] generally consider only the linear response of the system to temperature but neglect the complex nonlinear relationship between the actual affected variables and environmental factors in practical engineering. Previous studies show that variations in temperature are related nonlinearly to concrete deformation, and including such nonlinearities in prediction models should provide more accurate results than conventional models [7, 8].

To further improve the accuracy of statistical models, chaos theory has recently been harnessed to analyze and predict the residual series of statistical models, effectively establishing a hybrid model that provides more accurate predictions than the statistical models. For instance, Wei et al. [9] established a back-propagation autoregressive integrated moving-average model based on the wavelet analysis that directly predicts chaotic residual time series. The nonlinear method predicts the residual, which is generated by a phase-space reconstruction, with the result being a hybrid model based on a conventional statistical model [10, 11]. The results show that hybrid models based on residual prediction are more accurate than conventional models. Bao and Wu [10] improved the mathematical model for monitoring concrete cracks by combining chaos theory with a conventional statistical model to establish a hybrid model of concrete crack openings. The results indicate that the stress field at the crack tip generates singularities when exposed to water pressure and temperature variations, which irreversibly deforms the concrete at the crack tip. Hence, crack openings also involve an irrecoverable part caused by nonlinear factors, suggesting that conventional statistical models cannot predict the deformation behavior as a function of water pressure and temperature. This part of the deformation contains significant information about the crack evolution process, which is usually described in terms of a residual series, that is, a chaotic residual time series. Hence, it is necessary to analyze the residual time series of statistical models of concrete crack openings.

At present, there are some reports on the prediction methods of time series [12]. The studies show that methods based on phase-space reconstruction are commonly used to predict chaotic time series. The essence of this approach is to transform the extension of time series into the interpolation of phase space. Based on the Takens embedding theorem [13], the optimized delay time and minimum embedding dimension are determined according to the actual time series. The one-dimensional (1D) time series can then be reconstructed into a phase space with the same topological meaning as the motive power system. A reasonable selection of the delay time and embedding dimension then allows reconstruction of the phase space with satisfactory adaptability, thereby producing the internal nonlinear mapping. By exploiting the properties of the Lyapunov exponent, this approach can recognize and make short-term predictions of chaotic time series [14, 15]. Currently, the methods for determining optimized delay time and minimum embedding dimension depend on the amount of data they are fed, the amount of computation, the anti-noise performance, and the objectivity with which the parameters are selected. The methods available for selecting optimized delay time typically include the autocorrelation coefficient method [16], the mutual information method [17], and the average displacement (AD) method [18], and the methods used to calculate the minimum embedding dimension typically include the geometric invariant method [14], the pseudo-neighbor method [19], and the Cao method [20].

Casdagli [21] proposed the basic principle for predicting chaotic time series, which was used to develop prediction methods involving artificial neural networks [9, 22, 23]. BP neural networks are suitable for nonlinear mapping, making them good candidates for predicting chaotic time series. However, single neural networks converge slowly, resulting in local minima and overfitting. Ding et al. [24] proposed GA-BP neural networks and demonstrated that genetic algorithms (GAs) provide obvious improvements over BP neural networks. Hence, phase-space reconstruction combined with GA-BP neural networks is expected to accurately predict chaotic time series.

In this study, the conventional statistical model is firstly improved by considering the nonlinear temperature factor. Further, the improved statistical model is combined with the GA-BP neural network model using chaos theory and phase-space reconstruction theory, to develop a hybrid model for predicting concrete crack openings. Finally, a detailed case study on the crack openings of a regulating sluice is presented to illustrate the application of the hybrid model.

#### 2. Improved Statistical Model for Concrete Crack Openings

Hydraulic concrete structures with macroscopic cracks are affected by water pressure, temperature, and time. It is illustrated in Figure 1.

Consider a hydraulic concrete structure with *K* measuring points near a macroscopic crack. A conventional statistical model describing concrete crack openings (denoted as model I) is as follows:where is the measured crack opening, is the temperature component, is the hydraulic component, is the aging component, is the residual, *T*_{i} is the measured temperature of the *i*th measuring point, *K* is the number of temperature-measuring points, *H* is the water depth, *θ* = *t*/100, *t* is time, and *a*_{0}, *a*_{i}, *b*_{i}, *c*_{1}, and *c*_{2} are all regression coefficients.

Wang et al. [5] show that the temperature component in equation (1) does not consider a nonlinear temperature response. Since the concrete openings as a function of temperature is a complex nonlinear process, ignoring this nonlinearity will lead to underfitting. In contrast, including the environmental variables leading to nonlinearity alleviates this underfitting. Hence, we propose an improved statistical model of concrete crack openings, which is called statistical model considering nonlinear temperature component (denoted as model II) that accounts for the nonlinear temperature response [6–8]:where *L* represents the order of the temperature polynomial reflecting the influence of the nonlinear temperature response (*L* is generally a positive integer between 1 and 4 and is determined by fitting to measured values); *a*_{ip} is the regression coefficient corresponding to the *p*th power of the measured temperature at the *i*th measurement point, and the other symbols have the same meaning as in equation (1). The regression coefficients of the statistical model are determined using regression analysis or optimization algorithms. The residual time series is obtained by separating the measured data of the concrete crack openings.

#### 3. Prediction of Residual Based on Chaos Theory

Under the action of water pressure and temperature variation, the stress field at the crack tip manifests singularity, which makes the concrete at the crack tip produce irreversible deformation. That is, the crack opening contains an irreversible part. Since the aging component cannot reflect the irreversible deformation accurately because of the limitations of its simple functions, the irreversible deformation is retained in the residual. In particular, when the crack opening suddenly changes due to overloading or drastic temperature variation, the hydraulic component, temperature component, and aging component of statistical model can hardly reflect this mutation of crack opening. In other words, the residual sequence contains a lot of information about the evolution of cracks [10] and some studies showed [9, 14] that this residual sequence contains chaotic characteristics. Therefore, chaos theory is used to analyze and predict the residual.

##### 3.1. Phase-Space Reconstruction of Residual Sequence

Previous studies [10, 13] have shown that a 1D residual time series *x*(*t*) = {*x*(*t*_{i})|*i* = 1, 2, 3, …, *N*} can be reconstructed into the following form in *m*-dimensional phase space *R*^{m}:where *M* = *N* − (*m* − 1)*τ* is the number of phase points, *m* > 0 is the integer embedding dimension, and *τ* is the delay time, which is usually a positive integer multiple of the sampling interval of the residual time series *x*(*t*).

The first-phase point in equation (3) can be expressed as follows:

According to equations (3) and (4), the key of phase-space reconstruction lies in the determination of the delay time *τ* and the embedding dimension *m*.

###### 3.1.1. Determination of Delay Time

Selecting the optimized delay time *t*_{0} usually involves the autocorrelation coefficient method [16], the mutual information method [17], or the AD method [18]. The autocorrelation coefficient method is not suitable for nonlinear systems [25] and imposes a subjective selection of the method to calculate the descent coefficient. Although the mutual information method overcomes the shortcomings of the autocorrelation coefficient method, its calculation is relatively complicated. Hence, this study uses the AD method [18] to select the optimized delay time; it is an efficient calculation and avoids information redundancy and the complete irrelevance of delay coordinates caused by improper delay time. Under the 2-norm, the average displacement *S*_{2}(*m*, *τ*) is defined as follows:

When given different embedding dimensions (≥2), *S*_{2}(*m*, *τ*) tends to be stable with increasing *τ*. Rosenstein et al. [18] suggested that optimized delay time can be determined according to the *S*_{2}(*m*, *τ*) vs. *τ* graph under different embedding dimension *m*; that is, when the slope of *S*_{2}(*m*, *τ*) vs. *τ* decays to 40% of the initial slope, the corresponding *τ* is the optimized delay time. Research shows that this method is reliable for small, noisy datasets and that the computation time is short.

###### 3.1.2. Determination of Embedding Dimension

Methods to calculate the minimum embedding dimension *m*_{e} include the geometric invariant method, the pseudo-neighbor method, and the Cao method. However, the geometric invariant method requires a noise-free time series, which is barely possible in practical applications. Cao [20] proposed the pseudo-neighbor point method, but this method is very subjective when selecting the decision threshold. They also proposed a practical method (called the “Cao method”) for determining the minimum embedding dimension. This method not only is robust against noisy data but also reduces the interference of subjective factors and adverse effects caused by slight reductions in data size. For this reason, this study adopts the Cao method to calculate the minimum embedding dimension.

When an optimized time delay is given, the following parameters are defined in *m*-dimensional phase space:

*a*(*i*,*m*) is defined as follows:where is the distance defined under the infinite norm, and is the nearest point to the phase point *X*_{i}(*m*), which means the phase point at the minimum Euclidean distance from *X*_{i}(*m*).

The mean value of *E*(*m*) of *a*(*i*,*m*) is as follows:

When the embedding dimension is incremented from *m* to *m* + 1, *E*1(*m*) is as follows:

Cao et al. [20] report that, given a fixed-point attractor in the residual time series, *E*1(*m*) would stop changing once the embedding dimension *m* exceeds a certain value *m*_{0}. In this case, *m*_{0} + 1 is the minimum embedding dimension, which may be determined by a graphical analysis of *E*1(*m*) vs. *m*.

In actual calculations, it is difficult to determine whether *E*1(*m*) increases slowly or remains constant as *m* increases. For a group of random residual series, *E*1(*m*) increases with increasing *m* and converges for a deterministic residual series *E*1(*m*). Hence, we must introduce *E*2(*m*) to distinguish random residual series from deterministic residual series. *E*2(*m*) is given as follows:where

For random residual series, *E*2(*m*) = 1 for any *m*; for deterministic residual series, *E*2(*m*) is a nonconstant function of *m*; that is, some *m* must exist so that *E*2(*m*) ≠ 1. Therefore, Cao et al. [20] suggested that *E*1(*m*) and *E*2(*m*) be calculated at the same time to determine the minimum embedding dimension and distinguish between random residual series and deterministic residual series. For notational convenience, *E*1(*m*) and *E*2(*m*) are denoted as *E*1 and *E*2, respectively.

###### 3.1.3. Identification of Chaotic Characteristics

The maximum Lyapunov exponent *λ*_{1} is of important significance for determining the chaotic characteristics contained in residual time series. For phase space, *λ*_{1} > 0 indicates that the system is chaotic. Rosenstein et al. [26] analyzed common methods of calculating the maximum Lyapunov exponent. For problems of unreliable calculation, intensive computation, and difficult-to-implement existing methods for small datasets, a practical method (called the “Rosenstein method” and also known as the “small-dataset method”) was proposed for small datasets. The advantage of this method is that it is robust against the reconstruction dimension, time delay, and noise, thereby producing a more accurate maximum Lyapunov exponent. According to the theory of chaotic dynamics, the reciprocal of the maximum Lyapunov exponent is the maximum prediction time and can be used as a reliability indicator of a short-term prediction [27].

When adopting the Rosenstein method to calculate the maximum Lyapunov exponent, *dj*(*k*) is used in the phase space reconstructed by the optimized delay time and minimum embedding dimension to express the Euclidean distance between phase point *j*, , and its initial nearest point (the meaning is the same as the nearest point in (5)) at moment *k*Δt:where Δ*t* is the sampling interval and is usually taken as Δ*t* = 1 for calculational convenience.

The calculation result for *d*_{j}(*k*) is expressed in matrix form as follows:

*d*_{j}(*k*) is expressed as [26].

Taking the logarithm of both sides of equation (13) gives the following:where *C*_{j} is the initial bifurcation and its value determined for *j* is constant. *y*(*k*) is defined as follows:where is the average over all *j*, that is, the average of the natural logarithm of the nonzero elements in row *k* of the matrix in equation (11).

In this case, we take the part of *y*(*k*) that is close to linear with respect to *k* according to equations (13) and (14) and use the least-squares method to obtain a linear fit. The slope of the line is *λ*_{1}, that is, the maximum Lyapunov exponent.

##### 3.2. Predicted Value of Residual Based on GA-BP Neural Network

In the phase-space reconstruction of the residual time series, the “trajectory” in the reconstructed phase space is “dynamically equivalent” to the original system due to the concept of differential homeomorphism. Therefore, there is a unique mapping relation:where *P*_{t} is the prediction time, and the mapping relation *F* can be calculated by approximating all phase points in phase space. In this study, the nonlinear mapping relation is constructed using artificial neural networks with the whole domain method. The GA is a heuristic random evolutionary search algorithm, which has global search ability and can solve the error function without gradient information. Previous studies demonstrated that the GA-BP neural networks established by GAs provide better fits to nonlinear functions and more accurate short-term predictions of typical chaotic time series, even when compared with the predictions of BP neural networks [23, 28]. Therefore, this study uses GAs to optimize BP neural networks and thereby improves simple BP neural networks to attain the global maximum.

Setting the fitness function is quite important with GAs. In this study, we use the fitness function [29]as follows:which shows that a large *R*^{2} and small root-mean-square error (RMSE) indicate a better fit. The optimal fitness is attained when the fitness becomes stable as a function of the evolutionary generations.

The equations for *R*^{2} and the RMSE are as follows:where is the number of samples in the test set, is the measured value, is the predicted value, and is the average measured value.

In GA-BP neural networks, the input vector is as follows:and the output vector is as follows:

The separated residual time series is trained to optimize the neural network model, which gives the weight and threshold of the GA-BP neural network model. The residual time series is then predicted.

##### 3.3. Hybrid Model for Predicting Concrete Crack Opening

Model II and the GA-BP neural network (15) are integrated to establish the improved hybrid model for predicting concrete crack openings (denoted as “hybrid model”). Mathematically, the model is expressed as follows:where , , and predict the temperature, hydraulic, and aging components of *P*_{t}, respectively; *x*(*t* + *P*_{t}) is the predicted value of the residual based on GA-BP neural network.

Figure 2 presents a flow chart of the procedure used by the hybrid model for predicting concrete crack openings.

#### 4. Case Study of Regulating Sluice with Cracks

##### 4.1. Background of Regulating Sluice with Cracks

###### 4.1.1. Crack Distribution

Hubei Province has undertaken a large water control project designed for power generation and shipping, irrigation, aquaculture, and tourism. The project consists of a regulating sluice located in the main river with an earth-rockfill dam in the main riverbed, an earth-rockfill dam in the Gucheng Section, a powerhouse in the old river, a ship lock, a concrete gravity dam, and an earth-rockfill dam on both sides of the embankment.

On-site inspection has revealed three cracks at the top of the upstream traffic bridge of the gate storehouse on the left bank of the sluice (see Figure 3). Two of these cracks run upstream and downstream and stop at the lowest part of the breast wall. Meanwhile, a long-term calcium precipitation trace appears on the vertical surface of the retaining wall upstream of the gate storehouse and extends to the floor of the upstream wall of the gate storehouse, which is located between the two penetrating cracks at the top of the sluice and threatens the normal operation of gate storehouse.

###### 4.1.2. Monitoring by Crack Meter

To diagnose the evolution of the concrete crack in the regulating sluice, nine sets of bidirectional vibrating string crack meter groups were installed to monitor the openings and misplacement momentum (see Figure 4). Mkj-1 and Mkj-2 were installed on the upstream side of the sluice, Mkj-3–Mkj-5 were installed near the top of the sluice, Mkj-6 and Mkj-7 were installed near the upstream surface of the sluice, and Mkj-8 and Mkj-9 were installed on the top of the sluice near the downstream section. From June 12 to 19, 2020, nine sets of bidirectional crack meters were installed and debugged. Figure 5 shows the measured crack openings and temperature lines of crack meter Mkj-3.

An obvious negative correlation appears between the crack opening at the sluice and the temperature. When the temperature increases, the crack closes and vice versa. Next, we use the measured crack opening in the hybrid model for concrete crack openings.

##### 4.2. Model II of Concrete Crack Openings in Sluice

###### 4.2.1. Determination of *L* in Model II

The various measuring points may give different results due to the nonlinear temperature response. To determine *L* in model II according to the principle mentioned above, we use nine temperature measurements by the crack meter. Using equation (2), we establish model II for concrete crack openings for *L* = 1, 2, 3, 4, 5. The model regression coefficient corresponding to different results for *L* is obtained by regression analysis. *R*-square and RMSE of different improved models were calculated. The results are shown in Figure 6.

When *L* increases from 1 to 3, both *R*^{2} and RMSE improve significantly; when *L* ≥ 3, the two parameters remain stable. From the point of view of statistical models, this shows that temperature exerts a nonlinear effect on the concrete crack opening. Therefore, when model II with nonlinear temperature effects is included, *R*^{2} = 0.891 and RMSE = 0.04571. Table 1 lists the regression coefficients obtained through regression analysis. Using these, model II for concrete crack openings is as follows.

###### 4.2.2. Comparison with Other Models

Figure 7 compares model II for crack openings (22) with model I (1).

**(a)**

**(b)**

**(c)**

Model I considers only the linear response to temperature, as highlighted by the ellipses in Figure 7(a). Note that the fit near the extreme values is not ideal. In Figure 7(b), the nonlinear response to temperature is considered in model II, which makes the fit closer to the measured trajectory. This is seen in the decrease in the residual error of model II (Figure 7(c)).

Notably, limitations of the statistical models shown in Figures 7(b) and 7(c) still result in periods where the fit is poor, which will affect the accuracy of the models’ predictions to some extent. Hence, the prediction methods based on the statistical model may still be improved. For this reason, we apply chaos theory to further analyze the residual, which leads to the more accurate model.

##### 4.3. Phase-Space Reconstruction of Residual Series

###### 4.3.1. Determination of Embedding Dimension and Delay Time

The residual time series from model II for crack openings is obtained by separating the measured values. From the residual time series, 1825 samples from 22:00 on June 23, 2020, to 7:00 on February 11, 2021, are selected as the training dataset, and 24 samples from 10:00 on February 11, 2021, to 7:00 on February 14, 2021, are selected as a posteriori prediction dataset. Figure 8 shows *S*_{2}(*m*, *τ*) as a function of delay time *τ* constructed using the AD plotting method, which is based on phase-space reconstruction theory. Figure 9 plots *E*1 and *E*2 (determined by the Cao method) as a function of embedding dimension *m*.

When the embedding dimension is constant, the slope of *S*_{2}(*m*, *τ*) vs. *τ* stabilizes as *τ* increases. The delay time is optimal when the slope of *S*_{2}(*m*, *τ*) vs. *τ* decreases to 40% of its initial value, which gives *τ* = 2 as the optimal delay time. For a given optimal delay time, *E*1 stabilizes with increasing embedding dimension, and *E*2 0. Therefore, when *E*1 stabilizes, the corresponding embedding dimension (i.e., the minimum embedding dimension) is eight.

###### 4.3.2. Phase-Space Reconstruction of Residual Time Series and Determination of Prediction Step

Given the embedding dimension and delay time, phase-space reconstruction is applied to the 1D residual time series via equation (3), as per the principle mentioned above. The maximum Lyapunov exponent is calculated via the Rosenstein method, indicating that the system contains chaotic components. The reciprocal of the system is then calculated, giving a maximum prediction step of 40. Therefore, the posterior step is 24 (i.e., 24 samples at 7:00 on February 11, 2021, and 7:00 on February 14, 2021), and the duration is three days.

##### 4.4. Hybrid Model Based on GA-BP Neural Networks

###### 4.4.1. Training to Optimize GA-BP Neural Networks

As shown in Figure 1, we use 1825 samples from 22 on June 23, 2020, to 7:00 on February 11, 2021, to train the GA-BP neural networks. The structure of the BP neural network is 8-12-8, the GA’s maximum number of generations is 30, the population is 40, the crossover probability is 0.8, and the variation probability is 0.1. Figure 10 shows the resulting fit.

When the fitness stabilizes (i.e., for >5 generations) and satisfies the precision requirements of neural networks, the optimized BP neural network is created using the weights and threshold parameters output at this time to train and predict the reconstructed residual time series.

###### 4.4.2. Hybrid Model

Based on the length of the posteriori predicted dataset of residual time series, the prediction is made starting at 10:00 on February 11, 2021, and the prediction step is 24 (three days). The residual time series predicts that *x*(*t* + 1), *x*(*t* + 2), …, *x*(*t* + 24) are obtained, and the hybrid model is as follows:where is the predicted concrete crack opening at time *P*_{t}, where *P*_{t} = 1, 2, …, 24.

###### 4.4.3. Prediction Accuracy

Mean square error (MSE), mean absolute error (MAE), and normalized mean square error (NRMSE) are introduced to evaluate the prediction accuracy of the hybrid model. These are given by:where is the standard deviation of the test set, is the number of samples in the test set, is the measured concrete crack opening, and is the predicted concrete crack opening. The closer the three evaluation indexes are to zero, the more accurate is the prediction.

According to the posteriori prediction of the residual time-series dataset, equation (23) gives the prediction of the hybrid model for concrete crack openings starting from 10:00 on February 11, 2021, with a prediction step of 24 (three days). Figure 11 compares this prediction with those of model I and model II. Table 2 gives the evaluation indicators for different models.

Unlike the values predicted by the conventional statistical model, those predicted by the statistical model considering nonlinear temperature component, which considers the nonlinear response to temperature, are closer to the measured values, although the improvement is minimal near the extreme values. However, the difference between the predicted values and the measured values near the extrema is significantly reduced when using the hybrid model. Compared with model I, the MSE, MAE, and NRMSE of the hybrid model decreased by 87.0%, 68.5%, and 63.9%, respectively. Compared with model II, the MSE, MAE, and NRMSE of the hybrid model decreased by 73.4%, 51.4%, and 48.4%, respectively.

In summary, a positive Lyapunov exponent indicates that concrete crack openings have chaotic components, so the hybrid model makes significantly improved predictions of concrete crack openings.

##### 4.5. Dynamic Prediction Based on Hybrid Model of Concrete Crack Openings

Early warning is a dynamic process in real engineering because concrete crack openings evolve over time. Given new monitoring data, the parameters of the model must be updated and the dynamic prediction must be redone. To verify the reliability of the hybrid model, ten consecutive dynamic predictions were made, each with a time step of 24 (three days), and MSE, MAE, and NRMSE of model I, model II, and the hybrid model were statistically analyzed. Based on these three metrics, Table 2 gives the calculated improvement for the hybrid model relative to model I and model II, and Figure 12 shows the box plots for a comparative analysis.

**(a)**

**(b)**

For the hybrid model, the three evaluation metrics are significantly improved compared with the two statistical models. The improvement for the hybrid model relative to model I is 45.9%–70%, and the maximum improvement in the MSE exceeds 80%. The average improvement for the hybrid model relative to model II is 23.4%–40.1%, and the maximum improvement in the MSE exceeds 70%. This indicates that the hybrid model provides more accurate predictions of concrete crack openings than does model II.

#### 5. Conclusions

In this study, the conventional statistical model for concrete crack opening is improved by considering the nonlinear temperature factor. Using chaos theory and phase-space reconstruction theory, a hybrid model that combines the improved statistical model with a GA-BP neural network residual model is finally developed to predict concrete crack openings in hydraulic structures.

It is found that concrete crack openings are affected by the nonlinear temperature response. However, the conventional statistical model cannot accurately reflect the temperature component of the system. The improved statistical model makes a significant improvement in this area by introducing the second- and third-order temperature components.

The residual series of the statistical model considering nonlinear temperature component contains useful information of crack evolution; the presence of a chaotic component is demonstrated by calculating the maximum Lyapunov exponent. The phase-space reconstruction of the residual series is combined with a GA-BP neural network and the statistical model considering nonlinear temperature component to establish a hybrid model to predict the residuals. The predictions of the hybrid model are more accurate than those of the statistical model.

These results of the case study show that the hybrid model produces more reliable short-term dynamic predictions of concrete crack openings, making it of practical use for generating early warnings of concrete crack openings.

#### Data Availability

The data in this article were calculated and analyzed by the authors.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This study was financially supported by the National Key Research and Development Program of China (No. 2018YFC0407103), the Open Research Fund of State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin (China Institute of Water Resources and Hydropower Research) under Grant No. IWHR-SKL-201917, and the IWHR Research and Development Support Program (No. SM0145B022021).

#### Supplementary Materials

Supplementary 1. Concise description for Figure 5. Figure 5: concrete crack opening (blue) and temperature (red). Column A is time. Column B is concrete crack opening. Column C is measured temperature. Supplementary 2. Concise description for Figure 6. Figure 6: RMSE and *R*^{2} as a function of *L* in model II. Column A is the order of the temperature polynomial (denoted as L). Column B is RMSE. Column C is *R*^{2}. Supplementary 3. Concise description for Figure 7. Figure 7: fitting effect of model I and model II. Column A is time. Column B is measured value. Column C is fitted value by model I. Column D is residuals of model I. Column E is fitted value by model II. Column F is residuals of model II. Supplementary 4. Concise description for Figure 8. Figure 8: *S*_{2}(*m*,*τ*) as a function of delay time *τ* for different embedding dimensions. Column A is delay time. Column B is *S*_{2}(*m*,*τ*) with *m* = 2. Column C is *S*_{2}(*m*,*τ*) with *m* = 5. Column D is *S*_{2}(*m*,*τ*) with *m* = 8. Column E is *S*_{2}(*m*,*τ*) with *m* = 12. Supplementary 5. Concise description for Figure 9. Figure 9: *E*1 and *E*2 are function of embedding dimension *m*. Column A is embedding dimension *m*. Column B is *E*1. Column C is E2. Supplementary 6. Concise description for Figure 10. Figure 10: fitness curve for training to optimize GA-BP neural networks. Column A is evolutional generation. Column B is fit effect value. Supplementary 7. Concise description for Figure 11. Figure 11: evolution of predicted and measured values from the different models. Column A is time. Column B is measured values. Column C is predicted values by model I. Column D is predicted values by model II. Column E is predicted values by hybrid model. Supplementary 8. Concise description for Figure 12. Figure 12: dynamic prediction of hybrid model of concrete crack openings compared with that of statistical models. Column A is times of dynamic prediction. Column B is improvement of MSE due to hybrid model relative to model I. Column C is improvement of MAE due to hybrid model relative to model I. Column D is improvement of NRMSE due to hybrid model relative to model I. Column E is improvement of MSE due to hybrid model relative to model II. Column F is improvement of MAE due to hybrid model relative to model II. Column G is improvement of NRMSE due to hybrid model relative to model II.* (Supplementary Materials)*