Abstract

Fracture energy is commonly used to represent the fracture performance of concrete structures and beams, which is crucial for the application of concrete. However, due to the nature of concrete as a material and the complexity of the fracture process, it is difficult to accurately determine the fracture energy of concrete and to predict the fracture behavior of different concrete structures. In this study, artificial intelligence approaches were explored as a feasible way to solve these prediction problems. First, ridge regression (RR), the classification and regression tree (CART), and the gradient boosting regression tree (GBRT) were selected to construct the predictive models. Then, the hyperparameters were tuned with the particle swarm optimization (PSO) algorithm, and the performances of the three optimized models were compared on the test dataset. The mean squared errors (MSEs) of the optimum RR, CART, and GBRT models were 0.0447, 0.0164, and 0.0111, respectively, indicating excellent performance. Compared with the RR and CART models, the hybrid model constructed with GBRT and PSO proved to be the most accurate and generalizable, both of which are essential qualities for prediction work. The relative importance of the variables that influence the fracture energy of concrete was also obtained, and compressive strength was found to be the most significant variable.

1. Introduction

Because of its beneficial properties, i.e., excellent corrosion resistance and good compressive performance, concrete has been used extensively in the load-bearing members of building structures. However, during the casting and curing process, certain amounts of voids and defects are introduced into concrete structures, leading to a heterogeneous microstructure and low bonding strength in the interfacial transition zone (ITZ) [1–5]. As a result, concrete structures are prone to fracture when bearing tensile loads. Hence, both the analysis of the concrete fracture phenomenon and the accurate prediction of the fracture performance of concrete are crucial for the application of concrete materials. Various indices have been proposed to represent the fracture performance of concrete, such as the fracture energy, fracture toughness, and tensile strength [6–8].

Due to the simplicity and accuracy of the testing and calculation processes, fracture energy is often selected as the fracture index of concrete. The International Union of Laboratories and Experts in Construction Materials, Systems, and Structures (RILEM) recommended a standard for testing and computing the fracture energy of concrete [9]. Subsequently, many researchers became interested in determining the best way to assess the fracture energy of concrete beams. Based on the RILEM recommendation, Bazant et al. analyzed the fracture energy of concrete specimens of different sizes to determine the relationship between fracture energy and size, and they also found that the fracture energy was associated with the lengths of the notches in the concrete beams [10]. Hu found that fracture energy depends on both the size and the geometry of the test specimen and proposed the concept of local fracture energy to describe the fracture along the width of a concrete beam [11]. Karamloo et al. studied the influence of the water-to-cement ratio on the fracture performance of self-compacting lightweight concrete and concluded that a remarkable relationship existed between the water-to-cement ratio and the fracture energy of concrete [12]. Kozul and Darwin found that the fracture energy of high-strength concrete decreases as the size of the aggregate increases, whereas the fracture energy of normal-strength concrete increases as the size of the aggregate increases [13]. Fracture energy has also been shown to be related to the amount and coarseness of the aggregate in the concrete. The compressive strength of concrete, which is commonly used to evaluate the strength of concrete, also affects the fracture energy [14–16].

It is apparent that the fracture energy of concrete is affected by various factors, which makes it difficult for conventional methods to predict the fracture energy of concrete accurately. However, artificial intelligence (AI) methods, which mimic human thinking, can be used to analyze such complex regression problems [17, 18]. Artificial intelligence approaches have been used extensively in various fields. Kitouni et al. constructed a smart agricultural enterprise system based on the integration of the Internet of Things and agent technology [19]. Srinivasa et al. used data analytics-assisted Internet of Things to produce intelligent healthcare monitoring systems [20]. Biswas et al. used a hybrid model to treat classification problems in the Internet of Things environment [21]. Artificial intelligence approaches have also been used in other fields, such as the determination of the solubility of gases in different liquids [22–25], the analysis of seismic fragility [26–30], the prediction of the performance of tunnel boring machines (TBMs) [31–33], and the prediction of rock burst in underground spaces [34]. Unfortunately, the fracture performance of concrete, which is crucial for its application, has rarely been studied using artificial intelligence methods.

In this paper, hybrid artificial intelligence approaches were used to predict the fracture energy of concrete. Ridge regression (RR), the classification and regression tree (CART), and the gradient boosting regression tree (GBRT) were used to establish the relationships between fracture energy and the influencing factors, and particle swarm optimization (PSO) was used to tune the hyperparameters of these three models. Subsequently, the performances of the three prediction models were compared, and the importance of each of the influencing factors was analyzed with the GBRT ensemble algorithm. This paper is structured as follows. Section 2 presents the main details of the three machine-learning algorithms used in this study and introduces the theory of the PSO algorithm. Section 3 describes the dataset used for machine learning and the preprocessing of the data. Section 4 presents the procedure used to tune the hyperparameters. Section 5 presents the results of testing the performance of the different predictive models with their optimum hyperparameters. The influences of the different variables on the fracture energy of concrete are compared in Section 6, and Section 7 provides a summary of the paper.

2. Machine Learning and PSO Algorithms

2.1. Linear Regression (LR) Algorithm

The linear regression (LR) algorithm is one of the simplest and most extensively used prediction techniques. As shown in the following equation, LR uses a single equation to describe the relationship between different variables:

Y = θ0 + θ1x1 + θ2x2 + … + θnxn, (1)

where x1, x2, …, xn are the different features, which are regarded as independent variables; Y is the target variable, which depends on the independent variables; and the values of θi are the weights assigned to the features based on their importance.

The cost function, J(θi), is introduced to evaluate the performance of the prediction equation; that is, when J(θi) reaches its minimum value, the best equation can be obtained. The cost function, J(θi), is defined as follows:

J(θi) = (1/2m) Σ_{i=1}^{m} (Yθ(xi) − yi)², (2)

where m is the size of the data, Yθ(xi) is the predicted value, and yi is the actual value.

However, for simple linear regression, overfitting is a problem that cannot be ignored, because the model can fit the training data perfectly but behave poorly in the prediction of unknown data. Hence, penalty methods, such as the L1 regularization technique and the L2 regularization technique, are used to address this problem. In this paper, we used only ridge regression (RR), which adds an L2 penalty term to the cost function, J(θi). The updated cost function is [35, 36]

J(θi) = (1/2m) Σ_{i=1}^{m} (Yθ(xi) − yi)² + λ Σ_{j=1}^{n} θj², (3)

where λ represents the degree of penalty. The best ridge regression model can likewise be obtained by minimizing the cost function. Therefore, for the ridge regression model, the parameter α, which determines λ in the L2 penalty term, should be set before a prediction is made.
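As a minimal illustration of this penalized cost function (not the paper's implementation, and with illustrative names and data), the one-feature case without an intercept can be fitted by gradient descent; the L2 term visibly shrinks the fitted weight:

```python
# One-feature ridge regression sketch: minimize
# J(theta) = (1/2m) * sum((theta*x - y)^2) + lam * theta^2 by gradient descent.
def fit_ridge_1d(xs, ys, lam, lr=0.01, iters=5000):
    m = len(xs)
    theta = 0.0
    for _ in range(iters):
        # dJ/dtheta = (1/m) * sum((theta*x - y) * x) + 2 * lam * theta
        grad = sum((theta * x - y) * x for x, y in zip(xs, ys)) / m + 2 * lam * theta
        theta -= lr * grad
    return theta

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # noiseless y = 2x
theta_ols = fit_ridge_1d(xs, ys, lam=0.0)    # no penalty: recovers ~2.0
theta_ridge = fit_ridge_1d(xs, ys, lam=0.1)  # penalty shrinks theta below 2.0
```

With λ = 0 the fit recovers the true slope; with λ = 0.1 the closed-form solution Σxy / (Σx² + 2λm) shows the shrinkage that protects against overfitting on noisy data.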

2.2. Classification and Regression Tree (CART) Algorithm

The classification and regression tree (CART) algorithm is a kind of decision tree algorithm that can deal with both classification and regression problems [37]. It uses a tree-like graph to assist in making decisions, and it is considered to be one of the best and most-frequently used supervised learning methods [38]. The CART algorithm typically consists of two stages of procedures, i.e., the tree generation stage and the pruning stage.

Normally, a CART is generated by splitting a dataset and consists of the root node, the decision nodes, the leaf nodes, and the branches. For regression problems, the splitting criterion is recursive binary splitting [37], as shown in the following equations:

R1(j, s) = {x | x(j) ≤ s}, R2(j, s) = {x | x(j) > s}, (4)

min_{j,s} [ min_{c1} Σ_{xi∈R1(j,s)} (yi − c1)² + min_{c2} Σ_{xi∈R2(j,s)} (yi − c2)² ], (5)

where j is the data attribute to be split, x(j) is the splitting variable, and s is the splitting point. The data are split into two subsets, R1 and R2; yi is the output variable, c1 is the average value of the output variables in R1, and c2 is the average value of the output variables in R2. The partitioning must ensure that the sums of squared errors (SSE) of subsets R1 and R2 are minimized separately and that the total SSE over R1 and R2 is minimized. Figure 1 shows that the splitting process divides the root node, which contains all of the data, into two subsets. The attributes in each subset are as homogeneous as possible, subject to the condition of the largest difference between the two subsets. It should be noted that all of the data splitting must follow this rule during the growth of the tree. With each partition, the variance within each subset is reduced, but the model becomes more complicated. The partitioning stops either (1) when the data in each leaf node share the same characteristics or (2) when the depth of the tree reaches its maximum value.

After the tree is fully grown, the tree model tends to be overly complex. Such a model can fit the given training data very well but behave poorly in predicting the outcomes of untrained data, a phenomenon known as overfitting. This is because some branches of the tree are so specific that they contribute little to the tree's ability to generalize. Hence, the pruning process, which consists of prepruning and postpruning, is designed to remove these redundant branches. After the tree is pruned, the simplified CART model does a better job of predicting the untrained data. For each decision tree, the following essential parameters should be considered: max_depth, min_samples_split, and min_samples_leaf. The max_depth controls the size of the tree, while min_samples_split and min_samples_leaf constrain the minimum number of samples required to split a node and the minimum number of samples in each leaf, respectively.
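The recursive binary splitting criterion above can be sketched for a single attribute as a scan over candidate split points, choosing the one that minimizes the combined SSE of the two subsets (illustrative helper names and data, not the paper's code):

```python
# SSE of a subset around its mean (the c1/c2 values in the splitting criterion).
def sse(ys):
    if not ys:
        return 0.0
    c = sum(ys) / len(ys)
    return sum((y - c) ** 2 for y in ys)

# Scan midpoints between consecutive sorted x values; keep the split point s
# that minimizes SSE(R1) + SSE(R2).
def best_split(xs, ys):
    pairs = sorted(zip(xs, ys))
    best_s, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        s = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= s]   # subset R1
        right = [y for x, y in pairs if x > s]   # subset R2
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_s, best_cost = s, cost
    return best_s, best_cost

# Two clearly separated clusters: the best split falls between x = 3 and x = 4.
s, cost = best_split([1, 2, 3, 4, 5, 6], [1.0, 1.1, 0.9, 10.0, 10.2, 9.8])
```

Applying this search recursively to each resulting subset, until a depth or sample-count limit is reached, is exactly the tree-generation stage described above.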

2.3. Gradient Boosting Regression Tree

During the application of the CART algorithm, high sensitivity to the data is a big challenge; in some cases, small variations in the data might result in the generation of a completely different tree. Therefore, the boosting algorithm was proposed to solve this problem by combining several base learners [39]. The gradient boosting regression tree (GBRT) is a boosting algorithm whose base learner is the classification and regression tree (CART). By combining several CARTs, the ensemble model achieves better predictive performance. The core of GBRT is to identify an additive model that minimizes the loss function. First, a regression tree is generated to provide the maximum reduction of the loss function. Then, one new tree is added to the existing model at each iteration, and the residual is updated accordingly. It should be noted that the iterative process is stagewise: the existing trees are not modified when subsequent trees are added. By adding the new trees, the updated model performs better in the regions in which the previous model did not perform well. The final GBRT model consists of several decision trees with different structures; consequently, the predictive model becomes more robust and accurate.

In addition to the parameters set in the base learners (CART), GBRT uses three extra parameters: the number of base learners (n_estimators), the contribution of each additional base learner that is fitted (learning_rate), and the loss function.
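The stagewise procedure described above can be sketched with one-split stumps as base learners and squared loss, so that each stage simply fits the current residuals (a minimal illustration with assumed names and toy data, not the paper's GBRT implementation):

```python
# Fit a one-split regression stump to (xs, rs): find the threshold s that
# minimizes SSE, return a predictor emitting the left/right subset means.
def fit_stump(xs, rs):
    best = None
    for s in xs:
        left = [r for x, r in zip(xs, rs) if x <= s]
        right = [r for x, r in zip(xs, rs) if x > s]
        if not left or not right:
            continue
        cl, cr = sum(left) / len(left), sum(right) / len(right)
        cost = (sum((r - cl) ** 2 for r in left)
                + sum((r - cr) ** 2 for r in right))
        if best is None or cost < best[0]:
            best = (cost, s, cl, cr)
    _, s, cl, cr = best
    return lambda x: cl if x <= s else cr

# Stagewise additive model: start from the mean, then repeatedly fit a stump
# to the residuals and add it with a learning rate. Earlier stumps are never
# modified once added.
def fit_gbrt(xs, ys, n_estimators=50, learning_rate=0.3):
    f0 = sum(ys) / len(ys)
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + sum(learning_rate * st(x) for st in stumps)

xs, ys = [1, 2, 3, 4, 5, 6], [1.0, 1.2, 3.0, 3.2, 7.0, 7.1]
model = fit_gbrt(xs, ys)  # training error shrinks as stages accumulate
```

A single stump cannot represent this three-level target, but the boosted sum of 50 stumps fits it closely, which is the point of the ensemble.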

2.4. Particle Swarm Optimization

Particle swarm optimization (PSO) is an evolutionary computation technique that originated from the study of the predation behavior of birds. The basic idea of the PSO algorithm is to find the optimal solution through collaboration and information sharing between the individuals in a group [40]. The advantages of PSO are its rapid convergence and easy implementation, and it has proven efficient in various optimization problems, such as objective function optimization, optimization in dynamic environments, the training of neural networks, and others [41, 42].

Figure 2 shows the procedure of the PSO in a flowchart. The PSO algorithm starts with the random generation of particles. Every particle has only two attributes: its position x and its velocity v. The position represents the direction of movement, and the velocity represents the speed of movement. Each particle searches for the optimal solution separately in the search space and records its fitness value. Then, by comparing this fitness with that of pbest_i (particle i's previous best location), the current best location is determined, and the global best location, gbest, can be obtained accordingly.

The velocity, vi, and position, xi, of particle i can be updated based on its best location, pbest_i, and the global best position, gbest:

vi(t+1) = w·vi(t) + c1·r1·(pbest_i − xi(t)) + c2·r2·(gbest − xi(t)), (6)

xi(t+1) = xi(t) + vi(t+1), (7)

where w is the inertia weight parameter, c1 and c2 are the acceleration coefficients, and r1 and r2 are random values between 0 and 1. As denoted in (6), the velocity of particle i depends on three factors: its velocity at the previous iteration, its own best location, and the global best position.

When the criterion of termination is met (usually a sufficiently good fitness or a maximum number of iterations), the iteration stops, and the optimum location is obtained.
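The update rules and termination criterion above can be collected into a short sketch for a one-dimensional minimization problem (parameter values w, c1, c2 and all names are illustrative assumptions, not the paper's settings):

```python
import random

# Minimal PSO sketch: velocity update v = w*v + c1*r1*(pbest - x)
# + c2*r2*(gbest - x), position update x = x + v, clamped to [lo, hi].
def pso_minimize(fitness, lo, hi, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    pbest = xs[:]                       # each particle's best-known position
    gbest = min(xs, key=fitness)        # global best-known position
    for _ in range(iters):              # termination: max iteration count
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vs[i] = (w * vs[i] + c1 * r1 * (pbest[i] - xs[i])
                     + c2 * r2 * (gbest - xs[i]))
            xs[i] = min(max(xs[i] + vs[i], lo), hi)
            if fitness(xs[i]) < fitness(pbest[i]):
                pbest[i] = xs[i]
            if fitness(xs[i]) < fitness(gbest):
                gbest = xs[i]
    return gbest

# Minimize a simple quadratic with minimum at x = 3.
best = pso_minimize(lambda x: (x - 3.0) ** 2, 0.0, 10.0)
```

In the hyperparameter-tuning context of this paper, the fitness function would be the cross-validated MSEavg rather than an analytic objective.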

3. Dataset

3.1. Data Description

The data of 736 3-p-b (three-point bending) concrete tests were collected from the research published in 14 papers [43–55]. Table 1 summarizes the items that were recorded during the experiments, along with the range of each item. Some essential details of the items are as follows: S represents the span between the two supports in the 3-p-b tests, W is the width of the beams, T is the thickness, a0 is the length of the initial notch, w/c is the water/cement ratio, λ represents the distribution of aggregate size, dmax is the maximum diameter of the aggregate, fc is the compressive strength of the concrete, and Gf is the calculated fracture energy of the specimens. It should be noted that all of the fracture energies of the 3-p-b beams were calculated following the recommendation of RILEM TC50-FCM (1985):

Gf = (∫ P dδ + m·g·δ0) / ((W − a0)·T), (8)

where P is the load, δ is the load-point displacement (obtained from the load-displacement curve of the 3-p-b tests), m is the mass of the beam between the supports, g is the gravitational acceleration, and δ0 is the final displacement.
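As a minimal sketch of this computation (illustrative values, with the trapezoidal rule standing in for the area under the measured curve), the work of fracture plus the self-weight term is divided by the ligament area (W − a0)·T:

```python
# RILEM-style fracture energy sketch: area under the P-delta curve
# (trapezoidal rule) plus the self-weight term, over the ligament area.
# Units are assumed consistent (N, m -> J/m^2).
def fracture_energy(loads, disps, mass, W, a0, T, g=9.81):
    w0 = sum((loads[i] + loads[i + 1]) / 2 * (disps[i + 1] - disps[i])
             for i in range(len(loads) - 1))  # work of fracture (J)
    delta0 = disps[-1]                        # final displacement
    return (w0 + mass * g * delta0) / ((W - a0) * T)

# Triangular load-displacement curve peaking at 1000 N, self-weight ignored:
gf = fracture_energy(loads=[0.0, 1000.0, 0.0],
                     disps=[0.0, 0.001, 0.002],
                     mass=0.0, W=0.1, a0=0.05, T=0.1)
```

Here the work of fracture is 1.0 J over a 0.005 m² ligament, so the sketch returns a fracture energy of 200 J/m².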

To clearly identify the characteristics of the different variables, their distributions were plotted as histograms and compared with normal distributions. As shown in Figure 3, the distributions of the input variables were unordered and scattered, and large differences existed among the distributions of the different variables. Figure 4 shows the characteristics of the concrete fracture energy; its values appeared to be continuous and regular. It was difficult to determine the connections among the input variables or to establish the relationship between the output variable and the input variables. Hence, AI approaches are needed to solve such complex problems.

3.2. Data Processing

Some preparatory work was required before the data could be used in the predictive models. As shown in Table 1, the variables have different units, and large differences existed between their values. Hence, normalization was used to scale all of the data into values ranging from 0 to 1. Then, the database was split into two sets: a training set and a testing set. The training set was used to train the predictive models and learn the model parameters, and the testing set was used to evaluate the performance of the predictive models. In this study, the ratio between the training set and the testing set was 0.7 : 0.3; therefore, 515 cases were used to train the models, and 221 cases were used to test their performance. Note that all of the data should be shuffled before being split to ensure the representativeness of the datasets.
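The two preprocessing steps can be sketched as follows (illustrative names and seed; the per-variable min-max scaling and the shuffled 70/30 split mirror the description above):

```python
import random

# Min-max normalization of one variable's values into [0, 1].
def min_max_normalize(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Shuffle, then split 70/30; 736 cases -> 515 training, 221 testing.
def split_dataset(data, train_ratio=0.7, seed=42):
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)               # shuffle before splitting
    n_train = int(train_ratio * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

data = list(range(736))                 # stand-in for the 736 test records
train, test = split_dataset(data)
norm = min_max_normalize([10.0, 20.0, 30.0])
```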

4. Construction of Predictive Models

4.1. K-Fold Cross-Validation

In Section 3.2, it was noted that all the data should be split into a training set and a testing set. However, for experimental data, the sizes of the split sets are often insufficient for the predictive models. Thus, K-fold cross-validation was introduced to address this deficiency by repeatedly using the data in the training set. Figure 5 shows that the training set was divided evenly into K nonoverlapping parts. Then, K−1 parts were chosen as the training subsets to train the predictive model, and the remaining part served as the validation subset, which was used to validate the performance of the current model. This process was repeated K times; consequently, each part served as the validation subset once and as part of the training subsets K−1 times. For regression problems, the mean squared error (MSE) is commonly used as the performance indicator of a model. The MSE value (MSEi) is assessed on the validation subset in each fold, and the average MSE value (MSEavg) over the K folds represents the behavior of the predictive model. In this study, K was set to five, as recommended by An et al. [56].
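The fold construction can be sketched as index bookkeeping (illustrative names; 515 training cases and K = 5 as in this study, so each fold holds 103 cases):

```python
# Partition n indices into k contiguous, non-overlapping folds.
def k_fold_indices(n, k=5):
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)  # spread any remainder
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = k_fold_indices(515, k=5)
# Each (training subset, validation subset) pair: fold i validates once,
# the other k-1 folds train. MSE_avg would be the mean of the k fold MSEs.
splits = [(sum((f for j, f in enumerate(folds) if j != i), []), folds[i])
          for i in range(5)]
```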

4.2. Hyperparameter Tuning

Sections 2.1, 2.2, and 2.3 illustrated the theories of the RR, CART, and GBRT algorithms, respectively. In this section, K-fold cross-validation and PSO are combined to tune the hyperparameters of these three algorithms. Because of their significant influence on the structure of the algorithms and the performance of the models, the alpha parameter was tuned for the RR model; the max_depth, min_samples_split, and min_samples_leaf parameters were tuned for the CART model; and two additional parameters, n_estimators and learning_rate, were tuned for the GBRT model. Here, the average MSE value (MSEavg) over the K folds is regarded as the fitness value of the particles, and the least-MSEavg criterion is applied to search for the optimum parameters.
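The coupling of PSO and cross-validation can be sketched end to end for the simplest case, tuning a single ridge alpha: each particle position is a candidate alpha, and its fitness is the 5-fold MSEavg (all data, names, and PSO constants here are illustrative assumptions, not the paper's setup):

```python
import random

# Synthetic 1-D data with true slope 2 and Gaussian noise.
rng = random.Random(0)
data = [(x, 2.0 * x + rng.gauss(0, 0.5))
        for x in [rng.uniform(0, 5) for _ in range(100)]]

# Closed-form one-feature ridge (no intercept): theta = Sxy / (Sxx + 2*lam*m).
def ridge_theta(pairs, lam):
    m = len(pairs)
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, y in pairs)
    return sxy / (sxx + 2 * lam * m)

# Fitness of a candidate alpha: average validation MSE over 5 folds.
def cv_mse(lam, k=5):
    fold = len(data) // k
    total = 0.0
    for i in range(k):
        val = data[i * fold:(i + 1) * fold]
        train = data[:i * fold] + data[(i + 1) * fold:]
        theta = ridge_theta(train, lam)
        total += sum((theta * x - y) ** 2 for x, y in val) / len(val)
    return total / k

# Tiny PSO over alpha in [0, 1]; least MSE_avg wins.
xs = [rng.uniform(0, 1) for _ in range(10)]
vs = [0.0] * 10
pbest = xs[:]
gbest = min(xs, key=cv_mse)
for _ in range(30):
    for i in range(10):
        r1, r2 = rng.random(), rng.random()
        vs[i] = (0.7 * vs[i] + 1.5 * r1 * (pbest[i] - xs[i])
                 + 1.5 * r2 * (gbest - xs[i]))
        xs[i] = min(max(xs[i] + vs[i], 0.0), 1.0)
        if cv_mse(xs[i]) < cv_mse(pbest[i]):
            pbest[i] = xs[i]
        if cv_mse(xs[i]) < cv_mse(gbest):
            gbest = xs[i]
best_alpha = gbest
```

Tuning CART or GBRT works the same way, except that each particle carries a vector of hyperparameters and the fitness evaluation refits the corresponding tree model per fold.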

First, the RR, CART, and GBRT models were trained and validated with the training data from the 3-p-b tests. Figure 6 shows the evolution of MSEavg with progressing iterations. It is apparent that the three algorithms behave differently in both convergence rate and optimum MSEavg value. The RR converged to a stable state in only one iteration, with an MSEavg of 0.0445; the CART algorithm converged to a stable state within two iterations, with an optimum MSEavg of 0.0155; and it took five iterations for the GBRT to stabilize, with the MSEavg converging from 0.0195 to 0.0116. It was concluded that the PSO was efficient in tuning the hyperparameters of these three models. The convergence of the RR was the fastest because of its simple structure and few parameters; as the number of parameters increases, the convergence time also increases. Although the GBRT was the slowest to converge, its MSEavg value was the smallest among the three models, which means that the GBRT had the best performance in the training process.

When the maximum number of iterations is reached, the optimum MSEavg is obtained, and the current hyperparameters are regarded as the optimum parameters. These parameters are then used to build the predictive models. The optimum hyperparameters of the different models are as follows:

RR: alpha = 0.266
CART: max_depth = 16, min_samples_split = 5, min_samples_leaf = 2
GBRT: max_depth = 6, min_samples_split = 5, min_samples_leaf = 10, n_estimators = 230, learning_rate = 0.453

5. Testing of Predictive Models

It is well known that an excellent predictive model must both fit the training data well and accurately predict unknown data; that is, it must have both low training error and low generalization error. The optimum hyperparameters of the RR, CART, and GBRT models were determined in Section 4, and the predictive models can be built accordingly. Before the predictive models can be used, it is essential to test their performance, especially their ability to generalize from the available information. In this study, 221 pieces of data were used to verify the predictive capabilities of the three models. The MSE and R² values were selected to quantify the behavior of the three models. The R² value can be calculated as follows:

R² = 1 − Σ_{i=1}^{N} (yi − ŷi)² / Σ_{i=1}^{N} (yi − ȳ)², (9)

where R² is the coefficient of determination of the predictive model, yi is the experimental result, ŷi is the predicted result, ȳ is the average value of the experimental results, and N is the total number of data.
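The two test metrics can be sketched directly from their definitions (illustrative values, not the paper's test data):

```python
# Mean squared error over a test set.
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Coefficient of determination: 1 - SS_res / SS_tot.
def r_squared(y_true, y_pred):
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
test_mse = mse(y_true, y_pred)       # 0.025
test_r2 = r_squared(y_true, y_pred)  # 0.98
```

An R² close to 1 and a small MSE together indicate both accuracy and a good fit to the spread of the experimental values, which is how the three models are ranked below.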

Figure 7 compares the experimental data and the predicted results, and it also provides the MSE and R² values. During the testing process, the RR model obtained an MSE of 0.0447 and an R² of 0.3120; the CART model achieved an MSE of 0.0164 and an R² of 0.7468; and the GBRT model achieved an MSE of 0.0111 and an R² of 0.8167. Based on these results, it was concluded that the GBRT-PSO hybrid model was more successful than the RR and CART models in establishing the relationship between the concrete fracture energy and the factors that influence it.

Then, the performances of the predictive models in the testing process were compared with their performances in the training process using the MSE index. Table 2 shows the differences between the MSE values in the training and testing processes for the three models. The GBRT model produced the smallest MSE in the testing process and also a small MSE during training. In contrast, the RR and CART models produced higher testing MSE values, which indicated that those two models were unable to accurately predict the unknown data because of poor generalization. These observations highlight the importance of the generalization ability of predictive models. By combining several different CART models as base learners, the GBRT model with its optimum hyperparameters performed better on the test dataset and can be expected to perform better on future predictions.

As discussed above, the fracture energy of 3-p-b concrete beams is influenced by various factors, and the relationships between these factors are difficult for simple ridge regression methods to describe. By adding different CARTs that focused on various regions, the hybrid models with GBRT and PSO provided higher accuracy and better generalization ability.

Compared with empirical calculations of the fracture of concrete, the AI approaches have several merits. Assumptions are always needed for the simplification and application of an empirical equation, but this requirement is avoided in AI. The size effect on concrete fracture energy, which is always troublesome for empirical calculations, is directly incorporated into the AI predictive models. Hence, the AI approaches appear to be more accurate in determining the fracture energy of concrete.

6. Relative Importance of Influencing Variables

Since the GBRT model tuned with PSO and its optimum hyperparameters had the best performance, it was selected to study the influence of the different input variables on the fracture properties of concrete. For such regression problems, the mean squared error represents the impurity of a model. The importance of the variables was evaluated by their contributions to the reduction of the model's impurity, and the result was represented by a relative importance score; a higher score indicates a feature with a stronger influence. For each feature, the importance score was calculated in every single base learner of the GBRT, and the relative importance score was then obtained by averaging the scores over all CARTs. Figure 8 illustrates the influence of the input variables on the fracture energy of the concrete beams.
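The aggregation step can be sketched as follows: each tree contributes a per-feature impurity-reduction score, and the scores are averaged over all trees and normalized to sum to 1 (the feature names match Table 1, but the numbers are purely illustrative, not the paper's results):

```python
# Average per-tree importance scores over the ensemble, then normalize so
# the relative importance scores sum to 1.
def relative_importance(per_tree_scores):
    features = per_tree_scores[0].keys()
    avg = {f: sum(t[f] for t in per_tree_scores) / len(per_tree_scores)
           for f in features}
    total = sum(avg.values())
    return {f: v / total for f, v in avg.items()}

# Two hypothetical trees' impurity reductions for three features:
scores = relative_importance([
    {"fc": 0.5, "dmax": 0.3, "w/c": 0.2},
    {"fc": 0.4, "dmax": 0.2, "w/c": 0.4},
])
```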

When the importance scores of the different influencing variables were compared, it was apparent that compressive strength, with a score of 0.425, had the strongest influence on the fracture energy of the concrete beams. This strong correlation between fracture properties and compressive strength has been demonstrated repeatedly in the literature, and many empirical equations relating them have been established [57, 58]. Compressive strength is widely regarded as a universal index of concrete performance in various countries. Hence, when it is not convenient to test the tensile or fracture properties, they can be estimated from the empirical relationship between the fracture properties and compressive strength.

The importance scores of the aggregate distribution and the maximum aggregate size were 0.175 and 0.150, respectively. These two influencing variables are part of the aggregate characteristics, which reflect the microstructure of the concrete. For most laboratory 3-p-b tests of concrete, the specimens are too small to be regarded as homogeneous [59]; therefore, the effect of the aggregate on the fracture energy cannot be ignored. In particular, when laboratory tests are used to predict the failure of large structures, the aggregate characteristics should be quantified and considered.

The span, width, thickness, and length of the initial notch can be unified as the geometric parameters of a specimen. Normally, the length of the ligament (width minus initial notch length) is regarded as the size of a concrete specimen, and it has proven to be relevant to the fracture energy of 3-p-b specimens [60, 61]. However, the results in Figure 8 appear to differ, which may have been caused by the limited ranges of the width and the length of the initial notch in the dataset; this should be considered carefully in future work. The water/cement ratio should also be considered carefully when predicting concrete fracture energy, because it affects the degree of hydration of the concrete, which in turn determines the fracture process of concrete beams.

7. Conclusions

Research related to concrete fracture properties is very important for addressing durability issues in the application of concrete. In this study, ridge regression, the CART algorithm, and the ensemble GBRT algorithm were adopted to develop predictive models, and a metaheuristic method (PSO) was used to tune their hyperparameters. The fracture energy of concrete was set as the output variable, and eight influencing parameters were set as the input variables. After the optimum predictive model was obtained, the relative importance of the various influencing variables was analyzed. The main conclusions are summarized as follows:

(1) The PSO algorithm proved efficient in seeking the optimum hyperparameters of the three machine learning algorithms in this study, and all three predictive models converged to a stable state within a small number of iterations.
(2) The relationship between fracture energy and its influencing factors is complex and cannot be predicted accurately by the simple ridge regression or single CART models.
(3) Using the PSO algorithm and the ensemble method, the hybrid GBRT predictive model gained improved generalization ability and had the best performance in predicting the fracture energy of concrete.
(4) The compressive strength of concrete was found to have a significant influence on the predictive models and should be considered carefully when predicting the fracture energy of concrete.

Although the fracture properties of concrete were predicted accurately by the artificial intelligence approaches, some limitations remain in comparison with previous works [62, 63]. First, the dataset used in this study is limited, and some cases were even removed; a larger, multinational dataset should be used to enhance the accuracy of the predictive models. Second, although the optimum models achieve high accuracy in predicting the concrete fracture energy, obvious prediction errors still exist near the boundaries of the variable ranges, and the effect of the variables' distributions and boundaries on the performance of the predictive models should be considered carefully. Moreover, other optimization algorithms, such as the firefly algorithm, the ant colony optimization algorithm, and the iterated greedy algorithm, should be tried in future work.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding this publication.

Acknowledgments

This work was supported by the National Key R&D Program of China (2018YFC0808700 and 2018YFC0808706). The author XY Han acknowledges the support of China Scholarship Council (201707000075) for part of his Ph.D. research at the University of Western Australia (UWA) from 2017 to 2019.