Abstract

Ever since its introduction in the late 1980s, self-compacting concrete (SCC) has been well received by researchers. SCC can flow under its own weight and exhibits high workability. Nonetheless, its nonlinear behavior makes the prediction of its mix properties demanding, and the complex relationship between mix proportions and the rheological and mechanical properties of SCC renders behavior prediction challenging. Soft computing approaches have been shown to optimize models and reduce uncertainties; therefore, this paper addresses these challenges by employing artificial neural network (ANN) models optimized using the grey wolf optimizer (GWO) algorithm. The optimized model proved to be more accurate than genetic algorithm and multiple linear regression models. The results indicate that the four most influential parameters on the compressive strength of SCC are cement content, ground granulated blast furnace slag, rice husk ash, and fly ash.

1. Introduction

A proper self-compacting concrete mix design requires balancing two conflicting objectives, deformability and stability, that is, acceptable rheological behavior and appropriate mechanical characteristics. The proportions of the available materials, mineral additives, and admixtures must therefore be considered. An optimum balance of coarse and fine aggregates and chemical admixtures ensures greater cohesiveness of self-compacting concrete. External variations, such as changes in the production process of cement and mineral additives or in the type of aggregates, can trigger significant variations in the properties of fresh self-compacting concrete. To minimize such external variations, the use of industrial by-products and mineral additives in the manufacture of lightweight self-compacting concrete has been the focus of many researchers [1].

Due to the complex relationship between mix proportions and the rheological and mechanical properties of SCC, researchers have proposed numerous approaches in the literature. Some have used statistical models such as linear regression to predict the compressive strength of SCC [2], while others have used numerical methods to this end [3]. Among the different approaches suggested, soft computing-based methods have reported promising results [4]. These include the random kitchen sink algorithm [5], swarm optimization algorithms [6], and the backpropagation algorithm [7].

Zhang et al. developed a random forest model based on the beetle antenna search algorithm to predict the compressive strength of SCC [8], while Nehdi et al. utilized a neural network approach for the same purpose [2]. ANNs have shown promising results in the engineering domain [9, 10] and are widely employed by researchers to predict concrete's compressive strength [11, 12]. Prasad et al. employed ANNs to predict the compressive strength of high-performance SCC [13], and Siddique et al. used ANNs for SCC containing fly ash [14]. Uysal and Tanyildizi used ANNs to predict the compressive strength of SCC mixtures with mineral additives [15]. A novel approach based on a proposed normalization method for artificial neural networks was employed by Asteris et al. to determine the compressive strength of self-compacting concrete [1, 16]. Support vector machine and relevance vector machine methodologies were proposed by Aiyer et al. to predict the compressive strength of SCC [17], and Ghanizadeh et al. investigated ANNs and support vector machines in their research [18]. Asteris et al. proposed predictive models for the compressive strength of self-compacting concrete using surrogate models such as multivariate adaptive regression splines and the M5P model tree [19].

Soft computing offers several opportunities to develop new models that optimize predictions and reduce uncertainties. Mirjalili et al. introduced the grey wolf optimizer, a metaheuristic algorithm inspired by the social hierarchy of grey wolves that has outperformed other metaheuristic algorithms on benchmark problems [20, 21].

This study emphasizes enhancing the prediction model by training an ANN with the GWO algorithm. The proposed ANN-GWO model addresses all significant mix design parameters that influence SCC's compressive strength, and its accuracy is validated against a database of 205 samples from the literature. Section 2 offers the required background on ANNs and GWO. Section 3 addresses the development and training of the empirical model. Finally, the results and conclusions of the study are presented in Sections 4 and 5. An overview of the research study is illustrated in Figure 1.

2. Background

2.1. Artificial Neural Networks

Artificial Neural Networks (ANNs) are among the most dynamic areas of current research [22, 23]. Their features and capabilities include learning and adapting to existing knowledge, generalizability, parallel processing (and thus higher processing speed), and a high tolerance for errors [23, 24]. Figure 2 shows an ANN with 11 neurons in the input layer, two hidden layers comprising 7 and 13 neurons, and a single neuron in the output layer.

The basis of ANNs is as follows [25]:

(1) Data are processed in units called neurons (or nodes).

(2) Signals between nodes are transmitted over connection lines.

(3) The weight associated with each connection indicates its strength.

(4) An activation function is applied to the weighted input (plus a bias value) of each neuron to determine its output.

A feed-forward network is a type of ANN in which the connections between its constituent units do not form a cycle; information flows from the input nodes through the hidden layers to the output nodes [22].
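
To make the data flow concrete, the following minimal sketch (written in Python with NumPy for illustration; the authors used MATLAB) evaluates a feed-forward pass for the 11-7-13-1 topology of Figure 2, with hyperbolic tangent hidden layers and an identity output as used later in this study. The `forward` helper and the random initialization are hypothetical names, not the authors' implementation.

```python
import numpy as np

def forward(x, params):
    """Feed-forward pass: tanh hidden layers, identity output.
    `params` is a list of (weights, bias) pairs, one per layer."""
    a = x
    for W, b in params[:-1]:
        a = np.tanh(W @ a + b)      # hidden layers: hyperbolic tangent
    W_out, b_out = params[-1]
    return W_out @ a + b_out        # output layer: identity activation

# Example: 11 inputs -> 7 -> 13 -> 1 output, random initial weights
rng = np.random.default_rng(0)
sizes = [11, 7, 13, 1]
params = [(rng.standard_normal((n_out, n_in)), rng.standard_normal(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]
print(forward(rng.standard_normal(11), params))
```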

The weights of an ANN are generally initialized randomly, so the network initially produces arbitrary output values. The network's weights and biases are adjusted during a training process to minimize the model error [26, 27]. The methods for solving this optimization problem can generally be divided into gradient-based and metaheuristic methods. Gradient-based methods are fast but can get stuck in local minima. Metaheuristic methods, by contrast, are far less prone to such entrapment, although their result is not guaranteed to be the global minimum. Nonetheless, considering their speed and capabilities, they are designed to explore and exploit large parts of the solution space and can return precise outcomes [26–28].

2.2. Grey Wolf Optimizer

Native to remote areas of Eurasia and North America, grey wolves are members of the Canidae family and sit at the top of the food chain. Proposed by Mirjalili et al. in 2014, the grey wolf optimizer (GWO) algorithm is inspired by the strict social hierarchy of these animals [20].

Grey wolves typically live in packs of 5 to 12, and their social dominance hierarchy consists of four groups: alphas, betas, deltas, and omegas. Alpha grey wolves are the pack's leaders and make decisions about hunting, sleeping places, waking times, and so on [20, 21, 29]. The alphas are the best managers of the pack and not necessarily the strongest. The betas are second in command and assist the alphas in decision-making. Deltas (in some references, subordinates) are third in command after the alphas and betas and are responsible for tasks such as scouting, guarding, hunting, and caregiving. The last rank in the dominance hierarchy is the omegas, who submit to all higher-ranking wolves and are the last ones allowed to eat [20].

Grey wolves also have strict hunting rituals. The main stages of their hunting are as follows [30]:

(1) Pursuing and advancing toward the prey.

(2) Circling and harassing the prey until it stops moving.

(3) Attacking the prey [20].

If the position of the prey is denoted by $X_p$, the hunting process can be described using the following equations:

$$A = 2a \cdot r_1 - a \tag{1}$$

$$C = 2 r_2 \tag{2}$$

$$D = |C \cdot X_p(t) - X(t)| \tag{3}$$

$$X(t+1) = X_p(t) - A \cdot D \tag{4}$$

where $A$ and $C$ are coefficient vectors computed using the random vectors $r_1$ and $r_2$ in the [0, 1] range, $a$ is a vector whose components decrease from 2 to 0 over the algorithm's iterations, $t$ is the current iteration, $X(t)$ is the position vector of the hunting wolf, and $X(t+1)$ is its updated position in iteration $t+1$ [20].

The grey wolf optimizer starts by randomly generating a wolf population and then ranking the wolves by increasing value of the cost function used to evaluate solutions. GWO labels the best, second-best, and third-best solutions as alpha, beta, and delta, respectively, to simulate the social hierarchy of grey wolves, and designates the remaining population as omegas that follow the alphas, betas, and deltas. Since the exact location of the prey in the search space is unknown, the top three wolves are assumed to have the best knowledge of the location of potential prey. Accordingly, in GWO, the hunting process is driven by alpha, beta, and delta: equations (1) to (4) are modified by substituting the locations of alpha, beta, and delta for the location of the prey and updating the locations of the remaining wolves. The update rules are as follows [20]:

$$D_{\alpha} = |C_1 \cdot X_{\alpha} - X|, \quad D_{\beta} = |C_2 \cdot X_{\beta} - X|, \quad D_{\delta} = |C_3 \cdot X_{\delta} - X| \tag{5}$$

$$X_1 = X_{\alpha} - A_1 \cdot D_{\alpha}, \quad X_2 = X_{\beta} - A_2 \cdot D_{\beta}, \quad X_3 = X_{\delta} - A_3 \cdot D_{\delta} \tag{6}$$

$$X(t+1) = \frac{X_1 + X_2 + X_3}{3} \tag{7}$$

where $A_1$, $A_2$, and $A_3$ are calculated using equation (1), $C_1$, $C_2$, and $C_3$ are calculated using equation (2), and $X(t+1)$ is the location of the omega at the end of the iteration. GWO updates the locations of all omega wolves using equations (5) to (7); at the end of each iteration, it calculates the cost of all solutions and designates the top three as alpha, beta, and delta. In a typical implementation, the algorithm continues until a predefined number of iterations has been completed. Figure 3 displays how a search agent (i.e., an omega) updates its position according to the locations of alpha, beta, and delta in a two-dimensional search space. Put simply, the alpha, beta, and delta wolves estimate the prey's position, while the rest of the pack closes in, updating their positions randomly around it [20].
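
The loop described above can be sketched as follows. This is a minimal illustration assuming a linearly decreasing $a$ and a fixed iteration budget; `gwo_minimize` and its parameters are illustrative names, not the authors' implementation.

```python
import numpy as np

def gwo_minimize(cost, dim, n_wolves=30, n_iter=500, lb=-1.0, ub=1.0, seed=0):
    """Minimal grey wolf optimizer sketch following equations (1)-(7).
    `cost` maps a position vector of length `dim` to a scalar to minimize."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n_wolves, dim))        # random initial pack
    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)                     # decreases from 2 to 0
        costs = np.array([cost(x) for x in X])
        order = np.argsort(costs)                        # rank by increasing cost
        leaders = [X[order[k]].copy() for k in range(3)] # alpha, beta, delta
        for i in range(n_wolves):
            X_new = np.zeros(dim)
            for leader in leaders:
                r1, r2 = rng.random(dim), rng.random(dim)
                A = 2.0 * a * r1 - a                     # equation (1)
                C = 2.0 * r2                             # equation (2)
                D = np.abs(C * leader - X[i])            # equation (5)
                X_new += leader - A * D                  # equation (6)
            X[i] = X_new / 3.0                           # equation (7): average
    costs = np.array([cost(x) for x in X])
    return X[np.argmin(costs)]                           # the alpha wolf

# Example: minimize the sphere function in five dimensions
print(gwo_minimize(lambda x: float(np.sum(x ** 2)), dim=5))
```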

When the GWO algorithm is used to train an ANN, each wolf's position vector represents the weights and biases of the neural network; its dimension therefore equals the total number of weights and biases in the network. The prediction error of the network is defined as the cost function. After the GWO has completed its set number of iterations, the position vector of the wolf with the least cost (the alpha) is selected as the trained network's final weights and biases.
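
The wiring between GWO and the ANN can be sketched as below, reusing the `forward` and `gwo_minimize` helpers from the earlier sketches. The `unpack` and `make_cost` helpers are hypothetical: they decode a position vector into layer weights and define the network's training MSE as the wolf's cost.

```python
import numpy as np

def unpack(vec, sizes):
    """Decode a wolf's position vector into (weights, bias) pairs,
    layer by layer, for the `forward` helper sketched earlier."""
    params, i = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = vec[i:i + n_out * n_in].reshape(n_out, n_in)
        i += n_out * n_in
        b = vec[i:i + n_out]
        i += n_out
        params.append((W, b))
    return params

def make_cost(X_train, y_train, sizes):
    """Cost of a wolf = MSE of the network it encodes on the training set."""
    def cost(vec):
        params = unpack(vec, sizes)
        preds = np.array([forward(x, params).item() for x in X_train])
        return np.mean((preds - y_train) ** 2)
    return cost

sizes = [11, 7, 13, 1]                                         # ANN 2L (7-13)
dim = sum(o * (i + 1) for i, o in zip(sizes[:-1], sizes[1:]))  # 202 here
# best_vec = gwo_minimize(make_cost(X_train, y_train, sizes), dim)
# trained = unpack(best_vec, sizes)                    # final weights/biases
```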

3. Methods and Materials

3.1. Dataset

This paper uses the database compiled by Asteris and Kolovos [1], which includes 205 experimental self-compacting concrete (SCC) compressive strength results collected from several articles [31–38]. As suggested by Asteris and Kolovos [1], the parameters influencing the compressive strength of SCC are the following 11: cement (C), silica fume (SF), rice husk ash (RHA), limestone powder (LP), ground granulated blast furnace slag (GGBFS), coarse aggregate (CA), fine aggregate (FA), fly ash (F), water (W), viscosity modifying admixtures (VMA), and new-generation superplasticizers (SP) as chemical admixtures. The SCC specimens' 28-day compressive strength is taken as the target value [1]. The inputs are in units of kg/m3, and the compressive strength is in MPa. The maximum, minimum, mean, median, standard deviation, range, standard error, and average deviation of the data are provided in Table 1.

The histogram of the 28-day compressive strength of SCC specimens is depicted in Figure 4. As can be seen, most of the samples (159 cases) have compressive strength values that range from 30 to 80 MPa.

During ANN training, widely different input variable ranges can negatively affect the resulting model, causing issues such as divergence of the optimization algorithm and increased training time [26]. The input and output variables were therefore normalized to the [−1, 1] range using the following equation:

$$X_n = \frac{2\,(X - X_{\min})}{X_{\max} - X_{\min}} - 1 \tag{8}$$

where $X_n$ is the normalized value of the variable, $X_{\max}$ is the maximum value, $X_{\min}$ is the minimum value, and $X$ is the original (nontransformed) value of the variable.
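
As an illustration, a minimal sketch of equation (8) and its inverse follows (Python/NumPy for illustration only). The bounds in the example are made up; the actual per-variable bounds appear in Table 1.

```python
import numpy as np

def normalize(x, x_min, x_max):
    """Scale a variable to [-1, 1] per equation (8)."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def denormalize(xn, x_min, x_max):
    """Invert equation (8) to recover original units (applied to outputs)."""
    return 0.5 * (xn + 1.0) * (x_max - x_min) + x_min

# Example: a cement content of 400 kg/m3 with assumed bounds [150, 550]
print(normalize(400.0, 150.0, 550.0))    # 0.25
print(denormalize(0.25, 150.0, 550.0))   # 400.0
```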

The minimum and maximum values used for each of the 11 input parameters and the target compressive strength are provided in Table 1. Note that because the ANN is trained on normalized data, the network must be fed normalized inputs, and its output must be denormalized to recover the original values [9].

When evaluating the models, it is essential to define the measures by which their performance and accuracy will be judged. The selection criterion used here is the best fitness value (i.e., the lowest cost) on the test data; gauging model performance on test data favors the model with the highest generalization capability.

The statistical indices, including Mean Error (ME), Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Average Absolute Error (AAE), Model Efficiency (EF), and Variance Account Factor (VAF), are utilized to assess the performance of the different topologies. For measured values $y_i$, predicted values $\hat{y}_i$, mean measured value $\bar{y}$, and sample size $n$, they are defined as follows [39]:

$$\mathrm{ME} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i), \quad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|, \quad \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2, \quad \mathrm{RMSE} = \sqrt{\mathrm{MSE}},$$

$$\mathrm{AAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|, \quad \mathrm{EF} = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}, \quad \mathrm{VAF} = 1 - \frac{\operatorname{var}(y_i - \hat{y}_i)}{\operatorname{var}(y_i)} \tag{9}$$
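
A sketch of these indices in code follows. The forms shown are the standard definitions consistent with equation (9); the exact expressions in [39] may differ slightly (e.g., AAE and VAF are sometimes reported as percentages).

```python
import numpy as np

def indices(y_true, y_pred):
    """Statistical indices of equation (9) for two equal-length arrays."""
    e = y_true - y_pred
    me = np.mean(e)                                        # Mean Error
    mae = np.mean(np.abs(e))                               # Mean Absolute Error
    mse = np.mean(e ** 2)                                  # Mean Squared Error
    rmse = np.sqrt(mse)                                    # Root Mean Squared Error
    aae = np.mean(np.abs(e / y_true))                      # Average Absolute Error
    ef = 1.0 - np.sum(e ** 2) / np.sum((y_true - y_true.mean()) ** 2)  # Model Efficiency
    vaf = 1.0 - np.var(e) / np.var(y_true)                 # Variance Account Factor
    return {"ME": me, "MAE": mae, "MSE": mse, "RMSE": rmse,
            "AAE": aae, "EF": ef, "VAF": vaf}
```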

3.2. Developing the Empirical Model Using ANNs and GWO

According to Section 3.1, a total of 11 parameters affect the SCC’s compressive strength. Consequently, as illustrated in Figure 2, the trained ANNs comprise 11 neurons in the input layer and a single neuron in the output layer.

ANNs are prone to overfitting, whereby the trained ANN performs well on the training data (i.e., with minimum error) but fails to perform well on test data that are unavailable during the training process. As recommended in the literature [9, 26], the data were randomly divided into two sets to alleviate overfitting: seventy percent (144 cases) were employed for ANN training, and the remainder (61 cases) were used as test data.

In ANNs, the number of hidden layers and their neurons varies with the problem [40]. Consequently, a trial-and-error approach must be taken to identify a suitable architecture that best represents the existing data. Equation (10) offers a frequently used heuristic upper bound on the total number of hidden neurons in an ANN [41]:

$$N_H \le 2 N_I + 1 \tag{10}$$

where $N_H$ is the number of hidden-layer nodes and $N_I$ is the number of inputs.

Considering that the number of effective parameters equals 11, equation (10) indicates that the total number of hidden neurons should be at most 23. Therefore, different architectures with two hidden layers and up to 26 neurons (three more than the suggested bound) were trained. Each hidden layer contained 1 to 13 neurons; thus, 13 × 13 = 169 different architectures were trained (see Table 2). For all the ANNs, the hyperbolic tangent function was selected as the activation function of the hidden layers, and the identity function as the activation function of the output layer.
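
This trial-and-error search can be sketched as below, reusing the `gwo_minimize`, `make_cost`, `unpack`, and `forward` helpers from the earlier sketches. The `test_mse` helper and variable names are hypothetical, and the full loop over the 169 topologies is left commented out because each run trains an entire network.

```python
import numpy as np

def test_mse(n1, n2, X_tr, y_tr, X_te, y_te):
    """Train one 11-n1-n2-1 topology with GWO and score it on the test split."""
    sizes = [11, n1, n2, 1]
    dim = sum(o * (i + 1) for i, o in zip(sizes[:-1], sizes[1:]))
    best = gwo_minimize(make_cost(X_tr, y_tr, sizes), dim)   # alpha wolf
    params = unpack(best, sizes)
    preds = np.array([forward(x, params).item() for x in X_te])
    return np.mean((preds - y_te) ** 2)

# scores = {(n1, n2): test_mse(n1, n2, X_tr, y_tr, X_te, y_te)
#           for n1 in range(1, 14) for n2 in range(1, 14)}   # 169 runs
```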

The ANN training process (i.e., adjusting the weights and biases) is a minimization problem: the optimal solution chooses the weights and biases that minimize the network error (cost function). The grey wolf optimizer (GWO) algorithm (see Section 2.2) was used for this purpose. In GWO, each wolf's position vector represented the weights and biases of the ANN, and the optimal solution found during the predetermined number of iterations, that is, the alpha grey wolf, provided the weights and biases giving the least prediction error for the trained architecture.

MATLAB [42] software was utilized to program the ANNs and the grey wolf optimizer algorithm. The GWO parameters employed for training the 169 ANNs are provided in Table 3.

3.3. Empirical Model Evaluation

As mentioned previously, 169 different ANN architectures with two hidden layers were trained using the GWO algorithm. The hyperbolic tangent function was the common activation function of the hidden layers in all the networks. The output layer’s activation function was chosen as the identity function. For simplicity, in the following sections, we will refer to ANNs as ANN 2L (n1 − n2), where 2L indicates two hidden layers and n1 and n2 define the number of neurons in the first and second hidden layers, respectively.

The top 10 models among the 169 trained models were selected based on their MSE values; these ANNs and their results on the training and testing data are provided in Tables 4 and 5. Figure 5 illustrates the association between each network's architecture and its performance, plotting the RMSE of the investigated architectures against the number of neurons in the first and second hidden layers.

According to Tables 4 and 5, the ANN 2L (7-13) and ANN 2L (9-6) models yield the lowest MSE and RMSE values on the training and testing data. The test-data MSE of ANN 2L (9-6) is lower than that of ANN 2L (7-13), suggesting better generalization capability; however, since the performance of ANN 2L (7-13) is consistent across training and testing, it is chosen as the top GWO-trained model for additional investigation. On training data, ANN 2L (7-13) has MAE, MSE, EF, and VAF values of 3.53, 27.55, 0.95, and 0.95, respectively; on testing data, the corresponding values are 3.43, 27.10, 0.94, and 0.94. Note that the error metrics for the training and testing data were computed on the original values, not on the normalized [−1, 1] range sometimes used in the literature.

Figure 6 shows the RMSE values of the top three GWO-ANN models, and Figures 7 and 8 graphically illustrate the ANN 2L (7-13) model's performance, comparing the empirical model's predicted values with the experimental values for the training, testing, and complete data, respectively. The predicted values lie near the y = x line, which confirms the model's accuracy.

The floating bar chart of the prediction errors of the ANN 2L (7-13) on testing data is depicted in Figure 9. For most samples, the prediction error is less than 5 MPa, and for other less accurate predictions, the prediction error is less than about 15 MPa; however, for one test sample, the error is about 25 MPa. The ANN 2L (7-13) empirical model is chosen for further analysis.

4. Comparison of Different Approaches

Two separate models were developed to benchmark the grey wolf optimizer algorithm's performance in training an ANN for predicting SCC compressive strength: an ANN trained using the genetic algorithm (GA) and a multiple linear regression model.

4.1. Genetic Algorithm Model

The 169 architectures provided in Table 2 were trained using the GA to identify the best-performing ANN. The highest-performing GA-trained ANN had 11 and 9 neurons in the first and second hidden layers, respectively. To distinguish between the networks trained using GWO and GA, they will be designated ANN-GWO 2L (7-13) and ANN-GA 2L (11-9), respectively.

The GA parameters used for training ANN-GA 2L (11-9) were determined by trial and error (see Table 6). The floating bar chart of the prediction error of the GA-trained neural network is given in Figure 10; compared with the GWO-trained network, it shows a much higher prediction error. The predicted versus experimental values of the compressive strength of SCC are illustrated in Figure 11, where the points lie much farther from the y = x line than those of the ANN-GWO 2L (7-13) model.

4.2. Multiple Regression Model

For comparison, a simple-to-apply multiple linear regression (MLR) model [43] was developed in Minitab version 19 using the same dataset. The impact of each variable can be estimated from its regression coefficient [43, 44]. The final regression equation is as follows:

In equation (11), C refers to cement, CA to coarse aggregate, FA to fine aggregate, W to water, LP to limestone powder, F to fly ash, GGBFS to ground granulated blast furnace slag, SF to silica fume, RHA to rice husk ash, SP to superplasticizers, and VMA to viscosity modifying admixtures; the response variable is the 28-day compressive strength. The floating bar chart of the prediction error of the multiple linear regression model on testing data is plotted in Figure 12. Compared with the GWO-trained and GA-trained ANNs, the multiple linear regression model performs better than the GA-trained model but lacks the accuracy of the ANN-GWO 2L (7-13) model. Figures 11 and 13 show the experimental versus predicted values of the ANN-GA 2L (11-9) and MLR models on testing data; the MLR predictions fall reasonably close to the y = x line.

The statistical indices of MAE, MSE, EF, and VAF for ANN-GWO 2L (7-13), ANN-GA 2L (11-9), and multiple linear regression models based on the training and testing data are provided in Tables 7 and 8, respectively. The highest-performing empirical model is ANN-GWO 2L (7-13), followed by the multiple linear regression and ANN-GA 2L (11-9).

For a visual comparison of the three models, the Taylor diagram plot is shown in Figure 14, and the predictions of the three models for the testing data are plotted in Figure 15.

4.3. Sensitivity Analysis

Given that the ANN-GWO 2L (7-13) model offers superior performance compared to the ANN-GA 2L (11-9) and multiple linear regression models, we conducted a sensitivity analysis to measure the relative influence of each of the 11 input parameters on SCC's compressive strength. For this purpose, we implemented the profile method suggested by Lek [45, 46] in MATLAB [42]. The general idea behind this method is to vary each input variable while the others are kept at fixed values. The range of the selected input variable is divided into several equal intervals (the scale), the remaining variables are set to m different fixed values, and the network's output is calculated over the entire range of the selected variable, resulting in m groups of outputs. Finally, the m groups of outputs are combined by calculating the median output for every input case. The fixed values used for each variable were its min, Q1, median, Q3, and max (i.e., m = 5). For further details, please refer to [45, 46], which offer in-depth coverage of the theory and implementation of the procedure.
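
A sketch of this procedure follows, with the assumption that each input's relative importance is scored by the spread of its median response curve (one common convention; not necessarily the authors' exact implementation). `profile_importance` and `predict` are hypothetical names.

```python
import numpy as np

def profile_importance(predict, X, n_scale=12):
    """Lek's profile method sketch: sweep one input across its range
    (n_scale points) while the others are fixed at their min, Q1, median,
    Q3, and max; combine the five response curves by their median, and
    score each input by the spread of that median curve."""
    n_vars = X.shape[1]
    quantiles = np.quantile(X, [0.0, 0.25, 0.5, 0.75, 1.0], axis=0)
    spans = np.zeros(n_vars)
    for j in range(n_vars):
        grid = np.linspace(X[:, j].min(), X[:, j].max(), n_scale)
        curves = []
        for q in quantiles:                     # five fixed settings
            base = np.tile(q, (n_scale, 1))
            base[:, j] = grid                   # sweep variable j only
            curves.append([predict(row) for row in base])
        median_curve = np.median(np.array(curves), axis=0)
        spans[j] = median_curve.max() - median_curve.min()
    return spans / spans.sum()                  # relative importance (sums to 1)
```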

Figure 16 illustrates the relative importance and influence of the 11 input parameters on the response variable (SCC compressive strength). The most influential parameter is the cement content, to which 22.3% of the response variable variation is attributed. The next three most influential parameters are GGBFS, RHA, and fly ash, accounting for 18.3%, 17.3%, and 11.6% of the response variable variation, respectively. The least influential parameters are the fine aggregate (FA) content and viscosity modifying admixtures (VMA), with 0.3% and 1.5% of the response variable variation attributed to them, respectively.

Since the relative importance of FA (fine aggregate), W (water), LP (limestone powder), and VMA (viscosity modifying admixtures) is small (less than 5% each), these variables were removed from the model, and all 169 architectures were retrained with 7 inputs instead of 11. The statistics of the top 3 models, selected based on the lowest MSE for the training and testing data, are given in Tables 9 and 10.

The top-performing 7-input model, ANN-GWO (11-4), has MSE values of 114.99 and 178.12 for training and testing, respectively. Compared with the training and testing MSE values of 27.55 and 27.10 for the 11-input ANN-GWO (7-13) model, these values are substantially higher, indicating that the 7-input models are less accurate. Although the 7-input models are simpler, they will not be considered in the following sections.

4.4. The Predictive Model and ANN Weights

This study’s top two empirical models are ANN-GWO 2L (7-13) and multiple linear regression (MLR) models. MLR can be efficiently utilized; however, the ANN-GWO 2L (7-13) model would not be applicable unless the source code is provided. Hence, the weights and biases of the ANN-GWO model are provided in this section. As noted previously, the ANNs input data for each variable must be initially normalized by equation (8) and the max/min values in Table 1. Also, the network output must be denormalized using equation (15). The network input is an 11 × 1 vector entitled . The SCC’s compressive strength is computed as follows:where tanh is the hyperbolic tangent function, fpredicted is the predicted compressive strength value, fmax is the maximum compressive strength, fmin is the minimum compressive strength, and weight (θ) and bias (b) matrices are given as follows:

5. Conclusion

This paper aimed to predict the compressive strength of self-compacting concrete (SCC). To this end, a total of 205 experimental results were obtained from the literature, and the grey wolf optimizer (GWO) algorithm was employed to train artificial neural network models. The following points were concluded:

(1) For predicting the compressive strength of self-compacting concrete, the ANN-GWO 2L (7-13) model provides superior performance compared to the other ANN models. The model's mean squared error and model efficiency on test data were 27.10 and 0.94, respectively.

(2) A less accurate but easy-to-implement multiple regression model is provided to simplify the prediction process.

(3) Benchmarking the ANN-GWO 2L (7-13), ANN-GA 2L (11-9), and multiple regression models shows the ANN-GWO model's superior accuracy, followed by the multiple regression and ANN-GA models.

(4) The sensitivity analysis on the ANN-GWO 2L (7-13) model suggests that the most influential parameters on the compressive strength of self-compacting concrete are cement content, ground granulated blast furnace slag, rice husk ash, and fly ash. The fine aggregate content was the least influential factor.

(5) The weights and biases of the best-performing model, ANN-GWO 2L (7-13), are extracted, and a predictive matrix-based equation is developed for predicting the compressive strength of self-compacting concrete.

Data Availability

The datasets are available in Table 1 and at https://link.springer.com/article/10.1007/s00521-017-3007-7.

Conflicts of Interest

The authors declare no conflicts of interest regarding this paper.