#### Abstract

This study predicts the shear strength of reinforced concrete (RC) deep beams with an artificial neural network (ANN) trained by four algorithms, namely, Levenberg–Marquardt (ANN-LM), quasi-Newton method (ANN-QN), conjugate gradient (ANN-CG), and gradient descent (ANN-GD). A database containing 106 results of RC deep beam shear strength tests is collected and used to investigate the performance of the four algorithms. The ANN training phase uses 70% of the data, randomly taken from the collected dataset, whereas the remaining 30% are used to evaluate the algorithms. The ANN structure consists of an input layer with 9 neurons corresponding to 9 input parameters, a hidden layer of 10 neurons, and an output layer with 1 neuron representing the shear strength of RC deep beams. The models are evaluated using statistical criteria, including the correlation coefficient (*R*), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The results show that the ANN-CG model has the best prediction performance with *R* = 0.992, RMSE = 14.02, MAE = 14.24, and MAPE = 6.84. The ANN-CG model can therefore accurately predict the shear strength of RC deep beams, representing a promising and useful alternative design solution for structural engineers.

#### 1. Introduction

Deep beams are defined as load-bearing structural elements loaded as simple beams, in which a considerable amount of the load is transferred to the supports by a compression thrust joining the load and the reaction. Deep beams are characterized by a larger depth than conventional beams and are classified by the ratio of shear span to beam depth (*a*/*h*) or the ratio of span length to beam depth (*l*/*h*). Several design codes give conditions for defining deep beams. For instance, according to IS 456-2000, a beam is a deep beam when the ratio of effective span to overall depth (*l*/*h*) does not exceed 2.0 for a simple beam and 2.5 for a continuous beam [1]. ACI 318–14 [2] classifies a beam as a deep beam if it satisfies either of the following: (a) the clear span does not exceed four times the overall member depth, or (b) the shear span does not exceed twice the overall member depth. According to Eurocode 2 (EC2) [3], a beam is considered a deep beam when the ratio *l*/*h* is less than three.

Currently, RC deep beams are widely used in structural works, such as transfer beams, wall foundations, foundation pile caps, floor partition walls, and shear walls [4]. Deep beams play a crucial role in the design of large structures as well as small structures. In some cases, for architectural purposes, buildings are designed without any columns over a very large span. If normal beams are used in such cases, failures such as bending failures might occur, so deep beams are an effective solution that can increase the durability of structures [5–7]. Because of the large depth of deep beams, the primary type of damage is shear damage [8–10]. In deep beams, cracks often appear quite early, in the direction of the principal compressive stress and perpendicular to the direction of the tensile stress. In many cases, the crack appears vertical or inclined when the beam is damaged by shear force. This leads to a sudden failure of the beam as the beam depth increases [11]. In deep beams, the shear capacity can be 2 to 3 times greater than that given by the calculation methods developed for conventional beams. Therefore, unlike in a conventional flexural beam, the shear stress in a deep beam cannot be ignored. The stress distribution is not linear even in the elastic phase. At the ultimate state, the stress distribution no longer has the parabolic shape found in conventional beams, which is also a significant reason for slippage problems in deep beams [5].

In the past several decades, many methods have been proposed to analyze the shear strength of deep beams, including the strut-and-tie method (STM) [12, 13] and the upper limit theorem of plasticity theory [14, 15]. Based on the STM, theoretical methods to calculate the shear strength are proposed, such as compression field theory (CFT) and modified compression field theory (MCFT) [16, 17], the theory of softened strut-and-tie model (SSTM) considering the compression softening of concrete [18, 19], and strut-and-tie model based on the crack band theory [20, 21]. Besides, the current design codes, such as ACI 318–14 [2], EN 1992-1-1:2004 [3], and CSA A23.3–04 [22], have recommended the STM approach as a deep beam design tool. In addition, some in-depth studies have been carried out to analyze the shear behavior of deep beams as well as determine the most critical parameters affecting the shear strength. According to studies [4, 23–25], several important parameters have been identified, including compressive strength of concrete, yield strength of longitudinal and transverse reinforcement, the ratio of effective depth to breadth, as well as the main reinforcement ratio. In fact, the relationship between the parameters and the shear capacity of deep beams is nonlinear [9, 12, 23]. Consequently, building an accurate model that can accurately estimate shear strength based on mathematical equations is challenging [26]. Meanwhile, the deep beam shear strength obtained by experimental tests or numerical analysis is more or less limited because of the complexity of such kind of material and beam structure [12, 23]. To overcome these difficulties and to improve the ability to estimate the shear strength of deep beams, artificial intelligence (AI) approaches have been used in several investigations [27, 28].

Indeed, AI models have been applied effectively in the construction field to solve many problems, such as geotechnical engineering [29, 30], building materials [31, 32], and structural analysis and design [33–35]. The application of AI models to the shear strength of deep beams has been studied by many researchers. In the first such study, Goh [36] applied an artificial neural network (ANN) model in 1995 to predict beam shear resistance with 6 input parameters. Later, Sanad and Saka [37] also checked the effectiveness of the ANN model in predicting the shear strength of deep beams using 10 input parameters related to the geometry and material properties. The results showed that ANN provides an effective alternative for predicting the shear strength of reinforced concrete (RC) deep beams. The ANN is a widely used machine learning (ML) prediction tool, but the selection of an appropriate ANN training algorithm remains an open question. It is challenging to find the ANN model that predicts the target accurately while balancing factors such as processing speed, numerical precision, and memory requirements. This optimization problem lies in the learning process of the neural network and can be addressed by choosing an appropriate training algorithm. Four principal training algorithms are commonly used for ANN, namely, Levenberg–Marquardt (ANN-LM), quasi-Newton method (ANN-QN), conjugate gradient (ANN-CG), and gradient descent (ANN-GD). A given training algorithm might suit one problem but fail on another [38]. Gradient descent is the slowest of the four but requires the least memory, whereas Levenberg–Marquardt is the fastest but requires the most memory. Therefore, an in-depth investigation is needed to determine the best training algorithm in general and for predicting the shear strength of deep beams in particular.
Besides, the basis of selecting the best ANN black-box raises a number of fundamental questions, especially the criterion to define the best one. In the field of ML, the performance evaluation of the models is assessed by different metrics [39–41], namely, the correlation coefficient (*R*) or the coefficient of determination (*R*^{2}), mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). A concise evaluation and comparison of different criteria need to be conducted to confirm ML models’ effectiveness.

Therefore, in this study, the best ANN algorithm is determined by comparing different ANN training algorithms and evaluation metrics, with the ultimate aim of accurately and reliably predicting the shear strength of deep beams. To achieve this goal, the deep beam database is first constructed by gathering experimental results published in the literature. The general theory of ANN models is then presented, including the four previously mentioned training algorithms. An architecture of ANN models is proposed, along with an extensive investigation of the number of ANN training epochs. The best ANN algorithm is deduced by comparing different performance metrics and the corresponding probability density functions, taking into account the random sampling effect that arises when the two datasets are constructed. Finally, representative results in predicting the shear strength of deep beams are presented and compared with several existing prediction results in the available literature.

#### 2. Significance of the Research Study

Accurate prediction of the deep beam shear strength is crucial in construction design. Several machine learning models have been proposed in the available literature to predict the shear strength of deep beams, namely, genetic-simulated annealing [4], backpropagation neural network [42], artificial neural network [43], gene expression programming [43], support vector machine [42], multivariate adaptive regression splines [42], smart artificial firefly colony algorithm and least squares support vector regression [24], and adaptive neural fuzzy inference system [44]. Nevertheless, the prediction accuracy and reliability could be further improved. The contributions of the present investigation are as follows:

(1) Four representative training algorithms for the ANN model are investigated to predict the shear strength of deep beams, in which the training epoch of each model is fine-tuned.

(2) The reliability of the ANN models is carefully evaluated by Monte Carlo simulations with a random sampling strategy to construct the database.

(3) The model using the conjugate gradient algorithm (ANN-CG) containing 10 neurons in the hidden layer is deduced as the best predictor.

(4) The performance of the best ANN-CG architecture is compared with 10 previously published works in the literature and achieves the highest value of the correlation coefficient (*R*) and the lowest value of the mean absolute error (MAE). Thus, the simplicity and effectiveness of the proposed approach using ANN-CG are confirmed.

#### 3. Database Construction

In this study, the database used to develop the ML models is collected from published research. The dataset includes 106 test results of the shear strength of deep beams. Specifically, 19 test results of high-strength RC deep beams are collected in the study by Tan et al. [45], 52 test results from the work of Smith et al. [46], and 35 test results from the work of Kong et al. [47]. This database includes various parameters affecting the shear strength of RC deep beams (denoted as *V*), including the ratio of effective span to effective depth (*L*/*d*), ratio of effective depth to breadth (*d*/*b*_{w}), ratio of shear span to effective depth (*a*/*d*), concrete cylinder strength (*f’*_{c}), yield strength of horizontal reinforcement (*f*_{yh}), yield strength of vertical web reinforcement (*f*_{yv}), ratio of horizontal web reinforcement (*ρ*_{h}), ratio of longitudinal reinforcement to concrete area (*ρ*_{s}), and ratio of vertical web reinforcement (*ρ*_{v}). Representative information on these parameters is detailed in Table 1. Besides, the histograms of each input and output parameter are shown in Figure 1. The beam test diagram and a schematic illustration of RC deep beams are illustrated in Figure 2. Prior to the training process of ANN models, all input and output values are normalized in the range of [0, 1] and then converted back to the initial range of values for the sake of clarity and postprocessing processes.
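The scaling step mentioned above can be sketched as follows. This is a minimal illustration that assumes plain min-max scaling, since the exact normalization formula is not stated in the text; the function names and the sample values are hypothetical.

```python
def fit_min_max(values):
    """Return the (min, max) pair used to scale one variable."""
    return min(values), max(values)


def normalize(values, lo, hi):
    """Map values linearly onto the range [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(scaled, lo, hi):
    """Invert normalize(): map [0, 1] back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical shear strengths in kN, for illustration only.
shear_kn = [120.0, 250.0, 400.0, 675.0]
lo, hi = fit_min_max(shear_kn)
scaled = normalize(shear_kn, lo, hi)
restored = denormalize(scaled, lo, hi)
```

In such a scheme the minimum and maximum would normally be stored with the model, so that predictions can be converted back to physical units.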

**Figure 1.** Histograms of the input and output parameters, panels (a)–(j).

The database is randomly divided into two parts, which gives rise to the random sampling effect. The first part (70% of the total data, 74 samples) is used to train the ANN and is called the training part. The second part (the remaining 30% of the data, 32 samples) is used to verify the ANN models and is referred to as the testing part. The random sampling effect generates variability in the input space of the training part, which considerably affects the accuracy of the ML models. The purpose of separating the training and testing parts in machine learning problems is to assess the accuracy of the models fairly, as the testing data are entirely unknown to the model during the training phase. In general, the prediction capacity of the model is the most important factor. Therefore, the results in the next sections focus only on the evaluation criteria of the testing part.
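A random 70/30 split of this kind can be sketched as below. The function name and the seed are illustrative, and truncating 106 × 0.7 = 74.2 to 74 is an assumption chosen to reproduce the 74/32 sample counts quoted above.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=None):
    """Shuffle sample indices and cut them into training and testing parts."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(n_samples * train_fraction)  # 106 * 0.7 = 74.2 -> 74
    return indices[:n_train], indices[n_train:]


# A different seed gives a different split, which is the random sampling effect.
train_idx, test_idx = split_indices(106, seed=0)
```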

#### 4. General Presentation of ANN Models

An artificial neural network (ANN) is a computational model inspired by the human brain and its many biological neurons. It consists of many artificial neurons interconnected in a network that maps input data to output data. To produce an output from the input data, a set of learning rules is used, called backpropagation or backward propagation of error. A backpropagation network is a combination of different layers, including the input layer, the output layer, and the hidden layer. The input layer is the first layer, the output layer is the last one, and the two are connected by the hidden part of the network, which may contain one or many hidden layers (Figure 3).

During the training phase of the algorithm, ANN learns to recognize patterns from the input data. Then, it compares the produced result with the desired output. The difference between the two results is adjusted through a backward working process until such a difference is lower than a predefined criterion. Therefore, to train a neural network, the selection of an appropriate training algorithm is very important. The training algorithms are the underlying engines for building neural network models with the goal of training features or patterns from the input data so that a set of internal model parameters can be found to optimize the model’s accuracy. There are many types of training algorithms, but frequently used ones can be listed as gradient descent, conjugate gradient, quasi-Newton method, and Levenberg–Marquardt algorithms.

##### 4.1. Training Algorithms of ANN Model

###### 4.1.1. Gradient Descent Algorithm (ANN-GD)

Gradient descent is an iterative optimization algorithm used in ML and deep learning problems to find the set of internal variables that optimizes a model. Here, the “gradient” is the rate of inclination or declination of a slope, and “descent” means descending. Gradient descent proceeds in 3 steps, namely, (1) initializing the internal variables, (2) evaluating the model based on the internal variables and the loss function, and (3) updating the internal variables in the direction of the optimal point. The iteration step of the gradient descent method is

$$\mathbf{w}^{(i+1)} = \mathbf{w}^{(i)} - \eta^{(i)}\,\mathbf{g}^{(i)}, \quad i = 0, 1, \ldots,$$

where $\mathbf{w}^{(i)}$ is the set of variables to be updated, $\mathbf{g}^{(i)} = \nabla f(\mathbf{w}^{(i)})$ is the gradient of the loss function *f* with respect to $\mathbf{w}^{(i)}$, and $\eta^{(i)}$ is the training rate, which can be a fixed value or determined by one-dimensional optimization along the training direction at each step. Optimizing the loss function means finding the points that minimize (or maximize) it, and the goal of the gradient descent method is to reach such a minimum, ideally the global one. The stopping criterion of the gradient descent method can be (i) the maximum number of epochs is reached, (ii) the value of the loss function is small enough and the accuracy of the model is high enough, or (iii) the value of the loss function remains stable after a finite number of epochs. The gradient descent algorithm is often used with large neural networks. Its advantage is that only the gradient vector needs to be stored, not the Hessian matrix. The diagram for the training process with gradient descent is shown in Figure 4.
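The update rule and two of the stopping criteria can be illustrated on a toy problem. The sketch below assumes a fixed training rate and a simple two-variable quadratic loss instead of a network loss; the names `w`, `g`, `eta`, and `tol` are illustrative choices.

```python
def loss(w):
    """Illustrative convex loss with its minimum at w = (3, -1)."""
    return (w[0] - 3.0) ** 2 + 2.0 * (w[1] + 1.0) ** 2


def gradient(w):
    """Analytical gradient of the loss above."""
    return [2.0 * (w[0] - 3.0), 4.0 * (w[1] + 1.0)]


def gradient_descent(w, eta=0.1, max_epochs=1000, tol=1e-12):
    """Repeat w <- w - eta * g until the loss is small or epochs run out."""
    epochs_used = 0
    while epochs_used < max_epochs and loss(w) >= tol:
        g = gradient(w)
        w = [wj - eta * gj for wj, gj in zip(w, g)]
        epochs_used += 1
    return w, epochs_used


w_gd, epochs_gd = gradient_descent([0.0, 0.0])
```

Only the current variables and one gradient vector are held in memory at any time, which is the storage advantage noted above.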

###### 4.1.2. Conjugate Gradient Algorithm (ANN-CG)

The conjugate gradient algorithm can be regarded as a way to improve the convergence rate of the artificial neural network, intermediate between gradient descent and Newton’s method. Its advantage is that there is no need to evaluate, store, and invert the Hessian matrix. In this algorithm, the search is performed along conjugate directions, which generally produce faster convergence than gradient descent directions. These training directions are conjugate with respect to the Hessian matrix. The sequence of training directions is built using the following formula:

$$\mathbf{y}^{(i+1)} = -\mathbf{g}^{(i+1)} + c^{(i)}\,\mathbf{y}^{(i)}, \quad i = 0, 1, \ldots,$$

with the initial training direction vector

$$\mathbf{y}^{(0)} = -\mathbf{g}^{(0)},$$

where $\mathbf{y}$ is the training direction vector, $\mathbf{g}$ is the gradient of the loss function, and $c$ is the conjugate parameter.

In all variants of the algorithm, the training direction is periodically reset to the negative of the gradient [48]. The parameters’ improvement process with the conjugate gradient algorithm is defined by

$$\mathbf{w}^{(i+1)} = \mathbf{w}^{(i)} + \eta^{(i)}\,\mathbf{y}^{(i)}, \quad i = 0, 1, \ldots,$$

where $\eta^{(i)}$ is the training rate, usually found by line minimization. The diagram for the training process with the conjugate gradient is shown in Figure 4.
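As a rough illustration of conjugate directions, the sketch below minimizes a two-variable quadratic loss f(w) = ½ wᵀHw − bᵀw. Several choices here are assumptions and are not taken from the study: the Fletcher–Reeves formula is used for the conjugate parameter, and the line minimization is done in closed form, which is possible only because the toy loss is quadratic with a known matrix `H`. In network training the Hessian is not available and a numerical line search would take its place.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


# Toy quadratic loss f(w) = 0.5 * w^T H w - b^T w (illustrative values).
H = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def gradient(w):
    return [hw - bi for hw, bi in zip(mat_vec(H, w), b)]


def conjugate_gradient(w, n_steps):
    g = gradient(w)
    y = [-gi for gi in g]  # initial direction: negative gradient
    for _ in range(n_steps):
        if dot(g, g) < 1e-20:  # gradient already negligible
            break
        # Exact line minimization along y (closed form for a quadratic).
        eta = -dot(g, y) / dot(y, mat_vec(H, y))
        w = [wi + eta * yi for wi, yi in zip(w, y)]
        g_new = gradient(w)
        c = dot(g_new, g_new) / dot(g, g)  # Fletcher-Reeves parameter
        y = [-gn + c * yi for gn, yi in zip(g_new, y)]
        g = g_new
    return w


# Two variables, so two conjugate steps reach the minimum (1/11, 7/11).
w_cg = conjugate_gradient([2.0, 1.0], n_steps=2)
```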

###### 4.1.3. Quasi-Newton Algorithm (ANN-QN)

The Hessian matrix is composed of the second partial derivatives of the loss function. The advantage of the quasi-Newton method is that it is computationally inexpensive, because it avoids the many operations needed to evaluate the Hessian matrix and calculate its inverse. Instead, an approximation of the inverse Hessian matrix is built at each iteration, using only information on the first derivatives of the loss function. The quasi-Newton formula is

$$\mathbf{w}^{(i+1)} = \mathbf{w}^{(i)} - \eta^{(i)}\,\mathbf{G}^{(i)}\,\mathbf{g}^{(i)}, \quad i = 0, 1, \ldots,$$

where $\mathbf{G}^{(i)}$ is the inverse Hessian approximation, $\mathbf{g}^{(i)}$ is the gradient of the loss function, and $\eta^{(i)}$ is the training rate. The quasi-Newton method is commonly used because it is faster than gradient descent and conjugate gradient. The diagram of the quasi-Newton method is shown in Figure 4.
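One common way to build the inverse Hessian approximation from first derivatives is the BFGS update, sketched below. BFGS is an assumption here, since the text does not name the specific quasi-Newton variant. The toy quadratic loss and its matrix `H` are illustrative, and `H` is used only to compute the gradient and the closed-form step length.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def mat_mul(a, b):
    cols = list(zip(*b))
    return [[dot(row, col) for col in cols] for row in a]


# Toy quadratic loss f(w) = 0.5 * w^T H w - b^T w (illustrative values).
H = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def gradient(w):
    return [hw - bi for hw, bi in zip(mat_vec(H, w), b)]


def bfgs_update(G, s, q):
    """BFGS update of the inverse Hessian approximation G.

    s is the parameter step and q the change in gradient over that step.
    """
    n = len(s)
    rho = 1.0 / dot(q, s)
    left = [[(i == j) - rho * s[i] * q[j] for j in range(n)] for i in range(n)]
    right = [[(i == j) - rho * q[i] * s[j] for j in range(n)] for i in range(n)]
    core = mat_mul(mat_mul(left, G), right)
    return [[core[i][j] + rho * s[i] * s[j] for j in range(n)] for i in range(n)]


def quasi_newton(w, max_iter=20, tol=1e-20):
    n = len(w)
    G = [[float(i == j) for j in range(n)] for i in range(n)]  # start from identity
    g = gradient(w)
    for _ in range(max_iter):
        if dot(g, g) < tol:
            break
        d = [-x for x in mat_vec(G, g)]           # search direction -G g
        eta = -dot(g, d) / dot(d, mat_vec(H, d))  # closed-form step (quadratic only)
        s = [eta * di for di in d]
        w = [wi + si for wi, si in zip(w, s)]
        g_new = gradient(w)
        q = [gn - gi for gn, gi in zip(g_new, g)]
        G = bfgs_update(G, s, q)
        g = g_new
    return w


w_qn = quasi_newton([2.0, 1.0])
```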

###### 4.1.4. Levenberg–Marquardt Algorithm (ANN-LM)

The Levenberg–Marquardt (LM) algorithm, also called the damped least squares method, is used to solve nonlinear least squares problems. Instead of computing the exact Hessian matrix, this algorithm works with the gradient vector and the Jacobian matrix. The loss function is expressed as a sum of squared errors:

$$f = \sum_{i=1}^{a} u_i^{2},$$

where *a* is the number of instances in the dataset and $\mathbf{u}$ is the vector of all error terms. The Jacobian matrix of the error terms is defined as follows:

$$A_{i,j} = \frac{\partial u_i}{\partial w_j}, \quad i = 1, \ldots, a, \quad j = 1, \ldots, b,$$

where *b* is the number of parameters in the neural network and $\mathbf{A}$ is the Jacobian matrix, of size *a* × *b*. The gradient vector of the loss function is calculated as

$$\nabla f = 2\,\mathbf{A}^{T}\mathbf{u}.$$

The Hessian matrix is approximated by

$$\mathbf{B} \approx 2\,\mathbf{A}^{T}\mathbf{A} + \beta\,\mathbf{I},$$

where $\mathbf{B}$ is the Hessian matrix, *β* is a damping factor that ensures the positive definiteness of the Hessian approximation, and $\mathbf{I}$ is the identity matrix. A large value of *β* is chosen in the first step. Then, if an iteration fails to reduce the loss, *β* is increased by some factor. On the contrary, if the loss decreases, *β* is decreased so that the Levenberg–Marquardt algorithm approaches the Newton method. Finally, the parameters’ improvement process using the Levenberg–Marquardt algorithm is defined as

$$\mathbf{w}^{(i+1)} = \mathbf{w}^{(i)} - \left(2\,\mathbf{A}^{(i)T}\mathbf{A}^{(i)} + \beta^{(i)}\,\mathbf{I}\right)^{-1}\left(2\,\mathbf{A}^{(i)T}\mathbf{u}^{(i)}\right), \quad i = 0, 1, \ldots$$

The diagram of the ANN-LM training algorithms is shown in Figure 4.
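The damping logic can be sketched on a small curve-fitting problem. The two-parameter exponential model, the synthetic data, the starting point, and the factor of 10 used to adjust the damping are all assumptions made for illustration; a network would have many more parameters and a linear solver in place of the 2 × 2 closed form.

```python
import math

# Synthetic, noise-free data generated from y = 2 * exp(0.8 * x).
x_data = [0.0, 0.5, 1.0, 1.5, 2.0]
y_data = [2.0 * math.exp(0.8 * x) for x in x_data]


def residuals(p):
    """Error terms u_i = model(x_i) - y_i for the model p0 * exp(p1 * x)."""
    return [p[0] * math.exp(p[1] * x) - y for x, y in zip(x_data, y_data)]


def jacobian(p):
    """Rows hold the derivatives of u_i with respect to p0 and p1."""
    return [[math.exp(p[1] * x), p[0] * x * math.exp(p[1] * x)] for x in x_data]


def loss(p):
    return sum(u * u for u in residuals(p))


def lm_step(p, beta):
    """One damped step: p - (2 A^T A + beta I)^-1 (2 A^T u)."""
    u, A = residuals(p), jacobian(p)
    b00 = 2.0 * sum(r[0] * r[0] for r in A) + beta
    b01 = 2.0 * sum(r[0] * r[1] for r in A)
    b11 = 2.0 * sum(r[1] * r[1] for r in A) + beta
    g0 = 2.0 * sum(r[0] * ui for r, ui in zip(A, u))
    g1 = 2.0 * sum(r[1] * ui for r, ui in zip(A, u))
    det = b00 * b11 - b01 * b01  # solve the 2x2 system in closed form
    d0 = (b11 * g0 - b01 * g1) / det
    d1 = (b00 * g1 - b01 * g0) / det
    return [p[0] - d0, p[1] - d1]


def levenberg_marquardt(p, beta=10.0, n_iter=100):
    current = loss(p)
    for _ in range(n_iter):
        trial = lm_step(p, beta)
        trial_loss = loss(trial)
        if trial_loss < current:  # success: accept step, reduce damping
            p, current = trial, trial_loss
            beta /= 10.0
        else:                     # failure: reject step, increase damping
            beta *= 10.0
    return p


p_fit = levenberg_marquardt([1.0, 0.5])
```

With small damping the step approaches a Gauss–Newton step, and with large damping it shrinks toward a short gradient descent step.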

##### 4.2. Validation of Models

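To evaluate the performance of the machine learning models, four indexes are used in this investigation, namely, root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and correlation coefficient (*R*). The RMSE measures the difference between the actual and predicted values. The MAE is the average absolute error between the actual and predicted values. The MAPE is the average absolute difference between the actual and predicted values divided by the actual value, expressed in percent. The lower the RMSE, MAE, and MAPE values, the higher the accuracy and the better the performance of the models. On the contrary, higher *R* values mean higher model performance. The *R* value varies in the range from −1 to 1. Values of *R* close to 0 indicate poor performance of the model, and values close to 1 indicate good accuracy. The values of RMSE, MAE, MAPE, and *R* are defined by the following formulas:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(Q_{AV,k} - Q_{PV,k}\right)^{2}},$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{k=1}^{N}\left|Q_{AV,k} - Q_{PV,k}\right|,$$

$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{k=1}^{N}\left|\frac{Q_{AV,k} - Q_{PV,k}}{Q_{AV,k}}\right|,$$

$$R = \frac{\sum_{k=1}^{N}\left(Q_{AV,k} - \overline{Q}_{AV}\right)\left(Q_{PV,k} - \overline{Q}_{PV}\right)}{\sqrt{\sum_{k=1}^{N}\left(Q_{AV,k} - \overline{Q}_{AV}\right)^{2}\sum_{k=1}^{N}\left(Q_{PV,k} - \overline{Q}_{PV}\right)^{2}}},$$

where *N* is the number of samples, $Q_{AV}$ and $\overline{Q}_{AV}$ are the actual and the average actual values, and $Q_{PV}$ and $\overline{Q}_{PV}$ are the predicted and the average predicted values.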
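The four criteria can be computed with a few lines of code. The sketch below uses the standard definitions of these metrics, with MAPE expressed in percent; the sample values are hypothetical and are not taken from the database.

```python
import math


def rmse(actual, predicted):
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))


def correlation(actual, predicted):
    """Pearson correlation coefficient R."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical actual and predicted shear strengths in kN.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
scores = {
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
    "R": correlation(actual, predicted),
}
```

For any dataset the MAE cannot exceed the RMSE, which gives a quick consistency check on reported values.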

#### 5. Methodology Flowchart

In this study, the flowchart of the proposed methodology includes the following steps:

(a) Data collection: in this first step, the dataset is built by gathering data from the available literature. All data are randomly divided into 2 parts, training data and testing data, in which the training part accounts for 70% of the dataset and the testing part accounts for 30% of the dataset.

(b) Building models: in this step, the data of the training part are used to train the models with the training algorithms, namely, gradient descent, conjugate gradient, quasi-Newton, and Levenberg–Marquardt.

(c) Model validation: in this final step, the data of the testing part are used to validate the proposed models. Statistical indicators including RMSE, MAE, MAPE, and *R* are utilized to evaluate the models.

A schematic diagram of the methodology is illustrated in Figure 5.
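The three steps, repeated over many random splits, can be sketched as follows. To keep the example short and self-contained, a one-variable least-squares line stands in for the ANN and the 106 samples are synthetic; both are assumptions for illustration and do not reproduce the models or data of the study.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Stand-in model: ordinary least-squares line through the training data."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def rmse(actual, predicted):
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def monte_carlo(xs, ys, n_runs=300, train_fraction=0.7, seed=0):
    """Repeat split -> train -> test and collect the testing RMSE of each run."""
    rng = random.Random(seed)
    n_train = int(len(xs) * train_fraction)
    test_scores = []
    for _ in range(n_runs):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in test]
        test_scores.append(rmse([ys[i] for i in test], preds))
    return test_scores


# Synthetic dataset of 106 samples (illustrative only).
data_rng = random.Random(1)
xs = [data_rng.uniform(0.0, 1.0) for _ in range(106)]
ys = [3.0 * x + 1.0 + data_rng.gauss(0.0, 0.1) for x in xs]

test_scores = monte_carlo(xs, ys)
mean_rmse = statistics.mean(test_scores)
std_rmse = statistics.stdev(test_scores)
```

The mean and standard deviation of the collected scores summarize accuracy and stability across random splits.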

#### 6. Results and Discussion

The definition of the ANN structure is critical in solving problems [49, 50]. When the numbers of inputs and outputs are fixed, the performance of the ANN model depends on the number of hidden layers and the number of neurons in each hidden layer. Cybenko [51] and Bound [52] have succeeded in using a single hidden layer model in classifying the input variables for model processing. Besides, some studies [53–55] have shown that an ANN model with only one hidden layer can be enough to capture a complex nonlinear relationship between input(s) and output. Therefore, one hidden layer is proposed for the structure of the ANN model in this investigation. Moreover, semiempirical relationships proposed by Nagendra [56], Tamura [57], and some investigations [58–60] recommend that the number of neurons in the hidden layer be equal to the total number of inputs and outputs. In the current database, there are 9 inputs and 1 output, the latter being the shear strength of deep beams. Therefore, 10 neurons are proposed for the hidden layer of the ANN. The sigmoid activation function is selected for the hidden layer, while the activation function for the output layer is linear. The mean square error is chosen as the cost function. Due to the random sampling effect, 300 simulations are performed to obtain reliable results.
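The resulting 9-10-1 architecture can be sketched as a forward pass. The weight initialization range, the dictionary layout, and all function names below are illustrative assumptions; only the layer sizes and the sigmoid and linear activations come from the description above.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 9, 10, 1


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(seed=0):
    """Random small weights and zero biases (illustrative initialization)."""
    rng = random.Random(seed)
    return {
        "w_hidden": [
            [rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
            for _ in range(N_HIDDEN)
        ],
        "b_hidden": [0.0] * N_HIDDEN,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)],
        "b_out": 0.0,
    }


def forward(x, net):
    """Sigmoid hidden layer followed by a single linear output neuron."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(net["w_hidden"], net["b_hidden"])
    ]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]


def count_parameters():
    """Weights plus biases: 10 * (9 + 1) + 1 * (10 + 1)."""
    return N_HIDDEN * (N_INPUTS + 1) + N_OUTPUTS * (N_HIDDEN + 1)


net = init_network()
output = forward([0.5] * N_INPUTS, net)  # one normalized input vector
n_params = count_parameters()
```

Under these assumptions the network has 111 trainable parameters, which is small enough for all four training algorithms to run on the 74 training samples at low cost.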

The main purpose of this work is to investigate the performance of four ANN models to predict the shear strength of deep beams, trained by the four algorithms, namely, Levenberg–Marquardt (ANN-LM), quasi-Newton method (ANN-QN), conjugate gradient (ANN-CG), and gradient descent (ANN-GD). The training process is repeated until the network output error reaches an acceptable value (less than the initial specified error threshold). In this study, the network training is performed with various epoch numbers, ranging from 100 to 1000 with a step of 100. Finally, Table 2 summarizes the characteristics of the ANN models proposed in this study.

##### 6.1. Comparison of ANN Models’ Prediction Capability

The results of the network training by the different algorithms are evaluated by the criteria *R*, RMSE, MAE, and MAPE. Figures 6(a)–6(d) show the mean and standard deviation (std) of *R*, RMSE, MAE, and MAPE as a function of the epoch number for the testing part of the ANN-LM algorithm. Similarly, Figures 7(a)–7(d) show the mean and std of *R*, RMSE, MAE, and MAPE as a function of the epoch number for the testing parts obtained with the ANN-QN, ANN-CG, and ANN-GD algorithms. For the ANN-LM model, the mean and std of *R* decrease with a higher number of epochs, and the mean and std of RMSE, MAE, and MAPE increase. This behavior shows that the accuracy of the ANN-LM model is highly affected by the number of epochs and decreases as the number of epochs grows. In other words, the ANN-LM model reaches its best accuracy within a small number of epochs, and the same conclusion can be drawn for the ANN-CG model. However, the opposite is found for the ANN-QN and ANN-GD models, in which the mean and std of *R* increase and those of RMSE, MAE, and MAPE decrease with a higher number of epochs. Thus, the accuracy of the ANN-QN and ANN-GD models increases with a higher number of epochs.

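**Figure 6.** Mean and std values of (a) *R*, (b) RMSE, (c) MAE, and (d) MAPE versus epoch number for the testing part of the ANN-LM algorithm.

**Figure 7.** Mean and std values of (a) *R*, (b) RMSE, (c) MAE, and (d) MAPE versus epoch number for the testing parts of the ANN-QN, ANN-CG, and ANN-GD algorithms.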

Moreover, Table 3 details the values and std of *R*, RMSE, MAE, and MAPE of four models with different epoch numbers, varying from 100 to 1000 with a step of 100. It is found that the accuracy of the ANN-LM model is very low, where the maximum value of *R* is only 0.747 at 100 epochs. Similarly, for the ANN-CG model, at 100 epochs, the highest *R* value is 0.971. Therefore, the optimal ANN-LM and ANN-CG model is at 100 epochs. In contrast, with ANN-QN and ANN-GD algorithms, the highest value of *R* is *R* = 0.961 and *R* = 0.969, respectively. Besides, the std values of the three criteria RMSE, MAE, and MAPE of the ANN-LM model are the highest compared to the other models. This shows that the ANN-LM model has the lowest accuracy among the 4 models.

Next, the values of the criteria RMSE, MAE, and MAPE of the three remaining models are compared. For the ANN-CG model, these criteria are lowest at 100 epochs, whereas the lowest values are reached at 900 epochs for the ANN-QN model and at 1000 epochs for the ANN-GD model. This analysis shows that the ANN-CG model achieves the best accuracy with the smallest number of epochs. For large numbers of epochs, the ANN-GD model is superior to the ANN-QN model. Therefore, a reliability evaluation of these three models is performed in the following sections. The ANN-LM model, which has the lowest accuracy in predicting deep beam shear strength, is not considered further.

##### 6.2. Reliability Evaluation of the Best ANN Training Algorithms

The main purpose of this section is to evaluate the reliability of the three models, including the optimal ANN-CG at 100 epochs, the optimal ANN-GD at 900 epochs, and the optimal ANN-QN at 1000 epochs. Figure 8 shows the distribution of the probability density function (PDF) of the four statistical criteria for the training part, namely, *R* (Figure 8(a)), RMSE (Figure 8(b)), MAE (Figure 8(c)), and MAPE (Figure 8(d)), over 300 simulations performed using the mentioned ANN structures. Meanwhile, Figures 9(a)–9(d) show the corresponding distribution of the probability density function for the testing part. According to observation, the PDF curves for the four statistical criteria of the training and testing parts of the ANN-CG model are the narrowest, and the best values of the four criteria are better than the two other algorithms.

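**Figure 8.** Probability density functions of (a) *R*, (b) RMSE, (c) MAE, and (d) MAPE for the training part over 300 simulations.

**Figure 9.** Probability density functions of (a) *R*, (b) RMSE, (c) MAE, and (d) MAPE for the testing part over 300 simulations.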

Simultaneously, Table 4 presents in detail the values of the four statistical criteria (maximum, minimum, average, and standard deviation) according to the three proposed ANN models. Considering the testing part, the values of average and standard deviation for the case of *R* are 0.961 and 0.032 for ANN-QN, 0.971 and 0.02 for ANN-CG, and 0.966 and 0.032 for ANN-GD, respectively. For RMSE, these are 32.8 and 8.62 for ANN-QN, 28.07 and 8.44 for ANN-CG, and 29.44 and 9.01 for ANN-GD. In the case of MAE, the average and standard deviation are 22.56 and 4.89 for ANN-QN, 19.26 and 4.75 for ANN-CG, and 20.28 and 5.18 for ANN-GD. Finally, these values of MAPE are 12.78 and 3.81, respectively, for ANN-QN, 11.05 and 3.45 for ANN-CG, and 11.35 and 3.5 for ANN-GD. Thus, in terms of the average value, the ANN-CG model outperforms the other two models, with the average value of *R* being the highest. Meanwhile, the mean values of RMSE, MAE, and MAPE are the lowest. More importantly, the standard deviation values obtained from the ANN-CG model are also the lowest, which shows that ANN-CG is the most stable and reliable method. The results show that the ANN-CG is the most reliable training algorithm for predicting the shear strength of the deep beam. Therefore, the ANN-CG model is chosen to predict the shear strength of the deep beam in the next section.

##### 6.3. Prediction of Beam Shear Strength Using the Best ANN Model

In this section, the capability of the ANN-CG model to predict the shear strength of deep beams is investigated. The selection criterion of the best model depends on the values of the four statistical criteria used in this study. Therefore, 4 cases of performance evaluation are considered, namely, (1) maximum value of *R*, (2) minimum value of RMSE, (3) minimum value of MAE, and (4) minimum value of MAPE. With respect to each case, the values of the criteria, as well as the standard deviation and mean error, are detailed in Table 5.

In the first case, the maximum value of *R* is 0.993 for both training and testing parts. For the second case, the minimum RMSE value is 14.73 for the training part and 14.02 for the testing part. The minimum value of MAE is considered in Case 3, where MAE = 9.88 for the training part, and MAE = 10.06 for the testing part. The last case finds the minimum MAPE of 6 and 5.79 with the training and testing parts, respectively.

In analyzing the results presented in Table 5, the prediction performance is evaluated through the 4 criteria for the testing part. The value of *R* differs only slightly among Cases 1, 3, and 4. The difference between the minimum MAPE of Case 4 and the MAPE values of Cases 1, 2, and 3 is relatively small, especially compared with the differences in the RMSE and MAE values. This means that Cases 2 and 3 have higher performance than the two other cases. However, Case 3 has a higher RMSE value than Case 2. Therefore, Case 2 gives the best shear strength prediction performance.

The error diagrams of the training and the testing parts of the ANN-CG model are presented in Figures 10(a) and 10(b). According to the results, the number of samples with errors out of the range from −30 to 40 kN is small (only 2 samples) for the training part. The error of the testing part is lower than that of the training phase. Besides, the cumulative red lines show that 80% of error is in the range from −20 to 20 kN for the training part, whereas about 90% of error is within −15 to 20 kN for the testing part.

**Figure 10.** Error diagrams of the ANN-CG model for (a) the training part and (b) the testing part.

Finally, Figures 11(a) and 11(b) show a regression model representing the correlation between the actual and predicted shear strength values for the training and testing parts, respectively. A linear fit is also applied and plotted in each case. The calculated values of *R* are 0.993 for the training part and 0.992 for the testing part. The values of RMSE, MAE, and MAPE for the training and testing parts are shown in Table 5.

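**Figure 11.** Regression between actual and predicted shear strength for (a) the training part and (b) the testing part.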

The results of this investigation are also compared with previously published results obtained with other predictive methods, as summarized in Table 6. With the artificial neural network-conjugate gradient (ANN-CG) model of this study, the shear strength prediction for deep beams appears to be the best, with the highest value of *R*, the lowest value of MAE, and almost the lowest values of RMSE and MAPE. More importantly, among the four algorithms compared in this study, ANN-CG is the best predictor both in terms of accuracy in estimating the shear strength of deep beams and in terms of computation time (i.e., best performance at 100 iterations). Furthermore, it demands less memory and computational cost than the other algorithms. This implies that predicting deep beam shear strength with the ANN-CG algorithm does not require a high-performance computer. Usually, hybrid ML algorithms take a longer computation time than standalone ones. However, given the prediction accuracy achieved in this study, the development of a hybrid approach would not be necessary. Overall, this confirms the effectiveness of the ANN-CG model proposed in this study, suggesting a promising and useful alternative design solution for structural engineers. For practical applications, the final weight and bias values of the best ANN-CG model are given in Table 7 and could be used to develop a supporting numerical tool for estimating the shear strength of deep beams.

#### 7. Conclusion

In this study, an artificial neural network (ANN) model is proposed to predict the shear strength of deep beams. For this purpose, a database of 106 results from shear tests of RC deep beams is built from the available literature. The ANN model is built with 9 input parameters divided into two groups, namely, the geometric parameters and the parameters representing the material properties. Four training algorithms of ANN are explored, namely, Levenberg–Marquardt (ANN-LM), quasi-Newton method (ANN-QN), conjugate gradient (ANN-CG), and gradient descent (ANN-GD), and their prediction performance is compared. Four statistical criteria, namely, the correlation coefficient (*R*), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), are used to validate and evaluate the performance of the ANN model. The conjugate gradient (CG) algorithm is chosen as the best ANN training algorithm for predicting the shear strength of deep beams. With the ANN-CG model chosen as best, four cases corresponding to four different black-boxes are studied. The results suggest that the smallest RMSE value may be the most informative criterion for selecting an accurate machine learning model. Besides, the analysis of the error between predicted and actual shear strength shows that the ANN model can be a promising numerical tool that could considerably reduce time-consuming and costly experimental procedures. Despite an extensive investigation of different training algorithms and epoch numbers, this study is conducted on only one ANN architecture. Therefore, despite the high prediction accuracy achieved, a further investigation of the number of neurons and the number of hidden layers would be worthwhile, either to enhance the performance of the ANN-CG model or to further decrease the computation time by reducing the number of neurons in the hidden layer.

#### Data Availability

The data used to support the findings of the study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.