Accurate measurement of the critical buckling stress is crucial in structural engineering. In this paper, the critical buckling load of Y-shaped cross-section steel columns was predicted by an Artificial Neural Network (ANN) trained with the Levenberg-Marquardt algorithm. The results of 57 buckling tests were used to generate the training and testing datasets. Seven input variables were considered: the column length, the column width, the thickness of the steel equal angles, the width and thickness of the welded steel plate, and the total deviations in the Ox and Oy directions. The output was the critical buckling load of the columns. The accuracy criteria used to evaluate the model were the correlation coefficient (R), root mean square error (RMSE), and mean absolute error (MAE). The selection of an appropriate ANN structure was addressed first, followed by two investigations on the highest accuracy models. The first one consisted of the ANN model that gave the lowest values of RMSE = 40.0835 and MAE = 30.6669, whereas the second one gave the highest value of R = 0.98488. The results revealed that using MAE and RMSE for model assessment was more accurate and reasonable than using the R criterion. The RMSE and MAE criteria should therefore be given priority over the correlation coefficient.

1. Introduction

In modern construction, steel is used in most types of works, such as infrastructure, bridges, towers, and airports, thanks to its advantages over other materials [1, 2]. Along with the development and construction needs in the field of transportation, the requirements on steel materials in structures are receiving increasing attention, and high-strength steel is widely used because of its many advantages over conventional steel [3]. However, applications are limited to a number of columns with traditional cross-sections such as steel pipes, angle steels, and cross-sections. In addition to these traditional columns, a Y-type column has been proposed for use in structures under compression [4]. For such structures, instability is the most critical cause of failure of the member and of the whole structure [5]. Instability is defined as a sudden increase in the deformation of the building that can cause it to collapse completely, with significant damage to people and property [6]. The determination of structural stability therefore has a relatively long history. In 1744, the first studies on the instability of structures under compression were proposed by Euler. However, the Euler formula is only suitable for structures with large slenderness [7, 8]. Later, Johnson’s proposal made it possible to find the critical buckling force for structures of small and medium slenderness [9]. However, these formulas apply only to isotropic, homogeneous materials, and they rely on many assumptions about the perfection of the components [10]. In fact, the instability phenomenon depends on a variety of factors, for example, cross-section geometry, structural length, boundary conditions, and applied loads [11]. Experimental works have been conducted in many studies to characterize the instability of structural components under compression, for instance, in the work of Shi et al. [12] on steel structures under axial load. Four specimens of square box section with an identical slenderness ratio were made by welding 460 MPa stainless steel plates. The results showed that the stability of steel tubes decreased compared with conventional design codes. Several experimental studies indicated that buckling occurred earlier than predicted by existing codes, including box sections (800 MPa yield strength steel) [3, 13], I-sections (690 MPa yield strength steel) [14, 15], and hollow circular sections (420 MPa yield strength steel) [16]. Besides, all such laboratory experiments were complicated, costly, and time-consuming.

The buckling behavior of columns is a highly nonlinear problem, which complicates the investigation of the behavior of structural elements [17]. Analytical and semianalytical studies were conducted, focusing on lift beams [18, 19], or helping beams [20]. The finite-element method is commonly used to study nonlinear phenomena in mechanics, in particular the buckling problem. ANSYS was used by Shi et al. [16] to model the axial compression response of circular steel tubes. In many studies, such as Shi et al. [12, 13] for rectangular steel tubes, Yang et al. [14] for I cross-section columns, or Jiang et al. [21] for hollow circular tubes made of nickel-titanium alloy, the behavior of structural members under compression was analyzed. Software such as ANSYS [22] or ABAQUS [23] was used for problems with input parameters such as the geometry of the cross-section, weight, mechanical properties, loading, and other considerations. Understanding the software in programming languages is vital to study the problem thoroughly and to perform high-performance computing in the case of big data. However, the nonlinear finite-element approach is still difficult to deal with, especially the implementation phase of nonlinear algorithms [24–26]. It is evident that the behavior of structural components under compression should be predicted accurately.

Artificial intelligence (AI) simulations have been commonly used in mechanical engineering during the last several decades to predict the properties and the behavior of structural components [27–34]. The ANN model of Lakshmi et al. [35] was developed to forecast the mechanical properties of austenitic steel in the superplastic region. An AI approach to study the deterioration of metallic materials in the presence of hydrogen was proposed by Thankachan et al. [36]. In material sciences, various mechanical properties were investigated using data-driven algorithms, including tensile strength [37] or compression strength [38, 39]. Besides, AI approaches can be used to investigate many types of failure of structural elements, for example, the fatigue of steel components [40]. In Tan et al. [41] or Padil et al. [42], a nonprobabilistic artificial neural network model was used to calculate, identify, and evaluate the damage to steel beams. The buckling behavior of axially loaded structural elements was predicted by ANN for different geometries such as shells [43], panels [44], I-section beams [45], or elliptical segments [46]. In the investigation of Bilgehan [47], an Adaptive Neuro-Fuzzy Inference System (ANFIS) was also applied to predict the buckling of a column in the presence of cracks. To date, the mechanical behavior and buckling response of structural components can be determined through experiments combined with AI approaches.

Therefore, in this study, with 57 experimental results collected from the available literature, the authors propose an approach using an artificial neural network (ANN) to accurately predict the critical buckling load (Fcr) of Y-section steel columns. The next section briefly presents the ANN model, which uses the Levenberg-Marquardt algorithm (LMA) for the neural network training process. The experimental results, collected from [4], are used to build the training and testing datasets, with seven input variables (column length, column width, thickness of the steel equal angles, width and thickness of the welded steel plate, and total deviations in the Ox and Oy directions) and one output variable (critical buckling load of 420 MPa Y cross-section steel columns). The accuracy of the proposed model is assessed by the correlation coefficient (R), root mean square error (RMSE), and mean absolute error (MAE). Finally, a reliability analysis is performed and combined with the three mentioned criteria to deduce the best ANN black-box for the prediction problem.

2. Method Used

2.1. Artificial Neural Network (ANN)

Machine learning algorithms and artificial neural networks have been widely applied over the past decades [48–53]. Neural networks are systems inspired by the human brain. They provide a variety of mathematical calculations used to model the biological mechanisms of the human brain, such as knowledge and memory [54]. Compared with conventional computational methods, the ANN algorithm is particularly useful for problems that are highly complex and for which a standard mathematical model is difficult to construct [55]. In an ANN, the information related to the inputs is provided to an artificial neuron; each input is associated with a weight and a bias. The LMA is a common tool used to tune the weight and bias of each neuron in the network [56, 57].

2.2. Levenberg-Marquardt Algorithm

The Levenberg-Marquardt Algorithm (LMA) is commonly used as a reference algorithm for solving nonlinear problems [56, 57]. It combines gradient descent with the Gauss-Newton method, and its adaptive behavior allows it to handle many situations [58]. When the back-propagation (BP) update behaves like gradient descent, the algorithm is slow and does not guarantee an optimal solution [59]. In comparison, the algorithm is more likely to reach an optimal solution when the BP update behaves like Gauss-Newton. The method uses the approximation of the Hessian matrix in (1) and the estimation of the gradient in (2):

$$H \approx J^{T} J, \tag{1}$$

$$g = J^{T} e, \tag{2}$$

where J is the Jacobian matrix and e is the network error vector. The LMA then applies the following Newton-like update:

$$x_{k+1} = x_{k} - \left[ J^{T} J + \mu I \right]^{-1} J^{T} e, \tag{3}$$

where x_{k+1} is the new weight, determined from the gradient and the existing weight x_k, I is the identity matrix, and μ is the damping parameter that moves the update between gradient descent (large μ) and Gauss-Newton (small μ).
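
As an illustration of the update described above, the following minimal sketch applies a damped Gauss-Newton step, x_new = x - (J^T J + mu I)^-1 J^T e, to a toy line-fitting problem. The function names (`lm_step`, `fit_line`), the toy data, and the rule of multiplying or dividing the damping factor by 10 are assumptions made for illustration; they are not taken from the study, which trained a neural network rather than a line.

```python
import numpy as np

def lm_step(x, residual_fn, jacobian_fn, mu):
    """One Levenberg-Marquardt update: x - (J^T J + mu I)^-1 J^T e."""
    e = residual_fn(x)          # error vector at the current weights
    J = jacobian_fn(x)          # Jacobian of the errors w.r.t. the weights
    H = J.T @ J                 # Gauss-Newton approximation of the Hessian
    g = J.T @ e                 # gradient of 0.5 * ||e||^2
    return x - np.linalg.solve(H + mu * np.eye(x.size), g)

def fit_line(t, y, n_iter=50, mu=1e-2):
    """Toy problem (assumed): fit y = a*t + b with adaptive damping."""
    residual_fn = lambda x: x[0] * t + x[1] - y
    jacobian_fn = lambda x: np.column_stack([t, np.ones_like(t)])
    x = np.zeros(2)
    cost = 0.5 * np.sum(residual_fn(x) ** 2)
    for _ in range(n_iter):
        x_try = lm_step(x, residual_fn, jacobian_fn, mu)
        cost_try = 0.5 * np.sum(residual_fn(x_try) ** 2)
        if cost_try < cost:
            # Step accepted: move toward Gauss-Newton behavior.
            x, cost, mu = x_try, cost_try, mu * 0.1
        else:
            # Step rejected: move toward gradient-descent behavior.
            mu *= 10.0
    return x

# Hypothetical noise-free data generated from a = 3, b = -1.
t = np.linspace(0.0, 1.0, 20)
y = 3.0 * t - 1.0
x_fit = fit_line(t, y)
print(x_fit)
```

A small damping factor makes the step close to Gauss-Newton, while a large one shrinks it toward a short gradient-descent step, which is the mixture of the two methods mentioned in the text.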

2.3. Model Performance Assessment

In this investigation, three statistical criteria were chosen to evaluate the developed AI model: the correlation coefficient (R), root mean square error (RMSE), and mean absolute error (MAE). R is commonly used in regression analysis to quantify how much of the variation of the target data is captured by the predicted data [60]. Both RMSE and MAE measure the average magnitude of error [61]. However, RMSE is more sensitive to large errors, because the errors are squared before averaging. All the criteria are essential to evaluate the AI model. The values of R, RMSE, and MAE can be written by the following equations [62]:

$$R = \frac{\sum_{i=1}^{N} \left( p_{0,i} - \bar{p}_{0} \right) \left( p_{t,i} - \bar{p}_{t} \right)}{\sqrt{\sum_{i=1}^{N} \left( p_{0,i} - \bar{p}_{0} \right)^{2} \sum_{i=1}^{N} \left( p_{t,i} - \bar{p}_{t} \right)^{2}}}, \tag{4}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( p_{0,i} - p_{t,i} \right)^{2}}, \tag{5}$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| p_{0,i} - p_{t,i} \right|, \tag{6}$$

where N is the number of samples in the database, p_0 and \bar{p}_0 are the actual experimental value and the average actual experimental value, and p_t and \bar{p}_t are the predicted value and the average predicted value, calculated according to the model forecast. Besides, a reliability analysis is performed to evaluate the consistency of the proposed ANN model. The detailed formulation and the two steps to calculate the reliability of the model can be found in the work of Saberi-Movahed et al. [63].
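
The three criteria can be computed in a few lines. The sketch below uses the standard definitions of the Pearson correlation coefficient, RMSE, and MAE; the function names and the numerical values are hypothetical and are not taken from the database. The example also shows why a high R alone can be misleading: predictions that are systematically scaled keep a correlation of essentially 1 while having large errors.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(actual - predicted)))

def corr_r(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    a = actual - actual.mean()
    p = predicted - predicted.mean()
    return float(np.sum(a * p) / np.sqrt(np.sum(a ** 2) * np.sum(p ** 2)))

# Hypothetical values for illustration only.
actual = np.array([100.0, 200.0, 300.0, 400.0])
scaled = 1.5 * actual                                 # systematic over-prediction
noisy = actual + np.array([5.0, -5.0, 5.0, -5.0])     # small scattered errors

for name, pred in (("scaled", scaled), ("noisy", noisy)):
    print(name, corr_r(actual, pred), rmse(actual, pred), mae(actual, pred))
```

In this toy case the scaled predictions have the higher R but much larger RMSE and MAE than the noisy ones, which mirrors the argument for giving priority to the error-based criteria.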

3. Database Preparation

The database containing 57 data points was collected from the work of Yu et al. [4]. In the experiments, the material was high-strength steel with a Young’s modulus of 210 GPa and a yield strength of 420 MPa. The Y-section steel columns were made by welding steel equal angles and a steel plate. Pinned-pinned boundary conditions were applied to the steel columns. The tests aimed to determine the critical load or critical applied force (denoted as Fcr) that causes the buckling of the steel columns.

The fluctuations of mechanical properties, as well as the residual stresses, were considered not to affect the buckling behavior of the columns. Only variables related to the geometry and to the initial geometrical imperfections were considered. Thus, the geometrical input variables in this study were the length of the column, the width and the thickness of the steel equal angles, and the width and the thickness of the welded steel plate. The two geometrical imperfection variables were the total deviations in the Ox and Oy directions. Overall, seven inputs were used for the development of the machine learning algorithm, whereas Fcr was the only output of the problem. Statistical values of these variables, such as the minimum, maximum, average, and standard deviation, can be found in the literature.

The database was randomly split into two parts. The training part was used for the development phase of the machine learning toolbox, whereas the testing part was used for the validation phase. Herein, “Train 50%” denotes the simulations in which 50% of the data points were used to construct the training dataset and the remaining 50% were used to construct the testing dataset. Similar nomenclature was applied for 60%, 70%, 80%, and 90%, with the testing dataset always containing the remaining samples.
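
The random split can be sketched as follows. The helper name `random_split`, the fixed seed, and the rounding rule used to convert a fraction of 57 samples into an integer count are assumptions; the text does not state how fractional sample counts were handled.

```python
import numpy as np

def random_split(n_samples, train_fraction, rng):
    """Return disjoint index arrays for the training and testing parts."""
    # Rounding to the nearest integer is an assumption of this sketch.
    n_train = int(round(train_fraction * n_samples))
    idx = rng.permutation(n_samples)
    return idx[:n_train], idx[n_train:]

rng = np.random.default_rng(0)   # fixed seed only for reproducibility here
sizes = {}
for frac in (0.5, 0.6, 0.7, 0.8, 0.9):
    train_idx, test_idx = random_split(57, frac, rng)
    sizes[frac] = (train_idx.size, test_idx.size)

print(sizes)
```

With this rounding, the Train 80% case uses 46 samples for training and 11 for testing.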

4. Results and Discussion

This first part presents simulation results for five cases, in which the size of the training part varied from 50% to 90% of the total data. It is worth noting that 200 simulations were performed in each case, with the indexes of the samples randomly drawn to establish the training dataset. The results with respect to R, RMSE, and MAE are plotted as probability density functions over the 200 simulations. Only the results of the testing part are shown, as they reflect the accuracy of a given machine learning algorithm. The ANN model used in this study has one hidden layer of 5 neurons, with the hyperbolic tangent sigmoid transfer function for the hidden layer and a linear function for the output layer. The training algorithm is the Levenberg-Marquardt algorithm, the number of training epochs is 1000, and the mean squared error is used as the performance function of the ANN model.
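
The network structure described above (7 inputs, one hidden layer of 5 neurons with a hyperbolic tangent transfer function, and one linear output) can be sketched as a plain forward pass. The weight initialization, the function names, and the random input values are assumptions for illustration; the training itself (Levenberg-Marquardt, 1000 epochs) is not reproduced here.

```python
import numpy as np

def init_params(n_inputs=7, n_hidden=5, rng=None):
    """Randomly initialized weights and zero biases (initialization is assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    return {
        "W1": rng.normal(scale=0.5, size=(n_hidden, n_inputs)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(1, n_hidden)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    """Hidden layer with tanh activation, then a linear output layer."""
    hidden = np.tanh(X @ params["W1"].T + params["b1"])
    return (hidden @ params["W2"].T + params["b2"]).ravel()

def mse(params, X, y):
    """Mean squared error, the performance function named in the text."""
    return float(np.mean((forward(params, X) - y) ** 2))

rng = np.random.default_rng(0)
params = init_params(rng=rng)
X_demo = rng.normal(size=(4, 7))      # 4 hypothetical samples, 7 inputs each
y_demo = forward(params, X_demo)      # one predicted value per sample
n_params = sum(v.size for v in params.values())
print(y_demo.shape, n_params)
```

Counting weights and biases, this structure has 7 x 5 + 5 + 5 + 1 = 46 adjustable parameters.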

With respect to R values, the probability density functions are highlighted in Figure 1. As can be seen, R increased with the number of samples in the training set. No significant difference was observed between the cases of Train 50% and Train 60%. However, between the cases of Train 70%, 80%, and 90%, the modes of the curves were different (i.e., the modes were R = 0.90, 0.92, and 0.95 for Train 70%, Train 80%, and Train 90%, respectively). Figure 2 shows the probability density functions for RMSE values. Similar to the previous criterion, the modes of Train 50% and Train 60% were identical. Besides, the Train 80% was slightly superior to the Train 70%, and the mode of Train 80% was similar to that of Train 90% but with a different magnitude. Moreover, the Train 90% curve was broader than the Train 80% curve, reflecting the dispersion of the results obtained with Train 90%. Figure 3 shows the probability density functions for MAE values. Similarly, the accuracy of Train 50% and Train 60% was equivalent, as the two probability density curves were almost superposed. Based on the modes, it can be seen that the Train 80% was the best choice, as it exhibited the highest value of mode and the narrowest probability density curve.

Table 1 summarizes the statistical information concerning RMSE, MAE, and R values for the testing parts over 1000 simulations (5 cases x 200 simulations each). It is observed that the case of Train 80% was the appropriate choice for this problem, as the mean values of RMSE and MAE over 200 simulations were small and close to those obtained with Train 90%. More importantly, the standard deviation values for RMSE and MAE (denoted as St. D) were the smallest compared with the other cases, showing the reliability of such a choice. The Train 80% also exhibited the highest average R and the lowest St. D of R. Besides, the min and max values of the three criteria in the case of Train 80% were also reasonable compared with the other choices. Overall, it could be concluded that, for this dataset, taking 80% of the data to construct the training dataset was a reasonable choice.

The following part presents results from representative ANN models. These models were taken from the Train 80% case as a result of the previous analysis. It was an appropriate choice, as the Train 80% gave the most stable solutions for predicting Fcr.

Two cases were considered in this study. The first one represented the ANN model that gave the smallest values of MAE and RMSE, whereas the second one gave the highest value of R. The two cases are denoted herein as ANN1 and ANN2, respectively. It is worth noting that (i) in the case of ANN1, the selected simulation gave the smallest values of MAE and RMSE simultaneously but a lower R value than that of the ANN2 model and (ii) only the results concerning the testing parts are presented, as they reflect the accuracy of machine learning algorithms. The main objective of this section is to verify how the machine learning model should be chosen: by selecting the model with the smallest values of error (e.g., RMSE or MAE) or the one with the highest value of correlation (e.g., R or R²).

The results concerning the ANN1 model are plotted in Figures 4 and 5. The experimental values were well correlated with the predicted outputs. In addition, low values of error were obtained. The computed values of RMSE, MAE, mean of error, and standard deviation error for the testing dataset were 43.5634, 35.3637, 2.2009, and 45.6313, respectively. Those of the training dataset, in this case, were 40.0835, 30.6669, −3.2598, and 40.3922, respectively. The prediction accuracy of the training part was slightly higher than the testing one, showing excellent reliability of the selected model (i.e., ANN1). A correlation of R = 0.97722 was obtained for the testing dataset (Figure 5).

The results concerning the ANN2 model are plotted in Figures 6 and 7. The experimental results, in this case, were correlated with the predicted outputs, but the agreement was inferior to that of the ANN1 model. In addition, the values of error were higher. The values of RMSE, MAE, mean of error, and standard deviation of error for the testing dataset were 106.5248, 85.8798, −8.5156, and 111.3666, respectively. Those of the training dataset were 87.7486, 68.5177, 6.6403, and 88.4638, respectively. Similarly, the accuracy of the training part was superior to that of the testing one, showing the reliability of the ANN2 model. A correlation of R = 0.98488 was obtained for the testing dataset (Figure 7).

For comparison purposes, the prediction results in the present study are compared with previously published results [64], obtained using hybrid artificial intelligence approaches such as the adaptive neuro-fuzzy inference system (ANFIS) and two ANFIS models optimized by simulated annealing (SA) and biogeography-based optimization (BBO). The prediction accuracy of the proposed ANN model is higher than that of the other models: with R values of 0.97722 (the proposed ANN model), 0.896 (ANFIS), 0.941 (ANFIS-SA), and 0.960 (ANFIS-BBO); RMSE values of 40.0835 (the proposed ANN model), 111.830 (ANFIS), 81.990 (ANFIS-SA), and 66.558 (ANFIS-BBO); and MAE values of 30.6669 (the proposed ANN model), 82.407 (ANFIS), 65.594 (ANFIS-SA), and 50.723 (ANFIS-BBO). These results confirm that the ANN model is a simple but effective algorithm for solving complex problems.

Finally, a reliability analysis is performed on the testing parts of the two ANN models. The results are summarized in Table 2. Based on the definition of the reliability criterion [63], it can be seen that, even when the threshold (Δ) is chosen as 10%, the ANN1 model exhibits a high level of reliability (100%), whereas the ANN2 model produces lower reliability (81.82%) because of the presence of 2 samples with high RAE values (samples 3 and 4 in Table 2). It can be concluded that, in accordance with RMSE and MAE, the reliability analysis shows that the ANN1 model produces better prediction accuracy than the ANN2 model. In summary, the difference between the values of R in the two cases was insignificant, but the difference in error values was considerable. It can be concluded that criteria such as RMSE and MAE should be considered as the primary measurements to select a suitable machine learning model. The correlation value (R) should be considered as an additional assessment of the prediction accuracy in case the values of RMSE or MAE are similar.
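
The reliability figures quoted above can be illustrated with a short sketch. It assumes that the relative absolute error (RAE) of a sample is |predicted - actual| / |actual| in percent and that reliability is the percentage of samples whose RAE does not exceed the threshold Δ; the exact formulation is given in [63] and may differ in detail. The 11 values below are hypothetical and are not the entries of Table 2.

```python
import numpy as np

def reliability(actual, predicted, threshold_pct=10.0):
    """Percentage of samples whose relative absolute error is within the threshold.

    Assumed definition: RAE_i = 100 * |predicted_i - actual_i| / |actual_i|.
    """
    rae = 100.0 * np.abs(predicted - actual) / np.abs(actual)
    return float(100.0 * np.mean(rae <= threshold_pct)), rae

# Hypothetical testing set of 11 samples, two of which exceed the 10% threshold.
actual = np.full(11, 1000.0)
rel_err = np.array([0.02, -0.03, 0.15, -0.12, 0.01, 0.04,
                    -0.05, 0.03, -0.02, 0.06, -0.01])
predicted = actual * (1.0 + rel_err)

score, rae = reliability(actual, predicted)
print(round(score, 2), int(np.sum(rae > 10.0)))
```

Under this assumed definition, 2 out of 11 samples above the threshold give 9/11 = 81.82%, the same value as reported for the ANN2 model.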

5. Conclusion

In this investigation, the Artificial Neural Network (ANN) model using the Levenberg-Marquardt Algorithm (LMA) was used to predict the critical buckling load of Y-shaped cross-section steel columns. A total of 57 data points of the buckling behavior were used to generate the training and testing datasets. There were seven input variables in this study, namely the length of the column, the width and the thickness of the steel equal angles, the width and the thickness of the welded steel plate, and the total deviations in the Ox and Oy directions. The output variable was the critical buckling load. Three criteria, namely R, RMSE, and MAE, were introduced to validate and evaluate the performance of the model. After selecting the best ANN structure, two investigations were considered: the first one represented the ANN model that gave the smallest values of RMSE = 40.0835 and MAE = 30.6669 with R = 0.97722, whereas the second one gave values of RMSE = 106.5248 and MAE = 85.8798 and the highest value of R = 0.98488. In the two investigations, the validation criteria showed that ANN is a promising predictor of the critical buckling load of Y-shaped cross-section steel columns. The results showed that the difference between the R values in the two investigations was small, but the differences in RMSE and MAE values were significant. This result revealed that the critical metrics for choosing an accurate machine learning model should be criteria such as RMSE and MAE. To further strengthen the applications of ANN in structural engineering, the authors suggest several perspectives, including (i) collecting more data on the given problem so that the model can be applied to a broader range of Y-shaped cross-section steel columns, (ii) improving the prediction accuracy by performing an in-depth study of the hyperparameters of the ANN model (transfer functions, training algorithms, and training epochs), and (iii) improving the robustness of the ANN model by using optimization algorithms.

Data Availability

The processed data are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.