#### Abstract

The purpose of this study is to develop a practical artificial neural network (ANN) model for predicting the atmospheric corrosion rate of carbon steel. A set of 240 data samples, collected from experimental results of atmospheric corrosion in tropical climate conditions, is utilized to develop the ANN model. Seven meteorological and chemical factors of corrosion, namely, the average temperature, the average relative humidity, the total rainfall, the time of wetness, the hours of sunshine, the average chloride ion concentration, and the average sulfur dioxide deposition rate, are used as input variables, while the atmospheric corrosion rate of carbon steel is the output variable. An optimal ANN model with a high coefficient of determination of 0.999 and a small root mean square error of 0.281 mg/m^{2}.month is retained to predict the corrosion rate. Moreover, the sensitivity analysis shows that the rainfall and hours of sunshine are the most influential parameters in predicting the atmospheric corrosion rate, whereas the predicted rate is less sensitive to the average chloride ion concentration, the average temperature, and the time of wetness. An ANN-based formula, which accommodates all input parameters, is then proposed to estimate the atmospheric corrosion rate of carbon steel. Finally, a graphical user interface is developed for calculating the atmospheric corrosion rate of carbon steel in tropical climate conditions.

#### 1. Introduction

Atmospheric corrosion is a nonlinear and complex electrochemical phenomenon that depends mostly on external factors and material properties. Evaluating the influence of these parameters on the degradation of materials is challenging, particularly for structures exposed to various climatic conditions.

Data on atmospheric corrosion can be obtained reliably from field measurements. Nevertheless, open problems remain regarding the mechanism of atmospheric corrosion and the effects of environmental parameters on it. Among these, the potential interaction between pollutants and meteorological parameters is one of the critical issues. A closer look at these problems would provide a better understanding of the atmospheric corrosion process.

In the last few decades, atmospheric corrosion has been a topic of interest for researchers around the world. Kallias et al. [1] proposed a deterioration model and assessed metallic bridges affected by atmospheric corrosion. Several studies investigated the atmospheric corrosion process of metals considering multiple environmental factors [2–4]. It was demonstrated that atmospheric pollutants, namely sulfur dioxide in urban and industrial atmospheres and chloride in marine atmospheres, significantly affected the corrosion rate of metals. The effects of relative humidity on atmospheric corrosion were evaluated in some studies [5–9], while the influence of temperature was demonstrated in the work of Kong et al. [10]. They showed that the corrosion rate of materials increased with temperature and relative humidity. A multiscale model for predicting atmospheric corrosion was proposed by Cole et al. [9], in which Australian conditions and marine aerosols were considered. In addition, the effects of rainfall on the atmospheric corrosion rate were investigated in several studies [11, 12]. However, several studies pointed out that chloride ions (Cl^{-}) coming from the sea and sulfur dioxide (SO_{2}) are the most important atmospheric corrosive agents [13–16].

A prediction of atmospheric corrosion accounting for exposure time, relative humidity, temperature, time of wetness, and pollutant concentration was proposed by Tidblad [17]. The quantitative relationships between environmental factors and the corrosion process were presented using the basic linear model [18, 19], the basic log-linear model [19–21], and dose-response functions [22–24]. Empirical equations to calculate the atmospheric corrosion rate were also proposed in some studies [19, 21, 25]. However, these equations considered only a few input parameters, namely the sulfur dioxide deposition rate, chloride, and time of wetness, whereas atmospheric corrosion is controlled by a variety of external factors and pollution parameters such as humidity, temperature, and pollutants. Additionally, those atmospheric corrosion models are valid only for specific local geographical conditions. When the local geographical conditions change, such corrosion models are no longer applicable. Therefore, an adequate model that can cover various environmental factors is still needed for predicting the atmospheric corrosion rate of carbon steel.

Artificial intelligence (AI) models have been commonly utilized in predicting the corrosion behavior of steel structures. Seghier et al. [26] estimated the maximum pitting corrosion depth in oil and gas pipelines using support vector regression (SVR) combined with optimization techniques such as the genetic algorithm, particle swarm optimization, and the firefly algorithm (FFA). They demonstrated that the SVR-FFA model had superior performance compared with the other models considered. In another study, Seghier et al. [27] applied various data-driven models, namely artificial neural networks (ANNs), M5 tree, multivariate adaptive regression splines, locally weighted polynomials, kriging, and extreme learning machines, for calculating the maximum pitting corrosion depth of pipelines. Recently, a hybrid soft computing model, namely, the multilayer perceptron-marine predators algorithm, was proposed for predicting the corrosion rate in suspension bridge cables [28]. Diao et al. [29] developed corrosion rate models for low-alloy steels using the random forest and gradient boosting decision tree algorithms.

ANNs, among the most powerful machine learning algorithms, have been widely applied in metal science and atmospheric corrosion research due to their advantages [30–34]. The typical benefits are as follows: (1) ANN is a nonlinear model, which is easy to use and understand compared to statistical methods, and (2) ANNs allow the modeling of physical phenomena in complex systems without requiring explicit mathematical representations. These studies showed that neural network models gave reliable predictions with a small error and a large coefficient of determination (*R*^{2}). The ANN models were effective for various investigations such as thickness prediction of sherardizing coating [31], corrosion of metals in equatorial climate [33], and corrosion of copper in Valparaiso (Chile) [34]. In particular, ANN was used for predicting the penetration of corrosion or the corrosion rate of carbon steel considering input parameters such as humidity, temperature, time of wetness, precipitation, sulfur dioxide concentration (SO_{2}), and chloride deposition rate (Cl^{-}) [30, 35]. It was demonstrated that ANN models predicted the corrosion rate accurately, with *R*^{2} values of 0.90 in the work of Pintos et al. [35] and 0.998 in the work of Díaz and López [30]. It should be noted that the database used in the study of Pintos et al. [35] was measured in the Ibero-America region. Besides, the effects of temperature and hours of sunshine were not considered in the study of Díaz and López [30]. Additionally, it was stated that ultraviolet light can activate the metal surface and then lead to an earlier initiation and a faster rate of the corrosion process [36]. Therefore, the influence of sunshine hours, an important meteorological parameter, on the corrosion rate needs to be considered in the predictive model. Moreover, no ANN-based explicit formulas or practical tools have so far been proposed for applying ANN models to realistic engineering problems.

The purpose of this study is to develop a practical ANN model, which can be readily applied for predicting the atmospheric corrosion rate (*K*) of carbon steel. A total of 240 experimental data samples are used to establish the ANN model. Seven external factors, namely the average temperature (T), average relative humidity (RH), total rainfall (Rf), time of wetness (TOW), hours of sunshine (HoS), average chloride ion concentration (Cl^{-}), and average SO_{2} deposition rate, are considered as input variables of the ANN model. The performance of the ANN model is also compared with that of three existing empirical formulas and three regression models. Moreover, the influences of all input variables on the predicted corrosion rate are investigated thoroughly. Finally, an ANN-based equation and a graphical user interface (GUI) tool are established to predict the atmospheric corrosion rate of carbon steel.

#### 2. Data Collection

A set of 240 measured data samples of atmospheric corrosion under tropical climate conditions in Vietnam was used to build the ANN model. The data were provided by the report of the Center for Material Failure Analysis [37], in which the data points were recorded over 2 years. Seven parameters, namely, T, RH, Rf, TOW, HoS, Cl^{-}, and SO_{2}, were used as input parameters. The atmospheric corrosion rate was measured based on the weight loss of carbon steel samples. The relationship between the corrosion rate (*K*) and weight loss is expressed by the following equation [38]:

$$K = \frac{W_{0} - W_{1}}{A\,t},$$

where $W_{0}$ and $W_{1}$ are the weights of samples before and after corrosion, respectively; $A$ is the area of the sample surface; and $t$ is the corrosion time considered.
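
The weight-loss relation above can be illustrated with a short Python sketch. The function name, argument names, and units (mg, m^{2}, months) are assumptions chosen to match the mg/m^{2}.month unit quoted in the abstract; they are not taken from the original report.

```python
def corrosion_rate(weight_before_mg, weight_after_mg, area_m2, time_months):
    """Weight loss per unit area per unit time, in mg/(m^2 * month).

    All names and units here are illustrative assumptions.
    """
    if area_m2 <= 0 or time_months <= 0:
        raise ValueError("area and exposure time must be positive")
    return (weight_before_mg - weight_after_mg) / (area_m2 * time_months)


if __name__ == "__main__":
    # Hypothetical sample: 10 mg lost over 0.5 m^2 in 2 months.
    print(corrosion_rate(1000.0, 990.0, 0.5, 2.0))
```

The guard on non-positive area or time is a design choice of this sketch, so that an invalid measurement fails loudly instead of producing an infinite or negative rate.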

A summary of the statistical properties of the input parameters is presented in Table 1. It should be noted that the database used in this study focuses mostly on the tropical monsoon climate, where steel is most susceptible to corrosion. Figure 1 presents the histograms of the data samples used. In addition, the relationships between the atmospheric corrosion rate of carbon steel and the seven input parameters are represented by the correlation matrix in Figure 2. Based on this figure, some pairs of parameters were strongly correlated, such as RH and TOW or T and HoS. This is because relative humidity is closely tied to the time of wetness, and temperature is associated with the hours of sunshine. Meanwhile, other pairs were poorly correlated, such as Rf and SO_{2} or TOW and SO_{2}, since they have no physical connection. Moreover, the correlation between each single input parameter and the output, *K*, appeared to be weak.

#### 3. Existing Empirical Equations, Regression, and ANN Models

##### 3.1. Existing Equations for Predicting the Atmospheric Corrosion Rate

In addition to regression models, the existing formulas for calculating the corrosion rate are presented in this study. Three typical formulas proposed by various studies [19, 21, 25] were used to obtain the atmospheric corrosion rate of carbon steel. Table 2 summarizes the empirical equations for calculating the atmospheric corrosion rate of steel.

##### 3.2. Regression Models

We also fitted regression models to calculate the corrosion rate based on the same database. Regression is normally employed to define a relationship between variables. The regression model can be expressed by a general linear least-squares model as follows:

$$y = a_{0} z_{0} + a_{1} z_{1} + \cdots + a_{m} z_{m} + e, \qquad \{Y\} = [Z]\{A\} + \{E\},$$

where $z_{0}, z_{1}, \ldots, z_{m}$ are basis functions; $a_{0}, a_{1}, \ldots, a_{m}$ represent the regression coefficients; $e$ is the residual, while $\{E\}$ denotes the residual matrix [39]; $\{A\}$ is the coefficient matrix determined by minimizing the mean squared difference between the regression values and the actual experimental data; $[Z]$ is the input parameter matrix; and $\{\hat{A}\}$ is the least-squares estimate of $\{A\}$, which is determined according to [40] and expressed as follows:

$$\{\hat{A}\} = \left([Z]^{T}[Z]\right)^{-1}[Z]^{T}\{Y\},$$

where $[Z]^{T}$ is the transpose of $[Z]$ and $\{Y\}$ collects the measured corrosion rates. This study used linear, quadratic, and quadratic-with-mixed-terms regression models. A summary of the forms and coefficients of the three regression models is presented in Table 3.
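
As a hedged illustration of the linear least-squares fit described above, the following Python sketch fits a linear model with an intercept. It uses NumPy's `numpy.linalg.lstsq`, which solves the same minimization as the normal equations in a numerically more stable way; the function names and the synthetic data are assumptions for demonstration and do not reproduce the coefficients of Table 3.

```python
import numpy as np


def fit_linear_least_squares(X, y):
    """Return [a0, a1, ..., am] minimizing ||y - Z a||^2, with Z = [1, X]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = np.column_stack([np.ones(len(X)), X])  # intercept column plus inputs
    coeffs, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coeffs


def predict_linear(coeffs, X):
    """Evaluate the fitted linear model on new inputs."""
    X = np.asarray(X, dtype=float)
    return coeffs[0] + X @ coeffs[1:]


if __name__ == "__main__":
    # Synthetic, noise-free data (not the paper's database).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 2))
    y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
    print(fit_linear_least_squares(X, y))
```

A quadratic model can be fitted with the same routine by appending squared (and, for mixed terms, pairwise product) columns to `X` before calling it.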

##### 3.3. Proposed ANN Model

ANN is capable of dealing with various tasks such as regression analysis, classification, or data processing [41, 42]. Neurons are the smallest units in an ANN model. An ANN model comprises (1) an input layer, which contains input parameters, (2) single or multiple hidden layers, and (3) an output layer, which holds the output result. Neurons transfer signals to other neurons based on the signals they receive from other neurons. Thus, each neuron is connected to other neurons in the network through synaptic connections, whose values are weighted. The signals transmitted through the network are strengthened or dampened by these weight values. It should be noted that each neuron has a bias and an activation function [43]. The input signal of a neuron is represented by a vector $x = (x_{1}, x_{2}, \ldots, x_{n})$, while the weighted sum of the input vector, $z$, is determined as follows:

$$z = \sum_{i=1}^{n} w_{i} x_{i} + b,$$

where $w = (w_{1}, w_{2}, \ldots, w_{n})$ is the weight vector in the $n$-dimension and $b$ denotes the bias.

An activation function in the network determines how the weighted sum of the input is transformed into an output from a node or nodes in a layer of the network. Activation functions also help normalize the output to the range [−1, 1] or [0, 1]. The selection of activation functions depends on the purpose of the problem. Several typical activation functions can be used in the hidden layer of a recurrent neural network [44]. Since this study focuses on a prediction problem, the hyperbolic tangent sigmoid function, so-called tansig, and a linear activation function, namely, purelin, are employed, as expressed by the following equations:

$$\mathrm{tansig}(x) = \frac{2}{1 + e^{-2x}} - 1, \qquad \mathrm{purelin}(x) = x.$$

It should be noted that the tansig function is used in the hidden layer, while the purelin function is utilized in the output layer. Those functions were also used for training neural networks in previous studies [45–47].
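
The neuron computation and the two activation functions can be sketched in Python as follows. The names `tansig` and `purelin` follow the MATLAB naming used in this study; the Python functions themselves, and `neuron_output`, are illustrative assumptions and not the authors' code.

```python
import numpy as np


def tansig(x):
    """Hyperbolic tangent sigmoid, 2 / (1 + exp(-2x)) - 1, which equals tanh(x)."""
    return np.tanh(x)


def purelin(x):
    """Linear (identity) activation."""
    return np.asarray(x, dtype=float)


def neuron_output(x, w, b, activation=tansig):
    """Apply the activation to the weighted sum w . x + b of one neuron."""
    return activation(np.dot(w, x) + b)


if __name__ == "__main__":
    # Hypothetical two-input neuron.
    print(neuron_output([1.0, 2.0], [0.5, 0.1], 0.2))
```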

According to Golafshani and Ashour [48], the database should be normalized to the range [−1, 1] before training. The normalization of the input variables is given by the following expression:

$$x_{n} = \frac{2\,(x - x_{\min})}{x_{\max} - x_{\min}} - 1,$$

where $x$ is the considered input variable, $x_{n}$ is its normalized value, and $x_{\min}$ and $x_{\max}$ denote the minimum and maximum of the variable in the dataset, respectively.
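
A minimal Python sketch of min-max normalization to [−1, 1] and its inverse is given below. The function names are assumptions; the mapping is the standard one that sends the dataset minimum to −1 and the maximum to 1.

```python
import numpy as np


def normalize(x, x_min, x_max):
    """Map x from [x_min, x_max] to [-1, 1]."""
    return 2.0 * (np.asarray(x, dtype=float) - x_min) / (x_max - x_min) - 1.0


def denormalize(x_norm, x_min, x_max):
    """Inverse mapping from [-1, 1] back to [x_min, x_max]."""
    return (np.asarray(x_norm, dtype=float) + 1.0) * (x_max - x_min) / 2.0 + x_min


if __name__ == "__main__":
    # Hypothetical variable ranging from 20 to 30.
    print(normalize([20.0, 25.0, 30.0], 20.0, 30.0))
```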

For the proposed ANN model in this study, seven parameters, namely, T, RH, Rf, TOW, HoS, Cl^{-}, and SO_{2}, are considered as the input variables, whereas the atmospheric corrosion rate of carbon steel, *K*, is the output variable. The following two steps are implemented for training the ANN model:

*Step 1.* The input signals, after entering the input layer, are transferred through the connections, from the hidden layer to the output layer.

*Step 2.* The predicted result is obtained from the feedforward process; its error is then measured by the mean square error (MSE) indicator. To reduce this error, the weights and biases are updated iteratively until convergence is obtained, which yields an optimal model. This procedure is called back-propagation. The MSE value is calculated using the following equation:

$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(t_{i} - o_{i}\right)^{2},$$

where $N$ is the number of training data samples, and $o_{i}$ and $t_{i}$ represent the predicted and target values of the $i$-th sample, respectively.
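
The feedforward pass of Step 1 and the error measure of Step 2 can be sketched as follows for a network with one hidden layer. This is a simplified illustration in Python with NumPy under assumed array shapes; it does not implement the back-propagation update itself, and the zero weights in the demo are placeholders.

```python
import numpy as np


def forward(X, W1, b1, W2, b2):
    """Single-hidden-layer forward pass.

    X: (n_samples, n_inputs), W1: (n_hidden, n_inputs), b1: (n_hidden,),
    W2: (n_hidden,), b2: scalar. Hidden layer uses tanh, output is linear.
    """
    hidden = np.tanh(X @ W1.T + b1)
    return hidden @ W2 + b2


def mse(predicted, target):
    """Mean square error between predicted and target values."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((target - predicted) ** 2))


if __name__ == "__main__":
    # Placeholder network: 7 inputs, 10 hidden neurons, all weights zero.
    X = np.ones((3, 7))
    out = forward(X, np.zeros((10, 7)), np.zeros(10), np.zeros(10), 0.5)
    print(out, mse(out, [0.5, 0.5, 1.5]))
```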

Overfitting describes the phenomenon of a model adapting so closely to the training data that it cannot predict unseen data samples well. The model then fails to predict the output for data outside the training set, which degrades its accuracy and biases the predicted results. To prevent this problem, regularization is employed to modify the error function using the following equation [47, 49]:

$$MSE_{reg} = \gamma\, MSE + (1 - \gamma)\, MSW,$$

where $\gamma$ is the performance ratio and $MSW$ represents the mean of the squared network weights and biases, which is expressed as follows:

$$MSW = \frac{1}{n}\sum_{j=1}^{n} w_{j}^{2},$$

where $n$ is the number of network weights and biases $w_{j}$.

To optimize the performance of the predictive model, an efficient ANN model has to be determined by a trial-and-error process. Various ANN architectures were tested with the training ratio varying from 0.6 to 0.85 and a wide range of neuron numbers in the hidden layer. It should be noted that only one hidden layer was used in the tested ANN models. In this study, the Levenberg-Marquardt (i.e., damped least-squares) algorithm was utilized for adjusting the weights and biases of the ANN models [50]. The advantages of this algorithm are its ability to solve nonlinear least-squares problems, its robustness, and its rapid convergence [51]. This algorithm was also widely used in previous studies [43, 46, 47, 52–55]. To assess the ANN models, two indicators, *R*^{2} and MSE, were quantified. Accordingly, the ANN model with the largest *R*^{2} and the smallest MSE after training was chosen as the optimum. It should be noted that proportions of 70%, 15%, and 15% of the dataset were employed for training, testing, and validation, respectively.
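
The regularized error and the 70/15/15 data split can be illustrated with the short Python sketch below. The helper names, the random seed, and the use of a random permutation for splitting are assumptions; the paper does not state how samples were assigned to each subset.

```python
import numpy as np


def regularized_error(mse_value, weights_and_biases, gamma):
    """gamma * MSE + (1 - gamma) * MSW, with MSW the mean squared weights and biases."""
    msw = float(np.mean(np.square(weights_and_biases)))
    return gamma * mse_value + (1.0 - gamma) * msw


def split_indices(n_samples, train=0.70, test=0.15, seed=0):
    """Randomly split sample indices into training, testing, and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train * n_samples))
    n_test = int(round(test * n_samples))
    return order[:n_train], order[n_train:n_train + n_test], order[n_train + n_test:]


if __name__ == "__main__":
    train_idx, test_idx, val_idx = split_indices(240)
    print(len(train_idx), len(test_idx), len(val_idx))
    print(regularized_error(0.2, [1.0, -1.0, 2.0], gamma=0.9))
```

With 240 samples, this split yields 168 training, 36 testing, and 36 validation samples.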

The number of neurons in the hidden layer is an important factor in training the ANN model. The best ANN model for the experimental data was identified by a sensitivity analysis, in which the number of neurons in the hidden layer was varied from 5 to 21. After performing the sensitivity analysis, the best model, with 10 neurons in the hidden layer, was chosen, as illustrated in Figure 3.

Figure 4 shows the structure of the proposed ANN model. In this model, seven neurons in the input layer denote the seven input variables (shown in Table 1), and one neuron in the output layer represents the atmospheric corrosion rate of carbon steel. It should be noted that the ANN model was developed and evaluated using MATLAB [56].

#### 4. Results and Discussion

##### 4.1. ANN Model Performance

The performance of the proposed ANN model is shown in Figure 5, in which the MSE values for training, validation, and testing decrease as the number of epochs increases. The best validation performance was selected at the epoch where the validation MSE reached its minimum. A small value of the squared error indicates that the ANN model was well trained.

Figure 6 shows the regression of the developed ANN model, in which the output and target results match closely. The regression values for training, testing, validation, and all data are 0.9998, 0.9998, 0.9999, and 0.9998, respectively. These values are all close to unity, highlighting that the proposed ANN model performs well. In other words, the ANN model was highly reliable in predicting the atmospheric corrosion rate of carbon steel.

Figures 7–10 compare the atmospheric corrosion rate of carbon steel obtained from the ANN model with the measured data for the all-data, training, testing, and validation sets. The red lines in the left subfigures show the normalized values predicted by the ANN model, while the blue lines show the measured values of the all-data, training, testing, and validation sets. The right subfigures show the corresponding errors of the comparisons. The errors were trivial, mostly smaller than 0.08. Again, this demonstrates that the ANN model determined the atmospheric corrosion rate of carbon steel accurately. Although the ANN performance was checked against the validation dataset, cross-validation should also be considered. However, due to the limitation of the developed algorithm, cross-validation was not performed in this study.

##### 4.2. Comparison between the Developed ANN Model and Existing Formulas

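The results obtained from the ANN model were compared with those of the regression models and existing formulas. The existing formulas presented in Table 2 and the three regression models in Table 3 were utilized. To evaluate the performance of all predictive models, four indicators, namely *R*^{2}, mean absolute percentage error (MAPE), root mean square error (RMSE), and Pearson correlation coefficient (*R*), were employed. These indicators are calculated by the following equations:

$$R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(t_{i} - o_{i}\right)^{2}}{\sum_{i=1}^{N}\left(t_{i} - \bar{t}\right)^{2}},$$

$$MAPE = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{t_{i} - o_{i}}{t_{i}}\right|,$$

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(t_{i} - o_{i}\right)^{2}},$$

$$R = \frac{\sum_{i=1}^{N}\left(t_{i} - \bar{t}\right)\left(o_{i} - \bar{o}\right)}{\sqrt{\sum_{i=1}^{N}\left(t_{i} - \bar{t}\right)^{2}\,\sum_{i=1}^{N}\left(o_{i} - \bar{o}\right)^{2}}},$$

where $t_{i}$ and $o_{i}$ are the target and output of the $i$-th sample, respectively; $\bar{t}$ and $\bar{o}$ are their mean values; and $N$ is the number of samples.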

It should be noted that the RMSE and MAPE values represent the mean of errors, whereas the *R*^{2} and *R* values were used to measure the variation and linear correlation between predicted and actual data, respectively. Higher values of *R*^{2} and *R* and lower values of RMSE and MAPE indicate a good performance of the predictive model. If the predictive model is perfect, the values of *R*^{2} and *R* are equal to 1.0, and the error is zero. Figure 11 shows the calculated values of the statistical parameters for the various predictive models. It is clear that the ANN model has the smallest values of RMSE and MAPE and the largest values of *R*^{2} and *R*, followed by the quadratic regression models. In other words, the ANN model is superior to the other models in predicting the corrosion rate of carbon steel. Moreover, the overall performance of all predictive models is illustrated in Figure 12. Again, it is observed that the proposed ANN model has the smallest standard deviation, followed by the regression models and the predictive models proposed by Knotkova et al. [19], Roberge et al. [21], and ISO and MICAT [25]. Details of the calculated results can be seen in Table 4.
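
For illustration, the four indicators can be computed with the Python sketch below. It uses the standard textbook definitions of the coefficient of determination, MAPE, RMSE, and Pearson correlation; the function name and the sample values are assumptions and are not taken from Table 4.

```python
import numpy as np


def evaluate(target, output):
    """Return R2, MAPE (in percent), RMSE, and Pearson R for two 1-D arrays."""
    t = np.asarray(target, dtype=float)
    o = np.asarray(output, dtype=float)
    residual = t - o
    return {
        "R2": float(1.0 - np.sum(residual ** 2) / np.sum((t - t.mean()) ** 2)),
        "MAPE": float(100.0 * np.mean(np.abs(residual / t))),
        "RMSE": float(np.sqrt(np.mean(residual ** 2))),
        "R": float(np.corrcoef(t, o)[0, 1]),
    }


if __name__ == "__main__":
    # Hypothetical target and predicted values.
    print(evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```

Note that MAPE is undefined when a target value is zero; this is not an issue for strictly positive corrosion rates.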

Table 5 also shows the statistics of the ratios of the predicted results to the test results for the different predictive models. The mean ratio of the ANN model was 1.0002, nearly equal to unity, and its standard deviation was the lowest among all models. Again, the ANN model was shown to be the optimal and most reliable option for predicting the corrosion rate of carbon steel.

#### 5. Evaluation of the Effects of Input Parameters

A parametric study was carried out to evaluate the influences of the input parameters on the atmospheric corrosion rate of carbon steel using the developed ANN model. To account for the interaction of multiple parameters on the calculated *K*, the considered variable was varied from its lowest to its highest value while the other variables were changed in turn. It should be noted that the letters L, ML, M, MH, and H in Table 6 are abbreviations of the lowest, middle-low, mean, middle-high, and highest values, respectively. In this way, the variations of the predicted result caused by the variation of the input parameters were quantified.
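
The procedure of varying one variable while the others are held at a chosen level can be sketched as follows. This is a simplified Python illustration under stated assumptions: `model` is any callable that maps an input vector to a predicted value, and the swept values are evenly spaced, which need not coincide with the L, ML, M, MH, and H levels of Table 6.

```python
import numpy as np


def sweep_parameter(model, baseline, index, low, high, n_points=5):
    """Vary input `index` from low to high while holding the others at baseline."""
    values = np.linspace(low, high, n_points)
    outputs = []
    for v in values:
        x = np.array(baseline, dtype=float)  # fresh copy of the baseline inputs
        x[index] = v
        outputs.append(float(model(x)))
    return values, np.array(outputs)


if __name__ == "__main__":
    # Placeholder model (sum of inputs), standing in for the trained ANN.
    values, outputs = sweep_parameter(lambda x: x.sum(), [1.0, 2.0, 3.0], 1, 0.0, 4.0)
    print(values, outputs)
```

Repeating the sweep with different baselines for the remaining variables gives the family of curves needed to inspect interactions between parameters.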

##### 5.1. Effect of the Average Temperature

Figure 13 shows the effects of the average temperature (i.e., *X*_{1}) on the atmospheric corrosion rate of carbon steel, *K*. While the average temperature *X*_{1} was varied, the other parameters were varied in turn to evaluate the effects of the interaction between *X*_{1} and the other variables on *K*. It was found that an increase in the average temperature caused an increase in the atmospheric corrosion rate of carbon steel. When *X*_{1} increased 1.5 times, the *K* value increased by 10%. This observation can be attributed to the fact that a higher temperature accelerates the chemical reactions, which may boost the corrosion process in carbon steel.

##### 5.2. Effect of the Average Relative Humidity

The effects of the average relative humidity (i.e., *X*_{2}) on the atmospheric corrosion rate of carbon steel are shown in Figure 14. The corrosion rate generally increased gradually as the average relative humidity increased. However, the corrosion rate was not affected by relative humidity at low temperature, short time of wetness, low level of SO_{2}, and short time of sunshine.

##### 5.3. Effect of the Time of Wetness

The time of wetness (i.e., *X*_{3}) depends on the temperature, humidity, total rainfall, and hours of sunshine. TOW was determined according to the suggestion of Tidblad and Mikhailov [57]. The effects of TOW on the atmospheric corrosion rate of carbon steel are shown in Figure 15. As with T, the corrosion rate increased as TOW increased. This is consistent with a previous study [35].

##### 5.4. Effect of the Average Chloride

The effects of the average chloride (i.e., *X*_{4}) on the atmospheric corrosion rate of carbon steel are shown in Figure 16. It was found that the corrosion rate increased as Cl^{-} increased. This can be attributed to the fact that the passivation film of steel can be damaged by chloride ions as they compete with hydrogen and oxygen ions in the absorption process, thus causing pitting corrosion [58].

##### 5.5. Effect of the Average Sulfur Dioxide Deposition Rate

Figure 17 shows the effects of the average sulfur dioxide deposition rate (i.e., *X*_{5}) on the atmospheric corrosion rate of carbon steel. The atmospheric corrosion rate of carbon steel increased as the SO_{2} deposition rate increased. This is because SO_{2} can be converted to H_{2}SO_{4} in the atmosphere or on the surface of carbon steel. Combined with high humidity or wetness, the damage caused by SO_{2} would be considerable [59].

##### 5.6. Effect of the Total Rainfall

Figure 18 shows the influences of the total rainfall (i.e., *X*_{6}) on the atmospheric corrosion rate. It can be observed that the atmospheric corrosion rate of carbon steel increased as Rf varied from the minimum to the maximum value. Moreover, the *K* value increased by 6% when the total rainfall increased 6 times. In tropical climate regions, the effects of rainfall on the corrosion rate need to be considered because of the high annual rainfall. This was also pointed out in previous studies [30, 60].

##### 5.7. Effect of the Hours of Sunshine

Figure 19 shows the effects of the hours of sunshine (i.e., *X*_{7}) on the atmospheric corrosion rate of carbon steel. It was found that the atmospheric corrosion rate decreased as HoS increased. This is probably because the sunshine hours have a strongly negative correlation with the relative humidity and time of wetness, as shown in Figure 2. Moreover, the corrosion mechanism of carbon steel in the tropical region is a complex combination of chemical and physical conditions.

Figure 20 demonstrates the sensitivity of the atmospheric corrosion rate of carbon steel to the input variables. It should be noted that the *K* value in this figure was obtained at the upper bound (i.e., maximum) of each input parameter. It was observed that the rainfall was the most influential parameter in predicting the atmospheric corrosion rate, followed by the time of wetness, the average temperature, the average sulfur dioxide deposition rate, the average chloride, and the average relative humidity. Meanwhile, the sunshine duration negatively affected the atmospheric corrosion rate.

#### 6. Practical Tools for the Atmospheric Corrosion Rate of Carbon Steel

##### 6.1. ANN Model-Based Equation

As analyzed above, the proposed ANN model can predict the atmospheric corrosion rate of carbon steel accurately. An ANN-based formula is therefore needed for explicit use in practical problems. Considering the *K* value as the output response, the procedures presented in the previous sections were adopted herein. The explicit formulation of *K* was obtained directly from the developed ANN model by using the activation functions, weights, biases, and normalization factors, expressed as

$$K = 2.419\,(K_{n} + 1) + 20.429,$$

where $K_{n}$ is the normalized atmospheric corrosion rate of carbon steel. The form of this equation comes from inverting the normalization expression given in Section 3.3. Accordingly, the value of 20.429 is the minimum atmospheric corrosion rate in the database, and the value of 2.419 is half of the difference between the maximum and minimum atmospheric corrosion rates in the database, as shown in Table 1. The normalized value $K_{n}$ is a function of the normalized inputs, expressed by the following equation:

$$K_{n} = b^{o} + \sum_{i=1}^{N} w_{i}^{o}\,\mathrm{tansig}\left(b_{i}^{h} + \sum_{j=1}^{7} w_{ij}^{h}\, X_{j,n}\right),$$

where $N$ = 10 is the number of neurons in the hidden layer of the developed ANN model and $X_{j,n}$ is the normalized value of the input $X_{j}$. The other coefficients, namely the weights and biases of the hidden and output layers, are presented in Table 7.
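
To show how such an ANN-based formula could be evaluated, the following Python sketch implements the generic forward pass of a network with one tansig hidden layer and a linear output, followed by denormalization with the constants 20.429 and 2.419 quoted above. The function and argument names are assumptions, and the weights and biases in the demo are placeholders, since the values of Table 7 are not reproduced here.

```python
import numpy as np

K_MIN = 20.429        # minimum corrosion rate in the database (from the text)
K_HALF_RANGE = 2.419  # half of (max - min) corrosion rate (from the text)


def predict_corrosion_rate(x_norm, W1, b1, w2, b2):
    """Evaluate the network on normalized inputs and denormalize the output.

    x_norm: (7,) inputs scaled to [-1, 1]; W1: (n_hidden, 7); b1: (n_hidden,);
    w2: (n_hidden,); b2: scalar.
    """
    hidden = np.tanh(W1 @ x_norm + b1)   # tansig hidden layer
    k_norm = float(w2 @ hidden + b2)     # linear output layer
    return K_HALF_RANGE * (k_norm + 1.0) + K_MIN


if __name__ == "__main__":
    # Placeholder weights (all zeros); real values would come from Table 7.
    W1, b1, w2 = np.zeros((10, 7)), np.zeros(10), np.zeros(10)
    print(predict_corrosion_rate(np.zeros(7), W1, b1, w2, b2=0.0))
```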

##### 6.2. ANN Interactive Graphical User Interface (GUI)

In this study, a practical GUI tool was constructed using MATLAB [56] to simplify the calculation of the atmospheric corrosion rate of carbon steel, as shown in Figure 21. Seven input parameters, from *X*_{1} to *X*_{7}, are provided as the input signal, and the ten neurons in the hidden layer are also shown in Figure 21. Users can obtain the output by clicking the “Start Predict” button after entering all input parameters. It takes less than one second to obtain the result. Since this GUI tool was developed using the proposed ANN model, the accuracy of its predictions is the one verified and demonstrated in the previous sections. The GUI tool is freely available at https://github.com/duyduan1304/GUI_corrosionrate.

It should be noted that the ANN algorithm cannot handle extrapolation; thus, the input values should be restricted to the range between the minimum and maximum of the database used. To expand the coverage of the ANN model, a wider range of collected data should be considered.
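
A simple guard against extrapolation can be sketched in Python as follows. The function name is an assumption, and the bounds in the demo are hypothetical placeholders rather than the minimum and maximum values listed in Table 1.

```python
def out_of_range_inputs(inputs, bounds):
    """Return the names of inputs lying outside the [min, max] range of the database."""
    return [
        name
        for name, value in inputs.items()
        if not (bounds[name][0] <= value <= bounds[name][1])
    ]


if __name__ == "__main__":
    # Hypothetical bounds; real ones would be taken from the training database.
    bounds = {"T": (15.0, 30.0), "RH": (60.0, 95.0)}
    print(out_of_range_inputs({"T": 31.0, "RH": 80.0}, bounds))
```

A prediction tool could call such a check before evaluating the network and warn the user when the returned list is not empty.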

#### 7. Conclusions

A practical ANN model was developed to predict the atmospheric corrosion rate of carbon steel based on a set of 240 experimental data samples. The results of the proposed model were compared with those of three regression models and three existing formulas. Additionally, a series of parametric studies was performed to evaluate the effects of the input parameters on the atmospheric corrosion rate. The following conclusions are drawn:

(1) The ANN model developed in this study predicted the atmospheric corrosion rate of carbon steel more accurately than the regression models and existing equations. The accuracy of the model was verified by the statistical indicators *R*^{2}, RMSE, MAPE, and *R*.

(2) The rainfall and hours of sunshine were the most influential parameters in predicting the atmospheric corrosion rate. Meanwhile, the predicted rate was less sensitive to the average chloride ion concentration, the average temperature, and the time of wetness.

(3) An ANN model-based formula, which considers all seven input parameters, was proposed to calculate the atmospheric corrosion rate of carbon steel.

(4) A graphical user interface tool was developed to simplify the prediction of the atmospheric corrosion rate of carbon steel.

#### Data Availability

All the data supporting the key findings of this paper are presented in the figures and tables of the article. Requests for other data will be considered by the corresponding author.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This study was supported by the Ministry of Education and Training of Vietnam (Grant no. B2020-TDV-05).