Abstract

Pitting corrosion can lead to critical failures of infrastructure elements. Therefore, accurate detection of corroded areas is crucial during structural health monitoring. This study aims at developing a computer vision and data-driven method for automatic detection of pitting corrosion. The proposed method integrates the history-based adaptive differential evolution with linear population size reduction (LSHADE), image processing techniques, and the support vector machine (SVM). The LSHADE metaheuristic serves two purposes in this research. First, it is employed in the task of multilevel image thresholding to extract regions of interest from the metal surface. Image texture analysis methods, namely statistical measurements of color channels, gray-level co-occurrence matrix, and local binary pattern, are used to compute numerical features subsequently employed by the SVM-based pattern recognition phase. Second, the LSHADE metaheuristic is used to optimize the hyperparameters of the machine-learning approach. Experimental results, supported by a statistical test, point out that the newly developed approach can attain a good predictive result with classification accuracy rate = 91.80%, precision = 0.91, recall = 0.94, negative predictive value = 0.93, and F1 score = 0.92. Thus, the newly developed method can be a promising tool for periodic structural health surveys.

1. Introduction

Currently, ageing infrastructure elements and constrained maintenance budgets are the main concerns of infrastructure management agencies around the world. These facts urge the implementation of smart and cost-effective methods in the field of structural health monitoring [1–7]. Typical objectives of structural health monitoring include the correct recognition of the presence, the location, and the type of structural defects. These pieces of information can then be used by various assessment models to support decisions on rehabilitation options to maximize the service life of infrastructure elements [8, 9].

For metal infrastructure elements, corrosion negatively affects their durability and operability. It is reported that corrosion is a dominant form of defect, accounting for 42% of failure mechanisms in engineering structures [10]. Therefore, recognition as well as diagnostics of corroded areas is an important task in periodic structural health surveys [11, 12]. The surveys’ outcome significantly helps owners or maintenance agencies to judge the effectiveness of the currently employed protection methods and to prioritize rectifying measures [13, 14].

Particularly, pitting corrosion, recognized by isolated corroded damage units within the metal surface, is a severe type of structural defect [3, 13]. This defect appears on the surface of various civil engineering structures including bridges, high-rise buildings, pipelines, and storage tanks (see Figure 1). Pitting corrosion is generally more harmful than uniform corrosion since it is hard to detect, to predict, and to design against [15]. These localized corrosion damages may have diverse shapes (often hemispherical or cup-shaped). Moreover, pits can be covered by a membrane of corrosion products.

If left undetected, pitting corrosion can lead to catastrophic failure of civil engineering structures. This is because this kind of damage causes only a small loss of material with an insignificant effect on the surface, while it can bring about extremely dangerous damage to deep areas of structures below the tiny pits. One example of the devastating effect of pitting corrosion is the explosion in Guadalajara (Mexico) on April 22, 1992, which is believed to have been brought about by pitting corrosion on a steel gasoline pipe that leaked gasoline fumes; this event caused the deaths of 252 people and injuries to about 1,500 people [10, 16]. Pitting corrosion is also the cause of the deadly collapse of the U.S. Highway 35 bridge in 1967, in which 46 people died [17].

In Vietnam, as well as in many other nations, surveying infrastructure elements is usually performed manually by human inspectors. Although the manual method can help to attain accurate detection results, its notable downsides are low productivity and the effects of subjective criteria [14]. These are due to the time-consuming processes of measurement and data reporting as well as the inconsistent experience and assessment of surveyors. Moreover, the sheer number of existing structural elements makes it challenging for human inspectors to perform surveying work frequently and to detect structural damage in a timely manner. Therefore, an automatic approach for pitting corrosion detection is an urgent need of infrastructure management agencies.

Accordingly, this study aims at constructing an intelligent method for automatic recognition of metallic surface areas subject to pitting corrosion. The newly constructed model is an integration of a metaheuristic, image processing techniques, and machine learning. The History-Based Adaptive Differential Evolution with Linear Population Size Reduction (LSHADE) metaheuristic approach is employed for multilevel image thresholding as well as for optimizing the performance of the Support Vector Machine- (SVM-) based data classifier. The LSHADE-optimized multilevel image thresholding, supported by the connected component labeling algorithm, is used to extract the region of interest (ROI) from image samples. Based on the extracted ROI, texture information including statistical measurements of color channels, gray-level co-occurrence matrix, and local binary pattern is used to characterize properties of the metal surface. The SVM relies on the texture information to separate input samples into two categories of nonpitting corrosion (the negative class) and pitting corrosion (the positive class). A dataset consisting of 213 image samples has been collected to train and verify the proposed integrated model.

Furthermore, the model training phase of the SVM necessitates an appropriate setting of its hyperparameters including the penalty coefficient and the kernel function parameter. The task of determining such hyperparameters is generally regarded as the model selection problem [18]. This is a crucial task in machine learning because hyperparameters strongly influence the generalization capability of prediction models [19–21]. A poor model selection may result in either an overfitted or an underfitted model. The model selection can be a challenging task because hyperparameters are often searched in continuous domains [22–25].

Accordingly, there are infinitely many possible solutions to the model selection problem, and an exhaustive search to determine the optimal set of hyperparameters is infeasible. Therefore, a metaheuristic can be utilized to tackle the model selection task [26]. A metaheuristic is regarded as a high-level heuristic designed to seek a sufficiently good solution to a global optimization problem. This advanced method generally does not require assumptions regarding the optimization task being solved and can be applied to a wide range of problems [27–31]. Metaheuristics have been demonstrated to be capable optimization methods which can help to identify solutions with quality superior to conventional methods (e.g., iterative algorithms and gradient descent-based algorithms) [32–34]. With this motivation, this study employs the LSHADE algorithm [35], a state-of-the-art metaheuristic, to optimize the SVM model performance. The LSHADE is selected in this study due to its outstanding performance reported in recent works [36–38].

In summary, the main contributions of the current study can be stated as follows. (1) A novel integrated framework based on image processing techniques, metaheuristic optimization, and machine-learning prediction for pitting corrosion is proposed. (2) An autonomous model operation is achieved by means of the LSHADE metaheuristic, which minimizes the human effort required for model construction and parameter tuning. (3) An integrated approach of the LSHADE and SVM algorithms is proposed for achieving better corrosion detection accuracy compared to benchmark machine-learning models.

The rest of the paper is organized as follows: Section 2 reviews related works; Section 3 presents the research method; Section 4 describes the newly developed model, which employs the aforementioned methods of the LSHADE, multilevel image thresholding, image texture computation, and SVM; Section 5 reports the experimental results; and Section 6 summarizes this study with several concluding remarks.

2. Related Works

Among various methods of automatic structural health monitoring, image processing-based visual inspection is commonly employed due to the availability of low-cost digital cameras and the fast advance of computer vision techniques. Choi and Kim [39] relied on attributes (color, texture, and shape) extracted from digital images. In addition, data dimensionality reduction and linear classifiers were employed to identify corroded areas on the metal surface [39]. However, due to the complexity of the problem of interest, the accuracy of the aforementioned detection model can be enhanced with nonlinear and more sophisticated data classification approaches. Valor et al. [12] constructed Markov chain models for stochastic modeling of pitting corrosion appearing on metallic structures. Medeiros et al. [40] demonstrated the effectiveness of the gray level co-occurrence matrix and color descriptors in classification of corroded and noncorroded surfaces. High dynamic range imaging is used in [3] for the recognition of pitting corrosion via visual inspection.

Chen et al. [41] employed Fourier-transform-based and color image processing methods for recognition of defects on the steel bridge surface. Ji et al. [42] put forward a computer vision-based method for rating of corrosion defects on coated materials using the watershed segmentation method. Shen et al. [43] established an automatic method for detecting steel bridge coating rust defect based on image texture analysis and discrete Fourier transform. Jahanshahi and Masri [44] investigated the influence of color space, color channels, and subimage block size on corrosion detection capability. This model also employed color wavelet-based texture analysis algorithms for recognition of defective areas [44].

A digital image processing method based on the K-means clustering algorithm, the double-center-double-radius algorithm, and the least-square support vector machine has been used by Liao and Lee [45] to detect rust defects on steel bridge coatings. Liao and Cheng [46] relied on the least-square support vector machine, spectral power distribution, spectral reflectance, and matrix restoration to detect the discoloration status of a steel bridge coating. A model used for detection and assessment of corrosion on pipelines through image filtering and morphological operations has been constructed by Bondada et al. [13]. Enikeev et al. [14] utilized the linear support vector machine method for recognizing pitting corrosion on aluminum surfaces. Zhao et al. [47] employed deep learning and the artificial bee colony algorithm to identify material corrosion. Zhang et al. [48] put forward a segmentation process for detecting localized corrosion on rust-removed metallic surfaces using a deep-learning technique. Ivasenko and Chervatyuk [49] proposed an image processing-based method for segmentation of rusted areas on painted construction surfaces. Based on a recent review carried out by Ahuja and Shukla [50], there is an increasing trend of applying image processing techniques for automatic corrosion detection; this work also points out the great potential of machine learning for corrosion recognition performed over large areas. Following this trend of research, the current study proposes a novel approach for automatic detection of pitting corrosion. The proposed method is an integration of image processing techniques, metaheuristic optimization, and machine-learning-based pattern recognition. The individual components needed to establish the newly developed model for pitting corrosion detection are presented in the subsequent section of the article.

3. Research Method

3.1. Multilevel Image Thresholding

In the field of image processing, multilevel thresholding of a gray image is a widely utilized method for coping with a variety of computer vision tasks including feature extraction, object detection, and image analysis [51–55]. In multilevel thresholding, the image histogram is first computed. Based on this histogram, the gray levels of an image are categorized into multiple clusters according to a set of thresholds. To determine such thresholds appropriately, the Otsu criterion, formulated by Otsu [56], which relies on the maximization of the between-class variance, can be used due to its ease of implementation and good image segmentation performance [57–59]. Since the computation of the threshold values based on the Otsu criterion is a computationally expensive image processing operation, metaheuristic approaches are often employed [60, 61].

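Given a digital image with L gray levels 0, 1, …, L − 1, its histogram $H = \{f_0, f_1, \ldots, f_{L-1}\}$ can be constructed. Herein, $f_i$ denotes the occurrence frequency of gray level i. Let $N = \sum_{i=0}^{L-1} f_i$ represent the total number of pixels in the image; the ith gray level occurrence probability is computed as follows:

$$p_i = \frac{f_i}{N}, \qquad p_i \ge 0, \qquad \sum_{i=0}^{L-1} p_i = 1.$$

The objective is to segment the image of interest into K + 1 clusters ($C_0, C_1, \ldots, C_k, \ldots, C_K$) using K thresholds chosen from the set $T = \{t_1, t_2, \ldots, t_K\}$, where $0 \le t_k \le L$. $C_k$ denotes a set of gray levels. For each cluster $C_k$, the cumulative probability $\omega_k$ and the mean gray level $\eta_k$ can be obtained via

$$\omega_k = \sum_{i \in C_k} p_i, \qquad \eta_k = \frac{1}{\omega_k} \sum_{i \in C_k} i\, p_i.$$

The mean intensity $\eta_T$ of the whole image and the between-class variance $\sigma_B^2$ are computed as follows [59]:

$$\eta_T = \sum_{i=0}^{L-1} i\, p_i = \sum_{k=0}^{K} \omega_k \eta_k, \qquad \sigma_B^2 = \sum_{k=0}^{K} \omega_k (\eta_k - \eta_T)^2.$$

It is noted that the threshold levels for a given number of clusters are determined by maximizing the separation between cluster means. Thus, the optimal thresholding values can be attained by maximizing the between-class variance, mathematically stated as follows:

$$T^{*} = \underset{T}{\arg\max}\; \sigma_B^2(T).$$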
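The Otsu objective above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, cluster k is assumed to hold the gray levels from t_k up to (but excluding) t_{k+1}, and a brute-force search stands in for the metaheuristic, which is feasible only for tiny histograms.

```python
from itertools import combinations


def between_class_variance(hist, thresholds):
    """Between-class variance of the clusters induced by the thresholds.

    hist[i] is the frequency of gray level i. Cluster boundaries are
    0 = t_0 < t_1 < ... < t_K < t_{K+1} = L (convention assumed here).
    """
    total = sum(hist)
    probs = [f / total for f in hist]
    eta_total = sum(i * p for i, p in enumerate(probs))
    bounds = [0] + sorted(thresholds) + [len(hist)]
    variance = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        omega = sum(probs[lo:hi])          # cumulative probability of the cluster
        if omega == 0:
            continue                       # an empty cluster contributes nothing
        eta = sum(i * probs[i] for i in range(lo, hi)) / omega
        variance += omega * (eta - eta_total) ** 2
    return variance


def exhaustive_thresholds(hist, k):
    """Brute-force maximizer, shown only to make the objective concrete."""
    candidates = combinations(range(1, len(hist)), k)
    return max(candidates, key=lambda t: between_class_variance(hist, t))
```

For a 256-level image and several thresholds the number of candidate sets grows combinatorially, which is the motivation for handing `between_class_variance` to a metaheuristic as its fitness function.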

3.2. Connected Component Labeling (CCL)

Connected component labeling (CCL) is an operation on a binary image. This method analyzes the binary-1 pixels and divides the binary image into distinctive component regions [62–64]. The CCL is often followed by further property measurement operations on each region [57, 65]. In essence, the CCL algorithm carries out the unit change from individual pixel to region; all pixels that have the value binary 1 and are connected to each other are grouped into one cluster [64]. In this study, the iterative algorithm proposed by Haralick [66] is employed for performing CCL. This method is selected due to its simplicity and the fact that it does not require auxiliary storage to yield the labeled image from the binary image. The iterative algorithm includes an initialization step and a sequence of top-down label propagation followed by bottom-up label propagation, repeated until no label change is observed.
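A minimal sketch of the iterative labeling scheme described above is given below. It is not the authors' implementation: 8-connectivity and the function name `label_components` are assumptions made for illustration, and each pass simply takes the smallest label among a pixel and its labeled neighbors.

```python
def label_components(binary):
    """Iterative connected component labeling (8-connectivity assumed)."""
    rows, cols = len(binary), len(binary[0])
    # Initialization: every foreground pixel receives a unique label.
    labels = [[r * cols + c + 1 if binary[r][c] else 0 for c in range(cols)]
              for r in range(rows)]

    def propagate(order):
        """One pass in the given pixel order; returns True if a label changed."""
        changed = False
        for r, c in order:
            if labels[r][c] == 0:
                continue
            smallest = labels[r][c]
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc]:
                        smallest = min(smallest, labels[rr][cc])
            if smallest != labels[r][c]:
                labels[r][c] = smallest
                changed = True
        return changed

    top_down = [(r, c) for r in range(rows) for c in range(cols)]
    bottom_up = top_down[::-1]
    while True:
        changed_down = propagate(top_down)
        changed_up = propagate(bottom_up)
        if not (changed_down or changed_up):
            return labels   # labels are updated in place, no equivalence table
```

The labels are propagated directly in the label image, which reflects the property noted above that no auxiliary storage is needed.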

3.3. Image Texture Analysis

In the field of computer vision, an image texture is commonly regarded as a set of metrics used to quantify the perceived texture or regional characteristic of an image [64]. This set of metrics provides information regarding the spatial arrangement of color intensity in an image sample. Image textures can express intuitive quantities of the surface image including the degrees of roughness and smoothness. Due to such reasons, image texture analysis can be highly useful for pitting corrosion recognition.

3.3.1. Measurement of Statistical Properties of Color Channels

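Since corrosion often results in areas on the metal surface with distorted color, using the statistical properties of image color channels (red, green, and blue) can be helpful for the task of interest. Let I be a variable that denotes the intensity levels of an image sample. The first-order histogram $P_c(I)$ is obtained as follows [67]:

$$P_c(I) = \frac{N_{I,c}}{H \times W},$$

where c denotes a color channel, $N_{I,c}$ is the number of pixels in channel c having intensity value I, and H and W are the height and width of the image sample, respectively.

Accordingly, the mean ($\mu_c$), standard deviation ($\sigma_c$), skewness ($S_c$), kurtosis ($K_c$), entropy ($E_c$), and range ($R_c$) of color channel c are given as follows [68]:

$$\mu_c = \sum_{I=0}^{N_L-1} I\, P_c(I), \qquad \sigma_c = \sqrt{\sum_{I=0}^{N_L-1} (I - \mu_c)^2 P_c(I)},$$

$$S_c = \frac{1}{\sigma_c^3} \sum_{I=0}^{N_L-1} (I - \mu_c)^3 P_c(I), \qquad K_c = \frac{1}{\sigma_c^4} \sum_{I=0}^{N_L-1} (I - \mu_c)^4 P_c(I),$$

$$E_c = -\sum_{I=0}^{N_L-1} P_c(I) \log_2 P_c(I), \qquad R_c = \max(I_c) - \min(I_c),$$

where, since the image samples are 8 bit, $N_L = 256$ denotes the number of discrete intensity values.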
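The six per-channel statistics can be computed from the first-order histogram as sketched below. The function name is hypothetical, the entropy is assumed to use a base-2 logarithm, and a constant channel (zero standard deviation) is assigned zero skewness and kurtosis purely as a convention of this sketch.

```python
import math


def channel_statistics(pixels, levels=256):
    """Mean, std, skewness, kurtosis, entropy, and range of one color channel.

    pixels is a flat list of integer intensities in [0, levels).
    """
    n = len(pixels)
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    probs = [h / n for h in hist]              # first-order histogram P(I)

    mean = sum(i * p for i, p in enumerate(probs))
    std = math.sqrt(sum((i - mean) ** 2 * p for i, p in enumerate(probs)))
    if std > 0:
        skew = sum((i - mean) ** 3 * p for i, p in enumerate(probs)) / std ** 3
        kurt = sum((i - mean) ** 4 * p for i, p in enumerate(probs)) / std ** 4
    else:
        skew = kurt = 0.0                      # convention for a flat channel
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    value_range = max(pixels) - min(pixels)
    return mean, std, skew, kurt, entropy, value_range
```

Calling the function once per channel (red, green, and blue) yields the 6 × 3 statistical descriptors of one region.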

3.3.2. Gray-Level Co-Occurrence Matrix (GLCM)

Properties extracted from the Gray-Level Co-Occurrence Matrix (GLCM) [69, 70] are highly effective for texture discrimination. This method aims at analyzing the repeated occurrence of certain gray-level patterns existing in an image texture. Therefore, it can be used to assess the coarseness of image regions [67, 71]. Let r and θ denote a distance and a rotation relationship between two pixels. The GLCM, denoted as $P_{r,\theta}(i, j)$, represents the probability of the two gray levels i and j having the relationship specified by r and θ.

As suggested by Haralick et al. [69], four GLCMs with r = 1 and θ = 0°, 45°, 90°, and 135° can be used to characterize an image texture. Accordingly, the measurements of angular second moment (AM), contrast (CO), correlation (CR), and entropy (ET) can be computed and employed for texture discrimination [69, 72]:

$$AM = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} P_{r,\theta}(i, j)^2, \qquad CO = \sum_{k=0}^{N_g-1} k^2 \sum_{|i-j|=k} P_{r,\theta}(i, j),$$

$$CR = \frac{\sum_{i=1}^{N_g} \sum_{j=1}^{N_g} i\, j\, P_{r,\theta}(i, j) - \mu_X \mu_Y}{\sigma_X \sigma_Y}, \qquad ET = -\sum_{i=1}^{N_g} \sum_{j=1}^{N_g} P_{r,\theta}(i, j) \log P_{r,\theta}(i, j),$$

where $N_g$ is the number of gray level values; $\mu_X$, $\mu_Y$, $\sigma_X$, and $\sigma_Y$ denote the means and standard deviations of the marginal distributions associated with $P_{r,\theta}(i, j)$ [69].
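The following sketch builds a GLCM for r = 1 and computes the four indices. Several choices are assumptions rather than details stated in the text: the matrix is made symmetric by counting each pixel pair in both directions, the entropy uses a base-2 logarithm, gray levels are zero-based integers, and the image is assumed large enough to contain at least one pixel pair in the chosen direction. Contrast is written as a sum of (i − j)² P(i, j), which equals the grouped form over |i − j| = k.

```python
import math

# (row offset, column offset) for distance r = 1 at each angle in degrees
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, angle, levels):
    """Symmetric, normalized co-occurrence matrix of an integer image."""
    d_row, d_col = OFFSETS[angle]
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + d_row, c + d_col
            if 0 <= rr < rows and 0 <= cc < cols:
                i, j = image[r][c], image[rr][cc]
                counts[i][j] += 1
                counts[j][i] += 1          # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Angular second moment, contrast, correlation, and entropy."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    am = sum(v * v for _, _, v in cells)
    co = sum((i - j) ** 2 * v for i, j, v in cells)
    mu_x = sum(i * v for i, _, v in cells)
    mu_y = sum(j * v for _, j, v in cells)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * v for i, _, v in cells))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * v for _, j, v in cells))
    cov = sum(i * j * v for i, j, v in cells) - mu_x * mu_y
    cr = cov / (sd_x * sd_y) if sd_x > 0 and sd_y > 0 else 0.0
    et = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    return am, co, cr, et
```

Evaluating `glcm_features(glcm(image, angle, levels))` for the four angles gives the 4 × 4 GLCM-based descriptors of one region.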

3.3.3. Local Binary Pattern (LBP)

Local Binary Patterns (LBP), proposed in [73, 74], is a nonparametric approach used to summarize local structures of an image sample. This approach essentially compares each pixel with its neighboring ones. The notable advantages of the LBP are its computational efficiency and tolerance of monotonic illumination variations [75]. This image processing technique has been applied successfully for analyzing texture and has also been extended to various computer vision tasks including face image analysis [76–78], image and video retrieval [79], visual inspection [80], remote sensing [81], and human action recognition [82].

The LBP algorithm labels the pixels of an image sample with LBP codes, which express the local structure around each pixel of interest. The size of the neighborhood is usually 3 × 3. Accordingly, the center pixel is compared with its eight neighbors. A neighboring pixel is coded as 1 if its gray intensity is greater than or equal to that of the center pixel. Otherwise, the neighboring pixel is coded as 0. Given a center pixel at ($x_c$, $y_c$), its LBP can be obtained as follows [75]:

$$\mathrm{LBP}(x_c, y_c) = \sum_{p=0}^{7} s(i_p - i_c)\, 2^p,$$

where $i_c$ and $i_p$ denote the gray intensities of the center pixel and its pth neighboring pixel, respectively, and s(x) is 1 if x ≥ 0 and 0 if x < 0.
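As a small illustration of the 3 × 3 coding rule, the sketch below computes the LBP code of one interior pixel. The neighbor ordering (clockwise from the top-left corner) is an assumption; any fixed ordering yields a valid code, but the numeric values depend on it.

```python
def lbp_code(image, r, c):
    """8-neighbor LBP code of the interior pixel at row r, column c."""
    center = image[r][c]
    # Neighbors p = 0..7, clockwise from the top-left corner (assumed order).
    neighbors = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                 (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for p, (d_row, d_col) in enumerate(neighbors):
        if image[r + d_row][c + d_col] - center >= 0:   # s(i_p - i_c) = 1
            code += 2 ** p
    return code


def lbp_codes(image):
    """LBP codes of all interior pixels (border pixels are skipped)."""
    rows, cols = len(image), len(image[0])
    return [lbp_code(image, r, c)
            for r in range(1, rows - 1) for c in range(1, cols - 1)]
```

For the patch `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]`, the neighbors 6, 9, 8, and 7 are not smaller than the center value 5, so the bits p = 3, 4, 5, 6 are set and the code is 8 + 16 + 32 + 64 = 120.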

3.4. Support Vector Machine (SVM) for Pattern Recognition

The Support Vector Machine (SVM), formulated by Vapnik [83], is a powerful method for solving nonlinear and complex pattern recognition tasks. This machine-learning approach relies on the concept of structural risk minimization to construct the decision hyperplane that classifies data into distinctive groups. Therefore, the SVM is less prone to overfitting than other machine-learning methods such as neural networks, which rely on the theory of empirical risk minimization [18, 84]. Notably, to deal with noisy data as well as nonlinear separability problems, the SVM employs the framework of maximum margin construction and kernel tricks [22, 85–87]. In addition, the learning phase of an SVM model used for data classification is formulated as solving a convex optimization problem (i.e., a quadratic programming problem); this fact ensures a global convergence of the model training phase [18].

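Given a training dataset $\{x_k, y_k\}_{k=1}^{N}$ with a vector $x_k$ representing image texture information and a scalar $y_k \in \{-1, +1\}$ denoting the class label (either nonpitting corrosion or pitting corrosion), an SVM model constructs a decision boundary expressed in the form of a hyperplane to separate the data samples into two categories. To do so, it is required to solve a nonlinear programming problem described as follows [88–90]:

$$\min_{w, b, \xi}\; \frac{1}{2} w^T w + c \sum_{k=1}^{N} \xi_k \quad \text{subject to} \quad y_k \left( w^T \varphi(x_k) + b \right) \ge 1 - \xi_k, \quad \xi_k \ge 0, \quad k = 1, \ldots, N,$$

where $w$ denotes a normal vector to the classification hyperplane and $b \in \mathbb{R}$ is the model bias; $\xi_k$ represents slack variables; c denotes a penalty constant; and $\varphi(x)$ is the employed nonlinear data mapping function.

Notably, an explicit expression of the data mapping function $\varphi(x)$ is not required. The model training and prediction phases only necessitate the inner product of $\varphi(x_k)$ and $\varphi(x_l)$, which is evaluated in the input space and is called a kernel function:

$$K(x_k, x_l) = \varphi(x_k)^T \varphi(x_l).$$

The Radial Basis Function Kernel (RBFK) is often used for nonlinear data classification:

$$K(x_k, x_l) = \exp\left( -\frac{\lVert x_k - x_l \rVert^2}{2\sigma^2} \right),$$

where σ denotes a model hyperparameter which needs to be specified by the user.

Accordingly, the final SVM model used for pitting corrosion detection is given by

$$y(x_l) = \operatorname{sign}\left( \sum_{k=1}^{SV} \alpha_k\, y_k\, K(x_k, x_l) + b \right),$$

where $\alpha_k$ represents the solution of the dual form of the abovementioned nonlinear programming problem and SV denotes the number of support vectors found by the model training phase.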
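To make the prediction step concrete, the sketch below evaluates an RBF kernel and the resulting decision function for an already trained model. It assumes the common parameterization exp(−‖x − z‖² / (2σ²)) of the kernel; the support vectors, multipliers, and bias passed in are placeholders that would come from a solver, not values from this study.

```python
import math


def rbf_kernel(x, z, sigma):
    """Radial basis function kernel with width parameter sigma."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def svm_predict(x, support_vectors, alphas, labels, bias, sigma):
    """Sign of the kernel expansion over the support vectors (+1 or -1)."""
    score = sum(alpha * y * rbf_kernel(sv, x, sigma)
                for sv, alpha, y in zip(support_vectors, alphas, labels))
    return 1 if score + bias >= 0 else -1
```

Only kernel evaluations between the query sample and the support vectors are needed, so the mapping function itself never has to be computed.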

3.5. The History-Based Adaptive Differential Evolution with Linear Population Size Reduction (LSHADE)

The History-Based Adaptive Differential Evolution with Linear Population Size Reduction (LSHADE), proposed in [35, 91], is a popular variant of the standard Differential Evolution (DE) [92]. The LSHADE inherits the integrated mutation-crossover operation of the DE used for exploring and exploiting the search space. Tanabe and Fukunaga [91] also enhanced the searching capability of the standard optimization algorithm by a novel adaptive strategy used for fine-tuning the mutation scale (F) and the crossover probability (CR) coefficients, which are the two crucial hyperparameters of the DE. This adaptive strategy is based on a record of successful population members. In addition, to improve the mutation operator, a method called DE/current-to-pbest/1 is implemented. Finally, to improve the convergence rate of the LSHADE, a population size shrinking schedule is implemented. Due to such advantages, superior performance of the LSHADE algorithm has been reported in various comparative studies [36, 93–95].

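The overall structure of the LSHADE metaheuristic method is illustrated in Figure 2. Given the searched domain (lower and upper boundaries), the number of decision variables, and the population size (PS), a population of PS members is randomly generated within the searched domain. The DE/current-to-pbest/1 and the crossover operations are used to generate a new candidate solution via the creation of a mutated solution ($v_{i,g+1}$) and a trial solution ($u_{i,g+1}$) [35, 91]:

$$v_{i,g+1} = x_{i,g} + F \left( x_{pbest,g} - x_{i,g} \right) + F \left( x_{r1,g} - x_{r2,g} \right),$$

$$u_{j,i,g+1} = \begin{cases} v_{j,i,g+1}, & \text{if } rand_j \le CR \text{ or } j = j_{rand}, \\ x_{j,i,g}, & \text{otherwise}, \end{cases}$$

where $x_{pbest,g}$ denotes a solution selected from among the best found solutions at generation g; r1 and r2 are randomly chosen indices of population members; j indexes the decision variables and $j_{rand}$ is a randomly chosen variable index; and rand represents a uniform random number ranging between 0 and 1.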

Based on the fitness values of the trial solution and its corresponding parent, a selection operation is performed to preserve the better solution and cast out the worse one. Moreover, the LSHADE relies on two archives of MF and MCR which record the mean values of the mutation scale and the crossover probability. To update these mean values, the two sets of SF and SCR storing all F and CR values of successful child solutions are used. Furthermore, at the end of each generation, the population size is shrunk by casting out inferior population members to enhance the algorithm convergence speed [35, 96].
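The three LSHADE ingredients described above (current-to-pbest mutation, crossover, and population shrinking) can be sketched as follows for a minimization problem. This is a simplified illustration, not the full algorithm: the success-history memories and the external archive are omitted, `p = 0.1` and the minimum population size of 4 are assumed values, and all function names are hypothetical.

```python
import random


def mutate_current_to_pbest(pop, fitness, i, f, p=0.1, rng=random):
    """DE/current-to-pbest/1 mutation of member i (lower fitness is better)."""
    n = len(pop)
    ranked = sorted(range(n), key=lambda k: fitness[k])
    top = ranked[:max(1, round(p * n))]            # the best p*100% members
    pbest = pop[rng.choice(top)]
    r1, r2 = rng.sample([k for k in range(n) if k != i], 2)
    return [x + f * (pb - x) + f * (a - b)
            for x, pb, a, b in zip(pop[i], pbest, pop[r1], pop[r2])]


def binomial_crossover(parent, mutant, cr, rng=random):
    """Each variable comes from the mutant with probability cr; at least one does."""
    j_rand = rng.randrange(len(parent))
    return [m if (rng.random() <= cr or j == j_rand) else x
            for j, (x, m) in enumerate(zip(parent, mutant))]


def population_size(evals_used, max_evals, n_init=20, n_min=4):
    """Linear reduction of the population size from n_init down to n_min."""
    return round((n_min - n_init) / max_evals * evals_used + n_init)
```

In a full run, the trial vector replaces its parent only if its fitness is not worse, and after each generation the worst members are removed until the population matches `population_size`.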

4. The Proposed Metaheuristic Optimized Image Processing and Machine-Learning Method for Automatic Pitting Corrosion Detection

This section of the article aims at describing the structure of the proposed metaheuristic optimized image processing and machine-learning method for automatic recognition of pitting corrosion. The overall structure of the proposed model is presented in Figure 3 which can be divided into several steps.

4.1. Image Sample Preparation

In the first step, to construct the SVM machine-learning model used for pitting corrosion recognition, the set of image samples capturing the texture of metal structures must be prepared. This image set includes samples which contain pitting corrosion and samples without such a defect. Based on the collected image samples, regions of interest are extracted within which areas on metal surfaces having distorted color are identified. During field trips in Danang city (Vietnam), 120 image samples containing pitting corrosion (the positive class) have been collected and labeled by human inspectors. It is noted that one image sample may have multiple areas of pitting corrosion. Accordingly, the number of extracted regions of interest that belong to the positive class is 124. To guarantee a balanced dataset, the number of negative (without pitting corrosion) data samples should also be 124. Therefore, a set of 93 image samples with no pitting corrosion areas has been included in this study. These 93 image samples have been used to extract 124 regions of interest belonging to the negative class. To train and test the proposed machine-learning model used for pitting corrosion detection, the collected dataset has been divided into two subsets of training (90%) and testing (10%) datasets. The training set is utilized for model construction and the testing set is reserved for verifying the model predictive capability.

It is proper to note that the ground truth labels of data samples have been determined by human inspectors. Herein, the label = −1 denotes the negative class and the label = 1 denotes the positive class. The collected images in this study have been taken by the Canon EOS M10 (CMOS 18.0 MP) and the Nokia 7.2 (48 MP main sensor). To enhance the speed of the texture computation phase and to ensure the consistency of an image region, the image size has been set to be 64 × 64 pixels. The image samples are illustrated in Figure 4. Additionally, to better cope with the diversity of the metal surface, the negative class of nonpitting corrosion deliberately includes samples of intact surfaces and stains.

4.2. Extraction of Region of Interest (ROI)

Since an area subject to pitting corrosion can have a diverse form of shape, it is necessary to automatically identify the ROI that covers this area. Image texture of this ROI can be subsequently computed and used for pitting corrosion detection. This study proposes a novel integration of metaheuristic optimized multilevel image thresholding, morphological operation, and other image processing techniques to determine the ROI from images of the metal surface.

The whole process of ROI extraction is presented in Algorithm 1. The first step is to smooth the image sample by removing noise; a median filter with a window size of 7 × 7 is implemented. The color image is then converted to its corresponding gray-scale one. Based on this gray-scale image, the multilevel image thresholding with the aforementioned Otsu objective function is carried out. Based on several trial-and-error experiments with the collected images, the appropriate number of pixel groups has been found to be three. The LSHADE metaheuristic is then employed to determine the most desired thresholds used to separate image pixels into distinctive groups. The metaheuristic-based multilevel image thresholding phase is illustrated in Figure 5. The population size of the LSHADE and the maximum number of evolutionary generations are experimentally set to be 20 and 100, respectively.

Specify an input image sample (I) and the number of thresholds (K)
Apply median filter to I to obtain IMF
Convert IMF into a gray-scale image IG
Use LSHADE to identify the set of optimal thresholds
Perform image multilevel thresholding on IG
For each image segment k
 Perform image binarization to obtain IBin,k
 Perform morphological operations on IBin,k (filling and removing noise)
 Remove background
 Perform CCL operation on IBin,k to obtain a list of objects
 For each object j within IBin,k
  Construct its binary image
  Perform image convolution
  Identify the enclosing rectangle IREC
  Perform image cropping operation to obtain ROI
 End For
End For
Return ROIs

Based on the thresholded image, morphological operations (filling and removing small objects) are performed. In addition, the operation of background removal is carried out. Herein, an object is defined as background if its width or height is equal to that of the whole image sample. Based on the binary image representing each pixel group, the CCL is used to identify separated pitting corrosion areas. Subsequently, image convolution and image cropping operations are used to extract ROI. This whole process of ROI extraction is displayed in Figures 6 and 7.
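The background rule and the cropping step can be illustrated with the sketch below, which takes a label image such as the one produced by CCL. It covers only the enclosing-rectangle, background-removal, and cropping operations; the filtering, morphological, and convolution steps are left out, and the function names are hypothetical.

```python
def bounding_boxes(labels):
    """Map each nonzero label to its (top, left, bottom, right) rectangle."""
    boxes = {}
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            if label == 0:
                continue
            top, left, bottom, right = boxes.get(label, (r, c, r, c))
            boxes[label] = (min(top, r), min(left, c),
                            max(bottom, r), max(right, c))
    return boxes


def extract_rois(image, labels):
    """Crop the enclosing rectangle of every object that is not background."""
    rows, cols = len(labels), len(labels[0])
    rois = []
    for top, left, bottom, right in bounding_boxes(labels).values():
        height, width = bottom - top + 1, right - left + 1
        if height == rows or width == cols:
            continue    # object spans the full height or width: background
        rois.append([row[left:right + 1] for row in image[top:bottom + 1]])
    return rois
```

Each returned ROI is a rectangular sub-image from which the texture descriptors can subsequently be computed.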

4.3. Image Texture Computation

As mentioned earlier, image texture is used in this study to characterize the features of the metal surface. The statistical measurements of color channels, GLCM, and LBP descriptors are computed based on each extracted ROI. For each image sample, one color channel yields six statistical measurements of mean, standard deviation, skewness, kurtosis, entropy, and range. Therefore, the number of statistical measurements of color channels is 6 × 3 = 18. In addition, the four co-occurrence matrices corresponding to the directions of 0°, 45°, 90°, and 135° are computed, and each of them yields the four indices of the angular second moment, contrast, correlation, and entropy. Hence, the number of GLCM-based texture descriptors is 4 × 4 = 16. Finally, 59 features which are the histogram of the LBP after removing nonuniform patterns [73] are included in the feature set. Accordingly, the total number of texture descriptors is 18 + 16 + 59 = 93.
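The figure of 59 LBP features is consistent with the usual uniform-pattern histogram for eight neighbors: 58 uniform codes plus one bin. The sketch below assumes that the 59th bin collects all nonuniform codes, which is the common convention and may differ from the exact treatment used in this study; the names are illustrative.

```python
def transitions(code):
    """Number of 0/1 changes in the circular 8-bit pattern of an LBP code."""
    bits = [(code >> p) & 1 for p in range(8)]
    return sum(bits[p] != bits[(p + 1) % 8] for p in range(8))


# A code is uniform when its circular pattern has at most two transitions.
UNIFORM_CODES = [code for code in range(256) if transitions(code) <= 2]
BIN_OF = {code: k for k, code in enumerate(UNIFORM_CODES)}


def uniform_lbp_histogram(codes):
    """Normalized histogram: one bin per uniform code plus one shared bin."""
    shared_bin = len(UNIFORM_CODES)
    hist = [0] * (shared_bin + 1)
    for code in codes:
        hist[BIN_OF.get(code, shared_bin)] += 1
    return [count / len(codes) for count in hist]
```

With 18 color statistics and 16 GLCM indices, appending this 59-bin histogram gives the 93-dimensional feature vector of one region.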

4.4. The Model Hyperparameter Optimization and the SVM Classification (SVC) for Pitting Corrosion Detection

Based on the above stated ROI extraction and image texture computation, a dataset consisting of 248 data samples and 93 input features can be used for the subsequent model hyperparameter optimization and the SVM classification (SVC) for pitting corrosion detection. It is noted that the created dataset has two class outputs: −1 for nonpitting corrosion (negative class) and +1 for pitting corrosion (positive class). Furthermore, for standardizing the data ranges, this dataset has been preprocessed by the Z-score data normalization described by the following equation:

$$X_{ZN} = \frac{X_o - m_X}{s_X},$$

where $X_o$ and $X_{ZN}$ denote an original and a normalized feature, respectively, and $m_X$ and $s_X$ represent the mean and the standard deviation of the original feature, respectively.
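A minimal sketch of the Z-score normalization is shown below. Two details are assumptions: the sample standard deviation is used, and a feature with zero spread is mapped to 0 to avoid division by zero. The statistics are fitted separately from their application so that values estimated on training data can be reused on testing data.

```python
import statistics


def fit_zscore(rows):
    """Column-wise mean and sample standard deviation of a list of rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(column) for column in columns]
    stds = [statistics.stdev(column) for column in columns]
    return means, stds


def apply_zscore(rows, means, stds):
    """Normalize every feature; constant features are set to 0.0."""
    return [[(x - m) / s if s > 0 else 0.0
             for x, m, s in zip(row, means, stds)]
            for row in rows]
```

For example, the feature values 1 and 3 have mean 2 and sample standard deviation √2, so they are mapped to about −0.707 and 0.707.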

After the dataset has been standardized, the principal component analysis is applied for dimensionality reduction. Three values of total variance explained (90%, 95%, and 99%) are used to select the appropriate number of input features used for the SVC-based pitting corrosion recognition. In addition, the LSHADE metaheuristic approach is utilized to search for the most desired values of the SVM model hyperparameters including the penalty coefficient (c) and the kernel function parameter (σ). The LSHADE randomly generates an initial population of the SVM model hyperparameters. In each evolutionary generation, this metaheuristic method gradually explores and exploits the search space for better solution candidates capable of yielding high quality pitting corrosion detection models. Notably, the feasible domains of the penalty coefficient (c) and the kernel function parameter (σ) are [1, 1000] and [0.1, 1000], respectively. In addition, the population size and the number of the LSHADE searching generations are selected to be 20 and 100, respectively.
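Selecting the number of principal components for a target share of explained variance reduces to a cumulative sum over the eigenvalues of the covariance matrix. The sketch below assumes those eigenvalues have already been computed by a PCA routine; the function name is hypothetical.

```python
def components_for_variance(eigenvalues, target):
    """Smallest number of leading components whose variance share >= target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return count
    return len(ordered)
```

Calling the function with targets of 0.90, 0.95, and 0.99 on the eigenvalues of the standardized feature set gives the three dimensionality-reduction scenarios considered here.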

In order to optimize the SVM performance, a K-fold crossvalidation (with K = 5) is employed in this study. The average predictive performance obtained from this crossvalidation is employed to quantify the model predictive capability. Accordingly, the following cost function (CF) is used by the proposed integration of LSHADE and SVM used for pitting corrosion detection:where FNRk and FPRk represent the false negative rate (FNR) and the false positive rate (FPR) obtained from kth run of the aforementioned K-fold crossvalidation, respectively.

The FNR and FPR are calculated as follows:

$$\mathrm{FNR} = \frac{FN}{FN + TP}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$

where FN, FP, TP, and TN denote false negative, false positive, true positive, and true negative data samples, respectively.
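The evaluation of one hyperparameter candidate can be sketched in Python as shown below. The sketch assumes that the cost function is the average over the K folds of the sum FNR_k + FPR_k, which is one plausible reading of the description above, and that the confusion counts of each fold have already been obtained from a trained SVM. The function names, the dictionary layout, and the fold counts are hypothetical.

```python
def false_negative_rate(fn, tp):
    """FNR = FN / (FN + TP); returns 0.0 when the fold has no positive samples."""
    return fn / (fn + tp) if (fn + tp) else 0.0


def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN); returns 0.0 when the fold has no negative samples."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def cost_function(folds):
    """Average of FNR_k + FPR_k over the folds (assumed form of the CF).

    Each fold is a dict holding the confusion counts TP, TN, FP, and FN.
    """
    total = 0.0
    for fold in folds:
        total += false_negative_rate(fold["FN"], fold["TP"])
        total += false_positive_rate(fold["FP"], fold["TN"])
    return total / len(folds)


# Toy confusion counts for two folds (made up, not from the study).
toy_folds = [
    {"TP": 20, "TN": 18, "FP": 2, "FN": 0},
    {"TP": 15, "TN": 20, "FP": 0, "FN": 5},
]
cost = cost_function(toy_folds)
```

A lower cost indicates fewer missed corrosion samples and fewer false alarms, so a minimizing optimizer such as LSHADE would prefer hyperparameter candidates with smaller values of this quantity.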

5. Experimental Results and Discussions

The proposed metaheuristic approach for pitting corrosion detection, named LSHADE-SVC-PCD, has been developed in the Visual C#.NET environment (Framework 4.6.2). The LSHADE optimization method and the employed image processing techniques have been implemented by the authors; the SVM classification model is implemented via built-in functions of the Accord.NET Framework [97]. Experiments with the newly developed LSHADE-SVC-PCD model are performed on an ASUS FX705GE-EW165T (Core i7 8750H, 8 GB RAM, and 256 GB solid-state drive).

To train and test the integrated LSHADE-SVC-PCD model, the collected dataset has been divided into training and testing subsets. The training dataset, accounting for 90% of the original dataset, is used for model construction, and the remaining 10% is reserved for testing the model generalization. In addition, to alleviate the effect of randomness caused by data sampling and to evaluate the predictive capability of the newly developed method reliably, the training and testing data sampling process has been repeated 20 times, with the testing dataset randomly redrawn each time. The datasets used for model training and testing are illustrated in Table 1.
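The repeated 90/10 data sampling can be sketched as follows. The sketch assumes plain random sampling without stratification, a testing subset size rounded to the nearest integer, and a fixed seed for reproducibility; none of these details are stated in the text, and the function name is hypothetical.

```python
import random


def repeated_holdout_splits(n_samples, test_fraction=0.10, n_repeats=20, seed=0):
    """Return a list of (train_indices, test_indices) pairs.

    Each repetition draws a new random testing subset; the remaining
    samples form the training subset.
    """
    rng = random.Random(seed)  # assumption: fixed seed for reproducibility
    n_test = round(n_samples * test_fraction)  # assumption: nearest-integer size
    splits = []
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        test_indices = sorted(indices[:n_test])
        train_indices = sorted(indices[n_test:])
        splits.append((train_indices, test_indices))
    return splits


# 248 samples, as in the collected dataset; 20 repetitions of a 90/10 split.
splits = repeated_holdout_splits(248)
```

Averaging the performance indices over the 20 repetitions reduces the dependence of the reported results on any single random partition.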

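Moreover, to assess the predictive capability of the proposed LSHADE-SVC-PCD, the classification accuracy rate (CAR), precision, recall, negative predictive value (NPV), and F1 score are computed from the TP, TN, FP, and FN counts as follows [98]:

$$\mathrm{CAR} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

$$\mathrm{NPV} = \frac{TN}{TN + FN}$$

$$\mathrm{F1\ score} = \frac{2\,TP}{2\,TP + FP + FN}$$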
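For illustration, the five indices can be computed from the confusion counts with the short Python sketch below, using their standard definitions. The function names and the confusion counts are made up for this example and do not reproduce any result of the study.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Compute CAR (%), precision, recall, NPV, and F1 score from confusion counts."""
    return {
        "CAR": 100.0 * _ratio(tp + tn, tp + tn + fp + fn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "NPV": _ratio(tn, tn + fn),
        "F1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy confusion counts for a testing subset of 25 samples (made up).
metrics = classification_metrics(tp=12, tn=11, fp=1, fn=1)
```

Precision and NPV describe how trustworthy the positive and negative predictions are, whereas recall describes how many of the true pitting corrosion samples are found; the F1 score combines precision and recall into a single value.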

The LSHADE with an initial population size of 20 and a maximum of 100 evolutionary generations has been employed to determine the most appropriate set of SVM hyperparameters, namely the penalty parameter (c) and the kernel function parameter (σ). As stated earlier, the original dataset with 93 features has been inspected by the commonly used principal component analysis (PCA) to explore the possibility of dimensionality reduction. Based on the PCA outcomes for the three scenarios of total variance explained (90%, 95%, and 99%), the dimensionality of the original dataset can be reduced to 4, 8, and 19, respectively. The evolutionary progress of the LSHADE-optimized SVM for these three scenarios of dimensionality reduction is shown in Figure 8. The LSHADE optimization results are reported in Table 2. The prediction performances of the LSHADE-SVC-PCD for the different cases are summarized in Table 3. As can be seen from this table, the dataset of Case 3 yields the best testing performance, with CAR = 91.80%, precision = 0.91, recall = 0.94, NPV = 0.93, and F1 score = 0.92. Therefore, this dataset has been selected for the subsequent model comparison.

In addition, to demonstrate the superiority of the proposed LSHADE-SVC-PCD, the Backpropagation Neural Network (BPNN) [99, 100] and the Random Forest (RF) models are used as benchmark methods. The BPNN- and RF-based classifiers have been developed in Visual C#.NET by the authors and trained in mini-batch mode [100, 101]; the batch size is 32, and the number of neurons in the hidden layer is set to $2D_X/3 + C_N$ according to the recommendation of Heaton [102]; herein, $D_X$ and $C_N$ are the numbers of features and class outputs, respectively. The BPNN model is constructed with the sigmoidal activation function, a maximum of 1000 training epochs, and a learning rate of 0.01. Moreover, the RF model is established with 50 individual classification trees.

The prediction performances of the proposed LSHADE-SVC-PCD model and the two benchmark models are summarized in Table 4, and box plots are provided in Figure 9. As can be observed, the performance of the LSHADE-SVC-PCD (CAR = 91.80%, precision = 0.91, recall = 0.94, NPV = 0.93, and F1 score = 0.92) is better than those of the BPNN (CAR = 86.15%, precision = 0.90, recall = 0.84, NPV = 0.82, and F1 score = 0.87) and the RF (CAR = 77.31%, precision = 0.87, recall = 0.75, NPV = 0.68, and F1 score = 0.80). Furthermore, the two-sample t-test [103] on the CAR index is employed to confirm the statistical significance of the differences in predictive performance. This test inspects the null hypothesis that the prediction performances of two models are drawn from normal distributions with equal means. In this experiment, the significance level of the test is set to 0.05, and the results of the hypothesis testing are provided in Table 5. As can be seen from the test results, the p-values are below 0.05, so the null hypothesis is rejected.
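The test statistic of a two-sample t-test can be sketched in Python as follows. Since the text does not state which variant of the test is used, the sketch assumes the unequal-variance (Welch) form with the Welch-Satterthwaite degrees of freedom. The function name and the CAR values are hypothetical, and the conversion of the statistic into a p-value, which requires the cumulative distribution function of the t-distribution, is left to a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) of Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Toy CAR values (%) of two models over four repetitions (made up).
car_model_a = [92.0, 88.0, 96.0, 92.0]
car_model_b = [80.0, 76.0, 84.0, 80.0]
t_stat, dof = welch_t_test(car_model_a, car_model_b)
```

A large absolute value of the statistic relative to the critical value of the t-distribution at the chosen significance level leads to the rejection of the null hypothesis of equal means.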

Moreover, the coefficients of variation of the proposed model and the two benchmark models are provided in Figure 10. The coefficient of variation (COV) [104], also known as the relative standard deviation, is an index for measuring the dispersion of a probability distribution. The COV is computed as the ratio of the standard deviation to the mean and can express the reliability of a prediction model’s performance [105]. Generally, a small COV value corresponds to a small variation in the prediction outcome and indicates a reliable machine-learning model. As can be seen from Figure 10, the COV values computed for the CAR index in the training and testing phases point out that the LSHADE-SVC-PCD (COV = 0.78% for the training phase and COV = 6.46% for the testing phase) is superior to the BPNN (COV = 2.16% for the training phase and COV = 6.53% for the testing phase) and the RF (COV = 8.43% for the training phase and COV = 14.15% for the testing phase). Thus, the results confirm that, among the compared models, the proposed LSHADE-SVC-PCD is best suited for the task of pitting corrosion detection.
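The COV of a performance index over repeated runs can be computed as in the following Python sketch. The use of the sample standard deviation is an assumption, as are the function name and the CAR values, which are made up for this example.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """COV in percent: ratio of the standard deviation to the mean."""
    # Assumption: sample (n - 1) standard deviation.
    return 100.0 * stdev(values) / mean(values)


# Toy CAR values (%) from three repeated runs (made up).
toy_car = [90.0, 92.0, 94.0]
cov = coefficient_of_variation(toy_car)
```

Because the COV is normalized by the mean, it allows the stability of models with different average accuracy levels to be compared on a common scale.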

Based on the experimental results, the proposed framework, which integrates the LSHADE metaheuristic, multilevel image thresholding, image processing, and SVM-based pattern recognition, can deliver satisfactory performance on the task of pitting corrosion detection. Nevertheless, since the role of the LSHADE metaheuristic in this study is two-fold, namely optimizing the multilevel image thresholding and fine-tuning the SVM-based pattern recognition model, a future direction of the current work is to apply multiobjective metaheuristic optimization methods [23, 26, 106]. In addition, since the background of metal surfaces may contain noisy objects, sophisticated image quality enhancement methods including image dehazing [107–110], image filtering [111–114], gradient profile prior [115–118], and deep-learning-based image fusion [119–121] can be useful for improving the accuracy rate of pitting corrosion detection.

6. Conclusions

Pitting corrosion is a severe form of damage that can bring about critical failure of infrastructure elements. This study has put forward an intelligent method for automatic detection of this damage that employs the LSHADE metaheuristic, SVM machine learning, and image-processing techniques. The application of the metaheuristic in this study is multifold. The LSHADE metaheuristic is first utilized in the task of multilevel image thresholding to extract the ROI, which is subsequently used by the texture descriptors (the statistical measurements of color channels, the GLCM, and the LBP). The LSHADE optimizer is then applied to search for the most appropriate SVM hyperparameters, namely the penalty coefficient and the kernel function parameter. The SVM is then employed to construct a classification boundary that separates input data into the two categories of pitting corrosion and nonpitting corrosion. Experimental results obtained from repeated data sampling with 20 runs confirm that the newly developed LSHADE-SVC-PCD is highly suitable for the computer vision task of interest, with CAR = 91.80%, precision = 0.91, recall = 0.94, NPV = 0.93, and F1 score = 0.92. Hence, the newly developed model can be a useful tool to assist infrastructure management agencies in the task of periodic structural health survey.

Although the LSHADE-SVC-PCD is capable of delivering good predictive outcomes, one shortcoming of the study is that the number of collected image samples is still limited. Therefore, more effort should be devoted to data collection to enlarge the image dataset; this may help to improve the generalization of the pitting corrosion detection method. Another limitation of the current study is that the LSHADE-SVC has not been integrated with feature selection methods. Future directions of the current study may include the utilization of more sophisticated component-labeling algorithms, texture descriptors, and feature selection methods to enhance the feature extraction phase. In addition, when the size of the collected dataset is enlarged, advanced deep-learning models are worth attempting to obtain higher detection accuracy. Although the LSHADE metaheuristic has resulted in good optimization outcomes, this method has the drawback of requiring considerable computational expense. Therefore, other sophisticated hyperparameter tuning methods (e.g., Bayesian optimization [122]) can be used as alternative approaches for optimizing the SVC-based pitting corrosion detection model.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was financially supported by Duy Tan University.