Abstract

To improve the accuracy and efficiency of strip surface defect recognition and classification, a Rough Set (RS) attribute reduction algorithm based on Particle Swarm Optimization (PSO) was applied to the optimal selection of decision features from strip surface defect images. The reduction removes redundant attributes, provides simplified data for the subsequent Support Vector Machine (SVM) model, and shortens the SVM learning time. A multikernel SVM classifier was then constructed and optimized using Second-Order Cone Programming (SOCP). Six kinds of typical defects, namely, rust, scratch, orange peel, bubble, surface crack, and rolled-in scale, were recognized and classified with this classifier. The experimental results show that the classification accuracy of the proposed algorithm is 99.5%, which is higher than that of the SVM algorithm and the Relevance Vector Machine (RVM) algorithm. Because the RS attribute reduction based on the PSO algorithm reduces the SVM learning time, the average running time of the classification and recognition model is 58.3 ms. In summary, the PSO-RS&SOCP-SVM model is superior both in running time and in accuracy and is worthy of wider application.

1. Introduction

Strip surface defect recognition is a complicated pattern recognition problem with multiple features and multiple defect types. The features of different image types overlap to some extent, so some feature dimensions are correlated with one another; the resulting distortion of the image description leads to mismatches in subsequent recognition. Moreover, high-dimensional feature data complicate the clustering and recognition computations, lowering efficiency so that the speed and effectiveness requirements of recognition cannot be met.

In recent years, the methods used for feature selection and dimension reduction include the exhaustive method, the branch and bound method, Tabu search, simulated annealing, self-organizing neural networks, Principal Component Analysis (PCA), the Genetic Algorithm (GA), Locally Linear Embedding (LLE), and Locality Preserving Mapping (LPM). However, some of these algorithms easily fall into local optima, and some damage the topological structure of the data. The Genetic Algorithm is an efficient optimization method, but its complex genetic operations make it difficult to meet expectations in convergence speed and accuracy. When the surface defect features of steel are optimized, these algorithms cannot simultaneously preserve the validity of the defect characteristics and reduce the computational time complexity. The ReliefF family of algorithms [1] is a feature evaluation approach for large-scale data sets with high real-time requirements; it is fast and has good general performance. In [2, 3], the ReliefF algorithm is correspondingly improved and supplemented.

In view of this situation, it is necessary to further study new feature selection and dimension reduction methods to meet the needs of cold strip steel surface defect detection. Pawlak proposed Rough Set (RS) theory [4], a mathematical tool for handling fuzzy and imprecise knowledge, which has been widely used in machine learning, data mining, intelligent control, and other fields in recent years [5]. The Support Vector Machine (SVM) has also proved to be an effective tool for classification [6, 7]. However, many difficulties remain to be solved, such as the optimization of the kernel function and the need to improve the testing speed; in particular, for large-scale problems the training speed is too slow.

Therefore, we study decision feature selection and classification of cold rolled strip surface defect images based on Rough Set theory and the Support Vector Machine. Firstly, for six kinds of typical strip surface defects, namely, rust, scratch, orange peel, bubble, surface crack, and rolled-in scale, twenty-dimensional feature vectors covering geometric, gray, and texture features are extracted to construct the decision table for defect recognition, and the RS attribute reduction algorithm is used to eliminate contradictory, unimportant, and redundant attributes. The key indicators that reflect the defects are thus obtained, which provide simplified modeling data for the SVM model and reduce its learning time. Secondly, because common attribute reduction methods are very slow on high-dimensional decision tables, we study an attribute reduction algorithm based on evolutionary computation to handle such tables; the reduced information is then used as the modeling data of the SVM, which classifies and identifies it. Thirdly, in order to improve the classification accuracy, a multikernel SVM classifier is designed and constructed, and the multikernel learning problem is transformed into a Second-Order Cone Programming (SOCP) problem. Finally, the proposed model is used to classify and identify the surface defect images of the steel strip, and the effectiveness of the proposed method is verified.

2. Rough Set Attribute Reduction Based on PSO Algorithm

2.1. The Theory of Rough Set Attribute Reduction Based on PSO Algorithm

Rough Set theory is a mathematical tool for characterizing incomplete and uncertain data; it can effectively analyze and process all kinds of incomplete, imprecise, and inconsistent information, discover implicit knowledge, and reveal potential rules [8].

The attribute reduction of RS removes the redundant attributes of a discrete data set while leaving the classification ability unchanged; that is, it selects the best attribute subset from the entire attribute set. However, finding the optimal subset of attributes has been proved to be an NP-hard problem [9].

Traditional attribute reduction methods, such as the discernibility matrix method and reduction based on attribute importance, can find all reductions of a decision table [10, 11], but they are only suitable for decision tables with a small amount of data and low dimension. When the data dimension of the decision table exceeds 30, or the number of records exceeds 10000, using these methods becomes quite difficult. Many scholars have therefore explored fast algorithms for the reduction problem [12, 13], but these are mostly greedy algorithms, which easily fall into local minima in complex solution spaces, so the final reduction is often only a local optimum. A large number of experimental studies have found that the Particle Swarm Optimization (PSO) algorithm can achieve better optimization results than the genetic algorithm on some typical optimization problems [14]. There are two important steps in applying PSO to an optimization problem: the encoding of the problem solution and the choice of the fitness function. PSO avoids the selection, crossover, and mutation operations of evolutionary algorithms and greatly simplifies these two steps, and its optimization speed and accuracy are better than those of the genetic algorithm and some other algorithms [15, 16], especially for complex functions of very high dimension with a large number of local extrema. Therefore, the attribute reduction method based on the PSO algorithm is studied in this paper.

2.2. Minimal Attribute Reduction in Rough Set Theory

We introduce only some basic concepts of minimal attribute reduction in Rough Set theory [17].

Definition 1. $S = (U, A, V, f)$ represents a decision table, where $U$ is the domain (universe), a finite collection of objects; $A$ is the attribute set, which is divided into the condition attribute set $C$ and the decision attribute set $D$, with $C \cup D = A$ and $C \cap D = \emptyset$; $V = \bigcup_{a \in A} V_a$ is the set of attribute values, where $V_a$ represents the range of the attribute $a \in A$; and $f: U \times A \to V$ is the mapping that assigns to each object its value on each attribute.

Definition 2. For $x, y \in U$ and $B \subseteq A$, if the condition $f(x, a) = f(y, a)$ for all $a \in B$ is met, the objects $x$ and $y$ are said to be indiscernible with respect to the attribute set $B$; otherwise $x$ and $y$ are said to be discernible. The indiscernibility relation is denoted by $\mathrm{IND}(B) = \{(x, y) \in U \times U : f(x, a) = f(y, a),\ \forall a \in B\}$, which is the intersection of the equivalence relations induced by the individual attributes of $B$.

Definition 3. Let $X \subseteq U$ and $B \subseteq A$. The lower approximation of the set $X$ is as follows:
$$\underline{B}\,X = \{x \in U : [x]_B \subseteq X\},$$
where the equivalence classes of the $B$-indiscernibility relation are denoted by $[x]_B$, and $[x]_B = \{y \in U : (x, y) \in \mathrm{IND}(B)\}$.

Definition 4. If $P \subseteq C$ and $\mathrm{POS}_P(D) = \mathrm{POS}_C(D)$, that is, $P$ preserves the positive region of the decision $D$, and no proper subset of $P$ does so, then $P$ is called a relative reduction of $C$. The set of attributes common to all relative reductions, that is, the intersection of the elementary relations shared by them, is called the core and is denoted by $\mathrm{CORE}_D(C)$. Suppose $\mathrm{RED}_D(C)$ is the family of all relative reductions of $C$; then $\mathrm{CORE}_D(C) = \bigcap \mathrm{RED}_D(C)$.
Suppose the set $U/D = \{Y_1, Y_2, \ldots, Y_n\}$ is constructed by the division of $U$ determined by the decision attribute set $D$, and $\mathrm{POS}_B(D) = \bigcup_{i=1}^{n} \underline{B}\,Y_i$ is the positive region of $D$ with respect to $B \subseteq C$. Then
$$\gamma_B(D) = \frac{|\mathrm{POS}_B(D)|}{|U|} = \frac{\sum_{i=1}^{n} |\underline{B}\,Y_i|}{|U|},$$
which denotes the approximation accuracy (dependence) of $D$ on the attribute set $B$.

Definition 5. The minimum attribute reduction problem can be described as
$$\min_{R \subseteq C} |R| \quad \text{subject to} \quad \gamma_R(D) = \gamma_C(D),$$
where $C$ is the set of conditional attributes and $|C|$ denotes the total number of conditional attributes.
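
To make the dependence measure concrete, the following minimal Python sketch computes $\gamma_B(D)$ for a small discretized decision table by grouping objects into $B$-equivalence classes and counting those that fall entirely inside one decision class. The toy table and the helper name `dependency_degree` are illustrative assumptions, not data or code from the paper.

```python
from collections import defaultdict

def dependency_degree(table, cond_idx, dec_idx):
    """gamma_B(D): fraction of objects whose B-equivalence class lies
    entirely inside a single decision class (the positive region)."""
    classes = defaultdict(list)
    for row in table:                       # group objects by their values on B
        key = tuple(row[i] for i in cond_idx)
        classes[key].append(row[dec_idx])
    # A class contributes to the positive region if its decision value is unique.
    pos = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return pos / len(table)

# Toy discretized decision table: three condition attributes and one decision attribute.
table = [
    (1, 0, 2, 'rust'),
    (1, 0, 2, 'rust'),
    (0, 1, 2, 'scratch'),
    (0, 1, 1, 'scratch'),
    (1, 1, 1, 'bubble'),
]
print(dependency_degree(table, cond_idx=[0, 1, 2], dec_idx=3))  # gamma_C(D) = 1.0
print(dependency_degree(table, cond_idx=[0], dec_idx=3))        # only c1: 0.4
```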

2.3. Description of the Attribute Reduction Method Based on PSO Algorithm
2.3.1. Steps of the Algorithm

(1) Generate the initial particle swarm.
(2) Evaluate the fitness function for each particle in the population.
(3) For each particle, if its current fitness is better than its previous best, set its current position as the new individual best $p_{\mathrm{best}}$; select the best particle of the swarm as the global best $g_{\mathrm{best}}$ and continue to update the flight velocities and positions (a simplified sketch of this loop is given after the list).
(4) Determine whether the termination condition is satisfied: if so, go to step (5); otherwise return to step (2). The termination condition can be reaching a given number of iterations, attaining a target value of the evaluation function, and so on.
(5) Test each particle of the final swarm against the reduction definitions, obtain all candidate reductions in the final swarm, and then delete redundant attributes to obtain the final reduction set.
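
As a rough illustration of steps (1)-(4), the sketch below implements a binary PSO loop in Python in which each particle is a 0/1 mask over the condition attributes. The sigmoid position update, the inertia weight, the acceleration constants, and the function names are common PSO choices assumed for illustration; step (5), the final check and pruning of candidate reductions, is not shown.

```python
import numpy as np

def binary_pso_reduce(fitness, n_attrs, n_particles=30, n_iter=100,
                      w=0.7, c1=1.5, c2=1.5, rng=np.random.default_rng(0)):
    """Binary PSO: each particle is a 0/1 mask over the condition attributes."""
    pos = rng.integers(0, 2, size=(n_particles, n_attrs))               # step (1)
    vel = rng.uniform(-1, 1, size=(n_particles, n_attrs))
    pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])  # step (2)
    gbest = pbest[np.argmax(pbest_fit)].copy()

    for _ in range(n_iter):                                             # step (4)
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = (rng.random(pos.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(p) for p in pos])                       # step (2)
        improved = fit > pbest_fit                                      # step (3)
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        gbest = pbest[np.argmax(pbest_fit)].copy()
    return gbest   # candidate reduction mask; step (5) checks and prunes it
```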

2.3.2. The Selection of the Fitness Function

The fitness function is the most important part of evolutionary computation, and the goal of attribute reduction is the minimal conditional attribute subset with the same dependence as the original condition attribute set. The particle fitness function is designed as follows:
$$\mathrm{Fit}(p) = \alpha \cdot \frac{\gamma_{R_p}(D)}{\gamma_C(D)} + \beta \cdot \frac{|C| - |R_p|}{|C|}, \qquad \alpha + \beta = 1,$$
where $|R_p|$ is the number of attributes selected by the particle $p$, $|C|$ is the total number of conditional attributes, $\gamma_{R_p}(D)$ is the dependence achieved by the particle's attribute subset, and $\gamma_C(D)$ is the dependence of the complete set of original condition attributes. $\beta$ is a user-defined parameter; it increases the flexibility of the fitness function so that the search is steered toward the user's expectations. For example, when the target solution is more sensitive to the loss of information, the value of $\beta$ can be reduced; likewise, larger values of $\beta$ favor particles that reach a given level of dependence with fewer attributes. The value of $\alpha$ is 0.9, and the maximum iteration number is 100 in this paper.
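
Under this weighting (with $\alpha = 0.9$, so $\beta = 0.1$), a particle's fitness might be evaluated as in the following sketch. The function is a hypothetical illustration: `dependence` stands for any routine returning $\gamma_R(D)$ for a candidate attribute subset, for example the `dependency_degree` helper sketched in Section 2.2.

```python
def particle_fitness(mask, dependence, gamma_full, alpha=0.9):
    """Fitness of a 0/1 attribute mask.

    mask       -- binary vector over the condition attributes
    dependence -- callable returning gamma_R(D) for a list of attribute indices
                  (e.g. dependency_degree bound to the decision table)
    gamma_full -- gamma_C(D), the dependence of the full condition attribute set
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:                 # an empty subset carries no information
        return 0.0
    n = len(mask)
    return (alpha * dependence(selected) / gamma_full
            + (1.0 - alpha) * (n - len(selected)) / n)

# Example wiring with the PSO sketch above (table and indices as in Section 2.2):
# fitness = lambda m: particle_fitness(
#     m, lambda idx: dependency_degree(table, idx, dec_idx=20), gamma_full=1.0)
# reduct_mask = binary_pso_reduce(fitness, n_attrs=20)
```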

3. Multikernel Support Vector Machine Optimized Using Second-Order Cone Programming

3.1. Basic Theory of Second-Order Cone Programming

Second-Order Cone Programming (SOCP) is a class of convex programming problem [18]. It minimizes a linear function subject to a set of second-order cone constraints and linear equality constraints:
$$\begin{aligned} \min_{x} \quad & f^{T} x \\ \text{s.t.} \quad & \|A_i x + b_i\|_2 \le c_i^{T} x + d_i, \quad i = 1, \ldots, m, \\ & F x = g, \end{aligned}$$
where $x \in \mathbb{R}^{n}$ is the optimization variable, $f \in \mathbb{R}^{n}$, $A_i \in \mathbb{R}^{n_i \times n}$, $b_i \in \mathbb{R}^{n_i}$, $c_i \in \mathbb{R}^{n}$, $d_i \in \mathbb{R}$, $F \in \mathbb{R}^{p \times n}$, $g \in \mathbb{R}^{p}$, $\|\cdot\|_2$ represents the Euclidean norm, and $\mathbb{R}$ represents the real number set.
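
The paper solves its SOCP instances with the SeDuMi toolbox in Matlab; purely as an illustration of the standard form above, the following Python sketch poses a small random SOCP with cvxpy and hands it to a conic solver. The problem data and the extra norm bound are arbitrary choices made only to keep the instance feasible and bounded.

```python
import cvxpy as cp
import numpy as np

# Random instance of the standard form:
#   minimize f^T x  s.t.  ||A_i x + b_i||_2 <= c_i^T x + d_i,  F x = g.
rng = np.random.default_rng(1)
n, m = 5, 3                                  # variable dimension, number of cones
x = cp.Variable(n)
f = rng.standard_normal(n)
constraints = []
for _ in range(m):
    A, b = rng.standard_normal((4, n)), rng.standard_normal(4)
    c, d = rng.standard_normal(n), 10.0      # large d keeps x = 0 feasible
    constraints.append(cp.SOC(c @ x + d, A @ x + b))   # ||Ax + b|| <= c^T x + d
F, g = rng.standard_normal((2, n)), np.zeros(2)
constraints += [F @ x == g, cp.norm(x) <= 10]          # bound the feasible set
prob = cp.Problem(cp.Minimize(f @ x), constraints)
prob.solve()                                  # cvxpy picks a conic interior-point solver
print(prob.status, prob.value)
```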

3.2. Support Vector Machine Using Second-Order Cone Programming

The original optimization problem of the soft-margin SVM is depicted in the following [19]:
$$\begin{aligned} \min_{w, b, \xi} \quad & \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{l} \xi_i \\ \text{s.t.} \quad & y_i\left(w^{T}\phi(x_i) + b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, l. \end{aligned}$$
Here, $C$ is the penalty parameter and $l$ is the number of samples.

In multiple kernel learning (MKL), the kernel matrix is a convex linear combination of base kernel matrices, as follows [20]:
$$K = \sum_{k=1}^{M} \mu_k K_k, \qquad \mu_k \ge 0, \quad \sum_{k=1}^{M} \mu_k = 1,$$
where $K_k$, $k = 1, \ldots, M$, is the $k$th kernel matrix and $\mu_k$ is the kernel weight to be optimized.
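
As a small numerical illustration of this convex combination, the sketch below mixes three RBF kernel matrices on toy data with example weights satisfying $\mu_k \ge 0$ and $\sum_k \mu_k = 1$; the data, kernel widths, and weights are placeholders rather than values from the paper.

```python
import numpy as np

def rbf_kernel(X, Z, sigma):
    """K(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

X = np.random.default_rng(0).standard_normal((6, 4))   # toy samples
sigmas = [0.5, 1.0, 2.0]                               # placeholder kernel widths
mu = np.array([0.2, 0.3, 0.5])                         # weights: mu_k >= 0, sum = 1
K = sum(m * rbf_kernel(X, X, s) for m, s in zip(mu, sigmas))
assert np.allclose(K, K.T) and np.isclose(mu.sum(), 1.0)   # combined K is a valid kernel
```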

The Second-Order Cone Programming form of the multikernel Support Vector Machine is
$$\begin{aligned} \max_{\alpha,\, t} \quad & \mathbf{1}^{T}\alpha - \frac{1}{2}\, t \\ \text{s.t.} \quad & \alpha^{T} G K_k G\, \alpha \le t, \quad k = 1, \ldots, M, \\ & y^{T}\alpha = 0, \qquad 0 \le \alpha_i \le C, \quad i = 1, \ldots, l, \end{aligned}$$
where each quadratic constraint is rewritten as an equivalent second-order cone constraint, $\alpha$ and $t$ are the new optimization variables introduced in the process of problem transformation, $M$ is the number of kernel functions, $Y$ is the output domain with $y \in Y^{l}$, and $G$ is a diagonal matrix with diagonal elements $y_i$. In this paper, the classical primal-dual interior point algorithm is used to solve the SOCP problem with the SeDuMi toolbox.
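
The exact variable bookkeeping of the transformation depends on the derivation, but a Lanckriet-style MKL dual of this kind can be posed directly to a conic solver. The Python sketch below (using cvxpy instead of the SeDuMi/Matlab toolchain used in the paper) writes each kernel constraint as a squared norm, which the solver lowers to second-order cone form; the function name, the toy data, and the diagonal jitter are illustrative assumptions. In such formulations the base-kernel weights can typically be recovered from the dual variables of the cone constraints.

```python
import cvxpy as cp
import numpy as np

def mkl_dual_socp(kernels, y, C=1.0):
    """Sketch of a Lanckriet-style MKL dual:
    maximize 1^T a - t/2  s.t.  ||L_k^T (y * a)||^2 <= t for each base kernel
    (K_k = L_k L_k^T), y^T a = 0, 0 <= a <= C.  Each squared-norm constraint
    is a rotated second-order cone, so the problem is solved as an SOCP."""
    l = len(y)
    a, t = cp.Variable(l), cp.Variable()
    cons = [y @ a == 0, a >= 0, a <= C]
    for K in kernels:
        L = np.linalg.cholesky(K + 1e-8 * np.eye(l))   # jitter for semidefinite K
        cons.append(cp.sum_squares(L.T @ cp.multiply(y, a)) <= t)
    prob = cp.Problem(cp.Maximize(cp.sum(a) - 0.5 * t), cons)
    prob.solve()
    return a.value, t.value

# Toy usage: three RBF kernels on random labeled data (placeholders only).
rng = np.random.default_rng(0)
X, y = rng.standard_normal((20, 4)), np.sign(rng.standard_normal(20))
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
kernels = [np.exp(-d2 / (2 * s ** 2)) for s in (0.5, 1.0, 2.0)]
alpha, t = mkl_dual_socp(kernels, y, C=1.0)
```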

3.3. The Establishment of PSO-RS&SOCP-SVM Recognition Model

The establishment of PSO-RS&SOCP-SVM recognition model is shown in Figure 1.

Steps of feature selection and classification recognition of strip surface defects are as follows.

(1) Strip Surface Defect Image Preprocessing. The strip surface defect sample information should be as accurate and comprehensive as possible, and duplicate samples should be avoided. The samples are divided into two parts, training samples and test samples, and each part is normalized.
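
A minimal sketch of column-wise min-max normalization of this kind, applied separately to the training and test parts as described above; the random arrays are stand-ins for the 20-dimensional feature vectors used later.

```python
import numpy as np

def minmax_normalize(features):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    lo, hi = features.min(axis=0), features.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (features - lo) / span

rng = np.random.default_rng(0)
train_features = rng.uniform(0, 255, size=(480, 20))   # toy stand-ins for the
test_features = rng.uniform(0, 255, size=(120, 20))    # 20-dimensional feature vectors
train_norm = minmax_normalize(train_features)
test_norm = minmax_normalize(test_features)            # normalized separately, as in the text
```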

(2) Defect Feature Extraction. The strip surface defect image features are extracted. The extracted defect feature parameters comprise three kinds of features: the geometric features, the gray features, and the texture features.

(3) Rough Set Attribute Reduction Based on the PSO Algorithm. Many feature attributes are extracted in the defect feature extraction step, but some of them are not important. To avoid information redundancy, attribute reduction of the sample information must be carried out. To realize the attribute reduction of the strip surface defect sample information, decision attribute generalization is performed on the sample information first, and the curve inflection point method is used to discretize the continuous attributes (a simple illustration of interval discretization is sketched below). The Rough Set attribute reduction algorithm based on the PSO algorithm was introduced in Section 2 of this paper.
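
The cut points themselves come from the curve inflection point method and are listed in Table 3, which is not reproduced here; the sketch below only shows how, given such cut points, a continuous attribute column is mapped to discrete interval codes. The example values and the use of `np.digitize` are assumptions for illustration.

```python
import numpy as np

def discretize(values, cut_points):
    """Map continuous attribute values to interval codes 0, 1, ..., len(cut_points)."""
    return np.digitize(values, bins=cut_points)

# Hypothetical cut points for one attribute (e.g. defect area); the real
# intervals are those listed in Table 3.
area = np.array([12.4, 85.0, 310.2, 47.9])
codes = discretize(area, cut_points=[50.0, 200.0])   # -> [0, 1, 2, 0]
print(codes)
```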

(4) Construction of the SOCP-SVM Classification and Recognition Model. In the multikernel SVM, three radial basis kernel functions are selected for linear combination. We choose the Radial Basis Function as the kernel function; its formula is
$$K(x, x_i) = \exp\left(-\frac{\|x - x_i\|^{2}}{2\sigma^{2}}\right).$$
The kernel widths of the three kernels are selected based on experience and repeated tests [21]. The Second-Order Cone Programming algorithm is used to solve for the kernel combination coefficients, and the results are, respectively, 0.0710, 0.0161, and 0.9862, with which an optimal recognition effect is obtained. In the SOCP-SVM model, the attribute-reduced sample information is used as the training sample set. The optimal penalty parameter and kernel parameter of the multikernel SVM are obtained using the k-fold cross-validation method in LibSVM, and these parameters are then applied in the SOCP-SVM solution. Finally, a well-trained SOCP-SVM classification and recognition model is obtained.
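
Once the combination coefficients are known, the multikernel decision function is a weighted sum of the three RBF kernels evaluated between a query sample and the support vectors. The sketch below shows this evaluation with the reported coefficients 0.0710, 0.0161, and 0.9862; the kernel widths, support vectors, and dual coefficients are placeholders, since their values are not listed here.

```python
import numpy as np

def rbf(X, Z, sigma):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mk_decision(x_new, sv, dual_coef, b, sigmas, mu=(0.0710, 0.0161, 0.9862)):
    """f(x) = sum_i dual_coef_i * sum_k mu_k K_k(x, sv_i) + b."""
    K = sum(m * rbf(x_new, sv, s) for m, s in zip(mu, sigmas))
    return K @ dual_coef + b

# Placeholder support vectors and coefficients, only to make the sketch runnable.
rng = np.random.default_rng(0)
sv, dual_coef, b = rng.standard_normal((5, 8)), rng.standard_normal(5), 0.1
scores = mk_decision(rng.standard_normal((3, 8)), sv, dual_coef, b, sigmas=(0.5, 1.0, 2.0))
print(np.sign(scores))   # class labels for the 3 query samples
```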

(5) Recognition and Output. The trained RS&SOCP-SVM model is used to classify and recognize the strip steel surface defects, and the result is output.

4. Experiment Simulation and Result Analysis

4.1. Introduction to the Experimental Environment

The hardware conditions of PC in the experimental operation are as follows: CPU is Intel Core i5 760 with 2.8 GHz and 2 GB memory; the software platform is Windows 7 operating system; the simulation software is Matlab 7.0.

The strip defect images in this experiment were collected in a steel plant. The configuration of the image acquisition system is shown in Figure 2. There are two cameras above the strip surface, one on the left and the other on the right. The collected images are sent to the acquisition and processing unit.

We choose the most common 6 kinds of defects as the typical research objects: rust (100), scratch (100), orange peel (100), bubble (100), surface crack (100), and rolled-in scale (100); a total of 600 samples are used in the analysis. The defect images are shown in Figures 3(a)–3(f). Part of the images are training images; the others are test images.

4.2. Feature Extraction and Selection

Twenty-dimensional features are selected, covering the geometric features, the gray features, and the texture features, and then the defect image classification is carried out in real time. The feature parameters are shown in Table 1.

4.3. Experimental Results Analysis

Firstly, edge extraction of the steel strip surface defect images is carried out, as shown in Figure 4; then a total of 20-dimensional features, including the geometric features, the gray features, and the texture features, are extracted.

Specifically, the condition attributes of the decision table are set as follows: $c_1$ = area, $c_2$ = perimeter, $c_3$ = elongation, $c_4$ = rectangle degree, $c_5$ = first-order invariant moment, $c_6$ = second-order invariant moment, $c_7$ = third-order invariant moment, $c_8$ = fourth-order invariant moment, $c_9$ = fifth-order invariant moment, $c_{10}$ = sixth-order invariant moment, $c_{11}$ = seventh-order invariant moment, $c_{12}$ = mean, $c_{13}$ = variance, $c_{14}$ = gray entropy, $c_{15}$ = energy, $c_{16}$ = inertia, $c_{17}$ = consistency, $c_{18}$ = roughness, $c_{19}$ = contrast, and $c_{20}$ = direction.
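
For concreteness, a few of the gray features in this list (mean, variance, and gray entropy) can be computed from a defect region as in the sketch below; the random 8-bit image and the 256-bin histogram are illustrative choices, and the geometric and texture features are not shown.

```python
import numpy as np

def gray_features(region):
    """Mean, variance, and gray entropy of an 8-bit defect region."""
    pixels = region.astype(float).ravel()
    mean, var = pixels.mean(), pixels.var()
    hist, _ = np.histogram(pixels, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                       # ignore empty bins in the entropy sum
    entropy = -(p * np.log2(p)).sum()
    return mean, var, entropy

region = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)
print(gray_features(region))
```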

Here, a total of 600 samples were used as training samples. The attribute values form a two-dimensional table in which each row describes an object and each column describes an attribute of the object, so a decision table of size $600 \times 21$ is formed. The first 20 column vectors represent the 20-dimensional feature values of the strip surface defects, and the 21st attribute is the decision attribute, which represents the category of the defect. The decision table of feature values is shown in Table 2.

The discretization intervals of each feature attribute are shown in Table 3.

According to the discretization intervals in Table 3, the discretized decision table is shown in Table 4.

Attribute reduction based on the PSO algorithm is carried out on the discretized data in Table 4. As shown in Table 5, the reduced attribute set is obtained; the final reduction attributes are area, perimeter, elongation, third-order invariant moment, variance, gray entropy, roughness, and direction. This process removes the redundant attributes that do not affect the classification accuracy and yields an attribute subset while keeping the classification ability unchanged.

Finally, the reduced attribute values are input into the SOCP-SVM model to classify and recognize the strip surface defects. Gaussian radial basis functions are chosen as the kernel functions, and the classification and recognition results are compared with those of the SVM algorithm and the RVM algorithm. The performance comparison is shown in Table 6.

Table 6 shows the strip surface defect classification and recognition accuracy of the three algorithms. The average recognition accuracy of the SVM is 97.5%, the lowest among the three algorithms, with an average running time of 71.3 ms. The RVM algorithm improves the recognition accuracy to 98.5%, but its average running time is 81.3 ms. The recognition accuracy of the PSO-RS&SOCP-SVM model is higher than that of both the SVM and the RVM algorithms: with the 20-dimensional condition attributes, the highest classification accuracy reaches 100%, and the average recognition accuracy over the defect classes is 99.5%, which is higher than that of the SVM algorithm, the RVM algorithm, and many other existing methods. With the Rough Set attribute reduction based on the PSO algorithm, the modeling data are simplified, so the SOCP-SVM training time and the overall classification and recognition time are reduced; the average running time of the proposed method is 58.3 ms, lower than that of the SVM and RVM algorithms. The strip steel surface defect classification and recognition efficiency is thus improved.

The ROC curves corresponding to the traditional multikernel SVM algorithm and the PSO-RS&SOCP-SVM algorithm are shown in Figure 5.

From Figure 5 it can be seen that, compared with the traditional multikernel SVM algorithm, the PSO-RS&SOCP-SVM algorithm achieves more accurate classification with a lower probability of misclassification, and its ROC area is larger. Therefore, the PSO-RS&SOCP-SVM algorithm is superior to the traditional multikernel SVM algorithm in classification accuracy and classification effect.

In summary, the PSO-RS&SOCP-SVM model is feasible and effective in the surface defect feature selection and classification.

5. Conclusions

In order to improve the strip steel surface defect classification and recognition accuracy and reduce the recognition running time, a feature selection method for steel strip surface defect images based on RS attribute reduction with the PSO algorithm is introduced in this paper, which reduces the sample feature dimension and simplifies the modeling data. The traditional multikernel SVM model is then optimized based on SOCP, which optimizes the parameters of the multikernel SVM model. Experimental results show that the application effect is remarkable and that the method is superior to the traditional SVM and RVM algorithms in recognition accuracy and efficiency. In future work, we will study how to further reduce the time complexity of multikernel learning so as to meet real-time requirements.

Conflicts of Interest

The authors declare that they have no competing interests.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 51208168) and Hebei Province Foundation for Returned Scholars (no. C2012003038).