Study on Intelligent Diagnosis of Rotor Fault Causes with the PSO-XGBoost Algorithm
On the basis of fault category detection, the diagnosis of rotor fault causes is proposed, which contributes greatly to the field of intelligent operation and maintenance. To improve diagnostic accuracy and practical efficiency, a hybrid model based on the particle swarm optimization-extreme gradient boosting algorithm, namely PSO-XGBoost, is designed. XGBoost is used as a classifier to diagnose rotor fault causes, performing well thanks to its second-order Taylor expansion of the loss and its explicit regularization term. PSO is used to automatically optimize the XGBoost parameter-tuning process, which overcomes the shortcomings of adjusting the parameters by the empirical or trial-and-error method. The hybrid model combines the advantages of the two algorithms and can diagnose nine rotor fault causes accurately. Following the diagnostic results, maintenance measures referring to the corresponding knowledge base are provided intelligently. Finally, the proposed PSO-XGBoost model is compared with five state-of-the-art intelligent classification methods. The experimental results demonstrate that the proposed method achieves higher diagnostic accuracy and practical efficiency in diagnosing rotor fault causes.
The steam turbine rotor plays an important role in transforming thermal energy into mechanical energy. At high rotational speeds, any defect on the rotor will affect safe operation and may even cause serious accidents [1–3]. Therefore, intelligently diagnosing rotor fault causes is essential in addition to intelligently detecting rotor fault categories.
In the field of industrial intelligent operation and maintenance, research mainly focuses on the detection of rotor fault categories [4–6], while fewer studies address the diagnosis of rotor fault causes. Identifying the specific rotor fault cause supports a reasonable and practical maintenance decision, ensuring the steam turbine’s safe and stable running. The traditional diagnosis of rotor fault causes is mainly based on expert systems, but the required knowledge is difficult to acquire and the portability is poor. A series of running parameters, such as temperature and pressure, can accurately assess the operating status of equipment, but they are rarely used to build an intelligent diagnosis system for rotor fault causes. Therefore, intelligent algorithms can be applied to the running parameters of a rotor to diagnose fault causes and realize intelligent operation and maintenance.
In essence, diagnosing rotor fault causes is a classification problem, and various intelligent classification methods have been applied. Support vector machine (SVM) is a popular supervised learning algorithm that many researchers have trained for classification. Jan et al. used SVM to classify sensor faults. Lobato et al. used SVM for the classification of machinery condition. However, the intelligent diagnosis of rotor fault causes is a typical nonlinear problem. Because SVM is in principle a linear classifier based on the maximum margin, it does not work well on nonlinear problems. Random forest (RF) and gradient boosting decision tree (GBDT) are two well-known ensemble machine learning methods whose weak learner is the decision tree (DT) model. Wang et al. proposed a hybrid random forest classifier for fault diagnosis in rolling bearings. Quiroz et al. used random forests to diagnose broken rotor bar failure in a line start-permanent magnet synchronous motor. Zhu et al. proposed a novel performance fault diagnosis method for SaaS software based on the GBDT algorithm. Zhong et al. used GBDT to predict railway accident types and analyze causes. Although RF and GBDT offer high classification accuracy, little overfitting, excellent generalization performance, and good interpretability, they also have shortcomings for the intelligent diagnosis of rotor fault causes. RF may not classify well on small or low-dimensional data. GBDT uses only the first-order Taylor expansion to compute the loss, which is not accurate enough. On the basis of GBDT, the extreme gradient boosting (XGBoost) algorithm was proposed by Chen and Guestrin. XGBoost introduces second-order derivatives and regularization terms, which improve classification accuracy whether the data scale is large or small. Zhang et al.
designed a data-driven method for fault detection of wind turbines using XGBoost. Lei et al. diagnosed hydraulic valves by integrating PCA and XGBoost. Wu et al. proposed a wind turbine fault diagnosis method based on the ReliefF and XGBoost algorithms to improve diagnostic accuracy. Although XGBoost gives excellent classification results, the model has many parameters, such as the learning rate, the subsample ratio of columns when constructing each tree, the subsample ratio of columns for each level, and the regularization terms on weights. Different combinations of these parameters determine the performance of the model to a large extent. Usually, the XGBoost parameters are set by fixing several of them and exhaustively searching over the rest for the combination that performs best, but the number of permutations and combinations makes this work complex, and the optimal parameters are hard to find. Finding the most suitable parameters of the XGBoost model is thus an optimization problem. In recent years, various intelligent optimization algorithms have been successively proposed [23–25]: an improved PSO-based QEA method was proposed to allocate gate resources; an enhanced MSIQDE algorithm with multiple strategies was proposed to solve global optimization problems; and an enhanced success-history adaptive DE with a greedy mutation strategy was employed to optimize the parameters of PV models. For the optimization of model parameters, particle swarm optimization (PSO) has a simple principle and is easy to implement, and many researchers have achieved good results by combining PSO with other classification methods. Wang et al. used PSO to search for the optimal architecture of convolutional neural networks. Li et al. used PSO to search for the penalty factor and kernel function of SVM.
To diagnose rotor fault causes accurately and efficiently, a hybrid model based on the particle swarm optimization-extreme gradient boosting algorithm (PSO-XGBoost) is proposed. XGBoost, a scalable end-to-end tree boosting system with a second-order Taylor expansion of the loss and an explicit regularization term, is used as the classifier. PSO is used to automatically optimize parameters such as the L1 and L2 regularization terms on weights during XGBoost model training, which overcomes the low accuracy and low efficiency of adjusting these parameters by the empirical or trial-and-error method. The hybrid model, combining the advantages of the two algorithms, can diagnose rotor fault causes more accurately. Following the diagnostic results, maintenance measures referring to the corresponding knowledge base are provided intelligently.
The innovations and main contributions of this study are as follows: (1) On the basis of fault category detection, the diagnosis of rotor fault causes is proposed, which contributes greatly to the field of intelligent operation and maintenance. (2) A novel hybrid model based on PSO and XGBoost is developed to effectively simplify the parameter adjustment process of the XGBoost model and improve diagnostic accuracy.
The remainder of this study is organized as follows. Section 2 introduces the preliminaries for the diagnosis of rotor fault causes. Section 3 conducts an experiment to validate the performance of the proposed method. Conclusions are drawn in Section 4.
2. Materials and Methods
2.1. XGBoost Algorithm
XGBoost  is a highly scalable end-to-end tree boosting system, which provides a theoretically justified weighted quantile sketch for efficient proposal calculation, a novel sparsity-aware algorithm for parallel tree learning, and an effective cache-aware block structure for out-of-core tree learning.
For a given dataset $\mathcal{D} = \{(x_i, y_i)\}$ with $n$ examples and $m$ features, a tree ensemble model uses $K$ additive functions to predict the output. Here, $x_i$ is the characteristic parameter of the $i$th sample, composed of fault types and operating parameters. The predicted category of fault cause is

$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F}$, (1)

where $K$ is the number of trees and $f_k$ is a function in the functional space $\mathcal{F}$.
The objective function is

$\mathcal{L} = \sum_{i} l(\hat{y}_i, y_i) + \sum_{k} \Omega(f_k)$. (2)

The first term is the training loss function, and the second term is the regularization term. The training loss measures how predictive the model is with respect to the training data. The regularization term controls the complexity of the model, which helps to avoid overfitting.
Formally, let $\hat{y}_i^{(t)}$ be the prediction of the $i$th instance at the $t$th iteration; then, $f_t$ is added to minimize the following objective:

$\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t)$. (3)

A second-order approximation can be used to optimize equation (3) in the general setting, i.e.,

$\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \big] + \Omega(f_t)$, (4)

where $g_i = \partial_{\hat{y}^{(t-1)}} l\big(y_i, \hat{y}^{(t-1)}\big)$ and $h_i = \partial_{\hat{y}^{(t-1)}}^{2} l\big(y_i, \hat{y}^{(t-1)}\big)$ are the first- and second-order gradient statistics of the loss function. After removing all the constants, the specific objective at step $t$ becomes

$\tilde{\mathcal{L}}^{(t)} = \sum_{i=1}^{n} \big[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \big] + \Omega(f_t)$. (5)
The definition of the tree is refined as

$f_t(x) = w_{q(x)}, \quad w \in \mathbb{R}^{T}, \quad q: \mathbb{R}^{m} \to \{1, 2, \ldots, T\}$. (6)

Here, $w$ is the vector of scores on leaves, $q$ is a function assigning each data point to the corresponding leaf, and $T$ is the number of leaves. The regularization term is defined as

$\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^{2}$. (7)
After reformulating the tree model, the objective value with the $t$th tree can be written as

$\tilde{\mathcal{L}}^{(t)} = \sum_{j=1}^{T} \Big[ \big(\textstyle\sum_{i \in I_j} g_i\big) w_j + \tfrac{1}{2} \big(\textstyle\sum_{i \in I_j} h_i + \lambda\big) w_j^{2} \Big] + \gamma T$, (8)

where $I_j = \{ i \mid q(x_i) = j \}$ is the set of indices of data points assigned to the $j$th leaf.

By defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, the expression can be compressed as

$\tilde{\mathcal{L}}^{(t)} = \sum_{j=1}^{T} \big[ G_j w_j + \tfrac{1}{2} (H_j + \lambda) w_j^{2} \big] + \gamma T$. (9)
In equation (9), the $w_j$ are independent with respect to each other and the form is quadratic, so the best $w_j$ for a given structure $q$ and the best objective reduction are

$w_j^{*} = -\dfrac{G_j}{H_j + \lambda}$, (10)

$\tilde{\mathcal{L}}^{(t)}(q) = -\dfrac{1}{2} \sum_{j=1}^{T} \dfrac{G_j^{2}}{H_j + \lambda} + \gamma T$. (11)
Equation (11) measures how good a tree structure $q$ is. Typically, it is impossible to enumerate all possible tree structures, so a greedy algorithm that starts from a single leaf and iteratively adds branches to the tree is used instead. By splitting a leaf into left and right leaves, the score it gains is

$\mathcal{L}_{\text{split}} = \dfrac{1}{2} \left[ \dfrac{G_L^{2}}{H_L + \lambda} + \dfrac{G_R^{2}}{H_R + \lambda} - \dfrac{(G_L + G_R)^{2}}{H_L + H_R + \lambda} \right] - \gamma$. (12)

The four terms of equation (12) are, respectively, the score on the new left leaf, the score on the new right leaf, the score on the original leaf, and the regularization on the additional leaf.
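As a concrete illustration, the optimal leaf weight of equation (10) and the split gain of equation (12) can be computed directly from the gradient and Hessian sums. The sketch below uses made-up values for the sums and penalties; it is an illustration of the formulas, not the library's internal implementation.

```python
# Sketch of XGBoost's leaf-weight and split-gain formulas, using
# illustrative gradient/Hessian sums; lambda_ and gamma are the L2 and
# complexity penalties from the regularization term.

def optimal_leaf_weight(G, H, lambda_):
    """Best leaf weight w* = -G / (H + lambda) for gradient sum G, Hessian sum H."""
    return -G / (H + lambda_)

def split_gain(G_L, H_L, G_R, H_R, lambda_, gamma):
    """Gain of splitting one leaf into left/right children (equation (12))."""
    def score(G, H):
        return G * G / (H + lambda_)
    return 0.5 * (score(G_L, H_L) + score(G_R, H_R)
                  - score(G_L + G_R, H_L + H_R)) - gamma

# Example: separating negative and positive gradients yields positive gain,
# so this split would be kept by the greedy algorithm.
gain = split_gain(G_L=-4.0, H_L=2.0, G_R=3.0, H_R=2.0, lambda_=1.0, gamma=0.1)
```

A split whose gain is negative would be pruned, since the regularization penalty $\gamma$ outweighs the score improvement.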
2.2. Particle Swarm Optimization Algorithm
The particle swarm optimization algorithm is a popular population-based heuristic algorithm inspired by the foraging behavior of bird flocks.
Suppose a population of $N$ particles searches a $D$-dimensional space, where the $i$th particle's position is represented as a $D$-dimensional vector $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ and its velocity as $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. According to the objective function, the fitness of each particle's position can be calculated. The individual extremum (personal best position) of the $i$th particle is $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$, and the corresponding population extremum (global best position) is $P_g = (p_{g1}, p_{g2}, \ldots, p_{gD})$. During each iteration, the particle updates its velocity and position through the individual extremum and the population extremum as

$v_{id}^{t+1} = \omega v_{id}^{t} + c_1 r_1 (p_{id} - x_{id}^{t}) + c_2 r_2 (p_{gd} - x_{id}^{t})$, (13)

$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}$. (14)
In equations (13) and (14) above, $\omega$ is the inertia weight; $d = 1, 2, \ldots, D$; $i = 1, 2, \ldots, N$; $t$ is the current iteration number; $v_{id}^{t}$ is the velocity of the particle; $p_{id}$ is the individual optimum; $p_{gd}$ is the global optimum; $c_1$ and $c_2$ are the acceleration constants; and $r_1$ and $r_2$ are random numbers subject to a uniform distribution on the (0, 1) interval.
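The update rules of equations (13) and (14) can be demonstrated on a toy objective. The following minimal sketch minimizes the one-dimensional sphere function $f(x) = x^2$; the hyperparameter values ($\omega = 0.7$, $c_1 = c_2 = 1.5$, swarm size, iteration count) are illustrative, not values from this study.

```python
import random

# Minimal PSO sketch implementing the velocity/position updates of
# equations (13) and (14) on the 1-D sphere function f(x) = x^2.

def pso_minimize(f, lo, hi, n_particles=20, n_iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for _ in range(n_particles)]  # positions
    v = [0.0] * n_particles                                # velocities
    pbest = list(x)                                        # individual optima P_i
    pbest_val = [f(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]              # global optimum P_g
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # equation (13): inertia + cognitive + social terms
            v[i] = w * v[i] + c1 * r1 * (pbest[i] - x[i]) + c2 * r2 * (gbest - x[i])
            # equation (14): move the particle
            x[i] += v[i]
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i], val
                if val < gbest_val:
                    gbest, gbest_val = x[i], val
    return gbest, gbest_val

best_x, best_val = pso_minimize(lambda x: x * x, -10.0, 10.0)
```

After a few dozen iterations, the swarm should cluster near the minimum at $x = 0$.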
2.3. Improved XGBoost Algorithm Based on PSO
Although XGBoost performs excellently in many applications, it has many parameters, and different combinations of them determine the performance of the model to a large extent. PSO is well suited to optimizing the parameters of XGBoost, which can effectively improve the efficiency and accuracy of diagnosing rotor fault causes. In this study, six parameters that strongly influence the model are optimized by PSO. The information on each parameter is given in Table 1.
According to Table 1, the velocity vector and the position vector of the $i$th particle at the $t$th iteration can be expressed as

$V_i^{t} = (v_{i1}^{t}, v_{i2}^{t}, \ldots, v_{i6}^{t})$, (15)

$X_i^{t} = (x_{i1}^{t}, x_{i2}^{t}, \ldots, x_{i6}^{t})$. (16)
The position vector is assigned to the corresponding parameters of XGBoost, and the negative accuracy score of the XGBoost model is used as the fitness value to measure the performance of PSO. The fitness value of the $i$th particle at the $t$th iteration is

$F_i^{t} = -\dfrac{1}{S} \sum_{s=1}^{S} I(\hat{y}_s = y_s)$, (17)

where the right-hand side is the negative accuracy score of XGBoost; $I(\cdot)$ is the indicator function, taking 1 and 0, respectively, when its argument is true and false; $\hat{y}_s$ is the prediction label of XGBoost; $y_s$ is the real label of the samples; and $S$ is the total number of samples.
The individual optimum of the $i$th particle at the $t$th iteration is

$F_{i,\text{best}}^{t} = \min \{ F_i^{1}, F_i^{2}, \ldots, F_i^{t} \}$. (18)
The global optimum at the $t$th iteration is

$F_{\text{best}}^{t} = \min \{ F_{1,\text{best}}^{t}, F_{2,\text{best}}^{t}, \ldots, F_{N,\text{best}}^{t} \}$, (19)

where $N$ is the number of particles.
The XGBoost algorithm and the PSO-XGBoost algorithm are shown in Figure 1, with the process of the improved PSO-based algorithm in Figure 1(b). Compared with the original XGBoost algorithm shown in Figure 1(a), PSO-XGBoost uses the final training accuracy score as the objective function to search out the optimal parameters. The optimal result can be obtained by running PSO-XGBoost once, whereas XGBoost must be tuned manually many times and may still not reach the optimal result.
The procedure of the proposed method for diagnosing rotor fault causes, shown in Figure 1(b), is as follows:
Step 1. Initialize the particle swarm. Initialize the particle swarm parameters, including the particle number, learning factors, weighting coefficient, and the maximum number of iterations.
Step 2. Train the XGBoost model. The parameters to be optimized change along with the flight of the particles.
Step 3. Calculate and assess the fitness value. The fitness value, derived from the output negative accuracy score of the XGBoost model, is used to evaluate the performance of PSO. A smaller fitness value indicates better performance.
Step 4. Judge the stop condition. Terminate the iteration process and obtain the optimal parameters of the XGBoost model if the maximum number of iterations is reached. Otherwise, continue the iterative calculation.
Step 5. Validate the classification model. Use the optimization results to build the XGBoost model and output the results of diagnosing rotor fault causes.
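Steps 1–5 above can be sketched as a small optimization loop. In the sketch below, `train_xgboost_accuracy` is a hypothetical stand-in (a toy score surface with an assumed optimum) rather than a real XGBoost training run, and the three parameter ranges are illustrative, not those of Table 1.

```python
import random

# Sketch of Steps 1-5: PSO searches a parameter vector, a model is "trained"
# with those parameters, and the negative accuracy is the fitness to minimize.

PARAM_BOUNDS = [(0.01, 0.3),   # learning_rate      (illustrative ranges,
                (0.5, 1.0),    # colsample_bytree    not the paper's Table 1)
                (0.0, 5.0)]    # reg_lambda

def train_xgboost_accuracy(params):
    # Hypothetical stand-in: accuracy peaks at an assumed optimum; in practice
    # this would train XGBoost with the decoded params and return its accuracy.
    target = [0.1, 0.8, 1.0]
    return 1.0 - sum((p - t) ** 2 for p, t in zip(params, target)) / 30.0

def fitness(params):
    return -train_xgboost_accuracy(params)   # Step 3: negative accuracy score

def pso_tune(n_particles=15, n_iters=40, w=0.7, c1=1.5, c2=1.5, seed=1):
    rng = random.Random(seed)
    # Step 1: initialize the swarm within the parameter bounds.
    x = [[rng.uniform(lo, hi) for lo, hi in PARAM_BOUNDS] for _ in range(n_particles)]
    v = [[0.0] * len(PARAM_BOUNDS) for _ in range(n_particles)]
    pbest = [list(p) for p in x]
    pbest_val = [fitness(p) for p in x]
    gi = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = list(pbest[gi]), pbest_val[gi]
    for _ in range(n_iters):                         # Step 4: iterate to the limit
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(PARAM_BOUNDS):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d] + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] = min(hi, max(lo, x[i][d] + v[i][d]))   # clamp to bounds
            val = fitness(x[i])                      # Steps 2-3: train and score
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = list(x[i]), val
                if val < gbest_val:
                    gbest, gbest_val = list(x[i]), val
    return gbest, -gbest_val                         # Step 5: best params, accuracy

best_params, best_acc = pso_tune()
```

In the real pipeline, `best_params` would be decoded into XGBoost keyword arguments and used to build the final classifier.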
3. Results and Discussion
3.1. Data Description
In this study, 450 sets of operation data related to three kinds of high-pressure rotor faults of a 330 MW unit in a power plant are used for experimental verification. The specifications are given in Table 2. The three kinds of faults, represented by F1, F2, and F3, are the high-pressure rotor rubbing fault (F1), the mass imbalance fault (F2), and the self-excited oscillation fault (including oil film half-speed whirl and oil film oscillation) (F3). C1–C9 represent nine different fault causes. Among them, four causes lead to the rotor rubbing fault: rubbing at the shaft seal caused by cylinder deformation (C1), rubbing at the shaft seal caused by a fast rate of loading up (C2), rubbing at the shaft seal caused by remaining at low load for a long time (C3), and rotor rubbing with the oil baffle (C4). Three causes lead to the mass imbalance fault: inadequate stiffness of the bearing pedestal (C5), fracture and falling off of rotating parts (such as blades and coupling windshields) (C6), and other reasons (C7). Two causes lead to the self-excited oscillation fault: poor stability of the bearing (C8) and excessive journal disturbance (C9). A total of 50 groups of data samples for each fault cause constitute the sample set.
In this study, ten running parameters highly correlated with the rotor rubbing fault, the mass imbalance fault, and the self-excited oscillation fault are selected, including the steam temperature of the high-pressure cylinder shaft seal and the cylinder expansion value of the high-pressure cylinder. The details are given in Table 3.
In all, the input data, composed of the running parameters and the fault type, have eleven dimensions.
3.2. Data Preprocessing
Data preprocessing aims to make the data adapt to the model and match the model’s needs. Data preprocessing mainly includes missing value processing, data dimensionless processing (including central processing and scaling processing), classified feature processing (text to digital), and continuous feature processing.
3.2.1. Missing Value Processing
For missing values, in this study, the mean is used to fill the numerical feature, and the mode is used to fill the character feature.
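These fill rules can be sketched in a few lines. The pure-Python illustration below (with made-up temperature and fault-type columns) fills numeric columns with the column mean and character columns with the column mode; in a pandas workflow this would correspond to `fillna` with the column's mean or mode.

```python
from statistics import mean, mode

# Sketch of the fill rules: numeric features get the column mean,
# character (categorical) features get the column mode. None marks a missing value.

def fill_missing(column):
    present = [v for v in column if v is not None]
    numeric = all(isinstance(v, (int, float)) for v in present)
    fill = mean(present) if numeric else mode(present)
    return [fill if v is None else v for v in column]

temps = fill_missing([530.0, None, 534.0])        # numeric -> mean fill
faults = fill_missing(["F1", "F2", None, "F1"])   # character -> mode fill
```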
3.2.2. Feature Coding of Character Features
In the original dataset, the classification features (fault types: rubbing fault (F1), mass imbalance fault (F2), and self-excited oscillation (F3)) and the fault cause category labels (e.g., rubbing at the shaft seal caused by cylinder deformation (C1) and rubbing at the shaft seal caused by a fast rate of loading up (C2)) are recorded as text rather than digits. To make the data suitable for the algorithm, they must be encoded, converting text to numerical types. The independent fault types (F1, F2, and F3) are transformed into dummy variables by one-hot coding, namely, F1 [1 0 0], F2 [0 1 0], and F3 [0 0 1]. Labels of the fault cause category [C1, C2, …, C9] are directly converted into the digital form [0, 1, …, 8].
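The two encodings above can be sketched directly; the mappings (F1 → [1 0 0], …, C1 → 0, …, C9 → 8) follow the description in the text.

```python
# Sketch of the encoding step: fault types become one-hot dummy variables,
# fault cause labels C1..C9 become the integers 0..8.

FAULT_TYPES = ["F1", "F2", "F3"]
CAUSES = [f"C{i}" for i in range(1, 10)]

def one_hot(fault_type):
    """F1 -> [1, 0, 0], F2 -> [0, 1, 0], F3 -> [0, 0, 1]."""
    return [1 if fault_type == f else 0 for f in FAULT_TYPES]

def encode_cause(cause):
    """C1 -> 0, C2 -> 1, ..., C9 -> 8."""
    return CAUSES.index(cause)

f2_dummies = one_hot("F2")
c9_label = encode_cause("C9")
```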
3.2.3. Data Standardization
First, center the data by subtracting the mean $\mu$; then, scale them by the standard deviation $\sigma$, i.e., $x^{*} = (x - \mu)/\sigma$. After these two steps, the data follow the standard normal distribution, i.e., $x^{*} \sim N(0, 1)$.
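The two-step transformation can be sketched per column as follows (a pure-Python illustration using the population standard deviation; on a real dataset this would typically be done with a fitted scaler so the same statistics are reused on the test set).

```python
from statistics import mean, pstdev

# Sketch of standardization: subtract the column mean, divide by the
# column's (population) standard deviation, yielding zero mean and unit variance.

def standardize(column):
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]

z = standardize([10.0, 20.0, 30.0])   # centered and scaled copy of the column
```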
The preprocessed dataset is given in Table 4.
3.3. Experimental Results
The test set is used to verify the performance of the PSO-XGBoost model. The model is quantitatively evaluated using evaluation indicators, such as the accuracy, confusion matrix, precision, recall, and F1-score [31–33].
The results can be divided into four classes, including true positive (TP), false positive (FP), true negative (TN), and false negative (FN). Here, TP is the correct predicted positive category, FP is the incorrect predicted positive category, TN is the correct predicted negative category, and FN is the incorrect predicted negative category.
Accuracy is simply the ratio of correctly predicted classifications to the total dataset: $\text{Accuracy} = (TP + TN)/(TP + TN + FP + FN)$.
Precision is the ratio of true positives to all predicted positive observations, both TP and FP: $\text{Precision} = TP/(TP + FP)$.
Recall is the ratio of true positives to all observations in the actual positive class: $\text{Recall} = TP/(TP + FN)$.
F1-score is the harmonic mean of precision and recall: $F1 = 2 \times \text{Precision} \times \text{Recall}/(\text{Precision} + \text{Recall})$.
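The four indicators can be computed directly from the TP/FP/TN/FN counts defined above; the counts in the example below are made up for illustration.

```python
# Sketch of the four evaluation indicators from TP/FP/TN/FN counts
# (binary case; the multiclass scores average these per category).

def metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = metrics(tp=45, fp=2, tn=40, fn=3)
```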
The confusion matrix is used for evaluating the model on a multiclassification problem. Each column of the confusion matrix represents a predicted category, and the column total is the number of instances predicted to be in that category. Each row represents the actual category, and the row total is the number of instances belonging to that category. For a confusion matrix, the larger the values on the diagonal and the smaller the values elsewhere, the better the model.
Figure 2 shows that the overall accuracy of the PSO-XGBoost model is 98.52%. From Figure 3, the accuracy of rubbing fault caused by cylinder deformation is 92.86%, the accuracy of rubbing at shaft seal caused by the fast rate of loading up is 92.31%, and the accuracy of three faults caused by other reasons is 100%.
Table 5 provides the accuracy, precision, recall, and F1-score for the PSO-XGBoost model. From this table, it can be seen that the accuracy, precision, recall, and F1-score of the proposed method as a whole are all above 98% for the performance of diagnosing rotor fault causes, and it can perform the accurate and comprehensive identification of various categories. Therefore, the proposed method’s performance has good results in accuracy, precision, recall, and F1-score.
3.4. Comparative Analysis
An investigation of five different classifiers, XGBoost, RF, GBDT, DT, and SVM, is performed to verify the superiority of PSO-XGBoost in classification performance. The classification results of these algorithms are shown in Figures 4–8. The accuracies are 95.56%, 93.33%, 92.59%, 91.85%, and 84.44%, respectively. Compared with Figure 3, we can conclude that the PSO-XGBoost algorithm is superior to the other five algorithms in classification accuracy.
To give a detailed quantitative analysis of each classifier's results, five confusion matrices from the five classification experiments are introduced to record the recognition results and the percentage of misclassification for the different rotor fault causes. Figures 9–13 show the confusion matrices of XGBoost, RF, GBDT, DT, and SVM, respectively.
Figures 10–12 show that the RF and DT models confuse C1, C2, and C3, and the GBDT model confuses C1, C2, C3, C6, and C7. Figure 13 shows that the SVM model performs worst. Figures 3 and 9 show that the PSO-XGBoost and XGBoost models both confuse C1 and C2, but on category C1, PSO-XGBoost has higher accuracy than XGBoost. Therefore, the PSO-XGBoost model is superior to the other five algorithms. The comprehensive model evaluation indicators are given in Table 6.
In view of Figures 14–17, the SVM model's accuracy, precision, recall, and F1-score are the lowest of the compared algorithms because SVM is in principle a linear classifier based on the maximum margin, which does not work well on nonlinear problems. DT performs better than SVM, but its accuracy, precision, recall, and F1-score are slightly lower than those of the other four algorithms, since RF, GBDT, and XGBoost all use the DT model as their weak learner and improve on it through ensembling. Apart from PSO-XGBoost, the fault cause diagnosis model constructed with XGBoost is superior to the other four algorithms in accuracy, precision, recall, and F1-score because XGBoost uses second-order derivatives and regularization terms, which improve accuracy regardless of the size of the dataset. After the parameter tuning process of the XGBoost model is optimized by PSO, the PSO-XGBoost model's accuracy, precision, recall, and F1-score are the highest of all the algorithms. Evidently, PSO can effectively optimize the parameters of XGBoost, thereby improving the classification performance on the dataset. From the perspective of comprehensive classification performance, choosing the PSO-XGBoost model for diagnosing rotor fault causes is more reasonable than choosing the other algorithms.
The comparison of different algorithms in the iterative process is shown in Figure 18.
Figure 18 shows that, in the initial iteration stage, the iterative curve of the proposed method declines rapidly, and the iterative process then converges easily. The other five methods clearly still need more iterations to reach their final convergence. Therefore, the proposed method converges faster and is more efficient in practice.
3.5. Maintenance Strategy according to Fault Causes
For nine different rotor fault causes, we build a knowledge base, mapping each rotor fault cause to a specific solution, in order to achieve the purpose of intelligent operation and maintenance. For example, when we diagnose the rotor fault cause C1, the computer will automatically link to the solution M1. Other details in the knowledge base are given in Table 7.
4. Conclusions
On the basis of fault category detection, the diagnosis of rotor fault causes is proposed, which contributes greatly to the field of intelligent operation and maintenance. This study proposes a hybrid model for diagnosing rotor fault causes using the PSO-XGBoost algorithm. Aiming at the low accuracy and low efficiency of adjusting the XGBoost model's parameters by empirical methods, PSO is used to overcome the difficulty of parameter adjustment when applying the XGBoost model to rotor fault cause diagnosis and to improve diagnostic accuracy at the same time. The experimental results show the following: (1) Compared with directly constructing an XGBoost model to diagnose rotor fault causes, the hybrid model achieves higher diagnostic accuracy and practical efficiency. (2) The hybrid model can effectively identify nine different failure causes under three types of failures, and the classification accuracy, precision, recall, and F1-score are all above 98%. Compared with XGBoost, RF, GBDT, DT, and SVM, from the perspective of comprehensive classification performance, the PSO-XGBoost model is more effective than the other algorithms for diagnosing rotor fault causes.
Data Availability
The csv data used to support the findings of this study have been deposited in the Baidu Netdisk repository (https://pan.baidu.com/s/1A8jqMmykRYbOxqJwYPC3fw; password: suep).
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Shanghai 2019 “Science and Technology Innovation Action Plan” High-tech Field Project (19511103700).
References
H. M. Zhong, W. L. Zhang, Y. R. Li et al., “GBDT based railway accident type prediction and cause analysis,” Acta Automatica Sinica, vol. 45, pp. 1–9, 2020.
T. Q. Chen and C. Guestrin, “XGBoost: a scalable tree boosting system,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794, San Francisco, CA, August 2016.
B. Wang, Y. Sun, B. Xue et al., “Evolving deep convolutional neural networks by variable-length particle swarm optimization for image classification,” in Proceedings of the IEEE Congress on Evolutionary Computation, pp. 1514–1521, Rio de Janeiro, Brazil, July 2018.
M. Zhang, Z. Liu, and X. Dang, “Fault diagnosis on train brake system based on multi-dimensional feature fusion and GBDT enhanced classification,” in Proceedings of the International Conference on Intelligent Rail Transportation, Singapore, December 2018.
X. F. Wang, X. B. Yan, and Y. C. Ma, “Research on user consumption behavior prediction based on improved XGBoost algorithm,” in Proceedings of the IEEE International Conference on Big Data, pp. 4169–4175, Seattle, WA, December 2018.