Abstract

The on-board power supply system provides power for the launch vehicle. The power transmission and transformation system plays an irreplaceable role in ensuring that the on-board power supply system receives the normal working voltage of the launch vehicle. Power transmission and transformation systems are subject to many types of faults. Traditional fault diagnosis methods for power transmission and transformation equipment are susceptible to expert subjectivity and model ossification. In this paper, a new equipment fault diagnosis method based on big data is proposed. On the basis of big data, this paper introduces the failure mode clustering algorithm, the state parameter correlation analysis algorithm, the fault diagnosis method based on the correlation matrix, and other key fault diagnosis technologies. Fault records collected by a Chinese combat unit over the past ten years for 400 kV oil-immersed transformer bushings are used as a case for demonstration. The results show that the accuracy of the SC-LSTM-K-means clustering model exceeds 95% and that the fault classification modes can be accurately obtained. The Apriori association algorithm with the TA coefficient can be used to evaluate the strength of the relationships between state parameters; the fault diagnosis matrix based on Pearson’s correlation coefficient can accurately determine a fault mode consistent with the actual operation and maintenance test results. Therefore, the fault diagnosis method of power transmission and transformation systems based on big data can both effectively extract the inherent laws of historical data and realize more accurate fault diagnosis with data adaptability.

1. Introduction

The vehicle-mounted power supply system is the power source of the launch vehicle, and the safety of the power transmission and transformation system is the basis of reliable and stable operation of the launch vehicle’s power grid, which is of great significance to mobile warfare. Effective and accurate evaluation, diagnosis, and prediction of equipment status can significantly improve the reliability of power supply and the intelligent level of power grid operation [1].

The research on condition monitoring, evaluation, and fault diagnosis technology of high-voltage power equipment was carried out earlier abroad [2, 3]. As early as 1951, engineers from Westinghouse Electric Corporation monitored and diagnosed the motor damage caused by electric discharge in normal operation [4, 5].

Before the 1970s, developed countries such as the former Soviet Union, Japan, the United States, Germany, and Canada had already explored live and online monitoring of power transmission and transformation systems. They first opened up the research field of online monitoring technology and developed monitoring of dissolved gases in transformer oil and of partial discharges in transformers and gas-insulated switchgear [6]. Since the 1990s, equipment condition monitoring and diagnosis technology has developed rapidly, and measurement methods have been continuously improved with the development of sensors, computers, network communication, and other technologies. Monitoring objects have gradually expanded from substation equipment to transmission equipment, and condition information has become increasingly rich. There are also test instruments that reflect the equipment condition through nonelectric quantity measurement, such as ultra-high-frequency partial discharge detection, gas chromatography sensors, online optical fiber temperature measurement, infrared equipment, and ultrasonic equipment.

The research on condition monitoring and evaluation of power system equipment in China began in the 1970s–1980s [7]. Since the 1980s, research on online monitoring technology has laid the foundation for the development of condition assessment technology in China. In the past 10 years, live detection and online monitoring systems for primary equipment have been widely used in China. In particular, with the construction and development of the smart grid, online monitoring technology has been rapidly popularized and applied [8, 9]. China’s power grid companies have recently made many explorations and attempts in the field of equipment operation and maintenance, gradually realizing the important value of accurately grasping equipment status information, and have begun to promote maintenance management strategies based on status evaluation. At the same time, with the rapid development of sensor technology, Europe, America, Australia, and Japan have significantly accelerated the research and application of intelligent diagnostic devices. Large foreign companies, especially European companies such as Siemens, ABB, Alstom, and AEG, use online detection systems in high-voltage circuit breakers and GIS. In existing designs, there is still a hard-wired connection between the compartment control cabinet and the primary components of the GIS [10]. In order to overcome this shortcoming, ABB has developed a serial fiber optic bus system, eliminating hard-wired cables and developing the third-generation secondary intelligent technology [11]. ABB adopts intelligent sensor technology and microprocessor technology in the equipment it develops and realizes online monitoring, diagnosis, process monitoring, and in-station computer monitoring of the equipment through digital communication [12].

At present, the widely used state evaluation methods of power transmission and transformation systems in Chinese power grid companies include equipment state scoring system method, expert system method, multidimensional equipment state evaluation method based on traditional machine learning, and sample training method introducing remote expert opinions. However, with the development of intelligent monitoring equipment in recent years, the amount of state parameter data of power transmission and transformation system has increased exponentially; the equipment status data comes from a number of different systems. Traditional state evaluation methods cannot deal with this kind of multisource heterogeneous massive data.

Firstly, this paper analyzes the shortcomings of traditional fault diagnosis methods for power transmission and transformation equipment: model parameters have to be set manually, the model is difficult to change once it has been trained, and some relationships cannot be expressed by equations. A fault diagnosis method for power transmission and transformation systems based on big data is then proposed. The LSTM-K-means algorithm improved with the silhouette coefficient (SC) is used for fault classification, the Apriori association algorithm is combined with the TA coefficient to obtain the strong-weak relationships between state parameters, and Pearson’s coefficient is used to construct the fault diagnosis matrix. Finally, the feasibility and accuracy of the proposed method are verified with fault cases recorded over the past ten years for 400 kV oil-immersed transformer bushings in the launch vehicles of a combat unit.

2. State Evaluation Methods of the Traditional Power Transmission and Transformation System

With the development of artificial intelligence algorithms such as neural networks, equipment state evaluation methods based on machine learning have been developed, as shown in Figure 1. This kind of method is usually based on training with limited samples and adopts a mathematical modeling method to construct a predictable relationship between input and output. Compared with the traditional scoring system, the state evaluation method based on machine learning can use data from more time sections for sample training, and the prediction results obtained are more accurate. In addition, this method can model with complex physical and mathematical functions. Commonly used mathematical methods include artificial neural networks, Bayesian networks, support vector machines, and Markov models [13].

However, the data that can be used by machine learning method is still limited, and it is still difficult to consider the influence of external factors such as meteorological environment on equipment status in the modeling process of such functions. The model is solidified after setting and training. Unless it is modified or trained again, it cannot adapt to various changes in the process of equipment operation and maintenance, nor can it reflect the influence of differences between different equipment models and different operating environments on diagnosis results. Because the core problem of machine learning method is the selection of historical samples and the training of samples, there are limitations of training speed, training convergence, local minima, and other problems in practical application. Some improvement measures are usually adopted to solve the above problems, such as preimprovement algorithm, postimprovement algorithm, and so on.

On the basis of machine learning method and the concept of expert system, a diagnosis system based on remote expert intervention is developed. The system can train the system with expert opinions as new samples, which integrates the advantages of machine learning method and expert system method, and can improve the accuracy and reliability of subsequent diagnosis results. However, the introduction of remote expert opinions cannot solve the problem of model solidification.

The rise and development of big data mining analysis method has opened up a brand-new technical route for state evaluation and fault diagnosis of power equipment in weapons and equipment and put forward higher requirements for existing equipment state monitoring parameters. This method introduces the theories and tools of mathematical statistics and pattern recognition. On the basis of large-scale data analysis, it focuses on mining the correlation between analyzed factors under uncertain model conditions. The equipment status evaluation model based on big data is shown in Figure 2. This method adopts the idea of big data mining, focusing on mining and investigating equipment defects and the correlation degree between fault state results and equipment state parameters.

Compared with traditional methods, the most fundamental differences of power equipment condition assessment and fault diagnosis methods based on big data mining analysis are as follows:

(1) In the evaluation model of traditional methods, the equipment condition monitoring quantity is the input parameter, while the equipment defects and faults are the output parameters; in the analysis method of big data mining, equipment condition monitoring quantity and equipment defects and faults are all input parameters, while the output quantity is the association rules, association degrees, and other elements among all input parameters. The traditional model formed by input and output training, once generated, cannot be changed unless retrained; the big data mining analysis method uses a dynamic correlation coefficient matrix to model the correlation between equipment condition indexes and equipment condition monitoring parameters, which can be continuously regressed and revised and flexibly changed according to the studied equipment objects, state parameters, fault types, etc., without reconstructing the model and without the problem of model solidification. The condition evaluation method of power transmission and transformation equipment based on big data mining is suitable for evaluating and predicting any parameter index of equipment, including equipment health and load capacity.

(2) In the traditional methods, the evaluation model is the most critical element; it is impossible to express results other than the preset logical relationship between the input and output of the evaluation model, and it is difficult to properly reflect personalized and differentiated elements such as equipment manufacturers, meteorological environment, and habits of operation and maintenance personnel. The correlation mining analysis method based on big data analysis is different. On the basis of higher requirements for existing equipment condition monitoring parameters, it takes massive data as mining objects and uses data mining methods suitable for big data to mine the correlation between the factors to be analyzed under uncertain models. In the data mining analysis method, the most important thing is the effective integration and fusion of massive data with multiple sources, multiple time scales, and multiple space-time dimensions, so as to find the inherent (known or hidden) correlation between various equipment condition monitoring quantities and equipment defects and faults, even where that correlation is difficult to capture in physical and logical models.

3. Key Technology of System Condition Evaluation Based on Big Data Analysis

3.1. The Clustering Model
3.1.1. K-Means Algorithm

The K-means algorithm [14] divides the data into k clusters that minimize the sum of squared errors through continuous iterative calculation. The algorithm is widely used in many fields because of its simple and efficient operation, strong scalability, nearly linear time complexity, and suitability for processing large data sets [15]. The implementation steps of the K-means clustering algorithm are as follows [16–18]:

(1) Initializing the clustering centers: randomly selecting K sample points from the N sample data as the initial clustering centers.

(2) Cluster division: calculating the distance from each remaining sample point to each of the K initial cluster centers and assigning each sample point to the cluster whose center is nearest.

(3) Calculating new clustering centers: calculating the mean of the sample points assigned to each cluster in step (2) and taking the calculated mean as the new clustering center.

(4) Convergence judgment: the E function is usually used as the judgment function, where E is the sum of squared errors between the sample data and their clustering centers. Steps (2) and (3) are repeated until the E value no longer decreases; the division obtained at that point is taken as the clustering result.
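Steps (1)–(4) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in this paper: the function names `kmeans` and `squared_distance`, the fixed random seed, the toy data, and the use of "assignments no longer change" as the stopping rule (which implies that E has stopped decreasing) are all choices made here for illustration.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means following steps (1)-(4); returns (centers, labels, sse)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step (1): K random samples as initial centers
    labels = None
    for _ in range(max_iter):
        # Step (2): assign every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # step (4): assignments are stable, E cannot drop further
            break
        labels = new_labels
        # Step (3): move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster happens to be empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    # E function: sum of squared errors between samples and their cluster centers.
    sse = sum(squared_distance(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, sse

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels, sse = kmeans(points, k=2)
```

Because the initial centers are random, the result on less clearly separated data depends on the seed; in practice the run is usually repeated with several initializations and the division with the smallest E is kept.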

3.1.2. SC-LSTM-K-Means Clustering Model

Based on the long short-term memory (LSTM) network’s strong nonlinear deep learning capabilities [19] and the advantages of the K-means clustering algorithm, this paper proposes a hybrid clustering model combining LSTM and K-means to identify fault types in power transmission and transformation systems. The flowchart of the LSTM-K-means hybrid model is shown in Figure 3.

In LSTM-K-means clustering, the selection of K value of cluster number is very important. Only by finding out the appropriate K value can we get ideal clustering effect. Silhouette coefficient can solve this problem well. In this paper, by introducing SC, the two concepts of cohesion and separation are integrated, and it is more effective to evaluate the clustering effect by SC.

To measure the cohesion within a cluster, the average distance between the ith element and the other elements of its own cluster is calculated and denoted as $a(i)$. To quantify the separation between clusters, a cluster $C$ other than the one containing the element is selected, and the average distance between the element and all elements in $C$ is calculated; the same average distance is then calculated for every other cluster, and the minimum of these values is recorded as $b(i)$. The equation for calculating the silhouette coefficient $S(i)$ of the ith element is

$$S(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}. \tag{1}$$

Finally, the silhouette coefficients of all elements in all clusters are calculated, and their average value is taken as the overall silhouette coefficient of the current clustering.
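The calculation of the overall silhouette coefficient can be illustrated with the following sketch. It is a hedged example rather than the code used in this paper: the function name `silhouette_score`, the Euclidean distance, the convention of scoring an element in a single-element cluster as 0, and the one-dimensional toy data are assumptions introduced here. At least two clusters are required.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all elements (Euclidean distance)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for single-element clusters
            continue
        # a(i): average distance to the other elements of the same cluster
        # (the distance of p to itself is 0, so it only affects the divisor).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b(i): smallest average distance to the elements of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != l
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical one-dimensional toy data with a good and a poor labelling.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_score(points, [0, 0, 1, 1])
poor = silhouette_score(points, [0, 1, 0, 1])
```

To choose the cluster number, the same score would be evaluated for the labelling produced at each candidate K, and the K with the largest score would be kept.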

3.2. Correlation Analysis Algorithm of State Parameters

At present, equipment state parameters are numerous and complicated, and the relationships between them have rarely been mined and analyzed, which leads to a lack of systematic understanding of equipment parameters. Through the analysis of state parameter association rules, the effective combination of multiple state parameters of equipment, the extraction and merging of feature quantities, and the analysis of the mutual influence degree of state parameters can be realized. The general form of an association rule is the implication $X \Rightarrow Y$, which can be understood as “if X, then Y,” where X is the antecedent equipment state parameter, which can be a single state parameter or a set of multiple state parameters, and Y is the consequent state parameter, which is generally a single state parameter.

Taking the Apriori algorithm used in this paper as an example, let the database to be mined for association rules be $D$, which is a collection of transactions $T$. If there are n transactions, $D = \{T_1, T_2, \ldots, T_n\}$, and each transaction $T_i$ consists of m items, $T_i = \{I_1, I_2, \ldots, I_m\}$.

For item set X, the degree of support S is defined as

$$S(X) = \frac{|\{T \in D : X \subseteq T\}|}{|D|}. \tag{2}$$

For the association rule $X \Rightarrow Y$, the degree of support is

$$S(X \Rightarrow Y) = \frac{|\{T \in D : X \cup Y \subseteq T\}|}{|D|}, \tag{3}$$

where S is the degree of support and $|\cdot|$ is the number of elements in a set.

The degree of support described in equation (3) reflects the probability that the two item sets X and Y occur simultaneously. It is equal to the support of the combined item set $X \cup Y$.

Similarly, for the association rule $X \Rightarrow Y$, its credibility (confidence) C is

$$C(X \Rightarrow Y) = \frac{S(X \Rightarrow Y)}{S(X)}. \tag{4}$$

The credibility described in equation (4) reflects the probability that a transaction containing X also contains Y. Users of association rules can mine rules with higher S and C by defining thresholds of minimum support and minimum credibility.
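The support and credibility definitions can be illustrated with a small sketch. Note the assumptions: this is a brute-force enumeration of candidate rules with a single consequent, not the candidate-pruning procedure of the Apriori algorithm itself; the function names, the `max_antecedent` limit, the thresholds, and the four toy transactions are hypothetical and are not taken from the data of this paper.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def mine_rules(transactions, min_support, min_confidence, max_antecedent=2):
    """Enumerate rules X => y whose support and credibility reach the thresholds."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(1, max_antecedent + 1):
        for antecedent in combinations(items, size):
            s_x = support(antecedent, transactions)
            if s_x == 0 or s_x < min_support:
                continue  # the antecedent itself is not frequent
            for y in items:
                if y in antecedent:
                    continue
                s_xy = support(set(antecedent) | {y}, transactions)  # rule support
                confidence = s_xy / s_x                              # rule credibility
                if s_xy >= min_support and confidence >= min_confidence:
                    rules.append((antecedent, y, s_xy, confidence))
    return rules

# Hypothetical transactions: each set lists the abnormal state parameters of one case.
transactions = [{"W1", "W10"}, {"W1", "W10"}, {"W1", "W12"}, {"W2", "W10"}]
rules = mine_rules(transactions, min_support=0.5, min_confidence=0.6)
```

In this toy example only the rules W1 ⇒ W10 and W10 ⇒ W1 pass both thresholds, each with support 0.5 and credibility 2/3.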

3.3. Fault Diagnosis Based on Correlation Matrix

In order to diagnose the fault mode, it is necessary to consider the correlation between each state parameter and each fault mode, that is, the possibility of a certain fault mode when a certain state parameter is abnormal. After obtaining the correlation coefficient between each state parameter and each fault mode of the equipment, the equipment fault mode diagnosis matrix R can be obtained; that is,

$$R = \begin{bmatrix} \rho_{11} & \rho_{12} & \cdots & \rho_{1q} \\ \rho_{21} & \rho_{22} & \cdots & \rho_{2q} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & \rho_{pq} \end{bmatrix}, \tag{5}$$

where $\rho_{rt}$ is the correlation coefficient between the vector $M_r$ of the rth failure mode and the vector $W_t$ of the tth state parameter; $r = 1, 2, \ldots, p$, where there are p fault modes; and $t = 1, 2, \ldots, q$, where there are q state parameters. There are many sets of data for each failure mode and state parameter, so $M_r$ and $W_t$ are vectors.

Pearson’s coefficient is used in this paper to calculate the correlation coefficient $\rho_{rt}$. It is calculated by the product-moment method: the deviations of the two variables from their respective mean values are multiplied, and the product is used to reflect the degree of correlation between the two variables [20–22]. Pearson’s coefficient ranges from −1 to 1. A value of 0 indicates that there is no linear relationship between the two variables, while −1 and 1 indicate that the two variables are perfectly negatively or positively correlated. The following equation shows the correlation coefficient of $M_r$ and $W_t$ applied to equipment fault diagnosis:

$$\rho_{rt} = \frac{\operatorname{Cov}(M_r, W_t)}{\sqrt{D(M_r)\, D(W_t)}}, \tag{6}$$

where $\operatorname{Cov}(M_r, W_t)$ is the covariance of $M_r$ and $W_t$; $D(M_r)$ is the variance of $M_r$; and $D(W_t)$ is the variance of $W_t$. After the diagnosis matrix R is obtained by the above method, the fault data can be diagnosed by the following equation:

$$F = R\,U. \tag{7}$$

In equation (7), U is the data vector of the fault case to be diagnosed, containing the state parameter level of each of the q state parameters, and F is the fault mode diagnosis result vector, whose elements indicate the membership degree of the fault case under each of the p fault modes. To determine the most likely failure mode, the failure mode with the largest membership degree (the largest value in F) is selected as the final result.
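The construction of the diagnosis matrix and the diagnosis step can be sketched as follows. This is an illustrative example under assumptions, not the computation performed in this paper: the function names, the convention of returning 0 when one of the vectors is constant, and all toy vectors are hypothetical, and the result vector is computed as the product of R with the case vector U.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumed convention: a constant vector carries no correlation
    return cov / math.sqrt(var_x * var_y)

def diagnosis_matrix(mode_vectors, parameter_vectors):
    """R[r][t] = correlation between fault mode r and state parameter t."""
    return [[pearson(m, w) for w in parameter_vectors] for m in mode_vectors]

def diagnose(R, U):
    """Return the membership vector F = R U and the index of its largest element."""
    F = [sum(rho * u for rho, u in zip(row, U)) for row in R]
    best = max(range(len(F)), key=F.__getitem__)
    return F, best

# Hypothetical history of four cases: two fault modes and three state parameters.
mode_vectors = [[1, 1, 0, 0], [0, 0, 1, 1]]
parameter_vectors = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]]
R = diagnosis_matrix(mode_vectors, parameter_vectors)

# Hypothetical case to be diagnosed: deterioration level of each state parameter.
F, best = diagnose(R, [2, 0, 1])
```

In this toy example the first state parameter always accompanies the first fault mode, so a case in which that parameter is strongly deteriorated is assigned to the first mode.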

4. Simulation Analysis of Fault Diagnosis Based on Big Data

In this paper, the fault cases of 400 kV oil-immersed transformer bushing in a combat unit launch vehicle in the recent ten years are taken as data mining objects, and the equipment fault diagnosis based on big data mining is studied.

4.1. Preprocessing Data

Firstly, the abnormal state data of the equipment to be mined are collected, with emphasis on the case data of faults and defects. Each case is identified by a case code, and each state parameter is denoted by $W_u$. The state parameters are assigned values according to the recorded faults and defects. Since the knowledge map is constructed only to mine and analyze the state parameters or the abnormal equipment cases themselves, it is only necessary to know whether each state parameter is abnormal; the equipment state level and the deterioration degree of the state parameters are not involved, so only binary quantification is carried out. According to severity, defects can be divided into emergency defects, major defects, and general defects.

When a certain state parameter is abnormal, the value of $W_u$ is 1, which means that the state parameter has faults or major or emergency defects. A value of 0 means that the state parameter is normal. The state parameters considered in this paper are shown in Table 1. Through summary statistics, there are 34 groups of equipment failure and defect cases, of which 22 groups are failure data.
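The binary quantification can be illustrated with a short sketch. The severity labels, the parameter names, and in particular the treatment of general defects as 0 are assumptions made only for this example; the text above specifies the value 1 for faults and for major or emergency defects and the value 0 for normal parameters.

```python
# Assumption: these labels mark a state parameter as abnormal (value 1).
# "general" defects are treated as 0 here purely for illustration.
ABNORMAL_LEVELS = {"fault", "emergency", "major"}

def binarize_case(observed, parameters):
    """Return the 0/1 vector of one case, ordered like `parameters`.

    `observed` maps a state parameter name to its recorded severity label;
    parameters missing from the record are taken to be normal.
    """
    return [
        1 if observed.get(name, "normal") in ABNORMAL_LEVELS else 0
        for name in parameters
    ]

# Hypothetical case record with four state parameters.
parameters = ["W1", "W2", "W3", "W4"]
case = {"W2": "major", "W3": "general", "W4": "fault"}
vector = binarize_case(case, parameters)
```

Stacking the vectors of all cases gives the binary case matrix from which the clustering and the association rule analysis start.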

4.2. Clustering Analysis of Failure Cases
4.2.1. Clustering Analysis of SC-LSTM-K-Means

Firstly, the 22 groups of fault cases in the original data are clustered hierarchically, and preliminary analysis shows that all faults can be divided into 3–8 categories. K is therefore set to 3, 4, 5, 6, 7, and 8 in turn, and the clustering effects are compared by calculating the silhouette coefficient for each cluster number K. The results are shown in Figure 4. As can be seen from Figure 4, the clustering result is best at K = 5, that is, when the failure modes are divided into 5 categories. Combined with the physical background of transformer bushing faults and expert experience, five common fault modes of transformer bushings can be summarized. The results of clustering the fault cases with LSTM-K-means are shown in Table 2.

4.2.2. Comparison of Three Models’ Clustering Accuracy

In recent years, some scholars have studied the combined clustering model of BP neural network and K-means algorithm. The basic idea is to use raw data and historical prediction errors as the input of the model, use the prediction error interval as the output, and then use BP neural network to learn the relationship between the input and output of the model. Finally, we can get a combination model with determined parameters.

This section compares the clustering accuracy of the model proposed in this paper with that of the K-means algorithm and the BP-K-means model. To select the K value for the latter two models, the curve of SSE (the sum of squared distances from all sample points to their corresponding cluster centers) against the K value is drawn. As K increases, SSE decreases and finally levels off; the K value at which the slope of the curve decreases most sharply is taken as a relatively reasonable value. The MATLAB simulation results are shown in Figure 5; the K-means model and the BP-K-means model both take K = 5 at this point.
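The selection of K from the SSE curve can be sketched as follows. The function name `elbow_k`, the use of the second difference of SSE as the measure of the change in slope, the assumption of evenly spaced K values, and the SSE values themselves are hypothetical; they are not the values plotted in Figure 5.

```python
def elbow_k(sse_by_k):
    """Return the K at which the decrease of SSE slows down most sharply.

    `sse_by_k` maps evenly spaced K values to their SSE. The bend at K is the
    SSE drop obtained when reaching K minus the drop obtained when moving on
    to the next K.
    """
    ks = sorted(sse_by_k)
    best_k, best_bend = None, float("-inf")
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        bend = (sse_by_k[prev] - sse_by_k[k]) - (sse_by_k[k] - sse_by_k[nxt])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k

# Hypothetical SSE values for K = 3, ..., 8.
sse_by_k = {3: 100.0, 4: 60.0, 5: 20.0, 6: 18.0, 7: 17.0, 8: 16.5}
chosen_k = elbow_k(sse_by_k)
```

With these illustrative values, SSE falls steeply up to K = 5 and only marginally afterwards, so K = 5 is returned.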

We use the actual classification results of the 22 groups of fault cases as the basis for comparing clustering accuracy. Figure 6 compares the clustering effect of the three models. It can be seen that the clustering effect of the model proposed in this paper is better and its cluster boundaries are clearer, while the clustering results of the other two models are more disordered. Table 3 compares the accuracy of the three models. The accuracy of the K-means model is 81.81%, the accuracy of the BP-K-means model is 86.37%, and the accuracy of the model in this paper is 95.45%, which is significantly higher than that of the other two clustering models.

4.3. Correlation Analysis of State Parameters

In the analysis of association rules, the choice of the confidence and support thresholds is very important: only with reasonable thresholds can association rules of comparative value be mined. Because there are many kinds of state parameters, the support threshold should not be set too large, so it is set to 0.1. In order to obtain association rules with high credibility, the confidence threshold is set to 0.85. A total of 21 association rules with high confidence are obtained. The visualization of these 21 association rules, RULE1–RULE21, is shown in Figure 7.

In Figure 7, the abscissa is the antecedent equipment state parameter x of the association rule; the ordinate is the consequent equipment state parameter y of the association rule; the circles RULE1–RULE21 at the intersections of the vertical and horizontal axes represent the 21 association rules, which reflect the correlation between the antecedent and consequent equipment state parameters; the area of a circle indicates the support of the rule, and the larger the circle, the greater the support; the depth of the circle color indicates the confidence of the rule, and the darker the color, the greater the confidence. The association rules with high support can be seen intuitively. The consequent equipment state parameters are mainly concentrated in W10 and W12, which shows not only that the end screen is a component prone to problems in transformer bushings, but also that abnormal dielectric loss, capacitance, and insulation resistance of the end screen are highly likely to occur together with other abnormal states. The casing is prone to insulation dampness faults, which are often caused by substandard production quality or aging, or by poor casing sealing resulting from human factors. Therefore, when the main insulation dielectric loss or the end screen dielectric loss is abnormal, the end screen insulation resistance often drops seriously, which is consistent with the actual situation on-site. For the poor connection fault mode, state parameters such as infrared temperature measurement, terminal screen outgoing line, and casing wiring are obviously abnormal. By extracting each abnormal state parameter under this fault mode together with its TA coefficient, the correlation between the state parameters and the fault mode can be obtained, as shown in Figure 8. The TA coefficients of infrared temperature measurement, terminal screen outgoing line, and casing wiring are higher. In the case of poor contact, the resistance at the poor contact point increases and heat generation becomes more serious, so infrared temperature measurement can locate the poor contact point well.

4.4. State Evaluation

Through cluster analysis, five fault modes of the oil-immersed transformer bushing are mined out, so the number of fault modes is p = 5. There are 15 key state parameters in total, so the number of state parameters is q = 15. In the state evaluation, the equipment state and state parameters need to be calculated quantitatively and at a finer granularity, so binary quantification of the state parameters alone is not sufficient for the task in this section. Therefore, according to the deterioration degree of the health level, the equipment state is divided into five grades, 0, 1, 2, 3, and 4, which better reflects the correlation with the state parameters. Considering all 34 groups of fault and defect cases, the key state parameters are renumbered from 1 to 15. For the terminal screen discharge fault mode, the abnormal state vector is

Calculate the correlation coefficient of each state parameter in the terminal screen discharge fault mode; for example, the vector of the state parameter for analyzing the dissolved gas in the oil is

Therefore, the factor in the diagnosis matrix is calculated by

Similarly, the remaining elements in the diagnostic matrix R of equation (5) can be calculated. Two condition evaluations of oil-immersed transformer bushing equipment are taken as verification cases. In case 1, the casing wiring, terminal screen outgoing line, and infrared temperature measurement state parameters are abnormal; in case 2, the porcelain insulation is damaged, and the oil level indication and oil leakage inspection state parameters are abnormal. The state parameters are quantified by deterioration level, and the vectors U of the two cases to be diagnosed are shown in the following equation:

After substituting equation (7), the fault mode diagnosis result vector F of the two diagnosis cases is shown in Table 4.

If the fault mode with the highest diagnosis membership value is taken as the diagnosis result, the diagnosis result of case 1 is poor contact, and the diagnosis result of case 2 is serious oil leakage. On-site fault diagnosis by the relevant operation and maintenance personnel confirmed that the abnormality in case 1 was caused by the thread of the casing end screen not being tightened. In case 2, the oil level could not be observed through the oil level mirror; inspection showed that this was caused by the failure to replenish oil for a long time together with the normal aging and oil leakage of the casing.

5. Conclusion

Firstly, the SC-LSTM-K-means clustering algorithm can be used to mine the fault modes of power transmission and transformation systems of launch vehicles, and the number of fault classes can be determined through the SC. Comparison with the K-means and BP-K-means algorithms shows that the clustering accuracy of this model reaches 95.45%. Secondly, the Apriori algorithm, which mines frequent item sets based on Boolean association rules, can be used to mine the internal correlations among the characteristic parameters of power transmission and transformation systems, so as to realize the effective combination of multiple state parameters of equipment, feature extraction and merging, and analysis of the mutual influence degree of state parameters; the obtained TA coefficient can characterize the strength of the correlation. Finally, with the equipment fault correlation matrix based on Pearson’s coefficient, the maximum value of Fr is 5.1630 for case 1 and 4.8892 for case 2. The method accurately diagnoses the equipment fault mode, and the result is consistent with the actual operation and maintenance test results.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.