Abstract

Evidence reasoning (ER) combined with the dimensionless index method can be used for rotating machinery fault diagnosis. In the ER algorithm, reliability is mainly obtained in two ways: the distance-based method and the correlation measure by set theory. In practice, the distance-based method cannot generate highly discriminative reliability for heavily overlapping data such as dimensionless index data. Therefore, the correlation measure by set theory is used more frequently in fault diagnosis. Because this measure considers only the upper and lower bounds of the fault data, we add a regularization term that captures the relationship among the data inside those bounds. Experimental results show that fault diagnosis accuracy is improved, which indicates that the new reliability describes the data relationship better.

1. Introduction

Rotating petrochemical machinery has become increasingly complicated. For instance, its parts are ever more tightly coupled, and its working and operating environment is more complex and demanding [1, 2]. Therefore, higher reliability and safety requirements are put forward for equipment design, structure, process, and operation state [3]. As key components of petrochemical units, rotary units are used in important engineering fields such as petrochemical, power, chemical, metallurgical, and mechanical manufacturing [3, 4]. Rotating unit equipment (such as generators, steam turbines, blowers, and large rolling mills) is often the plant’s key equipment [5]. Its operating condition affects not only the operation of the machine itself but also the subsequent production. Therefore, it is urgent to study fault diagnosis technology for rotating units [6, 7].

At present, fault diagnosis methods can be divided into three types according to the diagnosis model: analytical model based, qualitative knowledge based, and Dempster-Shafer theory based. Fault diagnosis based on the analytical model seeks the operating law of the object. By studying the intrinsic relation between dynamic parameters and response symptoms in the fault state [8], information relating normal and abnormal operation is obtained. This kind of method is suitable for systems with an accurate quantitative mathematical model and a sufficient number of sensors. It obtains the fault pattern recognition result by establishing a physical model and a mathematical model. Typical analytical model based methods include the state estimation method, the parameter estimation method, the equivalent space method, and the analytical redundancy method.

The qualitative knowledge based fault diagnosis method is a kind of reasoning method built on a qualitative model. Its core is to use incomplete prior knowledge to describe the functional structure of the system and to establish a qualitative model for reasoning. The behavior of the system is predicted from the model and compared with the actual system behavior to detect failures. This approach usually includes expert systems, graph search, and fault tree analysis. For complex fault diagnosis, because the number and combination of faults are unpredictable, the workload of constructing the qualitative model is relatively heavy. For complex systems in particular, unpredictable fault combinations increase the scale of the model exponentially. Therefore, this kind of method is mostly applied to the analysis of some specific complex faults [9, 10].

The evidence theory based fault diagnosis method relies on an inexact reasoning theory that can deal with uncertain information. The confidence interval is used to replace the probability, the event is represented by a set, and the Bayesian formula is replaced by the rule of evidence combination. The confidence function can directly express both uncertainty and ignorance. In composite fault diagnosis, the D-S evidence theory reaches a decision through the fusion reasoning of each body of evidence on the same recognition frame [11-13]. At present, most evidence theory based fault diagnosis methods are aimed at the diagnosis of a single fault, which requires the elements in the identification framework to be mutually exclusive [14]. For complex fault diagnosis, however, such a setting has fundamental limitations. To extend its effectiveness to complex faults, the extended evidence theory (Dezert-Smarandache Theory, DSmT), which uses the intersection of elements in the identification framework to represent concurrent and composite faults, was proposed. For example, [14] gives an identification framework that covers both single faults and composite faults and assigns the evidence a correlation degree for different faults. Each piece of evidence in each group is decomposed into two kinds of evidence, independent and relevant. Several independent pieces of evidence are then fused using the DSmT combination rule, and the uncertainty of the different independent source evidence is inferred; thus the identification of composite faults is realized. However, this method cannot discriminate faults when the evidence is conflicting. Therefore, a fault diagnosis method based on evidence reasoning and the dimensionless index is proposed in [15]. This method can fuse multiple attributes and diagnose faults from conflicting evidence.
However, there are many methods to calculate the reliability in the ER process (e.g., [16] uses a distance method and [17] uses a set correlation measure), and the choice has a great influence on the reasoning results. Because the dimensionless index values of different faults overlap, the set correlation measure is more effective. However, such a method is affected by outliers in the dimensionless indexes.

To solve the above problems, we propose a method to regularize the reliability value. We use the correlation coefficient as the regularization term to improve the reliability calculation formula. The Gini correlation coefficient is used in this paper because it can describe the relation of nonlinear data effectively [18]. Therefore, this paper uses the improved evidence reasoning algorithm and dimensionless index to carry out fault diagnosis. The main contributions of this paper are as follows:

(1) In traditional ER, the reliability considers only the interval of the fault data set and neglects the relationship among the data inside the interval. In the proposed method, we use a correlation coefficient as a regularization term for the reliability. In practice, the new reliability is more trustworthy.

(2) To achieve better fault diagnosis results, we combine the improved ER with dimensionless indexes for rotating machinery fault diagnosis. The experimental results show that the new method outperforms the traditional one.

2.1. Dimensionless Index

The dimensionless index is a value obtained as the ratio of two dimensional quantities [19]. The value of the dimensionless index is determined by the nature or shape of the probability density function of the vibration signal amplitude. Changes in working conditions have little effect on the dimensionless index, which is helpful for the time domain analysis of fault diagnosis. At the same time, because the dimensionless index is a ratio, it has little to do with the sensitivity and magnification of the vibration detector. The monitoring system therefore does not need to be calibrated, which is convenient for fault diagnosis of actual equipment. The accuracy of traditional dimensionless fault diagnosis is high for single faults, but for complex faults it needs to be further improved [20, 21]. To overcome the shortcomings of traditional dimensionless index construction and to improve the accuracy of fault diagnosis of rotating units, scholars have put forward new algorithms for fault diagnosis.

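Consider a random signal with its amplitude and probability density function denoted by $x$ and $p(x)$, respectively. Using these notations, various types of dimensional indexes can be defined as follows [22]:

Average amplitude:

$$\bar{X} = \int_{-\infty}^{+\infty} |x|\, p(x)\, dx \tag{1}$$

Root mean square value:

$$X_{\mathrm{rms}} = \left[\int_{-\infty}^{+\infty} x^{2}\, p(x)\, dx\right]^{1/2} \tag{2}$$

Root mean square amplitude:

$$X_{r} = \left[\int_{-\infty}^{+\infty} |x|^{1/2}\, p(x)\, dx\right]^{2} \tag{3}$$

Kurtosis:

$$\beta = \int_{-\infty}^{+\infty} x^{4}\, p(x)\, dx \tag{4}$$

Maximum value:

$$X_{\max} = \lim_{l \to \infty} \left[\int_{-\infty}^{+\infty} |x|^{l}\, p(x)\, dx\right]^{1/l} \tag{5}$$

Dimensional indexes are sensitive to early fault data, but they change in a nonlinear manner as the degree of failure increases, which results in diagnosis errors. Therefore, we take the ratio of two dimensional indexes to form a dimensionless index. Dimensionless indexes can eliminate the nonlinear change of the dimensional index values. Various dimensionless indexes can be defined as follows:

Waveform index:

$$S_{f} = \frac{X_{\mathrm{rms}}}{\bar{X}} \tag{6}$$

Pulse index:

$$I_{f} = \frac{X_{\max}}{\bar{X}} \tag{7}$$

Margin index:

$$CL_{f} = \frac{X_{\max}}{X_{r}} \tag{8}$$

Peak index:

$$C_{f} = \frac{X_{\max}}{X_{\mathrm{rms}}} \tag{9}$$

Kurtosis index:

$$K_{v} = \frac{\beta}{X_{\mathrm{rms}}^{4}} \tag{10}$$

We can express the dimensionless indexes using the general equation (11). In (11), different dimensionless index equations are generated by choosing different values for the parameters $l$ and $m$; for example, $l = 2$, $m = 1$ gives the waveform index, and $l \to \infty$, $m = 2$ gives the peak index.

$$\zeta_{x} = \frac{\left[\int_{-\infty}^{+\infty} |x|^{l}\, p(x)\, dx\right]^{1/l}}{\left[\int_{-\infty}^{+\infty} |x|^{m}\, p(x)\, dx\right]^{1/m}} \tag{11}$$

Equation (11) shows that the dimensionless index calculation is based on the probability density function of the input signal. Hence, the dimensionless index is a ratio that is not affected by the absolute level of the signal.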
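The scale invariance stated above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes that the integrals over the probability density function are estimated by sample means, that the dimensionless index is a ratio of two moment roots of orders l and m, and that a synthetic sine wave stands in for measured data. The function names `moment_root` and `dimensionless_index` are hypothetical.

```python
import numpy as np

def moment_root(x, order):
    """Sample estimate of [ integral |x|^order p(x) dx ]^(1/order)."""
    a = np.abs(np.asarray(x, dtype=float))
    if np.isinf(order):
        return a.max()  # limit of the moment root as the order grows without bound
    return np.mean(a ** order) ** (1.0 / order)

def dimensionless_index(x, l, m):
    """Ratio of the l-th and m-th moment roots (assumed general form)."""
    return moment_root(x, l) / moment_root(x, m)

# Synthetic test signal: an assumption for illustration, not the paper's data.
t = np.linspace(0.0, 1.0, 1024, endpoint=False)
x = np.sin(2 * np.pi * 50 * t)

waveform = dimensionless_index(x, 2, 1)        # RMS / average amplitude
peak = dimensionless_index(x, np.inf, 2)       # maximum / RMS
pulse = dimensionless_index(x, np.inf, 1)      # maximum / average amplitude
margin = dimensionless_index(x, np.inf, 0.5)   # maximum / root amplitude

# Multiplying the signal by a constant gain leaves each index unchanged.
scaled = 10.0 * x
waveform_scaled = dimensionless_index(scaled, 2, 1)
peak_scaled = dimensionless_index(scaled, np.inf, 2)
```

Because numerator and denominator scale by the same factor, a change in sensor gain cancels out, which is why no calibration of the monitoring system is needed for these indexes.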

2.2. Fault Diagnosis Method Based on Evidence Theory and Dimensionless Index

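Inspired by [16], we have the following fault diagnosis process. For a fault diagnosis problem with $N$ candidate fault types $F_{1}, \ldots, F_{N}$, assume that there are $L$ basic attributes represented as $e_{i}$ $(i = 1, \ldots, L)$. Define the set of $L$ basic attributes as a source of evidence $E = \{e_{1}, \ldots, e_{L}\}$. Assume that every attribute $e_{i}$ has its own weight $\omega_{i}$, with $0 \le \omega_{i} \le 1$ and $\sum_{i=1}^{L} \omega_{i} = 1$. Here, $\omega_{i}$ represents the degree of importance of attribute $e_{i}$. The evaluation result of every attribute can be represented in the following reliability distribution form:

$$S(e_{i}) = \{(F_{n}, \beta_{n,i}),\ n = 1, \ldots, N\}, \quad i = 1, \ldots, L.$$

Note that $\beta_{n,i} \ge 0$ and $\sum_{n=1}^{N} \beta_{n,i} \le 1$, where $\beta_{n,i}$ represents the reliability with which the evaluation result of attribute $e_{i}$ points to fault $F_{n}$.

We use $\beta_{n}$ as the reliability with which the diagnosis problem is assigned to $F_{n}$; $\beta_{n}$ is the final reliability, which fuses all the attribute evaluation results. In the following, the ER algorithm proposed by Yang et al. is used to fuse the information [16].

Let $m_{n,i}$ denote the basic probability assignment value with which basic attribute $e_{i}$ supports the hypothesis that the diagnosis is $F_{n}$, and let $m_{H,i}$ denote the basic probability assignment value that is not assigned to any of the fault types. The value of $m_{H,i}$ describes the degree of uncertainty. The basic probability assignment values can be obtained as follows:

$$m_{n,i} = \omega_{i} \beta_{n,i}, \qquad m_{H,i} = 1 - \sum_{n=1}^{N} m_{n,i} = 1 - \omega_{i} \sum_{n=1}^{N} \beta_{n,i},$$

$$\bar{m}_{H,i} = 1 - \omega_{i}, \qquad \tilde{m}_{H,i} = \omega_{i} \left(1 - \sum_{n=1}^{N} \beta_{n,i}\right).$$

It is easy to see that $m_{H,i}$ is decomposed into two parts, $m_{H,i} = \bar{m}_{H,i} + \tilde{m}_{H,i}$. The part $\bar{m}_{H,i}$ is determined by the weight of the attribute, and the part $\tilde{m}_{H,i}$ is caused by incomplete evaluation information for the attribute.

To obtain the final diagnosis result, we apply the Dempster combination rule recursively. Let $m_{n,I(i)}$, $\bar{m}_{H,I(i)}$, and $\tilde{m}_{H,I(i)}$ denote the values obtained after combining the first $i$ attributes, with $m_{n,I(1)} = m_{n,1}$, $\bar{m}_{H,I(1)} = \bar{m}_{H,1}$, $\tilde{m}_{H,I(1)} = \tilde{m}_{H,1}$, and $m_{H,I(i)} = \bar{m}_{H,I(i)} + \tilde{m}_{H,I(i)}$. Then

$$K_{I(i+1)} = \left[1 - \sum_{t=1}^{N} \sum_{j=1,\, j \ne t}^{N} m_{t,I(i)}\, m_{j,i+1}\right]^{-1},$$

$$m_{n,I(i+1)} = K_{I(i+1)} \left[m_{n,I(i)}\, m_{n,i+1} + m_{H,I(i)}\, m_{n,i+1} + m_{n,I(i)}\, m_{H,i+1}\right],$$

$$\tilde{m}_{H,I(i+1)} = K_{I(i+1)} \left[\tilde{m}_{H,I(i)}\, \tilde{m}_{H,i+1} + \bar{m}_{H,I(i)}\, \tilde{m}_{H,i+1} + \tilde{m}_{H,I(i)}\, \bar{m}_{H,i+1}\right],$$

$$\bar{m}_{H,I(i+1)} = K_{I(i+1)}\, \bar{m}_{H,I(i)}\, \bar{m}_{H,i+1},$$

$$\beta_{n} = \frac{m_{n,I(L)}}{1 - \bar{m}_{H,I(L)}}, \qquad \beta_{H} = \frac{\tilde{m}_{H,I(L)}}{1 - \bar{m}_{H,I(L)}}.$$

Note that $\beta_{n}$ is the final reliability with which the fault diagnosis problem is assigned to $F_{n}$, and $\beta_{H}$ is the reliability left unassigned. Therefore, the diagnosis result is simply the fault type with the largest final reliability:

$$n^{*} = \arg\max_{n} \beta_{n}.$$

According to the above description, reliability and attribute weight play an important role in the ER algorithm. The reliability based on the correlation measure by set theory will be introduced in Section 3.1. For the attribute weight and the detailed derivation of ER, see [16].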
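The fusion step can be illustrated with a short sketch of the recursive ER combination in the form commonly attributed to Yang et al. This is an illustration under stated assumptions rather than the authors' code: the function name `er_combine`, the variable names, and the example reliabilities and weights are all hypothetical, and the weights are assumed to sum to one.

```python
import numpy as np

def er_combine(beliefs, weights):
    """Fuse L attribute assessments over N fault types with the recursive ER rule.

    beliefs: shape (L, N); beliefs[i, n] is the reliability of attribute i for
             fault n, with each row summing to at most 1.
    weights: shape (L,); non-negative attribute weights assumed to sum to 1.
    Returns (beta, beta_h): fused reliability per fault and the unassigned part.
    """
    beliefs = np.asarray(beliefs, dtype=float)
    weights = np.asarray(weights, dtype=float)

    m = weights[:, None] * beliefs                    # mass given to each fault
    m_bar = 1.0 - weights                             # unassigned mass due to the weight
    m_tilde = weights * (1.0 - beliefs.sum(axis=1))   # unassigned mass due to incompleteness

    # Start from the first attribute and fold in the others one at a time.
    cm, cbar, ctilde = m[0].copy(), m_bar[0], m_tilde[0]
    for i in range(1, len(weights)):
        ch = cbar + ctilde
        mh = m_bar[i] + m_tilde[i]
        # Conflict: combined mass falling on pairs of different faults.
        conflict = cm.sum() * m[i].sum() - np.dot(cm, m[i])
        k = 1.0 / (1.0 - conflict)
        new_m = k * (cm * m[i] + ch * m[i] + cm * mh)
        new_tilde = k * (ctilde * m_tilde[i] + cbar * m_tilde[i] + ctilde * m_bar[i])
        new_bar = k * (cbar * m_bar[i])
        cm, cbar, ctilde = new_m, new_bar, new_tilde

    beta = cm / (1.0 - cbar)
    beta_h = ctilde / (1.0 - cbar)
    return beta, beta_h

# Hypothetical example: two attributes, three fault types.
beliefs = [[0.8, 0.1, 0.1],
           [0.7, 0.2, 0.0]]   # second row is incomplete (sums to 0.9)
weights = [0.5, 0.5]
beta, beta_h = er_combine(beliefs, weights)
diagnosed = int(np.argmax(beta))   # index of the fault with the largest reliability
```

The fused reliabilities and the unassigned part together sum to one, so the result can be read as a distribution over the fault types plus a residual that reflects the incompleteness of the input assessments.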

3. Method Description

3.1. Traditional Reliability Calculation Method

The method used to obtain the reliability value has a very important effect on the evidence reasoning result. This is because the reliability value reflects the fault feature information and directly determines the fusion weight of each dimensionless index in the process of data fusion. In addition, reliability directly affects the quantities used in the Dempster combination rule, such as the basic probability assignment. The traditional idea is to determine the reliability from the distance between the input data and the average value of the data. This method can be used when the data overlap only slightly. For heavily overlapping dimensionless index data, however, we prefer the correlation measure by set theory, because it is affected only by the upper and lower bounds of the index values.

The dimensionless index can be denoted as interval form . When we want to obtain the correlation between and , it can be directly generated by

In (23), denotes the length of set. Then, reliability can be obtained:

Equation (23) shows that the reliability calculation method depends entirely on the interval value between the two groups of calculated data. This method will lose some information about the structure and correlation of the data, which leads to an inaccurate calculation of reliability. Therefore, a natural idea in the improved reliability design method is to add a regularization condition after the reliability calculation method. The new method should be able to represent the relationship between the two sets of data. In this paper, the calculation of the correlation coefficient is used to obtain the information inside the data.

3.2. Improved Reliability Method

Assumption 1. Given two independent and identically distributed data sets $X$ and $Y$ of the same length $n$, the two sets of data are matched one by one to form pairs $(X_{i}, Y_{i})$, $i = 1, \ldots, n$. First, we sort $X$ and $Y$. The notation $(X_{(i)}, Y_{[i]})$ denotes the pairs sorted by the values of $X$, such that $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$. Here, $Y_{[i]}$ denotes the value of $Y$ paired with $X_{(i)}$. Similarly, sorting by the values of $Y$, we can obtain $(Y_{(i)}, X_{[i]})$.

Definition 2. Based on the sorted data pairs, the Gini correlation coefficients can be defined as [18]

$$r_{G}(X, Y) = \frac{\sum_{i=1}^{n} (2i - 1 - n)\, X_{[i]}}{\sum_{i=1}^{n} (2i - 1 - n)\, X_{(i)}} \tag{25}$$

$$r_{G}(Y, X) = \frac{\sum_{i=1}^{n} (2i - 1 - n)\, Y_{[i]}}{\sum_{i=1}^{n} (2i - 1 - n)\, Y_{(i)}} \tag{26}$$

In (25) and (26), $n$ denotes the number of points in a data set. From (25) and (26), we can note that, in general, $r_{G}(X, Y) \ne r_{G}(Y, X)$. Hence, we define a symmetric Gini correlation (SGC) coefficient as in the following equation:

$$r_{SG}(X, Y) = \frac{1}{2}\left[r_{G}(X, Y) + r_{G}(Y, X)\right] \tag{27}$$

The correlation coefficient has the following properties: the correlation coefficient lies in the interval [-1,+1]; the correlation between $X$ and $Y$ is positive or negative if the sign of the correlation coefficient is positive or negative, respectively; if the correlation coefficient is 0, then $X$ and $Y$ are uncorrelated; the closer the magnitude of the correlation coefficient is to 1, the stronger the correlation between $X$ and $Y$.

It can be seen from the above correlation equation that the Gini correlation coefficient calculation method is relatively simple, which provides the condition for real-time fault diagnosis. Gini correlation coefficient is more stable than other classical correlation coefficients in dealing with nonlinear data [18].
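To make the computation concrete, the following is a minimal sketch of a rank-based Gini correlation and its symmetric average. It assumes the common order-statistics form in which the values of one variable, reordered by the ranks of the other, are weighted by (2i - 1 - n) and normalized by the same weighted sum over that variable's own sorted values; it also assumes the symmetric version is the mean of the two directions. The function names and the small data set are hypothetical, and the sketch does not handle constant inputs, for which the denominator is zero.

```python
import numpy as np

def gini_correlation(x, y):
    """Gini correlation of x with respect to the ranks of y (assumed form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    coeff = 2 * np.arange(1, n + 1) - 1 - n            # weights (2i - 1 - n), i = 1..n
    x_by_y = x[np.argsort(y, kind="stable")]           # x-values paired with sorted y
    return np.dot(coeff, x_by_y) / np.dot(coeff, np.sort(x))

def symmetric_gini_correlation(x, y):
    """Average of the two directional Gini correlations (assumed symmetrization)."""
    return 0.5 * (gini_correlation(x, y) + gini_correlation(y, x))

# Hypothetical data: a nonlinear but monotone relation gives a coefficient of 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
monotone = symmetric_gini_correlation(x, x ** 3)
reversed_order = symmetric_gini_correlation(x, -x)

# The two directions generally differ, which motivates the symmetric version.
u = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([1.0, 3.0, 2.0, 10.0])
r_uv = gini_correlation(u, v)
r_vu = gini_correlation(v, u)
r_sym = symmetric_gini_correlation(u, v)
```

Only one sort per variable and two dot products are needed, which is consistent with the remark that the calculation is simple enough for real-time use.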

According to the correlation measure of the set, the same dimensionless index of different fault types is considered as the regularization based on the set correlation metric. The composition equation is as follows:

Then, the new reliability can be generated by

According to the new reliability calculation equation, it can be found that the accuracy calculated in the same recognition frame is not lower than the accuracy of old reliability.

4. Experiment

4.1. Experiment Data

Experimental data were collected on a multistage centrifugal fan fault diagnosis unit, part of a fault diagnosis experiment platform for large rotating petrochemical machinery. The unit consists of an 11 kW five-stage centrifugal blower with transmission, a torque sensor, an inverter motor, and several faulty shaft, gear, and bearing components. It can simulate common faults of a multistage centrifugal blower unit. The EMT390 data acquisition probe is placed at the position labeled “f” in Figure 1. The experimental data are read and stored using the Guangdong Provincial Key Laboratory system software. The originally collected data comprise the chassis vibration acceleration values. Since faults at different locations affect the operation of the entire shaft differently, the fault type can be obtained by analyzing the chassis vibration acceleration information.

4.2. Fault Diagnosis Model

Before data acquisition, the fault type, fault combination, motor speed, and so on are determined. The lab staff then replace the normal parts of the unit with the corresponding faulty parts according to the type of fault. The machine is brought to the specified speed of 1000 rpm, and the vibration acceleration of the housing is collected by the EMT390 data collector at the specified position. To facilitate cross-checking of the data and of the diagnosis method, the data acquisition for each fault type is completed by two people, each collecting two groups of data. The process is shown in Figure 2.

The vibration acceleration of all fault types is stored in the fault data folder, and 46 sets of data stored in one folder are read out by the data-reading program. There are 1024 vibration acceleration values in each group of data. In the process of dimensionless index calculation, five different dimensionless index values are calculated for 1024 vibration acceleration values. Therefore, each set of fault data contains 46 × 5 dimensionless values.
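The shape of the resulting feature set can be sketched as follows. This is an illustration only: random numbers stand in for the measured vibration acceleration, the five indexes use the usual sample-mean estimates of the waveform, pulse, margin, peak, and kurtosis indexes (an assumption about the exact estimators), and the function name `five_indexes` is hypothetical.

```python
import numpy as np

def five_indexes(groups):
    """Map an array of shape (n_groups, n_samples) to shape (n_groups, 5).

    Columns: waveform, pulse, margin, peak, and kurtosis index.
    """
    a = np.abs(np.asarray(groups, dtype=float))
    mean_abs = a.mean(axis=1)                  # average amplitude
    rms = np.sqrt((a ** 2).mean(axis=1))       # root mean square value
    root_amp = np.sqrt(a).mean(axis=1) ** 2    # root amplitude
    max_val = a.max(axis=1)                    # maximum value
    fourth = (a ** 4).mean(axis=1)             # fourth moment (kurtosis)
    return np.column_stack([
        rms / mean_abs,        # waveform index
        max_val / mean_abs,    # pulse index
        max_val / root_amp,    # margin index
        max_val / rms,         # peak index
        fourth / rms ** 4,     # kurtosis index
    ])

# Stand-in data: 46 groups of 1024 values each (not the laboratory measurements).
rng = np.random.default_rng(0)
groups = rng.normal(size=(46, 1024))
features = five_indexes(groups)   # one row of five indexes per group
```

Each row of `features` corresponds to one group of 1024 acceleration values, so one fault type yields a 46 × 5 matrix that serves as the input to the reliability calculation.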

The fault diagnosis model is divided into five stages. First, the raw data are collected on the large petrochemical unit in the Guangdong Petrochemical Equipment Fault Diagnosis Laboratory. Second, dimensionless processing is used to extract the feature values of the original data. Third, according to how many faults are combined, the fault types are divided into single faults and composite faults. Fourth, the input fault data are assigned to an identification framework according to their fault type. Fifth, the fusion results are obtained to determine the diagnostic results. The specific steps can be described as follows.

Step 1. Determining the type of fault to be collected.

Step 2. Replacing the normal parts of the petrochemical unit with the designated faulty parts.

Step 3. Powering on the motor and adjusting its speed to 1000 rpm.

Step 4. Data acquisition personnel use EMT390 to collect vibration acceleration of the housing.

Step 5. Using data management software to read the sensor data and save it on the computer.

Step 6. Using a MATLAB program to read the collected data and convert them into five dimensionless indexes.

Step 7. Calculating the initial reliability according to the correlation measure method of five dimensionless index sets.

Step 8. Calculating the correlation coefficient according to the five dimensionless index values to obtain the regularization term.

Step 9. Setting the parameter value to obtain the new reliability.

Step 10. Calculating the weight of each dimensionless index according to the result of new reliability calculation.

Step 11. Fusing reliability and weight according to Dempster combination rule.

Step 12. Finding out the fault type corresponding to the maximum reliability of the four fusion results.

In the process of experiment, we need to establish a fault identification framework and train the optimal parameters and corresponding to the framework by collecting multiple groups of data. So, in the experiment, the diagnosis effect of each recognition frame is optimized. When the optimization reaches a certain effect, we begin to consider the linkage diagnosis in each recognition frame. When unknown fault data are input, the optimal diagnosis results can be found in each identification framework. Therefore, the establishment of a relatively complete identification framework library is necessary for the practical application of the fault diagnosis method in the industry. Figure 3 is the basic structure of the recognition framework library.

The failures used in the experiment are of two types: single faults and composite faults that combine two or more faults. In this paper, we use three recognition frames. The faults included in frame 1 are outer ring wear, inner ring wear, normal, and left bearing outer ring wear. The faults included in frame 2 are the composite failure of large gear missing teeth and left bearing outer ring wear, the composite failure of large gear missing teeth and left bearing missing ball, the composite failure of large and small gear missing teeth and outer ring wear, and the composite failure of large and small gear missing teeth and inner ring wear. The faults included in frame 3 are large gear missing tooth, the composite failure of large and small gear missing teeth, bearing missing ball, and the composite failure of large gear missing tooth and left bearing inner ring wear.

4.3. Fault Diagnosis Result

According to Figures 4 and 5, the accuracy of fault diagnosis obtained with the traditional method is 58.3% on recognition frame 1 and 58.3% on recognition frame 2. The accuracy obtained with the proposed method is 75.0% on recognition frame 1 and 66.7% on recognition frame 2. Recognition frame 3 combines single faults and composite faults; its fault diagnosis result is shown in Figure 6, where the accuracy of the method in [17] is 58.3% and the accuracy of the improved method is 75.0%. Overall, the diagnostic accuracy of the traditional reliability calculation method is 58.3%, and that of the improved algorithm is 72.23%, an improvement of about 14 percentage points. In terms of misdiagnosed fault types, the main misdiagnosed fault in recognition frame 1 is outer ring wear, while in frame 2 it is the composite fault of large and small gear missing teeth and inner ring wear of the left bearing. In recognition frame 3, it is the large gear missing tooth fault.
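The overall figures can be reproduced from the per-frame accuracies with a few lines of arithmetic. The sketch below assumes that the overall accuracy is the unweighted mean over the three recognition frames; the text does not state the aggregation rule, so this is an assumption that happens to match the reported 72.23%.

```python
# Per-frame accuracies (percent) as reported in the text.
traditional = {"frame 1": 58.3, "frame 2": 58.3, "frame 3": 58.3}
improved = {"frame 1": 75.0, "frame 2": 66.7, "frame 3": 75.0}

def mean_accuracy(per_frame):
    """Unweighted mean over recognition frames (assumed aggregation rule)."""
    return sum(per_frame.values()) / len(per_frame)

overall_traditional = mean_accuracy(traditional)
overall_improved = mean_accuracy(improved)
gain = overall_improved - overall_traditional   # difference in percentage points
```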

It can be seen from the three diagrams of the experimental results that, on recognition frames 1, 2, and 3, the fault identification performance of the improved evidence reasoning method is better than that of the method in [17]. This verifies the feasibility and accuracy of the proposed method in practical operation. In terms of the overall diagnosis effect, single faults are diagnosed better than complex faults. This is because the information carried by data collected from a single fault is easy to identify and distinguish, whereas the data of a complex fault exhibit the features of several faults at once, which makes complex faults prone to misdiagnosis.

5. Conclusion

In the traditional fault diagnosis method that combines evidence reasoning with dimensionless indexes, the diagnosis result is often wrong. This is largely due to the overlap of the dimensionless indexes of different fault data: the reliability calculated from a dimensionless index is not accurate when that index overlaps between different faults. In this paper, an improved evidence reasoning method based on reliability regularization is therefore proposed. The reliability regularization is mainly realized by calculating the Gini correlation coefficient between data. Because the Gini correlation coefficient handles nonlinear data well, the regularized reliability value better reflects the relationship between the two groups of data and yields a more practical evaluation result. The experimental results show that the improved reliability method is closer to the actual fault situation and that the diagnostic accuracy is improved.

Data Availability

All of the data used in this paper were collected from a large rotating machine in the Guangdong Provincial Petrochemical Equipment Fault Diagnosis Key Laboratory. They can be obtained by contacting the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61473331, in part by the Science and Technology Plan of Guangdong Province of China under Grant 2017A070712024, in part by the Sail Plan Training High-Level Talents of Guangdong Province of China, in part by the Introduction of Talents Project of Guangdong Polytechnic Normal University of China under Grant 991512203, in part by the 2016 Annual Scientific and Technological Innovation Special Fund to Foster Students Projects of Guangdong under Grant pdjh2016b0341, in part by the Guangdong University of Petrochemical Technology College Students Innovation Incubation Project under Grant 2015pyA006, in part by the Science and Technology Project of Guangzhou under Grant 201604010099, Grant 2016B030306002, and Grant 2016B030308001, in part by the Fundamental Research Funds for the Central Universities under Grant x2jqD2170480, in part by Guangdong Province Science and Technology Major Special Projects under Grant 2017B030305004, by Guangdong Province Science and Technology Application of Major Special Projects under Grant 2016B020243011, and by Major Provincial Scientific Research Projects of Guangdong Normal Universities under Grant 2017KZDXM052.