Abstract

When a compound fault occurs, the randomness and ambiguity of the gearbox cause uncertainty in the collected signal and reduce the accuracy of signal feature extraction. To improve accuracy, this research proposes a gearbox compound fault feature extraction method that uses the inverse cloud model to obtain the signal feature values. First, EEMD is used to decompose the gearbox vibration signals collected in normal and fault states. Then, the mutual information method is used to select the sensitive intrinsic mode functions (IMFs) that reflect the characteristics of the signal. Subsequently, the inverse cloud generator is used to extract cloud digital features and construct sample feature sets. On this basis, the concept of the synthetic cloud is introduced, and the cloud-based distance measurement principle is used to synthesize new clouds, which reduces the feature dimension while extracting the relevant features. Finally, a simulation experiment on a rotating machinery unit of a certain type of equipment verifies that the proposed method can effectively extract the features of gearbox multiple faults with a lower feature dimension. A comparison with the feature set extracted by the single cloud model shows that the method better represents the fault characteristic information of the signal.

1. Introduction

Gear transmission is one of the commonly used transmission methods in mechanical equipment and is often used in high-speed trains, wind power generation, aviation, shipping, petrochemical, mining, lifting, and transportation industries. According to domestic and foreign statistics, about 10.3% of mechanical failures are caused by gearbox failure, so it is particularly important to predict and diagnose gearbox failures [1].

Because mechanical equipment works in complex and harsh environments, the vibration signals collected on-site are often contaminated with noise, and many researchers have worked on eliminating its influence in recent years. Some researchers applied the wavelet denoising method to feature extraction and achieved good results [2–4]. However, in practical applications it is difficult to select the wavelet basis and determine the thresholds for this method. Empirical mode decomposition (EMD) has no fixed basis, so it avoids the wavelet basis selection problem and handles nonstationary signals better than wavelet analysis, but it suffers from mode mixing. To solve this problem, Wu et al. [5] proposed ensemble empirical mode decomposition (EEMD) to denoise the original signal, which overcomes the mode mixing inherent in the original EMD method. Other methods have also been used for fault feature extraction [5–8]. For example, Deng et al. [9] proposed an improved quantum-inspired differential evolution method to construct an optimal deep belief network and, on that basis, a new fault classification method. The advantage of this method is that it integrates the fault feature extraction process into the fault diagnosis algorithm.

The cloud model theory proposed by Professor Wang et al. in 1995 has been widely used in data mining [10, 11], intelligent control [12–14], decision analysis [15, 16], intelligent transportation [17], image processing, and other fields over the past 20 years. Han et al. [18] combined EEMD with the cloud model to extract features of bearing faults and achieved good results, but the resulting fault feature dimension is high. Therefore, this article improves on [18] and proposes a fault feature extraction method based on EEMD and the synthetic cloud model, which can effectively extract fault features while avoiding difficult parameter selection problems. First, EEMD is used to decompose the vibration signal into multiple IMF components, and the mutual information method is used to select the sensitive intrinsic mode functions (IMFs) that reflect the characteristics of the signal. Subsequently, the cloud model is used to extract cloud digital features, which serve as sample features. Then, the concept of the synthetic cloud is introduced: the cloud similarity criterion determines the choice of base clouds, and synthesizing them reduces the number of features. Finally, a comparison with the feature sets extracted by the single cloud model shows that this method better represents the feature information of the fault signal.

2.1. EEMD Decomposition Principle

Ensemble empirical mode decomposition (EEMD) exploits the statistical property that Gaussian white noise is uniformly distributed over the time-frequency plane to suppress mode mixing and thereby improve EMD. It adds Gaussian white noise to the signal, performs EMD on each noisy copy, and defines the ensemble average of the IMFs obtained over all decompositions as the final IMFs. The principal steps of the EEMD algorithm are roughly as follows:

(1) Initialize the ensemble number M and the amplitude of the added noise, and set m = 1.
(2) Perform the mth EMD decomposition.
    (a) Add white noise of constant amplitude to the signal to be analyzed:
    $$x_m(t) = x(t) + n_m(t), \tag{1}$$
    where $n_m(t)$ is the white noise added for the mth time and $x_m(t)$ is the signal after the mth noise is added.
    (b) Use EMD to decompose the noisy signal $x_m(t)$ into a set of IMFs $c_{n,m}(t)$, $n = 1, 2, \ldots, N$, where $c_{n,m}$ is the nth IMF obtained from the mth decomposition.
    (c) If $m < M$, set $m = m + 1$ and return to step (a). Repeat steps (a) and (b) until $m = M$.
(3) Calculate the ensemble average of each IMF over the M decompositions:
$$\bar{c}_n(t) = \frac{1}{M} \sum_{m=1}^{M} c_{n,m}(t), \quad n = 1, 2, \ldots, N. \tag{2}$$
(4) Save the averages $\bar{c}_n(t)$ of the N IMFs as the final IMFs.
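The ensemble-averaging structure of steps (1) to (4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function `eemd`, its parameters, and `toy_decompose` are hypothetical names, and `toy_decompose` is only a stand-in (a 3-point moving-average split) for a real EMD routine, which the sketch expects to be supplied by the caller. The sketch also assumes that every trial returns at least `num_imfs` components of the same length as the input.

```python
import math
import random
import statistics
from typing import Callable, List, Sequence

Signal = List[float]


def eemd(signal: Sequence[float],
         decompose: Callable[[Signal], List[Signal]],
         num_imfs: int,
         ensemble_size: int = 100,
         noise_ratio: float = 0.01,
         seed: int = 0) -> List[Signal]:
    """Average the components of `ensemble_size` noise-perturbed decompositions."""
    rng = random.Random(seed)
    # Noise amplitude expressed as a fraction of the signal's standard deviation.
    sigma = noise_ratio * statistics.pstdev(signal)
    length = len(signal)
    sums = [[0.0] * length for _ in range(num_imfs)]
    for _ in range(ensemble_size):
        noisy = [x + rng.gauss(0.0, sigma) for x in signal]  # x_m = x + n_m
        components = decompose(noisy)                        # c_{n,m}
        for n in range(num_imfs):
            for t in range(length):
                sums[n][t] += components[n][t]
    # Ensemble average over the M trials.
    return [[v / ensemble_size for v in row] for row in sums]


def toy_decompose(x: Signal) -> List[Signal]:
    """Placeholder for EMD (NOT real EMD): moving-average trend plus detail."""
    n = len(x)
    trend = []
    for i in range(n):
        window = x[max(0, i - 1):min(n, i + 2)]
        trend.append(sum(window) / len(window))
    detail = [a - b for a, b in zip(x, trend)]
    return [detail, trend]


signal = [math.sin(2 * math.pi * t / 16) for t in range(64)]
imfs = eemd(signal, toy_decompose, num_imfs=2, ensemble_size=50)
reconstruction = [d + s for d, s in zip(imfs[0], imfs[1])]
```

Because the added noise has zero mean, the averaged components sum to a close approximation of the original signal, which is the property that makes the ensemble average usable as the final decomposition.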

2.2. Cloud Model Related Theories
2.2.1. Cloud Model

The cloud model [19] is a model for converting between qualitative concepts and quantitative data, proposed by the academicians Li and Du. The cloud generator realizes the mutual conversion between qualitative concepts and quantitative data. The cloud model uses the expectation Ex, entropy En, and hyper-entropy He as digital features to represent a qualitative concept. The expectation Ex is the value that best represents the qualitative concept and reflects the information center of the corresponding qualitative knowledge. The entropy En measures the randomness of the qualitative concept and reflects the dispersion of the cloud drops that can represent it. The hyper-entropy He is the entropy of the entropy En; it reflects the randomness of the degree to which a value belongs to the qualitative concept and indirectly reflects the thickness of the cloud. Figure 1 shows a simple cloud model (Ex = 18, En = 2, He = 0.2); its ordinate is the degree of certainty with which each cloud drop represents the qualitative concept.

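The cloud digital features Ex, En, and He described above are calculated with the backward cloud algorithm [20]. The specific calculation method is as follows:

Input: N cloud drops $x_i$ ($i = 1, 2, \ldots, N$);
Output: the expectation Ex, entropy En, and hyper-entropy He of the qualitative concept represented by these N cloud drops.

(1) The estimated value of Ex is
$$\hat{Ex} = \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i. \tag{3}$$
(2) The estimated value of En is
$$\hat{En} = \sqrt{\frac{\pi}{2}} \cdot \frac{1}{N} \sum_{i=1}^{N} \left| x_i - \hat{Ex} \right|. \tag{4}$$
(3) The estimated value of He is
$$\hat{He} = \sqrt{S^2 - \hat{En}^2}, \quad S^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( x_i - \hat{Ex} \right)^2. \tag{5}$$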
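As a minimal sketch of the backward cloud estimates, the following Python function computes Ex, En, and He from a list of cloud drops using the standard certainty-free backward cloud formulas (sample mean, scaled mean absolute deviation, and the gap between sample variance and squared entropy). The function name `backward_cloud` is hypothetical, and clamping a negative value under the square root to zero is an assumption added here for finite samples; the source does not say how that case is handled.

```python
import math
from typing import Sequence, Tuple


def backward_cloud(drops: Sequence[float]) -> Tuple[float, float, float]:
    """Estimate (Ex, En, He) from cloud drops without certainty degrees."""
    n = len(drops)
    if n < 2:
        raise ValueError("at least two cloud drops are required")
    ex = sum(drops) / n                                   # sample mean
    mean_abs_dev = sum(abs(x - ex) for x in drops) / n
    en = math.sqrt(math.pi / 2.0) * mean_abs_dev          # entropy estimate
    s2 = sum((x - ex) ** 2 for x in drops) / (n - 1)      # sample variance
    # Assumption: clamp to zero when sampling noise makes S^2 < En^2.
    he = math.sqrt(max(s2 - en ** 2, 0.0))
    return ex, en, he


ex, en, he = backward_cloud([1.0, 2.0, 3.0, 4.0, 5.0])
```

In the method described here, the drops would be the samples of one retained IMF component, so each component contributes three numbers to the feature set.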

The one-dimensional forward cloud algorithm is as follows:

Input: the three digital features Ex, En, and He representing the qualitative concept, and the number of cloud drops N;
Output: the quantitative values $x_i$ of the N cloud drops and the certainty $\mu_i$ with which each cloud drop represents the concept.

(1) Generate a normal random number $En'_i$ with En as the expected value and He as the standard deviation;
(2) Generate a normal random number $x_i$ with Ex as the expected value and $En'_i$ as the standard deviation; $x_i$ is a cloud drop;
(3) Calculate $\mu_i = \exp\left( -\dfrac{(x_i - Ex)^2}{2 (En'_i)^2} \right)$, which is the certainty of $x_i$;
(4) Repeat the above steps until N cloud drops are generated.
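The forward generator can be sketched in Python as follows, using the parameters of the example cloud in Figure 1 (Ex = 18, En = 2, He = 0.2). The function name `forward_cloud` and the `seed` parameter are hypothetical. Taking the absolute value of the sampled entropy before using it as a standard deviation, and handling a sampled value of exactly zero separately, are assumptions of this sketch.

```python
import math
import random
from typing import List, Tuple


def forward_cloud(ex: float, en: float, he: float, n: int,
                  seed: int = 0) -> List[Tuple[float, float]]:
    """Generate n cloud drops as (value, certainty) pairs."""
    rng = random.Random(seed)
    drops = []
    for _ in range(n):
        en_prime = rng.gauss(en, he)          # step (1): En' ~ N(En, He^2)
        if en_prime == 0.0:
            # Degenerate case (assumption): the drop sits at Ex with certainty 1.
            drops.append((ex, 1.0))
            continue
        x = rng.gauss(ex, abs(en_prime))      # step (2): x ~ N(Ex, En'^2)
        mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))  # step (3)
        drops.append((x, mu))
    return drops


drops = forward_cloud(ex=18.0, en=2.0, he=0.2, n=1000)
mean_x = sum(x for x, _ in drops) / len(drops)
```

Plotting the pairs with the value on the abscissa and the certainty on the ordinate gives the familiar cloud shape, whose thickness grows with He.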

2.2.2. Cloud Synthesis

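Cloud synthesis [21] is the process of superimposing two cloud models to obtain a comprehensive cloud model. Let $C_1(Ex_1, En_1, He_1)$ and $C_2(Ex_2, En_2, He_2)$ be two cloud models, and let a and b be two constants. According to the rule for combining independent normal distributions, the synthesized cloud $C(Ex, En, He) = aC_1 + bC_2$ can be expressed as follows:
$$Ex = a\,Ex_1 + b\,Ex_2, \quad En = \sqrt{(a\,En_1)^2 + (b\,En_2)^2}, \quad He = \sqrt{(a\,He_1)^2 + (b\,He_2)^2}. \tag{6}$$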
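A small Python sketch of the synthesis step is given below. It assumes the usual rule for a linear combination of independent normal variables, so that expectations add linearly while entropies and hyper-entropies add in quadrature; this reading is an assumption, and the names `Cloud` and `synthesize` are hypothetical.

```python
import math
from typing import NamedTuple


class Cloud(NamedTuple):
    ex: float  # expectation
    en: float  # entropy
    he: float  # hyper-entropy


def synthesize(c1: Cloud, c2: Cloud, a: float = 1.0, b: float = 1.0) -> Cloud:
    """Combine two clouds as a*C1 + b*C2 (assumed independent-normal rule)."""
    return Cloud(
        ex=a * c1.ex + b * c2.ex,
        en=math.hypot(a * c1.en, b * c2.en),
        he=math.hypot(a * c1.he, b * c2.he),
    )


merged = synthesize(Cloud(1.0, 3.0, 0.3), Cloud(2.0, 4.0, 0.4), a=1.0, b=1.0)
```

The result is again a triple of digital features, so synthesizing two base clouds halves the number of features they contribute.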

The base clouds to be synthesized are selected according to the cloud similarity criterion. To preserve the basic structure of the original base clouds as far as possible, cloud similarity [19] is used to judge which base clouds should be synthesized. The cloud similarity measure is defined as follows:

Input: two cloud models $C_1(Ex_1, En_1, He_1)$ and $C_2(Ex_2, En_2, He_2)$, and the numbers of cloud drops $n_1$ and $n_2$;
Output: the distance $d(C_1, C_2)$ between the two cloud models.

(1) Generate $n_1$ and $n_2$ cloud drops from the two cloud models, respectively, with the cloud generator.
(2) Sort the cloud drops of each cloud by abscissa from largest to smallest.
(3) Filter the cloud drops of each cloud and keep those in the range $[Ex - 3En, Ex + 3En]$.
(4) Assuming $n_1 \geq n_2$, randomly select $n_2$ cloud drops from the $n_1$ cloud drops of cloud 1, sort them in sequence, and store the drops of the two clouds in the sets Drop1 and Drop2, respectively. If $n_1 < n_2$, proceed in the same way with the roles of the two clouds exchanged.
(5) Calculate the distances between the cloud drops of Drop1 and Drop2 paired in the corresponding order; together they give the distance $d(C_1, C_2)$ between the two clouds (formula (7)).

In step (3), since the cloud drops follow a normal distribution, most of them fall within the interval $[Ex - 3En, Ex + 3En]$, so the cloud drops outside this interval can be ignored. In cloud similarity measurement, it is difficult to judge similarity by setting a threshold. In this article, the distance is therefore used directly as the selection criterion: the two clouds with the smaller distance are selected as the base clouds to be synthesized.
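The five steps of the distance computation can be sketched in Python as below. The aggregation in the last step is an assumption: the sketch uses the root mean square of the paired differences, which may differ from the exact formula used in the paper. The names `sample_drops` and `cloud_distance` are hypothetical, clouds are passed as plain `(Ex, En, He)` tuples, and the subsampling is applied to the counts that remain after filtering.

```python
import math
import random
from typing import List, Tuple

Cloud = Tuple[float, float, float]  # (Ex, En, He)


def sample_drops(cloud: Cloud, n: int, rng: random.Random) -> List[float]:
    """Forward cloud generator returning drop values only."""
    ex, en, he = cloud
    return [rng.gauss(ex, abs(rng.gauss(en, he))) for _ in range(n)]


def cloud_distance(c1: Cloud, c2: Cloud, n1: int = 1000, n2: int = 1000,
                   seed: int = 0) -> float:
    rng = random.Random(seed)
    kept = []
    for cloud, n in ((c1, n1), (c2, n2)):
        ex, en, _ = cloud
        drops = sample_drops(cloud, n, rng)                      # step (1)
        drops.sort(reverse=True)                                 # step (2)
        kept.append([x for x in drops if abs(x - ex) <= 3 * en])  # step (3)
    m = min(len(kept[0]), len(kept[1]))
    if m == 0:
        raise ValueError("no cloud drops left after filtering")
    # Step (4): subsample the larger set to m drops and sort both sets.
    drop1 = sorted(rng.sample(kept[0], m), reverse=True)
    drop2 = sorted(rng.sample(kept[1], m), reverse=True)
    # Step (5), assumed form: root mean square of the paired differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(drop1, drop2)) / m)


base = (0.0, 1.0, 0.1)
near = cloud_distance(base, (0.1, 1.0, 0.1))
far = cloud_distance(base, (5.0, 1.0, 0.1))
```

With this measure, a cloud that is only slightly shifted from the base cloud yields a much smaller distance than a distant one, which is the ordering the selection of base clouds relies on.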

2.2.3. Mutual Information Method

Mutual information (MI) describes the relationship between two random variables. It can be regarded as the amount of information that one random variable contains about another. The mutual information between two variables X and Y can be described as
$$I(X; Y) = H(X) + H(Y) - H(X, Y). \tag{8}$$

In the formula, $H(X)$ and $H(Y)$ are the entropies of variables X and Y, respectively, and $H(X, Y)$ is the joint entropy of X and Y. They can be expressed as
$$H(X) = -\int p(x) \log p(x)\, dx, \quad H(Y) = -\int p(y) \log p(y)\, dy, \quad H(X, Y) = -\iint p(x, y) \log p(x, y)\, dx\, dy, \tag{9}$$
where $p(x)$ and $p(y)$ are the probability density functions of X and Y, and $p(x, y)$ is the joint probability density function.
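For sampled signals, the entropies are usually estimated from data. The sketch below uses a simple equal-width histogram (plug-in) estimate of the entropies and combines them as the sum of the marginal entropies minus the joint entropy. The estimator, the function name `mutual_information`, and the default of 16 bins are assumptions of this illustration; the paper does not state which estimator it uses.

```python
import math
import random
from collections import Counter
from typing import List, Sequence


def mutual_information(x: Sequence[float], y: Sequence[float],
                       bins: int = 16) -> float:
    """Histogram estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in nats."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)

    def bin_index(values: Sequence[float]) -> List[int]:
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins
        if width == 0.0:
            return [0] * len(values)  # constant signal: a single bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    def entropy(labels) -> float:
        return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

    bx, by = bin_index(x), bin_index(y)
    return entropy(bx) + entropy(by) - entropy(list(zip(bx, by)))


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mi_self = mutual_information(x, x)
mi_noise = mutual_information(x, noise)
```

A signal shares far more information with itself than with independent noise, so `mi_self` is much larger than `mi_noise`; the same contrast separates informative IMFs from noise-dominated ones.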

2.3. Feature Extraction Method Based on the EEMD Cloud Model

The cloud model is used here to characterize the features of composite fault signals. The feasibility of using the cloud digital feature entropy to characterize fault signals has been demonstrated by related experiments [18], and cloud digital features have also been applied in fault diagnosis [22–24]. Therefore, it is theoretically feasible to use the digital features of the cloud model as a feature representation of the fault signal.

The cloud model can be used as a feature extraction method to obtain cloud digital features, but for gearbox multifault vibration signals, the cloud digital features obtained with a single cloud model have a high dimensionality, and some features are difficult to distinguish effectively. Therefore, this paper uses the synthetic cloud model as the feature extraction method for the gearbox multifault vibration signal. According to the previous analysis, the feature extraction method based on EEMD and the cloud model consists of the following steps:

(1) Decompose the collected vibration signal by EEMD to obtain the IMF components.
(2) Calculate the mutual information between every IMF component and the original signal, and select the sensitive IMFs based on a mutual information threshold. The threshold is determined according to reference [25] from the mutual information between each IMF and the original signal, the number of IMFs n, and the maximum value of the mutual information.
(3) Keep the IMF components whose mutual information with the original signal is greater than the threshold, and delete the IMF components whose mutual information with the original signal is less than the threshold.
(4) Perform cloud model feature extraction and transformation on the retained IMF components, synthesize the clouds into new clouds, and calculate the cloud digital features of the new clouds as the new sample feature set.
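Step (3) amounts to a simple threshold filter. The sketch below shows it in Python; the function name `select_sensitive_imfs` is hypothetical, and the MI values in the example are made up for illustration only and are not taken from the experiment. The threshold itself is passed in as a number, since its computation follows reference [25] and is not reproduced here.

```python
from typing import List, Sequence


def select_sensitive_imfs(mi_values: Sequence[float],
                          threshold: float) -> List[int]:
    """Return the 1-based indices of IMFs whose MI exceeds the threshold."""
    return [i for i, mi in enumerate(mi_values, start=1) if mi > threshold]


# Hypothetical MI values for IMF1..IMF5 (illustrative, not measured data).
example_mi = [0.62, 0.31, 0.12, 0.08, 0.05]
selected = select_sensitive_imfs(example_mi, threshold=0.2)
```

Only the selected components are passed on to the cloud feature extraction in step (4), which keeps noise-dominated IMFs out of the feature set.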

The algorithm flow diagram of the method for extracting the fault feature of the gearbox compound fault based on the EEMD and cloud model is shown in Figure 2.

3. Experimental Verification and Result Analysis

To verify the effectiveness of the feature extraction method proposed in this paper, it is applied to the diagnosis of multiple faults in the bearings of a certain type of equipment. The experimental data [26] were collected from the rubber expansion dryer and extrusion dehydrator simulation platform of the Guangdong Petrochemical Equipment Fault Diagnosis Key Laboratory. By replacing various faulty gears, bearings, transmission shafts, and other components, the platform simulates common single and compound failures of a cantilever centrifugal compressor or expander unit.

For the common bearing and gear faults of complex equipment, and taking the typical industrial unit structure and load into account, a set of fault accessories matching the simulation platform was designed. It covers bearing external cracks, bearing internal cracks, bearing ball wear, missing bearing balls, cracked teeth, and gear wear. Some of the faulty parts used in the experiment are shown in Figures 3–5. The number of samples for each fault type is set to 40.

Based on the above fault accessories, the experiment uses the NSK NN3021 bearing model for multiple fault simulation and designs five classes: type 1, normal; type 2, missing teeth on the large and small gearbox gears plus a missing ball in the left bearing inner ring; type 3, missing teeth on the large and small gearbox gears plus outer ring wear on the right bearing; type 4, missing teeth on the large and small gearbox gears plus left bearing inner ring wear; and type 5, missing teeth on the large and small gearbox gears plus left bearing outer ring wear. The original signals of the five classes are shown in Figure 6.

For EEMD, the ensemble number is set to M = 100, and the amplitude of the added noise is 0.01 times the standard deviation of the original signal. Decomposing the above signals by EEMD yields 9 IMF components. Usually, the most important information of the original signal is concentrated in the first few IMF components, as shown in Figure 7. The MI values between IMF1∼IMF9 and the original signal are calculated by the MI method for the five states. The abscissas in the figure represent the IMF components. The threshold is calculated by formula (10), and the thresholds for the five states are 0.1861, 0.1550, 0.1565, 0.1421, and 0.1359, respectively. It can be seen from Figure 8 that IMF1 and IMF2 are both above the corresponding threshold in every state and that IMF3 is above the threshold in type 3, so IMF1, IMF2, and IMF3 are selected as sensitive IMF components after EEMD decomposition. To facilitate the subsequent experiments, IMF4 is also selected, so that IMF1∼IMF4 form the set of sensitive IMF components.

IMF1∼IMF4 denote the four selected sensitive IMF components, and their cloud digital features are calculated by formulas (3)–(5). The average cloud digital features for each signal category and IMF component are shown in Table 1.

For the convenience of calculation, the clouds of the IMF1∼IMF4 components are defined as the base clouds. In this paper, the synthetic cloud is used to extract cloud digital features, and the number of cloud drops is set to 1000. The distances between the clouds obtained from the IMF components are calculated by formula (7), and the two components with the smaller distance are selected as the base clouds to be synthesized. The resulting distances for the selected pairs are 0.2642 and 0.1642. Therefore, IMF1 and IMF2, and IMF3 and IMF4, are selected as the pairs of base clouds to be synthesized.

In the synthetic cloud algorithm, the value of a is set to 1, and the coefficient b, which takes a different value for each pair of base clouds, is calculated from the average expected values of the two base clouds to be synthesized. IMF1 and IMF2, and IMF3 and IMF4, are then synthesized pairwise, and the digital features of the resulting synthetic clouds are shown in Table 2.

Many classifier algorithms are available for the final pattern recognition step, such as those in the literature [9, 27, 28], whose proposed methods boost the classification performance across the classes of the data. Since the fault sample set is small, and considering time efficiency, this paper directly uses the support vector machine (SVM) as the classifier for experimental verification. For the calculated synthetic cloud digital features, the 200 samples were split at a ratio of 6 : 4, with 120 samples used for training and 80 samples used for testing. In the SVM algorithm, the penalty factor is C = 150 and σ = 1. The experimental results are shown in Table 3, and the test classification results of the two methods are shown in Figures 7 and 9. The results show that, compared with the single cloud model, the EEMD synthetic cloud feature extraction method reduces the feature dimension and also improves the degree of discrimination.

From Figure 7 and Table 3, it can be seen that when the cloud digital features extracted by the single cloud model are used as fault features, the classification accuracy reaches 88.25%, which shows that the cloud model is a reliable and effective method for extracting composite fault features. Figure 9 and Table 3 show that the synthetic cloud feature extraction method proposed in this paper achieves a classification accuracy of 91.25%. At the same time, Figure 7 shows that fault categories 1 and 2 are prone to misdiagnosis with the single cloud model, and Figure 9 shows that fault categories 2 and 4 are sometimes misidentified with the synthetic cloud model. In the synthetic cloud algorithm, the choice of parameters also directly affects the separability of the features, so it must be made case by case. Overall, in terms of feature dimension and classification accuracy, the synthetic cloud model method has certain advantages over the single cloud model fault feature extraction method.

4. Conclusions

This paper proposes a feature extraction method for gearbox composite fault signals based on EEMD and the synthetic cloud model. The EEMD algorithm is used for signal decomposition, and the mutual information method is then used to select the sensitive IMFs that carry the feature information. Next, the concept of the synthetic cloud is introduced, and the cloud-based distance measurement principle is used to select the clouds to be synthesized and to synthesize the new clouds, which reduces the number of features while extracting the relevant ones. Finally, the method is verified on an actual composite fault data set and compared with the feature set extracted by the single cloud model. The time complexity of the proposed method depends mainly on the choice of parameters. The main parameters to determine are the number of decompositions k in the EEMD algorithm and the similarity distance of the cloud model. Computing the similarity distance requires generating cloud drops, and the number of generated cloud drops determines the run time of the entire algorithm. In practical applications, the cost of the proposed method is therefore mainly determined by the number of cloud drops, which depends on the scale of the data. The experimental results show that this method is effective and superior to the single cloud model fault feature extraction method, which gives it practical significance for engineering applications.

Data Availability

The fault-related database used for this research can be obtained from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to thank Darong Huang for his comments and suggestions. This work was supported in part by the National Natural Science Foundation of P.R. China under Grants (62073051, 61304104); the Science and Technology Research Project of the Chongqing Municipal Education Commission of P.R. China under Grants (KJZD-K 201900704); Chongqing Technology Innovation and Application Special Key Project under Grants cstc2019jscx-mbdxX0015; and the Innovation Foundation of Chongqing Postgraduate Education under Grants CYS20282.