Abstract

Each pattern recognition method has its own advantages and disadvantages for diagnosing the state of rotating machinery. Rolling bearings have many fault types, and these faults exhibit apparent uncertainty. Selecting the optimal fusion level for a specific fault diagnosis task is usually challenging and requires extensive human labour and prior knowledge. To solve these problems, a multimodel decision fusion method based on a Deep Convolutional Neural Network and Improved Dempster-Shafer Evidence Theory (DCNN-IDST) is proposed for the inspection of rolling bearings. To overcome the weakness of the original evidence theory when fusing high-conflict evidence, a fuzzy consistency matrix is introduced, and calculating the factor weights improves the reliability and rationality of D-S evidence theory. The DCNN model learns features from the original data and performs adaptive feature extraction on information from multiple sensors. The features extracted adaptively by DCNN are input into multiple network models for decision fusion. The new DCNN-IDST multimodel decision fusion method is applied to detect damage in rolling bearings. To evaluate its effectiveness, both a BP neural network and an RBF neural network are used to set up multigroup comparison tests. The results demonstrate that the proposed method detects rolling bearing faults effectively and achieves the highest diagnosis accuracy among all the methods tested in the experiment.

1. Introduction

Mechanical fault diagnosis technology is a newly developing discipline that monitors, diagnoses, and predicts the condition and malfunctions of continuously operating mechanical equipment to guarantee its safe operation. It can effectively improve the safety and stability of rotating machinery [1]. The rolling bearing plays an essential role in the operation of mechanical equipment, and its status directly affects the running efficiency and service life of the mechanical system. According to statistics, mechanical failures caused by rolling bearings account for about 30% of all faults in CNC machining. Timely monitoring of status information and fault diagnosis of rolling bearings is therefore vital: it reduces maintenance costs and ensures the regular operation of equipment.

Rolling bearings have many fault types with apparent uncertainty, so it is impossible to predict and diagnose the bearing state comprehensively from the data of a single sensor. Data from different sources are therefore combined to diagnose complex systems effectively [2, 3]. Dempster–Shafer evidence theory (DST) can express “uncertain” and “unknown” without knowing the prior probability and has been widely used in information fusion [4, 5]. Niu et al. [6] collected the current and vibration signals of electrical machinery and, by integrating voting decisions with the Bayesian method, proposed a multiagent decision-level fusion approach that combines the fault diagnosis results of multisensor signals. Li et al. used D-S evidence theory to fuse decision-level information and thereby diagnose faults of a diesel engine [7]. Zhang et al. adopted the SVDD approach to improve D-S evidence theory and set up a two-stage fusion model, which was verified experimentally on diesel engine fault diagnosis and achieved hierarchical, multilevel fault diagnosis [8]. Cao et al. improved D-S evidence theory by establishing the relations among the pieces of evidence, which increased the recognition rate when highly conflicting evidence appears in the fault diagnosis of large-scale equipment [9]. Liu et al. combined a quantum medical neural network with D-S evidence theory to achieve high-precision diagnosis of the mechanical characteristics of circuit breakers [10]. Jiang and Lin et al. improved D-S evidence theory by changing its combination rules, which also proved effective in specific applications [11, 12].

In the last few years, deep learning has played an essential role in the field of artificial intelligence. The deep convolutional neural network (DCNN) has a strong ability of data mining and information integration [13–19] and has been widely applied to the state monitoring and diagnosis of rotating machinery. Ga et al. classified the faults of rolling bearings by analyzing vibration signals with the wavelet transform, extracting features, and feeding them into a deep learning model [17]. Jing et al. used a deep convolution model to adaptively extract the characteristics of the original fault signal and identified motor faults [19]. Zhao et al. combined the deep convolution model with the LSTM model to monitor and estimate tool wear and achieved good results [20].

As an emerging machine learning method, deep learning has strong feature extraction and function mapping capabilities. It can meet the analysis requirements of diverse, nonlinear, and high-dimensional health monitoring data and can be applied to the life prediction of rotating machinery. Babu et al. [21] first used the convolutional neural network (CNN) to predict the residual life of bearings and verified the effectiveness of the method. Li [22] combined the convolutional neural network with the long short-term memory (LSTM) neural network, using the convolution and weight sharing of CNN to extract deep features and input them into the LSTM network, which effectively predicted the remaining service life of the rolling bearing. Zhao et al. [20] combined the deep convolution model with the LSTM model to monitor and estimate tool wear, which achieved good fault diagnosis results. Shi et al. [23] proposed a spatiotemporal sequence prediction method based on ConvLSTM, which combines the complementary advantages of CNN and LSTM. Luo et al. [24] proposed a ConvLSTM-AE framework that better encodes the change of appearance and motion for regular events. Qiao et al. used the TD-ConvLSTM time series model to analyze the spatiotemporal characteristics of multisensor data on different time scales [25].

Rolling bearings have many fault types with apparent uncertainty, and it is often challenging to select the optimal feature or fusion level in state monitoring. Therefore, combining the advantages of multiple intelligent identification methods, this paper proposes a multimodel decision fusion method based on DCNN-IDST for the fault diagnosis of rolling bearings. With this method, the fault information of rotating machinery can be diagnosed comprehensively.

2. Theoretical Background

2.1. Dempster–Shafer Evidence Theory

Dempster–Shafer theory has the advantage of dealing with uncertainty. Its basic formulas are as follows.

(1) Basic definitions. Suppose $U$ is the frame of discernment and $A$ is a proposition with $A \subseteq U$. There is a function $m: 2^{U} \to [0, 1]$ that satisfies the following conditions:

$$m(\varnothing) = 0, \qquad \sum_{A \subseteq U} m(A) = 1,$$

where $m(A)$ is the Basic Probability Assignment, which represents the exact trust in $A$.

The belief function is defined as

$$\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B).$$

The plausibility function is defined as

$$\mathrm{Pl}(A) = \sum_{B \cap A \neq \varnothing} m(B) = 1 - \mathrm{Bel}(\bar{A}),$$

where $\mathrm{Pl}(A)$ and $\mathrm{Bel}(A)$ are the upper and lower limit functions of $A$. The relationship between them is $\mathrm{Bel}(A) \le \mathrm{Pl}(A)$.

(2) The combination rule of Dempster–Shafer evidence theory [26–29]. Suppose that $m_1$ and $m_2$ are basic probability assignments on the frame $U$, with focal elements $A_1, \ldots, A_k$ and $B_1, \ldots, B_l$, respectively. The combination rule is defined as

$$m(A) = \frac{1}{1-K} \sum_{A_i \cap B_j = A} m_1(A_i)\, m_2(B_j) \quad (A \neq \varnothing), \qquad m(\varnothing) = 0,$$

$$K = \sum_{A_i \cap B_j = \varnothing} m_1(A_i)\, m_2(B_j) < 1.$$

(3) The improved method of Dempster–Shafer evidence theory (IDST). In this paper, the improved DST method proposed in [30] is adopted to diagnose the state of rotating machinery. The evidence is given as basic probability assignments on the frame $U$, and the specific steps are as follows.

The IDST method of [30] proceeds as follows.

(1) Redistribute the original basic probability assignment and establish a new fuzzy consistent matrix. The fuzzy matrix is first summed by rows, and the consistent fuzzy matrix is then obtained from the row sums by transformation.

(2) Calculate the factor weight coefficients. For the given factors and target, the weight of each factor is computed, with the parameter values chosen to improve the resolution of the sorting result.

(3) Recalculate the basic probability assignment. The adjusted value of the basic probability assignment is defined for k = 1, 2, 3, …. Since the adjusted values no longer sum to 1, a supplementary definition is added so that they again constitute a basic probability assignment function.

(4) Compute the average evidence.

(5) Apply the improved evidence theory formula, which uses the moderate support degree obtained after weighting the evidence.
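The weakness of the classical combination rule on high-conflict evidence, which motivates the improved method, can be illustrated with a minimal sketch. It implements only the classical Dempster rule, not the IDST steps of [30]. The function name `dempster_combine` and the two mass functions (a standard textbook-style conflict example) are illustrative assumptions, not data from this paper.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two basic probability assignments with Dempster's rule.

    Each BPA is a dict mapping a frozenset of hypotheses (a focal element)
    to its mass; the masses of one BPA are assumed to sum to 1.
    Returns the fused BPA and the conflict coefficient K.
    """
    combined = {}
    conflict = 0.0  # K: total mass falling on empty intersections
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalize by 1 - K so that the fused masses sum to 1.
    fused = {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}
    return fused, conflict


# Hypothetical high-conflict evidence (not taken from the paper).
A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m1 = {A: 0.99, B: 0.01}
m2 = {B: 0.01, C: 0.99}

fused, K = dempster_combine(m1, m2)
print(K)      # conflict coefficient, close to 1
print(fused)  # almost all mass ends up on B, which both sources barely support
```

In this example both sources give B a mass of only 0.01, yet the fused result assigns it nearly all the mass. This counterintuitive behaviour under high conflict is the kind of defect that weighting the evidence before fusion is meant to reduce.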

2.2. Basic Theory of Multiple Network Models

The deep convolutional neural network (DCNN) is inspired by the structure of the visual nervous system studied by Hubel and Wiesel [31]. It has a strong ability of data mining and information fusion and can effectively realize local connection, weight sharing, spatial pooling, and other functions [16, 18, 32–37]. It contains many convolutional layers, pooling layers, and fully connected layers. The classical structure of a convolutional neural network (CNN) is shown in Figure 1. The convolutional layer is the core component of the deep convolutional network and is composed of multiple sets of two-dimensional filters. When data enter the convolutional layer, they are convolved with the weights of the two-dimensional filters, and the result is the output of the convolutional layer. In the pooling layer, the input data are processed by a down-sampling algorithm. Commonly used down-sampling algorithms include maximum pooling, average pooling, and nonuniform pooling, among which maximum pooling is the most widely used. The fully connected layer is the last layer: by the time the data reach it, the preceding convolutional and pooling layers have extracted and converted them into high-level information features. The fully connected layer classifies these high-level features to obtain the final recognition results. Soft-max is the generalization of the logistic classifier to multiclass problems.
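The three operations described above (convolution, maximum pooling, and soft-max) can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses a one-dimensional signal and a single hand-picked filter for brevity, whereas the text describes two-dimensional filters, and the ReLU activation, the toy signal, and the kernel values are hypothetical choices rather than the configuration used in this paper.

```python
import math


def conv1d(signal, kernel):
    """'Valid' sliding dot product, the operation performed by a convolutional layer."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel))) for i in range(n)]


def relu(values):
    """Rectified linear activation (an assumed choice of nonlinearity)."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping maximum pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def softmax(scores):
    """Soft-max: turns scores into class probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]  # toy input
kernel = [1.0, 0.0, -1.0]                             # toy filter weights

feature_map = relu(conv1d(signal, kernel))
pooled = max_pool1d(feature_map, 2)
probs = softmax(pooled)
print(feature_map, pooled, probs)
```

In a trained network the filter weights are learned rather than fixed, and a fully connected layer would sit between the pooled features and the soft-max.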

The backpropagation neural network (BPNN) can realize a nonlinear mapping between input and output. It has one input layer, one output layer, and one or more hidden layers, each containing multiple neurons. The transfer function of the output layer is linear. The network is trained with the BP algorithm, with the minimum mean square error as the training objective: the error is propagated backward through the network, and the weights are adjusted by gradient descent so that the overall error of the network is minimized.
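The training principle described above can be sketched as follows. This is a minimal illustration, assuming a single sigmoid hidden layer, a linear output layer, a squared-error loss, and per-sample gradient descent. The class name `TinyBPNN`, the layer sizes, and the four toy samples are hypothetical and far smaller than the network used in this paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBPNN:
    """One sigmoid hidden layer, linear output layer, squared-error loss."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w2, self.b2)]
        return hidden, output

    def train_step(self, x, target, lr):
        hidden, output = self.forward(x)
        # Output error: derivative of 0.5 * squared error for a linear output.
        delta_out = [o - t for o, t in zip(output, target)]
        # Backpropagate through w2 and the sigmoid derivative h * (1 - h).
        delta_hid = [h * (1.0 - h) * sum(self.w2[k][j] * delta_out[k]
                                         for k in range(len(delta_out)))
                     for j, h in enumerate(hidden)]
        # Gradient descent updates for both layers.
        for k, d in enumerate(delta_out):
            for j, h in enumerate(hidden):
                self.w2[k][j] -= lr * d * h
            self.b2[k] -= lr * d
        for j, d in enumerate(delta_hid):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d * xi
            self.b1[j] -= lr * d


def mean_squared_error(net, samples):
    total = 0.0
    for x, target in samples:
        _, output = net.forward(x)
        total += sum((o - t) ** 2 for o, t in zip(output, target))
    return total / len(samples)


# Toy two-class data with one-hot targets (hypothetical, for illustration only).
samples = [([0.0, 0.1], [1.0, 0.0]), ([0.1, 0.0], [1.0, 0.0]),
           ([1.0, 0.9], [0.0, 1.0]), ([0.9, 1.0], [0.0, 1.0])]

net = TinyBPNN(n_in=2, n_hidden=3, n_out=2, seed=0)
initial_error = mean_squared_error(net, samples)
for _ in range(2000):
    for x, target in samples:
        net.train_step(x, target, lr=0.05)
final_error = mean_squared_error(net, samples)
print(initial_error, final_error)
```

The error after training should be lower than the initial error, which is the behaviour the backpropagation update is designed to produce.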

The Radial Basis Function neural network (RBFNN) has a simple structure, self-learning and adaptive ability, fast convergence, and good function approximation ability, and it is therefore widely applied [38–40]. Its network structure is mainly composed of the input layer, the hidden layer (which uses a radial basis function as the activation function), and the output layer.
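The three-layer structure can be illustrated with a forward pass. This sketch assumes a Gaussian radial basis function with a shared width and a linear output layer, which is a common choice but is not specified in the text. The function name `rbf_forward` and the centers, width, and weights are hypothetical values.

```python
import math


def rbf_forward(x, centers, sigma, weights, biases):
    """Forward pass of an RBF network with Gaussian hidden units.

    Each hidden activation depends only on the distance between the
    input x and that unit's center; the output layer is linear.
    """
    hidden = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, center)) / (2.0 * sigma ** 2))
        for center in centers
    ]
    outputs = [sum(w * h for w, h in zip(row, hidden)) + b
               for row, b in zip(weights, biases)]
    return hidden, outputs


centers = [[0.0, 0.0], [1.0, 1.0]]   # hypothetical hidden-unit centers
weights = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical output-layer weights
biases = [0.0, 0.0]

hidden, outputs = rbf_forward([0.0, 0.0], centers, sigma=1.0,
                              weights=weights, biases=biases)
print(hidden, outputs)
```

An input that coincides with a center activates that hidden unit fully, and the activation decays as the input moves away from the center.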

2.3. A Multimodel Decision Fusion Method Based on DCNN-IDST
2.3.1. The Proposed Method

In this paper, the fault information of rolling bearings is collected by multiple sensors. Using the adaptive feature extraction capability of the deep convolutional neural network, the extracted feature values are input into several network models (BPNN, RBFNN, and DCNN). The output results are normalized and used as the basic probability assignments of evidence theory. The original D-S combination formula is improved by the fuzzy consistent matrix, and the final result is obtained by decision-level fusion.

The flowchart of the DCNN-IDST method is shown in Figure 2. (1) Vibration signals of the rolling bearing at three different speeds are collected by two vibration sensors. (2) The data are preprocessed. (3) The vibration signals collected at the three speeds are fused at the data level and then input into the DCNN model. (4) The feature values output adaptively by DCNN are input into the three network models for training. (5) The outputs of the three network models are normalized and then input into IDST for decision fusion.

2.3.2. Set Up Comparative Test Methods

The flowcharts of comparative test methods are shown in Figure 3.

This study verifies the feasibility of the DCNN-IDST-based multimodel decision fusion method for the state monitoring of rotating machinery by setting up multiple groups of comparative tests. First, the PCA method is used to compare the classification effect of manually selected features with that of features extracted adaptively by DCNN. Second, the manually extracted features and the features extracted adaptively by DCNN are input into BPNN and RBFNN, respectively, to verify the recognition effect of the models. Finally, the diagnosis effect of the new IDST-based method is compared with that of the D-S synthesis formula.

3. Experiment and Discussion

3.1. Experimental Setup

Four states of the rolling bearing (normal, inner ring fault, outer ring fault, and rolling body fault) are selected for test verification. The rolling bearing fault diagnosis test setup is shown in Figure 4, and Figure 5 shows the types of bearing fault. The sampling frequency is 20 kHz. The motor speed is set to 600 rpm, 900 rpm, and 1200 rpm. Every 2048 data points form one fault data sample, and there are 300 samples for each health condition at each operating speed. The rolling bearing data used in the test are shown in Table 1.
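The sampling parameters above determine how much of the shaft rotation one sample covers. The following sketch segments a record into 2048-point samples and computes that coverage. The function name `segment` and the synthetic sine record are assumptions for illustration; the sketch also assumes non-overlapping segments, which the text does not state explicitly.

```python
import math

SAMPLING_RATE_HZ = 20_000   # from the experimental setup
SAMPLE_LENGTH = 2048        # data points per sample
SAMPLES_PER_CONDITION = 300


def segment(signal, length=SAMPLE_LENGTH):
    """Split a record into consecutive, non-overlapping samples (assumed scheme)."""
    count = len(signal) // length
    return [signal[i * length:(i + 1) * length] for i in range(count)]


# Synthetic stand-in for one measured record (not real bearing data).
record = [math.sin(2.0 * math.pi * 10.0 * t / SAMPLING_RATE_HZ)
          for t in range(SAMPLES_PER_CONDITION * SAMPLE_LENGTH)]
samples = segment(record)

# Duration of one sample and the number of shaft revolutions it spans.
duration_s = SAMPLE_LENGTH / SAMPLING_RATE_HZ
revolutions = {rpm: rpm / 60.0 * duration_s for rpm in (600, 900, 1200)}
print(len(samples), duration_s, revolutions)
```

Each sample lasts about 0.1 s, so it spans roughly one shaft revolution at 600 rpm and roughly two at 1200 rpm.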

3.2. Data Processing

The original vibration signals of the rolling bearing at three rotational speeds (600 rpm, 900 rpm, and 1200 rpm) are preprocessed to improve the signal-to-noise ratio. The preprocessed data are fused at the data level. Fifteen time-domain characteristics of the vibration signals at the three speeds are extracted manually. Ten are dimensional time-domain indexes, including mean value, root mean square value, root amplitude value, absolute mean value, variance, maximum value, and minimum value. Five are dimensionless time-domain indexes: waveform index, peak index, pulse index, margin index, and kurtosis index. There are also three frequency-domain eigenvalues: mean square frequency, barycenter frequency, and frequency variance. These features are input into BPNN and RBFNN for pattern recognition. Due to space limitations, only some of the time-domain indexes are listed in Table 2.
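Several of the time-domain indexes named above can be computed as in the following sketch. The formulas are the commonly used textbook definitions and are an assumption here, since the text does not give them and definitions vary slightly between authors (for example, whether kurtosis uses the central moment). The function name `time_domain_features` and the sine test signal are illustrative.

```python
import math


def time_domain_features(x):
    """Common time-domain indexes of a vibration sample (assumed definitions)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    root_amplitude = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    variance = sum((v - mean) ** 2 for v in x) / n
    peak = max(abs(v) for v in x)
    fourth_moment = sum((v - mean) ** 4 for v in x) / n
    return {
        # Dimensional indexes
        "mean": mean,
        "rms": rms,
        "root_amplitude": root_amplitude,
        "abs_mean": abs_mean,
        "variance": variance,
        "maximum": max(x),
        "minimum": min(x),
        # Dimensionless indexes
        "waveform_index": rms / abs_mean,
        "peak_index": peak / rms,
        "pulse_index": peak / abs_mean,
        "margin_index": peak / root_amplitude,
        "kurtosis_index": fourth_moment / variance ** 2,
    }


# One full period of a unit sine wave as a check signal (not bearing data).
sine = [math.sin(2.0 * math.pi * k / 1000) for k in range(1000)]
features = time_domain_features(sine)
print(features)
```

For a pure sine wave the dimensionless indexes take known values (peak index of about 1.414, kurtosis index of 1.5), which makes it a convenient sanity check before applying the function to measured samples. Impulsive bearing faults typically raise the kurtosis and peak indexes above these baseline values.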

At the same time, the original time-domain signal of the rolling bearing is input directly into the DCNN model. Twenty feature values are extracted adaptively and input into BPNN and RBFNN for pattern recognition.

3.3. Model Design
3.3.1. Backpropagation Neural Network

The number of nodes in the input layer of the Backpropagation Neural Network (BPNN) is set to fifteen (fifteen characteristic quantities). The number of nodes in the output layer is set to four (four fault types). The number of nodes in the hidden layer was set to fifteen after several tests. The learning rate is 0.05, and the expected error is 0.002. The desired outputs are as follows: normal [1,0,0,0], inner ring pitting [0,1,0,0], outer ring pitting [0,0,1,0], and rolling element pitting [0,0,0,1]. The output of the BP network is normalized and used directly as the basic probability assignment function of DST. Figure 6 shows the output of the BP-adaptive network model.
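The normalization step can be sketched as follows. The text does not specify the normalization formula, so this sketch assumes the simplest option: negative outputs (possible with a linear output layer) are clipped to zero and the result is divided by its sum. The function name `to_bpa` and the raw output vector are hypothetical.

```python
LABELS = ["normal", "inner ring pitting", "outer ring pitting", "rolling element pitting"]


def to_bpa(outputs):
    """Turn raw network outputs into masses that sum to 1 (assumed scheme)."""
    clipped = [max(0.0, o) for o in outputs]  # a mass cannot be negative
    total = sum(clipped)
    if total == 0.0:
        # No class is supported: fall back to equal masses.
        return [1.0 / len(outputs)] * len(outputs)
    return [c / total for c in clipped]


# Hypothetical raw output of a network trained on the one-hot targets above.
raw_output = [0.08, 0.91, -0.03, 0.12]
bpa = to_bpa(raw_output)
diagnosis = LABELS[bpa.index(max(bpa))]
print(bpa, diagnosis)
```

This sketch distributes all mass over the four fault classes. A scheme that reserves part of the mass for uncertainty would need an additional rule that is not described in this section.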

3.3.2. Radial Basis Function Neural Network

For the RBF network model, the same training and test sets as for the BP neural network are used to train and test the system. The output is normalized and used as the basic probability assignment function of evidence theory.

3.3.3. Deep Convolutional Neural Network

In this paper, the parameters of the DCNN model are chosen by traversing candidate parameter values. The length of the input data, the number of convolution kernels, the size of the convolution kernels, and other parameters need to be adjusted. The parameters of the selected DCNN model are shown in Table 3, and the main structure of DCNN is shown in Table 4.

4. Experimental Results

4.1. Principal Component Analysis (PCA)

To verify the adaptive feature extraction capability of DCNN, principal component analysis (PCA) was used to compare the manually selected features with the features extracted adaptively by DCNN. The output of the PCA method is shown in Figure 7.

As can be seen from Figure 7, the classification effect of features extracted by DCNN is better than the result of manual selection. It can be seen that DCNN has the advantage of adaptive feature extraction, which can lay a foundation for subsequent equipment fault diagnosis.
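The projection used for this kind of comparison can be sketched as follows. This is a minimal illustration that finds only the first principal component by power iteration on the covariance matrix, whereas a plot such as Figure 7 would normally use two or three components. The function name `first_principal_component` and the toy feature matrix are assumptions, not the features of this paper.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return the leading principal direction and the projected scores."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy feature matrix: the second feature is exactly twice the first.
toy_features = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, scores = first_principal_component(toy_features)
print(direction, scores)
```

Plotting the scores of each health condition in different colours shows how well the classes separate. Well-separated clusters indicate more discriminative features.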

4.2. Experimental Results
4.2.1. The Comparison Test Results of Several Network Models

The fifteen manually extracted time-domain and frequency-domain features and the twenty features extracted adaptively by DCNN were input into BPNN, RBFNN, and DCNN, respectively, for pattern recognition. Of these data, 80% are used for training and 20% for testing. The comparison test results of the network models are shown in Table 5, and Figure 8 shows their testing accuracy.
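The 80/20 division can be sketched as below. The text does not say whether the split was random or stratified, so this sketch assumes a seeded random shuffle. The function name `train_test_split` and the use of integer sample IDs are illustrative assumptions.

```python
import random


def train_test_split(samples, train_fraction=0.8, seed=0):
    """Randomly split samples into a training set and a test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test


# The 300 samples of one health condition at one speed, represented by IDs.
sample_ids = list(range(300))
train_ids, test_ids = train_test_split(sample_ids)
print(len(train_ids), len(test_ids))
```

Splitting each condition separately, as in this usage example, keeps the class proportions identical in the training and test sets.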

As can be seen from Table 5 and Figure 8, when the features extracted adaptively by DCNN were input into each network model, the recognition accuracy was higher than with the manually extracted features. Adaptive feature extraction thus effectively improves the diagnostic accuracy of the models, which provides a new approach for the state monitoring of rotating machinery and the decision fusion of multiple models.

4.2.2. Decision Fusion Results

In this paper, the generalization ability of the network models is used to construct the BPA of D-S evidence theory. The output results of BPNN, RBFNN, and DCNN are normalized and used as the BPA. The output results of the three networks are shown in Tables 6–8, which list the BPA of normal, inner ring fault, outer ring fault, and rolling body fault, together with the uncertainty and the reliability (FL = feature learning).

As shown in Table 6, the output of each individual network is uncertain. After the fusion of the three network models, the outputs all indicate an inner ring fault. As credible evidence is added to the fusion process, the BPA of B increases continuously. With the method proposed in this paper, the final diagnosis accuracy reaches 94.82%, which is higher than that of any single network model.
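The effect of adding consistent evidence can be illustrated with a small numerical sketch. It fuses three sources one after another with the classical Dempster rule rather than the IDST method, so it shows the general trend only. The mass values are hypothetical and are not the entries of Tables 6–8; the names `combine` and `bpa` and the use of the whole frame `U` to hold the uncertainty mass are assumptions.

```python
from itertools import product

STATES = ("A", "B", "C", "D")
U = frozenset(STATES)  # the whole frame; mass placed here expresses uncertainty


def combine(m1, m2):
    """Classical Dempster combination of two BPAs (dict: frozenset -> mass)."""
    fused, conflict = {}, 0.0
    for (x, mass_x), (y, mass_y) in product(m1.items(), m2.items()):
        common = x & y
        if common:
            fused[common] = fused.get(common, 0.0) + mass_x * mass_y
        else:
            conflict += mass_x * mass_y
    return {focal: mass / (1.0 - conflict) for focal, mass in fused.items()}


def bpa(a, b, c, d, u):
    """Build a BPA over the four single states plus the uncertainty mass."""
    result = {frozenset(s): v for s, v in zip(STATES, (a, b, c, d))}
    result[U] = u
    return result


# Hypothetical outputs of three networks for one inner ring fault sample.
evidence = [
    bpa(0.10, 0.60, 0.10, 0.10, 0.10),  # stand-in for BPNN
    bpa(0.05, 0.70, 0.10, 0.05, 0.10),  # stand-in for RBFNN
    bpa(0.02, 0.85, 0.03, 0.05, 0.05),  # stand-in for DCNN
]

B = frozenset("B")
fused = evidence[0]
history = [fused[B]]  # mass of B after one, two, and three sources
for source in evidence[1:]:
    fused = combine(fused, source)
    history.append(fused[B])
print(history)
```

With these values the mass of B rises with each additional source while the uncertainty mass shrinks, which mirrors the trend reported for the fused results.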

To further demonstrate the advantages of the DCNN-IDST multimodel decision fusion method, two groups of comparative experiments were set up. (1) The three network models were fused with the D-S synthesis formula. (2) The three network models were fused with the IDST method. The comparison between the two fusion rules is shown in Table 9. Figure 9 shows the fusion by the D-S synthesis formula, and Figure 10 shows the fusion by the IDST method.

By comparing Tables 7–9, it can be seen that normalizing the output results of the three networks and then fusing them significantly improved the recognition accuracy. As supporting evidence increases, the reliability of the IDST method for decision fusion rises steadily, and the pattern recognition rate is also greatly improved. The test results show that DCNN-IDST multimodel decision fusion achieves better diagnosis results and yields a more complete diagnosis system.

5. Conclusions

This paper presents a multimodel decision fusion method based on DCNN-IDST for the damage detection of rolling bearings. The method combines the advantages of DCNN and D-S evidence theory. The DCNN model shows a convincing adaptive feature extraction ability, with a better effect than manual feature extraction. The D-S evidence theory method is improved by a fuzzy consistency matrix, and a multimodel decision fusion system for the fault diagnosis of rotating machinery is established on the basis of experimental research. The basic probability assignment after decision fusion is generally higher than the initial recognition result of a single network model. As supporting evidence increases, the BPA and reliability after fusion increase further. The work demonstrates that the proposed method combines the advantages of the individual networks and effectively overcomes the uncertainty of a single network, achieving the best recognition accuracy among all the tested methods.

Data Availability

The basic data for the study were obtained from the laboratory.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The research was supported by the National Key Research and Development Program of China (Grant no. 2018YFB1701302), Key Research and Development Program of Shandong Province (Grant no. 2018GGX103016), and Shandong university science and technology plan project (Grant no. J15LB10).