Abstract

Effective and efficient diagnosis methods are in high demand for improving system reliability. In contrast to conventional fault diagnosis methods, which take a forward approach (feature extraction, feature selection and fusion, and then fault diagnosis), this paper presents a new association rule mining method that takes an inverse approach, uncovering the underlying relation between labeled defects and extracted features for bearing fault analysis. Instead of the even-division discretization used in traditional association rule mining, the proposed approach is based on equal probability discretization. First, a series of features extracted from the signal data are discretized according to the equalized probability distribution of the data, which avoids both excessive concentration and excessive dispersion of the data across intervals. Next, the data matrix composed of arrays of discretized features and defect labels is mined to generate association rules representing the relation between the features and fault types. An experimental study on a bearing test shows that the proposed method can generate a series of underlying association rules for bearing fault diagnosis, and that the related features it selects can be used directly to analyze bearing signals for fault classification and defect severity identification. As a feature selection method, it outperforms the traditional PCA, KPCA, and LLE dimension reduction methods in the reported experiments.

1. Introduction

Machinery reliability is critical to the operational safety of a system. The rolling element bearing, a widely used component in large mechanical systems, plays an important role in ensuring the availability of machinery such as aircraft engines, wind turbines, and compressors [1]. Under harsh operating conditions (e.g., high speed, heavy load, and high temperature), bearings may fail suddenly and catastrophically [2]. If a bearing fault is not diagnosed early, it may cause great losses or even a serious accident. Effective algorithms for bearing defect diagnosis and prognosis are therefore in demand, and their development remains an active research field aimed at increasing system reliability [3].

Fault diagnosis is the problem of detecting the potential faults hidden in observed instances that are related to specific application domains [4]. Extensive efforts have been made on bearing fault diagnosis using a forward approach, which consists of feature extraction, feature selection and fusion, and then fault diagnosis modeling. Various signal processing techniques, including wavelet transform [5–7], empirical mode decomposition [8], order tracking [9], and spectral analysis [10, 11], have been investigated for incipient defect feature extraction and diagnosis. An integrative algorithm of sparse coding and online dictionary learning is developed in [2] to extract impulse features for machinery fault detection. The extracted features are then selected or fused to build a data-driven model for bearing fault classification based on artificial intelligence techniques, including neural networks [12, 13], support vector machines [14–16], and fuzzy c-means [17, 18]. In such a forward approach, feature extraction plays a key role in the model for bearing defect diagnosis. However, feature extraction is usually performed empirically, based on prior experience, and therefore lacks a systematic basis for bearing defect signature analysis.

To address the above issues, association rule mining provides a new approach for bearing defect signature analysis [19]. It takes an inverse approach, searching for relevance and associations in large volumes of information, and has been investigated in commerce [20, 21], traffic [22], tourism [23], biomedical applications [24, 25], power plant equipment diagnosis [26], and the analysis of telecommunication networks [27]. In association rule mining, data discretization is a critical step for handling the quantitative attributes in the relational tables of potential items. Equal density discretization and equal width discretization are the commonly used uniform partitioning approaches. However, such approaches neglect the probability distribution of the features. They may therefore unbalance the data, producing intervals that are either excessively concentrated or excessively sparse, and generate unsatisfactory association rules [28].

In line with the above challenges, this paper presents a new association rule mining method based on equal probability for bearing defect feature analysis. First, a series of features extracted from the signal data are discretized according to the equalized probability distribution of the data, which avoids both excessive concentration and excessive dispersion of the data. To evaluate the discretization performance, a new criterion, the information entropy of interval class, is formulated. Next, the data matrix composed of arrays of discretized features and defect labels is mined to generate association rules representing the relation between features and fault types. The generated rules are then used for bearing defect classification based on fuzzy proximity and for feature selection, which yields the representative features related to typical defects. The features selected by the proposed method can be used directly to analyze signals for fault classification and defect severity identification; because the original feature state is kept, the impact of irrelevant features is avoided. An experimental study using the bearing test data provided by Case Western Reserve University (CWRU) validates the effectiveness of the presented method. The results show that the new method can effectively generate a series of underlying association rules for bearing fault diagnosis and yields the best discretization performance and classification accuracy among the compared methods. The proposed method also outperforms the traditional principal component analysis (PCA) [29], kernel principal component analysis (KPCA) [30], and locally linear embedding (LLE) [31] dimension reduction methods [32, 33].

The contributions of this paper are twofold. (1) A new association rule mining method with equal probability data discretization is presented for the first time, and a new criterion, the information entropy of interval class, is formulated. (2) The presented method, as an inverse approach for bearing defect signature analysis, provides a new tool to guide feature extraction, replacing the empirical feature extraction of the current forward approach to bearing defect diagnosis. The rest of the paper is structured as follows. Section 2 introduces the theoretical framework, including the association rule mining technique and the Apriori algorithm. Section 3 discusses the proposed method in detail and formulates the information entropy criterion of interval class to assess the performance of data discretization. Section 4 demonstrates the effectiveness of the presented method using bearing test data provided by CWRU. Finally, conclusions are drawn in Section 5.

2. Theoretical Framework

Many studies in the literature adopt association rule mining to proactively discover useful knowledge from databases. The knowledge discovered in the form of association rules can be applied to information management, decision making, process control, and many other applications.

2.1. Association Rule Mining

Association rule mining is a technique for detecting and extracting meaningful association relationships hidden in databases. It was first introduced by Agrawal et al. [34] and has been investigated in different applications, including commodity sales [35], disease study [36], quality improvement of a production process [37], and alarm correlation analysis [38]. The association rule mining method is formulated as follows.

Let I = {i1, i2, …, im} be a set of literals, referred to as items. Let D = {t1, t2, …, tn} be a set of transactions. Each transaction t in D has a unique transaction identifier (TID) and contains a subset of items I′, where I′ ⊆ I. An association rule is defined as an implication of the form X ⟹ Y, where X, Y ⊂ I and X ∩ Y = ∅. The support of an itemset X, supp(X), is defined as the proportion of transactions in D that contain the itemset. Itemsets whose support reaches the minimum support are called large itemsets, and all others are called small itemsets. The confidence of a rule is defined as conf(X ⟹ Y) = supp(X ∪ Y)/supp(X). Therefore, an association rule X ⟹ Y must satisfy

supp(X ∪ Y) ≥ σ and conf(X ⟹ Y) ≥ δ,

where σ and δ are the minimum support and the minimum confidence, respectively.
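The definitions of support and confidence above can be illustrated with a minimal Python sketch. The toy transactions and the item names (for example "a1" or "inner") are hypothetical and serve only to make the arithmetic concrete; they are not taken from the experimental data.

```python
def support(itemset, transactions):
    """Proportion of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """conf(X => Y) = supp(X ∪ Y) / supp(X); returns 0.0 if X never occurs."""
    supp_x = support(x, transactions)
    if supp_x == 0:
        return 0.0
    return support(frozenset(x) | frozenset(y), transactions) / supp_x


# Hypothetical transactions: discretized feature symbols plus a state label.
TRANSACTIONS = [
    frozenset({"a1", "b1", "normal"}),
    frozenset({"a1", "b2", "normal"}),
    frozenset({"a3", "b3", "inner"}),
    frozenset({"a3", "b2", "inner"}),
]

supp_a3 = support({"a3"}, TRANSACTIONS)                     # 2 of 4 transactions
conf_a3_inner = confidence({"a3"}, {"inner"}, TRANSACTIONS)  # every a3 is inner
conf_b2_inner = confidence({"b2"}, {"inner"}, TRANSACTIONS)  # half of b2 is inner
```

With a minimum confidence of 0.9, the rule {a3} ⟹ {inner} would be kept in this toy example, while {b2} ⟹ {inner} would be rejected.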

Association rule mining typically proceeds in two steps. First, all large itemsets, whose transaction support is above the minimum support, are found. Then, each large itemset is used to generate the desired rules that satisfy the minimum confidence constraint.

2.2. Apriori Algorithm

As a classical algorithm, the Apriori algorithm discovers the frequent itemsets by making multiple passes over the data. In the first pass, the support of each individual item is counted, and the items that reach the minimum support form the frequent 1-itemsets. In each subsequent pass, the itemsets found to be large in the previous pass are taken as a seed set. This seed set is used to generate new potentially large itemsets, called candidate itemsets. The actual support of these candidate itemsets is counted during the pass over the data. At the end of the pass, the candidate itemsets that are actually large become the seed for the next pass. This process continues until no new frequent itemsets are found. Because the candidate itemsets are generated from the previous pass's large itemsets alone, without consulting the transactions in the database, computational efficiency is improved.
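The passes described above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration of the level-wise idea (join, prune, count), not the implementation used in the experiments; the function names and the toy transactions are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent (large) itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def count(candidates):
        # One pass over the data: keep candidates reaching the minimum support.
        kept = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                kept[c] = s
        return kept

    # First pass: frequent individual items.
    current = count({frozenset([i]) for t in transactions for i in t})
    frequent = dict(current)
    k = 2
    while current:
        seed = set(current)
        # Join step: build k-item candidates from the previous large itemsets.
        candidates = {a | b for a in seed for b in seed if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be large.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in seed for s in combinations(c, k - 1))
        }
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (X, Y, support, confidence) for rules X => Y above the threshold."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for x in combinations(itemset, r):
                x = frozenset(x)
                conf = supp / frequent[x]  # subsets of a large itemset are large
                if conf >= min_confidence:
                    rules.append((x, itemset - x, supp, conf))
    return rules


# Hypothetical toy transactions (feature symbols plus a state label).
TOY = [
    {"a1", "b1", "normal"},
    {"a1", "b2", "normal"},
    {"a3", "b3", "inner"},
    {"a3", "b2", "inner"},
]
toy_frequent = apriori(TOY, min_support=0.5)
toy_rules = generate_rules(toy_frequent, min_confidence=0.9)
```

In this toy case the second pass keeps only {a1, normal} and {a3, inner}; no three-item candidate can be formed from them, so the loop stops after the third pass.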

3. The Proposed Method

In bearing defect diagnosis, analyzing the bearing defect signatures in a systematic manner is a challenge. In order to recognize bearing faults efficiently and accurately, this paper presents a new equal probability-based association rule mining method for bearing defect signature analysis.

3.1. Formulation of the Proposed Approach

The framework of the presented method consists of modules for data acquisition, feature extraction, equal probability-based association rule mining, and association analysis and fault diagnosis, as shown in Figure 1. First, normal and fault signal data are collected from the monitoring equipment, and features in the time and frequency domains are extracted from the obtained data. Then, the extracted features are discretized and transformed into symbolic sequences. Next, the relation between the discretized features and the labeled defect modes is used to formulate the rules. Finally, the representative features related to typical defects are extracted and investigated based on the rules. These features can be used not only to classify the bearing in different conditions but also to guide traditional bearing fault diagnosis.

As a feature selection method, the proposed approach differs from other data dimensionality reduction methods, such as PCA, KPCA, and LLE, which need to transform the data matrix and thereby strip the selected features of their actual physical meaning. The features selected by the proposed method keep their original state, so they can be used directly to analyze the fault while avoiding the impact of irrelevant features. The range of eigenvalues obtained from the representative features can be used to determine the type and size of a fault according to the respective values of the features. This makes it possible to assess bearing status directly from sensing measurements instead of relying on the complex models of conventional bearing defect diagnosis.

3.2. Association Rule Mining Based on Equal Probability

Typically, association rule mining requires that the data take Boolean attributes, such as “0” and “1” [39]. However, the relational tables in most business and scientific domains contain richer attribute forms, such as quantitative attributes (e.g., age and income), and Boolean attributes can be considered a special case of quantitative attributes [39]. A simple way to handle this is to partition the values into intervals and then map each interval to a Boolean attribute. Data discretization is therefore an essential step in mining association rules, transforming quantitative attributes into Boolean attributes. Inspired by symbolic aggregate approximation (SAX, a symbolic representation of sequential data) [40], this paper proposes a new association rule mining method based on equal probability distribution. The method divides the sequence into several intervals of equal probability according to the distribution of the data. First, the sequence is standardized, and the normalized sequence is assumed to follow the standard Gaussian distribution, X ~ N(0, 1):

B = (A − μ)/σ,

where B is the normalized array of A, μ is the mean of the sequence A, and σ is the standard deviation of A.

When the data follow the Gaussian distribution, the probability that a data point falls within the range [a, b] is the area under the standard Gaussian density curve between a and b, as shown in Figure 2. The probability is given by

P(a ≤ X ≤ b) = ∫_a^b (1/√(2π)) exp(−x²/2) dx.

According to the characteristics of the Gaussian distribution, the data can be graded into levels whose intervals have equal probability. The resulting data matrix consists of an array of features at different levels and a series of labels that represent the different states.
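The equal probability grading described above can be sketched as follows, in the spirit of SAX breakpoints. This is a minimal illustration under stated assumptions: the population standard deviation is used for the standardization, a value lying exactly on a cut point is assigned to the upper interval, and the function names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def equal_probability_breakpoints(k):
    """The k-1 cut points that split N(0, 1) into k intervals of probability 1/k."""
    std_normal = NormalDist(0.0, 1.0)
    return [std_normal.inv_cdf(i / k) for i in range(1, k)]


def discretize_equal_probability(values, k):
    """Standardize `values` and map each one to a level in 1..k."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("a constant sequence cannot be standardized")
    cuts = equal_probability_breakpoints(k)
    # bisect_right counts how many cut points lie at or below the z-score.
    return [bisect_right(cuts, (v - mu) / sigma) + 1 for v in values]


cuts_4 = equal_probability_breakpoints(4)  # quartiles of the standard normal
levels_2 = discretize_equal_probability([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 2)
```

Unlike equal width partitioning, the intervals near the mean are narrow and the tail intervals are wide, so each level is expected to receive a similar share of the samples when the feature is approximately Gaussian.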

Next, the association relation between the discretized features and the labeled defect modes is mined to formulate the rules. Finally, the sensitive features related to typical defects are extracted and investigated according to the rules. The presented method supports bearing status assessment directly from sensing measurements instead of relying on the complex models of the traditional fault diagnosis approach.

3.3. Feature Extraction and Discretization Effect Assessment Method

A total of 12 commonly used bearing vibration signal features are extracted, including eight features from the time domain and four features from the frequency domain, as listed in Table 1. These features are often used to depict the waveform, mutation, and distribution characteristics of the bearing vibration signal for bearing fault diagnosis. In order to compare the discretization performance of different methods and demonstrate the effectiveness of the discretization presented in this paper, the interval class-information entropy criterion is introduced. It is computed from the number of intervals k, the total number of samples ri in the ith interval, and the counts cij defined in the Nomenclature.

Typically, the interval class-information entropy reflects the category diversity of one interval. Normally, the larger the interval class-information entropy, the worse the discretization performance.
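To make the idea of category diversity within an interval concrete, the sketch below assumes one plausible Shannon-type form of the criterion: for each non-empty interval i, the entropy of the class proportions cij/ri is computed, and the result is averaged over the intervals. The exact equation of the paper may differ (for example in the logarithm base or the averaging), so this is an illustration of the principle rather than the authors' definition; the function name is hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def interval_class_entropy(levels, labels):
    """Average class entropy over the non-empty intervals (assumed form).

    levels: interval index of each sample after discretization
    labels: class (bearing state) of each sample
    """
    if len(levels) != len(labels) or not levels:
        raise ValueError("levels and labels must be non-empty and equally long")
    by_interval = defaultdict(Counter)
    for level, label in zip(levels, labels):
        by_interval[level][label] += 1  # accumulates the counts c_ij

    entropies = []
    for counts in by_interval.values():
        r_i = sum(counts.values())  # total number of samples in the interval
        h_i = -sum((c / r_i) * log2(c / r_i) for c in counts.values())
        entropies.append(h_i)
    return sum(entropies) / len(entropies)


# Pure intervals (one class each) versus a single fully mixed interval.
pure = interval_class_entropy([1, 1, 2, 2], ["normal", "normal", "inner", "inner"])
mixed = interval_class_entropy([1, 1, 1, 1], ["normal", "normal", "inner", "inner"])
```

Under this assumed form, an interval containing a single class contributes zero, and an interval mixing classes contributes a positive value, which matches the statement that a larger entropy indicates worse discretization.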

In order to evaluate the proposed interval class-information entropy criterion, a common RMS-based indicator is also introduced to compare the discretization performance of the different methods. A larger value of this indicator means that more different kinds of data exist in an interval, that is, the discretization method performs worse.

4. Experimental Studies

4.1. Experimental Setup and Dataset

The experimental data were provided by Case Western Reserve University [41], and the experimental setup is shown in Figure 3. A motor drives a shaft via a dynamometer and electronic control system. The test data used in this study come from deep groove ball bearings (6205-2RS JEM SKF) installed at the drive end of the motor in the motor-driven mechanical system. Single point faults were introduced into the test bearings using electro-discharge machining (EDM). For each test, vibration data were collected at a sampling rate of 12,000 Hz through accelerometers attached with a magnet to the housing at the drive end. The motor speed is 1,797 r/min, and the theoretical shaft frequency is 29.95 Hz. There are three types of fault datasets, for outer race faults, inner race faults, and rolling element faults, and each fault type is grouped into three categories according to the fault diameters of 0.007, 0.014, and 0.021 inches. Including the normal state, there are 10 types of data samples in total.
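The shaft frequency quoted above follows directly from the motor speed (1,797/60 = 29.95 Hz). The characteristic defect frequencies used as frequency-domain features are commonly derived from this value and the bearing geometry symbols listed in the Nomenclature (fr, N, D, d, ϕ). The sketch below uses the standard textbook expressions; the geometry values (9 rolling elements, 0.3126 in. ball diameter, 1.537 in. pitch diameter, zero contact angle) are assumptions about the 6205 bearing that are not stated in this paper, and some references quote the ball frequency as twice the value computed here.

```python
from math import cos, radians


def shaft_frequency_hz(speed_rpm):
    """Convert a shaft speed in r/min to a rotation frequency in Hz."""
    return speed_rpm / 60.0


def characteristic_frequencies(f_r, n_balls, pitch_d, ball_d, contact_angle_deg=0.0):
    """Standard bearing defect frequencies in Hz for a fixed outer race."""
    ratio = (ball_d / pitch_d) * cos(radians(contact_angle_deg))
    return {
        # Ball pass frequency, outer race
        "BPFO": 0.5 * n_balls * f_r * (1.0 - ratio),
        # Ball pass frequency, inner race
        "BPFI": 0.5 * n_balls * f_r * (1.0 + ratio),
        # Ball (roller) spin frequency
        "BSF": 0.5 * (pitch_d / ball_d) * f_r * (1.0 - ratio ** 2),
    }


f_r = shaft_frequency_hz(1797)
# Assumed 6205 geometry in inches (not given in the text).
freqs = characteristic_frequencies(f_r, n_balls=9, pitch_d=1.537, ball_d=0.3126)
```

Only the ratio d/D enters the formulas, so the result does not depend on whether the diameters are given in inches or millimetres, as long as both use the same unit.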

4.2. Discretization Effect Assessment

In the experiment, 12 columns of feature arrays and one array of corresponding bearing defect labels make up a 150 × 13 data matrix. The feature set is composed of eight features from the time domain and four features from the frequency domain, as listed in Table 1, and there are 10 types of bearing running states in total. The matrix is processed using the equal probability-based discretization technique, which divides each feature column into 10 intervals. To assess the performance of the proposed discretization method, the interval class-information entropy is used. The results of the discretization methods based on equal probability, equal density, and equal width are listed in Table 2.

In addition, the RMS-based criterion is used for comparison with the newly proposed interval class-information entropy criterion and to evaluate the performance of the new criterion. The results of the discretization methods based on equal probability, equal density, and equal width are listed in Table 3.

Figure 4 shows the performance comparison of the different discretization methods as a line chart. The average values of the interval class-information entropy and of the RMS-based indicator for the equal probability method are smaller than those of the other two methods, which indicates that the proposed method achieves better discretization performance. Distribution maps for the equal probability discretization method and for the equal density and equal width methods are illustrated in Figure 5. Compared with the other two methods, the waveform of the equal probability discretization is the closest to the distribution of the raw data.

In addition, a support vector machine (SVM) is applied to classify the discretized data. There are 15 sets of data for each type of bearing state (15 × 10 = 150 samples), including seven sets of training data and eight sets of test data. Table 4 shows the classification accuracy of the three discretization methods. The equal probability method also achieves the highest classification accuracy.

The most basic discretization techniques, even-division methods such as the equal density and equal width approaches, consider only the data boundaries and neglect the overall distribution of the data, which leads to excessive concentration or dispersion of the data. In contrast, the equal probability-based discretization method follows the equalized probability distribution of the data and divides an array of features into reasonable sections in a natural way, avoiding both unbalanced intervals and the arbitrariness of the manual settings required by traditional methods.

4.3. Representative Feature Mining of Bearing

In this experiment, the 150 × 13 matrix introduced previously is used for further analysis. The number of intervals is still 10, so each feature is graded into 10 levels. For easier observation and analysis, each row is transformed into a string of symbols such as “a1a2b1e10.” The letter denotes a specific feature, and the number indicates the level of the interval. There are 12 feature columns, so “a-l” are used to represent the features “fBPFO, fBSF, fBPFI, f xMV, xSD, xRMS, xRA, xCF, xS, xK, xKC,” respectively. In this way, the numerical matrix is transformed into a symbolic matrix. The symbolic matrix is then mined by the Apriori algorithm [42] to generate a series of underlying association rules for bearing fault diagnosis. The minimum support and confidence are set at 0.07 and 0.9, respectively. Table 5 presents the mining results.
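The transformation from the numerical matrix to the symbolic matrix can be sketched as follows. Each row of discretized levels becomes a set of symbols (feature letter plus level) together with its state label, which is the transaction format expected by an Apriori-style miner. The function name and the label strings are hypothetical, and the three-column example stands in for the 12 feature columns.

```python
from string import ascii_lowercase


def to_symbolic_transactions(level_rows, labels):
    """Map rows of interval levels to sets of symbols such as 'a1' or 'e10'."""
    transactions = []
    for row, label in zip(level_rows, labels):
        if len(row) > len(ascii_lowercase):
            raise ValueError("at most 26 feature columns can be lettered")
        # Column index -> letter (a, b, c, ...); level -> numeric suffix.
        items = {f"{ascii_lowercase[j]}{level}" for j, level in enumerate(row)}
        items.add(label)  # the defect label joins the transaction as an item
        transactions.append(frozenset(items))
    return transactions


# Hypothetical discretized rows with three feature columns.
rows = [[1, 2, 10], [1, 1, 3]]
symbolic = to_symbolic_transactions(rows, ["inner_007", "normal"])
```

With 150 samples, a minimum support of 0.07 corresponds to 10.5 transactions, so an itemset has to occur in at least 11 samples to be retained; since each bearing state contributes 15 samples, a rule tied to a single state must hold for most of that state's samples.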

Based on the mining results, a map of the three bearing failure types at different fault sizes is drawn to illustrate the range of the characteristic values intuitively. As shown in Figure 6, it clearly reflects the sensitivity and relevance between the features and the bearing defects as the fault degree changes. The eigenvalues associated with the normal running state are located in the low-level intervals; that is, the feature amplitude of fault-free operation is relatively small. For the roller fault, the eigenvalues are close to those of the normal state, except that fBSF and xRA increase slightly. In addition, since the eigenvalues of the fault of 0.014 in. diameter behave unusually, priority is given to the features of the faults of 0.007 and 0.021 in. diameter. For the inner race defect, the amplitudes of fBSF, fBPFI, xSD, xRMS, and xRA are much higher than normal; among them, fBPFI changes dramatically, while the other features, fBSF, xSD, xRMS, and xRA, change steadily. As the defect severity increases, the amplitude increases markedly. The same holds for xK and xKC, although their changes are less obvious. Similarly, the features fBPFO, xSD, xRMS, xRA, and xCF are more sensitive to the outer race defect, and fBPFO, which fluctuates remarkably, is extracted as a representative feature. It is also found that, as the severity of the outer race defect increases, the related features fluctuate distinctly instead of increasing steadily. The eigenvalues fBSF and xRA are selected as the features related to the rolling element defect, and fBSF, which fluctuates regularly, is chosen as a representative feature. The representative features for the different bearing defects are listed in Table 6. The range of the eigenvalues for the 0.007 and 0.021 in. faults is then mined, as presented in Table 7, and can be used to find the type and size of a fault based on the respective values of the features.

Fuzzy proximity is applied to validate the classification effectiveness of the rules mined by the proposed method, and the results are shown in Figure 7. The fuzzy proximity classification accuracy for each type of bearing state with the proposed method is generally higher than that of the other methods, and the average accuracy over the 10 types is shown in Figure 7(b). The proposed method achieves the highest accuracy, 98.67%, which demonstrates the effectiveness of equal probability-based association rule mining.

According to Table 7, nine representative features related to the various defects can be selected from the 12 features and used as the indicators for data classification. SVM and BP neural networks are each used as the classifier. Then, on the three types of fault datasets (outer race fault, inner race fault, and rolling element fault), the traditional dimensionality reduction methods PCA, KPCA, and LLE are used as baselines, each reducing the data dimension from 12 to 9. The classification results are shown in Table 8. The representative features selected by the proposed method achieve superior performance compared with the traditional methods based on data transformation.

5. Conclusions

To improve the reliability of rotary machinery, effective and efficient diagnosis methods are highly needed. This paper presents a new equal probability-based association rule mining method that directly uncovers the underlying relation between labeled defects and unusual features for bearing fault analysis. In view of the shortcomings of the traditional even-division methods used in association rule mining, the approach is based on equal probability discretization, which avoids excessive concentration or dispersion of the data. First, a series of features extracted from the signal data are discretized according to the equalized probability distribution of the data. Then, the data matrix composed of arrays of discretized features and defect labels is mined to generate association rules representing the relation between features and fault types. The rules are used for bearing fault diagnosis and support a systematic bearing defect signature analysis. The proposed method has been compared with two even-division discretization methods in the experimental study. Moreover, as a feature selection method, it does not need to transform the data matrix, a transformation that strips the selected features of their actual physical meaning in the traditional PCA, KPCA, and LLE dimension reduction methods. From the analysis and comparison results, the following conclusions can be drawn:

(1) Discretization is the most important process in quantitative association rule mining. Two types of methods, the interval class-information entropy criterion and SVM classification, are used to assess discretization performance. The equal probability method performs best under both.

(2) Some features, which can be called representative features, are sensitive to particular bearing defects, such as the ball pass frequency of the inner race to the bearing inner race fault. The feature map drawn by the proposed method intuitively illustrates the sensitivity and relevance between the features and the bearing defects as the fault degree changes. However, there are also some special circumstances, such as the mining result for the bearing rolling element defect: no obvious characteristic is found except an inconspicuous relationship with the ball (roller) spin frequency and the root amplitude, which is partially due to the complex rolling mechanism. The proposed method thus also provides a new idea for feature selection, one that does not need to transform the data matrix and therefore keeps the actual physical meaning of the selected features.

(3) The effectiveness of the proposed method is confirmed by the fuzzy proximity method in comparison with the common even-division discretization methods. The presented method achieves the highest classification accuracy. In addition, the proposed method also achieves superior performance compared with the traditional feature selection methods based on data transformation.

Nomenclature

xi: Value of the ith point in the sequence
n: Length of time series
fr: Shaft speed
N: Number of rolling elements
D: Bearing pitch diameter
d: Rolling element diameter
ϕ: Angle of the load from the radial plane
k: Number of intervals
ri: Total number of samples in the ith interval
cij: Number of samples in the ith interval with the jth eigenvalue

Data Availability

The data used to support the findings of this study are acquired from the bearing data center of CWRU and the web page http://csegroups.case.edu/bearingdatacenter/home (accessed March 2019).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research acknowledges the financial support provided by the National Key Research and Development Program of China (no. 2016YFC0802103), National Science Foundation of China (nos. U1862104 and 51674277), and Science Foundation of China University of Petroleum, Beijing (no. ZX20180008).