Abstract

An enhanced k-nearest neighbor (k-NN) classification algorithm is presented, which uses a density based similarity measure in addition to a distance based similarity measure to improve the diagnostic performance in bearing fault diagnosis. Due to its use of a distance based similarity measure alone, the classification accuracy of traditional k-NN deteriorates in the case of overlapping samples and outliers and is highly susceptible to the neighborhood size, k. This study addresses these limitations by proposing the use of both distance and density based measures of similarity between training and test samples. The proposed k-NN classifier is used to enhance the diagnostic performance of a bearing fault diagnosis scheme, which classifies different fault conditions based upon hybrid feature vectors extracted from acoustic emission (AE) signals. Experimental results demonstrate that the proposed scheme, which uses the enhanced k-NN classifier, yields better diagnostic performance and is more robust to variations in the neighborhood size, k.

1. Introduction

Rotary machines, in both industry and common households, use bearings to reduce friction and ensure steady and energy efficient operation. Bearings reduce the noise and vibration levels associated with a machine, which is essential for the long term health of both the machine and its operators. Although bearings are sturdy components with long useful lives, material fatigue due to variations in operating load, currents due to electric discharge, thermal stresses due to variations in operating temperature, corrosion, and contaminants in the operating environment can cause them to fail abruptly. A bearing failure can result in the abrupt shutdown of a machine, which leads to tremendous financial losses. Bearings account for more than 50% of failures in induction motors alone [1], which makes their condition monitoring essential to preventing abrupt failures. Thus, early and reliable detection of bearing defects, before they lead to bearing failure, is very important.

Many data driven techniques have been proposed for diagnosing faults in bearings. These techniques largely use time-frequency analysis of the fault signals for the extraction of meaningful information about underlying faults [2, 3]. Fault signals, such as stator current, vibration acceleration, and acoustic emissions, are inherently nonstationary, and hence they are processed in the time-frequency domain, using the short-time Fourier transform (STFT) [4], wavelet transforms [5–10], empirical mode decomposition (EMD) [11–15], and the Hilbert-Huang transform [16–18], to extract characteristic information about different bearing defects. Acoustic emissions are characterized by their low energies and very high bandwidths. They are captured using wide-band acoustic sensors and are very effective in diagnosing nascent faults [19–21]. This paper presents a data driven approach for fault diagnosis in bearings, which extracts hybrid features from the acoustic emission (AE) signals and then employs the proposed enhanced k-NN classifier to diagnose different bearing defects.

The hybrid feature vectors are constructed by calculating different statistical measures of the time and frequency domain AE signal and its envelope power spectrum. This rather extensive set of features is constructed to uniquely identify each fault condition; nevertheless, not all features are of equal utility in classifying a given fault correctly. Moreover, a high dimensional feature vector makes the classification process computationally more expensive. Furthermore, if the feature vector contains too many redundant or irrelevant features, it may also degrade the classifier’s accuracy. Hence, the dimensionality of the feature vector is reduced using feature selection methods, which prune the high dimensional feature vector by eliminating the suboptimal features and selecting only those that would result in the highest classification accuracy. These optimal features are used to create a model of the data by training a classifier, which is then employed to classify the unknown fault signals.

Due to its simplicity and effectiveness, k-NN is usually the first choice in solving any classification problem. However, two factors can degrade its performance. First, k-NN determines the similarity between two samples using only a distance measure of similarity; the most widely used distance measures are the Euclidean and Manhattan distances. Second, the classification decision, and hence the accuracy, is sensitive to the neighborhood size, k. These problems are highlighted in Figure 1, where the classification decision for the unknown test sample (shown as a red circle) changes with the neighborhood size: the test sample is labeled as “B” for one choice of k, whereas it is labeled as “A” for another. The limitations of traditional k-NN, due to its use of a distance based similarity measure, can be overcome using the local outlier factor (LOF) [22, 23] and the local correlation integral (LOCI) [24], which are measures of similarity based on the density of data samples. Hence, in this study, hybrid similarity measures (i.e., both distance and density based) are proposed to improve the diagnostic performance of classical k-NN and make it more resilient to the choice of neighborhood size, k.

The main contribution of this study is an enhanced k-NN classifier, which uses hybrid measures of similarity between data samples to make it more resilient to the choice of neighborhood size, k, and to increase its diagnostic performance relative to classical k-NN. The density based similarity measure (i.e., the LOF) is used to boost the decision of classical k-NN, which classifies an unknown sample based only upon its Euclidean distance from its k nearest neighbors using the majority rule. In the proposed k-NN, when the nearest neighbors of an unknown sample do not belong to the same class, the LOF is used to decide the class membership of the unknown sample.

The rest of the paper is organized as follows. In Section 2, the fault simulator and data acquisition setup are presented. In Section 3, the fault diagnosis scheme and the proposed enhanced k-NN classifier are discussed in detail. In Section 4, the results are discussed, and, in Section 5, conclusions are drawn.

2. Fault Simulator and Data Acquisition System

The acoustic emission (AE) signals are acquired using a machinery fault simulator, which is used to simulate different fault conditions. The fault simulator uses cylindrical roller element bearings (FAG NJ206-E-TVP2), which are seeded with cracks on their different parts. AE signals are collected for bearings at the nondrive end of the simulator using a wide-band acoustic sensor and a PCI-2 based data acquisition system, which samples the AE signals at a rate of 250 kHz [25]. The acoustic sensor is attached to the top of the bearing housing at an approximate distance of 21.48 mm from the bearing, as shown in Figure 2. The nondrive end shaft is connected to the drive end through a gearbox with a reduction ratio of 1.52 : 1.

The bearings are seeded with cracks of two different sizes (i.e., 3 mm and 12 mm), and these cracks are introduced on either one or two components of the bearing to study both single and compound bearing defects. The AE signals recorded for bearings with 3 mm cracks and for bearings with 12 mm cracks are grouped into separate datasets. Moreover, for each crack size, the AE signals are recorded at two different shaft speeds (i.e., 300 RPM and 350 RPM). Thus, a total of four datasets are considered, each corresponding to a different combination of crack size and shaft speed. The types of single and compound bearing defects are shown in Figure 3; they include cracks on the roller (BFR), inner raceway (BFI), outer raceway (BFO), inner and outer raceways (BFIO), inner raceway and roller (BFIR), outer raceway and roller (BFOR), and both inner and outer raceways and the roller (BFIOR). For each shaft speed, the AE signal for a healthy bearing (FFB) is also recorded.

As mentioned earlier, the AE signals are divided into 4 datasets based upon the crack size and shaft speed, as given in Table 1. For every bearing defect, 90 AE signals are recorded; each signal is of 5-second duration. Similarly, 90 AE signals are recorded for the healthy bearing. Thus, every dataset contains a total of 720 AE signals.

3. The Proposed Methodology for Bearing Fault Diagnosis

The proposed methodology for bearing fault diagnosis works in two phases, as illustrated in Figure 4. The first phase comprises an offline process that involves feature extraction and feature selection, which are discussed in detail in Sections 3.1 and 3.2, respectively. The offline process is used to determine the set of optimal features that would yield the highest classification accuracy. In the second phase, an online process is used to classify the unknown AE signals using the proposed enhanced -NN classifier. The online process calculates only the optimal set of features for each AE signal and, using only those features, it labels the unknown AE signals.

3.1. Feature Extraction

In order to accurately identify each bearing defect, a high dimensional hybrid feature vector is constructed using 22 different features of the AE signal. These features are useful in extracting maximum information about each fault [26] and include ten statistical measures of the time domain AE signal and three statistical measures of the frequency domain AE signal. These features are listed in Table 2 along with the mathematical relationships for their calculation. Moreover, nine statistical measures, calculated over the envelope power spectrum of the AE signal, are also included in the hybrid feature vector. The features from the envelope power spectrum are the root mean square (RMS) values for each of the three defect frequencies and its first two harmonics. The defect frequencies are the ball pass frequency over the inner race (BPFI), the ball pass frequency over the outer race (BPFO), and the ball spin frequency (BSF). The range of values for these defect frequencies and their harmonics is shown in Figure 5.
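To make the construction of the hybrid feature vector concrete, the following is a minimal sketch of a few representative features, assuming a raw AE signal x sampled at 250 kHz; the function and feature names are illustrative, and the full 22-feature set follows Table 2.

import numpy as np
from scipy.signal import hilbert
from scipy.stats import kurtosis, skew

def time_domain_features(x):
    # A few of the ten time domain statistical measures (illustrative subset).
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": kurtosis(x),
        "skewness": skew(x),
    }

def envelope_power_spectrum(x, fs=250_000):
    # Envelope via the Hilbert transform, then the power spectrum of the
    # envelope; the RMS values around each defect frequency and its first
    # two harmonics are read from this spectrum.
    env = np.abs(hilbert(x))
    spectrum = np.abs(np.fft.rfft(env - env.mean())) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, spectrum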

The ranges of values for the defect frequencies and their first two harmonics are calculated using (1), (2), and (3), respectively, where the parameters involved are the number of sidebands, the operating frequency, the error rate, the inner raceway defect frequency, the outer raceway defect frequency, the cage frequency, and the roller defect frequency.
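Since equations (1)–(3) are not reproduced here, the following sketch only illustrates the general idea of forming a search window around each defect frequency harmonic; the tolerance model (a relative error rate plus a number of cage-frequency sidebands) and all parameter names are assumptions, not the paper's exact relations.

def harmonic_windows(f_defect, f_cage, n_sidebands=1, error_rate=0.02, n_harmonics=3):
    # Hypothetical window around each harmonic h * f_defect, widened by a
    # relative error and by n_sidebands cage-frequency sidebands on each side.
    windows = []
    for h in range(1, n_harmonics + 1):
        center = h * f_defect
        low = center * (1.0 - error_rate) - n_sidebands * f_cage
        high = center * (1.0 + error_rate) + n_sidebands * f_cage
        windows.append((low, high))
    return windows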

3.2. Feature Selection

Although a high dimensional hybrid feature vector is highly desirable to capture the characteristics of different types of defects, the diagnostic performance of the proposed method can be degraded by potentially irrelevant and redundant features. Moreover, a high dimensional feature vector entails an increased computational cost during feature extraction and classification, which involves the calculation of distances and densities between different samples [25–27]. Hence, the original feature vector is evaluated to determine the set of optimal features that would yield the best diagnostic performance and reduce the computational cost of the proposed method.

In this study, sequential forward selection (SFS), a simple and fast greedy search algorithm, is used for feature selection. It starts with an initially empty set of selected features and then iteratively adds the most significant feature from the original set. This is done by selecting a feature from the original set and adding it to the selected subset only if it maximizes the value of the objective function for that subset; if the selected feature decreases the value of the objective function, it is discarded and the process moves on to the next feature. The objective function for SFS is given by (4), which is essentially the ratio of interclass separability to intraclass compactness [25]: the numerator measures the interclass distance, whereas the denominator measures the intraclass compactness. Although SFS is simple, efficient, and reasonably accurate, it has its disadvantages. It suffers from the nesting problem; that is, a feature retained once cannot be discarded, which can result in suboptimal feature selection [28–30]. A minimal sketch of the procedure is given below.
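The sketch assumes a feature matrix X (rows are samples, columns are features) and labels y; since equation (4) is not reproduced here, the function objective below is an illustrative between-class to within-class scatter ratio standing in for it.

import numpy as np

def objective(X, y):
    # Illustrative stand-in for (4): interclass separability over intraclass
    # compactness, computed from class means and within-class deviations.
    overall_mean = X.mean(axis=0)
    classes = np.unique(y)
    between = sum((y == c).sum() * np.sum((X[y == c].mean(axis=0) - overall_mean) ** 2)
                  for c in classes)
    within = sum(np.sum((X[y == c] - X[y == c].mean(axis=0)) ** 2) for c in classes)
    return between / within

def sfs(X, y, max_features):
    # Greedy forward search: add a feature only if it improves the objective.
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        scores = {f: objective(X[:, selected + [f]], y) for f in remaining}
        best = max(scores, key=scores.get)
        if selected and scores[best] <= objective(X[:, selected], y):
            break  # no remaining feature improves the objective: stop
        selected.append(best)
        remaining.remove(best)
    return selected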

3.3. Enhanced k-NN Classification Algorithm

The traditional k-NN classifier labels an unknown test sample according to the majority of its k nearest neighbors in the training set. The nearest neighbors are determined using a distance measure, most commonly the Euclidean distance between two samples. In multiclass classification problems, where the density of each class is different, the use of a distance based measure of similarity between the test and training samples can result in misclassification and render the classification result sensitive to the choice of neighborhood size, k, as illustrated in Figure 1. This happens because traditional k-NN does not take into account the variation in densities across different classes. Therefore, an enhanced k-NN classifier is proposed, which uses both distance and density based similarity measures to improve its classification accuracy. For a given test sample, first its membership probabilities for the different classes are calculated. This is done through voting by its k nearest neighbors, which in turn are determined using the Euclidean distance of the test sample from all the training samples. If the membership probability for the test sample is one (i.e., all its nearest neighbors belong to a single class), then the proposed k-NN classifier admits this result and labels the test sample according to its nearest neighbors. However, if the membership probability of the test sample is less than one (i.e., all the nearest neighbors do not belong to a single class), then the LOF based density measure is used to determine the label of the test sample. The use of the LOF in conjunction with the Euclidean distance makes the classification performance of the enhanced k-NN insensitive to the neighborhood size, k.

As shown in Figure 6, the proposed k-NN first calculates the membership probabilities for the unknown test samples using probabilistic k-NN, which uses the Euclidean distance as a measure of similarity. The probabilistic k-NN does not assign any class labels to the test samples; instead, it only calculates their membership probabilities for all the classes.

If, for each class, the membership probability of a test sample is less than 1.0, then the output of the majority rule is ignored and the final membership of the test sample is determined using the LOF value, as shown in Figure 7.
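The following is a minimal sketch of this decision rule using scikit-learn, assuming training features X_train with labels y_train and test features X_test. Resolving a non-unanimous vote by assigning the class for which the test sample has the smallest LOF (i.e., is least of an outlier) is a simplification of the comparison described above, and all names and parameter values are illustrative.

import numpy as np
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors

def enhanced_knn_predict(X_train, y_train, X_test, k=3, k_lof=16):
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, neighbor_idx = nn.kneighbors(X_test)  # k nearest training samples
    # One LOF model per class, fitted in novelty mode so that unseen test
    # samples can be scored against that class's training density.
    lof_models = {c: LocalOutlierFactor(n_neighbors=k_lof, novelty=True)
                  .fit(X_train[y_train == c]) for c in np.unique(y_train)}
    y_pred = []
    for i, neighbors in enumerate(neighbor_idx):
        votes = np.unique(y_train[neighbors])
        if len(votes) == 1:
            y_pred.append(votes[0])  # unanimous vote: accept the majority rule
            continue
        # Non-unanimous vote: score_samples returns the negative LOF, so the
        # class with the largest score is the one the sample fits best.
        best = max(votes, key=lambda c: lof_models[c].score_samples(X_test[i:i + 1])[0])
        y_pred.append(best)
    return np.array(y_pred)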

3.4. Calculating the Local Outlier Factor (LOF)

The local outlier factor (LOF) has been used for the detection of outliers or anomalous data points [22], which have relatively lower probabilities of being members of any class. An unknown sample is classified by comparing its density with that of its neighbors. Points with densities similar to those of their neighbors are classified accordingly; that is, points with lower densities are labeled according to their neighbors with lower densities, whereas points with higher densities are labeled according to their neighbors with higher densities. The LOF can be calculated as follows:

(i) First, for each data point “p”, the distance to its k-th nearest neighbor, k-distance(p), is calculated, as illustrated in Figure 8(a).

(ii) Second, for each data point “p”, its reachability distance with respect to a data point “o”, reach-dist_k(p, o), is the true distance between “p” and “o” with a minimum value of k-distance(o), as illustrated in Figure 8(b). It is calculated as follows:

reach-dist_k(p, o) = max{k-distance(o), d(p, o)}.   (5)

(iii) Third, for each data point “p”, its local reachability density, lrd_k(p), is defined as the inverse of its average reachability distance from its k nearest neighbors, as given in (6). The value of k is set to 16, as given in Table 3:

lrd_k(p) = 1 / ( (1/|N_k(p)|) Σ_{o ∈ N_k(p)} reach-dist_k(p, o) ).   (6)

(iv) Finally, for each data point “p”, its local outlier factor, LOF_k(p), is determined by comparing its local reachability density to those of its k nearest neighbors using the following relation:

LOF_k(p) = ( (1/|N_k(p)|) Σ_{o ∈ N_k(p)} lrd_k(o) ) / lrd_k(p).   (7)

The LOF values for all the training samples are computed using (7) during the training phase. The unknown test samples are classified based upon the similarity of their LOF values to those of their neighbors.
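A from-scratch sketch of the four steps above in NumPy, assuming a dense feature matrix X with one row per training sample and k = 16 as in Table 3; all names are illustrative.

import numpy as np

def local_outlier_factor(X, k=16):
    # Pairwise Euclidean distances; a point is never its own neighbor.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    idx = np.argsort(d, axis=1)[:, :k]        # k nearest neighbors of each point
    nn_dists = np.take_along_axis(d, idx, axis=1)
    k_dist = nn_dists[:, -1]                  # step (i): k-distance of each point
    # Step (ii): reach-dist_k(p, o) = max(k-distance(o), d(p, o)).
    reach = np.maximum(k_dist[idx], nn_dists)
    # Step (iii), equation (6): inverse mean reachability distance.
    lrd = 1.0 / reach.mean(axis=1)
    # Step (iv), equation (7): mean neighbor lrd divided by the point's own lrd.
    return lrd[idx].mean(axis=1) / lrd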

4. Results and Discussion

In this section, the experimental results achieved by the proposed method for bearing fault diagnosis are discussed. As mentioned earlier, four datasets are used to test the proposed method, details of which are given in Table 1. The method uses the enhanced k-NN classifier, which has been proposed to address the limitations of traditional k-NN. The enhanced k-NN classifier was used with the parameters given in Table 3.

To demonstrate the effectiveness of the proposed k-NN classifier, the classification of inner race fault samples from dataset 1 is illustrated in Figure 9, using both the traditional and proposed k-NN classifiers with neighborhood sizes of 3 and 7 (i.e., k = 3 and k = 7). The samples shown inside the red ellipse are to be classified; their true label is “inner_race_fault” (i.e., these samples belong to the inner race fault class). However, the classification result of the traditional k-NN classifier varies with the value of k: for k = 3, it correctly classifies these samples as inner race fault samples, whereas, for k = 7, it classifies them as outer race fault samples, which is incorrect. This happens because traditional k-NN uses the majority rule to decide the class label for an unknown test sample. In this particular case, among the nearest three neighbors of these unknown test samples, two are inner race fault and one is outer race fault. Hence, for k = 3, they are correctly classified as inner race fault samples. However, among the nearest seven neighbors of these unknown test samples, four are outer race fault and three are inner race fault. Hence, for k = 7, they are incorrectly classified as outer race fault samples. In contrast, the proposed k-NN always classifies these samples as inner race fault samples, irrespective of the size of the neighborhood (i.e., the value of k).

The proposed k-NN classifier correctly classifies these unknown test samples because it uses the LOF, a density based similarity measure. The LOF is used only when the nearest neighbors of a given test sample do not belong to the same class (i.e., the vote is not unanimous). Therefore, the class membership probabilities for the unknown test samples are determined first. In this particular case, for k = 3, the probability that a given test sample is a member of the inner race fault class is 66.7%, and the probability that it belongs to the outer race fault class is 33.3%. Since both class membership probabilities are less than one, the proposed k-NN classifier employs the LOF values of the unknown test samples and their neighbors to determine the final class labels. This is demonstrated in Figure 10, which shows the LOF values for the test samples and their nearest neighbors. The LOF values of the test samples with respect to the outer race fault class are 5.09, 5.069, and 4.979, whereas their LOF values with respect to the inner race fault class are 3.33, 3.399, and 3.192, respectively. If the LOF values of these test samples for both the outer and inner race fault classes are compared to the LOF values of their nearest training samples, it can be observed that the LOF values of the test samples for the inner race fault class are similar to the LOF values of training samples from the inner race fault class. Hence, these test samples are outliers to the outer race fault class and inliers to, or members of, the inner race fault class.

Similarly, when k = 7, the probability that a given test sample is a member of the inner race fault class is 42.86%, and the probability that it belongs to the outer race fault class is 57.14%. Here again, the class membership probabilities are less than one, and, thus, the proposed k-NN classifier employs the LOF values of the unknown test samples and their neighbors to determine the final class labels. Using the LOF values of the test samples and their nearest training samples, the test samples are classified as members of the inner race fault class.

The proposed k-NN classifier improves upon the classification accuracy of traditional k-NN in the same way for the other datasets and fault types. This is clearly evident in Figure 11, which compares the performance of the two classifiers in terms of average classification accuracy, and in Table 4, which lists the classification accuracies for each dataset and individual fault type. Moreover, it can also be observed that the accuracy of the proposed k-NN is not affected by the neighborhood size, k, whereas the accuracy of traditional k-NN varies with the neighborhood size and reaches its maximum only for a particular value of k.

The optimal neighborhood size, which maximizes the classification accuracy of traditional k-NN, has to be determined on a case by case basis; there are no general rules that work equally well in all situations and for all classes, which makes the whole process computationally expensive and inflexible. The robustness of the proposed k-NN to variations in the neighborhood size, k, makes it more flexible and efficient to use, and it delivers better and steadier performance. Moreover, in multiclass problems like the one considered in this study, where the densities of the different classes vary, traditional k-NN performs poorly as it does not consider these variations in density. The proposed k-NN takes the density variations of the different classes into account and uses the LOF to decide the class membership of test samples in such cases.

5. Conclusion

In this paper, an enhanced k-nearest neighbor (k-NN) classification algorithm was presented, which employs both density and distance based similarity measures to improve the diagnostic performance in bearing fault diagnosis. The density based similarity measure, the LOF, was used to boost the classification performance of traditional k-NN, which deteriorates in the case of overlapping samples, outliers, and multiple classes with different feature distributions. Moreover, the exclusive reliance on a distance based similarity measure makes the classification performance of traditional k-NN highly susceptible to the neighborhood size, k. These limitations were addressed through the use of both distance and density based similarity measures between the training and test samples. Using the enhanced k-NN classifier, the diagnostic performance of the proposed bearing fault diagnosis scheme was significantly improved, and the results were more robust to variations in the neighborhood size, k.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was supported by the development of a basic fusion technology in electric power industry (Ministry of Trade, Industry & Energy, 201301010170D); funded in part by The Leading Human Resource Training Program of Regional Neo Industry through the National Research Foundation of Korea (NRF), Ministry of Science, ICT, and Future Planning (NRF-2016H1D5A1910564); funded in part by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and the Ministry of Trade, Industry, & Energy (MOTIE) of the Republic of Korea (no. 20162220100050); funded in part by the Business for Startup R&D funded by the Korea Small and Medium Business Administration in 2016 (Grants S2381631 and C0395147); and funded in part by the “Leaders INdustry-university Cooperation” Project supported by the Ministry of Education (MOE).