Abstract

Traditional fault diagnosis methods for bearings detect characteristic defect frequencies in the envelope power spectrum of the vibration signal. These defect frequencies depend upon the inherently nonstationary shaft speed. Time-frequency and subband signal analysis of vibration signals has been used to deal with random variations in speed, whereas design variations require training a new instance of the classifier for each operating speed. This paper presents an automated approach for fault diagnosis in bearings based upon the 2D analysis of vibration acceleration signals under variable speed conditions. Images created from the vibration signals exhibit unique textures for each fault, which show minimal variation with shaft speed. Microtexture analysis of these images is used to generate distinctive fault signatures for each fault type, which can be used to detect those faults at different speeds. A k-nearest neighbor classifier trained using fault signatures generated for one operating speed is used to detect faults at all the other operating speeds. The proposed approach is tested on the bearing fault dataset of Case Western Reserve University, and the results are compared with those of a spectrum imaging-based approach.

1. Introduction

In modern industries, motion is mostly powered by electromechanical systems (e.g., induction motors), which account for nearly 70% of the gross energy consumption in industrialized economies [1]. Induction motors and other industrial machines that undergo rotatory motion use bearings to reduce friction. The reduction in friction conserves energy that would otherwise be lost in overcoming it. It also increases the useful life of a machine by reducing its wear. Nevertheless, adverse operating conditions and cyclic loading can lead to material fatigue in bearings, which manifests itself in the form of surface cracks and spalls [2]. These cracks and spalls, if allowed to go undetected, can lead to costly and unexpected shutdowns, which is detrimental to economic productivity.

Bearings are at the heart of condition monitoring techniques since they are the most common cause of failure in induction motors (i.e., in more than 50% of the cases) [3]. In addition, their failure can cause prolonged downtimes. According to a recent study, gearbox bearings cause the longest downtime per failure in wind turbines [4, 5]. Hence, condition monitoring techniques for bearings are widely used in almost all forms of rotary machinery (e.g., gearboxes, wind turbines, helicopters, and even rotary microelectromechanical systems or MEMS) [6–10]. As bearing degradation is accompanied by increased levels of noise and vibration, vibration condition monitoring of bearings is a standard practice in the industry and an essential component in any predictive maintenance strategy. The vibration levels of machine components, especially bearings, are measured using accelerometers and analyzed to determine underlying faults [11, 12].

A detailed survey of fault diagnosis and fault-tolerant techniques, which have been developed in different domains, is provided in [13, 14]. These techniques have been broadly categorized into model-based, signal-based, knowledge-based, and hybrid/active approaches. However, fault diagnosis in bearings has mostly been done through signal-based techniques. These techniques usually involve three major steps: (1) measurement of the signal that will be used for fault diagnosis (different types of signals, such as structural vibration [10, 15–18], stator current for bearings in induction motors [7, 8], acoustic emissions [19–25], temperature [6], and more recently rotor speed [26], have been used); (2) processing the signal to extract features that are characteristic of anomalous conditions; (3) using different classifiers such as k-NN, support vector machines (SVMs), or artificial neural networks (ANNs) for classifying normal and faulty signals.

Signal-based approaches detect localized defects in bearings, mostly by extracting their associated characteristic frequencies from the modulated fault signal through envelope analysis [27, 28]. These characteristic frequencies are the ball pass frequency for the outer raceway (BPFO), which is associated with faults on the outer raceway of a bearing, the ball pass frequency for the inner raceway (BPFI), which is associated with faults on the inner raceway, and the first even harmonic of the ball spin frequency (2×BSF), which is associated with roller faults [29, 30]. Performance of the envelope analysis is improved by using it in conjunction with time-frequency analysis tools such as the discrete wavelet transform (DWT) [17, 31, 32], short time Fourier transform (STFT) [33, 34], empirical mode decomposition (EMD) [35–38], and discrete wavelet packet transform (DWPT) [19–23, 39]. These tools are primarily used to filter frequency bands near the carrier frequency, where the signal components corresponding to defects are modulated.
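To make the speed dependence of these characteristic frequencies concrete, the following minimal sketch evaluates the standard kinematic expressions for BPFO, BPFI, and 2×BSF. The function name and the bearing geometry in the example (9 balls, 7.94 mm ball diameter, 39.04 mm pitch diameter, zero contact angle) are illustrative assumptions and are not taken from Table 1.

```python
import math


def defect_frequencies(shaft_rpm, n_balls, ball_d, pitch_d, contact_angle_deg=0.0):
    """Return (BPFO, BPFI, 2xBSF) in Hz from the standard kinematic formulas.

    ball_d and pitch_d must share the same length unit.
    """
    f_r = shaft_rpm / 60.0  # shaft rotation frequency in Hz
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    bpfo = 0.5 * n_balls * f_r * (1.0 - ratio)
    bpfi = 0.5 * n_balls * f_r * (1.0 + ratio)
    bsf = 0.5 * (pitch_d / ball_d) * f_r * (1.0 - ratio ** 2)
    return bpfo, bpfi, 2.0 * bsf


# Illustrative geometry only (assumed values, not from the source).
for rpm in (1796, 1772, 1748, 1722):
    bpfo, bpfi, two_bsf = defect_frequencies(rpm, 9, 7.94, 39.04)
    print(f"{rpm} rpm: BPFO={bpfo:.1f} Hz, BPFI={bpfi:.1f} Hz, 2xBSF={two_bsf:.1f} Hz")
```

All three frequencies scale linearly with the shaft speed, which is why a spectral signature learned at one speed shifts when the speed changes.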

Nevertheless, these characteristic or defect frequencies are kinematic quantities that depend on the shaft speed and bearing geometry. The shaft speed and load angle from the radial plane are subject to random variations, which make the bearing signal inherently nonstationary and cause variations in the fundamental defect frequencies [29]. The detection of defect frequencies that are subject to random variations is challenging and hence requires tedious methods, which are difficult to implement. In [21, 22, 39], Kang et al. reduced the effects of nonstationarity by using subband analysis of fault signals through filter banks. They proposed measures like the mean-peak ratio and Gaussian mixture model-based residual component-to-defect component ratios to select the most informative subband. After selecting the most effective subband, features such as the relative wavelet packet energy and wavelet packet node kurtosis were extracted and then used for fault diagnosis. In [16], Amar et al. used binary vibration spectrum images and artificial neural networks for bearing fault diagnosis. The classification performance of this approach is highly susceptible to the quality of the binary spectral images. This method therefore hinges on the appropriate selection of the grayscale threshold value, which is used to generate those binary spectral images. Despite their complexity, these methods are dependent on the shaft speed and their performance is affected by its random variations. Moreover, these methods cover the design variations in shaft speed by dividing the vibration acceleration or acoustic emission signals into different datasets on the basis of shaft speed (revolutions per minute or rpm). A unique instance of the classifier is trained for each operating speed, which can only be used to classify the test samples for that operating speed alone. For a different operating speed, a new instance of the classifier needs to be retrained on a new set of features. 
In [40], a mechanism for feature extraction was proposed, which can be used to diagnose bearing faults under gear interference and variable speed conditions. However, this approach is very tedious and computationally complex since it extracts a feature, called the instantaneous dominant meshing multiplying, using STFT, and then resamples the original signal using this feature, decomposes the resampled signal into intrinsic mode functions (IMF) using EMD, and finally carries out the envelope demodulation of the IMF, with the highest kurtosis value, to determine bearing fault. Moreover, the output is not processed automatically; rather it requires manual interpretation to diagnose bearing faults.

The proposed approach addresses three primary limitations of existing techniques: (1) they are tedious and complex processes, and hence practical solutions based on these methods are difficult to implement and more likely to be costly; (2) they require retraining the classifier whenever the motor rpm is changed to achieve given operating conditions; and (3) they are not fully automated and require manual interpretation of the output. In contrast with the conventional approach, the proposed scheme is automated and simple to implement, uses pure pattern analysis, and requires the training of only a single instance of the classifier for all four operating speeds considered in this study. The classifier is trained using features extracted from any one of the four datasets, and it can effectively diagnose faults at all the operating speeds, as demonstrated in Section 4. In this study, the operating speed varies by approximately ±5% across all the datasets.

The main contribution of this study is that it proposes a method for fault diagnosis of bearings that is impervious to both random and design variations in shaft speed. In this method, the time domain vibration signal is converted into grayscale images. The dimensions of these images are determined experimentally, to ensure minimal variation in textures across different shaft speeds. The proposed approach, which is discussed in detail in Section 3, is validated using the publicly available benchmark dataset from [41]. A comparison of the proposed approach with vibration spectrum imaging [16] is provided.

The rest of the paper is organized as follows: Section 2 describes the seeded fault test data used to validate the proposed approach. Section 3 provides a detailed discussion of the proposed fault diagnosis scheme. Section 4 provides the discussion and analysis of experimental results, and Section 5 concludes the paper.

2. Experimental Setup and Vibration Fault Data

The proposed approach is tested on the publicly available seeded fault test data of Case Western Reserve University [41]. The data was collected using a 2-horsepower (hp) motor with a torque transducer and a dynamometer. The dynamometer is used to apply different loads on the bearing (i.e., 0 hp, 1 hp, 2 hp, and 3 hp). In this study, the vibration acceleration signals that were recorded for the drive end bearings are considered for analysis. The specifications of the drive end bearings are given in Table 1.

The test bearings are seeded with single point localized defects on the rollers, the inner raceways, and the outer raceways. The dimensions of the seeded faults are given in Table 2. The vibration data used in this analysis was measured using accelerometers placed at the 12 o’clock position on the bearing housing. The signal was recorded at a sampling rate of 12,000 Hz, using a 16-channel encoder.

As mentioned in Table 2, a total of four fault conditions including the normal or fault-free condition, an inner raceway fault, a ball fault, and an outer raceway fault are considered in this study. The snapshots of vibration acceleration signals for each of these four conditions are given in Figure 1.

For each fault condition, the vibration acceleration signals are available at four different shaft speeds (i.e., 1796 revolutions per minute (rpm), 1772 rpm, 1748 rpm, and 1722 rpm). Therefore, a total of sixteen vibration acceleration signals are analyzed in this study. These vibration signals are divided into four datasets, one for each shaft speed. For each dataset, the measured shaft speed is assumed to be constant and is taken as the value given in [41]. In other words, the inevitable random variations in the measured speed are not modeled explicitly; the proposed method is affected neither by these random variations nor by the design variations introduced to achieve specified operating conditions. The details of these datasets are given in Table 3.

The length of the vibration acceleration signals, in terms of the number of cycles of available data, varies across the datasets. In the proposed approach, the rectified vibration signal is divided into cycle length slices, as discussed in Section 3. The length of each slice, therefore, varies with the shaft speed (i.e., the cycle length for the 1796 rpm dataset is ~400 samples; for the 1772 rpm dataset, it is ~406 samples; for the 1748 rpm dataset, it is ~412 samples; and for the 1722 rpm dataset, it is ~418 samples). These slices are then stacked over each other to construct the grayscale vibration images. The heights of these images correspond to the number of slices that are stacked during their construction. The details of this process and the motivation for it are discussed in Section 3. In summary, the raw vibration acceleration signals are first converted into grayscale images, which are then used for extracting microtexture information using the local binary pattern (LBP) operator.

3. The Proposed Fault Diagnosis Scheme

3.1. Vibration Image Construction

The proposed fault diagnosis scheme, based on the microtexture analysis of vibration images using local binary patterns, is shown in Figure 2. The scheme works on two-dimensional (2D) images that are constructed from the time domain vibration acceleration signals. A vibration acceleration signal is first rectified to get rid of the negative values. The resultant signal is segmented into equal length slices. The length of each slice, $N_s$, is equal to the number of samples in one full shaft cycle or revolution of the bearing and is calculated using

$$N_s = \frac{60 f_s}{\omega}, \tag{1}$$

where $f_s$ is the sampling frequency in Hz and $\omega$ is the shaft speed in revolutions per minute (rpm). The value of the quotient in (1) is rounded off to the nearest multiple of 3 because the local binary pattern (LBP) operator used in this study works on 3 × 3 pixel blocks. A total of $N_c$ slices, each of length $N_s$, are used to construct the vibration fault images. Both dimensions, $N_s$ and $N_c$, of the grayscale vibration fault images are chosen so that they are integer multiples of 3. This ensures an integer number of 3 × 3 pixel blocks in each vibration image and eliminates the chance of any overlap or loss of vibration data during segmentation and stacking. The process of image construction does not change or omit any samples from the original data. It simply projects the original vibration acceleration signals into a 2D grayscale intensity space, where the instantaneous acceleration values can be viewed as pixels.

The intuition behind this method is to observe the behavior of the time domain vibration signal in intervals of $N_c$ cycles and discover unique patterns in that behavior for each fault condition. The number of cycles is determined experimentally such that the selected value of $N_c$ gives the highest classification accuracy, as discussed in Section 4. These cycles of the vibration signal are stacked so that the width of the resulting image is equal to $N_s$ pixels, while its height is equal to $N_c$ lines.

Experiments with images of different dimensions, which are discussed in Section 4, show that images with widths corresponding to cycle lengths display marginal variation in texture due to changes in the shaft frequency or speed.
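The image construction described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: rectification is taken to be the absolute value, pixel intensities are scaled to 8 bits using the peak of the rectified signal, and any trailing samples that do not fill a complete image are left out of the last image. The function names and the default of 15 cycles per image are choices made for this example.

```python
import numpy as np


def cycle_length(fs_hz, shaft_rpm, block=3):
    """Samples per shaft revolution (60 * fs / rpm), rounded to a multiple of `block`."""
    samples_per_rev = 60.0 * fs_hz / shaft_rpm
    return int(round(samples_per_rev / block)) * block


def vibration_images(signal, fs_hz, shaft_rpm, n_cycles=15):
    """Rectify a 1D vibration signal and stack cycle-length slices into images.

    Returns an array of shape (n_images, n_cycles, width) with dtype uint8.
    """
    width = cycle_length(fs_hz, shaft_rpm)
    rectified = np.abs(np.asarray(signal, dtype=float))  # assumed full-wave rectification
    per_image = width * n_cycles
    n_images = rectified.size // per_image
    # Each row of an image holds one shaft cycle; rows are consecutive cycles.
    images = rectified[: n_images * per_image].reshape(n_images, n_cycles, width)
    peak = rectified.max() if rectified.size else 1.0
    if peak == 0:
        peak = 1.0
    # Assumed 8-bit grayscale scaling; the source does not specify the mapping.
    return np.round(255.0 * images / peak).astype(np.uint8)


# Synthetic stand-in for a 12 kHz recording at 1796 rpm.
rng = np.random.default_rng(0)
demo = vibration_images(rng.standard_normal(60_000), fs_hz=12_000, shaft_rpm=1796)
print(demo.shape)
```

Because the width follows the shaft speed, images built from recordings at different speeds have slightly different widths but the same number of rows.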

3.2. Local Binary Patterns

LBPs were first proposed for texture classification [42] as a simple, effective, and computationally efficient technique that is invariant to rotation and changes in illumination. Since then, they have been extensively used in texture classification, image indexing, and facial recognition applications [43]. The use of local binary patterns has also been explored for diagnosing rotor unbalance, broken rotor bars, eccentricity, stator faults, and bowed rotor faults in induction motors [44].

LBP, as the name suggests, looks for micropatterns in small neighborhoods and then constructs a global frequency distribution of those micropatterns across the entire image in the form of a histogram. It is a nonlinear operator, which encodes microtexture information across small neighborhoods into $P$-bit texture descriptors or codes, where $P$ is the number of neighboring pixels that are compared with the central pixel (eight in this study). The histogram of these texture descriptors is used to uniquely identify an image. These micropatterns or texture primitives, as they are called, can be an edge, a corner, a line end, a spot, or a flat area. The LBP operator can be applied to both circular and noncircular neighborhoods. In this study, it is applied to noncircular 3 × 3 pixel neighborhoods, whereas in the literature it has been applied to both circular and noncircular neighborhoods of 3 × 3 pixels and larger dimensions. The LBP operator is computationally simple, yet highly effective [43], and its performance is invariant under changes of illumination. This illumination invariance is especially useful in dealing with noisy vibration signals because the noise mostly affects the illumination level of the grayscale image [44].

The LBP operator constructs the texture descriptor by thresholding each pixel of the neighborhood with respect to its central pixel. Simply put, each of the outer eight pixels of the 3 × 3 neighborhood is compared with the central pixel (the 9th pixel) and replaced by a 1 if it is greater than or equal to the central pixel and by a 0 otherwise. This thresholding reduces the neighboring pixels to binary values, which are then used to construct an 8-bit texture descriptor for that neighborhood. The 8-bit texture descriptor, which has a decimal value within the range 0 to 255, encodes the texture information for that particular neighborhood. This process is repeated for all the neighborhoods in the entire image. The total number of neighborhoods in each image, $N_b$, is determined using

$$N_b = \frac{N_c \times N_s}{n}, \tag{2}$$

where $N_c$ is the height of the image, $N_s$ is its width, and $n$ is the size of each neighborhood. In this study, $n$ has been set to nine pixels, as we are considering only 3 × 3 neighborhoods. These local texture descriptors are then used to construct a global histogram that can be used to uniquely identify the entire image. Mathematically, for a grayscale image $I(x, y)$, if we let $g_p$ denote the gray value of any arbitrary outer pixel in a given neighborhood and let $g_c$ denote the gray value of the central pixel in that neighborhood, then the texture descriptor for a neighborhood with $P$ outer pixels is given as follows [42]:

$$\mathrm{LBP}_P = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \tag{3}$$

where $s(\cdot)$ is the thresholding step function, which is defined as

$$s(x) = \begin{cases} 1, & x \geq 0, \\ 0, & x < 0. \end{cases} \tag{4}$$

The texture descriptor in (3) can have $2^P$ unique values. Therefore, the global histogram for the entire image would require $2^P$ bins. However, a previous study [42] suggests that there are certain micropatterns, called the uniform patterns, which occur more frequently than others and are more discriminative. Uniform patterns have a uniformity measure of at most two, which is calculated by counting the number of binary transitions (i.e., 0 to 1 or 1 to 0) in the texture descriptor. Thus, a texture descriptor that has at most two binary transitions is considered uniform. Among the 256 ($2^P$ for $P = 8$) possible texture descriptors, only 58 are uniform, while the rest are nonuniform. Hence, our global LBP histogram has 59 bins (i.e., 58 bins for the 58 uniform texture descriptors and one for the remaining 198 nonuniform texture descriptors). This global LBP histogram is used to uniquely identify each vibration fault image, and the histograms of images for the same fault condition are expected to be similar to each other. In this study, we use a Euclidean distance based similarity measure for the LBP histograms.
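The encoding described above can be illustrated with a short sketch that computes a 59-bin uniform LBP histogram over non-overlapping 3 × 3 blocks. Several details are assumptions made for this example and are not confirmed by the source: the bit order of the eight neighbors (clockwise from the top-left pixel), counting binary transitions circularly, and normalizing the histogram so that images of different widths are comparable.

```python
import numpy as np

# Positions of the eight outer pixels of a 3x3 block, clockwise from the
# top-left corner (assumed bit order; the source does not specify one).
NEIGHBOR_OFFSETS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def transitions(code, bits=8):
    """Count circular 0->1 and 1->0 transitions in a `bits`-bit code."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")


# Codes with at most two transitions are uniform; each gets its own bin.
UNIFORM_CODES = [c for c in range(256) if transitions(c) <= 2]
BIN_OF_CODE = {code: idx for idx, code in enumerate(UNIFORM_CODES)}
NONUNIFORM_BIN = len(UNIFORM_CODES)  # single shared bin for all other codes


def lbp_histogram(image):
    """Return the normalized 59-bin uniform LBP histogram of a 2D image."""
    img = np.asarray(image)
    rows, cols = img.shape
    hist = np.zeros(NONUNIFORM_BIN + 1)
    for r in range(0, rows - rows % 3, 3):
        for c in range(0, cols - cols % 3, 3):
            block = img[r:r + 3, c:c + 3]
            center = block[1, 1]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
                if block[dr, dc] >= center:  # thresholding step s(g_p - g_c)
                    code |= 1 << bit
            hist[BIN_OF_CODE.get(code, NONUNIFORM_BIN)] += 1
    total = hist.sum()
    return hist / total if total else hist
```

For a 15 × 402 image this visits 5 × 134 = 670 blocks, which matches the height times width divided by nine count given in the text.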

3.3. Classification Using k-NN

The duration of the vibration signals for each fault condition considered in this study spans hundreds of cycles, as mentioned in Table 3. Each signal is converted into multiple grayscale images, which are then encoded into LBP histograms. These histograms are used by the classifier to uniquely identify each fault. In this study, the k-nearest neighbor (k-NN) classifier is used for classification [45]. It uses the Euclidean distance between the histogram of an unknown fault image and the histograms of the training dataset images to classify the unknown image. The Euclidean distance between the histograms $H_1$ and $H_2$ of two fault images is calculated using

$$d(H_1, H_2) = \sqrt{\sum_{b=1}^{B} \left(H_1(b) - H_2(b)\right)^2}, \tag{5}$$

where $B$ is the number of bins in the histogram ($B = 59$ in this study). In this study, $k$, which is the number of training set samples or nearest neighbors considered in determining the class of an unknown sample, is assigned the value of 3. The diagnostic performance of the classifier is evaluated using average classification accuracy, sensitivity, and specificity, which are calculated using (6), (7), and (8), respectively:

$$\text{Accuracy} = \frac{\sum_{i=1}^{C} TP_i}{N} \times 100\%, \tag{6}$$

$$\text{Sensitivity} = \frac{1}{C} \sum_{i=1}^{C} \frac{TP_i}{TP_i + FN_i} \times 100\%, \tag{7}$$

$$\text{Specificity} = \frac{1}{C} \sum_{i=1}^{C} \frac{TN_i}{TN_i + FP_i} \times 100\%, \tag{8}$$

where $TP_i$ is the number of images in class $i$ that are correctly classified as class $i$, $FP_i$ is the number of images not in class $i$ that are classified as class $i$, $TN_i$ is the number of images that are not in class $i$ and are classified as not in class $i$, $FN_i$ is the number of images that are in class $i$ but classified as not in class $i$, $N$ is the total number of images for all classes combined, and $C$ is the number of fault types or classes in the study.
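The classification and evaluation steps can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: ties in the majority vote are resolved in favor of the class encountered first among the nearest neighbors, and sensitivity and specificity are averaged over the classes present in the true labels. The function names are chosen for this example.

```python
import math
from collections import Counter


def euclidean(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def knn_predict(train_hists, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training histograms."""
    order = sorted(range(len(train_hists)),
                   key=lambda i: euclidean(train_hists[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    # Equal counts keep first-seen order, so ties favor the nearest neighbor.
    return votes.most_common(1)[0][0]


def evaluate(true_labels, predicted_labels):
    """Return (accuracy, sensitivity, specificity) in percent."""
    pairs = list(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels))
    accuracy = 100.0 * sum(t == p for t, p in pairs) / len(pairs)
    sens, spec = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        sens.append(tp / (tp + fn))
        # With a single class there are no negatives; treat specificity as 1.
        spec.append(tn / (tn + fp) if (tn + fp) else 1.0)
    return (accuracy,
            100.0 * sum(sens) / len(classes),
            100.0 * sum(spec) / len(classes))
```

In use, `train_hists` would hold the 59-bin histograms of the images from one shaft speed, and each test image from the other speeds would be passed as `query`.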

4. Experimental Results and Analysis

4.1. Determining the Optimal Image Dimensions

Prior to the conversion of the vibration signal into 2D grayscale images, the image dimensions, which would yield a more uniform and robust texture, are determined. It is observed that variation in the image dimensions causes stark variations in texture. This is because stacking different segments of the vibration signal over each other would give rise to a more consistent texture if those segments have good correlation. For a given shaft speed, the images do show a uniform texture, but it may change as we change the shaft speed depending upon the selected image dimension. This is clearly evident in Figure 3, which shows the images for an inner race fault. It is observed that the texture spreads across the image as the shaft speed decreases and hence they look different for each shaft speed. Textures like these, as shown in Figure 3, cannot achieve shaft speed invariance in fault diagnosis, because they are different from one another and a classifier trained on the LBP histogram of one cannot be used to accurately detect the other.

On the contrary, in Figure 4(a), the texture changes little with changes in shaft speed for the same fault condition. The only notable variation occurs in the illumination and global distribution of micropatterns or texture elements across the image. The LBP operator, as discussed earlier, is invariant to changes in illumination and therefore does not suffer any loss in performance. Similarly, the variation of the global spatial distribution of micropatterns does not adversely affect the performance of the LBP operator, as it only considers the frequency of occurrence of micropatterns rather than their spatial distribution. In Figure 4, the fault images have widths equal to the cycle length $N_s$ (the number of samples in one shaft revolution), which is calculated using (1). It can also be observed that the images in Figure 4 have higher aspect ratios as compared to those in Figure 3.

In general, it is observed that images with higher aspect ratios result in better classification accuracy, as shown in Figure 5. The higher aspect ratios prevent the texture from smearing across the image when there are changes in shaft speed. Therefore, in this study, vibration images with higher aspect ratios are used, as shown in Figure 4.

The width of these images is set to the cycle length $N_s$ (the number of samples in one shaft revolution), while their height is usually set to 15 pixels. The height of these images corresponds to the number of cycles of data used in constructing them; that is, one cycle of the vibration acceleration signal contributes one line of pixels to the fault image. It is evident from Figure 6 that the classification accuracy improves as the height of the image (the number of cycles of vibration data used in constructing it) is increased, which is understandable as more cycles of vibration data result in a more distinct pattern for each fault. However, the accuracy levels off when the image height is 15; any increase in image height beyond that point does not improve the classification accuracy.

4.2. Diagnostic Performance of the Proposed Method

In order to validate the proposed method, the available data is divided into four datasets, as shown in Table 3. The speed invariance of the proposed approach is verified by considering four different scenarios. In the first scenario, dataset 1 is used for training, whereas datasets 2, 3, and 4 are used for testing. That is, the classifier is trained using the vibration acceleration signals for 1796 rpm only and then used to classify the unknown fault signals for 1772, 1748, and 1722 rpm. In the second scenario, dataset 2 is used for training, whereas datasets 1, 3, and 4 are used for testing. That is, the classifier is trained using the vibration acceleration signals for 1772 rpm only and then used to classify the unknown fault signals for 1796, 1748, and 1722 rpm. Similarly, in the third and fourth scenarios, datasets 3 and 4 are used as training datasets, respectively, whereas the remaining datasets are used for testing in each case. As mentioned in Table 3, each of these datasets corresponds to vibration signals recorded at a different shaft speed or frequency. Thus we verify the proposed method by training our classifier on a dataset for one shaft speed while testing it on datasets for three different shaft speeds or frequencies and we repeat this process for each of the four shaft frequencies considered in this study.
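The four train/test scenarios described above amount to a leave-all-but-one-speed-out protocol, which can be enumerated as in the sketch below. The dataset numbering follows the text (datasets 1 to 4 correspond to 1796, 1772, 1748, and 1722 rpm); the function name is an assumption made for this example.

```python
# Dataset number -> shaft speed in rpm, as listed in the text.
DATASET_RPM = {1: 1796, 2: 1772, 3: 1748, 4: 1722}


def cross_speed_scenarios(dataset_ids):
    """Pair each dataset (training) with all remaining datasets (testing)."""
    ids = list(dataset_ids)
    return [(train_id, [i for i in ids if i != train_id]) for train_id in ids]


for train_id, test_ids in cross_speed_scenarios(DATASET_RPM):
    test_speeds = ", ".join(str(DATASET_RPM[i]) for i in test_ids)
    print(f"train on {DATASET_RPM[train_id]} rpm, test on {test_speeds} rpm")
```

Each scenario trains a single classifier instance on one speed and never sees the test speeds during training.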

The diagnostic performance of the proposed approach is given in Table 4, which indicates that the proposed method is effective in diagnosing bearing faults independent of variations in shaft speed, both random and planned. As explained earlier, this is due to the uniformity of textures in the vibration images, which are constructed after determining their optimal dimensions. It is observed from the results given in Table 4 that the diagnostic performance of the proposed method in terms of its classification accuracy, sensitivity, and specificity generally improves with increasing height of the fault images, which is equal to the number of cycles of the vibration data used to construct these images. The average classification accuracy reaches a maximum value of 99.74% (see Table 4). In each scenario, the test data comprises three datasets, whereas the training data comprises only one. This suggests that the proposed approach generalizes well compared to existing techniques, which are generally validated using 3-fold cross validation, where two out of three folds are used for training and the remaining one is used for testing.

4.3. Comparison with Vibration Spectrum Imaging-Based Fault Diagnosis

The proposed method is compared with a recent study [16], which uses vibration spectrum imaging (VSI) and artificial neural networks (ANN) to diagnose bearing faults. The VSI-based approach uses data from Case Western Reserve University but considers only dataset 3 (i.e., the dataset for 1748 rpm in Table 3). It takes eight windows of the time domain vibration signal, each 1024 samples in length, and then applies a 513-point fast Fourier transform (FFT) to each window. The resultant spectral information for these eight windows is stacked to create a 513 × 8 pixel grayscale vibration spectrum image. This image is smoothed using a 2D averaging filter (of size 8 × 4), and then the filtered grayscale image is converted into a binary image using thresholding. The threshold is determined by optimizing a cost function, and the optimum value of 0.7 is used for thresholding. The performance of the VSI-based method is highly susceptible to the value of the threshold because it governs the quality of the input vectors to the ANN and hence its classification accuracy. The 513 × 8 binary image (i.e., 4104 binary spectral components) is used as the input to the ANN. The data is divided into training (70%), validation (15%), and testing (15%) sets. In order to compare the proposed approach with VSI, the same network architecture is used, as shown in Figure 7, except for the input layer (the proposed approach uses an input vector of length 59).

The datasets in Table 3 are merged to create one dataset, with four fault conditions (i.e., normal, inner race fault, roller fault, and outer race fault) at four different shaft speeds (i.e., 1796 rpm, 1772 rpm, 1748 rpm, and 1722 rpm). The diagnostic performance of VSI and the proposed approach is shown in Table 5. It is clearly evident that the proposed approach delivers superior diagnostic performance on datasets with variations in shaft speed compared to VSI, which uses the shaft speed dependent spectral information for fault diagnosis. Under variable speed conditions the diagnostic performance of methods that use spectral information is bound to deteriorate, as speed variations drastically change the spectral content of the vibration signals.

5. Conclusion

This paper investigates a new dimension in bearing fault diagnosis and presents a new method that is invariant to both random and premeditated variations in shaft speed. This is a very important aspect of fault diagnosis in bearings since traditional approaches generally diagnose bearing defects by detecting their fundamental defect frequencies. Though highly effective, these techniques have certain limitations. The defect frequencies depend on the nonstationary shaft speed, and variations in shaft speed cause inevitable variations in these fundamental defect frequencies. These variations can be small random variations that are usually tackled by time-frequency and subband analysis of the vibration signals, which makes these approaches tedious and computationally expensive, with costly and difficult practical implementations. In the case of large premeditated variations in shaft speed, the fault data is divided into different datasets depending upon the speed, and each dataset is separately processed for fault diagnosis. Thus, premeditated variations in shaft speed entail the recalculation of features and the training of a new instance of the classifier for every dataset. These crucial limitations justified investigation into a simple, automated approach for fault diagnosis in bearings that is effective under variable speed conditions. The proposed scheme transforms the time domain vibration acceleration signal into grayscale fault images of appropriate dimensions and then classifies those images based upon their unique textures. The image textures are encoded using the local binary pattern operator, which is a highly effective texture descriptor. This study validated the proposed scheme by using fault images for four different operating speeds. A k-NN classifier was trained using images for one operating speed, and then its classification performance was measured by testing it with the fault images for the remaining three operating speeds. This was repeated for the fault images in all the datasets. The classifier yields an average classification accuracy of 99.74%, which shows that the proposed approach is robust to the variations in shaft speed considered in this study. A comparison with a recent technique based upon vibration spectrum imaging shows that the proposed method gives better diagnostic performance. Despite its advantages, there are certain aspects of the proposed approach that need further investigation, such as its performance at relatively low shaft speeds.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this manuscript.

Acknowledgments

This work was supported by Korea Institute of Energy Technology Evaluation and Planning (KETEP) and the Ministry of Trade, Industry & Energy (MOTIE) of the Republic of Korea (no. 20162220100050, no. 20161120100350), in part by the Leading Human Resource Training Program of Regional Neo Industry through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF-2016H1D5A1910564), in part by Business for Cooperative R&D between Industry, Academy, and Research Institute funded by Korea Small and Medium Business Administration in 2016 (Grants nos. C0395147, S2381631), in part by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A3B03931927), and in part by the Development of a Basic Fusion Technology in Electric Power Industry (Ministry of Trade, Industry & Energy, 201301010170D).