Abstract

Bearing faults are the biggest single source of motor failures. Artificial Neural Networks (ANNs) and other decision support systems are widely used for early detection of bearing faults. Typical decision support systems require feature extraction and classification as two distinct phases. Extracting fixed features each time may incur a significant computational cost, preventing the use of such systems in real-time applications. Furthermore, the features selected for the classification phase may not be the optimal choice. In this paper, the use of 1D Convolutional Neural Networks (CNNs) is proposed for a fast and accurate bearing fault detection system. The feature extraction and classification phases of bearing fault detection are combined into a single learning body with the implementation of a 1D CNN. The raw vibration data (signal) is fed into the proposed system as input, eliminating the need to run a separate feature extraction algorithm each time vibration data is analyzed for classification. Implementation of 1D CNNs results in more efficient systems in terms of computational complexity. The classification performance of the proposed system with real bearing data demonstrates that the reduced computational complexity is achieved without a compromise in fault detection accuracy.

1. Introduction

Electric machines are used widely in many commercial and industrial applications. The total number of operating machines in the world increased by about 50% in the last five years, reaching around 16.1 billion in 2011 [1]. Rolling element bearings are among the most widely used elements in machines, and their failure is the single biggest cause of machine breakdowns [2]. Therefore, a great amount of research effort is directed towards monitoring of bearing health. Motor vibration analysis and motor current signature analysis (MCSA) are two commonly used noninvasive methods in bearing condition monitoring [3].

Bearing fault detection systems usually employ ANNs or other classifiers to provide better detection rates. Such intelligent systems typically consist of three main parts: data acquisition, feature extraction and selection, and data classification. Time-domain [4], frequency-domain [5–8], enhanced frequency [9–12], and time-scale analysis [13–16] are the four main areas where signal processing techniques are used in feature extraction [17]. The extracted features are used to both train and operate Artificial Neural Networks (ANNs) or other decision support systems (DSS) [18–31]. Extracting fixed features each time data is analyzed by a DSS may require a significant amount of computational effort. Furthermore, the selection of suboptimal features may degrade the performance of the DSS.

Feature extraction and feature-based classification are two distinct phases of conventional decision support systems. With 1D CNNs, these two phases of bearing fault detection can be combined into a single learning body [32, 33]. The proposed method therefore requires no transformation, feature extraction, or postprocessing: it works directly on the raw data, that is, the vibration signal, to detect anomalies. Here, the time-domain vibration data is fed directly into a 1D CNN that has been trained offline using back-propagation (BP). The only preprocessing involved is resampling to bring the raw data into the input format required by the CNN. An overview of the proposed 1D Convolutional Neural Network, with its offline training and fault detection phases, is illustrated in Figure 1.

A brief introduction to bearing faults is provided in the next section. The adaptive 1D CNN structure is introduced in Section 3. The bearing vibration dataset and the preprocessing of the raw data are covered in Section 4. The results obtained by applying the proposed 1D CNN to the bearing vibration data are discussed using standard performance metrics in Section 5. Finally, the conclusion and the future direction of the research are presented in Section 6.

2. Bearing Faults

Both mechanical and electrical faults may cause induction motor failures. Bearing faults, which are mechanical, are the single biggest cause, accounting for 65–70% of all motor failures. Bearing replacement is the least expensive repair, but bearing faults are the most difficult to detect at early stages. Therefore, a considerable amount of research is directed at this area.

Bearing faults are categorized into two types: generalized-roughness faults and single-point faults. The generalized-roughness type is a distributed and noncyclic fault caused by improper lubrication, erosion, or pollution. There is no identifiable characteristic frequency associated with this fault type. The single-point type is a localized fault generally caused by a small hole, a pit, or missing material. A single-point fault creates periodic impacts that generate vibration at specific frequencies when the bearing is run at a constant speed. The magnitudes of the vibration signals increase as defects initiate and deteriorate further. Therefore, this study focuses on single-point bearing faults.

Bearing fault characteristic frequencies can be grouped into four different zones: shaft speed zone, bearing defect frequency zone, bearing natural resonances zone, and high frequency zone [34]. The first zone contains harmonics of rotor speed related vibration frequencies. Healthy bearings will have some energy associated with shaft phenomena such as unbalance or misalignment. The initial stage of a bearing fault is indicated by energy in both zones I and IV, where the latter zone contains high frequency components over 20 kHz. In the second stage of the fault, zone III with bearing natural frequencies will have some energy, with increased energy levels in zone IV. The third stage is identified by bearing defect related frequencies becoming apparent in zone II and increased energy levels in the other three zones. In the final stage, the bearing defect frequencies and their harmonics become more pronounced. The frequency content for all four stages of the bearing failure is depicted in Figure 2. In the figure, BSF, BPFO, and BPFI stand for ball spin frequency, ball pass frequency outer ring, and ball pass frequency inner ring, respectively.

This paper focuses on the detection of bearing fault frequencies in zone II for the final two stages of a bearing fault. Bearing fault frequencies are determined from bearing geometry and shaft speed. The geometry of a typical ball bearing is depicted in Figure 3.

The characteristic vibration frequencies are calculated by the following equations [35]:

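Outer race fault frequency, $f_{OD}$, is given by

$$f_{OD} = \frac{n}{2} f_{rm} \left(1 - \frac{BD}{PD} \cos\phi\right), \tag{1}$$

where the number of balls is $n$, the rotor speed in revolutions per second is $f_{rm}$, the contact angle (zero for ball bearings) is $\phi$, and $BD$ and $PD$ are the ball diameter and the pitch diameter, respectively.

Inner race fault frequency, $f_{ID}$, is expressed as

$$f_{ID} = \frac{n}{2} f_{rm} \left(1 + \frac{BD}{PD} \cos\phi\right). \tag{2}$$

Cage fault frequency, $f_{CD}$, is given by

$$f_{CD} = \frac{1}{2} f_{rm} \left(1 - \frac{BD}{PD} \cos\phi\right). \tag{3}$$

Ball fault frequency, $f_{BD}$, is given by

$$f_{BD} = \frac{PD}{2\,BD} f_{rm} \left(1 - \frac{BD^2}{PD^2} \cos^2\phi\right). \tag{4}$$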
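As a worked check of these expressions, the following minimal Python sketch evaluates the four characteristic frequencies from the bearing geometry. It assumes the standard kinematic forms of the defect frequencies (outer race, inner race, cage, and ball) in terms of the number of rolling elements, shaft speed, ball and pitch diameters, and contact angle; the function name and dictionary keys are illustrative only. The numerical values are those quoted for the test bearing in Section 4.

```python
import math


def bearing_fault_frequencies(n_balls, shaft_hz, ball_diameter,
                              pitch_diameter, contact_angle_deg):
    """Return characteristic fault frequencies in Hz.

    Assumes the standard kinematic formulas; shaft_hz is the rotor speed
    in revolutions per second, and both diameters share the same unit.
    """
    ratio = ball_diameter / pitch_diameter
    c = math.cos(math.radians(contact_angle_deg))
    return {
        "outer_race": 0.5 * n_balls * shaft_hz * (1.0 - ratio * c),
        "inner_race": 0.5 * n_balls * shaft_hz * (1.0 + ratio * c),
        "cage": 0.5 * shaft_hz * (1.0 - ratio * c),
        "ball": pitch_diameter / (2.0 * ball_diameter) * shaft_hz
                * (1.0 - (ratio * c) ** 2),
    }


# Values quoted in Section 4: 16 rollers, 2000 RPM, roller diameter 0.331 in.,
# pitch diameter 2.815 in., contact angle 15.17 degrees.
freqs = bearing_fault_frequencies(16, 2000.0 / 60.0, 0.331, 2.815, 15.17)
for name, value in freqs.items():
    print(f"{name}: {value:.1f} Hz")
```

With these inputs the outer race frequency comes out at roughly 236 Hz, in line with the value used in Section 4. Note that the outer and inner race frequencies always sum to the number of rolling elements times the shaft speed, which is a convenient sanity check.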

3. Adaptive 1D CNNs

The feature extraction and fault detection (also learning) phases of the raw bearing vibration signals are fused together with an adaptive 1D CNN configuration in this study. Any input layer dimension can be handled with the adaptive CNN topology. Furthermore, the hidden neurons of the convolution layers in the proposed compact CNN can perform both convolution and subsampling operations, as illustrated in Figure 4. The 1D CNNs are composed of an input layer, hidden CNN and MLP layers, and an output layer. What distinguishes the CNN layers is that convolution and subsampling are fused together in each of them; the remaining layers are MLP layers.

The main difference between the proposed 1D CNNs and the traditional 2D CNNs is the use of 1D arrays instead of 2D arrays for both feature maps and kernels. Consequently, 1D convolution (conv1D) and array reversal (rev) replace 2D convolution (conv2D) and lateral rotation (rot180). The parameters for kernel size and subsampling (ss) are now scalars in the case of 1D CNNs. On the other hand, the MLP layers are identical in both cases and use the same traditional BP formulation.

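The 1D forward propagation (FP) from convolution layer $l-1$ to the input of a neuron in layer $l$ is expressed as

$$x_k^l = b_k^l + \sum_{i=1}^{N_{l-1}} \mathrm{conv1D}\left(w_{ik}^{l-1}, s_i^{l-1}\right),$$

where the scalar bias of the $k$th neuron, $b_k^l$, the output of the $i$th neuron at layer $l-1$, $s_i^{l-1}$, and the kernel from the $i$th neuron at layer $l-1$ to the $k$th neuron at layer $l$, $w_{ik}^{l-1}$, are used to determine the input $x_k^l$ at layer $l$.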

The intermediate output of the neuron, $y_k^l$, is a function of the input, $x_k^l$, and the output of the neuron at layer $l$, $s_k^l$, is a subsampled version of $y_k^l$:

$$y_k^l = f\left(x_k^l\right), \qquad s_k^l = y_k^l \downarrow ss.$$

The adaptive CNN configuration in Figure 4 requires the automatic assignment of the subsampling factor, $ss$, to 8 in the last CNN layer (the output CNN layer) since the array size at that layer is 8. The adaptive CNN design allows the processing of different raw data lengths with any number of CNN layers and different subsampling factors.
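The forward pass of a single CNN-layer neuron described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in this work: conv1D is taken to be a "valid" sliding dot product without kernel flipping, the activation is assumed to be tanh, and subsampling is done by averaging blocks of ss elements. All function names are hypothetical.

```python
import math


def conv1d_valid(signal, kernel):
    """Sliding dot product without kernel flip (assumed meaning of conv1D)."""
    k = len(kernel)
    return [sum(kernel[r] * signal[n + r] for r in range(k))
            for n in range(len(signal) - k + 1)]


def subsample_average(values, ss):
    """Average consecutive, non-overlapping blocks of ss elements."""
    return [sum(values[i:i + ss]) / ss
            for i in range(0, len(values) - ss + 1, ss)]


def cnn_neuron_forward(prev_outputs, kernels, bias, ss):
    """Forward pass of one neuron in a CNN layer.

    prev_outputs: output arrays of the neurons in the previous layer.
    kernels: one 1D kernel per previous-layer neuron, all of equal length.
    Returns the input x, the intermediate output y, and the output s.
    """
    length = len(prev_outputs[0]) - len(kernels[0]) + 1
    x = [bias] * length
    for s_prev, kernel in zip(prev_outputs, kernels):
        for n, value in enumerate(conv1d_valid(s_prev, kernel)):
            x[n] += value
    y = [math.tanh(v) for v in x]  # tanh is an assumed activation
    s = subsample_average(y, ss)
    return x, y, s


# Two previous-layer maps of length 10 and kernels of size 3 give an
# 8-element map, so a subsampling factor of 8 reduces it to one value.
prev = [[0.1 * i for i in range(10)], [1.0] * 10]
kernels = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
x, y, s = cnn_neuron_forward(prev, kernels, bias=0.1, ss=8)
print(len(x), len(s))
```

The example mirrors the adaptive behavior noted above: when the map entering the output CNN layer has 8 elements, choosing ss = 8 collapses it to a scalar that can be passed to the MLP layers.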

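Looking at the back-propagation (BP) steps, the BP of the error starts from the output MLP layer. The mean-squared error (MSE) in the output layer for the input $p$, $E_p$, is expressed as

$$E_p = \mathrm{MSE}\left(t^p, \left[y_1^L, \ldots, y_{N_L}^L\right]\right) = \sum_{i=1}^{N_L} \left(y_i^L - t_i^p\right)^2,$$

where $l = 1$ is the input layer, $l = L$ is the output layer, $N_L$ is the number of classes in the database, $p$ is the input vector, and $t^p$ and $\left[y_1^L, \ldots, y_{N_L}^L\right]$ are its corresponding target and output vectors, respectively.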

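The objective of the BP is to minimize the contributions of network parameters to this error. The derivatives of the MSE, $E_p$, with respect to an individual weight, $w_{ik}^{l-1}$, and bias, $b_k^l$, of the neuron $k$ are computed to minimize their contributions to the MSE. Here, the gradient descent method is used in an iterative manner. Specifically, the bias of neuron $k$ and all weights of the neurons in the previous layer are updated using $\Delta_k^l$, the delta of neuron $k$ at layer $l$. The regular (scalar) BP is simply performed from the first MLP layer to the last CNN layer as

$$\frac{\partial E_p}{\partial s_k^l} = \Delta s_k^l = \sum_{i=1}^{N_{l+1}} \frac{\partial E_p}{\partial x_i^{l+1}} \frac{\partial x_i^{l+1}}{\partial s_k^l} = \sum_{i=1}^{N_{l+1}} \Delta_i^{l+1} w_{ki}^l.$$

Once the first BP is performed from the layer $l+1$ to the layer $l$, we can further back-propagate it to the input delta, $\Delta_k^l$. Writing the zero-order upsampled map as $us_k^l = \mathrm{up}\left(s_k^l\right)$, $\Delta_k^l$ is written as

$$\Delta_k^l = \frac{\partial E_p}{\partial y_k^l} \frac{\partial y_k^l}{\partial x_k^l} = \frac{\partial E_p}{\partial us_k^l} \frac{\partial us_k^l}{\partial y_k^l} f'\left(x_k^l\right) = \mathrm{up}\left(\Delta s_k^l\right) \beta f'\left(x_k^l\right),$$

where $\beta = (ss)^{-1}$ since each element of $s_k^l$ was obtained by averaging $ss$ elements of the intermediate output, $y_k^l$. The inter BP of the delta error ($\Delta s_k^l \leftarrow \Delta_i^{l+1}$) is expressed as

$$\Delta s_k^l = \sum_{i=1}^{N_{l+1}} \mathrm{conv1Dz}\left(\Delta_i^{l+1}, \mathrm{rev}\left(w_{ki}^l\right)\right),$$

where rev(·) is used for reversing the array and conv1Dz(·,·) is used for performing full convolution in 1D with zero padding.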

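Here, the weight and bias sensitivities are expressed as

$$\frac{\partial E_p}{\partial w_{ki}^l} = \mathrm{conv1D}\left(s_k^l, \Delta_i^{l+1}\right), \qquad \frac{\partial E_p}{\partial b_k^l} = \sum_n \Delta_k^l(n).$$

As a result, the BP algorithm given in [33] is used iteratively with the learning factor, $\varepsilon$, scaling the weight and bias updates.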
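The back-propagation steps through a CNN layer can be illustrated with a small Python sketch. It is a simplified single-connection example, not the implementation used in this work, and it assumes that the forward convolution is a "valid" sliding dot product without kernel flipping; under that assumption, the delta passed back to the previous layer is the zero-padded full version of the same operation applied with the reversed kernel. Function names are hypothetical.

```python
def conv1d_valid(signal, kernel):
    """Sliding dot product without kernel flip: out[n] = sum_r kernel[r] * signal[n + r]."""
    k = len(kernel)
    return [sum(kernel[r] * signal[n + r] for r in range(k))
            for n in range(len(signal) - k + 1)]


def conv1d_full_zero_padded(signal, kernel):
    """Same operation after padding len(kernel) - 1 zeros on each side."""
    pad = [0.0] * (len(kernel) - 1)
    return conv1d_valid(pad + list(signal) + pad, kernel)


def rev(array):
    """Reverse a 1D array."""
    return list(reversed(array))


def backprop_through_kernel(s_k, w_ki, delta_next):
    """Back-propagate the delta of one next-layer neuron through one kernel.

    Returns the delta with respect to the previous-layer output s_k, the
    kernel sensitivity, and the bias sensitivity of the next-layer neuron.
    """
    delta_s = conv1d_full_zero_padded(delta_next, rev(w_ki))
    grad_w = conv1d_valid(s_k, delta_next)
    grad_b_next = sum(delta_next)
    return delta_s, grad_w, grad_b_next


def delta_before_subsampling(delta_s, x, ss, f_prime):
    """Zero-order upsample delta_s, scale by beta = 1 / ss, apply f'(x)."""
    beta = 1.0 / ss
    upsampled = [d for d in delta_s for _ in range(ss)]
    return [u * beta * f_prime(v) for u, v in zip(upsampled, x)]


s_k = [0.5, -1.0, 2.0, 0.0, 1.5]   # output of a previous-layer neuron
w_ki = [0.2, -0.4, 0.1]            # kernel to a next-layer neuron
delta_next = [1.0, -2.0, 0.5]      # delta of that next-layer neuron
delta_s, grad_w, grad_b_next = backprop_through_kernel(s_k, w_ki, delta_next)
print(delta_s, grad_w, grad_b_next)
```

In a full network the contributions from all next-layer neurons are summed to form the delta of each previous-layer output, and each parameter is then moved against its sensitivity scaled by the learning factor.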

4. Bearing Fault Data Preparation

In this section, vibration data from the NASA Prognostic Data Repository is used. The data was generated by the Center for Intelligent Maintenance Systems (IMS, http://www.imscenter.net) of NSF with the support of Rexnord Corp. in Milwaukee, WI [36]. The test setup, consisting of four double row bearings on one shaft, is depicted in Figure 5 [37].

The shaft is driven by a belt coupled to a motor at a constant speed of 2000 RPM throughout the data collection process. The debris collected by a magnetic plug is used to indicate the degradation in bearing health. Data is collected until the accumulated debris adhering to the magnetic plug exceeds a fixed threshold level. The vibration data is collected from accelerometers installed on each bearing housing once every twenty minutes for thirty-five days. Each recording is about one second long and contains 20,480 points sampled at 20 kHz. The proposed bearing fault detection algorithm is applied to the data collected from the accelerometer mounted on bearing 4, which develops an outer race defect. Rexnord ZA-2115 double row bearings with 16 rollers in each row are used in the test rig shown in Figure 5. These bearings have a pitch diameter of 2.815 in., a roller diameter of 0.331 in., and a tapered contact angle of 15.17°. With these values, (1) yields an outer race fundamental vibration frequency of about 236 Hz at the rotational speed of 2000 RPM.

The raw input vibration signal is decimated by 8 so that the system can be implemented with a less complex CNN configuration. The original data was collected at 20 kHz; as a result, frequencies up to 10 kHz can be observed in the raw data. The fundamental bearing fault frequency is 236 Hz for the dataset used in the analysis, and its first five integer multiples (up to 1180 Hz) are enough for fault detection. Decimating the original data by 8 reduces the sampling rate to 2.5 kHz and results in a bandwidth of 1250 Hz, which still covers these harmonics. The signal is low-pass filtered as part of the decimation to avoid aliasing and is then normalized before being input to the 1D CNN classifier. Finally, the effects of dc offset and biases are removed by normalizing the input data to have zero mean before feeding it to the CNN classifier. The use of lower complexity CNN configurations helps with both training and detection speeds. The spectrum of the vibration signal for a healthy bearing before and after preprocessing is depicted in Figure 6.

The spectrum of vibration signal for a faulty bearing before and after preprocessing is depicted in Figure 7.
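The preprocessing chain of this section (anti-alias low-pass filtering, decimation by 8, and zero-mean normalization) can be sketched in plain Python as follows. The filter design is an assumption made only for illustration: a 201-tap Hamming-windowed sinc with a 1200 Hz cutoff, and peak scaling as the normalization. The paper does not specify the filter or the scaling actually used, and all names below are hypothetical.

```python
import math


def lowpass_fir(num_taps, cutoff_hz, fs_hz):
    """Hamming-windowed sinc low-pass filter with unity DC gain (odd num_taps)."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / fs_hz  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        t = n - mid
        sinc = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [tap / total for tap in taps]


def filter_and_decimate(signal, factor, taps):
    """Evaluate the filtered signal only at every factor-th sample."""
    half = (len(taps) - 1) // 2
    out = []
    for center in range(0, len(signal), factor):
        acc = 0.0
        for j, h in enumerate(taps):
            idx = center + j - half
            if 0 <= idx < len(signal):  # samples outside the record count as zero
                acc += h * signal[idx]
        out.append(acc)
    return out


def normalize(signal):
    """Remove the mean (dc offset) and scale to unit peak amplitude."""
    mean = sum(signal) / len(signal)
    centered = [v - mean for v in signal]
    peak = max(abs(v) for v in centered) or 1.0
    return [v / peak for v in centered]


FS_HZ = 20000.0
FACTOR = 8
taps = lowpass_fir(201, 1200.0, FS_HZ)

# Synthetic stand-in for one 20,480-point recording: a 236 Hz tone plus dc offset.
raw = [0.5 + math.sin(2.0 * math.pi * 236.0 * n / FS_HZ) for n in range(20480)]
prepared = normalize(filter_and_decimate(raw, FACTOR, taps))
print(len(prepared), FS_HZ / FACTOR)
```

A 20,480-point recording becomes 2,560 points at 2.5 kHz. The cutoff is a compromise: placing it close to 1250 Hz keeps the fifth harmonic at 1180 Hz but leaves little room for the transition band, so a longer or sharper filter may be preferable in practice.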

5. Experimental Results

In this section, the experimental results for the proposed bearing condition monitoring approach are presented. First, the details of the CNN structure are provided including the format required for input vibration data. Then, commonly used metrics in the literature such as classification accuracy (Acc), sensitivity (Sen), specificity (Spe), and positive predictivity (Ppr) are used to evaluate the performance of the proposed system.

5.1. Experimental Setup

The 1D CNN for the proposed bearing fault detection system has a simple configuration with only three hidden convolution layers and two MLP layers. The three hidden convolution layers have 60, 40, and 40 neurons, respectively, whereas the hidden MLP layer has 20 neurons. The input to the 1D CNN is 240 (time-domain) samples of the bearing vibration data, and the output MLP layer has size 2, indicating the faulty and healthy classes. The results show that deep and complex CNN configurations are not necessary for achieving high detection rates. The proposed structure provides computational efficiency for both the training and detection stages of the design.

5.2. Detection Performance Evaluation

The hit/miss counters such as true positive (TP), false negative (FN), true negative (TN), and false positive (FP) are used commonly in expressing standard performance metrics such as accuracy, sensitivity, specificity, and positive predictivity. The real bearing vibration data samples for a total of 260 runs are used in both healthy (H) and faulty (F) cases. NASA Prognostic Data Repository dataset is used for testing. A C++ program developed in MS Visual Studio 2013 is used to implement the proposed adaptive 1D CNN classifier. The code utilizes 10-fold cross-validation technique to prevent overfitting and improve generalization in training the 1D CNNs. The confusion matrix obtained from all (5) test runs of the proposed system is presented in Table 1.
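The 10-fold cross-validation mentioned above can be illustrated with a short Python sketch that only produces the index splits. It is a generic k-fold partition with a seeded shuffle, given as an assumption about the procedure since the exact splitting scheme is not described here; the function name and the seed are hypothetical, and 260 is used as the sample count purely because that is the number of runs quoted in the text.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        splits.append((sorted(train), sorted(fold)))
    return splits


splits = k_fold_indices(260, 10)
for fold_number, (train, test) in enumerate(splits, start=1):
    print(fold_number, len(train), len(test))
```

Each sample is held out exactly once, so every fold here trains on 234 samples and tests on the remaining 26.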

Accuracy is defined as the ratio of the number of correctly classified patterns to the total number of classified patterns, $\mathrm{Acc} = (TP + TN)/(TP + TN + FP + FN)$. Sensitivity is the ratio of correctly classified fault events to all fault events in testing, $\mathrm{Sen} = TP/(TP + FN)$. Specificity is the ratio of correctly classified healthy events to all healthy events in testing, $\mathrm{Spe} = TN/(TN + FP)$. Finally, positive predictivity (precision) is the ratio of correctly classified fault events to all classified fault events, $\mathrm{Ppr} = TP/(TP + FP)$. The performance metrics are then easily calculated using the confusion matrix of Table 1. The performance of the proposed system is compared with three commonly used classifiers in the literature: Multilayer Perceptron (MLP), Radial Basis Function Networks (RBFN), and Support Vector Machines (SVM).
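The four metrics follow directly from the hit/miss counters, as the minimal Python sketch below shows. The counts in the example are hypothetical placeholders chosen for illustration; they are not the values of Table 1.

```python
def detection_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity, and positive predictivity."""
    return {
        "Acc": (tp + tn) / (tp + tn + fp + fn),
        "Sen": tp / (tp + fn),
        "Spe": tn / (tn + fp),
        "Ppr": tp / (tp + fp),
    }


# Hypothetical confusion-matrix counts (not taken from Table 1).
metrics = detection_metrics(tp=50, fn=2, tn=45, fp=3)
for name, value in metrics.items():
    print(f"{name}: {100.0 * value:.2f}%")
```

Here a "positive" is a fault event, so sensitivity measures how many faults are caught while specificity measures how many healthy recordings are left unflagged.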

All four metrics for the proposed system and for the existing classifiers of similar complexity are provided in Table 2. The results demonstrate that the proposed system performs satisfactorily and that deep and complex CNN configurations are not necessary for improving detection performance.

6. Conclusions

Typical decision support systems require feature extraction and classification as two distinct phases. The feature extraction phase of such systems involves signal processing techniques for preprocessing the data in both the training and the real-time fault classification parts. The need to extract features from raw data for the fault classification phase places an additional computational burden on such systems when commonly used algorithms such as the fast Fourier transform and wavelet decomposition are implemented. In this paper, an adaptive implementation of 1D Convolutional Neural Networks (CNNs) is proposed for bearing health monitoring. The proposed system fuses the feature extraction and classification blocks of a commonly employed fault detection approach into a single learning body. Here, the convolutional layers of the proposed 1D CNN learn to extract optimized features from raw data with BP training, while the classification is performed by the MLP layers. Since the raw bearing vibration data is fed directly into the proposed system, the computational burden due to feature extraction is eliminated in the fault detection phase. The proposed system is tested with real bearing vibration data, and a fault detection accuracy of over 97% was achieved in the experimental results. The performance comparison of the proposed system with three commonly used classifiers of similar complexity (MLP, RBFN, and SVM) indicates that the proposed system does not compromise on detection accuracy.

The future direction of the research is twofold: hardware implementation of the algorithm and classification of the fault stages. First, FPGA or ASIC implementations of the proposed system would be cost effective since only scalar multiplications and additions are required for 1D CNNs with 1D convolutions [38]; a hardware implementation of the proposed algorithm would therefore be useful. Second, the test bearings were run to failure, providing data at different fault stages, so the proposed algorithm can be modified to classify the fault as being in its initial or advanced stage.

Conflicts of Interest

The author declares that there are no conflicts of interest.