Abstract

The ability to identify incipient faults at an early stage in the operation of machinery has been demonstrated to provide substantial value to industry. These benefits are presently difficult to achieve for automated, in situ, and online monitoring of machinery, structures, and systems that are subject to varying operating conditions and that run in operationally constrained environments demanding uninterrupted operation. This work focuses on developing a simple algorithm for this problem class; novelty detection is deployed on feature vectors generated from the cross correlation of vibration signals from sensors mounted on disparate locations in a power train. The behavior of these signals in a gearbox subject to varying load and speed is expected to remain in a commensurate state until some physical aspect of the mechanical components changes, a change presumed to be indicative of gearbox failure. Cross correlation will be demonstrated to generate excellent classification results for a gearbox subject to independently changing load and speed. It eliminates the need to analyze the highly complex dynamics of this system; it generalizes well across untaught ranges of load and speed; it eliminates the need to identify and measure all predominant time-varying parameters; it is simple and computationally inexpensive.

1. Introduction

The dynamics of the vibrations generated by a gearbox subject to changing load and speed are complex and nonlinear. Faults in bearings, gears, or other aspects of prime movers can easily be masked by the effects of these state changes alone when one fails to consider their effects on decision rules. The detection of faults in this class of machinery is a growing concern in the literature. In this work, we adapt a technique from sensor failure analysis to reduce the complexity of the present problem. A common approach in detecting failure in sensors employs decision rules based on the cross correlation of their signals; in extending this technique to variable-state machinery, the authors note that vibrations at disparate locations in a power train should be correlated to one another (e.g., the spectra of vibrations from the output shaft of a gearbox are related to those of the input shaft by the gear ratio of the gearbox). Signals from disparate locations of a power train may contain similar vibration from components along the train; for instance, the load on the gearbox’s bearings is modulated by the meshing of the gear’s teeth and its vibrations or acoustics will be apparent at both the input and the output of the gearbox (and possibly at more distant locations in the train; see [1]). The cross correlation signal between these vibration signals should remain commensurate until components of the train change, a state presumed indicative of faults.

Under this hypothesis, the authors propose deploying standard novelty detection on feature vectors generated from the cross correlation signal generated between disparate vibration sensors. Past efforts by the authors focused on adapting either novelty-detection techniques or feature vectors in order to address this problem. These algorithms required the investigators to measure all predominant state parameters and to include them in the algorithm [1, 2]. While the proposed techniques were shown to work well, they suffered from various limitations. Some classification schemes work only for one changing system input parameter [1]. Others require measurement of a gearbox’s load which can be either a costly or cumbersome requirement when an inline load cell needs to be installed on a system not fitted with it. Finally, the computational complexity of others requires large processing facilities not typically available on distributed embedded systems employed in condition monitoring. The cross correlation technique should eliminate or mitigate all of these drawbacks. This approach should provide an excellent means of failure detection in systems whose dynamics are too complex for traditional approaches and consequently may extend well beyond the monitoring of variable load and speed gearboxes.

To validate these conclusions, the necessary theoretical background is first explored including a review of cross correlation and how it is presently employed in this field as well as an overview of other existing approaches for solving this class of problems. The underlying methodology is subsequently described, from a description of the employed mechanical test bench to the details of each of the steps in the classification problem. Finally, the results are demonstrated to establish the flexibility of this simple approach.

2. Background

The mathematics of cross correlation is first reviewed followed by an overview of related existing techniques.

2.1. Cross Correlation Analysis

Cross correlation analysis provides a signal representing the measure of the similarity between two signals $f$ and $g$ as a function of time lag $\tau$, defined as

$$(f \star g)(\tau) = \int_{-\infty}^{\infty} f^{*}(t)\, g(t + \tau)\, dt,$$

where $\star$ denotes the cross correlation function and $^{*}$ denotes complex conjugation; similarly, it can be expressed in discrete form

$$(f \star g)[n] = \sum_{m=-\infty}^{\infty} f^{*}[m]\, g[m + n].$$

It is used extensively in pattern recognition for speech, fingerprint and face recognition, automatic target recognition, and so forth. In these applications, typically one cross correlates a reference pattern with a test pattern when the two patterns are expected to lack shift invariance. The cross correlation signal between two patterns will have a peak at the shifted value $\tau$ if they have some similarity.
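
As a minimal illustration of the discrete form for real-valued signals (where complex conjugation has no effect), the following Python sketch cross correlates a short pulse with a delayed copy of itself and locates the peak. The function name `cross_correlate` and the sample data are hypothetical and chosen only for illustration; they are not taken from the experimental work.

```python
def cross_correlate(x, y):
    """Discrete cross correlation of two real sequences.

    Returns (lags, values) where values[i] = sum_m x[m] * y[m + lags[i]],
    with terms outside either sequence treated as zero.
    """
    lags = list(range(-(len(x) - 1), len(y)))
    values = []
    for lag in lags:
        total = 0.0
        for m, xm in enumerate(x):
            j = m + lag
            if 0 <= j < len(y):
                total += xm * y[j]
        values.append(total)
    return lags, values


# A short pulse and a copy of it delayed by 3 samples (illustrative data).
reference = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]

lags, values = cross_correlate(reference, delayed)
peak_lag = lags[values.index(max(values))]
print(peak_lag)
```

The peak falls at a lag of 3 samples, matching the delay applied to the copy.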

2.2. Cross Correlation of Systems Subject to Common Excitation

In this work, signals from disparate aspects of machinery, under common excitation, are cross correlated in order to simplify discerning the system’s health when the excitation is nonstationary. If two linear systems, with impulse response functions $h_1(t)$ and $h_2(t)$, are commonly forced with some function $f(t)$, having equivalent frequency domain representation of $F(\omega)$, the particular solutions for the systems’ response will be the product of the forcing function and the system’s impulse response in the frequency domain for all $\omega$; that is, $Y_1(\omega) = H_1(\omega) F(\omega)$ and $Y_2(\omega) = H_2(\omega) F(\omega)$. From elementary Laplace and Fourier transform theory, it is known that the frequency domain representation of the convolution of two signals is the product of their frequency domain representations. Cross correlation is equivalent to the convolution operation except without the folding operation; as such, the frequency domain representation of the cross correlation of two signals is the product of the two signals’ frequency domain representations, with one of them complex conjugated. Since the linear systems are forced with the same function, their output signals’ bandwidths overlap and the frequency domain representation of the cross correlation operation returns a product of the two systems’ impulse response functions and the forcing function. The impulse response function for each system is determined by the system’s parameters (e.g., for a spring, the impulse response function is a function of the spring’s stiffness, the damping constant, etc.). The cross correlation of the two systems’ output therefore is a relation given by the systems’ parameters. If any parameters of a system change, the cross correlation of the two systems’ output will change; it is on this basis that this work is advanced.
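
The argument above can be sketched in symbols under the idealized linear assumptions. The notation is an assumption made here for illustration: $y_1$ and $y_2$ denote the two systems' outputs, $h_1$ and $h_2$ their impulse responses, $f$ the common forcing function, capital letters the corresponding Fourier transforms, and $\star$ the cross correlation with its first argument conjugated.

```latex
\begin{align}
Y_1(\omega) &= H_1(\omega)\,F(\omega), \qquad
Y_2(\omega) = H_2(\omega)\,F(\omega), \\
\mathcal{F}\{y_1 \star y_2\}(\omega)
  &= Y_1^{*}(\omega)\,Y_2(\omega)
   = H_1^{*}(\omega)\,H_2(\omega)\,\lvert F(\omega)\rvert^{2}.
\end{align}
```

Under these assumptions, the forcing function enters the cross spectrum only through its magnitude $\lvert F(\omega)\rvert^{2}$, while the phase is set by the two systems' parameters through $H_1^{*}(\omega)\,H_2(\omega)$; a change in either system's parameters therefore alters the cross correlation.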

The vibration from a gearbox is inherently nonlinear and some of the assumptions of the foregoing therefore break down. Complex pattern-recognition techniques like novelty detection are engaged to handle these aspects.

Sampled systems are discrete in nature which was not presumed in the above analysis. The discrete systems under scrutiny herein are made discrete by sampling the continuous phenomena. The argumentation of the above is very similar in discrete form and a direct analogy can be made between the transforms of discretized form and the continuous form.

2.3. Relevant Cross Correlation Techniques from the Literature

Cross correlation is used heavily in signal processing for denoising purposes. Several examples of denoising in the domain of fault detection can be found in the literature; in [3], the authors used cross correlation from two proximate vibration sources for signal-to-noise ratio improvement while [4] used cross and autocorrelation for denoising. The authors in [5] exploited the auto- and cross correlation of different variables for signal processing in developing a fault-detection technique.

Cross correlation is used in a similar vein as the present approach in the detection of failed sensors as was the case in [6] whose authors used cross correlation between two flow sensors along with neural networks to verify sensor accuracy. The work in [7] acknowledges the dynamic nature of a motor run by an adjustable speed drive and the resultant effects on monitored signals are one of the common factors that yield erroneous fault tracking and unstable fault detection; the authors employed matched filtering (i.e., cross correlation between expected fault signals and actual motor current signals) the result of which is fed through a statistical hypothesis-testing fault-detection regime. Statistical-process monitoring with spectral clustering was used to classify samples according to differences in correlation among measured variables in [8]. In [9] cross correlation of the fault-response echo in electrical-power transmission systems from test-input excitation was used to detect potentially faulted cables. Jiang et al. [10] used the correlation dimension (a type of fractal dimension) in gearbox fault diagnosis.

More directly related techniques can be found in a number of other works. For instance, Parlar employed a methodology similar to that of this work in the monitoring of vibrating screens in [11]. In [12] Napolitano et al. exploited cross correlation of an airplane’s pitch and yaw state variables along with neural networks for fault identification in airplane systems. Rajamani et al. found the cross correlation between healthy and faulted transformer winding signals that was used to generate statistical feature vectors for classification [13]. In [14], Wu and Sun used the cross correlation of energy performance of a variable-air-volume (VAV) unit in an HVAC system [15] and the outside temperature as the criteria to evaluate the VAV health.

Cross correlation is used heavily in this field but the methodology proposed herein on this particular class of problems does not appear to exist in the literature.

2.4. Established Techniques

In the literature, there are a number of other algorithms focused on means other than correlation-based fault detection for this complex class of machinery. Nonlinear principal-component analysis (NLPCA) in [16], advanced signal processing in [17, 18], adaptive filters in [19, 20], and adaptations to pattern-recognition techniques in [21–24] are all well established, each having differing strengths and weaknesses.

To provide a baseline for comparison for the approach advanced within, a comparison between a number of related techniques developed by the present authors will be undertaken. In [1] the authors explored expansions to the work by Worden et al. in [25]; Worden et al. suggested that vibration data from structures be grouped into discrete ranges of the time-changing parameters whose statistics (mean and covariance) are regressed or interpolated to develop a health rule as a function of the time-varying parameters. The work in [1] applied this approach to data from real gearbox vibrations along with an augmentation to Worden’s approach that focused on first whitening the statistical distribution so that any variant of novelty detection could be employed. Both techniques were subject to the assumption of normally distributed data and the double curse of dimensionality, a phenomenon occurring when there is a need not only to gather sufficient data to describe a complex high-dimensional problem space but also to do so for continuous changes in that problem space (e.g., in the form of changing speed or load). These initial investigations were conducted with only one time-changing parameter; in this work, two time-changing parameters are used (i.e., speed and load). While a large amount of data has been collected (nearly 20,000 feature vectors generated with ambitious segmentation), they are insufficient to accurately characterize the behavior of the gearbox with these approaches due to the double curse of dimensionality.

In an upcoming work, the present authors suggested the almost trivial approach of adding a gearbox’s average speed over a feature vector’s segment to that feature vector. The results generated with the same experimental data were found to be excellent; unfortunately, the fault-detection methodology does not extend beyond one time-varying parameter. The confusion eliminated in adding one time-varying parameter to the feature vector is again reintroduced when another time-varying parameter is added.

In a different upcoming work, the authors suggest using the parameters of a discrete state-space model as elements of the feature vector in the novelty-detection problem [26]. In a simple view, this state-space model can be regarded as the transfer function of a gearbox modeled as a torsional spring; the state-space model’s parameters are ultimately functions of the physical nature of the gear (i.e., stiffness, damping, geometric configuration, etc.). These parameters ought to be insensitive to changes in load and speed and should be highly indicative of incipient fault states. The model is generated by assuming that the gearbox’s input speed and load are the inputs to a MIMO system; the vibration signal at any point on the machine is used as the output signal and the MIMO model formed with ARMAX techniques [27]. While the vibration problem being modeled with this linear state-space approach is in reality nonlinear, the use of novelty detection to develop a boundary around a set of these linear models is shown to provide adequate adaptation to the underlying nonlinear problem. The approach was shown to eliminate the double curse of dimensionality and assumption of normally distributed data. As evidence of the model’s sound nature, the results demonstrated excellent generalization to speeds and loads not experienced during training. The only limitations to the approach are the need to collect speed and load signals (a potentially costly consideration) and the computationally intensive nature of the algorithms for generating these models.

3. Experimental Configuration

This work focuses on the use of the parameters generated by cross correlating signals from sensors on disparate components of a machine. The pattern-recognition problem as advanced by [28] focuses on first collecting and conditioning signals (in this case, on a simulation test bench), segmenting them, and transforming them into $n$-dimensional feature vectors that are ultimately fed into pattern-recognition solutions. The steps for this problem instance are described below.

3.1. Apparatus

The fault-detection algorithm proposed herein was evaluated based on data collected from a gearbox under realistic load and speed as shown in Figure 1. The test bench is described in further detail in [29].

The gearbox’s independent load and speed profiles were effected via a 25 hp and a 50 hp AC induction servomotor ultimately controlled by two Baldor variable frequency drives (VFDs) with appropriate capacity. This gearbox was a single-stage reduction spur gearbox from SpectraQuest. Its shaft was supported by Rexnord ER-10 deep-groove rolling element ball bearings. Coupling between the motors and gearboxes was achieved through a combination of rigid shaft couplings and two zero-backlash alignment-enhancing BK3 Bellows flexible couplings. The entire drive train from the load to speed motors is shown in Figure 2.

Control and data acquisition were achieved primarily with a National Instruments (NI) PCIe-7851-R field programmable gate array (FPGA) card with 8 channels of analog input/output and 96 channels of digital input/output. The control and data acquisition routines were written in LabVIEW code for both the real-time Windows PC and mounted FPGA card (capable of loop iteration in the nanosecond range). This PC was further fitted with an NI PCI-4472 card supporting 8 channels of IEPE acceleration data.

Four accelerometers, sampled at 10 kHz, were fitted on diverse components of the drive train. One accelerometer was mounted radially on the bearing of the drive motor, two were mounted radially and orthogonally to one another on the output side of the gearbox near the input shaft, and the final accelerometer was mounted on the input side of the gearbox near the output shaft. A Lorenz Messtechnik DR-2112-R inline torque meter was fitted on the input side of the gearbox and data were collected from it at 1 kHz with the FPGA card. Tachometer signals from the two motors were first counted by sampling the TTL pulses at a rate of 40 MHz on the FPGA card; this count signal was then sampled at 10 kHz and written to disc.

Control was achieved by using two analog output lines, one to each of the motors’ VFDs. A typical speed/load profile employed during data collection is shown in Figure 3.

3.2. Faulted Components

The first data set consisted of spur gears with a gear ratio of 3 : 1 in a reduction arrangement. Data were collected by swapping healthy and faulted components; bearing faults consisted of rolling elements with rough balls, a chopped ball, and inner and outer race faults of varying severity. Faulted gear signals were generated through the use of eccentric gears and two different gears with increasing root-crack depth (generated by wire electric discharge machining).

An additional set of gears with a ratio of 80 : 48 was deployed in order to show the effect of the analyzed techniques on a different set of interesting gear faults, including a gear with both a missing tooth and a crack as well as a gear with teeth with progressively less material.

3.3. Signal Segmentation

Feature vectors are generated from continuously sampled signals split into meaningful and coherent intervals. In selecting the size of a signal segment, one must ensure that there is sufficient data to confirm that all necessary mechanical behavior is captured and that subsequent segments ensure a coherent comparison (i.e., each segment should accurately represent the cycle of mechanical behavior). When monitoring systems experience changes in state, the problem can become slightly more complex. One must gather sufficient data to adequately characterize the feature in question; there might also be a need to minimize the duration of the interval in order to eliminate large changes in signal behavior due to changes in system states. This is particularly true where the feature vectors are sensitive to the changing states and other means of ensuring accurate classification are employed (see [1]).

The constraints on segmentation in the problem at hand are more similar to the steady-state system case. Since the objective is to seek parameters immune to changes in system state with cross correlation based feature vectors, the only concern is the coherence and sufficiency of the segment. Consequently, concerns over accelerations and higher level rates of change in a segment from state variables, such as speed, should provide little impact. These constraints will be satisfied by using a variable-length period with a fixed number of shaft rotations (i.e., 15).
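
A variable-length segment with a fixed number of shaft rotations can be sketched as follows. This is a minimal illustration rather than the routine used on the test bench: it assumes the tachometer signal is available as a cumulative pulse count aligned sample-for-sample with the vibration signal, and the names `segment_by_rotations` and `pulses_per_rev`, along with the toy data, are hypothetical.

```python
def segment_by_rotations(pulse_counts, pulses_per_rev, rotations=15):
    """Split sample indices into segments spanning a fixed number of rotations.

    pulse_counts: cumulative tachometer pulse count at each sample
                  (assumed non-decreasing and aligned with the vibration data).
    Returns a list of (start, stop) index pairs with stop exclusive.
    """
    pulses_per_segment = pulses_per_rev * rotations
    segments = []
    start = 0
    next_boundary = pulse_counts[0] + pulses_per_segment
    for i, count in enumerate(pulse_counts):
        if count >= next_boundary:
            segments.append((start, i))
            start = i
            # Restart the rotation count from the sample that opens the new segment.
            next_boundary = count + pulses_per_segment
    return segments


# Toy data: one pulse per revolution, shaft speeding up over time.
counts = [0, 0, 1, 1, 2, 2, 3, 4, 5, 6, 7, 8]
segments = segment_by_rotations(counts, pulses_per_rev=1, rotations=2)
print(segments)
```

In the toy data the segments shorten as the shaft accelerates, so each one still spans the same number of rotations even though its duration in samples varies.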

3.4. Feature Vectors

Feature parameters are formed from processing signal segments and are combined together to form an $n$-dimensional vector. The authors’ favored approach, in the form of autoregressive (AR) models, will be considered; AR models provide a high-dimensional feature vector by fitting each sample of a signal, in the least-squares sense, to a linear combination of its preceding samples (the parameters of these models have a strong tie to the frequency characteristics of the signal); see [30] for a better background.
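
A least-squares AR fit can be sketched in a few lines. The sketch below assumes the common formulation in which each sample is regressed on the preceding `order` samples and solves the resulting normal equations directly; the function names and the synthetic test signal are hypothetical, and a production implementation would more likely rely on a numerical library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_ar_least_squares(signal, order):
    """Least-squares AR fit: signal[t] ~ sum_k a[k] * signal[t - 1 - k]."""
    gram = [[0.0] * order for _ in range(order)]
    rhs = [0.0] * order
    # Accumulate the normal equations (X^T X) a = X^T y one sample at a time.
    for t in range(order, len(signal)):
        past = [signal[t - 1 - k] for k in range(order)]
        for i in range(order):
            rhs[i] += past[i] * signal[t]
            for j in range(order):
                gram[i][j] += past[i] * past[j]
    return solve_linear_system(gram, rhs)


# Synthetic, noise-free AR(2) signal with known coefficients (illustrative only).
true_coeffs = [1.5, -0.9]
signal = [1.0, 1.0]
for _ in range(200):
    signal.append(true_coeffs[0] * signal[-1] + true_coeffs[1] * signal[-2])

features = fit_ar_least_squares(signal, order=2)
print(features)
```

The fitted coefficients form the feature vector for the segment; on this noise-free signal they recover the generating coefficients to within numerical precision.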

3.5. Pattern-Recognition Algorithm

In developing a model of a machine’s behavior, it is generally a simple task to collect data representative of the machine’s healthy state; collection of data from faulted states is either too difficult because of the varied number of such states or economically/operationally infeasible (particularly with machinery in use in industry). This class-imbalance problem is typically resolved through the use of novelty detection where a decision boundary is fit around exemplars of $n$-dimensional vectors derived from system signals that ideally represent the healthy system state well. During regular operation, a test pattern is declared as faulted if it falls outside this boundary and healthy in the contrary case (see [31, 32] for further information about novelty detection).

Due to its posttraining computational efficiency and ease of automation through a minimal number of configuration parameters, Tax’s support vector data descriptor (SVDD) [33] for novelty detection is preferred by the authors. It provides many advantages over other traditional techniques [34]. The SVDD fits target data with a minimal-radius hypersphere in an augmented space to generate nonconvex irregular decision boundaries in the normal feature space. The distance from the boundary is considered the novelty score; positive scores indicate that tested data fall within “normal behavior” while negative ones indicate a faulted state.
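
The sign convention of the novelty score can be illustrated with a deliberately simplified stand-in. The sketch below is not the SVDD: it has no kernel, no slack variables, and no support vectors, and simply places a sphere at the centroid of the healthy training vectors with a radius reaching the farthest one. All names and data are hypothetical.

```python
import math


def fit_hypersphere(training_vectors):
    """Fit a centroid-centered sphere that encloses all training vectors."""
    dim = len(training_vectors[0])
    count = len(training_vectors)
    center = [sum(v[i] for v in training_vectors) / count for i in range(dim)]
    radius = max(math.dist(v, center) for v in training_vectors)
    return center, radius


def novelty_score(vector, center, radius):
    """Positive inside the boundary (normal), negative outside (novel)."""
    return radius - math.dist(vector, center)


# Illustrative two-dimensional "healthy" feature vectors.
healthy = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
center, radius = fit_hypersphere(healthy)

inside = novelty_score([0.5, 0.6], center, radius)
outside = novelty_score([3.0, 3.0], center, radius)
print(inside > 0, outside < 0)
```

A kernelized description such as the SVDD replaces the plain Euclidean sphere with one in an augmented space, which is what allows the irregular, nonconvex boundaries described above.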

4. Classification Results

4.1. No Consideration of State

When attempting to detect faults in a gearbox subject to varying load and speed, the impact of failing to consider the effects of these parameters can be severe. The results in Figure 4 demonstrate the consequences of using traditional fault-detection techniques that do not consider the variable nature of the problem; they are derived from a standard autoregressive model of order 20 fit to the vibration data, the parameters of which were in turn fit with an SVDD. While the healthy state is adequately characterized, all of the faulted states are so poorly indicated that it would be impossible to discern the presence of any of the described faults. The faults employed were relatively incipient in nature and one might assume that this approach might detect their presence later in the fault progression, possibly too close to catastrophic failure.

4.2. Failure to Consider Load

The vibrations from a gearbox subject to both load and speed variations must be monitored with techniques sensitive to both parameters. Including the average speed of a feature vector’s segment in that segment results in improved classification error and the earlier detection of faults as compared to those achieved when no efforts are made to adjust for time-varying parameters. Figure 5 demonstrates improved results that remain substantially poor. Severe faults like root cracks, chipped teeth, and outer race faults are easily detected due to the strength of their signals with respect to noise levels and the masking effects from speed and load variations. Less prominent faults like eccentric gears and more subtle bearing faults will remain masked without full consideration of all modal parameters.

4.3. State-Space Based Feature Vectors

Taking full consideration of all predominant time-varying parameters drastically improves classification. Figure 6 demonstrates that all faults (subtle and severe) become easily discernible when employing state-space based feature vectors. The healthy class is somewhat difficult to classify having an error over 10%; this error is high and can be reduced via varying the order of the ARMAX models with a consequential tradeoff in classification error on faulted states. The analysis in the upcoming work exposes this approach’s insensitivity to the double curse of dimensionality and its excellent tendency to generalize beyond untaught ranges of time-varying parameters [26]. A more detailed discussion is limited herein but generalities are provided to facilitate a means of comparison.

4.4. Cross Correlation Model

The cross correlation of vibration signals from disparate locations on a power train results in a signal, not a feature vector; as discussed, this signal is in turn fit with an AR model whose parameters are used as the classification problem’s feature vector. The vibration from the load motor’s bearing was correlated with the vibration from the gearbox’s input shaft bearing to generate the cross correlation results discussed. Figure 7 shows the effect of changing the model order on the classification results and the novelty score’s distribution with respect to the decision boundary. Classification on all classes is poor with a low model order but as the model order increases the classification error drops in almost all cases. As was the case with state-space based feature vectors, a higher model order results in poorer classification error on the healthy class and good results on the faulted classes. This tradeoff seems present with cross correlation as there is a gentle increase in the error of the healthy state under these conditions. Balanced results are achieved with an order between 30 and 50 as shown in Figure 8.

4.5. Curse of Dimensionality

The double curse of dimensionality arises when a large amount of data is not only required to characterize a high-dimensional system’s behavior but when more is required due to the system’s time-varying nature. Figure 9 demonstrates that this cross correlation technique enjoys a general immunity to the curse but it also demonstrates that classification results on the healthy training set can suffer with too little data. State-space based feature vectors were slightly less susceptible to this phenomenon [26].

4.6. Generalization

The analysis surrounding Figure 9 and the double curse of dimensionality is relevant when analyzing these variable-state classification problems for generalization. This figure demonstrates that, with only limited training data from a select range of speed and load, cross correlation can be used to represent a gearbox’s behavior in a manner not sensitive to these time-varying parameters. To analyze this effect further, consider classification results achieved when training is conducted with data from profiles shown in Figure 3 but with test data from Figure 10 (i.e., different accelerations on load and speed). Figures 11 and 12 demonstrate that the approach has excellent generalization when varied acceleration is used in speed and load; healthy and faulted data remain easily detected.

4.7. Results from 80 : 48 Gearbox Arrangement

Figure 13 demonstrates that cross correlation works well with different mechanical parameters, that is, a less drastic gear ratio of 80 : 48. The gear teeth faults in this data set are fairly severe with concurring results in the form of novelty scores spaced a far distance from the decision boundary.

4.8. Sensitivity Analysis: Segmentation Interval

Figure 14 demonstrates the effects of the length of the segmentation interval defined by a fixed number of (input) shaft rotations. There is a general trend of reduction in the classification error as the segmentation interval increases, particularly for the eccentric-gear fault; for most classes of fault, however, the change is not as dramatic. Classification is generally poor while the number of input shaft rotations falls below the gear ratio but becomes desirable after the interval rises to above 3–5 times the gear ratio. This seems reasonable as the output shaft will not have undergone a complete revolution until the former condition is met; after the latter condition is met, sufficient data to characterize the system’s variations in a noisy environment will have been captured. Part (b) of this figure exposes a fairly consistent novelty score distribution, suggesting that the improvement in classification error is not from a reduction in its variance but, instead, from a change in the average distance from the novelty boundary.
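
The rule of thumb above can be turned into a small calculation. The helper below is a hypothetical sketch that returns the smallest whole number of input shaft rotations reaching a chosen multiple of the gear ratio; the default multiple of 5 is taken from the upper end of the 3–5 range reported above.

```python
import math
from fractions import Fraction


def rotations_per_segment(ratio_high, ratio_low, multiple=5):
    """Smallest whole number of input rotations covering `multiple` gear ratios.

    The gear ratio is given as ratio_high : ratio_low (e.g. 3 : 1 or 80 : 48).
    Exact fractions avoid floating-point rounding before the ceiling is taken.
    """
    gear_ratio = Fraction(ratio_high, ratio_low)
    return math.ceil(multiple * gear_ratio)


rotations_3_to_1 = rotations_per_segment(3, 1)
rotations_80_to_48 = rotations_per_segment(80, 48)
print(rotations_3_to_1, rotations_80_to_48)
```

For the 3 : 1 gear set this gives 15 rotations, the value used for segmentation in this work; for the 80 : 48 set the same rule gives 9 rotations.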

5. Automation

The block diagram in Figure 15 revises the steps taken in the proposed methodology as would be required in a more practical application. The segmentation interval could first be set to a value of 5 times the gear ratio. Vibration data would then be collected over these segments. The appropriate choice of the AR model order might vary from application to application; an online means of determining the appropriate order is therefore desirable. Since AR models are built by a least-squares fit of a signal’s data samples, the choice of order could be selected by iterating through possible choices of order and using the one with the smallest value. Training data could then be collected over a number of segments fit to AR models which would in turn be stored until a certain amount had been collected at which point the SVDD boundary would be calculated. The completion of the SVDD training would be followed by online monitoring of the gearbox under scrutiny.
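
The order-selection step can be sketched as a loop over candidate orders. The text does not state which quantity is minimized, so the sketch below assumes, purely for illustration, Akaike's information criterion (AIC) computed from the prediction error at each order; it also swaps in the Levinson-Durbin recursion on the autocorrelation sequence in place of a direct least-squares solve, because the recursion yields the error for every order in one pass. All names and the synthetic signal are hypothetical.

```python
import math
import random


def autocorrelation(signal, max_lag):
    """Biased autocorrelation estimates r[0..max_lag]."""
    n = len(signal)
    return [sum(signal[t] * signal[t + lag] for t in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def prediction_errors(signal, max_order):
    """Levinson-Durbin recursion: prediction error power for orders 0..max_order."""
    r = autocorrelation(signal, max_order)
    coeffs = []
    error = r[0]
    errors = [error]
    for p in range(1, max_order + 1):
        acc = r[p] - sum(coeffs[k] * r[p - 1 - k] for k in range(p - 1))
        reflection = acc / error
        coeffs = [coeffs[k] - reflection * coeffs[p - 2 - k]
                  for k in range(p - 1)] + [reflection]
        error *= 1.0 - reflection ** 2
        errors.append(error)
    return errors


def select_order(signal, max_order):
    """ASSUMPTION: choose the order minimizing AIC = n * ln(error) + 2 * order."""
    n = len(signal)
    errors = prediction_errors(signal, max_order)
    aic = {p: n * math.log(errors[p]) + 2 * p for p in range(1, max_order + 1)}
    return min(aic, key=aic.get), aic


# Synthetic noisy AR(2) signal standing in for one segment's cross correlation signal.
random.seed(0)
signal = [0.0, 0.0]
for _ in range(500):
    signal.append(1.5 * signal[-1] - 0.9 * signal[-2] + random.gauss(0.0, 1.0))

best_order, aic = select_order(signal, max_order=10)
print(best_order)
```

A penalized criterion is used here because the raw prediction error never increases with order, so minimizing it alone would always select the largest candidate.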

6. Conclusions

By cross correlating the signals from vibration sensors on disparate locations of a power train and processing the resultant signal into a feature vector for novelty detection, a powerful technique for time-varying classification problems like fault detection in variable load and speed gearboxes has been demonstrated. The technique removes the need to analyze the complex nonlinear dynamics of the problem. It eliminates the need for costly sensors, like inline torque sensors, and the difficulties in deploying them in machinery not originally fitted for their use. The approach is computationally efficient and retains the excellent fault-detection abilities of other techniques under review. It also generalizes well across untrained state parameters. Through an established technique in sensor validation, the approach has been shown to provide a powerful means of reducing a complex condition monitoring problem to a near-trivial one.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors gratefully acknowledge the support of the Center of Excellence in Mining (CEMI) in Sudbury, Ontario, and Vale for their financial and in-kind support. The authors are also appreciative of Dr. Chris Mechefske’s support through the loan of an experimental gearbox.