Abstract

This paper extends traditional Gaussian mixture model (GMM) techniques to provide recognition of operational states and detection of emerging faults for industrial systems. A variational Bayesian method allows the number of mixture components in a GMM to be selected automatically during clustering, which facilitates the extraction of steady-state operational behaviour; this is recognised as being a primary factor in reducing the susceptibility of alternative prognostic/diagnostic techniques, which would otherwise initiate false alarms resulting from control set-point and load changes. Furthermore, a GMM with an outlier component is discussed and applied for direct novelty/fault detection. An advantage of the variational Bayesian method over traditional predefined thresholds is the extraction of steady-state data during both full- and part-load cases, and a primary advantage of the GMM with an outlier component is its applicability for novelty detection when there is a lack of prior knowledge of fault patterns. Results obtained from real-time measurements on operational industrial gas turbines have shown that the proposed technique provides an integrated preprocessing, benchmarking, and novelty/fault detection methodology.

1. Introduction

Industrial gas turbines (IGTs) are utilised globally with units generally acting as prime-movers for either pumps, generators, or compressors, for both on- and off-shore systems. Various root-cause factors responsible for failures on such systems include vibration, shock, noise, heat, cold, dust, corrosion, humidity, rain, oil debris, flow, pressure, and excessive operating speed [1]. A key challenge of condition monitoring in order to provide an “early warning” of faults is to distinguish between sensor-based failure, component failure, and the normal operational transient behaviour of the system, for example, due to control or load changes. With advances in instrumentation, communications hardware, and computational capability [2], there has been an increased ability to realize such prognostic and diagnostic methods and provide remedial action and informed flexible maintenance scheduling prior to encountering unplanned downtime. This is especially pertinent as a result of the increasing operational speeds of IGTs [3].

Two popular categories of monitoring techniques have emerged over recent years, namely, model-based and signal processing-based approaches [4]. Model-based approaches construct models (or virtual sensors) to estimate physical variables from which residuals are calculated and used as indicators of emerging failure modes [5]. However, to build an accurate dynamic model that can accommodate the full operating envelope of IGTs is, in general, computationally demanding. In such circumstances, direct signal processing and data fusion methods often provide for more practical and effective monitoring solutions [6]. It is this latter category of techniques that is considered here.

Traditional signal processing-based methodologies use techniques such as principal component analysis (PCA) [7, 8], artificial neural networks (ANNs) [9], and data filter methods [10]. However, monitoring systems based on these algorithms are generally only applicable during steady-state operating conditions, since measurement transients caused by changes of loading or control action can generate “false alarms.” Many techniques are therefore only effective under very constrained operating regimes. For instance, [11] employed ANNs for fault detection on gas turbines during the engine start-up phase, whilst [12] only considers solutions during steady-state operation. Typically, such techniques do not attempt to address the issue by incorporating implicit methods that discriminate between steady-state and transient behaviour as part of the fault detection system, and it is this aspect that is initially considered here [13].

By recognizing that steady-state measurement data often carries superimposed noise with a characteristic Gaussian distribution [14], that is, it contains practical “measurement transients” that are not due to operational variations, the signals can be modelled through use of Gaussian Mixture Models (GMMs) [15]. This characterization of signals has previously been reported [16, 17] and used in a related family of clustering methods based on GMMs, regarded as “soft clustering” techniques, with the benefits for condition monitoring and fault detection explored in [16–18].

Here, it is initially shown that GMMs provide a convenient mechanism to discriminate between data relating to steady-state operation and data relating to operation with transients; in this instance they are regarded specifically as a preprocessing tool for subsequent operational pattern discrimination [13]. An essential part of developing the GMM methodology is parameter fitting, which is often carried out using an expectation-maximization (EM) method [19]. However, this requires the number of mixture components (MCs) in the model to be fixed a priori [18]. Consequently, Bayesian-based frameworks are commonly used to provide a probabilistic inference on the data [20], and Variational Bayesian GMM techniques (VBGMMs) have been proposed [21] to provide improved performance [22, 23] by automatically selecting the number of MCs in the GMMs. VBGMMs are able to classify steady-state operation that can occur under full- or part-load conditions (e.g., 50% load). In addition to identifying steady-state operation, the remaining data, including that associated with start-ups, shutdowns, and load changes, is naturally separated and can therefore be used as a preanalysis tool for alternative dynamic scenarios, as reported in [24, 25], for instance.

An additional property of GMMs that facilitates novelty/fault detection is the ability to filter novel class samples when the machine learning system has no a priori knowledge [26]. Moreover, GMMs have been previously reported for the detection of “novel” vibration signatures with experimental results showing good fault detection properties with known classifications [27]. However, the technique was sensitive to vibration characteristics that are not associated with the detection of wear or damage and so remained susceptible to initiating false alarms. As a consequence of these features, the outlier components are now included as background “noise” for the GMM [28, 29], and the resulting technique is considered as a GMM with an outlier component (GMMOC). When using GMMOC, measurements remain characterized by the GMM, but novel characteristics, that is, measurements that have a very low probability of being clustered into existing distributions, are considered as outliers. It is the identification of such outliers that provides a robust mechanism for IGT fault detection considered here to provide an “early warning” fault detection tool.

Key contributions of this paper are summarized as follows:
(1) An extension to the use of GMMs, including VBGMM and GMMOC, is proposed for novelty/fault detection on industrial systems.
(2) Steady-state operation is discriminated from transient operation using VBGMM.
(3) GMMOC is used to indicate the presence of outliers and hence the emergence of faults.
(4) The efficacy of the proposed techniques is demonstrated from case studies (CSs) on IGTs, where CS1 is considered as a feasibility study of VBGMM and GMMOC for novelty detection and CS2 demonstrates a real bearing fault case study.

2. Methodology

The stages in the proposed methodology are depicted in Table 1. To provide an application focus for the development of the algorithms, measurements taken from bearing vibration probes on IGTs during commissioning are used as an illustrative example. Operational pattern separation is achieved through the use of VBGMM, where the steady-state data are distinguished for further analysis in this paper. Datasets from the identified transient operation can also be used for fault detection through start-up analysis and shutdown analysis and during load changes [6, 24, 25], which is not included in the current paper. The most relevant features are then extracted from the steady-state data and a statistical “fingerprint” for the extracted features is obtained through the application of GMMOC.

2.1. Underlying Principles of GMM

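The empirical probability distribution of sampled data can be estimated by a GMM using a linear combination of Gaussian distributions [15], for example, as a sum of $K$ Gaussian distributions $\mathcal{N}(x \mid \mu_k, \Sigma_k)$ with mean $\mu_k$ and covariance $\Sigma_k$ (the squared standard deviation $\sigma_k^2$ in the one-dimensional case). The GMM with $K$ MCs is expressed as

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \tag{1}$$

where $x$ is a multidimensional variable and $\pi_k$ are the mixing coefficients that need to be chosen, with $0 \le \pi_k \le 1$ and $\sum_{k=1}^{K} \pi_k = 1$.

Let $X$ represent $N$ data samples, $X = \{x_1, \ldots, x_N\}$, where each sample consists of a multidimensional variable $x_n$. Provided that the $x_n$ are statistically independent, the probability function of $X$ can be expressed as

$$p(X) = \prod_{n=1}^{N} p(x_n), \tag{2}$$

which constitutes the likelihood function and by taking the natural logarithm gives

$$\ln p(X) = \sum_{n=1}^{N} \ln p(x_n). \tag{3}$$

The mixture density is then

$$p(x_n) = \sum_{k=1}^{K} \pi_k \, p(x_n \mid k), \qquad p(x_n \mid k) = \mathcal{N}(x_n \mid \mu_k, \Sigma_k), \tag{4}$$

where $k$ indicates the MC index and $n$ indicates the data sample index. The conditional probability is calculated as

$$p(k \mid x_n) = \frac{\pi_k \, p(x_n \mid k)}{\sum_{j=1}^{K} \pi_j \, p(x_n \mid j)} \tag{5}$$

for a selected component $k$ on the given data sample $x_n$.

Evaluation of (3) typically necessitates an EM optimization procedure to maximize the log-likelihood function [18] from the maximization step (termed the M-step):

$$\{\hat{\pi}_k, \hat{\mu}_k, \hat{\Sigma}_k\} = \arg\max_{\pi_k, \mu_k, \Sigma_k} \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k). \tag{6}$$

The unknowns in (3) are solved in the expectation step (E-step), from (6), as follows:

(1) Choose an initialization of $\pi_k$, $\mu_k$, and $\Sigma_k$.
(2) Iteratively update $p(k \mid x_n)$, $\mu_k$, $\Sigma_k$, and $\pi_k$ until convergence to a desired tolerance, using

$$\begin{aligned}
p(k \mid x_n) &= \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}, \\
\mu_k &= \frac{\sum_{n=1}^{N} p(k \mid x_n) \, x_n}{\sum_{n=1}^{N} p(k \mid x_n)}, \\
\Sigma_k &= \frac{\sum_{n=1}^{N} p(k \mid x_n) \, (x_n - \mu_k)(x_n - \mu_k)^{\mathsf{T}}}{\sum_{n=1}^{N} p(k \mid x_n)}, \\
\pi_k &= \frac{1}{N} \sum_{n=1}^{N} p(k \mid x_n), \\
\mathcal{N}(x \mid \mu_k, \Sigma_k) &= \frac{1}{(2\pi)^{d/2} \, |\Sigma_k|^{1/2}} \exp\!\left( -\tfrac{1}{2} (x - \mu_k)^{\mathsf{T}} \Sigma_k^{-1} (x - \mu_k) \right),
\end{aligned} \tag{7}$$

where $d$ is the dimensionality of the data. For the special case $K = 1$, the whole dataset belongs to only 1 cluster, and the problem reduces to that of a Gaussian distribution fit.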
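The iteration described above can be illustrated with a minimal sketch for the one-dimensional case, where each covariance reduces to a squared standard deviation. The function names (`gaussian_pdf`, `fit_gmm_1d`), the starting values, and the synthetic part-load/full-load data are assumptions made for illustration and do not come from the paper's implementation.

```python
import math
import statistics


def gaussian_pdf(x, mean, std):
    """Univariate Gaussian density, i.e. the d = 1 case."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(samples, means, stds, weights, max_iter=200, tol=1e-9):
    """Fit a 1-D GMM by EM.

    Returns (weights, means, stds, log_likelihood), where the log-likelihood
    is the value computed during the last responsibility update.
    """
    means, stds, weights = list(means), list(stds), list(weights)
    n, k = len(samples), len(means)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(max_iter):
        # Posterior p(k | x_n) for every sample, plus the log-likelihood.
        resp, ll = [], 0.0
        for x in samples:
            joint = [weights[j] * gaussian_pdf(x, means[j], stds[j]) for j in range(k)]
            total = max(sum(joint), 1e-300)  # guard against log(0)
            ll += math.log(total)
            resp.append([p / total for p in joint])
        # Weighted re-estimation of mean, standard deviation, and mixing weight.
        for j in range(k):
            nk = max(sum(r[j] for r in resp), 1e-300)
            means[j] = sum(r[j] * x for r, x in zip(resp, samples)) / nk
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, samples)) / nk
            stds[j] = max(math.sqrt(var), 1e-6)  # floor avoids a collapsed component
            weights[j] = nk / n
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, stds, ll


# Hypothetical load data (% of rated power): a part-load and a full-load regime,
# generated deterministically from Gaussian quantiles rather than measured.
part_load = [statistics.NormalDist(50.0, 1.0).inv_cdf((i + 0.5) / 80) for i in range(80)]
full_load = [statistics.NormalDist(100.0, 1.0).inv_cdf((i + 0.5) / 120) for i in range(120)]
weights, means, stds, ll = fit_gmm_1d(part_load + full_load,
                                      means=[40.0, 90.0], stds=[10.0, 10.0],
                                      weights=[0.5, 0.5])
print(weights, means, stds)
```

Note that the number of components is fixed by the length of the initial `means` list, which is exactly the a priori choice that the variational Bayesian extension is intended to remove.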

2.2. Extension to VBGMM

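A Variational Bayesian (VB) method can be used to determine the required number of MCs. Specifically, binary latent variables $z_{nk}$ are used to indicate which MCs the data sample $x_n$ clusters into. When forming a GMM using classical methods, $K$ is selected a priori. However, when using a VB method, $K$ is resolved from the solutions of $z_{nk}$, termed $Z$. The joint probability distribution function of all variables is therefore

$$p(X, Z, \pi, \mu, \Sigma) = p(X \mid Z, \mu, \Sigma) \, p(Z \mid \pi) \, p(\pi) \, p(\mu \mid \Sigma) \, p(\Sigma). \tag{8}$$

A lower bound on $\ln p(X)$ can be determined using the VB method reported in [21]. Let $\theta = \{Z, \pi, \mu, \Sigma\}$, and the marginal likelihood is expressed as

$$p(X) = \int p(X, \theta) \, d\theta. \tag{9}$$

Next, a variational distribution is introduced, that is, $q(\theta)$. Through the use of Kullback-Leibler (KL) divergence [22], that is, $\mathrm{KL}(q \,\|\, p) = -\int q(\theta) \ln \frac{p(\theta \mid X)}{q(\theta)} \, d\theta$, (9) becomes

$$\ln p(X) = \mathcal{L}(q) + \mathrm{KL}(q \,\|\, p), \qquad \mathcal{L}(q) = \int q(\theta) \ln \frac{p(X, \theta)}{q(\theta)} \, d\theta, \tag{10}$$

where $\mathcal{L}(q)$ indicates the lower bound. Minimization of the KL divergence can be achieved by maximizing $\mathcal{L}(q)$ by selecting appropriate $q$ distributions. $q(\theta)$ can be rewritten as the product of $q_i(\theta_i)$, in terms of the subsets $\theta_i$, giving

$$q(\theta) = \prod_{i} q_i(\theta_i). \tag{11}$$

The best distribution for each term, $q_j^{*}(\theta_j)$, can be solved using

$$\ln q_j^{*}(\theta_j) = \mathbb{E}_{i \neq j}\left[ \ln p(X, \theta) \right] + \text{const}, \tag{12}$$

where $\mathbb{E}_{i \neq j}[\cdot]$ denotes the expectation of $\ln p(X, \theta)$ with respect to $q_i(\theta_i)$ for all $i \neq j$.

Since the $q_i(\theta_i)$ are mutually coupled, these can be calculated using the E-step:

(1) Initialize the parameters, which are normally set to be small, real, and positive [30].
(2) Calculate $q(\theta)$ in (11) through use of (12).
(3) Iteratively update $q_i(\theta_i)$ until the predefined tolerance is met.

After calculating $q(\theta)$, $\mathcal{L}(q)$ can be obtained. Maximization of $\mathcal{L}(q)$, which minimizes $\mathrm{KL}(q \,\|\, p)$, is again achieved by using the EM procedure. The M-step for calculating the model parameters can be derived from (7), and the optimized distributions $q_j^{*}(\theta_j)$ can be obtained through the E-step as mentioned above. For brevity, the reader is directed to [30] for a more in-depth discussion of the VB framework on GMMs.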
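The split of the log marginal likelihood into a lower bound plus a KL term can be checked in a few lines. The steps below are the standard variational argument, written with $\theta$ denoting the collection of latent variables and parameters and $q(\theta)$ any normalized distribution over them; this notation is an assumption made for the sketch, and the derivation is a reminder of the textbook identity rather than one specific to this paper.

```latex
\begin{aligned}
\ln p(X)
  &= \int q(\theta) \, \ln p(X) \, d\theta
     && \text{since } \textstyle\int q(\theta)\, d\theta = 1 \\
  &= \int q(\theta) \, \ln \frac{p(X, \theta)}{p(\theta \mid X)} \, d\theta
     && \text{since } p(X, \theta) = p(\theta \mid X)\, p(X) \\
  &= \int q(\theta) \, \ln \frac{p(X, \theta)}{q(\theta)} \, d\theta
     \;-\; \int q(\theta) \, \ln \frac{p(\theta \mid X)}{q(\theta)} \, d\theta \\
  &= \mathcal{L}(q) + \mathrm{KL}(q \,\|\, p).
\end{aligned}
```

Because the KL divergence is never negative, $\mathcal{L}(q) \le \ln p(X)$ for every choice of $q$, and since $\ln p(X)$ does not depend on $q$, raising the bound is equivalent to lowering the divergence.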

2.3. Principles of GMMOC

GMMOC extends the original GMM [28, 29] approach by adding an outlier component that is modelled by a uniform distribution. The hybrid GMMOC model is then written as

$$p(x) = \pi_0 \, \mathcal{U}(x) + \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k). \tag{13}$$

Since the outlier density $\mathcal{U}(x)$ is uniform, the EM procedure described previously can still be employed to solve for the required parameters. The outlier component $\pi_0$ is normally assumed to be small initially (e.g., $\pi_0 = 0.01$, and therefore the initialization of the mixing coefficients in GMM satisfies $\sum_{k=1}^{K} \pi_k = 0.99$). In this case, there are $K + 1$ clusters, including $K$ Gaussian distributions and one uniform distribution, the outlier component.

Parameters $\pi_k$, $\mu_k$, and $\Sigma_k$ can be estimated from (7), and if the probability of a data sample $x_n$ belonging to any of the Gaussian MCs is smaller than a predefined threshold, it is clustered to the outlier component and therefore indicates a warning of an emerging fault or facilitates novelty detection.
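A minimal sketch of this idea is given below for the simplest setting: one-dimensional data, a single Gaussian MC, and a uniform outlier component. Several choices here are assumptions for illustration rather than details taken from the paper: the uniform density is spread over the observed data range, the outlier weight starts at 0.01, a sample is flagged when its Gaussian posterior falls below 0.5, and the vibration-like values are synthetic.

```python
import math
import statistics


def gaussian_pdf(x, mean, std):
    """Univariate Gaussian density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_gmmoc_1d(samples, outlier_weight=0.01, max_iter=200, tol=1e-10):
    """EM for one Gaussian plus a uniform outlier component.

    Returns (mean, std, outlier_weight, gauss_resp), where gauss_resp[n] is the
    posterior probability that sample n belongs to the Gaussian component.
    """
    lo, hi = min(samples), max(samples)
    if hi <= lo:
        raise ValueError("samples must not all be identical")
    uniform_density = 1.0 / (hi - lo)  # assumption: support equals the data range
    mean = statistics.median(samples)
    std = statistics.pstdev(samples)
    gauss_resp = [1.0] * len(samples)
    for _ in range(max_iter):
        # Posterior membership of each sample in the Gaussian component.
        gauss_resp = []
        for x in samples:
            g = (1.0 - outlier_weight) * gaussian_pdf(x, mean, std)
            u = outlier_weight * uniform_density
            gauss_resp.append(g / (g + u))
        # Weighted re-estimation; the uniform density itself has nothing to update.
        n_gauss = sum(gauss_resp)
        new_mean = sum(r * x for r, x in zip(gauss_resp, samples)) / n_gauss
        new_var = sum(r * (x - new_mean) ** 2 for r, x in zip(gauss_resp, samples)) / n_gauss
        new_std = max(math.sqrt(new_var), 1e-6)
        new_weight = max(1.0 - n_gauss / len(samples), 1e-12)  # keep the component alive
        change = abs(new_mean - mean) + abs(new_std - std) + abs(new_weight - outlier_weight)
        mean, std, outlier_weight = new_mean, new_std, new_weight
        if change < tol:
            break
    return mean, std, outlier_weight, gauss_resp


def flag_outliers(gauss_resp, threshold=0.5):
    """Indices of samples whose Gaussian posterior is below the threshold."""
    return [i for i, r in enumerate(gauss_resp) if r < threshold]


# Synthetic readings around 30 with spread 2, plus two injected high readings.
normal = [statistics.NormalDist(30.0, 2.0).inv_cdf((i + 0.5) / 200) for i in range(200)]
readings = normal + [60.0, 65.0]
mean, std, pi0, resp = fit_gmmoc_1d(readings)
print(mean, std, pi0, flag_outliers(resp))
```

Because the uniform component absorbs the extreme samples, they contribute almost nothing to the weighted mean and standard deviation, so the Gaussian fit stays close to the bulk of the data instead of being stretched towards the outliers.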

2.4. Feature Extraction

Feature extraction provides an essential tool for reducing the dimensionality of raw data whilst keeping informative features [25]. Many feature extraction techniques have been reported and successfully applied, including the use of the Fast Fourier Transform (FFT) and Discrete Wavelet Transform (DWT), all involving elaborate time-frequency transforms [31, 32]. However, the data used in the following studies are taken from IGT units in the field with sampling rates in the order of minutes, which excludes the use of frequency domain based methods as they do not satisfy traditional sampling rate criteria for the measured variables. In this case, statistical features of the data in the time domain are used, for example, the peak value, root mean square, crest factor, kurtosis, clearance factor, impulse factor, shape factor, and skewness, and the most informative features can be identified through optimization methods [33]. For this study, in order not to divert the focus from the use of extended GMMs, only the most basic statistical attributes are employed; the mean (which carries information about measurement equilibrium) and standard deviation (which carries information about signal power) are used here as a proof of concept and also for practical reasons since it has been observed empirically that these features are sufficient in most cases.
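As an illustration of the time-domain features named above, the sketch below computes a subset of them for one block of samples and then reduces a long record to one (mean, standard deviation) pair per day. The function names and the choice of population (rather than sample) statistics are assumptions for this example; kurtosis is reported as the plain fourth standardized moment, without subtracting 3.

```python
import math


def time_domain_features(samples):
    """Basic statistical features of one block of time-domain samples."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n  # population variance
    std = math.sqrt(var)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    peak = max(abs(x) for x in samples)
    nan = float("nan")
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms > 0 else nan,
        "skewness": sum((x - mean) ** 3 for x in samples) / (n * std ** 3) if std > 0 else nan,
        "kurtosis": sum((x - mean) ** 4 for x in samples) / (n * std ** 4) if std > 0 else nan,
    }


def daily_features(series, samples_per_day=1440):
    """Split a record into whole days and return (mean, std) for each day.

    A trailing partial day is ignored.
    """
    days = len(series) // samples_per_day
    result = []
    for d in range(days):
        block = series[d * samples_per_day:(d + 1) * samples_per_day]
        feats = time_domain_features(block)
        result.append((feats["mean"], feats["std"]))
    return result


# Two short hypothetical "days" of four samples each.
print(daily_features([1.0, 1.0, 1.0, 1.0, 2.0, 4.0, 2.0, 4.0], samples_per_day=4))
```

Each day then becomes a single point in the mean/standard-deviation plane, which is the two-dimensional feature space in which a fingerprint of normal operation can be fitted.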

3. Application Case Studies

The proposed methods are applied to monitor the vibration characteristics of fluid-film inlet- and output-bearings which typically support the compressor rotors of sub-15MW IGTs (Figure 1). The thrust bearings and journal bearings have operating speeds in excess of ~10,000 rpm. Radial and axial positions are monitored using noncontact probes. Two experimental case studies are now considered, both of which adopt the procedure in Table 1. CS1 uses measurements taken over a relatively short time period (1-month) and demonstrates the effectiveness of VBGMM for operational state discrimination (steady-state/transient behaviour), feature extraction, and the initial setup of a benchmarking ellipse using GMMOC, whilst CS2 considers the analysis of longer periods of measurement data (12-month) and aims to show the efficacy of GMMs for identifying longer-term emerging faults.

3.1. Case Study 1: Operation State Discrimination, Feature Extraction, and Novelty Detection

VBGMM is used to cluster measurements of output power (in terms of loading percentage) from a sub-15MW IGT to classify the unit’s operational behaviour (Figure 2). One month (31 days) of daily data, each containing 1440 sample measurements, is used (i.e., sampling period = 1 minute). The resulting clusters from the power/load measurements are also shown in Figure 2 after applying VBGMM to each individual set of daily data. In line with the algorithm description, it can be seen that when the unit is considered to be operating normally, in steady-state, a classification label of 1 is assigned and that classification labels > 1 are assigned for all other detected cases (cluster result given in Figure 3). A classification label of 0 indicates constant null readings and is precluded from further pattern analysis as the unit is considered to be shut down. Corresponding inlet bearing vibration measurements taken for the same 31-day period are shown in Figure 4. Having discriminated between transient and steady-state operation using VBGMM, the steady-state data (days 1–12 and days 29–31) shown in Figure 5 can be used for subsequent novelty detection.

Having identified appropriate datasets, feature extraction is used to capture important characteristics present in the data. The mean and the standard deviation of each day’s data for months 1–3 are calculated and given in Figure 6, to present a benchmarking envelope representing normal operation. Having effectively obtained an operational fingerprint of behaviour, measurements taken over subsequent months are then used and compared to the fingerprint.

Considering only the magnitude of vibration amplitudes, levels up to ~50 μm are typically considered normal, with warnings at ~70 μm and unit shutdown occurring at ~90 μm. Having obtained a fingerprint representing normal operation, GMMOC is applied to subsequent periods of data on a daily basis (in this case in month 4, Figure 7). It is well known that gradual bearing wear leading to failure is often preceded by gradual changes in vibration characteristics. Figure 8 provides an ellipse boundary drawn according to the 1-cluster GMM model (see (13)). In general, the confidence level to identify outliers will be set according to application. For instance, in this case, a 99.99% confidence level does not discriminate the outliers (here, it indicates outliers in normal operation which may be caused by sensor malfunctioning); however, a 95% confidence level is clearly seen to be appropriate in this instance. The results from GMMOC are compared with the envelopes drawn from the original GMM, as shown in Figure 8, and the advantage of GMMOC over the original GMM is evident, since the outliers, days 9 and 10, are clearly identified even with a 99.99% confidence level of GMMOC in this case.
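One common way to turn a confidence level into an ellipse for a single two-dimensional Gaussian is to compare the squared Mahalanobis distance of a feature point with a chi-square quantile; with two degrees of freedom that quantile has the closed form $-2\ln(1 - c)$ for confidence $c$. The sketch below follows that convention. It is an assumption that the boundaries in the figures were drawn this way, and the function names and numerical values are hypothetical.

```python
import math


def fit_gaussian_2d(points):
    """Mean and population covariance of 2-D feature points (x, y)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))


def chi2_quantile_2dof(confidence):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - confidence)


def mahalanobis_sq_2d(point, mean, cov):
    """Squared Mahalanobis distance using the explicit 2x2 matrix inverse."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def outside_ellipse(point, mean, cov, confidence=0.95):
    """True if the point lies outside the ellipse for the given confidence level."""
    return mahalanobis_sq_2d(point, mean, cov) > chi2_quantile_2dof(confidence)


# Hypothetical fingerprint in the (daily mean, daily standard deviation) plane.
mean = (30.0, 2.0)
cov = ((4.0, 0.0), (0.0, 0.25))
day = (36.0, 2.0)
print(outside_ellipse(day, mean, cov, 0.95), outside_ellipse(day, mean, cov, 0.9999))
```

In this hypothetical example the same day falls outside the 95% ellipse but inside the 99.99% one, which illustrates why the confidence level has to be chosen to suit the application.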

By considering the following month of measurement data, it is notable that the measurements have correctly been identified as outliers (day 9 and day 10 in this case), whereas measurements from days 1–4 are correctly considered to correspond to normal behaviour. Although it is not known at that stage if the increase in vibration in days 9 and 10 is related to a component fault, the measurements are considered as anomalies. Although CS1 is used as a proof of concept application of VBGMM and GMMOC for novelty detection, the bearing considered in the study did fail around 3 months after initially being identified using the extended GMM.

3.2. Case Study 2

CS2 uses measurements taken over a longer period of 12 months using a lower sampling rate; specifically 1 data sample taken every 9 minutes, as shown in Figure 9. Again the procedure depicted in Table 1 is applied.

Measurements from months 1–4 are deemed to describe normal operational characteristics. VBGMM is then applied to identify what is considered to be steady-state operational data for further analysis (Figure 10). Important characteristics are then determined; once again it is known from empirical studies that for this application case the mean and standard deviation are effective measures of underlying behaviour. Through the application of GMMOC and by clustering the extracted features into a single cluster, a fingerprint of normal operation is obtained, as shown in Figure 11, where the 95% confidence envelope is considered as an early warning boundary in this instance and the 99% confidence envelope as a fault detection boundary. Using measurements from month 5 as testing data, it is shown in Figure 11 that, between days 20 and 25 of the 5th month, operation falls outside the “normal fingerprint” ellipsoid and is therefore identified as an outlier, thereby providing an early warning of expected failure, which was evidenced during the field service.

Referring again to measurements shown in Figure 9, on day 140 of operation (day 20 of month 5) a transient in the vibration level is evident, and although the mean level ultimately returns to normal after month 5, the variance continued to indicate evidence of emerging failure during and after month 6 (Figure 12). Considering months 6–12 as a period of emerging fault, the number of fault patterns can be discriminated using VBGMM (2 clusters for faults in this case) and the fault pattern locations identified using GMMOC, as shown in Figure 12. It can be seen that Fault Pattern 2 overlaps with the original fingerprint from months 1–4, indicating in this case that ideally other feature extraction indices (e.g., those involved with loading conditions) could be used to further isolate this type of fault characteristic from that of normal operation. Between the 2 sets of fault patterns (Figure 12), some anomalies are apparent which are due to the increasing levels of vibration as the bearing deteriorates, such as the 20th to 25th days of the 5th month’s operation.

In this instance the unit was shut down for maintenance in order to prevent a catastrophic failure. The bearing was known to be undamaged in month 1 as it was assessed in the previous service check (see Figure 13). Through subsequent decommissioning of the unit and investigation, vibration damage is evident on the inlet bearing, with excessive wear on the tilt pad shown in Figure 14. It can be clearly seen that there has been abnormal wear from the markings shown on both the shaft (Figure 15) and bearing pads (Figure 14). Through root-cause analysis, the damage was attributed to an incorrectly specified lubricant oil cooler causing high temperatures in the lubricant oil.

4. Conclusion

The paper has developed and demonstrated extensions of GMMs to provide a highly practical preprocessing and novelty/fault detection tool. The main contributions of the paper are (1) an automatic clustering method based on VBGMM that distinguishes steady-state operational behaviour from transient operation, allowing the extraction of steady-state measurement segments for subsequent condition monitoring, and (2) a GMMOC method that has been shown to provide a valuable early warning system for emerging failure through novelty detection. The presented techniques are currently being utilised in an industrial environment to monitor the operational status of a global fleet of IGTs. Although the experimental trials have focused on IGTs and bearing vibration measurements in this instance, the proposed methods are much more widely applicable to other industrial components and systems for pattern analysis, benchmarking, and novelty/fault detection.

Nomenclature

IGT: Industrial gas turbine
GMM: Gaussian Mixture Model
VBGMM: Variational Bayesian Gaussian Mixture Model
GMMOC: Gaussian Mixture Model with an outlier component
PCA: Principal component analysis
ANN: Artificial neural network
MC: Mixture component
EM: Expectation-maximization
M-step: Maximization step
E-step: Expectation step
KL: Kullback-Leibler
CS: Case study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank Siemens Industrial Turbomachinery, Lincoln, UK, for providing research support and access to field data to support the research outcomes.