Abstract
The model-based fault detection technique, which requires identifying the system model, is well established. The objective of this paper is to develop an alternative procedure that avoids identifying the system model. A subspace method aided data-driven fault detection approach based on principal component analysis (PCA) is proposed. The basic idea is to use PCA to identify the system observability matrices from input and output data and to construct residual generators from them. The advantage of the proposed method is that only the parameterized matrices related to the residuals need to be identified rather than the system model, which reduces the number of computational steps. The proposed approach is illustrated by a simulation study on the Tennessee Eastman process.
1. Introduction
The safety, stability, and efficiency of dynamic systems have always been a matter of great concern in the field of complex industrial processes. Fault detection plays an extremely important role in improving the safety of processes and attracts more and more attention in the field of process monitoring systems.
During the past three decades, model-based fault detection techniques for linear time-invariant (LTI) systems have been well established [1–4]. But in some complex systems and large-scale industries, it is difficult to obtain accurate mathematical models. With the development of the computer and information industry, Big Data has attracted more and more attention. Industrial processes produce large amounts of operating data which contain abundant information about the system. Subspace model identification (SMI), which uses process data to identify system models, arose against this background. There are several subspace algorithms, such as MOESP [5], N4SID [6], and CVA [7]. In a typical SMI, the identification procedure comprises two steps. Firstly, the extended observability matrix and a block triangular Toeplitz matrix are identified from input and output data. Then, the system matrices $A$, $B$, $C$, and $D$ are calculated from the identified observability matrix and the Toeplitz matrix. Fault detection using this method therefore usually comprises two stages: system identification and model-based fault detection.
Residual generation and residual evaluation are essential for model-based fault detection. Fault detection can be achieved once residuals are obtained. The parity space approach (PSA) [8] is the simplest and most widely used among model-based fault detection methods. Comparing PSA with SMI reveals a close bond between the two methods: the residual can be calculated from parameterized matrices related to the extended observability matrix and the block triangular Toeplitz matrix. So, it is feasible to use only the first step of SMI to identify these parameterized matrices instead of the system model, which reduces the calculation steps. To the best of our knowledge, Ding and his coworkers associated SMI with fault detection and proposed a subspace aided approach [9–11]. But most of these SMI methods are biased under the errors-in-variables (EIV) situation. Wang and Qin [12] proposed a subspace identification approach based on PCA that gives consistent model estimates under the EIV situation. Inspired by their work, we propose a subspace method aided data-driven fault detection approach based on PCA in this paper. Figure 1 shows the difference between the proposed approach and the classic model-based approach.
In this paper, the main contribution of this proposed approach lies in the direct identification of the parameterized matrices by PCA. As long as these matrices related to residuals are identified, we can construct residual generators. The proposed approach is illustrated by a simulation study on the Tennessee Eastman process.
The rest of the paper is organized as follows. Section 2 gives the preliminaries and problem formulation. The identification of parameterized matrices is presented in Section 3. Section 4 illustrates a simulation study on the Tennessee Eastman process. Finally, the conclusions are presented in Section 5.
2. Preliminaries and Problem Formulation
2.1. Process Descriptions
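The state space representation of a discrete-time LTI system is given by

$$x(k+1) = A x(k) + B \tilde{u}(k) + p(k), \qquad \tilde{y}(k) = C x(k) + D \tilde{u}(k), \tag{1}$$

where $x(k) \in \mathbb{R}^{n}$, $\tilde{u}(k) \in \mathbb{R}^{l}$, $\tilde{y}(k) \in \mathbb{R}^{m}$, and $p(k) \in \mathbb{R}^{n}$ denote the vectors of the state variables, noise-free inputs, noise-free outputs, and process noise, respectively. The available input measurements $u(k)$ and output measurements $y(k)$ are described by [13]

$$u(k) = \tilde{u}(k) + v(k), \qquad y(k) = \tilde{y}(k) + o(k), \tag{2}$$

where $v(k)$ and $o(k)$ represent the input and output noise. The following assumptions are introduced:
(i) The system is observable and controllable.
(ii) $p(k)$, $v(k)$, and $o(k)$ are assumed to be zero-mean, normally distributed white noise; that is, with $w(k) = [p(k)^T \; v(k)^T \; o(k)^T]^T$,

$$E\{w(i)\, w(j)^T\} = R_w\, \delta_{ij}, \tag{3}$$

where $R_w$ is the joint noise covariance matrix and $\delta_{ij}$ is the Kronecker delta.
(iii) All the noise is assumed to be independent of the past noise-free input $\tilde{u}(k)$ and the initial state $x(1)$. They satisfy

$$E\{\tilde{u}(i)\, w(j)^T\} = 0 \;\; (j \ge i), \qquad E\{x(1)\, w(j)^T\} = 0. \tag{4}$$

(iv) The system matrices $A$, $B$, $C$, and $D$ and the system order $n$ are unknown.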
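We define the following vectors and Hankel matrices, taking the output as an example:

$$y_f(k) = \begin{bmatrix} y(k) \\ y(k+1) \\ \vdots \\ y(k+f-1) \end{bmatrix}, \quad y_p(k) = \begin{bmatrix} y(k-p) \\ y(k-p+1) \\ \vdots \\ y(k-1) \end{bmatrix}, \quad Y_f = [\, y_f(k) \;\; y_f(k+1) \;\; \cdots \;\; y_f(k+N-1) \,], \quad Y_p = [\, y_p(k) \;\; y_p(k+1) \;\; \cdots \;\; y_p(k+N-1) \,]. \tag{5}$$

The subscripts $p$ and $f$ represent the past and the future, where $f > n$. By iterating (1) and (2), we can get

$$y_f(k) = \Gamma_f x(k) + H_f u_f(k) - H_f v_f(k) + G_f p_f(k) + o_f(k), \tag{6}$$

where $\Gamma_f$ is the extended observability matrix and $H_f$ and $G_f$ are block triangular Toeplitz matrices:

$$\Gamma_f = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{f-1} \end{bmatrix}, \quad H_f = \begin{bmatrix} D & 0 & \cdots & 0 \\ CB & D & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ CA^{f-2}B & CA^{f-3}B & \cdots & D \end{bmatrix}, \quad G_f = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ C & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ CA^{f-2} & CA^{f-3} & \cdots & 0 \end{bmatrix}.$$

The vectors $u_f(k)$, $p_f(k)$, $v_f(k)$, and $o_f(k)$ have similar structures to $y_f(k)$. If we use Hankel matrices instead of the vectors to describe the system, (6) can be written as

$$Y_f = \Gamma_f X_k + H_f U_f - H_f V_f + G_f P_f + O_f, \tag{7}$$

where $X_k = [\, x(k) \;\; x(k+1) \;\; \cdots \;\; x(k+N-1) \,]$.
The matrices $U_p$ and $U_f$ are defined similarly to $Y_p$ and $Y_f$, as are the noise matrices $V_f$, $P_f$, and $O_f$.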
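As a minimal sketch of the data arrangement described above, the following NumPy snippet stacks measured samples into past and future block Hankel matrices. The helper name `block_hankel`, the array layout (one sample per row), and the toy signal are assumptions made for illustration only.

```python
import numpy as np

def block_hankel(data, first, depth, n_cols):
    """Block Hankel matrix whose column j stacks data[first+j], ..., data[first+j+depth-1].

    data holds one sample per row (samples x channels); this layout is an
    assumption of the sketch, not something fixed by the paper.
    """
    data = np.asarray(data, dtype=float)
    if first + depth + n_cols - 1 > len(data):
        raise ValueError("not enough samples for the requested Hankel matrix")
    return np.vstack([data[first + i : first + i + n_cols].T for i in range(depth)])

# Toy single-channel signal 0, 1, ..., 9 with past and future horizons of 3.
y = np.arange(10.0).reshape(-1, 1)
p, f = 3, 3
N = len(y) - p - f + 1          # number of columns that fit in the record
Y_p = block_hankel(y, 0, p, N)  # past data: first column is samples 0, 1, 2
Y_f = block_hankel(y, p, f, N)  # future data: first column is samples 3, 4, 5
```

Each column of the future matrix starts exactly where the matching column of the past matrix ends, which is what later allows past data to serve as instruments for future data.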
2.2. Residual Generation and Evaluation
Residual generation and residual evaluation are the two essential steps in designing a fault detection system. The main idea of this paper is to generate the residual directly from input and output data. How the residual is generated and evaluated is described below.
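Moving $H_f U_f$ to the left of the equation, (7) can be rewritten as

$$Y_f - H_f U_f = \Gamma_f X_k - H_f V_f + G_f P_f + O_f. \tag{8}$$

Taking $\Gamma_f^{\perp}$ as the orthogonal complement of the extended observability matrix $\Gamma_f$, we can obtain

$$(\Gamma_f^{\perp})^T \Gamma_f = 0. \tag{9}$$

Since the state sequence $X_k$ is unknown, we left-multiply both sides of (8) by $(\Gamma_f^{\perp})^T$ to eliminate the effect of the unknown state [14]. Then, we can get

$$(\Gamma_f^{\perp})^T (Y_f - H_f U_f) = (\Gamma_f^{\perp})^T (-H_f V_f + G_f P_f + O_f). \tag{10}$$

So, the residual signal is

$$r(k) = (\Gamma_f^{\perp})^T \big( y_f(k) - H_f u_f(k) \big) = (\Gamma_f^{\perp})^T \begin{bmatrix} I & -H_f \end{bmatrix} \begin{bmatrix} y_f(k) \\ u_f(k) \end{bmatrix}. \tag{11}$$

The residual is therefore closely related to the extended observability matrix $\Gamma_f$ and the block triangular Toeplitz matrix $H_f$. To obtain the residual, the key step is to identify $\Gamma_f^{\perp}$ and $H_f^T \Gamma_f^{\perp}$. Section 3 describes in detail how to identify these parameterized matrices.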
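The role of the orthogonal complement in the residual can be illustrated with a small numerical sketch. Here the extended observability matrix and the Toeplitz matrix are built from a known toy model purely for illustration (in the proposed approach they are identified from data instead); the system values, the function names, and the unit sensor bias are all assumptions of this example.

```python
import numpy as np

def extended_observability(A, C, f):
    """Stack C, CA, ..., CA^(f-1)."""
    blocks = [C]
    for _ in range(f - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def block_toeplitz(A, B, C, D, f):
    """Lower block triangular Toeplitz matrix of Markov parameters D, CB, CAB, ..."""
    m, l = D.shape
    markov = [D] + [C @ np.linalg.matrix_power(A, i) @ B for i in range(f - 1)]
    H = np.zeros((m * f, l * f))
    for i in range(f):
        for j in range(i + 1):
            H[i * m:(i + 1) * m, j * l:(j + 1) * l] = markov[i - j]
    return H

def left_null_space(G, tol=1e-10):
    """Orthonormal basis W with W.T @ G = 0, taken from the SVD of G."""
    U, s, _ = np.linalg.svd(G)
    return U[:, int(np.sum(s > tol)):]

# Hypothetical second-order toy system (values chosen only for illustration).
A = np.array([[0.8, 0.3], [0.0, 0.5]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
f = 4

Gamma = extended_observability(A, C, f)
H = block_toeplitz(A, B, C, D, f)
Gamma_perp = left_null_space(Gamma)

# Noise-free simulation from a zero initial state.
rng = np.random.default_rng(0)
u = rng.standard_normal((50, 1))
y = np.zeros((50, 1))
x = np.zeros(2)
for k in range(50):
    y[k] = C @ x + D @ u[k]
    x = A @ x + B @ u[k]

k = 10
y_f, u_f = y[k:k + f].reshape(-1), u[k:k + f].reshape(-1)
r_clean = Gamma_perp.T @ (y_f - H @ u_f)        # vanishes without noise or faults
r_fault = Gamma_perp.T @ (y_f + 1.0 - H @ u_f)  # unit sensor bias on every output
```

Without noise or faults the residual is zero up to rounding error, because the state contribution is annihilated by the orthogonal complement; the sensor bias is not in the range of the observability matrix for this stable toy system, so it leaves a nonzero residual.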
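Once these parameterized matrices are identified, we can get the residual signal. Then, fault detection can be completed by residual evaluation. Let $r(k)$ be the residual vector at the instant $k$, and define the testing statistic [15]

$$J(k) = r(k)^T \Sigma_r^{-1} r(k), \tag{12}$$

where $\Sigma_r$ is the covariance matrix of the residual in the fault-free case. Note that when there is no fault, $J(k) \sim \chi^2(q)$, where $\chi^2(q)$ is the chi-squared distribution with $q$ degrees of freedom and $q$ is the dimension of $r(k)$. We set the $\alpha$-quantile $J_{th} = \chi^2_{\alpha}(q)$ as threshold and define the detection logic as

$$J(k) > J_{th} \;\Rightarrow\; \text{faulty}, \qquad J(k) \le J_{th} \;\Rightarrow\; \text{fault-free}. \tag{13}$$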
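A minimal sketch of this residual evaluation step is given below. The function names, the synthetic two-dimensional residuals, and the confidence level of 0.99 are assumptions for illustration. The closed-form threshold is valid only for two degrees of freedom; other residual dimensions need a general chi-squared quantile function from a statistics library.

```python
import math
import numpy as np

def evaluate_residuals(residuals, cov):
    """Quadratic-form statistic r(k)^T cov^{-1} r(k) for each row r(k) of residuals."""
    residuals = np.atleast_2d(np.asarray(residuals, dtype=float))
    solved = np.linalg.solve(cov, residuals.T)   # cov^{-1} r(k), one column per sample
    return np.einsum("kq,qk->k", residuals, solved)

def chi2_threshold_2dof(alpha):
    """alpha-quantile of the chi-squared distribution with 2 degrees of freedom.

    For 2 degrees of freedom the CDF is 1 - exp(-x / 2), so the quantile has a
    closed form.
    """
    return -2.0 * math.log(1.0 - alpha)

# Fault-free training residuals (synthetic stand-in, two-dimensional).
rng = np.random.default_rng(1)
train = rng.standard_normal((2000, 2))
cov = np.cov(train, rowvar=False)            # fault-free residual covariance

threshold = chi2_threshold_2dof(0.99)        # 0.99 is an assumed confidence level
new_residuals = np.array([[0.5, -0.5], [4.0, 4.0]])
stat = evaluate_residuals(new_residuals, cov)
alarm = stat > threshold                     # alarm when the statistic exceeds the threshold
```

The covariance is estimated once from fault-free data and then reused online, so each new residual only costs one linear solve.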
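Besides the chi-squared distribution, kernel density estimation (KDE) is also utilized to determine the threshold [16]. It is assumed that the measurements follow a Gaussian distribution. The threshold $J_{th}$ can be calculated by $\int_{-\infty}^{J_{th}} \hat{p}(x)\, dx = \alpha$, where $\alpha$ is the confidence level and the density estimate is

$$\hat{p}(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\!\left( \frac{x - x_i}{h} \right), \tag{14}$$

where $x_i$ is the $i$th sample of $x$, $N$ is the number of samples, $h$ is a smoothing parameter, and $K$ is a kernel function.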
Given a proper confidence level, we can get the threshold . In this paper, we set . After the statistics and threshold are deduced, the fault detection can be realized according to the above detection logic.
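The kernel density threshold can be sketched as follows, assuming a Gaussian kernel, a normal-reference rule of thumb for the smoothing parameter, and a confidence level of 0.99. None of these choices is stated in the text, and the fault-free statistics are synthetic. The threshold is found by bisection on the cumulative distribution of the density estimate.

```python
import math
import numpy as np

def kde_cdf(x, samples, h):
    """CDF of a Gaussian-kernel density estimate: average of Phi((x - x_i) / h)."""
    z = (x - samples) / (h * math.sqrt(2.0))
    return 0.5 * (1.0 + float(np.mean([math.erf(v) for v in z])))

def kde_threshold(samples, alpha, h=None, iterations=200):
    """Return (threshold, h) such that the estimated CDF at threshold equals alpha."""
    samples = np.asarray(samples, dtype=float)
    if h is None:
        # Normal-reference rule of thumb for the bandwidth (an assumption of this sketch).
        h = 1.06 * samples.std(ddof=1) * samples.size ** (-0.2)
    lo, hi = samples.min() - 10.0 * h, samples.max() + 10.0 * h
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), h

# Synthetic fault-free test statistics standing in for real training data.
rng = np.random.default_rng(2)
fault_free_stat = rng.chisquare(df=2, size=500)
threshold, h = kde_threshold(fault_free_stat, alpha=0.99)
```

Because the estimated distribution function is continuous and increasing, bisection converges to the unique point where it equals the confidence level.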
3. The Identification of Parameterized Matrices
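From the previous sections, the identification of the matrices related to the residual is the key step in this approach. In this section, we discuss how to identify these matrices. Wang and Qin [12] proposed a subspace identification approach based on PCA, which we briefly introduce. When the process is corrupted by process noise and measurement noise simultaneously, they use instrumental variables to remove the noise, because the past data are uncorrelated with the future noise. Let $Z_p = [\, Y_p^T \;\; U_p^T \,]^T$ and $Z_f = [\, Y_f^T \;\; U_f^T \,]^T$ collect the past and future data. According to these assumptions, we can get

$$\lim_{N \to \infty} \frac{1}{N} V_f Z_p^T = 0, \qquad \lim_{N \to \infty} \frac{1}{N} P_f Z_p^T = 0, \qquad \lim_{N \to \infty} \frac{1}{N} O_f Z_p^T = 0, \tag{15}$$

$$\lim_{N \to \infty} \frac{1}{N} \big( -H_f V_f + G_f P_f + O_f \big) Z_p^T = 0. \tag{16}$$

If we utilize $E_f = -H_f V_f + G_f P_f + O_f$ to represent the combined noise term instead of the individual noise matrices, (11) can be expressed in Hankel form as

$$(\Gamma_f^{\perp})^T \begin{bmatrix} I & -H_f \end{bmatrix} Z_f = (\Gamma_f^{\perp})^T E_f. \tag{17}$$

Right-multiplying both sides of the equation by $\frac{1}{N} Z_p^T$, we can get

$$\frac{1}{N} (\Gamma_f^{\perp})^T \begin{bmatrix} I & -H_f \end{bmatrix} Z_f Z_p^T = \frac{1}{N} (\Gamma_f^{\perp})^T E_f Z_p^T. \tag{18}$$

From (16), we know

$$\lim_{N \to \infty} \frac{1}{N} (\Gamma_f^{\perp})^T \begin{bmatrix} I & -H_f \end{bmatrix} Z_f Z_p^T = 0. \tag{19}$$

Equation (19) indicates that $\frac{1}{N} Z_p Z_f^T$ has zero scores along the directions spanned by $[\, I \;\; -H_f \,]^T \Gamma_f^{\perp}$. From the rank condition we know that when $f > n$, $\frac{1}{N} Z_p Z_f^T$ has zero scores. Performing PCA on the process data,

$$\frac{1}{N} Z_p Z_f^T = T P^T + \tilde{T} \tilde{P}^T. \tag{20}$$

When $\tilde{T} = 0$,

$$\tilde{P} = \begin{bmatrix} I & -H_f \end{bmatrix}^T \Gamma_f^{\perp} M, \tag{21}$$

$$\Gamma_f^{\perp} = \tilde{P}_y M^{-1}, \qquad -H_f^T \Gamma_f^{\perp} = \tilde{P}_u M^{-1}, \tag{22}$$

where $M$ is a nonsingular matrix, $\tilde{P}_y$ is the first $mf$ rows of $\tilde{P}$, and $\tilde{P}_u$ is the last $lf$ rows of $\tilde{P}$, with $m$ and $l$ the numbers of outputs and inputs. The nonsingular matrix $M$ is taken as the unit matrix in this paper. The matrices $\Gamma_f^{\perp}$ and $H_f^T \Gamma_f^{\perp}$ can be obtained according to (22), and then the residual can be generated as $r(k) = \tilde{P}_y^T y_f(k) + \tilde{P}_u^T u_f(k)$. The procedure from data to fault detection is summarized in Algorithm 1.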
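The identification step can be sketched end to end on a small example. The snippet below is a sketch under stated assumptions, not the paper's Algorithm 1: it uses a hypothetical noise-free toy plant, treats the system order as known, and obtains the residual loadings as the right singular vectors of the past/future cross-product matrix that belong to its smallest singular values. All function names are illustrative.

```python
import numpy as np

def block_hankel(data, first, depth, n_cols):
    """Column j stacks data[first+j], ..., data[first+j+depth-1] (one sample per row)."""
    return np.vstack([data[first + i : first + i + n_cols].T for i in range(depth)])

def identify_loadings(u, y, p, f, order):
    """Residual loadings (P_y, P_u) from the PCA of Z_p Z_f^T / N."""
    N = len(y) - p - f + 1
    Z_p = np.vstack([block_hankel(y, 0, p, N), block_hankel(u, 0, p, N)])
    Z_f = np.vstack([block_hankel(y, p, f, N), block_hankel(u, p, f, N)])
    # The PCA loadings of Z_p Z_f^T / N are its right singular vectors.
    _, _, Vt = np.linalg.svd(Z_p @ Z_f.T / N)
    mf = y.shape[1] * f
    P_res = Vt[-(mf - order):].T         # directions with (near-)zero scores
    return P_res[:mf], P_res[mf:]

def residuals(u, y, f, P_y, P_u):
    """One residual vector per row: P_y^T y_f(k) + P_u^T u_f(k)."""
    N = len(y) - f + 1
    return (P_y.T @ block_hankel(y, 0, f, N) + P_u.T @ block_hankel(u, 0, f, N)).T

# Hypothetical toy plant (order 2, one input, one output), noise-free for clarity.
A = np.array([[0.8, 0.3], [0.0, 0.5]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

def simulate(u):
    x, y = np.zeros(2), np.zeros((len(u), 1))
    for k in range(len(u)):
        y[k] = C @ x + D @ u[k]
        x = A @ x + B @ u[k]
    return y

def colored_input(rng, n_samples):
    """Moving-average filtered white noise, so future inputs correlate with past data."""
    e = rng.standard_normal(n_samples + 4)
    return np.convolve(e, [1.0, 0.8, 0.6, 0.4, 0.2], mode="valid").reshape(-1, 1)

rng = np.random.default_rng(3)
u_train = colored_input(rng, 2000)
P_y, P_u = identify_loadings(u_train, simulate(u_train), p=4, f=4, order=2)

u_test = colored_input(rng, 300)
y_test = simulate(u_test)
r_clean = np.linalg.norm(residuals(u_test, y_test, 4, P_y, P_u), axis=1)
y_fault = y_test.copy()
y_fault[150:] += 1.0                     # sensor bias from sample 150 on
r_fault = np.linalg.norm(residuals(u_test, y_fault, 4, P_y, P_u), axis=1)
```

On fault-free test data the residual norm stays at rounding-error level, and it becomes clearly nonzero once the bias is present. With noisy data the smallest singular values are no longer exactly zero, so the number of residual directions would have to be chosen from the singular value spectrum rather than from a known order.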

4. Simulation Study on the Tennessee Eastman Process
In this section, we apply the proposed approach to the Tennessee Eastman (TE) process. Three indices, fault detection rate (FDR), false alarm rate (FAR) [17], and fault detection time (FDT), are used to demonstrate the efficiency of the proposed approach. The fault detection time is the first time instant at which the testing statistic exceeds the threshold.
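The three indices can be computed from a sequence of test statistics as in the sketch below. The definitions used here are the common ones and are an assumption (the cited reference may define them slightly differently): FDR is the fraction of faulty samples that raise an alarm, FAR is the fraction of fault-free samples that raise an alarm, and FDT is the first alarm index at or after the fault onset. The function name and the toy numbers are illustrative.

```python
import numpy as np

def detection_indices(stat, threshold, fault_start):
    """Return (FDR, FAR, FDT) for a statistic sequence with a known fault onset index."""
    alarms = np.asarray(stat, dtype=float) > threshold
    far = float(alarms[:fault_start].mean())      # alarms raised before the fault
    fdr = float(alarms[fault_start:].mean())      # alarms raised while the fault is active
    hits = np.flatnonzero(alarms[fault_start:])
    fdt = int(fault_start + hits[0]) if hits.size else None  # None: fault never detected
    return fdr, far, fdt

# Toy statistic sequence with the fault introduced at sample index 4.
stat = [1.0, 2.0, 12.0, 1.0, 3.0, 11.0, 15.0, 20.0]
fdr, far, fdt = detection_indices(stat, threshold=9.21, fault_start=4)
# fdr = 0.75, far = 0.25, fdt = 5
```

Returning `None` for an undetected fault mirrors the convention of leaving the detection time unreported when the fault is never detected.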
4.1. TE Process
The TE process model is a realistic simulation program of a chemical process that is widely accepted as a benchmark for control and monitoring studies [18]. The TE model that we used was downloaded from http://depts.washington.edu/control/LARRY/TE. The flow diagram of the process is shown in Figure 2. The process contains five major units: reactor, condenser, compressor, separator, and stripper.
There are 53 measurements, 12 of which are manipulated variables and 41 are process variables. The 12 manipulated variables are listed in Table 1. In this paper, the other 9 manipulated variables except for , , and are taken as inputs. The dimension of the TE dataset is very large. To overcome the curse of dimensionality, the 41 process variables are divided into eight blocks [19]. We choose the input feed block as outputs, which are shown in Table 2. Table 3 shows the 20 process faults of TE process.
In our simulation, the data are acquired in Mode 1. The operating time is set to 36 h. The total number of samples is 3600. The faults are introduced at 8 h, which means that the fault occurs after 800 samples. The past and future horizons are both set as 10.
4.2. Simulation Results
The process is simulated with 20 different faults. For the sake of simplicity, only some typical faults are shown. The first type of fault is a step change in the A/C feed ratio (IDV1). Figure 3 shows the testing statistic based on residual vectors for the input feed block. The blue line is the testing statistic and the threshold is shown by a red line. The second type of fault is a random variation in the A, B, and C feed composition (IDV8). The simulated result is shown in Figure 4. Figure 5 shows the testing statistic of the third type of fault, which is a sticking fault in the reactor cooling water valve (IDV14). The last fault (IDV17), shown in Figure 6, is an unknown fault: its type and the affected process variables are not known.
Based on the fault detection method, if the testing statistics exceed the threshold, the fault is detected. We can see from Figures 3–6 that all the testing statistics exceed the threshold after 800 samples. The four faults have been detected.
In addition to the simulation results, the FDR, FAR, and FDT shown in Table 4 are also obtained to demonstrate the efficiency. Besides the comparison between SMI-PCA and SMI-PCA KDE, we also compare them with classic PCA in order to better reflect the advantages of the proposed method. The FDRs of the two PCA monitoring statistics, taken from the study of Mahadevan and Shah [20], are listed in the table. The maximum fault detection rate for each fault is highlighted in boldface. For most faults, the fault detection rate of the proposed method is higher than that of PCA. The fault detection rate obtained with the kernel density estimation threshold is slightly higher than that obtained with the chi-squared threshold. Most of the faults are detected well, with high FDRs and low FARs. But for a few faults the efficiency of fault detection is very poor. This may be caused by a very small change in the variables, which is also a common problem in TE process fault detection. It is meaningless to calculate the fault detection time if a fault cannot be detected, so no fault detection time is reported for these faults in the table. The fault detection times of SMI-PCA and SMI-PCA KDE are the same because of their similar detection rates.
5. Conclusions
In this paper, a subspace method aided data-driven fault detection approach based on PCA has been presented. The method identifies the parameterized matrices using PCA and constructs residual generators from the input and output data. A simulation study on the TE process demonstrates the effectiveness of this method. It indicates that the proposed method achieves higher fault detection rates than PCA for most faults and is suitable for linear systems which are observable and controllable. Moreover, the choice of threshold is also discussed. The fault detection rate obtained with kernel density estimation is slightly higher than that obtained with the chi-squared distribution.
Compared with the traditional model-based fault detection method, prior knowledge of the model mechanisms is not needed, and only the parameterized matrices related to the residuals, rather than the system model, need to be identified.
Conflicts of Interest
The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This work is financially supported by the Fundamental Research Funds for the Central Universities (WUT:2017II41GX).