Abstract

Incipient faults in high-speed railways are rarely addressed before they develop into faults or failures. In this paper, a new data-driven incipient fault estimation (FE) methodology is proposed within a multivariate statistics framework; it combines the Kullback-Leibler divergence (KLD) from information theory with neural network approximation from machine learning. By defining a sensitive fault indicator (SFI), the incipient fault amplitude can be precisely estimated. The proposed incipient FE algorithm is examined on the experimental platform of China Railway High-speed 2 (CRH2), and its higher sensitivity and accuracy with respect to tiny abnormalities are demonstrated. Following the incipient FE results, several factors affecting FE performance are further analyzed.

1. Introduction

Due to its high efficiency and loading capacity, high-speed railway has developed rapidly over the past two decades [1–3]. As one of the most important pieces of equipment in high-speed railway, the inverter is the actuator of the motors, and its sensor information is used to construct the motor control law [4]. Typically, sensors installed in the inverter are vulnerable to faults because they interact directly with the working environment [3, 5]. Such faults may lead to abnormal motor operation and loss of traction force, and may even cause an emergency stop. Thus, real-time sensor fault detection and diagnosis (FDD) and fault estimation (FE) are urgently needed in high-speed railway to improve its reliability.

There is abundant research on inverter sensor FDD from the past several decades [2, 4, 6–9]. Most of it falls into two general categories: model-based methods and knowledge-based methods. Model-based FDD methods build a mathematical model of the electrical traction system from the outputs or working mechanisms of the monitored system. Usually, a residual is generated as the fault indicator from the available input and output information of the inverter sensors [4, 6, 9, 10]. Knowledge-based methods [2, 8], in contrast, mainly depend on prior knowledge of the inverter, whose characteristics are summarized to distinguish or diagnose faults.

Besides these, data-driven FDD methods, which originated in the chemical industry, use sampled data to directly analyze features such as data correlation, spectrum, and mean value. Such latent information is crucial for identifying and diagnosing sensor faults without a mathematical model of the system. Recently, some modified data-driven methods were proposed to improve process monitoring performance. For example, key-performance-indicator-related PLS was developed to acquire more meaningful fault information [11], and fault-relevant PCA was designed to select variables optimally [12]. Detailed reviews of data-driven methods can be found in [13–15]. However, these methods have rarely been applied to FDD in electrical domains.

Model-based incipient fault estimation requires an accurate mathematical model of the high-speed railway inverter. However, due to the complex operating environment, the inverter model is time-varying, is affected by unknown noise, and cannot be represented by accurate functions. In addition, expert knowledge cannot be completely acquired and is often invalid for incipient faults. Therefore, a data-driven method is considered for estimating incipient faults in the inverter of high-speed railway.

In this paper, the main contributions of the proposed method are summarized as follows:
(1) Considering the non-Gaussian characteristics of the traction system, the rotary principal component subspace (RPCS) and rotary residual subspace (RRS) are introduced for the first time for data-driven FE problems.
(2) Aiming at incipient FE, a sensitive fault indicator (SFI) is defined without any improper assumptions on the measured signals.
(3) The analytic relation between incipient fault amplitude and SFI is analyzed, and this relation is then approximated by a neural network.

The rest of this paper is organized as follows. Section 2 gives a brief introduction to the inverter of China Railway High-speed 2 (CRH2), PCA, and the Kullback-Leibler divergence (KLD). Section 3 presents a rotary data space comprising the RPCS and RRS, in which the SFI is defined by an information distance, the KLD; the relation between incipient fault amplitude and SFI is then analyzed and obtained by theoretical derivation and function approximation. The whole FE strategy is presented in Section 4. Section 5 presents the FE results and discusses the performance of the proposed method. Finally, conclusions drawn from the research, along with discussions, are given in Section 6.

2.1. A Three-Phase Three-Level VSI of CRH2

The CRH2 traction system is a complex electrical system consisting of several types of electrical equipment, such as traction transformers, traction rectifiers, traction inverters, and traction motors. The traction inverter is regarded as a major part, which supplies variable velocity variable frequency voltage. A three-phase three-level voltage source inverter (VSI) of CRH2 is mainly composed of a filter circuit, an overvoltage suppression unit, Insulated-Gate Bipolar Transistors (IGBTs), and so on. As shown in Figure 1, every IGBT switch is composed of a transistor unit (TU) and a diode unit (DU), and there are altogether twelve switches in the inverter.

Incipient faults in CRH2 have not been sufficiently considered. Several studies [5, 16, 17] state that an incipient fault is characterized by the following features:
(i) Qualitative aspect: the degradation of system performance is so insignificant that it does not trigger any set alarm.
(ii) Quantitative aspect: the relative deviation of the faulty signal from its actual value under normal conditions is quite small, for example, ranging from 1% to 10%.
(iii) Necessity aspect: if the incipient fault is not detected and no action is taken, it will develop into a fault or even a failure as time goes on.

Based on the main circuit protection list of CRH2, there are altogether 41 types of faults [1]. Although some abnormalities of 3%–15% can be detected, many missed alarms and false alarms still occur. For example, the overcurrent fault encoded as Fault 28 will not trigger any fault alarm.

A physical model of a common train system derived from first principles contains 84 differential equations [18]. For the more complex CRH2, it is impossible to set up an accurate system model. In addition, the high-frequency charging and discharging of many energy storage devices makes the signals change frequently. As a result, all signals in the inverter obey non-Gaussian distributions. In the following two subsections, the preliminaries of the SFI are presented.

2.2. Basic Form of PCA Process Monitoring

PCA is a popular multivariate statistical method proposed for dimensionality reduction of large sets of correlated data [19]. The lower-dimensional subspace obtained by projection retains most of the features of the original data [20]. This data-driven method and its many variants [13, 21, 22] have been successfully applied in fault detection and diagnosis [14, 15].

The offline training data contains sampling measurements from sensors and can be completely denoted as , where is -th sampling. It is usually normalized to zero mean and unit variance before PCA modeling, and its normalized form can be written as . Then, the sample covariance matrix can be denoted as Performing an SVD on the covariance matrix as where , all eigenvalues are ranked in descending order. As given in [20], the loading matrix and the diagonal eigenvalue matrix are usually partitioned into the following forms according to : where , , , , and is the number of principal components. In this application, can be obtained by the cumulative percent variance. Then the principal and residual parts of can be calculated by and , respectively.
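The PCA modeling steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `fit_pca`, the 90% cumulative percent variance threshold, the `ddof=1` variance convention, and the synthetic three-column data set are all assumptions made for the example.

```python
import numpy as np

def fit_pca(X, cpv=0.90):
    """Fit a PCA monitoring model to an (N, m) data matrix X.

    cpv is an assumed cumulative-percent-variance threshold used to pick
    the number of principal components.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                      # zero mean, unit variance per column
    S = Z.T @ Z / (Z.shape[0] - 1)            # sample covariance matrix, shape (m, m)
    P, eigvals, _ = np.linalg.svd(S)          # eigenvalues are returned in descending order
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(ratio, cpv)) + 1, len(eigvals))
    return Z, P[:, :k], P[:, k:], eigvals, k  # data, principal loadings, residual loadings

# Hypothetical data: two independent sources plus one nearly redundant column.
rng = np.random.default_rng(0)
src = rng.normal(size=(500, 2))
X = np.column_stack([src[:, 0], src[:, 1], src[:, 0] + src[:, 1]])
X = X + 0.01 * rng.normal(size=X.shape)

Z, P_pc, P_res, eigvals, k = fit_pca(X)
Z_pc = Z @ P_pc @ P_pc.T      # projection onto the principal subspace
Z_res = Z @ P_res @ P_res.T   # projection onto the residual subspace
```

Because the loading matrix is orthogonal, the principal and residual parts add back up to the normalized data, which is a quick sanity check on any implementation of this decomposition.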

2.3. Kullback-Leibler Divergence

The KLD is one of the most fundamental quantities in information theory [23]. It has been regarded as a powerful tool in many applications [24, 25]. The original definition can be found in [26] in the following form: where has base , and are the continuous probability density functions (PDFs), and is the defined KLD of with respect to . As pointed out in [23], the KLD satisfies a Pythagorean inequality, so a symmetric quantity is usually defined to be

Given two normal PDFs such that then the KLD between the two normal distributions above, in the case of positive variances in (5), is equal to
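For two univariate normal densities the KLD has a well-known closed form, so no numerical integration or density estimation is required. The sketch below uses the standard textbook expressions, assuming natural logarithms and assuming that the symmetric quantity is the sum of the two directed divergences; the function names are hypothetical and the notation may differ from the paper's own equations.

```python
import math

def kld_gauss(mu1, var1, mu2, var2):
    """Directed KLD I(f || g) for f = N(mu1, var1) and g = N(mu2, var2).

    Standard closed form with natural logarithm; variances must be positive.
    """
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def kld_symmetric(mu1, var1, mu2, var2):
    """Symmetric divergence, assumed here to be I(f || g) + I(g || f)."""
    return kld_gauss(mu1, var1, mu2, var2) + kld_gauss(mu2, var2, mu1, var1)

# Identical densities give zero; a unit mean shift at unit variance gives 0.5 per direction.
same = kld_gauss(0.0, 1.0, 0.0, 1.0)
shifted = kld_gauss(0.0, 1.0, 1.0, 1.0)
both = kld_symmetric(0.0, 1.0, 1.0, 4.0)
```

In the symmetric sum the logarithmic terms cancel, leaving an expression that depends only on the variance ratio and on the squared mean difference scaled by the two variances.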

3. Incipient Fault Estimation Method

3.1. Data Preprocessing

As explained in Section 2.1, CRH2 is a non-Gaussian system, which prevents existing PCA-based FDD methods from being applied directly. Therefore, the system data should be preprocessed according to its properties so that the measurements from the three current sensors in Figure 1 approximately follow a Gaussian distribution. Accordingly, some preprocessing steps based on the characteristics of the selected signals should be implemented before incipient FE.

Let be the offline normal current signals of the CRH2; it is straightforward to know that . Figure 2 gives the relationship among three data spaces: the three-phase coordinate, the stationary α-β coordinate, and the rotary d-q coordinate. In Figure 1, the currents, , , and , are revolving axes with the same constant length in the original three-phase coordinate if there is no fault.

In stationary coordinates, the phase currents can be transformed from the three-phase frame to the α-β frame by the Clarke transformation, whose mathematical formulation is given in where is the Clarke transformation matrix and is the angle between the coordinate and the coordinate . To reduce complexity, is chosen as 0, as depicted in Figure 2.

The d-q coordinate in Figure 2 is a rotary frame whose angular velocity is the synchronous angular velocity. The transformation angle can be calculated by . Then the currents and can be derived from and in the stationary α-β frame, and the Park transformation matrix is given as follows:

Furthermore, the transformation matrix from the three-phase coordinate to the d-q coordinate can be obtained by combining the Clarke and Park transformation matrices, and its form can be given as
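The combined Clarke and Park projection can be illustrated with a short NumPy sketch. It assumes the amplitude-invariant (2/3-scaled) Clarke matrix with a zero alignment angle and a Park rotation by the synchronous angle; the scaling convention in the paper may differ, and the 50 Hz frequency, the 100 A amplitude, and the names `CLARKE`, `park_matrix`, and `abc_to_dq` are hypothetical.

```python
import numpy as np

# Amplitude-invariant Clarke matrix (assumed convention, alignment angle 0).
CLARKE = (2.0 / 3.0) * np.array([[1.0, -0.5, -0.5],
                                 [0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2]])

def park_matrix(theta):
    """Rotation from the stationary alpha-beta frame to the rotary d-q frame."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s], [-s, c]])

def abc_to_dq(i_abc, theta):
    """Map one three-phase current sample of shape (3,) to d-q of shape (2,)."""
    return park_matrix(theta) @ CLARKE @ i_abc

omega = 2 * np.pi * 50.0          # hypothetical synchronous angular velocity, rad/s
amp = 100.0                       # hypothetical current amplitude, A
t = np.linspace(0.0, 0.04, 400)
i_abc = amp * np.stack([np.cos(omega * t),
                        np.cos(omega * t - 2 * np.pi / 3),
                        np.cos(omega * t + 2 * np.pi / 3)], axis=1)

# With theta = omega * t, balanced sinusoids collapse to constant d-q values.
i_dq = np.array([abc_to_dq(x, omega * tk) for x, tk in zip(i_abc, t)])
```

Under these assumptions a balanced, fault-free set of three-phase sinusoids maps to a constant d component equal to the amplitude and a zero q component, which is why the rotary frame is a convenient place to apply Gaussian-based statistics once measurement noise is added.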

With , it can be obtained by By the nonlinear projections in (11), can be expressed as where and are the principal and residual subspaces belonging, respectively, to , and and are the principal and residual vectors of . Since is closely related to the rotary angular velocity of the d-q coordinate, both subspaces must be rotary. Then, the three-phase sine currents can be projected onto the RPCS and RRS, which are, respectively, spanned by and .

Remark 1. The data processing in (12) transforms the original data set into a new rotary data set which approximately obeys a Gaussian distribution. It makes an important connection between in (2) and in (7). Rewriting and combining (1) with (2), it is easy to see that , where . Taking as the reference model obtained from offline data and calculating from the online data, the KLD can be used as an SFI to estimate the amplitude of an incipient fault online.

3.2. SFI

After the above nonlinear projections, the measurement signals approximately obey a Gaussian distribution. Let the reference PDF of the score be . After data normalization for , every column of has zero mean and unit variance. Then, is zero and every variance parameter of the principal scores can be obtained from in (3). For online sampling, the online PDF can be estimated from the scores in the RPCS and RRS.

For the incipient fault to be estimated accurately, an SFI with high sensitivity to faults should be chosen. To emphasize the unpredictable small changes caused by incipient faults, the SFI using the KLD can be defined as where is the sign of .

The SFI in (13) contains two groups of parameters: and , obtained from offline data, and and , calculated in real time.
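A signed, KLD-based indicator of this kind can be sketched as follows. The exact expression in (13) is not reproduced here, so this example assumes that the SFI is the symmetric Gaussian KLD between the reference and online score distributions, multiplied by the sign of the mean shift; the names `sym_kld` and `sfi` and the test windows are hypothetical.

```python
import numpy as np

def sym_kld(mu0, var0, mu1, var1):
    """Symmetric KLD between N(mu0, var0) and N(mu1, var1), standard closed form."""
    d2 = (mu1 - mu0) ** 2
    return 0.5 * (var0 / var1 + var1 / var0 + d2 * (1.0 / var0 + 1.0 / var1) - 2.0)

def sfi(mu_ref, var_ref, window_scores):
    """Signed indicator: sign of the mean shift times the symmetric KLD (assumed form).

    mu_ref, var_ref come from offline data; window_scores are recent online scores.
    Note that a pure variance change with zero mean shift yields 0 under this assumption.
    """
    mu_on = np.mean(window_scores)
    var_on = np.var(window_scores, ddof=1)
    return np.sign(mu_on - mu_ref) * sym_kld(mu_ref, var_ref, mu_on, var_on)

# Hypothetical window standardized to exactly zero mean and unit variance.
rng = np.random.default_rng(1)
base = rng.normal(size=20)
base = (base - base.mean()) / base.std(ddof=1)

sfi_pos = sfi(0.0, 1.0, base + 0.3)   # small positive mean shift
sfi_neg = sfi(0.0, 1.0, base - 0.3)   # small negative mean shift
sfi_big = sfi(0.0, 1.0, base + 0.6)   # larger positive mean shift
```

With unit variances the indicator reduces to the squared mean shift carrying the sign of the shift, so it grows monotonically with the magnitude of the deviation in either direction.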

Remark 2. For a slight abnormality to be estimated accurately, the SFI in (13) should satisfy four conditions: (i) sufficient sensitivity to faults; (ii) appropriate robustness to noise; (iii) a precise relation between fault amplitude and SFI; (iv) a one-to-one ("double") mapping between fault amplitude and SFI.

For condition (i), the proposed SFI is more sensitive than the Mahalanobis and Euclidean distances; a similar theoretical demonstration can be found in [27]. For condition (ii), satisfactory robustness to noise is analyzed in Section 5.3. Condition (iii) is easy to achieve because many curve fitting techniques can be used. In this application, a neural network is adopted; the relation and supporting experimental results are presented in Sections 3.3 and 5.2, respectively. The mapping relation in condition (iv) holds because the sgn function ensures that the fault and the SFI share the same sign, and the positive correlation between fault amplitude and SFI follows from Sections 3.3 and 3.4.

Remark 3. The developed SFI in (13) has three advantages over related works. Firstly, it is more effective in detecting a change in mean deviation, which the methods in [16, 17] cannot do because they assume that is 0. Secondly, it can determine the sign of the faulty parameter, which is useful in subsequent fault isolation. Thirdly, it is more robust to noise and disturbances than the methods in [16], because a moving window strategy is used in (13).

3.3. Covariance Matrix Analysis

For simplicity, is replaced by in the following analysis. In this study, let the fault-free portion of the -th measurement variable be . In the presence of an incipient fault , the sample value can be described as follows [13]: To simplify the analysis, (14) can be further rewritten as where is the amplitude variation rate on . Assume that there are observations; then (15) can be expressed in multivariate statistical form as where is the fault magnitude rate (FMR) matrix and denotes the Hadamard product.

Assume that the -th sensor is affected by an incipient fault after the sampling step , and that the change of fault amplitude depends on the sampling step ; then , and the FMR vector . For simplicity, the single FMR value is abbreviated as where . Here, when ; and if . For an incipient fault, the fault magnitude satisfies within a small moving window of size ; then one can conclude that . After data normalization using the means and variances obtained from the offline data set, in (16) can be rewritten as where . Substituting by in (2), the online covariance matrix is where , , and . Based on the computation rule of the Hadamard product, we can see that where , , and .

Asymptotically, as . Comparing the covariance matrices to , the statistical parameters are changed slightly by the amplitude of the incipient fault . Because the spanning vectors of the RPCS and RRS share the same directions, the eigenvalues are only slightly affected by , as seen from the expression of . Based on (18), there is That is to say, when the measurements are affected by a tiny abnormal value. Then a Taylor expansion of can be used in the neighborhood of with the following form: From (20), . Substituting (21) into (20) gives where then . It is interesting to find that all partial derivatives of of order higher than 2 are equal to . Thus, can be described as

Remark 4. From (24), is only affected by fault amplitude. As and , the KLD between offline and online PDFs can be directly calculated by and instead of obtaining PDFs by kernel density estimation, the merit of which is a remarkable computational cost reduction.

3.4. Incipient FE Analysis

Recall (13) and combine the assumptions and , which are the PDFs of and ; then the -th SFI can be written as where In (25), , and can be obtained from online data. From the above expressions, the variations of the SFI caused by an incipient fault depend only on the fault magnitude. In fact, there exist four candidate analytical expressions of from which the relation between fault amplitude and KLD could be derived. However, two main reasons make this solution improper: (i) it may lead to an undetermined result among the four analytical expressions of ; (ii) some approximations are used in the above expressions, which will introduce estimation bias on the fault amplitude even if the system noise is considered.

Considering the relation in (25), the correlation between SFI and fault amplitude can hence be given as In (27), the correlation must be nonlinear and monotonically increasing. An alternative solution based on a neural network is therefore used to determine the nonlinear relationship . Using a backpropagation neural network, the correlations for the application in Section 5 are shown in Figures 3 and 4. Both curves reflect the correlations described in (27). By providing an indicator proportional to the abnormality, the curves in Figures 3 and 4 make FE possible.

4. On-Line Fault Estimation Strategy

The flow diagram of the fault estimation strategy is given in Figure 5; it contains data preprocessing, SFI computation, and fault estimation. Data preprocessing is one of the key steps in Figure 5, as it makes the original data set obey a Gaussian distribution in the rotary space. The rotary speed depends on the rotary angular velocity of the traction motor in CRH2. In addition, data normalization is needed, which gives the same weighting for and and simplifies the online fault diagnosis algorithms. The data normalization steps are summarized in Algorithm 5, and the projection of the original data into a rotary space is embedded in Algorithms 6 and 7. The complete fault estimation strategy shown in Figure 5 can be implemented by Algorithms 6 and 7.

Algorithm 5. Consider the following.
Step 1. Define , where and ; then calculate the mean value and variance of every column as
Step 2. Based on and , the normalization data can be defined by the following equation:
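The two normalization steps amount to column-wise standardization, with the statistics estimated once from offline data and then reused online. The following is a minimal sketch under that reading; the function names, the `ddof=1` variance convention, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def fit_normalizer(X_offline):
    """Step 1: column-wise mean and standard deviation of the offline data."""
    mean = X_offline.mean(axis=0)
    std = X_offline.std(axis=0, ddof=1)   # sample standard deviation (assumed convention)
    return mean, std

def normalize(X, mean, std):
    """Step 2: scale data using the stored offline statistics."""
    return (X - mean) / std

# Hypothetical offline data with different offsets and scales per column.
rng = np.random.default_rng(2)
X_off = rng.normal(loc=[5.0, -2.0], scale=[3.0, 0.5], size=(1000, 2))
mean, std = fit_normalizer(X_off)
Z_off = normalize(X_off, mean, std)

# Online samples reuse the offline statistics rather than re-estimating them.
x_new = np.array([[5.0, -2.0]])
z_new = normalize(x_new, mean, std)
```

Reusing the offline mean and variance online matters here: re-estimating them from faulty data would absorb part of the deviation that the indicator is meant to expose.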

4.1. Offline Modeling Steps

Algorithm 6. Consider the following.
Step 1. Collect normal operating data from three current sensors in CRH2 under steady state.
Step 2. Project into a rotary data space by (11).
Step 3. Normalize by Algorithm 5.
Step 4. Do an SVD as follows: where and .
Step 5. Calculate the score matrix in RPCS by where .
Step 6. Obtain the mean value and variance of score vector by
Step 7. Determine the nonlinear double mapping relations between and , .

4.2. Online FE Steps

Algorithm 7. Consider the following.
Step 1. Load a new current data from the running CRH2.
Step 2. Project into RPCS and normalize the data as using Algorithm 5.
Step 3. Obtain online score vector by where and is the size of moving window.
Step 4. Calculate the mean value and variance of the online score vector by
Step 5. Compute SFI by (13).
Step 6. Estimate the fault amplitude according to the relations obtained in Step 7 of Algorithm 6. Then go back to Step 1.
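The moving-window statistics used in the online steps can be sketched as below. The window length of 20 follows the value reported in Section 5.3; the function name `moving_stats`, the use of the sample variance, and the synthetic step signal are assumptions made for illustration.

```python
import numpy as np

def moving_stats(scores, window=20):
    """Mean and sample variance over the most recent `window` scores at each step.

    Entries before the first full window are left as NaN.
    """
    scores = np.asarray(scores, dtype=float)
    means = np.full(scores.shape, np.nan)
    variances = np.full(scores.shape, np.nan)
    for k in range(window - 1, len(scores)):
        w = scores[k - window + 1 : k + 1]   # the last `window` samples up to step k
        means[k] = w.mean()
        variances[k] = w.var(ddof=1)
    return means, variances

# Hypothetical score sequence with a unit step at sample 50.
scores = np.zeros(100)
scores[50:] = 1.0
means, variances = moving_stats(scores, window=20)
```

In this toy sequence the windowed mean starts to rise at the step but only reaches the new level once the window contains post-step samples exclusively, which illustrates the trade-off between noise averaging and estimation delay.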

Remark 8. The reasons for adopting the moving window approach in Step 3 of Algorithm 7 are the following: (i) a single sampled value has no mean and variance; (ii) the moving window approach weakens the influence of noise.

5. Results and Illustrations

To test the performance of the proposed method, experiments are conducted in this section on the CRH2 experimental platform of Central South University [28], as shown in Figure 6. The experimental setup, which supports fault injection for CRH2, consists of a DSP controller, an upper computer, dSPACE, data acquisition, and display devices. Its main electric multiple unit (EMU) parameters are given in Table 1. In this part, we concentrate on incipient sensor FE to illustrate the effectiveness of the proposed methodology.

5.1. Incipient Fault Injections

When the reference running speed  km/h is given, the six curves remain unchanged after 1 s, which indicates that the traction system is running in the steady state. Therefore, the historical data are collected after 1 s to establish the offline data model. In this paper, the three continuous output currents are the selected signals.

For simplicity and without loss of generality, -phase current is chosen as the corrupted signal, and two types of incipient current sensor faults are considered.

Case 1 (incipient bias fault ). After 0.5 s,  A is injected into .

Case 2 (incipient ramp fault ). After 0.5 s,  A is injected into .

5.2. Incipient Fault Estimation Results

(1) Results for . For the normal and the corrupted current signals with , both data sets under SNR = 30 dB are shown in rotary space in Figure 7. The added sensor noise gives an SNR of 30 dB, which is a reasonable noise level for an electric system. The equivalent value of the actual fault is presented in Figure 8. As the current sensors are corrupted by zero-mean Gaussian noise, an extremely tiny value exists under normal conditions, which can be filtered out by the proposed method. After 0.5 s, the constant bias fault fluctuates persistently as a result of the preprocessing, as shown in Figure 8. In rotary data space, this abnormality is a sine wave whose period and amplitude depend on the synchronous angular velocity and the abnormal value, respectively.

Based on the online strategy in Section 4, two latent components are selected to describe the data model in rotary space. This is due to the two approximate equalities of the eigenvalues in (30). In Figures 9 and 10, the two SFIs display their sensitivity to tiny distortion. Both figures clearly show this effectiveness for a small bias. This performance is satisfactory for practical application under a 30 dB noise level.

In Figures 11 and 12, the red line is the actual incipient fault value on , and the blue line marked by + is the value estimated by the proposed method. In the enlarged view in Figure 11, the estimated amplitude is close to the actual fault value, which supports the correctness of the proposed incipient fault estimation method. However, the estimation results show a small, acceptable delay. The reason for this is discussed in Section 5.3. Similarly, the result based on in Figure 12 also shows satisfactory estimation performance.

(2) Results for . For a ramp incipient fault, its waveform and the corresponding corrupted signals are shown in Figure 13. In this case, the tiny amplitude climbs steadily from 0 to 1% as the time goes from 0.5 s to 1 s. This makes FE more difficult than for a constant bias distortion. Under the nonlinear projection onto the rotary space, its gradually changing characteristic is preserved, as depicted in Figure 14.

For this type of slight abnormality, Figures 15 and 16 present and . Both indicators, and , emphasize the tiny fault successfully. It is interesting to see that the two curves fluctuate with the fault amplitude.

After the drifting fault is successfully detected, Figures 17 and 18 display the actual and estimated incipient fault amplitudes under SNR = 30 dB. In Figures 17 and 18, the estimated amplitude (marked blue line) is close to the actual fault value (red line). Even for very tiny fault amplitudes, the accuracy of the results is still acceptable.

5.3. Discussions

Based on (11), if there exists an incipient fault in current sensors, the derived forms in rotary space can be described as where is the incipient fault in -th sensor, , and is its equivalent form in coordinate, . Therefore, both bias and drifting fault will be transformed to periodic signals which are described in Figures 7 and 14.
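The claim that a constant sensor bias becomes a periodic signal in the rotary space can be checked numerically. The sketch below assumes the amplitude-invariant Clarke matrix and a Park rotation by the synchronous angle, injects a bias on the first phase purely for illustration, and uses hypothetical values for frequency, amplitude, and bias; it is not the paper's experimental configuration.

```python
import numpy as np

def abc_to_dq(i_abc, theta):
    """Combined Clarke (2/3-scaled, assumed convention) and Park transformation."""
    clarke = (2.0 / 3.0) * np.array([[1.0, -0.5, -0.5],
                                     [0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2]])
    c, s = np.cos(theta), np.sin(theta)
    park = np.array([[c, s], [-s, c]])
    return park @ clarke @ i_abc

def phase_currents(tk, omega, amp, bias=0.0):
    """Balanced three-phase currents with an optional constant bias on the first phase."""
    i = amp * np.array([np.cos(omega * tk),
                        np.cos(omega * tk - 2 * np.pi / 3),
                        np.cos(omega * tk + 2 * np.pi / 3)])
    i[0] += bias
    return i

omega, amp, bias = 2 * np.pi * 50.0, 100.0, 1.0   # hypothetical values
t = np.linspace(0.0, 0.04, 401)

healthy = np.array([abc_to_dq(phase_currents(tk, omega, amp), omega * tk) for tk in t])
faulty = np.array([abc_to_dq(phase_currents(tk, omega, amp, bias), omega * tk) for tk in t])
dev = faulty - healthy   # equivalent fault seen in the d-q frame
```

Because the projection is linear, the deviation does not depend on the healthy current amplitude: under these assumptions it traces a sinusoid at the synchronous frequency with amplitude equal to two thirds of the injected bias.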

In this paper, the number of scores within the moving window is chosen as 20. Following the results of the two examples, a window size of is sufficient for satisfactory FE performance. However, it has one small flaw, namely a short delay. In the online stage, the estimated mean and variance of the score vector in (34) are calculated from multiple current measurements. This has two consequences.
(i) Improving the robustness: by using the moving window approach in the online computation, the effect of sensor noise can be notably reduced. Theoretically, the effect of noise is completely filtered out if the window size tends to infinity.
(ii) Producing a time or step delay: because the fault value is periodic after the nonlinear projection, its amplitude fluctuates between its bottom and peak in the rotary space. Since multiple score values are combined, the evaluation function at step is affected by , which are in the window. In fact, the length of the delay depends on the moving window size. If is chosen as 1, the SFI is determined only by the current score value. In this case, the delay is eliminated.

Therefore, the tradeoff between robustness and estimation delay should be considered when choosing the window size . From the asymptotic behavior of the SFI, increasing the number of score values does not affect the robustness when . In fact, this effect can be approximately achieved if . This characteristic was illustrated for incipient fault detection in [27]. Moreover, the waveforms of the SFI and of the estimated values are similar to the actual fault value. As a longer window size is chosen, the similarity among them is reduced and the fault estimation delay becomes larger. In addition, a large window size is problematic when sensor precision degrades.

The sampling time of the experimental platform in Table 1 gives the step time. Then, the constant delay time is 0.4 ms, which follows from the size of the moving window. From the enlarged views in the four fault estimation figures, this short delay is acceptable for industrial application. Among them, Figures 11 and 12 show better amplitude estimation because of the larger fault amplitude.

Beyond the moving window size, the Fault-to-Noise-Ratio (FNR) is introduced to explain the effectiveness of the proposed method using the results in Figures 17 and 18. It is well known that FNR is defined as , where and are fault power and noise level, respectively. Although the results in Figures 17 and 18 are weaker than those in Figures 11 and 12, they show more useful information because the FNR varies. For the drifting fault in Figure 13, the peak FNR level in every period ranges from minus infinity to 10 dB after 0.5 s. From the FE results, it can be seen that the proposed SFI is sensitive enough to emphasize such tiny abnormalities under high noise levels.

6. Conclusion

In this paper, real-time incipient sensor FE in CRH2 is investigated. An effective SFI based on KLD and PCA is developed and analyzed. To meet the latent requirement of PCA, a rotary space is introduced for the first time into the data-driven FDD and FE domain. The proposed FE methodology not only emphasizes tiny abnormalities but is also insensitive to sampling noise. By testing incipient faults on the experimental setup of CRH2, the feasibility and efficiency of the developed method are validated. The effects of several factors on the FE results have also been explored. Moreover, the proposed method can be extended to other electrical systems on the basis of the nonlinear projections and the SFI, from both theoretical and practical points of view.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (Grants nos. 61490703 and 61573180) and Funding of Jiangsu Innovation Program for Graduate Education (Grant no. KYLX16_0378).