Aero Engine Fault Diagnosis Using an Optimized Extreme Learning Machine
A new extreme learning machine optimized by quantum-behaved particle swarm optimization (QPSO) is developed in this paper. It uses QPSO to select optimal network parameters including the number of hidden layer neurons according to both the root mean square error on validation data set and the norm of output weights. The proposed Q-ELM was applied to real-world classification applications and a gas turbine fan engine diagnostic problem and was compared with two other optimized ELM methods and original ELM, SVM, and BP method. Results show that the proposed Q-ELM is a more reliable and suitable method than conventional neural network and other ELM methods for the defect diagnosis of the gas turbine engine.
1. Introduction
The gas turbine engine is a complex system and has been used in many fields. One of its most important applications is the propulsion system of aircraft. During its operating life, gas turbine engine performance is affected by many physical problems, including corrosion, erosion, fouling, and foreign object damage. These may cause engine performance deterioration and engine faults. Therefore, it is very important to develop an engine diagnostics system that detects and isolates engine faults, both for the safe operation of an aircraft and for reduced engine maintenance cost.
Engine fault diagnosis methods are mainly divided into two categories: model-based and data-driven techniques. Model-based techniques have advantages in terms of on-board implementation considerations, but they need a mathematical model of the engine, and their reliability often decreases as the system nonlinear complexities and modeling uncertainties increase. Data-driven approaches, on the other hand, do not need any system model and rely primarily on historical data collected from the engine sensors. They show great advantage over model-based techniques in many engine diagnostics applications. Among these data-driven approaches, the artificial neural network (ANN) [3, 4] and the support vector machine (SVM) [5, 6] are two of the most commonly used techniques.
Applications of neural networks and SVM in engine fault diagnosis have been widely studied in the literature. Zedda and Singh proposed a modular diagnostic system for a dual spool turbofan gas turbine using neural networks. Romessis et al. applied a probabilistic neural network (PNN) to diagnose faults on turbofan engines. Volponi et al. applied Kalman filter and neural network methodologies to gas turbine performance diagnostics. Vanini et al. developed a fault detection and isolation (FDI) scheme for an aircraft jet engine. The proposed FDI system utilizes dynamic neural networks (DNNs) to simulate different operating modes of the healthy engine or the faulty condition of the jet engine. Lee et al. proposed a hybrid method of an artificial neural network combined with a support vector machine and applied the method to the defect diagnostics of a SUAV gas turbine engine.
However, conventional ANN has some weak points: it needs a large amount of training data, the traditional learning algorithms are usually far slower than required, and training may become trapped in a local minimum instead of reaching the global minimum. In gas turbine engine diagnostics, moreover, the operating range is very wide. If the conventional ANN is applied to this case, the classification performance may decrease because of the increasing nonlinearity of engine behavior over a wide operating range.
In recent years, a novel learning algorithm for single hidden layer neural networks called extreme learning machine (ELM) has been proposed and shows better performance on classification problem than many conventional ANN learning algorithms and SVM [12–14]. In ELM, the input weights and hidden biases are randomly generated, and the output weights are calculated by Moore-Penrose (MP) generalized inverse. ELM learns much faster with higher generalization performance than the traditional gradient-based learning algorithms such as back-propagation and Levenberg-Marquardt method. Also, ELM avoids many problems faced by traditional gradient-based learning algorithms such as stopping criteria, learning rate, and local minima problem.
Therefore ELM should be a promising method for gas turbine engine diagnostics. However, ELM may require more hidden neurons than traditional gradient-based learning algorithms and may lead to an ill-conditioned problem because of the random selection of the input weights and hidden biases. To address these problems, in this paper we propose an optimized ELM using quantum-behaved particle swarm optimization (Q-ELM) and apply it to the fault diagnostics of a gas turbine fan engine.
The rest of the paper is organized as follows. Section 2 gives a brief review of ELM. QPSO algorithm is overviewed in Section 3. Section 4 presents the proposed Q-ELM. Section 5 compares the Q-ELM with other methods on three real-world classification applications. In Section 6, Q-ELM is applied to gas turbine fan engine component fault diagnostics applications followed by the conclusions in Section 7.
2. Brief Review of Extreme Learning Machine
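Extreme learning machine was proposed by Huang et al. For $N$ arbitrary distinct samples $(x_i, t_i)$, where $x_i = [x_{i1}, \ldots, x_{in}]^T \in \mathbb{R}^n$ and $t_i = [t_{i1}, \ldots, t_{im}]^T \in \mathbb{R}^m$, a standard SLFN with $K$ hidden neurons and activation function $g(x)$ can approximate these samples with zero error, which means that
$$H\beta = T, \tag{1}$$
where $H = \{h_{ij}\}$ ($i = 1, \ldots, N$ and $j = 1, \ldots, K$) is the hidden layer output matrix, $h_{ij} = g(w_j \cdot x_i + b_j)$ denotes the output of the $j$th hidden neuron with respect to $x_i$, and $w_j = [w_{j1}, \ldots, w_{jn}]^T$ is the weight vector connecting the $j$th hidden neuron and the input neurons. $b_j$ denotes the bias of the $j$th hidden neuron, and $w_j \cdot x_i$ is the inner product of $w_j$ and $x_i$. $\beta = [\beta_1, \ldots, \beta_K]^T$ is the matrix of output weights, and $\beta_j = [\beta_{j1}, \ldots, \beta_{jm}]^T$ ($j = 1, \ldots, K$) is the weight vector connecting the $j$th hidden neuron and the output neurons. $T = [t_1, \ldots, t_N]^T$ is the matrix of desired outputs.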
Therefore, determining the output weights amounts to finding the least square (LS) solution of the given linear system. The minimum norm LS solution to linear system (1) is
$$\hat{\beta} = H^{\dagger} T, \tag{2}$$
where $H^{\dagger}$ is the MP generalized inverse of matrix $H$. The minimum norm LS solution is unique and has the smallest norm among all the LS solutions. ELM uses the MP inverse method to obtain good generalization performance with dramatically increased learning speed.
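The training procedure reviewed above can be sketched in a few lines. This is a minimal illustration assuming NumPy, a sigmoid activation, and random weights drawn uniformly from [-1, 1]; the function names `elm_train` and `elm_predict`, the weight range, and the XOR toy data are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden, rng):
    """Train a single-hidden-layer network the ELM way.

    X: (N, n) inputs, T: (N, m) desired outputs.
    """
    n_features = X.shape[1]
    # Input weights and hidden biases are drawn at random and never tuned.
    W = rng.uniform(-1.0, 1.0, size=(n_hidden, n_features))  # assumed range
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # assumed range
    H = sigmoid(X @ W.T + b)        # hidden layer output matrix, shape (N, K)
    beta = np.linalg.pinv(H) @ T    # minimum norm least squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return sigmoid(X @ W.T + b) @ beta

# Toy data (XOR), used only to exercise the functions.
rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([[0.0], [1.0], [1.0], [0.0]])
W, b, beta = elm_train(X, T, n_hidden=10, rng=rng)
max_abs_error = float(np.max(np.abs(elm_predict(X, W, b, beta) - T)))
```

With more hidden neurons than samples, as here, the hidden layer output matrix generally has full row rank and the training samples are fitted essentially exactly, which is the zero-error statement made for (1).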
3. Brief Review of Quantum-Behaved Particle Swarm Optimization
Recently, some population-based optimization algorithms have been applied to real-world optimization problems and have shown better performance than traditional optimization methods. Among them, the genetic algorithm (GA) and particle swarm optimization (PSO) are the two most widely used. GA was originally motivated by Darwin’s theory of natural evolution. It repeatedly modifies a population of individual solutions with three genetic operators: selection, crossover, and mutation. PSO, on the other hand, was inspired by the social behavior of bird flocking. Unlike GA, PSO does not need any genetic operators and is simpler to use. The dynamics of the population in PSO resemble the collective behavior of socially intelligent organisms. However, PSO suffers from problems such as premature or local convergence and is not a global optimization algorithm.
QPSO is a novel optimization algorithm inspired by the fundamental theory of particle swarm optimization and by features of quantum mechanics. The introduction of quantum mechanics helps to diversify the population and to improve convergence by maintaining more attractors. It thus addresses the premature or local convergence problem of PSO, and QPSO shows better performance than PSO in many applications. Therefore it is more suitable for ELM parameter optimization than GA and PSO.
In QPSO, the state of a particle is depicted by the Schrödinger wave function $\psi(x, t)$ instead of position and velocity. The dynamic behavior of the particle differs widely from that in classical PSO in that the exact values of position and velocity cannot be determined simultaneously. The probability of the particle appearing at a position can be calculated from the probability density function $|\psi(x, t)|^2$, the form of which depends on the potential field the particle lies in. Employing the Monte Carlo method, the $i$th particle of the population moves according to the following iterative equation:
$$x_{ij}(t+1) = p_{ij}(t) \pm \alpha \, \bigl| mbest_j(t) - x_{ij}(t) \bigr| \, \ln\bigl(1/u\bigr), \tag{3}$$
where $x_{ij}(t)$ is the position of the $i$th particle with respect to the $j$th dimension in iteration $t$. $p_{ij}(t)$ is the local attractor of the $i$th particle in the $j$th dimension and is defined as
$$p_{ij}(t) = \theta \, P_{ij}(t) + (1 - \theta) \, G_j(t), \qquad mbest_j(t) = \frac{1}{M} \sum_{i=1}^{M} P_{ij}(t),$$
where $M$ is the number of particles and $P_i$ represents the best previous position of the $i$th particle. $G$ is the global best position of the particle swarm. $mbest$ is the mean best position, defined as the mean of all the best positions of the population; $\theta$, $u$, and $k$ are random numbers distributed uniformly in $(0, 1)$, with $k$ deciding the sign in (3) so that each sign is taken with probability 0.5. $\alpha$ is called the contraction-expansion coefficient and is used to control the convergence speed of the algorithm.
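The position update described above can be sketched as follows. This is an illustrative implementation of the standard QPSO update using only the Python standard library; the function name `qpso_step`, the list-of-lists data layout, and the convention that the plus sign is taken when the third random number is at least 0.5 are assumptions made for the example.

```python
import math
import random

def qpso_step(positions, pbest, gbest, alpha, rng):
    """One QPSO position update for the whole swarm.

    positions, pbest: lists of M particles, each a list of D floats.
    gbest: list of D floats. alpha: contraction-expansion coefficient.
    """
    n_particles = len(positions)
    n_dims = len(gbest)
    # Mean best position: per-dimension mean of all personal best positions.
    mbest = [sum(p[j] for p in pbest) / n_particles for j in range(n_dims)]
    new_positions = []
    for i in range(n_particles):
        new_x = []
        for j in range(n_dims):
            theta = rng.random()
            u = 1.0 - rng.random()   # lies in (0, 1], so log(1/u) is finite
            k = rng.random()
            # Local attractor between the personal best and the global best.
            attractor = theta * pbest[i][j] + (1.0 - theta) * gbest[j]
            step = alpha * abs(mbest[j] - positions[i][j]) * math.log(1.0 / u)
            # Assumed sign convention: plus when k >= 0.5, minus otherwise.
            new_x.append(attractor + step if k >= 0.5 else attractor - step)
        new_positions.append(new_x)
    return new_positions
```

Note that no velocity is stored: a particle's next position is sampled around its local attractor, with a spread that shrinks as the particle approaches the mean best position.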
4. Extreme Learning Machine Optimized by QPSO
Because the output weights in ELM are calculated using random input weights and hidden biases, there may exist a set of nonoptimal or even unnecessary input weights and hidden neurons. As a result, ELM may need more hidden neurons than conventional gradient-based learning algorithms and lead to an ill-conditioned hidden output matrix, which would cause worse generalization performance.
In this section, we propose a new algorithm named Q-ELM to solve these problems. Unlike some other optimized ELM algorithms, the proposed algorithm uses QPSO to optimize not only the input weights and hidden biases but also the structure of the neural network (the hidden layer neurons). The detailed steps of the proposed algorithm are as follows.
Step 1 (initializing). Firstly, we generate the population randomly. Each particle in the population consists of a set of input weights, hidden biases, and binary structure variables, one per hidden neuron, which define the structure of the network. As illustrated in Figure 1, if the structure variable of a hidden neuron is 0, that neuron is not considered. Otherwise, if it is 1, the hidden neuron is retained and the sigmoid function is used as its activation function.
All components constituting a particle are randomly initialized within the range [0, 1].
Step 2 (fitness evaluation). The corresponding output weights of each particle are computed according to (5). Then the fitness of each particle is evaluated as the root mean square error (RMSE) between the desired output and the estimated output. To avoid the problem of overfitting, the fitness evaluation is performed on the validation dataset instead of the whole training dataset, so the error is averaged over the number of samples in the validation dataset.
Step 3 (updating the best previous positions and the global best position). With the fitness values of all particles in the population, the best previous position of each particle and the global best position of the swarm are updated. As has been suggested in the literature, a neural network tends to have better generalization performance when its weights have a smaller norm. Therefore, in this paper, the fitness value and the norm of the output weights are considered together in this update. The updating strategy compares the fitness values of each particle's current position, of its best previous position, and of the global best position of the swarm, together with the norms of the corresponding output weights obtained by the MP inverse. Under this updating criterion, particles with smaller fitness values or smaller norms are more likely to be selected as the best previous position or the global best position.
Step 5. Update particle’s new position according to (3).
Finally, we repeat Step 2 to Step 5 until the maximum number of iterations is reached. Thus the network trained by ELM with the optimized input weights and hidden biases is obtained, and then the optimized network is applied to the benchmark problems.
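The fitness evaluation and the update of the best positions can be sketched as below. This is a minimal sketch assuming NumPy; the RMSE normalization, the relative tolerance `eta`, and the exact form of the comparison in `is_better` (a clearly lower fitness wins, near-ties go to the smaller output-weight norm) are assumptions in the spirit of the criterion described in Step 3 and are not the paper's exact rule.

```python
import numpy as np

def evaluate_particle(W, b, mask, X_tr, T_tr, X_val, T_val):
    """Return (validation RMSE, norm of output weights) for one decoded particle.

    W: (K, n) input weights, b: (K,) hidden biases,
    mask: (K,) structure variables, 1 keeps a hidden neuron and 0 drops it.
    """
    keep = np.asarray(mask).astype(bool)
    if not keep.any():
        return float("inf"), float("inf")   # no hidden neuron retained

    def hidden(X):
        # Sigmoid outputs of the retained hidden neurons only.
        return 1.0 / (1.0 + np.exp(-(X @ W[keep].T + b[keep])))

    beta = np.linalg.pinv(hidden(X_tr)) @ T_tr     # output weights, MP inverse
    err = hidden(X_val) @ beta - T_val
    rmse = float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))  # assumed RMSE form
    return rmse, float(np.linalg.norm(beta))

def is_better(f_new, norm_new, f_old, norm_old, eta=0.02):
    """Assumed selection rule combining fitness and output-weight norm."""
    if f_old == float("inf"):
        return f_new < f_old
    if f_old - f_new > eta * f_old:        # clearly lower validation error
        return True
    # Fitness values are close: prefer the smaller output-weight norm.
    return abs(f_old - f_new) < eta * f_old and norm_new < norm_old
```

In a full loop, `is_better` would be called once to decide whether a particle's new position replaces its best previous position and once more to decide whether it replaces the global best position.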
In the proposed algorithm, each particle represents one possible solution to the optimization problem and is a combination of components with different meaning and different range.
All components of a particle are firstly initialized into continuous values between 0 and 1. Therefore, before calculating corresponding output weights and fitness evaluation in Step 2, they need to be converted to their real value.
The input weights and hidden biases are obtained by linearly rescaling the corresponding components from the range [0, 1] to the interval between the lower bound and the upper bound set for the input weights and hidden biases.
The structure parameters are obtained by rounding the corresponding components to the nearest integer (0 or 1, in this case). After the conversion of all components of a particle, the fitness of each particle can then be evaluated.
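The conversion of a particle into network parameters might look like the sketch below. The flat layout of the particle (all input weights, then the biases, then the structure variables), the default bounds of -1 and 1, and the function name `decode_particle` are assumptions made for illustration; the text does not specify them.

```python
def decode_particle(particle, n_hidden, n_inputs, lower=-1.0, upper=1.0):
    """Convert a flat particle with entries in [0, 1] to real-valued parameters.

    Assumed layout: n_hidden * n_inputs input weights, then n_hidden biases,
    then n_hidden structure variables.
    """
    n_w = n_hidden * n_inputs
    expected = n_w + 2 * n_hidden
    if len(particle) != expected:
        raise ValueError(f"expected {expected} components, got {len(particle)}")

    def rescale(v):
        # Linear map from [0, 1] to [lower, upper].
        return lower + (upper - lower) * v

    flat_w = [rescale(v) for v in particle[:n_w]]
    weights = [flat_w[k * n_inputs:(k + 1) * n_inputs] for k in range(n_hidden)]
    biases = [rescale(v) for v in particle[n_w:n_w + n_hidden]]
    # Round to 0 or 1; an exact 0.5 is sent to 1 here (assumed tie convention).
    flags = [1 if v >= 0.5 else 0 for v in particle[n_w + n_hidden:]]
    return weights, biases, flags

weights, biases, flags = decode_particle(
    [0.0, 1.0, 0.5, 0.25, 0.5, 1.0, 0.9, 0.2], n_hidden=2, n_inputs=2
)
```

Keeping every component in [0, 1] during the search lets the optimizer treat weights, biases, and structure variables uniformly, with the conversion applied only when a particle is evaluated.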
5. Evaluation on Some Classification Applications
In essence, engine diagnosis is a pattern classification problem. Therefore, in this section, we first apply the developed Q-ELM to some real-world classification applications and compare it with five existing algorithms: PSO optimized ELM (P-ELM), genetic algorithm optimized ELM (G-ELM), standard ELM, BP, and SVM.
The performances of all algorithms are tested on three benchmark classification datasets which are listed in Table 1. The training dataset, validation dataset, and testing dataset are randomly generated at each trial of simulations according to the corresponding numbers in Table 1. The performances of these algorithms are listed in Tables 2 and 3.
For the three optimized ELMs, the population size is 100 and the maximum number of iterations is 50. The selection criteria for P-ELM and Q-ELM include the norm of output weights as in (8), while the selection criterion for G-ELM considers only the accuracy on the validation dataset and does not include the norm of output weights, as suggested in the original G-ELM work. Instead, G-ELM incorporates Tikhonov’s regularization in the least squares algorithm to improve the generalization capability of the network.
In G-ELM, the probability of crossover is 0.5 and the mutation probability is 10%. In Q-ELM and P-ELM, the inertial weight is set to decrease from 1.2 to 0.4 linearly with the iterations. In Q-ELM, the contraction-expansion coefficient is set to decrease from 1.0 to 0.5 linearly with the iterations. ELM methods are set with different initial hidden neurons according to different applications.
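The linearly decreasing contraction-expansion coefficient can be generated as in the sketch below. The helper name `linear_schedule` is hypothetical, and the choice to reach the end value exactly at the last iteration is an assumption; the start value 1.0, end value 0.5, and 50 iterations are the settings stated above.

```python
def linear_schedule(start, end, n_iterations):
    """Values going linearly from start (first iteration) to end (last iteration)."""
    if n_iterations < 2:
        return [start]
    step = (end - start) / (n_iterations - 1)
    return [start + step * t for t in range(n_iterations)]

# Contraction-expansion coefficient for Q-ELM over 50 iterations.
alphas = linear_schedule(1.0, 0.5, 50)
```

A larger coefficient early on favors exploration of the search space, while the smaller values toward the end tighten the swarm around the attractors.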
There are many variants of the BP algorithm; a faster variant, the Levenberg-Marquardt algorithm, is used in our simulations, and the MATLAB package provides a very efficient implementation of it. As the SVM is a binary classifier, the SVM algorithm has been extended here to a “One versus One” multiclass SVM to classify the multiple fault classes. The parameters for the SVM are and . The imposed noise level .
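The "One versus One" extension can be illustrated with a small voting sketch. The pairwise classifier below is a toy stand-in (nearest of two class centres on a one-dimensional input) rather than a trained SVM, and the names `one_vs_one_predict`, `nearest_of_pair`, and `centres`, as well as the tie-breaking rule, are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(x, classes, pairwise_decide):
    """Classify x by majority vote over all pairwise binary classifiers."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    # Ties go to the class listed first (assumed convention).
    return max(classes, key=lambda c: votes[c])

# Toy stand-in for trained binary SVMs: pick the nearer of two class centres.
centres = {"C1": 0.0, "C2": 1.0, "C3": 2.0}

def nearest_of_pair(a, b, x):
    return a if abs(x - centres[a]) <= abs(x - centres[b]) else b

n_pairwise = len(list(combinations(centres, 2)))
predicted = one_vs_one_predict(1.9, list(centres), nearest_of_pair)
```

For k classes this scheme trains k(k - 1)/2 binary classifiers, so a problem with eight classes would need 28 of them.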
In order to account for the stochastic nature of these algorithms, all of the six methods are run 10 times separately for each classification problem and the results shown in Tables 2 and 3 are the mean performance values in 10 trials. All simulations have been made in MATLAB R2008a environment running on a PC with 2.5 GHz CPU with 2 cores and 2 GB RAM.
It can be concluded from Table 2 that, in general, the optimized ELM methods obtained better classification results than ELM, SVM, and BP. Q-ELM outperforms all the other methods: it obtains the best mean testing and training accuracy on all three classification problems. This suggests that Q-ELM is a good choice for engine fault diagnosis applications.
It can also be observed clearly that the training times of the three optimized ELM methods are much longer than those of the others. This is mainly because the optimized ELM methods need to execute the parameter optimization steps repeatedly. ELM requires the least training time among these methods.
6. Engine Diagnosis Applications
6.1. Engine Selection and Modeling
The developed Q-ELM was also applied to fault diagnostics of a gas turbine engine and was compared with other methods. In this study, we focus on a two-shaft turbine fan engine with a mixer and an afterburner (for confidentiality reasons the engine type is omitted). This engine is composed of several components such as low pressure compressor (LPC) or fan, high pressure compressor (HPC), low pressure turbine (LPT), and high pressure turbine (HPT) and can be illustrated as shown in Figure 2.
The gas turbine engine is susceptible to many physical problems, which may result in component faults and reduce the component flow capacity and isentropic efficiency. These component faults can cause deviations in engine performance parameters such as the pressures and temperatures across different engine components. Using engine performance data is therefore a practical way to detect and isolate the faulty component.
We have already developed a performance model for this two-shaft turbine fan engine in MATLAB environment. In this study, we use the performance model to simulate the behavior of the engine with or without component faults.
The engine component faults can be simulated by isentropic efficiency deterioration of different engine components. By implanting corresponding component defects with certain magnitude of isentropic efficiency deterioration to the engine performance model, we can obtain simulated engine performance parameter data with component fault.
The engine operating point, which is primarily defined by the fuel flow rate, has a significant effect on engine performance. Therefore, engine fault diagnostics should be conducted at a specified operating point. In this study, we consider two different engine operating points. The fuel flow and environment setting parameters are listed in Table 4. The engine defect diagnostics was conducted on these operating points separately.
6.2. Generating Component Fault Dataset
In this study, different engine component fault cases were considered as the eight classes shown in Table 5. The first four classes represent four single fault cases: the low pressure compressor (LPC) fault case, the high pressure compressor (HPC) fault case, the low pressure turbine (LPT) fault case, and the high pressure turbine (HPT) fault case. Each of these classes has only one component fault, which is represented with an “F.” Class 5 and class 6 are dual fault cases: the LPC + HPC fault case and the LPC + LPT fault case. The last two classes are triple fault cases: the LPC + HPC + LPT fault case and the LPC + LPT + HPT fault case.
For each of the single fault cases listed in Table 5, 50 instances were generated by randomly selecting the isentropic efficiency deterioration magnitude of the corresponding component within the range 1%–5%. For the dual fault cases, 100 instances were generated for each class by randomly setting the isentropic efficiency deterioration of the two faulty components within the range 1%–5% simultaneously. For the triple fault cases, 300 instances were generated for each class using the same method.
Thus we have 200 single fault data instances, 800 multiple fault instances, and one healthy state instance on each operating point condition. These instances are then divided into training dataset, validation dataset, and testing dataset.
In this study, for each operating point condition, 100 single fault instances (25 randomly selected instances for each single fault class), 400 multiple fault instances (50 randomly selected instances for each dual fault case and 150 for each triple fault case), and the one healthy state instance were used as the training dataset. 60 single fault instances (15 randomly selected instances for each single fault class) and 240 multiple fault instances (30 randomly selected instances for each dual fault case and 90 for each triple fault case) were used as the testing dataset. The remaining 200 instances were used as the validation dataset.
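The instance counts quoted above can be checked with a few lines of arithmetic. The dictionary layout and variable names below are illustrative only; the per-class numbers are the ones stated in the text.

```python
# Number of fault classes of each kind, and instances per class.
n_classes = {"single": 4, "dual": 2, "triple": 2}
generated = {"single": 50, "dual": 100, "triple": 300}
train = {"single": 25, "dual": 50, "triple": 150}
test = {"single": 15, "dual": 30, "triple": 90}

def total(per_class):
    """Total number of fault instances over all classes."""
    return sum(n_classes[kind] * per_class[kind] for kind in n_classes)

n_generated = total(generated) + 1   # plus the single healthy state instance
n_train = total(train) + 1           # the healthy instance goes to training
n_test = total(test)
n_validation = n_generated - n_train - n_test

# Validation instances left per class after the training and testing draws.
validation = {kind: generated[kind] - train[kind] - test[kind] for kind in n_classes}
```

Per operating point this gives 1001 generated instances, split into 501 for training, 300 for testing, and 200 for validation.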
The input parameters of the training, validation, and testing datasets are the relative deviations of the simulated engine performance parameters with component faults from the “healthy” engine parameters. These parameters are the low pressure rotor rotational speed, the high pressure rotor rotational speed, the total pressure and total temperature after the LPC, the total pressure and total temperature after the HPC, the total pressure and total temperature after the HPT, and the total pressure and total temperature after the LPT. In this study, all the input parameters have been normalized to a common range.
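Computing the relative deviations is a one-line operation per parameter, sketched below. The parameter names and numeric values are purely hypothetical and serve only to show the calculation; they are not engine data from the study.

```python
def relative_deviations(faulty, healthy):
    """Relative deviation of each faulty-engine parameter from its healthy value."""
    return {
        name: (faulty[name] - healthy[name]) / healthy[name]
        for name in healthy
    }

# Hypothetical values for two parameters (arbitrary units).
healthy = {"low_pressure_rotor_speed": 100.0, "total_temperature_after_hpc": 800.0}
faulty = {"low_pressure_rotor_speed": 99.0, "total_temperature_after_hpc": 808.0}
deviations = relative_deviations(faulty, healthy)
```

Using deviations rather than raw measurements puts parameters with very different magnitudes, such as speeds, pressures, and temperatures, on a comparable footing before normalization.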
In real engine applications, sensor noise inevitably exists. Therefore, all input data are contaminated with measurement noise to simulate real engine sensor signals: the noise added to each clean input parameter scales with the imposed noise level and with the standard deviation of the dataset.
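One common way to impose such noise is sketched below. The zero-mean Gaussian noise model, the use of the population standard deviation of a single parameter's values, and the function name `add_measurement_noise` are assumptions; the text only states that the noise depends on the noise level and on the standard deviation of the dataset.

```python
import random
import statistics

def add_measurement_noise(values, noise_level, rng):
    """Add zero-mean Gaussian noise scaled by noise_level * std of the values."""
    sigma = statistics.pstdev(values)   # standard deviation of the clean data
    return [v + noise_level * sigma * rng.gauss(0.0, 1.0) for v in values]

rng = random.Random(0)
clean = [0.10, 0.25, 0.40, 0.55]        # hypothetical clean input parameter
noisy = add_measurement_noise(clean, noise_level=0.1, rng=rng)
```

Scaling by the standard deviation keeps the noise proportional to the natural spread of each parameter, so that a single noise level is meaningful across all inputs.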
6.3. Engine Component Fault Diagnostics by 6 Methods
The proposed engine diagnostic method using Q-ELM is demonstrated with single and multiple fault cases and compared with P-ELM, G-ELM, ELM, BP, and SVM.
6.3.1. Parameter Settings
The parameter settings are the same as in Section 5 except that all ELM methods are set with 100 initial hidden neurons. All the six methods are run 10 times separately for each condition and the results shown in Tables 6, 7, and 8 are the mean performance values.
6.3.2. Comparisons of the Six Methods
The performances of the 6 methods were compared on both condition A and condition B. Tables 6 and 7 list the mean classification accuracies of the 6 methods on each component fault class. Table 8 lists the mean training time of each method.
It can be seen from Table 6 (condition A) that Q-ELM obtained the best results on C1, C4, C6, C7, and C8, while P-ELM performed the best on C2, C3, and C5. From Table 7, we can see that P-ELM performs the best on one single fault case (C4), and Q-ELM obtains the highest classification accuracy on all the remaining test cases.
In general, the optimized ELMs obtained better classification results than ELM, SVM, and BP on both single fault and multiple fault cases. This conclusion is also demonstrated in Figures 3(a) and 3(b), where the mean classification accuracies obtained by any of the optimized ELM methods are higher than those obtained by the other three methods. It can also be observed that the classification performance of ELM is on par with that of SVM and better than that of BP. Due to the nonlinear nature of the gas turbine engine, the multiple fault cases are more difficult to diagnose than the single fault cases. It can be seen from Tables 6 and 7 that the mean classification accuracy of multiple fault diagnostics is lower than that of single fault diagnostics.
Figure 3: Mean classification accuracies of the 6 methods on (a) single fault cases and (b) multiple fault cases.
Since the two mean accuracy curves in both Figures 3(a) and 3(b) are very close to each other, we can conclude that the engine operating point has no obvious effect on the classification accuracies of any of the methods.
The training times of the three optimized ELM methods are much longer than those of the others. Much of the training time of the optimized ELMs is spent on evaluating all the individuals iteratively.
6.3.3. Comparisons with Fewer Input Parameters
In Section 6.3.2 we trained ELM and the other methods using a dataset with 10 input parameters. In real applications, however, the gas turbine engine may be equipped with only a few sensors, so fewer input parameters are available. In this section we reduce the number of input parameters from 10 to 6. We trained all the methods on the same training dataset restricted to these 6 input parameters, and the results are listed in Tables 9–11.
Compared with Tables 6 and 7, we can see that the number of input parameters has great impact on diagnostics accuracy. The results in Tables 9 and 10 are generally lower than results in Tables 6 and 7.
The optimized ELM methods still show better performance than the others, as demonstrated in Figures 4(a) and 4(b). Q-ELM obtained the highest accuracies in all cases except C3 in Table 9, and it attained the best results in all cases in Table 10.
Figure 4: Mean classification accuracies of the 6 methods on (a) single fault cases and (b) multiple fault cases.
The good results obtained by our method indicate that the selection criteria which include both the fitness value in validation dataset and the norm of output weights help the algorithms to obtain better generalization performance.
In order to evaluate the proposed method in depth, the mean evolution of the accuracy on validation dataset of 10 trials by three optimized ELM methods on fault class C5 and C7 cases in condition B is plotted in Figure 5.
It can be observed from Figure 5 that Q-ELM has much better convergence performance than the other two methods and obtains the best mean accuracy after 50 iterations; P-ELM is better than G-ELM. In fact, Q-ELM can achieve the same accuracy level as G-ELM within only half of the total iterations for these two cases.
The main reason for the high classification rate of our method is that quantum mechanics helps QPSO to explore the search space more effectively, so that Q-ELM outperforms P-ELM and G-ELM in converging to a better result.
7. Conclusions
In this paper, a new hybrid learning approach for SLFN named Q-ELM was proposed. The proposed algorithm optimizes both the neural network parameters (input weights and hidden biases) and the hidden layer structure using QPSO. The output weights are calculated by the Moore-Penrose generalized inverse, as in the original ELM. In optimizing the network parameters, the selection criteria include not only the RMSE on the validation dataset but also the norm of the output weights.
To validate the performance of the proposed Q-ELM, we applied it to some real-world classification applications and to gas turbine fan engine fault diagnostics and compared it with some state-of-the-art methods. Results show that our method obtains the highest classification accuracy in most test cases and shows a clear advantage over the other optimized ELM methods, SVM, and BP. This advantage becomes more prominent when the number of input parameters in the training dataset is reduced, which suggests that our method is a more suitable tool for real engine fault diagnostics applications.
Conflict of Interests
The authors declare no conflict of interests regarding the publication of this paper.
References
S. Ogaji, Y. G. Li, S. Sampath, and R. Singh, “Gas path fault diagnosis of a turbofan engine from transient data using artificial neural networks,” in Proceedings of the 2003 ASME Turbine and Aeroengine Congress, ASME Paper No. GT2003-38423, Atlanta, Ga, USA, June 2003.
L. C. Jaw, “Recent advancements in aircraft engine health management (EHM) technologies and recommendations for the next step,” in Proceedings of the 50th ASME International Gas Turbine & Aeroengine Technical Congress, Reno, Nev, USA, June 2005.
S. Osowski, K. Siwek, and T. Markiewicz, “MLP and SVM networks—a comparative study,” in Proceedings of the 6th Nordic Signal Processing Symposium (NORSIG '04), pp. 37–40, Espoo, Finland, June 2004.
M. Zedda and R. Singh, “Fault diagnosis of a turbofan engine using neural-networks: a quantitative approach,” in Proceedings of the 34th AIAA, ASME, SAE, ASEE Joint Propulsion Conference and Exhibit, AIAA 98-3602, Cleveland, Ohio, USA, July 1998.
C. Romessis, A. Stamatis, and K. Mathioudakis, “A parametric investigation of the diagnostic ability of probabilistic neural networks on turbofan engines,” in Proceedings of the ASME Turbo Expo 2001: Power for Land, Sea, and Air, Paper no. 2001-GT-0011, New Orleans, La, USA, June 2001.
A. J. Volponi, H. DePold, R. Ganguli, and C. Daguang, “The use of Kalman filter and neural network methodologies in gas turbine performance diagnostics: a comparative study,” Journal of Engineering for Gas Turbines and Power, vol. 125, no. 4, pp. 917–924, 2003.
J. Sun, C.-H. Lai, W.-B. Xu, Y. Ding, and Z. Chai, “A modified quantum-behaved particle swarm optimization,” in Proceedings of the 7th International Conference on Computational Science (ICCS '07), pp. 294–301, Beijing, China, May 2007.