Shock and Vibration

Volume 2016 (2016), Article ID 3838765, 15 pages

http://dx.doi.org/10.1155/2016/3838765

## A Hybrid Prognostic Approach for Remaining Useful Life Prediction of Lithium-Ion Batteries

^{1}College of Mechanical and Electrical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

^{2}College of Engineering, Nanjing Agricultural University, Nanjing 210031, China

^{3}Nanjing Surveying and Mapping Instrument Factory, Nanjing 210003, China

Received 3 July 2015; Revised 30 October 2015; Accepted 1 November 2015

Academic Editor: Chuan Li

Copyright © 2016 Wen-An Yang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The lithium-ion battery is a core component of many systems, such as satellites, spacecraft, and electric vehicles, and its failure can lead to reduced capability, downtime, and even catastrophic breakdowns. Remaining useful life (RUL) prediction of lithium-ion batteries before a failure event is crucial for proactive maintenance and safety actions. This study proposes a hybrid prognostic approach that predicts the RUL of degraded lithium-ion batteries using physical laws and data-driven modeling simultaneously. In this hybrid prognostic approach, the relevant vectors obtained with the selective kernel ensemble-based relevance vector machine (RVM) learning algorithm are fitted to the physical degradation model, which is then extrapolated to the failure threshold to estimate the RUL of the lithium-ion battery of interest. The experimental results indicate that the proposed hybrid prognostic approach can accurately predict the RUL of degraded lithium-ion batteries. Empirical comparisons show that the proposed hybrid prognostic approach using the selective kernel ensemble-based RVM learning algorithm performs better than hybrid prognostic approaches using popular learning algorithms such as feedforward artificial neural networks (ANNs) trained with the conventional backpropagation (BP) algorithm and support vector machines (SVMs). In addition, an investigation is conducted to identify the effects of the RVM learning algorithm on the proposed hybrid prognostic approach.

#### 1. Introduction

Lithium-ion batteries are a significant energy solution for many systems (e.g., satellites, spacecraft, and electric vehicles) due to their high energy density, high galvanic potential, light weight, and long lifetime compared to lead-acid, nickel-cadmium, and nickel-metal-hydride cells [1]. Their failure can lead to reduced capability, downtime, and even catastrophic breakdowns. For example, in November 2006, the National Aeronautics and Space Administration’s Mars Global Surveyor stopped working after the radiator for its batteries was positioned towards the sun, causing an increase in the temperature of the batteries, which resulted in lost charge capacity [2]. Battery health management would greatly enhance the reliability of such systems. This raises the challenging issue of remaining useful life (RUL) prediction for lithium-ion batteries.

In the past few years, much research effort has been devoted to developing approaches to lithium-ion battery degradation modeling and RUL prediction. In general, these approaches can be classified into model-based and data-driven methodologies. The model-based methodologies attempt to construct physical models of the lithium-ion battery for RUL prediction. Recently, various Bayesian filtering models such as the Kalman filter [3], extended Kalman filter [4–6], particle filter [7–9], and unscented particle filter [10] have been extensively used to construct exhaustive models of deteriorating lithium-ion batteries. However, uncertainty due to assumptions and simplifications in the models may impose severe limitations upon their applicability in practice. To overcome these problems, intensive research has been conducted into the utilization of various data-driven methodologies, for example, autoregressive moving average (ARMA) models [11], artificial neural networks (ANNs) [12], and support vector machines (SVMs) [13], to model lithium-ion battery degradation and to predict the RUL of lithium-ion batteries. Data-driven techniques utilize monitored operational data related to lithium-ion battery health. Compared with the model-based methodologies, the data-driven methodologies may be more appropriate when the understanding of first principles of system operation is not comprehensive or when the system is so complex that developing an accurate model is prohibitively expensive but sufficient data are available for constructing a map of the performance degradation space.
Furthermore, rapid development has recently been achieved in automatic data collection and processing of real-time field data, which hugely facilitates the continuous monitoring of the state of health of operating lithium-ion batteries and the lean management of the related large amount of reference data. The most natural data-driven methodology for RUL prediction is to fit a curve to the available data on the lithium-ion battery degradation evolution using regression models and then to extrapolate the curve to the criteria indicating failure. In practice, however, the available degradation history may be short and incomplete and may even differ significantly because of different operating conditions, so that a common extrapolation may lead to large errors and unreliable results. The same problem arises when employing ARMA models, although the method can handle the situation in which more run-to-failure data are unavailable or insufficient. With respect to ANNs, they have the advantages of superior learning, noise suppression, and parallel computation abilities. However, despite their advantages, ANNs also have some disadvantages: (1) design and training are often complex and time-consuming, since the architecture and many training parameters must be tuned; (2) minimization of the training errors can result in poor generalization performance; and (3) performance can degrade when working with small datasets. With respect to SVMs, they are powerful in solving problems with small samples, nonlinearities, and local minima.
However, despite their advantages, SVMs also have some disadvantages: (1) by assuming an explicit loss function (usually, the *ε*-insensitive loss function), one assumes a fixed distribution of the residuals; (2) the soft margin parameter must usually be tuned through cross-validation, which is time-consuming; (3) the kernel function used in SVM must satisfy Mercer’s theorem to be valid; and (4) sparsity is not always achieved, so a high number of support vectors may be obtained.

More recently, some researchers have attempted to combine model-based and data-driven methods for RUL prediction of lithium-ion batteries in order to leverage the strengths of both methodologies and have obtained promising results [14]. Most of the combinations of model-based and data-driven methods in the literature have focused on the utilization of relevance vector machines (RVMs) in place of ANNs or SVMs as the prognostic technique. RVM, a general Bayesian probabilistic framework of SVM, can efficiently alleviate some of the shortcomings of SVMs [15]. Saha et al. employed an RVM to find the most representative relevant vectors to fit the capacity degradation data of lithium-ion batteries [16]. Maio et al. combined an RVM and an exponential function to predict the RUL of bearings [17]. Zio and Maio employed an RVM to find the most representative relevant vectors to fit a crack growth model for predicting RUL [18]. Wang et al. employed an RVM to find the most representative relevant vectors to fit the three-parameter capacity degradation model to predict the RUL of lithium-ion batteries [19]. A review of the related literature also indicates that a similar idea has already been investigated in the area of applying SVM to RUL prediction. Benkedjouh et al. [20] employed an SVM to find the most representative support vectors to fit a power model for RUL prediction of the cutting tool. Based on a similar idea, Benkedjouh et al. employed an SVM to find the most representative support vectors to fit an exponential regression for bearing performance degradation assessment and RUL estimation [21]. The ability to extract the relevant vectors is very useful for making good predictions, as the relevant vectors can be used to find the representative training vectors containing the cycles of the relevant vectors and the predictive values at those cycles.
A review of the related literature [16–21] also indicates that, for the hybrid prognostic approaches that are based on the RVM learning algorithm, the RUL prediction performance is very sensitive to the kernel choice and the kernel parameter settings. A kernel (or kernel parameter setting) that works well for one situation might not be the appropriate choice for another. However, no systematic methodology has yet been established for determining the optimal kernel type and kernel parameters for the RVM learning algorithm. Most of the previous work in the area of applying RVM to RUL prediction determined a single kernel and its parameters by trial and error and did not deal with automatic kernel choice and kernel parameter optimization.

Based on the literature review given above, the aim of this study is to develop a hybrid prognostic approach of physical laws and data-driven modeling that integrates the selective kernel ensemble-based RVM (a data-driven methodology) and exponential regression (a model-based methodology) for online RUL prediction of lithium-ion batteries. The choice of kernel (and kernel parameters) of the RVM is determined via coevolutionary swarm intelligence, without the need for any human intervention. A model consisting of the sum of two exponential functions is fitted to the resulting relevant vectors to predict the RUL of degraded lithium-ion batteries. The experimental results indicate that the proposed hybrid prognostic approach can accurately predict the RUL of degraded lithium-ion batteries. Empirical comparisons show that the proposed hybrid prognostic approach using the selective kernel ensemble-based RVM learning algorithm performs better than hybrid prognostic approaches using popular learning algorithms such as feedforward artificial neural networks (ANNs) trained with the conventional backpropagation (BP) algorithm and support vector machines (SVMs). It also outperforms the hybrid prognostic approaches using the single kernel-based RVM learning algorithm and the Ensemble All-based RVM learning algorithm.

The rest of this study is organized as follows. Section 2 gives a review of the RVM basic framework. Section 3 presents a selective kernel ensemble-based RVM learning algorithm. Section 4 describes a hybrid prognostic approach for RUL prediction of lithium-ion batteries. Section 5 conducts an investigation to identify the effects of RVM learning on the hybrid prognostic approach. Section 6 provides an empirical comparison of the proposed hybrid prognostic approach with other existing approaches. Section 7 presents a concluding summary and suggests some directions for future research.

#### 2. Review of Relevance Vector Machine

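RVM is a Bayesian treatment of a generalized linear model with the same functional form as SVM. Unlike SVM, RVM provides a probabilistic interpretation of its outputs [15]. As a supervised learning method, RVM starts with a dataset of input-target pairs $\{\mathbf{x}_n, t_n\}_{n=1}^{N}$. The aim is to learn a model of the dependency of the targets on the inputs in order to make accurate predictions of $t$ for previously unseen values of $\mathbf{x}$. Typically, the predictions are based on a function $y(\mathbf{x})$ defined over the input space, and learning is the process of inferring (perhaps the parameters of) this function. In the context of SVM, this function takes the following form:

$$y(\mathbf{x}; \mathbf{w}) = \sum_{i=1}^{N} w_i K(\mathbf{x}, \mathbf{x}_i) + w_0, \tag{1}$$

where $w_i$ are the model “weights,” $w_0$ is the bias, and $K(\cdot, \cdot)$ is a kernel function.

Considering only scalar-valued outputs, we follow the standard probabilistic formulation and assume that the targets are samples of the model with additive noise:

$$t_n = y(\mathbf{x}_n; \mathbf{w}) + \varepsilon_n, \tag{2}$$

where $\varepsilon_n$ are independent samples from a noise process that is further assumed to be zero-mean Gaussian with variance $\sigma^2$.

The likelihood of the complete dataset can be written as

$$p(\mathbf{t} \mid \mathbf{w}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2}\,\|\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}\|^2\right\}, \tag{3}$$

where $\mathbf{t} = (t_1, \dots, t_N)^{T}$, $\mathbf{w} = (w_0, \dots, w_N)^{T}$, and $\boldsymbol{\Phi}$ is the $N \times (N+1)$ “design” matrix with $\boldsymbol{\Phi} = [\boldsymbol{\phi}(\mathbf{x}_1), \dots, \boldsymbol{\phi}(\mathbf{x}_N)]^{T}$, wherein $\boldsymbol{\phi}(\mathbf{x}_n) = [1, K(\mathbf{x}_n, \mathbf{x}_1), \dots, K(\mathbf{x}_n, \mathbf{x}_N)]^{T}$.

Maximum-likelihood estimation of $\mathbf{w}$ and $\sigma^2$ in (3) often leads to overfitting. Therefore, a preference for smoother functions is encoded by choosing a zero-mean Gaussian prior distribution over $\mathbf{w}$:

$$p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} \mathcal{N}(w_i \mid 0, \alpha_i^{-1}), \tag{4}$$

where $\boldsymbol{\alpha}$ is a vector of $N+1$ hyperparameters.

Using Bayes’ rule, the posterior over all unknowns can be computed; that is,

$$p(\mathbf{w}, \boldsymbol{\alpha}, \sigma^2 \mid \mathbf{t}) = \frac{p(\mathbf{t} \mid \mathbf{w}, \boldsymbol{\alpha}, \sigma^2)\, p(\mathbf{w}, \boldsymbol{\alpha}, \sigma^2)}{p(\mathbf{t})}. \tag{5}$$

However, the posterior in (5) cannot be computed directly. Instead, it is decomposed as $p(\mathbf{w}, \boldsymbol{\alpha}, \sigma^2 \mid \mathbf{t}) = p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2)\, p(\boldsymbol{\alpha}, \sigma^2 \mid \mathbf{t})$, where

$$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) = (2\pi)^{-(N+1)/2}\, |\boldsymbol{\Sigma}|^{-1/2} \exp\left\{-\frac{1}{2} (\mathbf{w} - \boldsymbol{\mu})^{T} \boldsymbol{\Sigma}^{-1} (\mathbf{w} - \boldsymbol{\mu})\right\}, \tag{6}$$

and the posterior covariance and mean are expressed as follows:

$$\boldsymbol{\Sigma} = (\boldsymbol{\Phi}^{T}\mathbf{B}\boldsymbol{\Phi} + \mathbf{A})^{-1}, \qquad \boldsymbol{\mu} = \boldsymbol{\Sigma}\boldsymbol{\Phi}^{T}\mathbf{B}\mathbf{t}, \tag{7}$$

with $\mathbf{A} = \operatorname{diag}(\alpha_0, \alpha_1, \dots, \alpha_N)$ and $\mathbf{B} = \sigma^{-2}\mathbf{I}_N$. Thus, RVM learning becomes a search for the hyperparameter posterior mode. The weights are integrated out to obtain the marginal likelihood for the hyperparameters:

$$p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2) = \int p(\mathbf{t} \mid \mathbf{w}, \sigma^2)\, p(\mathbf{w} \mid \boldsymbol{\alpha})\, d\mathbf{w} = (2\pi)^{-N/2}\, |\mathbf{B}^{-1} + \boldsymbol{\Phi}\mathbf{A}^{-1}\boldsymbol{\Phi}^{T}|^{-1/2} \exp\left\{-\frac{1}{2}\, \mathbf{t}^{T} (\mathbf{B}^{-1} + \boldsymbol{\Phi}\mathbf{A}^{-1}\boldsymbol{\Phi}^{T})^{-1} \mathbf{t}\right\}. \tag{8}$$

The hyperparameters $\boldsymbol{\alpha}$ and $\sigma^2$ that maximize (8) are obtained by an iterative re-estimation approach [15], because they cannot be calculated in closed form. Suppose that the maximizing values $\boldsymbol{\alpha}_{MP}$ and $\sigma^2_{MP}$ have been obtained. Then the predictive distribution for a new input $\mathbf{x}_*$ can be computed by using (6):

$$p(t_* \mid \mathbf{t}, \boldsymbol{\alpha}_{MP}, \sigma^2_{MP}) = \int p(t_* \mid \mathbf{w}, \sigma^2_{MP})\, p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}_{MP}, \sigma^2_{MP})\, d\mathbf{w}. \tag{9}$$

Since both terms in the integrand are Gaussian, the result is readily computed as

$$p(t_* \mid \mathbf{t}, \boldsymbol{\alpha}_{MP}, \sigma^2_{MP}) = \mathcal{N}(t_* \mid y_*, \sigma_*^2), \tag{10}$$

where the mean and variance of the predicted value are, respectively,

$$y_* = \boldsymbol{\mu}^{T}\boldsymbol{\phi}(\mathbf{x}_*), \tag{11}$$

$$\sigma_*^2 = \sigma^2_{MP} + \boldsymbol{\phi}(\mathbf{x}_*)^{T}\boldsymbol{\Sigma}\,\boldsymbol{\phi}(\mathbf{x}_*). \tag{12}$$

The variance of the predicted value (i.e., (12)) is the sum of the variance associated with noise in the training data and the uncertainty associated with the prediction of the weights.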
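To make the prediction step concrete, the following is a minimal NumPy sketch under stated assumptions: the hyperparameters and the noise variance are held fixed (the hyperparameter optimization of [15] is not implemented), and the Gaussian kernel, its `width` parameter, the function names, and the toy data are illustrative choices rather than the authors' implementation. It computes the posterior covariance and mean of the weights and then the predictive mean and variance for a new input.

```python
import numpy as np

def gaussian_kernel(a, b, width=1.0):
    """Gaussian kernel between scalar inputs (width is an assumed parameter)."""
    return np.exp(-((a - b) ** 2) / (2.0 * width ** 2))

def design_matrix(x_new, x_train, width=1.0):
    """Each row is phi(x) = [1, K(x, x_1), ..., K(x, x_N)]."""
    k = gaussian_kernel(x_new[:, None], x_train[None, :], width)
    return np.hstack([np.ones((len(x_new), 1)), k])

def rvm_posterior(phi, t, alpha, sigma2):
    """Posterior covariance and mean of the weights for fixed hyperparameters."""
    a_mat = np.diag(alpha)
    cov = np.linalg.inv(phi.T @ phi / sigma2 + a_mat)
    mean = cov @ phi.T @ t / sigma2
    return cov, mean

def rvm_predict(phi_star, cov, mean, sigma2):
    """Predictive mean and variance for each row of phi_star."""
    y_star = phi_star @ mean
    # Noise variance plus the quadratic form phi^T * cov * phi, row by row.
    var_star = sigma2 + np.einsum("ij,jk,ik->i", phi_star, cov, phi_star)
    return y_star, var_star

# Toy data: a hypothetical normalized capacity fade, not taken from the paper.
x = np.linspace(0.0, 1.0, 8)
t = 1.0 - 0.3 * x
phi = design_matrix(x, x, width=0.3)
alpha = np.full(phi.shape[1], 1.0)   # assumed, fixed hyperparameters
sigma2 = 1e-2                        # assumed, fixed noise variance
cov, mean = rvm_posterior(phi, t, alpha, sigma2)
y_star, var_star = rvm_predict(design_matrix(np.array([0.5]), x, 0.3), cov, mean, sigma2)
print(y_star[0], var_star[0])
```

In this sketch the predictive variance is never smaller than the assumed noise variance, because the second term is a quadratic form in a positive-definite covariance matrix.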

#### 3. Selective Kernel Ensemble-Based Relevance Vector Machine

As mentioned in Section 1, kernel types and kernel parameters have significant influences on the generalization capability of RVM learning. Commonly used basic kernels for RVM learning include the Gaussian kernel (13), Exponential kernel (14), Laplacian kernel (15), Polynomial kernel (16), Sigmoid kernel (17), Cauchy kernel (18), and Multiquadric kernel (19), which together involve eight kernel parameters that need to be finely tuned. It is impossible to determine a single best kernel for all problems, because the choice of a kernel depends on the problem at hand. For example, the Gaussian kernel is a local kernel and the Polynomial kernel is a global kernel [22]. With a local kernel, only data points that are close to each other have an influence on the kernel values [22]. With a global kernel, samples that are far away from each other still have an influence on the kernel value [22]. For the Gaussian and Polynomial kernels in particular, a mixture of the two has been demonstrated to substantively improve the generalization performance of the SVM [23, 24]. However, with the many basic kernels listed above, the best combination can also differ from problem to problem. In the extreme case where all of the individual basic kernels are completely identical, the size of the combination can be reduced without sacrificing the generalization performance of the RVM. In addition, in some scenarios, eliminating some unacceptable basic kernels while selecting several acceptable ones to construct a kernel ensemble may be better than combining all of the basic kernels. In this study, each kernel applied to the RVM learning algorithm is a selective kernel ensemble of these basic kernels.
It should be noted that the multikernel idea has been successfully used in several machine learning models [25–28] that assume a weighted linear sum of basic kernels and estimate the kernel weights during training. To the best of the authors’ knowledge, however, this is the first time that a multikernel version of RVM with adaptive kernel selection, adaptive kernel combination, and adaptive kernel parameter optimization has been proposed. The selective kernel ensemble can be expressed as follows:

$$K(\mathbf{x}, \mathbf{x}') = \sum_{i=1}^{m} s_i w_i K_i(\mathbf{x}, \mathbf{x}'), \tag{20}$$

where $m$ is the number of basic kernels under consideration and equals 7 in this study, $K_i$ denotes the $i$th basic kernel, $w_i$ stands for the weight assigned to $K_i$, and $s_i \in \{0, 1\}$ represents the selection label assigned to $K_i$.
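As a rough illustration of the selective kernel ensemble, the sketch below combines three basic kernels through selection labels and weights. The kernel formulas are common textbook forms assumed here for illustration (they may differ in parameterization from those used in the study), only three of the seven kernels are included for brevity, and all names and numeric values are hypothetical.

```python
import math

def gaussian(x, y, sigma=1.0):
    """Assumed textbook form: exp(-||x - y||^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def laplacian(x, y, sigma=1.0):
    """Assumed textbook form: exp(-||x - y|| / sigma)."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-d / sigma)

def polynomial(x, y, c=1.0, degree=2):
    """Assumed textbook form: (x . y + c)^degree."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree

def normalize(weights, labels):
    """Rescale the weights of the selected kernels so that they sum to one."""
    total = sum(w * s for w, s in zip(weights, labels))
    if total <= 0:
        raise ValueError("at least one kernel with positive weight must be selected")
    return [w * s / total for w, s in zip(weights, labels)]

def ensemble_kernel(x, y, kernels, weights, labels):
    """Weighted sum of the selected basic kernels; unselected kernels are skipped."""
    norm = normalize(weights, labels)
    return sum(w * k(x, y) for w, k in zip(norm, kernels) if w > 0)

kernels = [gaussian, laplacian, polynomial]
weights = [0.6, 0.3, 0.2]   # hypothetical real-valued weights
labels = [1, 0, 1]          # hypothetical selection labels: Laplacian switched off
value = ensemble_kernel([1.0, 2.0], [1.0, 2.0], kernels, weights, labels)
print(value)
```

Dividing by the sum of the selected weights makes the combination a weighted average of the selected kernels, so an unselected kernel contributes nothing regardless of its weight.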

##### 3.1. Selection of Candidate Basic Kernels

Instead of combining all candidate basic kernels, the selective kernel ensemble selects an optimal subset of individual basic kernels to constitute a selective convex combination. However, selecting an optimal subset is not an easy task, since the space of possible subsets is very large ($2^m - 1$ nonempty subsets) for a basic kernel population of size $m$. Exhaustive search becomes difficult, if not impractical, especially when $m$ is large. In this study, the discrete particle swarm optimization (DPSO) [29] algorithm is used to obtain an optimal subset from the candidate basic kernels. Each dimension of a particle in DPSO is encoded by a binary bit, where “1” (i.e., $s_i = 1$) denotes that the $i$th basic kernel appears in the selective kernel ensemble and “0” (i.e., $s_i = 0$) denotes its absence, $i = 1, \dots, m$. The optimal subset of individual basic kernels is obtained from the best evolved selection label vector, that is, the one that achieves the maximum fitness value. Thus, such a DPSO bit representation removes the need for a tedious trial-and-error search for an optimal subset of basic kernels.
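The bit encoding can be sketched as follows. This is an illustrative single-particle update in the style of the widely used binary PSO, in which each bit is resampled with a probability given by a sigmoid of its velocity; whether the DPSO of [29] uses exactly this rule is an assumption. The inertia weight, the acceleration constants, the velocity clamp, and the repair step that keeps at least one kernel selected are likewise assumptions made for the example.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def update_bits(bits, velocity, personal_best, global_best, rng,
                inertia=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One binary-PSO step for a single particle; returns (bits, velocity)."""
    new_bits, new_velocity = [], []
    for b, v, p, g in zip(bits, velocity, personal_best, global_best):
        v = inertia * v + c1 * rng.random() * (p - b) + c2 * rng.random() * (g - b)
        v = max(-v_max, min(v_max, v))   # clamp so probabilities stay away from 0 and 1
        new_velocity.append(v)
        new_bits.append(1 if rng.random() < sigmoid(v) else 0)
    if not any(new_bits):                # assumed repair: keep at least one kernel
        new_bits[rng.randrange(len(new_bits))] = 1
    return new_bits, new_velocity

def selected_kernels(bits, names):
    """Decode a selection label vector into the kernels it switches on."""
    return [n for n, b in zip(names, bits) if b == 1]

names = ["gaussian", "exponential", "laplacian", "polynomial",
         "sigmoid", "cauchy", "multiquadric"]
rng = random.Random(0)
bits = [rng.randint(0, 1) for _ in names]
velocity = [0.0] * len(names)
bits, velocity = update_bits(bits, velocity, bits, [1, 0, 0, 1, 0, 0, 0], rng)
print(selected_kernels(bits, names))
```

In a full optimizer, the fitness of each decoded subset would be the validation performance of the RVM trained with the corresponding kernel ensemble.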

##### 3.2. Determination of Kernel Parameters and Additional Weights

Although the selective kernel ensemble can relieve the influence of kernel types on the generalization capability of RVM, it involves 7 additional weight coefficients $w_1, \dots, w_7$. In addition, more component basic kernels mean more kernel parameters. It is not easy to determine the optimal values of all these design parameters, including the eight kernel parameters and the convex combination coefficients, that allow the RVM to achieve its maximum performance. In this circumstance, a manual trial-and-error method is tedious and impractical. Moreover, a manual trial-and-error method does not necessarily guarantee a good decision, because these parameters usually interact with each other nonlinearly. In this study, the 7 additional weight coefficients together with the eight kernel parameters constitute a general real-valued parameter vector, which is represented in the population of continuous particle swarm optimization (CPSO) [30]. Thus, such a CPSO real-valued representation removes the need for a tedious trial-and-error search for optimal kernel parameters and additional weights.

##### 3.3. Coevolution of DPSO and CPSO

As mentioned in Sections 3.1 and 3.2, one swarm population DPSO with population size s_DPSO and the other swarm population CPSO with population size s_CPSO are involved in equipping the RVM with adaptive kernel selections, adaptive kernel combinations, and adaptive kernel parameters optimization. From a pure DPSO perspective, this suffices for the design of the RVM with the best kernel selection, but without taking kernel parameters and weights in kernel combination into account; that is, only good kernel selection obtained with DPSO may not necessarily mean good RVM performance. Similarly, only good kernel parameters and weights in kernel combination may not necessarily evoke maximum RVM performance. Therefore, the evolution of kernel selections by DPSO and the evolution of kernel combinations and kernel parameters by CPSO should be taken into consideration simultaneously. Inspired by the coevolution of swarms, a coevolutionary PSO scheme is proposed in this section. In the proposed coevolutionary PSO scheme, the DPSO and the CPSO interact with each other through the fitness evaluation. Within each iteration, the DPSO is run for a certain number (g_DPSO) of generations; then the CPSO is run for a certain number (g_CPSO) of generations; this process is repeated until either an acceptable solution has been obtained or the maximum number (max_i_PSO) of iterations has been reached. The global best in the population of DPSO is the final solution for the selection label vector, and the global best in the population of CPSO is the final solution for the general parameter vector with regard to kernel parameters and additional weight coefficients. The procedure of coevolution of DPSO and CPSO is outlined in the following pseudocode.

*Step 1.* Initialize randomly one swarm population DPSO with population size s_DPSO.

*Step 2.* Initialize randomly the other swarm population CPSO with population size s_CPSO.

*Step 3.* Run the DPSO for g_DPSO generations.

*Step 4.* Reevaluate the personal best values for the CPSO if it is not the first cycle.

*Step 5.* Run the CPSO for g_CPSO generations.

*Step 6.* Reevaluate the personal best values for the DPSO.

*Step 7.* Go back to Step 3. Repeat this procedure until a termination criterion is reached.
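The steps above can be sketched as a small runnable loop. This is an illustration under stated assumptions, not the authors' implementation: the `fitness` function is a hypothetical stand-in for RVM prediction accuracy, the swarms are reduced to three dimensions, the PSO constants are arbitrary, and only the iteration limit max_i_PSO is used as the termination criterion.

```python
import math
import random

TARGET_BITS = [1, 0, 1]           # hypothetical ideal kernel selection
TARGET_PARAMS = [0.5, 0.2, 0.8]   # hypothetical ideal parameter values

def fitness(bits, params):
    """Hypothetical stand-in for RVM accuracy (higher is better, at most 0)."""
    bit_cost = sum(abs(b - t) for b, t in zip(bits, TARGET_BITS))
    param_cost = sum(b * (p - t) ** 2 for b, p, t in zip(bits, params, TARGET_PARAMS))
    return -(bit_cost + param_cost)

def make_swarm(size, dim, binary, rng):
    pos = [[rng.randint(0, 1) if binary else rng.random() for _ in range(dim)]
           for _ in range(size)]
    return {"pos": pos, "vel": [[0.0] * dim for _ in range(size)],
            "pbest": [p[:] for p in pos], "binary": binary}

def refresh_gbest(swarm):
    best = max(range(len(swarm["pbest"])), key=lambda i: swarm["pfit"][i])
    swarm["gbest"] = swarm["pbest"][best][:]

def reevaluate(swarm, score):
    """Re-score personal bests in the partner swarm's current environment."""
    swarm["pfit"] = [score(p) for p in swarm["pbest"]]
    refresh_gbest(swarm)

def run(swarm, score, generations, rng, w=0.7, c1=1.5, c2=1.5):
    for _ in range(generations):
        for i, (x, v) in enumerate(zip(swarm["pos"], swarm["vel"])):
            for d in range(len(x)):
                v[d] = (w * v[d]
                        + c1 * rng.random() * (swarm["pbest"][i][d] - x[d])
                        + c2 * rng.random() * (swarm["gbest"][d] - x[d]))
                if swarm["binary"]:
                    v[d] = max(-4.0, min(4.0, v[d]))
                    x[d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v[d])) else 0
                else:
                    x[d] = min(1.0, max(0.0, x[d] + v[d]))
            f = score(x)
            if f > swarm["pfit"][i]:
                swarm["pbest"][i], swarm["pfit"][i] = x[:], f
        refresh_gbest(swarm)

def coevolve(s_dpso=10, s_cpso=10, g_dpso=5, g_cpso=5, max_i_pso=20, seed=0):
    rng = random.Random(seed)
    dpso = make_swarm(s_dpso, 3, True, rng)      # Step 1
    cpso = make_swarm(s_cpso, 3, False, rng)     # Step 2
    cpso["gbest"] = cpso["pos"][0][:]            # provisional environment for the first DPSO run
    for _ in range(max_i_pso):                   # Step 7: repeat until the iteration limit
        score_bits = lambda b: fitness(b, cpso["gbest"])
        reevaluate(dpso, score_bits)             # Step 6 (initial scoring on the first pass)
        run(dpso, score_bits, g_dpso, rng)       # Step 3
        score_params = lambda p: fitness(dpso["gbest"], p)
        reevaluate(cpso, score_params)           # Step 4 (initial scoring on the first pass)
        run(cpso, score_params, g_cpso, rng)     # Step 5
    return dpso["gbest"], cpso["gbest"]

best_bits, best_params = coevolve()
print(best_bits, best_params)
```

Each swarm is scored against the other swarm's current global best, which is why the personal bests are re-scored before every run: a personal best found under an earlier partner solution may no longer be good under the new one.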

In the above coevolutionary PSO scheme, when one PSO is running, the other PSO serves as its ecological environment; that is, for each PSO, its ecological environment varies from iteration to iteration. Therefore, the personal best obtained in the previous iteration has to be reevaluated in accordance with the new ecological environment before playing its coevolving role. It is also worth noting that, in each generation of the coevolution, the real weights are normalized so that the selected individual basic kernels are combined using a weighted average. Hence, this study uses a simple normalization scheme as follows:

$$w_i \leftarrow \frac{w_i}{\sum_{j=1}^{m} s_j w_j}, \quad i = 1, \dots, m, \tag{21}$$

where $w_i$ is the real weight and $s_i \in \{0, 1\}$ is the selection label of the $i$th of the $m$ basic kernels.

#### 4. Hybrid Prognostic Approach for RUL Prediction

As a lithium-ion battery ages, its maximum capacity deteriorates over time. If the maximum capacity falls below 80% of its initial rated capacity, the battery is considered unable to provide a reliable power supply and needs to be replaced. In current academic and industrial practice, the reliability of a lithium-ion battery is therefore ensured by predicting its remaining maximum capacity. In this study, a hybrid prognostic approach that integrates the selective kernel ensemble-based RVM learning algorithm and exponential regression is proposed for RUL prediction of lithium-ion batteries. Figure 1 shows an overall flowchart of the proposed hybrid prognostic approach.
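The extrapolation to the failure threshold can be illustrated with a short sketch. It assumes the degradation model is the sum of two exponential functions mentioned in Section 1, written here as a·exp(b·k) + c·exp(d·k) for cycle k, and that its parameters have already been fitted; the fitting step is omitted, and the parameter values, function names, and the search limit `max_cycle` are hypothetical.

```python
import math

def capacity(cycle, a, b, c, d):
    """Assumed degradation model: a*exp(b*k) + c*exp(d*k) at cycle k."""
    return a * math.exp(b * cycle) + c * math.exp(d * cycle)

def predict_rul(params, rated_capacity, current_cycle, max_cycle=5000, fraction=0.8):
    """Cycles remaining until the modeled capacity first drops below the threshold.

    Returns None if the threshold is not crossed within max_cycle cycles.
    """
    threshold = fraction * rated_capacity
    for cycle in range(current_cycle, max_cycle + 1):
        if capacity(cycle, *params) < threshold:
            return cycle - current_cycle
    return None

# Hypothetical fitted parameters for a cell with a normalized rated capacity of 1.0.
params = (-0.01, 0.02, 1.0, -0.001)
rul = predict_rul(params, rated_capacity=1.0, current_cycle=0)
print(rul)
```

A cycle-by-cycle scan is used here instead of a closed-form solution because the sum of two exponentials generally has no analytic inverse; a root finder would be a natural alternative when finer resolution is needed.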