#### Abstract

In this paper, a novel multistep ahead predictor based on a fusion of kernel recursive least square (KRLS) and Gaussian process regression (GPR) is proposed for the accurate prediction of the state of health (SoH) and remaining useful life (RUL) of lithium-ion batteries. Empirical mode decomposition is used to split the battery capacity into local regeneration (intrinsic mode functions) and global degradation (residual). The KRLS and GPR submodels track the residual and the intrinsic mode functions, respectively. For RUL, the KRLS-predicted residual signal is used. Publicly available experimental battery aging data are used to evaluate the proposed model. A comparison with other methodologies (i.e., GPR, KRLS, empirical mode decomposition with GPR, and empirical mode decomposition with KRLS) shows the superiority of the proposed approach. For 1-step ahead prediction, the proposed method tracks the trajectory with a root mean square error (RMSE) of 0.2299, and the RMSE increases by only 0.2243 for 30-step ahead prediction. RUL prediction using the residual signal is 3 to 5% more accurate than RUL prediction using SoH. The proposed methodology is a promising approach for efficient battery health prognostics.

#### 1. Introduction

The depletion of fossil fuel resources and concerns about climate change provide a strong impetus for developers to focus on green energy resources, green transportation, and smart grids [1, 2]. Energy storage devices are the core component in these fields. Owing to their light weight, high energy and power density, low self-discharge rate, and long life cycle, lithium-ion (Li-ion) batteries are superior to other energy storage devices [3, 4]. However, as the Li-ion battery is one of the system’s costly components, it must be handled carefully using an efficient battery management system (BMS) [5]. The role of an intelligent BMS is to manage the battery efficiently and monitor its state with high accuracy. Li-ion battery malfunctions often lead to functional impairment, degraded performance, or total failure. In recent years, the estimation and prediction of battery state of health (SoH), state of charge (SoC), state of life (SoL), remaining useful life (RUL), and state of function (SoF) have gained significant attention for battery health prognostics (BHP) [6–8]. In smart grids, renewable energy systems, and electric vehicles, battery life is one of the most important factors in achieving economic viability, and battery degradation under dynamic operating conditions is one of the most critical issues affecting it. Early estimation and prediction of battery SoH and RUL are therefore crucial tasks of a smart BMS for reliable operation.

Researchers have been working on Li-ion battery capacity estimation in recent years, as capacity is the determinative SoH indicator [9, 10]. When the Li-ion battery capacity falls to 80% of its initial capacity, the battery must be replaced to ensure smooth and reliable operation [11]. However, the battery capacity cannot be measured directly with any physical sensor, so it is challenging to determine the SoH and RUL accurately. To date, various methodologies have been reported to estimate and predict the SoH and RUL. Based on the literature, these can be categorized as model-based methods, data-driven methods, and hybrid approaches [1, 12].

The model-based methods describe the battery degradation behavior using differential, algebraic, or empirical equations. Researchers have presented empirical models [13–15], mechanistic models (also known as chemical models) [16–18], equivalent circuit models [19, 20], and fused models [21] to capture the battery degradation behavior. Hu et al. [22] presented a model-based method for coestimation of the SoC and SoH of Li-ion batteries. The fractional-order battery model is identified using a hybrid optimization algorithm, and the model shows a steady-state error of less than 1%. In their subsequent work [23], the authors used incremental capacity analysis to determine the SoH of an electric taxi, with a root mean square error of 0.0204. Although the model-based methods have good accuracy, they still have some drawbacks. Empirical and equivalent circuit models are easy to build, but they are accurate only for short-term states because the parameters change during cycling. Filtering algorithms can be used to update the model parameters, at the cost of higher system complexity. Similarly, mechanistic models are complex and require expert knowledge to build [1]. It is also difficult to build these models in noisy or uncertain environments.

The data-driven methods require only Li-ion battery sensor data (voltage, current, and temperature) to predict the SoH and RUL [24]. Different machine learning algorithms have been used to build the connection between operation data and battery degradation. Unlike model-based approaches, they do not require a complex physical model; they only build a weight vector based on the training data. Tian et al. [25] proposed a deep learning sequence-to-sequence model to predict the capacity degradation of the Li-ion battery. The authors used the data of one cycle of the Li-ion battery for multistep (100, 200, and 300 cycles) ahead prediction. In another study [26], a deep neural network was trained with discrete sections of the charging curves as input to predict the entire charging curve. Thirty data points, collected in less than 10 minutes, were used as input to train the deep learning model. Wang et al. [27] proposed a data-driven approach to diagnosing abnormalities in the battery charging capacity. These techniques need historical data to train the model. In the past, the relevance vector machine, logic regression, and the support vector machine have been reported to predict the RUL [28]. In [29], the authors presented a Bayesian model to predict the RUL of Li-ion batteries under dynamic operating conditions and showed that it had better prediction accuracy than the support vector machine. Tang et al. [30] proposed a balancing current ratio-based SoH predictor for series-connected cells in a battery pack. Liu et al. [31] proposed a two-stage trajectory model to determine the future aging trajectory with uncertainty quantification. Wang et al. [32] proposed another variant of the Bayesian model to predict the RUL. Neural networks [33, 34], an autoregressive fused model [35], and the Box-Cox transformation [36] were also used to estimate the battery capacity.

The studies above neglect the fluctuation and local regeneration phenomena in the capacity, which affects prediction accuracy. A Gaussian process functional regression model was proposed to tackle the issue of local capacity regeneration [37]. A variant of the recurrent neural network (long short-term memory) was proposed to predict the Li-ion battery capacity [38]; the experimental results show an average error of 0.0765 Ah (2.46%). In a recent study [39], a hybrid method based on long short-term memory and Gaussian process regression (GPR) was proposed to predict the capacity and RUL of Li-ion batteries. The GPR and long short-term memory were used to capture the local regeneration and the global capacity degradation trend, respectively. The authors also predicted the battery RUL multiple steps ahead, with a maximum error of less than 1.8%. However, it has been observed that local fluctuation and regeneration have a significant impact on the multistep ahead prediction of SoH and RUL. Therefore, further research is needed to predict q-step ahead SoH and RUL with high accuracy.

Driven by the desire to increase BMS reliability and improve battery safety, this study proposes a novel hybrid method consisting of multiscale kernel recursive least square (KRLS) and GPR for the q-step ahead SoH prediction of Li-ion batteries. More explicitly, the key contributions of the proposed approach are as follows:

(i) The empirical mode decomposition (EMD) method is employed to separate the local regeneration, the global battery degradation, and other fluctuations.

(ii) The KRLS with an autoregressive moving average with exogenous signals (ARMAX) model is used recursively to predict global battery degradation. GPR is applied to track the local fluctuation and regeneration of the Li-ion battery.

(iii) Finally, the predictions of KRLS and GPR are ensembled to obtain the final predicted SoH.

(iv) The RUL is predicted using the SoH, the intrinsic mode functions (IMFs), and the residual of the battery data.

(v) The suggested approach is validated using various online datasets (NASA and CALCE).

(vi) Experimental results and comparative analysis reveal the effectiveness and superiority of the proposed methodology, respectively.

#### 2. State of Health of Lithium-Ion Battery

A Li-ion battery is a highly nonlinear and complex electrochemical system, and its health is significantly affected by dynamic operating conditions. SoC, SoH, SoL, and RUL are the parameters primarily used to assess the health of Li-ion batteries [40, 41]. SoH is one of the essential components of the BHP system [42]. The most widely accepted definition of the SoH of a Li-ion battery is the ratio of the battery capacity at the *k*th cycle to that at the initial cycle:

$$\mathrm{SoH}_k = \frac{C_k}{C_0}, \tag{1}$$

where $\mathrm{SoH}_k$ is the SoH at the *k*th cycle, and $C_k$ and $C_0$ are the battery capacities at the *k*th cycle and the initial cycle, respectively. However, battery degradation can occur in both the cathode and the anode. Therefore, a scalar SoH alone is not sufficient. For further details, see [43, 44].
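As a minimal illustration of the capacity-ratio definition above, the following Python sketch converts a capacity series into SoH values. The function name and the sample capacities are hypothetical, and expressing SoH as a fraction rather than a percentage is an assumption.

```python
def state_of_health(capacities, initial_capacity=None):
    """Return the capacity ratio C_k / C_0 for every cycle k.

    Assumption: the first entry is the initial capacity unless one is given.
    """
    if not capacities:
        raise ValueError("capacities must be non-empty")
    c0 = capacities[0] if initial_capacity is None else initial_capacity
    if c0 <= 0:
        raise ValueError("initial capacity must be positive")
    return [c / c0 for c in capacities]


# Hypothetical capacities in Ah (not taken from the datasets in this paper).
soh = state_of_health([2.0, 1.9, 1.8, 1.6])
```

Multiplying the result by 100 gives the SoH in percent, if that convention is preferred.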

#### 3. Methodology

In this section, the framework of the proposed methodology is explained in detail.

##### 3.1. Empirical Mode Decomposition (EMD)

EMD is an efficient tool for analyzing highly dynamic signals; it decomposes nonstationary and nonlinear signals into oscillatory components known as IMFs, plus a residual. Owing to these abilities, it has been applied in many other fields (e.g., image processing, vibration analysis, and rotating machinery). Huang et al. [45] discuss the EMD approach in more detail. After decomposition, each IMF should satisfy the following conditions:

(1) The mean value of the upper and lower envelopes must be equal to 0 at any instant.

(2) Over the whole time series, the number of zero crossings and the number of extrema must be equal or differ by at most 1.

In this work, the local fluctuation and regeneration phenomena in the original SoH signal are treated as the high-frequency components, and the global SoH degradation as the low-frequency component. The decomposition procedure is known as sifting. First, find all the extrema (minima and maxima) of the input signal $x(t)$; then connect the local maxima and the local minima with spline curves to form the upper envelope $e_{\max}(t)$ and the lower envelope $e_{\min}(t)$, respectively. Next, compute the local mean of the two envelopes:

$$m(t) = \frac{e_{\max}(t) + e_{\min}(t)}{2}. \tag{2}$$

Determine the difference between $x(t)$ and the mean value $m(t)$:

$$h(t) = x(t) - m(t). \tag{3}$$

After calculating the difference, check whether $h(t)$ fulfills the IMF conditions discussed above. If it meets all the conditions to be an IMF, remove it from $x(t)$ to obtain the residual signal $r(t)$:

$$r(t) = x(t) - h(t). \tag{4}$$

Repeat all the steps on the residual until it meets the stopping criterion. All the information on local fluctuation and regeneration is stored in the IMFs, and the monotonic residue contains the information on the global degradation of SoH [46]. By adding all the IMFs and the monotonic residue, the original input signal can be described as follows:

$$x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r_n(t), \tag{5}$$

where $n$ is the number of IMFs and $r_n(t)$ is the final residue.

In this work, EMD was performed with the Wavelet Toolbox and Signal Processing Toolbox of MATLAB®. A flowchart of the EMD procedure is shown in Figure 1.
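One sifting pass of the procedure described in this subsection can be sketched in Python as follows. This is only an illustration: the study itself used MATLAB®, the function names `local_extrema` and `sift_once` and the synthetic signal are hypothetical, and piecewise-linear envelopes (`np.interp`) stand in for the spline envelopes of standard EMD.

```python
import numpy as np


def local_extrema(x):
    """Indices of strict interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima


def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes.

    Assumption: envelopes are piecewise linear and pinned to the signal end
    points, instead of the spline envelopes used in standard EMD.
    Returns None when the signal has no interior maxima or minima.
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size)
    maxima, minima = local_extrema(x)
    if maxima.size == 0 or minima.size == 0:
        return None  # monotonic signal: nothing left to sift
    ends = np.array([0, x.size - 1])
    up_idx = np.unique(np.concatenate((ends, maxima)))
    lo_idx = np.unique(np.concatenate((ends, minima)))
    upper = np.interp(t, up_idx, x[up_idx])
    lower = np.interp(t, lo_idx, x[lo_idx])
    mean_env = 0.5 * (upper + lower)
    return x - mean_env


# Hypothetical SoH-like signal: linear fade plus a small oscillation.
t = np.arange(200)
signal = 1.0 - 0.001 * t + 0.01 * np.sin(2 * np.pi * t / 20)
candidate = sift_once(signal)          # candidate IMF (local fluctuation)
residual_guess = signal - candidate    # what remains after removing it
```

In a full decomposition the pass would be repeated until the candidate meets the IMF conditions, and the whole loop would then be applied again to the remaining signal.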

##### 3.2. Kernel Recursive Least Square

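In this work, the ARMAX model is used to predict the SoH of the battery. The ARMAX model can be represented using the following equation [47]:

$$y(k) = \sum_{i=1}^{n} a_i\, y(k-i) + \sum_{j=1}^{m} b_j\, u(k-j) + e(k), \tag{6}$$

where $u(k)$ and $y(k)$ are the measured signal and desired response, respectively; $a_i$ and $b_j$ are the model coefficients, which have to be estimated recursively; $e(k)$ represents zero-mean Gaussian noise; and $n$ and $m$ are the orders of the system and the input. The above model can be written in a simplified form as follows:

$$y(k) = \boldsymbol{\varphi}^{T}(k)\,\boldsymbol{\theta} + e(k), \tag{7}$$

where $\boldsymbol{\varphi}^{T}(k)$ is the transpose of the regression vector and $\boldsymbol{\theta}$ collects the coefficients $a_i$ and $b_j$. The KRLS method can be used to determine the unknown coefficients of the above equation. The cost function can be expressed by the following equations:

$$f(\cdot) = \sum_{i=1}^{k} \alpha_i\, \kappa\big(\boldsymbol{\varphi}(i), \cdot\big), \qquad [\mathbf{K}]_{ij} = \kappa\big(\boldsymbol{\varphi}(i), \boldsymbol{\varphi}(j)\big),$$

$$\min_{f \in \mathcal{H}} \; \sum_{i=1}^{k} \beta^{\,k-i}\, \big| y(i) - f(\boldsymbol{\varphi}(i)) \big|^{2} + \lambda\, \beta^{\,k}\, \lVert f \rVert_{\mathcal{H}}^{2}, \tag{10}$$

where the Mercer kernel is represented by $\kappa(\cdot,\cdot)$, and $\mathbf{K}$, $\lambda$, $\mathcal{H}$, and $\beta$ are the kernel matrix, regularization factor (always taken as a positive number), reproducing kernel Hilbert space (RKHS), and forgetting factor, respectively. The most commonly used kernels for prediction are the Gaussian kernel, the polynomial kernel, and the sigmoid kernel [48]:

$$\kappa_{\mathrm{G}}(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}}{2\sigma^{2}}\right), \qquad \kappa_{\mathrm{P}}(\mathbf{x}, \mathbf{x}') = \left(\mathbf{x}^{T}\mathbf{x}' + c\right)^{d}, \qquad \kappa_{\mathrm{S}}(\mathbf{x}, \mathbf{x}') = \tanh\!\left(\gamma\, \mathbf{x}^{T}\mathbf{x}' + c\right),$$

where $\sigma$, $\mathbf{x}'$, $c$, and $d$ are the scaling factor, latest upcoming data, positive-valued constant, and polynomial order, respectively, and $\gamma$ and $c$ are both positive constants. In this work, all three kernel functions were implemented. The presented results are for the polynomial kernel, which shows the best accuracy.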
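The three kernel families named above can be written compactly in code. The sketch below uses common textbook parameterizations, which are assumptions here; the exact forms and parameter values used in the study are not given in this text. The function names and defaults are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, z, sigma=1.0):
    """exp(-||x - z||^2 / (2 sigma^2)); sigma is the scaling factor."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def polynomial_kernel(x, z, c=1.0, d=2):
    """(x . z + c)^d with positive constant c and polynomial order d."""
    return float((np.dot(x, z) + c) ** d)


def sigmoid_kernel(x, z, gamma=0.1, c=1.0):
    """tanh(gamma * x . z + c) with positive constants gamma and c."""
    return float(np.tanh(gamma * np.dot(x, z) + c))


# Hypothetical regression vectors.
x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])
values = (gaussian_kernel(x, z), polynomial_kernel(x, z), sigmoid_kernel(x, z))
```

Note that the sigmoid kernel is not positive semidefinite for every parameter choice, so it does not always satisfy Mercer's condition.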

The KRLS method works by mapping the input data into a high-dimensional RKHS. In this process, the linear inner products are replaced with kernel evaluations [49, 50], so linear algorithms can then be applied in the transformed feature space (RKHS). A unique global solution is the salient feature of kernel-based methods [51]. If the input data are highly nonlinear, linear regression techniques fail to model them accurately; kernel-based algorithms tackle this issue by mapping the nonlinear data into a high-dimensional linear feature space. Because of the high dimensionality of the data in the RKHS, the model is prone to overfitting. This issue can be resolved by adding an L2-norm penalty, as shown in (10) [52]; the resulting problem is solved and updated recursively as described in [53].

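The approximate linear dependency criterion is used to limit the growth in the computational complexity of KRLS as the number of observations increases [54]. In this work, KRLS coupled with approximate linear dependency was implemented in MATLAB®. To estimate the model capacity $\hat{y}(k)$, (7) can be written as follows:

$$\hat{y}(k) = \boldsymbol{\varphi}^{T}(k)\,\hat{\boldsymbol{\theta}}(k-1). \tag{13}$$

Equation (13) can be modified for q-step ahead prediction as follows:

$$\hat{y}(k+q) = \hat{\boldsymbol{\varphi}}^{T}(k+q)\,\hat{\boldsymbol{\theta}}(k), \tag{14}$$

where $\hat{\boldsymbol{\theta}}$ is the estimated coefficient vector and $\hat{\boldsymbol{\varphi}}(k+q)$ is the regression vector in which outputs that have not yet been measured are replaced by their predicted values.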
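The idea of q-step ahead prediction by repeatedly applying a one-step predictor can be sketched as follows. This is not the recursive KRLS with approximate linear dependency used in the study: a batch kernel ridge solution stands in for the recursive update, and the lag count, regularization value, function names, and synthetic trend are all hypothetical.

```python
import numpy as np


def poly_kernel(A, B, c=1.0, d=2):
    """Polynomial kernel matrix between the rows of A and the rows of B."""
    return (A @ B.T + c) ** d


def fit_kernel_ridge(X, y, lam=1e-6):
    """Batch solution of (K + lam * I) alpha = y.

    Assumption: this replaces the recursive KRLS update for illustration.
    """
    K = poly_kernel(X, X)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)


def predict_q_steps(history, X_train, alpha, lags, q):
    """Apply the 1-step predictor q times, feeding each prediction back."""
    window = [float(v) for v in history[-lags:]]
    preds = []
    for _ in range(q):
        x = np.array(window[-lags:]).reshape(1, -1)
        y_hat = float((poly_kernel(x, X_train) @ alpha)[0])
        preds.append(y_hat)
        window.append(y_hat)  # the prediction becomes the newest regressor
    return preds


# Hypothetical monotone "residual" trend (not data from this study).
trend = 1.0 - 0.002 * np.arange(60)
lags = 2
X = np.column_stack([trend[i:len(trend) - lags + i] for i in range(lags)])
y = trend[lags:]
alpha = fit_kernel_ridge(X, y)
forecast = predict_q_steps(trend, X, alpha, lags, q=5)
```

Because each step reuses earlier predictions as inputs, errors can accumulate with the horizon, which is why the multistep results are reported separately from the 1-step results.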

##### 3.3. Gaussian Process Regression

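GPR is an effective approach to solving nonlinear regression and classification problems [55, 56]. It is a probabilistic, nonparametric model: a Gaussian process is a collection of random variables, any finite combination of which follows a joint Gaussian distribution. The GPR model can be described by its mean and covariance (kernel) functions as follows:

$$f(\mathbf{x}) \sim \mathcal{GP}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big), \tag{15}$$

where $m(\mathbf{x})$ and $k(\mathbf{x}, \mathbf{x}')$ are the mean and covariance functions, respectively. The function $m(\mathbf{x})$ is usually assumed to be zero. The relation between input and output can be expressed as follows:

$$y = f(\mathbf{x}) + \varepsilon, \tag{16}$$

where $\varepsilon$ is the additive noise, which has zero mean and variance $\sigma_n^{2}$.

By using (16), the likelihood can be written as follows:

$$p(\mathbf{y} \mid \mathbf{f}) = \mathcal{N}\big(\mathbf{f},\, \sigma_n^{2}\mathbf{I}\big),$$

where $\mathbf{f} = [f(\mathbf{x}_1), \ldots, f(\mathbf{x}_N)]^{T}$ and $\mathbf{I}$ is the unit matrix. According to [57], the marginal distribution of $\mathbf{f}$ can be written as follows:

$$p(\mathbf{f}) = \mathcal{N}(\mathbf{0},\, \mathbf{K}),$$

where $[\mathbf{K}]_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$. Combining the likelihood with this prior gives the marginal distribution of the observations:

$$p(\mathbf{y}) = \mathcal{N}\big(\mathbf{0},\, \mathbf{K} + \sigma_n^{2}\mathbf{I}\big).$$

For the prediction of the target value at a new input $\mathbf{x}_*$, the joint distribution over $\mathbf{y}$ and $f_*$ can be written as follows:

$$\begin{bmatrix}\mathbf{y}\\ f_*\end{bmatrix} \sim \mathcal{N}\!\left(\mathbf{0},\, \begin{bmatrix}\mathbf{K} + \sigma_n^{2}\mathbf{I} & \mathbf{k}_*\\ \mathbf{k}_*^{T} & k_{**}\end{bmatrix}\right),$$

where $f_*$ is the latent function value corresponding to the input $\mathbf{x}_*$, $\mathbf{k}_* = [k(\mathbf{x}_1, \mathbf{x}_*), \ldots, k(\mathbf{x}_N, \mathbf{x}_*)]^{T}$, and $k_{**} = k(\mathbf{x}_*, \mathbf{x}_*)$. The predictive distribution is Gaussian, with the following mean and variance:

$$\bar{f}_* = \mathbf{k}_*^{T}\big(\mathbf{K} + \sigma_n^{2}\mathbf{I}\big)^{-1}\mathbf{y}, \qquad \operatorname{var}(f_*) = k_{**} - \mathbf{k}_*^{T}\big(\mathbf{K} + \sigma_n^{2}\mathbf{I}\big)^{-1}\mathbf{k}_*.$$

The inverse $\big(\mathbf{K} + \sigma_n^{2}\mathbf{I}\big)^{-1}$ can be calculated using Cholesky decomposition [58]. The covariance (kernel) function is a critical component in the prediction process. In this work, the rational quadratic kernel function is used for the prediction [39].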
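A zero-mean GPR prediction with a rational quadratic kernel and a Cholesky-based solve can be sketched in Python as follows. The hyperparameters are fixed by hand here rather than optimized, and the function names, parameter values, and sine-wave data are hypothetical; the sketch only illustrates the predictive mean and variance computation.

```python
import numpy as np


def rq_kernel(a, b, length=1.0, alpha=1.0, sf=1.0):
    """Rational quadratic kernel between 1-D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return sf ** 2 * (1.0 + d2 / (2.0 * alpha * length ** 2)) ** (-alpha)


def gpr_predict(x_train, y_train, x_test, noise=1e-2):
    """Predictive mean and variance of a zero-mean GP.

    The linear systems with (K + noise^2 I) are solved through its
    Cholesky factor instead of forming the inverse explicitly.
    """
    K = rq_kernel(x_train, x_train) + noise ** 2 * np.eye(x_train.size)
    K_s = rq_kernel(x_train, x_test)
    K_ss = rq_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)
    w = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ w
    var = np.diag(K_ss) - np.sum(v * v, axis=0)
    return mean, var


# Hypothetical smooth signal standing in for an IMF.
x = np.linspace(0.0, 6.0, 25)
y = np.sin(x)
x_new = np.array([1.6, 3.1])
mean, var = gpr_predict(x, y, x_new)
```

The variance returned alongside the mean is what allows a GPR-based model to attach an uncertainty estimate to each predicted point.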

##### 3.4. Proposed Methodology

In this work, a fused battery SoH prediction model based on EMD, KRLS, and GPR is proposed. The framework of the proposed approach is shown in Figure 2.

The raw battery sensor data are passed through a Savitzky-Golay filter to reduce the measurement noise [59]. The filter is implemented using the MATLAB® function *sgolayfilt*. The battery SoH is then calculated using (1). The EMD technique is used to decompose the battery SoH into IMFs and a residual signal, as discussed in Section 3.1. KRLS and GPR are adopted to track the global degradation and the local regeneration phenomena in the Li-ion battery, respectively. Finally, the predicted IMFs and residual are combined to obtain the predicted SoH. When the predicted SoH reaches the battery end-of-life (EOL) threshold, the RUL is predicted. Percentage fitting (FIT) and root mean square error (RMSE) are used to evaluate the performance of SoH prediction:

$$\mathrm{FIT} = \left(1 - \frac{\lVert \mathbf{y} - \hat{\mathbf{y}} \rVert}{\lVert \mathbf{y} - \bar{y} \rVert}\right) \times 100\%,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\big(y_k - \hat{y}_k\big)^{2}},$$

where $y$ and $\hat{y}$ are the original and estimated output, $\bar{y}$ is the mean of the original output, and $N$ is the total number of samples.
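The two evaluation metrics can be computed as in the following Python sketch. The RMSE is standard; the FIT expression is assumed to be the normalized-error form common in system identification, 100 × (1 − ‖y − ŷ‖ / ‖y − mean(y)‖), which may differ from the exact definition used in the study. The function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def fit_percent(y_true, y_pred):
    """Assumed fit measure: 100 * (1 - ||y - y_hat|| / ||y - mean(y)||)."""
    mean_y = sum(y_true) / len(y_true)
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)))
    den = math.sqrt(sum((a - mean_y) ** 2 for a in y_true))
    return 100.0 * (1.0 - num / den)


# Hypothetical SoH values in percent (not results from this study).
measured = [100.0, 98.0, 97.0, 95.0]
predicted = [99.5, 98.2, 96.8, 95.4]
scores = (rmse(measured, predicted), fit_percent(measured, predicted))
```

Under this definition a perfect prediction gives a FIT of 100%, and a predictor that always returns the mean of the measurements gives 0%.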

In this study, to examine the accuracy of RUL prediction, the following testing standard has been followed:

#### 4. Experimental Data and Results

In this section, the proposed methodology is evaluated using NASA’s publicly available data repository [60]. The details of the different battery datasets are presented in Table 1. All the processing was done in MATLAB® 2021 on a personal computer with an Intel(R) Core(TM) i7-10700 CPU @ 2.90 GHz, 32 GB RAM, a 1 TB SSD, and a 64-bit Windows 10 Pro operating system (OS).

The cyclic aging experiments were carried out on all NASA batteries using a programmed electric load, adjustable temperature chamber, and electric supply [61]. The discharge current and temperature of all the Li-ion batteries are shown in Table 1. Further details of the experimental setup can be found in [61]. The SoH trends of all Li-ion batteries can be seen in Figure 3.

After the battery data are collected through transducers, they are passed through the Savitzky-Golay filter, which reduces the measurement noise. The EMD technique then decomposes the Li-ion battery SoH into a residual and IMF signals, as shown in Figure 4.
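The smoothing step can be illustrated with a small NumPy-only Savitzky-Golay smoother. This sketch is not the MATLAB® *sgolayfilt* call used in the study: the window length, polynomial order, and function name are hypothetical, and the signal edges are handled by simple edge padding, which is an assumption.

```python
import numpy as np


def savgol_smooth(x, window=7, order=2):
    """Savitzky-Golay smoothing via least-squares polynomial weights.

    Each output sample is the value at the window center of the polynomial
    of the given order fitted to the samples inside the window.
    """
    if window % 2 == 0 or window <= order:
        raise ValueError("window must be odd and larger than order")
    half = window // 2
    offsets = np.arange(-half, half + 1)
    A = np.vander(offsets, order + 1, increasing=True)  # columns 1, t, t^2, ...
    weights = np.linalg.pinv(A)[0]                       # fitted value at t = 0
    padded = np.pad(np.asarray(x, dtype=float), half, mode="edge")
    return np.convolve(padded, weights[::-1], mode="valid")


# Hypothetical noisy capacity-like signal with a fixed random seed.
rng = np.random.default_rng(0)
clean = 2.0 - 0.004 * np.arange(100)
noisy = clean + 0.01 * rng.standard_normal(100)
smoothed_signal = savgol_smooth(noisy, window=11, order=2)
```

A wider window removes more noise but also flattens short regeneration peaks, so the window length trades noise suppression against preserving local features.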

The prediction results for Li-ion batteries B0005, B0006, B0018, B0055, and B0056 using the proposed technique (EMD, KRLS, and GPR) are shown in Figures 5(a)–5(e), respectively.

For comparison, the results of the proposed method and of other methodologies (solo GPR, solo KRLS, EMD + GPR, and EMD + KRLS) are presented in Figure 6.

To further validate the model, another publicly available dataset, from the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland, is used for prediction [62]. An Arbin BT2000 system with a temperature-controlled chamber was used to perform all cycling tests for the CALCE battery dataset (CX2-16). The CX2-16 battery was discharged at a constant current of 1.1 A; for further information on the experimental setting, see [39, 61]. Sixty percent of the data are used for training and the rest for Li-ion battery capacity prediction (CX2-16). The prediction results are shown in Figure 7.

The FIT of 1-step ahead prediction of the proposed methodology for all datasets is shown in Figure 8.

The q-step ahead prediction of the proposed methodology is shown in Figure 9 (for B0018), and the RMSE results for all datasets are presented in Table 2.

The q-step ahead prediction comparison of proposed and other methodologies is illustrated in Figure 10.

The results of RUL prediction accuracy against different parameters have been presented in Tables 3–7 for the Li-ion batteries. For all Li-ion battery datasets, the RUL is predicted at various cycle numbers to check the accuracy. The comparison of the proposed approach with another state-of-the-art study is shown in Table 8.

#### 5. Discussion

In this work, a BHP model is proposed to avoid unexpected battery failures. As discussed earlier, the accurate and early SoH prediction of Li-ion batteries is one of the main components of intelligent BMS.

The basic framework of the proposed approach is shown in Figure 2. After the filtering step, the EMD technique divides the SoH of the Li-ion battery into its global degradation (residual) and local regeneration (IMFs) (see Figure 4). The EMD technique takes 0.54 ms and 2.31 ms to decompose the data of Li-ion batteries B0055 (102 data points) and CX2-16 (1998 data points), respectively. The residual shows the actual SoH degradation of the Li-ion battery (Figure 4), while the IMFs capture all local regeneration points of the original SoH. The one-step-ahead prediction results for Li-ion batteries B0005, B0006, B0018, B0055, and B0056 are shown in Figure 5. For B0005 and B0006, 110 of the 168 battery cycles were used to train the model, as shown in Figures 5(a) and 5(b). The KRLS tracks the residual values without any significant error, as shown in Figure 5. The GPR was used to predict the IMF signals of the batteries, and it shows good tracking ability, as also reported in [39]. The proposed methodology shows similar accuracy for B0018, for which 80 of the 127 samples were used to train the models (see Figure 5(c)). In [61], the author used B0005, B0006, and B0018 to validate a multiscale logic regression (LR) and GPR model. Those results showed a maximum RMSE of 0.8 for 1-step ahead prediction; in comparison, the proposed methodology shows a maximum RMSE of 0.284 on the same datasets. The data of Li-ion batteries B0055 and B0056 were noisy because these batteries were operated at 4°C. The proposed methodology still shows high accuracy in the presence of this perturbation, as seen in Figures 5(d) and 5(e). Figure 6 reports the comparison results: the solo GPR has poor tracking capability and shows a significant prediction error, whereas EMD with KRLS shows the second-best prediction accuracy after the proposed approach. For the CALCE dataset, 1200 of the 1998 data points were used to train the model, and the prediction RMSE was just 0.64 over the remaining 798 data points (see Figure 7). The proposed method predicts 1-step ahead values with high accuracy (Figure 8). The SoH fitting accuracy for B0055 and B0056 is somewhat lower because of the high perturbation in the measured signal. However, it is still better than that reported in [61].

For all the datasets, q-step ahead prediction was carried out for q = 5, 10, 15, 20, 25, and 30. The q-step ahead prediction for the Li-ion battery B0018 is shown in Figure 9. The proposed methodology shows high accuracy even for 30-step ahead prediction (see Table 2). The RMSE of the 1-step ahead prediction for B0006 was 0.2299, and it increased by only 0.2243 for the 30-step ahead prediction. In some cases, the prediction RMSE decreases as the prediction horizon increases. For B0005, an RMSE of 0.2823 was noted at 5-step ahead prediction, while the RMSE at 10-step ahead prediction was just 0.2296, which is 0.0527 lower. At the 5-step ahead prediction for B0005, there was a regeneration point to predict, which is why the RMSE was higher at 5 steps than at 10 steps. The maximum RMSE of 1.1021 was noted for B0055 at 30-step ahead prediction under a perturbed environment. The q-step ahead comparison analysis shows the effectiveness of the proposed methodology (see Figure 10).

For a smart BMS, early and accurate prediction of Li-ion battery RUL is one of the key requirements for safe and reliable operation. In this work, different features were used to predict the RUL at different cycle numbers using the proposed model: the predicted SoH, the IMFs, and the residual. All the RUL prediction results are tabulated in Tables 3–7. For B0005 and B0006, the RUL was predicted from cycle 50 to cycle 120 in steps of 5 cycles. As Table 3 shows, the RUL accuracy was just 75.59% at the 50th cycle using SoH as the predictor, while the residual had an accuracy of 99.21% at the same point. In [61], an RUL prediction accuracy of just 79.84% was observed at the 50th cycle. The RUL prediction accuracy increased with the prediction point (e.g., at 110 cycles, the RUL prediction accuracy was 94.57%). Similarly, a prediction error of 3.3% was noted in [39]. The residual has its minimum RUL prediction accuracy of 96.06% at the 90th cycle; in comparison, SoH has an accuracy of 96.85% at the same point. The accuracy of the IMFs was far below that of SoH and the residual, which is also reflected in the results. The average RUL prediction accuracy using the residual and SoH as features was 97.53% and 94.12%, respectively. It can be concluded that RUL prediction using the residual is more accurate than prediction using the other parameters. Similar results can be observed for all other batteries (see Tables 4–7). For the Li-ion battery CX2-16, the average RUL prediction accuracy was 99.51%. An average absolute error of only 3, 12, and 3 cycles is noted for the Li-ion batteries B0005, B0006, and B0018, which is 13, 10, and 2 cycles fewer than in the other study [61] (see Table 8). Hence, after extensive experimentation and comprehensive analysis, it can be concluded that the proposed model predicts the SoH and RUL with high accuracy.
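The step from a predicted trajectory to an RUL value can be sketched as a threshold crossing. The sketch below is an illustration under stated assumptions: the function names, the sample trajectory, and the 0.8 threshold (taken from the 80% replacement criterion mentioned in the introduction) are hypothetical, and the accuracy measure used in the tables is not reproduced here.

```python
def end_of_life_cycle(trajectory, threshold):
    """First cycle (1-based) at which the trajectory is at or below the threshold."""
    for k, value in enumerate(trajectory, start=1):
        if value <= threshold:
            return k
    return None  # threshold never reached within the trajectory


def remaining_useful_life(trajectory, current_cycle, threshold):
    """RUL in cycles: predicted EOL cycle minus the current cycle.

    Returns None when the trajectory never reaches the threshold.
    """
    eol = end_of_life_cycle(trajectory, threshold)
    if eol is None:
        return None
    return max(eol - current_cycle, 0)


# Hypothetical predicted SoH (or residual) values for cycles 1..6.
forecast_soh = [1.0, 0.95, 0.9, 0.85, 0.79, 0.75]
rul = remaining_useful_life(forecast_soh, current_cycle=2, threshold=0.8)
```

The same function applies whether the trajectory is the predicted SoH or the predicted residual; only the input series changes.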

#### 6. Limitations and Future Perspectives

The presented technique for predicting battery health could be employed to develop a BMS. However, the prediction model was validated under controlled conditions, such as constant charging/discharging current and temperature. In practice, the operating conditions fluctuate substantially throughout cycling, causing the battery to deteriorate in several phases. Therefore, the performance of the proposed approach must be checked under dynamic conditions. Furthermore, only the RUL prediction of a single battery cell is considered in this study, whereas in a battery pack, numerous cells are connected in series/parallel. Because the cells age unequally owing to temperature differences, battery pack RUL prediction with uncertainty quantification must be investigated in the future.

#### 7. Conclusion

In this work, a battery health predictor has been proposed to reduce the chances of unexpected battery failures. To address the issue of accurately predicting local regeneration in the SoH signal, the EMD technique was employed to decompose the data into low- and high-frequency signals. The recursive KRLS method was used to track the global battery degradation, and GPR to predict the local fluctuation and regeneration points with high accuracy. The proposed methodology shows above 91% fitting accuracy at 1-step ahead prediction under a normal environment, and a maximum RMSE of 1.1021 at 30-step ahead prediction under a perturbed environment. The comparison analysis also shows that the proposed method is more effective and accurate than the compared methods. Furthermore, the results show that RUL prediction using the residual has 3 to 5% higher accuracy than RUL prediction using SoH. The proposed technique can therefore be used to design battery health prognostics.

#### Data Availability

The data used in this work are collected from the following public websites: https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/ and https://web.calce.umd.edu/batteries/data.htm.

#### Conflicts of Interest

The authors declare no conflicts of interest.

#### Authors’ Contributions

All authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

#### Acknowledgments

This work was supported by the “Human Resources Program in Energy Technology” of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (no. 20204010600090).