#### Abstract

Crude oil is one of the most important types of energy for the global economy, and hence it is very attractive to understand the movement of crude oil prices. However, the sequences of crude oil prices usually show some characteristics of nonstationarity and nonlinearity, making it very challenging for accurate forecasting crude oil prices. To cope with this issue, in this paper, we propose a novel approach that integrates complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and extreme gradient boosting (XGBOOST), so-called CEEMDAN-XGBOOST, for forecasting crude oil prices. Firstly, we use CEEMDAN to decompose the nonstationary and nonlinear sequences of crude oil prices into several intrinsic mode functions (IMFs) and one residue. Secondly, XGBOOST is used to predict each IMF and the residue individually. Finally, the corresponding prediction results of each IMF and the residue are aggregated as the final forecasting results. To demonstrate the performance of the proposed approach, we conduct extensive experiments on the West Texas Intermediate (WTI) crude oil prices. The experimental results show that the proposed CEEMDAN-XGBOOST outperforms some state-of-the-art models in terms of several evaluation metrics.

#### 1. Introduction

As one of the most important types of energy that power the global economy, crude oil has great impacts on every country, every enterprise, and even every person. Therefore, it is a crucial task for the governors, investors, and researchers to forecast the crude oil prices accurately. However, the existing research has shown that crude oil prices are affected by many factors, such as supply and demand, interest rate, exchange rate, speculation activities, international and political events, climate, and so on [1, 2]. Therefore, the movement of crude oil prices is irregular. For example, starting from about 11 USD/barrel in December 1998, the WTI crude oil prices gradually reached the peak of 145.31 USD/barrel in July 2008, and then the prices drastically declined to 30.28 USD/barrel in the next five months because of the subprime mortgage crisis. After that, the prices climbed to more than 113 USD/barrel in April 2011, and, once again, they sharply dropped to about 26 USD/barrel in February 2016. The movement of the crude oil prices in the last decades has shown that the forecasting task is very challenging, due to the characteristics of high nonlinearity and nonstationarity of crude oil prices.

Many scholars have devoted efforts to trying to forecast crude oil prices accurately. The most widely used approaches to forecasting crude oil prices can be roughly divided into two groups: statistical approaches and artificial intelligence (AI) approaches. Recently, Miao et al. have explored the factors of affecting crude oil prices based on the least absolute shrinkage and selection operator (LASSO) model [1]. Ye et al. proposed an approach integrating ratchet effect for linear prediction of crude oil prices [3]. Morana put forward a semiparametric generalized autoregressive conditional heteroskedasticity (GARCH) model to predict crude oil prices at different lag periods even without the conditional average of historical crude oil prices [4]. Naser found that using the dynamic model averaging (DMA) with empirical evidence is better than linear models such as autoregressive (AR) model and its variants [5]. Gong and Lin proposed several new heterogeneous autoregressive (HAR) models to forecast the good and bad uncertainties in crude oil prices [6]. Wen et al. also used HAR models with structural breaks to forecast the volatility of crude oil futures [7].

Although the statistical approaches improve the accuracy of forecasting crude oil prices to some extent, the assumption of linearity of crude oil prices cannot be met according to some recent research, and hence it limits the accuracy. Therefore, a variety of AI approaches have been proposed to capture the nonlinearity and nonstationarity of crude oil prices in the last decades [8–11]. Chiroma et al. reviewed the existing research associated with forecasting crude oil prices and found that AI methodologies are attracting unprecedented interest from scholars in the domain of crude oil price forecasting [8]. Wang et al. proposed an AI system framework that integrated artificial neural networks (ANN) and rule-based expert system with text mining to forecast crude oil prices, and it was shown that the proposed approach was significantly effective and practically feasible [9]. Barunik and Malinska used neural networks to forecast the term structure in crude oil futures prices [10]. Most recently, Chen et al. have studied forecasting crude oil prices using deep learning framework and have found that the random walk deep belief networks (RW-DBN) model outperforms the long short term memory (LSTM) and the random walk LSTM (RW-LSTM) models in terms of forecasting accuracy [11]. Other AI-methodologies, such as genetic algorithm [12], compressive sensing [13], least square support vector regression (LSSVR) [14], and cluster support vector machine (ClusterSVM) [15], were also applied to forecasting crude oil prices. Due to the extreme nonlinearity and nonstationarity, it is hard to achieve satisfactory results by forecasting the original time series directly. An ideal approach is to divide the tough task of forecasting original time series into several subtasks, and each of them forecasts a relatively simpler subsequence. And then the results of all subtasks are accumulated as the final result. Based on this idea, a “decomposition and ensemble” framework was proposed and widely applied to the analysis of time series, such as energy forecasting [16, 17], fault diagnosis [18–20], and biosignal analysis [21–23]. This framework consists of three stages. In the first stage, the original time series was decomposed into several components. Typical decomposition methodologies include wavelet decomposition (WD), independent component analysis (ICA) [24], variational mode decomposition (VMD) [25], empirical mode decomposition (EMD) [2, 26] and its extension (ensemble EMD (EEMD)) [27, 28], and complementary EEMD (CEEMD) [29]. In the second stage, some statistical or AI-based methodologies are applied to forecast each decomposed component individually. In theory, any regression methods can be used to forecast the results of each component. In the last stage, the predicted results from all components are aggregated as the final results. Recently, various researchers have devoted efforts to forecasting crude oil prices following the framework of “decomposition and ensemble.” Fan et al. put forward a novel approach that integrates independent components analysis (ICA) and support vector regression (SVR) to forecast crude oil prices, and the experimental results verified the effectiveness of the proposed approach [24]. Yu et al. used EMD to decompose the sequences of the crude oil prices into several intrinsic mode functions (IMFs) at first and then used a three-layer feed-forward neural network (FNN) model for predicting each IMF. Finally, the authors used an adaptive linear neural network (ALNN) to combine all the results of the IMFS as the final forecasting output [2]. Yu et al. also used EEMD and extended extreme learning machine (EELM) to forecast crude oil prices, following the framework of "decomposition and ensemble." The empirical results demonstrated the effectiveness and efficiency of the proposed approach [28]. Tang et al. further proposed an improved approach integrating CEEMD and EELM for forecasting crude oil prices, and the experimental results demonstrated that the proposed approach outperformed all the listed state-of-the-art benchmarks [29]. Li et al. used EEMD to decompose raw crude oil prices into several components and then used kernel and nonkernel sparse Bayesian learning (SBL) to forecast each component, respectively [30, 31].

From the perspective of decomposition, although EMD and EEMD are capable of improving the accuracy of forecasting crude oil prices, they still suffer “mode mixing” and introducing new noise in the reconstructed signals, respectively. To overcome these shortcomings, an extension of EEMD, so-called complete EEMD with adaptive noise (CEEMDAN), was proposed by Torres et al. [32]. Later, the authors put forward an improved version of CEEMDAN to obtain decomposed components with less noise and more physical meaning [33]. The CEEMDAN has succeeded in wind speed forecasting [34], electricity load forecasting [35], and fault diagnosis [36–38]. Therefore, CEEMDAN may have the potential to forecast crude oil prices. As pointed out above, any regression methods can be used to forecast each decomposed component. A recently proposed machine learning algorithm, extreme gradient boosting (XGBOOST), can be used for both classification and regression [39]. The existing research has demonstrated the advantages of XGBOOST in forecasting time series [40–42].

With the potential of CEEMDAN in decomposition and XGBOOST in regression, in this paper, we aim at proposing a novel approach that integrates CEEMDAN and XGBOOST, namely, CEEMDAN-XGBOOST, to improve the accuracy of forecasting crude oil prices, following the “decomposition and ensemble” framework. Specifically, we firstly decompose the raw crude oil price series into several components with CEEMDAN. And then, for each component, XGBOOST is applied to building a specific model to forecast the component. Finally, all the predicted results from every component are aggregated as the final forecasting results. The main contributions of this paper are threefold: We propose a novel approach, so-called CEEMDAN-XGBOOST, for forecasting crude oil prices, following the “decomposition and ensemble” framework; extensive experiments are conducted on the publicly-accessed West Texas Intermediate (WTI) to demonstrate the effectiveness of the proposed approach in terms of several evaluation metrics; we further study the impacts of several parameter settings with the proposed approach.

The remainder of this paper is organized as follows. Section 2 describes CEEMDAN and XGBOOST. Section 3 formulates the proposed CEEMDAN-XGBOOST approach in detail. Experimental results are reported and analyzed in Section 4. We also discussed the impacts of parameter settings in this section. Finally, Section 5 concludes this paper.

#### 2. Preliminaries

##### 2.1. EMD, EEMD and CEEMDAN

EMD was proposed by Huang et al. in 1998, and it has been developed and applied in many disciplines of science and engineering [26]. The key feature of EMD is to decompose a nonlinear, nonstationary sequence into intrinsic mode functions (IMFs) in the spirit of the Fourier series. In contrast to the Fourier series, they are not simply sine or cosine functions, but rather functions that represent the characteristics of the local oscillation frequency of the original data. These IMFs need to satisfy two conditions: the number of local extrema and the number of zero crossing must be equal or differ at most by one and the curve of the “local mean” is defined as zero.

At first, EMD finds out the upper and the lower envelopes which are computed by finding the local extrema of the original sequence. Then, the local maxima (minima) are linked by two cubic spines to construct the upper (lower) envelopes, respectively. The mean of these envelopes is considered as the “local mean.” Meanwhile, the curve of this “local mean” is defined as the first residue, and the difference between original sequence and the “local mean” is defined as the first IMF. An illustration of EMD is shown in Figure 1.

**(a) The line of the original sequence, local mean, upper envelopes, and lower envelopes**

**(b) The first IMF decomposed by EMD from the above original sequence**

After the first IMF is decomposed by EMD, there is still a residue (the local mean, i.e., the yellow dot line in Figure 1(a)) between the IMF and the original data. Obviously, extrema and high-frequency oscillations also exist in the residue. And EMD decomposes the residue into another IMF and one residue. If the variance of the new residue is not small enough to satisfy the Cauchy criterion, EMD will repeat to decompose new residue into another IMF and a new residue. Finally, EMD decomposes original sequence into several IMFs and one residue. The difference between the IMF and the residues is defined as where is the* k*-th residue at the time* t * and* K* is the total number of IMFs and residues.

Subsequently, Huang et al. thought that EMD could not extract the local features from the mixed features of the original sequence completely. One of the reasons for this is the frequent appearance of the mode mixing. The mode mixing can be defined as the situation that similar pieces of oscillations exist at the same corresponding position in different IMFs, which causes that a single IMF has lost its physical meanings. What is more, if one IMF has this problem, the following IMFs cannot avoid it either. To solve this problem, Wu and Huang extended EMD to a new version, namely, EEMD, that adds white noise to the original time series and performs EMD many times [27]. Given a time series and corresponding noise, the new time series can be expressed as where stands for the original data and is the* i*-th white noise (*i*=*1*,*2*,…,*N*, and* N* is the times of performing EMD).

Then, EEMD decomposes every into . In order to get the real* k*-th IMF, , EEMD calculates the average of the . In theory, because the mean of the white noise is zero, the effect of the white noise would be eliminated by computing the average of , as shown in

However, Torres et al. found that, due to the limited number of in empirical research, EEMD could not completely eliminate the influence of white noise in the end. For this situation, Torres et al. put forward a new decomposition technology, CEEMDAN, on the basis of EEMD [32].

CEEMDAN decomposes the original sequence into the first IMF and residue, which is the same as EMD. Then, CEEMDAN gets the second IMF and residue, as shown in where stands for the first IMF decomposed from the sequence and is used to set the signal-to-noise ratio (SNR) at each stage.

In the same way, the* k*-th IMF and residue can be calculated as

Finally, CEEMDAN gets several IMFs and computes the residue, as shown in

The sequences decomposed by EMD, EEMD, and CEEMDAN satisfy (8). Although CEEMDAN can solve the problems that EEMD leaves, it still has two limitations: the residual noise that the models contain and the existence of spurious modes. Aiming at dealing with these issues, Torres et al. proposed a new algorithm to improve CEEMDAN [33].

Compared with the original CEEMDAN, the improved version obtains the residues by calculating the local means. For example, in order to get the first residue shown in (9), it would compute the local means of* N* realizations(*i=1, 2,..., N*).where* M(.)* is the local mean of the sequence.

Then, it can get the first IMF shown in

For the* k*-th residue and IMF, they can be computed as (11) and (12), respectively:

The authors have demonstrated that the improved CEEMDAN outperformed the original CEEMDAN in signal decomposition [33]. In what follows, we will refer to the improved version of CEEMDAN as CEEMDAN, unless otherwise stated. With CEEMDAN, the original sequence can be decomposed into several IMFs and one residue, that is, the tough task of forecasting the raw time series, can be divided into forecasting several simpler subtasks.

##### 2.2. XGBOOST

Boosting is the ensemble method that can combine several weak learners into a strong learner as where is a weak learner and* K* is the number of weak learners.

When it comes to the tree boosting, its learners are decision trees which can be used for both regression and classification.

To a certain degree, XGBOOST is considered as tree boost, and its core is the Newton boosting instead of Gradient Boosting, which finds the optimal parameters by minimizing the loss function , as shown in where is the complexity of the* k*-th tree model,* n* is the sample size,* T* is the number of leaf nodes of the decision trees, *ω* is the weight of the leaf nodes, controls the extent of complexity penalty for tree structure on* T*, and controls the degree of the regularization of .

Since it is difficult for the tree ensemble model to minimize loss function in (14) and (15) with traditional methods in Euclidean space, the model uses the additive manner [43]. It adds that improves the model and forms the new loss function as where is the prediction of the* i*-th instance at the* t*-th iteration and is the weaker learner at the* t*-th iteration.

Then, Newton boosting performs a Taylor second-order expansion on the loss function to obtain , because the second-order approximation helps to minimize the loss function conveniently and quickly [43]. The equations of , and the new loss function are defined, respectively, as

Assume that the sample set in the leaf node* j* is defined as =, where q() represents the tree structure from the root to the leaf node* j* in the decision tree, (19) can be transformed into the following formula, as show in

The formula for estimating the weight of each leaf in the decision tree is formulated as

According to (21), as for the tree structure* q*, the loss function at the leaf node* j* can be changed as

Therefore, the equation of the information gain after branching can be defined as where and are the sample sets of the left and right leaf node, respectively, after splitting the leaf node* j*.

XGBOOST branches each leaf node and constructs basic learners by the criterion of maximizing the information gain.

With the help of Newton boosting, the XGBOOST can deal with missing values by adaptively learning. To a certain extent, XGBOOST is based on the multiple additive regression tree (MART), but it can get better tree structure by learning with Newton boosting. In addition, XGBOOST can also subsample among columns, which reduces the relevance of each weak learner [39].

#### 3. The Proposed CEEMDAN-XGBOOST Approach

From the existing literature, we can see that CEEMDAN has advantages in time series decomposition, while XGBOOST does well in regression. Therefore, in this paper, we integrated these two methods and proposed a novel approach, so-called CEEMDAN-XGBOOST, for forecasting crude oil prices. The proposed CEEMDAN-XGBOOST includes three stages: decomposition, individual forecasting, and ensemble. In the first stage, CEEMDAN is used to decompose the raw series of crude oil prices into* k*+1 components, including* k* IMFs and one residue. Among the components, some show high-frequency characteristics while the others show low-frequency ones of the raw series. In the second stage, for each component, a forecasting model is built using XGBOOST, and then the built model is applied to forecast each component and then get an individual result. Finally, all the results from the components are aggregated as the final result. Although there exist a lot of methods to aggregate the forecasting results from components, in the proposed approach, we use the simplest way, i.e., addition, to summarize the results of all components. The flowchart of the CEEMDAN-XGBOOST is shown in Figure 2.

From Figure 2, it can be seen that the proposed CEEMDAN-XGBOOST based on the framework of “decomposition and ensemble” is also a typical strategy of “divide and conquer”; that is, the tough task of forecasting crude oil prices from the raw series is divided into several subtasks of forecasting from simpler components. Since the raw series is extremely nonlinear and nonstationary while each decomposed component has a relatively simple form for forecasting, the CEEMDAN-XGBOOST has the ability to achieve higher accuracy of forecasting crude oil prices. In short, the advantages of the proposed CEEMDAN-XGBOOST are threefold: the challenging task of forecasting crude oil prices is decomposed into several relatively simple subtasks; for forecasting each component, XGBOOST can build models with different parameters according to the characteristics of the component; and a simple operation, addition, is used to aggregate the results from subtasks as the final result.

#### 4. Experiments and Analysis

##### 4.1. Data Description

To demonstrate the performance of the proposed CEEMDAN-XGBOOST, we use the crude oil prices from the West Texas Intermediate (WTI) as experimental data (the data can be downloaded from https://www.eia.gov/dnav/pet/hist/RWTCD.htm). We use the daily closing prices covering the period from January 2, 1986, to March 19, 2018, with 8123 observations in total for empirical studies. Among the observations, the first 6498 ones from January 2, 1986, to September 21, 2011, accounting for 80% of the total observations, are used as training samples, while the remaining 20% ones are for testing. The original crude oil prices are shown in Figure 3.

We perform multi-step-ahead forecasting in this paper. For a given time series , the m-step-ahead forecasting for can be formulated as where is the m-step-ahead predicted result at time* t*,* f* is the forecasting model, is the true value at time* i*, and* l* is the lag order.

For SVR and FNN, we normalize each decomposed component before building the model to forecast the component individually. In detail, the normalization process can be defined as where is the normalized series of crude oil prices series, is the data before normalization, is the mean of , and is the standard deviation of . Meanwhile, since normalization is not necessary for XGBOOST and ARIMA, for models with these two algorithms, we build forecasting models from each of the decomposed components directly.

##### 4.2. Evaluation Criteria

When we evaluate the accuracy of the models, we focus on not only the numerical accuracy but also the accuracy of forecasting direction. Therefore, we select the root-mean-square error (RMSE) and the mean absolute error (MAE) to evaluate the numerical accuracy of the models. Besides, we use directional statistic (Dstat) as the criterion for evaluating the accuracy of forecasting direction. The RMSE, MAE, and Dstat are defined as where is the actual crude oil prices at the time* t*, is the prediction, and* N* is the size of the test set.

In addition, we take the Wilcoxon signed rank test (WSRT) for proving the significant differences between the forecasts of the selected models [44]. The WSRT is a nonparametric statistical hypothesis test that can be used to evaluate whether the population mean ranks of two predictions from different models on the same sample differ. Meanwhile, it is a paired difference test which can be used as an alternative to the paired Student t-test. The null hypothesis of the WSRT is whether the median of the loss differential series is equal to zero or not, where and are the error series of model* a* and model* b* respectively, and g(.) is a loss function. If the* p* value of pairs of models is below 0.05, the test rejects the null hypothesis (there is a significant difference between the forecasts of this pair of models) under the confidence level of 95%. In this way, we can prove that there is a significant difference between the optimal model and the others.

However, the criteria defined above are global. If some singular points exist, the optimal model chosen by these criteria may not be the best one. Thus, we make the model confidence set (MCS) [31, 45] in order to choose the optimal model convincingly.

In order to calculate the* p*-value of the statistics accurately, the MCS performs bootstrap on the prediction series, which can soften the impact of the singular points. For the* j*-th model, suppose that the size of a bootstrapped sample is* T*, and the* t*-th bootstrapped sample has the loss functions defined as

Suppose that a set M_{0}= that contains* n* models, for any two models* j* and* k*, the relative values of the loss between these two models can be defined as

According to the above definitions, the set of superior models can be defined as where E(.) represents the average value.

The MCS repeatedly performs the significant test in M_{0}. At each time, the worst prediction model in the set is eliminated. In the test, the hypothesis is the null hypothesis of equal predictive ability (EPA), defined as

The MCS mainly depends on the equivalence test and elimination criteria. The specific process is as follows.

*Step 1. *Assuming M=, at the level of significance *α*, use the equivalence test to test the null hypothesis .

*Step 2. *If it accepts the null hypothesis and then it defines , otherwise it eliminates the model which rejects the null hypothesis from M according to the elimination criteria. The elimination will not stop until there are not any models that reject the null hypothesis in the set* M*. In the end, the models in are thought as surviving models.

Meanwhile, the MCS has two kinds of statistics that can be defined as where* T*_{R} and stand for the range statistics and the semiquadratic statistic, respectively, and both statistics are based on the t-statistics as shown in (35)-(36). These two statistics (*T*_{R} and ) are mainly to remove the model whose* p*-value is less than the significance level *α*. When the* p*-value is greater than the significance level *α*, the models can survive. The larger the* p*-value, the more accurate the prediction of the model. When the* p*-value is equal to 1, it indicates that the model is the optimal forecasting model.

##### 4.3. Parameter Settings

To test the performance of XGBOOST and CEEMDAN-XGBOOST, we conduct two groups of experiments: single models that forecast crude oil prices with original sequence, and ensemble models that forecast crude oil prices based on the “decomposition and ensemble” framework.

For single models, we compare XGBOOST with one statistical model, ARIMA, and two widely used AI-models, SVR and FNN. Since the existing research has shown that EEMD significantly outperforms EMD in forecasting crude oil prices [24, 31], in the experiments, we only compare CEEMDAN with EEMD. Therefore, we compare the proposed CEEMDAN-XGBOOST with EEMD-SVR, EEMD-FNN, EEMD-XGBOOST, CEEMDAN-SVR, and CEEMDAN-FNN.

For ARIMA, we use the Akaike information criterion (AIC) [46] to select the parameters (*p-d-q*). For SVR, we use RBF as kernel function and use grid search to optimize* C* and* gamma* in the ranges of and , respectively. We use one hidden layer with 20 nodes for FNN. We use a grid search to optimize the parameters for XGBOOST; the search ranges of the optimized parameters are shown in Table 1.

We set 0.02 and 0.05 as the standard deviation of the added white noise and set 250 and 500 as the number of realizations of EEMD and CEEMDAN, respectively. The decomposition results of the original crude oil prices by EEMD and CEEMDAN are shown in Figures 4 and 5, respectively.

It can be seen from Figure 4 that, among the components decomposed by EEMD, the first six IMFs show characteristics of high frequency while the remaining six components show characteristics of low frequency. However, regarding the components by CEEMDAN, the first seven ones show clear high-frequency and the last four show low-frequency, as shown in Figure 5.

The experiments were conducted with Python 2.7 and MATLAB 8.6 on a 64-bit Windows 7 with 3.4 GHz I7 CPU and 32 GB memory. Specifically, we run FNN and MCS with MATLAB, and, for the remaining work, we use Python. Regarding XGBoost, we used a widely used Python package (https://xgboost.readthedocs.io/en/latest/python/) in the experiments.

##### 4.4. Experimental Results

In this subsection, we use a fixed value 6 as the lag order, and we forecast crude oil prices with 1-step-ahead, 3-step-ahead, and 6-step-ahead forecasting; that is to say, the horizons for these three forecasting tasks are 1, 3, and 6, respectively.

###### 4.4.1. Experimental Results of Single Models

For single models, we compare XGBOOST with state-of-the-art SVR, FNN, and ARIMA, and the results are shown in Table 2.

It can be seen from Table 2 that XGBOOST outperforms other models in terms of RMSE and MAE with horizons 1 and 3. For horizon 6, XGBOOST achieves the best RMSE and the second best MAE, which is slightly worse than that by ARIMA. For horizon 1, FNN achieves the worst results among the four models; however, for horizons 3 and 6, SVR achieves the worst results. Regarding Dstat, none of the models can always outperform others, and the best result of Dstat is achieved by SVR with horizon 6. It can be found that the RMSE and MAE values gradually increase with the increase of horizon. However, the Dstat values do not show such discipline. All the values of Dstat are around 0.5, i.e., from 0.4826 to 0.5183, which are very similar to the results of random guesses, showing that it is very difficult to accurately forecast the direction of the movement of crude oil prices with raw crude oil prices directly.

To further verify the advantages of XGBOOST over other models, we report the results by WSRT and MCS, as shown in Tables 3 and 4, respectively. As for WSRT, the* p*-value between XGBOOST and other models except ARIMA is below 0.05, which means that there is a significant difference among the forecasting results of XGBOOST, SVR, and FNN in population mean ranks. Besides, the results of MCS show that the* p*-value of and of XGBOOST is always equal to 1.000 and prove that XGBOOST is the optimal model among all the models in terms of global errors and most local errors of different samples obtained by bootstrap methods in MCS. According to MCS, the p-values of and of SVR on the horizon of 3 are greater than 0.2, so SVR becomes the survival and the second best model on this horizon. When it comes to ARIMA, ARIMA almost performs as good as XGBOOST in terms of evaluation criteria of global errors but it does not pass the MCS. It indicates that ARIMA does not perform better than other models in most local errors of different samples.

###### 4.4.2. Experimental Results of Ensemble Models

With EEMD or CEEMDAN, the results of forecasting the crude oil prices by XGBOOST, SVR, and FNN with horizons 1, 3, and 6, are shown in Table 5.

It can be seen from Table 5 that the RMSE and MAE values of the CEEMDAN-XGBOOST are the lowest ones among those by all methods with each horizon. For example, with horizon 1, the values of RMSE and MAE are 0.4151 and 0.3023, which are far less than the second values of RMSE and MAE, i.e., 0.8477 and 0.7594, respectively. With the horizon increases, the corresponding values of each model increase, in terms of RMSE and MAE. However, the CEEMDAN-XGBOOST still achieves the lowest values of RMSE and MAE with each horizon. Regarding the values of Dstat, all the values are far greater than those by random guesses, showing that the “decomposition and ensemble” framework is effective for directional forecasting. Specifically, the values of Dstat are in the range between 0.6165 and 0.9054. The best Dstat values in all horizons are achieved by CEEMDAN-SVR or EEMD-SVR, showing that, among the forecasters, SVR is the best one for directional forecasting, although corresponding values of RMSE and MAE are not the best. As for the decomposition methods, when the forecasters are fixed, CEEMDAN outperforms EEMD in 8, 8, and 8 out of 9 cases in terms of RMSE, MAE, and Dstat, respectively, showing the advantages of CEEMDAN over EEMD. Regarding the forecasters, when combined with CEEMDAN, XGBOOST is always superior to other forecasters in terms of RMSE and MAE. However, when combined with EEMD, XGBOOST outperforms SVR and FNN with horizon 1, and FNN with horizon 6 in terms of RMSE and MAE. With horizons 1 and 6, FNN achieves the worst results of RMSE and MAE. The results also show that good values of RMSE usually are associated with good values of MAE. However, good values of RMSE or MAE do not always mean good Dstat directly.

For the ensemble models, we also took a Wilcoxon signed rank test and an MCS test based on the errors of pairs of models. We set 0.2 as the level of significance in MCS, and 0.05 as the level of significance in WSRT. The results are shown in Tables 6 and 7.

From these two tables, it can be seen that, regarding the results of WSRT, the* p*-value between CEEMDAN-XGBOOST and any other models except EEMD-FNN are below 0.05, demonstrating that there is a significant difference on the population mean ranks between CEEMDAN-XGBOOST and any other models except EEMD-FNN. What is more, the MCS shows that the* p*-value of and of CEEMDAN-XGBOOST is always equal to 1.000 and demonstrates that CEEMDAN-XGBOOST is the optimal model among all models in terms of global errors and local errors. Meanwhile, the* p-*values of and of EEMD-FNN are greater than other models except CEEMDAN-XGBOOST and become the second best model with horizons 3 and 6 in MCS. Meanwhile, with the horizon 6, the CEEMDAN-SVR is also the second best model. Besides, the* p-*values of and of EEMD-SVR and CEEMDAN-SVR are up to 0.2 and they become the surviving models with horizon 6 in MCS.

From the results of single models and ensemble models, we can draw the following conclusions: single models usually cannot achieve satisfactory results, due to the nonlinearity and nonstationarity of raw crude oil prices. As a single forecaster, XGBOOST can achieve slightly better results than some state-of-the-art algorithms; ensemble models can significantly improve the forecasting accuracy in terms of several evaluation criteria, following the “decomposition and ensemble” framework; as a decomposition method, CEEMDAN outperforms EEMD in most cases; the extensive experiments demonstrate that the proposed CEEMDAN-XGBOOST is promising for forecasting crude oil prices.

##### 4.5. Discussion

In this subsection, we will study the impacts of several parameters related to the proposed CEEMDAN-XGBOOST.

###### 4.5.1. The Impact of the Number of Realizations in CEEMDAN

In (2), it is shown that there are* N* realizations in CEEMDAN. We explore how the number of realizations in CEEMDAN can influence on the results of forecasting crude oil prices by CEEMDAN-XGBOOST with horizon 1 and lag 6. And we set the number of realizations in CEEMDAN in the range of 10,25,50,75,100,250,500,750,1000. The results are shown in Figure 6.

It can be seen from Figure 6 that, for RMSE and MAE, the bigger the number of realizations is, the more accurate results the CEEMDAN-XGBOOST can achieve. When the number of realizations is less than or equal to 500, the values of both RMSE and MAE decrease with increasing of the number of realizations. However, when the number is greater than 500, these two values are increasing slightly. Regarding Dstat, when the number increases from 10 to 25, the value of Dstat increases rapidly, and then it increases slowly with the number increasing from 25 to 500. After that, Dstat decreases slightly. It is shown that the value of Dstat reaches the top values with the number of realizations 500. Therefore, 500 is the best value for the number of realizations in terms of RMSE, MAE, and Dstat.

###### 4.5.2. The Impact of the Lag Orders

In this section, we explore how the number of lag orders impacts the prediction accuracy of CEEMDAN-XGBOOST on the horizon of 1. In this experiment, we set the number of lag orders from 1 to 10, and the results are shown in Figure 7.

According to the empirical results shown in Figure 7, it can be seen that as the lag order increases from 1 to 2, the values of RMSE and MAE decrease sharply while that of Dstat increases drastically. After that, the values of RMSE of MAE remain almost unchanged (or increase very slightly) with the increasing of lag orders. However, for Dstat, the value increases sharply from 1 to 2 and then decreases from 2 to 3. After the lag order increases from 3 to 5, the Dstat stays almost stationary. Overall, when the value of lag order is up to 5, it reaches a good tradeoff among the values of RMSE, MAE, and Dstat.

###### 4.5.3. The Impact of the Noise Strength in CEEMDAN

Apart from the number of realizations, the noise strength in CEEMDAN, which stands for the standard deviation of the white noise in CEEMDAN, also affects the performance of CEEMDAN-XGBOOST. Thus, we set the noise strength in the set of 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07 to explore how the noise strength in CEEMDAN affects the prediction accuracy of CEEMDAN-XGBOOST on a fixed horizon 1 and a fixed lag 6.

As shown in Figure 8, when the noise strength in CEEMDAN is equal to 0.05, the values of RMSE, MAE and Dstat achieve the best results simultaneously. When the noise strength is less than or equal to 0.05 except 0.03, the values of RMSE, MAE and Dstat become better and better with the increase of the noise strength. However, when the strength is greater than 0.05, the values of RMSE, MAE and Dstat become worse and worse. The figure indicates that the noise strength has a great impact on forecasting results, and an ideal range for it is about 0.04-0.06.

#### 5. Conclusions

In this paper, we propose a novel model, namely, CEEMDAN-XGBOOST, to forecast crude oil prices. At first, CEEMDAN-XGBOOST decomposes the sequence of crude oil prices into several IMFs and a residue with CEEMDAN. Then, it forecasts the IMFs and the residue with XGBOOST individually. Finally, CEEMDAN-XGBOOST computes the sum of the prediction of the IMFs and the residue as the final forecasting results. The experimental results show that the proposed CEEMDAN-XGBOOST significantly outperforms the compared methods in terms of RMSE and MAE. Although the performance of the CEEMDAN-XGBOOST on forecasting the direction of crude oil prices is not the best, the MCS shows that the CEEMDAN-XGBOOST is still the optimal model. Meanwhile, it is proved that the number of the realizations, lag, and the noise strength for CEEMDAN are the vital factors which have great impacts on the performance of the CEEMDAN-XGBOOST.

In the future, we will study the performance of the CEEMDAN-XGBOOST on forecasting crude oil prices with different periods. We will also apply the proposed approach for forecasting other energy time series, such as wind speed, electricity load, and carbon emissions prices.

#### Data Availability

The data used to support the findings of this study are included within the article.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (Grants no. JBK1902029, no. JBK1802073, and no. JBK170505), the Natural Science Foundation of China (Grant no. 71473201), and the Scientific Research Fund of Sichuan Provincial Education Department (Grant no. 17ZB0433).