Research Article  Open Access
Selecting Single Model in Combination Forecasting Based on Cointegration Test and Encompassing Test
Abstract
Combination forecasting takes the characteristics of each single forecasting method into consideration and combines them into a composite forecast, which increases forecasting accuracy. Existing research on combination forecasting selects single models arbitrarily, neglecting the internal characteristics of the forecasting object. After discussing the role of the cointegration test and the encompassing test in single model selection, supplemented by empirical analysis, the paper gives guidance for single model selection: no more than five suitable single models should be selected from the many alternative single models for a given forecasting target, which increases accuracy and stability.
1. Introduction
In 1969, Bates and Granger [1] put forward the idea of “the combination of forecasts” for the first time; that is, different single forecasts are combined after the characteristics of each single forecasting method are considered. This method attracted the attention of forecasting scholars as soon as it was put forward.
The combination forecasting model is a process of selecting and utilizing the information of single forecasting models, but few studies specifically address how to select the single models when forming a combination forecasting model. In previous combination forecasting research and applications, the single forecasting models involved are often not screened; a forecasting model is chosen according to the obvious characteristics of the particular process and existing knowledge and experience.
In the regression analysis of econometrics, an important index for evaluating the goodness of fit is $R^2$, but one of the most direct ways to raise $R^2$ is to add explanatory variables. So a contradiction occurs: after the number of explanatory variables increases, problems with $t$ tests and collinearity may arise that cannot be explained. That is to say, more explanatory variables are not always better.
The research of Armstrong [2] shows that the best combinations, most of the time, use only a few (no more than 5) methods. As more and more single models are combined, the probability of obtaining the optimum performance decreases.
So, this paper discusses how to select suitable models from the alternative single models to form a combination for a given forecasting target; following certain selection principles, the number of single models in the combination is no more than five.
2. Cointegration Test of Single Model in Combination Forecasting
From the end of the 1980s to the 1990s, a great breakthrough in econometric modeling theory was the research on cointegration relationships between time series. Cointegration describes the long-run equilibrium relationship between different time series; it is a key idea in current econometrics and also an important theoretical cornerstone of current research on combination forecasting with time series.
If forecasting series is an objective performance of sample series, in the long term, they should have an equilibrium relationship, even though, in the short term, these variables may deviate from mean value due to random disturbance and other reasons. Therefore, we can calculate the forecasting series of all alternative single forecasting models and reject the alternative single forecasting models that do not conform to cointegration relationship through the cointegration test with sample series, and the goal of selecting single model can be achieved.
2.1. Brief Introduction to Cointegration
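Suppose $x_t$ is a stationary time series whose mean $E(x_t)$, variance $\operatorname{Var}(x_t)$, and covariance $\operatorname{Cov}(x_t, x_{t+k})$ are finite and do not change with time $t$; that is,
$$E(x_t) = \mu, \qquad \operatorname{Var}(x_t) = \sigma^2, \qquad \operatorname{Cov}(x_t, x_{t+k}) = \gamma_k. \tag{1}$$
The three equations above hold for all $t$ and $k$, and $\mu$, $\sigma^2$, and $\gamma_k$ here are constants.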
Suppose some series $x_t$ requires differencing $d$ times before it becomes a stationary series; this series is integrated of order $d$, denoted by $x_t \sim I(d)$. Often, economic series are integrated of order one or two.
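Suppose the series $x_{1t}, x_{2t}, \ldots, x_{kt}$ are all integrated of order $d$ and a vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k)^{\top}$ exists, satisfying
$$z_t = \alpha^{\top} X_t \sim I(d - b). \tag{2}$$
Let $b > 0$, $X_t = (x_{1t}, x_{2t}, \ldots, x_{kt})^{\top}$; it can be considered that $X_t$ is cointegrated of order $(d, b)$ [3].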
The key point of this result is that if some linear combination of two variables is $I(0)$, the variables must track each other and will not deviate too far from each other in the long term. Therefore, cointegration is a necessary condition for two variables to keep a stable relationship in the long term.
2.2. Cointegration Test of Single Models in Combination Forecasting
Generally speaking, economic series are nonstationary (unstable), and so are reasonable forecasting series of them. Suppose $\hat{y}_t$ and $y_t$ are the forecasting series and the real series, respectively, and both are integrated of order 1, namely, $\hat{y}_t, y_t \sim I(1)$; only when $\hat{y}_t$ and $y_t$ are cointegrated can the forecasting error be $I(0)$; otherwise, the forecasting error is unstable; that is, it grows as time goes on. This conclusion yields the following two theorems.
Theorem 1. Given that both the predicted series $y_t$ and the forecasting series $\hat{y}_t$ belong to $I(1)$, only when $y_t$ and $\hat{y}_t$ are cointegrated can the forecasting error be $I(0)$; otherwise, the forecasting error is unstable.
Theorem 2. Only when every single forecast $\hat{y}_{it}$ in the combination forecast of an $I(1)$ series $y_t$ is cointegrated with $y_t$ can the combination forecasting error be $I(0)$; otherwise, the combination forecasting error is unstable.
The former parts of Theorems 1 and 2 are sufficient conditions, while the latter parts are necessary conditions. It follows that if a single forecast or a combination forecast of a nonstationary series does not meet the requirements of the theorems, the forecasting error will be unstable.
The proof of Theorem 1 follows directly from the definition of cointegration.
Theorem 2 can be proved in different ways for different combination forecasting models; only the proof for the linear regression combination method is given below.
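Assume the predicted series $y_t$ is of $I(1)$ type, and two single forecasts $\hat{y}_{1t}$ and $\hat{y}_{2t}$ of the series are combined to form $\hat{y}_t$:
$$\hat{y}_t = \beta_1 \hat{y}_{1t} + \beta_2 \hat{y}_{2t}, \tag{3}$$
and $\beta_1 + \beta_2 = 1$. Assume $\hat{y}_{1t}$ is cointegrated with $y_t$, while $\hat{y}_{2t}$ and $y_t$ are not cointegrated. From the definition of cointegration, $y_t - \hat{y}_{1t}$ belongs to $I(0)$, while, for any real number $\beta$, $y_t - \beta \hat{y}_{2t}$ belongs to $I(1)$. The error of the combination forecast is denoted by $e_t$; namely,
$$e_t = y_t - \hat{y}_t = \beta_1 (y_t - \hat{y}_{1t}) + \beta_2 (y_t - \hat{y}_{2t}). \tag{4}$$
It is obvious from the above equation that $e_t$ belongs to $I(1)$ whenever $\beta_2 \neq 0$, meaning that the combination forecasting model has unstable forecasting errors. This proves Theorem 2, and the argument extends easily to the combination of $k$ single forecasting methods.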
If a single model in the combination does not satisfy the cointegration condition of Theorem 2, building a combination forecasting model cannot guarantee forecasting accuracy, because the forecasting error of a model established in this way will be unstable.
2.3. Steps of Cointegration Test
Step one of the single model cointegration test in combination forecasting is to determine the integration order of the predicted series and the integration order of each single forecasting series, as required by Theorem 1. If the single forecasting series is cointegrated with the predicted series, then the two integration orders are equal. When the predicted sample series has the same integration order as the single forecasting series, the cointegration test can be applied to them; the specific steps are as follows.
2.3.1. Integration Order Testing
The most widely used method for determining the integration order of a sample series or forecasting series is the unit root test. The standard unit root test is the Dickey-Fuller (DF) test, but the method most often used in practice is the augmented DF (ADF) test [4]. The ADF test runs the following regression:
$$\Delta y_t = c + \delta t + \gamma y_{t-1} + \beta_1 \Delta y_{t-1} + \beta_2 \Delta y_{t-2} + \varepsilon_t. \tag{5}$$
The above formula includes the first-order lagged term $y_{t-1}$ and two lagged difference terms $\Delta y_{t-1}$ and $\Delta y_{t-2}$ of the time series $y_t$ in order to turn the residual term $\varepsilon_t$ into white noise; the number of lagged difference terms is determined according to the actual situation. Here $c$ is the constant term, $\delta t$ is the time trend term, and $\Delta y_t$ represents the first difference of the time series.
The unit root test mainly tests the coefficient $\gamma$ of $y_{t-1}$ in the regression; if the coefficient is significantly different from 0, the hypothesis that $y_t$ has a unit root is rejected and $y_t$ is a stationary series. The following hypothesis test can then be made:
$$H_0: \gamma = 0, \qquad H_1: \gamma < 0. \tag{6}$$
The output of the ADF test includes the $t$ statistic of the coefficient $\gamma$ and the critical values for testing whether it equals 0. If the statistic is a negative number below the critical value, then $H_1$ is accepted and $H_0$ is rejected, which indicates that the time series is stationary with no unit root. For a nonstationary time series, the stationarity of its first difference also needs to be tested. If the first difference is stationary, the series is of $I(1)$ type. If not, differencing continues until the series becomes stationary; when the series becomes stationary after $d$ differences, the integration order of the series is $d$.
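The integration order procedure above can be sketched in a few lines. This is a minimal illustration, not the EViews routine used in the paper: it assumes a simplified DF regression with a constant but no trend and no lagged difference terms, and it uses the asymptotic 5% critical value of about −2.86 for that regression form. The names `df_statistic` and `integration_order` are hypothetical.

```python
import random

# Asymptotic 5% Dickey-Fuller critical value for a regression with a
# constant and no trend (an assumption of this sketch; applied work
# should use MacKinnon's finite-sample tables).
DF_CRIT_5PCT = -2.86


def difference(series):
    """Return the first difference of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def df_statistic(series):
    """t statistic of gamma in: diff(y)_t = c + gamma * y_{t-1} + e_t."""
    x = series[:-1]          # lagged level y_{t-1}
    dy = difference(series)  # first difference of y
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    const = mean_dy - gamma * mean_x
    rss = sum((di - const - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = (rss / (n - 2) / sxx) ** 0.5
    return gamma / se_gamma


def integration_order(series, crit=DF_CRIT_5PCT, max_order=3):
    """Difference the series until the unit root hypothesis is rejected."""
    current = list(series)
    for d in range(max_order + 1):
        if df_statistic(current) < crit:
            return d
        current = difference(current)
    return None  # not stationary within max_order differences


# Simulated data: white noise (stationary) and its cumulative sum (random walk).
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(500)]
walk, level = [], 0.0
for e in noise:
    level += e
    walk.append(level)

print("white noise order:", integration_order(noise))
print("random walk order:", integration_order(walk))
```

Because the test has a 5% size, a true random walk is occasionally classified as stationary; this is one reason lag selection and finite-sample critical values matter in practice.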
2.3.2. Cointegration Test
To determine whether two variables $x_t$ and $y_t$ are cointegrated, the Engle-Granger (EG) two-step method [5] can be used.
Step 1. Estimate the following equation by ordinary least squares (OLS):
$$y_t = \alpha + \beta x_t + u_t. \tag{7}$$
This leads to the residual series
$$\hat{u}_t = y_t - \hat{\alpha} - \hat{\beta} x_t. \tag{8}$$
This is called the cointegration regression.
Step 2. Test the integration order of $\hat{u}_t$. If it is a stationary series, then the variables $x_t$, $y_t$ can be considered cointegrated of order $(1, 1)$; if $\hat{u}_t$ is integrated of order 1, then the variables $x_t$, $y_t$ can be considered cointegrated of order $(2, 1)$.
The integration order of $\hat{u}_t$ is tested with the DF or ADF test described above.
Of course, in the specific application process, the following manner can also be carried out.
Regress $y_t$ on $\hat{y}_t$ through the OLS method:
$$y_t = \alpha + \beta \hat{y}_t + u_t. \tag{9}$$
Study the stationarity of the regression residuals $\hat{u}_t$ obtained from this equation; if they are stationary, a cointegration relationship is present; the equation is then called the cointegration equation and expresses the cointegration relationship between the variables. The stationarity of the residuals can be tested by the ADF test; that is, by the following regression:
$$\Delta \hat{u}_t = \rho \hat{u}_{t-1} + \sum_{i=1}^{p} \theta_i \Delta \hat{u}_{t-i} + \varepsilon_t. \tag{10}$$
In formula (10), the new error term is $\varepsilon_t$; $p$ is the optimal lag order that turns the residual term into white noise. The Lagrange multiplier test or the Durbin-Watson (DW) test can be employed to determine whether the residual terms are white noise.
The $t$ statistic of $\rho$ is the ADF statistic; if it is below (more negative than) the critical value at the significance level of 0.01, 0.05, or 0.10, the null hypothesis of no cointegration is rejected.
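The two-step procedure can be sketched as follows for one forecasting series and the actual series. This is an illustrative simplification: the residual regression omits the lagged difference terms, the data are simulated, and the critical value of about −3.34 (asymptotic, 5%, two variables with a constant) is an assumption of the sketch rather than a value from the paper. The function names are hypothetical.

```python
import random


def ols(x, y):
    """OLS of y on a constant and x; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta


def residual_df_statistic(u):
    """t statistic of rho in: diff(u)_t = rho * u_{t-1} + e_t (no constant)."""
    lag = u[:-1]
    du = [b - a for a, b in zip(u, u[1:])]
    sxx = sum(v * v for v in lag)
    rho = sum(v * d for v, d in zip(lag, du)) / sxx
    rss = sum((d - rho * v) ** 2 for v, d in zip(lag, du))
    se = (rss / (len(lag) - 1) / sxx) ** 0.5
    return rho / se


def engle_granger(actual, forecast, crit=-3.34):
    """Return (statistic, cointegrated?) for the actual and forecast series."""
    # Step 1: cointegration regression of the actual series on the forecast.
    alpha, beta = ols(forecast, actual)
    resid = [a - alpha - beta * f for a, f in zip(actual, forecast)]
    # Step 2: unit root test on the residuals.
    stat = residual_df_statistic(resid)
    return stat, stat < crit


def random_walk(rng, n):
    """Simulate an I(1) series as a cumulative sum of Gaussian shocks."""
    out, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0, 1)
        out.append(level)
    return out


rng = random.Random(1)
actual = random_walk(rng, 400)
tracking = [a + rng.gauss(0, 0.5) for a in actual]  # forecast that tracks the series
unrelated = random_walk(rng, 400)                   # forecast with its own stochastic trend

stat_track, coint_track = engle_granger(actual, tracking)
stat_unrel, coint_unrel = engle_granger(actual, unrelated)
print("tracking forecast: ", round(stat_track, 2), coint_track)
print("unrelated forecast:", round(stat_unrel, 2), coint_unrel)
```

A forecasting series that follows the actual series leaves stationary residuals and a strongly negative statistic, whereas an independent random walk typically does not.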
2.4. Empirical Analysis
The M3-Competition sample series [6] is used below for an empirical analysis of single model cointegration validation. The paper selects the 999th series in M3 (denoted by N999) as the verification series; N999 is a quarterly macroeconomic series with 52 sample points in total, of which the first 44 are fitting samples and the last 8 are forecasting samples (see Tables 1 and 2).


In the paper, 12 single models are selected to fit and forecast the N999 series: moving average (Naïve), single exponential smoothing (Single), linear exponential smoothing (Holts), damped trend exponential smoothing (Dampen), seasonal exponential smoothing (Winter), time series decomposition model (Decomposition), ARIMA, ARARMA, BP neural network (BP), NARX neural network (NARX), grey forecasting (GM), and support vector machine (SVM). Real-world data contain two kinds of information, linear and nonlinear; therefore, both classical linear and nonlinear single models are considered in the paper.
Many evaluation indexes of forecasting exist, and Goodwin and Lawton [7] list many of the indicators currently used in forecasting evaluation; the paper selects the “symmetric mean absolute percentage error” (sMAPE) as the forecasting evaluation index. Let $y_t$ be the observed value at time $t$ and $\hat{y}_t$ the forecast value of $y_t$ over $n$ periods; then
$$\mathrm{sMAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|y_t - \hat{y}_t|}{(y_t + \hat{y}_t)/2} \times 100. \tag{11}$$
sMAPE was proposed to address a shortcoming of MAPE, namely, that it penalizes positive errors more heavily. For example, if $y_t = 100$, $\hat{y}_t = 50$, then $\mathrm{MAPE} = 50$; if $y_t = 50$, $\hat{y}_t = 100$, then $\mathrm{MAPE} = 100$. But in both cases, sMAPE is 66.7. Furthermore, sMAPE ranges over 0–200, while MAPE has no upper limit.
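The asymmetry of MAPE and the symmetry of sMAPE can be checked numerically. The sketch below uses the pair of values 100 and 50, chosen here because they reproduce the sMAPE of 66.7 quoted above; the function names are illustrative.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100 * sum(terms) / len(terms)


def smape(actual, forecast):
    """Symmetric MAPE, in percent: error relative to the mean of actual and forecast."""
    terms = [abs(a - f) / ((a + f) / 2) for a, f in zip(actual, forecast)]
    return 100 * sum(terms) / len(terms)


# Under-forecast: actual 100, forecast 50.
print(mape([100], [50]), round(smape([100], [50]), 1))   # 50.0 66.7
# Over-forecast: actual 50, forecast 100.
print(mape([50], [100]), round(smape([50], [100]), 1))   # 100.0 66.7
```

The same absolute error of 50 gives MAPE values of 50 and 100 depending on its direction, while sMAPE is 66.7 in both cases.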
Next, we conduct a unit root test on the N999 sample series with the EViews 6.0 software.
It can be observed from Table 3 that, after first differencing the N999 sample series, the ADF statistic is −5.025, below the 1% critical value. Therefore, the N999 sample series is integrated of order 1; namely, it belongs to $I(1)$.
Table 3: ADF unit root test of the first-differenced N999 sample series, panels (a) and (b). *MacKinnon (1996) one-sided $p$ values.

Next, the unit root test is conducted on the fitting data of the twelve single models; only when the fitting sample of a single model is also integrated of order 1 can it enter the cointegration test phase (see Table 4).

Through the unit root test, we can see that Naïve and Single are stationary series and that the integration order of the BP neural network (AutoANN) is greater than 1, so these three single models are removed first.
The cointegration test is then applied to the fitting series of the remaining nine single models and the sample series; according to (9) and (10), this amounts to a stationarity test on the residuals of the nine single models. Table 5 shows the ADF test of the fitting sample residuals of the nine single forecasting models.

It can be observed from Table 5 that the fitting sample residual series of two single models, DAMPEN and GM(1,1), do not pass the ADF test and are therefore unstable; these two single forecasting models do not pass the cointegration verification.
Finally, the remaining seven single models are combined again through the simple average, weighted optimal, and ANN combination forecasting models, and the result is shown in Table 6.

It can be observed from Table 6 that the combination forecasting accuracy after cointegration screening of the single models improves to a certain extent, though not markedly. More importantly, the forecasting accuracy of the simple average combination (4.90) is better than that of Naïve (4.91), if only slightly, which at least supports the validity of the cointegration verification of single forecasting models.
3. Encompassing Test of Single Model in Combination Forecasting
Most of the research literature on forecasting accuracy focuses on evaluating forecasting accuracy and stability and cannot directly compare the differences between competing forecasts of the same object. To achieve such discriminative comparison, the theory of forecast encompassing arose naturally, following Meese and Rogoff [8, 9]: if forecasting model A contains the relevant information of forecasting model B, we say that model A encompasses (includes) model B.
3.1. Significance of Encompassing Test of Single Model in Combination Forecasting
Kisinbay [10] constructs an approximately pivotal test statistic with a known distribution and applies the encompassing test to single model selection in combination forecasting, so as to filter out the included single models and reduce the number of single models involved in the combination. He argues that adding included single models to a combination model serves no purpose and even increases the complexity of the algorithm.
Tomiyuki and Ryoji [11] point out that finding the single models of high predictability and small correlation degree with other models is important to improve the combination forecasting effects. Encompassing test is the answer to the competing models in combination forecasting. Therefore, it seems extremely important to create encompassing methods and principles of combination forecasting on the basis of statistical approaches, so as to improve combination forecasting efficiency.
Encompassing test can identify the origins of accuracy differences between competing models; help predictors distinguish whether the difference is caused by sample variability or the significance of information set in construction models. But we cannot ensure that the forecasting model with the best forecasting accuracy performance must be able to explain all competing models. The research of Ericsson [12] shows that taking advantages in forecasting accuracy out of samples is not a sufficient condition for forecasting encompassing. In our empirical analysis, we further find that when the forecasting model A includes model B, the model A’s accuracy may be lower than B’s for containing more useless or inferior information.
Combination forecasting theory holds that the information in different forecasting models may vary because the models differ, which leads to different forecasting results. Through combination forecasting we can obtain better results than the forecasts of the two original single models. However, if the constructed combination contains no additional information, it cannot achieve higher efficiency, and the combination forecasting loses its significance.
Based on the above analysis, besides applying the cointegration test to single models before combination forecasting, we should also adopt the forecast encompassing test to identify the encompassing relationships between alternative single models and select single models through certain heuristic strategies.
3.2. Basic Principle of Encompassing Test in Single Model Selection for Combination Forecasting
We first consider pairwise encompassing between single forecasting models. Let $\hat{y}_{1t}$ and $\hat{y}_{2t}$ be the forecast values of two single forecasting models at time $t$; they satisfy the following regression equation:
$$y_t = \lambda \hat{y}_{1t} + (1 - \lambda) \hat{y}_{2t} + \varepsilon_t. \tag{12}$$
That model 1 encompasses model 2 can be written as $\lambda = 1$, and, conversely, $\lambda = 0$ means that model 2 includes model 1. If $\lambda$ takes other values, then neither model encompasses the other and each model contains independent relevant information. If the series are covariance stationary, standard methods can be used for the encompassing test. If no forecasting model includes the others, all models are misspecified (i.e., no single model reflects the true data generating process), and only combination forecasting can make use of all the useful information in the data.
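As an illustration of pairwise encompassing, the sketch below estimates a single combination weight by least squares. It assumes the common regression form $y_t = \lambda f_{1t} + (1 - \lambda) f_{2t} + \varepsilon_t$, which may differ in detail from the equation used in the paper; the symbol $\lambda$, the function name, and the data are all illustrative.

```python
def encompassing_lambda(actual, f1, f2):
    """Least-squares estimate of lambda in y = lambda*f1 + (1 - lambda)*f2 + e.

    Rearranging gives (y - f2) = lambda * (f1 - f2) + e, a regression
    through the origin.
    """
    num = sum((y - b) * (a - b) for y, a, b in zip(actual, f1, f2))
    den = sum((a - b) ** 2 for a, b in zip(f1, f2))
    return num / den


actual = [10, 12, 11, 13, 14]
perfect = list(actual)        # a forecast that reproduces the actual values
rough = [9, 13, 12, 12, 15]   # a forecast with errors

# lambda = 1: the first forecast encompasses the second.
print(encompassing_lambda(actual, perfect, rough))   # 1.0
# lambda = 0: the second forecast encompasses the first.
print(encompassing_lambda(actual, rough, perfect))   # 0.0

# Two forecasts whose errors offset each other: neither encompasses the other.
low_high = [8, 14, 9, 15, 12]
high_low = [12, 10, 13, 11, 16]
print(encompassing_lambda(actual, low_high, high_low))   # 0.5
```

In applied work the estimate is accompanied by a significance test, since a value near 0 or 1 may arise from sampling variability alone.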
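Ericsson [12] constructs an encompassing test model:
$$y_t = \beta_1 \hat{y}_{1,t|t-1} + \beta_2 \hat{y}_{2,t|t-1} + \varepsilon_t. \tag{13}$$
Here $\hat{y}_{1,t|t-1}$ is the forecast of $y_t$ made by model 1 using the information at time $t-1$, and $\hat{y}_{2,t|t-1}$ is the forecast of $y_t$ made by model 2 using the information at time $t-1$. Under the restriction $\beta_1 + \beta_2 = 1$, the test is whether $\beta_1 = 0$ or $\beta_2 = 0$.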
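Fair and Shiller [13] propose the following encompassing test model:
$$y_t - y_{t-1} = \alpha + \beta_1 (\hat{y}_{1,t|t-1} - y_{t-1}) + \beta_2 (\hat{y}_{2,t|t-1} - y_{t-1}) + u_t. \tag{14}$$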
If , , and , model is included by model ; otherwise, model is included by model , and and are unconstrained.
In testing with (14), if the two models contain the same information, their forecasts are highly correlated, and the two forecast coefficients are difficult to distinguish. Although (14) can still be estimated by regression in this case, the test results are not clear-cut.
Harvey, Leybourne, and Newbold propose the HLN encompassing test to address the above defects [14]. The HLN test verifies whether two competing models have similar performance. Considering a sample of the loss difference sequence, we have
$$d_t = g(e_{1t}) - g(e_{2t}), \tag{15}$$
where $g(\cdot)$ represents any loss function, such as sMAPE or MSE; $e_{it}$ is the $h$-step-ahead forecasting error of model $i$ for $y_t$; $i = 1, 2$; $t = 1, 2, \ldots, n$. Equal forecasting performance means that the expected value of $d_t$ is 0, namely, $E(d_t) = 0$; the test uses the sample mean $\bar{d} = n^{-1} \sum_{t=1}^{n} d_t$. Assuming that the loss difference sequence is covariance stationary, under the null hypothesis of equal forecasting accuracy, the HLN test statistic is asymptotically normally distributed, and the statistic is
$$\mathrm{HLN} = \frac{\bar{d}}{\sqrt{\hat{V}(\bar{d})}}. \tag{16}$$
Assuming that the $h$-step-ahead forecast errors are dependent up to order $h - 1$ and that $\hat{V}(\bar{d})$ is a consistent estimate of the asymptotic variance $V(\bar{d})$, we can calculate it through the following formula:
$$\hat{V}(\bar{d}) = \frac{1}{n} \left[ \hat{\gamma}_0 + 2 \sum_{k=1}^{h-1} \hat{\gamma}_k \right]. \tag{17}$$
Here $\gamma_k$ is the $k$th autocovariance of $d_t$, which can be estimated by $\hat{\gamma}_k = n^{-1} \sum_{t=k+1}^{n} (d_t - \bar{d})(d_{t-k} - \bar{d})$.
Through the Monte Carlo method, we simulate and evaluate the testing capacity of HLN and find that it has poor performance in dealing with small samples.
Therefore, the paper amends the HLN test so that its small-sample performance improves (the amended test is referred to as the MHLN test). The amendments are as follows: first, compare the test statistic with critical values of the $t$ distribution with $n - 1$ degrees of freedom, rather than the normal distribution; second, modify the test statistic as follows:
$$\mathrm{MHLN} = \left[ \frac{n + 1 - 2h + n^{-1} h (h - 1)}{n} \right]^{1/2} \mathrm{HLN}, \tag{18}$$
where $n$ is the number of forecasts and $h$ is the forecast horizon.
Testing the MHLN statistic is very simple: calling the relevant test function in the MATLAB 7.0 statistics toolbox gives the test results at various significance levels.
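For readers without MATLAB, the statistic can also be computed directly. The sketch below assumes the standard loss-differential form of the test with the small-sample correction of Harvey, Leybourne, and Newbold, using squared error as the loss function; the function names and the error values are hypothetical and are not data from the paper.

```python
import math


def hln_statistic(loss1, loss2, h=1):
    """Mean loss difference divided by its estimated standard error."""
    d = [a - b for a, b in zip(loss1, loss2)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        # k-th sample autocovariance of the loss difference sequence
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Variance of the mean, using autocovariances up to order h - 1
    v_hat = (autocov(0) + 2 * sum(autocov(k) for k in range(1, h))) / n
    return d_bar / math.sqrt(v_hat)


def mhln_statistic(loss1, loss2, h=1):
    """Small-sample corrected statistic, compared with t(n - 1) critical values."""
    n = len(loss1)
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return correction * hln_statistic(loss1, loss2, h)


# Hypothetical one-step-ahead forecast errors of two models (n = 8).
errors_1 = [1, -2, 1, 0, 2, -1, 1, -2]
errors_2 = [2, -3, 1, 1, 3, -2, 2, -2]
loss_1 = [e * e for e in errors_1]   # squared-error loss
loss_2 = [e * e for e in errors_2]

hln = hln_statistic(loss_1, loss_2)
mhln = mhln_statistic(loss_1, loss_2)
print(round(hln, 4), round(mhln, 4))
```

With $n = 8$ and $h = 1$ the correction factor is $\sqrt{7/8}$, so the corrected statistic is smaller in absolute value, and it is judged against the $t$ distribution with 7 degrees of freedom (two-sided 5% critical value of about 2.365). For $h > 1$ the variance estimate is not guaranteed to be positive, which a production implementation would need to handle.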
3.3. Steps of Encompassing Test of Single Models in Combination Forecasting
This section uses the N999 series in M3 as the sample series for the encompassing test; after the cointegration screening in Section 2.4, the original twelve single forecasting models are reduced to seven: Holts linear exponential smoothing, WINTER seasonal exponential smoothing, THETA time series decomposition model, BJ auto ARIMA, ARARMA, NARX neural networks, and SVM support vector machine. Here we use the MHLN encompassing test described above on the N999 sample series to conduct the encompassing test for the seven alternative single models.
Step 1. Calculate the in-sample fitting sMAPE of the 7 single models for the N999 series and rank all models by this value. Note that the in-sample fitting sMAPE of a single model differs from its out-of-sample forecasting sMAPE.
Step 2. Select the model with the lowest sMAPE value as the optimal model; for the optimal model, the NARX neural network, the MHLN encompassing test (significance level of 0.01) and the $t$ distribution table show that none of the remaining six single models is included by the optimal model.
Step 3. Select the second optimal model, ARARMA, and proceed as in Step 2; the result shows that ARIMA is included by ARARMA, so we remove the ARIMA model from the alternative single models.
Step 4. Select the third optimal model, Decomposition, and proceed as in Step 2 again, continuing until the end of the encompassing test; no further included single forecasting model is found.
Since the number of single models in combination forecasting should preferably be no more than 5, we repeat the above encompassing test procedure with the significance level raised to 0.05. The result shows that Decomposition includes SVM.
Through the encompassing test, the seven single models are finally reduced to five. These five single models are used for combination forecasting again, and further analysis is made on the basis of Table 6. For comparison, the six single models selected in the first round (significance level of 0.01) are also included (see Table 7).

We find that ARARMA includes ARIMA; but in fact, the out-of-sample forecasting accuracy of ARIMA is higher than that of ARARMA, which also confirms that “taking advantages in forecasting accuracy out of samples is not a sufficient condition for forecasting encompassing,” as mentioned at the beginning of this section.
Furthermore, we find that at the significance level of 0.01, the simple average combination accuracy of the six single models remaining after the encompassing test decreases, while the accuracy of the weighted optimal combination improves. The main reason is that the eliminated ARIMA has high forecasting accuracy itself, so its exclusion lowers the accuracy of the simple average combination; for the weighted optimal combination, however, the exclusion of ARIMA significantly increases the weights of ARARMA and the NARX neural network, so the combination forecasting accuracy improves.
At the significance level of 0.05, SVM is excluded, and the accuracy of the combinations of the remaining five single models improves significantly. In particular, the sMAPE value of the weighted optimal combination is lower than that of the Naïve single forecast, which demonstrates the effectiveness of combination forecasting.
Therefore, when the single models undergo encompassing tests, if no more than 5 single models remain after the test at the significance level of 0.01, the test ends directly; if more than 5 remain, the significance level should be adjusted to 0.05 or 0.1.
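The selection rule can be sketched as a short heuristic. The code below is an interpretation under stated assumptions, not the authors' implementation: `encompasses(a, b, level)` stands for any pairwise encompassing test (for example, a wrapper around the MHLN test), each model is tested only against lower-ranked models, and the procedure restarts from the full ranking at each significance level. The in-sample sMAPE values and the toy test outcomes are invented placeholders that merely mimic the pattern reported in this section.

```python
def select_models(in_sample_smape, encompasses, levels=(0.01, 0.05, 0.10), max_models=5):
    """Drop encompassed models, relaxing the level until at most max_models remain."""
    ranked = sorted(in_sample_smape, key=in_sample_smape.get)  # best model first
    for level in levels:
        kept = list(ranked)
        i = 0
        while i < len(kept):
            reference = kept[i]
            # Keep the reference and every lower-ranked model it does not encompass.
            kept = kept[: i + 1] + [
                m for m in kept[i + 1:] if not encompasses(reference, m, level)
            ]
            i += 1
        if len(kept) <= max_models:
            break
    return kept, level


# Placeholder in-sample sMAPE values (hypothetical, for ranking only).
smape_values = {
    "NARX": 1.0, "ARARMA": 1.5, "Decomposition": 2.0, "Holts": 2.5,
    "Winter": 3.0, "ARIMA": 3.5, "SVM": 4.0,
}

# Toy stand-in for a statistical test: the level at which encompassing is detected.
detected_at = {("ARARMA", "ARIMA"): 0.01, ("Decomposition", "SVM"): 0.05}


def toy_encompasses(a, b, level):
    return detected_at.get((a, b), 1.0) <= level


kept, level = select_models(smape_values, toy_encompasses)
print(kept, level)
```

With these placeholders, six models remain at the 0.01 level, so the procedure moves on to 0.05 and ends with five models.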
4. Conclusion
This paper discusses single forecasting model selection in combination forecasting, first through the cointegration test and then through the encompassing test. The result shows that forecasting accuracy improves to a certain extent after single model selection.
Cointegration verification aims to ensure that the fitting sample of each single model and the real sample have a consistent fluctuation trend, thus guaranteeing the combination forecasting accuracy to the largest extent, especially for medium- and long-term forecasting. The encompassing test eliminates the included models; the starting point of combining single models is that each model is built on a different information set and has a different model type. If one single model is included by another single model, it loses its meaning in the combination model and only increases the burden of the combination.
In the empirical analysis of the N999 sample series, the paper screens the 12 alternative single models down to a final 5, and the forecasting accuracy is calculated to demonstrate the effectiveness of single model selection.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
[1] J. M. Bates and C. W. J. Granger, “The combination of forecasts,” Operational Research Quarterly, vol. 20, no. 4, pp. 451–468, 1969.
[2] S. Armstrong, Principles of Forecasting: A Handbook for Researchers and Practitioners, KAP, Dordrecht, The Netherlands, 2001.
[3] C. R. Nelson and C. R. Plosser, “Trends and random walks in macroeconomic time series: some evidence and implications,” Journal of Monetary Economics, vol. 10, no. 4, pp. 139–162, 1982.
[4] J. D. Hamilton, Time Series Analysis, Social Science Press of China, Beijing, China, 1999.
[5] R. F. Engle and C. W. J. Granger, “Cointegration and error correction: representation, estimation and testing,” Econometrica, vol. 55, no. 2, pp. 251–276, 1987.
[6] S. Makridakis and M. Hibon, “The M3-competition: results, conclusions and implications,” International Journal of Forecasting, vol. 16, no. 4, pp. 451–476, 2000.
[7] P. Goodwin and R. Lawton, “On the asymmetry of the symmetric MAPE,” International Journal of Forecasting, vol. 15, no. 4, pp. 405–408, 1999.
[8] R. A. Meese and K. Rogoff, “Empirical exchange rate models of the seventies: do they fit out of sample?” Journal of International Economics, vol. 14, no. 1-2, pp. 3–24, 1983.
[9] R. Meese and K. Rogoff, “Was it real? The exchange rate-interest differential relation over the modern floating-rate period,” Journal of Finance, vol. 43, pp. 933–948, 1988.
[10] T. Kisinbay, “The use of encompassing tests for forecast combinations,” IMF Working Paper WP/264, International Monetary Fund, 2007.
[11] K. Tomiyuki and K. Ryoji, “The effectiveness of forecasting methods using multiple information variables,” IMES Discussion Paper Series 2002E20, Institute for Monetary and Economic Studies, 2002.
[12] N. R. Ericsson, “Parameter constancy, mean square forecast errors, and measuring forecast performance: an exposition, extensions, and illustration,” Journal of Policy Modeling, vol. 14, no. 4, pp. 465–495, 1992.
[13] R. Fair and R. Shiller, “Comparing information in forecasts from econometric models,” The American Economic Review, vol. 80, pp. 375–389, 1990.
[14] D. I. Harvey, S. J. Leybourne, and P. Newbold, “Tests for forecast encompassing,” Journal of Business and Economic Statistics, vol. 16, no. 2, pp. 254–259, 1998.
Copyright
Copyright © 2014 Chuanjin Jiang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.