#### Abstract

Based on individual forecasting methods, an adaptive control combination forecasting (ACCF) method with adaptive weighting coefficients was proposed for short-term prediction of time series data. The US population dataset, the American electric power dataset, and the vibration signal dataset from a hydraulic test rig were each tested with the ACCF method, and the accuracy of the ACCF method was then analyzed. The results showed that, in contrast to individual methods or other combination methods, the proposed ACCF method adaptively adopted one or several of the prediction methods and gave satisfactory forecasting results owing to its flexible adaptability and high accuracy. It was also concluded that the higher the noise ratio of the tested datasets, the lower the prediction accuracy of the ACCF method; nevertheless, under noisy data the ACCF method demonstrated a better prediction trend, with good volatility and following quality, than the other methods.

#### 1. Introduction

A time series is a set of observations usually collected at regular intervals. Time series data occur naturally in many application areas, such as economics, medicine, weather data, ocean engineering, finance, and engineering control. Time series data are often obtained by sensors, and they can form large, diverse datasets that cannot be easily processed by using standard computers. Time series forecasting is an analysis that uses past performance to forecast future values, and it remains a challenging research topic [1–3].

The three common classes of methods for time series forecasting are physical, statistical, and artificial intelligence methods. In physical methods, effective forecasting results rely on physical information [4, 5], but these methods are not proficient in dealing with short-term series because of their complex calculation process. Statistical models, including the Period-Sequential Index (PSI) [6], moving average (MA) [7], autoregressive integrated moving average (ARIMA) [8], exponential smoothing [9], Kalman filter [10], and grey forecasting [11], effectively tackle linear features but give larger errors for series with fluctuation or seasonality [12], noise, or instability [13]. Artificial intelligence models, including the BP neural network (BP-NET) [14, 15], support vector machines (SVM) [16], fuzzy logic models [17], and the least squares support vector machine (LSSVM) [18], have exhibited significant advantages in dealing with nonlinear problems. These artificial intelligence models offer higher forecasting accuracy than physical or statistical models, but their predictions rely heavily on training datasets, and they easily get stuck in local optima or suffer from overfitting [19, 20].

Because of the inherent disadvantages of each model, the effective information of multiple models is now used to predict time series, and the weighting problem of the combination model has become a research focus. Weights can be allocated to the various forecasts produced by individual models so as to achieve a combined forecast [21, 22]. For example, Clark et al. [23] derived mean square error-minimizing weights for combining restricted and unrestricted forecasts and assigned more weight to the restricted model and less weight to the unrestricted model. However, Clark’s method supported the conventional wisdom that simple averages are hard to beat, and it applied only to averaging nested models. Hao et al. [24] introduced the entropy weight method into the combination prediction model. Xie et al. [25] combined a linear regression prediction model and a grey model to obtain a variable-weight combination model. These two methods gave better forecasting accuracy only for dam settlement or landslip; moreover, weights were assigned to all participating single models, and the forecast accuracy for other time series cases was not guaranteed or was unknown. Gao et al. [26] established five kinds of combination forecasting models, based on suboptimal weight, optimal weight, grey comprehensive correlation degree weight, entropy weight, and neural network. In Gao’s study, the weighted constraint criterion was given particular attention; nevertheless, a negative weight for a single prediction model might occur in the combination model. Song and Fu [27] also found that some models failed to combine the advantages of the single models, such as the combination of the autoregressive integrated moving average model and a neural network model, or the combination of a neural network and other forecasting models.

In our view, the main problem of the above combination methods is that the statistical distribution of the forecasting errors over historical time receives little attention or is ignored, leading to unreasonable weight distribution and even negative weights. Therefore, existing combined forecast models still lack predictive reliability, particularly under noisy conditions. In this study, the weighting coefficient for each model is adaptively determined from its own statistical forecasting performance on historical data. The rest of the paper is organized as follows. Section 2 contains the methodology of the combination forecasting method. Section 3 contains the steps of computation. Section 4 contains the results and discussion of short-term prediction cases. Section 5 contains the conclusions.

#### 2. Methodology

##### 2.1. The Forecasting Methods

###### 2.1.1. Individual Methods

(1) Period-Sequential Index Method

The Period-Sequential Index (PSI) method [6] had been studied by the author in previous work. The PSI model introduced the period index (PI) and sequential index (SI) to describe the dataset structure information in vertical and horizontal dimensions, respectively. Figure 1 shows the schematic diagram of the PSI algorithm. *H*_{−2} and *H*_{−1} denote reference historical periods, i.e., the year before last year and last year. *H*_{0} represents the forecasting period. The period for *H*_{−2}, *H*_{−1}, and *H*_{0} is uniform, defined as *T*. When time *t*_{i} is upcoming, the forecasting value at time *t*_{i} based on the PSI method is mainly dependent on the PI and the SI. The PSI method is described by equation (1), where *t*_{i} (independent variable) is a forecasting time, *Y*_{1}(*t*_{i}) (dependent variable) is a forecasting value at time *t*_{i}, *K*_{0} is the reference coefficient for period index, *y*(*t*_{i−1}) is the observed value at historical time *t*_{i−1}, *α* is the optimized weighting factor of the PSI method, *PI*(*t*_{i}) is the forecasting period index (dependent variable) at time *t*_{i}, and *SI*(*t*_{i−1}) is the forecasting sequential index (dependent variable) at time *t*_{i−1}. In equation (2), *y*(*t*_{i} − 2*T*) and *y*(*t*_{i} − *T*) describe the reference historical data at times *t*_{i} − 2*T* and *t*_{i} − *T*, respectively, and *K*_{−2} and *K*_{−1} are reference functions of period index. A standard period average is originally set to be a reference function of period index, where it is defined as a constant. The more detailed derivation can be found in Ref. [6].

(2) Exponential Smoothing Method

The Exponential Smoothing (ES) method [9] is often used in practice to forecast time series. Suppose that the observed values for the time series are *y*(*t*_{1}), *y*(*t*_{2}), …, *y*(*t*_{i−1}) at times *t*_{1}, *t*_{2}, …, *t*_{i−1}, respectively. For the ES method, the forecast value at time *t*_{i} is dependent on the observed value at time *t*_{i−1} and the forecasting value at time *t*_{i−1}. The ES method is defined as

$$Y_{2}(t_{i}) = \beta\, y(t_{i-1}) + (1-\beta)\, Y_{2}(t_{i-1}) \tag{3}$$

where *Y*_{2}(*t*_{i}) and *Y*_{2}(*t*_{i−1}) represent the forecasting values at times *t*_{i} and *t*_{i−1} by using the ES method, *y*(*t*_{i−1}) is the observed value at historical time *t*_{i−1}, and *β* is the smoothing parameter, which can be adjusted between 0 and 1. Higher *β* will produce a forecast which is more responsive to recent changes in the data, whilst also being less robust to any errors that could occur.

(3) Moving Average Method

The moving average (MA) method [7] is simple and widely used, and it performs well in forecasting competitions against more sophisticated approaches. A simple moving average is a common average of the previous *n* data points in time series data, and each point in the time series data is equally weighted. The MA method can be described as follows:

$$Y_{3}(t_{i}) = \frac{1}{n}\sum_{k=1}^{n} y(t_{i-k}) \tag{4}$$

where *Y*_{3}(*t*_{i}) represents the forecasting value at time *t*_{i} by using the MA method, *n* is the number of data points used in the calculation, and *y*(*t*_{i−n}) is the observed data point value at time *t*_{i−n}.

(4) Autoregressive Integrated Moving Average Method

Autoregressive integrated moving average (ARIMA) [8] is one of the most popular statistical linear models for forecasting time series data. It is a combination of autoregression AR(*p*) (an additive linear function of *p* past observations), moving average MA(*q*) (*q* random errors), and *d*, which is an integer making a series stationary. The general form of the forecast equation for the ARIMA (*p*, *q*, *d*) model can be written as follows:

$$Y_{4}(t_{i}) = c + \sum_{j=1}^{p} \varphi_{j}\, y(t_{i-j}) + \sum_{k=1}^{q} \theta_{k}\, \varepsilon(t_{i-k}) + \varepsilon(t_{i}) \tag{5}$$

where *Y*_{4}(*t*_{i}) is a forecasting value at time *t*_{i} by using the ARIMA method, *c* is the constant representing the intercept, *φ*_{j} and *y*(*t*_{i−j}) are the parameters and regressors for the AR part of the model, respectively, *θ*_{k} and *ε*(*t*_{i−k}) are the parameters and regressors of the MA part of the model, respectively, and *ε*(*t*_{i}) is the white noise at time *t*_{i}.

(5) BP Neural Network Method

The BP neural network (BP-NET) method [14, 15] can realize self-learning and memory functions of a machine. Figure 2 gives the simple structure of BP-NET. It can be seen that BP-NET is composed of an input layer, an output layer, and a hidden layer. Using this three-layer structure, BP can simulate any complex nonlinear relationship through nonlinear elements. The basic calculation principle is divided into three steps: forward calculation (calculate the output of each node in turn based on the input); error backpropagation (calculate the gradient of each node according to the loss function); and weight update. Because this method has excellent data processing and relationship building capabilities, it has been widely used in forecasting.

(6) Grey Forecasting Method

The grey forecasting method (GM) [11] indicates the one-variable, first-order grey forecasting model. This grey differential equation is formed by an original time series *y*(*t*_{i}) using the accumulated generating operation (AGO) technique. It is denoted by equation (6), where *Y*_{6}(*t*_{i}) is a forecasting value at time *t*_{i} by using the GM method, *a* is a developing coefficient, and *b* is a control variable. *a* and *b* are given by equations (7) and (8).
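The exponential smoothing and moving average recursions described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names `es_forecast` and `ma_forecast` are hypothetical, and seeding the first ES forecast with the first observation is an assumption, since the text does not state how the recursion is initialised.

```python
def es_forecast(y, beta, initial=None):
    """Return one-step-ahead exponential smoothing forecasts.

    y holds the observations y(t_1), ..., y(t_{i-1}).
    The returned list has len(y) + 1 entries; the last entry is the
    forecast for the next, not yet observed, time step.
    """
    if not y:
        raise ValueError("y must contain at least one observation")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie between 0 and 1")
    # Assumption: the first forecast is seeded with the first observation.
    forecasts = [y[0] if initial is None else initial]
    for observed in y:
        previous = forecasts[-1]
        # New forecast = beta * last observation + (1 - beta) * last forecast.
        forecasts.append(beta * observed + (1.0 - beta) * previous)
    return forecasts


def ma_forecast(y, n):
    """Forecast the next value as the equally weighted mean of the last n points."""
    if n <= 0 or n > len(y):
        raise ValueError("n must be between 1 and len(y)")
    return sum(y[-n:]) / n


if __name__ == "__main__":
    series = [10.0, 12.0, 14.0]
    print(es_forecast(series, beta=0.5)[-1])  # forecast for the next step
    print(ma_forecast(series, n=2))
```

A larger `beta` makes `es_forecast` follow the most recent observation more closely, which matches the remark that higher *β* is more responsive but less robust to errors.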

###### 2.1.2. Combination Forecasting Methods

Because it is too risky to rely on the forecasts produced by an individual method, combination forecasting methods were also adopted in the study. Suppose that the historical data *y*(*t*_{1}), *y*(*t*_{2}), …, *y*(*t*_{m}) occur at the corresponding times *t*_{1}, *t*_{2}, …, *t*_{m}. For the same forecasting problem at time *t*_{i}, *n* kinds of single forecasting models can give forecasts: *Y*_{1}(*t*_{i}), *Y*_{2}(*t*_{i}), …, *Y*_{n}(*t*_{i}). The forecasting value of the *j*-th model (*j* = 1, 2, …, *n*) and the corresponding weight coefficient at time *t*_{i} are *Y*_{j}(*t*_{i}) and *w*_{j}, respectively. The linear combination *Y*(*t*_{i}) is generally calculated according to [28]

$$Y(t_{i}) = \sum_{j=1}^{n} w_{j}\, Y_{j}(t_{i}), \qquad \sum_{j=1}^{n} w_{j} = 1 \tag{9}$$

Considering the actual situation and the calculation complexity, three common weight methods [27], including inverse variance (IV) method, mean square error inverse (MSEI) method, and simple weighted average (SWA) method, were used to compute weight coefficients in the study. The methods are based on the sum of squared errors.

The weight coefficients of the IV method were computed in equation (10). If a single method has a larger sum of squared errors, it is assigned a smaller weight; conversely, a single method with a smaller sum of squared errors is assigned a larger weight. In equation (10), *n* indicates the number of single models, and *e*_{j} is the sum of squared errors of the *j*-th single model.

Similarly, the weight coefficients of the MSEI method were computed as follows:

For the SWA method, the sums of squared errors of the models were ranked in descending order. Then, a new array *j*_{r} (= 1, 2, …, *n*) could be defined to represent the ranking order of each single model. This meant that an individual model with a higher value of *j*_{r} would have a lower forecasting error. The weight coefficients of the SWA method were further given in equation (13). It is seen that the weight coefficient is larger for a higher *j*_{r}, in order to minimize the sum of squared errors.
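The three weighting schemes can be illustrated with a short Python sketch. The formulas used here are the common textbook forms and are assumptions rather than a transcription of equations (10) to (13): IV weights proportional to the inverse of the sum of squared errors, MSEI weights proportional to the inverse of its square root, and SWA weights proportional to the rank *j*_{r}. All function names are hypothetical.

```python
import math


def iv_weights(sse):
    """Inverse variance weights: proportional to 1 / e_j (assumed form)."""
    inverse = [1.0 / e for e in sse]
    total = sum(inverse)
    return [v / total for v in inverse]


def msei_weights(sse):
    """Mean square error inverse weights: proportional to 1 / sqrt(e_j) (assumed form)."""
    inverse = [1.0 / math.sqrt(e) for e in sse]
    total = sum(inverse)
    return [v / total for v in inverse]


def swa_weights(sse):
    """Simple weighted average: weight proportional to rank j_r (assumed form).

    Models are ranked by descending error, so the model with the
    smallest error receives the highest rank n and the largest weight.
    """
    n = len(sse)
    order = sorted(range(n), key=lambda j: sse[j], reverse=True)
    ranks = [0] * n
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    denominator = n * (n + 1) / 2  # 1 + 2 + ... + n
    return [r / denominator for r in ranks]


def combine(forecasts, weights):
    """Linear combination of the single-model forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))


if __name__ == "__main__":
    sse = [1.0, 4.0]          # sums of squared errors of two models
    single = [10.0, 20.0]     # their forecasts at the same time step
    for scheme in (iv_weights, msei_weights, swa_weights):
        w = scheme(sse)
        print(scheme.__name__, w, combine(single, w))
```

In all three schemes the weights are positive and sum to one, so none of them can drop a poorly performing model entirely.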

##### 2.2. The Adaptive Control Combination Forecasting Method

In the following work, the total tested data set occurs at times *t*_{1}, *t*_{2}, …, *t*_{N−1}, *t*_{N} and is defined as

*y*(*t*_{1}), *y*(*t*_{2}), …, *y*(*t*_{i−1}), *y*(*t*_{i}), *y*(*t*_{i+1}), …, *y*(*t*_{N−1}), *y*(*t*_{N}).

This total tested data set is a time series based on an equal interval of time (*dt*) and a fixed time period (*T*). For example, when *dt* is a month, day, hour, minute, or second, the corresponding *T* of the tested data is a year, month, day, hour, or minute, respectively. Now, the time *t*_{i} (3*T* < *t*_{i} ≤ *t*_{N}) is assumed to be upcoming, so the tested data during the three time periods (3*T*) before *t*_{i}, together with *y*(*t*_{i}), are extracted from the total tested data set. The extracted tested data set contains *s* + 1 (*s* = 3*T*/*dt*) data points and is given in detail as follows: *y*(*t*_{i−s}), *y*(*t*_{i−s+1}), *y*(*t*_{i−s+2}), …, *y*(*t*_{i−2}), *y*(*t*_{i−1}), and *y*(*t*_{i}). Using this extracted tested data set, each individual method can give a forecast at time *t*_{i} as described in Section 2.1.1, which yields the modeling data set: *Y*_{1}(*t*_{i}), *Y*_{2}(*t*_{i}), …, *Y*_{n−1}(*t*_{i}), *Y*_{n}(*t*_{i}).
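As a small worked illustration of the window arithmetic, the sketch below computes *s* = 3*T*/*dt* and slices the *s* + 1 extracted values out of a series. The helper names `window_size` and `extract_window` are hypothetical, and the series is assumed to be stored as a zero-based Python list whose first element is *y*(*t*_{1}).

```python
def window_size(period, dt):
    """Return s = 3T / dt, with T and dt expressed in the same time unit."""
    s, remainder = divmod(3 * period, dt)
    if remainder:
        raise ValueError("3T must be an integer multiple of dt")
    return s


def extract_window(y, i, s):
    """Return y(t_{i-s}), ..., y(t_{i-1}), y(t_i) for the 1-based index i.

    y is a zero-based list, so y(t_k) is stored in y[k - 1].
    """
    if i - s < 1 or i > len(y):
        raise ValueError("window does not fit inside the series")
    return y[i - s - 1:i]


if __name__ == "__main__":
    # Monthly data with a yearly period: s = 3 * 12 / 1 = 36.
    print(window_size(period=12, dt=1))
    # Toy series with y(t_k) = k; the window for i = 8, s = 3 is 5..8.
    toy = list(range(1, 11))
    print(extract_window(toy, i=8, s=3))
```

With monthly data and a yearly period this gives *s* = 36; hourly data with a daily period gives *s* = 72; and one-second data with a one-minute period gives *s* = 180. These values are consistent with the first forecast positions reported in Section 4 (after 108 months, 216 hours, and 540 seconds, respectively).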

Similarly, as the forecasting time moves back through *t*_{i}, *t*_{i−1}, *t*_{i−2}, …, *t*_{i−s+2}, *t*_{i−s+1}, and *t*_{i−s} in sequence, all the extracted tested datasets can be generated, and they are listed in Table 1.

Using each individual method, the corresponding forecasts at times *t*_{i}, *t*_{i−1}, *t*_{i−2}, …, *t*_{i−s+1}, and *t*_{i−s} are defined as the first modeling datasets, which are listed in Table 2.

In fact, the first modeling datasets are *n* forecasting datasets, one for each individual forecasting model. Then, the mean absolute percentage errors (MAPE) [29, 30] are computed to obtain the datasets MAPE_{1}, MAPE_{2}, …, and MAPE_{n}, as given in equation (14). The mean values and the standard deviations of MAPE_{1}, MAPE_{2}, …, and MAPE_{n} are also given in equation (14). They are defined as the second modeling datasets, which are listed in Table 3.
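The computation of the second modeling datasets can be sketched as follows. This is an illustration under stated assumptions: each entry of MAPE_{j} is taken to be the absolute percentage error of model *j* at one historical time step, and the standard deviation is taken as the population standard deviation; the text does not specify either detail. The names `ape_series` and `mape_stats` are hypothetical.

```python
import statistics


def ape_series(actual, forecast):
    """Absolute percentage errors (in percent) for paired observations."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return [100.0 * abs(a - f) / abs(a) for a, f in zip(actual, forecast)]


def mape_stats(actual, forecasts_by_model):
    """Return {model name: (mean, standard deviation)} of the error arrays."""
    stats = {}
    for name, forecast in forecasts_by_model.items():
        errors = ape_series(actual, forecast)
        # Assumption: population standard deviation over the window.
        stats[name] = (statistics.mean(errors), statistics.pstdev(errors))
    return stats


if __name__ == "__main__":
    observed = [100.0, 200.0, 400.0]
    first_modeling = {
        "ES": [110.0, 180.0, 440.0],   # every point is off by 10 %
        "MA": [100.0, 220.0, 320.0],   # errors of 0 %, 10 %, 20 %
    }
    for model, (mean, std) in mape_stats(observed, first_modeling).items():
        print(model, mean, std)
```

The pair (mean, standard deviation) per model is what the later weighting step consumes.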

Based on the preparation work of equation (14), the present task is to produce a forecast *F*(*t*_{i}) for the upcoming time *t*_{i} (3*s* < *i* ≤ *N*). Consequently, an adaptive control combination forecasting (ACCF) method is given in equation (15):

$$F(t_{i}) = \sum_{j=1}^{n} w_{j}\, Y_{j}(t_{i}) \tag{15}$$

where *w*_{1}, *w*_{2}, …, and *w*_{n} are the weighting coefficients, which are dependent variables that change with the upcoming time *t*_{i}. Based on the performance-based approach [31], an adaptive weight for each model is determined from its own forecasting performance and is defined in equation (16), where *k*_{j} is either 0 or 1 and is computed by equation (17).

Solving equations (16) and (17) yields the third modeling datasets, shown in Table 4.

Clearly, using this ACCF approach, models producing smaller values of $\overline{\mathrm{MAPE}}_{j}$ (*j* = 1, 2, …, *n*), the mean value of MAPE_{j}, will be assigned larger weights than models with higher ones. The smallest mean value and the corresponding standard deviation are defined as $\overline{\mathrm{MAPE}}_{m}$ and *σ*_{m}, respectively. Thus, the smallest MAPE range is defined between $\overline{\mathrm{MAPE}}_{m} - \sigma_{m}$ and $\overline{\mathrm{MAPE}}_{m} + \sigma_{m}$. When the statistical range of the *j*-th model, from $\overline{\mathrm{MAPE}}_{j} - \sigma_{j}$ to $\overline{\mathrm{MAPE}}_{j} + \sigma_{j}$, overlaps partially or fully with the smallest MAPE range, *k*_{j} is 1; otherwise, *k*_{j} is 0.

In the ACCF model, the modeling datasets, including the MAPE statistics, the weighting coefficients *w*_{j}, and *k*_{j}, must be corrected at every upcoming time *t*_{i} according to the solution of equations (14), (16), and (17). Finally, these key parameters are taken into equation (15) to make a prediction at time *t*_{i}. Because *w*_{j} in equation (16) must be updated and revised at every upcoming time *t*_{i} to ensure prediction accuracy with high robustness, the newly developed model is well suited to short-term prediction. In long-term prediction, however, this new model cannot update the weights in time, which may result in a large percentage forecast error.
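The selection rule and the adaptive weighting described in this section can be sketched in Python. Two points are assumptions and should not be read as the exact content of equations (16) and (17): the statistical range of each model is taken as its mean MAPE plus or minus one standard deviation, and each retained model receives a weight proportional to the inverse of its mean MAPE, normalised so that the weights sum to one. The latter is consistent with the statement that smaller mean MAPE values receive larger weights, but other forms would also satisfy it. The function names are hypothetical.

```python
def accf_weights(mean_mape, std_mape, eps=1e-12):
    """Return (k, w): the 0/1 selectors and the normalised adaptive weights."""
    n = len(mean_mape)
    best = min(range(n), key=lambda j: mean_mape[j])
    low_best = mean_mape[best] - std_mape[best]
    high_best = mean_mape[best] + std_mape[best]
    k = []
    for j in range(n):
        low_j = mean_mape[j] - std_mape[j]
        high_j = mean_mape[j] + std_mape[j]
        # k_j = 1 when the two intervals overlap partially or fully.
        k.append(1 if (low_j <= high_best and high_j >= low_best) else 0)
    # Assumed weight form: inverse mean MAPE, restricted to selected models.
    raw = [k[j] / max(mean_mape[j], eps) for j in range(n)]
    total = sum(raw)
    return k, [r / total for r in raw]


def accf_forecast(single_forecasts, weights):
    """Weighted sum of the single-model forecasts at the upcoming time."""
    return sum(w * f for w, f in zip(weights, single_forecasts))


if __name__ == "__main__":
    means = [2.0, 1.0, 5.0]   # mean MAPE of three models, in percent
    stds = [0.5, 0.6, 0.1]    # matching standard deviations
    k, w = accf_weights(means, stds)
    print(k, w)
    print(accf_forecast([100.0, 103.0, 120.0], w))
```

In this toy case the third model is excluded because its error interval lies entirely above that of the best model, which is the behaviour that distinguishes this scheme from weighting methods that assign a nonzero weight to every model.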

##### 2.3. Evaluation Index

The MAPE during each time period (*T*) is further measured to evaluate the obtained results. Because each time period contains *s*/3 data points, the error MAPE_{k} (*k* = 1, 2, …, (*N*−3*s*)/(*s*/3)) during the *k*-th time period can be calculated by equation (18). For the total tested data set, the number of predicted time periods is equal to (*N*−3*s*)/(*s*/3), and the total error indicator MAPE_{all} is also given in equation (18). To verify the superiority of the ACCF approach, the number of MAPE_{k} values lower than MAPE_{all} (defined as Count) and the percentage of Count in the total number of periods (defined as Per) are also calculated by equation (18) for the array MAPE_{k}. Here, *y*(*t*_{i}) is the measured value at time *t*_{i}, and *F*(*t*_{i}) is the predicted value at time *t*_{i}.

After the prediction error MAPE_{all} is obtained, the forecasting accuracy (*FA*_{all}) can be calculated by using equation (19). The closer *FA*_{all} is to 100%, the better the forecasting accuracy of the model.
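The evaluation indices can be sketched as follows. This is an illustration under assumptions: MAPE_{all} is taken as the mean absolute percentage error over all predicted points, and *FA*_{all} is taken as 100% minus MAPE_{all}; neither form is confirmed as the exact content of equations (18) and (19). The function name `evaluate` is hypothetical.

```python
def evaluate(actual, predicted, points_per_period):
    """Return MAPE_k per period, MAPE_all, Count, Per, and FA_all (all in percent)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if points_per_period <= 0 or len(actual) % points_per_period:
        raise ValueError("series length must be a multiple of points_per_period")
    ape = [100.0 * abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    mape_k = [
        sum(ape[start:start + points_per_period]) / points_per_period
        for start in range(0, len(ape), points_per_period)
    ]
    mape_all = sum(ape) / len(ape)
    # Count: number of periods whose error is below the overall error.
    count = sum(1 for value in mape_k if value < mape_all)
    per = 100.0 * count / len(mape_k)
    fa_all = 100.0 - mape_all  # assumed definition of forecasting accuracy
    return {"MAPE_k": mape_k, "MAPE_all": mape_all,
            "Count": count, "Per": per, "FA_all": fa_all}


if __name__ == "__main__":
    measured = [100.0, 100.0, 100.0, 100.0]
    forecast = [101.0, 101.0, 104.0, 104.0]
    print(evaluate(measured, forecast, points_per_period=2))
```

Here `points_per_period` plays the role of *s*/3, the number of data points in one time period *T*.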

#### 3. Steps of Computation

In the study, six individual forecasting methods, including PSI, ES, MA, ARIMA, GM, and BP-NET, are used to construct the ACCF method described in Section 2.2. The flow chart of the proposed ACCF model for short-term prediction is summarized in Figure 3. It gives a rolling forecast process, and the detailed steps are as follows:

Step 1: The total tested data set (*y*(*t*_{1}), *y*(*t*_{2}), …, *y*(*t*_{i−1})) is initialized at time *t*_{i−1}.

Step 2: All the tested datasets in Table 1 are extracted from the total tested data set.

Step 3: By solving equations (1) and (3)–(6), ..., the first modeling datasets in Table 2 are predicted by using each individual model, namely, the PSI, ES, ARIMA, MA, GM, and BP-NET methods.

Step 4: Equation (14) is solved, and the second modeling datasets in Table 3 are obtained.

Step 5: Equations (16) and (17) are solved, and the third modeling datasets in Table 4 are further obtained.

Step 6: Using *Y*_{1}(*t*_{i}), *Y*_{2}(*t*_{i}), …, *Y*_{6}(*t*_{i}) (in Table 2) and *w*_{1}, *w*_{2}, …, *w*_{6} (in Table 4), equation (15) is calculated to give a new combination forecast *F*(*t*_{i}) at time step *t*_{i}.

Step 7: If the stop condition (*t*_{i} ≥ *t*_{N}) is satisfied, the search stops, and the parameters MAPE_{all}, Count, Per, and *FA*_{all} are output by solving equations (18) and (19). Otherwise, the time step is advanced, the newly generated datum *y*(*t*_{i}) is added to the total tested data set, and the procedure returns to Step 2.

#### 4. Results and Discussion

Table 5 shows three groups of tested time series datasets. As shown in Table 5, the first dataset (named USP), with 492 samples in total, is the monthly US population between January 1979 and December 2019, which comes from the US Census Bureau and is hosted by the Federal Reserve Economic Database (FRED) [32]. FRED provides a data platform on which the US population data are updated every month. The second dataset (named AEP), with 3432 samples in total, is the American hourly electric power consumption between 0:00 on March 13 and 23:00 on August 2, 2018, which comes from PJM’s website and is given in megawatts (MW) [33]. PJM is a regional transmission organization (RTO) in the United States and is part of the Eastern Interconnection grid, operating an electric transmission system. The third dataset (named VS) was obtained experimentally with a hydraulic test rig. This test rig consists of a primary working circuit, a test system, and a secondary cooling-filtration circuit, which are connected via the oil tank [34, 35]. The system cyclically repeats constant load cycles (duration 60 seconds). The test system is equipped with several sensors measuring process values such as vibration, with standard industrial 20 mA current loop interfaces connected to a data acquisition system. In the study, the vibration signals of the hydraulic test rig, measured at a sampling frequency of 1 Hz over 8580 seconds [34], were used as the third dataset.

By using the developed ACCF method, a comparison analysis between the real value and the forecasting value was implemented and showed a direct observation of the prediction, so as to evaluate the confidence of the ACCF method.

##### 4.1. Periodic Recognition and Prediction on USP Dataset

The USP dataset was used as a tested dataset to show periodic detection and prediction results. In order to calculate the second modeling datasets in equation (14) (the MAPE statistics produced by each of the six individual models), the monthly population data of USP during 36 contiguous months are used in turn as training data to forecast the population of the next month. Figure 4 shows the evolution of the mean values of MAPE over the years when predicting with the PSI, ES, ARIMA, MA, BP-NET, and GM methods. It can be seen that the ARIMA method gives a much better prediction accuracy.

Based on the ACCF method, the weighting coefficients can be obtained by solving equations (16) and (17): the weight is 0 for PSI, 0 for ES, 1 for ARIMA, 0 for MA, 0 for BP-NET, and 0 for GM. This shows that the ACCF method can adaptively seek out the prediction methods with much higher accuracy and abandon the prediction methods with poor accuracy. Then, the USP dataset from January 1988 to December 2019 is predicted in a rolling manner by solving equation (15), as shown in Figure 5. The ACCF method demonstrates a good prediction trend with good volatility and following quality.

##### 4.2. Periodic Recognition and Prediction on AEP Dataset

The AEP dataset was used as a tested dataset to show periodic detection and prediction results. Figure 6 shows the evolution of the mean values of MAPE with time when predicting with the PSI, ES, ARIMA, MA, BP-NET, and GM methods. By solving equations (16) and (17), the weighting coefficients of the models at different forecasting times are obtained and shown in Table 6. It can be seen that the sum of the six weight coefficients is always equal to 1; the weights of the MA and GM methods are equal to zero; and the weights of the ES and BP-NET methods are either mostly zero or always smaller than those of the PSI and ARIMA methods. Thus, because the MA and GM methods perform worst, they are assigned no weight, whereas the PSI and ARIMA methods play a more important role than the other individual methods in the prediction of the AEP dataset. Then, a rolling prediction of the AEP dataset from 0:00 on March 22 to 23:00 on August 2, 2018, was performed by solving equation (15), as shown in Figure 7. The ACCF method shows a better prediction trend with good volatility and following quality.

##### 4.3. Periodic Recognition and Prediction on VS Dataset

The VS dataset was used as a tested dataset to show periodic detection and prediction results. Figure 8 shows the evolution of the mean values of MAPE with time when predicting with the PSI, ES, ARIMA, MA, BP-NET, and GM methods. Based on the ACCF method, the weighting coefficient of each base model can be computed by solving equations (16) and (17). Table 7 shows the weighting coefficients of the models at different forecasting times. It can be seen that the sum of the six weight coefficients is equal to 1, which is similar to the result in Section 4.2, but only the weights of the MA and GM methods are equal to zero. The PSI, ES, ARIMA, and BP-NET methods all play an important role in the prediction of the VS dataset. Then, the VS dataset from the 541st second to the 8580th second is further predicted by using the ACCF method, as shown in Figure 9. Once again, the ACCF method demonstrates a better prediction trend with good volatility and following quality.

##### 4.4. Accuracy Analysis of ACCF Method

###### 4.4.1. Comparison with Other Forecasting Models

For each prediction model, MAPE_{k} (*k* = 1, 2, …, (*N*−3*s*)/(*s*/3)), MAPE_{all}, Count, and Per are calculated by equation (18). The statistical test of MAPE_{k} is shown in Table 8 for the USP dataset, in Table 9 for the AEP dataset, and in Table 10 for the VS dataset. The maximum value and the standard deviation of MAPE_{k} are also obtained and listed in Tables 8–10. It can be seen from Tables 8 and 9 that MAPE_{all}, the maximum value of MAPE_{k}, and the standard deviation of MAPE_{k} are smaller for the ACCF method than for the other methods. Meanwhile, the ACCF method gives greater values of Count and Per than the other methods. Furthermore, in Table 10, the maximum value and the standard deviation of MAPE_{k} for the ACCF method are slightly greater than those of the IV and MSEI methods, but the ACCF method has the smallest value of MAPE_{all} and the greatest values of *Count* and *Per* among all methods. A possible reason is that the ACCF method exploits the statistical distribution of the historical forecasting errors more deeply, so that the weighting coefficient for each model is adjusted more reasonably, leading to a smaller MAPE_{all} as well as a greater Count. Therefore, the ACCF method has the highest incidence of delivering the best predictions among all compared forecasting methods on the USP, AEP, and VS datasets.

Table 11 presents a more visual view of the prediction accuracy of each prediction model. It can be noticed that most of the *FA*_{all} values for the individual methods are more than 95%, and the four combination forecasting methods have higher *FA*_{all} values of more than 97%. The combination forecasting methods are therefore superior to the individual methods. The ACCF method is observed to be the best for the prediction of the USP, AEP, and VS datasets, owing to its higher forecasting accuracy. Judging by the *FA*_{all} values of the four combination methods, the ACCF and IV methods are superior to the MSEI and SWA methods. In general, the developed ACCF algorithm adaptively adopts one (for USP) or several (for AEP and VS) of the individual prediction methods, achieves satisfactory accuracy (*FA*_{all} > 98.8%) in time series prediction, and is suitable for the prediction of the three datasets used in this study.

###### 4.4.2. Impact of Noise Ratio

In order to test the robustness of the ACCF algorithm, noise was further added to the AEP dataset and the VS dataset. The ratio of the standard deviation (STD) of the added noise to the STD of the original dataset ranges from 0.00 to 0.50 in the study. The prediction accuracy of the ACCF method under noisy data was computed and compared with that of the other forecasting methods, as shown in Tables 12 and 13.
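One way to generate such noisy test data is sketched below. The text specifies only the ratio of the noise STD to the STD of the original dataset; the use of zero-mean Gaussian noise, the fixed random seed, and the function name `add_noise` are assumptions made for this illustration.

```python
import random
import statistics


def add_noise(y, ratio, seed=0):
    """Return a copy of y with zero-mean Gaussian noise added.

    The noise standard deviation is ratio * std(y), so ratio = 0.0
    leaves the series unchanged and ratio = 0.5 adds noise whose STD
    is half that of the original data.
    """
    if ratio < 0.0:
        raise ValueError("ratio must be non-negative")
    rng = random.Random(seed)          # fixed seed for repeatable experiments
    sigma = ratio * statistics.pstdev(y)
    return [value + rng.gauss(0.0, sigma) for value in y]


if __name__ == "__main__":
    # Toy periodic series standing in for an hourly load curve.
    clean = [float(hour % 24) for hour in range(2000)]
    for ratio in (0.0, 0.1, 0.3, 0.5):
        noisy = add_noise(clean, ratio)
        noise = [n - c for n, c in zip(noisy, clean)]
        achieved = statistics.pstdev(noise) / statistics.pstdev(clean)
        print(ratio, round(achieved, 3))
```

The printed values show the achieved noise ratio for each requested ratio, which should be close to the requested value for a series of this length.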

As can be observed from Tables 12 and 13, when the proportion of noisy data increases, the *FA*_{all} value of each algorithm decreases in all cases. Because each time series dataset has a different natural periodicity, the compared methods obtain different accuracy on different datasets. The *FA*_{all} values of the forecasting methods are between 76.257% and 98.184% for the AEP dataset and between 94.856% and 98.770% for the VS dataset. In addition, for the AEP dataset or the VS dataset with a noise ratio of 0.00, the ACCF method is superior to the other methods, and the IV method is in second place. With increasing noise ratios, the ACCF method still almost always keeps the highest *FA*_{all} value in both cases against the other comparison algorithms (including not only individual methods but also combination methods). For example, as the noise ratio changes from 0.0 to 0.5, the *FA*_{all} of the IV method decreases from 98.120% to 90.882% for the AEP dataset and from 98.742% to 97.443% for the VS dataset, while the *FA*_{all} of the ACCF method decreases from 98.184% to 90.955% for the AEP dataset and from 98.770% to 97.452% for the VS dataset. This might be due to the high stochasticity of the tested data set under a high noise ratio, so that the comparison models could not capture the actual trend of the historical forecasting errors. In particular, the IV, MSEI, and SWA models do not consider the statistical distribution of the forecasting errors over historical time, so they show lower robustness than the developed ACCF model. In contrast, because the statistical forecasting errors are used to correct the weights in real time, the robustness of the ACCF method is better than that of the other comparison methods for noisy data.

Therefore, it is concluded that the proposed ACCF algorithm obtains higher prediction accuracy on time series datasets and is more robust to noisy data than other individual methods, as well as combination methods.

#### 5. Conclusions

(i) Based on individual forecasting methods, such as the PSI, ES, ARIMA, MA, and BP-NET methods, an ACCF method with adaptive weighting coefficients is proposed for short-term prediction of time series data.

(ii) The combination forecasting methods are superior to the individual methods. In contrast to other forecasting methods, the proposed ACCF method adaptively adopts one or several of the prediction methods and shows satisfactory forecasting quality owing to its flexible adaptability and high forecasting accuracy. The ACCF method is well suited to short-term prediction of time series datasets.

(iii) The higher the noise ratio of the tested datasets, the lower the prediction accuracy of the ACCF method. Nevertheless, the proposed ACCF method still achieves advantages over the other forecasting methods in terms of forecasting accuracy, and it demonstrates a better prediction trend with good volatility and following quality.

#### Abbreviations

| Symbol | Description |
| --- | --- |
| *dt* | Equal interval of time |
| *e*_{j} | The sum of squared errors of the *j*-th single model |
| *FA* | Forecasting accuracy |
| *k*_{j} | Either 0 or 1 (*j* = 1, 2, …, *n*) |
| *K*_{0} | Correction coefficient for period index |
| MAPE | Mean absolute percentage error |
| MAPE_{1} | New historical arrays given in equation (14) by PSI method |
| MAPE_{2} | New historical arrays given in equation (14) by ES method |
| MAPE_{3} | New historical arrays given in equation (14) by ARIMA method |
| MAPE_{4} | New historical arrays given in equation (14) by MA method |
| MAPE_{5} | New historical arrays given in equation (14) by BP-NET method |
| MAPE_{6} | New historical arrays given in equation (14) by GM method |
| *n* | Number of forecasting methods |
| *PI*(*t*_{i}) | Period index at time *t*_{i} |
| *SI*(*t*_{i}) | Sequential index at time *t*_{i} |
| *t*_{i} | Time (*t*_{i} = *t*_{1}, *t*_{2}, …, *t*_{N}) |
| *T* | Fixed time period |
| *y*(*t*_{i}) | Observed value at time *t*_{i} |
| *Y*_{1}(*t*_{i}) | Forecasting value at time *t*_{i} by PSI method |
| *Y*_{2}(*t*_{i}) | Forecasting value at time *t*_{i} by ES method |
| *Y*_{3}(*t*_{i}) | Forecasting value at time *t*_{i} by ARIMA method |
| *Y*_{4}(*t*_{i}) | Forecasting value at time *t*_{i} by MA method |
| *Y*_{5}(*t*_{i}) | Forecasting value at time *t*_{i} by BP-NET method |
| *Y*_{6}(*t*_{i}) | Forecasting value at time *t*_{i} by GM method |
| *w*_{j} | Weighting coefficients by ACCF method (*j* = 1, 2, …, *n*) |
| $\overline{\mathrm{MAPE}}_{j}$ | Mean values for historical arrays MAPE_{j} (*j* = 1, 2, …, *n*) |
| $\overline{\mathrm{MAPE}}_{m}$ | The smallest value among $\overline{\mathrm{MAPE}}_{j}$ (*j* = 1, 2, …, *n*) |
| *σ*_{j} | Standard deviations for historical arrays MAPE_{j} (*j* = 1, 2, …, *n*) |
| *σ*_{m} | Standard deviations for historical arrays MAPE_{m} |
| *F*(*t*_{i}) | Forecasting value at time *t*_{i} by ACCF method |
| *α* | Optimized weighting factor of PSI method |
| *β* | Smoothing parameter |

#### Data Availability

The data presented in this study are available upon request from the corresponding author.

#### Conflicts of Interest

The authors declare no conflicts of interest.

#### Authors’ Contributions

Conceptualization was performed by H. J. and D. F.; formal analysis was performed by H. J. and D. F.; investigation was performed by H. J. and D. F.; the original draft was written by H. J. and X. Z.; reviewing and editing were performed by H. J. and D. F.

#### Acknowledgments

This study was funded by the Key Projects for International Cooperation in Scientific and Technological Innovation between Governments (grant no. 2017YFE0101600). The authors would like to express their sincere thanks to technical support from Sino–German Institute for Intelligent Technologies.