Abstract

Electric load in summer has a significant cyclical trend with temperature effects. In general, the parameters of the SARIMA and the SMA turn out to be nonsignificant in most cases. To address this issue, a hybrid time series model is used to extract the spectrum sequences with different frequencies. The original electric load series is first decomposed into the trend sequence “G” and the cycle sequence “C.” After that, a revised ARMAX model is proposed to deal with the two separated sequences. Finally, the combined models are tested in a case study. The case study on electric load forecasting for one city in China shows that the proposed model outperforms four other comparative models in terms of prediction accuracy. This indicates that the combined model proposed by the authors is more accurate than models based on a single forecasting method.

1. Introduction

Load forecasting is mainly used to predict the power load over the next few days [1, 2], and it plays an important role in modern electricity Demand Side Management (DSM). Accurate short-term electric load forecasting is crucial to the dynamic operation of Advanced Electricity Demand Side Management (EDSM) and Advanced Power Information Systems (APIS), which improve the efficiency and safety of the power grid. To a certain extent, medium-term power load is affected by seasonal factors, summer temperature, and consumption peaks caused by unexpected events. In general, high temperature in summer produces a high “air conditioning load,” which increases as the temperature rises. Temperature data fluctuate widely in summer, and electric load changes considerably with temperature. Such volatile data, including the summer air conditioning load, are difficult to forecast. To address this problem, this paper proposes a combined model based on HP Filter-SARIMA and an ARMAX model optimized by a regression analysis algorithm.

Over the past decade, many forecasting methods have been put forward, such as time series, gray model, SVM, and artificial neural networks (ANN) [3]. Azadeh et al. explored seasonal fluctuation and nonlinearity in forecasting based on the fuzzy system and data mining techniques to analyze monthly electricity demand in Iran. This model is established on well developed statistical theories to show explicit relationships between input data and outputs. However, the SARIMA model does not perform well when electric power deviates greatly from the normal weekly pattern. It cannot react under abnormal load conditions before the flow deviation is detected [4]. Hamzacebi and Es predicted annual electricity consumption by the optimization model, which is more effective in analyzing the long-term trend but has less effect in seasonal variation [5]. Zhu et al. used the HP filter to decompose the GDP sequence into the tendency item and the cyclical item [6]. He et al. also took advantage of the HP filter in energy price analysis to study the synchronization between the markets home and abroad [7]. Zhang et al. employed four improved adaptive coefficient approaches optimized by particle swarm optimization (PSO) to forecast daily mean wind speed, the simulated results of which showed that the PSO obtained an observable improvement in forecasting performance [8]. Guo et al. proposed a modified EMD-FNN model by combining empirical mode decomposition (EMD) with the ensemble learning paradigm of feedforward neural network (FNN), which had better accuracy than that based on the basic FNN and unmodified EMD-FNN [9].

The multivariate ARIMAX model is hypothesized to improve the one-step-ahead forecasting accuracy of the univariate SARIMA model. Bierens and Broersma used the ARMAX model to study the relation between unemployment and interest rate. They found that the relationship is not confined to the Netherlands but also holds for the USA, Canada, Japan, Germany, the UK, and France [10]. Bordignon et al. analyzed combined versus individual forecasts for British electricity price prediction. It is found that combined forecasts are more accurate than or at least equivalent to individual ones [11]. Yan and Chowdhury adopted a hybrid midterm forecasting model based on the combination of both least squares support vector machine (LSSVM) and autoregressive moving average with external input (ARMAX) modules to forecast electricity market clearing price (MCP). It is shown that the proposed hybrid model can improve forecasting accuracy compared with the forecasting model using a single LSSVM [12]. Wang et al. proposed a two-stage model in estimating value-at-risk (VaR) based on ARMAX-GARCHSK and extreme value theory (EVT). It is shown in the empirical analysis that the ARMAX-GARCHSK-EVT model can rapidly reflect the most recent and relevant change of electricity prices, with accurate forecasts of VaR at all confidence levels, thereby presenting better dynamic characteristics [13]. Yang et al. proposed a new evolutionary programming (EP) approach to identify the autoregressive moving average with exogenous variable (ARMAX) model for hourly load demand forecasts from one day to one week ahead. The developed EP based load forecasting algorithm is verified by different types of data for Taiwan Power (Taipower) system and substation load as well as temperature values [14]. Huang et al. proposed a new particle swarm optimization (PSO) approach to identify the autoregressive moving average with exogenous variable (ARMAX) model for load forecasts. The testing results indicate that the proposed PSO offers high-quality solutions, superior convergence, and shorter computation time [15].

Wangdi et al. adapted ARIMAX model to determine predictors of malaria in the coming month. ARIMAX model is an extension of ARIMA modeling in an attempt to predict the malaria cases using the climatic factors and the number of cases in the previous month. The predictors in the model include the number of cases in the previous month, mean maximum and minimum temperature, relative humidity, and rainfall lagged in a month. It is shown by test results that prediction accuracy has been greatly improved [16].

The above forecasting methods are effective in dealing with cyclical and trend data. However, their generalization capability is generally weak. Traditional time series forecasting methods can be used to predict short-term load data [17], but their forecasting accuracy is not as good as that of combined models [18]. The autoregressive integrated moving average with exogenous variables (ARIMAX) model is mainly aimed at forecasting data under the influence of external factors. However, only a few studies discuss this model for load forecasting. Building on the above research, the combined HP Filter-SARIMA-Revised ARMAX models optimized by regression analysis are proposed here to forecast short-term electric load. They have strong application value in summer load forecasting.

2. Forecasting Methods

2.1. Basic Concepts

Some important concepts are displayed below.

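Stationarity. Let $\{X_t\}$ be a time series. If, for any positive integer $m$, any $t_1, t_2, \ldots, t_m \in T$, and any integer $\tau$,
$$F_{t_1, \ldots, t_m}(x_1, \ldots, x_m) = F_{t_1+\tau, \ldots, t_m+\tau}(x_1, \ldots, x_m),$$
then $\{X_t\}$ is called a strictly stationary time series.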

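White Noise. If the time series $\{X_t\}$ meets the conditions ① $E(X_t) = \mu$ for all $t \in T$ and ② $\gamma(t, s) = \sigma^2$ for $t = s$ and $\gamma(t, s) = 0$ for $t \neq s$, then $\{X_t\}$ is called a white noise sequence, written as
$$X_t \sim WN(\mu, \sigma^2).$$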

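Autocorrelation Function. For a stationary sequence $\{X_t\}$ with autocovariance $\gamma_k = \mathrm{Cov}(X_t, X_{t+k})$, consider
$$\rho_k = \frac{\gamma_k}{\gamma_0}.$$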
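As an illustration of how the autocorrelation function exposes periodicity, the following minimal sketch computes sample autocorrelations with the standard library only. The function name `autocorrelation` and the weekly-patterned numbers are hypothetical and are not taken from the paper's data.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation rho_k = gamma_k / gamma_0 for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    gamma0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        # Biased estimator of the lag-k autocovariance (divides by n).
        gamma_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(gamma_k / gamma0)
    return acf


# Hypothetical series with a 7-day pattern (illustrative numbers only).
load = [10, 12, 13, 12, 11, 8, 7] * 6
rho = autocorrelation(load, 14)
print([round(r, 3) for r in rho])
```

For a series with a weekly cycle, the coefficients at lags 7 and 14 stay large while the intermediate lags are smaller, which is the kind of periodic pattern a seasonal model is meant to capture.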

Cointegration Theory. Cointegration theory was put forward by Engle and Granger [19]. Forecasting models can be developed without requiring that all sequences are stationary. The regression model is as follows:
$$y_t = \beta_0 + \sum_{i=1}^{k} \beta_i x_{it} + \varepsilon_t.$$

A cointegration relationship exists between the independent variable sequences $\{x_{1t}\}, \ldots, \{x_{kt}\}$ and the response variable sequence $\{y_t\}$ when the regression residual sequence $\{\varepsilon_t\}$ is stationary. There is a strong correlation between electric load and temperature in summer, which is shown in Figure 4. The cointegration relationship is evident, and it is tested in the second stage of the combined forecasting models.

2.2. Forecasting Models
2.2.1. ARMA

Wold’s decomposition theorem (1938) is the foundation of time series analysis. It states that any stationary time series can be decomposed into a purely deterministic component and a purely stochastic component, and the latter can be expressed as a moving average time series. The theorem is expressed as
$$x_t = V_t + \xi_t, \quad \xi_t = \sum_{j=0}^{\infty} \varphi_j \varepsilon_{t-j},$$
where $V_t$ is the purely deterministic component and $\xi_t$ is a moving average time series. The ARMA model (autoregressive moving average model) requires that the sequence itself is stationary. Thus nonstationary sequence data are hard to handle with it directly.

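For a zero-mean stationary sequence $\{x_t\}$, the ARMA$(p, q)$ model is defined as
$$x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}.$$

By introducing the delay operator $B$, it can also be presented as $\Phi(B) x_t = \Theta(B) \varepsilon_t$, where $\Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$.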

In terms of structure, ARIMA$(p, d, q)$ models are the same as ARMA$(p, q)$ models, where the time series has first been transformed by differencing, the order of which is specified by $d$. The ARIMA modeling flowchart is shown in Figure 1.
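The role of the differencing order can be illustrated with a short sketch. It assumes the usual definition of differencing as repeated application of the operator $(1 - B)$. The helper name `difference` and the toy sequences are hypothetical.

```python
def difference(series, d=1):
    """Apply the differencing operator (1 - B) to `series` d times."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


trend = [3 + 2 * t for t in range(6)]   # linear trend: nonstationary mean
quadratic = [t * t for t in range(6)]   # quadratic trend

print(difference(trend, 1))       # one difference removes a linear trend
print(difference(quadratic, 2))   # two differences remove a quadratic trend
```

Each pass shortens the series by one observation, so a model fitted on the differenced data has to be integrated back to obtain forecasts on the original scale.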

2.2.2. HP Filter

Hodrick-Prescott Filter. Hodrick and Prescott first put forward the Hodrick-Prescott filter (HP filter) method in an analytical paper on the economic cycle in postwar America [20]. The sequence $\{Y_t\}$ is divided into two parts: the sequence with the long-term trend, denoted as $\{G_t\}$, and the sequence with short-term volatility, denoted as $\{C_t\}$. The relationship is written as
$$Y_t = G_t + C_t.$$

In this paper, the HP filter is applied to the non-seasonally adjusted series, and the original sequence is divided into two sequences with significant spectral frequencies. This improves model accuracy because it weakens the interaction among factors such as seasonality, trend, and cycle [21].
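A minimal sketch of the decomposition is given here, assuming the standard HP objective: squared deviations from the trend plus a penalty weight times the squared second differences of the trend. The function name `hp_filter`, the smoothing value `lam=100.0`, and the synthetic load series are assumptions for illustration. The paper does not state which smoothing parameter was used.

```python
def hp_filter(y, lam):
    """Split y into trend g and cycle c = y - g by minimising
    sum (y_t - g_t)^2 + lam * sum (g_{t+1} - 2 g_t + g_{t-1})^2."""
    n = len(y)
    # Build A = I + lam * D'D, where D is the second-difference matrix.
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for i in range(3):
            for j in range(3):
                a[r + i][r + j] += lam * stencil[i] * stencil[j]
    # Solve A g = y by Gaussian elimination with partial pivoting.
    m = [row[:] + [float(v)] for row, v in zip(a, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            if f:
                for k in range(col, n + 1):
                    m[r][k] -= f * m[col][k]
    g = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * g[k] for k in range(r + 1, n))
        g[r] = (m[r][n] - s) / m[r][r]
    c = [yt - gt for yt, gt in zip(y, g)]
    return g, c


# Synthetic daily load: linear growth plus a weekday/weekend pattern.
load = [100 + 0.5 * t + (3.0 if t % 7 < 5 else -4.0) for t in range(28)]
trend, cycle = hp_filter(load, lam=100.0)
print([round(v, 2) for v in trend[:5]])
print([round(v, 2) for v in cycle[:5]])
```

The dense solver keeps the sketch dependency-free. For long series a banded or sparse solver would be the more economical choice.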

2.2.3. ARMAX

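ARMAX Model. Supposing that the response variable sequence $\{y_t\}$ and the input variable sequences $\{x_{1t}\}, \ldots, \{x_{kt}\}$ are all stationary, the regression model of the response sequence on the input variable sequences is established as [22]
$$y_t = \mu + \sum_{i=1}^{k} \frac{\Theta_i(B)}{\Phi_i(B)} B^{l_i} x_{it} + \varepsilon_t,$$
where $\Phi_i(B)$ and $\Theta_i(B)$ are the autoregressive and moving average coefficient polynomials of the $i$th input variable, $l_i$ is its delay order, and $\{\varepsilon_t\}$ is the regression residual sequence.

The ARMAX model is an improvement of the ARIMA model with explanatory exogenous variables $x_{it}$. The model is a combination of a regression model with an ARIMA model, which includes the advantages of both models. In the actual modeling process, a combined “HP Filter-SARIMA-Revised ARMAX” model is proposed to forecast the short-term electric load. The specific process is described in Section 3.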

2.2.4. Grey Prediction Model

The grey prediction processes are as follows [23].

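Step 1. Consider the first-order accumulated generating operation (1-AGO).
The variable $x^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ is the original nonnegative sequence. The first-order accumulated generating sequence of $x^{(0)}$ is
$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n.$$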

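Step 2. Apply the quasi-smooth test and the quasi-exponential law test, using the smoothness ratio $\rho(k) = x^{(0)}(k)/x^{(1)}(k-1)$ and the stepwise ratio $\sigma^{(1)}(k) = x^{(1)}(k)/x^{(1)}(k-1)$. If $\rho(k) < 0.5$ and $1 \le \sigma^{(1)}(k) < 1.5$, and $\rho(k)$ shows a decreasing trend, then $x^{(0)}$ is a quasi-smooth sequence, while $x^{(1)}$ has the quasi-exponential law. Otherwise, the sequence does not meet the conditions for grey modeling.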

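Step 3. Establish the first-order linear differential equation:
$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b.$$

Step 4. Build the grey forecasting model of the sequence $x^{(1)}$:
$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right) e^{-ak} + \frac{b}{a}, \quad \hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k).$$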

Step 5. Test the model.
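The grey prediction steps can be sketched as follows, assuming the standard GM(1,1) formulation in which the parameters $a$ and $b$ are estimated by least squares on the background values $z(k) = 0.5\,(x^{(1)}(k) + x^{(1)}(k-1))$. The function names `gm11_fit` and `gm11_predict` and the synthetic sequence are hypothetical.

```python
import math


def gm11_fit(x0):
    """Estimate (a, b) of dx1/dt + a * x1 = b by least squares."""
    # 1-AGO: first-order accumulated generating sequence.
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values z(k) and the discrete model x0(k) = -a * z(k) + b.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    intercept = (sy - slope * sz) / m
    return -slope, intercept


def gm11_predict(x0, steps):
    """Forecast the next `steps` values of the original sequence."""
    a, b = gm11_fit(x0)

    def x1_hat(k):  # k = 0 corresponds to the first observation
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    n = len(x0)
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Synthetic, roughly exponential sequence (illustrative only).
x0 = [2.0 * 1.1 ** k for k in range(8)]
a, b = gm11_fit(x0)
forecast = gm11_predict(x0, 2)
print(round(a, 4), round(b, 4), [round(v, 3) for v in forecast])
```

The sketch omits the tests of Step 2 and Step 5. In practice the fitted values would be checked against the observations before the forecasts are used.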

2.2.5. MLP Neural Network

The multilayer perceptron neural network (MLP neural network) is one type of artificial neural network, and its use involves three processes, namely, training, testing, and validation [24]. Figure 2 shows a multilayer neural network diagram.

Synaptic weight change rules for the neurons of the hidden layer are as follows:

Synaptic weight change rules for the output neuron are as follows:

2.2.6. Regression Analysis

Regression analysis is used to forecast the value of one variable (dependent variable) based on other variables (independent variables).

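The simple linear regression model is displayed below:
$$y = \beta_0 + \beta_1 x + \varepsilon,$$
where the dependent variable is denoted as $y$, the independent variable is expressed as $x$, $\beta_0$ and $\beta_1$ are parameters, and $\varepsilon$ is the error term.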
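For illustration, the least squares estimates of the intercept and slope of a simple linear regression can be computed in closed form. The function name `ols_fit` and the temperature and load figures are hypothetical and do not come from the case study.

```python
def ols_fit(x, y):
    """Closed-form least squares estimates (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical daily maximum temperature (deg C) and load values.
temperature = [28.0, 30.0, 31.0, 33.0, 35.0, 36.0]
load = [410.0, 432.0, 439.0, 466.0, 481.0, 497.0]
b0, b1 = ols_fit(temperature, load)
print(round(b0, 2), round(b1, 2))
```

A positive slope in such a fit corresponds to load rising with temperature, which is the relationship the combined model exploits.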

3. Establishing Process of the Combined Models

3.1. Modeling Steps

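Step 1. The original sequence $\{x_t\}$ or $\{y_t\}$, written generically as $\{Y_t\}$ and defined as the superposition of waves with different frequencies, can be divided into the $G$ sequence and the $C$ sequence [25]. The separation process must satisfy the minimum loss function principle:
$$\min_{\{G_t\}} \left\{ \sum_{t=1}^{T} (Y_t - G_t)^2 + \lambda \sum_{t=2}^{T-1} \left[ (G_{t+1} - G_t) - (G_t - G_{t-1}) \right]^2 \right\}.$$

Step 2. The $G$ sequence is a function of time $t$. The error term is denoted as $\varepsilon_t$. The method of ordinary least squares (OLS) is used for polynomial curve fitting to minimize the sum of squared errors. Polynomial fitting can be represented as
$$G_t = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n + \varepsilon_t.$$
The predictions of the $G$ sequence are then obtained and denoted as $\hat{G}_t$.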

Step 3. The stationarity of the $C$ sequence is tested. If it is stationary, the correlation analysis can be applied to the sequence directly; otherwise, differencing is applied to the sequence until it is stationary [26, 27]. The stationary sequence is denoted as $\nabla^d C_t$, where $d$ is the number of differences applied.

Step 4. Through the correlation analysis, the autoregressive and moving average items can be determined. The ARIMA model is presented as
$$\Phi(B) \nabla^d C_t = \Theta(B) \varepsilon_t.$$

Step 5. Test the adequacy of the model using the residual sequence.

Step 6. With the ARIMA model, the predictions of the $C$ sequence are obtained and denoted as $\hat{C}_t$.

Step 7. The final prediction result is obtained according to the principle of the HP filter:
$$\hat{Y}_t = \hat{G}_t + \hat{C}_t.$$

Step 8. Consider exploring the correlation coefficient between the stationary sequences “” and “” to determine the structure of the improved ARMAX model. This step is an improved version for traditional ARMAX model. The revised ARMAX model can be calculated as follows:

Step 9. Fit the residual sequence $\{\varepsilon_t\}$:
$$\varepsilon_t = \frac{\Theta(B)}{\Phi(B)} a_t,$$
where $\{a_t\}$ is a zero mean white noise sequence.

Based on above steps, the combined “HP Filter-SARIMA-Revised ARMAX” model can be applied in load forecasting process.

3.2. Modeling Flowchart

The forecasting processes of the combined models are shown in Figure 3.

4. Load Forecasting with ARMA Model

4.1. Data Source

Figure 4 shows the daily maximum power load and the maximum temperature in a city in China from May 1st to July 15th. The specific information about the city is not allowed to be shared here. In this paper, the classical time series models and the combined models are applied to forecast load. The prediction results of different models are compared in Section 5.

Figure 4 shows that the load data have clear cyclical fluctuations and that the sequences are obviously nonstationary.

4.2. Establishing Seasonal-ARIMA Model

The ADF unit root test demonstrates that the original sequence is nonstationary, while the first-order difference of the original sequence is stationary at the 5% significance level [28]. Therefore, the additive model is used to adjust the seasonal sequence. The analysis of the partial autocorrelation and the autocorrelation is shown in Figure 5.

Figure 5 shows that the autocorrelation of the daily peak power load data exhibits a stationary, periodic pattern. The partial autocorrelation shows that only the first-order partial autocorrelation coefficient is significantly greater than two standard deviations [29]. The remaining partial autocorrelation coefficients rapidly decline to zero and fluctuate randomly within two standard deviations. Thus the partial autocorrelation may be regarded as cutting off after the first order [30].

After determining the model order by parameter significance testing, the time series model obtained is [31] in Tables 1 and 2. In this model, the nonseasonal autoregressive items are and , the nonseasonal moving average items are and , and the seasonal autoregressive items are .

In the process of model order selection, is greatly influential in modeling and predicting. This phenomenon shows that the errors caused by long-term observations still affect the current monthly electricity consumption to some extent [32]. It is inferred that monthly electricity consumption may have the characteristic of long-term memory. This conjecture is confirmed by long-term memory test using the Rescaled Range Analysis () method, which shows the Hurst exponent:

According to the criterion $0.5 < H < 1$, the monthly electricity consumption has the long-term memory characteristic. Therefore, the current forecast is influenced by distant observations [33, 34].
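A minimal sketch of a rescaled range (R/S) estimate of the Hurst exponent is given here. It assumes the common procedure of averaging R/S over non-overlapping windows and regressing log(R/S) on log(window size). The function names, window sizes, and the simulated white noise are assumptions, and the paper's exact procedure may differ.

```python
import math
import random


def rescaled_range(segment):
    """R/S: range of the cumulative deviations divided by the standard deviation."""
    n = len(segment)
    mean = sum(segment) / n
    cumulative, running = [], 0.0
    for v in segment:
        running += v - mean
        cumulative.append(running)
    r = max(cumulative) - min(cumulative)
    s = math.sqrt(sum((v - mean) ** 2 for v in segment) / n)
    return r / s  # assumes the segment is not constant (s > 0)


def hurst_exponent(series, window_sizes):
    """Slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for w in window_sizes:
        chunks = [series[i:i + w] for i in range(0, len(series) - w + 1, w)]
        mean_rs = sum(rescaled_range(c) for c in chunks) / len(chunks)
        xs.append(math.log(w))
        ys.append(math.log(mean_rs))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]  # simulated white noise
h = hurst_exponent(noise, (8, 16, 32, 64, 128))
print(round(h, 3))
```

For white noise the estimate is expected to lie near 0.5, although small-sample bias tends to push it somewhat higher. Values clearly above 0.5 on real data are what the long-memory criterion refers to.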

4.3. Integrative Models Using HP Filter

The HP filter is applied to the original sequence. The decomposition results are shown in Figure 6, in which the blue curve is the original sequence and the red curve is the long-term trend. The growth rate of the power consumption is roughly constant from May 1st to July 15th. The green curve, which is the remaining cyclical component, changes cyclically and irregularly, and its fluctuation range becomes larger over time.

According to the HP filter, the original electric load sequence can be separated into the $G$ sequence with the long-term trend and the $C$ sequence with the other fluctuating properties [35].

Data separation results using the HP filter are shown in Figures 7 and 8. It can be seen that the $G$ sequence approximates a smooth curve, while the curve of the $C$ sequence fluctuates up and down around zero.

From a statistical perspective, the $G$ sequence can be treated as a time-related sequence. The independent variable $t$ represents time, while the $G$ sequence is the dependent variable of the system.

The fitting polynomial of electric load data is

Fitting figure of load is shown in Figure 9.

The sequence of temperature from May 1st to July 15th can be forecasted based on fitting polynomial [36], while the sequence is predicted based on .

The same procedure can be applied to temperature data.

Based on the HP filter, the original temperature sequence can be separated into the $G$ sequence with the long-term trend and the $C$ sequence with the other fluctuating properties, as shown in Figure 10.

Data separation results using the HP filter are shown in Figures 11 and 12. The $G$ sequence looks like a smooth curve, while the curve of the $C$ sequence fluctuates up and down around zero.

The fitting polynomial of temperature data is

The sequence of the temperature from May 1st to July 15th can be forecasted based on the fitting polynomial, while the sequence is predicted based on . Fitting figure of temperature is shown in Figure 13.

Finally, the input value of revised ARMAX model is obtained based on the HP filter principle , .

4.4. Load Forecasting with Combined Models
4.4.1. and Sequences Model

Firstly, the model is established. It is shown in the test that is a stationary white noise sequence; thus the fitting model is

It is indicated in Table 3 that is a stationary white noise sequence.

The best order for model is , which is shown in Table 4.

Secondly, the model is set up. The test results obtained by SAS show that is a stationary white noise sequence [37]. Therefore, the fitting model is or model. Fitting parameter of is shown in Table 5.

The final fitting model is shown below:

4.4.2. Computing Load Data with Revised ARMAX Model

The above model is used to filter the input variable sequence and the response variable sequence, after which the cross-correlation coefficients between the filtered input sequence and the filtered response sequence are calculated.

The regression analysis in Table 6 shows that the final regression coefficient is 0.62871. The statistics test is conducted with residual sequence, showing that the residual sequence is stationary white noise sequence () [38, 39]. The fitted model for residual sequence is , where is zero mean white noise sequence.

The zero-lag cross-correlation coefficient is significantly nonzero, which means that there is no lag effect between the response sequence and the input sequence. Thus the model should be built on observations from the same period [40].
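The idea of checking at which lag the input and response sequences are most strongly related can be sketched with a plain sample cross-correlation function. The sketch omits the filtering step described above, and the function name `cross_correlation` and the temperature and load numbers are hypothetical.

```python
def cross_correlation(x, y, max_lag):
    """Sample cross-correlation between x_{t-k} and y_t for k = 0..max_lag."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    scale = (sum(d * d for d in dx) * sum(d * d for d in dy)) ** 0.5
    return [
        sum(dx[t - k] * dy[t] for t in range(k, n)) / scale
        for k in range(max_lag + 1)
    ]


# Hypothetical input (temperature) and a response that reacts on the same day.
temperature = [28, 31, 29, 33, 35, 30, 27, 32, 34, 29, 36, 31]
load = [50 + 3 * t for t in temperature]
ccf = cross_correlation(temperature, load, 3)
print([round(c, 3) for c in ccf])
```

When the largest coefficient sits at lag 0, as in this toy example, a same-period regression between the two sequences is the natural model structure. A peak at a later lag would instead suggest a delayed input term.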

The statistics test is operated with residual sequence, showing that the residual sequence is stationary white noise sequence. The fitted model for residual sequence is , where is zero mean white noise sequence [41–43]. It is known that there is significant correlation in the zero order between the two sequences. The same period model between and is established based on the results in Table 6:

The load from July 16th to July 31st is forecasted with the combined HP Filter-SARIMA-Revised ARMAX models. The second column in Table 7 is the actual load data ($10^4$ kW·h). The third column is the actual temperature (°C). The fourth column is the prediction of the SARIMA model, the fifth is that of the ARIMA model, the sixth is that of grey system theory, and the seventh is that of the MLP neural network.

In our work, the predictions and the errors of the five models from July 16th to July 31st (training set) have been computed to assess the models’ fit performance. Table 8 compares the simulation performance of the five models on the training set, and Figures 14 and 15 show the corresponding error radar chart and histograms for direct observation.

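The error criterion indicators are expressed as follows:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100\%, \quad \mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|, \quad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2},$$
where $y_t$ is the actual load, $\hat{y}_t$ is the predicted load, and $n$ is the number of forecast points.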
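The three error indicators, MAPE, MAE, and RMSE, can be computed as in the following sketch, which assumes their standard definitions. The sample values are hypothetical and are not the figures reported in Tables 7 and 8.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical actual and predicted loads.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(actual, predicted), mae(actual, predicted), rmse(actual, predicted))
```

MAPE is scale-free and therefore convenient for comparing models across load levels, whereas RMSE penalizes large individual errors more heavily than MAE.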

Forecast graphic is displayed in Figure 16.

It is shown in residual stationarity and white noise test that the residual is stationary white noise sequence, . There is a second-order delay correlation between and . The final fitting model is

5. Conclusion

Based on the above analysis, it is shown that the combined HP Filter-SARIMA-Revised ARMAX models can effectively forecast electric load with external variables. The structure of the model is determined by the statistical regression analysis and the least squares method. The process strictly follows the rules of -test, AIC, and SBC. The combined models are more accurate than the single forecasting method for short-term electric load forecasting.

A total of four traditional forecasting models are applied to forecast electric loads in this paper for comparison. The empirical analysis shows that the combined models have smaller relative errors than the traditional methods. The prediction accuracy of the combined models is greatly improved, and the significance of the parameters is greatly enhanced. Together with the existing literature, the analysis in this paper suggests that combined models have better prediction performance than a single model in dealing with special data.

Notations

:Mean of time series
:Variance of time series
:Correlation coefficient of time series
:Autocorrelation function
:A stationary sequence
:The average of the sequence
:Deterministic component
:Moving average time series
:Autoregressive coefficient
:Random interference coefficient
:Delay operator
:-order moving average coefficient polynomials
:Residual sequence moving average coefficient polynomials
:Original sequence
:The sequence
:Ratio of the standard deviations of the sequences and
:Backward shift operator
:Hurst index
:Difference between the maximum and the minimum of the cumulative deviation
:Standard deviation of the original sequence
MAPE:Mean absolute percentage error
MAE:Mean absolute error
RMSE:Root mean square error
DSM:Demand Side Management
EDSM:Electricity Demand Side Management
APIS:Advanced Power Information Systems
SVM:Support vector machine
ANN:Artificial neural networks
FNN:Feedforward neural network
LSSVM:Least squares support vector machine
PSO:Particle swarm optimization
ARMA model:Autoregressive moving average model
HP filter:Hodrick-Prescott filter
ARMAX model:The ARIMA model with explanatory exogenous variables
MLP neural network:Multilayer perceptron neural network
SARIMA model:Seasonal-ARIMA model.

Conflict of Interests

The authors declare no conflict of interests.

Acknowledgment

The authors gratefully acknowledge the financial support from the “National Natural Science Fund of China (no. 71471061).”