Discrete Dynamics in Nature and Society

Volume 2016 (2016), Article ID 9649682, 9 pages

http://dx.doi.org/10.1155/2016/9649682

## SARIMA-Orthogonal Polynomial Curve Fitting Model for Medium-Term Load Forecasting

School of Economics and Management, North China Electric Power University, Baoding 071003, China

Received 21 June 2016; Accepted 4 October 2016

Academic Editor: Paolo Renna

Copyright © 2016 Herui Cui et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The seasonal component is a key factor in time series modeling for medium-term electric load forecasting. In this paper, a seasonal ARIMA (SARIMA) model is developed, but the seasonal autoregressive (SAR) and seasonal moving average (SMA) parameters turn out to be nonsignificant in most cases during model order selection. To address this issue, a hybrid time series model based on the Hodrick-Prescott (HP) filter is used to extract the spectrum sequences with different frequencies and to analyze the interactions among the various factors. Finally, an integrative forecast is made for electricity consumption from January to November 2014. The empirical results demonstrate that the method with the HP filter reduces the relative error caused by the interaction between the trend component and the seasonal component.

#### 1. Introduction

To a certain extent, medium-term power consumption is affected by seasonal factors, historical consumption, and consumption peaks caused by unexpected events. In current prediction techniques, these factors are categorized into the long-term trend ($T$), the seasonal fluctuation ($S$), the cycle volatility factors ($C$), and the irregular volatility factors ($I$). The influences of these factors are superimposed on each other, which makes model construction difficult.

In recent years, much research has addressed these four fluctuation factors in power load forecasting. Neural networks are often used to predict electric load [1, 2], although, considering the data volume, a time series model matches the characteristics of the sequence better than a neural network model [3]. Azadeh et al. [4] combined seasonal fluctuation and the nonlinearity of forecasting with a fuzzy system and data mining techniques to analyze the monthly electricity demand in Iran. The SVM model can also be used to analyze the effect of the seasonal fluctuation and the long-term trend [5]. A sequence with a significant trend can be analyzed with models that incorporate neural network methods [6–8]. It follows that trend and seasonal characteristics are the key factors affecting accuracy in medium-term load forecasting. The SARIMA model, which can eliminate the effects of seasonal and irregular factors, is well suited to monthly electricity consumption forecasting [9, 10]. Sequence decomposition is often used to analyze the superimposed effect of seasonal variation and long-term growth or decline [11]. Among decomposition methods, the Hodrick-Prescott (HP) filter has certain advantages [12, 13].

The seasonal ARIMA model can take the seasonal fluctuation of a sequence fully into account. However, because of the interaction of the four fluctuations $T$, $S$, $C$, and $I$, and the interaction between seasonal and nonseasonal factors, the seasonal parameters are nonsignificant in most practical applications. The HP filter uses spectral analysis to separate the data sequences and relieve the superimposed impact of the fluctuations. In this paper, the HP filter is used to obtain one sequence with a significant trend and one sequence with significant periodicity. By modeling the two separately and combining the results, the model relieves the mutual influence of the components and improves prediction precision. In addition, to cope with the model order problem, a long-memory test is conducted on the original data sequence. The result shows that the sequence is not a standard random walk process, which suggests new ideas for power load forecasting.

The purpose of this paper is to design an accurate prediction model. The remainder of the paper is organized as follows. Section 2 introduces the principles of the methods. Section 3 describes power load forecasting with the traditional method and with the improved method and discusses the results. The last section concludes the paper.

#### 2. Forecasting Models

##### 2.1. SARIMA Model

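The Seasonal Autoregressive Integrated Moving Average (SARIMA) model, denoted as $\mathrm{ARIMA}(p,d,q)(P,D,Q)_s$, builds on the traditional model. It can eliminate the influence of periodicity in the prediction process and is therefore widely applied to forecasting seasonal time series [14, 15]. The model can be written as

$$\phi_p(B)\,\Phi_P(B^s)\,(1-B)^d(1-B^s)^D y_t = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t, \tag{1}$$

where $B$ is the backward shift operator and $s$ is the seasonal period. The integers $p$, $q$, $P$, and $Q$ are the orders of AR, MA, SAR, and SMA, respectively. The integers $d$ and $D$ are the numbers of regular differences and seasonal differences, respectively; a nonstationary time series $y_t$ can be transformed into a stationary series by applying the difference operator $\nabla = 1 - B$. $B$ satisfies $B^k y_t = y_{t-k}$. The formulas

$$\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p, \qquad \theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q \tag{2}$$

are polynomials in $B$ of degrees $p$ and $q$, and the formulas

$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \cdots - \Phi_P B^{Ps}, \qquad \Theta_Q(B^s) = 1 - \Theta_1 B^s - \cdots - \Theta_Q B^{Qs} \tag{3}$$

are polynomials in $B^s$ of degrees $P$ and $Q$. The term $\varepsilon_t$, a disturbance with mean 0 and variance $\sigma^2$, is regarded as the estimated residual at time $t$; it is an independent and identically distributed normal random variable.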
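To make the differencing part of the SARIMA model concrete, the following minimal Python sketch applies regular differences of order d and seasonal differences of order D with period s to a series. The function names `difference` and `sarima_difference` and the toy monthly series are illustrative assumptions, not part of the original paper.

```python
def difference(series, lag=1, times=1):
    """Apply the operator (1 - B^lag) to `series`, `times` times."""
    out = list(series)
    for _ in range(times):
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out


def sarima_difference(series, d=1, D=1, s=12):
    """Seasonal differencing (period s, order D) followed by regular differencing (order d)."""
    seasonal = difference(series, lag=s, times=D)
    return difference(seasonal, lag=1, times=d)


# Hypothetical monthly series: linear trend plus a repeating 12-month pattern.
pattern = [5, 3, 0, -2, -4, -1, 4, 6, 1, -3, -5, -4]
toy = [100 + 2 * t + pattern[t % 12] for t in range(48)]

# The seasonal difference removes the pattern; the regular difference removes the trend.
stationary = sarima_difference(toy, d=1, D=1, s=12)
```

Each seasonal difference shortens the series by s points and each regular difference by one point, which matters when only a few years of monthly data are available.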

In the process of the seasonal time series analysis, there are three questions that need to be analyzed.

*(1) Stationary Test.* Stationarity of the time series is the premise for building the SARIMA model. A series $y_t$ is defined as weakly stationary, or covariance stationary, when $\mu$, $\sigma^2$, and $\gamma_k$ in formula (4) are constants that do not depend on $t$:

$$E(y_t) = \mu, \qquad \mathrm{Var}(y_t) = \sigma^2, \qquad \mathrm{Cov}(y_t, y_{t+k}) = \gamma_k. \tag{4}$$

The augmented Dickey-Fuller (ADF) unit root test can be used to test whether the sequence is stationary. If the sequence is nonstationary, it is differenced until the differenced sequence is stationary. The stationary sequence obtained after differencing is denoted as $x_t$.

*(2) Seasonal Analysis*. Before the seasonal analysis, the autocorrelation function must be defined. It can be expressed as

$$\rho_k = \frac{\sum_{t=1}^{n-k}(x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{n}(x_t - \bar{x})^2}, \tag{5}$$

where $x_t$ is a stationary sequence and $\bar{x}$ is the average of the sequence $x_t$. By comparing the autocorrelation function with its confidence interval, the periodicity and the cycle length of $x_t$ can be obtained. According to the additive model (6), the sequence can then be seasonally adjusted:

$$x_t = TC_t + S_t + I_t, \tag{6}$$

where $TC_t$ means long-term trend and cycle volatility, $S_t$ means seasonal fluctuation, and $I_t$ means irregular volatility.

*(3) Model Order Selection and Model Prediction.* First, the seasonally adjusted sequence is denoted as $z_t$. The lag intervals and the confidence intervals of the autocorrelation function and the partial autocorrelation function are analyzed in order to determine the orders of AR($p$), MA($q$), SAR($P$), and SMA($Q$) and to build the model. Then, according to the principle of minimum mean square error, the $l$-step-ahead prediction $\hat{z}_t(l)$ is the conditional expectation of $z_{t+l}$:

$$\hat{z}_t(l) = E(z_{t+l} \mid z_t, z_{t-1}, \ldots).$$

When a high-order problem exists in the model, a long-memory test can be carried out on the stationary sequence $x_t$.
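The seasonal analysis step above relies on the sample autocorrelation function. A minimal Python sketch is given below, assuming a stationary input sequence; the function name `autocorrelation`, the sinusoidal toy series, and the approximate 95% white-noise bound of 1.96 divided by the square root of the sample size are illustrative assumptions rather than details taken from the paper.

```python
import math


def autocorrelation(x, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical stationary series with an exact 12-step cycle.
toy = [math.sin(2 * math.pi * t / 12) for t in range(120)]
acf = autocorrelation(toy, 24)

# Approximate 95% bound for a white-noise sequence of the same length.
bound = 1.96 / math.sqrt(len(toy))

# The lag with the largest positive autocorrelation suggests the cycle length.
seasonal_lag = max(range(1, 25), key=lambda k: acf[k])
significant_lags = [k for k in range(1, 25) if abs(acf[k]) > bound]
```

For real load data the peaks are less clean than in this toy series, so the candidate period is usually confirmed against domain knowledge, such as a 12-month cycle for monthly consumption.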

##### 2.2. Method of ARFIMA Model

Long-memory analysis, which is specific to the random walk process, was put forward by H. E. Hurst in 1951 in his research on the relationship between water flow into reservoirs and storage capacity. He proposed rescaled range analysis ($R/S$) for this purpose. Researchers have since often used this method for financial sequence analysis and for building the Autoregressive Fractionally Integrated Moving Average (ARFIMA) model [16, 17]. The procedure of the $R/S$ method is as follows.

First, the sequence $x_t$ is divided into $A$ adjacent intervals, each of length $n$. For every interval $a$ ($a = 1, 2, \ldots, A$), define

$$X_{k,a} = \sum_{i=1}^{k}\left(x_{i,a} - \bar{x}_a\right), \qquad k = 1, 2, \ldots, n,$$

where $\bar{x}_a$ is the average of the interval $a$ and $X_{k,a}$ is the cumulative deviation of the interval $a$. Let $R$ denote the difference between the maximum and the minimum of $X_{k,a}$, and let $S$ denote the standard deviation of the sequence within the interval, so that the $R/S$ relation can be expressed as

$$(R/S)_n = c\,n^{H},$$

where $H$, the Hurst index, is the exponent of $n$ and $c$ is a constant. Taking logarithms on both sides gives

$$\log (R/S)_n = \log c + H \log n.$$

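Then the Hurst index $H$ can be estimated by the OLS method as the slope of this regression. Finally, long memory can be judged by the following standard [18]:

$$\begin{cases} H = 0.5, & \text{standard random walk, no long memory}, \\ 0.5 < H < 1, & \text{persistent sequence with long memory}, \\ 0 < H < 0.5, & \text{antipersistent sequence}. \end{cases}$$

When $0.5 < H < 1$, the original sequence is likely to have a long memory [19]; however, it cannot be guaranteed that an ARFIMA model is suitable for most medium-term load forecasting [20].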
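The $R/S$ procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the choice of interval lengths, the use of the population standard deviation, and the simulated test series are all hypothetical choices, and small-sample bias corrections are omitted.

```python
import math
import random


def rescaled_range(segment):
    """R/S statistic of one interval: range of cumulative deviations over the std."""
    n = len(segment)
    mean = sum(segment) / n
    cumulative, total = [], 0.0
    for value in segment:
        total += value - mean
        cumulative.append(total)
    r = max(cumulative) - min(cumulative)
    s = math.sqrt(sum((v - mean) ** 2 for v in segment) / n)
    return r / s if s > 0 else None


def hurst_exponent(x, lengths):
    """OLS slope of log(mean R/S) against log(n) over the given interval lengths."""
    points = []
    for n in lengths:
        ratios = [rescaled_range(x[i:i + n]) for i in range(0, len(x) - n + 1, n)]
        ratios = [r for r in ratios if r is not None and r > 0]
        if ratios:
            points.append((math.log(n), math.log(sum(ratios) / len(ratios))))
    mx = sum(p[0] for p in points) / len(points)
    my = sum(p[1] for p in points) / len(points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points)
    return sxy / sxx


# Simulated data (assumption): white noise and its cumulative sum.
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
walk, level = [], 0.0
for e in noise:
    level += e
    walk.append(level)

lengths = [8, 16, 32, 64, 128, 256]
h_noise = hurst_exponent(noise, lengths)  # expected to lie near 0.5
h_walk = hurst_exponent(walk, lengths)    # strongly persistent levels, larger H
```

Applying the estimator to the differenced, stationary load sequence rather than to the raw levels keeps the result comparable with the judgment standard above.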

##### 2.3. Orthogonal Polynomial Curve Fitting

Orthogonal polynomial curve fitting is an improvement on ordinary least squares (OLS). OLS presupposes that the independent variables are exact values, which is unreasonable in most cases. When the error in the independent variables reaches a certain level, a prediction model built with OLS carries a corresponding error. Orthogonal polynomial curve fitting is proposed to address this situation. Its basic principle is to minimize the sum of squared orthogonal distances from all points to the fitted curve. In the OLS method the fitting polynomial can be expressed as

$$\hat{y}_i = a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_m x_i^m,$$

which is fitted by the least squares criterion, that is, the sum of squared distances between the predicted values and the actual values is minimized:

$$\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.$$

The undetermined coefficients $a_0, \ldots, a_m$ are then obtained from the extreme value condition, that is, by setting the partial derivatives of this sum to zero. The orthogonal polynomial curve fitting method improves on OLS by taking the errors of both the dependent variable and the independent variable into account when building the forecasting model. The fitting polynomial can be expressed as

$$\hat{y}_i = a_0 + a_1 \hat{x}_i + a_2 \hat{x}_i^2 + \cdots + a_m \hat{x}_i^m,$$

where $\hat{x}_i$ is the predicted value of the independent variable $x_i$. The orthogonal distance error can be expressed as

$$d_i^2 = \delta_i^2 + \varepsilon_i^2, \qquad \delta_i = x_i - \hat{x}_i, \quad \varepsilon_i = y_i - \hat{y}_i,$$

where $\delta_i$ and $\varepsilon_i$ are the random errors of $x_i$ and $y_i$, respectively. Then the criterion of the orthogonal polynomial curve fitting can be expressed as

$$\min \sum_{i=1}^{n} \left(\delta_i^2 + \varepsilon_i^2\right).$$

Combining the orthogonal distance criterion with the OLS method improves the fitting quality of the polynomial model.

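The objective function can be expressed as

$$\min F = \sum_{i=1}^{n} d_i^2,$$

where $(x_i, y_i)$ represents the real point, $f(x)$ represents the fitted curve, and $d_i$ represents the orthogonal distance from the real point to the fitted curve.

For a straight-line fit, the parametric equation of the fitted curve can be defined as

$$x = x_0 + t\cos\theta, \qquad y = y_0 + t\sin\theta,$$

where $(x_0, y_0)$ is a point of the fitted curve and $\theta$ is the angle between the tangent and the abscissa axis, so the objective function can be expressed as

$$F(x_0, y_0, \theta) = \sum_{i=1}^{n}\left[(y_i - y_0)\cos\theta - (x_i - x_0)\sin\theta\right]^2.$$

Taking the partial derivatives of $F$ with respect to $x_0$, $y_0$, and $\theta$ and setting them to zero yields the minimum error and the fitted curve. The resulting equation set can be expressed as

$$x_0 = \bar{x}, \qquad y_0 = \bar{y}, \qquad \tan 2\theta = \frac{2\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2 - \sum_{i=1}^{n}(y_i - \bar{y})^2},$$

where $\bar{x}$ and $\bar{y}$ are the mean values of the sequences $x_i$ and $y_i$.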
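For the straight-line case, minimizing the sum of squared orthogonal distances has a closed-form solution: the line passes through the centroid of the data, and its angle follows from the centered second moments. The Python sketch below illustrates this under stated assumptions; the function names and the sample points are hypothetical, and higher-degree polynomials would need an iterative solver that is not shown here.

```python
import math


def orthogonal_line_fit(xs, ys):
    """Return (x0, y0, theta) of the line minimizing squared orthogonal distances."""
    n = len(xs)
    x0 = sum(xs) / n
    y0 = sum(ys) / n
    sxx = sum((x - x0) ** 2 for x in xs)
    syy = sum((y - y0) ** 2 for y in ys)
    sxy = sum((x - x0) * (y - y0) for x, y in zip(xs, ys))
    # atan2 selects the root of tan(2*theta) that is the minimum, not the maximum.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return x0, y0, theta


def orthogonal_sse(xs, ys, x0, y0, theta):
    """Sum of squared orthogonal distances to the line through (x0, y0) at angle theta."""
    return sum(
        ((y - y0) * math.cos(theta) - (x - x0) * math.sin(theta)) ** 2
        for x, y in zip(xs, ys)
    )


# Points exactly on y = 2x + 1: the fitted slope tan(theta) should be 2.
exact_x = [0.0, 1.0, 2.0, 3.0, 4.0]
exact_y = [1.0, 3.0, 5.0, 7.0, 9.0]
ex0, ey0, etheta = orthogonal_line_fit(exact_x, exact_y)

# Hypothetical noisy points with errors in both coordinates.
noisy_x = [0.0, 1.0, 2.0, 3.0, 4.0]
noisy_y = [0.1, 0.9, 2.2, 2.8, 4.1]
nx0, ny0, ntheta = orthogonal_line_fit(noisy_x, noisy_y)
best = orthogonal_sse(noisy_x, noisy_y, nx0, ny0, ntheta)
```

Unlike OLS, this fit is symmetric in the two variables, so swapping the roles of the two coordinates describes the same geometric line.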

##### 2.4. Hodrick-Prescott Filter

Hodrick and Prescott first proposed the Hodrick-Prescott filter (HP filter) in a paper analyzing postwar business cycles in the United States. The method treats the time series as a spectrum for analysis [14, 15]. It divides the sequence into two components, whose relationship with the original sequence is

$$y_t = g_t + c_t,$$

where the sequence with the long-term trend is denoted as $g_t$ and the sequence with short-term volatility is denoted as $c_t$. The separation must satisfy the minimum loss function principle:

$$\min_{\{g_t\}} \left\{ \sum_{t=1}^{n} (y_t - g_t)^2 + \lambda \sum_{t=2}^{n-1} \left[(g_{t+1} - g_t) - (g_t - g_{t-1})\right]^2 \right\},$$

where $\lambda = \sigma_1^2 / \sigma_2^2$ is the smoothing parameter and $\sigma_1$ and $\sigma_2$ represent the standard deviations of the sequence $c_t$ and of the second difference of the sequence $g_t$, respectively. As $\lambda$ increases, changes in the estimated trend are penalized more heavily relative to the fit to the original sequence. The larger $\lambda$ is, the smoother the estimated trend, and as $\lambda$ tends to infinity the estimated trend approaches a linear function. As a general rule of thumb, $\lambda$ is set to 14400 for monthly data.

In this paper, the HP filter is applied to the series without seasonal adjustment. The original sequence is divided into two sequences, each with a significant spectral frequency, and weakening the mutual effect between the two sequences allows the model to be built more accurately.
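The HP filter trend is the solution of a linear system: setting the gradient of the loss function to zero gives (I + λKᵀK)g = y, where K is the second-difference matrix. The Python sketch below solves this system with plain Gaussian elimination. The function names, the dense solver, the toy series, and the default value 14400 (the common rule of thumb for monthly data) are assumptions chosen for illustration; a production implementation would exploit the banded structure of the matrix.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hp_filter(y, lam=14400.0):
    """Return (trend, cycle) with trend solving (I + lam * K^T K) g = y."""
    n = len(y)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        # Row r of K takes the second difference g[r] - 2 g[r+1] + g[r+2].
        coeffs = ((r, 1.0), (r + 1, -2.0), (r + 2, 1.0))
        for i, ci in coeffs:
            for j, cj in coeffs:
                a[i][j] += lam * ci * cj
    trend = solve_linear(a, [float(v) for v in y])
    cycle = [yi - gi for yi, gi in zip(y, trend)]
    return trend, cycle


# Hypothetical monthly series: linear growth plus a 12-month pattern.
pattern = [5, 3, 0, -2, -4, -1, 4, 6, 1, -3, -5, -4]
toy = [100 + 2 * t + pattern[t % 12] for t in range(36)]
trend, cycle = hp_filter(toy)

# A purely linear series has zero second differences, so it is its own trend.
line = [3.0 + 2.0 * t for t in range(24)]
line_trend, line_cycle = hp_filter(line)
```

The trend and cycle returned here always add back to the input series, which is what allows the two component forecasts to be summed in the combined model.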

##### 2.5. Error Estimation Methods

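The model can be evaluated by the following basic error measures: relative error (RE), mean absolute percentage error (MAPE), root mean square error (RMSE), and mean absolute error (MAE), which can be expressed as follows:

$$\mathrm{RE}_t = \frac{\hat{y}_t - y_t}{y_t} \times 100\%, \qquad \mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{\hat{y}_t - y_t}{y_t}\right| \times 100\%,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(\hat{y}_t - y_t)^2}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|\hat{y}_t - y_t\right|,$$

where $y_t$ is the actual value, $\hat{y}_t$ is the predicted value, and $n$ is the number of forecast points.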
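A minimal Python sketch of these four error measures is given below, using their standard definitions. The function names and the three-point sample are illustrative assumptions and do not come from the paper's data.

```python
import math


def relative_errors(actual, predicted):
    """Relative error of each forecast, in percent."""
    return [(p - a) / a * 100.0 for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return sum(abs(e) for e in relative_errors(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted consumption values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

re_values = relative_errors(actual, predicted)
scores = {
    "MAPE": mape(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
}
```

RE and MAPE are scale-free and undefined when an actual value is zero, whereas RMSE and MAE keep the units of the series and RMSE weights large errors more heavily.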

##### 2.6. Sequence Analysis and Combination Model Building

The improved model is based on separating the original sequence by filtering analysis. Then according to the characteristics of each sequence, the models can be established for forecasting. The detailed process is as follows.

(1) According to the HP filtering principle, the original sequence $y_t$, defined as the superposition of waves with different frequencies, is divided into the trend sequence $g_t$ and the volatility sequence $c_t$.

(2) The sequence $g_t$ is treated as a function of time $t$, and its scatter plot is drawn. The error term from each point to the fitting curve is denoted as $\varepsilon_t$, $t = 1, 2, \ldots, n$. Then the orthogonal polynomial with the OLS method is used for polynomial curve fitting to minimize the sum of squared errors $\sum_{t=1}^{n} \varepsilon_t^2$.

(3) From the polynomial fitted in the previous step, the predictions of the sequence are obtained and denoted as $\hat{g}_t$.

(4) The stationarity of the sequence $c_t$ is tested. If it is stationary, correlation analysis is applied to the sequence directly; otherwise, the sequence is differenced until it is stationary. The stationary sequence is denoted as $c'_t$.

(5) Through the correlation analysis of the seasonal fluctuation, the autoregressive and moving average terms can be identified [18, 20]. According to the result, the ARIMA($p$, $d$, $q$) model can be defined as

$$\phi_p(B)(1 - B)^d c_t = \theta_q(B)\,\varepsilon_t.$$

(6) The adequacy of the ARIMA model is tested on the residual sequence.

(7) From the ARIMA model, the predictions of the sequence are obtained and denoted as $\hat{c}_t$.

(8) The final prediction result is obtained from the HP filter relation:

$$\hat{y}_t = \hat{g}_t + \hat{c}_t.$$

The improved model incurs prediction error twice, once for each component. In theory, this could increase the total error and reduce the prediction accuracy. In practice, however, the HP filter weakens the mutual influence of the factors (including the long-term trend and the seasonal fluctuation) and allows integrative forecasting with models suited to the different characteristics of the sequence $g_t$ and the sequence $c_t$. In this way, the trend of the sequence can be fitted effectively, the influence on the seasonal trend is reduced, and higher prediction accuracy can be achieved.

According to the above steps, the specific process of the improved model is shown in Figure 1.