Abstract

A combined forecast with adaptively selected weights and errors calibrated by a hidden Markov model (HMM) is proposed to model the day-ahead electricity price. First, several individual models were built to forecast the electricity price separately. Then the validation errors from each individual model were transformed into two discrete sequences, an emission sequence and a state sequence, to build the HMM, yielding a transition matrix and an emission matrix that represent the forecasting ability state of the individual models. The combining weights of the individual models were determined by the state transition matrices of the HMMs and by the best-predicted sample ratio of each individual model among all the models in the validation set. The individual forecasts were averaged with these weights to obtain the combined forecast. The residuals of the combined forecast were calibrated by the possible error calculated from the emission matrices of the HMMs. A case study of the day-ahead electricity market of Pennsylvania-New Jersey-Maryland (PJM), USA, suggests that the proposed method outperforms individual price forecasting techniques such as support vector machine (SVM), generalized regression neural networks (GRNN), day-ahead modeling, and self-organizing map (SOM) similar days modeling.

1. Introduction

Since the 1990s, vertically integrated monopoly utilities in electric power industries around the world have been deregulated into competitive markets, with the aim of breaking monopolies and increasing operating efficiency. It is crucial for all market participants to predict the electricity price with high accuracy. Their bidding depends on the forecasts, and so do their profits; price forecasting has therefore drawn great interest.

Electricity price is affected by various uncertainties, such as power load, weather, and bidders’ expectations. These factors interact and have an intricate impact on price. Electricity price is more volatile than load, with unexpected spikes (unusual prices), high frequency, and multiple seasonality (e.g., daily and weekly periodicity). It is therefore more difficult to predict than power load. There are primarily two categories of electricity price forecasting models: time series models and artificial intelligence (AI) models.

Time series modeling forecasts future prices from available historical prices by mining the relational information contained in the data, using models such as autoregressive moving average (ARMA) and generalized autoregressive conditional heteroscedasticity (GARCH). Contreras et al. [1] used an ARMA model to forecast next-day electricity prices for the mainland Spain and Californian markets. A novel technique based on the wavelet transform and ARIMA models was proposed to forecast day-ahead electricity prices in [2]. A more robust time series model, GARCH, was developed to forecast day-ahead electricity prices in [3, 4]. Time series modeling mines the information contained in previous data but pays less attention to external influences, which leads to poor forecasts given the unstable character of prices.

AI modeling usually exploits more external influence factors than time series modeling and thus tends to produce better results. Artificial neural networks (ANNs) were developed to forecast electricity prices and showed better performance than time series modeling [5, 6]. An ANN model based on similar days was proposed to forecast day-ahead electricity prices in the Pennsylvania-New Jersey-Maryland (PJM) market [7]. A technique combining the Probabilistic Neural Network (PNN) and Orthogonal Experimental Design (OED) was developed in [8] and showed better performance than its counterparts.

Because of the complexity of AI models, the information contained in the historical prices is not fully used. A hybrid model, with support vector machines (SVM) to capture the nonlinear patterns and ARIMA to solve the residual regression estimation problem, was proposed in [9] and showed the great potential of hybrid modeling. Another hybrid model combining SVM and GARCH was developed in [10] to forecast the day-ahead price of the PJM market.

Time series modeling and AI modeling have different weaknesses and strengths in price forecasting, since they place different emphasis on how the information that influences the electricity price is exploited. Combining several predictions from different methods has been suggested to smooth the fluctuations that often occur in single-model forecasting. The performance of traditional combined forecast models relies on the combining weights of the individual models, which are usually fixed and determined by the historical performance of the models. Fixed weights are not always the best choice because the forecasting abilities of the individual models vary with the circumstances: a model that performs better at some times does not at others. It is therefore necessary to select the combining weights of the individual models according to their performance under the given circumstances. However, it is a big challenge to analyze the circumstances and to determine the proper combining weights under them. On the other hand, neither a single model nor a combined model can make full use of the influencing information, and the modeling residuals usually contain information that has not been exploited by the models. Analyzing the residual series and then estimating the residual of the next step helps to improve the forecasting accuracy [11].

A Markov chain is a random process that undergoes transitions from one state to another. It has an important property: the next state depends only on the current state and not on the sequence of events that preceded it. Markov chains can be used to analyze forecasting performance [12-14]. A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. We can apply an HMM to exploit the information contained in the forecasting error sequence. The forecasting errors can be treated as the observations of the HMM, and the forecasting abilities of a model under given circumstances can be regarded as the states [15]. In this paper, a hybrid method consisting of a combined model with circumstance-based adaptive weights and an error calibration technique is proposed to forecast the day-ahead electricity price. Several individual models were developed to forecast the electricity price separately; their performances under different circumstances were then evaluated to build hidden Markov models (HMMs). Together with the general past performance of the individual models, the state sequences of the HMMs were used to decide the combining weights; the emission sequences of the HMMs were exploited to calibrate the errors of the combined model.

The rest of the paper is organized as follows. In Section 2, we describe the fundamentals of HMM and the principle of combined forecasting by HMM; Section 3 presents the approach of combined modeling and error calibration with HMM. Experiments with the proposed technique and the compared methods are shown in Section 4. Finally, the conclusions are presented in Section 5.

2. Principle of Combined Forecast with Weights Selected by HMM

In this section the basic ideas of combined forecast with weights selected by HMM are discussed.

2.1. Principle of HMM

An HMM can be regarded as a dual random process: a sequence of emissions that can be observed, and an invisible sequence of states in which the emissions are generated. There are two kinds of HMM, discrete and continuous. Here we discuss the former and apply it to build the combining model. For brevity we give only a short introduction to the discrete HMM. More details of the HMM principle and how HMMs work can be found in [15, 16].

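A discrete HMM can be described by a set of five parameters, $\lambda = (S, V, A, B, \pi)$, where
(1) $S$: a set of states in which the observations are located, $S = \{s_1, s_2, \ldots, s_N\}$, and $N$ is the number of states;
(2) $V$: a set of emissions or observations, $V = \{v_1, v_2, \ldots, v_R\}$, and $R$ is the number of potential observations (or emissions) in each state;
(3) $A$: a transition matrix which describes the probability of a transition from a given state to another state, $A = (a_{uv})_{N \times N}$, and here $a_{uv} = P(q_{t+1} = s_v \mid q_t = s_u)$;
(4) $B$: an emission matrix, whose $(u, r)$ entry gives the probability of emitting symbol $v_r$ given that the model is in state $s_u$, $B = (b_{ur})_{N \times R}$, where $b_{ur} = P(o_t = v_r \mid q_t = s_u)$;
(5) $\pi$: a vector of the initial state distribution, $\pi = (\pi_1, \pi_2, \ldots, \pi_N)$.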

An HMM mainly aims to resolve three problems: (1) to evaluate the most likely state path for a given sequence of emissions; (2) to estimate the transition and emission probabilities from a given sequence of emissions; (3) to calculate the posterior probability that the model is in a particular state at any point in the sequence.
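As an illustration of the first problem, the following minimal sketch decodes the most likely state path with the Viterbi algorithm. It is not part of the proposed method, in which the states are derived directly from the validation errors; the function name, the 0-based indexing, and the toy matrices are assumptions made for this example only.

```python
def viterbi(obs, A, B, pi):
    """Return the most likely state path for a list of emission indices.

    A[r][s]: probability of moving from state r to state s.
    B[s][o]: probability of emitting symbol o in state s.
    pi[s]:   initial probability of state s.
    """
    n_states = len(pi)
    # delta[s]: probability of the best path that ends in state s
    delta = [pi[s] * B[s][obs[0]] for s in range(n_states)]
    back = []  # back[t][s]: best predecessor of state s at time t + 1
    for o in obs[1:]:
        prev = delta
        step, delta = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] * A[r][s])
            step.append(best)
            delta.append(prev[best] * A[best][s] * B[s][o])
        back.append(step)
    # Trace back from the most probable final state
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for step in reversed(back):
        state = step[state]
        path.append(state)
    return path[::-1]


# Toy two-state, two-symbol model (illustrative values only)
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.2], [0.2, 0.8]]
pi = [0.5, 0.5]
print(viterbi([0, 0, 1, 1], A, B, pi))
```

For long sequences the products underflow, so a practical implementation would work with log probabilities instead.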

2.2. Combined Forecast by HMM

The basic idea of combined forecasting is to form a weighted sum of the forecasts from different models in order to reduce the defects of any individual modeling method. In this paper, we use HMMs to determine the combining weights of the models.

In electricity price forecasting, a sequence of errors generated by price modeling can be considered as an HMM process. The intervals in which the forecasting errors fall form the sequence of observations, or the emission sequence; the forecasting abilities of the individual models are regarded as the states of the HMMs. The HMMs are built from the validation errors of the individual models. The next states of the HMMs, which describe the abilities of the individual models, are then used to decide the combining weights. The possible next emissions of the individual models are averaged with the combining weights and then used to calibrate the combined forecast.

2.3. Error Calibration by HMM

With the state probability vector of the next step assessed in Section 2.2 and the emission matrix of the HMM, the probabilities of the emissions in the next step can be calculated. Since the different emissions represent the intervals in which the error falls, as mentioned in Section 2.2, we can convert the emissions and their probabilities into an expected value of the forecasting error. This expected value is then used for error calibration of the combined model.

3. Approach of Combined Forecasting and Error Calibration by HMM

This section describes how to build a combined model with error calibration based on HMM techniques, as shown in Figure 1. Because the prices at different hours of the day differ greatly, we build 24 combining models, one for each hour. The approach is the same for every hour, so we take a single hour as an example. The proposed method consists of the following 7 steps.

Step 1 (initialization). This step includes data pretreatment and candidate model selection.
We split the experimental data into three sets: a training set, a validation set, and a test set. The first is used to train the models, the second to tune the models’ parameters according to their performance, and the last to evaluate the modeling algorithms.
$K$ candidate models ($M_1, M_2, \ldots, M_K$) are selected for the combined forecast.

Step 2. Individual modeling for combined forecast.
The following process is repeated for each individual model.

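Substep 1. Build the individual model $M_k$ and calculate the validation error vector $E_k^j$ and the forecasting price vector $\hat{P}_k^j$.
We train the model with the training set, then tune the parameters of $M_k$ with the validation set, and after that test $M_k$ with the test set. From these steps we obtain the validation error vector (see (1)) and the forecasting prices (see (3)):
$$E_k^j = \left(e_k^{1,j}, e_k^{2,j}, \ldots, e_k^{n,j}\right), \tag{1}$$
where $e_k^{i,j}$ is the validation error of $M_k$ for the $j$th hour on the $i$th day and $n$ is the number of days in the validation set. The error is calculated by
$$e = \frac{\hat{p} - p}{p}, \tag{2}$$
where $\hat{p}$ is the forecast price and $p$ is the actual price, and
$$\hat{P}_k^j = \left(\hat{p}_k^{1,j}, \hat{p}_k^{2,j}, \ldots, \hat{p}_k^{m,j}\right), \tag{3}$$
where $\hat{p}_k^{i,j}$ is the forecasting price of $M_k$ for the $j$th hour on the $i$th day and $m$ is the number of days in the test set.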

Substep 2. Calculate the emission sequence and the state sequence of $M_k$. In this step, for each model we transform the error sequence into a discrete emission (observation) sequence and classify the states, which denote the forecasting abilities of the model, according to the error.

Substep 3. Calculate the emission sequence $O_k^j$ and the state sequence $Q_k^j$ of $M_k$.
In this step, we transform the error sequence into a discrete emission sequence and obtain the state sequence, which denotes the forecasting abilities of the model, according to the modeling performance.
As discrete HMMs are discussed here, the error sequence needs to be discretized to obtain the emission sequence. We divide the range over which $e_k^{i,j}$ spreads into several intervals and mark the interval in which each $e_k^{i,j}$ falls with an emission value (an element of the emission set). This gives the emission vector
$$O_k^j = \left(o_k^{1,j}, o_k^{2,j}, \ldots, o_k^{n,j}\right), \tag{4}$$
where $o_k^{i,j}$ is the emission of the model $M_k$, $o_k^{i,j} \in V$, and $V$ is the emission set.
Then we calculate the state sequence according to criteria based on the model’s performance. For simplicity, we set just three states to reveal the forecasting ability of the model, as follows: $s_1$: the state of underestimate (when the target price is significantly underestimated); $s_2$: the state of proper prediction (when the target price is estimated with acceptable accuracy); and $s_3$: the state of overestimate (when the target price is significantly overestimated):
$$Q_k^j = \left(q_k^{1,j}, q_k^{2,j}, \ldots, q_k^{n,j}\right), \tag{5}$$
where $q_k^{i,j}$ is the state of the individual candidate $M_k$ in the $j$th hour of the $i$th day, $q_k^{i,j} \in S$, and $S = \{s_1, s_2, s_3\}$ is the state set.
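The discretization described in this substep can be sketched as follows. The interval boundaries and the tolerance that separates proper prediction from under- and overestimation are hypothetical values chosen for illustration, and the sketch assumes the error is defined as (forecast - actual) / actual, so that negative errors mean underestimation. Indices are 0-based.

```python
from bisect import bisect_right

# Hypothetical interior boundaries that give 5 error intervals (assumed values)
EDGES = [-0.15, -0.05, 0.05, 0.15]
# Hypothetical tolerance for the "proper prediction" state (assumed value)
TOLERANCE = 0.10


def to_emissions(errors, edges=EDGES):
    """Return the 0-based index of the interval each error falls in."""
    return [bisect_right(edges, e) for e in errors]


def to_states(errors, tol=TOLERANCE):
    """Return 0 (underestimate), 1 (proper prediction), or 2 (overestimate).

    Assumes error = (forecast - actual) / actual.
    """
    states = []
    for e in errors:
        if e < -tol:
            states.append(0)
        elif e > tol:
            states.append(2)
        else:
            states.append(1)
    return states


errors = [-0.30, -0.10, 0.0, 0.07, 0.40]
print(to_emissions(errors))  # interval index per error
print(to_states(errors))     # state index per error
```

Errors outside the outermost boundaries are assigned to the first or last interval, which is one possible way to handle extreme errors.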

Substep 4. Estimate the transition matrix $A_k^j$ and the emission matrix $B_k^j$.
In this step, the maximum likelihood estimates of the transition matrix and the emission matrix are calculated from the known emission sequence and state sequence. This is easily done with the hmmestimate function in MATLAB (readers interested in the theory may consult Durbin, R., S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis, Cambridge, UK: Cambridge University Press, 1998), so we do not give a detailed description here.
Given an initial state distribution, the state sequence, and the emission sequence, we can estimate the transition matrix (see (6)) and the emission matrix (see (7)) for the HMM of the $j$th hour:
$$A_k^j = \left(a_{uv}\right)_{N \times N}, \tag{6}$$
where $a_{uv}$ is the transition probability from the $u$th state to the $v$th state and $N$ is the number of states, and
$$B_k^j = \left(b_{ur}\right)_{N \times R}, \tag{7}$$
where $b_{ur}$ is the probability of the $r$th emitting symbol under the state $s_u$ and $R$ is the number of emission classes.
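When both the state sequence and the emission sequence are known, the maximum likelihood estimates reduce to normalized counts. The sketch below illustrates that counting idea in plain Python; it is not a reimplementation of hmmestimate, whose handling of the initial state and of pseudocounts may differ. The function name and the 0-based indices are assumptions of this example.

```python
def estimate_hmm(states, emissions, n_states, n_symbols):
    """Estimate transition and emission matrices from known sequences.

    states[t] and emissions[t] are 0-based indices observed at step t.
    Rows with no observations are left as all zeros.
    """
    trans = [[0] * n_states for _ in range(n_states)]
    emis = [[0] * n_symbols for _ in range(n_states)]
    # Count transitions between consecutive states
    for cur, nxt in zip(states, states[1:]):
        trans[cur][nxt] += 1
    # Count which symbol was emitted in which state
    for s, o in zip(states, emissions):
        emis[s][o] += 1

    def normalize(rows):
        out = []
        for row in rows:
            total = sum(row)
            out.append([c / total if total else 0.0 for c in row])
        return out

    return normalize(trans), normalize(emis)


A, B = estimate_hmm([0, 0, 1, 1, 0], [0, 1, 1, 1, 0], n_states=2, n_symbols=2)
print(A)  # row u holds the transition probabilities out of state u
print(B)  # row u holds the emission probabilities in state u
```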

Substep 5. Obtain the probabilities of the next state.
In Substep 4, we obtained the transition matrix $A_k^j$ and the emission matrix $B_k^j$. As discussed previously, the matrix $A_k^j$ describes the probability of a transition from a given state to another state, and the matrix $B_k^j$ gives the probability of emitting each symbol under the different states. So for a given state $q_k^{i,j}$ (suppose $q_k^{i,j} = s_u$, $s_u \in S$) in the $j$th hour on the $i$th day, the probabilities of the next state (in the $j$th hour on the $(i+1)$th day) are given by the $u$th row of the transition matrix $A_k^j$.

Step 3. Calculate the probabilities of the next emission and estimate the possible error generated by the model.
The probabilities of the next state obtained in Substep 5 are multiplied by the emission matrix to calculate the probabilities of the next emission.
The emission probabilities are then transformed into a continuous possible error using the intervals defined in Substep 3.
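Substep 5 and Step 3 can be sketched together: the row of the transition matrix for the current state gives the next-state probabilities, which are multiplied by the emission matrix to give the next-emission probabilities. The text does not say how an interval is converted back into a continuous error, so this sketch assumes each interval is represented by a single value (for example its midpoint); the function names and the toy numbers are illustrative.

```python
def next_emission_probs(A, B, current_state):
    """Probability of each emission symbol at the next step.

    A[current_state] is the vector of next-state probabilities; it is
    multiplied by the emission matrix B (states x symbols).
    """
    next_state = A[current_state]
    n_symbols = len(B[0])
    return [
        sum(next_state[s] * B[s][r] for s in range(len(B)))
        for r in range(n_symbols)
    ]


def expected_error(emission_probs, representatives):
    """Expected error, assuming one representative value per interval."""
    return sum(p * v for p, v in zip(emission_probs, representatives))


# Toy model with 2 states and 2 error intervals (illustrative values only)
A = [[0.5, 0.5], [0.2, 0.8]]
B = [[1.0, 0.0], [0.5, 0.5]]
probs = next_emission_probs(A, B, current_state=0)
print(probs)
print(expected_error(probs, representatives=[-0.1, 0.1]))
```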

Step 4. Calculate the combining weights of the next step.
In this step, the combining weights of the different models are determined. The number of validation samples for which each individual model is in the proper state is used to evaluate the ability of that model: $c_k$ denotes the historical forecasting ability of $M_k$, and it is derived from $n_k$, the number of samples in state $s_2$ for model $M_k$ in the validation set.
The proper forecasting probability in the vector of next-state probabilities and the abilities are used to calculate the combining weights: $w_k$, the combining weight of model $M_k$ in the next step, is formed from the product of $c_k$ and $\rho_k$, where $\rho_k$ is the proper forecasting probability of $M_k$.
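A minimal sketch of this step is given below. It multiplies each model's historical ability by its proper forecasting probability and then normalizes the products so that the weights sum to one. The normalization, the fallback to equal weights, and the function name are assumptions of this sketch rather than details taken from the text.

```python
def combining_weights(abilities, proper_probs):
    """Weights proportional to ability * proper forecasting probability.

    Normalizing to a sum of one is an assumption of this sketch.
    """
    raw = [c * p for c, p in zip(abilities, proper_probs)]
    total = sum(raw)
    if total == 0:
        # Fall back to equal weights when no model is expected to do well
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


# Three hypothetical models (illustrative values only)
weights = combining_weights([0.5, 0.3, 0.2], [0.8, 0.5, 0.5])
print(weights)
```

A model with a strong validation record but a low probability of being in the proper state next gets a smaller weight than its record alone would suggest, which is the adaptive behavior the method aims for.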

Step 5. Get the combined forecast.
The forecasts from the individual models are averaged with the weights obtained in Step 4 to get a combined forecast, as shown by
$$\hat{p}_c = \sum_{k=1}^{K} w_k \hat{p}_k, \tag{10}$$
where $\hat{p}_c$ is the combined forecast and $\hat{p}_k$ is the forecast by $M_k$.

Step 6. Calculate the expected value of the forecasting error of the combined model.
The possible errors from the individual models are averaged with the combining weights obtained in Step 4 to get the possible error of the combined forecast, as shown by
$$\hat{e}_c = \sum_{k=1}^{K} w_k \hat{e}_k, \tag{11}$$
where $\hat{e}_c$ is the possible error of the combined forecast and $\hat{e}_k$ is the possible error from $M_k$.

Step 7. Obtain the final forecast by calibrating the combined forecast with the expected value of its error.
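Steps 5 to 7 can be sketched as follows. The weighted sums follow the description above. The calibration formula is an assumption: if the error is the relative error (forecast - actual) / actual, then removing an expected error e from a forecast f gives f / (1 + e). Other error definitions would lead to a different correction, and the function name and the numbers are illustrative.

```python
def calibrated_forecast(forecasts, possible_errors, weights):
    """Return (combined forecast, combined possible error, final forecast).

    Assumes weights sum to one and that errors are relative errors,
    error = (forecast - actual) / actual.
    """
    combined = sum(w * f for w, f in zip(weights, forecasts))
    combined_error = sum(w * e for w, e in zip(weights, possible_errors))
    # Assumed calibration: invert the relative error definition
    final = combined / (1.0 + combined_error)
    return combined, combined_error, final


# Two hypothetical models forecasting one hourly price
combined, err, final = calibrated_forecast(
    forecasts=[50.0, 60.0],
    possible_errors=[0.10, 0.0],
    weights=[0.5, 0.5],
)
print(combined, err, final)
```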

4. Numerical Results

4.1. Individual Models Building for Combined Forecast

The proposed method is validated on the PJM day-ahead electricity market. Considering that the electricity price is more volatile in summer than in other seasons, we apply the method to forecast the hourly prices of August 2010. The data from July serve as validation data, and the data from June serve as training data. New prices and loads are appended during modeling to adapt the models to new circumstances.

In order to consider influences from different aspects, we select various methods as modeling candidates: intelligent algorithms (SVM and generalized regression neural networks (GRNN)), time series models (GARCH and GARCHX), and two direct methods. One direct method is day-ahead modeling, in which we take the hourly price of the previous day as the forecasting price. The other direct method is SOM similar days modeling, in which we find the similar days in the validation set by SOM and then average the hourly prices of the similar days as the forecasting price. For simplicity, we use $M_1$ to $M_6$ to represent SVM, GRNN, GARCH, GARCHX, day-ahead modeling, and SOM similar days modeling, respectively.
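The day-ahead direct method is simple enough to state in a few lines. The sketch below assumes the prices are stored as one flat list of hourly values in time order; the function name and the sample numbers are illustrative.

```python
def day_ahead_forecast(hourly_prices, hours_per_day=24):
    """Forecast each hour with the price observed one day earlier.

    Returns (forecasts, actuals) aligned element by element; the first
    day has no forecast because no previous day is available.
    """
    forecasts = hourly_prices[:-hours_per_day]
    actuals = hourly_prices[hours_per_day:]
    return forecasts, actuals


# Shortened "days" of 2 hours each, to keep the example readable
prices = [30.0, 45.0, 32.0, 50.0, 31.0, 48.0]
forecasts, actuals = day_ahead_forecast(prices, hours_per_day=2)
print(forecasts)
print(actuals)
```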

The performance of intelligent algorithms is sensitive to the input, so the modeling data are pretreated to eliminate scale effects before modeling. All the numeric data are scaled to a common range by min-max normalization, in which $x'_{ij}$ is the scaled value of the $j$th attribute of the $i$th sample, $x_{ij}$ is the raw sample value, $\min_j$ is the minimum of the $j$th attribute over all the samples (all the data from 1 Jan. to July 31), and $\max_j$ is the maximum of the $j$th attribute over all the samples (ditto).
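A minimal sketch of per-attribute min-max scaling is shown below. The target range is assumed to be [0, 1] for this example, and the function name and the handling of constant attributes are likewise assumptions.

```python
def min_max_scale(rows):
    """Scale each attribute (column) to [0, 1] using its own min and max.

    rows: list of samples, each a list of numeric attributes.
    Constant attributes are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Three samples with two attributes each (illustrative values only)
print(min_max_scale([[10, 100], [20, 300], [30, 200]]))
```

In a forecasting setting the minimum and maximum should come from the historical data only, and the same values are then reused to scale new samples.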

The input for the SVM, GRNN, and SOM models consists of the forecast load of the target hour, lagged prices, and a day-type variable. The forecast load of the target hour can be predicted day-ahead with high accuracy, so here we use the actual load as the forecast load. The lagged prices are the price 24 hours before the target hour, and so on. The day-type variable is a daily variable reflecting how the price fluctuates across different day types; it is calculated from the $j$th hourly price on the $d$th day of the $w$th week in 2010. As the data from August 1 to August 31 are used to test the model, the prices of the previous 30 weeks are used to calculate it.

The exogenous variables in GARCHX are and .

The parameters of the individual models and their MAPE (mean absolute percentage error) on the validation set are listed in Table 1. Table 1 shows that SVM outperforms the other models, while the remaining models have similar results.
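For reference, the MAPE used to compare the models can be computed as in the following sketch, which uses the standard definition of the mean absolute percentage error; the function name and the sample values are illustrative.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes every actual price is nonzero.
    """
    ratios = [abs((f - a) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


print(mape([100.0, 200.0], [110.0, 180.0]))
```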

Figure 2 shows the distribution of the validation errors of the candidates. SVM clearly outperforms the other candidates, and day-ahead modeling, with its right-tailed distribution, differs from the others.

Table 2 lists the correlation coefficients between the validation errors of the candidates.

From Table 2, we can see that three of the candidates show high correlation. Since each model makes some contribution and we mainly focus on selecting the combining weights and on the error calibration, all six models are kept as candidates for the combined model.

4.2. Build HMM with Validation Errors of the Individual Models

In this part, HMMs are built from the validation error series of the selected models. As discussed in Step 2 of Section 3, the validation error sequence of each model is discretized to build the HMM. We divide the range in which most of the errors fall into 5 intervals.

Following the process described in detail in Substeps 3 and 4 of Section 3, we obtain the transition matrix and the emission matrix for each model.

4.3. Selecting Weights for Combined Forecast

Suppose that we have obtained all the forecasting errors of the 6 models for the $j$th hour on the $i$th day, as well as the forecasting prices for the $j$th hour on the $(i+1)$th day. For each model, we separate the range in which the errors fall into three zones, one for each state (underestimate, proper prediction, and overestimate), as discussed in Substep 3; the zones indicate the forecasting ability of the model under the given circumstances.

Given the state of a model in the $j$th hour on the $i$th day, the state probabilities for the same hour of the next day are easily obtained from its transition matrix. Then the probabilities and the historical abilities of the individual models are multiplied to generate the combination weights, according to (9). The combined forecast is obtained by multiplying the forecasts of the different models by the corresponding weights and summing, according to (10).

4.4. Calibrate the Combined Forecast with the Possible Combining Error

In this step, the possible next-step errors of the different candidates are estimated from their emission matrices, as described in Substep 4 in Section 3, and are used to calibrate the combined forecast, according to (11).

4.5. Performance Comparison and Analysis

Table 3 shows the performance of the different models. The combination model significantly outperforms all the individual models. $M_1$ and $M_2$ (SVM and GRNN) also forecast better than the other individual models, just as they did on the validation set.

Figure 3 compares the actual prices, the SVM forecast, and the combined forecast with error calibration. The forecasts of the other models are omitted from the figure for simplicity, since they are not as good as SVM. Figure 3 shows that most of the prices are properly predicted by the combination model. Some extreme prices are too low or too high to be captured by either model. SVM overestimates the hourly prices, especially for hours that had high prices on the previous day.

The effect of error calibration is shown in Figure 4. The colors in Figure 4 show the value of the forecasting error: red denotes a large positive error (strong overestimation) and blue a large negative error (marked underestimation), as the color bar on the right shows. The largest differences in forecasting error can be seen in the black circle in the right part of Figure 4. The calibration reduces the extreme errors (too high or too low) of the combined model; it also increases the number of proper forecasts, whose errors fall around zero and are displayed in the grey zone.

Figure 5 shows the error distributions of the different models. The minimum and maximum errors reveal that some prices are not properly forecasted by some models. A closer look shows that SOM similar day modeling and day-ahead modeling have more extreme errors and that SVM and GRNN perform better, but the calibrated model performs best. Figure 5 also shows that all models tend to overestimate the price, as the right tails of the histograms indicate. The errors of the combined model and the calibrated model are concentrated around zero, and nearly half of the prices (more than 300 points) are forecasted with a relative error in the central interval. SVM and GRNN also predict prices well, with more than 250 errors falling in that interval. Day-ahead modeling performs worst; it has the fewest errors falling in that interval.

5. Conclusions

This paper proposes a comprehensive HMM-based combined forecast technique for the day-ahead price. Several models (SVM, GRNN, GARCH, GARCHX, day-ahead modeling, and SOM similar day modeling) are selected as candidate models. The error distribution of each model is used to determine the states of the HMM, and the intervals in which the errors fall are marked as the emissions of the HMM. The state sequence and the emission sequence are then used to estimate the HMM. Given the state in the current hour, the state probabilities for the same hour of the next day can be obtained from the transition matrix. These probabilities feed into the combination weights of the combined forecast. The HMMs are thus used both to calculate the weights of the combined model and to calibrate its error.

The combined forecast can adapt to changing circumstances by adjusting its weights dynamically with the HMMs, and the error calibration technique helps to reduce the error generated by the combined model. The case study on forecasting summer prices in the PJM market shows that the proposed method outperforms the comparison methods, including SVM.