Advances in Meteorology

Volume 2016, Article ID 8760780, 10 pages

http://dx.doi.org/10.1155/2016/8760780

## Short-Term Wind Power Interval Forecasting Based on an EEMD-RT-RVM Model

^{1}College of Energy and Electrical Engineering, Hohai University, Nanjing 211100, China

^{2}State Grid Chang Zhou Power Supply Company, Changzhou 213000, China

Received 3 June 2016; Revised 6 October 2016; Accepted 25 October 2016

Academic Editor: Caroline Draxl

Copyright © 2016 Haixiang Zang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Accurate short-term wind power forecasting is important for improving the security and economic operation of power grids. Most existing wind power forecasting methods are deterministic point forecasting methods, which are vulnerable to forecasting errors and cannot effectively deal with the random nature of wind power. To address these problems, we propose a short-term wind power interval forecasting model based on ensemble empirical mode decomposition (EEMD), runs test (RT), and relevance vector machine (RVM). First, in order to reduce the complexity of the data, the original wind power sequence is decomposed by EEMD into several intrinsic mode function (IMF) components and a residual (RES) component. Next, we use the RT method to reconstruct these components into three new components arranged in fine-to-coarse order. Finally, we obtain the overall forecasting results (with preestablished confidence levels) by superimposing the forecasting results of each new component. Our results show that, compared with existing methods, the proposed short-term interval forecasting method has smaller forecasting errors, narrower interval widths, and larger interval coverage percentages. The model is therefore well suited to engineering applications and to forecasting for other forms of new energy.

#### 1. Introduction

Industrialization practices are rapidly depleting fossil fuel reserves. Moreover, widespread use of fossil fuels produces large amounts of greenhouse gases and dust particles, both of which have significant negative effects on human society and the environment [1–3]. In order to address the energy crisis and alleviate environmental pressures, many countries are researching and utilizing forms of renewable energy [4–6]. Wind power has become especially prominent in the field of renewable clean energy because it is pollution-free, reserve-rich, and readily renewable [5]. Continuous improvements in wind power technology have led to an increase in the number of wind-powered grids. However, wind power is also random and volatile, and any serious power disturbances can affect the safety and stability of wind-powered grids. As such, accurate wind power forecasting is necessary for creating reasonable generation plans and system backup arrangements [7–9]. Ultimately, the key to increasing the number of wind-powered grids is to improve the wind power penetration limit of power grids.

Recent research and studies have greatly improved short-term wind power forecasting. Many methods, such as the time series method [7, 10–12], Kalman filtering [13], model structure selection [14], fuzzy logic method [15], the artificial neural networks (ANNs) method [16–19], wavelet transformation [20], and support vector machines [21] have been utilized for wind power forecasting. Additionally, other combined methods have become popular in recent years [22–25].

The stochastic volatility of natural wind and its effects on wind-powered grids cannot be ignored. Deterministic point forecasting methods have some deficiencies in characterizing the randomness of actual wind power [26], whereas interval forecasting can effectively reflect the uncertainties in the forecasting results. If we can establish a forecasting method capable of providing accurate interval forecasts, we will better understand potential fluctuations in wind power, which will allow for the creation of standby arrangements for power systems [27, 28]. Compared to deterministic point forecasting, interval forecasting is still in its infancy, although it has been studied more in recent years and various interval forecasting methods have been proposed. The existing interval forecasting methods include the bootstrap method, the quantile regression method, the mixed structure interval method, and the probability interval forecasting method. The bootstrap method [29] constructs a sample set based on computer resampling technology, which requires a large amount of original data processing and is time-consuming and computationally expensive. The quantile regression method [30–32] rests on a rigorous theoretical foundation and yields reliable results; however, it requires a predetermined regression model and subsites, involves complicated calculations, and loses forecasting accuracy significantly as the number of predicting samples increases. The mixed structure interval method [33, 34] is usually based on point forecasting results, with the interval result being determined by the calculation of coefficients and error analysis. The probability interval forecasting method [35–38] constructs a load distribution so as to directly obtain the expectations and forecasting distribution of the load. The forecasting interval can then be drawn under an arbitrarily determined confidence level.

In order to establish a simpler and more accurate short-term interval forecasting method, we propose a combined model based on ensemble empirical mode decomposition (EEMD), runs test (RT) reconstruction, and a relevance vector machine (RVM). The model works within the framework of the probabilistic interval forecasting method in order to achieve short-term interval wind power forecasts. First, RVM (a relatively new machine learning algorithm) combines the Markov property, Bayes' theorem, the automatic relevance determination prior, and maximum likelihood theory. Compared with ANN and SVM, not only does RVM have the advantages of higher model sparsity, fewer restrictions on the kernel function, and stronger generalization ability, it can also obtain probabilistic forecasting results within the framework of Bayesian theory and statistical learning theory [39]. Second, in order to improve the forecasting accuracy of our model and narrow its interval width, we improved two aspects: data decomposition preprocessing and model parameter optimization. The EEMD is used to decompose the original wind power sequence into a series of IMF components and a RES component in order to reduce its complexity. The RT method is then used to reconstruct these IMF components and the RES component into a trend component, a detailed component, and a random component. Finally, a combination of a typical local kernel (the RBF kernel) and a typical global kernel (the polynomial kernel) is used to obtain better forecasting results.
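As a rough illustration of the mixed kernel idea, the following Python sketch forms a weighted sum of an RBF kernel and a polynomial kernel. The convex weighting `weight`, the function and parameter names, and all default values are assumptions made for illustration; the text does not state how the two kernels are combined or tuned.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Local kernel: exp(-gamma * ||x - y||^2) for every row pair of X and Y."""
    d2 = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(Y ** 2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative round-off

def poly_kernel(X, Y, degree=2, coef0=1.0):
    """Global kernel: (x . y + coef0)^degree for every row pair of X and Y."""
    return (X @ Y.T + coef0) ** degree

def combined_kernel(X, Y, weight=0.5, gamma=1.0, degree=2, coef0=1.0):
    """Convex combination of the two kernels (the weighting scheme is an assumption)."""
    return (weight * rbf_kernel(X, Y, gamma)
            + (1.0 - weight) * poly_kernel(X, Y, degree, coef0))
```

With `weight` between 0 and 1 the result remains a valid kernel, since a nonnegative weighted sum of positive semidefinite kernels is positive semidefinite.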

Our proposed EEMD-RT-RVM model is used to produce one-step-ahead (15 min ahead) short-term wind power interval forecasts. We used a variety of evaluation indexes to conduct comparative analyses and impact assessments for both our proposed model and other existing models. The results show that our combined model obtains higher forecasting accuracy and narrower interval widths than other existing methods. As such, our proposed model has high research significance and practical value.

#### 2. Methodologies

##### 2.1. Empirical Mode Decomposition (EMD)

EMD is an efficient signal decomposition method that does not rely on any predefined basis function. The EMD reflects the dynamics of signals more accurately than other models. The modes extracted by the EMD, named the intrinsic mode functions (IMF), are defined by the following criteria: (1) the number of extrema and the number of zero crossings must be equal or differ by no more than one and (2) the local mean of the envelopes defined by the local maxima and local minima must be zero [40, 41]. These two criteria ensure that each IMF has a physically meaningful phase definition; however, the time invariant frequency does not necessarily have a meaningful phase definition.

Given a signal $x(t)$, the EMD algorithm can be summarized as follows.

*Step 1.* Initialize the loop variable $i = 1$ and set $r_0(t) = x(t)$, where $x(t)$ is the given original data.

*Step 2.* Initialize the loop variable $j = 0$ and set $h_0(t) = r_{i-1}(t)$.

*Step 3.* Find all the local minima and maxima of $h_j(t)$, and interpolate between the local maxima and between the local minima, respectively, in order to get an upper envelope $u_j(t)$ and a lower envelope $l_j(t)$. The mean value of these envelopes is described as

$$m_j(t) = \frac{u_j(t) + l_j(t)}{2}.$$

Next, compute the difference between the current data and the envelope mean value as

$$h_{j+1}(t) = h_j(t) - m_j(t).$$

*Step 4.* Check whether $h_{j+1}(t)$ satisfies the two criteria for an IMF (as defined above). If it is not satisfied, set $j = j + 1$ and repeat Step 3. If it is satisfied, the $i$th IMF can be given as

$$c_i(t) = h_{j+1}(t).$$

The residual can be computed by

$$r_i(t) = r_{i-1}(t) - c_i(t).$$

*Step 5.* Treat $r_i(t)$ as a new signal, set $i = i + 1$, and repeat Steps 2–4 (in order to find more IMFs) until the residual is a constant or a monotonic function. Finally, the given $x(t)$ can be decomposed into $n$ IMFs and a final residual $r_n(t)$ as follows:

$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t).$$
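The sifting procedure in Steps 1–5 can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: envelopes are built with linear interpolation (`np.interp`) rather than the splines commonly used in practice, and the two IMF criteria are replaced by a relative-change stopping rule. The function names and the parameters `max_imfs`, `max_sift`, and `tol` are hypothetical.

```python
import numpy as np

def local_extrema(h):
    """Indices of strict interior local maxima and minima of h."""
    d = np.diff(h)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima

def envelope_mean(h):
    """Mean of the upper and lower envelopes, or None if there are too few extrema."""
    maxima, minima = local_extrema(h)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    t = np.arange(len(h))
    upper = np.interp(t, maxima, h[maxima])  # linear envelope (assumption)
    lower = np.interp(t, minima, h[minima])
    return 0.5 * (upper + lower)

def emd(x, max_imfs=10, max_sift=50, tol=0.05):
    """Return (imfs, residual) such that imfs.sum(axis=0) + residual equals x."""
    x = np.asarray(x, dtype=float)
    imfs = []
    r = x.copy()                               # Step 1: residual starts as the signal
    for _ in range(max_imfs):
        if envelope_mean(r) is None:           # Step 5: too few extrema left, stop
            break
        h = r.copy()                           # Step 2
        for _ in range(max_sift):
            m = envelope_mean(h)               # Step 3: envelope mean
            if m is None:
                break
            h_new = h - m                      # Step 3: subtract the mean
            # Step 4 stand-in (assumption): stop when one sift changes h only slightly
            change = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
            h = h_new
            if change < tol:
                break
        imfs.append(h)                         # Step 4: accept h as an IMF
        r = r - h                              # Step 4: update the residual
    return np.array(imfs), r
```

Whatever stopping rule is used, the decomposition is additive by construction, so summing the returned IMFs and the residual recovers the input signal.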

##### 2.2. Ensemble Empirical Mode Decomposition (EEMD)

Mode mixing is the most significant drawback of EMD. Mode mixing implies either a single IMF consisting of signals of dramatically disparate scales or a signal of the same scale appearing in different IMF components. This causes intermittency when analyzing signals.

In order to solve the problem of mode mixing in EMD, Wu and Huang proposed a new noise-assisted data analysis method called ensemble empirical mode decomposition (EEMD) [42]. The EEMD method utilizes recent studies on white noise which showed that the EMD method is an effective self-adaptive dyadic filter bank when applied to white noise. The results demonstrate that noise can help data analysis in the EMD method [22, 43].

Two important parameters used in the EEMD method are (1) the amplitude of white noise $k$ and (2) the total repeat number of the EMD $M$. At present, the determination of $k$ and $M$ is based on the structural characteristics of the data. Generally, $M$ is taken as 100, and $k$ is chosen from a range of 0.05–0.5. The values of $k$ and $M$ used in this paper were chosen on the basis of previous tests.

The specific steps of the EEMD can be described as follows:

(1) Set the value of the amplitude $k$ and the total repeat number $M$.

(2) Add a white noise series to the signal.

(3) Decompose the signal with the added white noise into IMFs by using EMD.

(4) Repeat steps (2) and (3) $M$ times, using a different white noise series each time, and obtain the corresponding IMF components of the decomposition. Calculate the mean of all the corresponding IMF components and take the mean as the final result for each IMF. Calculate the mean of all the residual (RES) components and take the mean as the final result for the RES component:

$$\bar{c}_j(t) = \frac{1}{M}\sum_{i=1}^{M} c_{ij}(t), \qquad \bar{r}(t) = \frac{1}{M}\sum_{i=1}^{M} r_i(t),$$

where $c_{ij}(t)$ and $r_i(t)$ are the $j$th IMF and the residual obtained in the $i$th trial.

(5) Take the $\bar{c}_j(t)$ ($j = 1, 2, \ldots, n$) and $\bar{r}(t)$ as the IMF components and RES component, respectively.
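The ensemble averaging in steps (1)–(5) can be sketched as below. To keep the block self-contained it includes a deliberately crude EMD (linear envelopes, a fixed number of sifts, and a fixed number of IMF rows so that components align across trials); these simplifications, the symbol `k` for the noise amplitude, the scaling of the noise by the standard deviation of the signal, and the default values `k=0.2` and `n_imfs=4` are all assumptions for illustration, not settings taken from the paper.

```python
import numpy as np

def _envelope_mean(h):
    """Mean of linear upper/lower envelopes, or None if there are too few extrema."""
    d = np.diff(h)
    mx = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    mn = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    if len(mx) < 2 or len(mn) < 2:
        return None
    t = np.arange(len(h))
    return 0.5 * (np.interp(t, mx, h[mx]) + np.interp(t, mn, h[mn]))

def emd_fixed(x, n_imfs, n_sift=10):
    """Crude EMD that always returns n_imfs rows (zero rows once sifting stops)."""
    imfs = np.zeros((n_imfs, len(x)))
    r = np.array(x, dtype=float)
    for i in range(n_imfs):
        if _envelope_mean(r) is None:
            break
        h = r.copy()
        for _ in range(n_sift):            # fixed sift count (assumption)
            m = _envelope_mean(h)
            if m is None:
                break
            h = h - m
        imfs[i] = h
        r = r - h
    return imfs, r

def eemd(x, n_imfs=4, k=0.2, M=100, seed=0):
    """Average the IMFs and residuals of M noise-perturbed EMD runs."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    noise_std = k * np.std(x)              # amplitude relative to signal std (assumption)
    imf_sum = np.zeros((n_imfs, len(x)))
    res_sum = np.zeros(len(x))
    for _ in range(M):
        noisy = x + rng.normal(0.0, noise_std, size=len(x))  # step (2)
        imfs, res = emd_fixed(noisy, n_imfs)                  # step (3)
        imf_sum += imfs
        res_sum += res
    return imf_sum / M, res_sum / M        # step (4): ensemble means
```

Because the added noise has zero mean, the averaged components sum to the original signal plus a small remainder that shrinks as `M` grows.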

##### 2.3. Runs Test (RT)

The runs test method [44] is defined as follows.

Denote the time series of each component obtained by the EEMD (the IMFs and the RES) by $c_i(t)$, $t = 1, 2, \ldots, N$, where $i$ is the label of the component, $t$ is the label of the sample, and $N$ is the total number of samples. The mean value of the samples is defined as

$$\bar{c}_i = \frac{1}{N}\sum_{t=1}^{N} c_i(t).$$

Then, the timing symbol can be defined as

$$s_i(t) = \begin{cases} 1, & c_i(t) \ge \bar{c}_i, \\ 0, & c_i(t) < \bar{c}_i, \end{cases}$$

where $s_i(t)$ forms a sequence of 0s and 1s that is treated as statistically independent and randomly arranged.

Define each maximal block of identical successive symbols (all 0 or all 1) as a run. The total number of runs in each symbol sequence can be used to measure the fluctuation of the corresponding component obtained by the EEMD: a component with many runs fluctuates rapidly, whereas a component with few runs varies slowly. Next, high and low thresholds on the number of runs are set, and the components decomposed by the EEMD are reconstructed into three new components (with typical characteristics based on the fine-to-coarse order) [44]. This preserves the decomposition effect and significantly reduces the run time of the model, because fewer components need to be forecast. Moreover, reconstructing similar components together strengthens the inherent regularities of the data and thereby improves the prediction accuracy.
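A minimal Python sketch of the run counting and threshold-based grouping described above is given here. The tie rule (values equal to the mean map to 1), the function names, the threshold arguments `low` and `high`, and the mapping of run counts to the labels "random", "detail", and "trend" are assumptions for illustration; the thresholds actually used are not given in this section.

```python
def count_runs(series):
    """Number of runs in the 0/1 sequence obtained by comparing each value to the mean."""
    mean = sum(series) / len(series)
    symbols = [1 if v >= mean else 0 for v in series]  # ties map to 1 (assumption)
    runs = 1
    for prev, cur in zip(symbols, symbols[1:]):
        if cur != prev:                                # a symbol change starts a new run
            runs += 1
    return runs

def group_by_runs(components, low, high):
    """Assign component indices to three groups by run count (thresholds are hypothetical)."""
    groups = {"random": [], "detail": [], "trend": []}
    for idx, comp in enumerate(components):
        n = count_runs(comp)
        if n > high:
            groups["random"].append(idx)   # many runs: fast fluctuation
        elif n >= low:
            groups["detail"].append(idx)   # intermediate number of runs
        else:
            groups["trend"].append(idx)    # few runs: slow variation
    return groups
```

Each new component would then be obtained by summing, sample by sample, the EEMD components whose indices fall into the same group.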

##### 2.4. Relevance Vector Machine (RVM)

Compared with other forecasting algorithms, the RVM not only has high sparsity, fewer parameters to optimize, flexible kernels, and strong generalization abilities, but also directly implements interval forecasting [45, 46]. Therefore, in this study, the RVM is used to establish the interval forecasting model for the new components reconstructed by RT.

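For a given set of input training samples $\{x_i\}_{i=1}^{N}$ and the corresponding output set $\{t_i\}_{i=1}^{N}$, the relevance vector machine regression model can be defined as follows:

$$t_i = \sum_{j=1}^{N} w_j K(x_i, x_j) + w_0 + \varepsilon_i,$$

where $\varepsilon_i$ is the error of the independent sample (which follows the zero-mean Gaussian distribution with the variance $\sigma^2$), $w_j$ is the model weight, $K(\cdot, \cdot)$ is a nonlinear kernel function, $x_j$ is the relevance vector, and $N$ is the length of the data.

In the RVM, a priori probability distribution for each model weight is given as

$$p(w \mid \alpha) = \prod_{j=0}^{N} \mathcal{N}\left(w_j \mid 0, \alpha_j^{-1}\right),$$

where $\alpha_j$ is the hyperparameter of the a priori distribution of model weight $w_j$.

Given a training sample set $\{x_i, t_i\}_{i=1}^{N}$, assume the target values $t_i$ are independent and the noise in the data follows the Gaussian distribution with the variance $\sigma^2$. Then, the likelihood function of the training sample set can be represented by

$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\lVert t - \Phi w \rVert^2}{2\sigma^2}\right),$$

where $t = (t_1, \ldots, t_N)^{T}$, $w = (w_0, \ldots, w_N)^{T}$, and $\Phi$ is the $N \times (N+1)$ design matrix given by

$$\Phi = \left[\phi(x_1), \phi(x_2), \ldots, \phi(x_N)\right]^{T}, \qquad \phi(x_i) = \left[1, K(x_i, x_1), \ldots, K(x_i, x_N)\right]^{T}.$$

Based on the a priori probability distribution and the likelihood distribution, the posterior distribution over the weights follows from Bayes' rule and can be written as

$$p(w \mid t, \alpha, \sigma^2) = \mathcal{N}(w \mid \mu, \Sigma),$$

where $\Sigma = \left(\sigma^{-2}\Phi^{T}\Phi + A\right)^{-1}$; $\mu = \sigma^{-2}\Sigma\Phi^{T}t$; $A = \operatorname{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$.

The marginal likelihood distribution of the hyperparameters can be obtained by

$$p(t \mid \alpha, \sigma^2) = \mathcal{N}(t \mid 0, C),$$

where $C = \sigma^2 I + \Phi A^{-1} \Phi^{T}$.

Finally, the hyperparameter $\alpha$ and the variance $\sigma^2$ can be estimated by using the maximum likelihood algorithm; denote the estimates by $\alpha_{MP}$ and $\sigma_{MP}^2$.

If the input value is $x_*$, then the corresponding output probability distribution obeys the Gaussian distribution, and the corresponding forecasting value can be derived by

$$p(t_* \mid t, \alpha_{MP}, \sigma_{MP}^2) = \mathcal{N}(t_* \mid y_*, \sigma_*^2), \qquad y_* = \mu^{T}\phi(x_*), \qquad \sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\Sigma\,\phi(x_*).$$

The RVM model can give both the mean value and the variance. As such, this model reflects the uncertainty of forecasting results and provides interval forecasts at a chosen confidence level.

Under the confidence level of $100(1-\gamma)\%$, the interval forecasting results are as follows:

$$Lb = y_* - z_{\gamma/2}\,\sigma_*, \qquad Ub = y_* + z_{\gamma/2}\,\sigma_*,$$

where Lb denotes the lower bound of the forecasting value, Ub denotes the upper bound of the forecasting value, and $z_{\gamma/2}$ is the upper $\gamma/2$ quantile of the standard normal distribution.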
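The way an RVM turns its Gaussian predictive distribution into an interval can be sketched as follows, using the standard RVM posterior expressions: covariance $(\sigma^{-2}\Phi^{T}\Phi + A)^{-1}$, mean $\sigma^{-2}\Sigma\Phi^{T}t$, and predictive variance equal to the noise variance plus $\phi(x_*)^{T}\Sigma\,\phi(x_*)$. The sketch assumes the hyperparameters `alpha` and the noise variance `sigma2` have already been estimated, and it does not implement the maximum likelihood estimation itself; the function names and the `confidence` argument are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def rvm_posterior(Phi, t, alpha, sigma2):
    """Posterior mean and covariance of the weights for fixed hyperparameters."""
    A = np.diag(alpha)
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + A)
    mu = Sigma @ Phi.T @ t / sigma2
    return mu, Sigma

def predict_interval(phi_star, mu, Sigma, sigma2, confidence=0.9):
    """Two-sided Gaussian interval (Lb, Ub) for one new basis vector phi_star."""
    mean = float(phi_star @ mu)
    var = float(sigma2 + phi_star @ Sigma @ phi_star)  # noise + weight uncertainty
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)   # standard normal quantile
    half_width = z * var ** 0.5
    return mean - half_width, mean + half_width
```

A higher confidence level gives a larger quantile and therefore a wider interval, which is the trade-off between coverage and interval width that interval forecasting has to balance.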

#### 3. Model Construction

##### 3.1. Using EEMD to Decompose an Original Wind Power Sequence

In order to verify the effectiveness of the forecasting model, a full-year wind power sequence (96 points per day, i.e., one point every 15 min) obtained from a wind farm in Jiangsu province is used as the research object. The wind farm has an installed capacity of 49.5 MW and contains 33 wind turbines. In this study, the actual wind power data of the preceding 5 days are taken as the training sample. We then establish the wind power interval forecasting model to produce 15 min ahead forecasts for the next day.

As shown in Figure 1(a), the actual wind power is random and volatile. In order to improve the forecasting effect, it is necessary to reduce the complexity of the data. Compared with other decomposition algorithms, the EEMD exhibits better noise robustness and decomposition performance. In this study, we use the EEMD to decompose the actual wind power into specific components in which the periodicity, randomness, and trends of the actual wind power can be clearly seen. The decomposition results of EEMD are shown in Figure 1.