Abstract

Accurate short-term wind power forecasting is important for improving the security and economic efficiency of power grids. Most existing wind power forecasting methods are deterministic point forecasting methods, which are vulnerable to forecasting errors and cannot effectively deal with the random nature of wind power. To solve these problems, we propose a short-term wind power interval forecasting model based on ensemble empirical mode decomposition (EEMD), runs test (RT), and relevance vector machine (RVM). First, in order to reduce the complexity of the data, the original wind power sequence is decomposed into several intrinsic mode function (IMF) components and a residual (RES) component by using EEMD. Next, we use the RT method to reconstruct the components and obtain three new components arranged in fine-to-coarse order. Finally, we obtain the overall forecasting results (with preestablished confidence levels) by superimposing the forecasting results of each new component. Our results show that, compared with existing methods, our proposed short-term interval forecasting method has smaller forecasting errors, narrower interval widths, and larger interval coverage percentages. Ultimately, our forecasting model is well suited to engineering applications and can serve as a reference for forecasting other new energy sources.

1. Introduction

Industrialization practices are rapidly depleting fossil fuel reserves. Moreover, widespread use of fossil fuels produces large amounts of greenhouse gases and dust particles, both of which have significant negative effects on human society and the environment [1–3]. In order to address the energy crisis and alleviate environmental pressures, many countries are researching and utilizing forms of renewable energy [4–6]. Wind power has become especially prominent in the field of renewable clean energy because it is pollution-free, reserve-rich, and readily renewable [5]. Continuous improvements in wind power technology have led to an increase in the number of wind-powered grids. However, wind power is also random and volatile, and any serious power disturbances can affect the safety and stability of wind-powered grids. As such, accurate wind power forecasting is necessary for creating reasonable generation plans and system backup arrangements [7–9]. Ultimately, the key to increasing the number of wind-powered grids is to improve the wind power penetration limit of power grids.

Recent studies have greatly improved short-term wind power forecasting. Many methods, such as the time series method [7, 10–12], Kalman filtering [13], model structure selection [14], the fuzzy logic method [15], the artificial neural networks (ANNs) method [16–19], wavelet transformation [20], and support vector machines [21], have been utilized for wind power forecasting. Additionally, combined methods have become popular in recent years [22–25].

The stochastic volatility of natural wind and its effects on wind-powered grids cannot be ignored. Deterministic point forecasting methods have some deficiencies in characterizing the randomness of actual wind power [26], whereas interval forecasting can effectively reflect the uncertainties in the forecasting results. Therefore, it is necessary to establish a forecasting method that is capable of efficiently providing accurate interval information. With accurate interval forecasts, we can better understand potential fluctuations in wind power, which supports the creation of standby arrangements for power systems [27, 28]. Compared to deterministic point forecasting, interval forecasting is still in its infancy, but it has received more attention in recent years, and various interval forecasting methods have been proposed. The existing interval forecasting methods include the bootstrap method, the quantile regression method, the mixed structure interval method, and the probability interval forecasting method. The bootstrap method [29] constructs a sample set by computer resampling, which requires processing a large amount of original data and is costly in time and computation. The quantile regression method [30–32] has a rigorous theoretical background and yields reliable results; however, it requires a predetermined regression model and quantile points, involves complicated calculations, and its forecasting accuracy is significantly reduced when the number of predicting samples increases. The mixed structure interval method [33, 34] is usually based on point forecasting results, with the interval result being determined by the calculation of coefficients and error analysis. The probability interval forecasting method [35–38] constructs a load distribution so as to directly obtain the expectation and forecasting distribution of the load. The forecasting interval can then be drawn under an arbitrarily determined confidence level.

In order to establish a simpler and more accurate short-term interval forecasting method, we propose a combined model based on ensemble empirical mode decomposition (EEMD) and a relevance vector machine (RVM). Our model works within the framework of the probabilistic interval forecasting method and uses the runs test (RT) reconstruction method in order to achieve short-term interval wind power forecasts. First, RVM (a relatively new machine learning algorithm) combines the Markov property, the Bayesian theorem, the automatic relevance determination prior, and maximum likelihood theory. Compared with ANN and SVM, RVM not only has the advantages of higher model sparsity, fewer kernel function restrictions, and stronger generalization ability, but can also obtain probabilistic forecasting results within the framework of the Bayesian theory and the statistical learning theory [39]. Second, in order to improve the forecasting accuracy of our model and narrow its interval width, we improved two aspects: data decomposition preprocessing and model parameter optimization. The EEMD is used to decompose the original wind power sequence into a series of IMF components and a RES component in order to reduce its complexity. The RT method is then used to reconstruct these IMF components and the RES component into a trend component, a detailed component, and a random component. Finally, a combination of a typical local kernel (the RBF kernel) and a global kernel (the polynomial kernel) is used to obtain better forecasting results.

Our proposed EEMD-RT-RVM model is used for one-step-ahead (15 min ahead) short-term wind power interval forecasting. We used a variety of evaluation indexes to conduct comparative analyses and impact assessments for both our proposed model and other existing models. The results show that our combined model obtains higher forecasting accuracy and narrower interval widths than other existing methods. As such, our proposed model has high research significance and practical value.

2. Methodologies

2.1. Empirical Mode Decomposition (EMD)

EMD is an efficient signal decomposition method that does not rely on any predefined basis function. The EMD reflects the dynamics of signals more accurately than other models. The modes extracted by the EMD, named the intrinsic mode functions (IMF), are defined by the following criteria: (1) the number of extrema and zero crossings must be equal or differ by no more than one and (2) the local mean of the envelope defined by the local maxima and local minima must be zero [40, 41]. These two criteria ensure that each IMF has a physically meaningful phase definition; however, the time invariant frequency does not necessarily have a meaningful phase definition.

Given a signal $x(t)$, the EMD algorithm can be summarized as follows.

Step 1. Initialize the loop variable $i = 1$ and set $r_0(t) = x(t)$, where $x(t)$ is the given original data.

Step 2. Initialize the loop variable $j = 1$ and set $h_0(t) = r_{i-1}(t)$.

Step 3. Find all the local minima and maxima of $h_{j-1}(t)$, and interpolate between the local maxima and between the local minima, respectively, in order to get an upper envelope $e_{\max}(t)$ and a lower envelope $e_{\min}(t)$. The mean value of these envelopes is described as

$$m_{j-1}(t) = \frac{e_{\max}(t) + e_{\min}(t)}{2}.$$

Next, compute the difference between the current data and the envelope mean value as

$$h_j(t) = h_{j-1}(t) - m_{j-1}(t).$$

Step 4. Check whether $h_j(t)$ satisfies the two criteria for an IMF (as defined above). If it does not, set $j = j + 1$ and repeat Step 3. If it does, the $i$th IMF can be given as

$$c_i(t) = h_j(t).$$

The residual can be computed by

$$r_i(t) = r_{i-1}(t) - c_i(t).$$

Step 5. Treat $r_i(t)$ as the new signal and repeat Steps 1–4 for the next value of $i$ (in order to find more IMFs) until the residual is a constant or a monotonic function. Finally, the given $x(t)$ can be decomposed into $n$ IMFs and a final residual $r_n(t)$ as follows:

$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t).$$
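The sifting pass in Step 3 can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the envelopes here are piecewise linear with flat ends (an assumption made to keep the example dependency-free, whereas EMD implementations normally use cubic splines), and the function names `local_extrema`, `envelope`, and `sift_once` are hypothetical.

```python
def local_extrema(x):
    """Return indices of interior local maxima and minima of a sequence."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption: linear interpolation with flat extension at both ends.
    """
    env = []
    for t in range(len(x)):
        if t <= idx[0]:
            env.append(x[idx[0]])
        elif t >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            # Locate the pair of neighbouring extrema that brackets t.
            for a, b in zip(idx, idx[1:]):
                if a <= t <= b:
                    w = (t - a) / (b - a)
                    env.append((1 - w) * x[a] + w * x[b])
                    break
    return env


def sift_once(h):
    """One sifting pass: subtract the mean of the two envelopes from h.

    Returns (new_h, envelope_mean), or None when h has no interior
    maxima or minima and therefore cannot be sifted further.
    """
    maxima, minima = local_extrema(h)
    if not maxima or not minima:
        return None
    upper = envelope(h, maxima)
    lower = envelope(h, minima)
    mean = [(u + l) / 2 for u, l in zip(upper, lower)]
    return [hv - m for hv, m in zip(h, mean)], mean
```

Repeating `sift_once` until the IMF criteria hold, then subtracting the resulting IMF from the signal, gives Steps 3 to 5; the stopping test itself is left out of this sketch.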

2.2. Ensemble Empirical Mode Decomposition (EEMD)

Mode mixing is the most significant drawback of EMD. Mode mixing implies either a single IMF consisting of signals of dramatically disparate scales or a signal of the same scale appearing in different IMF components. It typically arises when the analyzed signal contains intermittent components.

In order to solve the problem of mode mixing in EMD, Wu and Huang proposed a new noise-assisted data analysis method called ensemble empirical mode decomposition (EEMD) [42]. The EEMD method utilizes recent studies on white noise which showed that the EMD method is an effective self-adaptive dyadic filter bank when applied to white noise. The results demonstrate that noise can help data analysis in the EMD method [22, 43].

Two important parameters used in the EEMD method are (1) the amplitude $a$ of the added white noise and (2) the total repeat number $M$ of the EMD. At present, the determination of $a$ and $M$ is based on the structural characteristics of the data. Generally, $M$ is taken as 100, and $a$ is chosen from a range of 0.05–0.5. The values of $a$ and $M$ used in this paper were set on the basis of previous tests.

The specific steps of the EEMD can be described as follows:

(1) Set the value of the amplitude $a$ and the total repeat number $M$.

(2) Add a white noise series to the signal.

(3) Decompose the signal with the added white noise into IMFs by using EMD.

(4) Repeat steps (2) and (3) $M$ times, using a different white noise series each time, and obtain the corresponding IMF components of each decomposition. Calculate the mean of all the corresponding IMF components and take the mean as the final result for each IMF. Calculate the mean of all the residual (RES) components and take the mean as the final result for the RES component:

$$\bar{c}_i(t) = \frac{1}{M}\sum_{m=1}^{M} c_{i,m}(t), \qquad \bar{r}(t) = \frac{1}{M}\sum_{m=1}^{M} r_m(t),$$

where $c_{i,m}(t)$ and $r_m(t)$ are the $i$th IMF and the residual obtained in the $m$th repetition.

(5) Take $\bar{c}_i(t)$ ($i = 1, 2, \ldots, n$) and $\bar{r}(t)$ as the IMF components and RES component, respectively.
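The ensemble averaging in steps (1) to (5) can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this study: `decompose` is a hypothetical callable standing in for EMD, `toy_decompose` is a deliberately crude placeholder so that the example runs on its own, and the noise standard deviation is assumed to be the amplitude multiplied by the standard deviation of the signal.

```python
import random
import statistics


def eemd(signal, decompose, amplitude=0.2, trials=100, seed=0):
    """Average the components returned by `decompose` over noisy copies.

    `decompose(x)` is a hypothetical callable returning a list of
    components (IMFs followed by the residual), each as long as x.
    Assumption: it returns the same number of components on every call;
    a real EMD may not, and the counts would then need to be aligned.
    """
    rng = random.Random(seed)
    noise_std = amplitude * statistics.pstdev(signal)
    sums = None
    for _ in range(trials):
        # Step (2): add a fresh white noise series to the signal.
        noisy = [v + rng.gauss(0.0, noise_std) for v in signal]
        # Step (3): decompose the noisy signal.
        comps = decompose(noisy)
        if sums is None:
            sums = [[0.0] * len(signal) for _ in comps]
        for acc, comp in zip(sums, comps):
            for t, v in enumerate(comp):
                acc[t] += v
    # Step (4): ensemble mean of each component.
    return [[v / trials for v in acc] for acc in sums]


def toy_decompose(x, width=5):
    """Placeholder for EMD: a fast part plus a moving-average part."""
    half = width // 2
    smooth = []
    for t in range(len(x)):
        window = x[max(0, t - half): t + half + 1]
        smooth.append(sum(window) / len(window))
    fast = [a - b for a, b in zip(x, smooth)]
    return [fast, smooth]
```

Because the added noise has zero mean, the averaged components sum to approximately the original signal, and the remaining noise shrinks as the number of trials grows.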

2.3. Runs Test (RT)

The runs test method [44] is defined as follows.

Assume the time series corresponding to $\mathrm{IMF}_i$ and RES is $x_i(t)$ ($t = 1, 2, \ldots, N$), where $i$ is the label of the IMF, $t$ is the label of the samples, and $N$ is the total number of samples. The mean value of the samples is defined as

$$\bar{x}_i = \frac{1}{N}\sum_{t=1}^{N} x_i(t).$$

Then, the timing symbol $s_i(t)$ can be defined as

$$s_i(t) = \begin{cases} 1, & x_i(t) \ge \bar{x}_i, \\ 0, & x_i(t) < \bar{x}_i, \end{cases}$$

where $s_i(t)$ consists of a series of statistically independent randomly arranged sequences of 0 and 1.

Each maximal sequence of identical successive symbols (0 or 1) is defined as a run. The total number of runs of each $s_i(t)$ can be used to measure the fluctuation of each component obtained by the EEMD. Next, the high and low runs test thresholds can be set according to these run counts, and the components decomposed by the EEMD are reconstructed into three new components (with typical characteristics, in fine-to-coarse order) [44]. This preserves the decomposition effect and significantly reduces the run time of the model. Moreover, merging similar components reinforces the regularities in the data, which improves the prediction accuracy.
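Counting runs takes only a few lines of Python. The sketch below assumes that a sample equal to the mean is mapped to 1; the function name `count_runs` is hypothetical.

```python
def count_runs(series):
    """Number of runs in the 0/1 sequence obtained by comparing each
    sample of a non-empty series with the series mean.

    Assumption: samples equal to the mean are mapped to 1.
    """
    mean = sum(series) / len(series)
    symbols = [1 if v >= mean else 0 for v in series]
    runs = 1
    for prev, cur in zip(symbols, symbols[1:]):
        if cur != prev:
            runs += 1  # a change of symbol starts a new run
    return runs
```

A rapidly oscillating component crosses its mean often and therefore yields many runs, while a slowly varying component yields few.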

2.4. Relevance Vector Machine (RVM)

Compared with other forecasting algorithms, the RVM not only has high sparsity, fewer parameters to optimize, flexible kernels, and strong generalization ability, but also directly implements interval forecasting [45, 46]. Therefore, in this study, the RVM is used to establish the interval forecasting model for the new components reconstructed by RT.

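For a given set of input training samples $\{x_i\}_{i=1}^{N}$ and the corresponding output set $\{t_i\}_{i=1}^{N}$, the relevance vector machine regression model can be defined as follows:

$$t_i = \sum_{j=1}^{N} w_j K(x_i, x_j) + w_0 + \varepsilon_i,$$

where $\varepsilon_i$ is the error of the independent sample (which follows the Gaussian distribution with zero mean and variance $\sigma^2$), $w_j$ are the model weights, $K(\cdot, \cdot)$ is a nonlinear kernel function, $x_j$ is a relevance vector, and $N$ is the length of the data.

In the RVM, a prior probability distribution for each model weight is given as

$$p(w \mid \alpha) = \prod_{j=0}^{N} \mathcal{N}\!\left(w_j \mid 0, \alpha_j^{-1}\right),$$

where $\alpha_j$ is the hyperparameter of the prior distribution of model weight $w_j$.

Given a training sample set $\{x_i, t_i\}_{i=1}^{N}$, assume the target values $t_i$ are independent and the noise in the data follows the Gaussian distribution with the variance $\sigma^2$. Then, the likelihood function of the training sample set can be represented by

$$p(t \mid w, \sigma^2) = \left(2\pi\sigma^2\right)^{-N/2} \exp\!\left(-\frac{\lVert t - \Phi w \rVert^2}{2\sigma^2}\right),$$

where $t = (t_1, \ldots, t_N)^{T}$, $w = (w_0, w_1, \ldots, w_N)^{T}$, and $\Phi$ is the $N \times (N+1)$ design matrix given by

$$\Phi = \left[\phi(x_1), \phi(x_2), \ldots, \phi(x_N)\right]^{T}, \qquad \phi(x_i) = \left[1, K(x_i, x_1), \ldots, K(x_i, x_N)\right]^{T}.$$

Based on the prior distribution and the likelihood distribution, the posterior distribution over the weights follows from Bayes' rule and can be written as

$$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = \mathcal{N}(w \mid \mu, \Sigma),$$

where $\Sigma = \left(\sigma^{-2}\Phi^{T}\Phi + A\right)^{-1}$; $\mu = \sigma^{-2}\Sigma\Phi^{T} t$; $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$.

The marginal likelihood distribution of the hyperparameters can be obtained by

$$p(t \mid \alpha, \sigma^2) = \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw = \mathcal{N}(t \mid 0, C),$$

where $C = \sigma^2 I + \Phi A^{-1} \Phi^{T}$.

Finally, the hyperparameters $\alpha$ and the variance $\sigma^2$ can be estimated by maximizing this marginal likelihood; the resulting estimates are denoted $\alpha_{MP}$ and $\sigma_{MP}^2$.

If the input value is $x_*$, then the corresponding output probability distribution obeys the Gaussian distribution $\mathcal{N}(t_* \mid y_*, \sigma_*^2)$, and the corresponding forecasting value can be derived by

$$y_* = \mu^{T}\phi(x_*), \qquad \sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\Sigma\,\phi(x_*).$$

The RVM model can give both the mean value and the variance. As such, this model reflects the uncertainty of forecasting results and provides interval forecasts (within the range of certain confidence levels).

Under the confidence level of $1 - \beta$, the interval forecasting results are as follows:

$$L_b = y_* - z_{\beta/2}\,\sigma_*, \qquad U_b = y_* + z_{\beta/2}\,\sigma_*,$$

where $L_b$ denotes the lower bound of the forecasting value, $U_b$ denotes the upper bound of the forecasting value, and $z_{\beta/2}$ is the upper $\beta/2$ quantile of the standard normal distribution.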
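As a small illustration of how a Gaussian predictive mean and variance translate into an interval, the following sketch uses Python's standard `statistics.NormalDist`. The function name `prediction_interval` and the example numbers are hypothetical; the interval is assumed to be two-sided and symmetric about the mean.

```python
from statistics import NormalDist


def prediction_interval(mean, variance, confidence=0.90):
    """Two-sided Gaussian interval around a predictive mean.

    Assumption: the interval is symmetric, so each tail holds
    (1 - confidence) / 2 of the probability mass.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * variance ** 0.5
    return mean - half_width, mean + half_width


# Illustrative values only: predictive mean 10 MW, variance 4 MW^2.
lower_90, upper_90 = prediction_interval(10.0, 4.0, confidence=0.90)
lower_60, upper_60 = prediction_interval(10.0, 4.0, confidence=0.60)
```

For the same mean and variance, the 60% interval is narrower than the 90% interval, since a smaller quantile multiplies the same standard deviation.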

3. Model Construction

3.1. Using EEMD to Decompose an Original Wind Power Sequence

In order to verify the effectiveness of the forecasting model, a whole-year wind power sequence (96 points per day) obtained from a wind farm in Jiangsu province is used as the research object. The wind farm has an installed capacity of 49.5 MW and contains 33 wind turbines. In this study, the actual wind power data of the preceding 5 days are taken as the training sample, and a wind power interval forecasting model is then established for 15 min ahead forecasting of the next day.
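The data layout described above (96 points per day, 5 training days, one forecast day) can be sketched as follows. The lag-based input construction is an assumption for illustration only: the number of past values fed to the model is not specified in the text, so `n_lags` and both function names are hypothetical.

```python
POINTS_PER_DAY = 96  # one sample every 15 min
TRAIN_DAYS = 5


def split_train_test(series):
    """First 5 days for training, the following day for testing."""
    n_train = TRAIN_DAYS * POINTS_PER_DAY
    return series[:n_train], series[n_train:n_train + POINTS_PER_DAY]


def make_lagged_samples(series, n_lags):
    """Turn a series into (inputs, target) pairs for one-step-ahead use.

    n_lags is a hypothetical embedding length, not a value from the text.
    """
    samples = []
    for t in range(n_lags, len(series)):
        samples.append((series[t - n_lags:t], series[t]))
    return samples
```

With this layout the training window holds 480 points and the forecast day holds 96 points, each forecast being made one 15 min step ahead.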

As shown in Figure 1(a), the actual wind power is random and volatile. In order to improve the forecasting effect, it is necessary to reduce the complexity of the data. Compared with other decomposition algorithms, the EEMD exhibits better noise robustness and decomposing effects. In this study, we use the EEMD to decompose the actual wind power into specific components in which its periodicity, randomness, and trends can be clearly seen. The decomposition results of EEMD are shown in Figure 1.

3.2. Using RT to Reconstruct the New Components

By the definition of RT, the larger the RT value, the stronger the volatility of the time series; and the closer the RT values of two series, the more similar their overall trends. The RT values of each component (in Figure 1) are calculated and shown in Table 1. From Figure 1 and Table 1, the RT values of IMF1 and IMF2 are large and relatively close, while the RT values of IMF6–IMF8 and RES are very small and very close. The RT values of IMF3–IMF5 lie between these two groups. This shows that the dispersion of the IMF1 and IMF2 series is strong, while the general trends of IMF6–IMF8 and RES are similar. Moreover, the fluctuation of IMF3–IMF5 is intermediate between the two. Based on the above analyses, we set the high runs test threshold as 100 and the low runs test threshold as 10. The composition of the three new components is shown in Table 2.
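The threshold rule can be sketched in Python as below. The thresholds of 100 and 10 come from the text; everything else is an assumption for illustration: the handling of values exactly on a threshold, the mapping of high run counts to the random component and low run counts to the trend component, and the function names. Any run counts passed in are the caller's; the sketch does not reproduce Table 1.

```python
def group_components(run_counts, high=100, low=10):
    """Assign each component name to 'random', 'detailed' or 'trend'.

    Assumption: counts >= high are treated as random, counts <= low as
    trend, and everything in between as detailed.
    """
    groups = {"random": [], "detailed": [], "trend": []}
    for name, runs in run_counts.items():
        if runs >= high:
            groups["random"].append(name)
        elif runs <= low:
            groups["trend"].append(name)
        else:
            groups["detailed"].append(name)
    return groups


def reconstruct(components, groups):
    """Sum the member series of each group into one new component."""
    length = len(next(iter(components.values())))
    return {
        label: [sum(components[name][t] for name in members)
                for t in range(length)]
        for label, members in groups.items()
    }
```

Since every original component is assigned to exactly one group, the three reconstructed components still add up to the sum of all IMFs and the RES.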

The trend graph of the new components after reconstruction is shown in Figure 2. It is evident from Figure 2 that the trend component, the detailed component, and the random component each have typical features. The trend component roughly reflects the overall fluctuation of the original data; the detailed component characterizes the details of the fluctuations of the original data; and the random component represents the fluctuations caused by other factors that cannot be explicitly described. All three of the components meet the composition standard of actual wind power.

In order to further simplify the calculation and narrow the forecasting interval, we established a point forecasting for the trend component and obtained the interval forecasting for the detailed component and the random component. We obtained the overall optimal interval forecasting results (under a certain confidence level) from the superposition of each component’s forecasting results.
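One simple way to read this superposition is to add the trend point forecast to the bounds and means of the other two components, term by term. The sketch below takes that reading as an assumption; it is not confirmed by the text as the exact combination rule, and the function name `superimpose` and the sample numbers in the usage line are hypothetical.

```python
def superimpose(trend_point, detail_interval, random_interval):
    """Combine a trend point forecast with two (lower, mean, upper) triples.

    Assumption: bounds are added term by term. An alternative would be
    to add the predictive variances of the two components and build a
    single Gaussian interval from the total.
    """
    d_lo, d_mean, d_up = detail_interval
    r_lo, r_mean, r_up = random_interval
    lower = trend_point + d_lo + r_lo
    mean = trend_point + d_mean + r_mean
    upper = trend_point + d_up + r_up
    return lower, mean, upper


# Illustrative values only (MW).
overall = superimpose(20.0, (-1.0, 0.5, 2.0), (-0.5, 0.0, 0.5))
```

Treating the trend as a point forecast means it contributes no width, so the overall interval width comes entirely from the detailed and random components.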

3.3. Sample Data Normalization

Since poor and missing data affect forecasting accuracy, it is necessary to pretreat load data obtained from measurements. In this study, we primarily used transverse and longitudinal comparison methods for data pretreatment. In addition, we used the normalization method in order to simplify the calculation and standardization of loads, prices, and weather data (a necessary measure since the input variables have different units and values). By doing so, the value of the data can be limited to $[0, 1]$. The specific calculation formula is

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}},$$

where $x$ is the original value of the data, $x'$ is the normalized value of the data, and $x_{\max}$ and $x_{\min}$ represent the maximum value and the minimum value of the data, respectively.
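A minimal sketch of this normalization, assuming the usual min-max form that maps the data to the range from 0 to 1, is given below. The function names are hypothetical, and the handling of a constant series (all zeros) is a choice made here to avoid division by zero.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; also return the min and max used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series is mapped to all zeros.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(scaled, lo, hi):
    """Map normalized values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]
```

Keeping the minimum and maximum of the training data allows forecasts made on the normalized scale to be converted back to power values with `denormalize`.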

3.4. Kernel Function Determination of RVM

RVM is a pattern recognition and regression forecasting method based on kernel functions. The kernels implement nonlinear transformations among a plurality of feature spaces. The basic idea of hybrid kernels is to combine several kernels with different characteristics (in a certain proportion) so that the combined kernel function performs better. Importantly, RVM places few restrictions on kernel function selection. Moreover, RBF kernels are well-suited to capturing local fluctuations, while polynomial kernels are well-suited to dealing with global fluctuations. A combination of a typical local kernel (the RBF kernel) and a global kernel (the polynomial kernel) is therefore used for improved short-term wind power interval forecasting. The hybrid kernel is shown as follows:

$$K(x, x_i) = \lambda \exp\!\left(-\frac{\lVert x - x_i \rVert^2}{2\delta^2}\right) + (1 - \lambda)\left(x^{T} x_i + 1\right)^2,$$

where $\exp\!\left(-\lVert x - x_i \rVert^2 / (2\delta^2)\right)$ is the RBF kernel; $\left(x^{T} x_i + 1\right)^2$ is the binomial kernel function; $\lambda$ is the weight of the kernel function; $\delta$ is the kernel width; and $\lambda$ and $\delta$ are the parameters to be optimized. We employed the grid search method in order to obtain the optimal values of $\lambda$ and $\delta$.
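A weighted RBF-plus-polynomial kernel and a plain grid search can be sketched as follows. The exact kernel parameterization is an assumption: the RBF is written with `2 * width ** 2` in the denominator and the polynomial kernel with degree 2 and offset 1. The `score` callable is hypothetical and stands for whatever validation error is being minimized.

```python
import math


def rbf_kernel(x, y, width):
    """Gaussian (local) kernel; the 2 * width**2 scaling is assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def poly_kernel(x, y, degree=2, offset=1.0):
    """Polynomial (global) kernel; degree and offset are assumed."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + offset) ** degree


def hybrid_kernel(x, y, weight, width):
    """Convex combination of the local and global kernels."""
    return (weight * rbf_kernel(x, y, width)
            + (1.0 - weight) * poly_kernel(x, y))


def grid_search(score, weights, widths):
    """Return the (weight, width) pair with the lowest score.

    `score(weight, width)` is a hypothetical validation-error callable.
    """
    candidates = ((w, s) for w in weights for s in widths)
    return min(candidates, key=lambda pair: score(*pair))
```

With the weight between 0 and 1, the combination of two valid kernels remains a valid kernel, and the weight controls how much local versus global behaviour the model sees.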

3.5. Evaluation Indexes of the Model

There are many indexes used to evaluate the errors of point forecasting results, such as APE (absolute percentage error), MAPE (mean absolute percentage error), and RMSE (root mean square error) [47–49]. The smaller the error, the higher the forecasting accuracy. The assessment methods of interval forecasting differ from those of point forecasting (except for the MAPE index, which is used in both). Other indexes used to evaluate the effects of interval forecasting results are FICP (forecasting interval coverage percentage) [50] and FIAW (forecasting interval average width). The definitions of these indexes are as follows.

(1) MAPE

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{\hat{y}_i - y_i}{y_i}\right| \times 100\%,$$

where $\hat{y}_i$ represents the forecasting result of the $i$th forecasting sample; $y_i$ represents the true value of the $i$th forecasting sample; and $N$ represents the number of samples. MAPE is used to evaluate the error between the expected forecasting value and the actual value. The smaller the value, the higher the forecasting accuracy.

(2) FICP

$$\mathrm{FICP}^{(1-\beta)} = \frac{\xi^{(1-\beta)}}{N} \times 100\%,$$

where $\mathrm{FICP}^{(1-\beta)}$ represents the interval coverage and $\xi^{(1-\beta)}$ is the number of actual values falling within the confidence interval (at the $1-\beta$ confidence level). The FICP index evaluates the credibility of the interval. The greater its value, the higher its credibility.

(3) FIAW

$$\mathrm{FIAW}^{(1-\beta)} = \frac{1}{N}\sum_{i=1}^{N}\frac{U_i - L_i}{y_i} \times 100\%,$$

where $\mathrm{FIAW}^{(1-\beta)}$ represents the average width of the confidence interval under the level $1-\beta$; $U_i$ and $L_i$ are, respectively, the upper and lower bounds of the $i$th forecasting sample; and $y_i$ refers to the actual value of the $i$th forecasting sample. The FIAW index measures the degree of uncertainty of the forecasting results.
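The three indexes can be computed with a few lines of Python. The sketch assumes that all three are reported as percentages and that FIAW divides each interval width by the corresponding actual value, as suggested by the appearance of the actual value in its definition; samples with an actual value of zero would have to be excluded before calling `mape` or `fiaw`.

```python
def mape(forecast, actual):
    """Mean absolute percentage error, in percent."""
    total = sum(abs((f - a) / a) for f, a in zip(forecast, actual))
    return 100.0 * total / len(actual)


def ficp(lower, upper, actual):
    """Share of actual values inside [lower, upper], in percent."""
    hits = sum(1 for lo, up, a in zip(lower, upper, actual) if lo <= a <= up)
    return 100.0 * hits / len(actual)


def fiaw(lower, upper, actual):
    """Average interval width relative to the actual value, in percent.

    Assumption: each width is normalized by the actual value.
    """
    total = sum((up - lo) / a for lo, up, a in zip(lower, upper, actual))
    return 100.0 * total / len(actual)
```

A good interval forecast combines a high FICP with a low FIAW: widening every interval always raises coverage, so the two indexes need to be read together.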

3.6. Overall Procedures of the EEMD-RT-RVM Model

In this study, we propose a short-term wind power interval forecasting method based on the EEMD-RT-RVM model. The flow chart of our proposed forecasting model is shown in Figure 3.

4. Analysis Results of the Demonstration

In order to verify the interval forecasting effects of an EEMD-RT-RVM model that uses different confidence levels, we chose to use confidence levels of 90% and 60% in our example. The interval forecasting results are shown in Figures 4 and 5. The indexes of MAPE, FICP, and FIAW are used to assess the effects of the interval forecasting. Table 3 shows portions of the forecasting results and the indicator analysis results.

To demonstrate the advantages of our model, we used the same wind power data to obtain short-term interval forecasts from the RVM model, the EMD-RVM model, and the EEMD-RVM model. We used the indexes of MAPE, FICP, and FIAW, together with the running times, to assess the interval forecasting performance of each model. Table 4 shows the comparison results of the models (under the 90% confidence level).

Moreover, in order to further evaluate the adaptability of the proposed model, wind power data of the actual wind farm on other days in different seasons are examined. For example, February 12, July 22, October 15, and December 17, 2009, are chosen randomly. Based on the time scale of the original wind power data, the 15 min ahead short-term wind power interval forecasting results under the 90% confidence level for these days are shown in Figure 6. The indicator analysis results with MAPE, FICP, and FIAW are organized in Table 5. In Figure 6, the interval width is narrower in July than in October, which means that the data fluctuation in October is stronger than that in July. Figure 6 and Table 5 show that the MAPE indicator reflects the effectiveness of the proposed method: the smaller the MAPE, the better the forecasting accuracy, that is, the closer the expected forecasting value is to the actual result. Further, the MAPE values of the different days are all within 6.5% and meet actual project requirements. A smaller FIAW value corresponds to a narrower interval, which accompanies the better MAPE results and indicates lower model uncertainty. Meanwhile, the credibility of the forecasting results may decrease with a smaller FICP value.

Based on the above analysis, we drew the following five conclusions: (1) The forecasting results of our proposed model effectively follow the actual wind power value, and the fluctuations are consistent with changes in actual wind power. (2) Most of the actual wind power values fall within the forecasting intervals at the 90% and 60% confidence levels; however, the number of values falling outside the forecasting interval at the 60% confidence level is significantly larger than the number falling outside the interval at the 90% confidence level. This accurately depicts the characteristics of the actual situation and reflects the effectiveness of our interval forecasting results. (3) The interval width at the 90% confidence level is significantly greater than that at the 60% confidence level. Decreases in the confidence level also decrease the interval width and the interval coverage. (4) Overall, our proposed model had the smallest forecasting errors, narrower interval widths, and higher interval coverages than any of the other models. (5) The EEMD has a better theoretical foundation and noise robustness than EMD, and it overcomes the mode mixing weakness of EMD. Moreover, the use of runs tests uncovers the correlation among the components and reduces the complexity of our model (which contributes to improved forecasting effects and enhanced running efficiency).

In summary, our EEMD-RT-RVM model achieved the best performance among the compared models in short-term wind power interval forecasting. Furthermore, our model is applicable to other practical engineering applications.

5. Conclusions

Owing to the volatility and randomness of natural wind, deterministic wind power forecasting is a difficult and complex task. In this study, we proposed an EEMD-RT-RVM model to achieve more accurate short-term interval wind power forecasting. The EEMD was used to decompose wind power sequences into IMF components and a RES component, which reduced the inherent volatility of the wind power sequences. We then used runs tests to reconstruct new components. Overall, our methods improved the forecasting performance and enhanced running efficiency. The actual wind power data from November 20, 2009, to November 25, 2009 (one point every 15 min), are used to verify the effectiveness and superiority of the proposed EEMD-RT-RVM model, and quantitative evaluation is conducted based on comprehensive error evaluation criteria and interval evaluation criteria. Simulation results and analysis demonstrate that the volatility and randomness of wind power are reduced by the EEMD method, and the proposed EEMD-RT-RVM model performs better than conventional single models and other combined models. Our proposed EEMD-RT-RVM model not only improves forecasting accuracy, but also significantly reduces the width of the forecasting interval (under the premise of guaranteed interval coverage). Ultimately, our model is suitable for numerous practical applications, and it also serves as a good reference for the output forecasting of other new energy sources.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The research is financially supported by National Natural Science Foundation of China (Program no. 51507052), China Postdoctoral Science Foundation (2015M571653), the 111 Project (B14022), and the Fundamental Research Funds for the Central Universities (2015B02714, 2016B20914).