Abstract

High-precision wind power prediction is important for the planning, economics, and security maintenance of a power grid. Meteorological features and seasonal information are strongly related to wind power prediction. This paper proposes a hybrid method for ultrashort-term wind power prediction considering meteorological features (wind direction, wind speed, temperature, atmospheric pressure, and humidity) and seasonal information. The wind power data are decomposed into stationary subsequences using the ensemble empirical mode decomposition (EEMD). The principal component analysis (PCA) is used to reduce the redundant meteorological features and the algorithm complexity. With the stationary subsequences and extracted meteorological features data as inputs, the long short-term memory (LSTM) network is used to complete the wind power prediction. Finally, the seasonal autoregressive integrated moving average (SARIMA) is innovatively used to fit seasonal features (quarterly and monthly) of wind power and reconstruct the prediction results of LSTM. The proposed method is used to predict 15-minute wind power. In this study, three datasets were collected from a windfarm in Laizhou to validate the prediction performance of the proposed method. The experimental results showed that the prediction accuracy was significantly improved when meteorological features were considered and further improved with seasonal correction.

1. Introduction

With the sharp decline in fossil energy reserves and increasing prominence of environmental problems, wind power has seen rapid development as a renewable and clean energy source around the world [1]. At the end of 2016, the global wind energy cumulative capacity reached 486.74 GW, with a new installation for 54.6 GW [2]. With the increasing proportion of installed wind capacity, the impact of the volatility and randomness of wind power on the power grid has become obvious, and the subsequent abandonment of wind has become more serious [3]. To suppress the influence of randomness, ensure the safe operation of the power grid, and improve the rate of wind power consumption, accurate prediction of wind power is important.

According to the time scale, the wind power forecasting model is divided into four categories: long-term, medium-term, short-term, and ultrashort-term prediction. The long-term prediction is usually in years and mainly serviced for the formulation of wind farm plan and annual power generation plan. The medium-term prediction is usually in weeks or months and generally used to arrange maintenance plans. The short-term prediction is generally in days and generally used to reduce the discarded wind power, optimize the maintenance scheduling, and optimize the generation scheduling [4]. So the accuracy of short-term prediction is particularly important for planning of power grid.

Ultrashort-term wind power prediction refers to forecasting at intervals of 5, 10, or 15 minutes. Such prediction results are mainly applied to increasing wind power consumption, reducing wind turbine regulation pressures, and ensuring the safe operation of power grids [5]. The ultrashort-term prediction is theoretically the most demanding forecasting method; it is mainly used to control daily operations of wind farm units [6, 7]. In addition, it is considered essential for maximizing the potential revenues of operators due to price fluctuations. This is because future price in the renewable energy market is determined via a bidding mechanism between wind power operators and renewable market operators; the potential revenue for a wind power operator is directly affected by their ability to predict when and how much electricity will be generated in the future [8].

Currently, there are three basic types of wind power prediction methods: physical, statistical, and hybrid. Some research focused on building physical methods using mechanism analysis and equations. For ensuring the best efficiency of power outputs, the authors in [9, 10], respectively, established turbine models to discuss the influence of different hub heights and configurations and the effects of inflow turbulence intensity and turbine arrangements. But due to the fluctuations and uncertainty of meteorological change, it is difficult to obtain the satisfactory forecasting results only through a single mechanism prediction model. Statistical methods, such as the autoregressive moving average model, are based on pattern recognition, parameter estimation, and model checking to construct historical data and predicted values [11, 12]. However, statistical models cannot dig deep into data features, which lead to large prediction errors. Hybrid methods have two main constructions. The first construction focuses on data preprocessing, in which a decomposition algorithm is used to decompose the original wind power sequence into several sublayers to weaken the volatility. Each sublayer is predicted separately by the prediction model, and the results of all sublayers are then recombined to obtain the final predicted values [13, 14]. The second construction usually uses several different prediction models to predict the wind power separately and then recombines the results to obtain the final predicted values [15, 16]. Because hybrid methods have a better predictive performance, research with this approach has peaked in recent years. Hong et al. [17] proposed a hybrid method based on empirical mode decomposition (EMD) and an artificial neural network (ANN). The EMD was used to decompose the wind and wind power to obtain the stationary sub-sequences, and the ANN was used to predict the wind power. Yang et al. [14] proposed a hybrid method using the improved EMD to decompose the wind power. The ANN and SVM were used to establish prediction model according to different frequencies of sub-sequences. In recent years, long short-term memory (LSTM) networks have been widely used because of their unique memory and forgetting mode; they can flexibly adapt to timing fluctuations in wind power and have better prediction accuracies than conventional methods. Liu et al. [18] proposed a hybrid method based on a discrete wavelet transform (DWT) and the LSTM. The DWT is used to decompose the nonstationary wind power into components which have more stationarity, and each component is dug by an independent LSTM. Yu et al. [19] proposed a hybrid method using improved LSTM network to predict the wind power and further optimized the prediction results with clustering analysis. Han et al. [20] proposed a hybrid method using improved variational mode decomposition (VMD) to decompose wind power, and using improved LSTM to get the prediction.

These studies achieved many optional models for wind power prediction, and the forecasting accuracy was improved by applied the hybrid methods. However, most of the current studies only used wind power data, which do not make full use of meteorological information such as wind speed, wind direction, and temperature. In addition, the prediction model fails to fully extract seasonal information which will greatly improve the prediction accuracy if these messages are considered.

Based on previous work, a hybrid method based on meteorological features and seasonal information is proposed. This paper uses LSTM-SARIMA hybrid model to achieve a 15-minute prediction. The innovations are as follows:(i)The meteorological features were regarded as the input variables, and the principal component analysis (PCA) was used to reduce the redundant components that these excessive influencing variables may cause higher computational complex.(ii)With the stationary subsequences and extracted meteorological features data as inputs, the LSTM was used to predict the wind power, taking advantage of seasonal autoregressive integrated moving average (SARIMA) method that can extract the seasonality characteristics (including quarterly and monthly) of wind power [21].

The rest of this paper is organized as follows. Section 2 describes the process of data preprocessing, including the decomposition of the wind power and the extraction of the related meteorological features. Section 3 introduces the prediction algorithms used in this paper. Section 4 introduces the case study. The method was applied to data from a windfarm in Laizhou for verification, and the prediction accuracy was compared with those of the four other methods. Finally, Section 5 summarizes this study and presents plans for future work.

2. Data Preprocessing

The architectural design of this prediction method is shown in Figure 1.

The details of how the proposed method decomposes the original wind power sequence and considers the coupling relationship between the wind power and meteorological features are presented here.

2.1. Ensemble Empirical Mode Decomposition of Wind Power

EMD is a direct, empirical, and adaptive data processing method designed specifically for nonlinear and non-stationary data. Unlike traditional methods, such as Fourier decomposition, that use fixed bases, EMD directly decomposes nonlinear and non-stationary time series into multiple sublayers (multiple IMF layers and a Res term) by the Hilbert transform without any base functions or filter functions. Because of this unique advantage, EMD is widely used in complex system analyses [22, 23] and has been demonstrated to be advantageous for time series prediction. EEMD was developed as an extended version of EMD to overcome the modal confusion problem [24] by adding white noise to the original time series multiple times; then, EMD is performed on the modified time series to obtain IMF sublayers, and the overall average of the corresponding IMF sublayers is considered as the final decomposition result. The main steps of EEMD are as follows.

Add white noise (t) − N(0, σ2) to the original wind power series p(t) to convert it to a new time series:

Decompose the time series pi(t) into n IMF sublayers (j = 1, 2, …, n) and a Res term ri(t) whose frequency is close to zero and cannot be extracted further:where is the jth IMF sublayer of the ith wind power sequence.

Repeat the above operation several times, in which the corresponding IMF sublayers could be obtained by adding different white noise to the original time series.

Calculate the average of all IMF sublayers obtained in m decompositions as the final IMF sublayer:

Once EEMD is complete, the original time series can be represented as a linear combination of IMF sublayers and the Res term:where ej(t) is the average of the jth IMF extracted during m decompositions, r(t) is the final residual, and n is the number of IMF sublayers.

2.2. Meteorological Feature Extraction Based on Principal Component Analysis

PCA is a multivariate statistical analysis technique that combines data compression and feature extraction [25]. PCA data inputs increase the overall accuracy by preserving the actually useful information. Also, by reducing the number of input data, the model training and prediction times can be significantly reduced. PCA has thus been effectively used in wind power prediction [26, 27]. The idea of PCA is to construct new variables from linear combinations of the original and select the variables that reflect the original as much as possible. PCA maps n-dimensional features to k dimensions (k < n), which results in a new set of orthogonal features called principal components (PC), rather than simply removing the remaining n − k dimensional features from the n-dimensional. The PCA algorithm calculates the eigenvalues of the input variable matrix, finds the variance corresponding to each input variable, and then determines the PC according to the cumulative value [28]. The main steps are as follows.

Center all meteorological features (including wind direction, wind speed, temperature, atmospheric pressure, and humidity): find the average of each feature fj (j = 1, 2, …, 5) and then subtract the mean from each feature fj:where fij is the ith value of feature j, is the average of feature j, n is the number of values of feature j, and p is the number of features.

Calculate the covariance matrix of features:

Calculate the eigenvalues of the covariance matrix and corresponding eigenvectors. Calculate the contribution rate of all eigenvectors and select the appropriate PC that was regarded as the extracted meteorological features. A higher contribution rate indicates that the original feature contains stronger information. The selection of a PC is mainly determined by the cumulative contribution rate of eigenvectors. In other words, the cumulative contribution rate usually needs to be more than 85% to ensure that the PC can contain most of the information of the original feature:

3. Prediction Algorithms

3.1. Seasonal Auxiliary Prediction Based on the SARIMA Model

As an extension of the autoregressive integrated moving average (ARIMA) model, SARIMA considers the periodicity of the time series. The predicted results can be reconstructed using the results of the LSTM network to correct the wind power values. When a wind power sequence contains seasonal and non-seasonal information, the typical ARIMA model cannot capture the behavior of the seasonal portion, which results in a wrong choice of model parameters [29]. The general formula for SARIMA (p, d, q) (P, D, Q)S [30] is given bywhere Pt is the wind power sequence, et is white noise, L is the lag operator, D is the seasonal difference order, and d is the regular difference order, which represents the number of differences that smooth the wind power sequence, is the regression model parameter, and s is the seasonality order that represents the seasonal change cycle of wind power.

Equations (11) and (12) are autoregressive and moving average polynomials, respectively, and represent the dependence of the future time series on past values and errors:

Similarly, equations (13) and (14) represent the seasonal autoregressive and seasonal moving average polynomials, respectively. Adding these polynomials to the ARIMA equation can help capture seasonal variations in time series:

3.2. Wind Power Prediction Based on the LSTM Networks

As an improved model of the recurrent neural network (RNN), the LSTM network has not only the recursiveness of RNN but also a unique memory and forgetting mode that can flexibly adapt to the timing characteristics of wind power. The LSTM network solves the gradient disappearance and gradient explosion problems of RNN during training and can fully utilize the time dependence of historical wind power data. The LSTM network consists of an input layer, output layer, and hidden layer. Compared to the traditional RNN, the hidden layer is no longer an ordinary neural unit but an LSTM unit with a unique memory mode. The structure of the memory cell is shown in Figure 2.

Each LSTM unit has a cell [31] whose state at time t is denoted as ct; this can also be regarded as a memory unit which will input in the same cell the next time. The reading and modification of the memory cell are accomplished by controlling the input gate, forgetting gate, and output gate. Specifically, the workflow of the LSTM unit is as follows: at each given time, the LSTM unit receives the input state of combined feature vectors:where xi is the ith combined feature vector, IMFi is the ith wind power sub-sequence of different frequency scales, and is k meteorological feature vectors.

In addition, the LSTM unit also receives the hidden state of the previous time ht−1 and the internal information ct−1 as the state of the last cell. After receiving the information, the logic function of a gate determines whether the input is active. Then, the input is transformed by a nonlinear function and superimposed with the state of the memory cell processed by the forgetting gate to form a new memory cell state ct. Finally, the state ct forms the wind power prediction ht by operation of the nonlinear function and dynamic control of the output gate. The variables are calculated as follows:where , , are the weight matrices connected to the input signal xt; , , , are the weight matrices connected to the input of the hidden layer; ht−1, , , are the weight matrices connected to the internal information; and ct−1, bi, , bc, bo are the offset vectors. σ is the activation function, which is usually a tanh or sigmoid function.

According to (18), the current cell state ct is composed of the current input feature xt and the previous cell state ct−1 in which there is the memory of previous input features that is controlled by forgetting gate . When predicting the subsequent time point ct+1, the information contained in ct−1 is not abandoned but is input into the next cell. As the sequence grows, all the features are used as important information for every subsequent wind power prediction.

Finally, the prediction results of the LSTM network are corrected by the results of the SARIMA model. The final wind power predicted values can be expressed aswhere is the prediction results of SARIMA and ω is the ratio by which the SARIMA model modifies the predicted values of the LSTM network, which takes the minimum error between the predicted value and the real value as the standard.

4. Case Study

4.1. Evaluation Indices

In this study, several indices were used to evaluate the proposed wind power prediction method: the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). MAE reflects the true state of the predicted value error; MAPE measures the deviation between the predicted and true values; and RMSE represents the sample standard deviation between the predicted and true values, which is sensitive to larger errors. The specific equations are given below:where n is the size of the test data and Ypre(t) and Yact(t) represent the predicted and true values of wind power, respectively. To compare the prediction performances of relevant models, the percentage improvements of MAE, MAPE, and RMSE were used as evaluation indices:

4.2. Data Description

Laizhou is located in the northeast of Shandong Province (37.0215°N, 119.4546°N), which borders the ocean and is rich in wind power resources. This study used three time series datasets of wind power collected at 15-minute intervals from a Laizhou wind farm. The wind turbine siting layout is fixed. The datasets contain a total of 35040 samples (from 15 January 2017 to 27 February 2018). As shown in Table 1, the detail dates for prediction model are presented. The wind farm has a total installed capacity of 100 MW. The information of wind speed, pressure, temperature, and relative humidity is directly acquired through the system of Supervisory Control and Data Acquisition (SCADA), which could realize real-time collection of local meteorological information. There are many kinds of sensors in the system, which can collect many kinds of data such as meteorological environment and running state. The data has six features, as shown in Figure 3: wind direction, wind speed, temperature, atmospheric pressure, humidity, and wind power. Wind power shows high volatility and seasonality. The unit of wind speed is meters per second; the wind direction ranges from 0 to 360°, the temperature unit is Celsius, and the humidity ranges from 0 to 100 percent. During the modelling process, the first 80% of each dataset was used for training; the next 10% was used for cross-validation; and the last 10% was used for the test.

4.3. Experiments
4.3.1. Parameter Settings and Decomposition Results Using EEMD

As shown in Figure 3, the original wind power sequence had strong instability with no obvious rule to follow. It is difficult to obtain reliable results if the prediction model is directly established for analysis. In order to decrease the volatility of the sequence, EEMD was used to decompose the wind power. The standard deviation of the additional white noise was set to 0.2, and the EMD execution frequency was set to 100 times to make the Res item small enough. The algorithm produced 10 IMF sublayers and a Res term at different time scales. As shown in Figure 4, the stability of the IMF sublayers was significantly improved. The 11 subsequences generated after decomposition were used for the next LSTM modelling analysis.

4.3.2. Extracting Results Using PCA

The feature properties of wind power include five vectors: wind direction, wind speed, temperature, atmospheric pressure, and humidity. In order to analyze the correlation between the features and wind power, the Person correlation coefficient between wind power and each feature was tested. The calculation formula is shown as follows:where r represents the Person correlation coefficient, Xi and Yi represent the features, respectively, and and represent the mean value of respective samples.

As shown in Table 2, the wind speed is the most relevant to wind power, and the following is wind direction. It can be concluded that the wind turbine output is directly affected by the wind speed and wind direction. The temperature and humidity have a low correlation with wind power but have a certain correlation with wind speed. This is because temperature and humidity can indirectly reflect wind speed. In summary, this article considered all these five meteorological features for experiments.

In order to reduce the redundant components of meteorological characteristics and reduce the computational complex, PCA was used to extract features, and the components with a large contribution rate were kept. Table 3 presents the contribution rates of the extracted PC. The third and fourth PC had very small contribution rates and contained negligible information of the original features. Therefore, only the first and second PC were considered in this experiment.

4.3.3. Parameter and Hyperparameter Settings of LSTM

The IMF sublayers and Res term were combined with the principal components extracted by PCA to obtain 11 combined feature vectors Xt (IMFit, PC1, PC2), which were inputted into the LSTM network for training. For the experimental design of the training process, the Keras deep learning framework was selected. The sigmoid function was used as the activation function, and the model was optimized with the Adam optimization algorithm. To prevent overfitting, we set a dropout of 0.8 after cross-validation. The number of training samples per unit of time (batch size) was set to 96 that equals the amount of data in a day. The learning rate affects convergence effect and speed of the model, and we set it to 0.02 after experiments. In terms of the LSTM layers and neurons of dense, the following values were considered: {layers: 1, 2, 3, 4}, {neurons of dense: 1(70), 1(80), 2(70, 80), 2(80, 80)}. As shown in Table 4, RMSE is calculated to evaluate the forecasting performance using training set and validation set. As the number of layers increases, the training model will have a better effect, even reaching 100% prediction accuracy. However, what follows is that the model is overfitted, and the prediction effect is seriously reduced when the model is put on the test data. Above all, increasing the number of hidden layers may not be beneficial to forecasting performance. Also, based on their effects on the prediction and training speed, the number of LSTM layers was set to 2 with a dense layer that has 80 neurons.

4.3.4. Parameter Settings of SARIMA

At the same time, the SARIMA (p, d, q) (P, D, Q)S model was built according to the information of the wind power sequence. The Akaike information criterion (AIC) was used to evaluate and compare models with different parameters [32]. AIC also considers the overall model complexity when measuring how a model fits the data. A model suitable for wind power data would have a lower AIC score than others. In applications, d rarely exceeds 1, and P and Q are usually no more than 3 [33]. In this study, the SARIMA function of the statsmodels framework was used to test the effects of different model parameters on AIC, and the results are presented in Table 5. Based on the minimum AIC score, SARIMA (3, 0, 1) (3, 1, 1)12 was chosen as the wind power prediction model.

4.4. Comparative Analysis

To test prediction ability of the proposed three-stage wind power prediction method, it was compared with four other methods: single LSTM, Gated Recurrent Unit (GRU), EEMD-LSTM, and PCA-EEMD-LSTM. GRU is a sequence prediction network. Compared with the LSTM, it only contains update gate and reset gate, which also has a good prediction effect while simplifying the structure [34]. As the existing paper has demonstrated that the LSTM model is superior to traditional models in prediction effect, the traditional methods are not considered in this experiment. The three Laizhou wind farm datasets were used in this comparison. This study took the predicted wind power data for the following day as example, which contains 96 data points. The prediction interval is 15 minutes. To better explain the performance of the methods, we used the determination coefficient (R2) to test the degree of association between the predicted and true values. R2 is defined as follows:where yact represents the original values, ypre represents the predicted values, and is the average of the original values. The closer R2 is to 1, the smaller the deviation is between the predicted and true values [35]. Table 6 provides the results of R2 for the cross-validation datasets. R2 is close to 1, which shows that the value predicted by this model is very close to the true value; the method performed well during the cross-validation. Table 7 provides the results of R2 with the test dataset; here too, it can be seen that the predicted value is close to the true value.

Figures 57 show the prediction results for all methods for datasets 1–3. A one-day forecast curve was drawn with a total of 96 data points. Table 8 provides the prediction errors for the different methods. Table 9 presents the percentage improvement in the evaluation indices in the case of the proposed method, compared to the others.

The main purpose of this experiment was to compare the proposed method with others, thereby demonstrating its effectiveness. The minimum error value is taken as the evaluation standard, and the proposed method performed the best at wind power prediction among the four methods, with MAPE, RMSE, and MAE values of 4.78, 3.38, and 2.86, respectively. Taking dataset 1 as example, detailed results are presented as follows:(i)The PCA-EEMD-LSTM combination method has a significantly higher prediction accuracy than the EEMD-LSTM combination method. Tables 7 and 8 indicate that the PCA-EEMD-LSTM combination method has MAPE, RMSE, and MAE values of 5.33, 3.4, and 3.09, respectively. Compared to the EEMD-LSTM combination method, the percentage improvements in MAPE, RMSE, and MAE were 33.87%, 18.83%, and 13.93%, respectively. Figure 5 shows the prediction results for these two methods. The experimental results show that the meteorological information is effective in improving the accuracy, and the PCA can have a satisfactory effect on reducing the redundant components and get the components with high correlation with wind power, which improves the prediction accuracy and computing speed.(ii)The proposed method has a higher prediction accuracy than the PCA-EEMD-LSTM combination method. Tables 7 and 8 indicate that the proposed method has MAPE, RMSE, and MAE values of 4.78, 3.38, and 2.86, respectively. Compared to the PCA-EEMD-LSTM combination method, the percentage improvements in MAPE, RMSE, and MAE were 10.31%, 3.15%, and 7.44%, respectively. Figure 5 shows the prediction results for these two methods. The experimental results show that the SARIMA model can track the seasonality and periodicity of the wind power sequence, which improves the results predicted by the LSTM network.

5. Conclusions

In this study, a hybrid ultrashort-term wind power prediction method was developed. Based on the prediction using historical wind power data, the meteorological information regarded as the features of wind power is helpful to improve prediction accuracy. The PCA can be used to reduce the redundant meteorological features and only get the components with high correlation with wind power. In addition, the SARIMA can effectively extract seasonal information and be regarded as the auxiliary predictor of LSTM. The hybrid prediction model was applied to three datasets collected from a wind farm in Laizhou to verify prediction ability, and the experimental results showed that this method got great prediction accuracy. In the future, the proposed method can be applied to real-time prediction systems. Although it takes more than 3 hours to build the models on a general computer, the time required for real-time prediction using the built models is extremely short, which fully meets ultrashort-term requirements. Moreover, the future research should comprise the following:(i)The EEMD algorithm can be applied not only to the wind power sequence but also to meteorological features.(ii)The proposed method shows satisfactory results for a 15-minute prediction. We will verify predictions at 30-minute, 1-hour, or even longer intervals in the future.

Nomenclature

:The original wind power series
:The sublayer of the wind power
:The redundant item
fj:The meteorological feature
:The average of each feature
:Eigenvalues of covariance matrix
:Contribution rate of eigenvectors
L:The lag operator
D:The seasonal difference order
d:The regular difference order
:The regression model parameter
s:The seasonality order
:The combined feature vector
:Predicted values of wind power by LSTM
:The state of the LSTM cell
b:The offset vectors
ω:The ratio of final wind power
:Predicted values of wind power
:True values of wind power.

Abbreviations

EEMD:Ensemble empirical mode decomposition
IMF:Intrinsic mode function
PCA:Principal component analysis
PC:Principal component
LSTM:Long short-term memory network
SARIMA:Seasonal autoregressive integrated moving average.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research has been supported by the National Natural Science Foundation of China (grant nos. 61821004, 61733010, and 61803231), Department of Science and Technology of Shandong Province (grant no. 2019JZZY010901), Natural Science Foundation of Shandong Province (grant no. ZR2019ZD09), and the Young Scholars Program of Shandong University (grant no. 2016WLJH29).