Special Issue: Computational Intelligence in Data-Driven Modelling and Its Engineering Applications
Wind Power Prediction Based on Nonlinear Partial Least Square
Wind power prediction is important for the safe operation and scheduling of the smart grid, and it can improve the economic and technical penetration of wind energy. The intermittency and randomness of wind reduce the accuracy of prediction. Exploiting the sequence correlation between wind speed and wind power data, we propose a method for short-term wind power prediction. The method takes the wind speed in each sliding data window as input to obtain a continuous prediction of wind power. Nonlinear partial least squares (NPLS) is then used to map the wind speed time series to wind power. The model uses a neural network as the nonlinear function that describes the inner relation, and the outputs of the hidden-layer nodes extend the original input matrix of the partial least squares regression. To verify the effectiveness of the proposed algorithm, real wind power data recorded under different working conditions are used in the experiments. The proposed method, a backpropagation neural network, a radial basis function neural network, a support vector machine, and partial least squares are applied to the real data and their effectiveness is compared. The experimental results show that the proposed algorithm has higher precision, and the real power running curves also confirm that the proposed method predicts short-term wind power effectively.
Wind power accounts for a large proportion of clean energy and is a powerful response to energy shortages and environmental pollution. The intermittency and randomness of wind make the system-balancing costs among all generation and demand participants very high. Therefore, short-term wind power prediction deserves more attention so that higher accuracy can be achieved in different situations [1, 2].
Generally, wind power prediction models can be divided into three kinds: physical models [3–6], statistical models, and hybrid models. Physical models perform well for longer horizons and complicated terrain, because they use numerical weather prediction (NWP) and refine its resolution to predict the weather parameters at a specific point (such as each wind turbine) accurately. However, they have limitations in both theory and application, such as the need to know physical properties, the heavy computational load, and the high cost of forecasting physical circumstances. Statistical models, which rely purely on historical data to make a prediction, are effective for short-term forecasts. Hybrid models use NWP data, such as wind speed and wind direction, to regress wind power, and they perform well in wind power prediction [7, 8]. In detail, the partial least squares (PLS) approach is a common multivariate regression algorithm for linear systems and can yield a statistical prediction model [9–11]. Because wind data are inherently nonlinear, PLS regression may not always capture the relation between the historical wind speed input and the wind power output [12–14]. The support vector machine (SVM) is a common regression algorithm based on structural risk minimization that maps the data into a high-dimensional feature space [15, 16]; owing to their dot-product form, linear-kernel SVMs can be transformed into a compact form by exchanging the order of summation. A neural network (NN) learns from samples without relying on mathematical derivation and can outperform simpler model structures such as the PLS model. NNs can obtain good results and have been widely adopted for wind power prediction [17–19].
Among NNs, backpropagation neural networks (BPNN), which use the mean square error and gradient descent to adjust the weights of the neurons, are often used for prediction problems [20, 21]. Radial basis function neural networks (RBFNN), which are more flexible in training algorithm and network resources, are also adopted in prediction [22–24]. RBFNNs are also easy to integrate with other regression models, such as SVM. The original NN can still suffer from insufficient accuracy, especially under different working conditions with different terrains and climates, so NN ensembles with feature selection are a promising way to increase accuracy and robustness [26–28].
In this paper, a wind power prediction method that combines a neural network with nonlinear partial least squares (NPLS) and takes the data sequence into account is proposed. The method takes the wind speed in each sliding data window as input to obtain a continuous prediction of wind power, and NPLS maps the wind speed time series to wind power. In detail, the model uses the neural network as the nonlinear function that describes the inner relation, and the outputs of the hidden-layer nodes extend the original input matrix of the partial least squares regression. PLS is then applied with the NN-extended input to establish the NPLS model, which predicts how wind power is driven by wind speed in the short term. Backpropagation and radial basis function neural networks are chosen to realize the NPLS model, giving BPE-NPLS and RBFE-NPLS, respectively, and the two NPLS models are compared with PLS, BPNN, RBFNN, and SVM on real data. The effectiveness of these models is evaluated by the root-mean-square error of cross-validation (RMSECV), the root-mean-square error of prediction (RMSEP), the squared correlation coefficient of prediction, and the squared correlation coefficient of cross-validation. The experimental data were recorded at a typical dispersed wind farm in Northwest China, and three different kinds of wind power fluctuation conditions are discussed.
The organization of this paper is as follows: Section 2 states the problem. Section 3 presents the proposed method and the relevant algorithms. The experimental datasets and procedure are described in detail in Section 4. In Section 5, the experimental results are discussed. Finally, Section 6 concludes the paper.
2. Problem Statement
A typical wind power generator system is shown in Figure 1. The main function of the gearbox is to transmit the power that the wind turbine blades produce under the action of the wind to the generator and to provide the rotational speed required for generation. The control system is the center of a modern wind power generator. Based on wind speed and wind direction, it controls the generator system so that it runs at a steady voltage and frequency, connects to and disconnects from the grid automatically, raises an alarm for any abnormal situation, and shuts down automatically if necessary.
In operation, the wind speed always fluctuates unsteadily, which can leave grid-connected generation insufficient, so a reasonable forecast and control of wind speed would let the wind turbine generate electricity more efficiently. Because wind speed varies randomly, the wind power output varies with it. The prediction of wind power and the accuracy of the prediction method are therefore especially important to the safe operation of the wind power grid. In a statistical wind power prediction model, a number of wind speed values within a fixed period are usually chosen as the input, corresponding to the wind power output in the next short term. However, mapping a fixed number of wind speeds to the wind power of the next short term does not always predict well. In particular, a longer data record requires more computation and contains more outliers, which adds to the preprocessing burden. We therefore propose a wind power prediction method based on dataset matrix mapping, in which time-sequence prediction over sliding data windows replaces the original one-shot short-term prediction.
3. The Proposed Method
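In this paper, a combined NPLS model integrated with a neural network is proposed; its structure is shown in Figure 2. The input of the neural network is the continuous historical wind speed $X$ in the sliding data windows, where a unit sliding window consists of $m$ historical wind speeds and there are $n$ sliding data windows in a short-term prediction, so that $X \in \mathbb{R}^{n \times m}$. $X_E = [X \;\; G]$ is the input of the PLS regression after the extension step, in which the outputs of the hidden nodes of the neural network form the extension term $G$. Therefore, the wind power $Y$ based on the historical wind speed dataset can be obtained as follows:
$$Y = XA + GB + \mathbf{1}b^{\top}. \tag{1}$$
Assuming that the number of hidden nodes is $h$, then $G \in \mathbb{R}^{n \times h}$. $A$ is the linear weighting matrix of $X$; $B$ is the nonlinear weighting matrix of the outputs of the hidden nodes of the neural network; $\mathbf{1}$ is the unit column vector; $b$ is the bias vector.
The external PLS part first uses the external model to obtain the feature vectors $t$ and $u$, then regresses $u$ on $t$, and finally obtains the regression model of $Y$ on $X_E$. The regression equations are
$$X_E = TP^{\top} + E, \qquad Y = UQ^{\top} + F, \tag{2}$$
where $T = [t_1, \ldots, t_a]$ and $U = [u_1, \ldots, u_a]$ are the latent variables of $X_E$ and $Y$; $P$ and $Q$ are the loading matrices of $X_E$ and $Y$, respectively; $a$ is the number of latent vectors; $E$ and $F$ are the residual error matrices of $X_E$ and $Y$, respectively; and the superscript $\top$ denotes the transpose operation.
Since $t$ and $u$ should represent $X_E$ and $Y$ well, and $t$ should be able to explain $u$, the relation between $U$ and $T$ can be written as
$$U = TD, \tag{3}$$
where $D$ is the regression coefficient matrix.
The optimization criterion for PLS is
$$\max_{w,\,v}\ \operatorname{Cov}(t, u) = \max_{w,\,v}\ \operatorname{Cov}(X_E w,\ Y v), \tag{4}$$
where $w$ and $v$ are the weight vectors of $X_E$ and $Y$, respectively.
Let $\lVert w \rVert = 1$ and $\lVert v \rVert = 1$; then, for mean-centered $X_E$ and $Y$ and up to a constant factor, the optimization problem of PLS can be written as
$$\max_{w,\,v}\ w^{\top} X_E^{\top} Y v \quad \text{s.t.} \quad w^{\top} w = 1,\ \ v^{\top} v = 1. \tag{5}$$
Using the Lagrangian method to solve the problem, the Lagrangian function is defined as
$$L = w^{\top} X_E^{\top} Y v - \lambda_1 \left(w^{\top} w - 1\right) - \lambda_2 \left(v^{\top} v - 1\right), \tag{6}$$
where $\lambda_1$ and $\lambda_2$ are Lagrange multipliers.
Setting $\partial L / \partial w = 0$ and $\partial L / \partial v = 0$ gives $X_E^{\top} Y v = 2\lambda_1 w$ and $Y^{\top} X_E w = 2\lambda_2 v$; with the constraints, $\theta = 2\lambda_1 = 2\lambda_2 = w^{\top} X_E^{\top} Y v$, and the solution of the Lagrangian function constructed in formula (6) is
$$X_E^{\top} Y Y^{\top} X_E\, w = \theta^{2} w, \qquad Y^{\top} X_E X_E^{\top} Y\, v = \theta^{2} v, \tag{7}$$
where $w$ and $v$ are the eigenvectors corresponding to the same maximum eigenvalue $\theta^{2}$ of $X_E^{\top} Y Y^{\top} X_E$ and $Y^{\top} X_E X_E^{\top} Y$, respectively. Thus, $t = X_E w$ and $u = Y v$.
Reconstruct $X_E$ and $Y$ with
$$X_E = t p^{\top} + E, \qquad Y = t r^{\top} + F, \tag{8}$$
where $p$ and $r$ are the loading vectors; $E$ and $F$ are the residual error matrices.
After the iteration has been performed for $a$ latent vectors, with $X_E$ and $Y$ replaced by the residuals $E$ and $F$ at each step, the regression equation is
$$Y = t_1 r_1^{\top} + t_2 r_2^{\top} + \cdots + t_a r_a^{\top} + F_a, \tag{9}$$
where $F_a$ is the residual error matrix.
To implement the proposed NPLS model, BP neural networks and RBF neural networks are adopted in the external NPLS model (BPE-NPLS and RBFE-NPLS), and PLS, BPNN, and RBFNN are also used for prediction for comparison. For BPNN, the activation function of the hidden layer is the tangent sigmoid function
$$f(x) = \frac{2}{1 + e^{-2x}} - 1, \tag{10}$$
where $x$ is the input of the hidden layer.
The Gaussian function of the hidden layer of RBFNN can be expressed as
$$g_j(x) = \exp\!\left(-\frac{\lVert x - c_j \rVert^{2}}{2\delta_j^{2}}\right), \tag{11}$$
where $c_j$ is the central vector of the $j$th hidden node and $\delta_j$ is the corresponding width parameter.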
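The extension step and the PLS iteration described in this section can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (the paper reports using MATLAB). Several things are assumptions made only for illustration: a single response (wind power), Gaussian hidden nodes whose centers and width are supplied by the caller instead of being trained, and the hypothetical function names `rbf_hidden`, `pls1_fit`, `npls_fit`, and `npls_predict`. For a single response, the dominant eigenvector of the weight problem is simply the normalized product of the transposed input residual and the response residual, which is what the loop uses.

```python
import numpy as np


def rbf_hidden(X, centers, width):
    """Gaussian hidden-node outputs exp(-||x - c_j||^2 / (2 * width^2))."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))


def pls1_fit(XE, y, n_lv):
    """PLS regression for one response, extracting up to n_lv latent variables."""
    x_mean, y_mean = XE.mean(axis=0), y.mean()
    E, f = XE - x_mean, y - y_mean            # centered residuals
    W, P, r = [], [], []
    for _ in range(n_lv):
        w = E.T @ f                           # dominant eigenvector direction
        norm = np.linalg.norm(w)
        if norm < 1e-12:                      # nothing left to explain
            break
        w = w / norm                          # unit-length weight vector
        t = E @ w                             # latent variable
        tt = t @ t
        p = E.T @ t / tt                      # input loading vector
        r_k = (f @ t) / tt                    # response loading (scalar here)
        E = E - np.outer(t, p)                # deflate the inputs
        f = f - r_k * t                       # deflate the response
        W.append(w)
        P.append(p)
        r.append(r_k)
    if W:
        Wm, Pm = np.array(W).T, np.array(P).T
        coef = Wm @ np.linalg.solve(Pm.T @ Wm, np.array(r))
    else:
        coef = np.zeros(XE.shape[1])
    return {"x_mean": x_mean, "y_mean": y_mean, "coef": coef}


def npls_fit(X, y, centers, width, n_lv):
    """Extend X with hidden-node outputs, then fit PLS on the extended matrix."""
    XE = np.hstack([X, rbf_hidden(X, centers, width)])
    return pls1_fit(XE, y, n_lv)


def npls_predict(model, X, centers, width):
    """Predict the response for new windows with a fitted model."""
    XE = np.hstack([X, rbf_hidden(X, centers, width)])
    return (XE - model["x_mean"]) @ model["coef"] + model["y_mean"]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 3))              # synthetic stand-in for wind speed windows
    y = X[:, 0] ** 3 + 0.5 * X[:, 1]          # synthetic nonlinear response
    model = npls_fit(X, y, centers=X[:4], width=1.0, n_lv=4)
    print(npls_predict(model, X[:5], centers=X[:4], width=1.0))
```

A tanh hidden layer could be substituted for `rbf_hidden` to mimic the BP-based variant; in either case the hidden-layer parameters would come from a trained network instead of being fixed by hand as they are here.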
In the next section, the proposed algorithm is verified on real data for short-term wind power prediction.
Because wind resources vary markedly on daily, seasonal, and annual scales, wind energy is highly volatile. Validating the model therefore requires data from different scenarios with different fluctuation characteristics.
To evaluate the performance of the proposed method, the prediction model is built for a single wind turbine (a doubly fed wind turbine with a capacity of 1.5 MW) of a typical dispersed wind farm in Northwest China. Three wind turbine working scenarios are selected for the experiments, illustrated in Figures 3–5, respectively. To compare the performance of the different prediction methods and avoid the influence of other factors, the same wind turbine in the same region is observed at different times, which gives test scenes with three typical wind fluctuation characteristics.
4.1. Condition I
Recorded on 2013/2/3 (winter), 00 a.m.–12 a.m. Wind speed and wind power vary widely between day and night: the wind is strong at night and weaker in the daytime. The mean 5-minute wind power is 727.877 kW and the mean 5-minute wind speed is 9.051 m/s.
4.2. Condition II
Recorded on 2013/6/8 (summer), 00 a.m.–12 a.m. Wind energy is rich, and wind power is high but gentle. The mean 5-minute wind power is 919.994 kW and the mean 5-minute wind speed is 9.795 m/s.
4.3. Condition III
As shown in Figure 5, recorded on 2013/5/4 (summer), 12 a.m.–12 p.m. The wind is moderate, and wind speed and power change fast. The mean 5-minute wind power is 805.351 kW and the mean 5-minute wind speed is 9.246 m/s.
In the experiments, both the wind speed and the wind power data are raw time-series data recorded at the dispersed wind farm, and they are mingled with null data. All data are recorded at five-minute intervals, and the null data are removed beforehand. The training set of the proposed prediction model is selected from the previous 12 hours using the shutters grouping strategy, which splits the data into groups of 12 wind speed points, each corresponding to one wind power point; 132 such series constitute a short-term prediction period. After training, the wind speeds in the following sliding data window are used to predict the wind power. In each sliding data window, twelve consecutive wind speeds are the input of the prediction model, and the output is the wind power at the next (thirteenth) point.
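The window construction can be sketched as follows. This is a minimal illustration that assumes the windows advance by one sample at a time; under that assumption, a 12-hour record at five-minute resolution (144 samples) with a window width of 12 yields exactly the 132 series mentioned above. The function name `make_windows` and its arguments are hypothetical.

```python
import numpy as np


def make_windows(speed, power, width=12):
    """Pair each run of `width` consecutive wind speeds with the next wind power.

    Row i of X holds speed[i : i + width]; y[i] is power[i + width].
    """
    speed = np.asarray(speed, dtype=float)
    power = np.asarray(power, dtype=float)
    if speed.ndim != 1 or speed.shape != power.shape:
        raise ValueError("speed and power must be 1-D arrays of equal length")
    n_windows = speed.size - width
    if n_windows <= 0:
        raise ValueError("record is shorter than one window plus one target")
    # Index matrix: one row of consecutive positions per window
    idx = np.arange(n_windows)[:, None] + np.arange(width)[None, :]
    return speed[idx], power[width:]


if __name__ == "__main__":
    # Synthetic 12-hour record at 5-minute resolution (144 samples)
    t = np.arange(144)
    speed = 9.0 + np.sin(t / 10.0)
    power = 80.0 * speed
    X, y = make_windows(speed, power)
    print(X.shape, y.shape)
```

For prediction, the same function can be applied to the data that follow the training period, so that each new window of twelve speeds produces one predicted power value.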
The parameter ranges can be determined from the resolution: the optimal number of latent variables (LVs) of PLS and the number of hidden nodes of the NN should lie within the sliding window width. In the experiments, the number of LVs is selected from 1 to 11, and the number of hidden nodes of the BP and RBF networks is selected from 1 to 12. The same training and prediction sets are also used for the SVM, whose kernel function is of Gaussian type.
5. Results and Discussion
For the experiments of the paper, PLS, BPNN, RBFNN, SVM, BPE-NPLS, and RBFE-NPLS are implemented in MATLAB R2010a. The running environment for all of the calculations is a general-purpose personal computer with an Intel i3-4150 CPU and 8 GB of RAM.
5.1. Condition I
For condition I, the prediction parameters of the different regression models are determined by the minimum RMSECV values. The cross-validation prediction errors of PLS, BP, RBF, BPE-NPLS, and RBFE-NPLS are illustrated in Figure 6. The optimal prediction parameters of PLS, BP, and RBF are always less than 3. The minimum of the error surface of BPE-NPLS is at 1 LV and 4 hidden nodes, and that of RBFE-NPLS is at 8 LVs and 2 hidden nodes.
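Choosing parameters by minimum RMSECV amounts to a grid search over candidate settings. The sketch below is a generic illustration, not the paper's procedure: the paper does not state its cross-validation scheme, so k-fold validation with contiguous blocks is an assumption, and principal component regression (`pcr_fit`, `pcr_predict`) is used only as a short stand-in model with a "number of components" parameter. It is not PLS or NPLS. All function names are hypothetical.

```python
import numpy as np


def rmsecv(fit, predict, X, y, param, n_folds=5):
    """Root-mean-square error over held-out folds (contiguous blocks)."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    sq_err = 0.0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False               # hold this block out
        model = fit(X[train], y[train], param)
        resid = y[test_idx] - predict(model, X[test_idx])
        sq_err += float(resid @ resid)
    return float(np.sqrt(sq_err / len(y)))


def select_by_rmsecv(fit, predict, X, y, grid, n_folds=5):
    """Return the grid entry with the smallest RMSECV and all scores.

    Grid entries must be hashable (ints, or tuples such as (n_lv, n_hidden)).
    """
    scores = {p: rmsecv(fit, predict, X, y, p, n_folds) for p in grid}
    best = min(scores, key=scores.get)
    return best, scores


def pcr_fit(X, y, k):
    """Stand-in model: regression on the first k principal components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    coef = Vt[:k].T @ ((U[:, :k].T @ (y - y_mean)) / s[:k])
    return x_mean, y_mean, coef


def pcr_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(132, 12))            # synthetic windows, not wind data
    y = X @ rng.normal(size=12) + 0.1 * rng.normal(size=132)
    best, scores = select_by_rmsecv(pcr_fit, pcr_predict, X, y, range(1, 12))
    print(best, round(scores[best], 4))
```

For a two-parameter model such as NPLS, the grid could hold tuples of (number of LVs, number of hidden nodes), and `fit` would unpack them; the resulting dictionary of scores corresponds to an error surface of the kind shown in the figures.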
The detailed numerical results are summarized in Table 1. In this situation, where the wind is strong in the evening and weaker in the daytime, RBFE-NPLS offers a relatively ideal predictive capability and is clearly superior to the poorly performing RBF model. SVM performs better than the neural networks under condition I, and its performance is slightly worse than that of NPLS. The RMSECV and RMSEP of RBFE-NPLS are the lowest, the and values are higher than the other methods, and the values are slightly 0.34% inferior to SVM.
In addition, one hidden node is selected for both the BP and RBF networks. The width of the SVM Gaussian kernel is 0.6. For BPE-NPLS, the number of latent variables and the number of hidden-layer nodes are 1 and 4, respectively, in this case; for RBFE-NPLS, they are 8 and 2, respectively.
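The error measures used in the comparison can be computed as in the short sketch below. The paper does not spell out their formulas, so the usual definitions are assumed: RMSE is the root of the mean squared residual (RMSECV when the predictions come from cross-validation, RMSEP when they come from the prediction set), and the squared correlation coefficient is taken to be the square of the Pearson correlation between measured and predicted power. The function names are hypothetical.

```python
import numpy as np


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((measured - predicted) ** 2)))


def squared_corr(measured, predicted):
    """Square of the Pearson correlation coefficient (assumed definition)."""
    r = np.corrcoef(np.asarray(measured, dtype=float),
                    np.asarray(predicted, dtype=float))[0, 1]
    return float(r * r)


if __name__ == "__main__":
    measured = [700.0, 720.0, 750.0, 690.0]   # illustrative values in kW
    predicted = [705.0, 715.0, 760.0, 700.0]
    print(rmse(measured, predicted), squared_corr(measured, predicted))
```

Note that a squared correlation close to 1 only indicates that predictions follow the measured trend; a constant offset would leave it unchanged while increasing the RMSE, which is one reason to report both measures.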
5.2. Condition II
For condition II, the cross-validation prediction errors of PLS, BP, RBF, BPE-NPLS, and RBFE-NPLS are illustrated in Figure 7. The optimal prediction parameters of PLS, BP, and RBF are always less than 3. The minimum of the error surface of BPE-NPLS is at 1 LV and 2 hidden nodes, and that of RBFE-NPLS is at 2 LVs and 5 hidden nodes.
The numerical results are summarized in Table 2. In this situation, where wind energy is consistently rich and wind power is high but gentle in summer, RBFE-NPLS still offers a relatively ideal predictive capability and is clearly superior to the poorly performing RBF and SVM models. The RMSECV and RMSEP of RBFE-NPLS are the lowest, the and values are higher than the other methods, and the values are 2.58% inferior to RBF network. The results for the original NN models, particularly those with flexible structures such as RBF, show that simpler model structures (such as the PLS model) may outperform them despite their generalization capabilities, and that the proposed NPLS method can improve the accuracy of prediction effectively. In this case, the NN model outperforms SVM. The values of BP are 6.96% higher than SVM, and the values of BP are 24.97% higher than SVM.
In addition, one hidden node is selected for each of the BP and RBF networks. The width of the SVM Gaussian kernel is 0.6. For BPE-NPLS, the number of latent variables and the number of hidden-layer nodes are 1 and 2, respectively, in this case; for RBFE-NPLS, they are 2 and 5, respectively.
5.3. Condition III
For condition III, the cross-validation prediction errors of PLS, BP, RBF, BPE-NPLS, and RBFE-NPLS are illustrated in Figure 8. The optimal prediction parameters of PLS, BP, and RBF are always less than 3. The minimum of the error surface of BPE-NPLS is at 1 LV and 1 hidden node, and that of RBFE-NPLS is at 2 LVs and 7 hidden nodes.
The detailed numerical results are summarized in Table 3. In this situation, where the wind is moderate and the wind speed and power change slowly, PLS has the ideal RMSECV and , which, taking PLS as the benchmark, are 0.04% and 1.97% superior to those of RBFE-NPLS. RBFE-NPLS has the ideal value of and RMSEP. Generally, PLS offers a relatively ideal predictive capability in this condition, and the NPLS model outperforms the neural networks. The performance of SVM is close to that of RBFE-NPLS. The values of SVM are only 0.57% lower than RBFE-NPLS, and the SVM are 3.13% lower than RBFE-NPLS.
In addition, 2 and 1 hidden nodes are selected for the BP and RBF networks, respectively. The width of the SVM Gaussian kernel is again chosen as 0.6. For BPE-NPLS, the number of latent variables and the number of hidden-layer nodes are 1 and 1, respectively, in this case; for RBFE-NPLS, they are 2 and 7, respectively.
It is noteworthy that the and values are generally low for the other two conditions across all models. This low prediction performance is due to the anomalous fluctuation characteristics of the wind farm, whichever strategy is employed. A number of factors could contribute to these limitations of the prediction process, such as the instability of the grid power flow and the fluctuation of the wind itself. Here, changing the scale of the sliding data window or moving from a short-term to an ultra-short-term forecast cycle may make the analysis more precise.
Although wind data are inherently nonlinear, the nonlinear ANN models are inferior to the linear PLS models for most components in our experiments. The poor performance of the ANN models may be related to the size of the available experimental datasets (the number of training samples) relative to their dimensionality and to the level of noise. The characteristics of the NPLS paradigm, which essentially estimates relatively simple local models together with the functions that interpolate between them, make it more robust in the circumstances discussed, and the RBFE-NPLS method shows good all-round performance in the experiments.
This paper proposes a wind power prediction algorithm structure based on PLS and uses BPNN and RBFNN to implement the framework. Its advantages are as follows. First, it uses the learning ability of the neural network to implement and refine the NPLS framework for wind power prediction. Second, it has higher prediction capability than the neural network and PLS methods on their own. Third, through the hybrid learning process, all the best parameters of the membership functions are obtained. The experimental results verify that RBFE-NPLS adapts to different wind situations better than the other methods discussed. Since the training time may be affected by the number of inputs, in future work we will use feature extraction approaches to further improve the training performance of the proposed algorithm.
The data of this article are derived from measured data of the Lang'er Gou wind farm. The wind farm is located in the southeast of Dingbian County, Yulin, Shaanxi Province, with an elevation of 1440 m to 1710 m. The capacity of the single wind turbine is 2.5 kW. To access the data of the study, contact the author of this article, Dr. Qian Wang.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
This research was funded by the National Natural Science Foundation of China with Project no. 51507135 and State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources with Grant no. LAPS15011.
M. B. Ozkan and P. Karagoz, “A novel wind power forecast model: Statistical hybrid wind power forecast technique (SHWIP),” IEEE Transactions on Industrial Informatics, vol. 11, no. 2, pp. 375–387, 2015.
G. S. Bhatia and G. Arora, “Radial basis function methods for solving partial differential equations-A review,” Indian Journal of Science and Technology, vol. 9, no. 45, pp. 1–18, 2016.
H. Cao, T. Naito, and Y. Ninomiya, “Approximate RBF Kernel SVM and Its Applications in Pedestrian Classification,” in Proceedings of the 1st International Workshop on Machine Learning for Vision-based Motion Analysis - MLVMA'08, 2008.
S. Fang and H.-D. Chiang, “A High-Accuracy Wind Power Forecasting Model,” IEEE Transactions on Power Systems, vol. 32, no. 2, pp. 1589-1590, 2017.