Abstract

Considering the inherent variability and uncertainty of wind power generation, this study proposes a clustering technique that combines a self-organizing map (SOM) with rough set theory (RST) to extract relevant knowledge and to select the most similar historical situations and effective data for wind power forecasting with numerical weather prediction (NWP). By integrating the SOM and RST methods to cluster the historical data into several classes, the approach can find similar days and uncover hidden rules. After this data preprocessing, the selected samples improve the forecast accuracy: an echo state network (ESN) trained on the class of the forecasting day is adopted to forecast the wind power output. The developed methods are applied to a case of power forecasting for a wind farm located in northwest China, with wind power data from April 1, 2008, to May 6, 2009. To verify its effectiveness, the performance of the proposed method is compared with that of a traditional backpropagation (BP) neural network. The results demonstrate that knowledge mining leads to a promising improvement in wind farm power forecasting performance.

1. Introduction

With increasing resource constraints and environmental pressure, wind power generation, as an important application of renewable energy, is attracting more and more attention. China has enormous potential for wind energy utilization, with rich wind energy resources mainly in the grasslands and Gobi of northwest, north, and northeast China, as well as coastal areas and islands in east and southeast China. Owing to rapidly developing wind power technology and the distribution characteristics of the wind resource, the trend in wind power development is toward large-scale, concentrated installations. According to statistics, by the end of 2012 the capacity of grid-connected wind generation had reached 62,660 MW, generating 100.8 billion kWh of clean electricity, about 2% of the total electricity supply in China during 2012 [1].

Moreover, in a wind resources system, various uncertainties exist in a number of system components as well as their interrelationships, such as the random characteristics of natural processes (e.g., climate change) and weather conditions, errors in estimated modeling parameters, and the complexities of system operation. As a large amount of wind power is integrated into the power system, the intrinsic intermittency and uncertainty of wind resources have a great impact on the stability of the whole power system. Thus, precise short-term forecasting of wind farm power is necessary and brings many benefits, including cutting large spinning reserves, reducing the cost of wind power generation, and improving the safety and reliability of power system operation.

Previously, physical methods and statistical approaches were the main ways of forecasting wind power; physical conditions such as geography, topography, temperature, and pressure are employed to calculate the power generation capacity of a wind farm [2, 3]. These methods require accurate meteorological data and involve difficult modeling processes. In addition, they usually need to establish the relationship between abundant historical wind power data and other variables, and they are used for long-term wind power forecasting based on a large database. Therefore, the emphasis of these methods is on building time-series or dynamic models based on experience gained from historical data.

Considering the many uncertain factors and disturbances of wind power generating systems, the variation of wind power is largely affected by complex random factors (e.g., wind speed); significant errors can easily be generated by time-series models and regression algorithms, and an expert system needs a mass of knowledge and experience and has poor maintainability. As a popular simulation method, the neural network is an alternative for handling random nonlinear complex mappings without identifying the transformation rules and has been applied in many fields (e.g., environmental modeling and system analysis) [4–6]. In this approach, the historical data of wind power and impact factors (e.g., uncertain parameters) are taken as the input variables. Moreover, the forecasting model is built after intelligent learning and training to avoid errors or at least minimize the impact of random factors on the simulation process. These advantages make the neural network a suitable method for this complex modeling problem.

In addition, knowledge mining can extract connotative, unknown, and valuable knowledge or rules from a large-scale database, which is an area of extensive application value [7]. Knowledge mining has been widely applied to, for example, fuzzy time series [8], customer relationship management [9], and agricultural production prediction [10]. Climate factors, for example, wind speed, wind direction, temperature, and humidity, have a great impact on wind farm power changes [11]. Therefore, considering the characteristics of wind power forecasting, knowledge mining based on environmental simulation is proposed to build a data and knowledge base including historical wind power and related climate factors and to extract similar historical situations through knowledge mining. Meanwhile, a new sample set is generated from the historical situations to forecast future wind power with similar characteristics. Moreover, classification is an important topic in data mining research. Given a set of data records, the classification problem is concerned with the discovery of classification rules that allow records to be correctly classified.

Based on the above, the objective of this paper is to propose an improved self-organization mapping neural network to classify the historical data and extract useful rules for precise wind power forecasting. A subset of the samples with similar weather conditions is used for training the neural network. The remainder of this paper is organized as follows. Section 2 describes the self-organization mapping based on rough set theory, after a brief introduction to the relevant theory. The detailed structure of the echo state network is presented in Section 3. In Section 4, a case study is carried out to demonstrate the effectiveness of the proposed approach, and its performance is compared with that of a traditional neural network; the results analysis and discussion are also presented. Finally, remarks and conclusions of this study are given in Section 5.

2. Knowledge Mining Based on Environmental Simulation

2.1. Self-Organization Mapping Theory

Self-organization mapping (SOM), first proposed by Kohonen in 1981, is a neural network with unsupervised learning [12]. It has a certain topological structure which is adjusted through the input information, and pattern recognition is completed by the synergy among multiple neurons [13, 14]. The special idea of SOM is that there is no need to initialize cluster centers or provide guidance information; the weight information of the neurons is self-adaptively adjusted by the input data.

Figure 1 illustrates the schematic diagram of the SOM neural network. Generally, there are two layers in a SOM neural network, the input layer and the competition layer. The number of neurons in the input layer is equal to the dimension of the samples. Every neuron in the input layer connects to the neurons in the competitive layer with variable weight values. The neurons in the competitive layer compete for the opportunity to respond to the input pattern, and the neuron whose weight most closely matches the presented input pattern is the winner neuron, or best matching unit (BMU). There also exist partial connections between the neurons in the competitive layer. The two-dimensional grid is the most common arrangement of the neurons in the competition layer. The basic principle of SOM is described as follows.

(1) Competition Process. Let $X = (x_1, x_2, \ldots, x_n)^T$ be an input sample with $n$ dimensions, where $n$ is the number of input neurons. The number of neurons in the two-dimensional competition layer is $M$ ($M = m \times m$). The connection weight between input neuron $i$ and competition neuron $j$ is denoted $w_{ij}$, where $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, M$.

Calculate the inner product of the input vector and the connection weight of each competition neuron:
$$s_j = W_j^T X = \sum_{i=1}^{n} w_{ij} x_i, \quad j = 1, 2, \ldots, M. \tag{1}$$

Select the winner neuron of the competition process, that is, the best matching unit (BMU). The basic rule is that the larger the inner product, the closer the neuron is to the input vector, which indicates that the neuron matches the presented input pattern. The winning formula is
$$W_{j^*}^T X = \max_{j} \left\{ W_j^T X \right\}, \tag{2}$$
where $W_j = (w_{1j}, w_{2j}, \ldots, w_{nj})^T$ for all $j = 1, 2, \ldots, M$. Finally, only one neuron wins, and $j^*$ is the winner neuron.

(2) Learning Process. The SOM neural network is arranged according to a two-dimensional structure; each neuron has a promoting effect on neighboring neurons and, in contrast, an inhibiting effect on those far away, as shown in Figure 2.

In Figure 2, $d$ denotes the distance between a neuron and its neighbor, and $\Delta w$ is the change in the connection weight. It indicates that neurons within a certain scope are promoted, with their weights increasing, while neurons outside the scope are inhibited and their weights reduced.

In the SOM neural network, the weights are adjusted according to the Kohonen learning rule. The main idea is to move the winner neuron and its neighbors closer to the input sample by modifying their weights. The formula is expressed as
$$W_j(t+1) = W_j(t) + \eta(t)\left[X(t) - W_j(t)\right], \tag{3}$$
where $W_j(t)$ and $W_j(t+1)$ represent the weight vector of neuron $j$ at times $t$ and $t+1$, respectively; $X(t)$ is the input sample at time $t$; and $\eta(t)$ is the learning rate at time $t$, ranging in $(0, 1)$. At the beginning, the learning rate is largest, and the value of $\eta(t)$ decreases with training. Since weight oscillation may occur during the training process, the learning rate needs to be reduced gradually. The training neighborhood comprises the neurons around the winner neuron whose weights will be adjusted. In the initial training, the scope of the neighborhood is largest, which gives more neurons the opportunity to learn. As training proceeds, each neuron comes to represent its own category, and its neighborhood shrinks.

Train the SOM neural network with all the input samples several times; after the procedure mentioned above, the neurons in the competitive layer represent the cluster centers, which achieves the clustering effect. As a result, the trained network can be utilized for pattern recognition. The algorithm itself can be summarized as follows.

Step 1. Initialize the SOM neural network. All the weight values are initialized to random values in $(0, 1)$.

Step 2. A training vector is picked randomly from the training set.

Step 3. Calculate the Euclidean distance between the input vector $X$ and each weight vector $W_j$: $d_j = \|X - W_j\|$, $j = 1, 2, \ldots, M$. The neuron $j^*$ with the minimum distance is the winning node and becomes the BMU.

Step 4. According to (3), the weight vectors of the winning neuron and its neighbors are updated.

Step 5. Choose the new input vector and repeat Steps 3 and 4 until stop criterion is satisfied, for example, reaching sufficiently large iterations.
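The steps above can be sketched in a short script. This is a minimal illustration under stated assumptions: the grid size, epoch count, and the linearly shrinking learning rate and hard neighborhood window are common but arbitrary choices, and all function and variable names are ours, not the paper's.

```python
import numpy as np

def train_som(samples, grid=(4, 4), epochs=50, eta0=0.9, seed=0):
    """Minimal SOM following Steps 1-5: random init, BMU search, weight update."""
    rng = np.random.default_rng(seed)
    n_units = grid[0] * grid[1]
    dim = samples.shape[1]
    # Step 1: weights initialized to random values in (0, 1)
    w = rng.random((n_units, dim))
    # Grid coordinates of each competition-layer neuron
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])])
    for t in range(epochs):
        eta = eta0 * (1 - t / epochs)               # decreasing learning rate
        radius = max(grid) / 2 * (1 - t / epochs)   # shrinking neighborhood
        for x in rng.permutation(samples):          # Step 2: pick samples randomly
            # Step 3: BMU = neuron with minimum Euclidean distance
            bmu = np.argmin(np.linalg.norm(w - x, axis=1))
            # Step 4: move the BMU and its grid neighbors toward the sample
            d = np.linalg.norm(coords - coords[bmu], axis=1)
            h = (d <= radius).astype(float)         # hard neighborhood window
            w += eta * h[:, None] * (x - w)
    return w
```

After training, a new sample is assigned to the cluster of its nearest weight vector, which is how the trained network serves as a pattern recognizer.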

In fact, during the competitive procedure, several neurons may match the input vector closely; thus, judging only one of them as the winner seems improper. To address this issue, the concept of rough set theory is introduced, and a self-organizing neural network based on rough sets is proposed in this paper.

2.2. Rough Set Theory

Rough set theory (RST) is a useful tool for data mining and decision support. In particular, it is popular for dealing with incomplete information, vague concepts, and uncertain data [15]. Besides, combined with other data mining algorithms, it can produce further hybrid data mining algorithms [16, 17].

Suppose $U$ is a nonempty universe with finitely many members, and $R$ is a family of equivalence relations on $U$; thus, the knowledge base can be expressed as a relation system $K = (U, R)$.

For a nonempty subset $P \subseteq R$, the intersection of all the equivalence relations in $P$ is called the $P$-indiscernibility relation, denoted $\mathrm{IND}(P)$:
$$\mathrm{IND}(P) = \bigcap_{R_i \in P} R_i, \tag{4}$$
where $[x]_{\mathrm{IND}(P)}$ represents the equivalence class containing $x$ under the relation $\mathrm{IND}(P)$.

Lower and upper approximations are important concepts of RST. They help to measure the description of uncertain knowledge. Suppose $X$ is a subset of $U$; then, the lower approximation $\underline{R}X$ and upper approximation $\overline{R}X$ of $X$ are
$$\underline{R}X = \{x \in U : [x]_R \subseteq X\}, \tag{5}$$
$$\overline{R}X = \{x \in U : [x]_R \cap X \neq \emptyset\}. \tag{6}$$

Meanwhile, the boundary region is
$$BN_R(X) = \overline{R}X - \underline{R}X, \tag{7}$$
and it consists of those objects that cannot be classified with certainty as members of $X$ with the knowledge in $R$.

If $BN_R(X) \neq \emptyset$, it indicates that $\underline{R}X \neq \overline{R}X$, and $X$ cannot be expressed precisely by the equivalence classes of $R$. Thus, the set $X$ is called "rough" (or "roughly definable"); otherwise, $X$ is crisp.
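These definitions translate directly into code. The sketch below is an illustration on a toy universe (function and variable names are ours): the equivalence relation is given as a partition of the universe into equivalence classes, and the lower approximation, upper approximation, and boundary region are computed by the membership tests above.

```python
def approximations(partition, X):
    """Lower/upper approximation and boundary of set X w.r.t. a partition.

    partition: list of equivalence classes (sets) covering the universe.
    X: a subset of the universe, given as a set.
    """
    lower, upper = set(), set()
    for cls in partition:
        if cls <= X:        # equivalence class entirely inside X -> lower
            lower |= cls
        if cls & X:         # equivalence class intersects X -> upper
            upper |= cls
    return lower, upper, upper - lower   # boundary = upper - lower
```

For example, with partition {{1, 2}, {3, 4}, {5}} and X = {1, 2, 3}, the lower approximation is {1, 2}, the upper approximation is {1, 2, 3, 4}, and the nonempty boundary {3, 4} shows that X is rough.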

2.3. SOM Neural Network Combined with RST

RST can handle uncertain or imprecise knowledge representation; thus, it can be employed to deal with the imprecision in the learning process of the SOM neural network. The novel network still has a two-layer structure; the difference from the traditional SOM lies in the competitive layer, where each neuron has an upper approximation and a lower approximation. To judge which neuron wins, we determine whether the input vector belongs exactly to the lower approximation of one neuron or imprecisely to the upper approximations of several neurons. Through this process, the imprecision in judging the winner is resolved properly.

Besides, we can set different learning rates for these two matching results. If the input vector belongs to a lower approximation, it gets the greater learning rate $\eta_l$; otherwise, it gets the lower learning rate $\eta_u$. The idea is that when the input vector belongs to a pattern exactly, learning is accelerated; when it belongs to a pattern imprecisely, its learning effect is reduced. The key issue of the novel SOM neural network is how to determine whether the input vector belongs to a certain neuron or to a set of neurons.

After selecting the best match neuron $j^*$, it is still necessary to choose some suboptimal neurons using the suboptimal match degree, calculated as
$$\mu_j = \frac{W_j^T X}{W_{j^*}^T X}, \tag{8}$$
where $\mu_j$ is the key factor that determines the lower or upper approximation. The set of suboptimal neurons is then defined as
$$C = \left\{ j : \mu_j \geq \theta,\; j \neq j^* \right\}, \tag{9}$$
where $W_j$ is the weight of the $j$th neuron and $\theta$ is the threshold. Set $C$ is the collection of suboptimal neurons with match degree higher than $\theta$. If set $C$ is empty, it indicates that there are no close-match neurons except the best match one, and the input vector belongs to its lower approximation. Otherwise, the input vector belongs to the upper approximations of the best match and suboptimal neurons. The two learning processes of the SOM neural network are expressed, respectively, as follows.

(1) The neuron belongs to the lower approximation exactly:
$$W_j(t+1) = W_j(t) + \eta_l\left(1 - \frac{t}{T}\right)\left[X(t) - W_j(t)\right]. \tag{10}$$

(2) The neuron belongs to the upper approximation imprecisely:
$$W_j(t+1) = W_j(t) + \eta_u\left(1 - \frac{t}{T}\right)\left[X(t) - W_j(t)\right], \tag{11}$$
where $T$ is the total number of training iterations, and $\eta_l$ and $\eta_u$ are the learning rates of the lower and upper approximations.

The detailed procedure of the proposed SOM neural network is presented as follows.

Step 1. Initialize the network: set $t = 0$ and let $X$ denote the input sample vector; the initial weights are small random numbers; $\eta_l$ and $\eta_u$ are set to 0.9 and 0.5, respectively.

Step 2. According to (1), calculate the inner products of the input sample and the neurons in the output layer.

Step 3. Select the best match output neuron.

Step 4. Select the suboptimal match neurons as a collection $C$.

Step 5. Adjust the weights according to (10)-(11).

Step 6. Return to Step 2 and repeat the process until all the samples have been tested or the learning rates have been reduced to 0.
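A compact sketch of Steps 1-6 follows. It is an illustration only: the threshold, unit count, and the linear decay applied to both learning rates are our assumptions (the paper gives only the initial rates 0.9 and 0.5), the names are ours, and inner-product matching is assumed to operate on nonnegative, normalized inputs.

```python
import numpy as np

def train_rs_som(samples, n_units=9, epochs=40, eta_lower=0.9, eta_upper=0.5,
                 theta=0.95, seed=0):
    """Rough-set SOM sketch: winner by inner product, suboptimal set by
    match degree, and two learning rates (lower vs. upper approximation)."""
    rng = np.random.default_rng(seed)
    w = rng.random((n_units, samples.shape[1]))   # small random initial weights
    for t in range(epochs):
        decay = 1 - t / epochs                    # both rates shrink over time
        for x in rng.permutation(samples):
            s = w @ x                             # inner products, as in (1)
            best = int(np.argmax(s))              # best match neuron
            # Match degree of each neuron relative to the best match
            mu = s / s[best] if s[best] != 0 else np.zeros_like(s)
            subopt = [j for j in range(n_units) if j != best and mu[j] >= theta]
            if not subopt:
                # x belongs to the lower approximation: faster learning
                w[best] += eta_lower * decay * (x - w[best])
            else:
                # x belongs to the upper approximation of several neurons
                for j in [best] + subopt:
                    w[j] += eta_upper * decay * (x - w[j])
    return w
```

The design choice mirrors the text: an unambiguous match moves one neuron strongly, while an ambiguous match moves the whole candidate set gently, so contested regions of the input space are learned more cautiously.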

3. Echo State Network

Recurrent neural networks (RNNs) are very powerful tools for solving complex temporal machine learning tasks [18]. In 2001, a new approach to RNN design and training was independently proposed under the names of liquid state machines and echo state networks. Reservoir computing (RC) is an RNN technique that offers a solution to the many problems associated with typical RNN architectures which have prevented their widespread use [19–21].

The classic echo state network contains three layers: input layer, hidden layer, and output layer (shown in Figure 3). The hidden layer is also called the dynamic reservoir. In traditional recurrent networks, the number of neurons is usually controlled within 12, while there is an abundance of neurons in the reservoir of an ESN, about 20 to 500, providing good short-term memory. Suppose there are $K$ units in the input layer, $L$ units in the output layer, and $N$ units in the hidden layer. Generally, $W^{\mathrm{in}}$ represents the connection weight matrix of the input layer; $W$ is the connection weight matrix within the reservoir, which is kept 1%–5% sparsely connected. In addition, the spectral radius of $W$ is usually less than 1. These properties give the reservoir dynamic memory and a certain stability. $W^{\mathrm{out}}$ and $W^{\mathrm{back}}$ denote the connection weight matrices of the output layer and the feedback, respectively. It should be noted that $W^{\mathrm{in}}$, $W$, and $W^{\mathrm{back}}$ are decided randomly before the network is established and, once determined, do not change; $W^{\mathrm{out}}$ is gained by training. Therefore, the main goal of network training is to determine the value of $W^{\mathrm{out}}$. The schematic view of the echo state network is presented in Figure 3.

The primary algorithm of the echo state network is to excite the reservoir with the input information and to generate the state variables in the reservoir. Through linear regression between the state variables and the desired output, the connection weights of the output layer can be determined. The state variables and the output are updated as follows:
$$x(t+1) = f\left(W^{\mathrm{in}} u(t+1) + W x(t) + W^{\mathrm{back}} y(t)\right), \tag{12}$$
$$y(t+1) = f^{\mathrm{out}}\left(W^{\mathrm{out}}\left[u(t+1); x(t+1)\right]\right), \tag{13}$$
where $u(t)$, $x(t)$, and $y(t)$ are the input, reservoir state, and output at time $t$, and $f$ and $f^{\mathrm{out}}$ are the activation functions of the reservoir and the output; the most commonly used is the hyperbolic tangent function.
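The update and readout equations can be sketched as follows. This is a minimal sketch under stated assumptions: the feedback term $W^{\mathrm{back}}$ is omitted, the readout is linear and trained by ridge regression on collected states, and the reservoir size, density, washout length, and regularization constant are illustrative choices, not the paper's settings.

```python
import numpy as np

def esn_fit(u, y, n_res=100, rho=0.9, density=0.05, washout=20, ridge=1e-6, seed=0):
    """Train a minimal ESN: random sparse reservoir scaled to spectral
    radius rho < 1; readout weights found by ridge regression."""
    rng = np.random.default_rng(seed)
    n_in = u.shape[1]
    w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    # Sparse random reservoir, ~density fraction of connections nonzero
    w = rng.uniform(-0.5, 0.5, (n_res, n_res)) * (rng.random((n_res, n_res)) < density)
    sr = max(abs(np.linalg.eigvals(w)))
    w *= rho / max(sr, 1e-12)                    # enforce spectral radius rho < 1
    # Drive the reservoir: x(t+1) = tanh(W_in u(t+1) + W x(t)), as in (12)
    x = np.zeros(n_res)
    states = []
    for t in range(len(u)):
        x = np.tanh(w_in @ u[t] + w @ x)
        states.append(x.copy())
    S = np.array(states)[washout:]               # discard transient states
    Y = y[washout:]
    # Linear readout by ridge regression: W_out = (S^T S + lambda I)^-1 S^T Y
    w_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_res), S.T @ Y)
    return w_in, w, w_out
```

Because only $W^{\mathrm{out}}$ is trained, fitting reduces to a single linear solve over the collected states, which is what makes ESN training fast compared with backpropagation through time.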

4. Case Study and Results Analysis

A wind farm in northwest China is considered as a case study to demonstrate the effectiveness of the proposed approach. In the study area, there are 66 wind turbines on the wind farm, with a total capacity of 49.5 MW. The historical meteorological data and wind power data from April 1, 2008, to May 6, 2009, are taken as the database. The forecasting model is solved through Matlab on a single core of a 32-bit Lenovo workstation running Windows 7 with two dual-core 2.60 GHz CPUs and 4.0 GB of RAM. We extract rules from past information to forecast the wind power load. The main factors considered here are wind scale, temperature, and humidity. The feature vector for knowledge mining is described as
$$F_i = \left(T_{\max}^i, T_{\min}^i, H_{\max}^i, H_{\min}^i, S_1^i, S_2^i, v_{\max}^i, v_{\min}^i, v_{\mathrm{avg}}^i\right), \tag{14}$$
where $T_{\max}^i$ and $T_{\min}^i$ are the highest and lowest temperatures on the $i$th day; $H_{\max}^i$ and $H_{\min}^i$ are the highest and lowest humidities on the $i$th day; $S_1^i$ and $S_2^i$ are the wind scales on the $i$th day; and $v_{\max}^i$, $v_{\min}^i$, and $v_{\mathrm{avg}}^i$ are the maximum, minimum, and average wind speeds on the $i$th day.

In order to eliminate the influence of differing dimensions among the variables, data preprocessing is the first task to be done.
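For example, min-max normalization (one common choice; the paper does not state which scheme was used) maps each feature to [0, 1] so that no variable dominates the clustering by virtue of its units:

```python
import numpy as np

def min_max_normalize(data):
    """Scale each column of a sample matrix to [0, 1] independently."""
    lo = data.min(axis=0)
    hi = data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return (data - lo) / span
```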

The proposed RS-SOM neural network is employed to cluster the historical information. To evaluate the compactness of the clustering results, the sum of squared errors (SSE) is adopted in this study; the smaller the SSE, the better the clustering effect. It is calculated as
$$\mathrm{SSE} = \sum_{k=1}^{K} \sum_{i=1}^{n_k} \left\| x_i - \mu_k \right\|^2, \quad x_i \in C_k, \tag{15}$$
where $C_k$ is the $k$th cluster, $\mu_k$ is the mean of the $k$th cluster, and $n_k$ is the number of samples belonging to the $k$th cluster.
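With cluster labels in hand, the SSE is straightforward to compute; a small sketch (variable names are illustrative):

```python
import numpy as np

def cluster_sse(samples, labels):
    """Sum of squared distances from each sample to its own cluster mean."""
    sse = 0.0
    for k in np.unique(labels):
        members = samples[labels == k]
        center = members.mean(axis=0)        # cluster mean
        sse += float(((members - center) ** 2).sum())
    return sse
```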

The clustering results are shown in Table 1. It can be seen that the 400 samples in the database are divided into 7 classes. The SSE of class 3 is the smallest, while that of class 6 is the largest. This indicates that the samples in class 3 are the most similar and that the shape of the curve can almost reflect the fluctuation of the wind output power on those days. Besides, the average wind power output curve of each class is illustrated in Figure 4. Each curve has significant features and differs obviously from the others. For example, in class 1, the valley of most samples' power output is at about 10:00, while the peak is at nearly 23:00. This is the most common situation, with the largest class membership of 118. The class with the fewest samples is class 5, whose shape is extremely irregular.

Since the weather report may not be accurate, and the longer the time lag the worse the forecasting result, this study only forecasts the wind power output for the next two days. The new input pattern can be classified using the trained RS-SOM neural network. The samples in the same class as the forecasting day are selected to train the ESN network for that day. To test the performance of the ESN, a BP neural network, which has been widely used in load forecasting, is also applied to the same task. Commonly used error evaluation indexes, including the mean absolute percentage error (MAPE), mean absolute error (MAE), root mean square error (RMSE), and normalized root mean square error (NRMSE), are used to assess the performance of the different forecasting methods:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{\hat{y}_t - y_t}{y_t} \right|, \tag{16}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| \hat{y}_t - y_t \right|, \tag{17}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( \hat{y}_t - y_t \right)^2}, \tag{18}$$
$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{y_{\max} - y_{\min}}, \tag{19}$$
where $\hat{y}_t$ is the forecasting value, $y_t$ is the actual value, and $n$ is the number of samples.
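The four indexes can be computed as below. This is a sketch under one assumption worth flagging: NRMSE is normalized here by the range of the actual series, which is a common convention but is not stated explicitly in the paper; function and variable names are ours.

```python
import numpy as np

def forecast_errors(actual, forecast):
    """MAPE, MAE, RMSE, NRMSE for a pair of series.

    Zero actual values are excluded from MAPE to avoid division by zero;
    NRMSE normalizes RMSE by the range of the actual values (one convention).
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    err = forecast - actual
    nz = actual != 0
    mape = float(np.mean(np.abs(err[nz] / actual[nz])))
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    nrmse = rmse / float(actual.max() - actual.min())
    return mape, mae, rmse, nrmse
```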

Figure 5 shows the forecasting results of the different methods and the actual wind power output. It indicates that the overall trend of the forecast power is in accordance with the actual situation. However, for the forecasting of peaks and valleys, the performance of the ESN is clearly better than that of the BP model; the deviation of the latter is larger at the extreme values. During the period from 1:00 to 8:00, both methods perform well, while at night, from 18:00 to 24:00, the deviation is larger. The error evaluation indexes for both methods are presented in Table 2, which shows that all of the indexes of the ESN are lower than those of the BP. The MAPE and MAE of the ESN are 0.1366 and 1.7771 MW, respectively, lower than the BP values of 0.1709 and 1.9607 MW. The RMSE and NRMSE of the ESN forecast are 2.2171 and 0.0019, and those of the BP are 2.4320 and 0.0023, respectively. As mentioned, the forecast accuracy for the study area obtained from the ESN model is substantially better than that typically seen from the BP model. This comparison reflects forecasts for all 66 available wind turbines, and the accuracy of the ESN forecasts was consistently better for the year and for each wind plant. It should be noted that accuracy would be even better if the data were adjusted for curtailments.

Comparing the results obtained from the two methods, several factors explain why the ESN forecasts are more accurate than those of the BP model. The wind farm is located in northwest China, where the management system for meteorological measurements is imperfect; meteorological data at the wind plant are often of poor quality and contribute to inaccuracy in traditional forecasting approaches. Wind direction readings from standard met towers may not even be applicable, since the surrounding terrain can affect air movement, and wind can ramp up or down quickly, which can produce misleading information. All of these problems, in a very complex and important prediction process, directly affect the accuracy of traditional forecasts. Data management is also vital to the accuracy of the forecasting method. By integrating the SOM and RST methods to cluster the historical data into several classes, the approach can find similar days and excavate the hidden rules, increasing the forecast accuracy. In addition, it can reflect the uncertainties of the input parameters, avoid data errors, and provide valid data for the ESN forecast.

However, compared with other approaches, there is still much room for improvement of the proposed method. For example, there is no uniform way to determine the relevant parameters of the ESN network (such as the spectral radius of the connection weight matrix within the reservoir), which are mainly obtained through extensive experiments. Besides, other neural networks with universal approximation capability, for example, the radial basis function (RBF) network, will be studied further to improve the forecasting accuracy and computation speed.

5. Conclusion

Wind power forecasting is an important tool for managing the inherent variability and uncertainty of wind power generation. Increasing forecasting accuracy can help to reduce the likelihood of an unexpected gap between scheduled and actual wind power generation, which can be extremely helpful for operators of power systems and wind power plants. In this study, we developed a database using historical meteorological and power output data. A self-organizing map combined with rough set theory is employed as a knowledge mining technique to discover and extract the rules. The classified samples are taken as the input of the echo state network to train the structure of the network. Through integrating the SOM and RST methods to cluster the historical data into several classes, the approach provides valid data for ESN forecasting. The developed methods are applied to a case of power forecasting for a wind farm located in northwest China with wind power data from April 1, 2008, to May 6, 2009. The results demonstrate the successful application of the proposed method, which performs better than BP; the accuracy of prediction is improved. However, the database is static in this study, which means that information from new samples is not added into the knowledge pool automatically. And if the database is small, it may not cover all situations comprehensively. Thus, the approach is best suited to wind farms with a long operation period.

Acknowledgments

This work was supported in part by the NSFC under Grant no. 71071052 and Grant no. 71201057, as well as “the Fundamental Research Funds for the Central Universities” under Grant no. 12QX23 and Grant no. 13X20. The authors would like to thank the anonymous reviewers and editors for their valuable comments, which greatly helped them to clarify and improve the contents of the paper.