Abstract

Accurate and reliable forecasting of the power generation energy of small hydropower (SHP) is essential for hydropower management and scheduling. Because most SHP plants have long operated without on-site supervision, historical power generation records are scarce, which makes a forecasting model difficult to develop. In this paper, the support vector machine (SVM) is chosen as the method for short-term power generation energy prediction because it has distinct advantages in small-sample, nonlinear, and high-dimensional pattern recognition problems. The genetic algorithm (GA) is used to identify appropriate parameters for the SVM prediction model. The GA-SVM prediction model is tested on short-term observations of power generation energy in Yunlong County and Maguan County in Yunnan province. A comparison of its performance with that of the ARMA model shows that the GA-SVM model is a promising candidate for predicting the short-term power generation energy of SHP.

1. Introduction

Small hydropower (SHP) is a globally recognized source of clean, renewable energy. It attracts wide attention around the world because of its significance for managing medium and small rivers, strengthening rural water conservancy infrastructure, meeting rural energy demand, improving the rural energy structure, reducing environmental pollution, responding to climate change, and promoting the development of the local economy [1–5]. Over the past two decades, the installed capacity of SHP has increased by more than 2.5 GW per year because it has many advantages, such as small scale, mature technology, short construction time, low investment, and near-zero pollution emissions, and it generally causes no resettlement or land submersion.

By the end of 2012, the installed capacity of SHP in China had exceeded 65 GW and its annual generation had exceeded 200 TWh, which account for about 30% of the national hydropower installed capacity and power generation, respectively, and both rank first in the world [6]. Unlike in other countries, SHP plays an important role in China’s rural electricity supply, as it is widely distributed across more than 1600 mountainous counties; approximately half of the territory, one-third of the counties, and a quarter of the total population depend on SHP for rural electricity supply [7, 8]. However, with the fast development of SHP and its large-scale connection to the power grid, its influence on the grid is becoming more and more evident, especially in southwest China, which is rich in SHP resources. SHP has become a major factor affecting the safe operation and development of the power grid. Most SHP plants are run-of-river plants without regulation ability, so their power output is markedly intermittent and seasonal because of the uncertainty of rainfall. In the flood season in particular, rainfall is heavy and concentrated, so SHP plants may generate much more power than in other periods. At the same time, large hydropower plants also generate more power. Under current transmission capacity, this can lead to wasted water resources and dumped electricity. Therefore, it is necessary to forecast the short-term power generation energy (STPGE) of SHP so that the regulation ability of large hydropower plants can be used to avoid this situation.

However, SHP plants are generally located in small, remote river basins with few hydrologic stations, and their management is weak because they have long operated without on-site supervision, so forecasting the STPGE of SHP is very difficult for lack of the necessary runoff data. At present, much research on short-term forecasting models for hydropower stations has been carried out, focusing on forecasting reservoir inflow [9–12], stream flow [13–15], or precipitation [16]. However, few studies address forecasting the STPGE of SHP stations [17]. Since the parameters greatly affect the performance of SVM, some studies have attempted to determine proper parameter values for their problems [18, 19]. However, for large-scale or real-time practical applications, the considerable search time is unacceptable. Heuristic algorithms have been successfully applied to many complex problems [20–22].

This paper presents a novel short-term forecasting model (named GA-SVM) for the power generation energy of SHP stations. In this study, the support vector machine (SVM), which is based on the structural risk minimization principle, is used to forecast power generation energy [18, 23–26], and its parameters are optimized by the genetic algorithm (GA) to obtain the optimal model structure [27, 28]. Because SHP plants or hydrounits are put into operation dynamically, the power generation energy of SHP is not comparable across different times; the installed capacity utilization hours of SHP are therefore selected as the input and output values of the proposed forecasting model. The method is applied to forecast the STPGE of the small hydropower stations in Yunlong County and Maguan County, Yunnan province, China. Compared with the conventional method, the proposed GA-SVM model exhibits superior performance, demonstrating its effectiveness as an approach to forecasting the STPGE of SHP.

The paper is organized as follows. Section 2, “Brief Introduction to SVM and GA,” briefly introduces the SVM and GA algorithms. Section 3 describes the proposed GA-SVM forecasting method. Section 4 applies the method to Yunnan province and compares the results with those of a conventional method. Section 5 concludes the paper.

2. Brief Introduction to SVM and GA

2.1. Support Vector Machines (SVM)

The SVM, developed by Vapnik [29], is based on statistical learning theory and implements the structural risk minimization principle rather than the empirical risk minimization principle implemented by most traditional ANN models. It seeks to minimize an upper bound on the generalization error instead of minimizing the training error and can achieve an optimum network structure. Many researchers have used SVM to build forecasting models in various fields, with much of the work focusing on rainfall forecasting. Dibike et al. demonstrated the capability of the SVM in hydrological prediction, such as modeling the rainfall-runoff process [30]. Other scholars have used the SVM for rainfall forecasts with lead times ranging from 1-2 days to 1 h [31]. In this paper, the SVM model is used to forecast the STPGE of SHP. The radial basis function (RBF), which has been shown to simplify the use of a mapping, is employed as the kernel function because it is more compact than other kernels and is able to shorten the computational training process and improve the generalization performance [30]. The RBF is also computationally simpler than a polynomial kernel, which has more parameters [32]. The equation for the RBF is of the form

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right), \tag{1}$$

where $x_i$ and $x_j$ are input vectors and $\sigma$ is the width of the Gaussian kernel.
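As a small illustration of the kernel, the sketch below evaluates a Gaussian RBF for two input vectors. It assumes the common form $\exp(-\|x - z\|^2 / (2\sigma^2))$; the function name `rbf_kernel` and the sample inputs are illustrative and do not come from the study.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel, assuming K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if len(x) != len(z):
        raise ValueError("x and z must have the same length")
    # Squared Euclidean distance between the two input vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Identical inputs give the maximum similarity of 1; similarity decays with distance.
same = rbf_kernel([0.2, 0.4], [0.2, 0.4], sigma=0.5)
near = rbf_kernel([0.2, 0.4], [0.3, 0.4], sigma=0.5)
far = rbf_kernel([0.2, 0.4], [0.9, 0.4], sigma=0.5)
```

A smaller `sigma` makes the kernel more local, so each training sample influences a narrower neighborhood of the input space.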

2.2. Genetic Algorithm (GA)

GA is a global optimization algorithm based on the “survival of the fittest” principle in Darwin’s theory of evolution, and it provides an efficient and robust search method for complex spaces. It is a probabilistic search algorithm well suited to global optimization. GA operates iteratively on a population of structures, each of which represents a candidate solution to the problem encoded as a string of symbols (a chromosome), and uses guided randomization to search the coded parameter space effectively. GA uses an encoding to transform the solution space of the problem into chromosome space, converting the decision variables into individual chromosomes with a given structure. During the iterations of the algorithm, the population of individuals produces the next generation through selection, crossover, and mutation, according to the rules set by the fitness function. Traits that improve fitness are inherited, while traits that reduce fitness are eliminated through selection as crossover and mutation are applied over the iterations. After continuous evolution, the fittest individuals survive and can be taken as an approximately optimal solution of the problem.
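The loop of selection, crossover, and mutation can be sketched as follows. This is a minimal real-coded GA written for illustration only: tournament selection, arithmetic crossover, Gaussian mutation, elitism, and all default rates are assumptions, not the operators or settings used in this paper, and the function name `genetic_minimize` is hypothetical.

```python
import random


def genetic_minimize(cost, bounds, pop_size=30, generations=50,
                     crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimize cost(x) inside box bounds with a small real-coded GA."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best, best_cost = None, float("inf")

    for _ in range(generations):
        # Fitness evaluation: a lower cost means a fitter individual.
        scored = [(cost(ind), ind) for ind in population]
        gen_cost, gen_best = min(scored, key=lambda s: s[0])
        if gen_cost < best_cost:
            best, best_cost = gen_best[:], gen_cost

        def select():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] <= b[0] else b[1]

        next_population = [best[:]]  # elitism keeps the best individual
        while len(next_population) < pop_size:
            p1, p2 = select(), select()
            if rng.random() < crossover_rate:
                w = rng.random()  # arithmetic crossover blends the parents
                child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
            else:
                child = p1[:]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[i] = min(hi, max(lo, child[i] + step))
            next_population.append(child)
        population = next_population

    # Evaluate the final generation as well.
    final_cost, final_best = min(((cost(ind), ind) for ind in population),
                                 key=lambda s: s[0])
    if final_cost < best_cost:
        best, best_cost = final_best[:], final_cost
    return best, best_cost


def sphere(x):
    """Toy cost function with its minimum at the origin."""
    return sum(v * v for v in x)


solution, solution_cost = genetic_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a parameter search for a forecasting model, `cost` would be replaced by a function that trains the model with the candidate parameters and returns its forecasting error.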

3. Short-Term Forecasting Model for Power Generation Energy Using GA-SVM

3.1. Forecasting Object

Generally, the daily power generation energy is directly selected as the forecasting object for the STPGE of SHP. However, because small hydropower plants or hydrounits in a region are put into operation dynamically, the installed capacity of SHP differs from one day to another. Since the power output of an SHP plant is close to its installed capacity in the flood season, the power generation energy also changes considerably as the installed capacity of SHP increases. The prediction performance of the model will suffer if only the power generation energy of SHP is used as its input and output values. Therefore, the installed capacity utilization hour is used to represent the power generation energy of SHP in a region. It not only accurately reflects the characteristics of small hydropower plants without regulation ability but also smooths short-term fluctuations in the power generation curve. The installed capacity utilization hour is

$$H_t = \frac{E_t}{P_t}, \tag{2}$$

where $H_t$ is the installed capacity utilization hour in the region at day $t$; $E_t$ is the power generation energy in the region at day $t$; and $P_t$ is the installed capacity of all small hydropower plants in the region at day $t$.
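The effect of dividing energy by installed capacity can be seen in a short sketch. The function name `utilization_hours`, the units (MWh and MW), and the numbers are hypothetical and chosen only to show that a capacity increase does not distort the series.

```python
def utilization_hours(energy_mwh, capacity_mw):
    """Installed capacity utilization hours: daily energy divided by installed capacity."""
    if capacity_mw <= 0:
        raise ValueError("installed capacity must be positive")
    return energy_mwh / capacity_mw


# Hypothetical region: capacity grows from 100 MW to 120 MW between two days.
day_1 = utilization_hours(energy_mwh=1200.0, capacity_mw=100.0)
day_2 = utilization_hours(energy_mwh=1440.0, capacity_mw=120.0)
# Energy rose by 20%, yet both days correspond to 12 utilization hours.
```

The raw energy series would show a jump between the two days, while the utilization-hour series stays flat because the plants ran equally hard on both days.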

3.2. Short-Term Forecasting Model of SHP Using GA-SVM

To apply the SVM model to forecast the STPGE of SHP plants in a region, three vital parameters of the SVM with an RBF kernel must be specified: $C$, $\varepsilon$, and $\sigma$, which respectively denote the positive penalty constant, the width of the insensitive loss function, and the standard deviation (width) of the Gaussian kernel. Different values of $C$, $\varepsilon$, and $\sigma$ can lead to large differences in the forecasting result. These parameters control the complexity of the model and the approximation error, and thus determine the difficulty of training and the forecasting accuracy. To improve the forecasting accuracy, the three parameters must be chosen carefully. In recent years, several methods such as the genetic algorithm [33, 34] and the shuffled complex evolution algorithm [35–37] have been developed for model parameter calibration. In this paper, GA is used to optimize the parameters of the SVM model. This approach requires no a priori knowledge and offers high stability and accuracy. Figure 1 illustrates the flow chart for optimizing the three parameters of the SVM model by GA. The GA seeks a better combination of the three parameters in each iteration so that higher forecasting accuracy is obtained.

In this study, the input and output variables are normalized to the range from 0 to 1 by (3). This limits the error caused by differences in scale and keeps the model data consistent, which improves prediction accuracy. Consider

$$x'_t = \frac{x_t - x_{\min}}{x_{\max} - x_{\min}}, \tag{3}$$

where $x'_t$ is the normalized value at day $t$; $x_t$ is the original value at day $t$; and $x_{\max}$ and $x_{\min}$ are the maximum and minimum of the sample data set, respectively.
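The min-max normalization and its inverse, which is needed to map a normalized model output back to the original scale, can be sketched as below. The function names and the sample values are illustrative assumptions.

```python
def normalize(x, x_min, x_max):
    """Min-max normalization of x to the range [0, 1]."""
    if x_max <= x_min:
        raise ValueError("x_max must be greater than x_min")
    return (x - x_min) / (x_max - x_min)


def denormalize(x_norm, x_min, x_max):
    """Inverse of normalize: map a value in [0, 1] back to the original scale."""
    return x_norm * (x_max - x_min) + x_min


# Hypothetical utilization hours for a few days.
hours = [2.0, 6.0, 10.0]
x_min, x_max = min(hours), max(hours)
scaled = [normalize(h, x_min, x_max) for h in hours]
restored = [denormalize(s, x_min, x_max) for s in scaled]
```

Taking `x_min` and `x_max` from the calibration data only, and reusing them for the validation data, avoids leaking information from the validation period into the model.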

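After training and testing the GA-SVM model, the forecast value of power generation energy is calculated by

$$\hat{E}_t = \left[\hat{x}'_t \left(x_{\max} - x_{\min}\right) + x_{\min}\right] P_t, \tag{4}$$

where $\hat{E}_t$ is the forecast power generation energy at day $t$; $\hat{x}'_t$ is the normalized utilization hour forecast by the model for day $t$; $x_{\max}$ and $x_{\min}$ are the maximum and minimum of the sample data set; and $P_t$ is the installed capacity of all small hydropower plants in the region at day $t$.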

3.3. Model Performance Estimation

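Many goodness-of-fit measures have been applied to evaluate model performance. Appropriate evaluation criteria should be chosen when multiple criteria are used to validate model performance [38]. In this paper, the following two statistical measures, which are commonly used in other studies, are chosen as evaluation criteria for model performance:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}, \qquad \mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \times 100\%, \tag{5}$$

where $n$ is the total number of observed data and $y_t$ and $\hat{y}_t$ are the observed and forecasted values at day $t$, respectively.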

The root mean squared error (RMSE) is a nonnegative value and indicates good performance when it is close to zero. The mean absolute percentage error (MAPE) is a relative index of absolute model error and expresses accuracy as a percentage [39, 40]. The smaller the MAPE value, the better the model performs.
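Both measures are straightforward to compute. The sketch below uses their standard definitions; the function names and the two-day sample are illustrative assumptions rather than data from the study.

```python
import math


def rmse(observed, forecast):
    """Root mean squared error between observed and forecasted values."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - f) ** 2 for o, f in zip(observed, forecast)]
    return math.sqrt(sum(squared) / len(observed))


def mape(observed, forecast):
    """Mean absolute percentage error, in percent; observed values must be nonzero."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    ratios = [abs((o - f) / o) for o, f in zip(observed, forecast)]
    return 100.0 * sum(ratios) / len(observed)


# Hypothetical two-day example: errors of +2 and -2 on observations of 10 and 20.
observed = [10.0, 20.0]
forecast = [12.0, 18.0]
error_rmse = rmse(observed, forecast)   # absolute scale, same unit as the data
error_mape = mape(observed, forecast)   # relative scale, in percent
```

RMSE weights large errors more heavily, whereas MAPE weights errors relative to the observed value, so the two measures can rank models differently.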

4. Numerical Results

4.1. Study Areas and Data

Yunnan province has extremely rich hydropower resources, and its potential capacity ranks third in China. The hydropower resources are distributed very unevenly across regions, mainly in the west and north, followed by the east and south. By the end of October 2012, the number of SHP plants in Yunnan had reached 1587, with 3417 units and 8453.05 MW of installed capacity, which accounts for more than 27% of the hydropower capacity in Yunnan province and 12% of the SHP capacity in China [41]. Two typical counties, Yunlong County in the Dali region and Maguan County in the Wenshan region of Yunnan province, are selected as study areas in this paper. The locations of the two counties are shown in Figure 2.

Yunlong County is located in the west of Yunnan province and has a total area of 4400.95 km². Its annual average temperature and annual average rainfall are 15.9°C and 729.5 mm, respectively. By the end of 2013, there were 10 small hydropower plants with an installed capacity of 111.5 MW. Maguan County is located in the southeast of Yunnan province and has a total area of 2676 km². Its annual average temperature and annual average rainfall are 16.9°C and 1345 mm, respectively. By the end of 2013, there were 22 small hydropower plants with an installed capacity of 213.89 MW.

The data from each of the two counties span 915 days, from May 1, 2011, to October 31, 2013. The 854 days of power generation energy data from May 1, 2011, to August 31, 2013, are used for calibration, and the remaining 61 days from September 1, 2013, to October 31, 2013, are used for validation. The daily statistical parameters of the calibration set, the validation set, and the entire data set for the two counties are shown in Table 1, which reports the mean, standard deviation, skewness coefficient, minimum, and maximum. The table indicates that the range of the calibration data fully covers that of the validation data. In addition, the power generation energy of both counties varies over a wide range and is concentrated in the flood season, when it is much larger than in other seasons. Therefore, the data from September to October, in the flood season, are selected for model testing and the other data for model training. Moreover, the dispatching personnel of the power grid are most concerned about the power generation energy of SHP in the flood season.
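The chronological split can be checked with a few lines of date arithmetic. The dates are those stated above; the function name `chronological_split` is an illustrative assumption.

```python
from datetime import date, timedelta


def chronological_split(start, split, end):
    """Return (calibration_days, validation_days); validation starts at `split`."""
    n_days = (end - start).days + 1  # both end points are included
    days = [start + timedelta(days=i) for i in range(n_days)]
    calibration = [d for d in days if d < split]
    validation = [d for d in days if d >= split]
    return calibration, validation


calibration, validation = chronological_split(
    start=date(2011, 5, 1), split=date(2013, 9, 1), end=date(2013, 10, 31)
)
print(len(calibration), len(validation), len(calibration) + len(validation))
# prints "854 61 915"
```

Splitting by date rather than at random keeps the validation period strictly after the calibration period, which matches how a forecast would be used in operation.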

4.2. Results and Discussion

In this study, the GA is employed as the parameter search scheme. To obtain better SVM parameters, the maximum number of GA iterations is set to 50 and the population size is set to 30, 50, 80, 100, 120, and 150 in turn. A fixed search range is set for each of the three parameters ($C$, $\varepsilon$, and $\sigma$) of the SVM model. The performance statistics of the SVM models are given in Tables 2 and 3 for the two counties.

The results in Table 2 clearly indicate that the SVM model obtained with population size (ii), whose optimal parameters are ($C$, $\varepsilon$, $\sigma$) = (5.5762, 0.2275, 0.0073), can be selected as the forecast model for Yunlong County.

For Maguan County, Table 3 shows that the two statistical measures for population size (i) are clearly better than the others in the calibration stage, while they are only slightly better or worse in the validation stage. The optimal parameters ($C$, $\varepsilon$, $\sigma$) = (2.3792, 0.6749, 0.0058) were therefore selected through comprehensive comparison.

To better assess the performance of the GA-SVM model, the ARMA model was employed for comparison. The basic components of an ARMA model are autoregression (AR) and moving average (MA). To obtain a suitable ARMA model, two integers $p$ and $q$ have to be determined, which are the autoregressive order and the moving-average order of the ARMA model, respectively. In this paper, the AIC (Akaike information criterion) value of the ARMA models is calculated for $p$ and $q$ ranging from 1 to 13.

For Yunlong County, the models ARMA (3, 12), (4, 8), (5, 13), (7, 12), (6, 12), and (8, 8), which have relatively small AIC values, are selected as the candidate models. Table 4 shows the AIC values and the performance of the selected ARMA models. Through comparative analysis, the ARMA (7, 12) model was chosen as the final ARMA model for Yunlong County.

For Maguan County, the models ARMA (1, 2), (2, 1), (2, 2), (2, 3), (2, 4), and (3, 1), which have relatively small AIC values, are selected as the candidate models. Table 5 shows the AIC values and the performance of the selected ARMA models. Through comparative analysis, the ARMA (2, 4) model was chosen as the final ARMA model for Maguan County.
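The order search described above amounts to scoring every ARMA order on a grid and keeping the few with the smallest AIC as candidates. The sketch below shows only that ranking step. The function `rank_orders_by_aic` is hypothetical, and `toy_aic` is a made-up stand-in for the AIC of a fitted model; it is not derived from the data of either county.

```python
from itertools import product


def rank_orders_by_aic(aic_of, max_order=13, top=6):
    """Score every (p, q) with 1 <= p, q <= max_order and keep the `top` lowest AICs."""
    orders = product(range(1, max_order + 1), repeat=2)
    scored = [((p, q), aic_of(p, q)) for p, q in orders]
    return sorted(scored, key=lambda item: item[1])[:top]


def toy_aic(p, q):
    """Stand-in score: a misfit term plus a penalty that grows with model size."""
    return (p - 3) ** 2 + (q - 5) ** 2 + 0.1 * (p + q)


candidates = rank_orders_by_aic(toy_aic)
best_order = candidates[0][0]
```

In practice `aic_of` would fit an ARMA model of the given order to the calibration series and return its AIC, and the final model would then be chosen among the candidates by also comparing forecasting performance, as is done in Tables 4 and 5.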

In this study, the same training and verification sets are used for the two models so that they are compared on the same basis. Meanwhile, to evaluate model performance in forecasting the STPGE of SHP, the time series data are taken from two study sites in different regions, and the two statistical measures are employed to evaluate the model performance.

For Yunlong County, the models’ RMSE and MAPE statistics for the calibration and validation periods are summarized in Table 6, from which the comparison can be made directly. The results reveal that the GA-SVM model outperformed ARMA with respect to both measures in the calibration period. In this stage, the GA-SVM model improved on the ARMA model by about 0.24 in RMSE and 0.41 in MAPE. In the validation period, the GA-SVM model obtains a better RMSE value than ARMA, while the MAPE values of the two models are nearly equal. Figure 3 compares the forecasted and observed power generation energy using the GA-SVM and ARMA models for Yunlong County. It can be seen from the residuals that the GA-SVM model performs better than ARMA. Furthermore, it can be concluded from Table 6 and Figure 3 that the GA-SVM model obtains slightly better forecast precision than ARMA.

For Maguan County, the models’ RMSE and MAPE statistics for the calibration and validation periods are summarized in Table 7. Table 7 demonstrates that the GA-SVM model is clearly superior to ARMA with respect to the two measures in the calibration and validation periods. In the validation period, the GA-SVM model improved on the ARMA model by about 7.89 and 0.41 in RMSE and MAPE values, respectively. For the comparison between the GA-SVM and ARMA models in the validation period, the GA-SVM model obtains a slightly better MAPE value and a worse RMSE value than ARMA. Figure 4 compares the forecasted and observed power generation energy using the GA-SVM and ARMA models for Maguan County. As can be seen from the residuals, the GA-SVM model performs better than ARMA except at a few peaks. Furthermore, it can be concluded from Table 7 and Figure 4 that the GA-SVM model performs better overall than the ARMA model.

5. Conclusion

In the present study, the GA-SVM prediction model, which combines the support vector machine with the genetic algorithm, has been developed for forecasting the short-term power generation energy of small hydropower in a region. Historical observed data from Yunlong County and Maguan County in Yunnan province, China, were employed to investigate the modeling potential of GA-SVM. Data from May 1, 2011, to August 31, 2013, and from September 1, 2013, to October 31, 2013, are used for training and validation, respectively, in short-term power generation energy prediction. Because small hydropower operation data are scarce, SVM is chosen as the forecasting model for its ability to handle small-sample problems. The three parameters of the SVM model are not known a priori and are optimized by GA to obtain appropriate values that improve forecasting accuracy. To better assess the performance of the GA-SVM model, the ARMA model was employed for comparison. The two models were constructed and their performances were compared directly. The results indicate that the GA-SVM model gives slightly better prediction performance than the ARMA model.

Given the limited data available for small hydropower in a region, the GA-SVM model proposed in this paper is an effective method for improving short-term forecasting accuracy. This is useful for fully absorbing small hydropower resources and for avoiding wasted water resources and dumped electricity in the flood season.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National High Technology Research and Development of China 863 Program (2012AA050205) and the Fundamental Research Funds for the Central Universities (DUT13JN05).