Abstract

Power load forecasting plays a considerable role in the management of a power system, as accurate forecasting provides a guarantee for the daily operation of the power grid. It has been widely demonstrated in the forecasting literature that hybrid forecasts can improve forecasting performance compared with individual forecasts. In this paper, a hybrid forecasting approach, comprising Empirical Mode Decomposition (EMD), the CSA (Cuckoo Search Algorithm), and the WNN (Wavelet Neural Network), is proposed. This approach yields a more valid forecasting structure and more stable results than traditional ANN (Artificial Neural Network) models such as BPNN (Back Propagation Neural Network), GABPNN (Back Propagation Neural Network Optimized by Genetic Algorithm), and WNN. To evaluate the forecasting performance of the proposed model, half-hourly power load data from New South Wales, Australia, are used as a case study. The experimental results demonstrate that the proposed hybrid model is not only simple but also able to approximate the actual power load satisfactorily, and it can be an effective tool in planning and dispatch for smart grids.

1. Introduction

In a power system, short-term power load forecasting is very important for the stable operation of the system. Accurate forecasting underpins the development of preventive maintenance plans, which include generator safeguards, power system reliability estimation, and dispatch scheduling [1, 2]. High-accuracy power load forecasts improve the economic and social benefits of power grid management: they reduce generation costs, improve the security of power systems, and help administrators develop optimal plans. Moreover, accurate load forecasting is crucial for forecasting the power price in power markets [3]. Therefore, developing power load forecasting techniques that deliver accurate, simple, and fast load forecasts is necessary. Thus far, many short-term power load forecasting methods have been proposed, and they can be divided into three main categories: conventional methods, modern forecasting methods, and hybrid forecasting methods. Conventional methods include multiple linear regression analysis [4, 5], time series models [6, 7], state space models [8], general exponential smoothing [9], and knowledge-based methods. However, these methods cannot provide appropriate nonlinear mathematical relationships to express actual power loads. The primary modern forecasting methods are intelligent evolutionary algorithms [10, 11], expert systems [12, 13], neural networks [14–17], and fuzzy inference [18]. Intelligent algorithms and neural networks achieve good performance because of their clear patterns, easy implementation, and strong problem-solving ability. Hybrid forecasting methods, proposed to avoid the shortcomings of individual forecasting methods, have become increasingly prevalent [19, 20]. A detailed introduction to the three categories is given below.

The deduction processes of traditional forecasting methods are rigorous, and most of them are based on classical mathematical theories such as statistics, calculus, and modeling through subjective data analysis [21]. The main idea of trend extrapolation is to identify the trend of data changes and then forecast future data according to the trend equation. The method is simple and, especially for smooth power load changes, it can achieve a good prediction effect; its deficiency is that its precision is greatly influenced by the random load component [22]. The regression analysis method is often applied to short-term load forecasting [23]. It has advantages such as a simple principle, and better data quality leads to better precision; however, selecting the main factors affecting the power load is difficult because many factors that affect forecasting accuracy are hard to quantify. Moreover, the model lacks self-learning capability, and the input and output variables cannot be revised automatically [24]. After years of development, the time series forecasting method has become a mature theory and has been applied to power load forecasting [25]. The basic time series prediction models mainly include AR, MA, and ARMA [26]. Although the time series method has advantages such as requiring only a small volume of historical data, a small amount of calculation, and fast computation, it has certain limitations: it cannot reflect the influence of meteorological factors, and its forecasting accuracy decreases as the prediction horizon increases [27]. The ANN [17] is a nonlinear simulation of the human brain's information processing; it adapts well to inexact variation trends, is able to extract information and keep on learning, and has good knowledge reasoning and self-optimization capabilities [28]. An expert system is a knowledge-based computer program whose main components include the inference engine, the expert knowledge base, the explanation interface, and the knowledge acquisition module; it makes decisions by reasoning over the stored knowledge, but this method is limited by whether the expert knowledge is complete [29]. The grey forecasting method is an important technique in grey theory; it uses approximate differential equations to describe the future tendency of a time series [30]. Its limitation is that the greater the dispersion of the data, the worse the forecasting accuracy. Although traditional forecasting methods and forecasting methods based on intelligent computing each have their applications, it is difficult to achieve better results when using any one of them by itself [31]. In the forecasting literature [31–34], the results of any single forecasting model are not entirely satisfactory, primarily because single models cannot capture the complicated factors encountered in reality.

Due to the limitations of its forecasting capacity, a single model cannot always be optimal in all cases. In this paper, a novel hybrid model is developed with the aim of obtaining more accurate power load forecasting results. The proposed hybrid power load forecasting model proceeds as follows. First, the empirical mode decomposition technique, a nonstationary data analysis method, is used to decompose and reconstruct the original power load series. Second, a WNN model is employed to produce the power load forecasts, with the parameters of the WNN tuned by the CSA. The simulation results illustrate that the hybrid model is an effective method for power load forecasting. The main contributions of this paper are summarized as follows:
(1) The CSA is applied to choose the optimal initial weights of the WNN model, whose random initialization otherwise leads to unstable forecasting errors.
(2) In the field of power load forecasting, the proposed hybrid model proves to be a valid method with efficient computation and satisfactory forecasting accuracy.
(3) Considering the skewness and kurtosis of the forecasting accuracy distribution, forecasting availability is developed as an effective evaluation criterion for model selection in the power load forecasting field.

This paper is organized as follows. First, we outline the concept of models used in this paper, including empirical model decomposition, WNN, CSA, BPNN, GABPNN, and EMD-CSAWNN. Second, the modeling processes of the methods mentioned above are introduced. Simulation results are presented and analyzed. Finally, the overall conclusion is included.

2. Methodology

In this section, the required individual tools will be presented concisely, including the empirical model decomposition technique, BPNN, the WNN model, and the CSA and GA algorithms. Moreover, the proposed hybrid approach will be described in detail. In addition, the structure of the feed-forward neural network will be confirmed.

2.1. Empirical Mode Decomposition

Empirical mode decomposition (EMD) is an adaptive method for nonstationary time series analysis and is now widely used; it can be applied to any type of signal decomposition [35]. Thus, it has obvious advantages in processing nonstationary and nonlinear series. The foundation of this technique is to decompose a time series into a finite set of intrinsic mode functions (IMFs) and a residue [36].

Definition 1. An IMF is defined to satisfy the following conditions: (1) the number of local extrema and the number of zero crossings must be equal or differ by at most one; (2) at every point, the mean value of the upper envelope (defined by the local maxima) and the lower envelope (defined by the local minima) is zero.

Definition 2. The stoppage criterion for the sifting process is defined as $$SD_k = \sum_{t=0}^{T} \frac{\left|h_{k-1}(t) - h_k(t)\right|^2}{h_{k-1}^2(t)},$$ where $h_k(t)$ denotes the candidate IMF after the $k$th sifting iteration. The sifting process stops when $SD_k$ is smaller than a pregiven value, typically between 0.2 and 0.3. Additional details of the empirical mode decomposition technique are illustrated in Figure 1.
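
To make the decomposition procedure concrete, the following minimal Python/NumPy sketch implements one round of sifting and the outer IMF-extraction loop according to the conditions of Definition 1 and the SD stoppage criterion of Definition 2. The cubic-spline envelopes (via SciPy), the threshold value 0.25, and the safeguard limits are illustrative assumptions, not the implementation used in the experiments.

import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_imf(x, sd_threshold=0.25, max_sift=50):
    # Extract one IMF from x by repeated sifting; stop when the SD criterion is met.
    h = np.asarray(x, dtype=float).copy()
    t = np.arange(len(h))
    for _ in range(max_sift):
        maxima = argrelextrema(h, np.greater)[0]
        minima = argrelextrema(h, np.less)[0]
        if len(maxima) < 4 or len(minima) < 4:          # too few extrema to build envelopes
            break
        upper = CubicSpline(maxima, h[maxima])(t)       # upper envelope (Definition 1)
        lower = CubicSpline(minima, h[minima])(t)       # lower envelope
        h_new = h - (upper + lower) / 2.0               # subtract the local envelope mean
        sd = np.sum((h - h_new) ** 2 / (h ** 2 + 1e-12))  # SD stoppage criterion (Definition 2)
        h = h_new
        if sd < sd_threshold:
            break
    return h

def emd(x, max_imfs=10):
    # Decompose x into a list of IMFs plus a residue.
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        n_ext = len(argrelextrema(residue, np.greater)[0]) + len(argrelextrema(residue, np.less)[0])
        if n_ext < 8:                                   # too few extrema left to sift another IMF
            break
        imf = sift_imf(residue)
        imfs.append(imf)
        residue = residue - imf
    return imfs, residue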

2.2. Artificial Neural Network (ANN)
2.2.1. Confirmation of the Structure of the Network

The ANN has received considerable attention as a powerful computational tool for forecasting in many fields since the 1980s. ANN models often outperform statistical models because of their ability to map inputs onto outputs via simple computations [37]. We discuss the feed-forward neural network in this paper because of its strong learning ability and simple structure. The determination of the network structure is as follows [38].

Definition 3. Given an arbitrary continuous function $f$, $f$ can be accurately approximated by a three-layer feed-forward neural network. The first layer of the network is the input layer, containing $n$ neurons; the middle layer is the hidden layer, containing $q$ neurons; and the third layer is the output layer, containing $m$ neurons.

Definition 4. Let $\sigma$ be a bounded, continuous, monotone function, let $K \subset \mathbb{R}^n$ be a compact subset, and let $f$ be a real continuous function on $K$. Then, for any $\varepsilon > 0$, there exist an integer $q$, real constants $c_i$ and $\theta_i$ ($i = 1, 2, \dots, q$), and vectors $w_i \in \mathbb{R}^n$ such that $$N(x) = \sum_{i=1}^{q} c_i\,\sigma\!\left(w_i^{T} x + \theta_i\right)$$ satisfies $\max_{x \in K} \left|N(x) - f(x)\right| < \varepsilon$. Note that $N(x)$ is exactly a three-layer network structure in which the output function of the hidden layer is $\sigma$ and the output functions of the input and output layers are linear.

Proof (sketch). Because $f$ is continuous on the compact set $K$, it can be extended to a bounded continuous function on $\mathbb{R}^n$. By the Paley-Wiener theorem [39], a suitably band-limited approximation of $f$ has a real analytic Fourier transform, and $f$ can be represented, up to an arbitrarily small uniform error on $K$, by an integral of dilated and translated copies of $\sigma$; this integral converges uniformly. Because the integrand is uniformly continuous on the bounded domain of integration, the integral can in turn be approximated uniformly on $K$ by a finite Riemann sum, with an additional error that can be made arbitrarily small. Each term of the resulting finite sum has the form $c_i\,\sigma(w_i^{T}x + \theta_i)$, so the sum is exactly the output of a three-layer network with a linear output layer. Hence $f$ can be uniformly approximated on $K$ by such a network.

Definition 5. For any $f \in L^2(K)$ and any $\varepsilon > 0$, there exists a three-layer network structure that can approximate $f$ within mean-square error $\varepsilon$.

The results above prove that, for any such target function, a feed-forward neural network with a three-layer structure can approximate it accurately. Thus, this part not only proves the existence of the mapping network but also determines the structure of the mapping network. In summary, this paper adopts the three-layer neural network as the basic network structure.
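
As a concrete illustration of the three-layer structure adopted above, the short Python sketch below evaluates such a network with a linear input layer, a sigmoid hidden layer, and a linear output layer; the 4-9-1 shape matches the configuration used later in the experiments, while the sigmoid activation and the random weights are illustrative assumptions.

import numpy as np

def three_layer_forward(x, W1, b1, W2, b2):
    # Linear input layer, sigmoid hidden layer, linear output layer (cf. Definition 4).
    hidden = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))   # sigma(w_i^T x + theta_i)
    return W2 @ hidden + b2                          # linear combination of hidden outputs

# Example with the 4-9-1 structure used later in the experiments:
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((9, 4)), rng.standard_normal(9)
W2, b2 = rng.standard_normal((1, 9)), rng.standard_normal(1)
y = three_layer_forward(rng.standard_normal(4), W1, b1, W2, b2)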

2.2.2. BPNN

BPNN is a type of multilayer feed-forward neural network with an error back propagation learning process. The structure of BPNN is illustrated in Figure 2. Details of BPNN are introduced in [40].

2.2.3. WNN

WNN, a feed-forward network, is generally multilayer [41]. It is widely applied in signal processing because of its advantages of the localization property and generalization ability [42]. The structure of WNN is shown in Figure 2.
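
The following sketch illustrates how a WNN hidden node differs from an ordinary sigmoid node: each hidden unit applies a dilated and translated mother wavelet to its weighted input, and the output layer combines the wavelet responses linearly. The Morlet wavelet and the specific parameter shapes are assumptions made for illustration, since the paper does not specify the mother wavelet.

import numpy as np

def morlet(t):
    # Morlet mother wavelet, a common (assumed) choice for WNN hidden nodes.
    return np.cos(1.75 * t) * np.exp(-t ** 2 / 2.0)

def wnn_forward(x, W_in, a, b, w_out):
    # Each hidden node applies a dilated (a) and translated (b) wavelet to its
    # weighted input; the output layer combines the wavelet responses linearly.
    net = W_in @ x
    hidden = morlet((net - b) / a)
    return w_out @ hidden

# 4 inputs, 9 wavelet nodes, 1 output (matching the 4-9-1 structure used later):
rng = np.random.default_rng(1)
W_in = rng.standard_normal((9, 4))
a, b = np.ones(9), np.zeros(9)
w_out = rng.standard_normal((1, 9))
y = wnn_forward(rng.standard_normal(4), W_in, a, b, w_out)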

2.3. Neural Network Optimized by an Intelligence Algorithm

The intelligent optimization algorithm provides an efficient and powerful mathematical tool for optimizing the initial weights and thresholds of the ANN [43].

2.3.1. CSA

Cuckoo search is a heuristic swarm intelligence algorithm inspired by the obligate brood parasitism of cuckoo species [44]. CSA is utilized in this paper for its strong capability of global optimization [43].

Definition 6. To simulate the breeding behavior of the cuckoo, three idealized assumptions are made: (1) each cuckoo selects a nest at random and lays only one egg at a time; (2) the eggs (nests) of the highest quality are carried over to the next generation; (3) the number of available nests is fixed, and the probability that the host bird discovers the exotic egg is $p_a$.

Definition 7. The Lévy flight model simulates the nest-seeking behavior of the cuckoo, and the path and location are updated as $$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus \mathrm{L\acute{e}vy}(\lambda),$$ where $x_i^{(t)}$ represents the location of nest $i$ at generation $t$, $\alpha$ is the step-length control factor, $\oplus$ denotes point-to-point multiplication, and $\mathrm{L\acute{e}vy}(\lambda)$ obeys a Lévy distribution whose parameter $\lambda$ governs the random search path. The Lévy step is generated as $$\mathrm{L\acute{e}vy}(\lambda) \sim \frac{u}{|v|^{1/\beta}}\left(x_i^{(t)} - x_{\mathrm{best}}^{(t)}\right), \qquad u \sim N\!\left(0, \sigma_u^2\right), \quad v \sim N(0, 1),$$ $$\sigma_u = \left\{\frac{\Gamma(1+\beta)\sin(\pi\beta/2)}{\Gamma\!\left[(1+\beta)/2\right]\beta\,2^{(\beta-1)/2}}\right\}^{1/\beta},$$ where $u$ and $v$ obey normal distributions, $x_{\mathrm{best}}^{(t)}$ denotes the location of the best nest at generation $t$, and $\Gamma$ is the standard gamma function; the resulting heavy-tailed step distribution has unbounded variance and mean.
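
A minimal sketch of the Lévy flight move of Definition 7 is given below, using Mantegna's algorithm to draw the Lévy-distributed step length; the values alpha = 0.01 and beta = 1.5 are commonly used defaults assumed here for illustration, not parameters reported in the paper.

import numpy as np
from math import gamma, sin, pi

def levy_step(beta=1.5, size=1, rng=None):
    # Mantegna's algorithm for Levy-distributed step lengths.
    rng = rng or np.random.default_rng()
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2)
               / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_update(nest, best_nest, alpha=0.01, beta=1.5, rng=None):
    # One Levy-flight move of a nest, scaled by its distance to the current best nest.
    rng = rng or np.random.default_rng()
    step = levy_step(beta, nest.size, rng) * (nest - best_nest)
    return nest + alpha * step

# Usage: move a 5-dimensional candidate solution toward/around the best one found so far.
rng = np.random.default_rng(2)
nest, best = rng.uniform(-1, 1, 5), rng.uniform(-1, 1, 5)
new_nest = cuckoo_update(nest, best, rng=rng)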

2.3.2. GA

GA is a population-based optimization algorithm that simulates natural genetic mechanisms and biological evolution. It possesses a powerful global optimization capacity [45]. The principle of GA relies on a random process constituted by the selection, crossover, and mutation operations [46]. The implementation process is shown in Figure 2.

2.4. GABPNN

The primary mechanism of GABPNN is composed of three parts: GA optimization, determination of the BPNN structure, and forecasting covered by BPNN [47]. The pseudo-code for GABPNN is as shown in Algorithm 1.

GENERATE the initial population  /* Calculate the fitness of each individual in the population. */
DO WHILE the maximum number of generations is not reached
   Record the best fitness value and the average fitness value
   FOR EACH individual DO  /* The selection operation. */
      Select individuals according to their fitness
      FOR EACH selected pair DO  /* The crossover and mutation operations. */
         Apply crossover and mutation to produce offspring
         Evaluate the fitness of the offspring
END FOR; END WHILE
Initialize the weights and thresholds of BPNN with the best individual obtained
Normalize the training data
/* Adjust the weights and thresholds of BPNN according to the forecast error. */
DO WHILE the training accuracy requirement is not met
   FOR EACH training sample DO
      FOR EACH hidden-layer node DO
         Calculate the outputs of the hidden layer
         FOR EACH output-layer node DO
            Calculate the outputs of the output layer
            Calculate the forecast error in the output layer
            Update the connection weights
            Update the thresholds
END FOR; END WHILE
RETURN the forecasts of the trained network
2.5. EMD-CSAWNN

In this paper, the proposed model, which incorporates the empirical mode decomposition technique into the WNN model optimized by CSA, is adopted for short-term power load forecasting. Empirical mode decomposition is a self-adaptive technique that decomposes the short-term power load series into several IMFs and one residual item. WNN is adopted as the forecasting engine in the proposed approach because of its powerful approximation ability and high computation speed. Additionally, to avoid deficiencies of WNN such as its unstable structure, CSA is used to initialize and determine the weights and thresholds of WNN, thereby improving its generalization capability. Figure 3 illustrates the general structure of the hybrid power load forecasting method. The pseudo-code of the EMD-CSAWNN model is shown in Algorithm 2.

/* Initialize the parameters of EMD, CSA, and WNN. */
WHILE the residue is not a monotonic function DO
   WHILE the SD stoppage criterion is not satisfied DO  /* Find all local maxima and minima of the series by cubic spline interpolation. */
      Produce the upper and lower envelopes
      Subtract the mean of the two envelopes from the series and compute SD
   END WHILE
   Save the extracted IMF and subtract it from the residue
END WHILE
Reconstruct the input series from the retained IMFs and the residue
GENERATE the initial population of nests
DO WHILE the maximum number of iterations is not reached
   Generate a cuckoo egg by taking a Lévy flight from a random nest
   FOR EACH nest DO  /* Calculate the fitness value and select the candidate. */
      IF the new solution has better fitness DO  /* Replace the worse location of the nest by the better one. */
         Replace the nest with the new solution
      ELSE keep the current nest
      Generate a uniform random number rand in [0, 1]
      IF rand < p_a DO  /* Discard the worst solution with probability p_a and produce a new solution. */
         Abandon the worst nest and build a new one at a random location
      ELSE keep the nest
END IF; END FOR; END WHILE
Initialize the weights and thresholds of WNN with the best individual obtained
Normalize the training data
/* Adjust the weights and thresholds of WNN according to the forecast error. */
DO WHILE the training accuracy requirement is not met
   FOR EACH training sample DO
      FOR EACH hidden-layer node DO
         Calculate the outputs of the hidden layer
         FOR EACH output-layer node DO
            Calculate the outputs of the output layer
            Calculate the forecast error in the output layer
            Update the connection weights
            Update the thresholds
END FOR; END WHILE
RETURN the forecasts of the trained network

3. Experiments and Evaluations

Applications of the proposed hybrid approach and five comparison models are shown in this section. All algorithms are run on the same platform: 3.20 GHz CPU, 8.00 GB RAM, Windows 7, and MATLAB R2012a. Meanwhile, to account for random factors and to ensure that the final results are reliable and independent of the initial weights, we carry out each ANN experiment 50 times and take the average value.

3.1. Region Description and Data Collection

Australia has plentiful coal and wind resources along its coastline and across its land. A power load data set from NSW, the state with the largest population and the highest levels of industrialization and urbanization in Australia, is employed to validate each model. In this paper, power load data are collected from NSW from January 12, 2009, to March 8, 2009, covering eight weeks. The data from January 12, 2009, to February 1, 2009, are used as the training set to build the models, and the data from February 2, 2009, to March 1, 2009, are used as the testing set. The power load data from January 12, 2009, to February 1, 2009, together with their statistical measures (minimum, maximum, mean, and standard deviation), are shown in Figure 4(c). The standard deviations are all above 1800, which implies that the power load series fluctuates significantly; the minimum/maximum values of week one, week two, and week three are 6049.94/13518.06 MWh, 6280.27/13326.47 MWh, and 6375.84/13096.37 MWh, respectively. This can also be observed intuitively from the amplitude and frequency of the series fluctuation, which can change from very high to low values and vice versa.

Figure 4(b) exhibits 1008 power load data points from January 12, 2009, to February 1, 2009, divided into three groups, with 336 data points in every group. Because the power load data of NSW are collected once every half hour, each day includes 48 data points. On different days of the week, daily life and human economic production usually have different behaviors; thus, the characteristics of the load are different on different days. To minimize forecasting error as much as possible, we forecast the load of different days in the week separately. In this paper, the cycles of data division are seven days; the first three weeks of Monday data, January 12, January 19, and January 26, are employed to forecast the next Monday load on February 2, 2009. Accordingly, the data on January 13, January 20, and January 27 are employed to forecast the power load on February 3, 2009. The rest can be conducted in the same manner. The structures of the training and testing sets are illustrated in Figure 3(b).

3.2. Evaluation Metrics

Forecasting accuracy is an important criterion for evaluating a forecasting model. The basic error used in this paper is the absolute error at time $t$, $$AE_t = \left|y_t - \hat{y}_t\right|, \quad t = 1, 2, \dots, n,$$ where $n$ is the number of data points, $y_t$ is the observed value, and $\hat{y}_t$ is the forecast value at time $t$. To avoid positive and negative forecasting errors offsetting each other, the absolute values of the errors are averaged, which gives the mean absolute error, a comprehensive index in error analysis: $$MAE = \frac{1}{n}\sum_{t=1}^{n} \left|y_t - \hat{y}_t\right|.$$ RMSE is the square root of the mean square error and is also a comprehensive index in error analysis. Squaring the absolute error strengthens the role of large errors and thus improves the sensitivity of the indicator, which is the prevailing reason for its use: $$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n} \left(y_t - \hat{y}_t\right)^2},$$ where the symbols are as above. MAPE is the average of the absolute percentage error; it is one of the comprehensive indexes in error analysis and usually occupies a very important position when analyzing the forecasting performance of a model: $$MAPE = \frac{1}{n}\sum_{t=1}^{n} \left|\frac{y_t - \hat{y}_t}{y_t}\right| \times 100\%.$$
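
The three error metrics defined above can be computed directly, as in the short sketch below (a straightforward illustration with hypothetical data, not the authors' code).

import numpy as np

def mae(y, y_hat):
    # Mean absolute error.
    return np.mean(np.abs(np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)))

def rmse(y, y_hat):
    # Root mean square error.
    return np.sqrt(np.mean((np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)) ** 2))

def mape(y, y_hat):
    # Mean absolute percentage error, in percent.
    y = np.asarray(y, dtype=float)
    return np.mean(np.abs((y - np.asarray(y_hat, dtype=float)) / y)) * 100.0

# Toy usage on a few half-hourly loads (MWh):
observed = [8000.0, 8200.0, 8500.0]
forecast = [7950.0, 8300.0, 8450.0]
print(mae(observed, forecast), rmse(observed, forecast), mape(observed, forecast))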

To determine the degree of correlation between the results of the different forecasting models and the observed values, grey relational analysis (GRA) [48] is employed in this paper.

Definition 8 (determining the grey relational coefficient). Let $x_0 = (x_0(1), x_0(2), \dots, x_0(n))$ be the reference sequence and $x_i = (x_i(1), x_i(2), \dots, x_i(n))$, $i = 1, 2, \dots, m$, be the comparative sequences. The relational coefficient of $x_0$ and $x_i$ at point $k$ is represented as $$\xi_i(k) = \frac{\min_i \min_k \left|x_0(k) - x_i(k)\right| + \rho \max_i \max_k \left|x_0(k) - x_i(k)\right|}{\left|x_0(k) - x_i(k)\right| + \rho \max_i \max_k \left|x_0(k) - x_i(k)\right|}.$$ Here, $\rho$ is the distinguishing coefficient, $\rho \in (0, 1)$, and usually $\rho = 0.5$.

Definition 9 (grey relational degree). By aggregating the relational coefficients over all points, the grey relational degree is computed as $$r_i = \frac{1}{n}\sum_{k=1}^{n} \xi_i(k).$$ Considering the generalization capacity of the proposed hybrid model, four statistical indices are employed as evaluation metrics to measure the forecasting accuracy: MAE, RMSE, MAPE, and GRA. MAPE, MAE, and RMSE measure the mean performance, and GRA illustrates how well the forecasted data points fit the trend.
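
For a single comparative sequence, the grey relational coefficient and degree of Definitions 8 and 9 reduce to the sketch below, in which the min/max are taken over time only; rho = 0.5 follows the usual choice mentioned above, and the example sequences are hypothetical.

import numpy as np

def grey_relational_degree(reference, comparative, rho=0.5):
    # Grey relational coefficient and degree for one comparative sequence;
    # the min/max are taken over time only (single-sequence simplification).
    diff = np.abs(np.asarray(reference, dtype=float) - np.asarray(comparative, dtype=float))
    coeff = (diff.min() + rho * diff.max()) / (diff + rho * diff.max())
    return coeff.mean()

# Usage: compare the forecasts of two hypothetical models against the observed series.
observed = [8000.0, 8200.0, 8500.0, 8300.0]
model_a = [7950.0, 8250.0, 8450.0, 8350.0]
model_b = [7800.0, 8400.0, 8600.0, 8100.0]
print(grey_relational_degree(observed, model_a), grey_relational_degree(observed, model_b))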

4. Results and Analysis

The experiments were divided into five parts: Experiment 1, Experiment 2, Experiment 3, Experiment 4, and Experiment 5. Experiment 1, the primary experiment, aims to compare the performance of the different models on one day (February 2, 2009) and, meanwhile, to compare the ARIMA and ANN models. Experiment 2 aims to compare the performance of the different models between weekdays and weekends. Experiment 3 aims to compare the performance of the different models over 28 days (from February 2, 2009, to March 1, 2009). To further validate the proposed hybrid model, Experiments 4 and 5 are also studied in this paper, and ENN (Elman neural network) and RBFNN (radial basis function neural network) are included for further comparison. Experiment 4 aims at globally testing the proposed hybrid model by using real power load data on Thursday, April 24, 2008, and Tuesday, April 29, 2008, selected randomly from New South Wales, Australia. Experiment 5 aims to demonstrate the general applicability of the hybrid model; thus, the power load data on Saturday, June 28, 2008, and Monday, June 30, 2008, from Victoria, Australia, are chosen for forecasting.

Experiment 1. According to the determination of the network structure, a three-layer neural network achieves better forecasting results. The network structure of the ANN models is 4-9-1, with four nodes in the input layer, nine in the hidden layer, and one node in the output layer. The number of iterations is set to 100, the learning rate is set to 0.01, and the training accuracy requirement is set to 0.0004. To forecast the power load on Monday, February 2, the historical values from the Mondays of the first three weeks, January 12, January 19, and January 26, are chosen. Test results (MAE, RMSE, MAPE, and GRA) are presented in Tables 1 and 2 and Figures 5, 6, and 7. Each individual model exhibits its best performance at particular times. For example, Figure 6 shows that BPNN provides the lowest MAPE value at 3:00 among all of the individual models, and CSAWNN yields the most accurate forecasts from 1:00 to 3:00 among all of the individual models, whereas the maximum error on February 2 is produced by the BPNN forecasting model, with a MAPE value of 10.12%. This result is due to the unstable initialization of the ANN. The empirical mode decomposition of the original data is shown in Figure 3(a). The noise in the data is eliminated by the empirical mode decomposition technique; in this paper, IMF1 is a high-frequency sequence with small values, which can be regarded as an interference component. As a result, the remaining IMFs and the residual term are reconstructed as the training input of the CSAWNN model. Table 1 shows the experimental results for Monday for the six forecasting models. The average MAPE values of the six models on February 2 are 1.6%, 1.54%, 1.97%, 1.37%, 1.02%, and 1.94%, respectively; as shown in Table 2, the MAPE afforded by the hybrid model decreased by 48.22% compared with the maximum average MAPE value. In addition, the MAPE values offered by the hybrid model are more stable than those of the other models, with a maximum MAPE value of 2.54%. Comparing the hybrid model with the other models shows that the hybrid model can provide high and stable forecasting accuracy.

Experiment 2. Figure 8 and Table 3 compare the six models on weekdays and weekends under different evaluation metrics. On weekdays, the best-performing model is the hybrid model, with a MAPE value of 0.82%; by contrast, the worst is the ARIMA model, with a MAPE value of 1.48%. The MAPE offered by the hybrid model is 44.59% lower than that offered by the ARIMA model. On weekends, the MAPE value offered by the hybrid model is 0.68%, which outperforms all of the other models and is 49.25% lower than that of the worst model, ARIMA. Figure 8 indicates that weekdays outperform weekends only for the RMSE criterion when utilizing WNN and for the GRA criterion when utilizing the BPNN, GABPNN, WNN, and CSAWNN models; in all other cases, the performance on weekends is better than that on weekdays for the six models. The MAPE values offered by the six models on weekends are 1.23%, 1.05%, 1.27%, 1.13%, 0.68%, and 1.34%, which are 5.38%, 10.25%, 0.78%, 8.13%, 17.07%, and 9.46% lower than the corresponding weekday values, respectively. This illustrates that the forecasting results on weekends outperform those on weekdays.

Experiment 3. Figures 9 and 10 illustrate the forecasting results of Experiment 3. In comparison with the BPNN, WNN, GABPNN, CSAWNN, and ARIMA models, the forecasting results offered by the hybrid model are more accurate. The forecasting results are shown in Figure 10, and the detailed prediction results are shown in Table 4. The average MAPE values offered by the six models over the 28 days are 1.28%, 1.14%, 1.27%, 1.2%, 0.78%, and 1.44%, respectively, and the maximum MAPE value offered by the ARIMA model over the 28 days is 4.4%. Meanwhile, on all test days, the average values of MAE, MAPE, and RMSE offered by the hybrid model are all smaller than those of the other models. The average MAPE value offered by the hybrid model over the four weeks is 0.78%, a decrease of up to 39.06% compared with the other ANN models. This indicates that the hybrid model is an effective power load forecasting approach. The GRA results are shown in Table 5. On March 1, the GRA of GABPNN is higher than that of the hybrid model; on the remaining 27 days, the GRA values offered by the hybrid model are higher than those of the other five models. According to the average GRA value over the 28 days, the forecasting performance of the six models improves in the following order: ARIMA, BPNN, WNN, CSAWNN, GABPNN, and EMD-CSAWNN, which indicates that the hybrid forecasting model is the best of the six.

The higher the power load forecasting accuracy, the lower the economic cost, which has actual economic significance [49]. As is illustrated in this case, the ANN optimized by the intelligence algorithm after denoising provides a better power load forecasting effect.

Experiment 4. The power load on Thursday, April 24, 2008, and Tuesday, April 29, 2008, from New South Wales, Australia, is used to globally test the proposed hybrid model. The results are shown in Tables 6 and 7. To forecast the power load on Thursday, April 24, 2008, the historical values from the Thursdays of the first three weeks, April 3, April 10, and April 17, are chosen. To forecast the power load on Tuesday, April 29, 2008, the historical values from the Tuesdays of the first three weeks, April 8, April 15, and April 22, are chosen. Test results (AE, MAE, RMSE, and MAPE) are presented in Tables 6 and 7 and part (a) of Figure 11. Table 6 indicates that EMD-CSAWNN produces the most accurate forecasting results on Thursday, April 24, 2008; its maximum, minimum, and average MAPE values are 1.2355% at 2:00, 1.1836% at 14:00, and 1.2025%, respectively. The second- to eighth-most accurate models are GABPNN, BPNN, RBFNN, CSAWNN, WNN, ENN, and ARIMA, with average MAPE values of 1.6436%, 1.7795%, 1.8314%, 2.3373%, 2.4141%, 3.1381%, and 5.5872%, respectively. Table 7 indicates that EMD-CSAWNN still yields the most accurate forecasts among all of the models considered in this paper when forecasting the power load on Tuesday, April 29, 2008; its maximum, minimum, and average MAPE values are 1.9844% at 14:00, 1.6033% at 12:00, and 1.7811%, respectively. According to the average MAPE value, CSAWNN is the second, GABPNN the third, RBFNN the fourth, WNN the fifth, BPNN the sixth, ENN the seventh, and ARIMA the eighth most accurate model, with average MAPE values of 2.5827%, 2.8472%, 2.8473%, 3.1717%, 3.2725%, 3.5151%, and 3.7549%, respectively. As shown in Table 6, the average MAPE afforded by the hybrid model decreased by 78.48% compared with the maximum average MAPE value; in Table 7, the decrease is 52.57%. In addition, the MAPE values offered by the hybrid model are more stable than those of the other models. The comparison with GABPNN, BPNN, CSAWNN, WNN, ARIMA, ENN, and RBFNN shows that the hybrid EMD-CSAWNN model is better than the single models.

Experiment 5. The power load on Saturday, June 28, 2008, and Monday, June 30, 2008, from Victoria, Australia, is used to further demonstrate that the proposed hybrid model can improve power load forecasting performance in different cases. The historical values from the Saturdays of the first three weeks, June 7, June 14, and June 21, are chosen to forecast the power load on Saturday, June 28, 2008. Likewise, the historical values from the Mondays of the first three weeks, June 9, June 16, and June 23, are chosen to forecast the power load on Monday, June 30, 2008. The experimental results are presented in Tables 8 and 9 and part (b) of Figure 11. For the Saturday (June 28, 2008) data, the average MAPE values of the BPNN, GABPNN, WNN, CSAWNN, ARIMA, ENN, RBFNN, and EMD-CSAWNN models are 3.3224%, 2.8656%, 3.2301%, 2.5248%, 3.7716%, 3.3890%, 3.1754%, and 1.7699%, respectively. The maximum MAPE values of the eight models are 3.5467% at 16:00, 3.3061% at 18:00, 3.5751% at 4:00, 2.9527% at 16:00, 4.2174% at 0:00, 3.7591% at 20:00, 3.3889% at 10:00, and 2.0503% at 14:00, respectively, and the minimum MAPE values are 2.9644% at 20:00, 2.2710% at 12:00, 2.7591% at 20:00, 2.1818% at 20:00, 3.4718% at 4:00, 3.1379% at 18:00, 2.8405% at 22:00, and 1.5798% at 10:00, respectively. The differences between the maximum and minimum MAPE values of the models are 0.5823%, 1.0351%, 0.816%, 0.1936%, 0.7456%, 0.6212%, 0.5484%, and 0.4705%. For the Monday (June 30, 2008) data, the average MAPE values of the BPNN, GABPNN, WNN, CSAWNN, ARIMA, ENN, RBFNN, and EMD-CSAWNN models are 3.2747%, 2.8415%, 3.0806%, 2.5422%, 3.7656%, 3.4113%, 3.1982%, and 1.7312%, respectively. The maximum MAPE values of the eight models are 3.9915% at 22:00, 3.4117% at 2:00, 3.6561% at 6:00, 2.8144% at 0:00, 4.1783% at 22:00, 3.6738% at 8:00, 3.5262% at 0:00, and 2.0361% at 12:00, respectively, and the minimum MAPE values are 3.0113% at 20:00, 2.3376% at 22:00, 2.4491% at 16:00, 2.1398% at 20:00, 3.2795% at 0:00, 3.2235% at 4:00, 2.9466% at 22:00, and 1.4521% at 10:00, respectively. The differences between the maximum and minimum MAPE values of the models are 0.9802%, 1.0741%, 1.207%, 0.6746%, 0.8988%, 0.4503%, 0.5796%, and 0.584%. Therefore, the proposed hybrid EMD-CSAWNN model is not only the most accurate but also one of the most stable of the investigated forecasting models.

5. Discussion

In this section, we discuss two important evaluation metrics, convergence speed and degree of certainty [50], offered by the GABPNN and CSAWNN models to determine a more practical forecasting model by considering reality factors such as forecasting stability and calculation time. The results illustrate that the CSAWNN model is more practical than the GABPNN model in forecasting power load. In addition, we propose forecasting availability to analyze and evaluate the quality of power load forecasting.

5.1. Convergence Speed

The computational complexity of evolutionary algorithms and swarm intelligence remains a challenging issue; here, we use convergence speed as one of the evaluation metrics to examine the performance of GABPNN and CSAWNN. We obtain the computation time to reach the best fitness by analyzing the convergence speed of GA and CSA, which allows a comparative evaluation of optimization algorithm performance. However, exploration and exploitation are always two competing goals, and a conflict exists between convergence speed and forecasting accuracy. We define a fitness value smaller than $10^{-5}$ as the convergence criterion.

We take the data from January 12, January 19, and January 26 as an example to illustrate the convergence speed of GA and CSA; Figure 12 compares the evolution of GA and CSA with different population sizes. We observe that the fitness values decrease monotonically as the iterations increase. In addition, when the number of iterations is less than 100, the larger the population size, the faster the convergence speed. We also observe that CSA converges faster than GA: CSAWNN reached its best convergence speed at approximately iteration 20 with a population size of 50, whereas GABPNN converged between iterations 60 and 80, reaching its best speed at approximately iteration 60.

5.2. Degree of Certainty

The forecasting results of a neural network optimized by a stochastic optimization algorithm usually differ from experiment to experiment because of the probabilistic mechanism of the algorithm. However, in actual forecasting, the future values are unknown; thus, we cannot know in advance which experiment will produce the best result. Hence, we use the following metric to quantify this randomness.

We define the degree of certainty as $$\mathrm{DC} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2,$$ where $N$ is the number of experiments, $x_i$ is the value of the evaluation metric (MAPE or GRA) obtained in the $i$th forecasting experiment, and $\bar{x}$ is the average value over all experiments. It is clear that a smaller DC corresponds to a higher degree of certainty.
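
Assuming the mean-squared-deviation form reconstructed above, the degree of certainty can be computed as in the following sketch; the metric values in the example are hypothetical.

import numpy as np

def degree_of_certainty(metric_values):
    # Mean squared deviation of the metric values from their average
    # (smaller value = higher degree of certainty); the exact form of the
    # formula is reconstructed from the text.
    x = np.asarray(metric_values, dtype=float)
    return np.mean((x - x.mean()) ** 2)

# Usage: MAPE values (in %) from several repeated runs of a hypothetical model.
mape_runs = [1.02, 1.05, 0.98, 1.10, 1.01]
print(degree_of_certainty(mape_runs))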

The scatterplot in Figure 13 shows the MAPE and GRA distributions of the results of 100 experiments for February 2 using GABPNN and CSAWNN, respectively. Although the minimum MAPE value of GABPNN is smaller than that of CSAWNN, the DC (MAPE) and DC (GRA) values of CSAWNN are both smaller than those of GABPNN: DC (MAPE) and DC (GRA) of CSAWNN are 0.0014 and 0.00016, whereas those of GABPNN are 0.0016 and 0.00019. Thus, CSAWNN is a better forecasting method than GABPNN for practical forecasting.

5.3. Forecasting Availability

Forecasting availability can be measured not only by the square sum of forecasting error but also by the mean and mean squared deviation of the forecasting accuracy. In certain practical circumstances, the skewness and kurtosis of the distribution of forecasting accuracy need further consideration; on that basis, this section will give a general discrete form of forecasting availability [51].

Definition 10. Let $e_{it}$ denote the relative forecasting error of the $i$th forecasting method at time $t$, $i = 1, 2, \dots, m$, $t = 1, 2, \dots, N$. The matrix $E = (e_{it})_{m \times N}$ is called the relative error matrix of the forecasting models.

Definition 11. $A_{it} = 1 - |e_{it}|$ is called the forecasting accuracy of the $i$th forecasting method at time $t$, where $|e_{it}|$ is truncated at 1 so that $A_{it} \in [0, 1]$.

Definition 12. $m_i^{(k)} = \sum_{t=1}^{N} Q_t A_{it}^{k}$ is called the $k$th-order forecasting availability unit of the $i$th forecasting method, where $k$ is a positive integer and $Q_t$ is the discrete probability distribution over the $N$ time points, with $\sum_{t=1}^{N} Q_t = 1$.
In particular, if no prior information about the discrete probability distribution is available, we define $Q_t = 1/N$.

Definition 13. $H\!\left(m_i^{(1)}, m_i^{(2)}, \dots, m_i^{(k)}\right)$ is called the $k$th-order forecasting availability of the $i$th forecasting method, where $H$ is a continuous function of the corresponding availability units.

Definition 14. When $H$ is a continuous function of one variable, $H\!\left(m_i^{(1)}\right) = m_i^{(1)}$ is the 1st-order forecasting availability of the $i$th forecasting method. When $H$ is a continuous function of two variables, $H\!\left(m_i^{(1)}, m_i^{(2)}\right) = m_i^{(1)} - \sqrt{m_i^{(2)} - \left(m_i^{(1)}\right)^2}$ is the 2nd-order forecasting availability of the $i$th forecasting method.
In particular, if the first decimal place of the availability values is the same for all methods, the fractional part can be used to distinguish them.

Definition 14 illustrates that the 1st-order forecasting availability is the expectation of the forecasting accuracy sequence, and the 2nd-order forecasting availability is the difference between the expectation and the standard deviation of the forecasting accuracy sequence. We use forecasting availability to evaluate the power load forecasting results in this paper. From Figure 14, the 1st-order and 2nd-order forecasting availability offered by the hybrid model are 0.9222 and 0.8366, respectively, which outperform those of the other models; this result is consistent with the previous evaluation criteria. Thus, the hybrid model is more valid than the others.
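
Under the equal-weight assumption Q_t = 1/N of Definition 12, the 1st- and 2nd-order forecasting availability can be computed as in the sketch below; the clipping of the accuracy to [0, 1] and the example data are assumptions made for illustration.

import numpy as np

def forecasting_availability(y_true, y_pred):
    # 1st- and 2nd-order forecasting availability (Definitions 10-14), assuming
    # equal weights Q_t = 1/N and accuracy A_t = 1 - |relative error| clipped to [0, 1].
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    accuracy = np.clip(1.0 - np.abs((y_true - y_pred) / y_true), 0.0, 1.0)
    m1 = accuracy.mean()                           # 1st-order availability unit
    m2 = (accuracy ** 2).mean()                    # 2nd-order availability unit
    second = m1 - np.sqrt(max(m2 - m1 ** 2, 0.0))  # expectation minus standard deviation
    return m1, second

# Toy usage with hypothetical observed and forecast loads (MWh):
obs = [8000.0, 8200.0, 8500.0, 8300.0]
fc = [7950.0, 8100.0, 8600.0, 8350.0]
print(forecasting_availability(obs, fc))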

6. Conclusion

One-day-ahead power load forecasting is an extremely important problem for power load planning, secure operation, and the economy of energy expenditure. Assessing the power load as accurately and quickly as possible is the primary objective of power load forecasting. However, the power load is affected by various uncertain factors such as climate change and the social environment, which makes accurate power load forecasts difficult to obtain. The accuracy of traditional individual forecasting methods, which lack denoising, is not satisfactory for power load forecasting. Herein, a hybrid EMD-CSAWNN model for short-term power load forecasting is developed. The empirical mode decomposition technique is applied to remove the high-frequency components. On the basis that WNN can handle data with nonlinear features, the ensemble forecasting method is adopted to overcome the uncertainty of the outcomes attributable to the random initialization of a single WNN. Moreover, we use the CSA to optimize the parameters in the ensemble forecasting model. Experimental studies of power load forecasting in NSW demonstrate that the hybrid model has higher precision than conventional forecasting models. The proposed EMD-CSAWNN model can provide efficient computation and satisfactory forecasting accuracy for this type of data. Therefore, the developed hybrid approach is suggested for broad application in power load forecasting and even in other fields such as wind speed and traffic flow forecasting.

Abbreviations

BPNN:Back Propagation Neural Network
WNN:Wavelet Neural Network
GA:Genetic Algorithm
CSA:Cuckoo Search Algorithm
GABPNN:Back Propagation Neural Network Optimized by Genetic Algorithm
CSAWNN:Wavelet Neural Network Optimized by Cuckoo Search Algorithm
AR:Autoregressive model
MA:Moving average model
ARIMA:Autoregressive integrated moving average model
IMF:Intrinsic mode function
ANN:Artificial Neural Network
PSO:Particle swarm optimization algorithm
ACO:Ant colony optimization algorithm
SAO:Simulated annealing optimization algorithm
MAE:Mean absolute error
RMSE:Root mean square error
MAPE:Mean absolute percentage error
GRA:Grey relational analysis
:The number of sample data points used to build the NN model
:The number of data points to be forecasted in the NN model
:The connection weights between the neurons of the input layer, hidden layer, and output layer, with values belonging to [−1, 1]
:The excitation function of the hidden layer
:The learning rate of the NN, which is used to adjust the weights and thresholds of the NN
:The maximum number of iterations
:The population size of the initial population space
:A random number belonging to
:A random number belonging to
:The higher and lower bounds of the value of the gene
:The maximum number of generations
:The possibility of finding an exotic egg by the nest master
:The location of the optimum nest in generation iter
:Two random numbers in generation iter.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant no. ).