Abstract

Wind energy analysis and wind speed modeling have a significant impact on wind power generation systems and have attracted significant attention from many researchers in recent decades. Based on the inherent characteristics of wind speed, such as nonlinearity and randomness, the prediction of wind speed is considered to be a challenging task. Previous studies have only considered point prediction or interval measurement of wind speed separately and have not combined these two methods for prediction and analysis. In this study, we developed a novel hybrid wind speed double prediction system comprising a point prediction module and interval prediction module to compensate for the shortcomings of existing research. Regarding point prediction in the developed double prediction system, a novel nonlinear integration method based on a backpropagation network optimized using the multiobjective evolutionary algorithm based on decomposition was successfully implemented to derive the final prediction results, which enable further improvement of the accuracy of point prediction. Based on point prediction results, we propose an interval prediction method that constructs different intervals according to the classification of different data features via fuzzy clustering, which provides reliable interval prediction results. The experimental results demonstrate that the proposed system outperforms existing methods in engineering applications and can be used as an effective technology for power system planning.

1. Introduction

Recently, owing to the depletion of fossil fuels and increasing demands for environmental protection, wind power and other new energy industries have developed rapidly [1]. Wind energy is renewable and clean, so its comprehensive development and utilization offer a wide range of social and economic benefits. Therefore, wind energy is a very promising resource around the world [2]. However, in practice, wind speed is inherently random and intermittent, which limits the effective and comprehensive development of wind systems and poses a major challenge to the operation and management of power grids, particularly when considering wind power integration [3].

Generally, effective wind speed prediction can reduce the risks of wind power generation associated with uncertainty. Predicting wind speed accurately is a difficult task and major focus for wind farm decision-makers. It is very important to establish a suitable wind farm architecture and determine the nonlinear dynamic modes of wind speed precisely for the sake of efficient management and minimizing potential risk [4]. Wind speed prediction methods can be divided into four different types: ultrashort term (several seconds to 4 h), short term (4 to 24 h), medium term (1 to 7 d), and long term (more than 7 d). Different prediction horizons have different application values. For short-term and ultrashort-term wind speed prediction, the most critical impact is its role in power system operation [5]. For example, the output power of a wind farm in the United States can fluctuate by several hundred megawatts within an hour, which has a significant impact on the safe and stable operation of the system. To avoid potential problems, short-term wind speed prediction is crucial for providing data support for the reasonable dispatching of power resources, improving the efficiency of optimal dispatching, and optimizing the use of various power generation methods [6].

In recent decades, many wind speed prediction methods have been proposed, falling into three main categories: physical methods, statistical methods, and artificial intelligence methods [7–9]. Physical methods mainly use detailed data from the lower atmosphere for analysis, mining, and prediction [10]. Such models are based on the basic information of wind turbines provided by numerical weather forecasting systems and the parameterization of physical phenomena according to initial conditions and nonlinear partial differential equation systems, which can be used to obtain a series of different meteorological parameters [11]. However, such models do not analyze and mine historical data, so they ignore potentially useful information in those data [12]. Additionally, because different parametric models are included within a single larger model, such models are difficult to apply to wind farms. Using such models alone for wind energy mining and prediction will produce relatively large system errors [13]. Statistical models use large amounts of historical data to perform prediction without considering the impact of many meteorological factors [14]. In early research on wind speed prediction, traditional statistical models largely relied on autoregression (AR) [15] and its extensions (e.g., ARIMA [14, 16], real ARIMA [17], and ARIMA-ARCH [18]). However, for wind speed series containing complex information, such models have difficulty in accurately mining patterns, particularly nonlinear patterns.

Since the initial development of artificial intelligence technology, intelligent prediction models have been designed and applied to wind energy forecasting [19], including artificial intelligence systems [20, 21], support vector machines [22], and fuzzy logic methods [23]. Additionally, owing to the strong nonlinearity of wind energy data, only nonlinear models have reasonable prediction ability [24]. However, because of their inherent disadvantages, individual models cannot achieve the expected prediction results in all circumstances [25, 26]. With the increasing application of wind power generation in electric power systems, developing effective controls for prediction error is crucial [27]. To compensate for the shortcomings of individual models, some hybrid models have been designed for wind speed prediction to achieve better prediction performance [28]. Generally speaking, hybrid models achieve accurate wind speed predictions more easily than individual models. Therefore, many research reports on hybrid forecasting models are published every year. Such reports tend to focus on data preprocessing [29, 30], combining single models, and using heuristic algorithms, such as particle swarm optimization [31], genetic algorithms [32], and multiobjective algorithms [33], to optimize model parameters.

In recent years, hybrid models have been applied to both long-term and short-term wind speed prediction. For short-term wind speed forecasting, Ma et al. [34] developed a wind speed prediction model by using singular spectrum analysis (SSA) to derive a noise removal sequence corresponding to a real sequence to predict short-term wind speeds. Wang et al. [35] developed a hybrid wind speed prediction model using complete ensemble empirical mode decomposition (CEEMD), multiobjective whale optimization, and an Elman neural network. Meng et al. [36] proposed a short-term wind speed prediction hybrid model combining data preprocessing with an artificial neural network and various optimization methods. Assessments of the effectiveness of their model revealed that its prediction accuracy was significantly improved compared to several benchmark models. In [37], an effective short-term prediction framework for wind speed was proposed by combining a local linear fuzzy neural network, discrete wavelet transform, and singular spectrum analysis optimized by the seeker optimization algorithm. In [38], a hybrid wind speed prediction model was proposed that combines variational mode decomposition (VMD) with an extreme learning machine (ELM) optimized using the hybrid backtracking search optimization algorithm. This model achieved excellent performance in terms of describing nonlinear modes. These studies demonstrate not only that hybrid strategies are superior to individual models but also that such strategies can be used as an effective form of engineering application technology. Additionally, there have been a large number of studies on medium- and long-term wind speed prediction. For example, Wang et al. [39] combined support vector regression with seasonal index adjustment and an Elman recurrent neural network to construct hybrid models called PMERNN and PAERNN, which performed the mid-term prediction of wind speed effectively. Ulkat and Günay [40] combined physical factors with an artificial neural network to determine wind speeds at specific positions without relying on previous wind speed data, which is effective for long-term wind speed prediction. The prediction results of the hybrid models discussed above demonstrate their effectiveness for both short- and long-term wind speed prediction.

Another problem regarding wind energy prediction is that many studies focus on a single mode of prediction. Specifically, most previous studies have focused on either point prediction or interval prediction of wind speed alone, without considering how both could be used together for predictive modeling and analysis. Therefore, existing models cannot meet the needs of engineering applications or guarantee the reliability of wind systems. Existing probability interval prediction methods can generate a large quantity of predictions that can help managers implement appropriate policies. However, the study of interval modeling and prediction is still insufficient. The main research direction for uncertainty quantification focuses on statistical methods, including quantile regression [41, 42], bootstrapping [43], and kernel density estimation [44]. Additionally, several interval prediction methods have been proposed based on artificial neural networks, such as lower upper bound estimation (LUBE) [45].

Table 1 summarizes existing methods and models for wind speed point prediction and interval prediction, as well as the advantages and disadvantages of these methods. The main points in Table 1 can be summarized as follows.

In the area of point prediction, (1) although physical models provide good long-term prediction ability, their application is limited by complicated meteorological conditions, difficult model initialization, and excessive computation. (2) Traditional statistical models, such as AR and ARIMA, are computationally efficient. However, their linear form limits their ability to model nonlinear time series, such as wind speed time series. (3) Artificial neural networks provide good nonlinear time series modeling ability, but network training can easily fall into local optima. (4) Although recent combined models successfully incorporate the advantages of individual models and improve prediction accuracy significantly, model combination in existing systems almost always relies on linear combination. Given the nonlinear characteristics of wind speed, this paper presents a method for combining individual prediction models in a nonlinear manner and optimizing model parameters using a multiobjective optimization algorithm to improve prediction effectiveness further.

Regarding interval prediction, (1) owing to the unique advantages of quantile regression, most research has focused on this method. However, quantile regression is disadvantageous for developing prediction intervals because it must obtain a specific training dataset to establish prediction models. Additionally, every quantile must be considered, which increases computational complexity and the probability of discarding useful results during resampling [46]. (2) Bootstrapping methods are statistical methods that apply resampling with replacement to evaluate the robustness of various statistics, including standard errors, confidence interval parameters, correlation coefficients, and regression coefficients. Bootstrapping methods can compensate for the shortcomings of quantile regression methods but are only helpful for handling small sample sizes [47]. (3) Kernel density estimation can quickly calculate intervals based on point prediction results and a given historical error distribution. However, such methods require strict distributional assumptions [48]. (4) The LUBE method avoids the shortcomings of traditional interval prediction methods, requires no distributional assumptions, and is computationally efficient, but its complex objective function cannot be optimized using conventional methods. In summary, there is no unified interval prediction method and further research and investigation are required to obtain more effective results [49]. Therefore, we developed a novel interval prediction architecture that outperforms most individual interval prediction models based on assumed distributions. In the proposed interval prediction architecture, there are no hypotheses regarding distributions and models. Therefore, the established interval structure is robust to outliers in interval data.

Based on our review of the literature and methods described above, the major contribution of this article is the presentation of a hybrid double prediction system that is designed to combine the point prediction and probability interval prediction of wind speed, which compensates for the shortcomings of existing research. The proposed system is composed of a wind speed point prediction module and an interval prediction module, which can provide numerous predictions for the managers of wind farms. Specifically, the proposed double prediction system includes a preprocessing module based on VMD, a prediction module based on a nonlinear combination model, an interval prediction module, and an evaluation module. As a relatively new signal processing technology, VMD decomposes wind speed sequences and then performs denoising and reconstruction to generate a cleaner time sequence. The nonlinear combination model proposed in this paper is an effective prediction approach. ELM [9], a generalized regression neural network (GRNN) [50], and ARIMA [51] are selected as the base models for combination. The prediction results from these three models are aggregated using a backpropagation (BP) network [52], which is a form of nonlinear combination. BP is very sensitive to the selection of its parameters, which directly determine the effectiveness of point prediction and interval prediction. Therefore, to identify the optimal parameters for the BP model, we adopted the multiobjective evolutionary algorithm based on decomposition (MOEA/D). Additionally, to verify the performance of the proposed prediction architecture, we selected ten indicators to judge the accuracy of prediction. We present a thorough discussion on the verification of the effectiveness of the prediction system in this paper.

Wind speed data from Penglai in Shandong province were selected as experiment datasets on which to test the performance of the proposed system. Shandong province is located on the east coast of China and is rich in wind energy resources. To meet the needs of social development, energy conservation, and environmental protection, Shandong has developed many wind power stations. As a coastal province, Shandong has one of the largest wind farms in China. By the end of 2018, the total installed capacity of wind power was 11.26 million kilowatts. In this study, Penglai city, which is located in northern Shandong province, was selected as a research area based on its huge energy potential and valuable wind energy resources.

The major innovations of the proposed predictive system can be summarized as follows.
(1) A novel double prediction system for wind speed is established based on deterministic point prediction and uncertain interval prediction. The goal of the proposed system is to enhance the accuracy of point prediction, the construction efficiency of prediction intervals, and the operation level of wind power systems. Numerical simulation results demonstrate that our model has satisfactory prediction abilities.
(2) A nonlinear combination method based on an optimized BP network is proposed. To determine the optimal combination mode for each model and overcome the limitations of existing linear combination models for nonlinear wind speed data, a nonlinear aggregation mechanism based on a BP network is used to combine different models, which compensates for the inherent defects of individual models and linear combinations. The MOEA/D algorithm is applied to search for the best parameters for the BP network to improve prediction accuracy further.
(3) An interval prediction method based on fuzzy clustering is established. Compared to traditional parametric statistical models, one unique advantage of the proposed prediction model is its convenience because it does not need to know distribution shapes. This feature significantly reduces the complexity of the model and enhances the overall efficiency of the system.

The remainder of this article is organized as follows. Section 2 discusses the relevant methods applied in the proposed double prediction architecture. In Section 3, the double prediction model is established comprehensively. In Section 4, data are introduced and experimental results are analyzed. Further discussion is provided in Section 5. Finally, our conclusions are summarized in Section 6.

1.1. Knowledge and Tools for Model Preparation

When constructing our model, several methods were selected based on their unique advantages and combined to enhance the overall performance of the model. Here, we introduce the two main methods, which are variational mode decomposition [53] and multiobjective evolutionary algorithm based on decomposition.

First, we will discuss variational mode decomposition (VMD).

VMD is an effective data preprocessing method proposed by Dragomiretskiy and Zosso [54] in 2014. The goal of VMD is to decompose a real input signal sequence $f$ into a series of subsignal sequences $y_k$ called modes, which have specific sparsity characteristics when reproducing the input. The signal $f$ and the modes $y_k$ are required to be square-integrable up to the second derivative (i.e., they belong to the Sobolev space $H^2$). Each mode $k$ is assumed to be mostly compact around a center pulsation $\omega_k$.

Step 1. Assessing the bandwidth of each mode.
For each mode $y_k$, the Hilbert transform is applied to calculate the associated analytic signal and a unilateral frequency spectrum is obtained. Next, each mode is mixed with an exponential tuned to its estimated center frequency to shift the spectrum of the mode to the baseband. The bandwidth is estimated based on the $H^1$ Gaussian smoothness of the demodulated signal (i.e., the squared $L^2$-norm of the gradient). The constrained variational problem is defined as follows:
$$\min_{\{y_k\},\{\omega_k\}} \left\{ \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * y_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k} y_k = f, \tag{1}$$
where $\delta$ represents the Dirac distribution, $*$ denotes convolution, $k = 1, \ldots, K$ indexes the modes, and $t$ denotes time. Furthermore, $\{y_k\}$ is the set of modes $\{y_1, y_2, \ldots, y_K\}$ and $\{\omega_k\}$ is the set of center pulsations $\{\omega_1, \omega_2, \ldots, \omega_K\}$.

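Step 2. Defining the optimization problem.
Considering the quadratic penalty term and the Lagrange multiplier $\lambda$, the constrained optimization problem above can be redefined as follows:
$$L(\{y_k\},\{\omega_k\},\lambda) = \alpha \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * y_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k} y_k(t) \right\|_2^2 + \left\langle \lambda(t), f(t) - \sum_{k} y_k(t) \right\rangle, \tag{2}$$
where $\alpha$ represents the balancing parameter of the data fidelity constraint.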

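Step 3. Solving for the modes $\{y_k\}$ and center pulsations $\{\omega_k\}$.
By adopting the alternating direction method of multipliers, the process of solving for $y_k$ and $\omega_k$ can be defined as follows.
For $y_k$, the minimization problem is defined as
$$y_k^{n+1} = \arg\min_{y_k \in X} \left\{ \alpha \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * y_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{i} y_i(t) + \frac{\lambda(t)}{2} \right\|_2^2 \right\}, \tag{3}$$
where the superscripts $n$ and $n+1$ are omitted for the fixed directions $\omega_k$ and $y_{i \neq k}$. The problem is solved in the spectral domain as follows:
$$\hat{y}_k^{n+1} = \arg\min_{\hat{y}_k,\, y_k \in X} \left\{ \alpha \left\| j\omega \left[ \left( 1 + \operatorname{sgn}(\omega + \omega_k) \right) \hat{y}_k(\omega + \omega_k) \right] \right\|_2^2 + \left\| \hat{f}(\omega) - \sum_{i} \hat{y}_i(\omega) + \frac{\hat{\lambda}(\omega)}{2} \right\|_2^2 \right\}, \tag{4}$$
which can be rewritten as
$$\hat{y}_k^{n+1} = \arg\min_{\hat{y}_k,\, y_k \in X} \left\{ \alpha \left\| j(\omega - \omega_k) \left[ \left( 1 + \operatorname{sgn}(\omega) \right) \hat{y}_k(\omega) \right] \right\|_2^2 + \left\| \hat{f}(\omega) - \sum_{i} \hat{y}_i(\omega) + \frac{\hat{\lambda}(\omega)}{2} \right\|_2^2 \right\}. \tag{5}$$
Therefore, the final solution can be obtained as follows:
$$\hat{y}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{y}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2}. \tag{6}$$
For $\omega_k$, the minimization problem is defined as follows:
$$\omega_k^{n+1} = \arg\min_{\omega_k} \left\{ \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * y_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\}. \tag{7}$$
In the Fourier domain, the problem is optimized as
$$\omega_k^{n+1} = \arg\min_{\omega_k} \left\{ \int_0^{\infty} (\omega - \omega_k)^2 \left| \hat{y}_k(\omega) \right|^2 d\omega \right\}. \tag{8}$$
Therefore, the final solution can be obtained as follows:
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{y}_k(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{y}_k(\omega) \right|^2 d\omega}, \tag{9}$$
where $\hat{y}_k^{n+1}(\omega)$, $\hat{y}_i(\omega)$, $\hat{f}(\omega)$, and $\hat{\lambda}(\omega)$ represent the Fourier transforms of $y_k^{n+1}(t)$, $y_i(t)$, $f(t)$, and $\lambda(t)$, respectively, and $n$ represents the iteration number.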
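The two closed-form updates used in Step 3 (the Wiener-filter update of a mode spectrum and the power-weighted centroid update of its center pulsation) can be sketched in a few lines. The following is only an illustrative sketch, not the authors' implementation: it works on a toy non-negative frequency grid, holds the Lagrange multiplier at zero, and the names `update_mode`, `update_center`, the value of `alpha`, and the synthetic two-peak spectrum are all assumptions introduced here.

```python
import math


def update_mode(f_hat, others_hat, lam_hat, omega, omega_k, alpha):
    """Wiener-filter update of one mode spectrum on a frequency grid.

    f_hat, others_hat, lam_hat hold complex spectrum samples of the signal,
    of the sum of all other modes, and of the Lagrange multiplier.
    """
    return [
        (f - o + lam / 2) / (1 + 2 * alpha * (w - omega_k) ** 2)
        for f, o, lam, w in zip(f_hat, others_hat, lam_hat, omega)
    ]


def update_center(y_hat, omega):
    """Center pulsation as the power-weighted mean over non-negative frequencies."""
    pairs = [(w, abs(y) ** 2) for y, w in zip(y_hat, omega) if w >= 0]
    total = sum(p for _, p in pairs)
    return sum(w * p for w, p in pairs) / total


# Toy half-spectrum (assumption): two narrow peaks at normalized frequencies 0.1 and 0.3.
omega = [i / 100 for i in range(51)]
f_hat = [
    complex(math.exp(-((w - 0.1) / 0.01) ** 2) + math.exp(-((w - 0.3) / 0.01) ** 2))
    for w in omega
]
zeros = [0j] * len(omega)

alpha = 2000.0            # bandwidth penalty (hypothetical value)
centers = [0.05, 0.4]     # deliberately poor initial center pulsations
modes = [zeros[:], zeros[:]]

for _ in range(20):
    for k in range(2):
        # Spectrum of all modes except mode k, using the most recent updates.
        others = [
            sum(m[i] for j, m in enumerate(modes) if j != k)
            for i in range(len(omega))
        ]
        # Multiplier held at zero here, i.e. no dual ascent step in this sketch.
        modes[k] = update_mode(f_hat, others, zeros, omega, centers[k], alpha)
        centers[k] = update_center(modes[k], omega)
```

After a few alternating passes the two center pulsations move from their initial guesses toward the two spectral peaks, which is the behavior the updates are designed to produce. A full VMD implementation would additionally mirror the signal, update the multiplier, and test a convergence criterion.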
Next, we will discuss the multiobjective evolutionary algorithm based on decomposition (MOEA/D).
Recently, the MOEA/D proposed by Zhang and Li [55] has attracted significant interest based on its concise and effective characteristics, and many theoretical and practical achievements have been realized. The MOEA/D algorithm is detailed below.
A multiobjective optimization problem (MOP) with $M$ objectives and $N$ decision variables can be expressed as follows:
$$\min F(x) = \left( f_1(x), f_2(x), \ldots, f_M(x) \right)^T \quad \text{s.t.} \quad x \in \Omega, \tag{10}$$
where $\Omega$ is the decision space. The decision vector $x = (x_1, \ldots, x_N)^T \in \Omega$ is a candidate solution to the MOP. Here, the objective function $F$ includes $M$ conflicting objective functions with continuous real values and is defined as
$$F: \Omega \rightarrow R^M, \tag{11}$$
where $R^M$ represents the objective space.
The Pareto dominance relationship between individuals is defined as follows. If there are decision vectors $U$ and $V$ which satisfy the following two conditions simultaneously, we say that $U$ dominates $V$:
(1) For all $i \in \{1, \ldots, M\}$, $f_i(U) \le f_i(V)$.
(2) There exists at least one index $j \in \{1, \ldots, M\}$ such that $f_j(U) < f_j(V)$.
In this case, $V$ is said to be dominated by $U$, which can be denoted as $U \prec V$, where $\prec$ represents the dominance relation.
If there is no point $x \in \Omega$ such that $F(x)$ dominates $F(x^*)$, then the point $x^*$ is Pareto optimal. In general, there is no single optimal solution but a set of compromise solutions called nondominated solutions (i.e., solutions not dominated by any other solution). The sets of Pareto optimal solutions in the decision space and the objective space are defined as the Pareto solution set (PS) and Pareto frontier, respectively [56].
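The dominance test and the extraction of nondominated solutions can be illustrated with a short sketch. It assumes a minimization problem, and the function names and the four example objective vectors are hypothetical values chosen only for illustration.

```python
def dominates(u, v):
    """Return True if objective vector u Pareto-dominates v (minimization assumed)."""
    no_worse = all(a <= b for a, b in zip(u, v))
    strictly_better = any(a < b for a, b in zip(u, v))
    return no_worse and strictly_better


def nondominated(points):
    """Keep the objective vectors that no other vector in the list dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Four candidate objective vectors (f1, f2); (3, 3) is dominated by (2, 2).
candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = nondominated(candidates)
```

The remaining three vectors trade one objective against the other, so none of them can be discarded without a preference between the objectives.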
MOEA/D has strong search ability for continuous optimization, combinatorial optimization, and problems with complex Pareto sets. The main principle of this algorithm can be summarized as follows.
If a multiobjective optimization problem (e.g., equation (10)) and a weight vector $\lambda = (\lambda_1, \ldots, \lambda_M)^T$ are given and the weight vector satisfies $\lambda_i \ge 0$, $i = 1, \ldots, M$, and $\sum_{i=1}^{M} \lambda_i = 1$, then the MOEA/D can be applied. MOEA/D based on Tchebycheff decomposition uses weight vectors to decompose a MOP into several scalar subproblems of the following form:
$$\min g^{te}(x \mid \lambda, z^*) = \max_{1 \le i \le M} \left\{ \lambda_i \left| f_i(x) - z_i^* \right| \right\} \quad \text{s.t.} \quad x \in \Omega, \tag{12}$$
where $z^* = (z_1^*, \ldots, z_M^*)^T$ is the ideal point and $z_i^* = \min \{ f_i(x) \mid x \in \Omega \}$. By solving multiple subproblems with different weight vectors based on equation (12), a Pareto optimal solution set [57] with good diversity can be obtained.
It is known that $g^{te}$ is continuous in $\lambda$, so if $\lambda^i$ is close to $\lambda^j$, then the optimal solution of $g^{te}(x \mid \lambda^i, z^*)$ must be close to that of $g^{te}(x \mid \lambda^j, z^*)$. Therefore, information regarding the subproblems $g^{te}$ with weight vectors near $\lambda^i$ is a useful tool for optimizing $g^{te}(x \mid \lambda^i, z^*)$.
In the MOEA/D, the population is made up of the best solutions found so far for each subproblem. Each subproblem maintains a list of neighbors, and this list contains the subproblems whose weight vectors are closest to that of the current subproblem. Therefore, under the assumption of continuity, two neighboring subproblems should have similar optimal solutions. In each generation of MOEA/D, each subproblem is optimized using only the information from its neighboring subproblems.
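As a small illustration of the decomposition idea, the sketch below evaluates the Tchebycheff scalarization for one objective vector and builds neighborhoods of weight vectors for a two-objective problem. The helper names, the evenly spaced weight layout, and the numerical values are assumptions made for this example only.

```python
def tchebycheff(fx, lam, z_star):
    """Tchebycheff scalarization: the largest weighted distance to the ideal point."""
    return max(l * abs(f - z) for f, l, z in zip(fx, lam, z_star))


def uniform_weights(n_sub):
    """Evenly spaced weight vectors for two objectives (components sum to 1)."""
    return [(i / (n_sub - 1), 1.0 - i / (n_sub - 1)) for i in range(n_sub)]


def neighborhoods(weights, t):
    """For each weight vector, the indexes of its t closest weight vectors."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [
        sorted(range(len(weights)), key=lambda j: dist2(weights[i], weights[j]))[:t]
        for i in range(len(weights))
    ]


weights = uniform_weights(5)      # (0, 1), (0.25, 0.75), ..., (1, 0)
B = neighborhoods(weights, 3)     # each subproblem and its two nearest neighbors

# The same objective vector scores differently under different weight vectors.
g_balanced = tchebycheff((2.0, 3.0), (0.5, 0.5), (1.0, 1.0))
g_skewed = tchebycheff((2.0, 3.0), (0.9, 0.1), (1.0, 1.0))
```

Because each weight vector emphasizes a different objective, minimizing the scalarization for many weight vectors at once spreads the solutions along the Pareto frontier.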
For each generation $t$, MOEA/D using the Tchebycheff decomposition maintains the following.
(1) A population of $N$ points $x^1, \ldots, x^N \in \Omega$, where $x^i$ is the current solution for the $i$-th subproblem.
(2) $FV^1, \ldots, FV^N$, where $FV^i = F(x^i)$ for each $i = 1, \ldots, N$.
(3) $z = (z_1, \ldots, z_M)^T$, where $z_i$ is the best value found so far for objective $f_i$.
(4) An external population (EP), which stores the nondominated solutions found during the search.
The pseudocode of MOEA/D is described as Algorithm 1 below:

Input:
(i) MOP: the multiobjective optimization problem
(ii) $N$: the number of MOEA/D subproblems
(iii) $\lambda^1, \ldots, \lambda^N$: $N$ uniformly distributed weight vectors
(iv) $T$: the number of weight vectors in the neighborhood of each weight vector
(v) max_gen: the maximum number of generations
Output:
(i) EP: the external population
Setup:
(i) Set EP = $\emptyset$
(ii) Set $t = 0$
Step 1: Initialization
(i) Initialize an internal population uniformly at random: $P_0 = \{x^1, \ldots, x^N\}$ and $FV^i = F(x^i)$
(ii) Initialize $z = (z_1, \ldots, z_M)^T$ by a problem-specific method
(iii) Calculate the Euclidean distance between any two weight vectors, and then find the $T$ closest weight vectors to each weight vector
(iv) For each $i = 1, \ldots, N$, set $B(i) = \{i_1, \ldots, i_T\}$, where $\lambda^{i_1}, \ldots, \lambda^{i_T}$ are the $T$ closest weight vectors to $\lambda^i$
Step 2: Updating
WHILE ($t$ < max_gen) DO
    FOR EACH $i = 1, \ldots, N$ DO
        Genetic operators: randomly select two indexes $k$, $l$ from $B(i)$, and then generate a new solution $y$ from $x^k$ and $x^l$ by using genetic operators
        Update of $z$: FOR EACH $j = 1, \ldots, M$ DO
            if $f_j(y) < z_j$, then set $z_j = f_j(y)$
        END FOR
        Update of neighboring solutions: FOR EACH index $j \in B(i)$ DO
            if $g^{te}(y \mid \lambda^j, z) \le g^{te}(x^j \mid \lambda^j, z)$, then set $x^j = y$ and $FV^j = F(y)$
        END FOR
        Update of EP: remove from EP all the vectors dominated by $F(y)$, and add $F(y)$ to EP if no vector in EP dominates $F(y)$
    END FOR
    $t = t + 1$
END WHILE
RETURN EP
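A compact, runnable sketch of Algorithm 1 is given below for a toy two-objective minimization problem with a single decision variable. It is not the configuration used in this paper: the test function `schaffer`, the averaging crossover with Gaussian mutation, and all parameter values (`n_sub`, `n_neighbors`, `max_gen`, mutation scale) are assumptions chosen to keep the example short.

```python
import random


def dominates(u, v):
    """Pareto dominance for minimization."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def tchebycheff(fx, lam, z):
    """Tchebycheff scalarization with reference point z."""
    return max(l * abs(f - zi) for f, l, zi in zip(fx, lam, z))


def schaffer(x):
    """Toy bi-objective problem (assumption): minimize x^2 and (x - 2)^2."""
    return (x * x, (x - 2.0) ** 2)


def moead(objective, bounds, n_sub=20, n_neighbors=5, max_gen=50, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Evenly spaced weight vectors and their T nearest neighbors B(i).
    weights = [(i / (n_sub - 1), 1.0 - i / (n_sub - 1)) for i in range(n_sub)]
    neighbors = [
        sorted(range(n_sub), key=lambda j: (weights[i][0] - weights[j][0]) ** 2)[:n_neighbors]
        for i in range(n_sub)
    ]
    # Step 1: random initial population, objective values, and reference point z.
    pop = [rng.uniform(lo, hi) for _ in range(n_sub)]
    fvals = [objective(x) for x in pop]
    z = [min(f[m] for f in fvals) for m in range(2)]
    ep = []  # external population of (solution, objective vector) pairs

    # Step 2: generational loop over all subproblems.
    for _ in range(max_gen):
        for i in range(n_sub):
            # Genetic operators: average two neighbors, then mutate and clip.
            k, l = rng.sample(neighbors[i], 2)
            y = 0.5 * (pop[k] + pop[l]) + rng.gauss(0.0, 0.1)
            y = min(hi, max(lo, y))
            fy = objective(y)
            # Update of z: keep the smallest value seen for each objective.
            z = [min(zj, fj) for zj, fj in zip(z, fy)]
            # Update of neighboring solutions.
            for j in neighbors[i]:
                if tchebycheff(fy, weights[j], z) <= tchebycheff(fvals[j], weights[j], z):
                    pop[j], fvals[j] = y, fy
            # Update of EP: drop dominated entries, add y if nondominated.
            if not any(dominates(f, fy) for _, f in ep):
                ep = [(x, f) for x, f in ep if not dominates(fy, f)]
                ep.append((y, fy))
    return ep


ep = moead(schaffer, bounds=(-5.0, 5.0))
```

In the setting of this paper the decision vector would instead hold the weights and thresholds of the BP network and the objectives would be prediction error criteria; those details are not reproduced in this sketch.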

2. Construction of the Wind Speed Double Prediction System

This section discusses the proposed wind speed double prediction system architecture, including system establishment and evaluation.

2.1. System Establishment

The prediction system proposed in this paper consists of two modules: a point prediction module and interval prediction module. The following subsections describe the system construction process and the system structure is illustrated in Figure 1.

2.1.1. Point Prediction Module

In this section, we propose a novel type of nonlinear hybrid point forecasting model using ELM, GRNN, and ARIMA, as well as a BP network, the MOEA/D, and a nonlinear combination mechanism, to achieve stable high-precision wind speed point prediction results. Considering the excellent prediction performance of BP networks, the proposed method adopts a BP network for nonlinear combination.

The point prediction module in the designed system is composed of four stages. The details of each stage are discussed below.
(i) First stage: wind speed data preprocessing.
To remove noise and extract useful information from a wind speed sequence, we use VMD to decompose the original sequence and reconstruct a smooth time series. Specifically, the original sequence is decomposed into several intrinsic mode functions (IMFs). IMFs with higher frequencies are eliminated to filter the time series. Here, we remove IMF1, IMF2, and IMF3, and the remaining IMFs are reconstructed to derive the final series.
(ii) Second stage: single-model prediction.
In the proposed method, we first use individual models to predict points. Specifically, we use ELM, GRNN, and ARIMA as individual prediction models to construct a combined model. ELM and GRNN are adopted to handle the nonlinear characteristics of wind speed, and ARIMA is adept at discerning the linear characteristics of wind speed data. In this study, we divided 4464 wind speed observations into a training set train1 and a testing set test1, where train1 contained 3964 observations and test1 contained 500 observations. In general, there are no clear rules regarding the ratio of training sets to testing sets for neural networks. It is common practice to use approximately 2/3 to 4/5 of the sample data for training and the remaining samples for testing. When the quantity of data is large, the proportion of data in the training set can be increased appropriately [58]. As the proportion of training data increases, the neural network can achieve better prediction accuracy [59, 60]. Therefore, this division of the data is sufficient to construct a model and verify its accuracy. The input and output structures of train1 are described in equations (13) and (14), respectively:
$$\text{Input} = \begin{bmatrix} x(1) & x(2) & \cdots & x(l) \\ x(2) & x(3) & \cdots & x(l+1) \\ \vdots & \vdots & & \vdots \\ x(n-l) & x(n-l+1) & \cdots & x(n-1) \end{bmatrix}, \tag{13}$$
$$\text{Output} = \begin{bmatrix} x(l+1) \\ x(l+2) \\ \vdots \\ x(n) \end{bmatrix}, \tag{14}$$
where $n$ is the number of samples in train1, $l$ is the look-back time lag, and $x(k)$ is the wind speed value at time $k$. For example, $x(1)$ to $x(l)$ are used as inputs and $x(l+1)$ is the corresponding output. Here, we set $l = 5$. By using ELM and GRNN models trained on train1 to predict the wind speeds in the test set, the prediction sequences predict1 and predict2 are obtained, respectively. Similarly, ARIMA is used to obtain the prediction sequence predict3.
(iii) Third stage: nonlinear combination model construction.
To obtain an effective combination of the individual models, a nonlinear decision-making method based on an optimized BP neural network is proposed. Specifically, we divide test1 into a training set train2 (356 samples) and a testing set test2 (144 samples) and then use predict1, predict2, and predict3 as BP network inputs and the 356 observations in train2 as outputs. It is worth noting that it is difficult to determine the weights and thresholds for the neurons in each layer of the BP network, so MOEA/D is adopted to search for the best weights and thresholds for the neural network. The input and output structures for BP network training are defined in equations (15) and (16), respectively:
$$\text{Input} = \begin{bmatrix} \hat{x}_1(1) & \hat{x}_2(1) & \hat{x}_3(1) \\ \hat{x}_1(2) & \hat{x}_2(2) & \hat{x}_3(2) \\ \vdots & \vdots & \vdots \\ \hat{x}_1(N) & \hat{x}_2(N) & \hat{x}_3(N) \end{bmatrix}, \tag{15}$$
$$\text{Output} = \begin{bmatrix} x(1) \\ x(2) \\ \vdots \\ x(N) \end{bmatrix}, \tag{16}$$
where $N$ represents the number of samples in train2 and $\hat{x}_1(k)$, $\hat{x}_2(k)$, and $\hat{x}_3(k)$ represent the $k$th wind speed values predicted by ELM, GRNN, and ARIMA, respectively. By using this input and output, an optimized BP neural network can be trained.
(iv) Fourth stage: wind speed point prediction.
According to the established nonlinear prediction model, a rolling prediction method is used for multistep prediction and final prediction results are obtained. Evaluation indexes are calculated using the prediction results and test2, and the performance of the model is evaluated. In particular, multistep forecasting means forecasting multiple wind speed values in the future. A time index $t$ is the forecast origin and a positive integer $l$ is the forecast horizon. It can be assumed that the time index $t$ is the current time point, and our target is to obtain the forecast value $\hat{x}(t + l)$ ($l \ge 1$). The horizons $l = 1, 2, 3$ correspond to 1-step, 2-step, and 3-step prediction, respectively.
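The sliding-window construction of model inputs and outputs and a rolling multistep forecast can be sketched as follows. This is an illustration under stated assumptions: the rolling scheme is taken to be recursive (each forecast is fed back as the newest input), the toy series and the stand-in `trend_model` are hypothetical, and none of the ELM, GRNN, ARIMA, or BP models is implemented here.

```python
def make_windows(series, lag):
    """Build lagged input rows and one-step-ahead targets from a series."""
    rows = len(series) - lag
    inputs = [series[i:i + lag] for i in range(rows)]
    outputs = [series[i + lag] for i in range(rows)]
    return inputs, outputs


def rolling_forecast(model, history, lag, steps):
    """Recursive multistep forecast: feed each prediction back as the newest input."""
    window = list(history[-lag:])
    predictions = []
    for _ in range(steps):
        nxt = model(window)
        predictions.append(nxt)
        window = window[1:] + [nxt]
    return predictions


def trend_model(window):
    """Stand-in one-step predictor (assumption): extend the last observed change."""
    return window[-1] + (window[-1] - window[-2])


# Toy series 1, 2, ..., 10 with a look-back lag of 5.
series = [float(v) for v in range(1, 11)]
inputs, outputs = make_windows(series, lag=5)
forecast = rolling_forecast(trend_model, series, lag=5, steps=3)
```

With a lag of 5, a series of length n yields n - 5 training rows, and the 1-step, 2-step, and 3-step forecasts correspond to the first, second, and third entries of `forecast`.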

2.1.2. Interval Prediction Module

The interval prediction method in the proposed system was developed using point prediction results based on a fuzzy system. The three main stages in this module are summarized below.
(i) First stage: data classification.
In this stage, the training set train1 of wind speed data is clustered into several classes using fuzzy c-means clustering. We assume that the data in each category follow the same normal distribution. Therefore, we can derive a set of interval classes. Here, we consider site 1 as an example. The data are divided into ten categories, where the scope of each category is defined in (17).
(ii) Second stage: wind speed interval estimation.
The confidence level of each category interval is 95%. According to the mean and variance of each category of data, a corresponding confidence interval is constructed. Each category therefore has its own prediction interval width. This process of constructing different adaptive intervals according to different data characteristics is one of the main innovations of our model. For each value in the point prediction results on the testing set test2 from the point prediction module, we identify the category F to which the prediction value belongs. Then, according to the constructed confidence interval for each category, the prediction interval for each point prediction value $x_i$ is calculated from the category number $j$ of $x_i$, the standard deviation $s_j$ of category $j$, and the number of data samples $n_j$ in category $j$.
(iii) Third stage: sorting prediction results.
According to the prediction intervals derived above, final interval estimations for wind speed can be obtained.
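The clustering-then-interval idea can be sketched on toy data as follows. Several choices here are assumptions rather than details confirmed by the text: the one-dimensional fuzzy c-means routine with fuzzifier m = 2, the assignment of each value to its nearest cluster center, and in particular the interval form $x_i \pm z_{0.975}\, s_j / \sqrt{n_j}$, which is one plausible reading of the symbols $s_j$ and $n_j$ named above. The data values and function names are hypothetical.

```python
import math
from statistics import NormalDist, stdev


def fuzzy_c_means_1d(data, n_clusters, m=2.0, n_iter=100):
    """Minimal 1-D fuzzy c-means; returns the cluster centers."""
    lo, hi = min(data), max(data)
    # Deterministic start: centers spread evenly over the data range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(n_iter):
        memberships = []
        for x in data:
            # Small offset avoids division by zero when x equals a center.
            inv = [(abs(x - c) + 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        centers = [
            sum((u[k] ** m) * x for u, x in zip(memberships, data))
            / sum(u[k] ** m for u in memberships)
            for k in range(n_clusters)
        ]
    return centers


def nearest(centers, x):
    """Index of the closest center (assumed hard assignment rule)."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


def category_stats(data, centers):
    """Sample standard deviation and size of each category."""
    groups = [[] for _ in centers]
    for x in data:
        groups[nearest(centers, x)].append(x)
    return [(stdev(g), len(g)) for g in groups]


def prediction_interval(x_hat, centers, stats, confidence=0.95):
    """Assumed interval form: x_hat +/- z * s_j / sqrt(n_j) for the category of x_hat."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    s_j, n_j = stats[nearest(centers, x_hat)]
    half = z * s_j / math.sqrt(n_j)
    return x_hat - half, x_hat + half


# Toy wind speeds (m/s) forming a low-speed and a high-speed group.
train = [2.0, 2.2, 1.8, 2.1, 1.9, 8.0, 8.3, 7.7, 8.1, 7.9]
centers = fuzzy_c_means_1d(train, n_clusters=2)
stats = category_stats(train, centers)
lower, upper = prediction_interval(2.05, centers, stats)
```

Because the width depends on the category of each point forecast, forecasts that fall into a more variable category receive wider intervals, which is the adaptive behavior described in the second stage.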

2.2. System Evaluation

The evaluation indexes for the designed double prediction system are introduced in this section, including four indexes for point prediction and six indexes for interval prediction.

2.2.1. Point Prediction Evaluation

Generally speaking, evaluation criteria are not unique to a given prediction system. This paper uses four common evaluation standards to evaluate the ability of the developed model and other comparative models, namely, mean absolute error (MAE) [61], mean squared error (MSE) [62], mean absolute percentage error (MAPE) [63], and direction change (DC). The smaller the values of MAPE, MSE, and MAE, the better the prediction performance. If the DC value is relatively large, the predicted direction of motion is considered to be consistent with the real value. Table 2 provides additional details regarding these four indexes.

In these formulas, $y_i$ and $\hat{y}_i$ represent the true value and predicted value of wind speed, respectively, and N represents the number of samples in the testing set.

Besides, ai is the directional factor and is calculated as
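As a rough illustration of the four point prediction indexes, the following Python sketch computes them with their widely used textbook definitions. Table 2 is not reproduced in this text, so these forms are assumptions rather than the paper's exact formulas. The directional factor in particular is assumed here to equal 1 when the predicted move from the last observed value has the same sign as the actual move, and 0 otherwise. The function name `point_metrics` is hypothetical.

```python
def point_metrics(actual, predicted):
    """Return MAE, MSE, MAPE (%) and DC (%) for two equal-length sequences."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE assumes that no actual value is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # Assumed directional factor: 1 if the actual step and the predicted step,
    # both measured from the last observed value, point the same way.
    hits = sum(
        1
        for i in range(n - 1)
        if (actual[i + 1] - actual[i]) * (predicted[i + 1] - actual[i]) > 0
    )
    dc = 100.0 * hits / (n - 1)

    return {"MAE": mae, "MSE": mse, "MAPE": mape, "DC": dc}
```

For example, `point_metrics([2.0, 4.0, 5.0, 4.0], [2.5, 3.5, 5.5, 5.5])` gives an MAE of 0.75 and a DC of about 66.7%, because the last step is predicted upward while the actual series moves downward.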

2.2.2. Interval Prediction Evaluation

For interval prediction, we selected six evaluation indexes: prediction interval coverage probability (PICP) [64], prediction interval normalized average width (PINAW) [65], average coverage error (ACE), average width of the constructed PIs (MPI), Winkler score (WS), and accumulated width deviation (AWD) [66]. Table 3 provides specific definitions for these six indexes.

In this paper, PICP specifically refers to the PICP of the testing dataset, which is the main evaluation index for interval prediction. It indicates how well the obtained confidence intervals cover the target values. Given a confidence level, if the PICP is greater than or equal to (1 - α), then the constructed interval is valid. Otherwise, the constructed interval is invalid. PINAW is the normalized average width of the prediction intervals on the testing dataset. Reducing the interval width lowers the probability of achieving the expected target coverage, whereas increasing coverage requires widening the interval, so PICP and PINAW are essentially contradictory [48]. ACE represents the difference between the coverage and confidence of a prediction interval. MPI represents the average width of an obtained interval [6]. Similarly, the quality of an interval can also be assessed by its Winkler score. A high-quality interval has a smaller absolute Winkler score for an assigned nominal confidence level [66]. AWD is the accumulated width deviation on the testing dataset, which measures the relative degree to which the targets deviate from the constructed intervals. It is obtained as the cumulative sum of AWDi [67]. Table 3 provides specific descriptions of these formulas.

In these formulas, Ui and Li represent the upper limit and lower limit of the forecasting interval, respectively. ci counts the true values contained in the i-th constructed interval (1 if the target falls inside the interval and 0 otherwise). N represents the number of samples in the testing set. ymax and ymin are the maximum and minimum values of the targets in the whole prediction process.

Besides, Si is calculated as

AWDi is the width deviation of the constructed interval for each sample, calculated as
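The six interval indexes can be sketched in Python as follows. Table 3 and the expressions for Si and AWDi are not reproduced in this text, so the code relies on forms that are common in the interval forecasting literature and should be read as assumptions: the Winkler term is taken as -2αδi inside the interval, with an extra penalty of -4 times the miss distance outside it, and AWDi is taken as the miss distance divided by the interval width δi = Ui - Li. The function name `interval_metrics` is hypothetical.

```python
def interval_metrics(actual, lower, upper, confidence=0.90):
    """PICP, PINAW, ACE, MPI, WS and AWD for a set of prediction intervals."""
    n = len(actual)
    alpha = 1.0 - confidence
    target_range = max(actual) - min(actual)          # ymax - ymin
    widths = [u - l for l, u in zip(lower, upper)]

    # c_i = 1 when the target lies inside [L_i, U_i], else 0.
    covered = [1 if l <= y <= u else 0
               for y, l, u in zip(actual, lower, upper)]

    picp = 100.0 * sum(covered) / n                   # coverage in percent
    mpi = sum(widths) / n                             # average interval width
    pinaw = mpi / target_range                        # width normalised by range
    ace = picp - 100.0 * confidence                   # coverage minus nominal level

    winkler_total = 0.0
    awd_total = 0.0
    for y, l, u, w in zip(actual, lower, upper, widths):
        # Distance by which the target misses the interval (0 when covered).
        miss = max(l - y, y - u, 0.0)
        winkler_total += -2.0 * alpha * w - 4.0 * miss   # assumed form of S_i
        awd_total += miss / w                            # assumed form of AWD_i

    return {
        "PICP": picp,
        "PINAW": pinaw,
        "ACE": ace,
        "MPI": mpi,
        "WS": winkler_total / n,
        "AWD": awd_total / n,
    }
```

Under these definitions, WS is never positive, which is consistent with the text comparing models by the absolute value of the Winkler score. A negative ACE means that the empirical coverage falls short of the nominal confidence level.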

3. Experiments and Analysis

This section discusses the application of the double prediction model and several comparative models. The comparisons are divided into three experimental demonstrations. The operating environment of the experiments was a PC with a 2.40 GHz CPU, 4.00 GB of RAM, the Windows 7 operating system, and the MATLAB R2016a platform. To account for random factors and guarantee the reliability of the final results, 20 trials were conducted for each experiment and the average values were recorded.

3.1. Dataset Description

The wind speed data for three sites at Penglai in Shandong Province were chosen as experimental datasets on which to test the performance of the established double prediction system. Basic information regarding this wind speed data is provided in Table 4. The descriptive statistical analysis uses four statistical indicators, namely, the maximum, minimum, and average values, as well as the standard deviation (Std.). The basic information and original data for the selected sites are presented in Figure 2.

To evaluate the prediction performance of the models, 10 min wind speed data from the Penglai wind farm from January 1, 2011, to January 31, 2011, were selected as experimental data. This wind farm consists of three different sites. Each dataset contained 4464 data points (31 days at 144 points per day), which were divided into a training set train1 and a testing set test1. The training set train1 contained 3964 data points and the testing set test1 contained 500 data points. For nonlinear aggregation, the testing set test1 was subdivided into a training set train2 and a testing set test2 containing 356 and 144 points, respectively. For both the training and testing sets, we used a rolling forecasting mechanism to predict wind speed and produce one-step and two-step prediction results. The data structure details of the double prediction model are presented in Figure 2.
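The data partition and rolling mechanism described above can be sketched in Python as follows. This is an illustration under stated assumptions: the split is assumed to be chronological, the window length `n_lags` is a hypothetical parameter (the number of lagged inputs is not given in this section), and the two-step target is assumed to be predicted directly from the same input window.

```python
def split_dataset(series, n_test1=500, n_test2=144):
    """Split one site's series into train1/test1, then test1 into train2/test2."""
    train1 = series[:-n_test1]
    test1 = series[-n_test1:]
    train2 = test1[:-n_test2]
    test2 = test1[-n_test2:]
    return train1, test1, train2, test2


def rolling_windows(series, n_lags, horizon):
    """Build (inputs, target) pairs for horizon-step-ahead rolling forecasting.

    Each input window holds n_lags consecutive values; the target is the value
    `horizon` steps after the end of the window.
    """
    pairs = []
    for start in range(len(series) - n_lags - horizon + 1):
        inputs = series[start:start + n_lags]
        target = series[start + n_lags + horizon - 1]
        pairs.append((inputs, target))
    return pairs
```

With a series of 4464 points, `split_dataset` returns sets of 3964, 500, 356, and 144 points, matching the sizes stated above. Calling `rolling_windows` with `horizon=1` and `horizon=2` yields the samples for one-step and two-step prediction, respectively.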

Wind speed of Penglai, Shandong province (37.48N, 120.45E), from January 7 to January 17, 2011.

3.2. Diebold-Mariano Test

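To determine whether the designed hybrid model provides better forecasting results than the comparative models, we adopted an effective verification method called the DM test, which was proposed by Diebold and Mariano [46]. The theory behind the DM test is summarized below.

Considering a significance level α, the null hypothesis H0 states that the predictive effectiveness levels of the proposed model and a comparative model are not significantly different. The alternative hypothesis H1 states the opposite. The relevant formulas are defined as follows:

$$H_0: E\left[L\left(\varepsilon_t^{(1)}\right)\right] = E\left[L\left(\varepsilon_t^{(2)}\right)\right], \qquad H_1: E\left[L\left(\varepsilon_t^{(1)}\right)\right] \neq E\left[L\left(\varepsilon_t^{(2)}\right)\right],$$

where L represents the loss function for prediction error and $\varepsilon_t^{(1)}$ and $\varepsilon_t^{(2)}$ are the error sequences predicted by the selected models.

Additionally, the statistic of the DM test can be defined as follows:

$$DM = \frac{\frac{1}{N}\sum_{t=1}^{N}\left(L\left(\varepsilon_t^{(1)}\right) - L\left(\varepsilon_t^{(2)}\right)\right)}{\sqrt{S^2/N}},$$

where S2 is the estimate of the variance of $d_t = L(\varepsilon_t^{(1)}) - L(\varepsilon_t^{(2)})$ and N is the length of the error sequences. Assuming a certain significance level α, the obtained DM value is compared to $Z_{\alpha/2}$. Once the DM statistic falls outside the interval $[-Z_{\alpha/2}, Z_{\alpha/2}]$, H0 can be rejected. This indicates that the predictive performances of the target model and a comparative model are significantly different, meaning H1 will be accepted.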
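A minimal Python sketch of this test is given below. It assumes a squared-error loss and estimates S2 with the plain sample variance of the loss differentials, which ignores the autocorrelation correction that is often applied to multistep forecasts. The function name `dm_test` and both of these choices are assumptions for illustration, not the authors' code.

```python
import math
from statistics import NormalDist, mean, variance


def dm_test(errors_a, errors_b, alpha=0.05, loss=lambda e: e * e):
    """Two-sided Diebold-Mariano test on two forecast error sequences.

    Returns the DM statistic, the critical value Z_{alpha/2}, and whether
    the null hypothesis of equal predictive accuracy is rejected.
    """
    # Loss differential d_t = L(e_a) - L(e_b) for each time step.
    d = [loss(a) - loss(b) for a, b in zip(errors_a, errors_b)]
    n = len(d)

    # Assumed variance estimate: sample variance of d_t (no HAC correction).
    dm = mean(d) / math.sqrt(variance(d) / n)

    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return dm, critical, abs(dm) > critical
```

With `alpha=0.05` the critical value is about 1.96, and with `alpha=0.1` it is about 1.645. A positive DM statistic in this sketch means that the first error sequence carries the larger average loss.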

3.3. Results and Analysis of Point Prediction

To verify the applicability of the proposed point prediction module, two experiments are presented in this section, denoted as experiment I and experiment II. The main purpose of experiment I was to demonstrate the superiority of the nonlinear combination model in the point prediction module over single models, thereby supporting the validity of hybrid modeling. Additionally, the results of experiment I demonstrate the necessity of data preprocessing. Similarly, to demonstrate the suitability and superior ability of the VMD technology adopted in our system, it was compared to other common data preprocessing methods in experiment II. Detailed analysis of each experiment is provided below.

3.3.1. Experiment I: Comparison to Individual Models

In this experiment, all experimental datasets were considered to assess the effectiveness of the point prediction module based on three comparisons. In the first comparison, the proposed model was compared to three models using preprocessed data, namely, VMD-ARIMA, VMD-GRNN, and VMD-ELM, to analyze the advantages of the combination model and nonlinear combination method. In the second comparison, the three VMD-based models were compared to ARIMA, GRNN, and ELM, respectively. In the third comparison, the effectiveness of the designed prediction model was evaluated further by using the traditional Wavenn model and BP as comparative methods. The predicted results are presented in Table 5 and Figure 3, and the comparison results are summarized below.

(1) Regarding the first comparison, the hybrid nonlinear model yielded the best results for one-step and two-step wind speed prediction on all three datasets according to the error indexes. For example, for one-step prediction, the MAPE value of the developed model is approximately 2% to 3%, while even the best of the VMD-based models is more than 1 percentage point less accurate than the developed model. Two-step prediction yields similar results.

(2) Regarding the second comparison, when comparing VMD-ARIMA, VMD-GRNN, and VMD-ELM to ARIMA, GRNN, and ELM, respectively, without data preprocessing, one can see that data preprocessing is very important for enhancing wind speed prediction. For site 1, the MAPE values of ARIMA and ELM are 4.2190% and 6.7442% higher than those of VMD-ARIMA and VMD-ELM, respectively, for one-step prediction and 4.2927% and 6.4793% higher, respectively, for two-step prediction. The accuracy of VMD-GRNN is also slightly improved. For sites 2 and 3, the results are very similar.

(3) Regarding the third comparison, based on the four indexes of MAPE, MAE, MSE, and DC, one can see that the developed model is more accurate than other individual models, such as Wavenn and the BP neural network.

Additionally, the individual models with the highest prediction accuracy are ARIMA, BP, and GRNN. Therefore, we selected BP as the model for nonlinear combination. Because ARIMA is a linear model, it can capture the linear characteristics of wind speed data, so it is natural to include ARIMA in the proposed model. The other three models, namely, ARIMA, GRNN, and ELM, are submodels of the combined model.

3.3.2. Experiment II: Testing Data Preprocessing Methods

This experiment aimed to compare the effectiveness of the VMD selected in this study to that of other common data preprocessing technologies, such as EMD, EEMD, CEEMD, and SSA. Therefore, the point prediction models based on different data preprocessing methods are the EMD-based model, EEMD-based model, CEEMD-based model, and SSA-based model. These models only use different decomposition methods during the data preprocessing stage. In this experiment, we tested whether or not the proposed prediction model is reasonable and identified the best method for removing noise to improve prediction effectiveness.

The results obtained by models using different data preprocessing methods are listed in Table 6. Figure 4 presents a clearer and more intuitive comparison. In Table 6 and Figure 4, one can see that the model based on VMD technology has superior performance compared to the other decomposition-based prediction models. The MAPE value of the VMD-based proposed model is 0.3 to 4 percentage points lower than those of the EMD-based model, EEMD-based model, CEEMD-based model, and SSA-based model. Of all the benchmark models, the SSA-based model performs the worst. Compared to the other models, the MAE, MSE, and DC values for one-step and two-step prediction by the proposed model are improved to some extent, which demonstrates the superiority of the data preprocessing method adopted in our hybrid model.

(1) Remark Regarding the Point Prediction Module. Experiments I and II focused on proving the advantages of the proposed point prediction module and verifying it from the perspective of single prediction models, combination models, and data preprocessing. The results show that, in both cases, the proposed point prediction model is superior to all the comparative models. This proves that the combination of data preprocessing technology, optimization algorithms, and nonlinear combined methods can successfully resolve the issues of wind energy prediction based on the selection of appropriate prediction methods. Based on the superior effectiveness of the designed point prediction model, it has very promising application potential.

In particular, EMD, EEMD, and CEEMD are a series of processes of the same principle; the changing process of can be summarized as follows:

The signal formula of EMD is

The signal formula of EEMD is

The signal formula of CEEMD iswhere is the original signal, is the noise sequence, and and are positive noise and negative noise sequence. On the basis of EMD, noise sequence is added to form EEMD. CEEMD further decomposes the noise sequence into positive noise sequence and negative noise sequence.

3.4. Results and Analysis of Interval Prediction (Experiment III)

Based on wind speed point prediction results, probability interval prediction can derive additional wind speed information. In this study, we developed a method based on fuzzy clustering which performs interval prediction based on point prediction results. Three datasets were considered in this experiment. To verify the effectiveness of the designed interval prediction module, we used all of the comparative models for point prediction and performed multistep prediction to verify the interval prediction results. The results of the proposed interval prediction model and other models are listed in Table 7. Owing to space limitations, Table 7 only lists the results for site 3. We set the confidence level to 90% to assess the effectiveness of the interval prediction model. From Table 7, one can draw the following conclusions.

The best values for all indexes among all models are obtained by the proposed prediction model. The coverage probability of the prediction interval is 96.5278% in one-step PICP and 90.1944% in two-step PICP. The average width of the interval is 1.3399 for one-step prediction and 1.2158 for two-step prediction according to MPI. In terms of the absolute value of wind speed, the interval width is relatively accurate. The AWD is 0.0066 for one-step prediction and 0.0264 for two-step prediction, indicating that the deviation degree of the constructed interval is small. All indexes indicate that the predicted interval is qualified. In contrast, for the PICPs of individual prediction models, none of the one-step predictions reach more than 90% and all the two-step predictions are below 80%. By combining PINAW with PICP, for the proposed model, when the PICP value is very high, PINAW is relatively small, which demonstrates the superiority of the proposed model. For one-step and two-step prediction, the AWD values of most other benchmark models are more than ten times that of the proposed model. ACE is the difference between the coverage and confidence of the prediction interval. The ACE values of all models except for the proposed model are negative, indicating that the coverage of the developed model is much better than that of the other models. The absolute value of the WS index of the proposed model is the smallest, indicating its reliability. These six indexes fully reflect the superior prediction performance of the proposed model.

To present the comparison results intuitively, the results of the designed module and comparative methods are visualized in Figure 5. These conclusions are consistent with the results in Table 7, providing intuitive evidence that verifies the superior abilities of the proposed system for wind speed interval prediction. As shown in Figure 5, compared to the other methods, the proposed model yields superior interval prediction results. The prediction range not only covers most of the wind speed values but also is the smoothest range among all models. This demonstrates that the proposed model is more stable than the other models. Therefore, our model is more advantageous for the three experimental datasets.

(1) Remark Regarding the Interval Prediction Module. Similar to the comparison model used for point forecasting, 12 different models based on three datasets and multistep forecasting were compared. The results demonstrated that the designed interval model is superior to all the comparative models. Based on the excellent results of the designed interval prediction module using fuzzy clustering, it is a very promising interval prediction method for wind speed.

The comparison models used here are the same as those used in experiments I and II. We reuse them to compare interval prediction performance and thereby demonstrate the interval prediction performance of the developed model. In particular, the WS value in the table is bracketed to indicate its absolute value.

4. Discussion

To discuss our experimental conclusions in detail, this section examines the validity of the established model, the combination mechanism of the combined model, the performance of the optimization algorithm, and the practical application of the model to wind power systems.

4.1. DM Test

First, the validity of the proposed model was verified via DM testing in which all of the other models were compared to the proposed double prediction model. Based on the DM testing theory, the null hypothesis is that the forecasting results of two models contain no significant differences. The alternative hypothesis is the opposite of the null hypothesis. We chose two significance levels, α = 0.1 and α = 0.05, as the criteria for judging the significance of results, with critical values $Z_{0.1/2} = 1.645$ and $Z_{0.05/2} = 1.96$, respectively. Table 8 lists the DM statistics and averages for the three test sites.

Table 8 reveals that most of the DM test values calculated for the developed model and comparative models are greater than the upper limit at a 5% significance level. However, for the VMD-ARIMA-, VMD-ELM-, and EEMD-based models, the results do not reveal significant differences compared to the proposed model at the 5% level. For these models, the null hypothesis can still be rejected at a 10% significance level. For example, the DM test statistic for the VMD-ELM model at site 1 is 1.7554, which lies below the 5% critical value of 1.96 but above the 10% critical value of 1.645. This model is therefore not significantly different from the developed model at a 5% significance level but is significantly different at a 10% significance level. At a 10% significance level, all distinctions between the designed model and benchmark models are significant. Therefore, it can be concluded that the designed hybrid double prediction model is preferable to the other models.

4.2. Combination Mechanism of the Combined Model

To verify the effectiveness of the designed nonlinear combination mechanism (MOEA/D-BP), a simple averaging strategy and a linear combination mechanism were selected as comparative methods in this study. The simple averaging strategy computes the mean value of the prediction results of each model, while the linear combination mechanism uses the MOEA/D as a weight determination method to derive the final prediction results. Comparative results for the developed model and the other two methods are listed in Table 9.

The effects of each combination mechanism are compared based on the four point prediction indexes and six interval prediction indexes. One can see that the prediction effectiveness of the nonlinear combination model is greater than that of the simple averaging strategy and linear combination mechanism, regardless of the location and prediction steps. The linear combination mechanism is often more effective than the simple averaging strategy. In other words, the simple averaging strategy generally performs the worst. Therefore, the developed MOEA/D-BP mechanism successfully improves forecasting effectiveness for wind speed.

The simple averaging method computes the final predicted value as the arithmetic mean of the individual predictions:

$$\hat{y} = \frac{1}{n}\sum_{i=1}^{n} p_i,$$

where pi is the prediction result of the i-th model and n is the number of models. The linear combination is a weighted combination of the results of the three single models, from which a final prediction value is obtained. The weights are determined by the multiobjective optimization algorithm rather than being fixed in advance.
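The three combination mechanisms compared in this section can be contrasted with a short Python sketch. The weights and the network parameters are hypothetical placeholders: in the paper they are obtained by MOEA/D, and the actual BP architecture is given in Table 10, which is not reproduced here. The single hidden layer with a tanh activation is likewise an assumption made for illustration.

```python
import math


def simple_average(preds):
    """Arithmetic mean of the submodel predictions."""
    return sum(preds) / len(preds)


def linear_combination(preds, weights):
    """Weighted sum of the submodel predictions (weights would come from MOEA/D)."""
    return sum(w * p for w, p in zip(weights, preds))


def bp_combination(preds, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a small one-hidden-layer network over the predictions.

    w_hidden holds one weight list per hidden node. All parameters are
    hypothetical; in the paper they are optimised rather than hand-set.
    """
    hidden = [
        math.tanh(sum(w * p for w, p in zip(node_weights, preds)) + bias)
        for node_weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

For instance, with submodel predictions `[6.0, 6.6, 6.3]`, `simple_average` returns 6.3, while `linear_combination` with the illustrative weights `[0.5, 0.3, 0.2]` returns 6.24. The network version can represent both of these as special cases and can also model interactions between the submodel outputs, which is the motivation for a nonlinear mechanism.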

4.3. Performance Testing of Optimization Algorithms

This section first introduces the parameter settings for the BP network and MOEA/D and then presents convergence testing results for metaheuristic algorithms.

4.3.1. Parameter Settings

An artificial intelligence algorithm called BP was used to combine wind speed results. In a BP neural network, the weights and thresholds of input, hidden, and output layers play crucial roles in terms of network performance. To determine the appropriate connection weights and node thresholds efficiently, we adopted the MOEA/D for parameter optimization. The parameters for BP and the MOEA/D are listed in Tables 10 and 11, respectively.

4.3.2. Convergence Testing of Optimization Algorithms

To analyze the performance of the MOEA/D, different population size numbers were selected to test its abilities using four test functions. Three multiobjective optimization algorithms, namely, MOGWO, MOALO, and MODA, were used as comparative models. Table 12 contains the details of the four test functions. By comparing different optimization methods, it was proven that the prediction ability of the MOEA/D is superior to that of other multiobjective algorithms. Twenty experiments were conducted for each case and average values were obtained. The calculation results for each index are listed in Table 13.

We selected two performance indexes as the criteria for evaluating the optimization algorithms, namely, the IGD index and SP index. Additionally, the running times of different algorithms were compared. IGD is an indicator of the convergence of an algorithm and it can be used to judge the robustness and stability of algorithms. The smaller the IGD value, the better the performance of an algorithm. SP is typically used to evaluate the distribution of solutions in a Pareto set. If SP is equal to zero, then all nondominated solutions are equidistant.

The final simulated results are listed in Table 13. For all of the algorithms, as the population size increases from 100 to 150, 200, and 300, convergence is enhanced. The MOEA/D yields the best performance for ZDT1, ZDT2, ZDT3, and ZDT6. The IGD of the MOEA/D is far less than that of the other algorithms, indicating that the MOEA/D provides the best convergence performance. MODA is the second-best algorithm. The convergence effect of the MOGWO algorithm is much weaker than that of the other algorithms. For SP, the MOEA/D yields the most uniform distribution of solutions. The running time of the MOEA/D is significantly lower than the running times of the other three algorithms, which demonstrates that the MOEA/D is the fastest and most efficient algorithm.

IGD and SP are two important performance evaluation indexes for the solution set of a multiobjective algorithm. They are calculated as follows [60]:

$$IGD(P^{*}, Q) = \frac{\sum_{v \in P^{*}} d(v, Q)}{\left|P^{*}\right|},$$

where $P^{*}$ is the set of points uniformly distributed on the real Pareto surface and $|P^{*}|$ is the number of individuals in this set. Q is the optimal Pareto solution set obtained by the algorithm. $d(v, Q)$ is the minimum Euclidean distance between individual v in $P^{*}$ and population Q. Therefore, IGD evaluates the comprehensive performance of the algorithm by calculating the average value of the minimum distance from the point set on the real Pareto surface to the obtained population.

$$SP = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\bar{d} - d_i\right)^{2}},$$

where di represents the minimum distance from the i-th solution to the other solutions in the solution set and $\bar{d}$ represents the mean value of all di. n is the number of individuals in the solution set. SP measures the standard deviation of the minimum distance from each solution to the other solutions. The smaller the SP value, the more uniform the solution set.
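Both indexes are straightforward to compute, as the following Python sketch shows. It assumes that solutions are given as tuples of objective values, that Euclidean distance is used for both indexes, and that SP uses the n - 1 normalization of a sample standard deviation. The function names are hypothetical.

```python
import math


def igd(reference_front, obtained):
    """Mean distance from each reference Pareto point to its nearest obtained solution."""
    total = sum(
        min(math.dist(v, q) for q in obtained)
        for v in reference_front
    )
    return total / len(reference_front)


def spacing(solutions):
    """Sample standard deviation of nearest-neighbour distances within a solution set."""
    # d_i: distance from solution i to its closest other solution.
    d = [
        min(math.dist(s, t) for j, t in enumerate(solutions) if j != i)
        for i, s in enumerate(solutions)
    ]
    d_bar = sum(d) / len(d)
    return math.sqrt(sum((d_bar - di) ** 2 for di in d) / (len(d) - 1))
```

An obtained set that coincides with the reference front gives an IGD of zero, and a set of evenly spaced solutions such as `[(0, 1), (0.5, 0.5), (1, 0)]` gives an SP of zero, in line with the interpretation given above.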

4.4. Practical Application to a Power System

Wind power forecasting systems are of great importance for large-capacity wind power systems. Effective wind speed forecasting can be helpful in many areas, such as timely maintenance scheduling and electrical grid safety management. The contributions of an accurate wind speed forecasting model to a power system can be summarized as follows [50]:

(1) To guarantee the best wind energy output quality, it is very important to assess the quantity of wind power. Wind power output depends directly on wind speed, so the evaluation of wind power can be accomplished based on wind speed prediction. Therefore, precise wind speed forecasting can enhance decision-making for wind farms and is conducive to smart grid planning.

(2) Accurate wind speed forecasting can provide essential guidance for the dispatching and control of wind turbines. Based on predicted wind speeds, administrators can control wind turbines promptly to ensure the best wind energy output quality. If the wind speed exceeds the capacity of a turbine, the turbine should be shut down to avoid damage and reduce operating costs.

(3) Wind speed prediction effectiveness has an important impact on electrical grid dispatching and supervision. Wind power output fluctuates significantly and intermittently, which makes power system operation very challenging. Therefore, accurate prediction models can assist decision-makers in making timely decisions to avoid the problems discussed above.

5. Conclusions

Owing to the depletion of traditional energy sources, wind energy is considered to be a promising alternative energy source because of its sustainability and cleanliness. However, because of its inherent intermittence and randomness, the extraction of wind energy is very limited, which can endanger the dispatching and management of wind power systems.

To analyze the uncertainty characteristics of wind speed more comprehensively, a double prediction system was successfully developed in this study. The proposed system compensates for the shortcomings of previous methods, which considered point prediction and interval prediction only separately. The proposed system consists of two main parts: a point prediction module based on nonlinear combination and an interval prediction module based on fuzzy clustering. It is of great significance to explore the predictability and modeling of wind speed comprehensively. Unlike previous works, we implemented a BP neural network using MOEA/D optimization as a novel nonlinear combination mechanism to derive final prediction results, which enhances the accuracy of point prediction. For interval prediction, wind speed data was divided into different categories based on fuzzy clustering and different intervals were constructed according to the prediction data in different categories. This method of constructing different intervals according to different data characteristics was shown in our experiments to be an effective interval prediction method. Finally, a large number of experiments were conducted using quantitative indexes, which demonstrated the effectiveness and superiority of the proposed system. Additionally, because the proposed system provides reliable performance, it can also be applied to load prediction, wind power forecasting, economic forecasting, and other fields.

Appendix

The model proposed in this paper is based on wind speed data from the Penglai wind farm which was collected in January of the year 2011. This dataset was randomly selected to train and test the proposed model. To explore the impact of different seasons on the proposed model, the authors also selected data from three other months in different seasons (April, July, and October) for comparison with the data from January. The results are listed in Table 14. One can see that there are no significant differences between the prediction accuracies of the models constructed from data from different months for one-step point prediction, two-step point prediction, or interval prediction. This demonstrates that the construction of the proposed model is not affected by seasonal changes. Similarly, data from different years can also be used to construct models for wind speed prediction. The proposed model can be used to predict general wind speeds to study time trends and seasonal characteristics.

Nomenclature

ARIMA: Autoregressive integrated moving average
GRNN: Generalized regression neural network
ELM: Extreme learning machine neural network
Wavenn: Wavelet neural network
BP: Backpropagation neural network
EMD: Empirical mode decomposition
EEMD: Ensemble empirical mode decomposition
CEEMD: Complementary ensemble empirical mode decomposition
SSA: Singular spectrum analysis
VMD: Variational mode decomposition
IMFs: Intrinsic mode functions
ZDT: Test functions for multiobjective algorithm
AR: Autoregressive model
ARIMA-ARCH: Autoregressive conditional heteroscedasticity model
MOP: Multiobjective optimization problem
MAE: Mean absolute error
MAPE: Mean absolute percentage error
MSE: Mean squared error
DC: Directional change
PICP: Prediction interval coverage probability
PINAW: Prediction interval normalized average width
ACE: Average coverage error
MPI: Average width of the constructed PIs
WS: Winkler score
AWD: Accumulated width deviation
MOGWO: Multiobjective grey wolf optimization
MOALO: Multiobjective ant lion optimizer
MODA: Multiobjective Dragonfly algorithm
MOEA/D: Multiobjective evolutionary algorithm based on decomposition
IGD: Inverted generational distance
SP: Spread performance.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request. The 10-minute wind speed data were collected at Penglai, Shandong Province, which has one of the largest wind farms in China, and comprise three datasets. The data are true and reliable. The authors will provide the data if necessary to support verification of the experiments.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.