#### Abstract

Accurate wind speed forecasting is an effective way to improve the safety and stability of the power grid. This paper proposes a novel hybrid model based on twice decomposition, phase space reconstruction (PSR), and an improved multiverse optimizer-extreme learning machine (IMVO-ELM) to enhance the performance of short-term wind speed forecasting. First, to account for the nonstationarity of the wind speed signal, a twice decomposition based on improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN), fuzzy entropy, and variational mode decomposition (VMD) is proposed to reduce the nonstationarity of the original signal. Then, PSR based on the C-C method is employed to reconstruct the decomposed signal as the input of the prediction model. Lastly, an improved multiverse optimizer is proposed to improve the stability and efficiency of the ELM, which is used as the prediction model. Two experiments are designed to verify the performance of the proposed method. The results indicate that (1) forecasting with twice decomposition of the original wind speed signal is better than forecasting with the once-decomposition methods and much better than forecasting without decomposition; (2) the C-C-PSR method can determine the input dimension of the ELM and improve its prediction accuracy; (3) the IMVO improves the stability of the ELM, and its optimization efficiency is better than that of the compared optimization methods. The results show that the proposed hybrid approach is a useful tool for short-term wind speed forecasting.

#### 1. Introduction

With the depletion of fossil energy and increasingly strict environmental protection requirements, energy supply has become an important problem. Developing clean energy is an effective way to address it. Wind energy, as a cheap, renewable, and pollution-free energy source, has been vigorously developed by many countries, and installed wind turbine capacity is increasing rapidly [1]. According to statistics, wind turbine capacity increased from 487 GW in 2016 to 702 GW in 2020 [2].

Wind speed is random, intermittent, and fluctuating, which makes the output power of wind turbines unstable. With large-scale wind power connected to the grid, this unstable output poses a great challenge to the power grid [3]. Accurate wind speed forecasting is an effective tool to improve the safety and stability of the power grid [4]. Many wind speed forecasting methods have been proposed in the past few decades. They can be classified into two categories [5]: physical-driven methods and data-driven methods. Physical-driven methods are usually built on topography, temperature, density, air pressure, and altitude, and numerical weather prediction (NWP) is employed for forecasting [6, 7]. Because of the low resolution of NWP, physical-driven methods usually cannot meet the demand of short-term wind speed forecasting [8].

Data-driven methods need only historical data for forecasting, which makes them more suitable for short-term wind speed forecasting. They can be divided into two categories: statistical algorithms and artificial intelligence algorithms. The statistical algorithms employed for wind speed forecasting mainly include the autoregressive moving average (ARMA) model and the autoregressive integrated moving average (ARIMA) model [9, 10]. The ARMA model is a linear model that is not well suited to nonstationary signals [11]. The ARIMA model can convert nonstationary signals into stationary time series, which improves the prediction accuracy of wind speed [12]. With the development of computer science, artificial intelligence algorithms have been widely employed in wind speed forecasting, such as the support vector machine (SVM) [13, 14], backpropagation (BP) neural network [15], Elman neural network [16, 17], and extreme learning machine (ELM) [18, 19]. Among these algorithms, the ELM has the fastest calculation speed and stronger generalization ability [20], which means it is more suitable for short-term forecasting.

Because wind speed is nonstationary, data preprocessing can extract more useful features from the original wind speed signal and thereby improve the prediction accuracy [21, 22]. Data preprocessing methods have been widely used to reduce the nonstationarity of the wind speed signal, such as wavelet transform (WT) [12], empirical mode decomposition (EMD) [23], ensemble empirical mode decomposition (EEMD) [24], complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) [25], improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) [26], and variational mode decomposition (VMD) [27]. ICEEMDAN addresses the mode mixing problem and greatly reduces the residual components in the intrinsic mode functions (IMFs) [28, 29]. VMD can also address the mode mixing problem by decomposing the signal into band-limited subseries [30, 31].

In this paper, a novel hybrid model for short-term wind speed forecasting based on twice decomposition, phase space reconstruction (PSR), and an improved multiverse optimizer-extreme learning machine (IMVO-ELM) is proposed. The proposed method consists of a data processing module, a prediction module, and a module that combines the final results. A twice-decomposition method based on ICEEMDAN, fuzzy entropy, and VMD is proposed as the data processing module. A prediction model based on C-C-PSR and IMVO-ELM is proposed as the prediction module. The main contributions of this paper are as follows:

(1) A twice decomposition based on ICEEMDAN, fuzzy entropy, and VMD is proposed for the wind speed signal to improve the prediction accuracy. ICEEMDAN is first applied to the original wind speed signal. Fuzzy entropy is then used to estimate the complexity of each IMF. Because some of the high-frequency IMFs are still too complex for prediction, VMD is employed to decompose these IMFs further.

(2) PSR based on the C-C method is used to construct the input signal of the prediction model to improve the prediction accuracy.

(3) An improved multiverse optimizer is proposed to optimize the weight coefficients from the input layer to the hidden layer and the biases of the hidden layer of the ELM. The IMVO-ELM improves the stability and efficiency of the ELM.

The rest of this paper is organized as follows. The theoretical background related to the proposed method is described in Section 2. In Section 3, the proposed hybrid model and the methodology are described in detail. Experiments are conducted and the results are analyzed in Section 4. Conclusions are given in Section 5.

#### 2. Theoretical Background

The theoretical background related to the proposed method is briefly reviewed in this section, including ICEEMDAN, VMD, fuzzy entropy, PSR based on the C-C method, and ELM.

##### 2.1. ICEEMDAN

ICEEMDAN was proposed by Colominas et al. as an improvement of CEEMDAN, which is itself recognized as an important improvement of EEMD [32]. Instead of adding white noise directly, ICEEMDAN adds the EMD modes of white noise to the original signal, which greatly reduces the residual noise in the IMFs. The detailed steps of ICEEMDAN are as follows:

Step 1: The modes of white noise obtained by EMD are added to the original signal *f* to form *I* noisy realizations. Here *β*_{0} is the SNR, *E*_{k}[·] represents the *k*-th subseries decomposed by EMD, *w*^{(*i*)} denotes the *i*-th white noise added to the original signal, and *I* is the total number of white noises.

Step 2: The first-order residual *r*_{1} and the first IMF *c*_{1} are calculated, where *M*[·] represents the calculation of the local mean value.

Step 3: The remaining residuals and IMFs are calculated recursively, where *r*_{k} represents the *k*-th order residual and *c*_{k} is the *k*-th IMF.
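For reference, the three ICEEMDAN steps can be written out explicitly. The block below is a sketch in the notation of this section that follows the standard ICEEMDAN formulation of Colominas et al. [32]. The symbol *f*^{(*i*)} for the *i*-th noisy realization, the coefficients *β*_{k−1} for *k* ≥ 2, and the averaging over the *I* realizations are assumptions taken from that formulation, not from this text.

```latex
\begin{align*}
f^{(i)} &= f + \beta_0\, E_1\!\left[w^{(i)}\right], \qquad i = 1, \dots, I \\
r_1 &= \frac{1}{I} \sum_{i=1}^{I} M\!\left[f^{(i)}\right] \\
c_1 &= f - r_1 \\
r_k &= \frac{1}{I} \sum_{i=1}^{I} M\!\left[r_{k-1} + \beta_{k-1}\, E_k\!\left[w^{(i)}\right]\right],
\qquad c_k = r_{k-1} - r_k, \qquad k \ge 2
\end{align*}
```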

##### 2.2. VMD

VMD is an adaptive decomposition algorithm that decomposes a signal into IMFs with limited bandwidth [30]. The detailed steps of VMD are as follows:

Step 1: VMD is formulated as a constrained variational problem, equation (5), in which *f* is the original signal, *K* is the number of IMFs of the original signal, *u*_{k} is the *k*-th IMF of *f*, and *ω*_{k} represents the center frequency of *u*_{k}. Because equation (5) cannot be solved directly, it is converted into the augmented Lagrangian function of equation (6), where *η* represents the Lagrange multiplier and *α* represents the penalty factor.

Step 2: The modes *u*_{k}, the center frequencies *ω*_{k}, and the multiplier *η* are updated alternately to search for the saddle point of equation (6). In the update equations, *u*_{k}(*ω*), *f*_{k}(*ω*), and *η*_{k}(*ω*) are the frequency-domain representations of *u*_{k}(*t*), *f*_{k}(*t*), and *η*_{k}(*t*), and *τ* represents the updating step.

In the process, the center frequency and bandwidth of each mode are constantly updated, and several IMFs with narrow bandwidths are obtained finally.
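For reference, the variational problem, its augmented Lagrangian, and the update rules can be written as follows. This is a sketch of the standard VMD formulation in the symbols of this section, not a copy of the authors' equations. The hats for frequency-domain quantities, the iteration index *n*, the Dirac impulse *δ*(*t*), and the use of the most recent estimates of the other modes in the first update are assumptions.

```latex
\begin{align*}
&\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K}
  \left\| \partial_t\!\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2
  \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \\[4pt]
&\mathcal{L}\left(\{u_k\},\{\omega_k\},\eta\right)
  = \alpha \sum_{k=1}^{K}
  \left\| \partial_t\!\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2
  + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
  + \left\langle \eta(t),\; f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \\[4pt]
&\hat{u}_k^{\,n+1}(\omega)
  = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\eta}^{\,n}(\omega)/2}
         {1 + 2\alpha\,(\omega - \omega_k^{\,n})^2} \\[4pt]
&\omega_k^{\,n+1}
  = \frac{\int_0^{\infty} \omega\, \left|\hat{u}_k^{\,n+1}(\omega)\right|^2 d\omega}
         {\int_0^{\infty} \left|\hat{u}_k^{\,n+1}(\omega)\right|^2 d\omega} \\[4pt]
&\hat{\eta}^{\,n+1}(\omega)
  = \hat{\eta}^{\,n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{\,n+1}(\omega) \right)
\end{align*}
```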

##### 2.3. Fuzzy Entropy

Fuzzy entropy is an improved complexity measure based on sample entropy [33]. It replaces the hard threshold of sample entropy with a membership function from fuzzy theory, which makes the similarity evaluation clearer. The detailed steps of fuzzy entropy are as follows:

A time series with *N* samples is denoted as [*u*(1), *u*(2), …, *u*(*N*)]. The phase space *U* is reconstructed from the time series, where *m* represents the dimension of the phase space.

The distance between *U*(*i*) and *U*(*j*) is defined as the maximum absolute difference between their corresponding elements.

The similarity of *U*(*i*) and *U*(*j*) is calculated with an exponential membership function, where *n* and *r* represent the gradient and width of the boundary of the exponential function. Equations (8)–(10) are repeated to obtain the similarity for the phase space with dimension *m* + 1.

The fuzzy entropy is then defined as the difference between the logarithms of the average similarities obtained for dimensions *m* and *m* + 1.
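The fuzzy entropy steps described in this subsection can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The function name `fuzzy_entropy`, the default values `m=2`, `r=0.2`, `n=2`, the scaling of the width by the standard deviation of the series, the removal of each vector's mean, and the membership function exp(−*d*^n / *r*) are all assumptions.

```python
import numpy as np


def fuzzy_entropy(u, m=2, r=0.2, n=2):
    """Fuzzy entropy of a 1-D series (sketch; defaults are assumptions)."""
    u = np.asarray(u, dtype=float)
    N = u.size
    # Assumed: the boundary width is scaled by the standard deviation of the series.
    width = r * u.std()

    def mean_similarity(dim):
        # Use N - m vectors for both dim = m and dim = m + 1 so the averages are comparable.
        vecs = np.array([u[i:i + dim] for i in range(N - m)])
        # Assumed: each vector has its own mean removed before comparison.
        vecs = vecs - vecs.mean(axis=1, keepdims=True)
        # Maximum absolute distance between every pair of vectors.
        dist = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        # Exponential membership function with gradient n and width `width`.
        sim = np.exp(-(dist ** n) / width)
        np.fill_diagonal(sim, 0.0)  # exclude self-matches
        return sim.sum() / ((N - m) * (N - m - 1))

    # Difference of the log average similarities for dimensions m and m + 1.
    return np.log(mean_similarity(m)) - np.log(mean_similarity(m + 1))
```

With this sketch, an irregular series such as white noise yields a larger value than a smooth sine wave, which matches the use of fuzzy entropy as a complexity measure.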

##### 2.4. PSR Based on C-C

PSR is a basic method for chaotic time series analysis [34]. For a time series *x* = {*x*_{i} | *i* = 1, 2, …, *N*}, the PSR model maps each sample to the delay vector *X*_{i} = [*x*_{i}, *x*_{i+τ}, …, *x*_{i+(m−1)τ}], where *m* represents the embedding dimension and *τ* is the delay time.
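A minimal sketch of the delay embedding, assuming NumPy, is given below. The helper names `delay_embed` and `make_supervised` are hypothetical. Pairing each delay vector with the next sample as a one-step-ahead target is an assumed way of turning the reconstructed phase space into training data for the prediction model.

```python
import numpy as np


def delay_embed(x, m, tau):
    """Return the matrix whose i-th row is [x_i, x_{i+tau}, ..., x_{i+(m-1)tau}]."""
    x = np.asarray(x, dtype=float)
    n_vectors = x.size - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this m and tau")
    return np.column_stack([x[k * tau:k * tau + n_vectors] for k in range(m)])


def make_supervised(x, m, tau):
    """Pair each delay vector with the following sample (assumed one-step-ahead setup)."""
    x = np.asarray(x, dtype=float)
    X = delay_embed(x, m, tau)
    targets = x[(m - 1) * tau + 1:]
    # The last delay vector has no following sample, so it is dropped.
    return X[:-1], targets
```

For example, `delay_embed(np.arange(10), m=3, tau=2)` has six rows, the first being `[0, 2, 4]` and the last `[5, 7, 9]`.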

The embedding dimension *m* and the delay time *τ* are usually determined by the C-C method [35]. The detailed steps are as follows:

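The correlation integral of the time series is defined first. It measures the fraction of pairs of reconstructed points whose distance does not exceed the radius *r*.

The statistic *S*_{1}(*m*, *N*, *r*, *t*) is then defined from the correlation integrals, equation (14).

When the number of samples tends to infinity, equation (14) reduces to the statistic *S*_{2}.

Several statistics of *S*_{2} are then calculated, including *SM*_{2}(*t*) and *S*_{2cor}(*t*).

The first zero point of *SM*_{2}(*t*) or the first minimum value gives the best delay time *τ*. The minimum value of *S*_{2cor}(*t*) gives the length of the time series window *τ*_{w} = (*m* − 1)*τ*, from which the embedding dimension *m* follows.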
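The two basic quantities of the C-C method can be sketched as follows. This is an illustration under assumptions, not the authors' code. The function names are hypothetical, the maximum norm is assumed for the distance between reconstructed points, and the statistic is assumed to follow the usual C-C construction, in which the series is split into *t* disjoint subseries and the correlation integral in dimension *m* is compared with the *m*-th power of the correlation integral in dimension 1.

```python
import numpy as np


def correlation_integral(x, m, t, r):
    """Fraction of pairs of delay vectors (dimension m, delay t) within distance r."""
    x = np.asarray(x, dtype=float)
    M = x.size - (m - 1) * t  # number of reconstructed points
    X = np.column_stack([x[k * t:k * t + M] for k in range(m)])
    # Assumed: maximum norm between reconstructed points.
    dist = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=2)
    upper = np.triu_indices(M, k=1)  # each pair counted once
    return np.mean(dist[upper] <= r)


def s_statistic(x, m, t, r):
    """Average of C(m) - C(1)^m over the t disjoint subseries of x."""
    x = np.asarray(x, dtype=float)
    total = 0.0
    for s in range(t):
        sub = x[s::t]  # s-th disjoint subseries
        c_m = correlation_integral(sub, m, 1, r)
        c_1 = correlation_integral(sub, 1, 1, r)
        total += c_m - c_1 ** m
    return total / t
```

In a full C-C search, `s_statistic` would be evaluated over a grid of *m*, *r*, and *t*, and the delay time and window length would be read from the resulting curves as described above.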

##### 2.5. ELM

The extreme learning machine is a feedforward neural network characterized by fast learning speed. An ELM with a single hidden layer [36] computes its output *y*_{j} from the input *X*_{j} as a weighted sum of the hidden-layer activations, where *L* is the number of neurons in the hidden layer, *ω*_{i} represents the weight coefficients from the input layer to the *i*-th hidden neuron, *β*_{i} represents the weight coefficient from the *i*-th hidden neuron to the output layer, *b*_{i} denotes the bias of the *i*-th hidden neuron, and *h*(*x*) is the activation function.

The objective of ELM training is to minimize the output error. If the output error is close to zero, the ELM model can be written in matrix form, where **Y** is formed from the real outputs, and *ω*_{i} and *b*_{i} are randomly selected.

The output weights **β** can then be determined by solving this linear system in the least-squares sense.
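A minimal sketch of ELM training and prediction is given below, assuming NumPy. The function names, the sigmoid activation, the uniform range [−1, 1] for the random input weights and biases, and the use of the Moore–Penrose pseudo-inverse to obtain the output weights are assumptions chosen for illustration.

```python
import numpy as np


def _hidden_output(X, W, b):
    """Hidden-layer output matrix H for inputs X (assumed sigmoid activation)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def elm_fit(X, Y, n_hidden, rng):
    """Draw random input weights and biases, then solve for the output weights."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input-to-hidden weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases
    H = _hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ Y  # least-squares solution of H beta = Y
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Forward pass with fixed W and b and the trained output weights."""
    return _hidden_output(X, W, b) @ beta
```

Only `beta` is learned. Because `W` and `b` stay at their random initial values, two runs with different random draws can give noticeably different results.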

#### 3. The Proposed Hybrid Model

##### 3.1. The Structure of the Proposed Model

The structure of the proposed method is shown in Figure 1. The proposed method is composed of three modules: a data processing module, a prediction module, and a module that combines the final results.

Module 1: Data processing. In this module, the original wind speed data is first decomposed by ICEEMDAN. Then, the fuzzy entropy of each IMF is calculated. The IMFs with higher entropy, which are regarded as the more complex subseries, are decomposed again by VMD. The detailed process of the twice decomposition is presented in Section 3.2. The details of ICEEMDAN and VMD are presented in Sections 2.1 and 2.2, respectively.

Module 2: Prediction. The IMFs obtained in module 1 are used for prediction. First, the C-C and PSR methods are used to reconstruct the input of the prediction model, which extracts more useful information. The dimension of the input is also determined by the C-C method. The details of the C-C and PSR methods are presented in Section 2.4. Then, the IMVO-ELM model is employed to predict each IMF. The details of the IMVO-ELM model are presented in Section 3.3.

Module 3: Combination of final results. The sum of the prediction results of all IMFs is the final result.

##### 3.2. Twice Decomposition Based on ICEEMDAN, Fuzzy Entropy, and VMD

In this paper, a twice-decomposition method is proposed to reduce the complexity of the input data of the prediction model. Because the original wind speed is nonstationary, ICEEMDAN is first employed to decompose it, which reduces the complexity of the prediction. However, some of the IMFs obtained by ICEEMDAN are still too complex for the prediction model, especially the high-frequency subseries. To find these IMFs, fuzzy entropy is employed to estimate the complexity of each IMF. Then, the IMFs are reclassified into two datasets by comparing FEn(IMF_{i}), the fuzzy entropy of the *i*-th IMF, with FEn(original), the fuzzy entropy of the original wind speed. The dataset **L** includes the IMFs with lower fuzzy entropy, which are easy to predict. The dataset **H** includes the IMFs with higher fuzzy entropy, which are difficult to predict.

VMD is then employed to decompose the IMFs in dataset **H** again to reduce the complexity of these high-entropy IMFs. The subseries obtained by the twice decomposition are much less complex and can be used for prediction.
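The reclassification step can be sketched as follows. The function name `reclassify_imfs` is hypothetical. The rule used here, which places an IMF in **H** when its fuzzy entropy is at least FEn(original), is an assumption made for illustration. The exact threshold is the one given by the reclassification equation of this section.

```python
def reclassify_imfs(imf_entropies, original_entropy):
    """Split IMF indices into low-entropy (L) and high-entropy (H) groups.

    Assumed rule: an IMF belongs to H when its fuzzy entropy is at least
    the fuzzy entropy of the original wind speed series.
    """
    low, high = [], []
    for index, entropy in enumerate(imf_entropies):
        if entropy >= original_entropy:
            high.append(index)  # to be decomposed again by VMD
        else:
            low.append(index)   # predicted directly
    return low, high
```

The indices in `high` would then be passed to a VMD routine, and the resulting subseries would be predicted together with the IMFs in `low`.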

##### 3.3. Improved Multiverse Optimizer for ELM

As introduced in Section 2.5, the weight coefficients from the input layer to the hidden layer and the biases of the hidden layer of the ELM are generated randomly and remain constant during training. According to [18, 37], this principle makes ELM training faster, but it can also degrade the training result. To solve this problem, optimization methods have been widely employed to improve the ELM model [38, 39]. The number of parameters to be optimized in the ELM is determined by the numbers of neurons in the input layer and the hidden layer. This number is usually large, so an efficient optimization method is necessary.

The MVO is a nature-inspired algorithm for global optimization proposed by Mirjalili et al. [40]. Many studies have shown that MVO performs better than other well-known optimization methods. Although MVO has good optimization ability, its exploration and exploitation abilities are difficult to balance, and its initial population is unevenly distributed. In this paper, an improved multiverse optimizer (IMVO) is proposed with two improvement strategies.

Firstly, the cubic chaos mapping is employed to increase the diversity of the initial population. The mapping iterates a cubic function of the current state *x*_{n}, where *a* and *b* are influence factors that determine the state and range of the mapping. In general, the mapping is chaotic when *b* ∈ (2.3, 3); *x*_{n} ∈ (−2, 2) when *a* = 1, and *x*_{n} ∈ (−1, 1) when *a* = 4.
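A sketch of chaotic population initialization is shown below. It assumes the commonly used form of the cubic map, *x*_{n+1} = *a* *x*_{n}^3 − *b* *x*_{n}, which is consistent with the ranges quoted above but is not taken from the text. The function name, the seed value `x0`, and the linear scaling from (−1, 1) to the search bounds are also assumptions. The defaults `a=4` and `b=2.5` are the IMVO settings reported in Section 4.3.

```python
import numpy as np


def cubic_chaos_population(n_pop, dim, lb, ub, a=4.0, b=2.5, x0=0.3):
    """Initial population from the cubic map x_{n+1} = a*x_n**3 - b*x_n (assumed form).

    x0 is an assumed seed. For a = 4 and b = 2.5 the orbit stays inside (-1, 1)
    as long as |x0| is below about 0.93.
    """
    seq = np.empty(n_pop * dim)
    x = x0
    for k in range(seq.size):
        x = a * x ** 3 - b * x  # one iteration of the cubic map
        seq[k] = x
    # Assumed: linear scaling of the chaotic values from (-1, 1) to [lb, ub].
    return lb + (seq.reshape(n_pop, dim) + 1.0) / 2.0 * (ub - lb)
```

For the ELM of Section 4.3, with 5 input neurons and 8 hidden neurons, each individual would have 5 × 8 + 8 = 48 entries, for example `cubic_chaos_population(30, 48, -1.0, 1.0)`.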

Secondly, a sine function is proposed for WEP, which is a control parameter in MVO. Under this strategy, WEP is a sine function of the iteration count, where WEP_{max} and WEP_{min} are the maximum and minimum values of WEP, iter represents the current iteration, and iter_{max} denotes the maximum number of iterations.

Under this control strategy, WEP changes slowly in the early stage to improve the exploration ability. In the middle stage, WEP changes quickly, so the algorithm switches rapidly from exploration to exploitation. WEP changes slowly again in the late stage to improve the exploitation ability.
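One sine schedule with the behavior described above is sketched below. The expression in the code is an assumption chosen to reproduce the slow, fast, slow pattern. It is not necessarily the expression used in the proposed IMVO, and the function name `wep_sine` is hypothetical. The defaults are the WEP_{min} and WEP_{max} values reported in Section 4.3.

```python
import math


def wep_sine(iteration, max_iteration, wep_min=0.2, wep_max=1.0):
    """Sine-shaped WEP schedule (assumed form).

    WEP rises from wep_min to wep_max, slowly near both ends and fastest
    around the middle of the run.
    """
    # The phase runs from -pi/2 at the start to +pi/2 at the last iteration.
    phase = math.pi * iteration / max_iteration - math.pi / 2.0
    return wep_min + (wep_max - wep_min) * (math.sin(phase) + 1.0) / 2.0
```

The slope of this schedule is proportional to the cosine of the phase, so it vanishes at the first and last iterations and peaks at the midpoint, in contrast to the linear WEP schedule of the original MVO.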

#### 4. Experiments and Analysis

##### 4.1. Dataset Description

The experimental data of this paper were collected from the Sotavento Galicia wind farm. Wind speed is recorded at a time interval of 10 min. Four datasets collected in different seasons are used for the experiments. The wind speed of the four datasets is shown in Figure 2. For each dataset, the first 1000 samples are used as the training dataset and the last 100 samples are used as the testing dataset. The statistical information, which includes the mean, maximum, minimum, standard deviation, skewness, and kurtosis, is given in Table 1. The wind speed varies over a wide range between its minimum and maximum in all datasets. The standard deviation, skewness, and kurtosis show that the wind speed is not normally distributed. All of this statistical information indicates that the wind speed is strongly nonlinear and nonstationary.

Figure 2: Wind speed of the four datasets, panels (a)–(d).

##### 4.2. Evaluation Metrics

To evaluate the performance of each forecasting method, evaluation metrics are calculated from the forecast and actual values. In this paper, the mean absolute percentage error (MAPE), root mean square error (RMSE), and mean absolute error (MAE) are used as evaluation metrics. In their definitions, *L* represents the number of samples, and *y*_{i} and *ŷ*_{i} are the observed and forecast wind speed values at time *i*.
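The three metrics can be computed as in the following sketch, which uses their standard definitions. The function names are hypothetical, and expressing MAPE in percent is an assumption that matches how MAPE values are reported in Section 4.

```python
import numpy as np


def mape(y, y_hat):
    """Mean absolute percentage error, in percent (requires nonzero observations)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return 100.0 * np.mean(np.abs((y - y_hat) / y))


def rmse(y, y_hat):
    """Root mean square error, in the unit of the data (m/s for wind speed)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.sqrt(np.mean((y - y_hat) ** 2))


def mae(y, y_hat):
    """Mean absolute error, in the unit of the data (m/s for wind speed)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat))
```

Because MAPE divides by the observed value, it weights errors at low wind speeds more heavily than MAE and RMSE do.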

##### 4.3. Comparison and Analysis with Different Optimization Methods for ELM

In this paper, the IMVO method proposed in Section 3.3 is used to improve the efficiency of the ELM. To demonstrate its performance, the proposed IMVO method is compared with the genetic algorithm (GA), particle swarm optimization (PSO), grey wolf optimizer (GWO), and MVO, which are well-known optimization methods.

Firstly, the parameters of these optimization methods are set so that the computational cost is roughly the same. The population size is 30 and the maximum number of iterations is 50 for all the optimization methods. The other main parameters are set as follows:

GA: generation gap = 0.95, crossover rate = 0.7, mutation rate = 0.01.

PSO: acceleration constants *c*_{1} = 2 and *c*_{2} = 2, inertia weight *ω* = 0.6.

MVO: WEP_{max} = 1, WEP_{min} = 0.2, *p* = 6.

IMVO: WEP_{max} = 1, WEP_{min} = 0.2, *p* = 6, *a* = 4, *b* = 2.5.

The number of input neurons of ELM is set as 5 and the number of hidden neurons is set as 8.

Secondly, the ELM model is employed to establish the forecasting model with the training datasets. The above optimization methods are used to optimize the ELM model to improve its performance. The objective function is the MAPE of the training process, which is to be minimized. Because intelligent optimization algorithms are random, each method is run 20 times independently. The average values of the training evaluation metrics for the different methods and datasets are given in Table 2. The boxplots of the evaluation metrics over the 20 runs are shown in Figure 3.
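The objective function used by the optimizers can be sketched as follows. The sketch assumes that each candidate solution is a flat vector holding the input weights followed by the hidden biases, that the activation is a sigmoid, and that the output weights are obtained by the pseudo-inverse. The function names are hypothetical. Only the evaluation of candidates is shown, and the update of the population by GA, PSO, GWO, MVO, or IMVO is omitted.

```python
import numpy as np


def elm_training_mape(theta, X, y, n_hidden):
    """Training MAPE (percent) of an ELM whose input weights and biases are given by theta."""
    n_in = X.shape[1]
    # Assumed layout of theta: input weights first, then hidden biases.
    W = theta[:n_in * n_hidden].reshape(n_in, n_hidden)
    b = theta[n_in * n_hidden:]
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # assumed sigmoid activation
    beta = np.linalg.pinv(H) @ y            # output weights by least squares
    y_hat = H @ beta
    return 100.0 * np.mean(np.abs((y - y_hat) / y))


def best_of_population(population, X, y, n_hidden):
    """Evaluate every candidate and return the best one with its score."""
    scores = np.array([elm_training_mape(p, X, y, n_hidden) for p in population])
    return population[scores.argmin()], scores.min()
```

With 5 input neurons and 8 hidden neurons, each candidate has 5 × 8 + 8 = 48 entries, so a population of 30 is a 30 × 48 array.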

Figure 3: Boxplots of the evaluation metrics over the 20 runs, panels (a)–(c).

As shown in Table 2 and Figure 3, the MAPE, MAE, and RMSE of the ELM method are worse than those of the other methods, which is caused by the instability of the ELM. Some of its results deviate strongly from the average value. For example, the worst MAE of the ELM method on dataset B is 1.62 m/s, whereas the average MAE of the ELM method on dataset B is 1.18 m/s, so the maximum deviation is about 37% of the average value ((1.62 − 1.18)/1.18 ≈ 0.37). When the ELM falls into this situation, it produces larger errors in wind speed forecasting. The results also indicate that the intelligent optimization algorithms improve the stability of ELM training. In Figure 3(a), the MAPE values of all the methods on datasets A and B are almost the same, and the MAPE values of the IMVO method on datasets C and D are slightly better than those of the other methods. In Figure 3(b), the MAE values of all the methods on dataset B are almost the same, and the MAE values of the IMVO method are better than those of the other methods on datasets A, C, and D. The GA method performs worse than the other optimization algorithms. In Figure 3(c), the GWO, PSO, MVO, and IMVO methods perform almost the same and better than the GA method. Although the results of some methods are almost the same, their convergence rates and searching abilities differ, which is important for short-term wind speed forecasting. The average convergence curves of the intelligent optimization algorithms over the 20 runs are shown in Figure 4.

Figure 4: Average convergence curves of the intelligent optimization algorithms over the 20 runs, panels (a)–(d).

As shown in Figures 4(a)–4(c), the MVO has better searching ability than the GA, GWO, and PSO methods, but its convergence rate cannot match that of the PSO and GWO methods. The proposed IMVO method improves both the searching ability and the convergence rate. As shown in Figure 4(d), although the proposed IMVO method converges slightly more slowly than the PSO method in the early period, its strong searching ability gives it a better result in the middle and late periods.

The results indicate that the proposed IMVO-ELM method makes the ELM model more stable and avoids the extreme cases. The proposed IMVO method also has strong searching ability and a fast convergence rate, which makes the ELM model more effective.

##### 4.4. Comparison and Analysis with Different Prediction Models

In this subsection, the proposed short-term wind speed forecasting method is verified against seven comparative methods: IMVO-ELM, EMD-IMVO-ELM, CEEMDAN-IMVO-ELM, ICEEMDAN-IMVO-ELM, EMD-cc-PSR-IMVO-ELM, CEEMDAN-cc-PSR-IMVO-ELM, and ICEEMDAN-cc-PSR-IMVO-ELM. All of these methods are based on the IMVO-ELM model, which was shown to be effective in the previous subsection. They differ only in the input signal. The IMVO-ELM approach uses the original wind speed directly as input. The EMD-IMVO-ELM, CEEMDAN-IMVO-ELM, and ICEEMDAN-IMVO-ELM approaches use as input the EMD, CEEMDAN, and ICEEMDAN decompositions of the original wind speed, respectively. In the EMD-cc-PSR-IMVO-ELM, CEEMDAN-cc-PSR-IMVO-ELM, and ICEEMDAN-cc-PSR-IMVO-ELM approaches, the original wind speed is decomposed by EMD, CEEMDAN, and ICEEMDAN, respectively. Then PSR, whose dimension and time delay are determined by the C-C method, is employed to reconstruct the input signal from the decomposed signal.

Because the input neuron number of the ELM can be determined by the cc-PSR method, the EMD-cc-PSR-IMVO-ELM, CEEMDAN-cc-PSR-IMVO-ELM, and ICEEMDAN-cc-PSR-IMVO-ELM approaches and the proposed approach determine the input neuron number automatically. The IMVO-ELM, EMD-IMVO-ELM, CEEMDAN-IMVO-ELM, and ICEEMDAN-IMVO-ELM approaches, in contrast, require the input neuron number to be chosen manually. For these four approaches, a traversal is employed to find the best input neuron number, which is varied from 1 to 10. The other parameters of the IMVO and ELM are set as in Section 4.3, and 20 independent runs are performed for each approach. The average MAPE, MAE, and RMSE values of the forecasting results for the different approaches and input neuron numbers are shown in Figure 5.

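Figure 5: Average MAPE, MAE, and RMSE of the forecasting results for different approaches and input neuron numbers, panels (a)–(d).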

According to the traversal, the best input neuron number of IMVO-ELM is 5, 6, 3, and 6 for datasets A, B, C, and D, respectively. The best input neuron number of EMD-IMVO-ELM is 3, 2, 3, and 2 for datasets A, B, C, and D, respectively. The best input neuron number of CEEMDAN-IMVO-ELM is 3, 3, 3, and 3 for datasets A, B, C, and D, respectively. The best input neuron number of ICEEMDAN-IMVO-ELM is 3, 6, 3, and 3 for datasets A, B, C, and D, respectively.

The results of IMVO-ELM, EMD-IMVO-ELM, CEEMDAN-IMVO-ELM, and ICEEMDAN-IMVO-ELM with their best input neuron numbers are used for comparison with the other methods. The EMD-cc-PSR-IMVO-ELM, CEEMDAN-cc-PSR-IMVO-ELM, and ICEEMDAN-cc-PSR-IMVO-ELM approaches and the proposed approach are each run 20 times independently for each dataset. The average evaluation metrics of the wind speed forecasting for all of the above approaches are shown in Table 3.

As shown in Table 3, the IMVO-ELM approach with the original wind speed has the worst performance, which indicates that the original wind speed is nonstationary and difficult to predict directly with the IMVO-ELM model. The forecasting performance is greatly improved by the EMD-IMVO-ELM, CEEMDAN-IMVO-ELM, and ICEEMDAN-IMVO-ELM approaches, whose inputs are the signals decomposed by EMD, CEEMDAN, and ICEEMDAN. The results indicate that these signal decomposition methods separate the random, periodic, and trend components of the signal well, which helps the forecasting. The experiments also show that ICEEMDAN is better than CEEMDAN, and CEEMDAN is better than EMD, in this wind speed forecasting experiment. The EMD-cc-PSR-IMVO-ELM, CEEMDAN-cc-PSR-IMVO-ELM, and ICEEMDAN-cc-PSR-IMVO-ELM approaches add the cc-PSR method to reconstruct the input signal of each IMF, and they perform better than EMD-IMVO-ELM, CEEMDAN-IMVO-ELM, and ICEEMDAN-IMVO-ELM, respectively. The results illustrate that the cc-PSR method extracts more useful information from the time series.

The proposed method has the best evaluation metrics for all the datasets. The forecasting period of dataset A lies in the low wind speed range: the MAPE of IMVO-ELM is over 25%, and the MAPE of the other methods except the proposed one is close to 10%. The MAPE of the proposed method is 6.68%, which is much better than that of the other methods. The forecasting periods of datasets B, C, and D lie in the medium and high wind speed range, where the MAPE, MAE, and RMSE of the proposed method are reduced by nearly 0.5%, 0.05 m/s, and 0.05 m/s, respectively, compared with the best of the other methods. The results indicate that the proposed method is useful over a wide range of wind speeds.

The forecast wind speed series of each approach is compared with the original wind speed in Figure 6, and the errors of each approach are also presented. As shown in the figures, the proposed method matches the original curve well, especially at the peaks of the curve. The error curve of the proposed method is also smoother and closer to zero.

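Figure 6: Forecast wind speed of each approach compared with the original wind speed, with the corresponding errors, panels (a)–(d).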

Finally, the average calculation time of each method is listed in Table 4. The results indicate that although the proposed method costs a little more time than other methods, the calculation time of the proposed method is still acceptable for short-term wind speed forecasting.

According to the above comparison results, the proposed method has higher prediction accuracy and stronger adaptability over a wide range of wind speeds than the other compared methods.

#### 5. Conclusions

A novel hybrid model based on twice decomposition, PSR, and IMVO-ELM is proposed to enhance the performance of short-term wind speed forecasting. In the proposed hybrid model, a twice decomposition based on ICEEMDAN, fuzzy entropy, and VMD is proposed to reduce the nonstationarity of the original wind speed signal. Then, the decomposed signal is reconstructed by the C-C-PSR method as the input data of the prediction model. An IMVO-ELM model is proposed as the prediction model, in which the proposed IMVO is used to improve the stability and efficiency of the ELM. Finally, two comparison experiments are designed to verify the performance of the proposed method, and the experimental conclusions are as follows:

(1) The twice decomposition greatly reduces the nonstationarity of the original wind speed signal.

(2) The C-C-PSR method can determine the input dimension of the ELM, which improves the prediction accuracy of the ELM.

(3) The IMVO improves the stability of the ELM, and its optimization efficiency is better than that of the other compared methods.

Therefore, the proposed hybrid approach is a useful tool for short-term wind speed forecasting.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by National Natural Science Foundation of China (NSFC) (51709121).