Abstract

An alternative electric power source, such as wind power, has to be both reliable and autonomous. An accurate wind speed forecasting method plays a key role in achieving these properties and is also a valuable tool for overcoming a variety of economic and technical problems connected to wind power production. The proposed method is based on reformulating the problem in standard state-space form and on implementing a bank of Kalman filters (KF), each fitting an ARMA model of different order. The proposed method is to be applied to a greenhouse unit which incorporates an automated use of renewable energy sources, including wind power.

1. Introduction

Energy is considered among the most significant factors closely related to both economic and social development. Nowadays the majority of electrical energy production is based on fossil fuels, which on the one hand are highly efficient but on the other hand are responsible for the emission of greenhouse gases and have limited reserves.

Consequently, renewable sources of energy, such as wind, biomass, solar power, and wave power, have already been adopted for electric power production. It is well known that wind power generation raises issues of reliability, because wind speed is significantly and directly affected by various factors such as the type of terrain, the height, the season of the year, atmospheric conditions, obstacles present, and many more. This leads to the conclusion that unless the reliability of wind power generation is at an acceptable level, wind power is not eligible for constant electrical energy supply to the power system [1, 2].

Recent studies have shown that combined forecasting methods can offer robust solutions and can be efficiently applied to various real-life problems in diverse fields such as chemical processes, economics, load forecasting, tourism demand, environmental issues, medicine, and many more [3–7].

In this study a hybrid model is presented that combines the advantages of an ARMA model and a support vector machine (SVM) model in the wind speed modelling and prediction problem. Initially, successful model identification and parameter estimation have to be performed in order to choose the most appropriate ARMA models. For this task the well-established multimodel partition algorithm (MMPA) was used. This approach was introduced by Lainiotis [8, 9] and summarizes the parametric model uncertainty into an unknown, finite-dimensional parameter vector whose values are assumed to lie within a known set of finite cardinality. A nonexhaustive list of reformulations and extensions of the MMPA approach, as well as its application to a variety of problems, can be found in [10–19].

In this research real data were used, provided by Vestas Hellas; the simulation results appear to be very promising.

2. Hybrid Model Presentation

2.1. The ARMA Model

The problem of fitting an ARMA model to a given time series has been studied for more than half a century and still appears in many different fields, such as remote monitoring of civil infrastructure [20], predicting the demand for auto spare parts in China due to the fierce market competition [21], forecasting equipment failures in order to adjust maintenance policies in manufacturing plants [22], estimating retail sales volumes [23], predicting the outbreak and development of animal infectious diseases [24], and many more.

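Considering the general case, an $m$-variate (i.e., multivariate) ARMA model of order $(p, q)$ [ARMA$(p, q)$] for a stationary time series of vectors $y(k)$ observed at equally spaced instants $k = 1, 2, \ldots, n$ is defined as

$$y(k) = \sum_{i=1}^{p} A_i\, y(k-i) + \sum_{j=1}^{q} B_j\, v(k-j) + v(k), \qquad E\left[v(k)\, v^{T}(k)\right] = R, \tag{1}$$

where the $m$-dimensional vector $v(k)$ is uncorrelated random noise, not necessarily Gaussian, with zero mean and covariance matrix $R$; $\theta = (p, q)$ is the order of the predictor; and $A_1, \ldots, A_p$, $B_1, \ldots, B_q$ are the $m \times m$ coefficient matrices of the multivariate (MV) ARMA model.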

It is obvious that the problem requires both the determination of the predictor’s order $\theta = (p, q)$ and the computation of the predictor’s coefficient matrices $\{A_i, B_j\}$.
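As an illustration of the ARMA recursion, the following minimal sketch computes the deterministic part of a one-step prediction for a scalar series ($m = 1$). The function name `arma_one_step` and the coefficient values are hypothetical and chosen only for the example; they are not the models identified in this study.

```python
def arma_one_step(y_hist, v_hist, a, b):
    """Return sum_i a[i]*y(k-1-i) + sum_j b[j]*v(k-1-j) for a scalar series.

    y_hist and v_hist are ordered oldest to newest, so y_hist[-1] is y(k-1)
    and v_hist[-1] is v(k-1). a holds the p AR coefficients and b the q MA
    coefficients.
    """
    p, q = len(a), len(b)
    if len(y_hist) < p or len(v_hist) < q:
        raise ValueError("not enough history for the requested order")
    ar_part = sum(a[i] * y_hist[-1 - i] for i in range(p))
    ma_part = sum(b[j] * v_hist[-1 - j] for j in range(q))
    return ar_part + ma_part


# ARMA(2, 1) with hypothetical coefficients, chosen only for illustration.
prediction = arma_one_step(y_hist=[1.0, 2.0], v_hist=[0.5], a=[0.5, 0.25], b=[0.4])
```

The current noise sample is unknown at prediction time and has zero mean, so it does not enter the prediction.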

The major disadvantage of the ARMA models is that their performance can be limited by any significant data nonlinearities.

2.2. Multimodel Partition Algorithm (MMPA)

Because wind speed has neither constant nor periodic behaviour, it was found, by trial and error, that no single ARMA model was able to describe the whole data set satisfactorily. It is actually the combination of various ARMA models, each one used for different time intervals and applied for different time durations, that best describes the existing data. So instead of having various ARMA models of different order running in parallel with the SVM, it was decided to feed all the data to an adaptive filter programmed with the MMPA, and it is the job of that filter to decide which ARMA model is used each time.

If we assume that the model order fitting the data is known and is equal to $\theta = (p, q)$, we can rewrite (1) in standard state-space form as

$$x(k+1) = x(k), \qquad y(k) = H(k)\, x(k) + v(k). \tag{2}$$

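Now let $\theta$ denote the scalar $\theta = \max(p, q)$. Then $x(k)$ is a $2m^{2}\theta \times 1$ vector made up from the coefficients of the matrices $\{A_1, \ldots, A_\theta, B_1, \ldots, B_\theta\}$; $H(k)$ is an observation history matrix of the process $\{y(k)\}$, built from the samples at times $k-1$ down to $k-\theta$.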

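If the general form of the matrices $A_i$ and $B_i$ is, respectively,

$$A_i = \begin{bmatrix} a_{11}^{i} & \cdots & a_{1m}^{i} \\ \vdots & \ddots & \vdots \\ a_{m1}^{i} & \cdots & a_{mm}^{i} \end{bmatrix}, \qquad B_i = \begin{bmatrix} b_{11}^{i} & \cdots & b_{1m}^{i} \\ \vdots & \ddots & \vdots \\ b_{m1}^{i} & \cdots & b_{mm}^{i} \end{bmatrix}, \tag{3}$$

then

$$x(k) = \left[\, a_{11}^{1} \cdots a_{m1}^{1} \;\; a_{12}^{1} \cdots a_{mm}^{1} \;\cdots\; a_{11}^{\theta} \cdots a_{mm}^{\theta} \;\; b_{11}^{1} \cdots b_{mm}^{\theta} \,\right]^{T}. \tag{4}$$

If $p > q$ then $\theta = p$; the last $(p - q)$ MA coefficients are zero. If $q > p$ then $\theta = q$; the last $(q - p)$ AR coefficients are zero. Consider

$$H(k) = \left[\, y_1(k-1) I \cdots y_m(k-1) I \;\cdots\; y_1(k-\theta) I \cdots y_m(k-\theta) I \;\; v_1(k-1) I \cdots v_m(k-\theta) I \,\right], \tag{5}$$

where $I$ is the $m \times m$ identity matrix.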

Assuming that the system model and its statistics were completely known, the Kalman filter (KF) in its various forms would be the optimal estimator in the minimum variance sense.
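To make the role of the Kalman filter as a coefficient estimator concrete, the sketch below runs one elemental filter on a scalar series under the constant-state model $x(k+1) = x(k)$, $y(k) = h(k)\,x(k) + v(k)$. It is a simplified illustration, not the implementation used in this study: the series is a noise-free AR(1) toy signal, the values of `r` and of the initial covariance are assumptions, and the observation row contains only past observations (a full ARMA filter would also append past innovation estimates).

```python
def kalman_update(x, P, h, y, r):
    """One Kalman step for x(k+1) = x(k), y(k) = h . x(k) + v(k), scalar y.

    x: current coefficient estimate (list), P: its covariance (list of lists),
    h: observation history row, y: new observation, r: noise variance.
    """
    n = len(x)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]  # P h^T
    s = sum(h[i] * Ph[i] for i in range(n)) + r                     # innovation variance
    innovation = y - sum(h[i] * x[i] for i in range(n))
    gain = [Ph[i] / s for i in range(n)]
    x_new = [x[i] + gain[i] * innovation for i in range(n)]
    # P is symmetric, so (h P) equals (P h^T) transposed and Ph can be reused.
    P_new = [[P[i][j] - gain[i] * Ph[j] for j in range(n)] for i in range(n)]
    return x_new, P_new, innovation, s


# Noise-free AR(1) toy series y(k) = 0.8 * y(k-1); the value 0.8 is made up.
series = [1.0]
for _ in range(20):
    series.append(0.8 * series[-1])

x, P = [0.0], [[100.0]]      # vague initial guess for the single AR coefficient
for k in range(1, len(series)):
    h = [series[k - 1]]      # observation history row for order 1
    x, P, innovation, s = kalman_update(x, P, h, series[k], r=0.01)
```

The returned innovation and its variance are the quantities a multimodel scheme needs from each elemental filter in order to score the candidate orders.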

However, if the system model is not completely known, the MMPA, introduced by Lainiotis [8, 9], is one of the most widely used approaches for such problems [10–19, 25–28].

In the case under consideration, assume that the model uncertainty is the lack of knowledge of the model order $\theta$. Let us further assume that the model order lies within a known sample space of finite cardinality; that is, $\theta \in \{\theta_1, \theta_2, \ldots, \theta_M\}$, $\theta_j \in \mathbb{Z}$, where $\mathbb{Z}$ denotes the set of integers. The MMPA operates on the following discrete-time model:

$$x(k+1) = F(k+1, k/\theta)\, x(k) + w(k), \qquad y(k) = H(k/\theta)\, x(k) + v(k), \tag{6}$$

where $\theta$ is the unknown parameter, the model order in this case; $F(k+1, k/\theta)$ is the state transition matrix; and $w(k)$ is independent, zero-mean white noise, not necessarily Gaussian, with covariance $Q$, which is usually set to a small positive constant. The optimal MMSE (minimum mean square error) estimate of $x(k)$ is given by

$$\hat{x}(k/k) = \sum_{j=1}^{M} \hat{x}(k/k; \theta_j)\, p(\theta_j/k). \tag{7}$$

A set of models is designed, each matching one value of the parameter vector, $\theta_j$, $j = 1, 2, \ldots, M$. The prior probabilities for each model are set to $p(\theta_j/0) = 1/M$, where $M$ is the cardinality of the model set.

A bank of Kalman filters is then applied, one for each model, which can be run in parallel, thus saving enormous computational time. At each iteration, the MMPA selects the model that corresponds to the maximum a posteriori probability as the correct one. This probability tends (asymptotically) to one, while the remaining probabilities tend to zero. The overall optimal estimate can be taken either to be the individual estimate of the elemental filter exhibiting the maximum posterior probability, for example, a value of 0.9 or higher [14], or the weighted average of the estimates produced by the elemental ARMA filters, as described in (7), which is the case used in this paper.

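The probabilities are calculated on-line in a recursive manner, as shown by

$$p(\theta_j/k) = \frac{L(k/k; \theta_j)}{\sum_{i=1}^{M} L(k/k; \theta_i)\, p(\theta_i/k-1)}\; p(\theta_j/k-1), \tag{8}$$

$$L(k/k; \theta_j) = \left| P_{\tilde{y}}(k/k-1; \theta_j) \right|^{-1/2} \exp\left[ -\tfrac{1}{2}\, \tilde{y}^{T}(k/k-1; \theta_j)\, P_{\tilde{y}}^{-1}(k/k-1; \theta_j)\, \tilde{y}(k/k-1; \theta_j) \right], \tag{9}$$

where the innovation process $\tilde{y}(k/k-1; \theta_j) = y(k) - H(k/\theta_j)\, \hat{x}(k/k-1; \theta_j)$ is a zero-mean white process with covariance matrix

$$P_{\tilde{y}}(k/k-1; \theta_j) = H(k/\theta_j)\, P(k/k-1; \theta_j)\, H^{T}(k/\theta_j) + R. \tag{10}$$

For equations (8)–(10), $j = 1, 2, \ldots, M$.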
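The recursive probability update and the weighted combination of the elemental estimates can be sketched as follows for a scalar observation. This is a minimal illustration assuming Gaussian-shaped likelihoods built from each filter's innovation and innovation variance; the function names and the numbers in the example are hypothetical.

```python
import math


def mmpa_update(priors, innovations, innovation_vars):
    """Bayes update of the model probabilities from one new observation.

    priors[j] is the probability of model j before the observation,
    innovations[j] and innovation_vars[j] come from elemental filter j.
    """
    likelihoods = [math.exp(-0.5 * e * e / s) / math.sqrt(s)
                   for e, s in zip(innovations, innovation_vars)]
    weighted = [lik * p for lik, p in zip(likelihoods, priors)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("all likelihoods underflowed; rescale the innovations")
    return [w / total for w in weighted]


def weighted_estimate(estimates, probabilities):
    """Probability-weighted average of the elemental filter estimates."""
    return sum(x * p for x, p in zip(estimates, probabilities))


# Two candidate models with equal priors; the first one predicts perfectly.
priors = [0.5, 0.5]
posteriors = mmpa_update(priors, innovations=[0.0, 2.0], innovation_vars=[1.0, 1.0])
combined = weighted_estimate([10.0, 20.0], posteriors)
```

A model whose innovations stay small relative to their variance accumulates probability at each step, which is the mechanism behind the convergence of one posterior probability towards one.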

2.3. Support Vector Machines (SVM)

In support vector machines, as proposed in [29], the training data set $\{(x_i, d_i)\}$, $i = 1, \ldots, N$, is mapped into a higher-dimensional feature space via an operator $\varphi$.

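A mathematical representation of the SVM function is

$$f(x) = w^{T} \varphi(x) + b, \tag{11}$$

where $w$ and $b$ can be found by minimizing

$$R(C) = C\, \frac{1}{N} \sum_{i=1}^{N} L_{\varepsilon}\big(d_i, f(x_i)\big) + \frac{1}{2} \|w\|^{2}, \tag{12}$$

$$L_{\varepsilon}(d, f) = \begin{cases} |d - f| - \varepsilon, & |d - f| \ge \varepsilon, \\ 0, & \text{otherwise}, \end{cases} \tag{13}$$

where the parameters $C$ and $\varepsilon$ are user-defined. The term $d_i$ is the actual wind speed at time instant $i$ and the term $L_{\varepsilon}$ is the loss function. From (13) it is clear that there is no penalty for errors below $\varepsilon$. The width of the function is given by the term $\frac{1}{2}\|w\|^{2}$ and the training error term is given by $C \frac{1}{N} \sum_{i=1}^{N} L_{\varepsilon}(d_i, f(x_i))$, where $C$ is the tradeoff between the width of the function and the minimum training error. For dealing with nonlinear cases, like wind speed data, one may introduce slack variables $\xi_i$ and $\xi_i^{*}$ into (11) such that

$$d_i - w^{T} \varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w^{T} \varphi(x_i) + b - d_i \le \varepsilon + \xi_i^{*}, \tag{14}$$

where $\xi_i, \xi_i^{*} \ge 0$ and $i = 1, \ldots, N$.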

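By considering the above slack variables, and in order to include any extra cost of the training errors, (12), which represents the objective function to be minimized, is rearranged to

$$R(w, \xi, \xi^{*}) = \frac{1}{2} \|w\|^{2} + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^{*} \right), \tag{15}$$

where again $C$ is user-defined and is the tradeoff between the maximum margin, defined by $\frac{1}{2}\|w\|^{2}$, and the minimum training error, defined by $\sum_{i=1}^{N} (\xi_i + \xi_i^{*})$.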

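Finally, by introducing positive Lagrangian multipliers, the minimization of (15) is recast as the maximization of

$$R(\alpha, \alpha^{*}) = \sum_{i=1}^{N} d_i \left( \alpha_i - \alpha_i^{*} \right) - \varepsilon \sum_{i=1}^{N} \left( \alpha_i + \alpha_i^{*} \right) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( \alpha_i - \alpha_i^{*} \right) \left( \alpha_j - \alpha_j^{*} \right) K(x_i, x_j) \tag{16}$$

subject to

$$\sum_{i=1}^{N} \left( \alpha_i - \alpha_i^{*} \right) = 0, \qquad 0 \le \alpha_i \le C, \qquad 0 \le \alpha_i^{*} \le C, \tag{17}$$

where $i = 1, \ldots, N$.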

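The Lagrangian multipliers $\alpha_i$, $\alpha_i^{*}$ satisfy $\alpha_i \alpha_i^{*} = 0$ and also

$$f(x) = \sum_{i=1}^{N} \left( \alpha_i - \alpha_i^{*} \right) K(x, x_i) + b. \tag{18}$$

The kernel function appearing in (18) is defined such that $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$, meaning that its value is equal to the inner product of the vectors $x_i$ and $x_j$ mapped into the feature space, $\varphi(x_i)$ and $\varphi(x_j)$.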

In this study the Gaussian kernel function (19), also known as radial basis function (RBF), is used. Consider

$$K(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}} \right). \tag{19}$$

The most significant feature of the SVM compared to other similar algorithms is that they achieve optimum performance by restricting the complexity of the decision function so that it is the most suitable for the quantity of data present.
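The building blocks of the support vector regression described in this section can be sketched in a few lines. The code assumes the common parameterisation of the Gaussian kernel with a width parameter `sigma` and the standard $\varepsilon$-insensitive loss; the dual coefficients passed to `svr_predict` are hypothetical numbers, since obtaining them requires solving the quadratic programme, which is not shown here.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian (RBF) kernel exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def eps_insensitive_loss(d, f, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(d - f) - eps)


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Kernel expansion f(x) = sum_i coef_i * K(x, x_i) + b.

    dual_coefs[i] stands for the difference of the two Lagrangian
    multipliers attached to support vector i.
    """
    return sum(c * rbf_kernel(x, sv, sigma)
               for c, sv in zip(dual_coefs, support_vectors)) + b


# Hypothetical trained model with a single support vector.
value = svr_predict([0.0], support_vectors=[[0.0]], dual_coefs=[2.0], b=1.0, sigma=1.0)
```

Errors smaller than `eps` contribute nothing to the loss, which is why only the samples on or outside the tube end up as support vectors.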

2.4. The Hybrid Model

Wind speed behaviour is unpredictable and difficult to represent. This is the reason for combining two different techniques for modelling the linear and the nonlinear parts of the series. The hybrid model proposed is based on a linear pattern, produced by the MMPA, and a nonlinear one, produced by the SVM. It can be represented as

$$y(k) = L(k) + N(k), \tag{20}$$

where $L(k)$ is the linear part and $N(k)$ is the nonlinear part. Both parts are directly calculated from the wind speed time series.

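If $e(k)$ is the MMPA estimation error at any time instant $k$, then

$$e(k) = y(k) - \hat{L}(k). \tag{21}$$

It is now the SVM that models these residuals as

$$e(k) = f\big( e(k-1), e(k-2), \ldots, e(k-\ell) \big) + \Delta(k), \tag{22}$$

where $f$ is nonlinear, $\ell$ is the number of lagged residuals used as inputs, and $\Delta(k)$ is random error. It is obvious that $\hat{N}(k)$ is the forecast of (22).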

Consequently the forecast of the hybrid model is

$$\hat{y}(k) = \hat{L}(k) + \hat{N}(k). \tag{23}$$

The proposed method is schematically represented in Figure 1.
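The data flow of the hybrid scheme can be sketched as follows: the residuals of the linear stage are arranged into lagged input/target pairs for the nonlinear stage, and the two forecasts are then added. The helper names, the lag count, and the numbers are hypothetical; the training of the SVM itself is left out.

```python
def residuals(observed, linear_estimates):
    """Estimation error of the linear stage at each time instant."""
    return [y - l for y, l in zip(observed, linear_estimates)]


def lagged_pairs(series, n_lags):
    """Build (inputs, target) pairs: n_lags past values predict the next one."""
    return [(series[k - n_lags:k], series[k]) for k in range(n_lags, len(series))]


def hybrid_forecast(linear_forecast, residual_forecast):
    """Final forecast: linear part plus the forecast of its residual."""
    return linear_forecast + residual_forecast


observed = [5.0, 6.0, 7.5, 7.0]   # made-up wind speeds
linear = [4.5, 6.5, 7.0, 7.0]     # made-up linear-stage estimates
e = residuals(observed, linear)
pairs = lagged_pairs(e, n_lags=2)  # training pairs for the nonlinear stage
```

Each pair in `pairs` would be one training sample for the residual model, whose output is then passed to `hybrid_forecast` together with the linear forecast for the same instant.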

3. Results and Discussion

3.1. Results

In this method the weighted average of the estimates produced by the elemental ARMA filters was used as a data preprocessor in order to detect the linearities of the data. This was achieved using a bank of 10 Kalman filters of order programmed with the MMPA. Then the MMPA estimation error was applied as input to the SVM, which achieved a further error reduction and produced a better forecasting outcome. As far as the SVM are concerned, the three parameters ($C$, $\varepsilon$, and $\sigma$) had to be carefully adjusted. Unsuitable values for these parameters may lead to either overfitting or underfitting of the training data. The values used in this research were , , and .

This research was based on the hourly averages of wind speed recorded by Vestas Hellas from November 2010 to February 2011. The obtained time series did not follow any periodic pattern and also presented irregular amplitudes, making it hard to both model and predict (Figures 2, 3, 4, and 5; “raw” data).

The aim of this work is to generate a single-step prediction based on past observations. The data were normalized to take values from zero to one, before using them as input data to the hybrid model.
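The normalization step can be sketched as a min-max scaling. This assumes that "zero to one" means mapping the smallest value to 0 and the largest to 1, which the text does not state explicitly; the helper names are hypothetical.

```python
def minmax_fit(series):
    """Return the (min, max) pair used for scaling."""
    return min(series), max(series)


def minmax_scale(series, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in series]


def minmax_invert(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [lo + s * (hi - lo) for s in scaled]


speeds = [2.0, 4.0, 6.0]          # made-up wind speeds
lo, hi = minmax_fit(speeds)
scaled = minmax_scale(speeds, lo, hi)
restored = minmax_invert(scaled, lo, hi)
```

In a forecasting setting the bounds would normally be taken from the training portion only and reused for the other portions, so that no information from the test data leaks into the model.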

From the 2725 available data points, 720 were for November, 744 for December, 744 for January, and 517 for February. For each month 20% of the available data was used for training, 20% for validation, and 60% for testing.
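The partition of each month can be sketched as below. The split is assumed to be chronological (training first, then validation, then testing) and fractional sizes are rounded down; neither detail is stated in the text, and `split_month` is a hypothetical helper.

```python
def split_month(series, train_pct=20, val_pct=20):
    """Chronological split into training, validation, and testing parts."""
    n = len(series)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test


# Number of hourly samples per month, as reported in the text.
month_sizes = {"November": 720, "December": 744, "January": 744, "February": 517}

# Index lists stand in for the actual wind speed samples.
split_sizes = {
    name: tuple(len(part) for part in split_month(list(range(n))))
    for name, n in month_sizes.items()
}
```

Under these assumptions November, for example, contributes 144 training, 144 validation, and 432 testing samples.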

The performance of the hybrid method is judged by (a) comparing the predicted and the observed (raw) wind time series (Figures 2, 3, 4, and 5), (b) drawing scatter diagrams of the predicted and the observed sequences (Figures 6, 7, 8, and 9), and (c) computing the mean absolute percentage error (MAPE) for the testing data set, using the formula given in (24). Table 1 summarizes the results. Consider

$$\mathrm{MAPE} = \frac{100}{N} \sum_{k=1}^{N} \left| \frac{y(k) - \hat{y}(k)}{y(k)} \right|, \tag{24}$$

where $y(k)$ is the observed value, $\hat{y}(k)$ is the predicted value, and $N$ is the number of testing samples.
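The MAPE computation can be sketched as follows, using its standard definition (the mean of the absolute relative errors, expressed as a percentage). The sample values are made up, and the handling of zero observations is an assumption, since the text does not say how calm periods were treated.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(observed, predicted):
        if y == 0:
            raise ValueError("MAPE is undefined when an observed value is zero")
        total += abs((y - y_hat) / y)
    return 100.0 * total / len(observed)


# Made-up observed and predicted wind speeds.
error_pct = mape([10.0, 8.0, 5.0], [9.5, 8.4, 5.0])
```

Because each error is divided by the observed value, small observed wind speeds weigh heavily in this measure.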

3.2. Discussion

Figures 2–5 indicate that the predictions are very close to the real values of the wind speed time series. Additionally, the hybrid model is capable of following satisfactorily the irregularities of the observed series. The value of MAPE between the real and the predicted series can be considered quite low, since it has an average value below 3.01% and the individual value for each month is not above 3.3%. Furthermore, the value of $R^{2}$ is another indication of how close the predicted series is to the real one. A value of $R^{2} = 1$ indicates a straight line of the form $y = x$, meaning that the predicted values are exactly the same as the real ones. In this study $R^{2}$ is at an average of 0.8685, which means that the predicted series is indeed very close to the real one.
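The closeness measure read from the scatter diagrams can be sketched as below, assuming it is the squared Pearson correlation between the observed and the predicted series. The text does not give the formula, so this choice is an assumption, and the sample values are made up.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between two equally long series."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov * cov / (var_o * var_p)


perfect = r_squared([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
partial = r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0])
```

Under this definition the measure captures how well the points follow a straight line, so it is best read together with the MAPE, which captures the size of the errors.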

For the sake of completeness of this research the results of forecasting using only the adaptive combination of MMPA with the ARMA models as well as using only the ANN architecture with the SVM are also presented. The first method was applied successfully in [22] for electric load modeling and forecasting. The difference is that electric load data is seasonal and after careful manipulation it can be converted to Gaussian, something that cannot be done with a wind speed time series.

Using the data for November, Figures 10 and 11 compare the predicted and the observed (raw) wind time series and Figures 12 and 13 show the scatter diagram of the predicted and observed sequences for MMPA-ARMA and SVM, respectively. Table 2 summarizes the performance of the aforementioned methods and the hybrid method proposed for the whole data set.

At this point it should be mentioned that when dealing with the wind speed forecasting problem it is quite difficult to attain very high prediction accuracy. The hybrid method proposed gave some very satisfactory results and the accuracy level reached can be considered sufficient for decision making as far as electric power production is concerned.

Last but not least, it should also be mentioned that a designer’s crucial task when using the MMPA is to assign a proper value to the cardinality $M$. In addressing that problem one has to bear in mind the following considerations:
(i) the quality of the overall MMPA estimate (in terms of MAPE) increases with the number of elemental filters applied, so a small value of $M$ leads to estimates of poor quality;
(ii) the computational load of the MMPA is proportional to the number of filters implemented, so a large value of $M$ increases the computational burden, a factor that makes the implementation of the MMPA for real-time problems difficult.

A way of assigning the proper value of $M$ is to record the behaviour of the MMPA during its operation. As already mentioned, at each iteration the MMPA selects the model that corresponds to the maximum a posteriori probability as the correct one. This probability tends (asymptotically) to one, while the remaining probabilities tend to zero. Figure 14 shows the behaviour of the MMPA for the whole data set concerning December.

It can be seen that the highest order of the appropriate ARMA model for the time series under investigation is which makes the choice of acceptable. Another significant conclusion is that if a model change is required, the algorithm senses the variation and increases the a posteriori probability of the corresponding model while decreasing the remaining ones. Thus, the algorithm is adaptive in the sense of being able to track model changes in real time. This procedure incorporates the algorithm’s intelligence.

4. Conclusions

The area of forecasting is very demanding, and for more than 50 years ARMA models were used exclusively for tackling real-life problems. Recently ANN have been applied to difficult prediction problems, showing very satisfactory results, especially due to their ability to handle the nonlinearities of the data set. The aim of this work was not just to add yet another technique for wind speed prediction but to validate the fact that different forecasting methods complement each other and lead to accurate results. As was shown, the two individual forecasting methods, adaptive MMPA-ARMA and SVM, cannot match the performance of the hybrid method proposed. The results of the proposed method are to be applied to a greenhouse unit which incorporates an automated use of renewable energy sources, including wind power. Future work can include adjustments for wind speed prediction for time intervals smaller than 1 hour, say ten-minute intervals, and also for on-line wind speed prediction.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the European Union (European Social Fund-ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF), Research Funding Program: ARCHIMEDES III. Investing in knowledge society through the European Social Fund.