Applied Neural Intelligence to Modeling, Control, and Management of Human Systems and Environments
Research Article (Open Access)
Standard Precipitation Index Drought Forecasting Using Neural Networks, Wavelet Neural Networks, and Support Vector Regression
Abstract
Drought forecasts can be an effective tool for mitigating some of the more adverse consequences of drought. Data-driven models are suitable forecasting tools because they can be developed rapidly and require far less information than physically based models. This study compares the effectiveness of three data-driven models for forecasting drought conditions in the Awash River Basin of Ethiopia. The Standard Precipitation Index (SPI) is forecast and compared using artificial neural networks (ANNs), support vector regression (SVR), and wavelet neural networks (WN). SPI 3 and SPI 12 were forecast over lead times of 1 and 6 months. The performance of all the models was compared using RMSE, MAE, and $R^2$. The forecast results indicate that the coupled wavelet neural network (WN) models were the best models for forecasting SPI values over multiple lead times in the Awash River Basin in Ethiopia.
1. Introduction
Droughts, a natural occurrence in almost all climatic zones, result from precipitation remaining below normal amounts for an extended period of time. Extended periods of drought can lead to several adverse consequences, which include a disruption of the water supply, low agricultural yields, and reduced flows for ecosystems. Consequently, the ability to forecast and predict the characteristics of droughts, specifically their initiation, frequency, and severity, is important. Effective drought forecasts are a valuable tool for water resource management and for the agricultural industry.
Currently, drought monitoring in Ethiopia is conducted by the National Meteorological Services Agency (NMSA). The NMSA regularly produces a 10-day bulletin that gives an analysis of rainfall based on the long-term average or normal. This bulletin is then circulated to a wide range of users, ranging from local development agents to decision makers at a national level. In addition to rainfall analysis, the normalized difference vegetation index (NDVI) is provided, which is a satellite-based index widely used to monitor vegetation and drought conditions. The NMSA produces a regular 10-day bulletin regarding NDVI variation that compares the current vegetation condition with normal conditions or with the conditions of the previous year [1]. However, the NDVI is sensitive to changes in vegetative land cover and may not be effective in areas where vegetation is minimal. In addition, the NMSA of Ethiopia produces medium and seasonal forecasts of precipitation using the aforementioned NDVI.
Unlike other natural hazards, droughts have a slow evolution time [2]. A significant amount of time passes between the inception of a drought and the point at which its consequences are perceived by ecosystems and hydrological systems. Because of this feature, the most adverse drought impacts can be mitigated more effectively than those of other extreme events such as floods, earthquakes, or hurricanes, provided that a drought monitoring system is in operation that can promptly warn of the onset of a drought and follow its evolution in space and time [3].
A common tool utilized to monitor current drought conditions is a drought index. Several drought indices can be used to forecast the possible evolution of an ongoing drought, in order to adopt appropriate mitigation measures and drought policies for water resources management [4]. This is because a drought index is expressed as a single number, which is believed to be far more functional than raw data during decision-making [2]. Several drought indices have been developed around the world in the past based on rainfall as the single variable, including the widely used Deciles [5], Standardized Precipitation Index (SPI) [6], and Effective Drought Index (EDI) [7]. There is also the well-known Palmer Drought Severity Index (PDSI) [8], which considers temperature along with rainfall. The SPI drought index was chosen to forecast drought in this study due to its simplicity, its ability to represent droughts on multiple time scales, and because it is a probabilistic drought index. In addition, the study by Ntale and Gan [9] determined that the SPI is the most appropriate index for monitoring the variability of droughts in East Africa because it is easily adapted to local climate, has modest data requirements, and can be computed at almost any time scale.
Forecasting any hydrologic phenomenon can be done using a physical, conceptual, or data-driven approach. The latter approach is widely used in hydrologic forecasting because data-driven models require fewer input variables than physically based models. Data-driven models also have rapid development times. Unlike physical and conceptual models, data-driven models are not difficult to implement for the purposes of real-time forecasting. Artificial neural networks (ANNs) have been used in several studies as a drought-forecasting tool [10–16]. The most popular type of ANN used for the purposes of drought forecasting is the multilayer perceptron (MLP), which is usually optimized with a back propagation algorithm. However, ANNs are limited in their ability to deal with nonstationarities in the data, a weakness also shared by multiple linear regression (MLR) and autoregressive integrated moving average (ARIMA) models.
This limitation with nonstationary data has led to the recent formation of hybrid models, where data is preprocessed for nonstationary characteristics and then run through a forecasting method such as ANNs to cope with the nonlinearity. Wavelet analysis, an effective tool to deal with nonstationary data, has recently been applied in hydrological forecasting to examine the rainfall-runoff relationship in a Karstic watershed [17], to characterize daily streamflow [18, 19] and monthly reservoir inflow [20], to evaluate rainfall-runoff models [21], to forecast river flow [22–24], to forecast future precipitation values [25], and for the purposes of drought forecasting [26]. The study conducted by Kim and Valdes [26] is the only study that has explored the ability of a wavelet-neural network conjunction model (WN) to forecast a given drought index. However, no study has yet assessed the ability of WN models to forecast the SPI drought index in particular.
Support Vector Machines (SVMs) are a relatively new form of machine learning that was developed by Vapnik [27]. The term SVM is used to refer to both classification and regression methods, while the terms Support Vector Classification (SVC) and Support Vector Regression (SVR) refer to the problems of classification and regression, respectively [28]. There are several studies where SVR was used in hydrological forecasting. Khan and Coulibaly [29] found that an SVR model was more effective at predicting 3–12 month lake water levels than ANN models. Rajasekaran et al. [30] used SVR successfully for storm surge predictions, and Kisi and Cimen [31, 32] used SVR to estimate daily evaporation and daily streamflow, respectively. Finally, SVR has been successfully used to predict hourly streamflow by Asefa et al. [33] and was shown to perform better than ANN and ARIMA models for monthly streamflow prediction by Wang et al. [34] and Maity et al. [35], respectively. Yuan and Tan [36] used SVR as a screening tool to test for drought resistance of rice. However, to date SVR has not been applied to forecast a given drought index.
This study compared the effectiveness of three data-driven models for forecasting drought conditions in the Awash River Basin of Ethiopia. The Standard Precipitation Index (SPI) was forecast and compared using artificial neural networks (ANNs), support vector regression (SVR), and wavelet networks (WN). SPI 3 and SPI 12 were forecast over lead times of 1 and 6 months. The forecast lead times were chosen because a 1-month lead time is a typical short-term lead time and a 6-month lead time is representative of the bimodal rainfall pattern in the Awash River Basin. Forecast results of this study are useful for the agricultural water management sector and have the potential to be applied by water resources managers to effectively manage water resources in the region. In addition, accurate forecasts using these data-driven models can complement the forecasts already being used by the NMSA of Ethiopia.
2. Theoretical Development
This section briefly describes the computation of the SPI and the data-driven models that were used to forecast it.
2.1. The Standard Precipitation Index (SPI)
The Standard Precipitation Index (SPI) was developed by McKee et al. [6]. As mentioned in the previous section, one of the main advantages of the SPI is that it only requires precipitation data as an input, which makes it ideal for areas where data collection is not as extensive (such as in Ethiopia). The fact that the SPI is based solely on precipitation makes its evaluation relatively easy [37]. The SPI is a standardized index. Standardization of a drought index ensures independence from geographical position as the index in question is calculated with respect to the average precipitation in the same place [37].
The computation of the SPI drought index for any location is based on the long-term precipitation record (at least 30 years) cumulated over a selected time scale [38]. This long-term precipitation time series is then fitted to a gamma distribution, which is then transformed through an equal probability transformation into a normal distribution [38, 39]. Positive SPI values indicate wet conditions with greater than median precipitation, and negative SPI values indicate dry conditions with lower than median precipitation [38]. Table 1 below indicates SPI drought classes.

In most cases, the probability distribution that best models observational precipitation data is the Gamma distribution [37]. The probability density function of the Gamma distribution is given by the expression [37]:

$$g(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, \qquad x > 0,$$

where $\alpha > 0$ is the shape parameter, $\beta > 0$ is the scale parameter, and $x > 0$ is the amount of precipitation. $\Gamma(\alpha)$ is the value taken by the standard mathematical function known as the Gamma function, which is defined by the integral [37]:

$$\Gamma(\alpha) = \int_{0}^{\infty} y^{\alpha-1} e^{-y}\, dy.$$

In general, the Gamma function is evaluated either numerically or from tabulated values, depending on the value taken by the parameter $\alpha$.
In order to model the observed data with a gamma density function, it is necessary to estimate the parameters $\alpha$ and $\beta$ appropriately. Different methods have been suggested in the literature for the estimation of these two parameters. For example, the Thom [40] approximation to the maximum likelihood estimates is used in Edwards and McKee [41]:

$$\hat{\alpha} = \frac{1}{4A}\left(1 + \sqrt{1 + \frac{4A}{3}}\right), \qquad \hat{\beta} = \frac{\bar{x}}{\hat{\alpha}},$$

where, for $n$ observations,

$$A = \ln(\bar{x}) - \frac{\sum \ln(x)}{n}.$$

The estimate of the parameters can be further improved by using the iterative approach suggested in Wilks [42].
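The Thom approximation for the two gamma parameters can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the study: the function name `thom_gamma_parameters` and the sample totals are hypothetical, and it is assumed that zero totals are excluded from the fit and accounted for separately through the probability of no precipitation.

```python
import math


def thom_gamma_parameters(precip_totals):
    """Return (alpha_hat, beta_hat) estimated from accumulated precipitation totals."""
    # Assumption: zero totals are left out of the gamma fit and handled
    # separately through the probability of no precipitation.
    positive = [x for x in precip_totals if x > 0]
    n = len(positive)
    mean = sum(positive) / n
    # A = ln(mean) - (sum of ln(x)) / n, positive unless all totals are identical.
    a = math.log(mean) - sum(math.log(x) for x in positive) / n
    if a <= 0:
        raise ValueError("totals must not all be identical")
    alpha_hat = (1.0 + math.sqrt(1.0 + 4.0 * a / 3.0)) / (4.0 * a)
    beta_hat = mean / alpha_hat
    return alpha_hat, beta_hat


# Hypothetical 3-month precipitation totals in mm.
alpha_hat, beta_hat = thom_gamma_parameters([12.0, 30.5, 0.0, 48.2, 75.0, 21.3])
```

By construction the product of the two estimates equals the sample mean of the nonzero totals, which gives a quick sanity check on any implementation.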
After estimating the coefficients $\hat{\alpha}$ and $\hat{\beta}$, the probability density function $g(x)$ is integrated with respect to $x$, which gives an expression for the cumulative probability $G(x)$ that a certain amount of rain has been observed for a given month and for a specific time scale [37]:

$$G(x) = \int_{0}^{x} g(x)\, dx = \frac{1}{\hat{\beta}^{\hat{\alpha}}\,\Gamma(\hat{\alpha})} \int_{0}^{x} x^{\hat{\alpha}-1} e^{-x/\hat{\beta}}\, dx.$$

The Gamma distribution is not defined for $x = 0$, and since there may be no precipitation, the cumulative probability becomes [37]

$$H(x) = q + (1 - q)\, G(x),$$

where $q$ is the probability of no precipitation and $H(x)$ is the cumulative probability of precipitation observed. The cumulative probability is then transformed into a standard normal distribution with zero mean and unit variance, from which we obtain the SPI index.
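The above approach, however, is neither practical nor numerically simple to use if there are many grid points or many stations on which to calculate the SPI index. In this case, an alternative method is described in Edwards and McKee [41] using the technique of approximate conversion developed in Abramowitz and Stegun [43] that converts the cumulative probability into a standard variable $Z$. The SPI index is then defined as

$$Z = \mathrm{SPI} = -\left(t - \frac{c_0 + c_1 t + c_2 t^2}{1 + d_1 t + d_2 t^2 + d_3 t^3}\right), \qquad 0 < H(x) \le 0.5,$$

$$Z = \mathrm{SPI} = +\left(t - \frac{c_0 + c_1 t + c_2 t^2}{1 + d_1 t + d_2 t^2 + d_3 t^3}\right), \qquad 0.5 < H(x) < 1,$$

where

$$t = \sqrt{\ln\left(\frac{1}{(H(x))^2}\right)}, \quad 0 < H(x) \le 0.5, \qquad t = \sqrt{\ln\left(\frac{1}{(1 - H(x))^2}\right)}, \quad 0.5 < H(x) < 1,$$

where $x$ is precipitation, $H(x)$ is the cumulative probability of precipitation observed, and $c_0$, $c_1$, $c_2$, $d_1$, $d_2$, $d_3$ are constants with the following values:

$$c_0 = 2.515517, \quad c_1 = 0.802853, \quad c_2 = 0.010328, \quad d_1 = 1.432788, \quad d_2 = 0.189269, \quad d_3 = 0.001308.$$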
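The conversion from cumulative probability to SPI can be sketched as follows. The constants are those of the Abramowitz and Stegun rational approximation for the inverse normal distribution; the function names are hypothetical and chosen only for illustration, and the cumulative gamma probability is assumed to be supplied by the caller.

```python
import math

# Abramowitz and Stegun constants for the rational approximation.
C0, C1, C2 = 2.515517, 0.802853, 0.010328
D1, D2, D3 = 1.432788, 0.189269, 0.001308


def mixed_cumulative_probability(g, q):
    """Combine the gamma cumulative probability g with the probability q of no precipitation."""
    return q + (1.0 - q) * g


def spi_from_cumulative_probability(h):
    """Approximate the standard normal variate Z (the SPI) for a cumulative probability h."""
    if not 0.0 < h < 1.0:
        raise ValueError("h must lie strictly between 0 and 1")
    # Work with the tail probability so that the same formula serves both halves.
    p = h if h <= 0.5 else 1.0 - h
    t = math.sqrt(math.log(1.0 / (p * p)))
    z = t - (C0 + C1 * t + C2 * t * t) / (1.0 + D1 * t + D2 * t * t + D3 * t ** 3)
    # Lower half of the distribution gives negative SPI (dry conditions).
    return -z if h <= 0.5 else z


h = mixed_cumulative_probability(g=0.10, q=0.05)
spi = spi_from_cumulative_probability(h)
```

The approximation is accurate to roughly three decimal places, which is ample for assigning SPI drought classes.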
2.2. Artificial Neural Networks (ANNs)
Artificial neural networks (ANNs) are flexible computing frameworks that resemble the structure of a nervous system. ANNs have been used to model a broad range of hydrologic time series over the past two decades. The main advantage of using ANNs is that there is no need to define the physical processes between the inputs and outputs [11]. This feature makes ANNs suitable for the purposes of drought forecasting, where all the variables that may cause a drought are not fully understood.
In this paper, the multilayer perceptron (MLP) feedforward network was used to forecast the SPI time series. Figure 1 is an illustration of a typical feedforward neural network. ANN models in this study were trained with the Levenberg-Marquardt (LM) back propagation algorithm. MLPs have been used extensively in hydrologic forecasting studies [10, 12, 23, 26, 44, 45] due to their simplicity. In terms of their architecture, MLPs consist of an input layer, one or more hidden layers, and an output layer. The hidden layer contains the neuron-like processing elements that connect the input and output layers, and the network output is given by [26]

$$\hat{y}_k(t) = f_o\left[\sum_{j=1}^{m} w_{kj}\, f_n\left(\sum_{i=1}^{N} w_{ji}\, x_i(t) + w_{j0}\right) + w_{k0}\right],$$

where $N$ is the number of input variables; $m$ is the number of hidden neurons; $x_i(t)$ = the ith input variable at time step $t$; $w_{ji}$ = weight that connects the ith neuron in the input layer and the jth neuron in the hidden layer; $w_{j0}$ = bias for the jth hidden neuron; $f_n$ = activation function of the hidden neuron; $w_{kj}$ = weight that connects the jth neuron in the hidden layer and kth neuron in the output layer; $w_{k0}$ = bias for the kth output neuron; $f_o$ = activation function for the output neuron; $\hat{y}_k(t)$ is the forecasted kth output at time step $t$ [26].
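The forward pass of a single-hidden-layer MLP of the kind described above can be sketched in plain Python. This is an illustrative sketch only: the function name `mlp_forward` and the weights are hypothetical, and a hyperbolic tangent hidden activation with a linear output activation is assumed, matching the configuration reported later for the ANN models.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    # Hidden neuron j: activation of the weighted input sum plus its bias.
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Output neuron k: weighted sum of hidden activations plus its bias.
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]


# Hypothetical network: 4 lagged SPI inputs, 2 hidden neurons, 1 output.
x = [0.2, -0.1, 0.4, 0.3]
w_hidden = [[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, -0.1]]
b_hidden = [0.0, 0.1]
w_out = [[0.7, -0.3]]
b_out = [0.05]
forecast = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
```

Training (for example with the Levenberg-Marquardt algorithm) would adjust the weights and biases; the sketch only shows how a forecast is produced once they are known.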
2.3. Support Vector Regression
Support vector machines (SVM) were developed by Vapnik [27] as a tool for classification and regression. SVMs embody the structural risk minimization principle, while neural networks embody the empirical risk minimization principle. In contrast to ANNs that seek to minimize training error, SVMs attempt to minimize the generalization error. SVMs have two components: support vector classification (SVC) and support vector regression (SVR). Since the main objective of this study is to forecast the SPI, the SVR was used.
Support vector regression (SVR) is used to describe regression with SVMs [27]. In regression estimation with SVR, the purpose is to estimate a functional dependency $f(x)$ between a set of sampled points $X = \{x_1, x_2, \ldots, x_N\}$ taken from $\mathbb{R}^n$ and target values $Y = \{y_1, y_2, \ldots, y_N\}$ with $y_i \in \mathbb{R}$ (the input and target vectors ($x_i$'s and $y_i$'s) refer to the monthly records of the SPI index). It is assumed that these samples have been generated independently from an unknown probability distribution function $P(x, y)$ and a class of functions [27]:

$$F = \{ f \mid f(x) = \langle W, x \rangle + B : \mathbb{R}^n \to \mathbb{R} \},$$

where $W$ and $B$ are coefficients that have to be estimated from the input data. The main objective is to find a function $f(x) \in F$ that minimizes a risk functional [46]:

$$R[f(x)] = \int l\big(y - f(x)\big)\, dP(x, y),$$

where $l$ is a loss function used to measure the deviation between the target, $y$, and estimate, $f(x)$, values. As the probability distribution function $P(x, y)$ is unknown, one cannot minimize the risk functional directly, but can only compute the empirical risk function as [46]

$$R_{\mathrm{emp}}[f(x)] = \frac{1}{N} \sum_{i=1}^{N} l\big(y_i - f(x_i)\big),$$

where $N$ is the number of samples. This traditional empirical risk minimization is not advisable without any means of structural control or regularization. To avoid this issue, a regularized risk function with the smallest steepness among the functions that minimize the empirical risk function can be used as [46]

$$R_{\mathrm{reg}}[f(x)] = R_{\mathrm{emp}}[f(x)] + \gamma \lVert W \rVert^2,$$

where $\gamma$ is a constant ($\gamma \ge 0$). This additional term reduces the model space and thereby controls the complexity of the solution, resulting in the following form of this expression [46, 47]:

$$R_{\mathrm{reg}}[f(x)] = C_c \sum_{i=1}^{N} l_{\varepsilon}\big(y_i - f(x_i)\big) + \frac{1}{2} \lVert W \rVert^2,$$

where $C_c$ is a positive constant that has to be selected beforehand. The constant $C_c$, which influences the trade-off between the approximation error and the norm of the regression (weight) vector $\lVert W \rVert$, is a design parameter. The loss function in this expression, which is called an $\varepsilon$-insensitive loss function ($l_{\varepsilon}$), has the advantage that it will not need all the input data for describing the regression vector and can be written as [46]

$$l_{\varepsilon}\big(y_i - f(x_i)\big) = \begin{cases} 0, & \lvert y_i - f(x_i) \rvert < \varepsilon, \\ \lvert y_i - f(x_i) \rvert - \varepsilon, & \text{otherwise}. \end{cases}$$

This function behaves as a biased estimator when it is combined with the regularization term ($\gamma \lVert W \rVert^2$). The loss is equal to 0 if the difference between the predicted and observed value is less than $\varepsilon$.
The nonlinear regression function is described by the following expression [27, 46, 48]:

$$f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^{*})\, K(x, x_i) + B,$$

where $\alpha_i$ and $\alpha_i^{*}$ are the Lagrange multipliers, $B$ is a bias term, and $K(x, x_i)$ is the kernel function, which is based upon Reproducing Kernel Hilbert Spaces [32]. The kernel function enables operations to be performed in the input space as opposed to the potentially high-dimensional feature space. Several types of functions are treated by SVR, such as polynomial functions, Gaussian radial basis functions, exponential radial basis functions, multilayer perceptron functions, and functions with splines [32].
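The insensitive loss and the regularized risk described above can be illustrated with a short Python sketch. The function names and the numbers are hypothetical; the regularized risk is written as a cost constant times the summed loss plus half the squared norm of the weight vector, which is the usual SVR formulation and is assumed here.

```python
def eps_insensitive_loss(residual, eps):
    """Zero inside the tube of half-width eps, linear outside it."""
    return max(0.0, abs(residual) - eps)


def regularized_risk(y_true, y_pred, weights, cost, eps):
    """Cost-weighted sum of insensitive losses plus half the squared weight norm."""
    data_term = cost * sum(
        eps_insensitive_loss(y - f, eps) for y, f in zip(y_true, y_pred)
    )
    complexity_term = 0.5 * sum(w * w for w in weights)
    return data_term + complexity_term


# Hypothetical observed and predicted SPI values.
observed = [0.50, -1.20, 0.10]
predicted = [0.45, -0.80, 0.30]
risk = regularized_risk(observed, predicted, weights=[0.6, -0.2], cost=10.0, eps=0.1)
```

Only the second and third points contribute to the data term, because the first residual (0.05) falls inside the tube of half-width 0.1.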
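Once the Lagrange multipliers and bias are known, evaluating the nonlinear regression function is a weighted sum of kernel values. The sketch below assumes a Gaussian radial basis function kernel of the form exp(-gamma * squared distance); the function names, support vectors, and coefficients are hypothetical and only illustrate the evaluation step, not the training.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel between two input vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate the regression function; dual_coefs[i] is the difference of the two multipliers."""
    return bias + sum(
        coef * rbf_kernel(x, sv, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


# Hypothetical trained model with two support vectors of lagged SPI values.
support_vectors = [[0.2, 0.4], [-0.5, -0.3]]
dual_coefs = [1.5, -0.8]
prediction = svr_predict([0.1, 0.3], support_vectors, dual_coefs, bias=0.05, gamma=0.5)
```

Because the kernel is evaluated directly on the inputs, the high-dimensional feature space never has to be constructed explicitly.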
2.4. Wavelet Transforms
Wavelet transforms are mathematical functions that can be used for the analysis of time series that contain nonstationarities. Wavelet transforms allow for the use of long time intervals for low frequency information and shorter intervals for high frequency information. They are capable of revealing aspects of data like trends, breakdown points, and discontinuities that other signal analysis techniques might miss [26]. Another advantage of wavelet analysis is the flexible choice of the mother wavelet according to the characteristics of the investigated time series [45].
An important step in the use of wavelet transforms is the choice of a mother wavelet ($\psi$). The continuous wavelet transform (CWT) is defined as the sum over all time of the signal multiplied by scaled and shifted versions of the wavelet function [26]:

$$W(\tau, s) = \frac{1}{\sqrt{\lvert s \rvert}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\left(\frac{t - \tau}{s}\right) dt,$$

where $s$ is the scale parameter, $\tau$ is the translation, and * corresponds to the complex conjugate [26]. The CWT produces a continuum of all scales as the output. Each scale corresponds to the width of the wavelet; hence, a larger scale means that more of a time series is used in the calculation of the coefficient than in smaller scales. The CWT is useful for processing different images and signals; however, it is not often used for forecasting because its computation is complex and time consuming. As an alternative, in forecasting applications, the discrete wavelet transform (DWT) is used, due to its simplicity and shorter computation time. DWT scales and positions are usually based on powers of two (dyadic scales and positions). This is achieved by modifying the wavelet representation to [49]

$$\psi_{j,k}(t) = \frac{1}{\sqrt{\lvert s_0^{j} \rvert}}\, \psi\left(\frac{t - k \tau_0 s_0^{j}}{s_0^{j}}\right),$$

where $j$ and $k$ are integers that control the scale and translation, respectively, while $s_0 > 1$ is a fixed dilation step and $\tau_0$ is a translation factor that depends on the aforementioned dilation step. The effect of discretizing the wavelet is that the time-scale space is now sampled at discrete levels. The DWT operates two sets of functions: high-pass and low-pass filters. The original time series is passed through high-pass and low-pass filters, and detail coefficients and approximation series are obtained.
One of the inherent challenges of using the DWT for forecasting applications is that if we change values at the beginning of our time series, all of the wavelet coefficients will subsequently change. To overcome this problem, a redundant algorithm known as the à trous algorithm can be used, given by [50]

$$c_{i+1}(k) = \sum_{l=-\infty}^{+\infty} h(l)\, c_i\left(k + 2^{i} l\right), \qquad (20)$$

where $h$ is the low pass filter and the finest scale is the original time series, $c_0(k) = x(k)$. To extract the details, $w_i(k)$, that were eliminated in (20), the smoothed version of the signal is subtracted from the coarser signal that preceded it, given by [51]

$$w_i(k) = c_{i-1}(k) - c_i(k), \qquad (21)$$

where $c_i(k)$ is the approximation of the signal and $c_{i-1}(k)$ is the coarser signal. Each application of (20) and (21) creates a smoother approximation and extracts a higher level of detail. Finally, the nonsymmetric Haar wavelet can be used as the low pass filter to prevent any future information from being used during the decomposition [52].
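The à trous decomposition with a nonsymmetric Haar low pass filter can be sketched as follows. This is a minimal illustration under stated assumptions: the filter is taken as the two-tap average of the current value and the value 2^i steps in the past, the first value is repeated at the start of the series where past values are missing, and the function name `a_trous_haar` and the sample series are hypothetical.

```python
def a_trous_haar(series, levels):
    """Return (approximation, details) from a causal a trous Haar decomposition."""
    approx = list(series)
    details = []
    for i in range(levels):
        step = 2 ** i
        # Causal Haar smoothing: average of the current value and the value
        # `step` positions earlier (assumption: repeat the first value at the edge).
        smoother = [
            0.5 * (approx[k] + approx[max(k - step, 0)])
            for k in range(len(approx))
        ]
        # Detail = previous (finer) approximation minus the new smoother one.
        details.append([c - s for c, s in zip(approx, smoother)])
        approx = smoother
    return approx, details


# Hypothetical SPI series decomposed into 3 detail series and 1 approximation.
spi_series = [0.3, -0.5, -1.2, -0.8, 0.1, 0.9, 1.4, 0.6]
approximation, detail_series = a_trous_haar(spi_series, levels=3)
reconstructed = [
    approximation[k] + sum(d[k] for d in detail_series)
    for k in range(len(spi_series))
]
```

Summing the approximation and all detail series recovers the original series, and because only current and past values enter each coefficient, appending new observations leaves earlier coefficients unchanged.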
3. The Awash River Basin
This study forecasted the SPI in the Awash River Basin of Ethiopia. The mean annual rainfall of the basin varies from about 1,600 mm in the highlands north east of Addis Ababa, to 160 mm in the northern point of the basin [53]. The total amount of rainfall also varies greatly from year to year, resulting in severe droughts in some years and flooding in others. The total annual surface runoff in the Awash Basin amounts to some m^{3} [54].
The Awash River Basin (Figure 2) was separated into three smaller basins for the purpose of this study on the basis of various factors such as location, altitude, climate, topography, and agricultural development. A study conducted by Edossa et al. [54] separated the Awash Basin in a similar fashion. The subbasins were called the Upper, Middle, and Lower Awash Basins, respectively. These three subbasins were used to ensure that the methods in this study were effective in forecasting short-term drought under different conditions. The characteristics of each subbasin are briefly described in the following sections.
3.1. Upper Awash Basin
The Upper Awash Basin has a temperate climate with annual mean temperatures ranging from 15 to 22°C and annual precipitation of 500 to 2000 mm [54]. Rainfall distribution in the Upper Awash Basin is unimodal. Seven rainfall gauges located in the Upper Awash River Basin were chosen for this study (Table 2). These stations were chosen because their precipitation records from 1970–2005 were either complete or relatively complete. Any station that had over 10% of its records missing was not selected.

3.2. Middle Awash Basin
The Middle Awash Basin is in the semiarid climatic zone with a long hot summer and a short mild winter. Annual rainfall varies between 200–1500 mm [54]. The rainfall distribution is bimodal in this subbasin. Minor rains normally occur in March and April and major rains from July to August. Eight rainfall gauges located in the Middle Awash Basin were selected using the same criteria as in the Upper Awash Basin and are shown in Table 2.
3.3. Lower Awash Basin
The Lower Awash River Basin has a hot, semiarid climate. The annual mean temperature of the region ranges between 22 and 32°C with average annual precipitation between 500 and 700 mm [54]. Five rainfall gauges were selected from the Lower Awash Basin using the same criteria used in the two other subbasins and are shown in Table 2.
4. Methodology
The methodology section of this paper describes how the SPI was calculated and then forecast over two separate lead times using ANN, WN, and SVR models.
4.1. SPI Calculation
In order to calculate the SPI, a probability density function that adequately describes the precipitation data must be determined. The gamma distribution function was selected to fit the raw rainfall data from each station in this study. The SPI is a z-score and represents an event departure from the mean, expressed in standard deviation units. The SPI is a normalized index in time and space. SPI values can be categorized according to classes. In this study, the near normal class is established from the aggregation of two classes: mild drought and slightly wet. The departure from the mean is a probability indication of the severity of the wetness or drought that can be used for risk assessment. The time series of the SPI can be used for drought monitoring by setting application-specific thresholds of the SPI for defining drought beginning and ending times. Accumulated values of the SPI can be used to analyze drought severity. In this study, the SPI_SL_6 program developed by the National Drought Mitigation Center, University of Nebraska-Lincoln, was used to compute time series of drought indices (SPI) for each station in the basin and for each month of the year at different time scales.
In each subbasin, for each station, SPI 3 and SPI 12 were computed. These SPI values were subsequently forecast over lead times of 1 and 6 months. A 3-month SPI compares the precipitation for that period with the same 3-month period over the historical record. For example, a 3-month SPI at the end of September compares the precipitation total for the July–September period with all the past totals for that same period. A 3-month SPI indicates short- and medium-term trends in precipitation and is considered to be more sensitive to conditions at this scale than the Palmer Index. A 3-month SPI can be very effective in showing seasonal trends in precipitation and is a good indicator of agricultural drought. SPI 12 reflects long-term precipitation patterns. SPI 12 is a comparison of the precipitation for 12 consecutive months with the same 12 consecutive months during all the previous years of available data and is a good indicator of long-term drought conditions. Because these time scales are the cumulative result of shorter periods that may be above or below normal, the longer SPIs tend toward zero unless a specific trend is taking place. Forecast lead times of 1 and 6 months were chosen because 1 month is the shortest possible monthly lead time and 6 months is representative of the bimodal rainfall pattern in parts of the Awash River Basin discussed in Section 3.2.
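The accumulation step behind SPI 3 and SPI 12 is a rolling sum of monthly precipitation. The sketch below is illustrative only: the function name `rolling_totals` and the rainfall values are hypothetical, and in a full SPI calculation each total would then be compared with the historical totals for the same calendar window.

```python
def rolling_totals(monthly_precip, scale):
    """Return precipitation totals over `scale` consecutive months, one per ending month."""
    return [
        sum(monthly_precip[end - scale + 1 : end + 1])
        for end in range(scale - 1, len(monthly_precip))
    ]


# Hypothetical monthly rainfall (mm) for January to June.
monthly = [10.0, 25.0, 60.0, 80.0, 30.0, 5.0]
three_month_totals = rolling_totals(monthly, scale=3)
# First total covers January to March, the last covers April to June.
```

The first `scale - 1` months produce no total, so an SPI 12 series starts eleven months later than the precipitation record itself.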
4.2. Wavelet Decomposition
In the proposed WN model, the SPI data for each of the rainfall stations was decomposed into subseries of approximations and details (DWs). The process consists of a number of successive filtering steps. The original SPI time series is first decomposed into an approximation and accompanying detail signal. The decomposition process is then iterated, with successive approximation signals being decomposed in turn. As a result the original SPI time series is broken down into many lower resolution components.
When conducting wavelet analysis, the number of decomposition levels that is appropriate for the data must be chosen. A commonly used method to determine the number of decomposition levels is based on the signal length [55] and is given by $L = \operatorname{int}[\log(N)]$, where $L$ is the level of decomposition and $N$ is the length of the signal. The training set in this study comprised between 1290 and 3017 samples (samples varied depending on the number of inputs for each rainfall station). Thus, with a base-10 logarithm, the decomposition level was selected as $L = 3$.
As discussed in Section 2.4, the à trous wavelet algorithm with a low pass Haar filter was used to create four sets of wavelet subseries. These four subseries included a low frequency component (the approximation) used to uncover the trend of each signal and a set of three high frequency components (the details) used to uncover the periodicity of the signal. All decomposed subseries were added together to generate one time series, which was used as an input to the ANN models. Using the sum of all the subseries as an input in this study provided more accurate results than using certain subseries or subseries that exhibited the highest correlations with the original time series.
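As a quick check of the signal-length rule, the sketch below assumes a base-10 logarithm, which is the reading consistent with the three detail series used in this study; the function name is hypothetical.

```python
import math


def decomposition_level(signal_length):
    """Integer part of the base-10 logarithm of the signal length."""
    return int(math.log10(signal_length))


# Smallest and largest training-set sizes reported in the text.
levels = [decomposition_level(n) for n in (1290, 3017)]
```

Both ends of the reported range of training-set sizes give the same level, so a single decomposition depth applies to every station.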
4.3. ANN Models
All the ANN models were created with the MATLAB () ANN toolbox. The hyperbolic tangent sigmoid transfer function was the activation function for the hidden layer, while the activation function for the output layer was a linear function. All the ANN models in this study were trained using the LM back propagation algorithm. The LM back propagation algorithm was chosen because of its efficiency and reduced computational time in training models [45].
In this study, there were between 4 and 8 input neurons for each ANN model. The optimal number of input neurons for each station was selected using a trial and error procedure. The data-driven models were recursive models: each model forecasts one time step ahead, and each subsequent forecast includes the output of the previous forecast as an input. Hence, a forecast of 6 months lead time uses the outputs from forecasts of lead times of 1–5 months. Recursive models were used because it was determined that it would be simpler to use an ANN with one output neuron. Mishra and Desai [10] compared recursive ANN models and ANN models with more than one output neuron (direct ANN models) and found the results to be comparable for forecasting the SPI. The inputs and outputs were normalized between 0 and 1. A study by Wanas et al. [56] empirically determined that the best performance of a neural network occurs when the number of hidden nodes is equal to $\log(N)$, where $N$ is the number of training samples. Another study conducted by Mishra and Desai [10] determined that the optimal number of hidden neurons is $2n + 1$, where $n$ is the number of input neurons. In this study the optimal number of hidden neurons was determined to be between $\log(N)$ and $(2n + 1)$. For example, if the method proposed by Wanas et al. [56] gave a result of 4 hidden neurons and the method proposed by Mishra and Desai [10] gave 6 hidden neurons, the optimal number of hidden neurons was taken to lie between 4 and 6 and was then determined by trial and error. These two methods helped establish a lower and an upper bound for the number of hidden neurons.
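The recursive forecasting scheme can be sketched independently of the model type. In the sketch below, `model` stands for any trained one-step-ahead forecaster that maps a window of recent values to the next value; the function name `recursive_forecast` and the toy model are hypothetical and serve only to show how earlier forecasts are fed back as inputs.

```python
def recursive_forecast(model, history, n_inputs, lead_time):
    """Forecast `lead_time` steps ahead by feeding each forecast back as an input."""
    window = list(history[-n_inputs:])
    forecasts = []
    for _ in range(lead_time):
        next_value = model(window)
        forecasts.append(next_value)
        # Drop the oldest input and append the newest forecast.
        window = window[1:] + [next_value]
    return forecasts


def toy_model(window):
    """Placeholder one-step model: mean of the input window."""
    return sum(window) / len(window)


# Hypothetical normalized SPI history; the last element of the result is
# the 6-month lead time forecast.
history = [0.40, 0.45, 0.50, 0.55, 0.60]
six_month_path = recursive_forecast(toy_model, history, n_inputs=4, lead_time=6)
```

A practical consequence of this design is that errors made at short lead times propagate into the longer lead time forecasts.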
For all the ANN models the cross validation technique [57] was used to partition the data sets; 80% of the data was used to train the models, while the remaining 20% of the data was used to test and validate the models, with 10% used for testing and 10% used for validation. The training set was used to compute the error gradient and to update the network weights and biases. The error from the validation set was used to monitor the training process. If the network overfits the data, the error in the validation set will begin to rise. When the validation error increases for a specified number of iterations, the training is stopped, and the weights and biases at the minimum of the validation error are returned. The testing data set is an independent data set and is used to verify the performance of the model.
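The partitioning and the validation-based stopping rule described above can be sketched as follows. The sketch assumes a chronological 80/10/10 split, which the text does not specify, and uses a hypothetical `patience` parameter for the number of consecutive non-improving iterations tolerated before training stops; both function names are illustrative.

```python
def split_80_10_10(samples):
    """Split samples into training, validation, and testing parts (assumed chronological)."""
    n = len(samples)
    n_train = n * 8 // 10
    n_val = n // 10
    return (
        samples[:n_train],
        samples[n_train:n_train + n_val],
        samples[n_train + n_val:],
    )


def early_stopping_epoch(validation_errors, patience):
    """Return the epoch whose weights are kept: the validation minimum seen before stopping."""
    best_epoch, best_error, bad_epochs = 0, float("inf"), 0
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error, bad_epochs = epoch, error, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # validation error has stopped improving
    return best_epoch


train, val, test = split_80_10_10(list(range(100)))
# Hypothetical validation errors recorded after each training iteration.
kept_epoch = early_stopping_epoch([5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 2.0], patience=3)
```

In the example, training stops after three consecutive increases, so the later, lower error at the last iteration is never reached and the weights from the third iteration are returned.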
4.4. WN Models
The WN models were trained in the same way as the ANN models, with the exception that the inputs were made up from the wavelet decomposed subseries. In this study, the significant wavelets (approximation and detail series) were summed together once the insignificant coefficients were excluded, similar to what was done by Partal [58] and Kisi and Cimen [32]. In this study, the summed subseries provided better results than using the individual wavelet coefficients as inputs.
For WN models, an input layer with 4–8 neurons, a single hidden layer composed of 4–6 neurons, and one output layer consisting of one neuron were developed. The number of neurons was determined in the same way as for the traditional ANN models. All the ANN models that had wavelet decomposed subseries as their inputs were also partitioned in a similar manner to the traditional ANN models.
4.5. SVR Models
All SVR models were developed using the OnlineSVR software created by Parrella [59]. OnlineSVR is a technique used to build support vector machines for regression. The OnlineSVR software partitions the data into only two sets: a training set and a testing set. The SVR models were partitioned in a similar manner to the ANN and WN models.
All SVR models used the nonlinear radial basis function (RBF) kernel. As a result, each SVR model consisted of three parameters that were selected: gamma ($\gamma$), cost ($C$), and epsilon ($\varepsilon$). The $\gamma$ parameter is a constant that reduces the model space and controls the complexity of the solution, $C$ is a positive constant that is a capacity control parameter, and $\varepsilon$ is the parameter of the loss function that describes the regression vector without all the input data [32]. These three parameters were selected based on a trial and error procedure. The combination of parameters that produced the lowest RMSE values for the training data sets was selected.
4.6. Performance Measures
The performance of the forecasts resulting from the data-driven models was evaluated by the following measures of goodness of fit:

$$R^2 = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}, \qquad \bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i,$$

where $\bar{y}$ is the mean value taken over $N$, $y_i$ is the observed value, $\hat{y}_i$ is the forecasted value, and $N$ is the number of data points. The coefficient of determination measures the degree of association among the observed and predicted values. The higher the value of $R^2$ (with 1 being the highest possible value), the better the performance of the model.

$$\mathrm{RMSE} = \sqrt{\frac{\mathrm{SSE}}{N}},$$

where SSE is the sum of squared errors and $N$ is the number of data points used. SSE is given by

$$\mathrm{SSE} = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2,$$

with the variables already having been defined. The RMSE evaluates the variance of errors independently of the sample size.

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|.$$

The MAE is used to measure how close forecasted values are to the observed values. It is the average of the absolute errors.
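The three measures can be computed as in the short sketch below. It assumes that the coefficient of determination is taken as the ratio of the explained sum of squares to the total sum of squares about the observed mean; the function names and sample values are illustrative.

```python
import math


def r_squared(observed, forecast):
    """Explained sum of squares over total sum of squares (assumed form)."""
    mean_obs = sum(observed) / len(observed)
    explained = sum((f - mean_obs) ** 2 for f in forecast)
    total = sum((o - mean_obs) ** 2 for o in observed)
    return explained / total


def rmse(observed, forecast):
    """Square root of the sum of squared errors divided by N."""
    sse = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(sse / len(observed))


def mae(observed, forecast):
    """Average of the absolute errors."""
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)


# Illustrative values only, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.5, 2.5, 3.5]
scores = (r_squared(observed, forecast),
          rmse(observed, forecast),
          mae(observed, forecast))
```

For these sample values every error has magnitude 0.5, so RMSE and MAE are both 0.5, while the assumed form of the coefficient of determination gives 0.4.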
5. Results and Discussion
For each subbasin of the Awash River Basin, the results for the station that showed the best performance for each data-driven model are presented below. In this study, SPI 3 and SPI 12 were forecast over lead times of 1 and 6 months to determine the effectiveness of the data-driven models over short- and long-term lead times.
As shown in Table 3(a), the best data-driven model in the Upper Awash Basin for forecasts of SPI 3 and SPI 12 is the WN model. All the models exhibited better results for forecasts of a 1-month lead time (L1) compared to forecasts of a 6-month lead time (L6). Forecasts of SPI 12, for all the data-driven models, had better performance results than forecasts of SPI 3 in terms of $R^2$, RMSE, and MAE, regardless of forecast lead time. The best 1-month lead time WN forecast of SPI 12 had results of 0.9534, 0.0600, and 0.0536 in terms of $R^2$, RMSE, and MAE, respectively. The second best results were from ANN models with results of 0.9451, 0.0610, and 0.0603 in terms of $R^2$, RMSE, and MAE, respectively. Figures 3 and 4 show the ANN and WN 1-month forecast results for SPI 12 at the Ejersalele station.
The performance of both these models is quite similar, as indicated by Figures 3 and 4. Both models adequately represent the periods of abundant and acute precipitation as indicated by the peaks and valleys in the figures.
Similar to the results for the Upper Awash Basin, the best forecast results in the Middle Awash Basin were from WN models. The WN models had the best results for both SPI 3 and SPI 12 at forecast lead times of both 1 and 6 months (Table 3(b)). The forecast results of all the data-driven models deteriorated when the forecast lead time was increased from 1 to 6 months.
Figure 5 illustrates the relationship between the observed SPI 12 and the predicted SPI 12 from the ANN model at the Nazereth station. The ANN model underestimates the severity of the drought period at 112 months. In contrast, the WN model for SPI 12 at the Nazereth station displays improved results with respect to the drought period at 112 months (Figure 6).
In the Lower Awash Basin, the forecast results exhibited the same trend shown in the Upper and Middle subbasins. The WN models had the best results for both SPI 3 and SPI 12 at forecast lead times of both 1 and 6 months. Figures 7 and 8 illustrate the best SPI 12 forecasts at the Dubti station, where both ANN and WN models predict the periods of abundant and acute precipitation quite well. When the forecast lead time was increased, the performance of all the models deteriorated, especially with respect to $R^2$. Data-driven models in the Upper and Lower Awash basins exhibited their best results for forecasts of SPI 12, indicating that data-driven models are more effective in predicting long-term drought conditions in those two basins. In the Middle Awash Basin, most models also exhibited their best results for forecasts of SPI 12; the exception was the WN models, which exhibited their best results for forecasts of SPI 3. This trend could be due to the fact that the long-term SPI, which is cumulative over short-term time scales, tends toward zero unless a specific trend is taking place. The exception regarding the WN models in the Middle Awash Basin may be due to the fact that the precipitation record at this station is relatively stable, meaning there are not many changes from one month to the next and the SPI 3 is not sensitive to those changes.
Overall, all three data-driven models forecast SPI 3 and SPI 12 well for forecast lead times of 1 and 6 months. The results indicate that ANN models are more effective than SVR models at forecasting in this study. The use of wavelet analysis improved the forecast results of ANN models, specifically in predicting extreme events as shown in Figure 6. Indeed, using a measure of peak relative error, it was determined that the relative error of the ANN model, 95%, was reduced to 88% when a WN model was used.
The fact that wavelet analysis is an effective tool at revealing local discontinuities helps explain why it was more effective in predicting the extreme events in the Middle Awash Basin. Wavelet analysis may help denoise the original SPI time series compared to a traditional ANN model. The forecast of this denoised signal may further explain the fact that extreme events are forecast better using wavelet analysis.
An increase in forecast lead time results in a deterioration of performance in all the models. However, this deterioration does not result in poor models, indicating the stability of these data-driven models in predicting the SPI. The results in terms of RMSE and MAE do not deteriorate drastically with an increase of lead time. For example, for the Dubti station, the RMSE and MAE of SVR models deteriorate by 0.05 and 0.26%, respectively.
There is variability with regard to the best forecasts of both SPI 3 and SPI 12 amongst the three subbasins. For example, the best forecast of SPI 3 at a 1-month lead time occurred in the Middle Awash Basin (WN model), while the best forecast of SPI 12 at a 1-month lead time occurred in the Upper Awash Basin (WN model). While each subbasin has a different climatology, there does not seem to be a clear trend linking climatology with forecast accuracy. The fact that the best model for each data-driven method occurs in a different subbasin seems to be linked with the characteristics of the individual station and not the characteristics of the subbasin as a whole.
In addition, the forecast results for SPI 12 are better than the forecast results for SPI 3 in almost all cases. For SPI 3 and other short-term SPI, each new month has a large impact on the period sum of precipitation [6]. As a result, the SPI 3 is sensitive to any change in precipitation from one month to another. In the case of SPI 12, each individual month has less impact on the total and the index is not as sensitive to changes in precipitation from one month to the next. Because SPI 3 is more sensitive to changes in precipitation, its forecasts are less accurate than those of SPI 12. However, the effects of wavelet analysis are more significant for SPI 3 than for SPI 12, especially for forecast lead times of 6 months. As stated previously, the ANN forecasts of SPI 12 are not as sensitive to changes in precipitation and thus good results are obtained. Wavelet analysis still improves these results, as shown, but the improvement is smaller than that seen in SPI 3 forecasts, because ANN forecasts of SPI 3 suffer from the sensitivity of SPI 3 to slight changes in precipitation over the long-term record.
All three subbasins had a different climatology. The forecast results have all shown that WN models are the most effective at forecasting the SPI in all the subbasins in terms of $R^2$, RMSE, and MAE. Whether this is the case in all climatic zones needs to be explored in future studies.
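The difference in sensitivity can be seen from the accumulated precipitation alone, as in the sketch below. It covers only the accumulation step that precedes the SPI calculation, not the SPI itself, and the precipitation values and function name are hypothetical.

```python
def relative_change_from_last_month(precip, window, new_value):
    """Relative change in the latest `window`-month precipitation total
    when the most recent month is replaced by `new_value`."""
    base = sum(precip[-window:])
    changed = base - precip[-1] + new_value
    return (changed - base) / base


# Hypothetical record: 50 mm in each of the last 12 months.
precip = [50.0] * 12

# Suppose the latest month had instead received 100 mm.
change_3 = relative_change_from_last_month(precip, window=3, new_value=100.0)
change_12 = relative_change_from_last_month(precip, window=12, new_value=100.0)
```

With these values the 3-month total rises from 150 mm to 200 mm, a change of about 33%, while the 12-month total rises from 600 mm to 650 mm, a change of about 8%. The same single month therefore moves the 3-month accumulation four times as much in relative terms.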
6. Conclusion
This study sought to determine the most effective data-driven model for forecasts of the SPI drought index in the Awash River Basin of Ethiopia. WN models were shown to be the most effective model for forecasts of SPI 3 and SPI 12 in all three subbasins. WN models showed greater correlation between observed and predicted SPI compared to simple ANNs and SVR models. WN models also consistently showed lower values of RMSE and MAE compared to the other data-driven models explored in this study. All the data-driven models showed improved forecast results for SPI 12 compared to SPI 3. Forecast results deteriorated as the forecast lead time increased for all the models. Of the two machine learning techniques, ANNs are more effective in forecasting the SPI compared to SVR models. This trend occurs in all three subbasins and should be studied in other regions to determine if ANNs are more effective tools for drought forecasting compared to SVR models. It is thought that WN models provide more accurate results because preprocessing the original SPI time series with wavelet decompositions “denoises” the data. Future studies should explore WSVR models and ensemble WN and WSVR models, and should examine SPI forecasts using these new methods in other regions with different characteristics. Future studies should also attempt to quantify time shift error, as it is a part of forecasting problems with regression models.
Acknowledgments
An NSERC Discovery Grant and a FQRNT New Researcher Grant held by Jan Adamowski were used to fund this research.
References
[1] E. Mersha and V. K. Boken, “Agricultural drought in Ethiopia,” in Monitoring and Predicting Agricultural Drought: A Global Study, V. K. Boken, A. P. Cracknell, and R. L. Heathcote, Eds., Oxford University Press, 2005.
[2] A. K. Mishra and V. P. Singh, “A review of drought concepts,” Journal of Hydrology, vol. 391, no. 1–2, pp. 202–216, 2010.
[3] T. Ross and N. Lott, “A climatology of 1980–2003 extreme weather and climate events,” National Climatic Data Center Technical Report No. 2003-01, NOAA/NESDIS, National Climatic Data Center, Asheville, NC, USA.
[4] A. Cancelliere, G. di Mauro, B. Bonaccorso, and G. Rossi, “Stochastic forecasting of drought indices,” in Methods and Tools for Drought Analysis and Management, G. Rossi, T. Vega, and B. Bonaccorso, Eds., Springer, 2007.
[5] W. J. Gibbs and J. V. Maher, Rainfall Deciles as Drought Indicators, vol. 48 of Bulletin (Commonwealth Bureau of Meteorology, Australia), Bureau of Meteorology, Melbourne, Australia, 1967.
[6] T. B. McKee, N. J. Doesken, and J. Kleist, “The relationship of drought frequency and duration to time scales,” in Proceedings of the 8th Conference on Applied Climatology, American Meteorological Society, Anaheim, Calif, USA, 1993.
[7] H. R. Byun and D. A. Wilhite, “Objective quantification of drought severity and duration,” Journal of Climate, vol. 12, no. 9, pp. 2747–2756, 1999.
[8] W. Palmer, “Meteorological drought,” Tech. Rep. 45, U.S. Weather Bureau, Washington, DC, USA, 1965.
[9] H. K. Ntale and T. Y. Gan, “Drought indices and their application to East Africa,” International Journal of Climatology, vol. 23, no. 11, pp. 1335–1357, 2003.
[10] A. K. Mishra and V. R. Desai, “Drought forecasting using feed-forward recursive neural network,” Ecological Modelling, vol. 198, no. 1–2, pp. 127–138, 2006.
[11] S. Morid, V. Smakhtin, and K. Bagherzadeh, “Drought forecasting using artificial neural networks and time series of drought indices,” International Journal of Climatology, vol. 27, no. 15, pp. 2103–2111, 2007.
[12] U. G. Bacanli, M. Firat, and F. Dikbas, “Adaptive Neuro-Fuzzy inference system for drought forecasting,” Stochastic Environmental Research and Risk Assessment, vol. 23, no. 8, pp. 1143–1154, 2009.
[13] A. P. Barros and G. J. Bowden, “Toward long-lead operational forecasts of drought: an experimental study in the Murray-Darling River Basin,” Journal of Hydrology, vol. 357, no. 3–4, pp. 349–367, 2008.
[14] P. Cutore, G. Di Mauro, and A. Cancelliere, “Forecasting Palmer index using neural networks and climatic indexes,” Journal of Hydrologic Engineering, vol. 14, no. 6, pp. 588–595, 2009.
[15] M. Karamouz, K. Rasouli, and S. Nazif, “Development of a hybrid index for drought prediction: case study,” Journal of Hydrologic Engineering, vol. 14, no. 6, pp. 617–627, 2009.
[16] A. F. Marj and A. M. J. Meijerink, “Agricultural drought forecasting using satellite images, climate indices and artificial neural network,” International Journal of Remote Sensing, vol. 32, no. 24, pp. 9707–9719, 2011.
[17] D. Labat, R. Ababou, and A. Mangin, “Wavelet analysis in karstic hydrology. 2nd part: rainfall-runoff cross-wavelet analysis,” Comptes Rendus de l'Academie de Sciences, vol. 329, no. 12, pp. 881–887, 1999.
[18] P. Saco and P. Kumar, “Coherent modes in multiscale variability of streamflow over the United States,” Water Resources Research, vol. 36, no. 4, pp. 1049–1067, 2000.
[19] L. C. Smith, D. L. Turcotte, and B. L. Isacks, “Stream flow characterization and feature detection using a discrete wavelet transform,” Hydrological Processes, vol. 12, no. 2, pp. 233–249, 1998.
[20] P. Coulibaly, F. Anctil, and B. Bobée, “Daily reservoir inflow forecasting using artificial neural networks with stopped training approach,” Journal of Hydrology, vol. 230, no. 3–4, pp. 244–257, 2000.
[21] S. N. Lane, “Assessment of rainfall-runoff models based upon wavelet analysis,” Hydrological Processes, vol. 21, no. 5, pp. 586–607, 2007.
[22] J. F. Adamowski, “Development of a short-term river flood forecasting method for snowmelt driven floods based on wavelet and cross-wavelet analysis,” Journal of Hydrology, vol. 353, no. 3–4, pp. 247–266, 2008.
[23] J. Adamowski and K. Sun, “Development of a coupled wavelet transform and neural network method for flow forecasting of non-perennial rivers in semi-arid watersheds,” Journal of Hydrology, vol. 390, no. 1–2, pp. 85–91, 2010.
[24] M. Özger, A. K. Mishra, and V. P. Singh, “Long lead time drought forecasting using a wavelet and fuzzy logic combination model: a case study in Texas,” Journal of Hydrometeorology, vol. 13, no. 1, pp. 284–297, 2012.
[25] T. Partal and Ö. Kişi, “Wavelet and neuro-fuzzy conjunction model for precipitation forecasting,” Journal of Hydrology, vol. 342, no. 1–2, pp. 199–212, 2007.
[26] T. W. Kim and J. B. Valdes, “Nonlinear model for drought forecasting based on a conjunction of wavelet transforms and neural networks,” Journal of Hydrologic Engineering, vol. 8, no. 6, pp. 319–328, 2003.
[27] V. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, NY, USA, 1995.
[28] J. B. Gao, S. R. Gunn, C. J. Harris, and M. Brown, “A probabilistic framework for SVM regression and error bar estimation,” Machine Learning, vol. 46, no. 1–3, pp. 71–89, 2002.
[29] M. S. Khan and P. Coulibaly, “Application of support vector machine in lake water level prediction,” Journal of Hydrologic Engineering, vol. 11, no. 3, pp. 199–205, 2006.
[30] S. Rajasekaran, S. Gayathri, and T.-L. Lee, “Support vector regression methodology for storm surge predictions,” Journal of Ocean Engineering, vol. 35, no. 16, pp. 1578–1587, 2008.
[31] O. Kisi and M. Cimen, “Evapotranspiration modelling using support vector machines,” Hydrological Sciences Journal, vol. 54, no. 5, pp. 918–928, 2009.
[32] O. Kisi and M. Cimen, “A wavelet-support vector machine conjunction model for monthly streamflow forecasting,” Journal of Hydrology, vol. 399, no. 1–2, pp. 132–140, 2011.
[33] T. Asefa, M. Kemblowski, M. McKee, and A. Khalil, “Multi-time scale stream flow predictions: the support vector machines approach,” Journal of Hydrology, vol. 318, no. 1–4, pp. 7–16, 2006.
[34] W. C. Wang, K. W. Chau, C. T. Cheng, and L. Qiu, “A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series,” Journal of Hydrology, vol. 374, no. 3–4, pp. 294–306, 2009.
[35] R. Maity, P. P. Bhagwat, and A. Bhatnagar, “Potential of support vector regression for prediction of monthly streamflow using endogenous property,” Hydrological Processes, vol. 24, no. 7, pp. 917–923, 2010.
[36] Z. M. Yuan and X. S. Tan, “Nonlinear screening indicators of drought resistance at seedling stage of rice based on support vector machine,” Acta Agronomica Sinica, vol. 36, no. 7, pp. 1176–1182, 2010.
[37] C. Cacciamani, A. Morgillo, S. Marchesi, and V. Pavan, “Monitoring and forecasting drought on a regional scale: Emilia-Romagna region,” Water Science and Technology Library, vol. 62, part 1, pp. 29–48, 2007.
[38] I. Bordi and A. Sutera, “Drought monitoring and forecasting at large-scale,” in Methods and Tools for Drought Analysis and Management, G. Rossi, T. Vega, and B. Bonaccorso, Eds., pp. 3–27, Springer, New York, NY, USA, 2007.
[39] N. B. Guttman, “Accepting the standardized precipitation index: a calculation algorithm,” Journal of the American Water Resources Association, vol. 35, no. 2, pp. 311–322, 1999.
[40] H. C. S. Thom, “A note on gamma distribution,” Monthly Weather Review, vol. 86, pp. 117–122, 1958.
[41] D. C. Edwards and T. B. McKee, “Characteristics of 20th century drought in the United States at multiple scales,” Atmospheric Science Paper 634, 1997.
[42] D. S. Wilks, Statistical Methods in the Atmospheric Sciences: An Introduction, Academic Press, San Diego, Calif, USA, 1995.
[43] M. Abramowitz and A. Stegun, Eds., Handbook of Mathematical Formulas, Graphs, and Mathematical Tables, Dover Publications, New York, NY, USA, 1965.
[44] S. Morid, V. Smakhtin, and M. Moghaddasi, “Comparison of seven meteorological indices for drought monitoring in Iran,” International Journal of Climatology, vol. 26, no. 7, pp. 971–985, 2006.
[45] J. Adamowski and H. F. Chan, “A wavelet neural network conjunction model for groundwater level forecasting,” Journal of Hydrology, vol. 407, no. 1–4, pp. 28–40, 2011.
[46] M. Çimen, “Estimation of daily suspended sediments using support vector machines,” Hydrological Sciences Journal, vol. 53, no. 3, pp. 656–666, 2008.
[47] A. J. Smola, Regression Estimation with Support Vector Learning Machines [M.S. thesis], Technische Universitat Munchen, Munich, Germany, 1996.
[48] S. Gunn, “Support vector machines for classification and regression,” ISIS Technical Report, Department of Electronics and Computer Science, University of Southampton, 1998.
[49] B. Cannas, A. Fanni, G. Sias, S. Tronci, and M. K. Zedda, “River flow forecasting using neural networks and wavelet analysis,” in Proceedings of the European Geosciences Union, 2006.
[50] S. G. Mallat, A Wavelet Tour of Signal Processing, Academic Press, San Diego, Calif, USA, 1998.
[51] F. Murtagh, J. L. Starck, and O. Renaud, “On neuro-wavelet modeling,” Decision Support Systems, vol. 37, no. 4, pp. 475–484, 2004.
[52] O. Renaud, J. Starck, and F. Murtagh, Wavelet-Based Forecasting of Short and Long Memory Time Series, Department of Economics, University of Geneve, 2002.
[53] C. E. Desalegn, M. S. Babel, A. Das Gupta, B. A. Seleshi, and D. Merrey, “Farmers' perception of water management under drought conditions in the upper Awash Basin, Ethiopia,” International Journal of Water Resources Development, vol. 22, no. 4, pp. 589–602, 2006.
[54] D. C. Edossa, M. S. Babel, and A. D. Gupta, “Drought analysis in the Awash River Basin, Ethiopia,” Water Resources Management, vol. 24, no. 7, pp. 1441–1460, 2010.
[55] M. K. Tiwari and C. Chatterjee, “Development of an accurate and reliable hourly flood forecasting model using wavelet-bootstrap-ANN (WBANN) hybrid approach,” Journal of Hydrology, vol. 394, no. 3–4, pp. 458–470, 2010.
[56] N. Wanas, G. Auda, M. S. Kamel, and F. Karray, “On the optimal number of hidden nodes in a neural network,” in Proceedings of the 11th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE '98), pp. 918–921, May 1998.
[57] J. C. Principe, N. R. Euliano, and W. Curt Lefebvre, Neural and Adaptive Systems, John Wiley & Sons, 2000.
[58] T. Partal, “Modelling evapotranspiration using discrete wavelet transform and neural networks,” Hydrological Processes, vol. 23, no. 25, pp. 3545–3555, 2009.
[59] F. Parrella, Online support vector regression [M.S. thesis], University of Genoa, 2007.
Copyright
Copyright © 2012 A. Belayneh and J. Adamowski. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.