Research Article | Open Access
Rana Muhammad Adnan, Xiaohui Yuan, Ozgur Kisi, Rabia Anam, "Improving Accuracy of River Flow Forecasting Using LSSVR with Gravitational Search Algorithm", Advances in Meteorology, vol. 2017, Article ID 2391621, 23 pages, 2017. https://doi.org/10.1155/2017/2391621
Improving Accuracy of River Flow Forecasting Using LSSVR with Gravitational Search Algorithm
River flow prediction is essential in many applications of water resources planning and management. In this paper, the accuracy of multivariate adaptive regression splines (MARS), model 5 regression tree (M5RT), and conventional multiple linear regression (CMLR) is compared with a hybrid least square support vector regression-gravitational search algorithm (HLGSA) in predicting monthly river flows. In the first part of the study, all four methods were compared with each other in predicting river flows of each basin. It was found that the HLGSA method performed better than the MARS, M5RT, and CMLR in river flow prediction. The effect of log transformation on prediction accuracy of the regression methods was examined in the second part of the study. Log transformation of the river flow data significantly increased the prediction accuracy of all regression methods. It was also found that log HLGSA (LHLGSA) performed better than the other regression methods. In the third part of the study, the accuracy of the LHLGSA and HLGSA methods was examined in river flow estimation using nearby river flow data. On the basis of the results of all applications, it was found that LHLGSA and HLGSA could be successfully used in prediction and estimation of river flow.
River flow forecasting plays a vital role in the planning of water projects, irrigation systems, and hydropower systems and in the optimized utilization of water resources. Owing to continuous growth in population, industrial use, and irrigation needs, river flow forecasting has received great attention from researchers for operational river management. Forecasting of river flow provides alerts of approaching floods and also assists in controlling reservoir outflows during low-flow days of the river. Floods affect countless lives, infrastructure, and property and cause more damage than any other natural disaster. In Pakistan in 2010, in the absence of any assessment of flood magnitude, a flood caused the loss of thousands of lives and millions of dollars of damage to agricultural land. It is not possible to provide complete safety from floods, but large amounts of money and many lives can be saved by providing accurate predictions of flood occurrence, magnitude, and duration. The importance of water measurement has compelled researchers to apply various types of forecasting methods to estimate and forecast river flows.
Since the last three decades of the previous century, statistical methods have been applied successfully in the field of hydrology, including river flow forecasting. Statistical methods try to find inherent relationships within the actual data. The autoregressive integrated moving average (ARIMA) and seasonal autoregressive integrated moving average (SARIMA) methods are the most popular in the statistical methods category and have been extensively used to model different variables in the field of hydrology [5–15]. Ahlert and Mehta and Kurunç et al. used ARIMA statistical models for modeling river flow data. Ahmad et al. and Mirzavand and Ghazavi applied ARIMA statistical methods to analyse water quality and groundwater data, respectively. Otok and Suhartono, Rabenja et al., and Valipour forecasted runoff data in Indonesia and USA, respectively, by applying the SARIMA model and compared it with ARIMA. Psilovikos and Elhag and Yang et al. applied ARIMA models successfully in modeling different processes of evaporation data. Mishra and Desai and Modarres used the SARIMA method efficiently for drought forecasting in India and Iran, respectively. In the previous two decades, artificial neural networks (ANN) have replaced the statistical methods in solving different problems owing to their flexible nature and their ability to capture nonlinearity in the data. In the literature, many researchers compared the ANN with statistical methods in solving many problems of hydrology and reported that ANN outperformed the statistical methods [16–21]. Huang et al. used the ANN method to forecast the river flows of the Apalachicola River, USA, using the previous rainfall and river flow data. They compared the quarterly and yearly river flow forecasts with the results of the ARIMA method and found that the ANN performed better than the ARIMA method in prediction of river flow.
A detailed discussion of all ANN applications in comparison with statistical methods to forecast different variables in hydrology is not possible in this paper. However, ANN also have some major weaknesses, namely, overfitting, falling into local minima, slow convergence, and the need for a large amount of training data. Thus, in the last decade, support vector regression (SVR) took priority over ANN due to its parallel distributed processing, its self-learning features, its avoidance of overfitting problems, and its globally optimal solutions [22–28]. Ahmad et al. applied an SVR model to forecast runoff of the Bakhtiyari Basin, Iran, and the results showed that SVR was more accurate than the ANN methods for daily runoff forecasting, especially in the prediction of higher values of river flows. However, SVR faces computational difficulties in determining the optimal solution because it relies on quadratic programming with nonlinear equations, which is time consuming.
Recently, least square support vector regression (LSSVR), an improved version of SVR, has received much attention in the field of prediction methods because it uses the least squares principle for the loss function instead of the quadratic programming of the SVR method and because of its fast computational speed [29–38]. Shabri and Suhartono and Kisi applied LSSVR successfully to forecast river flows. Shabri and Suhartono compared the prediction accuracy of LSSVR with ANN and multivariate linear regression methods whereas Kisi compared it with the adaptive neurofuzzy embedded fuzzy c-means clustering (ANFIS-FCM) method, and they both found that LSSVR performed better than the other methods. Kisi and Goyal et al. forecasted the reference evapotranspiration and pan evaporation by using the LSSVR method. Kisi compared its results with feed forward ANN whereas Goyal et al. compared it with ANN and ANFIS methods, and they found that LSSVR performed better than the other methods. Okkan and Serbes and Bhagwat and Maity successfully used the LSSVR method to forecast runoff data by using the previous meteorological and river flow data. Kisi estimated the suspended sediment by using the river flow data through the LSSVR method and reported that the LSSVR gave better estimates in comparison with ANN and sediment rating curve (SRC) methods. Hwang et al. predicted the daily water demand of Seoul City, Korea, and the daily mean inflow of Cheng-ju Dam by using an LSSVR model. They compared the predicted results of LSSVR with the conventional multiple linear regression (CMLR) and back propagation neural network methods in both cases and found that LSSVR was superior in prediction accuracy. Wu et al. and Mellit et al. applied the LSSVR method to predict different meteorological variables and found that LSSVR performed better than the ANN method. Motivated by these successful applications, the LSSVR method was selected as a forecasting method in this research.
The LSSVR method has parameters which play a vital role in determining the prediction accuracy of the method. Determining suitable values of these parameters will produce better river flow prediction results. Still, there is no specific way to determine optimal parameters for the LSSVR method in the literature of river flow forecasting. Thus, the novelty of this study is to generate a hybrid LSSVR-gravitational search algorithm (HLGSA) river flow forecasting method. In this study, GSA is used to find the optimal parameter values of the LSSVR method to increase its prediction accuracy. GSA was preferred in this study over other heuristic algorithms such as the simulated annealing algorithm, genetic algorithm, memetic algorithm, differential evolution, and particle swarm optimization because of their premature convergence, their parameter sensitivity, and the long time they need to obtain a global optimal solution. In contrast to these heuristic algorithms, GSA improves the global search ability and optimization speed by using the principle of gravity and motion. To the best knowledge of the authors, there is no published work in the literature that predicts the river flow using the hybrid LSSVR-gravitational search algorithm (HLGSA) method. Recently, researchers have preferred hybrid methods for solving different problems in the field of hydrology.
In addition to the HLGSA method, multivariate adaptive regression splines (MARS) is another popular regression method used to model the complex nonlinear relationships among the variables. MARS is a nonparametric regression method and it has been applied extensively nowadays in the field of hydrology to predict different variables [39–45]. To determine the benefits of using MARS over other conventional regression methods, MARS method was compared with CMLR and model 5 regression tree (M5RT) methods in this study. Cross validation (CV) technique was used to better see the prediction accuracy of all applied methods. Log transform function was also utilized in this study to see its effect on the prediction accuracy of these methods.
2. Hybrid LSSVR-Gravitation Search Algorithm (HLGSA) Method for River Flow Prediction
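LSSVR, introduced by Suykens and Vandewalle, is a modified version of SVR; its advantage over SVR is a less complex optimization process, because it solves linear equations instead of a quadratic programming problem. Figure 1 demonstrates the process of the LSSVR algorithm. By using time series inputs $x$ (lagged river flows) and output $y$ (predicted river flow), the function of nonlinear LSSVR is given as
$$f(x) = \langle w, \varphi(x) \rangle + b,$$
where $\langle \cdot, \cdot \rangle$ represents the dot product, $\varphi(x)$ is a nonlinear function that employs regression, and $w$ and $b$ are the weight vector and bias term, respectively. The cost function ($J$) of LSSVR can be minimized as
$$\min_{w, b, e} J(w, e) = \frac{1}{2} \langle w, w \rangle + \frac{\gamma}{2} \sum_{i=1}^{n} e_i^2 \quad \text{subject to} \quad y_i = \langle w, \varphi(x_i) \rangle + b + e_i, \quad i = 1, \ldots, n,$$
where $\gamma$ and $e_i$ represent the regularization constant and the training error for $x_i$, respectively.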
The solution of the above problem can be obtained by taking the partial derivatives of the Lagrange function and applying a kernel function (KF) that satisfies Mercer's condition. To solve regression problems, there are many types of KF including polynomial, radial basis, Gaussian, sigmoid, Mexican hat, Meyer, and Morlet. The KF type plays a vital role in constructing a highly accurate LSSVR model. This study used the radial basis KF (RBKF) due to its effectiveness for nonlinear regression problems. The performance of the RBKF in comparison with other KFs is shown in Section 6. The RBKF can be expressed as
$$K(x, x_i) = \exp\left(-\frac{\lVert x - x_i \rVert^2}{2\sigma^2}\right).$$
After selecting the RBKF for the LSSVR method, it is necessary to find proper values for the penalty factor parameter $\gamma$ and the RBKF parameter $\sigma$. There is no specific way to obtain the optimal values of these parameters. For this reason, GSA is adopted in the study to calculate suitable parameter values.
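To make the "linear equations instead of quadratic programming" point concrete, the following is a minimal NumPy sketch of an RBF-kernel LSSVR in the standard dual form, where the bias `b` and the multipliers `alpha` come from one linear solve. It is an illustration under stated assumptions, not the authors' implementation: the function names, the kernel scaling `2 * sigma**2`, the toy sine data, and the values `gamma=100` and `sigma=1` are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """K(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for every pair of rows."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvr_fit(X, y, gamma, sigma):
    """Solve the bordered linear system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.ones((n + 1, n + 1))
    A[0, 0] = 0.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvr_predict(X_train, b, alpha, sigma, X_new):
    """f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Toy data (assumption): one smooth cycle standing in for a river flow series.
X = np.linspace(0.0, 2.0 * np.pi, 30)[:, None]
y = np.sin(X[:, 0])
gamma, sigma = 100.0, 1.0
b, alpha = lssvr_fit(X, y, gamma, sigma)
pred = lssvr_predict(X, b, alpha, sigma, X)
```

In this formulation the training error of each sample equals `alpha_i / gamma`, which shows directly how a larger penalty factor forces a closer fit to the training data.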
2.2. Gravitational Search Algorithm (GSA)
GSA is one of the effective optimization algorithms compared with other evolutionary algorithms. It is based on the law of gravity and motion and was first proposed by Rashedi et al. In GSA, each agent has four parameters: position, velocity, inertial mass, and gravitational mass. The location of the agent corresponds to a solution of the problem whereas its gravitational and inertial masses are obtained utilizing a fitness function. The location of agent $i$ can be expressed as
$$X_i = (x_i^1, \ldots, x_i^k, \ldots, x_i^n), \quad i = 1, 2, \ldots, N,$$
where $x_i^k$ represents the location of the $i$th agent in the $k$th dimension. The mass of each agent is computed after calculating the fitness of the current population as [52, 54]
$$m_i(t) = \frac{fit_i(t) - worst(t)}{best(t) - worst(t)}, \qquad M_i(t) = \frac{m_i(t)}{\sum_{j=1}^{N} m_j(t)},$$
where $fit_i(t)$ and $M_i(t)$ represent the fitness value and mass of the $i$th agent at time $t$, respectively, whereas $best(t)$ and $worst(t)$ represent the minimum fitness value and maximum fitness value, respectively.
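To calculate the gravitational acceleration of agent $i$, firstly the force exerted on this agent by a heavier agent $j$ should be computed as
$$F_{ij}^k(t) = G(t) \frac{M_{pi}(t) \, M_{aj}(t)}{R_{ij}(t) + \varepsilon} \left( x_j^k(t) - x_i^k(t) \right),$$
where $M_{pi}(t)$ and $M_{aj}(t)$ are the passive and active gravitational mass, respectively, corresponding to agents $i$ and $j$ at generation $t$, $G(t)$ and $\varepsilon$ are the gravitational constant and a small constant, $x_i^k(t)$ and $x_j^k(t)$ indicate the position in the $k$th dimension of agents $i$ and $j$ at generation $t$, and $R_{ij}(t)$ is the Euclidean distance between agents $i$ and $j$.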
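The total gravitational force exerted on the $i$th agent in the $k$th dimension is a randomly weighted sum of the pairwise forces $F_{ij}^k(t)$ exerted by the other agents, and the gravitational acceleration of the agent then follows from the law of motion:
$$F_i^k(t) = \sum_{j=1, \, j \neq i}^{N} \mathrm{rand}_j \, F_{ij}^k(t), \qquad a_i^k(t) = \frac{F_i^k(t)}{M_{ii}(t)},$$
where $a_i^k(t)$ represents the gravitational acceleration of agent $i$ in the $k$th dimension, $M_{ii}(t)$ is its inertial mass, and $\mathrm{rand}_j$ indicates a random variable with uniform distribution in the interval $[0, 1]$. Then the speed and location of the agent are updated as follows:
$$v_i^k(t+1) = \mathrm{rand}_i \, v_i^k(t) + a_i^k(t), \qquad x_i^k(t+1) = x_i^k(t) + v_i^k(t+1).$$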
This brief description shows that GSA uses the gravitational force directly as the means by which the agents cooperate. Heavy agents correspond to good solutions and move more slowly than lighter ones, which guarantees the exploitation step of the algorithm. In other words, GSA searches for the ideal solution by appropriately calibrating the inertial and gravitational masses of the agents, where every agent represents a solution. As time progresses, the heaviest agent comes to represent an ideal solution in the search space.
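The mass, force, acceleration, and update steps described in this section can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' code: it assumes that the passive and inertial masses are equal (so they cancel in the acceleration), that all agents attract each other rather than only the best few, that the gravitational constant decays as `g0 * exp(-alpha * t / n_iter)`, and that positions are clipped to the search box. The function name `gsa_minimize` and the sphere test function are hypothetical.

```python
import numpy as np

def gsa_minimize(f, lower, upper, n_agents=20, n_iter=50, g0=100.0, alpha=20.0, seed=0):
    """Minimal gravitational search for minimizing f over a box (illustrative)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    x = rng.uniform(lower, upper, size=(n_agents, lower.size))  # agent positions
    v = np.zeros_like(x)                                         # agent velocities
    eps = 1e-12
    best_x, best_f = x[0].copy(), np.inf
    for t in range(n_iter):
        fit = np.array([f(p) for p in x])
        if fit.min() < best_f:                    # remember the best agent seen so far
            best_f = float(fit.min())
            best_x = x[fit.argmin()].copy()
        # Masses: the lowest (best) fitness receives the largest mass.
        span = fit.min() - fit.max()
        m = (fit - fit.max()) / span if span != 0 else np.ones(n_agents)
        M = m / m.sum()
        G = g0 * np.exp(-alpha * t / n_iter)      # decaying gravitational constant
        diff = x[None, :, :] - x[:, None, :]      # diff[i, j] = x_j - x_i
        dist = np.linalg.norm(diff, axis=2)       # Euclidean distance R_ij
        w = rng.random((n_agents, n_agents)) * G * M[None, :] / (dist + eps)
        np.fill_diagonal(w, 0.0)                  # an agent does not attract itself
        a = (w[:, :, None] * diff).sum(axis=1)    # acceleration of each agent
        v = rng.random((n_agents, 1)) * v + a     # velocity update
        x = np.clip(x + v, lower, upper)          # position update, kept inside the box
    return best_x, best_f

def sphere(p):
    """Toy objective with its minimum of 0 at the origin."""
    return float(np.sum(np.asarray(p) ** 2))

best_x, best_f = gsa_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])
```

Because the best solution is tracked across iterations, the returned fitness can never be worse than the best agent of the initial random population.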
2.3. HLGSA (Hybrid LSSVR-GSA)
The process of constructing the river flow prediction model HLGSA by using the hybrid of LSSVR and GSA methods is described in this section and shown in Figure 2. The process is as follows:
(i) Firstly, divide all river flow data sets into training and test parts.
(ii) Select the RBF kernel function and initial parameters for the HLGSA method to build the initial LSSVR model. The initial value of the parameters is set as follows: the range of penalty factor is 0.1 to 2000, the range of RBF parameter is 0.001 to 20, number of iterations is 15, the number of particles can be set up to 40, and constant alpha is found to be better in range of 16 to 20, whereas initial gravitational constant is found to be better in range from 105 to 115.
(iii) Compute the particle fitness value of each agent. In this paper, is selected as the fitness function. The fitness function for this method can be defined as
(iv) Choose the best parameters combination through GSA to obtain the optimal values of the LSSVR parameters.
(v) If it does not meet the stopping criterion, then utilize the new combination of parameters to reconstruct the LSSVR. Compute the fitness until it suits the stopping criterion.
(vi) The ideal parameter values are achieved to build the optimal LSSVR model for forecasting river flow. Now, the testing values are used for the optimal LSSVR model to get river flow prediction results.
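The tuning loop in steps (ii) to (vi) can be sketched as follows. Two parts of this sketch are assumptions rather than the authors' choices. First, the fitness is taken to be the RMSE on a hold-out part of the training data, because the fitness function of the paper is not recoverable from the text. Second, the candidate generator is a plain log-uniform random search standing in for GSA, so the example stays short. The parameter ranges (0.1 to 2000 and 0.001 to 20), the 40 candidates, and the 15 iterations follow step (ii); all names and the synthetic seasonal series are hypothetical.

```python
import numpy as np

def lssvr_rmse(gamma, sigma, X_tr, y_tr, X_va, y_va):
    """Fitness of one candidate: hold-out RMSE of an RBF-kernel LSSVR (assumed fitness)."""
    def kernel(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    n = len(y_tr)
    A = np.ones((n + 1, n + 1))
    A[0, 0] = 0.0
    A[1:, 1:] = kernel(X_tr, X_tr) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y_tr)))
    pred = kernel(X_va, X_tr) @ sol[1:] + sol[0]
    return float(np.sqrt(np.mean((y_va - pred) ** 2)))

def tune_lssvr(X_tr, y_tr, X_va, y_va, n_agents=40, n_iter=15, seed=0):
    """Random-search stand-in for GSA over (gamma, sigma) inside the stated ranges."""
    rng = np.random.default_rng(seed)
    low = np.log([0.1, 0.001])
    high = np.log([2000.0, 20.0])
    best_params, best_fit = None, np.inf
    for _ in range(n_iter):
        candidates = np.exp(rng.uniform(low, high, size=(n_agents, 2)))
        for gamma, sigma in candidates:
            fit = lssvr_rmse(gamma, sigma, X_tr, y_tr, X_va, y_va)
            if fit < best_fit:                       # keep the best combination so far
                best_params, best_fit = (float(gamma), float(sigma)), fit
    return best_params, best_fit

# Synthetic seasonal series (assumption) with two lagged inputs.
t = np.arange(120)
q = 100.0 + 50.0 * np.sin(2.0 * np.pi * t / 12.0)
X = np.column_stack([q[1:-1], q[:-2]])
y = q[2:]
X_tr, y_tr, X_va, y_va = X[:90], y[:90], X[90:], y[90:]
best_params, best_fit = tune_lssvr(X_tr, y_tr, X_va, y_va)
```

Replacing the random candidate generator with a GSA position update would leave `lssvr_rmse` unchanged, since the optimizer only needs a fitness value for each (gamma, sigma) pair.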
3. Regression Methods
In this study, the performance accuracy of a hybrid nonlinear optimized regression method (HLGSA) was compared with a nonlinear, nonparametric regression method (MARS), with a piecewise linear regression method (M5RT), and with a conventional linear regression method (CMLR) in forecasting monthly river flow.
3.1. Multivariate Adaptive Regression Splines
MARS is a flexible method which finds relationships that are nearly additive or involve interactions with fewer parameters. The general MARS method was introduced by Friedman and is expressed by the following equation:
$$\hat{y} = a_0 + \sum_{m=1}^{M} a_m \prod_{k=1}^{K_m} \left[ s_{km} \left( x_{v(k,m)} - t_{km} \right) \right]_{+},$$
where $\hat{y}$ is the river flow forecasted by MARS, that is, the dependent variable, $a_0$ is a constant, $a_m$ are the model coefficients calibrated to provide the best fit to the used data, $M$ is the number of basis functions (BFs), $K_m$ is the number of "splits" that generates the $m$th BF, $s_{km}$ takes values of 1 or −1 and represents the (right/left) sense of the associated step function, $v(k,m)$ is the label of the independent variable, and $t_{km}$ is the knot location.
MARS builds its optimal model in two steps. In the first step, MARS develops a large number of BFs, deliberately overfitting the data; variables are permitted to enter as continuous, categorical, or ordinal (the formal system by which variable ranges are characterized), and they can interact with each other or be restricted to enter only as additive components. In the second step, BFs are removed in order of least effect using the generalized cross validation (GCV) criterion. A measure of variable significance can then be evaluated by observing the decrease in the computed GCV when a parameter is excluded from the model. This procedure continues until all remaining BFs satisfy the predefined requirements. The GCV can be computed as
$$\mathrm{GCV}(M) = \frac{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\left[ 1 - \frac{C(M)}{N} \right]^2},$$
where $y_i$ and $\hat{y}_i$ are the actual and predicted river flow values, $N$ is the number of observations, and $C(M)$ is a complexity penalty function.
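As a small illustration of the two ingredients above, the sketch below implements a single hinge basis function and the GCV score. It assumes the standard Friedman form of both; the function names are hypothetical, and the complexity penalty is simply passed in as a number because its exact definition is not given in the text.

```python
import numpy as np

def hinge(x, knot, sign=1):
    """One MARS basis function: max(0, sign * (x - knot)), with sign = +1 or -1."""
    return np.maximum(0.0, sign * (np.asarray(x, dtype=float) - knot))

def gcv(y_true, y_pred, complexity):
    """Generalized cross validation: mean squared error inflated by a complexity penalty."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    mse = np.mean((y_true - y_pred) ** 2)
    return mse / (1.0 - complexity / n) ** 2

# Right and left hinges around a knot at 2.
right = hinge([1.0, 3.0], knot=2.0, sign=1)    # active only above the knot
left = hinge([1.0, 3.0], knot=2.0, sign=-1)    # active only below the knot

# Same errors, but a penalty of 2 on 4 samples quadruples the score.
score = gcv([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0], complexity=2.0)
```

During the backward step, the BF whose removal gives the lowest GCV is dropped, so a model with slightly larger error but far fewer BFs can still win.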
After building the MARS model, the relative importance of a variable in terms of its contribution to the fit of the model can be estimated. MARS is capable of tracking very complex data structures, so it was selected in this study for modeling river flow time series.
3.2. Model 5 Regression Tree (M5RT)
In decision tree (DT), each branch node indicates a choice between a number of alternatives and a decision is made in every leaf node . Regression trees (RT) are applied to solve those forecasting problems having numeric response variable. They are different from the DT only in that they involve a numeric value rather than a class label combined with the leaves . The M5RT method combines the features of DT and RT methods because the construction of the M5RT is similar to the DT but, instead of the class labels, it has linear regression functions at the leaves. The M5RT is a piecewise linear method that was introduced by Quinlan  and has many successful applications in the field of water resources [41, 62–68] that compelled the authors to use M5RT method in this paper for river flow prediction.
The division criterion for the M5RT method is based on treating the standard deviation of the class values that reach a node as an error measure and computing the expected reduction in this error as a consequence of testing each attribute at that node. The standard deviation reduction (SDR) is computed by
$$\mathrm{SDR} = sd(T) - \sum_{i} \frac{|T_i|}{|T|} \, sd(T_i),$$
where $T$ stands for the set of samples that enters the node, $T_i$ indicates the subset of samples that have the $i$th output of the potential set, and $sd$ is the standard deviation.
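The standard deviation reduction can be checked by hand on a tiny example. The sketch below is illustrative only; it assumes the population standard deviation (NumPy's default), and the function name `sdr` is hypothetical.

```python
import numpy as np

def sdr(parent, subsets):
    """Standard deviation reduction of splitting `parent` into `subsets`."""
    parent = np.asarray(parent, dtype=float)
    weighted = sum(len(s) / len(parent) * np.std(np.asarray(s, dtype=float)) for s in subsets)
    return float(np.std(parent) - weighted)

flows = [1.0, 1.0, 5.0, 5.0]
good_split = sdr(flows, [[1.0, 1.0], [5.0, 5.0]])   # separates low and high flows
poor_split = sdr(flows, [[1.0, 5.0], [1.0, 5.0]])   # leaves both subsets as mixed as the parent
```

The split that groups similar flows removes all of the spread and therefore has the larger SDR, which is the split M5RT would choose at this node.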
3.3. Conventional Multiple Linear Regression (CMLR)
The multiple linear regression methods forecast values of a dependent variable based on independent variables ($x_1, x_2, \ldots, x_n$). Two main advantages of the CMLR are that it has a simple structure and that it is included in many statistical packages. In this study, after determining the independent lagged river flow values for the dependent river flows of both basins, the CMLR can be constructed as follows:
$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_n x_n,$$
where $y$ is the dependent variable, $a_0$–$a_n$ are the equation parameters for the linear relation, and $x_1, \ldots, x_n$ are the independent lagged river flow values used to forecast river flow. However, CMLRs have some disadvantages in predicting nonlinear situations because of their linear structure.
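A least-squares fit of the linear relation above takes only a few lines. This is a minimal sketch with hypothetical function names and made-up numbers, using NumPy's `lstsq` rather than any particular statistical package.

```python
import numpy as np

def fit_cmlr(X, y):
    """Least-squares estimate of [a0, a1, ..., an] for y = a0 + a1*x1 + ... + an*xn."""
    A = np.column_stack([np.ones(len(X)), X])   # column of ones carries the intercept a0
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_cmlr(coef, X):
    """Apply the fitted linear relation to new inputs."""
    return coef[0] + np.asarray(X, dtype=float) @ coef[1:]

# Made-up inputs generated from y = 2 + 3*x1 - x2, so the fit should recover (2, 3, -1).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
y = 2.0 + 3.0 * X[:, 0] - X[:, 1]
coef = fit_cmlr(X, y)
forecast = predict_cmlr(coef, np.array([[5.0, 4.0]]))
```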
4. Study Sites and Data Preprocessing
The study used river flow data from two catchments, Astore and Shyok, in the Upper Indus Basin of Pakistan. Figure 3 shows the location map of the catchments. The Astore Basin lies approximately between longitudes 74°24′ and 75°14′E and between latitudes 34°45′ and 35°38′N. The river covers a catchment area of about 3750 km2. The Water and Power Development Authority (WAPDA), Pakistan, has one flow gauging station in this area, at Doyian, which records flow under the Surface Water Hydrology Project (SWHP). The elevation of this gauging station is 1583 masl and its geographical location in the basin is 35°33′N latitude and 74°42′E longitude. The Shyok Basin covers a drainage area of 68,458 km2 with an average basin elevation of 4940 m. WAPDA also installed one flow gauging station in this area, at Yogo, with an elevation of 2469 m; its geographical location in the catchment is 35°11′N latitude and 76°06′E longitude. The recorded monthly river flow data of both catchments were collected through WAPDA for the period 1975–2006, a total time span of 384 months. According to WAPDA, the mean annual river flow over the 32 yr (1975–2006) record is 142 m3/s for the Astore catchment whereas, for the Shyok catchment, it is 457 m3/s.
In this research, a cross validation (CV) technique was applied to better assess the prediction accuracy of the applied methods. In the CV technique, the whole data is divided into $k$ equal data sets (DS); $k - 1$ DS are used to train the method, and the remaining one DS is used to test its accuracy. This process is repeated $k$ times until every DS of the data has been used to test the applied method. This CV technique is preferred over -fold cross validation techniques because every data set is used for testing, which makes it closer to the real world problem [69, 72]. In this research, the whole river flow data was divided into four equal DS. In all the applications, three DS were used to train the method and the remaining one DS was used to test it. This procedure was repeated four times until every DS of data had been used to test the method. The monthly river flow time series statistics of the Astore and Shyok catchments are reported in Table 1. Here, DS1, DS2, DS3, and DS4 represent the four equal data sets of the whole data for CV analysis, and the listed statistics are the mean, standard deviation, skewness coefficient, minimum, and maximum river flows. The recorded monthly river flow data show a similarly high positively skewed distribution for the Astore and Shyok catchments ( and 1.88). However, the range of the flow data of the Shyok Basin (36.7–2080.7 m3/s) is much wider than that of the Astore Basin (19.3–654.9 m3/s). The lagged values of river flows show low persistence (e.g., Lag 1 = 0.735, Lag 2 = 0.266, and Lag 3 = −0.152). However, the lagged values of the Astore Basin give slightly better persistence than those of the Shyok Basin.
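The four-way split can be sketched as an index generator. The example assumes that DS1 to DS4 are contiguous, chronologically ordered blocks of the 384 monthly values, which the text suggests but does not state explicitly; the function name is hypothetical.

```python
import numpy as np

def block_cv_splits(n_samples, n_blocks=4):
    """Yield (train_idx, test_idx) pairs, holding out one contiguous block at a time."""
    blocks = np.array_split(np.arange(n_samples), n_blocks)
    for i, test_idx in enumerate(blocks):
        train_idx = np.concatenate([b for j, b in enumerate(blocks) if j != i])
        yield train_idx, test_idx

# 384 monthly values (1975-2006) give four blocks of 96 months each.
splits = list(block_cv_splits(384))
```

Each method is then trained on `train_idx` and scored on `test_idx` four times, so every month appears in a test set exactly once.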
5. Input Combination Selection and Performance Evaluation Criteria
Input combination (IC) selection is an important step in model development and plays a key role in increasing the accuracy of the model. The autocorrelation function is generally used to examine the correlation of the lagged values and to determine the number of effective lagged input values. For river flow forecasting, previous lagged river flow values are generally taken as IC in many studies [37, 73, 74]. In this research, the effect of lagged river flow values of both catchments was examined through the autocorrelation function and is reported in Figure 4. According to the analysis, the following three IC were selected as inputs on the basis of the most significant lagged river flow values for both basins, where $Q_{t-j}$ denotes the river flow $j$ months before the forecast month: (i) $Q_{t-1}$, $Q_{t-2}$; (ii) $Q_{t-1}$, $Q_{t-2}$, $Q_{t-11}$; and (iii) $Q_{t-1}$, $Q_{t-2}$, $Q_{t-11}$, $Q_{t-12}$.
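The lag analysis and the construction of the input matrices can be sketched as follows. The lag sets (1, 2), (1, 2, 11), and (1, 2, 11, 12) follow the description of the three input combinations in the results section; the function names and the synthetic 12-month sine series are assumptions made for illustration.

```python
import numpy as np

# Lags (in months) used by each input combination.
INPUT_COMBINATIONS = {"IC1": (1, 2), "IC2": (1, 2, 11), "IC3": (1, 2, 11, 12)}

def autocorr(q, lag):
    """Sample autocorrelation of the series q at a positive lag."""
    d = np.asarray(q, dtype=float)
    d = d - d.mean()
    return float((d[:-lag] * d[lag:]).sum() / (d * d).sum())

def make_lagged(q, lags):
    """Build inputs X (one column per lag) and the aligned target y = Q_t."""
    q = np.asarray(q, dtype=float)
    max_lag = max(lags)
    X = np.column_stack([q[max_lag - lag: len(q) - lag] for lag in lags])
    y = q[max_lag:]
    return X, y

# Synthetic series with a 12-month cycle (assumption).
t = np.arange(120)
q = np.sin(2.0 * np.pi * t / 12.0)
acf_12 = autocorr(q, 12)   # strong positive correlation one year back
acf_6 = autocorr(q, 6)     # strong negative correlation half a year back

X, y = make_lagged(np.arange(20), INPUT_COMBINATIONS["IC2"])
```

With a plain counting series as input, the first row of `X` holds the values at positions 10, 9, and 0 and the first target is the value at position 11, which confirms the alignment of lags 1, 2, and 11.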
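In this paper, two error indices were selected to evaluate the performance of the models in prediction of monthly river flows: the root mean square error (RMSE) and the mean absolute error (MAE). The similarity between the observed value and the forecasted value of river flow is measured by using the determination coefficient ($R^2$) index. These three indexes have been extensively applied in many problems of water resources for evaluating model performance [19, 26, 75]. The performance evaluation indexes can be calculated as
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( Q_{o,i} - Q_{f,i} \right)^2}, \qquad \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| Q_{o,i} - Q_{f,i} \right|,$$
$$R^2 = \left[ \frac{\sum_{i=1}^{N} \left( Q_{o,i} - \bar{Q}_o \right) \left( Q_{f,i} - \bar{Q}_f \right)}{\sqrt{\sum_{i=1}^{N} \left( Q_{o,i} - \bar{Q}_o \right)^2 \sum_{i=1}^{N} \left( Q_{f,i} - \bar{Q}_f \right)^2}} \right]^2,$$
where $N$ is the total number of observations of the river flow time series, $Q_{o,i}$ is the observed river flow, $Q_{f,i}$ is the forecasted river flow, $\bar{Q}_o$ is the average of the observed river flows, and $\bar{Q}_f$ is the average forecasted river flow.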
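The three indexes are straightforward to compute. The sketch below assumes that the determination coefficient is the squared Pearson correlation between observed and forecasted flows, which matches the use of both averages in its definition; the function names and the sample values are hypothetical.

```python
import numpy as np

def rmse(obs, pred):
    """Root mean square error."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def mae(obs, pred):
    """Mean absolute error."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.mean(np.abs(obs - pred)))

def r2(obs, pred):
    """Squared Pearson correlation between observed and forecasted values."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    do, dp = obs - obs.mean(), pred - pred.mean()
    return float((do * dp).sum() ** 2 / ((do ** 2).sum() * (dp ** 2).sum()))

observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 6.0]    # one forecast is off by 2
```

Note that this form of $R^2$ equals 1 for any forecast that is a perfect linear function of the observations, even a biased one, which is why it is reported together with RMSE and MAE.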
6. River Flow Prediction Using Soft Computing Methods
In the first part of the paper, the performance of the proposed hybrid LSSVR-GSA (HLGSA) method was compared with other regression methods in predicting monthly river flows of the Astore and Shyok catchments, separately, by using the three input combinations comprising antecedent river flows. The CV technique was used for each applied method by dividing the river flow data into four equal DS. Test statistics of the HLGSA, MARS, M5RT, and CMLR methods for the Astore catchment in the test period are compared in Table 2. It is clear from the table that all four applied methods provide different prediction results for different DS and input combinations. In case of input combinations, IC1, comprising the two consecutive previous months' river flow values, provides the worst prediction results for the HLGSA, MARS, and CMLR methods. IC2, comprising the two consecutive antecedent months' river flow values and the antecedent eleventh month's flow value, gives the best prediction results for the HLGSA, MARS, and M5RT methods whereas, for the CMLR method, IC3, comprising the two consecutive antecedent months' river flow values and the antecedent eleventh and twelfth months' river flow values, provides better performance than the other two input combinations. In the case of data sets, it is clear from the table that DS2 gives the worst forecast results for all the regression models including the proposed HLGSA method. The reason is that the maximum river flow value of the Astore Basin's test set, DS2 ( m3/s), is higher than the corresponding extreme value of the training DS (see Table 1). This indicates that all trained methods encounter problems in extrapolating to the higher values of DS2. The higher values of and parameters for the DS2 data set in comparison with other data sets can also be another reason for the worst results. It is evident that all methods provide good forecasts for the DS4 under all input combination scenarios.
The best model structures for the HLGSA, MARS, and M5RT methods were found for the DS4 and IC2. However, in case of the CMLR method, the best model structure was found for the DS4 and IC3. The best RMSE of the HLGSA (40.09 m3/s) is lower than those of the MARS, M5RT, and CMLR methods (43.46, 57.26, and 58.08 m3/s, respectively). This is also true for the MAE, where the best value for the HLGSA is 25.05 m3/s compared to 28.36, 29.83, and 32.70 m3/s for the MARS, M5RT, and CMLR methods, respectively. Table 2 clearly shows that the HLGSA method provides better prediction results than the other regression methods under all data sets and input combination scenarios. MARS is ranked as the second best and performs better than the M5RT and CMLR whereas M5RT performs better than the CMLR under all data sets. In case of IC3, however, CMLR gives better forecasts than the M5RT for all data sets by having lower values of the error indexes (RMSE and MAE) and a higher value of the correlation index ($R^2$). The best prediction results of HLGSA, MARS, and M5RT for IC2 indicate that the river flows of the two preceding months and of the eleventh preceding month strongly affect the current month's river flow. The results of IC3 support this statement: adding the river flow of the twelfth preceding month degrades the performance of these methods. However, the CMLR showed positive dependence on the input combinations by giving its best prediction results for IC3.
For the sake of simplicity, the performance accuracy of all methods was evaluated by comparing the overall mean errors, that is, the mean error over all data sets and input combinations. Figures 5(a)-5(b) show the overall average error statistics of all methods. The mean error statistics of RMSE and MAE clearly indicate that the HLGSA method performs better, with lower values of the error indexes than the other methods in prediction of the Astore catchment's river flows. M5RT and CMLR give almost the same mean error statistics and provide the worst prediction accuracy in comparison to HLGSA and MARS due to having higher values of both error indexes. This indicates the nonlinearity of the investigated phenomenon because both M5RT and CMLR have linear structures. HLGSA decreases the overall mean of the MARS, M5RT, and CMLR by 8.22%, 23.15%, and 24.49%, respectively. The observed and forecasted monthly river flows of the Astore Basin by all the methods using their best model structures are reported in Figures 6(a)–6(d). The figure clearly shows that the HLGSA method fits the original river flow data better than the other methods. The HLGSA gives a higher value of the correlation index ($R^2$) than the other methods. From Figure 6, it can be clearly seen that the $R^2$ value of the HLGSA method is 0.916, which is higher than those of the MARS, M5RT, and CMLR methods (0.900, 0.888, and 0.882).
Table 3 reports the results of the three performance evaluation statistical indexes for the HLGSA, MARS, M5RT, and CMLR methods in forecasting river flows of the Shyok catchment. In case of input combinations, here also IC1 gives the worst forecast results compared to IC2 and IC3. However, in contrast to the Astore application, here IC3 shows better accuracy than IC2 for all the applied methods. In the case of data sets, it is evident from the table that DS4 gives the worst prediction results for all the methods. Similar to the Astore Basin, here also the maximum river flow value of the Shyok Basin's test set, DS4 ( m3/s), is higher than that of the training value (see Table 1). Another reason of this may be the fact that the DS2 has low correlations with the preceding river flow input data in comparison to the other data sets. It is obvious from the table that all methods give good prediction results for the DS3 among all input combination scenarios. The best model structures for all four applied methods in case of the Shyok catchment are found for the DS3 and IC3. Similar to the Astore Basin, Table 3 clearly shows that the HLGSA method outperforms the other methods from the RMSE, MAE, and $R^2$ viewpoints for all data sets and input combination scenarios. Here also the MARS gives better accuracy than the M5RT and CMLR methods for all data sets and input combinations. However, in contrast to the Astore Basin, here M5RT provides better prediction results than the CMLR method. The best RMSE of the proposed method (109.20 m3/s) is lower than those of the MARS, M5RT, and CMLR methods (128.38, 141.13, and 151.21 m3/s, respectively). This is also true for the MAE, where the best value for the proposed method is 55.75 m3/s compared to 69.52, 77.27, and 86.70 m3/s for the MARS, M5RT, and CMLR methods, respectively.
Figures 7(a)-7(b) compare the overall average error statistics of all methods in predicting river flows of the Shyok Basin. Similar to Astore, the figure clearly shows that the RMSE and MAE indexes of the HLGSA method are lower than those of the MARS, M5RT, and CMLR methods, indicating that the HLGSA performs better than the other methods. However, in contrast to Astore, here the MARS and M5RT give almost the same accuracy whereas CMLR provides the worst performance in comparison to the other methods due to having higher values of both error indexes. In case of the Shyok catchment, HLGSA decreases the overall mean of the MARS, M5RT, and CMLR by 11.48%, 19.55%, and 36.19%, respectively. The scatterplots of the original and predicted monthly river flows of the Shyok Basin by all the methods using their best model structures are reported in Figures 8(a)–8(d). The figure clearly shows the superior accuracy of the HLGSA method over the MARS, M5RT, and CMLR methods. It is evident from the figure that the HLGSA method fits the observed river flow data better than the other methods, with the highest value of $R^2$. From Figure 8, it can be clearly seen that the $R^2$ value of the HLGSA method is 0.947, which is higher than those of the MARS, M5RT, and CMLR methods (0.917, 0.891, and 0.876).
Selection of a proper kernel function (KF) is very important in obtaining a highly accurate HLGSA method, because different types of KF produce different accuracy. In this research, the RBKF was utilized for determining the prediction results of river flows of both basins. However, there are many other types of KF, the most common being the polynomial, sigmoid, Gaussian, Morlet, Mexican hat, and Meyer KFs. To evaluate the performance of the KF applied in this study, six different kernel functions were compared in prediction of river flows of both catchments by using the best HLGSA model structures, and the results are reported in Table 4. Table 4 shows the superiority of the RBKF over the other kernel functions, with smaller values of both error indexes (, for Astore and , for Shyok) and a higher value of $R^2$ ( for Astore and for Shyok) for both basins.
7. Effect of Log Transform on the Prediction Accuracy of the Soft Computing Methods
In this section, the effect of the log transform on the prediction accuracy of the applied methods was investigated by log-transforming the time series data of both basins before applying the methods. Here also the CV technique was applied, and the log-transformed data were divided into four equal data sets. Table 5 reports the test results of the LHLGSA, LMARS, LM5RT, and LCMLR methods for the Astore Basin. Here also, IC2 generally provides the best forecasts for the LHLGSA, LMARS, and LM5RT methods, whereas IC3 generally gives better results for the LCMLR method. However, IC3 generally provides worse results for the LM5RT method than IC1 and IC2. LCMLR performs better than the LM5RT method in the IC2 and IC3 scenarios, with lower values of both error indexes and higher values of the similarity index. As in the previous Astore application, DS2 gives the worst results whereas DS4 performs the best among all data sets; the reason for the poor results of DS2 was given earlier. For the Astore Basin, the best values of the two error indexes and the similarity index for the LHLGSA (35.33 m³/s, 13.08 m³/s, and 0.921) are better than those of the HLGSA (40.09 m³/s, 25.05 m³/s, and 0.916). The same holds for the LMARS, LM5RT, and LCMLR methods, whose best values (38.78 m³/s, 20.88 m³/s, and 0.906; 46.18 m³/s, 24.80 m³/s, and 0.899; and 39.02 m³/s, 21.08 m³/s, and 0.901, respectively) are better than those of the MARS, M5RT, and CMLR methods (43.46 m³/s, 28.36 m³/s, and 0.900; 57.26 m³/s, 29.83 m³/s, and 0.888; and 58.08 m³/s, 32.70 m³/s, and 0.888, respectively).
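The workflow described above (log transform, four equal data sets, each used in turn for testing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the natural logarithm, contiguous folds, back-transformation of predictions before scoring, the use of RMSE and MAE as the error measures, the placeholder mean predictor, and the synthetic flow series are all assumed here.

```python
import math

def four_fold_splits(n, k=4):
    """Yield (train_idx, test_idx) pairs for k equal contiguous blocks."""
    fold = n // k
    for i in range(k):
        test = list(range(i * fold, (i + 1) * fold))
        train = [j for j in range(fold * k)
                 if j < i * fold or j >= (i + 1) * fold]
        yield train, test

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

# Synthetic monthly flows in m3/s with a seasonal cycle (not the study data).
flows = [150.0 + 120.0 * math.sin(2.0 * math.pi * m / 12.0) for m in range(48)]
log_flows = [math.log(q) for q in flows]  # requires strictly positive flows

results = []
for train_idx, test_idx in four_fold_splits(len(flows)):
    # Placeholder model: the mean of the log-transformed training flows.
    # Any of the regression methods would be fitted here instead.
    mean_log = sum(log_flows[j] for j in train_idx) / len(train_idx)
    pred = [math.exp(mean_log)] * len(test_idx)  # back-transform to m3/s
    obs = [flows[j] for j in test_idx]
    results.append((rmse(obs, pred), mae(obs, pred)))
```

Scoring in the original units keeps the errors of the log and normal models comparable, which is the comparison this section relies on.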
The mean error statistics of all log models are illustrated in Figures 9(a)-9(b). The figure clearly shows the dominance of the LHLGSA over the other corresponding methods. According to Table 5, LCMLR generally performs better than the LM5RT method for most input combinations. However, on the basis of the mean error values, the LCMLR performs worse than the LM5RT method because of its inaccurate forecasts for IC1 ( > , > , > , > according to viewpoint), which affect the mean error values. Figures 10(a)–10(d) illustrate the observed and forecasted river flows of the Astore Basin obtained with the log transform methods. The figure clearly shows that the LHLGSA method fits the original data well. Comparing the scatter plots of the log and normal methods (Figures 6 and 10), it is evident that the log methods fit the original data better than the normal methods. On the basis of the fit line equations, the log methods are closer to the exact line than the normal methods (see the fit line coefficients in Figures 6 and 10). LHLGSA decreases the overall mean of the LMARS, LM5RT, and LCMLR by 11.46%, 24.16%, and 35.65%, respectively.
Test statistics of the LHLGSA, LMARS, LM5RT, and LCMLR methods for the Shyok Basin are reported in Table 6. Here, in contrast to the previous Shyok application, IC2 gives better forecasts for the LHLGSA and LM5RT methods, whereas IC3 performs better for the LMARS and LCMLR methods. For the DS3 data set, however, IC3 performs slightly better for the LM5RT, whereas IC2 performs better for the LMARS. As in the previous Shyok application, DS3 performs the best whereas DS4 performs the worst among all data sets. To identify the best log method, the mean error indexes of all log models are plotted as bar graphs in Figures 11(a)-11(b). The bar graphs clearly show that the LHLGSA performs better than the other log methods, having lower values of the error indexes. Scatter plots of the observed and predicted river flows of the Shyok catchment for all log methods using their best model structures are shown in Figures 12(a)–12(d). The figure clearly shows that the LHLGSA method gives less scattered estimates with a higher value of the similarity index than the LMARS, LM5RT, and LCMLR methods. The figure also reveals that all log methods are closer to the exact line than the normal methods (compare Figures 8 and 12); the normal methods give more scattered forecasts than the log methods. For the Shyok catchment, LHLGSA decreases the overall mean of the LMARS, LM5RT, and LCMLR by 14.49%, 21.73%, and 43.84%, respectively. The log transform does not affect all input combinations and data sets equally, but its effect is more prominent when more inputs are used in the case of the CMLR methods (see Tables 5 and 6). In contrast to the Astore Basin application, for the Shyok Basin the best values of the two error indexes for the HLGSA (109.20 m³/s and 55.75 m³/s) are slightly better than those of the LHLGSA (109.41 m³/s and 59.01 m³/s). However, for the similarity index, the best value for the LHLGSA (0.959) is better than that of the HLGSA (0.947). For the LMARS, LM5RT, and LCMLR methods, the best values of all three indexes are better than those of the MARS, M5RT, and CMLR methods, as in the Astore Basin application.
To evaluate the overall effect of the log transform on all applied methods, the overall mean error indexes ( and ) for both basins with and without the log transform are compared in Figures 13(a), 13(b), 14(a), and 14(b). Both graphs clearly show that the log methods give better accuracy than the normal methods, except the LCMLR, which gives worse accuracy than the CMLR with respect to error index. However, the LCMLR gives better accuracy than the CMLR from the index viewpoint for both basins. The LCMLR's poor performance is due to its worse results for IC1, which affect the overall value. The bar graphs also show that the proposed HLGSA method is more accurate than the other models for both the normal and the log-transformed time series data. According to the bar graphs, the LHLGSA reduces the overall of the HLGSA by 5.66% and 4.87% for the Astore and Shyok Basins, respectively. The LMARS reduces the overall of the MARS by 2.20% and 1.52% for the Astore and Shyok Basins, respectively. The LM5RT reduces the overall of the M5RT by 4.41% and 2.22% for the Astore and Shyok Basins, respectively. However, for the multiple linear regression method, the CMLR reduces the overall of the LCMLR by 9.05% and 7.45% for the Astore and Shyok Basins, respectively, while, in the case of error index, the LCMLR reduces the overall of the CMLR by 3.60% and 10.80%, respectively.
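The percentages quoted in this section read as relative reductions of an error index. A minimal sketch of that arithmetic is given below, assuming the usual definition (difference divided by the baseline); the function name is hypothetical. As a worked example it uses the best Astore values of the first error index reported earlier in this section (40.09 m³/s for HLGSA and 35.33 m³/s for LHLGSA), not the overall means shown in the bar graphs.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of an error index, in percent of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline - improved) / baseline

# Best Astore values of the first error index: HLGSA vs. LHLGSA.
astore_best = percent_reduction(40.09, 35.33)  # about 11.87
```

This best-case figure is larger than the 5.66% quoted for Astore because the latter is based on overall means across all data sets and input combinations.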
8. Comparison of LHLGSA and HLGSA Methods in Estimating River Flows Using Nearby River Flows Data
In this last section, the performance of the HLGSA and LHLGSA is evaluated in estimating the river flow of a basin using flow data from a nearby basin. Estimating river flows from a nearby basin's flow data is a vital issue because Pakistan is a developing country and many basins have long periods of missing flow data owing to financial problems in maintaining the hydraulic gauging stations at higher altitudes. Since river flows play a key role in planning and designing hydropower projects and in flood mitigation, it is necessary to find a suitable way to fill in these missing flow data. In this paper, the river flow data of the Astore Basin are used to estimate the flow data of the Shyok Basin. In this application also, the CV technique is applied to better assess the accuracy of both methods in estimating river flows. Table 7 shows the values of the two error indexes and the similarity index for both methods in estimating the monthly river flows of the Shyok Basin. The best input combination for the LHLGSA and HLGSA methods is IC3, while IC2 generally gives worse results for both methods. DS4 gives better accuracy than the other data sets, whereas DS1 provides the worst estimates for both methods. Table 7 clearly shows that the LHLGSA is more accurate than the HLGSA under all data sets and input combinations. It can also be seen that the LHLGSA gave lower values of the error indexes for the best data set with the best input combination (177.62 for LHLGSA versus 189.03 for HLGSA on the first error index, and 80.21 versus 99.66 on the second). The original and estimated river flows by the LHLGSA and HLGSA methods using their best model structures (DS4 with IC3) are illustrated in Figure 15. The figure clearly shows that the LHLGSA has a higher value of the similarity index (0.882 > 0.871), representing better estimation than the HLGSA method. Figure 16 compares the mean error indexes of both methods in estimating the river flows of the Shyok Basin. It can be seen from the figure that the LHLGSA is more accurate than the HLGSA, having lower values of the error indexes. LHLGSA decreases the overall mean of the HLGSA method by 4.10% in the estimation of Shyok Basin flows using the river flows of the Astore Basin.
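To make the nearby-basin setup concrete, the sketch below pairs flows from a donor basin with the concurrent flow of the target basin, skipping months in which the target record is missing. It is only an illustration: the exact definitions of the input combinations are not restated in this section, so the use of the donor flow at the same month plus earlier months, the `None` marker for missing values, and all names and numbers are assumptions.

```python
def lagged_pairs(donor, target, n_lags):
    """Pair donor flows at t, t-1, ..., t-n_lags+1 with the target flow at t.

    Months where the target flow is missing (None) are skipped, so the
    returned pairs can be used to train a model that later fills the gaps.
    """
    if len(donor) != len(target):
        raise ValueError("donor and target series must have equal length")
    rows = []
    for t in range(n_lags - 1, len(donor)):
        if target[t] is None:
            continue
        inputs = [donor[t - k] for k in range(n_lags)]
        rows.append((inputs, target[t]))
    return rows

# Hypothetical monthly flows in m3/s; the second target month is missing.
donor_flows = [10.0, 20.0, 30.0, 40.0, 50.0]
target_flows = [100.0, None, 300.0, 400.0, 500.0]
pairs = lagged_pairs(donor_flows, target_flows, n_lags=2)
```

Once a model is trained on such pairs, the donor inputs for a missing month can be passed to it to estimate the absent target flow.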
On the basis of the above results, the key findings can be summarized as follows.
(1) The Astore Basin showed lower values of both error indexes than the Shyok Basin. This is due to the mean river flows of the basins: the Astore Basin has a mean river flow of 142 m³/s, while the mean river flow of the Shyok Basin is 457 m³/s.
(2) The prediction accuracy of all methods, including the proposed method, mostly improved as the number of inputs increased, which indicates that all input combinations have positive effects on river flow prediction, especially for the CMLR method.
(3) When the maximum river flow of the testing data set was higher than the maximum river flow values of the training data sets, extrapolation difficulties arose and produced the worst prediction results for that data set.
(4) Overall, the HLGSA and LHLGSA methods outperformed the MARS, M5RT, and CMLR methods and the LMARS, LM5RT, and LCMLR methods, respectively. Moreover, the comparison between Figures 13 and 14 indicates that, on the mean basis, the prediction results with the log transform are better than those without it for all regression methods including the proposed method, which suggests that the log transform is suitable for denoising the river flow data.
(5) In the literature, many studies reported that MARS performed better than or equally to the LSSVR methods [43, 44, 76–78]. However, in this study, the hybrid LSSVR method with the gravitational search algorithm performed better than the MARS method. The main reason may be the LSSVR's strong generalization capability and nonlinear fitting ability; the second reason may be the selection of optimal LSSVR control parameters ( and ) through GSA, which directly affects and improves the accuracy of the method. The powerful global search ability of GSA helps to find optimal and suitable values for the LSSVR control parameters in a shorter time than other algorithms. We can conclude that, in applications of LSSVM, the control parameters should be adequately optimized using global optimization techniques. This will decrease the uncertainties in obtaining optimal LSSVM models.
(6) In general, the benchmark regression methods can be ranked according to their prediction accuracy as MARS, M5RT, and CMLR. The worse results of the M5RT and CMLR methods may be due to the linear structure of these models.
(7) The values of all three indexes confirm that the HLGSA and LHLGSA methods can be effectively applied for the prediction and estimation of river flow.
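The role of GSA in finding (5) can be illustrated with a small, generic gravitational search routine. The sketch below is a simplified version of the algorithm and not the configuration used in this study: the function name, the settings `g0 = 100` and `alpha = 20`, the population size, and the clamping of agents to the bounds are assumptions. A simple sphere function stands in for the objective; in an LSSVR setting the objective would instead be a validation error evaluated as a function of the two control parameters.

```python
import math
import random

def gsa_minimize(objective, bounds, n_agents=20, n_iter=100,
                 g0=100.0, alpha=20.0, seed=0):
    """Simplified gravitational search; returns the best evaluated point."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_f = None, math.inf

    for t in range(n_iter):
        fit = [objective(p) for p in pos]
        for p, f in zip(pos, fit):
            if f < best_f:
                best_x, best_f = list(p), f

        # Normalized masses: the best agent is heaviest, the worst has mass 0.
        f_best, f_worst = min(fit), max(fit)
        if f_best == f_worst:
            mass = [1.0 / n_agents] * n_agents
        else:
            raw = [(f_worst - f) / (f_worst - f_best) for f in fit]
            total = sum(raw)
            mass = [m / total for m in raw]

        # Gravitational constant decays; the set of attracting agents shrinks.
        g = g0 * math.exp(-alpha * t / n_iter)
        k_best = max(1, round(n_agents * (1.0 - t / n_iter)))
        elite = sorted(range(n_agents), key=lambda i: fit[i])[:k_best]

        # Accelerations for all agents (an agent's own mass cancels out).
        acc = [[0.0] * dim for _ in range(n_agents)]
        for i in range(n_agents):
            for j in elite:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                for d in range(dim):
                    acc[i][d] += (rng.random() * g * mass[j]
                                  * (pos[j][d] - pos[i][d]) / (dist + 1e-12))

        # Velocity and position updates, clamped to the search bounds.
        for i in range(n_agents):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = rng.random() * vel[i][d] + acc[i][d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
    return best_x, best_f

def sphere(x):
    """Stand-in objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_f = gsa_minimize(sphere, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

Because the routine only needs an objective and box bounds, the same loop could wrap a cross-validated LSSVR error without modification.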
In the current study, river flow data of the Astore and Shyok rivers were used to determine the forecasting capability of the HLGSA, MARS, M5RT, and CMLR methods using antecedent river flow values as inputs. Two error indexes ( and ) and one similarity index () were used to compare the prediction accuracy of these methods. The CV technique was used in all applications to better assess the prediction accuracy across the data sets. In the first part of the study, among the four regression methods, HLGSA provided better results than the other methods in predicting the monthly river flow data of both catchments. HLGSA improved the prediction accuracy of the MARS, M5RT, and CMLR by 8.22%, 23.15%, and 24.49% in the Astore Basin, respectively, whereas, for the Shyok Basin, HLGSA improved the prediction accuracy of the MARS, M5RT, and CMLR methods by 11.48%, 19.55%, and 36.19%, respectively. The radial basis kernel function was selected for the HLGSA model owing to its better prediction accuracy. In the second part of the study, the effect of the logarithm transform on the prediction performance of all regression methods was investigated. The results showed that, after the logarithm function was applied to the river flow time series data, all the regression methods provided better prediction accuracy for both basins. The prediction results also showed that the HLGSA method outperformed the other methods. LHLGSA decreased the overall mean of the LMARS, LM5RT, and LCMLR by 11.46%, 24.16%, and 35.65%, respectively, for the Astore Basin, whereas, for the Shyok Basin, LHLGSA decreased the overall mean of the LMARS, LM5RT, and LCMLR by 14.49%, 21.73%, and 43.84%, respectively. Comparing the log-transformed and normal methods, the LHLGSA reduced the overall of the HLGSA by 5.66% and 4.87% for the Astore and Shyok Basins, respectively. LMARS reduced the overall of the MARS by 2.20% and 1.52% for the Astore and Shyok Basins, respectively. The LM5RT reduced the overall of the M5RT by 4.41% and 2.22% for the Astore and Shyok Basins, respectively. The third part of the study evaluated the performance of the HLGSA and LHLGSA methods in river flow estimation using river flow data of the nearby basin. The test results revealed that the LHLGSA performed better than the HLGSA in estimating the river flows of the Shyok Basin using Astore Basin data.
In this study, river flows were forecasted with only previous river flow values as inputs. The prediction accuracy of the applied methods could be improved if more input variables were available. Further studies may include more inputs such as rainfall, snowpack, and temperature and/or build prediction models using more advanced modeling methods at these study sites. The proposed data-driven methods may be applied to other regions with similar or different climates. In that case, however, the methods should be properly calibrated using a large amount of river flow data.
The authors declare that they have no competing interests.
This work was supported by the National Natural Science Foundation of China (no. 51379080 and no. 41571514) and the Hubei Provincial Collaborative Innovation Center for New Energy Microgrid, China Three Gorges University. The authors thank the staff of WAPDA for providing the river flow data of both basins.
- Q. Zhang, B.-D. Wang, B. He, Y. Peng, and M.-L. Ren, “Singular spectrum analysis and ARIMA hybrid model for annual runoff forecasting,” Water Resources Management, vol. 25, no. 11, pp. 2683–2703, 2011.
- J. Xu, Y. Chen, W. Li, Q. Nie, C. Song, and C. Wei, “Integrating wavelet analysis and BPANN to simulate the annual runoff with regional climate change: a case study of Yarkand River, Northwest China,” Water Resources Management, vol. 28, no. 9, pp. 2523–2537, 2014.
- A. R. Ghumman, Y. M. Ghazaw, A. R. Sohail, and K. Watanabe, “Runoff forecasting by artificial neural network and conventional model,” Alexandria Engineering Journal, vol. 50, no. 4, pp. 345–350, 2011.
- M. Tayyab, J. Zhou, X. Zeng, L. Chen, and L. Ye, “Optimal application of conceptual rainfall-runoff hydrological models in the Jinshajiang River basin,” Proceedings of the International Association of Hydrological Sciences, vol. 368, pp. 227–232, 2015.
- R. C. Ahlert and B. M. Mehta, “Stochastic analyses and transfer functions for flows of the upper Delaware River,” Ecological Modelling, vol. 14, no. 1-2, pp. 59–78, 1981.
- S. Ahmad, I. H. Khan, and B. P. Parida, “Performance of stochastic approaches for forecasting river water quality,” Water Research, vol. 35, no. 18, pp. 4261–4266, 2001.
- A. Kurunç, K. Yürekli, and O. Çevik, “Performance of two stochastic approaches for forecasting water quality and streamflow data from Yeşilirmak River, Turkey,” Environmental Modelling and Software, vol. 20, no. 9, pp. 1195–1200, 2005.
- M. Mirzavand and R. Ghazavi, “A stochastic modelling technique for groundwater level forecasting in an arid environment using time series methods,” Water Resources Management, vol. 29, no. 4, pp. 1315–1328, 2015.
- A. K. Mishra and V. R. Desai, “Drought forecasting using stochastic models,” Stochastic Environmental Research and Risk Assessment, vol. 19, no. 5, pp. 326–339, 2005.
- R. Modarres, “Streamflow drought time series forecasting,” Stochastic Environmental Research and Risk Assessment, vol. 21, no. 3, pp. 223–233, 2007.
- B. W. Otok and Suhartono, “Development of rainfall forecasting model in Indonesia by using ASTAR, transfer function, and ARIMA methods,” European Journal of Scientific Research, vol. 38, no. 3, pp. 386–395, 2009.
- A. Psilovikos and M. Elhag, “Forecasting of remotely sensed daily evapotranspiration data over Nile Delta region, Egypt,” Water Resources Management, vol. 27, no. 12, pp. 4115–4130, 2013.
- A. T. Rabenja, A. Ratiarison, and J. M. Rabeharisoa, “Forecasting of the rainfall and the discharge of the Namorona River in Vohiparara and FFT analyses of these data,” in Proceedings of the 4th International Madagascar Conference in High-Energy Physics (HEP-MAD '09), pp. 1–12, August 2009.
- M. Valipour, “Long-term runoff study using SARIMA and ARIMA models in the United States,” Meteorological Applications, vol. 22, no. 3, pp. 592–598, 2015.
- S. Yang, S. Liu, X. Li, Y. Zhong, X. He, and C. Wu, “The short-term forecasting of evaporation duct height (EDH) based on ARIMA model,” Multimedia Tools and Applications, pp. 1–14, 2016.
- A. Altunkaynak, “Forecasting surface water level fluctuations of Lake Van by artificial neural networks,” Water Resources Management, vol. 21, no. 2, pp. 399–408, 2007.
- D. Ömer Faruk, “A hybrid neural network and ARIMA model for water quality time series prediction,” Engineering Applications of Artificial Intelligence, vol. 23, no. 4, pp. 586–594, 2010.
- W. Huang, B. Xu, and A. Chan-Hilton, “Forecasting flows in Apalachicola River using neural networks,” Hydrological Processes, vol. 18, no. 13, pp. 2545–2564, 2004.
- G. Landeras, A. Ortiz-Barredo, and J. J. López, “Forecasting weekly evapotranspiration with ARIMA and artificial neural network models,” Journal of Irrigation and Drainage Engineering, vol. 135, no. 3, pp. 323–334, 2009.
- A. K. Mishra, V. R. Desai, and V. P. Singh, “Drought forecasting using a hybrid stochastic and neural network model,” Journal of Hydrologic Engineering, vol. 12, no. 6, pp. 626–638, 2007.
- C. L. Wu and K.-W. Chau, “Data-driven models for monthly streamflow time series prediction,” Engineering Applications of Artificial Intelligence, vol. 23, no. 8, pp. 1350–1367, 2010.
- S. Ahmad, A. Kalra, and H. Stephen, “Estimating soil moisture using remote sensing data: a machine learning approach,” Advances in Water Resources, vol. 33, no. 1, pp. 69–80, 2010.
- M. Behzad, K. Asghari, M. Eazi, and M. Palhang, “Generalization performance of support vector machines and neural networks in runoff modeling,” Expert Systems with Applications, vol. 36, no. 4, pp. 7624–7629, 2009.
- Z. He, X. Wen, H. Liu, and J. Du, “A comparative study of artificial neural network, adaptive neuro fuzzy inference system and support vector machine for forecasting river flow in the semiarid mountain region,” Journal of Hydrology, vol. 509, pp. 379–386, 2014.
- S. K. Jain, “Modeling river stage-discharge-sediment rating relation using support vector regression,” Hydrology Research, vol. 43, no. 6, pp. 851–861, 2012.
- A. M. Kalteh, “Monthly river flow forecasting using artificial neural network and support vector regression models coupled with wavelet transform,” Computers & Geosciences, vol. 54, pp. 1–8, 2013.
- J.-Y. Lin, C.-T. Cheng, and K.-W. Chau, “Using support vector machines for long-term discharge prediction,” Hydrological Sciences Journal, vol. 51, no. 4, pp. 599–612, 2006.
- H. Yoon, S.-C. Jun, Y. Hyun, G.-O. Bae, and K.-K. Lee, “A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer,” Journal of Hydrology, vol. 396, no. 1-2, pp. 128–138, 2011.
- P. P. Bhagwat and R. Maity, “Hydroclimatic streamflow prediction using least square-support vector regression,” ISH Journal of Hydraulic Engineering, vol. 19, no. 3, pp. 320–328, 2013.
- M. K. Goyal, B. Bharti, J. Quilty, J. Adamowski, and A. Pandey, “Modeling of daily pan evaporation in sub tropical climates using ANN, LS-SVR, Fuzzy Logic, and ANFIS,” Expert Systems with Applications, vol. 41, no. 11, pp. 5267–5276, 2014.
- S. H. Hwang, D. H. Ham, and J. H. Kim, “Forecasting performance of LS-SVM for nonlinear hydrological time series,” KSCE Journal of Civil Engineering, vol. 16, no. 5, pp. 870–882, 2012.
- O. Kisi, “Modeling discharge-suspended sediment relationship using least square support vector machine,” Journal of Hydrology, vol. 456-457, pp. 110–120, 2012.
- O. Kisi, “Least squares support vector machine for modeling daily reference evapotranspiration,” Irrigation Science, vol. 31, no. 4, pp. 611–619, 2013.
- O. Kisi, “Streamflow forecasting and estimation using least square support vector regression and adaptive Neuro-Fuzzy embedded fuzzy c-means clustering,” Water Resources Management, vol. 29, no. 14, pp. 5109–5127, 2015.
- A. Mellit, A. M. Pavan, and M. Benghanem, “Least squares support vector machine for short-term prediction of meteorological time series,” Theoretical and Applied Climatology, vol. 111, no. 1-2, pp. 297–307, 2013.
- U. Okkan and Z. A. Serbes, “Rainfall-runoff modeling using least squares support vector machines,” Environmetrics, vol. 23, no. 6, pp. 549–564, 2012.
- A. Shabri and Suhartono, “Streamflow forecasting using least-squares support vector machines,” Hydrological Sciences Journal, vol. 57, no. 7, pp. 1275–1293, 2012.
- J. Wu, M. Liu, and L. Jin, “Least square support vector machine ensemble for daily rainfall forecasting based on linear and nonlinear regression,” in Advances in Neural Network Research and Applications, pp. 55–64, Springer, Berlin, Germany, 2010.
- J. Adamowski, H. F. Chan, S. O. Prasher, and V. N. Sharda, “Comparison of multivariate adaptive regression splines with coupled wavelet transform artificial neural networks for runoff forecasting in Himalayan micro-watersheds with limited data,” Journal of Hydroinformatics, vol. 14, no. 3, pp. 731–744, 2012.
- P. Coulibaly and C. K. Baldwin, “Nonstationary hydrological time series forecasting using nonlinear dynamic methods,” Journal of Hydrology, vol. 307, no. 1-4, pp. 164–174, 2005.
- R. C. Deo, O. Kisi, and V. P. Singh, “Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model,” Atmospheric Research, vol. 184, pp. 149–175, 2017.
- R. C. Deo, P. Samui, and D. Kim, “Estimation of monthly evaporative loss using relevance vector machine, extreme learning machine and multivariate adaptive regression spline models,” Stochastic Environmental Research and Risk Assessment, vol. 30, no. 6, pp. 1769–1784, 2016.
- O. Kisi, “Modeling reference evapotranspiration using three different heuristic regression approaches,” Agricultural Water Management, vol. 169, pp. 162–172, 2016.
- O. Kisi and K. S. Parmar, “Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution,” Journal of Hydrology, vol. 534, pp. 104–112, 2016.
- V. N. Sharda, S. O. Prasher, R. M. Patel, P. R. Ojasvi, and C. Prakash, “Performance of Multivariate Adaptive Regression Splines (MARS) in predicting runoff in mid-Himalayan micro-watersheds with limited data,” Hydrological Sciences Journal, vol. 53, no. 6, pp. 1165–1175, 2008.
- J. A. K. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural Processing Letters, vol. 9, no. 3, pp. 293–300, 1999.
- X. Guo and X. Ma, “Mine water discharge prediction based on least squares support vector machines,” Mining Science and Technology, vol. 20, no. 5, pp. 738–742, 2010.
- M. Arabloo, H. Ziaee, M. Lee, and A. Bahadori, “Prediction of the properties of brines using least squares support vector machine (LS-SVM) computational strategy,” Journal of the Taiwan Institute of Chemical Engineers, vol. 50, pp. 123–130, 2015.
- J. A. K. Suykens, J. De Brabanter, L. Lukas, and J. Vandewalle, “Weighted least squares support vector machines: robustness and sparse approximation,” Neurocomputing, vol. 48, no. 1–4, pp. 85–105, 2002.
- X. Yuan, C. Chen, Y. Yuan, Y. Huang, and Q. Tan, “Short-term wind power prediction based on LSSVM-GSA model,” Energy Conversion and Management, vol. 101, pp. 393–401, 2015.
- D.-Y. Shi, J. Lu, and L.-J. Lu, “A judge model of the impact of lane closure incident on individual vehicles on freeways based on RFID technology and FOA-GRNN method,” Journal of Wuhan University of Technology, vol. 34, no. 3, pp. 63–68, 2012.
- E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, “GSA: a gravitational search algorithm,” Information Sciences, vol. 179, no. 13, pp. 2232–2248, 2009.
- R. K. Sahu, S. Panda, and S. Padhan, “A novel hybrid gravitational search and pattern search algorithm for load frequency control of nonlinear power system,” Applied Soft Computing, vol. 29, pp. 310–327, 2015.
- E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, “Filter modeling using gravitational search algorithm,” Engineering Applications of Artificial Intelligence, vol. 24, no. 1, pp. 117–122, 2011.
- F.-Y. Ju and W.-C. Hong, “Application of seasonal SVR with chaotic gravitational search algorithm in electricity forecasting,” Applied Mathematical Modelling. Simulation and Computation for Engineering and Environmental Systems, vol. 37, no. 23, pp. 9643–9651, 2013.
- J. H. Friedman, “Multivariate adaptive regression splines,” The Annals of Statistics, vol. 19, no. 1, pp. 1–67, 1991.
- S.-M. Chou, T.-S. Lee, Y. E. Shao, and I.-F. Chen, “Mining the breast cancer pattern using artificial neural networks and multivariate adaptive regression splines,” Expert Systems with Applications, vol. 27, no. 1, pp. 133–142, 2004.
- C.-J. Lu, T.-S. Lee, and C.-M. Lian, “Sales forecasting for computer wholesalers: a comparison of multivariate adaptive regression splines and artificial neural networks,” Decision Support Systems, vol. 54, no. 1, pp. 584–596, 2012.
- D. J. Hand, H. Mannila, and P. Smyth, Principles of Data Mining, MIT Press, 2001.
- A. Etemad-Shahidi and J. Mahjoobi, “Comparison between M5′ model tree and neural networks for prediction of significant wave height in Lake Superior,” Ocean Engineering, vol. 36, no. 15-16, pp. 1175–1181, 2009.
- J. R. Quinlan, “Learning with continuous classes,” in Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, pp. 343–348, Singapore, 1992.
- B. Bhattacharya and D. P. Solomatine, “Neural networks and M5 model trees in modelling water level-discharge relationship,” Neurocomputing, vol. 63, pp. 381–396, 2005.
- S. Londhe and P. Dixit, “Stream flow forecasting using model trees,” International Journal of Earth Sciences and Engineering, vol. 4, no. 6, pp. 282–285, 2011.
- M. Janga Reddy and B. N. S. Ghimire, “Use of model tree and gene expression programming to predict the suspended sediment Load in rivers,” Journal of Intelligent Systems, vol. 18, no. 3, pp. 211–227, 2009.
- M. T. Sattari, M. Pal, H. Apaydin, and F. Ozturk, “M5 model tree application in daily river flow forecasting in Sohu Stream, Turkey,” Water Resources, vol. 40, no. 3, pp. 233–242, 2013.
- K. K. Singh, M. Pal, and V. P. Singh, “Estimation of mean annual flood in Indian catchments using backpropagation neural network and M5 model tree,” Water Resources Management, vol. 24, no. 10, pp. 2007–2019, 2010.
- D. P. Solomatine and Y. Xue, “M5 model trees and neural networks: application to flood forecasting in the upper reach of the Huai River in China,” Journal of Hydrologic Engineering, vol. 9, no. 6, pp. 491–501, 2004.
- A. Zahiri and H. M. Azamathulla, “Comparison between linear genetic programming and M5 tree models to predict flow discharge in compound channels,” Neural Computing and Applications, vol. 24, no. 2, pp. 413–420, 2014.
- I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2005.
- A. Heiat, “Comparison of artificial neural network and regression models for estimating software development effort,” Information and Software Technology, vol. 44, no. 15, pp. 911–922, 2002.
- M. Zounemat-Kermani, Ö. Kişi, J. Adamowski, and A. Ramezani-Charmahineh, “Evaluation of data driven models for river suspended sediment concentration modeling,” Journal of Hydrology, vol. 535, pp. 457–472, 2016.
- I. Myrtveit, E. Stensrud, and M. Shepperd, “Reliability and validity in comparative studies software prediction models,” IEEE Transactions on Software Engineering, vol. 31, no. 5, pp. 380–391, 2005.
- B. Li and C. Cheng, “Monthly discharge forecasting using wavelet neural networks with extreme learning machine,” Science China Technological Sciences, vol. 57, no. 12, pp. 2441–2452, 2014.
- A. G. Yilmaz and N. Muttil, “Runoff estimation by machine learning methods and application to the Euphrates Basin in Turkey,” Journal of Hydrologic Engineering, vol. 19, no. 5, pp. 1015–1025, 2014.
- C. Santhi, J. G. Arnold, J. R. Williams, W. A. Dugas, R. Srinivasan, and L. M. Hauck, “Validation of the SWAT model on a large river basin with point and nonpoint sources,” Journal of the American Water Resources Association, vol. 37, no. 5, pp. 1169–1188, 2001.
- J. Alreja, S. Parab, S. Mathur, and P. Samui, “Estimating hysteretic energy demand in steel moment resisting frames using multivariate adaptive regression spline and least square support vector machine,” Ain Shams Engineering Journal, vol. 6, no. 2, pp. 449–455, 2015.
- A. Kibekbaev and E. Duman, “Benchmarking regression algorithms for income prediction modeling,” Information Systems, vol. 61, pp. 40–52, 2016.
- P. Samui and D. Kim, “Least square support vector machine and multivariate adaptive regression spline for modeling lateral load capacity of piles,” Neural Computing and Applications, vol. 23, no. 3-4, pp. 1123–1127, 2013.
Copyright © 2017 Rana Muhammad Adnan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.