Research Article  Open Access
Comparing the Selected Transfer Functions and Local Optimization Methods for Neural Network Flood Runoff Forecast
Abstract
The presented paper analyzes the influence of the selection of the transfer function and the training algorithm on neural network flood runoff forecasts. Nine of the most significant flood events, caused by extreme rainfall, were selected from 10 years of measurement on a small headwater catchment in the Czech Republic, and the flood runoff forecast was investigated using an extensive set of multilayer perceptrons with one hidden layer of neurons. The analyzed artificial neural network models with 11 different activation functions in the hidden layer were trained using 7 local optimization algorithms. The results show that the Levenberg-Marquardt algorithm was superior to the remaining tested local optimization methods. When comparing the 11 nonlinear transfer functions used in the hidden layer neurons, the RootSig function was superior to the rest of the analyzed activation functions.
1. Introduction
In the recent three decades, the implementation of various models based on artificial neural networks (ANN) has been intensively explored in hydrological engineering. General reviews of ANN modeling strategies and applications, with emphasis on the modeling of hydrological processes, are presented in [1–3]. They confirm that the class of multilayer perceptrons (MLP) [4, 5] belongs to the most frequently studied ANN models in hydrological modeling [6–9].
The MLP forms a nonlinear data driven model. According to its architecture, it is a fully connected feedforward network, which organizes the processing units (neurons) into layers and allows interconnections only between neurons in two consecutive layers. As proved by [10], the MLP is a universal function approximator. This important property has been widely confirmed by many hydrological studies [11–14].
Despite the positive research results of a large number of studies on MLP runoff forecasting, there is still a need for clear methodological recommendations on MLP transfer function selection [15, 22–24], combined with the assessment of training methods and the implementation of new training methods [8, 18, 19, 25].
The main aims of the presented paper are to analyze the hourly flood runoff forecast on a small headwater catchment with MLPANN models based on 12 different MLP transfer functions, following the work of [15, 24]; to compare 7 local optimization algorithms [5, 17, 19]; and finally to evaluate the MLP performance with 4 selected model evaluation measures [26, 27].
2. Material and Methods
The tested runoff prediction using the MLPANN models is based on a set of rainfall-runoff data. The MLPANN implementation for runoff forecast generally consists of data preprocessing, model architecture selection, MLP training, and model validation. In this section, we give a brief description of the MLPANN model architecture, the tested optimization schemes, and the datasets.
2.1. MLPANN Model
We analyzed the MLP model with one hidden layer. A similar ANN architecture was used in a large number of hydrologically oriented studies [18, 28–31]. The studied MLP models had in total three layers of neurons: the input layer, the hidden layer, and the output layer. As proved by Hornik et al. [10], this type of artificial neural network with a sufficiently large number of neurons in the hidden layer can approximate any measurable functional relationship with the desired precision.
The implemented MLPANN models had the general form

$$\hat{Q} = \sum_{j=1}^{s} w_j^{(2)} \, f\!\left(\sum_{i=1}^{n} w_{ji}^{(1)} x_i + b_j\right) + b,$$

where $\hat{Q}$ is the network output, that is, the flood runoff forecast for a given time interval, $x_i$ is the network input for input layer neuron $i$, $n$ is the number of MLP inputs, $w_{ji}^{(1)}$ is the weight of input $i$ to hidden layer neuron $j$, $f$ is the activation function, constant for all hidden layer neurons, $s$ is the number of hidden neurons, $w_j^{(2)}$ is the weight for the output from hidden neuron $j$, and $b_j$, $b$ are neuron biases [2–4, 18, 25, 31].
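The one-hidden-layer forward pass described above can be sketched in Python/NumPy as follows. A linear output neuron is assumed, and the function and variable names are illustrative, not taken from PONS2train:

```python
import numpy as np

def mlp_forward(x, W1, b1, w2, b2, f=np.tanh):
    """One-hidden-layer MLP forward pass with a linear output neuron.

    x  : (n,)   input vector (e.g. lagged rainfall/runoff values)
    W1 : (s, n) input-to-hidden weights
    b1 : (s,)   hidden-layer biases
    w2 : (s,)   hidden-to-output weights
    b2 : scalar output bias
    f  : hidden-layer activation function (common to all hidden neurons)
    """
    hidden = f(W1 @ x + b1)          # hidden-layer activations
    return float(w2 @ hidden + b2)   # weighted sum at the output neuron
```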
2.1.1. MLPANN Transfer Functions
The type of activation function, together with the network architecture, influences the generalization of a neural network. Imrie et al. [32] empirically confirmed that the bounding of the transfer function influences the ANN generalization and the simulation of hydrological extremes during runoff forecast. Following the work of [15], we implemented 12 different types of transfer functions, 11 of which were tested in the hidden neuron layer of the analyzed MLPANN models. Table 1 provides their list.

The activation function type, combined with the specific type of training method, influences the average performance of the learning algorithm and the computing time [15, 24]. For example, Bishop [4] pointed out that the implementation of the hyperbolic tangent function speeds up the training process compared to the use of the logistic sigmoid.
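For illustration, the two classical sigmoidal activations and a RootSig-type function can be written as follows. The exact RootSig parametrization used in the paper is given in Table 1; the algebraic form below, x / (1 + sqrt(1 + x^2)), is one variant found in the literature and should be read as an assumption:

```python
import numpy as np

def logistic(x):
    """Logistic sigmoid, bounded in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def hyperbolic_tangent(x):
    """Hyperbolic tangent, bounded in (-1, 1)."""
    return np.tanh(x)

def rootsig(x):
    """RootSig-type algebraic sigmoid (assumed form, see lead-in)."""
    return x / (1.0 + np.sqrt(1.0 + x * x))
```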
2.1.2. MLPANN Local Optimization Methods
We selected 7 gradient based local optimization methods. Table 2 lists them together with their references. All MLPANN optimizations were performed using the batch learning mode [4].

All tested gradient local search methods (except BP_regul) minimized the error function represented as the sum of squared residuals, where the residuals were defined as the differences between the observed and computed flood runoff.
The two first order local training methods are the standard backpropagation and the backpropagation with a regularization term. Both backpropagation methods implement the following modification: a constant learning rate and a momentum parameter. The BP_regul uses the regularization term, which penalizes the size of the estimated weights, and its error function is defined as

$$E = \beta \sum_{t=1}^{N} \left(Q_t - \hat{Q}_t\right)^2 + \alpha \sum_{k=1}^{W} w_k^2,$$

where $W$ is the total number of MLPANN weights $w_k$. The hyperparameters $\alpha$ and $\beta$ were constant within the standard backpropagation with the regularization term [4, 16].
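The regularized objective of BP_regul can be sketched as follows. This is a minimal illustration; the exact scaling of the hyperparameters (called alpha and beta below) is an assumption:

```python
import numpy as np

def sse(y_obs, y_mod):
    """Sum of squared residuals between observed and computed runoff."""
    r = np.asarray(y_obs) - np.asarray(y_mod)
    return float(r @ r)

def regularized_error(y_obs, y_mod, weights, alpha, beta):
    """BP_regul-style objective: data misfit plus weight-size penalty.

    beta scales the sum of squared residuals, alpha scales the sum of
    squared weights; both are held constant during training.
    """
    w = np.asarray(weights)
    return beta * sse(y_obs, y_mod) + alpha * float(w @ w)
```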
The scaled conjugate gradient methods are combined with a safe line search based on the golden section search with bracketing of the minimum [33, 34]. The implementation enables restarting during the iteration search, based on the recommendations of [21, 35]. Restarting is controlled by a prescribed number of iterations or by the gradient norm. The implementation of the scaled conjugate gradient uses four different updating schemes, described in detail by [19, 36].
All gradient based methods apply the standard backpropagation algorithm for the estimation of the derivatives of the objective function with respect to the weights [37]. The Levenberg-Marquardt method approximates the Hessian matrix using first order derivatives, neglecting the terms with the second order derivatives [4, 17].
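A single Levenberg-Marquardt update, using the first order (damped Gauss-Newton) approximation of the Hessian described above, can be sketched as:

```python
import numpy as np

def levenberg_marquardt_step(J, r, w, mu):
    """One Levenberg-Marquardt update of the weight vector w.

    J  : (N, W) Jacobian of the residuals r with respect to the weights
    r  : (N,)   residuals (observed minus computed runoff)
    mu : damping parameter; large mu behaves like gradient descent,
         small mu like Gauss-Newton
    The Hessian is approximated as J^T J, i.e. second-order terms
    are neglected.
    """
    H = J.T @ J + mu * np.eye(J.shape[1])  # damped Hessian approximation
    g = J.T @ r                            # gradient of 0.5 * ||r||^2
    return w - np.linalg.solve(H, g)       # Newton-like step
```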
2.1.3. The MLPANN Performance
We based the evaluation of the MLPANN model simulations of the training, testing, and validation datasets on the following statistics [26, 27, 38]: the mean absolute error

$$\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|Q_t - \hat{Q}_t\right|,$$

the Nash-Sutcliffe efficiency

$$\mathrm{NS} = 1 - \frac{\sum_{t=1}^{N}\left(Q_t - \hat{Q}_t\right)^2}{\sum_{t=1}^{N}\left(Q_t - \bar{Q}\right)^2},$$

the fourth root mean quadrupled error

$$\mathrm{R4MS4E} = \sqrt[4]{\frac{1}{N}\sum_{t=1}^{N}\left(Q_t - \hat{Q}_t\right)^4},$$

and the persistency index

$$\mathrm{PI} = 1 - \frac{\sum_{t=1}^{N}\left(Q_t - \hat{Q}_t\right)^2}{\sum_{t=1}^{N}\left(Q_t - Q_{t-L}\right)^2},$$

where $N$ represents the total number of time intervals to be predicted, $\hat{Q}_t$ is the forecast, $\bar{Q}$ is the average of the observed flood runoff $Q_t$, and $L$ is the time shift describing the last observed flood runoff $Q_{t-L}$.
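The four evaluation measures can be computed as follows (standard HydroTest-style definitions; the lag L of the persistency index is assumed here to be one time step):

```python
import numpy as np

def mae(q_obs, q_mod):
    """Mean absolute error."""
    return float(np.mean(np.abs(q_obs - q_mod)))

def nash_sutcliffe(q_obs, q_mod):
    """Nash-Sutcliffe efficiency; 1 is a perfect fit."""
    return 1.0 - float(np.sum((q_obs - q_mod) ** 2)
                       / np.sum((q_obs - q_obs.mean()) ** 2))

def r4ms4e(q_obs, q_mod):
    """Fourth root mean quadrupled error, sensitive to peak-flow errors."""
    return float(np.mean((q_obs - q_mod) ** 4) ** 0.25)

def persistency_index(q_obs, q_mod, lag=1):
    """PI compares the model with the naive last-observed-value forecast."""
    naive = q_obs[:-lag]                 # persisted observation Q_{t-L}
    obs, mod = q_obs[lag:], q_mod[lag:]
    return 1.0 - float(np.sum((obs - mod) ** 2) / np.sum((obs - naive) ** 2))
```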
2.1.4. The PONS2train
The tested MLPANN models were implemented using the PONS2train software application. PONS2train is written in the C++ programming language, and its main goal is to test MLP models with different architectures. The software application uses the LAPACK, BLAS, and ARMADILLO C++ linear algebra libraries [39–41]. The application is freely distributed upon request to the authors.
PONS2train has additional features. The weight initialization can be performed using two methods: the first follows the work of Nguyen and Widrow [42], while the second uses random initialization drawn from a uniform distribution.

Giustolisi and Laucelli [25] extensively studied eight methods for improving the MLP performance and generalization. One of them, early stopping, is incorporated in the designed application. Following the recommendations of Stäger and Agarwal [43], PONS2train also guards against neuron saturation.
The important PONS2train implementation feature is the multirun and ensemble simulation. Its software design also enables further multimodel or hybrid MLP extensions [29, 44].
The software design also allows the comparative analysis of MLP architectures with or without bias neurons in the layers. PONS2train also enables the comparison of MLPs trained on shuffled and unshuffled datasets. The shuffling of the data patterns follows the random permutation algorithm of Durstenfeld [45].
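The Durstenfeld (1964) formulation of the random permutation (the in-place Fisher-Yates shuffle) can be sketched as:

```python
import random

def durstenfeld_shuffle(items, rng=random.random):
    """In-place random permutation as formulated by Durstenfeld (1964).

    Walks the list from the last element down, swapping each position
    with a uniformly chosen earlier (or same) position.
    """
    for i in range(len(items) - 1, 0, -1):
        j = int(rng() * (i + 1))           # uniform index in 0..i
        items[i], items[j] = items[j], items[i]
    return items
```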
The MLP datasets are scaled using two methods. Both methods scale the analyzed datasets into an interval with an arbitrarily chosen upper bound. The nonlinear scaling provides the transformed data obtained from the original data using an exponential transformation with a control parameter. The second scaling method is a linear one.
2.2. The Dataset Description
We explored the MLPANN models using the rainfall and runoff time series data obtained from a 10-year monitoring in the Modrava catchment (0.17 km^{2}). The experimental watershed was established in 1998 in the upper parts of the Bohemian Forest National Park. The basin belongs to a set of testbeds designed to monitor the hydrological behavior of headwater forested catchments. A detailed description of the watershed is given by Pavlasek et al. [46].
The forest cover is a clearing with young artificially planted forest combined with an undergrowth of herbs (mainly Calamagrostis villosa, Avenella flexuosa, Scirpus sylvaticus, and Vaccinium myrtillus) and bryophytes (Polytrichastrum formosum, Dicranum scoparium, and Sphagnum girgensohnii). A small part of the catchment (less than 10%) is covered by a 40-year-old forest; the original forest cover was removed by a bark beetle calamity. The catchment bedrock is formed by granite, migmatite, and paragneiss, covered by Haplic Podzols with depths of up to 0.9 m. The mean runoff coefficient is 0.2, and the mean daily runoff is 1.2 mm.
The nine most significant rainfall-runoff events, observed in an hourly time step, were selected from the 10-year measurement period. The flood runoff prediction was analyzed with the proposed MLPANN models. The characteristics of the flood events are described in Table 3. All flood events were complemented with the periods of 5 preceding days. The rainfall-runoff events were divided into nonoverlapping training, testing, and validation datasets.

The division of the flood events into the datasets was made with respect to the similarity of the empirical distribution functions of the training, testing, and validation datasets and to their independence. The empirical distribution functions were estimated using a quantile estimation method suitable for the description of hydrological time series (for detailed information see [47]). The selected quantiles of all datasets are shown in Table 4. The quantiles show that the differences in the information content of the training, testing, and validation datasets are not significant.

3. Results and Discussion
We tested MLPANN models with 4 MLP architectures, which differ in the number of hidden layer neurons. For each MLP architecture, we prepared 11 types of MLPANN models according to the type of hidden layer activation function (AF) (see Table 1). Each of them was trained with 7 training algorithms (TA) (see Table 2).
All MLPANN datasets consisted of all available pairs of four inputs and one output. The inputs were one preceding runoff interval and three preceding rainfall intervals, and the output was formed from one runoff value, for all available time intervals. The total number of training pairs was 1270, the number of testing input-output pairs was 1221, and the number of validation pairs was 1423.
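The construction of the input-output pairs can be sketched as follows. The exact lag indices are not restated in the text, so the choice of the three most recent rainfall values and the last runoff value is an assumption:

```python
import numpy as np

def make_patterns(rain, runoff):
    """Build input-output pairs from rainfall and runoff series.

    Each pattern has four inputs (one lagged runoff value and three
    lagged rainfall values, an assumed lag structure) and one output
    (the runoff of the current time interval).
    """
    X, y = [], []
    for t in range(3, len(runoff)):
        X.append([runoff[t - 1], rain[t - 1], rain[t - 2], rain[t - 3]])
        y.append(runoff[t])
    return np.array(X), np.array(y)
```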
Although there are suitable methodologies for the selection of the proper input vector for an MLP model, for example, [48–50], we based our flood forecast on a small number of previous rainfall intervals and one previous runoff value, mainly due to the fast hydrological response of the analyzed watershed. The datasets were transformed using the nonlinear exponential transformation.
Each training algorithm was repeated 150 times. The random initialization of the network weights was performed by the method of [42]. Each optimization multirun used the same set of 150 mutually different initial random weight vectors, in order to ensure that the comparison of the performances of the optimization algorithms was based on the same random weight initializations.
3.1. The Benchmark Model
The flood forecast was also simulated using a benchmark model based on a simple linear model (SLMB). The SLMB parameters were calculated using ordinary least squares. Table 5 shows the results obtained from the simulation of the SLMB benchmark model.
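A benchmark of this kind can be sketched as an ordinary least squares fit (an illustration, not the authors' exact SLMB configuration):

```python
import numpy as np

def fit_slmb(X, y):
    """Fit a simple linear benchmark model by ordinary least squares.

    X : (N, k) matrix of inputs (e.g. lagged rainfall and runoff)
    y : (N,)   observed runoff
    Returns the coefficient vector, with the intercept first.
    """
    A = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_slmb(coef, X):
    """Predict runoff from fitted coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef
```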

Since the benchmark model provides a single simulation and one value of each tested model comparison measure, we compared the results of the SLMB with the results of the best selected single MLPANN models. In the model ensemble, we found MLPANN models which were superior to the SLMB.
For example, the model performance based on the PI index shows that all training algorithms provided MLPANN models which were superior to the SLMB (see the results in Table 6). The highest differences between the best PI values of the ANN and the PI of the SLMB were obtained for the MLPANN trained using the LM algorithm on the training dataset. The LM and PER training algorithms provided the models with the highest values of PI on the testing and validation datasets, respectively.

These conclusions are in agreement with the values of the remaining model performance measures, MAE, NS, and R4MS4E (see Table 7). The LM and BP_regul were superior in terms of the differences with the SLMB according to the MAE and R4MS4E. The LM and PER were superior to the SLMB for the NS values on the training, testing, and validation datasets.

Similar results can be found when comparing the results of the SLMB with the best MLPANN models organized in terms of the different transfer functions. The highest differences of the PI values were obtained on the training dataset for the MLPANN with the LL transfer function, on the testing dataset for the RS transfer function, and on the validation dataset for the LL transfer function. These were calculated for the MLPANN with transfer functions which were successful in more than 10% of the simulations on the validation dataset.

Those results were confirmed by the values of MAE, NS, and R4MS4E obtained for the best model of a simulation ensemble. The RS transfer function provided the best results in terms of the differences from the SLMB values of MAE, NS, and R4MS4E on the training, testing, and validation datasets.
3.2. The Optimization Algorithms
The results of the MLPANN models are explained through the values of the model performance measures shown in Tables 6 and 7. All training computations controlled the neuron saturation using the method of Stäger and Agarwal [43]. The parameters of the TA (i.e., number of epochs, learning rate, etc.) were selected in such a way that the number of MLPANN evaluations was similar for all tested TA.
Table 6 shows the results of the persistency index, which was used as the main reference index, since the PI compares the model with the last observed information [38]. The best TA according to the number of successfully trained models was the PER (the scaled conjugate gradient method with the Perry updating formula). The highest numbers of successfully trained models are reported in Table 6 (ntrained = 1181, ntest = 838, and nval = 468).
When comparing the performance of the TA according to the best single value of PI (see columns PI_train, PI_test, and PI_val in Table 6) and the average performance of the best MLPANN models on PI (see columns mPI_train, mPI_test, and mPI_val in Table 6), the Levenberg-Marquardt algorithm was mostly superior to all remaining TA, except for three cases in which the PER and BP_regul were better on the validation datasets, both on the best single value of PI and on the average mPI_val.
Table 7 displays the results of the best models for the remaining statistical measures of the MLPANN models trained with the tested TA. Only three algorithms were superior for at least one MLP architecture and on at least one dataset: LM, PER, and BP_regul. Again, the LM was mostly superior to the other tested TA. The differences between the results of LM, PER, and BP_regul were very small.
The best values of NS were in agreement with the values of PI (see, e.g., the PER results). The BP_regul was better in terms of the length of the residuals for MAE_test on some of the MLPANN architectures. Also, when comparing the simulation of peak flows in terms of R4MS4E, the BP_regul was better on the validation dataset.
Our findings are in agreement with the results on runoff forecast of Piotrowski and Napiorkowski [18], who compared the Levenberg-Marquardt approach even with more robust global optimization schemes and found that the LM provides results comparable with MLPs trained using selected evolutionary computation methods.
3.3. The Transfer Functions
The results of PI, MAE, NS, and R4MS4E are shown in Tables 8 and 9. The PI again served as the reference. We trained the MLP with all AFs listed in Table 1. Tables 8 and 9 show the results of the AFs for the MLPANN models which were successful in more than 10% of the simulations on the validation dataset.


When comparing the absolute numbers of successfully trained MLPANN models, the models with two AFs (RS and CLm) were superior to the MLP models with the remaining 9 AFs. The MLP with RS provided the larger number of better models in terms of the PI value on 8 datasets, while the MLP with the CLm transfer function was successful on 4 datasets.
RS was also the most successful AF on the training dataset (note that the differences in PI between RS and CLm are almost insignificant). The LL also provided good results on the training dataset, for all tested numbers of hidden neurons, and on the validation data.
The mean performances, based on the arithmetical means of the PI values of the best models, showed that three AFs were superior to the remaining 8 AFs (see mPI_train, mPI_test, and mPI_val in Table 8). They were the CL, HT, and RS MLPANN models. Their differences in PI were again very small.
Table 9 shows the averages of MAE, NS, and R4MS4E on the set of tested models. The results point out that the RS transfer function provided overall superior values compared to the rest of the tested AFs. The CLm, HT, and LS activation functions were better on some datasets in terms of the mean values of the tested statistical measures, but their differences from the RS MLPANN models were again negligible.
Reflecting the results of da S. Gomes et al. [15], who recommended the CL, CLm, and LL functions for MLP ANN models, we point out the ability of the MLP models with RS to improve the flood runoff forecast.
Our findings on the selection of a suitable AF for MLPANN models indicate that different AFs should be tested during the implementation of MLP models for flood runoff forecast.
4. Conclusions
During the extensive computational test, we trained in total 46200 models of the multilayer perceptron with one hidden layer. The main aim of the computational exercise was to evaluate the impact of the transfer function selection and to test selected local optimization schemes on flood runoff forecast.
Using the rainfall-runoff data of nine of the most significant flood events, we analyzed the short-term runoff forecast on a small watershed with fast hydrological response. The developed MLPANN models were able to predict flood runoff using the records of past rainfall and runoff from the basin.
When comparing the tested MLPANN models with the benchmark simple linear model, the developed MLP models were superior to the SLMB in terms of the values of the model performance measures.
The PONS2train software application was developed for the purposes of the evaluation of MLPANN models with different architectures and for providing the simulations of the neural network flood forecast.
When analyzing the 7 different gradient oriented optimization schemes, we found that the Levenberg-Marquardt algorithm was superior to the tested set of scaled conjugate gradient methods and the two first order local optimization schemes.
When analyzing the 11 different transfer functions used in the hidden neurons, we found that the RootSig function was, according to the values of the four model performance measures, the most promising activation function for flood runoff forecast.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
[1] H. R. Maier and G. C. Dandy, “Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications,” Environmental Modelling and Software, vol. 15, no. 1, pp. 101–124, 2000.
[2] C. W. Dawson and R. L. Wilby, “Hydrological modelling using artificial neural networks,” Progress in Physical Geography, vol. 25, no. 1, pp. 80–108, 2001.
[3] H. R. Maier, A. Jain, G. C. Dandy, and K. P. Sudheer, “Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions,” Environmental Modelling and Software, vol. 25, no. 8, pp. 891–909, 2010.
[4] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, New York, NY, USA, 1995.
[5] R. D. Reed and R. J. Marks, Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks, MIT Press, Cambridge, Mass, USA, 1998.
[6] A. W. Minns and M. J. Hall, “Artificial neural networks as rainfall-runoff models,” Hydrological Sciences Journal, vol. 41, no. 3, pp. 399–417, 1996.
[7] H. K. Cigizoglu, “Estimation, forecasting and extrapolation of river flows by artificial neural networks,” Hydrological Sciences Journal, vol. 48, no. 3, pp. 349–361, 2003.
[8] N. J. de Vos and T. H. M. Rientjes, “Constraints of artificial neural networks for rainfall-runoff modelling: trade-offs in hydrological state representation and model evaluation,” Hydrology and Earth System Sciences, vol. 9, no. 1-2, pp. 111–126, 2005.
[9] G. Napolitano, F. Serinaldi, and L. See, “Impact of EMD decomposition and random initialisation of weights in ANN hindcasting of daily stream flow series: an empirical examination,” Journal of Hydrology, vol. 406, no. 3-4, pp. 199–214, 2011.
[10] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.
[11] N. J. de Vos and T. H. M. Rientjes, “Multiobjective training of artificial neural networks for rainfall-runoff modeling,” Water Resources Research, vol. 44, no. 8, 2008.
[12] E. Toth and A. Brath, “Multistep ahead streamflow forecasting: role of calibration data in conceptual and neural network modeling,” Water Resources Research, vol. 43, no. 11, 2007.
[13] M. P. Rajurkar, U. C. Kothyari, and U. C. Chaube, “Modeling of the daily rainfall-runoff relationship with artificial neural network,” Journal of Hydrology, vol. 285, no. 1–4, pp. 96–113, 2004.
[14] C. M. Zealand, D. H. Burn, and S. P. Simonovic, “Short term streamflow forecasting using artificial neural networks,” Journal of Hydrology, vol. 214, no. 1–4, pp. 32–48, 1999.
[15] G. S. da S. Gomes, T. B. Ludermir, and L. M. M. R. Lima, “Comparison of new activation functions in neural network for forecasting financial time series,” Neural Computing & Applications, vol. 20, no. 3, pp. 417–439, 2011.
[16] D. J. C. MacKay, Information Theory, Inference and Learning Algorithms, Cambridge University Press, New York, NY, USA, 2003.
[17] M. Hagan and M. Menhaj, “Training feedforward networks with the Marquardt algorithm,” IEEE Transactions on Neural Networks, vol. 5, no. 6, pp. 989–993, 1994.
[18] A. P. Piotrowski and J. J. Napiorkowski, “Optimizing neural networks for river flow forecasting—evolutionary computation methods versus the Levenberg-Marquardt approach,” Journal of Hydrology, vol. 407, no. 1–4, pp. 12–27, 2011.
[19] A. E. Kostopoulos and T. N. Grapsa, “Self-scaled conjugate gradient training algorithms,” Neurocomputing, vol. 72, no. 13–15, pp. 3000–3019, 2009.
[20] J. Barzilai and J. M. Borwein, “Two-point step size gradient methods,” IMA Journal of Numerical Analysis, vol. 8, no. 1, pp. 141–148, 1988.
[21] M. J. D. Powell, “Restart procedures for the conjugate gradient method,” Mathematical Programming, vol. 12, no. 2, pp. 241–254, 1977.
[22] A. Shamseldin, A. Nasr, and K. O'Connor, “Comparison of different forms of the multi-layer feed-forward neural network method used for river flow forecasting,” Hydrology and Earth System Sciences, vol. 6, no. 4, pp. 671–684, 2002.
[23] R. R. Shrestha, S. Theobald, and F. Nestmann, “Simulation of flood flow in a river system using artificial neural networks,” Hydrology and Earth System Sciences, vol. 9, no. 4, pp. 313–321, 2005.
[24] H. Yonaba, F. Anctil, and V. Fortin, “Comparing sigmoid transfer functions for neural network multistep ahead streamflow forecasting,” Journal of Hydrologic Engineering, vol. 15, no. 4, pp. 275–283, 2010.
[25] O. Giustolisi and D. Laucelli, “Improving generalization of artificial neural networks in rainfall-runoff modelling,” Hydrological Sciences Journal, vol. 50, no. 3, pp. 439–457, 2005.
[26] C. W. Dawson, R. J. Abrahart, and L. M. See, “HydroTest: further development of a web resource for the standardised assessment of hydrological models,” Environmental Modelling & Software, vol. 25, no. 11, pp. 1481–1482, 2010.
[27] C. W. Dawson, R. J. Abrahart, and L. M. See, “HydroTest: a web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts,” Environmental Modelling and Software, vol. 22, no. 7, pp. 1034–1052, 2007.
[28] A. Y. Shamseldin, “Application of a neural network technique to rainfall-runoff modelling,” Journal of Hydrology, vol. 199, no. 3-4, pp. 272–294, 1997.
[29] W. Wang, P. H. A. J. M. V. Gelder, J. K. Vrijling, and J. Ma, “Forecasting daily streamflow using hybrid ANN models,” Journal of Hydrology, vol. 324, no. 1–4, pp. 383–399, 2006.
[30] B. Pang, S. Guo, L. Xiong, and C. Li, “A nonlinear perturbation model based on artificial neural network,” Journal of Hydrology, vol. 333, no. 2–4, pp. 504–516, 2007.
[31] A. P. Piotrowski and J. J. Napiorkowski, “A comparison of methods to avoid overfitting in neural networks training in the case of catchment runoff modelling,” Journal of Hydrology, vol. 476, pp. 97–111, 2013.
[32] C. Imrie, S. Durucan, and A. Korre, “River flow prediction using artificial neural networks: generalisation beyond the calibration range,” Journal of Hydrology, vol. 233, no. 1–4, pp. 138–153, 2000.
[33] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C++: The Art of Scientific Computing, Cambridge University Press, 2002.
[34] T. Masters, Practical Neural Network Recipes in C++, Morgan Kaufmann, 1st edition, 1993.
[35] N. Andrei, “Scaled conjugate gradient algorithms for unconstrained optimization,” Computational Optimization and Applications, vol. 38, no. 3, pp. 401–416, 2007.
[36] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, pp. 149–154, 1964.
[37] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, no. 6088, pp. 533–536, 1986.
[38] P. K. Kitanidis and R. L. Bras, “Real-time forecasting with a conceptual hydrologic model. 2: applications and results,” Water Resources Research, vol. 16, no. 6, pp. 1034–1044, 1980.
[39] E. Anderson, Z. Bai, J. Dongarra et al., “LAPACK: a portable linear algebra library for high-performance computers,” in Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 2–11, IEEE Computer Society Press, Los Alamitos, Calif, USA, November 1990.
[40] L. S. Blackford, J. Demmel, J. Dongarra et al., “An updated set of basic linear algebra subprograms (BLAS),” ACM Transactions on Mathematical Software, vol. 28, no. 2, pp. 135–151, 2002.
[41] C. Sanderson, “Armadillo: an open source C++ linear algebra library for fast prototyping and computationally intensive experiments,” Tech. Rep., NICTA, Sydney, Australia, 2010.
[42] D. Nguyen and B. Widrow, “Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN '90), vol. 1–3, pp. C21–C26, International Neural Network Society, San Diego, Calif, USA, June 1990.
[43] F. Stäger and M. Agarwal, “Three methods to speed up the training of feedforward and feedback perceptrons,” Neural Networks, vol. 10, no. 8, pp. 1435–1443, 1997.
[44] Z. Huo, S. Feng, S. Kang, G. Huang, F. Wang, and P. Guo, “Integrated neural networks for monthly river flow estimation in arid inland basin of Northwest China,” Journal of Hydrology, vol. 420-421, pp. 159–170, 2012.
[45] R. Durstenfeld, “Algorithm 235: random permutation,” Communications of the ACM, vol. 7, no. 7, p. 420, 1964.
[46] J. Pavlasek, M. Tesar, P. Maca et al., “Ten years of hydrological monitoring in upland microcatchments in the Bohemian Forest, Czech Republic,” in Status and Perspectives of Hydrology in Small Basins, pp. 213–219, IAHS, 2010.
[47] R. J. Hyndman and Y. Fan, “Sample quantiles in statistical packages,” The American Statistician, vol. 50, no. 4, pp. 361–365, 1996.
[48] G. J. Bowden, G. C. Dandy, and H. R. Maier, “Input determination for neural network models in water resources applications. Part 1: background and methodology,” Journal of Hydrology, vol. 301, no. 1–4, pp. 75–92, 2005.
[49] R. J. May, H. R. Maier, G. C. Dandy, and T. M. K. G. Fernando, “Non-linear variable selection for artificial neural networks using partial mutual information,” Environmental Modelling and Software, vol. 23, no. 10-11, pp. 1312–1326, 2008.
[50] R. J. May, G. C. Dandy, H. R. Maier, and J. B. Nixon, “Application of partial mutual information variable selection to ANN forecasting of water quality in water distribution systems,” Environmental Modelling & Software, vol. 23, no. 10-11, pp. 1289–1299, 2008.
Copyright
Copyright © 2014 Petr Maca et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.