Abstract

Design storms are widely used in hydrological and hydraulic practice and are obtained from statistical analysis of precipitation records. However, design storms are often quite unlike natural rainstorms, so their use may result in oversized or undersized drainage facilities. For these reasons, in this study, a two-parameter double exponential function is proposed to parameterize historical storm events. The proposed function has been assessed against the storms selected from 5-year rainfall time series with a 1-minute resolution, measured at three meteorological stations located in Calabria, Italy. In particular, a nonlinear least squares optimization has been used to identify the parameters. Several evaluation methods used in previous studies to measure the goodness of fit have been applied, with excellent performance. One parameter is related to the centroid of the rain distribution; the large standard deviation of the second one is related to the high standard deviation of the kurtosis of the selected events. Finally, considering the similarity between the proposed function and the Gumbel function, the two parameters have also been computed with the method of moments; in this case, the correlation values were lower than those computed with nonlinear least squares optimization but sufficiently accurate for design purposes.

1. Introduction

In recent years, climate change and increasing soil sealing have favoured the occurrence of critical situations that cause surface flooding in urban areas. Floods are the most dangerous meteorological hazard in the Mediterranean area, owing both to the number of people involved and to the relatively high frequency with which human activities and goods suffer damage and losses [1]. Flooding in urban areas can occur due to several factors that vary according to the kind of drainage system used (separate or combined sewers) and its design characteristics. In addition to these variables, rainfall plays the main role in flooding characterization, although it is affected by substantial uncertainty in both space and time [2]. Its triggering factors are also very complex, so it is one of the most difficult variables to predict [3, 4]. Several studies concluded that heavy rainfall is generated by various factors, which can be summarized as (a) high moisture content of the air mass present over the zone, (b) vertical movement on one or more scales, and (c) static instability.

Heavy rainfall can be the result of persistent moderate precipitation or of very intense precipitation of short duration. Critical rainfall events are therefore represented by heavy precipitation that occurs in a very short time [5].

The use of design storms is very popular among hydraulic engineers. Several techniques to develop design storms have been studied, including intensity-duration-frequency (IDF) curves, stochastic models, and profiles obtained directly from rainfall records. A critical characteristic of IDF curves is that intensities are averaged over the specified duration and do not represent the real time distribution of rainfall. Usually, design storms are developed by statistical analysis of rainfall records but, unfortunately, they are sometimes not tested against long-term rainfall records. In addition, much of the digitised rainfall data from numerous stations is regarded as unreliable, with important events missing and errors in the digitised data. In an attempt to overcome these problems and to improve reliability, a double exponential approach has been applied to estimate design storms for short duration (<1 h) rainfalls.

The aim of this research is to propose a cumulative rainfall function (CRF) that describes the subhourly rainfall distribution and to assess it against observed data from three sites in the Calabria region, South Italy. The Calabria region has a mean rainfall of about 1170 mm/year; several studies have classified the storms affecting Calabria into four main groups: (a) storms that originate on the lee of the Alps (including those over the Gulf of Genoa); (b) storms that develop in the Western Mediterranean (Gulf of Lyon, Rhone valley, and Iberia); (c) storms that develop in Northern Africa or enter the Mediterranean from the Strait of Gibraltar; (d) storms that move over the central Mediterranean from the Balkans and Eastern Europe [10]. Storms of class (a) are the most frequent ones. This research work intends to propose a multiparameter function of double exponential type, applied to observed data from meteorological stations located in urban watersheds, with the aim of obtaining significant input for the design of urban drainage systems; because urban basins are considered, subhourly events were chosen. Several studies have been performed using stochastic models to simulate observed data [2, 11–16]. To separate one storm event from another, an appropriate minimum interevent time (MIT) was chosen. Almost all of the mathematical models used in the literature to describe or simulate hydrological processes, including stochastic ones, require techniques that allow their performance to be evaluated. In general, to evaluate the performance of a model, the calculated values must be compared with the corresponding measured or reference values. The various indices and their suitability have been widely discussed in the literature [17–24]. In addition, several studies identify some common points for evaluating a mathematical model.
In particular, (1) a standard procedure for the mathematical evaluation of a model is needed; (2) the performance evaluation of a model should include at least one absolute value error indicator (in the variable units), one dimensionless index (or indicator of the relative error) for quantifying the goodness of fit, and a graphical representation of the relationship between model estimates and observations [21].

2. Materials and Methods

2.1. Data Collection and Analysis

Precipitation data, consisting of rainfall depth recorded at 1-minute intervals, were collected from the Functional Meteorological Hydrographic and Mareographic Center database of the Calabria region, South Italy. This institution has the principal role of measuring and collecting climate information for the whole region. Three stations equipped with tipping-bucket rain gauges were selected for the data collection. The data, which covered a period of 5 years, were examined and missing records were resampled. Figure 1 summarizes the geographical characteristics of the three stations chosen for this work.

To separate one storm event from another, the Minimum Interevent Time (MIT) criterion was chosen. The MIT is defined as the time between the end of a storm event and the beginning of the next. The choice of a particular value of MIT has often been related to changes in the physical parameters involved in the definition of independent rainfall events. This value has therefore never been unique but is closely related to the type of analysis or to the natural phenomenon observed. For example, Bracken et al. [9] used MIT = 12 h so that the ground could dry between runoff events, reducing the impact of antecedent soil moisture on the runoff event. Table 1 summarizes the typical range of MIT values found in the literature.

In urban areas, the rains that deserve special attention are those of short duration, such as subhourly events. Critical rainfall events are therefore represented by heavy precipitation that occurs in a very short time [5]. As discussed by Carbone et al. [25], an appropriate value of MIT to identify independent subhourly rainfall events at the urban catchment scale is 15 minutes. Using this value of MIT and considering a volume threshold of 8 mm, a number of rainfall events were selected from each location. Rainfall events separated by less than 15 min were merged and considered as a single event. Obviously, the choice of the volume threshold can condition both the number of events resulting from the chosen MIT and the characteristics of the events.
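The event selection described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `split_events`, its arguments, and the convention that the series holds one depth value per minute are assumptions made here.

```python
from typing import List, Tuple


def split_events(depths_mm: List[float], mit_min: int = 15,
                 min_volume_mm: float = 8.0) -> List[Tuple[int, int, float]]:
    """Return (start_index, end_index, volume_mm) for each retained event.

    depths_mm holds one rainfall depth per minute (assumed layout).
    A dry spell of at least mit_min minutes closes the current event;
    shorter dry spells are merged into the event.
    """
    spans = []
    start = None      # index of the first wet minute of the open event
    last_wet = None   # index of the most recent wet minute
    for i, depth in enumerate(depths_mm):
        if depth <= 0:
            continue
        # Number of dry minutes since the previous wet minute.
        if start is not None and i - last_wet - 1 >= mit_min:
            spans.append((start, last_wet))
            start = None
        if start is None:
            start = i
        last_wet = i
    if start is not None:
        spans.append((start, last_wet))

    # Discard events whose total depth is below the volume threshold.
    events = []
    for s, e in spans:
        volume = sum(depths_mm[s:e + 1])
        if volume >= min_volume_mm:
            events.append((s, e, volume))
    return events
```

A further filter on event duration (for example, keeping only events shorter than 60 minutes) would be needed to retain subhourly events only; it is left out here because the exact rule is not stated in the text.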

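The statistical analysis is crucial for the mathematical characterization of rainfall events. In particular, skewness, kurtosis, and the coefficient of variation are among the most important shape parameters of a statistical distribution. Skewness ($S$), kurtosis ($K$), and the coefficient of variation ($CV$) are computed as follows:

$$S = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i-\mu}{\sigma}\right)^{3}, \qquad K = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i-\mu}{\sigma}\right)^{4} - 3, \qquad CV = \frac{\sigma}{\mu}, \tag{1}$$

where $x_i$ represents the $i$th data point, $\mu$ is the mean, $N$ is the number of data points, and $\sigma$ is the standard deviation of the data.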

The skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable. The skewness of a normal distribution is zero, and any symmetric data should have a skewness near zero. The kurtosis, on the other hand, is a measure of the “peakedness” of the probability distribution of a real-valued random variable.

For these reasons, in this research, a statistical analysis of selected rainfall events has been performed in order to characterize the statistical behaviour of samples. In Table 2, the values of skewness, kurtosis, and variation are listed.

All selected events have positive skewness and are not normally distributed. Kurtosis is almost always positive; only four events exhibit negative kurtosis, which indicates a flatter, more uniform rain distribution. The station of Cortale always exhibits positive values of kurtosis. The mean coefficient of variation is 1.31, which indicates a high variability in the distribution of rain data.
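As an illustration of how these shape statistics could be computed for one event, the sketch below uses the population (biased) moment estimators and the excess form of kurtosis, so that a normal sample gives values near zero. Both choices, as well as the function name `shape_statistics`, are assumptions of this sketch; the estimators used in the study may differ in their normalisation.

```python
import math
from typing import Sequence, Tuple


def shape_statistics(x: Sequence[float]) -> Tuple[float, float, float]:
    """Return (skewness, excess kurtosis, coefficient of variation)."""
    n = len(x)
    mean = sum(x) / n
    # Population standard deviation (divides by n, not n - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    skew = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
    # Subtracting 3 gives excess kurtosis, which can be negative.
    kurt = sum((v - mean) ** 4 for v in x) / (n * std ** 4) - 3.0
    cv = std / mean
    return skew, kurt, cv


# Hypothetical per-minute depths (mm): a single sharp burst in a short event.
skew, kurt, cv = shape_statistics([0.0, 0.0, 0.0, 0.0, 10.0])
```

For this hypothetical burst the skewness is positive, consistent with rainfall concentrated in a small part of the event.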

2.2. Cumulative Rainfall Function

In general, stochastic processes are used in the literature to reproduce rainfall behaviour [11–16]. Usually, when a statistical procedure is applied to annual maximum values, the probability of exceedance can be converted into a “return period (years).” However, the critical values of the rain event refer not only to the rain intensity (or rain depth) but also to the “time distribution” of the rain quantity. It is difficult to assign a return period to a sequence of discrete values of rainfall intensity over a given design storm duration. In this study, to overcome this problem, a parameterization of the cumulative historical rainfall event curve is proposed.

Rainfall depths are often described by means of the exponential distribution, sometimes by the generalized Pareto distribution hereafter referred to as GPD, and more rarely by the gamma distribution or the generalized extreme value (GEV) distribution. Results showed that although all the analysed distributions were able to satisfactorily reproduce ordinary statistics, generalized Pareto distribution was able to better reproduce the observed behaviour [26]. The selection of a particular distribution is mainly influenced by the value of the skewness of the samples considered.

In solving hydrologic problems, such as the design of urban storm sewer systems, it is very important to know the time distribution of rainfall. In hydrological theory, theoretical distribution functions are usually used to describe rainfall characteristics, and their parameters are calibrated using historical rainfall events.

As shown by the analysis of selected rainfall events, samples exhibit positive skewness and kurtosis, which, in general, indicate that samples are not normally distributed. In this case, considering the values of skewness and kurtosis, the sample could be represented by double exponential functions, such as Gumbel type.

The cumulative rain depth, as a function of time, often exhibits a sigmoidal behaviour. For this reason, and given the values of skewness and kurtosis discussed above, a Gumbel-type function is proposed to parameterize the cumulative rainfall time series.

In particular, a double exponential function is proposed:

$$F(t) = \exp\left\{-\exp\left[-\alpha\,(t-\beta)\right]\right\}, \tag{2}$$

where $F(t)$ represents the parametric function proposed and $\alpha$ and $\beta$ are the parameters of the function, which depend on the characteristics of the sample considered.

In order to identify these parameters, a nonlinear least squares optimization has been performed. Nonlinear optimization problems arise in numerous fields such as economics, statistics, and engineering. An optimization problem begins with a set of independent variables and often includes conditions or restrictions that define acceptable values of the variables. Such restrictions are called constraints. The essential component is the objective function, which depends in some way on the variables. The solution of an optimization problem is a set of allowed values of the variables for which the objective function assumes an optimal value. In general, a parameter of a function appears nonlinearly if the derivative of the function with respect to that parameter is itself a function of the parameter. Accordingly, a function is nonlinear if at least one of its parameters appears nonlinearly.

The least squares (LS) method is a standard approach in finding an approximation to an overdetermined system of equations. It is also the most important application in data fitting. A standard example of using LS is the case of fitting a line to a given set of points, called linear regression. In general, one can distinguish between linear least squares (LLS) and nonlinear least squares (NLS).

LLS implies that the residuals are all linear. There is a closed form for the solution and it occurs mostly in regression analysis, meaning that one has a set of data points and fits a function (usually a polynomial of degree 1, 2, or 3) to best approximate the given data set.

NLS implies that the residuals are not all linear. In general, one does not have a closed form for the solution and needs to apply an iterative procedure in order to obtain a solution. Furthermore, several minima might exist, and finding the global minimum might require high computational costs.

Because the proposed double exponential function is nonlinear in its parameters, nonlinear least squares optimization has been used.

Numerical methods for nonlinear optimization problems are iterative. At the $k$th iteration, a current approximate solution $x_k$ is available. A new point $x_{k+1}$ is computed by certain techniques, and this process is repeated until a point can be accepted as a solution. The classical methods for optimization are line search algorithms. Such an algorithm obtains a search direction in each iteration and searches along this direction to obtain a better point. The search direction is a descent direction, normally computed by solving a subproblem that approximates the original optimization problem near the current iterate. Therefore, unless a stationary point is reached, there always exist better points along the search direction. To set the function parameters and solve the nonlinear curve fitting (data fitting) problem in the least squares sense, the “trust-region-reflective” algorithm was used.

Nonlinear least squares problems are usually solved with one of the following two algorithms: (1) trust-region-reflective (TRR); (2) Levenberg-Marquardt (LM). Levenberg-Marquardt algorithms and trust region algorithms are both Newton step-based methods (they are called “restricted Newton step methods”). Thus, they both exhibit quadratic speed of convergence near the optimal value. Far from the solution, a negative curvature can be encountered. If this happens, Levenberg-Marquardt algorithms slow down dramatically, whereas trust region algorithms perform better each time a negative curvature is encountered and thus outperform the Levenberg-Marquardt algorithms. The trust region reflective algorithm has been chosen because (i) analytical first and second order derivative information can be included, (ii) upper and lower bounds on parameters can be considered easily, and (iii) it is computationally faster than LM [27].
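A bounded trust-region-reflective fit of this kind can be sketched with SciPy, whose `scipy.optimize.least_squares` solver offers `method="trf"`. The sketch assumes that the double exponential is written in the Gumbel form $F(t) = \exp\{-\exp[-\alpha(t-\beta)]\}$ and that the cumulative depth has been normalised by the event total. The names `crf`, `fit_crf`, `alpha`, and `beta`, the starting guess, and the bounds are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import least_squares


def crf(t, alpha, beta):
    """Gumbel-type double exponential curve (assumed parameterisation)."""
    return np.exp(-np.exp(-alpha * (t - beta)))


def fit_crf(t, f_obs):
    """Fit (alpha, beta) to a normalised cumulative rainfall curve."""
    t = np.asarray(t, dtype=float)
    f_obs = np.asarray(f_obs, dtype=float)

    def residuals(p):
        # One residual per sample: fitted minus observed cumulative fraction.
        return crf(t, p[0], p[1]) - f_obs

    # Rough starting point: moderate slope, location at mid-event.
    x0 = np.array([4.0 / (t[-1] - t[0]), t.mean()])
    # alpha must be positive; beta is kept inside the event duration.
    bounds = ([1e-6, t[0]], [np.inf, t[-1]])
    result = least_squares(residuals, x0, bounds=bounds, method="trf")
    return result.x


# Synthetic 40-minute event generated from known parameters.
t = np.arange(0.0, 40.0)
alpha_hat, beta_hat = fit_crf(t, crf(t, 0.3, 12.0))
```

On noise-free synthetic data the fit should recover the generating parameters closely; with real events the result depends on the starting guess and on how well the event follows a single sigmoid.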

2.3. Statistical Evaluation

In general, in order to verify the accuracy of a model, it is necessary to perform a statistical evaluation to ensure the validity of the results. Many statistical indices are available for evaluating a hydrological model, and the choice of one index over another depends on the type of hydrological model to be validated. The various indices and their suitability have been widely discussed in the literature [17–24]. The quantitative statistics are generally divided into three categories: standard regression, goodness of fit (GOF), and error index. Standard regression statistics determine the strength of the linear relationship between simulated and measured data; the goodness of fit of a statistical model describes how well it fits a set of observations. GOF indices summarize the discrepancy between the observed values and the values expected under a statistical model; error indices quantify the deviation in the units of the data of interest.

For these reasons, in order to have a global assessment of the proposed function, the most widely used indices were adopted for each of the categories described above.

2.3.1. Statistical Evaluation (Standard Regression: Coefficient of Determination $R^2$)

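The coefficient of determination ($R^2$) describes the proportion of the variance in measured data explained by the model. $R^2$ ranges from 0 to 1, with higher values indicating less error variance, and typically values greater than 0.5 are considered acceptable [21]. Given two cumulative functions, the observed one $F_o$ and the modelled one $F_m$, sampled at $N$ points, $R^2$ is given by

$$R^2 = \left[\frac{\sum_{i=1}^{N}\left(F_{o,i}-\bar{F}_o\right)\left(F_{m,i}-\bar{F}_m\right)}{\sqrt{\sum_{i=1}^{N}\left(F_{o,i}-\bar{F}_o\right)^{2}}\,\sqrt{\sum_{i=1}^{N}\left(F_{m,i}-\bar{F}_m\right)^{2}}}\right]^{2}, \tag{3}$$

where $\bar{F}_o$ and $\bar{F}_m$ are the mean values of $F_o$ and $F_m$, respectively.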

2.3.2. Statistical Evaluation (Goodness of Fit: Kolmogorov-Smirnov Test)

The Kolmogorov-Smirnov (KS) statistic provides a means of testing whether a set of observations comes from a specific continuous distribution. The usual alternative would be the chi-square test. The KS test has at least two major advantages over the chi-square test: (1) it can be used with small sample sizes, where the validity of the chi-square test would be questionable; (2) it often appears to be a more powerful test than the chi-square test for any sample size. Considering the sizes of the samples selected in this study, which are small, the KS test has been preferred.

The “one-sample” Kolmogorov-Smirnov (KS) test is the most used GOF test to decide whether a sample comes from a hypothesized continuous function. It is based on the largest vertical difference between the theoretical and empirical functions. Given two cumulative functions $F_o$ and $F_m$, the Kolmogorov-Smirnov test statistic ($D$) is given by

$$D = \max_{1 \le i \le N}\left|F_{o,i}-F_{m,i}\right|. \tag{4}$$

Values of $D$ less than the critical value $D_{crit}$ are considered acceptable. $D_{crit}$ is reported in tables as a function of the sample size $N$.
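The largest-vertical-difference statistic is simple to compute once both cumulative curves are sampled at the same times. The sketch below is an illustration only: the function names are hypothetical, and the critical value uses the well-known large-sample approximation $1.36/\sqrt{N}$ for a 5% significance level rather than the tabulated values used in the study, so it is only indicative for small samples.

```python
import math
from typing import Sequence


def ks_statistic(f_obs: Sequence[float], f_mod: Sequence[float]) -> float:
    """Largest absolute difference between two sampled cumulative curves."""
    return max(abs(o - m) for o, m in zip(f_obs, f_mod))


def ks_critical_5pct(n: int) -> float:
    """Large-sample approximation of the 5% critical value (assumption)."""
    return 1.36 / math.sqrt(n)


# Hypothetical normalised cumulative curves sampled at the same times.
observed = [0.05, 0.30, 0.70, 0.95, 1.00]
fitted = [0.08, 0.27, 0.74, 0.93, 0.99]
d = ks_statistic(observed, fitted)
accepted = d < ks_critical_5pct(len(observed))
```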

2.3.3. Statistical Evaluation (Error Index: Root Mean Square Error, Percent Bias)

Several error indices are commonly used in model evaluation. These include mean absolute error (MAE), mean square error (MSE), and root mean square error (RMSE). These indices are valuable because they indicate error in the units (or squared units) of the constituent of interest, which aids in analysis of the results. Root mean square error (RMSE) is one of the commonly used error index statistics; it measures the differences between value predicted by a model or an estimator and the values actually observed. RMSE values of 0 indicate a perfect fit.

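Given two cumulative functions $F_o$ and $F_m$, RMSE [L] is given by

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(F_{o,i}-F_{m,i}\right)^{2}}. \tag{5}$$

Percent bias (Pbias) measures the average tendency of the expected data to be larger or smaller than the corresponding observed data. The optimal value of Pbias is 0, and low absolute values indicate a very good performance. Positive values of Pbias indicate an underestimation bias, while negative values indicate an overestimation bias.

Given two cumulative functions $F_o$ and $F_m$, Pbias is given by

$$\mathrm{Pbias} = 100\,\frac{\sum_{i=1}^{N}\left(F_{o,i}-F_{m,i}\right)}{\sum_{i=1}^{N}F_{o,i}}. \tag{6}$$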
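The three indices can be computed together from the observed and modelled curves. The sketch below is a plain-Python illustration with hypothetical function names; it takes the coefficient of determination as the squared Pearson correlation and uses the sign convention in which a positive Pbias means that the model underestimates the observations. Both conventions are assumptions of this sketch.

```python
import math
from typing import Sequence


def r_squared(obs: Sequence[float], mod: Sequence[float]) -> float:
    """Squared Pearson correlation between observed and modelled values."""
    n = len(obs)
    mo, mm = sum(obs) / n, sum(mod) / n
    cov = sum((o - mo) * (m - mm) for o, m in zip(obs, mod))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_m = sum((m - mm) ** 2 for m in mod)
    return cov ** 2 / (var_o * var_m)


def rmse(obs: Sequence[float], mod: Sequence[float]) -> float:
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(obs, mod)) / len(obs))


def pbias(obs: Sequence[float], mod: Sequence[float]) -> float:
    """Percent bias; positive when the model underestimates on average."""
    return 100.0 * sum(o - m for o, m in zip(obs, mod)) / sum(obs)


# Hypothetical example: the model is uniformly 10% too low.
obs = [1.0, 2.0, 3.0, 4.0]
mod = [0.9, 1.8, 2.7, 3.6]
scores = (r_squared(obs, mod), rmse(obs, mod), pbias(obs, mod))
```

In this example the correlation is perfect even though the model is biased, which is why a regression index alone is not sufficient and an error index is reported alongside it.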

The methodology of the study is outlined step by step in Figure 2.

3. Results and Discussion

In this research, three rain gauge stations were selected for the analysis. The stations considered are installed in South Italy. The rain gauges have a resolution of 0.2 mm, and rain data are recorded with a temporal resolution of 1 min. The analysis has been carried out over a period of five consecutive years, from 2008 to 2013. Pluviometric data come from the Functional Meteorological Hydrographic and Mareographic Center database of the Calabria region. This database contains the rainfall records of all meteorological stations installed throughout the region. Unfortunately, stations with per-minute measurements are relatively few, and many of them have interruptions in the measurements, resulting in missing data. In addition, the weather stations chosen for this work have recorded per-minute data only since 2008, so it has not been possible to obtain data for previous years. The most recent records were not available because they are still being processed.

However, five years of pluviometric data are considered sufficient to identify a significant sample of short duration rainfall events, which are of interest in this paper.

The rainfall series have been processed using a Python code. Specifically, subhourly rainfall events were identified by setting a Minimum Interevent Time (MIT) of 15 minutes and discarding events with a volume lower than 8 mm. Under these conditions, a total of 63 subhourly rain events for the three stations were identified and then used in the analysis. The characteristics of the selected events are listed in Table 3.

The two parameters of the proposed function, $\alpha$ and $\beta$, were determined for each event by using a nonlinear optimization procedure. The proposed parametric function has been fitted to the cumulative rainfall distribution of each event. In order to assess the quality of the fit, the error indices and the standard regression index were computed. Results of the curve fitting are listed in Table 4.

Results confirm the very good fit of the proposed function. Low values of RMSE and high values of $R^2$ indicate very good performance. In particular, $R^2$ is always above 0.5 and RMSE always exhibits values near zero. In addition, the Pbias values are considerably low, indicating a good agreement between the fitted function and the observations. Figure 3 gives a graphical representation of the statistics of the computed parameters $\alpha$ and $\beta$ in terms of indices of dispersion and position.

Considering that $\alpha$ is directly related to the first derivative of the cumulative rainfall distribution, which indicates the peakedness of the sample, the large value of the standard deviation of $\alpha$ could be related to the high value of the standard deviation of the kurtosis for the selected events. Also $\beta$, which is a location parameter, exhibits large values of standard deviation. The meaning of $\beta$ has been further investigated. In particular, the analysis has revealed that $\beta$ is directly related to the $t$-coordinate of the centroid of the rain distribution.
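The link between the slope of the cumulative curve and the parameters can be made explicit with a short derivation. It assumes that the double exponential is written in the Gumbel form $F(t) = \exp\{-\exp[-\alpha(t-\beta)]\}$; the symbols $\alpha$ and $\beta$ and this particular parameterisation are assumptions of the sketch.

```latex
% Let u(t) = exp[-alpha (t - beta)], so that F(t) = exp[-u(t)] and du/dt = -alpha u.
\begin{align}
  \frac{dF}{dt} &= \alpha\, u(t)\, F(t), \\
  \frac{d^{2}F}{dt^{2}} &= \alpha^{2}\, u(t)\, F(t)\,\bigl[u(t) - 1\bigr].
\end{align}
% The second derivative vanishes where u(t) = 1, that is, at t = beta:
\begin{equation}
  F(\beta) = e^{-1} \approx 0.368,
  \qquad
  \left.\frac{dF}{dt}\right|_{t=\beta} = \frac{\alpha}{e}.
\end{equation}
```

Under this form, the location parameter marks the time of maximum intensity, at which about 37% of the total depth has fallen, and the maximum slope of the normalised cumulative curve is proportional to the other parameter.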

As shown in Figure 4, the relationship between $\beta$ and the $t$-coordinate of the centroid can be explained by two types of regression: a linear regression and a bisector regression. The linear regression achieves slightly better results than the bisector, which nevertheless gives a sufficiently accurate description of the relation.

In order to assess the suitability of the proposed exponential function, the Kolmogorov-Smirnov one-sample test (KS test) has been used. The KS test is used to test whether a sample comes from a specific distribution. The test has been performed on all 63 selected events by comparing the value of the Kolmogorov-Smirnov test statistic $D$ with the critical value $D_{crit}$ obtained from tables available in the literature, assuming a 5% level of significance. The KS test has been carried out by using the parameters $\alpha$ and $\beta$ already calculated with the NLS optimization described in the previous sections. Results of the KS test are reported in Figure 5.

As can be seen from Figure 5, results of the KS test confirm that the selected rainfall events come from the exponential distribution function proposed in this study. Only for two rain events is $D > D_{crit}$.

In general, the definition of the parameters by using the NLS optimization guarantees good results in terms of fitting. This is confirmed by the statistical criteria described above. However, especially for practical applications, the NLS optimization could be complex and time consuming; in such cases the parameters could be computed by using other statistical procedures, which are easier to implement.

It is possible to observe that function (2) closely resembles the Gumbel distribution function,

$$P(x) = \exp\left\{-\exp\left[-\alpha\,(x-\beta)\right]\right\}, \tag{7}$$

where $P(x)$ represents the non-exceedance probability. In statistics, the parameters $\alpha$ and $\beta$ may be computed by using several methods; the most common are maximum likelihood estimation (MLE) and the method of moments. The MLE could be more accurate in the estimation of the parameters, although the mathematical computation is very complex. On the other hand, the method of moments is fairly simple and allows the parameters to be calculated easily. For these reasons, in this study, the method of moments has been used.

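$\alpha$ and $\beta$ could be calculated by using the method of moments as follows:

$$\alpha = \frac{\pi}{\sigma\sqrt{6}} \approx \frac{1.283}{\sigma}, \qquad \beta = \mu - \frac{0.5772}{\alpha} \approx \mu - 0.45\,\sigma, \tag{8}$$

where $\mu$ and $\sigma$ are, respectively, the mean value and the standard deviation of the distribution.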

Consider the assumptions:

$$P(x) = F(t) = \frac{h(t)}{h_{tot}}, \qquad x = t, \tag{9}$$

where $h(t)$ is the rainfall depth at time $t$, $h_{tot}$ is the total rainfall depth of the event, and $t$ is the random variable $x$; for every event in each station the parameters $\alpha$ and $\beta$ have been calculated by using (8). Figure 6 gives a graphical representation of the statistics of the computed parameters $\alpha$ and $\beta$ in terms of indices of dispersion and position, while results of the curve fitting are listed in Table 5.
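A minimal sketch of the moment-based estimate follows. It treats time as the random variable and the per-minute rainfall depths as weights, so that the mean is the time coordinate of the centroid of the hyetograph, and it assumes the Gumbel form $\exp\{-\exp[-\alpha(t-\beta)]\}$ with the standard Gumbel moment relations. The function name `crf_moments` and the symbols `alpha` and `beta` are assumptions of this sketch.

```python
import math
from typing import Sequence, Tuple

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def crf_moments(t: Sequence[float], depth: Sequence[float]) -> Tuple[float, float]:
    """Estimate (alpha, beta) from depth-weighted moments of time.

    t holds the time of each interval and depth the rainfall depth
    that fell in it (incremental, not cumulative).
    """
    total = sum(depth)
    # Depth-weighted mean time: the centroid of the hyetograph.
    mean = sum(ti * d for ti, d in zip(t, depth)) / total
    var = sum(d * (ti - mean) ** 2 for ti, d in zip(t, depth)) / total
    std = math.sqrt(var)
    alpha = math.pi / (std * math.sqrt(6.0))   # about 1.283 / std
    beta = mean - EULER_GAMMA / alpha          # about mean - 0.45 * std
    return alpha, beta


# Hypothetical 5-minute event with a central peak (depths in mm).
alpha, beta = crf_moments([0, 1, 2, 3, 4], [1.0, 2.0, 4.0, 2.0, 1.0])
```

Because the estimate needs only two weighted sums, it avoids the iterative search of the NLS fit, at the cost of the lower accuracy discussed in the text.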

Values of RMSE and $R^2$ indicate a deterioration in the quality of the fit when $\alpha$ and $\beta$ are computed by using (8). In particular, $R^2$ exhibits a mean value of 0.771, which is still acceptable. The RMSE values are three times the values computed with NLS. However, results are still acceptable for practical applications where a lower degree of precision is sufficient.

In addition, the KS test has been performed. Results of the KS test are shown in Figure 7.

Results of the KS test confirm the nonoptimal performance of the proposed function when the parameters $\alpha$ and $\beta$ are computed by using (8).

4. Conclusion

This paper intends to contribute to the definition of a “design storm” for urban drainage systems. The main idea is based on the definition of a parametric cumulative function validated with an exhaustive evaluation of both the scientific basis and the performance of the model. The proposed CRF has been fitted to the cumulative rainfall distribution of each event. In order to assess the quality of the fit, RMSE, $R^2$, and Pbias were computed. The results highlight very good performance, with low values of RMSE and high values of $R^2$ (Table 4). Finally, considering the similarity of the proposed CRF with the Gumbel function, a practical and expeditious method to assess the parameters of the CRF has been proposed. In this case, the values of RMSE and $R^2$ indicate a deterioration in the quality of the fit; however, results are still acceptable for practical application. Further studies will aim at a better definition of the parameters to allow a direct practical application of the CRF. In particular, the approach will be strengthened with the analysis of more pluviometric stations, in different countries and under different climate conditions. In addition, a more detailed study on the physical meaning of the parameters $\alpha$ and $\beta$ will be conducted.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research study was cofunded by the Italian National Operative Project (PON)-Research and Competitiveness for the Convergence Regions 2007/2013-I Axis “Support to structural changes” operative objective 4.1.1.1 “Scientific-technological generators of transformation processes of the productive system and creation of new sectors” Action II: “Interventions to support industrial research.”