Research Article | Open Access
A Geometric Derivation of the Irwin-Hall Distribution
The Irwin-Hall distribution is the distribution of the sum of a finite number of independent identically distributed uniform random variables on the unit interval. Many applications arise since round-off errors have a transformed Irwin-Hall distribution and the distribution supplies spline approximations to normal distributions. We review some of the distribution’s history. The present derivation is very transparent, since it is geometric and explicitly uses the inclusion-exclusion principle. In certain special cases, the derivation can be extended to linear combinations of independent uniform random variables on other intervals of finite length. The derivation adds to the literature about methodologies for finding distributions of sums of random variables, especially distributions that have domains with boundaries so that the inclusion-exclusion principle might be employed.
1. Introduction
The simple continuous uniform or rectangular distribution Uniform(0, 1), with probability density function (PDF) $f(x) = 1$ for $0 \le x \le 1$ and $f(x) = 0$ otherwise, is very important. Two applications arise in numerical simulation and Bayesian analysis of proportions. If $F$ is the cumulative distribution function (CDF) of the continuous random variable $X$, then the random variable $U = F(X)$ has a Uniform(0, 1) distribution. The random variable $X$ can be simulated by first simulating $U$ and then letting $X = F^{-1}(U)$. This is called the inversion method ([1, page 295], [2, pages 194–196]). The transformation $U = F(X)$ is called the probability integral transformation ([3], [4, pages 203-204]). The uniform distribution is a Bayesian noninformative prior distribution for the distribution of a random variable defined on the unit interval, such as a beta distribution for a proportion ([2, page 33], [5, pages 82–90]). For other applications and generalizations of the uniform distribution, see [6–8].
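The inversion method can be illustrated with a short sketch. The exponential target distribution, the function name `sample_exponential`, and the seed are assumptions chosen only for illustration; they do not come from the article.

```python
import math
import random

def sample_exponential(rate, rng):
    """Draw one Exponential(rate) variate by inversion: X = F^{-1}(U)."""
    u = rng.random()                  # U ~ Uniform(0, 1), value in [0, 1)
    # For F(x) = 1 - exp(-rate * x), the inverse is F^{-1}(u) = -ln(1 - u) / rate.
    return -math.log1p(-u) / rate

rng = random.Random(0)                # fixed seed so the run is reproducible
draws = [sample_exponential(2.0, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)  # should be close to 1 / rate = 0.5
print(sample_mean)
```

Any continuous distribution with a computable inverse CDF can be substituted for the exponential in the same way.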
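The present goal is to derive the CDF and the PDF of the sum $T = \sum_{i=1}^{n} U_i$, where $U_i$ are independent identically distributed Uniform(0, 1) random variables for $i = 1, 2, \ldots, n$. The CDF and PDF are
$$F_T(t) = \frac{1}{n!} \sum_{j=0}^{n} (-1)^j \binom{n}{j} (t-j)^n \, u(t-j), \tag{1}$$
$$f_T(t) = \frac{1}{(n-1)!} \sum_{j=0}^{n} (-1)^j \binom{n}{j} (t-j)^{n-1} \, u(t-j), \tag{2}$$
respectively, where $u$ is the unit step function
$$u(x) = \begin{cases} 0, & x < 0, \\ 1, & x \ge 0. \end{cases} \tag{3}$$
The derivation in Section 2 is geometric and explicitly uses the inclusion-exclusion principle.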
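A minimal numerical sketch of the CDF and PDF follows. It assumes the standard finite alternating sums $\frac{1}{n!}\sum_{j \le t} (-1)^j \binom{n}{j} (t-j)^n$ and $\frac{1}{(n-1)!}\sum_{j \le t} (-1)^j \binom{n}{j} (t-j)^{n-1}$, with the unit step function realized by truncating the sum at $j \le t$. The function names are hypothetical.

```python
from math import comb, factorial

def irwin_hall_cdf(t, n):
    """CDF of the sum of n iid Uniform(0, 1) variables, evaluated at t."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    # Only terms with j <= t survive the unit step function.
    total = sum((-1) ** j * comb(n, j) * (t - j) ** n for j in range(int(t) + 1))
    return total / factorial(n)

def irwin_hall_pdf(t, n):
    """PDF of the same sum; the boundary values at t = 0 and t = n are a convention."""
    if t < 0 or t >= n:
        return 0.0
    total = sum((-1) ** j * comb(n, j) * (t - j) ** (n - 1) for j in range(int(t) + 1))
    return total / factorial(n - 1)

print(irwin_hall_cdf(1.0, 2), irwin_hall_cdf(1.5, 3), irwin_hall_pdf(1.0, 2))
```

Because the sum alternates in sign, floating-point cancellation grows with $n$; exact rational arithmetic is a safer choice for large $n$.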
Derivations of the distribution, which more recently acquired its name Irwin-Hall, go back to Lagrange and Laplace in the latter 18th century and the early 19th century. Lagrange used generating functions to obtain the distribution of $T$ ([9, pages 603–612], [10, page 283]). Those generating functions are a predecessor of characteristic functions [10, page 286]. Laplace often revisited the problem of finding the distribution of $T$ and employed many methods ([9, pages 714-715], [10, pages 286–301]). The distribution is described in [1, pages 296–300], where it is called the Irwin-Hall distribution.
Some derivations employ characteristic functions in a variety of ways, since the characteristic function of a sum of independent random variables is the product of each summand’s characteristic function and the inverse transform is not intractable ([11, pages 188-189], [12–14], [15, pages 362-363], [16, 17]). Others utilize the convolution integral for sums and mathematical induction ([4, page 225], [11, pages 190-191 and 244–246], ). The distribution of the sum of uniform random variables that may have differing domains is found in [18–21]. Sums of dependent uniform random variables are examined in [22, 23].
Direct integration techniques can be used to obtain the distribution of a linear combination of Uniform(0, 1) random variables ([15, pages 358–360], [24, 25]). Similar techniques are used in [26] for uniform distributions whose domains are intervals with zero as their left endpoints. The distribution of the mean is obtained when all the constants are $1/n$. In this case, the distribution is called the Bates distribution ([1, page 297], [27]), which can also be found by a simple transformation of the Irwin-Hall distribution ([15, page 359], [25, page 241]). Using moment generating functions instead of characteristic functions, Gray and Odell [28] found the distribution of any linear combination of uniform random variables with different domains allowed. In Section 3, the present method or style of proof is extended to those cases, giving the same distributions.
Because $T$ is a sum, the Irwin-Hall distribution approximates a normal distribution with a spline, since the Irwin-Hall distribution in (2) is composed of polynomials. The support of $T$ is the interval $[0, n]$; the mean, mode, and median of $T$ are $n/2$; and its variance is $n/12$. By symmetry, all odd central moments are zero, including skewness. The kurtosis is $3 - 6/(5n)$ [1, page 300]. This is the measure of kurtosis that is 3 for a normal distribution, so Irwin-Hall distributions are platykurtic, and the kurtosis is close to 3 for large $n$. According to the Central Limit Theorem,
$$\frac{T - n/2}{\sqrt{n/12}} \longrightarrow N(0, 1) \quad \text{in distribution as } n \to \infty$$
([4, pages 280–283], [11, pages 213–218 and 245], [29, pages 220–222]). Figure 1 contains a normal distribution with mean $n/2$ and variance $n/12$ and its approximating Irwin-Hall distribution. The approximation is very good even for the small value of $n$ used in the figure. A uniform error bound for approximating the normal(0, 1) CDF is given in [32, page 51]. Approximations with spline fitting can be useful with or without complete information about the distributional shape [33, 34].
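The quality of the normal approximation can be explored numerically. The sketch below compares the Irwin-Hall CDF with the normal CDF having mean $n/2$ and variance $n/12$ on an evenly spaced grid over $[0, n]$. The grid size and function names are assumptions, and the reported gap is only a grid maximum, not the uniform error bound cited in the text.

```python
from math import comb, erf, factorial, sqrt

def irwin_hall_cdf(t, n):
    """CDF of the sum of n iid Uniform(0, 1) variables (finite alternating sum)."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    total = sum((-1) ** j * comb(n, j) * (t - j) ** n for j in range(int(t) + 1))
    return total / factorial(n)

def normal_cdf(x, mean, var):
    """CDF of a normal distribution with the given mean and variance."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * var)))

def max_abs_gap(n, grid_points=2001):
    """Largest absolute CDF difference on an even grid over the support [0, n]."""
    grid = [n * i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(irwin_hall_cdf(t, n) - normal_cdf(t, n / 2, n / 12)) for t in grid)

for n in (2, 4, 8, 12):
    print(n, max_abs_gap(n))
```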
Since round-off errors for random variables that are rounded to the nearest integer are distributed Uniform(−1/2, 1/2), the sum of $n$ round-off errors has the linearly transformed Irwin-Hall distribution of $T - n/2$. For large $n$, the sum of round-off errors is easily described with a normal distribution [29, page 222]. For small $n$, the Irwin-Hall distribution is also appropriate and not too complicated.
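A small simulation can make the round-off statement concrete. It assumes that the values being rounded are spread widely enough that each rounding error is Uniform(−1/2, 1/2), so that the sum of $n$ errors is distributed as the Irwin-Hall sum shifted by $n/2$. The data range, sample size, seed, and names are illustrative choices.

```python
import random
from math import comb, factorial

def irwin_hall_cdf(t, n):
    """CDF of the sum of n iid Uniform(0, 1) variables (finite alternating sum)."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    total = sum((-1) ** j * comb(n, j) * (t - j) ** n for j in range(int(t) + 1))
    return total / factorial(n)

rng = random.Random(1)
n, trials = 4, 50_000

def roundoff_sum():
    """Sum of n errors x - round(x) for values x drawn uniformly from [0, 100]."""
    total = 0.0
    for _ in range(n):
        x = rng.uniform(0.0, 100.0)
        total += x - round(x)
    return total

sums = [roundoff_sum() for _ in range(trials)]
s = 0.5
empirical = sum(v <= s for v in sums) / trials   # simulated P(error sum <= s)
exact = irwin_hall_cdf(s + n / 2, n)             # shift by n/2 to the Irwin-Hall scale
print(empirical, exact)
```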
Lee et al. [35] use the Irwin-Hall distribution to examine the efficacy of goodness-of-fit tests. Heinrich et al. [36] adapt the Irwin-Hall distribution in consideration of the accumulated accuracy of round-off errors. Inequalities for linear combinations of independent random variables whose domains have an upper bound are given in [37].
2. Derivation of the Irwin-Hall Distribution
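Proof. For $t \ge 0$ and $k = 1, 2, \ldots, n$, let
$$S_t = \{(x_1, \ldots, x_n) : x_i \ge 0 \text{ for all } i,\ x_1 + \cdots + x_n \le t\}, \tag{4}$$
$$A_k = \{(x_1, \ldots, x_n) : x_i \ge 0 \text{ for all } i,\ x_k \le 1\}, \tag{5}$$
$$C = \bigcap_{k=1}^{n} A_k, \tag{6}$$
which is the $n$-dimensional unit cube. The set complement of $A_k$ with respect to the nonnegative orthant $[0, \infty)^n$ is denoted by $A_k^c$.
The hypervolume of the $n$-dimensional solid $S_t$ has value
$$\operatorname{vol}(S_t) = \frac{t^n}{n!}, \tag{7}$$
since the solid is a standard orthogonal simplex from the corner of an $n$-cube [38]. Similarly, if $j \le t$ and $k_1 < \cdots < k_j$, then the hypervolume of $S_t \cap A_{k_1}^c \cap \cdots \cap A_{k_j}^c$ is
$$\operatorname{vol}\left(S_t \cap A_{k_1}^c \cap \cdots \cap A_{k_j}^c\right) = \frac{(t-j)^n}{n!}, \tag{8}$$
because subtracting 1 from each of the coordinates $x_{k_1}, \ldots, x_{k_j}$ translates this solid onto $S_{t-j}$, up to a boundary of zero volume. For $j > t$,
$$S_t \cap A_{k_1}^c \cap \cdots \cap A_{k_j}^c = \emptyset, \tag{9}$$
since the sum of nonnegative coordinates exceeds the number of coordinates which are greater than 1.
By the inclusion-exclusion principle, for $t \ge 0$,
$$F_T(t) = \operatorname{vol}(S_t \cap C) = \sum_{j=0}^{n} (-1)^j \sum_{k_1 < \cdots < k_j} \operatorname{vol}\left(S_t \cap A_{k_1}^c \cap \cdots \cap A_{k_j}^c\right) = \frac{1}{n!} \sum_{j=0}^{\min(n, \lfloor t \rfloor)} (-1)^j \binom{n}{j} (t-j)^n. \tag{10}$$
In (1), the value at $t = n$, namely $\frac{1}{n!} \sum_{j=0}^{n} (-1)^j \binom{n}{j} (n-j)^n$, is the Stirling number of the second kind with both parameters equal to $n$ and has numerical value 1 [39, pages 38-39]. If $t \ge n$, then $C \subseteq S_t$, so $F_T(t) = 1$ in this case. Since $\frac{1}{n!} \sum_{j=0}^{n} (-1)^j \binom{n}{j} (t-j)^n$ is a polynomial in $t$ that equals 1 for every $t \ge n$, it equals 1 for all real-valued $t$. Introducing the unit step function gives (1), and differentiation with respect to $t$ gives (2).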
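The closing polynomial identity can be checked in exact arithmetic. The sketch below assumes the identity in question is that the full alternating sum $\frac{1}{n!}\sum_{j=0}^{n} (-1)^j \binom{n}{j} (t-j)^n$ equals 1 for every real $t$, which follows because it is the $n$-th finite difference of $t^n$ divided by $n!$. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def alternating_sum(t, n):
    """(1/n!) * sum_{j=0}^{n} (-1)^j C(n, j) (t - j)^n in exact rational arithmetic."""
    t = Fraction(t)
    total = sum((-1) ** j * comb(n, j) * (t - j) ** n for j in range(n + 1))
    return total / factorial(n)

# Spot-check the identity at negative, fractional, and large arguments.
for n in range(1, 8):
    for t in (Fraction(-3, 2), 0, Fraction(1, 3), n, n + Fraction(5, 2)):
        assert alternating_sum(t, n) == 1
print("identity holds at all sampled points")
```

Taking $t = n$ recovers the Stirling-number value mentioned in the proof.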
3. Discussion and a Generalization
Figures 2 and 3 reveal the structure of the CDF
$$F_T(t) = \frac{1}{2!} \left[ t^2 \, u(t) - 2 (t-1)^2 \, u(t-1) + (t-2)^2 \, u(t-2) \right] \tag{11}$$
for $n = 2$, where $u$ is the unit step function. Figure 2 demonstrates how the hyperplane (line) $x_1 + x_2 = t$, which is the line of a constant sum of the values of the random variables and is perpendicular to the $n$-cube's (square's) main diagonal, accrues volume (area) below it. Figure 3 illustrates the regions that are included and excluded for various positions of the hyperplane (line) and how vertices are met in sets. For $n = 2$, the binomial coefficients, which provide the counts of the vertices, are 1 for $(0, 0)$, 2 for $(1, 0)$ and $(0, 1)$, and 1 for $(1, 1)$, as seen in Figures 2 and 3. In (11), the first term is the area of the large triangle in Figures 3(a), 3(b), and 3(c); the second term is the sum of the areas of the two hatched triangles in Figure 3(b), where exactly one of $x_1, x_2$ is greater than 1, and in Figure 3(c); and the third term is the area of the crosshatched triangle in Figure 3(c), where both $x_1$ and $x_2$ are greater than 1.
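As a worked check for $n = 2$, evaluating the step functions on each unit interval gives a piecewise form. The block writes $F_T$ for the CDF of the sum $T$ and assumes the three-term alternating sum with areas $t^2/2$, $(t-1)^2/2$, and $(t-2)^2/2$ described above.

$$F_T(t) = \begin{cases} 0, & t < 0, \\ \dfrac{t^2}{2}, & 0 \le t < 1, \\ \dfrac{t^2 - 2(t-1)^2}{2} = 1 - \dfrac{(2-t)^2}{2}, & 1 \le t < 2, \\ \dfrac{t^2 - 2(t-1)^2 + (t-2)^2}{2} = 1, & t \ge 2. \end{cases}$$

The middle branch is the large triangle minus the two triangles that protrude beyond the square, and the last branch adds back the doubly subtracted triangle.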
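Figure 4 shows the same geometric interpretation for $n = 3$. In its CDF
$$F_T(t) = \frac{1}{3!} \left[ t^3 \, u(t) - 3 (t-1)^3 \, u(t-1) + 3 (t-2)^3 \, u(t-2) - (t-3)^3 \, u(t-3) \right], \tag{12}$$
the first term is the volume using (7) of the large orthogonal simplex in Figures 4(a), 4(b), and 4(c) with edges of length $t$. The second term is the sum of the volumes using (8) of the three orthogonal simplexes, where exactly one of $x_1, x_2, x_3$ is greater than 1. In Figure 4(b), the vertices of one such simplex are labeled. Taking $x_1 > 1$ for definiteness, the three vertices on the plane $x_1 + x_2 + x_3 = t$ have coordinates $(t, 0, 0)$, $(1, t-1, 0)$, and $(1, 0, t-1)$, and the three mutually orthogonal edges joining them to the cube vertex $(1, 0, 0)$ have length $t - 1$. The third term of (12) is the sum of the three volumes using (8), where exactly two of $x_1, x_2, x_3$ are greater than 1. In Figure 4(c), the vertices are labeled in a region where two coordinates are greater than 1. Taking both $x_1$ and $x_2$ greater than 1 for definiteness, the vertices on the plane have coordinates $(t-1, 1, 0)$, $(1, t-1, 0)$, and $(1, 1, t-2)$, and the edges joining them to the cube vertex $(1, 1, 0)$ have length $t - 2$. The fourth term is the volume of the region that is shared by all the other regions, analogous to the crosshatched region in Figure 3(c).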
In the same way, for any $n$, the terms are the $n$-volumes of orthogonal $n$-simplexes, whose multiplicity is counted by binomial coefficients determined by the number of vertices of the $n$-cube in sets as the "moving" $(n-1)$-dimensional hyperplane $x_1 + \cdots + x_n = t$ "passes" them as $t$ increases. The hyperplane is perpendicular to the diagonal line $x_1 = x_2 = \cdots = x_n$. The volumes of the simplexes are computed using (7) and (8).
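The method of proof in Section 2 can be extended to linear combinations of uniform random variables on different intervals. Suppose that $X_1, \ldots, X_n$ are independent, that $X_i$ is uniformly distributed on the interval $(a_i, b_i)$, and that $c_1, \ldots, c_n$ are real constants. Also,
$$Y = \sum_{i=1}^{n} c_i X_i = \sum_{i=1}^{n} c_i a_i + \sum_{i=1}^{n} c_i (b_i - a_i) U_i,$$
where
$$U_i = \frac{X_i - a_i}{b_i - a_i}.$$
Then, $U_1, \ldots, U_n$ are independent uniform random variables on $(0, 1)$, and $F_Y(y) = P(Y \le y)$ can be interpreted as the hypervolume of the solid that consists of all points $(x_1, \ldots, x_n)$ that lie inside the unit hypercube and on one side of the hyperplane $\sum_{i=1}^{n} c_i (b_i - a_i) x_i = y - \sum_{i=1}^{n} c_i a_i$. Now, proceed by inclusion-exclusion as in Section 2. In general, the formula for $F_Y$ is complicated because of the lack of symmetry that is caused by the presence of the coefficients $c_i (b_i - a_i)$. This increases the number of cases and removes the congruence of the solids of each size whose hypervolumes need to be added or subtracted at each stage of the inclusion-exclusion process. Nevertheless, the correct distribution is obtained in this manner. A special case in which these problems disappear is $c_i (b_i - a_i) = c > 0$ for $i = 1, 2, \ldots, n$, so that
$$F_Y(y) = F_T\!\left( \frac{y - \sum_{i=1}^{n} c_i a_i}{c} \right),$$
where $F_T$ is given in (1).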
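The symmetric special case can be sketched in code under one explicit assumption: every product $c_i (b_i - a_i)$ equals the same positive constant $c$, so that the linear combination is $\sum c_i a_i + cT$ with $T$ an Irwin-Hall sum. The function names, the example intervals, and the Monte Carlo check are illustrative and are not taken from the article.

```python
import random
from math import comb, factorial

def irwin_hall_cdf(t, n):
    """CDF of the sum of n iid Uniform(0, 1) variables (finite alternating sum)."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    total = sum((-1) ** j * comb(n, j) * (t - j) ** n for j in range(int(t) + 1))
    return total / factorial(n)

def equal_scale_cdf(y, a, b, c):
    """CDF of Y = sum c_i X_i, X_i ~ Uniform(a_i, b_i), when c_i (b_i - a_i) is constant."""
    scales = [ci * (bi - ai) for ai, bi, ci in zip(a, b, c)]
    scale = scales[0]
    if scale <= 0 or any(abs(s - scale) > 1e-12 for s in scales):
        raise ValueError("requires c_i * (b_i - a_i) to be one positive constant")
    shift = sum(ci * ai for ai, ci in zip(a, c))
    return irwin_hall_cdf((y - shift) / scale, len(a))

# Hypothetical example: widths 2, 1, 4 with weights 1, 2, 0.5 give the common scale 2.
a, b, c = [0.0, 1.0, -2.0], [2.0, 2.0, 2.0], [1.0, 2.0, 0.5]
rng = random.Random(2)
trials, y = 50_000, 3.0
hits = sum(
    sum(ci * rng.uniform(ai, bi) for ai, bi, ci in zip(a, b, c)) <= y
    for _ in range(trials)
)
empirical = hits / trials
exact = equal_scale_cdf(y, a, b, c)
print(empirical, exact)
```

When the products differ, the simplexes are no longer congruent and this shortcut does not apply; the function raises an error instead of returning a wrong value.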
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.
References
- N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, vol. 2, John Wiley & Sons, New York, NY, USA, 2nd edition, 1995.
- K. R. Koch, Introduction to Bayesian Statistics, Springer, Berlin, Germany, 2nd edition, 2010.
- C. P. Quesenberry, “Probability integral transformations,” in Encyclopedia of Statistical Sciences, S. Kotz, N. L. Johnson, and C. B. Read, Eds., vol. 7, pp. 225–231, Wiley, New York, NY, USA, 1986.
- V. K. Rohatgi, An Introduction to Probability Theory and Mathematical Statistics, John Wiley & Sons, New York, NY, USA, 1976.
- J. O. Berger, Statistical Decision Theory and Bayesian Analysis, Springer Series in Statistics, Springer, New York, NY, USA, 2nd edition, 1985.
- P. C. Silva, J. O. Cerdeira, M. J. Martins, and T. Monteiro-Henriques, “Data depth for the uniform distribution,” Environmental and Ecological Statistics, vol. 21, no. 1, pp. 27–39, 2014.
- K. Jayakumar and K. K. Sankaran, “On a generalization of uniform distribution and its properties,” Statistica, vol. 76, no. 1, pp. 83–91, 2016.
- C. P. Dettmann and M. K. Roychowdhury, “Quantization for uniform distributions on equilateral triangles,” Real Analysis Exchange, vol. 42, no. 1, pp. 149–166, 2017.
- E. S. Pearson, The History of Statistics in the 17th and 18th Centuries, Against the Changing Background of Intellectual, Scientific and Religious Thought, Lectures by Karl Pearson given at University College London during the Academic Sessions, Macmillan, New York, NY, USA, 1978.
- O. B. Sheynin, “Finite random sums (a historical essay),” Archive for History of Exact Sciences, vol. 9, no. 4-5, pp. 275–305, 1972/1973.
- H. Cramer, Mathematical Methods of Statistics, Princeton, Princeton, NJ, USA, 1946.
- S. K. Mitra and S. N. Banerjee, “On the probability distribution of round-off errors propagated in tabular differences,” The Australian Computer Society. The Australian Computer Journal, vol. 3, no. 2, pp. 60–68, 1971.
- J. O. Irwin, “On the frequency distribution of the means of samples from a population having any law of frequency with finite moments, with special reference to Pearson's Type II,” Biometrika, vol. 19, no. 3/4, pp. 225–239, 1927.
- A. N. Lowan and J. Laderman, “On the distribution of errors in Nth tabular differences,” The Annals of Mathematical Statistics, vol. 10, no. 4, pp. 360–364, 1939.
- A. Stuart and J. K. Ord, Kendall’s Advanced Theory of Statistics, vol. 1, Oxford, New York, NY, USA, 5th edition, 1987.
- V. M. Kruglov, “On one identity for distribution of sums of independent random variables,” Theory of Probability and its Applications, vol. 58, no. 2, pp. 329–331, 2014.
- H. Potuschak and W. G. Müller, “More on the distribution of the sum of uniform random variables,” Statistical Papers, vol. 50, no. 1, pp. 177–183, 2009.
- E. G. Olds, “A note on the convolution of uniform distributions,” Annals of Mathematical Statistics, vol. 23, no. 2, pp. 282–285, 1952.
- S. K. Mitra, “On the probability distribution of the sum of uniformly distributed random variables,” SIAM Journal on Applied Mathematics, vol. 20, no. 2, pp. 195–198, 1971.
- D. M. Bradley and R. C. Gupta, “On the distribution of the sum of n non-identically distributed uniform random variables,” Annals of the Institute of Statistical Mathematics, vol. 54, no. 3, pp. 689–700, 2002.
- S. M. Sadooghi-Alvandi, A. R. Nematollahi, and R. Habibi, “On the distribution of the sum of independent uniform random variables,” Statistical Papers, vol. 50, no. 1, pp. 171–175, 2009.
- H. Murakami, “A saddlepoint approximation to the distribution of the sum of independent non-identically uniform random variables,” Statistica Neerlandica. Journal of the Netherlands Society for Statistics and Operations Research, vol. 68, no. 4, pp. 267–275, 2014.
- G. S. Lo, H. Sangare, and C. Ndiaye, “A review on asymptotic normality of sums of associated random variables,” Afrika Statistika, vol. 11, no. 1, pp. 855–867, 2016.
- D. L. Barrow and P. W. Smith, “Classroom Notes: spline notation applied to a volume problem,” American Mathematical Monthly, vol. 86, no. 1, pp. 50-51, 1979.
- P. Hall, “The distribution of means for samples of size N drawn from a population in which the variate takes values between 0 and 1, all such values being equally probable,” Biometrika, vol. 19, no. 3/4, pp. 240–245, 1927.
- S. A. Roach, “The frequency distribution of the sample mean where each member of the sample is drawn from a different rectangular distribution,” Biometrika, vol. 50, pp. 508–513, 1963.
- G. E. Bates, “Joint distributions of time intervals for the occurrence of successive accidents in a generalized Polya scheme,” Annals of Mathematical Statistics, vol. 26, no. 4, pp. 705–720, 1955.
- H. L. Gray and P. L. Odell, “On sums and products of rectangular variates,” Biometrika, vol. 53, no. 3-4, pp. 615–617, 1966.
- R. V. Hogg, J. W. McKean, and A. T. Craig, Introduction to Mathematical Statistics, Pearson-Prentice Hall, Upper Saddle River, NJ, USA, 6th edition, 2005.
- J. P. Hoyt, “The teacher's corner: a simple approximation to the standard normal probability density function,” American Statistician, vol. 22, no. 2, pp. 25-26, 1968.
- G. Allasia, “Approximation of the normal distribution function by means of a spline function,” Statistica, vol. 41, no. 2, pp. 325–332, 1981.
- J. K. Patel and C. B. Read, Handbook of the Normal Distribution, Statistics: Textbooks and Monographs, Marcel Dekker, New York, NY, USA, 2nd edition, 1982.
- M. S. Muminov and K. Soatov, “A note on spline estimator of unknown probability density function,” Open Journal of Statistics, vol. 1, no. 3, pp. 157–160, 2011.
- M. S. Muminov and K. S. Soatov, “On the approximation of maximum deviation spline estimation of the probability density Gaussian process,” Open Journal of Statistics, vol. 5, no. 4, pp. 334–339, 2015.
- C. Lee, S. Kim, and J. Jeong, “A view on the validity of central limit theorem: an empirical study using random samples from uniform distribution,” Communications for Statistical Applications and Methods, vol. 21, no. 6, pp. 539–559, 2014.
- L. Heinrich, F. Pukelsheim, and V. Wachtel, “The variance of the discrepancy distribution of rounding procedures, and sums of uniform random variables,” Metrika. International Journal for Theoretical and Applied Statistics, vol. 80, no. 3, pp. 363–375, 2017.
- E. Rio, “Exponential inequalities for weighted sums of bounded random variables,” Electronic Communications in Probability, vol. 20, no. 77, pp. 1–10, 2015.
- P. Stein, “Classroom Notes: a note on the volume of a simplex,” American Mathematical Monthly, vol. 73, no. 3, pp. 299–301, 1966.
- C. L. Liu, Introduction to Combinatorial Mathematics, McGraw-Hill, New York, NY, USA, 1968.
- 2017, http://www.math.uah.edu/stat/special/IrwinHall.html.
- 2017, http://www.distributome.org/V3/calc/IrwinHallCalculator.html.
- 2017, http://randomservices.org/distributions/IrwinHall/Calculator.html.
Copyright © 2017 James E. Marengo et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.