Abstract

The Irwin-Hall distribution is the distribution of the sum of a finite number of independent identically distributed uniform random variables on the unit interval. Many applications arise since round-off errors have a transformed Irwin-Hall distribution and the distribution supplies spline approximations to normal distributions. We review some of the distribution’s history. The present derivation is very transparent, since it is geometric and explicitly uses the inclusion-exclusion principle. In certain special cases, the derivation can be extended to linear combinations of independent uniform random variables on other intervals of finite length. The derivation adds to the literature about methodologies for finding distributions of sums of random variables, especially distributions that have domains with boundaries so that the inclusion-exclusion principle might be employed.

1. Introduction

The simple continuous uniform or rectangular distribution Uniform(0, 1), with probability density function (PDF) $f(x) = 1$ for $0 \le x \le 1$ and $f(x) = 0$ otherwise, is very important. Two applications arise in numerical simulation and Bayesian analysis of proportions. If $F$ is the cumulative distribution function (CDF) of the continuous random variable $X$, then the random variable $U = F(X)$ has a Uniform(0, 1) distribution. The random variable $X$ can be simulated by first simulating $U$ and then letting $X = F^{-1}(U)$. This is called the inversion method ([1, page 295], [2, pages 194–196]). The transformation $U = F(X)$ is called the probability integral transformation ([3], [4, pages 203-204]). The uniform distribution is a Bayesian noninformative prior distribution for the distribution of a random variable defined on the unit interval, such as a beta distribution for a proportion ([2, page 33], [5, pages 82–90]). For other applications and generalizations of the uniform distribution, see [6–8].
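The inversion method can be sketched in a few lines. The example below is only an illustration: the exponential distribution, the function names, and the seed are assumptions chosen because the inverse CDF has a closed form, and none of them comes from the references cited above.

```python
import math
import random


def sample_by_inversion(inverse_cdf, rng):
    """Draw one value of X by simulating U ~ Uniform(0, 1) and returning F^{-1}(U)."""
    u = rng.random()  # U lies in [0, 1)
    return inverse_cdf(u)


def exponential_inverse_cdf(u, rate=1.0):
    """Inverse of F(x) = 1 - exp(-rate * x); an illustrative choice of F."""
    return -math.log(1.0 - u) / rate


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_by_inversion(exponential_inverse_cdf, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)  # should be close to 1 / rate = 1
```

Any continuous distribution with a computable inverse CDF can be substituted for the exponential example.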

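The present goal is to derive the CDF and the PDF of the sum $T = \sum_{i=1}^{n} X_i$, where $X_i$ are independent identically distributed Uniform(0, 1) random variables for $i = 1, 2, \ldots, n$. The CDF and PDF are

$$F_T(t) = \frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (t-k)^n \, u(t-k), \tag{1}$$

$$f_T(t) = \frac{1}{(n-1)!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (t-k)^{n-1} \, u(t-k), \tag{2}$$

respectively, where $u$ is the unit step function

$$u(t-k) = \begin{cases} 0, & t < k, \\ 1, & t \ge k. \end{cases} \tag{3}$$

The derivation in Section 2 is geometric and explicitly uses the inclusion-exclusion principle.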
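The CDF and PDF of the Irwin-Hall distribution are finite alternating sums, so they can be evaluated directly. The following is a minimal sketch with hypothetical function names; it assumes the standard form in which the $k$th term is $(-1)^k \binom{n}{k} (t-k)^n / n!$ for the CDF, or $(-1)^k \binom{n}{k} (t-k)^{n-1} / (n-1)!$ for the PDF, and is included only when $t \ge k$.

```python
from math import comb, factorial, floor


def irwin_hall_cdf(t, n):
    """P(T <= t) for T the sum of n independent Uniform(0, 1) variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    # The unit step function keeps only the terms with k <= t.
    total = sum((-1) ** k * comb(n, k) * (t - k) ** n for k in range(floor(t) + 1))
    return total / factorial(n)


def irwin_hall_pdf(t, n):
    """Density of T at t; zero outside the support [0, n]."""
    if t < 0 or t > n:
        return 0.0
    total = sum(
        (-1) ** k * comb(n, k) * (t - k) ** (n - 1) for k in range(floor(t) + 1)
    )
    return total / factorial(n - 1)
```

For example, `irwin_hall_cdf(1.5, 3)` evaluates $(3.375 - 3 \cdot 0.125)/6 = 0.5$, as symmetry about $n/2$ requires. The alternating sum loses floating-point accuracy for large $n$, so exact rational arithmetic may be preferable there.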

Derivations of the distribution, which only more recently acquired the name Irwin-Hall, go back to Lagrange and Laplace in the late 18th century and the early 19th century. Lagrange used generating functions to obtain the distribution of $T$ ([9, pages 603–612], [10, page 283]). Those generating functions are a predecessor of characteristic functions [10, page 286]. Laplace often revisited the problem of finding the distribution of $T$ and employed many methods ([9, pages 714-715], [10, pages 286–301]). The distribution is described in [1, pages 296–300], where it is called the Irwin-Hall distribution.

Some derivations employ characteristic functions in a variety of ways, since the characteristic function of a sum of independent random variables is the product of the summands' characteristic functions and the inverse transform is tractable ([11, pages 188-189], [12–14], [15, pages 362-363], [16, 17]). Others utilize the convolution integral for sums and mathematical induction ([4, page 225], [11, pages 190-191 and 244–246], [18]). The distribution of the sum of uniform random variables that may have differing domains is found in [18–21]. Sums of dependent uniform random variables are examined in [22, 23].

Direct integration techniques can be used to obtain the distribution of a linear combination of Uniform(0, 1) random variables ([15, pages 358–360], [24, 25]). Similar techniques are used in [26] for uniform distributions whose domains are intervals with zero as their left endpoints. The distribution of the mean is obtained when all the constants are $1/n$. In this case, the distribution is called the Bates distribution ([1, page 297], [27]), which can also be found by a simple transformation of the Irwin-Hall distribution ([15, page 359], [25, page 241]). Using moment generating functions instead of characteristic functions, Gray and Odell [28] found the distribution of any linear combination of uniform random variables, allowing different domains. In Section 3, the present method of proof is extended to those cases and gives the same distributions.

Because $T$ is a sum of independent identically distributed random variables, the Irwin-Hall distribution approximates a normal distribution, and it does so with a spline, since the PDF in (2) is composed of polynomial pieces. The support of $T$ is the interval $[0, n]$; the mean, mode, and median of $T$ are $n/2$; and its variance is $n/12$. By symmetry, all odd central moments are zero, so the skewness is zero. The kurtosis is $3 - 6/(5n)$ [1, page 300]. This is the measure of kurtosis that is 3 for a normal distribution, so Irwin-Hall distributions are platykurtic, and the kurtosis is close to 3 for large $n$. According to the Central Limit Theorem,

$$\lim_{n \to \infty} P\!\left( \frac{T - n/2}{\sqrt{n/12}} \le z \right) = \Phi(z) \tag{4}$$

for all real $z$, where $\Phi$ is the normal(0, 1) CDF ([4, pages 280–283], [11, pages 213–218 and 245], [29, pages 220–222]). Figure 1 contains a normal distribution and its approximating Irwin-Hall distribution with the same mean and variance. The approximation is very good even for the small value of $n$ used there [30]. A uniform error bound for the normal(0, 1) CDF is given in ([31], [32, page 51]). Approximations with spline fitting can be useful with or without complete information about the distributional shape [33, 34].
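The quality of the normal approximation can be explored numerically. The sketch below uses hypothetical helper names and an assumed grid size. It measures the largest gap between the Irwin-Hall CDF and the CDF of the normal distribution with mean $n/2$ and variance $n/12$; it does not reproduce the error bound from the cited references.

```python
from math import comb, erf, factorial, floor, sqrt


def irwin_hall_cdf(t, n):
    """P(T <= t) for T the sum of n independent Uniform(0, 1) variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    total = sum((-1) ** k * comb(n, k) * (t - k) ** n for k in range(floor(t) + 1))
    return total / factorial(n)


def normal_cdf(z):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def max_cdf_gap(n, grid_points=2001):
    """Largest absolute CDF difference over an evenly spaced grid on [0, n]."""
    mean, sd = n / 2.0, sqrt(n / 12.0)
    gap = 0.0
    for j in range(grid_points):
        t = n * j / (grid_points - 1)
        gap = max(gap, abs(irwin_hall_cdf(t, n) - normal_cdf((t - mean) / sd)))
    return gap


gaps = {n: max_cdf_gap(n) for n in (2, 4, 8)}  # gap shrinks as n grows
```

The grid is restricted to the support $[0, n]$, because outside it the Irwin-Hall CDF is exactly 0 or 1 and the difference is only the normal tail.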

Since round-off errors for random variables that are rounded to the nearest integer are distributed Uniform(−1/2, 1/2), the sum of $n$ round-off errors has a linearly transformed Irwin-Hall distribution [12]. For large $n$, the sum of round-off errors is easily described with a normal distribution [29, page 222]. For small $n$, the Irwin-Hall distribution is also appropriate and not too complicated.
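For a small number of rounded values, the shift by $n/2$ can be applied directly. The sketch below uses hypothetical names and assumes that the values being rounded are spread uniformly over a range of integer length. It computes the probability that the total round-off error of $n = 3$ values is at most 1/2 in absolute value and compares it with a simulation.

```python
import random
from math import comb, factorial, floor


def irwin_hall_cdf(t, n):
    """P(T <= t) for T the sum of n independent Uniform(0, 1) variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    total = sum((-1) ** k * comb(n, k) * (t - k) ** n for k in range(floor(t) + 1))
    return total / factorial(n)


def roundoff_sum_cdf(s, n):
    """P(S <= s) for S the sum of n Uniform(-1/2, 1/2) errors, so S + n/2 = T."""
    return irwin_hall_cdf(s + n / 2.0, n)


n, bound = 3, 0.5
exact = roundoff_sum_cdf(bound, n) - roundoff_sum_cdf(-bound, n)  # equals 2/3

rng = random.Random(1)  # fixed seed so the run is repeatable
trials = 100_000
hits = 0
for _ in range(trials):
    values = [rng.uniform(0.0, 100.0) for _ in range(n)]
    # Error of rounding each value to the nearest integer.
    total_error = sum(floor(v + 0.5) - v for v in values)
    if abs(total_error) <= bound:
        hits += 1
estimate = hits / trials
```

Here the exact value is $F_T(2) - F_T(1) = 5/6 - 1/6 = 2/3$ for $n = 3$.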

Lee et al. [35] use the Irwin-Hall distribution to examine the efficacy of goodness-of-fit tests. Heinrich et al. [36] adapt the Irwin-Hall distribution in consideration of the accumulated accuracy of round-off errors. Inequalities for linear combinations of independent random variables whose domains have an upper bound are given in [37].

2. Derivation of the Irwin-Hall Distribution

Theorem 1. Let $X_i$ for $i = 1, 2, \ldots, n$ be independent random variables, each having the continuous uniform distribution on the unit interval, and let $T = \sum_{i=1}^{n} X_i$. Then, the CDF and PDF of $T$ are given by (1) and (2), respectively.

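Proof. For $t \ge 0$ and $i = 1, 2, \ldots, n$, let

$$S = \{(x_1, \ldots, x_n) : x_j \ge 0 \text{ for all } j,\ x_1 + \cdots + x_n \le t\}, \qquad A_i = \{(x_1, \ldots, x_n) \in S : x_i > 1\}, \qquad Q = \{(x_1, \ldots, x_n) : 0 \le x_j \le 1 \text{ for all } j\}, \tag{6}$$

where $Q$ is the $n$-dimensional unit cube. The set complement of $A_i$ with respect to $S$ is denoted by $A_i^c$.
The hypervolume of the $n$-dimensional solid $S$ has value

$$\operatorname{Vol}(S) = \frac{t^n}{n!} \tag{7}$$

[38], since the solid is a standard orthogonal simplex from the corner of an $n$-cube. Similarly, if $k \le t$, then the hypervolume of $A_{i_1} \cap \cdots \cap A_{i_k}$ for any $k$ distinct indices is

$$\operatorname{Vol}(A_{i_1} \cap \cdots \cap A_{i_k}) = \frac{(t-k)^n}{n!}, \tag{8}$$

because subtracting 1 from each of the $k$ coordinates that exceed 1 maps the intersection onto the orthogonal simplex with $t$ replaced by $t - k$. For $k > t$,

$$A_{i_1} \cap \cdots \cap A_{i_k} = \emptyset, \tag{9}$$

since the sum of nonnegative coordinates exceeds the number of coordinates which are greater than 1.
By the inclusion-exclusion principle,

$$F_T(t) = \operatorname{Vol}(S \cap Q) = \operatorname{Vol}\!\left( \bigcap_{i=1}^{n} A_i^c \right) = \operatorname{Vol}(S) - \operatorname{Vol}\!\left( \bigcup_{i=1}^{n} A_i \right) = \frac{1}{n!} \sum_{k=0}^{\min(n, \lfloor t \rfloor)} (-1)^k \binom{n}{k} (t-k)^n. \tag{10}$$

In (1), $F_T(n) = \frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^n$ is the Stirling number of the second kind with both parameters equal to $n$ and has numerical value 1 [39, pages 38-39]. If $t > n$, then $Q \subset S$, so $F_T(t) = 1$ in this case. Since $\frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (t-k)^n$ is a polynomial in $t$ that equals 1 for every $t > n$, it equals 1 for all real-valued $t$. Introducing the unit step function gives (1), and differentiation with respect to $t$ gives (2).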
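The closing step of the proof rests on the fact that the full alternating sum $\frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (t-k)^n$, with no terms dropped, is identically 1. This can be checked in exact rational arithmetic. The sketch below uses a hypothetical function name and tests only a handful of values, so it is a sanity check and not a proof.

```python
from fractions import Fraction
from math import comb, factorial


def full_alternating_sum(t, n):
    """(1/n!) * sum_{k=0}^{n} (-1)^k C(n, k) (t - k)^n with every term included."""
    t = Fraction(t)
    total = sum((-1) ** k * comb(n, k) * (t - k) ** n for k in range(n + 1))
    return total / factorial(n)


# t = n gives the Stirling number of the second kind S(n, n) = 1;
# the other points illustrate that the polynomial is constant in t.
checks = [
    full_alternating_sum(t, n)
    for n in (1, 2, 3, 5, 8)
    for t in (n, Fraction(7, 3), -5, 100)
]
```

Every entry of `checks` is the exact rational number 1, in agreement with the fact that the $n$th finite difference of $t^n$ is $n!$.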

3. Discussion and a Generalization

Figures 2 and 3 reveal the structure of the CDF

$$F_T(t) = \frac{1}{2} \left[ t^2 \, u(t) - 2 (t-1)^2 \, u(t-1) + (t-2)^2 \, u(t-2) \right] \tag{11}$$

for $n = 2$, where $u$ is the unit step function. Figure 2 demonstrates how the hyperplane (line) $x_1 + x_2 = t$, which is the line of a constant sum of the values of the random variables and is perpendicular to the $n$-cube's (square's) main diagonal, accrues volume (area) below it. Figure 3 illustrates the regions that are included and excluded for various positions of the hyperplane (line) and how vertices are met in sets. For $n = 2$, the binomial coefficients, which provide the counts of the vertices, are 1 for $(0, 0)$, 2 for $(1, 0)$ and $(0, 1)$, and 1 for $(1, 1)$, as seen in Figures 2 and 3. In (11), the first term is the area of the large triangle in Figures 3(a), 3(b), and 3(c); the second term is the sum of the areas of the two hatched triangles in Figure 3(b), where exactly one of $x_1, x_2$ is greater than 1, and in Figure 3(c); and the third term is the area of the crosshatched triangle in Figure 3(c), where both $x_1$ and $x_2$ are greater than 1.
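As a worked check on the case $n = 2$, the three terms $t^2/2$, $-(t-1)^2$, and $(t-2)^2/2$, which enter at $t = 0$, $t = 1$, and $t = 2$, can be combined interval by interval. The expansion below is routine algebra added for illustration and is not taken from the source.

```latex
F_T(t) =
\begin{cases}
0, & t < 0, \\[4pt]
\dfrac{t^2}{2}, & 0 \le t < 1, \\[8pt]
\dfrac{t^2}{2} - (t-1)^2 = 1 - \dfrac{(2-t)^2}{2}, & 1 \le t < 2, \\[8pt]
\dfrac{t^2}{2} - (t-1)^2 + \dfrac{(t-2)^2}{2} = 1, & t \ge 2.
\end{cases}
```

The middle case is the area of the unit square minus the area of the corner triangle above the line, which is the complementary view of the same inclusion-exclusion count.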

Figure 4 shows the same geometric interpretation for $n = 3$. In its CDF

$$F_T(t) = \frac{1}{6} \left[ t^3 \, u(t) - 3 (t-1)^3 \, u(t-1) + 3 (t-2)^3 \, u(t-2) - (t-3)^3 \, u(t-3) \right], \tag{12}$$

where $u$ is the unit step function, the first term is the volume using (7) of the large orthogonal simplex in Figures 4(a), 4(b), and 4(c) with edges of length $t$. The second term is the sum of the volumes using (8) of the three orthogonal simplexes, where exactly one of $x_1, x_2, x_3$ is greater than 1. In Figure 4(b), the vertices of one of these simplexes are labeled, and the lengths of its three mutually perpendicular edges are $t - 1$. The third term of (12) is the sum of the three volumes using (8), where exactly two of $x_1, x_2, x_3$ are greater than 1. In Figure 4(c), the vertices are labeled in the region where two of the coordinates are both greater than 1, and the lengths of the corresponding edges are $t - 2$. The fourth term is the volume of the region that is shared by all the other regions, analogous to the crosshatched region in Figure 3(c).
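The volumes of the smaller simplexes can be checked by simulation. The sketch below uses hypothetical names, an assumed value $t = 2.5$, and an assumed sample size. It estimates, for $n = 3$, the volume of the part of the large simplex in which the first $k$ coordinates all exceed 1, and compares it with $(t-k)^n/n!$.

```python
import random
from math import factorial


def corner_simplex_volume(t, n, k):
    """Closed form (t - k)^n / n! for k <= t, and 0 when k > t."""
    return max(t - k, 0.0) ** n / factorial(n)


def monte_carlo_volume(t, n, k, trials, rng):
    """Estimate the volume of {x >= 0, sum(x) <= t, x_1 > 1, ..., x_k > 1}."""
    hits = 0
    for _ in range(trials):
        x = [rng.uniform(0.0, t) for _ in range(n)]  # uniform point in [0, t]^n
        if sum(x) <= t and all(x[i] > 1.0 for i in range(k)):
            hits += 1
    return hits / trials * t ** n  # hit fraction times the volume of the box


rng = random.Random(2)  # fixed seed so the run is repeatable
t, n, trials = 2.5, 3, 200_000
estimate_one = monte_carlo_volume(t, n, 1, trials, rng)  # one coordinate above 1
estimate_two = monte_carlo_volume(t, n, 2, trials, rng)  # two coordinates above 1
```

With these settings the closed-form values are $1.5^3/6 = 0.5625$ and $0.5^3/6 \approx 0.0208$, and the estimates should agree with them up to sampling error.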

In the same way, for any $n$, the terms are the $n$-volumes of orthogonal $n$-simplexes, whose multiplicity is counted by binomial coefficients determined by the number of vertices of the $n$-cube in sets as the “moving” $(n-1)$-dimensional hyperplane “passes” them as $t$ increases. The hyperplane is perpendicular to the diagonal line $x_1 = x_2 = \cdots = x_n$. The volumes of the simplexes are computed using (7) and (8).

The website [40] has a free simulator from which the PDF (2) can be obtained. Other calculators are at [41, 42].

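The method of proof in Section 2 can be extended to linear combinations of uniform random variables on different intervals. Suppose that $X_1, \ldots, X_n$ are independent, that $X_i$ is uniformly distributed on the interval $[a_i, b_i]$, and that $c_1, \ldots, c_n$ are real constants. Also,

$$W = \sum_{i=1}^{n} c_i X_i = \sum_{i=1}^{n} c_i a_i + \sum_{i=1}^{n} c_i (b_i - a_i) Y_i, \quad \text{where} \quad Y_i = \frac{X_i - a_i}{b_i - a_i}.$$

Then, $Y_1, \ldots, Y_n$ are independent uniform random variables on $[0, 1]$, and $F_W(w) = P(W \le w)$ can be interpreted as the hypervolume of the solid that consists of all points $(y_1, \ldots, y_n)$ that lie inside the unit hypercube and on one side of the hyperplane $\sum_{i=1}^{n} c_i (b_i - a_i) y_i = w - \sum_{i=1}^{n} c_i a_i$. Now, proceed by inclusion-exclusion as in Section 2. In general, the formula for $F_W(w)$ is complicated because of the lack of symmetry that is caused by the presence of the unequal coefficients $c_i (b_i - a_i)$. This increases the number of cases and removes the congruence of the solids of each size whose hypervolumes need to be added or subtracted at each stage of the inclusion-exclusion process. Nevertheless, the correct distribution is obtained in this manner. A special case in which these problems disappear is $c_i (b_i - a_i) = d > 0$ for all $i$, so that

$$F_W(w) = F_T\!\left( \frac{w - \sum_{i=1}^{n} c_i a_i}{d} \right),$$

where $F_T$ is given in (1).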
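The symmetric special case can be sketched in code under one explicit assumption: each product of a constant and the length of its interval, $c_i (b_i - a_i)$, takes the same positive value $d$. Under that assumption, the linear combination $W = \sum c_i X_i$ equals $\sum c_i a_i + d\,T$, where $T$ is an Irwin-Hall variable. The symbols $a_i$, $b_i$, $c_i$, $d$ and all function names are illustrative choices.

```python
from math import comb, factorial, floor, isclose


def irwin_hall_cdf(t, n):
    """P(T <= t) for T the sum of n independent Uniform(0, 1) variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    total = sum((-1) ** k * comb(n, k) * (t - k) ** n for k in range(floor(t) + 1))
    return total / factorial(n)


def equal_scale_linear_combination_cdf(w, a, b, c):
    """P(sum c_i X_i <= w) for X_i ~ Uniform(a_i, b_i), assuming equal c_i (b_i - a_i)."""
    scales = [ci * (bi - ai) for ai, bi, ci in zip(a, b, c)]
    d = scales[0]
    if d <= 0 or not all(isclose(s, d) for s in scales):
        # Outside the assumed special case the general case analysis is needed.
        raise ValueError("sketch covers only a common positive c_i * (b_i - a_i)")
    shift = sum(ci * ai for ai, ci in zip(a, c))
    return irwin_hall_cdf((w - shift) / d, len(scales))
```

For example, with $X_1$ uniform on $[1, 3]$, $X_2$ uniform on $[2, 6]$, and constants $c = (2, 1)$, both scales equal 4 and the shift is 4, so the CDF at $w = 8$ is the Irwin-Hall CDF for $n = 2$ at $t = 1$, which is 1/2.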

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.