Abstract

We study the cumulative distribution function (CDF), probability density function (PDF), and moments of the distance between a given vertex and a uniformly distributed random point within a triangle. Using a computational technique that yields unified formulae for the CDF and PDF of this random distance, we compute its moments of arbitrary order, from which the variance and standard deviation can be easily derived. We conduct Monte Carlo simulations under various conditions to check the validity of our theoretical derivations. Our method can be adapted to study random distances sampled from arbitrary polygons by decomposing them into triangles.

1. Introduction

A Voronoi diagram is a partitioning of a space into convex polygons called Voronoi cells based on prespecified points (called seeds), such that each cell contains exactly one seed and the interior points of a cell are closer to this seed than to any other seed.

The Voronoi diagram was formally defined and studied for a general n-dimensional Euclidean space in 1908 [1]. Since its inception, it has been widely used in a variety of research fields. Below we list several applications as examples. A classical application of the Voronoi diagram is in forest ecology. To make forestry inventories, the distances from sampling points to seedlings are often recorded [2]. A better understanding of this random distance helps ecologists estimate the average number of plants per unit area [3]. Another application is in distribution management [4], where it is often necessary to study the expected random distances that result from dispatching vehicles to meet customers’ demand. In recent years, Voronoi diagrams have been used to estimate the Shannon entropy of multidimensional probability densities [5], to analyze the complex nature of biomolecular structure [6], to enhance posterior distribution estimation via nonparametric sampling approximation [7], to model stochastic foam geometries [8, 9], to simulate granular materials [10], and so forth.

Perhaps the most successful application of the Voronoi diagram is in the field of wireless communications. The analysis of the distance between given wireless base stations and random receivers is an important problem in wireless communication networks [11–13]. In fact, assuming that a random receiver always connects to the nearest base station, we can decompose the entire space into disjoint Voronoi cells [14] and simply study the properties of this random distance within one Voronoi cell, as shown in Figure 1(a). Because all Voronoi cells are convex polygons that can be further decomposed into disjoint triangles with the seed of the Voronoi cell serving as their common vertex, it suffices to study the distance between a given vertex (the seed of the Voronoi cell) and a uniformly distributed random point inside a triangle, as illustrated in Figure 1(b).

All of these applications require researchers to study the distribution or the moments of a random distance between a given vertex and a random point that is uniformly distributed within an arbitrary triangle. Statistical properties such as the cumulative distribution function (CDF) and probability density function (PDF) of such a random distance have been actively studied in recent years [15–17]. One drawback of these prior works is that the resulting CDF and PDF formulae depend on the geometric properties of the triangle, such as whether the altitude from the fixed vertex of the triangle to the opposite side is inside or outside the triangle. This dependence makes it difficult to use the CDF/PDF to study moments (e.g., the mean, variance, and skewness) of this random distance. In this paper, we use a parametrization technique to reformulate the CDF and PDF of this random distance so that they share a consistent form for all cases, which enables us to provide a general formula for moments of arbitrary order. We conducted Monte Carlo simulations with various configurations and found that the empirical results match the theoretical results well.

2. The Unified Formulae of CDF and PDF of the Random Distance

For the ease of presentation, we adopt the following symbols throughout the discussion as follows. We denote the triangle as , and the vertex is always considered as the reference point. We denote as the random location of a receiver within this triangle. We assume that is uniformly distributed within triangle ; namely,Here denotes the area of .

Our goal is to study the statistical properties of , the Euclidean distance between the reference point and the random point . To achieve this, we first compute , the CDF of .

In [15], the vertex is considered as the reference point, and we may assume that without loss of generality. Two cases need to be treated separately according to :(1)If , the altitude with the edge as base is inside of the triangle, called the acute or inside altitude case, as shown in Figure 2(a). Note that the right triangle case () is considered as a special example of this case.(2)If , the altitude with the base edge is outside of the triangle, called the obtuse or outside altitude case (Figure 2(b)).

We adopt a computational technique to merge those two cases into one form by using axial symmetry and employing an additional coefficient . As shown in Figure 2(c), the line is used as the symmetry axis. If is acute (the inside altitude case), we use information from (Figure 2(c)), where . Similarly, we use , where if is obtuse (outside altitude case). Points and are symmetric with respect to the axis . The additional coefficient will be discussed later when the CDF of is calculated.

As in [15], we draw a circle centered at with radius , to compute the distribution of the distance between and . The CDF function, , is the area of divided by , the total area of . Four possible cases are discussed below corresponding to different ranges of (Figure 3).

Case 1. Let be the length of altitude . When , as shown in Figure 3(a), the CDF of is the area of a sector divided by .We would like to point out that the above formula is valid for both the inside and outside altitude cases.

Case 2. If , as shown in Figure 3(b), the CDF of is the intersection area divided by .In formula (3), , , is , and . Let , and then and are variables related to by the following formulae:In this case, the acute/obtuse cases are unified by using coefficient defined as follows:When , is the deep blue area in Figure 3(b). When , it is the union of both light and deep blue areas.

Case 3. If , as shown in Figure 3(c), the CDF of is the shadow area divided by .where

Case 4. When , , so .
In summary, the CDF of iswhereThe PDF, , is the derivative of with respect to and is provided as follows:whereIt is not hard to derive thatConsequently the PDF formula (10) can be simplified as
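As a numerical cross-check of the four cases above, the CDF can also be evaluated by integrating in polar coordinates around the reference vertex. The sketch below is not the closed-form expression of this section. It assumes hypothetical vertex labels A (the reference vertex), B, and C, and uses the fact that the ray leaving A at angle theta from side AB meets side BC at distance h / sin(theta + beta), where h is the altitude from A and beta is the angle at B. Because this expression holds whether the altitude falls inside or outside the triangle, no case split is needed.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0

def triangle_polar_geometry(A, B, C):
    """Return (alpha, beta, h, area): angles at A and B, altitude from A, area."""
    ux, uy = B[0] - A[0], B[1] - A[1]            # vector AB
    vx, vy = C[0] - A[0], C[1] - A[1]            # vector AC
    wx, wy = C[0] - B[0], C[1] - B[1]            # vector BC
    cross = abs(ux * vy - uy * vx)               # twice the triangle area
    alpha = math.atan2(cross, ux * vx + uy * vy)
    beta = math.atan2(cross, -(ux * wx + uy * wy))   # BA . BC = -(AB . BC)
    h = cross / math.hypot(wx, wy)
    return alpha, beta, h, 0.5 * cross

def distance_cdf(r, A, B, C):
    """P(D <= r) for the distance D from A to a uniform random point in ABC."""
    alpha, beta, h, area = triangle_polar_geometry(A, B, C)

    def swept(theta):
        rho = h / math.sin(theta + beta)         # distance from A to side BC along the ray
        return 0.5 * min(r, rho) ** 2            # area element swept up to radius r

    return simpson(swept, 0.0, alpha) / area
```

For radii no larger than the altitude, the integrand is constant and the result reduces to the sector area divided by the triangle area, as in Case 1.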

3. Moments Calculation

To facilitate the computation of moments, we first need to compute the following important integrals. Recall that , and we haveHere is a special case of the incomplete beta function that can be calculated by trigonometric substitution. Let , and we havewhere

After using the inverse trigonometric substitution, including and , we havewhere

For convenience, we provide some special cases in the following equation:

By using formulae (14)–(17), we can derive the general formula for moments of the random distance as follows:

Below we provide four low-order moments as special cases of the above general formula:

With the above formulae, commonly used statistical quantities such as variance, skewness, and kurtosis of can be calculated easily.
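The moment formulae can be checked numerically without any closed form. The following sketch assumes hypothetical vertex labels A (reference vertex), B, and C, and evaluates the k-th raw moment by polar integration: the k-th moment equals the integral over the angle at A of rho(theta)^(k+2) / (k+2), divided by the triangle area, with rho(theta) = h / sin(theta + beta). It is a quadrature-based illustration, not the closed-form formula derived in this section.

```python
import math

def distance_moment(k, A, B, C, n=2000):
    """k-th raw moment of the distance from A to a uniform random point in ABC."""
    ux, uy = B[0] - A[0], B[1] - A[1]            # vector AB
    vx, vy = C[0] - A[0], C[1] - A[1]            # vector AC
    wx, wy = C[0] - B[0], C[1] - B[1]            # vector BC
    cross = abs(ux * vy - uy * vx)               # twice the triangle area
    area = 0.5 * cross
    alpha = math.atan2(cross, ux * vx + uy * vy)       # angle at A
    beta = math.atan2(cross, -(ux * wx + uy * wy))     # angle at B
    h = cross / math.hypot(wx, wy)                     # altitude from A

    def integrand(theta):
        rho = h / math.sin(theta + beta)         # distance from A to side BC
        return rho ** (k + 2) / (k + 2)

    # Composite Simpson rule over theta in [0, alpha]; n must be even.
    step = alpha / n
    total = integrand(0.0) + integrand(alpha)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3.0 / area

# Variance and standard deviation follow from the first two moments.
A, B, C = (0.0, 0.0), (1.0, 0.0), (3.0, 1.0)     # an outside-altitude triangle
m1 = distance_moment(1, A, B, C)
m2 = distance_moment(2, A, B, C)
std = math.sqrt(m2 - m1 * m1)
```

A convenient independent check for the second moment is the coordinate identity E[D^2] = (|AB|^2 + |AC|^2 + AB . AC) / 6, which gives 7/3 for the triangle used above.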

3.1. Extensions to Certain Types of Triangles

We would like to point out that, for certain special triangles, the first two moments have especially simple forms.(1)If , we have , , and therefore(2)If , we have , , , and(3)If and °, we have , , and . In this case

Below we validate our results by comparing the above formulae with some published results for special polygons.

Example 1 (rectangle). For a rectangle with side lengths and , the mean Euclidean distance from the center of the rectangle to a uniformly distributed random point in the rectangle iswhere and are the conditional expectations of this random distance within isosceles triangles and , respectively. The result presented above is computed by our formula (23) and is equivalent to the result derived in [18, Formula ].
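The rectangle case is easy to verify numerically. The sketch below uses a standard closed form for the mean distance from the center of an a-by-b rectangle, obtained by polar integration over the triangles cut by the diagonals, and compares it with a midpoint-grid average. The side names a and b and both function names are assumptions of this example, and the closed form is written independently rather than transcribed from the display above.

```python
import math

def mean_distance_from_center(a, b):
    """Closed-form mean distance from the center of an a-by-b rectangle."""
    d = math.hypot(a, b)                          # diagonal length
    return (d
            + a * a / (2 * b) * math.log((b + d) / a)
            + b * b / (2 * a) * math.log((a + d) / b)) / 6.0

def mean_distance_grid(a, b, m=400):
    """Midpoint-rule average of the distance to the center on an m-by-m grid."""
    total = 0.0
    for i in range(m):
        x = (i + 0.5) / m * a - a / 2
        for j in range(m):
            y = (j + 0.5) / m * b - b / 2
            total += math.hypot(x, y)
    return total / (m * m)
```

For a unit square, the closed form reduces to (sqrt(2) + ln(1 + sqrt(2))) / 6, approximately 0.3826.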

Example 2 (square). If the polygon is a square with length , then the square can be split into four isosceles right triangles by its diagonals. For each triangle, , and . Based on formula (26), we derive the formula for , which is the CDF of the distance from the center to a random point that is uniformly distributed in this square, as follows:where . Formula (30) is identical to Formula given in [19].
With formulae (27) and (28), we can calculate the first two moments of as follows:The result of the first moment was also reported in several earlier studies [2, 4]; to the best of our knowledge, the result of the second-order moment is novel.

Example 3 (regular polygon). In general, for any regular polygon , let be the distance from the center to an arbitrary edge. This regular polygon can be decomposed into isosceles and with the same shape. Consequently the conditional expectation and , which can be described by (23), must be identical for all , thus equal to the unconditional expectation . The same conclusion holds for corresponding to (24).
A more interesting result is the limit form of our results for regular polygon (a disc) when and . By taking the limits of (23) and (24), we havewhere is the radius of the circle.
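The disc limit can be illustrated numerically. In the sketch below, n is the number of sides and h is the distance from the center to an edge (both names are assumptions of this example). The two closed forms are obtained independently by integrating in polar coordinates over one of the n isosceles triangles, so they are an illustration rather than a transcription of (23) and (24). As n grows, the first moment approaches 2h/3 and the second moment approaches h^2/2, the familiar values for a disc of radius h.

```python
import math

def regular_polygon_moments(n, h):
    """First two moments of the distance from the center of a regular n-gon
    with inradius h to a uniformly distributed random point inside it."""
    t = math.pi / n                               # half of the apex angle of each triangle
    tan_t = math.tan(t)
    sec_t = 1.0 / math.cos(t)
    m1 = h / (3.0 * tan_t) * (sec_t * tan_t + math.log(sec_t + tan_t))
    m2 = h * h / 2.0 * (1.0 + tan_t * tan_t / 3.0)
    return m1, m2

# Moments approach the disc values 2h/3 and h^2/2 as n increases.
for sides in (4, 8, 64, 10000):
    print(sides, regular_polygon_moments(sides, 1.0))
```

With n = 4 and h equal to half the side length, the same function reproduces the square case.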

4. Simulation Studies

We conduct several Monte Carlo studies to compare the theoretical results derived in Sections 2 and 3 to those obtained from simulations. For a given triangle , we sample point from a uniform distribution on with the following formula:where and are generated from uniform distribution subject to condition . We would like to point out that this technique was first used in [20]. A total number of points are generated with formula (33), and we compute the distance .
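A minimal sketch of this sampling step follows. It assumes the common construction P = A + u(B - A) + v(C - A) with u and v uniform on [0, 1] and u + v <= 1, and it enforces the constraint by reflecting (u, v) to (1 - u, 1 - v) when the sum exceeds one. Rejecting and redrawing such pairs would be an equally valid way to satisfy the condition. The vertex labels and function names are hypothetical.

```python
import math
import random

def sample_point_in_triangle(A, B, C, rng):
    """Draw one point uniformly from triangle ABC."""
    u, v = rng.random(), rng.random()
    if u + v > 1.0:                               # fold the upper half of the unit square back
        u, v = 1.0 - u, 1.0 - v
    return (A[0] + u * (B[0] - A[0]) + v * (C[0] - A[0]),
            A[1] + u * (B[1] - A[1]) + v * (C[1] - A[1]))

def sample_distances(A, B, C, n_points, seed=0):
    """Distances from the reference vertex A to n_points uniform random points."""
    rng = random.Random(seed)
    distances = []
    for _ in range(n_points):
        x, y = sample_point_in_triangle(A, B, C, rng)
        distances.append(math.hypot(x - A[0], y - A[1]))
    return distances
```

The reflection keeps every draw, which avoids discarding about half of the random pairs.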

Five cases of triangles are used in this simulation study and they are illustrated in Figure 4. Technical details about these triangles are given as follows:(a)The generic inside altitude (acute) case: we select , , and , as shown in Figure 4(a).(b)The generic outside altitude (obtuse) case: we select , , and , as shown in Figure 4(b).(c)The isosceles triangle case: we select , , and , as shown in Figure 4(c).(d)The generic right triangle case: we select , , and , as shown in Figure 4(d).(e)The isosceles right triangle case: we select , , and , as shown in Figure 4(e).

We compute the theoretical CDF and PDF of by formulae (8) and (13), respectively. The theoretical CDF and PDF are compared with the empirical CDF and PDF estimated by an adaptive Kernel Density Estimator [21] as implemented in MATLAB package “Kernel Density Estimator”.

As shown in Figures 5(a)–5(e), the empirical CDF and PDF match their theoretical counterparts very closely in all five cases.

Next, we compare the empirical moments with theoretical results defined by (21). Here we define the empirical moment as the maximum likelihood estimator of the th population moment, which is calculated by
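The estimator can be sketched in a few lines, assuming it is the usual sample raw moment, that is, the average of the k-th powers of the simulated distances. The standard deviation helper uses the divisor N to match that convention. Both function names are hypothetical.

```python
import math

def empirical_moment(distances, k):
    """Sample raw moment of order k: the mean of d**k over all simulated distances."""
    return sum(d ** k for d in distances) / len(distances)

def empirical_std(distances):
    """Standard deviation built from the first two sample raw moments (divisor N)."""
    m1 = empirical_moment(distances, 1)
    m2 = empirical_moment(distances, 2)
    return math.sqrt(max(m2 - m1 * m1, 0.0))      # guard against tiny negative round-off
```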

Relative errors of sample moments and the sample standard deviation (as compared to their theoretical counterparts) are summarized in Table 1. We decide to summarize relative errors with respect to standard deviation in this table as well because it is a commonly used characteristic.

More specifically, we define the relative error rate asThe relative error rate for the standard deviation, , is defined similarly. We then repeat the above simulation experiments for time and report (the mean of ), (the mean of ), (the maximum of ), and (the maximum of ) over repetitions in Table 1. From this table, we observed that the largest average error rate is less than and the largest maximum error rate is less than . All these simulation results show the consistency between the theoretical results and those estimated from the simulations.
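The summary statistics reported in Table 1 can be computed as sketched below. The sketch assumes the relative error rate is the absolute difference between the empirical and theoretical values divided by the theoretical value. The function names are hypothetical.

```python
def relative_error(estimate, theoretical):
    """Absolute deviation of an estimate from its theoretical value, as a fraction."""
    return abs(estimate - theoretical) / abs(theoretical)

def summarize_errors(estimates, theoretical):
    """Mean and maximum relative error over repeated simulation runs."""
    errors = [relative_error(e, theoretical) for e in estimates]
    return sum(errors) / len(errors), max(errors)
```

Here `estimates` holds one empirical moment (or standard deviation) per repetition, and the returned pair corresponds to the mean and maximum error rates.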

As in any simulation study, the error rate is a decreasing function of the number of sampling points () and it is important to understand the rate of convergence. Since we observe from Table 1 that the error rate is the largest for the obtuse case, we focus on this case in the following additional simulation studies. More specifically, we let change from 500 to 20,000 and recorded the observed average and maximum error rates in Figure 6. We see that the error rates decrease at rate approximately, which is consistent with the central limit theorem. We also observed that, in general, the higher the order of a moment, the larger its error rate. Based on this empirical evidence, if we want the maximum error rate to be no more than for the 4th-order moment (the most difficult case), we should choose no less than 13,000. As a remark, we also repeated these simulation studies for other shapes of triangles and the patterns of the error rate curves are very similar.
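The central-limit scaling can be illustrated with a small experiment. Under the central limit theorem, the error of a sample moment shrinks in proportion to the inverse square root of the number of points, so a 16-fold increase in sample size should reduce the average error by a factor of about 4. For simplicity, the sketch uses the right triangle with vertices (0, 0), (1, 0), and (0, 1) and the reference vertex at the origin, because its exact second moment is 1/3. This choice, the function name, and the repetition count are assumptions of the example and differ from the obtuse configuration studied above.

```python
import random

def mean_relative_error(n_points, repetitions, rng):
    """Average relative error of the sample second moment of the distance
    from (0, 0) to uniform points in the triangle (0, 0), (1, 0), (0, 1)."""
    exact = 1.0 / 3.0                             # exact E[D^2] for this triangle
    total = 0.0
    for _ in range(repetitions):
        acc = 0.0
        for _ in range(n_points):
            u, v = rng.random(), rng.random()
            if u + v > 1.0:                       # reflect to stay inside the triangle
                u, v = 1.0 - u, 1.0 - v
            acc += u * u + v * v                  # squared distance to the origin
        total += abs(acc / n_points - exact) / exact
    return total / repetitions

rng = random.Random(1)
err_small = mean_relative_error(500, 50, rng)
err_large = mean_relative_error(8000, 50, rng)
print(err_small / err_large)                      # expected to be roughly 4
```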

5. Conclusion

Studying the statistical properties of the distance (denoted as throughout this manuscript) from a fixed vertex of a triangle to a random point that is uniformly distributed on the interior of this triangle is important because many distance-optimization problems depend on Voronoi decomposition, in which the entire plane is divided into polygons called Voronoi cells, which can be further decomposed into triangles. This type of optimization has many applications in areas such as wireless communications, ecology, and distribution management.

In recent years, researchers have developed distinct CDF and PDF formulae for two types of triangles, namely, the inside altitude (acute) and outside altitude (obtuse) cases. Without a unified formula, it is difficult to derive useful statistical quantities such as moments and standard deviation of . In this paper, we consolidated the two special cases and gave a unified formula for the exact CDF and PDF of . Our formula is consistent with the results obtained in [15]. The unified CDF/PDF formulae reduce the computational burden significantly and help us derive population moments of with arbitrary order (see (20)); we also gave the exact formulae for the first four moments (see (21)). The reparametrization technique we use in consolidating the acute and obtuse cases may also be useful in similar research projects.

With our new PDF formula, the distribution of the distance between any point and a random point within an arbitrary polygon can be easily built by piecing triangles together [15, 16, 22, 23]. In this manuscript, we derived the moment formulae for some polygons with special shapes, such as rectangles, regular n-polygons, and discs as limiting cases of regular n-polygons. Our results are consistent with those derived from prior studies [2, 4, 18].

We conducted Monte Carlo simulation studies to verify the theoretical results derived in this study and to provide empirical evidence on how fast the observed error rate converges to zero. These results conform to the convergence rate predicted by the central limit theorem.

Disclosure

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Competing Interests

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work is supported in part by National Natural Science Foundation of China (nos. 61372190 and 61370193), the University of Rochester Center for AIDS Research (NIH 5 P30 AI078498-08), Respiratory Pathogens Research Center (NIAID Contract no. HHSN272201200005C), and the University of Rochester CTSA Award no. UL1 TR002001 from the National Center for Advancing Translational Sciences of the National Institutes of Health. The authors would like to thank the following individuals for their expert contributions to the studies described in this manuscript: Hongmei Yang, Jeanne Holden-Wiltse, Yu Gu, Sanjukta Bandyopadhyay, Lu Wang, Yogeshwar Kelkar, Jie Zhou, and James Java.