Success Run Waiting Times and Fuss-Catalan Numbers
We present power series expressions for all the roots of the auxiliary equation of the recurrence relation for the distribution of the waiting time for the first run of $k$ consecutive successes in a sequence of independent Bernoulli trials, that is, the geometric distribution of order $k$. We show that the series coefficients are Fuss-Catalan numbers and write the roots in terms of the generating function of the Fuss-Catalan numbers. Our main result is a new exact expression for the distribution, which is more concise than previously published formulas. Our work extends the analysis by Feller, who gave asymptotic results. We obtain quantitative improvements of the error estimates obtained by Feller.
For a sequence of independent Bernoulli trials with probability $p$ of success, let $N_k$ be the waiting time for the first run of $k$ consecutive successes. Then $N_k$ is said to have the geometric distribution of order $k$. See, for example, the texts by Balakrishnan and Koutras and by Johnson et al. An expression for $f_n = P(N_k = n)$ (the probability mass function of $N_k$) was derived by Philippou et al. in terms of multinomial sums, following earlier work by Philippou and Muwafi. By a clever counting argument, Burr and Cane obtained an expression as a sum of terms involving products of two binomial coefficients (rederived by Godbole using a different method). A somewhat different expression involving multinomial coefficients was obtained by Philippou and Makri. Perhaps the simplest expression to date is the one by Muselli [8, eq. 16], which requires a sum over binomial coefficients.
In his classic text [9, pp. 322–326], Feller took a different approach to the distribution of $N_k$ by setting up a suitable recurrence relation. The auxiliary equation has degree $k$. Feller showed that the equation has a unique largest (in absolute value) “principal root,” which is real and positive, and hence obtained an asymptotic formula for $f_n$ in terms of the principal root (more precisely, we work with a variable $z$ (see Section 2), whereas Feller employed $s = 1/z$; hence in Feller’s formalism the principal root has the smallest absolute value). Feller also bounded the error from neglecting the contribution from the other roots and formulated an iteration scheme for approximating the principal root numerically. Concerning the formula for the exact solution given by the recurrence relation method, Feller commented that it was “primarily of theoretical interest” because “the labor involved in computing all the roots is usually prohibitive” [9, p. 276]. Feller’s work dates from 1968 and computing power for numerical analysis has increased greatly since then. Note, however, that our derivation below is purely analytical.
We extend Feller’s analysis [9] of the recurrence relation for $f_n$ by finding power series expressions for all the roots of the auxiliary equation. We show that the series coefficients are Fuss-Catalan numbers (see the text by Graham et al. [10]) and the roots are given by suitable values of the generating function of the Fuss-Catalan numbers. This permits the roots to be written in terms of known “elementary functions.” This leads to our main result (21), which is a new exact expression for the distribution. This formula differs from the other results mentioned above in the important respect that there are only $k$ summands (independently of $n$). We also obtain quantitative improvements of the error estimates obtained by Feller, and we derive numerous properties of the roots of the auxiliary equation; for example, we draw attention to the fact that, in many results below, the value $p = k/(k+1)$ is a special case, and the ranges $p < k/(k+1)$ and $p > k/(k+1)$ require separate treatments.
Our analysis below focuses only on the original problem treated by Feller. More recent papers study variants of the problem; for example, Eryilmaz treats the geometric distribution of order $k$ with a reward, where each time a success occurs a random reward is received, while Shmerling studies a generalization of the geometric distribution of order $k$ for Markov processes.
2. Notation and Definitions
2.1. Auxiliary Equation
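We present our basic notation and definitions below. The probability mass function $f_n$ of $N_k$ satisfies the recurrence relation, for $n > k$,
$$f_n = q f_{n-1} + q p f_{n-2} + \cdots + q p^{k-1} f_{n-k}. \tag{1}$$
Here and below we define $q = 1 - p$. A derivation of (1) was given by Barry and Lo Bello. We employ the initial conditions $f_n = 0$ for $1 \le n < k$ and $f_k = p^k$. Next we define the auxiliary polynomial
$$A_k(z) = z^k - q z^{k-1} - q p z^{k-2} - \cdots - q p^{k-1}. \tag{2}$$
The auxiliary equation is $A_k(z) = 0$. We shall drop the subscript “$k$” unless necessary.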
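As a minimal numerical sketch of the recurrence (it is not part of the original derivation), the following Python code evaluates the probability mass function from the recurrence, written here as $f_n = q f_{n-1} + q p f_{n-2} + \cdots + q p^{k-1} f_{n-k}$ for $n > k$ with $f_n = 0$ for $n < k$ and $f_k = p^k$, and compares it with a brute-force enumeration of all outcome sequences. The function names `pmf_recurrence` and `pmf_bruteforce` are illustrative assumptions.

```python
from itertools import product

def pmf_recurrence(k, p, n_max):
    """Return [f_0, ..., f_n_max] for the geometric distribution of order k."""
    q = 1.0 - p
    f = [0.0] * (n_max + 1)
    if k <= n_max:
        f[k] = p ** k                      # the first k trials are all successes
    for n in range(k + 1, n_max + 1):
        # f_n = q f_{n-1} + q p f_{n-2} + ... + q p^{k-1} f_{n-k}
        f[n] = sum(q * p ** (j - 1) * f[n - j] for j in range(1, k + 1))
    return f

def pmf_bruteforce(k, p, n):
    """P(first run of k successes is completed exactly at trial n), by enumeration."""
    run = "1" * k
    total = 0.0
    for seq in product([0, 1], repeat=n):
        s = "".join(map(str, seq))
        # The run must end at trial n and must not have been completed earlier.
        if s.endswith(run) and run not in s[:-1]:
            successes = sum(seq)
            total += p ** successes * (1.0 - p) ** (n - successes)
    return total
```

The enumeration costs $2^n$ evaluations and is only meant as an independent check for small $n$.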
To establish contact with Feller’s formalism [9], note that he derived the following expression for the probability generating function [9, eq. (7.6)]:
$$F(s) = \frac{p^k s^k (1 - ps)}{1 - s + q p^k s^{k+1}}. \tag{3}$$
Feller then considered the roots of the polynomial in the denominator of (3). Setting $s = 1/z$, this is equivalent to our auxiliary polynomial (2), up to the extra factor $z - p$:
$$1 - s + q p^k s^{k+1} = s^{k+1} (z - p) A(z). \tag{4}$$
We shall employ $z$ and (2) below, bearing in mind throughout that Feller worked with $s = 1/z$.
Feller [9] proved that the roots of the auxiliary equation are distinct. We denote the roots by $\lambda_j(p, k)$, $j = 0, \dots, k-1$. Unless required, we shall omit the arguments $p$ and $k$. It is useful to multiply $A(z)$ by $z - p$ to obtain the polynomial
$$B(z) = (z - p) A(z) = z^{k+1} - z^k + q p^k. \tag{5}$$
It was shown by Feller [9] that there is exactly one positive real root for $0 < p < 1$. This real positive root will feature sufficiently prominently in our analysis below that we designate it by the symbol $\lambda_0$ and call it the “principal root.” The other roots will be termed “secondary roots.” We define . For brevity in various calculations below, we also define and .
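The properties quoted above can be checked numerically. The sketch below assumes the auxiliary polynomial has the form $A(z) = z^k - q z^{k-1} - q p z^{k-2} - \cdots - q p^{k-1}$ with $q = 1 - p$, and uses `numpy.roots` to compute all $k$ roots. It then verifies that multiplying by $z - p$ collapses the polynomial to $z^{k+1} - z^k + q p^k$. The helper names `aux_coeffs` and `aux_roots` and the sample values $k = 4$, $p = 0.3$ are hypothetical choices for illustration.

```python
import numpy as np

def aux_coeffs(k, p):
    """Coefficients of A(z) = z^k - q z^(k-1) - q p z^(k-2) - ... - q p^(k-1)."""
    q = 1.0 - p
    return np.array([1.0] + [-q * p ** j for j in range(k)])

def aux_roots(k, p):
    """All k roots of A(z), sorted so that the largest modulus comes first."""
    roots = np.roots(aux_coeffs(k, p))
    return roots[np.argsort(-np.abs(roots))]

k, p = 4, 0.3
roots = aux_roots(k, p)
principal = roots[0].real          # expected to be real, positive and below 1

# (z - p) A(z) should equal z^(k+1) - z^k + q p^k.
B = np.polymul([1.0, -p], aux_coeffs(k, p))
print("principal root:", principal)
print("secondary moduli:", np.abs(roots[1:]))
print("coefficients of (z - p) A(z):", B)
```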
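2.2. Special Cases $p = 0$ and $p = 1$
The special cases $p = 0$ and $p = 1$ are usually ignored in the literature, but it is useful to record the solutions, to help with limits for other calculations for $0 < p < 1$. Clearly if $p = 0$ then $f_n = 0$ for all $n$. Also if $p = 1$ then $f_k = 1$ and $f_n = 0$ for all other values of $n$. However, we are more interested in the roots of the auxiliary equation. If $p = 1$ the auxiliary equation is $z^k = 0$ and all the roots vanish. If $p = 0$ the auxiliary equation is $z^k - z^{k-1} = 0$. One root is $z = 1$ and there are $k - 1$ repeated roots $z = 0$. Hence the principal root equals 1 if $p = 0$ and 0 if $p = 1$, while in both cases the secondary roots vanish, $\lambda_j = 0$ for $j = 1, \dots, k-1$. These facts will be helpful below.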
3. Main Results
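Theorem 1 (roots of auxiliary equation). For $n \ge 1$, let
$$c_n = \frac{1}{n} \binom{n(k+1)/k - 2}{n - 1} = \frac{\Gamma(n(k+1)/k - 1)}{n!\, \Gamma(n/k)}. \tag{6}$$
For $p \in [0, 1]$, the secondary roots are given by
$$\lambda_j = \sum_{n=1}^{\infty} c_n \left( e^{2\pi i j/k}\, p\, q^{1/k} \right)^n, \qquad j = 1, \dots, k-1. \tag{7}$$
The principal root is given by
$$\lambda_0 = \begin{cases} 1 - k \displaystyle\sum_{n=1}^{\infty} c_{kn} \left( q p^k \right)^n, & p \le k/(k+1), \\[2ex] \displaystyle\sum_{n=1}^{\infty} c_n \left( p\, q^{1/k} \right)^n, & p \ge k/(k+1). \end{cases} \tag{8}$$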
Remark 2 (Fuss-Catalan numbers). Relevant definitions, formulas, and identities for the Fuss-Catalan numbers can be found in the text by Graham et al. . The Fuss-Catalan numbers areThe numbers are well defined provided . Hence the coefficients defined in (6) areThe generating function of the Fuss-Catalan numbers is and [10, p. 363] Hence for all , the secondary roots are given byFor , the above expression also applies to . For , note thatHence for ,This establishes the connection of the roots of the auxiliary equation to the generating functions of the Fuss-Catalan numbers.
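The connection between the series coefficients and the Fuss-Catalan numbers can be illustrated numerically. The sketch below rests on two assumptions that are derived here by Lagrange inversion of $z (1 - z)^{1/k} = y$ and may be arranged differently in the source: the coefficients are $c_n = \Gamma(n(k+1)/k - 1) / (n!\, \Gamma(n/k))$, and they equal $-A_n(1 + 1/k, -1)$, where $A_n(t, r) = \frac{r}{tn + r} \binom{tn + r}{n}$ is the Fuss-Catalan number. The code sums $\sum_{n \ge 1} c_n y^n$ at $y = e^{2\pi i j/k} p q^{1/k}$ and evaluates the auxiliary polynomial at the result. All function names are hypothetical.

```python
import cmath
import math

def fuss_catalan(n, t, r):
    """A_n(t, r) = r/(t n + r) * binom(t n + r, n), as r/n! * prod_{i=1}^{n-1} (t n + r - i)."""
    if n == 0:
        return 1.0
    value = r
    for i in range(1, n):
        value *= t * n + r - i
    return value / math.factorial(n)

def log_c(n, k):
    """log of c_n = Gamma(n(k+1)/k - 1) / (n! Gamma(n/k)), for n >= 1."""
    t = (k + 1) / k
    return math.lgamma(n * t - 1) - math.lgamma(n + 1) - math.lgamma(n / k)

def series_root(k, p, j, terms=400):
    """Sum_{n>=1} c_n y^n with y = exp(2 pi i j / k) * p * q^(1/k), for 0 < p < 1."""
    q = 1.0 - p
    r = p * q ** (1.0 / k)                     # modulus of y
    phase = cmath.exp(2j * math.pi * j / k)    # y / |y|
    # Work with logarithms so that large c_n and small r^n do not overflow.
    return sum(math.exp(log_c(n, k) + n * math.log(r)) * phase ** n
               for n in range(1, terms + 1))

def aux_poly(k, p, z):
    """A(z) = z^k - q (z^(k-1) + p z^(k-2) + ... + p^(k-1))."""
    q = 1.0 - p
    return z ** k - q * sum(p ** j * z ** (k - 1 - j) for j in range(k))
```

For $t = 2$, $r = 1$ the function `fuss_catalan` returns the ordinary Catalan numbers. The truncation at 400 terms is adequate away from $p = k/(k+1)$, where the series converges only slowly.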
Theorem 3 (probability mass function). For , the probability mass function is given by
The proof will be given in Section 6.
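One way to see how an exact expression with only $k$ summands can arise is a partial-fraction expansion of Feller's generating function $F(s) = p^k s^k (1 - ps) / (1 - s + q p^k s^{k+1})$. For $n \ge 1$ and $p \ne k/(k+1)$ this gives $f_n = \sum_{j=0}^{k-1} \frac{(1 - \lambda_j)(\lambda_j - p)}{q\,[(k+1)\lambda_j - k]} \lambda_j^n$, where the $\lambda_j$ are the roots of the auxiliary equation. This form is derived here for illustration and is an assumption: it may be arranged differently from the expression stated in the theorem. The sketch below compares it with the recurrence; the roots are obtained numerically with `numpy.roots` rather than from the power series, and the function names are hypothetical.

```python
import numpy as np

def pmf_recurrence(k, p, n_max):
    """[f_0, ..., f_n_max] from f_n = sum_{j=1}^{k} q p^(j-1) f_{n-j}, n > k."""
    q = 1.0 - p
    f = [0.0] * (n_max + 1)
    if k <= n_max:
        f[k] = p ** k
    for n in range(k + 1, n_max + 1):
        f[n] = sum(q * p ** (j - 1) * f[n - j] for j in range(1, k + 1))
    return f

def pmf_from_roots(k, p, n):
    """f_n for n >= 1 as a sum over the k roots of A(z); requires p != k/(k+1)."""
    q = 1.0 - p
    lam = np.roots([1.0] + [-q * p ** j for j in range(k)])
    # Partial-fraction weights; the denominator vanishes only if lam = k/(k+1).
    weights = (1 - lam) * (lam - p) / (q * ((k + 1) * lam - k))
    return float(np.sum(weights * lam ** n).real)
```

The number of summands is $k$ for every $n$, whereas the recurrence needs of order $nk$ operations to reach $f_n$.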
Corollary 4 (asymptotic solution). For fixed and , the contribution to is dominated by the principal root, and so asymptoticallyThe expression for was derived by Feller . We formulate the notion of “asymptotic” more precisely as follows. For fixed , we demand that the magnitude of the contribution to in (15) from all the secondary roots is less than times the contribution from the principal root. This is achieved if , whereHereHere
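The dominance of the principal root can be illustrated with a short numerical experiment. The sketch assumes the principal-root contribution has the form $\frac{(1 - \lambda)(\lambda - p)}{q\,[(k+1)\lambda - k]} \lambda^n$, with $\lambda$ the principal root; with $x = 1/\lambda$ this is equivalent to Feller's asymptotic formula $\frac{(x - 1)(1 - px)}{(k + 1 - kx)\, q}\, x^{-(n+1)}$. It is compared with the exact values from the recurrence. The parameter values and names are illustrative assumptions.

```python
import numpy as np

def pmf_recurrence(k, p, n_max):
    """[f_0, ..., f_n_max] from f_n = sum_{j=1}^{k} q p^(j-1) f_{n-j}, n > k."""
    q = 1.0 - p
    f = [0.0] * (n_max + 1)
    if k <= n_max:
        f[k] = p ** k
    for n in range(k + 1, n_max + 1):
        f[n] = sum(q * p ** (j - 1) * f[n - j] for j in range(1, k + 1))
    return f

def principal_term(k, p, n):
    """Contribution of the principal root alone; requires p != k/(k+1)."""
    q = 1.0 - p
    roots = np.roots([1.0] + [-q * p ** j for j in range(k)])
    lam = roots[np.argmax(np.abs(roots))].real   # root of largest modulus
    return (1 - lam) * (lam - p) / (q * ((k + 1) * lam - k)) * lam ** n

k, p = 3, 0.5
exact = pmf_recurrence(k, p, 40)
rel_err = {n: abs(principal_term(k, p, n) / exact[n] - 1.0) for n in (3, 10, 20, 40)}
for n, err in rel_err.items():
    print(n, err)
```

For these values the relative error is large at $n = k$ and shrinks geometrically as $n$ grows, at a rate set by the ratio of the largest secondary modulus to the principal root.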
The derivations of the expressions for , , and will be given in Section 7.
Theorem 5 (probability of longest run). Let denote the probability that, in a sequence of Bernoulli trials, the longest run of successes has length less than . Note that . ThenHence
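The probability that the longest run in $n$ trials is shorter than $k$ equals $P(N_k > n)$, the tail of the waiting-time distribution. Summing the geometric tails of a root expansion of $f_n$ suggests the form $\sum_{j=0}^{k-1} \frac{\lambda_j - p}{q\,[(k+1)\lambda_j - k]} \lambda_j^{n+1}$ for $p \ne k/(k+1)$. This expression is derived here as an assumption for illustration and may differ in arrangement from the one in the theorem. The sketch below checks it against direct enumeration for small $n$; the function names are hypothetical.

```python
from itertools import product
import numpy as np

def prob_longest_run_below(k, p, n):
    """P(longest success run in n trials < k) from the roots of A(z); needs p != k/(k+1)."""
    q = 1.0 - p
    lam = np.roots([1.0] + [-q * p ** j for j in range(k)])
    weights = (lam - p) / (q * ((k + 1) * lam - k))
    return float(np.sum(weights * lam ** (n + 1)).real)

def prob_longest_run_below_bruteforce(k, p, n):
    """The same probability by enumerating all 2^n outcome sequences."""
    run = "1" * k
    total = 0.0
    for seq in product([0, 1], repeat=n):
        if run not in "".join(map(str, seq)):
            successes = sum(seq)
            total += p ** successes * (1.0 - p) ** (n - successes)
    return total
```

The root-based form uses $k$ summands for any $n$, while the enumeration grows as $2^n$.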
An expression for was derived by Muselli [8, eq. 16] and requires a sum of terms; namely, Expressions involving (possibly nested) sums of binomial or multinomial coefficients were derived by Burr and Cane  and Philippou and Makri . For fixed , our formula (21) requires a sum of exactly terms, independently of (this can be reduced to terms by noting that the complex roots occur in conjugate pairs: there are (resp., ) complex roots for odd (resp., even) ).
Corollary 6 (asymptotic solution). For fixed and , the contribution to is dominated by the principal root, and so asymptotically
4. Properties of Roots
Section 4.1 presents results which are essential to prove the main results of our paper in Sections 5, 6, and 7. Section 4.2 contains additional results, which can be omitted by the reader who wishes to proceed directly to Section 5. We rewrite the equation $(z - p)A(z) = 0$, that is, $z^{k+1} - z^k + q p^k = 0$, in the form
$$z^k (1 - z) = q p^k. \tag{24}$$
We assume $0 < p < 1$ below; expressions where $p = 0$ or $p = 1$ will be indicated as appropriate.
4.1. Properties of Roots I
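Remark 7. Feller [9] proved that all the roots of the auxiliary equation are distinct. We include a summarized proof for completeness (see also Barry and Lo Bello). Recall that $B(z) = (z - p)A(z) = z^{k+1} - z^k + q p^k$ has all the roots of $A(z)$ and an extra root $z = p$. Now
$$B'(z) = (k+1) z^k - k z^{k-1} = z^{k-1} \left[ (k+1) z - k \right]. \tag{25}$$
Then $B'(z) = 0$ when $z = 0$, which is not a root of $B(z)$, or else $z = k/(k+1)$. But $z = k/(k+1)$ is a root of $B(z)$ if and only if $q p^k = k^k/(k+1)^{k+1}$; that is, $p = k/(k+1)$. So $B(z)$ can have a repeated root (of order 2) only when $p = k/(k+1)$. The roots are at $z = p = k/(k+1)$ and we know that one of those two roots is not a root of $A(z)$. Hence $A(z)$ has no repeated roots.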
Proposition 8. For $0 < p < 1$, the auxiliary equation has a unique positive real root, which lies in $(0, 1)$. One denotes the positive real root by $\lambda_0$. For any $p \in (0, 1)$, exactly one of the three following statements is true: (i) $p < k/(k+1) < \lambda_0$. (ii) $p = \lambda_0 = k/(k+1)$. (iii) $\lambda_0 < k/(k+1) < p$.
Proof. It was proved by Feller [9], who employed $s = 1/z$, that the auxiliary equation has a unique positive real root. In Feller’s analysis, the root had a magnitude larger than unity, so in our case the root lies in $(0, 1)$. The mutually exclusive statements (i), (ii), and (iii) follow immediately from an examination of the level sets of $z^k (1 - z)$ for $z \in (0, 1)$, bearing in mind that $p$ and $\lambda_0$ are the two positive solutions of (24) and that (24) has a repeated root of order 2 when $p = k/(k+1)$.
Proposition 9 (Feller ). For , the principal root has a strictly greater magnitude than all the other roots of the auxiliary equation; that is, , where is a root of . One employs the term “secondary roots” for the set , .
Proof. This was proved by Feller , who employed and showed that the positive real root has a smaller magnitude (in his case) than all the other roots.
Proposition 10. Suppose and and . Writing to denote the dependence on , then and . Also the secondary roots of the auxiliary polynomials and are identical; that is, for .
Proof. Clearly the equations and are identical. Hence the equations have the same roots. Since both equations have exactly two positive roots, which are and , respectively, and , it follows that and . The secondary roots of the two equations are obviously identical.
It follows that for any . From above, if then and if , then , which proves the result.
Proposition 11. Let . The principal root satisfies the bounds and .
Proof. The bound follows becauseThen because , and because is the unique positive root, it follows that . The upper bound is derived as follows:These bounds are not tight; for example, if then but so if then .
4.2. Properties of Roots II
Proposition 12. For , the roots are continuously differentiable functions of .
Proof. We have seen that the roots are distinct. In this section we write and and employ subscripts “” and “” below to denote partial derivatives. We show that if is a root, . Note that The former vanishes only at (not a root) or , which is a root only if . For , the latter vanishes only at . Next Hence at a root , and . Hence for all the secondary roots. For the primary root , if then and so . If then . We treat this case as follows. First, the partial derivatives are Now put ; then Hence for . The implicit function theorem then yields that every root is a continuously differentiable function of .
Proposition 13. (a) The principal root decreases monotonically and continuously from 1 to 0 as $p$ increases from 0 to 1.
(b) If $k$ is odd, there are no other real roots. If $k$ is even, there is exactly one real negative root.
(c) All the other roots are complex and form conjugate pairs. There are no pure imaginary roots.
Proof. (a) This is obvious by examining the level sets of. For any , the level set is . As increases then decreases monotonically and continuously, and we have seen for and for . (b) If is odd, then for and there are no other real roots. If is even, then for and increases continuously without bound as . Hence there is exactly one real negative root. (c) Obviously all the other roots are complex, and they form complex conjugate pairs because the coefficients of the auxiliary equation are real. To show there are no pure imaginary roots, let , where is real. Then for any integer , is never real; hence it cannot equal ; hence is not a root.
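The root structure described in Proposition 13 can be checked numerically for small $k$. The sketch below assumes the auxiliary polynomial $A(z) = z^k - q z^{k-1} - \cdots - q p^{k-1}$ and classifies the roots returned by `numpy.roots` with a small tolerance. The function name `classify_roots` and the tolerance are assumptions, and the check covers only the sampled parameter values, so it is an illustration and not a proof.

```python
import numpy as np

def classify_roots(k, p, tol=1e-9):
    """Return (real roots, negative real roots, pure imaginary roots, all roots) of A(z)."""
    q = 1.0 - p
    roots = np.roots([1.0] + [-q * p ** j for j in range(k)])
    real = sorted(float(z.real) for z in roots if abs(z.imag) < tol)
    negative_real = [x for x in real if x < 0]
    pure_imaginary = [z for z in roots if abs(z.real) < tol and abs(z.imag) > tol]
    return real, negative_real, pure_imaginary, roots

for k in range(1, 9):
    real, negative_real, pure_imaginary, roots = classify_roots(k, 0.3)
    print(k, "real:", len(real), "negative real:", len(negative_real),
          "pure imaginary:", len(pure_imaginary))
```

For odd $k$ one expects a single real root (the principal root); for even $k$ one expects two real roots, one of them negative.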
Proposition 14. For any , the secondary roots , , satisfy the inequalityThe inequalities involving are strict if . This is a stronger bound than Proposition 9.
Proof. We already know that if and only if . For the secondary roots, we proved . Hence which yields the claimed inequality.
Proposition 15. For , let denote the set of roots of the equation . Let . Then if or ; otherwise .
Proof. Suppose . Let . Then . Hence either or , in which case and coincide; else does not exist and .
Proposition 16. Let and let and be two distinct roots. Then (a) if and only if .(b) if and only if .(c)For even , the negative real root has the most negative real part, hence the smallest amplitude of all the roots.
Proof. (a) Suppose . Then . Cancelling the factors of and and squaring yields , soCancel terms to deduce . But since this implies . Hence either or . Geometrically, the condition implies equal distance from origin (circle around ), and implies equal distance from (circle around ). The solutions are the points of intersection of the two circles that is symmetric around the real axis, hence complex conjugates. (b) Note that . Multiply to obtain . Then so rearrange terms to obtainAll the roots satisfy this equation. The right-hand side of (34) is a continuous monotonically increasing function of . Hence if and are two roots with unequal amplitudes then implies and vice versa. (c) For even , let in (b) be real and negative. Suppose a root exists such that ; then necessarily , which contradicts the result in (b). Hence the negative real root has the most negative real part and the smallest amplitude of all the roots.
We proved in (32) that for all the secondary roots. We also know that all the secondary roots vanish, that is, attaining their minimum amplitudes, at or 1.
Proposition 17. All the secondary roots simultaneously attain their maximum amplitudes at $p = k/(k+1)$.
Proof. Let be a secondary root. For all , we know from (7) that is a series in powers of ; hence is differentiable with respect to within the radius of convergence of that series. We drop the subscript below. Using , we obtainRecall and . ThenThe denominator never vanishes for . The numerator , which is real, can be simplified as follows:The last line follows because ; hence . The inequalities are strict because is not real and positive. Hence and attains an extremum only if the factor vanishes; that is, . The extremum is obviously a maximum.
Remark 18 (moment generating function). An expression for the probability generating function was derived by Feller  (see (3)) and rederived by Philippou et al. . An equivalent expression for the moment generating function was derived by Barry and Lo Bello . Feller stated that the domain of convergence of is , whereas Philippou et al. stated that exists for . Barry and Lo Bello denoted the roots by , , and defined . They proved that all the roots have modulus less than unity and stated that exists on the interval . We confirm the correctness of Feller’s statement that the domain of convergence is determined by the principal root; hence yields the most precise value for the domain of convergence.
The following expressions for the mean and variance of the waiting time for the first run of successes were derived by Feller  (and rederived by Philippou et al. ):Chaves and de Souza  obtained the above expression for the mean but a different expression for the variance. We confirm the correctness of Feller’s expressions. The mean and variance are polynomials in and it is easily derived that From the above we see that both the mean and variance decrease strictly and continuously with . In particular and at , as expected. Both and diverge as .
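Feller's expressions for the mean and variance can be compared with direct summation of the probability mass function. The sketch below takes them in the form $\mu = (1 - p^k)/(q p^k)$ and $\sigma^2 = 1/(q p^k)^2 - (2k+1)/(q p^k) - p/q^2$, which is how they appear in Feller's text with $r$ replaced by $k$, and evaluates the first two moments numerically from the recurrence. The function names and the truncation point `n_max` are assumptions for illustration.

```python
def pmf_recurrence(k, p, n_max):
    """[f_0, ..., f_n_max] from f_n = sum_{j=1}^{k} q p^(j-1) f_{n-j}, n > k."""
    q = 1.0 - p
    f = [0.0] * (n_max + 1)
    if k <= n_max:
        f[k] = p ** k
    for n in range(k + 1, n_max + 1):
        f[n] = sum(q * p ** (j - 1) * f[n - j] for j in range(1, k + 1))
    return f

def feller_mean_variance(k, p):
    """Closed-form mean and variance of the waiting time, for 0 < p < 1."""
    q = 1.0 - p
    x = q * p ** k
    mean = (1.0 - p ** k) / x
    variance = 1.0 / x ** 2 - (2 * k + 1) / x - p / q ** 2
    return mean, variance

def numeric_mean_variance(k, p, n_max=5000):
    """Mean and variance from a truncated sum over the pmf; the tail beyond n_max is ignored."""
    f = pmf_recurrence(k, p, n_max)
    mean = sum(n * fn for n, fn in enumerate(f))
    second = sum(n * n * fn for n, fn in enumerate(f))
    return mean, second - mean ** 2
```

For example, $k = 2$ and $p = 1/2$ give mean 6 and variance 22.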
5. Power Series Solutions for Roots
We now derive expressions for the roots as sums of power series. We begin with the principal root . We already know that when . Next we treat the case , so . Let , , and .
Proposition 19. is a conformal homeomorphism from onto .
Proof. If then , and hence . Thus it suffices to show that is one-one on . Note that . So for all , which implies that is locally one-one on . Let and suppose, to derive a contradiction, that is nonempty. Let . Suppose that . Then there exist such that and . Clearly, and . But this contradicts the fact that is one-one on a neighborhood of . So . There exist and such that , , and . Passing to a subsequence, we may assume that , , and . Since , it follows that there exists such that for all . So , and hence . By continuity , so . Moreover, . Since is one-one on a neighborhood of it follows that . Choose disjoint open neighborhoods of and of . By the Open Mapping Theorem is an open neighborhood of . So there exist , such that , where . Since , this contradicts the definition of and hence contradicts the assumption that is nonempty.
Since is a conformal homeomorphism its inverse is analytic on . Hence for for some scalar sequence (we show in Proposition 23 below that is given by (6)). Note that but . Hence the functional equation cannot be continued through . It follows that cannot be continued analytically through . In particular, the Taylor series for has radius of convergence .
Corollary 20. Suppose . Then
Proof. Note that since and that . Hence .
Corollary 21. For all , the secondary roots are given by (note that )
Proof. First suppose . Then for . So for each there exist distinct such that . Thus, , and so is a secondary root as desired. The power series for has positive coefficients (see Proposition 23 below) and converges at when . Hence extends continuously to and the series for each converges absolutely when . By continuity of the series for each , when , represents a secondary root of the auxiliary equation as desired.
Corollary 22. For , the principal root is given by
Proof. The sum of the roots is ; hence the secondary roots sum up to . Using , it follows that for all . Hence for and soHenceIn the second line, the sum over the roots of unity vanishes unless is a multiple of . The interchange of the orders of summation is permissible because the series converge absolutely within the domain of convergence.
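The argument in this proof (subtract the secondary roots from the sum of all roots, then keep only the terms whose index is a multiple of $k$) leads to a series in powers of $q p^k$. Under the assumption that the coefficients are $c_n = \frac{1}{n} \binom{n(k+1)/k - 2}{n - 1}$, the result reads $\lambda_0 = 1 - \sum_{m \ge 1} \frac{1}{m} \binom{(k+1)m - 2}{m - 1} (q p^k)^m$ for $p < k/(k+1)$. This explicit form is derived here and should be treated as an assumption. The sketch evaluates it and compares it with the numerically computed root of largest modulus; the function names are hypothetical.

```python
import math
import numpy as np

def principal_root_series(k, p, terms=400):
    """1 - sum_{m>=1} binom((k+1)m - 2, m - 1)/m * (q p^k)^m, for 0 < p < k/(k+1)."""
    q = 1.0 - p
    log_x = math.log(q * p ** k)
    total = 0.0
    for m in range(1, terms + 1):
        # binom((k+1)m - 2, m - 1) = Gamma((k+1)m - 1) / (Gamma(m) Gamma(k m)).
        log_term = (math.lgamma((k + 1) * m - 1) - math.lgamma(m)
                    - math.lgamma(k * m) - math.log(m) + m * log_x)
        total += math.exp(log_term)
    return 1.0 - total

def principal_root_numeric(k, p):
    """Root of A(z) with the largest modulus, computed with numpy.roots."""
    q = 1.0 - p
    roots = np.roots([1.0] + [-q * p ** j for j in range(k)])
    return float(roots[np.argmax(np.abs(roots))].real)
```

For $k = 1$ the coefficients reduce to the Catalan numbers and the series sums to $1 - p$, as expected for the ordinary geometric distribution.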
Proposition 23. The coefficients are given byAsymptotically for large (this is an application of Stirling’s formula and the proof is omitted)
Proof. From (41), and ,Then and are easily derived. In the manipulations below, it is convenient to employ the falling factorial and rising factorial , where , while, for , and . We obtain for by equating terms in The coefficient of is , so by rearranging terms we can obtain in terms of . We thus formulate the following claim: we claim that identically, whereNext note that . Hence for all , we obtain the following product of and consecutive terms:HenceFor convenience below, replace by . Also define so . Then, dropping the prime,The product is a polynomial of degree in :Note that the coefficients depend on but not on . (e.g., .) Hence we can writeIt is a standard identity of binomial coefficients that, for all ,(Essentially, expand in a binomial series and differentiate times and evaluate the sum at .) It follows that identically. Setting then yields the original claim.
6. The Probability Mass Function
From the theory of recurrence relations, there exist constants , , such thatWe omit the dependences on and in (58) and below. Recall the initial conditions are for and . The equations to solve for the form a Vandermonde system:It is then known, from the properties of Vandermonde matrices, that the solution isThis can be expressed more concisely as follows. Note that the auxiliary polynomial can be written asHence for ,Hence . Next note that . Then provided (which can happen only if , and only for the positive root) . From (5), . Hence for ,HenceSubstitution into (58) yields (15) for .
Next we treat the case . This requires to be real and positive; that is, , and we know if and only if . Hence we set and in (63). The term is finite and positive for all . To evaluate the ratio in (63), which is a indeterminate form, we expand about . Let and set in (5), recalling that , , and . HenceSince , the solution is . Then (63) yieldsFor all the secondary roots, if then (63) yieldsHence for ,Substitution into (58) yields (15) for .
The derivation of the asymptotic expressions in Corollary 4 goes as follows. Because the principal root has a strictly larger magnitude than all the other roots, for sufficiently large it dominates the contribution to . From (15) for ,In the last line we used . The expression in (69) agrees with that by Feller . Next for , using ,This expression is not in .
We derive (17) and (18). For given fixed , we want such that, for all , the magnitude of the summed contribution from all the secondary roots in (15) is less than times the contribution from the principal root. From (15), one must haveNow suppose that we have an upper bound such that for all the secondary roots. Note that depends on and but not on . The contribution from the secondary roots is bounded byHence from (71) the contribution of the principal root will dominate ifIn the special case , we see from (15) that (73) simplifies toHence we require . Next if , then and , soHence in (73)Hence we require