Research Article  Open Access
The Concepts of Pseudo Compound Poisson and Partition Representations in Discrete Probability
Abstract
The mathematical/statistical concepts of pseudo compound Poisson and partition representations in discrete probability are reviewed and clarified. A combinatorial interpretation of the convolution of geometric distributions in terms of a variant of Newton’s identities is obtained. The practical use of the twofold convolution leads to an improved goodness-of-fit for a data set from automobile insurance that had not been fitted satisfactorily so far.
1. Introduction
Consider the class of discrete arithmetic random variables $X$ with probability generating function (pgf) $P(s)=\sum_{k=0}^{\infty}p_k s^k$ and nonvanishing zero probability $p_0>0$. Suppose that the pgf can be written as $P(s)=p_0\exp\{C(s)\}$ for some generating function $C(s)=\sum_{k=1}^{\infty}c_k s^k$. The pseudo compound Poisson representation concerns the duality between the sequences $(p_k)$ and $(c_k)$, which is best expressed in terms of the identity $P'(s)=C'(s)P(s)$ or equivalently the recurrence relations
$$k\,p_k=\sum_{j=1}^{k}j\,c_j\,p_{k-j},\quad k=1,2,\ldots \tag{1}$$
As a consequence, it has been shown that $X$ is infinitely divisible if and only if one has $c_k\ge 0$, $k=1,2,\ldots$ (e.g., [1]; [2, page 83]). Around the same time Feller [3, page 290] shows that $X$ is infinitely divisible if and only if it is a compound Poisson random variable with parameter $\lambda=\sum_{k=1}^{\infty}c_k$ and severity probabilities $q_k=c_k/\lambda$. If $X$ is not infinitely divisible, then (1) still holds for some $c_k$, with at least one negative value, and this property motivates the naming pseudo compound Poisson representation.
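Section 2 summarizes its main properties. The power series identity $P(s)=p_0\exp\{C(s)\}$, which links the pgf $P(s)$ with probabilities $p_k$ to the generating function $C(s)$ with coefficients $c_k$, leads also to the more complex and less known expression
$$p_k=p_0\sum_{\mu\in\mathcal{P}_k}\prod_{i=1}^{k}\frac{c_i^{m_i}}{m_i!},\quad k=1,2,\ldots \tag{2}$$
where $\mathcal{P}_k$ is the set of free partitions $\mu=(1^{m_1}2^{m_2}\cdots k^{m_k})$ of weight $k=\sum_{i}i\,m_i$, which is called partition representation. Although (2) has been applied in Hürlimann [4, Theorem 3.2] to derive an existence criterion for the construction of confidence bounds for discrete sampling distributions, expression (2) is not stated correctly there and its proof is missing. Section 3 fills in these gaps and provides a combinatorial interpretation of the partition representation.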
As a new illustration, Section 4 considers the convolution of $n$ geometric random variables with parameters $\rho_1,\ldots,\rho_n$, whose pseudo compound Poisson representation is specified by the $k$th power sum polynomial $k\,c_k=\sum_{i=1}^{n}\rho_i^{k}$. The partition representation (2) identifies the probability $p_k$ with the $p_0$ multiple of the $k$th complete symmetric function in the variables $\rho_1,\ldots,\rho_n$. The recursion (1) is equivalent to a variant of Newton’s identities, which motivates us to call this convolution Newton distribution. Applied to estimation theory, we show that this distribution satisfies Gauss’s principle (the maximum likelihood estimator of the mean is the sample mean) and construct a parameter vector orthogonal to the mean. In the two-parameter case, we derive the maximum likelihood equations and illustrate their use in Section 5 on a specific data set from automobile insurance. Through regrouping of classes, we show that the Newton distribution beats both the negative binomial and the Poisson inverse Gaussian in goodness-of-fit and at the same time improves the unsatisfactory fit obtained in a previous case study.
2. Characterization through Pseudo Compound Poisson Representation
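Recall the pseudo compound Poisson representation in discrete probability theory from Hürlimann [5, 6]. Let $X$ be a discrete arithmetic random variable defined on the natural numbers with probabilities $p_k=P(X=k)$, $k=0,1,2,\ldots$, such that $p_0>0$. Besides the probability generating function (pgf) $P(s)=\sum_{k=0}^{\infty}p_k s^k$ one considers the cumulant pgf $C(s)$ defined by
$$C(s)=\ln\{P(s)/p_0\}=\sum_{k=1}^{\infty}c_k s^k. \tag{3}$$
Its name is motivated by the series expansion of the cumulant generating function (cgf)
$$\ln E\left[e^{tX}\right]=\ln P(e^{t})=\ln p_0+\sum_{k=1}^{\infty}c_k e^{kt}. \tag{4}$$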
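Proposition 1 (pseudo compound Poisson representation). Let $X$ be a discrete arithmetic random variable with probabilities $p_k$, $p_0>0$, and cumulant pgf $C(s)=\sum_{k=1}^{\infty}c_k s^k$ such that $C(1)=\sum_{k=1}^{\infty}c_k<\infty$, and set $\lambda=C(1)=-\ln p_0$. Then the probabilities satisfy Panjer’s recursion
$$k\,p_k=\sum_{j=1}^{k}j\,c_j\,p_{k-j},\quad k=1,2,\ldots \tag{5}$$
and the following pseudo compound Poisson representation holds:
$$P(s)=\exp\{\lambda\,(Q(s)-1)\},\quad Q(s)=\sum_{k=1}^{\infty}q_k s^k,\quad q_k=\frac{c_k}{\lambda}. \tag{6}$$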
Proof. The recursion (5) is an immediate consequence of (3). Indeed, the derivative of the equation $P(s)=p_0\exp\{C(s)\}$ satisfies the relation $P'(s)=C'(s)P(s)$, which, after comparing the coefficients of $s^{k-1}$ on both sides, is equivalent to identities (5). For further details consult Hürlimann [5, Corollary 2] and Hürlimann [6, Theorem 1].
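As a numerical illustration of recursion (5), the following minimal Python sketch computes the probabilities from a given coefficient sequence. It writes `c[j-1]` for the coefficient $c_j$ of the cumulant pgf and assumes the recursion in the form $k\,p_k=\sum_{j=1}^{k}j\,c_j\,p_{k-j}$ together with $p_0=\exp(-\sum_j c_j)$. The function name `pcp_probabilities` and the parameter values are choices made for this sketch and do not come from the article.

```python
import math

def pcp_probabilities(c, n_max):
    """Return [p_0, ..., p_n_max] from the coefficients c_1, c_2, ... (c[j-1] holds c_j).

    Assumes k*p_k = sum_{j=1}^{k} j*c_j*p_{k-j} and p_0 = exp(-sum_j c_j),
    with c_j = 0 beyond the end of the list.
    """
    p = [math.exp(-sum(c))]
    for k in range(1, n_max + 1):
        total = sum(j * c[j - 1] * p[k - j] for j in range(1, min(k, len(c)) + 1))
        p.append(total / k)
    return p

# Poisson case: c_1 = lam and all other coefficients vanish.
lam = 1.5
poisson = pcp_probabilities([lam], 10)

# Geometric case: c_j = q**j / j, truncated after 30 terms (illustrative value of q).
q = 0.4
geometric = pcp_probabilities([q**j / j for j in range(1, 31)], 30)
```

In the first case the recursion reduces to $p_k=\lambda p_{k-1}/k$, and in the second it should return, up to truncation error, the geometric probabilities $(1-q)q^k$.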
Remarks 1. Clearly, if , , the distribution is compound Poisson with parameter and severity probabilities , , a fact denoted by . In this situation, the recursion (5) is named Panjer recursion after Panjer [7]. With Feller [3, Section XII.2], one knows that is compound Poisson if and only if it is infinitely divisible (see also [6] for a more general characterization). Otherwise, one says that has a pseudo compound Poisson distribution. In the terminology of Sundt [8], it belongs to the class with (see also [9]). The theoretical and practical usefulness of pseudo compound Poisson distributions have been demonstrated by the author in numerous publications. Sometimes, the sequence defined by (5) is called the De Pril transform of after De Pril [10] (e.g., [11]). The interest of recurrence relation (5) extends beyond discrete probability to the general context of integer sequences. Let be the identity map such that for all nonnegative integers. Then, the convolution equation for integer sequences and occurs in many areas of mathematics. An important problem is the relationship between the asymptotic behaviors of the two sequences and (e.g., [12] and its references).
The converse of Proposition 1, although used in applications (e.g., [6, 13–15]), has been less studied. In the case of negative coefficients $c_k$, a necessary condition on the cumulant pgf $C(s)$ such that (5) defines a true probability distribution was first identified by Lévy.
Proposition 2. Given is a finite generating function with . Assume that . In order that (5) defines a discrete probability distribution, it is necessary that the condition (NC) , is fulfilled.
Proof. The condition (NC) is found in Lukacs [16, page 252] and Johnson et al. [17, page 356]. Proofs are found in Lévy [18, 19] and Cuppens [20, Section 8.4 and Appendix B].
Remarks 2. Precise sufficient conditions (SC) in Proposition 2 are not known. In principle, any of the remaining ’s could be negative provided they are sufficiently small in absolute value; that is, , , for some (cf. [18, page 263]; [13, Remark 1 to Theorem 3.1]). Lévy [18, page 263] points out that is the simplest case admitting a negative . Van Harn [21, page 84] obtains four inequality constraints on in terms of the other ’s. Lukacs [16, page 251] shows that if , , then (5) defines a discrete probability distribution (cf. [13, Remark 3 to Theorem 3.1]).
3. Partition Representation and Combinatorial Interpretation
Besides the pseudo compound Poisson representation, the power series identity $P(s)=p_0\exp\{C(s)\}$ between the pgf $P(s)$ and the cumulant pgf $C(s)$ has another important immediate consequence.
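Proposition 3 (partition representation). Let $X$ be a discrete arithmetic random variable with probabilities $p_k$, $k=0,1,2,\ldots$, such that $p_0>0$, and let $C(s)=\sum_{k=1}^{\infty}c_k s^k$ be its cumulant pgf. Then, the following partition representation holds:
$$p_k=p_0\sum_{\mu\in\mathcal{P}_k}\prod_{i=1}^{k}\frac{c_i^{m_i}}{m_i!},\quad k=1,2,\ldots \tag{7}$$
where $\mathcal{P}_k$ is the set of free partitions $\mu=(1^{m_1}2^{m_2}\cdots k^{m_k})$ of weight $k=\sum_{i}i\,m_i$.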
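Proof. This result follows from the calculation
$$\frac{P(s)}{p_0}=\exp\Big\{\sum_{i=1}^{\infty}c_i s^{i}\Big\}=\prod_{i=1}^{\infty}\sum_{m_i=0}^{\infty}\frac{c_i^{m_i}s^{i\,m_i}}{m_i!}=\sum_{k=0}^{\infty}s^{k}\sum_{\mu\in\mathcal{P}_k}\prod_{i=1}^{k}\frac{c_i^{m_i}}{m_i!}, \tag{8}$$
where $c_i$ are the coefficients of the cumulant pgf and $\mathcal{P}_k$ collects the multiplicities $(m_1,\ldots,m_k)$ with $\sum_i i\,m_i=k$. Comparing the coefficients of $s^k$ implies identities (7).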
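The partition sum can be checked numerically against the recursion of Proposition 1. The sketch below is a minimal illustration, assuming the partition representation in the form $p_k/p_0=\sum_{\mu}\prod_i c_i^{m_i}/m_i!$, where the sum runs over all partitions of $k$ with multiplicities $m_i$, and the recursion in the form $k\,p_k=\sum_{j}j\,c_j\,p_{k-j}$. The helper names and the coefficient values are hypothetical.

```python
import math

def partitions(n, max_part=None):
    """Yield every partition of n as a dict {part: multiplicity}."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield {}
        return
    for part in range(min(n, max_part), 0, -1):
        for m in range(n // part, 0, -1):
            # 'part' is the largest part and occurs exactly m times
            for rest in partitions(n - m * part, part - 1):
                mu = {part: m}
                mu.update(rest)
                yield mu

def ratio_by_partitions(c, k):
    """p_k / p_0 as the sum over partitions of k of prod_i c_i^{m_i} / m_i!."""
    def coeff(i):
        return c[i - 1] if i <= len(c) else 0.0
    return sum(
        math.prod(coeff(i) ** m / math.factorial(m) for i, m in mu.items())
        for mu in partitions(k)
    )

def ratios_by_recursion(c, n_max):
    """[p_0/p_0, ..., p_n_max/p_0] from k*p_k = sum_j j*c_j*p_{k-j}."""
    r = [1.0]
    for k in range(1, n_max + 1):
        r.append(sum(j * c[j - 1] * r[k - j] for j in range(1, min(k, len(c)) + 1)) / k)
    return r

c = [0.5, 0.2, 0.1]  # illustrative coefficients c_1, c_2, c_3
from_recursion = ratios_by_recursion(c, 8)
from_partitions = [ratio_by_partitions(c, k) for k in range(9)]
```

Both lists should agree up to rounding. The partition sum grows quickly with $k$, so in practice the recursion is the cheaper route, while the partition form is the one that carries the combinatorial interpretation.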
Remarks 3. Although variants existed before (e.g., [22, Section 2]; [23, Equation (3.4)]), the general partition representation (7) for pseudo compound Poisson arithmetic distributions has been first stated in Corollary 3.1 of Hürlimann [4]. Unfortunately, the given formula contains a misprint: the multiplicity factorial has been wrongly replaced by . Note that this mistake remains without influence on the validity of Theorem 3.2 derived from this representation. However, the claimed but missing proof by induction is best replaced by the present proof. The simple power series manipulation has already been used by Macdonald [24] in his proof on page 25 of a similar but more special result in the theory of symmetric functions (see later Section 4.3). Section I.1 of the mentioned textbook is recommended reading for definitions and properties around partitions (in particular weight, length, and multiplicity of a partition).
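The partition representation has a nice combinatorial interpretation. For convenience, set $x_k=k\,c_k$, $k=1,2,\ldots$, where $c_k$ are the coefficients of the cumulant pgf. Then, the expression
$$\frac{p_k}{p_0}=\sum_{\mu\in\mathcal{P}_k}\prod_{i=1}^{k}\frac{x_i^{m_i}}{i^{m_i}\,m_i!} \tag{9}$$
derived from (7) identifies with the cycle index polynomial (or cycle indicator) of the symmetric group $S_k$ on $k$ letters in the variables $x_1,\ldots,x_k$ (e.g., [24, Example 9.(a), page 29]). Indeed, the coefficient in (9) of an arbitrary monomial $x_1^{m_1}x_2^{m_2}\cdots x_k^{m_k}$ is equal to the fraction of all permutations in $S_k$ that have $m_1$ fixed points, $m_2$ cycles of length $2$, $\ldots$, and $m_k$ cycles of length $k$. We note that this combinatorial interpretation is crucial to the novel example studied in the next two sections.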
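The cycle-index reading can be verified by brute force for a small symmetric group. The following sketch enumerates all permutations of four letters, groups them by cycle type, and compares the observed fractions with the coefficient $1/\prod_i i^{m_i}m_i!$. It is only an illustrative check, and the function names are chosen for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """Cycle type of a permutation (tuple of images) as sorted (length, multiplicity) pairs."""
    seen = [False] * len(perm)
    lengths = Counter()
    for start in range(len(perm)):
        if not seen[start]:
            length, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            lengths[length] += 1
    return tuple(sorted(lengths.items()))

def cycle_type_fractions(n):
    """Fraction of the n! permutations having each cycle type."""
    counts = Counter(cycle_type(p) for p in permutations(range(n)))
    return {t: Fraction(cnt, factorial(n)) for t, cnt in counts.items()}

def cycle_index_coefficient(ctype):
    """Coefficient 1 / prod_i (i^{m_i} * m_i!) attached to a cycle type."""
    denom = 1
    for i, m in ctype:
        denom *= i**m * factorial(m)
    return Fraction(1, denom)

fractions_s4 = cycle_type_fractions(4)
```

For example, the six 4-cycles make up one quarter of the 24 permutations, which matches the coefficient $1/(4^1\cdot 1!)$.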
4. The Convolution of Geometric Random Variables: A Newton Type Distribution
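To illustrate further the discrete probability concepts of Sections 2 and 3, consider the power sum like specification of the cumulant pgf in (3); namely,
$$C(s)=\sum_{k=1}^{\infty}c_k s^k,\quad c_k=\frac{1}{k}\sum_{i=1}^{n}\rho_i^{k}, \tag{10}$$
where $\rho_i\in(0,1)$, $i=1,\ldots,n$, are unknown parameters. A calculation shows that the quantity $\lambda=C(1)=-\sum_{i=1}^{n}\ln(1-\rho_i)$ is finite. The condition (C1) of Proposition 2 is fulfilled and therefore (10) belongs to the random variable $X$ of some compound Poisson distribution, to be determined. Clearly, one has
$$C(s)=-\sum_{i=1}^{n}\ln(1-\rho_i s), \tag{11}$$
from which one obtains
$$P(s)=\exp\{C(s)-C(1)\}=\prod_{i=1}^{n}\frac{1-\rho_i}{1-\rho_i s}, \tag{12}$$
which is the pgf of the convolution of $n$ geometric random variables; hence
$$X=X_1+\cdots+X_n,\quad P(X_i=k)=(1-\rho_i)\,\rho_i^{k},\quad k=0,1,2,\ldots, \tag{13}$$
with independent $X_1,\ldots,X_n$. Consider the partition representation of Proposition 3. One observes that $x_k=k\,c_k=\sum_{i=1}^{n}\rho_i^{k}$ is nothing else other than the $k$th power sum polynomial in the variables $\rho_i$, $i=1,\ldots,n$. Identity (9) is herewith equal to
$$\frac{p_k}{p_0}=\sum_{\mu\in\mathcal{P}_k}\prod_{i=1}^{k}\frac{x_i^{m_i}}{i^{m_i}\,m_i!}=h_k(\rho_1,\ldots,\rho_n), \tag{14}$$
and shows with Macdonald [24, Equation (2.14′), page 25] that the $k$th probability of $X$, namely, $p_k$, is the $p_0$ multiple of the $k$th complete symmetric function $h_k$ in the variables $\rho_1,\ldots,\rho_n$. In this combinatorial context the Panjer recursion (5) is equivalent to the identities
$$k\,h_k=\sum_{j=1}^{k}x_j\,h_{k-j},\quad k=1,2,\ldots, \tag{15}$$
which is nothing else other than a variant of Newton’s identities (see [24, Equations (2.11) and (2.11′), page 23]). Therefore, the distribution (14) of the convolution of geometric random variables derived from the recursion (15) could also be called Newton distribution. The distribution corresponding to the random variable (13) is abbreviated by .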
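The equivalence between the Newton-type recursion and the direct convolution of geometric distributions can be checked numerically. The sketch below is an illustration under stated assumptions: the geometric parameters are written `rho` (a label chosen here), each component has probabilities $(1-\rho_i)\rho_i^k$, and the recursion is taken in the form $k\,h_k=\sum_{j=1}^{k}x_j h_{k-j}$ with power sums $x_j=\sum_i\rho_i^j$ and $p_k=p_0h_k$. The function names and parameter values are hypothetical.

```python
def newton_probabilities(rho, n_max):
    """[p_0, ..., p_n_max] via k*h_k = sum_{j=1}^{k} x_j*h_{k-j}, p_k = p_0*h_k."""
    p0 = 1.0
    for r in rho:
        p0 *= 1.0 - r
    h = [1.0]  # h_0 = 1
    for k in range(1, n_max + 1):
        # x_j is the j-th power sum of the parameters
        h.append(sum(sum(r**j for r in rho) * h[k - j] for j in range(1, k + 1)) / k)
    return [p0 * hk for hk in h]

def convolve_geometrics(rho, n_max):
    """Same probabilities by convolving the geometric pmfs one after another."""
    pmf = [1.0] + [0.0] * n_max  # point mass at zero
    for r in rho:
        geo = [(1.0 - r) * r**k for k in range(n_max + 1)]
        pmf = [sum(pmf[i] * geo[k - i] for i in range(k + 1)) for k in range(n_max + 1)]
    return pmf

rho = [0.2, 0.5, 0.7]  # illustrative parameters
by_recursion = newton_probabilities(rho, 15)
by_convolution = convolve_geometrics(rho, 15)
```

For two parameters $a\neq b$ the complete symmetric function has the closed form $h_k(a,b)=(a^{k+1}-b^{k+1})/(a-b)$, which gives a further independent check of the recursion.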
Now, we apply our concepts to study some estimation properties of the Newton distribution. In a first step, we construct a parameter vector orthogonal to the mean (16). In general, given a random variable we suppose that the mean is functionally independent of a certain parameter vector , that is, , , and denote the log-likelihood of by . The mean is orthogonal to the parameter vector , denoted by , if one has , . Further, let be the class of all discrete arithmetic distributions for which the maximum likelihood estimator of the mean is the sample mean, that is, such that (Gauss’s principle). The subclass of , called mean orthogonal class, consists of all those distributions which, besides Gauss’s principle, satisfy the mean orthogonal property . It is known that the class is closed under convolution (see [15, Theorem 2.2]) and can be characterized as follows. Suppose there exist a parameter and a one-to-one coordinate transformation mapping to , and set . Then with is equivalent to the following condition (17) (see [15, Lemma 3.1, Equation (3.8)]), where , is determined by the pseudo compound Poisson representation of . Applied to the Newton distribution, one sees after some calculation that (17) with , , is fulfilled provided the partial differential equation (18) can be solved. It is easy to see that this will be fulfilled provided one has (19). A solution to these partial differential equations is (20). It follows that the mean is orthogonal to the parameter vector .
Next, we derive the maximum likelihood equations for the two-parameter Newton distribution obtained from the convolution of two geometric distributions with parameters , . The probabilities in the above orthogonal parameterization are given by (21). For each let be the number of observations of the random variable equal to , and let be the total number of observations, where for all . From (21) one obtains the scaled log-likelihood function (22). The maximum likelihood equations are given by (23). This system of two nonlinear equations in the two variables can be solved with substitution setting . Insert in the first equation to see that (24). The only feasible solution of this quadratic equation takes the minus sign and is given by (25). Using that and inserting (25) into the second equation of (23) one sees that the latter depends besides the observations only upon and is determined by (26). The preceding simple derivation of maximum likelihood properties is possible in virtue of the pseudo compound Poisson representation (mainly (17)). Let us also illustrate the mathematical benefit of the partition representation. For this we rely on Theorem 3.2 in Hürlimann [4], which guarantees the existence of well-defined confidence bounds for the mean of a count distribution provided the following regularity assumption is fulfilled.
(RA) There exist a positive integer and a constant such that and for all partitions such that .
Now, for the Newton distribution with two parameters such that , one has (27) and thus . The regularity assumption with is fulfilled.
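A rough numerical counterpart to the two-parameter maximum likelihood fit can be sketched as follows. This is not the closed-form solution of the likelihood equations described above. It swaps in a plain grid search, and it relies on Gauss’s principle as stated in the text to fix the fitted mean at the sample mean. The geometric parameters are called `a` and `b`, the component means `m1` and `m2`, and the claim counts are synthetic toy values, all of which are assumptions of this sketch.

```python
import math

def newton2_pmf(a, b, k):
    """P(X = k) for the sum of two independent geometrics with parameters a and b."""
    h = sum(a**j * b**(k - j) for j in range(k + 1))  # complete symmetric function h_k(a, b)
    return (1.0 - a) * (1.0 - b) * h

def loglik(a, b, counts):
    """Log-likelihood for grouped data; counts[k] = number of observations equal to k."""
    return sum(n_k * math.log(newton2_pmf(a, b, k)) for k, n_k in enumerate(counts) if n_k)

def fit_on_mean_constraint(counts, grid=2000):
    """Grid search along a/(1-a) + b/(1-b) = sample mean (Gauss's principle assumed)."""
    n = sum(counts)
    xbar = sum(k * n_k for k, n_k in enumerate(counts)) / n
    best = None
    for i in range(1, grid + 1):
        m1 = 0.5 * xbar * i / grid  # mean of the first component, in (0, xbar/2]
        m2 = xbar - m1              # mean of the second component
        a, b = m1 / (1.0 + m1), m2 / (1.0 + m2)
        ll = loglik(a, b, counts)
        if best is None or ll > best[0]:
            best = (ll, a, b)
    return best + (xbar,)

toy_counts = [70, 20, 7, 2, 1]  # synthetic data, not from the article
best_ll, a_hat, b_hat, xbar = fit_on_mean_constraint(toy_counts)
```

Restricting the first component mean to at most half the sample mean removes the symmetry between the two parameters. If the maximum lies on the boundary of this range, the grid returns that boundary point, which should then be read as a sign that the data are not well described by an interior two-parameter fit.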
5. A Numerical Example from Automobile Insurance
The choice of an appropriate claim number model in automobile insurance is a prominent actuarial research topic whose 50-year history begins with Bichsel [25], who used the negative binomial (NB) distribution to construct a bonus-malus system. Among the other first good choices, the Poisson inverse Gaussian (PIG) has often been advocated (see Example 3.4 in [4] and discussions). Due to the penalty in the SBC (Schwarz Bayesian criterion) score the best overall fit is in any case obtained for a two-parameter distribution (see [4, end of Section 4]). In this context, we ask whether the Newton distribution might compete with the NB and PIG. For a specific example and in a very precise sense (regrouping of classes), it beats both of them and at the same time improves the unsatisfactory fitting of data set 4 in Table 4.5 of Hürlimann [4].
Since the method of moments is inappropriate, as shown by Gossiaux and Lemaire [26], we use maximum likelihood estimation (MLE), which is best in view of its asymptotic properties. The goodness-of-fit is assessed on the basis of both the SBC score and the value of the chi-square statistic with appropriate regrouping of the classes. The improved results are found in Tables 1 and 2. Although the minimum SBC score is attained for the PIG, the chi-square statistic and $p$-value (with the last two classes regrouped) are best for the Newton distribution. The decrease in goodness-of-fit of the PIG is due to the loss in fit from regrouping the last two classes and the deficiency in fitting the class with two observed insurance claims.
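The regrouping step behind the chi-square comparison can be sketched in a few lines. The example below merges the last classes until the expected count of the final class reaches a threshold and then evaluates the Pearson statistic. The threshold of 5 is a common rule of thumb and not taken from the article, and the observed and expected counts are synthetic toy values.

```python
def regroup_tail(observed, expected, min_expected=5.0):
    """Merge trailing classes until the last expected count is at least min_expected."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < min_expected:
        last_obs, last_exp = obs.pop(), exp.pop()
        obs[-1] += last_obs
        exp[-1] += last_exp
    return obs, exp

def chi_square(observed, expected):
    """Pearson chi-square statistic sum (O - E)^2 / E over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Synthetic counts; the last expected class is meant to include the tail mass.
toy_observed = [50, 30, 15, 4, 1]
toy_expected = [48.0, 32.0, 14.0, 4.5, 1.5]

obs_grouped, exp_grouped = regroup_tail(toy_observed, toy_expected)
statistic = chi_square(obs_grouped, exp_grouped)
```

Here only the last two classes are merged, which mirrors the regrouping of the last two classes mentioned above. The degrees of freedom then follow from the number of classes after regrouping minus one minus the number of estimated parameters.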


Let us conclude the present work. Our aim was to give a systematic brief overview of the concepts of pseudo compound Poisson and partition representations in discrete probability and present a new application. Through the detailed study of a single new example, we have exemplified some typical features and further potential of our methodology.
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
References
[1] S. K. Katti, “Infinite divisibility of integer-valued random variables,” Annals of Mathematical Statistics, vol. 38, pp. 1306–1308, 1967.
[2] F. Steutel, Preservation of Infinite Divisibility under Mixing and Related Results, vol. 33 of Mathematical Centre Tracts, Mathematical Centre, Amsterdam, The Netherlands, 1970.
[3] W. Feller, An Introduction to Probability Theory and Its Applications, vol. 1, John Wiley & Sons, 3rd edition, 1968.
[4] W. Hürlimann, “Robust confidence bounds for the mean of some count data models,” Blätter der DGVFM, vol. 25, no. 4, pp. 795–811, 2002.
[5] W. Hürlimann, “On maximum likelihood estimation for count data models,” Insurance: Mathematics and Economics, vol. 9, no. 1, pp. 39–49, 1990. Correction note: Insurance: Mathematics and Economics, vol. 10, p. 81, 1991.
[6] W. Hürlimann, “Pseudo compound Poisson distributions in risk theory,” ASTIN Bulletin, vol. 20, no. 1, pp. 57–79, 1990.
[7] H. H. Panjer, “Recursive evaluation of a family of compound distributions,” ASTIN Bulletin, vol. 12, no. 1, pp. 22–26, 1981.
[8] B. Sundt, “On some extensions of Panjer's class of counting distributions,” ASTIN Bulletin, vol. 22, no. 1, pp. 61–80, 1992.
[9] B. Sundt and R. Vernic, Recursions for Convolutions and Compound Distributions with Insurance Applications, vol. 2 of EAA Lecture Notes, Springer, 2009.
[10] N. De Pril, “The aggregate claims distribution in the individual model of risk theory,” ASTIN Bulletin, vol. 19, no. 1, pp. 9–24, 1989.
[11] J. Dhaene and B. Sundt, “On approximating distributions by approximating their De Pril transforms,” Scandinavian Actuarial Journal, vol. 1998, no. 1, pp. 1–23, 1998.
[12] A. Baltrunas and E. Omey, “Second-order subexponential sequences and the asymptotic behavior of their De Pril transform,” Lithuanian Mathematical Journal, vol. 41, no. 1, pp. 17–27, 2001.
[13] R. K. Milne and M. Westcott, “Generalized multivariate Hermite distributions and related point processes,” Annals of the Institute of Statistical Mathematics, vol. 45, no. 2, pp. 367–381, 1993.
[14] W. Hürlimann, “Quasi-exact numerical evaluation of collateralized debt obligations prices,” Journal of Numerical Mathematics and Stochastics, vol. 2, no. 1, pp. 64–75, 2010.
[15] W. Hürlimann, “A characterization of the compound multiparameter Hermite gamma distribution via Gauss's principle,” The Scientific World Journal, vol. 2013, Article ID 468418, 2013.
[16] E. Lukacs, Characteristic Functions, Griffin, UK, 2nd edition, 1970.
[17] N. L. Johnson, S. Kotz, and A. W. Kemp, Univariate Discrete Distributions, John Wiley & Sons, New York, NY, USA, 2nd edition, 1992.
[18] P. Lévy, “Sur les exponentielles de polynômes et sur l'arithmétique des produits de lois de Poisson,” Annales Scientifiques de l'École Normale Supérieure Série III, vol. 54, pp. 231–292, 1937.
[19] P. Lévy, “L'arithmétique des lois de probabilités et les produits finis de lois de Poisson,” Actualités Scientifiques et Industrielles, no. 736, pp. 25–59, 1938, (Colloque de Genève III), Hermann, Paris, France.
[20] R. Cuppens, Decomposition of Multivariate Probabilities, vol. 29 of Probability and Mathematical Statistics, Academic Press, New York, NY, USA, 1975.
[21] K. Van Harn, “Classifying infinitely divisible distributions by functional equations,” Mathematical Centre Tract 103, Mathematisch Centrum, Amsterdam, The Netherlands, 1978.
[22] W. Hürlimann, “Negative claim amounts, Bessel functions, linear programming and Miller's algorithm,” Insurance: Mathematics and Economics, vol. 10, no. 1, pp. 9–20, 1991.
[23] W. Hürlimann, “Predictive stop-loss premiums,” ASTIN Bulletin, vol. 23, no. 1, pp. 55–76, 1993.
[24] I. G. Macdonald, Symmetric Functions and Hall Polynomials, Oxford Mathematical Monographs, Oxford University Press, New York, NY, USA, 2nd edition, 1995.
[25] F. Bichsel, “Erfahrungs-Tarifierung in der Motorfahrzeughaftpflicht-Versicherung,” Bulletin of the Swiss Association of Actuaries, vol. 64, pp. 119–143, 1964.
[26] A. Gossiaux and J. Lemaire, “Méthodes d'ajustement de distributions de sinistres,” Bulletin of the Swiss Association of Actuaries, vol. 1981, pp. 87–95, 1981.
Copyright
Copyright © 2015 Werner Hürlimann. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.