#### Abstract

The mathematical and statistical concepts of pseudo compound Poisson and partition representations in discrete probability are reviewed and clarified. A combinatorial interpretation of the convolution of geometric distributions in terms of a variant of Newton’s identities is obtained. The practical use of the twofold convolution leads to an improved goodness-of-fit for a data set from automobile insurance that had not previously been fitted satisfactorily.

#### 1. Introduction

Consider the class of discrete arithmetic random variables $X$ with probability generating function (pgf) $P(s)=\sum_{k\ge 0}p_k s^k$ and nonvanishing zero probability $p_0>0$. Suppose that the pgf can be written as $P(s)=\exp\{C(s)-C(1)\}$ for some generating function $C(s)=\sum_{k\ge 1}c_k s^k$. The pseudo compound Poisson representation concerns the duality between $(p_k)$ and $(c_k)$, which is best expressed in terms of the identity $P'(s)=C'(s)P(s)$ or equivalently the recurrence relations

$$k\,p_k=\sum_{j=1}^{k} j\,c_j\,p_{k-j},\quad k=1,2,\ldots \tag{1}$$

As a consequence, it has been shown that $X$ is infinitely divisible if and only if one has $c_k\ge 0$, $k=1,2,\ldots$ (e.g., [2, page 83]). Around the same time Feller [3, page 290] shows that $X$ is infinitely divisible if and only if it is a compound Poisson random variable with parameter $\lambda=C(1)$ and severity probabilities $q_k=c_k/\lambda$. If $X$ is not infinitely divisible, then (1) still holds for some $c_k$, with at least one negative value, and this property motivates the name pseudo compound Poisson representation.

Section 2 summarizes its main properties. The power series identity $P(s)=p_0\exp\{C(s)\}$ leads also to the more complex and less known expression

$$p_k=p_0\sum_{(m_1,\ldots,m_k)\in\Pi_k}\;\prod_{i=1}^{k}\frac{c_i^{m_i}}{m_i!},\quad k=1,2,\ldots \tag{2}$$

where $\Pi_k$ is the set of free partitions of weight $k$, written here through their multiplicities $m_i\ge 0$ with $\sum_{i=1}^{k} i\,m_i=k$. This expression is called partition representation. Although (2) has been applied in Hürlimann [4, Theorem 3.2] to derive an existence criterion for the construction of confidence bounds for discrete sampling distributions, it is not stated correctly there and a proof of it is missing. Section 3 fills in these gaps and provides a combinatorial interpretation of the partition representation.

As a new illustration, Section 4 considers the convolution of $n$ geometric random variables, whose pseudo compound Poisson representation is specified by the $k$th power sum polynomial $k\,c_k=\sum_{i=1}^{n}\theta_i^{k}$. The partition representation (2) identifies $p_k$ with the $p_0$ multiple of the $k$th complete symmetric function in the variables $\theta_1,\ldots,\theta_n$. The recursion (1) is equivalent to a variant of Newton’s identities, which motivates us to call this convolution the Newton distribution. Applied to estimation theory, we show that this distribution satisfies Gauss’s principle (the maximum likelihood estimator of the mean is the sample mean) and construct a parameter vector orthogonal to the mean. In the two-parameter case, we derive the maximum likelihood equations and illustrate their use in Section 5 on a specific data set from automobile insurance. Through regrouping of classes, we show that the Newton distribution beats both the negative binomial and the Poisson inverse Gaussian in goodness-of-fit and at the same time improves the unsatisfactory fit obtained in a previous case study.

#### 2. Characterization through Pseudo Compound Poisson Representation

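Recall the pseudo compound Poisson representation in discrete probability theory from Hürlimann [5, 6]. Let $X$ be a discrete arithmetic random variable defined on the natural numbers with probabilities $p_k=\Pr(X=k)$, $k=0,1,2,\ldots$, such that $p_0>0$. Besides the probability generating function (pgf) $P(s)=E[s^X]=\sum_{k\ge 0}p_k s^k$ one considers the cumulant pgf defined by

$$C(s)=\ln\{P(s)/p_0\}=\sum_{k\ge 1}c_k s^k. \tag{3}$$

Its name is motivated by the series expansion of the cumulant generating function (cgf)

$$\ln E\!\left[e^{tX}\right]=\ln P(e^{t})=\sum_{k\ge 1}c_k\left(e^{kt}-1\right). \tag{4}$$

Proposition 1 (pseudo compound Poisson representation). Let $X$ be a discrete arithmetic random variable with pgf $P(s)$ and cumulant pgf $C(s)=\sum_{k\ge 1}c_k s^k$ such that $p_0>0$ and set $\lambda=-\ln p_0$. Then the probabilities satisfy Panjer’s recursion

$$k\,p_k=\sum_{j=1}^{k} j\,c_j\,p_{k-j},\quad k=1,2,\ldots \tag{5}$$

and the following pseudo compound Poisson representation holds:

$$P(s)=\exp\{\lambda\,(Q(s)-1)\},\qquad Q(s)=\sum_{k\ge 1}q_k s^k,\quad q_k=\frac{c_k}{\lambda}. \tag{6}$$

Proof. The recursion (5) is an immediate consequence of (3). Indeed, the derivative of the equation $P(s)=p_0\exp\{C(s)\}$ satisfies the relation $P'(s)=C'(s)P(s)$, which, on comparing the coefficients of $s^{k-1}$, is equivalent to identities (5). For further details consult Hürlimann [5, Corollary 2] and Hürlimann [6, Theorem 1].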
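The recursion in Proposition 1 can be sketched numerically. The block below is a minimal illustration, assuming the recursion has the standard form $k\,p_k=\sum_{j=1}^{k} j\,c_j\,p_{k-j}$ with $p_0=e^{-\lambda}$ and $\lambda=\sum_k c_k$; the function names `probs_from_cumulants` and `cumulants_from_probs` are hypothetical and do not come from the paper.

```python
import math


def probs_from_cumulants(c, n_max, p0=None):
    """Return [p_0, ..., p_n_max] from coefficients c[1], c[2], ... (c[0] is ignored).

    Assumes k * p_k = sum_{j=1..k} j * c_j * p_{k-j}. If p0 is not supplied,
    it is set to exp(-sum(c[1:])), which is only valid when all nonzero c_k are listed.
    """
    if p0 is None:
        p0 = math.exp(-sum(c[1:]))
    p = [p0]
    for k in range(1, n_max + 1):
        top = min(k, len(c) - 1)  # coefficients beyond the list are treated as zero
        p.append(sum(j * c[j] * p[k - j] for j in range(1, top + 1)) / k)
    return p


def cumulants_from_probs(p):
    """Invert the same recursion: recover [0, c_1, ..., c_n] from [p_0, ..., p_n], p_0 > 0."""
    n = len(p) - 1
    c = [0.0] * (n + 1)
    for k in range(1, n + 1):
        rest = sum(j * c[j] * p[k - j] for j in range(1, k))
        c[k] = (k * p[k] - rest) / (k * p[0])
    return c


if __name__ == "__main__":
    # With c_1 = 2 and all other c_k = 0 the recursion reproduces a Poisson(2) law.
    p = probs_from_cumulants([0.0, 2.0], 6)
    print([round(x, 6) for x in p])
    print([round(x, 6) for x in cumulants_from_probs(p)])
```

Running the inverse on a given probability vector is a quick way to see whether some $c_k$ is negative, which is the pseudo compound Poisson case discussed in the remarks that follow.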

Remarks 1. Clearly, if $c_k\ge 0$, $k=1,2,\ldots$, the distribution is compound Poisson with parameter $\lambda$ and severity probabilities $q_k=c_k/\lambda$, $k=1,2,\ldots$, a fact denoted by . In this situation, the recursion (5) is named Panjer recursion after Panjer . With Feller [3, Section  XII.2], one knows that $X$ is compound Poisson if and only if it is infinitely divisible (see also  for a more general characterization). Otherwise, one says that $X$ has a pseudo compound Poisson distribution. In the terminology of Sundt , it belongs to the class with (see also ). The theoretical and practical usefulness of pseudo compound Poisson distributions has been demonstrated by the author in numerous publications. Sometimes, the sequence defined by (5) is called the De Pril transform of after De Pril  (e.g., ). The interest of recurrence relation (5) extends beyond discrete probability to the general context of integer sequences. Let be the identity map such that for all nonnegative integers. Then, the convolution equation for integer sequences and occurs in many areas of mathematics. An important problem is the relationship between the asymptotic behaviors of the two sequences and (e.g.,  and its references).

The converse of Proposition 1, although used in applications (e.g., [6, 13–15]), has been less studied. In the case of negative $c_k$’s, a necessary condition on the cumulant pgf such that (5) defines a true probability distribution was first identified by Lévy.

Proposition 2. Given is a finite generating function with . Assume that . In order that (5) defines a discrete probability distribution, it is necessary that the condition (NC) , is fulfilled.

Proof. The condition (NC) is found in Lukacs [16, page 252] and Johnson et al. [17, page 356]. Proofs are found in Lévy [18, 19] and Cuppens [20, Section  8.4 and Appendix  B].

Remarks 2. Precise sufficient conditions (SC) in Proposition 2 are not known. In principle, any of the remaining $c_k$’s could be negative provided they are sufficiently small in absolute value; that is, , , for some (cf. [18, page 263]; [13, Remark  1 to Theorem  3.1]). Lévy [18, page 263] points out that is the simplest case admitting a negative . Van Harn [21, page 84] obtains four inequality constraints on in terms of the other $c_k$’s. Lukacs [16, page 251] shows that if , , then (5) defines a discrete probability distribution (cf. [13, Remark  3 to Theorem  3.1]).

#### 3. Partition Representation and Combinatorial Interpretation

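Besides the pseudo compound Poisson representation, the power series identity $P(s)=p_0\exp\{C(s)\}$ has another important immediate consequence.

Proposition 3 (partition representation). Let $X$ be a discrete arithmetic random variable with pgf $P(s)=\sum_{k\ge 0}p_k s^k$ such that $p_0>0$, and let $C(s)=\sum_{k\ge 1}c_k s^k$ be its cumulant pgf. Then, the following partition representation holds:

$$p_k=p_0\sum_{(m_1,\ldots,m_k)\in\Pi_k}\;\prod_{i=1}^{k}\frac{c_i^{m_i}}{m_i!},\quad k=1,2,\ldots \tag{7}$$

where $\Pi_k$ is the set of free partitions of weight $k$, written through their multiplicities $m_i\ge 0$ with $\sum_{i=1}^{k} i\,m_i=k$.

Proof. This result follows from the calculation

$$\sum_{k\ge 0}p_k s^k=p_0\exp\Big\{\sum_{i\ge 1}c_i s^i\Big\}=p_0\prod_{i\ge 1}\sum_{m_i\ge 0}\frac{c_i^{m_i}}{m_i!}\,s^{i\,m_i}=p_0\sum_{k\ge 0}\Big(\sum_{(m_1,\ldots,m_k)\in\Pi_k}\prod_{i=1}^{k}\frac{c_i^{m_i}}{m_i!}\Big)s^k \tag{8}$$

which implies identities (7).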
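The partition sum can be checked by brute force for small weights. The following sketch assumes the representation has the form $p_k/p_0=\sum\prod_i c_i^{m_i}/m_i!$, summed over all multiplicity vectors with $\sum_i i\,m_i=k$; the helper names `partitions` and `ratio_by_partitions` are illustrative only.

```python
import math


def partitions(n, largest=None):
    """Yield every partition of n once, as a dict {part: multiplicity}."""
    if largest is None:
        largest = n
    if n == 0:
        yield {}
        return
    # Choose the largest part first so that parts are generated in non-increasing order.
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            out = dict(rest)
            out[part] = out.get(part, 0) + 1
            yield out


def ratio_by_partitions(c, k):
    """Return p_k / p_0 as the sum over partitions of k of prod_i c_i^m_i / m_i!.

    c is indexed from 1 (c[0] is ignored) and must have at least k + 1 entries.
    """
    total = 0.0
    for mult in partitions(k):
        term = 1.0
        for i, m in mult.items():
            term *= c[i] ** m / math.factorial(m)
        total += term
    return total


if __name__ == "__main__":
    c = [0.0, 0.6, 0.25, -0.03, 0.02, 0.0, 0.0]  # arbitrary illustrative coefficients
    print([ratio_by_partitions(c, k) for k in range(5)])
```

The number of partitions grows quickly with the weight, so for actual computation the recursion (5) is the cheaper route; the partition form is mainly of structural interest.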

Remarks 3. Although variants existed before (e.g., [22, Section  2]; [23, Equation (3.4)]), the general partition representation (7) for pseudo compound Poisson arithmetic distributions was first stated in Corollary  3.1 of Hürlimann . Unfortunately, the given formula contains a misprint: the multiplicity factorial has been wrongly replaced by . Note that this mistake has no influence on the validity of Theorem  3.2 derived from this representation. However, the claimed but missing proof by induction is best replaced by the present proof. The simple power series manipulation has already been used by Macdonald  in his proof on page 25 of a similar but more special result in the theory of symmetric functions (see later Section  4.3). Section  I.1 of the mentioned textbook is recommended reading for definitions and properties around partitions (in particular weight, length, and multiplicity of a partition).

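The partition representation has a nice combinatorial interpretation. For convenience, set $x_k=k\,c_k$, $k=1,2,\ldots$. Then, the expression

$$\frac{p_k}{p_0}=\sum_{(m_1,\ldots,m_k)\in\Pi_k}\;\prod_{i=1}^{k}\frac{x_i^{m_i}}{i^{m_i}\,m_i!} \tag{9}$$

derived from (7), where the sum runs over all multiplicities $m_i\ge 0$ with $\sum_{i=1}^{k} i\,m_i=k$, identifies with the cycle index polynomial (or cycle indicator) of the symmetric group $S_k$ in the variables $x_1,\ldots,x_k$ (e.g., [24, Example 9.(a), page 29]). Indeed, the coefficient in (9) of an arbitrary monomial $x_1^{m_1}x_2^{m_2}\cdots x_k^{m_k}$ is equal to the fraction of all permutations that have $m_1$ fixed points, $m_2$ cycles of length $2$, $\ldots$, and $m_k$ cycles of length $k$. We note that this combinatorial interpretation is crucial to the novel example studied in the next two sections.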
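As a small worked instance of the cycle index interpretation, consider weight $3$. The notation below ($x_i=i\,c_i$, multiplicities $m_i$, and $Z(S_3)$ for the cycle index) is assumed for illustration and may differ from the symbols of the original formula.

```latex
\begin{align*}
Z(S_3)
  &= \frac{x_1^{3}}{1^{3}\,3!}
   + \frac{x_1\,x_2}{(1^{1}\,1!)\,(2^{1}\,1!)}
   + \frac{x_3}{3^{1}\,1!}
   = \frac{1}{6}\left(x_1^{3} + 3\,x_1 x_2 + 2\,x_3\right), \\
\frac{p_3}{p_0}
  &= \frac{c_1^{3}}{6} + c_1 c_2 + c_3
  \qquad\text{after substituting } x_i = i\,c_i .
\end{align*}
```

The coefficients $1/6$, $3/6$, and $2/6$ are the fractions of the six permutations of three letters that are, respectively, the identity, a transposition with one fixed point, and a 3-cycle. The second line agrees with three steps of the recursion $k\,p_k=\sum_{j} j\,c_j\,p_{k-j}$.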

#### 4. The Convolution of Geometric Random Variables: A Newton Type Distribution

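To illustrate further the discrete probability concepts of Sections 2 and 3, consider the power-sum-like specification of the cumulant pgf in (3); namely,

$$C(s)=\sum_{k\ge 1}c_k s^k,\qquad c_k=\frac{1}{k}\sum_{i=1}^{n}\theta_i^{k},\quad k=1,2,\ldots \tag{10}$$

where $0<\theta_i<1$, $i=1,\ldots,n$, are unknown parameters. A calculation shows that the quantity $\lambda=C(1)=-\sum_{i=1}^{n}\ln(1-\theta_i)$ is finite. The condition (C1) of Proposition 2 is fulfilled and therefore (10) belongs to the random variable $X$ of some compound Poisson distribution, to be determined. Clearly, one has

$$C(s)=-\sum_{i=1}^{n}\ln(1-\theta_i s) \tag{11}$$

from which one obtains

$$P(s)=\exp\{C(s)-C(1)\}=\prod_{i=1}^{n}\frac{1-\theta_i}{1-\theta_i s} \tag{12}$$

which is the pgf of the convolution of $n$ geometric random variables; hence

$$X=X_1+\cdots+X_n,\qquad \Pr(X_i=k)=(1-\theta_i)\,\theta_i^{k},\quad k=0,1,2,\ldots \tag{13}$$

with independent $X_1,\ldots,X_n$. Consider the partition representation of Proposition 3. One observes that $x_k=k\,c_k=\sum_{i=1}^{n}\theta_i^{k}$ is nothing other than the $k$th power sum polynomial in the variables $\theta_i$, $i=1,\ldots,n$. Identity (9) is herewith equal to

$$p_k=p_0\sum_{(m_1,\ldots,m_k)}\;\prod_{i=1}^{k}\frac{x_i^{m_i}}{i^{m_i}\,m_i!},\qquad \sum_{i=1}^{k} i\,m_i=k, \tag{14}$$

and shows with Macdonald [24, Equation (2.14′), page 25] that the $k$th probability of $X$, namely, $p_k$, is the $p_0$ multiple of the $k$th complete symmetric function $h_k$ in the variables $\theta_1,\ldots,\theta_n$. In this combinatorial context the Panjer recursion (5) is equivalent to the identities

$$k\,h_k=\sum_{j=1}^{k}x_j\,h_{k-j},\quad k=1,2,\ldots \tag{15}$$

which is nothing other than a variant of Newton’s identities (see [24, Equations (2.11) and (2.11′), page 23]). Therefore, the distribution (14) of the convolution of geometric random variables derived from the recursion (15) could also be called Newton distribution. The distribution corresponding to the random variable (13) is abbreviated by .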
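The link between Newton's identities and the convolution of geometric laws can be illustrated numerically. The sketch below assumes each component has probabilities $(1-\theta_i)\theta_i^{k}$, $k=0,1,2,\ldots$, so that $p_0=\prod_i(1-\theta_i)$ and $p_k=p_0\,h_k(\theta)$; the function names are hypothetical and the parameter values are arbitrary.

```python
import math


def complete_symmetric(theta, n_max):
    """Return [h_0, ..., h_n_max] via Newton's identity k*h_k = sum_{j=1..k} x_j * h_{k-j},

    where x_j = sum_i theta_i**j is the j-th power sum.
    """
    power = [sum(t ** j for t in theta) for j in range(n_max + 1)]  # power[0] is unused
    h = [1.0]
    for k in range(1, n_max + 1):
        h.append(sum(power[j] * h[k - j] for j in range(1, k + 1)) / k)
    return h


def newton_pmf(theta, n_max):
    """Probabilities p_k = p_0 * h_k(theta), with p_0 = prod_i (1 - theta_i)."""
    p0 = math.prod(1.0 - t for t in theta)
    return [p0 * hk for hk in complete_symmetric(theta, n_max)]


def convolve_geometrics(theta, n_max):
    """Reference values: direct convolution of the geometric pmfs (1 - t) * t**k."""
    pmf = [1.0] + [0.0] * n_max  # point mass at zero
    for t in theta:
        geo = [(1.0 - t) * t ** k for k in range(n_max + 1)]
        pmf = [sum(pmf[j] * geo[k - j] for j in range(k + 1)) for k in range(n_max + 1)]
    return pmf


if __name__ == "__main__":
    theta = [0.2, 0.5, 0.7]
    a = newton_pmf(theta, 8)
    b = convolve_geometrics(theta, 8)
    print(max(abs(x - y) for x, y in zip(a, b)))  # difference between the two routes
```

The recursive route needs only the power sums of the parameters, which is what makes the pseudo compound Poisson form convenient here.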

Now, we apply our concepts to study some estimation properties of the Newton distribution. In a first step, we construct a parameter vector orthogonal to the mean (16). In general, given a random variable we suppose that the mean is functionally independent of a certain parameter vector , that is, , , and denote the log-likelihood of by . The mean is orthogonal to the parameter vector , denoted by , if one has , . Further, let be the class of all discrete arithmetic distributions for which the maximum likelihood estimator of the mean is the sample mean, that is, such that (Gauss’s principle). The subclass of , called mean orthogonal class, consists of all those distributions which, besides Gauss’s principle, satisfy the mean orthogonal property . It is known that the class is closed under convolution (see [15, Theorem  2.2]) and can be characterized as follows. Suppose there exist a parameter and a one-to-one coordinate transformation mapping to , and set . Then with is equivalent to condition (17) (see [15, Lemma  3.1, Equation (3.8)]), where , is determined by the pseudo compound Poisson representation of . Applied to the Newton distribution, one sees after some calculation that (17) with , , is fulfilled provided the partial differential equation (18) can be solved. It is easy to see that this will be fulfilled provided one has (19). A solution to these partial differential equations is (20). It follows that the mean is orthogonal to the parameter vector .

Next, we derive the maximum likelihood equations for the two-parameter Newton distribution obtained from the convolution of two geometric distributions with parameters , . The probabilities in the above orthogonal parameterization are given by (21). For each let be the number of observations of the random variable equal to , and let be the total number of observations, where for all . From (21) one obtains the scaled log-likelihood function (22). The maximum likelihood equations are given by (23). This system of two nonlinear equations in the two variables can be solved with substitution setting . Insert in the first equation to obtain the quadratic equation (24). The only feasible solution of this quadratic equation takes the minus sign and is given by (25). Using that and inserting (25) into the second equation of (23) one sees that the latter depends besides the observations only upon and is determined by (26).

The preceding simple derivation of maximum likelihood properties is possible by virtue of the pseudo compound Poisson representation (mainly (17)). Let us also illustrate the mathematical benefit of the partition representation. For this we rely on Theorem  3.2 in Hürlimann , which guarantees the existence of well-defined confidence bounds for the mean of a count distribution provided the following regularity assumption is fulfilled.

(RA) There exist a positive integer and a constant such that and for all partitions such that .

Now, for the Newton distribution with two parameters such that , one has (27) and thus . The regularity assumption with is fulfilled.

#### 5. A Numerical Example from Automobile Insurance

The choice of an appropriate claim number model in automobile insurance is a prominent actuarial research topic whose history over the last 50 years begins with Bichsel , who used the negative binomial (NB) distribution to construct a bonus-malus system. Among the other good early choices, the Poisson inverse Gaussian (PIG) has often been advocated (see Example  3.4 in  and discussions). Due to the penalty in the SBC (Schwarz Bayesian criterion) score, the best overall fit is in any case obtained for a two-parameter distribution (see [4, end of Section  4]). In this context, we ask whether the Newton distribution might compete with the NB and PIG. For a specific example and in a very precise sense (regrouping of classes), it beats both of them and at the same time improves the unsatisfactory fitting of data set 4 in Table  4.5 of Hürlimann .

Since the method of moments is inappropriate, as shown by Gossiaux and Lemaire , we use the maximum likelihood estimation (MLE) method, which is best in view of its asymptotic properties. The goodness-of-fit is established on the basis of both the SBC score and the value of the chi-square statistic with appropriate regrouping of the classes. The improved results are found in Tables 1 and 2. Although the minimum SBC score is attained for the PIG, the chi-square statistic and $P$-value (with the last two classes regrouped) are best for the Newton distribution. The decrease in goodness-of-fit of the PIG is due to the loss in fit from regrouping the last two classes and to the deficiency in fitting the class with two observed insurance claims.

Let us conclude the present work. Our aim was to give a systematic brief overview of the concepts of pseudo compound Poisson and partition representations in discrete probability and present a new application. Through the detailed study of a single new example, we have exemplified some typical features and further potential of our methodology.

#### Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.