Research Article | Open Access
Werner Hürlimann, "From the General Affine Transform Family to a Pareto Type IV Model", Journal of Probability and Statistics, vol. 2009, Article ID 364901, 10 pages, 2009. https://doi.org/10.1155/2009/364901
From the General Affine Transform Family to a Pareto Type IV Model
Abstract
The analytical form of general affine transform families with given maximum likelihood estimators for the affine parameters is determined. In this context, the simultaneous maximum likelihood equations of the affine parameters in the generalised Pareto distribution cannot have a common solution. This pathological situation is removed by extending the distribution to a four-parameter family, called the Pareto type IV model.
1. Introduction
Based on [1], the author has studied the general affine transform X of the random variable Y defined by , where and are twice differentiable monotone increasing functions, and are deterministic functions of the affine parameter vector such that . The work in [2] determines exact maximum likelihood estimators of parameters in order statistics distributions with exponential, Pareto, and Weibull parent distributions. The article [3] recovers the older result of [4] that the Pareto is an exponential transform, and also notes that this result is not restricted to the Pareto but applies to many distributions, such as the truncated Cauchy, Gompertz, log-logistic, para-logistic, inverse Weibull, and log-Laplace.
A further contribution in this area is offered. Based on the method introduced in [5], we determine the analytical form that parametric models may take for specific maximum likelihood estimators of the affine parameters in a general affine transform family. Applied to the generalised Pareto distribution, which is of great importance in extreme value theory and its applications (e.g., [6, 7]), one observes that the simultaneous maximum likelihood equations of the affine parameters cannot have a common solution. Therefore, the highly desirable maximum likelihood method is not applicable to this distribution. Fortunately, this pathological situation can be removed by enlarging the generalised Pareto to a four-parameter family. The resulting new family, called the Pareto type IV model, includes as special cases the generalised Pareto and the Beta of type II. Finally, it is worthwhile to mention the construction of alternative statistical models of Pareto type II and III in [8], and of type IV in [9]. A recent discussion of the Pareto type III is [10], and a useful monograph including Pareto type distributions is [11]. This paper is organized as follows.
Section 2 recalls the general affine transform family (GATF) and its relevance. Our main result concerns the possible form GATF models may take given specific maximum likelihood estimators (MLE) for their affine parameters and is derived in Section 3. Section 4 shows that our method does not apply to the generalised Pareto distribution and introduces the new Pareto type IV model. Section 5 concludes and gives a short outlook on further research.
2. General Affine Transform Families
Let be random variables with distribution functions and densities (provided they exist). Suppose that the distributions and densities depend on a parameter vector with values in the parameter space , where is a vector of affine parameters, is a vector of shape parameters, and . We assume that the functions and are continuous twice-differentiable monotone increasing with inverses and . Moreover, these functions do not depend on but may depend on .
Definition 2.1. The general affine transform X of Y (GATF) is the random variable defined by via a three-stage transformation. First, Y is nonlinearly transformed to , then positively linearly transformed to , with , and again nonlinearly transformed to . The constants and are called location and scale parameters. A GATF family is a set of parameterised GATF X of Y whose distributions and densities satisfy the relationships
In applications, special cases are very often the most useful. Using [1, Table 1], the main types are summarized in [3, Table 2.1]. Typical examples that illustrate the relevance of the GATF, such as the generalised Pareto and the gxh-family, are given in [3, Examples 2.1 and 2.2].
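The three-stage construction of Definition 2.1 can be sketched in symbols as follows. The names used here are assumptions made for illustration only: \varphi and \psi stand for the two monotone increasing nonlinear maps, and A and B > 0 for the location and scale constants. Under these assumptions, the distribution function and density of X follow from those of Y by the usual change of variables.

```latex
% Assumed notation (not taken from the source): \varphi, \psi, A, B.
\begin{align*}
  X &= \psi^{-1}\bigl(A + B\,\varphi(Y)\bigr), \qquad B > 0, \\
  F_X(x) &= P\bigl(A + B\,\varphi(Y) \le \psi(x)\bigr)
          = F_Y\!\left(\varphi^{-1}\!\left(\frac{\psi(x) - A}{B}\right)\right), \\
  f_X(x) &= f_Y\bigl(u(x)\bigr)\,\frac{\psi'(x)}{B\,\varphi'\bigl(u(x)\bigr)},
  \qquad u(x) = \varphi^{-1}\!\left(\frac{\psi(x) - A}{B}\right).
\end{align*}
```

The second line uses only that \psi and \varphi are increasing and that B is positive; the third line is the derivative of the second by the chain rule.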
3. GATF Families with Prescribed Maximum Likelihood Estimators
Consider a random sample of size, where are independent and identically distributed random variables, and denote the common random variable by X. For a real function , we define and denote the mean value of by
It is assumed that sample mean value equations like have a unique solution . Our main result characterizes GATF families by the form of the maximum likelihood estimators for their affine parameters. The proof makes use of [12, Theorem 2.2].
Theorem 3.1. Let a GATF with support and affine parameter vector be given. Suppose that the distribution function of is twice differentiable, and that the MLE of the th affine parameter is a solution of one of the following mean value equations.
Case 1:
with some real function .
Case 2:
with some real function .
Then there exists a twice-differentiable and monotone increasing function with derivative , and constants such that
Furthermore, for simultaneous maximum likelihood estimation of the affine parameters, the following compatibility conditions must be satisfied:
Under these conditions, the distribution function has the unique representation
for all .
Proof. We proceed as in [5, proof of Theorem 2.1].
Case 1 (). Using (2.2) and the relations , one obtains for the negative of the random log-likelihood of X the expression
Denoting partial derivatives with respect to with a lower index and making use of
one obtains from (3.10) the expression for the partial derivative
By assumption (3.2), one has using [12, Theorem 2.2] that
for some constant . By comparison solves the second-order differential equation
Setting and multiplying with this simplifies to
Transform it to the equivalent system of first-order equations in [13, Chapter 19]:
The second differential equation is of Bernoulli type [13, Chapter 2]. Setting , this is equivalent to the simpler system in :
The second equation is linear inhomogeneous of first order and has the homogeneous solution . By variation of the constant, one sees that . On the other hand, from the first equation in (3.17), one has , hence . Together, this shows the following separated differential equation:
Assume for the moment that has an integral such that for some . Then, has the solution . It follows that the general solution of the second differential equation in (3.17) is given by
The first differential equation in (3.17) implies the separated differential equation
Assume for the moment that there exists a twice-differentiable function such that (). The general solution to (3.20) yields the relationship
Setting and using that , one gets the random relation , which implies by (2.1) that
Setting one obtains the density function
The side conditions , , imply that the constants are determined by
The validity of the representation (3.9) for is shown. Since has been assumed twice differentiable, so is , and
as claimed in (3.4). In particular, the two provisional assumptions made above, that is, and , are fulfilled.
Case 2 (). Since , one has similarly to (3.11) the relationship
From (3.10), one obtains for the partial derivative of the random log-likelihood the relation
By assumption (3.2) and again by [12, Theorem 2.2], one has
for some constant . Through comparison, it follows that must solve
Proceeding as in Case 1, one obtains a twice-differentiable function , with derivative , such that and . As in Case 1, one concludes that (3.9) for must hold.
It remains to show the compatibility conditions (3.6)–(3.8). Through differentiation of (3.9), one obtains the probability density functions
for all . Three subcases are possible.
Subcase 1 (). From (3.30), one gets that with . Using (3.4), one obtains without difficulty the compatibility condition (3.6).
Subcase 2 (). From (3.30), one sees that with . Using (3.4) and (3.5), one shows without difficulty condition (3.7).
Subcase 3 (). From (3.30), one obtains that with . Using (3.5), one shows without difficulty condition (3.8). The proof of Theorem 3.1 is complete.
4. A Pareto Type IV Model
The generalised Pareto distribution is the GATF defined by with , Y exponential with mean one, , , , . Its probability density function is
Applying Theorem 3.1, one sees that the MLE of are determined by the real functions
According to Theorem 3.1, there are functions
and constants such that
and the compatibility condition (3.7) is fulfilled. For any random sample from this family, one observes that the simultaneous maximum likelihood equations
cannot have a common solution, hence the maximum likelihood method is not applicable.
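The representation of the generalised Pareto as a transform of a unit-mean exponential variable Y, stated at the start of this section, can be illustrated numerically. The sketch below assumes the common parametrisation with location mu, scale sigma > 0 and shape xi, in which X = mu + sigma (exp(xi Y) - 1) / xi. This parametrisation and all function names are assumptions made here for illustration and need not coincide with the notation of (4.1).

```python
import math
import random


def gpd_from_exponential(y, mu=0.0, sigma=1.0, xi=0.5):
    """Map a unit-mean exponential value y to a generalised Pareto value.

    Assumed parametrisation: X = mu + sigma * (exp(xi * y) - 1) / xi,
    with the limit X = mu + sigma * y when xi == 0.
    """
    if xi == 0.0:
        return mu + sigma * y
    return mu + sigma * math.expm1(xi * y) / xi


def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.5):
    """Distribution function of the generalised Pareto (assumed parametrisation)."""
    if x <= mu:
        return 0.0
    z = (x - mu) / sigma
    if xi == 0.0:
        return 1.0 - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0.0:  # only possible for xi < 0, beyond the upper endpoint
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


def sample_gpd(n, mu=0.0, sigma=1.0, xi=0.5, seed=0):
    """Draw n generalised Pareto values by transforming exponential draws."""
    rng = random.Random(seed)
    return [gpd_from_exponential(rng.expovariate(1.0), mu, sigma, xi)
            for _ in range(n)]


if __name__ == "__main__":
    # The transform should carry the exponential CDF into the Pareto CDF.
    for y in (0.1, 1.0, 3.0):
        x = gpd_from_exponential(y, mu=2.0, sigma=3.0, xi=0.5)
        print(y, gpd_cdf(x, mu=2.0, sigma=3.0, xi=0.5), 1.0 - math.exp(-y))
```

The check in the main block compares the Pareto distribution function at the transformed point with the exponential distribution function at the original point; the two columns should agree up to rounding.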
The described pathological situation can be removed in a simple way thanks to Theorem 3.1. Our construction is motivated by the following question. What is the most general affine transform family with MLE of the affine parameter that is determined by the mean value equation ? By Theorem 3.1, Case 1, there must exist a constant and a function such that
Using [5], formula (3.1) one obtains
A corresponding probability density function is
One notes that two well-known subfamilies are included, namely, the generalised Pareto (4.1) obtained by setting , and the Beta of type II obtained by setting . This suggests the name “generalised Pareto-Beta”, but we prefer the simpler nomenclature “Pareto type IV model” for the new four-parameter family (4.8). Applying Theorem 3.1, one sees that the MLE of and are determined by
There are functions
and constants such that
and the compatibility condition (3.7), that is,
is fulfilled. For a random sample the MLEs of and solve the simultaneous equations
The value of the normalising constant in (4.8) depends only on the shape vector .
Proposition 4.1. Assume that are not integers. Then the normalising constant of the Pareto type IV model (4.8) is determined by the infinite series expansion where , is a generalised binomial coefficient.
Proof. From the observation made above, one notes that
To obtain convergent integrals, separate the calculation into two parts and make a substitution to get
The binomial expansion , valid for [14, (18.7), page 134], yields the series
Under the assumption this implies without difficulty the expression (4.14).
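The binomial expansion with generalised binomial coefficients, on which the proof of Proposition 4.1 relies, can be checked numerically. The sketch below assumes the standard form (1 + x)^alpha = sum over k >= 0 of C(alpha, k) x^k for |x| < 1 and real alpha, with C(alpha, k) = alpha (alpha - 1) ... (alpha - k + 1) / k!. The function names and the truncation at a fixed number of terms are choices made here for illustration; the code does not evaluate the normalising constant (4.14) itself.

```python
def gen_binom(alpha, k):
    """Generalised binomial coefficient alpha*(alpha-1)*...*(alpha-k+1)/k!."""
    coeff = 1.0
    for j in range(k):
        coeff *= (alpha - j) / (j + 1)
    return coeff


def binomial_series(alpha, x, terms=200):
    """Truncated binomial series for (1 + x)**alpha, used only for |x| < 1."""
    if abs(x) >= 1.0:
        raise ValueError("the series is only used here for |x| < 1")
    total, coeff, power = 0.0, 1.0, 1.0
    for k in range(terms):
        total += coeff * power          # add C(alpha, k) * x**k
        coeff *= (alpha - k) / (k + 1)  # update to C(alpha, k + 1)
        power *= x                      # update to x**(k + 1)
    return total


if __name__ == "__main__":
    alpha, x = -2.5, 0.3  # a non-integer exponent, as assumed in Proposition 4.1
    print(binomial_series(alpha, x), (1.0 + x) ** alpha)
```

For non-integer alpha the coefficients never vanish, so the series is genuinely infinite; this is consistent with the proposition stating the constant as an infinite series under a non-integer assumption on the shape parameters.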
5. Conclusions and Outlook
The proposed method is not the only way to generalize the Pareto family (4.1). The recent note [9] extends this family to the family
which looks similar to (4.8), except for the “power law” component in the second bracket, but has different statistical properties. An advantage of (5.1) is certainly the analytical closed-form expression for the survival function given by
To conclude, several advantages of (4.8) can be noted, in particular, the simple maximum likelihood estimation of the affine parameters and the inclusion of the very important generalised Pareto distribution as a submodel. From a statistical viewpoint, the interest of the extended model (4.8) is twofold. First, it may provide a better fit of the data than any submodel. Second, it yields a simple statistical procedure to choose among submodels like the generalised Pareto and the Beta of type II. Only the model “closest” to the full model will be retained. A detailed comparison of these two four-parameter Pareto families is left to further research.
Acknowledgment
The author is grateful to the referees for careful reading of the manuscript and valuable comments.
References
- B. Efron, “Transformation theory: how normal is a family of distributions?” The Annals of Statistics, vol. 10, no. 2, pp. 323–339, 1982.
- W. Hürlimann, “General location transform of the order statistics from the exponential, Pareto and Weibull, with application to maximum likelihood estimation,” Communications in Statistics: Theory and Methods, vol. 29, no. 11, pp. 2535–2545, 2000.
- W. Hürlimann, “General affine transform families: why is the Pareto an exponential transform?” Statistical Papers, vol. 44, no. 4, pp. 499–519, 2003.
- E. J. Gumbel, Statistics of Extremes, Columbia University Press, New York, NY, USA, 1958.
- W. Hürlimann, “On the characterization of maximum likelihood estimators for location-scale families,” Communications in Statistics: Theory and Methods, vol. 27, no. 2, pp. 495–508, 1998.
- P. Embrechts, C. Klüppelberg, and Th. Mikosch, Modelling Extremal Events for Insurance and Finance, vol. 33 of Applications of Mathematics, Springer, Berlin, Germany, 1997.
- S. Kotz and S. Nadarajah, Extreme Value Distributions: Theory and Applications, Imperial College Press, London, UK, 2000.
- W. Hürlimann, “Higher-degree stop-loss transforms and stochastic orders (II) applications,” Blätter der Deutschen Gesellschaft für Versicherungsmathematik, vol. 24, no. 3, pp. 465–476, 2000.
- A. M. Abd Elfattah, E. A. Elsherpieny, and E. A. Hussein, “A new generalized Pareto distribution,” 2007, http://interstat.statjournals.net/YEAR/2007/abstracts/0712001.php.
- G. Bottazzi, “On the Pareto type III distribution,” Sant'Anna School of Advanced Studies, Pisa, Italy, 2007, http://www.lem.sssup.it/WPLem/files/2007-07.pdf.
- C. Kleiber and S. Kotz, Statistical Size Distributions in Economics and Actuarial Sciences, Wiley Series in Probability and Statistics, John Wiley & Sons, New York, NY, USA, 2003.
- A. K. Gupta and T. Varga, “An empirical estimation procedure,” Metron, vol. 52, no. 1-2, pp. 67–70, 1994.
- W. Walter, Gewöhnliche Differentialgleichungen, Eine Einführung. Heidelberger Taschenbücher, Band 110, Springer, Berlin, Germany, 1972.
- Ch. Blatter, Analysis II, Heidelberger Taschenbücher, Band 152, Springer, Berlin, Germany, 1974.
Copyright
Copyright © 2009 Werner Hürlimann. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.