Salim Bouzebda, Mohamed Cherfi, "Dual Divergence Estimators of the Tail Index", International Scholarly Research Notices, vol. 2012, Article ID 746203, 14 pages, 2012. https://doi.org/10.5402/2012/746203
Dual Divergence Estimators of the Tail Index
The main purpose of the present paper is to propose new estimators of the tail index using $\phi$-divergences and the duality technique. These estimators are explored with respect to robustness through the influence function approach. Their empirical performance is illustrated by simulation.
1. Introduction
In extreme value statistics, emphasis lies on the modelling of rare events, mostly events with a low frequency but a high impact. Common practice is to characterize the size and frequency of such extreme events mainly by the extreme value index, and the main problem is then to estimate this unknown parameter. Since only the upper tail of the distribution is involved, it is reasonable to construct estimators based on the top extreme values of a sample. The most commonly used estimator of this kind is that proposed by Hill. The most prominent estimators of this real-valued parameter are maximum likelihood estimators of specific parametric models fitted to excesses over large thresholds. Alternatives to the Hill estimator are discussed by Smith; one of his conclusions (see, e.g., pp. 1181-1182 of his paper) is that in general the Hill estimator compares favourably with its competitors. These maximum likelihood estimators often prove to be highly efficient, though nonrobust against deviations of the actual distribution from the assumed parametric model. This is the case, for instance, in the presence of outliers or suspicious data, where the performance of the maximum likelihood estimators and the quality of the corresponding estimates of the tail index are often seriously affected. Maximum likelihood estimation is known to be very sensitive to deviations from the theoretical distribution, and for the class of heavy-tailed distributions it can fail to provide a reasonable parameter estimate; refer to Alexander, among others.
Robustness is an important issue in extreme value theory; see, for instance, Dell'Aquila and Embrechts. As shown in Brazauskas and Serfling, small errors in the estimation of the tail index can already produce large errors in the estimation of quantities based on the tail index. Hence, to overcome the lack of robustness to outliers of this kind of estimator, some robust methods for extreme values have been discussed in the recent literature. The interested reader may refer to Brazauskas and Serfling for robust estimation in the context of strict Pareto distributions. Dupuis and Field derived robust estimation methods for the case where the observations follow a generalized extreme value distribution, and Peng and Welsh and Juárez and Schucany did so for the generalized Pareto distribution, light or heavy tailed. Vandewalle et al. considered a robust estimation method based on the minimization of the integrated squared error criterion using an incomplete density mixture model; Kim and Lee used the minimum density power divergence approach of Basu et al. to estimate the tail index in the dependent case; more recently, Hubert et al. proposed a method to detect outliers that can influence the Hill estimator.
In this paper, we propose a new robust tail index estimation procedure, based on $\phi$-divergences and the duality technique, for the semiparametric setting of Pareto-type (or heavy-tailed) distributions. Here, the strict Pareto distribution is assumed to hold only asymptotically, that is, for excess distributions over high enough threshold values. The proposed method extends the maximum likelihood procedure, and it will be seen that the latter corresponds to a particular choice of the $\phi$-divergence, which leads to Hill's estimator.
The remainder of this paper is organized as follows. Section 2 is devoted to preliminary results on $\phi$-divergences and to the introduction of our estimator. Section 3 presents our new results on the influence function. In Section 4, we investigate the finite-sample performance of the newly proposed estimators. To avoid interrupting the flow of the presentation, all technical arguments are deferred to the Appendix.
2. Extreme Value Statistics and $\phi$-Divergence Setting
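A widely used family of divergences is the so-called “power divergences”, introduced by Cressie and Read (see also Liese and Vajda [15, Chapter 2]; Rényi's paper should also be mentioned here). They are defined through the class of convex real-valued functions, for $\gamma$ in $\mathbb{R}\setminus\{0,1\}$:
$$x \in \mathbb{R}_+^* \mapsto \phi_\gamma(x) := \frac{x^\gamma - \gamma x + \gamma - 1}{\gamma(\gamma - 1)}. \tag{2.1}$$
We have $\phi_0(x) := -\log x + x - 1$ and $\phi_1(x) := x\log x - x + 1$. (For all $\gamma \in \mathbb{R}$, we define $\phi_\gamma(0) := \lim_{x \downarrow 0}\phi_\gamma(x)$.) So, the $KL$-divergence is associated to $\phi_1$, the $KL_m$ to $\phi_0$, the $\chi^2$ to $\phi_2$, the $\chi^2_m$ to $\phi_{-1}$, and the Hellinger distance to $\phi_{1/2}$. In the monograph by Liese and Vajda the reader may find detailed ingredients of the modeling theory as well as surveys of the commonly used divergences.
We recall some basic definitions for the readers' convenience. Let $\{P_\theta : \theta \in \Theta\}$ denote the parametric model. Unless otherwise specified, we will assume that $\phi$ is a function of class $C^2$, strictly convex, such that, for fixed $\theta \in \Theta$,
$$\int \left|\phi'\!\left(\frac{dP_\theta}{dP_\alpha}\right)\right| dP_\theta < \infty, \quad \forall \alpha \in \Theta. \tag{2.2}$$
According to Broniatowski and Keziou, if the function $\phi$ satisfies the following condition: there exists $0 < \delta < 1$ such that, for all $c$ in $[1-\delta, 1+\delta]$, we can find numbers $c_1, c_2, c_3$ such that
$$\phi(cx) \le c_1\phi(x) + c_2|x| + c_3, \quad \text{for all real } x, \tag{2.3}$$
then assumption (2.2) is satisfied whenever $D_\phi(\theta,\alpha) < \infty$, where $D_\phi(\theta,\alpha)$ stands for the $\phi$-divergence between $P_\theta$ and $P_\alpha$; we refer the reader to Broniatowski and Keziou [18, Lemma 3.2]. The real convex functions (2.1), associated with the class of power divergences, all satisfy condition (2.2), including all standard divergences.
Under assumption (2.2), using the Fenchel duality technique, the divergence $D_\phi(\theta,\theta_0)$ can be represented as resulting from an optimization procedure,
$$D_\phi(\theta,\theta_0) = \sup_{\alpha \in \Theta} \int h(\theta,\alpha)\, dP_{\theta_0}, \tag{2.4}$$
where $h(\theta,\alpha) : x \mapsto h(\theta,\alpha,x)$ and
$$h(\theta,\alpha,x) := \int \phi'\!\left(\frac{dP_\theta}{dP_\alpha}\right) dP_\theta - \left[\frac{dP_\theta}{dP_\alpha}(x)\,\phi'\!\left(\frac{dP_\theta}{dP_\alpha}(x)\right) - \phi\!\left(\frac{dP_\theta}{dP_\alpha}(x)\right)\right]. \tag{2.5}$$
This result was elegantly proven in Keziou, Liese and Vajda, and Broniatowski and Keziou. Broniatowski and Keziou called it the dual form of a divergence, due to its connection with convex analysis. Furthermore, the supremum in display (2.4) is unique and reached at $\alpha = \theta_0$, independently of the value of $\theta$.
Let $X_1, \ldots, X_n$ be an independent, identically distributed (i.i.d.) sample from an unknown distribution function (d.f.) $F$. Naturally, a class of estimators of $\theta_0$, called “dual $\phi$-divergence estimators” (DDEs), is defined by
$$\hat{\alpha}_\phi(\theta) := \arg\sup_{\alpha \in \Theta} P_n h(\theta,\alpha), \quad \theta \in \Theta, \tag{2.6}$$
where $h(\theta,\alpha)$ is the function defined in (2.5) and $P_n h(\theta,\alpha) := n^{-1}\sum_{i=1}^n h(\theta,\alpha,X_i)$.
The class of estimators $\hat{\alpha}_\phi(\theta)$ satisfies
$$P_n \frac{\partial}{\partial\alpha} h\big(\theta, \hat{\alpha}_\phi(\theta)\big) = 0. \tag{2.7}$$
Formula (2.6) defines a family of $M$-estimators indexed by the function $\phi$ specifying the divergence and by some instrumental value of the parameter $\theta$. The dual representation of $\phi$-divergences has been applied by many authors; we cite, among others, Keziou and Leoni-Aubin for semiparametric two-sample density ratio models; bootstrapped $\phi$-divergence estimates are considered in Bouzebda and Cherfi; an extension of dual $\phi$-divergence estimators to right-censored data is introduced in Cherfi; for estimation and tests in copula models, we refer to Bouzebda and Keziou [24, 25] and the references therein. The performance of dual $\phi$-divergence estimators for normal models is studied in Cherfi.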
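As a small illustration of the power-divergence family recalled in this section, the following Python sketch evaluates the generator $\phi_\gamma$ and the resulting divergence between two discrete distributions. It assumes the standard Cressie-Read parametrization, with the limiting cases $\gamma = 0$ and $\gamma = 1$ handled separately; the names `phi`, `GAMMA`, and `discrete_divergence`, as well as the two probability vectors, are illustrative choices and do not come from the paper.

```python
import math


def phi(gamma: float, x: float) -> float:
    """Power-divergence generator phi_gamma(x), x > 0 (assumed Cressie-Read form)."""
    if x <= 0.0:
        raise ValueError("x must be strictly positive")
    if gamma == 0.0:  # limiting case gamma -> 0: modified Kullback-Leibler (KL_m)
        return -math.log(x) + x - 1.0
    if gamma == 1.0:  # limiting case gamma -> 1: Kullback-Leibler (KL)
        return x * math.log(x) - x + 1.0
    return (x ** gamma - gamma * x + gamma - 1.0) / (gamma * (gamma - 1.0))


# Index gamma for each divergence named in the text.
GAMMA = {"KL": 1.0, "KL_m": 0.0, "chi2": 2.0, "chi2_m": -1.0, "Hellinger": 0.5}


def discrete_divergence(gamma, q, p):
    """D_phi(Q, P) = sum_i p_i * phi(q_i / p_i) for strictly positive probability vectors."""
    return sum(pi * phi(gamma, qi / pi) for qi, pi in zip(q, p))


if __name__ == "__main__":
    q, p = [0.5, 0.5], [0.25, 0.75]  # hypothetical example distributions
    for name, g in GAMMA.items():
        print(f"{name:10s} gamma={g:4.1f}  D={discrete_divergence(g, q, p):.4f}")
```

Every generator vanishes at $x = 1$, so each divergence is zero when the two distributions coincide; for $\gamma = 2$ the generator reduces to $(x-1)^2/2$.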
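In what follows, we describe the procedure used to obtain the DDE for the tail index. Let $X_1, \ldots, X_n$ be an independent, identically distributed (i.i.d.) sample from an unknown distribution function (d.f.) $F$, with order statistics $X_{1,n} \le \cdots \le X_{n,n}$. Since it is well known that a distribution is in the domain of attraction of a Fréchet distribution if and only if the distribution has a regularly varying tail, we can assume that $1-F$ is regularly varying at $\infty$ with exponent $-\alpha$, where $\alpha > 0$ is called the tail index of the distribution $F$:
$$\lim_{t\to\infty} \frac{1-F(tx)}{1-F(t)} = x^{-\alpha}, \quad x > 0,$$
or equivalently,
$$1 - F(x) = x^{-\alpha} L(x),$$
where $L$ is slowly varying at $\infty$, namely,
$$\lim_{t\to\infty} \frac{L(tx)}{L(t)} = 1, \quad x > 0.$$
In the Pareto-type case, the conditional distribution of relative excesses over a threshold $u$ satisfies
$$P\!\left(\frac{X}{u} > y \,\Big|\, X > u\right) = \frac{L(uy)}{L(u)}\, y^{-\alpha}, \quad y > 1;$$
it is easily seen that ultimately
$$P\!\left(\frac{X}{u} > y \,\Big|\, X > u\right) \approx y^{-\alpha}, \quad y > 1, \quad \text{as } u \to \infty.$$
A popular choice for the threshold $u$ in threshold-based methods is $X_{n-k,n}$, the $(k+1)$th largest observation of the sample, where the integer $k = k_n$ depends on the sample size $n$. Here and elsewhere, $\lfloor u \rfloor$ denotes the largest integer $\le u$. The quantile function pertaining to $F$ is defined, for $0 < t < 1$, by
$$Q(t) := \inf\{x : F(x) \ge t\}.$$
The empirical quantile function is given, for each $n \ge 1$ and $0 < t \le 1$, by
$$Q_n(t) := X_{i,n} \quad \text{whenever } \frac{i-1}{n} < t \le \frac{i}{n}, \quad i = 1, \ldots, n.$$
The threshold $X_{n-k,n}$ is easily seen to be equal to $Q_n(1 - k/n)$.
The idea of constructing the DDE for the tail index is to assume the above Pareto approximation to hold exactly as a model for the conditional distribution of the relative excesses
$$Y_j := \frac{X_{n-j+1,n}}{X_{n-k,n}}, \quad j = 1, \ldots, k.$$
That is, we can fit a Pareto model to the relative excesses. In this framework, the estimation of $\alpha$ can be handled through dual divergence techniques, which provide a wide range of estimators, including the Hill estimator; they can all be compared with respect to robustness properties. Consider the Pareto density
$$p_\alpha(y) = \alpha\, y^{-\alpha-1}, \quad y > 1.$$
Specializing (2.4) to this setting, elementary calculation, for $\gamma$ in $\mathbb{R}\setminus\{0,1\}$ and $\gamma\theta - (\gamma-1)\alpha > 0$, gives
$$\int \phi_\gamma'\!\left(\frac{p_\theta}{p_\alpha}\right) p_\theta\, dy = \frac{1}{\gamma-1}\left[\left(\frac{\theta}{\alpha}\right)^{\gamma-1} \frac{\theta}{\gamma\theta - (\gamma-1)\alpha} - 1\right].$$
Using this last equality, one finds
$$\frac{1}{k}\sum_{j=1}^{k} h(\theta,\alpha,Y_j) = \frac{1}{\gamma-1}\left(\frac{\theta}{\alpha}\right)^{\gamma-1} \frac{\theta}{\gamma\theta - (\gamma-1)\alpha} - \frac{1}{\gamma k}\sum_{j=1}^{k} \left(\frac{\theta}{\alpha}\right)^{\gamma} Y_j^{-\gamma(\theta-\alpha)} - \frac{1}{\gamma(\gamma-1)},$$
and the DDE of the tail index is the value of $\alpha$ maximizing this criterion.
We now consider an interesting particular case of the previous setup: for $\gamma = 0$, one obtains
$$\frac{1}{k}\sum_{j=1}^{k} h(\theta,\alpha,Y_j) = \frac{1}{k}\sum_{j=1}^{k} \log\frac{p_\alpha(Y_j)}{p_\theta(Y_j)} = \log\frac{\alpha}{\theta} + (\theta - \alpha)\,\frac{1}{k}\sum_{j=1}^{k}\log Y_j,$$
which leads to the famous Hill estimator $\hat{\alpha}_H$, given by
$$\hat{\alpha}_H = \left(\frac{1}{k}\sum_{j=1}^{k} \log X_{n-j+1,n} - \log X_{n-k,n}\right)^{-1},$$
independently of $\theta$.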
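The construction described above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes the Pareto density $p_a(y) = a\,y^{-a-1}$ on $y > 1$ for the relative excesses, the power-divergence generators with index $\gamma < 1$, and a closed form of the empirical dual criterion derived under those assumptions. The function names, the search interval `[lo, hi]`, the grid-plus-golden-section maximizer, and the simulated strict Pareto sample are all hypothetical choices made for the example.

```python
import math
import random


def relative_excesses(sample, k):
    """Top-k observations divided by the (k+1)th largest one (the threshold)."""
    xs = sorted(sample)
    threshold = xs[-(k + 1)]
    return [x / threshold for x in xs[-k:]]


def hill(sample, k):
    """Hill estimator of the tail index: inverse of the mean log relative excess."""
    ys = relative_excesses(sample, k)
    return len(ys) / sum(math.log(y) for y in ys)


def dual_criterion(alpha, theta, gamma, ys):
    """Empirical dual criterion for the Pareto model (sketch, gamma < 1 assumed)."""
    if gamma >= 1.0:
        raise ValueError("this sketch only covers gamma < 1")
    if alpha <= 0.0:
        return -math.inf
    if gamma == 0.0:  # KL_m case: mean log-likelihood ratio of p_alpha to p_theta
        return sum(math.log(alpha / theta) + (theta - alpha) * math.log(y)
                   for y in ys) / len(ys)
    d = gamma * theta - (gamma - 1.0) * alpha
    if d <= 0.0:  # the integral term is not finite here; excluded from the search
        return -math.inf
    ratio = theta / alpha
    integral = ratio ** (gamma - 1.0) * theta / d
    mean_term = sum(ratio ** gamma * y ** (-gamma * (theta - alpha)) for y in ys) / len(ys)
    return integral / (gamma - 1.0) - mean_term / gamma - 1.0 / (gamma * (gamma - 1.0))


def dde(sample, k, gamma, theta, lo=0.05, hi=20.0, grid=400):
    """Maximize the dual criterion over alpha: coarse grid, then golden-section search."""
    ys = relative_excesses(sample, k)

    def f(a):
        return dual_criterion(a, theta, gamma, ys)

    step = (hi - lo) / grid
    best = max((lo + i * step for i in range(grid + 1)), key=f)
    a, b = max(lo, best - step), min(hi, best + step)
    r = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c, d = b - r * (b - a), a + r * (b - a)
        if f(c) < f(d):
            a = c
        else:
            b = d
    return 0.5 * (a + b)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.paretovariate(2.0) for _ in range(2000)]  # hypothetical strict Pareto sample
    k = 200
    h = hill(data, k)
    print("Hill          :", round(h, 4))
    print("DDE, gamma=0  :", round(dde(data, k, 0.0, theta=1.0), 4))
    print("DDE, gamma=0.5:", round(dde(data, k, 0.5, theta=h), 4))
```

With $\gamma = 0$ the maximizer should agree with the Hill estimate up to numerical tolerance whatever escort value `theta` is supplied, which gives a convenient sanity check on the criterion.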
Mason showed that the Hill estimator is consistent if $k = k_n$ is a sequence of positive integers satisfying
$$k \to \infty, \quad \frac{k}{n} \to 0, \quad \text{as } n \to \infty. \tag{2.22}$$
Further investigations concerning the asymptotic distribution of the Hill estimator have been made by Hall, Csörgő and Mason, Haeusler and Teugels, Beirlant and Teugels, and Bouzebda. These results are obtained under certain additional regularity conditions on $F$ and on $k$ satisfying (2.22).
3. Influence Function
In this section, we study the robustness properties of the proposed estimators theoretically. In particular, we derive their influence functions, from which the asymptotic variance follows. The following definition is needed for the statement of our forthcoming result. Recall that the influence function of a functional $T$ at a distribution $F$ describes the effect on the estimate of an infinitesimal contamination of $F$ at the point $x$ and is given by
$$\mathrm{IF}(x; T, F) := \lim_{\varepsilon \downarrow 0} \frac{T(F_{\varepsilon,x}) - T(F)}{\varepsilon}, \tag{3.1}$$
where $F_{\varepsilon,x} := (1-\varepsilon)F + \varepsilon\delta_x$ and $\delta_x$ is the Dirac measure putting all its mass at $x$. In the following, we derive the influence function for the functional form of the newly proposed estimator in the same way as for classical $M$-estimators. General results on influence functions of dual $\phi$-divergence estimators can be found in Toma and Broniatowski. In the notation used below, $\mathbb{1}_A$ stands for the indicator function of the event $A$. Our results are summarized in the following proposition; its proof is given in the Appendix.
Proposition 3.1. The influence function of the functional corresponding to an estimator is given by
To illustrate the behavior of the obtained influence functions, we restrict ourselves to the strict Pareto case; simulation results for other heavy-tailed distributions are presented in the next section. Figure 1 plots the influence functions of our estimators for the Pareto distribution. Observe that, for the Hellinger distance ($\gamma = 1/2$) and the -divergence (), the influence for the tail index becomes negligible for large outliers. The influence functions are bounded, making the associated functionals robust, in contrast with the Hill estimator ($\gamma = 0$) and the other divergences.
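The unboundedness of the influence function in the maximum likelihood case can be checked numerically in a simplified setting. The sketch below does not evaluate the formula of Proposition 3.1; it assumes a strict Pareto model on $(1, \infty)$ with known threshold, for which the maximum likelihood functional is $T(F) = 1/E_F[\log X]$, and it approximates the limit in the definition of the influence function by a finite difference. The function names, the contamination size `eps`, and the value $\alpha = 2$ are illustrative assumptions.

```python
import math


def mle_functional(mean_log: float) -> float:
    """Pareto maximum likelihood functional T(F) = 1 / E_F[log X] (threshold fixed at 1)."""
    return 1.0 / mean_log


def contaminated_mean_log(alpha: float, x: float, eps: float) -> float:
    """E[log X] under (1 - eps) * Pareto(alpha) + eps * delta_x; equals 1/alpha when eps = 0."""
    return (1.0 - eps) / alpha + eps * math.log(x)


def influence_numeric(alpha: float, x: float, eps: float = 1e-7) -> float:
    """Finite-difference approximation of the influence function at the point x."""
    t_model = mle_functional(1.0 / alpha)
    t_contaminated = mle_functional(contaminated_mean_log(alpha, x, eps))
    return (t_contaminated - t_model) / eps


def influence_closed_form(alpha: float, x: float) -> float:
    """Derivative of T along the contamination path: alpha * (1 - alpha * log x)."""
    return alpha * (1.0 - alpha * math.log(x))


if __name__ == "__main__":
    alpha = 2.0  # hypothetical tail index
    for x in (1.5, 10.0, 100.0, 1e4):
        print(f"x={x:>8}: numeric={influence_numeric(alpha, x):10.4f}  "
              f"closed form={influence_closed_form(alpha, x):10.4f}")
```

In this simplified model the influence grows like $-\alpha^2 \log x$ as the outlier $x$ increases, so a single large observation can move the estimate arbitrarily far.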
4. Simulation Results
To illustrate the robustness of the proposed method, we investigate its finite-sample behavior on both contaminated and uncontaminated data. For the tail index, a comparison is made between the well-known Hill estimator and the newly proposed estimator.
We consider simulated samples without contamination, each containing observations, from the two Pareto-type distributions. (i) The Fréchet distribution given by . (ii) The Burr distribution given by In the simulations, we have chosen , for the Burr distribution , , . The means of the DDE (left) and the corresponding empirical mean squared errors (right) are plotted as functions of $k$. The horizontal line indicates the true value of the tail index.
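A Monte Carlo experiment of the kind described above can be sketched as follows for the uncontaminated Fréchet case. The sketch assumes the parametrization $F(x) = \exp(-x^{-\alpha})$, $x > 0$, and only tracks the Hill estimator; the tail index, sample size, number of replications, and values of $k$ are hypothetical placeholders rather than the settings used in the paper, and the function names are illustrative.

```python
import math
import random


def frechet_sample(n, alpha, rng):
    """Inverse-transform draws from F(x) = exp(-x**(-alpha)), x > 0 (assumed form)."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # guard against log(0)
            out.append((-math.log(u)) ** (-1.0 / alpha))
    return out


def hill(sample, k):
    """Hill estimator of the tail index from the k largest observations."""
    xs = sorted(sample)
    log_threshold = math.log(xs[-(k + 1)])
    return k / sum(math.log(x) - log_threshold for x in xs[-k:])


def monte_carlo(alpha, n, ks, reps, seed=0):
    """Return {k: (mean, mse)} of the Hill estimates over `reps` simulated samples."""
    rng = random.Random(seed)
    estimates = {k: [] for k in ks}
    for _ in range(reps):
        sample = frechet_sample(n, alpha, rng)
        for k in ks:
            estimates[k].append(hill(sample, k))
    return {
        k: (sum(v) / reps, sum((e - alpha) ** 2 for e in v) / reps)
        for k, v in estimates.items()
    }


if __name__ == "__main__":
    # Hypothetical settings: tail index 1, samples of size 500, 200 replications.
    results = monte_carlo(alpha=1.0, n=500, ks=[25, 50, 100, 200], reps=200)
    for k, (mean, mse) in sorted(results.items()):
        print(f"k={k:4d}  mean={mean:.3f}  MSE={mse:.4f}")
```

Plotting the returned means and mean squared errors against $k$ gives curves of the same type as those described in the text; a robust competitor can be added by passing another estimator alongside `hill`.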
When the data are uncontaminated, although most robust estimators are known to be less efficient at the true model than maximum likelihood estimators, the estimates seem to be fairly stable for intermediate values of $k$, which makes the choice of $k$ less troublesome. Even with respect to mean squared error, the newly proposed estimator does not seem to lose much accuracy, although a close look at Figures 2 and 3 shows a slight tendency to overestimation. The newly proposed DDEs perform about as well as the Hill estimator.
However, a slight contamination () is sufficient to make the DDE associated with the -divergence () more appealing in terms of low MSE; see Figures 4, 5, 6, and 7. Furthermore, when the contamination increases, the DDE performs remarkably better.
Overall, the simulation results in this section provide supporting evidence of the adequacy of the DDE associated with the -divergence for observations drawn from Fréchet and Burr distributions. Moreover, the sensitivity of this estimator to the choice of $k$ is low.
Appendix
This section is devoted to the proof of our result.
Proof of Proposition 3.1. For convenience, we recall the definition of the empirical measure associated with the random variables , , which is given by We define the estimator as the value which maximizes, independently of , the following estimating equation: or, equivalently, as the solution, in , of the following equation: In this view, the estimator may be written in the form of a functional , given by We continue by rewriting (A.4) for the contaminated distribution as given in the definition of the influence function defined in (3.1), that is, Observe that Keeping in mind the definition of as in (A.4), the last term in (A.7) disappears. We next evaluate the first term in the right side of (A.7). We infer readily by using Leibnitz's integral rule, that: In a similar way, we can therefore write, The proof of Proposition 3.1 is therefore completed.
The authors would like to thank the four editors for their helpful comments on the paper.
References
- B. M. Hill, “A simple general approach to inference about the tail of a distribution,” The Annals of Statistics, vol. 3, no. 5, pp. 1163–1174, 1975.
- R. L. Smith, “Estimating tails of probability distributions,” The Annals of Statistics, vol. 15, no. 3, pp. 1174–1207, 1987.
- C. Alexander, “Assessment of operational risk capital,” in Risk Management, Challenge and Opportunity, M. Frenkel, U. Hommel, and M. Rudolf, Eds., 2nd edition, 2005.
- R. Dell'Aquila and P. Embrechts, “Extremes and robustness: a contradiction?” Financial Markets and Portfolio Management, vol. 20, no. 1, pp. 103–118, 2006.
- V. Brazauskas and R. Serfling, “Robust estimation of tail parameters for two-parameter and exponential models via generalized quantile statistics,” Extremes, vol. 3, pp. 231–249, 2000.
- V. Brazauskas and R. Serfling, “Robust and efficient estimation of the tail index of a single-parameter Pareto distribution,” The North American Actuarial Journal, vol. 4, pp. 12–27, 2000.
- D. J. Dupuis and C. A. Field, “Robust estimation of extremes,” The Canadian Journal of Statistics, vol. 26, no. 2, pp. 199–215, 1998.
- L. Peng and A. H. Welsh, “Robust estimation of the generalized Pareto distribution,” Extremes, vol. 4, no. 1, pp. 53–65, 2001.
- S. F. Juárez and W. R. Schucany, “Robust and efficient estimation for the generalized Pareto distribution,” Extremes, vol. 7, no. 3, pp. 237–251, 2004.
- B. Vandewalle, J. Beirlant, A. Christmann, and M. Hubert, “A robust estimator for the tail index of Pareto-type distributions,” Computational Statistics & Data Analysis, vol. 51, no. 12, pp. 6252–6268, 2007.
- M. Kim and S. Lee, “Estimation of a tail index based on minimum density power divergence,” Journal of Multivariate Analysis, vol. 99, no. 10, pp. 2453–2471, 2008.
- A. Basu, I. R. Harris, N. L. Hjort, and M. C. Jones, “Robust and efficient estimation by minimising a density power divergence,” Biometrika, vol. 85, no. 3, pp. 549–559, 1998.
- M. Hubert, G. Dierckx, and D. Vanpaemel, “Detecting influential data points for the Hill estimator in Pareto-type distributions,” Computational Statistics & Data Analysis. In press.
- N. Cressie and T. R. C. Read, “Multinomial goodness-of-fit tests,” Journal of the Royal Statistical Society B, vol. 46, no. 3, pp. 440–464, 1984.
- F. Liese and I. Vajda, Convex Statistical Distances, vol. 95 of Teubner Texts in Mathematics, BSB B. G. Teubner Verlagsgesellschaft, Leipzig, Germany, 1987.
- A. Rényi, “On measures of entropy and information,” in Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 547–561, University of California Press, Berkeley, Calif, USA, 1961.
- M. Broniatowski and A. Keziou, “Parametric estimation and tests through divergences and the duality technique,” Journal of Multivariate Analysis, vol. 100, no. 1, pp. 16–36, 2009.
- M. Broniatowski and A. Keziou, “Minimization of -divergences on sets of signed measures,” Studia Scientiarum Mathematicarum Hungarica, vol. 43, no. 4, pp. 403–442, 2006.
- A. Keziou, “Dual representation of -divergences and applications,” Comptes Rendus Mathématique, vol. 336, no. 10, pp. 857–862, 2003.
- F. Liese and I. Vajda, “On divergences and informations in statistics and information theory,” IEEE Transactions on Information Theory, vol. 52, no. 10, pp. 4394–4412, 2006.
- A. Keziou and S. Leoni-Aubin, “On empirical likelihood for semiparametric two-sample density ratio models,” Journal of Statistical Planning and Inference, vol. 138, no. 4, pp. 915–928, 2008.
- S. Bouzebda and M. Cherfi, “General bootstrap for dual -divergence estimates,” Journal of Probability and Statistics, vol. 2012, Article ID 834107, 33 pages, 2012.
- M. Cherfi, “Dual divergences estimation for censored survival data,” Journal of Statistical Planning and Inference, vol. 142, no. 7, pp. 1746–1756, 2012.
- S. Bouzebda and A. Keziou, “New estimates and tests of independence in semiparametric copula models,” Kybernetika, vol. 46, no. 1, pp. 178–201, 2010.
- S. Bouzebda and A. Keziou, “A new test procedure of independence in copula models via -divergence,” Communications in Statistics, vol. 39, no. 1-2, pp. 1–20, 2010.
- M. Cherfi, “Dual ϕ-divergences estimation in normal models,” Computation. In press. http://arxiv.org/abs/1108.2999
- D. M. Mason, “Laws of large numbers for sums of extreme values,” The Annals of Probability, vol. 10, no. 3, pp. 754–764, 1982.
- P. Hall, “On some simple estimates of an exponent of regular variation,” Journal of the Royal Statistical Society B, vol. 44, no. 1, pp. 37–42, 1982.
- S. Csörgő and D. M. Mason, “Central limit theorems for sums of extreme values,” Mathematical Proceedings of the Cambridge Philosophical Society, vol. 98, no. 3, pp. 547–558, 1985.
- E. Haeusler and J. L. Teugels, “On asymptotic normality of Hill's estimator for the exponent of regular variation,” The Annals of Statistics, vol. 13, no. 2, pp. 743–756, 1985.
- J. Beirlant and J. L. Teugels, “Asymptotics of Hill's estimator,” Teoriya Veroyatnosteĭ i ee Primeneniya, vol. 31, no. 3, pp. 530–536, 1986.
- S. Bouzebda, “Bootstrap de l'estimateur de Hill: théorèmes limites,” Annales de l'I.S.U.P., vol. 54, no. 1-2, pp. 61–72, 2010.
- F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel, Robust Statistics, Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics, John Wiley & Sons, New York, NY, USA, 1986.
- A. Toma and M. Broniatowski, “Dual divergence estimators and tests: robustness results,” Journal of Multivariate Analysis, vol. 102, no. 1, pp. 20–36, 2011.
Copyright © 2012 Salim Bouzebda and Mohamed Cherfi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.