Abstract

The generalized Pareto distribution is one of the most important distributions in statistics of extremes as it has wide applications in fields such as finance, insurance, and hydrology. This study proposes two new methods for estimating the shape parameter of the generalized Pareto distribution (GPD). The proposed methods use the shrinkage principle to adapt the existing empirical Bayesian with data-based prior and the likelihood moment method to obtain two estimators. The performance of the proposed estimators is compared with the existing estimators (i.e., maximum likelihood, likelihood moment estimators, etc.) for the shape parameter of the generalized Pareto distribution in a simulation study. The results show that the proposed estimators perform better for small to moderate number of exceedances in estimating shape parameter of the light-tailed distributions and competitive when estimating heavy-tailed distributions. The proposed estimators are illustrated with practical datasets from climate and insurance studies.

1. Introduction

The generalized Pareto distribution (GPD) is one of the fundamental distributions used in extreme value theory. As reported in [1, 2], the GPD is the limiting distribution of excesses over a large threshold regardless of the underlying distribution of data, The significance of the GPD in extreme value theory cannot be overemphasized because of its extensive applicability: hydroelectric dam management [3], flood levels of river, insurance and finance [4, 5], waiting time problems [6], ecology [7], and climatology [8], among others.

Let be a random sample from with unknown underlying distribution function, An exceedance occurs over a deterministic threshold, , if the value of is greater than i.e., , where is the number of observations exceeding Also, the excess over a threshold is defined as with The excess distribution over threshold is given by where is the finite or infinite right endpoint of the distribution . From [1, 2], the distribution of is well approximated by the GPD with the survival function defined as

If the location , shape (), and scale () parameters are included (see, e.g., [9]), the resulting three-parameter generalized Pareto distribution is given by

The shape parameter, is usually referred to as the tail index or the extreme value index [3, 10, 11]. The mean and the variance of a GPD random variable, are given by

respectively. Here, the expected value of exists when and the variance when

The shape (tail index) parameter shows the tail heaviness of the underlying distribution and can be used to classify the distribution into three classes. These are short-tailed when , medium-tailed when , and heavy-tailed when . In special cases, and the GPD reduces to exponential and uniform distributions, respectively.

Several estimation techniques have been proposed for estimating the parameters of the GPD such as maximum likelihood (ML), probability weighted moments, and elemental percentile. For large sample sizes, [12] showed that the ML estimator (MLE) is asymptotically the best. However, the MLE is known to pose computational difficulties and convergence problems [13, 14]. In addition, it does not exist when [15] and provides nonfeasible estimates if [13, 16].

To address these problems of the MLE, [15] proposed the likelihood moment estimators (LME). This estimator shows high efficiency for small sample sizes. However, [17] reports that it performs poorly if and tends to overestimate the shape parameter for larger values of the scale parameter. Also, the LME again is computationally intensive and slow. Hence, [18, 19] proposed an empirical Bayesian method to solve the computational intensiveness of the LME. The resulting estimators of the tail index from this method showed better performance for Despite this advantage, the authors use all available data in the computation of the estimates of the parameters of the GPD. However, it is known that for GPD-based models, the inclusion of small to moderate observations introduces bias [11, 20].

In this paper, we propose two estimators of the tail index of the generalized Pareto distribution using the idea of the shrinkage principle. The estimators are a combination of the likelihood-based approach and that of the empirical Bayesian methods. The performance of the proposed method is evaluated in conjunction with the existing estimators in a simulation.

The rest of the paper is organized as follows. Review of the existing estimators and proposed two new estimators are in Section 2. The performances of the estimators are examined via an extensive simulation study in Section 3. A practical application of these estimators on two datasets is demonstrated. Finally, Section 5 provides concluding remarks.

2. Estimation of the Parameters of the Generalized Pareto Distribution

Assuming is a GPD with scale and shape parameters and , respectively, the following methods can be used to obtain the estimators of the parameters, maximum likelihood [13], likelihood moment [15], profile likelihood with empirical Bayesian [18], and improved profile likelihood with empirical Bayesian [19] methods. In the next subsections, we present a brief description of each of these methods.

2.1. Maximum Likelihood Estimator (MLE)

The maximum likelihood principle involves finding the set of parameters that maximize the likelihood function evaluated on the sample from the distribution, of the random variable, : where is the density function associated with .

Many studies have shown that the estimators based on the maximum likelihood estimator are usually better in the case of large samples (see, e.g., [11, 12, 21, 22]), among others. In addition, its asymptotic properties including normality are easily found compared with other estimation methods.

In the case of statistics of extremes, and in particular the GPD, [11, 13, 21], among others, show that for small sample sizes, the maximum likelihood performs poorly. Also, the numerical algorithm may fail to converge to the maxima. [21] suggests reparametrization of () to () of which is of the form and is estimated as , with

Hence, where are the order statistics associated with Maximising (8) leads to the MLE estimator , and the corresponding MLE of and are obtained as

With this reparametrization, the MLE becomes asymptotically efficient. However, the computational issues persist such as nonconvergence and, thus, the search for solutions including the likelihood moment estimation technique.

2.2. Likelihood Moment Estimator

The likelihood moment estimator (LME) was proposed by [15] to solve the computational complexities, efficiency, and convergence problems of the MLE. LME is reported to be efficient and robust, yet it is computationally intensive. The LME is estimated with the log-likelihood and the moment estimators as

It should be noted that for any constant satisfying , the of which the sample version is equivalent to

Using some conditions on and for the existence of the MLE estimator of the tail index, [15] obtains

Here, and [15] depicts that when the asymptotic variances of LME and MLE are the same. Even though the choice of threshold is an important issue in the estimation of the GPD, it was not considered by the authors. Also, [23] states that the LME is less sensitive than other estimators to the choice of threshold. In general, the LME estimator performs poorly if and tends to overestimate the shape parameter for large values of the scale parameter [17].

2.3. The Profile Likelihood with Empirical Bayesian Method (PB)

This estimation method was proposed by [18] to improve the MLE and also solve the computational difficulties of the LME method. It borrows ideas from empirical Bayesian method where and are reparameterized as and , of which is of the form

The authors defined their estimator as where is the profile log-likelihood function and is a data-driven “prior” density function for

The integral in (14) is computationally difficult for most priors; hence, a simplified numerical version was proposed as where with as the first quartile of the sample data, and The shape and scale parameters are obtained as

However, this method is reported to be sensitive to the shape of the prior distribution but performs well when Also, it is reported to have very poor performance for extremely heavy-tailed distributions and efficient for only small sample sizes [19].

Since this estimation method is in the peaks-over threshold framework, the selection of a suitable threshold is critical as it leads to bias-variance trade-offs. However, the authors did not consider this in their simulation study.

2.4. Improved Profile Likelihood with Empirical Bayesian Method (IPB)

[19] proposed a refinement of the estimator in [18] with the aim of obtaining improved estimators of the tail index in the case of heavy-tailed distributions. In order to achieve this, a better data-driven prior distribution was proposed.

In this case, a GPD with scale and shape was chosen as the prior distribution for

Here, is chosen as negative to ensure that is positive and finite.

Similar to the estimator in Section 2.3, the modified estimators of and are obtained as

Here, with the computation of as

However, this method is reported to be sensitive to the estimation of the scale parameter and has poor performance for light-tailed distributions [9, 24]. In addition, this estimator is not valid for [22]. Furthermore, the authors did not consider the effect of the number of exceedances, on their proposed estimator.

2.5. Empirical Review of Some Existing Estimation Methods

In [18, 19], it has been demonstrated numerically that their estimators of the tail index are less biased and efficient compared with existing estimators such as the maximum likelihood and likelihood moments. However, a closer look at the codes implemented for the simulation study reveals that the number of exceedances () is taken as the sample size (i.e., the whole sample is utilised in the estimation of the parameters). This is in contrast to the concepts in extreme value theory relating to the peaks-over threshold (POT) where emphasis is on the tails of the distributions. Therefore, in such a simulation study, estimators should be assessed on their sensitivity to the choice of top order statistics, i.e., number of exceedances. In view of the above, we present a simulation study to assess the performance of the proposals in [18, 19] with that of the MLE and the LME of tail index as a function of the number of top order statistics or exceedances.

We generate samples of size and from the GPD with parameters, and . Figure 1 shows the behaviour of the estimators arising from these estimators of the tail index of the underlying distributions of the generated datasets.

In the case of the estimation of LME tends to overestimate (show large deviations) the tail index when is small. Also, PB and IPB methods underestimate the value of the tail index for small values of and are only closer to the true value as . In addition, for positive values of the shape parameter, the LME gives very good estimates, i.e., closer to the true value. However, the PB and IPB estimators give good estimates of only if . Therefore, the estimators, PB [18] and IPB [19], do not perform well when the number of exceedances is small compared to the sample size.

Even though the [18, 19] estimators (i.e., PB and IPB) show attractive properties in the estimation of the extreme value index, this small simulation study gives an indication of their sensitivities to the choice of Therefore, in this study, we aim to provide estimators of the tail index that takes into account the advantages of the [18, 19] estimators but are less sensitive to the choice of the number of exceedances, .

2.6. The Proposed Method

This estimation technique seeks to improve the methods proposed by [15] and that of [18, 19]. The proposed method uses the idea of shrinkage estimation which relies on a weighted combination of these methods.

For convenience, we reparametrize and of the GPD as and , where as done by [21]. This implies that the shape and scale parameter are estimated, respectively, by and by maximizing the log-likelihood for the sample

For the first and second proposed shrinkage estimators of are given by respectively. The value of plays an important role in the shrinkage estimator: implies that the LME dominates whereas with , the empirical Bayesian methods (PB and IPB) dominate. Also, when equal proportion of both estimators is included in the proposed estimators of the tail index.

Therefore,

Now, the MSE of is computed as

For optimal weight, we minimise the with respect to :

The asymptotic properties of (i.e., and ) remain an open problem, and hence, we resort to the use of simulation to find an approximate value for . An extensive simulation was done to choose the value of and in practice, the findings show that for light-tailed situations, provides an appropriate or suitable estimates whereas for heavy-tailed cases, gives an optimal estimates. We provide a few examples as shown in Figure 2. The rest are available on request from the authors.

3. Simulation Study

In this section, a simulation study is conducted to compare the performance of the tail index estimation methods discussed in the previous section. Specifically, the two proposed estimators are compared with the MLE [13], LME [15], PB [18], and IPB [19]. We present the simulation design and a discussion of the results in the next subsubsections.

3.1. Simulation Design

Samples of different sizes are generated from the GPD with various values of the shape and scale parameters. The four existing estimators, MLE, LME, PB, and IPB, and the two proposed estimators, Proposed 1 and Proposed 2, are used to estimate the values of the parameters of the GPD. Two error metrics are used for comparison of the estimators, i.e., mean square error (MSE) and bias.

Algorithm 1 outlines the procedure for comparing the performance of estimators.

Step 1. Generate a sample of size from a GPD with known shape and scale () parameters such that both light and heavy-tailed distributions are realized.
Step 2. For each number of top order statistic    or equivalently threshold value compute estimates of where {MLE, LME, PB, IPB, Proposed 1 and Proposed 2}.
Step 3. In order to compute the bias and mean square error for each estimator, repeat Steps and a large number of times, to obtain the estimates, with
Step 4. Compute the bias
     
and Mean Square Error(MSE)
     
at each

We remark that, from Step 2, the estimation leads to concurrent estimates of However, our interest is the tail index (i.e., shape) parameter only.

3.2. Simulation Results and Discussion

The simulation is implemented in the statistical package R. The codes for IPB and PB are obtained from [18, 22], respectively. Also, the codes for LME can be found in the R package POT.

The results of the simulation on the estimation of the shape parameter, are presented in Figures 3 and 4 for light and heavy-tailed distributions, respectively.

First, the results show that the proposed methods generally have smaller MSE for small values of (i.e., less than 40% of the sample size). As PB and IPB estimators appear to be the best as their MSEs approach 0. This is in conformity with what have been reported in [18, 19] and the results in Section 2.5. Also, the MLE estimator gives better performance than all other estimation methods when at In general, the LME overestimates all values of and, hence, performs poorly.

In addition, Figure 4 shows the MSEs of the six estimators of The results indicate that the proposed methods are competitive with LME as they have smaller MSEs across most of the values of Closely following the sample path of both proposed methods and LME is the MLE: it has better performance in some cases. Lastly, the MSEs of the estimators converge to 0 as , and this shows a sign of empirical consistency.

Therefore, we conclude that the proposed estimators are competitive with the existing ones, and at worse, it has MSEs close to that of likelihood-based method (LME).

4. Practical Application

In order to demonstrate the use of the proposed methods, we consider the estimation of the tail indexes of the underlying distributions of two datasets. First, the monthly mean temperature series of Ghana for the period 1901 to 2019 obtained from the climate monitor unit hosted by the Climatic Research Unit, University of East Anglia. The second data is the Danish fire insurance claims in Denmark for the period 3 January 1980 to 31 December 1990 and that was obtained from the EVIR package in R.

4.1. Descriptive Measures

Table 1 shows the main descriptive measures for both monthly mean temperature of Ghana with 1417 monthly observations and the fire insurance claims in Denmark with 2167 daily series. Whereas the fire claim data is heavy-tailed, the distribution of the temperature data shows a mild positive skewness.

The values of the Shapiro-Wilk test for normality reject the null hypothesis of a normal distribution for both series, thereby making it appropriate for extreme value analysis.

From the plots in Figure 5, no obvious trend can be seen in the temperature data. However, the Danish fire insurance claim had few outliers. The temperature series is dense in the range of 25.5 to 27 while the Danish fire claims are denser from one million to 2 million Danish Krone. The histogram and the exponential quantile-quantile plots of the two datasets indicate that the fire insurance claims have a long tail (heavy tail) while the temperature series has a shorter tail.

In addition, the mean excess plots show that the temperature data is light-tailed and, hence, indicates a likely negative value of the tail index, , whereas the Danish shows more of heavy-tailed.

4.2. Estimation of the Tail Index

Figure 6 gives the sample paths of the estimators of the tail index of the underlying distribution of the Danish and temperature datasets. The sample paths of the estimators show that there exist large variations at smaller values of and smoother for larger values of The former can be explained by the small number of observations involved in the estimation of the tail index and, hence, the large variations associated with the sample paths. On the other hand, the latter can be explained by the large number of observations involved in the estimation of the tail index and, thus, the less variations associated with the sample paths.

In addition, the proposed estimators are in between the sample paths of the likelihood-based and empirical Bayesian-based estimators as required.

5. Conclusion

The importance of the generalized Pareto distribution cannot be overemphasized as it has wide applications in finance, hydrology, and economics, among others. In this paper, we proposed two estimators of the parameters of the generalized Pareto distribution. The estimators are based on the existing likelihood moment and the empirical Bayesian-based estimators using the idea of shrinkage. The performance of the proposed estimators was assessed using a large-scale simulation study. The results show that the likelihood moment estimator is not appropriate for negative values of the shape parameter. Also, the empirical Bayesian-based estimators use all the available data and hence defeat the threshold characteristics of the GPD. In addition, it performs well for only in the case where the number of exceedances approaches the sample size. The proposed estimators have competitive MSE for small to moderate number of exceedances. This is appropriate as the inclusion of smaller observations (i.e., large number of exceedances) introduces large bias. Moreover, the proposed estimators are at worst close to the performance of the likelihood-based estimator. The estimators were illustrated using two real datasets from the climate sector representing light-tailed distributions and the insurance industry representing heavy-tailed distributions. The asymptotic properties of the extreme value index estimators in [18, 19] remain an open problem. Consequently, since our proposed shrinkage estimators of the extreme value index depend on these estimators, their asymptotic properties also remain an open problem. Therefore, these asymptotic properties, estimation of the scale parameter, and other problems such as estimation of high quantiles and exceedance probabilities are the subject of future research.

Data Availability

Data were obtained from the Climate Research Unit and the EVIR package in R. The temperature data used to support the findings of this study have been deposited in a git-hub repository (https://github.com/wilheminapels2/Shrinkage/blob/main/Temperature.csv).

Conflicts of Interest

The authors declare that they have no conflicts of interest.