Assessing the Performance of the Discrete Generalised Pareto Distribution in Modelling Non-Life Insurance Claims
The generalised Pareto distribution (GPD) offers a family of probability spaces which support threshold exceedances and is thus suitable for modelling high-end actuarial risks. Nonetheless, its distributional continuity presents a critical limitation in characterising data of discrete forms. Discretising the GPD, therefore, yields a derived distribution which accommodates count data while maintaining the essential tail modelling properties of the GPD. In this paper, we model non-life insurance claims under the three-parameter discrete generalised Pareto (DGP) distribution. Data for the study on reported and settled claims, spanning the period 2012–2016, were obtained from the National Insurance Commission, Ghana. The maximum likelihood estimation (MLE) principle was adopted in fitting the DGP to yearly and aggregated data. The estimation involved two steps. First, we propose a modification to the $\mu$ and $(\mu + 1)$ frequency method in the literature. The proposal provides an alternative routine for generating initial estimators for MLE in cases of varied count intervals, as is characteristic of the claim data under study. Second, a bootstrap algorithm is implemented to obtain standard errors of the estimators of the DGP parameters. The performance of the DGP is compared to that of the negative binomial distribution in modelling the claim data using the Akaike and Bayesian information criteria. The results show that the DGP is appropriate for modelling the count of non-life insurance claims and provides a better fit to the regulatory claim data considered.
Non-life or general insurance involves the provision of financial loss protection against risks on interests other than life, such as buildings, vehicles, machinery, and equipment. Conditioned on periodic payments or a one-off advance of a predetermined amount, called the premium, non-life policies are designed to provide coverage against the occurrence of the insured probabilistic events for individuals, private organisations, and public institutions. The payments effected in response to occurrences of such events are termed insurance claims (Wuthrich). The non-life insurance claim process is characterised by two quantities: claim frequency or count and claim severity or size. As noted by Renshaw and Özgürel, the underlying expectations of claim frequency and severity, quantified as a product, are foremost considerations in the computation of pure or risk premiums.
The main objective of this paper is to illustrate that the discrete generalised Pareto (DGP) distribution can be employed to model the counts of non-life insurance claims, collated by an insurance regulatory authority from a licensed class of insurers. In the absence of suitable actuarial models, non-life insurers largely encounter difficulties in conducting evidence-based assessment of risks insured, often resulting in the miscomputation of premiums and inability to settle claims when due. In response, developing probability models that describe claim frequencies offers a distributional framework for evaluating risks to facilitate premium setting and liquidity reserving by non-life insurance service providers.
A random variable $X$ that follows a DGP with shape $\alpha$, scale $\lambda$, and location $\mu$ parameters is denoted by $X \sim \mathrm{DGP}(\alpha, \lambda, \mu)$. This is a parametric model obtained by discretising the continuous generalised Pareto distribution, introduced by Pickands, and is particularly noted for its tail modelling properties. The DGP assumes varying forms based on the omission of one or two of its parameters. For instance, if the location parameter $\mu = 0$, the DGP transforms into the discrete Lomax distribution with parameters $\alpha$ and $\lambda$.
A fair amount of research has demonstrated the application of probability models to the study of non-life insurance claims. Among the relevant literature, selected studies have principally explored the subject with reference to standard probability distributions from the outlook of randomised spatial effects (Gschlößl and Czado), collective risk simulation (Pacáková), and incorporation of covariates (Renshaw). In this paper, we attempt to contribute to bridging the research gap by providing a method of fitting the DGP to the non-life claims data from the National Insurance Commission in Ghana. Furthermore, similar to the work of Prieto et al., we assess the performance of the DGP by comparing it with the negative binomial distribution.
This study contributes to the field of claims modelling in three ways. First, it proposes a count-based data structure for the analysis of non-life insurance claim frequencies to enhance the precision of statistical models. Second, a modification is made to the $\mu$ and $(\mu + 1)$ frequency method of Prieto et al. to obtain initial estimators of the DGP for claims data characterised by varying discrete observational intervals. Third, the algorithm implemented for the estimation of the parameters of the DGP offers a resource to researchers on performance analysis in future statistical and/or actuarial work.
The rest of this paper is organised as follows. In Section 2, the methodology is presented including the maximum likelihood estimation of the parameters of the distributions and model selection criteria. Section 3 presents the data and the arrangements needed to put the data into a form necessary for the model fitting. Lastly, in Section 4, we present concluding remarks.
This section presents the systematic approach followed to model the reported and settled claim datasets. Specifically, the section entails the description of the probability distributions, parameter estimation, and model selection criteria.
The parameter estimation method used in fitting the models is the maximum likelihood estimation (MLE) technique. Consider the case where a random variable $X$ is available from a population with a known probability distribution except for its parameter $\theta$. The maximum likelihood principle suggests that the criterion for making the selection should be the probability (or likelihood) with which a particular distribution can produce the given sample. The value of $\theta$ for that distribution is the maximum likelihood estimate of the unknown parameter, $\hat{\theta}$.
Suppose $X_1, X_2, \ldots, X_n$ is an independent random sample of size $n$ from a distribution with dependence on one or more unknown parameters $\theta$. Let $f(x_i; \theta)$ be the probability density (or mass) function of $X_i$, with $\theta$ restricted to a given parameter space $\Omega$. The likelihood function of the sample is given by
$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta). \tag{1}$$
The maximum likelihood estimator, $\hat{\theta}$, of $\theta$ is the solution to the equation
$$\frac{\partial L(\theta)}{\partial \theta} = 0. \tag{2}$$
Usually, $L(\theta)$ may involve exponentials, and hence, $\ln L(\theta)$ is maximised. Since the logarithm of a function increases or decreases with the function, the maximiser of $\ln L(\theta)$ also maximises $L(\theta)$.
2.1. Negative Binomial Distribution
The negative binomial is a discrete probability distribution which characterises the number of successes in a sequence of independent and identically distributed Bernoulli trials before a specified number of failures (denoted $r$) occurs. Suppose a sequence of Bernoulli trials is observed. By definition, each trial yields one of two possible outcomes, success or failure, with respective probabilities of occurrence denoted by $p$ and $1 - p$. Also, the trials are independent, and $p$ remains constant for each trial. If $X$ represents the number of trials, or failures, prior to the $r$-th success, then $X$ follows a negative binomial distribution with probability mass function:
The geometric distribution is a special case of the negative binomial, where the Bernoulli trial discontinues at the first failure, $r = 1$. Since the negative binomial may be represented by alternative distributional parametrisations, three factors inform distinctions: starting point of the support, whether at $0$ or $r$; definition of $p$, whether it represents the probability of success or failure; and interpretation of $r$, whether it denotes the number of successes or failures (DeGroot and Schervish).
Given $n$ independent and identically distributed claim count observations, $x_1, x_2, \ldots, x_n$, the likelihood function can be expressed as
To maximise equation (5), the partial derivatives with respect to $r$ and $p$ are set to zero:
Here, $\psi(r) = \Gamma'(r)/\Gamma(r)$ denotes the digamma function. Furthermore, solving for $p$ in equation (4) produces
Finally, substituting $\hat{p}$ in (7) yields
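Because the displayed expressions depend on which of the parametrisations noted above is chosen, the derivation can only be sketched here under one common convention, assumed for illustration: $X$ counts the failures before the $r$-th success, and $p$ is the success probability. The equation numbers used in the text are not reproduced.

```latex
\begin{align*}
P(X = x) &= \binom{x + r - 1}{x}\, p^{r} (1 - p)^{x}, \qquad x = 0, 1, 2, \ldots \\
\ln L(r, p) &= \sum_{i=1}^{n} \ln \Gamma(x_i + r) - \sum_{i=1}^{n} \ln (x_i!)
              - n \ln \Gamma(r) + n r \ln p + \ln(1 - p) \sum_{i=1}^{n} x_i \\
\frac{\partial \ln L}{\partial p} &= \frac{n r}{p} - \frac{\sum_{i=1}^{n} x_i}{1 - p} = 0
  \quad \Longrightarrow \quad \hat{p} = \frac{r}{r + \bar{x}} \\
\frac{\partial \ln L}{\partial r} &= \sum_{i=1}^{n} \psi(x_i + r) - n \psi(r) + n \ln p = 0 \\
\text{substituting } \hat{p}: \quad
0 &= \sum_{i=1}^{n} \psi(x_i + \hat{r}) - n \psi(\hat{r})
     + n \ln\!\left( \frac{\hat{r}}{\hat{r} + \bar{x}} \right)
\end{align*}
```

Under this convention the last line involves $\hat{r}$ both inside the digamma function and inside the logarithm, which is why it has to be solved numerically.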
The form of (9) suggests that a closed-form solution for $\hat{r}$ may not be obtained analytically. Therefore, numerical methods can be used to obtain estimators of $r$ and $p$. We adopt an alternative reparametrisation in terms of the mean and the dispersion (or shape) parameters (see, e.g., Piegorsch). For example, in R (R Core Team), the function fitdistr in the MASS package provides a routine for estimating the parameters of the negative binomial distribution under this reparametrisation.
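The numerical step can be illustrated with a minimal Python sketch (not the R routine referred to above). It assumes the convention that $X$ counts failures before the $r$-th success, profiles out $p$ as $r/(r + \bar{x})$, and maximises the remaining one-dimensional log-likelihood by golden-section search on $\log r$. The function names and the data are hypothetical, and the search assumes the sample mean is positive and the profile log-likelihood has a single peak inside the bounds.

```python
import math


def nb_profile_loglik(r, counts):
    """Negative binomial log-likelihood at size r, with p replaced by r / (r + mean).

    Assumed convention: X counts failures before the r-th success.
    """
    n = len(counts)
    mean = sum(counts) / n
    p = r / (r + mean)  # closed-form maximiser in p for fixed r
    loglik = n * r * math.log(p) - n * math.lgamma(r)
    for x in counts:
        loglik += math.lgamma(x + r) - math.lgamma(x + 1) + x * math.log1p(-p)
    return loglik


def fit_negative_binomial(counts, lo=1e-3, hi=1e3, iters=200):
    """Golden-section search on log(r); returns (r_hat, p_hat, loglik)."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if nb_profile_loglik(math.exp(c), counts) > nb_profile_loglik(math.exp(d), counts):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    r_hat = math.exp((a + b) / 2)
    mean = sum(counts) / len(counts)
    return r_hat, r_hat / (r_hat + mean), nb_profile_loglik(r_hat, counts)


# Hypothetical overdispersed claim counts, for illustration only.
claim_counts = [0, 0, 1, 2, 2, 3, 5, 8, 13, 40]
r_hat, p_hat, loglik = fit_negative_binomial(claim_counts)
print(r_hat, p_hat, loglik)
```

Searching on $\log r$ keeps the size parameter positive without a constrained optimiser; a general-purpose routine would normally replace the hand-written search.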
2.2. Discrete Generalised Pareto Distribution
The DGP arises from a discretisation of the continuous generalised Pareto distribution. To provide a basis for the discussion on the discrete generalised Pareto, the distribution functions of the Pareto type I and generalised Pareto distributions are given in (10) and (11), respectively. The generalised Pareto distribution is noted for its ability to model tails of distribution functions (see, e.g., [11, 12]). Upon discretisation of the generalised Pareto distribution, the resultant DGP inherits these continuous properties in forms adapted to the discrete probability space.
From the stated generalised Pareto distribution in (11), the probability mass function of the DGP can be formally deduced. First, consider the cumulative distribution function of the DGP expressed aswhere and if .
Also, Krishna and Pundir addressed the discretisation of a continuous model by observing unit groupings on the failure time axis. The authors reasoned that, for a continuous failure time $T$ with survival function $S(t)$ and time grouped into unit intervals, the discrete observed variable, $X = \lfloor T \rfloor$, would have the probability mass function
$$P(X = x) = S(x) - S(x + 1), \qquad x = 0, 1, 2, \ldots$$
Next, consider a standard generalisation for the survival function from Xekalaki :
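As a hedged illustration of how the grouping argument leads to a mass function, suppose the generalised Pareto survival function takes the form below, with shape $\alpha > 0$, scale $\lambda > 0$, and location $\mu$. This parametrisation is an assumption made here for illustration, and expressions (15) and (16) of the paper may be written differently.

```latex
\begin{align*}
S(x) &= \left[ 1 + \lambda (x - \mu) \right]^{-\alpha}, \qquad x \ge \mu, \\
P(X = x) &= S(x) - S(x + 1) \\
         &= \left[ 1 + \lambda (x - \mu) \right]^{-\alpha}
          - \left[ 1 + \lambda (x - \mu + 1) \right]^{-\alpha},
          \qquad x = \mu, \mu + 1, \mu + 2, \ldots \\
\sum_{x = \mu}^{\infty} P(X = x) &= S(\mu) - \lim_{x \to \infty} S(x) = 1 .
\end{align*}
```

The last line is the telescoping check that the differences of the survival function form a proper probability mass function on $\{\mu, \mu + 1, \ldots\}$.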
Suppose $x_1, x_2, \ldots, x_n$ is a sample of size $n$ from a DGP. The parameters $\alpha$ and $\lambda$ are estimated on the assumption that $\mu$ is known since $\mu = \min(x_1, \ldots, x_n)$. Adopting the $\mu$ and $(\mu + 1)$ frequency method of Prieto et al., the initial values can be obtained and used as seed estimators in the subsequent maximum likelihood operation. Thus, the relative frequencies of $\mu$ and $\mu + 1$, respectively denoted by $f_{\mu}$ and $f_{\mu+1}$, are calculated from the sample data. Then, $\alpha$ and $\lambda$ are determined by substituting $x = \mu$ and $x = \mu + 1$ into the DGP probability mass function in (16) and equating the expressions to their respective $f_{\mu}$ and $f_{\mu+1}$ values.
However, the $\mu$ and $(\mu + 1)$ frequency method assumes that the count data are observed in increasing steps of 1. In real-life situations, such as the data presented in Section 3.1, the observed counts may be separated by intervals other than 1. Applying the method strictly to such count data can therefore leave no observations at $\mu + 1$, leading to $f_{\mu+1} = 0$.
Proceeding with the computations with zero relative frequencies would result in a loss of essential frequency information in the dataset. We therefore modify the method into a $\mu$ and $(\mu + k)$ frequency method, where $k = x_{(2)} - \mu$ and $x_{(2)}$ is the smallest observation larger than the minimum, $\mu$. The estimators of $\alpha$ and $\lambda$ are then obtained by solving the resulting expressions (17) and (18) simultaneously. The expression in (19) results after $\alpha$ is eliminated from equations (17) and (18).
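The modified routine can be sketched in Python as follows. The sketch assumes the DGP mass function $P(X = x) = [1 + \lambda(x - \mu)]^{-\alpha} - [1 + \lambda(x - \mu + 1)]^{-\alpha}$, which may differ in form from (16). Under that assumption, matching the frequency at $\mu$ gives $\alpha \ln(1 + \lambda) = -\ln(1 - f_{\mu})$, so $\alpha$ can be eliminated and the remaining equation in $\lambda$ solved by bisection. The function names, the search grid, and the count-frequency table are hypothetical.

```python
import math


def dgp_pmf(x, alpha, lam, mu):
    """Assumed DGP pmf: S(x) - S(x + 1), with S(x) = [1 + lam * (x - mu)] ** (-alpha)."""
    if x < mu:
        return 0.0
    d = x - mu
    return math.exp(-alpha * math.log1p(lam * d)) - math.exp(-alpha * math.log1p(lam * (d + 1)))


def initial_estimates(f_mu, f_mu_k, k):
    """Solve pmf(mu) = f_mu and pmf(mu + k) = f_mu_k for (alpha, lam)."""
    if not (0 < f_mu < 1 and 0 < f_mu_k < 1):
        raise ValueError("relative frequencies must lie strictly between 0 and 1")
    c = -math.log1p(-f_mu)  # pmf(mu) = f_mu gives alpha * log(1 + lam) = c

    def alpha_of(lam):
        return c / math.log1p(lam)

    def gap(lam):
        # Remaining equation in lam alone, after alpha has been eliminated.
        alpha = alpha_of(lam)
        return (math.exp(-alpha * math.log1p(lam * k))
                - math.exp(-alpha * math.log1p(lam * (k + 1))) - f_mu_k)

    grid = [10 ** (-6 + 0.02 * i) for i in range(601)]  # lam from 1e-6 to 1e6
    for lo, hi in zip(grid, grid[1:]):
        if gap(lo) * gap(hi) <= 0:  # a sign change brackets a root
            for _ in range(200):
                mid = math.sqrt(lo * hi)
                if gap(lo) * gap(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            lam = math.sqrt(lo * hi)
            return alpha_of(lam), lam
    raise ValueError("no solution found on the search grid")


# Hypothetical count-frequency table (count -> frequency), for illustration only.
table = {2: 10, 4: 2, 7: 3, 11: 2, 20: 2, 55: 1}
n = sum(table.values())
values = sorted(table)
mu, k = values[0], values[1] - values[0]  # minimum and gap to the next observed count
alpha0, lam0 = initial_estimates(table[mu] / n, table[mu + k] / n, k)
print(mu, k, alpha0, lam0)
```

A solution need not exist for every pair of frequencies, so the sketch raises an error rather than returning arbitrary seed values when no sign change is found.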
Next, the maximum likelihood estimation method is employed to obtain estimators of the parameters of the discrete generalised Pareto. The log-likelihood function is constructed as
$$\ell(\alpha, \lambda) = \sum_{i=1}^{n} \ln P(X = x_i), \tag{21}$$
where $P(X = x_i)$ refers to the probability mass function specified in (16). Partial derivatives of (21) are taken with respect to $\alpha$ and $\lambda$ and set to zero to obtain the normal equations (22) and (23).
To proceed with the estimation of the parameters of the DGP, an algorithm was implemented in R to perform the following operations:
A1. Specify log-likelihood function (21) based on DGP probability mass function (16). The function is set to return the negated log-likelihood value since the R optim function is a minimiser. In effect, minimising the negated log-likelihood function from the initial estimates is equivalent to maximising the log-likelihood function.
A2. Optimise the log-likelihood functions in (22) and (23) at the seed values by simulated annealing (SANN), a variant of the Bélisle technique.
A3. Extract the estimated parameters, $\hat{\alpha}$ and $\hat{\lambda}$, from the output generated in A2, and compute the standard errors of the estimators using the bootstrap resampling of Efron and Tibshirani.
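Steps A1 to A3 can be sketched in Python as below. This is not the authors' R function: the annealing loop is a plain textbook version with Gaussian proposals and logarithmic cooling, not R's SANN implementation, and the mass function is the assumed form $[1 + \lambda(x - \mu)]^{-\alpha} - [1 + \lambda(x - \mu + 1)]^{-\alpha}$. The data, seed values, tuning constants, and function names are all hypothetical, and $\mu$ is held fixed at the sample minimum during resampling.

```python
import math
import random


def dgp_pmf(x, alpha, lam, mu):
    """Assumed DGP pmf: S(x) - S(x + 1), with S(x) = [1 + lam * (x - mu)] ** (-alpha)."""
    if x < mu:
        return 0.0
    d = x - mu
    return math.exp(-alpha * math.log1p(lam * d)) - math.exp(-alpha * math.log1p(lam * (d + 1)))


def neg_loglik(params, data, mu):
    """A1: negated log-likelihood; params = (log alpha, log lam) keeps both positive."""
    alpha, lam = math.exp(params[0]), math.exp(params[1])
    return -sum(math.log(max(dgp_pmf(x, alpha, lam, mu), 1e-300)) for x in data)


def anneal(data, mu, start, rng, steps=400, temp0=1.0, step_sd=0.3):
    """A2: plain simulated annealing with Gaussian proposals and logarithmic cooling."""
    current, f_current = list(start), neg_loglik(start, data, mu)
    best, f_best = list(current), f_current
    for i in range(steps):
        temp = temp0 / math.log(i + math.e)
        cand = [v + rng.gauss(0.0, step_sd) for v in current]
        f_cand = neg_loglik(cand, data, mu)
        # Always accept improvements; accept worse moves with a temperature-dependent probability.
        if f_cand < f_current or rng.random() < math.exp((f_current - f_cand) / temp):
            current, f_current = cand, f_cand
            if f_current < f_best:
                best, f_best = list(current), f_current
    return best, f_best


def bootstrap_se(data, mu, start, rng, reps=30):
    """A3: nonparametric bootstrap standard errors of (alpha, lam), with mu held fixed."""
    fits = []
    for _ in range(reps):
        resample = [rng.choice(data) for _ in data]
        est, _ = anneal(resample, mu, start, rng)
        fits.append([math.exp(v) for v in est])
    errors = []
    for j in range(2):
        column = [fit[j] for fit in fits]
        centre = sum(column) / reps
        errors.append(math.sqrt(sum((v - centre) ** 2 for v in column) / (reps - 1)))
    return errors


# Hypothetical claim counts, for illustration only.
counts = [2, 2, 2, 2, 4, 4, 7, 9, 14, 19, 19, 33, 70]
mu = min(counts)
rng = random.Random(1)
start = [0.0, 0.0]  # logs of hypothetical seed values alpha = lam = 1
estimate, nll = anneal(counts, mu, start, rng)
alpha_hat, lam_hat = math.exp(estimate[0]), math.exp(estimate[1])
se_alpha, se_lam = bootstrap_se(counts, mu, estimate, rng)
print(alpha_hat, lam_hat, nll, se_alpha, se_lam)
```

Working on the log scale keeps both parameters positive without explicit constraints. In practice the seed values would come from the frequency method rather than the fixed starting point used here, and far more annealing steps and bootstrap replicates would be used.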
2.3. Model Selection Criteria
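The fitted models are compared using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), defined as
$$\mathrm{AIC} = 2k - 2\ln \hat{L}, \qquad \mathrm{BIC} = k \ln n - 2\ln \hat{L}.$$
While $k$ and $n$ represent the number of parameters and size of the sample, respectively, $\ln \hat{L}$ specifies the log-likelihood of the model evaluated at the maximum likelihood estimates. Thus, $\hat{L}$ is the maximum value of the likelihood function associated with the model.
In comparison with AIC, BIC addresses the issue of overfitting with a factor, $\ln n$, thereby placing a higher penalty on model complexity (Dziak et al.). In statistical decision-making, a candidate model with minimum AIC and/or BIC values is selected.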
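The two criteria are simple to compute once the maximised log-likelihood is available. The short Python sketch below uses the standard definitions; the log-likelihood values and parameter counts in the example are hypothetical and are not results from this study.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln(L_hat)."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L_hat)."""
    return k * math.log(n) - 2 * loglik


# Hypothetical maximised log-likelihoods for two candidate models fitted to n = 29 counts.
n_obs = 29
candidates = {"negative binomial": (-118.4, 2), "DGP": (-112.9, 3)}
for name, (loglik, k) in candidates.items():
    print(name, round(aic(loglik, k), 2), round(bic(loglik, k, n_obs), 2))
```

The model with the smaller value of each criterion is preferred; the BIC penalty per parameter exceeds that of AIC whenever $\ln n > 2$, that is, for samples of eight or more observations.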
3. Data and Model Fitting
The study employs secondary data on non-life claims from the National Insurance Commission (NIC) of Ghana for the five-year period, 2012 to 2016. The historical data cover insurance claims of 29 non-life service providers. For each fiscal year, the dataset indicates the total number of claims administered under each of the five business classes of non-life insurance in Ghana. The classes are Fire, Burglary, and Property Damage; Accident; Marine and Aviation; Motor; and General Liability. The claim data are organised into three categories: incurred but not reported (IBNR), reported but not settled (RBNS), and settled but outstanding (SEBO), each bearing the standard actuarial definitions.
However, since IBNR is necessarily an estimate, the study focuses on RBNS and SEBO, hereinafter referred to as reported and settled claims, respectively. Overall, the data consist of 3,878,355 non-life insurance policies, generating 39,563 reported claims, of which 5,210 claims were settled within the period.
Figure 1 provides an overview of the annual aggregates for policy subscriptions, reported claims, and settled claims. Although policy subscriptions have seen a decrease from 2014 onwards, the number of reported and settled claims has been increasing within the period. This observation shows some evidence of potential liquidity challenges for the non-life insurers if the trend persists into the future.
Following Prieto et al., the dataset is organised into a structure which enables the fit of the discrete distributions to the frequency of occurrence of reported and settled claims. Tables 1 and 2 present descriptive statistics on the reported and settled claim datasets, respectively. For each year of the period considered, the skewness indicates the extent of asymmetry and shows that the distribution of the claim counts is positively skewed. Also, among the reported claims, the fiscal year 2016 recorded some unusually large values, culminating in its large kurtosis value. Similar results can be found in 2013 for the settled claim data.
In addition, Tables 3 and 4 record the individual observations of reported and settled claim counts with corresponding frequencies. For instance, in 2012, there were two records of reported claim count of 19 and six records of cases where no claims were settled.
It should be noted that the respective columns for the count frequencies for both reported and settled claims sum up to 29. Thus, each of them tallies with the total number of non-life insurers from whom records are collated by the National Insurance Commission. Lastly, nonsettlement of reported claims, among other reasons, may result from the eligibility of reported interest, proximity of the cause of the insured event, and noncompliance with coverage provisions of the insurance policy.
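The count-frequency arrangement described above can be illustrated with a short Python sketch. The 29 insurer-level counts are hypothetical and are not the NIC records; the sketch tallies each distinct claim count with its frequency and reads off the minimum and the gap to the next observed count, which the modified frequency method relies on.

```python
from collections import Counter


def count_frequency_table(counts):
    """Return sorted (claim count, frequency) pairs from insurer-level counts."""
    return sorted(Counter(counts).items())


# Hypothetical settled-claim counts for 29 insurers in one fiscal year.
insurer_counts = [0] * 6 + [3, 3, 4, 7, 7, 10, 12, 15, 19, 19, 21, 26,
                            30, 34, 41, 48, 55, 63, 72, 90, 120, 150, 310]
table = count_frequency_table(insurer_counts)
total = sum(freq for _, freq in table)  # should equal the number of insurers
minimum = table[0][0]                   # sample minimum
gap = table[1][0] - minimum             # distance to the next observed count
print(table[:3], total, minimum, gap)
```

In this made-up example the observed counts jump from 0 to 3, so no observation falls at the minimum plus one, which is the situation the modified frequency method is designed to handle.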
3.2. Model Fitting and Discussion of Results
This section presents the outcomes of the model fitting methods discussed in Section 2 on the claim data from the preceding section.
The parameter estimates are obtained through the maximum likelihood method. The maximum likelihood estimation of the negative binomial and DGP parameters is performed in R. The negative binomial parameters are estimated using the mle function, with its standard arguments, in the fitdistrplus package. However, to the best of the authors’ knowledge, no statistical package exists for estimating the parameters of the DGP in R. Therefore, the authors wrote an R function for estimating the parameters of the DGP, following steps A1–A3, which is available upon request. In addition, the selection criteria for model comparison are presented for the individual years and the aggregated claim data for the five-year period.
3.2.1. Parameter Estimation for Yearly Data
Also, Table 6 presents the estimates $\hat{\mu}$, $\hat{\alpha}$, and $\hat{\lambda}$ of the DGP location, shape, and scale parameters, respectively. The bootstrap standard errors are placed in parentheses.
In terms of reported claims’ count, Tables 7 and 8 show that the DGP model presents smaller AIC and BIC values, in comparison with the negative binomial model. The observation is consistent across the fiscal timeframe under consideration. In addition, for the settled claim counts, the DGP is preferred as it exhibits smaller AIC and BIC values throughout the period as shown in Tables 7 and 8. Therefore, the DGP is recommended as it provides a better fit to both classes of the non-life insurance claim data.
3.2.2. Parameter Estimation for Aggregated Data
This section presents the results of fitting the negative binomial and DGP to the aggregated 5-year count data on reported and settled claims. The results of the parameter estimation for the negative binomial and DGP are presented in Tables 9 and 10, respectively.
Comparing the negative binomial and DGP, Table 11 shows the AIC and BIC values for the fit of the two probability distributions. The DGP model produces smaller AIC and BIC values for the aggregate reported claim counts. Also, regarding the aggregate settled claim counts, smaller AIC and BIC values are produced by the DGP. Therefore, in alignment with the year-based modelling conclusion, the DGP is recommended as it provides a better fit to both classes of the yearly and aggregated non-life insurance claims data.
This study demonstrates that non-life insurance claims can be described by the three-parameter discrete generalised Pareto distribution. Relative to the negative binomial model, the DGP was observed to provide a better fit to the non-life reported and settled claim counts, as evident from the information criterion values under both yearly and aggregated data scenarios.
First, in organising the regulatory claim data for the distributional investigation, the study disaggregated each dataset into the observed count of claims and the corresponding frequencies, thereby providing an explicit count-frequency breakdown for informed probabilistic modelling. Furthermore, in deriving initial estimators to evaluate the DGP parameters, the $\mu$ and $(\mu + 1)$ frequencies of Prieto et al. were adapted to $\mu$ and $(\mu + k)$, with $k = x_{(2)} - \mu$, where $x_{(2)}$ is the smallest count value larger than the sample minimum, $\mu$. The modified frequency routine involving $\mu$ and $(\mu + k)$ extends the application of the $\mu$ and $(\mu + 1)$ frequency method to real-world count data exhibiting varying observational intervals.
To assess the strict response of the DGP to the claim count data, the study is conducted from a pure distributional perspective in the absence of explanatory variables. In future studies, however, incorporating relevant covariates into the modelling framework may offer additional insights for enhanced performance evaluation of the DGP. The outcomes will complement the present study in contributing towards optimality considerations for the allocation of premium funds by non-life insurance service providers.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
V. Pacáková, “Modelling and simulation in non-life insurance,” in Proceedings of the 5th International Conference on Applied Mathematics, Simulation and Modelling, WSEAS Press, Corfu Island, Greece, July 2011.
M. H. DeGroot and M. J. Schervish, Probability and Statistics, London, UK, 2012.
R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2017.
J. Beirlant, Y. Goegebeur, J. Segers, and J. L. Teugels, Statistics of Extremes: Theory and Applications, Wiley, England, UK, 2004.
B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap, Chapman & Hall, London, UK, 1993.