Research Article  Open Access
Extension and Application of Credibility Models in Predicting Claim Frequency
Abstract
In non-life actuarial science, credibility models are one of the main methods of experience ratemaking. The Bühlmann-Straub credibility model can be expressed as a special case of linear mixed models (LMMs) with the underlying assumption of normality. In this paper, we extend the assumption of the Bühlmann-Straub model to include the Poisson and negative binomial distributions, which are more appropriate for describing the number of claims. Using the framework of generalized linear mixed models (GLMMs), we obtain generalized credibility premiums that contain other credibility premiums in the literature as particular cases. Compared to generalized linear mixed models, our extended credibility models also have the advantage that the credibility factor falls in the range from 0 to 1. The performance of our models in comparison with an existing model in the literature is also evaluated through numerical studies, which show that our approach produces premium estimates close to the optimum. In addition, our proposed model can also be applied to the most commonly used ratemaking approach, namely, the optimal Bonus-Malus system.
1. Introduction
Credibility theory is one of the key topics in the field of actuarial science. Since the beginning of the twentieth century, the greatest-accuracy credibility theory based on the Bayesian philosophy has gradually taken the place of the limited fluctuation credibility theory and has become widely used in non-life ratemaking. In credibility theory, the insurance premium for the individual contract is derived from a convex combination of the prior mean, say $\mu$, and the mean of claim experience for each contract, say $\bar{X}$, by $Z\bar{X} + (1 - Z)\mu$, where $Z$ represents the credibility factor, ranging from 0 to 1. Jewell [1] shows that the credibility formula can be derived from Bayes' theorem by using a Poisson-gamma model. To simplify the credibility factor in the Bayesian method, Bühlmann [2] developed a linear credibility formula under the principle of minimum mean square error (least squares). Without assuming a prior distribution with specified distributional parameters, Bühlmann considered the best linear estimator based on observed claims, whose parameters can be estimated consistently by the method of moments [3]. Frangos and Vrontos [4] and Tzougas et al. [5] proposed an explicit expression for the Bayesian credibility model using a conjugate prior distribution, but this has limited applicability because the required integration is difficult when a non-conjugate prior distribution is used. Frees et al. [6] discussed the credibility model as a special case of the linear mixed model, introducing the fixed effect and the random effect to represent, respectively, the overall mean of claims over the collection of subjects and the deviation of the individual mean of a specific subject from the overall mean. It is also well known that different methods can lead to an exact expression for the credibility factor.
These methods include, among others, a generalization of distribution-free approaches [7], the posterior regret minimax approach [8], Bayesian nonparametric methods [9, 10], and an approximate credibility formula [11].
While linear mixed models have applications in many other areas, they have limited applications in actuarial science. Because insurance data are usually right-skewed or discrete (claim frequency models), linear mixed models have to be extended to generalized linear mixed models (GLMMs); see Antonio and Beirlant [12]. These models no longer require the normal distribution assumption and can adapt to distributions in the exponential family, such as the Poisson and gamma distributions. With the introduction of random effects, the predictors in generalized linear mixed models are similar to the predictors in the credibility model. The credibility factor can be deduced from the GLMM predictor, but it cannot be guaranteed to fall in the range from 0 to 1 (Meng, 2014), which violates a basic principle of the credibility model.
In this paper, we propose a new approach to credibility modeling, using the generalized linear mixed models framework for analyzing longitudinal claims data. The contribution of this article is threefold. First, we generalize the underlying normal assumption in the Bühlmann-Straub model to the Poisson and negative binomial distributions, which are more appropriate for describing the claim frequency, overcoming the insufficiency of the Bühlmann-Straub model. Second, we borrow the basic framework of generalized linear mixed models, use the fixed effect to describe the overall mean of claims over the collection of insurance contracts, and adopt the random effect to describe the deviation of the individual mean of a specific contract from the overall mean. We derive explicit expressions for the credibility formula under the Poisson and negative binomial assumptions and ensure that the credibility factor falls in the interval $[0, 1]$, which makes the credibility predictor easier to interpret. Third, we provide a numerical method that gives practitioners access to the optimal Bonus-Malus system based on our proposed credibility models, allowing them to adjust the premium in the next year based on claim experience.
The remainder of the article is organized as follows. Section 2 introduces our notation and describes the relationship between the linear mixed model and the Bühlmann-Straub model. The new formulas for the credibility predictor and credibility factor for claim frequency under the Poisson and negative binomial distribution assumptions are derived in Section 3. Section 4 provides implementation details for parameter estimation. In Section 5, we give a numerical example to show the benefit of our model compared to others in the literature. Some concluding remarks are given in Section 6.
2. Basic Assumption
Consider a portfolio of insurance contracts of $n$ insured, each with an available history of $T$ time periods. Denote by $y_{it}$ the number of claims for individual $i$ during time period $t$. Let $\Theta_i$ be the risk parameter that takes into account all the common characteristics of the individual risk, referred to as the potential individual characteristics in the Bühlmann-Straub model. Assume that, for every $i$, the counts $y_{i1}, \ldots, y_{iT}$ are conditionally independent given $\Theta_i$.
In credibility theory, the credibility predictor has the following linear form:
$$\hat{y}_{i,T+1} = Z_i \bar{y}_i + (1 - Z_i)\mu, \tag{1}$$
where $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$ is the average historical claims for individual $i$ and $\mu$ is the overall mean in the insurance portfolio. The credibility factor $Z_i$ is a weight assigned to the individual's own claims experience.
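As a minimal sketch of the linear form above, the following Python function evaluates the credibility predictor as a convex combination of the individual's average claims and the portfolio mean. The function name and the numerical inputs are hypothetical and are not taken from the paper's data.

```python
def credibility_premium(z: float, individual_mean: float, overall_mean: float) -> float:
    """Weight the individual's average claims by z and the portfolio mean by 1 - z."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("the credibility factor must lie in [0, 1]")
    return z * individual_mean + (1.0 - z) * overall_mean


# Hypothetical inputs: credibility factor 0.25, individual average of 0.6 claims
# per year, portfolio mean of 0.2 claims per year.
premium = credibility_premium(0.25, 0.6, 0.2)
print(round(premium, 4))
```

A factor of 0 returns the portfolio mean and a factor of 1 returns the individual's own average, which is why a factor outside $[0, 1]$ is hard to interpret.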
The Bühlmann-Straub model can be considered a special case of linear mixed models [13]. The credibility predictor is equivalent to the predictor of a linear mixed model with only an intercept and a random effect. In actuarial practice, the parameters of the Bühlmann-Straub model are generally estimated using the method of moments, while in linear mixed models one uses the method of maximum likelihood. The resulting estimates may differ, in particular for the credibility factor and the credibility predictor.
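Since the numbers of claims are discrete data, the implicit normal distribution assumption in linear mixed models is not appropriate. Therefore, we generalize the normal distribution to include the Poisson and negative binomial distributions, which leads to generalized linear mixed models. For an individual $i$ in year $t$, given the risk parameter $\Theta_i$, let the conditional random variable $y_{it} \mid \Theta_i$ follow a distribution from the exponential family with probability density function
$$f(y_{it} \mid \Theta_i) = \exp\left\{\frac{y_{it}\theta_{it} - \psi(\theta_{it})}{\phi} + c(y_{it}, \phi)\right\}, \tag{2}$$
where $\phi$ is the dispersion parameter, $\theta_{it}$ is the natural parameter, and $\psi(\cdot)$ and $c(\cdot)$ are known functions. The mean and variance of the conditional random variable can be expressed as
$$E[y_{it} \mid \Theta_i] = \mu_{it} = \psi'(\theta_{it}), \qquad \mathrm{Var}(y_{it} \mid \Theta_i) = \phi\,\psi''(\theta_{it}) = \phi V(\mu_{it}), \tag{3}$$
where $V(\cdot)$ is the variance function.
In the generalized linear mixed models framework with only an intercept (similar to the credibility model), we can model the relationship between the conditional mean $\mu_{it}$, the fixed effect $\beta_0$, and an unobserved random effect $b_i$ with a link function $g(\cdot)$ given by
$$g(\mu_{it}) = \beta_0 + b_i. \tag{4}$$
In actuarial applications, the log link function is usually used: that is,
$$\log \mu_{it} = \beta_0 + b_i, \qquad \mu_{it} = \mu\exp(b_i), \tag{5}$$
where $\mu = \exp(\beta_0)$ represents the overall mean over the whole insurance portfolio. The prediction of the number of claims for an individual in the next year can be obtained from the adjustment factor $\exp(b_i)$.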
To ensure that a priori ratemaking is correct on average, we need to apply the following constraint on the random effect $b_i$ in (5): $E[\exp(b_i)] = 1$. The a priori premium is given by
$$E[y_{it}] = \exp(\beta_0)\,E[\exp(b_i)] = \exp(\beta_0) = \mu.$$
In (5), if the random effect, $b_i$, follows a normal distribution with zero mean, $b_i \sim N(0, \sigma_b^2)$, then these models are generalized linear mixed models with only an intercept. However, under the constraint $E[\exp(b_i)] = 1$, the random effect follows a normal distribution with a nonzero mean, $b_i \sim N(-\sigma_b^2/2, \sigma_b^2)$. Under this assumption, the estimator of the overall mean is $\exp(\hat{\beta}_0)$ and the estimator for the $i$th individual is $\exp(\hat{\beta}_0 + \hat{b}_i)$. The estimator for the $i$th individual can therefore be expressed as a ratio adjustment to the overall mean with the adjusting factor $\exp(\hat{b}_i)$.
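The effect of the constraint can be checked with a short calculation. The sketch below assumes a normally distributed random effect with mean `m` and variance `s2` (both names are ours), for which the exponential transformation is lognormal with mean exp(m + s2/2). Choosing m = -s2/2 makes that mean equal to 1, whereas a zero-mean random effect would inflate the a priori premium. The variance value is hypothetical.

```python
import math


def lognormal_mean(m: float, s2: float) -> float:
    """Mean of exp(b) when b ~ N(m, s2)."""
    return math.exp(m + s2 / 2.0)


def constrained_mean(s2: float) -> float:
    """Mean of b that enforces E[exp(b)] = 1."""
    return -s2 / 2.0


s2 = 0.8  # hypothetical random-effect variance
print(lognormal_mean(0.0, s2))                   # zero-mean effect: factor above 1
print(lognormal_mean(constrained_mean(s2), s2))  # constrained effect: factor equal to 1
```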
A number of commonly used probability distributions (discrete or continuous) follow the form given in (2); these include the normal, Poisson, binomial, gamma, inverse Gaussian, and geometric distributions. In this paper, we focus on two commonly used distributions to model the number of claims, namely, the Poisson and negative binomial distributions.
3. Extended Credibility Model
The random variable, $y_{it}$, represents the number of claims for the $i$th individual insurance contract in the $t$th year. The credibility model is aimed at predicting the next year's loss, $y_{i,T+1}$, based on the historical observations $y_{i1}, \ldots, y_{iT}$. The estimator, $\hat{y}_{i,T+1}$, for the next year in a credibility model is a function of the historical claims: that is, $\hat{y}_{i,T+1} = h(y_{i1}, \ldots, y_{iT})$. In the greatest-accuracy credibility approach [2], the credibility estimator is constrained to be linear in the historical observations, $\hat{y}_{i,T+1} = a_0 + \sum_{t=1}^{T} a_t y_{it}$, and the coefficients are estimated under the quadratic loss function: that is, find $a_0, a_1, \ldots, a_T$ such that the expected squared difference between $y_{i,T+1}$ and $\hat{y}_{i,T+1}$ is minimum:
$$\min_{a_0, a_1, \ldots, a_T} E\left[\left(y_{i,T+1} - a_0 - \sum_{t=1}^{T} a_t y_{it}\right)^2\right].$$
Unlike the estimation method in the Bühlmann-Straub model, we estimate the required parameters using the framework of generalized linear mixed models. First, we present the following result as a lemma.
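Lemma 1. Let the random variable $y_{it}$ denote the number of claims, with conditional mean $\mu_{it} = \mu\exp(b_i)$ given the random effect $b_i$, dispersion parameter $\phi$, and variance function $V(\cdot)$. In the framework of generalized linear mixed models, its expectation and variance are given by
$$E[y_{it}] = \mu, \tag{9}$$
$$\mathrm{Var}(y_{it}) = \phi\,E[V(\mu_{it})] + \mu^2\sigma^2. \tag{10}$$
The credibility estimator for individual $i$ in period $T+1$ is given by
$$\hat{y}_{i,T+1} = \frac{\phi\,E[V(\mu_{it})]\,\mu + \mu^2\sigma^2\sum_{t=1}^{T} y_{it}}{\phi\,E[V(\mu_{it})] + T\mu^2\sigma^2}, \tag{11}$$
where $\sigma^2 = \mathrm{Var}(\exp(b_i))$ denotes the variance of the exponential transformation of the random effect and $\mu = \exp(\beta_0)$ denotes the overall mean over the insurance contracts.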
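Proof. The quadratic loss function of the credibility predictor is constructed as follows:
$$Q = E\left[\left(y_{i,T+1} - a_0 - \sum_{t=1}^{T} a_t y_{it}\right)^2\right]. \tag{12}$$
Equating the partial derivatives of $Q$ with respect to $a_0$ and $a_s$, $s = 1, \ldots, T$, to 0 yields
$$E[y_{i,T+1}] = a_0 + \sum_{t=1}^{T} a_t E[y_{it}], \tag{13}$$
$$E[y_{i,T+1}\,y_{is}] = a_0 E[y_{is}] + \sum_{t=1}^{T} a_t E[y_{it}\,y_{is}]. \tag{14}$$
We substitute (13) into (14) to get
$$\mathrm{Cov}(y_{i,T+1}, y_{is}) = \sum_{t=1}^{T} a_t\,\mathrm{Cov}(y_{it}, y_{is}). \tag{15}$$
Using the definition of the expectation of the exponential family, we substitute (9) into (15). Because the claim counts are conditionally independent given $b_i$, the left side of (15) becomes
$$\mathrm{Cov}(y_{i,T+1}, y_{is}) = \mathrm{Var}(\mu\exp(b_i)) = \mu^2\sigma^2. \tag{16}$$
The right side of (15) becomes
$$\sum_{t \neq s} a_t\,\mu^2\sigma^2 + a_s\,\mathrm{Var}(y_{is}). \tag{17}$$
Equating LHS and RHS, we get
$$\mu^2\sigma^2 = \mu^2\sigma^2\sum_{t \neq s} a_t + a_s\,\mathrm{Var}(y_{is}). \tag{18}$$
Using the definition of the variance of the exponential family, substituting (10) into (18), we get
$$\mu^2\sigma^2 = \mu^2\sigma^2\sum_{t=1}^{T} a_t + a_s\,\phi\,E[V(\mu_{is})]. \tag{19}$$
For each $s$, (19) becomes
$$a_s = \frac{\mu^2\sigma^2\left(1 - \sum_{t=1}^{T} a_t\right)}{\phi\,E[V(\mu_{is})]}. \tag{20}$$
Since $E[V(\mu_{is})]$ does not depend on $s$, this yields $a_s = a$ for all $s$.
Let $a_s = a$; (19) now becomes
$$\mu^2\sigma^2 = \left(T\mu^2\sigma^2 + \phi\,E[V(\mu_{it})]\right)a. \tag{21}$$
And then we find the final expression for $a$:
$$a = \frac{\mu^2\sigma^2}{T\mu^2\sigma^2 + \phi\,E[V(\mu_{it})]}. \tag{22}$$
Substituting this result in (13), we get
$$a_0 = \mu(1 - Ta) = \frac{\phi\,E[V(\mu_{it})]\,\mu}{T\mu^2\sigma^2 + \phi\,E[V(\mu_{it})]}. \tag{23}$$
This leads to the credibility predictor $\hat{y}_{i,T+1} = a_0 + a\sum_{t=1}^{T} y_{it}$ stated in (11).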
This suggests a more flexible modeling approach that takes the place of the Bühlmann credibility model. Note that the form of the credibility estimator in our extended credibility model varies with the expectation and variance function of the response variable $y_{it}$. In the following sections, we mainly discuss two special cases of extended credibility models for predicting the claim frequency: the extended Poisson credibility model and the extended negative binomial credibility model.
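To make the lemma concrete, the sketch below computes a credibility factor and predictor from three ingredients: the overall mean, the variance of the exponentiated random effect, and the expected conditional variance of the claim count. It assumes unit exposures and the usual Bühlmann weight T·v / (T·v + w), where v is the variance of the conditional means and w is the expected conditional variance. The function and argument names are hypothetical, and the inputs are illustrative rather than estimates from the paper.

```python
def credibility_factor(n_periods: int, mu: float, sigma2: float,
                       expected_cond_var: float) -> float:
    """Buhlmann-type weight on the individual's own experience."""
    between = mu ** 2 * sigma2  # variance of the conditional means mu * exp(b)
    return n_periods * between / (n_periods * between + expected_cond_var)


def credibility_predictor(counts, mu: float, sigma2: float,
                          expected_cond_var: float) -> float:
    """Predicted claim count for the next period from past yearly counts."""
    if not counts:
        raise ValueError("at least one period of claims history is required")
    z = credibility_factor(len(counts), mu, sigma2, expected_cond_var)
    individual_mean = sum(counts) / len(counts)
    return z * individual_mean + (1.0 - z) * mu


# Hypothetical values: overall mean 0.3, Var(exp(b)) = 0.5, and an expected
# conditional variance of 0.3 (the Poisson case, where it equals the mean).
z = credibility_factor(3, 0.3, 0.5, 0.3)
print(z, credibility_predictor([0, 1, 0], 0.3, 0.5, 0.3))
```

Because both terms in the denominator are positive, the weight always lies strictly between 0 and 1 under these assumptions.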
3.1. Extended Poisson Credibility Model
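Let the number of claims follow a Poisson distribution, that is, $y_{it} \mid b_i \sim \mathrm{Poisson}(\mu_{it})$ with $\mu_{it} = \mu\exp(b_i)$ for the random effect $b_i$, and the probability mass function is given by
$$f(y_{it} \mid b_i) = \frac{\mu_{it}^{y_{it}}\,e^{-\mu_{it}}}{y_{it}!}. \tag{24}$$
In this case, the variance function for a Poisson distribution is $V(\mu_{it}) = \mu_{it}$ with dispersion $\phi = 1$, so the expectation and variance of $y_{it}$ take the exact form
$$E[y_{it}] = \mu, \tag{25}$$
$$\mathrm{Var}(y_{it}) = \mu + \mu^2\sigma^2. \tag{26}$$
Substituting the above expectation and variance of $y_{it}$ into (11), the credibility predictor can be simplified as
$$\hat{y}_{i,T+1} = \mu\,\frac{1 + \sigma^2\sum_{t=1}^{T} y_{it}}{1 + T\mu\sigma^2}, \tag{27}$$
where $\sigma^2 = \mathrm{Var}(\exp(b_i))$ and the overall mean is $\mu = \exp(\beta_0)$.
Note that this credibility predictor can be decomposed into the familiar form of the Bühlmann credibility predictor, a weighted average of the overall mean over the collection of insurance contracts and the averaged experience loss of the individual contract:
$$\hat{y}_{i,T+1} = (1 - Z_i)\mu + Z_i\bar{y}_i, \tag{28}$$
where $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$ and the credibility factor is
$$Z_i = \frac{T\mu\sigma^2}{1 + T\mu\sigma^2}. \tag{29}$$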
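The decomposition can be verified numerically. The sketch below assumes the Poisson predictor takes the form μ(1 + σ²Σy)/(1 + Tμσ²), which follows from the Bühlmann weights when the conditional variance equals the conditional mean, and checks that it coincides with the weighted-average form. Function names and input values are hypothetical.

```python
def poisson_credibility_factor(n_periods: int, mu: float, sigma2: float) -> float:
    """Credibility factor T*mu*sigma2 / (1 + T*mu*sigma2)."""
    x = n_periods * mu * sigma2
    return x / (1.0 + x)


def poisson_credibility_predictor(counts, mu: float, sigma2: float) -> float:
    """Direct form of the predictor: mu * (1 + sigma2 * total) / (1 + T*mu*sigma2)."""
    return mu * (1.0 + sigma2 * sum(counts)) / (1.0 + len(counts) * mu * sigma2)


counts = [0, 1, 0]       # hypothetical three-year claim history
mu, sigma2 = 0.3, 0.5    # hypothetical overall mean and Var(exp(b))

z = poisson_credibility_factor(len(counts), mu, sigma2)
direct = poisson_credibility_predictor(counts, mu, sigma2)
weighted = z * (sum(counts) / len(counts)) + (1.0 - z) * mu
print(z, direct, weighted)  # the last two values agree
```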
3.2. Extended Negative Binomial Credibility Model
Assume the number of claims follows a negative binomial distribution; then ; then the probability density function is given by
In this case, the variance is given by and the expectation and variance of will take the exact form given by
Substituting the above expectation and variance into (11), the credibility predictor can be simplified aswhere and the overall mean .
The extended negative binomial credibility predictor simplifies to the expression we found in Section 2.where the credibility factor is given by
It is clear from (29) and (34) that the extended claim frequency models we proposed result in a credibility factor that lies within the interval $[0, 1]$.
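The bound on the negative binomial credibility factor can be illustrated under an assumed parametrization. The sketch below takes the conditional variance to be m + k·m² for conditional mean m and a dispersion parameter k; this is a common choice, and it is our assumption rather than necessarily the one used in the paper. With m = μ·exp(b), E[exp(b)] = 1 and Var(exp(b)) = σ², the expected conditional variance is μ + kμ²(1 + σ²). All names and values are hypothetical.

```python
def negbin_expected_cond_var(mu: float, sigma2: float, k: float) -> float:
    """E[m + k*m**2] for m = mu*exp(b), with E[exp(b)] = 1 and Var(exp(b)) = sigma2."""
    return mu + k * mu ** 2 * (1.0 + sigma2)


def negbin_credibility_factor(n_periods: int, mu: float, sigma2: float, k: float) -> float:
    """Buhlmann-type weight under the assumed negative binomial variance."""
    between = mu ** 2 * sigma2
    within = negbin_expected_cond_var(mu, sigma2, k)
    return n_periods * between / (n_periods * between + within)


mu, sigma2 = 0.3, 0.5  # hypothetical overall mean and Var(exp(b))
for k in (0.0, 0.5, 2.0):
    # k = 0 recovers the Poisson case; larger k puts less weight on experience.
    print(k, negbin_credibility_factor(3, mu, sigma2, k))
```

Under this assumption, extra conditional dispersion lowers the weight given to the individual's own claims, and the factor stays strictly between 0 and 1 for any positive inputs.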
3.3. Relationship between Extended Model and Bayesian Credibility Model
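In Bayesian credibility theory, the Poisson-gamma credibility model is used to predict claim frequency [14]. It assumes that the numbers of claims follow a Poisson distribution with mean parameter $\lambda_i$, $y_{it} \mid \lambda_i \sim \mathrm{Poisson}(\lambda_i)$, and $\lambda_i \sim \mathrm{Gamma}(\alpha, \tau)$ with shape $\alpha$ and rate $\tau$. The credibility estimator in the Poisson-gamma credibility model is given by the expectation of the posterior distribution
$$\hat{y}_{i,T+1} = E[\lambda_i \mid y_{i1}, \ldots, y_{iT}] = \frac{\alpha + \sum_{t=1}^{T} y_{it}}{\tau + T} = Z\bar{y}_i + (1 - Z)\frac{\alpha}{\tau},$$
where the credibility factor in the Poisson-gamma credibility model is
$$Z = \frac{T}{T + \tau}.$$
In the extended Poisson credibility model, if the risk parameter and the random effect satisfy $\lambda_i = \mu\exp(b_i)$ and $\exp(b_i) \sim \mathrm{Gamma}(\alpha, \alpha)$, so that $E[\exp(b_i)] = 1$ and $\sigma^2 = \mathrm{Var}(\exp(b_i)) = 1/\alpha$, then the credibility factor in Section 3.1 can be expressed in a form similar to the credibility factor in the Poisson-gamma credibility model:
$$Z = \frac{T\mu\sigma^2}{1 + T\mu\sigma^2} = \frac{T}{T + \alpha/\mu}.$$
This is identical to the Bayesian Poisson-gamma credibility model with $\tau = \alpha/\mu$.
Furthermore, if the random effect is normally distributed, $b_i \sim N(-\sigma_b^2/2, \sigma_b^2)$, then $\sigma^2 = \exp(\sigma_b^2) - 1$ and the Poisson model credibility factor can be rewritten as
$$Z = \frac{T\mu\left(\exp(\sigma_b^2) - 1\right)}{1 + T\mu\left(\exp(\sigma_b^2) - 1\right)}.$$
In this case the model reduces to the Poisson-lognormal model for claims count data [15].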
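A quick numerical check of both special cases is sketched below. It assumes a gamma-distributed multiplicative effect with shape and rate both equal to `alpha` (mean 1, variance 1/alpha), so that the claim rate μ·exp(b) is gamma with shape `alpha` and rate `alpha / mu`, and a normal random effect with variance `s2` for the lognormal case. The names and values are hypothetical.

```python
import math


def poisson_gamma_factor(n_periods: int, rate: float) -> float:
    """Bayesian Poisson-gamma credibility factor T / (T + rate)."""
    return n_periods / (n_periods + rate)


def extended_poisson_factor(n_periods: int, mu: float, sigma2: float) -> float:
    """Extended Poisson credibility factor T*mu*sigma2 / (1 + T*mu*sigma2)."""
    x = n_periods * mu * sigma2
    return x / (1.0 + x)


def lognormal_sigma2(s2: float) -> float:
    """Var(exp(b)) when b ~ N(-s2/2, s2), i.e. exp(s2) - 1."""
    return math.expm1(s2)


T, mu, alpha = 3, 0.3, 2.0  # hypothetical values
print(poisson_gamma_factor(T, alpha / mu))            # Bayesian form
print(extended_poisson_factor(T, mu, 1.0 / alpha))    # extended form, same value
print(extended_poisson_factor(T, mu, lognormal_sigma2(0.4)))  # lognormal case
```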
4. Parameter Estimation in GLMMs Framework
In order to obtain the credibility estimator $\hat{y}_{i,T+1}$, one has to obtain estimates of the variance component, $\sigma^2$, and of the overall mean, $\mu$, in the extended Poisson credibility model. In addition, one has to estimate the scale parameter in the extended negative binomial credibility model. The estimator of the variance component, $\sigma^2$, is determined by the variance of the random effect, while $\mu$ can be obtained as the exponential transformation of the estimated fixed effect in the framework of generalized linear mixed models. The joint probability density function for the model can be expressed as $f(y_i, b_i) = f(y_i \mid b_i)\,f(b_i)$. The parameters of the model can be estimated from the marginal log-likelihood function
$$\ell = \sum_{i=1}^{n} \log \int \prod_{t=1}^{T} f(y_{it} \mid b_i)\,f(b_i)\,db_i,$$
where $f(y_{it} \mid b_i)$ denotes the probability distribution of the response variable and $f(b_i)$ denotes the density function of the random effect.
The integration required in the log-likelihood function is quite complex. Thus, we apply a numerical approximation algorithm, adaptive Gaussian quadrature, which also enables us to apply the likelihood ratio test in our model [16]. While a number of software packages are available to solve this problem, we use the NLMIXED procedure in SAS for convenience.
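To show what the marginal likelihood involves, the sketch below integrates the random effect out for a Poisson model with a normal random intercept. It deliberately uses a plain trapezoidal rule on a fixed grid instead of adaptive Gaussian quadrature, and it is not the SAS NLMIXED procedure. It assumes a random effect distributed as N(-s2/2, s2) so that the exponentiated effect has mean 1. All function names, the grid settings, and the toy portfolio are hypothetical.

```python
import math


def poisson_logpmf(y: int, lam: float) -> float:
    """Log of the Poisson probability mass function."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)


def marginal_likelihood(counts, mu: float, s2: float,
                        n_grid: int = 2001, width: float = 8.0) -> float:
    """Integrate prod_t Poisson(y_t; mu*exp(b)) against the N(-s2/2, s2) density of b."""
    m, s = -s2 / 2.0, math.sqrt(s2)
    lo, hi = m - width * s, m + width * s
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for j in range(n_grid):
        b = lo + j * h
        lam = mu * math.exp(b)
        log_f = sum(poisson_logpmf(y, lam) for y in counts)
        log_phi = -0.5 * math.log(2.0 * math.pi * s2) - (b - m) ** 2 / (2.0 * s2)
        weight = 0.5 if j in (0, n_grid - 1) else 1.0  # trapezoid end points
        total += weight * math.exp(log_f + log_phi)
    return total * h


def marginal_loglik(portfolio, mu: float, s2: float) -> float:
    """Sum of log marginal likelihoods over policyholders."""
    return sum(math.log(marginal_likelihood(c, mu, s2)) for c in portfolio)


# Toy portfolio: two policyholders with three yearly claim counts each.
print(marginal_loglik([[0, 0, 0], [1, 0, 2]], mu=0.3, s2=0.5))
```

Maximizing this function over `mu` and `s2` with a numerical optimizer would give likelihood-based estimates; adaptive quadrature reaches comparable accuracy with far fewer evaluation points by centering and scaling the nodes for each policyholder.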
The parameters in both the Poisson and negative binomial credibility models include the fixed effect, $\beta_0$, and the variance component of the random effect, $\sigma_b^2$; the negative binomial credibility model additionally includes the scale parameter. Since the random effect in our model follows a normal distribution $N(-\sigma_b^2/2, \sigma_b^2)$, the exponential transformation of the random effect follows a lognormal distribution, which leads to the parameter $\sigma^2 = \exp(\sigma_b^2) - 1$. Finally, the credibility predictor and credibility factor in (11) can be obtained by substituting the estimates of $\mu$, $\sigma^2$, and the scale parameter into the extended credibility model (28).
5. Empirical Study
In this section, we implement the estimation procedures from Section 4 and show how to use the resulting estimates to produce the credibility predictor and credibility factor under both the Poisson and the negative binomial distributions from Section 3. We used a claims dataset collected at a Chinese auto insurance company. It is a balanced longitudinal dataset and contains claims information from calendar years 2006 to 2008. The dataset contains 9712 policyholders who stayed with the company for the complete three-year period, resulting in 36748 insurance contracts, and the risk exposure of each contract is 1. The mean number of claims is 0.2884783 and the variance is 0.4072942, which implies overdispersion. Table 1 presents the claim frequency distribution over time. In each year, the number of claims has a significant fraction of zeros. This is consistent with insurance practice, where insurers manage the risk pool through the diversification effect.

The estimates for parameters are and in extended Poisson credibility model and , , and in extended negative binomial credibility model. To demonstrate the advantages of our model, we also compare the results of the proposed credibility models with the linear mixed model (LMM), which has exactly the same form as the Bühlmann-Straub model following Frees (1998), and with optimal Bonus-Malus systems using finite mixture models following Tzougas et al. [5]. The competing models produce different results.
We compare the goodness of fit of the competing models using the AIC and BIC statistics based on the sample. Table 2 reports that our proposed models perform the best, and that the AIC and BIC values under the Poisson and negative binomial distributions are almost the same, which indicates that there is no significant difference between our two extended credibility models. The linear mixed model (Bühlmann-Straub model) performs worst, which indicates that it may not be appropriate for our dataset. This is not surprising because, as mentioned earlier, the Bühlmann-Straub model implicitly assumes a normal distribution for the response variable. We also fit this dataset by implementing the EM algorithm to establish finite mixture models under the Poisson and negative binomial distributions. The results show that finite mixture models perform better than the linear mixed model but clearly worse than our proposed credibility models. Table 2 also summarizes the predictive ability of the competing models by comparing the mean squared prediction error (MSPE). The MSPE is calculated on a holdout sample of 764 policyholders in year 2008. The extended credibility model under the Poisson distribution shows the lowest MSPE, which indicates that it has the best predictive ability.
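For reference, the three comparison statistics can be computed as in the sketch below, using the standard definitions AIC = 2p - 2ℓ and BIC = p·log(n) - 2ℓ for a model with p parameters, n observations, and maximized log-likelihood ℓ, and MSPE as the average squared difference between holdout observations and predictions. The function names and the inputs are hypothetical and are not the values in Table 2.

```python
import math


def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion: lower is better."""
    return 2.0 * n_params - 2.0 * loglik


def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2.0 * loglik


def mspe(observed, predicted) -> float:
    """Mean squared prediction error on a holdout sample."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical fit: log-likelihood -100 with 2 parameters on 50 observations.
print(aic(-100.0, 2), bic(-100.0, 2, 50))
# Hypothetical holdout counts and predictions.
print(mspe([0, 1, 2], [0.5, 0.5, 1.0]))
```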

As credibility models can be used to establish the optimal Bonus-Malus system (BMS), we derive the optimal BMS from our extended credibility models following Lemaire [17] and Frangos and Vrontos [4]. The BMS is defined from (27) and (33) and is presented in Tables 3 and 4. This BMS can be considered generous with good drivers and strict with bad drivers. Take the optimal BMS based on the Poisson distribution as an example: the bonus given for the first claim-free year is 12% of the basic premium, while drivers who have three accidents in the first year have to pay a malus of 373% of the basic premium. The optimal BMS based on the negative binomial credibility model differs in both reward and punishment: the bonus given for the first claim-free year is 15% of the basic premium, and drivers who have three accidents in the first year have to pay a malus of 292% of the basic premium. The result of the optimal BMS based on the Bühlmann-Straub model is given in Table 5. Compared to our extended credibility models, the optimal BMS based on the Bühlmann-Straub model has milder bonuses and maluses.
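A BMS table of this kind can be sketched by expressing the credibility predictor relative to the basic premium. The example below assumes the extended Poisson predictor μ(1 + σ²K)/(1 + tμσ²) for a driver with K total claims after t years, so the premium relativity is (1 + σ²K)/(1 + tμσ²). The parameter values are hypothetical, not the paper's estimates, so the output does not reproduce Tables 3 and 4.

```python
def bms_relativity(years: int, claims: int, mu: float, sigma2: float) -> float:
    """Premium after `years` years and `claims` total claims, relative to the basic premium."""
    return (1.0 + sigma2 * claims) / (1.0 + years * mu * sigma2)


mu, sigma2 = 0.3, 0.5  # hypothetical overall mean and Var(exp(b))

# Rows are years of experience, columns are total claims 0..3; entries are
# percentages of the basic premium.
print("year", *[f"K={k}" for k in range(4)], sep="\t")
for t in range(1, 4):
    row = [round(100.0 * bms_relativity(t, k, mu, sigma2), 1) for k in range(4)]
    print(t, *row, sep="\t")
```

Each claim-free year lowers the relativity below 100%, and each additional claim raises it, which is the bonus and malus behaviour described above.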



6. Conclusion
The Bühlmann-Straub model is widely used in experience ratemaking, but it has a significant disadvantage: it incorporates an implicit normal distribution assumption, which is a poor model for discrete claim frequency. To address this problem, we assumed Poisson and negative binomial models, which are more appropriate distributions for the claim frequency than the normal distribution.
The extended credibility model we proposed has a more general credibility expression, which contains the Bayesian credibility model as a special case. When the exponential transformation of the random effect follows a gamma distribution, the extended credibility model reduces to the Poisson-gamma credibility model; when the random effect follows a normal distribution, the extended credibility model reduces to the Poisson-lognormal credibility model.
We have assumed that the claim frequency follows either a Poisson distribution or a negative binomial distribution and provided a new approach to the credibility model. The new formulas for the credibility predictor and credibility factor for claim frequency under the Poisson and negative binomial distribution assumptions are derived, and the optimal Bonus-Malus system is modeled using a claims dataset collected at a Chinese auto insurance company in the years 2006–2008. The empirical results show that the goodness-of-fit statistics (AIC and BIC) of our proposed credibility models are much lower than those of the linear mixed model (Bühlmann-Straub model) and the finite mixture model in the literature, which implies that our proposed models fit the data well.
Compared to the Bühlmann-Straub model, the optimal Bonus-Malus system based on our extended credibility models imposes more severe maluses and grants larger bonuses. In addition, compared to the generalized linear mixed model under the assumption of a Poisson or negative binomial distribution, the extended claim frequency credibility model not only yields an explicit credibility factor but also ensures that the credibility factor falls in the range from 0 to 1, whereas the generalized linear mixed model cannot be applied in the optimal Bonus-Malus system.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The research is supported by the Fundamental Research Funds for the Central Universities in UIBE (CXTD904) and the National Natural Science Foundation Project of China (no. 71303045).
References
[1] W. Jewell, “Credible means are exact Bayesian for exponential families,” ASTIN Bulletin, vol. 8, no. 1, pp. 77–90, 1974.
[2] H. Bühlmann, “Experience rating and credibility,” ASTIN Bulletin, vol. 4, no. 3, pp. 199–207, 1967.
[3] S. A. Klugman, H. H. Panjer, and G. E. Willmot, Loss Models: From Data to Decisions, John Wiley and Sons, Inc, New Jersey, NJ, USA, 4th edition, 2012.
[4] N. E. Frangos and S. D. Vrontos, “Design of optimal bonus-malus systems with a frequency and a severity component on an individual basis in automobile insurance,” ASTIN Bulletin, vol. 31, no. 1, pp. 1–22, 2001.
[5] G. Tzougas, S. Vrontos, and N. Frangos, “Optimal bonus-malus systems using finite mixture models,” ASTIN Bulletin, vol. 44, no. 2, pp. 417–444, 2014.
[6] E. W. Frees, V. R. Young, and Y. Luo, “A longitudinal data analysis interpretation of credibility models,” Insurance: Mathematics and Economics, vol. 24, no. 3, pp. 229–247, 1999.
[7] E. Gómez-Déniz, “A generalization of the credibility theory obtained by using the weighted balanced loss function,” Insurance: Mathematics and Economics, vol. 42, no. 2, pp. 850–854, 2008.
[8] E. Gómez-Déniz, “Some Bayesian credibility premiums obtained by using posterior regret γ-minimax methodology,” Bayesian Analysis, vol. 4, no. 2, pp. 223–242, 2009.
[9] L. Hong and R. Martin, “A flexible Bayesian nonparametric model for predicting future insurance claims,” North American Actuarial Journal, vol. 21, no. 2, pp. 228–241, 2017.
[10] X. Cai, L. Wen, X. Wu, and X. Zhou, “Credibility estimation of distribution functions with applications to experience rating in general insurance,” North American Actuarial Journal, vol. 19, no. 4, pp. 311–335, 2015.
[11] A. T. Payandeh, “A new approach to the credibility formula,” Insurance: Mathematics and Economics, vol. 46, no. 2, pp. 334–338, 2010.
[12] K. Antonio and J. Beirlant, “Actuarial statistics with generalized linear mixed models,” Insurance: Mathematics and Economics, vol. 40, no. 1, pp. 58–76, 2007.
[13] H. Bühlmann and A. Gisler, A Course in Credibility Theory and Its Applications, Springer, New York, NY, USA, 2005.
[14] M. Shengwang, “Optimal bonus-malus systems accounting for risk characteristics of individual policies,” Journal of Applied Statistics and Management, vol. 32, no. 3, pp. 505–510, 2013.
[15] M. Denuit, Actuarial Modelling of Claim Counts: Risk Classification, Credibility and Bonus-Malus Systems, John Wiley and Sons, New Jersey, NJ, USA, 2007.
[16] J. C. Pinheiro and D. M. Bates, “Approximations to the log-likelihood function in the nonlinear mixed-effects model,” Journal of Computational and Graphical Statistics, vol. 4, no. 1, pp. 12–35, 1995.
[17] J. Lemaire, “Bonus-malus systems in automobile insurance,” Insurance Mathematics and Economics, vol. 3, no. 16, pp. 277–312, 1995.
Copyright
Copyright © 2018 Yuantao Xie et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.