
Research Article | Open Access

Volume 2019 |Article ID 1393946 | https://doi.org/10.1155/2019/1393946

Apollinaire Woundjiagué, Martin Le Doux Mbele Bidima, Ronald Waweru Mwangi, "An Estimation of a Hybrid Log-Poisson Regression Using a Quadratic Optimization Program for Optimal Loss Reserving in Insurance", Advances in Fuzzy Systems, vol. 2019, Article ID 1393946, 14 pages, 2019. https://doi.org/10.1155/2019/1393946

# An Estimation of a Hybrid Log-Poisson Regression Using a Quadratic Optimization Program for Optimal Loss Reserving in Insurance

Revised 07 Mar 2019
Accepted 20 May 2019
Published 16 Jun 2019

#### Abstract

In this article, we develop an alternative method for estimating the parameters of the hybrid log-Poisson regression model. In our previous paper, we proposed a hybrid log-Poisson regression model and derived the analytical expression of its fuzzy parameters. We found that the hybrid model provides better results than the classical log-Poisson regression model according to the mean square error of prediction and the goodness of fit index. However, we did not take into account the optimal value of $h$ ($h$-cut), which is of the greatest importance in the fuzzy regression literature. In this paper, we provide an alternative estimation method for our hybrid model using a quadratic optimization program and the optimized value of $h$ ($h$-cut). The expected value of a fuzzy number is used as the defuzzification procedure to move from fuzzy values to crisp values. We apply the hybrid model with the suggested alternative estimation to two different numerical datasets to predict incremental payments in loss reserving. Based on the mean square error of prediction, we show that the new hybrid model, estimated with an optimized value of $h$, predicts incremental payments better than the classical log-Poisson regression model as well as the same hybrid model with analytical estimation of parameters. Hence we have optimized the outstanding loss reserves.

#### 1. Introduction

“An important role of a non-life actuary is the calculation of provisions, mainly IBNR (incurred but not reported) reserve. Then, finding the fair value of loss reserve is a relevant topic for non-life actuaries. Indeed, insurance companies must simultaneously have enough reserves to meet their commitment to policyholders and have enough funds for their investments. Therefore several methods have been proposed in actuarial science literature to capture this fair value” [1].

“In one hand, we distinguish deterministic methods [2–4]. They provide crisp predictions for reserves. In [sic] the other hand, [5–8] present stochastic methods. Those methods don’t give only a crisp value of the reserves but provide also their variability. For more details on existing loss reserving methods, one can consult [9–19]” [1].

However, [20] reports cases in which stochastic methods can give unrealistic estimates. For example, when the claims are related to bodily injuries, the future losses for the company will depend on the growth of the wage index, which helps to determine the amount of indemnity, and also on changes in court practices and public awareness of liability matters. The information is then vague, and the use of Fuzzy Set Theory becomes very attractive in such cases. Hence one should think about models that can handle both fuzziness and randomness, namely, hybrid models.

In [1], we showed that the classical log-Poisson regression model can be improved through a hybrid one whose parameters are derived in analytical form. Although the new model provides better results than the classical log-Poisson regression model (according to the mean square error of prediction (MSEP) and the goodness of fit index), the value of the MSEP remains large. This could be due to the choice of the $h$ value, which is important in the fuzzy regression framework. The purpose of this paper is to provide an alternative estimation that takes the optimized $h$ value into account. We will show that the optimized $h$ value is of the greatest importance, because the resulting MSEP is very low compared to the MSEP obtained from the analytical estimation of the hybrid model.

In this paper, we investigate the possibilistic approach to estimating the hybrid model, i.e., we estimate the asymmetric triangular fuzzy coefficients (ATFC) of the model through a quadratic optimization program, taking into account an optimized value of $h$ ($h$-cut) in the loss reserving framework. We investigated the fuzzy least-squares approach in our previous paper [1]. To move from fuzzy numbers (FN) to crisp values, we shall use the expected value of a FN.

Our objective is therefore to propose a new estimation method for the fuzzy parameters of the hybrid log-Poisson model in which the optimized $h$ value is taken into account. Using two different datasets, we show that our hybrid model with the new estimation method provides better predictions of reserves than the classical log-Poisson model according to the MSEP criterion.

The structure of the paper is as follows. Section 2 presents the preliminaries on fuzzy sets and their properties. Section 3 reviews some models and results on fuzzy regression models (FRM). Section 4 introduces the framework for estimating the loss reserve with log-Poisson regression. Our contribution starts in Section 5, where we propose a new estimation method for the hybrid log-Poisson regression model [1] for loss reserving through a quadratic optimization program that takes the optimized $h$ value into account, and we demonstrate its relevance on two datasets. Then we conclude the article.

#### 2. Preliminaries on Fuzzy Sets and Their Properties

In this section, we review some concepts related to our research, namely the concepts of fuzzy set, membership function, fuzzy number (FN), and fuzzy regression model (FRM).

##### 2.1. Review of Some Definitions and Properties of Fuzzy Sets

Definition 1 (from [21]). Let $X$ be a nonempty set and $A \subseteq X$. In classical set theory, the subset $A$ of $X$ can be defined by its characteristic function $\chi_A$ as a mapping from the elements of $X$ to the elements of the set $\{0, 1\}$,

$$\chi_A : X \to \{0, 1\}.$$

This mapping may be represented as a set of ordered pairs, with exactly one ordered pair present for each element of $X$. The first element of the ordered pair is an element of the set $X$, and the second element is an element of the set $\{0, 1\}$. The value zero is used to represent nonmembership, and the value one is used to represent membership. The truth or falsity of the statement “$x$ is in $A$” is determined by the ordered pair $(x, \chi_A(x))$. The statement is true if the second element of the ordered pair is 1, and the statement is false if it is 0.
Similarly, a fuzzy subset (also called fuzzy set) $\tilde{A}$ of a set $X$ can be defined as a set of ordered pairs, each with the first element from $X$ and the second element from the interval $[0, 1]$, with exactly one ordered pair present for each element of $X$. This defines a mapping $\mu_{\tilde{A}}$ called the membership function.

Definition 2 (from [21]). The membership function of a fuzzy set $\tilde{A}$, denoted by $\mu_{\tilde{A}}$, is defined by

$$\mu_{\tilde{A}} : X \to [0, 1],$$

where $\mu_{\tilde{A}}(x)$ is typically interpreted as the membership degree of element $x$ in the fuzzy set $\tilde{A}$.
The degree to which the statement “$x$ is in $\tilde{A}$” is true is determined by finding the ordered pair $(x, \mu_{\tilde{A}}(x))$. The degree of truth of the statement is the second element of the ordered pair. A fuzzy set $\tilde{A}$ on $X$ can also be defined as a set of tuples

$$\tilde{A} = \{(x, \mu_{\tilde{A}}(x)) : x \in X\}$$

and can be represented graphically.

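Definition 3 (from [22]). Let $X$ be the set of objects and $h \in [0, 1]$. The $h$-cut of $\tilde{A}$ is the set $\tilde{A}_h$ defined by

$$\tilde{A}_h = \{x \in X : \mu_{\tilde{A}}(x) \ge h\}.$$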

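Definition 4 (from [23]).

(1) A FN is a fuzzy set $\tilde{A}$ of a universe $X$ (the real line $\mathbb{R}$) such that

(a) all its $h$-cuts are convex, which is equivalent to the fact that $\tilde{A}$ is convex, that is, $\forall x_1, x_2 \in \mathbb{R}$ and $\forall \lambda \in [0, 1]$, $\mu_{\tilde{A}}(\lambda x_1 + (1 - \lambda) x_2) \ge \min\{\mu_{\tilde{A}}(x_1), \mu_{\tilde{A}}(x_2)\}$;

(b) $\tilde{A}$ is normalized, that is, $\exists x_0 \in \mathbb{R}$ such that $\mu_{\tilde{A}}(x_0) = 1$;

(c) $\mu_{\tilde{A}}$ is a continuous membership function of bounded support, where $\mathbb{R}$ and $[0, 1]$ are equipped with the natural topology.

(2) A triangular fuzzy number (TFN) is a FN denoted by $\tilde{a} = (a, l_a, r_a)$, such that $l_a \ge 0$ and $r_a \ge 0$, with $a$ the centre of $\tilde{a}$, $l_a$ its left spread and $r_a$ its right spread [24]. A TFN can be defined by its membership function or by its $h$-level ($h$-cut) $a_h$ (see [23]), i.e.,

$$\mu_{\tilde{a}}(x) = \begin{cases} 1 - \dfrac{a - x}{l_a}, & a - l_a < x \le a, \\ 1 - \dfrac{x - a}{r_a}, & a < x \le a + r_a, \\ 0, & \text{otherwise}, \end{cases}$$

or

$$a_h = \left[a - l_a (1 - h),\; a + r_a (1 - h)\right], \quad h \in [0, 1].$$

(i) If $l_a = r_a$, then $\tilde{a}$ defines a symmetric TFN (STFN).

(ii) Otherwise $l_a \ne r_a$; then $\tilde{a}$ defines an asymmetric TFN (ATFN) (see Figure 1).

Notes and Comments. It is well known that if $\tilde{A}$ is a FN, then $\tilde{A}_h$, the $h$-level ($h$-cut) of $\tilde{A}$, is a compact set of $\mathbb{R}$, for all $h \in [0, 1]$.

Let us present some properties of TFN.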
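As an illustration of Definition 4, the following is a minimal sketch of a TFN with its membership function and $h$-cut. It assumes the common parameterization by a centre, a left spread and a right spread; the class name `TFN` and its method names are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFN:
    """Triangular fuzzy number (illustrative): centre, left spread, right spread."""
    centre: float
    left: float
    right: float

    def membership(self, x: float) -> float:
        """Membership degree of x: 1 at the centre, linear decay on each side."""
        if x == self.centre:
            return 1.0
        if self.centre - self.left < x < self.centre:
            return 1.0 - (self.centre - x) / self.left
        if self.centre < x < self.centre + self.right:
            return 1.0 - (x - self.centre) / self.right
        return 0.0

    def h_cut(self, h: float) -> tuple:
        """Closed interval of points whose membership degree is at least h."""
        if not 0.0 <= h <= 1.0:
            raise ValueError("h must lie in [0, 1]")
        return (self.centre - (1.0 - h) * self.left,
                self.centre + (1.0 - h) * self.right)

    def is_symmetric(self) -> bool:
        """True for a symmetric TFN (equal spreads), False for an asymmetric one."""
        return self.left == self.right


# Example: an asymmetric TFN centred at 10 with spreads 2 (left) and 4 (right).
a = TFN(centre=10.0, left=2.0, right=4.0)
```

With this example, `a.h_cut(0.5)` is the interval from 9 to 12, and both endpoints have membership degree 0.5, which is the defining property of the $h$-cut of a TFN.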

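Property 5 (from [25]). Let $\tilde{a} = (\tilde{a}_1, \ldots, \tilde{a}_n)$ be a vector of TFN such that $\tilde{a}_i = (a_i, l_{a_i}, r_{a_i})$, $i = 1, \ldots, n$, are TFN.

(1) If $\tilde{b}$ is obtained from a linear combination of the TFN $\tilde{a}_i$, $i = 1, \ldots, n$, i.e., $\tilde{b} = \sum_{i=1}^{n} k_i \tilde{a}_i$ with $k_i \in \mathbb{R}$, then $\tilde{b} = (b, l_b, r_b)$ is also a TFN, where

$$b = \sum_{i=1}^{n} k_i a_i, \qquad l_b = \sum_{k_i \ge 0} k_i l_{a_i} + \sum_{k_i < 0} |k_i| r_{a_i}, \qquad r_b = \sum_{k_i \ge 0} k_i r_{a_i} + \sum_{k_i < 0} |k_i| l_{a_i}.$$

From the extension principle [26–28], we can obtain the $h$-level of $\tilde{b}$, i.e., $b_h = [\underline{b}(h), \overline{b}(h)]$ with

$$\underline{b}(h) = b - l_b (1 - h)$$

and

$$\overline{b}(h) = b + r_b (1 - h).$$

(2) If $\tilde{b}$ is evaluated by a nonlinear function of TFN, i.e., $\tilde{b} = f(\tilde{a}_1, \ldots, \tilde{a}_n)$, where $f$ is increasing with respect to the first $m$ variables, $m \le n$, and decreasing with respect to the last $n - m$ variables, the result will not be a TFN. But [23] has shown that $\tilde{b}$ can be approximated by a TFN.

In the next section, we review some results on fuzzy linear regression in the literature.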
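The two parts of Property 5 can be sketched numerically. The code below assumes that each TFN is stored as a `(centre, left spread, right spread)` tuple and that the combination weights are crisp real numbers; the function names are illustrative. For the nonlinear case it uses the exponential, an increasing function, so the $h$-cut of the result is obtained by transforming the endpoints of the input $h$-cut.

```python
import math


def linear_combination(coeffs, tfns):
    """Combine TFNs (centre, left, right) with crisp weights; the result is a TFN.

    A negative weight mirrors the number, so its left and right spreads swap roles.
    """
    centre = left = right = 0.0
    for k, (a, l, r) in zip(coeffs, tfns):
        centre += k * a
        if k >= 0:
            left += k * l
            right += k * r
        else:
            left += -k * r
            right += -k * l
    return (centre, left, right)


def h_cut(tfn, h):
    """h-cut of a TFN: [centre - (1 - h) * left, centre + (1 - h) * right]."""
    a, l, r = tfn
    return (a - (1.0 - h) * l, a + (1.0 - h) * r)


def exp_h_cut(tfn, h):
    """h-cut of exp(TFN): exp is increasing, so it maps the endpoints directly."""
    lo, hi = h_cut(tfn, h)
    return (math.exp(lo), math.exp(hi))


b = linear_combination([2.0, -1.0], [(1.0, 0.5, 1.0), (3.0, 1.0, 2.0)])
```

Here `b` is `(-1.0, 3.0, 3.0)`, which matches interval arithmetic on the supports: $2 \cdot [0.5, 2] - [2, 5] = [-4, 2]$. By contrast, the endpoints returned by `exp_h_cut` are not linear in $h$, which is why the exponential of a TFN is not itself triangular.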

#### 3. Review of Some Models and Results on Fuzzy Regression

In this section we review the FRM proposed by [29] and the one proposed by [30]; then we present their properties. These models will help us to develop the new hybrid model for loss reserving.

##### 3.1. Ishibuchi’s FRM with Asymmetric Triangular Fuzzy Coefficients

Let us define the fuzzy linear regression (FLR) model proposed by [29].

Let be the given crisp data and letbe a FLR model with STFC, where is a fuzzy output from , is the matrix of the given crisp dataset, and is the fuzzy parameter of the model.

In model (16), , are fuzzy coefficients such that are the centres of , and are its spreads. The disturbance term is not introduced as a random addend in the linear relation, but incorporated into the coefficients , .

When the coefficients are STFN, the output is also a STFN.

Let us denotewhereand .

Then is a STFN. Its $h$-level, i.e., $h$-cut with , is calculated as follows for :

From (24), ,

To determine the parameters of the FRM (16), [29] proposes to solve a linear programming problem whose objective function minimizes the total spread of the fuzzy coefficients, i.e.,

Reference [30] has shown that the fuzzy regression method developed by [29], when applied to different data sets, can provide the same FRM by solving the linear programming problem in (26) for . The authors then proposed the FRM with ATFC in order to remedy this limitation.
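For orientation, a linear program of the kind proposed in [29] is usually stated in the following Tanaka-type form. This is a hedged restatement in generic notation, not a transcription of (26): the symbols $a_j$ (centres), $c_j$ (spreads), $x_{ij}$ (crisp regressors, with $x_{i0} = 1$), $y_i$ (crisp responses), and the level $h$ are assumptions introduced here.

```latex
\begin{aligned}
\min_{a_j,\; c_j} \quad & \sum_{i=1}^{n} \sum_{j=0}^{k} c_j \, |x_{ij}| \\
\text{s.t.} \quad
& \sum_{j=0}^{k} a_j x_{ij} - (1-h) \sum_{j=0}^{k} c_j \, |x_{ij}| \;\le\; y_i, \\
& \sum_{j=0}^{k} a_j x_{ij} + (1-h) \sum_{j=0}^{k} c_j \, |x_{ij}| \;\ge\; y_i,
  \qquad i = 1, \dots, n, \\
& c_j \ge 0, \qquad j = 0, \dots, k.
\end{aligned}
```

The two families of inequalities require each observed $y_i$ to lie inside the $h$-level of the corresponding fuzzy output, so a larger $h$ forces wider spreads.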

Let us assume thatis ATFC in the fuzzy model (16), where is its left spread, its centre, and its right spread (see Figure 1).

When , are ATFC in model (16), then is also calculated as an ATFC [30]. Hence, we denote

From [31], and , , are computed as follows:

and the $h$-level of is as follows:

The steps to determine the fuzzy coefficients in (27) are as follows [30]:

(i) determine by OLS,

(ii) determine , of for by solving the linear programming problem:
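The role of the constraints and of the objective in step (ii) can be illustrated by checking a candidate solution. The sketch below is not the linear program itself. It assumes that the centres have already been fixed (for instance by OLS), that candidate left and right spreads are given, and that the objective is the sum of the widths of the predicted supports. All function names and the toy data are hypothetical.

```python
def output_h_interval(x, centres, lefts, rights, h):
    """h-level of the fuzzy output for one observation x with asymmetric coefficients."""
    centre = lower_spread = upper_spread = 0.0
    for xj, a, l, r in zip(x, centres, lefts, rights):
        centre += a * xj
        if xj >= 0:
            lower_spread += l * xj
            upper_spread += r * xj
        else:  # a negative regressor swaps the roles of the two spreads
            lower_spread += -r * xj
            upper_spread += -l * xj
    return (centre - (1.0 - h) * lower_spread,
            centre + (1.0 - h) * upper_spread)


def is_feasible(X, y, centres, lefts, rights, h):
    """True if every observed y_i lies inside the h-level of its fuzzy output."""
    for xi, yi in zip(X, y):
        lo, hi = output_h_interval(xi, centres, lefts, rights, h)
        if not lo <= yi <= hi:
            return False
    return True


def total_fuzziness(X, centres, lefts, rights):
    """Sum over observations of the width of the support (the h = 0 level)."""
    total = 0.0
    for xi in X:
        lo, hi = output_h_interval(xi, centres, lefts, rights, 0.0)
        total += hi - lo
    return total


# Toy data (made up): intercept plus one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.2, 2.9]
centres = [1.0, 1.0]   # assumed to come from a previous OLS step
lefts = [0.2, 0.0]     # candidate left spreads
rights = [0.4, 0.0]    # candidate right spreads
```

With these toy values the candidate spreads cover all observations at $h = 0$ but not at $h = 0.8$, which shows why the chosen $h$ directly affects how large the estimated spreads must be.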

##### 3.2. Optimizing the $h$ Value for FLR with ATFC

In this subsection, we present the optimization of the $h$ value for ATFC developed by [32], which we shall use later to compute optimized outstanding loss reserves.

###### 3.2.1. Preliminaries and Some Definitions

Let us consider the model (16), but by considering

That is, a FLR with ATFC.

The system fuzziness in this case (model (16) with (35)) is defined by

Then the area where is predicted is exactly the fuzziness . That is why the objective function in (33) is to minimize the total fuzziness.

Definition 6 (see [32]). The credibility of in representing , denoted by , is defined as

and the system credibility in model (16) and (18)-(35), denoted by , is calculated as follows:

The higher the (resp., ) is, the better the performance of (resp., FLR) will be.

Definition 7 (from [32]).

(1) Define by

(2) Define by

(3) Define by

(4) Define by

###### 3.2.2. Optimizing the $h$ Value

Theorem 8. Consider and . Then, the optimal value of is given by

Proof. (see [32]).

Remark 9. Reference [32] has shown that

#### 4. Estimation of Loss Reserve with Log-Poisson Regression

“In this section, we are interested in the estimation of loss reserve using a GLM (log-Poisson). We consider a non-life insurance company which sells policies in a period of time (year). This year is referred as underwriting year. The claims regarding an underwriting year will not necessarily all be paid within this year. Due to legal issues, general consideration of the claim, the delay from the claim’s occurrence time to the reporting time,  ..., some claims are reported and paid in the following years.”

At some point in time there will however be no more payments regarding underwriting year one; year one is then said to have run off. We therefore use the historical claims data presented as a run-off triangle (Table 1), where $X_{ij}$ is the total loss regarding underwriting period $i$ which has been paid with $j$ periods delay. The loss amounts $X_{ij}$ with $i + j = k$ have been paid in calendar year $k$. At period $n$, we have observed the payments

$$\mathcal{D}_n = \{X_{ij} : i + j \le n\}.$$

Table 1 is usually called a run-off triangle. The reserve for underwriting period $i$ is then defined as the predictor of the not yet observed amount

$$R_i = \sum_{j = n - i + 1}^{n} X_{ij}.$$

The total reserve is defined as the prediction of

$$R = \sum_{i = 1}^{n} R_i.$$

 Development Year 0 1 Accident Year 0 1

Here we assume that the $X_{ij}$ follow a Poisson distribution, where $i$ denotes the underwriting period and $j$ the reporting delay in periods.

Assume that , , are mutually independent [33] andwhere is the dispersion parameter, means that we assume the portfolio to grow, or shrink, by a fixed percentage each year, means that the proportion settled decreases by a fixed fraction with each origin year, and means that the proportion settled decreases by a fixed fraction with each development year.

The parameter vector is given byThe estimator of can be given using the following relationship:where are the Maximum Likelihood Estimators (MLE) of , respectively, and could be derived from a recursive algorithm (see [7]).

Let us consider the numerical run-off triangles used in [10, 25] (see Tables 2 and 3).

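Table 2: Run-off triangle, dataset 1.

| Accident Year \ Development Year | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 2000 | 1120 | 2090 | 2610 | 2920 | 3130 |
| 2001 | 1030 | 1920 | 2370 | 2710 | |
| 2002 | 1090 | 2140 | 2610 | | |
| 2003 | 1300 | 2650 | | | |
| 2004 | 1420 | | | | |

Table 3: Run-off triangle, dataset 2.

| Accident Year \ Development Year | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 2007 | 3511 | 3215 | 2266 | 1712 | 1059 | 587 | 340 |
| 2008 | 4001 | 3702 | 2278 | 1180 | 956 | 629 | |
| 2009 | 4355 | 3932 | 1946 | 1522 | 1238 | | |
| 2010 | 4295 | 3455 | 2023 | 1320 | | | |
| 2011 | 4150 | 3747 | 2320 | | | | |
| 2012 | 5102 | 4548 | | | | | |
| 2013 | 6283 | | | | | | |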

As an example of reading a run-off triangle (see Table 2), we can say that 1030 is the indemnity amount of an accident occurring in 2001 and paid during the same year. 1920 is the indemnity amount of a claim occurring in 2001 and paid in 2002. The same interpretation can be done for Table 3.
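The reading rule above can be written as a small lookup. This sketch stores the Table 2 triangle as a dictionary keyed by accident year, with one list entry per development year; the names `TRIANGLE`, `payment`, `calendar_year` and `unobserved_cells` are illustrative choices.

```python
# Table 2 data: accident year -> payments by development year 0, 1, 2, ...
TRIANGLE = {
    2000: [1120, 2090, 2610, 2920, 3130],
    2001: [1030, 1920, 2370, 2710],
    2002: [1090, 2140, 2610],
    2003: [1300, 2650],
    2004: [1420],
}


def payment(triangle, accident_year, delay):
    """Observed amount for a given accident year and development delay."""
    return triangle[accident_year][delay]


def calendar_year(accident_year, delay):
    """Calendar year in which the payment is made."""
    return accident_year + delay


def unobserved_cells(triangle):
    """(accident year, delay) pairs of the lower triangle, still to be predicted."""
    n = len(triangle)
    return [(year, delay)
            for year, row in sorted(triangle.items())
            for delay in range(len(row), n)]
```

For example, `payment(TRIANGLE, 2001, 1)` returns 1920 and `calendar_year(2001, 1)` returns 2002, in line with the interpretation given in the text. The list returned by `unobserved_cells(TRIANGLE)` contains the ten cells for which a reserve has to be predicted.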

Let us plot the data (Table 2) to start the reserving analysis. The interpretation given for Table 2 could be done for Table 3.

Figure 2 presents the incremental and cumulative claims development by origin year. The run-off triangle appears to be fairly well behaved. The most recent years, 2003 and 2004, appear to be slightly higher than years 2000 to 2002, and the values in 2001 are lower than those of the later years and of 2000, which may indicate, for example, that the book changed over the years. The last payment of 2,650 for the 2003 origin year stands out a bit as well.

We fitted the classical log-Poisson regression to the datasets in Tables 2 and 3, respectively, using the software R. We obtained the following maximum likelihood estimates (MLE) of the parameters of the model and their confidence intervals (CI), reported in Table 4.

 Data of Table 2 Data of Table 3

From Table 4 (first column) and with a threshold of , we conclude that except , the others coefficients are statistically significant. As an example of interpretation of the results, the estimation of the payment of origin year could be . From the second column of Table 4, we conclude that all the coefficients of the model are statistically significant at a threshold of .

Now let us test whether the model fitted to each dataset is statistically adequate by means of a dispersion test (see Table 5).

 Data from Table 2 Data from Table 3

From the first column of Table 5 and with a threshold of , we do not reject the null hypothesis since . Therefore we do not need to perform a quasi-Poisson regression. However in the second column of Table 5, we reject the null hypothesis with a threshold of , i.e., a quasi-Poisson model, with the variance proportional to the mean, should be more reasonable for the data from Table 3.
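The article does not state which test statistic underlies Table 5. As a rough illustration of what a dispersion check measures, the sketch below computes the Pearson-based dispersion estimate that is commonly reported for quasi-Poisson fits: the Pearson chi-square statistic divided by the residual degrees of freedom. Values well above 1 point to overdispersion. The function name and the numbers in the example are made up for illustration.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    Under a Poisson model the variance equals the mean, so the ratio should be
    close to 1; a clearly larger value suggests a quasi-Poisson model instead.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi_square = sum((y - m) ** 2 / m for y, m in zip(observed, fitted))
    return chi_square / dof


# Made-up counts and fitted means, two estimated parameters.
phi_hat = pearson_dispersion([12, 8, 15, 5], [10.0, 10.0, 12.0, 8.0], n_params=2)
```

In this toy example the Pearson statistic is 2.675 on 2 degrees of freedom, so `phi_hat` is 1.3375.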

Thus the predicted incremental payments from the run-off triangle of Table 2 are displayed in Table 6.

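Table 6: Observed and predicted incremental payments for the run-off triangle of Table 2.

| Origin Year i \ Development Year j | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 2000 | 1120 | 2090 | 2610 | 2920 | 3130 |
| 2001 | 1030 | 1920 | 2370 | 2710 | 2875.732 |
| 2002 | 1090 | 2140 | 2610 | 2951.454 | 3148.427 |
| 2003 | 1300 | 2650 | 3192.812 | 3609.877 | 3850.791 |
| 2004 | 1420 | 2752.423 | 3372.597 | 3813.148 | 4067.628 |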
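The predicted cells in Table 6 can be cross-checked without a GLM routine. The sketch below relies on a standard result rather than on anything stated in this article: for a Poisson model with a log link and one factor per origin year and per development year, the maximum likelihood fitted values coincide with those of the chain-ladder method applied to the same incremental triangle. The function name `chain_ladder_incremental` and the list-of-rows layout are illustrative choices.

```python
def chain_ladder_incremental(triangle):
    """Predict the unobserved incremental cells of a run-off triangle.

    triangle: list of rows; row i holds the observed incremental payments of
    origin period i for development periods 0, 1, ..., n - 1 - i.
    Returns a dict mapping (i, j) to the predicted incremental payment.
    """
    n = len(triangle)

    # Cumulate the observed incremental payments row by row.
    cumulative = []
    for row in triangle:
        running, cum_row = 0.0, []
        for value in row:
            running += value
            cum_row.append(running)
        cumulative.append(cum_row)

    # Development factor j -> j + 1, using rows observed in both columns.
    factors = []
    for j in range(n - 1):
        rows = [i for i in range(n) if len(triangle[i]) > j + 1]
        numerator = sum(cumulative[i][j + 1] for i in rows)
        denominator = sum(cumulative[i][j] for i in rows)
        factors.append(numerator / denominator)

    # Roll each row forward and take differences to get incremental predictions.
    predicted = {}
    for i in range(n):
        current = cumulative[i][-1]
        for j in range(len(triangle[i]), n):
            projected = current * factors[j - 1]
            predicted[(i, j)] = projected - current
            current = projected
    return predicted


TABLE_2 = [
    [1120, 2090, 2610, 2920, 3130],
    [1030, 1920, 2370, 2710],
    [1090, 2140, 2610],
    [1300, 2650],
    [1420],
]
predicted = chain_ladder_incremental(TABLE_2)
outstanding = sum(predicted.values())
```

With the Table 2 data, the sketch gives about 2875.732 for origin year 2001 at development year 4 and about 2752.423 for origin year 2004 at development year 1, in agreement with Table 6. Summing all predicted cells in `outstanding` gives the corresponding outstanding reserve.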

The value of the MSEP obtained from dataset 1 (Table 2) by fitting the classical log-Poisson regression model is . As a reminder from our previous paper [1], the value of the MSEP obtained by fitting the hybrid log-Poisson regression model (with an analytical estimation of parameters) to dataset 1 (Table 2) is . Furthermore, the value of from the classical log-Poisson regression model is . We shall compare the values of and with the one we will get from the hybrid model where the fuzzy coefficients are estimated through a quadratic optimization program with an optimized $h$ value (). From Table 6, we compute the outstanding reserve, i.e., .

To predict the incremental payments from Table 3, we first need to fit a quasi-Poisson regression, as pointed out by the overdispersion test (Table 5). For that dataset, the predicted incremental payments computed from the quasi-Poisson model are displayed in Table 7.

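Table 7: Observed and predicted incremental payments for the run-off triangle of Table 3 (quasi-Poisson model).

| Accident Year \ Development Year | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 2007 | 3511 | 3215 | 2266 | 1712 | 1059 | 587 | 340 |
| 2008 | 4001 | 3702 | 2278 | 1180 | 956 | 629 | 350.9020 |
| 2009 | 4355 | 3932 | 1946 | 1522 | 1238 | 661.6201 | 375.9167 |
| 2010 | 4295 | 3455 | 2023 | 1320 | 1073.335 | 619.5253 | 351.9994 |
| 2011 | 4150 | 3747 | 2320 | 1502.970 | 1134 | 654.5405 | 371.8942 |
| 2012 | 5102 | 4548 | 2724.981 | 1820.420 | 1373.517 | 792.7891 | 450.4438 |
| 2013 | 6283 | 5587.059 | 3351.885 | 2239.222 | 1689.505 | 975.1765 | 554.0719 |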

From Table 7, the MSEP from the classical log-Poisson regression can be computed, i.e., . Similarly, from our previous results [1], the hybrid model with analytical estimation of parameters fitted on dataset 2 (Table 3) gives as value of MSEP . The goodness of fit index from the classical log-Poisson regression fitted on dataset 2 (Table 3) is , i.e., of the variability is explained by the model. The values of the MSEP () will be compared with the one from the hybrid log-Poisson model where the fuzzy coefficients are estimated through a quadratic optimization program with optimized value, i.e., . From Table 7, the outstanding reserves are .

Remark 10. We notice that in both cases (results on dataset 1 (Table 2) and dataset 2 (Table 3), respectively), the MSEP from the hybrid model with analytical estimation of parameters is greater than the one we are getting from the classical log-Poisson regression model. However, these values () are still large.

After performing the classical log-Poisson model and the hybrid model with analytical estimation of parameters on each of the run-off triangles data (Tables 2 and 3), let us propose a new method to estimate the fuzzy parameters of the hybrid model and compare it with the classic log-Poisson regression model and the hybrid one with analytical estimation of parameters through the same datasets” [1].

#### 5. Main Results

##### 5.1. A Quadratic Optimization Program to Estimate the Hybrid Log-Poisson Regression Model for Optimal Loss Reserving

In this section, we present a new way to estimate the hybrid log-Poisson regression [1]. To that end, let us define the framework in which our method will be defined. Indeed, [7] has assumed that incremental payments are log-Poisson distributed, i.e.,

Denote

So the model becomes

In our case let us consider the following assumptions:

Assumption 1. We consider a hybrid log-Poisson regression estimated from a minimum distance of spreads and the OLS criterion. Indeed, [30] considered a minimum fuzziness criterion, but in our model we consider the minimum distance between the left and right spreads, i.e., the minimum of the sum of squares of fuzziness, which can be seen as the analogue of the well-known minimum of the sum of squared errors in OLS.

Assumption 2. We assume that uncertainty about incremental payments in the run-off triangle is due to both fuzziness and randomness. Then the estimate of will be obtained by the use of the expected value of FN on , where denotes the estimate of the fuzzy parameter
We suppose that is a fuzzy Poisson random variable [34], i.e.,

According to [34], the fuzzy expected value of is defined by its $h$-level, i.e., where is the fuzzy mean operator. So the fuzzy mean is just the fuzzification of the crisp mean.
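A minimal sketch of the defuzzification step follows, under stated assumptions. The fuzzy Poisson parameter is taken to be a TFN stored as `(centre, left spread, right spread)`, and its $h$-level is used as the $h$-level of the fuzzy mean. The expected value of the FN is computed as a weighted average of the integrated lower and upper bounds of the $h$-levels, with weight `beta = 0.5` as a neutral default. The parameter `beta` and the function names are not taken from the article.

```python
def h_level(tfn, h):
    """h-level of a TFN (centre, left, right): a closed interval."""
    centre, left, right = tfn
    return (centre - (1.0 - h) * left, centre + (1.0 - h) * right)


def expected_value(tfn, beta=0.5):
    """Closed-form expected value of a TFN.

    The lower bound integrates to centre - left / 2 over h in [0, 1] and the
    upper bound to centre + right / 2; beta weights the upper bound.
    """
    centre, left, right = tfn
    return (1.0 - beta) * (centre - left / 2.0) + beta * (centre + right / 2.0)


def expected_value_numeric(tfn, beta=0.5, steps=1000):
    """Same quantity by midpoint-rule integration of the h-level bounds."""
    lower_sum = upper_sum = 0.0
    for k in range(steps):
        lo, hi = h_level(tfn, (k + 0.5) / steps)
        lower_sum += lo
        upper_sum += hi
    return (1.0 - beta) * lower_sum / steps + beta * upper_sum / steps


# Hypothetical fuzzy Poisson mean: centre 100, left spread 10, right spread 30.
fuzzy_mean = (100.0, 10.0, 30.0)
crisp_mean = expected_value(fuzzy_mean)
```

For this asymmetric example the crisp value is 105, i.e., the centre shifted by a quarter of the difference between the right and left spreads. For a symmetric TFN and `beta = 0.5` the expected value reduces to the centre.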
Then suppose that in the log-Poisson model (57)-(58), are FN such that and the membership function of , whereThus we define the membership function of as follows: