Computational Intelligence and Neuroscience


Research Article | Open Access

Volume 2020 |Article ID 7631495 | https://doi.org/10.1155/2020/7631495

Zubair Ahmad, Eisa Mahmoudi, Omid Kharazmi, "On Modeling the Earthquake Insurance Data via a New Member of the T-X Family", Computational Intelligence and Neuroscience, vol. 2020, Article ID 7631495, 20 pages, 2020. https://doi.org/10.1155/2020/7631495

On Modeling the Earthquake Insurance Data via a New Member of the T-X Family

Academic Editor: Friedhelm Schwenker
Received: 27 Dec 2019
Revised: 25 Aug 2020
Accepted: 09 Sep 2020
Published: 19 Sep 2020

Abstract

Heavy-tailed distributions play an important role in modeling data in actuarial and financial sciences. In this article, a new method is suggested for defining distributions suitable for modeling data with a heavy right tail. The proposed method may be called the Z-family of distributions. For illustrative purposes, a special submodel of the proposed family, called the Z-Weibull distribution, is studied in detail for modeling heavy right-tailed data. The method of maximum likelihood is adopted to estimate the model parameters, and a brief Monte Carlo simulation study evaluating the maximum likelihood estimators is carried out. Furthermore, some actuarial measures such as value at risk and tail value at risk are calculated, and a simulation study based on these measures is also performed. An application of the Z-Weibull model to earthquake insurance data is presented. Based on the analyses, we observe that the proposed distribution can be used quite effectively to model heavy-tailed data in insurance sciences and other related fields. Finally, Bayesian analysis and Gibbs sampling for the earthquake data are also carried out.

1. Introduction

In a number of applied areas such as finance and actuarial sciences, data sets are most often positive, and their distribution is unimodal, hump-shaped, and right-skewed with heavier tails than the well-known classical distributions, which are not flexible enough to model such heavy-tailed data adequately. For example, (i) the Pareto distribution, frequently used to model financial data, does not provide a reasonable fit in many applications; for instance, if we are interested in modeling moderate-to-large losses together, the Pareto distribution may not be a suitable choice [1]; and (ii) the Weibull model captures the behavior of small losses very closely but, unfortunately, fails to provide an adequate fit to large losses [2]. In such circumstances, heavy-tailed models are a natural choice. For positive data, heavy-tailed distributions are those whose right-tail probabilities decay more slowly than any exponential [3], that is,

lim_{x→∞} e^{λx} [1 − F(x; ξ)] = ∞ for all λ > 0,  (1)

where F(x; ξ) is the cumulative distribution function (cdf) depending on the parameter vector ξ.
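The defining limit can be checked numerically. The Python sketch below (with illustrative parameter choices) contrasts a Pareto tail, for which e^{λx}[1 − F(x)] grows without bound for every λ > 0, with an exponential tail, for which it decays:

```python
import math

# survival functions: Pareto (heavy-tailed) vs exponential (light-tailed)
pareto_sf = lambda x, alpha=2.0, xm=1.0: (xm / x) ** alpha      # valid for x >= xm
exp_sf = lambda x, rate=1.0: math.exp(-rate * x)

lam = 0.1   # any positive lambda works in the definition
grid = (10.0, 50.0, 100.0)
heavy = [math.exp(lam * x) * pareto_sf(x) for x in grid]
light = [math.exp(lam * x) * exp_sf(x) for x in grid]

assert heavy[0] < heavy[1] < heavy[2]   # e^{lam x} * S(x) diverges: heavy tail
assert light[0] > light[1] > light[2]   # ... but decays for the exponential
```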

Due to the usefulness and flexibility of heavy-tailed models in financial and actuarial practice, actuaries are always interested in proposing new statistical distributions, and serious attempts in this direction are still growing rapidly. New contributions are made via different approaches such as (i) transformation of variables, (ii) composition of two or more distributions, (iii) compounding of distributions, and (iv) finite mixtures of distributions; see Ahmad et al. [4].

Recent investigations by Eling [5] and Adcock et al. [6] determined that the skew-normal and skew Student's t distributions are strong competitors because skewed distributions accommodate right-skewness and high kurtosis; interested readers may refer to Shushi [7] and Punzo et al. [8]. However, insurance losses and monetary risks take values on the positive real line, so these skew models may not be a suitable choice, as they are defined on the whole real line. In such circumstances, the transformation-of-variable approach, especially the exponential transformation, has proved valuable; for details, see Azzalini et al. [9]. Bagnato and Punzo [10] showed that the transformation method of introducing new distributions is easy to use; however, most often, the inferences become complicated.

Another useful method of proposing new versatile heavy-tailed distributions, which provide good fits to heavy-tailed losses, is the methodology of composition; see Paula et al. [11], Klugman et al. [12], Nadarajah and Bakar [13], and Bakar et al. [14]. However, it should be noted that the models introduced via this approach involve more than three parameters, which causes difficulties in estimating the parameters and requires greater computational effort.

Another approach to introducing new distributions that adequately model unimodal data is the compounding of distributions; see Punzo et al. [15] and Mazza and Punzo [16]. Unfortunately, the density function of distributions obtained by this approach may not have a closed-form expression, which makes estimation more complicated, as shown in Punzo et al. [15].

The method of finite mixture models is another prominent approach to obtaining new, very flexible models able to capture, for example, multimodality of the distribution under consideration; see Bernardi et al. [17], Miljkovic and Grün [18], and Punzo et al. [15]. No doubt, the distributions obtained via this approach are very flexible, but the inferences become more complicated and computationally challenging.

Furthermore, Dutta and Perry [19] performed an empirical analysis of loss distributions, and risk was estimated by different approaches such as exploratory data analysis and other empirical approaches. These authors rejected the idea of using the exponential, gamma, and Weibull models in modeling insurance losses due to the poor results. They concluded that one would need to use a model that is flexible enough in its structure. This motivated the researchers to search for more flexible models offering greater accuracy in fitting the heavy-tailed data.

Hence, bringing flexibility to a model by introducing additional parameter(s) is a desirable feature [20–24]. In a number of recent papers, serious attempts have been made to introduce a variety of new heavy-tailed distributions; see Ahmad et al. [25] and Ahmad et al. [26]. Due to the importance of statistical distributions in financial science, a new family of distributions, called the Z-family, is introduced here via the T-X family approach [27]. To illustrate the usefulness of the proposed method, a three-parameter submodel, called the Z-Weibull distribution, is studied in detail. The proposed distribution provides a better description of the earthquake insurance data, with possibly heavy tails, than the available (i) two-parameter distributions such as the Weibull, Burr-XII (B-XII), and generalized exponential (GE), (ii) three-parameter Weibull-claim (W-claim) and exponentiated Lomax (EL), and (iii) four-parameter beta-Weibull (BW) distributions, and possibly many others.

The rest of the article is structured in the following way. The proposed method is introduced in Section 2. The Z-Weibull model is considered in Section 3, where the shapes of its probability density function (pdf) are also investigated. Estimation of the parameters is discussed in Section 4, together with a detailed Monte Carlo simulation study. Actuarial measures of the proposed method, along with a simulation study, are provided in Section 5. The fit to the earthquake insurance data set is discussed in Section 6. Bayesian analysis and the Gibbs sampling procedure for the real data set are discussed in Section 7. Future work is discussed in Section 8. Finally, some concluding remarks are presented in Section 9.

2. Development of the Z-Family

Let r(t) be the pdf of a random variable, say T ∈ [a, b], for −∞ ≤ a < b < ∞, and let W[F(x; ξ)] be a function of the cdf F(x; ξ) of a random variable, say X, depending on the parameter vector ξ and satisfying the following conditions:
(1) W[F(x; ξ)] ∈ [a, b]
(2) W[F(x; ξ)] is differentiable and monotonically increasing
(3) W[F(x; ξ)] → a as x → −∞ and W[F(x; ξ)] → b as x → ∞

Alzaatreh et al. [27] introduced a general method for generating new families of distributions, called the T-X family, defined by

G(x) = ∫_a^{W[F(x; ξ)]} r(t) dt,  (2)

where W[F(x; ξ)] satisfies the conditions mentioned above. The probability density function (pdf) corresponding to (2) is given by

g(x) = { (d/dx) W[F(x; ξ)] } r(W[F(x; ξ)]).  (3)

Deploying the T-X proposal, several new classes of distributions have been introduced in the literature [28]. Let T have the exponential distribution with the pdf given by

r(t) = λ e^{−λt}, t ≥ 0, λ > 0.  (4)

Using (4) in (2), we get

G(x) = 1 − e^{−λ W[F(x; ξ)]}.  (5)

On setting W[F(x; ξ)] = −log[1 − F(x; ξ)] + F(x; ξ) log σ and λ = 1 in (2), we define the cdf of the Z-family of distributions by

F_Z(x; σ, ξ) = 1 − [1 − F(x; ξ)] σ^{−F(x; ξ)}, x ∈ ℝ, σ > 0,  (6)

where F(x; ξ) is the baseline cdf. The expression in (6) represents a wide family of univariate continuous distributions. Clearly, when σ = 1, the cdf of the proposed family derived in (6) becomes identical to the baseline cdf. The pdf corresponding to (6) is given by

f_Z(x; σ, ξ) = f(x; ξ) σ^{−F(x; ξ)} [1 + (1 − F(x; ξ)) log σ].  (7)
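As a quick sanity check, the construction can be coded generically. The Python sketch below assumes the Z-family cdf takes the form F_Z(x) = 1 − [1 − F(x)] σ^{−F(x)} (the form consistent with the Z-Weibull expressions in the appendix code) and verifies, for an exponential baseline, that it is a valid cdf and that σ = 1 recovers the baseline:

```python
import math

def z_cdf(base_cdf, sigma):
    # assumed Z-family form: F_Z(x) = 1 - (1 - F(x)) * sigma**(-F(x))
    def F(x):
        Fx = base_cdf(x)
        return 1.0 - (1.0 - Fx) * sigma ** (-Fx)
    return F

base = lambda x: 1.0 - math.exp(-x)     # exponential baseline cdf
Fz = z_cdf(base, sigma=2.0)

assert abs(Fz(0.0)) < 1e-12             # F_Z(0) = 0
assert abs(Fz(50.0) - 1.0) < 1e-9       # F_Z -> 1 in the right tail
xs = [0.1 * i for i in range(1, 100)]
vals = [Fz(x) for x in xs]
assert all(u < w for u, w in zip(vals, vals[1:]))     # monotonically increasing
F1 = z_cdf(base, sigma=1.0)             # sigma = 1 recovers the baseline
assert all(abs(F1(x) - base(x)) < 1e-12 for x in xs)
```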

The survival function (sf) and hazard rate function (hrf) corresponding to (6) are, respectively, given by

S_Z(x; σ, ξ) = [1 − F(x; ξ)] σ^{−F(x; ξ)},  (8)

h_Z(x; σ, ξ) = h(x; ξ) [1 + (1 − F(x; ξ)) log σ],  (9)

where h(x; ξ) = f(x; ξ)/[1 − F(x; ξ)] is the baseline hrf.

Due to the induction of the extra parameter σ, the Z-family provides greater distributional flexibility. The key motivations for using the Z-family in practice are as follows:
(i) It offers a very simple and useful way of introducing an additional parameter to generalize existing distributions
(ii) It improves the characteristics and flexibility of existing models
(iii) It yields new distributions with closed-form cdf, sf, and hrf
(iv) It extends existing distributions by introducing only one parameter, rather than two or more
(v) It provides excellent fits to heavy-tailed insurance data sets
(vi) It can outperform other modified models having the same or a higher number of parameters

3. The Z-Weibull Distribution

Most of the extended forms of distributions are introduced with one of the following aims: (i) to extend an existing model and improve its characteristics, (ii) to obtain a new distribution having a heavy right tail, and (iii) to introduce a model whose empirical fit to data is good. Here, we discuss the Z-Weibull distribution, which can serve at least one of these aims. The Weibull random variable has the cdf and pdf given by F(x; ξ) = 1 − e^{−γ x^α} and f(x; ξ) = α γ x^{α−1} e^{−γ x^α}, respectively, where x ≥ 0, α > 0, γ > 0, and ξ = (α, γ). Then, the cdf of the Z-Weibull distribution has the following form:

F_Z(x; σ, α, γ) = 1 − e^{−γ x^α} σ^{e^{−γ x^α} − 1}, x ≥ 0.  (10)

The corresponding density is given by

f_Z(x; σ, α, γ) = α γ x^{α−1} e^{−γ x^α} σ^{e^{−γ x^α} − 1} [1 + e^{−γ x^α} log σ].  (11)

The sf, hrf, and reversed hazard rate function (rhrf) of the proposed model are given by

S_Z(x) = e^{−γ x^α} σ^{e^{−γ x^α} − 1},

h_Z(x) = α γ x^{α−1} [1 + e^{−γ x^α} log σ],

r_Z(x) = α γ x^{α−1} e^{−γ x^α} σ^{e^{−γ x^α} − 1} [1 + e^{−γ x^α} log σ] / [1 − e^{−γ x^α} σ^{e^{−γ x^α} − 1}],

respectively.
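A numerical sketch of the Z-Weibull density and cdf (Python, using the fitted values reported later in Section 6 purely for illustration) confirms that the density integrates to one and is the derivative of the cdf:

```python
import math

def zweibull_pdf(x, a, g, b):
    # Z-Weibull density; s is the baseline Weibull survival exp(-g x^a)
    s = math.exp(-g * x ** a)
    return a * g * x ** (a - 1) * s * b ** (s - 1) * (1 + math.log(b) * s)

def zweibull_cdf(x, a, g, b):
    s = math.exp(-g * x ** a)
    return 1 - s * b ** (s - 1)

a, g, b = 2.346, 0.564, 2.295   # fitted values from Section 6, for illustration

# trapezoidal rule: the density should integrate to ~1 over (0, 10]
h, total = 0.001, 0.0
for i in range(1, 10000):
    total += 0.5 * h * (zweibull_pdf(i * h, a, g, b) + zweibull_pdf((i + 1) * h, a, g, b))
assert abs(total - 1.0) < 1e-2

# the density is the derivative of the cdf (central difference check at x = 1)
num = (zweibull_cdf(1.001, a, g, b) - zweibull_cdf(0.999, a, g, b)) / 0.002
assert abs(num - zweibull_pdf(1.0, a, g, b)) < 1e-4
```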

Different plots for the pdf of the Z-Weibull distribution for selected parameter values are given in Figure 1.

4. Estimation and Monte Carlo Simulation Study

Several approaches to estimate the model parameter have been introduced in the literature, but the maximum likelihood estimation method is the most commonly employed. The maximum likelihood estimators (MLEs) enjoy several desirable properties and can be used for constructing confidence intervals and regions and also in test statistics. The normal approximation for MLEs in large samples can be easily handled either analytically or numerically. So, we estimate the parameters of the Z-family of distributions from complete samples via the maximum likelihood estimation method. Furthermore, we perform a comprehensive Monte Carlo simulation study to evaluate the performance of the MLEs.

4.1. Maximum Likelihood Estimation

In this section, we obtain the MLEs of the model parameters of the Z-family of distributions from complete samples only. Let x_1, x_2, …, x_n be the observed values from the Z-family of distributions with parameters σ and ξ. The total log-likelihood function corresponding to (7) is given by

ℓ(σ, ξ) = Σ_{i=1}^n log f(x_i; ξ) − log σ Σ_{i=1}^n F(x_i; ξ) + Σ_{i=1}^n log[1 + (1 − F(x_i; ξ)) log σ].  (12)

The partial derivatives of (12) are

∂ℓ/∂σ = −(1/σ) Σ_{i=1}^n F(x_i; ξ) + (1/σ) Σ_{i=1}^n (1 − F(x_i; ξ)) / [1 + (1 − F(x_i; ξ)) log σ],  (13)

∂ℓ/∂ξ = Σ_{i=1}^n ∂ log f(x_i; ξ)/∂ξ − log σ Σ_{i=1}^n ∂F(x_i; ξ)/∂ξ − log σ Σ_{i=1}^n [∂F(x_i; ξ)/∂ξ] / [1 + (1 − F(x_i; ξ)) log σ].

Setting ∂ℓ/∂σ and ∂ℓ/∂ξ equal to zero and solving these expressions numerically and simultaneously yields the MLEs of (σ, ξ).
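The maximization can be sketched numerically. The toy example below (dependency-free Python) draws data from an ordinary Weibull model by the inverse transform and maximizes the Z-Weibull log-likelihood with a crude multiplicative coordinate search; a production analysis would instead use optim() in R or a quasi-Newton routine, as in the paper:

```python
import math
import random

def nll(params, data):
    # negative log-likelihood of the Z-Weibull model (valid for a, g, b > 0)
    a, g, b = params
    if min(a, g, b) <= 0:
        return float("inf")
    total = 0.0
    for x in data:
        s = math.exp(-g * x ** a)
        f = a * g * x ** (a - 1) * s * b ** (s - 1) * (1 + math.log(b) * s)
        if f <= 0:
            return float("inf")
        total -= math.log(f)
    return total

# toy data from an ordinary Weibull (alpha = 0.9, gamma = 0.3) by inversion
random.seed(1)
data = [(-math.log(1.0 - random.random()) / 0.3) ** (1 / 0.9) for _ in range(500)]

# crude multiplicative coordinate search (a stand-in for optim / L-BFGS-B)
best = (1.0, 1.0, 1.0)
for _ in range(60):
    a0, g0, b0 = best
    cands = [best] + [(a0 * m, g0, b0) for m in (0.9, 1.1)] \
                   + [(a0, g0 * m, b0) for m in (0.9, 1.1)] \
                   + [(a0, g0, b0 * m) for m in (0.9, 1.1)]
    best = min(cands, key=lambda p: nll(p, data))

assert nll(best, data) <= nll((1.0, 1.0, 1.0), data)  # search only improves
assert 0.2 < best[0] < 2.5   # shape estimate lands near its true value 0.9
```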

4.2. Monte Carlo Simulation Study

This section offers a comprehensive simulation study to assess the behavior of the MLEs. The Z-family is easily simulated by inverting (6): if U has a uniform U(0, 1) distribution, then, for u = U, the generated value x solves the nonlinear equation

[1 − F(x; ξ)] σ^{−F(x; ξ)} + u − 1 = 0.  (14)

Expression (14) can be used to simulate any special subcase of the Z-family. Here, we consider the Z-Weibull distribution to assess the behavior of the MLEs of the proposed method. We simulate the Z-Weibull distribution for two sets of parameters (set 1: α = 0.9, γ = 0.3, and σ = 1.2 and set 2: α = 1.1, γ = 0.7, and σ = 1.4). The simulation is performed via the statistical software R using the rootSolve library. The number of Monte Carlo replications was 750. For maximizing the log-likelihood function, we use the method = "L-BFGS-B" algorithm with optim(). The MLEs are determined for each simulated data set, say (α̂_i, γ̂_i, σ̂_i) for i = 1, …, 750, and the biases and MSEs are computed, respectively, by

Bias(θ̂) = (1/750) Σ_{i=1}^{750} (θ̂_i − θ), MSE(θ̂) = (1/750) Σ_{i=1}^{750} (θ̂_i − θ)²,  (15)

for θ = α, γ, σ. We consider the sample sizes n = 25, 50, 100, 200, 400, 600, 800, and 1000. The empirical results are given in Tables 1 and 2 and displayed graphically in Figures 2 and 3. Based on Tables 1 and 2 and Figures 2 and 3, the following conclusions are drawn:
(i) The biases for all parameters are positive
(ii) The estimates tend to be stable
(iii) The estimated biases decrease as the sample size n increases
(iv) The estimated MSEs decay toward zero as the sample size n increases
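The inversion step can be sketched as follows: writing t = e^{−γx^α}, expression (14) for the Z-Weibull reduces to t σ^{t−1} = 1 − u, which is monotone in t for σ ≥ 1 and can be solved by bisection before inverting the Weibull transform (Python sketch with illustrative parameter values):

```python
import math
import random

def zweibull_rvs(n, a, g, b, seed=0):
    # inverse-transform sampling: solve t * b**(t - 1) = 1 - u for
    # t = exp(-g x^a) by bisection, then set x = (-log(t) / g)**(1 / a)
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = rng.random()
        lo, hi = 1e-15, 1.0
        for _ in range(80):   # t * b**(t - 1) is increasing in t for b >= 1
            mid = 0.5 * (lo + hi)
            if mid * b ** (mid - 1) < 1 - u:
                lo = mid
            else:
                hi = mid
        t = 0.5 * (lo + hi)
        out.append((-math.log(t) / g) ** (1 / a))
    return out

sample = zweibull_rvs(2000, a=0.9, g=0.3, b=1.2, seed=42)
assert all(x > 0 for x in sample)
# the model cdf evaluated at the sample median should be close to 0.5
med = sorted(sample)[1000]
t = math.exp(-0.3 * med ** 0.9)
assert abs((1 - t * 1.2 ** (t - 1)) - 0.5) < 0.05
```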


Set 1: α = 0.9, γ = 0.3, and σ = 1.2

n     Parameter  MLE     MSE     Bias
25    α          1.1150  0.1937  0.2150
      γ          1.2490  3.6872  0.9490
      σ          2.0431  4.7184  0.8431
50    α          1.0456  0.1526  0.1456
      γ          0.9536  2.4614  0.6536
      σ          1.9282  4.3933  0.7281
100   α          0.9970  0.0783  0.0970
      γ          0.7277  1.3741  0.4277
      σ          1.6722  2.3642  0.4722
200   α          0.9605  0.0472  0.0605
      γ          0.5478  0.6878  0.2478
      σ          1.4691  1.1474  0.2691
400   α          0.9281  0.0163  0.0281
      γ          0.3769  0.0901  0.0769
      σ          1.3115  0.4336  0.1115
600   α          0.9193  0.0085  0.0193
      γ          0.3471  0.0383  0.0471
      σ          1.2424  0.1812  0.0424
800   α          0.9128  0.0067  0.0128
      γ          0.3313  0.0279  0.0313
      σ          1.2405  0.1318  0.0405
1000  α          0.9006  0.0042  0.0076
      γ          0.3184  0.0119  0.0184
      σ          1.2160  0.0928  0.0360


Set 2: α = 1.1, γ = 0.7, and σ = 1.4

n     Parameter  MLE     MSE     Bias
25    α          1.3094  0.2002  0.2094
      γ          1.9925  5.7399  1.2925
      σ          2.1937  4.3061  0.7937
50    α          1.2340  0.1333  0.1340
      γ          1.6368  4.0681  0.9368
      σ          2.0939  3.3865  0.6939
100   α          1.1766  0.0697  0.0766
      γ          1.2543  2.1560  0.5543
      σ          1.8715  2.1136  0.4715
200   α          1.1396  0.0368  0.0396
      γ          0.9740  0.9313  0.2740
      σ          1.6771  1.0320  0.2771
400   α          1.1194  0.0153  0.0194
      γ          0.8080  0.2293  0.1080
      σ          1.5204  0.3718  0.1204
600   α          1.1171  0.0094  0.0171
      γ          0.7860  0.1229  0.0860
      σ          1.4516  0.1869  0.0516
800   α          1.1093  0.0065  0.0093
      γ          0.7495  0.0833  0.0495
      σ          1.4443  0.1219  0.0443
1000  α          1.1050  0.0049  0.0050
      γ          0.7807  0.0537  0.0307
      σ          1.4147  0.0935  0.0347

5. Actuarial Measures

One of the most important tasks of actuarial science institutions is to evaluate the exposure to market risk in a portfolio of instruments, which arise from changes in underlying variables such as prices of equity, interest rates, or exchange rates. In this section, we calculate some important risk measures such as value at risk (VaR) and tail value at risk (TVaR) for the proposed distribution, which play a crucial role in portfolio optimization under uncertainty.

5.1. Value at Risk

In the context of actuarial sciences, the VaR is widely used by practitioners as a standard financial market risk measure. It is also known as the quantile risk measure or quantile premium principle. The VaR is always specified with a given degree of confidence, say q (typically 90%, 95%, or 99%), and represents the loss in portfolio value that will be equaled or exceeded only (1 − q) × 100% of the time. The VaR of a random variable X is the qth quantile of its cdf; see Artzner [29]. If X follows the Z-Weibull distribution, then the VaR of X is

VaR_q = [−(1/γ) log t]^{1/α},  (16)

where t is the solution of t σ^{t−1} = 1 − q.
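Because the equation for t is monotone in t (for σ ≥ 1), the VaR is easy to compute by bisection. A minimal Python sketch with illustrative parameter values:

```python
import math

def zweibull_var(q, a, g, b):
    # VaR_q for the Z-Weibull: solve t * b**(t - 1) = 1 - q for t = exp(-g x^a)
    # (monotone in t for b >= 1), then invert the Weibull transform
    lo, hi = 1e-15, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * b ** (mid - 1) < 1 - q:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return (-math.log(t) / g) ** (1 / a)

a, g, b = 0.9, 0.3, 1.2          # illustrative parameter values
v90 = zweibull_var(0.90, a, g, b)
v99 = zweibull_var(0.99, a, g, b)
assert v99 > v90 > 0             # VaR grows with the confidence level
s = math.exp(-g * v90 ** a)
assert abs((1 - s * b ** (s - 1)) - 0.90) < 1e-6   # check F(VaR_q) = q
```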

5.2. Tail Value at Risk

Another important measure is the TVaR, also known as the conditional tail expectation (CTE) or tail conditional expectation (TCE), which quantifies the expected loss given that the loss exceeds a given probability level. Let X follow the Z-Weibull distribution; then, the TVaR of X is derived as

TVaR_q = E(X | X > VaR_q) = (1/(1 − q)) ∫_{VaR_q}^∞ x f_Z(x) dx.  (17)

Using the Z-Weibull density in (17) and expanding σ^{e^{−γx^α}} = Σ_{k=0}^∞ (log σ)^k e^{−kγx^α}/k!, we get

TVaR_q = (1/((1 − q)σ)) Σ_{k=0}^∞ ((log σ)^k/k!) ∫_{VaR_q}^∞ α γ x^α [e^{−(k+1)γx^α} + log σ e^{−(k+2)γx^α}] dx.  (18)

Recall the definition of the (upper) incomplete gamma function in the form Γ(s, x) = ∫_x^∞ t^{s−1} e^{−t} dt; so, from (18), we get

TVaR_q = (1/((1 − q)σ)) Σ_{k=0}^∞ ((log σ)^k/k!) [ Γ(1 + 1/α, (k+1)γ VaR_q^α) / ((k+1)^{1+1/α} γ^{1/α}) + log σ Γ(1 + 1/α, (k+2)γ VaR_q^α) / ((k+2)^{1+1/α} γ^{1/α}) ].  (19)
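Series aside, the TVaR can always be checked by direct numerical integration of its definition (17). The Python sketch below (illustrative parameter values) integrates x f(x) beyond VaR_q by the trapezoidal rule:

```python
import math

def pdf(x, a, g, b):
    # Z-Weibull density with s = exp(-g x^a)
    s = math.exp(-g * x ** a)
    return a * g * x ** (a - 1) * s * b ** (s - 1) * (1 + math.log(b) * s)

def var_q(q, a, g, b):
    # VaR by bisection on t * b**(t - 1) = 1 - q with t = exp(-g x^a)
    lo, hi = 1e-15, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * b ** (mid - 1) < 1 - q:
            lo = mid
        else:
            hi = mid
    return (-math.log(0.5 * (lo + hi)) / g) ** (1 / a)

a, g, b, q = 2.0, 0.5, 2.0, 0.9    # illustrative parameter values
v = var_q(q, a, g, b)

# trapezoidal integration of x f(x) over (VaR_q, VaR_q + 20]
nstep = 40000
h = 20.0 / nstep
acc = 0.0
for i in range(nstep):
    x0, x1 = v + i * h, v + (i + 1) * h
    acc += 0.5 * h * (x0 * pdf(x0, a, g, b) + x1 * pdf(x1, a, g, b))
tvar = acc / (1 - q)

assert tvar > v          # TVaR always exceeds VaR at the same level
assert tvar < v + 20.0   # the truncated range already captures the tail mass
```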

5.3. Numerical Study of the Risk Measures

In this section, we provide a numerical study of the VaR and TVaR for the Weibull and Z-Weibull distributions for different sets of parameters. The process is described as follows:
(1) Random samples of size n = 100 are generated from the Weibull and Z-Weibull models
(2) The parameters are estimated via the maximum likelihood method
(3) 1000 repetitions are made to calculate the VaR and TVaR

The numerical results of the risk measures are provided in Tables 3 and 4 and displayed graphically in Figures 4 and 5 corresponding to each table.


Dist.       Level of significance   VaR       TVaR

Weibull
            0.700                   4.5373    7.3911
            0.750                   5.0945    7.9076
            0.800                   5.7601    8.5304
            0.850                   6.5967    9.3205
            0.900                   7.7437    10.4136
            0.950                   9.6399    12.2384
            0.990                   11.4765   14.0197
            0.999                   19.5352   21.9150

Z-Weibull
            0.700                   4.3778    13.2772
            0.750                   5.5376    14.9465
            0.800                   7.1015    17.1137
            0.850                   9.3408    20.1026
            0.900                   12.8999   24.6727
            0.950                   20.0017   33.4043
            0.990                   28.2956   43.2302
            0.999                   80.4988   101.7184


Dist.       Level of significance   VaR       TVaR

Weibull
            0.700                   1.5273    4.4325
            0.750                   1.9174    4.9762
            0.800                   2.4392    5.6792
            0.850                   3.1801    6.6436
            0.900                   4.3463    8.1092
            0.950                   6.6444    10.8853
            0.990                   9.2950    13.9806
            0.999                   25.5667   32.0408

Z-Weibull
            0.700                   1.7585    6.9318
            0.750                   2.3222    7.9131
            0.800                   3.1171    9.2175
            0.850                   4.3112    11.0668
            0.900                   6.3169    13.9926
            0.950                   10.6148   19.8560
            0.990                   16.0027   26.8019
            0.999                   55.1566   73.1187

The simulation is performed for Weibull and Z-Weibull for the selected values of parameters. A model with higher values of the risk measures is said to have a heavier tail. The simulated results provided in Tables 3 and 4 show that the proposed Z-Weibull model has higher values of the risk measures than the traditional Weibull distribution. The simulation results are graphically displayed in Figures 4 and 5, which show that the proposed model has a heavier tail than the Weibull distribution.

6. Practical Illustration via the Earthquake Insurance Data

The main applications of heavy-tailed models are in extreme value theory and insurance loss phenomena. In this section, we consider heavy-tailed earthquake insurance data to illustrate the usefulness of the Z-Weibull model. The data are reported by the "National Centers for Environmental Information," available at https://ngdc.noaa.gov/hazard/earthqk.shtml. We compare the goodness-of-fit results of the Z-Weibull distribution with those of other well-known heavy-tailed distributions. The competing models are as follows:
(i) W-claim (Weibull-claim) distribution
(ii) Weibull: G(x) = 1 − e^{−γ x^α}
(iii) B-XII: G(x) = 1 − (1 + x^c)^{−k}
(iv) GE: G(x) = (1 − e^{−γ x})^α
(v) EL: G(x) = [1 − (1 + γ x)^{−k}]^α
(vi) BW: G(x) = I_{1 − e^{−γ x^α}}(a, b), where I_z(a, b) denotes the regularized incomplete beta function
Next, we consider certain analytical measures in order to verify which distribution fits the considered data better. These measures include (i) discrimination measures, such as the Akaike information criterion (AIC), Bayesian information criterion (BIC), Hannan–Quinn information criterion (HQIC), and consistent Akaike information criterion (CAIC), and (ii) two other goodness-of-fit measures, the Anderson–Darling (AD) test statistic and the negative log-likelihood −ℓ. The discrimination measures are given as follows.
(vii) Akaike information criterion: AIC = 2k − 2ℓ
(viii) Consistent Akaike information criterion: CAIC = 2kn/(n − k − 1) − 2ℓ
(ix) Bayesian information criterion: BIC = k log(n) − 2ℓ
(x) Hannan–Quinn information criterion: HQIC = 2k log(log(n)) − 2ℓ
where ℓ denotes the maximized log-likelihood, k is the number of model parameters, and n is the sample size.
(xi) The AD test statistic is given by
AD = −n − (1/n) Σ_{i=1}^n (2i − 1)[log G(x_i) + log(1 − G(x_{n−i+1}))],
where n is the sample size and x_i is the ith observation in the sample when the data are sorted in ascending order.
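The discrimination measures are simple functions of the maximized log-likelihood ℓ, the number of parameters k, and the sample size n. Assuming the standard definitions above, they can be computed as follows (Python, purely illustrative input values):

```python
import math

def info_criteria(loglik, k, n):
    # assumed standard forms of the discrimination measures
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    caic = (2 * k * n) / (n - k - 1) - 2 * loglik
    hqic = 2 * k * math.log(math.log(n)) - 2 * loglik
    return aic, bic, caic, hqic

# illustrative values: loglik = maximized log-likelihood, k params, n obs
aic, bic, caic, hqic = info_criteria(-100.0, 3, 1000)
assert aic < hqic < bic   # HQIC sits between AIC and BIC for moderate n
assert caic > aic         # CAIC adds a small-sample correction to AIC
```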

All the computations are carried out using the optim() R-function with the argument method = "BFGS" (see Appendix). A model with the lowest values of these measures is chosen as the best fit to the data. The MLEs of the parameters, with standard errors in parentheses, are presented in Table 5; the discrimination measures are displayed in Table 6; and the AD statistic and −ℓ are provided in Table 7. Based on the considered data set, we observe that the Z-Weibull distribution is the best-fitting model among those considered.


Dist.       Estimates (standard errors)
Z-Weibull   2.346 (0.020),  0.564 (0.029),  2.295 (0.248)
W-claim     2.023 (0.076),  0.709 (0.056),  1.783 (0.197)
Weibull     2.110 (0.091),  0.889 (0.086)
B-XII       1.480 (0.030),  2.851 (0.107)
GE          2.283 (0.015),  4.308 (0.053)
EL          4.804 (4.910),  3.879 (5.400),  4.335 (0.054)
BW          1.609 (0.055),  0.644 (0.304),  1.616 (0.087),  2.237 (1.228)


Dist.       AIC       BIC       CAIC      HQIC
Z-Weibull   23286.95  23310.57  23289.92  23294.69
W-claim     23345.86  23364.09  23355.04  23354.73
Weibull     23379.64  23395.38  23386.84  23384.80
B-XII       24197.63  24213.37  24197.63  24202.78
GE          23846.71  23862.45  23853.57  23851.86
EL          23863.99  23887.60  23871.99  23871.72
BW          23338.24  23369.73  23343.17  23348.56


Dist.       −ℓ        AD
Z-Weibull   11640.48  0.840
W-claim     11667.98  0.986
Weibull     11687.82  1.071
B-XII       12096.81  5.139
GE          11921.35  4.965
EL          11928.99  5.012
BW          11665.12  0.952

As we can see, the results show that the Z-Weibull distribution provides a better fit than the other competitors. Hence, the proposed model can serve as the best candidate model for modeling insurance data sets. Furthermore, in support of Tables 6 and 7, the estimated cdf and pdf are plotted in Figure 6. The Kaplan–Meier survival plot and PP plot are sketched in Figure 7, whereas the QQ plot of the proposed distribution and the box plot of the earthquake data are presented in Figure 8. From the estimated pdf in Figure 6, it is clear that the proposed distribution provides an adequate fit to the heavy-tailed earthquake data. From Figures 6 and 7, we can easily see that the proposed distribution follows the empirical cdf and the Kaplan–Meier survival curve very closely. The PP and QQ plots, which serve as graphical counterparts of the analytical measures, show that the Z-Weibull distribution provides the best fit to the real data. Finally, the box plot (Figure 8) gives graphical evidence that the data are right-skewed with a heavy tail.

Furthermore, using the earthquake insurance data, we obtained the values of the Kolmogorov–Smirnov (KS) statistic of the proposed and other competing models. Then, we applied the parametric bootstrap technique [30] to obtain a bootstrapped p value for each competing model. The KS statistics and the corresponding bootstrapped p values are provided in Table 8. Based on the results in Table 8, we conclude that the proposed model is the best candidate among the competing distributions for modeling the insurance claim data.
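The parametric bootstrap for the KS p value follows a resimulate-and-refit loop. The Python sketch below illustrates the procedure with a simple exponential model standing in for the Z-Weibull (so that the MLE has a closed form); the loop structure is the same in the general case:

```python
import math
import random

def ks_stat(data, cdf):
    # two-sided Kolmogorov-Smirnov distance between data and a fitted cdf
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        F = cdf(x)
        d = max(d, (i + 1) / n - F, F - i / n)
    return d

random.seed(7)
data = [random.expovariate(2.0) for _ in range(100)]

rate_hat = len(data) / sum(data)                  # exponential MLE
d_obs = ks_stat(data, lambda x: 1 - math.exp(-rate_hat * x))

# parametric bootstrap: simulate from the fitted model, refit, recompute KS
B, exceed = 200, 0
for _ in range(B):
    boot = [random.expovariate(rate_hat) for _ in range(100)]
    r = len(boot) / sum(boot)
    if ks_stat(boot, lambda x, r=r: 1 - math.exp(-r * x)) >= d_obs:
        exceed += 1
p_boot = exceed / B                               # bootstrapped p value

assert 0.0 <= p_boot <= 1.0
assert d_obs < 0.2  # a well-specified fit tracks the empirical cdf closely
```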


Dist.       KS      Bootstrapped p value
Z-Weibull   0.205   0.976
W-claim     0.367   0.806
Weibull     0.476   0.704
B-XII       0.954   0.408
GE          0.590   0.605
EL          0.869   0.502
BW          0.406   0.775

7. Bayesian Estimation

The Bayesian inference procedure has been considered by many statistical researchers, especially in the fields of survival analysis and reliability engineering. In this section, complete sample data are analyzed from the Bayesian point of view. We assume that the parameters α, γ, and σ of the Z-Weibull distribution have independent gamma prior distributions:

π₁(α) ∝ α^{a₁−1} e^{−b₁α}, π₂(γ) ∝ γ^{a₂−1} e^{−b₂γ}, π₃(σ) ∝ σ^{a₃−1} e^{−b₃σ},

where the hyperparameters a₁, b₁, a₂, b₂, a₃, and b₃ are positive. For more on choosing gamma priors, refer to Kundu and Howlader [31], S. Dey and T. Dey [32], Dey et al. [33], and Dey et al. [34]. Hence, the joint prior density function is formulated as

π(α, γ, σ) ∝ α^{a₁−1} γ^{a₂−1} σ^{a₃−1} e^{−(b₁α + b₂γ + b₃σ)}.  (32)

In Bayesian estimation, choosing an estimator incurs a loss measuring the discrepancy between the actual value of the parameter and the corresponding estimator. Five well-known loss functions, the associated Bayes estimators, and the corresponding posterior risks are presented in Table 9.


Loss function                            Bayes estimator                      Posterior risk
L1 = SELF = (θ − d)²                     E(θ|x)                               Var(θ|x)
L2 = WSELF = (θ − d)²/θ                  [E(θ⁻¹|x)]⁻¹                         E(θ|x) − [E(θ⁻¹|x)]⁻¹
L3 = MSELF = (1 − d/θ)²                  E(θ⁻¹|x)/E(θ⁻²|x)                    1 − [E(θ⁻¹|x)]²/E(θ⁻²|x)
L4 = PLF = (θ − d)²/d                    [E(θ²|x)]^{1/2}                      2{[E(θ²|x)]^{1/2} − E(θ|x)}
L5 = KLF = (√(d/θ) − √(θ/d))²            [E(θ|x)/E(θ⁻¹|x)]^{1/2}              2{[E(θ|x)E(θ⁻¹|x)]^{1/2} − 1}

For more details, see Calabria and Pulcini [35] and Dey et al. [36]. Next, we provide the posterior probability distribution for a complete data set. The likelihood function of the Z-Weibull model for a complete sample x₁, …, x_n is

L(data | α, γ, σ) = Π_{i=1}^n α γ x_i^{α−1} e^{−γ x_i^α} σ^{e^{−γ x_i^α} − 1} [1 + e^{−γ x_i^α} log σ].

The joint posterior distribution, in terms of the likelihood function L(data | α, γ, σ) and the joint prior distribution π(α, γ, σ), is defined as

π(α, γ, σ | data) = L(data | α, γ, σ) π(α, γ, σ) / ∫∫∫ L(data | α, γ, σ) π(α, γ, σ) dα dγ dσ.

Hence, the joint posterior density of the parameters α, γ, and σ for complete sample data is obtained by combining the likelihood function and the joint prior density (32). Up to the normalizing constant, the joint posterior density function is given by

π(α, γ, σ | data) ∝ α^{n+a₁−1} γ^{n+a₂−1} σ^{a₃−1} e^{−(b₁α + b₂γ + b₃σ)} Π_{i=1}^n x_i^{α−1} e^{−γ x_i^α} σ^{e^{−γ x_i^α} − 1} [1 + e^{−γ x_i^α} log σ].

Moreover, the marginal posterior density of each parameter θ_j ∈ {α, γ, σ} is obtained by integrating the joint posterior over the remaining two parameters,

π(θ_j | data) = ∫∫ π(α, γ, σ | data) dθ_{−j},

where θ_{−j} denotes the vector of the other two parameters.

From the joint and marginal posterior densities, it is clear that there is no closed form for the Bayesian estimators under the five loss functions described in Table 9. Therefore, we use an MCMC procedure based on 10,000 replicates to compute the Bayesian estimators.
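Given posterior draws from the sampler, the Bayes estimators of Table 9 reduce to simple posterior moments (assuming the standard Calabria–Pulcini forms). The Python sketch below computes them from a stand-in posterior sample (a gamma sample is used purely as a placeholder); note the ordering MSELF ≤ WSELF ≤ KLF ≤ SELF ≤ PLF forced by Jensen's inequality, which is also visible in the σ column of Table 10:

```python
import math
import random

def bayes_estimates(draws):
    # posterior moments needed by the loss functions of Table 9 (assumed
    # standard forms): E(theta), E(theta^2), E(1/theta), E(1/theta^2)
    n = len(draws)
    m1 = sum(draws) / n
    m2 = sum(t * t for t in draws) / n
    i1 = sum(1.0 / t for t in draws) / n
    i2 = sum(1.0 / (t * t) for t in draws) / n
    return {
        "SELF": m1,                 # posterior mean
        "WSELF": 1.0 / i1,          # weighted squared error loss
        "MSELF": i1 / i2,           # modified squared error loss
        "PLF": math.sqrt(m2),       # precautionary loss
        "KLF": math.sqrt(m1 / i1),  # K-loss (Kullback-Leibler type)
    }

random.seed(3)
draws = [random.gammavariate(50.0, 0.05) for _ in range(5000)]  # placeholder posterior
est = bayes_estimates(draws)
# Jensen's inequality forces this ordering for any positive posterior sample
assert est["MSELF"] <= est["WSELF"] <= est["KLF"] <= est["SELF"] <= est["PLF"]
assert abs(est["SELF"] - 2.5) < 0.1  # mean of the placeholder posterior is ~2.5
```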

Because of the intractable integrals associated with the joint and marginal posterior distributions, one needs numerical methods to generate posterior samples. The two most popular MCMC methods are the Metropolis–Hastings algorithm [37, 38] and Gibbs sampling [39]. Gibbs sampling is a special case of the Metropolis–Hastings algorithm which generates a Markov chain by sampling from the full set of conditional distributions. In practice, simulations based on Gibbs sampling are conducted through the special-purpose software WinBUGS, developed in 1997 to simulate from complex posterior distributions where analytical or numerical integration techniques cannot be applied. One may also use OpenBUGS, an open-source version of WinBUGS. Using Gibbs sampling, we obtain samples from the joint posterior distribution and carry out the Bayesian analysis in OpenBUGS.

The process is described as follows:
(i) The Gibbs sampling technique is used to generate posterior samples
(ii) 10,000 replicates are made to compute the Bayesian estimators via the OpenBUGS software
(iii) Following the idea of Congdon [40], vague gamma priors are chosen, as we do not have any prior information about the hyperparameters

The corresponding Bayesian estimates and posterior risks are provided in Table 10. Table 11 provides 95% credible and HPD intervals for each parameter of the Z-Weibull distribution. Moreover, we provide the posterior summary plots in Figures 9–11. These plots confirm that the sampling process is of good quality and that convergence does occur.


Bayes estimates and posterior risks

Loss function   α: Estimate   α: Risk     γ: Estimate   γ: Risk   σ: Estimate   σ: Risk
SELF            2.34808       0.00048     0.56078       0.00113   2.35742       0.08661
WSELF           2.34787       0.00020     0.55876       0.00201   2.32162       0.03580
MSELF           2.34766       8.90e−05    0.55675       0.00359   2.28686       0.01497
PLF             2.34819       0.00020     0.56179       0.00202   2.37572       0.03659
KLF             2.34798       8.879e−05   0.55977       0.00360   2.33945       0.01536


Parameters   Credible interval   HPD interval
α            (2.334, 2.364)      (2.303, 2.388)
γ            (0.536, 0.585)      (0.410, 0.625)
σ            (2.139, 2.562)      (1.827, 2.883)

8. Discussion and Future Framework

Statistical decision theory addresses the state of uncertainty and provides a rational framework for dealing with problems of actuarial and financial decision-making. Insurance data sets are generally right-skewed and heavy-tailed, and the traditional distributions are not flexible enough to capture such complex data.

Due to the importance of statistical distributions in actuarial sciences, a number of papers have appeared in the literature aiming to improve the characteristics of the existing distributions. Although this has been achieved, unfortunately, the number of parameters has increased, and the estimation of parameters and the derivation of mathematical properties have become complicated.

To provide a better description of the insurance science data, therefore, in this study, an attempt has been made to introduce a new family of statistical distributions aiming to increase the flexibility of the existing distributions. A special submodel of the proposed family offers the best fitting to the heavy-tailed insurance science data. The maximum likelihood method is adopted to estimate the model parameters, and a comprehensive Monte Carlo simulation study is done to evaluate the behavior of the estimators.

To show the usefulness of the proposed method in insurance sciences, a real-life application to the earthquake insurance data is discussed. The analysis showed that the proposed model performs much better than the other competitive distributions.

From the above discussion, it is obvious that researchers are always in search of new flexible distributions. Therefore, to bring further flexibility to the proposed model, we suggest introducing extended versions of it by adding a shape parameter.
(i) A random variable X is said to follow the exponentiated version of the Z-family if its cdf is given by

F_EZ(x; δ, σ, ξ) = {1 − [1 − F(x; ξ)] σ^{−F(x; ξ)}}^δ, δ > 0,  (38)

where δ is the additional shape parameter. For δ = 1, expression (38) reduces to (6). The new proposal may be named the exponentiated Z-family. For illustrative purposes, one may consider its special subcase, called the exponentiated Z-Weibull (EZ-Weibull) distribution, defined by the cdf

F_EZ(x; δ, σ, α, γ) = [1 − e^{−γ x^α} σ^{e^{−γ x^α} − 1}]^δ.  (39)

Due to the introduction of the additional shape parameter, the suggested extension may be much more flexible in modeling data in insurance sciences and other related fields.
(ii) Another extension of the Z-family is given by

F_EX(x; θ, σ, ξ) = 1 − {[1 − F(x; ξ)] σ^{−F(x; ξ)}}^θ, θ > 0,  (40)

where θ is the additional shape parameter. For θ = 1, expression (40) reduces to (6). The model defined in (40) may be named the extended Z-family.
(iii) Another generalized version of the Z-family can be introduced via

F_EEZ(x; δ, θ, σ, ξ) = (1 − {[1 − F(x; ξ)] σ^{−F(x; ξ)}}^θ)^δ,  (41)

where δ and θ are the additional shape parameters. Clearly, for δ = 1, expression (41) reduces to (40); for θ = 1, expression (41) reduces to (38); and for δ = θ = 1, expression (41) reduces to (6). The model introduced in (41) may be named the extended exponentiated Z-family.
(iv) Further generalized versions of the extended Z-family can be constructed along the same lines.
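The exponentiated extension is easy to sanity-check numerically. Assuming it raises the Z-Weibull cdf to a power δ (so that δ = 1 recovers the Z-Weibull model, as stated above), the Python sketch below verifies the reduction and that the result remains a valid cdf:

```python
import math

def z_cdf(x, a, g, b):
    # Z-Weibull cdf from Section 3 (a, g baseline Weibull; b the Z parameter)
    s = math.exp(-g * x ** a)
    return 1 - s * b ** (s - 1)

def ez_cdf(x, a, g, b, delta):
    # assumed exponentiated form: the Z-Weibull cdf raised to the power delta
    return z_cdf(x, a, g, b) ** delta

xs = [0.2 * i for i in range(1, 30)]
# delta = 1 recovers the Z-Weibull cdf exactly
assert all(abs(ez_cdf(x, 1.5, 0.5, 2.0, 1.0) - z_cdf(x, 1.5, 0.5, 2.0)) < 1e-12
           for x in xs)
# other delta > 0 still gives an increasing function with values in (0, 1)
vals = [ez_cdf(x, 1.5, 0.5, 2.0, 0.5) for x in xs]
assert all(0 < v < 1 for v in vals)
assert all(u < w for u, w in zip(vals, vals[1:]))
```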

9. Concluding Remarks

A variety of methods for proposing new heavy-tailed distributions have been developed to model data related to financial and actuarial sciences. We carried this line of research further and introduced a new family of heavy-tailed distributions. Some distributional properties are derived, and the method of maximum likelihood estimation is discussed to estimate the model parameters. In addition to the distributional properties, some actuarial measures are also derived, and a comprehensive simulation study based on these measures is conducted. We focused on a three-parameter special model called the Z-Weibull distribution. To demonstrate the potential and usefulness of the Z-Weibull distribution, earthquake insurance data are analyzed, and a comparison is made with other well-known distributions. The analysis shows that the proposed model performs better than the other competitive models. Bayesian analysis using the earthquake data is also provided. Finally, some new extensions are suggested which may further improve the characteristics of the proposed family.

Appendix

R-Code for Analysis

Note: in the following R code, a, g, and b denote the three model parameters, and pm stands for the proposed model.

data <- read.csv(file.choose(), header = TRUE)
data <- data[, 1]
data <- data[!is.na(data)]
data <- data / 100
data
hist(data)

### --- Probability density function
pdf_pm <- function(par, x) {
  a <- par[1]
  g <- par[2]
  b <- par[3]
  (a * g * (x^(a - 1)) * exp(-g * x^a) * (b^(exp(-g * x^a) - 1))) *
    (1 + log(b) * exp(-g * x^a))
}

### --- Cumulative distribution function
cdf_pm <- function(par, x) {
  a <- par[1]
  g <- par[2]
  b <- par[3]
  1 - (exp(-g * x^a) / (b^(1 - exp(-g * x^a))))
}

library(AdequacyModel)  # provides goodness.fit()
set.seed(0)
goodness.fit(pdf = pdf_pm, cdf = cdf_pm, starts = c(0.5, 0.5, 0.5),
             data = data, method = "BFGS", domain = c(0, Inf), mle = NULL)

### ------------------------------------------------------------
### Estimated pdf
### ------------------------------------------------------------
x <- read.csv(file.choose(), header = TRUE)
x <- x[, 1]
x <- x[!is.na(x)]
x <- x / 100
x

### --- Parameter values
a <- 2.346
g <- 0.564
b <- 2.295
x <- sort(x)
f <- (a * g * (x^(a - 1)) * exp(-g * x^a) * (b^(exp(-g * x^a) - 1))) *
  (1 + log(b) * exp(-g * x^a))
yrange <- c(0, 1)
xrange <- c(min(x), max(x))
hist(x, freq = FALSE, breaks = 15, xlim = xrange, ylim = yrange,
     ylab = "Estimated pdf", xlab = "x", main = "")
lines(x, f, lty = 1, lwd = 3, col = "blue")

### ------------------------------------------------------------
### Estimated cdf
### ------------------------------------------------------------
x <- read.csv(file.choose(), header = TRUE)
x <- x[, 1]
x <- x[!is.na(x)]
x <- x / 100
x

### --- Parameter values
a <- 2.346
g <- 0.564
b <- 2.295
x <- sort(x)
F1 <- ecdf(x)
emp_cdf <- F1(x)
zweibullcdf <- 1 - (exp(-g * x^a) / (b^(1 - exp(-g * x^a))))
plot(x, emp_cdf, lty = 1, lwd = 2.5, type = "s", xlab = "x",
     ylab = "Estimated cdf", ylim = c(0, 1), xlim = c(min(x), max(x)), col = "black")
par(new = TRUE)
plot(x, zweibullcdf, lty = 1, lwd = 3, type = "l", xlab = "x",
     ylab = "Estimated cdf", ylim = c(0, 1), xlim = c(min(x), max(x)), col = "blue")

### ------------------------------------------------------------
### Kaplan-Meier survival plot
### ------------------------------------------------------------
x <- read.csv(file.choose(), header = TRUE)
x <- x[, 1]
x <- x[!is.na(x)]
x <- x / 100
x

library(survival)
delta <- rep(1, length(x))
x <- sort(x)
km <- survfit(Surv(x, delta) ~ 1)
plot(km, conf.int = FALSE, ylab = "Kaplan-Meier survival plot", xlab = "x")

### --- Parameter values
a <- 2.346
g <- 0.564
b <- 2.295
ss <- function(x) {
  exp(-g * x^a) / (b^(1 - exp(-g * x^a)))
}
lines(seq(0, 3.5, length.out = 100), ss(seq(0, 3.5, length.out = 100)),
      col = "blue", lwd = 3)

### ------------------------------------------------------------
### PP plot
### ------------------------------------------------------------
x <- read.csv(file.choose(), header = TRUE)
x <- x[, 1]
x <- x[!is.na(x)]
x <- x / 100
x

cdfLD1 <- function(x, a, g, b) {
  1 - (exp(-g * x^a) / (b^(1 - exp(-g * x^a))))
}
x <- sort(x)
n <- length(x)
### --- Empirical distribution function
Fn <- seq(1, n) / n
plot(Fn, cdfLD1(x, 2.346, 0.564, 2.295), xlab = "x", ylab = "PP Plot",
     pch = 21, col = "blue", bg = "blue")
abline(0, 1)

### ------------------------------------------------------------
### QQ plot
### ------------------------------------------------------------
x <- read.csv(file.choose(), header = TRUE)
x <- x[, 1]
x <- x[!is.na(x)]
x <- x / 100
x

x <- sort(x)
qqnorm(x, pch = "@", col = "black", main = "")
qqline(x, col = "blue", lty = 1, lwd = 3)

### ------------------------------------------------------------
### Box plot
### ------------------------------------------------------------
x <- read.csv(file.choose(), header = TRUE)
x <- x[, 1]
x <- x[!is.na(x)]
x <- x / 100
x

boxplot(x, main = "", col = "blue", ylab = "x", xlab = "Box Plot")
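The appendix does not include code for the value at risk (VaR) and tail value at risk (TVaR) discussed in the article. A minimal sketch of how these could be approximated from the fitted cdf is given below; it numerically inverts the cdf rather than using any closed-form quantile from the paper, the parameter values are the estimates used above, and the names cdf_pm, VaR, and TVaR are illustrative, not taken from the original code.

```r
# Sketch (not part of the original appendix): approximate VaR and TVaR of the
# fitted Z-Weibull model by numerically inverting its cdf.
a <- 2.346  # fitted parameter values used in the appendix
g <- 0.564
b <- 2.295

cdf_pm <- function(x) 1 - exp(-g * x^a) / (b^(1 - exp(-g * x^a)))

# VaR at level p is the quantile solving F(x) = p
VaR <- function(p) {
  uniroot(function(x) cdf_pm(x) - p, interval = c(1e-8, 50), tol = 1e-12)$root
}

# TVaR(p) = (1 / (1 - p)) * integral of VaR(u) du over (p, 1)
TVaR <- function(p) {
  integrate(Vectorize(VaR), lower = p, upper = 1)$value / (1 - p)
}

VaR(0.95)   # 95% quantile of the fitted model
TVaR(0.95)  # expected loss given that the loss exceeds VaR(0.95)
```

The bracketing interval c(1e-8, 50) is an assumption chosen to comfortably cover the rescaled data; it would need to be widened for other parameter values.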

Data Availability

This work is mainly a methodological development and has been applied to secondary earthquake insurance data; the data can be provided on request.

Disclosure

This article is drafted from the PhD work of the first author (Zubair Ahmad).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. M. Guillen, F. Prieto, and J. M. Sarabia, "Modelling losses and locating the tail with the Pareto positive stable distribution," Insurance: Mathematics and Economics, vol. 49, no. 3, pp. 454–461, 2011.
  2. D. Bhati and S. Ravi, "On generalized log-Moyal distribution: a new heavy tailed size distribution," Insurance: Mathematics and Economics, vol. 79, pp. 247–259, 2018.
  3. J. Beirlant, G. Matthys, and G. Dierckx, "Heavy-tailed distributions and rating," ASTIN Bulletin, vol. 31, no. 1, pp. 37–58, 2001.
  4. Z. Ahmad, E. Mahmoudi, G. G. Hamedani, and O. Kharazmi, "New methods to define heavy-tailed distributions with applications to insurance data," Journal of Taibah University for Science, vol. 14, no. 1, pp. 359–382, 2020.
  5. M. Eling, "Fitting insurance claims to skewed distributions: are the skew-normal and skew-student good models?" Insurance: Mathematics and Economics, vol. 51, no. 2, pp. 239–248, 2012.
  6. C. Adcock, M. Eling, and N. Loperfido, "Skewed distributions in finance and actuarial science: a review," The European Journal of Finance, vol. 21, no. 13-14, pp. 1253–1281, 2015.
  7. T. Shushi, "Skew-elliptical distributions with applications in risk theory," European Actuarial Journal, vol. 7, no. 1, pp. 277–296, 2017.
  8. A. Punzo, A. Mazza, and A. Maruotti, "Fitting insurance and economic data with outliers: a flexible approach based on finite mixtures of contaminated gamma distributions," Journal of Applied Statistics, vol. 45, no. 14, pp. 2563–2584, 2018.
  9. A. Azzalini, T. Del Cappello, and S. Kotz, "Log-skew-normal and log-skew-t distributions as models for family income data," Journal of Income Distribution, vol. 11, no. 3-4, p. 2, 2002.
  10. L. Bagnato and A. Punzo, "Finite mixtures of unimodal beta and gamma densities and the k-bumps algorithm," Computational Statistics, vol. 28, no. 4, pp. 1571–1597, 2013.
  11. G. A. Paula, V. Leiva, M. Barros, and S. Liu, "Robust statistical modeling using the Birnbaum-Saunders-t distribution applied to insurance," Applied Stochastic Models in Business and Industry, vol. 28, no. 1, pp. 16–34, 2012.
  12. S. A. Klugman, H. H. Panjer, and G. E. Willmot, Loss Models: From Data to Decisions, vol. 715, John Wiley & Sons, Hoboken, NJ, USA, 2012.
  13. S. Nadarajah and S. A. A. Bakar, "New composite models for the Danish fire insurance data," Scandinavian Actuarial Journal, vol. 2014, no. 2, pp. 180–187, 2014.
  14. S. A. Bakar, N. A. Hamzah, M. Maghsoudi, and S. Nadarajah, "Modeling loss data using composite models," Insurance: Mathematics and Economics, vol. 61, pp. 146–154, 2015.
  15. A. Punzo, L. Bagnato, and A. Maruotti, "Compound unimodal distributions for insurance losses," Insurance: Mathematics and Economics, vol. 81, pp. 95–107, 2018.
  16. A. Mazza and A. Punzo, "Modeling household income with contaminated unimodal distributions," in New Statistical Developments in Data Science, pp. 373–391, Springer, Cham, Switzerland, 2019.
  17. M. Bernardi, A. Maruotti, and L. Petrella, "Skew mixture models for loss distributions: a Bayesian approach," Insurance: Mathematics and Economics, vol. 51, no. 3, pp. 617–623, 2012.
  18. T. Miljkovic and B. Grün, "Modeling loss data using mixtures of distributions," Insurance: Mathematics and Economics, vol. 70, pp. 387–396, 2016.
  19. K. Dutta and J. Perry, "A tale of tails: an empirical analysis of loss distribution models for estimating operational risk capital," Tech. Rep., Federal Reserve Bank of Boston, Boston, MA, USA, 2006, Working Papers, No. 06-13.
  20. S. Nasiru, "Extended odd Fréchet-G family of distributions," Journal of Probability and Statistics, vol. 2018, Article ID 2931326, 12 pages, 2018.
  21. M. A. Cortés, D. Elal-Olivero, and J. F. Olivares-Pacheco, "A new class of distributions generated by the extended bimodal-normal distribution," Journal of Probability and Statistics, vol. 2018, Article ID 9753439, 10 pages, 2018.
  22. H. Al-Mofleh, "On generating a new family of distributions using the tangent function," Pakistan Journal of Statistics and Operation Research, vol. 14, no. 3, pp. 471–499, 2018.
  23. M. E. Mead, G. M. Cordeiro, A. Z. Afify, and H. A. Mofleh, "The alpha power transformation family: properties and applications," Pakistan Journal of Statistics and Operation Research, vol. 15, pp. 525–545, 2019.
  24. W. He, Z. Ahmad, A. Z. Afify, and H. Goual, "The arcsine exponentiated-X family: validation and insurance application," Complexity, vol. 2020, Article ID 8394815, 18 pages, 2020.
  25. Z. Ahmad, E. Mahmoudi, S. Dey, and S. K. Khosa, "Modeling vehicle insurance loss data using a new member of T-X family of distribution," Journal of Statistical Theory and Practice, vol. 19, no. 2, pp. 133–147, 2020.
  26. Z. Ahmad, E. Mahmoudi, and S. Dey, "A new family of heavy-tailed distribution with an application to the heavy-tailed insurance loss data," Communications in Statistics: Simulation and Computation, vol. 49, 2020.
  27. A. Alzaatreh, C. Lee, and F. Famoye, "A new method for generating families of continuous distributions," Metron, vol. 71, no. 1, pp. 63–79, 2013.
  28. Z. Ahmad, G. G. Hamedani, and N. S. Butt, "Recent developments in distribution theory: a brief survey and some new generalized classes of distributions," Pakistan Journal of Statistics and Operation Research, vol. 15, no. 1, pp. 87–110, 2019.
  29. P. Artzner, "Application of coherent risk measures to capital requirements in insurance," North American Actuarial Journal, vol. 3, no. 2, pp. 11–25, 1999.
  30. W. Stute, W. G. Manteiga, and M. P. Quindimil, "Bootstrap based goodness-of-fit-tests," Metrika, vol. 40, no. 1, pp. 243–256, 1993.
  31. D. Kundu and H. Howlader, "Bayesian inference and prediction of the inverse Weibull distribution for type-II censored data," Computational Statistics & Data Analysis, vol. 54, no. 6, pp. 1547–1558, 2010.
  32. S. Dey and T. Dey, "On progressively censored generalized inverted exponential distribution," Journal of Applied Statistics, vol. 41, no. 12, pp. 2557–2576, 2014.
  33. S. Dey, S. Ali, and C. Park, "Weighted exponential distribution: properties and different methods of estimation," Journal of Statistical Computation and Simulation, vol. 85, no. 18, pp. 3641–3661, 2015.
  34. S. Dey, T. Dey, S. Ali, and M. S. Mulekar, "Two-parameter Maxwell distribution: properties and different methods of estimation," Journal of Statistical Theory and Practice, vol. 10, no. 2, pp. 291–310, 2016.
  35. R. Calabria and G. Pulcini, "Point estimation under asymmetric loss functions for left truncated exponential samples," Communications in Statistics: Theory and Methods, vol. 25, no. 3, pp. 585–600, 1996.
  36. S. Dey, S. Singh, Y. M. Tripathi, and A. Asgharzadeh, "Estimation and prediction for a progressively censored generalized inverted exponential distribution," Statistical Methodology, vol. 32, pp. 185–202, 2016.
  37. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," The Journal of Chemical Physics, vol. 21, no. 6, pp. 1087–1092, 1953.
  38. W. K. Hastings, "Monte Carlo sampling methods using Markov chains and their applications," Biometrika, vol. 57, no. 1, pp. 97–109, 1970.
  39. S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 6, no. 6, pp. 721–741, 1984.
  40. P. Congdon, Bayesian Statistical Modelling, John Wiley & Sons, New York, NY, USA, 2001.

Copyright © 2020 Zubair Ahmad et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

