Journal of Probability and Statistics

Research Article | Open Access

Volume 2016 | Article ID 7208425 | 12 pages | https://doi.org/10.1155/2016/7208425

General Results for the Transmuted Family of Distributions and New Models

Academic Editor: Zacharias Psaradakis
Received 04 Oct 2015
Accepted 29 Dec 2015
Published 31 Jan 2016

Abstract

The transmuted family of distributions has been receiving increased attention over the last few years. For a baseline G distribution, we derive a simple representation for the transmuted-G family density function as a linear mixture of the G and exponentiated-G densities. We investigate the asymptotes and shapes and obtain explicit expressions for the ordinary and incomplete moments, quantile and generating functions, mean deviations, Rényi and Shannon entropies, and order statistics and their moments. We estimate the model parameters of the family by the method of maximum likelihood. We demonstrate empirically the flexibility of the proposed model by means of an application to a real data set.

1. Introduction

Adding parameters to a well-established distribution is a time-honored device for obtaining more flexible new families of distributions. Shaw and Buckley [1] pioneered an interesting method of adding a new parameter to an existing distribution to offer more distributional flexibility. They used the quadratic rank transmutation map (QRTM) to generate a flexible family of distributions. The generated family, also called the transmuted extended distribution, includes the parent distribution as a special case and gives more flexibility to model various types of data.

In the last three years, there has been a growing interest in transmuted distributions and several of them have been investigated. A significant amount of work has been devoted to developing new transmuted models and discussing their enhanced flexibility in modeling various types of real-life data for which the parent model does not provide a good fit. Aryal and Tsokos [2] defined the transmuted generalized extreme value distribution and studied some basic mathematical characteristics of the transmuted Gumbel distribution and its applications to climate data. Aryal and Tsokos [3] presented a new generalized Weibull distribution called the transmuted Weibull distribution. Recently, Aryal [4] proposed and studied various structural properties of the transmuted log-logistic distribution. Khan and King [5] introduced the transmuted modified Weibull distribution, which extends the transmuted Weibull distribution [3], and studied its mathematical properties and maximum likelihood estimation of the model parameters. Elbatal [6] proposed the transmuted modified inverse Weibull distribution. Elbatal and Aryal [7] explored the transmuted additive Weibull model, which extends the additive Weibull distribution and some other distributions using the QRTM method [1]. However, several published works did not investigate many properties such as the finite mixture representation of the density function, Rényi and Shannon entropies, extreme values, probability weighted moments (PWMs), and bivariate and multivariate generalizations. This paper aims to fill this gap in the existing literature and to contribute general properties of the transmuted family.

This vast literature merits a detailed study of the most general transmuted family of distributions, which is our main motivation for this work. In this paper, we derive general mathematical properties for the transmuted family, which hold for any baseline distribution, such as the ordinary, central, and incomplete moments, quantile and generating functions, mean deviations, Rényi and Shannon entropies, extreme values, PWMs, order statistics and their moments, and bivariate and multivariate generalizations. We provide a comprehensive description of these properties with the hope that the transmuted family will attract wider applications in biology, medicine, economics, reliability, engineering, and other areas of research. We also introduce new distributions based on the transmuted construction.

The rest of the paper is organized as follows. In Section 2, we discuss the general theory behind the transmuted distribution and present useful representations for the density and cumulative functions. In Section 3, we investigate its asymptotes and shapes. In Section 4, we provide an algorithm for generating samples from the transmuted family based on its quantile function (qf). In Section 5, we derive expressions for the moments and generating function. In Section 6, we obtain mean deviations and provide some examples. In Section 7, we present two special transmuted models. In Section 8, we discuss the limiting behavior of the extreme statistics. In Section 9, we derive the PWMs. In Section 10, we obtain the order statistics. In Section 11, we derive expressions for the Shannon and Rényi entropies and the Kullback-Leibler divergence measure. In Section 12, we introduce the bivariate and multivariate extensions of the univariate transmuted family. In Section 13, we use the maximum likelihood method to estimate the model parameters. In Section 14, we fit some special models of the transmuted family to a real data set to demonstrate its usefulness empirically. In Section 15, we offer some concluding remarks.

2. Distribution and Density Functions

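Let $F_1$ and $F_2$ be the cumulative distribution functions (cdfs) of two models with a common sample space. The general rank transmutation as given in Shaw and Buckley [1] is defined as $G_{R12}(u) = F_2(F_1^{-1}(u))$ and $G_{R21}(u) = F_1(F_2^{-1}(u))$. Note that the qf is defined by $F^{-1}(y) = \inf\{x \in \mathbb{R} : F(x) \ge y\}$ for $y \in [0,1]$. Functions $G_{R12}(u)$ and $G_{R21}(u)$ both map the unit interval $[0,1]$ into itself and, under suitable assumptions, are mutual inverses and satisfy $G_{Rij}(0) = 0$ and $G_{Rij}(1) = 1$ (for $i, j = 1, 2$). The QRTM is defined by $G_{R12}(u) = u + \lambda u(1-u)$, $|\lambda| \le 1$, from which it follows that $F_2(x) = (1+\lambda)F_1(x) - \lambda F_1(x)^2$. Differentiating gives $f_2(x) = f_1(x)[1 + \lambda - 2\lambda F_1(x)]$, where $f_1(x)$ and $f_2(x)$ are the probability density functions (pdfs) corresponding to the cdfs $F_1(x)$ and $F_2(x)$, respectively. For more details about the QRTM approach, see Shaw and Buckley [1].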

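A random variable $X$ has the transmuted-G (TG) family if the pdf and cdf are defined through the QRTM method by (for $x \in \mathbb{R}$)

$$f(x; \lambda, \xi) = g(x; \xi)\,[1 + \lambda - 2\lambda G(x; \xi)], \qquad (1)$$

$$F(x; \lambda, \xi) = (1+\lambda)\,G(x; \xi) - \lambda\,G(x; \xi)^2, \qquad (2)$$

where $G(x; \xi)$ is the parent cdf, $g(x; \xi)$ is the parent pdf, and $|\lambda| \le 1$. Both functions depend on the parameter vector $\xi$. For $\lambda = 0$, it reduces to the parent model. Hereafter, the random variable $X$ following (1) with parameter $\lambda$ and baseline vector of parameters $\xi$ is denoted by $X \sim \mathrm{TG}(\lambda, \xi)$. The computations for fitting family (1) to real data in practical problems can be easily performed using the AdequacyModel package in the R software.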
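As a hedged illustration of this construction, the following Python sketch evaluates the transmuted-G pdf and cdf for an arbitrary baseline, assuming the usual QRTM form $F = (1+\lambda)G - \lambda G^2$ and $f = g\,[1 + \lambda - 2\lambda G]$ with $|\lambda| \le 1$. The function names and the exponential baseline with rate 1.5 are illustrative assumptions, not part of the original paper.

```python
import math


def tg_cdf(x, lam, G):
    """Transmuted-G cdf, assuming F = (1 + lam) * G - lam * G**2 with |lam| <= 1."""
    if abs(lam) > 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    u = G(x)
    return (1.0 + lam) * u - lam * u * u


def tg_pdf(x, lam, G, g):
    """Transmuted-G pdf, assuming f = g * (1 + lam - 2 * lam * G)."""
    if abs(lam) > 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    return g(x) * (1.0 + lam - 2.0 * lam * G(x))


# Hypothetical baseline chosen only for illustration: exponential with rate 1.5.
RATE = 1.5


def exp_cdf(x):
    return 1.0 - math.exp(-RATE * x) if x > 0 else 0.0


def exp_pdf(x):
    return RATE * math.exp(-RATE * x) if x > 0 else 0.0


if __name__ == "__main__":
    for lam in (-1.0, 0.0, 0.5, 1.0):
        print(lam, tg_cdf(0.7, lam, exp_cdf), tg_pdf(0.7, lam, exp_cdf, exp_pdf))
```

Passing the baseline as a pair of callables keeps the sketch independent of any particular parent distribution; setting `lam = 0` returns the baseline itself.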

For an arbitrary baseline cdf , a random variable is said to have the Exp- distribution with power parameter , say , if its pdf and cdf are given by respectively. Note that . The properties of exponentiated distributions have been studied by many authors in recent years. See, for example, Mudholkar and Srivastava [8] for exponentiated Weibull, Gupta et al. [9] for exponentiated Pareto, Gupta and Kundu [10] for exponentiated exponential, Nadarajah [11] for exponentiated Gumbel, Kakde and Shirke [12] for exponentiated lognormal, and Nadarajah and Gupta [13] for exponentiated gamma distributions.

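Theorem 1. The density function of $X$ can be expressed as the linear mixture

$$f(x) = (1+\lambda)\,g(x) - \lambda\,h_2(x),$$

where $h_2(x) = 2\,g(x)\,G(x)$ is the Exp-G density with power parameter 2.

Corollary 2. If $\lambda = -1$, then $X$ has the Exp-G distribution with power parameter 2.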

Theorem 1 is important to obtain some measures of from those of exponentiated distributions. This result plays an important role in the paper, since we can obtain, for example, the moments, generating function, and mean deviations of . Established explicit expressions for these measures can be simpler than using numerical integration.
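The mixture representation can be checked numerically. The sketch below compares the transmuted density $g\,[1 + \lambda - 2\lambda G]$ with the combination $(1+\lambda)\,g - \lambda\,h_2$, where $h_2 = 2 g G$ is the exponentiated-G density with power 2. The exponential baseline with rate 2 and all function names are assumptions made only for this illustration.

```python
import math

RATE = 2.0  # hypothetical exponential baseline rate


def G(x):
    """Baseline cdf (exponential), valid for x > 0."""
    return 1.0 - math.exp(-RATE * x)


def g(x):
    """Baseline pdf (exponential), valid for x > 0."""
    return RATE * math.exp(-RATE * x)


def exp_g_pdf(x, power):
    """Exponentiated-G density: power * g(x) * G(x)**(power - 1)."""
    return power * g(x) * G(x) ** (power - 1)


def tg_pdf(x, lam):
    """Transmuted-G density in its defining form."""
    return g(x) * (1.0 + lam - 2.0 * lam * G(x))


def mixture_pdf(x, lam):
    """Same density written as (1 + lam) * g - lam * h_2."""
    return (1.0 + lam) * g(x) - lam * exp_g_pdf(x, 2)


def max_abs_gap(lams, xs):
    """Largest absolute difference between the two forms on a grid."""
    return max(abs(tg_pdf(x, lam) - mixture_pdf(x, lam)) for lam in lams for x in xs)


if __name__ == "__main__":
    print(max_abs_gap([-1.0, -0.5, 0.3, 1.0], [0.1, 0.5, 2.0]))
```

For `lam = -1` the first weight vanishes and the density coincides with `exp_g_pdf(x, 2)`.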

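The hazard rate function (hrf) of $X$ is given by

$$\tau(x) = \tau_G(x)\,\frac{1 + \lambda - 2\lambda G(x)}{1 - \lambda G(x)}, \qquad (5)$$

where $\tau_G(x) = g(x)/[1 - G(x)]$ is the baseline hrf. The multiplying quantity $[1 + \lambda - 2\lambda G(x)]/[1 - \lambda G(x)]$ is a kind of correction factor for the baseline hrf.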

Equation (5) can deal with general situations for modeling survival data with various hrf shapes. From this equation, we note that is decreasing in for and it is increasing in for . Additionally, we have for and , respectively.

Equation (5) can be expressed as where , , and and are the hrfs of the and Exp- distributions, respectively.
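The role of the correction factor can be illustrated with a short sketch. Assuming $F = (1+\lambda)G - \lambda G^2$, the survival function factors as $(1 - G)(1 - \lambda G)$, so the hazard equals the baseline hazard times $(1 + \lambda - 2\lambda G)/(1 - \lambda G)$. The unit-rate exponential baseline and the function names below are illustrative assumptions.

```python
import math

RATE = 1.0  # hypothetical exponential baseline rate


def G(x):
    return 1.0 - math.exp(-RATE * x)


def g(x):
    return RATE * math.exp(-RATE * x)


def baseline_hazard(x):
    """Baseline hrf g / (1 - G)."""
    return g(x) / (1.0 - G(x))


def tg_hazard(x, lam):
    """Hazard computed directly as f / (1 - F) from the transmuted pdf and cdf."""
    u = G(x)
    pdf = g(x) * (1.0 + lam - 2.0 * lam * u)
    survival = 1.0 - ((1.0 + lam) * u - lam * u * u)
    return pdf / survival


def correction_factor(x, lam):
    """Multiplier applied to the baseline hazard: (1 + lam - 2 lam G) / (1 - lam G)."""
    u = G(x)
    return (1.0 + lam - 2.0 * lam * u) / (1.0 - lam * u)


if __name__ == "__main__":
    for lam in (-1.0, -0.4, 0.6):
        print(lam, [round(correction_factor(x, lam), 4) for x in (0.2, 1.0, 3.0)])
```

Near the lower end of the support, where $G$ is close to zero, the factor is close to $1 + \lambda$.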

3. Asymptotes and Shapes

Proposition 3. The asymptotics of (1) and (2) as are(i),(ii),(iii), where .

Proposition 4. The asymptotics of (1) and (2) as are(i),(ii),(iii).

The shapes of the density and hazard functions of can be described analytically. The critical points of the pdf are the roots of the equation

There may be more than one root to (8). Let . We haveIf is a root of (9), then it corresponds to a local maximum if for all and for all . It corresponds to a local minimum if for all and for all . It refers to a point of inflection if either for all or for all .

The critical points of the hrf of are obtained from

Again, there may be two roots to (10). Let . We haveIf is a root of (11), then it corresponds to a local maximum if for all and for all . It corresponds to a local minimum if for all and for all . It refers to a point of inflection if either for all or for all .

4. Quantile Function and Simulation

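The qf of the TG family is given by

$$Q(u) = G^{-1}\!\left(\frac{1 + \lambda - \sqrt{(1+\lambda)^2 - 4\lambda u}}{2\lambda}\right), \quad \lambda \neq 0, \qquad (12)$$

where $G^{-1}(\cdot)$ is the inverse of the baseline cdf; for $\lambda = 0$, $Q(u) = G^{-1}(u)$. The TG family is easily simulated by Algorithm 1.

(1) Generate a random number $u$ from $U(0,1)$;
(2) If $\lambda = 0$, then compute a random number $X = G^{-1}(u)$; otherwise, $X = G^{-1}\!\left(\left[1 + \lambda - \sqrt{(1+\lambda)^2 - 4\lambda u}\right]/(2\lambda)\right)$;
(3) Repeat steps (1) and (2) until the required number of random values has been generated.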
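The steps above can be sketched in Python as follows. This is a minimal sketch, assuming the cdf $F = (1+\lambda)G - \lambda G^2$; instead of branching on $\lambda = 0$, it uses the algebraically equivalent form $v = 2u / \left(1 + \lambda + \sqrt{(1+\lambda)^2 - 4\lambda u}\right)$ for the root of $F = u$ in $G$, which avoids cancellation for small $|\lambda|$. That rearrangement, the function names, and the exponential baseline are choices made here, not taken from the paper.

```python
import math
import random


def tg_quantile(u, lam, G_inv):
    """Quantile of the transmuted-G family at probability u in [0, 1)."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    if abs(lam) > 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    if u == 0.0:
        return G_inv(0.0)
    a = 1.0 + lam
    # Root of lam * v**2 - (1 + lam) * v + u = 0 lying in [0, 1], in stable form.
    v = 2.0 * u / (a + math.sqrt(a * a - 4.0 * lam * u))
    return G_inv(v)


def sample_tg(n, lam, G_inv, rng=None):
    """Draw n values by inverse transform sampling (steps 1 to 3 of the algorithm)."""
    rng = rng or random.Random()
    return [tg_quantile(rng.random(), lam, G_inv) for _ in range(n)]


# Hypothetical baseline for illustration: exponential with rate 1.5.
RATE = 1.5


def exp_inv(v):
    return -math.log1p(-v) / RATE


if __name__ == "__main__":
    draws = sample_tg(5, 0.5, exp_inv, random.Random(0))
    print(draws)
```

Any baseline with an available inverse cdf can be plugged in through `G_inv`.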

Table 1 gives and the corresponding parameters for some special distributions.


Distribution

Uniform ()
Exponential ()
Weibull ()
Fréchet ()
Half-logistic ()
Power function ()
Pareto ()
Burr XII ()
Logistic ()
Log-logistic ()
Lomax ()
Gumbel ()
Kumaraswamy ()
Normal ()

5. Moments and Generating Function

Many of the important characteristics and features of a distribution are determined through the ordinary moments. The th ordinary moment of is obtained from Theorem 1 aswhere for . Some moments obtained from (13) are reported in Table 2.


Distribution               Reference
Transmuted Weibull         Aryal and Tsokos [3]
Transmuted Lindley         Merovci [14]
Transmuted Fréchet         Mahmoud and Mandouh [15]
Transmuted log-logistic    Aryal [4]
Transmuted Pareto          Merovci and Puka [16]

The central moments () and cumulants () of follow from (13) asrespectively, where . Further, the skewness and kurtosis are obtained from the third and fourth standardized cumulants and , respectively.
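As a worked check of how moments follow from the mixture representation, consider a transmuted-exponential model with a hypothetical rate parameter `rate`. Under the assumption $f = (1+\lambda)g - \lambda h_2$, the exponential moments $r!/\text{rate}^r$ and the Exp-G (power 2) moments $(r!/\text{rate}^r)(2 - 2^{-r})$ combine to $E(X^r) = (r!/\text{rate}^r)(1 - \lambda + \lambda 2^{-r})$. This closed form is derived here for illustration and is compared with numerical integration; it is not quoted from the paper's Table 2.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def te_pdf(x, lam, rate):
    """Transmuted-exponential density g * (1 + lam - 2 * lam * G)."""
    u = 1.0 - math.exp(-rate * x)
    return rate * math.exp(-rate * x) * (1.0 + lam - 2.0 * lam * u)


def te_moment_closed(r, lam, rate):
    """Closed form r! / rate**r * (1 - lam + lam * 2**(-r))."""
    return math.factorial(r) / rate ** r * (1.0 - lam + lam * 2.0 ** (-r))


def te_moment_numeric(r, lam, rate):
    """Same moment by numerical integration; the tail beyond 50 / rate is negligible."""
    return simpson(lambda x: x ** r * te_pdf(x, lam, rate), 0.0, 50.0 / rate)


def te_variance(lam, rate):
    """Second central moment from the first two ordinary moments."""
    m1 = te_moment_closed(1, lam, rate)
    return te_moment_closed(2, lam, rate) - m1 * m1


if __name__ == "__main__":
    for r in range(1, 5):
        print(r, te_moment_closed(r, 0.5, 2.0), te_moment_numeric(r, 0.5, 2.0))
```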

The moment generating function (mgf) of , say , can be expressed from Theorem 1 as where for . The integrals and can be evaluated numerically for most parent distributions.

Three closed forms for (for ) follow by selecting from Table 1 the exponential (with parameter ), standard logistic, and Fréchet as baseline distributions, for which (for ), (for ), and , respectively.

The characteristic function (chf) has many useful and important properties and plays a central role in statistical theory. It is particularly useful in analysis of linear combination of independent random variables. Clearly, a simple representation for the chf of , where , is given by From expansions and , we obtain

6. Mean Deviations

The th incomplete moment of , say , is expressed aswhereThe integral can be determined analytically for some special models with closed form expressions for or evaluated at least numerically for most baseline distributions. It can also be obtained for several baseline distributions using power series methods. These methods are at the heart of many aspects of applied mathematics and statistics. If this function does not have a closed form expression, it can be expressed as a power series:where coefficients are suitably chosen real numbers. For some important distributions, such as the normal, Student , gamma, and beta distributions, does not have closed form but it can be expanded as in (20). For example, for the standard normal distribution, coefficients s are given by where (for ) and (for ), and the s are determined recursively from Then, , , , and

We consider a result by Gradshteyn and Ryzhik [17] for a power series raised to a positive integer :where coefficients (for ) are determined from the recurrence equation and . Coefficient can be obtained from quantities in any analytical or numerical software. Hence, quantity (for ) in (19) is given by

An important application of the first incomplete moment of in (18) is related to the Bonferroni and Lorenz curves. These curves are very useful in economics, reliability, demography, insurance, and medicine. For a given probability , they are given by and , where comes from (12).

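The magnitude of dispersion in a population can be measured by the totality of deviations from the mean and median. Another application refers to the mean deviations about the mean, $\delta_1 = E(|X - \mu'_1|)$, and about the median, $\delta_2 = E(|X - M|)$, of $X$ given by

$$\delta_1 = 2\mu'_1 F(\mu'_1) - 2 m_1(\mu'_1) \quad \text{and} \quad \delta_2 = \mu'_1 - 2 m_1(M),$$

respectively, where $M = Q(1/2)$ is the median of $X$, $\mu'_1 = E(X)$ is determined from (13), $F(\mu'_1)$ is easily evaluated from (2), and $m_1(z) = \int_{-\infty}^{z} x f(x)\,dx$ is the first incomplete moment of $X$, obtained from (19) with order one.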

Next, we provide two applications of (19) by taking for the baseline model the exponential (with parameter ) and standard logistic distributions listed in Table 1. By using the generalized binomial expansion, we obtain (for and ) for the transmuted-exponential (TE) (with parameter ) and transmuted-standard logistic (TSL) as respectively.

7. Special Transmuted Models

The pdf and cdf of in (1) and (2) will be most tractable when and have simple analytic expressions. In this section, we present two special models.

7.1. The Transmuted Burr XII (TBXII) Distribution

We consider the parent Burr XII distribution, where the pdf and cdf (for ) are , , and , respectively. Then, the TBXII density function is given by where . The corresponding pdf is given by

The TBXII distribution includes an important special case when : the transmuted-log-logistic [4] distribution. Further, we obtain the transmuted Lomax [18] distribution when . Some plots of the TBXII density function are displayed in Figure 1.

The th ordinary moment of the TBXII model can be obtained from (13) as where is the beta function.
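The TBXII construction can be sketched as follows. The parameterization is an assumption made here: a Burr XII baseline with cdf $G(x) = 1 - [1 + (x/s)^c]^{-k}$ for $x > 0$, with hypothetical parameter names `c`, `k`, and `s`. Under that assumption, `k = 1` gives a transmuted log-logistic model and `c = 1` a transmuted Lomax model.

```python
def bxii_cdf(x, c, k, s):
    """Assumed Burr XII cdf: 1 - (1 + (x / s)**c)**(-k), x > 0."""
    return 1.0 - (1.0 + (x / s) ** c) ** (-k)


def bxii_pdf(x, c, k, s):
    """Derivative of the assumed Burr XII cdf."""
    t = (x / s) ** c
    return (c * k / s) * (x / s) ** (c - 1.0) * (1.0 + t) ** (-k - 1.0)


def tbxii_cdf(x, lam, c, k, s):
    """Transmuted Burr XII cdf, (1 + lam) * G - lam * G**2."""
    u = bxii_cdf(x, c, k, s)
    return (1.0 + lam) * u - lam * u * u


def tbxii_pdf(x, lam, c, k, s):
    """Transmuted Burr XII pdf, g * (1 + lam - 2 * lam * G)."""
    u = bxii_cdf(x, c, k, s)
    return bxii_pdf(x, c, k, s) * (1.0 + lam - 2.0 * lam * u)


def log_logistic_cdf(x, c, s):
    """Log-logistic cdf, the k = 1 sub-model of the assumed Burr XII."""
    t = (x / s) ** c
    return t / (1.0 + t)


if __name__ == "__main__":
    print(tbxii_cdf(1.2, 0.4, 2.0, 3.0, 1.5), tbxii_pdf(1.2, 0.4, 2.0, 3.0, 1.5))
```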

7.2. The Transmuted Kumaraswamy (TKw) Distribution

The baseline Kumaraswamy (Kw) distribution has pdf and cdf, for and , given by and , respectively. Then, the TKw cdf is given by where . The corresponding pdf is given by

Ahmad et al. [19] proposed the transmuted Kumaraswamy (TKw) distribution as an extension of the Kw distribution and obtained the density and cumulative functions. However, they did not investigate an application to real data and explore the qf. In Figure 2, we plot the TKw density function for some parameter values.

The th moment of TKw can be obtained from (13) as and the qf is given by
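A hedged sketch of the TKw cdf and qf follows. It assumes the standard Kumaraswamy baseline $G(x) = 1 - (1 - x^a)^b$ on $(0,1)$ with hypothetical shape names `a` and `b`, and it inverts $F = (1+\lambda)G - \lambda G^2$ through the root $v = 2u / \left(1 + \lambda + \sqrt{(1+\lambda)^2 - 4\lambda u}\right)$, an algebraic rearrangement chosen here for numerical stability.

```python
import math


def kw_cdf(x, a, b):
    """Kumaraswamy cdf on (0, 1): 1 - (1 - x**a)**b."""
    return 1.0 - (1.0 - x ** a) ** b


def kw_inv(v, a, b):
    """Inverse of the Kumaraswamy cdf."""
    return (1.0 - (1.0 - v) ** (1.0 / b)) ** (1.0 / a)


def tkw_cdf(x, lam, a, b):
    """Transmuted Kumaraswamy cdf, (1 + lam) * G - lam * G**2."""
    u = kw_cdf(x, a, b)
    return (1.0 + lam) * u - lam * u * u


def tkw_quantile(u, lam, a, b):
    """Quantile at probability u in [0, 1) for |lam| <= 1."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    if abs(lam) > 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    if u == 0.0:
        return 0.0
    s = 1.0 + lam
    v = 2.0 * u / (s + math.sqrt(s * s - 4.0 * lam * u))
    return kw_inv(v, a, b)


if __name__ == "__main__":
    for u in (0.25, 0.5, 0.75):
        print(u, tkw_quantile(u, -0.6, 2.0, 3.0))
```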

8. Extreme Values

If denotes the sample mean from iid random variables following (2), then by the standard central limit theorem converges in distribution to the standard normal as under suitable conditions. However, one might be interested in the asymptotics of the extreme values and . We consider the following:

(i) Suppose that belongs to the max. domain of attraction of the Gumbel extreme value distribution. Then, by Leadbetter et al. [20], there must be a strictly positive function, say , such that for every . In our case, we have for every . Hence, it follows by Leadbetter et al. [20] that also belongs to the max. domain of attraction of the Gumbel extreme value distribution with for some suitable norming constants and .

(ii) Again, suppose that belongs to the max. domain of attraction of the Fréchet extreme value distribution. By Leadbetter et al. [20], there must exist a , such that for every . In our case, for every . Hence, it follows by Leadbetter et al. [20] that also belongs to the max. domain of attraction of the Fréchet extreme value distribution with for some suitable norming constants and .

(iii) Also, suppose that belongs to the max. domain of attraction of the Weibull extreme value distribution. By Leadbetter et al. [20], there must be an such that for every . In our case, we have for every . Hence, it follows by Leadbetter et al. [20, Chapter 1] that also belongs to the max. domain of attraction of the Weibull extreme value distribution with for some suitable norming constants and .

Similar arguments apply to min. domains of attraction. That is, belongs to the same min. domain of attraction as that of .

9. Probability Weighted Moments

The PWMs of a baseline model can be very useful to determine the moments of more complex distributions. Distributions that can be expressed in inverse form may present problems in estimating their parameters as functions of ordinary moments. For these distributions, the relations between the PWMs and the parameters have simpler analytical structure than those between the ordinary moments and the parameters. The PWMs are also widely used for estimating parameters of distributions from complete or censored samples.

We demonstrate that the th PWM of , say (for ), can be expressed as linear combinations of the baseline PWMs defined by . First, we can write from (20) and (23) by interchanging the sum and the integral

Further, we have and then using the binomial expansion, we can express the PWMs of as where and are obtained from (44).

10. Order Statistics

Order statistics are required in many fields, such as climatology, engineering, and industry. Further, they play an important role in Statistical Inference and Nonparametric Statistics. In this section, we present some results with respect to the order statistics. We obtain an expression for the density of the th order statistic and the large sample distribution of the minimum and maximum when a random sample of size is drawn from the family. The density function of the th order statistic, say , from a random sample of size drawn from (5) is given by

The th order moment of is obtained from (20) as where can be evaluated numerically for most parent distributions using statistical software.
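For a numerical illustration, the sketch below uses the standard density of the $i$th order statistic from a sample of size $n$, $f_{i:n}(x) = \frac{n!}{(i-1)!\,(n-i)!}\, f(x)\, F(x)^{i-1}\, [1 - F(x)]^{n-i}$, with $f$ and $F$ taken as the transmuted pdf and cdf. The unit-rate exponential baseline and the function names are assumptions for this example only.

```python
import math

RATE = 1.0  # hypothetical exponential baseline rate


def tg_cdf(x, lam):
    u = 1.0 - math.exp(-RATE * x)
    return (1.0 + lam) * u - lam * u * u


def tg_pdf(x, lam):
    u = 1.0 - math.exp(-RATE * x)
    return RATE * math.exp(-RATE * x) * (1.0 + lam - 2.0 * lam * u)


def order_stat_pdf(x, i, n, lam):
    """Density of the i-th order statistic in a sample of size n."""
    coef = i * math.comb(n, i)  # equals n! / ((i - 1)! * (n - i)!)
    F = tg_cdf(x, lam)
    return coef * tg_pdf(x, lam) * F ** (i - 1) * (1.0 - F) ** (n - i)


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def order_stat_moment(r, i, n, lam, upper=50.0):
    """r-th moment of the i-th order statistic by numerical integration."""
    return simpson(lambda x: x ** r * order_stat_pdf(x, i, n, lam), 0.0, upper)


if __name__ == "__main__":
    print([order_stat_moment(1, i, 5, 0.4) for i in range(1, 6)])
```

The printed means increase with `i`, as expected for ordered observations.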

11. Information Theory

11.1. Entropies

An entropy is a measure of variation or uncertainty of a random variable $X$. Two well-known entropy measures are the Rényi and Shannon entropies. The Rényi entropy of a random variable with pdf $f(x)$ is defined by

$$I_R(\gamma) = \frac{1}{1-\gamma}\,\log\left(\int_{-\infty}^{\infty} f^{\gamma}(x)\,dx\right)$$

for $\gamma > 0$ and $\gamma \neq 1$. The Rényi entropy for the TG family should be evaluated numerically.
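A numerical evaluation of the Rényi entropy might look like the following sketch, which applies the standard definition $(1-\gamma)^{-1} \log \int f^{\gamma}(x)\,dx$ to a transmuted-exponential density. The baseline, the truncation point of the integral, and the function names are assumptions made for illustration. For $\lambda = 0$ the result can be compared with the known exponential value $-\log(\text{rate}) + \log(\gamma)/(\gamma - 1)$.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def te_pdf(x, lam, rate):
    """Transmuted-exponential density g * (1 + lam - 2 * lam * G)."""
    u = 1.0 - math.exp(-rate * x)
    return rate * math.exp(-rate * x) * (1.0 + lam - 2.0 * lam * u)


def renyi_entropy(gamma, lam, rate):
    """Renyi entropy of order gamma (gamma > 0, gamma != 1) by numerical integration."""
    if gamma <= 0.0 or gamma == 1.0:
        raise ValueError("gamma must be positive and different from 1")
    # f**gamma decays like exp(-gamma * rate * x); truncate where it is negligible.
    upper = 60.0 / (rate * min(gamma, 1.0))
    integral = simpson(lambda x: te_pdf(x, lam, rate) ** gamma, 0.0, upper)
    return math.log(integral) / (1.0 - gamma)


if __name__ == "__main__":
    for lam in (-0.5, 0.0, 0.5):
        print(lam, renyi_entropy(2.0, lam, 2.0))
```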

The Shannon entropy of a random variable is defined by . It is the special case of the Rényi entropy when . For the proposed model in (1), the Shannon entropy reduces towhere

By substituting the last two expressions in (50), the Shannon entropy becomes

Next, we consider the Rényi entropy. From (1), we have which can be computed numerically.

11.2. Kullback-Leibler (KL) Divergence

Consider two distributions from the same family (but with different parametric configuration). To be more specific, let and . Then, the KL divergence measure between and , say , is given by where

Special Case. If , then reduces to
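As an illustration of how the KL divergence can be computed, the sketch below treats two transmuted models that share the same baseline and differ only in $\lambda$ (an assumption made here, with $|\lambda| < 1$ so that both densities are strictly positive). In that setting the baseline density cancels in the ratio $f_1/f_2$, and the substitution $u = G(x)$ gives $\int_0^1 w_1(u) \log[w_1(u)/w_2(u)]\,du$ with $w_j(u) = 1 + \lambda_j - 2\lambda_j u$. This reduction is derived here for the example and is checked against direct integration for an exponential baseline.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def weight(u, lam):
    """Transmutation weight 1 + lam - 2 * lam * u, for u = G(x)."""
    return 1.0 + lam - 2.0 * lam * u


def kl_same_baseline(lam1, lam2):
    """KL divergence between two TG models with a common baseline, |lam| < 1."""
    if abs(lam1) >= 1.0 or abs(lam2) >= 1.0:
        raise ValueError("this sketch requires |lam| < 1")
    return simpson(
        lambda u: weight(u, lam1) * math.log(weight(u, lam1) / weight(u, lam2)),
        0.0,
        1.0,
    )


def kl_direct_exponential(lam1, lam2, rate=1.0):
    """Same divergence by integrating over x with an exponential baseline."""
    def integrand(x):
        u = 1.0 - math.exp(-rate * x)
        f1 = rate * math.exp(-rate * x) * weight(u, lam1)
        return f1 * math.log(weight(u, lam1) / weight(u, lam2))
    return simpson(integrand, 0.0, 50.0 / rate)


if __name__ == "__main__":
    print(kl_same_baseline(0.5, -0.5), kl_direct_exponential(0.5, -0.5))
```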