Abstract

We study some mathematical properties of a new generator of continuous distributions with two extra parameters called the exponentiated half-logistic family. We present some special models. We investigate the shapes of the density and hazard rate function. We derive explicit expressions for the ordinary and incomplete moments, quantile and generating functions, probability weighted moments, Bonferroni and Lorenz curves, Shannon and Rényi entropies, and order statistics, which hold for any baseline model. We introduce two bivariate extensions of this family. We discuss the estimation of the model parameters by maximum likelihood and demonstrate the potentiality of the new family by means of two real data sets.

1. Introduction

The use of new generators of continuous distributions from classic distributions has become very common in recent years. One example is the beta-generated family of distributions proposed by Eugene et al. [4]. Another example is the gamma-generated family of distributions defined by Zografos and Balakrishnan [5]. Based on a baseline continuous distribution with survival function $\bar{G}(x) = 1 - G(x)$ and density $g(x)$, their family is defined by the cumulative distribution function (cdf) and probability density function (pdf) (for $\delta > 0$)
$$F(x) = \frac{1}{\Gamma(\delta)} \int_0^{-\log \bar{G}(x)} t^{\delta-1} e^{-t}\,dt, \qquad f(x) = \frac{1}{\Gamma(\delta)} \{-\log \bar{G}(x)\}^{\delta-1} g(x), \tag{1}$$
respectively, where $\Gamma(\cdot)$ is the gamma function.

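Based on Zografos and Balakrishnan’s [5] paper, we replace the gamma distribution by the exponentiated half-logistic (“EHL” for short) distribution to define a new family of continuous distributions by the cdf
$$F(x;\alpha,\lambda,\xi) = \int_0^{-\log[1-G(x;\xi)]} \frac{2\alpha\lambda\, e^{-\lambda t}\,(1-e^{-\lambda t})^{\alpha-1}}{(1+e^{-\lambda t})^{\alpha+1}}\,dt = \left\{\frac{1-[1-G(x;\xi)]^{\lambda}}{1+[1-G(x;\xi)]^{\lambda}}\right\}^{\alpha}, \tag{2}$$
where $G(x;\xi)$ is the baseline cdf depending on a parameter vector $\xi$ and $\alpha > 0$ and $\lambda > 0$ are two additional shape parameters. For any continuous baseline distribution $G$, the EHL-$G$ distribution is defined by the cdf (2). Equation (2) defines a wide family of continuous distributions and includes special models such as those listed in Table 1.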

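The density function corresponding to (2) is given by
$$f(x;\alpha,\lambda,\xi) = \frac{2\alpha\lambda\, g(x;\xi)\,[1-G(x;\xi)]^{\lambda-1}\,\{1-[1-G(x;\xi)]^{\lambda}\}^{\alpha-1}}{\{1+[1-G(x;\xi)]^{\lambda}\}^{\alpha+1}}, \tag{3}$$
where $g(x;\xi)$ is the baseline pdf. Equation (3) will be most tractable when $G(x)$ and $g(x)$ have simple analytic expressions. Hereafter, a random variable $X$ with density function (3) is denoted by $X \sim \text{EHL-}G(\alpha,\lambda,\xi)$. Further, we sometimes omit the dependence on the parameter vector $\xi$ and simply write $G(x) = G(x;\xi)$.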
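As a minimal sketch of the cdf and pdf of the family, the following Python code evaluates both for a user-supplied baseline. It assumes the closed form $F(x) = \{(1-[1-G(x)]^{\lambda})/(1+[1-G(x)]^{\lambda})\}^{\alpha}$ and its derivative with respect to $x$. The function names, the unit-exponential baseline, and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def ehl_cdf(x, alpha, lam, G):
    """EHL-G cdf: {(1 - s) / (1 + s)}^alpha with s = [1 - G(x)]^lam."""
    s = (1.0 - G(x)) ** lam
    return ((1.0 - s) / (1.0 + s)) ** alpha

def ehl_pdf(x, alpha, lam, G, g):
    """EHL-G pdf, i.e. the derivative of ehl_cdf with respect to x."""
    surv = 1.0 - G(x)
    s = surv ** lam
    return (2.0 * alpha * lam * g(x) * surv ** (lam - 1.0)
            * (1.0 - s) ** (alpha - 1.0) / (1.0 + s) ** (alpha + 1.0))

# Hypothetical baseline used only for illustration: unit exponential.
def G_exp(x):
    return 1.0 - math.exp(-x)

def g_exp(x):
    return math.exp(-x)

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

alpha, lam = 2.0, 1.5  # arbitrary illustrative values
area = trapezoid(lambda x: ehl_pdf(x, alpha, lam, G_exp, g_exp), 0.0, 40.0)
```

Here `area` should be numerically close to one, which is a quick sanity check that the density and the cdf are consistent with each other.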

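A physical interpretation of the EHL-$G$ distribution can be given as follows when $\alpha$ is a positive integer. Consider a system formed by $\alpha$ independent components having the half-logistic-$G$ (“HL-$G$”) cdf given by
$$H(x) = \frac{1-[1-G(x)]^{\lambda}}{1+[1-G(x)]^{\lambda}}. \tag{4}$$
Suppose that the system fails if all of the $\alpha$ components fail and let $X$ denote the lifetime of the entire system. Then, the cdf of $X$ is $H(x)^{\alpha}$, and so the pdf of $X$ is given by (3).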

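The hazard rate function (hrf) of $X$ becomes
$$h(x) = \frac{2\alpha\lambda\, g(x)\,[1-G(x)]^{\lambda-1}\,\{1-[1-G(x)]^{\lambda}\}^{\alpha-1}}{\{1+[1-G(x)]^{\lambda}\}\left(\{1+[1-G(x)]^{\lambda}\}^{\alpha}-\{1-[1-G(x)]^{\lambda}\}^{\alpha}\right)}. \tag{5}$$
The EHL family of distributions is easily simulated by inverting (2) as follows: if $U$ has a uniform $U(0,1)$ distribution, the solution of the nonlinear equation
$$x = Q_G\!\left(1-\left[\frac{1-U^{1/\alpha}}{1+U^{1/\alpha}}\right]^{1/\lambda}\right), \tag{6}$$
where $Q_G(\cdot) = G^{-1}(\cdot)$ is the baseline quantile function, has the density function (3).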
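The inversion method described above can be sketched as follows. The code assumes that solving $F(x) = u$ gives $x = Q_G\big(1 - [(1-u^{1/\alpha})/(1+u^{1/\alpha})]^{1/\lambda}\big)$, with $Q_G$ the baseline quantile function. The unit-exponential baseline, the seed, and all names are hypothetical choices made only for illustration.

```python
import math
import random

def ehl_quantile(u, alpha, lam, baseline_quantile):
    """Solve F(x) = u for the EHL-G cdf, given the baseline quantile function."""
    t = u ** (1.0 / alpha)
    p = 1.0 - ((1.0 - t) / (1.0 + t)) ** (1.0 / lam)
    return baseline_quantile(p)

def ehl_cdf(x, alpha, lam, G):
    """EHL-G cdf, used here only to check the inversion."""
    s = (1.0 - G(x)) ** lam
    return ((1.0 - s) / (1.0 + s)) ** alpha

# Hypothetical baseline: unit exponential cdf and quantile function.
def G_exp(x):
    return 1.0 - math.exp(-x)

def Q_exp(p):
    return -math.log1p(-p)

rng = random.Random(2024)  # fixed seed so the run is reproducible
sample = [ehl_quantile(rng.random(), 2.0, 1.5, Q_exp) for _ in range(1000)]

# About half of the simulated values should fall below the theoretical median.
median = ehl_quantile(0.5, 2.0, 1.5, Q_exp)
share_below = sum(x <= median for x in sample) / len(sample)
```

Any baseline with a computable quantile function can be substituted for `Q_exp`; when $Q_G$ has no closed form, a numerical root finder applied to $G(x) = p$ would play the same role.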

This paper is organized as follows. In Section 2, some special cases of the EHL family of distributions are defined. In Section 3, the shapes of the density and hazard rate functions are described analytically. A useful expansion for the new density family is obtained and we derive a power series for the EHL quantile function in Section 4. General explicit expressions for some special EHL moments are obtained in Section 5.

In Section 6, we derive the generating function, incomplete moments, mean deviations, reliability, and Rényi and Shannon entropies, and we determine the order statistics and their moments. We introduce two bivariate extensions of the new family in Section 7. Estimation of the model parameters by maximum likelihood is performed in Section 8. Applications to two real data sets illustrate the performance of the new family in Section 9. The paper is concluded in Section 10.

2. Special EHL-$G$ Models

Here, we introduce only three of the many distributions which can arise as EHL special models, where $\alpha$ and $\lambda$ are positive shape parameters of the new generator. We consider three baseline distributions, namely, the Fréchet, log-logistic, and generalized half-normal distributions, although we can generate as many new distributions as desired.

2.1. Exponentiated Half-Logistic-Fréchet (EHLF) Model

The Fréchet (or type II extreme value) distribution has been useful for modeling of market-returns which are often heavy-tailed in applications to finance [6]. Now, we introduce a new four-parameter distribution called the EHLF distribution. Taking to be the Fréchet distribution with scale parameter and shape parameter , where , the EHLF density function (for ) is given by

The cdf and hrf corresponding to (7) are given by respectively. A characteristic of the EHLF distribution is that its hrf can be monotonically increasing, monotonically decreasing, or upside-down bathtub shaped, depending on the parameter values. Plots of its density function and hrf for some parameter values are displayed in Figures 1 and 2, respectively.

2.2. Exponentiated Half-Logistic-Log-Logistic (EHLLL) Model

The log-logistic (LL) distribution is widely used in practice and it is an alternative to the log-normal distribution since it presents a failure rate function that increases, reaches a peak after some finite period, and then declines gradually. The properties of the LL distribution make it an attractive alternative to the log-normal and Weibull distributions in the analysis of survival data [7]. This distribution can exhibit a monotonically decreasing failure rate function for some parameter values. For , let be the LL cdf, where is the shape parameter and is the scale parameter, where . The EHLLL density function becomes

In Figure 3, we display some possible shapes of the EHLLL density function. The corresponding cdf and hrf are given by respectively. Plots of the EHLLL hrf for some parameter values are displayed in Figure 4.

2.3. Exponentiated Half-Logistic Generalized Half-Normal (EHLGHN) Model

The most popular models used to describe the lifetime process under fatigue are the half-normal (HN) and Birnbaum-Saunders (BS) distributions. When modeling monotone hazard rates, the HN and BS distributions may be an initial choice because of their negatively and positively skewed density shapes. Consider to be the generalized half-normal (GHN) distribution [8] with scale parameter and shape parameter , where , given by , where is the error function. Note that Then, the four-parameter EHLGHN density (for ) can be expressed as

If , the EHLGHN distribution reduces to the exponentiated half-logistic half-normal (EHLHN) distribution. The cdf and hrf corresponding to (12) are respectively. A characteristic of the EHLGHN distribution is that its hrf can be bathtub shaped, monotonically increasing, monotonically decreasing, or upside-down bathtub shaped, depending on the parameter values. Plots of the EHLGHN density function and hrf for some parameter values are displayed in Figures 5 and 6, respectively.

3. Shapes

The shapes of the density and hazard rate functions can be described analytically. The critical points of the EHL-$G$ density function are the roots of the equation: There may be more than one root to (14). Let . We have If is a root of (14), then it corresponds to a local maximum if for all and for all . It corresponds to a local minimum if for all and for all . It refers to a point of inflexion if either for all or for all .
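The classification of critical points above can be sketched numerically. The code below is not the paper's equation (14); it assumes the density form $f(x) \propto g(x)[1-G(x)]^{\lambda-1}\{1-[1-G(x)]^{\lambda}\}^{\alpha-1}/\{1+[1-G(x)]^{\lambda}\}^{\alpha+1}$, approximates $d\log f(x)/dx$ by a central difference, brackets sign changes on a grid, and refines each bracket by bisection. A change from positive to negative slope is labelled a maximum and the reverse a minimum. The baseline, grid, and names are illustrative assumptions.

```python
import math

def ehl_logpdf(x, alpha, lam, G, g):
    """Log of the assumed EHL-G density for a baseline cdf G and pdf g."""
    surv = 1.0 - G(x)
    s = surv ** lam
    return (math.log(2.0 * alpha * lam) + math.log(g(x))
            + (lam - 1.0) * math.log(surv)
            + (alpha - 1.0) * math.log(1.0 - s)
            - (alpha + 1.0) * math.log(1.0 + s))

def critical_points(logf, lo, hi, n=2000, h=1e-6):
    """Locate sign changes of d/dx log f on [lo, hi] and classify them."""
    def slope(x):
        return (logf(x + h) - logf(x - h)) / (2.0 * h)

    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    found = []
    for a, b in zip(grid, grid[1:]):
        sa, sb = slope(a), slope(b)
        if sa * sb < 0.0:
            left, right = a, b
            for _ in range(60):  # plain bisection on the bracketing interval
                mid = 0.5 * (left + right)
                if slope(left) * slope(mid) <= 0.0:
                    right = mid
                else:
                    left = mid
            kind = "maximum" if sa > 0.0 else "minimum"
            found.append((0.5 * (left + right), kind))
    return found

# Hypothetical baseline: unit exponential.
def G_exp(x):
    return 1.0 - math.exp(-x)

def g_exp(x):
    return math.exp(-x)

points = critical_points(lambda x: ehl_logpdf(x, 2.0, 1.5, G_exp, g_exp), 0.01, 10.0)
```

For this particular baseline and these parameter values, setting the derivative of the log-density to zero reduces to $s^2 - 4s + 1 = 0$ with $s = e^{-\lambda x}$, so the single mode sits at $x = -\log(2-\sqrt{3})/\lambda$, which gives an independent check on the numerical search.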

The critical point of the hrf of , say , is obtained from the following equation:

There may be more than one root to (16). Let . We have If is a root of (16), then it refers to a local maximum if for all and for all . It corresponds to a local minimum if for all and for all . It gives an inflexion point if either for all or for all .

4. A Useful Expansion and Quantile Power Series

We can demonstrate that the cdf of given by (2) admits the following expansion: where denotes the exponentiated-$G$ (“exp-$G$”) cumulative distribution with power parameter . Some structural properties of the exp-$G$ distributions are investigated by Mudholkar et al. [9], Gupta and Kundu [10], and Nadarajah and Kotz [11], among others.

The density function of can be expressed as an infinite linear combination of exp-$G$ density functions: where denotes the density function of the exp-$G$ random variable with power parameter . Equation (20) reveals that the EHL-$G$ density function is a linear combination of exp-$G$ density functions. Thus, some mathematical properties of the new family can be obtained directly from those properties of the exp-$G$ distribution.

Here, we derive a power series expansion for the quantile function of by expanding (6). If the baseline quantile function, say $Q_G(u)$, does not have a closed-form expression, it can usually be expressed in terms of a power series
$$Q_G(u) = \sum_{i=0}^{\infty} a_i\, u^i, \tag{21}$$
where the coefficients $a_i$ are suitably chosen real numbers which depend on the parameters of the $G$ distribution. For several important distributions, such as the normal, Student $t$, gamma, and beta distributions, $Q_G(u)$ does not have an explicit expression but it can be expanded as in (21). As a simple example, for the normal distribution, for and , , , and .

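We use throughout the paper a result of Gradshteyn and Ryzhik ([12], Section 0.314) for a power series raised to a positive integer $n$ (for $n \geq 1$):
$$\left(\sum_{i=0}^{\infty} a_i\, u^i\right)^{n} = \sum_{i=0}^{\infty} c_{n,i}\, u^i, \tag{22}$$
where the coefficients $c_{n,i}$ (for $i = 1, 2, \ldots$) are easily obtained from the recurrence equation (with $c_{n,0} = a_0^{n}$):
$$c_{n,i} = (i\, a_0)^{-1} \sum_{m=1}^{i} [m(n+1) - i]\, a_m\, c_{n,i-m}. \tag{23}$$
Clearly, $c_{n,i}$ can be determined from $c_{n,0}, \ldots, c_{n,i-1}$ and then from the quantities $a_0, \ldots, a_i$.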
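The recurrence for a power series raised to a positive integer power is straightforward to code. The sketch below assumes the standard Gradshteyn and Ryzhik form $c_{n,0} = a_0^{n}$ and $c_{n,i} = (i a_0)^{-1}\sum_{m=1}^{i}[m(n+1)-i]\,a_m\,c_{n,i-m}$, which requires $a_0 \neq 0$. The function name and the test polynomials are illustrative.

```python
def power_series_power(a, n, terms):
    """Return the first `terms` coefficients of (sum_i a[i] u^i)^n.

    Assumes a[0] != 0, as required by the recurrence. Coefficients of `a`
    beyond its length are treated as zero.
    """
    a = list(a) + [0.0] * max(0, terms - len(a))
    c = [a[0] ** n]
    for i in range(1, terms):
        total = sum((m * (n + 1) - i) * a[m] * c[i - m] for m in range(1, i + 1))
        c.append(total / (i * a[0]))
    return c

# (1 + u)^3 = 1 + 3u + 3u^2 + u^3
cube = power_series_power([1.0, 1.0], 3, 5)
# (1 + 2u + 3u^2)^2 = 1 + 4u + 10u^2 + 12u^3 + 9u^4
square = power_series_power([1.0, 2.0, 3.0], 2, 5)
```

Because each $c_{n,i}$ only needs the earlier coefficients, the cost of computing the first $k$ coefficients is of order $k^2$, which is negligible at the truncation levels of 20 or 30 terms mentioned later in this section.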

Next, we derive an expansion for the argument of in (6):

Using the generalized binomial expansion four times since , we can write and then where and for , , and Then, the quantile function of can be expressed from (6) as where for and For any baseline distribution, we can combine (21) with (28) to obtain and then using (22) and (23), we have where , , and, for , Equation (32) is the main result of this section since it allows us to obtain various mathematical quantities for the EHL family as shown in the next sections.

The formulae derived throughout the paper can be easily handled in most symbolic computation software platforms such as Maple, Mathematica, and MATLAB. These platforms currently have the ability to deal with analytic expressions of formidable size and complexity. Established explicit expressions to calculate statistical measures can be more efficient than computing them directly by numerical integration. The infinity limit in these sums can be substituted by a large positive integer such as 20 or 30 for most practical purposes.

5. Moments

Hereafter, we will assume that is the cdf of a random variable and that is the cdf of the random variable having density function (3). The moments of can be obtained from the th probability weighted moments (PWMs) of given by

An alternative expression for can be determined using (22) and (23): The PWMs for several distributions can be calculated from (34) and (35).

We can write from (20) Thus, the moments of any EHL- distribution can be expressed as an infinite weighted linear combination of the baseline PWMs. Equations (34)–(36) are the main results of this section.

Further, the central moments () and cumulants () of can be calculated as respectively, where . Then, , , , and so forth. The skewness and kurtosis quantities follow from the second, third, and fourth cumulants.
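As an illustration of how cumulants, skewness, and kurtosis follow from the ordinary moments, the sketch below uses the standard recursion $\kappa_s = \mu'_s - \sum_{k=1}^{s-1}\binom{s-1}{k-1}\kappa_k\,\mu'_{s-k}$ with $\kappa_1 = \mu'_1$. It is assumed here that the skewness is $\kappa_3/\kappa_2^{3/2}$ and that the kurtosis is the cumulant ratio $\kappa_4/\kappa_2^{2}$ (the excess kurtosis); the paper's exact convention is not recoverable from the text. The function names are hypothetical.

```python
from math import comb

def cumulants_from_raw(mu):
    """mu[s-1] holds the raw moment E[X^s]; returns [kappa_1, ..., kappa_S]."""
    kappa = []
    for s in range(1, len(mu) + 1):
        correction = sum(comb(s - 1, k - 1) * kappa[k - 1] * mu[s - k - 1]
                         for k in range(1, s))
        kappa.append(mu[s - 1] - correction)
    return kappa

def skewness_kurtosis(mu):
    """Skewness kappa_3 / kappa_2^1.5 and excess kurtosis kappa_4 / kappa_2^2."""
    k = cumulants_from_raw(mu[:4])
    return k[2] / k[1] ** 1.5, k[3] / k[1] ** 2

# Check on the unit exponential, whose raw moments are s! and cumulants (s-1)!.
kappa = cumulants_from_raw([1.0, 2.0, 6.0, 24.0])
skew, kurt = skewness_kurtosis([1.0, 2.0, 6.0, 24.0])
```

In practice the raw moments passed to `cumulants_from_raw` would come from the truncated series for the EHL moments or from numerical integration of the density.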

5.1. EHLF Model

Consider the Fréchet baseline cdf for and corresponding pdf discussed in Section 2.1. The EHLF density function can be written from (20) as where . This equation reveals that the EHLF density function can be expressed as an infinite mixture of Fréchet densities.

The th PWM of the Fréchet distribution becomes Setting , reduces to The integral converges absolutely for and then

Plots of the skewness and kurtosis for some choices of as functions of , for and , are displayed in Figure 7.

5.2. Exponentiated Half-Logistic Logistic (EHLLo) Model

For the EHLLo distribution, the baseline cumulative function is . Using a result from Prudnikov et al. ([13], Section  2.6.13, equation (4)), we can write from (34) (for ) the following: where is the beta function. The th moment of the EHLLo distribution comes from (36) as

5.3. Exponentiated Half-Logistic Gamma (EHLGa) Model

Using the power series expansion for the gamma cdf we obtain from (20) the following series expansion:

The EHLGa moments follow from (36) and the expression for given by

5.4. Exponentiated Half-Logistic Normal (EHLN) Model

The moments of can be obtained from the moments of using , and then we can work with the standard normal distribution. We can expand the EHLN cumulative function (18) (with and ) as From the series expansion for the error function we obtain a series expansion from (20) (with and ) given by The EHLN moments can be obtained from (36) and the PWMs given by Cordeiro and Nadarajah [14]. Plots of the skewness and kurtosis for some choices of as functions of , for , and , , are displayed for the EHLLL and EHLGHN distributions in Figures 8 and 9, respectively. These plots show that the skewness and kurtosis are very flexible.

6. Other Measures

In this section, we calculate the following measures: generating function, incomplete moments, mean deviations, reliability, entropies, and order statistics for the EHL-$G$ family.

6.1. Generating Function

Here, we provide two formulae for the moment generating function (mgf) of . A first formula for comes from (20) as where is the generating function of the exp-$G$ distribution with power parameter . Hence, can be determined from the exp-$G$ generating function.

A second formula for can be derived from (20) as where

We can derive the mgf’s of several EHL distributions directly from (50)-(51). For example, the mgf’s of the exponentiated half-logistic exponential (EHLE) (with parameter and ) and EHLLo (with ) distributions are given by respectively.

Clearly, two representations for the characteristic function (chf) of $X$ can be derived from (50)–(52) by $\phi(t) = M(it)$, where $i = \sqrt{-1}$.

6.2. Incomplete Moments

Incomplete moments of the income distribution form natural building blocks for measuring inequality. For example, the Lorenz and Bonferroni curves depend upon the incomplete moments of the income distribution. The th incomplete moment of is defined as . Here, we provide two formulae to calculate the incomplete moments of the EHL family. First, the th incomplete moment of can be expressed as

The integral in (54) can be computed at least numerically for most baseline distributions.

A second formula follows from (54) using (22) and (23). We can write where is given by (23).

The first incomplete moment can be used to obtain Bonferroni and Lorenz curves defined for a given probability by and , respectively, where is immediately calculated from the parent quantile function.
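A numerical sketch of the Lorenz and Bonferroni curves is given below. It assumes the usual definitions $L(\pi) = m_1(q)/\mu'_1$ and $B(\pi) = m_1(q)/(\pi\,\mu'_1)$, where $q$ is the quantile of order $\pi$, $m_1(\cdot)$ is the first incomplete moment, and $\mu'_1$ is the mean. The first incomplete moment is computed here by the trapezoidal rule rather than by the series of this section. The unit-exponential baseline, the restriction $\alpha \geq 1$ (so that the density is finite at zero), the truncation point `upper`, and all names are illustrative assumptions.

```python
import math

def ehl_pdf_exp(x, alpha, lam):
    """Assumed EHL-G pdf with a unit-exponential baseline (use alpha >= 1)."""
    s = math.exp(-lam * x)  # equals [1 - G(x)]^lam when 1 - G(x) = exp(-x)
    return 2.0 * alpha * lam * s * (1.0 - s) ** (alpha - 1.0) / (1.0 + s) ** (alpha + 1.0)

def ehl_quantile_exp(p, alpha, lam):
    """Quantile of order p for the same model, from inverting the cdf."""
    t = p ** (1.0 / alpha)
    return -math.log((1.0 - t) / (1.0 + t)) / lam

def first_incomplete_moment(z, alpha, lam, n=4000):
    """m_1(z) = integral of x f(x) over [0, z], composite trapezoidal rule."""
    h = z / n
    f = lambda x: x * ehl_pdf_exp(x, alpha, lam)
    return h * (0.5 * f(0.0) + sum(f(i * h) for i in range(1, n)) + 0.5 * f(z))

def lorenz_bonferroni(p, alpha, lam, upper=60.0):
    """Return (L(p), B(p)); the mean is approximated by m_1(upper)."""
    mean = first_incomplete_moment(upper, alpha, lam, n=40000)
    m1 = first_incomplete_moment(ehl_quantile_exp(p, alpha, lam), alpha, lam)
    return m1 / mean, m1 / (p * mean)

L_half, B_half = lorenz_bonferroni(0.5, 2.0, 1.5)
L_high, B_high = lorenz_bonferroni(0.9, 2.0, 1.5)
```

For a nonnegative random variable the Lorenz curve lies below the diagonal, so `L_half` should be smaller than 0.5 and `L_high` smaller than 0.9.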

6.3. Mean Deviations

The mean deviations about the mean () and about the median () of can be expressed as respectively, where is the median of , and come from (2) and (36), respectively, and is the first incomplete moment.

Now, we provide two alternative ways to compute and . A general equation for can be derived from (20) as where Equation (58) is the basic quantity to compute the mean deviations for the EHL distributions.

A second general formula for can be derived by setting in (57): where

Equations (55)–(59) are the main results of this section.
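Once the mean, the median, the cdf at the mean, and the first incomplete moment are available, the mean deviations are simple arithmetic. The sketch below assumes the standard identities $\delta_1 = 2\mu'_1 F(\mu'_1) - 2 m_1(\mu'_1)$ and $\delta_2 = \mu'_1 - 2 m_1(M)$, where $M$ is the median. It is checked on the unit exponential distribution, for which the mean deviations about the mean and the median are $2/e$ and $\log 2$. The function name is hypothetical.

```python
import math

def mean_deviations(mean, cdf_at_mean, m1_at_mean, m1_at_median):
    """Mean deviations about the mean (delta1) and about the median (delta2)."""
    delta1 = 2.0 * mean * cdf_at_mean - 2.0 * m1_at_mean
    delta2 = mean - 2.0 * m1_at_median
    return delta1, delta2

# Unit exponential check: m_1(z) = 1 - (1 + z) exp(-z), mean 1, median log 2.
def m1_exp(z):
    return 1.0 - (1.0 + z) * math.exp(-z)

d1, d2 = mean_deviations(1.0, 1.0 - math.exp(-1.0), m1_exp(1.0), m1_exp(math.log(2.0)))
```

For an EHL model the four inputs would be obtained from the cdf, the quantile function, and either of the two incomplete-moment formulae of Section 6.2.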

6.4. Reliability

Here, we derive the reliability when and are independent random variables with a positive support. Reliability has many applications, especially in engineering. Let denote the pdf of and let denote the cdf of . By expanding the binomial terms in and , we obtain where If , we obtain Further, if and , then .

6.5. Entropies

An entropy is a measure of variation or uncertainty of a random variable $X$. Two popular entropy measures are the Rényi and Shannon entropies. The Rényi entropy of a random variable with pdf $f(x)$ is defined (for $\gamma > 0$ and $\gamma \neq 1$) as
$$I_R(\gamma) = \frac{1}{1-\gamma}\,\log\left(\int_{-\infty}^{\infty} f^{\gamma}(x)\,dx\right).$$
The Shannon entropy of a random variable $X$ is given by $E\{-\log f(X)\}$, which is the limiting case of the Rényi entropy when $\gamma \to 1$. Direct calculation gives

After some algebraic manipulations, we obtain the following.

Proposition 1. Let be a random variable with pdf given by (3). Then,

The simplest formula for the entropy of becomes After some algebraic developments, we obtain an alternative expression for : where .
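The entropies can also be approximated by direct numerical integration, which gives a way to check the series expressions. The sketch below assumes the definitions $I_R(\gamma) = (1-\gamma)^{-1}\log\int f^{\gamma}(x)\,dx$ and $E\{-\log f(X)\}$, and uses the trapezoidal rule on a truncated range. The EHL density with a unit-exponential baseline, the integration limits, and the function names are illustrative assumptions.

```python
import math

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def renyi_entropy(pdf, gamma, a, b):
    """Renyi entropy of order gamma (gamma > 0, gamma != 1) on [a, b]."""
    return math.log(trapezoid(lambda x: pdf(x) ** gamma, a, b)) / (1.0 - gamma)

def shannon_entropy(pdf, a, b):
    """Shannon entropy; points where the density vanishes contribute zero."""
    def integrand(x):
        fx = pdf(x)
        return -fx * math.log(fx) if fx > 0.0 else 0.0
    return trapezoid(integrand, a, b)

def exp_pdf(x):
    return math.exp(-x)

def ehl_exp_pdf(x, alpha=2.0, lam=1.5):
    """Assumed EHL-G pdf with a unit-exponential baseline."""
    s = math.exp(-lam * x)
    return 2.0 * alpha * lam * s * (1.0 - s) ** (alpha - 1.0) / (1.0 + s) ** (alpha + 1.0)

renyi_exp = renyi_entropy(exp_pdf, 2.0, 0.0, 50.0)      # exact value: log 2
shannon_exp = shannon_entropy(exp_pdf, 0.0, 50.0)       # exact value: 1
renyi_ehl = renyi_entropy(ehl_exp_pdf, 0.999, 0.0, 50.0)
shannon_ehl = shannon_entropy(ehl_exp_pdf, 0.0, 50.0)
```

The last two quantities should nearly agree, reflecting the fact that the Rényi entropy approaches the Shannon entropy as $\gamma$ tends to one.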

6.6. Order Statistics

Order statistics make their appearance in many areas of statistical theory and practice. Suppose that is a random sample from the EHL-$G$ distribution. Let denote the th order statistic. From (18) and (20), the pdf of is given by where . Using (22) and (23), we can write where and Hence, where .

Equation (72) is the main result of this section. It reveals that the pdf of the EHL-$G$ order statistics is a linear combination of exp-$G$ density functions. So, several structural quantities of the EHL-$G$ order statistics, such as ordinary and incomplete moments, generating function, and mean deviations, can be obtained from the corresponding quantities of exp-$G$ distributions.
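Before any series expansion, the density of the $i$th order statistic in a sample of size $n$ has the standard closed form $f_{i:n}(x) = f(x)\,F(x)^{i-1}\,[1-F(x)]^{n-i}/B(i, n-i+1)$. The sketch below evaluates it directly, assuming the closed-form EHL cdf and pdf with a unit-exponential baseline. The names and parameter values are hypothetical.

```python
import math

def ehl_cdf_exp(x, alpha, lam):
    """Assumed EHL-G cdf with a unit-exponential baseline."""
    s = math.exp(-lam * x)
    return ((1.0 - s) / (1.0 + s)) ** alpha

def ehl_pdf_exp(x, alpha, lam):
    """Matching pdf (use alpha >= 1 so that it is finite at zero)."""
    s = math.exp(-lam * x)
    return 2.0 * alpha * lam * s * (1.0 - s) ** (alpha - 1.0) / (1.0 + s) ** (alpha + 1.0)

def order_stat_pdf(x, i, n, alpha, lam):
    """Density of the i-th order statistic in a sample of size n."""
    F = ehl_cdf_exp(x, alpha, lam)
    f = ehl_pdf_exp(x, alpha, lam)
    const = i * math.comb(n, i)  # equals 1 / B(i, n - i + 1)
    return const * f * F ** (i - 1) * (1.0 - F) ** (n - i)

def trapezoid(f, a, b, n=20000):
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + k * h) for k in range(1, n)) + 0.5 * f(b))

# The density of the second smallest of five observations should integrate to one.
area = trapezoid(lambda x: order_stat_pdf(x, 2, 5, 2.0, 1.5), 0.0, 40.0)
```

Evaluating this closed form on a grid is a convenient way to validate a truncated version of the linear-combination representation.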

7. Bivariate Extensions

In this section, we introduce two extensions of the proposed model. The first extension is based on the idea of [15]. Let , , and be independent random variables. Further, we define and . Then, the pdf of the bivariate random variable is given by where . The marginal cdf’s are Clearly, if we consider and , the pdf of is given by

The marginal pdf’s are given by

A second extension is given by where is a bivariate continuous distribution with marginal cdf’s and . The marginal cdf’s are given by

The joint pdf of is easily obtained by and then where

The marginal pdf’s are

The conditional cdf’s are

The conditional density functions reduce to

8. Estimation

We derive the maximum likelihood estimates (MLEs) of the parameters of the EHL-$G$ family. Let be a random sample of size from the random variable , where is a vector of unknown parameters of the baseline distribution . The log-likelihood function for can be expressed as Equation (84) can be maximized either directly, for example, using SAS (Proc NLMixed) or Ox (subroutine MaxBFGS) (see [16]), or by solving the nonlinear likelihood equations obtained by differentiating (84) and setting the score vector to zero. Initial estimates of the parameters and may be inferred from the estimates of . The components of the score vector are given by where are vectors.
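A minimal sketch of the log-likelihood is given below. It assumes the density form used for the family, so that $\ell = n\log(2\alpha\lambda) + \sum\log g(x_i) + (\lambda-1)\sum\log[1-G(x_i)] + (\alpha-1)\sum\log\{1-[1-G(x_i)]^{\lambda}\} - (\alpha+1)\sum\log\{1+[1-G(x_i)]^{\lambda}\}$. The data values are made up, the baseline is a unit exponential with no free parameters, and the crude grid search over $\alpha$ with $\lambda$ fixed only stands in for a proper optimizer such as those named above.

```python
import math

def ehl_loglik(data, alpha, lam, G, g):
    """Log-likelihood of the assumed EHL-G density for baseline cdf G, pdf g."""
    total = len(data) * math.log(2.0 * alpha * lam)
    for x in data:
        surv = 1.0 - G(x)
        s = surv ** lam
        total += (math.log(g(x)) + (lam - 1.0) * math.log(surv)
                  + (alpha - 1.0) * math.log(1.0 - s)
                  - (alpha + 1.0) * math.log(1.0 + s))
    return total

# Hypothetical baseline and made-up observations, for illustration only.
def G_exp(x):
    return 1.0 - math.exp(-x)

def g_exp(x):
    return math.exp(-x)

data = [0.3, 0.8, 1.1, 1.7, 2.4]

# Profile over alpha on a coarse grid with lambda fixed at 1.
grid = [0.5 + 0.05 * k for k in range(100)]
best_alpha = max(grid, key=lambda a: ehl_loglik(data, a, 1.0, G_exp, g_exp))

# With lambda fixed, the score equation for alpha has a closed-form root.
log_ratio = sum(math.log((1.0 - (1.0 - G_exp(x))) / (1.0 + (1.0 - G_exp(x)))) for x in data)
alpha_closed = -len(data) / log_ratio
```

With $\lambda$ held fixed, the log-likelihood is concave in $\alpha$ and the score equation gives $\hat\alpha = -n/\sum\log\{(1-s_i)/(1+s_i)\}$ with $s_i = [1-G(x_i)]^{\lambda}$, so the grid maximizer should lie within one grid step of `alpha_closed`.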

For interval estimation and hypothesis tests on the model parameters, we require the observed information matrix calculated numerically. Under conditions that are fulfilled for parameters in the interior of the parameter space but not on the boundary, is asymptotically normal , where is the expected information matrix. We can substitute by , that is, the observed information matrix evaluated at , and then the multivariate normal distribution can be used to construct approximate confidence regions for the model parameters.

We can compute the maximum values of the unrestricted and restricted log-likelihoods to construct likelihood ratio (LR) statistics for testing some special models of the EHL-$G$ distribution. For example, comparing the EHLGHN and EHLHN distributions is equivalent to testing versus and the LR statistic reduces to where , , , and are the MLEs under and , , and are the estimates under .
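The LR comparison of two nested models can be sketched as follows. The code assumes that the statistic is twice the difference between the maximized unrestricted and restricted log-likelihoods and that, under the null hypothesis with a single restriction, it is referred to a chi-square distribution with one degree of freedom. The log-likelihood values in the example are invented for illustration.

```python
import math

def lr_test_one_restriction(loglik_full, loglik_restricted):
    """LR statistic w and its asymptotic chi-square(1) p-value."""
    w = 2.0 * (loglik_full - loglik_restricted)
    # For one degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(max(w, 0.0) / 2.0))
    return w, p_value

# Invented maximized log-likelihoods for a full and a restricted fit.
w, p_value = lr_test_one_restriction(-100.0, -101.920729410347)
```

The restricted value was chosen so that `w` equals the 5% critical value of the chi-square distribution with one degree of freedom, about 3.8415, which makes the resulting p-value easy to check.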

9. Applications

In this section, the potential of the EHL-$G$ family is illustrated by means of two applications using well-known data sets. We demonstrate the flexibility and applicability of the proposed model. The reason for choosing these data is that they allow us to show how in different fields it is necessary to have positively skewed distributions with nonnegative support. These data sets present different degrees of skewness and kurtosis.

9.1. Application 1: Tubercle Data

The first data set corresponds to the survival times of guinea pigs injected with different doses of tubercle bacilli reported by Bjerkedal [17]. It is well known that guinea pigs have a high susceptibility to human tuberculosis, which is why they were used in that study. Here, we are primarily concerned with the animals in the same cage that are under the same regimen; the data includes observations. These data were also analyzed by Kundu et al. [18] using the Birnbaum-Saunders distribution.

An alternative approach for modeling these data can be provided by the Weibull and Birnbaum-Saunders (BS) distributions. There are various extensions of these lifetime distributions. For example, Famoye et al. [19] proposed the beta Weibull (BW) distribution and Cordeiro et al. [1] studied some mathematical properties of the BW distribution, which is a quite flexible model for analysing positive data. More recently, Cordeiro and Lemonte [20] proposed the $\beta$-Birnbaum-Saunders ($\beta$-BS) distribution for fatigue life modeling. They investigated various properties of the $\beta$-BS model including expansions for the moments, generating function, mean deviations, density function of the order statistics, and their moments. The BW and $\beta$-BS distributions are as follows.

(i) BW Distribution. The BW distribution [19] with four parameters , , , and has density function given by (for ) where is the beta function and is the gamma function. Here, and are two additional shape parameters to the Weibull distribution to govern skewness and kurtosis. For , we obtain the Weibull distribution.

(ii) $\beta$-BS Distribution. The $\beta$-BS density function (with four parameters , , , and ) is where , , and are shape parameters and is a scale parameter, , , , , and is the standard normal cumulative distribution. For , we obtain the BS distribution.

We fit the EHLF, EHLLL, EHLGHN, Fréchet, LL, GHN, BW, and $\beta$-BS distributions to the current data. In order to estimate the parameters , , and , we adopt the maximum likelihood estimation method discussed in Section 8. We use the MLEs of and applied to the corresponding wider models for the Weibull, LL, and GHN distributions as starting values for the iterative procedure. The computations were done using the NLMixed procedure in SAS. Table 2 lists the MLEs (and the corresponding standard errors in parentheses) of the model parameters and the values of the following statistics for the fitted models: AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and CAIC (Consistent Akaike Information Criterion). These results indicate that the EHLLL and LL models have the lowest AIC, BIC, and CAIC values, and therefore they could be chosen as the best models.
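The model-selection statistics can be computed from a maximized log-likelihood as sketched below. AIC and BIC follow their standard definitions. The text does not state which CAIC formula was used, so the code returns both Bozdogan's consistent AIC, $-2\ell + p(\log n + 1)$, and the small-sample corrected AIC, $-2\ell + 2pn/(n-p-1)$, which some papers also label CAIC; which of the two applies here is an open assumption. The input values are invented.

```python
import math

def information_criteria(loglik, p, n):
    """Selection criteria for a fit with p parameters and n observations."""
    aic = -2.0 * loglik + 2.0 * p
    bic = -2.0 * loglik + p * math.log(n)
    caic = -2.0 * loglik + p * (math.log(n) + 1.0)     # Bozdogan's consistent AIC
    aicc = -2.0 * loglik + 2.0 * p * n / (n - p - 1.0)  # small-sample corrected AIC
    return {"AIC": aic, "BIC": bic, "CAIC": caic, "AICc": aicc}

# Invented example: maximized log-likelihood -100, 4 parameters, 50 observations.
ic = information_criteria(-100.0, 4, 50)
```

Smaller values indicate a better trade-off between fit and complexity under each criterion, and only differences between models fitted to the same data are meaningful.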

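Now, we apply formal goodness-of-fit tests in order to verify which distribution fits the tubercle data better. In particular, we consider the Cramér-von Mises ($W^*$) and Anderson-Darling ($A^*$) statistics. The $W^*$ and $A^*$ statistics are described in detail in Chen and Balakrishnan [21]. In general, the smaller the values of these statistics, the better the fit to the data. Let $F(x;\theta)$ be the cdf, where the form of $F$ is known but $\theta$ (a $k$-dimensional parameter vector) is unknown. To obtain the statistics $W^*$ and $A^*$, we can proceed as follows: (i) compute $v_i = F(x_i;\hat\theta)$, where the $x_i$'s are in ascending order, and then $y_i = \Phi^{-1}(v_i)$, where $\Phi(\cdot)$ is the standard normal cdf and $\Phi^{-1}(\cdot)$ is its inverse; (ii) compute $u_i = \Phi\{(y_i-\bar{y})/s_y\}$, where $\bar{y} = n^{-1}\sum_{i=1}^{n} y_i$ and $s_y^2 = (n-1)^{-1}\sum_{i=1}^{n}(y_i-\bar{y})^2$; (iii) calculate $W^2 = \sum_{i=1}^{n}\{u_i-(2i-1)/(2n)\}^2 + 1/(12n)$ and $A^2 = -n - n^{-1}\sum_{i=1}^{n}\{(2i-1)\log u_i + (2n+1-2i)\log(1-u_i)\}$ and then $W^* = W^2(1+0.5/n)$ and $A^* = A^2(1+0.75/n+2.25/n^2)$ (see Table 2).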
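Steps (i) to (iii) translate almost line by line into code. The sketch below assumes the usual Chen and Balakrishnan recipe: transform the ordered data through the fitted cdf and the standard normal quantile function, standardize, map back through the normal cdf, and apply the small-sample corrections $W^* = W^2(1+0.5/n)$ and $A^* = A^2(1+0.75/n+2.25/n^2)$. The fitted cdf is passed in as a function; the exponential cdf and the artificial data in the example are placeholders, not the paper's data.

```python
import math
from statistics import NormalDist

def cvm_ad_statistics(data, fitted_cdf):
    """Corrected Cramer-von Mises (W*) and Anderson-Darling (A*) statistics.

    fitted_cdf must return values strictly between 0 and 1 on the data.
    """
    normal = NormalDist()
    x = sorted(data)
    n = len(x)
    # Step (i): v_i = F(x_i) and y_i = Phi^{-1}(v_i).
    y = [normal.inv_cdf(fitted_cdf(xi)) for xi in x]
    # Step (ii): standardize y and map back through the normal cdf.
    y_bar = sum(y) / n
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    u = [normal.cdf((yi - y_bar) / s_y) for yi in y]
    # Step (iii): raw statistics followed by the small-sample corrections.
    w2 = sum((ui - (2 * i - 1) / (2 * n)) ** 2 for i, ui in enumerate(u, 1)) + 1 / (12 * n)
    a2 = -n - sum((2 * i - 1) * math.log(ui) + (2 * n + 1 - 2 * i) * math.log(1 - ui)
                  for i, ui in enumerate(u, 1)) / n
    return w2 * (1 + 0.5 / n), a2 * (1 + 0.75 / n + 2.25 / n ** 2)

# Artificial data: exact exponential quantiles at the plotting positions (i - 0.5)/n.
n = 20
data = [-math.log(1.0 - (i - 0.5) / n) for i in range(1, n + 1)]
w_star, a_star = cvm_ad_statistics(data, lambda t: 1.0 - math.exp(-t))
```

Because the artificial data sit exactly on the quantiles of the fitted cdf, both statistics come out small; with real data the fitted cdf would be the EHL cdf evaluated at the MLEs.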

The $W^*$ and $A^*$ statistics for all the models are given in Table 3. From the figures in this table, the proposed EHLLL model fits the current data better than the other models. Therefore, the new family may be an interesting alternative to the other models available in the literature for modeling positive real data.

More information is provided by a visual comparison of the histogram of the data with the fitted EHLF, EHLLL, EHLGHN, Fréchet, LL, and GHN distributions. The plot of the fitted EHLLL density is displayed in Figure 10(b) for the tubercle data. Clearly, the new EHLLL distribution provides a closer fit to the histogram.

Figure 11(a) displays plots of the empirical function and the estimated cdf’s of the EHLF, EHLLL, EHLGHN, Fréchet, LL, and GHN distributions. We note a good fit of the EHLLL and LL models to these data.

9.2. Application 2: Carbon Monoxide Data

The second data set consists of the carbon monoxide (CO) measurements made in several brands of cigarettes in 1998. The reports show that nicotine levels, on average, had remained stable since 1980, after falling in the preceding decade. The report entitled “Tar, nicotine, and carbon monoxide of the smoke of 1206 varieties of domestic cigarettes for the year of 1998” includes the data sets and some information about the source of the data, smoker’s behavior and beliefs about nicotine, and tar and carbon monoxide contents in cigarettes.

The CO data includes records of measurements of CO content, in milligrams, in cigarettes of several brands.

We fit the EHLF, EHLLL, EHLGHN, Fréchet, LL, GHN, BW, and $\beta$-BS distributions to the data. The computations were done using the NLMixed procedure in SAS. Table 4 lists the MLEs (and the corresponding standard errors in parentheses) of the model parameters and the values of AIC, BIC, and CAIC statistics for some models. These results indicate that the EHLF, EHLGHN, BW, and GHN models have the lowest AIC, BIC, and CAIC values.

The $W^*$ and $A^*$ statistics for all the models are given in Table 5. From the figures in this table, the proposed EHLF model fits the current data better than the other models.

In order to assess if the EHL-$G$ model is really appropriate, the plots of the fitted EHLF, EHLLL, EHLGHN, Fréchet, LL, and GHN density functions are displayed in Figure 12. Based on these plots, we conclude that the EHLF distribution provides the best fit to the carbon monoxide data.

Figure 13(a) displays plots of the empirical function and the estimated cdf’s of the EHLF, EHLLL, EHLGHN, Fréchet, LL, and GHN distributions. We note a good fit of the EHLF model to these data.

10. Conclusions

We propose a new exponentiated half-logistic (EHL) family which represents a competitive alternative for lifetime data analysis. For any parent continuous distribution $G$, we can define the corresponding EHL-$G$ distribution with two additional positive parameters. So, the new family extends several common distributions such as the Fréchet, normal, log-normal, Gumbel, and log-logistic distributions. The mathematical properties of the new family such as ordinary, incomplete, and factorial moments, generating and quantile functions, mean deviations, Bonferroni and Lorenz curves, Shannon entropy, Rényi entropy, reliability, and order statistics are obtained for any EHL-$G$ distribution. The model parameters are estimated by maximum likelihood. Two applications to real data illustrate the importance and potential of the new family.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.