Abstract

A four-parameter family of Weibull distributions is introduced, as an example of a more general class created along the lines of Marshall and Olkin, 1997. Various properties of the distribution are explored and its usefulness in modelling real data is demonstrated using maximum likelihood estimates.

1. Introduction

Probability distributions are often used in survival analysis for modeling data, because they offer insight into the nature of various parameters and functions, particularly the failure rate (or hazard) function. Over the last few decades, a considerable amount of research has been devoted to creating lifetime models whose hazard rates are not limited to the classical increasing and decreasing shapes; the motivation for this trend was to provide more freedom of choice in describing complex practical situations (see, e.g., [19] and the references therein). In this paper a general class of models is introduced by adding an extra parameter to a distribution in the sense of Marshall and Olkin [10]; the class is then used to develop a four-parameter modified Weibull extension distribution, whose variety of failure rate curves competes well with other alternatives in fitting real data. Specifically, Xie et al. [11] generalized the Chen [12] distribution by adding the missing scale parameter, thus creating a three-parameter Weibull distribution; although the variety of shapes of the reliability curves was not enriched, the resulting model provided a better fit to real data. The proposed distribution extends the Xie et al. [11] distribution by adding a shape parameter; it will be seen that, compared with the previous and other models, the cost of the addition is balanced by the improvement in fitting real data.

The paper is organized as follows. Section 2 presents the general class of models and some of its properties. The proposed four-parameter Weibull model is introduced in Section 3, where some of its properties and reliability aspects are studied. The parameters are estimated by the method of maximum likelihood and the observed information matrix is obtained; the fit of the proposed distribution to two sets of real data is examined against three- and two-parameter competitors.

2. The Class of Distributions

It is possible to generalize a distribution by adding a shape parameter, in the sense of Marshall and Olkin [10]. Thus, starting with a distribution with survival function , the survival function of the proposed family with the additional parameter is given by and when , then . The probability density and hazard functions are readily found to be where and are the probability density and hazard functions corresponding to the distribution with survival function , and , . Since, it follows from (2) that Therefore, with is increasing for and decreasing for . When , the hazard function at the origin, , behaves quite differently than the corresponding functions for the Weibull and gamma distributions; for both these families, the distribution can be exponential, or , or , so that is discontinuous in the shape parameter. This is not the case for the hazard functions in (2), and therefore the proposed family may be useful in fine-tuning the distribution with survival function .

3. A Weibull Extension Model

The survival function of the modified Weibull extension distribution, introduced by Xie et al. [11] and studied further by Tang et al. [13], is given by for ; hereinafter, the distribution with survival function given by (5) is referred to as the XTG distribution for brevity. By substituting (5) in (1), the survival function of the proposed distribution is obtained in the form where and . From (2) or (6), the pdf is readily obtained as where is a scale parameter and , , are shape parameters. It can be shown that the pdf can be monotone decreasing, unimodal, or even of roller-coaster type; the different shapes of the pdf are illustrated in Figure 1 for selected values of the parameters.

Clearly, for the proposed distribution reduces to the XTG distribution; the proposed model can therefore be viewed as an extension of the XTG model (which is asymptotically related to the usual two-parameter Weibull distribution), and if, in addition, , then (7) defines the Chen [12] distribution. Hereinafter, the proposed distribution is referred to as the extended XTG distribution (EXTG distribution for brevity). Furthermore, it can be shown that for (7) is a compound of the logarithmic and the XTG distributions. Indeed, by incorporating the results of Barlow and Proschan [14] and Arnold et al. [15], consider the lifetime of a “series-system” of identical components, where failure occurs if at least one component ceases to function. If the lifetimes of the components are iid random variables with survival functions given by (5) and the distribution of their number is logarithmic, independently of the ’s, with pmf for , , then the distribution of has pdf for , and the distribution of is the EXTG with pdf given by (7).
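The compounding argument can be checked numerically. The following is a minimal sketch under two assumptions that are not recoverable from the text as given here: the XTG survival function is written in the parameterisation of Xie et al. [11], exp{lam * alpha * (1 - exp((x / alpha)^beta))}, and the logarithmic pmf is -theta^n / (n * log(1 - theta)) for n = 1, 2, ..., with the hypothetical parameter name `theta`. Under these assumptions the survival function of the minimum lifetime is the probability generating function of the logarithmic distribution evaluated at the XTG survival function, and a truncated series should agree with that closed form.

```python
import math


def xtg_survival(x, lam, alpha, beta):
    """XTG survival function (assumed Xie et al. parameterisation)."""
    return math.exp(lam * alpha * (1.0 - math.exp((x / alpha) ** beta)))


def series_system_survival(x, theta, lam, alpha, beta, terms=500):
    """P(min(X_1, ..., X_N) > x), summed over N ~ logarithmic(theta).

    Conditional on N = n the minimum survives past x with probability
    xtg_survival(x)**n; the series is truncated after `terms` terms.
    """
    s = xtg_survival(x, lam, alpha, beta)
    norm = -1.0 / math.log(1.0 - theta)
    return sum(norm * theta ** n / n * s ** n for n in range(1, terms + 1))


def compound_survival(x, theta, lam, alpha, beta):
    """Closed form: logarithmic pgf evaluated at the XTG survival."""
    s = xtg_survival(x, lam, alpha, beta)
    return math.log(1.0 - theta * s) / math.log(1.0 - theta)
```

For 0 < theta < 1 the two functions should agree to within floating-point error, which is the content of the series-system argument; how `theta` maps to the additional parameter of the proposed family is left open here.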

The calculations of the th raw moments of the EXTG distribution involve the use of standard numerical integration procedures available in every mathematical package; for they can be expressed in the form,

By straightforward reversal of the cdf, obtained from (6) using that , the quantile function is calculated to be for ; hence the median is .
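The inversion of the cdf can be illustrated as follows. This is a sketch only, and it rests on assumptions: the survival function is taken to be log(1 - (1 - a) * F_bar(x)) / log(a), with F_bar the XTG survival function in the parameterisation of Xie et al. [11], and `a`, `lam`, `alpha`, `beta` are hypothetical names for the four parameters. If the form of (6) differs, the two inversion steps change accordingly.

```python
import math


def extg_survival(x, a, lam, alpha, beta):
    """Assumed EXTG survival: log(1 - (1 - a) * F_bar) / log(a), a != 1."""
    f_bar = math.exp(lam * alpha * (1.0 - math.exp((x / alpha) ** beta)))
    return math.log(1.0 - (1.0 - a) * f_bar) / math.log(a)


def extg_quantile(u, a, lam, alpha, beta):
    """Solve 1 - extg_survival(x) = u for x, with 0 <= u < 1 and a != 1."""
    # Step 1: recover the baseline (XTG) survival probability.
    f_bar = (1.0 - a ** (1.0 - u)) / (1.0 - a)
    # Step 2: invert the XTG survival function.
    inner = 1.0 - math.log(f_bar) / (lam * alpha)
    return alpha * math.log(inner) ** (1.0 / beta)


def extg_median(a, lam, alpha, beta):
    """Median under the same assumptions: the quantile at u = 1/2."""
    return extg_quantile(0.5, a, lam, alpha, beta)
```

Substituting the computed quantile back into the cdf should return `u`, which gives a quick consistency check on both functions.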

3.1. Failure Rate and Mean Residual Life Functions

From (6) and (7) the failure rate (also known as hazard rate) function of the EXTG distribution is It can be shown that for , the EXTG is an IFR distribution [16]. However, for it can be an IFR, DFR, or BTFR distribution, although it is not easy to determine analytically the corresponding ranges of the parameter values; the IFR, DFR, and BTFR characteristics are depicted in Figure 2 for selected values of the parameters. Given that there is no failure prior to , the residual life is the period from time until the time of failure. The mean residual lifetime, for , is Other reliability aspects of the distribution can be obtained numerically. For example, the renewal function, which is important for maintenance, can be calculated approximately either by the well-known method of considering its limit at infinity and the first and second raw moments given by (10), or by applying the method of the linear combination of the cdf and the hazard function; see Cui and Xie [17], Jiang [18] and the references therein.
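The mean residual lifetime can be evaluated numerically from any survival function S through the standard identity m(t) = (1 / S(t)) * integral of S(u) over u >= t. The sketch below is generic rather than specific to the EXTG distribution: `survival` is any callable, and the truncation point `upper` is an assumption that must be chosen large enough for the neglected tail to be negligible.

```python
import math


def mean_residual_life(survival, t, upper, steps=20000):
    """Approximate m(t) with the composite Simpson rule on [t, upper]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (upper - t) / steps
    total = survival(t) + survival(upper)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * survival(t + i * h)
    return total * h / 3.0 / survival(t)


def exponential_survival(x, rate=2.0):
    """Reference case: the exponential has constant m(t) = 1 / rate."""
    return math.exp(-rate * x)
```

The exponential case serves as a check, since its mean residual lifetime does not depend on t. For a survival function with a double-exponential tail, such as the XTG form, `upper` should also be kept small enough that the inner exponential does not overflow.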

3.2. Inference

Assuming a random sample of n observations, , from (7), the log-likelihood is given by

and by differentiating with respect to the gradients are The mle of , , is obtained by solving simultaneously the four nonlinear normal equations, , , and by any iterative numerical method such as the Newton-Raphson, quasi-Newton, or Nelder-Mead procedures.
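As an illustration of the quantity being maximised, the log-likelihood can be coded directly from the density. This sketch assumes the same forms as stated in its comments, none of which can be confirmed from the text as given here: an XTG baseline in the parameterisation of Xie et al. [11] and an EXTG survival function log(1 - (1 - a) * F_bar(x)) / log(a), with hypothetical parameter names `a`, `lam`, `alpha`, `beta`.

```python
import math


def extg_survival(x, a, lam, alpha, beta):
    """Assumed EXTG survival: log(1 - (1 - a) * F_bar) / log(a), a != 1."""
    f_bar = math.exp(lam * alpha * (1.0 - math.exp((x / alpha) ** beta)))
    return math.log(1.0 - (1.0 - a) * f_bar) / math.log(a)


def extg_logpdf(x, a, lam, alpha, beta):
    """Log-density implied by the assumed survival function, for x > 0."""
    z = (x / alpha) ** beta
    log_f_bar = lam * alpha * (1.0 - math.exp(z))  # log XTG survival
    # log XTG hazard: lam * beta * (x / alpha)**(beta - 1) * exp(z)
    log_hazard = math.log(lam * beta) + (beta - 1.0) * math.log(x / alpha) + z
    log_f = log_hazard + log_f_bar  # log XTG density
    f_bar = math.exp(log_f_bar)
    return (math.log((1.0 - a) / -math.log(a)) + log_f
            - math.log(1.0 - (1.0 - a) * f_bar))


def extg_loglik(data, a, lam, alpha, beta):
    """Log-likelihood of an iid sample under the assumed density."""
    return sum(extg_logpdf(x, a, lam, alpha, beta) for x in data)
```

The negative of `extg_loglik` could then be handed to any general-purpose optimiser of the kinds named above; a finite-difference derivative of `extg_survival` provides a simple check that the density is coded consistently.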

If the true parameter vector is an interior point of the parameter space, then can be treated as being approximately four-variate normal with mean and covariance matrix the inverse of Fisher’s expected information matrix , where is the observed information matrix with elements , and the expectation is to be taken with respect to the distribution of . By differentiating the normal equations in (15) the elements in the upper triangular part of are found to be

The latter is a consistent estimator of and can be used for constructing asymptotic confidence intervals for the parameters. However, if any of the true parameter values is zero then the asymptotic distribution of the maximum likelihood estimators is a mixture distribution [19]; in this case obtaining the asymptotic confidence intervals becomes quite difficult and shall not be pursued here.

3.3. Examples

In this section two sets of real data are considered in order to test the goodness of fit of the proposed model. The first set of data consists of times to first failure of fifty devices [20]. The second set of data involves forty-four observations obtained from a life test concerning failure times (in hours) of all subsystems of a machine, that is, engine, hydraulic and air-conditioning subsystems, brakes, transmissions, tyres and wheels, body and chassis [21, 22]; in both cases, the data were grouped and the empirical hazard rate, estimated by various methods, indicated a BT shape. In addition to the EXTG, the XTG distribution, the two-parameter Chen [12] distribution, and the three-parameter model introduced by Dimitrakopoulou et al. [3] were fitted to the datasets; for brevity, the latter two models are hereinafter referred to as the Chen and DAL distributions, respectively. The fit of each distribution was examined by the Akaike information criterion (AIC) and the Kolmogorov-Smirnov (K-S) goodness-of-fit test using maximum likelihood estimates; the estimates, the maximized log-likelihoods, the values of the AIC, and the values of the K-S statistic with the associated p-values are presented in Table 1. Furthermore, the values of the likelihood ratio test statistic for testing , calculated from the first and the second set of data, were 8.7939 and 6.8366, respectively; the analogous computations for testing were 11.8371 and 6.479. All the results indicate that the EXTG distribution describes these data better than the other models; these findings are also supported by the empirical and fitted survivor functions, plotted in Figure 3.
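The model-comparison quantities used here follow from maximized log-likelihoods alone. The sketch below uses the standard definitions AIC = 2k - 2l and LR = 2(l_full - l_restricted); referring the LR statistic to a chi-square distribution with one or two degrees of freedom is an assumption, since the hypotheses tested are not recoverable from the text as given here, and all function names are hypothetical.

```python
import math


def aic(max_loglik, n_params):
    """Akaike information criterion: 2k - 2 * maximized log-likelihood."""
    return 2.0 * n_params - 2.0 * max_loglik


def lr_statistic(loglik_full, loglik_restricted):
    """Likelihood ratio statistic for nested models."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_upper_tail(stat, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")
```

A smaller AIC indicates the preferred model, and a small upper-tail probability favours the larger model over the restricted one.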

Acknowledgment

The authors would like to thank a referee for useful comments and suggestions.