Special Issue: Applied Mathematics and Statistical Mechanics and their Applications
The Mixture of the Marshall–Olkin Extended Weibull Distribution under Type-II Censoring and Different Loss Functions
To study the heterogeneous nature of lifetimes of certain mechanical or engineering processes, a mixture of suitable lifetime distributions may be more appropriate than a single-component model. This paper considers a mixture of the Marshall–Olkin extended Weibull distribution for efficient modeling of failure, survival, and COVID-19 data under classical and Bayesian perspectives based on type-II censored data. We derive several properties of the proposed distribution, such as moments, incomplete moments, mean deviation, average lifetime, mean residual lifetime, Rényi entropy, Shannon entropy, and order statistics. Maximum likelihood and Bayes procedures are used to derive both point and interval estimates of the model parameters. Bayes estimators of the unknown parameters are obtained under symmetric (squared error) and asymmetric (linear exponential (LINEX)) loss functions using gamma priors for both the shape and the scale parameters. Furthermore, approximate confidence intervals and Bayes credible intervals (CIs) are also obtained. A Monte Carlo simulation study is carried out to assess the performance of the maximum likelihood estimators and Bayes estimators with respect to their estimated risk. The flexibility and importance of the proposed distribution are illustrated by means of four real datasets.
Finite mixture models have a long history in statistics. They have been used in particular to model population heterogeneity, to generalize distributional assumptions, and for clustering and classification. The concept of the finite mixture distribution was pioneered by Newcomb for modeling outliers. Eight years later, Pearson considered a mixture of two univariate Gaussian distributions and estimated the parameters of the model using the method of moments (MOM) to analyze a dataset containing ratios of the forehead to body lengths for 1,000 crabs. Since then, several authors have studied finite mixture models under different scenarios. Mendenhall and Hader considered exponentially distributed failure times based on censored lifetime data and estimated the model parameters using the maximum likelihood method. In their study, they divided the failure population into two subpopulations, each representing a different cause or type of failure. Radhakrishna et al. considered both moment and maximum likelihood estimators of the unknown parameters of a two-component mixture of generalized gamma distributions. Ahmed et al. obtained approximate Bayes estimators for the parameters of a mixture of two Weibull distributions under type-II censoring. Al-Hussaini et al. applied both maximum likelihood and Bayes estimation methods to a two-component mixture of the Gompertz distribution based on type-I and type-II censoring. Jaheen adopted both maximum likelihood and Bayesian approaches to estimate the parameters of a finite mixture of two exponential distributions based on record statistics. Shawky and Bakoban adopted both maximum likelihood and Bayesian approaches to estimate the parameters, reliability, and failure rate functions of two-component finite mixtures of the exponentiated gamma distribution.
Abu-Zinadah used maximum likelihood and Bayes estimation methods to estimate the parameters, reliability, and hazard functions of a mixture of exponentiated Pareto and exponential distributions under complete and type-II censoring schemes. Prakash adopted the Bayes method to estimate the parameters of a mixture of two Weibull distributions based on informative and noninformative priors. Zhang et al. proposed a mixture Weibull proportional hazard model to predict the failure of a mechanical system with multiple failure modes. They estimated the mixed model parameters by combining historical lifetime and monitoring data of all failure modes. ALgfary introduced the finite mixture of two exponentiated Kumaraswamy (MEKum) distributions and obtained the maximum likelihood estimates for the vector of the parameters of the MEKum distribution. Adham and ALgfary adopted the Bayesian approach to estimate the vector of parameters of the finite mixture of two-component exponentiated Kumaraswamy distribution. They also obtained Bayesian predictive density functions of future observations from the MEKum distribution. Ateya and Al Khald studied the finite mixture of truncated generalized Cauchy distribution based on type-I, type-II, and progressively type-II censored samples. Aslam et al. studied the three-component mixture of exponential, Rayleigh, Pareto, and Burr type-XII distributions in relation to reliability analysis. Tahir et al. studied the properties of the three-component mixture of Rayleigh distributions based on doubly censored lifetime data. Kalantan and Alrewely studied the two-component Laplace mixture model and obtained the estimates of the parameters using maximum likelihood and the method of moments. Recently, Tahir et al. also studied the three-component mixture of exponential distributions from the Bayesian perspective based on a type-II doubly censored sampling scheme. Kharazmi et al. studied the 2-component mixture of the Topp–Leone distribution and obtained classical and Bayes estimators based on the complete sample; see also the references cited therein.
The wide applicability of mixture modeling motivates us to develop a two-component mixture of the Marshall–Olkin extended Weibull distribution for efficient modeling of four datasets: breaking stress of carbon fibers, survival times in days of 72 guinea pigs infected with virulent tubercle bacilli, survival times in weeks of 33 patients suffering from acute myelogenous leukemia, and COVID-19 data from Canada covering 36 days. The primary objective of this paper is twofold. First, we obtain maximum likelihood estimators and corresponding approximate confidence intervals (CIs) of the unknown parameters of the 2-component mixture of the Marshall–Olkin extended Weibull (MOEW) distribution for type-II censored data. Next, we consider the Bayes estimation method. The Bayes estimators are derived and evaluated under two loss functions using independent gamma priors. Symmetric two-sided Bayes credible intervals are also obtained, and they are compared with the classical CIs. To the best of our knowledge, the 2-component mixture of the MOEW distribution has not been discussed before using the aforementioned methods of estimation. Through this paper, we aim to provide some guidelines on selecting the best estimator that may be of significant interest to applied statisticians, practitioners, and engineers.
The organization of this paper is as follows. The description of the model along with basic properties is reported in Section 2. We use the maximum likelihood estimation method based on type-II censoring as a part of frequentist methodology for parameter estimation in Section 3. We have also taken into account approximate CIs in the same section. In Section 4, we have derived the Bayes estimators of the unknown parameters of the model under squared error loss (SEL) and linear exponential (LINEX) loss functions using gamma priors for both scale and shape parameters. We have also obtained two-sided Bayes probability intervals in the same section. The simulation study is carried out in Section 5. For illustrative purposes, four real datasets are analyzed in Section 6. Finally, concluding remarks are presented in Section 7.
2. Model Description
The probability density function (pdf) of the Marshall–Olkin extended Weibull distribution (MOEW) for a random variable X is defined by (see Ghitany et al.  and Zhang and Xie )where , and the cumulative distribution function (cdf) of the distribution is
The hazard rate function of MOEW takes the form
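The density function of a mixture of two MOEW component densities $f_1$ and $f_2$ with mixing proportions $p_1$ and $p_2$ is
$$f(x) = p_1 f_1(x) + p_2 f_2(x), \quad (4)$$
where $p_j$, $j = 1, 2$, are the mixing proportions, satisfying the conditions $0 < p_j < 1$ and $p_1 + p_2 = 1$; all parameters are unknown, and the pdf is plotted in Figure 1. The cdf for the mixture model is
$$F(x) = p_1 F_1(x) + p_2 F_2(x),$$
where $F_j(x)$ is the cdf of the $j$th MOEW component.
The reliability function for the mixture model is
$$R(x) = p_1 R_1(x) + p_2 R_2(x),$$
where $R_j(x) = 1 - F_j(x)$ is the reliability function of the $j$th component.
Due to exponentiation of each component by a positive integer, the model is flexible enough to exhibit different shapes of the hazard rate function (hrf) of the mixture, which is given by
$$h(x) = \frac{f(x)}{R(x)} = \frac{p_1 f_1(x) + p_2 f_2(x)}{p_1 R_1(x) + p_2 R_2(x)}.$$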
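As an illustration, the component and mixture functions can be sketched in a few lines of Python. This is a minimal sketch under an assumption: it uses the standard Marshall–Olkin construction, survival $\alpha \bar{F}_0(x) / (1 - (1 - \alpha)\bar{F}_0(x))$, with a Weibull baseline survival $\bar{F}_0(x) = \exp(-(\lambda x)^{\beta})$. The names `alpha`, `beta`, `lam` and this particular Weibull parametrization are assumed for illustration and may differ from the one used in (1)–(3).

```python
import math

def moew_survival(x, alpha, beta, lam):
    """Survival of one MOEW component (assumed parametrization), x > 0."""
    s0 = math.exp(-((lam * x) ** beta))            # Weibull baseline survival
    return alpha * s0 / (1.0 - (1.0 - alpha) * s0)

def moew_cdf(x, alpha, beta, lam):
    return 1.0 - moew_survival(x, alpha, beta, lam)

def moew_pdf(x, alpha, beta, lam):
    """Density of one MOEW component: alpha * f0 / (1 - (1 - alpha) * S0)^2."""
    s0 = math.exp(-((lam * x) ** beta))
    f0 = beta * lam * (lam * x) ** (beta - 1.0) * s0  # Weibull baseline density
    return alpha * f0 / (1.0 - (1.0 - alpha) * s0) ** 2

def mixture_pdf(x, p1, comp1, comp2):
    """Two-component mixture density; comp1 and comp2 are (alpha, beta, lam)."""
    return p1 * moew_pdf(x, *comp1) + (1.0 - p1) * moew_pdf(x, *comp2)

def mixture_survival(x, p1, comp1, comp2):
    return p1 * moew_survival(x, *comp1) + (1.0 - p1) * moew_survival(x, *comp2)

def mixture_hazard(x, p1, comp1, comp2):
    """Mixture hazard rate: mixture density divided by mixture survival."""
    return mixture_pdf(x, p1, comp1, comp2) / mixture_survival(x, p1, comp1, comp2)
```

With `alpha = 1` each component reduces to the baseline Weibull distribution, which gives a quick sanity check of any implementation.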
3. General Properties of the MOEW Distribution
3.1. Moments
The rth moment of a finite mixture of the 2-component MOEW distribution is given bywhere .
When r = 1, the mean is given by
The moment-generating function of the mixture MOEW distribution is given by
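When a closed form is inconvenient, the raw moments of the mixture can be checked numerically. The sketch below uses the identity $E(X^r) = \int_0^\infty r x^{r-1} R(x)\,dx$ for a nonnegative lifetime and a composite Simpson rule on a truncated range. The MOEW parametrization (baseline survival $\exp(-(\lambda x)^{\beta})$), the function names, and the truncation point `upper` are assumptions made for illustration only.

```python
import math

def moew_survival(x, alpha, beta, lam):
    """Survival of one MOEW component under the assumed parametrization."""
    s0 = math.exp(-((lam * x) ** beta))
    return alpha * s0 / (1.0 - (1.0 - alpha) * s0)

def mixture_survival(x, p1, comp1, comp2):
    return p1 * moew_survival(x, *comp1) + (1.0 - p1) * moew_survival(x, *comp2)

def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def mixture_raw_moment(r, p1, comp1, comp2, upper=50.0):
    """Approximate E(X^r), r >= 1, as the integral of r x^(r-1) R(x) over [0, upper]."""
    def integrand(x):
        return r * x ** (r - 1) * mixture_survival(x, p1, comp1, comp2)
    return simpson(integrand, 0.0, upper)
```

The truncation point must be large enough that the survival function is negligible beyond it; for heavy right tails, `upper` should be increased.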
3.2. Incomplete Moments
The rth incomplete moment of a finite mixture of the 2-component mixture distribution is given bywhere is the rth incomplete moment of the jth component so that the rth incomplete moment of a finite mixture of the 2-component MOEW distribution is given by
The first incomplete moment of a finite mixture of the 2-component MOEW distribution is given by
3.3. Mean Deviations
The mean deviations of the random variable about the mean, and the median, , are given, respectively, bywhere and since and .
The median follows from the nonlinear equation . So, these quantities reduce towhere T1(z) is the first incomplete moment of X obtained from (15),where
3.4. Average Lifetime and Mean Residual Lifetime Functions
The average lifetime is given by
The application of mean residual lifetime can be seen in the paper of Guess and Proschan . The mean residual lifetime is given by
3.5. Rényi Entropy
The Rényi entropy of with pdf (4) is given bywhere .
A closed-form expression for the Rényi entropy is difficult to obtain for a finite mixture of the 2-component MOEW distribution.
3.6. Shannon Entropy
The Shannon entropy of X is given by
Thus, from (4), we can get the log-likelihood function as
Thus, the above equation can be rewritten as
3.7. Distribution of Order Statistics
3.8. Maximum Likelihood Estimation Based on Type-II Censoring
Here, we discuss the maximum likelihood estimates of the unknown parameters of the 2-component mixture of the MOEW distributions. In a life-testing experiment, n items from the above mixture model are put on test, and the experiment is terminated when a preassigned number of items, say r (<n), have failed. The samples obtained from such an experiment are called failure-censored samples or type-II censored samples. In the failure-censored case, the data comprise the lifetimes of the r items that have failed, and the remaining (n − r) items have survived beyond xr. Assuming that the lifetimes of the items are independent and identically distributed according to the mixture of MOEW distributions, the likelihood function for the type-II censoring scheme can be written as
The corresponding log-likelihood function can be written as
The resulting normal equations are
The MLEs of the parameters can be obtained by solving equations (32)–(39) simultaneously. Since explicit solutions cannot be obtained from the above equations, we propose to use a suitable numerical technique to solve these seven nonlinear equations; for example, one may use the Newton–Raphson method. This can be routinely done in R.
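The log-likelihood that such a numerical routine maximizes can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the generic type-II censored form, the sum of $\log f(x_i)$ over the r observed failures plus $(n - r)\log R(x_r)$, with the additive constant dropped, and the same assumed MOEW parametrization (baseline survival $\exp(-(\lambda x)^{\beta})$). If the component label of each failure is recorded, $f(x_i)$ would be replaced by $p_j f_j(x_i)$ for the observed component $j$. All function names are hypothetical.

```python
import math

def _moew_survival(x, alpha, beta, lam):
    s0 = math.exp(-((lam * x) ** beta))
    return alpha * s0 / (1.0 - (1.0 - alpha) * s0)

def _moew_pdf(x, alpha, beta, lam):
    s0 = math.exp(-((lam * x) ** beta))
    f0 = beta * lam * (lam * x) ** (beta - 1.0) * s0
    return alpha * f0 / (1.0 - (1.0 - alpha) * s0) ** 2

def type2_loglik(failures, n, p1, comp1, comp2):
    """Type-II censored log-likelihood (up to a constant) for the mixture.

    failures: the r observed failure times; n: number of items on test;
    comp1, comp2: (alpha, beta, lam) of each component.
    """
    r = len(failures)
    if not 0 < r <= n:
        raise ValueError("need 0 < r <= n")
    x_r = max(failures)  # test termination point

    def pdf(x):
        return p1 * _moew_pdf(x, *comp1) + (1.0 - p1) * _moew_pdf(x, *comp2)

    def surv(x):
        return p1 * _moew_survival(x, *comp1) + (1.0 - p1) * _moew_survival(x, *comp2)

    loglik = sum(math.log(pdf(x)) for x in failures)
    loglik += (n - r) * math.log(surv(x_r))  # contribution of the censored items
    return loglik
```

The negative of this function can be passed to any general-purpose optimizer, with the parameters constrained to be positive and the mixing proportion kept in (0, 1).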
3.9. Approximate Confidence Intervals
In this section, under the normality property of MLEs of the parameters , we obtain the asymptotic confidence interval. The asymptotic distribution of the MLE is , see Lawless , where , the inverse of the observed information matrix of the unknown parameters , is .
The above approach is used to derive approximate confidence intervals of the parameters of the forms , where is the upper percentile of the standard normal distribution.
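For a single parameter, the interval takes the familiar form of the estimate plus or minus a normal percentile times the standard error. A minimal sketch, assuming the variance is the corresponding diagonal element of the inverse observed information matrix (the function name and arguments are hypothetical):

```python
import math
from statistics import NormalDist

def approximate_ci(estimate, variance, level=0.95):
    """Approximate confidence interval: estimate +/- z * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)  # upper percentile
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width
```

For positive parameters with small samples, the lower limit can fall below zero; in that case it is common to truncate it at zero or to build the interval on the log scale.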
4. Bayesian Estimation Using the Gamma Prior Distribution
In this section, the Bayes estimates of the model parameters are obtained under the assumption that the random variables have independent gamma prior distributions (see Dey et al. [24–26]) with hyperparameters given bywhere . By multiplying (31) with (40), the joint posterior density for the vector of parameters given the data becomes
Marginal distributions of can be obtained by integrating with respect to the nuisance parameters. Next, we consider the loss function that will be used to derive the estimators from the marginal posterior distributions.
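On the log scale, the unnormalized joint posterior is the log-likelihood plus the sum of the log prior densities. The sketch below assumes a shape-rate parametrization of the gamma prior, which may differ from the hyperparameter convention used in (40); the function names are hypothetical, and the prior on the mixing proportion is left out because it is not specified here.

```python
import math

def log_gamma_prior(theta, shape, rate):
    """Log density of a gamma(shape, rate) prior evaluated at theta."""
    if theta <= 0:
        return -math.inf
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(theta) - rate * theta)

def log_posterior(params, hypers, loglik):
    """Unnormalized log posterior under independent gamma priors.

    params: parameter values; hypers: matching (shape, rate) pairs;
    loglik: function returning the log-likelihood at params.
    """
    log_prior = sum(log_gamma_prior(t, a, b) for t, (a, b) in zip(params, hypers))
    if log_prior == -math.inf:
        return log_prior
    return log_prior + loglik(params)
```

A function of this kind is what a sampler or numerical integrator would evaluate when the marginal posteriors are not available in closed form.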
4.1. Bayes Estimators of the Vector of Parameters
In this section, we derive the Bayes estimators of the model parameters under symmetric as well as asymmetric loss functions. A well-known symmetric loss function is the squared error loss function (SELF), which is defined as
The popularity of this loss function is due to its relationship to least squares theory; it also makes the calculations simpler. Under the SELF in (43), the Bayes estimate of any function of the parameters is derived as
All the above integrals have no closed form; so, we employ the numerical method to estimate the parameters. A useful asymmetric loss function, known as LINEX loss function, was introduced by Varian  and widely used by several authors (see Zellner  and Pandey and Rai ). We noticed that the LINEX loss function does not perform well for the estimation of the scale parameter in the whole parametric space, but performs well for a certain specified value of . Basu and Ebrahimi  also suggested that the LINEX loss function is proper for the location parameter, and it appears not to be suitable for the estimation of the scale parameter. The linear exponential (LINEX) is an asymmetric loss function defined as
Under the LINEX loss function, the Bayes estimators of any function can be written as
None of the above integrals has a closed form, so they are evaluated numerically.
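When posterior draws are available, both estimators can be approximated by sample averages. The sketch below assumes the usual forms: the posterior mean under squared error loss, and $-\frac{1}{c}\ln E(e^{-c\theta} \mid \text{data})$ under the LINEX loss $e^{c\Delta} - c\Delta - 1$ with $\Delta = \hat{\theta} - \theta$ and $c \neq 0$. The symbol `c` and the function names are assumptions for illustration.

```python
import math

def sel_estimate(draws):
    """Bayes estimate under squared error loss: the posterior mean."""
    return sum(draws) / len(draws)

def linex_estimate(draws, c):
    """Bayes estimate under LINEX loss with shape parameter c (c != 0)."""
    if c == 0:
        raise ValueError("c must be nonzero")
    mean_exp = sum(math.exp(-c * t) for t in draws) / len(draws)
    return -math.log(mean_exp) / c
```

For `c > 0` the LINEX estimate never exceeds the posterior mean (by Jensen's inequality), reflecting the heavier penalty on overestimation; as `c` approaches zero the two estimates coincide.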
4.2. Credible Intervals
In this section, a symmetric two-sided Bayes probability interval estimate of , denoted by , is obtained by satisfying the following expression:
Since it is difficult to find the interval limits analytically, we apply suitable numerical techniques to solve this nonlinear equation.
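One common numerical route, assuming draws from the marginal posterior are available, is to take empirical quantiles so that each tail carries the same posterior probability. This is a sketch of that approach, not necessarily the technique used to produce the reported intervals; the function names are hypothetical.

```python
import math

def _quantile(sorted_vals, q):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] * (1.0 - frac) + sorted_vals[upper] * frac

def equal_tailed_interval(draws, level=0.95):
    """Symmetric two-sided credible interval from posterior draws."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    return _quantile(ordered, tail), _quantile(ordered, 1.0 - tail)
```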
5. Simulation Study and Comparisons
Here, we carry out a Monte Carlo simulation study to assess the performance of the maximum likelihood estimators and Bayes estimators with respect to their estimated risk. For the simulation study, we fix the values of the component parameters and consider different values of the mixing proportion. We set sample sizes n = 20, 40, and 80.
Probabilistic mixing is used here to generate the mixture data. For each observation, a random number u is generated from the uniform (0, 1) distribution. If u < p1, the observation is drawn from F1 (the MOEW distribution with the parameters of the first component); otherwise, it is drawn from F2 (the MOEW distribution with the parameters of the second component). The number of failures at which the test is terminated is chosen so that the censoring rate of the resultant sample is approximately 10%. To implement censored sampling, the observed failure times are recorded separately for the first and second subpopulations. The remaining observations, which exceed the largest observed failure time in the respective subpopulation, are assumed to be censored from each component.
The simulated datasets have been obtained using the following steps:
Step 1: generate a uniform random number u corresponding to each observation
Step 2: if u < p1, take the observation from the first subpopulation; otherwise, from the second subpopulation
Step 3: determine the test termination points on the right, that is, the largest failure time to be observed in each subpopulation
Step 4: the observations which are greater than these termination points are considered to be censored from each component (type-II censoring)
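The steps above can be sketched as follows. The sketch makes two assumptions that are labeled in the code: the MOEW parametrization (baseline survival $\exp(-(\lambda x)^{\beta})$, which gives a closed-form quantile function), and a simplified censoring rule that keeps the r smallest observations of the pooled sample rather than separate termination points per component. The function names are hypothetical.

```python
import math
import random

def moew_quantile(u, alpha, beta, lam):
    """Inverse cdf of one MOEW component (assumed parametrization), 0 <= u < 1."""
    s0 = (1.0 - u) / (1.0 - (1.0 - alpha) * u)  # baseline survival at the quantile
    return (-math.log(s0)) ** (1.0 / beta) / lam

def simulate_type2_mixture(n, r, p1, comp1, comp2, rng):
    """Generate n mixture lifetimes and keep the r smallest (type-II censoring).

    Returns the observed failures from component 1, those from component 2,
    and the number of censored items.
    """
    labelled = []
    for _ in range(n):
        u = rng.random()                         # Step 1: uniform number
        if u < p1:                               # Step 2: choose the subpopulation
            labelled.append((moew_quantile(rng.random(), *comp1), 1))
        else:
            labelled.append((moew_quantile(rng.random(), *comp2), 2))
    labelled.sort()                              # Step 3: order the lifetimes
    observed = labelled[:r]                      # Step 4: the rest are censored
    failures1 = [x for x, label in observed if label == 1]
    failures2 = [x for x, label in observed if label == 2]
    return failures1, failures2, n - r
```

A censoring rate of roughly 10% corresponds to `r = round(0.9 * n)`, for example `simulate_type2_mixture(20, 18, 0.4, (0.5, 1.5, 1.0), (2.0, 0.8, 0.5), random.Random(1))`, where the parameter values are arbitrary placeholders.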
To avoid an extreme sample, we simulate 5000 datasets each of size . The abbreviations used in the tables are estimate, estimated risk, and length of CIs based on the maximum likelihood method, Bayes estimates based on the squared loss function, and Bayes estimates based on the LINEX loss function. The Bayes estimates, estimated risk, and length of the confidence interval are computed using R package. These results are reported in Table 1. We assume that the prior distributions follow gamma distribution with hyperparameters , .
From Table 1, we observe that, as sample size increases, estimated values of the parameters converge to the true values, and Bayes posterior risk tends to decrease. We also observe that, as sample size increases, the length of the classical confidence interval and Bayes credible interval decreases. It is to be noted that the Bayes estimates perform better than maximum likelihood estimates. In comparison of loss functions, the squared loss function provides better results than the LINEX loss function.
6. Real Data Analysis
In this section, we use four real-life datasets to illustrate the importance and flexibility of the MMOEW distribution. We compare the fits of the new MMOEW distribution with some other competitive models, such as Weibull (W), exponentiated Weibull (EW), exponentiated exponential (EE), Marshall–Olkin Weibull (MOW), and Marshall–Olkin extended Weibull (MOEW) distributions. The comparisons are based on several measures of goodness of fit, namely, the maximized log-likelihood under the model, the Akaike information criterion (AIC), Bayesian information criterion (BIC), Hannan–Quinn information criterion (HQIC), consistent Akaike information criterion (CAIC), and the Kolmogorov–Smirnov (KS) statistic with its p-value (PV). We observe that all the distributions in Tables 2–5 show a reasonably good fit for the given four datasets. The plots of empirical and fitted cdfs (Figures 2–5) also support the results in Tables 2–5. However, according to the cited statistics, the MMOEW model fits dataset II better than the other models.
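The information criteria can be computed directly from the maximized log-likelihood, the number of estimated parameters k, and the sample size n. The sketch below uses the standard definitions of AIC, BIC, and HQIC. For CAIC, two formulas appear in the literature under the same abbreviation, and which one is used in Tables 2–5 is not stated here, so both are returned under hypothetical key names.

```python
import math

def information_criteria(loglik, k, n):
    """Goodness-of-fit criteria from the maximized log-likelihood.

    loglik: maximized log-likelihood; k: number of parameters; n: sample size.
    """
    return {
        "AIC": 2 * k - 2 * loglik,
        "BIC": k * math.log(n) - 2 * loglik,
        "HQIC": 2 * k * math.log(math.log(n)) - 2 * loglik,
        # Bozdogan's consistent AIC
        "CAIC_consistent": k * (math.log(n) + 1) - 2 * loglik,
        # small-sample corrected AIC, also often labeled CAIC
        "CAIC_corrected": -2 * loglik + 2 * k * n / (n - k - 1),
    }
```

Smaller values indicate a better trade-off between fit and complexity, so models are compared by ranking each criterion across the candidate distributions fitted to the same dataset.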
Dataset I: the first dataset consists of 100 observations of breaking stress of carbon fibers. This dataset is obtained from Nichols and Padgett . These data are stated as follows: 0.98, 5.56, 5.08, 0.39, 1.57, 3.19, 4.90, 2.93, 2.85, 2.77, 2.76, 1.73, 2.48, 3.68, 1.08, 3.22, 3.75, 3.22, 3.70, 2.74, 2.73, 2.50, 3.60, 3.11, 3.27, 2.87, 1.47, 3.11, 4.42, 2.40, 3.15, 2.67, 3.31, 2.81, 2.56, 2.17, 4.91, 1.59, 1.18, 2.48, 2.03, 1.69, 2.43, 3.39, 3.56, 2.83, 3.68, 2.00, 3.51, 0.85, 1.61, 3.28, 2.95, 2.81, 3.15, 1.92, 1.84, 1.22, 2.17, 1.61, 2.12, 3.09, 2.97, 4.20, 2.35, 1.41, 1.59, 1.12, 1.69, 2.79, 1.89, 1.87, 3.39, 3.33, 2.55, 3.68, 3.19, 1.71, 1.25, 4.70, 2.88, 2.96, 2.55, 2.59, 2.97, 1.57, 2.17, 4.38, 2.03, 2.82, 2.53, 3.31, 2.38, 1.36, 0.81, 1.17, 1.84, 1.80, 2.05, and 3.65.
Dataset II: the second dataset consists of survival times in days of 72 guinea pigs infected with virulent tubercle bacilli. This dataset is taken from Bjerkedal . These data are as follows: 0.1, 0.33, 0.44, 0.56, 0.59, 0.72, 0.74, 0.77, 0.92, 0.93, 0.96, 1, 1, 1.02, 1.05, 1.07, 1.07, 1.08, 1.08, 1.08, 1.09, 1.12, 1.13, 1.15, 1.16, 1.2, 1.21, 1.22, 1.22, 1.24, 1.3, 1.34, 1.36, 1.39, 1.44, 1.46, 1.53, 1.59, 1.6, 1.63, 1.63, 1.68, 1.71, 1.72, 1.76, 1.83, 1.95, 1.96, 1.97, 2.02, 2.13, 2.15, 2.16, 2.22, 2.3, 2.31, 2.4, 2.45, 2.51, 2.53, 2.54, 2.54, 2.78, 2.93, 3.27, 3.42, 3.47, 3.61, 4.02, 4.32, 4.58, and 5.55.
Dataset III: the third dataset consists of the survival times in weeks of 33 patients suffering from acute myelogenous leukemia. This dataset is taken from Mahmoudi . These data are illustrated as follows: 65, 156, 100, 134, 16, 108, 121, 4, 39, 143, 56, 26, 22, 1, 1, 5, 65, 56, 65, 17, 7, 16, 22, 3, 4, 2, 3, 8, 4, 3, 30, 4, and 43.
Dataset IV: the fourth dataset consists of mortality rates. The data represent COVID-19 data from Canada covering 36 days, from 10 April to 15 May, 2020 (see the link https://covid19.who.int/). The data are as follows: 3.1091, 3.3825, 3.1444, 3.2135, 2.4946, 3.5146, 4.9274, 3.3769, 6.8686, 3.0914, 4.9378, 3.1091, 3.2823, 3.8594, 4.0480, 4.1685, 3.6426, 3.2110, 2.8636, 3.2218, 2.9078, 3.6346, 2.7957, 4.2781, 4.2202, 1.5157, 2.6029, 3.3592, 2.8349, 3.1348, 2.5261, 1.5806, 2.7704, 2.1901, 2.4141, and 1.9048.
7. Concluding Remarks
In this paper, we have introduced a two-component mixture model based on Marshall–Olkin extended Weibull distributions. Maximum likelihood and Bayes methods of estimation have been used to estimate the parameters of the mixture model. The numerical evidence shows that Bayes estimates perform better than the maximum likelihood estimates. Our simulated results follow the consistency property. The length of Bayes credible intervals is shorter than classical ones. From the simulation study, we may conclude that the Bayesian estimation has an advantage because of its small posterior risks as compared to the MLE method. If we compare the estimates with respect to loss functions, SELF performs better as compared to the LINEX loss function. Finally, for precise estimation of the unknown parameters of the Marshall–Olkin extended Weibull mixture model, Bayes method of estimation is preferable over maximum likelihood estimation, especially when the suitable prior information of the unknown parameters is available. The contents of the study may be useful in different fields where lifetime models are used for analysis of more than one causal factor of failure and where the data are type-II censored. The scope of this study may also be extended to other censoring schemes as well as for more than two-component mixture models.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This research was funded by the Deanship of Scientific Research at Princess Nourah Bint Abdulrahman University through the Fast-track Research Funding Program.
References
K. Pearson, “Contributions to the mathematical theory of evolution,” Philosophical Transactions, A, vol. 185, pp. 71–110, 1894.
K. E. Ahmed, H. M. Moustafa, and A. M. Abd-Elrahman, “Approximate Bayes estimation for mixture of two Weibull distributions under type-2 censoring,” Journal of Statistical Computation and Simulation, vol. 58, pp. 269–285, 1997.
Z. F. Jaheen, “On record statistics from a mixture of two exponential distributions,” Journal of Statistical Computation and Simulation, vol. 75, pp. 1–11, 2005.
A. I. Shawky and R. A. Bakoban, “On finite mixture of two-component exponentiated gamma distribution,” Journal of Applied Sciences Research, vol. 5, no. 10, pp. 1351–1369, 2009.
H. H. Abu-Zinadah, “A study on mixture of exponentiated Pareto and exponential distributions,” Journal of Applied Sciences Research, vol. 6, pp. 358–376, 2012.
G. Prakash, “Bayes estimation for a mixture of the Weibull distributions,” International Journal of Mathematics and Scientific Computing, vol. 2, no. 1, pp. 2231–5330, 2012.
A. A. ALgfary, “On finite mixture of exponentiated Kumaraswamy distributions,” King Abdulaziz University, Jeddah, Saudi Arabia, 2015, Masters thesis.
M. Tahir, M. Aslam, H. Hussain, M. Abid, and S. H. Bhatti, “Bayesian analysis of heterogeneous doubly censored lifetime data using the 3-component mixture of Rayleigh distributions: a Monte Carlo simulation study,” Scientia Iranica, vol. 26, no. 3, pp. 1789–1808, 2019.
Z. I. Kalantan and F. Alrewely, “A 2-component Laplace mixture model: properties and parametric estimations,” Mathematics and Statistics, vol. 7, no. 4A, pp. 9–16, 2019.
O. Kharazmi, S. Dey, and D. Kumar, “Statistical inference on 2-component mixture of Topp-Leone distribution, Bayesian and non-Bayesian estimation,” Journal of Mathematical Extension, 2020, In press.
F. Guess and F. Proschan, “Mean residual life: theory and applications,” Handbook of Statistics, vol. 7, pp. 215–224, 1988.
J. F. Lawless, Statistical Models and Methods for Lifetime Data, John Wiley & Sons, New York, NY, USA, 1982.
H. Varian, “A Bayesian approach to real estate assessment,” in Studies in Bayesian Econometrics and Statistics, S. E. Fienberg and A. Zellner, Eds., Scientific Research, Amsterdam, Netherlands, 1975.
A. Zellner, “Bayesian and non-Bayesian estimation using balanced loss functions,” in Statistical Decision Theory and Related Topics V, S. S. Gupta and J. O. Burger, Eds., Springer, Berlin, Germany, 1986.