Journal of Probability and Statistics
Volume 2013, Article ID 364705, 7 pages
http://dx.doi.org/10.1155/2013/364705
Research Article

Bayesian and Non-Bayesian Inference for Survival Data Using Generalised Exponential Distribution

Department of Biostatistics, School of Public Health, University of Ghana, Legon Accra, Ghana

Received 29 April 2013; Revised 5 August 2013; Accepted 8 August 2013

Academic Editor: Zhidong Bai

Copyright © 2013 Chris Bambey Guure and Samuel Bosomprah. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Kundu and Gupta introduced a two-parameter lifetime distribution known as the generalised exponential distribution. It has been proposed as an alternative to the well-known two-parameter Weibull and gamma distributions. We seek to estimate the parameters and the survival function of this distribution. The survival function gives the probability that a unit under investigation will survive beyond a specified time, say, $t$. We use several data sets to estimate the parameters and to assess how well the distribution can be used to analyse survival data. The estimators used in this study are compared through their standard errors. A simulation study is also carried out, in which the mean squared errors and absolute biases are obtained for the purpose of comparison.

1. Introduction

As the need grows for conceptualization, formalization, and abstraction in biology, so too does mathematics’ relevance to the field according to Fagerström et al. [1]. Mathematics is particularly important for analysing and characterizing random variation, for example, size and weight of individuals in populations, their sensitivity to chemicals, and time-to-event cases, such as the amount of time an individual needs to recover from illness. The frequency distribution of such data is a major factor determining the type of statistical analysis that can be validly carried out on any data set. Many widely used statistical methods, such as ANOVA (analysis of variance) and regression analysis, require that the data be normally distributed, but only rarely is the frequency distribution of data tested when these techniques are used (Limpert et al.) [2].

Gupta and Kundu [3] proposed the two-parameter generalised exponential distribution as an alternative to the lognormal, gamma, and Weibull distributions and studied its properties. Further references on this distribution are Raqab [4], Raqab and Ahsanullah [5], Zheng [6], and Kundu and Gupta [7]. According to Gupta and Kundu [8], the two-parameter generalised exponential distribution can have an increasing or a decreasing failure rate, depending on the shape parameter.

Some research has been done to compare the MLE with the Bayesian approach in estimating the survival function and the parameters of the Weibull distribution, which is similar to the generalised exponential distribution. Amongst others, Sinha [9] determined the Bayes estimates of the reliability function and the hazard rate of the Weibull failure time distribution by employing only the squared error loss function; Abdel-Wahid and Winterbottom [10] studied approximate Bayesian estimates for the Weibull reliability function and hazard rate from censored data by employing a new method that has the potential of reducing the number of terms in the Lindley procedure; Guure and Ibrahim [11] studied Bayesian analysis of the survival function and failure rate of the Weibull distribution with censored data; and Huang and Wu [12] considered Bayesian estimation and prediction for the Weibull model with progressive censoring. Similar work can be seen in Guure et al. [13], Zellner [14], Guure et al. [15], Al-Aboud [16], Al-Athari [17], and Pandey et al. [18].

The maximum likelihood method is one of the most popular estimation techniques for many distributions. From the classical point of view, maximum likelihood has some outstanding properties. In this paper we compare the maximum likelihood estimator with the Bayesian estimators to determine the best method for estimating the parameters of the generalised exponential distribution.

The remaining part of the paper is arranged as follows. Section 2 contains materials and methods and deals with the derivation of the maximum likelihood estimator and the Bayesian estimators of the parameters. Section 3 deals with the analysis of some real data sets, followed by a simulation study in Section 4. Results and discussion are in Section 5, and Section 6 is the conclusion.

2. Materials and Methods

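Let $t_1, t_2, \ldots, t_n$ be the set of random lifetimes from the generalised exponential distribution with parameters $\lambda$ and $\alpha$, where $\lambda$ is the scale parameter and $\alpha$ is the shape parameter (the notation of Gupta and Kundu [3]). The cumulative distribution function $F(t)$, the probability density function $f(t)$, and the survival function $S(t)$ are given, respectively, as
$$F(t; \alpha, \lambda) = \left(1 - e^{-\lambda t}\right)^{\alpha}, \quad t > 0,\ \alpha > 0,\ \lambda > 0, \tag{1}$$
the density function:
$$f(t; \alpha, \lambda) = \alpha \lambda\, e^{-\lambda t} \left(1 - e^{-\lambda t}\right)^{\alpha - 1}, \tag{2}$$
and the survival function:
$$S(t; \alpha, \lambda) = 1 - \left(1 - e^{-\lambda t}\right)^{\alpha}. \tag{3}$$
Let the distribution with the shape parameter $\alpha$ and the scale parameter $\lambda$ be denoted by $\mathrm{GE}(\alpha, \lambda)$.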
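As a minimal sketch of the three functions just defined, the following Python code evaluates the distribution function, density, and survival function of the generalised exponential distribution. The function names and the argument names `alpha` (shape) and `lam` (scale) are illustrative choices, not part of the original study.

```python
import math


def ge_cdf(t, alpha, lam):
    """Distribution function F(t) = (1 - exp(-lam * t)) ** alpha for t > 0."""
    if t <= 0:
        return 0.0
    return (1.0 - math.exp(-lam * t)) ** alpha


def ge_pdf(t, alpha, lam):
    """Density f(t) = alpha * lam * exp(-lam * t) * (1 - exp(-lam * t)) ** (alpha - 1)."""
    if t <= 0:
        return 0.0
    e = math.exp(-lam * t)
    return alpha * lam * e * (1.0 - e) ** (alpha - 1.0)


def ge_survival(t, alpha, lam):
    """Survival function S(t) = 1 - F(t): probability of surviving beyond t."""
    return 1.0 - ge_cdf(t, alpha, lam)
```

With `alpha = 1` the functions reduce to those of the ordinary exponential distribution with rate `lam`, which gives a quick sanity check.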

2.1. Maximum Likelihood Estimation

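Since $t_1, t_2, \ldots, t_n$ is the set of random lifetimes from the generalised exponential distribution with scale parameter $\lambda$ and shape parameter $\alpha$, the likelihood function is
$$L(\alpha, \lambda) = \prod_{i=1}^{n} f(t_i; \alpha, \lambda) = \alpha^{n} \lambda^{n} \exp\left(-\lambda \sum_{i=1}^{n} t_i\right) \prod_{i=1}^{n} \left(1 - e^{-\lambda t_i}\right)^{\alpha - 1}. \tag{4}$$
To obtain the equations for the unknown parameters, we take the logarithm of (4) and differentiate it partially with respect to each unknown parameter. By employing an iterative procedure such as Newton-Raphson, the estimates of the scale and shape parameters can be obtained. Readers can refer to Kundu and Gupta [7] for details.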
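For illustration, the step of taking logarithms and differentiating can be written out as follows. This is a sketch that assumes the generalised exponential density $\alpha\lambda e^{-\lambda t}(1 - e^{-\lambda t})^{\alpha - 1}$ with shape $\alpha$ and scale $\lambda$; the symbols are assumptions of this sketch.

```latex
\begin{aligned}
\ln L(\alpha, \lambda) &= n \ln \alpha + n \ln \lambda
   + (\alpha - 1) \sum_{i=1}^{n} \ln\left(1 - e^{-\lambda t_i}\right)
   - \lambda \sum_{i=1}^{n} t_i, \\
\frac{\partial \ln L}{\partial \alpha} &= \frac{n}{\alpha}
   + \sum_{i=1}^{n} \ln\left(1 - e^{-\lambda t_i}\right) = 0, \\
\frac{\partial \ln L}{\partial \lambda} &= \frac{n}{\lambda}
   + (\alpha - 1) \sum_{i=1}^{n} \frac{t_i e^{-\lambda t_i}}{1 - e^{-\lambda t_i}}
   - \sum_{i=1}^{n} t_i = 0.
\end{aligned}
```

The first equation can be solved for the shape in terms of the scale, $\hat{\alpha}(\lambda) = -n / \sum_{i=1}^{n} \ln(1 - e^{-\lambda t_i})$, so that only a one-dimensional equation in $\lambda$ remains to be solved numerically.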

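Therefore, the survival function can be estimated as
$$\hat{S}(t) = 1 - \left(1 - e^{-\hat{\lambda} t}\right)^{\hat{\alpha}}, \tag{5}$$
where $\hat{\lambda}$ and $\hat{\alpha}$ are the maximum likelihood estimates of the scale and shape parameters.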
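The following is a minimal sketch of how the maximum likelihood estimates and the plug-in survival estimate might be computed. It does not reproduce the Newton-Raphson routine used in the study: instead it maximises the profile log-likelihood in the scale by a golden-section search, assuming that the profile is unimodal on the search bracket. The function names, the bracket end points, and the iteration count are all hypothetical choices.

```python
import math


def log_likelihood(alpha, lam, times):
    """Log-likelihood of GE(alpha, lam): alpha is the shape, lam the scale."""
    n = len(times)
    s = sum(math.log1p(-math.exp(-lam * t)) for t in times)
    return n * math.log(alpha) + n * math.log(lam) + (alpha - 1.0) * s - lam * sum(times)


def profile_alpha(lam, times):
    """Shape value that maximises the log-likelihood for a fixed scale lam."""
    s = sum(math.log1p(-math.exp(-lam * t)) for t in times)
    return -len(times) / s


def fit_ge(times, iterations=100):
    """Return (alpha_hat, lam_hat) via golden-section search over log(lam)."""
    def profile(log_lam):
        lam = math.exp(log_lam)
        return log_likelihood(profile_alpha(lam, times), lam, times)

    # Assumed search bracket for log(lam), scaled by the data.
    a = math.log(1e-6 / max(times))
    b = math.log(20.0 / min(times))
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = profile(c), profile(d)
    for _ in range(iterations):
        if fc > fd:  # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = profile(c)
        else:        # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = profile(d)
    lam_hat = math.exp(0.5 * (a + b))
    return profile_alpha(lam_hat, times), lam_hat


def survival_mle(t, alpha_hat, lam_hat):
    """Plug-in estimate of the survival function at time t."""
    return 1.0 - (1.0 - math.exp(-lam_hat * t)) ** alpha_hat
```

A call such as `alpha_hat, lam_hat = fit_ge(times)` followed by `survival_mle(t, alpha_hat, lam_hat)` gives the estimated probability of surviving beyond `t`. The search assumes uncensored, strictly positive lifetimes that are not all equal.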

2.2. Bayesian Inference

In Bayesian analysis, the parameter of interest is always considered to be a random variable with a prior distribution. The prior distribution is the distribution of the parameter before any data are observed. The selection of a prior distribution is, more often than not, based on the type of prior information that is available to us. When we have little or no information about the parameter, a noninformative prior should be used; otherwise, an informative prior is appropriate. In analysing data from medical, engineering, or biological studies, it is possible to obtain information from similar studies in the past, and if even that is unattainable, information from an expert can be modelled to fit an appropriate prior distribution. This is referred to as prior elicitation.

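We let the two unknown parameters take gamma prior distributions, assuming that the hyperparameters are all known and greater than zero, that is, $a, b, c, d > 0$. The gamma prior is assumed because both the scale parameter $\lambda$ and the shape parameter $\alpha$ are greater than zero:
$$\pi_1(\lambda) \propto \lambda^{a - 1} e^{-b \lambda}, \qquad \pi_2(\alpha) \propto \alpha^{c - 1} e^{-d \alpha}. \tag{6}$$

Bayesian inference is based on the posterior distribution, which is obtained by dividing the joint density function by the marginal distribution function as given below:
$$\pi(\alpha, \lambda \mid t) = \frac{L(\alpha, \lambda)\, \pi_1(\lambda)\, \pi_2(\alpha)}{\int_0^{\infty} \int_0^{\infty} L(\alpha, \lambda)\, \pi_1(\lambda)\, \pi_2(\alpha)\, d\alpha\, d\lambda}, \tag{7}$$
where $L(\alpha, \lambda)$ is the likelihood function (4). Due to the complex nature of the posterior distribution given in (7), the Lindley approximation is employed in order to estimate the unknown parameters.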

The Bayes estimator is considered under two loss functions. Since, in drawing conclusions about the survival or duration of a living organism, an overestimation could be more detrimental than an underestimation or vice versa, we consider both an asymmetric (general entropy) loss function and a symmetric (squared error) loss function.

2.2.1. Lindley Approximation

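Lindley [19] proposed an approximation for a ratio of integrals of the form
$$\frac{\int w(\theta) \exp\{L(\theta)\}\, d\theta}{\int v(\theta) \exp\{L(\theta)\}\, d\theta}, \tag{8}$$
where $L(\theta)$ is the log-likelihood and $w(\theta)$, $v(\theta)$ are arbitrary functions of $\theta$; here $v(\theta)$ is the prior distribution for $\theta$, and $w(\theta) = u(\theta)\, v(\theta)$ with $u(\theta)$ being some function of interest as seen in (8).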

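The posterior expectation according to Sinha [9] is
$$E\left[u(\theta) \mid t\right] = \frac{\int u(\theta) \exp\{L(\theta) + \rho(\theta)\}\, d\theta}{\int \exp\{L(\theta) + \rho(\theta)\}\, d\theta}, \tag{9}$$
where $\rho(\theta) = \ln v(\theta)$ and $L(\theta)$ represents the log-likelihood function. Considering the Bayesian estimator under the squared error loss function, which is the posterior mean, the posterior expectation can be approximated asymptotically with respect to the two parameters by (10):
$$\hat{u} = u + \frac{1}{2}\left(u_{11}\sigma_{11} + u_{22}\sigma_{22}\right) + u_{1}\rho_{1}\sigma_{11} + u_{2}\rho_{2}\sigma_{22} + \frac{1}{2}\left(L_{30}\, u_{1}\sigma_{11}^{2} + L_{03}\, u_{2}\sigma_{22}^{2}\right), \tag{10}$$
where every term is evaluated at the maximum likelihood estimates, subscript 1 denotes differentiation with respect to the scale parameter $\lambda$ and subscript 2 differentiation with respect to the shape parameter $\alpha$, $\sigma_{11} = (-L_{20})^{-1}$, $\sigma_{22} = (-L_{02})^{-1}$, and, for gamma priors with hyperparameters $(a, b)$ on the scale and $(c, d)$ on the shape, $\rho_{1} = (a - 1)/\lambda - b$ and $\rho_{2} = (c - 1)/\alpha - d$. The second and third derivatives with respect to the scale and shape parameters are
$$L_{20} = \frac{\partial^{2} L}{\partial \lambda^{2}} = -\frac{n}{\lambda^{2}} - (\alpha - 1) \sum_{i=1}^{n} \frac{t_i^{2} e^{-\lambda t_i}}{\left(1 - e^{-\lambda t_i}\right)^{2}}, \tag{11}$$
$$L_{30} = \frac{\partial^{3} L}{\partial \lambda^{3}} = \frac{2n}{\lambda^{3}} + (\alpha - 1) \sum_{i=1}^{n} \frac{t_i^{3} e^{-\lambda t_i}\left(1 + e^{-\lambda t_i}\right)}{\left(1 - e^{-\lambda t_i}\right)^{3}}, \tag{12}$$
$$L_{02} = \frac{\partial^{2} L}{\partial \alpha^{2}} = -\frac{n}{\alpha^{2}}, \qquad L_{03} = \frac{\partial^{3} L}{\partial \alpha^{3}} = \frac{2n}{\alpha^{3}}. \tag{13}$$
To estimate the survival function under the squared error loss function, we let
$$u = S(t) = 1 - \left(1 - e^{-\lambda t}\right)^{\alpha}, \tag{14}$$
where
$$\begin{aligned} u_{1} &= -\alpha t\, e^{-\lambda t}\left(1 - e^{-\lambda t}\right)^{\alpha - 1}, & u_{11} &= \alpha t^{2} e^{-\lambda t}\left(1 - e^{-\lambda t}\right)^{\alpha - 2}\left(1 - \alpha e^{-\lambda t}\right), \\ u_{2} &= -\left(1 - e^{-\lambda t}\right)^{\alpha} \ln\left(1 - e^{-\lambda t}\right), & u_{22} &= -\left(1 - e^{-\lambda t}\right)^{\alpha} \left[\ln\left(1 - e^{-\lambda t}\right)\right]^{2}. \end{aligned} \tag{15}$$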
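To make the approximation concrete, the sketch below applies a two-parameter Lindley-type expansion without cross terms to the functions $u = \lambda$ and $u = \alpha$, giving approximate posterior means of the scale and shape. It assumes gamma priors with density proportional to $\lambda^{a-1}e^{-b\lambda}$ for the scale and $\alpha^{c-1}e^{-d\alpha}$ for the shape. The function name, the argument names, and this particular form of the expansion are assumptions of the sketch rather than a transcription of the original computation.

```python
import math


def lindley_squared_error(alpha_hat, lam_hat, times, a, b, c, d):
    """Approximate posterior means (alpha, lam) around the MLEs.

    alpha_hat, lam_hat: maximum likelihood estimates (shape, scale).
    a, b: assumed gamma hyperparameters for the scale; c, d: for the shape.
    """
    n = len(times)
    e = [math.exp(-lam_hat * t) for t in times]

    # Second and third derivatives of the log-likelihood in the scale.
    l20 = -n / lam_hat ** 2 - (alpha_hat - 1.0) * sum(
        t ** 2 * ei / (1.0 - ei) ** 2 for t, ei in zip(times, e))
    l30 = 2.0 * n / lam_hat ** 3 + (alpha_hat - 1.0) * sum(
        t ** 3 * ei * (1.0 + ei) / (1.0 - ei) ** 3 for t, ei in zip(times, e))

    # Second and third derivatives of the log-likelihood in the shape.
    l02 = -n / alpha_hat ** 2
    l03 = 2.0 * n / alpha_hat ** 3

    # Inverse negative curvature; assumes l20 < 0 at the MLE.
    s11 = -1.0 / l20
    s22 = -1.0 / l02

    # Derivatives of the log-prior.
    rho1 = (a - 1.0) / lam_hat - b
    rho2 = (c - 1.0) / alpha_hat - d

    # For u = lam: u_1 = 1 and every other derivative of u is zero.
    lam_bs = lam_hat + rho1 * s11 + 0.5 * l30 * s11 ** 2
    # For u = alpha: u_2 = 1 and every other derivative of u is zero.
    alpha_bs = alpha_hat + rho2 * s22 + 0.5 * l03 * s22 ** 2
    return alpha_bs, lam_bs
```

Under this form, when the shape hyperparameters `c` and `d` are very close to zero the prior and third-derivative corrections for the shape cancel, so the approximate posterior mean of the shape coincides with its maximum likelihood estimate.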

2.2.2. General Entropy Loss Function

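The general entropy loss function is a generalisation of the entropy loss function; with loss parameter $k \neq 0$ it is proportional to $(\hat{u}/u)^{k} - k \ln(\hat{u}/u) - 1$. The Bayes estimator of $u$ under the general entropy loss is
$$\hat{u}_{BG} = \left[E_{u}\left(u^{-k}\right)\right]^{-1/k}, \tag{16}$$
provided $E_{u}(u^{-k})$ exists and is finite. The Bayes estimators for this loss function with respect to the scale parameter $\lambda$, the shape parameter $\alpha$, and the survival function are
$$\hat{\lambda}_{BG} = \left[E\left(\lambda^{-k} \mid t\right)\right]^{-1/k}, \qquad \hat{\alpha}_{BG} = \left[E\left(\alpha^{-k} \mid t\right)\right]^{-1/k}, \qquad \hat{S}(t)_{BG} = \left[E\left(S(t)^{-k} \mid t\right)\right]^{-1/k}. \tag{17}$$
A similar Lindley approach is used for the general entropy loss function as in the squared error loss function, with
$$u = \lambda^{-k}, \qquad u_{1} = -k \lambda^{-k-1}, \qquad u_{11} = k(k + 1) \lambda^{-k-2}, \qquad u_{2} = u_{22} = 0, \tag{18}$$
and analogously for $\alpha$ with the roles of the two subscripts interchanged, where subscripts 1 and 2 denote differentiation with respect to the scale and shape parameters. For the general entropy loss function, the posterior expectation according to Lindley [19] can be approximated by using (18). The survival function is similarly obtained by substituting (14) into (18):
$$u = \left[S(t)\right]^{-k}, \qquad u_{j} = -k S^{-k-1} S_{j}, \qquad u_{jj} = k(k + 1) S^{-k-2} S_{j}^{2} - k S^{-k-1} S_{jj}, \qquad j = 1, 2,$$
where $S_{j}$ and $S_{jj}$ are the first and second partial derivatives of $S(t)$ with respect to the scale ($j = 1$) and shape ($j = 2$) parameters.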
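The form of the general entropy estimator, $[E(u^{-k})]^{-1/k}$, can be illustrated with a small sketch. Here the posterior expectation is replaced by an average over a list of posterior draws of $u$, which is a swapped-in device for illustration only; the study itself approximates the expectation with Lindley's method. The function name and arguments are hypothetical.

```python
def general_entropy_estimate(draws, k):
    """Bayes estimate under general entropy loss from positive posterior draws.

    draws: positive values of the quantity of interest (assumed to be available).
    k: loss parameter, nonzero; k = -1 recovers the posterior mean.
    """
    if k == 0:
        raise ValueError("the loss parameter k must be nonzero")
    mean_power = sum(u ** (-k) for u in draws) / len(draws)
    return mean_power ** (-1.0 / k)
```

For a fixed set of draws the estimate decreases as `k` increases, so a positive loss parameter gives a smaller estimate than the posterior mean.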

3. Real Data Analysis

3.1. Example 1

We analyse two data sets in this section, which we consider to be relatively small and relatively moderate in size, for illustration and comparison purposes. The data are obtained from Lawless [20] and represent survival times for two groups of laboratory mice, all of which were exposed to a fixed dose of radiation at an age of 5 to 6 weeks. The first group of mice lived in a conventional laboratory environment, and the second group was kept in a germ-free environment. The cause of death for each mouse was assigned after autopsy to be one of three things: thymic lymphoma, reticulum cell sarcoma, or other causes. The mice all died by the end of the experiment, so there is no censoring. For the purpose of our study, we take the first set of data from the control group as the relatively small sample and the third set of data from the germ-free group as the relatively moderate sample.

Using the iterative procedure described in Section 2.1, the MLEs of the scale parameter and the shape parameter for the relatively small sample are 0.015024 and 36.935530, respectively, with corresponding standard errors of 0.003203 and 8.374682. For the relatively moderate sample, the corresponding standard errors of the two estimates are 0.000606 and 1.999012.

Since we do not have any prior information on the hyperparameters, we assign them fixed values that keep the priors on the scale and shape parameters proper, so that the corresponding posteriors are also proper.

When we compute the Bayes estimators of the scale and shape parameters under squared error loss for the relatively small sample, the parameter estimates are 0.014605 and 36.935530, with standard errors of 0.003114 and 8.374682, respectively. For the moderate sample, the scale estimate has a standard error of 0.000603 and the shape estimate has a standard error of 1.999012.

Computing the Bayes estimates of the scale and shape parameters and their corresponding standard errors under the general entropy loss function for the relatively small sample, the first value of the loss parameter gives estimates of 0.014270 and 35.630480 with standard errors of 0.003042 and 8.096443, and the second value gives estimates of 0.014517 and 36.600361 with standard errors of 0.003095 and 8.303224, respectively.

For the relatively moderate sample, the Bayes estimates of the scale and shape parameters under the general entropy loss function are 0.003646 and 8.445543 with standard errors of 0.000591 and 1.970048 for the first value of the loss parameter, and 0.003698 and 8.578749 with standard errors of 0.000600 and 1.991657 for the second value, respectively.

For the shape parameter, the Bayes estimator under squared error loss has the same estimate and standard error as the classical maximum likelihood estimator, whereas for the scale parameter it has a smaller standard error than the MLE.

3.2. Example 2

In this section, we analyse another real data set, which we consider to be relatively large and complete, for illustrative purposes. The data originate from Bjerkedal [21] and represent the survival times of guinea pigs injected with different doses of tubercle bacilli. Guinea pigs are known to be highly susceptible to human tuberculosis, and this informs our decision to use the data in our study. Here, our primary concern is with the animals in the same cage that were under the same regimen. The regimen number is the common logarithm of the number of bacillary units in 0.5 mL of challenge solution; that is, regimen 6.6 corresponds to about $4.0 \times 10^{6}$ bacillary units per 0.5 mL ($\log_{10}(4.0 \times 10^{6}) \approx 6.6$). Corresponding to regimen 6.6, there were 72 observations.

From Table 1, it is evident that the estimator with the smallest parameter estimate and the smallest corresponding standard error is the Bayes estimator under the general entropy loss function. This holds for both parameters when the loss parameter is positive.

Table 1: Average parameter estimates and their corresponding standard errors.

The survival function is of central importance, so the accuracy of its estimate is crucial in both biological and medical studies. As presented in Table 2, the estimator with the smallest standard error for every sample is the classical maximum likelihood estimator; the MLE is therefore preferred to the others for estimating the survival function.

Table 2: Standard errors of the survival function.

Comparing all the estimators, the results show that the Bayes estimator under the general entropy loss function with a positive loss parameter has the smallest estimate and standard error for both the shape parameter and the scale parameter.

4. Simulation Study

Since it is difficult to compare the performance of the estimators theoretically and also to validate the real data employed in this paper, we have performed extensive simulations to compare the estimators through mean squared errors and absolute biases by employing different sample sizes with different parameter values.

In the simulation study we consider three sample sizes, of which the two larger are $n = 50$ and $n = 100$, to represent small, moderate, and large data sets. The following steps were employed to generate the data.

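The generation of $\mathrm{GE}(\alpha, \lambda)$ random variates, with shape parameter $\alpha$ and scale parameter $\lambda$, is simple, as stated in Gupta and Kundu [8]. If $U$ follows a uniform distribution on the interval $(0, 1)$, then $X = -\ln\left(1 - U^{1/\alpha}\right)/\lambda$ follows $\mathrm{GE}(\alpha, \lambda)$. Consequently, with a good uniform random number generator, the generation of $\mathrm{GE}(\alpha, \lambda)$ random deviates is immediate.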
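The inverse-transform step can be sketched in a few lines of Python. The function name `ge_sample` and its arguments are illustrative, and the standard-library generator stands in for whichever uniform generator was actually used in the study.

```python
import math
import random


def ge_sample(n, alpha, lam, rng=None):
    """Draw n lifetimes from GE(alpha, lam) by inverting the distribution function.

    alpha is the shape, lam the scale; rng is an optional random.Random instance.
    """
    rng = rng or random.Random()
    lifetimes = []
    while len(lifetimes) < n:
        p = rng.random() ** (1.0 / alpha)
        # Skip the rare boundary values that would give a zero or undefined lifetime.
        if 0.0 < p < 1.0:
            lifetimes.append(-math.log1p(-p) / lam)
    return lifetimes
```

For example, `ge_sample(50, 1.6, 1.0, random.Random(1))` returns a reproducible sample of 50 lifetimes with shape 1.6 and unit scale.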

Lifetimes are generated from the generalised exponential distribution for each of the sample sizes indicated above; each lifetime represents the failure of a product or unit. The assumed true values of the shape parameter were taken to be 0.8, 1.6, and 2.2. The scale parameter was fixed at one throughout.

We observed that the parameter estimates under the classical maximum likelihood method could not be obtained in closed form, and we therefore employed the Newton-Raphson iterative approach via the Hessian matrix. This can be implemented in the R programming language with the package maxLik.

To compute the Bayes estimates, we assume that the scale and shape parameters take, respectively, Gamma$(a, b)$ and Gamma$(c, d)$ priors. We set the hyperparameters to 0.0001, that is, $a = b = c = d = 0.0001$, in order to obtain proper priors. This approach was suggested by Press and Tanur [22]. Note that at this point the posterior distribution is also proper.

Specific values of the loss parameter are fixed for the general entropy loss function, and the study can be extended to other values of the loss parameter. Readers are referred to Calabria and Pulcini [23] for the choice of the loss parameter values. The whole procedure was replicated a fixed number of times. The mean squared error and the absolute bias values are determined and presented below for the purpose of comparison.
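The two comparison criteria can be computed from the replicated estimates as in the sketch below. The absolute bias is taken here as the average absolute deviation of the estimates from the true value; the absolute value of the average deviation is another common convention, and the text does not say which one was used. The function name is hypothetical.

```python
def mse_and_abs_bias(estimates, true_value):
    """Return (mean squared error, absolute bias) over simulation replications."""
    r = len(estimates)
    mse = sum((est - true_value) ** 2 for est in estimates) / r
    # Assumed definition: average absolute deviation from the true value.
    abs_bias = sum(abs(est - true_value) for est in estimates) / r
    return mse, abs_bias
```

In a simulation, `estimates` would hold one estimate of a parameter, or of the survival function at a fixed time, per replication, and `true_value` the value used to generate the data.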

5. Results and Discussion

In this study, our objective is to obtain the estimates of the parameters and to observe the performance of the methods used for estimation. To examine the estimates of the parameters, we obtain the absolute biases and mean squared errors of the estimates under different methods of estimation.

From Table 3, it is clear that the estimator with the smallest mean squared errors, as well as the smallest absolute biases, for the scale parameter is the Bayes estimator under the general entropy loss function. This is followed closely by the Bayes estimator under the squared error loss function. We also observe that, as the sample size increases, the mean squared errors of all the estimators decrease steadily, which indicates that the estimators become more reliable as more data are available.

Table 3: Average mean squared errors and absolute biases of the estimated scale parameter.

In Table 4, which contains the mean squared errors and the absolute biases of the estimated shape parameter, we notice that the two estimators, maximum likelihood and Bayes under the squared error loss function, have the same mean squared errors and absolute biases. This is expected, since the priors used for the Bayesian analysis are noninformative. With regard to the survival function, the Bayes estimator under the general entropy loss function gives the minimum bias with relatively small samples, while the maximum likelihood estimator is slightly ahead of the other estimators with respect to the mean squared error. The bold numbers indicate the smallest mean squared errors and minimum biases of the estimated parameters and their corresponding estimators.

Table 4: Average mean squared errors and absolute biases of the estimated shape parameter.

From Table 5, we observe that the classical maximum likelihood estimator (MLE), compared with the Bayes estimators under the squared error and general entropy loss functions, has the smallest mean squared errors as well as the minimum absolute biases for the estimated survival function of the generalised exponential distribution. This implies that the maximum likelihood estimator may be preferred to the others when estimating the survival function.

Table 5: Average mean squared errors and absolute biases of the survival function at each sample size considered, up to $n = 100$.

6. Conclusion

In this paper, we have considered the Bayes estimation of the unknown parameters of the generalised exponential distribution. We assumed a gamma prior on both parameters and provided the Bayes estimators under the squared error and general entropy loss functions. The Bayes estimators cannot be obtained in explicit form, owing to the complex nature of the posterior distribution from which Bayesian inference is drawn; Lindley's approximation procedure is therefore used. The parameter estimates under the classical maximum likelihood method also could not be obtained in closed form, so we employed the Newton-Raphson iterative approach via the Hessian matrix.

From the results and discussion above, it is evident that the Bayes estimator under the general entropy loss function performed better than both the Bayes estimator under the squared error loss function and the maximum likelihood estimator in estimating the scale parameter, in terms of both mean squared error and absolute bias. In the case of the shape parameter, the Bayes estimator under the squared error loss function and the maximum likelihood estimator are almost equivalent. For the survival function, maximum likelihood performed better than the other estimators.

References

  1. T. Fagerström, P. Jagers, P. Schuster, and E. Szathmary, “Biologists put on mathematical glasses,” Science, vol. 274, no. 5295, pp. 2039–2040, 1996.
  2. E. Limpert, W. A. Stahel, and M. Abbt, “Log-normal distributions across the sciences: keys and clues,” BioScience, vol. 51, no. 5, pp. 341–352, 2001.
  3. R. D. Gupta and D. Kundu, “Generalized exponential distributions,” Australian & New Zealand Journal of Statistics, vol. 41, no. 2, pp. 173–188, 1999.
  4. M. Z. Raqab, “Inferences for generalized exponential distribution based on record statistics,” Journal of Statistical Planning and Inference, vol. 104, no. 2, pp. 339–350, 2002.
  5. M. Z. Raqab and M. Ahsanullah, “Estimation of the location and scale parameters of generalized exponential distribution based on order statistics,” Journal of Statistical Computation and Simulation, vol. 69, no. 2, pp. 109–123, 2001.
  6. G. Zheng, “On the Fisher information matrix in type II censored data from the exponentiated exponential family,” Biometrical Journal, vol. 44, no. 3, pp. 353–357, 2002.
  7. D. Kundu and R. D. Gupta, “Generalized exponential distribution: Bayesian estimations,” Computational Statistics & Data Analysis, vol. 52, no. 4, pp. 1873–1883, 2008.
  8. R. D. Gupta and D. Kundu, “Generalized exponential distribution: different method of estimations,” Journal of Statistical Computation and Simulation, vol. 69, no. 4, pp. 315–337, 2001.
  9. S. K. Sinha, “Bayes estimation of the reliability function and hazard rate of a Weibull failure time distribution,” Trabajos de Estadistica, vol. 1, no. 2, pp. 47–56, 1986.
  10. A. A. Abdel-Wahid and A. Winterbottom, “Approximate Bayesian estimates for the Weibull reliability function and hazard rate from censored data,” Journal of Statistical Planning and Inference, vol. 16, no. 3, pp. 277–283, 1987.
  11. C. B. Guure and N. A. Ibrahim, “Bayesian analysis of the survival function and failure rate of Weibull distribution with censored data,” Mathematical Problems in Engineering, vol. 2012, Article ID 329489, 18 pages, 2012.
  12. S. R. Huang and S. J. Wu, “Bayesian estimation and prediction for Weibull model with progressive censoring,” Journal of Statistical Computation and Simulation, vol. 82, no. 11, pp. 1607–1620, 2012.
  13. C. B. Guure, N. A. Ibrahim, and M. Bakri Adam, “Bayesian inference of the Weibull model based on interval-censored survival data,” Computational and Mathematical Methods in Medicine, vol. 2013, Article ID 849520, 10 pages, 2013.
  14. A. Zellner, “Bayesian estimation and prediction using asymmetric loss functions,” Journal of the American Statistical Association, vol. 81, no. 394, pp. 446–451, 1986.
  15. C. B. Guure, N. A. Ibrahim, M. B. Adam, S. Bosomprah, and A. O. M. Ahmed, “Bayesian parameter and reliability estimate of Weibull failure time distribution,” Bulletin of the Malaysian Mathematical Sciences Society. In press.
  16. F. M. Al-Aboud, “Bayesian estimations for the extreme value distribution using progressive censored data and asymmetric loss,” International Mathematical Forum, vol. 4, no. 33–36, pp. 1603–1622, 2009.
  17. F. M. Al-Athari, “Parameter estimation for the double-Pareto distribution,” Journal of Mathematics and Statistics, vol. 7, no. 4, pp. 289–294, 2011.
  18. B. N. Pandey, N. Dwividi, and B. Pulastya, “Comparison between Bayesian and maximum likelihood estimation of the scale parameter in Weibull distribution with known shape under linex loss function,” Journal of Scientometric Research, vol. 55, pp. 163–172, 2011.
  19. D. V. Lindley, “Approximate Bayesian methods,” Trabajos Estadist, vol. 31, pp. 223–245, 1980.
  20. J. F. Lawless, Statistical Models and Methods for Lifetime Data, Wiley, New York, NY, USA, 2003.
  21. T. Bjerkedal, “Acquisition of resistance in Guinea pigs infected with different doses of virulent tubercle bacilli,” American Journal of Epidemiology, vol. 72, no. 1, pp. 130–148, 1960.
  22. S. J. Press and J. M. Tanur, The Subjectivity of Scientists and the Bayesian Approach, Wiley, New York, NY, USA, 2001.
  23. R. Calabria and G. Pulcini, “Point estimation under asymmetric loss functions for left-truncated exponential samples,” Communications in Statistics, vol. 25, no. 3, pp. 585–600, 1996.