International Scholarly Research Notices
Volume 2014, Article ID 430357, 9 pages
http://dx.doi.org/10.1155/2014/430357
Research Article

Bayesian Perspective on Random Censored Survival Data

Department of Biostatistics, School of Public Health, University of Ghana, Legon, Accra, Ghana

Received 10 June 2014; Accepted 12 July 2014; Published 29 October 2014

Academic Editor: Francisco W. S. Lima

Copyright © 2014 Chris B. Guure and Samuel Bosomprah. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

A unit is said to be randomly censored when the time of occurrence of an event is not available because of loss to follow-up, withdrawal, or nonoccurrence of the outcome event before the end of the study. Under independent random (noninformative) censoring, each individual $i$ is assumed to have a failure time $T_i$ and a censoring time $C_i$; however, one can only observe the random vector, say, $(X_i, \delta_i)$, where $X_i = \min(T_i, C_i)$ and $\delta_i$ indicates whether the failure was observed. The classical approach is considered for analysing the generalised exponential distribution with randomly (noninformatively) censored samples, which occur most often in biological or medical studies. Bayes methods are also considered via the numerical approximation suggested by Lindley in 1980 and the Laplace approximation procedure developed by Tierney and Kadane in 1986, with assumed informative priors, under the linear exponential (LINEX) loss function and the squared error loss function. A simulation study is carried out to compare the estimators proposed in this paper. The methods are also illustrated on two datasets.

1. Introduction

The generalised exponential distribution was introduced by [1] as a new distribution for analysing time-to-event data. According to [1], it can be used as an alternative to the widely used Weibull distribution in lifetime data analysis and reliability engineering.

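The generalised exponential distribution has the distribution, density, and survival functions, respectively, as

$$F(t; \alpha, \lambda) = \left(1 - e^{-\lambda t}\right)^{\alpha}, \qquad f(t; \alpha, \lambda) = \alpha \lambda \left(1 - e^{-\lambda t}\right)^{\alpha - 1} e^{-\lambda t}, \qquad S(t; \alpha, \lambda) = 1 - \left(1 - e^{-\lambda t}\right)^{\alpha},$$

for $t > 0$, where $\alpha > 0$ is the shape parameter and $\lambda > 0$ the scale parameter (written here in the notation of [1]). Let the distribution with the shape parameter $\alpha$ and the scale parameter $\lambda$ be denoted by $GE(\alpha, \lambda)$. According to [1], the two-parameter $GE(\alpha, \lambda)$ can be used quite effectively in analysing many lifetime data and can assume the place of the two-parameter gamma and two-parameter Weibull distributions. The two-parameter $GE(\alpha, \lambda)$ can have increasing and decreasing failure rates depending on the shape parameter.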
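The three functions and the behaviour of the failure rate can be checked numerically. The following is a minimal sketch, assuming the parameterisation of [1] with $\alpha$ as the shape and $\lambda$ as the scale parameter; the function names are illustrative and not taken from the paper.

```python
import math

def ge_cdf(t, alpha, lam):
    """Distribution function F(t) = (1 - exp(-lam*t))**alpha for t > 0."""
    return (1.0 - math.exp(-lam * t)) ** alpha if t > 0 else 0.0

def ge_pdf(t, alpha, lam):
    """Density f(t) = alpha*lam*(1 - exp(-lam*t))**(alpha - 1)*exp(-lam*t)."""
    if t <= 0:
        return 0.0
    e = math.exp(-lam * t)
    return alpha * lam * (1.0 - e) ** (alpha - 1.0) * e

def ge_survival(t, alpha, lam):
    """Survival function S(t) = 1 - F(t)."""
    return 1.0 - ge_cdf(t, alpha, lam)

def ge_hazard(t, alpha, lam):
    """Failure (hazard) rate h(t) = f(t) / S(t)."""
    return ge_pdf(t, alpha, lam) / ge_survival(t, alpha, lam)

# Failure rate at two time points for a shape above and a shape below 1.
h_increasing = (ge_hazard(0.5, 2.0, 1.0), ge_hazard(2.0, 2.0, 1.0))
h_decreasing = (ge_hazard(0.5, 0.5, 1.0), ge_hazard(2.0, 0.5, 1.0))
```

With a shape parameter of 1 the distribution reduces to the exponential distribution, so the failure rate is constant and equal to $\lambda$.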

Studies that involve time-to-event or survival data focus on measuring the time to an outcome event. The event could be death or the occurrence of a clinical endpoint such as disease or the attainment of a biochemical marker [2]. A special source of difficulty in the analysis of time-to-event data is the possibility that some individuals or units may not be observed for the full time to failure. In some circumstances, individuals or units do not fail but are lost to follow-up during the observation period. Instead of knowing the failure time $T$, all we know about these individuals is that their time to failure exceeds some value, say $C$, where $C$ is the follow-up time of these individuals in the study; such observations are referred to as censored.

Under random or noninformative censoring, a sample of, say, $n$ elements is followed for some period of time. An instance of this type of censoring occurs when the termination date for a medical trial is not fixed before the study starts but is rather chosen later, where the choice is influenced by the results of the study up to the termination time. In a straightforward overview of this scheme, which can be considered as time censoring, each element $i$ has a maximum inspection time of, say, $C_i$, $i = 1, \ldots, n$, which may vary from one element to another. Consider an experiment that starts with 50 cancer patients under observation and is terminated after a certain amount of time, irrespective of the number of patients that have died or survived by then. Patients may be censored because of withdrawal, an inadequate monitoring mechanism, or death from causes not related to the purpose of the study.

Maximum likelihood estimation (MLE) is very popular both in the literature and in practice. Several studies have compared MLE and the Bayesian approach in estimating the two parameters of the generalised exponential distribution using hybrid and complete failure time data. Among them is [3], which studied Bayesian estimation for the generalized exponential distribution. Reference [4] considered the generalized exponential distribution under different methods of estimation. Related estimation procedures have been studied for the Weibull distribution: reference [5] determined the Bayes estimates of the reliability function and the hazard rate of the Weibull failure time distribution by employing the squared error loss function; reference [6] studied Bayesian parameter and reliability estimates of the Weibull failure time distribution; and reference [7] studied approximate Bayesian estimates for the Weibull reliability function and hazard rate from censored data by employing a new method that has the potential of reducing the number of terms in the Lindley procedure. See also [8, 9]. Reference [10] studied Bayes estimators of modified Weibull distribution parameters using Lindley's approximation.

The main aim of this paper is to compare the classical maximum likelihood estimator to the proposed Bayesian estimators with two loss functions for the unknown parameters of the generalised exponential distribution for different sample sizes and parameter values.

2. Maximum Likelihood Estimation

Let $T_1, T_2, \ldots, T_n$ be the set of random lifetimes from the generalised exponential distribution with $\lambda$ and $\alpha$ as the parameters, where $\lambda$ is the scale parameter and $\alpha$ the shape parameter.

In random censoring as stated by [11], we assume $X_i = \min(T_i, C_i)$, with $\delta_i = 1$ if $T_i \le C_i$ and $\delta_i = 0$ if $T_i > C_i$. The observed data from $n$ individuals are assumed to consist of the pairs $(x_i, \delta_i)$, $i = 1, \ldots, n$, so that the final result obtained will be the same provided $(x_i, \delta_i)$ is available for all $i$.

It is therefore assumed that $T_i$ and $C_i$ are independent of each other, which implies that the censoring time $C_i$ is noninformative in analysing the failure time $T_i$. For this assumption to be valid, one has to ensure that the loss to follow-up of individuals is not a result of the failure process under study. The likelihood function with respect to randomly censored data is

$$L(\alpha, \lambda) = \prod_{i=1}^{n} f(x_i; \alpha, \lambda)^{\delta_i}\, S(x_i; \alpha, \lambda)^{1 - \delta_i},$$

where $f(\cdot)$ is the density function and $S(\cdot)$ is the survival function. Calculation of the maximum likelihood estimator generally requires an iterative procedure (e.g., Newton-Raphson) to obtain the parameter estimates; such procedures are available in most statistical software.
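To make the censored-data likelihood concrete, the sketch below evaluates its logarithm for the generalised exponential distribution and maximises it numerically. It is only an illustration under stated assumptions: the shape is written $\alpha$ and the scale $\lambda$ as in [1], a simple derivative-free compass search on the log-parameters stands in for the Newton-Raphson iteration mentioned above, and the simulated data, starting values, and function names are hypothetical.

```python
import math
import random

def ge_loglik(alpha, lam, x, delta):
    """Censored-data log-likelihood: events add log f(x), censored times add log S(x)."""
    if alpha <= 0.0 or lam <= 0.0:
        return -math.inf
    total = 0.0
    for xi, di in zip(x, delta):
        g = -math.expm1(-lam * xi)          # 1 - exp(-lam * xi)
        surv = 1.0 - g ** alpha             # S(xi)
        if g <= 0.0 or surv <= 0.0:
            return -math.inf                # outside the numerically usable range
        if di:
            total += (math.log(alpha) + math.log(lam)
                      + (alpha - 1.0) * math.log(g) - lam * xi)
        else:
            total += math.log(surv)
    return total

def fit_ge(x, delta, start=(1.0, 1.0), step=0.5, tol=1e-6):
    """Compass search on (log alpha, log lam); a stand-in for Newton-Raphson."""
    a, b = math.log(start[0]), math.log(start[1])
    best = ge_loglik(math.exp(a), math.exp(b), x, delta)
    while step > tol:
        improved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            val = ge_loglik(math.exp(a + da), math.exp(b + db), x, delta)
            if val > best:
                a, b, best, improved = a + da, b + db, val, True
        if not improved:
            step /= 2.0                     # refine the search once no move helps
    return math.exp(a), math.exp(b), best

# Hypothetical data: 200 lifetimes with uniform censoring times on (0, 5).
random.seed(1)
true_alpha, true_lam = 1.2, 1.0
t = [-math.log(1.0 - random.random() ** (1.0 / true_alpha)) / true_lam for _ in range(200)]
c = [random.uniform(0.0, 5.0) for _ in range(200)]
x = [min(ti, ci) for ti, ci in zip(t, c)]
delta = [1 if ti <= ci else 0 for ti, ci in zip(t, c)]
alpha_hat, lam_hat, loglik_hat = fit_ge(x, delta)
```

Working on the log scale keeps both parameters positive without explicit constraints; a gradient-based optimiser would normally converge faster.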

3. Bayesian Inference

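In this section we consider the Bayes estimation of the two unknown parameters. Since both parameters are assumed to be greater than zero, we let both take on the following gamma prior distributions:

$$\pi_1(\alpha) \propto \alpha^{a - 1} e^{-b\alpha}, \qquad \pi_2(\lambda) \propto \lambda^{c - 1} e^{-d\lambda}, \qquad \alpha, \lambda > 0.$$

Assume that the hyperparameters $a$, $b$, $c$, and $d$ are known and greater than 0. The joint density function of the data, $\alpha$, and $\lambda$ can be obtained as

$$f(\mathbf{x}, \alpha, \lambda) \propto L(\alpha, \lambda)\, \pi_1(\alpha)\, \pi_2(\lambda),$$

where $L(\alpha, \lambda)$ is the likelihood function. Bayesian inference is based on the posterior distribution, which is given as

$$\pi(\alpha, \lambda \mid \mathbf{x}) = \frac{L(\alpha, \lambda)\, \pi_1(\alpha)\, \pi_2(\lambda)}{\int_0^\infty \int_0^\infty L(\alpha, \lambda)\, \pi_1(\alpha)\, \pi_2(\lambda)\, d\alpha\, d\lambda}. \tag{5}$$

Posterior expectations based on (5) involve a ratio of two integrals that cannot be obtained in closed form. Numerical integration can be applied, but it may be computationally intensive, especially in a high-dimensional parameter space. It is also possible to use numerical approximation methods such as those of [12] and/or [9]. In this paper we consider both methods for this censoring scheme and this distribution, since we are unaware of any study that employs both for this distribution with this type of censored data, apart from the use of the former by [3] with uncensored data. The approach is considered under two loss functions, namely, LINEX and squared error loss.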
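For two parameters, the brute-force numerical integration mentioned above is still feasible and gives a reference point for the approximations that follow. The sketch below computes posterior means on a truncated grid with the midpoint rule. It assumes independent gamma priors with hypothetical hyperparameters (all set to 1), shape $\alpha$ and scale $\lambda$ as in [1], a small made-up data set, and an arbitrary truncation of both parameters at 8; none of these choices come from the paper.

```python
import math

def ge_loglik(alpha, lam, x, delta):
    """Censored-data log-likelihood of the generalised exponential distribution."""
    total = 0.0
    for xi, di in zip(x, delta):
        g = -math.expm1(-lam * xi)                    # 1 - exp(-lam * xi)
        if di:
            total += math.log(alpha * lam) + (alpha - 1.0) * math.log(g) - lam * xi
        else:
            total += math.log(1.0 - g ** alpha)       # log S(xi)
    return total

def log_gamma_kernel(v, shape, rate):
    """Log of the unnormalised Gamma(shape, rate) density."""
    return (shape - 1.0) * math.log(v) - rate * v

def posterior_means_grid(x, delta, a, b, c, d, upper=8.0, m=80):
    """Posterior means of (alpha, lam) by the midpoint rule on (0, upper)^2."""
    h = upper / m
    grid = [(i + 0.5) * h for i in range(m)]
    logp = [[ge_loglik(al, la, x, delta)
             + log_gamma_kernel(al, a, b) + log_gamma_kernel(la, c, d)
             for la in grid] for al in grid]
    top = max(max(row) for row in logp)               # guards against underflow
    norm = sum_a = sum_l = 0.0
    for i, al in enumerate(grid):
        for j, la in enumerate(grid):
            w = math.exp(logp[i][j] - top)
            norm += w
            sum_a += al * w
            sum_l += la * w
    return sum_a / norm, sum_l / norm

# Hypothetical observations (delta = 0 marks a censored time).
x = [0.3, 0.5, 0.7, 0.9, 1.1, 1.4, 1.6, 2.0, 2.2, 2.5]
delta = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]
mean_alpha, mean_lam = posterior_means_grid(x, delta, 1.0, 1.0, 1.0, 1.0)
```

The cost grows with the number of grid points raised to the number of parameters, which is why the approximations of [12] and [9] are attractive in higher dimensions.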

3.1. Lindley Approximation

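Reference [12] proposed a procedure to approximate a ratio of integrals. This approach has been used by several authors, such as [3, 6], to obtain approximate Bayes estimators. Reference [12] gives an approximation for ratios of integrals of the form

$$\frac{\int w(\theta)\, e^{\ell(\theta)}\, d\theta}{\int v(\theta)\, e^{\ell(\theta)}\, d\theta},$$

where $\theta = (\theta_1, \ldots, \theta_m)$, $\ell(\theta)$ is the logarithm of the likelihood function, and $w(\theta)$ and $v(\theta)$ are arbitrary functions of $\theta$. Assume that $v(\theta)$ is the prior distribution for $\theta$ and $w(\theta) = u(\theta)\, v(\theta)$, with $u(\theta)$ being some function of interest. The posterior expectation of $u(\theta)$ is given as

$$E[u(\theta) \mid \mathbf{x}] = \frac{\int u(\theta)\, e^{\ell(\theta) + \rho(\theta)}\, d\theta}{\int e^{\ell(\theta) + \rho(\theta)}\, d\theta},$$

where $\rho(\theta) = \log v(\theta)$. An outline of the procedure can be obtained from [12] and a recent paper by [13]. Asymptotically, this posterior expectation can be approximated by

$$\hat{u} = u + \frac{1}{2} \sum_{i} \sum_{j} \left(u_{ij} + 2 u_i \rho_j\right) \sigma_{ij} + \frac{1}{2} \sum_{i} \sum_{j} \sum_{k} \sum_{l} \ell_{ijk}\, \sigma_{ij}\, \sigma_{kl}\, u_l,$$

with every term evaluated at the maximum likelihood estimate, where subscripts denote partial derivatives with respect to the corresponding components of $\theta$ and $\sigma_{ij}$ is the $(i, j)$ element of the inverse of the matrix with elements $-\ell_{ij}$. Under the squared error loss function the Bayes estimator is the posterior mean, so $u$ is taken to be the parameter of interest, first the scale and then the shape parameter; its first derivative with respect to that parameter is then 1 and its second derivative is 0. Refer to the Appendix for the derivatives of the log-likelihood with respect to the shape and scale parameters.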
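The mechanics of the Lindley expansion are easiest to see with a single parameter, where it reduces to $u + \tfrac{1}{2}(u'' + 2u'\rho')\sigma^2 + \tfrac{1}{2}\ell''' u' \sigma^4$ evaluated at the maximum likelihood estimate. The sketch below applies this to a randomly censored exponential sample with a gamma prior, a deliberately simpler model than the generalised exponential one, chosen because the exact posterior mean is available for comparison. The event count, total follow-up time, and hyperparameters are hypothetical.

```python
def lindley_mean_exponential(m, total_time, a, b):
    """One-parameter Lindley approximation of the posterior mean of a rate theta.

    Log-likelihood (m events, total follow-up time S): m*log(theta) - theta*S.
    Prior: Gamma(shape=a, rate=b), so rho(theta) = (a - 1)*log(theta) - b*theta.
    """
    theta = m / total_time              # maximum likelihood estimate
    sigma2 = theta ** 2 / m             # -1 / (second derivative of log-likelihood)
    l3 = 2.0 * m / theta ** 3           # third derivative of log-likelihood
    rho1 = (a - 1.0) / theta - b        # first derivative of log prior
    u1, u2 = 1.0, 0.0                   # u(theta) = theta under squared error loss
    return theta + 0.5 * (u2 + 2.0 * u1 * rho1) * sigma2 + 0.5 * l3 * u1 * sigma2 ** 2

def exact_mean_exponential(m, total_time, a, b):
    """Exact posterior mean: the posterior is Gamma(a + m, b + total_time)."""
    return (a + m) / (b + total_time)

# Hypothetical sample: 20 observed events over a total follow-up time of 10.
approx = lindley_mean_exponential(20, 10.0, 3.0, 1.0)
exact = exact_mean_exponential(20, 10.0, 3.0, 1.0)
```

In this example the approximation and the exact posterior mean differ by less than 0.01, and the gap shrinks as the number of events grows.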

3.2. Linear Exponential Loss Function with Lindley Procedure

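Unlike the symmetric squared error loss function, the linear exponential (LINEX) loss function is asymmetric: it penalises overestimation and underestimation of the parameter differently, with the sign and size of its loss parameter determining which error is penalised more heavily.

The Bayes estimator of, say, $\theta$, which is denoted by $\hat{\theta}_{BL}$ under the LINEX loss function with loss parameter $c \neq 0$, is

$$\hat{\theta}_{BL} = -\frac{1}{c} \ln E_{\theta}\left[e^{-c\theta}\right],$$

provided that $E_{\theta}\left[e^{-c\theta}\right]$ exists and is finite.

For the scale parameter $\lambda$, the posterior expectation of the function $u = e^{-c\lambda}$ required under LINEX is given as

$$E\left[e^{-c\lambda} \mid \mathbf{x}\right] = \frac{\int\!\!\int e^{-c\lambda}\, e^{\ell(\alpha, \lambda) + \rho(\alpha, \lambda)}\, d\alpha\, d\lambda}{\int\!\!\int e^{\ell(\alpha, \lambda) + \rho(\alpha, \lambda)}\, d\alpha\, d\lambda},$$

where $\ell$ is the log-likelihood, $\rho$ is the logarithm of the prior, $\partial u / \partial \lambda = -c\, e^{-c\lambda}$, and $\partial^2 u / \partial \lambda^2 = c^2 e^{-c\lambda}$. The corresponding expressions for the shape parameter $\alpha$ follow with $u = e^{-c\alpha}$.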
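The effect of the sign of the loss parameter can be illustrated on a posterior for which $E[e^{-c\theta}]$ has a closed form. The sketch below uses a Gamma(shape $k$, rate $r$) posterior, for which the LINEX estimator $-\tfrac{1}{c} \ln E[e^{-c\theta}]$ equals $(k/c)\ln(1 + c/r)$ whenever $c > -r$, and compares it with a midpoint-rule evaluation that works for any unnormalised log-density. The gamma posterior, the values $k = 10$ and $r = 2$, and the loss parameters $\pm 0.7$ are hypothetical choices for illustration.

```python
import math

def linex_estimate_gamma(k, r, c):
    """Closed-form LINEX estimate for theta ~ Gamma(shape=k, rate=r), c > -r."""
    return (k / c) * math.log1p(c / r)

def linex_estimate_numeric(log_density, c, lo, hi, n=20000):
    """LINEX estimate -(1/c) * ln E[exp(-c*theta)] by the midpoint rule on (lo, hi)."""
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n):
        th = lo + (i + 0.5) * h
        w = math.exp(log_density(th))       # unnormalised posterior density
        num += math.exp(-c * th) * w
        den += w
    return -math.log(num / den) / c

k, r = 10.0, 2.0                            # hypothetical posterior; mean k / r = 5

def log_post(th):
    return (k - 1.0) * math.log(th) - r * th

est_pos = linex_estimate_gamma(k, r, 0.7)   # positive loss parameter
est_neg = linex_estimate_gamma(k, r, -0.7)  # negative loss parameter
num_pos = linex_estimate_numeric(log_post, 0.7, 0.0, 40.0)
```

A positive loss parameter pulls the estimate below the posterior mean and a negative one pushes it above, which is how the LINEX loss encodes a preference against overestimation or underestimation.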

3.3. Tierney and Kadane

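As seen in Section 3.1, the Lindley approach requires evaluating the third derivatives of the log-likelihood function. Depending on the distribution and the number of parameters involved, this can be very difficult to do. Tierney and Kadane [9], through a Laplace approximation procedure, gave an alternative to the Lindley approach that requires only the first and second derivatives. Let $L(\theta)$ be the likelihood function of $\theta$ based on $n$ observations, with $\ell(\theta) = \ln L(\theta)$; let $\pi(\theta)$ represent the prior distribution defined over the parameter space and $\pi(\theta \mid \mathbf{x})$ the posterior distribution of $\theta$, and let a loss function be specified. The Bayes estimate of a function $u(\theta)$ under the squared error loss function is the posterior mean and is given as

$$E[u(\theta) \mid \mathbf{x}] = \frac{\int e^{n g^{*}(\theta)}\, d\theta}{\int e^{n g(\theta)}\, d\theta}, \tag{13}$$

with

$$g(\theta) = \frac{1}{n}\left[\ln \pi(\theta) + \ell(\theta)\right], \qquad g^{*}(\theta) = g(\theta) + \frac{1}{n} \ln u(\theta).$$

Equation (13) can be approximated in the form

$$\hat{E}[u(\theta) \mid \mathbf{x}] = \left(\frac{\det \Sigma^{*}}{\det \Sigma}\right)^{1/2} \exp\left\{ n \left[ g^{*}(\hat{\theta}^{*}) - g(\hat{\theta}) \right] \right\},$$

where $\hat{\theta}^{*}$ and $\hat{\theta}$ maximize $g^{*}$ and $g$, respectively, and $\Sigma^{*}$ and $\Sigma$ are the negatives of the inverse Hessians of $g^{*}$ and $g$ at $\hat{\theta}^{*}$ and $\hat{\theta}$, respectively.

With $\theta = (\alpha, \lambda)$, the matrix $\Sigma$ takes the form

$$\Sigma = \begin{pmatrix} -\dfrac{\partial^2 g}{\partial \alpha^2} & -\dfrac{\partial^2 g}{\partial \alpha\, \partial \lambda} \\[2ex] -\dfrac{\partial^2 g}{\partial \lambda\, \partial \alpha} & -\dfrac{\partial^2 g}{\partial \lambda^2} \end{pmatrix}^{-1},$$

evaluated at $\hat{\theta}$. We can similarly obtain the expression for the matrix $\Sigma^{*}$, which involves the partial derivatives of $g^{*}$. In applying the method, the functions $g$ and $g^{*}$ need to be maximised. Setting their first partial derivatives with respect to $\alpha$ and $\lambda$ to zero produces, in each case, a system of two equations that is solved numerically. Refer to the Appendix for the required derivatives.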

3.4. Linear Exponential Loss Function with Tierney and Kadane Procedure

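The Bayesian estimator under LINEX with respect to the Tierney and Kadane procedure is obtained by taking $u = e^{-c\lambda}$ for the scale parameter (or $u = e^{-c\alpha}$ for the shape parameter), where $c$ is the loss parameter. This gives

$$\hat{E}\left[e^{-c\lambda} \mid \mathbf{x}\right] = \left(\frac{\det \Sigma^{*}}{\det \Sigma}\right)^{1/2} \exp\left\{ n \left[ g^{*}(\hat{\theta}^{*}) - g(\hat{\theta}) \right] \right\},$$

where $g(\theta)$ is the logarithm of the unnormalised posterior divided by the sample size $n$, $g^{*}(\theta) = g(\theta) - c\lambda / n$, $\hat{\theta}$ and $\hat{\theta}^{*}$ maximize $g$ and $g^{*}$, and $\Sigma$ and $\Sigma^{*}$ are the negatives of the corresponding inverse Hessians. The Bayes estimate is then $-\frac{1}{c} \ln \hat{E}\left[e^{-c\lambda} \mid \mathbf{x}\right]$. The same approach is also adopted with the squared error loss function to obtain the Bayes estimates of the unknown parameters.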

4. Data Source

4.1. Data 1

The data for this example are survival times of patients with cervical cancer recruited to a randomised trial that assessed the effect of adding a radiosensitiser to radiotherapy (new therapy, “treatment B”) compared with radiotherapy alone (control, “treatment A”). Treatment A was given to 16 patients and treatment B to 14 patients. The data are in days since the start of the study; the event of interest is death caused by this cancer. Our main interest is in the patients under treatment A, a fairly small sample that serves to illustrate the proposed methods. The data are obtained from [14]. Starred observations are censored: 90, , 142, 1037, 150, , 269, , 291, 1153, , 1297, 680, 1429, 837, . The results are presented in Table 3.

4.2. Data 2

The following data which are considered large are obtained from [11]. The data represent survival times for 121 breast cancer patients who were treated over the period 1929–1938. Times are in months and asterisks denote censoring times: 0.3, , , 5.0, 5.6, 6.2, 6.3, 6.6, 6.8, , 7.5, 8.4, 8.4, 10.3, 11.0, 11.8, 12.2, 12.3, 13.5, 14.4, 14.4, 14.8, , 15.7, 16.2, 16.3, 16.5, 16.8, 17.2, 17.3, 17.5, 17.9, 19.8, 20.4, 20.9, 21.0, 21.0, 21.1, 23.0, , 23.6, 24.0, 24.0, 27.9, 28.2, 29.1, 30, 31, 31, 32, 35, 35, , , , 38, , , , , 40, , , 41, 41, , 42, , , , 44, , , , , , 48, , 51, 51, , 52, 54, , 56, , , , 60, , , , , , , , , , , 78, 80, , , 89, 90, , , , , , , , , , , 126, , , , , .

5. Simulation Study

Since it is difficult to compare the performance of the proposed methods theoretically, we performed an extensive simulation to compare the estimators through their mean squared errors and absolute biases, using different sample sizes and parameter values. We considered sample sizes of $n$ = 25, 50, and 100. The following steps were employed to generate the data. Generating a generalised exponential random variable is simple, as stated in [4]: if $U$ follows a uniform distribution on the interval $(0, 1)$, then $X = -\ln\left(1 - U^{1/\alpha}\right)/\lambda$ follows $GE(\alpha, \lambda)$, where $\alpha$ is the shape and $\lambda$ the scale parameter. Consequently, with a good uniform random number generator, the generation of a generalised exponential random deviate is immediate.
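The inverse-transform step described above can be sketched as follows, writing the shape as $\alpha$ and the scale as $\lambda$ in the notation of [1]. The seed, the sample size of 5000, and the function names are arbitrary choices for illustration.

```python
import math
import random

def ge_quantile(u, alpha, lam):
    """Inverse of F(t) = (1 - exp(-lam*t))**alpha evaluated at u in [0, 1)."""
    return -math.log(1.0 - u ** (1.0 / alpha)) / lam

def ge_sample(n, alpha, lam, rng):
    """Draw n generalised exponential deviates by inverse-transform sampling."""
    return [ge_quantile(rng.random(), alpha, lam) for _ in range(n)]

rng = random.Random(2014)                     # fixed seed for reproducibility
draws = ge_sample(5000, 2.0, 1.0, rng)

# Compare the empirical and theoretical distribution functions at t = 1.
empirical = sum(d <= 1.0 for d in draws) / len(draws)
theoretical = (1.0 - math.exp(-1.0)) ** 2.0
```

Applying the distribution function to `ge_quantile(u, alpha, lam)` returns `u`, which is the property that makes the method exact.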

Lifetimes representing failure times are generated from the generalised exponential distribution for each of the sample sizes indicated above. The values of the assumed true shape parameter were taken to be 0.8, 1.2, and 2.0. The scale parameter was held at a single fixed value throughout, without loss of generality. A sample of the same size is generated from a uniform distribution for the censoring times, where the range of the uniform distribution depends solely on the proportion of observations that are censored. In our study, the percentage of censoring was 25. The observed time is taken as the minimum of the failure time and the censoring time. To compute the Bayes estimates, the two parameters are assumed to take gamma priors. We set all the hyperparameters to 0, which makes the priors noninformative. Two values of the loss parameter, one positive and one negative, were used for the LINEX loss function. The simulation was replicated 1000 times. The mean squared errors and the absolute biases were computed and are presented for the purpose of comparison.
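The censoring scheme described above can be sketched as follows. The paper does not state in this copy how the range of the uniform censoring distribution was chosen, so the sketch makes an assumption: censoring times are Uniform$(0, c_{\max})$, and $c_{\max}$ is calibrated by bisection so that the probability of censoring, which is the average of the survival function over $(0, c_{\max})$, equals the 25% target. The shape is written $\alpha$ and the scale $\lambda$ as in [1]; the scale value of 1 and all function names are hypothetical.

```python
import math
import random

def ge_survival(t, alpha, lam):
    """Survival function S(t) = 1 - (1 - exp(-lam*t))**alpha."""
    return 1.0 - (1.0 - math.exp(-lam * t)) ** alpha

def censoring_probability(alpha, lam, c_max, m=2000):
    """P(C < T) for C ~ Uniform(0, c_max): the average of S over (0, c_max)."""
    h = c_max / m
    return sum(ge_survival((i + 0.5) * h, alpha, lam) for i in range(m)) / m

def calibrate_c_max(alpha, lam, target, lo=1e-3, hi=1e3, iters=60):
    """Geometric bisection: a larger c_max means less censoring."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if censoring_probability(alpha, lam, mid) > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def simulate_censored_ge(n, alpha, lam, c_max, rng):
    """Return observed times x = min(t, c) and indicators delta = 1 if t <= c."""
    x, delta = [], []
    for _ in range(n):
        t = -math.log(1.0 - rng.random() ** (1.0 / alpha)) / lam
        c = rng.uniform(0.0, c_max)
        x.append(min(t, c))
        delta.append(1 if t <= c else 0)
    return x, delta

rng = random.Random(7)                         # fixed seed for reproducibility
c_max = calibrate_c_max(1.2, 1.0, 0.25)        # shape 1.2, hypothetical scale 1
x, delta = simulate_censored_ge(100, 1.2, 1.0, c_max, rng)
```

In a full study this generation step would be repeated for every replication, sample size, and shape value, with the estimators applied to each simulated sample.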

6. Results and Discussion

The main objective of this study is to obtain the estimates of the generalised exponential distribution parameters and compare the proposed methods applied in this paper. In order to examine the estimates of the parameters which cannot be obtained analytically, we made use of different numerical approximation procedures and have obtained absolute biases and mean squared errors of the estimated parameters.
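The two comparison criteria can be computed from the replicated estimates as in the sketch below. The paper's exact definition of absolute bias is not visible in this copy; the sketch assumes it is the absolute difference between the average estimate and the true value, and the three example estimates are made up.

```python
def mse_and_abs_bias(estimates, true_value):
    """Mean squared error and absolute bias of replicated estimates.

    Assumption: absolute bias = |mean(estimates) - true_value|.
    """
    n = len(estimates)
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    abs_bias = abs(sum(estimates) / n - true_value)
    return mse, abs_bias

# Hypothetical estimates of a parameter whose true value is 1.0.
mse, abs_bias = mse_and_abs_bias([0.9, 1.1, 1.3], 1.0)
```

In the simulation study the list would hold the 1000 replicated estimates produced by one estimator for one combination of sample size and parameter value.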

From Table 1 and Figures 1 and 2, it is evident that the smallest mean squared errors vis-à-vis the absolute biases for the estimated scale parameter occurred under the Bayesian estimator with the linear exponential loss function. The loss parameter that gave the smallest mean squared errors is above zero, which implies that this approach is preferred if overestimation is more serious than underestimation. This occurred largely with the Lindley numerical approximation procedure, followed by that of Tierney and Kadane. As the sample size increases, the mean squared errors of all the estimators decrease. It should also be mentioned that the Lindley approximation under the squared error loss function performed better than the Tierney and Kadane approximation with respect to the scale parameter of the generalised exponential distribution. As illustrated clearly in Figure 2, both Tierney and Kadane estimators had equal minimum absolute biases.

Table 1: Average mean squared errors and absolute biases for the estimated scale parameter.
Figure 1: Mean squared errors for the scale parameter. ML: maximum likelihood, SL: squared error loss under Lindley approximation, LL: LINEX loss under Lindley, STK: squared error under Tierney and Kadane method, and LTK: LINEX loss under Tierney and Kadane method.
Figure 2: Absolute bias for the scale parameter. ML: maximum likelihood, SL: squared error loss under Lindley approximation, LL: LINEX loss under Lindley, STK: squared error under Tierney and Kadane method, and LTK: LINEX loss under Tierney and Kadane method.

Considering Table 2 alongside Figures 3 and 4, which contain the mean squared errors and the absolute biases of the estimated shape parameter, we noticed that the Bayesian estimator under the Tierney and Kadane method performed better than the Lindley approach, but the maximum likelihood estimator overall had the smallest mean squared error, followed by Tierney and Kadane. The minimum absolute bias occurred predominantly with the Tierney and Kadane approach. The bold numbers indicate the smallest mean squared errors and minimum absolute biases of the estimated parameters with their corresponding estimators. Under the squared error loss function, however, the Bayes estimator with the Lindley approximation performed better for the shape parameter of the generalised exponential distribution than that with the Tierney and Kadane method, to a very large extent. All the estimators' mean squared errors got closer as the sample size increased.

Table 2: Average mean squared errors and absolute biases for the estimated shape parameter.
Table 3: Standard errors for the estimated scale and shape parameters with $n$ = 16.
Figure 3: Mean squared errors for the shape parameter. ML: maximum likelihood, SL: squared error loss under Lindley approximation, LL: LINEX loss under Lindley, STK: squared error under Tierney and Kadane method, and LTK: LINEX loss under Tierney and Kadane method.
Figure 4: Absolute bias for the shape parameter. ML: maximum likelihood, SL: squared error loss under Lindley approximation, LL: LINEX loss under Lindley, STK: squared error under Tierney and Kadane method, and LTK: LINEX loss under Tierney and Kadane method.

The Bayesian estimator under the linear exponential loss function with the positive loss parameter has the smallest standard error, as illustrated in Table 3. This happened with the approximation procedure suggested by Tierney and Kadane; it implies that the linear exponential loss function overestimates the scale and shape parameters of the generalised exponential distribution. From this example, where the sample size is fairly small, we noticed that the Bayes estimator under squared error loss with the Tierney and Kadane approach performs somewhat better than both the Lindley method and the maximum likelihood estimator.

Using the iterative procedure suggested in this paper for both the MLEs and the Bayes estimators with respect to data 2, the MLEs of the two parameters are 0.765027 and 6.277847, with corresponding standard errors of 0.006323 and 0.010377. Since we do not have any prior information on the hyperparameters, we set them all to zero, which makes the priors on both parameters noninformative. For computing the Bayes estimators, we considered the squared error and linear exponential loss functions and gamma priors on both parameters, the same approach as in the simulation section. The Bayes estimates via the Lindley approximation procedure under squared error loss are 0.765027 and 6.277847, with standard errors of 0.006325 and 0.010376, respectively.

Under the linear exponential loss function with the first value of the loss parameter, the Lindley-based Bayes estimates of the two parameters are 0.765029 and 6.277860, with standard errors of 0.006323 and 0.010377. With the second value of the loss parameter, they are 0.765025 and 6.277840, with standard errors of 0.006323 and 0.010376, respectively. The 95% confidence intervals based on the MLEs are (0.752634, 0.777419) and (6.257508, 6.298185). The Bayes credible intervals under the squared error loss function are (0.752634, 0.777419) and (6.257508, 6.298185), respectively. The Bayes credible intervals under the LINEX loss function are (0.752637, 0.777421) and (6.257521, 6.298198) for the first value of the loss parameter and (0.752633, 0.777417) and (6.257501, 6.298178) for the second.
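The reported 95% confidence intervals appear to be consistent with the usual normal-approximation form, estimate ± 1.96 × standard error. The sketch below reproduces the arithmetic from the reported MLEs and standard errors; the interval form is an inference from the numbers rather than something the text states, and the function name is illustrative.

```python
def wald_interval(estimate, std_error, z=1.96):
    """Normal-approximation interval: estimate +/- z * standard error."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

# Reported MLEs and standard errors for data 2.
first_interval = wald_interval(0.765027, 0.006323)
second_interval = wald_interval(6.277847, 0.010377)
```

Both intervals agree with the reported limits to within about one unit in the sixth decimal place, the size of difference expected from rounding of the standard errors.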

Using the Tierney and Kadane (T & K) approximation procedure under the squared error loss function, the Bayes estimates of the two parameters are 0.764725 and 6.275374, with standard errors of 0.006320 and 0.010373. Under the linear exponential loss function with the first value of the loss parameter, the T & K estimates are 0.765633 and 6.282807, with standard errors of 0.006328 and 0.010385. With the second value of the loss parameter, they are 0.763671 and 6.277809, with standard errors of 0.006311 and 0.010358, respectively. The Bayes credible intervals using T & K under the squared error loss function are (0.752338, 0.777113) and (6.255044, 6.295704). The Bayes credible intervals under the LINEX loss function are (0.753231, 0.778035) and (6.262453, 6.303161) for the first value of the loss parameter and (0.751301, 0.776041) and (6.246441, 6.287045) for the second.

As shown above, the estimator with the smallest standard error is the Bayes estimator under the linear exponential loss function, for both the scale and shape parameters. This happened under the Tierney and Kadane numerical approximation procedure. It is followed by the Bayes estimator using the squared error loss function, again with the Tierney and Kadane method. We observed that the linear exponential loss function gave the narrowest credible intervals with the Tierney and Kadane approach, compared with the credible intervals of Bayes using Lindley and the confidence intervals obtained from the maximum likelihood estimator. This happened with a negative loss parameter, an indication of underestimation of the generalised exponential distribution parameters.

7. Conclusion

From the results and discussion above, it is evident that the Bayesian estimator under the linear exponential loss function performed better than the Bayes estimator under the squared error loss function and the maximum likelihood estimator in estimating both the scale and shape parameters, in terms of both MSE and absolute bias. The Lindley method performed better than T & K for the scale parameter with regard to mean squared errors, while T & K performed better for the shape parameter in terms of both the mean squared errors and the absolute bias. Considering the standard errors obtained for the real data analysis, we can state that the T & K method outperformed the Lindley numerical approximation and the maximum likelihood estimator.

Appendix

Let the following assumptions hold under the Lindley approach:

Note that and are the second and third derivatives for the scale parameter of the log-likelihood function while and are the derivatives of the shape parameter.

Using (20) from the Tierney and Kadane numerical approach, the following derivatives are obtained: By employing (19), the following derivatives are obtained:

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

  1. R. D. Gupta and D. Kundu, “Generalized exponential distributions,” Australian & New Zealand Journal of Statistics, vol. 41, no. 2, pp. 173–188, 1999.
  2. S. Prinja, N. Gupta, and R. Verma, “Censoring in clinical trials: review of survival analysis techniques,” Indian Journal of Community Medicine, vol. 35, no. 2, pp. 217–221, 2010.
  3. D. Kundu and R. D. Gupta, “Generalized exponential distribution: Bayesian estimations,” Computational Statistics & Data Analysis, vol. 52, no. 4, pp. 1873–1883, 2008.
  4. R. D. Gupta and D. Kundu, “Generalized exponential distribution: different method of estimations,” Journal of Statistical Computation and Simulation, vol. 69, no. 4, pp. 315–337, 2001.
  5. S. K. Sinha, “Bayes estimation of the reliability function and hazard rate of a Weibull failure time distribution,” Trabajos de Estadistica, vol. 1, no. 2, pp. 47–56, 1986.
  6. C. B. Guure, N. A. Ibrahim, M. B. Adam, A. M. Al Omari, and S. Bosomprah, “Bayesian parameter and reliability estimate of Weibull failure time distribution,” The Bulletin of the Malaysian Mathematical Sciences Society, Series 2, vol. 37, no. 2, pp. 611–632, 2014.
  7. A. A. Abdel-Wahid and A. Winterbottom, “Approximate Bayesian estimates for the Weibull reliability function and hazard rate from censored data,” Journal of Statistical Planning and Inference, vol. 16, no. 3, pp. 277–283, 1987.
  8. C. B. Guure, N. A. Ibrahim, and M. B. Adam, “Bayesian inference of the Weibull model based on interval-censored survival data,” Computational and Mathematical Methods in Medicine, vol. 2013, Article ID 849520, 10 pages, 2013.
  9. L. Tierney and J. B. Kadane, “Accurate approximations for posterior moments and marginal densities,” Journal of the American Statistical Association, vol. 81, no. 393, pp. 82–86, 1986.
  10. V. Preda, E. Panaitescu, and A. Constantinescu, “Bayes estimators of modified-Weibull distribution parameters using Lindley's approximation,” WSEAS Transactions on Mathematics, vol. 9, no. 7, pp. 539–549, 2010.
  11. J. F. Lawless, Statistical Models and Methods for Lifetime Data, John Wiley & Sons, New York, NY, USA, 2003.
  12. D. V. Lindley, “Approximate Bayesian methods,” Trabajos de Estadistica, vol. 31, pp. 223–237, 1980.
  13. H. A. Howlader and A. M. Hossain, “Bayesian survival estimation of Pareto distribution of the second kind based on failure-censored data,” Computational Statistics and Data Analysis, vol. 38, no. 3, pp. 301–314, 2002.
  14. M. K. B. Parmar and D. Machin, Survival Analysis: A Practical Approach, John Wiley & Sons, Chichester, UK, 1995.