Research Article  Open Access
The Beta-Lindley Distribution: Properties and Applications
Abstract
We introduce a new continuous distribution, the so-called beta-Lindley distribution, that extends the Lindley distribution. We provide a comprehensive mathematical treatment of this distribution. We derive the moment generating function and the rth moment, thus generalizing some results in the literature. Expressions for the density, moment generating function, and rth moment of the order statistics are also obtained. Further, we discuss estimation of the unknown model parameters in both classical and Bayesian setups. The usefulness of the new model is illustrated by means of two real data sets. We hope that the new distribution proposed here will serve as an alternative to other models available in the literature for modelling positive real data in many areas.
1. Introduction
In many applied sciences such as medicine, engineering, and finance, amongst others, modelling and analysing lifetime data are crucial. Several lifetime distributions have been used to model such kinds of data. The quality of the procedures used in a statistical analysis depends heavily on the assumed probability model or distributions. Because of this, considerable effort has been expended in the development of large classes of standard probability distributions along with relevant statistical methodologies. However, there still remain many important problems where the real data does not follow any of the classical or standard probability models.
Several beta-generalized distributions have been discussed in recent years. Eugene et al. [1], Nadarajah and Gupta [2], Nadarajah and Kotz [3], and Nadarajah and Kotz [4] proposed the beta-normal, beta-Fréchet, beta-Gumbel, and beta-exponential distributions, respectively. Jones [5] discusses this general beta family motivated by its order statistics and shows that it has interesting distributional properties and potential for exciting statistical applications.
Recently, Barreto-Souza et al. [6] proposed the beta-generalized exponential distribution, Pescim et al. [7] introduced the beta-generalized half-normal distribution, and Cordeiro et al. [8] defined the beta-generalized Rayleigh distribution with applications to lifetime data.
In this paper, we present a new generalization of the Lindley distribution called the beta-Lindley distribution. The Lindley distribution was originally proposed by Lindley [9] in the context of Bayesian statistics, as a counterexample to fiducial statistics.
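Definition 1. A random variable $X$ is said to have the Lindley distribution with parameter $\theta$ if its probability density is defined as
$$g(x;\theta) = \frac{\theta^{2}}{\theta+1}\,(1+x)\,e^{-\theta x}, \quad x > 0,\ \theta > 0. \tag{1}$$
The corresponding cumulative distribution function (CDF) is
$$G(x;\theta) = 1 - \frac{\theta+1+\theta x}{\theta+1}\,e^{-\theta x}, \quad x > 0,\ \theta > 0. \tag{2}$$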
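As a small numerical companion to Definition 1, the following sketch evaluates the standard Lindley density and CDF. The function names `lindley_pdf` and `lindley_cdf` and the argument name `theta` are illustrative choices made here, not notation taken from the paper.

```python
import math

def lindley_pdf(x, theta):
    """Lindley density: theta^2 / (theta + 1) * (1 + x) * exp(-theta * x) for x > 0."""
    if x < 0:
        return 0.0
    return theta**2 / (theta + 1.0) * (1.0 + x) * math.exp(-theta * x)

def lindley_cdf(x, theta):
    """Lindley CDF: 1 - (theta + 1 + theta * x) / (theta + 1) * exp(-theta * x) for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)
```

A quick sanity check is that a finite-difference derivative of `lindley_cdf` agrees with `lindley_pdf` at any positive point.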
Ghitany et al. [10] discussed various statistical properties of the Lindley distribution and showed its applicability over the exponential distribution. They found that the Lindley distribution performs better than the exponential model. One of the main reasons to prefer the Lindley distribution over the exponential distribution is its time-dependent, increasing hazard rate. Over the last decade, the Lindley distribution has been widely used in different settings by many authors.
The rest of the paper is organized as follows. In Section 2, we introduce the beta-Lindley distribution and demonstrate its flexibility by showing the wide variety of shapes of the density, distribution, and hazard rate functions. The moments and order statistics of the beta-Lindley distribution are derived in Sections 3 and 4, respectively. In Section 5, the maximum likelihood, least squares, and Bayes estimators of the unknown parameters of the beta-Lindley distribution are constructed. To demonstrate the applicability of the proposed distribution, two real data sets are considered in Section 6. A simulation algorithm for generating random samples from the beta-Lindley distribution is also provided in Section 6. The paper is concluded in Section 7.
2. Beta-Lindley Distribution
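Let $G(x)$ denote the cumulative distribution function (CDF) of a random variable $X$. The cumulative distribution function for a generalized class of distributions for the random variable $X$, as defined by Eugene et al. [1], is generated by applying the inverse CDF to a beta distributed random variable to obtain
$$F(x) = \frac{1}{B(\alpha,\beta)}\int_{0}^{G(x)} t^{\alpha-1}(1-t)^{\beta-1}\,dt, \quad \alpha, \beta > 0. \tag{3}$$
The corresponding probability density function for $F(x)$ is given by
$$f(x) = \frac{1}{B(\alpha,\beta)}\,[G(x)]^{\alpha-1}\,[1-G(x)]^{\beta-1}\,g(x), \tag{4}$$
where $g(x) = dG(x)/dx$ is the parent density function and $B(\alpha,\beta)$ is the beta function. We now introduce the three-parameter beta-Lindley (BL) distribution by taking $G(x)$ in (3) to be the CDF (2). The CDF of the BL distribution is then
$$F(x;\alpha,\beta,\theta) = \frac{1}{B(\alpha,\beta)}\int_{0}^{\,1-\frac{\theta+1+\theta x}{\theta+1}e^{-\theta x}} t^{\alpha-1}(1-t)^{\beta-1}\,dt. \tag{5}$$
The PDF of the new distribution is given by
$$f(x;\alpha,\beta,\theta) = \frac{\theta^{2}(1+x)\,e^{-\theta x}}{B(\alpha,\beta)\,(\theta+1)}\left[1-\frac{\theta+1+\theta x}{\theta+1}e^{-\theta x}\right]^{\alpha-1}\left[\frac{\theta+1+\theta x}{\theta+1}e^{-\theta x}\right]^{\beta-1}, \quad x > 0. \tag{6}$$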
Figure 1(a) illustrates some of the possible shapes of the PDF of the beta-Lindley distribution for selected values of the parameters $\alpha$, $\beta$, and $\theta$.
Figure 1: (a) PDF of the BL distribution; (b) hazard function of the BL distribution.
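The beta-generated construction above can be sketched numerically as follows. This is a minimal standard-library illustration, not the authors' implementation: the names `bl_pdf`, `bl_cdf`, and `reg_inc_beta` are hypothetical, the shape parameters are written `alpha` and `beta` and the Lindley parameter `theta` (names assumed here), and the regularized incomplete beta function is evaluated with the usual continued-fraction expansion.

```python
import math

def lindley_pdf(x, theta):
    return theta**2 / (theta + 1.0) * (1.0 + x) * math.exp(-theta * x)

def lindley_cdf(x, theta):
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)

def _beta_cf(a, b, x, max_iter=300, eps=1e-15, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even step, then odd step of the recurrence
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            if abs(d) < tiny:
                d = tiny
            c = 1.0 + aa / c
            if abs(c) < tiny:
                c = tiny
            d = 1.0 / d
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def bl_pdf(x, alpha, beta, theta):
    """Beta-generated density: g(x) G(x)^(alpha-1) (1-G(x))^(beta-1) / B(alpha, beta)."""
    if x <= 0:
        return 0.0
    G = lindley_cdf(x, theta)
    log_b = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return lindley_pdf(x, theta) * G**(alpha - 1.0) * (1.0 - G)**(beta - 1.0) / math.exp(log_b)

def bl_cdf(x, alpha, beta, theta):
    """Beta-generated CDF: I_{G(x)}(alpha, beta) with G the Lindley CDF."""
    if x <= 0:
        return 0.0
    return reg_inc_beta(alpha, beta, lindley_cdf(x, theta))
```

With `alpha = beta = 1` both functions collapse to the Lindley density and CDF, which gives a convenient check of the sketch.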
The CDF (5) can be expressed in terms of the hypergeometric function (see Cordeiro and Nadarajah [11]) in the following way: If the parameter is real non-integer, we have
Lemma 2. When $\alpha = \beta = 1$, the BL in (6) reduces to the Lindley distribution in (1) with parameter $\theta$.
Lemma 3. When $\beta = 1$, the BL in (6) reduces to the generalized Lindley distribution proposed by Nadarajah et al. [12].
Lemma 4. The limit of the beta-Lindley density as is and the limit as is .
Proof. It is straightforward to show the above from the beta-Lindley density in (6).
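The reliability function $R(t)$, which is the probability of an item not failing prior to some time $t$, is defined by $R(t) = 1 - F(t)$. The reliability function of the beta-Lindley distribution is given by
$$R(x;\alpha,\beta,\theta) = 1 - \frac{1}{B(\alpha,\beta)}\int_{0}^{G(x;\theta)} t^{\alpha-1}(1-t)^{\beta-1}\,dt,$$
where $G(x;\theta) = 1 - \frac{\theta+1+\theta x}{\theta+1}e^{-\theta x}$ is the Lindley CDF (2). The other characteristic of interest of a random variable is the hazard rate function defined by $h(t) = f(t)/(1 - F(t))$, which is an important quantity characterizing life phenomena. It can be loosely interpreted as the conditional probability of failure, given that the item has survived to time $t$. The hazard rate function for the beta-Lindley random variable is given by
$$h(x;\alpha,\beta,\theta) = \frac{f(x;\alpha,\beta,\theta)}{R(x;\alpha,\beta,\theta)},$$
where $f(x;\alpha,\beta,\theta)$ is the BL density (6).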
Figure 1(b) illustrates some of the possible shapes of the hazard function of the beta-Lindley distribution for selected values of the parameters $\alpha$, $\beta$, and $\theta$.
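The hazard rate is simply the density divided by the reliability, so it can be sketched for any density/CDF pair. The helper `hazard` below is a hypothetical name; it is demonstrated on the Lindley baseline, whose hazard has the known closed form theta^2 (1 + x) / (theta + 1 + theta x) and is increasing in x. Passing a beta-Lindley density and CDF instead would give the BL hazard in the same way.

```python
import math

def lindley_pdf(x, theta):
    return theta**2 / (theta + 1.0) * (1.0 + x) * math.exp(-theta * x)

def lindley_cdf(x, theta):
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)

def hazard(x, pdf, cdf):
    """Hazard rate h(x) = f(x) / (1 - F(x)) for a given density and CDF."""
    return pdf(x) / (1.0 - cdf(x))

theta = 1.5  # illustrative value only
values = [hazard(x, lambda t: lindley_pdf(t, theta), lambda t: lindley_cdf(t, theta))
          for x in (1.0, 2.0, 3.0)]
```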
3. Moments and Generating Function
Theorem 5. The $r$th moment of the beta-Lindley distributed random variable $X$, if $\alpha$ and $\beta$ are real non-integers, is given as
Proof. See the appendix.
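A series expression for the moments can be cross-checked numerically by integrating x^r f(x) directly. The sketch below does this with a plain midpoint rule; `bl_pdf` and `bl_moment` are hypothetical names, the parameter names `alpha`, `beta`, and `theta` are assumed here, and the default truncation point `upper` is a heuristic chosen for this illustration. For `alpha = beta = 1` the result should match the known Lindley moments, for example a mean of (theta + 2) / (theta (theta + 1)).

```python
import math

def bl_pdf(x, alpha, beta, theta):
    """Beta-generated Lindley density, written with s = 1 - G(x) to keep the tail stable."""
    s = (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)
    g = theta**2 / (theta + 1.0) * (1.0 + x) * math.exp(-theta * x)
    log_b = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return g * (1.0 - s)**(alpha - 1.0) * s**(beta - 1.0) / math.exp(log_b)

def bl_moment(r, alpha, beta, theta, upper=None, n=20000):
    """Midpoint-rule approximation of E[X^r]; midpoints avoid evaluating at x = 0."""
    if upper is None:
        upper = 60.0 / (theta * min(1.0, beta))  # heuristic cut-off for the tail
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x**r * bl_pdf(x, alpha, beta, theta)
    return h * total
```

The midpoint rule is only adequate when the density is bounded near zero (roughly, `alpha >= 1`); a more careful quadrature would be needed otherwise.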
4. Order Statistics
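The $k$th order statistic of a sample is its $k$th smallest value. For a sample of size $n$, the $n$th order statistic (or largest order statistic) is the maximum; that is,
$$X_{(n)} = \max\{X_1, \ldots, X_n\}.$$
The sample range is the difference between the maximum and minimum. It is clearly a function of the order statistics:
$$\operatorname{Range}\{X_1, \ldots, X_n\} = X_{(n)} - X_{(1)}.$$
We know that if $X_{(1)} \le \cdots \le X_{(n)}$ denote the order statistics of a random sample $X_1, \ldots, X_n$ from a continuous population with CDF $F_X(x)$ and PDF $f_X(x)$, then the PDF of $X_{(j)}$ is given by
$$f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!}\, f_X(x)\,[F_X(x)]^{j-1}\,[1-F_X(x)]^{n-j}$$
for $j = 1, \ldots, n$. The PDF of the $j$th order statistic for the beta-Lindley distribution is obtained by taking $f_X$ to be the BL density (6) and $F_X$ to be the BL CDF (5):
$$f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!}\, f(x;\alpha,\beta,\theta)\,[F(x;\alpha,\beta,\theta)]^{j-1}\,[1-F(x;\alpha,\beta,\theta)]^{n-j}.$$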
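The order-statistic density can be sketched for any continuous distribution by passing its density and CDF as callables. The name `order_stat_pdf` is hypothetical, and the Lindley baseline is used below only because it has a closed-form CDF; a beta-Lindley density and CDF could be supplied in the same way.

```python
import math

def lindley_pdf(x, theta):
    return theta**2 / (theta + 1.0) * (1.0 + x) * math.exp(-theta * x)

def lindley_cdf(x, theta):
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)

def order_stat_pdf(x, j, n, pdf, cdf):
    """Density of the j-th smallest of n i.i.d. draws with the given pdf and cdf."""
    coef = math.factorial(n) / (math.factorial(j - 1) * math.factorial(n - j))
    F = cdf(x)
    return coef * pdf(x) * F**(j - 1) * (1.0 - F)**(n - j)

theta = 1.5  # illustrative value only
pdf = lambda t: lindley_pdf(t, theta)
cdf = lambda t: lindley_cdf(t, theta)
# Density of the sample maximum of n = 5 observations at x = 1.0
max_density = order_stat_pdf(1.0, 5, 5, pdf, cdf)
```

A useful identity for checking such code is that the order-statistic densities for j = 1, ..., n sum to n times the parent density at every x.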
5. Estimation
5.1. Maximum Likelihood Estimates
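The maximum likelihood estimates (MLEs) of the parameters of the beta-Lindley distribution are obtained as follows. The likelihood function of the observed sample $x = (x_1, \ldots, x_n)$ of size $n$ drawn from the density (6) is defined as
$$L(\alpha,\beta,\theta; x) = \prod_{i=1}^{n} f(x_i;\alpha,\beta,\theta). \tag{16}$$
The corresponding log-likelihood function is given by
$$\ell = 2n\log\theta - n\beta\log(\theta+1) - n\log B(\alpha,\beta) + \sum_{i=1}^{n}\log(1+x_i) - \beta\theta\sum_{i=1}^{n}x_i + (\beta-1)\sum_{i=1}^{n}\log(\theta+1+\theta x_i) + (\alpha-1)\sum_{i=1}^{n}\log G(x_i;\theta), \tag{17}$$
where $G(x;\theta) = 1 - \frac{\theta+1+\theta x}{\theta+1}e^{-\theta x}$ is the Lindley CDF (2). Now, setting $\partial\ell/\partial\alpha = 0$, $\partial\ell/\partial\beta = 0$, and $\partial\ell/\partial\theta = 0$, we have
$$n[\psi(\alpha+\beta) - \psi(\alpha)] + \sum_{i=1}^{n}\log G(x_i;\theta) = 0,$$
$$n[\psi(\alpha+\beta) - \psi(\beta)] - n\log(\theta+1) - \theta\sum_{i=1}^{n}x_i + \sum_{i=1}^{n}\log(\theta+1+\theta x_i) = 0,$$
$$\frac{2n}{\theta} - \frac{n\beta}{\theta+1} - \beta\sum_{i=1}^{n}x_i + (\beta-1)\sum_{i=1}^{n}\frac{1+x_i}{\theta+1+\theta x_i} + (\alpha-1)\sum_{i=1}^{n}\frac{\theta x_i e^{-\theta x_i}\,[\theta+2+(\theta+1)x_i]}{(\theta+1)^{2}\,G(x_i;\theta)} = 0,$$
where $\psi(\cdot)$ is the digamma function. The MLEs $\hat\alpha$, $\hat\beta$, and $\hat\theta$ of $\alpha$, $\beta$, and $\theta$, respectively, are obtained by solving this nonlinear system of equations. It is usually more convenient to use nonlinear optimization algorithms such as the quasi-Newton algorithm to numerically maximize the sample likelihood function given in (16). Applying the usual large sample approximation, the MLE $\hat\Theta = (\hat\alpha, \hat\beta, \hat\theta)$ can be treated as being approximately trivariate normal with mean $\Theta = (\alpha,\beta,\theta)$ and variance-covariance matrix equal to the inverse of the expected information matrix; that is,
$$\hat\Theta \;\dot\sim\; N_3\!\left(\Theta,\ I^{-1}(\Theta)\right),$$
where $I^{-1}(\Theta)$ is the limiting variance-covariance matrix of $\hat\Theta$. The elements of the $3\times 3$ matrix $I(\Theta)$ can be estimated by $I_{ij}(\hat\Theta) = -\ell_{\Theta_i\Theta_j}\big|_{\Theta=\hat\Theta}$, $i, j \in \{1, 2, 3\}$.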
The elements of the Hessian matrix corresponding to the function in (17) are given in the appendix.
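The log-likelihood that a numerical optimizer would maximize can be sketched as below. This is an illustration under the beta-generated form of the density, with hypothetical names (`bl_loglik`, `alpha`, `beta`, `theta`) and a small made-up sample used only to exercise the function; the optimizer itself (for example a quasi-Newton routine) is not shown.

```python
import math

def bl_loglik(alpha, beta, theta, data):
    """Log-likelihood of an i.i.d. sample under the beta-generated Lindley density."""
    n = len(data)
    log_b = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    total = n * (2.0 * math.log(theta) - beta * math.log(theta + 1.0) - log_b)
    for x in data:
        # s = 1 - G(x), the Lindley survival function
        s = (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)
        total += (math.log1p(x)
                  - beta * theta * x
                  + (beta - 1.0) * math.log(theta + 1.0 + theta * x)
                  + (alpha - 1.0) * math.log1p(-s))
    return total

sample = [0.5, 1.2, 2.0, 3.3]  # made-up values, for illustration only
value = bl_loglik(2.0, 3.0, 1.2, sample)
```

Working on the log scale with `log1p` keeps the computation stable when G(x) is close to 0 or 1.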
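Approximate $100(1-\gamma)\%$ two-sided confidence intervals for $\alpha$, $\beta$, and $\theta$ are, respectively, given by
$$\hat\alpha \pm z_{\gamma/2}\sqrt{I^{-1}_{11}(\hat\Theta)}, \qquad \hat\beta \pm z_{\gamma/2}\sqrt{I^{-1}_{22}(\hat\Theta)}, \qquad \hat\theta \pm z_{\gamma/2}\sqrt{I^{-1}_{33}(\hat\Theta)},$$
where $z_{\gamma}$ is the upper $\gamma$th quantile of the standard normal distribution and $I^{-1}_{ii}(\hat\Theta)$ is the $i$th diagonal element of the estimated variance-covariance matrix of the MLEs. The Hessian matrix and its inverse can be computed numerically, and hence so can the standard errors and asymptotic confidence intervals.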
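Given an estimate and its estimated variance (a diagonal element of the inverse information matrix), the asymptotic interval is a one-line computation. The sketch below uses the standard library's normal quantile; `wald_interval` is a hypothetical helper and the numbers passed to it are placeholders, not results from the paper.

```python
import math
from statistics import NormalDist

def wald_interval(estimate, variance, level=0.95):
    """Two-sided normal-approximation interval: estimate +/- z * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width

# Placeholder estimate and variance, for illustration only
lower, upper = wald_interval(2.0, 0.25)
```

For a positive parameter such an interval can extend below zero in small samples, which is one reason intervals on a transformed scale are sometimes preferred.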
We can compute the maximized unrestricted and restricted log-likelihood functions to construct the likelihood ratio (LR) test statistic for testing some of the beta-Lindley submodels. For example, we can use the LR test statistic to check whether the beta-Lindley distribution for a given data set is statistically superior to the Lindley distribution. In any case, hypothesis tests of the type $H_0: \Theta = \Theta_0$ versus $H_1: \Theta \neq \Theta_0$ can be performed using an LR test. In this case, the LR test statistic for testing $H_0$ versus $H_1$ is $\omega = 2\{\ell(\hat\Theta; x) - \ell(\hat\Theta_0; x)\}$, where $\hat\Theta$ and $\hat\Theta_0$ are the MLEs under $H_1$ and $H_0$, respectively. The statistic $\omega$ is asymptotically (as $n \to \infty$) distributed as $\chi^2_k$, where $k$ is the length of the parameter vector $\Theta$ of interest. The LR test rejects $H_0$ if $\omega > \xi_\gamma$, where $\xi_\gamma$ denotes the upper $100\gamma\%$ quantile of the $\chi^2_k$ distribution.
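For the comparison of the beta-Lindley model against its Lindley submodel, the null hypothesis fixes two parameters, so the reference distribution is chi-square with two degrees of freedom, whose upper tail probability is exp(-w/2) in closed form. The sketch below assumes exactly that case; `lr_test_df2` is a hypothetical name and the log-likelihood values are placeholders rather than fitted results.

```python
import math

def lr_test_df2(loglik_full, loglik_null, gamma=0.05):
    """LR statistic, p-value, and decision for a test with 2 degrees of freedom."""
    w = 2.0 * (loglik_full - loglik_null)
    p_value = math.exp(-w / 2.0)          # chi-square(2) survival function
    critical = -2.0 * math.log(gamma)     # upper 100*gamma% quantile of chi-square(2)
    return w, p_value, w > critical

# Placeholder maximized log-likelihoods, for illustration only
w, p_value, reject = lr_test_df2(-100.0, -103.0)
```

Other degrees of freedom would need a general chi-square tail function, which the standard library does not provide.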
5.2. Least Squares Estimators
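In this section, we provide the regression-based estimators of the unknown parameters of the beta-Lindley distribution. The method was originally suggested by Swain et al. [13] to estimate the parameters of beta distributions, and it can be used in other cases as well. Suppose $Y_1, \ldots, Y_n$ is a random sample of size $n$ from a distribution function $F(\cdot)$ and suppose $Y_{(i)}$, $i = 1, \ldots, n$, denotes the ordered sample. The proposed method uses the distribution of $F(Y_{(i)})$. For a sample of size $n$, we have
$$E\left[F(Y_{(j)})\right] = \frac{j}{n+1}, \qquad \operatorname{Var}\left[F(Y_{(j)})\right] = \frac{j(n-j+1)}{(n+1)^{2}(n+2)},$$
$$\operatorname{Cov}\left[F(Y_{(j)}), F(Y_{(k)})\right] = \frac{j(n-k+1)}{(n+1)^{2}(n+2)} \quad \text{for } j < k;$$
see Johnson et al. [14]. Using the expectations and the variances, the least squares methods can be used.
The estimators are obtained by minimizing
$$\sum_{j=1}^{n}\left(F(Y_{(j)}) - \frac{j}{n+1}\right)^{2}$$
with respect to the unknown parameters. Therefore, in the case of the BL distribution, the least squares estimators of $\alpha$, $\beta$, and $\theta$, say $\hat\alpha_{LSE}$, $\hat\beta_{LSE}$, and $\hat\theta_{LSE}$, respectively, can be obtained by minimizing
$$\sum_{j=1}^{n}\left[F(x_{(j)};\alpha,\beta,\theta) - \frac{j}{n+1}\right]^{2}$$
with respect to $\alpha$, $\beta$, and $\theta$, where $F(\cdot\,;\alpha,\beta,\theta)$ is the BL CDF (5).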
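The least squares criterion compares the fitted CDF at each ordered observation with its expected value j/(n+1). A minimal sketch of the objective, written for an arbitrary CDF callable, is given below; `ls_objective` is a hypothetical name, the sample is made up, and the Lindley CDF stands in for the beta-Lindley CDF only to keep the example short.

```python
import math

def ls_objective(sample, cdf):
    """Sum over j of (F(x_(j)) - j / (n + 1))^2 for the ordered sample."""
    ordered = sorted(sample)
    n = len(ordered)
    return sum((cdf(x) - (j + 1) / (n + 1.0))**2 for j, x in enumerate(ordered))

def lindley_cdf(x, theta):
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)

sample = [0.5, 1.2, 2.0, 3.3]  # made-up values, for illustration only
# Evaluate the criterion on a coarse grid of theta values and keep the best one
grid = [0.2 + 0.05 * k for k in range(60)]
theta_ls = min(grid, key=lambda th: ls_objective(sample, lambda x: lindley_cdf(x, th)))
```

In practice the grid search would be replaced by a numerical minimizer over all three parameters; the grid is used here only to avoid external dependencies.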
5.3. Bayes Estimation
In this section, we develop the Bayes procedure for the estimation of the unknown model parameters based on an observed sample from the beta-Lindley distribution. In addition to a likelihood function, the Bayesian approach needs a prior distribution for the parameters, which quantifies the uncertainty about them prior to observing data. In many situations, existing knowledge may be difficult to summarise in the form of an informative prior. In such cases, it is better to consider a noninformative prior for the Bayesian analysis (for more details on the use of noninformative priors, see [15]). We take the noninformative priors ([16]) for $\alpha$, $\beta$, and $\theta$ of the following forms: It is to be noticed that the choices of and are unimportant and we can simply take Thus, the joint posterior distribution of $\alpha$, $\beta$, and $\theta$ is given by where is the normalizing constant. Under squared error loss, the Bayes estimates of $\alpha$, $\beta$, and $\theta$ are the means of their marginal posteriors and defined as respectively. It is not easy to calculate the Bayes estimates through (28), (29), and (30), so numerical approximation techniques are needed. Therefore, we propose the use of Markov chain Monte Carlo (MCMC) techniques, namely, the Gibbs sampler and the Metropolis-Hastings (MH) algorithm; see [17–19]. Since the conditional posteriors of the parameters cannot be obtained in any standard form, we use a hybrid MCMC strategy for drawing samples from the joint posterior of the parameters. To implement the Gibbs algorithm, the full conditional posteriors of $\alpha$, $\beta$, and $\theta$ are given by The simulation algorithm we followed is given by the following.
Step 1. Set starting points, say , , and , then at th stage.
Step 2. Using MH algorithm, generate .
Step 3. Using MH algorithm, generate .
Step 4. Using MH algorithm, generate .
Step 5. Repeat steps 2–4, times to get the samples of size from the corresponding posteriors of interest.
Step 6. Obtain the Bayes estimates of , , and using the following formulae: respectively, where is the burn-in period of the generated Markov chains.
Step 7. Obtain the HPD credible intervals for , , and by applying the methodology of [20]. The HPD credible intervals for , , and are , , and , respectively, where is chosen such that
Here, denotes the largest integer less than or equal to .
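The HPD step can be sketched in the spirit of the Chen and Shao approach [20]: sort the retained posterior draws, consider every interval that spans the required number of consecutive ordered draws, and keep the shortest. The function name `hpd_interval` and the toy draws below are illustrative assumptions, not part of the paper.

```python
import math

def hpd_interval(samples, level=0.95):
    """Shortest interval between ordered draws j and j + floor(level * M)."""
    ordered = sorted(samples)
    m = len(ordered)
    k = int(math.floor(level * m))
    if k < 1 or k >= m:
        raise ValueError("need more samples for the requested level")
    # index of the left end point giving the narrowest interval
    best = min(range(m - k), key=lambda j: ordered[j + k] - ordered[j])
    return ordered[best], ordered[best + k]

# Toy draws with one outlier, for illustration only
draws = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
interval = hpd_interval(draws, level=0.5)
```

In a real analysis the draws would be the post-burn-in MCMC output for one parameter.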
Note that several attempts have been made to suggest a proposal density for the target posterior in the implementation of the MH algorithm. By reparameterizing the posterior on the entire real line, [16, 21] suggested using the normal approximation of the posterior as a proposal candidate in the MH algorithm. Alternatively, it is also realistic to use the truncated normal distribution without reparameterizing the original parameters. Therefore, we propose the use of the truncated normal distribution as the proposal kernel for the target posterior.
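A single-parameter MH update with a normal proposal truncated to the positive half-line can be sketched as follows. Because the truncated proposal is not symmetric, the acceptance ratio carries the correction factor Phi(x / s) / Phi(y / s). Everything here is an illustrative assumption: the name `mh_truncnorm`, the proposal scale, and the stand-in target (a gamma-shaped log density), which merely plays the role of one full conditional posterior.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def mh_truncnorm(log_target, x0, scale, n_iter, rng):
    """Metropolis-Hastings on (0, inf) with a normal proposal truncated at zero."""
    x, lp_x = x0, log_target(x0)
    chain = []
    for _ in range(n_iter):
        # Draw from N(x, scale^2) restricted to y > 0 by simple rejection
        y = rng.gauss(x, scale)
        while y <= 0.0:
            y = rng.gauss(x, scale)
        lp_y = log_target(y)
        # Hastings correction for the truncated (asymmetric) proposal
        log_ratio = (lp_y - lp_x
                     + math.log(_STD_NORMAL.cdf(x / scale))
                     - math.log(_STD_NORMAL.cdf(y / scale)))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x, lp_x = y, lp_y
        chain.append(x)
    return chain

# Stand-in target: unnormalized log density of a Gamma(shape=3, rate=2) variable
log_target = lambda t: 2.0 * math.log(t) - 2.0 * t
chain = mh_truncnorm(log_target, x0=1.0, scale=1.0, n_iter=20000, rng=random.Random(0))
posterior_mean = sum(chain[2000:]) / len(chain[2000:])
```

In the hybrid scheme described above, one such update would be applied in turn to each parameter's full conditional within every Gibbs cycle.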
6. Application
6.1. Real Data Applications
In this section, we use two real data sets to show that the beta-Lindley distribution can be a better model than one based on the Lindley distribution. The description of the data is as follows.
Data Set 1. Data set 1 is an uncensored data set of remission times (in months) of a random sample of 128 bladder cancer patients reported by Lee and Wang [22].
Data Set 2. Data set 2 consists of the survival times (in days) of 72 guinea pigs infected with virulent tubercle bacilli, observed and reported by Bjerkedal [23]. The survival times of the 72 guinea pigs are as follows.
The variance-covariance matrix of the MLEs under the beta-Lindley distribution for data set 1 is computed as Thus, the variances of the MLEs of $\alpha$, $\beta$, and $\theta$ are , , and . Therefore, confidence intervals for $\alpha$, $\beta$, and $\theta$ are , , and , respectively.
In order to compare the two distribution models, we consider criteria like $-2\ell$, AIC, and AICC for the data sets. The better distribution corresponds to smaller $-2\ell$, AIC, and AICC values.
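The information criteria follow directly from the maximized log-likelihood. The sketch below uses the standard definitions AIC = 2k - 2l and AICC = AIC + 2k(k + 1)/(n - k - 1), with k the number of fitted parameters and n the sample size; it is assumed, not confirmed by this text, that these are the definitions used for the tables. The numbers are placeholders, not values from the paper.

```python
def aic(loglik, k):
    """Akaike information criterion."""
    return 2.0 * k - 2.0 * loglik

def aicc(loglik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1.0)

# Placeholder inputs: log-likelihood -100, 3 parameters, 50 observations
example_aic = aic(-100.0, 3)
example_aicc = aicc(-100.0, 3, 50)
```

Since the Lindley model has one parameter and the beta-Lindley model three, the penalty terms differ between the two fits even when the sample is the same.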
The LR test statistic to test the hypotheses versus for data set 1 is , so we reject the null hypothesis.
The variance-covariance matrix of the MLEs under the beta-Lindley distribution for data set 2 is computed as Thus, the variances of the MLEs of $\alpha$, $\beta$, and $\theta$ are , , and . Therefore, confidence intervals for $\alpha$, $\beta$, and $\theta$ are , , and , respectively.
The LR test statistic to test the hypotheses versus for data set 2 is , so we reject the null hypothesis. Tables 1 and 2 show the parameter MLEs for each of the two fitted distributions for data sets 1 and 2, together with the values of $-2\ell$, AIC, and AICC. The values in Tables 1 and 2 indicate that the beta-Lindley distribution is a strong competitor to the Lindley distribution for fitting data set 1 and data set 2. A density plot compares the fitted densities of the models with the empirical histogram of the observed data (Figures 2(a) and 2(b)). The fitted density for the beta-Lindley model is closer to the empirical histogram than the fit of the Lindley model.


Figure 2: Fitted densities and empirical histograms for (a) data set 1 and (b) data set 2.
The Bayes estimates and the corresponding HPD credible intervals for the parameters $\alpha$, $\beta$, and $\theta$ are summarised in Table 3.

6.2. Simulated Data
In this subsection, we provide an algorithm to generate a random sample from the beta-Lindley distribution for given values of its parameters and sample size $n$. The simulation process consists of the following steps.
Step 1. Set , and .
Step 2. Set initial value for the random starting.
Step 3. Set .
Step 4. Generate .
Step 5. Update by using Newton’s formula such as
Step 6. If (very small, tolerance limit), then will be the desired sample from .
Step 7. If , then set and go to step 5.
Step 8. Repeat steps 4–7, for , and obtain .
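A variant of the steps above can be sketched using the construction from Section 2: draw a beta variate and invert the Lindley CDF with Newton's formula. This differs from the listed steps in what is fed to the Newton iteration, and it adds a bisection safeguard so the iterate always stays inside a bracket; both are choices made for this sketch. The names `lindley_quantile` and `rbl`, the parameter names, and the tolerance are assumptions.

```python
import math
import random

def lindley_pdf(x, theta):
    return theta**2 / (theta + 1.0) * (1.0 + x) * math.exp(-theta * x)

def lindley_cdf(x, theta):
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)

def lindley_quantile(v, theta, tol=1e-12, max_iter=100):
    """Solve G(x) = v by Newton's method, falling back to bisection when needed."""
    lo, hi = 0.0, 1.0
    while lindley_cdf(hi, theta) < v:      # expand until the root is bracketed
        hi *= 2.0
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        err = lindley_cdf(x, theta) - v
        if err > 0.0:
            hi = x
        else:
            lo = x
        x_new = x - err / lindley_pdf(x, theta)   # Newton update
        if not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)               # safeguard: bisect
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def rbl(n, alpha, beta, theta, rng):
    """Draw n beta-Lindley variates as G^{-1}(V) with V ~ Beta(alpha, beta)."""
    out = []
    for _ in range(n):
        v = rng.betavariate(alpha, beta)
        v = min(max(v, 1e-15), 1.0 - 1e-15)       # keep v strictly inside (0, 1)
        out.append(lindley_quantile(v, theta))
    return out

sample = rbl(5000, 1.0, 1.0, 1.5, random.Random(1))
```

With `alpha = beta = 1` the draws are plain Lindley variates, so their sample mean should be close to (theta + 2) / (theta (theta + 1)).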
Using the previous algorithm, we generated a sample of size 30 from the beta-Lindley distribution for arbitrary values of , and . The simulated sample (Data 3) is given by
The maximum likelihood estimates and Bayes estimates with corresponding confidence/credible intervals are calculated based on the simulated sample. The MLEs of are , respectively. The asymptotic confidence intervals for are obtained as , , and , respectively. For Bayes estimates and the corresponding credible intervals based on simulated data, see Table 3.
7. Conclusion
Here, we propose a new model, the so-called beta-Lindley distribution, which extends the Lindley distribution in the analysis of positive real data. An obvious reason for generalizing a standard distribution is that the generalized form provides greater flexibility in modelling real data. We derive expansions for the moments and for the moment generating function. The parameters are estimated by maximum likelihood, least squares, and Bayesian methods; the information matrix is also derived. We consider the likelihood ratio statistic to compare the model with its baseline model. Two applications of the beta-Lindley distribution to real data show that the new distribution can be used quite effectively to provide better fits than the Lindley distribution.
Appendix
Proof of Theorem 5. One has
So,
The elements of Hessian matrix. One has
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
[1] N. Eugene, C. Lee, and F. Famoye, “Beta-normal distribution and its applications,” Communications in Statistics: Theory and Methods, vol. 31, no. 4, pp. 497–512, 2002.
[2] S. Nadarajah and A. K. Gupta, “The beta Fréchet distribution,” Far East Journal of Theoretical Statistics, vol. 14, no. 1, pp. 15–24, 2004.
[3] S. Nadarajah and S. Kotz, “The beta Gumbel distribution,” Mathematical Problems in Engineering, vol. 2004, no. 4, pp. 323–332, 2004.
[4] S. Nadarajah and S. Kotz, “The beta exponential distribution,” Reliability Engineering & System Safety, vol. 91, pp. 689–697, 2005.
[5] M. C. Jones, “Families of distributions arising from distributions of order statistics,” Test, vol. 13, no. 1, pp. 1–43, 2004.
[6] W. Barreto-Souza, A. H. S. Santos, and G. M. Cordeiro, “The beta generalized exponential distribution,” Journal of Statistical Computation and Simulation, vol. 80, no. 2, pp. 159–172, 2010.
[7] R. R. Pescim, C. G. Demétrio, G. M. Cordeiro, E. M. Ortega, and M. R. Urbano, “The beta generalized half-normal distribution,” Computational Statistics and Data Analysis, vol. 54, no. 4, pp. 945–957, 2010.
[8] G. M. Cordeiro, C. T. Cristino, E. M. Hashimoto, and E. M. M. Ortega, “The beta generalized Rayleigh distribution with applications to lifetime data,” Statistical Papers, vol. 54, no. 1, pp. 133–161, 2013.
[9] D. V. Lindley, “Fiducial distributions and Bayes' theorem,” Journal of the Royal Statistical Society B, vol. 20, pp. 102–107, 1958.
[10] M. E. Ghitany, B. Atieh, and S. Nadarajah, “Lindley distribution and its application,” Mathematics and Computers in Simulation, vol. 78, no. 4, pp. 493–506, 2008.
[11] G. M. Cordeiro and S. Nadarajah, “Closed-form expressions for moments of a class of beta generalized distributions,” Brazilian Journal of Probability and Statistics, vol. 25, no. 1, pp. 14–33, 2011.
[12] S. Nadarajah, H. S. Bakouch, and R. Tahmasbi, “A generalized Lindley distribution,” Sankhya B: Applied and Interdisciplinary Statistics, vol. 73, no. 2, pp. 331–359, 2011.
[13] J. Swain, S. Venkatraman, and J. Wilson, “Least squares estimation of distribution function in Johnson's translation system,” Journal of Statistical Computation and Simulation, vol. 29, pp. 271–297, 1988.
[14] N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, vol. 2, John Wiley & Sons, 2nd edition, 1995.
[15] J. O. Berger, Statistical Decision Theory and Bayesian Analysis, Springer, 1985.
[16] S. K. Upadhyay and A. Gupta, “A Bayes analysis of modified Weibull distribution via Markov chain Monte Carlo simulation,” Journal of Statistical Computation and Simulation, vol. 80, no. 34, pp. 241–254, 2010.
[17] A. E. Gelfand and A. F. M. Smith, “Sampling-based approaches to calculating marginal densities,” Journal of the American Statistical Association, vol. 85, no. 410, pp. 398–409, 1990.
[18] W. K. Hastings, “Monte Carlo sampling methods using Markov chains and their applications,” Biometrika, vol. 57, no. 1, pp. 97–109, 1970.
[19] S. P. Brooks, “Markov chain Monte Carlo method and its application,” Journal of the Royal Statistical Society Series D: The Statistician, vol. 47, no. 1, pp. 69–100, 1998.
[20] M. H. Chen and Q. M. Shao, “Monte Carlo estimation of Bayesian credible and HPD intervals,” Journal of Computational and Graphical Statistics, vol. 8, no. 1, pp. 69–92, 1999.
[21] S. K. Upadhyay, N. Vasishta, and A. F. M. Smith, “Bayes inference in life testing and reliability via Markov chain Monte Carlo simulation,” Sankhya: The Indian Journal of Statistics A, vol. 63, no. 1, pp. 15–40, 2001.
[22] E. T. Lee and J. W. Wang, Statistical Methods for Survival Data Analysis, John Wiley & Sons, New York, NY, USA, 3rd edition, 2003.
[23] T. Bjerkedal, “Acquisition of resistance in Guinea pigs infected with different doses of virulent tubercle bacilli,” The American Journal of Epidemiology, vol. 72, no. 1, pp. 130–148, 1960.
Copyright
Copyright © 2014 Faton Merovci and Vikas Kumar Sharma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.