Journal of Probability and Statistics

Volume 2014 (2014), Article ID 645719, 5 pages

http://dx.doi.org/10.1155/2014/645719

## Parametric Regression Models Using Reversed Hazard Rates

Asokan Mulayath Variyath^{1} and P. G. Sankaran^{2}

^{1}Department of Mathematics and Statistics, Memorial University of Newfoundland, St. John's, NL, Canada A1C 5S7

^{2}Department of Statistics, Cochin University of Science and Technology, Cochin, Kerala 682022, India

Received 19 June 2013; Accepted 5 October 2013; Published 6 January 2014

Academic Editor: Aera Thavaneswaran

Copyright © 2014 Asokan Mulayath Variyath and P. G. Sankaran. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Proportional hazard regression models are widely used in survival analysis to understand and exploit the relationship between survival time and covariates. For left censored survival times, reversed hazard rate functions are more appropriate. In this paper, we develop a parametric proportional reversed hazard rates model using an inverted Weibull distribution. The estimation of the parameters and the construction of confidence intervals are discussed. We assess the performance of the proposed procedure using a large number of Monte Carlo simulations. We illustrate the proposed method using a real case example.

#### 1. Introduction

In survival studies, covariates or explanatory variables are usually employed to represent heterogeneity in a population. The main objective in such situations is to understand and exploit the relationship between lifetime and covariates. Regression models are useful in such contexts to assess the effect of covariates on lifetime. These models can be formulated in many ways and several types are in common use. Parametric regression models for lifetime involve the specification of the distribution of a lifetime $T$ given a vector of covariates $z$. The most commonly used parametric model is the Weibull regression model, which satisfies the proportional relationship between the hazard rate functions of the lifetimes of two subjects. The maximum likelihood technique is usually employed to estimate the parameters of the model. For more properties and applications of parametric regression models, see Lawless [1].

In survival studies, there are many occasions where lifetime data are left censored. For example, baboons in the Amboseli Reserve, Kenya, sleep in the trees and descend for foraging at certain times of the day. Observers often arrive later in the day after this descent has occurred, and on such days they can only ascertain that the descent took place before a particular time, so that the descent times are left censored (see [2]). On such occasions, a reversed hazard rate is more appropriate than a hazard rate for analyzing lifetime data, because estimators of hazard rates are unstable when data are left censored. The reversed hazard rate of a lifetime random variable $T$ is defined as

$$\lambda(t) = \lim_{\Delta t \to 0} \frac{P(t - \Delta t < T \le t \mid T \le t)}{\Delta t}. \tag{1}$$

Introduced by Barlow et al. [3], the function has been used in various contexts such as the estimation of distribution function under left censoring [1], defining a new stochastic order [4], characterization of lifetime distributions [5–7], studying ageing behavior [8, 9], evolving new repair and maintenance strategies [10, 11], the mixed proportional hazards model [12], and stress hybrid hazards model [13].

Recently, Sengupta and Nanda [14] introduced the proportional reversed hazards model in a semiparametric setup. In the present work, we introduce a fully parametric regression model that satisfies the proportional reversed hazards property. The inverted Weibull distribution is employed as the lifetime model, and the approach can be extended to other parametric models. A large number of simulation studies indicate that the proposed approach performs well.

The rest of the paper is organized as follows. In Section 2, we introduce a parametric regression model using an inverted Weibull distribution. The proposed model has the property that the reversed hazard rates for the lifetimes of any pair of subjects are proportional. The estimation of the parameters of the model is also discussed in Section 2, and tests and confidence intervals are discussed in Section 3. Simulation studies are conducted in Section 4 to assess the finite sample behavior of the estimators. The proposed model is applied to real life data in Section 5 to illustrate its utility. Finally, Section 6 provides the major conclusions of the study.

#### 2. Statistical Model

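Let $T$ be a nonnegative random variable representing the lifetime of a subject with distribution function $F(t)$. Assume that the probability density function of $T$, $f(t)$, exists. The reversed hazard rate of $T$ given in (1) can be written as

$$\lambda(t) = \frac{f(t)}{F(t)} = \frac{d}{dt}\log F(t).$$

Let $z = (z_1, \ldots, z_p)'$ be a $p \times 1$ vector of auxiliary information, which may be time dependent. The proportional reversed hazard (PRH) model is defined by

$$\lambda(t \mid z) = \lambda_0(t)\, g(z; \beta), \tag{2}$$

where $\lambda_0(t)$ is the baseline reversed hazard rate, $g(z; \beta)$ is a nonnegative function of $z$ and the $p \times 1$ vector of regression parameters $\beta$, and $\lambda(t \mid z)$ is the reversed hazard rate of $T$ given the covariate $z$. The PRH model can be expressed in terms of the distribution function as

$$F(t \mid z) = [F_0(t)]^{g(z; \beta)}, \tag{3}$$

where $F(t \mid z)$ is the distribution function of $T$ given $z$ and $F_0(t)$ is the baseline distribution function in the absence of covariates. It should be noted that, for two subjects, the ratio of reversed hazard rates is independent of the time $t$. Semiparametric analysis of the model (2) was recently discussed by Sengupta and Nanda [14]. Our objective here is to carry out the parametric analysis of an inverted Weibull distribution under left censoring.

When the lifetime random variable $T$ follows the inverted Weibull distribution, the baseline distribution function is given by

$$F_0(t) = \exp\left\{-\left(\frac{\theta}{t}\right)^{\alpha}\right\}, \quad t > 0,\ \alpha > 0,\ \theta > 0. \tag{4}$$

The baseline reversed hazard rate of $T$ is then obtained as

$$\lambda_0(t) = \alpha\,\theta^{\alpha}\, t^{-(\alpha + 1)}. \tag{5}$$

Note that the baseline reversed hazard rate is decreasing as $t$ increases. In the presence of the covariate $z$, we have

$$\lambda(t \mid z) = \alpha\,\theta^{\alpha}\, t^{-(\alpha + 1)}\, g(z; \beta). \tag{6}$$

We assume that

$$g(z; \beta) = \exp(z'\beta), \tag{7}$$

so that

$$\lambda(t \mid z) = \alpha\,\theta^{\alpha}\, t^{-(\alpha + 1)} \exp(z'\beta), \tag{8}$$

with

$$F(t \mid z) = \exp\left\{-\left(\frac{\theta}{t}\right)^{\alpha} \exp(z'\beta)\right\}. \tag{9}$$

Suppose that the lifetime random variable $T$ is randomly left censored by $C$. In practice, one could observe the vectors $(X, \delta, z)$, where $X = \max(T, C)$ and $\delta = I(T \ge C)$, with $I(\cdot)$ being the indicator function. Let $(X_i, \delta_i, z_i)$, $i = 1, \ldots, n$, be i.i.d. copies of $(X, \delta, z)$. Then the likelihood function can be written as

$$L \propto \prod_{i=1}^{n} \left[f(x_i \mid z_i)\right]^{\delta_i} \left[F(x_i \mid z_i)\right]^{1 - \delta_i}. \tag{10}$$

Under the inverted Weibull distribution assumption, the likelihood function given in (10) is obtained as

$$L(\alpha, \theta, \beta) \propto \prod_{i=1}^{n} \left[\alpha\,\theta^{\alpha}\, x_i^{-(\alpha + 1)} e^{z_i'\beta}\right]^{\delta_i} \exp\left\{-\left(\frac{\theta}{x_i}\right)^{\alpha} e^{z_i'\beta}\right\}, \tag{11}$$

so that the log likelihood function is

$$\ell(\alpha, \theta, \beta) = K + \sum_{i=1}^{n} \delta_i \left[\log\alpha + \alpha\log\theta - (\alpha + 1)\log x_i + z_i'\beta\right] - \sum_{i=1}^{n} \left(\frac{\theta}{x_i}\right)^{\alpha} e^{z_i'\beta}, \tag{12}$$

where $K$ is a real constant independent of $\alpha$, $\theta$, and $\beta$.

We maximize (12) to estimate the parameters $\alpha$, $\theta$, and $\beta$ by equating the partial derivatives with respect to each parameter to zero as

$$\begin{aligned}
\frac{\partial \ell}{\partial \alpha} &= \sum_{i=1}^{n} \delta_i \left[\frac{1}{\alpha} + \log\frac{\theta}{x_i}\right] - \sum_{i=1}^{n} \left(\frac{\theta}{x_i}\right)^{\alpha} \log\left(\frac{\theta}{x_i}\right) e^{z_i'\beta} = 0, \\
\frac{\partial \ell}{\partial \theta} &= \frac{\alpha}{\theta} \sum_{i=1}^{n} \left[\delta_i - \left(\frac{\theta}{x_i}\right)^{\alpha} e^{z_i'\beta}\right] = 0, \\
\frac{\partial \ell}{\partial \beta_j} &= \sum_{i=1}^{n} z_{ij} \left[\delta_i - \left(\frac{\theta}{x_i}\right)^{\alpha} e^{z_i'\beta}\right] = 0, \quad j = 1, \ldots, p.
\end{aligned} \tag{13}$$

Since there is no closed form solution available for (13), we use numerical methods to estimate the parameters. The observed information matrix is given by

$$I(\alpha, \theta, \beta) = -\begin{pmatrix}
\dfrac{\partial^2 \ell}{\partial \alpha^2} & \dfrac{\partial^2 \ell}{\partial \alpha\,\partial \theta} & \dfrac{\partial^2 \ell}{\partial \alpha\,\partial \beta'} \\
\dfrac{\partial^2 \ell}{\partial \theta\,\partial \alpha} & \dfrac{\partial^2 \ell}{\partial \theta^2} & \dfrac{\partial^2 \ell}{\partial \theta\,\partial \beta'} \\
\dfrac{\partial^2 \ell}{\partial \beta\,\partial \alpha} & \dfrac{\partial^2 \ell}{\partial \beta\,\partial \theta} & \dfrac{\partial^2 \ell}{\partial \beta\,\partial \beta'}
\end{pmatrix}. \tag{14}$$

Note that the matrix (14) is of order $(p + 2) \times (p + 2)$. Under the standard regularity conditions, the vector of estimates $(\hat\alpha, \hat\theta, \hat\beta')'$ is asymptotically $(p + 2)$-variate normal with mean vector $(\alpha, \theta, \beta')'$ and dispersion matrix $\mathcal{I}^{-1}(\alpha, \theta, \beta)$, where $\mathcal{I}(\alpha, \theta, \beta)$ is the Fisher information matrix obtained from $I(\alpha, \theta, \beta)$ by taking the expected values of each entry.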
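As an illustration only, the log likelihood and its score functions can be sketched in Python for a single covariate. The parametrization $F(t \mid z) = \exp\{-(\theta/t)^{\alpha} e^{z\beta}\}$, the function names, and the argument layout are assumptions made for this sketch and are not taken from the authors' code.

```python
import math


def prh_iw_loglik(alpha, theta, beta, x, delta, z):
    """Log likelihood (up to an additive constant) for left-censored data.

    Assumed model: F(t | z) = exp(-(theta / t)**alpha * exp(z * beta)).
    x[i]     : observed time, max(lifetime, censoring time)
    delta[i] : 1 if the lifetime was observed exactly, 0 if left censored
    z[i]     : scalar covariate (one covariate keeps the sketch short)
    """
    if alpha <= 0 or theta <= 0:
        return -math.inf
    total = 0.0
    for xi, di, zi in zip(x, delta, z):
        if di == 1:
            # log of the reversed hazard rate at x_i
            total += (math.log(alpha) + alpha * math.log(theta)
                      - (alpha + 1.0) * math.log(xi) + zi * beta)
        # -log F(x_i | z_i) enters for censored and uncensored cases alike
        total -= (theta / xi) ** alpha * math.exp(zi * beta)
    return total


def prh_iw_score(alpha, theta, beta, x, delta, z):
    """Partial derivatives of the log likelihood w.r.t. alpha, theta, beta."""
    d_alpha = d_theta = d_beta = 0.0
    for xi, di, zi in zip(x, delta, z):
        log_ratio = math.log(theta / xi)
        cum = (theta / xi) ** alpha * math.exp(zi * beta)
        d_alpha += di * (1.0 / alpha + log_ratio) - cum * log_ratio
        d_theta += (alpha / theta) * (di - cum)
        d_beta += zi * (di - cum)
    return d_alpha, d_theta, d_beta
```

Comparing `prh_iw_score` with finite differences of `prh_iw_loglik` is a quick way to check the algebra before passing either function to an optimizer.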

Several algorithms are available to estimate the parameters, either by solving the score equations or by directly optimizing the likelihood function. The Newton-Raphson method is the most commonly used, since the derivatives of the score equations are easy to determine. In this iterative numerical method, the initial values play a vital role because of the logarithm function. In the simulation studies given in Section 4, we use the simplex method proposed by Nelder and Mead [15] to estimate the parameters. The simplex method maximizes the likelihood function directly and does not require the derivatives of the function being optimized.
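A derivative-free fit of this kind can be sketched as follows. This is a minimal variant of the Nelder-Mead simplex search written for illustration (reflection, expansion, inside contraction, and shrink only), not the authors' implementation. The parametrization $F(t \mid z) = \exp\{-(\theta/t)^{\alpha} e^{z\beta}\}$, the single covariate, the log-scale reparametrization, and all names are assumptions.

```python
import math


def neg_loglik(params, data):
    """Negative log likelihood; params = (log alpha, log theta, beta).

    data is a list of (x, delta, z) with delta = 1 for an exact lifetime and
    delta = 0 for a left-censored one. Working on the log scale keeps alpha
    and theta positive without explicit constraints.
    """
    try:
        alpha, theta, beta = math.exp(params[0]), math.exp(params[1]), params[2]
        total = 0.0
        for x, d, z in data:
            if d:
                total += (math.log(alpha) + alpha * math.log(theta)
                          - (alpha + 1.0) * math.log(x) + z * beta)
            total -= (theta / x) ** alpha * math.exp(z * beta)
        return -total
    except (OverflowError, ValueError):
        return math.inf  # treat numerically infeasible points as very bad


def nelder_mead(f, start, step=0.5, tol=1e-10, max_iter=5000):
    """Minimize f with a basic simplex search; returns (point, value)."""
    n = len(start)
    simplex = [list(start)]
    for j in range(n):
        vertex = list(start)
        vertex[j] += step
        simplex.append(vertex)
    values = [f(v) for v in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if values[-1] - values[0] < tol:
            break
        worst = simplex[-1]
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_reflected = f(reflected)
        if f_reflected < values[0]:
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_expanded = f(expanded)
            if f_expanded < f_reflected:
                simplex[-1], values[-1] = expanded, f_expanded
            else:
                simplex[-1], values[-1] = reflected, f_reflected
        elif f_reflected < values[-2]:
            simplex[-1], values[-1] = reflected, f_reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_contracted = f(contracted)
            if f_contracted < values[-1]:
                simplex[-1], values[-1] = contracted, f_contracted
            else:
                # shrink every vertex halfway toward the current best one
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (p - b) for b, p in zip(best, v)]
                                    for v in simplex[1:]]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    i_best = min(range(n + 1), key=lambda i: values[i])
    return simplex[i_best], values[i_best]
```

A call such as `nelder_mead(lambda p: neg_loglik(p, data), [0.0, 0.0, 0.0])` returns the estimates on the transformed scale; exponentiating the first two components gives $\hat\alpha$ and $\hat\theta$. Restarting from several initial values and keeping the fit with the smallest negative log likelihood guards against poor starting points.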

#### 3. Testing and Confidence Intervals for $\beta$

Tests and interval estimates of the parameters can be derived by the likelihood ratio test procedure. We are mainly interested in the regression parameter $\beta$, whereas the parameters of the inverted Weibull distribution are normally treated as nuisance parameters.

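Let the vector of regression parameters be partitioned as $\beta = (\beta_1', \beta_2')'$, where $\beta_1$ and $\beta_2$ are vectors of sizes $k$ and $p - k$, respectively, and let $\gamma$ denote the other parameters in the model. We are interested in testing

$$H_0: \beta_1 = \beta_1^{0},$$

where $\beta_1^{0}$ is the specified regression parameter value. To test $H_0$, we construct the likelihood ratio statistic

$$\Lambda = 2\left[\ell(\hat\beta_1, \hat\beta_2, \hat\gamma) - \ell(\beta_1^{0}, \tilde\beta_2, \tilde\gamma)\right],$$

where $\ell$ denotes the log likelihood function, $\hat\beta_1$, $\hat\beta_2$, and $\hat\gamma$ are the maximum likelihood estimates under the full model, and $\tilde\beta_2$ and $\tilde\gamma$ are the maximum likelihood estimates under $H_0$. For large $n$, $\Lambda$ approximately follows the $\chi^2$ distribution with $k$ degrees of freedom under the null hypothesis.

Alternatively, we can use the test statistic

$$W = (\hat\beta_1 - \beta_1^{0})'\, V_{11}^{-1}\, (\hat\beta_1 - \beta_1^{0}),$$

where $V_{11}$ can be obtained from $V$, the inverse of the observed information matrix, which is partitioned as

$$V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix},$$

with $V_{11}$ the $k \times k$ block corresponding to $\beta_1$. Under the null hypothesis, $W$ approximately follows the $\chi^2$ distribution with $k$ degrees of freedom.

Assuming asymptotic normality, we can construct the $100(1 - a)\%$ confidence interval for the individual regression parameter $\beta_j$ as

$$\hat\beta_j \pm z_{1 - a/2} \sqrt{v_{jj}},$$

where $z_{1 - a/2}$ is the standard normal quantile and $v_{jj}$, the estimated variance of $\hat\beta_j$, can be obtained from the inverse of the observed information matrix.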
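For a single regression coefficient, both procedures reduce to short calculations. The sketch below is illustrative only and its function names are hypothetical. It uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def lrt_1df(loglik_full, loglik_null):
    """Likelihood ratio statistic and p-value for testing one coefficient."""
    stat = 2.0 * (loglik_full - loglik_null)
    # guard against a tiny negative statistic caused by optimizer tolerance
    return stat, chi2_sf_1df(max(stat, 0.0))


def wald_interval(estimate, std_error, z_quantile=1.959964):
    """Normal-theory interval; the default quantile gives 95% coverage."""
    half_width = z_quantile * std_error
    return estimate - half_width, estimate + half_width
```

As a consistency check, `chi2_sf_1df(0.0074)` is approximately 0.93, which agrees with the likelihood ratio statistic and $p$-value reported in Section 5 for the single covariate.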

Another important problem is the selection of important covariates in proportional reversed hazard models. Since we assume a parametric model, we can use variable selection criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). To assess the adequacy of the parametric model, Cox-Snell residuals can be used, as illustrated in Section 5 with a case example.
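Both criteria depend only on the maximized log likelihood, the number of estimated parameters, and (for BIC) the sample size. A minimal sketch with hypothetical function names:

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2.0 * n_params - 2.0 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k log n - 2 log L (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * loglik
```

Candidate covariate sets would each be fitted by maximum likelihood and then ranked by either criterion.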

#### 4. Performance Analysis

To assess the performance of the proposed method, we carried out a large number of simulations. We generated samples of size 100 from an inverted Weibull distribution, with different values of parameters as and . We considered a single covariate generated from the Uniform(0,1) distribution and a regression parameter assumed to be . We generated the censoring times from the exponential distribution with the parameter . We chose this parameter so that the percentage of censored observations is between 10 and 20 percent. We used the simplex method proposed by Nelder and Mead [15] to estimate the parameters. We repeated the study 10000 times and computed the mean and standard deviation of the parameter estimates. The entire study was repeated for a sample size of 250. The summary of the parameter estimates is given in Table 1.
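One replicate of such a design can be generated by inverting the conditional distribution function. The sketch below assumes the parametrization $F(t \mid z) = \exp\{-(\theta/t)^{\alpha} e^{z\beta}\}$, which gives $T = \theta\,[e^{z\beta} / (-\log U)]^{1/\alpha}$ for $U \sim$ Uniform(0,1), and treats the exponential parameter as a rate. Both choices, the function names, and any parameter values passed in are assumptions for illustration, not the settings used in the study.

```python
import math
import random


def simulate_prh_iw(n, alpha, theta, beta, cens_rate, seed=None):
    """Generate n left-censored observations (x, delta, z).

    x     : max(lifetime, censoring time)
    delta : 1 if the lifetime is observed exactly, 0 if left censored
    z     : covariate drawn from Uniform(0, 1)
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.random()
        u = rng.random()
        while u == 0.0:            # avoid log(0) in the inversion step
            u = rng.random()
        t = theta * (math.exp(z * beta) / -math.log(u)) ** (1.0 / alpha)
        c = rng.expovariate(cens_rate)   # exponential censoring time
        data.append((max(t, c), 1 if t >= c else 0, z))
    return data


def censored_fraction(data):
    """Proportion of left-censored observations in a simulated sample."""
    return sum(1 for _, d, _ in data if d == 0) / len(data)
```

In practice `cens_rate` would be tuned by trial until `censored_fraction` falls in the target range, and the fit would be repeated over many replicates to obtain the mean and standard deviation of the estimates.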

From Table 1 we can see that the means of the parameter estimates based on 10000 simulations are very close to the true parameter values and that the standard deviations are small. As the sample size increases, both the standard error and the bias of the estimates decrease. It should be noted that there is a slight positive bias in all cases, although it is negligibly small. Since there are no comparable models based on reversed hazard rates, we did not perform any comparison studies.

#### 5. An Example

We consider an extract of left censored data from an Australian twin study given in Duffy et al. [16]. The data consist of information on the age at appendectomy of monozygotic and dizygotic twins. There are observations with missing age at onset, and therefore the data are left censored. The individuals having age at onset of less than 11 are left censored. The covariate, namely, zygosity, has values from 1 to 6. This data set consists of 54 observations, of which 15 are left censored. We use these data to illustrate the utility of the parametric reversed hazard rate model. Probability plotting and a statistical test supported the inverted Weibull distribution for the data. We use the simplex method to estimate the parameters. Since the parameter values are unknown, and to avoid the effect of inappropriate initial values, we consider different initial values and choose the estimates that give the largest likelihood value. Estimates of the parameters are , , and . The 95% confidence interval for $\beta$ indicates that the regression coefficient corresponding to zygosity is not significantly different from zero; that is, the effect of zygosity is negligible. This conclusion is also supported by the likelihood ratio test statistic value of 0.0074, which has a $p$-value of 0.93.

We use a Cox-Snell residual plot to assess the goodness of fit. The Cox-Snell residual is defined by If the model fits the data, then the residuals should have a standard exponential distribution, so that a hazard plot of residuals versus the Nelson-Aalen estimator of the cumulative hazard of the residuals will be a straight line with slope one. A plot of Cox-Snell residuals against the Nelson-Aalen estimates of the cumulative hazard rate of residuals is given in Figure 1, which shows that the fit is reasonably good.
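The residual plot can be computed along the following lines. This sketch assumes that the residual is taken as $r_i = -\log \hat F(x_i \mid z_i)$ under the parametrization $F(t \mid z) = \exp\{-(\theta/t)^{\alpha} e^{z\beta}\}$. That form is an assumption and not necessarily the definition used in the paper. With this choice, $-\log F(T \mid z)$ is standard exponential under a correct model, and a left-censored lifetime yields a right-censored residual, so the usual Nelson-Aalen estimator applies. Function names are hypothetical and ties among residuals are ignored.

```python
import math


def cox_snell_residuals(x, z, alpha, theta, beta):
    """r_i = -log F_hat(x_i | z_i) under the assumed inverted Weibull PRH model."""
    return [(theta / xi) ** alpha * math.exp(zi * beta) for xi, zi in zip(x, z)]


def nelson_aalen(residuals, delta):
    """Nelson-Aalen cumulative hazard evaluated at each uncensored residual.

    delta[i] = 1 marks an exactly observed residual, 0 a censored one.
    Returns (residual, cumulative hazard) pairs in increasing residual order.
    """
    order = sorted(range(len(residuals)), key=lambda i: residuals[i])
    at_risk = len(residuals)
    cum_hazard = 0.0
    points = []
    for i in order:
        if delta[i] == 1:
            cum_hazard += 1.0 / at_risk
            points.append((residuals[i], cum_hazard))
        at_risk -= 1               # the subject leaves the risk set either way
    return points
```

Plotting the returned pairs and comparing them with the line through the origin with slope one gives the diagnostic described above.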

#### 6. Conclusions

Proportional reversed hazard rate models are well suited to modeling left censored lifetime data. In this paper, we proposed a parametric PRH model assuming that the lifetime data follow an inverted Weibull distribution. The estimation and hypothesis testing of the parameters of the model have been discussed in detail. The performance of the proposed model was assessed using a large number of Monte Carlo simulations. Our simulation results indicate that the proposed model performs well. We applied the proposed model to a real life example to illustrate the utility of the method. Recently, Bayesian methodology has been extensively employed in the analysis of lifetime data. Inference procedures for the proposed model under an appropriate prior distribution remain a topic for further research. The present work can be easily extended to any location-scale family of distributions.

#### Conflict of Interests

The authors declare that there is no conflict of interest for this paper.

#### Acknowledgments

The authors would like to thank the editor and an anonymous referee for their valuable comments and suggestions, which substantially improved the overall quality of an earlier version of this paper. The research of Dr. Variyath is supported by a grant from the Natural Sciences and Engineering Research Council of Canada.

#### References

1. J. F. Lawless, *Statistical Models and Methods for Lifetime Data*, John Wiley & Sons, New York, NY, USA, 2003.
2. P. K. Andersen, O. Borgan, R. D. Gill, and N. Keiding, *Statistical Models Based on Counting Processes*, Springer, New York, NY, USA, 1993.
3. R. E. Barlow, A. W. Marshall, and F. Proschan, “Properties of probability distributions with monotone hazard rate,” *Annals of Mathematical Statistics*, vol. 34, pp. 375–389, 1963.
4. J. Keilson and U. Sumita, “Uniform stochastic ordering and related inequalities,” *Canadian Journal of Statistics*, vol. 15, pp. 63–69, 1982.
5. H. W. Block, T. H. Savits, and H. Singh, “The reversed hazard rate function,” *Probability in the Engineering and Informational Sciences*, vol. 12, no. 1, pp. 69–90, 1998.
6. M. S. Finkelstein, “On the reversed hazard rate,” *Reliability Engineering and System Safety*, vol. 78, pp. 71–75, 2002.
7. N. U. Nair, P. G. Sankaran, and G. Asha, “Characterizations of distributions using reliability concepts,” *Journal of Applied Statistical Science*, vol. 14, no. 3-4, pp. 237–241, 2005.
8. R. C. Gupta, P. L. Gupta, and R. D. Gupta, “Modeling failure time data by Lehman alternatives,” *Communications in Statistics*, vol. 27, no. 4, pp. 887–904, 1998.
9. C.-D. Lai and M. Xie, *Stochastic Ageing and Dependence for Reliability*, Springer, New York, NY, USA, 2006.
10. A. W. Marshall and I. Olkin, *Life Distributions*, Springer, New York, NY, USA, 2007.
11. X. Li and M. Xu, “Reversed hazard rate order of equilibrium distributions and a related aging notion,” *Statistical Papers*, vol. 49, no. 4, pp. 749–767, 2008.
12. G. Horny, “Inference in mixed proportional hazard models with $K$ random effects,” *Statistical Papers*, vol. 50, no. 3, pp. 481–499, 2009.
13. C. A. V. Tojeiro and F. Louzada, “A general threshold stress hybrid hazard model for lifetime data,” *Statistical Papers*, vol. 53, no. 4, pp. 833–848, 2012.
14. D. Sengupta and A. K. Nanda, “The proportional reversed hazards regression model,” *Journal of Statistical Theory and Applications*, vol. 18, no. 4, pp. 461–476, 2011.
15. J. A. Nelder and R. Mead, “A simplex method for function minimization,” *Computer Journal*, vol. 7, pp. 308–313, 1965.
16. D. Duffy, N. G. Martin, and J. D. Mathews, “Appendectomy in Australian twins,” *American Journal of Human Genetics*, vol. 47, no. 3, pp. 590–592, 1990.