Abstract

We consider nonparametric estimation of the density and hazard rate functions for right-censored data using kernel smoothing techniques. The “classical” fixed symmetric kernel estimator of these functions performs well in the interior region but suffers from bias in the boundary region. Here, we propose new estimators of the density and the hazard rate functions based on gamma kernels. The estimators are free of boundary bias and achieve the optimal rate of convergence in terms of integrated mean squared error. The mean integrated squared error, the asymptotic normality, and the law of the iterated logarithm are studied. The gamma estimators are compared by simulation with the local linear estimator for the density function and with the hazard rate estimator proposed by Müller and Wang (1994), both of which are free from boundary bias.

1. Introduction

Censored data arise in many contexts, for example, in medical follow-up studies in which the occurrence of the event times (called survival) of individuals may be prevented by the previous occurrence of another competing event (called censoring). In such studies, interest focuses on estimating the underlying density and/or hazard rate of the survival time. Nonparametric estimation using kernel smoothing methods has received considerable attention in the statistical literature. A popular approach for estimating the density function and the hazard rate function uses a fixed symmetric kernel density with bounded support and a bandwidth parameter. The kernel determines the shape of the local neighborhood while the bandwidth controls the degree of smoothness. In order to get a reasonable estimator, both the kernel and the bandwidth parameter have to be chosen carefully. For a review of kernel smoothing approaches, we refer the reader to Silverman [1] and Izenman [2] for uncensored data and Singpurwalla and Wong [3], Tanner and Wong [4], Padgett and McNichols [5], Mielniczuk [6], and Lo et al. [7] in the case of right censoring.

It is well known from the literature that the kernel has less impact than the bandwidth on the resulting estimate. However, when the density function of the data has bounded support, using the classical kernel leads to an estimator with a large bias near the endpoints. This bias, also called the boundary effect, becomes a serious drawback when a large portion of the sampled data lies in the boundary region. Indeed, near the endpoints the support of the kernel extends beyond the available range of the data, so the bias of the resulting estimator becomes larger. This is especially the case in survival analysis, since the survival time is assumed to be a nonnegative variable. As a result, near zero, the symmetric kernel estimators of the density and the hazard functions underestimate the true functions.

For uncensored data, several methods are available in the literature to overcome this problem, for example, the reflection method of Schuster [8], the smooth optimum kernel of Müller [9], the local linear estimator of Lejeune and Sarda [10], the transformation approach of Marron and Ruppert [11], and the boundary kernel of Jones [12] and Jones and Foster [13]. The local linear estimator is a special case of the boundary kernel method. The main idea behind the boundary kernel method is to use adaptive kernels in the boundary region and a fixed symmetric kernel in the interior region. For nonnegative data, and in order to overcome the boundary bias problem, Chen [14] considers the gamma kernel estimator. Simulation results of Jones [12] and Chen [14] show that the local linear estimator outperforms the boundary kernel estimator of Müller [9]. For right-censored data, Müller and Wang [15] resolve this problem by proposing a new class of polynomial boundary kernel estimators for the hazard rate function, where the kernel and the bandwidth parameter depend on the point where the estimate is to be evaluated. Hess et al. [16] show numerically its performance via an extensive simulation study.

In this paper, we adapt the gamma kernel smoothing procedure to estimate the marginal density and the hazard function of positive independent survival data that are subject to right censoring. We show that both estimators are robust against boundary problems. We also establish the mean integrated squared error, the asymptotic normality, and the law of the iterated logarithm of the two estimators. Furthermore, via a Monte Carlo study, the finite sample performance of the estimators is investigated under various scenarios.

The paper is organized as follows. Section 2 introduces the gamma kernel estimators for the density and the hazard rate functions for right-censored data. In Section 3, we establish the asymptotic properties of the gamma kernel estimators. In Section 4, we investigate the finite sample properties of the gamma kernel estimators. Section 5 provides an application to the classical bone marrow transplant data. The last section is an appendix that gathers the proofs.

2. Methodology

Let $T_1,\ldots,T_n$ (survival times) and $C_1,\ldots,C_n$ (censoring times) be two independent sequences of i.i.d. nonnegative random variables with distribution functions $F$ and $G$, respectively. Under the censoring model, instead of observing $T_i$, we observe the pair $(X_i,\delta_i)$, where $X_i=\min(T_i,C_i)$ and $\delta_i=I(T_i\le C_i)$ with $I(\cdot)$ being the indicator function. We denote by $f$ the density function of $F$ and by $h=f/(1-F)$ the corresponding hazard function. The nonparametric maximum likelihood approach proposed by Kaplan and Meier [17] leads to the estimator of $F$ given by
$$\hat F(x)=\begin{cases}1-\displaystyle\prod_{1\le i\le n,\;X_{(i)}\le x}\left(\frac{n-i}{n-i+1}\right)^{\delta_{(i)}} & \text{if } x<X_{(n)},\\[2mm] 1 & \text{otherwise},\end{cases}\tag{2.1}$$
where $X_{(1)}\le\cdots\le X_{(n)}$ are the ordered $X_i$'s and $\delta_{(1)},\ldots,\delta_{(n)}$ are the corresponding $\delta_i$'s. This estimator has been studied by many authors; see, among others, Breslow and Crowley [18], Wang [19], and Stute and Wang [20]. Lo and Singh [21] expressed the KM estimator as an i.i.d. mean process with a remainder of negligible order. This result was improved by Lo et al. [7] and will be useful in this paper to make the connection between the uncensored and censored cases.
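As an illustration of (2.1), the following minimal Python sketch evaluates the Kaplan-Meier estimate at the ordered observations. The function name `kaplan_meier` and the tie-breaking rule (uncensored observations first) are assumptions made for this example and are not taken from the paper.

```python
def kaplan_meier(times, events):
    """Return the ordered observations and the KM estimate of F at each of them.

    times  : observed X_i = min(T_i, C_i)
    events : delta_i, 1 if the survival time was observed and 0 if censored
    """
    n = len(times)
    # Sort by time; at ties, uncensored observations come first (assumed convention).
    order = sorted(range(n), key=lambda i: (times[i], -events[i]))
    surv = 1.0
    xs, cdf = [], []
    for rank, i in enumerate(order, start=1):
        # Factor ((n - rank) / (n - rank + 1)) ** delta_(rank) of the product in (2.1).
        if events[i] == 1:
            surv *= (n - rank) / (n - rank + 1)
        xs.append(times[i])
        cdf.append(1.0 - surv)
    # By (2.1), the estimate equals 1 from the largest observation onwards.
    cdf[-1] = 1.0
    return xs, cdf


if __name__ == "__main__":
    # Second observation is censored, so the estimate does not jump there.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])[1])  # [0.25, 0.25, 0.625, 1.0]
```

Without censoring (all `events` equal to 1) the sketch reduces to the empirical distribution function.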

From now on, we will denote the right endpoint of a given (sub)distribution $L$ by $T_L$, that is, $T_L=\sup\{t\ge 0,\ L(t)<1\}$. We will also use the notation $\bar L$ for the corresponding survival function, that is, $\bar L(\cdot)=1-L(\cdot)$. Let $H$ be the distribution function of $X$, that is, $\bar H=\bar F\,\bar G$. We suppose that $T_G\ge T_F$ or equivalently $T_H=T_F$. In the remainder of this paper, except if mentioned otherwise, all the integrations are taken over the interval $[0,T]$, where $T=T_F$. Based on the smooth kernel technique, we propose to estimate the density by the gamma kernel estimator defined as follows:
$$\hat f_b(x)=\int K(x,b)(t)\,d\hat F(t)=\sum_{i=1}^{n}K(x,b)\big(X_{(i)}\big)\,\mathbb{W}_i,\tag{2.2}$$
where the kernel $K$ is given by
$$K(x,b)(t)=\frac{t^{\rho_b(x)-1}\exp(-t/b)}{b^{\rho_b(x)}\,\Gamma(\rho_b(x))},\qquad \rho_b(x)=\begin{cases}\dfrac{x}{b} & \text{if } x\ge 2b,\\[2mm] \dfrac14\left(\dfrac{x}{b}\right)^{2}+1 & \text{if } x\in[0,2b),\end{cases}\tag{2.3}$$
the weights $\mathbb{W}_i$'s are the jumps of $\hat F$ at $X_i$ for $i=2,\ldots,n$, and $0<b\equiv b_n\to 0$ is the bandwidth parameter.
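To make the kernel in (2.3) concrete, the sketch below evaluates it as a gamma density with shape $\rho_b(x)$ and scale $b$, and includes a simple trapezoidal rule for checking numerically that it integrates to one over $[0,\infty)$, at the boundary as well as in the interior. The helper names `rho`, `gamma_kernel`, and `integrate` are illustrative choices, not notation from the paper.

```python
import math


def rho(x, b):
    """Shape parameter rho_b(x) of (2.3); it is continuous at x = 2b."""
    if x >= 2 * b:
        return x / b
    return 0.25 * (x / b) ** 2 + 1.0


def gamma_kernel(x, b, t):
    """Gamma density with shape rho_b(x) and scale b, evaluated at t."""
    if t < 0:
        return 0.0  # no weight outside the support [0, infinity)
    shape = rho(x, b)
    if t == 0:
        # shape >= 1 always; the density at 0 is 1/b when shape == 1, else 0.
        return 1.0 / b if shape == 1.0 else 0.0
    # Work on the log scale to avoid overflow in Gamma(shape) for small b.
    log_k = (shape - 1.0) * math.log(t) - t / b - shape * math.log(b) - math.lgamma(shape)
    return math.exp(log_k)


def integrate(func, lower, upper, steps=20000):
    """Plain trapezoidal rule, sufficient for this sanity check."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for j in range(1, steps):
        total += func(lower + j * h)
    return total * h


if __name__ == "__main__":
    b = 0.1
    for x in (0.0, 0.1, 1.0):
        mass = integrate(lambda t: gamma_kernel(x, b, t), 0.0, 10.0)
        print(x, round(mass, 4))
```

Because the kernel is a density on the positive half-line for every $x\ge 0$, no mass is lost at the boundary, which is the property the text relies on.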

Naturally, the gamma kernel estimator that we propose for the hazard rate is
$$\hat h_b(x)=\frac{\hat f_b(x)}{1-\hat F(x)}.\tag{2.4}$$
As we will see later, those estimators are free of boundary bias; this is due to the fact that the gamma kernel is defined on the positive real line, and so no weight is assigned outside the support of the underlying density and/or hazard rate functions. The shape of the gamma kernel and the amount of smoothing are not only controlled by the bandwidth parameter, but also vary according to the position where the function is estimated. For uncensored data, Chen [14] shows that the gamma kernel density estimator achieves the optimal rate of convergence for the mean integrated squared error within the class of nonnegative kernel density estimators. Bouezmarni and Scaillet [22] state the uniform weak consistency for the gamma kernel estimator on any compact set and also the weak convergence in terms of mean integrated absolute error. In the next section, we will prove that even when the data are censored, the gamma kernel estimators perform well in both the interior and the boundary regions.
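The estimators (2.2) and (2.4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, ties are broken by placing uncensored observations first, any mass left by the Kaplan-Meier estimator is placed at the largest observation (as (2.1) implies), and the bandwidth in the usage example is fixed by hand rather than chosen by a data-driven rule.

```python
import math
import random


def km_jumps(times, events):
    """Ordered observations and the jumps of the Kaplan-Meier estimator (2.1)."""
    n = len(times)
    # Sort by time; at ties, uncensored observations come first (assumed convention).
    order = sorted(range(n), key=lambda i: (times[i], -events[i]))
    surv, xs, jumps = 1.0, [], []
    for rank, i in enumerate(order, start=1):
        new_surv = surv * (n - rank) / (n - rank + 1) if events[i] else surv
        xs.append(times[i])
        jumps.append(surv - new_surv)
        surv = new_surv
    # (2.1) sets the estimate to 1 from X_(n) onwards, so leftover mass goes there.
    jumps[-1] += surv
    return xs, jumps


def gamma_kernel(x, b, t):
    """Kernel (2.3) evaluated at an observation t > 0."""
    if t <= 0:
        return 0.0  # simplification: observations are assumed strictly positive
    shape = x / b if x >= 2 * b else 0.25 * (x / b) ** 2 + 1.0
    log_k = (shape - 1.0) * math.log(t) - t / b - shape * math.log(b) - math.lgamma(shape)
    return math.exp(log_k)


def density_hat(x, b, xs, jumps):
    """Gamma kernel density estimator (2.2)."""
    return sum(w * gamma_kernel(x, b, t) for t, w in zip(xs, jumps) if w > 0)


def hazard_hat(x, b, xs, jumps):
    """Gamma kernel hazard estimator (2.4); NaN once the KM survival reaches 0."""
    cdf = sum(w for t, w in zip(xs, jumps) if t <= x)
    return density_hat(x, b, xs, jumps) / (1.0 - cdf) if cdf < 1.0 else float("nan")


if __name__ == "__main__":
    rng = random.Random(0)
    n = 2000
    surv_times = [rng.expovariate(1.0) for _ in range(n)]  # true hazard is 1
    cens_times = [rng.uniform(0.0, 3.0) for _ in range(n)]
    times = [min(t, c) for t, c in zip(surv_times, cens_times)]
    events = [1 if t <= c else 0 for t, c in zip(surv_times, cens_times)]
    xs, jumps = km_jumps(times, events)
    for x in (0.0, 0.25, 0.5, 1.0):
        print(x, round(hazard_hat(x, 0.05, xs, jumps), 3))
```

Each evaluation costs one pass over the sample, which is adequate for the sample sizes considered in the simulations; a vectorised implementation would be preferable for larger data sets.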

3. Asymptotic Properties

In this section, we state the asymptotic properties of the gamma kernel estimator of the density and the hazard rate functions. We start with the following theorem which will play an important role for the remainder of this section. To be concise, in the following, we will denote by $Z$ either the density or the hazard rate function and by $\hat Z_b$ the corresponding gamma kernel estimator.

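Theorem 3.1. Assume that $f$ is twice continuously differentiable. If (i) $\log^2 n/(n b_n^{3/2})\to 0$, then the integrated mean squared error of $\hat Z_b$ is
$$\mathrm{IMSE}\big(\hat Z_b\big)=b^{2}\int B^{2}(x)\,dx+n^{-1}b^{-1/2}\int V(x)\,dx+o\big(b^{2}\big)+o\big(n^{-1}b^{-1/2}\big),\tag{3.1}$$
where for the density function $B$ and $V$ are given by
$$B_f(x)=\frac12\,x f''(x),\qquad V_f(x)=\frac{1}{2\sqrt{\pi}}\,x^{-1/2}\,\frac{f(x)}{\bar G(x)},\tag{3.2}$$
and for the hazard rate function $B$ and $V$ are given by
$$B_h(x)=\frac{B_f(x)}{\bar F(x)},\qquad V_h(x)=\frac{V_f(x)}{\bar F^{2}(x)}.\tag{3.3}$$
The optimal bandwidth parameter which minimizes the asymptotic $\mathrm{IMSE}(\hat Z_b)$ is given by
$$b^{*}=\left(\frac14\,\frac{\int V(x)\,dx}{\int B^{2}(x)\,dx}\right)^{2/5}n^{-2/5},\tag{3.4}$$
and the corresponding optimal asymptotic integrated mean squared error is
$$\mathrm{IMSE}\big(\hat Z_{b^{*}}\big)\sim\frac{5}{4^{4/5}}\left(\int V(x)\,dx\right)^{4/5}\left(\int B^{2}(x)\,dx\right)^{1/5}n^{-4/5}.\tag{3.5}$$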
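The relation between the leading terms of (3.1), the optimal bandwidth (3.4), and the optimal value (3.5) can be checked numerically. In the sketch below, `int_b2` and `int_v` stand for $\int B^2(x)\,dx$ and $\int V(x)\,dx$; the numerical values used are arbitrary placeholders chosen for illustration, not quantities from the paper.

```python
def amise(b, n, int_b2, int_v):
    """Leading terms of (3.1): b^2 * int B^2 + n^{-1} b^{-1/2} * int V."""
    return b ** 2 * int_b2 + int_v / (n * b ** 0.5)


def optimal_bandwidth(n, int_b2, int_v):
    """Closed-form minimiser of the leading terms, as in (3.4)."""
    return (int_v / (4.0 * int_b2)) ** 0.4 * n ** -0.4


def optimal_amise(n, int_b2, int_v):
    """Closed-form minimum of the leading terms, as in (3.5)."""
    return 5.0 / 4.0 ** 0.8 * int_v ** 0.8 * int_b2 ** 0.2 * n ** -0.8


if __name__ == "__main__":
    n, int_b2, int_v = 500, 0.3, 0.7  # placeholder values
    b_star = optimal_bandwidth(n, int_b2, int_v)
    print(b_star, amise(b_star, n, int_b2, int_v), optimal_amise(n, int_b2, int_v))
```

Setting the derivative $2b\int B^2-\tfrac12 n^{-1}b^{-3/2}\int V$ to zero gives $b^{5/2}=\int V/(4n\int B^2)$, which is the source of the factor $1/4$ in (3.4) and of the constant $5/4^{4/5}$ in (3.5).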

Theorem 3.1 states that the gamma kernel estimators of the density and hazard rate functions are free of boundary bias and provides the theoretical formula of the optimal bandwidth. However, in real data analysis, to choose an appropriate bandwidth, one needs to use a data-driven procedure, for example, cross-validation, the bootstrap, or the method proposed by Bouezmarni and Scaillet [22]. Of course, all those methods need to be carefully adapted to the censoring case. Also, from (A.25) in the appendix, the asymptotic variances of the gamma kernel estimators are of a larger order, $O(n^{-1}b^{-1})$, near the boundaries than in the interior, where they are $O(n^{-1}b^{-1/2})$. However, Theorem 3.1 shows that the impact of the increased variance near the boundary on the mean integrated squared error is negligible and that the optimal rate of convergence in terms of integrated mean squared error is achieved by the gamma kernel estimators.

The following proposition deals with the asymptotic normality of the gamma kernel estimators.

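Proposition 3.2. Under the conditions of Theorem 3.1, if (ii) $n b^{5/2}=o(1)$, then for any $x$ such that $f(x)>0$ and $x\le T$, one has
$$n^{1/2}b^{1/4}\,\frac{\hat Z_b(x)-Z(x)}{\sqrt{V^{*}(x)}}\longrightarrow N(0,1),\quad\text{in distribution},\tag{3.6}$$
where
$$V^{*}(x)=\begin{cases}V(x) & \text{if } x/b\to\infty,\\[2mm] \dfrac{\Gamma(2\kappa+1)\,b^{-1/2}}{2^{1+2\kappa}\,\Gamma^{2}(\kappa+1)}\,2\sqrt{\pi}\,x^{1/2}\,V(x) & \text{if } x/b\to\kappa.\end{cases}\tag{3.7}$$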

We establish in the next proposition the law of the iterated logarithm for the gamma kernel estimators. Let $\Phi_x(n,b)=2n^{-1}b^{-1/2}\,V^{*}(x)\log\log(n)$, where $V^{*}$ is defined in Proposition 3.2.

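Proposition 3.3. Under the conditions of Theorem 3.1, if
(iii) $b_n/b_m\to 1$ as $n,m\to\infty$ such that $n/m\to 1$, and $\log^{4}(n)/\big((nb)^{2}\,\Phi_x(n,b)\big)\to 0$,
(iv) $b_n=o\big((\log\log(n)/n)^{2/5}\big)$,
then
$$\limsup_{n\to\infty}\,\Phi_x(n,b)^{-1/2}\,\big|\hat Z_b(x)-Z(x)\big|=1,\quad\text{a.s.}\tag{3.8}$$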

4. Simulation Results

The finite sample performance of the proposed methodology is studied in this section. Two models are considered.
(i) Model A: the survival times follow an exponential distribution, and the censoring times are generated from a uniform distribution on $[0,c]$, where $c$ is chosen to be the solution of the equation $pc+\exp(-c)=1$ and $p$ is the desired proportion of censoring.
(ii) Model B: the survival times follow a Weibull distribution with scale parameter $b=2$ and shape parameter $a=1.2$, and the censoring times are also generated from a Weibull distribution with shape parameter $a$ and a scale parameter given by $b\big((1-p)/p\big)^{1/a}$. This ensures that the degree of censoring is equal to $p$.
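The two data-generating mechanisms can be sketched as follows. The sketch assumes a unit-rate exponential distribution in Model A, which is the case in which the equation for $c$ yields a censoring proportion of exactly $p$ (since $P(C<T)=(1-e^{-c})/c$); the function names and the bisection solver are illustrative choices, not the authors' code.

```python
import math
import random


def solve_c(p, tol=1e-12):
    """Positive root of p*c + exp(-c) = 1 for 0 < p < 1, found by bisection."""
    def g(c):
        return p * c + math.exp(-c) - 1.0
    lo, hi = 1e-9, 1.0
    # g is negative just to the right of 0 and eventually positive, so bracket the root.
    while g(hi) <= 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def simulate(model, n, p, rng):
    """Draw n right-censored observations (X_i, delta_i) from Model A or Model B."""
    if model == "A":
        c = solve_c(p)
        surv = [rng.expovariate(1.0) for _ in range(n)]  # unit rate is an assumption
        cens = [rng.uniform(0.0, c) for _ in range(n)]
    else:
        a, b = 1.2, 2.0  # shape and scale of the survival times in Model B
        cens_scale = b * ((1.0 - p) / p) ** (1.0 / a)
        # random.weibullvariate takes (scale, shape) in that order.
        surv = [rng.weibullvariate(b, a) for _ in range(n)]
        cens = [rng.weibullvariate(cens_scale, a) for _ in range(n)]
    times = [min(t, c_i) for t, c_i in zip(surv, cens)]
    events = [1 if t <= c_i else 0 for t, c_i in zip(surv, cens)]
    return times, events


if __name__ == "__main__":
    rng = random.Random(1)
    for model in ("A", "B"):
        for p in (0.1, 0.25, 0.5):
            _, events = simulate(model, 20000, p, rng)
            print(model, p, round(1.0 - sum(events) / len(events), 3))
```

For Model B the check is exact in expectation: two Weibull variables with a common shape $a$ and scales $b$ and $b_C$ satisfy $P(C<T)=b_C^{-a}/(b^{-a}+b_C^{-a})$, which equals $p$ for the censoring scale given above.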

First, we show the results for the density estimator. To evaluate the performance of the gamma kernel estimator for the density function, we compare the latter with the local linear estimator of Jones [12] adapted for the right censoring case. The local linear method is known to be a robust technique against boundary bias problems. We consider different sample sizes, $n=125, 250, 500, 1000$. As a measure of the errors of the estimators, we analyze the mean and the standard deviation of the $L_2$-norm.

Tables 1 and 2 provide the results obtained with 1000 replications for the mean and the standard deviation, respectively. First, as expected, the mean integrated squared error of both estimators decreases as the sample size increases; see Table 1. This is true for both models and all degrees of censoring. For example, in model A with 10% censoring, the mean error of the gamma kernel estimator decreases from 0.0117 to 0.0101 when the sample size goes from 125 to 250. Note that, for 50% censoring, the rate at which the mean error decreases is much smaller. Second, except for the 50% censoring case with model A, the gamma kernel estimator clearly outperforms the local linear kernel estimator. Third, for model B, the mean integrated squared error increases with the degree of censoring, as expected. In fact, for model B, the cumulative distribution function of the censoring times increases when the degree of censoring $p$ increases, which implies that the mean integrated squared error of the gamma kernel estimators increases. From Table 2, one can see that, for both models and both estimators, the variance decreases with the sample size, for all degrees of censoring. For model B, the variance of the gamma kernel estimator increases with the degree of censoring. Another point worth noting is that, for model B, the gamma kernel estimator dominates the local linear estimator in terms of variance in all situations. For model A, the variance of the gamma kernel estimator is smaller than the variance of the local linear estimator for $p=0.1$ and $0.25$, but not for $p=0.5$.

For the hazard function, we compare our gamma kernel estimator with the boundary-corrected hazard estimator of Müller and Wang [15]. Table 3 shows the mean and the variance of the $L_2$ errors based on 1000 replications for both $n=125$ and $n=250$. The same data-generating procedure as for the density function was used. These results clearly demonstrate that our method is far more efficient than the method proposed by Müller and Wang [15].

5. Application

To illustrate our approach with real data, we consider the classical bone marrow transplant study. The variable of interest is the disease-free survival time, that is, time to relapse or death. Among a total of 137 observations, there are 83 censored times, all of which are caused by the end of the follow-up period. The data together with a detailed description of the study can be found in [23, Section 1.3]. Figure 1 shows the estimated hazard function using both our method (G) and that proposed by Müller and Wang [15] (MW) with boundary correction. For the MW estimator, we use the data-driven global optimal bandwidth as proposed by the authors. To calculate the MW estimator and its bandwidth parameter, we make use of the R package muhaz; see Hess and Gentleman [24]. The estimator of Müller and Wang [15] shown in Figure 1 is based on the bandwidth $b=204.5$, while for the gamma kernel estimator $b=153.4$. The latter is a data-driven bandwidth based on the least squares cross-validation method of Marron and Padgett [25] adapted to our case. Note that this is just a practical choice and may be far from the optimal one. From Figure 1, one cannot decide which estimator is better but, given the results of our simulation study, we suspect that the MW method has a global tendency to underestimate the real hazard function and so the real risks after a bone marrow transplant.

Appendix

We start this section with some notation. Let
$$g(t)=\int_0^{t}\big(1-H(s)\big)^{-2}\,dH_1(s),\tag{A.1}$$
where $H_1(t)=P(X_i\le t,\ \delta_i=1)=\int_0^{t}\big(1-G(s)\big)\,dF(s)$ denotes the subdistribution of the uncensored observations. For positive real numbers $z$ and $t$, and $\delta=0$ or $1$, let
$$\xi(z,\delta,t)=-g(z\wedge t)+\big(1-H(z)\big)^{-1}I(z\le t,\ \delta=1).\tag{A.2}$$
We also set $\xi_i(x)=\xi(X_i,\delta_i,x)$. First note that
$$\mathbb{E}\,\xi_i(x)=0,\qquad \operatorname{Cov}\big(\xi_i(t),\xi_i(s)\big)=g(s\wedge t).\tag{A.3}$$
The following two lemmas will play an important role in the proofs. The first lemma is due to Lo et al. [7], who expressed the KM estimator as an i.i.d. mean process with a remainder of negligible order.

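Lemma A.1 (see [7]). For $t\le T$,
$$\hat F(t)-F(t)=n^{-1}\bar F(t)\sum_{i=1}^{n}\xi_i(t)+r_n(t),\tag{A.4}$$
where
$$\sup_{0\le x\le T}\big|r_n(x)\big|=O\!\left(\frac{\log n}{n}\right)\quad\text{a.s.},\tag{A.5}$$
and, for any $\alpha\ge 1$,
$$\sup_{0\le x\le T}\mathbb{E}\big|r_n(x)\big|^{\alpha}=O\!\left(\left(\frac{\log n}{n}\right)^{\alpha}\right).\tag{A.6}$$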

The next lemma gives a strong approximation of the gamma kernel estimator of the density. This lemma permits us to derive the asymptotic properties of the gamma kernel estimator for density and hazard rate function.

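Lemma A.2. The gamma kernel density estimator $\hat f_b$ admits the following strong approximation on the interval $[0,T]$:
$$\hat f_b(x)=f(x)+\beta_n(x)+\sigma_n(x)+e_n(x),\tag{A.7}$$
where
$$\beta_n(x)=\int f(t)\,K(x,b)(t)\,dt-f(x),\qquad \sigma_n(x)=-n^{-1}\sum_{i=1}^{n}\int \bar F(t)\,\xi_i(t)\,dK(x,b)(t),\tag{A.8}$$
$$\sup_{0\le x\le T}\big|e_n(x)\big|=O\!\left(\frac{\log n}{nb}\right)\quad\text{a.s.}\tag{A.9}$$
Also, for any $\alpha\ge 1$,
$$\sup_{0\le x\le T}\mathbb{E}\big|e_n(x)\big|^{\alpha}=O\!\left(\left(\frac{\log n}{nb}\right)^{\alpha}\right).\tag{A.10}$$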

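Proof of Lemma A.2. Let us first show that $K(x,b)(T)$ can be made arbitrarily small for any $x<T$. From
$$\Gamma(w+1)\approx\sqrt{2\pi}\,\exp(-w)\,w^{w+1/2},\quad\text{as } w\to\infty,\tag{A.11}$$
it follows that for $s=x-b$,
$$K(x,b)(T)=\frac{T^{s/b}\exp(-T/b)}{b^{s/b+1}\,\Gamma(s/b+1)}\approx\frac{(bs)^{-1/2}}{\sqrt{2\pi}}\exp\left\{\frac{s}{b}\left(1-\frac{T}{s}+\log\frac{T}{s}\right)\right\}.\tag{A.12}$$
Observe that, for any $w>1$, $1-w+\log(w)<0$. So, $K(x,b)(T)=o(b^{\alpha})$ for any $\alpha\ge 1$. Now, using this result, integration by parts, and Lemma A.1, we obtain
$$\begin{aligned}\hat f_b(x)&=\int K(x,b)(t)\,d\hat F(t)\approx-\int \hat F(t)\,dK(x,b)(t)\\ &=-\int\left[F(t)+n^{-1}\bar F(t)\sum_{i=1}^{n}\xi_i(t)+r_n(t)\right]dK(x,b)(t)\\ &\approx\int f(t)\,K(x,b)(t)\,dt-n^{-1}\sum_{i=1}^{n}\int \bar F(t)\,\xi_i(t)\,dK(x,b)(t)-\int r_n(t)\,dK(x,b)(t)\\ &=f(x)+\beta_n(x)+\sigma_n(x)+e_n(x).\end{aligned}\tag{A.13}$$
We deduce the result of Lemma A.2 by using the following inequality:
$$\int_0^{+\infty}\big|dK(x,b)(t)\big|=b^{-1}\int_0^{+\infty}\big|K(x,b)(t)-K(x-b,b)(t)\big|\,dt\le 2b^{-1}.\tag{A.14}$$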

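Proof of Theorem 3.1. We start with the gamma kernel density estimator. From the asymptotic bias of the gamma kernel estimator for uncensored data (see Chen [14]) and the fact that for the interior region
$$\frac{\log n/(nb_n)}{n^{-1/2}b^{-1/4}}=\frac{\log n}{n^{1/2}b^{3/4}}\to 0,\tag{A.15}$$
and for the boundary region
$$\frac{\log n/(nb_n)}{n^{-1/2}b^{-1/2}}=\frac{\log n}{n^{1/2}b^{1/2}}\to 0,\tag{A.16}$$
it can be easily verified that, in our case, the bias is given by
$$\mathbb{E}\,\hat f_b(x)-f(x)=\begin{cases}\dfrac12\,x f''(x)\,b+o(b)+o\big(n^{-1/2}b^{-1/4}\big) & \text{if } x\ge 2b,\\[2mm] \xi_b(x)\,b\,f'(x)+o(b)+o\big(n^{-1/2}b^{-1/2}\big) & \text{if } x\in[0,2b),\end{cases}\tag{A.17}$$
where $\xi_b(x)=(1-x)\big(\rho_b(x)-x/b\big)/\big(1+b\rho_b(x)-x\big)$.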
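To calculate the asymptotic variance of the gamma kernel estimator, we only need to evaluate the variance of $\sigma_n(x)$ since the other terms are negligible. We start with the fact that
$$\operatorname{Var}\big(\sigma_n(x)\big)=n^{-1}\iint \bar F(t)\,\bar F(s)\,g(t\wedge s)\,dk(t)\,dk(s),\tag{A.18}$$
where $k(\cdot)=K(x,b)(\cdot)$. Using integration by parts, the first integral becomes
$$\begin{aligned}\int \bar F(t)\,g(t\wedge s)\,dk(t)&=\int_0^{s}\bar F(t)\,g(t)\,dk(t)+\int_{t\ge s}\bar F(t)\,g(s)\,dk(t)\\ &=\big[\bar F(t)g(t)k(t)\big]_0^{s}-\int_0^{s}\big(\bar F(t)g(t)\big)'k(t)\,dt+g(s)\big[\bar F(t)k(t)\big]_{t\ge s}-g(s)\int_{t\ge s}\bar F'(t)\,k(t)\,dt\\ &=-\int_0^{s}\big(\bar F(t)g(t)\big)'k(t)\,dt-g(s)\int_{t\ge s}\bar F'(t)\,k(t)\,dt.\end{aligned}\tag{A.19}$$
So that,
$$\begin{aligned}n\operatorname{Var}\big(\sigma_n(x)\big)&=-\int \bar F(s)\int_0^{s}\big(\bar F(t)g(t)\big)'k(t)\,dt\,dk(s)-\int \bar F(s)\,g(s)\int_{t\ge s}\bar F'(t)\,k(t)\,dt\,dk(s)\\ &=-\int\big(\bar F(t)g(t)\big)'k(t)\int_{s\ge t}\bar F(s)\,dk(s)\,dt-\int \bar F'(t)\,k(t)\int_0^{t}\bar F(s)\,g(s)\,dk(s)\,dt\\ &=-\int\big[\bar F'(t)g(t)k(t)+\bar F(t)g'(t)k(t)\big]\int_{s\ge t}\bar F(s)\,dk(s)\,dt-\int \bar F'(t)\,k(t)\int_0^{t}\bar F(s)\,g(s)\,dk(s)\,dt.\end{aligned}\tag{A.20}$$
Again, using integration by parts we obtain
$$\begin{aligned}\int \bar F(t)\,g'(t)\,k(t)\int_{s\ge t}\bar F(s)\,dk(s)\,dt&=-\int \bar F^{2}(t)\,g'(t)\,k^{2}(t)\,dt+O(1),\\ \int \bar F'(t)\,g(t)\,k(t)\int_{s\ge t}\bar F(s)\,dk(s)\,dt&=-\int \bar F'(t)\,\bar F(t)\,g(t)\,k^{2}(t)\,dt+O(1),\\ \int \bar F'(t)\,k(t)\int_0^{t}\bar F(s)\,g(s)\,dk(s)\,dt&=\int \bar F'(t)\,\bar F(t)\,g(t)\,k^{2}(t)\,dt+O(1).\end{aligned}\tag{A.21}$$
Therefore, from $g'(t)=f(t)/\big[\bar G(t)\,\bar F(t)^{2}\big]$, we get, up to an $O(1)$ term,
$$n\operatorname{Var}\big(\sigma_n(x)\big)=\int \bar F^{2}(t)\,g'(t)\,k^{2}(t)\,dt=\int\frac{f(t)}{\bar G(t)}\,k^{2}(t)\,dt=B_b(x)\,\mathbb{E}\big[f(\eta_x)\,\bar G^{-1}(\eta_x)\big],\tag{A.22}$$
where $\eta_x$ is a gamma$(2x/b+1,\ b/2)$ random variable and
$$B_b(x)=\frac{b^{-1}\,\Gamma(2x/b+1)}{2^{2x/b+1}\,\Gamma^{2}(x/b+1)}.\tag{A.23}$$
From Chen [14], we have that for a small value of $b$,
$$B_b(x)\sim\begin{cases}\dfrac{1}{2\sqrt{\pi}}\,b^{-1/2}x^{-1/2} & \text{if } x/b\to\infty,\\[2mm] \dfrac{\Gamma(2\kappa+1)}{2^{1+2\kappa}\,\Gamma^{2}(\kappa+1)}\,b^{-1} & \text{if } x/b\to\kappa.\end{cases}\tag{A.24}$$
So after some easy development, we get
$$n\operatorname{Var}\big(\hat f_b(x)\big)\sim\begin{cases}\dfrac{1}{2\sqrt{\pi}}\,b^{-1/2}x^{-1/2}\,\dfrac{f(x)}{\bar G(x)} & \text{if } x/b\to\infty,\\[2mm] \dfrac{\Gamma(2\kappa+1)}{2^{1+2\kappa}\,\Gamma^{2}(\kappa+1)}\,b^{-1}\,\dfrac{f(x)}{\bar G(x)} & \text{if } x/b\to\kappa.\end{cases}\tag{A.25}$$
Finally, we derive the integrated mean squared error from (A.25) and (A.17). Now, for the gamma kernel estimator of the hazard function, we start with the following decomposition:
$$\mathrm{MSE}\big(\hat h_{b_n}(x)\big)=\mathbb{E}\Big[\big(\hat h_{b_n}(x)-h(x)\big)^{2}\Big]=\mathbb{E}\big[(\mathrm{I}+\mathrm{II}+\mathrm{III})^{2}\big],\tag{A.26}$$
where
$$\mathrm{I}=\hat f_{b_n}(x)\,\frac{\hat F(x)-F(x)}{\big(1-\hat F(x)\big)\bar F(x)},\qquad \mathrm{II}=\frac{\hat f_{b_n}(x)-\mathbb{E}\hat f_{b_n}(x)}{\bar F(x)},\qquad \mathrm{III}=\frac{\mathbb{E}\hat f_{b_n}(x)-f(x)}{\bar F(x)}.\tag{A.27}$$
Observe that $\mathbb{E}(\mathrm{II}\cdot\mathrm{III})=\mathrm{III}\cdot\mathbb{E}(\mathrm{II})=0$, because the term $\mathrm{III}$ is deterministic. Using the Cauchy-Schwarz inequality and the boundedness of the gamma kernel density estimator, we find that $\mathbb{E}|\mathrm{I}|=O(n^{-1/2})$ and $\mathbb{E}[\mathrm{I}^{2}]=O(n^{-1})$.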
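Now, from the bias formula of the gamma kernel estimator and the conditions on the bandwidth parameter, we get
$$\mathbb{E}|\mathrm{I}\cdot\mathrm{III}|=|\mathrm{III}|\,\mathbb{E}|\mathrm{I}|=\begin{cases}O\big(n^{-1/2}\big)\Big(O(b)+o\big(n^{-1/2}b_n^{-1/4}\big)\Big)=o\big(n^{-1}b_n^{-1/2}\big) & \text{if } x/b_n\to\infty,\\[2mm] O\big(n^{-1/2}\big)\Big(O(b)+o\big(n^{-1/2}b_n^{-1/2}\big)\Big)=o\big(n^{-1}b_n^{-1}\big) & \text{if } x/b_n\to\kappa.\end{cases}\tag{A.28}$$
Again, using the Cauchy-Schwarz inequality, we obtain
$$\mathbb{E}|\mathrm{I}\cdot\mathrm{II}|\le\big(\mathbb{E}\,\mathrm{I}^{2}\big)^{1/2}\,\frac{\operatorname{Var}\big(\hat f_{b_n}(x)\big)^{1/2}}{\bar F(x)}=\begin{cases}O\big(n^{-1/2}\big)\,O\big(n^{-1/2}b_n^{-1/4}\big)=o\big(n^{-1}b_n^{-1/2}\big) & \text{if } x/b_n\to\infty,\\[2mm] O\big(n^{-1/2}\big)\,O\big(n^{-1/2}b_n^{-1/2}\big)=o\big(n^{-1}b_n^{-1}\big) & \text{if } x/b_n\to\kappa.\end{cases}\tag{A.29}$$
Combining these formulas, one can see that
$$\mathrm{MSE}\big(\hat h_{b_n}(x)\big)=\mathbb{E}\,\mathrm{II}^{2}+\mathbb{E}\,\mathrm{III}^{2}+O\big(n^{-1}\big)+o\big(n^{-1}b_n^{-1}\big)=\mathbb{E}\,\mathrm{II}^{2}+\mathbb{E}\,\mathrm{III}^{2}+o\big(n^{-1}b_n^{-1}\big).\tag{A.30}$$
Finally, using the expressions for the bias and the variance of the gamma kernel density estimator, we derive the desired result for the gamma kernel estimator of the hazard rate function.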

We will only give the proofs of the asymptotic normality and the law of the iterated logarithm for the gamma kernel density estimator; the corresponding results for the gamma kernel hazard estimator then follow directly. The proofs are based on Lemma A.2. Indeed, we only need to prove the asymptotic normality and the law of the iterated logarithm for $\sigma_n$ defined in (A.8) since the other terms are negligible.

Proof of Proposition 3.2. By condition (i) of Theorem 3.1, we have $n^{1/2}b^{1/4}e_n(x)=o(1)$, and under the conditions on the bandwidth parameter, we have $n^{1/2}b^{1/4}\beta_n(x)=o(1)$.
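Therefore, we need to show that
$$n^{1/2}b^{1/4}\,\frac{\sigma_n(x)}{\sqrt{V^{*}(x)}}\longrightarrow N(0,1),\quad\text{in distribution}.\tag{A.31}$$
But since
$$\sigma_n(x)=\sum_{i=1}^{n}W_i(x,b),\quad\text{where } W_i(x,b)=-n^{-1}\int \bar F(t)\,\xi_i(t)\,dK(x,b)(t),\tag{A.32}$$
it suffices to prove that
$$\frac{\sum_{i=1}^{n}\mathbb{E}|W_i|^{3}}{\big(n\operatorname{var}W_1\big)^{3/2}}\to 0.\tag{A.33}$$
In fact, using inequality (A.14),
$$\sup_x\big|W_i(x,b)\big|=O\!\left(\frac{1}{nb}\right).\tag{A.34}$$
Therefore, from the variance of $\sigma_n(x)$,
$$\frac{\sum_{i=1}^{n}\mathbb{E}|W_i|^{3}}{\big(n\operatorname{var}W_1\big)^{3/2}}\le O\!\left(\frac{1}{nb}\right)\big(n\operatorname{var}W_1\big)^{-1/2}=\begin{cases}O\big(n^{-1/2}b^{-3/4}\big) & \text{if } x\ge 2b,\\[1mm] O\big(n^{-1/2}b^{-1/2}\big) & \text{if } x\in[0,2b),\end{cases}\;=o(1),\tag{A.35}$$
since $nb^{3/2}\to\infty$ by condition (i).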

Proof of Proposition 3.3. Condition (iv) ensures that $n^{1/2}b^{1/4}\beta_n(x)=o(1)$ and $n^{1/2}b^{1/4}e_n(x)=o(1)$.
On the other hand, we apply [26, Theorem 1] to $S_n=\sum_{i=1}^{n}W_i$, where $W_i$ is defined as in the proof of Proposition 3.2; under condition (iii) on the bandwidth parameter, we get
$$\limsup_{n\to\infty}\,\Phi_x(n,b)^{-1/2}\,\big|\sigma_n(x)\big|=1,\quad\text{a.s.},\tag{A.36}$$
which concludes the proof of the proposition.

Acknowledgment

M. Mesfioui acknowledges the financial support of the Natural Sciences and Engineering Research Council of Canada.