Journal of Probability and Statistics
Volume 2012 (2012), Article ID 138450, 18 pages
doi:10.1155/2012/138450
Research Article

New Bandwidth Selection for Kernel Quantile Estimators

Department of Mathematical Sciences, Brunel University, Uxbridge UB8 3PH, UK

Received 8 August 2011; Revised 26 September 2011; Accepted 10 October 2011

Academic Editor: Junbin B. Gao

Copyright © 2012 Ali Al-Kenani and Keming Yu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We propose a cross-validation method suitable for smoothing of kernel quantile estimators. In particular, our proposed method selects the bandwidth parameter, which is known to play a crucial role in kernel smoothing, based on unbiased estimation of a mean integrated squared error curve of which the minimising value determines an optimal bandwidth. This method is shown to lead to asymptotically optimal bandwidth choice and we also provide some general theory on the performance of optimal, data-based methods of bandwidth choice. The numerical performances of the proposed methods are compared in simulations, and the new bandwidth selection is demonstrated to work very well.

1. Introduction

The estimation of population quantiles is of great interest when one is not prepared to assume a parametric form for the underlying distribution. In addition, due to their robust nature, quantiles often arise as natural quantities to estimate when the underlying distribution is skewed [1]. Similarly, quantiles often arise in statistical inference as the limits of a confidence interval for an unknown quantity.

Let $X_1, X_2, \ldots, X_n$ be an independent and identically distributed random sample drawn from an absolutely continuous distribution function $F$ with density $f$. Further, let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ denote the corresponding order statistics. For $0 < p < 1$, the quantile function $Q(p)$ is defined as follows:
$$Q(p) = \inf\{x : F(x) \ge p\}. \tag{1.1}$$
If $\hat{Q}(p)$ denotes the $p$th sample quantile, then $\hat{Q}(p) = X_{([np]+1)}$, where $[np]$ denotes the integral part of $np$. Because of the variability of individual order statistics, the sample quantiles suffer from a lack of efficiency. In order to reduce this variability, different approaches to estimating quantiles through weighted order statistics have been proposed. A popular class of these estimators is called kernel quantile estimators. Parzen [2] proposed a version of the kernel quantile estimator as below:
$$\tilde{Q}_K(p) = \sum_{i=1}^{n} \left[ \int_{(i-1)/n}^{i/n} K_h(t - p)\, dt \right] X_{(i)}. \tag{1.2}$$
Here $K_h(u) = h^{-1} K(u/h)$, where $K$ is the kernel function and $h > 0$ is the bandwidth. From (1.2) one can readily observe that $\tilde{Q}_K(p)$ puts most weight on the order statistics $X_{(i)}$ for which $i/n$ is close to $p$. In practice, the following approximation to $\tilde{Q}_K(p)$ is often used:
$$\tilde{Q}_{AK}(p) = \sum_{i=1}^{n} \left[ n^{-1} K_h\left(\frac{i}{n} - p\right) \right] X_{(i)}. \tag{1.3}$$
Yang [3] proved that $\tilde{Q}_K(p)$ and $\tilde{Q}_{AK}(p)$ are asymptotically equivalent in terms of mean square errors. Similarly, Falk [4] demonstrated that, from a relative deficiency perspective, the asymptotic performance of $\tilde{Q}_{AK}(p)$ is better than that of the empirical sample quantile.
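The approximation (1.3) can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names `gaussian_kernel` and `kernel_quantile_ak`, the convention $K_h(u) = K(u/h)/h$, the Gaussian kernel, and the toy sample are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_quantile_ak(sample, p, h, kernel=gaussian_kernel):
    """Approximate kernel quantile estimator in the form of (1.3):
    sum_i n^{-1} K_h(i/n - p) X_(i), assuming K_h(u) = K(u/h) / h."""
    x_sorted = np.sort(np.asarray(sample, dtype=float))  # order statistics
    n = x_sorted.size
    grid = np.arange(1, n + 1) / n                       # i/n for i = 1..n
    weights = kernel((grid - p) / h) / (n * h)           # n^{-1} K_h(i/n - p)
    return float(np.sum(weights * x_sorted))

# Deterministic toy sample, symmetric about 0 (hypothetical, not the paper's data).
toy_sample = np.linspace(-1.0, 1.0, 201)
median_estimate = kernel_quantile_ak(toy_sample, p=0.5, h=0.05)
upper_quartile_estimate = kernel_quantile_ak(toy_sample, p=0.75, h=0.05)
```

The weights in (1.3) are not renormalised, so this literal form can be biased when $p$ lies within a few bandwidths of 0 or 1; the sketch keeps the formula as written rather than correcting for that.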

In this paper, we propose a cross-validation method suitable for smoothing of kernel quantile estimators. In particular, our proposed method selects the bandwidth parameter, which is known to play a crucial role in kernel smoothing, based on unbiased estimation of a mean integrated squared error curve of which the minimising value determines an optimal bandwidth. This method is shown to lead to asymptotically optimal bandwidth choice and we also provide some general theory on the performance of optimal, data-based methods of bandwidth choice. The numerical performances of the proposed methods are compared in simulations, and the new bandwidth selection is demonstrated to work very well.

2. Data-Based Selection of the Bandwidth

Bandwidth plays a critical role in the implementation of practical estimation. Specifically, the choice of the smoothing parameter determines the tradeoff between the amount of smoothness obtained and closeness of the estimation to the true distribution [5].

Several data-based methods can be used to find the asymptotically optimal bandwidth $h$ for the kernel quantile estimator $\tilde{Q}_{AK}(p)$ given by (1.3). One of these methods uses derivatives of the quantile density.

Building on Falk [4], Sheather and Marron [1] gave the MSE of $\tilde{Q}_{AK}(p)$ as follows. If $f$ is not symmetric, or $f$ is symmetric but $p \ne 0.5$,
$$\mathrm{AMSE}\left(\tilde{Q}_{AK}(p)\right) = \frac{1}{4} h^4 \mu_2(k)^2 \left[Q''(p)\right]^2 + p(1-p)\left[Q'(p)\right]^2 n^{-1} - R(K)\left[Q'(p)\right]^2 n^{-1} h, \tag{2.1}$$
where $R(K) = 2\int u K(u) K^{(-1)}(u)\, du$, $\mu_2(k) = \int u^2 K(u)\, du$ and $K^{(-1)}$ is the antiderivative of $K$.

If $Q'' > 0$ then
$$h_{\mathrm{opt}} = \alpha(K)\, \beta(Q)\, n^{-1/3}, \tag{2.2}$$
where $\alpha(K) = \left[R(K)/\mu_2(k)^2\right]^{1/3}$ and $\beta(Q) = \left[Q'(p)/Q''(p)\right]^{2/3}$.
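As a worked illustration of (2.2), the sketch below evaluates the kernel constants numerically for a Gaussian kernel and then computes $h_{\mathrm{opt}}$ when the true derivatives are known. The choice of distribution is an assumption for the example: for Exp(1), $Q(p) = -\log(1-p)$, so $Q'(p) = 1/(1-p)$ and $Q''(p) = 1/(1-p)^2$. The function names are hypothetical.

```python
import math
import numpy as np

def gaussian_kernel_constants(num_points=20001, limit=10.0):
    """Riemann-sum approximations of mu_2(K) = int u^2 K(u) du and
    R(K) = 2 int u K(u) K^{(-1)}(u) du for the standard normal kernel,
    where K^{(-1)} is the antiderivative (here the normal cdf)."""
    u = np.linspace(-limit, limit, num_points)
    du = u[1] - u[0]
    k = np.exp(-0.5 * u**2) / math.sqrt(2.0 * math.pi)
    k_antiderivative = np.array(
        [0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in u]
    )
    mu2 = float(np.sum(u**2 * k) * du)
    r_k = float(2.0 * np.sum(u * k * k_antiderivative) * du)
    return mu2, r_k

def h_opt(q1, q2, n, mu2, r_k):
    """Bandwidth of (2.2): alpha(K) * beta(Q) * n^{-1/3}, for Q''(p) > 0."""
    alpha = (r_k / mu2**2) ** (1.0 / 3.0)
    beta = (q1 / q2) ** (2.0 / 3.0)
    return alpha * beta * n ** (-1.0 / 3.0)

mu2, r_k = gaussian_kernel_constants()

# Exp(1) at p = 0.5 with n = 100 (illustrative values only).
p, n = 0.5, 100
h_example = h_opt(q1=1.0 / (1.0 - p), q2=1.0 / (1.0 - p) ** 2, n=n, mu2=mu2, r_k=r_k)
```

In practice $Q'$ and $Q''$ are unknown and must be estimated, so this only shows how the pieces of the formula fit together.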

There is no single optimal bandwidth minimizing the 𝑄 A M S E ( 𝐴 𝐾 ( 𝑝 ) ) when 𝐹 is symmetric and 𝑝 = 0 . 5 . Also, If 𝑞 = 0 , we need higher terms and the 𝑄 A M S E ( 𝐴 𝐾 ( 𝑝 ) ) can be shown to be 𝑄 A M S E 𝐴 𝐾 ( = 1 𝑝 ) 4 1 𝑛 4 𝑄 ( 𝑝 ) 2 𝜇 2 ( 𝑘 ) 2 + 2 𝑛 1 2 𝑄 ( 𝑝 ) 2 ( 𝑞 𝑡 ) 𝑡 𝐾 ( 𝑡 ) 𝑗 ( 𝑡 ) 𝑑 𝑡 , ( 2 . 3 ) where 𝑗 ( 𝑡 ) = 𝑡 𝑥 𝐾 ( 𝑥 ) 𝑑 𝑥 , see Cheng and Sun [6].

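In order to obtain $h_{\mathrm{opt}}$ we need to estimate $Q' = q$ and $Q'' = q'$. It follows from (1.3) that the estimator of $Q' = q$ can be constructed as follows:
$$\tilde{q}_{AK}(p) = \tilde{Q}'_{AK}(p) = \sum_{i=1}^{n} X_{(i)} \left[ K_a\left(\frac{i-1}{n} - p\right) - K_a\left(\frac{i}{n} - p\right) \right], \tag{2.4}$$
where $K_a(u) = a^{-1} K(u/a)$ with bandwidth $a$. Jones [7] derived the $\mathrm{AMSE}(\tilde{q}_{AK}(p))$ as
$$\mathrm{AMSE}\left(\tilde{q}_{AK}(p)\right) = \frac{a^4}{4} \mu_2(k)^2 \left[q''(p)\right]^2 + \frac{1}{na} \left[q(p)\right]^2 \int K^2(y)\, dy. \tag{2.5}$$
By minimizing (2.5), we obtain the asymptotically optimal bandwidth for $\tilde{Q}'_{AK}(p)$:
$$a_{\mathrm{opt}} = \left[ \frac{\left[Q'(p)\right]^2 \int K^2(y)\, dy}{n \left[Q'''(p)\right]^2 \mu_2(k)^2} \right]^{1/5}. \tag{2.6}$$
To estimate $Q'' = q'$ in (2.2), we employ the known result
$$\tilde{Q}''_{AK}(p) = \frac{d}{dp} \tilde{Q}'_{AK}(p) = \frac{1}{a^2} \sum_{i=1}^{n} X_{(i)} \left[ K'\left(\frac{i/n - p}{a}\right) - K'\left(\frac{(i-1)/n - p}{a}\right) \right], \tag{2.7}$$
where the order of the two terms follows from differentiating (2.4) with respect to $p$, and it readily follows that
$$a^{*}_{\mathrm{opt}} = \left[ \frac{3 \left[Q'(p)\right]^2 \int \left[K'(x)\right]^2 dx}{n \left[Q''''(p)\right]^2 \mu_2(k)^2} \right]^{1/7}, \tag{2.8}$$
which represents the asymptotically optimal bandwidth for $\tilde{Q}''_{AK}(p)$. By substituting $a = a_{\mathrm{opt}}$ in (2.4) and $a = a^{*}_{\mathrm{opt}}$ in (2.7) we can compute $h_{\mathrm{opt}}$.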
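The two derivative estimators can be sketched as follows, assuming a Gaussian kernel and the convention $K_a(u) = K(u/a)/a$. The second function is written as the exact derivative in $p$ of the first, which fixes the order of its two kernel terms. All names and the "ideal" samples (exact quantiles at levels $i/(n+1)$) are assumptions made for this illustration.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density used as the kernel K (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def gaussian_kernel_derivative(u):
    """Derivative K'(u) = -u K(u) of the standard normal density."""
    return -u * gaussian_kernel(u)

def quantile_density_ak(x_sorted, p, a):
    """Estimator of Q'(p) in the form of (2.4), with K_a(u) = K(u/a)/a."""
    n = x_sorted.size
    i = np.arange(1, n + 1)
    left = gaussian_kernel(((i - 1) / n - p) / a) / a
    right = gaussian_kernel((i / n - p) / a) / a
    return float(np.sum(x_sorted * (left - right)))

def quantile_density_derivative_ak(x_sorted, p, a):
    """Estimator of Q''(p): the derivative of quantile_density_ak in p."""
    n = x_sorted.size
    i = np.arange(1, n + 1)
    left = gaussian_kernel_derivative(((i - 1) / n - p) / a)
    right = gaussian_kernel_derivative((i / n - p) / a)
    return float(np.sum(x_sorted * (right - left)) / a**2)

# "Ideal" sorted samples: exact quantiles at levels i/(n+1) (hypothetical data).
n = 1000
levels = np.arange(1, n + 1) / (n + 1)
uniform_scores = levels                     # Uniform(0,1): Q'(p) = 1
exponential_scores = -np.log1p(-levels)     # Exp(1): Q'(0.5) = 2, Q''(0.5) = 4

q_uniform = quantile_density_ak(uniform_scores, 0.5, 0.05)
q_exp = quantile_density_ak(exponential_scores, 0.5, 0.05)
q_prime_exp = quantile_density_derivative_ak(exponential_scores, 0.5, 0.05)
```

The bandwidth 0.05 is arbitrary here; in the plug-in procedure it would be replaced by the optimal bandwidths of (2.6) and (2.8), which themselves depend on unknown derivatives of $Q$.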

3. Cross-Validation Bandwidth Selection

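When measuring the closeness of an estimated and a true function, the mean integrated squared error (MISE), defined as
$$\mathrm{MISE}(h) = E \int_0^1 \left\{ \tilde{Q}(p) - Q(p) \right\}^2 dp, \tag{3.1}$$
is commonly used as a global measure of performance. Here $\tilde{Q}(p)$ denotes the kernel quantile estimator computed with bandwidth $h$.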

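The value of $h$ which minimises $\mathrm{MISE}(h)$ is the optimal smoothing parameter, and it is unknown in practice. The following $\mathrm{ASE}(h)$ is the discrete form of the error criterion approximating $\mathrm{MISE}(h)$:
$$\mathrm{ASE}(h) = \frac{1}{n} \sum_{i=1}^{n} \left\{ \tilde{Q}\left(\frac{i}{n}\right) - Q\left(\frac{i}{n}\right) \right\}^2. \tag{3.2}$$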

The unknown $Q(p)$ is replaced by $\hat{Q}(p)$ and a cross-validatory function is created as:
$$\frac{1}{n} \sum_{i=1}^{n} \left\{ \tilde{Q}_{-i}\left(\frac{i}{n}\right) - \hat{Q}\left(\frac{i}{n}\right) \right\}^2, \tag{3.3}$$
where $\tilde{Q}_{-i}(i/n)$ denotes the kernel estimator evaluated at observation $x_i$, but constructed from the data with observation $x_i$ omitted.
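A leave-one-out criterion of the form (3.3) can be sketched as below and minimised over a grid of candidate bandwidths. Several choices here are assumptions rather than the authors' implementation: the Gaussian kernel, the reading of "observation $x_i$ omitted" as dropping the $i$th order statistic, taking the empirical quantile at level $i/n$ to be $X_{(i)}$, and the grid search itself.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density used as the kernel K (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_quantile_ak(x_sorted, p, h):
    """Estimator in the form of (1.3) applied to an already sorted sample."""
    n = x_sorted.size
    grid = np.arange(1, n + 1) / n
    return float(np.sum(gaussian_kernel((grid - p) / h) * x_sorted) / (n * h))

def cv_score(sample, h):
    """Leave-one-out criterion in the form of (3.3)."""
    x_sorted = np.sort(np.asarray(sample, dtype=float))
    n = x_sorted.size
    total = 0.0
    for i in range(1, n + 1):
        reduced = np.delete(x_sorted, i - 1)           # omit X_(i)
        fitted = kernel_quantile_ak(reduced, i / n, h)  # estimate from n-1 points
        total += (fitted - x_sorted[i - 1]) ** 2       # assumed: Q_hat(i/n) = X_(i)
    return total / n

def select_bandwidth(sample, candidates):
    """Return the candidate bandwidth with the smallest CV score."""
    scores = np.array([cv_score(sample, h) for h in candidates])
    return float(candidates[int(np.argmin(scores))]), scores

# Hypothetical usage on simulated Exp(1) data.
rng = np.random.default_rng(0)
sample = rng.exponential(size=60)
candidates = np.linspace(0.02, 0.5, 25)
h_cv, scores = select_bandwidth(sample, candidates)
```

A grid search is only one way to minimise the criterion; a bounded scalar optimiser could be substituted, at the cost of possibly stopping in a local minimum.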

The general approach of cross-validation is to compare each observation with a value predicted by the model based on the remainder of the data. A method for density estimation was proposed by Rudemo [8] and Bowman [9]. This method can be viewed as representing each observation by a Dirac delta function $\delta(x - x_i)$, whose expectation is $f(x)$, and contrasting this with a density estimate based on the remainder of the data. In the context of distribution functions, a natural characterisation of each observation is by the indicator function $I(x - x_i)$, whose expectation is $F(x)$. This implies that the kernel method for density estimation can be expressed as
$$\hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i), \tag{3.4}$$

where $K_h(x - x_i) \to \delta(x - x_i)$ when $h \to 0$.

The kernel method for the distribution function is
$$\hat{F}(x) = \frac{1}{n} \sum_{i=1}^{n} W\left(\frac{x - x_i}{h}\right), \tag{3.5}$$

where $W$ is a distribution function and $h$ is the bandwidth, which controls the degree of smoothing. When $h \to 0$,
$$W\left(\frac{x - x_i}{h}\right) \to I(x - x_i), \tag{3.6}$$

where $I(x - x_i)$ is the indicator function
$$I(x - x_i) = \begin{cases} 1, & \text{if } x - x_i \ge 0, \\ 0, & \text{otherwise}. \end{cases} \tag{3.7}$$

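Now, from (1.3), when $h \to 0$,
$$\tilde{Q}_{AK}(p) \to \delta\left(\frac{i}{n} - p\right) X_{(i)}, \tag{3.8}$$
and thus a cross-validation function can be written as
$$\mathrm{CV}(h) = \frac{1}{n} \sum_{i=1}^{n} \int_0^1 \left\{ \delta\left(\frac{i}{n} - p\right) X_{(i)} - \tilde{Q}_{-i}\left(\frac{i}{n}\right) \right\}^2 dp. \tag{3.9}$$
The smoothing parameter $h$ is then chosen to minimise this function. By subtracting a term that characterises the performance of the true $Q(p)$ we have
$$H(h) = \mathrm{CV}(h) - \frac{1}{n} \sum_{i=1}^{n} \int_0^1 \left\{ \delta\left(\frac{i}{n} - p\right) X_{(i)} - Q\left(\frac{i}{n}\right) \right\}^2 dp, \tag{3.10}$$
where the subtracted term does not involve $h$. By expanding the braces, we obtain
$$H(h) = \frac{1}{n} \sum_{i=1}^{n} \int_0^1 \left\{ \tilde{Q}_{-i}^2\left(\frac{i}{n}\right) - 2\delta\left(\frac{i}{n} - p\right) X_{(i)} \tilde{Q}_{-i}\left(\frac{i}{n}\right) + 2\delta\left(\frac{i}{n} - p\right) X_{(i)} Q\left(\frac{i}{n}\right) - Q^2\left(\frac{i}{n}\right) \right\} dp. \tag{3.11}$$
When $n \to \infty$, the $(np)$th order statistic $x_{(np)}$ is asymptotically normally distributed,
$$x_{(np)} \sim AN\left( Q(p),\ \frac{p(1-p)}{n \left[f(Q(p))\right]^2} \right).$$
Taking expectations in (3.11) then gives
$$E\{H(h)\} = E\, \frac{1}{n} \sum_{i=1}^{n} \int_0^1 \left\{ \tilde{Q}_{-i}^2\left(\frac{i}{n}\right) - 2\delta\left(\frac{i}{n} - p\right) X_{(i)} \tilde{Q}_{-i}\left(\frac{i}{n}\right) + 2\delta\left(\frac{i}{n} - p\right) X_{(i)} Q\left(\frac{i}{n}\right) - Q^2\left(\frac{i}{n}\right) \right\} dp,$$
$$E\{H(h)\} = \frac{1}{n} \sum_{i=1}^{n} \int_0^1 \left\{ E\, \tilde{Q}_{-i}^2\left(\frac{i}{n}\right) - 2\delta\left(\frac{i}{n} - p\right) Q\left(\frac{i}{n}\right) E\, \tilde{Q}_{-i}\left(\frac{i}{n}\right) + 2\delta\left(\frac{i}{n} - p\right) Q^2\left(\frac{i}{n}\right) - Q^2\left(\frac{i}{n}\right) \right\} dp,$$
$$E\{H(h)\} = E \int_0^1 \left\{ \tilde{Q}_{n-1}\left(\frac{i}{n}\right) - Q\left(\frac{i}{n}\right) \right\}^2 dp, \tag{3.12}$$
where the notation $\tilde{Q}_{n-1}(i/n)$ with positive subscript denotes a kernel estimator based on a sample size of $n - 1$. The preceding arguments demonstrate that $\mathrm{CV}(h)$ provides an asymptotically unbiased estimator of the true $\mathrm{MISE}(h)$ curve for a sample size $n - 1$. The identity at (3.12) strongly suggests that cross-validation should perform well.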
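The asymptotic normality of the order statistic used above can be checked by simulation. The sketch below is an assumption-laden illustration, not part of the paper: it fixes $p = 0.5$ and standard normal data, so that $Q(p) = 0$ and $f(Q(p)) = 1/\sqrt{2\pi}$, and compares the simulated variance of the $(np)$th order statistic with $p(1-p)/\{n[f(Q(p))]^2\}$.

```python
import numpy as np

def median_order_statistic_variance(n=400, reps=2000, seed=0):
    """Compare the simulated variance of the (n/2)th order statistic of
    standard normal samples with the asymptotic value p(1-p)/(n f(Q(p))^2)
    at p = 0.5, where f(Q(0.5)) = 1/sqrt(2*pi)."""
    rng = np.random.default_rng(seed)
    k = n // 2                                        # (np)th order statistic, p = 0.5
    sorted_samples = np.sort(rng.standard_normal((reps, n)), axis=1)
    simulated = float(sorted_samples[:, k - 1].var(ddof=1))
    density_at_quantile = 1.0 / np.sqrt(2.0 * np.pi)
    asymptotic = 0.25 / (n * density_at_quantile**2)  # equals pi / (2 n)
    return simulated, asymptotic

simulated_var, asymptotic_var = median_order_statistic_variance()
```

With a few thousand replications the two values should agree to within Monte Carlo error; the agreement is only approximate at finite $n$.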

4. Theoretical Properties

From (3.1), we can write
$$\mathrm{MISE}(h) = \int_0^1 \mathrm{bias}^2\left(\tilde{Q}_K(p)\right) dp + \int_0^1 \mathrm{var}\left(\tilde{Q}_K(p)\right) dp.$$

Sheather and Marron [1] have shown that
$$\mathrm{bias}\left(\tilde{Q}_K(p)\right) = \frac{1}{2} h^2 \mu_2(k)\, Q''(p) + o(h^2), \tag{4.1}$$
while Falk [4, page 263] proved that
$$\mathrm{var}\left(\tilde{Q}_K(p)\right) = p(1-p)\left[Q'(p)\right]^2 n^{-1} - R(K)\left[Q'(p)\right]^2 n^{-1} h + o(n^{-1} h). \tag{4.2}$$

On combining the expressions for bias and variance, and integrating over $p$, we can express the mean integrated squared error as
$$\mathrm{MISE}(h) = \frac{1}{4} h^4 \mu_2(k)^2 \int_0^1 \left[Q''(p)\right]^2 dp + n^{-1} \int_0^1 p(1-p) \left[Q'(p)\right]^2 dp - R(K)\, n^{-1} h \int_0^1 \left[Q'(p)\right]^2 dp + o(h^4 + n^{-1} h), \tag{4.3}$$

and for $C_1 = \int_0^1 p(1-p) [Q'(p)]^2 dp$, $C_2 = R(K) \int_0^1 [Q'(p)]^2 dp$ and $C_3 = \mu_2(k)^2 \int_0^1 [Q''(p)]^2 dp$ the MISE can be expressed as
$$\mathrm{MISE}(h) = C_1 n^{-1} - C_2 n^{-1} h + \frac{1}{4} C_3 h^4 + o(h^4 + n^{-1} h). \tag{4.4}$$
Setting the derivative of the leading terms with respect to $h$ to zero gives $C_3 h^3 = C_2 n^{-1}$. Therefore, the asymptotically optimal bandwidth is $h_0 = C n^{-1/3}$, where $C = \{C_2 / C_3\}^{1/3}$.
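The closed form $h_0 = (C_2/C_3)^{1/3} n^{-1/3}$ can be checked numerically against a direct minimisation of the leading terms of (4.4). In this sketch the constants `c1`, `c2`, `c3` and the sample size are arbitrary illustrative values, not quantities from the paper.

```python
import numpy as np

def leading_mise(h, n, c1, c2, c3):
    """Leading terms of (4.4): C1/n - C2*h/n + C3*h^4/4."""
    return c1 / n - c2 * h / n + 0.25 * c3 * h**4

def h0_closed_form(n, c2, c3):
    """Minimiser of the leading terms: (C2/C3)^(1/3) * n^(-1/3)."""
    return (c2 / c3) ** (1.0 / 3.0) * n ** (-1.0 / 3.0)

# Arbitrary illustrative constants (assumptions for the example).
c1, c2, c3, n = 2.0, 0.5, 4.0, 200

grid = np.linspace(1e-4, 1.0, 200001)
h_grid = float(grid[np.argmin(leading_mise(grid, n, c1, c2, c3))])
h_formula = h0_closed_form(n, c2, c3)
```

Note that $C_1$ does not affect the minimiser, since the corresponding term is free of $h$.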

We can see from (3.12) that $H(h)$ may be a good approximation to $\mathrm{MISE}(h)$, or at least to that function evaluated for a sample of size $n - 1$ rather than $n$. Indeed, this is true if we adjust $H(h)$ by adding the quantity
$$J = \int_0^1 \left[ \left\{ \hat{Q}(p) - Q(p) \right\}^2 - E\left\{ \hat{Q}(p) - Q(p) \right\}^2 \right] dp. \tag{4.5}$$
This quantity has mean zero and does not depend on $h$, which makes it attractive for obtaining a particularly good approximation to $\mathrm{MISE}(h)$.

Theorem 4.1. Suppose that $Q(p)$ is bounded on $[0, 1]$ and right continuous at the point 0, and that $K$ is a compactly supported density, symmetric about 0. Then, for each $\delta, \varepsilon, C > 0$,
$$H(h) + J = \mathrm{MISE}(h) + O\left\{ \left( n^{-3/2} + n^{-1} h^{3/2} + n^{-1/2} h^{3} \right) n^{\delta} \right\} \tag{4.6}$$
with probability 1, uniformly in $0 \le h \le C n^{-\varepsilon}$, as $n \to \infty$.

(An outline proof of the above theorem is in the appendix).

From the above theorem, we can conclude that minimisation of $H(h)$ produces a bandwidth that is asymptotically equivalent to the bandwidth $h_0$ that minimises $\mathrm{MISE}(h)$.

Corollary 4.2. Suppose that the conditions of the previous theorem hold. If $\hat{h}$ denotes the bandwidth that minimises $\mathrm{CV}(h)$ in the range $0 \le h \le C n^{-\varepsilon}$, for any $C > 0$ and any $0 < \varepsilon < 1/3$, then
$$\frac{\hat{h}}{h_0} \to 1 \tag{4.7}$$
with probability 1 as $n \to \infty$.

5. A Simulation Study

A numerical study was conducted to compare the performance of two bandwidth selection methods, namely, the method presented by Sheather and Marron [1] and our proposed method.

In order to account for different shapes in our simulation study, we consider the standard normal, Exp(1), Log-normal(0,1) and double exponential distributions, and we calculate 18 quantiles ranging from $p = 0.05$ to $p = 0.95$. Throughout the numerical study the Gaussian kernel was used as the kernel function. Sample sizes of 100, 200 and 500 were used, with 100 simulations in each case. The performance of the methods was assessed through the mean squared error (MSE) criterion,
$$\mathrm{MSE}(h) = E\left\{ \tilde{Q}(p) - Q(p) \right\}^2,$$
and the relative efficiency (R.E),
$$\mathrm{R.E} = \frac{\mathrm{MISE}_{\text{Method 2}}\left(h_{\text{Method 2, opt}}\right)}{\mathrm{MISE}_{\text{Method 1}}\left(h_{\text{Method 1, opt}}\right)}. \tag{5.1}$$
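The MSE criterion can be approximated by Monte Carlo along the following lines. This is a sketch under stated assumptions, not the authors' simulation code: it uses standard normal data at $p = 0.5$ (true quantile 0), 100 replications of size 100, and two arbitrary fixed bandwidths in place of the data-driven choices of method 1 and method 2. The final ratio only mimics the structure of (5.1).

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density used as the kernel K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_quantile_ak(sample, p, h):
    """Estimator in the form of (1.3) with K_h(u) = K(u/h)/h."""
    x_sorted = np.sort(np.asarray(sample, dtype=float))
    n = x_sorted.size
    grid = np.arange(1, n + 1) / n
    return float(np.sum(gaussian_kernel((grid - p) / h) * x_sorted) / (n * h))

def monte_carlo_mse(draw_sample, true_quantile, p, h, n, reps, rng):
    """Average squared error of the estimator over `reps` simulated samples."""
    errors = np.empty(reps)
    for r in range(reps):
        errors[r] = kernel_quantile_ak(draw_sample(rng, n), p, h) - true_quantile
    return float(np.mean(errors**2))

def draw_normal(rng, n):
    return rng.standard_normal(n)

rng = np.random.default_rng(2012)
# Two arbitrary bandwidths standing in for two selection rules (hypothetical).
mse_a = monte_carlo_mse(draw_normal, 0.0, p=0.5, h=0.05, n=100, reps=100, rng=rng)
mse_b = monte_carlo_mse(draw_normal, 0.0, p=0.5, h=0.15, n=100, reps=100, rng=rng)
ratio_b_over_a = mse_b / mse_a   # values above 1 favour the first bandwidth
```

With only 100 replications, as in the study design described above, such ratios carry noticeable Monte Carlo error, so small differences between methods should be read with care.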

Further, for comparison purposes we refer to our proposed method and that of Sheather and Marron [1] as method 1 and method 2 respectively.

(a) Standard normal distribution (see Table 1 and Figure 1).

Table 1: Mean squared errors results for bandwidth selection methods for different sample sizes and for data from a normal distribution.
Figure 1: Left panel: plots of the quantile estimators for method 1 (solid line), method 2 (dotted line), and true quantile (dashed line) for different sample sizes and for data from a normal distribution. Right panel: box plots of mean squared errors for the quantile estimators for method 1 and method 2 for different sample sizes.

(b) Exponential distribution (see Table 2 and Figure 2).

Table 2: Mean squared errors results for bandwidth selection methods for different sample sizes and for data from an exponential distribution.
Figure 2: Left panel: plots of the quantile estimators for method 1 (solid line), method 2 (dotted line) and true quantile (dashed line) for different sample sizes and for data from an exponential distribution. Right panel: box plots of mean squared errors for the quantile estimators for method 1 and method 2 for different sample sizes.

(c) Log-normal distribution (see Table 3 and Figure 3).

Table 3: Mean squared errors results for bandwidth selection methods for different sample sizes and for data from a Log-normal distribution.
Figure 3: Left panel: plots of the quantile estimators for method 1 (solid line), method 2 (dotted line) and true quantile (dashed line) for different sample sizes and for data from a Log-normal distribution. Right panel: box plots of mean squared errors for the quantile estimators for method 1 and method 2 for different sample sizes.

(d) Double exponential distribution (see Table 4 and Figure 4).

Table 4: Mean squared errors results for bandwidth selection methods for different sample sizes and for data from a double exponential distribution.
Figure 4: Left panel: plots of the quantile estimators for method 1 (solid line), method 2 (dotted line) and true quantile (dashed line) for different sample sizes and for data from a double exponential distribution. Right panel: box plots of mean squared error for the quantile estimators for method 1 and method 2 for different sample sizes.

We can compute and summarize the relative efficiency of $h_{\text{Method 1, opt}}$ for all the previous distributions in Table 5.

Table 5: The relative efficiency (R.E) of $h_{\text{Method 1, opt}}$.

From Tables 1, 2, 3, and 4, across all the distributions, it can be observed that our method produces lower mean squared errors in 52.3% of cases, a slight advantage over the Sheather-Marron method.

Also, from Table 5, which describes the relative efficiency of $h_{\text{Method 1, opt}}$, we can see that $h_{\text{Method 1, opt}}$ is more efficient than $h_{\text{Method 2, opt}}$ in all cases except the standard normal distribution with $n = 200, 500$ and the double exponential distribution with $n = 500$.

So, we may conclude that in terms of MISE our bandwidth selection method is more efficient than the Sheather-Marron method for skewed distributions, but not uniformly so for symmetric distributions.

6. Conclusion

In this paper we have proposed a cross-validation-based rule for the selection of the bandwidth for quantile functions estimated by a kernel procedure. The bandwidth selected by our proposed method is shown to be asymptotically unbiased. In order to assess its numerical performance, we conducted a simulation study and compared it with the bandwidth proposed by Sheather and Marron [1]. Based on the four distributions considered, the proposed bandwidth selection appears to provide accurate estimates of quantiles, and thus we believe that the new bandwidth selection method is a practically useful way to obtain the bandwidth for the quantile estimator in the form (1.3).

Appendix

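Step 1. Let $nH = S_1 - 2S_2$, where
$$S_1 = \sum_{i} \int_0^1 \left\{ \tilde{Q}_{-i}(p) - Q(p) \right\}^2 dp, \qquad S_2 = \sum_{i} \int_0^1 \left\{ \delta\left(\frac{i}{n} - p\right) X_{(i)} - Q(p) \right\} \left\{ \tilde{Q}_{-i}(p) - Q(p) \right\} dp. \tag{A.1}$$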

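Step 2. With $D_i(p) = K_h(i/n - p) X_{(i)} - Q(p)$ and $D_{0i}(p) = \delta(i/n - p) X_{(i)} - Q(p)$,
$$S_1 = (n-1)^{-2} n^2 (n-2) \int_0^1 \left\{ \tilde{Q}(p) - Q(p) \right\}^2 dp + (n-1)^{-2} \sum_{i=1}^{n} \int_0^1 D_i^2(p)\, dp,$$
$$S_2 = (n-1)^{-1} n^2 \int_0^1 \left\{ \tilde{Q}(p) - Q(p) \right\} \left\{ \hat{Q}(p) - Q(p) \right\} dp - (n-1)^{-1} \sum_{i=1}^{n} \int_0^1 D_i(p) D_{0i}(p)\, dp. \tag{A.2}$$
The minus sign on the last term of $S_2$ is the one consistent with (A.1) and (A.3).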

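Step 3. This step combines Steps 1 and 2 to prove that
$$H = \left\{ 1 - (n-1)^{-2} \right\} \int_0^1 \left\{ \tilde{Q}(p) - Q(p) \right\}^2 dp + \frac{1}{n(n-1)^2} \sum_{i=1}^{n} \int_0^1 D_i^2(p)\, dp - 2\left\{ 1 + (n-1)^{-1} \right\} \int_0^1 \left\{ \tilde{Q}(p) - Q(p) \right\} \left\{ \hat{Q}(p) - Q(p) \right\} dp + \frac{2}{n(n-1)} \sum_{i=1}^{n} \int_0^1 D_i(p) D_{0i}(p)\, dp. \tag{A.3}$$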

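Step 4. This step establishes that
$$E\left[ \int_0^1 \left\{ \tilde{Q}(p) - Q(p) \right\}^2 dp \right]^2 + E\left[ \int_0^1 \left\{ \tilde{Q}(p) - Q(p) \right\} \left\{ \hat{Q}(p) - Q(p) \right\} dp \right]^2 = O\left( n^{-2} + h^{8} \right),$$
$$E\left[ n^{-3} \sum_{i=1}^{n} \int_0^1 D_i^2(p)\, dp \right]^2 + \mathrm{var}\left[ n^{-2} \sum_{i=1}^{n} \int_0^1 D_i(p) D_{0i}(p)\, dp \right] = O\left( n^{-3} \right). \tag{A.4}$$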

Step 5. This step combines Steps 3 and 4, concluding that
$$H + \int_0^1 \left\{ \hat{Q}(p) - Q(p) \right\}^2 dp = \int_0^1 \left\{ \tilde{Q}(p) - \hat{Q}(p) \right\}^2 dp + 2(n-1)^{-1} \mu(h) + O_2\left( n^{-3/2} + n^{-1} h^{4} \right), \tag{A.5}$$
where $\mu(h) = \int_0^1 E\left( D_i(p) D_{0i}(p) \right) dp$.
Here $U = O_2(\xi)$, for a random variable $U = U(n)$ and a positive sequence $\xi = \xi(n)$, means that
$$E\left( U^2 \right) = O\left( \xi^2 \right). \tag{A.6}$$

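Step 6. This step notes that $\int_0^1 \{\tilde{Q}(p) - \hat{Q}(p)\}^2 dp = S + T$, where
$$S = n^{-2} \sum\sum_{i \ne j} g(X_i, X_j), \qquad T = n^{-2} \sum_{i=1}^{n} g(X_i, X_i),$$
$$g(X_i, X_j) = \int_0^1 \left\{ K_h\left(\frac{i}{n} - p\right) X_{(i)} - \delta\left(\frac{i}{n} - p\right) X_{(i)} \right\} \left\{ K_h\left(\frac{j}{n} - p\right) X_{(j)} - \delta\left(\frac{j}{n} - p\right) X_{(j)} \right\} dp, \tag{A.7}$$
and that $S = S^{(1)} + S^{(2)} + (1 - n^{-1}) g_0$, where
$$S^{(1)} = n^{-2} \sum\sum_{i \ne j} \left\{ g(X_i, X_j) - g_1(X_i) - g_1(X_j) + g_0 \right\}, \qquad S^{(2)} = 2 n^{-1} \left( 1 - n^{-1} \right) \sum_{i=1}^{n} \left\{ g_1(X_i) - g_0 \right\},$$
$$g_1(x) = E\left\{ g(x, X_1) \right\}, \qquad g_0 = E\left\{ g_1(X_1) \right\}. \tag{A.8}$$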

Step 7. This step shows that $E\{g(X_1, X_1)^2\} = O(1)$, $E\{g(X_1, X_2)^2\} = O(h^3)$, $E\{g_1(X_1)^2\} = O(h^6)$, $\mathrm{var}\{T\} = O(n^{-3})$, $E(S^{(1)})^2 = O(n^{-2} h^3)$ and $E(S^{(2)})^2 = O(n^{-1} h^6)$.

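Step 8. This step combines the results of Steps 5, 6, and 7, obtaining
$$H + \int_0^1 \left\{ \hat{Q}(p) - Q(p) \right\}^2 dp = E(T) + \left( 1 - n^{-1} \right) g_0 + 2(n-1)^{-1} \mu(h) + O_2\left( n^{-3/2} + n^{-1} h^{3/2} + n^{-1/2} h^{3} \right)$$
$$= \int_0^1 E\left\{ \tilde{Q}(p) - \hat{Q}(p) \right\}^2 dp + 2(n-1)^{-1} \mu(h) + O_2\left( n^{-3/2} + n^{-1} h^{3/2} + n^{-1/2} h^{3} \right). \tag{A.9}$$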

Step 9. This step notes that $\mu(h) = O(h)$ and
$$\int_0^1 E\left\{ \tilde{Q}(p) - \hat{Q}(p) \right\}^2 dp = \int_0^1 E\left\{ \tilde{Q}(p) - Q(p) \right\}^2 dp + \int_0^1 E\left\{ \hat{Q}(p) - Q(p) \right\}^2 dp - 2 n^{-1} \mu(h). \tag{A.10}$$

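Step 10. This step combines Steps 8 and 9, establishing that
$$H + \int_0^1 \left\{ \hat{Q}(p) - Q(p) \right\}^2 dp - \int_0^1 E\left\{ \hat{Q}(p) - Q(p) \right\}^2 dp = \int_0^1 E\left\{ \tilde{Q}(p) - Q(p) \right\}^2 dp + O_2\left( n^{-3/2} + n^{-1} h^{3/2} + n^{-1/2} h^{3} \right). \tag{A.11}$$
This means that
$$E\left\{ H + J - \mathrm{MISE}(h) \right\}^2 = O\left\{ \left( n^{-3/2} + n^{-1} h^{3/2} + n^{-1/2} h^{3} \right)^2 \right\}. \tag{A.12}$$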

References

  1. S. J. Sheather and J. S. Marron, “Kernel quantile estimators,” Journal of the American Statistical Association, vol. 85, no. 410, pp. 410–416, 1990.
  2. E. Parzen, “Nonparametric statistical data modeling,” Journal of the American Statistical Association, vol. 74, pp. 105–131, 1979.
  3. S.-S. Yang, “A smooth nonparametric estimator of a quantile function,” Journal of the American Statistical Association, vol. 80, no. 392, pp. 1004–1011, 1985.
  4. M. Falk, “Relative deficiency of kernel type estimators of quantiles,” The Annals of Statistics, vol. 12, no. 1, pp. 261–268, 1984.
  5. M. P. Wand and M. C. Jones, Kernel Smoothing, Chapman and Hall, London, UK, 1995.
  6. M. Y. Cheng and S. Sun, “Bandwidth selection for kernel quantile estimation,” Journal of the Chinese Statistical Association, vol. 44, no. 3, pp. 271–295, 2006.
  7. M. C. Jones, “Estimating densities, quantiles, quantile densities and density quantiles,” Annals of the Institute of Statistical Mathematics, vol. 44, no. 4, pp. 721–727, 1992.
  8. M. Rudemo, “Empirical choice of histograms and kernel density estimators,” Scandinavian Journal of Statistics, vol. 9, no. 2, pp. 65–78, 1982.
  9. A. W. Bowman, “An alternative method of cross-validation for the smoothing of density estimates,” Biometrika, vol. 71, no. 2, pp. 353–360, 1984.