Research Article | Open Access

Volume 2014 |Article ID 864530 | 7 pages | https://doi.org/10.1155/2014/864530

# Average Derivative Estimation from Biased Data

Accepted 15 Feb 2014
Published 11 Mar 2014

#### Abstract

We investigate the estimation of the density-weighted average derivative from biased data. We construct an estimator combining a plug-in approach with wavelet projections and prove that it attains the parametric rate of convergence under the mean squared error.

#### 1. Introduction

The standard density-weighted average derivative estimation problem is the following. We observe i.i.d. bivariate random variables defined on a probability space. Let be the unknown density function of , and let be the unknown regression function given by The density-weighted average derivative is defined by The estimation of is of interest in some econometric problems, especially in the context of the estimation of coefficients in index models (see, e.g., Stoker [1, 2], Powell et al. [3], and Härdle and Stoker [4]). Among the popular approaches are nonparametric techniques based on kernel estimators (see, e.g., Härdle and Stoker [4], Powell et al. [3], Härdle et al. [5], and Stoker [6]) and the orthogonal series method introduced in Rao [7]. Recently, Chesneau et al. [8] have developed an estimator based on a new plug-in approach and a wavelet series method. We refer to Antoniadis [9], Härdle et al. [10], and Vidakovic [11] for further details about wavelets and their applications in nonparametric statistics.
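For concreteness, the classical formulation (in the notation of Powell et al. [3]; the symbols used in this paper may differ) can be written as follows.

```latex
% Classical density-weighted average derivative (Powell, Stock, and Stoker, 1989):
% (Y_1, X_1), ..., (Y_n, X_n) i.i.d., f the density of X_1, and
% g(x) = E(Y_1 | X_1 = x) the regression function.  The target is
\delta = \mathbb{E}\bigl( g^{(1)}(X_1)\, f(X_1) \bigr)
       = \int g^{(1)}(x)\, f^{2}(x)\, dx ,
% which, after an integration by parts (f vanishing at the boundary of
% its support), admits the kernel-friendly representation
\delta = -2\, \mathbb{E}\bigl( Y_1\, f^{(1)}(X_1) \bigr).
```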

In this paper, we extend this estimation problem to the biased data. It is based on the “biased regression model” which is described as follows. We observe i.i.d. bivariate random variables defined on a probability space with the common density function: where is a known positive function, is the density function of an unobserved bivariate random variable , and (which is an unknown real number). This model has potential applications in biology, economics, and many other fields. Important results on methods and applications can be found in, for example, Ahmad , Sköld , Cristóbal and Alcalá , Wu , Cristóbal and Alcalá , Cristóbal et al. , Ojeda et al. , Cabrera and Van Keilegom , and Chaubey et al. . Wavelet methods related to this model can be found in Chesneau and Shirazi , Chaubey et al. , and Chaubey and Shirazi .
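The weighting mechanism behind this model can be illustrated numerically. The sketch below uses a hypothetical design that is not from the paper: an unobserved variable uniform on (0, 1) and a known weight w(x) = 1 + x, so the normalizing constant is mu = E[w] = 3/2. It checks that inverse weighting by 1/w recovers quantities of the unobserved variable from the biased sample, while the naive estimator does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: unobserved X* ~ Uniform(0, 1), known positive weight
# w(x) = 1 + x, hence mu = E[w(X*)] = 3/2.  The observed sample follows the
# biased density f(x) = (1 + x) f*(x) / mu on (0, 1); inverting its CDF,
# x^2/2 + x = (3/2) u, gives X = sqrt(1 + 3U) - 1 with U ~ Uniform(0, 1).
n = 200_000
x = np.sqrt(1.0 + 3.0 * rng.uniform(size=n)) - 1.0
w = 1.0 + x

# Since E[1/w(X)] = 1/mu under the biased density, the harmonic-mean-type
# estimator mu_hat = n / sum(1/w(X_i)) is consistent for mu = 1.5.
mu_hat = n / np.sum(1.0 / w)

# Moments of the unobserved X* are recovered by inverse weighting:
# E[h(X*)] = mu * E[h(X)/w(X)]; here h(x) = x^2, so the target is 1/3.
m2_hat = mu_hat * np.mean(x**2 / w)

# The naive sample mean converges to 5/9, not E[X*] = 1/2: the bias matters.
naive_mean = x.mean()
```

This is only a one-dimensional toy version of the bivariate model above, but the same identities (unbiasedness of 1/w-weighted empirical quantities up to the factor mu) drive the construction of the estimators in Section 2.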

Let be the density of ; that is, , , and let be the unknown regression function given by

The density-weighted average derivative is defined by We aim to estimate from . To reach this goal, we adapt the methodology of Chesneau et al. [8] to this problem and develop new technical arguments related to those of Chesneau and Shirazi [21] for the estimation of (4). A new wavelet estimator is thus constructed. We prove that it attains the parametric rate of convergence under the mean squared error, establishing the consistency of our estimator.

The paper is organized as follows. Section 2 introduces our wavelet methodology, including our estimator. Additional assumptions on the model and our main theoretical result are given in Section 3. Finally, the proofs are postponed to Section 4.

#### 2. Wavelet Methodology

In this section, after a brief description of the considered wavelet basis, we present our wavelet estimator of (5).

##### 2.1. Compactly Supported Wavelet Basis

Let us consider the following set of functions: For the purposes of this paper, we use the compactly supported wavelet bases briefly described below.

Let be a fixed integer, and let and be the initial wavelet functions of the Daubechies family (see, e.g., Mallat [24]). These functions are compactly supported and differentiable.

Set and . Then, with specific treatment at the boundaries, there exists an integer such that the collection is an orthonormal basis of .

Hence, a function can be expanded on as where Further details about wavelet bases can be found in, for example, Meyer [25], Cohen et al. [26], and Mallat [24].
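In standard notation (see, e.g., Meyer [25] or Mallat [24]; the paper's own symbols may differ), the expansion of a function h in L²([0,1]) on such a basis reads:

```latex
% Wavelet expansion at primary resolution level tau:
h(x) = \sum_{k=0}^{2^{\tau}-1} \alpha_{\tau,k}\, \phi_{\tau,k}(x)
     + \sum_{j=\tau}^{\infty} \sum_{k=0}^{2^{j}-1} \beta_{j,k}\, \psi_{j,k}(x),
% with the scaling and wavelet coefficients given by
\alpha_{j,k} = \int_{0}^{1} h(x)\, \phi_{j,k}(x)\, dx, \qquad
\beta_{j,k}  = \int_{0}^{1} h(x)\, \psi_{j,k}(x)\, dx .
```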

##### 2.2. Wavelet Estimator

Proposition 1 provides another expression of the density-weighted average derivative (5) in terms of wavelet coefficients.

Proposition 1. Let be given by (5). Suppose that , , , and . Then the density-weighted average derivative (5) can be expressed as where

In view of Proposition 1, adapting the ideas of Chesneau et al. [8] and Chesneau and Shirazi [21] to our framework, we consider the following plug-in estimator for : where and is an integer which will be chosen a posteriori.
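The generic shape of such a plug-in construction can be sketched as follows, with a, b, j₀, and μ as our hypothetical labels for the quantities involved (not the paper's own notation):

```latex
% Plug-in principle: Proposition 1 writes delta as a sum of products of
% wavelet coefficients, each of which is then estimated empirically:
\delta = \sum_{k} a_{j_0,k}\, b_{j_0,k}
\quad \leadsto \quad
\widehat{\delta} = \sum_{k} \widehat{a}_{j_0,k}\, \widehat{b}_{j_0,k}.
% In a biased model, empirical coefficients are typically corrected by the
% weight 1/w(Y_i, X_i) and by the normalizing estimator
\widehat{\mu} = n \Bigl( \sum_{i=1}^{n} \frac{1}{w(Y_i, X_i)} \Bigr)^{-1},
% which is consistent since E( 1 / w(Y_1, X_1) ) = 1/\mu under the
% biased density.
```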

Proposition 2 provides theoretical results explaining the construction of the above estimators.

Proposition 2. Suppose that . Then
(i) (17) is an unbiased estimator for ;
(ii) (15) and (16) are unbiased estimators for (12) and (13), respectively;
(iii) under , (15) and (16) are unbiased estimators for (12) and (13), respectively.

#### 3. Assumptions and Result

After the presentation of some additional assumptions on the model, we describe our main result on the asymptotic properties of (14).

##### 3.1. Assumptions

We formulate the following assumptions on , , and .
(H1) The support of , denoted by , is compact. To fix the notation, we suppose that .
(H2) There exists a constant such that
(H3) The function satisfies and there exists a constant such that
(H4) There exists a constant such that
(H5) There exist two constants and such that

Let , , and and be given by (13). We formulate the following assumptions on and .
(H6()) There exists a constant such that
(H7()) There exists a constant such that
Note that (H2)–(H5) are boundedness assumptions, whereas (H6()) and (H7()) are related to the smoothness of and , represented by and . There exist deep connections between (H6()) and (H7()) and balls of Hölder spaces (see [10, Chapter 8]).

##### 3.2. Main Result

The following theorem establishes the upper bound of the MSE of our estimator.

Theorem 3. Assume that (H1)–(H5), (H6 ()) with , and (H7 ()) with hold. Let be given by (5), and let be given by (14) with such that . Then there exists a constant such that
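Writing δ for the target and δ̂ for the estimator (our generic symbols), a parametric-rate MSE bound of the type asserted here takes the form:

```latex
% Parametric rate of convergence under the mean squared error:
\mathbb{E}\bigl( (\widehat{\delta} - \delta)^{2} \bigr) \le \frac{C}{n},
% for some constant C > 0 independent of n.  Note that 1/n is the rate
% of regular parametric estimation; a nonparametric plug-in can attain
% it here because delta is a smooth real-valued functional of the curves.
```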

Theorem 3 proves that attains the parametric rate of convergence under the MSE, which implies the consistency of our estimator. This result provides a first theoretical step toward the estimation of from biased data.

#### 4. Proofs

##### 4.1. On the Construction of

Proof of Proposition 1. We follow the approach of Chesneau et al. [8]. Using , , and an integration by parts, we obtain Moreover,
(i) since , we can expand it on as in (9): where and are given by (12);
(ii) since , we can expand it on as in (9): where and are given by (13).
Thanks to (25) and the orthonormality of on , we get Proposition 1 is proved.

Proof of Proposition 2. (i) We have
(ii) Using the identical distribution of and the definition of (4), we obtain
Similarly, we prove that .
(iii) Using the identical distribution of , an integration by parts, and , we obtain
Similarly, we prove that .
This ends the proof of Proposition 2.

##### 4.2. Proof of Some Intermediate Results

Proposition 4. Suppose that (H1)–(H5) hold. Let and be given by (13), and let and be given by (16) with such that . Then
(i) there exists a constant such that
(ii) there exists a constant such that
(iii) there exists a constant such that
These inequalities also hold with in (15) instead of and in (12) instead of for .

Proof of Proposition 4. We use the following version of the Rosenthal inequality; its proof can be found in Rosenthal [27].
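In its standard form (Rosenthal [27]), with U₁, …, Uₙ our generic labels for the random variables, the inequality reads:

```latex
% Rosenthal's inequality: for p >= 2 and independent zero-mean random
% variables U_1, ..., U_n with E|U_i|^p < infinity, there exists a
% constant C(p) > 0 depending only on p such that
\mathbb{E}\Bigl| \sum_{i=1}^{n} U_i \Bigr|^{p}
\le C(p)\, \max\Bigl( \sum_{i=1}^{n} \mathbb{E}\lvert U_i \rvert^{p},
   \Bigl( \sum_{i=1}^{n} \mathbb{E}\bigl(U_i^{2}\bigr) \Bigr)^{p/2} \Bigr).
```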
Lemma 5. Let be a positive integer, , and let be zero-mean independent random variables such that . Then there exists a constant such that
(i) Note that with Since are i.i.d., are also i.i.d. Moreover, from Proposition 2, we have . Using (H5), for any , we have . Thus, Lemma 5 with yields
(ii) We have By (H5), we have and , and by (H2) and (H3), . Hence, by the triangle inequality,
The elementary inequality , , implies that where
Upper bound for . Since are i.i.d., are also i.i.d. Moreover, from Proposition 2, we have . Now, using the Hölder inequality, (H4), (H5), and , for any , observe that Combining Lemma 5 with with the previous inequality, we obtain
Upper bound for . Point (i) yields
It follows from (41), (44), and (45) that
(iii) Similar arguments to those at the beginning of (ii) give where
Upper bound for . Since are i.i.d., are also i.i.d. Moreover, from Proposition 2, we have . Now, using the Hölder inequality, (H4), (H5), , and , for any , observe that Owing to Lemma 5 with and the previous inequality, we obtain
Upper bound for . Point (i) yields
It follows from (47), (50), and (51) that Proposition 4 is proved.

Proposition 6 below is a consequence of [8, Proposition  5.2] and the results of Proposition 4 above.

Proposition 6. (i) Suppose that (H1)–(H5), (H6 ()), and (H7 ()) hold. Let and be given by (13), and let and be given by (16) with such that . Then there exists a constant such that
(ii) Suppose that (H1)–(H5) hold. Let and be given by (12), and let and be given by (15). Then there exists a constant such that

##### 4.3. Proof of the Main Result

Proof of Theorem 3. Using the intermediate results above, the proof follows the lines of [8, Theorem 5.1]. It follows from Proposition 1 and the elementary inequality , , that where
Upper bound for . Using the Cauchy-Schwarz inequality, the second point of Proposition 6, and , we obtain
Upper bound for . It follows from the Cauchy-Schwarz inequality, the first point of Proposition 6, , the elementary inequality , , , and , that
Upper bound for . By (H6()) with , (H7()) with , and , we have
Putting (55), (57), (58), and (59) together, we obtain This ends the proof of Theorem 3.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### References

1. T. M. Stoker, “Consistent estimation of scaled coefficients,” Econometrica, vol. 54, no. 6, pp. 1461–1481, 1986.
2. T. M. Stoker, “Tests of additive derivative constraints,” Review of Economic Studies, vol. 56, no. 4, pp. 535–552, 1989.
3. J. L. Powell, J. H. Stock, and T. M. Stoker, “Semiparametric estimation of index coefficients,” Econometrica, vol. 57, no. 6, pp. 1403–1430, 1989.
4. W. Härdle and T. M. Stoker, “Investigating smooth multiple regression by the method of average derivatives,” Journal of the American Statistical Association, vol. 84, no. 408, pp. 986–995, 1989.
5. W. Härdle, J. Hart, J. S. Marron, and A. B. Tsybakov, “Bandwidth choice for average derivative estimation,” Journal of the American Statistical Association, vol. 87, no. 417, pp. 218–226, 1992.
6. T. M. Stoker, “Equivalence of direct, indirect, and slope estimators of average derivatives,” in Nonparametric and Semiparametric Methods in Econometrics and Statistics, W. A. Barnett, J. Powell, and G. Tauchen, Eds., pp. 99–118, Cambridge University Press, Cambridge, UK, 1991.
7. B. L. S. P. Rao, “Consistent estimation of density-weighted average derivative by orthogonal series method,” Statistics & Probability Letters, vol. 22, no. 3, pp. 205–212, 1995.
8. C. Chesneau, M. Kachour, and F. Navarro, “On the estimation of density-weighted average derivative by wavelet methods under various dependence structures,” Sankhya A, vol. 76, no. 1, pp. 48–76, 2014.
9. A. Antoniadis, “Wavelets in statistics: a review,” Journal of the Italian Statistical Society B, vol. 6, no. 2, pp. 97–144, 1997.
10. W. Härdle, G. Kerkyacharian, D. Picard, and A. Tsybakov, Wavelets, Approximation, and Statistical Applications, vol. 129 of Lecture Notes in Statistics, Springer, New York, NY, USA, 1998.
11. B. Vidakovic, Statistical Modeling by Wavelets, John Wiley & Sons, New York, NY, USA, 1999.
12. I. A. Ahmad, “On multivariate kernel estimation for samples from weighted distributions,” Statistics & Probability Letters, vol. 22, no. 2, pp. 121–129, 1995.
13. M. Sköld, “Kernel regression in the presence of size-bias,” Journal of Nonparametric Statistics, vol. 12, no. 1, pp. 41–51, 1999.
14. J. A. Cristóbal and J. T. Alcalá, “Nonparametric regression estimators for length biased data,” Journal of Statistical Planning and Inference, vol. 89, no. 1-2, pp. 145–168, 2000.
15. C. O. Wu, “Local polynomial regression with selection biased data,” Statistica Sinica, vol. 10, no. 3, pp. 789–817, 2000.
16. J. A. Cristóbal and J. T. Alcalá, “An overview of nonparametric contributions to the problem of functional estimation from biased data,” Test, vol. 10, no. 2, pp. 309–332, 2001.
17. J. A. Cristóbal, J. L. Ojeda, and J. T. Alcalá, “Confidence bands in nonparametric regression with length biased data,” Annals of the Institute of Statistical Mathematics, vol. 56, no. 3, pp. 475–496, 2004.
18. J. L. Ojeda, W. González-Manteiga, and J. A. Cristobal, “A bootstrap based model checking for selection-biased data,” Tech. Rep. 07–05, Universidade de Santiago de Compostela, 2007.
19. J. L. O. Cabrera and I. van Keilegom, “Goodness-of-fit tests for parametric regression with selection biased data,” Journal of Statistical Planning and Inference, vol. 139, no. 8, pp. 2836–2850, 2009.
20. Y. P. Chaubey, N. Laïb, and J. Li, “Generalized kernel regression estimator for dependent size-biased data,” Journal of Statistical Planning and Inference, vol. 142, no. 3, pp. 708–727, 2012.
21. C. Chesneau and E. Shirazi, “Nonparametric wavelet regression based on biased data,” Communications in Statistics, in press.
22. Y. P. Chaubey, C. Chesneau, and E. Shirazi, “Wavelet-based estimation of regression function for dependent biased data under a given random design,” Journal of Nonparametric Statistics, vol. 25, no. 1, pp. 53–71, 2013.
23. Y. P. Chaubey and E. Shirazi, “On MISE of a nonlinear wavelet estimator of the regression function based on biased data under strong mixing,” Communications in Statistics, in press.
24. S. Mallat, A Wavelet Tour of Signal Processing, Elsevier/Academic Press, Amsterdam, The Netherlands, 3rd edition, 2009.
25. Y. Meyer, Wavelets and Operators, vol. 37, Cambridge University Press, Cambridge, UK, 1992.
26. A. Cohen, I. Daubechies, and P. Vial, “Wavelets on the interval and fast wavelet transforms,” Applied and Computational Harmonic Analysis, vol. 1, no. 1, pp. 54–81, 1993.
27. H. P. Rosenthal, “On the subspaces of $L^p$ $(p > 2)$ spanned by sequences of independent random variables,” Israel Journal of Mathematics, vol. 8, no. 3, pp. 273–303, 1970.
