Research Article | Open Access

# Average Derivative Estimation from Biased Data

**Academic Editor:** Y. Wu

#### Abstract

We investigate the estimation of the density-weighted average derivative from biased data. An estimator integrating a plug-in approach and wavelet projections is constructed. We prove that it attains the parametric rate of convergence under the mean squared error.

#### 1. Introduction

The standard density-weighted average derivative estimation problem is the following. We observe $n$ *i.i.d.* bivariate random variables $(X_1,Y_1),\ldots,(X_n,Y_n)$ defined on a probability space $(\Omega,\mathcal{A},\mathbb{P})$. Let $f$ be the unknown density function of $X_1$, and let $\varphi$ be the unknown regression function given by
$$\varphi(x)=\mathbb{E}(Y_1\mid X_1=x),\qquad x\in\mathbb{R}. \tag{1}$$
The density-weighted average derivative is defined by
$$\delta=\mathbb{E}\left(f(X_1)\varphi'(X_1)\right)=\int f^2(x)\varphi'(x)\,dx. \tag{2}$$
The estimation of $\delta$ is of interest in some econometric problems, especially in the context of estimation of coefficients in index models (see, e.g., Stoker [1, 2], Powell et al. [3], and Härdle and Stoker [4]). Among the popular approaches, there are the nonparametric techniques based on kernel estimators (see, e.g., Härdle and Stoker [4], Powell et al. [3], Härdle et al. [5], and Stoker [6]) or orthogonal series methods introduced in Rao [7]. Recently, Chesneau et al. [8] have developed an estimator based on a new plug-in approach and a wavelet series method. We refer to Antoniadis [9], Härdle et al. [10], and Vidakovic [11] for further details about wavelets and their applications in nonparametric statistics.

In this paper, we extend this estimation problem to biased data. It is based on the “biased regression model”, which is described as follows. We observe $n$ *i.i.d.* bivariate random variables $(X_1,Y_1),\ldots,(X_n,Y_n)$ defined on a probability space $(\Omega,\mathcal{A},\mathbb{P})$ with the common density function
$$f(x,y)=\frac{w(x,y)\,g(x,y)}{\mu},\qquad (x,y)\in\mathbb{R}^2, \tag{3}$$
where $w$ is a known positive function, $g$ is the density function of an unobserved bivariate random variable $(U,V)$, and $\mu=\mathbb{E}(w(U,V))<\infty$ (which is an unknown real number). This model has potential applications in biology, economics, and many other fields. Important results on methods and applications can be found in, for example, Ahmad [12], Sköld [13], Cristóbal and Alcalá [14], Wu [15], Cristóbal and Alcalá [16], Cristóbal et al. [17], Ojeda et al. [18], Cabrera and Van Keilegom [19], and Chaubey et al. [20]. Wavelet methods related to this model can be found in Chesneau and Shirazi [21], Chaubey et al. [22], and Chaubey and Shirazi [23].

Let $h$ be the density of $U$; that is, $h(x)=\int g(x,y)\,dy$, $x\in\mathbb{R}$, and let $\varphi$ be the unknown regression function given by
$$\varphi(x)=\mathbb{E}(V\mid U=x),\qquad x\in\mathbb{R}. \tag{4}$$

The density-weighted average derivative is defined by
$$\delta=\mathbb{E}\left(h(U)\varphi'(U)\right)=\int h^2(x)\varphi'(x)\,dx. \tag{5}$$
We aim to estimate $\delta$ from $(X_1,Y_1),\ldots,(X_n,Y_n)$. To reach this goal, we adapt the methodology of Chesneau et al. [8] to this problem and develop new technical arguments derived from those developed in Chesneau and Shirazi [21] for the estimation of (4). A new wavelet estimator is thus constructed. We prove that it attains the parametric rate of convergence under the mean squared error, which implies the consistency of our estimator.
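To make the biased regression model concrete, the following sketch simulates it by rejection sampling and shows how reweighting by the inverse of the weight function undoes the selection bias. The unobserved pair (uniform covariate, response equal to the covariate plus Gaussian noise), the weight `w(x, y) = 1 + x`, and all function names are illustrative assumptions and are not taken from the paper.

```python
import random

def w(x, y):
    # Hypothetical known positive weight function (assumption): values in [1, 2].
    return 1.0 + x

W_MAX = 2.0  # upper bound of w when x lies in [0, 1], needed for rejection sampling

def draw_unbiased(rng):
    # Unobserved pair (U, V): U uniform on [0, 1], V = U + small Gaussian noise.
    u = rng.random()
    return u, u + rng.gauss(0.0, 0.1)

def draw_biased(rng):
    # Accepting (u, v) with probability w(u, v) / W_MAX yields a draw whose
    # density is proportional to w(x, y) g(x, y), i.e. the biased density.
    while True:
        u, v = draw_unbiased(rng)
        if rng.random() < w(u, v) / W_MAX:
            return u, v

def estimate_mu(sample):
    # Under the biased density, E[1 / w(X, Y)] = 1 / mu, so the inverse of the
    # empirical mean of 1 / w estimates mu.
    return len(sample) / sum(1.0 / w(x, y) for x, y in sample)

rng = random.Random(0)
sample = [draw_biased(rng) for _ in range(20000)]

mu_hat = estimate_mu(sample)  # true mu = E[1 + U] = 1.5 in this toy model
naive_mean = sum(x for x, _ in sample) / len(sample)  # overweights large x
corrected_mean = mu_hat * sum(x / w(x, y) for x, y in sample) / len(sample)  # targets E[U] = 0.5
```

In this toy model the naive mean of the observed covariate concentrates around 5/9 rather than 1/2, while the reweighted mean recovers the mean of the unobserved covariate.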

The paper is organized as follows. In Section 2, we introduce our wavelet methodology including our estimator. Additional assumptions on the model and our main theoretical result are set in Section 3. Finally, the proofs are postponed to Section 4.

#### 2. Wavelet Methodology

In this section, after a brief description of the considered wavelet basis, we present our wavelet estimator of (5).

##### 2.1. Compactly Supported Wavelet Basis

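Let us consider the following set of functions:
$$\mathbb{L}^2([0,1])=\left\{u:[0,1]\to\mathbb{R};\ \int_0^1 u^2(x)\,dx<\infty\right\}.$$
For the purposes of this paper, we use the compactly supported wavelet bases on $[0,1]$ briefly described below.

Let $N$ be a fixed integer, and let $\phi$ and $\psi$ be the initial wavelet functions of the Daubechies wavelets (see, e.g., Mallat [24]). These functions are compactly supported and differentiable.

Set $\phi_{j,k}(x)=2^{j/2}\phi(2^jx-k)$ and $\psi_{j,k}(x)=2^{j/2}\psi(2^jx-k)$. Then, with a specific treatment at the boundaries, there exists an integer $\tau$ such that the collection $\mathcal{B}=\{\phi_{\tau,k},\ k\in\{0,\ldots,2^{\tau}-1\};\ \psi_{j,k},\ j\ge\tau,\ k\in\{0,\ldots,2^{j}-1\}\}$ is an orthonormal basis of $\mathbb{L}^2([0,1])$.

Hence, a function $u\in\mathbb{L}^2([0,1])$ can be expanded on $\mathcal{B}$ as
$$u(x)=\sum_{k=0}^{2^{\tau}-1}a_{\tau,k}\,\phi_{\tau,k}(x)+\sum_{j=\tau}^{\infty}\sum_{k=0}^{2^{j}-1}b_{j,k}\,\psi_{j,k}(x), \tag{9}$$
where
$$a_{\tau,k}=\int_0^1 u(x)\phi_{\tau,k}(x)\,dx,\qquad b_{j,k}=\int_0^1 u(x)\psi_{j,k}(x)\,dx.$$
Further details about wavelet bases can be found in, for example, Meyer [25], Cohen et al. [26], and Mallat [24].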
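As a numerical illustration of the expansion (9), the sketch below projects a function on $[0,1]$ onto a wavelet basis and checks that the squared coefficients add up to its squared $\mathbb{L}^2$ norm. It swaps in the Haar basis (coarsest level 0) for the boundary-corrected Daubechies basis used in the paper, because Haar wavelets have a closed form; unlike the paper's wavelets they are not differentiable, so this only illustrates the expansion itself. The function names and the test function `u(x) = x` are assumptions made for the example.

```python
def haar_phi(x):
    # Haar scaling function: indicator of [0, 1).
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def haar_psi(x):
    # Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere.
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def dilate(base, j, k):
    # Returns the function x -> 2^{j/2} * base(2^j x - k).
    scale = 2.0 ** (j / 2.0)
    return lambda x: scale * base(2.0 ** j * x - k)

def inner(u, v, cells=4096):
    # Midpoint rule on a dyadic grid of [0, 1] for the integral of u * v.
    step = 1.0 / cells
    return sum(u((i + 0.5) * step) * v((i + 0.5) * step) for i in range(cells)) * step

def haar_coefficients(u, max_level):
    # Scaling coefficient at level 0 plus wavelet coefficients for levels 0..max_level.
    a00 = inner(u, dilate(haar_phi, 0, 0))
    b = {(j, k): inner(u, dilate(haar_psi, j, k))
         for j in range(max_level + 1) for k in range(2 ** j)}
    return a00, b

u = lambda x: x
a00, b = haar_coefficients(u, max_level=5)
# Sum of squared coefficients; it approaches the integral of u^2 over [0, 1], which is 1/3.
energy = a00 ** 2 + sum(c ** 2 for c in b.values())
```

Truncating the detail levels leaves out only the energy carried by finer scales, which is why a handful of levels already reproduces the norm of a smooth function closely.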

##### 2.2. Wavelet Estimator

Proposition 1 provides another expression of the density-weighted average derivative (5) in terms of wavelet coefficients.

Proposition 1. *Let be given by (5). Suppose that , , , and . Then the density-weighted average derivative (5) can be expressed as
**
where
*

In view of Proposition 1, adapting the ideas of Chesneau et al. [8] and Chesneau and Shirazi [21] to our framework, we consider the following plug-in estimator for : where and is an integer which will be chosen a posteriori.
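The plug-in idea can be tried out numerically. The sketch below is not the paper's wavelet estimator: it swaps in the cosine basis $e_0=1$, $e_k(x)=\sqrt{2}\cos(\pi k x)$ of $\mathbb{L}^2([0,1])$ for the wavelet basis so that it fits in a few lines of standard-library Python, and it assumes inverse-weighted empirical coefficients of the form described in the comments. It relies on writing the target as $-2$ times the inner product of (density times regression function) with the derivative of the density, which holds after an integration by parts when the density vanishes at 0 and 1. The simulated model (Beta(2,2) covariate, response equal to the covariate plus noise, weight `w(x, y) = 1 + x`), the truncation level `K`, and all names are assumptions for illustration. In this toy model the density is $6x(1-x)$ and the regression function is $x\mapsto x$, so the target equals $\int_0^1 36x^2(1-x)^2\,dx=1.2$.

```python
import math
import random

def w(x, y):
    # Hypothetical known positive weight function (assumption).
    return 1.0 + x

def draw_biased(rng):
    # Rejection sampling from the density proportional to w(x, y) g(x, y),
    # where g is the law of (U, V) with U ~ Beta(2, 2) and V = U + noise.
    while True:
        u = rng.betavariate(2.0, 2.0)
        v = u + rng.gauss(0.0, 0.1)
        if rng.random() < w(u, v) / 2.0:  # 2.0 bounds w when x lies in [0, 1]
            return u, v

def plug_in_estimate(sample, K):
    n = len(sample)
    inv_w = [1.0 / w(x, y) for x, y in sample]
    mu_hat = n / sum(inv_w)  # estimates mu = E[w(U, V)]
    total = 0.0
    for k in range(1, K + 1):  # the k = 0 term vanishes because e_0' = 0
        freq = math.pi * k
        # a_hat estimates the k-th coefficient of (density * regression function).
        a_hat = mu_hat / n * sum(
            y * math.sqrt(2.0) * math.cos(freq * x) * iw
            for (x, y), iw in zip(sample, inv_w))
        # c_hat estimates the k-th coefficient of the density derivative via
        # integration by parts: minus the weighted empirical mean of e_k'.
        c_hat = -mu_hat / n * sum(
            -math.sqrt(2.0) * freq * math.sin(freq * x) * iw
            for (x, y), iw in zip(sample, inv_w))
        total += a_hat * c_hat
    return -2.0 * total

rng = random.Random(1)
sample = [draw_biased(rng) for _ in range(40000)]
delta_hat = plug_in_estimate(sample, K=8)  # true value in this toy model: 1.2
```

The truncation level plays the role of the resolution level chosen a posteriori: keeping more terms reduces the approximation error but increases the variance of the estimated derivative coefficients.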

Proposition 2 provides some theoretical results explaining the constructions of the above estimators.

Proposition 2. *Suppose that . Then*(i)* (17) is an unbiased estimator for *;(ii)* (15) and ** (16) are unbiased estimators for ** (12) and ** (13), respectively;*(iii)*under *, * (15) and ** (16) are unbiased estimators for ** (12) and ** (13), respectively.*

#### 3. Assumptions and Result

After the presentation of some additional assumptions on the model, we describe our main result on the asymptotic properties of (14).

##### 3.1. Assumptions

We formulate the following assumptions on , , and .

- (H1) The support of , denoted by , is compact. In order to fix the notations, we suppose that .
- (H2) There exists a constant such that
- (H3) The function satisfies and there exists a constant such that
- (H4) There exists a constant such that
- (H5) There exist two constants and such that

Let , , and and be given by (13). We formulate the following assumptions on and .

- (H6 ()) There exists a constant such that
- (H7 ()) There exists a constant such that

Note that (H2)–(H5) are boundedness assumptions, whereas (H6 ()) and (H7 ()) are related to the smoothness of and represented by and . There exist deep connections between (H6 ()) and (H7 ()) and balls of Hölder spaces (see [10, Chapter 8]).

##### 3.2. Main Result

The following theorem establishes the upper bound of the MSE of our estimator.

Theorem 3. *Assume that (H1)–(H5), (H6 ()) with , and (H7 ()) with hold. Let be given by (5), and let be given by (14) with such that . Then there exists a constant such that
*

Theorem 3 proves that $\hat{\delta}$ attains the parametric rate of convergence under the MSE. This implies the consistency of our estimator. This result provides a first theoretical contribution to the estimation of $\delta$ from biased data.
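The step from an MSE bound at the parametric rate to consistency is the Markov inequality. As a sketch, writing $\hat{\delta}$ for the estimator, $\delta$ for the target, and $C/n$ for the bound on the MSE (the symbols are chosen here for illustration), for any $\varepsilon>0$:

```latex
\mathbb{P}\left(|\hat{\delta}-\delta|\ge\varepsilon\right)
\le \frac{\mathbb{E}\left((\hat{\delta}-\delta)^2\right)}{\varepsilon^2}
\le \frac{C}{n\,\varepsilon^2}
\xrightarrow[n\to\infty]{} 0 .
```

Hence $\hat{\delta}$ converges to $\delta$ in probability.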

#### 4. Proofs

##### 4.1. On the Construction of $\hat{\delta}$

*Proof of Proposition 1. *We follow the approach of Chesneau et al. [8]. Using , , and an integration by parts, we obtain
Moreover,(i)since , we can expand it on as (9):
where and are (12),(ii)since , we can expand it on as (9):
where and are (13).

Thanks to (25) and the orthonormality of on , we get
Proposition 1 is proved.
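The integration by parts step can be written out as follows. The notation is assumed here for illustration: $h$ denotes the density of the unobserved covariate, $\varphi$ the regression function, $\delta$ the density-weighted average derivative, and $(e_m)_m$ a generic orthonormal basis of $\mathbb{L}^2([0,1])$ standing in for the wavelet basis. Assuming that $h$ is supported on $[0,1]$ with $h(0)=h(1)=0$ and that $h\varphi$ and $h'$ are square integrable:

```latex
\begin{aligned}
\delta &= \int_0^1 h^2(x)\,\varphi'(x)\,dx
        = \left[h^2(x)\varphi(x)\right]_0^1 - 2\int_0^1 h(x)\,h'(x)\,\varphi(x)\,dx \\
       &= -2\int_0^1 (h\varphi)(x)\,h'(x)\,dx
        = -2\sum_{m}\langle h\varphi, e_m\rangle\,\langle h', e_m\rangle .
\end{aligned}
```

The boundary term vanishes because $h(0)=h(1)=0$, and the last equality is the Parseval identity for an orthonormal basis, which turns the integral into a sum of products of coefficients.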

*Proof of Proposition 2. *(i) We have

(ii) Using the identical distribution of and the definition of (4), we obtain

Similarly, we prove that .

(iii) Using the identical distribution of , an integration by parts, and , we obtain

Similarly, we prove that .

This ends the proof of Proposition 2.
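The computations behind Proposition 2 rest on one change of measure: dividing by the weight cancels it in the biased density. As a sketch, write the observed density as $f=wg/\mu$, let $h$ be the density of the unobserved covariate (supported on $[0,1]$), $\varphi$ the regression function, and $e$ a generic differentiable test function such as a basis element (the symbols are assumed here for illustration):

```latex
\begin{aligned}
\mathbb{E}\left(\frac{1}{w(X_1,Y_1)}\right)
  &= \iint \frac{1}{w(x,y)}\,\frac{w(x,y)\,g(x,y)}{\mu}\,dx\,dy = \frac{1}{\mu},\\
\mathbb{E}\left(\frac{Y_1\,e(X_1)}{w(X_1,Y_1)}\right)
  &= \frac{1}{\mu}\iint y\,e(x)\,g(x,y)\,dx\,dy
   = \frac{1}{\mu}\int_0^1 e(x)\,\varphi(x)\,h(x)\,dx,\\
\mathbb{E}\left(\frac{e'(X_1)}{w(X_1,Y_1)}\right)
  &= \frac{1}{\mu}\int_0^1 e'(x)\,h(x)\,dx
   = -\frac{1}{\mu}\int_0^1 e(x)\,h'(x)\,dx
   \quad\text{if } h(0)=h(1)=0 .
\end{aligned}
```

The second line uses $\int y\,g(x,y)\,dy=\varphi(x)h(x)$, and the third is an integration by parts whose boundary term vanishes under the stated condition.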

##### 4.2. Proof of Some Intermediate Results

Proposition 4. *Suppose that (H1)–(H5) hold. Let and be given by (13), and let and be given by (16) with such that . Then*(i)*there exists a constant ** such that*(ii)*there exists a constant ** such that*(iii)*there exists a constant ** such that**These inequalities hold with in (15) instead of and in (12) instead of for .*

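*Proof of Proposition 4.* We use the following version of the Rosenthal inequality. The proof can be found in Rosenthal [27].

**Lemma 5.** *Let $n$ be a positive integer, $p\ge 2$, and let $U_1,\ldots,U_n$ be $n$ zero mean independent random variables such that $\sup_{i\in\{1,\ldots,n\}}\mathbb{E}(|U_i|^p)<\infty$. Then there exists a constant $C>0$ such that*
$$\mathbb{E}\left(\left|\sum_{i=1}^{n}U_i\right|^p\right)\le C\left(\sum_{i=1}^{n}\mathbb{E}\left(|U_i|^p\right)+\left(\sum_{i=1}^{n}\mathbb{E}\left(U_i^2\right)\right)^{p/2}\right).$$

(i) Note that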
with
Since are* i.i.d.*, we get that are also* i.i.d.*. Moreover, from Proposition 2, we have . Using (H5), for any , we have . Thus, Lemma 5 with yields
(ii)We have
By (H5), we have , , and by (H2) and (H3), . Hence, by the triangular inequality,

The elementary inequality, , , implies that
where
*Upper bound for **.* Since are* i.i.d.*, we get that are also* i.i.d.* Moreover, from Proposition 2, we have . Now, using the Hölder inequality, (H4), (H5) and , for any , observe that
Combining Lemma** **5 with with the previous inequality, we obtain
*Upper bound for **.* The point (i) yields
It follows from (41), (44), and (45) that
(iii)Similar arguments to the beginning of (ii) give
where
*Upper bound for **.* Since are* i.i.d.*, we get that are also* i.i.d.* Moreover, from Proposition 2, we have . Now, using the Hölder inequality, (H4), (H5), , and , for any , observe that
Owing to Lemma** **5 with and the previous inequality, we obtain
*Upper bound for **.* The point (i) yields

It follows from (47), (50), and (51) that
Proposition 4 is proved.
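The Rosenthal inequality of Lemma 5 controls the $p$-th moment of a sum of independent zero mean variables by the sum of their $p$-th moments plus the $p/2$ power of the sum of their variances. It can be checked by hand in a simple case: for i.i.d. symmetric signs in $\{-1,1\}$ and $p=4$, the fourth moment of the sum is exactly $n+3n(n-1)$, which is at most $3(n+n^2)$. The sketch below verifies this by exhaustive enumeration; the distribution, the choice $p=4$, and the constant 3 are assumptions for this toy case only and say nothing about the lemma's general constant.

```python
from itertools import product

def fourth_moment_of_sum(n):
    # Exact E[(U_1 + ... + U_n)^4] for i.i.d. symmetric signs,
    # obtained by enumerating all 2^n equally likely outcomes.
    total = sum(sum(signs) ** 4 for signs in product((-1, 1), repeat=n))
    return total / 2 ** n

def rosenthal_rhs(n, constant=3.0):
    # constant * (sum of E|U_i|^4 + (sum of E U_i^2)^2); both moments equal 1 here.
    return constant * (n + n ** 2)

# Exact fourth moments for n = 1, ..., 12.
moments = {n: fourth_moment_of_sum(n) for n in range(1, 13)}
```

Dividing by $n^4$ shows that the fourth moment of the empirical mean is of order $1/n^2$ in this example, which is the kind of bound such moment inequalities deliver for centered empirical averages.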

Proposition 6 below is a consequence of [8, Proposition 5.2] and the results of Proposition 4 above.

Proposition 6. *(i) Suppose that (H1)–(H5), (H6 ()), and (H7 ()) hold. Let and be given by (13), and let and be given by (16) with such that . Then there exists a constant such that
**(ii) Suppose that (H1)–(H5) hold. Let and be given by (12), and let and be given by (15). Then there exists a constant such that
*

##### 4.3. Proof of the Main Result

*Proof of Theorem 3.* Using the intermediate results above, the proof follows the lines of [8, Theorem 5.1]. It follows from Proposition 1 and the elementary inequality, , , that
where
*Upper bound for **.* Using the Cauchy-Schwarz inequality, the second point of Proposition 6, and , we obtain
*Upper bound for **.* It follows from the Cauchy-Schwarz inequality, the first point of Proposition 6, , the elementary inequality, , , , and , that
*Upper bound for **.* By (H6 ()) with , (H7 ()) with , and , we have
Putting (55), (57), (58), and (59) together, we obtain
This ends the proof of Theorem 3.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### References

1. T. M. Stoker, “Consistent estimation of scaled coefficients,” *Econometrica*, vol. 54, no. 6, pp. 1461–1481, 1986.
2. T. M. Stoker, “Tests of additive derivative constraints,” *Review of Economic Studies*, vol. 56, no. 4, pp. 535–552, 1989.
3. J. L. Powell, J. H. Stock, and T. M. Stoker, “Semiparametric estimation of index coefficients,” *Econometrica*, vol. 57, no. 6, pp. 1403–1430, 1989.
4. W. Härdle and T. M. Stoker, “Investigating smooth multiple regression by the method of average derivatives,” *Journal of the American Statistical Association*, vol. 84, no. 408, pp. 986–995, 1989.
5. W. Härdle, J. Hart, J. S. Marron, and A. B. Tsybakov, “Bandwidth choice for average derivative estimation,” *Journal of the American Statistical Association*, vol. 87, no. 417, pp. 218–226, 1992.
6. T. M. Stoker, “Equivalence of direct, indirect, and slope estimators of average derivatives,” in *Nonparametric and Semiparametric Methods in Econometrics and Statistics*, W. A. Barnett, J. Powell, and G. Tauchen, Eds., pp. 99–118, Cambridge University Press, Cambridge, UK, 1991.
7. B. L. S. P. Rao, “Consistent estimation of density-weighted average derivative by orthogonal series method,” *Statistics & Probability Letters*, vol. 22, no. 3, pp. 205–212, 1995.
8. C. Chesneau, M. Kachour, and F. Navarro, “On the estimation of density-weighted average derivative by wavelet methods under various dependence structures,” *Sankhya A*, vol. 76, no. 1, pp. 48–76, 2014.
9. A. Antoniadis, “Wavelets in statistics: a review,” *Journal of the Italian Statistical Society B*, vol. 6, no. 2, pp. 97–144, 1997.
10. W. Härdle, G. Kerkyacharian, D. Picard, and A. Tsybakov, *Wavelets, Approximation, and Statistical Applications*, vol. 129 of *Lecture Notes in Statistics*, Springer, New York, NY, USA, 1998.
11. B. Vidakovic, *Statistical Modeling by Wavelets*, John Wiley & Sons, New York, NY, USA, 1999.
12. I. A. Ahmad, “On multivariate kernel estimation for samples from weighted distributions,” *Statistics & Probability Letters*, vol. 22, no. 2, pp. 121–129, 1995.
13. M. Sköld, “Kernel regression in the presence of size-bias,” *Journal of Nonparametric Statistics*, vol. 12, no. 1, pp. 41–51, 1999.
14. J. A. Cristóbal and J. T. Alcalá, “Nonparametric regression estimators for length biased data,” *Journal of Statistical Planning and Inference*, vol. 89, no. 1-2, pp. 145–168, 2000.
15. C. O. Wu, “Local polynomial regression with selection biased data,” *Statistica Sinica*, vol. 10, no. 3, pp. 789–817, 2000.
16. J. A. Cristóbal and J. T. Alcalá, “An overview of nonparametric contributions to the problem of functional estimation from biased data,” *Test*, vol. 10, no. 2, pp. 309–332, 2001.
17. J. A. Cristóbal, J. L. Ojeda, and J. T. Alcalá, “Confidence bands in nonparametric regression with length biased data,” *Annals of the Institute of Statistical Mathematics*, vol. 56, no. 3, pp. 475–496, 2004.
18. J. L. Ojeda, W. González-Manteiga, and J. A. Cristobal, “A bootstrap based model checking for selection-biased data,” Tech. Rep. 07–05, Universidade de Santiago de Compostela, 2007.
19. J. L. O. Cabrera and I. van Keilegom, “Goodness-of-fit tests for parametric regression with selection biased data,” *Journal of Statistical Planning and Inference*, vol. 139, no. 8, pp. 2836–2850, 2009.
20. Y. P. Chaubey, N. Laïb, and J. Li, “Generalized kernel regression estimator for dependent size-biased data,” *Journal of Statistical Planning and Inference*, vol. 142, no. 3, pp. 708–727, 2012.
21. C. Chesneau and E. Shirazi, “Nonparametric wavelet regression based on biased data,” *Communications in Statistics*. In press.
22. Y. P. Chaubey, C. Chesneau, and E. Shirazi, “Wavelet-based estimation of regression function for dependent biased data under a given random design,” *Journal of Nonparametric Statistics*, vol. 25, no. 1, pp. 53–71, 2013.
23. Y. P. Chaubey and E. Shirazi, “On MISE of a nonlinear wavelet estimator of the regression function based on biased data under strong mixing,” *Communications in Statistics*. In press.
24. S. Mallat, *A Wavelet Tour of Signal Processing*, Elsevier/Academic Press, Amsterdam, The Netherlands, 3rd edition, 2009.
25. Y. Meyer, *Wavelets and Operators*, vol. 37, Cambridge University Press, Cambridge, UK, 1992.
26. A. Cohen, I. Daubechies, and P. Vial, “Wavelets on the interval and fast wavelet transforms,” *Applied and Computational Harmonic Analysis*, vol. 1, no. 1, pp. 54–81, 1993.
27. H. P. Rosenthal, “On the subspaces of $L^{p}$ $(p>2)$ spanned by sequences of independent random variables,” *Israel Journal of Mathematics*, vol. 8, no. 3, pp. 273–303, 1970.

#### Copyright

Copyright © 2014 Christophe Chesneau et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.