Abstract

Donoho et al. achieved a definitive result for wavelet density estimation in 1996 for noise-free samples that are independent and identically distributed (i.i.d.). In many practical applications, however, the random samples are noisy, and estimating the density derivatives is important for detecting possible bumps in the associated density. Motivated by Donoho's work, we propose new linear and nonlinear wavelet estimators for density derivatives when the random samples have size-bias. It turns out that the linear estimation for attains the optimal convergence rate when , and the nonlinear one does the same if .

1. Introduction

Wavelet analysis plays important roles in both pure and applied mathematics, including signal processing, image compression, numerical solution, and local fractional calculus [1, 2]. One application is the estimation of an unknown density function from random samples [3–8]. Donoho et al. [9] obtained a definitive result for noise-free i.i.d. samples. On the other hand, Besov spaces contain many function spaces (e.g., Hölder spaces and Sobolev spaces with noninteger exponents) as special cases. In some statistical models, the error is measured in norm [9–13].

In practice, it is often impossible to sample directly from a random variable $X$. In this paper, we consider the true density function $f_X$, but we can only observe size-biased samples $Y_1, Y_2, \ldots, Y_n$; that is,
$$f_Y(y) = \frac{g(y)\, f_X(y)}{\mu}, \tag{1}$$
where $g$ is the so-called bias function and $\mu = E g(X)$.

In many cases, a linear bias function is recommended, but in general its form should be studied via additional experiments. The purpose of this paper is to estimate the derivatives of the true density functions , ; we study the optimal convergence rate of wavelet estimators in norm over Besov spaces.

Size-biased data arise when the probability of an observation depends on its magnitude. Several examples of model (1) can be found in the literature [14]. For instance, [15] considers the distribution of blood alcohol concentration among intoxicated drivers; since drunken drivers have a larger chance of being arrested, the collected data are size-biased.
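The effect of size bias can be illustrated numerically. The following Python sketch assumes that model (1) has the standard size-biased form $f_Y(y) = g(y) f_X(y)/\mu$ with $\mu = E g(X)$. The choices $X \sim U(0,1)$ and $g(x) = 0.5 + x$, as well as all function names, are hypothetical and serve only as an example. Under this assumption $E[1/g(Y)] = 1/\mu$, which suggests a simple plug-in estimate of $\mu$ from the biased samples alone.

```python
import random


def g(x):
    """Hypothetical linear bias function, bounded away from zero on [0, 1]."""
    return 0.5 + x


def sample_size_biased(n, rng, g_max=1.5):
    """Draw n size-biased samples for X ~ Uniform(0, 1) by rejection:
    a candidate x is kept with probability g(x) / g_max."""
    out = []
    while len(out) < n:
        x = rng.random()
        if rng.random() * g_max < g(x):
            out.append(x)
    return out


def estimate_mu(ys):
    """Plug-in estimate of mu = E g(X), based on E[1 / g(Y)] = 1 / mu."""
    return len(ys) / sum(1.0 / g(y) for y in ys)


rng = random.Random(0)
ys = sample_size_biased(5000, rng)
mu_hat = estimate_mu(ys)      # the true value in this example is E[0.5 + X] = 1
mean_y = sum(ys) / len(ys)    # larger than E[X] = 0.5: large values are over-sampled
```

In this example the sample mean of the observed data overstates $E X$, whereas the weights $1/g(Y_i)$ compensate for the bias.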

The estimation problem for biased data (1) has been discussed in some papers. In 1982, Vardi [16] considered the nonparametric maximum likelihood estimation for . In 1991, Jones [17] discussed the mean squared error properties of the kernel density estimation. In 2004, Efromovich [18] developed the Efromovich-Pinsker adaptive Fourier estimator. It was based on a block shrinkage algorithm and achieved the minimax rate of convergence under the risk over the Besov class .

In 2010, Ramírez and Vidakovic [14] proposed a linear wavelet estimator and discussed the consistency of function in in the mean integrated squared error (MISE) sense. However, the wavelet estimator in [14] contained the unknown parameter . In the same year, Chesneau [10] constructed a nonlinear wavelet estimator and evaluated the risk in the Besov space . To our knowledge, however, no results are available on the estimation of density derivatives under model (1), although such estimation is very important for detecting possible bumps.

The current paper is organized as follows. In Section 2, we briefly describe the preliminaries on wavelets and Besov spaces. The linear estimator and its convergence rate are presented in Section 3. In order to discuss optimality, Section 4 is devoted to giving the lower bound for an arbitrary estimator. In Section 5, we consider the nonlinear wavelet estimator and its optimal convergence rate. Our estimates improve the theorems in [10, 13, 14, 18].

2. Wavelets and Besov Spaces

In this section, we will recall some useful and well-known concepts and lemmas.

In order to construct a wavelet basis, we need a structure in which $L^2(\mathbb{R})$ can be decomposed into a direct sum of mutually orthogonal spaces.

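Definition 1 (see [19]). A multiresolution analysis (MRA) of $L^2(\mathbb{R})$ is a set of increasing, closed linear subspaces $V_j \subset V_{j+1}$, for all $j \in \mathbb{Z}$, called scaling spaces, satisfying
(a) $\overline{\bigcup_{j} V_j} = L^2(\mathbb{R})$, $\bigcap_{j} V_j = \{0\}$;
(b) $f(x) \in V_j$ if and only if $f(2x) \in V_{j+1}$ for all $j \in \mathbb{Z}$;
(c) $f(x) \in V_0$ if and only if $f(x-k) \in V_0$ for all $k \in \mathbb{Z}$;
(d) there exists a function $\varphi \in V_0$ such that $\{\varphi(x-k), k \in \mathbb{Z}\}$ is an orthonormal basis in $V_0$.
The function $\varphi$ is called the scaling function of the multiresolution analysis.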

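With the standard notation $h_{jk}(x) = 2^{j/2} h(2^j x - k)$ in wavelet analysis, there exists a corresponding wavelet function $\psi$, such that for fixed $j$, $\{\psi_{jk}, k \in \mathbb{Z}\}$ is an orthonormal basis of $W_j$, which is the orthogonal complement of the space $V_j$ in $V_{j+1}$. For fixed $j_0$, both $\{\varphi_{j_0 k}, \psi_{jk}, j \geq j_0, k \in \mathbb{Z}\}$ and $\{\psi_{jk}, j, k \in \mathbb{Z}\}$ are orthonormal bases of $L^2(\mathbb{R})$.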

As usual, $L^p(\mathbb{R})$ denotes the classical Lebesgue space on the real line $\mathbb{R}$. Although wavelet bases are constructed for $L^2(\mathbb{R})$, most of them constitute unconditional bases for $L^p(\mathbb{R})$.

Lemma 2 (see [20]). Let be a compactly supported, orthonormal scaling function and the corresponding wavelet. Then for any with , the following expansion: converges to for almost everywhere , where

Lemma 3 (see [3]). If the scaling function satisfies , then for any sequence , one has where , , , .

Letting , , , , the Besov spaces are defined by with the associated norm , where denotes the smoothness modulus of , and

Between the different Besov spaces, the following embedding conclusions are established [3]. Let , ; then(i), ;(ii), , ,

where denotes that the Banach space is continuously embedded in the Banach space ; that is, there exists a constant such that, for any , we have .

A scaling function is called -regular, if has continuous derivatives of order , and its corresponding wavelet has vanishing moments of order ; that is,

One advantage of wavelets is that they can characterize Besov spaces.

Lemma 4 (see [3]). Let be a compactly supported, -regular orthonormal scaling function with the corresponding wavelet and . If , , and , , then the following are equivalent:(i);(ii), where is the projection operator to ; that is, ;(iii). In this case,

Note 1. The notation indicates that with a positive constant , which is independent of and . If and , we write .

In this paper, the Besov balls are defined by

3. Linear Estimator

In this section, we will give a linear estimator for density derivatives in Besov spaces .

The linear wavelet estimator of the derivative of a density is defined as follows: where
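The idea behind a linear wavelet estimator for size-biased data can be sketched in the simplest possible setting. The Python sketch below is not the estimator (9): it covers only derivative order zero and uses the Haar scaling function, which is not regular enough for derivative estimation. It assumes that model (1) has the form $f_Y = g f_X/\mu$, so that $E[\mu\, \varphi_{jk}(Y)/g(Y)] = \int \varphi_{jk} f_X$ and weighting each observation by $1/g(Y_i)$ removes the bias. The bias function, the resolution level, and all names are hypothetical.

```python
import math
import random


def g(x):
    """Hypothetical bias function, bounded away from zero on [0, 1]."""
    return 0.5 + x


def haar_linear_estimate(ys, j):
    """Order-zero linear estimate of f_X on [0, 1) at level j.

    Returns the 2**j values of the estimate on the dyadic cells
    [k / 2**j, (k + 1) / 2**j).
    """
    n, cells = len(ys), 2 ** j
    weights = [1.0 / g(y) for y in ys]
    mu_hat = n / sum(weights)           # plug-in estimate of mu
    alpha = [0.0] * cells
    for y, w in zip(ys, weights):
        k = math.floor(y * cells)       # cell index of y
        if 0 <= k < cells:
            alpha[k] += w
    scale = 2 ** (j / 2)                # Haar phi_jk equals 2^{j/2} on its cell
    # alpha_hat_jk = (mu_hat / n) * sum_i phi_jk(Y_i) / g(Y_i)
    alpha = [mu_hat * scale * a / n for a in alpha]
    # on cell k the estimate equals alpha_hat_jk * phi_jk = alpha_hat_jk * 2^{j/2}
    return [a * scale for a in alpha]


# Size-biased samples for X ~ Uniform(0, 1), drawn by rejection.
rng = random.Random(1)
ys = []
while len(ys) < 20000:
    x = rng.random()
    if rng.random() * 1.5 < g(x):
        ys.append(x)

values = haar_linear_estimate(ys, j=3)   # true density is 1 on [0, 1)
```

With the Haar function this reduces to a weighted histogram; the level `j` plays the role of the resolution parameter that the convergence analysis balances against the sample size.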

The following inequalities play important roles in this paper.

Lemma 5 (see [3] (Rosenthal inequality)). Let be independent random variables such that and ; then there exists a constant such that(i), ,(ii), .

For the coefficients defined in (9), although , we have the following estimate.

Lemma 6. If , then, for any , one has .

Proof. By the definitions of , and the triangle inequality, one observes that Since for any , one has that and . Thanks to the embedding theorem , for any , one gets . It is easy to see that , , are bounded. Using the convexity inequality, one obtains where (i) To estimate : denote . Note that are i.i.d. samples, and . Moreover, for any given integer , one has Therefore, By Rosenthal’s inequality (Lemma 5), if , one has if , one gets (ii) To estimate the term , since let . It is easy to see that , and, for any integer , one obtains Similarly, by Rosenthal’s inequality (Lemma 5), (a) for , , one gets (b) for , that is, , and , one has Summarizing the above estimates of , , one obtains that .

Theorem 7. Let the scaling function be compactly supported and -regular. If is the estimator defined in (9), then for , , one has where .

Proof. Firstly, using the triangle inequality and the convexity inequality, we decompose into the bias term and the stochastic term; that is, For the bias term , one can estimate it as follows. (i) When , Lemma 4 reduces to When , using Besov space embedding theorems , one has When , Hölder’s inequality and Lemma 4 tell us that Hence, for , one obtains that Next, we estimate the stochastic term . Clearly, due to Lemmas 3 and 6, one gets By choosing such that , one obtains that

Remark 8. Theorem 7 can be considered as a natural extension of [14] if , . Moreover, the next part shows the optimality of our linear estimator for .

4. Lower Bound and Optimality

This section is devoted to showing that the linear estimator defined in (9) attains the optimal convergence rate for . The idea of the proof is motivated by [21].

Lemma 9 (Varshamov-Gilbert Lemma [5]). Let , ; then there exists a subset of with such that and .

Lemma 10 (Fano’s lemma [22]). Let be probability measurable spaces and , . If for , one has where stands for the complement of and stands for Kullback distance in [5].

Based on the above lemmas, we have the following lower bound estimation.

Theorem 11. Let with , , and ; there exist two constants and such that . If is any estimator of with i.i.d. random samples, then

Proof. (i) Firstly, we prove It is sufficient to construct such that and Suppose that is a compactly supported, regular and orthonormal scaling function and is the corresponding wavelet with . Assume . Define , and Obviously, and ; that is, . Moreover, for and . So, one gets Clearly, satisfies for . Then, Fano’s Lemma 10 tells us that On the other hand, one has Then where . Next, one shows that .
Recall that where , . Note that, , if , one has Since and , Taking , then . One can choose such that and . Therefore One has Noting that , one gets (ii) Next, we prove Similarly, it is sufficient to construct , such that and As in the proof of (i), suppose that , . Defining , and with . Moreover, since , one knows that and By Lemma 10, . Hence . According to Lemma 9, there exist such that and Since for , this leads to and Clearly, the sets , , satisfy for . Using Fano’s Lemma 10, one has where , and one can get by arguments similar to those in (i).
Taking , then . One can choose a constant such that . By , then one obtains On the other hand, reduces to Therefore, one gets the following desired result:

Note that , if . Then we have the following corollary.

Corollary 12. If , the linear estimator (9) attains the optimal convergence rate.

5. Nonlinear Estimator

In this paper, the nonlinear wavelet estimator is defined as follows: where The hard thresholding wavelet coefficients are , where
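Hard thresholding keeps an empirical wavelet coefficient unchanged when its magnitude exceeds a threshold and sets it to zero otherwise. The following Python sketch shows only this generic rule; the threshold value `lam` and the sample coefficients are hypothetical and are not the level-dependent threshold used in (57).

```python
def hard_threshold(coeffs, lam):
    """Keep each coefficient whose absolute value exceeds lam; zero the rest."""
    return [b if abs(b) > lam else 0.0 for b in coeffs]


# Small coefficients are treated as noise and discarded.
kept = hard_threshold([0.05, -0.3, 0.2, -0.01], lam=0.1)
# kept == [0.0, -0.3, 0.2, 0.0]
```

Because only the coefficients that survive thresholding enter the sum, the resulting estimator depends nonlinearly on the data, unlike the estimator of Section 3.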

For the wavelet coefficients, we have the following lemma, whose proof is very similar to that of Lemma 6 and is omitted.

Lemma 13. If , then, for any , one has .

Lemma 14 (see [3] (Bernstein inequality)). Let be independent random variables such that , ; then

Lemma 15. If , then, for any , there exists a constant such that

Proof. One can easily get where , . So, one obtains that Now, we estimate . Clearly, and . Moreover, one has where . By Bernstein’s inequality and , one knows that Taking such that , then Next, we estimate . It is easy to see that Since , Bernstein’s inequality tells us that Taking such that , one gets Letting , by (66) and (69), one obtains that .

Lemma 16. Suppose that the parameters , , of the wavelet estimator defined in (57) satisfy the assumptions: then one has where , are positive constants.

Proof. According to Lemma 3, one obtains that Recalling that , one has Note that and one gets ; that is, , if . Therefore, where (i) Firstly, we estimate Let . By , one obtains that On the other hand, implies . Taking , one has Note that if and only if . When , that is, , one can compute . From (70), (71), one obtains that where , are positive constants. (ii) We estimate Due to Lemma 13, one has Using and Jensen’s inequality, it can be proved that Similar to the proof of , by and , one gets Observing that , one obtains that (iii) Finally, we estimate Let , and . Thanks to Jensen’s inequality and Hölder’s inequality, one has According to Lemmas 13 and 15, one knows that Choosing large enough such that , one gets Taking as in (70), one has Putting (81), (86), and (91) together, one obtains that

Theorem 17. Let the scaling function be orthonormal, compactly supported, and -regular. If is the nonlinear wavelet estimator defined in (57), and the assumptions (70), (71), and (81) are satisfied, then for any , , one has where , , are positive constants.

Proof. By the definition of in (57) and the expansion of in (2), one has Hence, where Firstly, we estimate Thanks to Lemma 3 and Jensen’s inequality, one obtains that Due to Lemma 6, one has . Since and are compactly supported, then the number of elements in is . Therefore, From (70), one has Next, we estimate term When , , one has , thanks to the Besov spaces embedding theorem. By Lemma 4, one has Taking as in (71), it can be proved that Finally, we estimate Using Lemma 16, one has By (100), (103), and (105), one obtains that

Remark 18. Since implies that , one can easily find . Then, Theorems 17 and 7 tell us that the nonlinear estimator does better than the linear one. Moreover, Theorems 17 and 11 show that our nonlinear one attains the optimal convergence rate.

Corollary 19. When , the nonlinear estimator (57) attains the optimal convergence rate up to a factor.

Remark 20. If , , our estimation reduces to Ramírez and Vidakovic’s result [14].

Remark 21. If , our results can be considered as the natural extension of paper [10].

Remark 22. If , model (1) reduces to the case of samples without any noise, and our results reduce to the estimates in [13]. Even for , our result also attains the optimal convergence rate.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This paper is supported by CSC Foundation and Fundamental Research Fund of BJUT.