Abstract

We consider the estimation of an unknown function $f$ for weakly dependent data ($\alpha$-mixing) in a general setting. Our contribution is theoretical: we prove that a hard thresholding wavelet estimator attains a sharp rate of convergence under the mean integrated squared error (MISE) over Besov balls without imposing overly restrictive assumptions on the model. Applications are given for two types of inverse problems: deconvolution density estimation and density estimation in a GARCH-type model, both improving existing results in this dependent context. A further application concerns the regression model with random design.

1. Introduction

A general nonparametric problem is adopted: we aim to estimate an unknown function $f$ via $n$ random variables $Y_1, \dots, Y_n$ from a strictly stationary stochastic process $(Y_t)_{t \in \mathbb{Z}}$. We suppose that $(Y_t)_{t \in \mathbb{Z}}$ has a weak dependence structure; the $\alpha$-mixing case is considered. This kind of dependence naturally appears in numerous models such as Markov chains, GARCH-type models, and discretely observed diffusions (see, e.g., [13]). The problems where $f$ is the density of $Y_1$ or a regression function have received a lot of attention. A partial list of related works includes Robinson [4], Roussas [5, 6], Truong and Stone [7], Tran [8], Masry [9, 10], Masry and Fan [11], Bosq [12], and Liebscher [13].

For an efficient estimation of $f$, many methods can be considered. The most popular of them are based on kernels, splines, and wavelets. In this note we deal with wavelet methods, which were introduced in the i.i.d. setting by Donoho and Johnstone [14, 15] and Donoho et al. [16, 17]. These methods enjoy remarkable local adaptivity against discontinuities and spatially varying degrees of oscillation. Complete reviews and discussions on wavelets in statistics can be found in, for example, Antoniadis [18] and Härdle et al. [19]. In the context of $\alpha$-mixing dependence, various wavelet methods have been elaborated for a wide variety of nonparametric problems. Recent developments can be found in, for example, Leblanc [20], Tribouley and Viennet [21], Masry [22], Patil and Truong [23], Doosti et al. [24], Doosti and Niroumand [25], Doosti et al. [26], Cai and Liang [27], Niu and Liang [28], Benatia and Yahia [29], Chesneau [30–32], Chaubey and Shirazi [33], and Abbaszadeh and Emadi [34].

In the general dependent setting described above, we provide a theoretical contribution on the performance of a wavelet estimator based on hard thresholding. This nonlinear wavelet procedure has the feature of being fully adaptive and efficient over a large class of functions (see, e.g., [14–17, 35]). Following the spirit of Kerkyacharian and Picard [36], we determine the assumptions on the model and on the wavelet basis that ensure that the considered estimator attains a fast rate of convergence under the MISE over Besov balls. The obtained rate of convergence often corresponds to the near optimal one, in the minimax sense, for the standard i.i.d. case. The originality of our result is that it is both general and sharp; it can be applied to nonparametric models of different natures and improves some existing results. This fact is illustrated by the density deconvolution estimation problem and the density estimation problem in a GARCH-type model, improving ([30], Proposition 5.1) and ([31], Theorem 2), respectively. A last part is devoted to the regression model with random design; the obtained result completes the one of Patil and Truong [23].

The organization of this note is as follows. In the next section we describe the considered wavelet setting. The hard thresholding estimator and its rate of convergence under the MISE over Besov balls are presented in Section 3. Applications of our general result are given in Section 4. The proofs are carried out in Section 5.

2. Wavelets and Besov Balls

In this section we introduce some notations corresponding to wavelets and Besov balls.

2.1. Wavelet Basis

We consider the wavelet basis on $[0,1]$ constructed from the Daubechies wavelets db2N with $N \ge 1$ (see, e.g., [37]). A brief description of this basis is given below. Let $\phi$ and $\psi$ be the initial (father and mother) wavelet functions of the family db2N. These functions have the particularity of being compactly supported and of belonging to the class $\mathcal{C}^{a}$ for an $a > 0$ depending on $N$. For any $j \ge 0$, we set $\Lambda_j = \{0, 1, \dots, 2^j - 1\}$ and, for $k \in \Lambda_j$,
$$\phi_{j,k}(x) = 2^{j/2}\,\phi(2^j x - k), \qquad \psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k).$$

With appropriate treatments at the boundaries, there exists an integer $\tau$ such that, for any integer $\ell \ge \tau$, the collection
$$\mathcal{B} = \bigl\{\phi_{\ell,k},\ k \in \Lambda_\ell;\ \psi_{j,k},\ j \ge \ell,\ k \in \Lambda_j\bigr\}$$
is an orthonormal basis of $\mathbb{L}^2([0,1])$. For any integer $\ell \ge \tau$ and $h \in \mathbb{L}^2([0,1])$, we have the following wavelet expansion:
$$h(x) = \sum_{k \in \Lambda_\ell} \alpha_{\ell,k}\,\phi_{\ell,k}(x) + \sum_{j=\ell}^{\infty}\sum_{k \in \Lambda_j} \beta_{j,k}\,\psi_{j,k}(x), \tag{3}$$
where $\alpha_{j,k}$ and $\beta_{j,k}$ denote the wavelet coefficients of $h$ defined by
$$\alpha_{j,k} = \int_0^1 h(x)\,\phi_{j,k}(x)\,dx, \qquad \beta_{j,k} = \int_0^1 h(x)\,\psi_{j,k}(x)\,dx. \tag{4}$$
Technical details can be found in, for example, Cohen et al. [38] and Mallat [39].
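For readers who want to experiment with this construction numerically, the following sketch illustrates the expansion (3) using the PyWavelets package; the wavelet db4, the decomposition level, the test function, and the grid size are our arbitrary choices for illustration, not prescriptions of the paper.

```python
# Numerical illustration of the expansion (3) with the PyWavelets package.
# The wavelet 'db4', the level, the grid size, and the test function are
# arbitrary choices for this sketch.
import numpy as np
import pywt

n_grid = 1024
x = np.linspace(0.0, 1.0, n_grid, endpoint=False)
h = np.sin(2 * np.pi * x) + (x > 0.5)          # test function with a jump

# coeffs[0] plays the role of the alpha's, coeffs[1:] of the beta_j's
coeffs = pywt.wavedec(h, 'db4', mode='periodization', level=5)

# Perfect reconstruction (up to numerical error) from the coefficients
h_rec = pywt.waverec(coeffs, 'db4', mode='periodization')
print(np.max(np.abs(h - h_rec)))               # ~1e-12
```

The jump at $x = 0.5$ is exactly the type of local feature for which wavelet coefficients remain informative.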

In the main result of this paper, we will investigate the MISE rate of the proposed estimator by assuming that the unknown function of interest belongs to a wide class of functions: the Besov class. Its definition in terms of wavelet coefficients is presented in the following.

2.2. Besov Balls

We say that $h \in B^s_{p,r}(M)$ with $s > 0$, $p, r \ge 1$, and $M > 0$ if and only if there exists a constant $M^* > 0$ (depending on $M$) such that the wavelet coefficients of $h$ given by (4) satisfy
$$2^{\tau(1/2 - 1/p)}\left(\sum_{k \in \Lambda_\tau} |\alpha_{\tau,k}|^p\right)^{1/p} + \left(\sum_{j=\tau}^{\infty}\left(2^{j(s + 1/2 - 1/p)}\left(\sum_{k \in \Lambda_j} |\beta_{j,k}|^p\right)^{1/p}\right)^{r}\right)^{1/r} \le M^*, \tag{5}$$
with the usual modifications if $p = \infty$ or $r = \infty$. Note that, for particular choices of $s$, $p$, and $r$, $B^s_{p,r}(M)$ contains the classical Hölder and Sobolev balls (see, e.g., [40] and [19]).

Remark 1. We have chosen a wavelet basis on $[0,1]$ to fix the notation; a wavelet basis on another interval can be considered in the rest of the study without affecting the results.

3. Statistical Framework, Estimator and Result

3.1. Statistical Framework

As mentioned in Section 1, a nonparametric estimation setting as general as possible is adopted: we aim to estimate an unknown function $f$ via $n$ random variables (or vectors) $Y_1, \dots, Y_n$ from a strictly stationary stochastic process $(Y_t)_{t \in \mathbb{Z}}$ defined on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$. We suppose that $(Y_t)_{t \in \mathbb{Z}}$ has an $\alpha$-mixing dependence structure with exponential decay rate; that is, there exist two constants $\gamma > 0$ and $c > 0$ such that
$$\alpha_m \le \gamma\,\exp(-c\,m), \qquad m \ge 1, \tag{6}$$
where
$$\alpha_m = \sup_{(A, B) \in \mathcal{F}_{-\infty}^{0} \times \mathcal{F}_{m}^{\infty}} \bigl|\mathbb{P}(A \cap B) - \mathbb{P}(A)\,\mathbb{P}(B)\bigr|,$$
$\mathcal{F}_{-\infty}^{0}$ is the $\sigma$-algebra generated by the random variables (or vectors) $\dots, Y_{-1}, Y_0$, and $\mathcal{F}_{m}^{\infty}$ is the $\sigma$-algebra generated by the random variables (or vectors) $Y_m, Y_{m+1}, \dots$

The $\alpha$-mixing dependence is reasonably weak; it is satisfied by a wide variety of models, including Markov chains, GARCH-type models, and discretely observed diffusions (see, for instance, [13, 41]).

The considered estimator for $f$ is presented below.

3.2. Estimator

We define the hard thresholding wavelet estimator $\hat{f}$ by
$$\hat{f}(x) = \sum_{k \in \Lambda_\tau} \hat{\alpha}_{\tau,k}\,\phi_{\tau,k}(x) + \sum_{j=\tau}^{j_1}\sum_{k \in \Lambda_j} \hat{\beta}_{j,k}\,\mathbf{1}_{\{|\hat{\beta}_{j,k}| \ge \kappa 2^{\delta j}\lambda_n\}}\,\psi_{j,k}(x), \tag{7}$$
where $\mathbf{1}$ is the indicator function, $\kappa$ is a large enough constant,
$$\hat{\alpha}_{j,k} = \frac{1}{n}\sum_{i=1}^{n} q(\phi_{j,k}, Y_i), \qquad \hat{\beta}_{j,k} = \frac{1}{n}\sum_{i=1}^{n} q(\psi_{j,k}, Y_i), \tag{8}$$
$j_1$ is the integer satisfying
$$j_1 = \left\lfloor \frac{1}{2\delta + 1}\,\log_2 \frac{n}{\ln n}\right\rfloor, \tag{9}$$
where $\lfloor x \rfloor$ denotes the integer part of $x$, and $\lambda_n$ is the threshold
$$\lambda_n = \sqrt{\frac{\ln n}{n}}. \tag{10}$$
Here it is supposed that there exists a function $q$ such that
(H1) for $\gamma \in \{\phi, \psi\}$, any integer $j \ge \tau$, and $k \in \Lambda_j$,
$$\mathbb{E}\bigl(q(\gamma_{j,k}, Y_1)\bigr) = \int_0^1 f(x)\,\gamma_{j,k}(x)\,dx,$$
where $\mathbb{E}$ denotes the expectation;
(H2) there exist two constants, $\delta \ge 0$ and $C > 0$, satisfying, for $\gamma \in \{\phi, \psi\}$, for any integer $j \ge \tau$ and $k \in \Lambda_j$:
(i) $\sup_{y} |q(\gamma_{j,k}, y)| \le C\,2^{\delta j}\sqrt{2^j}$,
(ii) $\mathbb{E}\bigl(q^2(\gamma_{j,k}, Y_1)\bigr) \le C\,2^{2\delta j}$,
(iii) for any $v \in \{2, \dots, n\}$,
$$\bigl|\mathrm{Cov}\bigl(q(\gamma_{j,k}, Y_1),\, q(\gamma_{j,k}, Y_v)\bigr)\bigr| \le C\,2^{2\delta j}\,2^{-j},$$
where $\mathrm{Cov}$ denotes the covariance; that is, $\mathrm{Cov}(U, V) = \mathbb{E}(U\overline{V}) - \mathbb{E}(U)\,\mathbb{E}(\overline{V})$, and $\overline{V}$ denotes the complex conjugate of $V$.
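To fix ideas, the selection rule in (7) can be sketched in a few lines of code; the names `beta_hat`, `delta`, and `kappa` are placeholders to be supplied by the specific model (a minimal illustration, not the paper's code).

```python
# Sketch of the hard thresholding rule in (7): a detail coefficient
# beta_hat[j][k] is kept only if |beta_hat[j][k]| >= kappa * 2^(delta*j) * lambda_n,
# with lambda_n = sqrt(ln n / n). All names are illustrative placeholders.
import math

def hard_threshold(beta_hat, n, delta=0.0, kappa=4.0):
    """beta_hat: dict mapping a level j to the list of empirical coefficients."""
    lam_n = math.sqrt(math.log(n) / n)
    kept = {}
    for j, coeffs in beta_hat.items():
        t_j = kappa * (2.0 ** (delta * j)) * lam_n   # level-dependent threshold
        kept[j] = [b if abs(b) >= t_j else 0.0 for b in coeffs]
    return kept
```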

For well-known nonparametric models in the i.i.d. setting, hard thresholding wavelet estimators and important results can be found in, for example, Donoho and Johnstone [14, 15], Donoho et al. [16, 17], Delyon and Juditsky [35], Kerkyacharian and Picard [36], and Fan and Koo [42]. In the $\alpha$-mixing context, $\hat f$ defined by (7) is a general and improved version of the estimators considered in Chesneau [30, 31]. The main differences are the presence of the tuning parameter $\delta$ and the global definition of the function $q$, which offer numerous possibilities of application. Three of them are explored in Section 4.

Comments on the Assumptions. The assumption (H1) ensures that the estimators (8) are unbiased for the coefficients $\alpha_{j,k}$ and $\beta_{j,k}$ given by (4), whereas (H2) is related to their good performance. See Proposition 10. These assumptions are not too restrictive. For instance, if we consider the standard density estimation problem where $Y_1, \dots, Y_n$ are i.i.d. random variables with a bounded density $f$, the function $q(\gamma_{j,k}, y) = \gamma_{j,k}(y)$ satisfies (H1) and (H2) with $\delta = 0$ (note that, thanks to the independence of $Y_1, \dots, Y_n$, the covariance term in (H2)-(iii) is zero). The technical details are given in Donoho et al. [17].
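In this density estimation case, since $q(\gamma_{j,k}, y) = \gamma_{j,k}(y)$, the estimators (8) are plain empirical means. A sketch of $\hat\beta_{j,k}$, assuming PyWavelets to tabulate $\psi$ (the interpolation step is a numerical approximation of $\psi_{j,k}$, not part of the theory):

```python
# Empirical detail coefficient (8) for density estimation, where
# q(psi_jk, y) = psi_jk(y), so hat_beta_jk = (1/n) * sum_i psi_jk(Y_i).
# psi is tabulated by PyWavelets and evaluated by linear interpolation.
import numpy as np
import pywt

def beta_hat(Y, j, k, wavelet='db4'):
    _, psi, x = pywt.Wavelet(wavelet).wavefun(level=10)
    values = 2 ** (j / 2) * np.interp(2 ** j * np.asarray(Y) - k, x, psi,
                                      left=0.0, right=0.0)
    return values.mean()
```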

Lemma 2 describes a simple situation in which assumption (H2)-(iii) is satisfied.

Lemma 2. We make the following assumptions.
(F1) Let $f_{Y_1}$ be the density of $Y_1$ and let $f_{(Y_1, Y_v)}$ be the density of $(Y_1, Y_v)$ for any $v \ge 2$. We suppose that there exists a constant $C > 0$ such that
$$\sup_{v \ge 2}\,\sup_{(x,y)}\bigl|f_{(Y_1, Y_v)}(x, y) - f_{Y_1}(x)\,f_{Y_1}(y)\bigr| \le C.$$
(F2) There exist two constants, $\delta \ge 0$ and $C > 0$, satisfying, for $\gamma \in \{\phi, \psi\}$, for any integer $j \ge \tau$ and $k \in \Lambda_j$,
$$\int \bigl|q(\gamma_{j,k}, x)\bigr|\,dx \le C\,2^{\delta j}\,2^{-j/2}.$$
Then, under (F1) and (F2), (H2)-(iii) is satisfied.
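The mechanism behind Lemma 2 fits in one chain of inequalities; here is a sketch of the computation under (F1) and (F2), with all constants absorbed into $C$:

```latex
% Sketch of the covariance bound behind Lemma 2, under (F1) and (F2).
\begin{align*}
\bigl|\mathrm{Cov}\bigl(q(\gamma_{j,k},Y_1),\,q(\gamma_{j,k},Y_v)\bigr)\bigr|
 &= \Bigl|\iint q(\gamma_{j,k},x)\,q(\gamma_{j,k},y)
    \bigl(f_{(Y_1,Y_v)}(x,y)-f_{Y_1}(x)\,f_{Y_1}(y)\bigr)\,dx\,dy\Bigr| \\
 &\le \sup_{v\ge 2}\,\sup_{(x,y)}\bigl|f_{(Y_1,Y_v)}(x,y)-f_{Y_1}(x)\,f_{Y_1}(y)\bigr|
      \left(\int\bigl|q(\gamma_{j,k},x)\bigr|\,dx\right)^{2} \\
 &\le C\,\bigl(2^{\delta j}\,2^{-j/2}\bigr)^{2} = C\,2^{2\delta j}\,2^{-j},
\end{align*}
% which is exactly the bound required by (H2)-(iii).
```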

3.3. Result

Theorem 3 determines the rate of convergence attained by $\hat f$ under the MISE over Besov balls.

Theorem 3. We consider the general statistical setting described in Section 3.1. Let $\hat f$ be (7) under (H1) and (H2). Suppose that $f \in B^s_{p,r}(M)$ with $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1, 2)$ and $s > (2\delta + 1)/p\}$. Then there exists a constant $C > 0$ such that
$$\mathbb{E}\left(\int_0^1 \bigl(\hat f(x) - f(x)\bigr)^2\,dx\right) \le C\left(\frac{\ln n}{n}\right)^{2s/(2s + 2\delta + 1)}.$$

The rate of convergence $(\ln n / n)^{2s/(2s + 2\delta + 1)}$ is often the near optimal one in the minimax sense for numerous statistical problems in an i.i.d. setting (see, e.g., [19, 43]). Moreover, note that Theorem 3 is flexible; the assumptions on the model, related to the definition of $q$ in (H1) and (H2), are mild. In the next section, this flexibility is illustrated for three sophisticated nonparametric estimation problems: the density deconvolution estimation problem, the density estimation problem in a GARCH-type model, and the regression function estimation problem in the regression model with random design.
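As a sanity check on the exponent, the rate can be specialized to illustrative values of $s$ and $\delta$ (the numerical values below are ours, chosen only as examples):

```latex
% Worked instances of the rate (ln n / n)^{2s/(2s + 2\delta + 1)}:
% direct density estimation corresponds to \delta = 0, while an inverse
% problem such as deconvolution inflates \delta and slows the rate.
\[
s = 2,\ \delta = 0:\quad
\left(\frac{\ln n}{n}\right)^{4/5},
\qquad
s = 2,\ \delta = 1:\quad
\left(\frac{\ln n}{n}\right)^{4/7}.
\]
```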

4. Applications

4.1. Density Deconvolution

Let $(Y_t)_{t \in \mathbb{Z}}$ be a strictly stationary stochastic process such that
$$Y_t = X_t + \epsilon_t, \qquad t \in \mathbb{Z}, \tag{17}$$
where $(X_t)_{t \in \mathbb{Z}}$ is a strictly stationary stochastic process with unknown density $f$ and $(\epsilon_t)_{t \in \mathbb{Z}}$ is a strictly stationary stochastic process with known density $g$. It is supposed that $X_t$ and $\epsilon_t$ are independent for any $t \in \mathbb{Z}$ and that $(Y_t)_{t \in \mathbb{Z}}$ is an $\alpha$-mixing process with exponential decay rate (see Section 3.1 for a precise definition). Our aim is to estimate $f$ via $n$ observations $Y_1, \dots, Y_n$ from $(Y_t)_{t \in \mathbb{Z}}$. Some related works are Masry [44], Kulik [45], Comte et al. [46], and Van Zanten and Zareba [47].
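For intuition, here is a toy simulation of the model (17). The Gaussian AR(1) driver (a classical example of an $\alpha$-mixing process with exponentially decaying coefficients), the map to $[0,1]$, and the Laplace noise are our illustrative choices, not requirements of the theory.

```python
# Toy simulation of the deconvolution model (17): Y_t = X_t + eps_t.
# X_t is a Gaussian AR(1) mapped to [0,1]; eps_t has a known density g.
# All distributional choices here are illustrative assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, rho = 5000, 0.5

z = np.empty(n)
z[0] = rng.standard_normal()
for t in range(1, n):                        # Gaussian AR(1) driver
    z[t] = rho * z[t - 1] + np.sqrt(1 - rho ** 2) * rng.standard_normal()

X = norm.cdf(z)                              # strictly stationary, on [0,1]
eps = rng.laplace(scale=0.1, size=n)         # noise with known density g
Y = X + eps                                  # observed sample Y_1, ..., Y_n
```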

We formulate the following assumptions.
(G1) The support of $f$ is $[0,1]$.
(G2) There exists a constant $C > 0$ such that $\sup_{x \in [0,1]} f(x) \le C$.
(G3) Let $f_{Y_1}$ be the density of $Y_1$. We suppose that there exists a constant $C > 0$ such that $\sup_{x \in \mathbb{R}} f_{Y_1}(x) \le C$.
(G4) For any $v \ge 2$, let $f_{(Y_1, Y_v)}$ be the density of $(Y_1, Y_v)$. We suppose that there exists a constant $C > 0$ such that $\sup_{v \ge 2}\sup_{(x,y) \in \mathbb{R}^2} f_{(Y_1, Y_v)}(x, y) \le C$.
(G5) For any integrable function $h$, we define its Fourier transform by
$$\mathcal{F}(h)(x) = \int_{-\infty}^{\infty} h(y)\,e^{-ixy}\,dy, \qquad x \in \mathbb{R}.$$

We suppose that there exist three known constants, $c_* > 0$, $C_* > 0$, and $\delta > 1$, such that, for any $x \in \mathbb{R}$:
(i) the Fourier transform of $g$ satisfies
$$|\mathcal{F}(g)(x)| \ge c_*\,(1 + x^2)^{-\delta/2};$$
(ii) for any $q \in \{0, 1, 2\}$, the $q$th derivative of the Fourier transform of $g$ satisfies
$$\bigl|(\mathcal{F}(g))^{(q)}(x)\bigr| \le C_*\,(1 + x^2)^{-\delta/2}.$$
We are now in the position to present the result.

Theorem 4. We consider the model (17). Suppose that (G1)–(G5) are satisfied. Let $\hat f$ be defined as in (7) with
$$q(\gamma_{j,k}, y) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \mathcal{F}(\gamma_{j,k})(x)\,\frac{e^{ixy}}{\overline{\mathcal{F}(g)(x)}}\,dx, \tag{24}$$
where $\overline{z}$ denotes the complex conjugate of $z$, and with $\delta$ the constant appearing in (G5).
Suppose that $f \in B^s_{p,r}(M)$ with $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta + 1)/p\}$. Then there exists a constant $C > 0$ such that
$$\mathbb{E}\left(\int_0^1 \bigl(\hat f(x) - f(x)\bigr)^2\,dx\right) \le C\left(\frac{\ln n}{n}\right)^{2s/(2s + 2\delta + 1)}.$$

Theorem 4 improves ([30], Proposition 5.1) in terms of rate of convergence; we gain a logarithmic term.

Moreover, it is established that, in the i.i.d. setting, $(\ln n / n)^{2s/(2s + 2\delta + 1)}$ is (i) exactly the rate of convergence attained by the hard thresholding wavelet estimator and (ii) the near optimal rate of convergence in the minimax sense.

The details can be found in Fan and Koo [42]. Thus, Theorem 4 can be viewed as an extension of this existing result to the weak dependent case.

4.2. GARCH-Type Model

We consider the strictly stationary stochastic process $(Y_t)_{t \in \mathbb{Z}}$ where, for any $t \in \mathbb{Z}$,
$$Y_t = X_t\,\epsilon_t, \tag{26}$$
$(X_t)_{t \in \mathbb{Z}}$ is a strictly stationary stochastic process with unknown density $f$, and $(\epsilon_t)_{t \in \mathbb{Z}}$ is a strictly stationary stochastic process with known density $g$. It is supposed that $X_t$ and $\epsilon_t$ are independent for any $t \in \mathbb{Z}$ and that $(Y_t)_{t \in \mathbb{Z}}$ is an $\alpha$-mixing process with exponential decay rate (see Section 3.1 for a precise definition). Our aim is to estimate $f$ via $n$ observations $Y_1, \dots, Y_n$ from $(Y_t)_{t \in \mathbb{Z}}$. Some related works are Comte et al. [46] and Chesneau [31].
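A toy simulation of the multiplicative model (26) can be obtained along the same lines as the sketch given for (17) (again, the distributional choices are ours, for illustration only):

```python
# Toy simulation of the GARCH-type model (26): Y_t = X_t * eps_t, with X_t
# alpha-mixing with unknown density f and eps_t with known density g.
# All distributional choices here are illustrative assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, rho = 5000, 0.5

z = np.empty(n)
z[0] = rng.standard_normal()
for t in range(1, n):                        # Gaussian AR(1) driver
    z[t] = rho * z[t - 1] + np.sqrt(1 - rho ** 2) * rng.standard_normal()

X = norm.cdf(z)                              # factor with unknown density f
eps = rng.exponential(scale=1.0, size=n)     # noise with known density g
Y = X * eps                                  # observed sample
```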

We formulate the following assumptions.
(J1) There exists a positive integer $\theta$ such that $g_\theta$ is bounded. Let us remark that $g_u$ denotes the density of $\prod_{i=1}^{u} e_i$, where $e_1, \dots, e_u$ are i.i.d. random variables having the common distribution of $\epsilon_1$.
(J2) The support of $f$ is $[0,1]$ and $f$ is bounded.
(J3) Let $f_{Y_1}$ be the density of $Y_1$. We suppose that there exists a constant $C > 0$ such that $\sup_{x} f_{Y_1}(x) \le C$.
(J4) For any $v \ge 2$, let $f_{(Y_1, Y_v)}$ be the density of $(Y_1, Y_v)$. We suppose that there exists a constant $C > 0$ such that $\sup_{v \ge 2}\sup_{(x,y)} f_{(Y_1, Y_v)}(x, y) \le C$.
We are now in the position to present the result.

Theorem 5. We consider the model (26). Suppose that (J1)–(J4) are satisfied. Let $\hat f$ be defined as in (7) with $q$ the function (30) constructed, as in [31], from the integer $\theta$ and the density $g_\theta$ appearing in (J1).
Suppose that $f \in B^s_{p,r}(M)$ with $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta + 1)/p\}$, where $\delta \ge 0$ is the constant for which (H2) holds. Then there exists a constant $C > 0$ such that
$$\mathbb{E}\left(\int_0^1 \bigl(\hat f(x) - f(x)\bigr)^2\,dx\right) \le C\left(\frac{\ln n}{n}\right)^{2s/(2s + 2\delta + 1)}.$$

Theorem 5 significantly improves ([31], Theorem 2) in terms of rate of convergence; the exponent of the rate is strictly increased.

4.3. Nonparametric Regression Model

We consider the strictly stationary stochastic process $((X_t, Y_t))_{t \in \mathbb{Z}}$ where, for any $t \in \mathbb{Z}$,
$$Y_t = f(X_t) + \epsilon_t, \tag{32}$$
$(X_t)_{t \in \mathbb{Z}}$ is a strictly stationary stochastic process with unknown density $g$, $(\epsilon_t)_{t \in \mathbb{Z}}$ is a strictly stationary centered stochastic process, and $f$ is the unknown regression function. It is supposed that $X_t$ and $\epsilon_t$ are independent for any $t \in \mathbb{Z}$ and that $((X_t, Y_t))_{t \in \mathbb{Z}}$ is an $\alpha$-mixing process with exponential decay rate (see Section 3.1 for a precise definition). Our aim is to estimate $f$ via the $n$ random vectors $(X_1, Y_1), \dots, (X_n, Y_n)$ from $((X_t, Y_t))_{t \in \mathbb{Z}}$. Applications of this problem can be found in Härdle [48]. Wavelet methods can be found in Patil and Truong [23], Doosti et al. [24], Doosti et al. [26], and Doosti and Niroumand [25].

We formulate the following assumptions.
(K1) The supports of $f$ and $g$ are contained in $[0,1]$.
(K2) $f$ is bounded.
(K3) There exists a constant $C > 0$ such that $|\epsilon_1| \le C$.
(K4) There exist two constants $c > 0$ and $C > 0$ such that $c \le g(x) \le C$ for any $x \in [0,1]$.
(K5) Let $f_{(X_1, Y_1)}$ be the density of $(X_1, Y_1)$. We suppose that there exists a constant $C > 0$ such that $\sup_{(x,y)} f_{(X_1, Y_1)}(x, y) \le C$.
(K6) For any $v \ge 2$, let $f_{((X_1, Y_1), (X_v, Y_v))}$ be the density of $((X_1, Y_1), (X_v, Y_v))$. We suppose that there exists a constant $C > 0$ such that $\sup_{v \ge 2} \sup f_{((X_1, Y_1), (X_v, Y_v))} \le C$.
We are now in the position to present the result.

Theorem 6. We consider the model (32). Suppose that (K1)–(K6) are satisfied. Let $\hat f$ be the truncated ratio estimator
$$\hat f(x) = \frac{\hat\ell(x)}{\hat g(x)}\,\mathbf{1}_{\{|\hat g(x)| \ge c/2\}}, \tag{37}$$
where
(i) $\hat\ell$ is defined as in (7) with
$$q(\gamma_{j,k}, (x, y)) = y\,\gamma_{j,k}(x) \tag{38}$$
and $\delta = 0$,
(ii) $\hat g$ is defined as in (7) with $g$ instead of $f$,
$$q(\gamma_{j,k}, x) = \gamma_{j,k}(x), \tag{39}$$
and $\delta = 0$,
(iii) $c$ is the constant defined in (K4).
Suppose that $\ell \in B^s_{p,r}(M)$ and $g \in B^s_{p,r}(M)$ with $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > 1/p\}$. Then there exists a constant $C > 0$ such that
$$\mathbb{E}\left(\int_0^1 \bigl(\hat f(x) - f(x)\bigr)^2\,dx\right) \le C\left(\frac{\ln n}{n}\right)^{2s/(2s + 1)}.$$

The estimator (37) is derived by combining the procedure of Patil and Truong [23] with the truncated approach of Vasiliev [49].
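Schematically, the truncation step of (37) can be coded as follows; `ell_hat` and `g_hat` stand for grid evaluations of the wavelet estimators of $\ell = fg$ and $g$, and `c` for the lower bound in (K4) (all names are placeholders for this sketch).

```python
# Sketch of the truncated ratio estimator (37): f_hat = ell_hat / g_hat on
# the region where the estimated denominator is not too small, 0 elsewhere.
import numpy as np

def truncated_ratio(ell_hat, g_hat, c):
    ell_hat = np.asarray(ell_hat, dtype=float)
    g_hat = np.asarray(g_hat, dtype=float)
    keep = np.abs(g_hat) >= c / 2.0          # Vasiliev-type truncation region
    safe = np.where(keep, g_hat, 1.0)        # avoid division by zero
    return np.where(keep, ell_hat / safe, 0.0)
```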

Theorem 6 completes the results of Patil and Truong [23] in terms of rates of convergence under the MISE over Besov balls.

Remark 7. The assumption (K2) can be relaxed with a strategy different from the one developed in Theorem 6. Some technical elements are given in Chesneau [32].
Conclusion. Considering the weakly dependent case for the observations, we prove a general result on the rate of convergence attained by a hard thresholding wavelet estimator under the MISE over Besov balls. This result is flexible; it can be applied to a wide class of statistical models. Moreover, the obtained rate of convergence is sharp; it can correspond to the near optimal one, in the minimax sense, for the standard i.i.d. case. Some recent results on sophisticated statistical problems are improved. Thanks to this flexibility, the perspectives of application of our theoretical result in other contexts are numerous.

5. Proofs

In this section, $C$ denotes any constant that does not depend on $j$, $k$, or $n$. Its value may change from one term to another and may depend on $\phi$ or $\psi$.

5.1. Key Lemmas

Let us present two lemmas which will be used in the proofs.

Lemma 8 shows a sharp covariance inequality under the $\alpha$-mixing condition.

Lemma 8 (see [50]). Let $(Y_t)_{t \in \mathbb{Z}}$ be a strictly stationary $\alpha$-mixing process with mixing coefficients $(\alpha_m)_{m \ge 1}$, and let $h$ and $\varrho$ be two measurable functions. Let $p > 0$ and $q > 0$ satisfying $1/p + 1/q < 1$ be such that $\mathbb{E}(|h(Y_1)|^p)$ and $\mathbb{E}(|\varrho(Y_1)|^q)$ exist. Then there exists a constant $C > 0$ such that
$$\bigl|\mathrm{Cov}\bigl(h(Y_1), \varrho(Y_{m+1})\bigr)\bigr| \le C\,\alpha_m^{1 - 1/p - 1/q}\,\bigl(\mathbb{E}(|h(Y_1)|^p)\bigr)^{1/p}\,\bigl(\mathbb{E}(|\varrho(Y_1)|^q)\bigr)^{1/q}.$$

Lemma 9 below presents a concentration inequality for $\alpha$-mixing processes.

Lemma 9 (see [13]). Let $(Y_t)_{t \in \mathbb{Z}}$ be a strictly stationary process with $m$th strongly mixing coefficient $\alpha_m$, $m \ge 1$, let $n$ be a positive integer, let $h$ be a measurable function and, for any $i \in \{1, \dots, n\}$, $U_i = h(Y_i)$. We assume that $\mathbb{E}(U_1) = 0$ and that there exists a constant $M > 0$ satisfying $|U_1| \le M$. Then, for any $m \in \{1, \dots, \lfloor n/2 \rfloor\}$ and $\lambda > 0$, we have
$$\mathbb{P}\left(\left|\sum_{i=1}^{n} U_i\right| \ge \lambda\right) \le 4\exp\left(-\frac{\lambda^2}{16\bigl((n/m)\,D_m + \lambda M m/3\bigr)}\right) + 32\,\frac{M n}{\lambda}\,\alpha_m,$$
where
$$D_m = \max_{1 \le l \le 2m} \mathrm{Var}\left(\sum_{i=1}^{l} U_i\right).$$

5.2. Intermediary Results

Proof of Lemma 2. Using a standard expression of the covariance, and then (F1) and (F2), we obtain
$$\bigl|\mathrm{Cov}\bigl(q(\gamma_{j,k}, Y_1),\, q(\gamma_{j,k}, Y_v)\bigr)\bigr| \le C\left(\int \bigl|q(\gamma_{j,k}, x)\bigr|\,dx\right)^{2} \le C\,2^{2\delta j}\,2^{-j}.$$
This ends the proof of Lemma 2.

Proposition 10 establishes probability and moment inequalities satisfied by the estimators (8).

Proposition 10. Let $\hat\alpha_{j,k}$ and $\hat\beta_{j,k}$ be defined as in (8) under (H1) and (H2), let $j_1$ be the integer (9), and let $\lambda_n$ be the threshold (10).
(a) There exists a constant $C > 0$ such that, for any integer $j \in \{\tau, \dots, j_1\}$ and $k \in \Lambda_j$,
$$\mathbb{E}\bigl((\hat\alpha_{j,k} - \alpha_{j,k})^2\bigr) \le C\,2^{2\delta j}\,\frac{1}{n}, \qquad \mathbb{E}\bigl((\hat\beta_{j,k} - \beta_{j,k})^2\bigr) \le C\,2^{2\delta j}\,\frac{1}{n}.$$
(b) There exists a constant $C > 0$ such that, for any integer $j \in \{\tau, \dots, j_1\}$ and $k \in \Lambda_j$,
$$\mathbb{E}\bigl((\hat\beta_{j,k} - \beta_{j,k})^4\bigr) \le C\,\frac{2^{2\delta j}}{\ln n}.$$
(c) Let $\kappa$ be the constant appearing in (7). There exists a constant $C > 0$ such that, for $\kappa$ large enough, any integer $j \in \{\tau, \dots, j_1\}$, and $k \in \Lambda_j$, we have
$$\mathbb{P}\left(\bigl|\hat\beta_{j,k} - \beta_{j,k}\bigr| \ge \frac{\kappa}{2}\,2^{\delta j}\lambda_n\right) \le C\,n^{-4}.$$

Proof of Proposition 10. (a) Using (H1) and the stationarity of $(Y_t)_{t \in \mathbb{Z}}$, we obtain
$$\mathbb{E}\bigl((\hat\alpha_{j,k} - \alpha_{j,k})^2\bigr) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} q(\phi_{j,k}, Y_i)\right) \le \frac{1}{n}\,\mathrm{Var}\bigl(q(\phi_{j,k}, Y_1)\bigr) + \frac{2}{n}\sum_{v=2}^{n}\bigl|\mathrm{Cov}\bigl(q(\phi_{j,k}, Y_1),\, q(\phi_{j,k}, Y_v)\bigr)\bigr|.$$
By (H2)-(ii) we get $\mathrm{Var}(q(\phi_{j,k}, Y_1))/n \le C\,2^{2\delta j}/n$. For the covariance term, we split the sum at $v = \lfloor 2^j \rfloor$. It follows from (H2)-(iii) that
$$\sum_{v=2}^{\lfloor 2^j \rfloor}\bigl|\mathrm{Cov}\bigl(q(\phi_{j,k}, Y_1),\, q(\phi_{j,k}, Y_v)\bigr)\bigr| \le C\,2^{2\delta j}\,2^{-j}\,2^{j} = C\,2^{2\delta j}.$$
The Davydov inequality described in Lemma 8, combined with (H2)-(i)-(ii) and the exponential decay of $(\alpha_m)_{m \ge 1}$, gives
$$\sum_{v=\lfloor 2^j \rfloor + 1}^{n}\bigl|\mathrm{Cov}\bigl(q(\phi_{j,k}, Y_1),\, q(\phi_{j,k}, Y_v)\bigr)\bigr| \le C\,2^{2\delta j}.$$
Thus the covariance term is bounded by $C\,2^{2\delta j}/n$. Putting these bounds together, the first point in (a) is proved. The proof of the second point is identical with $\psi$ instead of $\phi$.
(b) Thanks to (H2)-(i), we have $|\hat\beta_{j,k}| \le C\,2^{\delta j}\sqrt{2^j}$. It follows from the triangular inequality and the boundedness of $\beta_{j,k}$ that $|\hat\beta_{j,k} - \beta_{j,k}| \le C\,2^{\delta j}\sqrt{2^j}$. This inequality and the second result of (a) yield
$$\mathbb{E}\bigl((\hat\beta_{j,k} - \beta_{j,k})^4\bigr) \le C\,2^{2\delta j}\,2^{j}\,\mathbb{E}\bigl((\hat\beta_{j,k} - \beta_{j,k})^2\bigr) \le C\,\frac{2^{(2\delta+1)j}\,2^{2\delta j}}{n}.$$
Using $2^{(2\delta+1)j} \le 2^{(2\delta+1)j_1} \le n/\ln n$, the proof of (b) is completed.
(c) We will use the Liebscher inequality described in Lemma 9. Let us set, for any $i \in \{1, \dots, n\}$,
$$U_i = q(\psi_{j,k}, Y_i) - \beta_{j,k}.$$
We have $\mathbb{E}(U_1) = 0$ and, by (H2)-(i) and the boundedness of $\beta_{j,k}$, $|U_1| \le C\,2^{\delta j}\sqrt{2^j}$ (so one can take $M = C\,2^{\delta j}\sqrt{2^j}$).
Proceeding as for the proofs of the bounds in (a), for any integer $l \le 2m$ with $m \le n$, we show that $\mathrm{Var}(\sum_{i=1}^{l} U_i) \le C\,l\,2^{2\delta j}$, so that $D_m \le C\,m\,2^{2\delta j}$. Owing to Lemma 9 applied with $\lambda = (\kappa/2)\,n\,2^{\delta j}\lambda_n$, $m = \lfloor u \ln n \rfloor$ for a suitably large constant $u > 0$, $M = C\,2^{\delta j}\sqrt{2^j}$, and the bound on $D_m$, we obtain
$$\mathbb{P}\left(\bigl|\hat\beta_{j,k} - \beta_{j,k}\bigr| \ge \frac{\kappa}{2}\,2^{\delta j}\lambda_n\right) \le 4\exp\bigl(-C\,\kappa^2\,\ln n\bigr) + C\,n^{-4}.$$
Taking $\kappa$ large enough, the first term is also bounded by $C\,n^{-4}$. This completes the proof of (c).
This completes the proof of Proposition 10.

Proof of Theorem 3. Theorem 3 can be proved by combining arguments of ([36], Theorem 5.1) and ([51], Theorem 4.2). It is close to ([30], Proof of Theorem 2), up to the general definition of $q$. The interested reader can find the details below.
We consider the following wavelet decomposition for $f$:
$$f(x) = \sum_{k \in \Lambda_\tau} \alpha_{\tau,k}\,\phi_{\tau,k}(x) + \sum_{j=\tau}^{\infty}\sum_{k \in \Lambda_j} \beta_{j,k}\,\psi_{j,k}(x),$$
where $\alpha_{\tau,k} = \int_0^1 f(x)\,\phi_{\tau,k}(x)\,dx$ and $\beta_{j,k} = \int_0^1 f(x)\,\psi_{j,k}(x)\,dx$.
Using the orthonormality of the wavelet basis $\mathcal{B}$, the MISE of $\hat f$ can be expressed as
$$\mathbb{E}\left(\int_0^1 \bigl(\hat f(x) - f(x)\bigr)^2\,dx\right) = R_1 + R_2 + R_3,$$
where
$$R_1 = \sum_{k \in \Lambda_\tau} \mathbb{E}\bigl((\hat\alpha_{\tau,k} - \alpha_{\tau,k})^2\bigr), \qquad R_2 = \sum_{j=j_1+1}^{\infty}\sum_{k \in \Lambda_j} \beta_{j,k}^2,$$
$$R_3 = \sum_{j=\tau}^{j_1}\sum_{k \in \Lambda_j} \mathbb{E}\Bigl(\bigl(\hat\beta_{j,k}\,\mathbf{1}_{\{|\hat\beta_{j,k}| \ge \kappa 2^{\delta j}\lambda_n\}} - \beta_{j,k}\bigr)^2\Bigr).$$
Let us now investigate sharp upper bounds for $R_1$, $R_2$, and $R_3$ successively.
Upper Bound for $R_1$. The point (a) of Proposition 10 and $2^{(2\delta+1)\tau} \le C$ yield
$$R_1 \le C\,2^{\tau}\,2^{2\delta\tau}\,\frac{1}{n} \le C\,\frac{1}{n} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
Upper Bound for $R_2$.
(i) For $p \ge 2$ and $s > 0$, we have $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M')$. Using $2^{(2\delta+1)j_1} \ge C\,n/\ln n$, we obtain
$$R_2 \le C\sum_{j=j_1+1}^{\infty} 2^{-2js} \le C\,2^{-2j_1 s} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
(ii) For $p \in [1,2)$ and $s > (2\delta+1)/p$, we have $B^s_{p,r}(M) \subseteq B^{s+1/2-1/p}_{2,\infty}(M')$. The condition $s > (2\delta+1)/p$ implies that $2(s + 1/2 - 1/p)/(2\delta+1) > 2s/(2s+2\delta+1)$. Thus
$$R_2 \le C\,2^{-2j_1(s + 1/2 - 1/p)} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
Hence, for $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+1)/p\}$, we have
$$R_2 \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
Upper Bound for $R_3$. Adopting the notation $\lambda_j = 2^{\delta j}\lambda_n$, $R_3$ can be written as $R_3 = R_{3,1} + R_{3,2} + R_{3,3} + R_{3,4}$, where
$$R_{3,1} = \sum_{j=\tau}^{j_1}\sum_{k \in \Lambda_j}\mathbb{E}\Bigl(\bigl(\hat\beta_{j,k} - \beta_{j,k}\bigr)^2\,\mathbf{1}_{\{|\hat\beta_{j,k}| \ge \kappa\lambda_j\}}\,\mathbf{1}_{\{|\beta_{j,k}| < \kappa\lambda_j/2\}}\Bigr), \qquad R_{3,2} = \sum_{j=\tau}^{j_1}\sum_{k \in \Lambda_j}\mathbb{E}\Bigl(\beta_{j,k}^2\,\mathbf{1}_{\{|\hat\beta_{j,k}| < \kappa\lambda_j\}}\,\mathbf{1}_{\{|\beta_{j,k}| \ge 2\kappa\lambda_j\}}\Bigr),$$
$$R_{3,3} = \sum_{j=\tau}^{j_1}\sum_{k \in \Lambda_j}\mathbb{E}\Bigl(\bigl(\hat\beta_{j,k} - \beta_{j,k}\bigr)^2\,\mathbf{1}_{\{|\hat\beta_{j,k}| \ge \kappa\lambda_j\}}\,\mathbf{1}_{\{|\beta_{j,k}| \ge \kappa\lambda_j/2\}}\Bigr), \qquad R_{3,4} = \sum_{j=\tau}^{j_1}\sum_{k \in \Lambda_j}\mathbb{E}\Bigl(\beta_{j,k}^2\,\mathbf{1}_{\{|\hat\beta_{j,k}| < \kappa\lambda_j\}}\,\mathbf{1}_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}\Bigr).$$
Upper Bound for $R_{3,1} + R_{3,2}$. Owing to the inequalities
$$\mathbf{1}_{\{|\hat\beta_{j,k}| \ge \kappa\lambda_j\}}\,\mathbf{1}_{\{|\beta_{j,k}| < \kappa\lambda_j/2\}} \le \mathbf{1}_{\{|\hat\beta_{j,k} - \beta_{j,k}| \ge \kappa\lambda_j/2\}}, \qquad \beta_{j,k}^2\,\mathbf{1}_{\{|\hat\beta_{j,k}| < \kappa\lambda_j\}}\,\mathbf{1}_{\{|\beta_{j,k}| \ge 2\kappa\lambda_j\}} \le 4\,\bigl(\hat\beta_{j,k} - \beta_{j,k}\bigr)^2\,\mathbf{1}_{\{|\hat\beta_{j,k} - \beta_{j,k}| \ge \kappa\lambda_j/2\}},$$
the Cauchy-Schwarz inequality, and the points (b) and (c) of Proposition 10, we have
$$R_{3,1} + R_{3,2} \le C\sum_{j=\tau}^{j_1}\sum_{k \in \Lambda_j}\sqrt{\mathbb{E}\bigl((\hat\beta_{j,k} - \beta_{j,k})^4\bigr)}\,\sqrt{\mathbb{P}\bigl(|\hat\beta_{j,k} - \beta_{j,k}| \ge \kappa\lambda_j/2\bigr)} \le C\,\frac{1}{n^2}\sum_{j=\tau}^{j_1} 2^{(\delta+1)j} \le C\,\frac{1}{n} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
Upper Bound for $R_{3,3}$. It follows from the point (a) of Proposition 10 that
$$R_{3,3} \le C\,\frac{1}{n}\sum_{j=\tau}^{j_1} 2^{2\delta j}\sum_{k \in \Lambda_j}\mathbf{1}_{\{|\beta_{j,k}| \ge \kappa\lambda_j/2\}}.$$
Let us now introduce the integer $j_2$ defined by
$$j_2 = \left\lfloor \frac{1}{2s + 2\delta + 1}\,\log_2 \frac{n}{\ln n}\right\rfloor. \tag{74}$$
Note that $\tau \le j_2 \le j_1$ for $n$ large enough.
Then $R_{3,3}$ can be bounded as $R_{3,3} \le R_{3,3,1} + R_{3,3,2}$, where
$$R_{3,3,1} = C\,\frac{1}{n}\sum_{j=\tau}^{j_2} 2^{2\delta j}\sum_{k \in \Lambda_j}\mathbf{1}_{\{|\beta_{j,k}| \ge \kappa\lambda_j/2\}}, \qquad R_{3,3,2} = C\,\frac{1}{n}\sum_{j=j_2+1}^{j_1} 2^{2\delta j}\sum_{k \in \Lambda_j}\mathbf{1}_{\{|\beta_{j,k}| \ge \kappa\lambda_j/2\}}.$$
On the one hand we have
$$R_{3,3,1} \le C\,\frac{1}{n}\sum_{j=\tau}^{j_2} 2^{(2\delta+1)j} \le C\,\frac{2^{(2\delta+1)j_2}}{n} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
On the other hand, we have the following.
(i) For $p \ge 2$ and $s > 0$, the Markov inequality and $f \in B^s_{p,r}(M)$ yield
$$R_{3,3,2} \le C\,\frac{1}{n}\sum_{j=j_2+1}^{j_1} 2^{2\delta j}\,\frac{1}{\lambda_j^2}\sum_{k \in \Lambda_j}\beta_{j,k}^2 \le C\,\frac{1}{\ln n}\sum_{j=j_2+1}^{\infty} 2^{-2js} \le C\,2^{-2j_2 s} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
(ii) For $p \in [1,2)$, $r \ge 1$, and $s > (2\delta+1)/p$, the Markov inequality, $f \in B^s_{p,r}(M)$, and $\lambda_j^{-p} = 2^{-\delta j p}\lambda_n^{-p}$ imply that
$$R_{3,3,2} \le C\,\frac{1}{n}\sum_{j=j_2+1}^{j_1} 2^{2\delta j}\,\frac{1}{\lambda_j^{p}}\sum_{k \in \Lambda_j}|\beta_{j,k}|^p \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
Therefore, for $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+1)/p\}$, we have
$$R_{3,3} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
Upper Bound for $R_{3,4}$. We have
$$R_{3,4} \le \sum_{j=\tau}^{j_1}\sum_{k \in \Lambda_j}\beta_{j,k}^2\,\mathbf{1}_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}.$$
Let $j_2$ be the integer (74). Then $R_{3,4}$ can be bounded as $R_{3,4} \le R_{3,4,1} + R_{3,4,2}$, where
$$R_{3,4,1} = \sum_{j=\tau}^{j_2}\sum_{k \in \Lambda_j}\beta_{j,k}^2\,\mathbf{1}_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}, \qquad R_{3,4,2} = \sum_{j=j_2+1}^{j_1}\sum_{k \in \Lambda_j}\beta_{j,k}^2\,\mathbf{1}_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}.$$
On the one hand, we have
$$R_{3,4,1} \le C\sum_{j=\tau}^{j_2} 2^{j}\lambda_j^2 = C\,\frac{\ln n}{n}\sum_{j=\tau}^{j_2} 2^{(2\delta+1)j} \le C\,\frac{\ln n}{n}\,2^{(2\delta+1)j_2} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
On the other hand, we have the following.
(i) For $p \ge 2$ and $s > 0$, since $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M')$, we have
$$R_{3,4,2} \le \sum_{j=j_2+1}^{\infty}\sum_{k \in \Lambda_j}\beta_{j,k}^2 \le C\,2^{-2j_2 s} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
(ii) For $p \in [1,2)$, $r \ge 1$, and $s > (2\delta+1)/p$, owing to $\beta_{j,k}^2\,\mathbf{1}_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}} \le C\,\lambda_j^{2-p}\,|\beta_{j,k}|^p$ and $f \in B^s_{p,r}(M)$, we get
$$R_{3,4,2} \le C\sum_{j=j_2+1}^{j_1}\lambda_j^{2-p}\sum_{k \in \Lambda_j}|\beta_{j,k}|^p \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
So, for $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+1)/p\}$, we have
$$R_{3,4} \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
Putting the upper bounds for $R_{3,1} + R_{3,2}$, $R_{3,3}$, and $R_{3,4}$ together, for $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+1)/p\}$, we obtain
$$R_3 \le C\left(\frac{\ln n}{n}\right)^{2s/(2s+2\delta+1)}.$$
Combining the upper bounds for $R_1$, $R_2$, and $R_3$, we complete the proof of Theorem 3.

Proof of Theorem 4. The proof of Theorem 4 is a direct application of Theorem 3: under (G1)–(G5), the function $q$ defined by (24) satisfies (H1) (see ([42], equation (2))) and (H2): (i) (see ([42], Lemma 6)), (ii) (see ([42], equation (11))), and (iii) (see ([30], Proof of Proposition 6.1)), with $\delta$ the constant appearing in (G5).

Proof of Theorem 5. The proof of Theorem 5 is a consequence of Theorem 3: under (J1)–(J4), the function $q$ defined by (30) satisfies (H1) and (H2): (i)-(ii) (see ([31], Proposition 1)) and (iii) (see ([52], equation (26))), with the corresponding value of $\delta$.

Proof of Theorem 6. Set $\ell = fg$. Following the methodology of [49], we decompose $\hat f - f$ in terms of $\hat\ell - \ell$ and $\hat g - g$. Using (K4) and the indicator function in (37), and then (K2), (K3), and the Markov inequality, the remainder term coming from the truncation is negligible. The triangular inequality and the elementary inequality $(a + b)^2 \le 2(a^2 + b^2)$ then imply that
$$\mathbb{E}\left(\int_0^1 \bigl(\hat f(x) - f(x)\bigr)^2\,dx\right) \le C\left(\mathbb{E}\left(\int_0^1 \bigl(\hat\ell(x) - \ell(x)\bigr)^2\,dx\right) + \mathbb{E}\left(\int_0^1 \bigl(\hat g(x) - g(x)\bigr)^2\,dx\right)\right). \tag{94}$$
We now bound these two MISEs via Theorem 3.
Upper Bound for the MISE of $\hat\ell$. Under (K1)–(K6), the function $q$ defined by (38) satisfies the following.
(H1) with $\ell$ instead of $f$: since $X_1$ and $\epsilon_1$ are independent with $\mathbb{E}(\epsilon_1) = 0$,
$$\mathbb{E}\bigl(q(\gamma_{j,k}, (X_1, Y_1))\bigr) = \mathbb{E}\bigl(Y_1\,\gamma_{j,k}(X_1)\bigr) = \mathbb{E}\bigl(f(X_1)\,\gamma_{j,k}(X_1)\bigr) = \int_0^1 f(x)\,\gamma_{j,k}(x)\,g(x)\,dx = \int_0^1 \ell(x)\,\gamma_{j,k}(x)\,dx.$$
(H2): (i)-(ii)-(iii) with $\delta = 0$:
(i) since $Y_1$ is bounded thanks to (K2) and (K3), say $|Y_1| \le C_0$ with $C_0 > 0$, we have
$$\bigl|q(\gamma_{j,k}, (X_1, Y_1))\bigr| = |Y_1|\,|\gamma_{j,k}(X_1)| \le C\,\sqrt{2^j};$$
(ii) using the boundedness of $Y_1$, then (K4), we have
$$\mathbb{E}\bigl(q^2(\gamma_{j,k}, (X_1, Y_1))\bigr) \le C\,\mathbb{E}\bigl(\gamma_{j,k}^2(X_1)\bigr) = C\int_0^1 \gamma_{j,k}^2(x)\,g(x)\,dx \le C;$$
(iii) using the boundedness of $Y_1$ and making the change of variables $u = 2^j x - k$, we obtain
$$\int\!\!\int \bigl|q(\gamma_{j,k}, (x, y))\bigr|\,dx\,dy \le C\int_0^1 |\gamma_{j,k}(x)|\,dx \le C\,2^{-j/2}.$$
We conclude by applying Lemma 2 with the vectors $(X_i, Y_i)$ in place of $Y_i$; (K5) and (K6) imply (F1), and the previous inequality implies (F2) with $\delta = 0$.
Therefore, assuming that $\ell \in B^s_{p,r}(M)$ with $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > 1/p\}$, Theorem 3 proves the existence of a constant $C > 0$ satisfying
$$\mathbb{E}\left(\int_0^1 \bigl(\hat\ell(x) - \ell(x)\bigr)^2\,dx\right) \le C\left(\frac{\ln n}{n}\right)^{2s/(2s + 1)}. \tag{99}$$
Upper Bound for the MISE of $\hat g$. Under (K1)–(K6), proceeding as in the previous point, we show that the function $q$ defined by (39) satisfies (H1) with $g$ instead of $f$ and $X_i$ instead of $(X_i, Y_i)$, and (H2): (i)-(ii)-(iii) with $\delta = 0$.
Therefore, assuming that $g \in B^s_{p,r}(M)$ with $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > 1/p\}$, Theorem 3 proves the existence of a constant $C > 0$ satisfying
$$\mathbb{E}\left(\int_0^1 \bigl(\hat g(x) - g(x)\bigr)^2\,dx\right) \le C\left(\frac{\ln n}{n}\right)^{2s/(2s + 1)}. \tag{100}$$
Combining (94), (99), and (100), we end the proof of Theorem 6.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The author is thankful to the reviewers for their comments which have helped in improving the presentation.