Chinese Journal of Mathematics

Volume 2016, Article ID 6204874, 8 pages

http://dx.doi.org/10.1155/2016/6204874

## A Note on the Adaptive Estimation of a Conditional Continuous-Discrete Multivariate Density by Wavelet Methods

Christophe Chesneau^{1} and Hassan Doosti^{2}

^{1}Laboratoire de Mathématiques Nicolas Oresme, Université de Caen, BP 5186, 14032 Caen Cedex, France

^{2}Mashhad University of Medical Sciences, P.O. Box 91735-951, Mashhad, Iran

Received 11 April 2016; Revised 24 May 2016; Accepted 6 June 2016

Academic Editor: Niansheng Tang

Copyright © 2016 Christophe Chesneau and Hassan Doosti. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We investigate the estimation of a multivariate continuous-discrete conditional density. We develop an adaptive estimator based on wavelet methods. We prove its good theoretical performance by determining sharp rates of convergence under the $\mathbb{L}_{p}$ risk (with $p \ge 1$) for a wide class of unknown conditional densities. A simulation study illustrates the good practical performance of our estimator.

#### 1. Introduction

The estimation of conditional densities is an important statistical challenge with applications in many practical problems, especially those connected with forecasting (economics, etc.). There is a vast literature in this area; we refer to the papers of Li and Racine [1], Akakpo and Lacour [2], and Chagny [3] and the references therein. In this paper we focus our attention on a specific problem: the estimation of a multivariate continuous-discrete conditional density. The considered model is described as follows. Let $n$, $d_{1}$, and $d_{2}$ be positive integers and let $(X_{1}, Y_{1}), \ldots, (X_{n}, Y_{n})$ be iid random vectors defined on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$. We suppose that $X_{1}$ is continuous with support $[0,1]^{d_{1}}$ and that $Y_{1}$ is discrete with finite support $\mathcal{Y} \subset \mathbb{R}^{d_{2}}$. Let $f$ be the density of $(X_{1}, Y_{1})$. We define the density function of $X_{1}$ conditionally on the event $\{Y_{1} = y\}$ by
$$g(x \mid y) = \frac{f(x, y)}{\mathbb{P}(Y_{1} = y)}, \qquad x \in [0,1]^{d_{1}}, \ y \in \mathcal{Y}. \tag{1}$$
We aim to estimate $g$ from $(X_{1}, Y_{1}), \ldots, (X_{n}, Y_{n})$. The most common approach is based on the kernel methods developed by Li and Racine [4]. Applications and recent developments for these methods are described in detail in Li and Racine [1].

In this paper we develop a new estimator based on wavelet methods. It is now an established fact that, in comparison to kernel methods, wavelet methods have the advantage of achieving a high degree of adaptivity for a large class of unknown functions with possibly complex discontinuities (jumps, spikes, etc.). See, for instance, Antoniadis [5], Härdle et al. [6], and Vidakovic [7]. This fact motivates our interest in developing wavelet methods for the considered conditional density estimation problem. The main ingredients in the construction of our estimator $\widehat{g}$ are an estimation of $f$ with a new wavelet estimator $\widehat{f}$, an estimation of $\mathbb{P}(Y_{1} = y)$ by an empirical estimator, and a global thresholding technique developed by Vasiliev [8]. In particular, the considered estimator can be viewed as a multivariate (but “nonsmooth”) version of the one introduced in the univariate case in Chesneau et al. [9]. We prove that $\widehat{g}$ is both adaptive and efficient; it does not depend on the smoothness of $f$ in its construction and, under mild assumptions on the smoothness of $f$ (we assume that it belongs to a wide class of functions, the so-called Besov balls), it attains fast rates of convergence under the $\mathbb{L}_{p}$ risk (with $p \ge 1$). These theoretical guarantees are illustrated by a numerical study showing the good practical performance of our estimator.

The remainder of this paper is set out as follows. Next, in Section 2, we briefly describe the considered multidimensional wavelet bases and Besov balls. Our wavelet estimator and some of its theoretical properties are presented in Section 3. A short numerical study can be found in Section 4. Finally, the proofs are postponed to Section 5.

#### 2. Multidimensional Wavelet Bases and Besov Balls

Let $d$ be a positive integer and let $p \ge 1$. First of all, we define the spaces $\mathbb{L}_{p}([0,1]^{d})$ as
$$\mathbb{L}_{p}([0,1]^{d}) = \left\{ h : [0,1]^{d} \to \mathbb{R}; \ \|h\|_{p} = \left( \int_{[0,1]^{d}} |h(x)|^{p} \, dx \right)^{1/p} < \infty \right\}.$$

In this study, we consider a wavelet basis on $[0,1]^{d}$ built from the scaling and wavelet functions $\phi$ and $\psi$, respectively, from the Daubechies family (see [10]). For any $x = (x_{1}, \ldots, x_{d}) \in [0,1]^{d}$, we set
$$\Phi(x) = \prod_{v=1}^{d} \phi(x_{v}), \qquad \Psi_{u}(x) = \prod_{v \in u} \psi(x_{v}) \prod_{v \notin u} \phi(x_{v}), \quad u \in S_{d},$$
where $S_{d}$ forms the set of all nonvoid subsets of $\{1, \ldots, d\}$, that is, those of cardinality greater than or equal to $1$.

For any integer $j \ge 0$ and any $k = (k_{1}, \ldots, k_{d}) \in \{0, \ldots, 2^{j} - 1\}^{d}$, we consider
$$\Phi_{j,k}(x) = 2^{jd/2} \Phi(2^{j} x_{1} - k_{1}, \ldots, 2^{j} x_{d} - k_{d}), \qquad \Psi_{j,k,u}(x) = 2^{jd/2} \Psi_{u}(2^{j} x_{1} - k_{1}, \ldots, 2^{j} x_{d} - k_{d}).$$

Then, with an appropriate treatment at the boundaries, there exists an integer $\tau$ such that the collection
$$\mathcal{B} = \left\{ \Phi_{\tau,k}, \ k \in \{0, \ldots, 2^{\tau} - 1\}^{d} \right\} \cup \left\{ \Psi_{j,k,u}; \ j \ge \tau, \ k \in \{0, \ldots, 2^{j} - 1\}^{d}, \ u \in S_{d} \right\}$$
forms an orthonormal basis of $\mathbb{L}_{2}([0,1]^{d})$. A function $h \in \mathbb{L}_{2}([0,1]^{d})$ can be expanded into a wavelet series as
$$h(x) = \sum_{k} \alpha_{\tau,k} \Phi_{\tau,k}(x) + \sum_{j=\tau}^{\infty} \sum_{u \in S_{d}} \sum_{k} \beta_{j,k,u} \Psi_{j,k,u}(x),$$
where
$$\alpha_{\tau,k} = \int_{[0,1]^{d}} h(x) \Phi_{\tau,k}(x) \, dx, \qquad \beta_{j,k,u} = \int_{[0,1]^{d}} h(x) \Psi_{j,k,u}(x) \, dx. \tag{5}$$
All the details about these wavelet bases, including the expansion into wavelet series as described above, can be found in, for example, Meyer [11], Daubechies [10], Cohen et al. [12], and Mallat [13].
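As a toy illustration of this analysis/synthesis machinery (outside the paper's MATLAB code), the following Python sketch computes one level of scaling ($\alpha$) and wavelet ($\beta$) coefficients for the Haar wavelet, the simplest member of the Daubechies family; the sampled function $h(x) = x^2$ and the level choice are purely illustrative.

```python
# One level of the Haar wavelet transform (the simplest Daubechies
# wavelet), approximating scaling coefficients alpha and wavelet
# coefficients beta of a function h sampled on [0, 1).
import math

def haar_step(samples):
    """One analysis step: split samples into scaling (alpha) and
    wavelet (beta) coefficients. len(samples) must be even."""
    alpha = [(a + b) / math.sqrt(2) for a, b in zip(samples[0::2], samples[1::2])]
    beta = [(a - b) / math.sqrt(2) for a, b in zip(samples[0::2], samples[1::2])]
    return alpha, beta

def haar_inverse(alpha, beta):
    """Synthesis step: recover the original samples from (alpha, beta)."""
    out = []
    for a, b in zip(alpha, beta):
        out.extend([(a + b) / math.sqrt(2), (a - b) / math.sqrt(2)])
    return out

# Sample h(x) = x^2 at dyadic points of [0, 1).
n = 16
h = [(k / n) ** 2 for k in range(n)]
alpha, beta = haar_step(h)
recon = haar_inverse(alpha, beta)
# Orthonormality of the basis gives exact reconstruction.
assert all(abs(a - b) < 1e-12 for a, b in zip(h, recon))
```

Because the basis is orthonormal, the transform also preserves energy (a discrete Parseval identity), which is what makes coefficient-wise thresholding meaningful in Section 3.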

Let $M > 0$, $s > 0$, $q \ge 1$, and $r \ge 1$. We say that a function $h$ belongs to the Besov ball $B^{s}_{q,r}(M)$ if and only if the associated wavelet coefficients (5) satisfy
$$\left( \sum_{k} |\alpha_{\tau,k}|^{q} \right)^{1/q} \le M$$
and
$$\left( \sum_{j=\tau}^{\infty} \left( 2^{j (s + d(1/2 - 1/q))} \left( \sum_{u \in S_{d}} \sum_{k} |\beta_{j,k,u}|^{q} \right)^{1/q} \right)^{r} \right)^{1/r} \le M,$$
with the usual modifications for $q = \infty$ or $r = \infty$.

These sets contain function classes of significant spatial inhomogeneity, including Sobolev balls and Hölder balls. Details about Besov balls can be found in, for example, Meyer [11] and Härdle et al. [6].

#### 3. Conditional Density Estimation

We formulate the following assumptions.

$(\mathbf{H_1})$ There exists a known constant $C_{*} > 0$ such that
$$\sup_{x \in [0,1]^{d_{1}}, \ y \in \mathcal{Y}} f(x, y) \le C_{*}.$$

$(\mathbf{H_2})$ There exists a known constant $c_{*} > 0$ such that
$$\inf_{y \in \mathcal{Y}} \mathbb{P}(Y_{1} = y) \ge c_{*}.$$

We propose the following “ratio-thresholding estimator” for $g$:
$$\widehat{g}(x \mid y) = \frac{\widehat{f}(x, y)}{\widehat{p}(y)} \mathbf{1}_{\{\widehat{p}(y) \ge c_{*}/2\}}, \tag{9}$$
where **1** denotes the indicator function, $c_{*}$ refers to the constant in $(\mathbf{H_2})$, and $\widehat{f}$ is defined by
$$\widehat{f}(x, y) = \sum_{k} \widehat{\alpha}_{\tau,k}(y) \Phi_{\tau,k}(x) + \sum_{j=\tau}^{j_{1}} \sum_{u \in S_{d_{1}}} \sum_{k} \widehat{\beta}_{j,k,u}(y) \mathbf{1}_{\{|\widehat{\beta}_{j,k,u}(y)| \ge \kappa \sqrt{\ln n / n}\}} \Psi_{j,k,u}(x), \tag{10}$$
where $\kappa$ is a large enough constant, $j_{1}$ is an integer such that $n/\ln n < 2^{j_{1} d_{1}} \le 2n/\ln n$, and the empirical wavelet coefficients are defined by
$$\widehat{\alpha}_{\tau,k}(y) = \frac{1}{n} \sum_{i=1}^{n} \Phi_{\tau,k}(X_{i}) \mathbf{1}_{\{Y_{i} = y\}}, \qquad \widehat{\beta}_{j,k,u}(y) = \frac{1}{n} \sum_{i=1}^{n} \Psi_{j,k,u}(X_{i}) \mathbf{1}_{\{Y_{i} = y\}}, \tag{12}$$
while $\widehat{p}$ is defined by
$$\widehat{p}(y) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{Y_{i} = y\}}.$$

The estimator (10) uses a hard thresholding technique on the wavelet coefficient estimators (12). Such a selection rule is at the heart of the adaptive nature of wavelet methods, which have the ability to capture the most important wavelet coefficients of a function, that is, those with high magnitudes. We refer to Antoniadis [5], Härdle et al. [6], and Vidakovic [7] for further details. The definition of the threshold, that is, $\kappa \sqrt{\ln n / n}$, corresponds to the universal one proposed by Donoho and Johnstone [14] and Donoho et al. [15]. It is based on technical considerations ensuring good convergence properties of the hard thresholding wavelet estimator (see also Theorem A.3 in Appendix).
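To make the selection rule concrete, here is a minimal Python sketch (not the paper's MATLAB implementation) of hard thresholding with a universal-type threshold $\kappa \sqrt{\ln n / n}$; the value $\kappa = 1$ and the coefficient vector are illustrative assumptions only.

```python
# Hard thresholding with the universal-type threshold kappa*sqrt(ln(n)/n).
# kappa and the toy coefficient values below are illustrative.
import math

def hard_threshold(coeffs, n, kappa=1.0):
    """Keep a coefficient estimate only if its magnitude reaches the
    threshold; otherwise set it to zero (hard thresholding)."""
    lam = kappa * math.sqrt(math.log(n) / n)
    return [c if abs(c) >= lam else 0.0 for c in coeffs]

coeffs = [0.8, -0.02, 0.005, 0.3, -0.001]
kept = hard_threshold(coeffs, n=1000, kappa=1.0)
# With n = 1000 the threshold is about 0.083: the small estimates,
# which are dominated by noise, are zeroed out.
# kept -> [0.8, 0.0, 0.0, 0.3, 0.0]
```

Only coefficients whose estimated magnitude exceeds the noise level survive, which is exactly the mechanism that makes the estimator adaptive: no smoothness parameter of $f$ enters the rule.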

Note that (10) can be viewed as a nonsmooth multivariate version of the estimator proposed by Chesneau et al. [9]. The main advantage of this estimator is that it is easier to implement from a practical point of view (see Section 4 below for a numerical comparison in the univariate case). Concerning the empirical coefficients (12), let us mention that they are natural unbiased estimators of the wavelet coefficients of $f$, with nice convergence properties that will be used in the proof of our main result.

The global construction of (9) follows the idea proposed by Vasiliev [8] for other statistical contexts. Note that a control on the lower bound of $\widehat{p}(y)$ is necessary; it must be large enough to ensure good statistical properties for (9).

The following result investigates the rates of convergence attained by (9) under the $\mathbb{L}_{p}$ risk with $p \ge 1$.

Theorem 1. *Let $p \ge 1$, let $g$ be given by (1), and let $\widehat{g}$ be defined by (9) with a large enough $\kappa$ (the exact condition is described in (29)). Suppose that $(\mathbf{H_1})$ and $(\mathbf{H_2})$ hold and that, for any $y$, $f(\cdot, y) \in B^{s}_{q,r}(M)$ with $M > 0$, $q \ge 1$, $r \ge 1$, and $s > d_{1}/q$. Then there exists a constant $C > 0$ such that, for $n$ being large enough,*
$$\sup_{y \in \mathcal{Y}} \mathbb{E}\left( \int_{[0,1]^{d_{1}}} \left| \widehat{g}(x \mid y) - g(x \mid y) \right|^{p} \, dx \right) \le C \left( \frac{\ln n}{n} \right)^{\theta p},$$
*where*
$$\theta = \min\left( \frac{s}{2s + d_{1}}, \ \frac{s - d_{1}/q + d_{1}/p}{2(s - d_{1}/q) + d_{1}} \right).$$

The proof of Theorem 1 is based on several technical inequalities and the application of a general result derived from [16, Theorem 5.1] and [17, Theorem 1] (see Theorem A.3 in Appendix).

Theorem 1 provides theoretical guarantees on the convergence of (9) under mild assumptions on the smoothness of $f$, and this under the $\mathbb{L}_{p}$ risk for any $p \ge 1$. The obtained rates of convergence are sharp. However, since lower minimax bounds are not established in our setting, we do not claim that they are optimal in the minimax sense. An important benchmark is that they correspond, up to a logarithmic term, to the minimax-optimal rates for the standard multivariate density estimation problem, which is recovered when $Y_{1}$ is constant almost surely (see [15]).

Finally, note that the factor $\mathbb{P}(Y_{1} = y)$ plays a secondary role in our study; it only appears in the presentation of the model and the construction of $\widehat{g}$, and the performance of $\widehat{g}$ does not depend on its value.

#### 4. A Short Numerical Study

In this section we investigate some practical aspects of our wavelet methods. For the sake of simplicity, we focus our attention on the univariate case, that is, $d_{1} = d_{2} = 1$. The codes are written in MATLAB and are adapted from Ramirez and Vidakovic [18]. First we compare the performance of the new density estimators with those proposed in our former publication, Chesneau et al. [9], in two respects: accuracy and speed of computation.

In order to illustrate the rate of decrease of the errors, as in Chesneau et al. [9], we employ the indicator $\mathrm{MISE}$ defined by
$$\mathrm{MISE} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{n} \sum_{j=1}^{n} \left( \widehat{f}^{(i)}(x_{j}) - f(x_{j}) \right)^{2},$$
where $n$ and $N$ are the sample size and the number of replications, respectively, $f$ represents the true density, and $\widehat{f}^{(i)}$ is the estimator computed on the $i$th replication. We consider three estimators based on our statistical methodology: the linear wavelet estimator (the projection estimator that keeps all empirical wavelet coefficients up to a fixed level, without thresholding), the hard thresholding wavelet estimator defined by (10), and the smooth version of the linear wavelet estimator obtained after local linear regression (see, e.g., [19]). The practical construction of this smooth version of the linear wavelet estimator was proposed by Ramirez and Vidakovic [18]. Several studies confirm that this version has nice performance in different fields (see, e.g., [20, 21]).

We adopt a setup similar to that of Chesneau et al. [9] for our example; that is, we use Daubechies's compactly supported wavelet “Daubechies 3” and the same resolution levels and sample sizes as in that paper. The discrete random sample is generated from a binomial distribution, and the resulting bivariate density function $f$ is the target of estimation. Table 1 gives the values of $\mathrm{MISE}$ computed from these simulations for different sample sizes. This table should be compared with Table 1 on page 70 of Chesneau et al. [9]. As we can see, similar results are obtained; $\mathrm{MISE}$ decreases as the sample size increases. The performance of the smooth version of the linear wavelet estimator is the best. As we can see, there is no significant difference between the new estimators and the former versions in Chesneau et al. [9].
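To make the Monte Carlo error indicator concrete, here is a small Python sketch (not the paper's MATLAB code): it approximates a MISE-type indicator for a toy histogram estimator of a Beta(2, 2) density. The density, the histogram estimator, and the bin-count rule are illustrative assumptions; only the structure of the indicator (an average of integrated squared errors over $N$ replications) mirrors the text.

```python
# Monte Carlo approximation of a MISE-type indicator for a toy
# histogram density estimator. All modeling choices are illustrative.
import math
import random

def ise(estimate, truth, grid):
    """Integrated squared error on [0, 1], approximated on a grid."""
    step = 1.0 / len(grid)
    return sum((estimate(x) - truth(x)) ** 2 for x in grid) * step

def mise(n, N, seed=0):
    """Average ISE over N simulated samples of size n drawn from
    the Beta(2, 2) distribution, estimated by a histogram."""
    random.seed(seed)
    truth = lambda x: 6 * x * (1 - x)        # Beta(2, 2) density
    grid = [(k + 0.5) / 100 for k in range(100)]
    bins = max(4, round(n ** (1 / 3)))       # crude bin-count rule
    total = 0.0
    for _ in range(N):
        sample = [random.betavariate(2, 2) for _ in range(n)]
        counts = [0] * bins
        for x in sample:
            counts[min(int(x * bins), bins - 1)] += 1
        est = lambda x: counts[min(int(x * bins), bins - 1)] * bins / n
        total += ise(est, truth, grid)
    return total / N

# As in Table 1, the indicator decreases as the sample size grows.
assert mise(2000, 20) < mise(50, 20)
```

The same skeleton applies to the wavelet estimators of Section 3: only the line defining `est` changes.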