Abstract

Let be an estimator obtained by integrating a kernel-type density estimator based on a random sample of size . A central limit theorem is established for the target statistic , where the underlying random vector forms an asymptotically stationary, absolutely regular stochastic process, and is an estimator of a multivariate parameter obtained from a vector of U-statistics. The results obtained extend or generalize previous results from the stationary univariate case to the asymptotically stationary multivariate case. An example of an asymptotically stationary, absolutely regular multivariate ARMA process and an example of a useful estimation of are given in the applications.

1. Introduction

The purpose of this paper is to estimate the value of a multivariate distribution function, called the target distribution function, at a given point, when observing a nonstationary process. Clearly, there must be a connection between the process and the target distribution. We will assume that, as time goes on, the marginal distribution of the process gets closer and closer to the target in a suitable sense. The point at which we want to estimate the target distribution is not an arbitrary vector, for we will assume that it can be estimated by a vector of -statistics. Such a problem is clearly out of reach in that generality, and we will assume that, though nonstationary, the process exhibits an asymptotic form of stationarity and has a suitable mixing property. These notions will be defined formally after this general introduction.

Let be a stochastic process indexed by the positive integers, taking values in a finite dimensional Euclidean space . Identifying with a product of a finite number of copies of the real line, we write for the distribution function of . We will assume that the process has some form of asymptotic stationarity, implying that the sequence converges, in a sense to be made precise, to a limiting distribution function .

For , let denote the -algebra of events generated by .

We will say that the nonstationary stochastic process is absolutely regular if where

Assume that for some positive less than ,

We consider a parameter in whose components can be naturally estimated by -statistics. To be more formal and precise, we assume that is defined as follows. Let be an integer, the degree of the -statistics. Let be a function from into , invariant under permutation of its arguments. We are interested in parameters of the form , and the function is called the kernel of the parameter.

Example 1.1. Take to be . The mean vector corresponds to taking , and is the identity.

Example 1.2. Take to be . Consider to be the 2-dimensional vector whose components are the marginal variances. We take and is going to be a function defined on . It has two arguments, each being in , and it is defined by

Such a parameter can be estimated naturally by -statistics, essentially replacing in (1.3) by an empirical counterpart. By using the invariance of , the estimator of is then of the form
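As an illustration only (the function names and the sample below are ours, not the paper's), a degree-2 U-statistic averages a symmetric kernel over all pairs of observations; with the variance kernel of Example 1.2 it reproduces the unbiased sample variance:

```python
from itertools import combinations

def u_statistic(sample, h, m):
    """Degree-m U-statistic: average of the symmetric kernel h
    over all m-element subsets of the sample."""
    subsets = list(combinations(sample, m))
    return sum(h(*s) for s in subsets) / len(subsets)

def variance_kernel(x, y):
    # Symmetric kernel h(x, y) = (x - y)^2 / 2 of Example 1.2,
    # whose expectation under the target distribution is the variance.
    return (x - y) ** 2 / 2

# For the sample [1, 2, 3, 4] this equals the unbiased sample variance 5/3.
theta_hat = u_statistic([1.0, 2.0, 3.0, 4.0], variance_kernel, 2)
```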

Now that we have described the parameter and its estimator, we return to our problem: we need to define an estimator of the distribution function .

A natural one would be the empirical distribution function calculated on the observed values of the process. Even if the empirical distribution function is optimal with respect to the rate of convergence of the mean square error, it is not appropriate here, because it does not take into account the fact that is smooth, and in particular the existence of a density .

It is, therefore, natural to seek a smooth estimator of the target distribution. A good candidate is obtained by smoothing the empirical distribution function with a kernel. Equivalently, this estimator can be obtained by integrating a standard kernel estimator of the density.

Such an estimator estimates the mean distribution function . But since the sequence has a limit , it estimates the limit as well. To be explicit, we consider a sequence of distribution functions converging, in the usual sense of convergence in distribution, to that of the point mass at the origin. We write for the empirical distribution function pertaining to the measure having mass at each sample point . Our nonparametric estimator of is , where denotes the convolution operator. Finally, our estimator of is
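For concreteness, here is a minimal sketch of our own of such a perturbed empirical distribution function in one dimension; the choice of a centered Gaussian as the smoothing distribution is an assumption made for illustration:

```python
import math

def smoothed_edf(sample, x, bandwidth):
    """Perturbed (smoothed) empirical distribution function at x:
    the EDF convolved with a centered normal of scale `bandwidth`.
    As bandwidth -> 0, it recovers the ordinary EDF."""
    def norm_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sum(norm_cdf((x - xi) / bandwidth) for xi in sample) / len(sample)
```

With a tiny bandwidth the value at a point between sample points matches the EDF; with a larger bandwidth the step function is smoothed out.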

Our method adapts some of the ideas of Puri and Ralescu [1], who proved a central limit theorem for in the i.i.d. case; their result was generalized by Sun [2] to the stationary absolutely regular case, and Sun [3] then proved the asymptotic normality of and of the perturbed sample quantiles under a nonstationary strong mixing condition. We also mention Harel and Puri [4, 5], who proved central limit theorems for -statistics of nonstationary (not necessarily bounded) strong mixing double arrays of random variables; Ducharme and El Mouvid [6], who proved limit theorems for the conditional cumulative distribution function by using the convergence of the ratio of two -statistics; and Oodaira and Yoshihara [7], who obtained the law of the iterated logarithm for sums of random variables satisfying absolute regularity. Harel and Puri [8] proved the law of the iterated logarithm for the perturbed empirical distribution function when the random variables are nonstationary absolutely regular; later, this result was generalized to the strong mixing condition by Sun and Chiang [9]. In addition, some of the ideas of Billingsley [10] and Yoshihara [11] have been used to study our problem. For limit theorems dealing with -statistics for processes which are uniformly mixing in both directions of time, the reader is also referred to Denker and Keller [12].

2. Preliminaries

To specify our assumption on the process, it is convenient to introduce copies of . Hence we write for an infinite sequence of copies of . The basic idea is to think of the process at time as taking values in , and we think of each as the th component of . We then agree on the following definition.

Definition 2.1. A canonical -subspace of is any subspace of the form with . We write for a generic canonical -subspace.

Remark 2.2. For if we note and we have with and
The origin of this terminology is that when is the real line, then a canonical -subspace is a subspace spanned by exactly distinct vectors of the canonical basis of . We write for a sum over all canonical -subspaces included in .
To such a canonical subspace , we can associate the distribution function of , as well as the distribution function with the same marginals . Clearly, the marginals of are independent, while those of are not.
Consider two nested canonical subspaces and , where . For a function symmetric in its arguments and defined on , we can define its projection onto the functions defined on by . Identifying with and with allows us to project functions defined on onto functions on . However, with this identification, the projection depends on the particular choice of in . To remove the dependence on , we sum over all choices of in by . Given -statistics of degree , we can then define an analogue of the Hoeffding decomposition (e.g., Hoeffding [13]) when the random variables come from a nonstationary process. For this purpose, consider, firstly, the expectation of if the process had no dependence, namely, . Then for any , we define , where is the Dirac function.
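To fix ideas, the classical i.i.d. first Hoeffding projection, which the construction above generalizes to nonstationary processes, can be sketched by Monte Carlo (an illustration of our own; the kernel, the standard normal distribution, and all names below are assumptions, not the paper's):

```python
import random

def first_projection(h, draw, theta, n_mc=20000, seed=1):
    """Classical first Hoeffding projection of a symmetric degree-2
    kernel h with mean theta: h1(x) = E[h(x, Y)] - theta, estimated
    here by Monte Carlo with draws Y from the underlying distribution."""
    rng = random.Random(seed)
    ys = [draw(rng) for _ in range(n_mc)]
    def h1(x):
        return sum(h(x, y) for y in ys) / n_mc - theta
    return h1

# With h(x, y) = (x - y)^2 / 2 and a standard normal law (theta = 1),
# the exact projection is h1(x) = (x^2 - 1) / 2.
h1 = first_projection(lambda x, y: (x - y) ** 2 / 2,
                      lambda rng: rng.gauss(0.0, 1.0), 1.0)
```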
Finally, for , we set . The analogue of the Hoeffding decomposition is the equality . When we have a vector of -statistics defined by a function as in (1.3), we can write the decomposition componentwise. This is a little cumbersome to write explicitly. Identifying with , say, we write , and each -statistic has a Hoeffding type decomposition , where and is defined by (2.3) for the component of .
We can construct the vector Now, writing for the largest of the 's, we can write a vector version of the Hoeffding decomposition Note that this decomposition makes an explicit use of convention (2.7), and this is why this convention was introduced.
We now need to specify exactly what we mean by asymptotic stationarity of a process. For this, recall the following notion of distance between probability measures.

Definition 2.3. The distance in total variation between two probability measures and defined on the same -algebra is If is a canonical subspace of , we write as the -algebra generated by the 's with . We write as the probability measure pertaining to the process , which is a probability measure on .

Definition 2.4. The process with probability measure on is geometrically asymptotically (pairwise) stationary if there exists a strictly stationary process with distribution on , and a positive less than 1, such that for , . Since is strictly stationary in this definition, its restriction to depends in fact only on . Hence, this definition asserts that the process is very close to being stationary when is large. It also implies that , which asserts that the marginal distribution of the process converges geometrically fast to a fixed distribution.
We suppose that there exists a strictly stationary process with probability measure on , which is absolutely regular with the same rate as the process . is the distribution function of .
We define the function on by . Next, we denote . Identifying with , say, the vector of -statistics being defined by a vector function , we can write . We can construct the vector . Let , where for , otherwise, and is the differential operator.
We have We also define

3. Weak Convergence of the Smoothed Empirical Distribution Function

In this section, we identify with and we have, of course, a vector of -statistics defined obviously by a vector function , where the degree of is .

Let be a probability density function on and let be a sequence of positive window-widths tending to zero as . Denote , and consider the perturbed empirical distribution function defined by (1.7) corresponding to the sequence .

Consider the smoothed empirical distribution defined in (1.7) and using the kernel density estimator , where , and define Note that this is of the form (1.7), with where

For a better understanding of the use of the integral type estimators , it is of interest to study the asymptotic behavior of the distribution of (defined by (3.1)) evaluated at a random point (defined by (1.5)). Such a statistic is useful in estimating a functional if is unknown.

Supposing that the conditions introduced in Section 2 are satisfied, our main result establishes that converges to , and the asymptotic normality will allow us to obtain confidence intervals for . Finally, using the notation introduced in Section 2, we can state the following result.

Theorem 3.1. We suppose that
(i) there exists a finite positive constant such that , where is the number introduced in (1.2);
(ii) the mixing rate of absolute regularity satisfies Condition (1.2);
(iii) Condition (2.14) is satisfied;
(iv) , where is a probability density function;
(v) the sequences and are twice differentiable on with uniformly bounded first and second partial derivatives.
Then converges in law to a normal distribution as , where is defined in (2.22).

We are then faced with a difficulty, as the variance defined in (2.22) is unknown. To overcome this difficulty, we can estimate the variance by truncating the expansion of , keeping only the first, most informative, terms and estimating by its empirical counterpart defined by , where . From condition (1.2), we have , where is some finite positive constant.

To obtain a suitable value for , a simple criterion consists of computing the smallest integer for which , where is the required level of precision.

From the empirical construction of the estimator , we easily deduce the convergence in distribution of to .

4. Applications

4.1. Application to an ARMA Process

First, we give an example of a stochastic process satisfying our general conditions, that is, a multivariate asymptotically stationary, absolutely regular stochastic process.

Example 4.1. ARMA process.
Consider a -variate ARMA(1,1) process defined by , where the initial vector has a law which is not necessarily the invariant one and admits a strictly positive density, and are square matrices, and is a -variate white noise with strictly positive density and geometric absolute regularity. If the eigenvalues of the square matrix have modulus strictly less than 1, then the process satisfies Condition (2.14) (for a proof, see (5.4) in [8]), and it is asymptotically stationary and geometrically absolutely regular.
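A quick numerical sketch (our own; the matrices, dimension, Gaussian noise, and names are illustrative assumptions) of why the initial law fades: simulating a bivariate ARMA(1,1) recursion of the form (4.1) from two very different starting vectors with the same noise, the trajectories coalesce geometrically fast when the eigenvalues of the autoregressive matrix have modulus less than 1:

```python
import random

def simulate_varma11(A, B, x0, n, seed=0):
    """Simulate X_t = A X_{t-1} + e_t + B e_{t-1} with i.i.d. standard
    normal noise standing in for the white noise of (4.1)."""
    rng = random.Random(seed)
    d = len(x0)
    x, e_prev = list(x0), [0.0] * d
    path = [list(x)]
    for _ in range(n):
        e = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x = [sum(A[i][j] * x[j] for j in range(d)) + e[i]
             + sum(B[i][j] * e_prev[j] for j in range(d)) for i in range(d)]
        e_prev = e
        path.append(list(x))
    return path

A = [[0.5, 0.0], [0.0, 0.3]]   # eigenvalues 0.5 and 0.3, modulus < 1
B = [[0.1, 0.0], [0.0, 0.1]]
# Same noise (same seed), very different initial vectors: after 200
# steps the two trajectories are numerically indistinguishable.
p1 = simulate_varma11(A, B, [10.0, 10.0], 200)
p2 = simulate_varma11(A, B, [-10.0, -10.0], 200)
```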
Consider the strictly stationary process satisfying (4.1), associated to the process , where has the invariant measure as its law. Some parameters of the model (4.1) can be estimated by estimators of the form (1.5), and we can apply Theorem 3.1.
For example, take and denote by the mean and by the covariance function of the process
Consider to be the 2-dimensional vector which is the column matrix , and suppose that we want to estimate . One possibility is to use the estimator , whose associated kernel is, of course, the identity.
For estimating the two parameters and , where is the first column and is the second column of , we could also use the estimator , where the associated kernel of is defined by and the associated kernel of is defined by

4.2. Application to Estimation of the Median

We give a very simple example for which it is useful to estimate . For simplicity, we suppose

Let be a random sample for which the sequence of distribution functions of converges to the limiting distribution function with median . A well-known estimator of is the Hodges-Lehmann estimator defined by . The estimator is a weighted -statistic with kernel
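As a sketch of our own (the convention of including the diagonal pairs i = j is an assumption, since conventions vary), the Hodges-Lehmann estimator is the median of the pairwise averages:

```python
from itertools import combinations_with_replacement
import statistics

def hodges_lehmann(sample):
    """Hodges-Lehmann estimator: median of the Walsh averages
    (X_i + X_j) / 2 over all pairs with i <= j."""
    averages = [(a + b) / 2
                for a, b in combinations_with_replacement(sample, 2)]
    return statistics.median(averages)

# hodges_lehmann([1.0, 2.0, 3.0]) → 2.0
```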

The theorems of convergence for -statistics remain true for weighted -statistics. We can easily conclude that converges in law to , and also that converges in law to . We can compare these two results to assess the validity of the estimation of the parameter by .

5. Proof of Theorem 3.1

We are going to use the following lemma proved by Harel and Puri [4, Lemma 2.2], which is a generalization of a lemma of Yoshihara [11, Lemma 2].

Lemma 5.1. Suppose that (3.2), (3.3), and Condition (ii) of Theorem 3.1 are satisfied, then where is .

Writing as

will allow us to determine the contributions of the stochastic behavior of and that of to the limiting distribution.

First, we use the smoothness of our nonparametric estimator to linearize the term , approximating it by the differential minus a centering term . The second term , plus the centering term defined above, is analyzed using an empirical process technique for dependent random variables. Finally, the last term satisfies , by the exponential asymptotic stationarity.

Setting we can rewrite the first and second terms as

Lemma 5.2. Under the conditions of Theorem 3.1, converges in law to the normal distribution as , where is defined in (2.22).

Proof. From the decomposition we can write where and is defined by (2.3) for the component of
From the exponential asymptotic stationarity, converges to zero, and from Lemma 5.1 and Markov inequality, we deduce that converges to zero in probability.
It remains to show that converges to a normal distribution, noting that is a nonstationary, absolutely regular, unbounded sequence of random variables which satisfies the mixing rate (1.2). To prove the asymptotic normality of , we use the following lemma, obtained by Harel and Puri [4, Lemma 2.3].

Lemma 5.3. Let be a nonstationary, absolutely regular, unbounded sequence of random variables which satisfies the mixing rate (1.2). Suppose that for any positive , there exists a sequence of random variables satisfying (1.2) such that , where is a positive constant; , where is a positive constant; , where is a positive constant; . Then converges in law to a normal distribution with mean zero and variance .

First, we prove (5.10) and (5.11) for the sequence .

Put where From condition (v), we have where and (5.10) is proved.

Now, by using the inequality , and choosing such that , we obtain

From (3.2), we have , where is a positive constant, which implies , and (5.11) is proved.

We now show (5.12).

We first note that , where

It follows that , where . Thus, to prove (5.12), we have to show that . From (2.14), the convergence of to the function as , and [4, Lemma 3.1] of Harel and Puri, we obtain as .

From the well-known inequality on moments of absolutely regular processes, we have, for , , where . From (3.3), we have , where is a finite positive constant, which implies , so that , and then . We have . We deduce that . Consequently, (5.12) is verified. Analogously, we show (5.13).

We put where and is defined by (2.16) for the component of .

By using the Lebesgue dominated convergence theorem, we obtain which implies that which proves (5.14). Thus, assumptions (5.10) and (5.14) are satisfied, and converges in law to the normal distribution . Consequently, converges in law to the normal distribution . Therefore, Lemma 5.2 is proved.

Lemma 5.4. Under the conditions of Theorem 3.1, converges to zero in probability.

Proof. We write where with Of course is a multidimensional empirical process.
From [11, Theorem 3] by Yoshihara, we have where Then, we deduce that where is a positive constant.
To prove that it suffices to show that Since is differentiable, there exists in such that The differential being bounded, there exists a positive constant such that which implies that We have where is the copula of and denotes the indicator function.
We have with , where is the modulus of continuity of a bounded function , defined by . We generalize [2, Relation (3.11)] by Sun from the univariate case to the multivariate case by using methods similar to those in [14, Lemmas (6.3) and (6.5)] by Harel and Puri. Therefore, we get . It follows from the inequalities (5.51) and (5.54) that converges to zero in probability as .
For the second term of , using the Lagrange form of Taylor's theorem applied to at the points and up to the second order, there exists such that with
In the same way, there exists such that with , From Harel and Puri [5], we deduce that converges to a multinormal distribution as .
From Condition (iv), we have which implies Consequently, we deduce that converges to zero in probability, and Lemma 5.4 is proved.
Therefore, the proof of Theorem 3.1 follows from Lemmas 5.2 and 5.4.