Abstract
Let be an estimator obtained by integrating a kernel-type density estimator based on a random sample of size . A central limit theorem is established for the target statistic , where the underlying random vector forms an asymptotically stationary absolutely regular stochastic process, and is an estimator of a multivariate parameter obtained by using a vector of U-statistics. The results obtained extend or generalize previous results from the stationary univariate case to the asymptotically stationary multivariate case. An example of an asymptotically stationary absolutely regular multivariate ARMA process and an example of a useful estimation of are given in the applications.
1. Introduction
The purpose of this paper is to estimate the value of a multivariate distribution function, called the target distribution function, at a given point, when observing a nonstationary process. Clearly, there must be a connection between the process and the target distribution. We will assume that, as time goes on, the marginal distribution of the process gets closer and closer to the target in a suitable sense. The point at which we want to estimate the target distribution is not an arbitrary vector, for we will assume that it can be estimated by a vector of -statistics. Such a problem is clearly out of reach in that generality, and we will assume that, though nonstationary, the process exhibits an asymptotic form of stationarity and has a suitable mixing property. These will be defined formally after this general introduction.
Let be a stochastic process indexed by the positive integers, taking values in a finite-dimensional Euclidean space . Identifying with a product of a finite number of copies of the real line, we write for the distribution function of . We will assume that the process has some form of asymptotic stationarity, implying that the sequence converges, in a sense to be made precise, to a limiting distribution function .
For , let denote the -algebra of events generated by .
We will say that the nonstationary stochastic process is absolutely regular if where
Assume that for some positive less than ,
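For orientation, the absolute regularity (beta-mixing) coefficient of a process is classically written as follows; the notation here is the standard one and is supplied only as background for the intended definition, not copied from the paper:

$$\beta(k) \;=\; \sup_{n\ge 1}\ \mathbb{E}\Big[\ \sup_{A\in\sigma(X_{n+k},X_{n+k+1},\dots)} \big|\,\mathbb{P}\big(A\mid \sigma(X_1,\dots,X_n)\big)-\mathbb{P}(A)\,\big|\ \Big],$$

and the process is absolutely regular when $\beta(k)\to 0$ as $k\to\infty$; condition (1.2) then imposes a decay rate on $\beta(k)$.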
We consider a parameter in whose components can be naturally estimated by -statistics. To be more formal and precise, we assume that is defined as follows. Let be an integer, the degree of the -statistics. Let be a function from into , invariant under permutation of its arguments. We are interested in parameters of the form , and the function is called the kernel of the parameter .
Example 1.1. Take to be . The mean vector corresponds to taking , and is the identity.
Example 1.2. Take to be . Consider to be the 2-dimensional vector whose components are the marginal variances. We take and is going to be a function defined on . It has two arguments, each being in , and it is defined by
Such a parameter can be estimated naturally by -statistics, essentially replacing in (1.3) by an empirical counterpart. By using the invariance of , the estimator of is then of the form
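As an illustration of (1.3) and its empirical counterpart, the following sketch computes a degree-m U-statistic by averaging a symmetric kernel over all subsamples of size m; the function names and the choice of the variance kernel in the spirit of Example 1.2 are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

import numpy as np

def u_statistic(sample, kernel, m):
    """Degree-m U-statistic: the average of a symmetric kernel over
    all subsamples of size m (the empirical counterpart of (1.3))."""
    n = len(sample)
    vals = [kernel(*(sample[i] for i in idx))
            for idx in combinations(range(n), m)]
    return np.mean(vals, axis=0)

# Hypothetical illustration in the spirit of Example 1.2: the degree-2
# kernel h(x1, x2) = (x1 - x2)**2 / 2, applied componentwise, has the
# vector of marginal variances as its expectation.
def var_kernel(x1, x2):
    return (x1 - x2) ** 2 / 2

rng = np.random.default_rng(0)
sample = rng.normal(size=(200, 2))          # 200 bivariate observations
print(u_statistic(sample, var_kernel, 2))   # close to (1, 1)
```

The enumeration of all size-m subsets costs O(n^m) kernel evaluations, which is affordable only for small degrees; for degree 2 this is the usual unbiased variance estimator written as a U-statistic.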
Now that we have described the parameter and its estimator, going back to our problem, we need to define an estimator of the distribution function .
A natural one would be the empirical distribution function calculated on the observed values of the process. Even if the empirical distribution function is optimal with respect to the rate of convergence of the mean square error, it is not appropriate here because it does not take into account the fact that is smooth, and in particular the existence of a density .
It is, therefore, natural to seek an estimator of the target distribution which is smooth. A good candidate is obtained by smoothing the empirical distribution function with a kernel. Another way to introduce this estimator is to say that we integrate a standard kernel estimator of the density.
Such an estimator estimates the mean distribution function . But since the sequence has a limit , it estimates the limit as well. To be explicit, we consider a sequence of distribution functions converging, in the usual sense of convergence in distribution, to the point mass at the origin. We write for the empirical distribution function pertaining to the measure having mass at each sample point . Our nonparametric estimator of is , where denotes the convolution operator. Finally, our estimator of is
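A minimal numerical sketch of the smoothed (perturbed) empirical distribution function described above, assuming a Gaussian product smoothing kernel; the paper leaves the smoothing sequence general, so this particular kernel, and all names below, are illustrative assumptions.

```python
from math import erf, sqrt

import numpy as np

def smoothed_edf(sample, x, bandwidth):
    """Smoothed (perturbed) empirical distribution function: the EDF
    convolved with a smoothing distribution, equivalently the integral
    of a kernel density estimator.  A Gaussian product kernel is used
    here as an illustrative choice."""
    z = (x - sample) / bandwidth            # shape (n, d)
    # Componentwise Gaussian CDF Phi(z), then a product over coordinates.
    phi = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))
    return float(np.mean(np.prod(phi, axis=1)))

rng = np.random.default_rng(1)
obs = rng.normal(size=(500, 2))             # centered bivariate sample
# For independent standard normal coordinates, F(0, 0) = 0.25.
print(smoothed_edf(obs, np.zeros(2), bandwidth=0.3))
```

As the bandwidth tends to zero this reduces to the ordinary empirical distribution function; the smoothing trades a small bias for a differentiable estimate.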
Our method is an adaptation of some of the ideas of Puri and Ralescu [1], who proved a central limit theorem of for the i.i.d. case; this result was generalized by Sun [2] to the stationary absolutely regular case, and Sun [3] then proved the asymptotic normality of and of the perturbed sample quantiles under a nonstationary strong mixing condition. We also mention Harel and Puri [4, 5], who proved central limit theorems for -statistics of nonstationary (not necessarily bounded) strong mixing double arrays of random variables; Ducharme and El Mouvid [6], who proved limit theorems for the conditional cumulative distribution function by using the convergence of the ratio of two -statistics; and Oodaira and Yoshihara [7], who obtained the law of the iterated logarithm for sums of random variables satisfying absolute regularity. Harel and Puri [8] then proved the law of the iterated logarithm for the perturbed empirical distribution function when the random variables are nonstationary absolutely regular; later, this result was generalized to the strong mixing condition by Sun and Chiang [9]. In addition, some of the ideas of Billingsley [10] and Yoshihara [11] have been used to study our problem. For limit theorems dealing with -statistics for processes which are uniformly mixing in both directions of time, the reader is also referred to Denker and Keller [12].
2. Preliminaries
To specify our assumption on the process, it is convenient to introduce copies of . Hence we write , an infinite sequence of copies of . The basic idea is to think of the process at time as taking values in , and we think of each as the th component of . We then agree on the following definition.
Definition 2.1. A canonical -subspace of is any subspace of the form with . We write for a generic canonical -subspace.
Remark 2.2. For if we note and we have with and
The origin of this terminology is that when is the real line, then a canonical -subspace is a subspace spanned by exactly distinct vectors of the canonical basis of . We write for a sum over all canonical -subspaces included in .
To such a canonical subspace , we can associate the distribution function of , as well as the distribution function with the same marginals . Clearly, the marginals of are independent, while those of are not.
Consider two nested canonical subspaces and , where . For a function symmetric in its arguments and defined on , we can define its projection onto the functions defined on by
Identifying with and with allows us to project functions defined on onto functions on . However, with this identification, the projection depends on the particular choice of in . To remove this dependence, we sum over all choices of in by
Given -statistics of degree , we can then define an analogue of the Hoeffding decomposition (e.g., Hoeffding [13]) when the random variables come from a nonstationary process. For this purpose, consider first the expectation of if the process had no dependence, namely, Then for any , we define , where is the Dirac function. Finally, for , we set . The analogue of the Hoeffding decomposition is the equality
When we have a vector of -statistics defined by a function as in (1.3), we can write the decomposition componentwise. This is a little cumbersome to write explicitly. Identifying with , say, we write , and each -statistic has a Hoeffding-type decomposition where and is defined by (2.3) for the component of . We can construct the vector . Now, writing for the largest of the 's, we can write a vector version of the Hoeffding decomposition. Note that this decomposition makes explicit use of convention (2.7), which is why this convention was introduced.
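For orientation, in the classical i.i.d. case the Hoeffding decomposition of a degree-$m$ U-statistic $U_n$ with symmetric kernel $h$ and parameter $\theta=\mathbb{E}\,h(X_1,\dots,X_m)$ reads as follows (standard notation, supplied here only as background; the nonstationary analogue of the paper replaces the common law by the time-dependent ones):

$$U_n-\theta \;=\; \sum_{c=1}^{m}\binom{m}{c} H_n^{(c)}, \qquad H_n^{(c)} \;=\; \binom{n}{c}^{-1}\sum_{1\le i_1<\dots<i_c\le n} g_c\big(X_{i_1},\dots,X_{i_c}\big),$$

where $g_1(x)=\mathbb{E}\big[h(x,X_2,\dots,X_m)\big]-\theta$ and, more generally, $g_c$ is the completely degenerate part of $(x_1,\dots,x_c)\mapsto \mathbb{E}\big[h(x_1,\dots,x_c,X_{c+1},\dots,X_m)\big]$.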
We now need to specify exactly what we mean by asymptotic stationarity of a process. For this, recall the following notion of distance between probability measures.
Definition 2.3. The distance in total variation between two probability measures and defined on the same -algebra is . If is a canonical subspace of , we write for the -algebra generated by the 's with . We write for the probability measure pertaining to the process , which is a probability measure on .
Definition 2.4. The process with probability measure on is geometrically asymptotically (pairwise) stationary if there exists a strictly stationary process with distribution on , and a positive less than 1, such that for , Since is strictly stationary in this definition, its restriction to depends in fact only on . Hence, this definition asserts that the process is very close to being stationary when is large. It also implies that This asserts that the marginal distribution of the process converges geometrically fast to a fixed distribution.
We suppose that there exists a strictly stationary process with probability measure on , which is absolutely regular with the same rate as the process . is the distribution function of .
We define the function on by
Next, we denote Identifying with , say, the vector of -statistics being defined by a vector function , we can write We can construct the vector Let where for , otherwise, and is the differential operator. We have We also define
3. Weak Convergence of the Smoothed Empirical Distribution Function
In this section, we identify with and we have, of course, a vector of -statistics defined obviously by a vector function , where the degree of is .
Let be a probability density function on and let be a sequence of positive window-widths tending to zero as . Denote , and consider the perturbed empirical distribution function defined by (1.7) corresponding to the sequence .
Consider the smoothed empirical distribution defined in (1.7) and using the kernel density estimator , where , and define Note that this is of the form (1.7), with where
For a better understanding of the use of the integral-type estimators , it is of interest to study the asymptotic behavior of the distribution of (defined by (3.1)) evaluated at a random point (defined by (1.5)). Such a statistic is useful in estimating a functional when is unknown.
Supposing that the conditions introduced in Section 2 are satisfied, our main result establishes that is an estimator which converges to , and the asymptotic normality allows us to obtain confidence intervals for . Finally, using the notation introduced in Section 2, we can state the following result.
Theorem 3.1. We suppose that
(i) there exists a finite positive constant such that where is the number introduced in (1.2);
(ii) the mixing rate of absolute regularity verifies Condition (1.2);
(iii) Condition (2.14) is verified;
(iv) where is a probability density function;
(v) the sequences and are twice differentiable on with uniformly bounded first and second partial derivatives.
Then converges in law to a normal distribution as , where is defined in (2.22).
We are then faced with a difficulty, as the variance defined in (2.22) is unknown. In order to overcome this difficulty, we can estimate the variance by truncating the expansion of , keeping only the first, most informative, terms and estimating by its empirical counterpart defined by , where . From condition (1.2), we have , where is some finite positive constant.
To obtain a suitable value for , a simple criterion consists of computing the smallest integer for which where would be the needed level of precision.
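The criterion above can be sketched numerically: under a geometric bound of the kind provided by condition (1.2), one searches for the smallest integer at which the remaining tail of the expansion falls below the required precision. The constants rho, c, and eps below are hypothetical placeholders, not values from the paper.

```python
def truncation_order(rho, c, eps):
    """Smallest integer K at which the geometric remainder
    c * rho**K / (1 - rho) of the variance expansion drops below the
    required precision eps (rho, c, eps are hypothetical placeholders)."""
    K = 0
    while c * rho ** K / (1.0 - rho) >= eps:
        K += 1
    return K

# With an illustrative geometric rate rho = 0.9 and precision eps = 1e-3:
print(truncation_order(rho=0.9, c=1.0, eps=1e-3))  # 88
```

The slower the mixing (rho close to 1), the more terms of the expansion must be kept to reach a given precision.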
From the empirical construction of the estimator , we easily deduce the convergence in distribution of to .
4. Applications
4.1. Application to an ARMA Process
First, we give an example for which the stochastic process satisfies our general conditions; that is, is a multivariate asymptotically stationary absolutely regular stochastic process.
Example 4.1. ARMA process. Consider a -variate ARMA(1,1) process defined by where the initial vector has a measure which is not necessarily the invariant measure and admits a strictly positive density, and are square matrices, and is a -variate white noise with a strictly positive density which is geometrically absolutely regular. If the eigenvalues of the square matrix have modulus strictly less than 1, then the process satisfies Condition (2.14) (for a proof, see (5.4) in [8]), and the process is asymptotically stationary and geometrically absolutely regular.
Consider the strictly stationary process satisfying (4.1), associated with the process , where the measure of is the invariant measure. Some parameters of the model (4.1) can be estimated by estimators of the form (1.5), and we can apply Theorem 3.1.
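A simulation sketch of the bivariate ARMA(1,1) scheme of Example 4.1, started from a non-invariant initial value; the matrices and the Gaussian noise below are illustrative choices satisfying the eigenvalue condition, not parameters from the paper.

```python
import numpy as np

def simulate_arma11(Phi, Theta, x0, n, rng):
    """Bivariate ARMA(1,1) path X_t = Phi X_{t-1} + eps_t + Theta eps_{t-1},
    started from an arbitrary (non-invariant) initial vector x0."""
    d = len(x0)
    path = np.empty((n, d))
    x_prev = np.asarray(x0, dtype=float)
    eps_prev = np.zeros(d)
    for t in range(n):
        eps = rng.normal(size=d)            # Gaussian white noise (illustrative)
        x_prev = Phi @ x_prev + eps + Theta @ eps_prev
        eps_prev = eps
        path[t] = x_prev
    return path

rng = np.random.default_rng(2)
Phi = np.array([[0.5, 0.1], [0.0, 0.4]])    # eigenvalues 0.5, 0.4: modulus < 1
Theta = np.array([[0.2, 0.0], [0.1, 0.2]])
path = simulate_arma11(Phi, Theta, x0=[10.0, -10.0], n=2000, rng=rng)
# The effect of the non-invariant start dies out geometrically:
print(np.abs(path[1000:].mean(axis=0)))     # small, near the stationary mean 0
```

Because the spectral radius of Phi is below 1, the marginal law of the simulated path approaches the invariant one geometrically fast, which is the asymptotic stationarity used throughout the paper.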
For example, take and denote by the mean and by the covariance function of the process . Consider to be the 2-dimensional vector given by the column matrix , and suppose that we want to estimate ; then one possibility is to use the estimator , whose associated kernel is, of course, the identity. For estimating the two parameters and , where is the first column and the second column of , we could also use the estimator , where the associated kernel of is defined by and the associated kernel of is defined by
4.2. Application to Estimation of the Median
We give a very simple example for which it is useful to estimate . For simplicity, we suppose . Let be a random sample for which the sequence of distribution functions of converges to the limiting distribution function with median . A well-known estimator of is the Hodges-Lehmann estimator defined by The estimator is a weighted -statistic with kernel
The theorems of convergence for -statistics remain true for weighted -statistics. We can easily conclude that converges in law to , and that also converges in law to . We can compare these two results to evaluate the validity of the estimation of the parameter by .
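A minimal sketch of the Hodges-Lehmann estimator in its classical form, the median of all pairwise means, which is a weighted U-statistic with kernel h(x, y) = (x + y)/2; the sample below is illustrative.

```python
from itertools import combinations

import numpy as np

def hodges_lehmann(sample):
    """Hodges-Lehmann estimator: the median of all pairwise means
    (X_i + X_j) / 2 -- a weighted U-statistic whose kernel is
    h(x, y) = (x + y) / 2."""
    pairs = [(sample[i] + sample[j]) / 2.0
             for i, j in combinations(range(len(sample)), 2)]
    return float(np.median(pairs))

rng = np.random.default_rng(3)
obs = rng.normal(loc=2.0, size=300)         # illustrative sample, median 2
print(hodges_lehmann(obs))                  # close to 2
```

Taking the median rather than the mean of the kernel values is what makes this a weighted, rather than ordinary, U-statistic, and it is what gives the estimator its robustness to outliers.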
5. Proof of Theorem 3.1
We are going to use the following lemma proved by Harel and Puri [4, Lemma 2.2], which is a generalization of a lemma of Yoshihara [11, Lemma 2].
Lemma 5.1. Suppose that (3.2), (3.3), and Condition (ii) of Theorem 3.1 are satisfied, then where is .
Writing as will allow us to determine the contributions of the stochastic behavior of and that of to the limiting distribution.
First, we use the smoothness of our nonparametric estimator to linearize the term , approximating it by the differential minus a centralization term . The second term , plus the centralization term defined above, is analyzed using an empirical process technique for dependent random variables. Finally, the last term satisfies , using the exponential asymptotic stationarity.
Setting we can rewrite the first and second terms as
Lemma 5.2. Under the conditions of Theorem 3.1, converges in law to the normal distribution as , where is defined in (2.22).
Proof. From the decomposition we can write where and is defined by (2.3) for the component of .
From the exponential asymptotic stationarity, converges to zero, and from Lemma 5.1 and Markov inequality, we deduce that converges to zero in probability.
It remains to show that converges to a normal distribution, noting that is a nonstationary absolutely regular unbounded sequence of random variables which verifies the mixing rate (1.2). To prove the asymptotic normality of , we use the following lemma, obtained by Harel and Puri [4, Lemma 2.3].
Lemma 5.3. Let be a nonstationary absolutely regular unbounded sequence of random variables, which verifies the mixing rate (1.2). Suppose that for any positive , there exists a sequence of random variables satisfying (1.2) such that where is a positive constant; where is a positive constant; where is a positive constant; Then converges in law to a normal distribution with mean zero and variance .
First, we prove (5.10) and (5.11) for the sequence .
Put where From condition (v), we have where and (5.10) is proved.
Now, by using the inequality and letting be such that , we obtain
From (3.2), we have where is a positive constant, which implies and (5.11) is proved.
We now show (5.12).
We first note that where
It follows that where Thus, to prove (5.12), we have to show that From (2.14), the convergence of to the function as , and [4, Lemma 3.1] of Harel and Puri, we obtain as .
From the well-known inequality on moments of absolutely regular processes, we have, for , where From (3.3), we have where is a finite positive constant, which implies so that then We have We deduce that Consequently, (5.12) is verified. Analogously, we show (5.13).
We put where and is defined by (2.16) for the component of .
By using the Lebesgue dominated convergence theorem, we obtain which implies that which proves (5.14). Thus, assumptions (5.10) and (5.14) are satisfied, and converges in law to the normal distribution . Consequently, converges in law to the normal distribution . Therefore, Lemma 5.2 is proved.
Lemma 5.4. Under the conditions of Theorem 3.1, converges to zero in probability.
Proof. We write
where
with Of course, is a multidimensional empirical process.
From [11, Theorem 3] by Yoshihara, we have where
Then, we deduce that where is a positive constant.
To prove that it suffices to show that
Since is differentiable, there exists in such that The differential being bounded, there exists a positive constant such that which implies that
We have where is the copula of and denotes the indicator function.
We have with where is the modulus of continuity, defined for any bounded function by We generalize [2, Relation (3.11)] by Sun from the univariate case to the multivariate case, using methods similar to those in [14, Lemmas (6.3) and (6.5)] by Harel and Puri. Therefore, we get It follows from the inequalities (5.51) and (5.54) that converges to zero in probability as .
For the second term of , using the Lagrange form of Taylor's theorem applied to at the points and up to the second order, there exists such that with In the same way, there exists such that with From Harel and Puri [5], we deduce that converges to a multinormal distribution as .
From Condition (iv), we have
which implies
Consequently, we deduce that converges to zero in probability, and Lemma 5.4 is proved.
Therefore, the proof of Theorem 3.1 follows from Lemmas 5.2 and 5.4.