Tightness Criterion and Weak Convergence for the Generalized Empirical Process in $D[0,1]$
We prove Shao and Yu's tightness criterion for the generalized empirical process in the space $D[0,1]$ with the Skorokhod $J_1$ topology. Covariance inequalities are used in applying the criterion to particular types of empirical processes. We weaken the assumptions imposed on the covariance structure, as well as on the properties of the underlying sequence of r.v.'s, under which the presented processes converge weakly.
Let $(X_n)_{n\ge 1}$ be a sequence of absolutely continuous identically distributed (i.d.) random variables (r.v.’s) with an unknown distribution function (d.f.) $F$ and probability density function (p.d.f.) $f$. The empirical distribution function, based on the first $n$ r.v.’s, is defined by $F_n(x)=\frac{1}{n}\sum_{i=1}^{n} I[X_i\le x]$. It is well known, however, that this estimate does not make use of the smoothness of $F$, that is, the existence of the p.d.f. $f$. Therefore, the kernel estimate $\hat F_n(x)=\frac{1}{n}\sum_{i=1}^{n}K\left(\frac{x-X_i}{h_n}\right)$ has been proposed, where the kernel function $K$ is a known d.f. and $(h_n)_{n\ge 1}$ is a sequence of positive constants decreasing to zero at an appropriate rate. This estimator has been studied extensively over the last two decades, mainly by Cai and Roussas in [1–4], Li and Yang in , and others. Asymptotic normality and Berry-Esseen bounds for the smooth estimator are only two examples of their fruitful results.
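To make the two estimators concrete, here is a minimal Python sketch of the empirical d.f. $\frac{1}{n}\sum_i I[X_i\le x]$ and of the kernel estimate $\frac{1}{n}\sum_i K((x-X_i)/h)$. The function names and the choice of the standard normal d.f. as the kernel $K$ are assumptions made only for illustration; the text requires only that $K$ be a known d.f.

```python
from math import erf, sqrt


def empirical_cdf(sample, x):
    """Empirical d.f.: fraction of observations not exceeding x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)


def normal_kernel(u):
    """Standard normal d.f., used here as the kernel K (illustrative choice)."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))


def kernel_cdf(sample, x, h):
    """Kernel estimate of the d.f. with a single bandwidth h > 0."""
    return sum(normal_kernel((x - xi) / h) for xi in sample) / len(sample)


if __name__ == "__main__":
    data = [0.1, 0.4, 0.7]
    print(empirical_cdf(data, 0.5))    # step-function estimate
    print(kernel_cdf(data, 0.5, 0.1))  # smoothed estimate
```

For a very small bandwidth the smoothed estimate approaches the step-function estimate at points away from the observations, which is one way to see the empirical d.f. as a limiting case.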
Recently, Li et al. proposed in  the so-called recursive kernel estimator of the d.f. as follows:
$$\tilde F_n(x)=\frac{1}{n}\sum_{i=1}^{n}K\left(\frac{x-X_i}{h_i}\right).$$
The seemingly tiny modification they introduced to the formula of the typical kernel estimator has an important advantage. Namely, for a large sample, $\tilde F_n(x)$ can be easily updated with each new observation, since it is computable recursively by
$$\tilde F_n(x)=\frac{n-1}{n}\,\tilde F_{n-1}(x)+\frac{1}{n}K\left(\frac{x-X_n}{h_n}\right),$$
where $\tilde F_0(x)=0$. The authors discussed the asymptotic bias and quadratic-mean convergence and established the pointwise asymptotic normality of $\tilde F_n(x)$ under relevant assumptions.
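The recursive update can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes the estimator has the form $\frac{1}{n}\sum_{i=1}^{n}K((x-X_i)/h_i)$, takes the standard normal d.f. as $K$, and uses a hypothetical bandwidth sequence $h_i=i^{-1/3}$ that is not taken from the paper.

```python
from math import erf, sqrt


def K(u):
    """Kernel d.f. (standard normal; an illustrative assumption)."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))


def bandwidth(i):
    """Hypothetical bandwidth sequence h_i = i^(-1/3); not from the paper."""
    return i ** (-1.0 / 3.0)


def batch_estimate(sample, x):
    """Direct evaluation: (1/n) * sum_i K((x - X_i) / h_i)."""
    n = len(sample)
    return sum(K((x - xi) / bandwidth(i))
               for i, xi in enumerate(sample, start=1)) / n


def recursive_update(previous, n, x_new, x):
    """One-step update: ((n-1)/n) * previous + (1/n) * K((x - X_n) / h_n)."""
    return (n - 1) / n * previous + K((x - x_new) / bandwidth(n)) / n


def recursive_estimate(sample, x):
    """Fold the update over the sample, starting from the value 0."""
    estimate = 0.0
    for n, x_new in enumerate(sample, start=1):
        estimate = recursive_update(estimate, n, x_new, x)
    return estimate
```

Each update costs one kernel evaluation per query point, whereas a single-bandwidth kernel estimate must be recomputed from all observations whenever the bandwidth changes with the sample size.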
In this paper, however, we will focus on the empirical process built on an estimator of the d.f. rather than on the estimator itself. Let us recall that the following process: is called the empirical process built on an estimator .
Yu  studied the case when is a standard empirical d.f. and showed weak convergence of to the Gaussian process assuming stationarity and association of the underlying r.v.’s. Cai and Roussas  obtained a similar result in the case when is the kernel estimator of the d.f. built on a stationary sequence of negatively associated r.v.’s.
In this paper, we shall study the empirical process generated by the generalized kernel estimator of the d.f. given by the formula
A1: is a sequence of absolutely continuous i.d. r.v.’s taking values in and having twice differentiable d.f. with first and second derivative bounded;
A2: is a kernel function such that and with bounded derivative ;
A3: is a sequence of positive constants subject to the following conditions: , (actually, since , under the limit suffices).
Explicitly, we consider the process which we shall henceforth call the generalized empirical process. Note that (i) for , is the empirical process based on the kernel estimator of the d.f. ; (ii) for , is the empirical process based on the recursive kernel estimator of the d.f. ; (iii), is the standard empirical process (based on the empirical d.f.).
It is well known that the crucial step in showing weak convergence of an empirical process is to verify tightness. In , Shao and Yu gave the following criterion: under which the standard empirical process based on a stationary sequence of uniform r.v.’s is tight. It is stated there that the proof of that fact is an easy standard procedure parallel to the one presented in . The main aim of this paper is to carry it out in detail, but for the generalized empirical process defined by (6) and without assuming stationarity. Nevertheless, we will always return to the stationarity assumption when establishing weak convergence.
In order to obtain tightness, one has to assume an appropriate covariance structure of the underlying r.v.’s, that is, the covariance of a pair of r.v.’s and has to decline at the right rate as and grow apart. In this paper we lower the required rate of covariance decay using the covariance inequalities for associated (cf. [10, 11]) and multivariate totally positive of order 2 ($MTP_2$) (cf. ) r.v.’s obtained in .
The paper is organized as follows. In Section 2 we present the proof of Shao and Yu’s tightness criterion formulated for our generalized empirical process. Sections 3 and 4 are devoted to applying the criterion to show tightness, and thus weak convergence, of specific types of empirical processes. Section 5 concerns weak convergence of the recursive kernel-type process for i.i.d. r.v.’s.
2. Tightness Criterion
We start with the key point of the paper.
Theorem 1. Let be the generalized empirical process defined as in (6). Assume that A1, A2, A3 hold.
If there exist constants , , , , , such that for any and the following inequality holds: then the process is tight in $D[0,1]$ with the Skorokhod $J_1$ topology.
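Proof. The proof boils down to showing that, under the assumptions of Theorem 1, the conditions of Theorem 13.2 in  hold. Let us recall that, in light of the above-mentioned theorem, a sequence of processes $(Y_n)_{n\ge 1}$ is tight in $D[0,1]$ if
$$\lim_{a\to\infty}\limsup_{n\to\infty}P\left(\|Y_n\|\ge a\right)=0, \tag{9}$$
$$\lim_{\delta\to 0}\limsup_{n\to\infty}P\left(w'_{Y_n}(\delta)\ge\varepsilon\right)=0\quad\text{for every }\varepsilon>0, \tag{10}$$
where $\|\cdot\|$ is the supremum norm, that is,
$$\|x\|=\sup_{t\in[0,1]}|x(t)|,$$
and $w'_x(\delta)$ is the modulus of continuity of the function $x\in D[0,1]$, that is
$$w'_x(\delta)=\inf_{\{t_i\}}\max_{1\le i\le v}\ \sup_{s,t\in[t_{i-1},t_i)}|x(s)-x(t)|.$$
The infimum runs over all finite “$\delta$-sparse” decompositions $\{t_i\}$ of the interval $[0,1]$. In other words, it runs over all choices of increasingly ordered points $t_0<t_1<\dots<t_v$ such that $t_i-t_{i-1}>\delta$, where $i=1,\dots,v$, $t_0=0$, and $t_v=1$.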
Let us first show that condition (9) is satisfied. According to the corollary following Theorem 13.2 in , it suffices to show that is, Let us fix and . We need to find , such that where is the generalized kernel estimator of the d.f. . Applying Chebyshev’s inequality, we shall find satisfying Such exists if for all . Substituting and in assumption (8), we get for , Now, applying Hölder’s inequality, we arrive at which, in light of inequality (18) for , is bounded from above by . Therefore, condition (9) holds.
We shall now proceed to checking condition (10). Let us recall that the modulus of continuity of the function $x$, $w_x(\delta)$, is given by the formula
$$w_x(\delta)=\sup_{|s-t|\le\delta}|x(s)-x(t)|.$$
As shown in  (page 123), there is a relation between the moduli $w'_x$ and $w_x$ used in the spaces $D[0,1]$ and $C[0,1]$, respectively. Namely, for $\delta<1/2$ and $x\in D[0,1]$,
$$w'_x(\delta)\le w_x(2\delta).$$
Thus, Therefore, condition (10) holds if we show With a view to obtaining the desired inequality, we shall proceed in five steps.
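As an illustration of the quantity being controlled, the uniform modulus of continuity $\sup_{|s-t|\le\delta}|x(s)-x(t)|$ can be approximated on a finite grid. The sketch below is only a discretized approximation under that assumption (the supremum is taken over grid points), and the function name is hypothetical.

```python
def modulus_of_continuity(values, grid, delta, tol=1e-12):
    """Largest |x(s) - x(t)| over grid points s < t with t - s <= delta.

    `grid` must be increasing and `values[i]` is the path value at `grid[i]`.
    `tol` absorbs floating-point error in the comparison t - s <= delta.
    """
    best = 0.0
    for i in range(len(grid)):
        j = i + 1
        while j < len(grid) and grid[j] - grid[i] <= delta + tol:
            best = max(best, abs(values[j] - values[i]))
            j += 1
    return best


if __name__ == "__main__":
    grid = [k / 10 for k in range(11)]
    line = [2.0 * t for t in grid]                   # Lipschitz path
    step = [0.0 if t < 0.5 else 1.0 for t in grid]   # path with a unit jump
    print(modulus_of_continuity(line, grid, 0.2))
    print(modulus_of_continuity(step, grid, 0.1))
```

For a path with a jump this modulus does not shrink as $\delta\to 0$, which is why the jumps of the empirical process must become negligible at the scale $\delta$ for the condition to hold.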
Step 1. We need a moment inequality for the r.v. involving the distance between the points and . Let us then assume that for the constants , , , , given in Theorem 1, inequality (8) holds. Fix , and define the quantity , . Next, we fix and take large enough so that . Then and we have
Step 2. Let us now fix , and consider the following r.v.’s: for , where is such that . It is easy to see that for Let us notice that for r.v.’s the conditions of Theorem 10.2 in  are satisfied with , and . Therefore, we are equipped with the following maximal inequality: for all .
Step 3. Let , where is the p.d.f. of r.v.’s . For fixed , let us take such that and define . For sufficiently large we have and then we get
Step 4. Our goal is to obtain an inequality which enables us to bound the supremum of the increment of the function via the maximum of the increments of that function on some subintervals. To be more precise, we will find the upper bound for in terms of where , , and is defined as in Step 3.
Let us recall that , where . From the triangle inequality we can see that Since is a nondecreasing function, applying the triangle inequality again, we get Cai and Roussas in  showed that under the assumptions made on the d.f. and the kernel function , we have Similarly, by Taylor expansion, it is easy to see that Thus, where and is a positive constant dependent on the functions and .
Let us recall that , and . Taking we notice that , thus when the interval coincides with . Approaching the aim of Step 4, let us observe that where is a point at which the above supremum is attained and Without loss of generality, we may assume that , which implies Now, applying inequality (36), the definition of and the triangle inequality, we have If we plug in we arrive at
Step 5. Finally, we are in a position to obtain inequality (23), that is, We now successively make use of inequalities (29), (41), (27), (29), and (28) to get for any fixed . Since the upper bound does not depend on and the probability measure as well as the supremum function are continuous, we obtain where is arbitrarily small.
Since condition (10) is checked, the proof is completed.
3. Tightness of the Standard Empirical Process
In this section, we deal with the standard empirical process built on an associated sequence of uniformly distributed r.v.’s , that is, where . We shall relax the restrictions imposed on the process by Yu in  to obtain tightness. Precisely, we do not need stationarity any more due to the technique drawn from , and we lower the assumed rate at which the covariance tends to zero.
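For intuition, the standard uniform empirical process $\sqrt{n}\,(F_n(t)-t)$ can be simulated directly. The sketch below uses independent uniform draws, which form a (trivially) associated sequence; this is an illustrative simplification, not the dependent setting studied in this section. In the independent case the variance at a fixed $t$ equals $t(1-t)$, which the simulation approximates.

```python
import random
from math import sqrt


def empirical_process(sample, t):
    """Value at t of sqrt(n) * (F_n(t) - t) for a sample of uniform r.v.'s."""
    n = len(sample)
    f_n = sum(1 for u in sample if u <= t) / n
    return sqrt(n) * (f_n - t)


def simulate_variance(t, n=200, reps=2000, seed=0):
    """Monte Carlo estimate of the variance of the process at t (i.i.d. case)."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        values.append(empirical_process(sample, t))
    mean = sum(values) / reps
    return sum((v - mean) ** 2 for v in values) / (reps - 1)


if __name__ == "__main__":
    # Should be close to t * (1 - t) = 0.25 at t = 0.5.
    print(simulate_variance(0.5))
```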
In proving tightness of the empirical process, we will use the criterion proved in the previous section as well as some of our covariance inequalities.
We start with a fact known as the multinomial theorem, which we recall in the following lemma.
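Lemma 2. For natural numbers $n$, $m$ and real numbers $x_1,\dots,x_m$,
$$(x_1+\dots+x_m)^n=\sum_{k_1+\dots+k_m=n}\frac{n!}{k_1!\cdots k_m!}\,x_1^{k_1}\cdots x_m^{k_m},$$
where $k_1,\dots,k_m\ge 0$ and $k_1+\dots+k_m=n$.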
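The multinomial theorem, $(x_1+\dots+x_m)^n=\sum \frac{n!}{k_1!\cdots k_m!}x_1^{k_1}\cdots x_m^{k_m}$ with the sum over nonnegative integers $k_1+\dots+k_m=n$, is easy to check by brute force. The following sketch (with a hypothetical function name) enumerates all exponent vectors, which is feasible only for small $n$ and $m$; the case $n=4$ is the one relevant to fourth-moment bounds.

```python
from itertools import product
from math import factorial


def multinomial_expansion(xs, n):
    """Evaluate (x_1 + ... + x_m)^n term by term via the multinomial theorem."""
    m = len(xs)
    total = 0
    for ks in product(range(n + 1), repeat=m):
        if sum(ks) != n:
            continue  # keep only exponent vectors with k_1 + ... + k_m = n
        coefficient = factorial(n)
        for k in ks:
            coefficient //= factorial(k)  # exact: multinomial coefficients are integers
        term = coefficient
        for x, k in zip(xs, ks):
            term *= x ** k
        total += term
    return total


if __name__ == "__main__":
    xs = [1, 2, 3]
    print(multinomial_expansion(xs, 4), sum(xs) ** 4)
```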
In particular, which implies for , that Let us now introduce the following notation due to Doukhan and Louhichi (see ) for a sequence of centered r.v.’s . where the supremum runs over all divisions of the group composed of r.v.’s into two subgroups, such that the distance between the highest index of the r.v.’s in the first group and the lowest index of the r.v.’s from the second group is equal to , . For , we shall define . Let us then put We will now estimate the summands in (48).
If , then Similarly, for , we have When , then We keep on estimating the fourth moment of by introducing —the index, for which attains its maximum. The terms obtained in (54) may further be bounded from above in the following way: We thus get the inequality
We shall now focus on estimating and . Since and are associated uniformly distributed r.v.’s, and , as monotone functions of these r.v.’s, are associated as well. In order to bound , let us notice that, by the Schwarz inequality, On the other hand, invoking the inequalities from [13, 15], As a consequence, where . Still, we need an upper bound for . It turns out that In other words, among all divisions of the group , into two subgroups, attains its largest value when we take two subgroups consisting of two r.v.’s each. The supremum runs over the set . An elementary calculation leads to the following formula: where, for the sake of simplicity, , , and are, respectively, the constant term, the expression with , and the expression with . Using the Lebowitz inequality (see  for instance) and the inequalities obtained in [13, 15], we arrive at Analogously, we get Eventually, we have where and for .
Let us now introduce the following notation: and assume it decays polynomially at rate in the following way: Returning to inequality (56), we can now carry on. First, for associated r.v.’s, where and are constants and It is worth mentioning that in the last inequality of (68) we used the estimate At the same time, in the case of $MTP_2$ r.v.’s, we get where is constant and Let now . Then has the fourth moment estimated, in the case of associated r.v.’s, by and in the case of $MTP_2$ r.v.’s by In light of Shao and Yu’s criterion, our process is tight for associated r.v.’s when and for $MTP_2$ r.v.’s when . Let us sum up this result in the following theorem.
Theorem 3. Let be the empirical process built on an associated sequence of uniformly distributed r.v.’s . Let also Then is tight for . If the r.v.’s are $MTP_2$, then the process is tight for .
Yu assumed stationarity of and for a positive constant , thus, the rate of decay . Our result considerably weakens these assumptions, especially in the case of $MTP_2$ r.v.’s.
Louhichi, in , proposed a different tightness criterion involving the so-called bracketing numbers. She managed to enhance Yu’s result—even more than Shao and Yu in —since she proved that it suffices to take to get tightness of the empirical process based on the associated r.v.’s. Nevertheless, she kept the assumption of stationarity valid.
In the final analysis, the advantage of our result is the absence of the stationarity assumption, and the rate of decay for remains (to the best of the author’s knowledge) unimproved for $MTP_2$ r.v.’s.
Unfortunately, in order to obtain weak convergence of the process in question, which also requires convergence of the finite-dimensional distributions, we do not know how to manage without the assumption of stationarity. Therefore, we conclude with the following corollary.
Corollary 4. Let be the empirical process built on a stationary associated sequence of uniformly distributed r.v.’s . Let also
Then, if where is the zero mean Gaussian process on with covariance structure defined by
If the r.v.’s are $MTP_2$, then it suffices to have in order to claim the above convergence.
Proof. It remains to establish convergence of finite-dimensional distributions repeating the procedure from .
4. Tightness of the Kernel-Type Empirical Process
In this section we shall weaken the assumption imposed on the covariance structure of the r.v.’s by Cai and Roussas in  for the kernel estimator of the d.f. They deal with a stationary sequence of negatively associated r.v.’s (cf. ) and need the same condition as Yu , that is, to get tightness of the smooth empirical process (see condition (A4) in ).
It turns out that it suffices to have where is a positive constant taken from the tightness criterion (8). It is easy to see that asymptotically we get the rate .
To prove this, we will also make use of a Rosenthal-type inequality due to Shao and Yu (see Theorem 2 in ), which we recall in the following lemma.
Lemma 5. Let and be a real valued function bounded by with bounded first derivative. Suppose that is a sequence of stationary and associated r.v.’s, such that for Then, for any there exists some positive constant independent of the function , for which
As we can see, the lemma assumes association, but it works for negatively associated r.v.’s as well, since its proof relies on a result of Newman (see Proposition 15 in ), where both types of association are allowed.
Let us recall that , where . It is easy to see that where In order to use Lemma 5, we need to be bounded from above by (which in our case is obvious) and to have bounded ; thus, we assume Now, applying inequality (84), we have Newman showed in  that if and are real valued functions on having square integrable derivatives and , respectively, and provided that , have finite second moments. In light of that equation and linearity of covariance, we get Without loss of generality, we may and do assume that , thus and by the triangle inequality we get where , and are the joint p.d.f. of and marginal p.d.f.’s of r.v.’s and , respectively. We need to further assume that stands for a common upper bound of and for all , that is, Then, we finally obtain On the other hand, proceeding like Cai and Roussas in , we readily get where is a constant relevant to the covariance inequality for negatively associated r.v.’s (see ).
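For orientation, the covariance identity attributed to Newman above is commonly stated in the following Hoeffding-type form. This is a sketch in generic notation ($X$, $Y$, $f$, $g$, and $H_{X,Y}$ are symbols chosen here, not necessarily those of the paper), and the precise regularity conditions should be taken from the cited source.

```latex
% Hoeffding-type covariance identity (generic notation, assumed form)
\operatorname{Cov}\bigl(f(X),\,g(Y)\bigr)
  = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
      f'(x)\,g'(y)\,H_{X,Y}(x,y)\,dx\,dy,
\qquad
H_{X,Y}(x,y) = P(X>x,\;Y>y) - P(X>x)\,P(Y>y).
```

When the pair $(X,Y)$ has a joint density, $H_{X,Y}$ can be written as an integral of the difference between the joint density and the product of the marginal densities, which is how bounds on the densities enter the estimate.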
We now arrive at the following inequality: Assuming and proceeding similarly to (68), we can get To sum up, we obtain the following inequality: