Note on Qualitative Robustness of Multivariate Sample Mean and Median
It is known that the robustness properties of estimators depend on the choice of a metric in the space of distributions. We introduce a version of Hampel's qualitative robustness that takes into account the √n-asymptotic normality of estimators in Euclidean space, and examine such robustness of two standard multivariate location estimators. For this purpose, we use a certain combination of the Kantorovich and Zolotarev metrics rather than the usual Prokhorov type metric. This choice of the metric is explained by an intention to expose a (theoretical) situation where the robustness properties of the sample mean and the sample median are the reverse of the usual ones. Using the mentioned probability metrics we show the qualitative robustness of the multivariate sample mean and prove an inequality which provides a quantitative measure of robustness. On the other hand, we show that the sample median cannot be “qualitatively robust” with respect to the same distance between the distributions.
The following definition of qualitative robustness due to Hampel [1, 2] (originally given for the one-dimensional case) deals with metric balls in the space of distributions rather than with standard “contamination neighborhoods” (for the latter see, e.g., [3, 4]).
The sequence , , of estimators is qualitatively robust at the distribution if for every there exists such that entails
Here, and throughout, denotes the Prokhorov metric, and ; , are i.i.d. random vectors distributed, respectively, as and .
For a metric on the space of distributions and random vectors we will write (as in (1)) having in mind the -distance between the distributions of and .
Of course, the Prokhorov metric is only one option. For instance, in  other probability metrics were used in the definition of qualitative robustness. (See also [6–9] for the use of different probability metrics or pseudo-metrics related to the estimation of robustness.)
As noted in [1, 2] in , sample means are not qualitatively robust at any , while sample medians are qualitatively robust at any having a unique median. (See also  for lack of qualitative robustness of sample means in certain Banach spaces).
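As a small numerical illustration of this classical contrast (not taken from the paper; the data set and the size of the gross error are arbitrary assumptions), the following Python sketch replaces a single observation out of 101 by a very large value. The two empirical distributions differ on only a fraction 1/101 of the mass, so they are close in the Prokhorov sense, yet the sample mean moves far away while the sample median does not move at all.

```python
import statistics

# Hypothetical one-dimensional sample: 0, 1, ..., 100 (mean and median both 50).
clean = list(range(101))

# Gross-error contamination: one observation is replaced by a huge value.
contaminated = clean[:-1] + [10**9]

mean_shift = abs(statistics.mean(contaminated) - statistics.mean(clean))
median_shift = abs(statistics.median(contaminated) - statistics.median(clean))

print("shift of sample mean:  ", mean_shift)
print("shift of sample median:", median_shift)
```

Making the replaced value larger increases the shift of the mean without bound, while the shift of the median stays at zero.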
Moreover, in  it was shown that for symmetric distributions the median is, in a certain sense, the “most robust” estimator of a center location when using the pseudo-metric corresponding to neighborhoods of contamination type. (See more in this respect in .)
At the same time, it is known from the literature that under different circumstances, in particular, using distinct probability metrics (as, e.g., in (1) or in other definitions) the robustness properties of estimators can change considerably (see, for instance, the discussion in ).
The first aim of the present paper is to consider a modified version of Hampel’s definition of qualitative robustness taking into account the √n-asymptotic normality of the sequence of estimators. The formal definition is given in the next section but, basically, we replace (1) with the following condition: where is some constant, and is a probability metric (different from the Prokhorov metric in our case).
The second goal of the paper is to present an example of two probability metrics and (on the space of distributions in ) for which the following holds: (i) when , are multivariate sample means, the left-hand side of inequality (2) (with some ) is bounded by const ; (ii) when , , are sample medians, we give an example of symmetric smooth distributions and , , in , such that as , while there is a positive constant such that the left-hand side of inequality (2) (with ) is greater than for all sufficiently small . Therefore, sample medians are not qualitatively robust (in our sense) with respect to these metrics.
The metrics and are the following (the complete definitions are given in Section 2). is the Kantorovich metric (see, e.g., ) and is a certain combination of and of the Zolotarev metric of order 2 (see ).
We should stress that this choice is not determined by any advantages for statistical applications. Moreover, the closeness of distributions in the Kantorovich metric implies the closeness in the Prokhorov metric, but also reduces the probability of “large-valued outliers” (but not the rounding errors). Therefore, our selection of metric is not quite consistent with the standard approach to qualitative robustness where the Prokhorov metric (or its invariant versions) is used.
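The first implication can be made quantitative by a standard inequality, recalled here as a reminder rather than as part of the paper's argument. The symbols are assumptions: π denotes the Prokhorov metric and κ the Kantorovich metric, both built on the same norm.

```latex
% Take a coupling (X, Y) of P and Q with E\|X - Y\| = \kappa(P,Q) and apply
% Markov's inequality with threshold \sqrt{\kappa(P,Q)}:
\Pr\bigl(\|X - Y\| > \sqrt{\kappa(P,Q)}\bigr)
  \le \frac{\mathbb{E}\|X - Y\|}{\sqrt{\kappa(P,Q)}}
  = \sqrt{\kappa(P,Q)},
\qquad\text{hence}\qquad
\pi(P,Q)^{2} \le \kappa(P,Q).
```

No inequality in the opposite direction holds in general, since a small amount of mass placed very far away changes π only slightly but can make κ arbitrarily large.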
Nevertheless, our choice allows us to unveil the possible unusual robustness properties of sample means and medians and to assess a certain quantitative robustness of the multivariate sample mean (with respect to the considered metrics!). The obtained “robustness inequality” does not work in the “gross error model” but (jointly with inequalities (23)) it could be useful for quantitative assessment of the robustness of sample means under perturbation of data of “rounding” type.
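To make the difference between “rounding” perturbations and gross errors concrete, here is a minimal Python sketch under the assumption of one-dimensional samples of equal size. In that case the Kantorovich distance between the two empirical distributions equals the average absolute difference of the sorted values. The helper `kantorovich_1d` and the data are illustrative choices, not objects from the paper.

```python
import math

def kantorovich_1d(xs, ys):
    """Kantorovich distance between the empirical distributions of two
    equal-size one-dimensional samples (mean gap between order statistics)."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Hypothetical data on a grid of step 0.1.
data = [i / 10 for i in range(100)]

# "Rounding" perturbation: every value moves by at most 0.5.
rounded = [math.floor(v + 0.5) for v in data]

# Gross error: a single observation is replaced by a huge value.
gross = data[:-1] + [1e6]

print(kantorovich_1d(data, rounded))  # at most 0.5
print(kantorovich_1d(data, gross))    # about 1e4
```

Rounding keeps the two samples close in this distance, whereas a single gross error does not, which is why a bound in terms of the Kantorovich metric says little in the gross error model.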
2. Basic Definitions
Let be a separable Hilbert space with the norm generated by an inner product , and let be the Borel σ-algebra of subsets of . Let also and be two sequences of i.i.d. random vectors in with their respective distributions denoted by and .
Under the assumption the means and are defined as the corresponding Bochner integrals.
In what follows we denote ():
On the other hand, let (), be sample medians defined by (4) and (5) replacing and by the corresponding empirical distributions and (obtained from and , resp.). Robustness (in terms of ε-contamination neighborhoods) and asymptotic normality of , , were proved, for instance, in  (see also ). The qualitative robustness properties of sample means and sample medians when the Prokhorov metric is used were discussed in the Introduction.
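Formulas (4) and (5) are not reproduced here. Assuming that the median in question is the spatial (geometric) median, that is, a minimizer of the average distance to the observations, its sample version can be approximated by Weiszfeld's fixed-point iteration. The sketch below uses this standard algorithm as an illustration and is not the authors' procedure; the function names, tolerances, and data are hypothetical.

```python
import math

def euclid(p, q):
    """Euclidean distance between two points given as coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def spatial_median(points, tol=1e-12, max_iter=10_000):
    """Approximate the spatial median of points in R^d by Weiszfeld's iteration."""
    dim = len(points[0])
    # Start from the sample mean.
    m = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    for _ in range(max_iter):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            dist = euclid(p, m)
            if dist < 1e-15:
                # The iterate sits on a data point; skip it to avoid dividing by zero.
                continue
            w = 1.0 / dist
            den += w
            for j in range(dim):
                num[j] += w * p[j]
        if den == 0.0:
            return m
        new_m = [v / den for v in num]
        if euclid(new_m, m) < tol:
            return new_m
        m = new_m
    return m

# Four corners of the unit square plus one far-away point (hypothetical data).
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
mean = [sum(p[j] for p in pts) / len(pts) for j in range(2)]
print("sample mean:   ", mean)                 # dragged towards the far point
print("spatial median:", spatial_median(pts))  # stays inside the unit square
```

For these five points the sample mean is (20.4, 20.4), while the iteration settles on the diagonal inside the unit square, which illustrates the robustness in terms of contamination neighborhoods mentioned above.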
Let us first see what happens with qualitative robustness of and if in the above definition we replace with the Kantorovich metric: where
Now, applying the regularity properties of (see, e.g., ): we see that the sequence of sample means is “qualitatively robust” using instead of the Prokhorov metric.
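The displayed regularity properties are not reproduced above, so the following is a sketch of how the argument presumably runs. The notation is assumed: κ is the Kantorovich metric, and the barred symbols are the sample means of the two i.i.d. sequences. It uses only two standard facts: κ is homogeneous of order 1, and κ is subadditive over sums of independent random vectors.

```latex
\kappa\bigl(\bar X_n, \bar Y_n\bigr)
  = \kappa\Bigl(\tfrac{1}{n}\sum_{i=1}^{n} X_i,\ \tfrac{1}{n}\sum_{i=1}^{n} Y_i\Bigr)
  = \tfrac{1}{n}\,\kappa\Bigl(\sum_{i=1}^{n} X_i,\ \sum_{i=1}^{n} Y_i\Bigr)
  \le \tfrac{1}{n}\sum_{i=1}^{n} \kappa(X_i, Y_i)
  = \kappa(X_1, Y_1), \qquad n \ge 1 .
```

Under these assumptions, choosing δ = ε in the Hampel-type definition gives the required bound uniformly in n.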
It seems straightforward (using an approach similar to the one given in [10, 19] and results in ) to show that the sequence of sample medians is also “qualitatively robust” with respect to . However, this is out of the scope of this paper.
3. -Robustness and Main Results
Now and in what follows we suppose that with the Euclidean norm . The results presented in this section are an extension to the multidimensional case of the similar findings for , published in the hardly accessible proceedings . Moreover, we improve the results of  even in the one-dimensional case.
In order to simplify calculations in the proof of Theorem 5 below we will use in the definitions of Kantorovich’s and Zolotarev’s metrics (see below) the following norm in the space .
Let be some fixed simple probability metric on the set of all probability distributions on .
We will consider a sequence , , of estimators of some parameter of the distribution ( of the distribution , resp.).
Definition 1. We say that a sequence , , is -robust at if there is some fixed vector such that for every there exists such that entails
Remark 2. Taking into account that , , we see that (12) can be related to the √n-asymptotic normality of estimators , . The “scaling parameter” (in (12)) is necessary to ensure the equality of means of the corresponding limit normal distributions (when they exist). Only in case in (12) .
Remark 3. The distance can take infinite value. Particularly, from (13), (14) we see that if . On the other hand, if there exist second moments, and , , then (see, e.g., ). The function in (14). Therefore, by (13), Thus, if and then . Consequently, if and , then if and only if .
We now define the metric to work with:
3.1. -Robustness of Sample Means
To prove the inequality in Theorem 5 below we need to impose the following restriction on the distribution .
Assumption 4. (i) , and the covariance matrix of is positive definite.
The distribution of has a density such that for some :
(ii) the density of is bounded and differentiable;
(iii) the gradient is bounded and belongs to ;
(iv) for some
Remark 6. (i) The constant in (20), (21) is entirely determined by the distribution of . For various particular densities of the constant in (21) can be bounded by means of computer calculations. For this one can use the fact (true under wide conditions) that the sequence , , converges in to the corresponding partial derivative of the limit normal density with covariance matrix (and zero mean since is invariant under translations).
For example, let and , where and are independent random variables; has the gamma density with and arbitrary , while has the gamma density with and arbitrary . Simple computer calculations show that in (21) , and since we can take , we obtain in (20) that For instance, for . (For these values of , we can take in (20) and obtain .)
(ii) Since under the above assumption entails (18), inequality (19) ensures -robustness of the sequence of sample means , .
(iii) For , in  an example is given showing that in general the sequence of sample means , is not -robust (even if (18) holds and ). It is also almost evident that the sample means , , are not -robust, for example, if is the total variation metric (or, if ). The appearance of Zolotarev’s metric on the right-hand side of (19) is related to closeness of corresponding limit normal distributions.
Corollary 7. Suppose for a moment that and that one evaluates the quality of estimators by mean absolute errors: , . Then from (19) it follows that
(The simple proof is similar to the one given in ).
3.2. About -Robustness of Sample Medians
Let again be the metric defined in (16). We show that the sequence of sample medians , , in general, is not -robust even when and have strictly positive, bounded, smooth densities symmetric with respect to the origin, and the sequences of sample medians , , , , are √n-asymptotically normal. We consider a modified version of the corresponding example from .
Example 8. Let , , and for let be a random variable with the density: where is a normalizing constant. By symmetry of the density, we get and also it is clear that , and
First of all let us show that for this example the left-hand side of (12) is infinite for any . It is well known (see, e.g., ) that , as with probability 1. Also, from the results of  we can obtain that . Using this inequality it is easy to show that the sequence , , is uniformly integrable. Therefore, (and also ) as , and for this .
Remark 9. The densities as in (24) represent the following somewhat strange type of “contamination.” Since sample points from tend to concentrate around the origin. But as , and therefore sample points from are frequently, to some extent, separated from 0.
A natural question is “how to choose the metric to ensure -robustness of the sequence of sample medians , ?” Our conjecture is (for , e.g.) to try (supposing the existence of densities).
If then under certain conditions the closeness in guarantees the closeness of normal densities which are limiting for and for , respectively. To attempt proving -robustness of , (as in (12)) one can show Hampel’s qualitative robustness of , with respect to the metric and then use the property . An unclear point of this plan is finding conditions under which as .
Example 10. Let us give another (very simple) example of the sequence of estimators which is -robust with (on the class of distributions described below). For we consider the class of all random variables with bounded supports , having density such that , . We suppose that , , and is the same for all . Assume that parameter is unknown, and the sequence of estimators is used to estimate it.
Denoting , and choosing in (12) , we get because the metric is minimal for the compound metric (see, e.g., ).
By elementary calculations we bound the right-hand side of (31) by
Now for each fixed, by induction we obtain (see (10))
On the other hand, let, for example, . Then But for , From this relationship it follows that Finally, applying (32), we can select in such a way that For we can use the inequality and (33), (36). Thus, choosing small enough we can ensure inequality (12) for all with . In this way we proved -robustness of on the set .
A. The Proofs
The proof is based on the two following lemmas.
Let be a differentiable function. We will write , where , .
Lemma A.1. Let , and be random vectors in such that (a) is independent of and ; (b), ; ; (c) has a bounded differentiable density (with respect to the Lebesgue measure) such that , .
Proof. From the definition (8), (9) of the metric (with the norm instead of !) it follows (the proof is simple) that in (8) the class of functions Lip (given in (9)) can be replaced by the class of all bounded differentiable functions such that
Let us fix any such function and arbitrary . Then (by the Fubini theorem), For each fixed let Because of boundedness of and integrability of we can differentiate in (A.7) under the integral sign (see [23, Appendix ]). Thus, In view of (A.5) we obtain
For every we have Comparing (A.6)–(A.11) and taking into account the definition of in (13), (14), we obtain inequality (A.4).
Lemma A.2. Under the assumption of the previous section the constant in (21) is finite.
Remark A.3. A similar assertion was proved in  for of second derivatives of the densities (under slightly different conditions). For this reason we give only a sketch of the proof of Lemma A.2 indicating only differences in comparison to the proof of Lemma 4.1 in  (where the omitted details can be seen).
Proof. As before, let () be the density of , and its characteristic function. Let also denote the density of . There is such that and therefore
(fixing other variables), we can integrate by parts:
and (A.12) follows for large enough .
In view of (A.12) we can write down the inverse Fourier transform for : and differentiate under the integral sign in (A.15).
The condition () and Assumption (i) in Section 2 ensure the hypotheses of [25, Theorem 19.2, Ch. 4]. By this theorem, (as ), where , , are certain polynomials and is the normal density with zero mean and the covariance matrix (see Assumption (i)).
Let denote the Fourier transform of , where Using (A.15), (A.16), and [25, Lemma 7.2, Ch. 2], we can obtain that for each fixed with certain polynomials .
By arguments similar to those given in , it follows from (A.18) that there exist constants , such that On the other hand, to prove that it is sufficient to show that But the last equality follows from (A.12) and the fact that Expressing in terms of (as an inverse Fourier transform), and using (A.16)–(A.20) we can establish that there exist a constant and polynomials , such that for each ,
The next step is to find for each an upper bound for which does not depend on .
For , and . Thus, using Assumptions (ii), (iii), and the corresponding theorems in [23, Appendix ], we get We have where is the constant from Assumption (iv). The first summands on the right-hand side of (A.25) are uniformly bounded in due to (A.23). To bound the second terms in (A.25) we write (see (A.24)) Now From and it follows that . Thus, the first terms on the right-hand side of (A.27) are less than due to Assumption (iv). Applying the Fubini theorem and using the fact of integrability of we see that the second summand in (A.27) is bounded by const , which is by the Chebyshev and Rosenthal inequalities.
Exploiting Lemmas A.1 and A.2, the rest of the proof of Theorem 5 in Section 3 is carried out exactly as the proof in the one-dimensional case given in . The “ideality properties” of the metric , used there hold true for random vectors (see, e.g., ). The general version of inequality (3.15) in  which relates the Kantorovich and the total variation metrics is proved in [13, page 89]. Note that the proof presented in  uses the so-called convolution approach (see, e.g., ) and induction arguments. This method has been widely used to estimate rates of convergence in multidimensional Central Limit Theorems (see, e.g., [13, 14, 21, 26]).
The authors thank the National System of Investigators (SNI) of CONACYT, Mexico for the partial support of this work. The authors are grateful to the anonymous referee for his valuable suggestions on the improvement of an earlier version of this paper.
F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel, Robust Statistics: The Approach Based on Influence Functions, Wiley, New York, NY, USA, 1986.
P. J. Huber, Robust Statistics, Wiley, New York, NY, USA, 1981.
J. Jurečková and P. K. Sen, Robust Statistical Procedures. Asymptotics and Interrelations, Wiley, New York, NY, USA, 1996.
S. T. Rachev and L. Rüschendorf, Mass Transportation Problems, vol. II: Applications, Springer, New York, NY, USA, 1998.
I. Mizera, “Qualitative robustness and weak continuity: the extreme function?” in Nonparametrics and Robustness in Modern Statistical Inference and Time Series Analysis: A Festschrift in Honor of Professor Jana Jurečková, J. Antoch, M. Hušková, and P. K. Sen, Eds., vol. 7 of Institute of Mathematical Statistics Collections, pp. 169–181, Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2010.
E. Gordienko and A. Novikov, “Probability metrics and robustness: is the sample median more robust than the sample mean?” in Proceedings of the Joint Session of 7th Prague Symposium on Asymptotic Statistics and 15th Prague Conference on Information Theory, Statistics Decision Functions, pp. 374–386, Charles University in Prague, Prague, Czech Republic, 2006.
S. T. Rachev, Probability Metrics and the Stability of Stochastic Models, Wiley, Chichester, UK, 2009.
A. W. Van der Vaart, Asymptotic Statistics, Cambridge University Press, Cambridge, UK, 1998.