Abstract

ANOVA is one of the most important tools for comparing treatment means among different groups in repeated measurements. The classical F test is routinely used to test whether the treatment means are the same across groups. However, it loses power when the number of groups, or the dimension, gets large. We propose a smoothing truncation test to deal with this problem. It is shown theoretically and empirically that the proposed test works well regardless of the dimension. The limiting null and alternative distributions of our test statistic are established for both a fixed and a diverging number of treatments. Simulations demonstrate superior performance of the proposed test over the F test in different settings.

1. Introduction

In bioscience, given $k$ treatments, a central problem of interest is to compare the treatment mean differences. To deal with this problem, one usually employs the traditional univariate ANOVA to analyse independent random samples $Y_{i1}, \dots, Y_{in}$ from the $i$th treatment, $i = 1, \dots, k$. A critical assumption is that the $i$th sample is from an $N(\mu_i, \sigma^2)$ population. Then, $\{Y_{ij}\}$ is a sequence of independent random variables satisfying
$$Y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, \dots, k, \; j = 1, \dots, n, \qquad (1)$$
where $\varepsilon_{ij}$ follows the $N(0, \sigma^2)$ distribution. Let $\alpha_i = \mu_i - \mu$, where $\mu = k^{-1}\sum_{i=1}^{k} \mu_i$. Then, $\sum_{i=1}^{k} \alpha_i = 0$, and $\alpha_i$ is referred to as the effect of the $i$th treatment. Furthermore, model (1) can be rewritten as
$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}. \qquad (2)$$

Let $\alpha = (\alpha_1, \dots, \alpha_k)'$ denote the vector of treatment effects. Therefore, for model (2), one of the important problems is to test whether the treatment means differ, which amounts to testing
$$H_0: \alpha_1 = \cdots = \alpha_k = 0 \quad \text{versus} \quad H_1: \alpha_i \neq 0 \text{ for some } i. \qquad (3)$$

The classical F test is routinely employed in practice and takes the form
$$F = \frac{\mathrm{MST}}{\mathrm{MSE}},$$
where $\mathrm{MST} = \frac{n}{k-1}\sum_{i=1}^{k}(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$ is the mean sum of squares due to treatments and $\mathrm{MSE} = \frac{1}{k(n-1)}\sum_{i=1}^{k}\sum_{j=1}^{n}(Y_{ij} - \bar{Y}_{i\cdot})^2$ is the mean sum of squares due to errors, with $\bar{Y}_{i\cdot} = n^{-1}\sum_{j=1}^{n} Y_{ij}$ and $\bar{Y}_{\cdot\cdot} = (kn)^{-1}\sum_{i=1}^{k}\sum_{j=1}^{n} Y_{ij}$. In the current research, we relax the normality assumption by assuming only that the $\varepsilon_{ij}$'s are independent and identically distributed noises with mean zero and variance $\sigma^2$.
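For concreteness, a minimal sketch of this statistic in Python, assuming balanced data stored as a $k \times n$ array (the function name and data layout are our choices, not the paper's):

```python
import numpy as np
from scipy import stats

def f_statistic(Y):
    """Classical one-way ANOVA F statistic, F = MST / MSE, for balanced
    data: row i of Y holds the n repeated measurements of treatment i."""
    k, n = Y.shape
    group_means = Y.mean(axis=1)                    # Ybar_{i.}
    grand_mean = Y.mean()                           # Ybar_{..}
    mst = n * np.sum((group_means - grand_mean) ** 2) / (k - 1)
    mse = np.sum((Y - group_means[:, None]) ** 2) / (k * (n - 1))
    return mst / mse

rng = np.random.default_rng(0)
Y = rng.normal(size=(10, 6))                        # k = 10, n = 6, under H0
F = f_statistic(Y)
print(F, stats.f.sf(F, 9, 10 * 5))                  # F and its p-value
```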

The properties of the F test have been well studied in the conventional low-dimensional setting. The test enjoys desirable properties when the number of treatments $k$ is fixed; see, for example, Casella and Berger [1]. The F test is also robust to the normality assumption if $k$ is fixed and $n \to \infty$ [2]. Akritas and Papadatos [3] (Theorem 2.1) proved that if $k \to \infty$, then under $H_0$, $\sqrt{k}(F - 1) \xrightarrow{D} N(0, 2n/(n-1))$. This shows that the F test is asymptotically accurate as $k \to \infty$ even when the normality assumption does not hold.
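A quick Monte Carlo check of this large-$k$ normal approximation, reusing f_statistic from the sketch above (the non-normal error distribution and the constants are our choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n, reps = 400, 5, 2000
vals = np.empty(reps)
for r in range(reps):
    # centred, unit-variance, non-normal errors: shifted exponential
    Y = rng.exponential(size=(k, n)) - 1.0
    vals[r] = np.sqrt(k) * (f_statistic(Y) - 1.0)
print(vals.var(), 2 * n / (n - 1))   # sample variance vs. 2n/(n-1) = 2.5
```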

For a range of applications, including anomaly detection, medical imaging, and genomics, the means of the levels are typically identical or quite similar, in the sense that they possibly differ in only a small number of levels or groups. In other words, under the alternative $H_1$, the treatment effects are sparse; see, e.g., Cao and Worsley [4] and Taylor and Worsley [5]. In these sparse settings, the F test is not powerful, and its power in general decreases quickly as the number of levels increases. This motivates us to propose a smoothing truncation test which smoothly downweights the contributions of those data with treatment means close to zero. This is desirable, since data with small treatment means are essentially noise. Our test is different from the adaptive Neyman test for goodness-of-fit in Fan [6], which works for two-sample means without repeated measurements. Our test is also different from the multisample ANOVA tests for high-dimensional means of Chen et al. [7], where the number of samples is fixed, since testing problem (3) for model (1) can be regarded as a test of $k$-sample means with $k$ diverging.

We establish asymptotic distributions of the proposed test under the null and the alternatives. Simulations demonstrate superior performance of the proposed test over the classical F test. Our test performs well in general and is, in particular, much more powerful than the F test against sparse alternatives in high-dimensional settings.

Our approach can be extended to the heteroscedastic and unbalanced cases considered in Akritas and Papadatos [3]. It also applies to the general multifactor models in Wang [8]. Since the one-way F statistic coincides with the lack-of-fit statistic for testing whether a regression function is constant against a general alternative in the repeated-measurement setting, our methodology can be applied to that problem as well, with the current repeated measurements replaced by the residuals under the general alternative. Interested readers may refer to Härdle and Mammen [9] and Hart [10], among others.

The remainder of the paper is organized as follows. In Section 2, we introduce the smoothing truncation test. In Section 3, we establish the asymptotic distributions of the test under the null and the alternatives. In Section 4, we conduct simulations and analyse a real dataset to compare the finite-sample performance of different tests.

2. Smoothing Truncation Test

Let $T_i = \sqrt{n}(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})/\sqrt{\mathrm{MSE}}$, which normalizes the estimator $\hat{\alpha}_i = \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}$ of the $i$th treatment effect. Then, the F statistic for testing (3) can be rewritten as
$$F = \frac{1}{k-1}\sum_{i=1}^{k} T_i^2,$$
where each treatment receives the same weight in the average. For fixed $k$, $(k-1)F$ is asymptotically $\chi^2$-distributed with $k-1$ degrees of freedom, and only those treatments with nonzero means contribute to the power of the test. Hence, the F test can be improved if different weights are used in its definition. To this end, we downweight the contributions of those data with small treatment effects by smoothly truncating the contribution of each $T_i$:
$$T_{n,k} = \sum_{i=1}^{k} w_i T_i^2,$$
where $w_i = 1 - K(T_i/h)/K(0)$, with $K$ being a kernel function which can be taken as the standard normal density function. The smoothing parameter $h$ controls the size of the weights. Intuitively, $|T_i|$ is large if the $i$th treatment effect is nonzero; otherwise, it is small noise. Therefore, the weight $w_i$ gets smaller as the treatment effect gets closer to zero, and the smoothing truncation test should be more powerful than the F test. Like the F test, large values of $T_{n,k}$ suggest rejection of the null, so it is a right-tailed test. Other ways to downweight the contributions of small $|T_i|$ can also be developed and will be explored in the future.
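A minimal sketch of $T_{n,k}$ in Python under the weight above (the $K(0)$ normalization is one concrete reading of the definition, an assumption on our part; it makes $w_i \in [0, 1)$ with $w_i \to 0$ as $T_i \to 0$):

```python
import numpy as np
from scipy.stats import norm

def smoothing_truncation_stat(Y, h):
    """Smoothing truncation statistic T_{n,k} = sum_i w_i * T_i^2 with
    w_i = 1 - K(T_i / h) / K(0), K the standard normal density.
    The K(0) normalization is an assumption (see lead-in note)."""
    k, n = Y.shape
    group_means = Y.mean(axis=1)
    grand_mean = Y.mean()
    mse = np.sum((Y - group_means[:, None]) ** 2) / (k * (n - 1))
    T = np.sqrt(n) * (group_means - grand_mean) / np.sqrt(mse)
    w = 1.0 - norm.pdf(T / h) / norm.pdf(0.0)
    return float(np.sum(w * T ** 2))
```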

3. Asymptotic Distributions

To study the distributions of $T_{n,k}$ under the null and alternative hypotheses, we first introduce some notation. For a vector $a = (a_1, \dots, a_k)'$, define the $\ell_2$-norm $\|a\| = (\sum_{i=1}^{k} a_i^2)^{1/2}$. Let $I_k$ be the $k \times k$ identity matrix, let $\mathbf{1}_k$ be the $k$-dimensional column vector with all components equal to one, and let $A = I_k - k^{-1}\mathbf{1}_k\mathbf{1}_k'$.

3.1. Smoothing Truncation Test with Fixed Number of Treatments

The following condition on the kernel function $K$ is needed for establishing the limiting distributions of $T_{n,k}$.

Condition 1. Assume that $K$ is uniformly continuous and satisfies $\int |u|^{\delta} K(u)\,du < \infty$ for some $\delta \geq 1$.
Condition 1 is satisfied by a wide range of kernel functions, for example, the standard normal density function. The boundedness of the first moment of the kernel was used in Jiang [11], and the uniform continuity assumption is satisfied by common choices of kernel functions, such as the standard normal density kernel and the Epanechnikov kernel.
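As a quick numerical sanity check (for the $\delta = 1$ case; the exact exponent in Condition 1 is stated only implicitly above), both kernels mentioned have a finite first absolute moment:

```python
import numpy as np
from scipy import integrate

# Gaussian and Epanechnikov kernels
gauss = lambda u: np.exp(-u ** 2 / 2) / np.sqrt(2 * np.pi)
epan = lambda u: 0.75 * (1 - u ** 2)          # support [-1, 1]

# first absolute moments, computed by symmetry; both are finite
m_gauss = 2 * integrate.quad(lambda u: u * gauss(u), 0, np.inf)[0]
m_epan = 2 * integrate.quad(lambda u: u * epan(u), 0, 1)[0]
print(m_gauss, m_epan)                         # ~0.7979 and 0.375
```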

Theorem 1. Assume that Condition 1 holds and $h \to 0$ as $n \to \infty$. Then, for fixed $k$, $T_{n,k} - \sum_{i=1}^{k} T_i^2 \xrightarrow{P} 0$.

Proof. Write $T_{n,k} - \sum_{i=1}^{k} T_i^2 = -\sum_{i=1}^{k} T_i^2 K(T_i/h)/K(0)$. Since $K$ is uniformly continuous and integrable, $K(x/h) \to 0$ as $h \to 0$ for each fixed $x \neq 0$. When $k$ is finite, each $T_i = O_P(1)$, so that $K(T_i/h)/K(0) = o_P(1)$ for each given $i$. It follows from the continuous mapping theorem that $\sum_{i=1}^{k} T_i^2 K(T_i/h)/K(0) = o_P(1)$, which proves the theorem.
For studying the power, we consider a sequence of local Pitman alternatives $H_{1n}: \alpha = n^{-1/2}\nu_n$, where $\{\nu_n\}$ is a sequence of vectors in $\mathbb{R}^k$ such that $\nu_n \to \nu$ with $\|\nu\| > 0$.

Theorem 2. Assume that Condition 1 holds and $h \to 0$ as $n \to \infty$. Then, as $n \to \infty$, (i) under $H_0$, $\sum_{i=1}^{k} T_i^2 \xrightarrow{D} \chi^2_{k-1}$, where $\chi^2_{k-1}$ denotes the chi-square distribution with $k-1$ degrees of freedom and hereafter "$\xrightarrow{D}$" represents convergence in distribution; (ii) under $H_{1n}$, $\sum_{i=1}^{k} T_i^2 \xrightarrow{D} \chi^2_{k-1}(\|\nu\|^2/\sigma^2)$, the noncentral chi-square distribution with $k-1$ degrees of freedom and noncentrality parameter $\|\nu\|^2/\sigma^2$.

Proof. We observe that
$$\sum_{i=1}^{k} T_i^2 = \frac{n}{\mathrm{MSE}} \sum_{i=1}^{k} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2.$$
Note that $\mathrm{MSE} \xrightarrow{P} \sigma^2$. Let $Z_i = \sqrt{n}(\bar{Y}_{i\cdot} - \mu_i)/\sigma$ and $Z = (Z_1, \dots, Z_k)'$. Then, the $Z_i$'s are independent random variables with mean zero and variance one, and
$$\frac{\sqrt{n}}{\sigma}\left(\bar{Y}_{1\cdot} - \bar{Y}_{\cdot\cdot}, \dots, \bar{Y}_{k\cdot} - \bar{Y}_{\cdot\cdot}\right)' = AZ + \frac{\sqrt{n}}{\sigma} A\mu,$$
where $A = I_k - k^{-1}\mathbf{1}_k\mathbf{1}_k'$, with $\mu = (\mu_1, \dots, \mu_k)'$ and $\mathbf{1}_k$ being a $k$-dimensional column vector with all components equal to one. For fixed $k$, by the Cramér–Wold device, $Z$ is asymptotically normal with mean zero and variance-covariance matrix $I_k$. It is straightforward to verify that $A$ is symmetric and idempotent with rank $k-1$. Then, there exists an orthogonal matrix $Q$ and the diagonal matrix $\Lambda = \mathrm{diag}(1, \dots, 1, 0)$ such that $A = Q'\Lambda Q$. Recall from (2) that $\mu_i = \mu + \alpha_i$, so that $A\mu = \alpha$. It follows that $\sum_{i=1}^{k} T_i^2 = (\sigma^2/\mathrm{MSE})\,\|AZ + \sqrt{n}\,\alpha/\sigma\|^2$. (i) Under $H_0$, $\alpha = 0$, and $AZ$ is asymptotically normal with mean zero and variance-covariance matrix $A$, so that $\|AZ\|^2 = \|\Lambda Q Z\|^2 \xrightarrow{D} \chi^2_{k-1}$; hence, $\sum_{i=1}^{k} T_i^2 \xrightarrow{D} \chi^2_{k-1}$. (ii) Under $H_{1n}$, $\sqrt{n}\,\alpha/\sigma = \nu_n/\sigma \to \nu/\sigma$, and $AZ + \nu_n/\sigma$ is asymptotically normal with mean $\nu/\sigma$ and variance-covariance matrix $A$ (note that $A\nu = \nu$, since $\mathbf{1}_k'\nu = 0$), so that $\|AZ + \nu_n/\sigma\|^2 \xrightarrow{D} \chi^2_{k-1}(\|\nu\|^2/\sigma^2)$; hence, $\sum_{i=1}^{k} T_i^2 \xrightarrow{D} \chi^2_{k-1}(\|\nu\|^2/\sigma^2)$. Combining Theorems 1 and 2 gives us the following asymptotic distribution of $T_{n,k}$.

Theorem 3. Assume that the conditions of Theorem 2 hold. Then, as $n \to \infty$, (i) under $H_0$, $T_{n,k} \xrightarrow{D} \chi^2_{k-1}$; (ii) under $H_{1n}$, $T_{n,k} \xrightarrow{D} \chi^2_{k-1}(\|\nu\|^2/\sigma^2)$. The above theorem demonstrates that the smoothing truncation test can detect local alternatives approaching the null at the rate $n^{-1/2}$, which is the optimal rate that all regular parametric tests can achieve.

3.2. Smoothing Truncation Test with Diverging Number of Treatments

In this subsection, we allow the number of treatments $k = k_n$ to diverge with $n$. To obtain the limiting null and alternative distributions, we need additional conditions.

Condition 2. (i) $k \to \infty$ as $n \to \infty$; (ii) $h \to 0$ as $n \to \infty$.

Condition 3. Suppose that the Cramér condition holds for $\varepsilon_{11}$, i.e., $E|\varepsilon_{11}|^m \leq \frac{m!}{2}\, C^{m-2} \sigma^2$ for all integers $m \geq 2$, where $C$ is a positive constant.
The first part of Condition 2 means that we consider high-dimensional settings with a diverging number of populations; this is the setting considered in Akritas and Papadatos [3]. The second part of Condition 2 restricts the smoothing parameter $h$. This is a mild condition: it only requires that $h$ tend to zero. Condition 3 is trivially fulfilled if the $\varepsilon_{ij}$'s are bounded; for Gaussian variables, it obviously holds.
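For reference, the classical Bernstein inequality that Condition 3 supports, in the standard form we believe is invoked in the proofs below (constants vary slightly across references):

```latex
% Bernstein's inequality under the Cramer condition: if X_1, ..., X_N are
% independent with E[X_i] = 0 and E|X_i|^m <= (m!/2) C^{m-2} v_i for all
% integers m >= 2, then, with V = v_1 + ... + v_N, for every t > 0,
\[
  P\left( \left| \sum_{i=1}^{N} X_i \right| \ge t \right)
  \le 2 \exp\left( - \frac{t^2}{2 (V + C t)} \right).
\]
```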
By the definition of $T_{n,k}$, we have
$$T_{n,k} = \sum_{i=1}^{k} T_i^2 - \sum_{i=1}^{k} T_i^2\, K(T_i/h)/K(0).$$
The following result shows that the difference between $T_{n,k}$ and $\sum_{i=1}^{k} T_i^2$ is uniformly negligible, even for extremely large $k$.

Lemma 1. Assume that Conditions 1–3 hold. Then, $T_{n,k} - \sum_{i=1}^{k} T_i^2 = o_P(\sqrt{k})$ as $n \to \infty$.

Proof. (i) We first show that $\max_{1 \leq i \leq k} |T_i| = O_P(\sqrt{\log k})$. Let $Z_i = \sqrt{n}(\bar{Y}_{i\cdot} - \mu_i)/\sigma$. Then, $\{Z_i\}$ is a sequence of iid random variables with mean zero and variance one satisfying Condition 3. By the definition of $T_i$ and the triangle inequality, it suffices to bound $\max_{1 \leq i \leq k} |Z_i|$ and $|\mathrm{MSE} - \sigma^2|$. Using the Bernstein inequality together with the union bound, we obtain that $P(\max_{1 \leq i \leq k} |Z_i| \geq c_0\sqrt{\log k}) \to 0$ for a sufficiently large constant $c_0$. Similarly, writing $\mathrm{MSE} - \sigma^2$ as an average of $k(n-1)$ iid mean-zero random variables satisfying Condition 3 and applying the Bernstein inequality again, we get $\mathrm{MSE} - \sigma^2 = o_P(1)$. Hence, $\max_{1 \leq i \leq k} |T_i| = O_P(\sqrt{\log k})$. (ii) Since $T_{n,k} - \sum_{i=1}^{k} T_i^2 = -\sum_{i=1}^{k} T_i^2\, K(T_i/h)/K(0)$, we split the sum according to whether $|T_i| \leq a$ or $|T_i| > a$, for a threshold $a = a_{n,k} \to 0$. The terms with $|T_i| \leq a$ contribute at most $k a^2$, which is $o(\sqrt{k})$ for a suitable choice of $a$. By part (i) and the decay of $K$ in its tails, the terms with $|T_i| > a$ contribute at most $\max_{1 \leq i \leq k} T_i^2 \cdot k \sup_{|u| > a/h} K(u)/K(0) = o_P(\sqrt{k})$ under Conditions 1 and 2. Combining the two bounds yields $T_{n,k} - \sum_{i=1}^{k} T_i^2 = o_P(\sqrt{k})$.

Lemma 2. Assume that Conditions 1–3 hold. Under $H_0$, we have
$$\frac{T_{n,k} - (k-1)}{\sqrt{2kn/(n-1)}} \xrightarrow{D} Z,$$
where $Z$ is a standard normal random variable.

Proof. Under $H_0$, we have $\sum_{i=1}^{k} T_i^2 = (k-1)F$. By Theorem 2.1(b) of Akritas and Papadatos [3], we have $\sqrt{k}(F - 1) \xrightarrow{D} N(0, 2n/(n-1))$ as $k \to \infty$. That is, $\big(\sum_{i=1}^{k} T_i^2 - (k-1)\big)\big/\sqrt{2kn/(n-1)} \xrightarrow{D} Z$, where $Z$ is a standard normal random variable. Then, by Lemma 1, $T_{n,k} - \sum_{i=1}^{k} T_i^2 = o_P(\sqrt{k})$. Hence,
$$\frac{T_{n,k} - (k-1)}{\sqrt{2kn/(n-1)}} = \frac{\sum_{i=1}^{k} T_i^2 - (k-1)}{\sqrt{2kn/(n-1)}} + o_P(1) \xrightarrow{D} Z.$$

Theorem 4. Assume that Conditions 1–3 are satisfied. Then, under the null hypothesis $H_0$,
$$\frac{T_{n,k} - (k-1)}{\sqrt{2kn/(n-1)}} \xrightarrow{D} N(0, 1)$$
as $n \to \infty$.

Proof. Observe that the standardized statistic can be decomposed into the quantity treated in Lemma 2 plus a remainder term. Since the expectation of the absolute value of the remainder vanishes asymptotically, the Markov inequality implies that the remainder is $o_P(1)$. By Lemma 2, $(T_{n,k} - (k-1))/\sqrt{2kn/(n-1)} \xrightarrow{D} N(0,1)$ as $n \to \infty$. These facts, combined with Slutsky's theorem, yield the result of the theorem.
To study the power of the test, we consider the same local alternatives as Akritas and Papadatos [3], which specify
$$H_{1n}: \alpha_i = \frac{c}{\sqrt{n}\,k^{1/4}}\, g(i/k), \quad i = 1, \dots, k,$$
where $c > 0$ and $g$ is a continuous function on $[0, 1]$ such that $\int_0^1 g(t)\,dt = 0$ and $\int_0^1 g^2(t)\,dt > 0$. With such local alternatives, the standardized statistic acquires the drift
$$\theta = \frac{c^2 \int_0^1 g^2(t)\,dt}{\sigma^2 \sqrt{2n/(n-1)}},$$
where $\sigma^2$ is the error variance. Obviously, the alternative converges to the null at the rate $n^{-1/2} k^{-1/4}$.

Lemma 3. Assume that Conditions 1–3 hold. Under $H_{1n}$, we have
$$\frac{T_{n,k} - (k-1)}{\sqrt{2kn/(n-1)}} \xrightarrow{D} N(\theta, 1).$$

Proof. Under $H_{1n}$, we have $\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot} = (\bar{\varepsilon}_{i\cdot} - \bar{\varepsilon}_{\cdot\cdot}) + (\alpha_i - \bar{\alpha})$, where $\bar{\varepsilon}_{i\cdot} = n^{-1}\sum_{j=1}^{n}\varepsilon_{ij}$, $\bar{\varepsilon}_{\cdot\cdot} = (kn)^{-1}\sum_{i,j}\varepsilon_{ij}$, and $\bar{\alpha} = k^{-1}\sum_{i=1}^{k}\alpha_i$. Therefore,
$$\sum_{i=1}^{k} T_i^2 = \frac{n}{\mathrm{MSE}} \sum_{i=1}^{k} (\bar{\varepsilon}_{i\cdot} - \bar{\varepsilon}_{\cdot\cdot})^2 + \frac{2n}{\mathrm{MSE}} \sum_{i=1}^{k} (\bar{\varepsilon}_{i\cdot} - \bar{\varepsilon}_{\cdot\cdot})(\alpha_i - \bar{\alpha}) + \frac{n}{\mathrm{MSE}} \sum_{i=1}^{k} (\alpha_i - \bar{\alpha})^2.$$
The first term, denoted by $S_1$ with a slight abuse of notation, is the statistic $\sum_{i=1}^{k} T_i^2$ for model (1) with all $\alpha_i = 0$; using Theorem 2.1 of Akritas and Papadatos [3], we have $(S_1 - (k-1))/\sqrt{2kn/(n-1)} \xrightarrow{D} N(0, 1)$. Direct moment calculations show that the second (cross) term is $o_P(\sqrt{k})$. For the third term, since $\mathrm{MSE} \xrightarrow{P} \sigma^2$, $\bar{\alpha} = o(n^{-1/2}k^{-1/4})$, and $k^{-1}\sum_{i=1}^{k} g^2(i/k) \to \int_0^1 g^2(t)\,dt$, we have
$$\frac{n}{\mathrm{MSE}} \sum_{i=1}^{k} (\alpha_i - \bar{\alpha})^2 = \frac{c^2 \sqrt{k}}{\sigma^2} \int_0^1 g^2(t)\,dt\, (1 + o_P(1)).$$
It follows that
$$\frac{\sum_{i=1}^{k} T_i^2 - (k-1)}{\sqrt{2kn/(n-1)}} \xrightarrow{D} N(\theta, 1).$$
Then, by Lemma 1, the same limit holds for $T_{n,k}$, which proves the lemma.

Theorem 5. Under the alternative hypothesis $H_{1n}$,
$$\frac{T_{n,k} - (k-1)}{\sqrt{2kn/(n-1)}} \xrightarrow{D} N(\theta, 1)$$
as $n \to \infty$, provided that Conditions 1–3 are satisfied.

Proof. As in the proof of Theorem 4, the standardized statistic can be rewritten as the quantity treated in Lemma 3 plus a remainder term, which, by Markov's inequality, is $o_P(1)$. By Lemma 3, $(T_{n,k} - (k-1))/\sqrt{2kn/(n-1)} \xrightarrow{D} N(\theta, 1)$. Then, by Slutsky's theorem, the result of the theorem holds.

Corollary 1. Under the null hypothesis $H_0$,
$$P\left( \frac{T_{n,k} - (k-1)}{\sqrt{2kn/(n-1)}} > z_\alpha \right) \to \alpha$$
as $n \to \infty$, provided that Conditions 1–3 are satisfied.

From Corollary 1, one obtains the rejection region of the test:
$$\frac{T_{n,k} - (k-1)}{\sqrt{2kn/(n-1)}} > z_\alpha,$$
where $z_\alpha$ is the upper $\alpha$-percentile of the standard normal distribution.
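A sketch of the resulting decision rule, reusing smoothing_truncation_stat from Section 2's sketch; the centering and scaling constants follow our reading of Corollary 1 above, not a verbatim transcription:

```python
import numpy as np
from scipy.stats import norm

def smoothing_truncation_test(Y, h, alpha=0.05):
    """Right-tailed test based on T_{n,k}, standardized as in Corollary 1
    (as reconstructed in the text; the constants are our assumption)."""
    k, n = Y.shape
    stat = smoothing_truncation_stat(Y, h)
    z = (stat - (k - 1)) / np.sqrt(2.0 * k * n / (n - 1.0))
    return z, z > norm.ppf(1.0 - alpha)   # reject H0 when z > z_alpha
```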

Corollary 2. Under the alternative hypothesis $H_{1n}$,
$$P\left( \frac{T_{n,k} - (k-1)}{\sqrt{2kn/(n-1)}} > z_\alpha \right) \to 1 - \Phi(z_\alpha - \theta)$$
as $n \to \infty$, provided that Conditions 1–3 are satisfied, where $\Phi$ denotes the standard normal distribution function.

It is obvious that the asymptotic power of the $T_{n,k}$ test for testing problem $H_0$ against $H_{1n}$ is $1 - \Phi(z_\alpha - \theta)$, which exceeds $\alpha$ since $\theta > 0$.

This means that the $T_{n,k}$ test still has power to distinguish $H_{1n}$ from $H_0$.

Finally, it is worth pointing out that our theorems above are established under the equal-sample-size setting, but they can be extended to allow for different sample sizes. Since this involves more delicate proofs, we leave it as an open problem to be explored in the future.

4. Numerical Results

In this section, we consider the numerical performance of the proposed test and compare it with the F test.

4.1. Simulations

Without loss of generality, we take $\mu = 0$ and $\sigma = 1$ in model (1). Our test involves the kernel function $K$ and the bandwidth $h$. We take $K$ as the standard normal density function and choose $h$ so that Condition 2 is satisfied. For the following two examples, we draw samples from the normal distribution for model (1). Specifically, for each level $i$, we draw a sample of size $n$ from $N(\mu_i, 1)$, with the $\mu_i$'s differing between the two examples. For each setting, we conduct 1000 simulations to calculate the critical values of the tests under the null hypothesis. That is, for significance level $\alpha$, we calculate the value of each test statistic in each simulation and then use the $(1-\alpha)$th percentile of the realized values of the test statistic over the 1000 simulations. To evaluate the power of a test, we generate 1000 normally distributed samples from the alternative and evaluate the test statistic for each sample. The power is calculated as the proportion of realized values of the test statistic larger than the critical value.
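A sketch of this Monte Carlo procedure, reusing smoothing_truncation_stat from Section 2's sketch; the bandwidth choice $h = 1/\log k$ is our own illustration of a rate compatible with Condition 2, since the paper's exact choice is not reproduced here:

```python
import numpy as np

def empirical_power(k, n, mu_alt, n_sims=1000, alpha=0.05, seed=0):
    """Critical value from n_sims null data sets, then power under the
    alternative mean vector mu_alt (a length-k array), as in the text."""
    rng = np.random.default_rng(seed)
    h = 1.0 / np.log(k)   # assumed bandwidth; see the lead-in note
    null_stats = np.array([
        smoothing_truncation_stat(rng.normal(size=(k, n)), h)
        for _ in range(n_sims)
    ])
    crit = np.percentile(null_stats, 100 * (1 - alpha))
    alt_stats = np.array([
        smoothing_truncation_stat(mu_alt[:, None] + rng.normal(size=(k, n)), h)
        for _ in range(n_sims)
    ])
    return float(np.mean(alt_stats > crit))
```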

Example 1 (global power). We consider different model sizes $k$ and $n$ and test hypotheses (3) under a fixed alternative. The simulation results are summarized in Table 1. They show that the power of the F test drops significantly as the number of levels increases, while the power of our proposed test drops only slightly. It is clear that our proposed test significantly outperforms the F test.

Example 2 (local power). With different model sizes $k$ and $n$, we test hypotheses (3) under the local alternatives $\alpha = c\theta_0$ for a grid of constants $c \geq 0$, where the vector $\theta_0$ has its first half of components equal to $-0.5$ and its second half equal to 0.5; a code sketch follows below. Figure 1 displays the powers of the tests, which verifies the desired behaviour: when $c = 0$, the null and the alternative coincide, so the power of a test should equal the significance level $\alpha$; as $c$ increases, the alternative moves further away from the null, and the power should increase. It is seen that the proposed smoothing truncation test performs the same as the F test in low-dimensional settings and much better than the F test in high-dimensional settings. In particular, our test exhibits robust performance as the dimension changes, whereas the F test has difficulty distinguishing the alternatives from the null.
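A sketch of the Example 2 setup, reusing empirical_power from the sketch above; the first-half value of $\theta_0$ is elided in the source, and $-0.5$ is our assumption (it makes the treatment effects sum to zero, as in model (2)), as are the grid of $c$ values and the model sizes:

```python
import numpy as np

k, n = 50, 10
# theta0: first half -0.5, second half 0.5 (first-half value assumed)
theta0 = np.concatenate([-0.5 * np.ones(k // 2), 0.5 * np.ones(k - k // 2)])
for c in np.linspace(0.0, 1.0, 6):      # c = 0 reproduces the null
    power = empirical_power(k, n, c * theta0)
    print(f"c = {c:.1f}: power = {power:.3f}")
```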

4.2. A Real Example

In this section, we apply the proposed test and the traditional F test to analyse a breast tumor dataset. This dataset contains 107 cDNA microarray experiments [12]. As indicated in Benito et al. [13], there were two distinct experimental biases in the data, which might come from different handling procedures. Jiang et al. [14] corrected the systematic batch biases in the cDNA microarray data and published the batch-adjusted dataset on the website https://www.stat.unc.edu/postscript/papers/marron/GeneArray/. The data consist of vectors of relative expression levels of genes, with a common number of cases for each gene. To perform high-dimensional tests, we keep the samples unchanged for the first two genes and centre and standardize the sample for each of the remaining genes. Hence, in this transformed dataset, the samples for the first two genes have means different from the others, which results in a high-dimensional sparse setting for hypothesis testing problem (3). We now employ the traditional F test and the proposed test for this problem. The p-values of the F test and the proposed test are calculated as 1 and 0.024, respectively. That is, at the 5% level, the F test fails to detect the population mean differences, but our test succeeds. This is expected, since the F test loses its power from the 5959 noise samples, while ours wins owing to its ability to reduce the contributions of noise.
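A sketch of the preprocessing described above, under the assumption that the expression matrix is arranged with genes as rows and cases as columns (the function and variable names are ours, not the paper's):

```python
import numpy as np

def preprocess(X):
    """Keep the first two genes (rows) unchanged; centre and standardize
    each remaining gene, so only the first two genes retain nonzero
    means -- producing the high-dimensional sparse setting described."""
    Z = np.asarray(X, dtype=float).copy()
    rest = Z[2:]
    Z[2:] = (rest - rest.mean(axis=1, keepdims=True)) \
            / rest.std(axis=1, ddof=1, keepdims=True)
    return Z
```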

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.