#### Abstract

A procedure for testing the significance of the main random effect
is proposed under a model which does not require the traditional assumptions of symmetry, homoscedasticity, and normality for the error term and
random effects. To accommodate this level of model generality, and also unbalanced designs, suitable adjustments to the *F*-test are made. The extensive
simulations performed under the random effects model, and the unrestricted
and restricted versions of the mixed effects model, indicate that the classical
*F* procedure is extremely liberal under heteroscedasticity and unbalancedness.
The proposed test procedure performs well in all settings and is comparable
to the classical *F*-test when the classical assumptions are met. An analysis of
a dataset from the Mussel Watch Project is presented.

#### 1. Introduction

Consider a two-factor mixed or random effects design, and let $X_{ijk}$ denote the $k$th observation at level $i$ of the row factor and level $j$ of the column factor. In the case of a mixed effects design we assume that the row factor is fixed.

The classical two-way model, compare [1–5], uses the decomposition $X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$. For the random effects model, the $\alpha_i$'s, $\beta_j$'s, $\gamma_{ij}$'s, and $\epsilon_{ijk}$'s are mutually independent, the $\alpha_i$'s are iid $N(0, \sigma_\alpha^2)$, the $\beta_j$'s are iid $N(0, \sigma_\beta^2)$, the $\gamma_{ij}$'s are iid $N(0, \sigma_\gamma^2)$, and the $\epsilon_{ijk}$'s are iid $N(0, \sigma_\epsilon^2)$.

For the mixed effects model, there are two common definitions of the effects. The *unrestricted* version of the model assumes that the random main effects $\beta_j$ are iid $N(0, \sigma_\beta^2)$, the interaction effects $\gamma_{ij}$ are iid $N(0, \sigma_\gamma^2)$, and the $\beta_j$'s and $\gamma_{ij}$'s are all independent. The *restricted* version of the model keeps the above assumptions but requires that $\sum_{i=1}^{a} \gamma_{ij} = 0$ for all $j$, implying that the $\gamma_{ij}$'s are correlated. Both versions assume the error terms $\epsilon_{ijk}$ to be iid $N(0, \sigma_\epsilon^2)$ and independent from the other random effects. There is a dichotomy of opinion in the statistical literature as to which model should be used. Cornfield and Tukey [6], Scheffé [1], Winer [7], and Khuri et al. [5], among others, advocate the restricted version. Searle [2], Hocking [8], and Searle et al. [4] advocate the unrestricted version. While the statistic for testing for no main random effect differs in the two versions, both use what Scheffé [1, page 264] calls the *symmetry assumption*. Basically, this is the assumption of independence of the random main and interaction effects. This assumption was criticized in [9, 10] as unrealistic in most practical situations. More importantly, simulations demonstrated that the classical *F*-test for testing the significance of the main fixed effect is very sensitive to departures from the symmetry assumption even in balanced designs with homoscedastic errors. Additional simulations showed the classical *F*-test for testing the significance of the interaction effect to also be very sensitive to unbalancedness and heteroscedasticity.

This paper deals with testing the significance of the main random effect in two-factor mixed and random effects designs. Simulations suggest that the classical *F*-test for this hypothesis is also very sensitive to departures from the symmetry assumption, as well as to unbalancedness and heteroscedasticity. New test procedures are presented under fully nonparametric models for the two-factor mixed and random effects designs. Unbalanced mixed and random effects designs are incorporated in the modeling by considering the sample sizes to be random. The models and test procedures pertain to both continuous and discrete data and allow heteroscedasticity in the error and interaction terms, as well as nonnormality. The interaction term is not assumed independent from the main effect (so the symmetry assumption is not made), though, under the random effects model, the two are shown to be uncorrelated.

The asymptotic theory of the test statistics is derived under the Neyman-Scott framework, in the sense that the number of levels of both factors is large but the group sizes can remain fixed. In such cases, test statistics are constructed by taking the difference (as opposed to the ratio) of the appropriate mean squares; compare [9–12] and references therein. In accordance with this literature, the proposed test statistics for testing for no main random effects are scaled differences of suitably defined mean squares for the mixed and random effects designs, respectively; the precise definitions of these mean squares are given in Section 3.

The novelty of the proposed test statistics is twofold. First, the mean squares are based on unweighted cell averages to accommodate unbalanced designs. Second, in order to accommodate heteroscedasticity, the usual error mean square is replaced by a different linear combination of the cell sample variances which, under the null hypothesis, has the same expected value as the other mean square in the difference (ensuring that the test statistic has mean zero under the null hypothesis). A modified version of the error mean square for accommodating heteroscedasticity was first used in Akritas and Papadatos [13] in the context of a high-dimensional one-way fixed effects design.

The rest of the paper is organised as follows. Section 2 reviews the fully nonparametric random and mixed effects models. In Section 3 we derive the test statistics and present their asymptotic distributions. Proofs of the propositions in Section 3 appear in the appendix. Section 4 discusses the estimation of the limiting null distributions and also presents results of simulations comparing our testing procedures to the usual *F*-tests. Finally, in Section 5 we present the analysis of a dataset from NOAA’s NS&T Mussel Watch Program.

#### 2. Fully Nonparametric Models

Here we review the fully nonparametric random and mixed effects models developed in [9, 10].

##### 2.1. The Random Effects Model

Consider a two-way random effects design. The random levels of the row factor are obtained by random sampling from a population of row levels, while the random levels of the column factor are obtained by random sampling from a population of column levels. Assume that the two selections are independent. For each factor-level combination, the observations are generated from a distribution function that can be entirely arbitrary and depends only on the randomly selected factor levels.

Decompose into a random mean and error term as follows: where and are defined implicitly in (4). The definitions of and imply Conditioning on and using the last relation in (5) it further follows that Next, decompose the random means as where the overall mean and the random effects , , and are defined as Combining (4) and (7) we obtain where the random effects have zero means, are both marginally and conditionally on and uncorrelated from the error terms (this follows from (6)), and satisfy as is easily seen using the independence of . Note that even though the main and interaction effects are uncorrelated, in general they will not be independent. This is because the interaction term is typically related to the two main effects by a multiplicative relation rendering the interaction effect nonnormal even in the presence of normal main effects. Therefore the symmetry assumption is typically violated in random effects designs.

Relation (9) with the associated properties (5), (6), and (10) is derived using only the assumption that and are independent. Thus, it is a nonparametric version of the classical random effects model, in the sense that it allows for arbitrary (rather than normal) distributions of the random effects and error term, the variance (as well as the entire distribution) of the error term can depend on the random factor levels, and the effects are uncorrelated, and uncorrelated to the error term, as opposed to the assumed independence of the classical model.

##### 2.2. The Mixed Effects Model

Consider a design where the rows correspond to the fixed effects and the columns correspond to the random effects. The levels of the random effects are obtained by simple random sampling from a population . Let denote a randomly selected element from , a fixed effect level, and an observation obtained from the cell . We place no restrictions on the form of the distribution function, , of other than the obvious restriction that it has to depend on and . In particular, can be a discrete distribution function. Since is obtained from a random selection, , its mean and variance are all random. The nonparametric model consists of a decomposition of in terms of plus an error term, and further decomposition of to define the main effects and interaction terms. In particular, write where the error term is defined implicitly in the above relation. An easy consequence of the definition of is Next decompose into main and interaction effects as where is the overall mean and are the main and interaction effects. An immediate implication of (14) is Using (12) it further follows that Combining (11) and (13) we have While model (17) has the appearance of the usual mixed effects model, it does not require normality (or even continuity) of the observations, the variance of the random interaction effects can depend on , and the random effects and are not uncorrelated without further assumptions (which we do not make) about constant variance of the random means and constant covariance for any two distinct random means.

Suppose now that there are fixed levels, so , and let , denote the levels of the random effect. Recall that the are obtained by simple random sampling from . Also let , denote the replications (iid observations) in factor-level combination . For notational simplicity we write for and for . Thus, conditionally on (or ), the are iid . From (17) we have where the effects and error terms satisfy (14), (15), and (16). In addition, and are independent for , , , are iid conditionally on , , and are independent for , and and are uncorrelated for . These imply We note that relation (14) implies that is a function of the random level . Since the random levels are iid, we have that the variances of , , are equal. Arguing similarly, we have that the variances of the , , are equal. However the model allows heteroscedasticity across different levels of the fixed effect, that is, and , for . Moreover, the conditional variances of the error term given different levels of the random effect can be different. Summarizing, we have but the model allows

#### 3. The Test Statistics and Their Asymptotic Distributions

Before proceeding, we first define our notation: The usual statistic in the balanced case in the restricted version of the mixed model is given by while the usual statistic in the balanced case in the random effects model is given by In the unbalanced case there is no theoretically justified procedure, but Searle [14] does have a number of recommendations which are implemented in software packages.
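For a balanced design, the textbook forms of these ratio statistics can be sketched as follows. This is a minimal illustration using the standard balanced two-way mean squares; the helper name `classical_F` and the `(a, b, n)` array layout are our own assumptions, not the paper's notation.

```python
import numpy as np

def classical_F(X):
    """Classical balanced two-way mean squares and F ratios for the main
    column (random) effect; X has shape (a, b, n). Under the restricted
    mixed model the usual statistic is MSB/MSE, while under the random
    effects model it is MSB/MSAB (standard textbook forms)."""
    a, b, n = X.shape
    grand = X.mean()
    col = X.mean(axis=(0, 2))            # column (random-effect level) means
    row = X.mean(axis=(1, 2))            # row means
    cell = X.mean(axis=2)                # cell means
    MSB = a * n * np.sum((col - grand) ** 2) / (b - 1)
    MSAB = n * np.sum((cell - row[:, None] - col[None, :] + grand) ** 2) / ((a - 1) * (b - 1))
    MSE = np.sum((X - cell[:, :, None]) ** 2) / (a * b * (n - 1))
    return MSB / MSE, MSB / MSAB
```

Both ratios are invariant to adding a constant to every observation, which is a quick sanity check on the mean-square computations.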

The test statistics we propose for testing the hypothesis of no main random effect, , , are modeled after the usual statistics given in (23) and (24), except that we take the difference rather than the ratio, as explained in the Introduction. Moreover, in order to accommodate unbalanced designs, , , and are replaced by , , and , respectively. As mentioned in the Introduction, our asymptotic theory requires that the difference has mean value zero. While this is the case for the random effects design, the expected value of is not zero for the mixed effects design with heteroscedastic errors. Therefore we replace with a different linear combination of the cell sample variances:

Thus, the proposed test statistics for the mixed effects and random effects models are, respectively, where .

Proposition 1. *Let be as defined in (26) and let be as defined in (27), with the observations generated according to the model of Section 2. Then, *(i)*if the null hypothesis *, holds,
(ii)*if the null hypothesis *, does not hold,

The second item in Proposition 1 suggests that the null hypothesis should be rejected at level $\alpha$ whenever the test statistic takes a value larger than the $(1-\alpha)$-quantile of the null distribution. The next proposition establishes asymptotic representations of the two test statistics that are useful for deriving their asymptotic distributions.

Proposition 2. *Under , *(i)
(ii)

Theorem 3. * Let . Then under **
in distribution as . *

*Proof. *Letting , then, as stated above,
The proof that in distribution follows from an application of the Lindeberg-Feller CLT. We first rewrite our statement in the notation of the Lindeberg-Feller CLT and then show that the Lindeberg condition holds after a straightforward application of the Dominated Convergence Theorem. Now let , so that . Also note that we can write
which is finite because we have assumed a finite fourth moment for the terms. Moreover, this implies that
which is again finite. Now in order to use the Lindeberg-Feller CLT, we need to verify the Lindeberg condition:
Note that
Because and is finite, the Dominated Convergence Theorem implies that , and hence the Lindeberg condition is verified. This completes the proof of the theorem.

Theorem 4. *Define the operator , for , where is the class of Borel subsets of and is the Lebesgue measure on , by
**
where . Let , , , be the eigenvalues and eigenfunctions of the integral equation
**
and assume that the eigenvalues satisfy
**
Then, if and have finite fourth moments and the covariances , , , and are bounded uniformly in , the asymptotic null distribution of , as , , is the same as the asymptotic distribution of , as , where , , are independent standard normal random variables. *

*Proof. *The proof proceeds by first conditioning on the levels of the column factor and then applying the arguments used in the proof of Theorem 3.3 of [9]. Indeed, the proof of the aforementioned theorem uses an expansion similar to (31) of Proposition 2 (with the indices interchanged) which is modeled as a *U*-statistic with kernel depending on . The same arguments establish the limiting distribution of the test statistic conditionally on . As , the strong law of large numbers guarantees that the kernel converges to the same limit, and thus the conditional limiting distribution does not depend on . Thus, the test statistic converges to the same distribution unconditionally.

#### 4. Test Procedures and Simulation Results

In testing for the random main effect in the mixed effects model, we only need to estimate the variance term in Theorem 3. According to (30) of Proposition 2 we need to estimate the variance of It is easy to see that in probability. Thus, we need to estimate the variance of the independent terms . By the central limit theorem, as , is approximately normally distributed with mean zero and variance . Therefore, the approximate variance of is . As a result, we estimate by Then, according to Proposition 1 and Theorem 3, is rejected at level $\alpha$ when the standardized test statistic exceeds $z_{1-\alpha}$, the $(1-\alpha)$-quantile of the standard normal distribution.
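The rejection rule just described can be sketched as follows. The sketch assumes the centered statistic decomposes into approximately independent, mean-zero per-level terms whose sample variance is a valid plug-in estimate; the helper name `one_sided_normal_test` and that decomposition are illustrative assumptions, not the paper's displayed formulas.

```python
from statistics import NormalDist, variance

def one_sided_normal_test(terms, alpha=0.05):
    """One-sided asymptotic normal test: `terms` stands in for the
    approximately independent, mean-zero summands of the centered
    statistic; their sample variance is the plug-in variance estimate."""
    n = len(terms)
    stat = sum(terms) / n ** 0.5           # scaled sum, approx N(0, var) under H0
    se = variance(terms) ** 0.5            # plug-in standard error
    z = stat / se
    crit = NormalDist().inv_cdf(1 - alpha) # z_{1-alpha}
    p_value = 1 - NormalDist().cdf(z)      # one-sided p value
    return z, p_value, z > crit
```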

Testing for main effects in the random effects model involves estimation of the asymptotic distribution of Theorem 4. Koltchinskii and Giné [15] show that the infinite spectrum of a Hilbert-Schmidt operator can be approximated by the finite spectrum of the empirical matrix version of the operator. We will use this result to approximate the spectrum of the operator in Theorem 4 by its empirical version, defined below. To this end, the following representation of the test statistic as a *U*-statistic is useful: where the *U*-statistic is defined by the kernel Under the null hypothesis , the empirical version of the operator is the matrix with entries . The eigenvalues of this matrix are then used to estimate the asymptotic distribution as where the $Z_r$ are randomly sampled from the standard normal distribution. To approximate the estimated distribution, we randomly generate a large number of such sets and use the formula above to obtain realizations from the distribution. Using this approximate distribution, for a level-$\alpha$ test we reject the null hypothesis when the test statistic is greater than the $(1-\alpha)$-quantile.
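A sketch of this spectral approximation, assuming a symmetric empirical kernel matrix `K` has already been formed from the data; the `1/b` eigenvalue scaling and both function names are our assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def null_dist_from_kernel(K, n_draws=10000, seed=0):
    """Approximate the limiting null law sum_r lambda_r (Z_r^2 - 1) from
    the eigenvalues of a symmetric b x b empirical kernel matrix K
    (sketch of the Koltchinskii-Gine spectral approximation idea)."""
    rng = np.random.default_rng(seed)
    lam = np.linalg.eigvalsh(K) / K.shape[0]   # empirical eigenvalues, 1/b scaling (assumption)
    Z = rng.standard_normal((n_draws, lam.size))
    return (Z ** 2 - 1) @ lam                  # draws from the approximate null distribution

def critical_value(draws, alpha=0.05):
    """Upper (1 - alpha)-quantile of the simulated null distribution."""
    return np.quantile(draws, 1 - alpha)
```

Rejection then compares the observed statistic to `critical_value(draws)`, mirroring the level-$\alpha$ rule described above.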

We now present simulation results comparing our test procedures for the main random effect in the mixed and random effects models to the usual *F*-tests given by (23) and (24). Note that these statistics are valid only for balanced designs. In the unbalanced designs, the datasets used to obtain the achieved size and power are written out to files and input to SAS, where the *F*-test is carried out with SAS PROC MIXED. The achieved size and power are calculated from the $p$ values given by SAS.

Using the decomposition in relation (18), we generate our data for the mixed effects model as follows.

1. Set the overall mean equal to any constant.
2. Generate a vector of main random effects from the standard normal distribution.
3. Independently generate a vector of fixed “functions” from the standard exponential distribution and use them to define the interaction effects.
4. Generate the observations from a normal distribution with the specified variance and with a mean whose last two terms are the main random and interaction effects.
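The steps above might be sketched as follows. The centering of the fixed "functions" and the multiplicative form of the interaction are assumptions made for illustration, as is the helper name `simulate_mixed`.

```python
import numpy as np

def simulate_mixed(a=20, b=50, n=4, mu=0.0, sigma=1.0, seed=0):
    """Sketch of the mixed-model generation steps: normal main random
    effects, exponential fixed 'functions', and an assumed multiplicative
    interaction between the (centered) fixed functions and the random effects."""
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(b)            # step 2: main random effects
    f = rng.exponential(1.0, size=a)         # step 3: fixed "functions"
    gamma = np.outer(f - f.mean(), beta)     # assumed multiplicative interaction
    mean = mu + beta[None, :] + gamma        # step 4: mean with random main + interaction
    return mean[:, :, None] + sigma * rng.standard_normal((a, b, n))
```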

Similarly, using the decomposition in relation (9), we generate our data for the random effects model as follows.

1. Set the overall mean equal to any constant.
2. Generate a vector of main row random effects from the standard normal distribution.
3. Independently generate a vector of main column random effects from the standard normal distribution, scaled by a factor that sets the degree of deviation from the null hypothesis.
4. Set the interaction effects to products of the row and column random effects.
5. Generate the observations from a normal distribution with the specified variance and mean.
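The random-effects recipe admits an analogous sketch; the multiplicative interaction, the parameter name `theta` for the degree of deviation from the null, and the helper name are illustrative assumptions.

```python
import numpy as np

def simulate_random(a=30, b=30, n=2, mu=0.0, theta=0.0, sigma=1.0, seed=0):
    """Sketch of the random-effects generation: row and column random
    effects from N(0,1), column effects scaled by theta (theta = 0 puts
    us under H0), and an assumed multiplicative interaction."""
    rng = np.random.default_rng(seed)
    alpha = rng.standard_normal(a)            # step 2: row random effects
    beta = theta * rng.standard_normal(b)     # step 3: scaled column random effects
    gamma = np.outer(alpha, beta)             # step 4: multiplicative interaction (assumed form)
    mean = mu + alpha[:, None] + beta[None, :] + gamma
    return mean[:, :, None] + sigma * rng.standard_normal((a, b, n))
```

With `theta = 0` the column and interaction effects vanish, so the generated data satisfy the null hypothesis of no main column effect.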

In Table 1, each simulation used . For the unbalanced setups, some rows had 4 observations per cell, whereas the remaining rows had 2 observations per cell. For the heteroscedastic setups, the first 15 rows all had variance 1, while the remaining rows had variance 5; that is, .

#### 5. Data Analysis

One real-world application of our methodology can be found in the National Oceanic and Atmospheric Administration’s National Status and Trends Program. In 1986, this division undertook a very large-scale project to monitor the levels of numerous chemical contaminants and organic chemical constituents in marine sediment and bivalve (mollusk) tissue samples. This project (http://ccma.nos.noaa.gov/about/coast/nsandt/musselwatch.aspx), dubbed the Mussel Watch Project, is still ongoing, and there are no apparent plans to discontinue it in the near future. There are currently over 300 coastal sites at which sediment and bivalve samples are collected and analysed for the project. Each site is categorised as being within a certain estuarine drainage area (EDA). For our data analysis, we chose to analyse the mercury concentrations in tissue samples of the *Crassostrea virginica*, or Eastern Oyster. There was a high degree of missingness of mercury concentrations in the dataset, so we used imputations to generate these measurements. Because of how we performed our imputations, only EDAs that did not have missing measurements for two consecutive years and EDAs that had measurements in 1986 were included in the analysis. Once we had a complete imputed dataset, we employed our methodology to test for effects due to EDA.

##### 5.1. Imputations

Because of the coefficient that is used in the statistic , each cell needs to have at least 2 observations. To facilitate the imputations, we identified 195 instances of 2 consecutive measurements falling into the following groupings: 1986-1987, 1988-1989, 1990-1991, 1992-1993, 1994-1995, 1996-1997, 1998-1999, 2000-2001, 2002-2003, and 2004-2005. If any cell had more than one observation, the observations were averaged. We used these 195 pairs to build a regression model that predicts the mercury concentration at time $t+1$ from the mercury concentration at time $t$. For this model, the $p$ value for the test of zero slope was 0.000. When fitting the regression model in Minitab, we also saved the residuals from the model, and the actual imputations that we used were the regression predictions plus a randomly selected residual from the fitting of the model. We treat imputations for cells with only one observation separately from imputations for empty cells. When cells are empty, we use the regression equation twice to impute two values of Hg concentration at time $t+1$. When a cell has only one observation, we use a different technique.

1. If the EDA has any cells with more than one observation in any year, we calculate residuals for each year with more than one observation. Then for the EDA and year of interest, we impute a second observation by adding a randomly selected residual.
2. If the EDA of interest has at most one observation in each cell, we identify the closest neighboring EDA and calculate residuals for each year with more than one observation in the neighboring EDA. Then for the EDA and year of interest, we impute a second observation by adding a randomly selected residual from the neighboring EDA.
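The regression-plus-random-residual imputation for empty cells can be sketched as follows; the helper name `impute_next` and the least-squares fit via `np.polyfit` are our choices for illustration, not the paper's Minitab workflow.

```python
import numpy as np

def impute_next(y_t, y_pairs, rng):
    """Fit y_{t+1} = b0 + b1 * y_t on observed consecutive-year pairs,
    then impute the prediction plus a randomly drawn model residual.
    `y_pairs` is an (m, 2) array of (year t, year t+1) concentrations."""
    x, y = y_pairs[:, 0], y_pairs[:, 1]
    b1, b0 = np.polyfit(x, y, 1)              # least-squares slope and intercept
    resid = y - (b0 + b1 * x)                 # residuals from the fitted model
    return b0 + b1 * y_t + rng.choice(resid)  # prediction + random residual

# Toy usage: with exactly linear pairs the residuals are ~0, so the
# imputed value is essentially the regression prediction.
rng = np.random.default_rng(0)
pairs = np.column_stack([np.arange(10.0), 2.0 + 0.5 * np.arange(10.0)])
val = impute_next(4.0, pairs, rng)
```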

After performing the imputations, there are a total of observations of Hg concentration in CV tissue samples. The 20 years included are 1986–2005, and there are 39 EDAs included in the dataset. The data are rather unbalanced, with most EDAs having only two observations per year, while others such as the Galveston Bay EDA have as many as six observations per year. Moreover, the observations do not seem to be homoscedastic. These points are illustrated in Figure 1, where we show scatterplots of Hg concentrations for six EDAs after imputation.

*Figure 1: Scatterplots of Hg concentrations after imputation for six EDAs, panels (a)–(f).*

##### 5.2. The Analysis

In our analysis of this dataset under the mixed effects model, we take the years as the fixed effect and the EDAs as the random effect. We can think of the EDAs as a random effect because we are analyzing only a very small subset of a much larger group of EDAs and the location effect is not of specific interest. Some may view the time effect as being of specific interest, while others may not. Whether one views the time effect as a fixed or a random factor determines whether the analysis proceeds under a mixed or a random effects model.

We want to model the Hg concentration with an interaction effect between year and location, so we use the model in relation (18). Because we are dealing with compositional data, we employ the common practice of analyzing the data after taking a log transformation; compare [16].

When we apply our proposed test procedure based on the test statistic , we obtain a $p$ value of . When we apply the *F*-test computed from SAS PROC MIXED, we obtain a $p$ value of .

The $p$ value for the proposed test is calculated as , where is the distribution function of the standard normal random variable. The $p$ value for the SAS *F*-test is calculated as , where and denote the numerator and denominator degrees of freedom. In the test for the random interaction effect, SAS uses , . The simulation results shown in Table 1 suggest that the significance of the interaction effect indicated by the *F*-test cannot be trusted. Indeed, according to these simulation results, we expect the *F*-test to detect interaction in almost any unbalanced and heteroscedastic dataset. On the other hand, the simulation results suggest that the marginal significance indicated by the $p$ value obtained from the proposed test procedure can be trusted. Accordingly, we recommend that the interaction effect be included in the modeling of this dataset.

#### 6. Summary

The classical *F* statistic for testing the significance of main random effects in two-factor mixed and random effects models is shown to be very sensitive to violations of the assumptions under which it is derived, in particular those of symmetry, homoscedasticity, and balancedness. Two new test procedures for testing the hypothesis of no main random effects are proposed under fully nonparametric modeling of the mixed and random effects designs. The test statistics are defined as differences (as opposed to ratios) of suitably defined mean squares, and their asymptotic theory is derived as the number of levels tends to infinity. Simulations suggest that the new procedures perform reasonably well in situations where the *F* statistic breaks down and that they have comparable performance to it when its assumptions are met. The practical value of the proposed procedures is demonstrated with the analysis of a real dataset.

#### Appendices

#### A. Proof of Proposition 1

We first present the proof for the statistic under the mixed effects model, and then for the statistic under the random effects model.

*Proof for *. Using relations (15) and (18) we write
The conditional expectation of the first part given the vector of random column levels is
Therefore, under the null hypothesis that , a.s., for all , the expected value of is zero conditionally on , and thus also unconditionally. Moreover, it is clear that under the alternative the expected value of is positive.

*Proof for *. Using relation (9), we write
Thus, the proof of this part of the proposition will follow if we show
Relation (A.4) follows from
where the second equality follows from the fact that for . Relation (A.5) follows similarly.

#### B. Proof of Proposition 2

We first show the proof for the statistic under the mixed effects model, and then for the statistic under the random effects model.

*Proof for *. Because we have shown above in the proof of Proposition 1 that, under , the statistic has mean value zero, it suffices to show that with data generated as in Section 2 and effects defined in (18), the following convergence holds:
in probability as . Now, first note that the above expression is centered, so we only need to show that its variance tends to zero as tend to . This is easily done using the facts, given in relation (19), that the terms are uncorrelated for different levels of the fixed effect, independent for different levels of the random effect, and, conditionally on the level of the random effect, iid for . Note that, using these relations,
Because we assume a finite fourth moment for the terms and a finite covariance for any two distinct squared terms, this conditional variance tends to zero. Therefore, by the Dominated Convergence Theorem, the unconditional variance also tends to zero, thus proving the proposition.

*Proof for *. First, note that after some straightforward algebra we can write

To complete the proof, we claim that converges to zero after showing that the leading term of this product tends to zero. We proceed similarly for the last two terms in . Finally, we show that tends to zero to finish the proof. Note that each of these terms has expected value zero, so to demonstrate these facts, we only need to show that the variances tend to zero. To this end, note that This completes the proof of the proposition.

#### Acknowledgments

This research was supported in part by NSF Grants SES-0318200 and DMS-0805598. This research was initiated during the second author’s sabbatical visit at the Australian National University in the Spring of 2006. Helpful discussions with Peter Hall are gratefully acknowledged.