#### Abstract

We propose and study parametric bootstrap (PB) tests for heteroscedastic two-factor MANOVA with nested designs. For the problem of testing “main effects” of both factors, we develop a flexible test based on a parametric bootstrap approach. The PB test is shown to be invariant under affine-transformations. Moreover, the PB test does not depend on the chosen weights used to define the parameters uniquely. The proposed test is compared with the approximate Hotelling (AHT) test by the simulations. Simulation results indicate that the PB test performs satisfactorily for various cell sizes and parameter configurations and generally outperforms the AHT test in terms of controlling the nominal size. For the heteroscedastic cases, the PB test outperforms the AHT test in terms of power. In addition, the PB test does not lose too much power when the homogeneity assumption is actually valid.

#### 1. Introduction

There are many situations where we record more than one response variable from each sampling or experimental unit and where these units are allocated to or occur in treatment groups. Ecologists often record the abundances of many species from each sampling or experimental unit and physiologists commonly measure more than one variable (e.g., blood pressure, heart rate, etc.) on experimental animals. With multiple response variables, we might be more interested in whether there are group differences on all the response variables considered simultaneously. This is the aim of multivariate analysis of variance (MANOVA), the analogue of univariate ANOVA when we have multiple response variables for each experimental or sampling unit.

The simplest design is single factor MANOVA (or one-way MANOVA) which tests the effect of a factor, having levels. If the cell covariance matrices are assumed to be equal, then there are some popular tests available to test the equality of the mean vectors. The tests that are commonly used are the Wilks likelihood ratio (WLR), Lawley-Hotelling trace (LHT), Bartlett-Nanda-Pillai (BNP), and Roys largest root tests ([1], chap. 8, sec. 6). When there are some departures from the standard assumption, that is, unequal cell covariance matrices, these solutions were proposed by James [2], Johansen [3], Gamage et al. [4], and Krishnamoorthy and Lu [5], among others. The more complex design is multifactor MANOVA especially when the homogeneity of the cell covariance matrices assumption is seriously violated. There has been a continuous interest in the heteroscedastic two-factor MANOVA which tests in checking the significance of the effects of two factors and , each having and levels, respectively, in a multivariate factorial experiment with crossed designs. Recently, Harrar and Bathke [6] attacked this problem via modifying the WLR, LHT, and BNP tests. Their main ideas focus on modifying the degrees of freedom of the random matrices involved in the test statistics so that the heteroscedasticity of the cell covariance matrices is taken into account and the WLR, LHT, and BNP tests can still be used but with the degrees of freedom estimated from the data via matching the first two moments; see some details in Section 2.2 of Zhang and Xiao [7]. Zhang [8] proposed an approximate Hotelling (AHT) test. A Wald-type test statistic is used. Its null distribution is approximated by a Hotelling -distribution with one parameter estimated from the data. Some simulation studies conducted in Zhang [8] showed that the AHT test outperforms the modified LHT test of Harrar and Bathke [6].

Another useful two-factor design is the nested MANOVA [9]. Notice that each of the testing problems associated with the three null hypotheses in the heteroscedastic two-way MANOVA can be equivalently expressed in the form of the general linear hypothesis testing (GLHT) problem as described in Zhang [8]. The GLHT problem is very general. It includes not only the main and interaction effect tests but also various post hoc and contrast tests as special cases. To construct the test for two-factor nested MANOVA, we may use the same Wald-type test statistic as in Zhang [8]. However, according to our simulation studies, the empirical sizes of the AHT test for the two-factor nested MANOVA model may far exceed the nominal level. On the other hand, the AHT test needs to consider the the selected weights when the cell sizes are unequal. Therefore, it is important to develop a test procedure for the nesting effects and the nested effects with satisfactory size and power regardless of number of factorial effects and the sample sizes.

Accordingly, the present paper will develop a parametric bootstrap (PB) test for heteroscedastic two-factor nested MANOVA. We use standardized effects sum of squares and a natural test statistic obtained by replacing cell covariance matrices by the corresponding sample cell covariance matrices. The PB test admits several nice properties: (1) it can be simply conducted by a routine Monte Carlo algorithm; (2) it is shown to be invariant under affine-transformations; (3) The PB test does not depend on choices of the weights used to identify the parameters; and (4) it works well. Simulation results reported in Section 4 indicate that the PB test performs satisfactorily for various cell sizes and parameter configurations when the homogeneity assumption is seriously violated and generally outperforms the AHT test in terms of power and controlling size. The main ideas of the proposed PB test are closely related to the work by Xu et al. [10]. The methodologies for the PB test are presented in Sections 2 and 3. Simulation results are presented in Section 4. Some concluding remarks are given in Section 5. Finally, some technical proofs of the main results are given in the Appendix.

#### 2. Tests for Nested Effects

Consider a two-factor nested design model with factors and . Suppose that the nesting effect corresponds to the factor having factor levels, and the nested effect corresponds to the factor having a total of levels with levels of nested within the th level of the factor . Suppose a -variate random sample of size is available under th level, ; . Let , ; ; represent these random vectors and represent their observed (sample) values. Assume that so that positive definite sample covariance matrices can be computed for each cell of the design. Suppose that satisfy the following model: where and are the cell mean vector and cell covariance matrix of the random sample at the th cell and is an experimental error vector term. All these samples are assumed to be independent with each other. In the two-factor nested model, the cell mean vectors are usually decomposed into the form where is the grand mean vector, is the effect vector of the th level of on each of the variables in , and is the effect due to the th level of the factor nested within the th level of the factor so that (1) can be further written as the following two-factor multivariate nested model with unequal error covariance matrices: In order to have , , and uniquely defined, we need to have additional constraints. Let and be nonnegative weights such that and . We consider the following constraints: where and are nonnegative weights such that and for each .

In this section, we are interested in testing the null hypothesis, against its natural alternative hypothesis. The null hypothesis in (5) aims to test if the nested effect corresponding to the factor is statistically significant.

The sample mean vector and the sample covariance matrix of the th cell are denoted by and , respectively, where ; , and denotes the transpose of the matrix . The observed values of these random variables are denoted as and , respectively, , . We note that ’s and ’s are mutually independent with and the model for as follows: where and denotes the -dimensional Wishart distribution with degrees of freedom and scale parameter matrix . Writing and , the model (7) can be written as where , , , , denotes the vector of ones, is an identity matrix with order , denotes the Kronecker product of matrices and , and denotes a block-diagonal matrix with along the blocks.

Define the standardized sum of squares due to the factor where and are solutions of and that minimize the quadratic equation subject to the constraints given in (4).

In fact, denoting and , it follows from Theorem 5.2.5 in Wang and Chow [11] that where and . Then Notice that indeed depends on of chosen weights. However, we can prove that the standardized sum of squares in (9) does not depend on . This is the following Theorem.

Theorem 1. *The standardized sum of squares does not depend on of chosen weights, and
**
where denotes any generalized inverse of matrix .*

Note from Theorem 1 that does not depend on . If are known, then a natural statistic for testing (5) is . In fact, , and is an idempotent matrix with rank ; we have where and denotes a noncentral chi-square random variable with degrees of freedom and noncentrality parameter . The noncentrality parameter is equal to zero when . Let be the observed value of . Then, the test that rejects in (5) whenever is a size test, where is the upper th quantile of a chi-square distribution with .

In general, the covariance matrices are unknown; in this case, a test statistic can be obtained by replacing in (13) by , and is given by where .

As shown in Zhang [8], the testing problem associated with the null hypothesis (5) can also be equivalently expressed in the form of the general linear hypothesis testing (GLHT) problem as , where , and , . Then the associated Wald-type test statistic is given as where , and is defined as in (17).

In the following, we shall describe two tests based on (17) and (19), respectively.

##### 2.1. The Approximate Hotelling (AHT) Test

Similar to Zhang [8], we proposed a test referred as the AHT test, which is based on the test statistic where , and and with , and is the th block matrix of . Zhang [8] showed that, under , is approximately distributed as random variable. Thus, the AHT test rejects the null hypothesis in (5) when the critical value for the nominal significance level is exceeded by the observed test statistic in (20). The AHT test can also be conducted via computing the -value based on the approximate null distribution.

##### 2.2. The Parametric Bootstrap (PB) Test

The parametric bootstrap involves sampling from the estimated models. That is, samples or sample statistics are generated from parametric models with the parameters replaced by their estimates. Recall that under the vector have the mean , where . As the test statistic in (17) is location invariant under the group of location transformations , without loss of generality, we can take . Using these facts, the parametric bootstrap pivot variable can be developed as follows. For a given , let and , , . Then the PB pivot variable based on the test statistic (17) is given by where and . For a given level , the PB test rejects in (5) when where is an observed value of in (17). For fixed ;, the above probability does not depend on any unknown parameters, and so it can be estimated using Monte Carlo simulation given in the following Algorithm 2.

*Algorithm 2. *For a given , , and , Compute in (17) and call it . For , Generate and , , Compute using (22). If , set . (End loop) is a Monte Carlo estimate of the value in (23).

##### 2.3. Some Desirable Properties of the PB Test

The PB test (23) has several desirable invariance properties. First of all, the PB test is affine-invariant. That is, it is invariant under the following affine-transformation: where is any nonsingular matrix and is any given vector. This property is desirable since in practice, the cell responses (1) are often recentered or rescaled before an inference is conducted. The recentering and rescaling transformations are special cases of (25).

Theorem 3. *The PB test (23) is affine-invariant in the sense that both the observed test statistic (24) and the distribution of PB pivot variable (22) are affine-invariant.*

The model (3) is identifiable when the constraints (4) are imposed. In many situations there are no natural weights to justify a particular test procedure. For this, we have the following result.

Theorem 4. *The PB test (23) is invariant on the different weight choices in constraints (4).*

Notice that the test statistic (17) does not depend on the chosen weights. The conclusion in Theorem 4 is obvious. However, we have the following property.

Theorem 5. *The test statistic (17)
**
This property is desirable since in practice, in order to avoid computation of generalized inverse of matrices and enhance the accuracy of computation, we have taken a particular such as in the simulation study.*

#### 3. Tests for the Nesting Effects

When the nested effects are present, the nesting effect can not reflect the effect of because it depends on which level of factor it is in. As pointed in Searle [12], Chap. 7, in the case of , a popular solution to the problem is not quite a test for in the presence of nested effects but rather to test the null hypothesis We will see that the problem of testing (27) is the same case of the problem of comparing normal mean vectors when the cell covariance matrices are unknown and arbitrary [5]. In fact, if are known, then is the best linear unbiased estimator of , and a natural statistic for testing (27) is where . Notice that , and is an idempotent matrix with rank ; we have The noncentrality parameter is equal to zero when , , . Then, the test that rejects in (27) whenever is a size test. In general, the covariance matrices are unknown; in this case, a test statistic can be obtained by replacing in (29) by , , and is given by

In the following, we describe the PB test for in (27). Recall that under the vector have the mean . As the test statistic in (32) is location invariant under the group of location transformations , without loss of generality, we can take . Using these facts, the parametric bootstrap pivot variable can be developed as follows. For a given , let and , . Then the PB pivot variable based on the test statistic (32) is given by where and . For a given level , the PB test rejects in (27) when where is an observed value of in (32). For fixed , the above probability does not depend on any unknown parameters, and so it can be estimated using Monte Carlo simulation given in the following Algorithm 6.

*Algorithm 6. *For a given , and , Compute in (32) and call it . For , Generate and , , Compute using (33). If , set . (End loop) is a Monte Carlo estimate of the value in (34). Note that the PB test provided in this section has the same invariance properties as in Section 2.

#### 4. Monte Carlo Studies

In this section, we need to compare the PB test with the AHT test via comparing their empirical sizes (Type I error rates) and powers for two-factor nested MANOVA models via simulations. As pointed out in Section 3, the problem of testing (27) is the same case of the problem of comparing normal mean vectors when the cell covariance matrices are unknown and arbitrary. Due to this reason, we will look only at the nested effects for comparisons.

Let the two factors be and , respectively. Suppose that the nesting effect of the factor has factor levels, and the nested effect of the factor has a total of levels with levels of nested within the th level of the factor . Let denote the vector of cell sizes. For given and covariance matrices , , , we first generate multivariate samples as where the cell mean vectors with being the first cell mean vector, a constant unit vector specifying the direction of the cell mean differences, and a tuning parameter controlling the amount of the cell mean differences. We independently generate the entries of the error terms from the distribution so that we always have and . This means that (36) will generate the th multivariate normal sample , with the given mean vector and covariance matrix . Without loss of generality, we specify as and as where for any given dimension and denotes the usual -norm of . To estimate the sizes and powers of the AHT test, we used simulation consisting of 10,000 runs and recorded corresponding values. Notice that two nested “do loops” are required to estimate the sizes and powers of the PB test, we used 2500 runs for outer “do loops” (for generating the data) and 5000 runs for inner “do loops” for estimating the bootstrap values. The empirical sizes (when ) and powers (when ) of the two tests are the proportions of rejecting the null hypothesis, that is, when their values are less than the nominal significance level . In all the simulations conducted, we used for simplicity.

Notice that the PB test does not depend on chosen weights. Here we report only the comparative studies for the equal-weight method to specify the weights of the AHT test so that the AHT test in Zhang [8] may be used for showing properties of the PB test. The empirical sizes and powers of the two tests for nested effect tests, together with the associated tuning parameters, are presented in Tables 1–3, in the columns labeled with AHT and PB under “” and “”, respectively. As seen from the three tables, three sets of the tuning parameters for the cell covariance matrices are examined, with the first set specifying the homogeneous cases; four sets of the cell sizes are specified, with the first two sets specifying the balanced cell size cases. To measure the overall performance of a test in terms of maintaining the nominal size , we use the average relative error defined in Zhang [8] as ARE where denotes the th empirical size for , , and is the number of empirical sizes under consideration. The smaller ARE value indicates the better overall performance of the associated test. Usually, when , the test performs very well; when , the test performs reasonably well; and when , the test does not perform well since its empirical sizes are either too liberal or too conservative and hence may be unacceptable. Notice that for a good test, the larger the cell sizes, the smaller the ARE values. The ARE values of the two tests under the two error schemes are also presented in these three tables. Notice that for simplicity, in the specification of the covariance and size tuning parameters, we often use to denote “ repeats times.” Table 1 shows the empirical sizes and powers of the two tests for a bivariate case with and , . With , one may be able to check how the two tests behave when one of the factors has a large number of levels. Tables 2 and 3 show the empirical sizes and powers of the two tests for a three-variate case with and , , , and a 10-variate case with and , , , respectively. These two tables allow us to compare the two tests for higher-dimensional data.

In the following, let us compare the AHT and PB tests via examining their empirical sizes and powers. It is seen from the three tables that for all the cases, the PB test generally outperforms the AHT test for all the cases under consideration as shown by their empirical sizes and the associated ARE values presented in the three tables. The AHT test performs reasonably well only for 10-dimensional data. In the homogeneous cases, the AHT test appears to be more powerful than the PB tests because of its inflated empirical sizes exceeding the nominal level considerably. For the heteroscedastic cases, we once again observe from Tables 2 and 3 that the PB test performs superior to the AHT test in terms of controlling nominal level and powers.

We conclude this simulation section via summarizing all the simulation studies conducted. In terms of size, the overall conclusion is that the PB test is a flexible procedure that performs satisfactorily, regardless of the cell sizes, values of the cell covariance matrices, and the number of effects being compared. In terms of power, the PB test generally outperforms the AHT test for most heteroscedastic cases under consideration. Thus, one may recommend to use the PB test as a good alternative in practical applications because of its simplicity and accuracy.

#### 5. Concluding Remarks

The available classical tests for the two-factor MANOVA model with crossed designs and heteroscedastic cell covariance matrices have serious Type I error problems that have been overlooked. To address this serious problem, Zhang [8] proposed the AHT test. In this paper, we proposed and studied the PB test and the AHT test for the two-factor MANOVA model with nested designs and heteroscedastic cell covariance matrices. We showed that the PB test is invariant under affine-transformations and different choices of weights used to define the parameters uniquely. We demonstrated via intensive simulations that the PB test generally performs well and outperforms the AHT test in terms of size and power for most cell sizes and parameter configurations.

#### Appendix

*Proof of Theorem 1. *When is true, the model (8) can be written as the following reduced model:
Premultiplying in model (A.1), we have the following transformed model:
Under the constraint , the model (A.2) becomes
In view of Theorem 5.2.5 in Wang and Chow [11], P. 177, we shall prove that is a side condition on the matrix ; that is, satisfies conditions
where denotes the subspace spanned by the column of matrix . Denote and , respectively. Then , and . Note that are nonnegative weights such that . Then it can be easily verified that and satisfy the following conditions:
Assume that . Then there exist two vectors and such that
Denote
and , where and are vectors, respectively, . Then (A.6) may be rewritten as
which is equivalent to , where , . It follows from (A.5) that
and hence
This means that
It follows from that
On the other hand,
The second equation of (A.13) may be derived easily from Theorem 2.2.1 in Wang and Chow [11]. It follows from (A.13) that
Combining (A.12) with (A.14), we obtain that is a side condition on the matrix . Thus it follows from Lemma 5.2.2 in Wang and Chow [11] that
Thus, we obtain that . This imply that is a generalized inverse of . Notice from Corollary 2.2.3 in Wang and Chow [11] that is invariant with respect to the choice of the involved generalized , which leads to . Therefore, It follows that
Then we have
as desired. Theorem 1 is proved.

*Proof of Theorem 3. *Since the observed values of , , are denoted as , we let denote the observed values of the affine-transformed cell responses , , given by (25). Then we have . On the other hand, we obtain that the sample mean vector and the sample covariance matrix of the th affine-transformed cell are and , respectively. The observed values of these random variables are and , respectively, , . Let and ; from the reduced model (A.1), we have the following affine-transformed reduced model:
where , .

Denote . Let be affine-transformed observed test statistic of (24). Then, by direct calculations, we have
where is the observed values of . The affine-invariance of the observed test statistic (24) then follows.

Let
be mutually independent, , . Then and , where means that and have the same distribution. Using these facts, the affine-transformed parametric bootstrap pivot variable can be given by
where and . The affine-invariance of the distribution of PB pivot variable (22) follows. Theorem 3 is then proved.

*Proof of Theorem 4. *The proof of Theorem 4 is obvious and hence omitted here.

*Proof of Theorem 5. *Using the same proof as that of Theorem 1, it follows that is a side condition on the matrix ; that is, satisfies conditions
where a.e. denotes “almost everywhere.” Thus it follows from the similar proof of Theorem 1 that
Then we obtain from (A.23) that the test statistic (17)
as desired. Theorem 5 is proved.

#### Conflict of Interests

The author declares that there is not any conflict of interests regarding the publication of this paper.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (11171002) and the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions (CIT&TCD201404002).