#### Abstract

One of the key issues in robust parameter design is to configure the controllable factors to minimize the variance due to noise variables. However, it can sometimes happen that the number of control variables is greater than the number of noise variables. When this occurs, two important situations arise. The first is that the variance due to noise variables can be brought down to zero. The second is that multiple optimal control variable settings become available to the experimenter. A simultaneous confidence region for such a locus of points not only provides a region of uncertainty about such a solution but also provides a statistical test of whether or not such points lie within the region of experimentation or a feasible region of operation. However, this situation requires a confidence region for the multiple-solution factor levels that provides proper simultaneous coverage. This requirement has not been previously recognized in the literature. In the case where the number of control variables is greater than the number of noise variables, we show how to construct the critical values needed to maintain the simultaneous coverage rate. Two examples are provided to demonstrate the practical need to adjust the critical values for simultaneous coverage.

#### 1. Introduction

Robust Parameter Design (RPD) is also called Robust Design or Parameter Design in the literature [1, 2]. The concept of RPD was introduced in the United States by Genichi Taguchi in the early 1980s. It is a methodology that takes both the mean and variance into consideration for product or process optimization. Taguchi [3] divided the predictor variables into two categories: control variables and noise variables. Control variables are easy to control while noise variables are either difficult to control or uncontrollable at a large scale. In practice, we would like to find a range of control variables such that (1) the variance caused by the change of noise variables is minimized and (2) the mean response is close to target. Multiple optimization design and analysis methods have been developed to achieve these two goals simultaneously, ranging from the traditional Taguchi methods to the more sophisticated response surface alternatives. (See [2, 4, 5] for detailed reviews on these methods.)

Under certain conditions, Myers et al. [6] proposed a way to construct a confidence region of the control variables where the variability transmitted by the noise variables is minimized to zero. Although they focused only on the variance part, there are many situations in which focus is placed entirely on the process variance (see p. 506 in [5]). For instance, if the process mean can be optimized using certain control factors that do not interact with noise variables or impact the noise variance (i.e., “tuning factors”), then one can seek other control factors which can drive the noise variance to zero. Even if the process mean cannot be driven to the target by tuning factors alone, it is nonetheless illuminating to consider both the confidence region for the minimum process variance and the confidence region for the optimal process mean [5, 6]. If the number of noise variables is not greater than the number of control factors (for examples, see [7–9] and pages 491–492 and 499 in [5]), then the conditions for a minimum process variance and a zero-gradient solution will coincide.

Furthermore, a zero-gradient solution is quite useful in that, even for very large noise variation, the transmission of that noise to the output of the process can be made negligible (or very small) by utilization of zero-gradient (or near-zero-gradient) operating conditions. Therefore, it is desirable to have a way to statistically test for the existence of such a solution within the experimental region or a region of feasible operation (see p506 in [5]). A simultaneous confidence region for the locus of points forming a zero-gradient solution forms such a test and also provides a graphical measure of uncertainty about the zero-gradient solution.

The existence of a zero-gradient solution is especially interesting for mixture experiments with control factor ingredients and noisy process variables. If a zero-gradient solution for the ingredient mixture can be found within the experimental region, and the confidence region for the zero-gradient solution intersects one or more of the mixture simplex boundaries, then it may be possible to remove one or more mixture ingredients and still maintain very low noise variance. Removing one or more mixture ingredients may help to reduce production cost [10].

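Myers et al. [6] proposed a confidence region in control variables based upon the standard response surface model for incorporating noise variables (see, for example, [4, 11, 12]), shown in (1):

$$y(\mathbf{x}, \mathbf{z}) = \beta_0 + \mathbf{x}'\boldsymbol{\beta} + \mathbf{x}'\mathbf{B}\mathbf{x} + \mathbf{z}'\boldsymbol{\gamma} + \mathbf{x}'\boldsymbol{\Delta}\mathbf{z} + \varepsilon, \tag{1}$$

where $\mathbf{x}$ is a vector of control variables, $\mathbf{z}$ is a vector of noise factors, $\beta_0$ is the intercept, $\boldsymbol{\beta}$ is a vector of coefficients for the main effects of control variables, $\boldsymbol{\gamma}$ is a vector of coefficients for the main effects of noise variables, $\mathbf{B}$ is a matrix whose diagonals are the coefficients for the quadratic terms of control variables and whose off-diagonals are one-half of the control variable interaction effects, and $\boldsymbol{\Delta}$ is a matrix of control by noise variable interaction effects. $\varepsilon$ is the random error term. It is assumed here that $\varepsilon \sim N(0, \sigma^2)$.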

Assuming that noise variables have mean zero and variance-covariance matrix , the variance of the response in Model (1) is
Here the variance of the response is divided into two parts: the variance transmitted by the noise variables (represented by the first term) and the constant variance () due to modeling error and other factors not considered in the model. In other words, it is the noise variables that lead to the variance heterogeneity in the response. Since the changes of noise variables are inevitable in practice, Myers et al. [6] proposed that the “minimum process variance” can be reached by setting the slope of the noise variables, , equal to zero and, therefore, eliminating the noise variance part from the response variance. A confidence region for such control variable values can be constructed by inverting a hypothesis test of the form:
for each **x**-point. To simplify the notation, let be a vector that contains all the elements of the noise variable’s main effect vector and the interaction matrix , that is, ,, where , is the element of vector and , is the element of matrix in the row and column. Let **, **where is an identity matrix of dimension , is a vector of the corresponding control variables that interact with noise variables, and denotes the Kronecker product. The null hypothesis in (3) can then be written as
where is a matrix and . Now the confidence region in control variables for zero variance due to noise variables can be defined as
Furthermore, let denote the test statistic for which is
where is the estimate of ; is the usual unbiased estimate of the variance of , .
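As a reading aid, the variance decomposition in (2) and the hypothesis in (3) can be sketched in the usual dual response surface notation. The symbols are assumptions rather than the paper's own typesetting: $\boldsymbol{\gamma}$ for the noise main effects, $\boldsymbol{\Delta}$ for the control by noise interaction matrix, $\boldsymbol{\Sigma}_z$ for the variance-covariance matrix of the noise variables, and $\sigma^2$ for the residual variance.

```latex
\begin{align*}
\operatorname{Var}_{\mathbf{z}}\!\left[y(\mathbf{x},\mathbf{z})\right]
  &= (\boldsymbol{\gamma} + \boldsymbol{\Delta}'\mathbf{x})'\,
     \boldsymbol{\Sigma}_z\,
     (\boldsymbol{\gamma} + \boldsymbol{\Delta}'\mathbf{x}) + \sigma^2, \\
H_0 &: \boldsymbol{\gamma} + \boldsymbol{\Delta}'\mathbf{x} = \mathbf{0}.
\end{align*}
```

Under this reading, and assuming $\boldsymbol{\Sigma}_z$ is positive definite, the first term vanishes exactly when the slope in the noise direction is zero, which is why a zero-gradient point attains the minimum response variance $\sigma^2$.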

Let, Myers et al. [6] have shown that, for each fixed, where is a vector of random variables (i.e., is an estimator of , whereas is an estimate from the actual data), is the residual degrees of freedom (df), and is distribution with numerator df equal to and denominator df equal to . They then conclude that the percent confidence region in (5) is where the critical value .

Note that the confidence region in (7) (called the “MKG confidence region” from now on) and the critical value were derived based on two critical assumptions: (1) the minimum of the variance due to noise variables is zero; (2) the solution to the zero-gradient equation in (4) is unique. There are situations where the first assumption cannot be met due to the fact that the solution to (4) is either outside the experimental region or does not exist. (However, the approach proposed in this article may provide the determination of such existence in a statistically significant way.) Assuming that the first assumption is met (like the two examples in Section 3), the second assumption is only true when . Notice that (4) represents a series of equations with unknown control variables. As recognized by Myers et al. (see page 506 in [5]), the equation will result in a single point solution when , a line or hyperplane when , and a single point solution or no solution when . In other words, the MKG confidence region provides the correct critical value for the zero-gradient solution (if it exists) only when there are at least as many noise variables as control variables.
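The three cases described above (a single point, a line or hyperplane, or possibly no solution) can be checked numerically for any fitted slope equation. The sketch below assumes the zero-gradient condition is the linear system "noise main effects plus transposed interaction matrix times the control vector equals zero"; the function name, argument names, and numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def zero_gradient_solutions(gamma, delta):
    """Describe the solutions x of gamma + delta.T @ x = 0.

    gamma : (r,) noise main effects (assumed notation).
    delta : (k, r) control-by-noise interaction matrix (assumed notation).
    Returns (x0, basis, consistent):
      x0         a least-squares particular solution,
      basis      orthonormal null-space basis of delta.T, one column per free direction,
      consistent True when the system is actually solvable.
    """
    gamma = np.asarray(gamma, dtype=float)
    a = np.asarray(delta, dtype=float).T          # r equations, k unknowns
    x0, _, rank, _ = np.linalg.lstsq(a, -gamma, rcond=None)
    consistent = bool(np.allclose(a @ x0, -gamma, atol=1e-8))
    _, _, vt = np.linalg.svd(a)                   # rows of vt beyond `rank` span the null space
    basis = vt[rank:].T
    return x0, basis, consistent

# One noise variable, two control variables: a line of solutions.
x_line, basis_line, ok_line = zero_gradient_solutions([1.0], [[1.0], [-1.0]])
# Two noise variables, two control variables: a single point.
x_pt, basis_pt, ok_pt = zero_gradient_solutions([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
# Three noise variables, two control variables: generally no solution.
x_none, basis_none, ok_none = zero_gradient_solutions(
    [1.0, 1.0, 1.0], [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
)
```

The number of columns of `basis` is the dimension of the solution set, which is the quantity that drives the choice of critical value in the sections that follow.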

However, when the number of noise variables is less than the number of control variables, multiple solutions can exist to the zero-gradient equation in (4). In such situations, use of the MKG region will provide below nominal simultaneous coverage. As such, a confidence region which covers all the solutions simultaneously needs to be developed. In practice, statistical inference for the multiple-solution problems is important as this gives the experimenter more options with regard to finding the zero-gradient factor settings. The objective of this paper is to generalize the MKG confidence region such that it will provide the adequate coverage for both the single-solution case (where ) and multiple-solution case (where ).

The rest of this paper is organized as follows. Section 2 derives the generalized confidence region and the corresponding critical value required for inverting the associated null hypothesis. Section 3 gives two examples that demonstrate the difference in simultaneous coverage between the MKG confidence region and our proposed confidence region when the number of control factors exceeds the number of noise variables. Section 4 provides a summary of the results.

#### 2. A Generalized Confidence Region Approach

##### 2.1. The Multiple Zero-Gradient Solution Problem

To address the multiple solution situation, the hypothesis in (4) is generalized to , where is the linear subspace representing either a unique single solution (i.e., a point) or multiple solutions (i.e., a line, or a hyperplane). In other words, the confidence region could be a collection of either points or linear subspaces (of dimension ≥1) depending on whether the solution to the equation is unique or not. Therefore, we propose to generalize the MKG confidence region to
where represents the linear subspace of the space defined by the elements of , which are solutions to . Here is a vector. has dimension , where is defined as , 0 otherwise. Therefore, when the solution to is a single point, ; otherwise, . In this section, we derive values for in (8). When and is, therefore, not a point, computation of the confidence region in (8) may appear difficult due to the replacement of *x*-points by subspaces. However, in this section, we will also show that the confidence region in (8) is equivalent to one based on pointwise gridding (of the type done for the MKG confidence region computations).

We call the confidence region given in (8) the generalized zero-gradient (GZG) confidence region. As indicated by the definition, the MKG confidence region is a special case of the GZG confidence region where is a point and . The MKG confidence region is correct, and the critical value is for , if a solution exists. (If and the MKG region is the null set, then there is statistically significant evidence that a solution does not exist.) The next question is what value should take when ? For , note that a GZG confidence region should contain the zero-gradient solution set , with probability before the experiment is performed. It is worth pointing out that when , the GZG confidence region in (8) is a simultaneous confidence region problem in that a line or hyperplane will be included in the confidence region only if all the points on the line or hyperplane satisfy the criterion in (8). Therefore, the GZG confidence region in (8) can also be expressed as where and . To find the critical value for , we need to first investigate the distribution of the test statistic when is true.

When , , which is a vector. Based on Miller’s theorem (see p65 and p113 in [13]), the critical value should be , where is the dimension of the solution set, , for the equation . Here, is the degrees of freedom of the residuals. Note that when , means that the solution is a point, which is a linear space of dimension zero. The critical value then becomes the MKG critical value because .
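One reading of the critical value described in the paragraph above, with $d$ the dimension of the solution set and $\nu$ the residual degrees of freedom, is the Scheffé-type form below. The formula is an assumption inferred from the statement that it reduces to the MKG value when $d = 0$; the numbers use $\nu = 9$ and $\alpha = 0.05$ purely for illustration.

```latex
\begin{align*}
c_{\alpha} &= (d+1)\, F_{1-\alpha;\; d+1,\; \nu}, \\
d = 0:\quad c_{0.05} &= F_{0.95;\,1,\,9} \approx 5.12, \\
d = 1:\quad c_{0.05} &= 2\, F_{0.95;\,2,\,9} \approx 2 \times 4.26 \approx 8.51 .
\end{align*}
```

Moving from a point solution to a line therefore raises the critical value noticeably, which is the source of the wider region needed for simultaneous coverage.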

For the case, the distribution of is more complex. Section 2.2 addresses the full model in (1) where the experimental design is completely orthogonal or partially orthogonal so that for some positive constant, , residual variance , and an identity matrix , of dimension of . Here, an exact simultaneous confidence region is derived. For the general case, Section 2.3 proposes a simulation method based upon the multivariate *t*-distribution to find approximate critical values with which to construct the confidence region.

##### 2.2. Full Model-Orthogonal Case

Here, we assume that the data are generated from an orthogonal design or partially orthogonal design such that . Furthermore, it is assumed that we have a full-noise-control variable interaction model, meaning that each noise variable interacts with the same set of control variables, that is, each element of interaction matrix is nonzero. If , and all the elements of the interaction matrix are nonzero, then, for , the distribution of the test statistic has the same distribution as a function of a chi-square random-variable and a random Wishart’s matrix (which are stochastically independent) as shown below: where , ~Wishart , and is the residual degrees of freedom. Here, is a identity matrix, where is the same as defined in Section 2.1. The degrees of freedom of the Wishart distribution is , and is the maximum eigenvalue of the matrix . The proof of this result is provided in Appendix A.

Using (10), the critical value, , can then be computed as the 100(1−*α*)th percentile of the distribution of , which can be obtained by simple Monte Carlo simulation from the and Wishart distributions. Some limited tables of critical values based on (10) are given in Appendix B, although computing the critical value for any specific case using the random variable in (10) is easily accomplished. Note that the critical value determined by (10) becomes the MKG critical value, , when , that is, . This is because when , has a Wishart distribution which is . In other words, is no longer a matrix but a scalar random variable. Hence, . So can then be written as
Furthermore, the same result in (10) holds when has an matrix normal distribution. (See p90-91 in [14].) (For details, see [15].)
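A minimal Monte Carlo sketch of this computation follows. It assumes that the random variable in (10) has the form "largest eigenvalue of a Wishart matrix, divided by an independent chi-square variable over its degrees of freedom", with the Wishart matrix of dimension $d + 1$, identity scale, and degrees of freedom equal to the number of noise variables $r$. That form is inferred from the special cases described in the text, not copied from the paper's equation, and the function and argument names are hypothetical.

```python
import numpy as np

def gzg_critical_value(r, d, nu, alpha=0.05, n_sim=100_000, seed=0):
    """Monte Carlo percentile of lambda_max(A) / (u / nu).

    Assumed form: A ~ Wishart_{d+1}(r, I) and u ~ chi-square(nu), independent.
    r  : number of noise variables
    d  : dimension of the zero-gradient solution set
    nu : residual degrees of freedom
    """
    rng = np.random.default_rng(seed)
    p = d + 1
    z = rng.standard_normal((n_sim, p, r))
    a = z @ z.transpose(0, 2, 1)               # one Wishart draw per simulation
    lam_max = np.linalg.eigvalsh(a)[:, -1]      # eigenvalues are returned in ascending order
    u = rng.chisquare(nu, size=n_sim)
    return float(np.quantile(lam_max / (u / nu), 1.0 - alpha))

c_point = gzg_critical_value(r=1, d=0, nu=9)    # point solution, one noise variable
c_line = gzg_critical_value(r=1, d=1, nu=9)     # line solution, one noise variable
```

With one noise variable the assumed form collapses to a multiple of an $F$ random variable, so `c_point` and `c_line` can be compared against tabulated $F$ percentiles as a sanity check.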

##### 2.3. The General Case

In some cases, the experimental design may be such that Var does not have the orthogonal form, or we may wish to use a model with some “control × noise variable” interaction terms deleted, that is, has some zero elements. In such situations, when , the distribution of does not have a simple form and may depend upon even under . Nonetheless, in such situations, it is still possible to obtain approximately conservative simultaneous confidence regions for control variables associated with zero-gradient solutions. We provide such a construction as follows.

Recall that in (9) is a function of , and consider
where and . Note that by using is an approximate upper confidence bound for the scalar-valued quantity, . (See Clarke, [16], for a discussion of confidence bounds on nonlinear functions of model parameters constructed from confidence regions.) Let denote the 100(1−* α*)th percentile of the distribution of under . Consider the confidence region defined by . This confidence region should provide (at least approximately) a conservative simultaneous confidence region for the zero-gradient solutions. However, computation of (using (12)) and the associated confidence region is numerically difficult due to the complex constraints associated with the definition of . Fortunately, it can be shown that
where . A proof is given in Appendix C. The expression for in (13) allows for much easier computation of the critical value. The actual construction of the GZG confidence region from the relevant critical value will be outlined in Section 2.4.

###### 2.3.1. The Critical Value Computation for the General Case

Note that under we can express as
where is the mean squared error, is a known matrix computed from the design matrix, and . Here, **t** follows the multivariate distribution with location parameter equal to zero, scale matrix , and degrees of freedom . Using (14) we can then compute the critical value, , using the Monte Carlo simulations as follows.

*Step 1.* Compute , where .

*Step 2.* Simulate a multivariate **t** random vector (rv) with scale matrix and *ν* df. (This can be done by simulation of a multivariate normal rv with mean vector **0** and variance-covariance matrix and a chi-square random variable with *ν* df. See [17] for details.)

*Step 3.* Compute using the expressions in (13) and (14). (For practical reasons, computation of can be done by maximization of over instead, where is a prespecified, bounded region. This will calibrate the coverage to be simultaneous only over , where is the true linear subspace such that .)

*Step 4.* Do Steps 2-3 a large number of times to estimate the 100(1−*α*)th percentile of the Monte Carlo distribution of . This 100(1−*α*)th percentile is then a Monte Carlo estimate of .
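The simulation loop in Steps 2 to 4 can be sketched as follows. The multivariate *t* draw uses the normal and chi-square construction mentioned in Step 2. The maximized statistic of Step 3 depends on (13) and (14) and is therefore passed in as a user-supplied function; `example_statistic` is a hypothetical stand-in used only to make the sketch runnable, and all names are assumptions.

```python
import numpy as np

def simulate_mvt(scale, nu, n_sim, rng):
    """Draw n_sim multivariate-t vectors as z / sqrt(u / nu),
    with z ~ N(0, scale) and u ~ chi-square(nu) drawn independently."""
    scale = np.asarray(scale, dtype=float)
    z = rng.multivariate_normal(np.zeros(scale.shape[0]), scale, size=n_sim)
    u = rng.chisquare(nu, size=n_sim)
    return z / np.sqrt(u / nu)[:, None]

def monte_carlo_critical_value(statistic, scale, nu, alpha=0.05,
                               n_sim=50_000, seed=0):
    """Steps 2-4: simulate t, evaluate the statistic, take the percentile."""
    rng = np.random.default_rng(seed)
    draws = simulate_mvt(scale, nu, n_sim, rng)
    values = np.array([statistic(t) for t in draws])
    return float(np.quantile(values, 1.0 - alpha))

def example_statistic(t):
    """Hypothetical placeholder for the maximized statistic of Step 3."""
    return float(t @ t)

crit = monte_carlo_critical_value(example_statistic, np.eye(2), nu=9)
```

In a real application the placeholder would be replaced by a routine that maximizes the statistic over the bounded region described in Step 3, for example by gridding.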

###### 2.3.2. The Coverage Rate of the Critical Value

In order to check the accuracy of as a critical value, we have done some Monte Carlo simulations of the above four-step procedure using three different noise variable models in conjunction with both orthogonal and nonorthogonal designs. The statistical models used are summarized in Table 1. These models are constructed so that the zero-gradient solution exists in the experimental region. Three partially orthogonal, face-centered central composite experimental designs were assessed, with associated statistical models 1, 2, and 3, respectively. These designs employed a coded factor space with factor levels equal to (except for the center points). The axial points in noise variables are deleted to maintain partial orthogonality. The factorial part of the designs is either full factorial (e.g., model 1 and model 2) or half factorial (e.g., model 3). The nonorthogonal designs are constructed by changing the (one) factorial point (comprised of all −1s) from () to (). The resultant sample size () and the residual df () of each composite design are both listed in Table 2. For demonstration purposes, we simply chose model parameters to be either 1 or −1, with residual error variance equal to 1.

The results of these coverage rate simulations are summarized in Table 2. The coverage rates were computed as follows. The models in Table 1 were used to compute the true space, . For each of the three models, the region, , was a hypercube constructed from the Cartesian product of intervals of the form . For each simulation, a dataset was generated based on the model and the corresponding central composite design, a critical value was computed based on the simulated dataset, then a check was done to see if the event occurred, where is the convex hull formed by the factor levels. For each simulated dataset, the critical value of was computed using 1000 Monte Carlo simulations. 5000 simulations were done to assess the simultaneous coverage of the GZG confidence region for the set . In an attempt to reduce the conservatism of the above approach for computing , we also considered the approximate approach obtained by maximizing over , where . We denote this approximate critical value by and use it in place of to reduce conservatism.

*Remark 1. *Because is a function of the data, the relatively large region, , was chosen for these simulations so that would be extremely unlikely to be empty for any simulated dataset. In addition, we did not want to rule out situations where the confidence region was outside the experimental region. While, in practice, such extrapolated inferences must be treated with caution, nonetheless it may be desired to compute such a confidence region. Such a confidence region outside the experimental region suggests that it may not be possible to obtain a “zero-gradient” solution for noise transmission, at least within the current experimental region. However, such a confidence region just outside the experimental region may offer hope that resetting process control conditions may allow for a more robust process. Of course, additional experiments outside the current experimental region would be needed to confirm this.

*Remark 2. *Maximization of over , to compute the Monte Carlo critical value, , was accomplished by using the SAS/IML Nelder-Mead simplex algorithm, nlpnms. This was done to make the Monte Carlo simulations of this Monte Carlo procedure tractable. Some limited simulations were also done whereby the maximization of over was computed by gridding instead. This was done to make sure that the Nelder-Mead algorithm did not stop its maximization prematurely. In all cases, each critical value, , computed using nlpnms, was slightly larger than that obtained using gridding. (Random number seeds were aligned to avoid the Monte Carlo differences in the comparisons between gridding and the use of the Nelder-Mead simplex algorithm.) For the approximate approach, maximization over was done by gridding as this was easier to accomplish with finer gridding.

Table 2 below displays the percent of times the event in (15) occurred for each of the three models with and without an orthogonal design. If the event in (15) occurs, then that portion of the true linear subspace, (within ), is entirely covered by the GZG confidence region; otherwise it is not.

Table 2 indicates that the simultaneous coverage rate of the GZG confidence region using the conservative critical value, , produces reasonably conservative results, while the approximate approach (that maximizes over , instead) achieves closer to nominal (yet slightly conservative) coverage rates. It is interesting to note that for each approach the coverage rate appears to be insensitive to the minor departure from orthogonality that was induced by changing the (one) factorial point (comprised of all −1s) from () to (). Such a departure from design orthogonality could happen due to a design execution error or a process restriction.

###### 2.3.3. The Full Model Nonorthogonal Case

Because computation of and requires maximization within a Monte Carlo calculation, it would be useful to assess if this can be eliminated when a full model is employed. We, therefore, conduct another simulation study to see if the critical value based upon the random variable in (10) can be used as an approximate critical value for mild departures from orthogonality. We use the same nonorthogonal designs as used in Table 2. The corresponding full-interaction models are listed in Table 3. This time the critical value was used with these nonorthogonal designs to assess the simultaneous coverage rate. The results are shown in Table 4. In order to assess the coverage rate gridding had to be done over a subset of . As a more fair comparison with the theoretical critical value, gridding was done over , (whereas before is a hypercube region composed of the Cartesian product of the intervals , instead of ). This is because the critical value associated with the random variable in (10) is computed by maximization over the whole linear subspace, .

Table 4 indicates that this minor departure from orthogonality has virtually no effect on the coverage rate of the GZG confidence region when the more convenient critical value is used. For more radical departures from orthogonality, it may be safer to use the conservative critical value, but further robustness studies are needed to ascertain how well the more convenient critical value works under departures from its assumptions.

##### 2.4. Computation of Simultaneous Confidence Region

For the case, once we have computed the critical value, the confidence region can be computed by searching linear subspaces, , that satisfy the condition as defined in (9). However, searching over various lines or hyperplanes that span an experimental region is more computationally difficult than searching the same experimental region in a pointwise fashion. Fortunately, it can be shown that, for any given critical value, the GZG confidence region can be computed by pointwise gridding. This is because for in (7) and in (8), with the same critical value, . A proof is provided in Appendix D. This equivalency shows that one can construct the GZG confidence region by simply gridding over the experimental region in a pointwise fashion.
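The pointwise gridding described above can be sketched for the simplest setting: one noise variable, two control variables that interact with it, and an orthogonal two-level design, so that each coefficient estimate has variance equal to the error variance divided by the number of runs. All numeric inputs below are illustrative assumptions rather than values taken from the paper's equations: the slope coefficients, the mean squared error, the run count, and the two critical values (roughly $F_{0.95;1,9}$ and $2F_{0.95;2,9}$). The function names are hypothetical.

```python
import numpy as np

def slope_statistic(x2, x3, coef, mse, n_runs):
    """Squared estimated noise slope divided by its estimated variance.

    coef = (g0, g2, g3): assumed noise main effect and the two
    control-by-noise interaction estimates.
    """
    g0, g2, g3 = coef
    slope = g0 + g2 * x2 + g3 * x3
    var_slope = (mse / n_runs) * (1.0 + x2**2 + x3**2)   # orthogonal design
    return slope**2 / var_slope

def grid_region(coef, mse, n_runs, crit, n_grid=201, lo=-1.0, hi=1.0):
    """Boolean mask of grid points whose statistic does not exceed crit."""
    axis = np.linspace(lo, hi, n_grid)
    x2, x3 = np.meshgrid(axis, axis, indexing="ij")
    return axis, slope_statistic(x2, x3, coef, mse, n_runs) <= crit

coef = (10.81, -9.06, 8.31)                               # illustrative values only
axis, mkg_mask = grid_region(coef, mse=21.12, n_runs=16, crit=5.12)
_, gzg_mask = grid_region(coef, mse=21.12, n_runs=16, crit=8.51)
```

Because both masks come from the same pointwise statistic, the region built with the larger critical value always contains the one built with the smaller value, and plotting the two masks shows how much the band widens.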

#### 3. Examples

##### 3.1. One Noise Variable

This example is from Myers et al. [6] and was originally taken from Montgomery [18] (2009, page 231). The data were generated from a $2^4$ factorial experiment, with a total of 16 observations, run in a pilot plant to explore the factors that could affect the filtration rate of a chemical bonding substance. The goal is to maximize the filtration rate, $y$.

As in Myers et al. [6], one of the four factors, temperature, is assumed difficult to control at large scale and, therefore, treated as a noise variable . The rest of the factors are control variables: : pressure, : concentration, : stirring rate. The fitted model is with mean square error equal to 21.12 and residual df equal to 9.

The estimated slope of noise variable is Therefore, and . The solution to the null hypothesis is a line. Then the general critical value (based on Miller’s Theorem (1981) [13]) should be used to calculate the GZG confidence region (as shown in Figure 1). The GZG confidence region in Figure 1 is clearly wider than the MKG confidence region in Myers et al. (1997, Figure 2 in [6]) where is used as the critical value. It is clear from Figure 1 that we are at least 95% confident that the zero-gradient locus of points passes through the experimental region.

Next, we run simulations to compare the coverage rates of the GZG and MKG confidence regions. Since the true optimum is not known in practice, we calculate the coverage rate for the solution to , using a simulation model equal to the fitted model in (16) with . Note that the true solution in this example is a line of infinite length, but the simulation is done only for the portion of the line within the experimental region, that is, of the control variables.

Using 100,000 Monte Carlo simulations, the simultaneous coverage rate of the GZG confidence region for all of the zero-gradient solutions in the experimental region is 97%, while the MKG confidence region has only a 92% coverage rate. The MKG confidence region has a lower coverage rate because it was designed to contain the true optimum only when that optimum is a single point. Although the GZG confidence region is designed to contain all the true solutions (which could be a point, a line, or a hyperplane), the simulated coverage rate tends to exceed the nominal coverage rate because the simulation is done within a finite range of the control variables, while the line or hyperplane has an infinite range in theory.
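A coverage comparison of this kind can be sketched under simplifying assumptions. The sketch treats a set of slope coefficients as the truth, draws coefficient estimates with independent errors (as an orthogonal 16-run design would give), draws the mean squared error from a scaled chi-square distribution with 9 df, and checks whether every point of the true zero-gradient line inside the square of coded levels from −1 to 1 falls in the region. The coefficients and the two critical values are illustrative assumptions, not values taken from the paper's equations, and the function name is hypothetical.

```python
import numpy as np

def simultaneous_coverage(coef, sigma2, n_runs, nu, crit,
                          n_sim=20_000, n_pts=200, seed=0):
    """Fraction of simulated data sets whose region covers the whole
    true zero-gradient line segment inside [-1, 1]^2."""
    rng = np.random.default_rng(seed)
    g0, g2, g3 = coef
    # Points on the true line g0 + g2*x2 + g3*x3 = 0 that lie in the square.
    x2 = np.linspace(-1.0, 1.0, n_pts)
    x3 = -(g0 + g2 * x2) / g3
    keep = np.abs(x3) <= 1.0
    pts = np.column_stack([np.ones(keep.sum()), x2[keep], x3[keep]])
    # Estimation errors of the three slope coefficients and simulated MSE.
    err = rng.standard_normal((n_sim, 3)) * np.sqrt(sigma2 / n_runs)
    mse = sigma2 * rng.chisquare(nu, size=n_sim) / nu
    slope_hat = err @ pts.T                     # the true slope is zero on the line
    var_unit = (pts**2).sum(axis=1) / n_runs
    stat = slope_hat**2 / (mse[:, None] * var_unit[None, :])
    return float(np.mean(stat.max(axis=1) <= crit))

coef = (10.81, -9.06, 8.31)                     # illustrative values only
cover_mkg = simultaneous_coverage(coef, 21.12, 16, 9, crit=5.12)
cover_gzg = simultaneous_coverage(coef, 21.12, 16, 9, crit=8.51)
```

Both calls reuse the same random seed, so the comparison between the two critical values is not blurred by Monte Carlo noise.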

##### 3.2. Two Noise Variables

This example comes from a face-centered central composite design with the factorial part being a half-fractional factorial design (see details in [19]). The objective of this study is to find the optimized condition that maximizes the yield of diacylglycerol oil, which is a natural component of various edible oils and has shown some beneficial effects as compared to the traditional triacylglycerol oil.

Five factors were studied in this experiment: reaction time (RTIME), enzyme load (ENZL), reaction temperature (RTEMP), water content (WATC), and substrate molar ratio (SUBR). Water content (WATC) is difficult to control at large scale [19] and, therefore, treated as a noise variable. For illustration purposes, substrate molar ratio is also treated as a noise variable, and the axial points corresponding to the noise variables are excluded from the analysis to obtain partial design orthogonality with respect to the noise variables (i.e., to ensure ). The final model in coded factor value is as follows: where : RTIME, : ENZL, : RTEMP, : WATC, : SUBR. Here, the residual mean squared error is equal to 2.56 with 25 observations and residual df equal to 9.

Since , the solution to the null hypothesis is a line in a 3-dimensional space determined by control variables . Therefore, the confidence region for this line is a tube in this 3-dimensional space. A 95% GZG confidence region is shown in Figure 2. Based on (10), the GZG critical value is obtained via and Wishart’s distribution. From Figure 2, we can see that while this confidence region does not provide statistically significant evidence that the zero-gradient locus of points passes through the experimental region, it does appear that a good portion of the confidence region is within the experimental region, and hence, attainment of near-zero-gradient conditions should be feasible for this process.

As with the previous example, we compare the coverage rates of the GZG and MKG confidence regions using the fitted model as the true population model. Using 100,000 Monte Carlo simulations (based upon the fitted model in (18) with ), the simultaneous coverage rate of the GZG confidence region is 96%, while it is only 90% for the nominal 95% MKG confidence region. (Here, gridding was done over the cube formed by the Cartesian product of associated with each .)

#### 4. Summary

This paper shows that when the number of control variables does not exceed the number of noise variables, the MKG approach provides a confidence region for control variables associated with a zero-gradient for noise transmission. Otherwise, the MKG approach results in a confidence region that is too small for simultaneous coverage of the linear subspace of zero-gradient solutions. It is important to know that the true optimal condition represented by control variables is either a line or a hyperplane instead of a single point when . In this situation, constructing a simultaneous confidence region about the linear subspace solution is desirable in that a subspace of solutions provides the investigator with many options for setting the zero-gradient control level. Of course a confidence region also provides the experimenter with a measure of uncertainty for the optimal solution. If the confidence region is too large, further experimental runs may be needed to make more accurate inferences. If the current manufacturing set point is outside of the confidence region, this provides statistically significant evidence that reconfiguration of the set point may help improve process variability by lowering the transmission of noise through the system. The GZG confidence region for the zero-gradient conditions is proposed and is shown to provide nominal or reasonably conservative coverage rates for many noise variable experiments that occur in practice.

In the situation where there are many noise variables, it may be either costly or difficult to study all the noise effects. One way to deal with this problem is to combine the multiple noise factors into one compound noise factor with two extreme conditions as its two levels (provided certain assumptions can be satisfied). See [1, 20, 21] for discussion. If the noise factors can be combined into one compound noise factor, then we could have a situation where the number of control variables is greater than the number of noise variables. The GZG approach is directly applicable for this situation. In some cases, however, one may desire to create a compound noise factor, with more than two levels [22], or two or more compound noise factors. In either case, as long as the predictive model is in the form of (1), the GZG approach is applicable.

The GZG confidence region provides inferences about the optimal control point or points that yield a zero-gradient for the transmission of variability from the noise variables. However, there are situations where the control points corresponding to zero noise variance are either outside the experimental region or simply do not exist. In this case, it would still be useful to find the constrained optimal control point for minimum noise variance over the experimental region. A method to further generalize the confidence region for the constrained optimal point for minimum noise variance is needed and is currently under development.

#### Appendices

#### A. Proof of (10)

We will prove the result for the no-intercept case where in the hypothesis . It then follows that the result can be generalized to the intercept case where .

*Part 1. *For the no-intercept case:

When , the solution, , to can be expressed as the linear combination of the basis vectors for the linear subspace , that is, , where is a coefficient vector, and is a matrix where the rows consist of the basis vectors of linear subspace . (Here, and only **L** is a function of .) Then . Let , then and , where is defined as
where , and is the element on the row of the matrix . Since the vector , by the definition of Wishart distribution (see p92 in [23]). Let be the sample estimate of the residual variance , then the test statistic becomes

By replacing by **z** in (A.2), becomes
where and is a scalar. (For a proof of (A.3), see the result in Problem 22.1 in Rao 1973 [24, p74].)

By the definition of an eigenvalue, it follows that Note is positive definite and symmetric, and, therefore, is positive definite and symmetric as well. Therefore, both and exist.

Now multiply both sides of (A.4) by from the left, then multiply both sides by from the right, we get Let , then . Therefore, where follows distribution.

Next, we show that **A**~Wishart . Because ~Wishart and is a matrix with rank , by the property of Wishart distribution (see the theorem in Rencher 1998 [23, page 56]), ~Wishart . Note that rank rank rank. Since is symmetric and positive definite, then is also symmetric and positive definite, and its rank is . Hence, ~Wishart .

*Part 2. *For the intercept case:

The intercept case can be proved using the same arguments as indicated by Miller ([13, page 113]). Note that, in the intercept case, the rank of is . Hence, .

#### B. Critical Values Based on (10)

See Table 5.

#### C. Proof of (13)

Note that Adapting the proof of Theorem 2.1 in Peterson et al. [25], it follows that for any critical value, , Since is not a function of , it follows directly that So (13) follows directly from (C.1) and (C.3).

#### D. The Proof That

Recall that . By definition, implies that . Next, we will show that if , then . This can be proved by contradiction. Suppose that there exist some s such that , but . Then there exists at least one point, say , in such that there is no linear subspace that satisfies the two conditions: (1) it contains , (2) for every point in this subspace . Define and consider the set, . It follows using a proof analogous to that in Theorem 2.1 in Peterson et al. [25] that . Therefore, . This then implies that there exists a such that . If , then there exists a line or hyperplane such that all the s that satisfy are in . Hence, . Again using the proof of Theorem 2.1 in Peterson et al. [25], it follows that for any given point , if and only if for some . Therefore, for all the in . In other words, satisfies the above two conditions, which is a contradiction.