Generalized Inferences about the Mean Vector of Several Multivariate Gaussian Processes

Ibarrola, Pilar; Vélez, Ricardo

doi:https://doi.org/10.1155/2015/479762

Journal of Probability and Statistics

Research Article | Open Access

Volume 2015 | Article ID 479762 | https://doi.org/10.1155/2015/479762

Academic Editor: Dejian Lai

Received 22 Jun 2015

Accepted 07 Oct 2015

Published 29 Oct 2015

Abstract

We consider the problem of comparing the means of several multivariate Gaussian processes. It is assumed that the means depend linearly on an unknown vector parameter and that nuisance parameters appear in the covariance matrices. More precisely, we deal with the problem of testing hypotheses about this parameter, as well as obtaining confidence regions for it. Both methods are based on the concepts of generalized p-value and generalized confidence region, adapted to our context.

1. Introduction

Generalized p-values for testing statistical hypotheses in the presence of nuisance parameters were introduced by Tsui and Weerahandi (1989) [1], where the univariate Behrens-Fisher problem, as well as other examples, is considered in order to illustrate the usefulness of this approach. Afterwards, Weerahandi (1993) [2] introduced generalized confidence intervals.

In 2004, Gamage et al. [3] developed a procedure based on generalized p-values to test the equality of the mean vectors of two multivariate normal populations with different covariance matrices. They also constructed a confidence region for the difference of the means, using the concept of generalized confidence regions. Finally, by means of the generalized p-value approach, they obtained a solution for the heteroscedastic MANOVA problem, but without reaching the desirable invariance property.

In 2007, Lin et al. [4] considered the generalized inferences on the common mean vector of several multivariate normal populations. They obtained a confidence region for the common mean vector and simultaneous confidence intervals for its components. Their method is numerically compared with other existing methods, with respect to the expected area and coverage probabilities.

In 2008, Xu and Wang [5] considered the problem of comparing the means of several populations with heteroscedastic variances. They provided a new generalized p-value procedure for testing the equality of means, assuming that the variables are univariate and normally distributed. Numerical results show that their generalized p-value test works better than a generalized F-test. We will set out our MANOVA problem as a generalization of their framework.

In 2012, Zhang [6] considered general linear hypothesis testing (GLHT) in a heteroscedastic one-way MANOVA. The multivariate Behrens-Fisher problem is a special case of GLHT.

In this paper we first consider generalized inference for the case of two continuous time Gaussian processes. Later, the results are extended to several such processes. In both cases, for the testing problem, the main step is constructing a generalized test process and analyzing the associated generalized p-value, proving some linear invariance properties.

With respect to the construction of generalized confidence regions, one should use a generalized pivotal quantity and use the approach of multiple comparisons as in [4].

Finally, along the lines of Zhang [6], we consider general linear hypothesis testing (GLHT) as a generalization of the MANOVA, adapting the setting and method of this paper.

It must be emphasized that all the references above develop these techniques for discrete univariate or multivariate models, whereas here we are concerned with a continuous time model. It is well known that when the underlying phenomenon is in essence continuous, even if it is observed at a sequence of epochs , different models may be necessary for distinct values of . On the contrary a continuous time model embodies simultaneously all the statistical properties of the time series obtained for each value of .

2. Continuous Time Generalized Tests and Confidence Regions

Let be a -dimensional stochastic process with distribution depending on the unknown parameter , being the vector of parameters of interest and a nuisance parameter vector. For any random vector , will denote its observed value.

For the problem of testing a null hypothesis against the alternative , where is a given vector (inequalities like should be understood componentwise), a generalized test process is defined, following [3], as follows.

Definition 1. A generalized test process is, for each , a one-dimensional function depending on and its observed value , as well as the parameter value , satisfying the following:
(1) the distribution of does not depend on , for any fixed ,
(2) the observed value does not depend on ,
(3) is nondecreasing in every component of , for any and any fixed and .

Under the above conditions, the generalized p-value is defined as

When testing versus , condition (3) must be replaced by
(3′) is stochastically larger under than under , for any fixed and .

In this case, the generalized p-value is given by

Towards the confidence estimation of , we give the following definition.

Definition 2. A generalized pivotal quantity is, for each , a one-dimensional function satisfying the following:
(1) the distribution of does not depend on nor ,
(2) the observed value does not depend on .

Then, if are such that is a generalized confidence region for .
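As an illustration of Definitions 1 and 2 in the simplest classical setting, the following sketch treats the univariate Behrens-Fisher problem of [1] for two discrete samples, not the continuous time processes studied here. It assumes normal samples with unknown and unequal variances; the generalized test variable is Z² (s1²(n1−1)/(n1 U1) + s2²(n2−1)/(n2 U2)), with Z standard normal and U1, U2 independent chi-square variables, and its observed value is the squared difference of the sample means. The function name and its arguments are hypothetical, chosen only for this example.

```python
import math
import random

def behrens_fisher_generalized(x1, x2, n_sim=20000, alpha=0.05, seed=0):
    """Monte Carlo generalized p-value for H0: mu1 = mu2 and a generalized
    confidence interval for mu1 - mu2 (univariate, unequal variances)."""
    rng = random.Random(seed)
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    # Observed unbiased sample variances; they stay fixed during simulation.
    s1 = sum((v - m1) ** 2 for v in x1) / (n1 - 1)
    s2 = sum((v - m2) ** 2 for v in x2) / (n2 - 1)
    d = m1 - m2
    t_sample, r_sample = [], []
    for _ in range(n_sim):
        z = rng.gauss(0.0, 1.0)
        u1 = rng.gammavariate((n1 - 1) / 2, 2.0)  # chi-square, n1 - 1 d.f.
        u2 = rng.gammavariate((n2 - 1) / 2, 2.0)  # chi-square, n2 - 1 d.f.
        scale = s1 * (n1 - 1) / (n1 * u1) + s2 * (n2 - 1) / (n2 * u2)
        t_sample.append(z * z * scale)             # generalized test variable
        r_sample.append(d - z * math.sqrt(scale))  # generalized pivotal quantity
    # Generalized p-value: probability of exceeding the observed value d^2.
    p_value = sum(t >= d * d for t in t_sample) / n_sim
    r_sample.sort()
    lower = r_sample[int(n_sim * alpha / 2)]
    upper = r_sample[int(n_sim * (1 - alpha / 2)) - 1]
    return p_value, (lower, upper)

if __name__ == "__main__":
    a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
    b = [3.9, 4.4, 3.1, 4.8, 2.9, 4.0, 3.6]
    print(behrens_fisher_generalized(a, b))
```

The distribution of both simulated quantities is free of the unknown means and variances, and their observed values depend on the data only, which is the pattern required by conditions (1) and (2) of the definitions.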

3. Estimation Method

In a previous paper [7] we considered the confidence estimation of a -dimensional parameter , when observing a continuous time -dimensional Gaussian process , with covariance function and mean function , where and are known matrices, but and are unknown parameters. More concrete assumptions were specified by Ibarrola and Vélez [7], and we suppose here that they hold for all the processes considered.

The estimation method of , described in [7], is based on the estimator where is the -matrix with columns in satisfying the equation and is given by As proved in [7], does not depend on and constitutes a Gaussian process such that Consequently, and are independent if , while Moreover, is a mean square consistent estimator of , since all the eigenvalues of the covariance matrix converge to 0.

In order to estimate , if are arbitrarily chosen so that is nonsingular, we can consider the random variable such that has a distribution.

4. The Behrens-Fisher Problem

We first consider the case of two independent Gaussian stochastic processes, and , of dimensions and , respectively, and with similar characteristics. More precisely, where and , are unknown parameters, while , and , are given matrices of appropriate dimensions.

We will focus on the Behrens-Fisher type problem of comparing the parameters and and, more concretely, our aim is to make inferences about , based on the progressive observation of both processes and .

The progressive estimators and of and can be constructed according to (4), (5), and (6). For , and will denote the characteristics defined in (6), (7) for and , respectively. We thus obtain the unbiased estimator of : which is normally distributed, satisfies , and, since and are independent, has covariance In order to estimate and , we can take where and are such that and are nonsingular. In this way and are independent, -distributed random variables, independent of .

4.1. Generalized p-Value for the Behrens-Fisher Problem

We consider the problem of testing against the alternative and we look for a generalized test process .

The covariance matrix of is and can be estimated by means of Recall that , , and represent the observed values, obtained when and are replaced with and , and let us define Under , has distribution , so that the distribution of is . Moreover does not depend on and its distribution is independent of the parameters, since and are independent, distributed random variables, independent of . The observed value of is . Finally, is a one-dimensional random variable with distribution not depending on , whose observed value is , which also does not depend on .

Under , when , as has distribution , the distribution of is , with . Note that, given and , since is a positive definite matrix, is stochastically larger under than under . So, since conditions (1), (2), and (3′) of Definition 1 are verified, we have proved the following result.

Proposition 3. is a generalized test process for testing against .

In order to simplify the expression of , we will put which are positive definite matrices such that . Then where, under , has a -multivariate Student’s t-distribution with degrees of freedom and is a random variable with a Beta distribution, which is independent of .

Thus the generalized p-value of the given test: may be calculated once is observed.
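The appearance of a Beta variable alongside a Student vector rests on a standard fact: if U1 and U2 are independent chi-square variables with n1 and n2 degrees of freedom, then B = U1/(U1 + U2) has a Beta(n1/2, n2/2) distribution and is independent of U1 + U2. The sketch below checks this numerically; the degrees of freedom 4 and 6 are arbitrary illustrative values, not those of the paper.

```python
import math
import random

def beta_and_total(n1, n2, n_sim=20000, seed=0):
    """Simulate B = U1 / (U1 + U2) and the total U1 + U2 for independent
    chi-square variables U1, U2 with n1 and n2 degrees of freedom."""
    rng = random.Random(seed)
    b_values, totals = [], []
    for _ in range(n_sim):
        u1 = rng.gammavariate(n1 / 2, 2.0)  # chi-square with n1 d.f.
        u2 = rng.gammavariate(n2 / 2, 2.0)  # chi-square with n2 d.f.
        b_values.append(u1 / (u1 + u2))
        totals.append(u1 + u2)
    return b_values, totals

def mean(values):
    return sum(values) / len(values)

def correlation(x, y):
    """Sample Pearson correlation of two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    b_values, totals = beta_and_total(4, 6)
    # The Beta(n1/2, n2/2) mean is n1 / (n1 + n2); the correlation with the
    # total should be close to zero because the two are independent.
    print(mean(b_values), 4 / (4 + 6), correlation(b_values, totals))
```

Dividing a standard normal vector by the square root of (U1 + U2)/(n1 + n2) then gives a multivariate Student vector that is independent of B, which is why the statistic can be simulated from one Student vector and one Beta variable.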

4.2. Invariance Properties of the Generalized p-Value

Let us assume that the basic processes are transformed by means of where , are nonsingular square matrices of order and , respectively. The characteristics of the transformed processes are where and .

For , the solution of (5) with the new characteristics will be related to by means of . Therefore, according to (4) and (6), we get so that , , and .

The same results hold for yielding , as well as . In this sense the generalized test process and the corresponding generalized p-value are invariant under the proposed transformations.

4.3. Generalized Confidence Region for

For any value of the unknown parameter , the difference has distribution and is . Hence has the same distribution as under , which is independent of all the parameters, while its observed value does not depend on the nuisance parameters . Since conditions (1) and (2) of Definition 2 are satisfied, the following result is established.

Proposition 4. is a generalized pivotal quantity and is a generalized confidence region for , whenever .

According to (20), once is observed, the constant may be determined and the confidence region for is the ellipsoid in : which is centered at and has axes in the direction of each eigenvector of of length , where is the corresponding eigenvalue. Thus the -dimensional volume of the confidence region is
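The volume of such an ellipsoid follows from its semi-axes. As a sketch, assume the region is written as {x : (x − d)ᵀ M⁻¹ (x − d) ≤ c} for a positive definite matrix M with eigenvalues λ_i (this convention is an assumption of the example; the paper's matrix may enter through its inverse). The semi-axes are then √(c λ_i) and the volume is the volume of the unit ball in dimension p times their product.

```python
import math

def ellipsoid_volume(eigenvalues, c):
    """Volume and semi-axes of {x : (x - d)' M^{-1} (x - d) <= c}, where
    `eigenvalues` are the eigenvalues of the positive definite matrix M."""
    p = len(eigenvalues)
    # Volume of the unit ball in dimension p.
    unit_ball = math.pi ** (p / 2) / math.gamma(p / 2 + 1)
    semi_axes = [math.sqrt(c * lam) for lam in eigenvalues]
    return unit_ball * math.prod(semi_axes), semi_axes

if __name__ == "__main__":
    # Two-dimensional example with semi-axes 2 and 3: the area is 6 * pi.
    volume, axes = ellipsoid_volume([4.0, 9.0], c=1.0)
    print(volume, axes)
```

The center of the ellipsoid does not affect the volume, so it is not an argument of the function.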

Simultaneous confidence intervals for the components of can be obtained from the following consequence of the Cauchy-Schwarz inequality: satisfies if and only if for all . Thus, with and , the above equivalence allows us to express and therefore This last set provides simultaneous confidence intervals for the components of with confidence level greater than .
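The projection argument can be checked numerically. Under the assumed convention that the region is {x : (x − d)ᵀ M⁻¹ (x − d) ≤ c}, the Cauchy-Schwarz inequality gives |aᵀ(x − d)| ≤ √(c aᵀMa) for every vector a, so taking a as a coordinate vector yields the interval d_i ± √(c M_ii) for the i-th component. The sketch below builds these intervals and verifies, for a hypothetical 2×2 matrix, that boundary points of the ellipse stay inside the resulting box and touch its sides.

```python
import math

def simultaneous_intervals(center, m_diagonal, c):
    """Component intervals d_i +/- sqrt(c * M_ii) enclosing the ellipsoid."""
    return [(d - math.sqrt(c * v), d + math.sqrt(c * v))
            for d, v in zip(center, m_diagonal)]

def boundary_points_2d(center, m, c, n_points=360):
    """Points with (x - center)' M^{-1} (x - center) = c for a 2x2 matrix M,
    generated as center + sqrt(c) * L w with M = L L' and |w| = 1."""
    l11 = math.sqrt(m[0][0])
    l21 = m[0][1] / l11
    l22 = math.sqrt(m[1][1] - l21 ** 2)
    points = []
    for k in range(n_points):
        angle = 2 * math.pi * k / n_points
        w1, w2 = math.cos(angle), math.sin(angle)
        points.append((center[0] + math.sqrt(c) * l11 * w1,
                       center[1] + math.sqrt(c) * (l21 * w1 + l22 * w2)))
    return points

if __name__ == "__main__":
    center = (1.0, -2.0)
    m = [[2.0, 0.6], [0.6, 1.0]]  # hypothetical positive definite matrix
    c = 5.99
    box = simultaneous_intervals(center, [m[0][0], m[1][1]], c)
    points = boundary_points_2d(center, m, c)
    print(box, max(p[0] for p in points), max(p[1] for p in points))
```

Because the box contains the whole ellipsoid, its joint coverage is at least that of the region, which is why the simultaneous intervals are conservative.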

4.4. An Example with Simulation Results

In order to evaluate the performance of the proposed confidence region and confidence intervals, we will analyze a particular situation that allows accomplishing simulation studies.

The considered problem is the case of two Wiener processes, and , with The estimators defined in (4) together with their covariance matrices have been determined in [8]: while Similar results hold for and . We will take , so that Since and are positive definite symmetric matrices, there exists a nonsingular matrix that simultaneously diagonalizes them: and therefore .

The random variable has distribution with and the confidence region (26) may be written as Thus, the basic case corresponds to and , from which the confidence region for more general cases may be constructed. With this simple choice, the matrix in (19) is the diagonal matrix and (20) may be written as

In the following algorithms we suppose that the values of the dimension and the variance parameters: , , and are given. The values of , , and are also fixed.

Algorithm 5.
(1) Generate the independent variables , with a distribution.
(2) Compute and , and subsequently, for each ,
(3) Fixing a large number of iterations , for each ,
(a) generate with a -variate t-distribution with degrees of freedom;
(b) generate with a Beta distribution;
(c) compute
(4) Determine as the percentile of the sample .

The obtained value of may be validated by generating a new sample in which the proportion of less than may be estimated. Table 1 shows the results obtained by means of a MATLAB program, using the data , , , for , , , and .
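The percentile step and its validation can be sketched as follows. This is not the authors' MATLAB program, and the statistic is an assumed stand-in for the diagonal case: component i of the difference estimator is taken to have variance σ1² a_i + σ2² b_i with known a_i, b_i, and n_j s_j²/σ_j² is taken to be chi-square with n_j degrees of freedom. The simulated quantity is T = Σ_i Z_i² (n1 s1² a_i/U1 + n2 s2² b_i/U2) / (s1² a_i + s2² b_i). All names and numerical values below are hypothetical.

```python
import random

def draw_statistic(rng, a, b, s1_sq, s2_sq, n1, n2):
    """One draw of the assumed generalized statistic T in the diagonal case."""
    u1 = rng.gammavariate(n1 / 2, 2.0)  # chi-square with n1 d.f.
    u2 = rng.gammavariate(n2 / 2, 2.0)  # chi-square with n2 d.f.
    total = 0.0
    for a_i, b_i in zip(a, b):
        z = rng.gauss(0.0, 1.0)
        # Variance with each sigma_j^2 replaced by n_j * s_j^2 / U_j.
        generalized_var = n1 * s1_sq * a_i / u1 + n2 * s2_sq * b_i / u2
        observed_var = s1_sq * a_i + s2_sq * b_i
        total += z * z * generalized_var / observed_var
    return total

def critical_value(a, b, s1_sq, s2_sq, n1, n2, alpha=0.05, n_sim=20000, seed=0):
    """Empirical (1 - alpha) percentile of the simulated statistic."""
    rng = random.Random(seed)
    sample = sorted(draw_statistic(rng, a, b, s1_sq, s2_sq, n1, n2)
                    for _ in range(n_sim))
    return sample[round((1 - alpha) * n_sim) - 1]

def validate(c, a, b, s1_sq, s2_sq, n1, n2, n_sim=20000, seed=1):
    """Proportion of a fresh sample that does not exceed c."""
    rng = random.Random(seed)
    hits = sum(draw_statistic(rng, a, b, s1_sq, s2_sq, n1, n2) <= c
               for _ in range(n_sim))
    return hits / n_sim

if __name__ == "__main__":
    a, b = [1.0, 2.0, 0.5], [0.5, 1.0, 3.0]
    c = critical_value(a, b, 1.3, 0.8, 10, 12)
    print(c, validate(c, a, b, 1.3, 0.8, 10, 12))
```

Drawing the normal and chi-square variables directly is equivalent to drawing a Student vector and a Beta variable, since the Beta variable is the ratio U1/(U1 + U2) and the Student vector absorbs the total U1 + U2.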

The next algorithm is designed to obtain the expected volume of the generalized confidence region and the coverage probability of a given vector .

Algorithm 6. Given a large number of iterations , for each ,
(1) generate a -dimensional vector with independent components, such that has a distribution for each ;
(2) generate and as in Algorithm 5;
(3) compute where the coefficients are the terms of the diagonal matrix ;
(4) use the value given in Algorithm 5 to compute the estimated coverage probability of as the proportion of less than ;
(5) according to (27) compute the volume and the estimated expected volume .

Taking for instance and with the remaining data as before, the obtained coverage probability and estimated expected volume are shown in Table 2.
Finally an algorithm may be designed in order to simulate the simultaneous confidence intervals for the components of and to estimate the joint coverage probability.

Algorithm 7. Given a large number of iterations , for each the following hold.
(1) For each , compute, according to the results of the previous algorithms, the extremes of the intervals , where
(2) Compute the estimated joint coverage probability of as the proportion of iterations such that for all .
(3) Compute that gives the expected length of the interval .

With the same data as before and the same vector , the obtained results are shown in Table 3. Let us observe that the coverage probability of the generalized confidence region always exceeds the confidence level and that the coverage probability of the confidence intervals is always very close to 1. Taking alternative parameter values, some other simulations have been made with similar results.

5. Inferences about the Vector Means of Several Independent Gaussian Processes

As a generalization we now consider the case of independent multivariate Gaussian processes with dimensions and time parameters varying in some interval . We assume, for each , the following: where and are unknown parameters.

5.1. Testing the Equality of Means

We first consider a test of the null hypothesis , or with for .

According to the method of Section 3, we may consider the estimators of the parameters , whose distribution is , and also the differences for with distributions .

The -dimensional vector will have a multivariate normal distribution with vector mean and covariance matrix where is the Kronecker product (composed by a column of copies of the identity matrix ).

For each , given such that is nonsingular, let which are independent random variables, independent also of and such that has distribution.

The covariance matrix may be estimated by means of with observed value As in (17), the matrix does not depend on any parameter and is independent of ; moreover its observed value is Let us define that have, under , distributions and , respectively. So that, finally, are scalars not depending on the parameters and the distribution of neither depends on . Moreover, is a positive definite quadratic form in whose distribution under is with . Hence is stochastically larger under than under and we get the following.

Proposition 8. is a generalized test process for testing against .

An alternative expression will be more suitable for the calculation of the p-values. For each , let be the square matrix obtained by replacing in the matrix (46) the block with while the remaining are replaced with (of order ); that is, Now satisfy and allow expressing Consequently where has a -multivariate Student t-distribution, with degrees of freedom, while is such that has a Dirichlet distribution with parameters and is independent of .

After the observation of , the generalized p-value of the given test is
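The Dirichlet vector generalizes the Beta variable of the two-process case: if U_1, ..., U_k are independent chi-square variables with n_1, ..., n_k degrees of freedom, the normalized vector (U_1, ..., U_k)/(U_1 + ... + U_k) has a Dirichlet distribution with parameters n_1/2, ..., n_k/2 and is independent of the total. A minimal sketch of this construction, with hypothetical degrees of freedom, is:

```python
import random

def dirichlet_from_chi_squares(dfs, rng):
    """Return (weights, total): weights = U / sum(U) is Dirichlet with
    parameters dfs[j] / 2, and total = sum(U) is chi-square with sum(dfs)
    degrees of freedom, independent of the weights."""
    u = [rng.gammavariate(n / 2, 2.0) for n in dfs]  # chi-square draws
    total = sum(u)
    return [v / total for v in u], total

if __name__ == "__main__":
    rng = random.Random(0)
    dfs = [4, 6, 10]  # illustrative degrees of freedom
    n_sim = 20000
    sums = [0.0] * len(dfs)
    for _ in range(n_sim):
        weights, _ = dirichlet_from_chi_squares(dfs, rng)
        for j, w in enumerate(weights):
            sums[j] += w
    # The Dirichlet means are n_j / sum(n), here 0.2, 0.3 and 0.5.
    print([s / n_sim for s in sums])
```

As in the two-process case, the total can be combined with a standard normal vector to form the Student vector, leaving the Dirichlet weights independent of it.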

As in Section 4.2, an analogous property of invariance of the test statistic may be stated with respect to the transformations of each process for nonsingular matrices .

5.2. Confidence Region for

For any value of the unknown parameter , the difference has distribution and is . Hence has the same distribution as under , which is independent of all the parameters, while its observed value does not depend on the nuisance parameters. The following proposition is thus established.

Proposition 9. is a generalized pivotal quantity and a generalized confidence region for is given by , if .

As observed in Section 4.3, the just obtained confidence region may be written as For and , let represent the vector in whose only nonzero term is 1 in the position, whereas for . Then And from (51) we get Therefore, as in Section 4.3, is contained in the intersection of all the intervals: for and . This gives a set of simultaneous confidence intervals for all the components of the differences between the mean parameters of the processes.

5.3. General Linear Hypotheses

When the null hypothesis is rejected, one can focus on testing a linear relation between the mean parameters. That is, if , is an matrix, and is a vector in , the objective is to test the null hypothesis against the alternative .

We denote which has distribution , where . Now, has distribution, whose mean vector is under . Moreover may be estimated by which has observed value . Observing that, under , has distribution , we get that has distribution. Therefore, if then is a one-dimensional random variable with distribution, under , independent of the parameters and whose observed value, given by , does not depend on the parameters. Since is a positive quadratic form in the normal vector whose mean under is different from 0, is stochastically larger under than under , so that the following proposition is proved.

Proposition 10. is a generalized test process for the contrast of against .

With the same matrix defined in (56), one can put that satisfy . Since where are mutually independent, distributed and independent of , we get with , given by (60), distributed according to a multivariate Student distribution and such that has a Dirichlet distribution with parameters and is independent of . Thus, the generalized p-value of the given test can be calculated as in (61), although the expressions of are now different.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] K. Tsui and S. Weerahandi, “Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters,” Journal of the American Statistical Association, vol. 84, no. 406, pp. 602–607, 1989.
[2] S. Weerahandi, “Generalized confidence intervals,” Journal of the American Statistical Association, vol. 88, no. 423, pp. 899–905, 1993.
[3] J. Gamage, T. Mathew, and S. Weerahandi, “Generalized p-values and generalized confidence regions for the multivariate Behrens-Fisher problem and MANOVA,” Journal of Multivariate Analysis, vol. 88, no. 1, pp. 177–189, 2004.
[4] S. H. Lin, J. C. Lee, and R. S. Wang, “Generalized inferences on the common mean vector of several multivariate normal populations,” Journal of Statistical Planning and Inference, vol. 137, no. 7, pp. 2240–2249, 2007.
[5] L.-W. Xu and S.-G. Wang, “A new generalized p-value for ANOVA under heteroscedasticity,” Statistics and Probability Letters, vol. 78, no. 8, pp. 963–969, 2008.
[6] J. T. Zhang, “An approximate Hotelling T²-test for heteroscedastic one-way MANOVA,” Open Journal of Statistics, vol. 2, no. 1, pp. 1–11, 2012.
[7] P. Ibarrola and R. Vélez, “Testing and confidence estimation of the mean of a multidimensional Gaussian process,” Statistics, vol. 36, no. 4, pp. 317–327, 2002.
[8] P. Ibarrola and R. Vélez, “On Behrens-Fisher problem for continuous time Gaussian processes,” Linear Algebra and Its Applications, vol. 389, no. 1–3, pp. 63–76, 2004.

Copyright

Copyright © 2015 Pilar Ibarrola and Ricardo Vélez. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
