Review Article | Open Access
Generalised Score and Wald Tests
The generalised score and Wald tests are described and related to their nongeneralised versions. Two interesting applications are discussed. In the first a new test for the Behrens-Fisher problem is derived. The second is testing homogeneity of variances from multiple univariate normal populations.
1. Introduction
This paper is intended to be a tutorial for those wishing to inform themselves about the generalised score and Wald tests. It extends the content of [1] and has similar objectives; that is, it focuses on the use of these tests rather than their properties. It is intended to be very accessible. Readers need only some prior knowledge of partitioned matrices and of score and Wald tests; see, for example, [1] and [2, Chapter 3].
The score test is particularly valuable when maximum likelihood (ML) estimation under the full model is not preferred, but ML estimation under the null model is. The converse holds for the Wald test. Thus when ML estimation under one of the null and full models is not preferred, the likelihood ratio test is problematic, but one of the score and Wald tests is not. Here by not preferred we mean that, for example, estimates may be calculated by some iterative scheme with dubious convergence. Other possibilities are that estimates may have a particularly convoluted expression or the finite sample properties (such as large bias) may be inappropriate for the problem of interest.
When ML estimation under both the null and full models is not preferred, we need another way forward. This is provided by the generalised score and Wald tests. These tests are especially valuable when the model may be misspecified, but that will not be the focus here.
In Section 2 the generalised score and Wald tests are described. In Section 3 this material is applied to deriving a new test for the Behrens-Fisher problem, while Section 4 looks at testing equality of variances from multiple independent normal samples.
2. M-Estimators and Generalised Score Tests
The class of M-estimators includes both ML and method of moments estimators. An M-estimator $\tilde{\theta}$ satisfies
$$\sum_{i=1}^{n} \psi(Y_i, \tilde{\theta}) = 0_q,$$
in which $Y_1, \ldots, Y_n$ are independent but not necessarily identically distributed, $\psi$ is a known function not depending on $i$ or $n$, $\theta$ is a $q$-dimensional parameter, and in general $0_r$ denotes an $r \times 1$ vector of zeros. The estimating function must be sufficiently 'smooth'. In particular, its derivatives up to second order, and their expectations, must exist. Hence the matrices $A$ and $B$ defined subsequently are assumed to exist. Also, the expectation of the second-order derivatives must be bounded in probability. More technical details on M-estimators may be found in [3, Chapter 5].
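To make the idea of an estimating equation concrete, here is a minimal sketch that finds a scalar M-estimator by solving the summed estimating equation numerically. The function names, the bisection solver, and the data are illustrative assumptions and are not taken from the article. Two estimating functions are tried: the normal-theory score for a mean, whose root is the sample mean, and the score of a logistic location model, which is smooth and bounded.

```python
import math


def logistic_psi(y, theta):
    """Score for the location of a unit-scale logistic distribution (smooth, bounded)."""
    return math.tanh((y - theta) / 2.0)


def solve_estimating_equation(data, psi, lo, hi, tol=1e-10, max_iter=200):
    """Solve sum_i psi(y_i, theta) = 0 for a scalar theta by bisection.

    Assumes the summed estimating function changes sign on [lo, hi].
    """
    def total(theta):
        return sum(psi(y, theta) for y in data)

    f_lo, f_hi = total(lo), total(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("estimating equation does not change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = total(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid              # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # root lies in [mid, hi]
    return 0.5 * (lo + hi)


# Hypothetical sample containing one outlying value.
data = [2.1, 1.9, 2.4, 2.0, 2.2, 9.5]

# psi(y, theta) = y - theta: the solution is the sample mean.
mean_est = solve_estimating_equation(data, lambda y, t: y - t, min(data), max(data))

# Logistic score: a bounded estimating function, less affected by the outlier.
logistic_est = solve_estimating_equation(data, logistic_psi, min(data), max(data))
```

Bisection is used only because it needs no derivatives; any root finder would serve for a scalar parameter.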
In our setting we assume that and that we wish to test : against the alternative : with being the vector of primary interest, with a vector of nuisance parameters, and with . The generalised score test is based on the partial M-estimator that satisfies
where is partitioned similarly to , so that , and where in which is the M-estimator of under the null hypothesis. Define in which denotes expectation under the null hypothesis. Here and are and and are . We note that is not necessarily symmetric while is. This means that the form of the generalised tests given by, for example, , needs to be slightly modified. The generalised score test statistic is given by
in which, as can readily be shown, and the arguments in , , and are suppressed; here all are . Similarly the generalised Wald test statistic is given by
in which all arguments are . In the exposition in  parameters are estimated by ML but the data do not come from the parametric model: this is ML under misspecification. In , Kent’s definitions are given but in place of ML estimators any M-estimators are permitted. It is also noted in  that and can in practice be replaced by any consistent estimates.
An alternative form of that is more convenient for calculation is given in , where it is applied to the construction of generalized smooth tests of goodness of fit. This form gives
in which The equivalence of the two forms requires routine but tedious matrix algebra and is omitted here. The asymptotic distribution of both and under is .
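As a rough illustration of how a generalised Wald statistic can be computed once consistent estimates are plugged in, the sketch below treats the simplest case: a scalar parameter with no nuisance parameters. It uses the common textbook 'sandwich' form, in which the derivative matrix is estimated by the average of minus the derivative of the estimating function and the outer-product matrix by the average of its square. The normalisation by the sample size, the function names, and the data are assumptions and may differ from the expressions in the article.

```python
import math


def generalised_wald_scalar(data, psi, dpsi, theta_hat, theta_0):
    """Sandwich-based Wald statistic for a scalar parameter (illustrative sketch).

    psi(y, theta) is the estimating function and dpsi(y, theta) its derivative
    with respect to theta. Returns the statistic and a chi-squared(1) p-value.
    """
    n = len(data)
    a_hat = -sum(dpsi(y, theta_hat) for y in data) / n     # average of -d psi / d theta
    b_hat = sum(psi(y, theta_hat) ** 2 for y in data) / n  # average of psi squared
    v_hat = b_hat / a_hat ** 2                             # scalar sandwich variance
    stat = n * (theta_hat - theta_0) ** 2 / v_hat
    # For one degree of freedom, P(chi-squared > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical data; test whether the mean equals 5 using psi(y, theta) = y - theta.
data = [4.9, 5.6, 5.1, 4.4, 5.8, 5.2]
theta_hat = sum(data) / len(data)
stat, p_value = generalised_wald_scalar(
    data,
    psi=lambda y, t: y - t,
    dpsi=lambda y, t: -1.0,
    theta_hat=theta_hat,
    theta_0=5.0,
)
```

With this choice of estimating function the statistic reduces to the familiar squared standardised mean, which is a convenient check on the code.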
If is the derivative of logarithm of the likelihood, which is the usual score function, then is the usual (symmetric) information matrix, and . If ML estimation is used, then , the usual Wald test statistic, and , the usual score test statistic. Both are given in this form in . For more information see [5, 6].
In [5, page 328] it is recommended that the inverse of the asymptotic covariance matrix in the test statistic be replaced by a generalised inverse of a consistent estimate of that matrix. Although it may sound trivial, when calculating any of the ordinary or generalised score or Wald test statistics, we are finding a quadratic form $Z^{T} \Sigma^{-1} Z$, where $Z$ is at least asymptotically multivariate normal and $\Sigma$ is at least asymptotically the full rank covariance matrix of $Z$. Very occasionally it may be more convenient to find the exact covariance matrix rather than one that is asymptotically equivalent. If so, the exact covariance matrix can be used in the above expressions; similarly, when appropriate, a generalised inverse of the exact or an asymptotically equivalent covariance matrix can be used.
3. The Behrens-Fisher Problem
In the Behrens-Fisher problem, $X_1, \ldots, X_m$ is a random sample from an $N(\mu_1, \sigma_1^2)$ population, and $Y_1, \ldots, Y_n$ is an independent random sample from an $N(\mu_2, \sigma_2^2)$ population. It is desired to test $H_0$: $\mu_1 = \mu_2$ against $K$: $\mu_1 \neq \mu_2$, with the standard deviations $\sigma_1$ and $\sigma_2$ being nuisance parameters. In [2, Example 3.3.2] the likelihood ratio, score, and Wald tests are derived. The score test requires the solution of an inconvenient cubic equation, so this is one situation in which the Wald statistic looks distinctly more appealing than both the likelihood ratio and score test statistics.
When the estimating function is the usual score function, the generalised score test is the usual score test. To conform to our notation put , , , and . We test against , with nuisance parameters , and . The logarithm of the likelihood is
and therefore the score function has the following components: These are the partial derivatives of the logarithm of the likelihood. Under the null hypothesis the estimating equations are . This leads to the inconvenient cubic equation mentioned previously. If we proceed with this model, the cubic must be solved to find , and hence and . We also findwhence
and the generalised score test statistic is . This is just the ordinary score test statistic.
While solving the cubic is not a great difficulty, if we modify so that it becomes a possibly less efficient but certainly more convenient estimator of the common mean under the null hypothesis may be found. This estimator is the solution to , namely, . If we also modify so that while leaving the other two equations unchanged, the generalised score test is based on The estimators of and are slightly different from those found previously, being and . Modifying the previous derivation gives whence
and the generalised score test statistic is
It may be shown that the Wald test statistic is a one-one function of this statistic, so that these two tests are equivalent. However, if using the asymptotic critical values, the generalised score test has actual test sizes much closer to the nominal sizes than the Wald test. When using simulated critical values that are virtually exact, the power of the generalised score test is within 1% of that of the entrenched test due to Welch [7]. So on this criterion the Welch and generalised score tests are virtually indistinguishable.
The Welch test is very similar to the Wald test. Using Satterthwaite's approximation to the null distribution of the Welch test gives excellent agreement between the nominal and actual test sizes. However, Satterthwaite's approximation does not work nearly as well for the generalised score test statistic. Hence, in terms of agreement between nominal and actual test sizes using approximations and asymptotic critical values, the Welch test is to be preferred. Support for these assertions and more numerical details are available in [8].
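For readers who want to try the benchmark mentioned above, here is a minimal sketch of the Welch statistic together with Satterthwaite's approximate degrees of freedom, in their standard textbook form. The function name and the data are illustrative assumptions; the generalised score statistic of this section is not reproduced here.

```python
import math
from statistics import mean, variance


def welch_test(x, y):
    """Welch statistic and Satterthwaite degrees of freedom for two samples."""
    m, n = len(x), len(y)
    vx = variance(x) / m  # estimated variance of the first sample mean
    vy = variance(y) / n  # estimated variance of the second sample mean
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (m - 1) + vy ** 2 / (n - 1))
    return t, df


# Hypothetical samples with visibly different spreads.
x = [20.1, 19.8, 21.3, 20.7, 19.9, 20.4]
y = [22.5, 18.9, 24.1, 21.0, 25.3]
t_stat, df = welch_test(x, y)
```

The statistic is referred to a t distribution with the returned, generally non-integer, degrees of freedom.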
4. Testing Equality of Variances
Suppose that we have $k$ independent random samples, with the $i$th, $i = 1, \ldots, k$, being of size $n_i$ and from a normal $N(\mu_i, \sigma_i^2)$ population. The total sample size is $n = n_1 + \cdots + n_k$. We seek to test equality of variances: $H_0$: $\sigma_1^2 = \cdots = \sigma_k^2 = \sigma^2$, say, against the alternative $K$: not $H_0$. Popular tests include the likelihood ratio test, frequently referred to as Bartlett's test, and Levene's test. The former is known to be nonrobust, while the latter is more robust in that its actual levels are closer to the nominal levels. Levene's test is less powerful than Bartlett's when the data are consistent with normality.
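As a point of reference for the tests named above, the following sketch computes Bartlett's statistic in its usual textbook form, with the standard correction factor, to be compared with a chi-squared distribution on one fewer degrees of freedom than the number of samples. The function name and the data are illustrative assumptions.

```python
import math
from statistics import variance


def bartlett_statistic(samples):
    """Bartlett's statistic for equality of variances, with its degrees of freedom."""
    k = len(samples)
    sizes = [len(s) for s in samples]
    n_total = sum(sizes)
    variances = [variance(s) for s in samples]  # unbiased sample variances
    pooled = sum((n_i - 1) * v for n_i, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n_i - 1) * math.log(v) for n_i, v in zip(sizes, variances)
    )
    correction = 1.0 + (
        sum(1.0 / (n_i - 1) for n_i in sizes) - 1.0 / (n_total - k)
    ) / (3.0 * (k - 1))
    return numerator / correction, k - 1


# Hypothetical samples: the first two have equal spread, the third is more variable.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [11.0, 12.0, 13.0, 14.0, 15.0]
c = [0.0, 5.0, 10.0, 15.0, 20.0]

stat_equal, df_equal = bartlett_statistic([a, b])
stat_all, df_all = bartlett_statistic([a, b, c])
```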
We now construct a Wald test of $H_0$ against $K$. We could use the generalised Wald test construction with the estimating function being the derivative of the logarithm of the likelihood, but we leave that as an exercise for the interested reader. We could also calculate one of the forms of the asymptotic covariance matrix, but this is a case where it is simpler to calculate the exact covariance matrix. Moreover, the exact covariance matrix involves an inconvenient inverse, so we instead use the Moore-Penrose inverse. This is defined in the appendix, along with some relevant useful results. This approach leads to a simpler test statistic.
Throughout this example, since we are calculating the Wald test statistic, all estimation is ML. As a consequence estimators are denoted by hats ($\hat{\ }$) instead of tildes ($\tilde{\ }$). We also use unbiased versions of the sample variances (with divisors $n_i - 1$ instead of $n_i$). These are asymptotically equivalent to the usual ML estimators, and the corresponding test statistic is asymptotically equivalent to the usual Wald test statistic.
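Before proceeding with the construction, note that if $S^2$ is the unbiased sample variance from a random sample of size $m$ from an $N(\mu, \sigma^2)$ distribution, then $(m-1)S^2/\sigma^2$ has the $\chi^2_{m-1}$ distribution. As is well known, $\operatorname{var}(S^2) = 2\sigma^4/(m-1)$, so that $E[S^4] = (m+1)\sigma^4/(m-1)$. From the Rao-Blackwell theorem $(m-1)S^4/(m+1)$ is an optimal estimator of $\sigma^4$, being the unique estimator with minimum variance in the class of unbiased estimators of $\sigma^4$. This optimality is conferred upon $2S^4/(m+1)$ when estimating $\operatorname{var}(S^2)$. Writing $S_i^2$ for the unbiased estimator of the $i$th population variance $\sigma_i^2$, $i = 1, \ldots, k$, the optimal estimator of $\operatorname{var}(S_i^2)$ is $2S_i^4/(n_i+1)$ for $i = 1, \ldots, k$.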
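The unbiasedness claim is easy to check numerically. For a normal sample of size n with variance sigma squared, the unbiased sample variance has variance 2 sigma^4 / (n - 1), and twice the squared sample variance divided by n + 1 is an unbiased estimator of that quantity. The sketch below averages this estimator over simulated normal samples; the seed, sample size, standard deviation, and number of replications are arbitrary assumptions.

```python
import random


def sample_variance(sample):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(sample)
    centre = sum(sample) / n
    return sum((x - centre) ** 2 for x in sample) / (n - 1)


def variance_of_s2_estimate(sample):
    """Unbiased estimate of var(S^2) for a normal sample: 2 S^4 / (n + 1)."""
    n = len(sample)
    return 2.0 * sample_variance(sample) ** 2 / (n + 1)


random.seed(1)
n, sigma, reps = 10, 2.0, 20000

estimates = []
for _ in range(reps):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    estimates.append(variance_of_s2_estimate(sample))

average_estimate = sum(estimates) / reps
target = 2.0 * sigma ** 4 / (n - 1)  # theoretical var(S^2) under normality
```

The average should settle near the target as the number of replications grows; a simulation of this kind illustrates the result but does not prove it.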
Should the null hypothesis be true, an unbiased estimator of the common population variance is the pooled sample variance where for . Note that since , . Now define Then . This is zero if and only if for all . Hence testing equality of variances is equivalent to testing : against : . An unbiased estimator of is and since is symmetric, has covariance matrix estimated by where now . Now CDC is not of full rank, and in order to use results on quadratic forms of multivariate normal random variables generalised or pseudoinverses are required. Here we use , the Moore-Penrose inverse of the matrix M. See the appendix.
Because is idempotent, the Moore-Penrose inverse of CDC is given by
A Wald test statistic for testing : against : is Since , should be compared with the distribution to assess significance. Should the test indicate significance at an appropriate level, rough pairwise comparisons can be made as in the comparison of means in the analysis of variance. To see how to do this first note that, as above, has the distribution which, for large , is approximately . Hence is approximately and under the null hypothesis of equality of variances for any the difference is approximately , and can be estimated by . A least significant difference may be constructed in the usual way.
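The rough pairwise comparisons described above can be sketched as follows. Under the null hypothesis and for large samples, the difference between two unbiased sample variances is approximately normal with mean zero and variance 2 sigma^4 {1/(n_i - 1) + 1/(n_j - 1)}. The estimate of sigma^4 used here, the square of the pooled sample variance, is an assumption made for illustration and is not necessarily the estimator intended in the article; the function name and the data are likewise hypothetical.

```python
import math
from statistics import NormalDist, variance


def variance_lsd(samples, alpha=0.05):
    """Pairwise least significant differences for sample variances (rough sketch)."""
    k = len(samples)
    sizes = [len(s) for s in samples]
    variances = [variance(s) for s in samples]
    n_total = sum(sizes)
    pooled = sum((n_i - 1) * v for n_i, v in zip(sizes, variances)) / (n_total - k)
    sigma4_hat = pooled ** 2  # assumption: square of the pooled variance
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    results = []
    for i in range(k):
        for j in range(i + 1, k):
            se = math.sqrt(
                2.0 * sigma4_hat * (1.0 / (sizes[i] - 1) + 1.0 / (sizes[j] - 1))
            )
            diff = variances[i] - variances[j]
            lsd = z * se
            results.append((i, j, diff, lsd, abs(diff) > lsd))
    return results


# Hypothetical samples: the third is far more variable than the first two.
samples = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [11.0, 12.0, 13.0, 14.0, 15.0],
    [0.0, 5.0, 10.0, 15.0, 20.0],
]
comparisons = variance_lsd(samples)
```

Each tuple holds the two sample indices, the difference in sample variances, the least significant difference, and whether the difference exceeds it. With samples this small the normal approximation is crude, so the flags are indicative only.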
Appendix: The Moore-Penrose Inverse
One of several pseudo-inverses or generalised inverses is the Moore-Penrose inverse: see, for example, [9, section 8.11]. The unique Moore-Penrose inverse of a real symmetric matrix B satisfies It is routine to show the following.(i)If then .(ii)If H is orthogonal, then .(iii)If A is idempotent, then .(iv)If the subsequent matrix products are defined, then and .
It is well known that if is with rank , then has the distribution where is a pseudoinverse of . For the scenario here it is reasonable to test against using the test statistic .
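The defining properties of the Moore-Penrose inverse can be checked directly on a small example. The sketch below tests the four standard Penrose conditions for a candidate inverse and applies them to the centring matrix, which is symmetric and idempotent and is therefore its own Moore-Penrose inverse. The helper names and the choice of matrix are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(
        abs(x - y) <= tol for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)
    )


def is_moore_penrose_inverse(b, b_plus):
    """Check the four Penrose conditions for a candidate inverse b_plus of b."""
    bbp = matmul(b, b_plus)
    bpb = matmul(b_plus, b)
    return (
        close(matmul(bbp, b), b)                # B B+ B = B
        and close(matmul(bpb, b_plus), b_plus)  # B+ B B+ = B+
        and close(bbp, transpose(bbp))          # B B+ is symmetric
        and close(bpb, transpose(bpb))          # B+ B is symmetric
    )


k = 4
identity = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
# Centring matrix: identity minus the matrix with every entry 1/k.
centring = [[identity[i][j] - 1.0 / k for j in range(k)] for i in range(k)]

centring_is_own_inverse = is_moore_penrose_inverse(centring, centring)
identity_is_inverse = is_moore_penrose_inverse(centring, identity)
```

The identity matrix fails the second condition here, which shows that satisfying the first condition alone does not identify the Moore-Penrose inverse.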
References
[1] J. C. W. Rayner, “The asymptotically optimal tests,” Journal of the Royal Statistical Society Series D, vol. 46, no. 3, pp. 337–346, 1997.
[2] J. C. W. Rayner, O. Thas, and D. J. Best, Smooth Tests of Goodness of Fit: Using R, John Wiley & Sons, Singapore, 2nd edition, 2009.
[3] A. W. van der Vaart, Asymptotic Statistics, vol. 3 of Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge, UK, 1998.
[4] J. T. Kent, “Robust properties of likelihood ratio tests,” Biometrika, vol. 69, no. 1, pp. 19–27, 1982.
[5] D. Boos, “On generalised score tests,” The American Statistician, vol. 47, pp. 327–333, 1992.
[6] H. White, “Maximum likelihood estimation of misspecified models,” Econometrica, vol. 50, no. 1, pp. 1–25, 1982.
[7] B. L. Welch, “The significance of the difference between two means when the population variances are unequal,” Biometrika, vol. 29, pp. 350–362, 1937.
[8] P. Rippon, J. Rayner, and O. Thas, “A competitor for the test Welch test in the Behrens-Fisher problem,” Unpublished Report, 2008.
[9] J. L. Goldberg, Matrix Theory with Applications, McGraw-Hill, New York, NY, USA, 1991.
Copyright © 2010 Paul Rippon and J. C. W. Rayner. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.