Testing Normal Means: The Reconcilability of the p-Value and the Bayesian Evidence
The problem of reconciling frequentist and Bayesian evidence in testing statistical hypotheses has been studied extensively in the literature. Most existing work considers cases without nuisance parameters, although nuisance parameters are very common in practice. In this paper, we consider the reconcilability of the Bayesian evidence against the null hypothesis $H_0$, expressed as the posterior probability that $H_0$ is true, and the frequentist evidence against $H_0$, expressed as the p-value, in testing normal means where nuisance parameters are present. Reconcilability of the evidence can be obtained both for testing a normal mean and for the Behrens-Fisher problem.
1. Introduction
In the problem of testing a statistical hypothesis $H_0$, a frequentist may give evidence against $H_0$ by the observed significance level, the p-value, while a Bayesian may give it by the posterior probability that $H_0$ is true. Lindley illustrated the possible discrepancy between the Bayesian and the frequentist evidence. The relationship between these two measures of evidence has since been studied extensively. Pratt showed that p-values are usually approximately equal to the posterior probabilities in one-sided testing problems. Casella and Berger considered testing a one-sided hypothesis for a location parameter and showed that the lower bounds of the posterior probability over some reasonable classes of priors are exactly equal to the corresponding p-values in many cases. Other important papers dealing with the reconcilability of Bayesian and frequentist evidence include Bartlett, Cox, Shafer, Berger and Delampady, and Berger and Sellke.
Although much research has addressed the problem of reconciling Bayesian and frequentist evidence, and some of it shows that the evidence is reconcilable in several specific situations, most existing work assumes that no unknown parameters are present other than the parameters of interest. In practice, nuisance parameters arise in many situations. In location-scale settings, for example, when the location parameter is unknown, the scale parameter is generally unknown as well.
However, in significance testing of hypotheses with nuisance parameters, classical p-values are typically not available. Tsui and Weerahandi, considering the one-sided hypothesis of the form
$$H_0: \theta \le \theta_0 \quad \text{versus} \quad H_1: \theta > \theta_0, \tag{1}$$
where $\theta$ is the parameter of interest and $\theta_0$ is a fixed constant, introduced the concept of the generalized p-value, which appears to be useful in situations where conventional frequentist approaches do not provide useful solutions.
Tsui and Weerahandi and some later related works formulated generalized p-values for many specific examples. Hannig et al. provided a general method for constructing the generalized p-value via fiducial inference.
In this paper, for one-sided testing situations about normal means where nuisance parameters are present, we study the reconcilability of the Bayesian evidence and the generalized p-value. It is shown that, under the conjugate class of prior distributions, the Bayesian evidence and the generalized p-value are reconcilable both for the problem of testing a normal mean and for the Behrens-Fisher problem.
This paper is organized as follows. In Section 2, we give the main results on the reconcilability of the p-value and the Bayesian evidence in testing normal means. Some conclusions and discussion are given in Section 3.
2. Main Results
In this section, we consider two testing problems in which nuisance parameters are present. When no efficient classical frequentist evidence is available because of the nuisance parameters, we formulate the frequentist evidence by the generalized p-value.
2.1. One-Sample Normal Mean
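Let $X_1, \ldots, X_n$ be a random sample from a normal population $N(\mu, \sigma^2)$, where both the mean $\mu$ and the variance $\sigma^2$ are unknown. Consider the following problem of testing the mean of a normal distribution:
$$H_0: \mu \le \mu_0 \quad \text{versus} \quad H_1: \mu > \mu_0, \tag{2}$$
where $\mu_0$ is a fixed constant.
For this testing problem, where the nuisance parameter $\sigma^2$ is present, we can still obtain the classical p-value as
$$p(x) = P\left(T_{n-1} \ge \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{s}\right), \tag{3}$$
where $T_{n-1}$ is a t-variable with $n-1$ degrees of freedom and $\bar{x}$ and $s^2$ stand for the observed sample mean and sample variance, respectively.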
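The classical one-sided p-value for a normal mean can be evaluated with the standard library alone. The following is a minimal sketch, assuming the hypothesis $H_0: \mu \le \mu_0$ and the usual statistic $\sqrt{n}(\bar{x} - \mu_0)/s$; the function names are illustrative, and the t tail probability is approximated by Simpson's rule rather than by an exact routine, so accuracy degrades for very large statistics.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df > 0 degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_sf(x, df, steps=2000):
    """Upper tail P(T >= x), using composite Simpson's rule on [0, |x|].

    `steps` must be even. This is a numerical approximation only.
    """
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # approximates P(0 <= T <= |x|)
    return 0.5 - central if x > 0 else 0.5 + central


def one_sided_p_value(sample, mu0):
    """p-value for H0: mu <= mu0 (assumed direction) with unknown variance."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # sample variance
    t_obs = math.sqrt(n) * (xbar - mu0) / math.sqrt(s2)
    return t_sf(t_obs, n - 1)
```

For instance, `one_sided_p_value([5.1, 4.9, 5.6, 5.8, 5.2], mu0=5.0)` returns the upper tail probability of a t-variable with 4 degrees of freedom at the observed statistic.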
To derive the Bayesian evidence, we need a prior for the parameters. One reasonable and conventional class of priors for $\mu$ and $\sigma^2$ is the following conjugate class of prior distributions : where the prior parameters can be interpreted as the mean and sample size of the normal prior observations and the sample variance and sample size of the Gamma prior observations.
Under (4) we have where , , , , and . Therefore, we can give the posterior density for as Then the marginal posterior density for can be obtained by integrating out as from which we know that Consequently, the posterior probability of $H_0$ being true is where is a t-variable with degrees of freedom. Notice that if , we have
Lemma 1. Let be a t-variable with degrees of freedom, where is a positive real number. Then for and a fixed constant , is nonincreasing in if and is nondecreasing in if .
Proof. Suppose that is a nonpositive random variable obtained by the negative part of ; that is, the density of is Then the density of is By Theorem in Lehmann , for any fixed nonpositive constant , we have that is nonincreasing in since it can be verified that the family of densities has monotone likelihood ratio in . This implies that Lemma 1 holds for the case when since . Since when , we have , the proof for the latter case is completely analogous if we introduce a nonnegative random variable obtained by the positive part of .
Now take . By Lemma 1, for , we have Then comparing (3) and (10), for and any fixed nonnegative , we have which implies that The reconcilability of the Bayesian and frequentist evidence is therefore obtained in this testing problem. We summarize this as the following theorem.
Theorem 2. For testing the hypothesis of the form (2) under a normal distribution with unknown variance $\sigma^2$, the Bayesian and frequentist lines of evidence are reconcilable under the conjugate class of priors (4).
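The posterior probability of the null hypothesis under a conjugate prior can be approximated numerically. The sketch below assumes the textbook normal-gamma parameterization, $\mu \mid \tau \sim N(m_0, 1/(k_0 \tau))$ and $\tau = 1/\sigma^2 \sim \mathrm{Gamma}(a_0, \text{rate } b_0)$, which may differ from the parameterization of prior class (4); the names `m0`, `k0`, `a0`, and `b0` are hypothetical. It averages the conditional normal probability over draws of the precision instead of evaluating the t tail directly.

```python
import math
import random


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def posterior_prob_null(sample, mu0, m0, k0, a0, b0, draws=20000, seed=0):
    """Estimate P(mu <= mu0 | sample) under an assumed normal-gamma prior.

    Assumed prior (not taken from the paper):
        mu | tau ~ N(m0, 1 / (k0 * tau)),  tau ~ Gamma(shape=a0, rate=b0).
    """
    n = len(sample)
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations

    # Standard conjugate update of the normal-gamma parameters.
    k_n = k0 + n
    m_n = (k0 * m0 + n * xbar) / k_n
    a_n = a0 + n / 2
    b_n = b0 + 0.5 * ss + 0.5 * k0 * n * (xbar - m0) ** 2 / k_n

    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        tau = rng.gammavariate(a_n, 1.0 / b_n)
        # Given tau, mu is normal with mean m_n and precision k_n * tau.
        total += std_normal_cdf((mu0 - m_n) * math.sqrt(k_n * tau))
    return total / draws
```

Averaging the conditional probability, rather than counting draws of $\mu$ below $\mu_0$, keeps the Monte Carlo error small even when the posterior probability is close to zero.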
2.2. Behrens-Fisher Problem
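Now we turn to the Behrens-Fisher problem, a classical testing situation in which nuisance parameters are present and no useful pivotal quantities are available. Suppose that $X_1, \ldots, X_{n_1}$ and $Y_1, \ldots, Y_{n_2}$ are two independent random samples from two normal populations $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, respectively, where both $\sigma_1^2$ and $\sigma_2^2$ are completely unspecified. We are interested in testing the hypothesis of the form
$$H_0: \mu_1 - \mu_2 \le \delta \quad \text{versus} \quad H_1: \mu_1 - \mu_2 > \delta, \tag{16}$$
where $\delta$ is a fixed constant.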
In situations where the traditional frequentist approaches fail to provide useful solutions, the concept of the generalized p-value introduced by Tsui and Weerahandi appears to be helpful in deriving the frequentist evidence for testing a statistical hypothesis. For this specific problem of testing hypothesis (16), we can give the generalized p-value as where , , , , , and .
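A generalized p-value of this kind can be approximated by Monte Carlo. The sketch below is not a transcription of (17); it assumes the null hypothesis $\mu_1 - \mu_2 \le \delta_0$ and uses the fiducial construction in which, given independent chi-square draws $U_1$ and $U_2$ with $n_1 - 1$ and $n_2 - 1$ degrees of freedom, the difference of means is treated as normal with variance $(n_1-1)s_1^2/(n_1 U_1) + (n_2-1)s_2^2/(n_2 U_2)$. All function and argument names are illustrative.

```python
import math
import random


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def generalized_p_value(xbar, ybar, s2x, s2y, n1, n2,
                        delta0=0.0, draws=20000, seed=0):
    """Monte Carlo estimate for H0: mu1 - mu2 <= delta0 (assumed direction).

    xbar, ybar : observed sample means
    s2x, s2y   : observed (unbiased) sample variances
    n1, n2     : sample sizes, each at least 2
    """
    rng = random.Random(seed)
    diff = xbar - ybar - delta0
    total = 0.0
    for _ in range(draws):
        # Chi-square with k degrees of freedom is Gamma(shape=k/2, scale=2).
        u1 = rng.gammavariate((n1 - 1) / 2, 2.0)
        u2 = rng.gammavariate((n2 - 1) / 2, 2.0)
        var = (n1 - 1) * s2x / (n1 * u1) + (n2 - 1) * s2y / (n2 * u2)
        # Conditional on (u1, u2), the probability is a normal tail area.
        total += std_normal_cdf(-diff / math.sqrt(var))
    return total / draws
```

Writing the estimate as an average of normal probabilities mirrors the expectation-of-$\Phi$ form used in part (II) of the proof of Theorem 3 and avoids simulating the normal variable itself.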
In this problem, we consider the reconcilability of evidence under the following conjugate class of prior distributions : Under , the posterior density of is where Let . Then the posterior density of is where So that the posterior probability of is It is straightforward to check that where and are the observation of the sample mean and , respectively, and are that of the sample variance and respectively, , , and .
We now prove that, when the sample sizes and are sufficiently large, the frequentist and Bayesian lines of evidence given respectively by (17) and (23) are reconcilable for any fixed prior parameters and under the prior class of .
Theorem 3. As , for any fixed , we have which implies that
Proof. (I) We first prove that, given , as is sufficiently large, we have In fact, for any , as is sufficiently large, where , as . On the other hand, we have Since holds for any , it follows that, as , That is,
Similarly, as , we have Therefore, we have Consequently, we have
(II) We now show that the final conclusion holds. In fact, if we let , then where stands for the cumulative distribution function of a standard normal distribution and the last equation is due to the fact that and are independent normal distributions.
Similarly, for (24), we have
Note that for each in is increasing in . Therefore, by (35), we have
Consequently, we have
In addition, by the symmetry of the t-distribution, it follows that is equivalent to . Therefore, by (36), (37), and (39), the conclusion of Theorem 3 holds.
The following theorem shows that, even for fixed sample sizes and with , , the frequentist and Bayesian evidence remain reconcilable as the prior parameters tend to infinity.
Theorem 4. As , the conclusion of Theorem 3 holds for any fixed and with , ; that is,
Proof. We keep the notation of Theorem 3. We first prove that, as ,
By the proof of Theorem 3, we have
It is obvious that
Let denote the density function of . Then it is easy to see that reaches the maximum at and that , as . Therefore, as is sufficiently large, it holds that for some , where , as , and .
Therefore, for any fixed , as , we have where , as .
On the other hand, for any fixed and , we have for some . Therefore, as , by (46) and (47), we have . Furthermore, by (45), we have That is, Similarly, for any fixed , as , we have Therefore, similar to Theorem 3, as , for any fixed and with , , we have The rest of the proof is similar to part (II) of the proof of Theorem 3.
The following simulation results show that even for small and fixed values of and or and , the generalized p-value and the Bayesian evidence in the Behrens-Fisher problem are still reconcilable.
For fixed and and for and , taking different values of and , some results of comparing the p-value and are listed in Table 1.
For fixed and and for and , taking different values of and , we list some results of comparing the p-value and in Table 2.
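A comparison of this kind can be sketched numerically. The code below does not reproduce Tables 1 and 2; it assumes the null hypothesis $\mu_1 - \mu_2 \le 0$, a fiducial Monte Carlo form of the generalized p-value, and independent textbook normal-gamma priors with hypothetical parameters `(m0, k0, a0, b0)` for each population, and it reports the absolute gap between the two measures for a single fixed prior and illustrative summary statistics.

```python
import math
import random


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def generalized_p(xbar, ybar, s2x, s2y, n1, n2, draws=20000, seed=1):
    """Fiducial Monte Carlo generalized p-value for H0: mu1 - mu2 <= 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        u1 = rng.gammavariate((n1 - 1) / 2, 2.0)  # chi-square, n1 - 1 df
        u2 = rng.gammavariate((n2 - 1) / 2, 2.0)  # chi-square, n2 - 1 df
        var = (n1 - 1) * s2x / (n1 * u1) + (n2 - 1) * s2y / (n2 * u2)
        total += phi(-(xbar - ybar) / math.sqrt(var))
    return total / draws


def normal_gamma_update(mean, s2, n, m0, k0, a0, b0):
    """Posterior parameters under an assumed normal-gamma prior."""
    k_n = k0 + n
    m_n = (k0 * m0 + n * mean) / k_n
    a_n = a0 + n / 2
    b_n = b0 + 0.5 * (n - 1) * s2 + 0.5 * k0 * n * (mean - m0) ** 2 / k_n
    return m_n, k_n, a_n, b_n


def posterior_prob(xbar, ybar, s2x, s2y, n1, n2,
                   prior=(0.0, 1.0, 1.0, 1.0), draws=20000, seed=1):
    """Estimate P(mu1 - mu2 <= 0 | data) with independent conjugate priors."""
    m1, k1, a1, b1 = normal_gamma_update(xbar, s2x, n1, *prior)
    m2, k2, a2, b2 = normal_gamma_update(ybar, s2y, n2, *prior)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        tau1 = rng.gammavariate(a1, 1.0 / b1)  # precision of population 1
        tau2 = rng.gammavariate(a2, 1.0 / b2)  # precision of population 2
        var = 1.0 / (k1 * tau1) + 1.0 / (k2 * tau2)
        total += phi(-(m1 - m2) / math.sqrt(var))
    return total / draws


def evidence_gap(n, xbar=0.3, ybar=0.0, s2=1.0):
    """Absolute difference between the two measures for equal sample sizes."""
    p = generalized_p(xbar, ybar, s2, s2, n, n)
    b = posterior_prob(xbar, ybar, s2, s2, n, n)
    return abs(p - b)
```

Calling `evidence_gap(n)` for increasing `n` gives a rough view of how the gap behaves as the samples grow. The theorems concern the whole conjugate class, so a single prior as used here is only an illustration.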
3. Conclusions
In the presence of nuisance parameters, we study the reconcilability of the p-value and the Bayesian evidence in one-sided hypothesis testing problems about normal means. For the problem of testing a normal mean where the nuisance parameter is present, it is shown that the Bayesian and frequentist lines of evidence are reconcilable. For the Behrens-Fisher problem, it is shown that if the sample sizes and tend to infinity, then for fixed prior parameters and , both lines of evidence are reconcilable. Furthermore, if the prior parameters and tend to infinity, then for any fixed sample sizes and , the lines of evidence are reconcilable. Simulation results show that even for small and fixed values of the sample sizes and or for small values of the prior parameters and , the Bayesian and frequentist evidence remain reconcilable.
This provides another illustration of a testing situation in which the Bayesian and frequentist evidence can be reconciled, and it may therefore to some extent discourage debasing or even dismissing p-values as evidence in hypothesis testing problems. Furthermore, our results on reconcilability in one-sided testing situations suggest that it may be arbitrary to assert the irreconcilability of the evidence in two-sided (point or interval) hypothesis testing problems, and that perhaps we should be more concerned about the appropriateness of the methods we employ to tackle a two-sided hypothesis in both the frequentist and the Bayesian frameworks.
The work was supported by the Foundation for Training Talents of Beijing (Grant no. 19000532377), the Project of Construction of Innovative Teams and Teacher Career Development for Universities and Colleges Under Beijing Municipality (Grant no. IDHT20130505) and the Research Foundation for Youth Scholars of Beijing Technology and Business University (Grant no. QNJJ2012-03).
D. V. Lindley, “A statistical paradox,” Biometrika, vol. 44, pp. 187–192, 1957.
J. W. Pratt, “Bayesian interpretation of standard inference statements,” Journal of the Royal Statistical Society B, vol. 27, pp. 169–203, 1965.
M. S. Bartlett, “A comment on D. V. Lindley's statistical paradox,” Biometrika, vol. 44, pp. 533–534, 1957.
D. R. Cox, “The role of significance tests,” Scandinavian Journal of Statistics, vol. 4, pp. 49–70, 1977.
J. O. Berger and T. Sellke, “Testing a point null hypothesis: the irreconcilability of p-values and evidence,” Journal of the American Statistical Association, vol. 82, pp. 112–122, 1987.
K. W. Tsui and S. Weerahandi, “Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters,” Journal of the American Statistical Association, vol. 84, pp. 602–607, 1989.
J. Hannig, H. Iyer, and P. Patterson, “Fiducial generalized confidence intervals,” Journal of the American Statistical Association, vol. 101, no. 473, pp. 254–269, 2006.
E. L. Lehmann, Testing Statistical Hypotheses, John Wiley & Sons, New York, NY, USA, 2nd edition, 1986.