Abstract

Consider the multiple testing problem of testing $m$ null hypotheses $H_1,\ldots,H_m$, among which $m_0$ hypotheses are truly null. Given the $P$-values for each hypothesis, the question of interest is how to combine the $P$-values to find out which hypotheses are false nulls and possibly to make a statistical inference on $m_0$. Benjamini and Hochberg proposed a classical procedure that can control the false discovery rate (FDR). The FDR control is somewhat unsatisfactory in that it only concerns the expectation of the false discovery proportion (FDP). The control of the actual random variable FDP has recently drawn much attention. For any level $1-\alpha$, this paper proposes a procedure to construct an upper prediction bound (UPB) for the FDP for a fixed rejection region. When $1-\alpha=50\%$, our procedure is very close to the classical Benjamini and Hochberg procedure. Simultaneous UPBs for the FDPs of all rejection regions and an upper confidence bound for the unknown $m_0$ are then presented. The proposed procedure works for finite samples and hence avoids the slow convergence problem of the asymptotic theory.

1. Introduction

In this paper, we consider the problem of testing $m$ null hypotheses $H_1,\ldots,H_m$, among which $m_0$ hypotheses are truly null. We shall assume that $P$-values are available for the individual hypotheses. In a seminal paper, Benjamini and Hochberg [1] proposed the false discovery rate (FDR) as an alternative to the classically defined family-wise error rate (FWER). The FDR achieves a good balance between the raw $P$-value and the FWER correction [2]; the former may give too many false positives, and the latter may give too many false negatives. However, control of the FDR is somewhat unsatisfactory in that it only concerns the expectation of the false discovery proportion (FDP). In practice, researchers may be interested in more detailed statistical inference on the actual random variable FDP, not just its expectation. The goal of this paper is to provide a simple procedure to control the FDP.

Let us first introduce some notation. Given $m$ hypotheses $H_1,\ldots,H_m$, let the complete index set be $M=\{1,\ldots,m\}$, let $M_0$ be the unknown subset of $M$ for which the null hypotheses are true, and let $M_1=M\setminus M_0$ be the subset for which the null hypotheses are false. Write $m_0=|M_0|$ and $m_1=|M_1|$, where $|\cdot|$ denotes the cardinality of a set. The $P$-values for testing the $m$ hypotheses are $P_1,\ldots,P_m$. A fixed rejection region for the $P$-values can conveniently be taken as $[0,t]$ ($0<t<1$); the value of $t$ could be $0.05$, for example. Define the number $R_t$ of all rejected hypotheses and the number $V_t$ of falsely rejected hypotheses, respectively, as
$$R_t=\sum_{i=1}^{m} I(P_i\le t),\qquad V_t=\sum_{i\in M_0} I(P_i\le t). \tag{1.1}$$
Following the notation of Korn et al. [3], Genovese and Wasserman [4], and Lehmann and Romano [5], the false discovery proportion is defined to be the proportion of falsely rejected null hypotheses among the rejected ones,
$$Q_t=\frac{V_t}{R_t}, \tag{1.2}$$
where the ratio is defined to be zero when the denominator is zero. For a given fixed rejection region $[0,t]$, $R_t$, $V_t$, and $Q_t$ are random variables. $R$, $V$, and $Q$ will be shorthand for $R_t$, $V_t$, and $Q_t$, respectively, when the rejection region $[0,t]$ is clear from the context. The false discovery rate of Benjamini and Hochberg [1] is
$$\mathrm{FDR}=E(Q). \tag{1.3}$$
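The quantities in (1.1) and (1.2) are easy to compute in a simulation, where the set $M_0$ is known. Below is a minimal Python sketch (the function name and argument names are ours, not from the paper); in real data $V_t$ and $Q_t$ are of course unobservable.

```python
import numpy as np

def fdp_quantities(pvals, null_mask, t):
    """Compute R_t, V_t, and Q_t of (1.1)-(1.2) for the rejection region [0, t].

    pvals     : the m observed P-values
    null_mask : boolean array, True where H_i is truly null (known only
                in simulations; V_t and Q_t are unobservable in practice)
    t         : rejection threshold, 0 < t < 1
    """
    pvals = np.asarray(pvals)
    R = int(np.sum(pvals <= t))                         # total rejections
    V = int(np.sum(pvals[np.asarray(null_mask)] <= t))  # false rejections
    Q = V / R if R > 0 else 0.0                         # FDP, with 0/0 := 0
    return R, V, Q
```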

A good understanding of $Q$ will lead investigators to pick an appropriate rejection region $[0,t]$ of $P$-values. $Q$ is an unobservable random variable depending on the observed $P$-values and the rejection region $[0,t]$, and the quantity FDR describes only the expectation of this random variable. One way to have a more detailed statistical inference on the random variable $Q$ is to derive its distribution, which is very difficult unless a strong assumption can be imposed on the $P$-values from the false null hypotheses. A conservative approach is to compute an upper prediction bound for $Q$ so that we can safeguard against excessive type I errors. In Section 2, for a fixed rejection region $[0,t]$, we compute an upper prediction bound (UPB) $\bar Q_{1-\alpha}(t)$ for $Q_t$ such that
$$\mathrm{pr}\left(Q_t\le \bar Q_{1-\alpha}(t)\right)\ge 1-\alpha. \tag{1.4}$$

If $\bar Q_{1-\alpha}(t)$ were a nonrandom quantity, it would always be no less than the $1-\alpha$ quantile of the random variable $Q_t$. When $1-\alpha=50\%$, our procedure is very close to the classical BH procedure of Benjamini and Hochberg [1]. In other words, the BH procedure gives an approximate 50% upper prediction bound (UPB) for $Q$. With different degrees of conservatism, one may take $1-\alpha$ to be 0.9, 0.95, or 0.99 to ensure high coverage of the false discovery proportion. We also describe how to compute an upper confidence bound (UCB) for $m_0$, the number of true null hypotheses; this UCB can be used to improve the estimate $\bar Q_{1-\alpha}(t)$. In practice, the rejection region $[0,t]$ needs to be adapted to the actual dataset. In Section 3, we give a procedure to construct an upper prediction band for $Q_t$ for all $t\in(0,1)$, and this upper prediction band can be used to pick a data-defined rejection region $[0,\tau]$ of $P$-values such that the false discovery proportion $Q$ is controlled at a target level $\gamma$ with prediction accuracy $1-\alpha$, that is,
$$\mathrm{pr}\left(Q_\tau\le\gamma\right)\ge 1-\alpha. \tag{1.5}$$

Thus with probability at least $1-\alpha$, the value of $Q$ is $\gamma$ or less. For independent true null $P$-values, Genovese and Wasserman [4] and Meinshausen and Rice [6] also worked on control of the FDP in the sense of the above equation. However, their results are based on asymptotic theory, while our focus is on finite-sample results, which avoid the slow convergence problem of the asymptotic theory. Other works, such as Lehmann and Romano [5], Romano and Shaikh [7, 8], and van der Laan et al. [9], proposed procedures that allow dependence in the $P$-values but potentially lose statistical power, as the dependence information is not exploited. Section 4 presents a focused statistical inference by restricting the rejection regions to $\{[0,t] : t\in[t_0,t^{0}]\}$, which unifies the results of Sections 2 and 3. Section 5 generalizes the results from independent data to less-independent situations, and Section 6 gives our discussion.

2. Finding a $1-\alpha$ UPB for the False Discovery Proportion for a Fixed Rejection Region

For the sake of simplicity, we will first assume that the $P$-values from the true null hypotheses are mutually independent, each following the uniform distribution $U[0,1]$. We make no further assumptions on the $P$-values from the false null hypotheses. This assumption is the same as in Benjamini and Hochberg [1]. In Section 5 we will generalize the result to less-independent situations. For a fixed rejection region $[0,t]$ of the $P$-values, we would like to find the $1-\alpha$ upper prediction bound (UPB) for the false discovery proportion $Q_t$. As we mentioned in Section 1, the distribution of $Q_t$ is unknown. However, for any given experimental data, the total number of rejections, $R_t$, can easily be obtained from (1.1). Under the assumption that the true null $P$-values are independently distributed as $U[0,1]$, $V_t$ has a binomial distribution $\mathrm{Bin}(m_0,t)$. Let $U_i$, $i=1,\ldots,m$, be mutually independent $U[0,1]$ random variables, and let $N_{m_0,t}=\sum_{i=1}^{m_0}I(U_i\le t)$, which is distributed as $\mathrm{Bin}(m_0,t)$; hence,
$$V_t\stackrel{d}{=}N_{m_0,t}. \tag{2.1}$$

The $1-\alpha$ quantile of $V_t$ is the $1-\alpha$ quantile $C_{1-\alpha}(m_0,t)$ of the distribution $\mathrm{Bin}(m_0,t)$. Here $C_{1-\alpha}(m_0,t)$ is defined as
$$C_{1-\alpha}(m_0,t)=\min\left\{k : \mathrm{pr}\left(N_{m_0,t}\le k\right)\ge 1-\alpha\right\}. \tag{2.2}$$
As $R_t$ can be computed from the observed data, a $1-\alpha$ UPB for $Q_t$ can be estimated by
$$\bar Q_{1-\alpha}(m_0,t)=\frac{C_{1-\alpha}(m_0,t)}{R_t}. \tag{2.3}$$
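Since $C_{1-\alpha}(m_0,t)$ in (2.2) is exactly the $1-\alpha$ quantile of $\mathrm{Bin}(m_0,t)$, it can be computed with `scipy.stats.binom.ppf`, which returns the smallest $k$ whose CDF is at least $1-\alpha$. A minimal sketch of (2.3) follows (the function name is ours; when $m_0$ is unknown, replace it by $m$ for a conservative bound):

```python
from scipy.stats import binom

def upb_fixed_region(R_t, m0, t, alpha=0.05):
    """1 - alpha UPB of (2.3) for the FDP over the fixed region [0, t].

    binom.ppf(1 - alpha, m0, t) is exactly C_{1-alpha}(m0, t) of (2.2):
    the smallest k with pr(Bin(m0, t) <= k) >= 1 - alpha.
    """
    if R_t == 0:
        return 0.0                          # no rejections: FDP is 0 by convention
    C = binom.ppf(1 - alpha, m0, t)
    return min(C / R_t, 1.0)                # an FDP can never exceed 1
```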

Lemma 2.1. For any given $0\le m_1\le m_2\le m$: (a) $C_{1-\alpha}(m_1,t)\le C_{1-\alpha}(m_2,t)$; (b) $m_1-C_{1-\alpha}(m_1,t)\le m_2-C_{1-\alpha}(m_2,t)$; and (c) letting $g(k)=C_{1-\alpha}(k,t)$ and $h(k)=k-C_{1-\alpha}(k,t)$, the differences $g(k+1)-g(k)$ and $h(k+1)-h(k)$ can only take the values zero or one.

Proof. By noting that $N_{m_1,t}\le N_{m_2,t}$ when $m_1\le m_2$ (both counts can be built from the same $U_i$'s), we have $\mathrm{pr}(N_{m_1,t}\le k)\ge \mathrm{pr}(N_{m_2,t}\le k)$. Applying the definition of $C_{1-\alpha}$, we obtain the result for part (a).
Note that
$$C_{1-\alpha}(m_1,t)=\min\left\{k : \mathrm{pr}\left(N_{m_1,t}\le k\right)\ge 1-\alpha\right\}=\min\left\{k : \mathrm{pr}\left(m_1-N_{m_1,t}\ge m_1-k\right)\ge 1-\alpha\right\}=m_1-\max\left\{k : \mathrm{pr}\left(m_1-N_{m_1,t}\ge k\right)\ge 1-\alpha\right\}, \tag{2.4}$$
where in the last step we used $k$ to replace $m_1-k$. Therefore,
$$m_1-C_{1-\alpha}(m_1,t)=\max\left\{k : \mathrm{pr}\left(m_1-N_{m_1,t}\ge k\right)\ge 1-\alpha\right\}. \tag{2.5}$$
Similarly, we can obtain
$$m_2-C_{1-\alpha}(m_2,t)=\max\left\{k : \mathrm{pr}\left(m_2-N_{m_2,t}\ge k\right)\ge 1-\alpha\right\}. \tag{2.6}$$
By noting that
$$m_1-N_{m_1,t}=\sum_{i=1}^{m_1}I(U_i>t)\le \sum_{i=1}^{m_2}I(U_i>t)=m_2-N_{m_2,t}, \tag{2.7}$$
we have
$$\mathrm{pr}\left(m_1-N_{m_1,t}\ge k\right)\le \mathrm{pr}\left(m_2-N_{m_2,t}\ge k\right). \tag{2.8}$$
Combining this with (2.5) and (2.6) leads to the result for part (b).
Parts (a) and (b), respectively, say that $g(k)$ and $h(k)$ are both increasing functions of $k$. Simple algebra establishes that $\{g(k+1)-g(k)\}+\{h(k+1)-h(k)\}=1$. Both $\{g(k+1)-g(k)\}$ and $\{h(k+1)-h(k)\}$ are nonnegative by the increasing property of $g(k)$ and $h(k)$, and hence $0\le g(k+1)-g(k)\le 1$ and $0\le h(k+1)-h(k)\le 1$. Since $g(k)$ and $h(k)$ take only integer values, each difference can only be zero or one. This completes the proof of part (c).

Lemma 2.2. For any given $t$, $0<t<1$, $\bar Q_{1-\alpha}(m_0,t)$ of (2.3) is a $1-\alpha$ UPB for the false discovery proportion $Q_t$, that is,
$$\mathrm{pr}\left(Q_t\le \bar Q_{1-\alpha}(m_0,t)\right)\ge 1-\alpha. \tag{2.9}$$

The proof is straightforward using the fact that $V_t\stackrel{d}{=}N_{m_0,t}$. We have
$$\mathrm{pr}\left(Q_t\le \bar Q_{1-\alpha}(m_0,t)\right)=\mathrm{pr}\left(\frac{V_t}{R_t}\le \frac{C_{1-\alpha}(m_0,t)}{R_t},\ R_t>0\right)+\mathrm{pr}\left(R_t=0\right)=\mathrm{pr}\left(V_t\le C_{1-\alpha}(m_0,t),\ R_t>0\right)+\mathrm{pr}\left(R_t=0\right)=\mathrm{pr}\left(V_t\le C_{1-\alpha}(m_0,t)\right)\ge 1-\alpha. \tag{2.10}$$
In the third step, we have used the fact that $\{V_t\le C_{1-\alpha}(m_0,t)\ \text{and}\ R_t=0\}$ is the same as the set $\{R_t=0\}$, which is obtained by noting that $V_t$ must be zero when $R_t=0$. Following this proof, we can easily see that
$$\left\{Q_t\le \bar Q_{1-\alpha}(m_0,t)\right\}=\left\{V_t\le C_{1-\alpha}(m_0,t)\right\}. \tag{2.11}$$
The construction of $\bar Q_{1-\alpha}(m_0,t)$ in (2.3) is the idea central to formulating prediction inference for $Q_t$. In practice, $m_0$ is an unknown parameter. The most conservative approach is to replace $m_0$ with $m$, in which case we obtain a conservative $1-\alpha$ UPB for $Q_t$. Alternatively, the independence assumption among the true null $P$-values can be used to give a confidence inference for $m_0$, and thus a better estimate of the UPB for $Q_t$. For any given $0<\lambda<1$, a $1-\alpha$ UCB for $m_0$ is given by
$$\widehat m_{0,1-\alpha}(\lambda)=\begin{cases}\max\left\{k\in\{0,\ldots,m-1\} : h(k)=m-R_\lambda\right\} & \text{if } h(m)>m-R_\lambda,\\ m & \text{otherwise},\end{cases} \tag{2.12}$$
where $h(k)=k-C_{1-\alpha}(k,\lambda)$ as defined in Lemma 2.1(c). Since $h(0)=0$ and $h(k+1)-h(k)$ takes only the values zero and one, there exists at least one $k\in\{0,\ldots,m-1\}$ such that $h(k)=m-R_\lambda$ when $h(m)>m-R_\lambda$. Therefore, $\widehat m_{0,1-\alpha}(\lambda)$ in (2.12) is well defined. The parameter $\lambda$ in (2.12) is used to construct the UCB for $m_0$; more discussion of it appears in Remark 2.6 following the theorem below.
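The UCB (2.12) can be sketched in a few lines of Python (the function name is ours). We use the equivalent form $\max\{k : h(k)\le m-R_\lambda\}$, which coincides with (2.12) because $h(0)=0$ and $h$ increases in steps of zero or one:

```python
import numpy as np
from scipy.stats import binom

def m0_ucb(pvals, lam=0.5, alpha=0.05):
    """1 - alpha upper confidence bound for m0 in the spirit of (2.12).

    Returns max{k : k - C_{1-alpha}(k, lam) <= m - R_lam}, where
    m - R_lam is the number of P-values exceeding lam. This equals (2.12)
    since h(k) = k - C_{1-alpha}(k, lam) increases in steps of 0 or 1
    (Lemma 2.1(c)).
    """
    pvals = np.asarray(pvals)
    m = len(pvals)
    surviving = int(np.sum(pvals > lam))          # m - R_lam
    for k in range(m, -1, -1):
        h_k = k - binom.ppf(1 - alpha, k, lam)    # h(k)
        if h_k <= surviving:
            return k
    return 0                                      # never reached: h(0) = 0
```

When every $P$-value exceeds $\lambda$, the bound is the trivial $m$; many small $P$-values shrink it.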

Theorem 2.3. (a) $\widehat m_{0,1-\alpha}(\lambda)$ is a conservative $1-\alpha$ UCB for $m_0$, that is,
$$\mathrm{pr}\left(m_0\le \widehat m_{0,1-\alpha}(\lambda)\right)\ge 1-\alpha. \tag{2.13}$$
(b) In particular, if $\lambda$ takes the same value as $t$ in the $P$-value rejection region, then
$$\mathrm{pr}\left(Q_t\le \bar Q_{1-\alpha}\left(\widehat m_{0,1-\alpha}(t),t\right),\ m_0\le \widehat m_{0,1-\alpha}(t)\right)\ge 1-\alpha. \tag{2.14}$$

Proof. Write $\widehat m_0$ as shorthand for $\widehat m_{0,1-\alpha}(\lambda)$ in this proof. We want to establish that
$$\left\{m_0\le \widehat m_0\right\}=\left\{h(m_0)\le h(\widehat m_0)\right\}. \tag{2.15}$$
The fact that the function $h(k)$ is increasing in $k$ leads to $\{m_0\le \widehat m_0\}\subseteq\{h(m_0)\le h(\widehat m_0)\}$. On the other hand, if $m_0>\widehat m_0$, then $\widehat m_0$ is strictly less than $m$, and we must have $h(\widehat m_0)=m-R_\lambda$ according to (2.12). Since $\widehat m_0$ is the maximum of the $k$ such that $h(k)=m-R_\lambda$, and $m_0>\widehat m_0$, we have $h(m_0)\ne m-R_\lambda=h(\widehat m_0)$. The increasing property of $h(k)$ gives $h(m_0)\ge h(\widehat m_0)$; combining this with $h(m_0)\ne h(\widehat m_0)$, we obtain $h(m_0)>h(\widehat m_0)$. Therefore,
$$m_0>\widehat m_0\ \Longrightarrow\ h(m_0)>h(\widehat m_0), \tag{2.16}$$
which completes the proof of (2.15).
Note that
$$\left\{h(m_0)\le h(\widehat m_0)\right\}=\left\{h(m_0)\le m-R_\lambda,\ \widehat m_0<m\right\}\cup\left\{h(m_0)\le h(m),\ \widehat m_0=m\right\}=\left\{h(m_0)\le m-R_\lambda,\ \widehat m_0<m\right\}\cup\left\{\widehat m_0=m\right\}\ \ (h(m_0)\le h(m)\ \text{always holds})\ \supseteq\ \left\{h(m_0)\le m-R_\lambda\right\}. \tag{2.17}$$
Hence we have
$$\mathrm{pr}\left(m_0\le \widehat m_0\right)=\mathrm{pr}\left(h(m_0)\le h(\widehat m_0)\right)\ge \mathrm{pr}\left(m_0-C_{1-\alpha}(m_0,\lambda)\le m-R_\lambda\right)\ge \mathrm{pr}\left(m_0-C_{1-\alpha}(m_0,\lambda)\le m_0-V_\lambda\right)\ \ (\text{using } m-R_\lambda\ge m_0-V_\lambda)\ =\mathrm{pr}\left(V_\lambda\le C_{1-\alpha}(m_0,\lambda)\right)\ge 1-\alpha\ \ (\text{note that } V_\lambda\stackrel{d}{=}N_{m_0,\lambda}). \tag{2.18}$$
This proves part (a). When $\lambda$ is set to be $t$, the set $\{m_0\le \widehat m_{0,1-\alpha}(t)\}$ contains $\{V_t\le C_{1-\alpha}(m_0,t)\}$ by the above derivation, and $\{Q_t\le \bar Q_{1-\alpha}(m_0,t)\}=\{V_t\le C_{1-\alpha}(m_0,t)\}$ by the derivation of Lemma 2.2. Therefore,
$$\left\{Q_t\le \bar Q_{1-\alpha}\left(\widehat m_{0,1-\alpha}(t),t\right),\ m_0\le \widehat m_{0,1-\alpha}(t)\right\}\supseteq\left\{Q_t\le \bar Q_{1-\alpha}(m_0,t),\ m_0\le \widehat m_{0,1-\alpha}(t)\right\}\supseteq\left\{V_t\le C_{1-\alpha}(m_0,t)\right\}. \tag{2.19}$$
This proves part (b) of the theorem.

Remark 2.4. Theorem 2.3 gives researchers a good sense of the total number $m_0$ of true null hypotheses. Other papers, for example, Storey et al. [10], Benjamini and Hochberg [11], and Langaas et al. [12], gave only point estimates of $m_0$ or $\pi_0=m_0/m$. Part (a) gives a confidence inference for $m_0$, and part (b) gives a simultaneous statement about $Q_t$ and $m_0$, which is more interesting. Meinshausen [13] gives a confidence bound for $m_0$ by using resampling methods, while ours exploits the independence information so that it works for finite samples.

Remark 2.5. Theorem 2.3 of Göb [14] implies that for a binomial distribution, the difference between the median and the mean is less than 1, that is, $|C_{0.5}(m_0,t)-m_0t|<1$. From (2.3), the 50% UPB for $Q_t$ can be estimated by $C_{0.5}(m_0,t)/R_t$. This UPB for $Q_t$ is very close to $m_0t/R_t$, with a difference smaller than $1/R_t$. Replacing $m_0$ by $m$ in $m_0t/R_t$ is equivalent to the classical BH procedure. For a very large $R_t$, the term $1/R_t$ can be ignored, and the BH procedure offers an approximate estimate of the 50% UPB for $Q_t$.
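The median-mean bound cited in Remark 2.5 can be checked numerically. The snippet below (using scipy, our choice of tool) verifies $|C_{0.5}(m_0,t)-m_0t|<1$ for a few binomial distributions:

```python
from scipy.stats import binom

# Check of Goeb's bound in Remark 2.5: the binomial median C_{0.5}(m0, t)
# stays within 1 of the mean m0*t, so the 50% UPB C_{0.5}(m0, t)/R_t is
# within 1/R_t of the BH-style quantity m0*t/R_t.
for m0 in (10, 100, 1000):
    for t in (0.01, 0.05, 0.2):
        median = binom.ppf(0.5, m0, t)   # C_{0.5}(m0, t) of (2.2)
        assert abs(median - m0 * t) < 1
```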

Remark 2.6. When $k$ is large, the distribution $\mathrm{Bin}(k,\lambda)$ can be closely approximated by $N(k\lambda,k\lambda(1-\lambda))$. Let $z_{1-\alpha}$ be the $1-\alpha$ quantile of the standard normal distribution. After some algebraic manipulation, we obtain an approximate $1-\alpha$ UCB for $m_0$:
$$\widehat m_{0,1-\alpha}(\lambda)\approx \frac{1}{\left(2(1-\lambda)\right)^{2}}\left(z_{1-\alpha}\sqrt{\lambda(1-\lambda)}+\sqrt{z_{1-\alpha}^{2}\lambda(1-\lambda)+4\left(m-R_\lambda\right)(1-\lambda)}\right)^{2}. \tag{2.20}$$
Taking $1-\alpha=0.5$, we have $(m-R_\lambda)/(1-\lambda)$, which is equivalent to (2.3) of Storey et al. [10]. For most practical applications, one can set $\lambda=0.5$. Fine tuning of the parameter $\lambda$ will be discussed in Section 3.3.
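The closed form (2.20) is straightforward to code (the function name is ours); note that setting $\alpha=0.5$, so that $z_{0.5}=0$, recovers Storey's estimate $(m-R_\lambda)/(1-\lambda)$:

```python
import math
from scipy.stats import norm

def m0_ucb_normal(m, R_lam, lam=0.5, alpha=0.05):
    """Approximate 1 - alpha UCB for m0 via the normal approximation (2.20).

    m     : total number of hypotheses
    R_lam : number of P-values at most lam
    With alpha = 0.5 (z = 0), this reduces to (m - R_lam)/(1 - lam).
    """
    z = norm.ppf(1 - alpha)
    surviving = m - R_lam                     # number of P-values above lam
    root = z * math.sqrt(lam * (1 - lam)) + math.sqrt(
        z * z * lam * (1 - lam) + 4 * surviving * (1 - lam))
    return (root / (2 * (1 - lam))) ** 2
```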
When the rejection region $[0,t]$ is small, the UCB for $m_0$ obtained from part (b) of Theorem 2.3 may be too conservative. It may be advantageous to have separate values for $\lambda$ and $t$. Part (a) of Theorem 2.3 implies the following.

Corollary 2.7. Replacing $m_0$ by its upper confidence bound $\widehat m_{0,1-\alpha_2}(\lambda)$ and $\alpha$ by $\alpha_1$ in (2.3), define
$$\bar Q_{1-\alpha_1,1-\alpha_2}(t,\lambda)=\frac{C_{1-\alpha_1}\left(\widehat m_{0,1-\alpha_2}(\lambda),t\right)}{R_t}. \tag{2.21}$$
Then $\bar Q_{1-\alpha_1,1-\alpha_2}(t,\lambda)$ is a conservative $1-\alpha$ ($\alpha=\alpha_1+\alpha_2$) UPB for the false discovery proportion $Q_t$.

3. Upper Prediction Bounds and Simultaneous Inferences

3.1. The Setup

In Section 2, the UPBs for $Q$ are only valid for a fixed rejection region $[0,t]$ of $P$-values. In practice, researchers will not fix the rejection region $[0,t]$ but adapt it to the actual data. The logic is the same as in single hypothesis testing. In single hypothesis testing with nested rejection regions $\{\Gamma_\alpha : 0<\alpha<1\}$, for an observed statistic $T$, one finds the rejection region that contains the observed statistic with the smallest type I error $\alpha$, that is,
$$P\text{-value}(T)=\min\left\{\alpha : T\in\Gamma_\alpha\right\}. \tag{3.1}$$

The same logic can be applied to our false discovery proportion. In this case, we try to find the largest rejection region $[0,t]$ such that the false discovery proportion $Q$ is not more than $\gamma$, say 10%, with probability $1-\alpha$. Define
$$\tau^{*}=\max\left\{t : \bar Q_{1-\alpha_1,1-\alpha_2}(t,\lambda)\le\gamma\right\}. \tag{3.2}$$
We then reject any hypothesis whose $P$-value is no greater than $\tau^{*}$. If $\tau^{*}$ were independent of $Q$ and $\bar Q$, then we could expect that
$$\mathrm{pr}\left(Q_{\tau^{*}}\le\gamma\right)\ge \mathrm{pr}\left(Q_{\tau^{*}}\le \bar Q_{1-\alpha_1,1-\alpha_2}(\tau^{*},\lambda)\right)\ge 1-\alpha. \tag{3.3}$$

Asymptotically, $\tau^{*}$ and $(Q,\bar Q)$ may be independent; this question is open for future research. To avoid the independence assumption between $\tau^{*}$ and $(Q,\bar Q)$, we take an alternative approach: find simultaneous UPBs for all rejection regions $[0,t]$, $t\in(0,1)$, that is, find an upper prediction band $\bar Q_t$ such that
$$\mathrm{pr}\left(Q_t\le \bar Q_t\ \text{for all}\ t\in(0,1)\right)\ge 1-\alpha. \tag{3.4}$$

Hence we have simultaneous inferences on $Q_t$ for every rejection region $[0,t]$, $t\in(0,1)$. Following the definition of $C_{1-\alpha}(n,t)$ in (2.2) used to construct the UPB for $Q_t$, we want to define simultaneous critical values for $N_{n,t}$. Using the distribution of $\sup_{t\in(0,1)}N_{n,t}$ is unwise, as $\sup_{t\in(0,1)}N_{n,t}=n$ with probability one. A better approach is to center $N_{n,t}$, that is, to consider
$$\sup_{t\in(0,1)}\left(N_{n,t}-nt\right). \tag{3.5}$$

This leads to a test statistic related to the Kolmogorov-Smirnov test statistic, which gives an upper confidence band for a cumulative distribution function $F(x)$. It turns out that this method leads to very high UPBs when $t$ is close to zero or one. Therefore, we further normalize $N_{n,t}$, that is,
$$Z_n=\sup_{t\in(0,1)}\frac{N_{n,t}-nt}{\sqrt{nt(1-t)}}. \tag{3.6}$$
Note that $Z_n$ is continuously distributed even though each $N_{n,t}$ is discretely distributed. Let $\tilde z_{1-\alpha}(n)$ be the $1-\alpha$ quantile of $Z_n$, that is,
$$\mathrm{pr}\left(Z_n\le \tilde z_{1-\alpha}(n)\right)=1-\alpha. \tag{3.7}$$
We can then redefine $\bar Q$ as
$$\bar Q_{1-\alpha}(m_0,t)=\frac{m_0t+\tilde z_{1-\alpha}(m_0)\sqrt{m_0t(1-t)}}{R_t}. \tag{3.8}$$
Corresponding to Lemma 2.2 and Corollary 2.7, we have the similar results below.

Corollary 3.1. For $0<t<1$, $\bar Q_{1-\alpha}(m_0,t)$ of (3.8) is an exact $1-\alpha$ upper prediction band for the false discovery proportion $Q_t$, that is,
$$\mathrm{pr}\left(Q_t\le \bar Q_{1-\alpha}(m_0,t)\ \text{for all}\ t\in(0,1)\right)=1-\alpha. \tag{3.9}$$

Corollary 3.2. Denote $\widehat m_0=\widehat m_{0,1-\alpha_2}(\lambda)$. Define
$$\bar Q_{1-\alpha_1,1-\alpha_2}(t,\lambda)=\frac{\widehat m_0t+\tilde z_{1-\alpha_1}(\widehat m_0)\sqrt{\widehat m_0t(1-t)}}{R_t}. \tag{3.10}$$
Let $\alpha=\alpha_1+\alpha_2$. Then $\bar Q_{1-\alpha_1,1-\alpha_2}(t,\lambda)$ is a conservative $1-\alpha$ upper prediction band for the false discovery proportion $Q_t$, that is,
$$\mathrm{pr}\left(Q_t\le \bar Q_{1-\alpha_1,1-\alpha_2}(t,\lambda)\ \text{for all}\ t\in(0,1)\right)\ge 1-\alpha. \tag{3.11}$$

Remark 3.3. Using the same idea as in the proof of Lemma 2.2, the proof of the above corollaries is straightforward after converting the comparison between $Q$ and $\bar Q$ into the comparison between $V_t$ and $m_0t+\tilde z_{1-\alpha}(m_0)\sqrt{m_0t(1-t)}$. This conversion provides a powerful tool for understanding the false discovery proportion.

Remark 3.4. The formulation of $\bar Q_{1-\alpha}(m_0,t)$ in (3.8) is motivated by the normal approximation of $N_{n,t}$. But our definition of $\bar Q_{1-\alpha}(m_0,t)$ gives exact UPBs simultaneously for all $t\in(0,1)$ due to the exactness of the quantile $\tilde z_{1-\alpha}$.

Remark 3.5. Meinshausen and Rice [6] and Donoho and Jin [15] also utilize the empirical process $Z_n$. However, they focus on the asymptotic theory for $Z_n$, which may face the slow convergence problem described in the next section. Our focus is on finite-sample control.

Remark 3.6. Let $f(t)=(k-nt)/\sqrt{nt(1-t)}$. After some simplification, we have
$$f'(t)\left(nt(1-t)\right)^{3/2}=-n^{2}t(1-t)-\tfrac{1}{2}(k-nt)\,n(1-2t)=-\tfrac{n}{2}\left[k(1-t)+(n-k)t\right],$$
and hence $f(t)$ is a decreasing function of $t$, $0<t<1$, for $0\le k\le n$. Equation (3.6) can then be simplified to
$$Z_n=\max_{k=1,\ldots,n}\ \sup_{t\in[U_{(k)},U_{(k+1)})}\frac{k-nt}{\sqrt{nt(1-t)}}=\max_{k=1,\ldots,n}\frac{k-nU_{(k)}}{\sqrt{nU_{(k)}\left(1-U_{(k)}\right)}}, \tag{3.12}$$
where $U_{(k)}$ is the $k$th smallest among the $n$ samples from the $U[0,1]$ distribution (with the convention $U_{(n+1)}=1$). This formulation facilitates the computation of the distribution of $Z_n$ by Monte Carlo methods. The standard error associated with the Monte Carlo simulations in computing the probability in (3.7) is no greater than $\sqrt{\alpha(1-\alpha)/B}$, where $B$ is the number of simulations.
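The Monte Carlo computation suggested by (3.12) can be sketched as follows (the function name and the seeding interface are our choices):

```python
import numpy as np

def z_quantile(n, q=0.95, B=10000, rng=None):
    """Monte Carlo estimate of the q-quantile of Z_n using (3.12):
    Z_n = max_k (k - n*U_(k)) / sqrt(n*U_(k)*(1 - U_(k))),
    where U_(1) <= ... <= U_(n) are ordered U[0, 1] samples.
    """
    rng = np.random.default_rng(rng)
    k = np.arange(1, n + 1)
    stats = np.empty(B)
    for b in range(B):
        u = np.sort(rng.uniform(size=n))
        stats[b] = np.max((k - n * u) / np.sqrt(n * u * (1 - u)))
    return np.quantile(stats, q)
```

For $n=1$, $Z_1=\sqrt{(1-U)/U}$ has CDF $z^2/(1+z^2)$, so its median is exactly 1, a convenient sanity check.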

3.2. Computing the Distribution of $Z_n$

In order to make simultaneous inferences, we need to know the distribution of $Z_n$ defined in (3.6). Example 1 of Jaeschke [16] showed that asymptotically, for any $x$,
$$\lim_{n\to\infty}\mathrm{pr}\left(Z_n\le \frac{x+2\ln\ln n+\frac{1}{2}\ln\ln\ln n-\frac{1}{2}\ln\pi}{\sqrt{2\ln\ln n}}\right)=\exp\left(-\exp(-x)\right). \tag{3.13}$$
This implies that $Z_n/\sqrt{2\ln\ln n}$ converges to 1 in probability as $n$ goes to $\infty$. Jaeschke [16] claimed that this probability convergence is of almost no practical use. This is where we need to be cautious in using asymptotic results: Figure 1 shows the poor approximation of the asymptotic result, even for the very large $n=10^5$. Noe and Vandewiele [17] gave an iterative algorithm to compute the exact probability $\mathrm{pr}(Z_n\le\tilde z)$. Their algorithm is only good for very small $n$, due to the computational time and the propagation of precision errors in representing real numbers on a computer. Equation (24) of their paper gives an approximate formula for $n=1,\ldots,100$,
$$\mathrm{pr}\left(Z_n\le\tilde z\right)\approx 1-\tilde z^{-2}-\left(2-3n^{-1}\right)\tilde z^{-4}-\left(10-57n^{-1}+48n^{-2}\right)\tilde z^{-6}-\left(74-1021n^{-1}+2743n^{-2}-1797n^{-3}\right)\tilde z^{-8}-\left(706-19123n^{-1}+111905n^{-2}-213619n^{-3}+120132n^{-4}\right)\tilde z^{-10}. \tag{3.14}$$
This approximation is very good for $\tilde z\ge 4$ but departs from the true probability when $\tilde z<4$. For our applications, the 50% quantile (median) of $Z_n$ is very useful, but the approximation of (3.14) is poor there.

In order to overcome this poor approximation, we propose to use the Monte Carlo method to obtain the probability $\mathrm{pr}(Z_n\le\tilde z)$. Figures 2 and 3 give the probability for $\tilde z\in[1,20]$ with $10^6$ simulations for $n=1,10,10^2,10^3,10^4$, and $10^5$. The quantiles $\tilde z$ generated by the Monte Carlo method for $n=1,\ldots,100$ are almost the same as the quantiles that can be computed by the exact algorithm of Noe and Vandewiele [17]. The two figures show that the distribution of $Z_n$ does not change dramatically from $n=1$ to $n=10^5$. This property is beneficial for multiple testing problems with a large number of hypotheses, as such problems will not be overpenalized. Table 1 gives the quantiles of $Z_n$ for $n=10^5$.

3.3. More about the Upper Confidence Bound for $m_0$

In computing the UCB for $m_0$, and consequently the UPB for $Q_t$, we rely on the unspecified parameter $\lambda$. A conventional choice of $\lambda$ is 0.5. It is tempting to use $\min_{\lambda\in(0,1)}\widehat m_{0,1-\alpha}(\lambda)$ as the best UCB for $m_0$. This approach should be avoided, as it may lead to an overoptimistic UCB. Instead, we can use the same idea as in computing the simultaneous upper prediction bounds for $Q_t$ to find a UCB for $m_0$. Equation (2.20) motivates the following theorem.

Theorem 3.7. Define $\widehat m_{0,1-\alpha}(\lambda)$ as
$$\widehat m_{0,1-\alpha}(\lambda)=\frac{1}{\left(2(1-\lambda)\right)^{2}}\left(\tilde z^{*}_{1-\alpha}(m)\sqrt{\lambda(1-\lambda)}+\sqrt{\left(\tilde z^{*}_{1-\alpha}(m)\right)^{2}\lambda(1-\lambda)+4\left(m-R_\lambda\right)(1-\lambda)}\right)^{2}, \tag{3.15}$$
where
$$\tilde z^{*}_{1-\alpha}(m)=\max_{n=1,\ldots,m}\tilde z_{1-\alpha}(n). \tag{3.16}$$
Let
$$\widehat m_{0,1-\alpha}=\min_{\lambda\in(0,1)}\widehat m_{0,1-\alpha}(\lambda). \tag{3.17}$$
Using $\widehat m_{0,1-\alpha}$ to replace $m_0$ in (3.8) results in $\bar Q_{1-\alpha}(\widehat m_{0,1-\alpha},t)$. We have
$$\mathrm{pr}\left(m_0\le \widehat m_{0,1-\alpha},\ Q_t\le \bar Q_{1-\alpha}\left(\widehat m_{0,1-\alpha},t\right)\ \text{for all}\ t\in(0,1)\right)\ge 1-\alpha. \tag{3.18}$$
Thus, simultaneously, $\widehat m_{0,1-\alpha}$ is a $1-\alpha$ UCB for $m_0$ and $\bar Q$ is a $1-\alpha$ upper prediction band.

Proof. Let $\tilde z^{*}_{1-\alpha}(m)=\max_{n=1,\ldots,m}\tilde z_{1-\alpha}(n)$ denote the maximum quantile of (3.16). Note that when $x>0$,
$$x(1-\lambda)-\sqrt{x}\,\tilde z^{*}_{1-\alpha}(m)\sqrt{\lambda(1-\lambda)}-\left(m-R_\lambda\right)\le 0 \tag{3.19}$$
if and only if
$$x\le \frac{1}{\left(2(1-\lambda)\right)^{2}}\left(\tilde z^{*}_{1-\alpha}(m)\sqrt{\lambda(1-\lambda)}+\sqrt{\left(\tilde z^{*}_{1-\alpha}(m)\right)^{2}\lambda(1-\lambda)+4\left(m-R_\lambda\right)(1-\lambda)}\right)^{2}. \tag{3.20}$$
Therefore,
$$\left\{m_0\le \widehat m_{0,1-\alpha}\right\}=\bigcap_{\lambda\in(0,1)}\left\{m_0\le \widehat m_{0,1-\alpha}(\lambda)\right\}=\bigcap_{\lambda\in(0,1)}\left\{m_0(1-\lambda)-\sqrt{m_0}\,\tilde z^{*}_{1-\alpha}(m)\sqrt{\lambda(1-\lambda)}-\left(m-R_\lambda\right)\le 0\right\}=\left\{\max_{\lambda\in(0,1)}\frac{m_0(1-\lambda)-\left(m-R_\lambda\right)}{\sqrt{m_0\lambda(1-\lambda)}}\le \tilde z^{*}_{1-\alpha}(m)\right\}\supseteq\left\{\max_{\lambda\in(0,1)}\frac{V_\lambda-m_0\lambda}{\sqrt{m_0\lambda(1-\lambda)}}\le \tilde z_{1-\alpha}(m_0)\right\}. \tag{3.21}$$
The last step follows from: (i) $\tilde z^{*}_{1-\alpha}(m)$ is no less than $\tilde z_{1-\alpha}(n)$ for any $n\le m$, and (ii) $m-R_\lambda\ge m_0-V_\lambda$. Fact (ii) gives
$$m_0(1-\lambda)-\left(m-R_\lambda\right)\le m_0(1-\lambda)-\left(m_0-V_\lambda\right)=V_\lambda-m_0\lambda. \tag{3.22}$$
Following the same idea as in the proof of Theorem 2.3 part (b), we can show that the set $\{m_0\le \widehat m_{0,1-\alpha}\ \text{and}\ Q_t\le \bar Q_{1-\alpha}(\widehat m_{0,1-\alpha},t)\ \text{for all}\ t\in(0,1)\}$ is a superset of $\{\max_{t\in(0,1)}(V_t-m_0t)/\sqrt{m_0t(1-t)}\le \tilde z_{1-\alpha}(m_0)\}$. Therefore,
$$\mathrm{pr}\left(m_0\le \widehat m_{0,1-\alpha},\ Q_t\le \bar Q_{1-\alpha}\left(\widehat m_{0,1-\alpha},t\right)\ \text{for all}\ t\in(0,1)\right)\ge \mathrm{pr}\left(\max_{t\in(0,1)}\frac{V_t-m_0t}{\sqrt{m_0t(1-t)}}\le \tilde z_{1-\alpha}(m_0)\right)=1-\alpha, \tag{3.23}$$
noting that $\{V_t\}\stackrel{d}{=}\{N_{m_0,t}\}$. Readers should note that the maximum quantile $\tilde z^{*}_{1-\alpha}(m)$ defined in (3.16) is only used to construct $\widehat m_{0,1-\alpha}$; the construction of $\bar Q_{1-\alpha}(\widehat m_{0,1-\alpha},t)$ itself does not use the maximum but the quantile $\tilde z_{1-\alpha}(\widehat m_{0,1-\alpha})$, although $\bar Q_{1-\alpha}(\widehat m_{0,1-\alpha},t)$ still depends on the maximum quantile indirectly through $\widehat m_{0,1-\alpha}$.

3.4. The Algorithm

Putting all these pieces together, we describe the procedure to compute the upper prediction band for $Q_t$ and the UCB for $m_0$ in Algorithm 1. Note that we have to compute the quantiles $\tilde z_{1-\alpha}(n)$ up to $n=m$ as well as $\tilde z_{1-\alpha}(\widehat m_{0,1-\alpha})$. This is very time consuming for a large $m$, which typically runs from thousands to tens of thousands. The computational time can be reduced by the following strategies.
(1) After careful study of the two equations (3.8) and (3.15), we find that if we replace all the quantiles $\tilde z_{1-\alpha}(n)$, including the maximum in (3.16), by $\tilde z_{1-\alpha}(N)$ with $N\ge n$, the conclusions of Corollaries 3.1 and 3.2 and Theorem 3.7 still hold.
(2) The quantile $\tilde z_{1-\alpha}(n)$ appears to be an increasing function of $n$, as shown by the Monte Carlo simulations in Figures 2 and 3; a rigorous mathematical proof of this finding is open to future research. For practical applications, we can first use Monte Carlo simulations to verify this property over the range of $n$ relevant to the project, and then replace the maximum in (3.16) by $\tilde z_{1-\alpha}(m)$ itself.
(3) Figure 2 shows that $\tilde z_{1-\alpha}(n)$ is very close to $\tilde z_{1-\alpha}(N)$ when $n$ is close to a large $N$, say more than 100. Therefore, in practical computations, we can first compute and store a representative sequence of quantiles $\tilde z_{1-\alpha}(n_i)$, $i=1,\ldots,I$, from which we obtain an upper bound for $\tilde z_{1-\alpha}(n)$ for each $n=1,\ldots,m$. In computing these quantiles, we recommend at least $10^4$ Monte Carlo simulations in order to get an accurate quantile in the tail; even with $10^6$ simulations, we still see a small amount of random noise in the tail in Figure 3.

Algorithm 1. For any given $\alpha$ and $\gamma$:
(1) Compute $\widehat m_{0,1-\alpha}(\lambda)$ of (3.15) for some pre-specified $\lambda_i$'s, say $\lambda_i=i/1000$ for $i=1,\ldots,999$.
(2) Compute $\widehat m_{0,1-\alpha}=\min_i \widehat m_{0,1-\alpha}(\lambda_i)$. This $\widehat m_{0,1-\alpha}$ is the $1-\alpha$ UCB for $m_0$. If $\widehat m_{0,1-\alpha}$ exceeds $m$, replace it by $m$.
(3) Sort the observed $P$-values so that $P_{(1)}\le\cdots\le P_{(m)}$, and use (3.8) to compute the $1-\alpha$ simultaneous UPBs for the false discovery proportion $Q$, that is, for $i=1,\ldots,m$,
$$\bar Q_{1-\alpha}\left(P_{(i)}\right)=\frac{1}{i}\left(\widehat m_{0,1-\alpha}P_{(i)}+\tilde z_{1-\alpha}\left(\widehat m_{0,1-\alpha}\right)\sqrt{\widehat m_{0,1-\alpha}P_{(i)}\left(1-P_{(i)}\right)}\right).$$
If $\bar Q_{1-\alpha}(P_{(i)})$ exceeds 1, replace it by 1.
(4) Compute $\tau=\max\{P_{(i)} : \bar Q_{1-\alpha}(P_{(i)})\le\gamma\}$, and reject the hypotheses whose $P$-values are no greater than $\tau$; this ensures that the false discovery proportion $Q$ does not exceed $\gamma$ with probability $1-\alpha$.
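Algorithm 1 can be sketched in Python as below. For simplicity we follow strategy (1) of Section 3.4 and use a single conservative quantile `z_tilde` $=\tilde z_{1-\alpha}(N)$, $N\ge m$, supplied by the user (e.g., from Table 1 or a Monte Carlo run); the function name, the $\lambda$ grid, and the conservative rounding of the UCB are our choices, not prescribed by the paper:

```python
import numpy as np

def algorithm1(pvals, z_tilde, gamma=0.1, lambdas=None):
    """Sketch of Algorithm 1: UCB for m0, simultaneous UPBs for Q, and the
    data-driven cutoff tau. z_tilde is one conservative quantile used in
    place of all z~_{1-alpha}(n) (strategy (1) of Section 3.4)."""
    p = np.sort(np.asarray(pvals))
    m = len(p)
    if lambdas is None:
        lambdas = np.arange(1, 1000) / 1000.0    # lambda_i = i/1000
    # Steps 1-2: UCB for m0 by (3.15), minimized over the lambda grid.
    best = float(m)
    for lam in lambdas:
        surviving = np.sum(p > lam)              # m - R_lam
        root = z_tilde * np.sqrt(lam * (1 - lam)) + np.sqrt(
            z_tilde**2 * lam * (1 - lam) + 4 * surviving * (1 - lam))
        best = min(best, (root / (2 * (1 - lam)))**2)
    m0_hat = min(float(np.ceil(best)), m)        # round up conservatively, cap at m
    # Step 3: simultaneous UPBs (3.8) at each ordered P-value (R = rank i).
    i = np.arange(1, m + 1)
    Q_bar = (m0_hat * p + z_tilde * np.sqrt(m0_hat * p * (1 - p))) / i
    Q_bar = np.minimum(Q_bar, 1.0)
    # Step 4: largest cutoff whose UPB is at most gamma.
    ok = np.nonzero(Q_bar <= gamma)[0]
    tau = p[ok[-1]] if ok.size else 0.0
    return m0_hat, Q_bar, tau
```

Hypotheses with $P$-values no greater than the returned `tau` are rejected.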

4. A Focused Inference on $Q$ and $m_0$: A Unified Approach

In many applications, it may be unnecessary to compute the simultaneous UPBs for $Q_t$ for all $t\in(0,1)$ and to use $\widehat m_{0,1-\alpha}(\lambda)$ for all $\lambda\in(0,1)$ to derive a $1-\alpha$ UCB for $m_0$. In most applications, it may be reasonable to restrict the rejection regions to $\{[0,t] : t\in[t_0,t^{0}]\}$. The lower endpoint $t_0$ can take the value $0.01/m$, based on Bonferroni FWER control at level 0.01; it is rare to consider a rejection region smaller than this. The upper endpoint $t^{0}$ can take the value 0.05, as it is rare to consider a rejection region larger than $[0,0.05]$ even in a single hypothesis testing problem. For the same reason, we can also restrict $\lambda$ to $[\lambda_0,\lambda^{0}]$ in (3.17). The interval $[\lambda_0,\lambda^{0}]$ can be taken close to one, as the minimum of $\widehat m_{0,1-\alpha}(\lambda)$ is reached when $\lambda$ is close to 1 [18]; but if $\lambda$ is too close to 1, $\widehat m_{0,1-\alpha}(\lambda)$ is not stable. One good choice is $\lambda_0=0.8$ and $\lambda^{0}=0.95$.

The above scenario is a focused inference on $Q$ and $m_0$. The quantile $\tilde z$ of Section 3.2 will be somewhat conservative for this purpose. We can redefine $Z_n$ as follows:
$$Z'_n=\max\left\{\sup_{t\in[t_0,t^{0}]}\frac{N_{n,t}-nt}{\sqrt{nt(1-t)}},\ \sup_{\lambda\in[\lambda_0,\lambda^{0}]}\frac{N_{n,\lambda}-n\lambda}{\sqrt{n\lambda(1-\lambda)}}\right\}. \tag{4.1}$$

From this $Z'_n$ we can define the $1-\alpha$ quantile $\tilde z\,'_{1-\alpha}(n)$ and derive results similar to Theorem 3.7. Figure 4 shows the quantiles $\tilde z\,'_{1-\alpha}(n)$ for $n=1,\ldots,10^5$ with $[t_0,t^{0}]=[0.01/n,0.05]$ and $[\lambda_0,\lambda^{0}]=[0.8,0.95]$. Table 2 gives the numerical values of $\tilde z\,'_{1-\alpha}(n)$ for $n=10^5$. It clearly shows that $\tilde z\,'_{1-\alpha}(n)$ is around 10% smaller than the unrestricted quantile $\tilde z_{1-\alpha}(n)$. For small values of $\alpha$, say $\alpha\le 0.01$, the former is at least 25% smaller than the latter.
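The focused statistic (4.1) can be simulated in the same way as $Z_n$, except that the sup must also be evaluated at the left endpoint of each interval, since the jump points of $N_{n,t}$ need not fall inside it. A sketch under these assumptions (function names ours):

```python
import numpy as np

def _sup_on(u, n, a, b):
    """sup over t in [a, b] of (N_{n,t} - n*t)/sqrt(n*t*(1 - t)).

    By Remark 3.6 the statistic decreases between jumps of N_{n,t}, so it
    suffices to evaluate it at the left endpoint a and at each order
    statistic falling in (a, b]."""
    ts = np.concatenate(([a], u[(u > a) & (u <= b)]))
    counts = np.searchsorted(u, ts, side="right")    # N_{n,t} at each candidate
    return np.max((counts - n * ts) / np.sqrt(n * ts * (1 - ts)))

def z_quantile_focused(n, q=0.95, B=5000, t_hi=0.05,
                       lam_lo=0.8, lam_hi=0.95, rng=None):
    """Monte Carlo q-quantile of the focused statistic (4.1), with the sup
    restricted to [0.01/n, t_hi] and [lam_lo, lam_hi]."""
    rng = np.random.default_rng(rng)
    t_lo = 0.01 / n
    stats = np.empty(B)
    for b in range(B):
        u = np.sort(rng.uniform(size=n))
        stats[b] = max(_sup_on(u, n, t_lo, t_hi),
                       _sup_on(u, n, lam_lo, lam_hi))
    return np.quantile(stats, q)
```

Restricting the sup in this way is what makes the focused quantiles smaller than the unrestricted ones.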

Corollary 4.1. Define $\widehat m\,'_{0,1-\alpha}(\lambda)$ as
$$\widehat m\,'_{0,1-\alpha}(\lambda)=\frac{1}{\left(2(1-\lambda)\right)^{2}}\left(\tilde z\,'^{*}_{1-\alpha}(m)\sqrt{\lambda(1-\lambda)}+\sqrt{\left(\tilde z\,'^{*}_{1-\alpha}(m)\right)^{2}\lambda(1-\lambda)+4\left(m-R_\lambda\right)(1-\lambda)}\right)^{2}, \tag{4.2}$$
where $\tilde z\,'^{*}_{1-\alpha}(m)=\max_{n=1,\ldots,m}\tilde z\,'_{1-\alpha}(n)$. Let
$$\widehat m\,'_{0,1-\alpha}=\min\left\{\min_{\lambda\in[\lambda_0,\lambda^{0}]}\widehat m\,'_{0,1-\alpha}(\lambda),\ \min_{t\in[t_0,t^{0}]}\widehat m\,'_{0,1-\alpha}(t)\right\}. \tag{4.3}$$
Define
$$\bar Q\,'_{1-\alpha}(m_0,t)=\frac{m_0t+\tilde z\,'_{1-\alpha}(m_0)\sqrt{m_0t(1-t)}}{R_t}. \tag{4.4}$$
Replacing $m_0$ by $\widehat m\,'_{0,1-\alpha}$ results in $\bar Q\,'_{1-\alpha}(\widehat m\,'_{0,1-\alpha},t)$. We have that $\widehat m\,'_{0,1-\alpha}$ is a $1-\alpha$ UCB for $m_0$, and $\bar Q\,'$ is a $1-\alpha$ upper prediction band for $Q_t$ over $t\in[t_0,t^{0}]$, that is,
$$\mathrm{pr}\left(m_0\le \widehat m\,'_{0,1-\alpha},\ Q_t\le \bar Q\,'_{1-\alpha}\left(\widehat m\,'_{0,1-\alpha},t\right)\ \text{for all}\ t\in[t_0,t^{0}]\right)\ge 1-\alpha. \tag{4.5}$$

Note that the $1-\alpha$ UCB for $m_0$ takes not only the minimum of $\widehat m\,'_{0,1-\alpha}(\lambda)$ over $\lambda\in[\lambda_0,\lambda^{0}]$, but also the minimum of $\widehat m\,'_{0,1-\alpha}(t)$ over $t\in[t_0,t^{0}]$. This advantage is due to the construction of $Z'_n$, which takes the maximum over these two intervals. The details of the calculation are summarized in Algorithm 2. The proof of this corollary is the same as that of Theorem 3.7.

Algorithm 2. For any given $\alpha$ and $\gamma$, choose $t_0=0.01/m$, $t^{0}=0.05$, $\lambda_0=0.8$, $\lambda^{0}=0.95$.
(1) Compute $\widehat m\,'_{0,1-\alpha}(\lambda)$ of (4.2) for some pre-specified $\lambda_i$'s in $[\lambda_0,\lambda^{0}]$ and pre-specified $t_i$'s in $[t_0,t^{0}]$, say $\lambda_i=\lambda_0+(\lambda^{0}-\lambda_0)i/1000$ and $t_i=t_0+(t^{0}-t_0)i/1000$ for $i=0,\ldots,1000$.
(2) Compute $\widehat m\,'_{0,1-\alpha}=\min\left(\min_i \widehat m\,'_{0,1-\alpha}(\lambda_i),\ \min_i \widehat m\,'_{0,1-\alpha}(t_i)\right)$. This $\widehat m\,'_{0,1-\alpha}$ is the $1-\alpha$ UCB for $m_0$. If $\widehat m\,'_{0,1-\alpha}$ exceeds $m$, replace it by $m$.
(3) Sort the observed $P$-values so that $P_{(1)}\le\cdots\le P_{(m)}$, and use (4.4) to compute the $1-\alpha$ UPBs for the false discovery proportion $Q$, that is, for $P_{(i)}\in[t_0,t^{0}]$,
$$\bar Q\,'_{1-\alpha}\left(P_{(i)}\right)=\frac{1}{i}\left(\widehat m\,'_{0,1-\alpha}P_{(i)}+\tilde z\,'_{1-\alpha}\left(\widehat m\,'_{0,1-\alpha}\right)\sqrt{\widehat m\,'_{0,1-\alpha}P_{(i)}\left(1-P_{(i)}\right)}\right).$$
If $\bar Q\,'_{1-\alpha}(P_{(i)})$ exceeds 1, replace it by 1.
(4) Compute $\tau=\max\{P_{(i)}\in[t_0,t^{0}] : \bar Q\,'_{1-\alpha}(P_{(i)})\le\gamma\}$, and reject the hypotheses whose $P$-values are no greater than $\tau$; this ensures that the false discovery proportion $Q$ does not exceed $\gamma$ with probability $1-\alpha$.

By setting $\lambda_0=\lambda^{0}=t$ and $t_0=t^{0}=t$, this corollary becomes equivalent to Theorem 2.3 through some algebraic manipulation, although Theorem 2.3 uses the exact confidence bound from the binomial distribution without relying on the quantiles of $Z_n$. Furthermore, Theorem 3.7 is exactly the special case of Corollary 4.1 obtained by setting $\lambda_0=0$, $\lambda^{0}=1$, $t_0=0$, and $t^{0}=1$ and by considering open intervals rather than closed intervals. The focused inference thus unifies both the fixed rejection region approach and the simultaneous approach. We should be cautious about selecting $[t_0,t^{0}]$ and $[\lambda_0,\lambda^{0}]$ based on the observed data, which may result in overoptimistic false discovery proportions; these settings have to be decided before the data are generated. A careful study of appropriate choices for $[t_0,t^{0}]$ and $[\lambda_0,\lambda^{0}]$ is open for future research.

5. Generalizing the Results to Less-Independent Situations

The results of Sections 2, 3, and 4 are based on the assumption that the true null $P$-values are independently distributed as $U[0,1]$; given this, we need no further assumptions concerning the false null $P$-values. This independence assumption can be weakened as follows.

Binomial Dominant Condition. One has $V_t\le_d N_{m_0,t}$ for $0<t<1$.

The notation $X\le_d Y$ means that the random variable $X$ is stochastically no greater than the random variable $Y$, that is, $\mathrm{pr}(X\le x)\ge \mathrm{pr}(Y\le x)$ for any $x$. Replacing the independence assumption by the binomial dominant condition, the results corresponding to Lemma 2.2, Theorem 2.3, and Corollary 2.7 in Section 2 still hold for a fixed rejection region. For the simultaneous UPBs in Sections 3 and 4, we need a stronger assumption than the binomial dominant condition, as the joint distribution of $\{V_t, t\in(0,1)\}$ needs to be specified. We can replace the binomial dominant condition by the following.

Joint Binomial Dominant Condition. $(V_{t_1},\ldots,V_{t_k})\le_d (N_{m_0,t_1},\ldots,N_{m_0,t_k})$ for any $k=1,2,\ldots$ and any $t_1,\ldots,t_k\in(0,1)$. Here $N_{m_0,t}=\sum_{i=1}^{m_0}I(U_i\le t)$, and $U_i$, $i=1,\ldots,m_0$, are mutually independent with distribution $U[0,1]$. The notation $(X_1,\ldots,X_k)\le_d(Y_1,\ldots,Y_k)$ means that for any $x_1,\ldots,x_k$, $\mathrm{pr}(X_1\le x_1,\ldots,X_k\le x_k)\ge \mathrm{pr}(Y_1\le x_1,\ldots,Y_k\le x_k)$.

Replacing the independence assumption by this joint binomial dominant condition, the results in Sections 3 and 4 remain valid for the upper prediction band for $Q$ and the UCB for $m_0$. A special case of this joint binomial dominant condition occurs when the true null $P$-values are independent with distributions stochastically no smaller than $U[0,1]$. This happens when the null hypothesis is composite or when the statistic used to test the null hypothesis is not a continuous random variable.

More generally, we would like the construction of the upper prediction band for $Q$ not to rely on the independence assumption or any kind of weak dependence assumption (the binomial dominant condition or the joint binomial dominant condition). The method of Romano and Shaikh [7, 8] can be applied without any assumption on the dependence, but it potentially loses power because the correlation structure of the data is not exploited. Resampling procedures [13, 19] have been proposed to address this limitation.

6. Discussion

The method of this paper applies to data where the true null $P$-values are independent, or to slightly dependent data where the joint binomial dominant condition is satisfied. This assumption does not rely on any specification of the false null $P$-values. In this paper we used the idea of considering a fixed rejection region to construct a UPB for $Q_t$ and a UCB for $m_0$. By utilizing the normalized empirical process $Z_n=\sup_{t\in(0,1)}(N_{n,t}-nt)/\sqrt{nt(1-t)}$, we find simultaneous UPBs for $Q_t$ for all $t\in(0,1)$ and can further modify the construction of the UCB for $m_0$. The result of Theorem 3.7 gives a joint statement about the UCB for $m_0$ and the simultaneous UPBs for the false discovery proportion $Q$. The focused approach in Corollary 4.1 unifies the result of the fixed rejection region method and the simultaneous approach.

The method in this paper is based on finite samples and avoids the slow convergence problem of the asymptotic theory for the empirical process $Z_n$. The Monte Carlo simulations give very accurate estimates of the quantiles of $Z_n$. The standard error associated with the Monte Carlo simulations in computing the probability in (3.7) is no greater than $\sqrt{\alpha(1-\alpha)/B}$, where $B$ is the number of simulations.

In datasets where the test statistics are not independent or do not satisfy the joint binomial dominant condition, the method in this paper is not guaranteed to work. One can alternatively use the methods proposed in Romano and Shaikh [7, 8], Meinshausen [13], or Ge et al. [19]. The method proposed in this paper can potentially be extended to dependent data by using resampling, and this work is open for future research.

Acknowledgments

The authors thank Terry Speed, Stuart Sealfon, Carol Bodian, Sylvan Wallenstein, John Mandeli, Samprit Chatterjee, and Jim Godbold for their discussions of this work. They thank the editor and referees for helpful comments that have led to an improved paper. This work was partly supported by the National Institute of Allergy and Infectious Diseases under contract HHSN266200500021C.