Research Article

An Efficient Approach to Screening Epigenome-Wide Data

Figure 4

Numbers of misidentified CpG sites versus the cutoff frequency at which a CpG site is treated as potentially important (based on ordinary least squares regressions). The true numbers of important CpG sites are (a) 10, (b) 100, (c) 200, and (d) 400 out of 2,000 CpG sites across 400 subjects. For the TT screening method, the significance level is 0.05 for both the training and the testing data. For the FDR-based and Bonferroni methods, the level is also 0.05.
(a) 10 important CpG sites out of 2,000
(b) 100 important CpG sites out of 2,000
(c) 200 important CpG sites out of 2,000
(d) 400 important CpG sites out of 2,000
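
As a rough illustration of how a misidentification count for the two comparison methods in the caption could be computed, the following minimal sketch applies a Bonferroni correction and a Benjamini-Hochberg step-up procedure to a list of per-site p-values at level 0.05. Several points are assumptions rather than details from the article: the FDR-based method is taken to be Benjamini-Hochberg, "misidentified" is taken to mean false positives plus false negatives, and the function names and toy p-values are hypothetical. The TT screening method and its cutoff frequency are not sketched here.

```python
def bonferroni(pvals, alpha=0.05):
    """Return indices of sites with p <= alpha / m (Bonferroni correction)."""
    m = len(pvals)
    return {i for i, p in enumerate(pvals) if p <= alpha / m}


def benjamini_hochberg(pvals, alpha=0.05):
    """Return indices rejected by the Benjamini-Hochberg step-up procedure.

    Assumption: the article's "FDR-based" method may differ from this one.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value is below its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k = rank
    return set(order[:k])


def misidentified(selected, important):
    """Count false positives plus false negatives (assumed definition)."""
    false_pos = len(selected - important)
    false_neg = len(important - selected)
    return false_pos + false_neg


# Toy data (hypothetical): 10 sites, of which sites 0-3 are truly important.
pvals = [0.001, 0.008, 0.012, 0.03, 0.2, 0.5, 0.7, 0.9, 0.95, 0.99]
important = {0, 1, 2, 3}

bonf_selected = bonferroni(pvals)
bh_selected = benjamini_hochberg(pvals)

print("Bonferroni misidentified:", misidentified(bonf_selected, important))
print("BH misidentified:", misidentified(bh_selected, important))
```

In this toy example the Bonferroni threshold (0.05 / 10 = 0.005) keeps only one site, whereas the step-up procedure keeps three, so the stricter correction misses more of the truly important sites. In a simulation like the one in the figure, the p-values would come from per-site regressions rather than a hand-written list.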