Research Article

An Efficient Approach to Screening Epigenome-Wide Data

Figure 2

Numbers of misidentified CpG sites versus the cutoff frequency at which a CpG site is deemed potentially important (based on ordinary least squares regressions). The true numbers of important CpG sites are (a) 10, (b) 100, (c) 200, and (d) 400 out of 2,000 CpG sites. For the TT screening method, two sets of significance levels are considered: (1) 0.05 for training data and 0.1 for testing data; (2) 0.05 for both training and testing data. For the FDR-based and Bonferroni methods, the significance level was set at 0.05.
(a) 10 important CpG sites out of 2,000
(b) 100 important CpG sites out of 2,000
(c) 200 important CpG sites out of 2,000
(d) 400 important CpG sites out of 2,000
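
As a rough illustration of the two comparison procedures named in the caption, the following Python sketch applies a Bonferroni correction and an FDR procedure at the 0.05 level to a vector of per-site p-values, then counts misidentified sites. Several points are assumptions rather than details taken from the article: the FDR-based method is taken to be the Benjamini-Hochberg step-up procedure, "misidentified" is taken to mean false positives plus false negatives, and the function names and toy p-values are hypothetical.

```python
import numpy as np


def bonferroni_select(pvals, alpha=0.05):
    """Select sites whose p-value is at most alpha / m (Bonferroni)."""
    pvals = np.asarray(pvals, dtype=float)
    return pvals <= alpha / pvals.size


def bh_select(pvals, alpha=0.05):
    """Select sites with the Benjamini-Hochberg step-up procedure.

    Assumption: the caption's "FDR-based" method is approximated here by
    Benjamini-Hochberg; the article may use a different FDR procedure.
    """
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)                      # indices of ascending p-values
    thresholds = alpha * np.arange(1, m + 1) / m   # i * alpha / m for rank i
    passed = pvals[order] <= thresholds
    selected = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])          # largest rank meeting its threshold
        selected[order[:k + 1]] = True             # select all sites up to that rank
    return selected


def count_misidentified(selected, truly_important):
    """Count false positives plus false negatives (assumed definition)."""
    selected = np.asarray(selected, dtype=bool)
    truly_important = np.asarray(truly_important, dtype=bool)
    return int(np.sum(selected != truly_important))


# Toy data (hypothetical): 10 sites, of which the first 4 are truly important.
pvals = [0.001, 0.008, 0.02, 0.03, 0.2, 0.5, 0.7, 0.9, 0.04, 0.6]
truth = [True, True, True, True, False, False, False, False, False, False]

bonf_errors = count_misidentified(bonferroni_select(pvals), truth)
bh_errors = count_misidentified(bh_select(pvals), truth)
print("Bonferroni misidentified:", bonf_errors)
print("BH misidentified:", bh_errors)
```

In a simulation like the one summarized in the figure, the p-values would come from per-site ordinary least squares regressions, and the set of truly important sites would be known by construction.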