An Efficient Approach to Screening Epigenome-Wide Data
Table 6
Simulation results for selecting important variables among 2,000 candidates including the most and least important surrogate variables across 600 subjects.
Most important surrogate variables included

                    Bonferroni                    FDR                           TT
Metric       Sites  .sv = 5   .sv = 10  .sv = 15  .sv = 5   .sv = 10  .sv = 15  .sv = 5   .sv = 10  .sv = 15
# incorrect  10     0         0         0         6         6         6         5         4         3
# incorrect  100    2         3         2         19        19        19        7         5         5
# incorrect  200    6         6         5         26        29        29        6         6         4
# incorrect  400    12        11        11        40        38        39        7         7         7
Sensitivity  10     1         1         1         1         1         1         1         1         1
Sensitivity  100    0.98      0.97      0.98      1         1         1         0.98      0.98      0.98
Sensitivity  200    0.97      0.97      0.975     0.995     0.995     0.995     0.985     0.985     0.985
Sensitivity  400    0.97      0.973     0.973     1         1         1         0.988     0.988     0.985
Specificity  10     1         1         1         0.997     0.997     0.997     0.997     0.998     0.998
Specificity  100    1         1         1         0.99      0.99      0.99      0.997     0.998     0.998
Specificity  200    1         1         1         0.986     0.984     0.984     0.998     0.998     0.999
Specificity  400    1         1         1         0.975     0.976     0.976     0.999     0.999     0.999

Most important surrogate variables not included

                    Bonferroni                    FDR                           TT
Metric       Sites  .sv = 5   .sv = 10  .sv = 15  .sv = 5   .sv = 10  .sv = 15  .sv = 5   .sv = 10  .sv = 15
# incorrect  10     10        10        10        10        10        10        10        10        10
# incorrect  100    100       100       100       100       100       100       100       100       100
# incorrect  200    200       200       200       200       200       200       200       200       200
# incorrect  400    400       400       400       400       400       400       400       400       400
Sensitivity  10     0         0         0         0         0         0         0         0         0
Sensitivity  100    0         0         0         0         0         0         0         0         0
Sensitivity  200    0         0         0         0         0         0         0         0         0
Sensitivity  400    0         0         0         0         0         0         0         0         0
Specificity  10     1         1         1         1         1         1         1         1         1
Specificity  100    1         1         1         1         1         1         1         1         1
Specificity  200    1         1         1         1         1         1         1         1         1
Specificity  400    1         1         1         1         1         1         1         1         1

FDR: false discovery rate; TT: training and testing; .sv: number of surrogate variables; Sites: number of truly important CpG sites out of 2,000.
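
The three metrics reported in Table 6 can be related to one another with a short sketch. It assumes that "# incorrect" counts false positives plus false negatives, which is consistent with the tabulated values but is not stated in the table itself. The function name `screening_metrics` and the example site indices are hypothetical and chosen only for illustration.

```python
def screening_metrics(selected, important, n_candidates):
    """Return (n_incorrect, sensitivity, specificity) for one screening result.

    Assumption: "# incorrect" = false positives + false negatives.
    """
    selected, important = set(selected), set(important)
    tp = len(selected & important)       # important sites that were selected
    fp = len(selected - important)       # unimportant sites that were selected
    fn = len(important - selected)       # important sites that were missed
    tn = n_candidates - tp - fp - fn     # unimportant sites correctly left out
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return fp + fn, sensitivity, specificity


# Hypothetical example: 2,000 candidates, sites 0..99 truly important,
# and a method that misses two of them without any false positives.
important = range(100)
selected = range(2, 100)
n_incorrect, sens, spec = screening_metrics(selected, important, 2000)
print(n_incorrect, round(sens, 3), round(spec, 3))  # prints: 2 0.98 1.0
```

Under that assumption the example gives the same combination (2 incorrect, sensitivity 0.98, specificity 1) that appears in the Bonferroni columns of Table 6. Selecting nothing when 10 sites are important gives 10 incorrect, sensitivity 0, and specificity 1, the pattern seen when the most important surrogate variables are not included.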