BioMed Research International

Research Article

An Efficient Approach to Screening Epigenome-Wide Data

Table 6

Simulation results for selecting important variables among 2,000 candidates including the most and least important surrogate variables across 600 subjects.


	Bonferroni			FDR			TT
	.sv = 5	.sv = 10	.sv = 15	.sv = 5	.sv = 10	.sv = 15	.sv = 5	.sv = 10	.sv = 15

Most important surrogate variables included

	# incorrect
	0	0	0	6	6	6	5	4	3
	2	3	2	19	19	19	7	5	5
	6	6	5	26	29	29	6	6	4
	12	11	11	40	38	39	7	7	7

	Sensitivity
	1	1	1	1	1	1	1	1	1
	0.98	0.97	0.98	1	1	1	0.98	0.98	0.98
	0.97	0.97	0.975	0.995	0.995	0.995	0.985	0.985	0.985
	0.97	0.973	0.973	1	1	1	0.988	0.988	0.985

	Specificity
	1	1	1	0.997	0.997	0.997	0.997	0.998	0.998
	1	1	1	0.99	0.99	0.99	0.997	0.998	0.998
	1	1	1	0.986	0.984	0.984	0.998	0.998	0.999
	1	1	1	0.975	0.976	0.976	0.999	0.999	0.999

Most important surrogate variables not included

	# incorrect
	10	10	10	10	10	10	10	10	10
	100	100	100	100	100	100	100	100	100
	200	200	200	200	200	200	200	200	200
	400	400	400	400	400	400	400	400	400

	Sensitivity
	0	0	0	0	0	0	0	0	0
	0	0	0	0	0	0	0	0	0
	0	0	0	0	0	0	0	0	0
	0	0	0	0	0	0	0	0	0

	Specificity
	1	1	1	1	1	1	1	1	1
	1	1	1	1	1	1	1	1	1
	1	1	1	1	1	1	1	1	1
	1	1	1	1	1	1	1	1	1

FDR: false discovery rate; TT: training and testing; .sv: number of surrogate variables. Within each block, the four rows correspond to different numbers of truly important CpG sites out of 2,000.
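
The table does not state how its three metrics are computed. The sketch below shows one reading that is consistent with the reported values, assuming the standard definitions: sensitivity is the share of truly important CpG sites that are selected, specificity is the share of unimportant sites that are not selected, and "# incorrect" is the sum of missed and falsely selected sites. The function name `screening_metrics`, its parameters, and the split of errors into misses and false selections are illustrative assumptions, not taken from the article.

```python
def screening_metrics(n_total, n_true, false_neg, false_pos):
    """Return (n_incorrect, sensitivity, specificity) for one screening run.

    Assumed definitions (not stated in the table):
      n_incorrect = missed important sites + falsely selected sites
      sensitivity = selected important sites / all important sites
      specificity = unselected unimportant sites / all unimportant sites
    """
    n_null = n_total - n_true          # sites that are not truly important
    true_pos = n_true - false_neg      # important sites that were selected
    true_neg = n_null - false_pos      # unimportant sites that were left out
    return false_neg + false_pos, true_pos / n_true, true_neg / n_null


# Hypothetical scenario: 2,000 candidates, 100 truly important sites,
# 2 of them missed and no false selections.
n_incorrect, sens, spec = screening_metrics(2000, 100, false_neg=2, false_pos=0)
print(n_incorrect, round(sens, 3), round(spec, 3))  # prints: 2 0.98 1.0

# Hypothetical scenario: nothing is selected at all, so every one of
# 10 truly important sites is missed.
print(screening_metrics(2000, 10, false_neg=10, false_pos=0))  # prints: (10, 0.0, 1.0)
```

Under these assumptions, selecting nothing yields sensitivity 0 and specificity 1, with the number incorrect equal to the number of truly important sites. That is the pattern in the "not included" block.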