Practical Calling Approach for Exome Array-Based Genome-Wide Association Studies in Korean Population
Table 1
Number of variants excluded after applying automated clustering guidelines in Illumina GenomeStudio. The criteria of the CHARGE consortium are compared with the criteria adjusted in our study using Korean samples. The total number of SNVs was counted with redundancy, so an SNV flagged under more than one criterion is counted more than once. Criteria that differ in our guidelines are shown in boldface.
| Type | CHARGE consortium guidelines: Criteria | CHARGE consortium guidelines: # of SNVs | Our guidelines: Criteria | Our guidelines: # of SNVs |
|---|---|---|---|---|
| Clustering errors | Call Freq 0.95~0.99 | 2,841 | Call Freq 0.95~0.99 | 2,841 |
| | Cluster Sep < 0.4 | 693 | Cluster Sep < 0.4 | 693 |
| | AB Freq > 0.6 | 1 | AB Freq > 0.6 | 1 |
| | AB Mean | 645 | AB Mean | 645 |
| | Het Excess > 0.1 | 13 | Het Excess > 0.1 | 13 |
| | Het Excess < −0.9 | 17 | Het Excess < −0.9 | 17 |
| | MAF < 0.0001 & Call Freq ≠ 1 | 119,896 | **MAF < 0.0001 & Call Freq < 0.99** | 2,171 |
| AA cluster error | AA Mean 0.2~0.3 | 759 | AA Mean 0.2~0.3 | 759 |
| | AA Dev > 0.025 | 2,195 | AA Dev > 0.025 | 2,195 |
| | AA Freq = 1 & Call Freq < 1 | 43,012 | **AA Freq = 1 & Call Freq < 0.99** | 561 |
| AB cluster error | AB Mean 0.2~0.3 | 847 | AB Mean 0.2~0.3 | 847 |
| | AB Mean 0.7~0.8 | 2,685 | AB Mean 0.7~0.8 | 2,685 |
| | AB Dev ≥ 0.07 | 272 | AB Dev ≥ 0.07 | 272 |
| | AB Freq = 0 & MAF > 0 | 70,597 | **AB Freq = 0 & MAF > 0.0002** | 11,572 |
| BB cluster error | BB Mean 0.7~0.8 | 690 | BB Mean 0.7~0.8 | 690 |
| | BB Dev > 0.025 | 2,742 | BB Dev > 0.025 | 2,742 |
| | BB Freq = 1 & Call Freq < 1 | 16,352 | **BB Freq = 1 & Call Freq < 0.99** | 92 |
| Total # of SNVs | | 264,257 | | 46,076 |
SNV: single nucleotide variation; MAF: minor allele frequency.
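The two sets of criteria in Table 1 can be read as rule lists applied to per-SNV cluster statistics. The following is a minimal sketch of that idea, not the authors' pipeline. The field names (`call_freq`, `cluster_sep`, `aa_mean`, and so on) are hypothetical stand-ins for the columns of a GenomeStudio SNP statistics export. Ranges written as "a~b" are assumed to be inclusive at both ends. The "AB Mean" criterion is left out because the table gives no threshold for it. Counting is done with redundancy, so an SNV that fails several rules contributes to each of them.

```python
from collections import Counter

# Hypothetical field names; map them to your own SNP statistics export.
# Assumption: "a~b" ranges are inclusive on both ends.
def in_range(x, lo, hi):
    return lo <= x <= hi

# Criteria that are identical in both guideline sets.
# "AB Mean" is omitted: Table 1 lists no threshold for it.
SHARED_RULES = [
    ("Call Freq 0.95~0.99", lambda s: in_range(s["call_freq"], 0.95, 0.99)),
    ("Cluster Sep < 0.4", lambda s: s["cluster_sep"] < 0.4),
    ("AB Freq > 0.6", lambda s: s["ab_freq"] > 0.6),
    ("Het Excess > 0.1", lambda s: s["het_excess"] > 0.1),
    ("Het Excess < -0.9", lambda s: s["het_excess"] < -0.9),
    ("AA Mean 0.2~0.3", lambda s: in_range(s["aa_mean"], 0.2, 0.3)),
    ("AA Dev > 0.025", lambda s: s["aa_dev"] > 0.025),
    ("AB Mean 0.2~0.3", lambda s: in_range(s["ab_mean"], 0.2, 0.3)),
    ("AB Mean 0.7~0.8", lambda s: in_range(s["ab_mean"], 0.7, 0.8)),
    ("AB Dev >= 0.07", lambda s: s["ab_dev"] >= 0.07),
    ("BB Mean 0.7~0.8", lambda s: in_range(s["bb_mean"], 0.7, 0.8)),
    ("BB Dev > 0.025", lambda s: s["bb_dev"] > 0.025),
]

# Criteria as listed in the CHARGE column of Table 1.
CHARGE_RULES = SHARED_RULES + [
    ("MAF < 0.0001 & Call Freq != 1",
     lambda s: s["maf"] < 0.0001 and s["call_freq"] != 1),
    ("AA Freq = 1 & Call Freq < 1",
     lambda s: s["aa_freq"] == 1 and s["call_freq"] < 1),
    ("AB Freq = 0 & MAF > 0",
     lambda s: s["ab_freq"] == 0 and s["maf"] > 0),
    ("BB Freq = 1 & Call Freq < 1",
     lambda s: s["bb_freq"] == 1 and s["call_freq"] < 1),
]

# Criteria as listed in the "Our guidelines" column of Table 1.
ADJUSTED_RULES = SHARED_RULES + [
    ("MAF < 0.0001 & Call Freq < 0.99",
     lambda s: s["maf"] < 0.0001 and s["call_freq"] < 0.99),
    ("AA Freq = 1 & Call Freq < 0.99",
     lambda s: s["aa_freq"] == 1 and s["call_freq"] < 0.99),
    ("AB Freq = 0 & MAF > 0.0002",
     lambda s: s["ab_freq"] == 0 and s["maf"] > 0.0002),
    ("BB Freq = 1 & Call Freq < 0.99",
     lambda s: s["bb_freq"] == 1 and s["call_freq"] < 0.99),
]

def count_flags(snvs, rules):
    """Count flagged SNVs per rule; one SNV may add to several rules."""
    counts = Counter()
    for snv in snvs:
        for label, predicate in rules:
            if predicate(snv):
                counts[label] += 1
    return counts

# Made-up example records (not data from the study).
well_clustered = dict(call_freq=1.0, cluster_sep=0.8, ab_freq=0.3,
                      het_excess=0.0, maf=0.2, aa_mean=0.02, aa_dev=0.01,
                      aa_freq=0.6, ab_mean=0.5, ab_dev=0.03,
                      bb_mean=0.98, bb_dev=0.01, bb_freq=0.1)
snvs = [
    # Monomorphic AA site with a few no-calls.
    dict(well_clustered, call_freq=0.995, maf=0.0, aa_freq=1.0,
         ab_freq=0.0, bb_freq=0.0),
    # Very rare variant seen only as a homozygote.
    dict(well_clustered, maf=0.00015, aa_freq=0.99985,
         ab_freq=0.0, bb_freq=0.00015),
    # Common variant with poorly separated clusters.
    dict(well_clustered, cluster_sep=0.3),
]

charge_counts = count_flags(snvs, CHARGE_RULES)
adjusted_counts = count_flags(snvs, ADJUSTED_RULES)
print("CHARGE total (with redundancy):", sum(charge_counts.values()))
print("Adjusted total (with redundancy):", sum(adjusted_counts.values()))
```

In this toy input, the monomorphic site with a call frequency of 0.995 is flagged twice under the CHARGE-style rules and not at all under the adjusted rules, which illustrates how relaxing the call-frequency condition from 1 to 0.99 reduces the number of SNVs sent to manual review.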