Research Article

GNormPlus: An Integrative Approach for Tagging Genes, Gene Families, and Protein Domains

Table 1

Statistics of our gene corpus.

Data set                            Articles   Gene mentions (gene/family/domains)   Gene identifiers
BioCreative II GN training set      281        3,019/1,115/278                       758
BioCreative II GN test set          262        3,233/1,252/361                       928
NLM Citation GIA test collection    151        1,205/160/17                          310
Total                               694        7,457/2,527/656                       1,996