Discovering the Unknown: Improving Detection of Novel Species and Genera from Short Reads

<table>The datasets used in this paper comprise a classifier training set/database, novel genomes used to train the detector, and a separate novel-genome test set. The <i>blue</i> areas represent the percentage of genomes whose genus and species are both “known”; the <i>green</i> areas represent the percentage of genomes that are “known” at the genus level but “unknown” at the species level; the <i>red</i> areas represent the percentage of genomes that are “unknown” at both the species and genus levels.</table>

BioMed Research International

Figure 1: Discovering the Unknown: Improving Detection of Novel Species and Genera from Short Reads