Research Article
Metagenome Fragment Classification Using N-Mer Frequency Profiles
Table 2
A total of 63,500 fragments of 25 bp each, 100 from each genome, are BLASTed and
the results compared to those of the NBC. BLAST gives 66% of the fragments a
unique top-scoring hit, and all of these hits are correct. Almost 34% of the
reads have ambiguous top-scoring hits, meaning that multiple organisms share the
top score and E-value. Also, even though the exact string or its complement
exists in the database, 287 fragments receive no hit from BLAST with the E-value
set to 3000; the NBC correctly identifies 71% of those. Because one of the
multiple top-scoring genomes can be chosen at random as the top hit, we can
directly compare how often BLAST and the NBC identify the correct genome.
Taking this and the single top hits into consideration, the NBC classified
48,118 (75.8%) fragments correctly, while BLAST matched 47,889 (75.4%)
fragments correctly.
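The arithmetic behind the caption can be checked with a short sketch. The totals and counts are taken from the caption. The tie-breaking helper is a hypothetical illustration: it assumes one genome is drawn uniformly at random from each set of tied top hits and that the true genome is always among them. The per-fragment tie sizes are not reported here, so the example input is invented.

```python
from fractions import Fraction

TOTAL_FRAGMENTS = 63_500   # 100 fragments from each genome (from the caption)
NBC_CORRECT = 48_118       # from the caption
BLAST_CORRECT = 47_889     # from the caption


def percent(correct, total):
    """Return accuracy as a percentage rounded to one decimal place."""
    return round(100 * correct / total, 1)


def expected_correct_with_random_ties(tie_sizes):
    """Expected number of correct calls under uniform random tie-breaking.

    Hypothetical helper: each entry of tie_sizes is the number of genomes
    sharing the top score for one fragment (1 means a unique top hit), and
    the true genome is assumed to be among the tied genomes.
    """
    return sum(Fraction(1, k) for k in tie_sizes)


nbc_pct = percent(NBC_CORRECT, TOTAL_FRAGMENTS)
blast_pct = percent(BLAST_CORRECT, TOTAL_FRAGMENTS)
genomes = TOTAL_FRAGMENTS // 100  # number of genomes implied by the caption

print(f"NBC: {nbc_pct}%  BLAST: {blast_pct}%  genomes: {genomes}")

# Invented example: one unique hit, one 2-way tie, one 4-way tie.
example = expected_correct_with_random_ties([1, 2, 4])
print(f"expected correct out of 3 fragments: {float(example)}")
```

Under this model a fragment with a unique top hit contributes 1 to the expected count and a k-way tie contributes 1/k, which is how ambiguous BLAST hits can be folded into a single accuracy figure comparable with that of the NBC.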