Research Article

Unsupervised Two-Way Clustering of Metagenomic Sequences

Table 4

Performance of the Poisson mixture model (without word grouping) on datasets across various taxonomic ranks. Each dataset contains 50,000 reads of length 500 bp. AR stands for abundance ratio.

Species                          AR      Rank     Accuracy (%)
M. hyopneumoniae, M. mycoides    3 : 2   Genus    95.73
M. avium, M. leprae              3 : 4   Genus    94.22
A. vinelandii, C. japonicus      1 : 1   Family   92.81
M. leprae, S. erythraea          1 : 1   Order    95.58
B. pertussis, N. gonorrhoeae     1 : 2   Class    97.52
A. parvulum, S. erythraea        5 : 1   Class    99.64
R. prowazekii, S. meliloti       3 : 1   Class    99.91
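
To make the abundance ratios concrete, the short sketch below converts an AR into per-species read counts for a 50,000-read dataset. It is only an illustration: the function name `reads_per_species` is hypothetical, the ratio is assumed to follow the order in which the species are listed, and the largest-remainder rounding used when the split is not a whole number (for example 3 : 4) is an assumption, since the table does not say how such cases were handled.

```python
def reads_per_species(total_reads, ratio):
    """Split total_reads among species in proportion to an abundance ratio.

    Rounding rule (an assumption, not taken from the article): each species
    first gets the floor of its exact share, then any leftover reads go to
    the species with the largest remainders, so the counts always sum to
    total_reads.
    """
    weight_sum = sum(ratio)
    # Integer arithmetic avoids floating-point rounding surprises.
    counts = [total_reads * r // weight_sum for r in ratio]
    remainders = [total_reads * r % weight_sum for r in ratio]
    leftover = total_reads - sum(counts)
    # Indices ordered by descending remainder; ties keep the listed order.
    order = sorted(range(len(ratio)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Abundance ratios as they appear in Table 4.
    for ratio in [(3, 2), (3, 4), (1, 1), (1, 2), (5, 1), (3, 1)]:
        print(ratio, reads_per_species(50_000, ratio))
```

Under these assumptions, a 3 : 2 dataset holds 30,000 and 20,000 reads, while a 5 : 1 dataset holds roughly 41,667 and 8,333, so the larger species alone accounts for about 83% of the reads in the most skewed case.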