Metagenome Fragment Classification Using N-Mer Frequency Profiles
Figure 4
A log-log plot of the N-mer frequencies versus N-mers in ranked order for various E. coli
strains (K-12 is the commensal strain, O157:H7 is highly pathogenic, and HS is
the commensal isolate from the human gastrointestinal tract). E. coli has a
characteristic curve for all strains in this domain. This curvature is then
compared to Zipf's law, which states that N-mer frequency is inversely proportional to
rank order. While E. coli generally obeys this law, the deviation of its curve
from the straight line shows that higher-ranked words have higher frequencies than the
law predicts.
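
The rank-frequency comparison described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `nmer_rank_frequency` and `zipf_slope` are hypothetical, overlapping windows are assumed, and the straight line is assumed to be an ordinary least-squares fit in log-log space. Under Zipf's law the fitted slope would be close to -1, and curvature appears as systematic departure of the points from that line.

```python
from collections import Counter
import math


def nmer_rank_frequency(seq, n):
    """Return relative N-mer frequencies, sorted from most to least frequent.

    Assumption: overlapping windows of length n over a single strand.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)


def zipf_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Ranks start at 1. Requires at least two distinct ranks.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A slope computed this way summarizes only the overall trend; to see the curvature discussed here, one would plot the ranked frequencies on log-log axes alongside the fitted line.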