Abstract

The ratio of genomic oligonucleotide frequencies relative to the mean genomic AT/GC content has been shown to be similar for closely related species and is therefore said to reflect a “genomic signature”. The genomic signature has been found to be more similar within genomes than between closely related genomes. Furthermore, the genomic signatures of closely related organisms are, in turn, more similar to each other than those of more distantly related organisms. Since the genomic signature is remarkably stable within a genome, it can be extracted from only a fraction of the genomic DNA sequence. Genomic signatures therefore have many applications. The most notable examples include recognition of pathogenicity islands in microbial genomes and identification of hosts from arbitrary DNA sequences, the latter being of great importance in metagenomics. What shapes the genomic signature in microbial DNA has been widely discussed but has proven difficult to pinpoint. Most attempts so far have focused on correlations from in silico data. This mini-review seeks to summarize possible influences shaping the genomic signature and to survey a set of applications.
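
As an illustration of the ratio described above, the following is a minimal sketch for the simplest case, dinucleotides, where each observed dinucleotide frequency is divided by the frequency expected from the base composition alone. The function names `dinucleotide_signature` and `signature_distance` are hypothetical, the single-strand treatment (no reverse-complement symmetrization) is a simplifying assumption, and the mean absolute difference used to compare two signatures is one common choice rather than the only one.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_signature(seq):
    """Return observed/expected ratios for all 16 dinucleotides.

    Assumption: single strand only; characters other than A, C, G, T
    are ignored, as are dinucleotides containing them.
    """
    seq = seq.upper()
    base_counts = Counter(b for b in seq if b in BASES)
    pair_counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    total_bases = sum(base_counts.values())
    total_pairs = sum(pair_counts.values())
    if total_bases == 0 or total_pairs == 0:
        raise ValueError("sequence contains no usable dinucleotides")

    signature = {}
    for x, y in product(BASES, repeat=2):
        # Expected frequency if neighbouring bases were independent.
        expected = (base_counts[x] / total_bases) * (base_counts[y] / total_bases)
        observed = pair_counts[x + y] / total_pairs
        signature[x + y] = observed / expected if expected > 0 else 0.0
    return signature


def signature_distance(sig_a, sig_b):
    """Mean absolute difference between two dinucleotide signatures."""
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)
```

Under these assumptions, two fragments of the same genome would be expected to give a smaller `signature_distance` than fragments from distantly related genomes, which is the property the applications surveyed here rely on.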