Volume 1 (2003), Issue 3, Pages 185-190
Research Article

Remarkable sequence signatures in archaeal genomes

1 The Center for Applied Genomics, Hospital for Sick Children, Toronto, Ontario M5G 1Z8, Canada
2 Bioinformatics Supercomputing Centre, The Genomics and Genetics Biology Program, Hospital for Sick Children, Toronto, Ontario M5G 1Z8, Canada

Received 15 October 2002; Accepted 6 November 2002

Complete archaeal genomes were probed for long (≥ 25 bp) oligonucleotide repeats (words). We detected many such words arranged in tandem, with the spacer length between successive repeats (the periodicity) falling within narrow ranges. No comparable words were identified in the genomes of the non-archaeal species examined, namely Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Mycoplasma genitalium and Mycoplasma pneumoniae. BLAST similarity searches against the GenBank nucleotide sequence database revealed that these words were specific to individual archaeal species, indicating that they have a signature character. Sequence analysis and genome viewing tools showed the repeats to be restricted to non-coding regions. Thus, archaea appear to possess a non-coding genomic signature that is absent from the bacterial species examined. The identification of a species-specific genomic signature would be of great value to archaeal genome mapping, evolutionary studies and analyses of genome complexity.
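
To make the notions of a word and its periodicity concrete, the following Python sketch indexes every fixed-length word in a sequence, keeps those that occur more than once, and reports the spacer length between successive copies. It is an illustration only and not the method used in this study: the function names `find_repeated_words` and `spacer_lengths`, the exact-match criterion, and the fixed word length of 25 bp are all assumptions made for the example.

```python
from collections import defaultdict


def find_repeated_words(sequence, word_length=25, min_copies=2):
    """Return {word: sorted start positions} for every exact word of
    `word_length` bp that occurs at least `min_copies` times.

    Hypothetical helper: exact matching on one strand only, which is a
    simplifying assumption for this sketch.
    """
    seq = sequence.upper()
    positions = defaultdict(list)
    # Slide a window of word_length across the sequence and record each start.
    for start in range(len(seq) - word_length + 1):
        positions[seq[start:start + word_length]].append(start)
    return {w: p for w, p in positions.items() if len(p) >= min_copies}


def spacer_lengths(starts, word_length):
    """Spacer (in bp) between the end of one copy and the start of the next.

    Overlapping copies yield negative values.
    """
    return [b - (a + word_length) for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    # Toy input: one 25 bp word repeated three times with 30 bp spacers.
    word = "ACGTTGCAAGGCTTAACCGGATCGA"
    toy = word + "T" * 30 + word + "C" * 30 + word
    starts = find_repeated_words(toy)[word]
    print(starts, spacer_lengths(starts, len(word)))
```

A narrow spread in the values returned by `spacer_lengths` for a given word corresponds to the narrow periodicity described above. For a complete genome, the dictionary of all 25-mers can be memory-intensive, so a practical analysis would likely rely on a more economical index.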