Review Article

Close Encounters of the Third Domain: The Emerging Genomic View of Archaeal Diversity and Evolution

Figure 1

Bayesian phylogeny of 80 representative archaeal species. BLAST databases containing the proteomes of six new archaeal genomes (shown in bold on the tree) were retrieved from NCBI: Methanomassiliicoccus luminyensis B10 (acc. no. CAJE01), MCG SCGC AB539E09 (acc. no. ALXK01), and Marine Benthic Group D (MBGD) SCGC AB539N05, AB539C06, and AB540F20 (acc. no. ALXL01, AOSH01, and AOSI01, respectively). Protein sequence alignments of the 57 clusters in the discFilter 15 p dataset from [70], with eukaryotic sequences removed, were used as input to PSI-BLAST, with the six new proteomes as the database. Orthologs were retrieved as in [70]. For the three MBGD strains, a single composite set of orthologs was built by taking sequences from the most complete genome (AB539C06) whenever possible and complementing them with sequences from the other two where available. Ortholog selection, alignment, trimming, and concatenation were performed as in [70], resulting in a 15,069 amino-acid alignment. Four Bayesian chains were run with PhyloBayes [71] under the CAT-Poisson model for approximately 10,000 generations, discarding the first half as burn-in. The tree was rooted with bacteria. Posterior probabilities (pp) are shown as dots on the nodes, colored according to the depicted heat-map scale. The scale bar represents the number of substitutions per site. Species are colored as follows: red, Euryarchaeota; green, Nanoarchaeota (N) and ARMAN; pink, Korarchaeota (K); black, Miscellaneous Crenarchaeotal Group (MCG); orange, Thaumarchaeota and Aigarchaeota (A); blue, Crenarchaeota. The DNA collection method, if different from pure culture, is indicated by a symbol next to the organism name: square, coculture; star, metagenome; circle, single-cell genome; triangle, enrichment culture.
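The composite MBGD ortholog set described in the caption can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_composite`, the cluster identifiers, and the toy sequences are hypothetical, and the relative order of the two fallback strains is an assumption, since the caption only states that AB539C06 was preferred.

```python
from typing import Dict, List, Tuple

# Preferred strain first (AB539C06, the most complete genome per the caption).
# The order of the two fallback strains is an assumption made for illustration.
STRAIN_PRIORITY: List[str] = ["AB539C06", "AB539N05", "AB540F20"]


def build_composite(
    orthologs_by_strain: Dict[str, Dict[str, str]],
    clusters: List[str],
    priority: List[str] = STRAIN_PRIORITY,
) -> Dict[str, Tuple[str, str]]:
    """For each cluster, keep the ortholog from the first strain that has one.

    Returns a mapping cluster -> (source strain, sequence). Clusters with no
    ortholog in any strain are left out of the result.
    """
    composite: Dict[str, Tuple[str, str]] = {}
    for cluster in clusters:
        for strain in priority:
            sequence = orthologs_by_strain.get(strain, {}).get(cluster)
            if sequence:  # skips both missing entries and empty sequences
                composite[cluster] = (strain, sequence)
                break
    return composite


# Toy input with hypothetical cluster names and sequences.
toy_orthologs = {
    "AB539C06": {"cluster_01": "MKV", "cluster_02": ""},
    "AB539N05": {"cluster_02": "MAL", "cluster_03": "MTT"},
    "AB540F20": {"cluster_03": "MSS"},
}
toy_clusters = ["cluster_01", "cluster_02", "cluster_03", "cluster_04"]
toy_composite = build_composite(toy_orthologs, toy_clusters)
```

Recording the source strain alongside each sequence makes it easy to check afterwards how much of the composite came from the preferred genome and how much was filled in from the other two.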