BioMed Research International
Volume 2015, Article ID 786861, 11 pages
http://dx.doi.org/10.1155/2015/786861
Research Article

Lengths of Orthologous Prokaryotic Proteins Are Affected by Evolutionary Factors

1 Children’s Hospital Los Angeles, Keck School of Medicine, University of Southern California, Los Angeles, CA 90027, USA
2 Department of Evolutionary and Environmental Biology and Institute of Evolution, University of Haifa, 3498838 Haifa, Israel
3 Department of Computer Science, University of Haifa, 3498838 Haifa, Israel
4 The Tauber Bioinformatics Research Center, University of Haifa, 3498838 Haifa, Israel

Received 8 September 2014; Accepted 2 November 2014

Academic Editor: Vassily Lyubetsky

Copyright © 2015 Tatiana Tatarinova et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Proteins of the same functional family (for example, kinases) may have significantly different lengths. It is an open question whether such variation in length is random or arises as a response to some unknown evolutionary driving factors. The main purpose of this paper is to demonstrate the existence of factors affecting prokaryotic gene lengths. We believe that ranking genomes according to the lengths of their genes, followed by calculating coefficients of association between genome rank and genome property, is a reasonable approach to revealing such evolutionary driving factors. As we demonstrated earlier, our chosen approach, Bubble Sort, combines stability, accuracy, and computational efficiency as compared to other ranking methods. Application of Bubble Sort to a set of 1390 prokaryotic genomes confirmed that genes of Archaeal species are generally shorter than Bacterial ones. We observed that gene lengths are affected by various factors: within each domain, different phyla have preferences for short or long genes; thermophiles tend to have shorter genes than soil-dwellers; halophiles tend to have longer genes. We also found that species with overrepresentation of cytosines and guanines in the third position of the codon (GC3 content) tend to have longer genes than species with low GC3 content.
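
The two-step idea summarized above (rank genomes by gene length, then measure the association between rank and a genome property) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration, not the procedure used in the paper: the toy genome names and protein lengths are invented, the pairwise comparison is a simple majority vote over shared ortholog families, and the rank-biserial correlation stands in for whichever coefficient of association is chosen. A majority-vote comparison need not be transitive, so on real data the final order may depend on the starting order.

```python
from typing import Dict, List

# Hypothetical toy data: genome -> {ortholog family -> protein length (aa)}.
Lengths = Dict[str, Dict[str, int]]


def is_longer(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if genome `a` has the longer protein in most shared families.

    Assumption: majority vote over families present in both genomes.
    """
    shared = a.keys() & b.keys()
    wins = sum(a[f] > b[f] for f in shared)
    losses = sum(a[f] < b[f] for f in shared)
    return wins > losses


def bubble_sort_rank(lengths: Lengths) -> List[str]:
    """Order genomes from shortest to longest genes with a plain bubble sort."""
    order = list(lengths)  # start from the input order
    n = len(order)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            # Swap adjacent genomes when the left one has the longer genes.
            if is_longer(lengths[order[j]], lengths[order[j + 1]]):
                order[j], order[j + 1] = order[j + 1], order[j]
                swapped = True
        if not swapped:
            break  # no swaps in this pass: the order is stable
    return order


def rank_biserial(order: List[str], has_property: Dict[str, bool]) -> float:
    """Rank-biserial correlation between rank and a binary genome property.

    Positive values mean genomes with the property tend to rank higher,
    i.e. tend to have longer genes. The result lies in [-1, 1].
    """
    pos = {g: r for r, g in enumerate(order)}
    with_p = [pos[g] for g in order if has_property[g]]
    without = [pos[g] for g in order if not has_property[g]]
    if not with_p or not without:
        raise ValueError("both groups must be non-empty")
    higher = sum(x > y for x in with_p for y in without)
    lower = sum(x < y for x in with_p for y in without)
    return (higher - lower) / (len(with_p) * len(without))


# Invented example values, deliberately given in scrambled order.
toy: Lengths = {
    "bacterium_D": {"kinase": 320, "ligase": 340, "polymerase": 950},
    "archaeon_A": {"kinase": 250, "ligase": 300, "polymerase": 800},
    "bacterium_C": {"kinase": 300, "ligase": 350, "polymerase": 900},
    "archaeon_B": {"kinase": 260, "ligase": 310, "polymerase": 790},
}
is_bacterial = {g: g.startswith("bacterium") for g in toy}

ranking = bubble_sort_rank(toy)
association = rank_biserial(ranking, is_bacterial)
print(ranking)
print(association)
```

On this toy input the two invented archaeal genomes end up at the short end of the ranking and the association with the "bacterial" label is 1.0, which mirrors only the direction of the reported Archaea versus Bacteria difference, not its magnitude.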