Journal of Artificial Evolution and Applications

Special Issue

Artificial Evolution Methods in the Biological and Biomedical Sciences


Research Article | Open Access

Volume 2009 | Article ID 963150

Edgar D. Arenas-Díaz, Helga Ochoterena, Katya Rodríguez-Vázquez, "Multiple Sequence Alignment Using a Genetic Algorithm and GLOCSA", Journal of Artificial Evolution and Applications, vol. 2009, Article ID 963150, 10 pages, 2009.

Multiple Sequence Alignment Using a Genetic Algorithm and GLOCSA

Academic Editor: Jason Moore
Received 14 Nov 2008
Revised 04 Apr 2009
Accepted 13 Jun 2009
Published 27 Aug 2009


Algorithms that directly minimize the number of putative synapomorphies in an alignment cannot be implemented as such, because they would select trivial alignments in which the sequences are concatenated: these imply the minimum number of events to be explained (e.g., a single insertion/deletion would suffice to explain the divergence between two sequences). Indirect measures that approximate parsimony are therefore needed. In this paper, we present in detail a Global Criterion for Sequence Alignment (GLOCSA), a scoring function that rates multiple alignments globally, with the aim of producing matrices that minimize the number of putative synapomorphies. We also present a Genetic Algorithm that uses GLOCSA as its objective function to refine alignments previously generated by other existing alignment tools (we recommend MUSCLE). We show that, in the example cases, our GLOCSA-guided Genetic Algorithm (GGGA) improves the GLOCSA values, resulting in alignments that imply fewer putative synapomorphies.
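The refinement idea summarized in the abstract (start from an alignment produced by another tool, then search for variants that score better under a global objective) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `stand_in_score` is a hypothetical column-agreement count and is not the GLOCSA function, `mutate` and `refine` are hypothetical names, and the loop is a simple mutation-only evolutionary search with elitist selection (no crossover) rather than the GGGA described in the paper.

```python
import random

GAP = "-"


def stand_in_score(alignment):
    """Hypothetical objective (NOT GLOCSA): for every column, count the
    pairs of rows that share the same non-gap character."""
    score = 0
    for column in zip(*alignment):
        residues = [c for c in column if c != GAP]
        for ch in set(residues):
            n = residues.count(ch)
            score += n * (n - 1) // 2
    return score


def mutate(alignment, rng):
    """Move one gap of one randomly chosen row to a new position.
    The ungapped sequence and the row length are both preserved."""
    rows = list(alignment)
    i = rng.randrange(len(rows))
    row = list(rows[i])
    gap_positions = [k for k, c in enumerate(row) if c == GAP]
    if not gap_positions:
        return rows  # nothing to move in this row
    row.pop(rng.choice(gap_positions))
    row.insert(rng.randrange(len(row) + 1), GAP)
    rows[i] = "".join(row)
    return rows


def refine(seed, generations=200, pop_size=20, rng_seed=0):
    """Refine a seed alignment; the seed stays in the candidate pool, so
    the result never scores worse than the seed."""
    rng = random.Random(rng_seed)
    population = [list(seed)] + [mutate(seed, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        offspring = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        # Elitist truncation selection: keep the best pop_size candidates.
        ranked = sorted(population + offspring, key=stand_in_score, reverse=True)
        population = ranked[:pop_size]
    return population[0]


if __name__ == "__main__":
    # Toy seed alignment standing in for the output of an external aligner.
    seed = ["AC-GT", "ACGT-", "A-CGT"]
    best = refine(seed)
    print(stand_in_score(seed), stand_in_score(best))
    print("\n".join(best))
```

Because the mutation only relocates gaps, the underlying sequences are never altered; only their relative placement changes. A real objective would also need to avoid rewarding the trivial alignments mentioned in the abstract, which this stand-in score makes no attempt to address.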



Copyright © 2009 Edgar D. Arenas-Díaz et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
