BioMed Research International
Volume 2013 (2013), Article ID 502827, 11 pages
http://dx.doi.org/10.1155/2013/502827
Research Article

ASPic-GeneID: A Lightweight Pipeline for Gene Prediction and Alternative Isoforms Detection

1Centre Nacional d’Anàlisi Genòmica (CNAG), Parc Científic de Barcelona, 08028 Barcelona, Spain
2Dipartimento di Bioscienze, Biotecnologie e Biofarmaceutica, Università degli Studi di Bari, 70126 Bari, Italy
3Istituto di Biomembrane e Bioenergetica del Consiglio Nazionale delle Ricerche (CNR), 70126 Bari, Italy
4Centre de Regulació Genòmica (CRG), 08003 Barcelona, Spain
5Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
6Centro di Eccellenza in Genomica Comparata, Università degli Studi di Bari, 70126 Bari, Italy

Received 16 June 2013; Revised 1 August 2013; Accepted 4 August 2013

Academic Editor: Tao Huang

Copyright © 2013 Tyler Alioto et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. J. Zhang, R. Chiodini, A. Badr, and G. Zhang, “The impact of next-generation sequencing on genomics,” Journal of Genetics and Genomics, vol. 38, no. 3, pp. 95–109, 2011.
  2. E. Picardi and G. Pesole, “Computational methods for ab initio and comparative gene finding,” Methods in Molecular Biology, vol. 609, pp. 269–284, 2010.
  3. T. Alioto, “Gene prediction,” Methods in Molecular Biology, vol. 855, pp. 175–201, 2012.
  4. I. Korf, “Gene finding in novel genomes,” BMC Bioinformatics, vol. 5, article 59, 2004.
  5. M. Stanke, R. Steinkamp, S. Waack, and B. Morgenstern, “AUGUSTUS: a web server for gene finding in eukaryotes,” Nucleic Acids Research, vol. 32, pp. W309–W312, 2004.
  6. S. L. Cawley and L. Pachter, “HMM sampling and applications to gene finding and alternative splicing,” Bioinformatics, vol. 19, supplement 2, pp. ii36–ii41, 2003.
  7. J. S. Pedersen and J. Hein, “Gene finding with a hidden Markov model of genome structure and evolution,” Bioinformatics, vol. 19, no. 2, pp. 219–227, 2003.
  8. R. Guigó, P. Flicek, J. F. Abril et al., “EGASP: the human ENCODE Genome Annotation Assessment Project,” Genome Biology, vol. 7, supplement 1, pp. S2.1–S2.31, 2006.
  9. C. Wei and M. R. Brent, “Using ESTs to improve the accuracy of de novo gene prediction,” BMC Bioinformatics, vol. 7, article 327, 2006.
  10. G. Parra, P. Agarwal, J. F. Abril, T. Wiehe, J. W. Fickett, and R. Guigó, “Comparative gene prediction in human and mouse,” Genome Research, vol. 13, no. 1, pp. 108–117, 2003.
  11. M. J. van Baren, B. C. Koebbe, and M. R. Brent, “Using N-SCAN or TWINSCAN to predict gene structures in genomic DNA sequences,” in Current Protocols in Bioinformatics, A. D. Baxevanis, Ed., chapter 4, unit 4.8, 2007.
  12. S. S. Gross, C. B. Do, M. Sirota, and S. Batzoglou, “CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction,” Genome Biology, vol. 8, no. 12, article R269, 2007.
  13. O. Jaillon, J.-M. Aury, B. Noel et al., “The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla,” Nature, vol. 449, no. 7161, pp. 463–467, 2007.
  14. The Arabidopsis Genome Initiative, “Analysis of the genome sequence of the flowering plant Arabidopsis thaliana,” Nature, vol. 408, no. 6814, pp. 796–815, 2000.
  15. G. A. Tuskan, S. DiFazio, S. Jansson et al., “The genome of black cottonwood, Populus trichocarpa (Torr. & Gray),” Science, vol. 313, no. 5793, pp. 1596–1604, 2006.
  16. The Tomato Genome Consortium, “The tomato genome sequence provides insights into fleshy fruit evolution,” Nature, vol. 485, no. 7400, pp. 635–641, 2012.
  17. R. Guigó and M. G. Reese, “EGASP: collaboration through competition to find human genes,” Nature Methods, vol. 2, no. 8, pp. 575–577, 2005.
  18. N. de Souza, “The ENCODE project,” Nature Methods, vol. 9, no. 11, article 1046, 2012.
  19. L. L. Elnitski, P. Shah, R. T. Moreland, L. Umayam, T. G. Wolfsberg, and A. D. Baxevanis, “The ENCODEdb portal: simplified access to ENCODE consortium data,” Genome Research, vol. 17, no. 6, pp. 954–959, 2007.
  20. J. Harrow, F. Denoeud, A. Frankish et al., “GENCODE: producing a reference annotation for ENCODE,” Genome Biology, vol. 7, supplement 1, pp. S4.1–S4.9, 2006.
  21. S. H. Nagaraj, R. B. Gasser, and S. Ranganathan, “A hitchhiker's guide to Expressed Sequence Tag (EST) analysis,” Briefings in Bioinformatics, vol. 8, no. 1, pp. 6–21, 2007.
  22. M. Stanke, A. Tzvetkova, and B. Morgenstern, “AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome,” Genome Biology, vol. 7, supplement 1, pp. S11.1–S11.8, 2006.
  23. A. Krogh, “Using database matches with HMMGene for automated gene detection in Drosophila,” Genome Research, vol. 10, no. 4, pp. 523–528, 2000.
  24. M. Arumugam, C. Wei, R. H. Brown, and M. R. Brent, “Pairagon+N-SCAN_EST: a model-based gene annotation pipeline,” Genome Biology, vol. 7, supplement 1, pp. S5.1–S5.10, 2006.
  25. S. Djebali, F. Delaplace, and H. R. Crollius, “Exogean: a framework for annotating protein-coding genes in eukaryotic genomic DNA,” Genome Biology, vol. 7, supplement 1, pp. S7.1–S7.10, 2006.
  26. T. Castrignanò, R. Rizzi, I. G. Talamo et al., “ASPIC: a web resource for alternative splicing prediction and transcript isoforms characterization,” Nucleic Acids Research, vol. 34, pp. W440–W443, 2006.
  27. P. Bonizzoni, R. Rizzi, and G. Pesole, “ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences,” BMC Bioinformatics, vol. 6, article 244, 2005.
  28. E. Blanco, G. Parra, and R. Guigó, “Using geneid to identify genes,” in Current Protocols in Bioinformatics, A. D. Baxevanis, Ed., chapter 4, unit 4.3, 2007.
  29. G. Parra, E. Blanco, and R. Guigó, “GeneId in Drosophila,” Genome Research, vol. 10, no. 4, pp. 511–515, 2000.
  30. T. D. Wu and C. K. Watanabe, “GMAP: a genomic mapping and alignment program for mRNA and EST sequences,” Bioinformatics, vol. 21, no. 9, pp. 1859–1875, 2005.
  31. W. J. Kent, “BLAT—the BLAST-like alignment tool,” Genome Research, vol. 12, no. 4, pp. 656–664, 2002.
  32. GFF format, http://www.sanger.ac.uk/Software/formats/GFF/.
  33. S. M. J. Searle, J. Gilbert, V. Iyer, and M. Clamp, “The Otter annotation system,” Genome Research, vol. 14, no. 5, pp. 963–970, 2004.
  34. M. Burset and R. Guigó, “Evaluation of gene structure prediction programs,” Genomics, vol. 34, no. 3, pp. 353–367, 1996.
  35. S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, no. 3, pp. 403–410, 1990.
  36. A. Coghlan and R. Durbin, “Genomix: a method for combining gene-finders' predictions, which uses evolutionary conservation of sequence and intron-exon structure,” Bioinformatics, vol. 23, no. 12, pp. 1468–1475, 2007.
  37. S. W. Roy and D. Penny, “Intron length distributions and gene prediction,” Nucleic Acids Research, vol. 35, no. 14, pp. 4737–4742, 2007.
  38. D. V. Lu, R. H. Brown, M. Arumugam, and M. R. Brent, “Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner,” Bioinformatics, vol. 25, no. 13, pp. 1587–1593, 2009.
  39. D. S. Gerhard, “The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC),” Genome Research, vol. 14, no. 10, pp. 2121–2127, 2004.
  40. V. Solovyev, P. Kosarev, I. Seledsov, and D. Vorobyev, “Automatic annotation of eukaryotic genes, pseudogenes and promoters,” Genome Biology, vol. 7, supplement 1, pp. S10.1–S10.12, 2006.
  41. V. Curwen, E. Eyras, T. D. Andrews et al., “The Ensembl automatic gene annotation system,” Genome Research, vol. 14, no. 5, pp. 942–950, 2004.
  42. A. A. Salamov and V. V. Solovyev, “Ab initio gene finding in Drosophila genomic DNA,” Genome Research, vol. 10, no. 4, pp. 516–522, 2000.
  43. J. Q. Wu, D. Shteynberg, M. Arumugam, R. A. Gibbs, and M. R. Brent, “Identification of rat genes by TWINSCAN gene prediction, RT-PCR, and direct sequencing,” Genome Research, vol. 14, no. 4, pp. 665–671, 2004.
  44. D. Weissglas-Volkov, C. L. Plaisier, A. Huertas-Vazquez et al., “Identification of two common variants contributing to serum apolipoprotein B levels in Mexicans,” Arteriosclerosis, Thrombosis, and Vascular Biology, vol. 30, no. 2, pp. 353–359, 2010.
  45. J. Besemer and M. Borodovsky, “GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses,” Nucleic Acids Research, vol. 33, no. 2, pp. W451–W454, 2005.
  46. B. Brejová, D. G. Brown, M. Li, and T. Vinař, “ExonHunter: a comprehensive approach to gene finding,” Bioinformatics, vol. 21, supplement 1, pp. i57–i65, 2005.
  47. M. L. Metzker, “Sequencing technologies - the next generation,” Nature Reviews Genetics, vol. 11, no. 1, pp. 31–46, 2010.
  48. E. R. Mardis, “The impact of next-generation sequencing technology on genetics,” Trends in Genetics, vol. 24, no. 3, pp. 133–141, 2008.
  49. C. Trapnell, A. Roberts, L. Goff et al., “Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks,” Nature Protocols, vol. 7, no. 3, pp. 562–578, 2012.