Advances in Bioinformatics
Volume 2010, Article ID 287070, 8 pages
http://dx.doi.org/10.1155/2010/287070
Research Article

Testing the Coding Potential of Conserved Short Genomic Sequences

Jing Wu

Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA

Received 21 September 2009; Accepted 2 January 2010

Academic Editor: Igor B. Rogozin

Copyright © 2010 Jing Wu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. C. Burge and S. Karlin, “Prediction of complete gene structures in human genomic DNA,” Journal of Molecular Biology, vol. 268, no. 1, pp. 78–94, 1997.
  2. S. Batzoglou, L. Pachter, J. P. Mesirov, B. Berger, and E. S. Lander, “Human and mouse gene structure: comparative analysis and application to exon prediction,” Genome Research, vol. 10, no. 7, pp. 950–958, 2000.
  3. V. Bafna and D. H. Huson, “The conserved exon method for gene finding,” in Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB '00), vol. 8, pp. 3–12, AAAI Press, 2000.
  4. I. Korf, P. Flicek, D. Duan, and M. R. Brent, “Integrating genomic homology into gene structure prediction,” Bioinformatics, vol. 17, supplement 1, pp. S140–S148, 2001.
  5. S. Cawley, L. Pachter, and M. Alexandersson, “SLAM web server for comparative gene finding and alignment,” Nucleic Acids Research, vol. 31, no. 13, pp. 3507–3509, 2003.
  6. R. Guigó, E. T. Dermitzakis, P. Agarwal et al., “Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes,” Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 3, pp. 1140–1145, 2003.
  7. J. Wu and D. Haussler, “Coding exon detection using comparative sequences,” Journal of Computational Biology, vol. 13, no. 6, pp. 1148–1164, 2006.
  8. M. Rè, G. Pesole, and D. S. Horner, “Accurate discrimination of conserved coding and non-coding regions through multiple indicators of evolutionary dynamics,” BMC Bioinformatics, vol. 10, article 282, 2009.
  9. S. Rogic, A. K. Mackworth, and F. B. F. Ouellette, “Evaluation of gene-finding programs on mammalian sequences,” Genome Research, vol. 11, no. 5, pp. 817–832, 2001.
  10. W. S. Cleveland, “Robust locally weighted regression and smoothing scatterplots,” Journal of the American Statistical Association, vol. 74, pp. 829–836, 1979.
  11. I. B. Rogozin, D. D'Angelo, and L. Milanesi, “Protein-coding regions prediction combining similarity searches and conservative evolutionary properties of protein-coding sequences,” Gene, vol. 226, no. 1, pp. 129–137, 1999.
  12. A. Nekrutenko, K. D. Makova, and W.-H. Li, “The KA/KS ratio test for assessing the protein-coding potential of genomic regions: an empirical and simulation study,” Genome Research, vol. 12, no. 1, pp. 198–202, 2002.
  13. A. Nekrutenko, W.-Y. Chung, and W.-H. Li, “An evolutionary approach reveals a high protein-coding capacity of the human genome,” Trends in Genetics, vol. 19, no. 6, pp. 306–310, 2003.
  14. R. Tibshirani, T. Hastie, B. Narasimhan et al., “Sample classification from protein mass spectrometry, by ‘peak probability contrasts’,” Bioinformatics, vol. 20, no. 17, pp. 3034–3044, 2004.
  15. Y. Benjamini and Y. Hochberg, “Controlling the false discovery rate: a practical and powerful approach to multiple testing,” Journal of the Royal Statistical Society. Series B, vol. 57, no. 1, pp. 289–300, 1995.
  16. J. D. Storey, “The positive false discovery rate: a Bayesian interpretation and the q-value,” Annals of Statistics, vol. 31, no. 6, pp. 2013–2035, 2003.
  17. K. D. Pruitt, K. S. Katz, H. Sicotte, and D. R. Maglott, “Introducing RefSeq and LocusLink: curated human genome resources at the NCBI,” Trends in Genetics, vol. 16, no. 1, pp. 44–47, 2000.
  18. K. D. Pruitt, T. Tatusova, and D. R. Maglott, “NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins,” Nucleic Acids Research, vol. 33, pp. D501–D504, 2005.
  19. D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler, “GenBank: update,” Nucleic Acids Research, vol. 32, pp. D23–D26, 2004.
  20. T. Wiehe, S. Gebauer-Jung, T. Mitchell-Olds, and R. Guigó, “SGP-1: prediction and validation of homologous genes based on sequence alignments,” Genome Research, vol. 11, no. 9, pp. 1574–1583, 2001.
  21. R. Guigó, “Assembling genes from predicted exons in linear time with dynamic programming,” Journal of Computational Biology, vol. 5, no. 4, pp. 681–702, 1998.
  22. M. Stanke and S. Waack, “Gene prediction with a hidden Markov model and a new intron submodel,” Bioinformatics, vol. 19, supplement 2, pp. ii215–ii225, 2003.
  23. P. Kim, N. Kim, Y. Lee, B. Kim, Y. Shin, and S. Lee, “ECgene: genome annotation for alternative splicing,” Nucleic Acids Research, vol. 33, pp. D75–D79, 2005.
  24. W. J. Kent, “BLAT—the BLAST-like alignment tool,” Genome Research, vol. 12, no. 4, pp. 656–664, 2002.
  25. D. Thierry-Mieg, J. Thierry-Mieg, M. Potdevin, and M. Sienkiewicz, “AceView: identification and functional annotation of cDNA-supported genes in higher organisms,” Genome Biology, vol. 7, supplement 1, p. S12, 2006.
  26. T. Hubbard, D. Barker, E. Birney et al., “The Ensembl genome database project,” Nucleic Acids Research, vol. 30, no. 1, pp. 38–41, 2002.
  27. W. J. Kent, R. Baertsch, A. Hinrichs, W. Miller, and D. Haussler, “Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes,” Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 20, pp. 11484–11489, 2003.
  28. Z. Zhang, P. M. Harrison, Y. Liu, and M. Gerstein, “Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome,” Genome Research, vol. 13, no. 12, pp. 2541–2558, 2003.
  29. A. E. Lash, C. M. Tolstoshev, L. Wagner et al., “SAGEmap: a public gene expression resource,” Genome Research, vol. 10, no. 7, pp. 1051–1060, 2000.
  30. M. Q. Zhang, “Identification of protein coding regions in the human genome by quadratic discriminant analysis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 94, no. 2, pp. 565–568, 1997.
  31. A. Siepel and D. Haussler, “Phylogenetic hidden Markov models,” in Statistical Methods in Molecular Evolution, R. Nielsen, Ed., pp. 325–351, Springer, New York, NY, USA, 2005.
  32. E. Birney, J. A. Stamatoyannopoulos, A. Dutta et al., “Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project,” Nature, vol. 447, no. 7146, pp. 799–816, 2007.
  33. S. Washietl, J. S. Pedersen, J. O. Korbel et al., “Structured RNAs in the ENCODE selected regions of the human genome,” Genome Research, vol. 17, no. 6, pp. 852–864, 2007.