Abstract and Applied Analysis
Volume 2014 (2014), Article ID 402567, 14 pages
http://dx.doi.org/10.1155/2014/402567
Research Article

Identification of Protein Coding Regions in the Eukaryotic DNA Sequences Based on Marple Algorithm and Wavelet Packets Transform

School of Mathematics, Shandong University, Jinan, Shandong 250100, China

Received 11 April 2014; Revised 30 June 2014; Accepted 1 July 2014; Published 15 July 2014

Academic Editor: Caihong Li

Copyright © 2014 Guangchen Liu and Yihui Luan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. S. Nemati, M. E. Basiri, N. Ghasem-Aghaee, and M. H. Aghdam, “A novel ACO-GA hybrid algorithm for feature selection in protein function prediction,” Expert Systems with Applications, vol. 36, no. 10, pp. 12086–12094, 2009.
  2. S. A. Marhon and S. C. Kremer, “Gene prediction based on DNA spectral analysis: a literature review,” Journal of Computational Biology, vol. 18, no. 4, pp. 639–676, 2011.
  3. C. Mathé, M. Sagot, T. Schiex, and P. Rouzé, “Current methods of gene prediction, their strengths and weaknesses,” Nucleic Acids Research, vol. 30, no. 19, pp. 4103–4117, 2002.
  4. N. Y. Song and H. Yan, “Short exon detection in DNA sequences based on multifeature spectral analysis,” Eurasip Journal on Advances in Signal Processing, vol. 2011, Article ID 780794, 8 pages, 2011.
  5. S. Maji and D. Garg, “Progress in gene prediction: principles and challenges,” Current Bioinformatics, vol. 8, no. 2, pp. 226–243, 2013.
  6. N. Goel, S. Singh, and T. C. Aseri, “A review of soft computing techniques for gene prediction,” ISRN Genomics, vol. 2013, Article ID 191206, 8 pages, 2013.
  7. H. Saberkari, M. Shamsi, H. Heravi, and M. H. Sedaaghi, “A novel fast algorithm for exon prediction in eukaryotic genes using linear predictive coding model and Goertzel algorithm based on the Z-curve,” Journal of Medical Signals and Sensors, vol. 3, pp. 139–149, 2013.
  8. J. Mena-Chalco, H. Carrer, Y. Zana, and R. M. Cesar Jr., “Identification of protein coding regions using the modified Gabor-wavelet transform,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 5, no. 2, pp. 198–206, 2008.
  9. R. Guigo, “DNA composition, codon usage and exon prediction,” in Genetic Databases, Academic Press, 1999.
  10. J. Henderson, S. Salzberg, and K. H. Fasman, “Finding genes in DNA with a hidden Markov model,” Journal of Computational Biology, vol. 4, no. 2, pp. 127–141, 1997.
  11. S. Agoes, “A Hidden Markov Model for identification of exons in DNA of genes Plasmodium falciparum,” International Journal of Electrical & Computer Sciences, vol. 11, pp. 33–36, 2011.
  12. C. H. Q. Ding and I. Dubchak, “Multi-class protein fold recognition using support vector machines and neural networks,” Bioinformatics, vol. 17, no. 4, pp. 349–358, 2001.
  13. J. W. Fickett, “Recognition of protein coding regions in DNA sequences,” Nucleic Acids Research, vol. 10, no. 17, pp. 5303–5318, 1982.
  14. S. Tiwari, S. Ramachandran, A. Bhattacharya, S. Bhattacharya, and R. Ramaswamy, “Prediction of probable genes by Fourier analysis of genomic sequences,” Computer Applications in the Biosciences, vol. 13, no. 3, pp. 263–270, 1997.
  15. R. Nini and S. J. Shepherd, “Detection of 3-periodicity for small genomic sequences based on AR technique,” in Proceedings of the International Conference on Communications, Circuits and Systems (ICCCAS '04), vol. 2, pp. 1032–1036, June 2004.
  16. A. A. Tsonis, J. B. Elsner, and P. A. Tsonis, “Periodicity in DNA coding sequences: implications in gene evolution,” Journal of Theoretical Biology, vol. 151, no. 3, pp. 323–331, 1991.
  17. R. F. Voss, “Evolution of long-range fractal correlations and 1/f noise in DNA base sequences,” Physical Review Letters, vol. 68, no. 25, pp. 3805–3808, 1992.
  18. C. A. Chatzidimitriou-Dreismann and D. Larhammar, “Long-range correlations in DNA,” Nature, vol. 361, no. 6409, pp. 212–213, 1993.
  19. H. Saberkari, M. Shamsi, M. Sedaaghi, and F. Golabi, “Prediction of protein coding regions in DNA sequences using signal processing methods,” in Proceedings of the IEEE Symposium on Industrial Electronics and Applications (ISIEA '12), pp. 355–360, Bandung, Indonesia, September 2012.
  20. N. Rao, X. Lei, J. Guo, H. Huang, and Z. Ren, “An efficient sliding window strategy for accurate location of eukaryotic protein coding regions,” Computers in Biology and Medicine, vol. 39, no. 4, pp. 392–395, 2009.
  21. P. P. Vaidyanathan and B. J. Yoon, “Gene and exon prediction using allpass-based filters,” in Proceedings of the IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS '02), Raleigh, NC, USA, 2002.
  22. E. Ifeachor and B. Jervis, Digital Signal Processing: A Practical Approach, Prentice-Hall, 2nd edition, 2002.
  23. N. Chakravarthy, A. Spanias, L. D. Iasemidis, and K. Tsakalis, “Autoregressive modeling and feature analysis of DNA sequences,” Eurasip Journal on Applied Signal Processing, vol. 2004, no. 1, pp. 13–28, 2004.
  24. M. Akhtar, E. Ambikairajah, and J. Epps, “Detection of period-3 behavior in genomic sequences using singular value decomposition,” in Proceedings of the IEEE 2005 International Conference on Emerging Technologies (ICET '05), pp. 13–17, September 2005.
  25. M. Akhtar, “Comparison of gene and exon prediction techniques for detection of short coding regions,” International Journal of Information Technology, vol. 11, pp. 26–35, 2005.
  26. H. Yan and T. D. Pham, “Spectral estimation techniques for DNA sequence and microarray data analysis,” Current Bioinformatics, vol. 2, no. 2, pp. 145–156, 2007.
  27. M. K. Choong and H. Yan, “Multi-scale parametric spectral analysis for exon detection in DNA sequences based on forward-backward linear prediction and singular value decomposition of the double-base curves,” Bioinformation, vol. 2, pp. 273–278, 2008.
  28. S. Xu, N. Rao, X. Chen, and B. Zhou, “Inferring an organism-specific optimal threshold for predicting protein coding regions in eukaryotes based on a bootstrapping algorithm,” Biotechnology Letters, vol. 33, no. 5, pp. 889–896, 2011.
  29. O. Abbasi, A. Rostami, and G. Karimian, “Identification of exonic regions in DNA sequences using cross-correlation and noise suppression by discrete wavelet transform,” BMC Bioinformatics, vol. 12, article 430, 2011.
  30. S. Rogic, A. K. Mackworth, and F. B. F. Ouellette, “Evaluation of gene-finding programs on mammalian sequences,” Genome Research, vol. 11, pp. 817–832, 2001.
  31. M. Burset and R. Guigó, “Evaluation of gene structure prediction programs,” Genomics, vol. 34, no. 3, pp. 353–367, 1996.
  32. C. Burge, Identification of genes in human genomic DNA, [Ph.D. dissertation], Stanford University, Stanford, Calif, USA, 1997.
  33. H. K. Kwan, B. Y. M. Kwan, and J. Y. Y. Kwan, “Novel methodologies for spectral classification of exon and intron sequences,” Eurasip Journal on Advances in Signal Processing, vol. 2012, no. 1, article 50, 2012.
  34. M. Abo-Zahhad, M. A. Ahmed, and S. A. Abd-Elrahman, “Genomic analysis and classification of exon and intron sequences using DNA numerical mapping techniques,” International Journal of Information Technology and Computer Science, vol. 4, no. 8, pp. 22–36, 2012.
  35. B. D. Silverman and R. Linsker, “A measure of DNA periodicity,” Journal of Theoretical Biology, vol. 118, no. 3, pp. 295–300, 1986.
  36. D. Anastassiou, “Genomic signal processing,” IEEE Signal Processing Magazine, vol. 18, no. 4, pp. 8–20, 2001.
  37. P. D. Cristea, “Genetic signal representation and analysis,” in International Conference on Biomedical Optics, vol. 4623 of Proceedings of SPIE, pp. 77–84, 2002.
  38. S. N. Achuthsankar and S. S. Pillai, “A coding measure scheme employing electron-ion interaction pseudo potential (EIIP),” Bioinformatics, vol. 1, pp. 197–202, 2006.
  39. M. Akhtar, J. Epps, and E. Ambikairajah, “On DNA numerical representations for period-3 based exon prediction,” in Proceedings of the 5th IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS '07), Tuusula, Finland, June 2007.
  40. R. Zhang and C. Zhang, “Identification of replication origins in archaeal genomes based on the Z-curve method,” Archaea, vol. 1, no. 5, pp. 335–346, 2005.
  41. K. Florquin, Y. Saeys, S. Degroeve, P. Rouzé, and Y. van de Peer, “Large-scale structural analysis of the core promoter in mammalian and plant genomes,” Nucleic Acids Research, vol. 33, no. 13, pp. 4255–4264, 2005.
  42. W. F. Zhang and H. Yan, “Exon prediction using empirical mode decomposition and Fourier transform of structural profiles of DNA sequences,” Pattern Recognition, vol. 45, no. 3, pp. 947–955, 2012.
  43. S. M. Kay, Modern Spectral Estimation: Theory and Application, Prentice-Hall, Englewood Cliffs, NJ, USA, 1988.
  44. C. L. Lawson and R. J. Hanson, Solving Least Squares Problems, Prentice-Hall, Englewood Cliffs, NJ, USA, 1974.
  45. H. Akaike, “Fitting autoregressive models for prediction,” Annals of the Institute of Statistical Mathematics, vol. 21, pp. 243–247, 1969.
  46. H. Akaike, “A new look at the statistical model identification,” IEEE Transactions on Automatic Control, vol. 19, pp. 716–723, 1974.
  47. S. W. Lang and J. H. McClellan, “Frequency estimation with maximum entropy spectral estimator,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 6, pp. 716–724, 1980.
  48. L. C. Zhao, J. J. Ma, S. Q. Fan, and Z. Z. Si, “Research on AR model in vibration analysis of rolling bearings,” Chinese Mechanical Engineering, vol. 15, no. 3, pp. 210–213, 2004.
  49. L. T. Guan, “Wavelet interpolation and decomposition in a finite interval with boundary conditions,” Chinese Journal of Engineering Mathematics, vol. 12, no. 3, pp. 1–9, 1995.
  50. X. Wang, C. Liu, F. Bi, X. Bi, and K. Shao, “Fault diagnosis of diesel engine based on adaptive wavelet packets and EEMD-fractal dimension,” Mechanical Systems and Signal Processing, vol. 41, no. 1-2, pp. 581–597, 2013.
  51. C. M. Vong and P. K. Wong, “Engine ignition signal diagnosis with Wavelet Packet Transform and Multi-class Least Squares Support Vector Machines,” Expert Systems with Applications, vol. 38, no. 7, pp. 8563–8570, 2011.
  52. S. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way, Academic Press, 3rd edition, 2009.
  53. S. Xu, Research on the thresholds selection based on the bootstrap algorithm in gene-prediction [M.S. thesis], University of Electronic Science and Technology of China, Chengdu, China, 2008.
  54. J. Y. Y. Kwan, B. Y. M. Kwan, and H. K. Kwan, “Spectral analysis of numerical exon and intron sequences,” in Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW '10), pp. 876–877, Hong Kong, China, December 2010.
  55. M. Akhtar, J. Epps, and E. Ambikairajah, “Signal processing in sequence analysis: advances in eukaryotic gene prediction,” IEEE Journal on Selected Topics in Signal Processing, vol. 2, no. 3, pp. 310–321, 2008.
  56. T. Fawcett, ROC Graphs: Notes and Practical Considerations for Researchers, HP Laboratories, Palo Alto, Calif, USA, 2003, http://www.hpl.hp.com/techreports/2003/HPL-2003-4.pdf.
  57. Z.-F. Li, C.-G. Zhang, Z.-Y. Shen, and X.-Y. Hang, “Dual coding genes in eukaryote,” Progress in Biochemistry and Biophysics, vol. 36, no. 5, pp. 536–540, 2009.
  58. W. Y. Chung, S. Wadhawan, R. Szklarczyk, S. K. Pond, and A. Nekrutenko, “A first look at ARFome: dual-coding genes in mammalian genomes,” PLoS Computational Biology, vol. 3, no. 5, article e91, 2007.