Computational and Mathematical Methods in Medicine
Volume 2013 (2013), Article ID 524502, 8 pages
http://dx.doi.org/10.1155/2013/524502
Research Article

Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information

¹Golden Audit College, Nanjing Audit University, Nanjing 210029, China
²School of Geography and Biological Information, Nanjing University of Posts and Telecommunications, Nanjing 210046, China
³Graduate School of Chinese Academy of Agricultural Sciences, Beijing 100081, China

Received 13 May 2013; Accepted 19 August 2013

Academic Editor: Nestor V. Torres

Copyright © 2013 Xin Ma et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. C. J. Drummond, G. J. Finlay, L. Broome, E. S. Marshall, E. Richardson, and B. C. Baguley, “Action of SN 28049, a new DNA binding topoisomerase II-directed antitumour drug: comparison with doxorubicin and etoposide,” Investigational New Drugs, vol. 29, no. 5, pp. 1102–1110, 2011.
  2. H. Gao and K. Dahlman-Wright, “From DNA binding to metabolic control: integration of -omics data reveals drug targets for prostate cancer,” EMBO Journal, vol. 30, no. 13, pp. 2516–2517, 2011.
  3. G. Y. Park, J. J. Wilson, Y. Song, and S. J. Lippard, “Phenanthriplatin, a monofunctional DNA-binding platinum anticancer drug candidate with unusual potency and cellular activity profile,” Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 30, pp. 11987–11992, 2012.
  4. H. Zhao, Y. Yang, and Y. Zhou, “Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function,” Bioinformatics, vol. 26, no. 15, Article ID btq295, pp. 1857–1863, 2010.
  5. S. Ahmad, M. M. Gromiha, and A. Sarai, “Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information,” Bioinformatics, vol. 20, no. 4, pp. 477–486, 2004.
  6. W. A. McLaughlin, D. W. Kulp, J. de la Cruz, X. Lu, C. L. Lawson, and H. M. Berman, “A structure-based method for identifying DNA-binding proteins and their sites of DNA-interaction,” Journal of Structural and Functional Genomics, vol. 5, no. 4, pp. 255–265, 2005.
  7. G. Nimrod, A. Szilágyi, C. Leslie, and N. Ben-Tal, “Identification of DNA-binding proteins using structural, electrostatic and evolutionary features,” Journal of Molecular Biology, vol. 387, no. 4, pp. 1040–1053, 2009.
  8. A. Szaboova, O. Kuzelka, F. Zelezny, and J. Tolar, “Prediction of DNA-binding proteins from relational features,” Proteome Science, vol. 10, no. 1, article 66, 2012.
  9. N. Bhardwaj, R. E. Langlois, G. Zhao, and H. Lu, “Kernel-based machine learning protocol for predicting DNA-binding proteins,” Nucleic Acids Research, vol. 33, no. 20, pp. 6486–6493, 2005.
  10. Y. D. Cai and S. L. Lin, “Support vector machines for predicting rRNA-, RNA-, and DNA-binding proteins from amino acid sequence,” Biochimica et Biophysica Acta, vol. 1648, no. 1-2, pp. 127–133, 2003.
  11. X. Yu, J. Cao, Y. Cai, T. Shi, and Y. Li, “Predicting rRNA-, RNA-, and DNA-binding proteins from primary structure with support vector machines,” Journal of Theoretical Biology, vol. 240, no. 2, pp. 175–184, 2006.
  12. M. Kumar, M. M. Gromiha, and G. P. S. Raghava, “Identification of DNA-binding proteins using support vector machines and evolutionary profiles,” BMC Bioinformatics, vol. 8, article 463, 2007.
  13. X. Shao, Y. Tian, L. Wu, Y. Wang, L. Jing, and N. Deng, “Predicting DNA- and RNA-binding proteins from sequences with kernel methods,” Journal of Theoretical Biology, vol. 258, no. 2, pp. 289–293, 2009.
  14. A. K. Patel, S. Patel, and P. K. Naik, “Binary classification of uncharacterized proteins into DNA binding/non-DNA binding proteins from sequence derived features using ANN,” Digest Journal of Nanomaterials and Biostructures, vol. 4, no. 4, pp. 775–782, 2009.
  15. K. K. Kumar, G. Pugalenthi, and P. N. Suganthan, “DNA-prot: identification of DNA binding proteins from protein sequence information using random forest,” Journal of Biomolecular Structure and Dynamics, vol. 26, no. 6, pp. 679–686, 2009.
  16. W. Z. Lin, J. A. Fang, X. Xiao, and K. C. Chou, “iDNA-prot: identification of DNA binding proteins using random forest with grey model,” PLoS ONE, vol. 6, no. 9, Article ID e24756, 2011.
  17. X. Ma, J. Guo, H. D. Liu, J. M. Xie, and X. Sun, “Sequence-based prediction of DNA-binding residues in proteins with conservation and correlation information,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 6, pp. 1766–1775, 2012.
  18. The UniProt Consortium, “Reorganizing the protein space at the universal protein resource (UniProt),” Nucleic Acids Research, vol. 40, no. 1, pp. D71–D75, 2012.
  19. S. Ahmad and A. Sarai, “Moment-based prediction of DNA-binding proteins,” Journal of Molecular Biology, vol. 341, no. 1, pp. 65–71, 2004.
  20. C. R. Peng, L. Liu, B. Niu et al., “Prediction of RNA-binding proteins by voting systems,” Journal of Biomedicine and Biotechnology, vol. 2011, Article ID 506205, 8 pages, 2011.
  21. J. R. Bock and D. A. Gough, “Predicting protein-protein interactions from primary structure,” Bioinformatics, vol. 17, no. 5, pp. 455–460, 2001.
  22. C. H. Q. Ding and I. Dubchak, “Multi-class protein fold recognition using support vector machines and neural networks,” Bioinformatics, vol. 17, no. 4, pp. 349–358, 2001.
  23. C. Z. Cai, L. Y. Han, Z. L. Ji, X. Chen, and Y. Z. Chen, “SVM-Prot: web-based support vector machine software for functional classification of a protein from its primary sequence,” Nucleic Acids Research, vol. 31, no. 13, pp. 3692–3697, 2003.
  24. S. Ahmad and A. Sarai, “PSSM-based prediction of DNA binding sites in proteins,” BMC Bioinformatics, vol. 6, article 33, 2005.
  25. S. Y. Ho, F. C. Yu, C. Y. Chang, and H. L. Huang, “Design of accurate predictors for DNA-binding sites in proteins using hybrid SVM-PSSM method,” BioSystems, vol. 90, no. 1, pp. 234–241, 2007.
  26. L. Wang, M. Q. Yang, and J. Y. Yang, “Prediction of DNA-binding residues from protein sequence information using random forests,” BMC Genomics, vol. 10, supplement 1, article S1, 2009.
  27. J. Wu, H. Liu, X. Duan et al., “Prediction of DNA-binding residues in proteins from amino acid sequences using a random forest model with a hybrid feature,” Bioinformatics, vol. 25, no. 1, pp. 30–35, 2009.
  28. X. Ma, J. Wu, H. Liu, X. Yang, J. Xie, and X. Sun, “A SVM-based approach for predicting DNA-binding residues in proteins from amino acid sequences,” in Proceedings of the International Joint Conference on Bioinformatics, Systems Biology and Intelligent Computing (IJCBS '09), pp. 225–229, Shanghai, China, August 2009.
  29. L. Wang, C. Huang, M. Q. Yang, and J. Y. Yang, “BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features,” BMC Systems Biology, vol. 4, supplement 1, article S3, 2010.
  30. S. F. Altschul, T. L. Madden, A. A. Schäffer et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Research, vol. 25, no. 17, pp. 3389–3402, 1997.
  31. J. C. Platt, “Sequential minimal optimization: a fast algorithm for training support vector machines,” Microsoft Research, Technical Report MSR-TR-98-14, 1998.
  32. N. Landwehr, M. Hall, and E. Frank, “Logistic model trees,” Machine Learning, vol. 59, no. 1-2, pp. 161–205, 2005.
  33. L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
  34. P. Domingos and M. Pazzani, “On the optimality of the simple Bayesian classifier under zero-one loss,” Machine Learning, vol. 29, no. 2-3, pp. 103–130, 1997.
  35. L. Rokach and O. Maimon, Data Mining with Decision Trees: Theory and Applications, World Scientific, River Edge, NJ, USA, 2008.
  36. V. N. Vapnik, Statistical Learning Theory, John Wiley & Sons, New York, NY, USA, 1998.
  37. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The WEKA data mining software: an update,” SIGKDD Explorations, vol. 11, no. 1, pp. 10–18, 2009.
  38. S. Campagne, V. Gervais, and A. Milon, “Nuclear magnetic resonance analysis of protein-DNA interactions,” Journal of the Royal Society Interface, vol. 8, no. 61, pp. 1065–1078, 2011.