International Journal of Genomics
Volume 2015 (2015), Article ID 608042, 7 pages
http://dx.doi.org/10.1155/2015/608042
Research Article

PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction

1Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
2Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA

Received 7 January 2015; Revised 22 July 2015; Accepted 26 July 2015

Academic Editor: Ian Dunham

Copyright © 2015 Jianzhuang Yao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. A.-C. Gavin, M. Bösche, R. Krause et al., “Functional organization of the yeast proteome by systematic analysis of protein complexes,” Nature, vol. 415, no. 6868, pp. 141–147, 2002.
  2. B. Alberts, “The cell as a collection of protein machines: preparing the next generation of molecular biologists,” Cell, vol. 92, no. 3, pp. 291–294, 1998.
  3. D. Devos and R. B. Russell, “A more complete, complexed and structured interactome,” Current Opinion in Structural Biology, vol. 17, no. 3, pp. 370–377, 2007.
  4. A.-C. Gavin, P. Aloy, P. Grandi et al., “Proteome survey reveals modularity of the yeast cell machinery,” Nature, vol. 440, no. 7084, pp. 631–636, 2006.
  5. A. Kumar and M. Snyder, “Proteomics: protein complexes take the bait,” Nature, vol. 415, no. 6868, pp. 123–124, 2002.
  6. I. Xenarios, “DIP: the database of interacting proteins,” Nucleic Acids Research, vol. 28, no. 1, pp. 289–291, 2000.
  7. A. Franceschini, D. Szklarczyk, S. Frankild et al., “STRING v9.1: protein-protein interaction networks, with increased coverage and integration,” Nucleic Acids Research, vol. 41, no. D1, pp. D808–D815, 2013.
  8. C. von Mering, R. Krause, B. Snel et al., “Comparative assessment of large-scale data sets of protein–protein interactions,” Nature, vol. 417, no. 6887, pp. 399–403, 2002.
  9. J. Planas-Iglesias, J. Bonet, J. García-García, M. A. Marín-López, E. Feliu, and B. Oliva, “Understanding protein–protein interactions using local structural features,” Journal of Molecular Biology, vol. 425, no. 7, pp. 1210–1224, 2013.
  10. J. L. Sussman, D. Lin, J. Jiang et al., “Protein Data Bank (PDB): database of three-dimensional structural information of biological macromolecules,” Acta Crystallographica Section D Biological Crystallography, vol. 54, no. 6, part 1, pp. 1078–1084, 1998.
  11. G. T. Hart, A. Ramani, and E. Marcotte, “How complete are current yeast and human protein-interaction networks?” Genome Biology, vol. 7, no. 11, p. 120, 2006.
  12. G. Gallone, T. I. Simpson, J. D. Armstrong, and A. P. Jarman, “Bio::Homology::InterologWalk—a Perl module to build putative protein-protein interaction networks through interolog mapping,” BMC Bioinformatics, vol. 12, article 289, 2011.
  13. C. Y. Yu, L. C. Chou, and D. T. H. Chang, “Predicting protein-protein interactions in unbalanced data using the primary structure of proteins,” BMC Bioinformatics, vol. 11, article 167, 2010.
  14. J. Garcia-Garcia, E. Guney, R. Aragues, J. Planas-Iglesias, and B. Oliva, “Biana: a software framework for compiling biological interactions and analyzing networks,” BMC Bioinformatics, vol. 11, no. 1, article 56, 2010.
  15. Y. Liu, I. Kim, and H. Zhao, “Protein interaction predictions from diverse sources,” Drug Discovery Today, vol. 13, no. 9-10, pp. 409–416, 2008.
  16. Y. Qi, J. Klein-Seetharaman, and Z. Bar-Joseph, “Random forest similarity for protein-protein interaction prediction from multiple sources,” in Proceedings of the Pacific Symposium on Biocomputing, pp. 531–542, January 2005.
  17. X. W. Chen and M. Liu, “Prediction of protein-protein interactions using random decision forest framework,” Bioinformatics, vol. 21, no. 24, pp. 4394–4400, 2005.
  18. K. A. Theofilatos, C. M. Dimitrakopoulos, A. K. Tsakalidis, S. D. Likothanassis, S. T. Papadimitriou, and S. P. Mavroudi, “Computational approaches for the prediction of protein-protein interactions: a survey,” Current Bioinformatics, vol. 6, no. 4, pp. 398–414, 2011.
  19. J. Garcia-Garcia, S. Schleker, J. Klein-Seetharaman, and B. Oliva, “BIPS: BIANA Interolog Prediction Server. A tool for protein-protein interaction inference,” Nucleic Acids Research, vol. 40, no. W1, pp. W147–W151, 2012.
  20. R. Jansen, H. Yu, D. Greenbaum et al., “A Bayesian networks approach for predicting protein-protein interactions from genomic data,” Science, vol. 302, no. 5644, pp. 449–453, 2003.
  21. X.-W. Chen, M. Liu, and Y. Hu, “Integrative neural network approach for protein interaction prediction from heterogeneous data,” in Advanced Data Mining and Applications, C. Tang, C. X. Ling, X. Zhou, N. J. Cercone, and X. Li, Eds., vol. 5139 of Lecture Notes in Computer Science, pp. 532–539, Springer, Berlin, Germany, 2008.
  22. S. M. Gomez, W. S. Noble, and A. Rzhetsky, “Learning to predict protein–protein interactions from protein sequences,” Bioinformatics, vol. 19, no. 15, pp. 1875–1881, 2003.
  23. C. Strobl, A.-L. Boulesteix, A. Zeileis, and T. Hothorn, “Bias in random forest variable importance measures: illustrations, sources and a solution,” BMC Bioinformatics, vol. 8, no. 1, article 25, 2007.
  24. Y. Qi, Z. Bar-Joseph, and J. Klein-Seetharaman, “Evaluation of different biological data and computational classification methods for use in protein interaction prediction,” Proteins: Structure, Function, and Bioinformatics, vol. 63, no. 3, pp. 490–500, 2006.
  25. Y. Zhang, D. Zhang, G. Mi et al., “Using ensemble methods to deal with imbalanced data in predicting protein-protein interactions,” Computational Biology and Chemistry, vol. 36, pp. 36–41, 2012.
  26. S. M. Augusty and S. Izudheen, “A survey: evaluation of ensemble classifiers and data level methods to deal with imbalanced data problem in protein-protein interactions,” Review of Bioinformatics and Biometrics, vol. 2, no. 1, 2013.
  27. M. Pellegrini, E. M. Marcotte, M. J. Thompson, D. Eisenberg, and T. O. Yeates, “Assigning protein functions by comparative genome analysis: protein phylogenetic profiles,” Proceedings of the National Academy of Sciences, vol. 96, no. 8, pp. 4285–4288, 1999.
  28. T. Gaasterland and M. A. Ragan, “Constructing multigenome views of whole microbial genomes,” Microbial & Comparative Genomics, vol. 3, no. 3, pp. 177–192, 1998.
  29. E. S. Snitkin, A. M. Gustafson, J. Mellor, J. Wu, and C. DeLisi, “Comparative assessment of performance and genome dependence among phylogenetic profiling methods,” BMC Bioinformatics, vol. 7, no. 1, article 420, 2006.
  30. R. Jothi, T. M. Przytycka, and L. Aravind, “Discovering functional linkages and uncharacterized cellular pathways using phylogenetic profile comparisons: a comprehensive assessment,” BMC Bioinformatics, vol. 8, no. 1, article 173, 17 pages, 2007.
  31. J. Sun, Y. Li, and Z. Zhao, “Phylogenetic profiles for the prediction of protein–protein interactions: how to select reference organisms?” Biochemical and Biophysical Research Communications, vol. 353, no. 4, pp. 985–991, 2007.
  32. D. Herman, D. Ochoa, D. Juan, D. Lopez, A. Valencia, and F. Pazos, “Selection of organisms for the co-evolution-based study of protein interactions,” BMC Bioinformatics, vol. 12, no. 1, article 363, 2011.
  33. M. Simonsen, S. R. Maetschke, and M. A. Ragan, “Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling,” Bioinformatics, vol. 28, no. 6, pp. 851–857, 2012.
  34. S. V. Date and E. M. Marcotte, “Discovery of uncharacterized cellular systems by genome-wide analysis of functional linkages,” Nature Biotechnology, vol. 21, no. 9, pp. 1055–1062, 2003.
  35. J. Wu, S. Kasif, and C. DeLisi, “Identification of functional links between genes using phylogenetic profiles,” Bioinformatics, vol. 19, no. 12, pp. 1524–1530, 2003.
  36. S. Cokus, S. Mizutani, and M. Pellegrini, “An improved method for identifying functionally linked proteins using phylogenetic profiles,” BMC Bioinformatics, vol. 8, supplement 4, article S7, 2007.
  37. S. Singh and D. P. Wall, “Testing the accuracy of eukaryotic phylogenetic profiles for prediction of biological function,” Evolutionary Bioinformatics, vol. 4, pp. 217–223, 2008.
  38. S. R. Maetschke, M. Simonsen, M. J. Davis, and M. A. Ragan, “Gene Ontology-driven inference of protein-protein interactions using inducers,” Bioinformatics, vol. 28, no. 1, pp. 69–75, 2011.
  39. C. Lin, W. Chen, C. Qiu, Y. Wu, S. Krishnan, and Q. Zou, “LibD3C: ensemble classifiers with a clustering and dynamic selection strategy,” Neurocomputing, vol. 123, pp. 424–435, 2014.
  40. L. Song, D. Li, X. Zeng, Y. Wu, L. Guo, and Q. Zou, “nDNA-prot: identification of DNA-binding proteins based on unbalanced classification,” BMC Bioinformatics, vol. 15, no. 1, article 298, 2014.
  41. C. Lin, Y. Zou, J. Qin et al., “Hierarchical classification of protein folds using a novel ensemble classifier,” PLoS ONE, vol. 8, no. 2, Article ID e56499, 2013.
  42. P. Smialowski, P. Pagel, P. Wong et al., “The Negatome database: a reference set of non-interacting protein pairs,” Nucleic Acids Research, vol. 38, supplement 1, pp. D540–D544, 2010.
  43. L. Zhang, S. Wong, O. King, and F. P. Roth, “Predicting co-complexed protein pairs using genomic and proteomic data integration,” BMC Bioinformatics, vol. 5, no. 1, article 38, 2004.
  44. L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.