BioMed Research International
Volume 2015 (2015), Article ID 254838, 7 pages
http://dx.doi.org/10.1155/2015/254838
Research Article

METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text

1. School of Engineering, Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore DC, QLD 4558, Australia
2. School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China
3. Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Peking University, Beijing 100871, China

Received 13 March 2015; Accepted 21 June 2015

Academic Editor: Shigehiko Kanaya

Copyright © 2015 Min Zhao et al. This is an open access paper distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. I. Thiele and B. Ø. Palsson, “A protocol for generating a high-quality genome-scale metabolic reconstruction,” Nature Protocols, vol. 5, no. 1, pp. 93–121, 2010.
  2. Q. Ren, K. Chen, and I. T. Paulsen, “TransportDB: a comprehensive database resource for cytoplasmic membrane transport systems and outer membrane channels,” Nucleic Acids Research, vol. 35, no. 1, pp. D274–D279, 2007.
  3. D. Campa, K. Butterbach, S. L. Slager et al., “A comprehensive study of polymorphisms in the ABCB1, ABCC2, ABCG2, NR1I2 genes and lymphoma risk,” International Journal of Cancer, vol. 131, no. 4, pp. 803–812, 2012.
  4. M. Zhao, X. Chen, G. Gao, L. Tao, and L. Wei, “RLEdb: a database of rate-limiting enzymes and their regulation in human, rat, mouse, yeast and E. coli,” Cell Research, vol. 19, no. 6, pp. 793–795, 2009.
  5. M. Zhao and H. Qu, “PathLocdb: a comprehensive database for the subcellular localization of metabolic pathways and its application to multiple localization analysis,” BMC Genomics, vol. 11, supplement 4, article S13, 2010.
  6. M. Zhao, Y. M. Chen, D. Qu, and H. Qu, “TSdb: a database of transporter substrates linking metabolic pathways and transporter systems on a genome scale via their shared substrates,” Science China-Life Sciences, vol. 54, no. 1, pp. 60–64, 2011.
  7. M. H. Saier Jr., C. V. Tran, and R. D. Barabote, “TCDB: the transporter classification database for membrane transport protein analyses and information,” Nucleic Acids Research, vol. 34, pp. D181–D186, 2006.
  8. http://www.ncbi.nlm.nih.gov/pubmed.
  9. The UniProt Consortium, “The Universal Protein Resource (UniProt) in 2010,” Nucleic Acids Research, vol. 38, supplement 1, pp. D142–D148, 2009.
  10. M. Krallinger and A. Valencia, “Text-mining and information-retrieval services for molecular biology,” Genome Biology, vol. 6, no. 7, article 224, 2005.
  11. F. M. Couto, M. J. Silva, V. Lee et al., “GOAnnotator: linking protein GO annotations to evidence text,” Journal of Biomedical Discovery and Collaboration, vol. 1, article 19, 2006.
  12. A. Doms and M. Schroeder, “GoPubMed: exploring PubMed with the gene ontology,” Nucleic Acids Research, vol. 33, no. 2, pp. W783–W786, 2005.
  13. R. Hoffmann, M. Krallinger, E. Andres, J. Tamames, C. Blaschke, and A. Valencia, “Text mining for metabolic pathways, signaling cascades, and protein networks,” Science's STKE, vol. 2005, no. 283, p. pe21, 2005.
  14. S. Ananiadou, S. Pyysalo, J. Tsujii, and D. B. Kell, “Event extraction for systems biology by text mining the literature,” Trends in Biotechnology, vol. 28, no. 7, pp. 381–390, 2010.
  15. D. Zhou and Y. He, “Extracting interactions between proteins from the literature,” Journal of Biomedical Informatics, vol. 41, no. 2, pp. 393–407, 2008.
  16. L. Tari, S. Anwar, S. Liang, J. Cai, and C. Baral, “Discovering drug-drug interactions: a text-mining and reasoning approach based on properties of drug metabolism,” Bioinformatics, vol. 26, no. 18, pp. i547–i553, 2010.
  17. M. Zhao and H. Qu, “Human liver rate-limiting enzymes influence metabolic flux via branch points and inhibitors,” BMC Genomics, vol. 10, supplement 3, article S31, 2009.
  18. M. Zhao and H. Qu, “High similarity of phylogenetic profiles of rate-limiting enzymes with inhibitory relation in Human, Mouse, Rat, budding Yeast and E. coli,” BMC Genomics, vol. 12, supplement 3, article S10, 2011.
  19. A. Y. Ye, Q. Liu, C. Li, M. Zhao, H. Qu, and Y. Xue, “Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes,” PLoS ONE, vol. 9, no. 2, Article ID e88883, 2014.
  20. L. Kong, L. Cheng, L.-Y. Fan, M. Zhao, and H. Qu, “IQdb: an intelligence quotient score-associated gene resource for human intelligence,” Database, vol. 2013, Article ID bat063, 2013.
  21. M. Zhao, X. Li, and H. Qu, “EDdb: a web resource for eating disorder and its application to identify an extended adipocytokine signaling pathway related to eating disorder,” Science China Life Sciences, vol. 56, no. 12, pp. 1086–1096, 2013.
  22. C. Nobata, P. D. Dobson, S. A. Iqbal et al., “Mining metabolites: extracting the yeast metabolome from the literature,” Metabolomics, vol. 7, no. 1, pp. 94–101, 2011.
  23. J. Ding, D. Berleant, D. Nettleton, and E. Wurtele, “Mining MEDLINE: abstracts, sentences, or phrases?” Pacific Symposium on Biocomputing, pp. 326–337, 2002.
  24. C. Elkan and K. Noto, “Learning classifiers from only positive and unlabeled data,” in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08), pp. 213–220, August 2008.
  25. A. V. Aho, J. E. Hopcroft, and J. D. Ullman, Data Structures and Algorithms, Addison-Wesley, Reading, Mass, USA, 1983.
  26. T. M. Mitchell, Machine Learning, McGraw-Hill, New York, NY, USA, 1997.
  27. M. F. Porter, “An algorithm for suffix stripping,” Program-Electronic Library and Information Systems, vol. 40, no. 3, pp. 211–218, 2006.
  28. A. K. McCallum, MALLET: A Machine Learning for Language Toolkit, 2002.
  29. A. L. Berger, V. J. Della Pietra, and S. A. Della Pietra, “A maximum entropy approach to natural language processing,” Computational Linguistics, vol. 22, no. 1, pp. 39–71, 1996.
  30. S. Guiasu and A. Shenitzer, “The principle of maximum entropy,” The Mathematical Intelligencer, vol. 7, no. 1, pp. 42–48, 1985.
  31. K. Nigam, J. Lafferty, and A. McCallum, “Using maximum entropy for text classification,” in Proceedings of the Workshop on Machine Learning for Information Filtering (IJCAI '99), pp. 61–67, 1999.
  32. M. Craven and J. Kumlien, “Constructing biological knowledge bases by extracting information from text sources,” in Proceedings of the 7th International Conference on Intelligent Systems for Molecular Biology, pp. 77–86, 1999.
  33. S. Das, M. H. Saier Jr., and C. Elkan, “Finding transport proteins in a general protein database,” in Knowledge Discovery in Databases: PKDD 2007, vol. 4702 of Lecture Notes in Computer Science, pp. 54–66, Springer, Berlin, Germany, 2007.