BioMed Research International
Volume 2015, Article ID 918710, 7 pages
http://dx.doi.org/10.1155/2015/918710
Research Article

GNormPlus: An Integrative Approach for Tagging Genes, Gene Families, and Protein Domains

¹National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, MD 20894, USA
²Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan

Received 15 January 2015; Revised 3 April 2015; Accepted 4 April 2015

Academic Editor: Yudong Cai

Copyright © 2015 Chih-Hsuan Wei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. Z. Lu, “PubMed and beyond: a survey of web tools for searching biomedical literature,” Database, vol. 2011, Article ID baq036, 2011.
  2. P. Zweigenbaum, D. Demner-Fushman, H. Yu, and K. B. Cohen, “Frontiers of biomedical text mining: current progress,” Briefings in Bioinformatics, vol. 8, no. 5, pp. 358–375, 2007.
  3. A. Rzhetsky, M. Seringhaus, and M. Gerstein, “Seeking a new biology through text mining,” Cell, vol. 134, no. 1, pp. 9–13, 2008.
  4. H. Shatkay and R. Feldman, “Mining the biomedical literature in the genomic era: an overview,” Journal of Computational Biology, vol. 10, no. 6, pp. 821–855, 2003.
  5. D. Rebholz-Schuhmann, H. Kirsch, and F. Couto, “Facts from text—is text mining ready to deliver?” PLoS Biology, vol. 3, no. 2, article e65, 2005.
  6. S. Ananiadou, D. B. Kell, and J.-I. Tsujii, “Text mining and its potential applications in systems biology,” Trends in Biotechnology, vol. 24, no. 12, pp. 571–579, 2006.
  7. M. Krallinger, F. Leitner, C. Rodriguez-Penagos, and A. Valencia, “Overview of the protein-protein interaction annotation extraction task of BioCreative II,” Genome Biology, vol. 9, supplement 2, article S4, 2008.
  8. M. Krallinger, M. Vazquez, F. Leitner et al., “The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text,” BMC Bioinformatics, vol. 12, supplement 8, article S3, 2011.
  9. W. A. Baumgartner Jr., Z. Lu, H. L. Johnson et al., “Concept recognition for extracting protein interaction relations from biomedical text,” Genome Biology, vol. 9, supplement 2, article S9, 2008.
  10. I. Segura-Bedmar, P. Martínez, and M. Herrero-Zazo, “Lessons learnt from the DDIExtraction-2013 shared task,” Journal of Biomedical Informatics, vol. 51, pp. 152–164, 2014.
  11. I. Segura-Bedmar, P. Martínez, and C. de Pablo-Sánchez, “Using a shallow linguistic kernel for drug-drug interaction extraction,” Journal of Biomedical Informatics, vol. 44, no. 5, pp. 789–804, 2011.
  12. J. Gobeill, E. Pasche, D. Vishnyakova, and P. Ruch, “Closing the loop: from paper to protein annotation using supervised Gene Ontology classification,” Database, vol. 2014, Article ID bau088, 2014.
  13. Y. Mao, K. Van Auken, D. Li et al., “Overview of the gene ontology task at BioCreative IV,” Database, vol. 2014, Article ID bau086, 2014.
  14. A. J. Yepes and K. Verspoor, “Mutation extraction tools can be combined for robust recognition of genetic variants in the literature,” F1000Research, vol. 3, article 18, 2014.
  15. C.-H. Wei, B. R. Harris, H.-Y. Kao, and Z. Lu, “TmVar: a text mining approach for extracting sequence variants in biomedical literature,” Bioinformatics, vol. 29, no. 11, pp. 1433–1439, 2013.
  16. E. Doughty, A. Kertesz-Farkas, O. Bodenreider et al., “Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature,” Bioinformatics, vol. 27, no. 3, Article ID btq667, pp. 408–415, 2011.
  17. W. A. Baumgartner Jr., Z. Lu, H. L. Johnson et al., “An integrated approach to concept recognition in biomedical text,” in Proceedings of the 2nd BioCreative Challenge Evaluation Workshop, pp. 257–271, Centro Nacional de Investigaciones Oncologicas (CNIO), Madrid, Spain, 2007.
  18. R. I. Dogan, G. C. Murray, A. Névéol, and Z. Lu, “Understanding PubMed user search behavior through log analysis,” Database, vol. 2009, Article ID bap018, 2009.
  19. C.-H. Wei and H.-Y. Kao, “Cross-species gene normalization by species inference,” BMC Bioinformatics, vol. 12, supplement 8, article S5, 2011.
  20. R. T. Tsai and P.-T. Lai, “Multi-stage gene normalization for full-text articles with context-based species filtering for dynamic dictionary entry selection,” BMC Bioinformatics, vol. 12, supplement 8, article S7, 2011.
  21. C.-J. Kuo, M. H. T. Ling, and C.-N. Hsu, “Soft tagging of overlapping high confidence gene mention variants for cross-species full-text gene normalization,” BMC Bioinformatics, vol. 12, supplement 8, article S6, 2011.
  22. M. Huang, J. Liu, and X. Zhu, “GeneTUKit: a software for document-level gene normalization,” Bioinformatics, vol. 27, no. 7, pp. 1032–1033, 2011.
  23. J. Hakenberg, M. Gerner, M. Haeussler et al., “The GNAT library for local and remote gene mention normalization,” Bioinformatics, vol. 27, no. 19, Article ID btr455, pp. 2769–2771, 2011.
  24. J. Wermter, K. Tomanek, and U. Hahn, “High-performance gene name normalization with GeNo,” Bioinformatics, vol. 25, no. 6, pp. 815–821, 2009.
  25. J. Hakenberg, C. Plake, R. Leaman, M. Schroeder, and G. Gonzalez, “Inter-species normalization of gene mentions with GNAT,” Bioinformatics, vol. 24, no. 16, pp. i126–i132, 2008.
  26. S. van Landeghem, J. Björne, C.-H. Wei et al., “Large-scale event extraction from literature with multi-level gene normalization,” PLoS ONE, vol. 8, no. 4, Article ID e55814, 2013.
  27. R. Leaman, R. I. Doğan, and Z. Lu, “DNorm: disease name normalization with pairwise learning to rank,” Bioinformatics, vol. 29, no. 22, pp. 2909–2917, 2013.
  28. R. Leaman, C.-H. Wei, and Z. Lu, “tmChem: a high performance approach for chemical named entity recognition and normalization,” Journal of Cheminformatics, vol. 7, supplement 1, article S3, 2015.
  29. Z. Lu, H.-Y. Kao, C.-H. Wei et al., “The gene normalization task in BioCreative III,” BMC Bioinformatics, vol. 12, supplement 8, article S2, 2011.
  30. A. A. Morgan, Z. Lu, X. Wang et al., “Overview of BioCreative II gene normalization,” Genome Biology, vol. 9, supplement 2, article S3, 2008.
  31. L. Hirschman, M. Colosimo, A. Morgan, and A. Yeh, “Overview of BioCreAtIvE task 1B: normalized gene lists,” BMC Bioinformatics, vol. 6, supplement 1, article S11, 2005.
  32. C.-C. Huang and Z. Lu, “Community challenges in biomedical text mining over 10 years: success, failure and the future,” Briefings in Bioinformatics, 2015.
  33. A. Yeh, A. Morgan, M. Colosimo, and L. Hirschman, “BioCreAtIvE task 1A: gene mention finding evaluation,” BMC Bioinformatics, vol. 6, supplement 1, article S2, 2005.
  34. L. Smith, L. K. Tanabe, R. Ando et al., “Overview of BioCreative II gene mention recognition,” Genome Biology, vol. 9, supplement 2, article S2, 2008.
  35. C.-N. Hsu, Y.-M. Chang, C.-J. Kuo, Y.-S. Lin, H.-S. Huang, and I.-F. Chung, “Integrating high dimensional bi-directional parsing models for gene mention tagging,” Bioinformatics, vol. 24, no. 13, pp. i286–i294, 2008.
  36. R. Leaman and G. Gonzalez, “BANNER: an executable survey of advances in biomedical named entity recognition,” in Proceedings of the Pacific Symposium on Biocomputing, pp. 652–663, Kohala Coast, Hawaii, USA, January 2008.
  37. M. Torii, Z. Hu, C. H. Wu, and H. Liu, “BioTagger-GM: a gene/protein name recognition system,” Journal of the American Medical Informatics Association, vol. 16, no. 2, pp. 247–255, 2009.
  38. H.-J. Dai, J. C.-Y. Wu, and R. T.-H. Tsai, “Collective instance-level gene normalization on the IGN corpus,” PLoS ONE, vol. 8, no. 11, Article ID e79517, 2013.
  39. L. Li, S. Liu, W. Fan, D. Huang, and H. Zhou, “A multistage gene normalization system integrating multiple effective methods,” PLoS ONE, vol. 8, no. 12, Article ID e81956, 2013.
  40. Y. Hu, Y. Li, H. Lin, Z. Yang, and L. Cheng, “Integrating various resources for gene name normalization,” PLoS ONE, vol. 7, no. 9, Article ID e43558, 2012.
  41. C.-H. Wei, R. Leaman, and Z. Lu, “SimConcept: a hybrid approach for simplifying composite named entities in biomedicine,” in Proceedings of the ACM Conference on Bioinformatics Computational Biology and Health Informatics, pp. 138–146, ACM, Newport Beach, Calif, USA, 2014.
  42. C.-H. Wei, H.-Y. Kao, and Z. Lu, “SR4GN: a species recognition software tool for gene normalization,” PLoS ONE, vol. 7, no. 6, Article ID e38460, 2012.
  43. S. Sohn, D. C. Comeau, W. Kim, and J. W. Wilbur, “Abbreviation definition identification based on automatic precision estimates,” BMC Bioinformatics, vol. 9, no. 1, article 402, 2008.
  44. C.-H. Wei, H.-Y. Kao, and Z. Lu, “PubTator: a web-based text mining tool for assisting biocuration,” Nucleic Acids Research, vol. 41, pp. W518–W522, 2013.
  45. C.-H. Wei, B. R. Harris, D. Li et al., “Accelerating literature curation with text-mining tools: a case study of using PubTator to curate genes in PubMed abstracts,” Database, vol. 2012, Article ID bas041, 2012.
  46. C. N. Arighi, B. Carterette, K. B. Cohen et al., “An overview of the BioCreative 2012 Workshop Track III: interactive text mining task,” Database, vol. 2013, Article ID bas056, 2013.
  47. J. Lafferty, A. McCallum, and F. Pereira, “Conditional random fields: probabilistic models for segmenting and labeling sequence data,” in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 282–289, ACM, Williamstown, Mass, USA, June-July 2001.
  48. D. C. Liu and J. Nocedal, “On the limited memory BFGS method for large scale optimization,” Mathematical Programming B, vol. 45, no. 3, pp. 503–528, 1989.
  49. C.-H. Wei, I.-C. Huang, Y.-Y. Hsu, and H.-Y. Kao, “Normalizing biomedical name entities by similarity-based inference network and de-ambiguity mining,” in Proceedings of the 9th IEEE International Conference on Bioinformatics and Bioengineering, pp. 461–466, Taichung, Taiwan, June 2009.
  50. Z. Lu and L. Hirschman, “Biocuration workflows and text mining: overview of the BioCreative 2012 Workshop Track II,” Database, vol. 2012, Article ID bas043, 2012.
  51. B. Xie, Q. Ding, H. Han, and D. Wu, “miRCancer: a microRNA-cancer association database constructed by text mining on literature,” Bioinformatics, vol. 29, no. 5, pp. 638–644, 2013.