BioMed Research International
Volume 2015 (2015), Article ID 873012, 10 pages
http://dx.doi.org/10.1155/2015/873012
Research Article

Recognition and Evaluation of Clinical Section Headings in Clinical Documents Using Token-Based Formulation with Conditional Random Fields

1 Department of Computer Science and Information Engineering, National Taitung University, Taitung 95092, Taiwan
2 Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11042, Taiwan

Received 15 January 2015; Revised 20 April 2015; Accepted 11 May 2015

Academic Editor: X. L. Li

Copyright © 2015 Hong-Jie Dai et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. W. T. Kerr, E. P. Lau, G. E. Owens, and A. Trefler, “The future of medical diagnostics: large digitized databases,” The Yale Journal of Biology and Medicine, vol. 85, no. 3, pp. 363–377, 2012.
  2. D. Capurro, Secondary use of electronic clinical data: barriers, facilitators and a proposed solution [Ph.D. thesis], University of Washington, 2013.
  3. H. Liu, S. J. Bielinski, S. Sohn et al., “An information extraction framework for cohort identification using electronic health records,” AMIA Joint Summits on Translational Science Proceedings, vol. 2013, pp. 149–153, 2013.
  4. C. Friedman, L. Shagina, Y. Lussier, and G. Hripcsak, “Automated encoding of clinical documents based on natural language processing,” Journal of the American Medical Informatics Association, vol. 11, no. 5, pp. 392–402, 2004.
  5. A. R. Aronson, “Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program,” Journal of Biomedical Informatics, vol. 35, pp. 17–21, 2001.
  6. G. K. Savova, J. J. Masanz, P. V. Ogren et al., “Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications,” Journal of the American Medical Informatics Association, vol. 17, no. 5, pp. 507–513, 2010.
  7. J. C. Denny, R. A. Miller, K. B. Johnson, and A. Spickard III, “Development and evaluation of a clinical note section header terminology,” AMIA Annual Symposium Proceedings, vol. 2008, pp. 156–160, 2008.
  8. S. T. Rosenbloom, J. C. Denny, H. Xu, N. Lorenzi, W. W. Stead, and K. B. Johnson, “Data from clinical notes: a perspective on the tension between structure and flexible documentation,” Journal of the American Medical Informatics Association, vol. 18, no. 2, pp. 181–186, 2011.
  9. A. Stubbs, C. Kotfila, H. Xu, and O. Uzuner, “Practical applications for NLP in Clinical Research: the 2014 i2b2/UTHealth shared tasks,” in Proceedings of the i2b2 Shared Task and Workshop Challenges in Natural Language Processing for Clinical Data, 2014.
  10. J. Lafferty, A. McCallum, and F. Pereira, “Conditional random fields: probabilistic models for segmenting and labeling sequence data,” in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 282–289, 2001.
  11. Y. Li, S. Lipsky Gorman, and N. Elhadad, “Section classification in clinical notes using supervised hidden Markov model,” in Proceedings of the 1st ACM International Health Informatics Symposium (IHI '10), pp. 744–750, Arlington, Va, USA, November 2010.
  12. K. A. Ganesan and M. Subotin, “A general supervised approach to segmentation of clinical texts,” in Proceedings of the IEEE International Conference on Big Data (Big Data '14), pp. 33–40, IEEE, Anchorage, Alaska, USA, October 2014.
  13. C.-W. Chen, N.-W. Chang, Y.-C. Chang, and H.-J. Dai, “Section heading recognition in electronic health records using conditional random fields,” in Technologies and Applications of Artificial Intelligence, S.-M. Cheng and M.-Y. Day, Eds., vol. 8916 of Lecture Notes in Computer Science, pp. 47–55, Springer International Publishing, 2014.
  14. P. Stenetorp, S. Pyysalo, G. Topić, T. Ohta, S. Ananiadou, and J. I. Tsujii, “brat: a web-based tool for NLP-assisted text annotation,” in Proceedings of the Demonstrations Session at EACL 2012, Avignon, France, 2012.
  15. L. Smith, T. Rindflesch, and W. J. Wilbur, “MedPost: a part-of-speech tagger for bioMedical text,” Bioinformatics, vol. 20, no. 14, pp. 2320–2321, 2004.
  16. R. T.-H. Tsai, C.-L. Sung, H.-J. Dai, H.-C. Hung, T.-Y. Sung, and W.-L. Hsu, “NERBio: using selected word conjunctions, term normalization, and global patterns to improve biomedical named entity recognition,” BMC Bioinformatics, vol. 7, supplement 5, article S11, 2006.
  17. M. Tepper, D. Capurro, F. Xia, L. Vanderwende, and M. Yetisgen-Yildiz, “Statistical section segmentation in free-text clinical records,” in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 2001–2008, European Language Resources Association (ELRA), 2012.
  18. D. Beeferman, A. Berger, and J. Lafferty, “Statistical models for text segmentation,” Machine Learning, vol. 34, no. 1–3, pp. 177–210, 1999.
  19. J. Lin, D. Karakos, D. Demner-Fushman, and S. Khudanpur, “Generative content models for structural analysis of medical abstracts,” in Proceedings of the Workshop on Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis (BioNLP '06), pp. 65–72, June 2006.
  20. K. Hirohata, N. Okazaki, S. Ananiadou, and M. Ishizuka, “Identifying sections in scientific abstracts using conditional random fields,” in Proceedings of the 3rd International Joint Conference of Natural Language Processing (IJCNLP '08), Hyderabad, India, 2008.
  21. R. T. K. Lin, H.-J. Dai, Y.-Y. Bow, J. L.-T. Chiu, and R. T.-H. Tsai, “Using conditional random fields for result identification in biomedical abstracts,” Integrated Computer-Aided Engineering, vol. 16, no. 4, pp. 339–352, 2009.
  22. A. Varga, D. Preotiuc-Pietro, and F. Ciravegna, “Unsupervised document zone identification using probabilistic graphical models,” in Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12), pp. 1610–1617, Istanbul, Turkey, May 2012.
  23. J. C. Denny, A. Spickard III, K. B. Johnson, N. B. Peterson, J. F. Peterson, and R. A. Miller, “Evaluation of a method to identify and categorize section headers in clinical documents,” Journal of the American Medical Informatics Association, vol. 16, no. 6, pp. 806–815, 2009.