Scientific Programming
Volume 2017, Article ID 7831897, 10 pages
https://doi.org/10.1155/2017/7831897
Research Article

Semantic Annotation of Unstructured Documents Using Concepts Similarity

1National Center of Research and Technological Development (CENIDET), Cuernavaca, MOR, Mexico
2Center for Research and Innovation in Information and Communications Technologies, Ciudad de México, Mexico
3National Institute of Electricity and Clean Energy (INEEL), Cuernavaca, MOR, Mexico

Correspondence should be addressed to Fernando Pech; fpech@cenidet.edu.mx

Received 17 June 2017; Revised 2 October 2017; Accepted 8 November 2017; Published 7 December 2017

Academic Editor: José María Álvarez-Rodríguez

Copyright © 2017 Fernando Pech et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. T. Zhang, K. Liu, and J. A. Zhao, “Graph-based similarity measure between wikipedia concepts and its application in entity linking system,” Journal of Chinese Information Processing, vol. 29, no. 2, pp. 58–67, 2015.
  2. C. Bizer, T. Heath, and T. Berners-Lee, “Linked data - the story so far,” International Journal on Semantic Web and Information Systems, vol. 5, no. 3, article 122, 2009.
  3. C. Bizer, J. Lehmann, G. Kobilarov et al., “DBpedia - a crystallization point for the Web of Data,” Journal of Web Semantics: Science, Services and Agents on the World Wide Web, vol. 7, no. 3, pp. 154–165, 2009.
  4. K. Bollacker, R. Cook, and P. Tufts, “Freebase: a shared database of structured general human knowledge,” in Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI '07), vol. 2, pp. 1962–1963, AAAI Press, British Columbia, Canada, July 2007.
  5. F. M. Suchanek, G. Kasneci, and G. Weikum, “Yago: a core of semantic knowledge,” in Proceedings of the 16th International World Wide Web Conference (WWW '07), pp. 697–706, Alberta, Canada, May 2007.
  6. L. Bos and K. Donnelly, “The advanced terminology and coding system for ehealth,” Studies in Health Technology and Informatics, vol. 121, pp. 279–290, 2009.
  7. J. M. Ruiz-Martínez, R. Valencia-García, J. T. Fernández-Breis, F. García-Sánchez, and R. Martínez-Béjar, “Ontology learning from biomedical natural language documents using UMLS,” Expert Systems with Applications, vol. 38, no. 10, pp. 12365–12378, 2011.
  8. C. Caracciolo, A. Stellato, A. Morshed et al., “The AGROVOC linked dataset,” Journal of Web Semantics, vol. 4, no. 3, pp. 341–348, 2013.
  9. A. R. Aronson and F.-M. Lang, “An overview of MetaMap: historical perspective and recent advances,” Journal of the American Medical Informatics Association, vol. 17, no. 3, pp. 229–236, 2010.
  10. R. Berlanga, V. Nebot, and E. Jiménez, “Semantic annotation of biomedical texts through concept retrieval,” Procesamiento del Lenguaje Natural, vol. 45, pp. 247–250, 2010.
  11. M. Dai, N. Shah, W. Xuan et al., “An efficient solution for mapping free text to ontology terms,” in Proceedings of the American Medical Informatics Association Symposium on Translational BioInformatics (AMIA-TBI '08), Washington, DC, USA, November 2008.
  12. R. Navigli and M. Lapata, “An experimental study of graph connectivity for unsupervised word sense disambiguation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 4, pp. 678–692, 2010.
  13. E. Agirre, O. L. de Lacalle, and A. Soroa, “Random walks for knowledge-based word sense disambiguation,” Computational Linguistics, vol. 40, no. 1, pp. 57–84, 2014.
  14. M. Lee and M. Welsh, “An empirical evaluation of models of text document similarity,” in Proceedings of the 27th Annual Conference of the Cognitive Science Society (CogSci '05), pp. 1254–1259, Erlbaum, Stresa, Italy, July 2005.
  15. P. Ferragina and U. Scaiella, “Fast and accurate annotation of short texts with wikipedia pages,” IEEE Software, vol. 29, no. 1, pp. 70–75, 2012.
  16. P. N. Mendes, M. Jakob, A. García-Silva, and C. Bizer, “DBpedia spotlight: shedding light on the web of documents,” in Proceedings of the 7th International Conference on Semantic Systems (I-SEMANTICS '11), pp. 1–8, Graz, Austria, September 2011.
  17. OpenCalais, 2014, http://www.opencalais.com/.
  18. IBM, AlchemyLanguage, 2015, https://alchemy-language-demo.mybluemix.net/.
  19. Department of Computer Science, The University of Sheffield, Developing Language Processing Components with GATE, 8th edition, 2017, https://gate.ac.uk/userguide.
  20. M. Laclavík, M. Šeleng, M. Ciglan, and L. Hluchý, “Ontea: platform for pattern based automated semantic annotation,” Computing and Informatics, vol. 28, no. 4, pp. 555–579, 2009.
  21. D. Rebholz-Schuhmann, M. Arregui, S. Gaudan, H. Kirsch, and A. Jimeno, “Text processing through web services: calling Whatizit,” Bioinformatics, vol. 24, no. 2, pp. 296–298, 2008.
  22. C. Tao, D. Song, D. Sharma, and C. G. Chute, “Semantator: semantic annotator for converting biomedical text to linked data,” Journal of Biomedical Informatics, vol. 46, no. 5, pp. 882–893, 2013.
  23. B. Popov, A. Kiryakov, A. Kirilov, D. Manov, D. Ognyanoff, and M. Goranov, “KIM - semantic annotation platform,” in Proceedings of the 2nd International Conference on Semantic Web Conference (ISWC '03), vol. 2870 of Lecture Notes in Computer Science, pp. 834–849, Springer, Sanibel Island, Fla, USA, October 2003.
  24. P. Castells, M. Fernández, and D. Vallet, “An adaptation of the vector-space model for ontology-based information retrieval,” IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 2, pp. 261–272, 2007.
  25. M. Fernández, I. Cantador, V. López, D. Vallet, P. Castells, and E. Motta, “Semantically enhanced information retrieval: an ontology-based approach,” Journal of Web Semantics, vol. 9, no. 4, pp. 434–452, 2011.
  26. R. Berlanga, V. Nebot, and M. Pérez, “Tailored semantic annotation for semantic search,” Journal of Web Semantics, vol. 30, pp. 69–81, 2015.
  27. V. Nebot and R. Berlanga, “Exploiting semantic annotations for open information extraction: an experience in the biomedical domain,” Knowledge and Information Systems, vol. 38, no. 2, pp. 365–389, 2014.
  28. D. Fuentes-Lorenzo, N. Fernández, J. A. Fisteus, and L. Sánchez, “Improving large-scale search engines with semantic annotations,” Expert Systems with Applications, vol. 40, no. 6, pp. 2287–2296, 2013.
  29. L. Ding, T. Finin, A. Joshi et al., “Swoogle: a search and metadata engine for the semantic web,” in Proceedings of the 13th ACM International Conference on Information and Knowledge Management (CIKM '04), pp. 652–659, Washington, DC, USA, November 2004.
  30. Y. Lei, V. Uren, and E. Motta, “SemSearch: a search engine for the semantic web,” in Managing Knowledge in a World of Networks, S. Staab and V. Svátek, Eds., vol. 4248 of Lecture Notes in Computer Science, pp. 238–245, Springer, Berlin, Germany, 2006.
  31. C. Tao, Z. Yongjuan, Z. Shen, C. Chengcai, and C. Heng, “Building semantic information search platform with extended sesame framework,” in Proceedings of the 8th International Conference on Semantic Systems (I-SEMANTICS '12), pp. 193–196, New York, NY, USA, September 2012.
  32. S. Saha, A. Sajjanhar, S. Gao, R. Dew, and Y. Zhao, “Delivering categorized news items using RSS feeds and web services,” in Proceedings of the 10th IEEE International Conference on Computer and Information Technology (ScalCom '10), pp. 698–702, Bradford, UK, July 2010.
  33. V. Lopez, M. Fernández, E. Motta, and N. Stieler, “PowerAqua: supporting users in querying and exploring the Semantic Web,” Journal of Web Semantics, vol. 3, no. 3, pp. 249–265, 2012.
  34. A. Singhal, G. Salton, M. Mitra, and C. Buckley, “Document length normalization,” Information Processing & Management, vol. 32, no. 5, pp. 619–633, 1996.
  35. I. Augenstein, L. Derczynski, and K. Bontcheva, “Generalisation in named entity recognition: a quantitative analysis,” Computer Speech and Language, vol. 44, pp. 61–83, 2017.
  36. Z. Wu and M. Palmer, “Verbs semantics and lexical selection,” in Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics (ACL '94), pp. 133–138, ACM, Las Cruces, NM, USA, June 1994.
  37. P. Resnik, “Using information content to evaluate semantic similarity in a taxonomy,” in Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI '95), vol. 2, pp. 448–453, Morgan Kaufmann Publishers Inc., Quebec, Canada, August 1995.
  38. T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley Series in Telecommunications and Signal Processing, John Wiley & Sons, New York, NY, USA, 2007.
  39. P. Kapanipathi, P. Jain, C. Venkataramani, and A. Sheth, “Hierarchical interest graph,” 2016, http://wiki.knoesis.org/index.php/Hierarchical_Interest_Graph.
  40. S. Hassan and R. Mihalcea, “Semantic relatedness using salient semantic analysis,” in Proceedings of the 25th AAAI Conference on Artificial Intelligence, pp. 884–889, San Francisco, Calif, USA, August 2011.
  41. E. Gabrilovich and S. Markovitch, “Computing semantic relatedness using wikipedia-based explicit semantic analysis,” in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), pp. 1606–1611, Hyderabad, India, January 2007.
  42. M. Schuhmacher and S. P. Ponzetto, “Knowledge-based graph document modeling,” in Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM '14), pp. 543–552, ACM, New York, NY, USA, February 2014.
  43. L. Huang, D. Milne, E. Frank, and I. H. Witten, “Learning a concept-based document similarity measure,” Journal of the Association for Information Science and Technology, vol. 63, no. 8, pp. 1593–1608, 2012.