Computational Intelligence and Neuroscience
Volume 2015, Article ID 401024, 11 pages
http://dx.doi.org/10.1155/2015/401024
Research Article

Exploiting Language Models to Classify Events from Twitter

1School of Electrical Engineering, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan 680-749, Republic of Korea
2Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 156-743, Republic of Korea

Received 23 November 2014; Accepted 2 January 2015

Academic Editor: Weihui Dai

Copyright © 2015 Duc-Thuan Vo et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. N. Dalvi, R. Kumar, and B. Pang, “Object matching in tweets with spatial models,” in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 43–52, February 2012.
  2. J. Eisenstein, N. A. Smith, and E. P. Xing, “Discovering sociolinguistic associations with structured sparsity,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT '11), 2011.
  3. K. Gimpel, N. Schneider, B. O'Connor et al., “Part-of-speech tagging for twitter: annotation, features, and experiments,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL '11), pp. 42–47, June 2011.
  4. X. Liu, S. Zhang, F. Wei, and M. Zhou, “Recognizing named entities in tweets,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT '11), pp. 359–367, June 2011.
  5. B. Locke and J. Martin, Named entity recognition: adapting to microblogging [Senior thesis], University of Colorado, 2009.
  6. G. Neubig, Y. Matsubayashi, M. Hagiwara, and K. Murakami, “Safety information mining—what can NLP do in a disaster,” in Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP '11), pp. 965–973, 2011.
  7. E. Benson, A. Haghighi, and R. Barzilay, “Event discovery in social media feeds,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT '11), pp. 389–398, Portland, Ore, USA, June 2011.
  8. K. Kireyev, L. Palen, and K. Anderson, “Applications of topics models to analysis of disaster-related twitter data,” in Proceedings of the NIPS Workshop on Applications for Topic Models: Text and Beyond, December 2009.
  9. J. Weng and B. S. Lee, “Event detection in twitter,” in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), Barcelona, Spain, July 2011.
  10. Q. Diao, J. Jiang, F. Zhu, and E.-P. Lim, “Finding bursty topics from microblogs,” in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers (ACL '12), pp. 536–544, Jeju, South Korea, 2012.
  11. Z. J. Gao, Y. Song, S. Liu et al., “Tracking and connecting topics via incremental hierarchical Dirichlet processes,” in Proceedings of the 11th IEEE International Conference on Data Mining (ICDM '11), pp. 1056–1061, Vancouver, Canada, December 2011.
  12. T. Sakaki, M. Okazaki, and Y. Matsuo, “Earthquake shakes Twitter users: real-time event detection by social sensors,” in Proceedings of the 19th International World Wide Web Conference (WWW '10), pp. 851–860, New York, NY, USA, April 2010.
  13. D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, no. 4-5, pp. 993–1022, 2003.
  14. X.-H. Phan, C.-T. Nguyen, D.-T. Le, L.-M. Nguyen, S. Horiguchi, and Q.-T. Ha, “A hidden topic-based framework toward building applications with short web documents,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 7, pp. 961–976, 2011.
  15. D. T. Vo and C. Y. Ock, “Extraction of semantic relation based on feature vector from Wikipedia,” in PRICAI 2012: Trends in Artificial Intelligence: Proceedings of the 12th Pacific Rim International Conference on Artificial Intelligence, Kuching, Malaysia, September 3–7, 2012, vol. 7458 of Lecture Notes in Computer Science, pp. 814–819, Springer, Berlin, Germany, 2012.
  16. R. Speer and C. Havasi, “Conceptnet 5: a large semantic network for relational knowledge,” in The People's Web Meets NLP, Theory and Applications of Natural Language Processing, pp. 161–176, Springer, Berlin, Germany, 2013.
  17. A. Ritter, Mausam, and O. Etzioni, “A latent Dirichlet allocation method for selectional preferences,” in Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL '10), pp. 424–434, July 2010.
  18. M. Cheong and V. Lee, “Integrating web-based intelligence retrieval and decision-making from the twitter trends knowledge base,” in Proceedings of the 2nd ACM Workshop on Social Web Search and Mining (SWSM '09), Co-located with the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 1–8, November 2009.
  19. N. Glance, M. Hurst, and T. Tomokiyo, “Blogpulse: automated trend discovery for weblogs,” in Proceedings of the WWW Workshop on the Weblogging Ecosystem: Aggregation, Analysis, and Dynamics, 2004.
  20. D. Gruhl, R. Guha, D. Liben-Nowell, and A. Tomkins, “Information diffusion through blogspace,” in Proceedings of the 13th International Conference on the World Wide Web (WWW '04), pp. 491–501, 2004.
  21. J. Allan, R. Papka, and V. Lavrenko, “On-line new event detection and tracking,” in Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '98), pp. 37–45, 1998.
  22. R. Nallapati, A. Feng, F. Peng, and J. Allan, “Event threading within news topics,” in Proceedings of the 13th ACM Conference on Information and Knowledge Management (CIKM '04), pp. 446–453, November 2004.
  23. B. Shaparenko, R. Caruana, J. Gehrke, and T. Joachims, “Identifying temporal patterns and key players in document collections,” in Proceedings of the IEEE ICDM Workshop on Temporal Data Mining: Algorithms, Theory, and Applications (TDM '05), pp. 165–174, 2005.
  24. W. H. Dai, H. Z. Hu, T. Wu, and Y. H. Dai, “Information spread of emergency events: path searching on social networks,” The Scientific World Journal, vol. 2014, Article ID 179620, 7 pages, 2014.
  25. W. H. Dai, X. Q. Wan, and X. Y. Liu, “Emergency event: internet spread, psychological impacts and emergency management,” Journal of Computers, vol. 6, no. 8, pp. 1748–1755, 2011.
  26. H. Z. Hu, D. Wang, W. H. Dai, and L. H. Huang, “Psychology and behavior mechanism of micro-blog information spreading,” African Journal of Business Management, vol. 6, no. 35, pp. 9797–9807, 2012.
  27. X. H. Hu, T. Mu, W. H. Dai, H. Z. Hu, and G. H. Dai, “Analysis of browsing behaviors with ant colony clustering algorithm,” Journal of Computers, vol. 7, no. 12, pp. 3096–3102, 2012.
  28. S. Harabagiu and A. Hickl, “Relevance modeling for microblog summarization,” in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), pp. 514–517, Barcelona, Spain, July 2011.
  29. D. Inouye and J. K. Kalita, “Comparing twitter summarization algorithms for multiple post summaries,” in Proceedings of the IEEE 3rd International Conference on Privacy, Security, Risk and Trust (PASSAT) and IEEE 3rd International Conference on Social Computing (SocialCom '11), pp. 298–306, Boston, Mass, USA, October 2011.
  30. B. Sharifi, M. Hutton, and J. Kalita, “Automatic summarization of twitter topics,” in Proceedings of the National Workshop Design and Analysis of Algorithm, pp. 121–128, Assam, India, 2010.
  31. H. Takamura, H. Yokono, and M. Okumura, “Summarizing a document stream,” in Advances in Information Retrieval, vol. 6611 of Lecture Notes in Computer Science, pp. 177–188, Springer, New York, NY, USA, 2011.
  32. H. Becker, M. Naaman, and L. Gravano, “Beyond trending topics: real-world event identification on twitter,” in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM '11), 2011.
  33. A. M. Popescu, M. Pennacchiotti, and D. A. Paranjpe, “Extracting events and event descriptions from Twitter,” in Proceedings of the 20th International Conference Companion on World Wide Web (WWW '11), pp. 105–106, April 2011.
  34. E. Erosheva, S. Fienberg, and J. Lafferty, “Mixed-membership models of scientific publications,” Proceedings of the National Academy of Sciences of the United States of America, vol. 101, no. 1, pp. 5220–5227, 2004.
  35. M. Banko and O. Etzioni, “The tradeoffs between open and traditional relation extraction,” in Proceedings of the ACL '08: HLT, 2008.
  36. S. Petrović, M. Osborne, and V. Lavrenko, “The Edinburgh twitter corpus,” in Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics in a World of Social Media (WSA '10), June 2010.
  37. T. Joachims, Learning to classify text using support vector machines [Ph.D. dissertation], Kluwer Academic Publishers, 2002.
  38. D. T. Vo and C. Y. Ock, “Learning to classify short text from scientific documents using topic models with various types of knowledge,” Expert Systems with Applications, vol. 42, no. 3, pp. 1684–1698, 2015.
  39. W. Zhang, T. Yoshida, and X. Tang, “Text classification based on multi-word with support vector machine,” Knowledge-Based Systems, vol. 21, no. 8, pp. 879–886, 2008.
  40. K. W. Church and P. Hanks, “Word association norms, mutual information, and lexicography,” Computational Linguistics, vol. 16, no. 1, pp. 22–29, 1990.
  41. G. Bouma, “Normalized (pointwise) mutual information in collocation extraction,” in Proceedings of the Biennial GSCL Conference, 2009.
  42. S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, “Indexing by latent semantic analysis,” Journal of the American Society for Information Science, vol. 41, no. 6, pp. 391–407, 1990.
  43. T. K. Landauer, P. W. Foltz, and D. Laham, “An introduction to latent semantic analysis,” Discourse Processes, vol. 25, no. 2-3, pp. 259–284, 1998.