Mathematical Problems in Engineering
Volume 2016, Article ID 7819626, 12 pages
http://dx.doi.org/10.1155/2016/7819626
Research Article

Improved Feature Weight Algorithm and Its Application to Text Classification

¹School of Computer Science, Communication University of China, Beijing 100024, China
²School of Computer, Faculty of Science and Engineering, Communication University of China, Beijing 100024, China

Received 2 November 2015; Revised 1 February 2016; Accepted 3 March 2016

Academic Editor: Andrzej Swierniak

Copyright © 2016 Songtao Shang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. S. Jinshu, Z. Bofeng, and X. Xin, “Advances in machine learning based text categorization,” Journal of Software, vol. 17, no. 9, pp. 1848–1859, 2006.
  2. G. Salton and C. Yang, “On the specification of term values in automatic indexing,” Journal of Documentation, vol. 29, no. 4, pp. 351–372, 1973.
  3. Z. Cheng, L. Qing, and L. Fujun, “Improved VSM algorithm and its application in FAQ,” Computer Engineering, vol. 38, no. 17, pp. 201–204, 2012.
  4. X. Junling, Z. Yuming, C. Lin, and X. Baowen, “An unsupervised feature selection approach based on mutual information,” Journal of Computer Research and Development, vol. 49, no. 2, pp. 372–382, 2012.
  5. Z. Zhenhai, L. Shining, and L. Zhigang, “Multi-label feature selection algorithm based on information entropy,” Journal of Computer Research and Development, vol. 50, no. 6, pp. 1177–1184, 2013.
  6. L. Kousu and S. Caiqing, “Research on feature-selection in Chinese text classification,” Computer Simulation, vol. 24, no. 3, pp. 289–291, 2007.
  7. Y. Yang and J. O. Pedersen, “A comparative study on feature selection in text categorization,” in Proceedings of the 14th International Conference on Machine Learning, pp. 412–420, Nashville, Tenn, USA, July 1997.
  8. Q. Liqing, Z. Ruyi, Z. Gang et al., “An extensive empirical study of feature selection for text categorization,” in Proceedings of the 7th IEEE/ACIS International Conference on Computer and Information Science, pp. 312–315, IEEE, Washington, DC, USA, May 2008.
  9. S. Wenqian, D. Hongbin, Z. Haibin, and W. Yongbin, “A novel feature weight algorithm for text categorization,” in Proceedings of the International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE '08), pp. 1–7, Beijing, China, October 2008.
  10. L. Yongmin and Z. Weidong, “Using Gini-index for feature selection in text categorization,” Computer Applications, vol. 27, no. 10, pp. 66–69, 2007.
  11. L. Xuejun, “Design and research of decision tree classification algorithm based on minimum Gini index,” Software Guide, vol. 8, no. 5, pp. 56–57, 2009.
  12. R. Guofeng, L. Dehua, and P. Ying, “An improved algorithm for feature weight based on Gini index,” Computer & Digital Engineering, vol. 38, no. 12, pp. 8–13, 2010.
  13. L. Yongmin, L. Zhengyu, and Z. Shuang, “Analysis and improvement of feature weighting method TF-IDF in text categorization,” Computer Engineering and Design, vol. 29, no. 11, pp. 2923–2925, 2929, 2008.
  14. Q. Shian and L. Fayun, “Improved TF-IDF method in text classification,” New Technology of Library and Information Service, vol. 238, no. 10, pp. 27–30, 2013.
  15. L. Yonghe and L. Yanfeng, “Improvement of text feature weighting method based on TF-IDF algorithm,” Library and Information Service, vol. 57, no. 3, pp. 90–95, 2013.
  16. Y. Xu, Z. Li, and J. Chen, “Parallel recognition of illegal Web pages based on improved KNN classification algorithm,” Journal of Computer Applications, vol. 33, no. 12, pp. 3368–3371, 2013.
  17. Y. Liu, Y. Jian, and J. Liping, “An adaptive large margin nearest neighbor classification algorithm,” Journal of Computer Research and Development, vol. 50, no. 11, pp. 2269–2277, 2013.
  18. M. Wan, G. Yang, Z. Lai, and Z. Jin, “Local discriminant embedding with applications to face recognition,” IET Computer Vision, vol. 5, no. 5, pp. 301–308, 2011.
  19. U. Maulik and D. Chakraborty, “Fuzzy preference based feature selection and semi-supervised SVM for cancer classification,” IEEE Transactions on Nanobioscience, vol. 13, no. 2, pp. 152–160, 2014.
  20. X. Liang, “An effective method of pruning support vector machine classifiers,” IEEE Transactions on Neural Networks, vol. 21, no. 1, pp. 26–38, 2010.
  21. P. Baomao and S. Haoshan, “Research on improved algorithm for Chinese word segmentation based on Markov chain,” in Proceedings of the 5th International Conference on Information Assurance and Security (IAS '09), pp. 236–238, Xi'an, China, September 2009.
  22. S. Yan, C. Dongfeng, Z. Guiping, and Z. Hai, “Approach to Chinese word segmentation based on character-word joint decoding,” Journal of Software, vol. 20, no. 9, pp. 2366–2375, 2009.
  23. J. Savoy, “A stemming procedure and stopword list for general French corpora,” Journal of the American Society for Information Science, vol. 50, no. 10, pp. 944–952, 1999.
  24. L. Rutkowski, L. Pietruczuk, P. Duda, and M. Jaworski, “Decision trees for mining data streams based on the McDiarmid's bound,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 6, pp. 1272–1279, 2013.
  25. N. Prasad, P. Kumar, and M. M. Naidu, “An approach to prediction of precipitation using gini index in SLIQ decision tree,” in Proceedings of the 4th International Conference on Intelligent Systems, Modelling and Simulation (ISMS '13), pp. 56–60, Bangkok, Thailand, January 2013.
  26. S. S. Sundhari, “A knowledge discovery using decision tree by Gini coefficient,” in Proceedings of the International Conference on Business, Engineering and Industrial Applications (ICBEIA '11), pp. 232–235, IEEE, Kuala Lumpur, Malaysia, June 2011.
  27. S. V. Stehman, “Selecting and interpreting measures of thematic classification accuracy,” Remote Sensing of Environment, vol. 62, no. 1, pp. 77–89, 1997.
  28. F. Guohe, “Review of performance of text classification,” Journal of Intelligence, vol. 30, no. 8, pp. 66–69, 2011.
  29. C. J. Van Rijsbergen, Information Retrieval, Butterworths, London, UK, 1979.
  30. L. Jianghao, Y. Aimin, Y. Yongmei et al., “Classification of microblog sentiment based on Naïve Bayesian,” Computer Engineering & Science, vol. 34, no. 9, pp. 160–165, 2012.
  31. Z. Kenan, Y. Baolin, M. Yaming, and H. Yingnan, “Malware classification approach based on valid window and Naïve Bayes,” Journal of Computer Research and Development, vol. 51, no. 2, pp. 373–381, 2014.