BioMed Research International
Volume 2015, Article ID 751646, 10 pages
http://dx.doi.org/10.1155/2015/751646
Research Article

Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection

School of Technology, Nanjing Audit University, 86 W. Yushan Road, Nanjing 211815, China

Received 20 October 2014; Revised 13 December 2014; Accepted 14 December 2014

Academic Editor: Fang-Xiang Wu

Copyright © 2015 Yifei Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. S. Wang, D. Li, X. Song, Y. Wei, and H. Li, “A feature selection method based on improved Fisher's discriminant ratio for text sentiment classification,” Expert Systems with Applications, vol. 38, no. 7, pp. 8696–8702, 2011.
  2. T. S. Guzella and W. M. Caminhas, “A review of machine learning approaches to Spam filtering,” Expert Systems with Applications, vol. 36, no. 7, pp. 10206–10222, 2009.
  3. B. Zhou, Y. Y. Yao, and J. Lou, “A three-way decision approach to email spam filtering,” in Proceedings of the 23rd Canadian Conference on Artificial Intelligence (Canadian AI '10), vol. 6085 of Lecture Notes in Computer Science, pp. 28–39, 2010.
  4. N. Cheng, R. Chandramouli, and K. P. Subbalakshmi, “Author gender identification from text,” Digital Investigation, vol. 8, no. 1, pp. 78–88, 2011.
  5. S. A. Özel, “A web page classification system based on a genetic algorithm using tagged-terms as features,” Expert Systems with Applications, vol. 38, no. 4, pp. 3407–3415, 2011.
  6. A. Genkin, D. D. Lewis, and D. Madigan, “Large-scale Bayesian logistic regression for text categorization,” Technometrics, vol. 49, no. 3, pp. 291–304, 2007.
  7. Y. Yang and J. O. Pedersen, “A comparative study on feature selection in text categorization,” in Proceedings of the 14th International Conference on Machine Learning (ICML '97), pp. 412–420, 1997.
  8. A. L. Blum and P. Langley, “Selection of relevant features and examples in machine learning,” Artificial Intelligence, vol. 97, no. 1-2, pp. 245–271, 1997.
  9. N. Azam and J. T. Yao, “Incorporating game theory in feature selection for text categorization,” in Proceedings of the 13th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing (RSFDGrC '11), vol. 6743 of Lecture Notes in Computer Science, pp. 215–222, Springer, 2011.
  10. H. Liang, J. Wang, and Y. Yao, “User-oriented feature selection for machine learning,” The Computer Journal, vol. 50, no. 4, pp. 421–434, 2007.
  11. S. Piramuthu, “Evaluating feature selection methods for learning in data mining applications,” European Journal of Operational Research, vol. 156, no. 2, pp. 483–494, 2004.
  12. Y. Y. Yao and Y. Zhao, “Attribute reduction in decision-theoretic rough set models,” Information Sciences, vol. 178, no. 17, pp. 3356–3373, 2008.
  13. Y. Y. Yao, Y. Zhao, and J. Wang, “On reduct construction algorithms,” Transactions on Computational Science, vol. 2, pp. 100–117, 2008.
  14. G. Forman, “An extensive empirical study of feature selection metrics for text classification,” Journal of Machine Learning Research, vol. 3, pp. 1289–1305, 2003.
  15. W. Shang, H. Huang, H. Zhu, Y. Lin, Y. Qu, and Z. Wang, “A novel feature selection algorithm for text categorization,” Expert Systems with Applications, vol. 33, no. 1, pp. 1–5, 2007.
  16. J. Chen, H. Huang, S. Tian, and Y. Qu, “Feature selection for text classification with Naïve Bayes,” Expert Systems with Applications, vol. 36, no. 3, pp. 5432–5435, 2009.
  17. N. Azam and J. T. Yao, “Comparison of term frequency and document frequency based feature selection metrics in text categorization,” Expert Systems with Applications, vol. 39, no. 5, pp. 4760–4768, 2012.
  18. J. Yang, Y. Liu, X. Zhu, Z. Liu, and X. Zhang, “A new feature selection based on comprehensive measurement both in inter-category and intra-category for text categorization,” Information Processing and Management, vol. 48, no. 4, pp. 741–754, 2012.
  19. Y. Saeys, I. Inza, and P. Larrañaga, “A review of feature selection techniques in bioinformatics,” Bioinformatics, vol. 23, no. 19, pp. 2507–2517, 2007.
  20. F. Sebastiani, “Machine learning in automated text categorization,” ACM Computing Surveys, vol. 34, no. 1, pp. 1–47, 2002.
  21. Y.-Q. Wei, P.-Y. Liu, and Z.-F. Zhu, “A feature selection method based on improved TFIDF,” in Proceedings of the 3rd International Conference on Pervasive Computing and Applications (ICPCA '08), pp. 94–97, Alexandria, Egypt, October 2008.
  22. W. E. Winkler, “String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage,” in Proceedings of the Section on Survey Research Methods (American Statistical Association), vol. 359, pp. 354–359, 1990.
  23. M. A. Jaro, “Advances in record linkage methodology as applied to the 1985 census of Tampa Florida,” Journal of the American Statistical Association, vol. 84, no. 406, pp. 414–420, 1989.
  24. M. A. Jaro, “Probabilistic linkage of large public health data files,” Statistics in Medicine, vol. 14, no. 5–7, pp. 491–498, 1995.
  25. V. N. Vapnik, Statistical Learning Theory, Adaptive and Learning Systems for Signal Processing, Communications, and Control, John Wiley & Sons, New York, NY, USA, 1998.
  26. T. Xia and Y. Du, “Improve VSM text classification by title vector based document representation method,” in Proceedings of the 6th International Conference on Computer Science and Education (ICCSE '11), pp. 210–213, August 2011.
  27. M. Antunes, C. Silva, B. Ribeiro, and M. Correia, “A hybrid ais-svm ensemble approach for text classification,” in Proceedings of the 10th International Conference on Adaptive and Natural Computing Algorithms, pp. 342–352, 2011.
  28. C.-C. Chang and C.-J. Lin, “LIBSVM: a library for support vector machines,” ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 3, article 27, 2011.
  29. M. Krallinger and A. Valencia, “Evaluating the detection and ranking of protein interaction relevant articles: the BioCreative challenge interaction article sub-task (IAS),” in Proceedings of the 2nd BioCreative Challenge Evaluation Workshop, pp. 29–39, 2007.
  30. M. Krallinger, M. Vazquez, F. Leitner et al., “The protein-protein interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text,” BMC Bioinformatics, vol. 12, supplement 8, article S3, 2011.
  31. S. Gunal and R. Edizkan, “Subspace based feature selection for pattern recognition,” Information Sciences, vol. 178, no. 19, pp. 3716–3726, 2008.