Mathematical Problems in Engineering
Volume 2015 (2015), Article ID 269856, 10 pages
http://dx.doi.org/10.1155/2015/269856
Research Article

A Structural SVM Based Approach for Binary Classification under Class Imbalance

¹Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Anhui University, No. 3, Feixi Road, Hefei, Anhui 230039, China
²School of Computer, Anhui University, No. 3, Feixi Road, Hefei 230039, China

Received 4 January 2015; Accepted 4 May 2015

Academic Editor: Haibo He

Copyright © 2015 Fan Cheng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. D. Vassis, B. A. Kampouraki, and P. Belsis, “Using neural networks and SVMs for automatic medical diagnosis: a comprehensive review,” in 4th International Conference on Integrated Information (IC-ININFO '14), vol. 1644 of AIP Conference Proceedings, pp. 32–36, AIP Publishing, Madrid, Spain, September 2014.
  2. A. Dal Pozzolo, O. Caelen, Y.-A. Le Borgne, S. Waterschoot, and G. Bontempi, “Learned lessons in credit card fraud detection from a practitioner perspective,” Expert Systems with Applications, vol. 41, no. 10, pp. 4915–4928, 2014.
  3. A. Kshirsagar and L. Dole, “A review on data mining methods for identity crime detection,” International Journal of Electrical, Electronics and Computer Systems, vol. 2, no. 1, pp. 312–318, 2014.
  4. Q. Wu, Y. Ye, H. Zhang, M. K. Ng, and S. Ho, “ForesTexter: an efficient random forest algorithm for imbalanced text categorization,” Knowledge-Based Systems, vol. 67, no. 9, pp. 105–116, 2014.
  5. Q. Yang and X.-D. Wu, “10 Challenging problems in data mining research,” International Journal of Information Technology & Decision Making, vol. 5, no. 4, pp. 597–604, 2006.
  6. N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
  7. C. Bunkhumpornpat, K. Sinapiromsaran, and C. Lursinsap, “Safe-level-SMOTE: safe-level-synthetic minority over-sampling TEchnique for handling the class imbalanced problem,” in Advances in Knowledge Discovery and Data Mining: Proceedings of the 13th Pacific-Asia Conference, PAKDD 2009 Bangkok, Thailand, April 27–30, 2009, vol. 5476 of Lecture Notes in Computer Science, pp. 475–482, Springer, Berlin, Germany, 2009.
  8. S. Barua, M. M. Islam, X. Yao, and K. Murase, “MWMOTE—majority weighted minority oversampling technique for imbalanced data set learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 2, pp. 405–425, 2014.
  9. M. Z. Zhu, C. Xu, and Y.-F. B. Wu, “IFME: information filtering by multiple examples with under-sampling in a digital library environment,” in Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '13), pp. 107–110, ACM, July 2013.
  10. J. L. Hsu, P. C. Hung, and H. Y. Lin, “Applying under-sampling techniques and cost-sensitive learning methods on risk assessment of breast cancer,” Journal of Medical Systems, vol. 39, no. 4, pp. 1–13, 2015.
  11. W. Liu, S. Chawla, D. A. Cieslak, and N. V. Chawla, “A robust decision tree algorithm for imbalanced data sets,” in Proceedings of the 10th SIAM International Conference on Data Mining (SDM '10), pp. 766–777, Sydney, Australia, December 2010.
  12. S. Kang and K. Ramamohanarao, “A robust classifier for imbalanced datasets,” in Advances in Knowledge Discovery and Data Mining, pp. 212–223, Springer, 2014.
  13. S. Köknar-Tezel and L. J. Latecki, “Improving SVM classification on imbalanced data sets in distance spaces,” in Proceedings of the 9th IEEE International Conference on Data Mining (ICDM '09), pp. 259–267, December 2009.
  14. Y.-H. Shao, W.-J. Chen, J.-J. Zhang, Z. Wang, and N.-Y. Deng, “An efficient weighted Lagrangian twin support vector machine for imbalanced data classification,” Pattern Recognition, vol. 47, no. 9, pp. 3158–3167, 2014.
  15. T. Joachims, “A Support Vector Method for multivariate performance measures,” in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), pp. 377–384, ACM, August 2005.
  16. F. Aiolli, “Convex AUC optimization for top-N recommendation with implicit feedback,” in Proceedings of the 8th ACM Conference on Recommender Systems, pp. 293–296, ACM, Silicon Valley, Calif, USA, October 2014.
  17. S. Paisitkriangkrai, C. Shen, and A. V. D. Hengel, “Efficient pedestrian detection by directly optimizing the partial area under the ROC curve,” in Proceedings of the 14th IEEE International Conference on Computer Vision (ICCV '13), pp. 1057–1064, IEEE, Sydney, Australia, December 2013.
  18. H. Narasimhan and S. Agarwal, “A structural SVM based approach for optimizing partial AUC,” in Proceedings of the 30th International Conference on Machine Learning (ICML '13), pp. 516–524, June 2013.
  19. H. Narasimhan and S. Agarwal, “SVMpAUCtight: a new support vector method for optimizing partial AUC based on a tight convex upper bound,” in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '13), pp. 167–175, ACM, 2013.
  20. P. M. Chinta, P. Balamurugan, and S. Shevade, “Optimizing F-measure with non-convex loss and sparse linear classifiers,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN '13), pp. 1–8, IEEE, August 2013.
  21. A. Maratea, A. Petrosino, and M. Manzo, “Adjusted F-measure and kernel scaling for imbalanced data learning,” Information Sciences, vol. 257, no. 2, pp. 331–341, 2014.
  22. Z. C. Lipton, C. Elkan, and B. Narayanaswamy, “Optimal thresholding of classifiers to maximize F1 measure,” in Machine Learning and Knowledge Discovery in Databases, vol. 8725, pp. 225–239, Springer, Berlin, Germany, 2014.
  23. Q. Gu, L. Zhu, and Z. Cai, “Evaluation measures of the classification performance of imbalanced data sets,” Communications in Computer and Information Science, vol. 51, pp. 461–471, 2009.
  24. S. Lawrence, I. Burns, A. Back, A. C. Tsoi, and C. L. Giles, “Neural network classification and prior class probabilities,” in Neural Networks: Tricks of the Trade, vol. 1524 of Lecture Notes in Computer Science, pp. 299–313, Springer, Berlin, Germany, 1998.
  25. A. Menon, H. Narasimhan, and S. Agarwal, “On the statistical consistency of algorithms for binary classification under class imbalance,” in Proceedings of the 30th International Conference on Machine Learning (ICML '13), pp. 603–611, 2013.
  26. T. Joachims, T. Finley, and C.-N. J. Yu, “Cutting-plane training of structural SVMs,” Machine Learning, vol. 77, no. 1, pp. 27–59, 2009.
  27. C. N. J. Yu and T. Joachims, “Training structural SVMs with kernels using sampled cuts,” in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08), pp. 794–802, August 2008.
  28. T. Joachims, Learning to Classify Text Using Support Vector Machines: Methods, Theory and Algorithms, Kluwer Academic Publishers, 2002.
  29. W. Chen, T. Y. Liu, and Y. Y. Lan, “Ranking measures and loss functions in learning to rank,” in Advances in Neural Information Processing Systems, pp. 315–323, 2009.