Abstract and Applied Analysis
Volume 2013 (2013), Article ID 196256, 6 pages
http://dx.doi.org/10.1155/2013/196256
Research Article

A Cost-Sensitive Ensemble Method for Class-Imbalanced Datasets

School of Computer and Information Technology, Liaoning Normal University, No. 1, Liushu South Street, Ganjingzi, Dalian, Liaoning 116081, China

Received 28 December 2012; Accepted 25 March 2013

Academic Editor: Jianhong (Cecilia) Xia

Copyright © 2013 Yong Zhang and Dapeng Wang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
  2. M. Gao, X. Hong, S. Chen, and C. J. Harris, “A combined SMOTE and PSO based RBF classifier for two-class imbalanced problems,” Neurocomputing, vol. 74, pp. 3456–3466, 2011.
  3. G. Weiss, “Mining with rarity: a unifying framework,” SIGKDD Explorations, vol. 6, no. 1, pp. 7–19, 2004.
  4. C. Seiffert, T. M. Khoshgoftaar, J. van Hulse, and A. Napolitano, “RUSBoost: a hybrid approach to alleviating class imbalance,” IEEE Transactions on Systems, Man, and Cybernetics A, vol. 40, no. 1, pp. 185–197, 2010.
  5. M. S. Kim, “An effective under-sampling method for class imbalance data problem,” in Proceedings of the 8th Symposium on Advanced Intelligent Systems, pp. 825–829, 2007.
  6. S. J. Yen and Y. S. Lee, “Cluster-based under-sampling approaches for imbalanced data distributions,” Expert Systems with Applications, vol. 36, no. 3, pp. 5718–5727, 2009.
  7. X. Y. Liu, J. X. Wu, and Z. H. Zhou, “Exploratory undersampling for class-imbalance learning,” IEEE Transactions on Systems, Man, and Cybernetics B, vol. 39, no. 2, pp. 539–550, 2009.
  8. C. Drummond and R. C. Holte, “C4.5 decision tree, class imbalance, and cost sensitivity: why under-sampling beats over-sampling,” in Proceedings of the Workshop on Learning from Imbalanced Data Sets II, International Conference on Machine Learning, 2003.
  9. N. V. Chawla, A. Lazarevic, L. O. Hall, and K. W. Bowyer, “SMOTEBoost: improving prediction of the minority class in boosting,” in Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD '03), pp. 107–119, September 2003.
  10. S. Wang, Z. Li, W. Chao, and Q. Cao, “Applying adaptive over-sampling technique based on data density and cost-sensitive SVM to imbalanced learning,” in The International Joint Conference on Neural Networks (IJCNN '12), 2012.
  11. M. Gao, X. Hong, S. Chen, and C. J. Harris, “Probability density function estimation based over-sampling for imbalanced two-class problems,” in The International Joint Conference on Neural Networks (IJCNN '12), 2012.
  12. C. Elkan, “The foundations of cost-sensitive learning,” in Proceedings of the 17th International Joint Conference on Artificial Intelligence, pp. 973–978, 2001.
  13. B. X. Wang and N. Japkowicz, “Boosting support vector machines for imbalanced data sets,” Knowledge and Information Systems, vol. 25, no. 1, pp. 1–20, 2010.
  14. Y. Sun, M. S. Kamel, A. K. C. Wong, and Y. Wang, “Cost-sensitive boosting for classification of imbalanced data,” Pattern Recognition, vol. 40, no. 12, pp. 3358–3378, 2007.
  15. H. Guo and H. L. Viktor, “Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach,” SIGKDD Explorations, vol. 6, no. 1, pp. 30–39, 2004.
  16. R. Akbani, S. Kwek, and N. Japkowicz, “Applying support vector machines to imbalanced datasets,” in Proceedings of the 15th European Conference on Machine Learning (ECML '04), pp. 39–50, Pisa, Italy, September 2004.
  17. Y. Tang, Y. Q. Zhang, and N. V. Chawla, “SVMs modeling for highly imbalanced classification,” IEEE Transactions on Systems, Man, and Cybernetics B, vol. 39, no. 1, pp. 281–288, 2009.
  18. J. Wang, J. You, Q. Li, and Y. Xu, “Extract minimum positive and maximum negative features for imbalanced binary classification,” Pattern Recognition, vol. 45, pp. 1136–1145, 2012.
  19. N. García-Pedrajas, J. Pérez-Rodríguez, and A. de Haro-García, “OligoIS: scalable instance selection for class-imbalanced data sets,” IEEE Transactions on Systems, Man, and Cybernetics B, 2012.
  20. K. Veropoulos, C. Campbell, and N. Cristianini, “Controlling the sensitivity of support vector machines,” in Proceedings of the International Joint Conference on Artificial Intelligence, pp. 55–60, 1999.
  21. H. S. Seung, M. Opper, and H. Sompolinsky, “Query by committee,” in Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, pp. 287–294, July 1992.
  22. Y. Freund, H. S. Seung, E. Shamir, and N. Tishby, “Selective sampling using the query by committee algorithm,” Machine Learning, vol. 28, no. 2-3, pp. 133–168, 1997.
  23. Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” in Proceedings of the 2nd European Conference on Computational Learning Theory, pp. 23–37, 1995.
  24. M. V. Joshi, V. Kumar, and R. C. Agarwal, “Evaluating boosting algorithms to classify rare classes: comparison and improvements,” in Proceedings of the 1st IEEE International Conference on Data Mining (ICDM '01), pp. 257–264, December 2001.
  25. T. Fawcett, “ROC graphs: notes and practical considerations for researchers,” Tech. Rep. HPL-2003-4, HP Labs, Palo Alto, Calif, USA, 2003.
  26. D. Lewis and W. Gale, “Training text classifiers by uncertainty sampling,” in Proceedings of the 7th Annual International ACM SIGIR Conference on Research and Development in Information, pp. 73–79, New York, NY, USA, 1998.
  27. A. Frank and A. Asuncion, UCI Machine Learning Repository, University of California, School of Information and Computer Science, Irvine, Calif, USA, 2010, http://archive.ics.uci.edu/ml/.