Computational Intelligence and Neuroscience
Volume 2015, Article ID 109806, 11 pages
http://dx.doi.org/10.1155/2015/109806
Research Article

Immune Centroids Oversampling Method for Binary Classification

¹The Institute of Information Processing and Application, Soochow University, Suzhou 215006, China
²Department of Computer Science, University of Central Arkansas, Conway, AR 72035, USA

Received 11 November 2014; Accepted 14 February 2015

Academic Editor: Justin Dauwels

Copyright © 2015 Xusheng Ai et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. N. V. Chawla, N. Japkowicz, and A. Kotcz, “Special issue on learning from imbalanced data sets,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 1–6, 2004.
  2. H. He and E. A. Garcia, “Learning from imbalanced data,” IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263–1284, 2009.
  3. C. Elkan, “The foundations of cost-sensitive learning,” in Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI '01), pp. 973–978, August 2001.
  4. N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
  5. G. E. A. P. A. Batista, R. C. Prati, and M. C. Monard, “A study of the behaviour of several methods for balancing machine learning training data,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 20–29, 2004.
  6. D. L. Wilson, “Asymptotic properties of nearest neighbor rules using edited data,” IEEE Transactions on Systems Man and Cybernetics, vol. 2, no. 3, pp. 408–421, 1972.
  7. H. Han, W. Y. Wang, and B. H. Mao, “Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning,” in Advances in Intelligent Computing: International Conference on Intelligent Computing, ICIC 2005, Hefei, China, August 23–26, 2005, Proceedings, Part I, vol. 3644 of Lecture Notes in Computer Science, pp. 878–887, Springer, Berlin, Germany, 2005.
  8. C. Bunkhumpornpat, K. Sinapiromsaran, and C. Lursinsap, “Safe-level-SMOTE: safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem,” in Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD '09), pp. 475–482, Bangkok, Thailand, April 2009.
  9. E. Ramentol, Y. Caballero, R. Bello, and F. Herrera, “SMOTE-RSB*: a hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory,” Knowledge and Information Systems, vol. 33, no. 2, pp. 245–265, 2012.
  10. L. N. de Castro and F. J. von Zuben, “aiNet: an artificial immune network for data analysis,” in Data Mining: A Heuristic Approach, H. A. Abbass, R. A. Sarker, and C. S. Newton, Eds., chapter 12, pp. 231–259, Idea Group Publishing, New York, NY, USA, 2001.
  11. R. Batuwita and V. Palade, “Efficient resampling methods for training support vector machines with imbalanced datasets,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN '10), pp. 1–8, IEEE, Barcelona, Spain, July 2010.
  12. A. Fernández, M. J. del Jesus, and F. Herrera, “On the 2-tuples based genetic tuning performance for fuzzy rule based classification systems in imbalanced data-sets,” Information Sciences, vol. 180, no. 8, pp. 1268–1291, 2010.
  13. A. Fernández, S. García, M. J. del Jesus, and F. Herrera, “A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets,” Fuzzy Sets and Systems, vol. 159, no. 18, pp. 2378–2398, 2008.
  14. V. López, A. Fernández, S. García, V. Palade, and F. Herrera, “An insight into classification with imbalanced data: empirical results and current trends on using data intrinsic characteristics,” Information Sciences, vol. 250, pp. 113–141, 2013.
  15. T. Jo and N. Japkowicz, “Class imbalances versus small disjuncts,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 40–49, 2004.
  16. N. K. Jerne, “Towards a network theory of the immune system,” Annales d'immunologie, vol. 125C, no. 1-2, pp. 373–389, 1974.
  17. F. M. Burnet, “A modification of Jerne's theory of antibody production using the concept of clonal selection,” CA: A Cancer Journal for Clinicians, vol. 26, no. 2, pp. 119–121, 1976.
  18. L. N. de Castro and F. J. von Zuben, “Learning and optimization using the clonal selection principle,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 3, pp. 239–251, 2002.
  19. J. Alcalá-Fdez, L. Sánchez, S. García et al., “KEEL: a software tool to assess evolutionary algorithms for data mining problems,” Soft Computing, vol. 13, no. 3, pp. 307–318, 2009.
  20. G. J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition, John Wiley & Sons, New York, NY, USA, 2004.
  21. J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
  22. C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
  23. J. Alcalá-Fdez, A. Fernández, J. Luengo et al., “KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework,” Journal of Multiple-Valued Logic and Soft Computing, vol. 17, no. 2-3, pp. 255–287, 2011.
  24. A. P. Bradley, “The use of the area under the ROC curve in the evaluation of machine learning algorithms,” Pattern Recognition, vol. 30, no. 7, pp. 1145–1159, 1997.
  25. J. Huang and C. X. Ling, “Using AUC and accuracy in evaluating learning algorithms,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 3, pp. 299–310, 2005.
  26. J. G. Moreno-Torres and F. Herrera, “A preliminary study on overlapping and data fracture in imbalanced domains by means of genetic programming-based feature extraction,” in Proceedings of the 10th International Conference on Intelligent Systems Design and Applications (ISDA '10), pp. 501–506, December 2010.
  27. J. G. Moreno-Torres, T. Raeder, R. Alaiz-Rodríguez, N. V. Chawla, and F. Herrera, “A unifying view on dataset shift in classification,” Pattern Recognition, vol. 45, no. 1, pp. 521–530, 2012.
  28. H. He, Y. Bai, E. A. Garcia, and S. Li, “ADASYN: adaptive synthetic sampling approach for imbalanced learning,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN '08), pp. 1322–1328, June 2008.
  29. H. Han, W.-Y. Wang, and B.-H. Mao, “Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning,” in Proceedings of the International Conference on Intelligent Computing (ICIC '05), vol. 3644 of Lecture Notes in Computer Science, pp. 878–887, August 2005.
  30. J. P. Shaffer, “Modified sequentially rejective multiple test procedures,” Journal of the American Statistical Association, vol. 81, no. 395, pp. 826–831, 1986.