Abstract and Applied Analysis
Volume 2014 (2014), Article ID 972786, 7 pages
http://dx.doi.org/10.1155/2014/972786
Research Article

A Hybrid Sampling SVM Approach to Imbalanced Data Classification

College of Electrical and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China

Received 17 April 2014; Accepted 22 May 2014; Published 12 June 2014

Academic Editor: Fuding Xie

Copyright © 2014 Qiang Wang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Imbalanced datasets arise frequently in real-world applications. Resampling is an effective remedy because it produces a relatively balanced class distribution. This paper proposes a hybrid sampling SVM approach that combines an undersampling technique with an oversampling technique to address the imbalanced data classification problem. The approach first uses undersampling to delete majority-class samples that carry little classification information and then applies oversampling to gradually create new positive (minority-class) samples. The resulting balanced training dataset replaces the original imbalanced training dataset. Experimental results on real-world datasets indicate that the proposed approach can identify informative samples and handle the imbalanced data classification problem.
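
The two-stage resampling idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify how "classification information" is measured or how new positive samples are generated, so the sketch assumes that majority samples far from every minority sample are the least informative, and it creates new minority samples by SMOTE-style linear interpolation between random minority pairs. The function name `hybrid_resample` and the parameter `keep_ratio` are hypothetical.

```python
import math
import random


def hybrid_resample(majority, minority, keep_ratio=0.5, seed=0):
    """Undersample the majority class, then oversample the minority class.

    Assumptions (not taken from the paper): informativeness is approximated
    by distance to the nearest minority sample, and synthetic minority
    samples are made by linear interpolation between two minority samples.
    The classes come out balanced when the majority class is the larger one
    and the minority class has at least two samples.
    """
    rng = random.Random(seed)

    # Stage 1 (undersampling): keep the majority samples that lie closest
    # to the minority class; drop the rest as less informative.
    def dist_to_minority(x):
        return min(math.dist(x, m) for m in minority)

    n_keep = max(len(minority), int(len(majority) * keep_ratio))
    kept = sorted(majority, key=dist_to_minority)[:n_keep]

    # Stage 2 (oversampling): add interpolated minority samples one at a
    # time until both classes have the same size.
    augmented = list(minority)
    while len(augmented) < len(kept):
        a, b = rng.sample(minority, 2)
        t = rng.random()
        augmented.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return kept, augmented


# Toy data: 100 majority points on a grid, 4 minority points near the origin.
majority = [(float(i), float(j)) for i in range(10) for j in range(10)]
minority = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (1.5, 1.5)]
kept, augmented = hybrid_resample(majority, minority)
print(len(kept), len(augmented))
```

The balanced pair `(kept, augmented)` would then replace the original training set when fitting an SVM; the classifier itself is omitted here to keep the sketch dependency-free.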