Research Article

Imbalanced Data Set CSVM Classification Method Based on Cluster Boundary Sampling

Table 1

The basic information of four UCI data sets.

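| Data set | Number of negative samples | Number of positive samples | Imbalance ratio | Data description |
|----------|----------------------------|----------------------------|-----------------|------------------|
| Shuttle | 57829 | 171 | 338 : 1 | High imbalance ratio, high information amount |
| Abalone | 4145 | 32 | 130 : 1 | High imbalance ratio, low information amount |
| Yeast | 1433 | 51 | 28 : 1 | Low imbalance ratio, low information amount |
| Churn | 4293 | 707 | 6 : 1 | Low imbalance ratio, high information amount |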
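
The imbalance ratios in Table 1 appear to be the number of negative samples divided by the number of positive samples, rounded to the nearest integer. The following minimal Python sketch checks that reading against the listed counts. The names `COUNTS`, `RATIOS`, and `imbalance_ratio` are hypothetical and introduced only for illustration; the rounding rule is an assumption, since the article does not state how the ratios were computed.

```python
# Sample counts as (negative, positive), copied from Table 1.
COUNTS = {
    "Shuttle": (57829, 171),
    "Abalone": (4145, 32),
    "Yeast": (1433, 51),
    "Churn": (4293, 707),
}


def imbalance_ratio(n_negative: int, n_positive: int) -> int:
    """Return negatives per positive, rounded to the nearest integer.

    Assumption: the table's "k : 1" values are rounded quotients.
    """
    if n_positive <= 0:
        raise ValueError("n_positive must be greater than zero")
    return round(n_negative / n_positive)


# Ratio k for each data set, to be read as "k : 1".
RATIOS = {name: imbalance_ratio(neg, pos) for name, (neg, pos) in COUNTS.items()}

if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        print(f"{name}: {ratio} : 1")
```

Under this assumption the computed values match the ratios listed for all four data sets.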