Research Article
An Empirical Study on the Performance of Cost-Sensitive Boosting Algorithms with Different Levels of Class Imbalance
Table 1
Summary of the characteristics of the data sets used.
| Data set | Samples | Minority/majority class | No. of minority/majority samples | Imbalance ratio |
|---|---|---|---|---|
| LetterA | 20000 | Class 1/rest | 789/19211 | 24.35 |
| Cbands | 12000 | Class 1/rest | 500/11500 | 23.00 |
| Pendigits | 10992 | Class 5/rest | 1055/9937 | 9.42 |
| Satimage | 6435 | Class 4/rest | 626/5809 | 9.28 |
| Optidigts | 5620 | Class 8/rest | 554/5066 | 9.14 |
| Mfeat_kar | 2000 | Digit 9/rest | 200/1800 | 9.00 |
| Mfeat_zer | 2000 | Digit 9/rest | 200/1800 | 9.00 |
| Segment | 2310 | Class 5/rest | 330/1980 | 6.00 |
| Scrapie | 3113 | Class 1/class 0 | 531/2582 | 4.86 |
| Vehicle | 846 | van/rest | 212/634 | 2.99 |
| Haberman | 306 | Class 2/class 1 | 81/225 | 2.78 |
| Yeast | 1484 | Class 2/rest | 429/1055 | 2.46 |
| Breast | 336 | Class 1/class 0 | 81/196 | 2.42 |
| Phoneme | 5404 | Class 1/class 0 | 1586/3818 | 2.41 |
| German | 1000 | Class 2/class 1 | 300/700 | 2.33 |
| Pima | 768 | Class 1/class 0 | 268/500 | 1.87 |
| Spambase | 4601 | Class 1/class 0 | 1813/2788 | 1.54 |
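The imbalance ratios in Table 1 appear to be the number of majority samples divided by the number of minority samples, rounded to two decimals. The following is a minimal sketch of that calculation for three rows of the table. The function name `imbalance_ratio` and the `counts` dictionary are illustrative assumptions, not part of the original study.

```python
def imbalance_ratio(n_minority: int, n_majority: int) -> float:
    """Return the majority-to-minority sample ratio, rounded to two decimals.

    Assumption: Table 1 defines the imbalance ratio as n_majority / n_minority.
    """
    if n_minority <= 0:
        raise ValueError("n_minority must be a positive integer")
    return round(n_majority / n_minority, 2)


# (minority count, majority count) copied from three rows of Table 1.
counts = {
    "LetterA": (789, 19211),
    "Pendigits": (1055, 9937),
    "Spambase": (1813, 2788),
}

ratios = {name: imbalance_ratio(n_min, n_maj) for name, (n_min, n_maj) in counts.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Under this assumed definition, the computed values match the ratios listed for these three data sets (24.35, 9.42, and 1.54).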