Research Article

A Novel Boundary Oversampling Algorithm Based on Neighborhood Rough Set Model: NRSBoundary-SMOTE

Table 1

Data description.

No.  Dataset                            Size  Attributes  Class label (minority : majority)  Class distribution (minority/majority)
1    Austra                             690   14C         1 : 0                              307/383
2    Heart-s                            270   6C 7N       Present : absent                   120/150
3    Bupa                               345   6C          1 : 2                              145/200
4    Auto-mpg                           398   5C 2N       Others : 1                         149/249
5    Colic                              368   7C 15N      No : yes                           136/232
6    Ionosphere                         351   34C         b : g                              126/225
7    Machine                            209   7C          Others : 2                         74/135
8    Labor                              57    8C 8N       Bad : good                         20/37
9    Pima                               768   8C          1 : 0                              268/500
10   Vertebral column (VC)              310   7C          Normal : abnormal                  100/210
11   German                             1000  24C         2 : 1                              300/700
12   Haberman                           306   3C          2 : 1                              81/225
13   Transfusion                        748   4C          1 : 0                              178/570
14   Contraceptive method choice (CMC)  1473  9C          2 : others                         333/1140
15   Yeast                              1484  8C          MIT : others                       244/1240

C: continuous, N: nominal.
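
The class distributions in Table 1 can be summarized as a single imbalance figure per dataset. The following is a minimal sketch, not part of the original study: it copies the sizes and class counts from Table 1, checks that the two class counts sum to the reported size, and prints the majority-to-minority ratio. The helper name imbalance_ratio and the majority/minority convention for the ratio are assumptions made for illustration.

```python
# (dataset, size, minority count, majority count), copied from Table 1.
DATASETS = [
    ("Austra", 690, 307, 383),
    ("Heart-s", 270, 120, 150),
    ("Bupa", 345, 145, 200),
    ("Auto-mpg", 398, 149, 249),
    ("Colic", 368, 136, 232),
    ("Ionosphere", 351, 126, 225),
    ("Machine", 209, 74, 135),
    ("Labor", 57, 20, 37),
    ("Pima", 768, 268, 500),
    ("VC", 310, 100, 210),
    ("German", 1000, 300, 700),
    ("Haberman", 306, 81, 225),
    ("Transfusion", 748, 178, 570),
    ("CMC", 1473, 333, 1140),
    ("Yeast", 1484, 244, 1240),
]


def imbalance_ratio(minority: int, majority: int) -> float:
    """Hypothetical helper: majority samples per minority sample (assumed convention)."""
    return majority / minority


def check_sizes(rows):
    """Return the names of datasets whose class counts do not sum to the size."""
    return [name for name, size, mino, majo in rows if mino + majo != size]


if __name__ == "__main__":
    print("Inconsistent rows:", check_sizes(DATASETS))
    for name, size, mino, majo in DATASETS:
        print(f"{name:12s} size={size:5d}  ratio={imbalance_ratio(mino, majo):.2f}")
```

Under this convention the datasets in Table 1 are listed in order of increasing imbalance, from Austra (the most balanced) to Yeast (the most skewed).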