Research Article

HSDP: A Hybrid Sampling Method for Imbalanced Big Data Based on Data Partition

Table 6

G-means values of RF + various sampling methods.

| Dataset | RF | SMOTE + RF | ADASYN + RF | Borderline-SMOTE + RF | HSDP (proposed) + RF |
|---|---|---|---|---|---|
| Pima | 0.6173 | 0.6412 | 0.6315 | 0.6385 | 0.6479 |
| Yeast3 | 0.8243 | 0.8502 | 0.8473 | 0.8362 | 0.8532 |
| Abalone19 | 0.2782 | 0.6352 | 0.6529 | 0.4564 | 0.7001 |
| Segment0 | 0.9344 | 0.9501 | 0.9313 | 0.9421 | 0.9567 |
| Page-blocks0 | 0.8348 | 0.8990 | 0.9078 | 0.8846 | 0.8815 |
| Glass5 | 0.8366 | 0.8201 | 0.8302 | 0.8172 | 0.8401 |
| Ecoli4 | 0.6490 | 0.7442 | 0.7751 | 0.7403 | 0.7239 |
| Haberman | 0.4659 | 0.5686 | 0.5982 | 0.5743 | 0.5991 |
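
For readers who want to reproduce a G-means value like those in Table 6, the following is a minimal sketch. It assumes the standard binary definition, G-mean = sqrt(sensitivity × specificity), with the minority class as the positive class. The function name `g_mean` and the toy labels are illustrative and are not taken from the article.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed standard definition)."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1  # minority sample correctly detected
            else:
                fn += 1  # minority sample missed
        else:
            if p == positive:
                fp += 1  # majority sample wrongly flagged
            else:
                tn += 1  # majority sample correctly rejected
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy labels (hypothetical): 4 minority samples, 6 majority samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)

# A classifier that always predicts the majority class is 60% accurate here,
# yet its G-mean is 0 because its sensitivity is 0.
majority_only = g_mean(y_true, [0] * len(y_true))
```

Because the two rates are multiplied, a classifier cannot score well on this metric by favoring the majority class alone, which is why it is a common choice for comparing sampling methods on imbalanced data.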