Research Article

HSDP: A Hybrid Sampling Method for Imbalanced Big Data Based on Data Partition

Table 5

F-measure values of RF + various sampling methods.

Dataset       RF      SMOTE + RF  ADASYN + RF  Borderline-SMOTE + RF  HSDP (proposed) + RF
Pima          0.5891  0.5625      0.5632       0.5710                 0.6000
Yeast3        0.7326  0.7411      0.7214       0.7366                 0.7611
Abalone19     0.3054  0.3123      0.3652       0.2864                 0.3912
Segment0      0.9102  0.9375      0.8865       0.9292                 0.9301
Page-blocks0  0.7443  0.7745      0.7586       0.7723                 0.7438
Glass5        0.8148  0.7826      0.7931       0.7725                 0.8201
Ecoli4        0.5701  0.5737      0.5723       0.5802                 0.5623
Haberman      0.3402  0.4011      0.4386       0.4154                 0.4489