Research Article

HSDP: A Hybrid Sampling Method for Imbalanced Big Data Based on Data Partition

Table 3

F-measure values of KNN + various sampling methods.

Dataset       KNN     SMOTE + KNN  ADASYN + KNN  Borderline-SMOTE + KNN  HSDP (proposed) + KNN
Pima          0.5543  0.5766       0.5824        0.5726                  0.6105
Yeast3        0.6343  0.6301       0.6294        0.6528                  0.6698
Abalone19     0.2019  0.3355       0.2801        0.2438                  0.3045
Segment0      0.8653  0.8734       0.8697        0.8843                  0.8928
Page-blocks0  0.7146  0.7301       0.7099        0.7401                  0.6988
Glass5        0.7132  0.7102       0.7129        0.7265                  0.6827
Ecoli4        0.5401  0.5268       0.6011        0.5701                  0.6698
Haberman      0.2922  0.3027       0.3417        0.3285                  0.3636