Research Article

Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm, Minimum Spanning Tree, and Hierarchical Clustering in an Applied Study

Table 1

Description of the seven datasets used to compare the methods.¹

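| Name of dataset | Sample size (+/-) | No. of variables (features) | No. of classes (labels) | No. of optimal clusters² |
|---|---|---|---|---|
| Leukemia | 64 (26/38) | 4 | 2 | 2 |
| Prostate | 30 (15/15) | 3 | 2 | 2 |
| Colon Cancer | 111 (56/55) | 4 | 2 | 2 |
| Haberman | 306 | 3 | 2 | 2 |
| Iris | 150 | 4 | 3 | 3 |
| Wine | 178 | 13 | 3 | 3 |
| Glass | 214 | 10 | 7 | 7 |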

¹ Sources: Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/gds) and the University of California, Irvine Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets.php).
² The optimal number of clusters is the value selected by majority rule among the elbow, gap, and silhouette methods.
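
The majority rule in the second footnote can be illustrated with a minimal sketch. The function name `majority_k`, the example votes, and the handling of a three-way tie (returning `None`) are assumptions made here for illustration; the article does not state how ties were resolved, and the votes shown are not values reported by the authors.

```python
from collections import Counter


def majority_k(suggestions):
    """Return the cluster count proposed by most criteria, or None on a tie.

    `suggestions` maps a criterion name (e.g. "elbow") to the number of
    clusters that criterion proposes. Tie handling is an assumption.
    """
    if not suggestions:
        raise ValueError("at least one criterion is required")
    ranked = Counter(suggestions.values()).most_common()
    # A majority winner exists only if the top vote count is not shared.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical per-criterion proposals, not results from the article.
votes = {"elbow": 3, "gap": 3, "silhouette": 2}
print(majority_k(votes))  # prints 3
```

With three criteria, a tie can occur only when all three disagree, in which case some further rule (not described in the source) would be needed.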