Mathematical Problems in Engineering

Research Article

Novel Automated K-means++ Algorithm for Financial Data Sets

Table 5

Evaluation of clustering quality according to external validity indexes.


Dataset and algorithm	Set matching measures				Pair-counting measures		Entropy
Dataset and algorithm	Purity	Recall	Precision	F-measure	Rand-index	Jaccard-index	Entropy

S_1 K-means	0.87	0.68	0.62	0.65	0.84	0.48	0.62
S_1 K-means++	0.88	0.78	0.78	0.78	0.89	0.60	0.46
S_1 SDK-means++	0.98	0.96	0.98	0.97	0.92	0.95	0.06
S_2 K-means	0.82	0.80	0.70	0.75	0.92	0.59	0.57
S_2 K-means++	0.86	0.83	0.84	0.83	0.92	0.71	0.31
S_2 SDK-means++	0.97	0.97	0.98	0.97	0.99	0.96	0.08
S_3 K-means	0.87	0.26	0.83	0.40	0.65	0.25	0.49
S_3 K-means++	0.88	0.28	0.86	0.42	0.66	0.26	0.47
S_3 SDK-means++	0.90	0.33	0.88	0.48	0.69	0.32	0.38
S_4 K-means	0.86	0.38	0.83	0.52	0.83	0.35	0.56
S_4 K-means++	0.86	0.36	0.83	0.50	0.83	0.34	0.52
S_4 SDK-means++	0.90	0.49	0.91	0.64	0.87	0.47	0.41
S_5 K-means	0.85	0.29	0.77	0.42	0.80	0.27	0.53
S_5 K-means++	0.88	0.29	0.80	0.43	0.80	0.27	0.49
S_5 SDK-means++	0.89	0.32	0.88	0.47	0.82	0.31	0.43
RMSE & K-means	0.018	0.217	0.081	0.134	0.088	0.129	0.043
RMSE & K-means++	0.009	0.244	0.028	0.176	0.091	0.184	0.073
RMSE & SDK-means++	0.038	0.292	0.042	0.223	0.101	0.293	0.165
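
For readers who wish to reproduce indexes of the kind reported in Table 5, the following is a minimal sketch of how purity, entropy, the Rand index, and the Jaccard index can be computed from ground-truth class labels and predicted cluster labels. It is not the authors' implementation. The function names are hypothetical, the entropy uses a base-2 logarithm without normalization (an assumption; the paper may normalize differently), and the set-matching recall, precision, and F-measure are omitted because their exact definitions are not given in this excerpt.

```python
from collections import Counter
from itertools import combinations
from math import log2


def contingency(labels_true, labels_pred):
    """Map each predicted cluster to a Counter of the true classes it contains."""
    table = {}
    for t, p in zip(labels_true, labels_pred):
        table.setdefault(p, Counter())[t] += 1
    return table


def purity(labels_true, labels_pred):
    """Fraction of points that belong to the majority class of their cluster."""
    table = contingency(labels_true, labels_pred)
    return sum(max(c.values()) for c in table.values()) / len(labels_true)


def entropy(labels_true, labels_pred):
    """Size-weighted average class entropy of the clusters (base 2, unnormalized)."""
    n = len(labels_true)
    total = 0.0
    for counts in contingency(labels_true, labels_pred).values():
        size = sum(counts.values())
        h = -sum((c / size) * log2(c / size) for c in counts.values())
        total += (size / n) * h
    return total


def pair_counts(labels_true, labels_pred):
    """Classify every pair of points by agreement in class and in cluster."""
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            a += 1  # same class, same cluster
        elif same_pred:
            b += 1  # different class, same cluster
        elif same_true:
            c += 1  # same class, different cluster
        else:
            d += 1  # different class, different cluster
    return a, b, c, d


def rand_index(labels_true, labels_pred):
    """Share of point pairs on which the two labelings agree."""
    a, b, c, d = pair_counts(labels_true, labels_pred)
    return (a + d) / (a + b + c + d)


def jaccard_index(labels_true, labels_pred):
    """Like the Rand index, but ignoring pairs separated in both labelings."""
    a, b, c, _ = pair_counts(labels_true, labels_pred)
    return a / (a + b + c)


if __name__ == "__main__":
    # Toy labels chosen for illustration only; they are not from the paper.
    truth = [0, 0, 0, 1, 1, 1]
    pred = [0, 0, 1, 1, 1, 1]
    print(round(purity(truth, pred), 4))
    print(round(entropy(truth, pred), 4))
    print(round(rand_index(truth, pred), 4))
    print(round(jaccard_index(truth, pred), 4))
```

Higher purity, Rand, and Jaccard values and lower entropy indicate closer agreement with the ground truth. Under these definitions the Jaccard index can never exceed the Rand index for the same pair of labelings, which can serve as a quick sanity check on computed values. The pair enumeration is quadratic in the number of points, so a contingency-table formulation would be preferable for large data sets.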