Research Article
Clustering Categorical Data Using Community Detection Techniques
Input: dataset with data points, each data point has attributes. The number of clusters.
Output: clusters of data points.
(1) compute pairwise Hamming distances
(2) compute CDF for pairwise Hamming distances
(3) // estimate distance threshold
(4) for do
(5)     if and then
(6)         break
(7) // clustering
(8)
(9) for do
(10)     if then
(11)
(12) run Louvain method [12] on
(13) keep top- communities by size
(14) for each cluster do
(15)     compute the mode of
(16) for each remaining data point do
(17)     assign to the nearest mode
(18) return
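The steps of the algorithm can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The exact threshold condition in steps (4) to (6) is not recoverable from the listing, so the sketch assumes the threshold is the smallest Hamming distance whose empirical CDF value reaches a hypothetical `quantile` parameter. It also assumes an unweighted graph that links two data points when their distance does not exceed the threshold, and at least one requested cluster. All function and parameter names are illustrative. The default community detector calls `louvain_communities` from networkx (version 2.8 or later); `connected_components` is a dependency-free stand-in for small experiments and is not equivalent to the Louvain method.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of attributes on which two data points differ."""
    return sum(x != y for x, y in zip(a, b))


def louvain(n_nodes, edges):
    """Louvain communities via networkx (imported lazily, needs >= 2.8)."""
    import networkx as nx
    graph = nx.Graph()
    graph.add_nodes_from(range(n_nodes))
    graph.add_edges_from(edges)
    return nx.community.louvain_communities(graph, seed=0)


def connected_components(n_nodes, edges):
    """Union-find stand-in for Louvain; NOT the same technique."""
    parent = list(range(n_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n_nodes):
        groups.setdefault(find(i), set()).add(i)
    return list(groups.values())


def cluster_categorical(data, k, quantile=0.1, detect=louvain):
    """Return one cluster label per data point (labels 0 .. k-1)."""
    n = len(data)
    # Pairwise Hamming distances.
    dist = {(i, j): hamming(data[i], data[j])
            for i, j in combinations(range(n), 2)}
    # Empirical CDF of the distances, accumulated in increasing order.
    counts = Counter(dist.values())
    total = len(dist)
    # ASSUMPTION: threshold = smallest distance whose CDF >= quantile.
    cumulative, threshold = 0, 0
    for d in sorted(counts):
        cumulative += counts[d]
        threshold = d
        if cumulative / total >= quantile:
            break
    # ASSUMPTION: link two points when their distance is within the threshold.
    edges = [pair for pair, d in dist.items() if d <= threshold]
    # Community detection, then keep the k largest communities.
    communities = sorted(detect(n, edges), key=len, reverse=True)[:k]
    labels, modes = {}, []
    for c, members in enumerate(communities):
        for i in members:
            labels[i] = c
        # Mode of a cluster: most frequent value of each attribute.
        columns = zip(*(data[i] for i in members))
        modes.append(tuple(Counter(col).most_common(1)[0][0]
                           for col in columns))
    # Assign every remaining point to the nearest mode (Hamming distance).
    for i in range(n):
        if i not in labels:
            labels[i] = min(range(len(modes)),
                            key=lambda c: hamming(data[i], modes[c]))
    return [labels[i] for i in range(n)]
```

Passing the community detector as an argument keeps the thresholding and mode-assignment logic independent of the graph library. For example, `cluster_categorical(data, k=2, detect=connected_components)` runs the same pipeline without networkx, at the cost of replacing modularity optimization with plain connectivity.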