Research Article

Clustering Categorical Data Using Community Detection Techniques

Algorithm 1: CD-Clustering.
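Input: a dataset of data points, each with a fixed number of attributes; the number of clusters k.
Output: k clusters of data points.
compute the pairwise Hamming distances between data points
compute the CDF of the pairwise Hamming distances
// estimate the distance threshold
for each candidate distance threshold do
    if the two stopping conditions on the CDF both hold then
        break
// clustering
initialize an empty graph with one node per data point
for each pair of data points do
    if their Hamming distance is within the estimated threshold then
        add an edge between the two data points
run the Louvain method [12] on the graph
keep the top-k communities by size as the initial clusters
for each cluster do
    compute the mode of the cluster
for each remaining data point do
    assign the data point to the cluster with the nearest mode
return the k clusters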
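
The stages of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The threshold rule (`estimate_threshold` with a hypothetical `target` fraction of the CDF) is an assumption standing in for the paper's two stopping conditions. The community step uses connected components as a simple stand-in for the Louvain method [12]; the hypothetical `detect` parameter marks where a real Louvain routine (for example, the one shipped with NetworkX) could be plugged in. All function and parameter names are illustrative.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of attribute positions at which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def estimate_threshold(dists, n_attrs, target):
    """ASSUMPTION: smallest distance whose empirical CDF reaches `target`.
    The paper's actual stopping conditions are not reproduced here."""
    values = list(dists.values())
    for d in range(1, n_attrs + 1):
        cdf = sum(v <= d for v in values) / len(values)
        if cdf >= target:
            return d
    return n_attrs


def connected_components(n, edges):
    """Stand-in for Louvain: groups of nodes linked by threshold edges."""
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u] - seen:
                seen.add(v)
                stack.append(v)
        comps.append(comp)
    return comps


def mode_of(data, members):
    """Most frequent value of each attribute among the cluster members."""
    columns = zip(*(data[i] for i in sorted(members)))
    return tuple(Counter(col).most_common(1)[0][0] for col in columns)


def cd_clustering(data, k, target=0.15, detect=connected_components):
    n, m = len(data), len(data[0])
    # Pairwise Hamming distances, keyed by index pair (i, j) with i < j.
    dists = {(i, j): hamming(data[i], data[j])
             for i, j in combinations(range(n), 2)}
    theta = estimate_threshold(dists, m, target)
    # Connect every pair whose distance is within the threshold.
    edges = [pair for pair, d in dists.items() if d <= theta]
    # Keep the k largest communities as the initial clusters.
    communities = sorted(detect(n, edges), key=len, reverse=True)[:k]
    clusters = [set(c) for c in communities]
    modes = [mode_of(data, c) for c in clusters]
    # Assign every leftover point to the cluster with the nearest mode.
    assigned = set().union(*clusters)
    for i in range(n):
        if i not in assigned:
            nearest = min(range(len(modes)),
                          key=lambda c: hamming(data[i], modes[c]))
            clusters[nearest].add(i)
    return clusters


# Toy data: two tight groups plus one outlier (index 6).
data = [
    ("a", "x", "p", "s"), ("a", "x", "p", "t"), ("a", "x", "q", "s"),
    ("b", "y", "r", "u"), ("b", "y", "r", "v"), ("b", "y", "w", "u"),
    ("b", "z", "q", "t"),
]
clusters = cd_clustering(data, k=2, target=0.15)
```

On this toy data the outlier forms its own singleton component, is dropped when only the two largest communities are kept, and is then reattached to the cluster whose mode it differs from in the fewest attributes. Computing all pairwise distances is quadratic in the number of data points, so this sketch is suited only to small inputs.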