Research Article
An Affinity Propagation Clustering Algorithm for Mixed Numeric and Categorical Datasets
Algorithm 1: Pseudocode for computing significance.
Set ;
for each numeric attribute in dataset do
    Compute the similarity matrix based on (10) as the input;
    Calculate the median of the similarities as the shared preference value;
    Perform the AP algorithm using (1)–(4) to obtain a clustering result;
    Discretize attribute to intervals according to the clustering result;
    ;
end for
Establish a new, purely categorical dataset composed of the discretized numeric attributes and the original categorical attributes;
for each attribute in dataset do
    Calculate the distance between two distinct values of any categorical attribute using (5)–(8);
    Compute the significance (weight) of each numeric attribute using (9) in which the interval is replaced by .
end for
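The first loop of Algorithm 1 (discretizing one numeric attribute by clustering it with AP) can be illustrated with a minimal Python sketch. Several parts are assumptions rather than the article's definitions: equations (1)–(4) and (10) are not reproduced in this excerpt, so the sketch uses the textbook responsibility and availability updates with damping, and takes the negative squared difference as a stand-in for the similarity in (10). The function names `similarity_matrix`, `affinity_propagation`, `intervals_from_labels`, and `discretize`, as well as the damping factor and iteration count, are hypothetical choices made for illustration. The second loop, which relies on (5)–(9), is not covered.

```python
from statistics import median


def similarity_matrix(values):
    """Build S for one numeric attribute.

    Assumption: similarity is the negative squared difference,
    used here as a stand-in for Eq. (10).
    """
    n = len(values)
    S = [[-(values[i] - values[k]) ** 2 for k in range(n)] for i in range(n)]
    # Shared preference: median of the off-diagonal similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = pref
    return S, pref


def affinity_propagation(S, damping=0.5, iterations=200):
    """Textbook AP message passing (assumed to correspond to Eqs. (1)-(4)).

    Returns, for every point, the index of the exemplar it selects.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            runner_up = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                target = S[i][k] - (runner_up if k == best else scores[best])
                R[i][k] = damping * R[i][k] + (1 - damping) * target
        for k in range(n):
            positive = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(positive) - positive[k]  # sum over i != k
            for i in range(n):
                if i == k:
                    target = total
                else:
                    target = min(0.0, R[k][k] + total - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * target
    return [max(range(n), key=lambda k, i=i: A[i][k] + R[i][k])
            for i in range(n)]


def intervals_from_labels(values, labels):
    """Map cluster labels to interval indices ordered by cluster minimum."""
    lows = {}
    for v, lab in zip(values, labels):
        lows[lab] = min(v, lows.get(lab, v))
    order = {lab: idx for idx, lab in enumerate(sorted(lows, key=lows.get))}
    return [order[lab] for lab in labels]


def discretize(values):
    """One pass of the first loop: similarities, AP, then interval indices."""
    S, _ = similarity_matrix(values)
    return intervals_from_labels(values, affinity_propagation(S))


if __name__ == "__main__":
    attribute = [0.0, 2.0, 4.0, 10.0, 12.0, 14.0]
    print(discretize(attribute))
```

In this sketch the number of intervals is not fixed in advance; it follows from how many exemplars AP selects under the median preference. The input is assumed to contain at least two values, and heavily tied data may need a larger damping factor to settle.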