Research Article

An Affinity Propagation Clustering Algorithm for Mixed Numeric and Categorical Datasets

Algorithm 1

Pseudocode for computing attribute significance.
Set ;
for each numeric attribute in the dataset do
 Compute the similarity matrix using (10) as the input;
 Calculate the median of the similarities as the shared preference value;
 Perform the AP algorithm using (1)–(4) to obtain a clustering result;
 Discretize the attribute into intervals according to the clustering result;
;
end for
Build a new, purely categorical dataset composed of the discretized numeric
attributes and the original categorical attributes;
for each attribute in the new dataset do
 Calculate the distance between each pair of distinct values of the categorical attribute using (5)–(8);
 Compute the significance (weight) of each numeric attribute using (9) in which the interval is replaced by .
end for
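
The first loop of Algorithm 1 (discretizing one numeric attribute with AP) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the similarity in (10) is assumed to be the negative squared difference between two values, the updates in (1)–(4) are assumed to be the standard damped responsibility and availability messages of affinity propagation, and the names `similarity_matrix`, `affinity_propagation`, `discretize_attribute`, `damping`, and `max_iter` are hypothetical.

```python
from statistics import median


def similarity_matrix(values):
    # Assumed form of (10): negative squared difference between two values.
    return [[-(a - b) ** 2 for b in values] for a in values]


def affinity_propagation(S, damping=0.7, max_iter=300):
    """Standard AP message passing; returns the exemplar index chosen for each point."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(max_iter):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} (a(i,k') + s(i,k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            top = scores[best]
            runner_up = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                target = S[i][k] - (runner_up if k == best else top)
                R[i][k] = damping * R[i][k] + (1 - damping) * target
        # Availability: a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k});
        # self-availability a(k,k) is the sum of positive r(i',k) over i' != k.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    target = total
                else:
                    target = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * target
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # fallback (an assumption): treat all points as one cluster
        return [0] * n
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


def discretize_attribute(values):
    """Cluster one numeric attribute with AP and map each value to an interval index."""
    n = len(values)
    S = similarity_matrix(values)
    # Shared preference: median of the off-diagonal similarities.
    preference = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        S[i][i] = preference
    exemplar_of = affinity_propagation(S)
    clusters = {}
    for v, e in zip(values, exemplar_of):
        clusters.setdefault(e, []).append(v)
    # Number the intervals in ascending order of their smallest member.
    ordered = sorted(clusters, key=lambda e: min(clusters[e]))
    rank = {e: r for r, e in enumerate(ordered)}
    bounds = [(min(clusters[e]), max(clusters[e])) for e in ordered]
    return [rank[e] for e in exemplar_of], bounds


if __name__ == "__main__":
    values = [1.0, 1.2, 1.3, 1.9, 10.0, 10.1, 10.5, 11.0]
    labels, bounds = discretize_attribute(values)
    print(labels, bounds)
```

The interval index returned for each value plays the role of the categorical value that replaces the numeric one in the new dataset. Because the preference is shared by all points, the number of intervals is determined by AP itself rather than fixed in advance.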