Research Article

Application of Customer Segmentation for Electronic Toll Collection: A Case Study

Algorithm 1

CLARA algorithm.
Input:
D - ETC customer index dataset;
k - the number of clusters;
samples - number of samples to be drawn from the dataset;
sampsize - number of observations in each sample.
Output:
The clustering results of ETC customers.
Method:
for i = 1 to samples, repeat (a)-(d);
(a) randomly select sampsize objects from the ETC customer index dataset D as a sample, and apply
the PAM algorithm to the sample to compute the best k medoids;
(b) apply these k medoids to the entire dataset D: calculate the distance from every non-medoid
object in D to the closest object in the medoid set, and reassign each ETC
customer to the cluster of its closest medoid;
(c) compute the average dissimilarity of this clustering; if the value is less than the current
minimum value, replace the current minimum with it and keep these k medoids as the best
set of k representative objects;
(d) return to step (a) and repeat the iterative process;
until no change occurs; output the clustering results of ETC customers.
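
Steps (a)-(c) of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: it assumes Euclidean distance as the dissimilarity measure, replaces full PAM with a naive swap search, and runs a fixed number of sampling rounds (samples) rather than checking for "no change". The function names pairwise_dist, pam, and clara and the seed parameter are hypothetical; samples and sampsize follow the inputs of Algorithm 1.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between rows of a and rows of b (assumed metric)."""
    return np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2))

def pam(dist, k, rng):
    """Naive PAM stand-in: swap medoids greedily until the total cost stops improving."""
    n = dist.shape[0]
    medoids = list(rng.choice(n, size=k, replace=False))
    best_cost = dist[:, medoids].min(axis=1).sum()
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids.copy()
                trial[pos] = cand
                cost = dist[:, trial].min(axis=1).sum()
                if cost < best_cost:
                    medoids, best_cost, improved = trial, cost, True
    return medoids

def clara(D, k, samples=5, sampsize=40, seed=0):
    """Return (medoid row indices in D, cluster labels, average dissimilarity)."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    best_cost, best_medoids, best_labels = np.inf, None, None
    for _ in range(samples):
        # (a) draw a random sample and run PAM on it
        idx = rng.choice(n, size=min(sampsize, n), replace=False)
        local = pam(pairwise_dist(D[idx], D[idx]), k, rng)
        medoids = idx[local]  # map sample positions back to rows of D
        # (b) assign every object in D to its closest medoid
        d = pairwise_dist(D, D[medoids])
        labels = d.argmin(axis=1)
        # (c) keep the medoid set with the lowest average dissimilarity
        cost = d.min(axis=1).mean()
        if cost < best_cost:
            best_cost, best_medoids, best_labels = cost, medoids, labels
    return best_medoids, best_labels, best_cost
```

Calling clara(X, k=2, samples=3, sampsize=20) on an array X with one row per customer returns the indices of the selected medoids, a cluster label for each row, and the average dissimilarity of the best clustering found. Because PAM runs only on each sample, the cost of the swap search depends on sampsize rather than on the full dataset size.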