Research Article

Application of Customer Segmentation for Electronic Toll Collection: A Case Study

Algorithm 2

CART algorithm.
Input:
D - ETC customer index dataset and their associated class labels;
minbucket - the minimum number of observations in any terminal (leaf) node.
Output:
A decision tree of ETC customer segmentation.
Method:
create a node N;
set a split point, a, for a specific segmentation index A, and split D into
subsets D1 and D2. Thus, for the three ETC segmentation indexes, three sets
of subsets are obtained;
compute the Gini index of each of the three indexes in dataset D, and
determine the optimal splitting index;
repeat the steps above until the samples in a subset are too few or the
reduction in “node impurity” falls below the given threshold, then
create a leaf node;
label the leaf node N with the majority class of the samples in D, and
generate a decision tree of ETC customer segmentation;
select different subtrees (branches) in the decision tree and prune them using
the cross-validated error and cost complexity;
output an optimal decision tree of ETC customer segmentation.
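
The split-selection steps of the method above can be illustrated with a minimal sketch. It is not the implementation used in the case study: the function names gini and best_split, the use of midpoints between consecutive values as candidate split points, and the toy data are all assumptions made for illustration. Only the Gini criterion, the binary split of D into D1 and D2, and the minbucket constraint come from the algorithm itself.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, minbucket=1):
    """Return (weighted_gini, index, split_point) for the best binary split.

    rows holds one numeric tuple per customer, one value per segmentation
    index. A candidate split point a sends rows with value <= a to D1 and
    the rest to D2. Returns None when no split leaves at least minbucket
    samples on both sides.
    """
    n = len(rows)
    best = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Assumed candidate split points: midpoints of consecutive values.
        for lo, hi in zip(values, values[1:]):
            a = (lo + hi) / 2
            d1 = [y for r, y in zip(rows, labels) if r[j] <= a]
            d2 = [y for r, y in zip(rows, labels) if r[j] > a]
            if len(d1) < minbucket or len(d2) < minbucket:
                continue
            # Size-weighted Gini impurity of the two subsets.
            score = (len(d1) * gini(d1) + len(d2) * gini(d2)) / n
            if best is None or score < best[0]:
                best = (score, j, a)
    return best


# Hypothetical toy data: three segmentation indexes per customer.
rows = [(1, 5, 2), (2, 6, 9), (3, 5, 1), (10, 6, 8), (11, 5, 3), (12, 6, 7)]
labels = ["low", "low", "low", "high", "high", "high"]
print(best_split(rows, labels))
```

On this toy data the first index separates the two classes completely, so it is chosen as the splitting index. Growing the full tree would apply best_split recursively to D1 and D2 until it returns None or the impurity reduction becomes too small; the pruning step is not covered by this sketch.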