Discrete Dynamics in Nature and Society

Research Article

An Efficient MapReduce-Based Parallel Clustering Algorithm for Distributed Traffic Subarea Division

Combiner(key₁, medi).

Input:
key₁: the index of the cluster,
medi: the list of the samples assigned to the same cluster.
Output: ⟨key′, value′⟩,
key′: the index of the cluster,
value′: the sum of the values of the samples belonging to the same cluster and the number of samples.
(1) Construct a counter num_s to record the number of samples in the same cluster;
(2) Construct an array sum_v to record the sum of the values of different dimensions of the samples belonging
to the same cluster (i.e., the samples in the list medi);
(3) Construct the sample examples to hold each data object extracted via medi.next(), and the variable
dimensions to hold the dimension of the original data object;
(4) num_s = 0;
(5) while (medi.hasNext()) do
(6) CurrentPoint = medi.next();
(7) num_s++;
(8) for i = 0 to dimensions − 1 do
(9) sum_v[i] += CurrentPoint.point[i];
(10) //Calculate the sum of the values of each dimension of examples
(11) end for
(12) for i = 0 to dimensions − 1 do
(13) mean[i] = sum_v[i]/num_s;
(14) //Compute the mean value of the samples for each cluster
(15) end for
(16) end while
(17) index = key₁;
(18) Construct value′ as a string containing the sum of the values of each dimension sum_v[] and
the number of samples num_s;
(19) return ⟨index, value′⟩ pairs;
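
The Combiner steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the Hadoop implementation used in the paper: the function names `combiner` and `parse_value` are hypothetical, each sample is assumed to be a sequence of floats, and the string layout `"s1,s2,...;n"` for the output value is an assumed encoding, since the listing only says that the value is a string holding the per-dimension sums and the sample count. The sketch emits only the sums and the count, as in step (18), and leaves the mean to whichever stage consumes the pair.

```python
from typing import Iterable, List, Sequence, Tuple


def combiner(key1: int, medi: Iterable[Sequence[float]]) -> Tuple[int, str]:
    """Fold the samples of one cluster into per-dimension sums and a count.

    key1 is the cluster index; medi iterates over the samples assigned to it.
    The returned string uses an assumed layout: "s1,s2,...;n".
    """
    num_s = 0                  # number of samples seen so far
    sum_v: List[float] = []    # per-dimension running sums, sized on first sample

    for current_point in medi:
        if num_s == 0:
            sum_v = [0.0] * len(current_point)
        elif len(current_point) != len(sum_v):
            raise ValueError("all samples must have the same dimension")
        num_s += 1
        for i, x in enumerate(current_point):
            sum_v[i] += x

    # Assumed encoding: comma-separated sums, a semicolon, then the count.
    value = ",".join(repr(s) for s in sum_v) + ";" + str(num_s)
    return key1, value


def parse_value(value: str) -> Tuple[List[float], int]:
    """Decode the assumed "s1,s2,...;n" layout back into (sums, count)."""
    sums_part, count_part = value.split(";")
    sums = [float(s) for s in sums_part.split(",")] if sums_part else []
    return sums, int(count_part)


if __name__ == "__main__":
    index, value = combiner(3, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    sums, count = parse_value(value)
    centroid = [s / count for s in sums]
    print(index, value, centroid)
```

Passing sums and counts rather than means keeps the partial results mergeable: sums and counts from several combiners for the same cluster can be added together before dividing, whereas averaging the partial means would be wrong whenever the partitions hold different numbers of samples.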