Research Article

Partition Selection for Large-Scale Data Management Using KNN Join Processing

Algorithm 4

vKNN processing.
Input:, S, k
Output: s.predictLabel // prediction label of s
map: <row number, s>
  knnSet[k] ← selectTrainSet(); // find the k nearest neighbors of s
  for i = 0 to k − 1 do // fetch the labels of the k nearest neighbors
    trainLabel[i] = knnSet[i].label;
  for j = 0 to k − 1 do // emit one (s, label) pair per neighbor
    output(s, trainLabel[j]);
reduce: <s, Labels>
  hmp = new HashMap(); // create a HashMap object hmp
  foreach label ∈ Labels do // count the occurrences of each label
    if hmp.get(label) ≠ NULL then // the label already exists in hmp
      hmp.put(label, hmp.get(label) + 1); // increment the count stored for the label
    else // the label does not exist in hmp yet
      hmp.put(label, 1); // insert the label with a count of 1
  predictLabel = the label with the largest count in hmp; // majority vote
  output(s, predictLabel);
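
The map and reduce steps of Algorithm 4 can be illustrated with a small single-process Python sketch. It is not the authors' implementation: `select_train_set` is a hypothetical brute-force stand-in for `selectTrainSet()` (the paper's partition-based neighbor search is not reproduced here), Euclidean distance is assumed, and the shuffle between the two phases is simulated with an in-memory dictionary. Test points are assumed to be tuples so that they can serve as keys.

```python
import math
from collections import Counter, defaultdict


def select_train_set(s, train, k):
    """Hypothetical stand-in for selectTrainSet(): brute-force k nearest
    neighbors of s under Euclidean distance (an assumption of this sketch)."""
    return sorted(train, key=lambda rec: math.dist(rec[0], s))[:k]


def map_phase(s, train, k):
    """Map step: emit one (s, label) pair for each of the k nearest neighbors."""
    for _point, label in select_train_set(s, train, k):
        yield s, label


def reduce_phase(s, labels):
    """Reduce step: count the labels and return the most frequent one."""
    counts = Counter(labels)  # plays the role of the HashMap hmp
    predict_label, _count = counts.most_common(1)[0]
    return s, predict_label


def vknn_classify(test_points, train, k):
    """Run map, group the emitted pairs by key (simulated shuffle), then reduce."""
    grouped = defaultdict(list)
    for s in test_points:
        for key, label in map_phase(s, train, k):
            grouped[key].append(label)
    return dict(reduce_phase(s, labels) for s, labels in grouped.items())


# Toy data, made up for illustration only.
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((0.0, 0.2), "a"),
    ((1.0, 1.0), "b"),
    ((1.1, 1.0), "b"),
]
predictions = vknn_classify([(0.05, 0.05), (1.0, 0.9)], train, k=3)
```

With k = 3, the second test point has two "b" neighbors and one "a" neighbor, so the majority vote in the reduce step yields "b". When counts are tied, `Counter.most_common` returns the label that was encountered first; the pseudocode does not specify a tie-breaking rule, so this is a property of the sketch only.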