Mathematical Problems in Engineering

Research Article

Partition Selection for Large-Scale Data Management Using KNN Join Processing

KNN join processing.

	Input: k, R, S//k: number of nearest neighbors; R: labeled training set; S: set of points to classify
	Output: s.predictLabel//prediction label of s
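	map: <row number, s>
	initialize distance[0..k-1] = ∞ and trainLabel[0..k-1] = NULL;
	foreach r ∈ R do
		d = dis(r, s);//calculate the Euclidean distance between r and s
		for i = 0 to k - 1 do
			if d < distance[i] then//keep the k smallest distances in ascending order
				shift distance[i..k-2] and trainLabel[i..k-2] one position to the right;
				distance[i] = d;
				trainLabel[i] = r.label;
				break;
	for j = 0 to k - 1 do
		output(s, trainLabel[j]);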
	reduce: <s, Labels>
	hmp = new HashMap(); //create a HashMap object hmp
	foreach label ∈ Labels do//count the occurrences of each label
		if hmp.get(label) != NULL then//if the label exists in hmp
			hmp.put(label, hmp.get(label) + 1); //increment the count stored for the label
		else//if the label does not exist in hmp
			hmp.put(label, 1); //insert the label into hmp with a count of 1
	predictLabel = the label with the largest count in hmp; //majority vote gives the prediction label
	output(s, predictLabel);
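
The map and reduce steps above can be illustrated with a minimal single-process sketch in Python. It is not the authors' implementation. The function names `knn_map`, `knn_reduce`, and `knn_join`, the representation of R as a list of (features, label) pairs, and the in-memory grouping that stands in for the shuffle phase are all assumptions made for illustration. The sketch also uses a library selection routine (`heapq.nsmallest`) in place of the explicit sorted insertion in the listing.

```python
import heapq
import math
from collections import Counter, defaultdict


def knn_map(s_id, s, R, k):
    """Map step: emit (s_id, label) for the k training points nearest to s."""
    # R is assumed to be a list of (features, label) pairs.
    # nsmallest keeps the k pairs with the smallest Euclidean distance to s.
    nearest = heapq.nsmallest(k, R, key=lambda r: math.dist(r[0], s))
    for _, label in nearest:
        yield s_id, label


def knn_reduce(s_id, labels):
    """Reduce step: majority vote over the labels emitted for one test point."""
    counts = Counter(labels)  # plays the role of the HashMap hmp
    predict_label, _ = counts.most_common(1)[0]
    return s_id, predict_label


def knn_join(R, S, k):
    """Run map, group the emitted pairs by key, then reduce (one process)."""
    grouped = defaultdict(list)  # stand-in for the shuffle phase
    for s_id, s in enumerate(S):
        for key, label in knn_map(s_id, s, R, k):
            grouped[key].append(label)
    return dict(knn_reduce(key, labels) for key, labels in grouped.items())


if __name__ == "__main__":
    # Hypothetical toy data: two clusters labeled "a" and "b".
    R = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"),
         ((0.9, 1.1), "b"), ((1.2, 0.8), "b")]
    S = [(0.05, 0.1), (1.0, 0.9)]
    print(knn_join(R, S, k=3))  # prints {0: 'a', 1: 'b'}
```

Here each test point is keyed by its row number, as in the map signature of the listing. When several labels tie for the largest count, `Counter.most_common` returns the one encountered first, which is one possible tie-breaking rule; the listing does not specify one.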