Parallel Attribute Reduction Algorithm for Complex Heterogeneous Data Using MapReduce
Algorithm 2
Hash-Reduce function.
Input: <KEYHM, VALUEHM>
Output: <KEYHR, VALUEHR> // KEYHR is the set of distinct hash values key', and VALUEHR is the set of sample-ID subsets value', where each value' holds the samples that share the hash value key'.
begin
<KEYHR, VALUEHR> = ∅
for <key, value> in <KEYHM, VALUEHM> do
if key does not appear in <KEYHR, VALUEHR>
<key', value'>=<key, value>
else
if key = key'_k
<KEYHR, VALUEHR> = <KEYHR, VALUEHR> - <key', value'>
value'_k = value'_k ∪ value // combine samples with the same hash value to obtain the hash bucket
end if
end if
<KEYHR, VALUEHR> = <KEYHR, VALUEHR> ∪ <key', value'>
end for // output to multiple files; each file is named after a hash value and holds one hash bucket
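
The grouping step of the Hash-Reduce function can be illustrated with a minimal single-machine sketch in Python. This is not the authors' MapReduce implementation: the function name hash_reduce, the use of a dictionary for <KEYHR, VALUEHR>, and the assumption that each incoming value is a set of sample IDs are all illustrative choices, and the multi-file output step is omitted.

```python
def hash_reduce(pairs):
    """Merge <key, value> pairs into hash buckets.

    pairs: iterable of (hash_value, sample_ids), where sample_ids is an
           iterable of sample IDs (an assumption about the Hash-Map output).
    Returns a dict mapping each distinct hash value to the set of all
    sample IDs that share it, playing the role of <KEYHR, VALUEHR>.
    """
    buckets = {}  # <KEYHR, VALUEHR> starts empty
    for key, value in pairs:
        if key not in buckets:
            # First time this hash value is seen: open a new bucket.
            buckets[key] = set(value)
        else:
            # Same hash value as an existing bucket: take the union.
            buckets[key] |= set(value)
    return buckets


# Hypothetical input: three Hash-Map outputs, two of which share a hash value.
pairs = [("h1", {1}), ("h2", {2}), ("h1", {3, 4})]
buckets = hash_reduce(pairs)
# buckets == {"h1": {1, 3, 4}, "h2": {2}}
```

In an actual MapReduce job the framework already delivers all values for one key to the same reduce call, so the explicit membership test above only mirrors the pseudocode; writing each bucket to a file named after its hash value would be a separate output step.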