Parallel Attribute Reduction Algorithm for Complex Heterogeneous Data Using MapReduce

<table class="algorithm-group"><tr><td><table class="algorithm" id="alg2">
<tr><td colspan="2">Input: &lt;KEYHM, VALUEHM&gt;</td></tr>
<tr><td colspan="2">Output: &lt;KEYHR, VALUEHR&gt; // KEYHR is the set of distinct hash values key', and VALUEHR is the set of sample-ID subsets value' that share the hash value key'.</td></tr>
<tr><td colspan="2"> begin</td></tr>
<tr><td colspan="2">  &lt;KEYHR, VALUEHR&gt; = ∅</td></tr>
<tr><td colspan="2">  for &lt;key, value&gt; in &lt;KEYHM, VALUEHM&gt; do</td></tr>
<tr><td colspan="2">   if key does not appear in &lt;KEYHR, VALUEHR&gt;</td></tr>
<tr><td colspan="2">    &lt;key', value'&gt; = &lt;key, value&gt;</td></tr>
<tr><td colspan="2">   else</td></tr>
<tr><td colspan="2">   if key = key'<sub>k</sub></td></tr>
<tr><td colspan="2">    &lt;KEYHR, VALUEHR&gt; = &lt;KEYHR, VALUEHR&gt; - &lt;key', value'&gt;</td></tr>
<tr><td colspan="2">    value'<sub>k</sub> = value'<sub>k</sub> ∪ value // combine samples with the same hash value to obtain the hash bucket</td></tr>
<tr><td colspan="2">   end if</td></tr>
<tr><td colspan="2">   end if</td></tr>
<tr><td colspan="2">   &lt;KEYHR, VALUEHR&gt; = &lt;KEYHR, VALUEHR&gt; ∪ &lt;key', value'&gt;</td></tr>
<tr><td colspan="2">  end for // output to multiple files; each file is named after a hash value and holds one hash bucket</td></tr>
<tr><td colspan="2"> end</td></tr>
</table></td></tr></table>

Complexity


Algorithm 2: Parallel Attribute Reduction Algorithm for Complex Heterogeneous Data Using MapReduce
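
The reduce step in Algorithm 2 can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions, not the authors' Hadoop implementation: the function name `reduce_hash_buckets`, the sample keys `"h1"` and `"h2"`, and the use of Python sets for the sample-ID subsets are all hypothetical, and the multi-file output (one file per hash value) is not shown.

```python
from collections import defaultdict


def reduce_hash_buckets(pairs):
    """Merge <key, value> pairs so that each hash key maps to one bucket.

    `pairs` is an iterable of (hash_key, sample_ids) tuples, standing in
    for <KEYHM, VALUEHM>. The result stands in for <KEYHR, VALUEHR>.
    """
    buckets = defaultdict(set)  # starts empty, like <KEYHR, VALUEHR> = ∅
    for key, value in pairs:
        # An unseen key creates a new bucket; a seen key has its bucket
        # extended by set union (value'_k = value'_k ∪ value).
        buckets[key] |= set(value)
    return dict(buckets)


# Hypothetical map-phase output: hash value -> sample IDs seen by one mapper.
mapped = [("h1", {1}), ("h2", {2}), ("h1", {3, 4}), ("h2", {5})]
buckets = reduce_hash_buckets(mapped)
```

Using a dictionary keyed by hash value replaces the explicit remove-then-re-add of the matching pair in the pseudocode; both leave exactly one bucket per distinct hash value.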