Research Article

Cross-Checking Multiple Data Sources Using Multiway Join in MapReduce

Figure 5

Step 1: apply the anchor-points algorithm. Step 2: exploit fat relations to distribute tuples optimally.