Research Article
Handling Data Skew in MapReduce Cluster by Using Partition Tuning
Table 1
Application characteristics.
| Application | Data type | Input data size (GB) | Frequency of variation of the keys | Average variation in key distribution | Method |
| --- | --- | --- | --- | --- | --- |
| II-1 | Wikipedia | 4 | 61% | 33% | Hadoop, PTSH |
| II-2 | Wikipedia | 4 | 156% | 108% | Hadoop, PTSH |
| WC-1 | RandomWriter | 7.5 | 42% | 136% | Hadoop, PTSH |
| WC-2 | RandomWriter | 7.5 | 125% | 211% | Hadoop, PTSH |
| WC-3 | RandomWriter | 7.5 | 116% | 130% | Hadoop, Closer, LEEN, PTSH |