Research Article

Query Execution Optimization in Spark SQL

Table 1

Comparison between calculated and actual shuffle sizes.

ID                        1      2       3      4      5      6      7      8      9      10     11
General task              9114141197149221477
Input (MB)                256    2688    384    2560   512    384    128    128    256    640    256
Calculated shuffle size   27     282.4   40.3   269    53.81  40.35  13.45  15.05  28.03  67.25  29.06
Actual shuffle size       25     267.58  38.22  254.8  50.96  38.22  12.76  12.76  25.48  63.71  25.48
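
The gap between the calculated and actual shuffle sizes in Table 1 can be expressed as a relative error per ID. The following is a minimal sketch, not part of the original evaluation. It assumes that each row of the table holds eleven values, one per ID, and the helper name `relative_errors` is hypothetical.

```python
# Values transcribed from Table 1. Reading each row as eleven
# per-ID values is an assumption about the table's column boundaries.
calculated = [27, 282.4, 40.3, 269, 53.81, 40.35, 13.45, 15.05, 28.03, 67.25, 29.06]
actual = [25, 267.58, 38.22, 254.8, 50.96, 38.22, 12.76, 12.76, 25.48, 63.71, 25.48]


def relative_errors(calc, act):
    """Return (calculated - actual) / actual for each pair of values."""
    if len(calc) != len(act):
        raise ValueError("both rows must have the same number of values")
    return [(c - a) / a for c, a in zip(calc, act)]


errors = relative_errors(calculated, actual)

# Print one line per ID, with the error shown as a signed percentage.
for query_id, err in enumerate(errors, start=1):
    print(f"ID {query_id}: {err:+.1%}")
```

Under this reading, the calculated size exceeds the actual size for every ID, so the estimate errs on the side of overestimation.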