Research Article

An Efficient MapReduce-Based Parallel Clustering Algorithm for Distributed Traffic Subarea Division

Table 2: Execution time comparison on seven data sets.

Data set            Size (MB)   Execution time (s)
                                K-Means   ParCLARA   Par2PK-Means   Par3PKM

Taxi Trajectory            80        75        662            439       312
                          160       103        756            486       371
                          320       195        893            579       406
                          640      1125       1173            838       614
                         1280      2373       1679           1348      1026
                         2560         -       2721           2312      1879

Iris                       80        32        612            301       213
                          160        54        693            380       279
                          320       116        812            440       306
                          640       685       1045            630       403
                         1280      1800       1248           1005       768
                         2560         -       2463           2013      1420

Haberman's Survival        80        56        576            311       296
                          160        60        675            400       324
                          320       130        823            470       378
                          640       720        987            670       426
                         1280      2010       1321           1200       873
                         2560         -       2449           2200      1719

Ecoli                      80        38        628            330       283
                          160        66        712            400       324
                          320       130        835            460       375
                          640       756       1104            700       482
                         1280      1912       1636           1234       763
                         2560         -       2479           2312      1416

Hayes-Roth                 80        41        568            310       278
                          160        57        643            395       347
                          320       125        726            460       387
                          640       715        973            660       438
                         1280      1980       1479           1211       736
                         2560         -       2423           2120      1567

Lenses                     80        32        624            309       297
                          160        59        701            389       327
                          320       130        924            452       376
                          640       700       1072            545       432
                         1280      1895       1378           1085       814
                         2560         -       2379           2089      1473

Wine                       80        50        635            350       317
                          160        78        705            420       356
                          320       130        835            470       402
                          640       730       1006            610       426
                         1280      2100       1346           1245       843
                         2560         -       2463           2240      1645
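
One way to read the comparison is as ratios of execution times. The minimal sketch below does this for the Taxi Trajectory rows of Table 2 only: for each data size it computes how many times longer ParCLARA and Par2PK-Means take than Par3PKM. The timings are copied from the table; the names `TAXI_TIMES` and `time_ratios` are illustrative and do not come from the paper.

```python
# Execution times in seconds for the Taxi Trajectory rows of Table 2,
# keyed by data size in MB: (ParCLARA, Par2PK-Means, Par3PKM).
# TAXI_TIMES and time_ratios are hypothetical names used for illustration.
TAXI_TIMES = {
    80: (662, 439, 312),
    160: (756, 486, 371),
    320: (893, 579, 406),
    640: (1173, 838, 614),
    1280: (1679, 1348, 1026),
    2560: (2721, 2312, 1879),
}


def time_ratios(times):
    """Return {size: (ParCLARA / Par3PKM, Par2PK-Means / Par3PKM)}."""
    return {
        size: (parclara / par3pkm, par2pk / par3pkm)
        for size, (parclara, par2pk, par3pkm) in times.items()
    }


if __name__ == "__main__":
    # Print one line per data size, smallest first.
    for size, (vs_clara, vs_par2pk) in sorted(time_ratios(TAXI_TIMES).items()):
        print(
            f"{size:>5} MB  ParCLARA/Par3PKM = {vs_clara:.2f}  "
            f"Par2PK-Means/Par3PKM = {vs_par2pk:.2f}"
        )
```

A ratio above 1 means Par3PKM finished sooner at that data size. The same function can be applied to the rows of any other data set in the table.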