Research Article

OpenCL Performance Evaluation on Modern Multicore CPUs

Table 3: Configurations of simple applications.

| Benchmark | Kernel | Global work size | Local work size |
|---|---|---|---|
| Square | Square | 10000, 100000, 1000000, 10000000 | NULL |
| Vectoraddition | vectoadd | 110000, 1100000, 5500000, 11445000 | NULL |
| Matrixmul | matrixMul | 800 × 1600, 1600 × 3200, 4000 × 8000 | 16 × 16 |
| Reduction | reduce | 640000, 2560000, 10240000 | 256 |
| Histogram | histogram256 | 409600 | 128 |
| Prefixsum | prefixSum | 1024 | 1024 |
| Blackscholes | blackScholes | 1280 × 1280, 2560 × 2560 | 16 × 16 |
| Binomialoption | binomialoption | 255000, 2550000 | 255 |
| Matrixmul (naive) | matrixMul | 800 × 1600, 1600 × 3200, 4000 × 8000 | 16 × 16 |
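
To make the configurations in Table 3 concrete, the following sketch computes how many work-groups each smallest configuration launches. It relies on the standard OpenCL relationship that the number of work-groups in each dimension is the global work size divided by the local work size, and on the OpenCL 1.x requirement that the division be exact. The helper name `work_group_count` and the `configs` dictionary are illustrative assumptions, not part of the benchmarks themselves. A local work size of NULL (represented here as `None`) leaves the choice to the OpenCL implementation, so no count can be derived for those rows.

```python
from math import prod


def work_group_count(global_size, local_size):
    """Return the total number of work-groups for an NDRange.

    global_size and local_size are tuples with one entry per dimension.
    local_size may be None (NULL in the OpenCL API), in which case the
    runtime picks the work-group size and None is returned.
    """
    if local_size is None:
        return None
    if len(global_size) != len(local_size):
        raise ValueError("global and local sizes must have the same dimensionality")
    groups_per_dim = []
    for g, l in zip(global_size, local_size):
        # OpenCL 1.x requires each global size to be a multiple of the local size.
        if g % l != 0:
            raise ValueError(f"global size {g} is not divisible by local size {l}")
        groups_per_dim.append(g // l)
    return prod(groups_per_dim)


# Smallest configuration of each benchmark, taken from Table 3.
# (hypothetical container, used only for this illustration)
configs = {
    "Square": ((10000,), None),
    "Matrixmul": ((800, 1600), (16, 16)),
    "Reduction": ((640000,), (256,)),
    "Histogram": ((409600,), (128,)),
    "Prefixsum": ((1024,), (1024,)),
    "Blackscholes": ((1280, 1280), (16, 16)),
    "Binomialoption": ((255000,), (255,)),
}

if __name__ == "__main__":
    for name, (global_size, local_size) in configs.items():
        print(name, work_group_count(global_size, local_size))
```

Under these assumptions, Prefixsum runs as a single work-group of 1024 work-items, whereas the smallest Matrixmul configuration is split into 50 × 100 = 5000 work-groups.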