Figure 16: Performance of Matrixmul with different workgroup sizes on CPUs and GPUs.