Research Article
Utilizing the Double-Precision Floating-Point Computing Power of GPUs for RSA Acceleration
Table 6
Performance of RSA decryption.
| Model | Threads/RSA | Batch Size | Regs/Thread | Threads/Block | Throughput (ops/s) | Latency (ms) |
|---|---|---|---|---|---|---|
| RSA-2048 | 4 × 2 | 14 × 64 | 127 | 512 | 42,211 | 21.22 |
| RSA-2048 | 8 × 2 | 14 × 32 | 127 | 512 | 34,400 | 13.02 |
| RSA-2048 | 4 × 2 | 14 × 32 | 255 | 256 | 31,095 | 14.41 |
| RSA-2048 | 8 × 2 | 14 × 16 | 255 | 256 | 20,744 | 10.80 |
| RSA-3072 | 4 × 2 | 14 × 64 | 127 | 512 | 10,642 | 84.19 |
| RSA-3072 | 8 × 2 | 14 × 32 | 127 | 512 | 12,151 | 36.86 |
| RSA-3072 | 4 × 2 | 14 × 32 | 255 | 256 | 10,555 | 42.44 |
| RSA-3072 | 8 × 2 | 14 × 16 | 255 | 256 | 8,393 | 26.69 |
| RSA-4096 | 4 × 2 | 14 × 64 | 127 | 512 | 821 | 1092.00 |
| RSA-4096 | 8 × 2 | 14 × 32 | 127 | 512 | 5,790 | 77.37 |
| RSA-4096 | 4 × 2 | 14 × 32 | 255 | 256 | 4,022 | 111.39 |
| RSA-4096 | 8 × 2 | 14 × 16 | 255 | 256 | 4,147 | 54.01 |
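The throughput and latency columns of Table 6 appear to be tied together by the batch size: the reported latency matches the time needed to finish one full batch at the reported throughput. The sketch below checks this under the assumption that each Batch Size entry is a product (for example, 14 and 64 read as 14 × 64 = 896 requests). The names `TABLE_6`, `batch_latency_ms`, and `relative_errors` are illustrative and not taken from the article.

```python
# Rows transcribed from Table 6:
# (model, requests per batch, throughput in ops/s, reported latency in ms).
# Assumption: batch sizes are products, e.g. "14 64" is read as 14 * 64.
TABLE_6 = [
    ("RSA-2048", 14 * 64, 42211, 21.22),
    ("RSA-2048", 14 * 32, 34400, 13.02),
    ("RSA-2048", 14 * 32, 31095, 14.41),
    ("RSA-2048", 14 * 16, 20744, 10.80),
    ("RSA-3072", 14 * 64, 10642, 84.19),
    ("RSA-3072", 14 * 32, 12151, 36.86),
    ("RSA-3072", 14 * 32, 10555, 42.44),
    ("RSA-3072", 14 * 16, 8393, 26.69),
    ("RSA-4096", 14 * 64, 821, 1092.00),
    ("RSA-4096", 14 * 32, 5790, 77.37),
    ("RSA-4096", 14 * 32, 4022, 111.39),
    ("RSA-4096", 14 * 16, 4147, 54.01),
]


def batch_latency_ms(batch_size, throughput_ops_s):
    """Time to finish one full batch at the given throughput, in milliseconds."""
    return 1000.0 * batch_size / throughput_ops_s


def relative_errors(rows):
    """Relative gap between the derived and the reported latency for each row."""
    return [abs(batch_latency_ms(b, t) - lat) / lat for _, b, t, lat in rows]


if __name__ == "__main__":
    for (model, batch, tput, lat), err in zip(TABLE_6, relative_errors(TABLE_6)):
        derived = batch_latency_ms(batch, tput)
        print(f"{model} batch={batch:4d} derived={derived:8.2f} ms "
              f"reported={lat:8.2f} ms rel_err={err:.1e}")
```

Under this reading, every derived latency agrees with the reported value to within about 0.1%, which suggests that the latency column measures the completion time of a whole batch rather than of a single decryption.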