Research Article
Utilizing the Double-Precision Floating-Point Computing Power of GPUs for RSA Acceleration
Algorithm 3
DPF-based parallel Montgomery multiplication algorithm: Converting Phase.
Input:
    Thread ID;
    Number of processed limbs per thread;
    Number of threads per Montgomery multiplication, where ...;
    Redundant-format sub-result, where ...;
Output:
    Simplified-format sub-result, where ...;
(1) ...
(2) for ... to ... do
(3)     ...
(4)     ...
(5) end for
(6) while carry of any thread is non-zero do
(7)     ...
(8)     for ... to ... do
(9)         ...
(10)        ...
(11)    end for
(12) end while
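The control flow of the converting phase (a per-thread pass over its own limbs, then repeated rounds while any thread still holds a carry) can be illustrated with a minimal host-side sketch. It is not the authors' kernel: the function names `convert_to_simplified` and `to_int`, the limb width `w`, and the use of Python integers in place of double-precision values are all assumptions made for illustration, and the limbs are assumed to be non-negative and stored least-significant first. Each simulated thread owns `x` consecutive limbs and passes its outgoing carry to the next higher thread.

```python
def to_int(limbs, w):
    """Hypothetical helper: value of a little-endian limb list in radix 2**w."""
    return sum(v << (w * i) for i, v in enumerate(limbs))


def convert_to_simplified(limbs, x, w):
    """Sketch of a redundant-to-simplified conversion (assumed semantics).

    limbs: little-endian list of non-negative ints, each possibly >= 2**w
    x:     limbs owned by each simulated thread (len(limbs) must be a multiple)
    w:     assumed limb width in bits
    Returns (normalized limbs, carry out of the highest thread).
    """
    t = len(limbs) // x              # number of simulated threads
    mask = (1 << w) - 1
    out = list(limbs)

    def local_pass(k, carry_in):
        # Thread k propagates a carry through its own x limbs only.
        c = carry_in
        for i in range(k * x, (k + 1) * x):
            v = out[i] + c
            out[i] = v & mask        # keep the low w bits in the limb
            c = v >> w               # the rest moves to the next limb
        return c

    # First pass: every thread normalizes its limbs with no incoming carry.
    carries = [local_pass(k, 0) for k in range(t)]
    overflow = 0

    # Repeat while the carry of any thread is non-zero.
    while any(carries):
        overflow += carries[-1]              # carry leaving the top thread
        incoming = [0] + carries[:-1]        # thread k receives from k - 1
        carries = [local_pass(k, incoming[k]) for k in range(t)]

    return out, overflow


redundant = [300, 255, 255, 0]               # two threads, two 8-bit limbs each
simplified, overflow = convert_to_simplified(redundant, x=2, w=8)
print(simplified, overflow)
```

In this example the first pass leaves a carry in the lower thread, and a single extra round resolves it. On a GPU the per-thread passes within a round would run concurrently, which is why the loop condition is phrased over the carries of all threads.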