Table of Contents Author Guidelines Submit a Manuscript
Security and Communication Networks
Volume 2017 (2017), Article ID 3508786, 15 pages
Research Article

Utilizing the Double-Precision Floating-Point Computing Power of GPUs for RSA Acceleration

1State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
2Data Assurance and Communication Security Research Center, Chinese Academy of Sciences, Beijing, China
3School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China

Correspondence should be addressed to Fangyu Zheng

Received 23 May 2017; Accepted 8 August 2017; Published 17 September 2017

Academic Editor: Zhe Liu

Copyright © 2017 Jiankuo Dong et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Asymmetric cryptographic algorithm (e.g., RSA and Elliptic Curve Cryptography) implementations on Graphics Processing Units (GPUs) have been researched for over a decade. The basic idea of most previous contributions is exploiting the highly parallel GPU architecture and porting the integer-based algorithms from general-purpose CPUs to GPUs, to offer high performance. However, the great potential cryptographic computing power of GPUs, especially by the more powerful floating-point instructions, has not been comprehensively investigated in fact. In this paper, we fully exploit the floating-point computing power of GPUs, by various designs, including the floating-point-based Montgomery multiplication/exponentiation algorithm and Chinese Remainder Theorem (CRT) implementation in GPU. And for practical usage of the proposed algorithm, a new method is performed to convert the input/output between octet strings and floating-point numbers, fully utilizing GPUs and further promoting the overall performance by about 5%. The performance of RSA-2048/3072/4096 decryption on NVIDIA GeForce GTX TITAN reaches 42,211/12,151/5,790 operations per second, respectively, which achieves 13 times the performance of the previous fastest floating-point-based implementation (published in Eurocrypt 2009). The RSA-4096 decryption precedes the existing fastest integer-based result by 23%.