Research Article

Efficient Parallel Implementation of Matrix Multiplication for Lattice-Based Cryptography on Modern ARM Processor

Table 2

Matrix transpose performance (Unit: ms).

Parameters           NMLC version            Proposed (NEON)
                     (Auto-Vectorization)

536   1024   256     364.2304                0.446443
663   1024   256     630.0066                0.707373
816   1024   384     970.4782                1.78282
952   1024   384     1172.607                2.078113