Abstract

Among cryptographic systems, multivariate signatures are among the most popular candidates because they have the potential to resist quantum computer attacks. Rainbow is a multivariate signature scheme that can be viewed as a multilayer unbalanced Oil-Vinegar system. In this paper, we present techniques for implementing the Rainbow signature on hardware to meet the requirements of efficient high-performance applications. We propose a general architecture for efficient hardware implementations of Rainbow and enhance our design in three directions. First, we present a fast inversion based on binary trees. Second, we present an efficient multiplication based on compact construction in composite fields. Third, we present a parallel solver for systems of linear equations based on Gauss-Jordan elimination. By integrating these major improvements with further minor optimizations, we implement our design in composite fields on standard cell CMOS Application Specific Integrated Circuits (ASICs). The experimental results show that our implementation takes  us and clock cycles to generate a Rainbow signature with the frequency of  MHz. Comparison results show that our design is more efficient than the RSA and ECC implementations.

1. Introduction

The idea of public key cryptography was introduced by Diffie and Hellman. Their method for key exchange came to be known as Diffie-Hellman key exchange [1]. This was the first published practical method for establishing a shared secret key over an authenticated communications channel without using a prior shared secret. Rivest et al. [2] then invented a public key cryptographic scheme, which came to be known as RSA from the inventors' initials. RSA uses exponentiation modulo a product of two very large primes to encrypt and decrypt, and it supports both public key encryption and public key digital signatures. The introduction of elliptic curve cryptography by Koblitz [3] and Miller [4] in the mid-1980s has yielded new public key algorithms based on the discrete logarithm problem. Elliptic curves provide smaller key sizes and faster operations for approximately equivalent estimated security. Since then, various encryption and signature schemes have been developed in the field of public key cryptography.

Efficient implementations of these schemes have played a crucial role in numerous real-world security applications that require confidentiality, authentication, integrity, and nonrepudiation. Since software implementations, even on multicore processors, often cannot provide the required performance level, hardware implementations are a promising solution to the inherent performance issues of public key cryptographic systems, and they also provide greater resistance to tampering. Among hardware implementations of public key cryptographic systems, RSA and elliptic curve systems are the most widely adopted candidates [5–14]. Their security lies in the difficulty of factoring large integers and of the discrete logarithm problem, respectively. Shor's algorithm [15] solves both integer factorization and the elliptic curve discrete logarithm problem in polynomial time on a quantum computer. Such cryptographic schemes are therefore potentially vulnerable to quantum computer attacks.

Multivariate cryptography is one of the most popular families of postquantum cryptography since it has the potential to resist quantum computer attacks [16]. Its main strength is that its underlying mathematical problem, solving a set of Multivariate Quadratic (MQ) polynomial equations over a finite field, is proven to be NP-hard [17]. During the past thirty years, various multivariate cryptographic schemes have been proposed, such as Unbalanced Oil-Vinegar Signature (UOV) [18], Rainbow [19, 20], Tame Transformation Signature (TTS) [21, 22], and others [23–25]. Their implementations have been the subject of extensive research and continue to be a topic of interest in many areas, for example, efficient multivariate systems on Field Programmable Gate Arrays (FPGAs) [26], small multivariate processors on FPGAs [27], high speed Rainbow on FPGAs [28], and minimized multivariate PKC on Application Specific Integrated Circuits (ASICs) [29].

Among the existing multivariate cryptographic schemes, Rainbow belongs to the Oil-Vinegar family and can be viewed as a multilayer unbalanced Oil-Vinegar system. In contrast to RSA and elliptic curves, the security of Rainbow is based on the difficulty of solving a set of MQ polynomial equations, which gives it the potential to resist quantum computer attacks.

Our Contributions. In this paper, we present techniques for implementing the Rainbow signature on hardware to meet the requirements of efficient high-performance applications. We propose a general architecture for efficient hardware implementations of Rainbow and enhance our design in three directions. First, we present a fast inversion based on binary trees, which extends the work in [30]. Second, we present an efficient multiplication based on compact construction in composite fields, which extends the work in [27]. Third, we present a parallel solver for systems of linear equations based on Gauss-Jordan elimination, which builds on the work in [28]. By integrating these major improvements with further minor optimizations, our design is implemented on ASICs and provides significant reductions in time-area product. The comparisons with other public key cryptographic systems show that Rainbow performs well on hardware and is a better candidate than RSA and elliptic curves under quantum computer attacks.

Moreover, with minor modifications, our design can be generalized to support FPGAs. In addition, Rainbow implementations on hardware must be protected against a wide range of attacks, including side channel attacks. A side channel attack is a physical attack, that is, an attack based on information gained from the physical implementation of a cryptographic system rather than on brute force or theoretical weaknesses in the cryptographic algorithm. Therefore, we discuss how to defend Rainbow against a possible differential power analysis, and we present countermeasures against fault analysis and differential power analysis attacks.

Organization. The rest of this paper is organized as follows: Section 2 introduces Rainbow signature schemes. Section 3 presents building blocks for Rainbow schemes. Section 4 presents efficient implementations of Rainbow on ASICs. Section 5 compares our design with other public key cryptographic systems. Section 6 discusses defending against a possible differential power analysis for Rainbow. Section 7 summarizes our design.

2. Preliminary

Among multivariate signatures, Rainbow belongs to Oil-Vinegar family, which can be viewed as a multilayer unbalanced Oil-Vinegar system. The construction of Rainbow includes affine transformation , central map transformation , and affine transformation ; that is,

The hash value of the message of Rainbow is and its size is , where are elements in a finite field. We also suppose that the signature is and its size is , where are elements in a finite field. The private keys of Rainbow are , , and .

Among the existing Rainbow schemes, Rainbow(17, 13, 13) is commonly believed to provide a security level of [20], which works with first-layer Vinegar variables and first-layer and second-layer Oil variables in . This scheme is depicted in Table 1 and is introduced as follows.

We suppose that the hash value of the message is and its size is , where are field elements. We also suppose that the signature is and its size is , where are field elements.

In order to sign a message, we need to solve the equation

To do this, we first solve

is an affine transformation:where is a matrix with the size of and is a vector with the size of . and are parts of private keys.

Second, we solvewhere the construction depends on a map

is a two-layer construction; namely, are divided into two layers:

Similarly, are divided into two layers. and are Vinegar variables and Oil variables of the first layer, respectively; and are Vinegar variables and Oil variables of the second layer, respectively.

MQ polynomials are defined bywhere are Oil and Vinegar variables on this layer and the coefficients , , , , and are parts of private keys.

We randomly choose and evaluate . Then we solve the systems of linear equations on . Then we evaluate and solve the systems of linear equations on .
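The reason a linear solver suffices at this step can be sketched in code: once the Vinegar variables are fixed, each quadratic polynomial of a layer becomes linear in the Oil variables. The following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes a toy field GF(2^4) with reduction polynomial y^4 + y + 1, and the names `alpha`, `beta`, `gamma`, `delta`, and `eta` are hypothetical labels for the five coefficient families.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo y^4 + y + 1 (assumed toy field)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13  # reduce by y^4 + y + 1
    return r


def evaluate(alpha, beta, gamma, delta, eta, v, o):
    """Directly evaluate one Oil-Vinegar polynomial at Vinegar v and Oil o."""
    acc = eta
    for i in range(len(o)):
        acc ^= gf16_mul(delta[i], o[i])                       # linear Oil terms
        for j in range(len(v)):
            acc ^= gf16_mul(alpha[i][j], gf16_mul(o[i], v[j]))  # Oil x Vinegar
    for i in range(len(v)):
        acc ^= gf16_mul(gamma[i], v[i])                       # linear Vinegar terms
        for j in range(i, len(v)):
            acc ^= gf16_mul(beta[i][j], gf16_mul(v[i], v[j]))   # Vinegar x Vinegar
    return acc


def linearize(alpha, beta, gamma, delta, eta, v):
    """Fix the Vinegar values v.

    Returns (row, const) such that the polynomial equals
    sum(row[i] * o[i]) + const for every choice of Oil values o.
    """
    row = list(delta)
    for i in range(len(delta)):
        for j in range(len(v)):
            row[i] ^= gf16_mul(alpha[i][j], v[j])
    const = eta
    for i in range(len(v)):
        const ^= gf16_mul(gamma[i], v[i])
        for j in range(i, len(v)):
            const ^= gf16_mul(beta[i][j], gf16_mul(v[i], v[j]))
    return row, const
```

Stacking one such `row` per polynomial of the layer gives the coefficient matrix of the linear system, and the target values plus the `const` terms give its right-hand side (addition and subtraction coincide in characteristic 2).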

Last, we solve

is an affine transformationwhere is a matrix with the size of and is a vector with the size of . and are parts of private keys.

Then is the signature of .

3. Building Blocks for Rainbow Schemes

Considering Section 2, we see that, in order to generate a Rainbow signature, the following operations are required:(1)Computing affine transformations, that is, , where is a matrix and is a vector(2)Computing central map transformation, that is, evaluating multivariate polynomials and solving systems of linear equations.

Computing these operations requires multiplications, inversions, and solving systems of linear equations in a finite field, which are presented in the following.

3.1. A Fast Inversion Based on Binary Trees

We suppose that and are the elements in and is the inverse of , where the subfield is and , , , and are elements in . The irreducible polynomials in are . Then the inversion is computed as follows:

We adopt a pipelined architecture, which extends the work in [30]. We use two binary trees for computing squares and inversions, which are constructed as follows:
(1) Each binary tree has four layers; root nodes are on the third layer.
(2) Each node has at most two child nodes; the left node represents the value zero and the right node represents the value one.
(3) Each child must either be a leaf or be the root of another tree; each node has a father node when it is not a root node.
(4) Each element in a finite field has a unique traversal from root to leaf.
(5) Most leaves are linked to another leaf.

Figure 1 is the architecture based on binary trees for computing squares and inversions in . We use two architectures in our design, that is, square-trees for squares and inversion-trees for inversions.

Square-trees: we suppose that traversal from root () to leaf () includes tree nodes , , , and , which represents the element in . If traversal from root () to leaf () represents the element in , which is the square of , then is linked to . When we are required to compute the square of , it is very convenient to find its square via traversing the square-trees.

Inversion-trees: we suppose that traversal from root () to leaf () includes tree nodes , , , and , which represents the element in . If traversal from root () to leaf () represents the element in , which is the inverse of , then is linked to . When we are required to compute the inverse of , it is very convenient to find its inverse via traversing the inversion-trees.
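The tree lookups described above can be modelled in software as follows. This is only an illustrative sketch, assuming a toy subfield GF(2^4) with reduction polynomial y^4 + y + 1 and four binary decisions per traversal (most significant bit first); the class `Leaf` and the functions `build_tree` and `lookup` are hypothetical names, and the hardware trees of the paper may be organized differently.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo y^4 + y + 1 (assumed toy field)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


class Leaf:
    """Leaf of a lookup tree; `link` points to the leaf holding the result."""

    def __init__(self, value):
        self.value = value
        self.link = None


def build_tree(op):
    """Build a binary tree over the 16 field elements, linking leaves by `op`."""
    leaves = [Leaf(v) for v in range(16)]
    for leaf in leaves:
        target = op(leaf.value)
        if target is not None:
            leaf.link = leaves[target]

    def subtree(prefix, depth):
        if depth == 4:
            return leaves[prefix]
        # index 0 is the left child (bit 0), index 1 the right child (bit 1)
        return [subtree(prefix << 1, depth + 1),
                subtree((prefix << 1) | 1, depth + 1)]

    return subtree(0, 0)


def lookup(tree, x):
    """Walk from the root to the leaf of x, then follow its link."""
    node = tree
    for shift in (3, 2, 1, 0):
        node = node[(x >> shift) & 1]
    return None if node.link is None else node.link.value


def square(x):
    return gf16_mul(x, x)


def inverse(x):
    for y in range(1, 16):
        if gf16_mul(x, y) == 1:
            return y
    return None  # zero has no inverse, so its leaf stays unlinked


SQUARE_TREE = build_tree(square)
INVERSION_TREE = build_tree(inverse)
```

In this model, the leaf of zero in the inversion tree has no link, which is one way to read the remark that most, but not all, leaves are linked to another leaf.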

Since square-trees and inversion-trees have four layers, we can use them to compute squares and inversions with pipelining. The computation of is presented as follows:(1)Via using square-trees, we can compute and with pipelining.(2)Via using a multiplier, we can compute .(3)Via using a multiplier and an adder, we compute and .(4)Via using inversion-trees, we can compute .(5)Via using a multiplier and an adder, we compute and .(6)The inversion has been computed.
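The six steps above follow the usual pattern of inversion in a composite field: squares and one inversion in the subfield, plus a few subfield multiplications and additions. The sketch below illustrates that pattern under explicit assumptions: the subfield is GF(2^4) with reduction polynomial y^4 + y + 1, the extension is defined by x^2 + x + E with the assumed constant E = 0x8, and a byte stores the high coefficient in its upper nibble. The table `INV16` stands in for the inversion-trees, and these choices are not taken from the paper.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo y^4 + y + 1 (assumed subfield)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


E = 0x8  # assumed constant: x^2 + x + E is irreducible over this GF(2^4)

# Subfield inversion table (plays the role of the inversion-trees).
INV16 = [0] * 16
for _a in range(1, 16):
    for _b in range(1, 16):
        if gf16_mul(_a, _b) == 1:
            INV16[_a] = _b


def comp_inv(a):
    """Invert a = ah*x + al in GF((2^4)^2); a is a byte (ah << 4) | al."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    ah, al = a >> 4, a & 0xF
    ah2 = gf16_mul(ah, ah)                  # step 1: squares in the subfield
    al2 = gf16_mul(al, al)
    prod = gf16_mul(ah, al)                 # step 2: one multiplication
    delta = gf16_mul(ah2, E) ^ prod ^ al2   # step 3: multiply by E and add
    dinv = INV16[delta]                     # step 4: subfield inversion
    bh = gf16_mul(ah, dinv)                 # step 5: two products, one addition
    bl = gf16_mul(ah ^ al, dinv)
    return (bh << 4) | bl
```

Because the two squares and the product in steps 1 and 2 are independent of each other, they are natural candidates for pipelining, which matches the motivation given in the text.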

3.2. An Efficient Multiplication Based on Compact Construction

We suppose that and are the elements in , where , , , and are elements in . We also suppose that is the multiplication result of and , where is an element in and , are elements in . The irreducible polynomials in are . Then the multiplication is computed as follows:

By substituting into (12), we have

The computations of and use a compact construction, which is the extension of the work in [27].

We design components , , and .

. It computes and in , where , , , and are elements in .

. It computes , , and in , where , , , , , and are elements in .

. It performs the computation of right shift and a bit addition.

We adapt four s, three s, and , where and compute additions and multiplications in , respectively.

and are used to computerespectively.

, , and are used to computerespectively.

is used to compute a right shift and a bit addition:

and are used to computerespectively.

The multiplication has been computed.
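To make the structure of a composite-field multiplication concrete, the sketch below multiplies two elements of GF((2^4)^2) using only subfield operations. It is an illustration under assumptions, not the compact construction of the paper: the subfield GF(2^4) uses y^4 + y + 1, the extension uses x^2 + x + E with the assumed constant E = 0x8, and the function names are hypothetical. The second variant saves one subfield multiplication by reusing the products of the high and low parts.

```python
def gf16_mul(a, b):
    """Shift-and-add multiplication in GF(2^4) modulo y^4 + y + 1 (assumed)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a          # bit addition (XOR) of the shifted operand
        b >>= 1             # right shift of the multiplier
        a <<= 1
        if a & 0x10:
            a ^= 0x13       # reduce by y^4 + y + 1
    return r


E = 0x8  # assumed constant: x^2 + x + E is irreducible over this GF(2^4)


def comp_mul_schoolbook(a, b):
    """(ah*x + al)(bh*x + bl) with x^2 = x + E, using four subfield products."""
    ah, al, bh, bl = a >> 4, a & 0xF, b >> 4, b & 0xF
    hh = gf16_mul(ah, bh)
    ch = hh ^ gf16_mul(ah, bl) ^ gf16_mul(al, bh)
    cl = gf16_mul(al, bl) ^ gf16_mul(hh, E)
    return (ch << 4) | cl


def comp_mul(a, b):
    """Same product with three subfield products plus one constant product."""
    ah, al, bh, bl = a >> 4, a & 0xF, b >> 4, b & 0xF
    hh = gf16_mul(ah, bh)
    ll = gf16_mul(al, bl)
    mid = gf16_mul(ah ^ al, bh ^ bl)   # hh + ah*bl + al*bh + ll
    ch = mid ^ ll                      # hh + ah*bl + al*bh
    cl = ll ^ gf16_mul(hh, E)
    return (ch << 4) | cl
```

Multiplying by the fixed constant `E` is cheaper in hardware than a general multiplication, since it reduces to a fixed pattern of shifts and XORs.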

3.3. A Parallel Solving System of Linear Equations Based on Gauss-Jordan Eliminations

We propose a parallel solving system of linear equations based on Gauss-Jordan eliminations, which is the extension of the work in [28]. We give a straightforward description of the proposed algorithm of the parallel variant of Gauss-Jordan elimination in Algorithm 1, where stands for operation performed in the th iteration, and . The optimized Gauss-Jordan elimination with iterations consists of pivoting, inversion, normalization, and elimination in each iteration.

var
: Integer;
begin
;
Pivoting();
repeat
inversion(), Normalization(), Elimination();
Pivoting();
;
until
end.

We enhance the algorithm in four directions. First, multiplication is computed by invoking the efficient multipliers designed in Section 3.2. Second, we adopt the fast inverter described in Section 3.1. Third, inversion, normalization, and elimination are designed to be performed simultaneously. Fourth, during the elimination in each iteration, we simultaneously choose the pivot for the next iteration; namely, if the pivot element of the next iteration is zero, we swap its row with another row that has a nonzero element in the same column. The difference from the usual Gauss-Jordan elimination is that the usual algorithm chooses the pivot after the elimination, while we perform the pivoting during the elimination. In other words, at the end of each iteration, the computational results of this iteration determine the pivoting for the next iteration. By integrating these optimizations, one iteration takes only one clock cycle.
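For reference, the four phases of each iteration (pivoting, inversion, normalization, elimination) can be written as a sequential software sketch. This is not the parallel hardware design: it runs the phases one after another, and it assumes a toy field GF(2^4) with reduction polynomial y^4 + y + 1; the function names are hypothetical.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo y^4 + y + 1 (assumed toy field)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


def gf16_inv(a):
    """Inverse of a nonzero element as a^14, since a^15 = 1 in GF(2^4)."""
    r = 1
    for _ in range(14):
        r = gf16_mul(r, a)
    return r


def gauss_jordan(A, b):
    """Solve A x = b over GF(2^4); return None if A is singular."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for i in range(n):
        # Pivoting: find a row at or below i with a nonzero entry in column i.
        p = next((r for r in range(i, n) if M[r][i] != 0), None)
        if p is None:
            return None
        M[i], M[p] = M[p], M[i]
        # Inversion of the pivot element.
        inv = gf16_inv(M[i][i])
        # Normalization: scale the pivot row so the pivot becomes 1.
        M[i] = [gf16_mul(inv, x) for x in M[i]]
        # Elimination: clear column i in every other row.
        for r in range(n):
            if r != i and M[r][i] != 0:
                f = M[r][i]
                M[r] = [x ^ gf16_mul(f, y) for x, y in zip(M[r], M[i])]
    return [row[n] for row in M]
```

In the hardware design described above, the row updates of the elimination phase are independent of each other, which is what allows them to be computed in parallel cells within a single clock cycle.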

The architecture for solving systems of linear equations in is depicted in Figure 2 with matrix size . There exist three kinds of cells in the architecture, namely, , , and , where and . The cell is for fast inversion. As described in Section 3.1, two binary trees are included in the cell for computed inversion. The cells are for normalization. And the cells are for elimination. The architecture consists of one cell, cells, and cells.

4. Efficient Implementation and Performance Evaluation

Rainbow(17, 13, 13) is computed by invoking affine transformations, polynomial evaluation, and solving systems of linear equations in a finite field. We depict the flowchart of the implementation of Rainbow(17, 13, 13) in Figure 3:
(1) Compute the first affine transformation by invoking matrix-vector multiplication and vector addition.
(2) Evaluate the first multivariate polynomials on the first layer of the central map transformation.
(3) Solve the first system of linear equations of the central map transformation.
(4) Evaluate the second multivariate polynomials on the second layer of the central map transformation.
(5) Solve the second system of linear equations of the central map transformation.
(6) Compute the second affine transformation by invoking matrix-vector multiplication and vector addition.

To demonstrate that the designs of Rainbow(17, 13, 13) are efficient on hardware, we model them in Verilog Hardware Description Language (HDL) and implement them on ASICs. We implement our design in composite fields on TSMC 0.18 μm standard cell CMOS ASICs. We use Synopsys Design Vision, which is a GUI for the Synopsys Design Compiler tools. The map effort is set to medium. We present the experimental results in Tables 2 and 3, which are extracted after place and route.

Tables 2 and 3 show that Rainbow implementation includes two affine transformations with matrix sizes and , respectively, and MQ polynomial evaluations and solving two systems of linear equations with matrix size . Table 3 summarizes the performance of our implementation of Rainbow signature measured in clock cycles, which shows that our design takes only clock cycles to generate a Rainbow signature. In other words, our implementation takes  ns to generate a Rainbow signature with the frequency of  MHz. Among all of the operations, MQ polynomial evaluation occupies most of the executing time.

5. Comparisons with Other Implementations

The works in [5, 6, 26–29] are believed to be the latest RSA, ECC, and multivariate public key cryptographic systems on hardware, respectively. We compare our design with these systems in Table 4. Comparison results show that our design is more efficient than the related implementations.

In addition, the Rainbow implementation in [28] is believed to be the fastest multivariate implementation, and the Rainbow implementation in [27] is believed to be the smallest multivariate implementation. Thus, the implementations in [28] and [27], together with this work, show that Rainbow performs well on hardware and is a better candidate than RSA and elliptic curves under quantum computer attacks.

6. Side Channel Attack Considerations

Cryptographic systems must be protected against a wide range of attacks, including side channel attacks. A side channel attack is a physical attack, that is, an attack based on information gained from the physical implementation of a cryptographic system rather than on brute force or theoretical weaknesses in the cryptographic algorithm. The underlying principle is that side channel information such as power consumption, electromagnetic leaks, timing information, or even sound can provide extra sources of information about secrets in cryptographic systems, for example, cryptographic keys, partial state information, and full or partial plaintexts, which can be exploited to break the cryptographic systems. General classes of side channel attacks include timing analysis [31], power analysis [32], electromagnetic analysis [33], fault analysis [34], acoustic cryptanalysis [35], data remanence analysis [36], and row hammer analysis attacks [37].

Fault analysis attacks manipulate the environmental conditions of cryptographic systems, such as voltage, clock, temperature, radiation, light, and eddy current, to generate faults during secret-related computations, for example, multiplications and inversions in a finite field, and then observe the resulting behavior, which may help a cryptanalyst break the cryptographic systems. Fault analysis attacks can be engineered by simply illuminating a transistor with a laser beam, which causes some bits to assume wrong values. The use of a fault induced during a secret-related computation to guess the secret key has been practically observed in implementations of RSA that use the Chinese remainder theorem [38, 39]. A general fault analysis attack on multivariate public key cryptography (MPKC) schemes is proposed in [40], which recovers partial secret keys from the affine transformations of these schemes.

Power analysis attacks obtain detailed information by observing the power consumption of cryptographic systems and are roughly categorized into Simple Power Analysis (SPA) [41] and Differential Power Analysis (DPA) [32]. In the family of power analysis attacks, DPA is of particular interest; it is a statistical test that examines a large number of power consumption signals to retrieve secret keys. A differential power analysis attack on SFLASH is proposed in [42], which recovers secret keys through the SHA-1 module of the SFLASH scheme. A side channel attack on enTTS has been proposed in [43], which uses differential power analysis and fault analysis to attack the two affine transformations and the central map transformation. The method in [43] is shown to obtain all secret keys of enTTS.

Since the construction of Rainbow includes two affine transformations and central map transformation, such methods in [40, 42, 43] have the potential to obtain its secret keys. Thus, we discuss defending against a possible side channel attack for Rainbow and the countermeasure is described in the following:(1)We suppose that is the message and each element of is in .(2)We take a random vector ; the elements of are in .(3)We compute .(4)We compute and , where is a matrix and is a vector with size .(5)We compute , which is equivalent to .(6)The first affine transformation has been computed; then we take random bytes for Vinegar variables.(7)We double check the random bytes to protect against fault analysis attacks.(8)We compute the multivariate polynomial evaluations and solving systems of linear equations until the central map transformation is completed.(9) is the result of central map transformation; then we take two random vectors and , where and the elements are in .(10)We compute and , where is a matrix and is a vector with size .(11)We compute , which is equivalent to .(12) is the Rainbow signature of .
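The masking idea behind steps (1) to (5) can be illustrated as follows. Because the symbols of the steps are not reproduced here, this is only a sketch under assumptions: it assumes additive (XOR) masking of the input of an affine map over a toy field GF(2^4) with reduction polynomial y^4 + y + 1, and the names `affine` and `masked_affine` are hypothetical. The point is that the unmasked input is never fed to the multiplier, yet the final result is unchanged because the map is linear up to its constant vector.

```python
import secrets


def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo y^4 + y + 1 (assumed toy field)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


def affine(A, b, y):
    """Compute A*y + b over GF(2^4)."""
    out = []
    for row, b_i in zip(A, b):
        acc = b_i
        for a_ij, y_j in zip(row, y):
            acc ^= gf16_mul(a_ij, y_j)
        out.append(acc)
    return out


def masked_affine(A, b, y):
    """Compute A*y + b without ever multiplying A by the unmasked y."""
    r = [secrets.randbelow(16) for _ in y]          # fresh random mask
    y_masked = [y_i ^ r_i for y_i, r_i in zip(y, r)]
    t_masked = affine(A, b, y_masked)               # A*(y + r) + b
    t_mask = affine(A, [0] * len(b), r)             # A*r
    # In characteristic 2, A*(y + r) + b + A*r equals A*y + b.
    return [u ^ w for u, w in zip(t_masked, t_mask)]
```

A fresh mask should be drawn for every signature; reusing a mask would let an attacker average it out across power traces.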

The work in [40] uses fault analysis to attack the random bytes in central map transformations; thus we double check the random bytes to protect against fault analysis attacks. The work in [42] uses differential power analysis to attack the SHA-1 module; thus we adopt a method to protect the affine transformations. However, the countermeasure described above is theoretical; it remains to be implemented and verified on hardware.

7. Conclusions

In this paper, we present techniques for implementing the Rainbow signature on hardware to meet the requirements of efficient high-performance applications. We propose a general architecture for efficient hardware implementations of Rainbow and enhance our design in three directions. First, we present a fast inversion based on binary trees. Second, we present an efficient multiplication based on compact construction in composite fields. Third, we present a parallel solver for systems of linear equations based on Gauss-Jordan elimination. By integrating these major improvements with further minor optimizations, we implement our design in composite fields on TSMC 0.18 μm standard cell CMOS ASICs. We use Synopsys Design Vision with the map effort set to medium. With minor modifications, our design can be generalized to support FPGAs.

The experimental results show that Rainbow implementation includes two affine transformations with matrix sizes and , respectively, and MQ polynomial evaluations and solving two systems of linear equations with matrix size . Our implementation takes  ns and clock cycles to generate a Rainbow signature with the frequency of  MHz. Among all of the operations, MQ polynomial evaluation occupies most of the executing time. Comparison results show that our design is more efficient than the related implementations.

Moreover, the implementations of a fast Rainbow, a small Rainbow, and this work show that Rainbow has a good performance on hardware and is a better candidate than RSA and elliptic curves under quantum computer attacks.

Besides, Rainbow implementations must be protected against a wide range of attacks, including side channel attacks. We discuss defending against a possible side channel attack for Rainbow and we present countermeasures against fault analysis and differential power analysis attack.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

Acknowledgments

The author acknowledges Shenzhen Science and Technology Program under Grants no. JCYJ20170306144219159 and no. JCYJ20160428092427867; Science and Technology Program of Shenzhen Polytechnic (no. 601722K20018); and Special Funds for Shenzhen Strategic Emerging Industries and Future Industrial Development (no. 20170502142224600).