Improved Masking Multiplication with PRGs and Its Application to Arithmetic Addition

Wang, Bohan; Sui, Qian; Ji, Fanjie; Guo, Chun; Wang, Weijia

doi:https://doi.org/10.1049/2024/5544999

IET Information Security


Research Article | Open Access

Volume 2024 | Article ID 5544999 | https://doi.org/10.1049/2024/5544999


Bohan Wang,¹ Qian Sui,¹ Fanjie Ji,¹ Chun Guo,¹,²,³ and Weijia Wang¹,²,⁴

Academic Editor: Qichun Wang

Received 17 Jul 2023

Revised 11 Nov 2023

Accepted 18 Nov 2023

Published 18 Jan 2024

Abstract

At Eurocrypt 2020, Coron et al. proposed a masking technique allowing the use of random numbers from pseudo-random generators (PRGs) to largely reduce the use of expensive true-random generators (TRNGs). For security against probes, they describe a construction using PRGs, each of which is fed with at most random variables in a finite field, resulting in a randomness requirement of . In this paper, we improve the technique on multiple fronts. On the theoretical level, we push the limits of the randomness requirement by providing an improved masking multiplication using only PRGs, each of which is fed with random variables, saving more than half of the random bits. On the practical level, considering that the masking of arithmetic addition usually requires more randomness than multiplication, we apply the technique to the algorithm proposed at FSE 2015, a very efficient scheme performing arithmetic addition modulo . This significantly reduces the randomness cost of masked arithmetic addition and further supports the advantage of masking with PRGs. Furthermore, we apply our masking scheme to SPECK, XTEA, and SPARKLE, and provide the first (to the best of our knowledge) higher order masked implementations of ciphers using the ARX structure.

1. Introduction

Side-channel attacks (SCAs) [1, 2] exploit the physical leakage (e.g., timing information, power consumption, or electromagnetic leaks) of cryptographic implementations. Masking is a popular countermeasure against SCAs, whose concept is to randomly divide every variable (say, ) into shares such that the joint distribution of any shares is independent of . This is known as -probing (a.k.a. -private) security, and is called the security order. Notably, for the popular Boolean masking, we have with the addition over (a.k.a. bitwise XOR). Besides, it has been proved that -probing security ensures that the information an adversary can exploit decreases exponentially with [3]. Mainstream masking schemes use a gate-by-gate approach that transforms each elementary operation (e.g., addition and multiplication over ) into its masked counterpart, called a gadget, around which a flourishing literature has emerged in recent years.
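As an illustration of Boolean masking, the following minimal sketch splits a value into shares whose XOR equals the value and recombines them. The function names `share` and `unshare`, the word width `k`, and the use of Python's `secrets` module as a randomness source are assumptions made for this example only.

```python
import secrets
from functools import reduce
from operator import xor


def share(x, n, k=8):
    """Split a k-bit value x into n Boolean shares whose XOR equals x."""
    # The first n-1 shares are uniformly random k-bit words.
    shares = [secrets.randbits(k) for _ in range(n - 1)]
    # The last share is chosen so that the XOR of all n shares is x.
    shares.append(reduce(xor, shares, x))
    return shares


def unshare(shares):
    """Recombine Boolean shares by XORing them together."""
    return reduce(xor, shares, 0)
```

Any proper subset of the shares produced this way is uniformly distributed, which is the property the probing model builds on.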

One of the most groundbreaking works on designing masking schemes is that of Barthe et al. [4]. Instead of proving the security of a full implementation at once, this work introduces the composable security notions called noninterference/strong noninterference (NI/SNI). The composable security notions allow proving the security of smaller gadgets in terms of composability with other masked circuits. Later, Cassiers and Standaert [5] proposed a new composable security notion called probe-isolating noninterference (PINI), enabling a more straightforward composition of gadgets. That is, gadgets fulfilling the PINI notion can be freely composed with each other without affecting their SCA resistance.

Coron et al. [6] proposed a special technique called locality of randomness subsets, allowing the use of multiple PRGs to reduce the randomness cost by setting proper randomness subsets for each gadget. According to it, if all gadgets are SNI-R/PINI-R as defined in [6], we can securely use -wise PRGs [7] to generate the random bits for the gadgets and keep an equivalent security in the probing model, even in the worst case where the adversary can get the variables in a PRG with one probe. Then, we can reuse the random seeds of -wise PRGs in different gadgets based on the locality of the subsets, which significantly reduces the randomness cost. In [6], ISWAND [8] has been proved SNI-R with -local use of subsets. Furthermore, two better SNI-R AND algorithms are given in [6] with -local use subsets and -local use subsets.

When a cryptographic algorithm involves arithmetic addition operations (e.g., the add-rotate-xor (ARX)-based block ciphers such as XTEA [9] and SPECK [10], the hash functions SHA-1 and SHA-2, and the NIST lightweight cryptography finalist SPARKLE [11, 12]), transforming elementary operations becomes intricate because of the higher algebraic degree of the arithmetic addition. At CHES 2001, Goubin [13] described a very elegant algorithm for converting between shares and shares such that: with the arithmetic addition. Afterward, a series of works focused on designing better conversion algorithms [13–19]. At FSE 2015, Coron et al. [20] described an improved algorithm performing arithmetic addition modulo with complexity that integrates the conversions of both directions. Although Coron's algorithm is very efficient, it is not provably secure under any composable security notion and is thus risky to use in larger composed computations. (There do exist other approaches that focus only on the one-directional conversion from Boolean to arithmetic masking, with somewhat better complexities [18, 21, 22]. Despite their prospective applications in many scenarios such as masked postquantum cryptography [23, 24], it is intricate to apply them to arithmetic addition, which requires conversions in both directions.) A high-order arithmetic addition algorithm based on [20] was later proposed in [25], but it only satisfies NI security.

1.1. Our Contributions

Following [6], our main contributions are a new security notion allowing multiple PRGs and a more efficient masked -order AND algorithm with -local use of randomness subsets, based on [5], which satisfies the new notion. Besides, we consolidate the work on masked arithmetic addition by improving the existing work of Coron et al. [20]. Our contributions can be summarized as follows.

1.1.1. A New Security Notion Allowing the Use of Multiple PRGs

We extend the composable security notion called PINI to allow a more efficient use of multiple PRGs than the work in [6]. This brings a new notion called PINI-extension (PINI-E). We also describe the deduction from security in PINI-E to security in the probing model. We illustrate the usage of PRGs for gadgets in Figure 1.

1.1.2. A New Algorithm for Bitwise Multiplication

We propose a new -order AND algorithm with random bits and security, where we apply the PINI trick proposed in [5]. Besides, we can keep its -local use of randomness subsets. We compare the works proposed in [6] with ours in terms of locality and randomness subsets in Table 1.

1.1.3. Application to Arithmetic Addition

Based on the methodology of Coron et al. [20], we provide an algorithm for higher order masked arithmetic addition and describe applications of our countermeasure to SPECK, XTEA, and SPARKLE. We implement masked round functions on the ARM Cortex M3 architecture at the assembly level and report the performance results. Notably, to the best of our knowledge, these are the first implementation results of higher order masking for ciphers using the ARX structure.

1.2. Organization

The rest of this paper is organized as follows. We present notations and background in Section 2. We describe the new AND algorithms and give the necessary proofs in Section 3. Section 4 presents the arithmetic algorithm, including its description, related proofs, and randomness cost. The implementations of the arithmetic algorithm are given in Section 5. Finally, we conclude our work in Section 6.

2. Preliminaries

2.1. Notations

Let be a field with characteristic two. Let be the field addition over (aka., bitwise XOR), and be bitwise AND operation. We denote a set of variables by , and particularly, if , we denote the set of variables by . In addition, we use to denote a set of variables whose indices are contained in and denote the size of indices set by . Let be addition modulo . For any , let . Let . Let for .

In a matrix A, we define as the element at the -th row and -th column. For matrices and , let where for . Let be a sequence in a matrix for . And, let for .

2.2. Private Circuits

In this part, we describe some definitions regarding the private circuit proposed in [8]. A circuit is a directed acyclic graph with gates as vertices and wires as edges, respectively, where every wire carries a variable in , and each gate represents an elementary calculation over . We recall the definition of private circuit proposed at [26] below.

Definition 1 (private circuit [26]). A private circuit for is defined by a triple , where
(1) is a randomized circuit called input encoder. It maps each input to independent shares.
(2) is a randomized circuit with inputs and outputs over .
(3) is a circuit called decoder. It maps the outputs ( shares) of to the original outputs of the private circuit.
Moreover, a private circuit is called a -private (or -probing secure) circuit if it satisfies the requirements below:
(1) Correctness: for any input , ;
(2) Privacy: for any and any set of at most wires in , the distributions of and are identical, where refers to the values of variables in with input .

Although the definition of a private circuit nicely captures protection against SCAs, proving a large circuit (such as the AES) to be -private is nontrivial, since the number of possible tuples of wires grows exponentially with the circuit size. To cope with this issue, Ishai et al. [8] proposed a gate-by-gate approach that transforms each gate separately into its masked counterpart circuit, called a gadget, and composes the gadgets to obtain the private circuit. A gadget is a circuit with shares as inputs and outputs.

The first -probing secure bitwise AND gadget (which implements the bitwise AND operation over in the masked domain), named ISWAND, was proposed by Ishai et al. [8] at CRYPTO 2003. We give an example of it in the following:

Meanwhile, we give a in Figure 2 to express the construction of its randomness, which will appear in Section 3 again.

We can verify that the sum of all s is the bitwise AND of and . Note that the order of the calculation is strict. For instance, at line , is calculated before XORing .
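The ISW construction can be sketched as follows for Boolean shares of `k`-bit words. This is a minimal illustration of the classical gadget from [8] as it is usually presented, not a transcription of the example above; the function name `isw_and` and the parameter `k` are assumptions of this sketch. The parenthesization in the computation of `r_ji` reflects the strict calculation order mentioned above.

```python
import secrets
from functools import reduce
from operator import xor


def isw_and(a, b, k=8):
    """ISW-style masked AND: returns shares c with XOR(c) = XOR(a) & XOR(b)."""
    n = len(a)
    # Start with the diagonal terms a_i & b_i.
    c = [a[i] & b[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r_ij = secrets.randbits(k)  # fresh random word
            # Order matters: r_ij is XORed with a_i & b_j before a_j & b_i,
            # so the unmasked cross-term sum is never computed.
            r_ji = (r_ij ^ (a[i] & b[j])) ^ (a[j] & b[i])
            c[i] ^= r_ij
            c[j] ^= r_ji
    return c
```

For n shares this uses n(n-1)/2 fresh random words, which is the randomness cost that PRG-based approaches aim to reduce.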

2.3. Composable Security Notions and Extensions

Note that one has to insert many refreshing gadgets to compose -private gadgets securely, significantly increasing the randomness cost. Barthe et al. [4] proposed the concept of composable security notions NI/SNI that enable the composition without refreshing gadgets. Below, we recall the definitions of NI/SNI proposed at [4].

Definition 2 (NI/SNI [4]). Let be a gadget taking as inputs and returning . The gadget is NI (resp., SNI) secure if and only if for any set of intermediate variables and any subset of output indices such that , there exists sets and of input indices with and (resp., and ), such that the intermediate variables and the output variables can be perfectly simulated from and .

Besides, there is an updated definition called SNI-R proposed in [6], which is used in the situation where a randomness subset in a gadget can be obtained with a single probe.

Definition 3 (SNI-R [6]). Let be a gadget with input shares and , output shares . Let be subsets of the randoms used by . The gadget is SNI-R if and only if for any set of intermediate variables, any subset of output indices and any subset , such that , the intermediate variables, the output variables , and all for can be perfectly simulated from the knowledge of and with and .

However, the security of the trivial composition of several private circuits is not evident. More precisely, even SNI circuits cannot keep their security under trivial composition. To mitigate this issue, Cassiers and Standaert [5] proposed a new composable security notion called PINI, with which we can concentrate on the proof of every single gadget, and the global security can be directly deduced. We recall it in the following.

Definition 4 (PINI [5], adapted). Let be a gadget with input shares and output shares . The gadget is PINI if for any , any set of intermediate variables and any subset of output indices, there exists a subset of input indices with such that the intermediate variables and the output shares can be perfectly simulated from the input shares and . (The original PINI security is defined for an arbitrary number of inputs; we provide a fan-in 2 version in this paper.)

Meanwhile, Cassiers and Standaert [5] provided a gadget construction called double-SNI, which can turn SNI gadgets into a PINI one.

Definition 5 (double-SNI [5]). Let G be an SNI gadget taking as input and output . Let R be an SNI gadget taking as input and output . The composite gadget G’ taking as input , and output with is PINI.

To reduce the randomness cost of a large circuit, there is an adapted definition called PINI-R proposed at [6] which also assumes the adversary can get a randomness subset with a single probe.

Definition 6 (PINI-R [6]). Let be a gadget with input shares and output shares . Let be subsets of the randoms used by G. The gadget is PINI-R if for any , any set of intermediate variables, any subset of output indices and any subset , there exists a subset of input indices with such that the intermediate variables, the output shares , and the randoms for can be perfectly simulated from the input shares and .

2.4. Masking with Randomness from PRGs

In this part, we recall some definitions for gadgets using randomness generated from PRGs and the corresponding PRGs.

2.4.1. Locality of Randomness and Its Application

First of all, we introduce the locality of a randomness subset proposed in [6], which describes the extent to which the randoms are reused. It determines the PRGs used for the subset.

Definition 7 (-local randomness subset [6], adapted). Let be a gadget and be a randomness subset used by . We say that is -local use if any intermediate variable of is related to at most elements of .

With the definition of locality, we propose a security definition weaker than PINI-R which also keeps composability and -private security in the same extended probing model as PINI-R. We call it PINI-E (short for PINI-Extension) and provide the definition in the following.

Definition 8 (PINI-E). Let G be a gadget with input shares and output shares . Let be subsets of the randoms used by G with , and each is local use. The gadget G is PINI-E if for any set of intermediate variables, any subset of output indices and any set of randomness subsets with , there exist subsets with and such that the intermediate variables, the subsets and the output shares can be perfectly simulated from the input shares and .

Obviously, PINI-E is an extension of PINI which allows probing a whole subset of randomness with a single probe. Compared with PINI-R, PINI-E security does not need to simulate , so a PINI-E algorithm is easier to construct. Intuitively, however, its number of randomness subsets is not bounded as in PINI-R. We introduce the -private security and composability of PINI-E in the following and provide the proofs in Appendix A as supporting information.

Theorem 1 (security of PINI-E). Let G be a gadget with input shares and output shares . Let be a partition of the randomness used by G. If G is PINI-E with randoms , then G is -private secure in an extended model of security where the adversary can get each with a single probe.

Theorem 2 (composability of PINI-E). Let be implementations of with randomness subsets . The composite gadget made of is PINI-E with randomness subsets .

We stress that the in Theorem 2 are implementations of the same . Meanwhile, we provide a proposition about the composition of gadgets implementing different algorithms, which we also prove in Appendix A as supporting information.

Proposition 1. Let be implementations of with randomness . The composite gadget made of is PINI-E with the same randomness subsets.

Proposition 1 shows why PINI-E is weaker than PINI-R, since the composition of PINI-R gadgets keeps the number of randomness subsets regardless of the circuit size. We mention that the composability of PINI-E is theoretically limited in the situation where more than one kind of gadget is used to replace the same gates in the unprotected circuit and all these gadgets use the same PRG (e.g., two kinds of multiplication gadgets are used in one circuit, and both of them use the same PRG), which barely happens in reality. In Table 1, we compare the security of PINI, PINI-R, and PINI-E. The remaining part is the construction of the masked implementation with the locality property. We recall the mask refreshing named locality refreshing (LR) from the study of Ishai et al. [7], which keeps a small locality for each gadget, in Algorithm 1.

Algorithm 1: Locality refreshing (LR).
Input: shares
Output: shares
It ensures
1:
2: for do
3:
4:
5:
6: end for
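A refresh of this kind can be sketched as follows. This is a minimal illustration that assumes the simple scheme in which each of the first n-1 shares is masked with its own fresh random word, and the same word is folded into the last share; the function name `locality_refresh` and the parameter `k` are hypothetical, and the exact steps of Algorithm 1 may differ.

```python
import secrets
from functools import reduce
from operator import xor


def locality_refresh(x, k=8):
    """Re-randomize Boolean shares x without changing their XOR (assumed scheme)."""
    n = len(x)
    y = list(x)
    for i in range(n - 1):
        r = secrets.randbits(k)  # one fresh random word per iteration
        y[i] ^= r                # mask share i
        y[n - 1] ^= r            # cancel the mask in the last share
    return y
```

In this sketch each random word forms its own randomness subset, so every intermediate value touches at most one element of any given subset.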

Lemma 1. The gadget LR is PINI-E with .

Proving Lemma 1 is equivalent to proving that LR is PINI, which has been done in [6], because each subset in the division of in Lemma 1 is exactly a single random .

Theorem 3 (locality composition with randomness subset [6], adapted). Let be a set of 2-input gadgets with randomness subsets , each of which makes an -local use. Consider the gadgets where the inputs and output of each is locality refreshed with randoms , and for . Any composite gadget made of makes an -local use of randomness , and for all , it makes a 1-local use of the randoms in .

2.4.2. Application of Multiple PRGs

We recall the definition of -wise independent PRG, which can be much more efficient than traditional PRGs.

Definition 9 (-wise independent PRG [7], adapted). A function : is an -wise independent PRG if any subset of its outputs is independently and uniformly distributed when the input is uniformly distributed. (We adapt the elements in each subset to those in , while the original definition in [7] was in .)

Here, we describe two -wise PRGs called and proposed at [6]. The parameter of can be set as any positive integer while that of is fixed as three. However, the running efficiency of is much higher than that of .

We define as follows: where and:

is an -wise PRG because there is a bijection between the coefficients of and its evaluation at distinct points [6]. For instance, can output at most bits of randomness when given bit seeds over .
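The bijection argument can be checked exhaustively on a toy field. The sketch below evaluates a polynomial whose coefficients are the seed at distinct points of GF(2^3) and verifies that every choice of r outputs takes each possible value tuple exactly once over all seeds. The field size, the reduction polynomial x^3 + x + 1, and the names `gf_mul`, `poly_prg`, and `is_r_wise_independent` are assumptions of this example; the PRG in [6] operates on a larger field.

```python
from itertools import product, combinations
from collections import Counter

M = 3          # toy field GF(2^3)
MOD = 0b1011   # x^3 + x + 1, irreducible over GF(2)


def gf_mul(a, b):
    """Multiply two elements of GF(2^M) (shift-and-add with reduction)."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        a <<= 1
        if a >> M:      # degree reached M: reduce modulo MOD
            a ^= MOD
        b >>= 1
    return res


def poly_prg(seed, points):
    """Evaluate the polynomial with coefficients seed[0..r-1] at each point."""
    out = []
    for p in points:
        acc = 0
        for coeff in reversed(seed):   # Horner's rule
            acc = gf_mul(acc, p) ^ coeff
        out.append(acc)
    return out


def is_r_wise_independent(r, points):
    """Check that any r outputs are jointly uniform over all r-element seeds."""
    q = 1 << M
    for idx in combinations(range(len(points)), r):
        counts = Counter()
        for seed in product(range(q), repeat=r):
            out = poly_prg(seed, points)
            counts[tuple(out[i] for i in idx)] += 1
        # Uniform means each of the q**r value tuples occurs exactly once.
        if len(counts) != q ** r or set(counts.values()) != {1}:
            return False
    return True
```

Calling `is_r_wise_independent(2, list(range(8)))` enumerates all 64 seeds for every pair of output positions, which is only feasible because the field is tiny.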

We define another PRG as follows:

This PRG is based on the expander graph used in [7]. It can generate randoms by bit seeds. It is much more lightweight (with only XOR operations) than . In [6], it is proved to be a 3-wise PRG, as recalled in Lemma 2.

Lemma 2 (see [6]). The randomized function is a 3-wise independent PRG.

Then, we introduce the security of masking with multiple PRGs in Theorem 4 proposed at [6], where we can keep -local gadgets secure when multiple PRGs are used to generate the random elements. This reduces the randomness cost efficiently.

Theorem 4 (security with multiple PRGs [6], adapted). Suppose is a -private implementation of f with encoder and decoder , where the circuit uses for each , random elements and makes an -local use of , are the inputs of , and the adversary can obtain with a single probe. Let be a linear -wise independent PRG. Then, the circuit denoted by is a -private implementation of with encoder and decoder , which uses random elements.

2.5. Coron’s Work on Masked Arithmetic Addition

Coron et al. [20] introduced a new algorithm to convert from arithmetic masking to Boolean masking, which builds on Theorem 5. This algorithm uses the Kogge-Stone carry look-ahead adder [27] to replace the classical ripple-carry adder, which reduces the complexity from (in a previous work [15]) to .

Theorem 5 (see [20]). Let , . Define the sequence of -bit variables and , with and , and: for . Then .
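The carry look-ahead idea behind this theorem can be illustrated on unmasked values. The sketch below assumes the standard Kogge-Stone recurrences, with a propagate word P initialized to x XOR y and a generate word G initialized to x AND y, both updated with shifts that double in every round; the function name `kogge_stone_add` and the way the round count is derived are choices of this example and are not taken from the statement above.

```python
def kogge_stone_add(x, y, k):
    """Compute (x + y) mod 2^k using only AND, XOR and shifts."""
    mask = (1 << k) - 1
    p = x ^ y   # propagate bits
    g = x & y   # generate bits
    shift = 1
    # Rounds use shifts 1, 2, 4, ... until the window covers k-1 bits,
    # so the number of rounds is logarithmic in k.
    while shift < k - 1:
        g = (p & ((g << shift) & mask)) ^ g   # uses the previous round's p
        p = p & ((p << shift) & mask)
        shift *= 2
    # The carry into bit j is bit j-1 of the final generate word.
    return (x ^ y ^ (g << 1)) & mask
```

Only the AND operations are nonlinear over GF(2), so a masked version needs a multiplication gadget for those steps while XORs and shifts can be applied share by share.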

In Section 4, we propose a new algorithm that expands the security order from to any based on Theorem 5. Besides, we prove its security and locality in Appendix B as supporting information. In Section 3, we give a new AND algorithm with 1-local use of its randomness subsets for odd and use it in the new arithmetic algorithm.

3. The New Masked AND Gadget

In this section, we introduce a new masked AND algorithm with lower locality, as well as its security and corresponding proof.

3.1. The Description of the New Algorithm

We describe our new algorithm with odd in Algorithm 2, which is provably secure in with 1-local use of randomness subsets. We will prove its security and locality in the next section. Algorithm 3 provides a PINI trick proposed in [5] keeping PINI (and ). Note that the inputs and output of (short for PINI-PART) are explained in Figure 3.

Algorithm 2: The new masked AND gadget.
Input: shares and
Output: shares
It ensures
1:
2: for do
3:
4: for do
5: if then
6:
7:
8: end if
9: if then
10:
11:
12: end if
13: end for
14: end for
15: for do
16: for do
17: if then
18:
19: end if
20: if then
21:
22: end if
23: end for
24: end for
25: for do
26:
27:
28: end for

Algorithm 3: The PINI trick (PINI-PART).
Input: input shares , random and an intermediate variable
Output: the intermediate variable
It ensures
1:
2:
3:
4:

Intuitively, PINI security does not allow the leakage of more than one input index with one probe, and the PINI trick (i.e., Algorithm 3) avoids these leakages in the multiplication gadgets by changing the operation order of multiplying secrets (i.e., and in Algorithm 3) and adding randoms (i.e., in Algorithm 3). In comparison, ISWAND calculates directly and thus it is not PINI.
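The reordering can be illustrated with the identity commonly associated with the PINI multiplication of [5]: a masked cross product is obtained by adding the random to one operand before multiplying, so that the bare product of two shares with different indices never appears as an intermediate value. The sketch below is an assumption about the general shape of the trick; the name `pini_partial_product` is hypothetical and the exact steps of Algorithm 3 may differ.

```python
def pini_partial_product(a_i, b_j, r, k=8):
    """Return (a_i & b_j) ^ r without ever computing a_i & b_j on its own."""
    mask = (1 << k) - 1
    not_a = a_i ^ mask      # bitwise complement of a_i on k bits
    u = not_a & r           # depends only on share index i and the random
    v = a_i & (b_j ^ r)     # b_j is blinded by r before it meets a_i
    return u ^ v
```

Expanding the expression gives r XOR (a_i AND r) XOR (a_i AND b_j) XOR (a_i AND r), which simplifies to (a_i AND b_j) XOR r.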

The intermediate step intuitively defines a partial order among the intermediate variables. We use this definition in the proof of , which is given in Appendix B as supporting information.

Definition 10. (Intermediate step). Let and be the intermediate variables of gadget G. We define as the intermediate step of if some exists for or .

Theorem 6. is with randomness subsets and for .

3.2. The Randomness Reuse of

We mention that is 1-local use of randomness subsets in Theorem 7. And we can build a gadget G with which always keeps its 1-local use of randomness subsets by Theorem 3. And, if G satisfies the -probing security in Theorem 4, we can use -wise PRGs to generate all and in .

Theorem 7. is 1-local use of and for .

The proof of Theorem 7 is obvious, because all randoms in each or appear only once in each . This is exactly the definition of the locality of randomness subset.

3.3. Discussion for the Randomness of

We have proven in the previous subsections that a PINI-E gadget is -probing secure with PRG-generated randoms and almost trivial composability, and that the PRGs are required to be -wise if the randoms are -local use. Also, we provide the construction of -wise PRGs with arbitrary . Moreover, we have proven that is PINI-E. Thus, the randomness of is theoretically indistinguishable from that of a -probing secure AND gadget with TRNG-generated randoms if the PRGs of are -wise.

To validate the impact of the randomness on the practical security, we run and another multiplication gadget proposed in [28] on a ChipWhisperer STM32F4 UFO target board and collect the power traces with a PicoScope 5244D at a sampling rate of 125 MS/s. Besides, we perform Welch's t-test with 10,000 executions, whose randoms are generated by PRGs () and TRNGs (the AND gadget proposed in [28]), respectively, to compare the randomness of the PRG implementation with that of the TRNG one. Figure 4 depicts the t-test results for , and Figure 5 shows the result for the other gadget with the randomness from TRNGs.
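For reference, Welch's t statistic for two groups of measurements at a single sample point can be computed as in the following minimal sketch. The function name `welch_t` is an assumption of this example, and the code is not the evaluation script used for the figures; in a leakage assessment the statistic would be computed independently for every time sample of the traces.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    # statistics.variance is the unbiased sample variance (divides by n-1).
    var_a = variance(group_a) / len(group_a)
    var_b = variance(group_b) / len(group_b)
    return (mean(group_a) - mean(group_b)) / sqrt(var_a + var_b)
```

A statistic close to zero indicates that the two groups are hard to distinguish at that sample point with the given number of traces.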

4. Application to Arithmetic Addition

In this section, we use the gadget in an arithmetic addition algorithm proposed in [25], which is costly in randomness with previous multiplication gadgets.

Our description is structured top-down. All gadgets presented in this subsection are PINI-E, and we defer the security proofs to Appendix B as supporting information. First of all, we describe the algorithm to perform addition operations directly on the masked shares, which is similar to the algorithm proposed in [25], but we add some constructions so that it can use multiple PRGs. More precisely, we receive the shares and satisfying and as inputs, and the goal is to compute satisfying . Note that our new algorithm is based on the concept of [20] and adapted for higher security orders. We describe it in Algorithm 4.

Algorithm 4: Masked arithmetic addition.
Input: shares and
Output: shares
It ensures
1:
2:
3:
4: for do
5:
6:
7: end for
8:
9: for do
10:
11: end for
12:
13:
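The functional behavior of such a masked addition can be sketched by running a Kogge-Stone style adder on Boolean shares: XORs and shifts are applied share by share, and every AND goes through a masked multiplication. The sketch below is only a correctness illustration under stated assumptions: it uses a plain ISW-style AND instead of the gadget of this paper, draws fresh randomness from Python's `secrets` module instead of PRGs, omits all refreshing, and makes no security claim. All names (`share`, `masked_and`, `masked_add`, and so on) are hypothetical.

```python
import secrets
from functools import reduce
from operator import xor

K = 8                   # word size of this toy example
MASK = (1 << K) - 1


def share(x, n):
    """Split x into n Boolean shares."""
    s = [secrets.randbits(K) for _ in range(n - 1)]
    return s + [reduce(xor, s, x)]


def unshare(s):
    return reduce(xor, s, 0)


def masked_xor(a, b):
    """XOR is linear, so it is applied share by share."""
    return [u ^ v for u, v in zip(a, b)]


def masked_shl(a, s):
    """Shifts are linear, so they are applied share by share."""
    return [(u << s) & MASK for u in a]


def masked_and(a, b):
    """ISW-style masked AND (stand-in for the paper's gadget)."""
    n = len(a)
    c = [a[i] & b[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(K)
            c[i] ^= r
            c[j] ^= (r ^ (a[i] & b[j])) ^ (a[j] & b[i])
    return c


def masked_add(x, y):
    """Return shares z with XOR(z) = (XOR(x) + XOR(y)) mod 2^K."""
    p0 = masked_xor(x, y)
    p, g = p0, masked_and(x, y)
    shift = 1
    while shift < K - 1:
        g = masked_xor(masked_and(p, masked_shl(g, shift)), g)
        p = masked_and(p, masked_shl(p, shift))
        shift *= 2
    return masked_xor(p0, masked_shl(g, 1))
```

For example, `unshare(masked_add(share(200, 3), share(100, 3)))` yields 44, which is 300 modulo 256.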

In the rest of this subsection, we will explain the construction of the ingredients and . Both of them are additionally with locality property for the use of -wise PRGs, so that the randomness cost can be reduced.

First, we propose the gadgets to calculate and in Theorem 5. We introduce the gadget first, which is used to generate in Equation (6), because the inputs of are the outputs of ; furthermore, they do not need any intermediate variables from the generation of for with . Algorithm 5 is the description of .

Then we come to , which is shown in Algorithm 6. We use the gadget to get all in Equation (6). Since the inputs of are the outputs of and , we must get and first to calculate , which is why we introduce it second. Meanwhile, we provide the evaluation of the randomness cost of Algorithm 4 in Appendix C.

Algorithm 5:
Input: shares
Output: shares
1: for do
2:
3: end for
4:

Algorithm 6:
Input: shares and
Output: shares
1: for do
2:
3: end for
4:
5:

5. Masked Implementations of SPARKLE, XTEA, and SPECK

In this section, to evaluate the performance of , we apply our scheme to the SPARKLE, XTEA, and SPECK ciphers. SPARKLE [11, 12] is a family of cryptographic permutations selected as a finalist of the NIST lightweight cryptography standardization. We choose SPARKLE256 for the evaluation. The XTEA block cipher was introduced in [9] and is designed to correct weaknesses in TEA. SPECK is a family of lightweight block ciphers publicly released by the National Security Agency (NSA) [10], which is optimized for performance in software implementations.

SPARKLE, XTEA, and SPECK are all based on the ARX design with arithmetic addition, rotation, and XOR operations, where the masked arithmetic addition perfectly fits . We can use for the masking of bitwise XOR operations, which is . For the masking of shift and rotation operations, we directly use the trivial implementation where each share is processed separately. For example, the masked left rotation by can be implemented by with the input share and the output one, which is secure in . We use independent random bits/seeds for different . is according to the composability of . By Proposition 1, the masked SPARKLE, XTEA, and SPECK are all .
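The share-wise treatment of rotations works because a rotation is linear over GF(2): rotating every Boolean share separately yields a valid sharing of the rotated secret. A minimal sketch, with the names `rotl` and `masked_rotl` and the default 32-bit word size chosen for this example:

```python
from functools import reduce
from operator import xor


def rotl(x, s, k=32):
    """Rotate the k-bit word x left by s positions."""
    s %= k
    mask = (1 << k) - 1
    return ((x << s) | (x >> (k - s))) & mask


def masked_rotl(shares, s, k=32):
    """Rotate each Boolean share independently; no randomness is needed."""
    return [rotl(v, s, k) for v in shares]
```

No share is ever combined with another one in this operation, so a probe on any intermediate value reveals at most a single input share.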

We implement masked SPARKLE, XTEA, and SPECK on the ARM Cortex M3 architecture at the assembly level, for illustrative purposes and timing comparisons. We show the costs in Tables 2–4, with the number of required true random bits. We present the implementations using with both and introduced in Section 2.

6. Conclusion

We proposed a new security definition named PINI-E to relax the requirements of PINI-R proposed in [6], where both support randoms generated by multiple PRGs. Furthermore, we provide a high-order multiplication gadget (i.e., Algorithm 2) with a two-thirds reduction of the true random cost compared with the state-of-the-art work proposed in [6]. Then we apply the new multiplication gadget to the Boolean-to-Boolean arithmetic addition algorithm (i.e., Algorithm 4), and use it in the implementations of SPARKLE, XTEA, and SPECK on the ARM Cortex M3, which are the first implementations of higher order masking for ciphers using the ARX structure.

Appendix

A. Composability and Security of PINI-E

A.1. Proof of Composability

Proof. Consider the composite gadget in Figure 6. We define as the probed intermediate variables of , where . We denote by the index sets of probed randomness subsets for each . Furthermore, the index sets of probed outputs are defined as for . For , which is the last gadget of the composite gadget, the probed output set is . Meanwhile, we have: First we consider . According to , the index set of its inputs can simulate all probes in , where . Then we consider . Since the outputs of are the inputs of , the index set for the simulation of is equivalent to the probed output of . Therefore the probed output index set of becomes . Meanwhile, according to Theorem 4, the index set of randoms for should be . So, the index set of inputs for to simulate all the probes is as follows: With this proof method, the index set of inputs for the first of the composite gadget is . And we have: which means the composite gadget is also with .

A.2. Proof of Security

Proof. WLOG, let . The intermediate variables and probed randomness subsets can be simulated by input shares with indices , and the probed outputs can be simulated by the inputs with indices . Considering , we have: Therefore, the adversary learns nothing from the inputs.

A.3. Proof of Proposition 1

Proof. We suppose the input index set for is and the probed randomness subset is . Consider the last gadget , whose output is the output of the whole composition; we use to simulate all its probed variables. For , whose output is one of the inputs of , the index set for simulation should be: If and are implementations of the same , they are according to Theorem 2. If they are different implementations, we have where (resp., ) is the number of probes in (resp., ) without its probed output. Therefore, the simulation for the whole composition needs no more than input indices, and the index set is , which satisfies . As a result, the composition is .

A.4. Application of PINI-E

Consider the properties of , if are proved as , their composition will satisfy Theorem 4. Moreover, all -local randomness subsets in Theorem 2 can be generated by a -wise PRG.

Moreover, we provide the whole procedure of how to use multiple PRGs in gadgets and keep them -private in the following.

How to use multiple PRGs in PINI-E gadgets:
(1) We assume gadget is the composition of gadgets where is the implementation of and are those of , and each is with randomness subsets , each of which is local use. The subscript is used to distinguish the different randomness subsets in . We mention that the subscript is used to count how many times is implemented in .
(2) According to Theorem 3, we add three LR gadgets to the two inputs and one output of each , each of which owns the randomness subsets and with 1-local use, so that each randomness subset keeps its -local use. We define the composition of and LRs as .
(3) According to Theorem 2, the composition of with the same is with the randomness subsets . We define these compositions as for each . The LRs are also with randomness subsets .
(4) According to Proposition 1, gadget is with randomness subsets which keeps -local use, and LR gadgets with 1-local use randomness subsets . Therefore, according to Theorem 4, is still -private with PRGs, among which the -wise one is used to generate randoms for the -local subset, and the other PRGs are used to generate randoms for the LR gadgets.

According to the illustration in Sections 2.3 and 2.4, we summarize a proof sketch on the probing security of gadgets’ composition in Figure 7.

B. Proofs for

B.1. The Security of

First, we introduce the construction of the matrix . We give a matrix as in Figure 8. Its first row is , and each other row is the cyclic shift of the preceding row, except for the last row, whose first elements are for the -th element and whose remaining elements are . Then we add a sequence as the first column of . The construction of is given as an instance in Figure 8.

Then let be the randomness matrix of in with order , and we define the mapping where with and . For a -order , we define: with , which are the randomness subsets of randoms . The randomness matrix of randoms , called , is the mirror image of . The randomness subsets of are likewise constructed as the mirror image of for , and are called . We give an example of and in Figure 9. Let be the randomness matrix of .

Finally, we define as the matrix mixed with the inputs . More precisely, let:

For example, and correspondingly . We define as the index set of the corresponding of .

We state Lemma 3, which is used in the proof of , and prove it in Appendix B.2.

Lemma 3. In , there are at most 2 randoms and satisfying for .

Proposition 2. Lemma 3 also holds when we replace with and . More precisely, the pair of randoms exists for and iff the pair of randoms in Lemma 3 does not exist.

Lemma 3 and Proposition 2 show that every random is used only twice in the different outputs.
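
To make the "each random is used twice across the outputs" property concrete, the following is a minimal sketch using the classical ISW multiplication [8] over Boolean shares, not the gadget analysed in this appendix. The function names `isw_multiply`, `share`, and `unshare`, the 32-bit word width, and the `uses` counter are illustrative assumptions. The counter records how many output shares each fresh random is folded into.

```python
import secrets
from collections import Counter
from functools import reduce
from operator import xor


def share(x, n, width=32):
    """Split x into n Boolean shares whose XOR equals x."""
    shares = [secrets.randbits(width) for _ in range(n - 1)]
    shares.append(reduce(xor, shares, x))
    return shares


def unshare(shares):
    """Recombine Boolean shares by XOR."""
    return reduce(xor, shares, 0)


def isw_multiply(a, b, width=32):
    """ISW-style masked AND of two n-share Boolean sharings.

    Returns the output shares and a counter mapping each random r_ij
    (keyed by (i, j), i < j) to the number of output shares it enters.
    """
    n = len(a)
    c = [a[i] & b[i] for i in range(n)]
    uses = Counter()
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(width)
            # r_ij masks output share i ...
            c[i] ^= r
            uses[(i, j)] += 1
            # ... and, combined with the cross terms, output share j.
            c[j] ^= (r ^ (a[i] & b[j])) ^ (a[j] & b[i])
            uses[(i, j)] += 1
    return c, uses
```

In this sketch, each of the n(n-1)/2 randoms enters exactly two output shares, which is the kind of structural property that Lemma 3 and Proposition 2 establish for the randoms of the construction studied here.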

B.2. Proof of Lemma 3

Proof. According to the construction of , there always exist and satisfying Lemma 3 between and . So Lemma 3 is proved.

B.3. Proof of Proposition 2

Proof. Noting that and , the proof is the same as that of Lemma 3.

B.4. Proof of Theorem 6

Proof. The proof has two steps: the proof of PINI security and its extension to .
First, we prove the PINI security. Let be the index set of inputs. WLOG, we only consider the randoms , because the other randoms do not weaken the security.
(1) According to the construction of , each pair of and for is protected by the same random . As a result, if the random in the probed variables is simulated, we put the corresponding index into .
(2) Then we consider the situation where the randoms of a probed variable are simulated by more than one probe; for example, in the probed variable are simulated by variables and , because each probe can simulate at most one random of the other probe according to Lemma 3 and Proposition 2. In Algorithm 2, the input indices of each intermediate step of for are consecutive. More precisely, for the adjacent elements and in of , there must be or , which means the input indices corresponding to the randoms are also consecutive. Thus, in this case, each additional probe adds only one more index to .
(3) For the intermediate steps of , the adjacent elements of also share the same index. Thanks to , the simulation of in needs all randoms contained in its intermediate steps, which is similar to case 2. Since and must satisfy PINI, only is left for the proof of PINI. Since , at least 2 probes are needed to simulate . Therefore the intermediate steps of also satisfy PINI, and the PINI security of follows.
The proof also works when we only consider the randoms ; it is the same as that of , so we omit it.
Then we prove the of . Figure 10 describes the distribution of probed randoms at when a and a are probed.
First, we prove that all intermediate variables must satisfy except those at the “intersection” in Figure 10. In other words, an intermediate variable will not break unless all its randoms and are contained in the probed randomness subsets. From the proof of PINI, if the randoms for some intermediate variable are probed, it still satisfies PINI security because the PINI proof also works with the randoms . The situation of probing is the same. Note that the only difference between and PINI is the probes of randomness subsets, so the intermediate variables mentioned above also satisfy .
As a result, we only consider those intermediate variables whose randoms are simulated with their randomness subsets, which are called in the rest of the proof.
(1) We prove that there are at most two bare when there is one probe for each of and , with . First, we consider the -th row for . In this case, the proof amounts to showing that there are at most two intersections in Figures 10(a) and 10(b). Noting that the included angle between either the blue line or the orange line in Figure 10 and the edges of is , we know that the blue line is perpendicular to the orange one. Assume that there are more than two intersections of these lines, i.e., three or four intersections. In this case there must be two intersections at the extreme points of one of the dotted lines; WLOG, we assume they are at the extreme points of the vertical line. However, according to , if , . And for , refers to . Hence, there cannot be two intersections at the extreme points of the dotted line, which means there are no more than two intersections. This proves the claim. WLOG, in the rest of the proof we assume there are two bare for each two probes of and with (generally, if and , we have , which comes from the construction of ; therefore we only consider ).
(2) Then we show that the index set for and satisfies , which follows from the construction of . This also holds for . More precisely, for a fixed , there is always an input index in the with , and for . In particular, . Meanwhile, there are for and for . Therefore, each probe for adjacent can provide at most one more index.
(3) Then we prove that at least two probes are needed to get at . Noting that no appears directly in the intermediate variables at , the only way to get is to probe both and at . As a result, getting needs at least probes.
Moreover, considering the distribution of in , which we discussed in case 2, the most efficient probing method for the adversary is to probe the continuous sequence instead of with discrete ; thus we omit other situations in the rest of the proof.
(4) According to cases 2 and 3 above, we consider the probes containing the adjacent and for and the corresponding intermediate variables. Here “adjacent” means that the subsets are adjacent at and ; intuitively, the adjacent subsets can also be described as the orange and blue lines in Figure 10 with larger thickness. Figures 10(a) and 10(b) show two different situations of the intersections of and . In the situation of Figure 10(a), no are bare, so we only consider the situation of Figure 10(b) according to case 1. The case of Figure 10(b) can be divided into two different situations, shown in Figures 10(c) and 10(d). The “shapes” (enclosed by the red lines in Figures 10(c) and 10(d)) of the bare may be a square, a hexagon, or an octagon, depending on the choice and the number of probes. All the shapes can be contained in a square with side length where and (resp., ) is the number of probes of (resp., ). We assume the numbers of probes at and are equal, and we give the explanations of the propositions and the assumption above in Appendix B.5. In the rest of the proof, we assume the shape of the bare variables is the -length square shown in Figure 11.
(5) According to case 2, the adjacent in the same row or the same column have at least one index in common if the adjacent elements satisfy or . Therefore, if any for is probed, at least more probes of the randomness subsets are needed to make the probe bare with . Since there are at most indices contained in , its simulation satisfies . Noting that , we consider the other probes contained in the -length square mentioned in case 4.
Note that there are two intersections for the probes of randomness subsets. We assume there are probes for each of and , and probe all sequences contained in the two squares, which amounts to probes in total according to the discussion in case 3. First, we consider the square with the probe simulated before: each other sequence in this square provides at most two more input indices; one of the additional indices comes from the situation with mentioned in case 2, and the other possible index comes from . Therefore the probes in this square provide at most , and the indices for the two squares are fewer than . Hence, we prove the security for and their intermediate steps with .
(6) The security for and its intermediate steps is trivial. Since the adjacent and differ from any other elements in other rows, the probed are not adjacent when we probe the adjacent with , which means that twice as many probes are needed to probe the bare . Therefore the security also holds for and its intermediate steps, and Theorem 6 follows.

Figure 10

Subparts (a, b) are the two cases for probing randomness subsets and at for , where the blue line corresponds to the probed and the orange one corresponds to , and is the index of in . Subpart (a) refers to the situation where the probed and do not intersect at , while in subpart (b) they do. Subparts (c, d) are two cases of the intersections of more than one probe to and , where we assume there are three probes for both and . The (resp., ) refers to the probed randoms at (resp., ), and refers to the case where both randoms and are probed, defined as bare. The red squares in subparts (c, d) highlight the bare variables.

Remark B.1. In this part we give a retrospect of the proof. First, we prove is PINI with either or . Then, in the proof of , we narrow down the set of potentially insecure intermediate variables and finally prove that all variables are . More precisely:
(1) In case 1, we describe the distribution of the bare under a single probe of each of and . Given the PINI security of with either or for , the intermediate variables that are not bare must satisfy PINI, and thus satisfy . As a result, we only consider these bare .
(2) In case 2, we analyze the indices of the elements of in the same row or column.
(3) In case 3, we relate the number of probes to the constructions of probed sequences at . Given the index distribution of discussed in case 2, we determine the most efficient probing method to obtain the most indices.
(4) In case 4, we extend the conclusion of case 1 from a single probe of each of and to several probes. We also discuss the “shape” of the bare variables and extend it into a square, in which security is easier to prove. The details of why the “shape” is exactly what we claim and how the size of the extended square is obtained are given in Appendix B.5. According to case 1, we only need to prove the security of the contained in the extended squares.
(5) In case 5, we prove the security of and their intermediate steps for , using the conclusions of cases 2 and 3.
(6) In case 6, we prove the of the remaining intermediate variables (i.e., and its intermediate steps).

B.5. Explanations about the Proof of Theorem 6

B.5.1. The Enclosed Part at Figures 10(c) and 10(d)

We show the different shapes in Figure 11, in which the parts enclosed by dashed lines refer to the probed randomness subsets and those enclosed by solid lines refer to the bare variables at . Figure 11(a) is the situation where there is an element of at exactly each vertex of the dashed square; therefore the solid-line shape is also a square. Figure 11(b) is the situation where there is no element at the top and bottom vertices of the dashed square, so the remaining shape is a hexagon. Note that if there is no element at the top vertex, there is none at the bottom one either, because of the symmetry of the construction. Figure 11(c) is the situation where there is no element at any of the 4 vertices of the dashed square; therefore the remaining shape is an octagon.

B.5.2. The Figure of the Enclosed Part and the Scaled Square

Figure 11(d) shows the situation where the numbers of probes of and are unequal. From the green solid lines in Figure 11(d), we see that the red rectangle can be contained in a square with side length (this follows from elementary geometry, and we omit the detailed proof), where and are the numbers of probes of and , and we assume . The blue squares in Figure 11 are the scaled ones used in the proof of Theorem 6; it is easy to see that their side length is and that they contain all bare .

C. Evaluation of the Randomness Cost for

In this part, we calculate the cost of randomness in . We assume the operations are over and let .

Now we calculate the cost in with multiple PRGs. According to Theorem 4, we use a set of -wise PRGs to generate randoms used by gadgets as they make a 1-local use of each for , and a set of -wise PRGs to generate and in gadget because they make a -local use of each and . In the following, we will separately discuss the number of PRGs and randoms with either or .

When using , we calculate the number of PRGs and randoms in different situations. First, we consider , which is used to generate the randoms in and . According to the maximum distance separable (MDS) conjecture [29], we have the following inequality, where there are gadgets used in a : from which we have . In our implementation, we set and (the input length of is set to 32. When randoms are needed, we use 4 outputs of the 8-bit PRGs to generate a 32-bit random. In other words, 4 PRGs with different seeds are needed for each random). This means that we can use -wise PRGs to generate all randoms in or for any . Then we calculate for with some , and we have:
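
The step of building one 32-bit random from four 8-bit PRG outputs can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated here: it uses the textbook polynomial-evaluation generator over GF(2^8), where a seed of r uniform coefficients yields r-wise independent outputs at distinct evaluation points. The reduction polynomial 0x11B and the names `gf256_mul`, `poly_prg`, and `random_words` are hypothetical choices, and r is left as a parameter given by the seed length.

```python
def gf256_mul(a, b, poly=0x11B):
    """Multiply two elements of GF(2^8); the reduction polynomial is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def poly_prg(seed, count):
    """Evaluate the polynomial with coefficients `seed` at points 0..count-1.

    With r = len(seed) uniform coefficients, any r outputs taken at
    distinct points are jointly uniform (r-wise independence).
    """
    if count > 256:
        raise ValueError("GF(2^8) has only 256 distinct evaluation points")
    outputs = []
    for point in range(count):
        acc = 0
        for coeff in reversed(seed):  # Horner's rule, highest degree first
            acc = gf256_mul(acc, point) ^ coeff
        outputs.append(acc)
    return outputs


def random_words(seeds, count):
    """Combine four independently seeded 8-bit PRGs into 32-bit randoms."""
    if len(seeds) != 4:
        raise ValueError("four 8-bit PRGs are needed per 32-bit random")
    streams = [poly_prg(seed, count) for seed in seeds]
    return [int.from_bytes(bytes(column), "big") for column in zip(*streams)]
```

Under these assumptions, the seed cost of one 32-bit generator is 4 · r · 8 bits, and a single set of four PRGs can serve at most 256 randoms, which is why the number of gadgets sharing one PRG is bounded.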

From the value of and given above, we know that it is satisfied. Therefore, we need -wise PRGs to generate the randoms in for . According to all the calculations above, we know that we need bits of randomness for the whole algorithm.

Then we consider the case using , and again calculate the numbers of PRGs and randoms. As the output of is 3-wise independent, according to Theorem 4, we always need -wise PRGs, and thus the security order is no more than 3. Let and be the numbers of randoms needed in for or and for some . First, we consider ; by the definition of , we get the following inequality: therefore . Similarly, for we have: thus we have . Hence, we need with bit seeds to generate all and in , and with bit seeds to generate all randoms in . We compare the randomness cost of and the situation without PRGs in Table 5.

Then, we discuss the case where a set of PRGs is used by multiple . We only consider the use of PRG , and the maximum number of can be calculated by . This means that there are in each and each randomness subset of contains at most elements, and thus elements are contained by a in a algorithm. Here, refers to the number of output variables of a PRG. Hence, is the maximum number of for one set of . We set , , and , which is a practically relevant setting. Then, we have . In Table 5, one using and one using no PRGs require and random bits, respectively. Therefore, the randomness cost can be reduced by a factor of up to .

Data Availability

The source code data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Bohan Wang and Weijia Wang contributed to the writing, investigation, and methodology. Qian Sui contributed to the software and methodology. Fanjie Ji contributed to the software and validation. Chun Guo and Weijia Wang contributed to the resources and funding. Weijia Wang contributed to the conceptualization and validation. Chun Guo contributed to the investigation.

Acknowledgments

This work was financially supported by the National Key Research and Development Program of China (No. 2021YFA1000600), the Program of Qilu Young Scholars (Grant Nos. 61580089963177 and 61580082063088) of Shandong University, the Program of Taishan Young Scholars of the Shandong Province, the National Natural Science Foundation of China (Grant Nos. 62032014, 62002202, and 62002204), the Major Basic Research Project of Natural Science Foundation of Shandong Province, China (Grant No. ZR202010220025), the Department of Science & Technology of Shandong Province (SYS202201), and the Quan Cheng Laboratory (QCLZD202306).

References

P. C. Kocher, “Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems,” in Advances in Cryptology—CRYPTO ’96, N. Koblitz, Ed., vol. 1109 of Lecture Notes in Computer Science, pp. 104–113, Springer, Berlin, Heidelberg, 1996.
P. C. Kocher, J. Jaffe, and B. Jun, “Differential power analysis,” in Advances in Cryptology—CRYPTO’ 99, M. J. Wiener, Ed., vol. 1666 of Lecture Notes in Computer Science, pp. 388–397, Springer, Berlin, Heidelberg, 1999.
A. Duc, S. Dziembowski, and S. Faust, “Unifying leakage models: from probing attacks to noisy leakage,” in Advances in Cryptology—EUROCRYPT 2014, P. Q. Nguyen and E. Oswald, Eds., vol. 8441 of Lecture Notes in Computer Science, pp. 423–440, Springer, Berlin, Heidelberg, 2014.
G. Barthe, S. Belaïd, F. Dupressoir et al., “Strong non-interference and type-directed higher-order masking,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, E. R. Weippl, S. Katzenbeisser, C. Kruegel, A. C. Myers, and S. Halevi, Eds., pp. 116–129, ACM, Vienna, Austria, 2016.
G. Cassiers and F.-X. Standaert, “Trivially and efficiently composing masked gadgets with probe isolating non-interference,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 2542–2555, 2020.
J.-S. Coron, A. Greuet, and R. Zeitoun, “Side-channel masking with pseudo-random generator,” in Advances in Cryptology—EUROCRYPT 2020, A. Canteaut and Y. Ishai, Eds., vol. 12107 of Lecture Notes in Computer Science, pp. 342–375, Springer, Cham, 2020.
Y. Ishai, E. Kushilevitz, X. Li et al., “Robust pseudorandom generators,” in Automata, Languages, and Programming. ICALP 2013, F. V. Fomin, R. Freivalds, M. Z. Kwiatkowska, and D. Peleg, Eds., vol. 7965 of Lecture Notes in Computer Science, pp. 576–588, Springer, Berlin, Heidelberg, 2013.
Y. Ishai, A. Sahai, and D. A. Wagner, “Private circuits: securing hardware against probing attacks,” in Advances in Cryptology—CRYPTO 2003, D. Boneh, Ed., vol. 2729 of Lecture Notes in Computer Science, pp. 463–481, Springer, Berlin, Heidelberg, 2003.
R. M. Needham and D. J. Wheeler, Tea Extensions, Report, Cambridge University, 1997.
R. Beaulieu, D. Shors, J. Smith, S. Treatman-Clark, B. Weeks, and L. Wingers, “The SIMON and SPECK lightweight block ciphers,” in Proceedings of the 52nd Annual Design Automation Conference, pp. 1–6, ACM, San Francisco, CA, USA, 2015.
C. Beierle, A. Biryukov, L. C. dos Santos et al., “Alzette: a 64-bit ARX-box,” in Advances in Cryptology—CRYPTO 2020, D. Micciancio and T. Ristenpart, Eds., vol. 12172 of Lecture Notes in Computer Science, pp. 419–448, Springer, Cham, 2020.
C. Beierle, A. Biryukov, L. C. dos Santos et al., “Lightweight AEAD and hashing using the sparkle permutation family,” IACR Transactions on Symmetric Cryptology, vol. 2020, no. S1, pp. 208–261, 2020.
L. Goubin, “A sound method for switching between Boolean and arithmetic masking,” in Cryptographic Hardware and Embedded Systems—CHES 2001, vol. 2162 of Lecture Notes in Computer Science, pp. 3–15, Springer, Berlin, Heidelberg, 2001.
M. Van Beirendonck, J.-P. D’Anvers, and I. Verbauwhede, “Analysis and comparison of table-based arithmetic to Boolean masking,” IACR Transactions on Cryptographic Hardware and Embedded Systems, vol. 2021, no. 3, pp. 275–297, 2021.
J.-S. Coron, J. Großschädl, and P. K. Vadnala, “Secure conversion between Boolean and arithmetic masking of any order,” in Cryptographic Hardware and Embedded Systems—CHES 2014, L. Batina and M. Robshaw, Eds., vol. 8731 of Lecture Notes in Computer Science, pp. 188–205, Springer, Berlin, Heidelberg, 2014.
J.-S. Coron and A. Tchulkine, “A new algorithm for switching from arithmetic to Boolean masking,” in Cryptographic Hardware and Embedded Systems—CHES 2003, C. D. Walter, Ç. K. Koç, and C. Paar, Eds., vol. 2779 of Lecture Notes in Computer Science, pp. 89–97, Springer, Berlin, Heidelberg, 2003.
B. Debraize, “Efficient and provably secure methods for switching from arithmetic to Boolean masking,” in Cryptographic Hardware and Embedded Systems—CHES 2012, E. Prouff and P. Schaumont, Eds., vol. 7428 of Lecture Notes in Computer Science, pp. 107–121, Springer, Berlin, Heidelberg, 2012.
M. Hutter and M. Tunstall, “Constant-time higher-order Boolean-to-arithmetic masking,” Journal of Cryptographic Engineering, vol. 9, no. 2, pp. 173–184, 2019.
M. Karroumi, B. Richard, and M. Joye, “Addition with blinded operands,” in Constructive Side-Channel Analysis and Secure Design. COSADE 2014, E. Prouff, Ed., vol. 8622 of Lecture Notes in Computer Science, pp. 41–55, Springer, Cham, 2014.
J.-S. Coron, J. Großschädl, M. Tibouchi, and P. K. Vadnala, “Conversion from arithmetic to Boolean masking with logarithmic complexity,” in Fast Software Encryption, G. Leander, Ed., vol. 9054 of Lecture Notes in Computer Science, pp. 130–149, Springer, Berlin, Heidelberg, 2015.
L. Bettale, J.-S. Coron, and R. Zeitoun, “Improved high-order conversion from Boolean to arithmetic masking,” IACR Transactions on Cryptographic Hardware and Embedded Systems, vol. 2018, no. 2, pp. 22–45, 2018.
J.-S. Coron, “High-order conversion from Boolean to arithmetic masking,” in Cryptographic Hardware and Embedded Systems—CHES 2017, W. Fischer and N. Homma, Eds., vol. 10529 of Lecture Notes in Computer Science, pp. 93–114, Springer, Cham, 2017.
J. W. Bos, M. Gourjon, J. Renes, T. Schneider, and C. Van Vredendaal, “Masking kyber: first- and higher-order implementations,” IACR Transactions on Cryptographic Hardware and Embedded Systems, vol. 2021, no. 4, pp. 173–214, 2021.
V. Migliore, B. Gérard, M. Tibouchi, and P.-A. Fouque, “Masking dilithium,” in Applied Cryptography and Network Security, R. Deng, V. Gauthier-Umaña, M. Ochoa, and M. Yung, Eds., vol. 11464 of Lecture Notes in Computer Science, pp. 344–362, Springer, Cham, 2019.
G. Barthe, S. Belaïd, T. Espitau et al., “Masking the GLP lattice-based signature scheme at any order,” in Advances in Cryptology—EUROCRYPT 2018, J. B. Nielsen and V. Rijmen, Eds., vol. 10821 of Lecture Notes in Computer Science, pp. 354–384, Springer, Cham, 2018.
S. Belaïd, F. Benhamouda, A. Passelègue, E. Prouff, A. Thillard, and D. Vergnaud, “Randomness Complexity of Private Circuits for Multiplication,” in Advances in Cryptology—EUROCRYPT 2016, M. Fischlin and J. S. Coron, Eds., vol. 9666 of Lecture Notes in Computer Science, pp. 616–648, Springer, Berlin, Heidelberg, 2016.
P. M. Kogge and H. S. Stone, “A parallel algorithm for the efficient solution of a general class of recurrence equations,” IEEE Transactions on Computers, vol. C-22, no. 8, pp. 786–793, 1973.
W. Wang, F. Ji, J. Zhang, and Y. Yu, “Efficient private circuits with precomputation,” IACR Transactions on Cryptographic Hardware and Embedded Systems, vol. 2023, no. 2, pp. 286–309, 2023.
B. Segre, “Curve razionali normali e k-archi negli spazi finiti,” Annali di Matematica Pura ed Applicata, vol. 39, no. 1, pp. 357–379, 1955.
T. Fritzmann, M. Van Beirendonck, D. B. Roy et al., “Masked accelerators and instruction set extensions for post-quantum cryptography,” IACR Transactions on Cryptographic Hardware and Embedded Systems, vol. 2022, no. 1, pp. 414–460, 2022.

Copyright

Copyright © 2024 Bohan Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
