Abstract

Rotational/rotational-XOR differential-linear cryptanalysis is a powerful attack on ARX primitives. In this paper, we analyse the message authentication code algorithm Chaskey using the rotational differential-linear attack, with the partitioning technique for key recovery. We present a distinguishing attack on 9-round Chaskey under weak-key and related-key settings, together with its correlation. This is the first time a rotational differential-linear distinguisher has been used to analyse Chaskey. We conclude that Chaskey resists the rotational differential-linear attack.

1. Introduction

A MAC (Message Authentication Code) is a short tag generated by a specific algorithm; it is used to check the integrity of a message and to authenticate it. It detects whether the content of the message has been changed during delivery, regardless of whether the change is accidental or deliberate. In addition, it serves as identity verification that confirms the source of the message. Specifically, given a message, the MAC algorithm produces a tag by processing the message together with a secret key. The sender then transmits the message together with the tag. It should be difficult for an attacker to forge a valid tag without knowing the key.

A MAC algorithm can be constructed from other cryptographic primitives. Chaskey [1] is a MAC algorithm whose permutation is based on ARX operations. Since ARX primitives consist only of modular additions, bit rotations, and XORs, they are lightweight and easy to implement. The designers of Chaskey state that it is lightweight and performs well in both software and hardware. It is natural to expect that the security of a MAC depends mainly on its underlying permutation. Therefore, we analyse the permutation of Chaskey; the analysis is presented in Section 4.

Rotational cryptanalysis [2-4] was first introduced by Khovratovich et al. It is a probabilistic technique that tracks a rotational pair $(x, x \lll r)$ through the rounds, where $x$ is the plaintext and $x \lll r$ is the rotation of $x$ to the left by $r$ bits. This attack is powerful against ARX primitives because the rotational property of a pair is preserved by XOR and rotation, and is preserved by modular addition with high probability. The only prerequisite for successfully applying a rotational attack is that all constants used in the function keep their values after rotation. The differential-linear attack [5], proposed by Langford and Hellman at CRYPTO 1994, combines differential and linear cryptanalysis, the two most powerful techniques in symmetric-key cryptanalysis. At EUROCRYPT 2021, Liu et al. proposed the rotational/rotational-XOR differential-linear attack [6], which combines the advantages of the rotational attack and the differential-linear attack.

1.1. Our Contributions

The rotational differential-linear attack combines the advantages of the rotational attack and the differential-linear attack, and it is a powerful tool for analysing ARX primitives. The attack can be divided into two parts. The first part is the rotational attack. If the function contains no constant XOR, or only rotation-invariant constants, the probability of the rotational attack depends only on the number of modular additions. However, this conclusion holds only under the Markov cipher assumption, which yields a Markov chain. Since the permutation of Chaskey contains consecutive modular additions, the Markov cipher assumption may not hold. To obtain precise results, we must account for the effect of consecutive modular additions. The second part is the linear attack. Here, we transform the search for linear trails of high correlation into a Boolean satisfiability problem and use an automatic tool to search for the best linear approximations.

In order to recover the whitening key, we consider some techniques in differential-linear attack. One of them is the partitioning technique proposed by Beierle et al. [8] at CRYPTO 2020. This technique mainly focuses on multiple linear characteristics rather than one to increase the correlation of linear approximations. We will explain it in detail in Section 3.

As a consequence, we obtain a distinguishing attack over 9 rounds of the permutation under the related-key setting. Its correlation and the corresponding complexities are summarized in Tables 1 and 2, together with a comparison to the best attacks published so far. This is the first attack on Chaskey that uses the rotational differential-linear attack with the partitioning technique.

The structure of the remaining paper is as follows: Section 2 briefly describes some cryptanalysis methods and techniques for understanding content in this paper. Section 3 gives some examples to introduce the partitioning technique explicitly. Section 4 elaborates on how to attack Chaskey using all the techniques mentioned in the previous section and finally Section 5 is a short conclusion.

2. Preliminaries

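We use $x \lll 1$ and $x \ggg 1$ to denote the rotation of a binary vector $x$ by 1 bit to the left and to the right, respectively. For $a, b \in \mathbb{F}_2^n$, $\langle a, b \rangle$ denotes the inner product of $a$ and $b$. $x[i]$ denotes the $i$-th bit of $x$, and $\bar{x}$ denotes the bitwise complement of $x$. We denote the $n$-dimensional binary vector whose bits are all 0 by $\mathbf{0}$, the $i$-th unit vector by $[i]$, and the XOR sum of unit vectors by $[i_1, \ldots, i_t] = [i_1] \oplus \cdots \oplus [i_t]$, where $n$ is the block size of the cipher. Considering a set $S \subseteq \mathbb{F}_2^n$ and a Boolean function $f : \mathbb{F}_2^n \to \mathbb{F}_2$, we define

$$\mathrm{Cor}_{x \in S}[f(x)] := \frac{1}{|S|} \sum_{x \in S} (-1)^{f(x)}.$$

In most situations, the correlation of $f$ is estimated from a subset of $S$ because of limited computational resources.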
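As a minimal sketch of how such an estimate is computed in practice, the following Python code averages $(-1)^{f(x)}$ over a set, which is the usual definition of correlation, and then over a random sample of inputs. The names `correlation`, `sampled_correlation`, and `toy_and` are illustrative assumptions and do not come from the paper's code.

```python
import random

def correlation(f, points):
    """Average of (-1)^f(x) over the given points (exact if points is the whole set)."""
    points = list(points)
    return sum(1 - 2 * (f(x) & 1) for x in points) / len(points)

def sampled_correlation(f, n_bits, samples, seed=0):
    """Estimate the correlation of f from uniformly sampled n_bits-bit inputs."""
    rng = random.Random(seed)
    return correlation(f, (rng.getrandbits(n_bits) for _ in range(samples)))

def toy_and(x):
    """Hypothetical Boolean function x[0] AND x[1]; it is 0 for 3 of every 4 inputs."""
    return (x & 1) & ((x >> 1) & 1)

exact = correlation(toy_and, range(16))            # full enumeration
estimate = sampled_correlation(toy_and, 16, 20000)  # random subset
```

The sampled value approaches the exact one as the number of samples grows; the number of samples needed is roughly the inverse square of the correlation being measured.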

2.1. Markov Ciphers and Differential Cryptanalysis

Differential cryptanalysis [9] is a probabilistic chosen-plaintext attack introduced by Biham and Shamir. We choose a plaintext pair with difference $\Delta_{in}$ and study how this difference propagates through the function. If an input difference $\Delta_{in}$ can yield an output difference $\Delta_{out}$, whatever the intermediate differences are, the pair $(\Delta_{in}, \Delta_{out})$ is a differential. The differential probability is the number of plaintext pairs with difference $\Delta_{in}$ whose ciphertext pairs have difference $\Delta_{out}$, divided by the total number of pairs with input difference $\Delta_{in}$. In practice, however, it is hard to calculate the differential probability exactly. Since the sequence of differences along a differential characteristic can be regarded as a Markov chain, the differential probability is estimated by the product of the intermediate probabilities [10].
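The counting definition of differential probability, and the product estimate used under the Markov assumption, can be sketched as follows. The 6-bit functions `toy_arx` and `toy_linear` are hypothetical examples chosen only so that exhaustive enumeration is cheap; they are not taken from Chaskey.

```python
N_BITS = 6
MASK = (1 << N_BITS) - 1

def rotl(x, r, n=N_BITS):
    """Rotate the n-bit word x left by r bits."""
    r %= n
    return ((x << r) | (x >> (n - r))) & ((1 << n) - 1)

def differential_probability(f, n_bits, delta_in, delta_out):
    """Fraction of inputs x with f(x) XOR f(x XOR delta_in) == delta_out."""
    size = 1 << n_bits
    hits = sum(1 for x in range(size) if f(x) ^ f(x ^ delta_in) == delta_out)
    return hits / size

def markov_estimate(round_probabilities):
    """Product of per-round probabilities (valid only under the Markov assumption)."""
    estimate = 1.0
    for p in round_probabilities:
        estimate *= p
    return estimate

def toy_arx(x):
    """Hypothetical ARX-style map: one modular addition, two rotations, one XOR."""
    return ((x + rotl(x, 2)) & MASK) ^ rotl(x, 1)

def toy_linear(x):
    """Hypothetical map without modular addition; every differential is deterministic."""
    return x ^ rotl(x, 2)

# Distribution of output differences of toy_arx for input difference 1.
distribution = [differential_probability(toy_arx, N_BITS, 1, d) for d in range(1 << N_BITS)]
```

For a function built only from XOR and rotation, the output difference is fully determined by the input difference, which is why modular addition is the only source of probability loss in ARX designs.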

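Theorem 1 (see [7]). A sequence of discrete random variables $v_0, v_1, \ldots, v_r$ is a Markov chain if, for $0 \le i < r$,

$$\Pr(v_{i+1} = \beta_{i+1} \mid v_i = \beta_i, v_{i-1} = \beta_{i-1}, \ldots, v_0 = \beta_0) = \Pr(v_{i+1} = \beta_{i+1} \mid v_i = \beta_i).$$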

2.2. Boolean Satisfiability Problem and Automatic Search

In recent years, much work on the automatic search for differential and linear characteristics has appeared. There are mainly two types of automated tools: one is based on Mixed Integer Linear Programming (MILP) and the other on the Boolean satisfiability problem (SAT). The aim of MILP is to optimize an objective function under certain constraints. MILP is usually used to search for differential and linear characteristics of ciphers whose substitution layer consists of S-boxes. This idea was first proposed by Sun et al. [11], and many methods that reduce the search time or model large S-boxes have been proposed recently [12-15].

SAT with clauses of three or more literals is NP-complete, but a solution can be found in reasonable time for most practical problems. There are many heuristic SAT solvers, which output satisfiable or unsatisfiable depending on whether the problem can be solved. More specifically, a SAT solver assigns initial values to the variables, computes the number of conflicting clauses, and decides the next search step until a valid solution is found or no solution exists.
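To make the satisfiable/unsatisfiable outcome concrete, here is a small sketch of the classic DPLL procedure (unit propagation plus branching) on clauses written as lists of signed integers. This is a textbook technique swapped in for illustration; it is not the solver used for the searches in this paper, and production solvers are far more elaborate. The XOR constraint at the end is an assumed toy example of the kind of relation that appears when modelling ARX operations.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying {variable: bool} dict, or None if unsatisfiable.

    clauses: list of clauses, each a list of nonzero ints (negative = negated).
    Variables that never need a value are left out of the returned dict.
    """
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        satisfied = False
        remaining = []
        for lit in clause:
            var, wanted = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == wanted:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # every literal is false: conflict
        simplified.append(remaining)
    if not simplified:
        return assignment  # all clauses satisfied
    # Unit propagation: a one-literal clause forces its variable.
    for clause in simplified:
        if len(clause) == 1:
            assignment[abs(clause[0])] = clause[0] > 0
            return dpll(simplified, assignment)
    # Branch on the first unassigned variable.
    var = abs(simplified[0][0])
    for value in (True, False):
        trial = dict(assignment)
        trial[var] = value
        result = dpll(simplified, trial)
        if result is not None:
            return result
    return None

# CNF encoding of x1 XOR x2 = x3: one clause per forbidden assignment.
xor_clauses = [[-1, -2, -3], [1, 2, -3], [1, -2, 3], [-1, 2, 3]]
model = dpll(xor_clauses + [[1], [2]])   # force x1 = x2 = True
unsat = dpll([[1], [-1]])                # contradictory unit clauses
```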

2.3. Rotational Attack and Differential-Linear Attack

Rotational cryptanalysis was first introduced by Khovratovich et al. We use $x \lll r$ and $x \ggg r$ to denote the rotation of the variable $x$ to the left and to the right by $r$ bits, respectively. A rotational pair can be denoted by $(x, x \lll r)$. It is easy to see that a rotational pair is preserved by bitwise XOR and by any rotation. However, it is not always preserved by modular addition. Therefore, the probability of this attack depends only on the rotational probability of the modular additions.
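The behaviour of the three ARX operations on rotational pairs can be checked empirically. The sketch below, with hypothetical helper names, samples random 32-bit words and measures how often rotating the output of an operation equals applying the operation to the rotated inputs.

```python
import random

N = 32
MASK = (1 << N) - 1

def rotl(x, r):
    """Rotate the 32-bit word x left by r bits."""
    r %= N
    return ((x << r) | (x >> (N - r))) & MASK

def xor(x, y):
    return x ^ y

def add(x, y):
    return (x + y) & MASK

def preserved_fraction(op, r, samples, seed=1):
    """Fraction of random (x, y) with rotl(op(x, y)) == op(rotl(x), rotl(y))."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(samples):
        x, y = rng.getrandbits(N), rng.getrandbits(N)
        if rotl(op(x, y), r) == op(rotl(x, r), rotl(y, r)):
            kept += 1
    return kept / samples

xor_fraction = preserved_fraction(xor, 1, 2000)    # always preserved
add_fraction = preserved_fraction(add, 1, 20000)   # preserved only sometimes
```

XOR is preserved for every sample, while modular addition with a rotation by 1 is preserved for a bit more than a third of the samples.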

For large word size $n$ and small rotation amount $r$, Table 3 is obtained. The probability is clearly highest when $r$ is equal to 1, which is why we always choose 1 as the rotational offset.
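The choice of offset 1 can be reproduced from the well-known closed form for a single modular addition of independent uniform $n$-bit words, $\frac{1}{4}(1 + 2^{r-n} + 2^{-r} + 2^{-n})$. This formula is stated here as an assumption from the rotational-cryptanalysis literature, since the lemma itself is not reproduced in this section. The sketch checks it against exhaustive enumeration on 6-bit words.

```python
from fractions import Fraction

def rotl(x, r, n):
    """Rotate the n-bit word x left by r bits."""
    r %= n
    return ((x << r) | (x >> (n - r))) & ((1 << n) - 1)

def exhaustive_rotational_probability(n, r):
    """Exact Pr[(x + y) <<< r == (x <<< r) + (y <<< r)] over all n-bit x, y."""
    mask = (1 << n) - 1
    hits = 0
    for x in range(1 << n):
        for y in range(1 << n):
            if rotl((x + y) & mask, r, n) == (rotl(x, r, n) + rotl(y, r, n)) & mask:
                hits += 1
    return Fraction(hits, 1 << (2 * n))

def closed_form(n, r):
    """Assumed closed form, valid for 0 < r < n."""
    return Fraction(1, 4) * (1 + Fraction(1, 2 ** (n - r)) + Fraction(1, 2 ** r) + Fraction(1, 2 ** n))

matches = all(exhaustive_rotational_probability(6, r) == closed_form(6, r) for r in range(1, 6))
small_offsets = [closed_form(32, r) for r in (1, 2, 3)]  # decreasing in r for small r
```

For 32-bit words the value at $r = 1$ is close to $0.375$, and it drops towards $0.25$ as $r$ grows while remaining small relative to $n$.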

Ref. [3] claims that, when the rotation amount is fixed, the probability of the rotational attack depends only on the number of modular additions. However, the prerequisite of that lemma is that the variables in consecutive rounds form a Markov chain. When the variables are not random and independent, the lemma does not give a precise answer. Because the permutation of Chaskey contains consecutive modular additions, the probability of the rotational attack may not depend only on the number of modular additions. Ref. [7] proposes a lemma that calculates the rotational probability of consecutive modular additions precisely. The result shows that when consecutive modular additions exist, the probability of the rotational attack is much lower.

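The differential-linear attack [5, 16] was first introduced by Langford and Hellman. As shown in Figure 1, a cipher $E$ is divided into two subparts $E_0$ and $E_1$ such that $E = E_1 \circ E_0$, which are covered by a differential characteristic and a linear characteristic, respectively. Assume that the differential with input difference $\Delta_{in}$ and output difference $\Delta_m$ for $E_0$ holds with probability

$$\Pr[E_0(x) \oplus E_0(x \oplus \Delta_{in}) = \Delta_m] = p.$$

Assume that the linear approximation with input mask $\Gamma_m$ and output mask $\Gamma_{out}$ for $E_1$ holds with bias

$$\Pr[\langle \Gamma_m, y \rangle = \langle \Gamma_{out}, E_1(y) \rangle] - \frac{1}{2} = q.$$

Assuming that the variables in $E_0$ and $E_1$ are random and independent, the overall bias of the differential-linear distinguisher is

$$\varepsilon = 2pq^2.$$

The data complexity of the differential-linear distinguisher is $O(p^{-2}q^{-4})$.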
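As a worked illustration, the classical Langford-Hellman estimate (overall bias $2pq^2$ and data complexity on the order of $p^{-2}q^{-4}$, stated here as the standard textbook form under the independence assumption) can be evaluated in a few lines. The input values below are made-up numbers, not results from this paper.

```python
import math

def dl_bias(p, q):
    """Overall bias 2 * p * q^2 of a differential-linear distinguisher."""
    return 2 * p * q * q

def dl_data_complexity_log2(p, q):
    """log2 of p^-2 * q^-4, the order of the number of chosen plaintexts needed."""
    return -2 * math.log2(p) - 4 * math.log2(q)

# Hypothetical example: differential probability 2^-2, linear bias 2^-3.
example_bias = dl_bias(2 ** -2, 2 ** -3)
example_log2_data = dl_data_complexity_log2(2 ** -2, 2 ** -3)
```

The linear part enters squared because the approximation is applied to both ciphertexts of a pair, which is why a weak linear approximation is costlier than a weak differential.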

In almost all practical ciphers, the two subparts are not independent. In recent years, significant research has pursued more accurate results. For example, Bar-On et al. proposed the Differential-Linear Connectivity Table (DLCT) [17] to handle the dependency between the differential and linear parts. Here, the cipher is divided into three parts, and the intermediate part is evaluated experimentally.

2.4. Rotational Differential-Linear Attack

The rotational differential-linear attack combines the advantages of the rotational attack and the differential-linear attack. The difference between a rotational differential-linear distinguisher and a differential-linear distinguisher lies in the differential part, which is replaced by a rotational differential trail. A cipher is described as a cascade of two subparts, where the first subpart is covered by a rotational differential and the second subpart is covered by a linear approximation.

Then, the bias of a rotational differential-linear distinguisher can be represented as follows [6]:

Note that if rotational offset is 0, the rotational differential-linear distinguisher will be degenerated to differential-linear distinguisher. It is clear to see that the bias of rotational differential-linear attack depends on , and . is related to the first part, which is corresponding to rotational differential attack. and are the correlations of linear attack. We will explain how to get a higher correlation of the distinguisher in Section 4.

3. Partitioning Technique

In this section, we use some examples in Ref. [8] to explain how to utilize partitioning technique briefly.

3.1. An Introduction of Partitioning Technique

The partitioning technique, first proposed by Leurent at EUROCRYPT 2016 [18], can substantially improve the correlation of a differential-linear attack by yielding linear equations between ciphertext and key bits. For ARX primitives, modular addition is the only nonlinear component. Assume $a, b \in \mathbb{F}_2^m$ and $s = a \boxplus b$, where $\boxplus$ means modular addition. For $i = 0$, $s[0] = a[0] \oplus b[0]$. But for $i > 0$, $s[i]$ is not always equal to $a[i] \oplus b[i]$. In order to derive a linear equation that holds with certainty, the following lemma is given.

Lemma 1 (see [18]). Let $a, b \in \mathbb{F}_2^m$ and $s = a \boxplus b$. For $i \ge 2$, we have

$$s[i] = \begin{cases} a[i] \oplus b[i] \oplus a[i-1] & \text{if } a[i-1] = b[i-1], \\ a[i] \oplus b[i] \oplus a[i-2] & \text{if } a[i-1] \ne b[i-1] \text{ and } a[i-2] = b[i-2]. \end{cases}$$

Lemma 1 shows that $s[i]$ can be calculated by a linear equation if certain conditions on $a$ and $b$ are met. As shown in Figure 2, however, the words for which we want specific equations are related not by modular addition but by modular subtraction, so Lemma 1 does not apply directly. For this case, the following lemma helps solve the problem.
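The carry argument behind this kind of lemma can be verified exhaustively on small words. The sketch below assumes the usual statement for $s = a + b \bmod 2^m$: if bits $i-1$ of $a$ and $b$ agree, the carry into bit $i$ equals that common bit; otherwise the carry is inherited from position $i-2$ whenever those bits agree. The function and variable names are hypothetical.

```python
def bit(x, i):
    """Return bit i of x (bit 0 is the least significant bit)."""
    return (x >> i) & 1

def check_carry_lemma(m):
    """Check both cases for all m-bit a, b and all i >= 2; return how often each applies."""
    mask = (1 << m) - 1
    case1 = case2 = 0
    for a in range(1 << m):
        for b in range(1 << m):
            s = (a + b) & mask
            for i in range(2, m):
                linear = bit(a, i) ^ bit(b, i)
                if bit(a, i - 1) == bit(b, i - 1):
                    # Carry into bit i is the common value of a[i-1] and b[i-1].
                    assert bit(s, i) == linear ^ bit(a, i - 1)
                    case1 += 1
                elif bit(a, i - 2) == bit(b, i - 2):
                    # Carry propagates through position i-1 from position i-2.
                    assert bit(s, i) == linear ^ bit(a, i - 2)
                    case2 += 1
    return case1, case2

counts = check_carry_lemma(6)
```

On 6-bit words the two cases together cover three quarters of all (a, b, i) combinations, so an exact linear equation on the sum bit is available for most inputs once two bit positions are known.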

Lemma 2 (see [8]). Let . Then,

In the later section, we not only care about but also . It is clear that Lemma 2 can be applied to and separately. Then is evaluated by the equations which are yielded by applying Lemma 2 on and . However, this method requires knowledge of three bits. The following lemma states that is evaluated by knowing only two bits. Since and , the following lemma helps us guess less bits to get linear equations.

Lemma 3 (see [8]). Let . Then,

Next, we will explain partitioning technique explicitly. The structure of rotational differential-linear attack is shown in Figure 3. Note that is the rotational differential part, is the linear part, and is the part that we use to recover the whitening key. The key idea of partitioning technique is that we want to use multiple linear approximations to recover some bits of key. The space of ciphertext can be split into a direct sum . We denote the dimension of by and the dimension of by . The specific definition of and depends on the attack. Considering some tuples , where is a coset of and . If we can get a high correlation,

The above equation reveals that if the ciphertext belongs to some specific subset (defined by ), we yield some linear equation on because and is the ciphertext that can be observed. If the is not equal to 1 but close to 1, we still obtain the equation on with high correlation. Since , we can guess the bits of and split ciphertext into corresponding . After some equations on are obtained with high correlation, bits on will be recovered. The way to recover some bits on is introduced as follows. We define

We obtain

For , let us define ; we define

We have . Let us define,which is of equal size for all and consider the scaled version of , i.e.,

For each  = Span , we define

We can use this function to recover bits of information on . can be expressed as , where is the key that can be obtained from . By using Fast Walsh–Hadamard Transform on . Fast Walsh–Hadamard Transform reduces the time complexity from to .
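The role of the Fast Walsh-Hadamard Transform in this step can be illustrated with a short sketch: instead of evaluating a signed sum separately for every key guess, one in-place butterfly pass evaluates all guesses at once. The names `fwht`, `naive_walsh`, and `recover_key`, and the 3-bit toy key, are assumptions made for the illustration; they do not reproduce the paper's counters.

```python
def parity(x):
    """Parity of the bits of x."""
    return bin(x).count("1") & 1

def fwht(values):
    """Fast Walsh-Hadamard transform; output[k] = sum_x (-1)^<k,x> * values[x]."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                u, v = a[j], a[j + h]
                a[j], a[j + h] = u + v, u - v
        h *= 2
    return a

def naive_walsh(values):
    """Same transform computed guess by guess, quadratic in the table size."""
    return [sum(-v if parity(k & x) else v for x, v in enumerate(values))
            for k in range(len(values))]

def recover_key(counters):
    """Return the key guess whose transformed counter has the largest magnitude."""
    spectrum = fwht(counters)
    return max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))

# Toy experiment: counters perfectly correlated with the hidden 3-bit key 0b101.
secret = 0b101
counters = [-1 if parity(secret & x) else 1 for x in range(8)]
guess = recover_key(counters)
```

For a table of $2^k$ counters the butterfly needs about $k \cdot 2^k$ additions, compared with $2^{2k}$ for the guess-by-guess evaluation.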

We compute a cumulative counter.for each tuple .

Note that when counter is larger than some threshold , we store the tuple in the list of key candidates.

3.2. A Simple Example for Partition Point

As shown in Figure 2, we define partition point . For , in order to compute or by using Lemmas 2 and 3, we need to know which condition some bits in the ciphertext meets. Considering some definitions in Section 2, a cipher space is split into the direct sum , is complement space of the space

Therefore, the dimension of is 2. For each element , it can be rewritten as where indicates the value of and indicates the value of . To enumerate all possible values of , we can yield the following four tuples and corresponding according to Lemmas 2 and 3:

We take the second tuple to illustrate the principle. When it is observed that , the linear masks are determined. Therefore, the correlation can be deducted by using Lemma 2. Then, we have  = Span . Two bits and will be recovered by using Fast Walsh–Hadamard Transform.

3.3. Two Consecutive Modular Additions for Partition Point

In this Section, we introduce some cases of consecutive modular additions. As shown in Figure 4, we still define partition point . Note that and is related to some bits of , and . Formally,where means modular subtraction and . Therefore, we get a 5-dimensional subspace , which is a complement subspace of the subspace:where and . For each element , it can be rewritten as , where with . To enumerate all possible values of , we can yield tuples and the corresponding . In these tuples, , and the corresponding linear mask involves these bits:

For the tuples given above, 4 of them correspond to correlation , 8 of them correspond to correlation , and 12 of them correspond to correlation . Therefore, there are 24 tuples with high correlation in total. The tuples with the corresponding correlations are listed in the appendix.

4. Application to Chaskey

4.1. Description of Chaskey

Chaskey, proposed by Mouha et al., is a very efficient Message Authentication Code (MAC) algorithm for 32-bit microcontrollers. Chaskey takes a 128-bit key and processes a message in 128-bit blocks using a 128-bit permutation. The designers claim that Chaskey performs well in both software and hardware; the permutation is used in an Even-Mansour construction. A message is processed under the key into a tag. The message is split into blocks of 128 bits each, except for the last block, which may be incomplete. If the last block is incomplete, it is padded until its size is 128 bits. The permutation is based on the Addition-Rotation-XOR (ARX) design methodology. In our analysis, we focus on the permutation, which is shown in Figure 5. The computation of the tag for a message consisting of one 128-bit block, together with the subkey generated from the key, is shown in Figure 6.
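For readers without access to Figure 5, the following is a sketch of one round of the permutation on four 32-bit words, together with its inverse. The rotation amounts (5, 8, 16, 7, 13, 16) and the order of operations follow the published Chaskey specification as an assumption and should be checked against Figure 5. The function names and the `rounds` parameter are illustrative, and no official test vectors are reproduced here.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK32

def rotr32(x, r):
    return ((x >> r) | (x << (32 - r))) & MASK32

def chaskey_round(state):
    """One ARX round: four modular additions, six rotations, four XORs."""
    v0, v1, v2, v3 = state
    v0 = (v0 + v1) & MASK32; v1 = rotl32(v1, 5); v1 ^= v0; v0 = rotl32(v0, 16)
    v2 = (v2 + v3) & MASK32; v3 = rotl32(v3, 8); v3 ^= v2
    v0 = (v0 + v3) & MASK32; v3 = rotl32(v3, 13); v3 ^= v0
    v2 = (v2 + v1) & MASK32; v1 = rotl32(v1, 7); v1 ^= v2; v2 = rotl32(v2, 16)
    return (v0, v1, v2, v3)

def chaskey_round_inverse(state):
    """Undo chaskey_round by reversing each step in the opposite order."""
    v0, v1, v2, v3 = state
    v2 = rotr32(v2, 16); v1 ^= v2; v1 = rotr32(v1, 7); v2 = (v2 - v1) & MASK32
    v3 ^= v0; v3 = rotr32(v3, 13); v0 = (v0 - v3) & MASK32
    v3 ^= v2; v3 = rotr32(v3, 8); v2 = (v2 - v3) & MASK32
    v0 = rotr32(v0, 16); v1 ^= v0; v1 = rotr32(v1, 5); v0 = (v0 - v1) & MASK32
    return (v0, v1, v2, v3)

def permutation(state, rounds):
    """Apply the round function the given number of times."""
    for _ in range(rounds):
        state = chaskey_round(state)
    return state

def permutation_inverse(state, rounds):
    for _ in range(rounds):
        state = chaskey_round_inverse(state)
    return state

sample = (0x01234567, 0x89ABCDEF, 0xDEADBEEF, 0x0BADF00D)
round_trip = permutation_inverse(permutation(sample, 8), 8)
```

The round contains no constants, which is the property that makes rotational pairs a natural tool for this permutation.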

4.2. Rotational Differential-Linear Distinguishing Attack on Chaskey

In this section, we first recall the weak-key class introduced in Ref. [7]. Then a 9-round rotational differential-linear distinguisher is presented. Finally, an 8-round key recovery attack on Chaskey with the partitioning technique is explained in detail.

4.2.1. Weak Key Class of Rotational Attack

Because of the consecutive modular additions in Chaskey, we need to explore their properties to obtain more accurate formulas. This work has been done in Ref. [7]. Note that, in order to preserve the rotational property, we have to consider the key together with the subkey generated from it. To ensure that the subkey is the word-wise rotation of the key, the first 2 bits of every word are assumed to be zero. For the attack to succeed, every 32-bit word of the key must therefore have the form $00\ast$, where $\ast$ denotes 30 bits whose values we do not care about. The set of keys that satisfy this condition is called the weak-key class. Since each of the four key words has 30 free bits, the size of the weak-key class is $2^{120}$. The results that we use to calculate the probability precisely are listed in Table 4. Note that there are 2 single modular additions in every round of the permutation.

A 9-round distinguisher is divided into two parts. The first part is covered by the rotational attack and the second part by the linear attack. We discuss them separately; the details of the attack are presented as follows.

4.2.2. Rotational Differential Part

Here we set 7 rounds for the rotational part; the expected probability of this part is given in Table 4. As mentioned in Section 2, the probability of the rotational attack depends only on the number of (consecutive) modular additions. Since our key recovery uses the rotational attack on 6.5 rounds of Chaskey, the corresponding probability follows by counting the (consecutive) modular additions in those rounds.

4.2.3. Linear Part

Since is close to , we cannot use two high-correlation linear characteristics without specific relations directly. Sometimes is the best linear characteristic, but has very low correlation. In order to get as high as possible, we should search two linear characteristics simultaneously. This problem can be transformed to a SAT problem [19, 20] and a SAT solver can be used to search. Because more binary variables are needed to add constraints, we may search less round than a single linear characteristic. Therefore, we set 2 rounds for linear part and the correlation of linear part is . The total correlation is and the complexity is .

4.2.4. Experimental Verification

To verify the formula in Section 2.4, we use random data to check the correlation of 3-round Chaskey, with 1 round for the rotational part and 2 rounds for the linear part. The expected correlation of 3-round Chaskey is . The experimental result shows the correlation is . We also provide 3 code files: the first is a Python script that generates an SMT2 file, which can be solved by the SAT solver; the second is the SMT2 file itself; the third is a C++ file used to verify the correlation. The code is available at https://github.com/yuanyuan1024/Chaskey_code.

4.3. Partitioning Technique for the Permutation of Chaskey

Ref. [8] improves the results of differential-linear attack on permutation of Chaskey by using the partitioning technique. Inspired by this, we do a similar research on the rotational differential-linear distinguisher. We want to recover the last whitening key of Chaksey. Since the process of key recovery needs some time, a 9-round distinguisher can no longer be used. Therefore, we set 6.5 rounds for the rotational part, 0.5 rounds for linear part, and 1 round for key recovery. For linear part , we consider one trail for mask , that is,and the other trail for mask , that is,

Partition point and rotational partition point are shown in Figures 7 and 8, respectively. Then we begin to compute the correlation. Note that can be computed by either or and rot can be computed by either or ; the attacker uses one of the highest correlation partition. For example, we assume that has a higher correlation than and rot has a higher correlation than rot .

The total correlation can be computed in two steps. The first step is computing the correlation of . We experimentally evaluated the correlations of any combination, that is:which is the correlation of rotational differential-linear distinguishers. We use random data to calculate the empirical correlations. Subsequently, the correlation is . Then, we focus on some partition bits. Partition is defined as the following 11 bits:and partition is defined as the following 11 bits:where and . It is difficult to evaluate the actual correlations of all experimentally with a high significance. Therefore, we simply assume that these correlations are the same for each partition, which is for all and . The second step is computing the correlation of each partition. We enumerate all possible values of 11 bits for each partition according to Table 5  in the appendix and the results are shown in Table 6.

In this table, we can see that 1472 partitions have high correlation, and the average of the absolute values of those correlations is . Therefore, in this part, we can get the total correlation . The process of key recovery is the same as in Ref. [6]; we introduce it briefly. First, we choose rotational pairs and encrypt them to get . For every ciphertext pair we get, we guess all possible values of and identify for and get the corresponding and , and meanwhile set . Second, for , we compute by using the Fast Walsh–Hadamard Transform. The symbol is used to denote the threshold that we set. If , we save as a key candidate.

The success probability of key recovery can be computed by the following proposition.

Proposition 1 (see [8]). After running the algorithm of key recovery times, the probability that the correct key is among the key candidates iswhere is the correlation that we compute in linear part and is available pairs, and is the cumulative distribution function of the standard normal distribution.

4.3.1. Data and Time Complexities and Success Probability

In order to get the right pair, we repeat the process of key recovery times. For each iteration, we use pairs and available pairs . By using the threshold , we can get a success probability of 0.972 under the condition that the right pair is successfully obtained during iterations. On this success probability, the data complexity is and the time complexity is .

5. Conclusion

In this paper, we study the rotational differential-linear distinguisher of Chaskey’s permutation and partitioning technique for the linear part [21]. For the linear part, we use an automatic tool to search for the linear characteristic to get the highest correlation. As a consequence, we give a distinguishing attack over 9 rounds of Chaskey with complexity , and a key recovery attack over 8 rounds of Chaskey with complexity .

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by National Natural Science Foundation of China (Nos. 62072181, 62132005), NSFC-ISF Joint Scientific Research Program (No. 61961146004), Shanghai Trusted Industry Internet Software Collaborative Innovation Center.