Research Article  Open Access
Automatic Search for the Linear (Hull) Characteristics of ARX Ciphers: Applied to SPECK, SPARX, Chaskey, and CHAM64
Abstract
Linear cryptanalysis is an important method for evaluating the resistance of cryptographic primitives to key-recovery attacks. In this paper, we revisit the Walsh transformation for calculating the linear correlation of modular addition and propose an efficient algorithm to construct the input-output mask space of a specified correlation weight. By filtering out the impossibly large correlation weights in the first round, the search space of the first round can be substantially reduced. We introduce the concept of a combinational linear approximation table (cLAT) for modular addition with two inputs. When one input mask is fixed, the other input mask and the output mask can be obtained by the Splitting-Lookup-Recombination approach. We first split the n-bit fixed input mask into several subvectors and then find the corresponding bits of the other masks; in the recombination phase, pruning conditions can be applied. With this approach, a large number of search branches in the middle rounds can be pruned. Combining these optimization strategies with the branch-and-bound search algorithm improves the search efficiency for linear characteristics of ARX ciphers. Linear hulls for SPECK32/48/64 with a higher average linear potential (ALP) than existing results have been obtained. For the SPARX variants, an 11-round linear trail and a 10-round linear hull have been found for SPARX64, and a 10-round linear trail and a 9-round linear hull have been obtained for SPARX128. For Chaskey, a 5-round linear trail with a correlation of has been obtained. For CHAM64, 34/35-round optimal linear characteristics with a correlation of are found.
1. Introduction
The three components, modular addition, rotation, and XOR, constitute the basic operations in ARX cryptographic primitives [1]. In ARX ciphers, modular additions provide nonlinearity and diffusion with efficient software implementation and low dependence on computing resources. Compared with S-box-based ciphers, ARX ciphers do not need to store S-boxes in advance, which reduces storage requirements, especially on resource-constrained devices. In addition, ARX ciphers do not need to query S-boxes during encryption and decryption, which saves many lookup operations. Therefore, the ARX construction is preferred by many designers of lightweight ciphers. At present, many primitives use this construction, such as HIGHT [2], SPECK [3], LEA [4], Chaskey [5], SPARX [6], and CHAM [7].
Until now, the cryptanalysis of ARX ciphers is still not as well understood as that of S-box-based ciphers, and their security analysis is lagging behind [8]. Linear cryptanalysis is very important for evaluating the security margin of symmetric cryptographic primitives [9, 10]. The linear approximation tables of S-box-based ciphers can mostly be constructed and stored directly; however, the full linear approximation table of modular addition becomes too large to store when the word length is large.
For linear cryptanalysis of ARX ciphers, one crucial step is to calculate the linear correlation of modular addition. In [11–14], the linear properties of modular addition have been carefully studied. In [13], a method to calculate the linear correlation of modular addition recursively was proposed, but because the calculation is based on bit-level state transitions, its complexity is high. Based on this method, only the optimal linear characteristics for the variants of SPECK32 [15] and SPECK32/48 [16] were found.
In 2013, Schulte-Geers used CCZ-equivalence to improve the explicit formula for calculating the linear correlation of modular addition [17]. Based on the improved formula and a SAT solver model, Liu et al. obtained better linear characteristics for SPECK [18]: the optimal linear trails for SPECK32/48/64 with correlation close to the security boundary () were obtained, and the 9/10-round linear hull with a potential of for SPECK32 was obtained.
According to the position of the starting round of the search algorithm, there are currently three types of automatic search techniques for linear/differential cryptanalysis of ARX primitives: bottom-up techniques [15], top-down techniques [19–21], and the method of extending from the middle to the ends [22]. In these methods, the linear correlations are calculated directly from the input-output masks or obtained by looking up the precomputed partial linear approximation table (pLAT) [23]. For addition modulo $2^n$ with two inputs, the correlations need to be calculated from the known input-output masks. However, when searching for linear characteristics of ARX ciphers, the three-forked branches mean that, in most cases, only one input mask of a modular addition is determined, while the other input mask and the output mask u are unknown. Although all space of can be traversed in a trivial way, it is very time-consuming.
High-efficiency query operations can be achieved by constructing a linear approximation table of reasonable storage size. The pLAT stores the input-output masks whose linear correlation is greater than a certain threshold [15]. When a branch cannot be found in the pLAT and its correlation must be calculated from the input-output masks, the calculation significantly reduces search efficiency. Although heuristic methods can speed up the search, they cannot guarantee that the results are the best [24].
Therefore, constructing a search model based on the precise correlation calculation formula and realizing an efficient search for linear characteristics of ARX ciphers remain worthwhile problems. The motivation of this paper is to investigate how to speed up the search algorithm in order to search for linear (hull) characteristics of typical ARX ciphers.
1.1. Our Contributions
In this paper, we first revisit the linear correlation calculation of modular addition and introduce an algorithm to construct the input-output masks of a specific correlation weight. Then, we propose a novel concept, the combinational linear approximation table (cLAT), and introduce an algorithm to generate the lookup tables. Combining these two optimization algorithms, we propose an automatic algorithm to search for the optimal linear characteristics of ARX ciphers. In the first round, we can exclude the search space of the nonoptimal linear trails by increasing the correlation weight of each modular addition monotonically. In the middle rounds, the undetermined masks and the corresponding correlation weight of each modular addition can be obtained by querying the cLAT, and a large number of nonoptimal branches can be filtered out during the recombination phase. The algorithm can also be modified appropriately for heuristic search.
As applications, for SPECK32/48/64, the 9/11/14-round linear hulls are obtained. For SPARX64, the 11-round linear trail with a correlation of and a 10-round linear hull with an ALP of are found. For SPARX128, we can experimentally get the optimal linear trails of the first eight rounds, and we get a 10-round linear trail with a correlation of . For Chaskey, the linear characteristics covering more rounds are updated, and a 5-round linear trail with a correlation of is found. For CHAM64, we find a new 34-round optimal linear trail with a correlation of . A summary table is shown in Table 1.

1.2. Roadmap
This paper is organized as follows. We first present some preliminaries used in this paper in Section 2. In Section 3, we introduce the algorithm for constructing the space of input-output mask tuples, the algorithm for constructing cLAT, and the improved automatic search algorithm for linear cryptanalysis on ARX ciphers. In Section 4, we apply the new tool to several typical ARX ciphers. Finally, we conclude our work in Section 5.
2. Preliminaries
2.1. Notation
For addition modulo $2^n$, i.e., $x \boxplus y = (x + y) \bmod 2^n$, we use the symbols $\lll$ and $\ggg$ to indicate rotation to the left and right and $\ll$ and $\gg$ to indicate the left and right shift operation, respectively. The operator symbols $\oplus$, $\vee$, $\wedge$, $\|$, and $\neg$ represent XOR, OR, AND, concatenation, and bitwise NOT, respectively. For a vector x, $wt(x)$ represents its Hamming weight and $x_i$ is the $i$th bit of it. $\mathbf{0}$ is a zero vector.
2.2. Linear Correlation Calculation for Modular Addition
Let be the n dimensional vector space over binary field ; for Boolean function and , , the linear correlation between f and h can be denoted by
For modular addition , let () be the input masks, u be the output mask, and be the standard inner product. According to the definition of linear correlation, when , the linear approximation probability is defined as
Let , then the linear correlation of modular addition can be denoted by Walsh transformation, and thus,
Let , where ε is the bias. When , the linear approximation probability is . The linear correlation can be denoted by
We call as the correlation weight, and the linear square correlation can be denoted by
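The definitions above can be checked numerically for small word sizes. The following is a minimal brute-force sketch, assuming the standard convention that the correlation equals 2·Pr[u·(x ⊞ y) = v·x ⊕ w·y] − 1 over uniformly random inputs and that the correlation weight is −log2 of its absolute value. The function names `correlation` and `correlation_weight` are illustrative and do not come from the paper.

```python
from math import log2


def parity(x: int) -> int:
    """Parity of the set bits of x, i.e. the inner product with an all-ones mask."""
    return bin(x).count("1") & 1


def correlation(u: int, v: int, w: int, n: int) -> float:
    """Brute-force correlation of u.(x + y mod 2^n) with v.x XOR w.y.

    u is the output mask, v and w are the input masks (assumed naming).
    The cost is 2^(2n) evaluations, so this is only usable for small n.
    """
    mask = (1 << n) - 1
    total = 0
    for x in range(1 << n):
        for y in range(1 << n):
            s = (x + y) & mask
            bit = parity(u & s) ^ parity(v & x) ^ parity(w & y)
            total += 1 - 2 * bit  # +1 when the approximation holds, -1 otherwise
    return total / (1 << (2 * n))


def correlation_weight(cor: float):
    """Return -log2|cor|, or None when the correlation is zero."""
    return None if cor == 0 else -log2(abs(cor))


if __name__ == "__main__":
    cor = correlation(0b10, 0b10, 0b10, 2)
    print(cor, correlation_weight(cor))
```

Because the loop visits every input pair, this sketch is useful only as a reference against which faster formulas can be validated.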
For addition modulo $2^n$, it can be rewritten as , in which and for . The first-order approximation is . If all for , and , the high-order approximation is
In [13], Wallén introduced the theorem to calculate the linear correlation by analyzing the carry high-order approximation function recursively. In [12], based on the bit state transformation, the formula to calculate the correlation was given by the following theorem.
Theorem 1 (See [12]). For addition modulo $2^n$, let be the input masks and u be the output mask. Define an auxiliary vector , and each is an octal word, . Then, the linear correlation can be denoted by

where the row vector , the column vector , and each matrix is defined by

In [17], Schulte-Geers extended Theorem 1 and derived a fully explicit formula for the linear correlation calculation, given by Theorem 2.
Theorem 2 (See [17]). For addition modulo $2^n$ with input-output mask tuple , a vectorial Boolean function denotes the partial sum mapping:

Let , then the linear correlation can be denoted by

where is an indicator function for graph ; for n-bit vectors a and b, represents for .
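An explicit formula of this kind can be evaluated with a few bit operations. The sketch below assumes the form in which the Schulte-Geers result is commonly stated: with z the transposed partial-sum map applied to u ⊕ v ⊕ w, the correlation is nonzero exactly when every active bit of u ⊕ v and of u ⊕ w is also active in z, its magnitude is 2^(−wt(z)), and its sign is (−1)^((u⊕v)·(u⊕w)). Bit 0 is taken as the least significant bit, and the names `partial_sum_transpose` and `sg_correlation` are illustrative rather than taken from the paper.

```python
def popcount(x: int) -> int:
    """Hamming weight of a non-negative integer."""
    return bin(x).count("1")


def partial_sum_transpose(c: int, n: int) -> int:
    """Return z with z_i = XOR of the bits c_j for all j > i (bit 0 is the LSB)."""
    z = 0
    acc = 0  # running parity of the bits of c above the current position
    for i in range(n - 1, -1, -1):
        z |= acc << i
        acc ^= (c >> i) & 1
    return z


def sg_correlation(u: int, v: int, w: int, n: int) -> float:
    """Correlation of the mask tuple (u; v, w) for addition modulo 2^n.

    Assumed convention: u is the output mask, v and w are the input masks.
    """
    z = partial_sum_transpose(u ^ v ^ w, n)
    a, b = u ^ v, u ^ w
    if (a & ~z) or (b & ~z):  # a or b has an active bit where z is 0
        return 0.0
    sign = -1.0 if popcount(a & b) & 1 else 1.0
    return sign * 2.0 ** (-popcount(z))


if __name__ == "__main__":
    print(sg_correlation(0b100, 0b100, 0b100, 3))
```

The cost is linear in n per mask tuple, in contrast to the 2^(2n) input pairs visited by an exhaustive count.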
In iterative ciphers, the correlation of a single r-round linear trail is the product of the correlations of each round [25]. Assuming that there are additions modulo $2^n$ with two inputs in round, and are the input and output masks of the r-round linear trail, the correlation of the trail can be denoted by

The linear approximation of a linear hull represents the potential of all linear trails with the same input-output masks [26]. The averaged linear potential (ALP) can be counted by the following formula:

Assuming that the key k is selected uniformly from the key space K, the statistics of can be formulated as (13), where is the number of trails with correlation weight of . Let be the correlation weight of the linear trail whose input-output masks are chosen as the fixed input-output masks of the linear hull. is the upper bound to be searched, which should be chosen as a trade-off between the search time and the accuracy of :
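The bookkeeping described here is simple to express in code. The sketch below assumes that the weight of a trail is the sum of the correlation weights of all its modular additions, and that the averaged linear potential is estimated by summing the squared correlation 2^(−2·cw) over the trails found, grouped by weight. The exact form of formula (13) is not reproduced here, so the estimator is an assumption, and all names are illustrative.

```python
def trail_weight(round_weights):
    """Sum the correlation weights of every modular addition in every round.

    round_weights is a list of rounds, each round being a list of the
    correlation weights of its modular additions.
    """
    return sum(sum(weights) for weights in round_weights)


def trail_correlation_magnitude(round_weights):
    """Absolute correlation of the trail: the product of the per-round values."""
    return 2.0 ** (-trail_weight(round_weights))


def estimate_alp(trail_counts):
    """Estimate the averaged linear potential of a linear hull.

    trail_counts maps a correlation weight cw to the number of trails found
    with that weight and the hull's fixed input-output masks (assumed form).
    """
    return sum(count * 2.0 ** (-2 * cw) for cw, count in trail_counts.items())


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(trail_weight([[1, 2], [0, 3]]))
    print(estimate_alp({2: 1, 3: 2}))
```

Since only trails up to some maximum weight are enumerated in practice, an estimate of this form is a lower bound on the sum over all trails.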
2.3. Linear Properties of SPECK, SPARX, Chaskey, and CHAM
The SPECK family of ciphers was designed by the NSA in 2013 [3]. The SPARX family of ciphers was introduced by Dinu et al. at ASIACRYPT′16 [6]. In SPARX, the nonlinear ARX-box (SPECKEY) is obtained by modifying the round function of SPECK32. The linear mask propagation properties of the round function in SPECK and SPECKEY are shown in Figure 1. The rotation parameters are (7, 2) for SPECK32 and (8, 3) for the other variants.
If the input-output masks () and () of the modular additions in the two consecutive rounds of SPECK are known, the input and output masks of these two rounds can be denoted by Property 1.
Property 1. If () and () are given, then , , , , , and .
The linear layer functions [6] for SPARX64 and SPARX128 are shown in Figure 2. Due to the existence of the three-forked branches, the masks of the linear transformation layer have the following properties.
Property 2. For SPARX64, if the masks are transformed by the linear layer function , let , , then , , , and .
Property 3. For SPARX128, if the masks are transformed by the linear layer function , let , , then , , , , , , , and .
Chaskey is a MAC algorithm introduced by Mouha et al. at SAC′14 [5], and an enhanced variant was proposed in 2015 [27], which increases the number of permutation rounds from 8 to 12. The round function of the permutation is shown in Figure 3. The 4 modular additions are labeled by , respectively. The input mask and the output mask of the first round can be denoted by Property 4.
Property 4. For the permutation of Chaskey, if the input-output masks of each modular addition in the first round are (), , the corresponding correlation weight of each modular addition is , respectively. Hence, in the first round, , , , , , , , and . The corresponding correlation weight of the round function is .
CHAM is a family of lightweight block ciphers proposed by Koo et al. at ICISC′17, which blends the good designs of SIMON and SPECK [7]. The three variants of CHAM have two block sizes, i.e., CHAM64 and CHAM128. The linear mask propagation for the 4 consecutive rounds of CHAM is shown in Figure 4. If the input-output mask tuples of each modular addition of the first 4 rounds are given, the input and output masks of the first 4 rounds can be deduced by Property 5.
Property 5. For CHAM, if the input-output mask tuples () of each modular addition of the first 4 rounds are given, , the input and output masks of the first 4 rounds can be deduced as follows. , , , ; , , , ; , , , ; , , , ; , , , .
3. Automatic Search for the Linear Characteristics on ARX Ciphers
3.1. Input-Output Masks of Specific Correlation Weight
The number of input-output mask tuples in the first round is closely related to the complexity of the branch-and-bound search algorithm, and traversing all possible input masks of the first round results in high complexity. An alternative approach is to consider the possible correlation weights of the input-output masks and exclude those tuples that have a large correlation weight. However, although the correlation can be calculated by Theorem 2 when the input-output masks of a modular addition are fixed, a fixed correlation weight may correspond to multiple input-output mask tuples.
For addition modulo $2^n$, its maximum correlation weight is $n-1$, and the size of the total space S of all input-output mask tuples is . We can rank the correlation weights from 0 to $n-1$ and construct the input-output masks subspace corresponding to correlation weight , . Therefore, the total space S can be divided into n subspaces, i.e., .
Definition 1. Let be the input-output masks for a modular addition with nonzero correlation. Let us define an octal word sequence , where , for .
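For very small n, the partition of the mask space by correlation weight can be produced exhaustively. The sketch below is a plain enumeration for illustration, not the construction of Algorithm 1, and it assumes the commonly stated Schulte-Geers conditions to decide whether a tuple (u, v, w) has nonzero correlation and what its weight is. The names `weight_or_none` and `build_subspaces` are illustrative.

```python
from itertools import product


def weight_or_none(u: int, v: int, w: int, n: int):
    """Correlation weight of (u; v, w) for addition modulo 2^n, or None if zero.

    Scans from the MSB down. z_i is the parity of the bits of u^v^w above
    position i. A bit of u^v or u^w that is active where z_i = 0 forces the
    correlation to zero, and every position with z_i = 1 adds 1 to the weight.
    """
    z_bit, weight = 0, 0
    for i in range(n - 1, -1, -1):
        ui, vi, wi = (u >> i) & 1, (v >> i) & 1, (w >> i) & 1
        if z_bit == 0 and (ui != vi or ui != wi):
            return None
        weight += z_bit
        z_bit ^= ui ^ vi ^ wi
    return weight


def build_subspaces(n: int):
    """Return a list whose entry cw holds all tuples (u, v, w) of weight cw."""
    subspaces = [[] for _ in range(n)]  # weights range over 0 .. n-1
    for u, v, w in product(range(1 << n), repeat=3):
        cw = weight_or_none(u, v, w, n)
        if cw is not None:
            subspaces[cw].append((u, v, w))
    return subspaces


if __name__ == "__main__":
    for cw, tuples in enumerate(build_subspaces(3)):
        print(cw, len(tuples))
```

A useful sanity check is Parseval's relation: for each fixed output mask the squared correlations sum to 1, so the sum of 2^(−2·cw) over all tuples with nonzero correlation equals 2^n. The enumeration costs 2^(3n) steps, which is why a constructive method is needed for realistic word sizes.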
Definition 2. Let us define three sets that may belong to, i.e., , , and .
In Theorem 2, when the correlation of a modular addition is nonzero, the value distribution of the 3 consecutive bits in and the 3 consecutive words in Φ has the following relationships, shown in Observation 1.
Observation 1. Let and for , ; hence, . For , assuming when for or , they should have , on bit level, and it is equivalent to and . Since when and when ,

Hence, the value of depends on whether the bit positions of and are active. The least significant bits () of the input-output masks construct the value of , which is only related to the Hamming weight of , i.e., and . Therefore, if we get the Hamming weight distribution of , from the LSB to MSB direction, as is determined, can be obtained. Next, is determined, and and should be satisfied; hence, the possible values of can be obtained. Recursively, all values can be constructed as an octal word sequence from the LSB to MSB direction, subject to the above observation. Hence, the tuples of () can be generated from the elements in Φ. The process to construct the subspace is shown in Algorithm 1, marked as Const ().

3.2. Combinational Linear Approximation Table
For addition modulo $2^n$, the full LAT requires a storage size of $2^{3n}$; when n is too large, it becomes very difficult to store. To facilitate storage, an intuitive approach is to store only a part of the full LAT. An n-bit vector can be split into t subvectors of m bits, where $n = mt$. When each m-bit subvector is determined, the n-bit vector can be obtained by concatenation. This idea gives rise to the concept of the combinational LAT (cLAT).
Property 6. For and , they are equivalent to and .
Corollary 1. Let be the input-output masks of the modular addition with nonzero correlation, and let , , , and . Splitting the vectors , , , and into t subvectors, respectively, , , . Then, the correlation weight of the modular addition can be denoted by

when
Proof. is the sum of the Hamming weight of each subvector , so . For , the bit in can be denoted by . Let , , and the bit in should be . Hence, , when and are satisfied, i.e., and for .
If the m-bit subvector adjacent to is known, can be calculated by subvector tuple (, , ) and the lowest bit of . We call (, , ) a subblock, and we call the bit the connection status when used in the calculation of . Splitting the n-bit vector into t subvectors gives connection statuses , , and for the highest subvector, its connection status . Hence, for the highest subblock (, , and ), the Hamming weight of and the bit can be obtained, and, recursively, the Hamming weights of the remaining subvectors can also be obtained. Therefore, as connection status and , we can construct an m-bit lookup table for modular addition in advance and query the tables by indexing input-output masks and the connection status. In addition, the connection status for the next subblock can also be generated.
In the top-down search techniques for ARX ciphers, for the modular additions in the middle rounds, in most cases only one input mask is fixed (assuming it is ), and the other input mask and the output mask u are unknown. In the lookup tables, we need to look up all valid subvectors of () that correspond to nonzero correlation based on . The lookup table (called cLAT) is constructed by Algorithm 2, and it takes about 4 seconds on a 2.5 GHz CPU to generate the table with a storage size of about 1.2 GByte when .

3.3. Splitting-Lookup-Recombination
Algorithm 2 constructs an m-bit cLAT for addition modulo $2^n$, and this section describes how to use it. When one input mask is fixed, we can get another input mask, the output mask, and the corresponding correlation weight by the Splitting-Lookup-Recombination approach, which contains three steps.
3.3.1. Splitting
For addition modulo $2^n$, , if one of the two input masks is fixed, then split into t m-bit subvectors. The larger m is, the fewer cLAT lookups and bit concatenation operations are needed, but the more memory the table occupies; weighing these trade-offs, we choose .
3.3.2. Lookup
From the MSB to the LSB direction, query the subvectors of () that correspond to each subvector of and the corresponding correlation weights. For the highest m-bit subvector , its connection status , and looking up the cLAT gives and , the corresponding correlation weight , and the connection status for the subvector . Similarly, other subvectors of u and and the corresponding correlation weights can be obtained.
3.3.3. Recombination
All subvectors of u and can be obtained by lookup tables, and the nbit u and can be obtained by bit concatenation. The correlation weight of the modular addition is the sum of the weight of each subblock, i.e., .
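The three steps above can be sketched as follows. This is a simplified illustration, not Algorithm 2: it assumes the fixed input mask is v, that the connection status of a subblock is the parity of all bits of u ⊕ v ⊕ w above it (0 for the highest subblock), and that the table maps a pair (subvector of v, connection status) to every compatible (subvector of u, subvector of w, block weight, next status). The names `build_clat` and `recombine` are illustrative.

```python
from itertools import product


def build_clat(m: int):
    """clat[v_sub][b] lists (u_sub, w_sub, weight, next_b) with nonzero correlation."""
    clat = [[[], []] for _ in range(1 << m)]
    for v, b in product(range(1 << m), (0, 1)):
        for u, w in product(range(1 << m), repeat=2):
            status, weight, valid = b, 0, True
            for i in range(m - 1, -1, -1):  # scan the block from its MSB down
                ui, vi, wi = (u >> i) & 1, (v >> i) & 1, (w >> i) & 1
                if status == 0 and (ui != vi or ui != wi):
                    valid = False  # an active bit where the status is 0
                    break
                weight += status
                status ^= ui ^ vi ^ wi
            if valid:
                clat[v][b].append((u, w, weight, status))
    return clat


def recombine(v: int, n: int, m: int, clat):
    """Enumerate every (u, w, weight) with nonzero correlation for a fixed v."""
    t = n // m  # number of m-bit subvectors; n is assumed to be a multiple of m
    results = []

    def walk(k, status, u, w, weight):
        if k < 0:
            results.append((u, w, weight))
            return
        v_sub = (v >> (k * m)) & ((1 << m) - 1)  # splitting
        for u_sub, w_sub, cw, nxt in clat[v_sub][status]:  # lookup
            walk(k - 1, nxt, u | (u_sub << (k * m)),  # recombination
                 w | (w_sub << (k * m)), weight + cw)

    walk(t - 1, 0, 0, 0, 0)
    return results


if __name__ == "__main__":
    table = build_clat(2)
    print(len(recombine(0b1001, 4, 2, table)))
```

The table is built once per block size m and reused for every modular addition, so its construction cost is amortized over the whole search.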
When there are multiple modular additions in the round function, i.e., , the undetermined input mask and output mask of each modular addition need to be obtained separately by the Splitting-Lookup-Recombination approach. In the lookup phase, a total of lookup operations are required. The correlation weight of the round function is .
For each subvector , the possible minimum linear correlation weight corresponding to it can be calculated in advance by Algorithm 2, that is,
During the Recombination phase, the correlation boundary can be constructed by the associated weights that have been obtained and the possible minimum correlation weights, shown in Corollary 2.
Corollary 2. For addition modulo $2^n$, suppose one of the input masks is fixed, , , and . For any of nonzero correlation, the correlation boundary should satisfy
Proof. The correlation of modular addition is the product of the correlation of each subblock after splitting, i.e., . Let be the correlation weight of the subvector tuples that are obtained by lookup tables. The sum of the correlation weights of the subvector tuples that have not been looked up yet should satisfy .
Assuming the number of corresponding to each subvector is , the number of mask branches corresponding to the modular addition is . Corollary 2 can be used to filter out of large correlation weight.
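Pruning of this kind can be illustrated with a small depth-first enumeration. The sketch below assumes a lower bound of the following form: the weight already accumulated from the looked-up subblocks, plus the minimum possible weight of each subblock not yet looked up, must not exceed a weight budget. The minimum is taken over both connection statuses, which keeps the bound valid but possibly loose. The table layout and the names `build_clat`, `min_block_weights`, and `search_with_bound` are assumptions made for illustration.

```python
from itertools import product


def build_clat(m: int):
    """clat[v_sub][b] lists (u_sub, w_sub, weight, next_b) with nonzero correlation."""
    clat = [[[], []] for _ in range(1 << m)]
    for v, b in product(range(1 << m), (0, 1)):
        for u, w in product(range(1 << m), repeat=2):
            status, weight, valid = b, 0, True
            for i in range(m - 1, -1, -1):
                ui, vi, wi = (u >> i) & 1, (v >> i) & 1, (w >> i) & 1
                if status == 0 and (ui != vi or ui != wi):
                    valid = False
                    break
                weight += status
                status ^= ui ^ vi ^ wi
            if valid:
                clat[v][b].append((u, w, weight, status))
    return clat


def min_block_weights(clat):
    """Smallest block weight reachable for each v_sub, over both statuses."""
    return [min(e[2] for b in (0, 1) for e in entries[b]) for entries in clat]


def search_with_bound(v, n, m, clat, cmin, budget):
    """Enumerate (u, w, weight) with weight <= budget, pruning hopeless branches."""
    t = n // m
    subs = [(v >> (k * m)) & ((1 << m) - 1) for k in range(t)]
    below = [0] * (t + 1)  # below[k]: lower bound on the weight of blocks 0..k-1
    for k in range(t):
        below[k + 1] = below[k] + cmin[subs[k]]
    found = []

    def walk(k, status, u, w, weight):
        if k < 0:
            found.append((u, w, weight))
            return
        for u_sub, w_sub, cw, nxt in clat[subs[k]][status]:
            if weight + cw + below[k] > budget:
                continue  # even the best completion would exceed the budget
            walk(k - 1, nxt, u | (u_sub << (k * m)),
                 w | (w_sub << (k * m)), weight + cw)

    walk(t - 1, 0, 0, 0, 0)
    return found


if __name__ == "__main__":
    table = build_clat(2)
    bounds = min_block_weights(table)
    print(len(search_with_bound(0b1001, 4, 2, table, bounds, 1)))
```

Because the bound never overestimates the remaining weight, the pruned search returns the same tuples as filtering the full enumeration by weight, while visiting fewer branches.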
3.4. Improved Automatic Search Algorithm
In this section, we adopt a top-down technique [19–21], taking the first round as the starting point of the search process. In the first/second rounds, the input-output mask tuples of each modular addition, in order of monotonically increasing correlation weight, can be obtained by Algorithm 1. In the middle rounds, for each modular addition, u and can be obtained by the Splitting-Lookup-Recombination approach. Algorithm 3 takes SPECK as an example.
