Security and Communication Networks

Volume 2018 (2018), Article ID 7393401, 11 pages

https://doi.org/10.1155/2018/7393401

## On the Complexity of Impossible Differential Cryptanalysis

^{1}State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China

^{2}Data Assurance and Communication Security Research Center, Chinese Academy of Sciences, Beijing, China

^{3}School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China

^{4}NTT Secure Platform Laboratories, Tokyo, Japan

Correspondence should be addressed to Lei Hu; hu@is.ac.cn

Received 12 September 2017; Accepted 20 December 2017; Published 17 April 2018

Academic Editor: Jiankun Hu

Copyright © 2018 Qianqian Yang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Although the impossible differential attack is one of the best-known techniques in symmetric-key cryptanalysis, its subtlety and complexity make such attacks difficult to construct and verify, and the process is error-prone. We introduce a new set of notations for impossible differential analysis. These notations lead to unified formulas for estimating the data complexity of both ordinary impossible differential attacks and attacks employing multiple impossible differentials. The new formulas also reveal an interesting point: in most cases, the data complexity depends only on the form of the underlying distinguisher and not on how the differences at the beginning and the end of the distinguisher propagate in the outer rounds. We check the formulas against several examples, and the results all match. Since the existing estimation of the time complexity is flawed in some situations, we show under which condition the formula is valid and give a simple time complexity estimation for impossible differential attacks that is always achievable.

#### 1. Introduction

Impossible differential attack, introduced independently by Knudsen [1] and Biham et al. [2], is one of the best-known techniques in symmetric-key cryptanalysis [3–9]. Generally, in impossible differential cryptanalysis, we guess some key bits involved in the outer rounds of the target cipher, and a guess is rejected if it leads to an impossible differential in the inner rounds. Despite the extensive application of this technique, errors in the analysis are often discovered, and many papers in the literature contain subtle flaws. These flaws typically arise in the estimation of the time and data complexities rather than in the distinguisher: as with the search for differential and linear characteristics [10–13], the methodology of searching for impossible differentials is fairly mature, and automatic tools are available [14–17]. To relieve the difficulty of the complexity analysis, Boura et al. presented generic complexity analysis formulas along with new ideas for optimizing impossible differential cryptanalysis [18]. However, at FSE 2016, Derbez [19] identified some flaws in the formulas for the time complexity estimation given in [18] and presented concrete examples for which the time complexities estimated with these formulas are not achievable.
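The guess-and-reject principle described above can be illustrated on a toy example. The following is a minimal sketch, not an attack on any real cipher: the 4-bit "cipher" (one S-box as the inner part, then a second S-box followed by a key addition as the outer round), the function names, and the use of the PRESENT S-box are all assumptions made for illustration. The impossible differential is found by exhaustive search over the S-box, and a key guess is discarded as soon as partial decryption of some ciphertext pair produces the impossible output difference.

```python
# Toy illustration only: a 4-bit "cipher" built from the PRESENT S-box.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
INV_SBOX = [SBOX.index(y) for y in range(16)]


def find_impossible_differential():
    """Return a nonzero (alpha, beta) such that SBOX never maps alpha to beta."""
    for alpha in range(1, 16):
        reachable = {SBOX[x] ^ SBOX[x ^ alpha] for x in range(16)}
        for beta in range(1, 16):
            if beta not in reachable:
                return alpha, beta
    raise ValueError("no impossible differential found")


def encrypt(p, key):
    """Inner part: one S-box. Outer round: a second S-box, then key addition."""
    return SBOX[SBOX[p]] ^ key


def sieve_keys(ciphertext_pairs, beta):
    """Discard every key guess that yields difference beta after the inner part."""
    survivors = set(range(16))
    for c1, c2 in ciphertext_pairs:
        for guess in list(survivors):
            # Peel off the outer round under the guessed key.
            if INV_SBOX[c1 ^ guess] ^ INV_SBOX[c2 ^ guess] == beta:
                survivors.discard(guess)
    return survivors


def run_attack(secret_key):
    """Encrypt all pairs with input difference alpha, then sieve the key guesses."""
    alpha, beta = find_impossible_differential()
    pairs = [(encrypt(p, secret_key), encrypt(p ^ alpha, secret_key))
             for p in range(16)]
    return sieve_keys(pairs, beta)


if __name__ == "__main__":
    print(sorted(run_attack(secret_key=0x9)))
```

In this sketch the correct key can never be discarded, because under the correct guess the partially decrypted difference is a genuine output difference of the inner S-box and therefore never equals the impossible value. Wrong guesses are removed only when some pair happens to hit that value, which is why the number of pairs (the data complexity) governs how many wrong keys remain.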

Our work follows that of Boura, Naya-Plasencia, Suder, and Derbez at ASIACRYPT 2014, FSE 2016, and ESC 2017. We investigate further some aspects of the complexity estimation of impossible differential attacks that have not been explored or stated explicitly in previous work.

First, we introduce a new set of notations for impossible differential analysis. With these notations, there is no difference between ordinary impossible differentials and multiple impossible differentials. Under some reasonable assumptions (the same assumptions were made implicitly in [18, 19]), we rewrite the data complexity formula of [18] in a form that no longer involves the numbers of bit-conditions (the $c_{in}$ and $c_{out}$ notations in [18]) that have to be verified to follow some specified behavior in the outer rounds of a target cipher. Moreover, the formulas derived with the new notations reveal a very interesting and somewhat strange point: in most cases, the data complexity depends only on the form of the underlying distinguisher and not on how the differences at the beginning and the end of the distinguisher propagate in the outer rounds. That is, in most cases, the data complexity is completely determined by the impossible differential distinguisher employed in the attack. Hence, estimating the data complexity with the new formulas is much easier and more straightforward than with those of [18].

Second, since Derbez showed concrete examples where the time complexity formula of Boura et al. is invalid, we investigate the condition under which their estimation is correct. We prove that the time complexity of the key-sieving process given by Boura et al. is not only achievable but also optimal if the key bits involved in the outer rounds are independent. Using the early abort technique presented by Lu et al. [20, 21], we describe in detail the procedure that achieves this optimal result.

Finally, we give a formula to estimate the time complexity of the key-sieving process in the case where the key bits involved in the outer rounds are not independent. The estimation is not guaranteed to be equal to the complexity of the optimal attack as discussed by Derbez in [19], but it is always achievable. Therefore, the formula serves to give a rough estimation of an impossible differential attack without diving into complicated calculations and time-consuming search algorithms, which should be very useful in fast prototyping in cryptanalysis.

Section 2 presents a new set of notations for impossible differential analysis. Section 3 briefly reviews impossible differential attacks. In Section 4, we modify the data complexity formula so that it depends on only a few parameters and unifies multiple impossible differential attacks with ordinary impossible differential attacks. In Section 5, we prove that the time complexity formula is achievable and optimal when the key bits are independent and give a rough estimation formula for the case where they are not. Finally, we conclude the paper in Section 6.

#### 2. Notations

Let be the finite field of two elements. For a set , its number of elements is denoted by , and let . Also, for an integer , let .

In addition, we use notations similar to regular expressions to represent sets of bit strings. For example, is equivalent to the set , is equivalent to , and is equivalent to , which is alternatively denoted by , where the subscript gives the number of occurrences of the symbol concerned.

*Definition 1. *Let be a block cipher and , ; if , for all , we call an impossible differential of , which is denoted by . More generally, let , ; we call an impossible differential, denoted by , if for any , , such that , and for any , , such that .

*Note that this notation is different from the notation of impossible differential we typically see in the literature, since in our notation, it is possible that and , such that is not an impossible differential.*

Let , where is simply written as if is clear from the context. Then we have , and in the special case for any and , . It is worth mentioning, with the new notation, that we can unify ordinary impossible differentials and multiple impossible differentials in impossible differential cryptanalysis.

For example, if , , and , with the new notation, we call an impossible differential, and .
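Definition 1 can be checked by exhaustive search on a toy permutation. The sketch below rests on several assumptions: the function names are hypothetical, the 4-bit rotation used as the "cipher" is chosen only because its differences are easy to follow by hand, and the set-level test reads the definition as "every input difference in the first set is impossible with at least one output difference in the second set, and vice versa", which is how the note after Definition 1 suggests it should be understood.

```python
def is_impossible(cipher, n, alpha, beta):
    """True if cipher(x) ^ cipher(x ^ alpha) never equals beta over all n-bit x."""
    return all(cipher(x) ^ cipher(x ^ alpha) != beta for x in range(1 << n))


def is_impossible_set(cipher, n, inputs, outputs):
    """Set-level test (assumed reading of Definition 1).

    Every input difference must be impossible with at least one output
    difference, and every output difference with at least one input difference.
    """
    every_input = all(
        any(is_impossible(cipher, n, a, b) for b in outputs) for a in inputs
    )
    every_output = all(
        any(is_impossible(cipher, n, a, b) for a in inputs) for b in outputs
    )
    return every_input and every_output


def count_impossible_pairs(cipher, n, inputs, outputs):
    """Number of (alpha, beta) pairs in inputs x outputs that are impossible."""
    return sum(
        is_impossible(cipher, n, a, b) for a in inputs for b in outputs
    )


def rotl4(x):
    """Toy 4-bit 'cipher': rotate left by one bit (linear, for illustration)."""
    return ((x << 1) | (x >> 3)) & 0xF


if __name__ == "__main__":
    print(is_impossible(rotl4, 4, 0b0001, 0b0100))
    print(is_impossible_set(rotl4, 4, {0b0001, 0b0010}, {0b0010, 0b0100}))
```

With the rotation, an input difference always maps to its rotated value, so the pair (0001, 0100) is impossible while (0001, 0010) is not. The sets {0001, 0010} and {0010, 0100} still pass the set-level test, which illustrates how a set-level impossible differential may contain individual pairs that are not impossible.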

*Definition 2. *Let , and the structure derived from is defined to be the set of all -bit strings such that for all . Given a bit string , is defined to be the set .

For example, if and , then and , . Recall that, in differential-type cryptanalysis, if we want to obtain many pairs of data whose differences lie in a set , we typically first prepare a structure from which the needed pairs are generated. Figure 1 shows the relationship among , , and .
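The way a structure yields pairs with the desired differences can be sketched as follows. This is an illustration under stated assumptions rather than a transcription of Definition 2: the set of differences is assumed to consist of all differences confined to a set of active bit positions (given as the hypothetical parameter `mask`), and the structure is taken to be all strings that agree with a fixed `base` value on the inactive bits. The function names are likewise hypothetical.

```python
from itertools import combinations


def build_structure(base, mask):
    """All values that agree with base outside mask and vary freely inside it."""
    fixed = base & ~mask
    values = []
    sub = mask
    while True:
        # sub runs over every subset of the active bit positions.
        values.append(fixed | sub)
        if sub == 0:
            break
        sub = (sub - 1) & mask
    return sorted(values)


def pairs_from_structure(structure):
    """All unordered pairs of distinct elements, with their XOR difference."""
    return [(x, y, x ^ y) for x, y in combinations(structure, 2)]


if __name__ == "__main__":
    structure = build_structure(base=0b1000, mask=0b0011)
    for x, y, diff in pairs_from_structure(structure):
        print(f"{x:04b} {y:04b} diff={diff:04b}")
```

Under these assumptions, a structure with s active bits contains 2^s values and yields 2^s(2^s - 1)/2 unordered pairs, that is, roughly 2^(2s-1) pairs, each with a nonzero difference confined to the active bits. This is why structures are the usual way to obtain many useful pairs from comparatively few chosen plaintexts.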