Hao Chen, Tao Wang, Fan Zhang, Xinjie Zhao, Wei He, Lumin Xu, Yunfei Ma, "Stealthy Hardware Trojan Based Algebraic Fault Analysis of HIGHT Block Cipher", Security and Communication Networks, vol. 2017, Article ID 8051728, 15 pages, 2017. https://doi.org/10.1155/2017/8051728
Stealthy Hardware Trojan Based Algebraic Fault Analysis of HIGHT Block Cipher
HIGHT is a lightweight block cipher which has been adopted as a standard block cipher. In this paper, we present a bit-level algebraic fault analysis (AFA) of HIGHT, where the faults are injected by a stealthy hardware Trojan (HT). The fault model in our attack assumes that the adversary is able to insert a HT that flips a specific bit of a certain intermediate word of the cipher once the HT is activated. The HT is realized with merely 4 registers and has an extremely low activation rate of about 0.000025. We show that the optimal location for inserting the designed HT can be efficiently determined by AFA in advance. Finally, a method is proposed to represent the cipher and the injected faults with a merged set of algebraic equations, and the master key can be recovered by solving the merged equation system with an SAT solver. Our attack, which fully recovers the secret master key of the cipher in 12572.26 seconds, requires only three activations of the designed HT. To the best of our knowledge, this is the first Trojan attack on HIGHT.
Resource-constrained devices such as RFID tags and smart cards are pervasively used in daily activities such as intelligent transportation, modern logistics, and food safety [1, 2]. As these devices have inherent constraints in storage space, computation ability, and power supply, modern cryptographic primitives like DES, AES, or RSA are difficult to deploy on them. Hence, lightweight cryptography, which aims at designing and implementing security primitives that fit the needs of low-resource devices, has received extensive research attention. In particular, the lightweight block cipher is one of the most studied primitives and has been extensively explored in numerous prior papers. Many lightweight block ciphers exist, such as PRESENT, LED, SIMON, mCrypton, and HIGHT [8, 9].
A hardware Trojan is a circuit maliciously inserted into an integrated circuit (IC) that typically functions to deactivate the host circuit, change its functionality, or provide covert channels through which sensitive information can be leaked [10, 11]. Hardware Trojans can be implemented as hardware modifications to ASICs, commercial-off-the-shelf (COTS) parts, microprocessors, microcontrollers, network processors, or digital-signal processors (DSPs) and can also be implemented as firmware modifications to, for example, FPGA bitstreams. An adversary is expected to make a Trojan stealthy in nature, that is, able to evade detection by methods such as postmanufacturing test, optical inspection, or side-channel analysis [13–15]. Due to the outsourcing trend in semiconductor design and fabrication, hardware Trojan attacks have emerged as a major security concern for ICs.
Differential Fault Analysis (DFA) was one of the earliest techniques invented to attack block ciphers by provoking a computational error. DFA retrieves the secret key from information about the characteristics of the injected faults and the difference between the correct and faulty ciphertexts. However, since DFA relies on manual analysis, it has inherent limitations in scenarios of very high complexity, for example, when faults are located in deeper rounds of the cipher or when the exact location of the injected faults in a deep round is unknown.
At eSmart 2010, Courtois and Pieprzyk combined algebraic cryptanalysis with fault analysis to propose a more powerful fault analysis technique called algebraic fault analysis (AFA). The basic idea of AFA is to convert both the cipher and the injected faults into algebraic equations and to recover the secret key with automated solvers such as SAT solvers, instead of the manual analysis of fault propagation used in DFA. This makes it easier to extend AFA to deep rounds and to different ciphers and fault models. AFA has been successfully used to improve DFA on stream ciphers such as Trivium and Grain and on block ciphers such as AES, LED [22, 23], KASUMI, and Piccolo.
HIGHT is a lightweight block cipher that has attracted a lot of attention because it is built only from ARX operations (modular addition, bitwise rotation, bitwise shift, and XOR), which gives it high hardware performance compared to other block ciphers. HIGHT has been selected as a standardized block cipher by the Telecommunications Technology Association (TTA) of Korea and in ISO/IEC 18033-3.
It is noted that both DFA and AFA require high precision in fault injection in terms of location and timing. In practice, low-cost fault injection techniques such as lowering the supply voltage or clock manipulation do not achieve the required accuracy, while highly precise methods such as pinpointed laser irradiation of the desired fault sites are difficult to perform and require costly equipment. However, if the adversary is able to insert a hardware Trojan (HT) into the underlying cryptographic hardware, AFA can be easily achieved. A well-designed HT can precisely inject any type of fault to enable AFA and can evade detection by having low cost and a low activation rate.
In addition, since the design of lightweight block ciphers is compact, especially for HIGHT, whose construction is based only on ARX operations, it is simple to represent the cipher as a set of algebraic equations. It is also easier to implant hardware Trojans into devices that adopt such lightweight algorithms, because these devices are normally used in RFID systems and composed of various IPs, and they are typically designed and manufactured by offshore design houses or foundries. In theory, any party involved in the design or manufacturing stages can alter the circuits for malicious purposes, and thus these circuits are more vulnerable to algebraic fault attacks which inject faults by triggering a HT.
In this paper, we show that the lightweight block cipher HIGHT is prone to algebraic fault analysis, which is made feasible by a stealthy HT. The proposed analysis of HIGHT is implemented on a SASEBO-GII board carrying a 65 nm Virtex-5 FPGA and recovers the 128-bit secret master key with only 3 faults. The main contributions of the paper are summarized as follows:
(1) We design a stealthy FSM-based HT with an overhead of 4 flip-flops, which is a 1.63% additional cost in flip-flops for HIGHT implemented on the SASEBO-GII board, and with an extremely low activation rate of about 0.000025. The HT enables the adversary to induce a single-bit fault precisely in both location and time when it is activated and thus makes the bit-level AFA efficient.
(2) Some properties of faults are given to maximize the utilization of the fault leakages and show that the adversary can predetermine the optimal location for the HT by AFA to maximize the attack efficiency.
(3) A very simple and efficient method is proposed to describe HIGHT and the injected faults as a merged set of algebraic equations and transform the problem of searching for the secret master key into solving the merged equation system with an SAT solver.
(4) It is proven that the lower bound for the number of the required faults is 3 and an efficient distinguisher is proposed to uniquely determine the secret master key.
The rest of this paper is organized as follows. Section 2 introduces the related works. Section 3 lists the notations used in the paper and briefly describes the HIGHT algorithm and the overview of the attack. Section 4 presents some important properties of the faults and the details of the HT are given in Section 5. Then, Section 6 describes our attack on HIGHT and the experimental results are shown in Section 7. Finally, Section 8 concludes the paper.
2. Related Work
Since the proposal of HIGHT, there have been many studies on its security. The preliminary security analysis, conducted during the HIGHT design process, assesses the cipher with respect to different cryptanalytic attacks such as differential cryptanalysis, related-key attack, saturation attack, and algebraic attack, and the designers claim that at least 20 rounds of HIGHT are secure against these attacks. In 2007, however, Lu presented the first public cryptanalysis of reduced versions of HIGHT, which indicates that the reduced versions are less secure than the designers claimed. In 2009, Lu's results were improved by Özen et al., who presented an impossible differential attack on 26-round HIGHT and a related-key impossible differential attack on 31-round HIGHT. At CANS 2009, Zhang et al. presented a 22-round saturation attack on HIGHT including full whitening keys with $2^{62.04}$ chosen plaintexts and $2^{118.71}$ 22-round encryptions. The first attack on full HIGHT was proposed by Koo et al. at ICISC 2010 using a related-key rectangle attack based on a 24-round related-key distinguisher, with a data complexity of $2^{57.84}$ chosen plaintexts and a time complexity of $2^{123.17}$ encryptions. The second attack on full HIGHT was proposed by Hong et al. at ICISC 2011: a biclique cryptanalysis that recovers the 128-bit secret master key with a computational complexity of $2^{126.4}$, faster than exhaustive search. Lee et al. present the first DFA against HIGHT. In this attack, the authors claim that the full secret master key of HIGHT can be recovered in a few minutes or seconds with a success rate of 96%, computational complexity of , and memory complexity of by injecting 12 faults based on a random byte fault model.
The main idea of this attack is to collect pairs of correct and faulty ciphertexts by injecting adequate faults and use them to distinguish where the faults are injected. Once the fault locations are determined, a number of equations can be built based on manual analysis of the fault propagations to filter out the wrong subkey candidates and thus to recover the secret master key. However, since the adversary analyzes fault propagations and filters out wrong subkey candidates manually, the fault leakages are not maximally utilized and the attack can be further improved.
In this paper, we elaborate an algebraic fault analysis of HIGHT with a stealthy HT. The fault model we choose in this attack is the one in which the adversary is assumed to inject a single-bit fault precisely in both location and the time of the disturbance by a HT which is activated just by choosing certain plaintexts. The attack converts both the cipher and the injected faults into algebraic equations automatically and recovers the secret master key with an SAT solver. The attack recovers the secret master key with a success rate of 96% within 12,572.26 seconds and requires only 3 faults. We summarize our results as well as the major previous results in Table 1.
Imp.: impossible, Diff.: differential, Rel.: related, Rec.: rectangle, Weak.: weak key, CP: chosen plaintext, EN: encryptions, Com.: computational, and Mem.: memory.
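In the rest of the paper, the following notations are used:
1. $\boxplus$: $x \boxplus y$ means $(x + y) \bmod 2^8$, where $0 \le x, y < 2^8$.
2. $\oplus$, $\|$: bitwise XOR and concatenation operations.
3. $A^{\lll s}$: $s$-bit left rotation of an 8-bit value $A$.
4. $*$: sign for denoting faulty ciphertext or intermediate values.
5. $P$, $C$, $C^{*}$: the 64-bit plaintext, ciphertext, and faulty ciphertext.
6. $MK = MK_{15} \| \cdots \| MK_0$: the 16-byte master key.
7. $WK_i$: the whitening keys, $0 \le i \le 7$.
8. $SK_i$: the round keys, $0 \le i \le 127$.
9. $X_r = X_{r,7} \| \cdots \| X_{r,0}$: the 64-bit input of the $(r+1)$th round, $0 \le r \le 31$.
10. $X_{r,i}^{j}$: the $j$th bit of $X_{r,i}$, $0 \le j \le 7$.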
3.2. Brief Description of HIGHT Cipher
HIGHT is a lightweight block cipher with 64-bit block length and 128-bit key length. The encryption process of HIGHT is as follows.
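(1) The KeySchedule is performed to generate 8 bytes of whitening keys $WK_0, \ldots, WK_7$ and 128 bytes of subkeys $SK_0, \ldots, SK_{127}$:
$$WK_i = MK_{i+12} \ (0 \le i \le 3), \qquad WK_i = MK_{i-4} \ (4 \le i \le 7),$$
$$SK_{16i+j} = MK_{(j-i) \bmod 8} \boxplus \delta_{16i+j}, \qquad SK_{16i+j+8} = MK_{((j-i) \bmod 8)+8} \boxplus \delta_{16i+j+8} \quad (0 \le i, j \le 7),$$
where $\delta_0, \ldots, \delta_{127}$ are the 7-bit LFSR constants specified in [8, 9].
(2) The InitialTransformation is performed to transform the 64-bit plaintext $P$ to the input $X_0$ of the first round by using the four whitening key bytes $WK_0$, $WK_1$, $WK_2$, and $WK_3$:
$$X_{0,0} = P_0 \boxplus WK_0,\quad X_{0,1} = P_1,\quad X_{0,2} = P_2 \oplus WK_1,\quad X_{0,3} = P_3,$$
$$X_{0,4} = P_4 \boxplus WK_2,\quad X_{0,5} = P_5,\quad X_{0,6} = P_6 \oplus WK_3,\quad X_{0,7} = P_7. \tag{5}$$
(3) For $r = 0, \ldots, 31$, RoundFunction is performed to transform $X_r$ into $X_{r+1}$ as follows:
$$X_{r+1,1} = X_{r,0},\quad X_{r+1,3} = X_{r,2},\quad X_{r+1,5} = X_{r,4},\quad X_{r+1,7} = X_{r,6},$$
$$X_{r+1,0} = X_{r,7} \oplus (F_0(X_{r,6}) \boxplus SK_{4r+3}),$$
$$X_{r+1,2} = X_{r,1} \boxplus (F_1(X_{r,0}) \oplus SK_{4r}),$$
$$X_{r+1,4} = X_{r,3} \oplus (F_0(X_{r,2}) \boxplus SK_{4r+1}),$$
$$X_{r+1,6} = X_{r,5} \boxplus (F_1(X_{r,4}) \oplus SK_{4r+2}).$$
The two auxiliary functions $F_0$ and $F_1$ are defined as follows:
$$F_0(x) = x^{\lll 1} \oplus x^{\lll 2} \oplus x^{\lll 7}, \qquad F_1(x) = x^{\lll 3} \oplus x^{\lll 4} \oplus x^{\lll 6}.$$
(4) The FinalTransformation transforms $X_{32}$ into the ciphertext $C$:
$$C_0 = X_{32,1} \boxplus WK_4,\quad C_1 = X_{32,2},\quad C_2 = X_{32,3} \oplus WK_5,\quad C_3 = X_{32,4},$$
$$C_4 = X_{32,5} \boxplus WK_6,\quad C_5 = X_{32,6},\quad C_6 = X_{32,7} \oplus WK_7,\quad C_7 = X_{32,0}.$$
For a complete description of HIGHT, the reader is referred to [8, 9].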
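The four steps above can be sketched in a few lines of Python. This is a minimal sketch written from the public HIGHT description, not the implementation used in this paper; the byte-indexing convention (`p[i]` holds $P_i$, `mk[i]` holds $MK_i$) and all function names are assumptions, and the code has not been checked here against the published test vectors.

```python
def rotl8(x, s):
    """Rotate the 8-bit value x left by s bits."""
    return ((x << s) | (x >> (8 - s))) & 0xFF


def f0(x):
    return rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 7)


def f1(x):
    return rotl8(x, 3) ^ rotl8(x, 4) ^ rotl8(x, 6)


def delta_constants():
    """128 seven-bit constants from the LFSR s[i+6] = s[i+2] ^ s[i-1]."""
    s = [0, 1, 0, 1, 1, 0, 1]              # s_0 .. s_6
    while len(s) < 134:                     # s_133 is needed for delta_127
        i = len(s) - 6
        s.append(s[i + 2] ^ s[i - 1])
    return [sum(s[i + k] << k for k in range(7)) for i in range(128)]


def key_schedule(mk):
    """mk[i] is master key byte MK_i. Returns (whitening keys, subkeys)."""
    wk = [mk[i + 12] for i in range(4)] + [mk[i - 4] for i in range(4, 8)]
    d = delta_constants()
    sk = [0] * 128
    for i in range(8):
        for j in range(8):
            sk[16 * i + j] = (mk[(j - i) % 8] + d[16 * i + j]) & 0xFF
            sk[16 * i + j + 8] = (mk[(j - i) % 8 + 8] + d[16 * i + j + 8]) & 0xFF
    return wk, sk


def encrypt(p, mk):
    """Encrypt the 8-byte block p (p[i] is P_i) under the 16-byte key mk."""
    wk, sk = key_schedule(mk)
    # InitialTransformation
    x = [(p[0] + wk[0]) & 0xFF, p[1], p[2] ^ wk[1], p[3],
         (p[4] + wk[2]) & 0xFF, p[5], p[6] ^ wk[3], p[7]]
    # 32 iterations of RoundFunction
    for r in range(32):
        x = [
            x[7] ^ ((f0(x[6]) + sk[4 * r + 3]) & 0xFF),
            x[0],
            (x[1] + (f1(x[0]) ^ sk[4 * r])) & 0xFF,
            x[2],
            x[3] ^ ((f0(x[2]) + sk[4 * r + 1]) & 0xFF),
            x[4],
            (x[5] + (f1(x[4]) ^ sk[4 * r + 2])) & 0xFF,
            x[6],
        ]
    # FinalTransformation
    return [(x[1] + wk[4]) & 0xFF, x[2], x[3] ^ wk[5], x[4],
            (x[5] + wk[6]) & 0xFF, x[6], x[7] ^ wk[7], x[0]]
```

Note that the odd-indexed plaintext bytes enter the first round unchanged, which is what later allows the adversary to control the trigger inputs directly through the plaintext.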
3.3. Overview of the Attack
As illustrated in Figure 1, our attack consists of four steps.
(1) Inserting the Designed HT in a Selected Location. The task of this step is to design a HT and insert it in the cipher chip. The optimal location for inserting a HT should be in a deeper round so that the fault involves all master key bytes during its propagation. The inserted HT should also escape detection by having low cost and an extremely low activation rate.
(2) Constructing Boolean Equations for the Cipher. In this step, the target cipher and its key schedule are described by a set of Boolean equations, which contain unknowns (master key bits, whitening key bits, subkey bits, and intermediate variables) and constants (plaintext and ciphertext bits). The most important and difficult part of this step for HIGHT is to describe nonlinear operations like addition mod $2^8$ and complicated linear functions like $F_0$ and $F_1$.
(3) Constructing Boolean Equations for the Faults. After the fault injections, the faults are also represented with a set of Boolean equations. The more secret variables the fault equations contain, the more master key bits can be recovered. Therefore, the key point of this step is to make the fault equations contain, in an efficient and simple way, as many as possible of the secret variables that were involved during the fault propagation.
(4) Solving the Algebraic Equation System. The problem of searching for the secret master key is now transformed into solving the merged system formed by the cipher equations and the fault equations. Many automatic tools [25, 34–37] can be leveraged.
4. Some Properties of the Faults
This section is devoted to presenting the fault properties, which are helpful to our attack. For the sake of simplicity, we denote the deduction of from by equation by
Property 1. Assume that a fault was induced to , then defineas the set of subkey bytes and whitening key bytes that were involved by the fault during its propagation from round to FinalTransformation, where , , , , and . Then we have
Proof. Without loss of generality, we assume that the fault was induced to .
(1) For , , and , the fault will propagate to and in the next round as shown in Figure 2; thus we have .
When the fault was injected in , the fault will propagate to and in the final round. Then, we have .
(2) For , , and , the fault will only propagate to in the next round as shown in Figure 3; thus we have .
In the similar way, the fault will propagate to and in the next round for . Then, we have .
Property 2. Assume that a fault was induced to , then define as the set of the master key bytes that were involved during the propagation of the fault. Then we have the following conclusion.
Proof. From Section 3.2, for the FinalTransformation of HIGHT, we have the following formula:For the KeySchedule of HIGHT, we have the following formula:For (14)~(15), we haveThus, we have and . Moreover, according to Property 2, then we have the desired conclusions which are shown in Table 2.
Note that, to fully recover the master key, the entire master key bytes must be included in the merged equation system. That is, the master key bytes can possibly be recovered only for the case that .
Property 3. Given that a single-bit fault is inserted in , the fault propagation paths are shown in Figure 4. The intermediate words , , , , , and are all corrupted that , , , , and . Then the intermediate words are included in More generally, if we use 8-bit words and to denote the inputs of modular addition, and to denote the difference of the inputs, and to denote the corresponding output difference in (18), then the above two equations can be simplified as That is, the intermediate words, whitening keys and subkeys can be recovered by solving the two equations.
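To illustrate why a pair of input and output differences across a modular addition leaks information about the operands, the following sketch enumerates all operand pairs compatible with an observed difference. It assumes the relation has the usual differential form $(x \boxplus k) \oplus ((x \oplus \Delta x) \boxplus k) = \Delta z$; this form, the function names, and the example values are assumptions made for illustration, and exhaustive search stands in for the SAT solver.

```python
def addition_candidates(dx, dz):
    """Return all (x, k) with ((x ^ dx) + k) ^ (x + k) == dz on 8-bit words."""
    return {
        (x, k)
        for x in range(256)
        for k in range(256)
        if (((x ^ dx) + k) & 0xFF) ^ ((x + k) & 0xFF) == dz
    }


def surviving_candidates(x_true, k_true, input_diffs):
    """Intersect the candidate sets obtained from several input differences."""
    survivors = None
    for dx in input_diffs:
        # Output difference that the adversary would observe for this dx.
        dz = (((x_true ^ dx) + k_true) & 0xFF) ^ ((x_true + k_true) & 0xFF)
        sols = addition_candidates(dx, dz)
        survivors = sols if survivors is None else survivors & sols
    return survivors


# Hypothetical operand and key byte, used only to generate example differences.
survivors = surviving_candidates(0x3C, 0xA7, [0x01, 0x10])
print(len(survivors), (0x3C, 0xA7) in survivors)
```

Each additional difference pair shrinks the candidate set, which is the effect the merged equation system exploits. A difference confined to the most significant bit passes through $\boxplus$ unchanged for every operand pair, so on its own it gives no constraint at that addition.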
5. The Proposed Trojan Circuit
In this section, we give the details of the HT. In general, a hardware Trojan consists of two parts: trigger logic (TL) and payload logic (PL). The TL judges whether the values of signal lines and states meet the activation condition, that is, the values of signal lines and states set by the adversary in advance. Once the activation condition is satisfied, the PL executes the attack. A Trojan circuit may deactivate the host circuit (denial-of-service), change its functionality, or provide covert channels through which the protected secret information can be leaked.
5.1. Assumption of Trojan Circuits
In this paper, we make three assumptions about the design of hardware Trojan circuits.
(1) HIGHT is implemented in a cryptographic intellectual property (IP) with advanced protections like sensors from an untrusted IP vendor or system integrator. The prototype is on a Xilinx FPGA device implementing a cryptographic IP. In fact, it is a common practice to deploy physical sensors alongside cryptographic IP in industrial designs.
(2) The adversary is assumed to be able to choose the plaintext to be encrypted. He is also assumed to be able to insert a smart but functional hardware Trojan at the Register Transfer Level (RTL), either by modifying the RTL or by modifying the corresponding logic elements in the post-place-and-route netlist. However, he only has access to the Xilinx Design Language (XDL) file and has no access to the design stage.
(3) The hardware Trojan is designed to introduce a fault by flipping only one bit of a certain intermediate word of the cipher when it is activated.
5.2. Trigger Design
FSM-based Trojans have two prominent advantages over many other Trojans. First, they can be designed to be arbitrarily complicated with the same amount of resources and can reuse both the combinational logic and the flip-flops of the original circuit. Second, they are bidirectional, which means that they can have state transitions leading back to the previous or initial state, so that the final Trojan state is reached only if the entire state sequence is satisfied in consecutive clock cycles. Both advantages make FSM-based Trojans harder to detect than other Trojans.
As shown in Figure 6, to obtain a hard-to-detect Trojan circuit, the TL of the proposed HT is designed based on a finite state machine (FSM). In this FSM, a 3-bit register is used to store the current state. The Trojan circuit undergoes state transitions according to a state transition diagram which is defined by the adversary in advance and shown in Figure 5. Only the adversary knows the predefined state transition diagram. The 3-bit input is derived from any three of the four 8-bit intermediate words $X_{0,1}$, $X_{0,3}$, $X_{0,5}$, and $X_{0,7}$, chosen randomly, and it serves as the transition condition of the FSM. If the input agrees with the current state, the FSM transitions to the next state; otherwise the FSM goes back to the previous state. When the FSM reaches the final state, the Trojan output is activated (the signal act is “1”) and the PL causes a single-bit fault in the original circuit. In the next clock cycle, the Trojan automatically goes back to the initial state; thus the Trojan can be disguised as a random fault.
Since a 3-bit register is able to store 8 different states, the test space for activating the trigger logic is $8!$ ($> 2^{15}$); that is, the probability of activating the HT is $\Pr = 1/8! \approx 0.000025$, which is extremely low. However, according to the InitialTransformation (see (5)), the four plaintext bytes $P_1$, $P_3$, $P_5$, and $P_7$ can be directly deduced from $X_{0,1}$, $X_{0,3}$, $X_{0,5}$, and $X_{0,7}$. Hence, the adversary can trigger the HT by carefully choosing $P_1$, $P_3$, $P_5$, and $P_7$. The total logic overhead of the implemented trigger logic is three flip-flops and four 3-input LUTs.
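The activation probability and the trigger behaviour can be checked with a short sketch. The class below is a simplified behavioural model, not the circuit of Figure 5: the secret input sequence is a hypothetical example, the fall-back rule is reduced to "go back one state on a mismatch", and the return to the initial state is modelled as immediate rather than in the next clock cycle.

```python
import math

# Size of the test space and resulting activation probability.
test_space = math.factorial(8)          # 40320
activation_probability = 1 / test_space


class TriggerFSM:
    """Behavioural model of an 8-state sequential trigger (hypothetical)."""

    def __init__(self, secret_sequence):
        # Seven 3-bit inputs move the FSM from state 0 to the final state 7.
        self.seq = list(secret_sequence)
        self.state = 0

    def step(self, inp):
        """Consume one 3-bit input; return True when the Trojan fires."""
        if inp == self.seq[self.state]:
            self.state += 1
        else:
            self.state = max(self.state - 1, 0)   # fall back one state
        if self.state == len(self.seq):
            self.state = 0                        # back to the initial state
            return True
        return False


# Hypothetical secret sequence known only to the adversary.
fsm = TriggerFSM([3, 1, 4, 0, 5, 2, 6])
fired = [fsm.step(v) for v in [3, 1, 4, 0, 5, 2, 6]]
print(test_space, activation_probability, fired[-1])
```

Under this model, a tester who does not know the sequence must supply every input correctly in consecutive cycles, while the adversary only needs to choose the plaintext bytes that produce it.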
5.3. Payload Design
For clarity, the th encryption and plaintext are denoted as and , respectively. A pair of correct and faulty ciphertexts () is required to be collected for the same plaintext Pm. The payload component PL(A) is designed to inject a single-bit fault in round during . When the HT is triggered by carefully choosing some certain plaintexts, a “1” is stored in the flip-flop which waits for the target round . A signal Rflag, derived from state machine, indicates whether the current round is the target round or not. The value of () is determined by AFA which will be described in detail in Section 7.2.1. Once the Trojan is triggered, the th bit of , that is, , is flipped due to PL(A) in . This is realized by function as shown in Figure 6. The total costs of implementing the payload logic are a flip-flop and a 3-input payload gate that can be implemented by 1 LUT in both 4-input and 6-input FPGA series.
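The payload behaviour described above amounts to a conditional bit flip. A minimal sketch, assuming the 3-input payload gate computes the original bit XOR (act AND Rflag); the function name and argument order are hypothetical.

```python
def payload(word, bit, act, rflag):
    """Flip bit `bit` of the 8-bit `word` only when act and rflag are both 1."""
    return word ^ ((act & rflag) << bit)


# The target bit is flipped only in the target round of a triggered encryption.
print(hex(payload(0x5A, 7, 1, 1)), hex(payload(0x5A, 7, 1, 0)))
```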
6. The AFA with a HT of HIGHT
6.1. The Optimal Location Selection
Let be the location where the HT is inserted, , , and . In order to search the optimal location, four properties are desired:
(1) The secret master key can be recovered only if all of its bytes are involved in the fault propagation; thus the set of master key bytes involved should contain all 16 bytes.
(2) The required number of faults to recover the secret master key and the reduced key search space after the injection to should be both minimized to make the attack more practical.
(3) The average time of the solver to solve the merged equation system should be minimized to increase the effectiveness of the attack.
(4) The location should be in a deeper round to maximally utilize the fault leakages and to evade detection.
In order to search the optimal bit location for the HT, AFA is used to enumerate every possible (). The attempts are conducted in advance, which can guide the logic designs of the HT and reduce costs. Since AFA is executed as machine-based automation, all possible key candidates will be eventually checked along the fault propagation paths. The utilization of fault leakages is maximized. The automation shows its advantage over traditional manual analysis, such as DFA, especially when the analysis goes into the deeper round.
6.2. Constructing Algebraic Equations for Encryption of HIGHT
The task of this stage is to represent HIGHT cipher with a large system of low degree Boolean equations. Suppose and are the 64-bit input of round and ciphertext, respectively. Since the key schedule of HIGHT is very simple, we mainly focus on the encryption of HIGHT which is shown in Algorithm 1. From Algorithm 1, the most important yet difficult problem is to construct the equations for ARX operations.
It is stressed that in general the adversary will not choose a very deep round as the target round. That is, the number of rounds between the target round and the FinalTransformation is not very large. Therefore, instead of constructing equations for all rounds of the cipher, we only construct equations for the rounds from the target round to the FinalTransformation, which results in a smaller equation script and thus accelerates the solving procedure.
According to Algorithm 1, for every fault that is injected in , there are variables and ANF equations were introduced to the equation system . In addition, variables and ANF equations are required for round keys, 64 variables and ANF equations are for the whitening keys, and 128 variables and ANF equations are for the master keys.
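6.2.1. The Equations for Addition mod $2^8$
Assume $x$, $y$, and $z$ are the two inputs and the output of addition modulo $2^8$, where $x = (x_7, \ldots, x_0)$, $y = (y_7, \ldots, y_0)$, and $z = (z_7, \ldots, z_0)$, with $x_0$, $y_0$, and $z_0$ being the least significant bits, respectively. Then addition modulo $2^8$ can be described as Boolean equations as follows: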
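One standard way to write such equations is the carry-chain form $z_i = x_i \oplus y_i \oplus c_{i-1}$, $c_i = x_i y_i \oplus c_{i-1}(x_i \oplus y_i)$ with $c_{-1} = 0$, which keeps every equation at degree 2. The sketch below checks this encoding exhaustively against integer addition; the introduction of carry variables $c_i$ is an assumption and may differ from the exact equations used in the attack.

```python
def to_bits(v):
    """Bits of the 8-bit value v, index 0 being the least significant bit."""
    return [(v >> i) & 1 for i in range(8)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


def add_mod256_bits(x_bits, y_bits):
    """Bit-level addition modulo 2^8 using one carry variable per position."""
    z_bits, carry = [], 0
    for xi, yi in zip(x_bits, y_bits):
        z_bits.append(xi ^ yi ^ carry)              # z_i = x_i + y_i + c_{i-1}
        carry = (xi & yi) ^ (carry & (xi ^ yi))     # c_i, degree 2 over GF(2)
    return z_bits


# Exhaustive check of the Boolean encoding against integer addition.
ok = all(
    from_bits(add_mod256_bits(to_bits(x), to_bits(y))) == (x + y) & 0xFF
    for x in range(256)
    for y in range(256)
)
print(ok)
```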
6.2.2. The Equations for $F_0$ and $F_1$
Given that the input and output of $F_0$ and $F_1$ are $x = (x_7, \ldots, x_0)$ and $y = (y_7, \ldots, y_0)$, respectively, then $F_0$ and $F_1$ can be described as the following Boolean equations:
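Because both functions are XORs of rotations, each output bit is the XOR of three input bits: for a left rotation by $s$, output bit $i$ equals input bit $(i - s) \bmod 8$. The sketch below derives the bit-level equations this way and checks them against the word-level definitions; it assumes the rotation amounts (1, 2, 7) and (3, 4, 6) of the public HIGHT description, and the function names are hypothetical.

```python
def rotl8(x, s):
    """Rotate the 8-bit value x left by s bits."""
    return ((x << s) | (x >> (8 - s))) & 0xFF


def f_word(x, shifts):
    """Word-level definition: XOR of three left rotations of x."""
    a, b, c = shifts
    return rotl8(x, a) ^ rotl8(x, b) ^ rotl8(x, c)


def f_bits(x_bits, shifts):
    """Bit-level linear equations: y_i = x_{i-a} + x_{i-b} + x_{i-c} over GF(2)."""
    a, b, c = shifts
    return [
        x_bits[(i - a) % 8] ^ x_bits[(i - b) % 8] ^ x_bits[(i - c) % 8]
        for i in range(8)
    ]


F0_SHIFTS = (1, 2, 7)
F1_SHIFTS = (3, 4, 6)


def check(shifts):
    """Compare the bit-level equations with the word-level function."""
    for x in range(256):
        bits = [(x >> i) & 1 for i in range(8)]
        y = sum(b << i for i, b in enumerate(f_bits(bits, shifts)))
        if y != f_word(x, shifts):
            return False
    return True


print(check(F0_SHIFTS), check(F1_SHIFTS))
```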
6.3. Constructing Equations for the Injected Faults
This stage illustrates the method of constructing equations for the injected faults. To clarify the method, the example is shown in Figure 7.
Given that every time the HT was activated, a single-bit fault was introduced to flip the most significant bit of . The fault propagation paths are shown by bold line in Figure 7. The correct and faulty 64-bit inputs to the th round are denoted by and , respectively. Then, the complex fault propagation paths can be described as a set of algebraic equations with the variables that were involved. Since the fault flips the most significant bit of , we havewhere . For the fault propagation paths that from round 25 to the FinalTransformation, they can be described by equations as Algorithm 2.
Algorithm 2 constructs the equations for the injected faults. The main idea is that every time a fault was induced, the intermediate variables from round to round 32 were viewed as new variables . Then, we reconstruct the equations for the encryption by replacing with . Furthermore, for variables that were not involved along the fault propagation paths which can be deduced by the function SearchFaultyInterVal(), we have . Thus, there are variables and ANF equations were introduced to the equation system for every fault that was injected in .
The function SearchFaultyInterVal() automatically searches for the faulty intermediate variables according to the fault location $X_{r,i}$ and returns them. The main idea is explained earlier in Section 3. Algorithm 3 describes the procedure.
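A byte-level version of this search can be sketched as follows. It is a simplified stand-in for Algorithm 3, not a reproduction of it: it tracks only which bytes of each round input may differ, using the round structure of HIGHT, and it over-approximates because it ignores differences that happen to cancel. The function names and the example location (round 25, byte 0) are hypothetical.

```python
def next_round_faulty(faulty):
    """Byte indices of X_{r+1} that may differ, given faulty byte indices of X_r."""
    out = set()
    for i in faulty:
        if i % 2 == 0:
            out.add(i + 1)            # X_{r+1,i+1} is a copy of X_{r,i}
            out.add((i + 2) % 8)      # X_{r,i} feeds F_0 or F_1 for X_{r+1,i+2}
        else:
            out.add((i + 1) % 8)      # X_{r,i} is combined into X_{r+1,i+1}
    return out


def search_faulty_bytes(r, i, last_round=32):
    """Map each t in [r, last_round] to the possibly faulty byte indices of X_t."""
    paths = {r: {i}}
    for t in range(r, last_round):
        paths[t + 1] = next_round_faulty(paths[t])
    return paths


# Example: a fault in byte 0 of the input to round 25.
paths = search_faulty_bytes(25, 0)
for t in sorted(paths):
    print(t, sorted(paths[t]))
```

Bytes that never appear in the returned sets keep their fault-free values, so the corresponding faulty variables can be equated with the correct ones when the fault equations are built.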