Abstract

GEA-1, a proprietary stream cipher, was originally designed and used to protect general packet radio service (GPRS) traffic between the phone and the base station against eavesdropping. A variety of current mobile phones still support this cipher. In this paper, a structural weakness of the GEA-1 stream cipher that has not been found in previous works is discovered and analyzed: the probability that two different inputs of GEA-1 generate the identical keystream can be up to , which is quite high compared with an ideal stream cipher that generates random sequences. Based on this newfound weakness, a new practical distinguishing attack on GEA-1 is proposed, which shows that the keystreams generated by GEA-1 are far from random and can be easily distinguished with a practical time cost. A new practical key recovery attack on GEA-1 is then presented. It has a time complexity of GEA-1 encryptions and requires only seven related keys, far fewer than the existing related key attack on GEA-1. The experimental results show that GEA-1 can be broken within about 41.75 s on a common PC in the related key setting. These cryptanalytic results show that GEA-1 cannot provide enough security and that support for it should be removed from GPRS devices immediately.

1. Introduction

General packet radio service (GPRS) is a “packet mode” wireless data system that has been standardized to operate on GSM infrastructure. GPRS was originally a standard under the European Telecommunications Standards Institute (ETSI), but was finally transferred to the 3rd Generation Partnership Project (3GPP) and released in 1998. The GPRS standard was widely used during the late 1990s and the early 2000s. To protect GPRS traffic between the base station and the phone against eavesdropping, two proprietary stream ciphers, GEA-1 and GEA-2, were designed. As analyzed in [1], a variety of current mobile phones still support GEA-1. As pointed out in [1] and [2], this is a serious security problem because the support of GEA-1 by current mobile phones makes it possible to recover a previous session key. Once the previous session key is recovered, the attacker can decrypt the previous session until the key becomes invalid.

The stream cipher GEA-1 was designed by the ETSI Security Algorithms Group of Experts (SAGE) in 1998 and was not made public until 2021. Only a technical report on the design process can be found at ETSI [3]. The GEA-2 stream cipher [4] was designed in 1999 as an improved variant of GEA-1. Both ciphers take as input a 64-bit key, a 32-bit initialization vector (IV) that is commonly used as a counter incremented for each frame, and a public bit that indicates the transfer direction, and both output a keystream of 1,600 bytes for each frame. Recently, an improved variant of GEA-2 named GEA-2a was designed in [2]. The designers of GEA-2a claimed that the new variant resists all existing attacks and provides 64-bit security.

1.1. Related Works

The full description of GEA-1 was first given by Beierle et al. [1] at EUROCRYPT 2021, where they presented the first publicly available cryptanalytic attack on it. Their attack on GEA-1 is based on an unusual weakness: after the linear initialization process, the joint initial state of two of the three linear feedback shift registers (LFSRs) has only $2^{40}$ possible values (out of $2^{64}$). This weakness leads to a key recovery attack on GEA-1, which has an online/offline time complexity of / GEA-1 evaluations and a memory space of 44.5 GiB, where the time complexity unit “GEA-1 evaluation” indicates the time cost of generating a 128-bit keystream. The attack recovers the 64-bit key of GEA-1 and requires only 65 bits of known keystream. They then checked experimentally how frequently the weakness occurs for randomly chosen LFSRs, and the results showed that the weakness in GEA-1 is unlikely to occur by chance. This indicates that the 40-bit security level was intentionally designed into GEA-1 due to export regulations. Later, Beierle et al. [5] and Beierle et al. [6] examined the design of GEA-1 in more depth and analyzed how to construct such a weak GEA-like cipher effectively.

At EUROCRYPT 2022, Amzaleg and Dinur [7] improved the attack on GEA-1 by Beierle et al. [1]. In the improved attack on GEA-1, the required memory space is decreased from 44.5 GiB to about 4 MiB, but the time complexity remains . The attack can be implemented in an average of 2.5 hr on a modern laptop.

Recently, new attacks on GEA-1 were proposed by Ding et al. in [2]. A key recovery attack on GEA-1 in the chosen IV setting was presented, where neither the online nor the offline time complexity is larger than . It requires a memory space of 8 MiB and 64 keystream bits for each of chosen IVs. Furthermore, they analyzed the slide property of GEA-1 and used it to devise a practical key recovery attack in the related key setting. Their result shows that the 64-bit secret key of GEA-1 can be successfully recovered with a time complexity of GEA-1 encryptions, requiring a total of keystream bits. The main practical obstacle is that their attack requires 50 related keys, which is more than an attacker can realistically expect to obtain.

1.2. Our Contributions

A structural weakness of the GEA-1 stream cipher that has not been found in previous works is discovered and analyzed in this paper. As a result, a new practical distinguishing attack and a new practical key recovery attack on GEA-1 are presented. The comparisons of the previous attacks with our cryptanalytic results are summarized in Table 1, where the complexities of the previous attacks on GEA-1 are described in detail to allow a clear comparison with our new attacks. Our contributions are as follows.

(1) In this paper, we find that the initialization of the GEA-1 stream cipher is noninjective, because the input size of GEA-1 is much larger than the size of the nonlinear feedback shift register (NFSR) used in the initialization of GEA-1. Based on this observation, the differential collision characteristic of GEA-1, i.e., the existence of different inputs of GEA-1 that generate the identical keystream, is explored and analyzed. The result shows that the probability that two different inputs of GEA-1 generate the identical keystream can be up to , which is quite high compared to an ideal stream cipher that generates random sequences.

(2) Based on the differential collision characteristic of GEA-1, a practical distinguishing attack on GEA-1 in the related key setting is proposed. The attack has a time complexity of GEA-1 encryptions, requiring one related key, chosen IVs and keystream bits. Note that the required keystream bits are generated by two keys together with chosen IVs, and only 32 keystream bits are needed for each key-IV pair. The success probability of this attack is almost 1. The result shows that the keystreams generated by GEA-1 are far from random and can be easily distinguished with a practical time cost.

(3) Based on the differential collision characteristic of GEA-1, a practical key recovery attack on GEA-1 in the related key setting is presented. The key recovery attack on GEA-1 has a time complexity of GEA-1 encryptions, requiring seven related keys and chosen IVs. The attack requires keystream bits, which are generated by two keys together with chosen IVs, and only 32 keystream bits are needed for each key-IV pair. The success probability of this attack is almost 1. The attack is confirmed by the experimental results, which show that GEA-1 can be broken within about 41.75 s on a common PC in the related key setting.

As shown in Table 1, our practical key recovery attack on GEA-1 has a significantly lower time cost than the previous attacks [1, 7]. Meanwhile, our practical key recovery attack on GEA-1 requires only seven related keys, far fewer than the existing related key attack in [2]. In contrast, if only seven related keys were used in the attack of [2], its time complexity would be about GEA-1 encryptions, which is much worse than that of our attack. Thus, our key recovery attack on GEA-1 is more practical than the attack in [2].

1.3. Feasibility and Impact of Our Attacks

In this paper, based on the differential collision characteristic of GEA-1, a new practical distinguishing attack and a new practical key recovery attack on GEA-1 are proposed. The feasibility and impact of these attacks are discussed as follows.

(1) To carry out the attacks on GEA-1 proposed in this paper, the attacker needs to collect about 32 keystream bits for one frame. As shown by Beierle et al. [1], collecting 65 keystream bits is feasible for an entirely passive attacker by exploiting predictable SNDCP (Subnetwork Dependent Convergence Protocol) and IP header patterns. Thus, the requirement of about 32 keystream bits for one frame in our attacks can be easily met.

(2) GEA-1 is a proprietary stream cipher that was originally designed and used to protect GPRS traffic between the base station and the phone against eavesdropping. It is still supported by a variety of current mobile phones. As shown in [1], once the key is recovered, the attacker can decrypt all traffic for the complete GPRS session until the key becomes invalid, which happens in the GPRS authentication and ciphering procedure triggered by the network. Thus, the practical key recovery attack on the GEA-1 stream cipher is likely a serious threat to the large population of GPRS users.

The rest of this paper is structured as follows. A brief description of GEA-1 is given in Section 2. In Section 3, the differential collision characteristic of GEA-1 is introduced. Based on this characteristic, practical distinguishing attacks and key recovery attacks on GEA-1 are proposed in Sections 4 and 5, respectively. The paper is concluded in Section 6.

2. A Brief Description of GEA-1

This section gives a brief description of the GEA-1 stream cipher; for more details, refer to [1, 2]. An overview of the structure of the cipher is depicted in Figure 1.

2.1. The Keystream Generator of GEA-1

The keystream generator of GEA-1 is mainly made up of three LFSRs over denoted as , and , and a filter function denoted as . The three LFSRs have the bit sizes of 31, 32, and 33, respectively. Let , and denote the internal state of GEA-1 at time , where represent the leftmost bits of three LFSRs, respectively. All of these three LFSRs work in Galois mode, and their update functions are given as follows.

LFSR :where .

LFSR :where .

LFSR :where .

The filter function is a nonlinear Boolean function that takes seven bits as input and generates one bit as output. It has an algebraic degree of 4. The specification of is given in algebraic normal form as:

In the keystream generation process, one keystream bit is generated per clock by:where the three output bits can be generated as follows:

2.2. The Initialization of GEA-1

The GEA-1 stream cipher takes as input a 64-bit key, a 32-bit public IV, and a public bit that indicates the transfer direction. The initialization process of GEA-1 uses an NFSR of size 64 (denoted as ). At the beginning of this process, the NFSR is set to the all-zero state. Then it is clocked 97 times, feeding in one input bit with each clock. The 97 input bits are introduced in the sequence . After loading all input bits, the NFSR is clocked another 128 times with all zeros as input. An overview of the initialization of the NFSR is depicted in Figure 2. As shown in Figure 2, the feedback bit of the NFSR is produced by the filter function , XORed with one input bit and the bit that is shifted out.

After clocking the NFSR 225 times, the content of the NFSR is denoted as , which is named the initial state of GEA-1 in this paper. It is used as a seed for initializing the three LFSRs of GEA-1 as follows. First, the three LFSRs are filled with all zeros. Then each of the three LFSRs is clocked 64 times, where the feedback bit is produced by XORing one bit from the initial state and the bit that is shifted out. More specifically, the LFSRs , and insert the bits from the initial state, starting from , and , respectively. It should be noted that if any of the LFSRs , and ends up in the all-zero state, the leftmost bit of that LFSR is forcibly set to one before producing the first keystream bit.
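The NFSR phase described above can be sketched in a few lines of Python. This is a structural sketch under stated assumptions, not a reference implementation: the filter function is passed in by the caller because its algebraic normal form is not reproduced in this text, the tap positions in `ASSUMED_TAPS` are inferred from the clock counts quoted in Section 3, and the input order (IV, then direction bit, then key) and the bit order within each field are assumptions. The name `nfsr_init` is hypothetical.

```python
from typing import Callable, List, Sequence

# ASSUMPTION: tap positions inferred from the clock counts 8, 21, 25, 41, 51, 60
# mentioned in Section 3 (a bit entering at position 63 reaches these taps).
ASSUMED_TAPS = (3, 12, 22, 38, 42, 55, 63)


def nfsr_init(
    iv_bits: Sequence[int],
    dir_bit: int,
    key_bits: Sequence[int],
    f: Callable[[List[int]], int],
    taps: Sequence[int] = ASSUMED_TAPS,
) -> List[int]:
    """Clock a 64-bit NFSR 97 + 128 = 225 times and return its final content."""
    if len(iv_bits) != 32 or len(key_bits) != 64:
        raise ValueError("expected a 32-bit IV and a 64-bit key")
    # ASSUMPTION: inputs are fed as IV, direction bit, key, then 128 zero bits.
    inputs = list(iv_bits) + [dir_bit] + list(key_bits) + [0] * 128
    state = [0] * 64  # the register starts in the all-zero state
    for bit in inputs:
        # feedback = filter output XOR input bit XOR the bit being shifted out
        feedback = f([state[i] for i in taps]) ^ bit ^ state[0]
        state = state[1:] + [feedback]
    return state
```

The caller supplies the real seven-input filter as `f`; any stand-in used for testing only exercises the register mechanics, not the security properties of GEA-1.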

3. Differential Collision Characteristic of GEA-1

In this section, we present a structural weakness in the GEA-1 stream cipher by discovering the differential collision characteristic of GEA-1. More specifically, since the input of GEA-1 (i.e., key, IV, and ) has a size of 97 bits, which is much larger than the size of the NFSR , there must exist different inputs of GEA-1 that generate the identical initial state after clocking the NFSR 225 times. Once the same initial state is generated, the identical keystream will be generated. For convenience of description, a new definition is given as follows.

Definition 1. Two different inputs and are called an input collision pair of GEA-1, if they generate the identical keystream.
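The existence of input collision pairs follows from a simple counting argument, written out below. This is a back-of-the-envelope calculation rather than a formula from the paper, and the comparison with a uniformly random map is an idealizing assumption.

```latex
% 97 input bits (64-bit key, 32-bit IV, 1 direction bit) are compressed
% into the 64-bit NFSR state, so on average
\[
  \frac{\lvert \{0,1\}^{97} \rvert}{\lvert \{0,1\}^{64} \rvert}
  = \frac{2^{97}}{2^{64}} = 2^{33}
\]
% inputs share each initial state.
% For an idealized, uniformly random map into 64-bit states, two fixed
% distinct inputs would collide with probability
\[
  \Pr[\text{collision}] = 2^{-64}.
\]
```

Any input difference that produces collisions with probability far above this idealized baseline therefore separates GEA-1 from a random keystream source.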

Denote by , and the differences of , and , respectively, and denote by the state of the NFSR at clock , . It is easy to obtain , since the NFSR is set to the all-zero state at the beginning of initialization. The NFSR is then updated 225 times as follows.

For Where

Now, we introduce an effective method of searching for input collision pairs of GEA-1. To achieve this goal, we have found a kind of differential paths for GEA-1. Firstly, we introduce the difference into the input bit , i.e., , and then the difference will appear in the updated bit after one clock, i.e., . Here, the integral parameter should satisfy and , since the input bit is equal to the public bit for and fixed to be zero for , and cannot contain any difference. It is easy to see that there are nine input bits (i.e., ) to generate the feedback bit . As shown in Figure 3, if the difference only appears in the state bit and the other eight state bits do not contain any difference, it is possible that the difference disappears in the feedback bit , i.e., . Thus, when the following condition denoted as is satisfied, holds.where denotes the output difference of the nonlinear function .

Then, as shifting continues, the bit is updated to be , , , and after 8, 21, 25, 41, 51, and 60 clocks, respectively. By repeating the argument above, the feedback bit can remain identical whenever the difference appears in only one of the input bits of the nonlinear function . That is, when the following six conditions denoted as are satisfied, holds.

Finally, after 63 clocks, the bit is updated to be by shifting. As shown in Figure 4, since the seven input bits of the nonlinear function do not contain any difference, we have to introduce another difference into the input bit (i.e., ) to generate the identical feedback bit . Since the input bit is fixed to be zero for and cannot contain any difference, then it has (and thus ) to generate the identical feedback bit . Therefore, when the input difference is satisfied, holds directly.

We have now found a family of differential paths for GEA-1, which is detailed in Table 2. As described in Table 2, the first difference is introduced into the NFSR at the -th clock, due to the difference . From the -th to the -th clock, the weight of the difference of the NFSR is always one, i.e., for . The difference disappears at the -th clock, due to the introduced difference .

Clearly, to construct the differential paths for GEA-1 in Table 2, the input difference should satisfy a condition, called R1, described as follows.where the parameter is an integer satisfying .

When the condition R1 is satisfied, the kind of differential paths for GEA-1 in Table 2 holds when the seven conditions hold simultaneously. Once the seven conditions hold simultaneously, two different input pairs will generate the identical state , and then the identical initial state and keystream will be generated. Now, an observation for GEA-1 has been obtained as follows.

Observation 1. When the input difference is chosen to satisfy the condition R1 and the seven conditions hold simultaneously, two different inputs of GEA-1 are an input collision pair.
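The mechanism behind Observation 1 can be illustrated on a scaled-down register. The sketch below is a toy, not GEA-1: the register length (8), the tap positions, the number of input bits (12), and the majority filter are all placeholders chosen for illustration, and the names `toy_init` and `collision_rate` are hypothetical. It flips input bits `t` and `t + 8` (the analogue of two differences one register length apart) and counts how often the final states coincide. The rate it reports for the placeholder filter says nothing about the probability for GEA-1 itself.

```python
from itertools import product

N = 8            # toy register length (GEA-1 uses 64)
TAPS = (2, 5, 7)  # toy filter taps (placeholder positions)
M = 12           # toy number of input bits (GEA-1 uses 97)
EXTRA = 16       # toy number of trailing zero clocks (GEA-1 uses 128)


def majority(bits):
    """Placeholder nonlinear filter; NOT the GEA-1 filter function."""
    a, b, c = bits
    return (a & b) ^ (a & c) ^ (b & c)


def toy_init(input_bits, f):
    """Same feedback rule as the NFSR: f(taps) XOR input XOR shifted-out bit."""
    state = [0] * N
    for bit in list(input_bits) + [0] * EXTRA:
        feedback = f([state[i] for i in TAPS]) ^ bit ^ state[0]
        state = state[1:] + [feedback]
    return tuple(state)


def collision_rate(t, f):
    """Fraction of all inputs that collide with their (t, t + N)-flipped partner."""
    hits = 0
    total = 0
    for bits in product((0, 1), repeat=M):
        partner = list(bits)
        partner[t] ^= 1       # first difference enters the register
        partner[t + N] ^= 1   # second difference cancels it N clocks later
        hits += toy_init(bits, f) == toy_init(partner, f)
        total += 1
    return hits / total
```

With a filter that is constantly zero, the register is linear and every such pair collides, which corresponds to the case where all conditions on the filter output difference hold automatically. With a nonlinear filter, a pair collides only when the single moving difference never changes the filter output on its way through the taps.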

Now, we calculate the probability that the seven conditions hold simultaneously. First, for each condition, we make an assumption that the seven input bits of the nonlinear function are independent and identically distributed. Under this assumption, we can easily calculate the probabilities as follows.

We then make another assumption that the seven conditions are independent, and calculate the probability that the seven conditions hold simultaneously as follows.

To verify the theoretical probability calculated above, we carried out an experiment. In this experiment, we first randomly select 100 keys and IVs to form different inputs. Under the condition R1, we can obtain another different inputs. Then, for each of these 100 keys, we count the number of IVs such that and are an input collision pair of GEA-1. We obtain the following results:

(1) When the integer satisfies , there does not exist any IV such that and are an input collision pair of GEA-1.

(2) When the integer satisfies , for each of these 100 keys, there are some IVs such that and are an input collision pair of GEA-1. The experimental results are listed in Table 3. Some input collision pairs found in this experiment are listed in Table 4. Take for example. The total number of input collision pairs we have found is 50,792, and thus the average number of input collision pairs per key is $50{,}792/100 = 507.92$. This means that the experimental probability that the seven conditions hold simultaneously is about , which is very close to the theoretical result .

As shown in Table 3, when , the experimental probabilities are all quite close to the theoretical result , and thus the differential paths in Table 2 exist and input collision pairs of GEA-1 can be found effectively. However, when , no input collision pair is found among 100 keys and IVs. To validate this, we carried out a supplementary experiment in which the number of IVs was increased to , but the result remains the same and no input collision pair of GEA-1 is found. This is probably because there are contradictions in the state of the NFSR such that the seven conditions cannot hold simultaneously when .

According to the experimental results, we update the condition R1 to R2. The new condition R2 is the same as R1, except that the parameter in R2 is limited to satisfy , instead of in R1. Now, a new observation for GEA-1 can be given as follows.

Observation 2. When the input difference is chosen to satisfy the condition R2, two different inputs of GEA-1 are an input collision pair with a probability of about .

Therefore, the probability that two different inputs of GEA-1 satisfying the condition R2 generate the identical keystream is about , which is quite high compared to an ideal stream cipher that generates random sequences. Using this weakness, the attacker can easily distinguish the keystream generated by GEA-1 from the random sequence.

4. Practical Distinguishing Attacks on GEA-1

Based on Observation 2, this section proposes practical distinguishing attacks on GEA-1 in the related key chosen IV setting, whose goal is to distinguish the keystream from a truly random sequence. It should be noted that the related key chosen IV setting is a common attack setting in the cryptanalysis of stream ciphers and has been used in many cryptanalytic works, e.g., on Grain-like ciphers [8–10], WG-8 [11], SNOW 3G [12], GEA-1 [2], and GEA-2 [2]. In this attack setting, the chosen IV requirement can be directly satisfied. Though the related key setting is considered to be a less realistic scenario than the chosen IV setting, it may still be used to break some cryptosystems. For instance, related key weaknesses of the RC4 stream cipher led to a practical attack on the WEP protocol [13]. In the related key setting, the attacker is allowed to use two different keys that have a known relationship between them, but does not know the values of these two keys [14–16]. Thus, the condition R2 can be satisfied directly in the related key chosen IV setting.

4.1. A Practical Distinguishing Attack on GEA-1

Based on Observation 2, a practical distinguishing attack on GEA-1 is proposed in this subsection. The attack is described as Algorithm 1 below.

1. Randomly choose IVs, i.e., .
2. For from 1 to , do the following:
2.1 Generate an output sequence with length using the input .
2.2 Generate another output sequence with length using the input , where the input difference between and satisfies the condition R2.
2.3 If the two output sequences are identical, judge that they are keystreams generated by GEA-1. Otherwise, return to Step 2 and try the next IV.
3. If no identical output sequence is found after checking all IVs, judge that they are random sequences.
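The control flow of the steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the two keystream oracles and the function mapping an IV to its related IV are modelled as caller-supplied callables, and all names (`distinguish`, `query`, `query_related`, `related_iv`) are hypothetical. How the related input is derived so that the difference satisfies condition R2 is left to the caller.

```python
from typing import Callable, Hashable, Iterable


def distinguish(
    query: Callable[[int], Hashable],
    query_related: Callable[[int], Hashable],
    ivs: Iterable[int],
    related_iv: Callable[[int], int],
) -> bool:
    """Return True if some IV yields identical outputs under both keys.

    query(iv)          -> output prefix under the unknown key (hypothetical oracle)
    query_related(iv)  -> output prefix under the related key (hypothetical oracle)
    related_iv(iv)     -> the IV paired with `iv` under the chosen input difference
    """
    for iv in ivs:
        z = query(iv)
        z_related = query_related(related_iv(iv))
        if z == z_related:
            # Step 2.3: identical sequences, judge "generated by GEA-1".
            return True
    # Step 3: no identical pair among all IVs, judge "random".
    return False
```

A `True` result corresponds to the judgment "keystreams generated by GEA-1" and `False` to "random sequences". The early return means the expected number of oracle queries is smaller than the worst case whenever the source really is the cipher.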

Clearly, there are two types of errors when Algorithm 1 makes such a judgment. The first is that the algorithm judges the output sequences are keystreams generated by GEA-1 but they are in fact random sequences. The probability of this error can be calculated as . The second is that the algorithm judges the output sequences are random but they are in fact keystreams generated by GEA-1. The probability of this error can be calculated as . Thus, the probability that Algorithm 1 succeeds can be calculated as:

It is easy to see that there is a tradeoff between the number of chosen IVs used in Algorithm 1 and the success probability of this algorithm. The values of and can be chosen by the attacker to balance the two. Here, to achieve a high success probability, we set and for GEA-1, and then we have . To validate this, we carried out an experiment in which 1,000 keys were chosen at random and Algorithm 1 was executed once for each key. The result shows that Algorithm 1 always gives the right output. Thus, a distinguishing attack on GEA-1 with a time complexity of GEA-1 encryptions has been proposed, requiring one related key and chosen IVs. Each key and IV pair only needs to generate keystream bits, which leads to a total data complexity of keystream bits. The attack has a success probability of almost 1.
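The tradeoff between the number of chosen IVs and the two error probabilities can be explored numerically. The sketch below uses one plausible model, assuming independent trials: a random source produces a matching pair of `l_bits`-bit sequences with probability 2^(-l_bits) per IV, and GEA-1 produces a collision with probability `p` per IV. These expressions are assumptions made for illustration and are not claimed to be the paper's exact formulas; the function names are hypothetical.

```python
def false_alarm_probability(n_ivs: int, l_bits: int) -> float:
    """Random source wrongly judged as GEA-1: at least one chance match in n_ivs trials."""
    return 1.0 - (1.0 - 2.0 ** (-l_bits)) ** n_ivs


def miss_probability(n_ivs: int, p: float) -> float:
    """GEA-1 wrongly judged as random: no collision in n_ivs independent trials."""
    return (1.0 - p) ** n_ivs


def example_tradeoff(p: float, l_bits: int = 32):
    """Tabulate both errors for a few IV counts (p is a caller-supplied assumption)."""
    rows = []
    for exponent in range(4, 21, 4):
        n_ivs = 2 ** exponent
        rows.append((n_ivs, false_alarm_probability(n_ivs, l_bits), miss_probability(n_ivs, p)))
    return rows
```

Increasing the number of IVs drives the miss probability toward zero while the false-alarm probability grows only roughly linearly in the number of IVs, which is why 32 keystream bits per key-IV pair leave room for a large number of trials.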

4.2. Improved Distinguishing Attacks on GEA-1

In fact, the practical distinguishing attack above can be further improved, if we take a deeper look at the seven conditions . This subsection aims at presenting improved distinguishing attacks on GEA-1.

At the beginning of initialization of GEA-1, the NFSR is filled with all zeros, and then it is clocked 97 times, feeding in one input bit for each time. The 97 input bits are loaded in the sequence . Since both of IV and are public to the attacker, the state of for can be naturally known to the attacker. Take for example. The first two conditions and directly hold in the chosen IV setting, as no key bit is involved in these two conditions. This means that when the remaining five conditions hold simultaneously, the two different input pairs are an input collision pair. The probability that two different inputs and are an input collision pair becomes , which is larger than by a factor of . This enables us to propose a better distinguishing attack on GEA-1 that has a time complexity of GEA-1 encryptions, by choosing which is large enough. This improved attack requires only chosen IVs and has a total data complexity of keystream bits, while its success probability is still almost 1. Clearly, this improved attack always holds for . However, when , only the first condition can be directly satisfied in the chosen IV setting. Similarly, the probability that two different inputs and are an input collision pair becomes , and thus we can obtain another distinguishing attack on GEA-1 that has a time complexity of GEA-1 encryptions, requiring chosen IVs and keystream bits. The success probability is still almost 1.

5. Practical Key Recovery Attacks on GEA-1

This section focuses on how to recover the secret key of GEA-1 based on the differential collision distinguishers constructed above. To effectively recover the secret key of GEA-1, we have analyzed which key bits are involved in the conditions for . The obtained results are listed in Table 5.

5.1. Key Recovery Attacks on GEA-1 Using One Related Key

In this subsection, some key recovery attacks on GEA-1 using one related key will be proposed. For convenience of description, a key recovery attack on GEA-1 when choosing is presented as follows.

By Algorithm 1, we can find an input collision pair such that the seven conditions are simultaneously satisfied. However, no key bit can be recovered by using the first two conditions, i.e., and , since no key bit is involved in these two conditions. As for the remaining five conditions , as listed in Table 5, a total of 44 key bits (i.e., ) are involved in these five conditions. About five key bits can be recovered theoretically by using these five conditions, as the probability that these five conditions hold simultaneously is about . Thus, the attacker can make an exhaustive search of these 44 key bits, and then check whether these five conditions hold simultaneously. This reduces the number of possible guesses from to about . Then, the attacker can make an exhaustive search of the obtained possible guesses together with the remaining key bits (i.e., ) to recover the 64-bit key. Since the maximum number of possible guesses is no more than in this whole key recovery process, the time complexity of this key recovery process is at most . Considering the cost of finding an input collision pair, the key recovery attack on GEA-1 has a total time complexity of GEA-1 encryptions, requiring one related key, chosen IVs and keystream bits. The success probability of the key recovery attack is completely dominated by the distinguishing attack’s success probability, and thus is also almost 1.

Clearly, we can obtain similar key recovery attacks for different values of with . The best result we have found is obtained when equals 25. When , the attacker first guesses the values of 53 key bits , and then reduces the number of possible guesses from to about by checking whether the six conditions hold simultaneously. Then the attacker guesses the values of the remaining 11 key bits to recover the 64-bit key, which increases the number of possible guesses from to about . Since the maximum number of possible guesses is no more than in this whole key recovery process, the time complexity of this key recovery process is at most . Considering the cost of finding an input collision pair, the key recovery attack on GEA-1 has a total time complexity of GEA-1 encryptions, requiring one related key, chosen IVs and keystream bits. The success probability of this attack is almost 1.

5.2. Practical Key Recovery Attack on GEA-1 Using More Related Keys

In fact, the key recovery attacks on GEA-1 above are all “minimal” in the sense that each of them requires only one related key. If more related keys are available for cryptanalysis, the time complexity can be significantly reduced. We now propose a practical related key attack on GEA-1 using more related keys. It should be noted that there is a tradeoff between the number of related keys used in the attack and the time complexity of the attack. To achieve a practical time complexity, we assume that seven related keys are available to the attacker, i.e., for . Similar to the attacks using one related key above, for each of these seven related keys, the attacker should find an input collision pair such that the seven conditions are simultaneously satisfied. This leads to a time complexity of GEA-1 encryptions and requires about chosen IVs. The detailed process of recovering the key of GEA-1 is then described in Table 6.

As shown in Table 6, in the first step, the attacker guesses the values of one key bit , which leads to two possible guesses. Then, the attacker can check whether the condition with is satisfied, reducing the number of possible guesses from two to about , since holds. The following 38 steps are all similar to the first step. After the first 39 steps in Table 6, about possible guesses are obtained. Then the attacker could make an exhaustive search of the remaining five key bits to recover the 64-bit key of GEA-1, which increases the number of possible guesses from to .
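The step-by-step process described above follows a generic guess-and-filter pattern, which can be sketched as below. This is an illustrative skeleton only: the contents of Table 6 are not reproduced, each filtering condition is modelled as a caller-supplied predicate on a partial key guess, and the names `guess_and_filter`, `steps`, and `predicate` are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

PartialKey = Dict[int, int]  # maps key-bit index -> guessed value
Step = Tuple[Sequence[int], Callable[[PartialKey], bool]]


def guess_and_filter(steps: Sequence[Step]) -> Tuple[List[PartialKey], int]:
    """Extend partial guesses step by step, discarding those that fail a condition.

    Each step is (new_bit_positions, predicate). Returns the surviving candidates
    and the peak number of candidates that had to be tested in any single step.
    """
    candidates: List[PartialKey] = [{}]
    peak = 1
    for positions, predicate in steps:
        extended: List[PartialKey] = []
        for cand in candidates:
            # Guess every value of the newly involved key bits.
            for values in product((0, 1), repeat=len(positions)):
                trial = dict(cand)
                trial.update(zip(positions, values))
                extended.append(trial)
        peak = max(peak, len(extended))
        # Keep only guesses consistent with this step's condition(s).
        candidates = [c for c in extended if predicate(c)]
    return candidates, peak
```

The peak value plays the role of the "maximum number of possible guesses" that bounds the time complexity of the key recovery process. Using several related keys makes each predicate more selective, so the candidate list stays small as more key bits are guessed.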

It is easy to see that the maximum number of possible guesses is no more than , as shown in Table 6. Thus, the time complexity of the key recovery process is at most . Considering the cost of finding input collision pairs, the total time complexity of the attack on GEA-1 is GEA-1 encryptions, requiring seven related keys, chosen IVs and keystream bits. The success probability of this attack can be calculated as , as seven related keys are used in this attack. To validate this cryptanalytic result, we randomly chose 100 keys and simulated the whole attack process once for each key on a common PC with a 2.5 GHz Intel Pentium 4 processor. The experimental result shows that the attack above always succeeds in recovering the 64-bit secret key of GEA-1, and the average time to recover the 64-bit key is approximately 41.75 s.

6. Conclusions

In this paper, we find that the initialization of GEA-1 is noninjective, because the input size of GEA-1 is much larger than the size of the NFSR used in the initialization of GEA-1. Based on this observation, a structural weakness of the GEA-1 stream cipher that has not been found by previous works is discovered and analyzed. As a result, new practical attacks on GEA-1 are proposed, and these attacks show that GEA-1 cannot provide enough security and that support for it should be removed from GPRS devices immediately. It should be noted that this newfound weakness of GEA-1 is not present in its successors GEA-2 and GEA-2a. That is because the input size of GEA-2 (or GEA-2a) is no larger than the size of the NFSR. Hopefully, this can provide some new insights on how to design a secure GEA-like stream cipher.

Data Availability

No new data were generated or analysed in support of this research.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under grant numbers 61602514, 62202493, 61802437, and 61902428.