Abstract

A potential security vulnerability of embedding compression in a chaos-based cryptography is studied. Furthermore, a scheme for improving its security is proposed. This correspondence considers the use of multiple chaotic dynamics and drive chaotic trajectory by both plaintext sequence and initial values of a chaotic map. Chaotic trajectory is used for encryption that is never reused for different plaintext. This makes that scheme naturally resist chosen plaintext attack and cipher text-only attack. Its strong security is justified by the key space, key sensitivity, and tests of random number sequences. The results show that the security of the proposed scheme is stronger than the latest algorithm especially in resisting chosen plaintext attack, while its performance is not sacrificed.

1. Introduction

Data compression and encryption become more and more important in multimedia communication. In order to improve both performance and security of multimedia application, it is worthwhile to joint compression and encryption in a united process [1–4]. In comparison with the classical separate compression-encryption schemes, the united scheme is more secure and effective [5]. Meanwhile, in the AES system, the zero padding can degrade coding efficiency, due to the block nature of AES. Another classical approach to provide simultaneous compression and encryption is to join a traditional entropy coder in a stream cipher, for example, RC4. Unfortunately, inappropriate integration with an initialization vector of RC4 leads to severe security vulnerability [5]. In order to improve the performance, two distinct research directions are studied recently. One is embedding key-based confusion and diffusion characteristics in existing compression algorithms such as entropy coding [3, 4]. The scheme in [3] is based on multiple Huffman tables and achieves encryption and compression simultaneously via swapping the left and right branches of the Huffman tree. Moreover, some approaches are proposed to embed cryptography in arithmetic coding with key-based interval split [4].

Another approach is to incorporate compression in cryptography [2]. It is reported that the compression property can be embedded in chaos-based cryptography, so that the total time consumption of this scheme is less than the separation of encryption and compression. This scheme mixes block cipher and stream cipher processes. In the block cipher process, more probable symbols are encrypted by searching location of the plaintext symbol in the dynamic lookup table. While in the stream cipher process, less probable ones are masked by a pseudorandom bitstream. The block cipher could be considered as a variant of Baptista-type cryptosystem [6]. The Baptista-type cryptosystem is a cipher searching plaintext symbols based on a chaotic pseudorandom sequence. With its secret key, the original search trajectory is regenerated in decryption and the correct plaintext symbols are retrieved. It is found that the Baptista-type cryptosystem and some of its variants are vulnerable to various attacks, especially to chosen-plaintext attacks [7, 8]. In order to counter the attacks, remedial operations were suggested [9, 10]. Unfortunately, after studying the remedial method, it is found that the remedial method is inconvenient to improve security of [2]. Firstly, the remedial method is vulnerable to resist brute-force attack for it cuts down key space. Moreover, its speed is slow because of complicated operation. Therefore, it is necessary to enhance security of this class of chaos-based cryptosystems. For the mask model, there exists a potential safety risk of recovering pseudorandom number sequences and need to be solved.

In proposed scheme, we address this problem by disturbing the search of chaotic trajectory by an additional chaotic dynamics where confusion property is made more complicated while compression radio is not degraded evidently. Two chaotic maps are required in proposed cryptosystem: one map encrypts the first plaintext block. Then the other map encrypts next plaintext using final output of the first map as initial value. Short-period effect is also analyzed in details in terms of the drawback of chaotic dynamics in finite precision systems. Security analysis shows that the enhanced scheme can effectively repair the potential vulnerability and strengthen the security. Simulation results show that security of our proposed scheme is evidently improved and its compression radio is not degraded. Moreover, the key space of the multiple chaotic dynamics approach is enlarged and thus could effectively resist brute-force search attack.

The rest of this paper is organized as follows. In Section 2, cryptanalysis of embedding compression in chaos-based cryptography is presented. Its enhanced scheme could be found in Section 3. Analyses and simulation results can be given in Sections 4 and 5, respectively. In Section 6, conclusions will be drawn.

2. Cryptanalysis of Embedding Compression in Chaos-Based Cryptography

In recent years, there is an increasing trend of designing ciphers based on chaos [11–14]. This is because chaotic systems are sensitive to the initial condition and the system parameters [15]. These properties are desirable in cryptography. Moreover, the knowledge on chaos and nonlinear dynamics can be applied in the field of cryptography. Embedding compression in chaos-based cryptography proposed in [2] could be considered as hybrid cipher with both block cipher and stream cipher, which could be seamlessly embedded in image and multimedia systems [3, 16, 17]. Firstly, plaintext blocks with high probability are firstly encrypted by searching in the lookup table named block cipher, as is also named search model. After that, all the cipher text as well as plaintext with low probability is masked by a pseudorandom bitstream with stream ciphers, as is also named mask model.

Here, vulnerability of mask model is presented. The random number sequences used in mask model are generated by chaotic trajectory, and it is simply given as (2.1). The initial value π‘₯0 and control coefficient b could be considered as pseudorandom number seed, and pseudorandom number sequences π‘š are real key of this model. In a chosen plaintext attack, the attacker requests the ciphertext of 𝑃1={𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1β‹―}. The lookup table only has one symbol 𝑠1, the needed iteration of block cipher is one time. Thus, the output of block cipher is composed of number 1. Now, the attacker knows output of block cipher and ciphertext exactly, and then he can obtain mask sequences without any knowledge of pseudorandom number seed. Even though the scheme proposed in [2] is hybrid cryptosystem, after the mask sequences have been obtained, the attacker could recover part of ciphertext. Furthermore, the cryptosystem becomes a single operation of block cipher which has been found vulnerable [7] π‘š=𝑔1ξ€·π‘₯0ξ€Έ,𝑏.(2.1)

3. The Proposed Scheme

3.1. Basic Principle

In the proposed scheme, the time series used in cryptosystem is generated by two chaotic systems, as shown in Figure 1. Moreover, the time series also could be fractal time series which is random as reported in [18–20]. Reference [21] discusses the random data generation of the type discussed in [22]. For sake of simplicity, only chaotic-based time series is studied in proposed paper. The first one is an arbitrary chaotic map stated in (3.1), the second one, as given by (3.2), is also a chaotic map and its chaotic trajectory is different from that of the first ones. There are two constraint conditions. Firstly, if the same type chaotic maps are adopted, the control coefficients must keep unequal. Secondly, the phase space of chaotic map must be in the same range 𝑋𝑑=𝑓1ξ€·π‘‹π‘‘βˆ’1ξ€Έπ‘Œ,(3.1)𝑑=𝑓2ξ€·π‘Œπ‘‘βˆ’1ξ€Έ.(3.2)

Here, the searching model, that is, the first process of proposed scheme, is presented. The initial conditions are arbitrarily assigned with 𝑋0. First of all, the sender performs π‘˜1 iterations on (3.1) until chaotic state π‘‹π‘˜1 locates in the phase space mapped to the first plaintext block. After the first symbol was encrypted, π‘‹π‘˜1 is selected as the initial state of the second chaotic map and used for encrypting the second plaintext block. The sender performs π‘˜2 iterations on (3.2) and π‘Œπ‘˜2 locates in the phase space of the second plaintext block. And then π‘Œπ‘˜2 is selected as initial state of (3.1) in the second round. The third plaintext block is encrypted by (3.1). These operations repeat until all the plaintext blocks have been encrypted. The cipher text is a collection of the number of iterations {π‘˜1π‘˜2π‘˜3β‹―π‘˜π‘š} obtained in each round. Therefore, there are new initial states related to previous plaintext block for encrypting each plaintext block.

3.2. Chaotic Maps for the Proposed Scheme

To illustrate the proposed scheme, logistic map is used as chaotic map which is a typical chaotic system widely used in cryptosystem [1, 2, 6], as shown in (3.3) and (3.4). The control coefficient 𝑏0 and 𝑏1 are unequal. Both of them are in the range of [3.9, 4] so as to avoid the nonchaotic regions according to [23]π‘₯𝑑=𝑏0π‘₯π‘‘βˆ’1ξ€·1βˆ’π‘₯π‘‘βˆ’1𝑦,(3.3)𝑑=𝑏1π‘¦π‘‘βˆ’1ξ€·1βˆ’π‘¦π‘‘βˆ’1ξ€Έ.(3.4)

To examine the properties of the multiple chaotic dynamics, the largest Lyapunov exponent of the discrete time series is computed by the method described in [24]. The parameters of chaotic map 𝑏0 and 𝑏1 are arbitrarily selected from the range of [3.6,4]. The iteration times of two chaotic maps are randomly selected. The largest Lyapunov exponent is tested thousands of times, all of test results are bigger than 0. It means that the system is chaotic at various situations that assure the chaotic status at encryption and decryption processes.

3.3. Encryption Procedures

Similar to [2], the encryption procedure is hybrid one. More probable symbols are encrypted by searching in the lookup table, while less probable ones are masked by a pseudorandom bitstream as performed in stream ciphers. In the searching model, the phase space of the logistic map is divided into a number of equal-width partitions. Each partition associates with a possible plaintext symbol. The greater the probability of occurrence of the symbol is, the more phase-space partitions it maps to. A secret chaotic trajectory produced by the multiple chaotic dynamics system is used to search the partition mapped to the plaintext symbol. The number of iterations of the logistic map for searching each plaintext symbol is the length of the searching trajectory, which is then taken as the cipher text [6]. Meanwhile, the pseudorandom sequence is generated from the secret chaotic trajectory, and it will be used in mask model. After searching model has been processed, Huffman code is performed for all the collected number of iterations. At the last step, the intermediate sequence and the less probable plaintext sequence are masked by the binary mask sequence which is generated in searching model.

Without loss of generality, the plaintext is assumed as a sequence of symbols. The encryption procedures are as follows.

Step 1. Scan the whole plaintext sequence to find out the number of occurrence for each plaintext symbol. Then, select the top symbols, and construct map table according to the probability.

Step 2. Encrypt each more probable plaintext symbol sequentially according to the method in [2]. The different is that the two chaotic maps are used rotationally. For instance, (3.3) is used for encrypting the 𝑗th plaintext block, the end state of (3.3) is adopted as the initial state of (3.4). Then (𝑗+1)th block is encrypted by (3.4). In the each iteration, eight masking bits are extracted from the least significant byte of the chaotic trajectory. They are appended to form a binary mask sequence for later use in mask model.

Step 3. After all the plaintext blocks have been processed, a Huffman tree is built for all the collected number of iterations, including zero. When the Huffman tree is built, the number of iterations and the special symbol are replaced by the corresponding variable-length Huffman code to form the intermediate sequence π‘Ÿ.

Step 4. The intermediate sequence π‘Ÿ should be masked by the binary mask sequence, which is a stream cipher named mask model. This mask model is different from the original method proposed in [2] as shown in (3.5). The ciphertext of block 𝑗 is given by (3.5) where π‘Ÿπ‘—, π‘šπ‘—βˆ’1 and 𝐢𝑗 are intermediate sequences, mask sequences and ciphertext sequence packaged in 32-bit block, respectively. The π‘šβˆ’1 is initial value pseudorandom number sequence that could be considered as cipher key of proposed scheme 𝑐𝑗=ξ€·π‘šπ‘—βˆ’1+π‘Ÿπ‘—ξ€Έmod232.(3.5)

3.4. Decryption Procedures

Before the decryption, initial values of the chaotic map, which are the secret key, must be delivered to the receiver secretly. The key includes the control parameters 𝑏0, 𝑏1, π‘šβˆ’1 and the initial values π‘₯0 of the chaotic maps. Moreover, the plaintext probability information must also be available to the receiver for reconstructing the symbol mapping table. With this information, the receiver is able to reproduce the secret chaotic trajectory used in encryption. The decryption process is similar to the encryption one. The receiver regenerates the chaotic sequence from the secret key, and then looks up the plaintext symbol from the table. The decryption processes for decrypting 𝑗th are as follows.

Step 1. Unmask the intermediate sequence π‘Ÿπ‘— according to π‘šπ‘—βˆ’1 of (3.5).

Step 2. According to intermediate sequence π‘Ÿπ‘—, iterate the chaotic map using the shared secret parameters and the initial conditions to regenerate the chaotic trajectory as used in encryption. Decode the variable-length Huffman code using the shared Huffman tree to find out the number of iterations required. If the number is zero, this means that the block was only encrypted in mask mode and the block is copied directly as the output. Otherwise, iterate the chaotic map with the nonzero number of iterations and determine the final partition visited by the chaotic trajectory. Then, extract the mask bits from the binary representation of each chaotic map output to form the binary mask sequence π‘šπ‘—. Go to Step 1 and unmask the intermediate sequence π‘Ÿπ‘—+1.

Step 3. Perform Steps 1 and 2 repeatedly until whole ciphertext blocks are decrypted.

4. Security Analysis

4.1. One-Time Pad Attacks

The security of the original Baptista-type cryptosystem is analyzed in [7, 8] and some effective attacks were suggested there. The major problem of this cryptosystem is that the search trajectory is solely determined by the secret key. The same trajectory is used for all plaintext sequences unless the secret key is changed. As a result, the cryptanalyst can easily launch a chosen-plaintext attack to recover the whole chaotic trajectory. Details of the attack can be found in [7, 8]. Therefore, some researchers insist that Baptista-type cryptosystem can not meet even the most basic security requirements. The fatal vulnerability of Baptista’s scheme is one-time pad attack which is a considered as a kind of chosen plaintext attack. The plaintext block of each state of chaotic trajectory is obtained by way of one-time pad attacks. As shown in Figure 2, the chaotic trajectory π‘ˆ is generated from key based chaotic map, and the corresponding plaintext sequence 𝑉 is obtained by one-time pad attacks [7]. Since π‘ˆ remains unchanged for different plaintext sequence, if one-time pad requests the cipher text of the following plain text 𝑃1=(𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1β‹―), the corresponding cipher text is 𝐢1=(532223232232β‹―). Examining the cipher text, the attacker knows that the value {𝑣5,𝑣8,𝑣10β‹―} is 𝑠1. Then the attacker could get partial information about 𝑉. To complete knowledge of the symbolic sequence 𝑉, the attacker requests the cipher text of all kinds of plaintext sequence similar to 𝑃1. After obtaining enough sequence 𝑉, attacker does not need to know initial value of chaotic map, while he can know exactly the plaintext corresponding to ciphertext.

From the above analysis, it is known that the one-time pad attacker is to obtain corresponding plaintext block sequences 𝑉. The variables 𝑉1⋯𝑉2 denote the partial information of corresponding plaintext block sequences 𝑉. The way is to request the cipher text of the plaintext like 𝑃1, and then obtain partial information 𝑉1 of 𝑉. After the attacker has requested the cipher text of all 256 kinds of plaintext, full knowledge of the symbolic sequence 𝑉 is gained as shown in (4.1) where 𝑒 is plaintext bit length. The value of 𝑒 is set 8 because symbol in ASCII is 8-bit. In proposed scheme, the encryption procedure not only depends on the initial value, but also relates to the plaintext. Each distinct plaintext block leads to a different search trajectory for encryption. The knowledge of a search trajectory, for a particular plaintext block, is useless for other plaintext block. Moreover, the chaotic maps are iterated sequentially, the confusion of encryption is diffused. These lead to the difficulty in attacking the proposed cryptosystem using known/chosen-plaintext attack. In Section 4.2, we know that the trajectory is different for different plaintext {𝑃1𝑃2𝑃3β‹―}. Therefore, (4.1) is unavailable in proposed scheme, and the attacker couldnot get the full knowledge of 𝑉𝑉=𝑉1βˆͺ𝑉2⋯𝑉2π‘’βˆ’1βˆͺ𝑉2𝑒.(4.1)

4.2. The Security of Block Model and Mask Model

The block cipher of Embedding Compression in Chaos-Based Cryptography could be considered as a variant of Baptista-type cryptosystem [6]. Here, the reasons of vulnerability were reviewed. Considering the ciphertext sequence 𝐢1=(122131β‹―) and 𝐢2=(213211β‹―), the corresponding trajectories are π‘ˆ1 and π‘ˆ2 as shown in Table 1. For comparison, the original scheme Baptista-type cryptosystem [6] is present. It is found that the chaotic trajectory π‘ˆ1 is distinct from π‘ˆ2 in proposed scheme. However, they keep the same in Baptista’s type cryptosystem.

The scheme presented in [25] is a complex combination of multiple chaotic maps. In proposed scheme, the interaction among chaotic trajectories is existent. After encrypting current plaintext block, the last state of chaotic map is selected as the initial state of next chaotic trajectory to encrypt next plaintext block. Therefore, the initial value used for encrypting current plaintext block is related to last ciphertext, and thus different plaintext sequences lead to distinct initial values for encryption. It is useless to launch a chosen plaintext attack by regenerating the chaotic trajectory, since each different plaintext is encrypted by unique chaotic trajectory. Meanwhile, the additional chaotic dynamics results in a more secure cryptosystem for it increases the confusion in the encryption process and expands key space.

After searching model has been performed, pseudorandom bitstream is generated by chaotic trajectory, the less probable symbols are encrypted by masking them with the pseudorandom bitstream. In proposed scheme, the pseudorandom number sequences are also generated by chaotic trajectory which is related to plaintext sequences. It is simply given as (4.2) where P is plaintext sequences. In a chosen plaintext attack, the attacker requests the ciphertext of 𝑃1={𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1𝑠1β‹―}. In the same way presented in Section 2, it can obtain mask sequences without any awareness of pseudorandom number seed. However, the pseudorandom number sequences only relate to the plaintext 𝑃1. For other plaintext sequences, the mask sequences π‘š are totally different. Hence this kind of chosen plaintext attack doesnot work in proposed method. Now, both block cipher and stream cipher of [2] are enhanced π‘š=𝑔2ξ€·π‘₯0ξ€Έ,𝑏,𝑃.(4.2)

4.3. Periodicity of Multiple Chaotic Dynamics

The period of outputs in chaotic region can be looked on as infinite. However the actual period is limited by precision format of digital computer. Short period of chaotic trajectories generated by finite precision system [26] has been an obstacle in employing chaotic dynamics for cryptographic purpose. Short period makes the chaotic trajectories reused frequently in encryption process. This is equivalent to reuse key and weaken the security of the cryptosystem. The control parameter 𝑏0 and 𝑏1 should be chosen carefully to avoid nonchaotic regions [2, 23] whose Lyapunov exponent is negative. In proposed scheme, Lyapunov exponent is chosen as an indicator. It defines the region where the system behaves chaos [27] as shown in Figure 3. We calculated the period of logistic map when the control coefficient 𝑏0 and 𝑏1 is in the region where Lyapunov exponent is positive. Results show that all the length of period in chaotic region is far larger than 107 in double precision system which is a reasonable period length in practical cipher applications [26].

For plaintext perturbation and interaction of multiple chaotic maps, the period of multiple chaotic maps tends to infinity in theory. Even in the worst case, the period of multiple chaotic maps is the same as that of single map.

5. Simulation Results

To implement the proposed cryptosystem, the control parameters are arbitrarily selected as 𝑏0=3.999999991 and 𝑏1=3.999991. The initial condition π‘₯0 is arbitrarily set to 0.3388. The plaintext block is encrypted in bytes and the phase space of chaotic map is divided into 256 equal-width partitions. The maximum number of iterations for the search mode is 15. The proposed algorithm is implemented in C++ programming language running on a personal computer with an Intel Core(TM) 2 2.00 GHz processor and 2 GB memory.

5.1. Compression Ratio, Encryption and Decryption Speed

To test the compression capability of the proposed scheme, 18 distinct files of different types are used, including text, executable file, geophysical data and picture. They are standard files from the Calgary Corpus. These files are encrypted using the proposed scheme and the algorithm suggested in [2], respectively. Only 16 probable plaintext symbols are selected and they are all mapped to one table [2]. The simulation results are listed in Table 2. It shows that all files can be compressed effectively using proposed approach. The compression ratios of our scheme are same as those reported in [2].

The encryption and decryption speed of Baptista-type chaotic cryptosystem depends on the average number of iterations required. The encryption and decryption time on the Calgary Corpus files [28] can be found in Table 3. For comparison purpose, the encryption and decryption time using the algorithm in [2] are also listed. The results show that total processing time of proposed scheme is almost the same as that of the scheme proposed in [2].

Compared with existing chaotic cryptographic schemes, the proposed scheme does not take any additional computation. Tables 2 and 3 indicate that the multiple chaotic dynamics do not lead to a sacrifice of performance due to enhance of security. For the classical separate compression-encryption schemes, the test results of speed are directly chosen from [2]. Table 3 shows that speed of both proposed scheme and scheme in [2] is faster than that of the separate compression-encryption scheme.

5.2. Key Space and Key Sensitivity

To test the key sensitivity, the files from the Calgary Corpus are encrypted using different sets of secret key. Encryptions using proposed scheme were performed with only a small change of the parameters. The two resultant cipher text sequences are then compared bit-by-bit and the percentage of bit change is calculated. For parameters 𝑏0, 𝑏1, and π‘₯0, the 15th digit after decimal point is changed by a minimal value. For π‘šβˆ’1, the least bits in different position are crossover from 0 to 1 or 1 to 0. The measured bit change percentages are 50.01%, 50.01%, 49.98% and 50.07% for 𝑏0, 𝑏1, π‘₯0 and π‘šβˆ’1, respectively. The results show that bit change percentages are close to 50% and indicate that the cipher text is very sensitive to the key.

The freely-chosen key of the proposed scheme consists of the control parameters 𝑏0, 𝑏1 and the initial values π‘₯0, together with the 32-bit initial cipher block value π‘šβˆ’1. The key sensitivity test indicates that the 15th bit is effective that the ciphertext is sensitive to that bit. Therefore, the control parameters 𝑏0, 𝑏1 and initial state π‘₯0 are real numbers represented by double-precision format that are equivalent to 52 bits. According to Rule 5 of [23], 𝑏0 and 𝑏1 should avoid the nonchaotic regions. Here, similar to [2], the range of [3.9, 4.0] is adopted as chaotic regions and this is equivalent to 46 bits. The total key space is then 52+46+46+32=176 bits, which is longer than 130 bits used in [2]. Similar to [12, 13], chaotic maps with a higher dimension can be chosen if a larger key space is required. As a result, the attacker has to meet with an even larger key space for resisting a brute-force search attack.

5.3. Randomness of the Binary Mask Sequence

The mask model is a stream cipher that masks plaintext symbols by a pseudorandom bitstream [2]. The randomness of the pseudorandom bitstream is also confirmed by the statistical test suite recommended by the U.S. National Institute of Standards and Technology (NIST) [29]. 300 sequences, each of bits, are extracted for testing, and they all pass the statistical tests including frequency, block frequency, cumulative sums, runs, longest run, rank, and fast Fourier transform (FFT). All of 𝑃-values are larger than  0.01. Therefore, the sequences are considered as random according to the NIST Special Publication 800–22 [29].

6. Conclusion

In order to strengthen the security of chaos-based cryptography with embedded compression, a general multiple chaotic dynamics is proposed to correlate the chaotic trajectory with the plaintext. As a result, both the two steps of embedding compression in chaos-based cryptography are enhanced. For the searching model, the security is enhanced to resist a chosen plaintext attack named one-time pad attack. In the mask model, a potential vulnerability is analyzed and repaired. Meanwhile, the key space of proposed scheme is enlarged to resist a brute-force search attack.

Acknowledgment

This work was supported by Shenzhen University Research and Development Fund (SZU R/D) with Grant no. 200903.