Constructing Keyed Hash Algorithm Using Enhanced Chaotic Map with Varying Parameter
A keyed hash algorithm is proposed based on a 1-D enhanced quadratic map (EQM) with a varying parameter. Three measures, namely assigning unique one-time keys, key expansion, and hash length extension, are taken to enhance its security. First, the message is transformed into a parameter sequence for the EQM to absorb, and then the extended keys are generated as the initial values of the EQM. Finally, the EQM is iterated with redundant loops to transform the variable values into a hash value. The algorithm is flexible: it can generate hash values of 256, 512, 1024, or more bits through a parameter switcher, and the redundant loops both eliminate the transient effect of chaos and mitigate the increasing threat of side-channel attacks. Security evaluations and comparisons demonstrated its practicability and reliability.
Hash algorithms are widely used in cryptography to assure data integrity; a hash algorithm maps a message of arbitrary length to a hash value of fixed length. If the input message is unknown, it is extremely difficult to reconstruct it from its hash value. In theoretical cryptography, the security level of a hash algorithm is defined by three properties: preimage resistance, second preimage resistance, and collision resistance. No existing hash algorithm is absolutely secure. Even if a hash algorithm has not been broken, a successful attack against a weakened variant may lead to its abandonment. For example, theoretical weaknesses of SHA-1 were found in 2005, a successful attack on MD5 was demonstrated in 2008, and Google announced a SHA-1 collision in 2017. Although some recognized hash algorithms, such as SHA-2, SHA-3, and SM3, remain secure so far, attacks on them continue.
Many hash algorithms based on chaotic maps have been proposed; however, some 1-D chaotic maps, such as the logistic map and the tent map, are typically insecure or slow, and most of these hash algorithms have been broken. Xiao et al. constructed a hash algorithm based on the piecewise linear chaotic map with a changeable parameter; however, Guo et al. analyzed its weakness and used weak keys to construct a collision. Kwok and Tang designed a hash algorithm based on a high-dimensional chaotic map, with a compression function developed from the diffusion and confusion properties of the chaotic map; however, Deng et al. analyzed the potential flaws in this hash algorithm and took corresponding measures to strengthen the influence of a single-bit change in the message on the final hash value. Liu et al. proposed a keyed hash function using a hyperchaotic system with time-varying parameter perturbation, which is flexible and has a larger key space. Teh et al. designed a keyed hash function based on the logistic map with fixed-point representation. Li et al. designed four 128-bit parallel hash functions based on cross-coupled map lattices, the tent map, circular shifts, and a dynamic S-box with varying parameters.
An attacker can crack the hash value of a short password using a precomputed rainbow table. Petr et al. designed a secure and efficient hash function that blocks rainbow table attacks by padding the input with additional identification information to extend the key length.
Herein, we design a novel keyed hash algorithm and take three measures to resist some known attacks. A pre-encoding process obtains the Unicode value of each character in the message and transforms it into a parameter sequence for the EQM to absorb; the extended keys serve as initial values; and a generation process uses the EQM to produce a hash value of flexible length. Redundant iterations are deliberately included, which both eliminate the transient effect of the chaotic map and mitigate the increasing threat of side-channel attacks. Performance evaluation demonstrated the effectiveness and flexibility of the proposed hash algorithm.
The remainder of the paper is organized as follows: Section 2 briefly introduces the EQM. Section 3 presents the hash algorithm. In Section 4, we present the experimental evaluation, and in Section 5, we present our conclusion.
2. The EQM
The classical quadratic map can be expressed using equation (1) :where the control parameter and state variable . The bifurcation diagram, phase diagram, and Lyapunov exponent shown in Figure 1 demonstrate that equation (1) has abundant bifurcations and dense windows; hence, the distribution of state points is not uniform, and its randomness is not good.
Based on equation (1), we constructed a 1-D EQM using equation (2):where the range of control parameter is extended to and the exponent . The bifurcation diagram and the phase diagram shown in Figures 2(a) and 2(b) demonstrate that the EQM has ergodicity and better randomness, and it is surjective within the interval . Figure 2(c) demonstrates that the Lyapunov exponent increases gradually with the increase of ; hence, the map achieves chaotic state. The state variable and exponent can serve as keys.
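The role of the Lyapunov exponent as a chaos indicator can be illustrated with a short numerical estimate. The sketch below is a minimal example under stated assumptions: it uses the logistic map as a stand-in (it is not the EQM of equation (2)), and the function names and the parameter value r = 4 are chosen here for illustration only. The exponent is estimated as the orbit average of ln|f'(x)|; a positive value indicates a chaotic state.

```python
import math


def lyapunov_exponent(f, df, x0, n_iter=100_000, burn_in=1_000):
    """Estimate the Lyapunov exponent of a 1-D map as the mean of ln|f'(x)|."""
    x = x0
    # Discard the transient part of the orbit first.
    for _ in range(burn_in):
        x = f(x)
    total = 0.0
    counted = 0
    for _ in range(n_iter):
        slope = abs(df(x))
        if slope > 0.0:  # skip points where ln|f'(x)| is undefined
            total += math.log(slope)
            counted += 1
        x = f(x)
    return total / counted


# Stand-in map (assumption, NOT the EQM): logistic map x -> r * x * (1 - x).
def logistic(x, r=4.0):
    return r * x * (1.0 - x)


def logistic_slope(x, r=4.0):
    return r * (1.0 - 2.0 * x)


lam = lyapunov_exponent(logistic, logistic_slope, x0=0.3)
print(f"estimated Lyapunov exponent: {lam:.4f}")
```

For the logistic map at r = 4 the exact exponent is ln 2, so the estimate should be close to 0.69. The same estimator could be applied to any 1-D map whose derivative is available.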
3. Hash Algorithm
Input: message with characters, which can be single-byte or multibyte, and theoretically, the length of the message can be infinite. A unique one-time 256-bit key is assigned according to each user’s identification.
Output: hash value H with len-bit.
A hash algorithm H (M, len, key) can be described as follows:
Step 1 (message pre-encoding): for each character , , transform it into a corresponding Unicode value using equation (3) to obtain and serve as varying parameter of equation (2). It should be noted that even if is a null string, we can pad four specific characters of “====” to it.
Step 2 (key derivation): transform a 256-bit key into its hexadecimal number , and then generate four initial values , , , and using equation (4) and exponent sequence as salt using equation (5):
Step 3 (message absorption): iterate equation (2) 16 rounds with initial parameter sequence , exponent , and initial value , from the second round, set , and so on. Similarly, iterate equation (2) with and initial values , , and , respectively. Finally, we can obtain four variable values , , , and to serve as new initial values of equation (2) with exponent .
Step 4 (hash value generation): after iterating equation (2) 300 times to eliminate the transient process, continue to iterate it times using four initial values , , , and with the salt sequence as salt in turn to obtain four variable sequences , , , and , . Transform them into unsigned integers within the interval [0, 255] using equation (6) to generate two groups of hash value and in hexadecimal form using equation (7), and concatenate them to obtain the final hash value :
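The first part of Step 1 and the hexadecimal conversion in Step 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it only extracts the Unicode code points and pads a null string with “====”, and it does not apply equations (3) to (5), which map these values to EQM parameters, initial values, and salts. The function names `pre_encode` and `key_to_hex` are hypothetical.

```python
from typing import List

PAD = "===="  # padding applied when the message is a null string


def pre_encode(message: str) -> List[int]:
    """Return the Unicode code point of every character in the message.

    Single-byte and multibyte characters are handled alike because
    Python strings are sequences of Unicode code points.
    """
    if message == "":
        message = PAD
    return [ord(ch) for ch in message]


def key_to_hex(key: bytes) -> str:
    """Express a 256-bit key as 64 hexadecimal digits."""
    if len(key) != 32:
        raise ValueError("the key must be exactly 256 bits (32 bytes)")
    return key.hex()


print(pre_encode("A中"))          # code points of a mixed-width message
print(pre_encode(""))             # the padded null string
print(key_to_hex(bytes(32)))      # an all-zero key, for illustration only
```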
The flowchart of the proposed hash algorithm is shown in Figure 3.
4. Experimental Evaluation
4.1. Key Space
4.2. Hash Sensitivity to Message and Keys
A good hash algorithm based on a chaotic map should be very sensitive to any small change in the input message and initial conditions. In the following tests, M1 represents the original input message; M2, M3, and M4 represent minor modifications to M1; and M5 represents a minor change to the key K.
M1 (original message): “As of 2018, the development of actual quantum computers is still in its infancy, but experiments have been carried out in which quantum computational operations were executed on a very small number of quantum bits. Both practical and theoretical research continue, and many national governments and military agencies are funding quantum computing research in additional effort to develop quantum computers for civilian, business, trade, and environmental and national security purposes, such as cryptanalysis. A small 16-qubit quantum computer exists and is available for experiments via the IBM quantum experience project.”
M2: replace the first character “A” of M1 with “a”.
M3: replace the last character “.” of M1 with “,”.
M4: add a blank space to the end of M1.
M5: change one bit of K.
The 256-, 512-, and 1024-bit hash values in hexadecimal form are given in Table 1, and the Hamming distances demonstrate that any slight modification to the message or key changes about 50% of the bits in the hash value.
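The Hamming distance between two hexadecimal hash values can be computed as in the sketch below. Because the EQM equations are not reproduced here, HMAC-SHA-256 from the Python standard library is used purely as a stand-in keyed hash; the key, the two short messages, and the function name `hamming_distance_hex` are assumptions made for illustration.

```python
import hashlib
import hmac


def hamming_distance_hex(h1: str, h2: str) -> int:
    """Count the differing bits between two equal-length hex strings."""
    if len(h1) != len(h2):
        raise ValueError("hash values must have the same length")
    return bin(int(h1, 16) ^ int(h2, 16)).count("1")


# Stand-in keyed hash (assumption): HMAC-SHA-256, not the proposed algorithm.
key = bytes(range(32))  # illustrative 256-bit key
d1 = hmac.new(key, b"As of 2018", hashlib.sha256).hexdigest()
d2 = hmac.new(key, b"as of 2018", hashlib.sha256).hexdigest()

dist = hamming_distance_hex(d1, d2)
print(f"{dist} of 256 bits differ ({100 * dist / 256:.1f}%)")
```

For a well-behaved hash, the printed percentage should lie near 50% for any single-character change.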
4.3. Statistical Distribution of Hash Value
The hash values generated by a good hash algorithm should be evenly distributed. Figure 4 shows the distributions of the message M1 and of its 256-bit and 512-bit hash values. As Figure 4(a) shows, the ASCII values of M1 are concentrated within a few intervals, whereas the hash values shown in Figures 4(b) and 4(c) are distributed uniformly. In addition, we used the hash algorithm to calculate the 256-, 512-, and 1024-bit hash values of a null string; Figure 5 indicates that the distributions of these hash values are also uniform.
4.4. Statistical Analysis of Confusion and Diffusion
The hash value of a good hash algorithm should be confused and diffused completely , and the ideal result is that one-bit change to the input bits would lead to 50% change in the output bits. Here, we conducted a large number of experiments to analyze its performance. First, a random message M with the size of is generated, and len-bit hash value is calculated. Second, a single bit in M is changed, and a new len-bit hash value is calculated. Two hash values are compared bit by bit to obtain the total number of changed bits. The experiment is repeated N = 5000 times with len = 256-bit, 512-bit, and 1024-bit, respectively.
The corresponding histograms of the total number of changed bits are plotted in Figure 6. The totals concentrate around the ideal values of 128, 256, and 512 bits for the 256-, 512-, and 1024-bit hash values, respectively; that is, about 50% of the bits change. Hence, the diffusion and confusion results are close to ideal.
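The single-bit-change experiment can be sketched as follows. This is an illustration under stated assumptions: HMAC-SHA-256 stands in for the proposed hash (so only the 256-bit case is shown), and the message length, trial count, seed, and all function names are chosen here for the example rather than taken from the paper.

```python
import hashlib
import hmac
import random


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the bit at position index inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


def changed_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")


def avalanche_trials(hash_fn, msg_len, n_trials, seed=0):
    """Hash a random message, flip one random bit, hash again, and compare."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_trials):
        msg = bytes(rng.getrandbits(8) for _ in range(msg_len))
        pos = rng.randrange(8 * msg_len)
        counts.append(changed_bits(hash_fn(msg), hash_fn(flip_bit(msg, pos))))
    return counts


KEY = bytes(range(32))  # illustrative key


def stand_in_hash(msg: bytes) -> bytes:
    # Assumption: HMAC-SHA-256 replaces the EQM-based hash in this sketch.
    return hmac.new(KEY, msg, hashlib.sha256).digest()


counts = avalanche_trials(stand_in_hash, msg_len=64, n_trials=200)
print(min(counts), max(counts), sum(counts) / len(counts))
```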
The following statistics are used to test the performance of the hash algorithm. Here, len is the length of the hash value, N is the number of tests, Bi denotes the number of different bits between the hash values obtained in the i-th test, denotes the minimum number of different bits, denotes the maximum number of different bits, denotes the mean changed bit number, denotes the mean changed probability, denotes the standard deviation of numbers of changed bits, and denotes the standard deviation .
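These six statistics can be computed from the list of changed-bit counts as sketched below. The sketch assumes the definitions conventionally used in chaos-based hash evaluations: the minimum, maximum, and mean of the counts, the mean expressed as a percentage of len, and sample standard deviations with N − 1 in the denominator. The dictionary keys and the function name are hypothetical labels.

```python
import math


def avalanche_statistics(counts, length):
    """Summarize changed-bit counts B_i for a hash of `length` bits.

    Assumed (conventional) definitions:
      mean_B  = (1/N) * sum(B_i)
      P       = mean_B / length * 100%
      delta_B = sqrt(sum((B_i - mean_B)^2) / (N - 1))
      delta_P = sqrt(sum((B_i/length - P)^2) / (N - 1)) * 100%
    """
    n = len(counts)
    if n < 2:
        raise ValueError("at least two tests are needed")
    mean_b = sum(counts) / n
    frac = mean_b / length
    delta_b = math.sqrt(sum((b - mean_b) ** 2 for b in counts) / (n - 1))
    delta_p = math.sqrt(sum((b / length - frac) ** 2 for b in counts) / (n - 1))
    return {
        "B_min": min(counts),
        "B_max": max(counts),
        "B_mean": mean_b,
        "P_percent": 100.0 * frac,
        "delta_B": delta_b,
        "delta_P_percent": 100.0 * delta_p,
    }


stats = avalanche_statistics([120, 128, 136], length=256)
print(stats)
```

With these definitions, the standard deviation of the percentage is simply the standard deviation of the bit counts divided by len and multiplied by 100.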
Tables 2–4 give the statistical results obtained by randomly changing one bit of M1 and executing the hash algorithm N times for hash lengths of 256, 512, and 1024 bits. In each test, the total number of changed bits between the new and the original hash values is calculated.
Tables 5–7 are the comparison results with other hash algorithms, and the results demonstrate that, for all the values belonging to N, the mean changed bit number is very close to the ideal number of changed bits , from which we can infer that the hash algorithm has strong capability of confusion and diffusion. Meanwhile, the mean changed probability P is very close to the ideal value of 50%, which is one of the desired features of confusion. Another good feature of the hash algorithm is that both and are very small for all the tests, which means that the confusion and diffusion capability is very stable.
4.5. Collision Analysis
4.5.1. Meet-in-the-Middle Attack
To seek a collision, the meet-in-the-middle attack is conducted on intermediate variables, and a collision could be found if two intermediate variables match [22, 23]. This type of attack is invalid for the proposed hash algorithm, due to the initial values of EQM serving as keys, which can make the inverse computation extremely difficult. Hence, the proposed hash algorithm can resist the meet-in-the-middle attack.
4.5.2. Collision Analysis
To perform a collision analysis, message M1 with the length of L = 50 len is randomly generated, and its len-bit hash values are calculated and stored in ASCII form (8-bit per character). Then, we randomly change one bit to M1, calculate its hash value, and compare two hash values to obtain the absolute difference between two hash values using the following equation :where and denote the i-th ASCII character of two hash values and the function maps an ASCII character to its decimal value. The theoretical value of average absolute distance per character is 85.3333.
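The distance measure can be sketched as below, assuming it is the sum over all characters of the absolute difference between the decimal values of corresponding characters. The second function enumerates all pairs of byte values to check the expected per-character distance for independent, uniformly distributed 8-bit characters. Both function names are hypothetical.

```python
def absolute_difference(h1: bytes, h2: bytes) -> int:
    """Sum of |t(e_i) - t(e'_i)| over the characters of two hash values."""
    if len(h1) != len(h2):
        raise ValueError("hash values must have the same length")
    return sum(abs(a - b) for a, b in zip(h1, h2))


def mean_distance_uniform_bytes() -> float:
    """Average |i - j| over all pairs of 8-bit values, by enumeration."""
    total = sum(abs(i - j) for i in range(256) for j in range(256))
    return total / 256 ** 2


print(absolute_difference(b"\x00\xff", b"\xff\x00"))
print(round(mean_distance_uniform_bytes(), 2))
```

The enumerated mean is about 85.33 per character, in line with the theoretical value quoted above.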
In Table 8, we present the minimum, maximum, and mean values of the absolute difference between two hash values, from which we can infer that when we set h = 256 and 512, the results of the proposed hash algorithm are as good as some existing hash algorithms, such as SHA-2, SHA-3, and other chaos-based hash algorithms.
4.6. Rainbow Table Resistance Analysis
A rainbow table is a practical example of a space/time tradeoff: it uses less processing time and more storage than a brute-force attack that calculates a hash value on every attempt, but more processing time and less storage than a simple lookup table with one entry per hash. Using a key derivation function that employs a salt makes this attack ineffective. In the proposed hash algorithm, we took two measures against the rainbow table attack. (1) One-time keys: we assign different one-time keys, drawn from a key sequence sampled from noise, to different users according to their identifications. (2) Random salt: in each iteration of equation (2), we add salt derived from the key by perturbing the exponent.
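The effect of per-user keys on precomputation can be illustrated with a deliberately simplified sketch. It uses a plain lookup table rather than a true rainbow table with reduction chains, and unkeyed SHA-256 and HMAC-SHA-256 as stand-ins for an unkeyed and a keyed hash; the candidate passwords, the keys, and the function names are assumptions for illustration.

```python
import hashlib
import hmac
import secrets


def build_lookup_table(candidates):
    """Precompute unkeyed digests of candidate passwords (attacker's table)."""
    return {hashlib.sha256(pw).hexdigest(): pw for pw in candidates}


def keyed_digest(key: bytes, password: bytes) -> str:
    """Stand-in keyed hash (assumption): HMAC-SHA-256 with a per-user key."""
    return hmac.new(key, password, hashlib.sha256).hexdigest()


table = build_lookup_table([b"123456", b"password", b"qwerty"])

password = b"123456"
unkeyed = hashlib.sha256(password).hexdigest()
key_a = secrets.token_bytes(32)  # one-time key of user A
key_b = secrets.token_bytes(32)  # one-time key of user B

print(table.get(unkeyed))                        # found in the table
print(table.get(keyed_digest(key_a, password)))  # not found: None
print(keyed_digest(key_a, password) == keyed_digest(key_b, password))
```

Because each user's digest depends on a secret, unique key, a table precomputed without that key is of no use, and one table cannot be shared across users.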
4.7. Speed Analysis
To analyze the computation speed, we implemented the proposed hash algorithm on a PC with a 2.50 GHz Intel Core i7-6500U CPU, 16 GB of memory, and the Windows 10 operating system. For a test message of 20,000 ASCII characters, the speed is about 131.2 Mbps with N = 2048. Experiments showed that the running speed is unaffected by the hash value length.
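A throughput figure in Mbps can be obtained with a simple timing loop such as the one below. This is only a measurement sketch: SHA-256 stands in for the proposed hash, the repeat count is arbitrary, and the resulting number depends on the machine and interpreter, so it is not comparable to the figure reported above.

```python
import hashlib
import time


def throughput_mbps(hash_fn, message: bytes, repeats: int = 200) -> float:
    """Return the hashing throughput in megabits per second."""
    start = time.perf_counter()
    for _ in range(repeats):
        hash_fn(message)
    elapsed = time.perf_counter() - start
    return 8 * len(message) * repeats / elapsed / 1e6


message = b"a" * 20_000  # 20,000 ASCII characters, as in the test above
speed = throughput_mbps(lambda m: hashlib.sha256(m).digest(), message)
print(f"{speed:.1f} Mbps")
```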
4.8. Computational Complexity
The computational complexity of the proposed hash algorithm depends on the message length and the number of EQM iterations. For a message M of n characters, n transformations are needed to convert it into a parameter sequence, so the time complexity of this step is O(n). The EQM performs n + 300 + len/16 iterations with a varying parameter; hence, its time complexity is also O(n). In addition, 140 addition, multiplication, modulo, hex-conversion, and XOR operations are performed; these are independent of n, so their time complexity is O(1). Therefore, the total computational complexity of the proposed hash algorithm is O(n).
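The iteration count stated above can be evaluated directly to see the linear growth. The function name and the integer division are choices made for this sketch; the constants come from the text.

```python
def eqm_iterations(n_chars: int, hash_len: int, transient: int = 300) -> int:
    """Number of EQM iterations: n + 300 + len/16, as stated in the text."""
    return n_chars + transient + hash_len // 16


for n in (20_000, 40_000, 80_000):
    print(n, eqm_iterations(n, 256), eqm_iterations(n, 1024))
```

Doubling the message length adds exactly n iterations, while changing len from 256 to 1024 adds only 48, which is consistent with both the O(n) bound and the observation that the hash length barely affects the running speed.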
5. Conclusion
A novel hash algorithm is constructed based on the EQM. Three measures, namely adaptively assigning unique one-time keys, key expansion, and hash length extension, are taken to resist the rainbow table attack. Three steps of message pre-encoding, message absorption, and hash value generation are implemented. The hash algorithm is flexible: it can be keyed or unkeyed, and it can generate a 256-bit, 512-bit, 1024-bit, or longer hash value through a parameter switcher. Any characters, including single-byte and double-byte characters, can be transformed into a parameter sequence for the EQM to absorb. Simulation results and performance analysis demonstrated the effectiveness and flexibility of the proposed hash algorithm. In the future, we intend to study chaos-based parallel hash algorithms that can resist attacks from quantum computing.
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Hongjun Liu was the major contributor and contributed to algorithm design; Abdurahman Kadir contributed to algorithm optimization; Chao Ma was responsible for statistics of experimental results; and Chengbo Xu contributed to diagram design.
This research was supported by the National Natural Science Foundation of China (no. 61662073).
M. J. Schiereck, Sha-3 Standard: Permutation-Based Hash and Extendable-Output Functions, Federal Inf. Process. Stds. (NIST FIPS)-202, Gaithersburg, MD, USA, 2015.
A. Sotirov, M. Stevens, J. Appelbaum et al., “MD5 considered harmful today, creating a rogue CA certificate,” in Proceedings of the 25th Annual Chaos Communication Congress, Leipzig, Germany, January 2008.
T. Fox-Brewster, Google Just “Shattered” an Old Crypto Algorithm-Here’s Why That’s Big for Web Security, Forbes, Waltham, MA, USA, 2017.
F. J. S. Moreira, Chaotic Dynamics of Quadratic Maps, IMPA, Colchester, UK, 1993.
G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche, “The Keccak sponge function family, submission to NIST’s SHA-3 competition,” 2011.