An Improved Genetic Algorithm for Developing Deterministic OTP Key Generator

Jain, Ashish; Chaudhari, Narendra S.

doi:https://doi.org/10.1155/2017/7436709

Complexity

Research Article | Open Access

Volume 2017 | Article ID 7436709 | https://doi.org/10.1155/2017/7436709

Ashish Jain¹ and Narendra S. Chaudhari¹,²

Academic Editor: Roberto Natella

Received 22 Mar 2017

Revised 28 Jun 2017

Accepted 10 Jul 2017

Published 11 Oct 2017

Abstract

Recently, a genetic-based random key generator (GRKG) for the one-time pad (OTP) cryptosystem has been proposed in the literature which has certain limitations. In this paper, two main characteristics (speed and randomness) of the GRKG method are significantly improved by presenting the IGRKG method (improved genetic-based random key generator method). The proposed IGRKG method generates an initial pad by using linear congruential generator (LCG) and improves the randomness of the initial pad using genetic algorithm. There are three reasons behind the use of LCG: it is easy to implement, it can run efficiently on computer hardware, and it has good statistical properties. The experimental results show the superiority of the IGRKG over GRKG in terms of speed and randomness. Hereby we would like to mention that no prior experimental work has been presented in the literature which is directly related to the OTP key generation using evolutionary algorithms. Therefore, this work can be considered as a guideline for future research.

1. Introduction

Recent years have witnessed the growing use of information in many areas, including financial, military, and political ones. Security of this information in both storage and transit is crucial, as a compromise may result in financial loss, disclosure of military or commercial secrets, and even the loss of life. Cryptography is one set of techniques for providing information security. Historically, cryptography has commonly been connected with surveillance, warfare, and similar applications. However, with the advent of the information age and the digital revolution, cryptography is also useful in the peaceful lives of common people, for example, when buying something over the Internet with a credit card, withdrawing money from an ATM using a smart card, and locking and unlocking luxury cars.

Cryptography is related to the design of cryptosystems. Cryptosystems have two divisions: symmetric key and asymmetric key. In a symmetric-key cryptosystem, the encryption function takes a text message (plaintext) as input and transforms it into an unreadable text (ciphertext) with the use of a secret key [1]. The decryption function converts the ciphertext back to the plaintext using the same secret key. If any flaws or oversights exist in the cryptosystem, they can be exploited by an attacker [1]. The attacker may be able to recover the plaintext from the ciphertext without knowing the secret key, because cryptographic algorithms are public and the encrypted data travel over an insecure public communication channel. For this reason, sensitive applications, for example, in the financial domain, demand perfect security, which can only be achieved by one-time pad (OTP) symmetric-key cryptosystems in which a key used once for encryption is never used again [1]. For achieving perfect security, an obvious choice is random generation of the key via truly random sources. However, this choice is inefficient. Truly random numbers can be generated from hardware-based physical phenomena (for example, the elapsed time between emissions of particles during radioactive decay; thermal noise from a semiconductor diode or resistor; or sound from a microphone or video input from a camera) and from software-based processes (for example, the system clock; the elapsed time between keystrokes or mouse movements; and operating system values such as system load and network statistics), but these sources are impractical for practical cryptographic applications, which need a large key for each encryption [1]. Therefore, for sensitive applications pseudorandom generation of the key is the only option to make the scheme practical. Recent years have witnessed wide use of computationally secure OTPs all over the world, typically during financial transactions. Hereinafter OTP key means computationally secure pseudorandom key and original OTP key means truly random key. In this work, we present a genetic-based scheme for automatically generating OTP keys.

Several things in the world are naturally encoded, for example, the genomes of animals [2]. This motivates us to utilize genetic algorithms in the development of a deterministic scheme that can generate OTP keys rapidly. In 2013, Sokouti et al. [3] demonstrated a significant use of the genetic algorithm for automatically generating OTP keys. They proposed and compared two genetic-based OTP key generators, namely, 10P-GRKG and GRKG. The comparison results in [3] show that the GRKG method is much better than the 10P-GRKG method in terms of speed and randomness. However, it is observed that the GRKG method has certain limitations which need improvement. In this paper, we propose an improved genetic-based random key generator (IGRKG). Compared to GRKG, the proposed IGRKG generator generates the OTP key more rapidly, and the degree of randomness of the generated keys is better. In the literature, OTP key generation using an evolutionary technique has previously been addressed only in [3]. Therefore, this paper presents a detailed comparison between the GRKG and IGRKG generators. We also compare the Diehard scores of GRKG and IGRKG with some existing pseudorandom number generators. It is important to note that, except for the GRKG and IGRKG generators, the other pseudorandom generators have not been developed to generate OTP keys.

It should be noted that speed and randomness are the main objectives of a designer behind the design of a pseudorandom key generator. For achieving these objectives the following novelties and modifications are introduced, which are our major contributions:
(1) Unlike GRKG, we use a comparatively short secret key.
(2) Unlike the corresponding parameter employed in GRKG, a new variable is proposed, whose essence is the minimum number of generations in which the initial pad is obscured almost entirely.
(3) For determination of the crossover point, rather than using modular arithmetic over addition, we introduce a new approach of modular arithmetic over subtraction. The advantage of this approach is that it improves the randomness of the pad and makes the scheme faster.
(4) For evolving the existing generation, a new and efficient approach is introduced that updates the two variables employed in Algorithm 1 to decide crossover and mutation points. This idea increases the randomness of the existing pad and evolves the obscure final pad rapidly.
(5) For increasing the speed of encryption and decryption, a more efficient encryption and decryption function is suggested.

Figure 1 shows the block diagram of the proposed work. Figure 1 shows that four integer values are taken as input corresponding to the short secret key: the LCG parameters $X_0$, $a$, $c$, and $m$. It should be noted that the values of these parameters are taken only once in the presence of both the sender and receiver. Also, all the values must be “truly random”; together they are referred to as the seed. This seed must be generated from truly random sources, because it is utilized by GA techniques in order to generate larger sized keys. As shown in Figure 1, the seed is first processed by one of the existing statistically sound generators, namely, LCG. Through feedback mechanisms, an initial pad equal in size to the plaintext is generated. That is, $X_0$ is used to generate $X_1$, $X_1$ is used to generate $X_2$, and so on, where for each computation the remaining secret key parameters, that is, the multiplier $a$, the increment $c$, and the modulus $m$, remain unchanged. The initial pad is then converted into a population of individuals, each of which is the binary equivalent of the corresponding integer of the pad. Afterward, the population is evolved by three evolutionary operators: selection, crossover, and mutation (all these operators have been discussed in detail in Sections 5.1, 5.2, and 6). The probability of crossover and mutation is controlled by a common probability parameter. However, for each instance of the problem, the rate of mating and mutation may be different; we determine these rates by a deterministic mathematical procedure (for details, see Algorithm 1). The selection of individuals for mating, the crossover point, the choice of genes for mutation, and the mutation point are also controlled by a deterministic mathematical procedure (see Algorithm 1). Finally, we get an obscure final pad, each element of which is the integer equivalent of the corresponding binary individual. The number of generations is controlled by a dedicated parameter (for details about this parameter, see Section 5.2, item (2)).

(1) Data: .
(2) Result: one-time pad of size , where is the size of the plaintext.
Remark : Generate an initial population Popu of size , where th chromosome is represented as binary string equivalent to .
Here the initial vector is computed via LCG equation, where . That is:
(3) for from 0 to do
(4) Set mod ()
(5) end for
Remark : Initialize two positive integers: Number of selection of chromosome pairs for crossover ( ) and Number of
selection of chromosomes for mutation ( ) by “1”. Sometime, we refer and as crossover and mutation rates,
respectively.
(6) Set 1, 1
Remark : Update crossover and mutation rate, if the size of population is more than or equal to 10. That is:
(7) if 10 then
(8) Compute
(9) if then
(10) Compute . To maintain low probability mutation in case of large sized population.
(11) else
(12) Compute
(13) end if
(14) end if
(15) Initialize mod ()
(16)
(17) repeat
Remark : Select a pair of chromosome and determine a point of crossover. Repeat the process till counter reached .
(18) for from 1 to do
(19) Set (mod)
(20) Compute mod ()
(21) Set (mod)
(22) Compute mod ()
(23) Set mod (), where denotes the starting bit position of the chosen
chromosomes to be reproduced. Here , since a maximum integer value equivalent to its binary representation could
be 255 (i.e., ).
(24) Perform single point crossover between pairs of selected chromosomes. That is between and
with probability , where the starting point for mating is denoted by .
(25) end for
(26) Set
Remark : Select an individual from the mated population and perform a single bit complement mutation for each of the
following iteration, where the new population was generated in the previous steps (18)–(25).
(27) for from 1 to do
(28) Compute mod ()
(29) Set (mod)
(30) Set (mod) , where denotes the position of a bit of the selected chromosome,
.
(31) Change a bit of located at position .
(32) end for
(33) Update
(34) Set
(35) Update mod ()
(36) until counter reached

Advantages of IGRKG over GRKG. (1) The IGRKG generator is much faster than the GRKG generator; for instance, in generating a large secure pad (e.g., for a plaintext of 1000 characters), the average time taken by the IGRKG generator is about 2.432 seconds, while the GRKG generator takes 9.747 seconds (for details, see Section 6.4). These results indicate that the IGRKG generator is about four times faster than the GRKG generator. Consequently, when a significant number of encrypted messages are exchanged, the OTP-IGRKG system will outperform the OTP-GRKG system. (2) In terms of randomness, the quality of the IGRKG generator is significantly better than that of the GRKG generator (see statistical and randomness testing results in Section 6.5).

We organize the remainder of the paper as follows: in Section 3, we present some of the previous valuable research work in the field of cryptology where the genetic algorithm has been utilized. In Section 4, we present basics of the one-time pad and associated challenges. In Section 5, we propose the IGRKG method followed by comparison with previously proposed Sokouti et al. GRKG method. In Section 6, statistical testing and cryptanalysis results are discussed followed by conclusion in Section 7.

3. Genetic Algorithm

The origin of evolutionary algorithms (EAs) is an attempt to mimic some of the process taking place in natural evolution. Although the details of biological evolution are not completely understood (even nowadays), there exist some points supported by strong experimental evidence [5]. Genetic algorithm (GA) is one of the most popular EA techniques that has emerged based on the concept of imitating the evolution of a species [6]. In the case of GA, a population of individuals (or chromosomes) is generated using an intelligent method or a random method [6–8]. Each of these individuals is encoded as a binary string that represents a possible candidate solution to the problem at hand. In each iteration, the survival strength of each candidate solution is measured by a fitness function [6–9]. Afterward, the evolutionary process is constrained by three genetic operators: selection, crossover, and mutation. Through selection procedure, individuals are selected that enter into the crossover process. The crossover operator alters two or more parents to create offspring, where a probabilistic crossover rate is usually used to generate offspring [7, 8, 10]. Mutation operator produces one child from one parent by flipping a bit (s) of the parent. A probabilistic mutation rate is usually used to determine whether a particular change occurred or not within an individual [7, 8, 10].

Crossover and mutation each have important characteristics that the other does not capture. Błażej et al. [11] mentioned that it has never been theoretically shown that mutation is in some sense less powerful than crossover or vice versa. Mutation serves to create random diversity in the population, while crossover serves as an accelerator that promotes emergent behavior from components [11, 12]. The metaissue, then, is the relative importance of diversity and construction. It is impossible for mutation to simultaneously achieve high levels of construction and survival [11, 12]. This would appear to be important since one without the other may not be extremely useful. High construction levels are accomplished at the expense of survival (e.g., mutation rate 0.5), while good survival is at the expense of construction (e.g., mutation rate 0.01) [11, 12]. In our study, we get highly constructive results with mutation rates of 0.25 to 0.3. That is, 25% to 30% of the parents are affected in our study by the mutation operation (see Section 5.1 for details). GA parameters can be controlled in three different ways: deterministic [13, 14], adaptive [13–15], and self-adaptive [13, 14, 16]. Deterministic parameter control takes place when the value of the strategy parameters (e.g., the crossover and mutation rates in our study) is altered by some deterministic rule. This rule modifies the strategy parameters deterministically without using any feedback from the search [13, 14, 17].

Applications of GAs in Cryptographic Applications. GAs have been applied successfully to solve real-world optimization and search problems [9]. These techniques have also shown good potential in the domain of cryptology. Here we mention some of the good works that have been carried out in the last decade. An interesting work in the domain of cryptographic protocol design has been carried out by Park and Hong [18] and Zarza et al. [19] in 2005 and 2006, respectively. Wang et al. (2012) [20] have proposed a novel method based on the genetic algorithm and chaotic map for designing substitution boxes (S-boxes). Jhajharia et al. (2013) [21] have utilized GAs for cryptographic key generation. Jain and Chaudhari (2014) [22] have proposed an improved GA method to attack the knapsack based cryptosystems. Faraoun (2014) [23] has proposed a block cipher design using GA and cellular automata. Recently, GA and CGP techniques have been utilized in [24] for determining strong cryptographic Boolean functions. Jain and Chaudhari (2015) [25] have proposed improved GA for automated cryptanalysis of the substitution ciphers. In [3] Sokouti et al. (2013) have proposed a GA technique for automatically generating OTP keys that we improve in this paper.

4. One-Time Pad Cryptosystems

One-time pad cryptosystems are based on the concept of the stream cipher. In a stream cipher, a short secret key is used to generate a keystream (i.e., a string of bits) [1]. The keystream bits are XORed with the plaintext bits in the usual way to produce the ciphertext [1]. At the receiver end, the ciphertext is XORed with the keystream to get the original plaintext [1]. However, in a stream cipher, the keystream is generated from a short secret key [1]. Therefore, these ciphers can be compromised if not used carefully. The advantage of stream ciphers is that they are much faster in hardware and therefore mostly employed in resource-constrained devices. However, the original OTP is used in those applications where the primary objective is perfect security rather than speed [1]. The conventional OTP cryptosystem combines a plaintext-sized key with the given plaintext code by addition modulo 26 and thereby generates the ciphertext. An example is shown in Table 1. In practice, a plaintext message can consist not only of letters of the English alphabet but also of other ASCII characters. Therefore, in this paper we consider that the encryption and decryption of plaintext are done “modulo 256” rather than “modulo 26.” As a result, each plaintext character consists of 8 bits (i.e., each plaintext character is in the range [0, 255]).
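The modulo-256 convention described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the function names `otp_encrypt` and `otp_decrypt` and the sample pad are assumptions introduced here, and the combination rule is plain byte-wise addition modulo 256, mirroring the conventional modulo-26 scheme.

```python
def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Add each key byte to the matching plaintext byte, modulo 256."""
    if len(key) != len(plaintext):
        raise ValueError("the pad must be exactly as long as the plaintext")
    return bytes((p + k) % 256 for p, k in zip(plaintext, key))


def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """Subtract each key byte from the matching ciphertext byte, modulo 256."""
    if len(key) != len(ciphertext):
        raise ValueError("the pad must be exactly as long as the ciphertext")
    return bytes((c - k) % 256 for c, k in zip(ciphertext, key))


# Hypothetical pad, chosen only for illustration; a real pad is used once.
message = b"cryptology"
sample_pad = bytes([17, 200, 3, 94, 255, 61, 128, 42, 7, 250])

ciphertext = otp_encrypt(message, sample_pad)
recovered = otp_decrypt(ciphertext, sample_pad)
print(recovered)
```

Because both operations work modulo 256, every byte value is a valid plaintext, key, and ciphertext symbol, and decryption inverts encryption exactly when the same pad is used.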

5. Genetic Algorithm for Generating OTP Keys

There are two main challenges in developing the original OTP cryptosystem: (1) The OTP cryptosystem must generate a key of length equal to the length of the plaintext. (2) The key should be truly random for achieving perfect security. Plaintexts are variable sized and often their size is large. Therefore, it is impractical to generate a truly random key of the size of the plaintext.

An efficient option for solving this kind of problem is the utilization of a pseudorandom key [1]. However, it is not trivial to generate a pseudorandom key equal to the length of the plaintext. In this context, Sokouti et al. (2013) [3] have proposed the GRKG method. The GRKG generator accepts a fixed size short secret key as an initial key and thereby generates the pseudorandom key. Here, we point out that each time a different key is generated.

We have two popular pseudorandom generator choices as a base generator, LCG and Mersenne Twister, because of their good statistical properties [26]. However, for cryptographic security the use of a statistically sound generator alone is not sufficient. Therefore, we can employ either LCG or Mersenne Twister for generating the initial pad, and then a genetic algorithm is used to improve the randomness of the initial pad. As a result, an obscure and appropriate OTP key is generated. In this research, we have decided to use the LCG method because it is easy to implement and runs efficiently on computer hardware [26]. Most importantly, its use allows us to give a fair comparison between the two methods, GRKG and IGRKG, since LCG has also been employed in the GRKG method.

5.1. IGRKG: The Proposed Method

Consider an initial key which consists of LCG and GA parameters; the details are as follows.

Parameters Related to LCG. $m$: the modulus; $X_0$: an initial positive integer number for generating another integer number using (1); $a$: the multiplier; $c$: the increment.

Parameters Related to GA. The combined probability of crossover and mutation; the number of selections of chromosome pairs for crossover; the number of selections of chromosomes for mutation; and the minimum number of iterations to generate a sufficiently secure OTP key.

Algorithm 1 (Description). Pseudocode for the proposed IGRKG method is shown in Algorithm 1. Input to the algorithm is a secret key decided once by the communicating parties. Using its first four elements (the LCG parameters $X_0$, $a$, $c$, and $m$), the initial pad is generated via the LCG method, with as many elements as the plaintext has characters. The initial pad is then converted into its equivalent binary representation (see Remark 1).

GA Operators. The binary initial pad generated by the LCG method is modified by applying selection, crossover, and mutation operators that are deterministically [14] controlled. That is, crossover and mutation are not performed at random positions of individuals; rather, the positions are determined using a deterministic procedure. It is emphasized that the same key can be obtained at both ends only if identical evolutionary operations are applied. If we used a fitness function, this constraint would be violated. Therefore, this work does not require any fitness function; however, for generating secure OTP keys, intelligent selection, crossover, and mutation operators have been designed.

Deciding the Numbers of Crossover and Mutation Selections. If the initial pad length is less than 10, then selection of one pair of chromosomes and of a single chromosome is sufficient for reproduction and mutation, respectively (see Algorithm 1, Step (6)). However, for an initial pad of size 10 or more, both numbers are determined deterministically (see Steps (7) to (14)), using a common probability parameter for crossover and mutation.

Fine-Tuning of Crossover and Mutation. We have tested several types of mutation and crossover operators, but the best results have been obtained using simple mutation (which flips a selected bit) and single point crossover. In the literature it has also been shown that, among all the crossover operators, the most successful one is single point crossover [27]. A deterministic procedure is developed for deciding crossover and mutation points (see Steps (23) and (30), resp.). The number of chromosomes mutated is defined as a fixed percentage of the total number of chromosomes (see Steps (7)–(14)).

Finding best combination of crossover rate and mutation rate is an important step in GA. In [28, 29], it is investigated that generally low mutation rates (0.01 to 0.1) and comparatively high crossover rates (0.5 to 0.7) perform very well. However, in [8], it is mentioned that the modern view of EAs admits that specific problem types require specific EA setups. Therefore, different crossover and mutation rates have been experimented to investigate their capability to find good solutions (the conditional optimal values of crossover and mutation rates are shown in Table 2). Note that there is no prior experimental work of this kind, so this work should be considered as a guideline for future research.

Use of and Variables. For the initialization of , we use the last element of the initial pad only once (see Step ). That is, in the evolutionary process, we will never use again due to security reasons, but the GRKG method uses more than once, which is one of the drawbacks of the GRKG method (see Table 3, Steps () and ()). Steps and show that an integer variable is used to select a chromosome pair for mating, where each time possibly a different chromosome pair is mated (see Steps and ). In each iteration the mating operation is performed “” times (see Remark and Step ). Step shows that another integer variable is used to select an individual for mutation, where each time possibly a different individual is mutated (see Steps and ). For each iteration the mutation operation is performed “” times (see Step ). By repetitive applications of mating and mutation, a new population is generated. During evolution of the population through crossover, variable is itself updated (see Steps and ). Similarly, during evolution of the population through mutation, variable is itself updated (see Steps and ). In both cases, the LCG method is used. In each iteration, after crossover and mutation, and are assigned the updated value of and , respectively (see Steps and ). This strategy has been introduced in this research for the purpose of generating robust and secure OTP key (for detailed information, see Section 5.2, Step ).

Use of the Generation-Count Variable. Until the termination condition is satisfied, the new population is fed back into the evolutionary process. The generation-count variable is an integer that indicates the minimum number of generations for which the pad is kept in the evolutionary process (see Section 5.2, item (2)).

5.2. Comparison between GRKG and IGRKG Generators

In this section, we compare the proposed IGRKG method with Sokouti et al.’s GRKG method [3]. A table of comparison based on the features of both generators is shown in Table 3. In this table, we have underlined the values of IGRKG features that are different from their GRKG counterparts. A detailed list of proposed improvements is as follows:
(1) Rather than a secret key of size “seven,” IGRKG uses a short secret key of size “six.” This is possible because the crossover and mutation probabilities have been combined into a single parameter. The algorithm is designed in such a way that the same probability parameter is utilized for performing both crossover and mutation operations (see Table 3).
(2) Unlike the corresponding parameter used in GRKG, IGRKG uses a new generation-count parameter. The essence of this parameter is the minimum number of generations in which the initial pad is obscured almost completely. The IGRKG scheme has been tested with different parameter values. It is observed that 50 generations are sufficient to completely obscure the initial pad. However, after each communication the variable is increased in order to achieve computationally high security.
(3) For evolving the existing generation, a different approach is proposed which is based on effective updates of the two selection variables. In the GRKG method, the same initial value (i.e., PGU) is used for evolving the current generation (see Table 3, Column 2). The limitation of this approach is that the individuals that were improved by crossover are once again selected for mutation. For this reason, the GRKG method requires a large number of generations for evolving the remaining individuals. This limitation is resolved by assigning the selection variables their updated values (i.e., PGU and CGU; see Table 3, Column 3). This gives the remaining individuals, which were not improved by crossover, a chance to be improved through mutation in the same generation. The main benefit of this approach is that there is a high probability that the chromosomes selected for mating are ones that were not selected for mutation and vice versa. This idea increases the randomness of the pad as the iterations proceed. That is, due to this strategy the IGRKG method produces a more randomized pad in fewer iterations than the GRKG method.
(4) In order to determine crossover points, the GRKG method uses modular arithmetic over addition. This approach makes the GRKG scheme conceptually weak. The fact is that the sum of two chromosome values (i.e., the sum of the integers) before and after crossover is always the same. That is, if two chromosomes $A$ and $B$ are mated and converted into $A'$ and $B'$, respectively, then whenever $A'$ and $B'$ are selected for crossover in the next generation, the result is the original chromosomes, that is, again $A$ and $B$ (see Table 3). Clearly, this phenomenon is a big obstacle to increasing the randomness of the input pad. In this paper, we resolve this weakness by suggesting the use of modular arithmetic over subtraction rather than addition. With this strategy, even if the same pair is selected in the next generation, a different crossover point is selected, because the difference of the two chromosome values before and after the crossover operation is different. This approach improves the practical efficiency of the generator.
(5) We have critically examined the encryption and decryption functions suggested by Sokouti et al. [3] and found that they are not appropriate for use in cryptography. The design of encryption and decryption functions is not a part of the OTP key generator. However, as a complete OTP scheme, we advise simple encryption and decryption functions that are often used in stream ciphers (see Table 3).
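The argument in improvement (4) can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: `single_point_crossover` is a hypothetical helper that swaps the lowest `point` bits of two 8-bit chromosomes (the paper's exact bit-indexing convention may differ), and the subtractive rule is taken as the absolute difference modulo 8.

```python
def single_point_crossover(x: int, y: int, point: int) -> tuple[int, int]:
    """Swap the `point` lowest bits of two chromosomes (assumed convention)."""
    mask = (1 << point) - 1
    return (x & ~mask) | (y & mask), (y & ~mask) | (x & mask)


x, y = 252, 243  # two 8-bit chromosome values

# Additive rule: the sum is unchanged by crossover, so the same point recurs
# and a second crossover of the same pair restores the original chromosomes.
add_point = (x + y) % 8
x1, y1 = single_point_crossover(x, y, add_point)
same_sum = (x1 + y1) == (x + y)
restored = single_point_crossover(x1, y1, (x1 + y1) % 8) == (x, y)

# Subtractive rule (assumed as |x - y| mod 8): the difference changes after
# crossover, so the next crossover point differs for this pair.
sub_point = abs(x - y) % 8
u1, v1 = single_point_crossover(x, y, sub_point)
next_point = abs(u1 - v1) % 8
undone = single_point_crossover(u1, v1, next_point) == (x, y)

print(same_sum, restored, sub_point, next_point, undone)
```

Swapping tails moves bits from one chromosome to the other without changing their total, which is why the additive rule keeps returning to the same point; the difference has no such invariant.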

6. Results

For the purpose of comparison between the GRKG and IGRKG generators in terms of speed, we have implemented both generators in Java 2.0 on an Intel quad-core i7 processor (@3.40 GHz). We present the results of both generators on the text “cryptology.” The size of the plaintext “cryptology” is 10 characters. The parameters are chosen so that, in each iteration, two pairs of chromosomes and two chromosomes are affected by crossover and mutation, respectively. Note that this example has been considered by Sokouti et al. [3] in their work. Therefore, for a fair comparison between the GRKG and IGRKG generators, we demonstrate our work on the same example.

6.1. Common Computation

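Consider a secret key whose LCG parameters are $X_0 = 9$, $a = 5$, $c = 7$, and $m = 256$. Using this short secret key, an initial pad is generated iteratively via the LCG method. That is, (5 × 9 + 7) mod (256) = 52, (5 × 52 + 7) mod (256) = 11, (5 × 11 + 7) mod (256) = 62, and so on. Finally, the initial pad is (52, 11, 62, 61, 56, 31, 162, 49, 252, 243). A population of size 10 is initialized, where the $i$th chromosome is the binary equivalent of the $i$th element of the initial pad. This population is input to the GRKG and IGRKG generators for generating OTP keys.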
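The common computation can be reproduced with a few lines of code. In this sketch the function name `lcg_pad` is hypothetical, and the parameter values `x0 = 9`, `a = 5`, `c = 7` are an inference rather than a quotation: they are the only values below 256 consistent with the outputs 52, 11, 62 under modulus 256, and they also reproduce the last pad element 243 used in the following sections.

```python
def lcg_pad(x0: int, a: int, c: int, m: int, n: int) -> list[int]:
    """Return n successive LCG outputs X_1..X_n with X_{i+1} = (a*X_i + c) mod m."""
    pad = []
    x = x0
    for _ in range(n):
        x = (a * x + c) % m
        pad.append(x)
    return pad


# Inferred parameters (assumption); n is the length of the plaintext "cryptology".
initial_pad = lcg_pad(x0=9, a=5, c=7, m=256, n=len("cryptology"))

# Each pad element becomes an 8-bit binary chromosome.
population = [format(value, "08b") for value in initial_pad]

print(initial_pad)   # [52, 11, 62, 61, 56, 31, 162, 49, 252, 243]
print(population[0])  # 00110100
```

Because the generator is fully determined by the four integers, both communicating parties obtain the same initial population from the same short key.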

6.2. Results Obtained Using the GRKG Method

In this section, we determine the OTP key from the initial pad using the GRKG generator, where the initial pad is the one computed in Section 6.1. Table 4 shows the working of the GRKG method. Initially, the selection variable is set to 243, that is, the last element of the initial pad.

Mating. Initially, for mating, the 8th and 9th indexed chromosomes are selected, where the selection is determined as follows: (5 × 243 + 7) mod (256) = 198 and 198 (mod) 10 = 8; then, with the updated value, (5 × 198 + 7) mod (256) = 229 and 229 (mod) 10 = 9. The mating is performed between the 8th and 9th indexed chromosomes at the 7th indexed-gene position. The index is computed as follows: (252 + 243) mod (8) = 7, where 252 and 243 are the values of the two chromosomes and 8 is the chromosome length in bits. Similarly, the second mating operation is performed between the 0th and 5th indexed chromosomes, where the mating starts from the 3rd “(52 + 31) mod (8) = 3” indexed-gene position. Following such selection and crossover mechanisms, the initial pad is transformed into an intermediate pad (see Table 4).

Mutation. The mutation operation is performed using the selection variable, which is again equal to 243 (here, we point out that the initial value 243 is used again for the selection of chromosomes for mutation, which is one of the drawbacks of the GRKG method). As shown in Table 4, the first mutation operation changes the 4th “252 (mod) 8 = 4” indexed bit of the 8th indexed chromosome and the second mutation changes the 3rd “243 (mod) 8 = 3” indexed bit of the 9th indexed chromosome. In this way, after the first iteration, the intermediate pad is transformed into a new pad (see Table 4).
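The index arithmetic quoted for the first GRKG mating and mutation steps can be retraced as below. This is a sketch under assumptions: the multiplier 5 and increment 7 are inferred from the quoted results (198, 229), the names `lcg_step`, `first`, and `second` are hypothetical, and the bit positions are computed from the initial pad values exactly as the quoted expressions do.

```python
# Assumed LCG parameters, inferred from the quoted arithmetic (not quoted values).
A, C, M = 5, 7, 256
CHROMOSOME_BITS = 8

# Initial pad of Section 6.1, indexed 0..9.
pad = [52, 11, 62, 61, 56, 31, 162, 49, 252, 243]


def lcg_step(value: int) -> int:
    """One LCG update of the selection variable."""
    return (A * value + C) % M


state = pad[-1]              # selection variable starts at the last pad element
state = lcg_step(state)      # first update
first = state % len(pad)     # index of the first chromosome of the pair
state = lcg_step(state)      # second update
second = state % len(pad)    # index of the second chromosome of the pair

# GRKG crossover point: modular arithmetic over addition.
crossover_point = (pad[first] + pad[second]) % CHROMOSOME_BITS

# Bit positions quoted for the two mutations.
mutation_bits = (pad[first] % CHROMOSOME_BITS, pad[second] % CHROMOSOME_BITS)

print(first, second, crossover_point, mutation_bits)  # 8 9 7 (4, 3)
```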

Table 4 also shows the recomputation (Re) phase which is one of the limitations of the GRKG method, where recomputation appeared due to the selection of the same chromosome again (i.e., 8th one). Similarly, during the mutation operation, if the same chromosome is selected again, then the GRKG method performs the recomputation. Here we emphasize that this phenomenon needs improvement because, in the case of large sized plaintext, the efficiency of the scheme will degrade. This paper resolves this issue by removing recomputation phase and keeping updating the resulting pad through crossover, where the crossover is performed using modular subtraction rather than modular addition.

6.3. Results Obtained Using the IGRKG Method

In this section, we determine the OTP key from the initial pad using the IGRKG generator, where the initial pad is the one computed in Section 6.1. Table 5 shows the working of the IGRKG method. Initially, the selection variable is set to 243, that is, the last element of the initial pad.

Mating. First of all, the 8th and 9th indexed chromosomes are selected for mating. The mating starts from the 1st “(252 − 243) (mod) 8 = 1” indexed gene. The second mating operation is performed between the 8th and 5th indexed chromosomes of the output pad generated in the previous step, where the mating starts from 3rd “(52 + 31) mod (8) = 3” indexed-gene positions. Following such selection and crossover mechanisms, the initial pad is converted into an intermediate pad (see Table 5).

Mutation. The mutation operation is performed using the updated variable value 135. As shown in Table 5, the 8th indexed chromosome is mutated at the 4th indexed-gene position, “252 (mod) 8 = 4”, and the 9th indexed chromosome is mutated at the 3rd indexed-gene position, “243 (mod) 8 = 3”. In this way, the intermediate pad is converted into an updated pad.

6.4. Discussion on the Speed of GRKG and IGRKG Generators

It is observed from Tables 4 and 5 that the OTP key is generated by GRKG and IGRKG in nine and seven iterations, respectively. That is, the IGRKG method obscures the initial pad in fewer generations than the GRKG method; running the IGRKG generator for two more generations would further enhance the randomness. As the results show, the relative time performance of the IGRKG method is significantly better than that of the GRKG method: for the instance solved above, the GRKG and IGRKG methods take 1.918 and 0.743 milliseconds, respectively. For a more accurate comparison of the speeds, we have tested both generators on a large data set.

We have taken and so that and . Afterward, both generators were run 100 times on a plaintext of 1000 characters under various parameter settings. The average time taken by the IGRKG method for 1000 generations is 2.432 seconds, while the GRKG method takes 9.747 seconds. This result indicates that the IGRKG generator is approximately four times faster than the GRKG generator.
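As a quick check of the reported timings, the two speed ratios can be worked out directly. The symbols $t_{\mathrm{GRKG}}$ and $t_{\mathrm{IGRKG}}$ are introduced here only for illustration and do not appear in the original text.

```latex
\frac{t_{\mathrm{GRKG}}}{t_{\mathrm{IGRKG}}}
  = \frac{9.747\ \text{s}}{2.432\ \text{s}} \approx 4.01
  \quad \text{(1000 generations, 1000-character plaintext)},
\qquad
\frac{1.918\ \text{ms}}{0.743\ \text{ms}} \approx 2.58
  \quad \text{(small worked instance)}.
```

The larger experiment therefore supports the "approximately four times" figure, while the small worked instance shows a more modest ratio of roughly 2.6.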

6.5. Statistical Results

This section presents statistical tests to analyze the security of the GRKG and IGRKG generators, which are purported to be random bit generators. A random bit generator is a device or algorithm that outputs a sequence of statistically independent and unbiased binary digits. According to [1], it is impossible to give a mathematical proof that a generator is a random bit generator; the tests described here help to detect certain kinds of weaknesses the generators may have. From each generator, we take a sample output binary sequence $s$ of length $n = 20000$ bits and subject it to four statistical tests designated as $X_1$ (frequency), $X_2$ (serial), $X_3$ (autocorrelation), and $X_4$ (poker). The conclusion of each test is not definite but probabilistic. If the sequence passes all of the statistical tests, the generator is “not rejected” [1]. We emphasize that the normal and $\chi^2$ distributions (goodness-of-fit tests) can be used to compare the observed frequencies of a sequence with their expected frequencies under a hypothesized distribution. The $\chi^2$ distribution with $v$ degrees of freedom arises in practice when the squares of $v$ independent random variables having standard normal distributions are summed. For detailed information on the normal and $\chi^2$ distributions, we refer readers to [1]. In the following sections, we present the formal definition of each of the four statistical tests and the results of statistical testing on the output sequence $s$.

6.5.1. Frequency Test

The objective of this test is to examine whether the numbers of 0s and 1s in $s$ are approximately the same, as would be expected for a random sequence. Let $n_0$, $n_1$ denote the number of 0s and 1s in $s$, respectively. The statistic used is [1]:

$$X_1 = \frac{(n_0 - n_1)^2}{n},$$

which approximately follows a $\chi^2$ distribution with 1 degree of freedom.
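A minimal Python sketch of the frequency statistic from [1], $(n_0 - n_1)^2 / n$, is given below. The function names are illustrative, and the default threshold 3.8415 is the $\chi^2$ critical value for 1 degree of freedom at a 5% significance level, which is an assumption here rather than a setting confirmed by the text.

```python
def frequency_statistic(bits):
    """Return (n0 - n1)^2 / n for a sequence of 0/1 integers."""
    n = len(bits)
    if n == 0:
        raise ValueError("the bit sequence must not be empty")
    n1 = sum(bits)   # number of ones
    n0 = n - n1      # number of zeros
    return (n0 - n1) ** 2 / n


def frequency_test_passes(bits, threshold=3.8415):
    """Assumed decision rule: pass when the statistic does not exceed the
    chi-square critical value (1 degree of freedom, 5% level by default)."""
    return frequency_statistic(bits) <= threshold


if __name__ == "__main__":
    sample = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
    print(frequency_statistic(sample), frequency_test_passes(sample))
```

A perfectly balanced sequence gives a statistic of 0, and the value grows with the imbalance between zeros and ones.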

6.5.2. Serial Test

The objective of this test is to examine whether the numbers of occurrences of 00, 01, 10, and 11 as subsequences of $s$ are approximately the same, as would be expected for a random sequence. Let $n_0$, $n_1$ denote the number of 0s and 1s in $s$, respectively, and let $n_{00}$, $n_{01}$, $n_{10}$, and $n_{11}$ denote the number of occurrences of 00, 01, 10, and 11 in $s$, respectively, where $n_{00} + n_{01} + n_{10} + n_{11} = n - 1$, since the subsequences are allowed to overlap. The statistic used is [1]:

$$X_2 = \frac{4}{n-1}\left(n_{00}^2 + n_{01}^2 + n_{10}^2 + n_{11}^2\right) - \frac{2}{n}\left(n_0^2 + n_1^2\right) + 1,$$

which approximately follows a $\chi^2$ distribution with 2 degrees of freedom.
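The serial statistic from [1] can be sketched in the same way. The code counts overlapping two-bit subsequences, so the four pair counts sum to $n - 1$; the function name is illustrative and not taken from the paper.

```python
def serial_statistic(bits):
    """Serial (two-bit) test statistic for a sequence of 0/1 integers.

    Computes 4/(n-1) * (n00^2 + n01^2 + n10^2 + n11^2)
             - 2/n * (n0^2 + n1^2) + 1
    using overlapping pairs.
    """
    n = len(bits)
    if n < 2:
        raise ValueError("at least two bits are required")
    n1 = sum(bits)
    n0 = n - n1
    pair_counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(bits, bits[1:]):   # overlapping pairs: n - 1 of them
        pair_counts[(a, b)] += 1
    pair_squares = sum(c * c for c in pair_counts.values())
    return 4 / (n - 1) * pair_squares - 2 / n * (n0 * n0 + n1 * n1) + 1


if __name__ == "__main__":
    sample = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0]
    print(serial_statistic(sample))
```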

6.5.3. Autocorrelation Test

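The purpose of this test is to check for correlations between the sequence $s$ and (noncyclic) shifted versions of it. Let $d$ be a fixed integer, $1 \le d \le \lfloor n/2 \rfloor$. The number of bits in $s$ not equal to their $d$-shifts is $A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$. The statistic used is [1]:

$$X_3 = \frac{2\left(A(d) - \frac{n-d}{2}\right)}{\sqrt{n-d}},$$

which approximately follows a normal distribution with mean of zero and standard deviation of 1 if $n - d \ge 10$.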
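A minimal sketch of the autocorrelation statistic from [1] follows. It counts the positions where a bit differs from the bit $d$ places later (a noncyclic shift) and normalizes the count; the function name is an illustrative choice.

```python
import math


def autocorrelation_statistic(bits, d):
    """Autocorrelation test statistic for shift d on a 0/1 sequence.

    A(d) is the number of positions i with bits[i] != bits[i + d];
    the statistic is 2 * (A(d) - (n - d) / 2) / sqrt(n - d).
    """
    n = len(bits)
    if not 1 <= d <= n // 2:
        raise ValueError("d must satisfy 1 <= d <= floor(n / 2)")
    a_d = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    return 2 * (a_d - (n - d) / 2) / math.sqrt(n - d)


if __name__ == "__main__":
    sample = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0]
    print(autocorrelation_statistic(sample, d=3))
```

Because the statistic is approximately standard normal, a two-sided decision rule compares its absolute value with a normal quantile chosen for the desired significance level.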

6.5.4. Poker Test

Let $m$ be a positive integer such that $\lfloor n/m \rfloor \ge 5 \cdot 2^m$, and let $k = \lfloor n/m \rfloor$. Divide the sequence $s$ into $k$ nonoverlapping parts each of length $m$, and let $n_i$ be the number of occurrences of the $i$th type of sequence of length $m$, $1 \le i \le 2^m$. The poker test determines whether the sequences of length $m$ each appear approximately the same number of times in $s$, as would be expected for a random sequence. The statistic used is [1]:

$$X_4 = \frac{2^m}{k}\left(\sum_{i=1}^{2^m} n_i^2\right) - k,$$

which approximately follows a $\chi^2$ distribution with $2^m - 1$ degrees of freedom.

The threshold for each statistic is fixed by the chosen significance level; for the poker test it also varies with the degrees of freedom, which depend on the subsequence length $m$ (see Table 10). The calculated results indicate that the statistical results obtained via the IGRKG generator are consistently superior to those of the GRKG generator (see Tables 7–10).
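The poker statistic from [1] can be sketched as follows. The sequence is cut into $k = \lfloor n/m \rfloor$ nonoverlapping blocks of $m$ bits, and the squared block counts are summed; patterns that never occur contribute zero. The function name and the choice to raise an error when the size condition fails are assumptions of this sketch.

```python
from collections import Counter


def poker_statistic(bits, m):
    """Poker test statistic for block length m on a 0/1 sequence.

    Computes (2^m / k) * sum(n_i^2) - k, where k = floor(n / m) and n_i is
    the number of blocks equal to the i-th m-bit pattern.
    """
    n = len(bits)
    k = n // m
    if k < 5 * 2 ** m:
        raise ValueError("need floor(n / m) >= 5 * 2^m for the approximation")
    blocks = (tuple(bits[i * m:(i + 1) * m]) for i in range(k))
    counts = Counter(blocks)   # patterns that never occur are simply absent
    return (2 ** m / k) * sum(c * c for c in counts.values()) - k


if __name__ == "__main__":
    sample = [int(ch) for ch in "1110001100010001010011101111001001001001" * 4]
    print(poker_statistic(sample, m=3))
```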

In addition to the statistical tests discussed above, we have tested a large output sequence from each generator on more stringent batteries of statistical tests: Diehard, NIST, and ENT [30]. For this purpose, we generated two separate 250 MB files, one from the output of each generator over a low-entropy input, and analyzed each file with each of the batteries. Note that 250 MB of data is a vast binary sequence generated over a range of input parameters. Table 11 shows the results of the ENT test. The outputs of both the GRKG and the IGRKG generators pass all the ENT tests; however, the ENT results are superior for the proposed IGRKG scheme. Diehard results are shown in Table 12. A test is treated as successful if its $p$-value is greater than 0.05. IGRKG passes all the tests with significantly better results than GRKG. It is important to note that GRKG fails the monkey test OPSO and the overlapping sums test.
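Among the quantities ENT reports for a byte stream are the entropy in bits per byte and the arithmetic mean of the byte values. The sketch below computes these two measures only; it is an illustration of what they mean, not ENT's implementation, and the function names are assumptions.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a bytes object in bits per byte (maximum 8.0)."""
    total = len(data)
    if total == 0:
        raise ValueError("data must not be empty")
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def byte_mean(data):
    """Arithmetic mean of the byte values (127.5 for uniform bytes)."""
    if len(data) == 0:
        raise ValueError("data must not be empty")
    return sum(data) / len(data)


if __name__ == "__main__":
    sample = bytes(range(256)) * 4   # every byte value equally often
    print(byte_entropy(sample), byte_mean(sample))
```

An entropy close to 8 bits per byte and a mean close to 127.5 are what one would expect from a stream whose byte values are uniformly distributed.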

In the case of the NIST tests, 100 $p$-values have been evaluated for each test; the proportions of successful results are presented in Table 13. The tests are treated as successful if the value is greater than 0.959 (except the discrete Fourier transformation and binary matrix ranks tests). In these two tests the value should be between 0.051 and 0.990 for success [31]. Results presented in Table 13 show that the IGRKG generator passes all the tests; however, the GRKG generator has failed in longest-run test and binary matrix ranks and tests.

6.6. GRKG and IGRKG Quality Assessment

6.6.1. Diehard Scores

Although the GRKG and IGRKG generators were developed for OTP key generation, we also compare their performance with several existing pseudorandom number generators. The generators compared with GRKG and IGRKG are of various types: pure linear congruential generators (rand [32], rand1k [33], and pm [34]), multiply-with-carry generators (mother [35]), additive and subtractive generators (add [32], sub [34]), compound generators (shsub [32], shpm [34], and shlec [34]), feedback shift register generators (tgfsr [36], fsr [37]), Tausworthe generators (tauss [38]), and a GP-based generator (Lamar [31]). For the comparison, Johnson’s scoring method [39] is used. We generated 50 different 10 MB files from GRKG and IGRKG using the same method as mentioned in Table 6 and assigned scores using the results of the Diehard tests. The scores of the other generators are taken from [31]. Diehard tests produce one or more $p$-values, each of which is categorized as rejected, suspect, or good according to fixed thresholds. Two points, one point, and zero points are assigned for rejected, suspect, and good values, respectively. Summing these points produces a global Diehard score for each generator. For the GRKG and IGRKG generators, the score is averaged over the 50 evaluations. In Table 14, low scores indicate good-quality generators. Table 14 shows that the IGRKG generator is comparatively better than Lamar and significantly superior to the rest of the generators.
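The scoring scheme described above (two points for a rejected value, one for a suspect value, zero for a good one, summed per generator and averaged over files) can be sketched as follows. The exact cut-offs that separate rejected, suspect, and good values are not reproduced here, so the sketch takes them as caller-supplied predicates; the function names and the cut-offs in the usage example are purely illustrative assumptions.

```python
POINTS = {"rejected": 2, "suspect": 1, "good": 0}


def categorize(p, is_rejected, is_suspect):
    """Classify one p-value using caller-supplied threshold predicates."""
    if is_rejected(p):
        return "rejected"
    if is_suspect(p):
        return "suspect"
    return "good"


def diehard_score(p_values, is_rejected, is_suspect):
    """Global score for one file: sum of points over all its p-values."""
    return sum(POINTS[categorize(p, is_rejected, is_suspect)] for p in p_values)


def average_score(files, is_rejected, is_suspect):
    """Average global score over several files (lists of p-values)."""
    scores = [diehard_score(f, is_rejected, is_suspect) for f in files]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical cut-offs, used only to exercise the code.
    rejected = lambda p: p >= 0.998
    suspect = lambda p: 0.95 <= p < 0.998
    files = [[0.31, 0.97, 0.52], [0.999, 0.40, 0.12]]
    print(average_score(files, rejected, suspect))
```

Lower averages correspond to better generators, matching the reading of Table 14.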

6.6.2. Changes in Population

To demonstrate the behavior of the proposed IGRKG method, we present some experimental analysis. For instance, consider a secret key whose final component equals 10 and a plaintext of size 10. Figure 2 shows the changes in population for each iteration, where the initial pad (165, 212, 123, 90, 49, 192, 199, 6, 61, 44) is indicated by the 0th generation. Figures 2(a), 2(b), and 2(c) represent the behavior of the algorithm for the first three (0 to 2), second three (3 to 5), and last three (6 to 8) generations, respectively, where the 0th generation indicates the initial pad status and the 8th generation represents the final pad status. The pad status of the 1st and 2nd generations can be seen in Figure 2(a): (37, 52, 115, 94, 209, 192, 199, 134, 61, 44) and (37, 52, 91, 126, 209, 192, 199, 198, 61, 44), respectively. The pad status of the 3rd, 4th, and 5th generations can be seen in Figure 2(b): (37, 52, 195, 126, 209, 193, 71, 94, 61, 44), (37, 52, 203, 126, 209, 195, 71, 94, 61, 44), and (37, 52, 195, 126, 209, 195, 47, 94, 61, 65), respectively. The pad status of the 6th, 7th, and 8th generations can be seen in Figure 2(c): (223, 36, 195, 126, 209, 195, 47, 36, 61, 65), (223, 36, 35, 222, 209, 195, 175, 52, 61, 65), and (95, 36, 53, 60, 209, 203, 175, 34, 223, 65), respectively. Although the initial pad is completely changed after the 8th generation (see Figure 2(d)) and transforms to (95, 36, 53, 60, 209, 203, 175, 34, 223, 65), further iterations increase the randomness of the pad. Figure 2(d) compares the initial and final populations (output at the 8th generation); there is no similarity at any element position of the initial and final pads. In other words, the initial and final pads are independent of each other; that is, without knowing the secret key, the initial pad is very difficult to recover.

Figure 2: (a) First three generation samples. (b) Second three generation samples. (c) Last three generation samples. (d) Initial and final pad status.
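The pad values listed for generations 0 to 8 make it possible to check the claim about element positions directly. The short sketch below counts, for each generation, how many positions differ from the initial pad; the variable names are illustrative, and the data are copied from the text above.

```python
# Pad status per generation, as listed in the text (generation 0 = initial pad).
generations = [
    [165, 212, 123, 90, 49, 192, 199, 6, 61, 44],
    [37, 52, 115, 94, 209, 192, 199, 134, 61, 44],
    [37, 52, 91, 126, 209, 192, 199, 198, 61, 44],
    [37, 52, 195, 126, 209, 193, 71, 94, 61, 44],
    [37, 52, 203, 126, 209, 195, 71, 94, 61, 44],
    [37, 52, 195, 126, 209, 195, 47, 94, 61, 65],
    [223, 36, 195, 126, 209, 195, 47, 36, 61, 65],
    [223, 36, 35, 222, 209, 195, 175, 52, 61, 65],
    [95, 36, 53, 60, 209, 203, 175, 34, 223, 65],
]

initial_pad, final_pad = generations[0], generations[-1]

# Number of positions that differ from the initial pad in each generation.
changed_per_generation = [
    sum(a != b for a, b in zip(initial_pad, pad)) for pad in generations
]

# Positions (if any) where the final pad still equals the initial pad.
unchanged_positions = [
    i for i, (a, b) in enumerate(zip(initial_pad, final_pad)) if a == b
]

print(changed_per_generation)
print(unchanged_positions)
```

The count reaches all ten positions only at the 8th generation, which is consistent with the statement that the final pad shares no element with the initial pad at any position. This is a positional check only and says nothing by itself about statistical independence.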

6.7. Basic Cryptanalysis of GRKG and IGRKG

This section discusses resistance results against some basic cryptanalytic attacks, which are as follows:

(1) Input-Based Attack. If control or knowledge of the generator can be used to distinguish GRKG or IGRKG outputs from random outputs, then we say that the generator is not resistant to input-based attack.

(2) State Compromise Extension Attack. Assume that a state $S$ has been recovered by the adversary (e.g., through an inadvertent leak or a cryptanalytic success). An attack that tries to extend the advantage of knowing $S$ is called a state compromise extension (SCE) attack. An SCE attack succeeds when the attacker is able to distinguish between random outputs and generator outputs produced before $S$ was compromised. An SCE attack is most likely to work when, due to insufficient starting entropy, the generator is started from an insecure, guessable state. An SCE attack can also work when $S$ has been compromised by any of the attacks mentioned below.

(a) Backtracking Attack. The backtracking attack uses the compromise of the state $S$ at time $t$ to acquire previous generator values.

(b) Permanent Compromise Attack. In a permanent compromise attack, once the attacker compromises $S$ at time $t$, all past and future values are susceptible to attack.

(c) Iterative Guessing Attack. If the inputs collected between two points in time are guessable by the attacker, the attack is called an iterative guessing attack.

(d) Meet-in-the-Middle Attack. A combination of backtracking and iterative guessing attacks is called a meet-in-the-middle attack. Knowledge of $S$ at two points in time allows the attacker to recover $S$ at an intermediate time.

A binary sequence of 20000 bits is generated individually from both the GRKG and the IGRKG generators for each of the four different input parameter settings given in Table 6. Afterward, cryptanalysis against all the above-mentioned attacks has been performed. The cryptanalytic results are given in Table 15. The results indicate that the IGRKG method is resistant to all the attacks; however, the GRKG method is not resistant to backtracking, iterative guessing, and meet-in-the-middle attacks.

7. Conclusion and Avenues for Future Research

This paper has presented an improved and efficient genetic-based OTP key generator. The proposed method is a significant improvement over the GRKG method. The proposed IGRKG generator has passed the simple statistical tests, namely, the frequency, serial, autocorrelation, and poker tests. The IGRKG generator has also passed the ENT, Diehard, and NIST batteries of statistical tests, and it is resistant to basic cryptanalytic attacks. These tests did not reveal any weaknesses or implementation bugs in the IGRKG generator. Additionally, the statistical quality of the IGRKG generator has been compared with other existing pseudorandom number generators through Diehard scores, and the obtained scores indicate that IGRKG is an acceptable pseudorandom number generator. In terms of speed, the IGRKG generator is approximately four times faster than the GRKG generator.

It is important to note that, in the case of the IGRKG method, there are various trade-offs in how the generator is run to produce the next pad. For instance, in the next communication, the variable can be increased by 1. Another practical approach is an appropriate use of the secret key parameters, which the communicating parties can agree on once. Indeed, if the variable is handled correctly, this may result in computationally high security.

This paper has used a linear congruential generator to generate the initial pad and a genetic algorithm to improve the randomness of the initial pad. Instead of the linear congruential generator, the Mersenne Twister can also be used. However, how effective and efficient the Mersenne Twister would be for OTP key generation is left as an open problem. Also, an extensive cryptanalysis is required to ensure the computational security of the proposed generator.
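To make the open question concrete, the sketch below shows two interchangeable ways of producing an initial pad: a linear congruential generator and the Mersenne Twister, which is the core generator of Python's standard `random` module. Everything specific here is an assumption for illustration: the function names, the LCG parameters in the usage example, and the reduction of each output to a byte with `% 256` are not taken from the paper.

```python
import random


def lcg_pad(seed, a, c, m, length):
    """Initial pad from a linear congruential generator.

    Uses x_{k+1} = (a * x_k + c) mod m and keeps each state modulo 256
    (the byte reduction is an assumption of this sketch).
    """
    x = seed
    pad = []
    for _ in range(length):
        x = (a * x + c) % m
        pad.append(x % 256)
    return pad


def mt_pad(seed, length):
    """Initial pad from the Mersenne Twister via random.Random."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(length)]


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the code.
    print(lcg_pad(seed=1, a=5, c=3, m=16, length=4))
    print(mt_pad(seed=1, length=4))
```

Either function could feed the genetic stage, since both return a list of byte values of the requested length; comparing the resulting keys would still require the statistical and cryptanalytic evaluation described in this paper.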

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

[1] A. J. Menezes, P. C. Van Oorschot, and S. A. Vanstone, Handbook of Applied Cryptography, CRC Press, 2010.
[2] D. Tagu, J. K. Colbourne, and N. Nègre, “Genomic data integration for ecological and evolutionary traits in non-model organisms,” BMC Genomics, vol. 15, no. 1, article no. 490, 2014.
[3] M. Sokouti, B. Sokouti, S. Pashazadeh, M.-R. Feizi-Derakhshi, and S. Haghipour, “Genetic-based random key generator (GRKG): A new method for generating more-random keys for one-time pad cryptosystem,” Neural Computing and Applications, vol. 22, no. 7-8, pp. 1667–1675, 2013.
[4] K. A. De Jong, Evolutionary Computation: A Unified Approach, MIT Press, 2006.
[5] S. Sivanandam and S. Deepa, Introduction to Genetic Algorithms, Springer Science & Business Media, 2007.
[6] D. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
[7] M. Srinivas and L. M. Patnaik, “Genetic algorithms: a survey,” Computer, vol. 27, no. 6, pp. 17–26, 1994.
[8] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer Science & Business Media, 2013.
[9] T. F. Gonzalez, Handbook of Approximation Algorithms and Metaheuristics, CRC Press, 2007.
[10] M. D. Vose, The Simple Genetic Algorithm: Foundations and Theory, vol. 12, MIT Press, 1999.
[11] P. Błażej, M. Wnȩtrzak, and P. Mackiewicz, “The role of crossover operator in evolutionary-based approach to the problem of genetic code optimization,” BioSystems, vol. 150, pp. 61–72, 2016.
[12] W. M. Spears et al., Crossover or Mutation, Foundations of Genetic Algorithms 2, 1992.
[13] Á. E. Eiben, R. Hinterding, and Z. Michalewicz, “Parameter control in evolutionary algorithms,” IEEE Transactions on Evolutionary Computation, vol. 3, no. 2, pp. 124–141, 1999.
[14] G. Karafotias, M. Hoogendoorn, and A. E. Eiben, “Parameter Control in Evolutionary Algorithms: Trends and Challenges,” IEEE Transactions on Evolutionary Computation, vol. 19, no. 2, pp. 167–187, 2015.
[15] M. Srinivas and L. M. Patnaik, “Adaptive probabilities of crossover and mutation in genetic algorithms,” IEEE Transactions on Systems, Man and Cybernetics, vol. 24, no. 4, pp. 656–667, 1994.
[16] J. E. Smith and T. C. Fogarty, “Adaptively parameterised evolutionary systems: Self adaptive recombination and mutation in a genetic algorithm,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 1141, pp. 441–450, 1996.
[17] A. E. Eiben and S. K. Smit, “Parameter tuning for configuring and analyzing evolutionary algorithms,” Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 19–31, 2011.
[18] K. Park and C. Hong, “Cryptographic protocol design concept with genetic algorithms,” in Knowledge-Based Intelligent Information and Engineering Systems, pp. 483–489, Springer, 2005.
[19] L. Zarza, J. Pegueroles, M. Soriano, and R. Martínez, “Design of cryptographic protocols by means of genetic algorithms techniques,” in SECRYPT, pp. 316–319, 2006.
[20] Y. Wang, K.-W. Wong, C. Li, and Y. Li, “A novel method to design S-box based on chaotic map and genetic algorithm,” Physics Letters A: General, Atomic and Solid State Physics, vol. 376, no. 6-7, pp. 827–833, 2012.
[21] S. Jhajharia, S. Mishra, and S. Bali, “Public key cryptography using neural networks and genetic algorithms,” in Proceedings of the 2013 6th International Conference on Contemporary Computing, IC3 2013, pp. 137–142, Noida, India, August 2013.
[22] A. Jain and N. S. Chaudhari, “Cryptanalytic results on knapsack cryptosystem using binary particle swarm optimization,” Advances in Intelligent Systems and Computing, vol. 299, pp. 375–384, 2014.
[23] K. M. Faraoun, “A genetic strategy to design cellular automata based block ciphers,” Expert Systems with Applications, vol. 41, no. 17, pp. 7958–7967, 2014.
[24] A. Jain and N. S. Chaudhari, “Evolving highly nonlinear balanced boolean functions with improved resistance to DPA attacks,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9408, pp. 316–330, 2015.
[25] A. Jain and N. S. Chaudhari, “A new heuristic based on the cuckoo search for cryptanalysis of substitution ciphers,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9490, pp. 206–215, 2015.
[26] W. Stallings, Cryptography and Network Security, 4/E, Pearson Education India, 2006.
[27] S. Picek, M. Golub, and D. Jakobovic, “Evaluation of crossover operator performance in genetic algorithms with binary representation,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6840, pp. 223–230, 2011.
[28] T. Bäck, D. Fogel, and Z. Michalewicz, “Handbook of evolutionary computation,” Release, vol. 97, no. 1, p. B1, 1997.
[29] L. Chambers, Practical Handbook of Genetic Algorithms, CRC Press, 1998.
[30] J. Walker, ENT: A pseudorandom number sequence test program, software and documentation available at http://www.fourmilab.ch/random/.
[31] C. Lamenca-Martinez, J. C. Hernandez-Castro, J. M. Estevez-Tapiador, and A. Ribagorda, “Lamar: A new pseudorandom number generator evolved by means of genetic programming,” in Parallel Problem Solving from Nature-PPSN IX, pp. 850–859, Springer, 2006.
[32] D. E. Knuth, The Art of Computer Programming, vol. 2: Seminumerical Algorithms, Addison-Wesley, Reading, Mass, USA.
[33] M. M. Meysenburg and J. A. Foster, “Randomness and GA performance, revisited,” in Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computation-Volume 1, pp. 425–432, Morgan Kaufmann Publishers Inc, 1999.
[34] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C, vol. 2, Cambridge University Press, 1982.
[35] G. Marsaglia, Yet another RNG, posted to the electronic billboard sci.stat.math, August 1.
[36] M. Matsumoto and Y. Kurita, “Twisted GFSR Generators,” ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 2, no. 3, pp. 179–194, 1992.
[37] B. Schneier, “Applied cryptography,” Cover and Title Pages, pp. 125–147, 1997.
[38] S. Tezuka and P. L'Ecuyer, “Efficient and Portable Combined Tausworthe Random Number Generators,” ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 1, no. 2, pp. 99–112, 1991.
[39] B. C. Johnson, “Radix-b extensions to some common empirical tests for pseudorandom number generators,” ACM Transactions on Modeling and Computer Simulation, vol. 6, no. 4, pp. 261–273, 1996.

Copyright

Copyright © 2017 Ashish Jain and Narendra S. Chaudhari. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
