Abstract

When a cryptographic key is established between two users, an asymmetric cryptography scheme is generally used to send it through an insecure channel. However, given that algorithms based on this scheme, such as RSA, have already been compromised, it is imperative to research new methods of establishing a cryptographic key that remain secure during transmission. To address this problem, a new branch known as neural cryptography emerged, using a modified artificial neural network called a Tree Parity Machine (TPM). Its purpose is to establish a private key through an insecure channel. This article analyses the optimal structure of a TPM network for generating and establishing a private 512-bit cryptographic key between two authorized parties. To achieve this, the combinations that make it possible to generate a key of that length were determined. In more than 15 million simulations, we measured synchronization times, the number of steps required, and the number of times an attacking TPM network managed to imitate the behaviour of the two networks. The simulations yielded the optimal combination, minimizing the synchronization time while prioritizing security against the attacking network. Finally, the model was validated by applying a heuristic rule.

1. Introduction

Cryptography is the mathematical science and discipline of writing messages in encoded text. Its purpose is to protect secrets from adversaries, interceptors, intruders, opponents, and attackers [1]. It also pays special attention to mechanisms that guarantee information integrity and deals with techniques for exchanging user authentication keys and protocols [2]. The best-known method combines symmetric and asymmetric encryption and decryption algorithms: the asymmetric algorithm is used to exchange cryptographic keys, and the symmetric algorithm is used to encrypt and decrypt information.

The RSA public-key cryptosystem and the systems based on elliptic curves are the most common forms of public-key cryptography in encryption and digital signature standards [3]. However, the security of the RSA algorithm depends on the length of the prime numbers whose product must be factored [4]. Thus, one of the main concerns with RSA is its demand for large keys in today’s cryptographic algorithms, since an attacker must factor the product of two long prime numbers, a value much larger than either prime. Increasing the length of the prime numbers increases security; however, it also increases the computational cost of working with these numbers. For this reason, it is important to look for new methods of exchanging cryptographic keys securely and at a relatively low computational cost. Neural cryptography arose from this need; it uses a neural network called a Tree Parity Machine (TPM). With two TPM networks of exactly the same structure, two users can establish a cryptographic key by exchanging the inputs and outputs of these networks while keeping the synaptic weights secret.

The aim of our study is to search for an optimal structure that minimizes the time and number of steps that two TPM networks require to establish a cryptographic key of 512-bit length. In addition, we emphasize a structure whose behaviour is difficult for an attacking network to imitate. For this, we implemented an algorithm in Python that performs the simulations needed to test and find the optimal structure. A total of 15,041,100 simulations were conducted with all the possible structures of the network. Additionally, we used R and GNU Octave for statistical analysis. Finally, a heuristic rule (proposed by [5, 6]) was applied to validate the computed values. As a result, it was possible to determine the optimal structure of the TPM network with an average number of steps and a percentage of success of a passive attack network and with a 0% success rate for a geometric attack.

The main contribution of this study is that two users will be able to generate a cryptographic key based on certain probability according to the number of steps established at the beginning of the synchronization and with a very low probability of a successful passive attack.

The remainder of this document is structured as follows. Section 2 reviews the related work that supports the present study. Section 3 describes the procedure used to determine the best combination. Section 4 presents the results obtained from the experiment. Finally, Section 5 details the conclusions and proposes future work.

2. Related Work

Concerning the studies related to this research, Kanter et al. [7] and Rosen-Zvi et al. [8] demonstrated that when two artificial neural networks are trained on each other's outputs according to a learning rule, they develop equivalent states of their internal synaptic weights. Kinzel and Kanter [9] and Ruttor et al. [10] revealed that the probability of a successful attack decreases as the synaptic depth of the network increases, because the attacker's effort grows exponentially while the effort required by the users grows polynomially. Similarly, Sarkar and Mandal [11] and Sarkar [12] mention that performance is improved by increasing the synaptic depth of the TPM networks, thereby counteracting brute-force attacks with current computing power. In the present study, we used the TPM network structure proposed by the latter. However, we determined that the security of the TPM network can also be increased by increasing the number of neurons in the hidden layer and the number of entries of each neuron. In addition, the range of the synaptic depth value was modified to meet the need to generate a key of 512 bits.

With respect to the algorithms proposed for the design of the TPM network, Lei et al. [13] developed a two-layer, feed-forward TPM network model. Fast synchronization can be achieved by increasing the minimum Hamming distance between the internal representations and by reducing the probability of a step that does not modify the network weights. Allam et al. [14] proposed an algorithm that increases the security of neural cryptography by authenticating communications using previously shared secrets. Their results show that the algorithm reaches a very high security level without increasing the synchronization time. Ruttor [15] mentions that the value of K should be 3, since lower values have negative consequences for security and higher values have negative consequences for synchronization time. Klimov et al. [16] calculate the probability that two networks synchronize their weights. Although this probability rises with a low value of K, the success of the attacking network also increases, and consequently the value of K must be greater. In our study, by contrast, the initial weights of the TPM networks are random and differ between networks, and each network has a single output. In addition, because the goal is to generate a 512-bit key, it was not arithmetically adequate to use odd numbers for the K, N, and L values.

In relation to measurements of TPM network performance, Dolecki and Kozera [17] present a frequency analysis method that evaluates the synchronization level of two TPM networks before synchronization is complete, using a value that does not depend on the difference between their synaptic weights. As a result, selecting an appropriate range for the count frequency and the threshold allows them to specify whether a synchronization is short or long. Santhanalakshmi et al. [18] and Dolecki and Kozera [19] analyse and compare synchronization performance by employing, respectively, a genetic algorithm and a Gaussian distribution instead of uniform random values for the weights of the TPM networks. They found that replacing the random weights with optimal weights helps reduce the synchronization time. In addition, increasing the number of hidden and input neurons accelerates convergence and reduces the probability of success of the “Majority Flipping Attack”. Dolecki et al. [20] fit the synchronization-time distribution of two TPM networks to a Poisson distribution. Pu et al. [21] propose an algorithm that combines “true random sequences” (generated by artificial circuits and validated with randomness tests) with TPM networks, demonstrating more complex dynamic behaviours that offer better performance as an encryption tool and better resistance to attacks. In our study, by contrast, the synaptic weights of the TPM networks are generated according to a discrete uniform distribution. Additionally, the Poisson distribution fit was made on the number of steps recorded in each of the simulations.

With reference to rules that contribute to the design of TPM networks, Mu and Liao [5] and Mu et al. [6] define the following heuristic rule: “Keeping the equations of motion constant, a high value in the state classifier with respect to the minimum values of the smallest Hamming distances between the state vectors of the networks, has a high probability of fulfilling the condition that the average change in the percentage difference between synaptic weights is greater than zero, improving the security of neural cryptography”. In our study, we used this heuristic rule to determine the level of security of the final structure of the TPM network.

With regard to modifying the initial structure of a TPM network, Gomez et al. [22] state that, with an initial misalignment in the weights between and , the synchronization time is reduced from to less than . This also reduces the number of steps from to less than . Within this context, Niemiec [23] presents a new idea for a key reconciliation method in quantum cryptography using TPM networks, correcting errors that occur during transmission in the quantum channel. The number of steps necessary to establish the key is significantly reduced when the quantum bit error rate is low and the initial percentage of synchronization of the two networks is high. In our study, by contrast, we did not use initial structures with a preset percentage of initial alignment. Such an alignment would have to be shared between the two users beforehand, which implies that an attacker could have more initial information about the structure of the TPM networks.

3. Materials and Methods

3.1. Background

A TPM is a neural network formed by a hidden layer and a single output. Its general structure is defined by the triad of values K, N, and L, where K is the number of neurons in the hidden layer, N is the number of entries (inputs) of each neuron in the hidden layer, and L sets the limit of the range of possible integer values of the synaptic weights.
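
To make the roles of K, N, and L concrete, the following minimal sketch computes the output of one TPM. It assumes the usual TPM definitions, which the text above does not spell out: each hidden neuron outputs the sign of its weighted input sum, and the network output is the product of the hidden outputs. The function name `tpm_output`, the convention that a zero sum maps to -1, the weight range -L..L, and the small values of K, N, and L are all illustrative assumptions, not the values studied in this paper.

```python
import random


def tpm_output(weights, inputs):
    """Return (sigma, tau) for one TPM.

    weights and inputs are K lists of N integers each; inputs are +1 or -1.
    """
    sigma = []
    for w_row, x_row in zip(weights, inputs):
        local_field = sum(w * x for w, x in zip(w_row, x_row))
        # Assumed convention: a zero local field maps to -1.
        sigma.append(1 if local_field > 0 else -1)
    tau = 1
    for s in sigma:
        tau *= s  # the single output is the product of the hidden outputs
    return sigma, tau


# Hypothetical small structure, chosen only for illustration.
K, N, L = 3, 4, 2
rng = random.Random(0)
weights = [[rng.randint(-L, L) for _ in range(N)] for _ in range(K)]
inputs = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(K)]
sigma, tau = tpm_output(weights, inputs)
print(sigma, tau)
```

Only `tau` would be exchanged over the insecure channel; the weights stay private to each party.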

To generate and establish a key of 512 bits, we tested variations of the TPM structure whose synaptic weights allow this key length. To determine them, the value of L was taken as a base, since its binary notation establishes the final length of the key. To obtain even values in the key length, the limits of the range of values of to whose maximum value in decimal notation is were modified, and the length of its binary value is a multiple of . Thanks to this, there are possible values of L that allow generating keys of 512 bits; see Table 1.

For each value of L, there are values of K and N that yield a key of 512 bits, considering the restriction , as shown in Table 2.

The restriction exists because, when it is violated, the neural network can reach values that make synchronization impossible. Consequently, there are possible combinations of K, N, and L values that allow us to generate a key of 512-bit length.
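
As a rough illustration of how such combinations can be enumerated, the sketch below lists every (K, N, bits) triple whose product is exactly 512, where `bits` is the number of bits used to encode one synaptic weight. This is a sketch under stated assumptions, not the procedure behind Tables 1 and 2: the relation K × N × bits = 512, the example bit widths, and the lower bounds `min_k` and `min_n` are hypothetical stand-ins for the relation and the restriction used in the paper.

```python
KEY_BITS = 512


def candidate_structures(bit_widths, min_k=2, min_n=1):
    """Yield (K, N, bits) triples with K * N * bits == KEY_BITS.

    min_k and min_n are hypothetical lower bounds standing in for the
    paper's restriction.
    """
    for bits in bit_widths:
        if KEY_BITS % bits:
            continue  # this bit width cannot tile a 512-bit key exactly
        weights_needed = KEY_BITS // bits  # total number of weights, K * N
        for k in range(min_k, weights_needed + 1):
            if weights_needed % k == 0:
                n = weights_needed // k
                if n >= min_n:
                    yield k, n, bits


# Example bit widths chosen only for illustration.
structures = list(candidate_structures(bit_widths=[4, 8]))
for k, n, bits in structures[:5]:
    print(k, n, bits)
```

Tightening `min_k` or `min_n` shrinks the list, which is how a restriction on the structure reduces the number of combinations to simulate.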

To find the optimal combination, one that allows rapid synchronization in a minimum number of steps while preventing an attacker from copying the networks' behaviour, 500,000 simulations were performed for each combination. Each simulation consisted of three TPM networks: two authorized networks (Alice and Bob) that synchronized their weights, and one unauthorized attacking network (Eve) that tried to imitate the behaviour of the other two networks in order to discover the key they were establishing. The values of L greater than (, , and ) were not simulated because of their length and the complexity of the calculation.

3.2. Implementation of the Algorithm and the Attack Model

We used Python to implement an algorithm to measure the time and the number of steps needed for synchronization, in addition to calculating the cumulative values of the number of steps during simulations (see Algorithm 1).

Data: K, N, L, number of simulations
Result: none
1 initialization;
2 for i ← 1 to number of simulations do
3  Alice ← TPM (K, N, L);
4  Bob ← TPM (K, N, L);
5  Eve ← TPM (K, N, L);
6  steps ← 0;
7  while Alice.weights != Bob.weights do
8   input ← randomVector ();
9   outA ← Alice(input);
10   outB ← Bob(input);
11   outE ← Eve(input);
12   if outA == outB then
13    Alice.update(outB);
14    Bob.update(outA);
15   end
16   if outA == outB == outE then
17    Eve.update(outA);
18   end
19   steps ← steps + 1;
20  end
21  saveToFile (steps);
22 end

As can be seen in Algorithm 1, the second line defines the total number of simulations to be executed. Lines 3 to 5 initialize the three TPM networks with K, N, and L. Line 6 initializes the step counter to 0. Line 7 establishes that the simulation does not finish while the weights of Alice's and Bob's networks differ. In line 8, we create a random vector that serves as the input for the networks. Lines 9 to 11 compute the outputs of the three networks. Line 12 checks whether the outputs of Alice's and Bob's networks are the same. If so, lines 13 and 14 update the weights of both networks according to the previously computed outputs. Line 16 allows Eve to update her weights, as shown in line 17, but only when the outputs of the three networks are equal. Line 19 increases the step counter by 1. Finally, line 21 saves the number of steps to a file.
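
The loop described above can be sketched in plain Python as follows. This is a minimal sketch, not the implementation used for the reported simulations: the Hebbian learning rule, the weight range -L..L, the convention that a zero sum maps to -1, the `max_steps` safety cap, and the small values of K, N, and L are assumptions made for illustration.

```python
import random


class TPM:
    def __init__(self, k, n, l, rng):
        self.k, self.n, self.l = k, n, l
        # Assumed initialization: integers drawn uniformly from -L..L.
        self.w = [[rng.randint(-l, l) for _ in range(n)] for _ in range(k)]
        self.sigma = [1] * k
        self.tau = 1

    def output(self, x):
        # Assumed convention: a zero weighted sum maps to -1.
        self.sigma = [
            1 if sum(w * xi for w, xi in zip(w_row, x_row)) > 0 else -1
            for w_row, x_row in zip(self.w, x)
        ]
        self.tau = 1
        for s in self.sigma:
            self.tau *= s
        return self.tau

    def update(self, x):
        # Assumed Hebbian rule: only hidden neurons that agree with the
        # network output move, and weights are clipped to -L..L.
        for i in range(self.k):
            if self.sigma[i] == self.tau:
                for j in range(self.n):
                    moved = self.w[i][j] + x[i][j] * self.tau
                    self.w[i][j] = max(-self.l, min(self.l, moved))


def simulate(k, n, l, seed=None, max_steps=100_000):
    """Run one Alice/Bob/Eve simulation; return (steps, synced, eve_synced)."""
    rng = random.Random(seed)
    alice, bob, eve = TPM(k, n, l, rng), TPM(k, n, l, rng), TPM(k, n, l, rng)
    steps = 0
    while alice.w != bob.w and steps < max_steps:
        x = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(k)]
        out_a, out_b, out_e = alice.output(x), bob.output(x), eve.output(x)
        if out_a == out_b:
            alice.update(x)
            bob.update(x)
            if out_e == out_a:  # passive attack: Eve learns only on a 3-way match
                eve.update(x)
        steps += 1
    return steps, alice.w == bob.w, eve.w == alice.w


# Hypothetical small structure, far smaller than a 512-bit configuration.
steps, synced, eve_synced = simulate(k=3, n=4, l=3, seed=0)
print(steps, synced, eve_synced)
```

Repeating `simulate` many times and counting how often `eve_synced` is true gives the kind of attack-success percentage that the simulations report.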

As the algorithm shows, the passive attack progresses only when the outputs of the three TPM networks match. Eve can only listen to the messages sent between Alice and Bob, and her learning process is slower because she updates her weights only if her output matches the other two outputs. According to [24], a geometric attack achieves better results when an attacker uses a single TPM network to mimic the behaviour of the two authorized networks. Therefore, an additional 180,000 simulations were conducted with a geometric attack. This attack applies when the outputs of the two authorized networks are equal and the output of the attacking network is different. In this case, the attacking network modifies its internal representation by flipping the partial output with the lowest absolute value.

For this purpose, we designed and included Algorithm 2. In this algorithm, the third line calculates the absolute values of the partial outputs of each hidden neuron, using the input values and the weights of the attacking network. Line 4 calculates the sign of each of the partial values of each hidden neuron. In line 5, the sign corresponding to the position of the minimum absolute value of the partial output is changed. Line 6 calculates the value of the output of the network with the new partial outputs. Finally, line 7 updates the network weights.

Data: outA, outB, outE, input, weightE
Result: none
1 initialization;
2 if outE != outA then
3  sigmaAbs ← abs(sum(input × weightE));
4  sigmaSign ← sign(sum(input × weightE));
5  sigmaSign[sigmaAbs.argmin()] ← -sigmaSign[sigmaAbs.argmin()];
6  tau ← prod(sigmaSign);
7  update(tau);
8 end
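
A minimal plain-Python sketch of this geometric-attack step is shown below. It is an illustration under assumptions rather than the code used in the simulations: the function name `geometric_attack_step`, the Hebbian update with clipping to -L..L, and the convention that a zero sum maps to -1 are hypothetical choices. Note that the hidden neuron to flip is selected by the position of the smallest absolute partial output, not by the smallest value itself.

```python
from math import prod


def geometric_attack_step(weights_e, x, out_a, out_b, l):
    """Apply one geometric-attack step to Eve's weights in place.

    Returns True if Eve updated her weights, False otherwise.
    """
    if out_a != out_b:
        return False  # Alice and Bob do not update, so Eve does not either
    fields = [sum(w * xi for w, xi in zip(w_row, x_row))
              for w_row, x_row in zip(weights_e, x)]
    sigma = [1 if h > 0 else -1 for h in fields]  # assumed: zero maps to -1
    tau_e = prod(sigma)
    if tau_e != out_a:
        # Flip the hidden neuron whose partial output is closest to zero,
        # i.e. the one most likely to be wrong.
        idx = min(range(len(fields)), key=lambda i: abs(fields[i]))
        sigma[idx] = -sigma[idx]
        tau_e = prod(sigma)
    # Assumed Hebbian update using the (possibly corrected) representation.
    for i, row in enumerate(weights_e):
        if sigma[i] == tau_e:
            for j in range(len(row)):
                row[j] = max(-l, min(l, row[j] + x[i][j] * tau_e))
    return True


# Hypothetical example: Eve initially disagrees with Alice and Bob.
eve_weights = [[1, 0], [2, 2], [3, 3]]
x = [[1, 1], [1, 1], [1, 1]]
updated = geometric_attack_step(eve_weights, x, out_a=-1, out_b=-1, l=3)
print(updated, eve_weights)
```

In this example the first hidden neuron has the smallest absolute partial output, so it is the one flipped before the update.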

4. Results and Discussion

4.1. Results of the Simulations

To run the simulations and obtain their results, a DELL Inspiron 5759 computer with a 2.50 GHz sixth-generation Intel Core i7-6500U CPU (model 78) was used. It has four logical CPUs (one socket, two cores per socket, and two processing threads per core), with L1d and L1i caches of 32K each, an L2 cache of 256K, and an L3 cache of 4096K.

Table 3 shows the results obtained from the simulations performed for each combination of K, N, and L. Columns Min Steps and Max Steps show the minimum and the maximum number of steps it took for the two networks to synchronize their weights. The Average Steps column shows the average number of steps over all the executed simulations. Columns Min Time and Max Time show the minimum and maximum time (in seconds) it took for the two networks to synchronize their weights. The Average Time column shows the average time (in seconds) over all the executed simulations. Column Eve sync successfully shows the number of simulations, out of the total executed, in which the attacking network managed to imitate the behaviour of the other two networks. Finally, column % Eve sync successfully shows the previous column as a percentage of the total number of simulations.

As can be seen in Table 3, items and corresponding to the combinations and , respectively, achieved the best results in terms of security against the attacking network (Eve). In none of the 500,000 simulations did Eve manage to imitate the behaviour of the other two networks within the number of steps that Alice’s and Bob’s networks needed. However, these two combinations required more steps to achieve synchronization ( and , respectively) than the other combinations, which implies a longer average time.

For synchronization time, item corresponding to combination produced the best result, with an average synchronization time of seconds. However, its attacking network sync percentage was . Given that security was prioritized in the present study, we considered the combinations mentioned above to be the good results. It should also be noted that the combinations where , despite having the best synchronization times, produced the worst results because the attacking network Eve managed to imitate the behaviour of the other two networks in approximately of the total simulations.

To decide between the two combinations with the best security results, we performed one million simulations for each and obtained the results shown in Table 4.

From Table 4, it was determined that the best combination for the triad (K, N, L) is . Simulations were then carried out using the geometric attack, and the results are shown in Table 5.

More simulations with the combination of were conducted to determine the evolution of the distribution in the number of steps. Of all simulations, we totalled the number of steps that were necessary for the two TPM networks to synchronize. See the histograms shown in Figure 1.

During additional simulations, on a single occasion, Eve’s attacking network managed to synchronize its weights with the other networks. Table 6 shows the probabilities of a successful attack of both combinations, compared to the total of performed simulations.

4.2. Adjustment of Probabilities Distribution and Calculation

As can be seen in Figure 1, the number of steps used in all simulations follows a positively skewed (right-skewed) discrete distribution. For the present work, the Poisson distribution was considered because of the characteristics of its mathematical parameters, lambda and delta, where delta was taken as the maximum value of the limit as a function of the total number of simulations. Delta is the probability of occurrences of the event. The Poisson formula is given in equation (1).

Then, for the first test, we used the results obtained from the 1,000 simulations. The maximum value of the limit is 515, so delta = 515/1,000 = 0.515.

Using (1) with delta = 0.515 (and lambda set equal to delta to obtain the highest probability), a probability of 0.4787 was obtained. The probabilities for the remaining simulation sets were calculated in the same way. For the 10,000 simulations, a delta value of 0.0634 (6.34%) was obtained, with a probability of 0.8147. Therefore, lambda varies by 34 %. Table 7 shows the values obtained for all the simulation sets.
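
The two probabilities quoted above appear to be reproduced, to about four decimal places, if equation (1) is read as the Poisson mass function evaluated at a non-integer delta, with the factorial extended by the gamma function: P = lambda^delta · e^(-lambda) / Gamma(delta + 1), with lambda = delta. That reading is an assumption inferred from the reported numbers, and the function name `poisson_probability` is hypothetical. The short check below evaluates it for the two delta values given in the text.

```python
import math


def poisson_probability(delta, lam):
    """Poisson-style probability with Gamma(delta + 1) in place of delta!.

    Assumed reading of equation (1); delta need not be an integer.
    """
    return lam ** delta * math.exp(-lam) / math.gamma(delta + 1)


# Delta values reported for the 1,000 and 10,000 simulation sets.
for delta in (0.515, 0.0634):
    # Lambda is set equal to delta, as described in the text.
    print(delta, poisson_probability(delta, delta))
```

As delta shrinks towards zero, this expression approaches 1, which matches the tendency described for larger numbers of simulations.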

As the number of simulations increases, the probability tends towards 1 (the maximum probability value), which implies that the tests performed are correct. The same can be seen in Figure 2: the data tend to fit a Poisson distribution.

4.3. Generation of the Cryptographic Key of 512 Bits

When the networks finish their synchronization, their synaptic weights have values in the range . To obtain the key of 512 bits, the value of L is added to each synaptic weight, which shifts the range to . Then, each synaptic weight is converted to its binary equivalent of bits in length. These values are concatenated to obtain the key of 512-bit length ().
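
The conversion described above can be sketched as follows. This is a minimal illustration under assumptions, not the paper's exact configuration: the weight range -L..L-1, the fixed width `bits` per weight, the function name `weights_to_key`, and the structure K = 8, N = 16, L = 8 (4 bits per weight) are hypothetical values chosen only because 8 × 16 × 4 = 512.

```python
import random


def weights_to_key(weights, l, bits):
    """Shift each weight by L and concatenate fixed-width binary strings."""
    chunks = []
    for row in weights:
        for w in row:
            shifted = w + l  # move the weight into a non-negative range
            if not 0 <= shifted < 2 ** bits:
                raise ValueError(f"weight {w} does not fit in {bits} bits")
            chunks.append(format(shifted, f"0{bits}b"))
    return "".join(chunks)


# Hypothetical structure: 8 * 16 weights, 4 bits each.
K, N, L, BITS = 8, 16, 8, 4
rng = random.Random(0)
# Stand-in for the synchronized weights, assumed to lie in -L..L-1.
weights = [[rng.randint(-L, L - 1) for _ in range(N)] for _ in range(K)]
key = weights_to_key(weights, L, BITS)
print(len(key))  # 512
```

Because every weight is padded to the same width, both parties derive the same bit string from identical synchronized weights.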

4.4. Process of Validation

To validate the security of the proposed TPM network structure with the values of , , and , and, according to the heuristic rule proposed by Mu and Liao [5] and Mu et al. [6], the value of the state classifier was computed. For this, the Hamming distances were calculated between all the possible vectors whose outputs are equal ( and ). Since , there are possible vectors ( vectors with output and vectors with output ). The combinations for each group according to their output are given by the binomial coefficient, as shown by

Therefore, there are combinations for each group. The value of the state classifier was computed as follows. First, Hamming distances were calculated between each pair of vectors whose output is equal to

Then, for each group, the minimum value of the said distances is calculated:

Finally, the value of the state classifier is calculated as the smallest value between the minimum values of the Hamming distances:

The value of shows that it meets the probability that the average change in the percentage difference between synaptic weights is greater than zero. This implies that the synchronization time is increased polynomially by increasing the value of . This means that the proposed structure is secure.
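
The state classifier computation described in this section can be sketched in a few lines. The sketch assumes that the state vectors are the K hidden-layer outputs, each +1 or -1, and that a vector's output is the product of its components; the function name `state_classifier` and the value K = 4 are hypothetical and are not the values of the proposed structure.

```python
from itertools import combinations, product
from math import comb, prod


def state_classifier(k):
    """Return (classifier, minima) for all +1/-1 state vectors of length k.

    minima maps each output (+1 or -1) to the smallest Hamming distance
    between any two distinct vectors with that output. Requires k >= 2.
    """
    groups = {1: [], -1: []}
    for vec in product((-1, 1), repeat=k):
        groups[prod(vec)].append(vec)  # group vectors by their output
    minima = {}
    for out, vecs in groups.items():
        minima[out] = min(
            sum(a != b for a, b in zip(u, v))  # Hamming distance of one pair
            for u, v in combinations(vecs, 2)
        )
    return min(minima.values()), minima


K = 4  # hypothetical value for illustration
value, minima = state_classifier(K)
pairs_per_group = comb(2 ** (K - 1), 2)  # binomial coefficient per group
print(value, minima, pairs_per_group)
```

Two vectors with the same product must differ in an even number of positions, so under these assumptions the minimum distance within each group cannot be smaller than 2.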

4.5. Process of Analysis of the Values of K, N, and L

The probability of success of an attacking network depends noticeably on the values of K, N, and L, as shown in Table 3. Therefore, the values of K, N, and L were analysed with respect to this probability. All possible combinations were plotted with their respective probabilities. To do this, we used the GNU Octave scatter3 function to draw a scatter diagram. Algorithm 3 presents the code that generated this graph.

Data: K, N, L, P
Result: Graph
1 initialization;
2 scatter3 (K(:), N(:), L(:), [16], P(:), '.');
3 set(gca,'zscale','log');
4 xlabel "K";
5 ylabel "N";
6 zlabel "L";
7 colorbar;

Algorithm 3 starts with the values of K, N, and L and the probabilities of Eve's success for each combination. Line 2 draws a 3D scatter diagram with the values of K, N, and L. The fourth argument, 16, sets the size of the plotted points. The fifth argument supplies the probabilities, which determine the colour of each point. Line 3 changes the scale of the z axis (the value of L) to a logarithmic scale to improve visualization. Lines 4 to 6 set the labels of the axes to their respective values. Finally, line 7 displays a colour bar. Figure 3 shows the result of Algorithm 3.

Figure 3 shows how the probabilities change according to the values of K, N, and L. The x axis shows the values of K, the y axis the values of N, and the z axis the values of L. The colour of each point indicates the probability of success of the attacking network. A low probability is better (blue) and a high probability is worse (red).

As can be seen in Figure 3, the value of does influence much on probability. On the other hand, a high value of has a negative impact on the result. Finally, a very low value of has a negative impact. For this reason, to propose values of , , and , a relatively low value of is recommended, but it has to be higher than . The value of should not be too low. These conditions ensure a very low probability of a successful passive attack.

5. Conclusions

This article addressed the design of a TPM neural network that allows generating a 512-bit key between two authorized parties. To do this, the optimal values of K, N, and L were computed through simulations that prioritized security against an attacking network. This attacking network, through a passive attack, tried to imitate the behaviour of the other two networks and achieved synchronization with them on a single occasion. This demonstrates that a secret key can be exchanged between two authorized parties in a network with a probability of that an attacking network can mimic the behaviour of the original networks. In addition, a geometric attack was designed and executed, with a success rate of 0%. Furthermore, if the values of K, N, and L of a TPM neural network are modified, cryptographic keys of various lengths can be obtained. These values also determine how easy or difficult it is for an attacking network to mimic the behaviour of the other two networks. According to our results, it is recommended that the values of K, N, and L meet the following conditions: and .

As future work, we plan to analyse the L values that were not considered in this study due to software and hardware limitations (, , , and ), again prioritizing security against an attacking network. We also plan to study the combination of neural cryptography, using the values calculated in this study, with symmetric cryptographic systems that need a cryptographic key established between two parties (such as DNA-based cryptography). The goal is to enable secure communications without having to send keys through insecure means. Finally, we recommend a more exhaustive security analysis of the values found for the structure of the TPM neural network, namely, with specific attack models that target the management of the cryptographic key, especially its distribution.

Data Availability

The data used to support the findings of this study (CSV, TXT, and PNG) have been deposited in the Google Drive repository. In the following link, you can find the data used in our manuscript: https://drive.google.com/drive/folders/1MOf7CfXFrFB-gtK6ue4JkHURzga1o8zA.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

The funding of this scientific research is provided by the Ecuadorian Corporation for the Development of Research and the Academy (RED CEDIA) and also from the Mobility Regulation of the Universidad de las Fuerzas Armadas ESPE, from Sangolquí, Ecuador. The authors would like to thank the financial support of the Ecuadorian Corporation for the Development of Research and the Academy (RED CEDIA) in the development of this study, within the Project Grant GT-Cybersecurity.