Research Article | Open Access

Reihaneh Safavi-Naini, Alireza Poostindouz, Viliam Lisy, "Path Hopping: An MTD Strategy for Long-Term Quantum-Safe Communication", *Security and Communication Networks*, vol. 2018, Article ID 8475818, 15 pages, 2018. https://doi.org/10.1155/2018/8475818

# Path Hopping: An MTD Strategy for Long-Term Quantum-Safe Communication

**Academic Editor:** Hamed Okhravi

#### Abstract

Moving target defense (MTD) strategies have been widely studied for securing computer systems. We consider using MTD strategies to provide long-term cryptographic security for message transmission against an eavesdropping adversary who has access to a quantum computer. In such a setting, today’s widely used cryptographic systems including Diffie-Hellman key agreement protocol and RSA cryptosystem will be insecure and alternative solutions are needed. We will use a physical assumption, existence of multiple communication paths between the sender and the receiver, as the basis of security, and propose a cryptographic system that uses this assumption and an MTD strategy to guarantee efficient long-term information theoretic security even when only a single path is not eavesdropped. Following the approach of Maleki et al., we model the system using a Markov chain, derive its transition probabilities, propose two security measures, and prove results that show how to calculate these measures using transition probabilities. We define two types of attackers that we call risk-taking and risk-averse and compute our proposed measures for the two types of adversaries for a concrete MTD strategy. We will use numerical analysis to study tradeoffs between system parameters, discuss our results, and propose directions for future research.

#### 1. Introduction

The cryptographic infrastructure of the Internet allows users from across the world to establish private, authenticated communication channels and interact securely. Shor’s discovery of a quantum algorithm that can efficiently solve the integer factorization and discrete logarithm problems [1], the two mathematical problems that underlie the security of the most prominent public key algorithms such as RSA public key encryption and Diffie-Hellman key agreement, would effectively bring down this infrastructure once large-scale quantum computers become available. The NSA’s recent call for quantum-safe cryptography and the prediction that significant progress in the development of quantum computers could be expected within fifteen years [2, 3] have created a flurry of activity in the research community, standardization bodies [4, 5], and major industries [6–8]. The main approaches to quantum-safe cryptography use (i) quantum cryptographic models and algorithms, (ii) cryptographic algorithms that rely on computational assumptions for which no efficient quantum algorithm is known [9], and (iii) cryptographic systems that do not use any computational assumptions. This last approach results in information theoretically secure systems and is followed in this paper.

A prominent and widely researched direction in information theoretically secure communication is *physical layer security*: systems that base security on assumptions about the physical environment [10]. In these systems the advantage of the sender and the receiver over the adversary is captured through the properties of the physical layer of communication. For example, in Wyner’s model [11] Alice is connected to Bob and Eve through two noisy channels, and the assumption is that Eve’s reception is noisier than Bob’s. The extra noise in Eve’s channel is the resource that can be used for securing communication against Eve, without the need for a shared key. A unique property of this approach compared to computationally secure systems is that it provides *long-term security*, meaning that Eve’s transcript of the communication cannot be used for offline attacks. This is because security is due to the adversary’s lack of information and not to the adversary’s limited computation.

In this paper we assume there are multiple communication *paths* ( paths) between the sender and the receiver. A path is an abstraction of a channel and can have different realizations in practice. For example, in wireless communication, a path can correspond to a frequency that is used for transmission and reception by the sender and the receiver, and multiple paths are specified by a set of frequencies that will be used by the two. In a sensor network, a path consists of a sequence of nodes in the network which are used to send messages from the sender to the receiver. A similar notion of path can be defined for communication over the Internet. If the adversary can eavesdrop on all the paths, secure communication without additional assumptions (e.g., quantum mechanical, computational, or other physical layer assumptions) is impossible. We assume that although the set of all paths (e.g., possible frequencies) is known to the attacker, they cannot eavesdrop on all the paths at the same time.

To provide cryptographic security against an attacker whose goal is to learn the sent message, the sender can use an -secret sharing (see Section 2) to construct shares of the message and send each share along a path. This is an inefficient solution with communication rate of (i.e., for one bit of information, the sender must send bits). If the number of paths that the adversary can simultaneously eavesdrop is bounded by , the sender and receiver can select a random set of paths and use a -secret sharing. Note that is not necessarily equal to , but to keep the introductory discussion simple, let ; we will later discuss the relation between and in Section 3. If the set of paths is kept fixed, the attacker will discover them over time and will be able to learn the message. We propose “path hopping,” where the sender and the receiver regularly change (“hop”) one or more paths that have been chosen for communication. We also allow the attacker to change their selected paths. We model and analyze dynamic behaviour of the system and show that it results in efficient cryptographic security using an MTD strategy.

##### 1.1. Our Work

Alice wants to send a stream of data to Bob. The adversary is computationally unlimited (no computational assumption is made). Alice and Bob are connected by a set of communication paths, up to of which can be eavesdropped by the adversary at the same time. The adversary probes a path to determine if it carries data and, if it does, captures it (e.g., by scanning a port on a server and, if it is open, later breaking into the path). The adversary is *mobile* [12] and can move in the network in the sense that, in each time step, it can release a captured path and capture a new one. Hence, it can eavesdrop on different sets of paths during different time periods. *We assume the attacker can eavesdrop up to paths, but only eavesdrops those that carry data, and not the ones that are not in use between Alice and Bob* (one can consider a case where the attacker always eavesdrops on paths; although the analysis approach will be similar, the actual calculations will be different). Alice uses (out of ) paths at each time and the attacker needs to know all the paths (targets) to be able to determine the message.

Time is divided into fixed consecutive intervals, each referred to as a *time step*. In each time step, the *defender* (which includes Alice and Bob), the *attacker*, or neither of them takes an action (move). Using the MTD framework of [13], the combination of the attacker’s and the defender’s actions in each time step can be modelled as a *Markov chain*. We define Markov chain where (exactly) and paths are hopped by the defender and the attacker, respectively, and derive transition probabilities of the chain. (One can also consider Markov chain where in each time step, the defender randomizes *up to* paths and the attacker hops and probes *up to* paths. We leave this for future work.) For our concrete analysis we focus on where the defender’s and the attacker’s actions involve a single path only.

In each time step, the system can be in one of the *states* labeled by , where state 0 is the starting state of the system, state is the *winning state* for the attacker, and in state , the attacker has captured paths. We assume the defender leads in each time step and *moves* with a fixed probability . The attacker also moves in the same interval with a fixed probability that is upper bounded by (the model assumes that, in each time step, the attacker and the defender do not act simultaneously). Note that can be chosen by the adversary, knowing the upper bound. We define a *risk-taking* and a *risk-averse* attacker, depending on their choice of . A risk-taking adversary chooses the highest available attack probability, that is, . A *risk-averse* attacker would like to stay undetected and so limits their action rate to a threshold that is determined by the intrusion detection system (IDS) of the defender. Thus, in the case of a risk-averse adversary, in each time step, there is a probability that no one moves.

We model the system as a Markov chain, and in Section 3, derive the transition probabilities of the Markov chains associated with the risk-taking (case C1) and the risk-averse (case C2) attackers.

*Security Measures*. We use two security measures to evaluate the effectiveness of a path hopping strategy: (i) the expected number of times that the attacker reaches the winning state in time steps, assuming that it starts from state 0, and (ii) the expected number of time steps to enter the winning state for the first time, assuming that the attacker starts from state 0. The two measures are denoted by and , respectively.

These security measures capture security requirements of different scenarios. is appropriate for data streams for which sporadic access to different parts of the stream may be tolerated. For example, small excerpts of a large file are not expected to leak much information about the file. Theorem 3 shows that is upper bounded by the product , where is the th component of the stationary probability distribution of the Markov chain. This suggests that can be used to represent , *with higher values corresponding to less security*.

is appropriate for highly sensitive data streams that must stay strictly inaccessible to the adversary, where the sender wants to ensure that the expected number of time steps to the first compromise is sufficiently high (possibly higher than the length of the stream). Theorem 4 shows that can be calculated by solving a set of linear equations whose coefficients are derived from the transition probabilities of the Markov chain. We use as a security measure, *with higher values corresponding to higher security*.

*Numerical Results*. Deriving closed form expressions for and is a challenging task. For , we use numerical calculations to study variations of security measures for different values of system parameters. Our results are given in Section 6. They show the following:

(1) For fixed and , security increases (i.e., decreases and increases) as , the defender’s probability of action, increases (Figures 3 and 4 for C1 and Figures 6 and 7 for C2).

(2) For fixed , security can be maintained by increasing , even when (Figure 5) and in all cases communication rate is .

Figure 5 also shows that, for given values of and , as increases, security initially increases, then reaches a plateau, and then starts to decrease. This is because when is small (relative to ), the target paths are hidden among many available paths and the chance of correctly guessing a path is small. However, when is large (relative to ), the attacker’s probability of guessing correctly increases. Interestingly, this point of saturation increases as , which represents the variability of the system, increases. This graph can be used to select the optimal value of to provide maximum security, while achieving the highest communication rate.

(3) Using numerical analysis, one can estimate the *cost of being risk-averse* in terms of decrease in or increase in . In Section 7 we show that an adversary who chooses not to use all their attacking power (although they can act with probability , they choose ) will effectively reduce the expected number of times that they will occupy the winning state (proportional to ) and will have higher .

*Attack Costs*. Our model focusses on the defender’s ability to provide security by making the *physical environment* dynamic and does not consider the associated costs. The attacker’s and the defender’s actions have payoffs. The attacker needs to spend resources to launch attacks and also bears the consequences of being detected. The defender must spend resources to implement the randomization strategy. This introduces side effects such as packet loss and communication delays that are a function of the rate of randomization (captured with parameter ). The attacker’s reward for an action is related to getting closer to the winning state, and the defender’s reward is preventing the attacker from reaching the final state. In Section 7 we discuss these payoffs. We also use our numerical calculation results to quantify the cost of being risk-averse.

*Randomness Requirements of the System*. Our proposed system assumes that the sender and the receiver share the set of target paths that is used for communication in each time step.

In practice, if one can assume that the receiver listens on all paths at all times, then no shared randomness is required: the sender will hop the paths and the receiver will receive the content on the target paths used in each time step. If the receiver has the same restriction as the sender on the number of target paths, that is, the receiver can only receive on paths (e.g., because of cost or a restriction on the receiving equipment), then the sender and the receiver need shared randomness to hop the paths simultaneously. This can be realized in two ways: (i) using a preshared random string or (ii) employing a secure pseudorandom generator to extend an initial shared random seed.

The adversary’s view of the system in state , in addition to the eavesdropped shares that are sent over the target paths, includes the labels of the target paths. In case (i), the sequence of random numbers associated with the labels of target paths will not reveal any information about future values of the sequence of target paths, and so future path labels will remain unpredictable. In case (ii), however, each observation (of a target path) will leak information about the seed of the pseudorandom (PR) generator, and one needs to use a PR generator with an appropriate security level (e.g., a quantum-safe PR generator built from a secure block cipher). Note that the MTD system will retain its security: although the recorded transcript of communication may reveal the seed of the PR generator in an offline attack, it will not contain enough information about the communicated message, and so the message transmission will have long-term security.

##### 1.2. Related Work

Breaking information into shares to provide confidentiality and reliability has been used in many cryptographic systems such as secret sharing [14] and information dispersal [15], in the information theoretic setting as well as the computational setting [16]. These algorithms have been used in distributed storage systems [17] and are the building blocks of Secure Message Transmission [18] and network coding [19], which use multiple paths between the sender and receiver for providing *security and reliability*.

Uncoordinated Frequency Hopping (UFH) [20] is similar to our work. In UFH the sender and receiver send and receive on two independently chosen subsets of frequencies, and the eavesdropper uses a third subset of frequencies for eavesdropping. The authors show that, assuming a public key infrastructure, one can communicate securely and reliably in this setting. The work of [21] uses a similar abstract model to construct information theoretic protocols for secure communication, without requiring a public key infrastructure. The communication rate in this latter construction, however, is very low. Our approach is *coordinated* path hopping, where the sender and receiver share an initial secret key that can be established using the scheme of [21]. We leave the analysis of the secret key requirement of our system, and in particular efficient ways of generating new keys at the required hopping rate, for future work.

Diversity and dynamic properties have been widely used in security systems. System properties that can be diversified and randomized include program instructions [22, 23], operating system distributions [24], and systems [25]. A comprehensive study of various methods is given in [26]. The use of game theory for analyzing attackers’ strategies in dynamic systems has been studied in [27, 28].

*Organization*. Section 2 recalls the MTD Markov chain framework. Section 3 presents our path hopping model. Security analysis and measures for our model are introduced in Section 4. Sections 5 and 6 present our simulation results for the game. Sections 7 and 8 cover utility discussions and our concluding remarks.

#### 2. Preliminaries

We recall the basic MTD Markov chain framework that is used in our work and review construction and properties of secret sharing schemes.

*MTD Markov Framework [13]*. The system is defined by the interaction between a *defender* and an *attacker*. The defender and the attacker each have a set of possible actions, denoted by and , respectively; in both, denotes no action. Time is divided into time steps. In each time step the system is in one of the possible states. In each time step, the defender and the attacker get a turn to move and the state change probability is determined by their chosen actions and their results. A *strategy* of a player determines all actions taken by the player at all points of the game. Using a *Markov model* allows the player’s strategy to depend only on the state that the system is in, independent of the history of how the system reached that state.

*Definition 1.* An -MTD game is defined by a transition matrix which describes a Markov chain of state transitions that reflects both defender and attacker moves. Initially the game starts in the state 0. At each next time step the game transitions from its current state to a new state with probability .

The state is the winning state from the adversary’s view (defender losing the game). Initially the system is in state 0 (from both attacker and defender’s view point). In each time step the defender takes an action according to matrix with probability , the attacker takes an action according to with probability , and with probability , both remain without any action.

*Definition 2.* An -MTD game is defined by (1) parameters and that satisfy ; the parameters represent the rates of the defender’s and the attacker’s play, respectively; (2) transition matrices and ; for , (or ) represents the probability of transitioning from state to state when the defender (or the attacker) plays a move in state .

Thus, in each time step a three-sided coin is tossed, and for each side, the corresponding action is realized, and we have the transition matrix where is the identity matrix.
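The three-sided coin corresponds to a convex combination of the defender's matrix, the attacker's matrix, and the identity. The sketch below illustrates this under assumed names: `lam` and `mu` stand for the defender's and the attacker's rates, and `D` and `A` for their transition matrices. These identifiers and the toy two-state matrices are chosen here for illustration only.

```python
from fractions import Fraction


def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def combine(D, A, lam, mu):
    """Return lam*D + mu*A + (1 - lam - mu)*I for square matrices D and A.

    lam, mu: assumed names for the defender's and attacker's move rates.
    """
    if lam < 0 or mu < 0 or lam + mu > 1:
        raise ValueError("need lam >= 0, mu >= 0 and lam + mu <= 1")
    n = len(D)
    I = identity(n)
    idle = 1 - lam - mu  # probability that nobody moves in this time step
    return [
        [lam * D[i][j] + mu * A[i][j] + idle * I[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy two-state example (illustrative values only).
D = [[Fraction(1), Fraction(0)], [Fraction(1), Fraction(0)]]  # defender resets to state 0
A = [[Fraction(0), Fraction(1)], [Fraction(0), Fraction(1)]]  # attacker always reaches state 1
M = combine(D, A, Fraction(1, 2), Fraction(1, 4))
```

Because `D`, `A`, and the identity are each row-stochastic and the three weights sum to one, every row of the result also sums to one.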

A Markov chain is *irreducible* if each state can be reached from any other state. A Markov chain is *aperiodic*, if all the states have period 1 where the period of state is defined as , where is the random variable describing the state of the game after steps. The two properties together guarantee the existence of a limiting *stationary distribution*, where .
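For an irreducible and aperiodic chain, the stationary distribution can be approximated numerically by repeatedly applying the transition matrix to any starting distribution. The following is a minimal sketch of that power iteration; the function names, the tolerance, and the two-state example matrix are assumptions made for illustration.

```python
def step(dist, M):
    """One step of the chain: returns the row vector dist * M."""
    n = len(M)
    return [sum(dist[i] * M[i][j] for i in range(n)) for j in range(n)]


def stationary(M, tol=1e-12, max_iter=100_000):
    """Approximate the stationary distribution by power iteration from state 0."""
    n = len(M)
    dist = [1.0] + [0.0] * (n - 1)
    for _ in range(max_iter):
        nxt = step(dist, M)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    raise RuntimeError("no convergence; is the chain irreducible and aperiodic?")


# Illustrative two-state chain.
M = [[0.75, 0.25], [0.5, 0.5]]
pi = stationary(M)
```

For this toy chain the balance condition gives a stationary distribution of (2/3, 1/3), which the iteration approaches quickly.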

*Secret Sharing*. A -secret sharing is a cryptographic primitive [14] that divides a secret into shares, each given to a party, satisfying two properties: (i) *reconstructability*, which means the shares of all parties together perfectly reconstruct the original secret, and (ii) *perfect secrecy*, which means that if a single share is missing, the secret remains perfectly uncertain. A secret sharing scheme provides two algorithms for *share generation* and *secret reconstruction*. Let , where is the set of integers modulo , denote the set of secrets, and assume that all secrets are equally likely . The share generation algorithm takes a message as input and generates shares as follows. For , randomly chooses an element in . Then the shares of the secret are . It is easy to see that shares recover the secret (finding the sum modulo ), and even if shares are known, the secret remains completely uncertain.
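The share generation and reconstruction algorithms described above can be sketched as follows. This is a minimal illustration of additive sharing modulo an integer, assuming the names `k` for the number of shares and `q` for the modulus; it uses Python's `secrets` module as the randomness source, which is a choice made here.

```python
import secrets


def make_shares(secret, k, q):
    """Split `secret` (an integer in [0, q)) into k additive shares modulo q.

    The first k - 1 shares are uniform in [0, q); the last share is chosen so
    that all k shares sum to the secret modulo q.
    """
    if k < 1:
        raise ValueError("need at least one share")
    if not 0 <= secret < q:
        raise ValueError("secret must lie in [0, q)")
    shares = [secrets.randbelow(q) for _ in range(k - 1)]
    shares.append((secret - sum(shares)) % q)
    return shares


def reconstruct(shares, q):
    """Recover the secret as the sum of all shares modulo q."""
    return sum(shares) % q


shares = make_shares(123, k=5, q=257)
recovered = reconstruct(shares, 257)
```

Any proper subset of the shares consists of values that are jointly uniform, which is why a single missing share leaves the secret completely uncertain.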

#### 3. The MTD Game of Path Hopping

We consider the setting described in Section 1.1: there is a message source that generates a stream of data that must be protected against an eavesdropper. There are communication paths that connect the sender to the receiver. To protect message transmission against an eavesdropper who can simultaneously eavesdrop up to paths (), the sender does the following: (i) randomly chooses a subset of available paths; (ii) uses a secret sharing to construct shares for the message; and (iii) sends each share on one of the selected paths. The chosen paths are also called *target paths*. The receiver knows the paths that are used by the sender in each time step. If the adversary eavesdrops only a subset of the target paths (and not all target paths), because of the perfect secrecy of the -secret sharing scheme, the attacker will stay completely uncertain about the message.

We assume that the attacker will not keep a path that is not carrying data. That is, because of the limitation on the number of paths that they can simultaneously eavesdrop, they prefer to release a path that is not used in the current time step and wait for the next time step to try again, noting that, due to their probabilistic strategy, there would be a chance to try the released path in the next time step again. To simplify our analysis, we first consider the case that . For the cases that or , similar analysis can be used; we omit details because of space.

To protect against this adversary, in each time step, the sender and receiver will *hop* one or more of the target paths, noting that lacking access to even one of the target paths will leave the adversary completely uncertain. We will use the MTD game framework of Definition 2 and model the problem as a dynamic system (game) influenced by (between) two *players*, a *system defender (or simply defender)* that includes the *sender* and the *receiver*, and an *attacker*. The attacker wins the MTD game (in each time step) if they find the target paths.

##### 3.1. Games

In each time step, the defender can randomize a subset of target paths. Similarly, the adversary can simultaneously probe paths.

We first describe the Markov chain associated with the game, then derive transition probabilities of , and finally present a detailed analysis of .

##### 3.2. Markov Chain

The set of the defender’s and the attacker’s actions is and , respectively, where and are defender and attacker actions and is no action. Let denote the set of current target paths and denote the subset of target paths known to the adversary.

*Defender’s Move*. The defender cannot determine with certainty if a path is being eavesdropped. We thus consider a defender who, in all time steps, plays a *memoryless strategy*. That is, the defender plays (issues the move ) with probability , irrespective of any learnt information about the attacker’s state or their own history of actions. When the defender plays in state , they will choose a subset of the current target paths and replace the paths in with a randomly selected subset of () nontarget paths.

The chosen paths in may belong to (attacker’s known path in state ) or be outside it.

*Attacker’s Move*. The attacker is adaptive. In state , the set of target paths that is known to the adversary is of size . The adversary randomly selects a subset of size of () possible target paths and keeps the message carrying paths and releases the rest. For the adversary, all paths that are not in their set of known target paths have the same probability of being a target path.

We assume that, in state , as soon as the defender reallocates a target path that is in , the attacker can detect the change (the path is not one of the target paths). However this will not affect the adversary’s action at this state simply because they know that those paths are not possible target paths.

*No Move*. Both the defender and the attacker act probabilistically, so in a given time step it is possible that neither of them issues a move.

In a time step, if the attacker does not issue an action, they will bear the risk of potentially losing one of their known target paths during the next time step. This is because the defender will play a memoryless strategy and will move with probability . This extra risk would translate into a higher probability of not being able to reach the winning position of the game.

To reduce the probability of losing a target path while waiting, the attacker should act when possible and use the available action rate. We refer to this attacker as a *risk-taking* attacker as they focus on maximizing their winning chance. More frequent attacks, however, carry the risk of triggering an alarm in the defender’s intrusion detection system (IDS), tightening security, and reducing access to the system. Let be a threshold that is used by the defender’s IDS to raise the threat level of the system. To avoid a reduction in access to the system, the attacker may prefer to keep their attack rate below . We refer to this attacker as a *risk-averse* attacker.

The defender plays memoryless with probability , and so in each time step the attacker moves with probability

There will be no move by any of the players in a time step, with probability . Thus the system transition matrix will be

Equation (2) shows that, depending on the value of (the attack detection threshold of the defender), we have two cases: C1: . In this case from (2), we have and C2: . In this case from (2), we have and

We refer to C1 and C2 as risk-taking and risk-averse attacker, respectively.
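The distinction between the two cases can be illustrated with a small helper. The sketch assumes that the attacker's move probability is the smaller of the IDS threshold and the probability left over after the defender's move; the names `lam` (defender's move probability) and `tau` (IDS threshold) are introduced here for illustration.

```python
from fractions import Fraction


def attack_rate(lam, tau):
    """Attacker's per-step move probability, assumed to be min(tau, 1 - lam)."""
    return min(tau, 1 - lam)


def idle_rate(lam, tau):
    """Probability that neither player moves in a time step."""
    return 1 - lam - attack_rate(lam, tau)


def attacker_case(lam, tau):
    """Classify the attacker as C1 (risk-taking) or C2 (risk-averse)."""
    return "C1" if tau >= 1 - lam else "C2"


lam = Fraction(1, 2)
risk_taking = (attack_rate(lam, Fraction(3, 4)), idle_rate(lam, Fraction(3, 4)))
risk_averse = (attack_rate(lam, Fraction(1, 4)), idle_rate(lam, Fraction(1, 4)))
```

Under this reading, a risk-taking attacker leaves no idle time steps, whereas a risk-averse attacker accepts a positive probability that no one moves.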

##### 3.3. Transition Probabilities of

In state , the attacker knows target paths in . A state transition that starts from state is in general caused by the combination of the defender’s and the attacker’s actions in the following time step. A defender’s action reallocates target paths and (since the attacker only holds target paths) can cause state to change to where . An attacker’s action, however, could result in more target paths being captured and so change the state to . The state may also not change, that is, it stays at , after the defender’s action, the attacker’s action, *or* no move at all. In the following, we obtain transition probabilities (starting from state ) for (i) the defender’s move and (ii) the attacker’s move and combine them to obtain the transition probabilities of the chain. For the case of “no move” (which happens with probability ) the state of the game will not change.

*Defender’s Move in State *. Defender chooses a set of paths from the set of current paths and replaces them with a set of paths chosen from the candidate target paths .

Let be the intersection of and the adversary’s set of captured paths, and let . We note that . Thus the state of the game after the defender’s action will be (because target paths have been removed from ), with the probability given in (6). Note that, for , the state of the game will not change.

*Attacker’s Move in State *. Attacker holds the target paths in and knows the state of the game. The attacker will choose a set of paths from , the set of available candidate target paths.

Let be the intersection of and the defender’s set of target paths that are not captured yet, and let . We have . With the new captured target paths, the state of the game will become , with the probability given in (7). For , the state of the game will not change.

*Transition Probability from State to .* Transition probabilities from state to will be calculated using (6) and (7) and . We note that transitions with occur only due to the defender’s move, and transitions with occur due to the attacker’s move. The state remains unchanged as a result of the defender’s move, the attacker’s move, or no move, with probabilities , and , respectively. Thus we have the following transition probabilities. Here and are defined in (6) and (7), respectively.

The above probabilities show that in each time step a state can be changed to up to other states or stay the same.
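One natural reading of the defender's and attacker's moves is that the number of affected paths follows a hypergeometric distribution, since in each case a uniformly random subset is drawn from a pool that contains a known number of relevant paths. The sketch below computes the resulting rows under that reading. The symbols `N` (all paths), `K` (target paths), `d` (paths hopped by the defender), `a` (paths probed by the attacker), and `i` (current state) are names assumed here, and the formulas are an interpretation of the description above rather than a transcription of it.

```python
from fractions import Fraction
from math import comb


def defender_row(i, K, d):
    """Row i of the defender's matrix: the defender re-draws d of the K target
    paths; j of them fall among the i captured paths, giving state i - j."""
    if not 0 <= d <= K or not 0 <= i <= K:
        raise ValueError("need 0 <= d <= K and 0 <= i <= K")
    row = [Fraction(0)] * (K + 1)
    for j in range(min(i, d) + 1):
        ways = comb(i, j) * comb(K - i, d - j)  # comb returns 0 if d - j > K - i
        row[i - j] += Fraction(ways, comb(K, d))
    return row


def attacker_row(i, N, K, a):
    """Row i of the attacker's matrix: the attacker probes a of the N - i
    candidate paths; j of them are uncaptured targets, giving state i + j."""
    if not 0 <= i <= K < N or a < 0:
        raise ValueError("need 0 <= i <= K < N and a >= 0")
    a = min(a, N - i)  # cannot probe more paths than there are candidates
    row = [Fraction(0)] * (K + 1)
    for j in range(min(K - i, a) + 1):
        ways = comb(K - i, j) * comb(N - K, a - j)
        row[i + j] += Fraction(ways, comb(N - i, a))
    return row


N, K, d, a = 10, 4, 2, 2
D = [defender_row(i, K, d) for i in range(K + 1)]
A = [attacker_row(i, N, K, a) for i in range(K + 1)]
```

Each row sums to one by Vandermonde's identity, and setting `d = a = 1` reduces the rows to the single-path case, where only neighbouring states are reachable.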

##### 3.4. System Parameters

The Markov chain that models the system is determined by the parameters , and . In the following we will define security measures for the system and prove theorems that relate these measures to the system parameters.

#### 4. Security Analysis

We use two security measures related to the success criteria of the attack.

##### 4.1. Expected Number of Compromises

Consider the system over a period of time steps, starting from the state 0. Within these time steps, the expected number of times that the system will be in the compromised state, that is, the attacker is able to learn the message, is an important security measure. (Note that one can use coding strategies [15] to spread information over longer sequences, and so estimating the expected number of compromises provides the required parameter for encoding.)

Theorem 3. *For an MTD game of path hopping with transition matrix and stationary distribution , where is the winning state, , which is the expected number of times the adversary wins in the first time steps, is less than or equal to , assuming that the game starts with the distribution.*

*Proof. *The game starts at . Let denote the expected number of times the attacker wins in the first time steps.

We first assume the attacker’s starting position is chosen according to the stationary distribution . Our goal is to find the expected number of times the attacker wins in steps, starting with this distribution.

Let be an indicator variable that takes the value 1 if the attacker wins in the time step and zero otherwise. Note that, starting from stationary distribution , the distribution of next step position of the attacker is , and so each has identical distribution .

The random variable is the number of times that the attacker wins in time steps. Noting the linearity of the expectation function , that is, , we have

The adversary will have zero chance of winning in the first time steps if they start with the 0 distribution, and so

The last step of the argument assumes that, starting from the initial distribution , is monotonically increasing (in fact a weaker assumption would suffice for this last step of the argument, which is for all ) in each step of the chain until it reaches .

In our numerical computation we use to represent this security measure.
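The bound of Theorem 3 can be checked numerically on a small instance. The sketch below is an illustration under assumptions: it uses a single-path hopping chain in which the attacker loses a captured path with probability `lam * i / K` and gains one with probability `mu * (K - i) / (N - i)`, where `N`, `K`, `lam`, `mu`, and the horizon `T` are names and values chosen here.

```python
def single_hop_matrix(N, K, lam, mu):
    """Birth-death chain on states 0..K for single-path hopping (assumed form)."""
    M = [[0.0] * (K + 1) for _ in range(K + 1)]
    for i in range(K + 1):
        down = lam * i / K              # defender hops a captured path
        up = mu * (K - i) / (N - i)     # attacker hits an uncaptured target
        if i > 0:
            M[i][i - 1] = down
        if i < K:
            M[i][i + 1] = up
        M[i][i] = 1.0 - down - up
    return M


def step(dist, M):
    """One step of the chain: returns the row vector dist * M."""
    n = len(M)
    return [sum(dist[i] * M[i][j] for i in range(n)) for j in range(n)]


def expected_wins(M, T):
    """Expected number of the first T time steps spent in the winning state,
    starting from state 0 (sum of per-step winning probabilities)."""
    K = len(M) - 1
    dist = [1.0] + [0.0] * K
    total = 0.0
    for _ in range(T):
        dist = step(dist, M)
        total += dist[K]
    return total


def stationary(M, iters=20_000):
    """Approximate the stationary distribution by repeated multiplication."""
    dist = [1.0] + [0.0] * (len(M) - 1)
    for _ in range(iters):
        dist = step(dist, M)
    return dist


M = single_hop_matrix(N=10, K=3, lam=0.3, mu=0.7)
T = 50
wins = expected_wins(M, T)
bound = T * stationary(M)[-1]
```

Starting from state 0, the probability of being in the winning state grows towards its stationary value, so `wins` stays below `bound` in this example.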

##### 4.2. Expected Number of Steps to the First Time Win

Our second security metric is the* expected number of steps to first time compromise*. This is an important measure for defender to estimate unbreakability of the system and for the attacker to estimate the work (in terms of the number of time steps that could be translated into attacker’s cost) needed to break the system. This measure can be calculated by solving a set of linear equations.

Theorem 4. *Consider an MTD game with transition matrix . Let denote the expected number of time steps to reach the state (the winning state) for the first time, if the game has started with state . We have where for all and is the same matrix as with the last (th) row removed.*

*Proof.* We consider an attacker that starts in state 0. Let denote the expected number of time steps to compromise the system. Let denote the expected number of time steps to reach state (the winning state) for the first time, if the game starts at state . In both MTD games, starting from state 0, the next state will be state 0 or 1, and so we can write

That is, the expected number of time steps to reach state from state 0 is one (time step) more than the weighted average of the expected number of time steps to reach state from state 0, if the next move was to state 0, and the expected number of time steps to reach state from state 1, if the next move was to state 1. We can write similar equations for all states except state , for which . Let and be column vectors such that and for all . The set of equations can be written as where is with the th row removed. This linear equation can be solved for for any given and the first element of is our desired security metric. In Section 6 we will present graphs of for various game settings.
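The linear system in the proof can be solved directly. The following sketch uses the standard first-step formulation: for every non-winning state, the expected hitting time equals one plus the weighted average of the hitting times of the possible next states, and the hitting time of the winning state is zero. The function names and the small example chains are assumptions for illustration, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact rationals."""
    n = len(A)
    aug = [list(row) + [b[r]] for r, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def hitting_times(M):
    """Expected number of steps to first reach the last state of M from each
    other state: solves (I - Q) k = 1, where Q is M restricted to the
    non-winning states."""
    K = len(M) - 1
    A = [[Fraction(int(i == j)) - M[i][j] for j in range(K)] for i in range(K)]
    return solve(A, [Fraction(1)] * K)


# Illustrative three-state chain; state 2 is the winning state.
M = [
    [Fraction(1, 2), Fraction(1, 2), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
    [Fraction(0), Fraction(1), Fraction(0)],
]
k = hitting_times(M)
```

For this toy chain the equations are `k0 = 1 + k0/2 + k1/2` and `k1 = 1 + k0/4 + k1/4`, so the first entry of `k` is the expected time to first compromise from state 0.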

#### 5. The Game

To better understand the relationship between parameter values (), in the following we will focus on where .

That is, in state , the defender moves by randomly choosing one of the paths in and swapping it with a randomly chosen path from the paths in . If the random choice from is one of the paths in that are known to the adversary, the adversary loses one of their captured paths and the system moves to state ; this happens with probability . Otherwise, if the defender’s selected path is not in , the system stays in the same state; this has probability .

The attacker will randomly choose one of the possible paths. The new path will be a new target path with probability and so the system will move to the state with this probability. On the other hand with probability , the selected path will not be a target path and with this probability the system will remain in state .

##### 5.1. Case C1: Risk-Taking Adversary

State transition probabilities are given by (4). Figure 1(a) shows state transition probabilities due to the attacker’s and the defender’s actions. The transition probabilities on the upper part of the figure are due to the attacker’s action. Figure 1(b) shows the combined transition probabilities.

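**Figure 1:** (a) State transition probabilities due to the attacker’s and the defender’s actions. (b) Combined transition probabilities.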

This is a Markov chain and the state transition matrix can be obtained from Figure 1(b) as

It is easy to see that the Markov chain is irreducible and aperiodic. Using this matrix we can find stationary probability distribution of the system denoted by .
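For an irreducible and aperiodic chain, repeatedly applying the transition matrix to any starting distribution converges to the stationary distribution. The sketch below illustrates this with plain power iteration; it is an assumption-laden illustration (nested-list matrix, hypothetical function name and tolerance), not the method used for the paper's figures.

```python
from typing import List

Matrix = List[List[float]]


def stationary_by_iteration(
    m: Matrix, tol: float = 1e-12, max_iter: int = 1_000_000
) -> List[float]:
    """Approximate the stationary distribution by iterating dist <- dist * M."""
    n = len(m)
    dist = [1.0 / n] * n  # any starting distribution works for such chains
    for _ in range(max_iter):
        nxt = [sum(dist[i] * m[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    raise RuntimeError("power iteration did not converge")
```

The last component of the result is the long-run probability of the winning state, assuming the winning state is stored last.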

##### 5.2. Case C2: Risk-Averse Adversary

The state transition matrix is given by (5). At each time step, (i) the defender moves (randomizes) with probability , (ii) the attacker moves with probability , and (iii) no move happens with probability .

The adversary knows the state and moves with probability . The state transition matrix can be obtained from Figure 2(b) and is given by where . Using this matrix we can find the stationary probabilities . Again, the Markov chain is irreducible and aperiodic, and a limiting stationary distribution always exists.
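The three-way split described above (defender moves, attacker moves, or no move) can be sketched as a tridiagonal matrix builder. This is a hedged illustration only: the per-state probabilities are passed in as callables because the exact expressions are not reproduced here, and the values in the usage lines are placeholders chosen for the example, not the paper's equations. All names are hypothetical.

```python
from typing import Callable, List

Matrix = List[List[float]]


def mixed_transition_matrix(
    k: int,
    p_defender: float,
    p_attacker: float,
    lose: Callable[[int], float],
    gain: Callable[[int], float],
) -> Matrix:
    """Build a (k+1)-state tridiagonal matrix for states 0..k.

    lose(i): chance that a defender move takes state i to i-1.
    gain(i): chance that an attacker move takes state i to i+1.
    """
    if p_defender < 0 or p_attacker < 0 or p_defender + p_attacker > 1:
        raise ValueError("move probabilities must be non-negative and sum to <= 1")
    m = [[0.0] * (k + 1) for _ in range(k + 1)]
    for i in range(k + 1):
        down = p_defender * lose(i) if i > 0 else 0.0
        up = p_attacker * gain(i) if i < k else 0.0
        if i > 0:
            m[i][i - 1] = down
        if i < k:
            m[i][i + 1] = up
        m[i][i] = 1.0 - down - up  # covers "no move" and unsuccessful moves
    return m


# Placeholder per-state probabilities (assumptions for illustration only).
N_PATHS, K_PATHS = 10, 3
example = mixed_transition_matrix(
    K_PATHS, 0.5, 0.3,
    lose=lambda i: i / K_PATHS,
    gain=lambda i: (K_PATHS - i) / (N_PATHS - i),
)
```

Every row of the result sums to 1 by construction, so the matrix can be fed directly to a stationary-distribution or hitting-time computation.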

**Figure 2:** Panels (a) and (b); the state transition matrix above is obtained from panel (b).

##### 5.3. Bounds on

We prove an upper bound on using the following lemma.

Lemma 5. *Let be the transition matrix of a Markov chain such that, for any row , all components except are zero. Then for any one has*

*Proof. *We prove this by induction. For the base case , we know that Also we know that elements of a row in sum to 1, that is, . Thus, . Therefore, Now we assume that the equality holds for the case ; that is, As , for the case of we have Assuming the case holds we have Recall that ; thus Combining the last two equalities will yield or equivalently
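The equations of Lemma 5 are not reproduced in this copy. The steps of the proof (the stationarity equation at the first state, rows summing to 1, then induction) are consistent with the standard pairwise balance property of tridiagonal chains: the stationary flow from state i to i+1 equals the flow from i+1 to i. Assuming that reading, the sketch below builds a stationary distribution from the property and checks it numerically. The function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def stationary_tridiagonal(m: Matrix) -> List[float]:
    """Stationary distribution of a tridiagonal chain via pairwise balance.

    Uses pi[i] * m[i][i+1] == pi[i+1] * m[i+1][i], so each weight follows
    from the previous one; requires every m[i+1][i] to be positive.
    """
    weights = [1.0]
    for i in range(len(m) - 1):
        weights.append(weights[-1] * m[i][i + 1] / m[i + 1][i])
    total = sum(weights)
    return [w / total for w in weights]


def is_stationary(pi: List[float], m: Matrix, tol: float = 1e-12) -> bool:
    """Check pi * M == pi componentwise within a tolerance."""
    n = len(m)
    return all(
        abs(sum(pi[i] * m[i][j] for i in range(n)) - pi[j]) < tol
        for j in range(n)
    )
```

Because each component is obtained from its neighbour by a single ratio, this route avoids an eigenvector computation altogether for chains of this shape.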

Theorem 6. *Let be the transition matrix of the Markov chain of the game defined in (17). Also let be the stationary probability distribution of . Then the following inequality holds:*

*Proof. *The Markov chains of games satisfy Lemma 5. Therefore, using (17) we have The second equality is due to Lemma 5, and the third equality is due to the transition probabilities of . Note that for the case of risk-taking adversary .

#### 6. Numerical Results for

To study the effect of different system parameters on the system security, we calculated security measures of the two MTD games for different values of system parameters and graphed the results. We used the results of Section 4 to calculate and for different choices of , , and . We used MATLAB for our calculations. For each set of system parameters , or , using (6)–(14), we first obtained the transition matrix . For C1 is varied from 1% to 99% in steps of 1%. For C2 is varied from 1% to in steps of 1%.

To calculate , we used the results of Theorem 4 and employed MATLAB’s linear equation solver (linsolve) to solve the set of equations for each parameter set. For large values of , becomes near-singular and an exact solution cannot be found: for large values of and large values of , MATLAB issued a near-singularity warning and returned only an approximation of the answer. We have excluded those cases from our analysis, and the graphs include only the cases with exact results. This explains the choice of in our graphs. The choice of determines the computation time but otherwise is not restricted. We chose values of and such that the cases of and both are shown in the graphs.

To calculate we used MATLAB’s eigenvector analysis function ([V,D]=eig(M′)) to find for the stationary distribution where . The stationary distribution is obtained by normalizing the eigenvector that corresponds to the eigenvalue 1. We could use up to 1000 and up to 200 in this analysis. For the choice we ensured that , , and are represented. The results of the above calculations are graphed.

##### 6.1. Case C1

The results of numerical computations for different settings are depicted in Figures 3, 4, and 5.

*Stationary Probability*. For fixed and , as increases decreases. For fixed and , when increases, decreases. Both imply better security as expected from more dynamic systems (Figure 3).

Figure 5(b) shows that, for fixed and , an increase in results in a reduction of and an increase in , and so better security. However, this gain in security diminishes after reaches a threshold. In this last case almost all paths are target paths and so the attacker’s chance of guessing correctly is high. The thresholding behaviour suggests that using a higher (and so incurring a higher system cost) will not have a substantial effect on security.

The figures also show that, for fixed as increases, decreases. For example, for , , and 7, for all , we have (see Figure 3).

*Expected Number of Steps to the First Win *. Figure 4 shows that, for given parameters and , increasing increases the security of the game. Moreover, we observe that behaves linearly in the graphs of Figure 4 as increases, and it increases faster for higher ’s. Therefore, with increasing , the risk-taking attacker needs much longer time to compromise the system.

We also observe that, for given and , as increases, after a certain threshold value, increases. For example for , , and , this threshold is (see Figure 5(b)). For however, even for , remains small (very close to zero). The same behaviour exists for : for given and , the expected number of steps to the first win of the adversary decreases as increases. However, decrease in security is negligible if is sufficiently large (see Figure 5(a)).

##### 6.2. Case C2

The stationary distribution for given , and is given in Figure 6. We consider three different values of , and .

We also show our results for in Figure 7. The figures demonstrate that the defender achieves better security by choosing higher ’s.

Figure 6 shows that, for fixed values of , and , the security of the system increases with the increase in , represented by the reduction in . We also observe that, similar to the case C1, for fixed , , and , we gain more security with increasing . Moreover, as can be expected, smaller values of let the defender reach higher security by increasing .

Figure 7 shows the results of our analytical calculation of . Again for fixed values of , and , the security of the system increases with the increase in , and this is shown by the increase in . It can also be seen that for fixed values of , , and , more security can be gained with the increase in . The figures also indicate that smaller values of allow the defender to increase the security of the system by increasing .

As discussed in Section 7, we can calculate the cost of being a risk-averse adversary in terms of . Figure 8 shows the behaviour of this cost (penalty) as a function of .

#### 7. Utilities

There are costs and gains associated with the defender’s and the attacker’s actions. Below we outline important aspects of the utilities of the two players and note that our basic modelling and numerical analysis could provide insight into better quantifying these utilities. More concrete analysis and estimation of utilities require considering specific realizations of path hopping systems and more detailed modelling and numerical computations.

*Defender’s costs in state* are denoted by and for no action and action , respectively.

(i) is the cost of the defender if they do not move. In this case, the defender does not need to bear any resource cost; however, the chance of the attacker winning in the next time step would increase because the attacker may capture additional target paths in the next time step and the state will change to . The defender, however, does not know the exact state of the system, , and so their cost would be an expected value, where expectation will be taken over the stationary probabilities of the system.

(ii) is the cost of randomization. This will be a fixed cost (that could depend on the number of paths that will be randomized in one time slot) associated with redirecting traffic, possible packet losses, and delays to reestablish the paths.

*Attacker’s costs in state* are denoted by and for no action and action , respectively.

(iii) is the cost of the attacker not taking action. They do not need to spend resources, but their chance of success would decrease because the defender’s action in the next time step may result in one of the target paths in being removed from the target path set. As discussed below, this cost can be estimated in terms of the increase in .

(iv) is the cost of launching the attack in state and indicates the resources that the attacker must spend to realize the attack. This cost would be fixed as long as the attack rate is below and will increase if the attack rate is above .

One can also consider gains associated with actions of the attacker and the defender. Utilities of the players will be a function of the costs and the gains.

*Estimating *. A risk-averse adversary will use an attack rate below and so, with probability , there is no action from the attacker. Intuitively, no action means that the attacker will have a reduced success chance in breaking security. This can be quantified by the larger expected number of time steps to win for the first time. For example, consider , , and . A risk-averse adversary who moves with probability will have ; however, for the same values of and , a risk-taking attacker with will have . This shows that a risk-averse adversary is paying the penalty of being risk-averse and (most likely) must wait for its first win much longer than a risk-taking adversary. By defining the cost of being risk-averse as for a risk-averse adversary with probability of moving , we can graph the behaviour of this cost (penalty) as a function of .

Figure 8 shows that a smaller (being more risk-averse) incurs a higher cost, and as increases, the cost decreases. However, as the defender increases , the available attack rate of the attacker () decreases, and after a certain threshold value of , the cost of being risk-averse decreases and becomes 0 when .

#### 8. Concluding Remarks

We introduced path hopping as an approach to providing efficient long-term cryptographic security for communication against an adversary with access to a quantum computer. We considered a general class of dynamic strategies in which the attacker’s and the defender’s interaction can be modelled as a Markov chain, and gave a detailed analysis of . Our work opens new directions for future work, including considering a more complex set of actions for the players: for example, allowing the defender and the attacker to choose and in each state, and/or using different values of and in different states. Including the actual costs of the defender and the attacker in the modelling and analysis will reveal the limits of randomization strategies in practice. Another important question is how to efficiently provide sufficient shared randomness for the sender and the receiver, to be used in the path hopping algorithm. For our analysis, we assumed that a sufficient number of random bits has been shared by the sender and the receiver through another protocol before transmission starts. Sharing true random bits without computational assumptions requires sharing a random pad, which would restrict the application of the system in practice. Efficient sharing of randomness can be achieved by using a pseudorandom number generator and sharing only an initial random seed. The security of the resulting system, however, needs to be analyzed to ensure that this will not affect the postquantum security of the system. This will be an interesting problem for future research. There are also similar games that can be modelled using this approach. For example, one can consider the path hopping game associated with the model in [21].

#### Disclosure

A preliminary version of this paper was presented at the 2017 Workshop on Moving Target Defense (MTD ’17), Dallas, Texas, USA, October 30, 2017.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This research was partially supported by the Natural Sciences and Engineering Research Council of Canada and by TELUS Communications.

#### References

1. P. W. Shor, “Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer,” *SIAM Journal on Computing*, vol. 26, no. 5, pp. 1484–1509, 1997.
2. R. Brandom, “Microsoft lab predicts a working quantum computer within 10 years,” 2015, http://www.theverge.com/2015/10/15/9539033/working-quantum-computer-prediction-ten-years-microsoft.
3. D. Evans, Top 25 Technology Predictions—Cisco Systems, 2009, https://www.cisco.com/c/dam/en_us/about/ac79/docs/Top_25_Predictions_121409rev.pdf.
4. S. Dahmen-Lhuissier, Quantum-Safe Cryptography, 2016, http://www.etsi.org/technologies-clusters/technologies/quantum-safe-cryptography.
5. L. Chen, S. Jordan, Y. Liu et al., NISTIR 8105 Draft—Report on Post Quantum Cryptography, 2016, http://csrc.nist.gov/publications/drafts/nistir-8105/nistir_8105_draft.pdf.
6. M. Braithwaite, Experimenting with Post-Quantum Cryptography, July 2016, https://security.googleblog.com/2016/07/experimenting-with-post-quantum.html.
7. J. W. Bos, C. Costello, M. Naehrig, and D. Stebila, “Post-quantum key exchange for the TLS protocol from the ring learning with errors problem,” in *Proceedings of the 36th IEEE Symposium on Security and Privacy (SP '15)*, pp. 553–570, IEEE, San Jose, Calif, USA, May 2015.
8. E. Brickell, “Intel strategy for post quantum crypto,” in *Proceedings of the 7th International Conference on Post-Quantum Cryptography*, (Invited Talk), Fukuoka, Japan, 2016, https://pqcrypto2016.jp/data/Brickell-Post_Quantum_Strategy-PQC_2016_final.pdf.
9. O. Regev, “On lattices, learning with errors, random linear codes, and cryptography,” *Journal of the ACM*, vol. 56, no. 6, article 34, 2009.
10. X. Zhou, L. Song, and Y. Zhang, *Physical Layer Security in Wireless Communications*, CRC Press, 2013.
11. A. D. Wyner, “The wire-tap channel,” *Bell Labs Technical Journal*, vol. 54, no. 8, pp. 1355–1387, 1975.
12. A. Herzberg, S. Jarecki, H. Krawczyk, and M. Yung, “Proactive secret sharing or: how to cope with perpetual leakage,” in *Advances in Cryptology—CRYPTO ’95*, pp. 339–352, Springer, Berlin, Germany, 1995.
13. H. Maleki, S. Valizadeh, W. Koch, A. Bestavros, and M. Van Dijk, “Markov modeling of moving target defense games,” in *Proceedings of the 2016 ACM Workshop on Moving Target Defense*, pp. 81–92, ACM, 2016.
14. A. Shamir, “How to share a secret,” *Communications of the ACM*, vol. 22, no. 11, pp. 612-613, 1979.
15. M. O. Rabin, “The information dispersal algorithm and its applications,” in *Sequences*, pp. 406–419, Springer, 1990.
16. H. Krawczyk, “Secret sharing made short,” in *Proceedings of the Annual International Cryptology Conference*, pp. 136–146, Springer, 1993.
17. A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran, “Network coding for distributed storage systems,” *IEEE Transactions on Information Theory*, vol. 56, no. 9, pp. 4539–4551, 2010.
18. D. Dolev, C. Dwork, O. Waarts, and M. Yung, “Perfectly secure message transmission,” *Journal of the ACM*, vol. 40, no. 1, pp. 17–47, 1993.
19. C. Fragouli, J.-Y. Le Boudec, and J. Widmer, “Network coding: an instant primer,” *ACM SIGCOMM Computer Communication Review*, vol. 36, no. 1, pp. 63–68, 2006.
20. M. Strasser, C. Pöpper, S. Capkun, and M. Cagalj, “Jamming-resistant key establishment using uncoordinated frequency hopping,” in *Proceedings of the IEEE Symposium on Security and Privacy (SP '08)*, pp. 64–78, Oakland, Calif, USA, May 2008.
21. H. Ahmadi and R. Safavi-Naini, “Multipath private communication: an information theoretic approach,” https://arxiv.org/abs/1401.3659.
22. K. Scott and J. Davidson, “Strata: a software dynamic translation infrastructure,” Tech. Rep., University of Virginia, Charlottesville, Va, USA, 2001.
23. N. Nethercote and J. Seward, “Valgrind: a framework for heavyweight dynamic binary instrumentation,” in *Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '07)*, pp. 89–100, ACM, San Diego, Calif, USA, June 2007.
24. H. Okhravi, A. Comella, E. Robinson, and J. Haines, “Creating a cyber moving target for critical infrastructure applications using platform diversity,” *International Journal of Critical Infrastructure Protection*, vol. 5, no. 1, pp. 30–39, 2012.
25. A. Saidane, V. Nicomette, and Y. Deswarte, “The design of a generic intrusion-tolerant architecture for web servers,” *IEEE Transactions on Dependable and Secure Computing*, vol. 6, no. 1, pp. 45–58, 2009.
26. H. Okhravi, T. Hobson, D. Bigelow, and W. Streilein, “Finding focus in the blur of moving-target techniques,” *IEEE Security & Privacy*, vol. 12, no. 2, pp. 16–26, 2014.
27. K. M. Carter, J. F. Riordan, and H. Okhravi, “A game theoretic approach to strategy determination for dynamic platform defenses,” in *Proceedings of the 1st ACM Workshop on Moving Target Defense (MTD '14)*, pp. 21–30, ACM, Scottsdale, Ariz, USA, November 2014.
28. R. Colbaugh and K. Glass, “Predictability-oriented defense against adaptive adversaries,” in *Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC '12)*, pp. 2721–2727, October 2012.

#### Copyright

Copyright © 2018 Reihaneh Safavi-Naini et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.