Security and Communication Networks

Volume 2018, Article ID 8475818, 15 pages

https://doi.org/10.1155/2018/8475818

## Path Hopping: An MTD Strategy for Long-Term Quantum-Safe Communication

^{1}University of Calgary, Calgary, AB, Canada; ^{2}Czech Technical University, Prague, Czech Republic

Correspondence should be addressed to Reihaneh Safavi-Naini; rei@ucalgary.ca

Received 9 December 2017; Accepted 13 March 2018; Published 7 May 2018

Academic Editor: Hamed Okhravi

Copyright © 2018 Reihaneh Safavi-Naini et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Moving target defense (MTD) strategies have been widely studied for securing computer systems. We consider using MTD strategies to provide long-term cryptographic security for message transmission against an eavesdropping adversary who has access to a quantum computer. In such a setting, today’s widely used cryptographic systems including Diffie-Hellman key agreement protocol and RSA cryptosystem will be insecure and alternative solutions are needed. We will use a physical assumption, existence of multiple communication paths between the sender and the receiver, as the basis of security, and propose a cryptographic system that uses this assumption and an MTD strategy to guarantee efficient long-term information theoretic security even when only a single path is not eavesdropped. Following the approach of Maleki et al., we model the system using a Markov chain, derive its transition probabilities, propose two security measures, and prove results that show how to calculate these measures using transition probabilities. We define two types of attackers that we call risk-taking and risk-averse and compute our proposed measures for the two types of adversaries for a concrete MTD strategy. We will use numerical analysis to study tradeoffs between system parameters, discuss our results, and propose directions for future research.

#### 1. Introduction

The cryptographic infrastructure of the Internet allows users across the world to establish private, authenticated communication channels and to interact securely. Shor’s discovery of a quantum algorithm that can efficiently solve the integer factorization and discrete logarithm problems [1], the two mathematical problems that underlie the security of the most prominent public key algorithms such as RSA public key encryption and Diffie-Hellman key agreement, effectively brings down this infrastructure once a sufficiently powerful quantum computer is available. The NSA’s recent call for quantum-safe cryptography and the prediction that significant progress in the development of quantum computers could be expected within fifteen years [2, 3] have created a flurry of activity in the research community, standardization bodies [4, 5], and major industries [6–8]. The main approaches to quantum-safe cryptography use (i) quantum cryptographic models and algorithms, (ii) cryptographic algorithms that rely on computational assumptions for which no efficient quantum algorithm is known [9], and (iii) cryptographic systems that do not use any computational assumptions. This last approach results in information theoretically secure systems and is followed in this paper.

A prominent and widely researched direction in information theoretically secure communication is *physical layer security*: systems that base security on assumptions about the physical environment [10]. In these systems the advantage of the sender and the receiver over the adversary is captured through the properties of the physical layer of communication. For example, in Wyner’s model [11] Alice is connected to Bob and Eve through two noisy channels, and the assumption is that Eve’s reception is noisier than Bob’s. The extra noise in Eve’s channel is the resource that can be used to secure communication against Eve without the need for a shared key. A unique property of this approach compared to computationally secure systems is that it provides *long-term security*, meaning that Eve’s transcript of the communication cannot be used for offline attacks. This is because security is due to the adversary’s lack of information and not to the adversary’s limited computation.

In this paper we assume there are multiple communication* paths* ( paths) between the sender and the receiver. A path is an abstraction of a channel and can have different realizations in practice. For example, in wireless communication, a path can correspond to a frequency that is used for transmission and reception by the sender and the receiver; and multiple paths are specified by a set of frequencies that will be used by the two. Or in a sensor network, a path consists of a sequence of nodes in the network which are used to send messages from the sender to the receiver. A similar notion of path can be defined for communication over the Internet. If the adversary can eavesdrop all the paths, secure communication without additional assumptions (e.g., quantum mechanical, computational or other physical layer assumptions) is impossible. We assume that although the set of all paths (e.g., possible frequencies) is known to the attacker, they cannot eavesdrop all the paths at the same time.

To provide cryptographic security against an attacker whose goal is to learn the sent message, the sender can use an -secret sharing (see Section 2) to construct shares of the message and send each share along a path. This is an inefficient solution with communication rate of (i.e., for one bit of information, the sender must send bits). If the number of paths that the adversary can simultaneously eavesdrop is bounded by , the sender and receiver can select a random set of paths and use a -secret sharing. Note that is not necessarily equal to , but to keep the introductory discussion simple, let ; we will later discuss the relation between and in Section 3. If the set of paths is kept fixed, the attacker will discover them over time and will be able to learn the message. We propose “path hopping,” where the sender and the receiver regularly change (“hop”) one or more paths that have been chosen for communication. We also allow the attacker to change their selected paths. We model and analyze dynamic behaviour of the system and show that it results in efficient cryptographic security using an MTD strategy.

##### 1.1. Our Work

Alice wants to send a stream of data to Bob. The adversary is computationally unlimited (no computational assumption is made). Alice and Bob are connected by a set of communication paths, up to of which can be eavesdropped by the adversary at the same time. The adversary probes a path to determine if it carries data, and if it does, captures it (e.g., scanning a port on a server and, if it is open, later breaking into the path). The adversary is *mobile* [12] and can move in the network in the sense that, in each time step, it can release a captured path and capture a new one. Hence, it can eavesdrop different sets of paths during different time periods. *We assume the attacker can eavesdrop up to ** paths, but only eavesdrops those that carry data, and not the ones that are not in use between Alice and Bob* (one can consider a case where the attacker always eavesdrops on paths; although the analysis approach will be similar, the actual calculations will be different). Alice uses (out of ) paths at each time and the attacker needs to know all the paths (targets) to be able to determine the message.

Time is divided into fixed consecutive intervals, each referred to as a* time step*. In each time step,* defender* (which includes Alice and Bob),* attacker,* or none of them takes an action (move). Using the MTD framework of [13], the combination of the attacker’s and the defender’s actions in each time step can be modelled as a* Markov chain*. We define Markov chain where (exactly) and paths are hopped by the defender and the attacker, respectively, and derive transition probabilities of the chain. (One can also consider Markov chain where in each time step, the defender randomizes* up to* paths and the attacker hops and probes* up to* paths. We leave this for future work.) For our concrete analysis we focus on where the defender’s and the attacker’s actions involve a single path, only.

In each time step, the system can be in one of the *states* labeled by , where state 0 is the starting state of the system, state is the *winning state* for the attacker, and in state , the attacker has captured paths. We assume the defender leads in each time step and *moves* with a fixed probability . The attacker also moves in the same interval with a fixed probability that is upper bounded by (the model assumes that, in each time step, the attacker and the defender do not act simultaneously). Note that can be chosen by the adversary, knowing the upper bound. We define a *risk-taking* and a *risk-averse* attacker, depending on their choice of . A risk-taking adversary chooses the highest available attack probability, that is, . A *risk-averse* attacker would like to stay undetected and so limits their action rate to a threshold that is determined by the intrusion detection system (IDS) of the defender. Thus, in the case of a risk-averse adversary, in each time step, there is a probability that no one moves.

We model the system as a Markov chain, and in Section 3, derive the transition probabilities of the Markov chains associated with the risk-taking (case C1) and the risk-averse (case C2) attackers.

*Security Measures*. We use two security measures to evaluate the effectiveness of a path hopping strategy: (i) the expected number of times that the attacker reaches the winning state in time steps, assuming that it starts from state 0, and (ii) the expected number of time steps to enter the winning state for the first time, assuming that the attacker starts from state 0. The two measures are denoted by and , respectively.

These security measures capture security requirements of different scenarios. is appropriate for data streams for which sporadic access to different parts of the stream may be tolerated. For example, small excerpts of a large file are not expected to leak much information about the file. Theorem 3 shows that is upper bounded by the product , where is the th component of the stationary probability distribution of the Markov chain. This suggests that * can be used to represent **, with higher values corresponding to less security.*

is appropriate for highly sensitive data streams that must stay strictly inaccessible to the adversary and the sender wants to ensure that the expected number of time steps to the first compromise is sufficiently high (possibly higher than the length of the stream). Theorem 4 shows that can be calculated by solving a set of linear equations whose coefficients are derived from the transition probabilities of the Markov chain. We use * as a security measure with higher values corresponding to higher security*.

*Numerical Results*. Deriving closed form expressions for and is a challenging task. For , we use numerical calculations to study variations of security measures for different values of system parameters. Our results are given in Section 6. They show the following:

(1) For fixed and , security increases (i.e., decreases, and increases) as , the defender’s probability of action, increases (Figures 3 and 4 for C1 and Figures 6 and 7 for C2).

(2) For fixed , security can be maintained by increasing , even when (Figure 5) and in all cases communication rate is .

Figure 5 also shows that, for given values of and , as increases, security initially increases, then reaches a plateau, and then starts to decrease. This is because when is small (relative to ), the target paths are hidden among many available paths and the chance of correctly guessing a path is small. However, when is large (relative to ), the attacker’s probability of guessing correctly increases. Interestingly, this point of saturation increases as , which represents the variability of the system, increases. This graph can be used to select the optimal value of to provide maximum security, while achieving the highest communication rate.

(3) Using numerical analysis, one can estimate the *cost of being risk-averse* in terms of decrease in or increase in . In Section 7 we show that an adversary who chooses not to use all their attacking power (although they can act with probability , they choose ) will effectively reduce the expected number of times that they will occupy the winning state (proportional to ) and will have higher .

*Attack Costs*. Our model focusses on the defender’s ability to provide security by making the* physical environment* dynamic and does not consider the associated costs. Attacker and defender’s actions have payoffs. The attacker needs to spend resources to launch attacks and also bear consequences of being detected. The defender must spend resources to implement the randomization strategy. This introduces side effects such as packet loss and communication delays that are a function of the rate of randomization (captured with parameter ). The attacker’s reward of their action is related to getting closer to the winning state, and the defender’s reward is preventing the attacker reaching the final state. In Section 7 we discuss these payoffs. We also use our numerical calculation results to quantify the cost of being risk-averse.

*Randomness Requirements of the System*. Our proposed system assumes that the sender and the receiver share the set of target paths that is used for communication in each time step.

In practice if one can assume that the receiver will receive on all paths all the times, then no shared randomness is required: the sender will hop the paths and the receiver will receive the content on the target paths used in each time step. If receiver has the same restriction as the sender on the number of target paths, that is, the receiver can only receive on paths (e.g., cost or restriction on the receiving equipment), then the sender and the receiver need shared randomness to simultaneously hop the paths. This can be realized in two ways: (i) using a preshared random string or (ii) employing a secure pseudorandom generator to extend an initial shared random seed.

The adversary’s view of the system in state , in addition to the eavesdropped shares that are sent over the target paths, includes the labels of those target paths. In case (i), the sequence of random numbers associated with the labels of target paths will not reveal any information about future values of the sequence of target paths, and so future path labels will remain unpredictable. In case (ii), however, each observation (of a target path) will leak information about the seed of the pseudorandom (PR) generator, and one needs to use a PR generator with an appropriate security level (e.g., a quantum-safe PR generator built from a secure block cipher). Note that the MTD system will retain its long-term security: although the recorded transcript of communication may reveal the seed of the PR generator in an offline attack, it will not contain enough information about the communicated message.

##### 1.2. Related Work

Breaking information into shares, to provide confidentiality and reliability, has been used in many cryptographic systems such as secret sharing [14] and information dispersal [15], in information theoretic setting, as well as computational setting [16]. These algorithms have been used in distributed storage systems [17] and are the building blocks of Secure Message Transmission [18] and network coding [19] which use multiple paths between the sender and receiver for providing* security and reliability*.

Uncoordinated Frequency Hopping (UFH) [20] has similarity to our work. In UFH the sender and receiver send and receive on two independently chosen subsets of frequencies, and the eavesdropper uses a third subset of frequency for eavesdropping. Authors show that, assuming public key infrastructure, one can communicate securely and reliably in this setting. The work of [21] uses a similar abstract model to construct information theoretic protocols for secure communication, without requiring public key infrastructure. The communication rate in this latter construction, however, is very low. Our approach is* coordinated* path hopping where the sender and receiver share an initial secret key that can be established using the scheme of [21]. We leave the analysis of the secret key requirement of our system and in particular efficient ways of generating new keys at the required hopping rate for future work.

Using diversity and introducing dynamic properties has been widely used in security systems. System properties that can be diversified and randomized include program instructions [22, 23], operating system distributions [24], and systems [25]. A comprehensive study of various methods is given in [26]. Using game theory for analyzing attackers’ strategies in dynamic systems has been studied in [27, 28].

*Organization*. Section 2 recalls the MTD Markov chain framework. Section 3 presents our path hopping model. Security analysis and measures for our model are introduced in Section 4. Sections 5 and 6 present our simulation results for the game. Sections 7 and 8 cover utility discussions and our concluding remarks.

#### 2. Preliminaries

We recall the basic MTD Markov chain framework that is used in our work and review construction and properties of secret sharing schemes.

*MTD Markov Framework [13]*. The system is defined by the interaction between a *defender* and an *attacker*. The defender and the attacker each have a set of possible actions, denoted by and , respectively; in both shows no action. Time is divided into time steps. In each time step the system is in one of the possible states. In each time step, the defender and the attacker get a turn to move, and the state change probability is determined by their chosen actions and their results. A *strategy* of a player determines all actions taken by the player at all points of the game. Using a *Markov model* allows the player’s strategy to depend only on the state that the system is in, independently of the history of how the system reached that state.

*Definition 1. *An -MTD game is defined by a transition matrix which describes a Markov chain of state transitions that reflects both defender and attacker moves. Initially the game starts in the state 0. At each next time step the game transitions from its current state to a new state with probability .

The state is the winning state from the adversary’s view (defender losing the game). Initially the system is in state 0 (from both attacker and defender’s view point). In each time step the defender takes an action according to matrix with probability , the attacker takes an action according to with probability , and with probability , both remain without any action.

*Definition 2. *An -MTD Game is defined by(1)parameters and that satisfy ; the parameters represent the rate of defender’s and the attacker’s play, respectively;(2) transition matrices and ; for , (or ) represents the probability of transitioning from state to state when the defender (or the attacker) plays a move in state .

Thus, in each time step a three-sided coin is tossed, and for each side, the corresponding action is realized, and we have the transition matrix where is the identity matrix.

A Markov chain is* irreducible* if each state can be reached from any other state. A Markov chain is* aperiodic*, if all the states have period 1 where the period of state is defined as , where is the random variable describing the state of the game after steps. The two properties together guarantee the existence of a limiting* stationary distribution*, where .
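The three-sided coin and the stationary distribution can be illustrated with a small Python sketch. It mixes a defender matrix and an attacker matrix with the identity and approximates the stationary distribution by power iteration. The names `combine`, `stationary`, `lam`, and `mu` are hypothetical, and the two-state matrices are toy values chosen for illustration, not taken from the paper.

```python
def combine(D, A, lam, mu):
    """Mix defender matrix D and attacker matrix A with the identity.

    lam and mu are hypothetical names for the defender's and the attacker's
    per-step move probabilities; with probability 1 - lam - mu nobody moves.
    """
    idle = 1.0 - lam - mu
    if lam < 0 or mu < 0 or idle < -1e-12:
        raise ValueError("lam and mu must be non-negative and sum to at most 1")
    n = len(D)
    return [[lam * D[i][j] + mu * A[i][j] + (idle if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def stationary(M, tol=1e-12, max_iter=100000):
    """Approximate pi with pi = pi M by repeated multiplication.

    Assumes the chain is irreducible and aperiodic, so the iteration converges.
    """
    n = len(M)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * M[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


# Toy two-state game: a defender move always resets to state 0,
# an attacker move always reaches the winning state 1.
D = [[1.0, 0.0], [1.0, 0.0]]
A = [[0.0, 1.0], [0.0, 1.0]]
M = combine(D, A, lam=0.5, mu=0.25)
pi = stationary(M)
```

In this toy example the last component of `pi` is the long-run fraction of time steps spent in the winning state.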

*Secret Sharing*. A -secret sharing is a cryptographic primitive [14] that divides a secret into shares, each given to a party, satisfying two properties: (i) *reconstructability*, which means that the shares of all parties together perfectly reconstruct the original secret, and (ii) *perfect secrecy*, which means that if a single share is missing, the secret remains perfectly uncertain. A secret sharing scheme provides two algorithms for *share generation* and *secret reconstruction*. Let , where is the set of integers modulo , denote the set of secrets, and assume that all secrets are equally likely . The share generation algorithm takes a message as input and generates shares as follows. For , it randomly chooses an element in . Then the shares of the secret are . It is easy to see that shares recover the secret (finding the sum modulo ), and even if shares are known, the secret remains completely uncertain.
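A minimal sketch of such an additive scheme in Python is given below. It assumes the standard construction in which all shares but one are uniformly random and the last share is chosen so that the sum modulo the modulus equals the secret. The function names and the parameters `n` (number of shares) and `q` (modulus) are hypothetical labels, not notation confirmed by the text.

```python
import secrets


def make_shares(secret, n, q):
    """Split `secret` (an integer in 0..q-1) into n additive shares modulo q."""
    if n < 1 or q < 2 or not 0 <= secret < q:
        raise ValueError("need n >= 1, q >= 2 and 0 <= secret < q")
    # The first n-1 shares are uniformly random elements of Z_q.
    shares = [secrets.randbelow(q) for _ in range(n - 1)]
    # The last share fixes the total so that the sum is the secret modulo q.
    shares.append((secret - sum(shares)) % q)
    return shares


def reconstruct(shares, q):
    """Recover the secret as the sum of all shares modulo q."""
    return sum(shares) % q


shares = make_shares(123, n=5, q=257)
recovered = reconstruct(shares, q=257)
```

With any single share missing, the remaining values are consistent with every possible secret, which is the perfect secrecy property the path hopping scheme relies on.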

#### 3. The MTD Game of Path Hopping

We consider the setting described in Section 1.1: there is a message source that generates a stream of data that must be protected against an eavesdropper. There are communication paths that connect the sender to the receiver. To protect message transmission against an eavesdropper who can simultaneously eavesdrop up to paths (), the sender does the following: (i) randomly chooses a subset of available paths; (ii) uses a secret sharing to construct shares for the message; and (iii) sends each share on one of the selected paths. The chosen paths are also called *target paths*. The receiver knows the paths that are used by the sender in each time step. If the adversary eavesdrops only a subset of the target paths (and not all target paths), because of the perfect secrecy of the -secret sharing scheme, the attacker will stay completely uncertain about the message.

We assume that the attacker will not keep a path that is not carrying data. That is, because of the limitation on the number of paths that they can simultaneously eavesdrop, they prefer to release a path that is not used in the current time step and wait for the next time step to try again, noting that, due to their probabilistic strategy, there would be a chance to try the released path in the next time step again. To simplify our analysis, we first consider the case that . For the cases that or , similar analysis can be used; we omit details because of space.

To protect against this adversary, in each time step, the sender and receiver will* hop* one or more of the target paths, noting that lacking access to even one of the target paths will leave the adversary completely uncertain. We will use the MTD game framework of Definition 2 and model the problem as a dynamic system (game) influenced by (between) two* players,* a* system defender (or simply defender)* that includes the* sender* and the* receiver* and an* attacker*. The attacker wins the MTD game (in each time step) if they find the target paths.

##### 3.1. Games

In each time step, the defender can randomize a subset of target paths. Similarly, the adversary can simultaneously probe paths.

We first describe the Markov chain associated with the game, then derive transition probabilities of , and finally present a detailed analysis of .

##### 3.2. Markov Chain

The set of the defender’s and the attacker’s actions is and , respectively, where and are defender and attacker actions and is no action. Let denote the set of current target paths and denote the subset of target paths known to the adversary.

*Defender’s Move*. The defender cannot determine with certainty if a path is being eavesdropped. We thus consider a defender who, in all time steps, plays a* memoryless strategy*. That is the defender plays (issues the move ) with probability , irrespective of any learnt information about the attacker’s state, or own history of actions. When the defender plays in state , they will choose a subset of the current target paths and replace the paths in with a randomly selected subset of () nontarget paths.

The chosen paths in may belong to (attacker’s known path in state ) or be outside it.

*Attacker’s Move*. The attacker is adaptive. In state , the set of target paths that is known to the adversary is of size . The adversary randomly selects a subset of size of () possible target paths and keeps the message carrying paths and releases the rest. For the adversary, all paths that are not in their set of known target paths have the same probability of being a target path.

We assume that, in state , as soon as the defender reallocates a target path that is in , the attacker can detect the change (the path is not one of the target paths). However this will not affect the adversary’s action at this state simply because they know that those paths are not possible target paths.

*No Move*. Because the defender and the attacker both act probabilistically, there are time steps in which neither of them issues a move.

In a time step, if the attacker does not issue an action, they will bear the risk of potentially losing one of their known target paths during the next time step. This is because the defender will play a memoryless strategy and will move with probability . This extra risk would translate into a higher probability of not being able to reach the winning position of the game.

To reduce the probability of losing a target path while waiting, the attacker should act when possible and use the available action rate. We refer to this attacker as a* risk-taking* attacker as they focus on maximizing their winning chance. More frequent attacks however have the risk of triggering alarm in the defender’s intrusion detection system (IDS), tightening security, and reducing access to the system. Let be a threshold that is used by the defender’s IDS to raise the threat level of the system. To avoid reduction in accessing the system, the attacker may prefer to keep their attack rate below . We refer to this attacker as a* risk-averse* attacker.

The defender plays memoryless with probability , and so in each time step the attacker moves with probability

There will be no move by any of the players in a time step, with probability . Thus the system transition matrix will be

Equation (2) shows that, depending on the value of (the attack detection threshold of the defender), we have two cases: C1: . In this case from (2), we have and C2: . In this case from (2), we have and

We refer to C1 and C2 as risk-taking and risk-averse attacker, respectively.
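The two cases can be sketched in Python under an explicit assumption, since the displayed equations are not reproduced in this copy: the attacker's move probability is taken to be the smaller of the probability left over by the defender and the IDS threshold. The names `lam` (defender's move probability), `tau` (IDS threshold), and `mu` (attacker's move probability) are hypothetical.

```python
def move_rates(lam, tau):
    """Return (mu, idle) for defender probability lam and IDS threshold tau.

    Assumption: mu = min(1 - lam, tau). The remaining probability mass
    idle = 1 - lam - mu is the chance that nobody moves in a time step.
    """
    if not 0.0 <= lam <= 1.0 or tau < 0.0:
        raise ValueError("need 0 <= lam <= 1 and tau >= 0")
    budget = 1.0 - lam
    mu = min(budget, tau)
    return mu, budget - mu


def attacker_case(lam, tau):
    """Label the case: C1 if the threshold does not bind, C2 otherwise."""
    return "C1" if tau >= 1.0 - lam else "C2"


c1_mu, c1_idle = move_rates(lam=0.3, tau=0.9)   # threshold does not bind
c2_mu, c2_idle = move_rates(lam=0.3, tau=0.2)   # attacker stays below the threshold
```

Under this reading, the no-move probability is zero in case C1 and strictly positive in case C2.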

##### 3.3. Transition Probabilities of

In state , the attacker knows target paths in . A state transition that starts from state is in general the result of the combination of the defender’s and the attacker’s actions in the following time step. A defender’s action reallocates target paths and (since the attacker only holds target paths) can cause state to change to where . An attacker’s action, however, could result in more target paths being captured and so change the state to . The state will not change, that is, it stays at , as a result of the defender’s *or* the attacker’s action *or* no move at all. In the following, we obtain transition probabilities (starting from state ) for (i) the defender’s move and (ii) the attacker’s move and combine them to obtain the transition probabilities of the chain. For the case of “no move” (which happens with probability ) the state of the game will not change.

*Defender’s Move in State *. Defender chooses a set of paths from the set of current paths and replaces them with a set of paths chosen from the candidate target paths .

Let be the intersection of and the adversary’s set of captured paths, and let . We note that . Thus the state of the game after the defender’s action will be (because target paths have been removed from ) and we have

Note that, for , the state of the game will not change.

*Attacker’s Move in State *. Attacker holds the target paths in and knows the state of the game. The attacker will choose a set of paths from , the set of available candidate target paths.

Let be the intersection of and the defender’s set of target paths that are not captured yet, and let . We have . With the new captured target paths, the state of the game will become and we have

For , the state of the game will not change.

*Transition Probability from State ** to *. Transition probabilities from state to will be calculated using (6) and (7) and . We note that transitions with occur only due to the defender’s move, and transitions with occur due to the attacker’s move. The state remains unchanged as a result of the defender’s move, the attacker’s move, or no move, with probabilities , and , respectively. Thus we have the following transition probabilities:

Here and are defined in (6) and (7), respectively.

The above probabilities show that in each time step a state can be changed to up to other states or stay the same.
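The counting argument above can be sketched in Python. Since the displayed formulas are not reproduced in this copy, the sketch assumes the natural hypergeometric reading of the text: the defender replaces a uniformly random subset of the target paths, and the attacker probes a uniformly random subset of the paths it does not already hold. All names are hypothetical: `N` (total paths), `K` (target paths), `k` (paths hopped by the defender), `m` (paths probed by the attacker), `s` (current state), `lam` and `mu` (move probabilities).

```python
from math import comb


def defender_row(s, K, k):
    """Next-state distribution when the defender hops k of the K target paths.

    i of the replaced paths fall among the attacker's s captured paths with
    probability C(s,i) * C(K-s,k-i) / C(K,k); the state then drops to s - i.
    """
    row = [0.0] * (K + 1)
    for i in range(max(0, k - (K - s)), min(k, s) + 1):
        row[s - i] += comb(s, i) * comb(K - s, k - i) / comb(K, k)
    return row


def attacker_row(s, N, K, m):
    """Next-state distribution when the attacker probes m of the N - s paths
    it does not hold; j of them hit the K - s uncaptured target paths."""
    row = [0.0] * (K + 1)
    for j in range(0, min(m, K - s) + 1):
        row[s + j] += comb(K - s, j) * comb(N - K, m - j) / comb(N - s, m)
    return row


def transition_matrix(N, K, k, m, lam, mu):
    """Combine both moves and the no-move case into one (K+1) x (K+1) matrix."""
    if not (1 <= k <= min(K, N - K) and 1 <= m <= N - K):
        raise ValueError("need 1 <= k <= min(K, N-K) and 1 <= m <= N-K")
    idle = 1.0 - lam - mu
    if lam < 0 or mu < 0 or idle < -1e-12:
        raise ValueError("lam and mu must be non-negative and sum to at most 1")
    M = []
    for s in range(K + 1):
        d, a = defender_row(s, K, k), attacker_row(s, N, K, m)
        M.append([lam * d[t] + mu * a[t] + (idle if t == s else 0.0)
                  for t in range(K + 1)])
    return M


M = transition_matrix(N=10, K=3, k=1, m=1, lam=0.4, mu=0.6)
```

Each row of the resulting matrix sums to one, and entries below the diagonal come only from the defender while entries above it come only from the attacker, in line with the description above.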

##### 3.4. System Parameters

The Markov chain that models the system is determined by the parameters , and . In the following we will define security measures for the system and prove Theorems that relate these measures to the system parameters.

#### 4. Security Analysis

We use two security measures related to the success criteria of the attack.

##### 4.1. Expected Number of Compromises

Consider the system over a period of time steps, starting from the state 0. Within these time steps, the expected number of times that the system will be in the compromised state, that is, the attacker is able to learn the message, is an important security measure. (Note that one can use coding strategies [15] to spread information over longer sequences, and so estimating the expected number of compromises provides the required parameter for encoding.)

Theorem 3. *For an MTD game of path hopping with transition matrix and stationary distribution , where is the winning state, , which is the expected number of times the adversary wins in the first time steps, is less than or equal to , assuming that the game starts with the distribution.*

*Proof. *The game starts at . Let denote the expected number of times the attacker wins in the first time steps.

We first assume the attacker’s starting position is chosen according to the stationary distribution . Our goal is to find the expected number of times the attacker wins in steps, starting with this distribution.

Let be an indicator variable that takes the value 1 if the attacker wins in the time step and zero otherwise. Note that, starting from stationary distribution , the distribution of next step position of the attacker is , and so each has identical distribution .

The random variable is the number of times that the attacker wins in time steps. Noting the linearity of the expectation function , that is, , we have

The adversary will have zero chance of winning in the first time steps if they start with the 0 distribution, and so

The last step of the argument assumes that, starting from the initial distribution , is monotonically increasing (in fact a weaker assumption would suffice for this last step of the argument, which is for all ) in each step of the chain until it reaches .

In our numerical computation we use to represent this security measure.
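The bound of Theorem 3 can be checked numerically on a small example. The Python sketch below propagates the state distribution from state 0, accumulates the probability of being in the winning state at each step, and compares the total with the number of steps times the stationary winning probability. The function names, the horizon `T`, and the toy three-state matrix are illustrative assumptions, not values from the paper.

```python
def step(dist, M):
    """One step of the chain: returns the row vector dist times M."""
    n = len(M)
    return [sum(dist[i] * M[i][j] for i in range(n)) for j in range(n)]


def expected_wins(M, T):
    """Expected number of the first T time steps spent in the last
    (winning) state, starting from state 0."""
    dist = [1.0] + [0.0] * (len(M) - 1)
    total = 0.0
    for _ in range(T):
        dist = step(dist, M)
        total += dist[-1]  # probability of being in the winning state now
    return total


def stationary_win_probability(M, burn_in=10000):
    """Approximate the stationary probability of the winning state by running
    the chain for many steps (assumes an irreducible, aperiodic chain)."""
    dist = [1.0] + [0.0] * (len(M) - 1)
    for _ in range(burn_in):
        dist = step(dist, M)
    return dist[-1]


# Toy three-state chain; the numbers are illustrative only.
M = [[0.80, 0.20, 0.00],
     [0.15, 0.70, 0.15],
     [0.00, 0.30, 0.70]]
T = 50
wins = expected_wins(M, T)
bound = T * stationary_win_probability(M)
```

For this chain the exact expectation `wins` stays below `bound`, and the gap reflects the early time steps in which the attacker has not yet had a chance to capture all target paths.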

##### 4.2. Expected Number of Steps to the First Time Win

Our second security metric is the* expected number of steps to first time compromise*. This is an important measure for defender to estimate unbreakability of the system and for the attacker to estimate the work (in terms of the number of time steps that could be translated into attacker’s cost) needed to break the system. This measure can be calculated by solving a set of linear equations.

Theorem 4. *Consider an MTD game with transition matrix . Let denote the expected number of time steps to reach the state (the winning state) for the first time, if the game has started with state . We have*

*where for all and is the same matrix as with the last (th) row removed.*

*Proof.* We consider an attacker that starts in state 0. Let denote the expected number of time steps to compromise the system. Let denote the expected number of time steps to reach state (the winning state) for the first time, if the game starts at state . In both MTD games, starting from state 0, the next state will be state 0 or 1, and so we can write

That is, the expected number of time steps to reach state from state 0 is one (time step) more than the weighted average of the expected number of time steps to reach state from state 0, if the next move was to state 0, and the expected number of time steps to reach state from state 1, if the next move was to state 1. We can write similar equations for all states except state , for which . Let and be column vectors such that and for all . The set of equations can be written as where is with the th row removed. This linear equation can be solved for for any given and the first element of is our desired security metric. In Section 6 we will present graphs of for various game settings.
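The linear system in the proof can be solved directly. The following Python sketch uses an equivalent formulation: because the expected time from the winning state is zero, it restricts the transition matrix to the non-winning states and solves for the remaining unknowns by Gauss-Jordan elimination. The function names and the toy matrix are hypothetical and chosen only so that the answer can be verified by hand.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-15:
            raise ValueError("singular system: winning state may be unreachable")
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                for j in range(c, n + 1):
                    aug[r][j] -= f * aug[c][j]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def hitting_times(M):
    """Expected number of steps to first reach the last state of M,
    returned for each starting state 0 .. K-1 (K = index of the last state).

    Solves h_i = 1 + sum_j M[i][j] * h_j over non-winning states j,
    using h_K = 0 for the winning state.
    """
    K = len(M) - 1
    A = [[(1.0 if i == j else 0.0) - M[i][j] for j in range(K)]
         for i in range(K)]
    return solve(A, [1.0] * K)


# Toy chain: h_0 = 1 + 0.5 h_0 + 0.5 h_1 and h_1 = 1 + 0.5 h_0.
M = [[0.5, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0]]
h = hitting_times(M)  # h[0] is the metric for an attacker starting in state 0
```

Solving the two equations in the comment by hand gives 6 steps from state 0 and 4 steps from state 1, which matches what the solver returns up to rounding.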

#### 5. The Game

To better understand the relationship between parameter values (), in the following we will focus on where .

That is, in state , the defender moves by randomly choosing one of the paths in and swapping it with a randomly chosen path from the paths in . If the random choice from is one of the paths in that is known to the adversary, the adversary loses one of their captured paths, and the state will move to state , and this happens with probability . Otherwise, if the defender’s selected path is not in the system stays in the same state and this has probability .

The attacker will randomly choose one of the possible paths. The new path will be a new target path with probability and so the system will move to the state with this probability. On the other hand with probability , the selected path will not be a target path and with this probability the system will remain in state .
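A short Python sketch of the single-path game follows. The probabilities are inferred from the description above, because the formulas themselves are not reproduced in this copy: a defender's swap hits one of the attacker's captured paths with probability `s / K`, and an attacker's probe hits an uncaptured target path with probability `(K - s) / (N - s)`. The names `N`, `K`, `s`, `lam`, and `mu` are hypothetical labels for the total number of paths, the number of target paths, the current state, and the defender's and attacker's move probabilities.

```python
def single_hop_matrix(N, K, lam, mu):
    """Tridiagonal transition matrix of the single-path game on states 0..K."""
    if not 1 <= K < N:
        raise ValueError("need 1 <= K < N")
    if lam < 0 or mu < 0 or lam + mu > 1.0 + 1e-12:
        raise ValueError("lam and mu must be non-negative and sum to at most 1")
    M = [[0.0] * (K + 1) for _ in range(K + 1)]
    for s in range(K + 1):
        down = lam * s / K             # defender swaps out a captured path
        up = mu * (K - s) / (N - s)    # attacker's probe finds a new target path
        if s > 0:
            M[s][s - 1] = down
        if s < K:
            M[s][s + 1] = up
        M[s][s] = 1.0 - down - up      # unsuccessful moves and no-move steps
    return M


# Illustrative parameter values, not taken from the paper.
risk_taking = single_hop_matrix(N=10, K=3, lam=0.4, mu=0.6)  # mu = 1 - lam
risk_averse = single_hop_matrix(N=10, K=3, lam=0.4, mu=0.2)  # mu < 1 - lam
```

The only difference between the two matrices is the attacker's rate: lowering `mu` shrinks every upward transition and moves the freed probability onto the diagonal.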

##### 5.1. Case C1: Risk-Taking Adversary

State transition probabilities are given by (4). Figure 1(a) shows state transition probabilities due to the attacker’s and the defender’s actions. The transition probabilities on the upper part of the figure are due to the attacker’s action. Figure 1(b) shows the combined transition probabilities.