Abstract

An encryption/decryption approach is proposed for one-way communication between a computationally powerful transmitter and a receiver with limited computational capabilities. The proposed encryption technique combines traditional stream ciphering with simulation of a binary channel that degrades the channel input by inserting random bits. A statistical model of the proposed encryption is analyzed from the information-theoretic point of view. In this model, an attacker faces the problem of observing the messages through a channel with random bit insertion. The paper points out a number of security-related implications of the considered channel. These implications are addressed by estimating the mutual information between the channel input and output and by estimating the number of candidate channel inputs for a given channel output. It is shown that deliberate, secret-key-controlled insertion of random bits into the basic ciphertext enhances the security of the resulting encryption scheme.

1. Introduction

It is well recognized that communications should be secure and accordingly encrypted in order to avoid misuse of the transmitted information. Consequently, contemporary cryptographic algorithms for encryption play a very important role in data communication systems for various areas of applications. A particular challenge is related to addressing the resource constrained environments, where the requirements include lightweight algorithms and hardware designs. To select a suitable encryption algorithm for an application or an environment, the algorithmic requirements as well as the implementation constraints have to be taken into account. This is also in line with a discussion recently reported in [1].

On the other hand, in a number of scenarios the communicating parties have very different capabilities: one party may have tiny capabilities and the other much higher ones. As an illustration, we point to a communication scenario in the Internet of Things (IoT) where a tiny machine (e.g., a tiny sensor) should communicate with a more powerful one (e.g., the sink of a sensor network or a gate). According to the current state of the art, the following two problems remain open: (i) developing encryption/decryption techniques which take into account the asymmetric capabilities of the entities involved in encryption/decryption and (ii) enhancing the cryptographic security of encryption in a lightweight and provable manner.

Consequently, in this paper we consider the problem of designing a dedicated encryption/decryption algorithm which fits into the communications scenarios which include the following: (i) a high performance computing party should deliver encrypted messages in a one-way communication scenario to a number of parties which have tiny computational capabilities; (ii) implementation limitations at the tiny entity imply employment of a lightweight keystream generator (from certain reported lightweight stream ciphers); (iii) developed encryption scheme should have enhanced security in comparison with the one offered by the employed keystream generator.

A certain number of reported encryption approaches jointly employ elements of traditional stream ciphers and elements of coding theory as well as features of certain communication channels (see, e.g., [28]), and this paper follows the same track. We consider an encryption approach which involves a communication channel with the synchronization errors which appear in the form of inserted bits. In this approach, the transmitting/encrypting side requires a source of random bits and capability to insert them between message bits. Under the assumption that the transmitter has a method to inform the intended receiver about the locations (and not necessarily the values) of the inserted random bits, the intended receiver can perform decimation (i.e., discard the inserted bits) of the obtained sequence so that it can be a subject of simple traditional decryption.

Summary of the Results. This paper focuses on the following two issues which have not been addressed in the literature: (i) development of an encryption/decryption technique which has asymmetric implementation complexity and provides lightweight decryption and (ii) security enhancement of the involved keystream generator employing the paradigm of binary channels with random insertions. An encryption/decryption technique for data transfer between a computationally powerful party and a party with limited computational capabilities is proposed which provides a trade-off between implementation complexities at the involved parties: the implementation overhead is reduced at the low-capability party at the expense of a higher (but still moderate) one at the party with high capabilities. In order to achieve security enhancement of the employed traditional keystream generator, the proposed encryption technique involves, at the transmitting side, a simulator of the binary channel with synchronization errors. The security enhancement achieved by the proposed scheme in comparison with the security of the employed keystream generator is based on the design paradigm and on results on the mutual information between inputs and outputs of channels with bit insertion.

Organization. The paper is organized as follows. In Section 2, we give the underlying ideas for the design and proposal of an encryption/decryption framework. In Section 3, we provide some information-theoretic results for the proposed scheme; that is, we mostly derive various mutual information rates of interest for the security evaluation. In Section 4, we provide the cryptographic security evaluation based on implications which link the information-theoretic quantities to computational complexity based ones. Accordingly, Sections 5 and 6 provide evaluation of the computational complexity security enhancement employing numerical estimation of the mutual information and enumeration of input candidates for the given output after a binary channel with insertion of random bits, respectively. (Also note that this paper is a significantly revised and expanded version of [8].)

2. A Proposal of a Dedicated Encryption Technique

This section proposes an encryption/decryption technique which provides asymmetric implementation complexity at the communicating parties and provably enhanced cryptographic security. Both asymmetric implementation complexity and enhanced security appear as a consequence of the design based on employment of a simulator for binary channels with insertion errors.

2.1. Underlying Ideas

Our main design goals/approaches could be summarized as follows: (i) enhance security based on information-theoretic and coding results over channels with synchronization errors; (ii) assuming that Party I is more powerful than Party II, move the more complex operations to the side of Party I without implications on the cryptographic security.

This paper proposes a stream cipher developed based on the following two construction principles: (i) adjustment of the construction to the asymmetric capabilities of the involved parties; (ii) employment of results regarding binary channels with insertion errors for enhancing security. The goals are that the party with more powerful resources performs the more complex operations and that the entire scheme provides a high and provable level of cryptographic security resulting from the employment of the insertion channel paradigm.

Our design is based on employment of the following building blocks: (i) a lightweight binary keystream generator; (ii) a block for insertion (embedding) random bits into a given -dimensional binary vector; (iii) a block for decimation of a given -dimensional binary vector which selects certain -bits.

Accordingly, we assume that the employed keystream generator outputs certain pseudo-random sequences denoted as and . Also, we assume that a deterministic mapping exists which maps a given into . We assume that the message is additively combined (i.e., encrypted) with the shared pseudo-randomness to obtain , that is,and is subject of further mapping by a simulated binary channel with random insertions where positions of random bits embedding are specified by so that the channel outputs . The intended receiver (Bob), knowing both and , can easily decimate to obtain and further perform , to obtain the message .

Since Bob can easily recover the transmitted message using a simple decimation technique, the system requires no special hardware overhead for decryption. This is especially useful if the intended receiver is a low-power device. On the transmitter’s side encryption requires simulation of a binary channel with insertion errors and the transmitter needs to send times more symbols than it otherwise would, which means that the power consumption of the transmitter goes up by a factor of . Hence, it may be reasonable to use this scheme when the transmitter is a high computational/power device and the receiver is a low computation/power device. In essence, a properly adjusted synchronization error scheme (an insertion scheme) seems to be well suited for a resources-asymmetric communication scenario in which a base station has ample resources while each of the numerous distributed nodes has severely constrained resources.

2.2. Framework for Encryption and Decryption

This section proposes an encryption/decryption technique for one-way communication from a transmitting party with high computational and other resources towards a receiving party with limited computational capabilities. Accordingly, the design follows the asymmetric implementation and execution constraints and the requirement regarding provable security.

As usual, it is assumed that encryption and decryption parties share a secret key and that before a transmission session, based on the common secret key and the public data, both parties (encryption and decryption ones) establish a session key to be used for the transmission session.

The encryption/decryption technique is designed employing the following components: (a) Encryption side: (i) a lightweight stream cipher (keystream generator); (ii) a block which provides deterministic mapping (see Figure 1) of a given keystream segment of dimension into a vector with predetermined weight equal to , that is, with a number of ones equal to which determines positions of the embedded bits; (iii) a simulator of a binary channel with random bits insertions controlled by keystream generator which performs mapping . (b) Decryption side: (i) a lightweight stream cipher (keystream generator); (ii) a block for deterministic mapping of a given keystream segment into a vector with predetermined weight, that is, the number of ones, the same as that at the encryption side; (iii) a block for decimation controlled by keystream generator which performs mapping .

We assume that implementation and execution complexity of a keystream controlled simulator of a binary channel with random insertions is highly dominant in the considered encryption/decryption scheme.

Assuming that and are the parameters, for specification of the proposed encryption/decryption, the following notation is employed:(i) is -dimensional binary vector of data which should be encrypted;(ii) is -dimensional binary vector of keystream for stream ciphering;(iii) is -dimensional binary vector of keystream nonoverlapping with ;(iv) is -dimensional binary vector of the weight exactly obtained by a deterministic mapping of ;(v) is -dimensional binary vector defined as ;(vi) is -dimensional binary vector which is equal to with inserted random bits.

The proposed encryption/decryption is displayed in Figure 1.
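The data flow described in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the list-of-bits representation, and the callable `random_bit` are hypothetical, and the deterministic mapping of a keystream segment into the fixed-weight position vector (shown in Figure 1) is not reproduced here, so the position mask is simply supplied as an input.

```python
import secrets

def xor_bits(a, b):
    """Bitwise modulo-2 addition of two equal-length bit lists."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return [x ^ y for x, y in zip(a, b)]

def encrypt(message, keystream, insertion_mask,
            random_bit=lambda: secrets.randbits(1)):
    """Stream-cipher the message, then embed random bits.

    insertion_mask is a binary list whose ones mark the slots filled with
    random bits and whose zeros mark the slots carrying basic ciphertext
    bits (an assumed convention for this sketch).
    """
    if insertion_mask.count(0) != len(message):
        raise ValueError("mask must leave exactly one slot per message bit")
    basic_ciphertext = iter(xor_bits(message, keystream))
    return [random_bit() if flag else next(basic_ciphertext)
            for flag in insertion_mask]

def decrypt(received, keystream, insertion_mask):
    """Decimate (discard the inserted bits), then undo the stream cipher."""
    if len(received) != len(insertion_mask):
        raise ValueError("received vector and mask must have the same length")
    decimated = [bit for bit, flag in zip(received, insertion_mask) if not flag]
    return xor_bits(decimated, keystream)
```

In this sketch the receiver only filters and XORs, while the random-bit source and the embedding are needed at the transmitter alone, which mirrors the asymmetry the scheme aims for.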

3. Information-Theoretic Analysis

This section yields an information-theoretic analysis of a (statistical) model of the considered encryption displayed in Figure 1.

A random variable is denoted by an uppercase letter (e.g., ) and its realization is denoted by a lowercase letter (e.g., ). An index (subscript) denotes discrete time. A discrete-time sequence of random variables, for example, , is shortly denoted by . Since our channel has synchronization errors, we have a need to distinguish strings from sequences. We denote a random string (indexed by discrete-time ) as . The string may not have a fixed length, and we denote its length (which is a random variable if the string itself is a random variable) as . A concatenation of two strings and is denoted by . As short notation, we denote the concatenation of strings through as . The entropy of a random object is denoted by , and the mutual information between two random objects and is denoted by . The binary entropy function is denoted by .

Let the channel input be a binary random variable drawn from the alphabet . The vector of all channel inputs up to time is denoted by . The transmitter (Alice) observes the pseudo-random sequence provided by a shared source of randomness (shared with Bob) and uses it to create a channel output (ciphertext) . Even though is a pseudo-random sequence, we assume that the variables are statistically indistinguishable from independent and identically distributed (iid) geometric random variables with parameter ; that is, for any integer , we haveHere, the parameter denotes the insertion probability. Namely, between any two symbols and , Alice inserts a string that consists of Bernoulli- random variables, such that the length of equals . Since is a sequence of iid geometric random variables with parameter , it is clear that Alice’s transmission scheme is equivalent to randomly inserting a Bernoulli- random variable at any point of time during the communication. Formally, we state that Alice creates a string obtained as a concatenation of individual strings , that is, where each individual string is obtained asThe length of the string equalsthat is, on average, Alice inserts Bernoulli- random variables between any two symbols and .

Eve (the eavesdropper) and Bob (the intended receiver) both receive the string containing the randomly inserted symbols. The eavesdropper, not having access to the shared source of randomness , cannot easily parse the string to recover . The intended receiver, on the other hand, has access to , and since represents the length of the inserted string between any two symbols and , the intended receiver (Bob) can easily remove the inserted symbols from (i.e., decimate ) to recover . In other words, by sharing the source of randomness , Bob can resynchronize himself with Alice; see Figure 1.
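The statistical model above can be simulated with a short Python sketch. Several details are assumptions made only for illustration, since they are not recoverable from the text here: the number of bits inserted before each symbol is taken to follow P(L = l) = (1 - p) p^l for l = 0, 1, 2, ..., with p playing the role of the insertion probability; the inserted bits are taken to be uniform; and the inserted string is placed before the symbol it accompanies. The names `insertion_channel` and `decimate` are hypothetical.

```python
import random

def insertion_channel(x, p, rng):
    """Return (y, lengths) for input bits x and insertion probability p.

    Before each input symbol, one more random bit is inserted with
    probability p, repeatedly, so each insertion length is geometric
    (assumed form: P(L = l) = (1 - p) * p**l).
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    y, lengths = [], []
    for symbol in x:
        count = 0
        while rng.random() < p:
            y.append(rng.getrandbits(1))  # inserted uniform random bit
            count += 1
        lengths.append(count)
        y.append(symbol)  # the genuine channel input symbol
    return y, lengths

def decimate(y, lengths):
    """Recover the channel input from y when the insertion lengths are known."""
    x, pos = [], 0
    for count in lengths:
        pos += count      # skip the inserted bits
        x.append(y[pos])  # keep the genuine symbol
        pos += 1
    return x
```

With the lengths in hand, the receiver needs a single pass over the received string, whereas an observer without them sees only a longer string with no markers separating genuine symbols from inserted ones.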

The sequence is a pseudo-random sequence, but for the purpose of computing information-theoretic quantities, we assume that is modeled to be statistically indistinguishable from a sequence of iid Bernoulli- random variables. (It should not be understood that implements a one-time pad. The variables are only statistically modeled as Bernoulli- for the purposes of deriving (and computing) some information-theoretic quantities that we later use to derive a cryptographic security measure.)

Here, no assumptions are made on the statistical properties of the message , but because is iid Bernoulli-, we have that is also iid Bernoulli-. Hence, the information-theoretic quantity of interest is the iud information rate defined as the information rate between and when the symbols are independent and uniformly distributed (iud): The information rate represents the amount of information that the eavesdropper can “learn,” on average, about after observing . The information rate is not computable in closed form but can be estimated using Monte Carlo techniques. For example, known bounds are [10] For large , the correction term in (7) equals If our desired accuracy of computing (bounding) is and if , considerations of (7)–(9) dictate that . For details on how to compute using “rhomboidal” trellis techniques such that both the desired correction term (9) and the confidence interval are kept under a predetermined accuracy (e.g., ), see [10]. Here, we only give numerical results in Figure 2, which reveal that the information rate is only a small fraction of the entropy rate , especially when . These results are very favorable for secret communication because only a small fraction of the uncertainty in can be learned from observing , as the next section demonstrates.

We already established that learning after observing is extremely unfavorable for the eavesdropper because the information rate is low for large insertion probabilities . However, the eavesdropper may adopt a strategy in which she first attempts to learn the sequence and then attempts to crack . To study the effects of this strategy, let us define the following quantities:

Proposition 1. Consider

Proof. First, notice thatbecause is a string of Bernoulli- random variables whose length is , and as , we haveNext, we also have and (11) is now a direct consequence of (15) and (17). Equality (12) follows from the fact that is uniquely determined (by decimation) if and are known; that is, . Finally, (13) follows by adding (11) to (12) and applying the chain rule for mutual information, and (14) follows from (13) also using the chain rule.

By equality (11) of Proposition 1, it is clear that the eavesdropper cannot learn simply by observing . Also, from Figure 2, it is clear that, from the eavesdropper’s perspective, learning from is extremely unfavorable because she can only learn a small fraction of by observing . However, equality (12) of Proposition 1 reveals a potential vulnerability in that if the eavesdropper were to somehow learn , then secrecy would be lost because . Since learning either or individually is not favorable to the eavesdropper, the eavesdropper’s strategy could be to go after the pair . Indeed, equality (13) of Proposition 1 reveals that, theoretically, the eavesdropper could gain substantial knowledge of the pair by observing . Even for large , this posterior knowledge of the pair , quantified as , is not a negligible fraction of the entropy

In the next section, we further explore the cryptographic implications by studying the connection between computational complexity and the information-theoretic quantities.

4. Generic Framework for the Security Evaluation

Note that the above information-theoretic analysis is based on modeling the pseudo-random sequence as a random sequence. In this section, we now take into account the fact that the sequence is indeed pseudo-random. We show that the considered encryption (see Figure 1) based on employing the binary insertion channel provides enhanced security compared to the basic scheme that outputs only .

4.1. Preliminaries: Security Notation

A definition of security consists of two distinct components: a specification of the assumed power of the adversary and a description of what constitutes a “break” of the scheme. Generally speaking, a cryptographic scheme is secure in a computational sense, if, for every probabilistic polynomial-time adversary carrying out an attack of some specified type and for every polynomial , there exists an integer such that the probability that succeeds in this attack (where success is also well defined) is less than for every . Accordingly, the following two definitions specify a security evaluation scenario and a security statement.

Definition 2. The adversarial indistinguishability experiment consists of the following steps:(1)The adversary chooses a pair of messages of the same length and passes them onto the encryption system for encrypting.(2)A bit is chosen uniformly at random, and only one of the two messages , precisely , is encrypted into ciphertext and returned to .(3)Upon observing , and without knowledge of , the adversary outputs a bit .(4)The experiment output is defined to be 1 if , and 0 otherwise; if the experiment output is , denoted shortly as the event , one says that has succeeded.

Definition 3. An encryption scheme provides indistinguishable encryptions in the presence of an eavesdropper, if for all probabilistic polynomial-time adversaries where is a negligibly small function.

Definitions 2 and 3 are more precisely discussed in [11].
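The experiment of Definition 2 can be sketched as a small Python harness. This is only an illustration under stated assumptions: the adversary is modeled by two hypothetical callables, `choose_messages` and `guess_bit`, the scheme by a hypothetical callable `encrypt`, and the empirical success frequency returned by `estimate_success` is a finite-sample stand-in for the success probability that Definition 3 constrains.

```python
import secrets

def indistinguishability_experiment(choose_messages, guess_bit, encrypt):
    """Run one round of the experiment; return 1 if the adversary succeeds."""
    m0, m1 = choose_messages()            # step 1: adversary picks two messages
    if len(m0) != len(m1):
        raise ValueError("messages must have the same length")
    b = secrets.randbits(1)               # step 2: uniformly random bit
    ciphertext = encrypt(m1 if b else m0)
    b_guess = guess_bit(ciphertext)       # step 3: adversary outputs a bit
    return 1 if b_guess == b else 0       # step 4: experiment output

def estimate_success(trials, choose_messages, guess_bit, encrypt):
    """Empirical success frequency of the adversary over independent rounds."""
    wins = sum(indistinguishability_experiment(choose_messages, guess_bit, encrypt)
               for _ in range(trials))
    return wins / trials
```

As a sanity check of the harness, a deliberately broken "scheme" that returns the message unchanged lets the adversary win every round, while an adversary that ignores the ciphertext should succeed in roughly half of the rounds.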

4.2. Evaluation of the Security Gain Based on the Mutual Information

We consider the encryption system displayed in Figure 1 taking into account the fact that the legitimate parties share pseudo-random secret sequences instead of random ones. Our goal is to estimate the advantage of in the indistinguishability game specified by Definition 2 when , where is a particular realization of , assuming that the advantage of is known when and are two chosen realizations of and the corresponding realization of is known.

Proposition 4. Let the encrypted mapping of into be such that equals the advantage of the adversary (specified by Definition 3) to win the indistinguishability game (specified by Definition 2), and let the mutual information be known. Under these assumptions, for large ,

Proof. Note that, for simplicity of the proof, Proposition 4 addresses a restricted case where it is assumed that equals the advantage of the adversary (specified by Definition 3) to win the indistinguishability game. Let the index of the selected message be realization of the random variable whose distribution reflects that of the output of adversary . The probability that wins the game is determined by the following:According to the proposition assumption we havewhere corresponds to the selected , andConsequently,Next, we have the following general upper bound on the entropy (see [12] or [13], e.g.):where is the binary entropy function and , implying

5. Evaluation of the Security Gain Based on Numerical Estimation of the Mutual Information

Theorem 5. Let the encrypted mapping of into be such that equals the advantage of the adversary (specified by Definition 3) to win the indistinguishability game (specified by Definition 2), and let the mutual information be known (see Figure 2, e.g.). Under these assumptions, for large ,

Proof. ConsiderSubstitution of (7) and (9) into (28) finalizes the proof.

Accordingly, the encryption mapping enhances security by a factor in comparison to the encryption mapping because the probability that wins the game becomes closer to , which corresponds to random guessing.

6. Evaluation of the Security Enhancement Employing Enumeration of Channel Input Candidates for the Given Output

6.1. Preliminaries

Let be a binary string of length , and let be a parameter. Recently, in [9], improved bounds on the number of subsequences obtained from a binary string of length under deletions have been reported. It is known that the number of subsequences in this setting strongly depends on the number of runs in the string , where a run is a maximal substring of the same character. The improved bounds are obtained by a structural analysis of the family of -run strings , an analysis in which the extremal strings with respect to the number of subsequences have been identified. Specifically, for every , -run strings with the minimum (resp., maximum) number of subsequences under any deletions have been considered, an exact analysis of the number of subsequences of these extremal strings has been presented, and it has been shown that this number can be calculated in polynomial time.

Let be a set of subsequences of that can be obtained from after deletions. The analysis of and its size are challenging as the number of subsequences of a string obtained by deletions not only depends on its length and the number of deletions, but also strongly depends on its structure. For example, is of size 1 and equals the single string . Clearly, is at most (as after deletions we remain with a binary string of length ). It has been shown that the number of subsequences strongly depends on the number of runs in the string. Here, a run is a maximal substring of the same character, and the number of runs in a given string is denoted by . It has been proven that Also, it has been shown that the maximal number of subsequences is obtained from certain strings , known as cyclic strings , in which , and it has been shown thatwhich has been further improved so that the following has been shown:where is a string of length with runs.
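For short strings, the dependence of the number of subsequences on the run structure can be checked by brute force. The following Python sketch is only an illustration and does not implement the bounds cited above; the function names `count_runs` and `deletion_subsequences` are hypothetical, and the enumeration is exponential in the string length, so it is suitable for small examples only.

```python
from itertools import combinations

def count_runs(x):
    """Number of runs (maximal substrings of one repeated character) in x."""
    return sum(1 for i, c in enumerate(x) if i == 0 or c != x[i - 1])

def deletion_subsequences(x, t):
    """Set of distinct strings obtainable from x by deleting exactly t characters."""
    if not 0 <= t <= len(x):
        raise ValueError("t must satisfy 0 <= t <= len(x)")
    keep = len(x) - t
    # Enumerate every choice of kept positions; the set removes duplicates.
    return {"".join(x[i] for i in kept)
            for kept in combinations(range(len(x)), keep)}
```

For instance, a string consisting of a single run yields exactly one subsequence for any number of deletions, whereas a string of the same length with many runs yields many more, always bounded by the number of binary strings of the remaining length.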

In [9], also a family of strings, named unbalanced strings, has been defined. A string is called unbalanced, if all of the runs of symbols in the string are of length 1, except for one run. Let be a binary string of length with runs, in which all runs are of length 1, except for the th run which is of length . Due to symmetry , and consequently defineIt has been shown in [9] that these extreme cases have the least number of subsequences among the unbalanced strings and also that they have the least number of subsequences among all strings. The following theorem has been proven in [9].

Theorem 6 (Theorem [9]: closed-form formula for ). For all , , (i)when ,(ii)when ,whereassuming that and, for , and that the following conventions are employed:

A numerical illustration of Theorem 6 is displayed in Figure 3.

6.2. Estimation of the Security Enhancement

Traditionally, as introduced in [14], the main information-theoretic security metric is the average information leaked, that is, the mutual information between the message and the related sample , or, equivalently, the uncertainty, that is, the equivocation . Recently, certain information-theoretic security measures have been considered in [15] implying that, in our case, as a strong security metric the average mutual information should be addressed and as a corresponding weak one.

Theorem 7. Assuming that the employed keystream generator is such that the following is valid,the simulator of binary channel with random insertions provides where is the number of certain equally likely subsequences.

Sketch of the Proof. The uncertainty about the input (the argument) into a binary channel with random insertions given its output (the image) depends on the number of equally likely candidate arguments which can generate the given image. A lower bound on the number of these candidates can be obtained based on the lower bound on the number of the subsequences which can be obtained from the given one employing Theorem 6 (i.e., Theorem from [9]). By adapting this result to the considered particular case we have the following. A lower bound on the number of the argument candidates , where is a parameter, is given by (39) and (40):(i)when ,(ii)when :whereassuming that and, for , . Particularly note that the above enumerated subsequences are obtained from a sequence where all of the runs of symbols are of length 1, except for one run, and that the assumed decimation is a random one, and in addition, for simplicity of the evaluation we assume that the subsequences appear equally likely.
Consequently, the uncertainty is lower-bounded as follows:noting that is at most as after deletions we remain with a binary string of length . Taking into account that we obtainand accordingly the theorem statement.

Figure 4 yields numerical illustrations of coefficient which determines the security gain.

Note that, in order to achieve a desired high enhancement of the security, the insertion rate should be high enough as illustrated in Figure 4. When the insertion rate is low, the security enhancement is low as well, and this is analytically shown in the next corollary.

Corollary 8. Considerwhen the parameters of the considered encryption fulfil the following constraints:

Sketch of the Proof. For large values of and , the following approximation can be employed:where means that is approximately if is a polynomial function of and . Accordingly,Using the fact reported in [9] we have the following. Let . Numerical calculations reported in [9] show that . Consequently, it is shown in [9] that for even The above imply the corollary statement.

Disclosure

This work has been partially presented at IEEE Workshop on Information Theory, Korea, October 2015.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The Ministry of Education, Science and Technological Development, Serbia, has partially funded this work.