Abstract
Public key cryptosystems are constructed by embedding a trapdoor into a one-way function, so one-wayness and the trapdoor property are vital to public key cryptography. In this paper, we propose a novel public key cryptographic primitive called the preimage selective trapdoor function. This scenario allows exponentially many preimages to be used to hide a plaintext even if the underlying function is not one-way. The compact knapsack problem is used to construct a probabilistic public key cryptosystem whose underlying encryption function is proven to be a preimage selective trapdoor one-way function under some linearization attack models. The construction guarantees both the noninjectivity of the underlying encryption function and the unique decipherability of ciphertexts. It is heuristically argued that the security of the proposal cannot be compromised by a polynomial-time adversary even if the compact knapsack is easy to solve. We do not provide provable security results for the proposal; however, heuristic arguments show that it is secure against some known attacks, including brute-force attacks, linearization attacks, and key-recovery attacks. The proposal has acceptable key sizes and performs efficiently, and hence is practical.
1. Introduction
1.1. Background
A public key cryptosystem (PKC) is an important cryptographic primitive in the area of network and information security. The basic idea in constructing a PKC is to embed a trapdoor into a mathematically intractable problem. The trapdoor helps its holder invert the underlying one-way function. Without the trapdoor, however, one must attack a presumably intractable problem in order to reconstruct the plaintext corresponding to a given ciphertext. The existence of mathematically hard problems is therefore vital in public key cryptography. This explains why we say that bad news in computational complexity is good news for cryptography.
In the cryptographic community, a PKC is often considered synonymous with the so-called trapdoor one-way function. Bellare et al.’s observation [1] provides a better understanding of the two concepts. On one hand, a probabilistic public key encryption does not necessarily imply a trapdoor one-way function. For example, the underlying encryption function of ElGamal [2] is not one-way in that the auxiliary key randomly chosen by the encrypter cannot be recovered knowing the trapdoor satisfying . On the other hand, in some cases, noninjective trapdoor one-way functions cannot be used to construct PKCs because a PKC is required to recover a unique plaintext from any valid ciphertext. So the encryption function underlying a PKC should be injective. Bellare et al. showed that any one-way function implies a highly noninjective trapdoor one-way function. However, the authors derived no PKCs from their universal constructions. They only showed that if each image of a trapdoor one-way function has a polynomially bounded number of preimages, the trapdoor one-way function can be used to derive a PKC.
In this paper, we propose a probabilistic public key encryption algorithm from a highly noninjective one-way function. We think that our construction not only enriches the cryptographic basket and deepens our insight into public key cryptographic designs but also relies on weaker cryptographic assumptions. Our confidence in the hardness of noninjective one-way functions rests on the proven fact that a one-way function equipped with an additional noninjectivity property is no easier to invert than an injective one-way function [3]. More importantly, what we observe in this paper is that it remains possible to develop a secure PKC by fully exploiting the noninjectivity of functions even if these functions turn out not to be one-way.
1.2. Our Contribution
In this paper, we first formalize a novel public key cryptographic scenario called the preimage selective trapdoor one-way function (PSTOF, for short) and then use the compact knapsack problem to realize such a scenario.
1.2.1. Preimage Selective Trapdoor One-Way Function (See Figure 1)
A PSTOF is in fact a noninjective public key encryption function. When encrypting a message in the message space , the encrypter firstly encodes the message into a plaintext in the plaintext space and then derives a ciphertext in the ciphertext space . When decrypting a ciphertext , the decrypter utilizes the trapdoor to decrypt a valid ciphertext into a unique plaintext and then uniquely decodes into a message .
The scenario allows us to base the security of a PKC on the high noninjectivity rather than the one-wayness of a function (an intractable problem). In this scenario, is a subset of the equivalent plaintext space , and the ciphertext space . For any valid ciphertext , will have many preimages in , which are called equivalent plaintexts in that all of them are mapped into under the function , and a unique preimage (the plaintext) with respect to the function , which is a valid plaintext and hence can be further decoded into a meaningful message. The message encoding function simulates the process of randomly selecting a subset as the plaintext space from the whole preimage domain , from which the name PSTOF comes. In other words, the message encoding function imposes some restrictions on the preimage set to get a subset with a predesignated special structure as the plaintext space . The equivalent plaintexts in are used to hide a (valid) plaintext in . If is assumed to be one-way, no polynomial-time adversary can obtain a preimage in given a , let alone the unique preimage (the plaintext) . If the underlying function is not one-way and a has exponentially many preimages in , the corresponding plaintext in is hidden by these exponentially many preimages. A polynomial-time adversary obtains only polynomially many preimages, which can be seen as randomly output from all the exponentially many preimages and hence include the target plaintext with only a negligible probability.
1.2.2. Probabilistic Compact Knapsack PKC
Using the concept of PSTOF and the compact knapsack problem, we design a probabilistic public key encryption scheme. An easy compact knapsack-type problem is defined, and an algorithm called the Onion Algorithm is developed to peel off the solutions to the easy problem. The easy problem is used to construct a probabilistic PKC. The main results are that the trapdoor one-way function underlying the proposal is a PSTOF under different linear attack models (see Theorems 25, 26, and 30), and the images of the PSTOFs have exponentially many preimages (see Table 3). Some design tricks are adopted to make the PKC immune to some known key-recovery attacks such as the GCD attack [4, 5], the Diophantine approximation attack [6, 7], and the orthogonal lattice attack [8, 9].
1.3. Organization
The rest of the paper is organized as follows. In Section 2, we formalize the definition of PSTOF and use the cryptographic scenario to explain some known PKCs. Section 3 provides a concrete probabilistic public key encryption scheme to realize the PSTOF. Sections 4 and 5 discuss the performance and security related issues, respectively. Section 6 gives some concluding remarks.
2. Preimage Selective Trapdoor One-Way Function
2.1. Symbols and Notations
The notations listed at the end of the paper will be used throughout the paper.
2.2. Message Encoding Function
When encrypting a message using a PKC, we first need to encode a binary string message into a plaintext (an element in the algebraic structure underlying the PKC) so that the PKC can process the message. However, it is sometimes unnecessary to specify the encoding algorithm explicitly. For example, in the case of RSA [10], it is more convenient to simply regard a message as a binary integer in than to develop an algorithm to transform a message into an element in . Formally, we define a message encoding function as follows.
Definition 1. A function with is called a message encoding function if it satisfies the following conditions. (1) The function is injective. (2) It is efficient to compute for a given , and for a given .
2.3. Preimage Selective Trapdoor One-Way Function
The PSTOF is formalized as follows.
Definition 2. A function is called a PSTOF if it satisfies the following conditions. (1) For an arbitrary , it is easy to compute . (2) is noninjective. (3) Given any , there exists at most one such that . (4) Given any and the trapdoor, there is a polynomial-time algorithm to output a unique with or the invalid ciphertext symbol . (5) Given any and without knowing the trapdoor, no one can efficiently compute a preimage such that .
Remark 3. The third and the fourth conditions in Definition 2 guarantee that, when viewed as a public key encryption function, the PSTOF is inverted using the knowledge of the trapdoor to derive a unique preimage (the plaintext) in , and both conditions in Definition 1 mean that the plaintext can be uniquely decoded into the original message. If we remove the last requirement in Definition 2, we just call a preimage selective trapdoor (not one-way) function.
Definition 4. The preimage density of a noninjective function is defined as the ratio of the cardinality of the preimage set to that of the image set : On average, a ciphertext will have preimages in .
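As a rough illustration of Definition 4, the following sketch computes the ratio of the size of a finite domain to the size of its image under a toy function. The function name `preimage_density` and the toy map `x % 10` are assumptions chosen for illustration only; they are not part of the proposed scheme.

```python
from collections import Counter


def preimage_density(domain, f):
    """Return (|domain| / |image|, preimage counts) for f on a finite domain."""
    domain = list(domain)
    # Count how many domain elements land on each image value.
    counts = Counter(f(x) for x in domain)
    return len(domain) / len(counts), counts


# Toy noninjective function (hypothetical): reduce 0..999 modulo 10.
density, counts = preimage_density(range(1000), lambda x: x % 10)
print(density)  # average number of preimages per image
```

In this toy case every image value has the same number of preimages, so the average coincides with the exact count; in general the density is only an average.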
Remark 5. When is exponentially large, we can use the high noninjectivity of the underlying function to hide a plaintext. On average, for a valid ciphertext for a plaintext , will have exponentially many preimages in . So a polynomial-time adversary can obtain only polynomially many preimages in of , which contain the targeted unique valid plaintext with a negligible probability . One may object that the function is indeed a trapdoor one-way function mapping from to , so that if one breaks the one-wayness of , one also recovers the unique such that . However, if the plaintext space can be seen as randomly chosen from the whole equivalent plaintext space , the adversary cannot use the special structure of to develop an efficient algorithm for inverting the function . We will explain this point later on by using an example.
2.4. Examples
2.4.1. High-Density Knapsack PKCs
To illustrate the aforementioned definitions, we consider the high-density 0-1 knapsack (subset sum) PKCs Chor-Rivest [11] and Okamoto-Tanaka-Uchiyama [12]. In both cryptosystems, a bit string message is encoded into an dimensional 0-1 plaintext vector with a fixed Hamming weight (the so-called predesignated special plaintext structure) by using the source encoding algorithm (message encoding function) [13]: Let the public key . The underlying PSTOF is ; that is, for , The density of both PKCs defined as can be made sufficiently high (>1), in which case asymptotically (3) will have more than one (i.e., ) binary solution [14]:
This illustration says that when , a valid ciphertext will have exponentially many preimages in and a unique preimage in . In other words, the plaintext is hidden by exponentially many preimages. Even if a polynomial-time adversary breaks the one-wayness of the underlying subset sum problem, so that the underlying encryption function is only a preimage selective trapdoor (not one-way) function, the adversary just obtains polynomially many preimages to (3), which contain the plaintext with a negligible probability. High-density subset sum PKCs can base their security on the noninjectivity of the underlying function, which is a weaker intractability assumption.
Remark 6. There exist two ways to break the Chor-Rivest [11] and the Okamoto-Tanaka-Uchiyama [12] cryptosystems: recovering the secret key [15] and solving the underlying knapsack problem. Some work has also been done to make Chor-Rivest immune to key-recovery attacks [16–18]. However, some known cryptanalytic algorithms [14, 19–21] can only find preimages but not necessarily the unique plaintext . So from the adversary’s point of view, the underlying problem remains the knapsack problem. This explains why we say in Remark 5 that the adversary cannot use the special structure of to efficiently invert the function .
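The noninjectivity of a high-density subset sum can be observed on a toy instance. The sketch below is only an illustration: the weights and the plaintext are hypothetical, the density is taken as the usual ratio of the dimension to the bit length of the largest weight, and the exhaustive search is feasible only because the instance is tiny.

```python
from itertools import product
from math import log2


def knapsack_density(weights):
    """Dimension divided by log2 of the largest weight."""
    return len(weights) / log2(max(weights))


def binary_solutions(weights, target):
    """Exhaustively list all 0-1 vectors whose weighted sum equals target."""
    return [x for x in product((0, 1), repeat=len(weights))
            if sum(a * b for a, b in zip(weights, x)) == target]


# Hypothetical toy public key and plaintext (not taken from the paper).
weights = [3, 5, 6, 7, 9, 10, 12, 13]
plaintext = (1, 0, 1, 0, 1, 0, 0, 0)
ciphertext = sum(a * b for a, b in zip(weights, plaintext))

solutions = binary_solutions(weights, ciphertext)
print(knapsack_density(weights) > 1)  # high-density instance
print(len(solutions) > 1)             # the ciphertext has several preimages
```

The plaintext is one of several binary preimages of the same sum, so solving the subset sum problem alone does not single it out.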
2.4.2. Rabin Cryptosystem
We can also view the Rabin cryptosystem [22] as a PSTOF. For convenience of discussion, we just set the Rabin encryption function as with being the product of two primes, and . So, the ciphertext space consists of the quadratic residues modulo ; that is, . In order to uniquely recover a plaintext, we need to embed some redundant information in . The redundant information forms the special structure of . The PSTOF underlying the Rabin cryptosystem is ; for , . We note that each of the quadratic residues modulo has four roots, from which we use the redundant information to pick out the exact plaintext. Hence, the noninjectivity of the PSTOF for Rabin is given as
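The four-to-one behaviour of modular squaring can be checked on a toy modulus. The following sketch assumes the hypothetical modulus 77 = 7 × 11 and finds roots by exhaustive search, which is feasible only for such small numbers; it is an illustration, not the Rabin decryption procedure.

```python
def square_roots_mod(c, n):
    """Exhaustively list all x in [0, n) with x*x ≡ c (mod n)."""
    return [x for x in range(n) if (x * x) % n == c % n]


# Hypothetical toy modulus: product of two small primes.
N = 7 * 11
x = 9                  # a plaintext coprime to N
c = (x * x) % N        # Rabin-style encryption by squaring
roots = square_roots_mod(c, N)
print(roots)           # → [2, 9, 68, 75]
```

All four roots encrypt to the same ciphertext, so some redundancy in the plaintext is needed to identify which root was intended.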
3. The Proposed Probabilistic Public Key Encryption Scheme
3.1. Knapsack Problems
Definition 7 (knapsack problem). Given a positive integral vector and an integer , the knapsack or subset sum problem is to find a binary vector such that . A knapsack problem is denoted as . The density of the knapsack problem is defined as .
The density of a compact knapsack problem has an important effect on the hardness of the problem. It was shown that if the density , the knapsack problem can be solved with overwhelming probability [23, 24].
Definition 8 (compact knapsack problem). Given a positive integral vector , an integer , and an integer , the compact or general knapsack problem is to find a vector with such that . A compact knapsack problem is denoted as . The density of a compact knapsack problem is defined as .
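To make Definition 8 concrete, the sketch below enumerates the solutions of a tiny compact knapsack instance. It assumes that each unknown ranges over the integers from 0 to a common bound; the weights, the bound, and the function name `compact_knapsack_solutions` are hypothetical, and the exhaustive search only works for toy sizes.

```python
from itertools import product


def compact_knapsack_solutions(weights, target, bound):
    """List all vectors x with 0 <= x_i <= bound and sum(a_i * x_i) == target."""
    return [x for x in product(range(bound + 1), repeat=len(weights))
            if sum(a * b for a, b in zip(weights, x)) == target]


# Hypothetical toy instance (not taken from the paper).
weights = [3, 5, 7]
bound = 3
secret = (2, 1, 3)
target = sum(a * b for a, b in zip(weights, secret))  # 6 + 5 + 21 = 32

solutions = compact_knapsack_solutions(weights, target, bound)
print(secret in solutions)
```

Setting `bound = 1` recovers the ordinary 0-1 knapsack problem of Definition 7.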
If the density of a compact knapsack problem is , it was shown that the compact knapsack problem can almost always be efficiently solved with lattice reduction algorithms [25]. In this paper, we consider the compact knapsack problem with .
3.2. The Proposed Message Encoding Function
The message space consists of blocks with each block 3 bits long; namely, . For a message , we first define a message encoding function that maps the message into a vector in under an auxiliary key , which is randomly chosen by the encrypter. We define for as
Remark 9. We note that 97 is a prime and 5 is a primitive root modulo 97, so the function forms a permutation when runs through and hence permutes when and run through and , respectively. We denote It is easy to verify that forms a partition of . So we can deduce that if , , and are uniformly distributed over , and , respectively, is uniformly distributed over . This means that generating each can simulate the process of randomly choosing an integer from , which makes the predesignated special structure of defined in (9) as random as possible.
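The claim in Remark 9 that 5 is a primitive root modulo 97 can be checked directly. The sketch below is a minimal verification: it computes the multiplicative order of 5 modulo 97 and confirms that the powers of 5 run through every nonzero residue exactly once. The helper name `multiplicative_order` is our own.

```python
def multiplicative_order(g, p):
    """Smallest k >= 1 with g**k ≡ 1 (mod p); assumes gcd(g, p) == 1."""
    order, value = 1, g % p
    while value != 1:
        value = (value * g) % p
        order += 1
    return order


p, g = 97, 5
order = multiplicative_order(g, p)
powers = sorted(pow(g, k, p) for k in range(p - 1))
print(order)                           # → 96
print(powers == list(range(1, p)))     # → True
```

Since the order equals 96 = 97 − 1, exponentiation with base 5 is a permutation of the nonzero residues modulo 97.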
The message encoding function is defined as : where the plaintext space is In the rest of the paper, we also define the equivalent plaintext space including the plaintext space .
Remark 10. Another fact about is for and because . Hence, we have that if , .
Remark 11. We now remark on how to encode a message and decode a plaintext with respect to . We can construct tables similar to truth-value tables to show the functions . By observing (10), we see that the tables have a periodic structure with period 12. That is to say, those sharing a common residue modulo 12 share the same table. So only 12 tables suffice to list the values of . See Table 1. Here, we say a little more about the message encoding function and Table 1. At present, we only consider the first three columns of the 12 subtables in Table 1, which list the values of , , and the corresponding , respectively. To encode a message and given a randomly chosen auxiliary key , for , we determine the one by one by looking up the subtable indexed by in Table 1 denoted as SubT. When (or 1, resp.), we only search the 8 rows covered by (or 1, resp.) in SubT, and output the integer in the column marked by that lies in the same row as as the value of . For example, to encode under the control of , by looking up SubT, we determine . Similarly, we get , , . So, we encode as . Conversely, we can decode a plaintext by looking up the table. For example, . We search the third column of SubT for which lies in the same row as , so . Similarly, we get , , . Hence, .
3.3. Onion Algorithm
3.3.1. Indistinguishability and Associated Integer Pairs
Given an integer pair and , we define and the set . In fact, we can regard as a transformation on . If the two cardinalities are identical, , the transformation forms a bijection.
Definition 12. Given an integer pair , if , we call to be distinguishable modulo , or is an associated integer pair with .
Example 13. Decide whether is distinguishable modulo and or not, respectively. We note that contains the integers listed in the third column of SubT. Calculations show that . The 16 values in are listed in the fourth column of SubT_{1}. So, is distinguishable modulo . However, in that 4 and 42 modulo produce the same integer pair , and 24 and 138 modulo produce the same integer pair . So is not distinguishable modulo .
Remark 14. When is distinguishable modulo , there is a bijection between the integers in and the integer pairs in . For example, the third and the fourth columns of SubT_{1} list the integers in and the integer pairs in . If we know that an integer satisfies , we search the fourth column generated by for in SubT_{1} and find that lies in the same row as , so .
Remark 15. From Remark 10, we have that, for congruent indices ’s modulo , we get the same set . For each , we give two integer pairs and associated with , and the corresponding integer pairs modulo and , respectively, in the fourth and fifth columns of SubT. We denote the set consisting of the two integer pairs and as AIP. For example, AIP=AIP . In Table 2, we list all the 12 sets of AIP.
Remark 16. Note that is distinguishable modulo if and only if is distinguishable modulo . For example, also contains integer pairs associated with . If we know that an integer satisfies , we can also determine the value of by searching the fourth column generated by in SubT for , and we find a unique with . So the unique integer is such that .
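A check of the kind carried out in Example 13 can be sketched as follows, under the assumption that distinguishability means that reducing each element of the set modulo the two integers of the pair yields pairwise distinct residue pairs. The set and both integer pairs below are hypothetical stand-ins, since the actual set comes from Table 1; they were chosen so that 4 and 42, and 24 and 138, collide under one pair, as in the example.

```python
def residue_pairs(values, p, q):
    """Map each integer x to the pair (x mod p, x mod q)."""
    return [(x % p, x % q) for x in values]


def is_distinguishable(values, p, q):
    """True if distinct integers always yield distinct residue pairs."""
    return len(set(residue_pairs(values, p, q))) == len(set(values))


# Hypothetical set and integer pairs (not the values from Table 1).
values = [4, 24, 42, 138]
print(is_distinguishable(values, 5, 7))    # all residue pairs differ
print(is_distinguishable(values, 2, 19))   # 4 and 42 collide, as do 24 and 138
```

When the check succeeds, a residue pair can be mapped back to its unique integer by a table lookup, which is the role the subtables play in the decryption procedure.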
3.3.2. Simultaneous Diophantine Equation
Now, we consider a simultaneous Diophantine equation problem as follows: given integers and and positive integer sequences and , find defined in (9) such that It is pointed out that if and , for some constants and subject to , will have exponentially many preimages in satisfying (11); namely, which is . What we mean is that if we require and , (11) will be a many-to-one function. In what follows, we will define a condition under which (11) has at most one solution in (to guarantee the unique decipherability of ciphertexts) and propose an algorithm to solve (11). Besides, some transformations will be introduced in the construction of the proposed PKC to derive a compact knapsack problem equipped with noninjectivity, a trapdoor, and one-wayness.
3.3.3. An Easy Problem and the Algorithm Solving It
For some special and , we can efficiently determine in (11) as stated in the following lemma.
Lemma 17. Assuming that (11) has solutions in , , and , then can be efficiently and uniquely determined.
Proof. We first note that and , for , so the two equations in (11) modulo and , respectively, give and . Observing that , we can invert and similarly : If , we determine the unique corresponding value according to Remark 14. If , we use the method given in Remark 16 to determine the unique .
The above lemma says that for some special and , we can determine the solution to (11) one by one as if we were peeling an onion, hence the name Onion Algorithm.
Theorem 18. Assuming that (11) has solutions in : and that, for , the solution to (11) can be efficiently and uniquely determined.
Proof. From Lemma 17, can be efficiently and uniquely determined.
For , assuming that have been uniquely determined, we know that
We note that for , so divides the right side and hence the left side of (16). We set
A similar analysis also applies to the second equation of (11), so we set to obtain
If we view (18) as a new simultaneous Diophantine equation problem with integer sequences and , what follows shows that the new problem satisfies the conditions in Lemma 17.
Firstly, (11) has solutions, and have been uniquely determined, so (18) must have solutions.
Secondly, recalling the meaning of , we claim that these entries divided by their gcd must be relatively prime:
Thirdly,
From (15) and the two facts just proven, we know that (18) indeed satisfies the conditions in Lemma 17. So we can efficiently determine a unique .
Finally, once have been uniquely determined, it is trivial to determine a unique just by computing
Both values on the right side of the above equation must be not only identical but also in because (11) has solutions in and have been uniquely determined.
Given the problem (11), satisfying the conditions presented in Theorem 18, the Onion Algorithm for solving (11) is summarized as shown in Algorithm 1.

Now we use a toy example to illustrate what we discuss in this subsection about the Onion Algorithm.
Example 19. Assume that the following equations have solutions in ; find the solution to + + , + + . We can verify that , , , and , so (14) and (15) are satisfied. Hence, we use the Onion Algorithm to compute . We look up SubT_{3} and find that . So . We compute , . Then we compute . We search SubT_{2} for and find that . So we determine . Finally, we get . Thus, a unique is determined.
3.4. The Proposed Probabilistic Public Key Encryption
In this subsection, we use the results obtained in previous subsections to derive a probabilistic public key encryption.
3.4.1. Key Generation
Randomly choose two sequences and , satisfying conditions (14) and (15) given in Theorem 18, and a two-dimensional invertible square matrix with positive integer entries upper bounded by a constant, Compute Randomly choose two primes such that Let . Compute the vector using the Chinese remainder theorem, Randomly choose an integer and compute The public key is . The secret key consists of , , and , . When decrypting a ciphertext, the decrypter also needs to store the values of , , , and Table 1.
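The Chinese remainder step of key generation combines a residue modulo one prime with a residue modulo the other into a single residue modulo their product. The sketch below shows the generic two-modulus computation applied entry by entry to two short sequences; the primes, the sequences, and the function name `crt_pair` are hypothetical and far smaller than real parameters, and the sketch does not reproduce the remaining key generation steps.

```python
def crt_pair(r_p, p, r_q, q):
    """Return the unique x in [0, p*q) with x ≡ r_p (mod p) and x ≡ r_q (mod q).

    Assumes p and q are coprime. pow(p, -1, q) needs Python 3.8 or later.
    """
    inv = pow(p, -1, q)                       # inverse of p modulo q
    return (r_p + p * (((r_q - r_p) * inv) % q)) % (p * q)


# Hypothetical toy primes and residue sequences.
p, q = 101, 103
seq_p = [17, 42, 99]
seq_q = [5, 88, 61]
combined = [crt_pair(u, p, v, q) for u, v in zip(seq_p, seq_q)]

# Reducing each combined entry recovers both original sequences.
print([e % p for e in combined] == seq_p, [e % q for e in combined] == seq_q)
```

A holder of both primes can split each combined entry back into its two residues, which corresponds to the two modular reductions performed at the start of decryption.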
3.4.2. Encryption
To encrypt a message , the encrypter randomly chooses an auxiliary key and computes the plaintext and the ciphertext using the public key ,
3.4.3. Decryption
To decipher a ciphertext , the decrypter firstly computes , and then , . Secondly, the decrypter computes Thirdly, the decrypter uses the Onion Algorithm to determine a unique solution to (11). Finally, the decrypter decodes into the corresponding message by using the method illustrated in Remark 11.
3.4.4. Why Decryption Works
To illustrate why the decryption works, we observe that from (26), so and according to (25). From (23), we know that , so from (24). Now we conclude that and similarly . Hence, we have from which (11) is derived. It is easy to verify that (11) satisfies the conditions listed in Theorem 18, and hence can be efficiently solved with Onion Algorithm to give a unique solution . Then is decoded into the original message .
3.4.5. On Generating and
Now, we give an algorithm to generate the dimensional positive integer sequences and satisfying conditions (14) and (15) in Theorem 18.
Theorem 20. The generated and satisfy (14) and (15).
Proof. We note that, for , Similarly, . We immediately have . Thus, (14) is satisfied. For , we have so (15) is satisfied, as desired.
Now we use an example to illustrate Algorithm 2.

Example 21. We randomly choose , and , , , , and set , so we can compute , , , and , , , which are the coefficients in Example 19.
3.4.6. Remarks and Suggested Security Parameters
The suggested parameters are listed in Table 3.
We provide some remarks on the proposed PKC as follows.
Remark 22. We can choose a 2-dimensional square matrix with determinant equal to because the inverse can be easily represented when .
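As an illustration of why such a choice is convenient, the sketch below assumes the determinant is 1, in which case the inverse of a 2 × 2 integer matrix is obtained by swapping the diagonal entries and negating the off-diagonal ones, with no division required. The matrix entries and helper names are hypothetical.

```python
def inverse_det_one(m):
    """Inverse of a 2x2 integer matrix with determinant 1 (assumed here)."""
    (a, b), (c, d) = m
    if a * d - b * c != 1:
        raise ValueError("this sketch only handles determinant 1")
    return ((d, -b), (-c, a))


def matmul2(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


A = ((3, 2), (4, 3))           # hypothetical matrix, det = 9 - 8 = 1
A_inv = inverse_det_one(A)
print(matmul2(A, A_inv))       # → ((1, 0), (0, 1))
```

Because the inverse has integer entries of the same magnitude as the original, applying it during decryption costs only a few small multiplications.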
Remark 23. We should choose and slightly greater than and , respectively. See (24). So for convenience of discussions, we always assume that
Remark 24. In Algorithm 2, we suggest choosing and with some special lengths so that the generated ’s (’s, resp.) share almost the same binary length; that is, So, for , we have For convenience of discussion, we also assume that and have almost the same binary length, so from (34), we have
4. Performance
This section analyzes performance-related issues, such as the computational complexity of the encryption and decryption algorithms, the public key size, and the information rate.
4.1. Estimation of Public Key Size
We first point out two facts. Firstly, when generating and by using Algorithm 2, for , we randomly choose in the first step. We also set . So from Table 2, we see that , for . Secondly, .
From (23), we rewrite and as and with , and from (35), we have .
We recall that the public key consists of integers , and the length of each of them is upper bounded by that of , so from (32) we have So the public key size is estimated via which is upper bounded by .
4.2. Computational Complexity
We analyze the computational costs for encrypting a message and decrypting a ciphertext. Given and , we only need bit operations to compute . To encrypt , the encrypter only needs to perform multiplications with the multipliers bounded by and , and additions. So bit operations suffice to do the computations in (27). The computational complexity for the encryption algorithm is given as .
The decrypter first performs a modular multiplication , and two modular reductions , , which cost bit operations in total. To solve (11) for each and hence , the decrypter computes to determine with the moduli ,