Abstract

The millionaires’ problem is the basis of secure multiparty computation and has many applications. Using a vectorization method and the Paillier encryption scheme, we first propose a secure two-party solution to the millionaires’ problem that determines, for private inputs $x$ and $y$, whether $x > y$, $x = y$, or $x < y$ in one execution. Subsequently, using the vectorization and secret splitting methods, we propose an information-theoretically secure protocol for the multiparty millionaires’ problem (a.k.a. the secure sorting problem) that resists collusion attacks. We analyze the accuracy and security of our protocols in the semihonest model and compare their computational and communication complexities with those of existing protocols.

1. Introduction

The millionaires’ problem (abstracted as a greater than (GT) problem) was proposed by Yao [1] as follows: two millionaires, Alice and Bob, want to know who has more wealth, but they do not want to disclose their wealth to each other. The GT problem arises in practical situations. Suppose that Alice wants to buy some commodities from Bob online and is willing to pay at most $x$ dollars, while Bob wants to sell the commodities for $y$ dollars, and neither wants to disclose this price. They therefore need to compare $x$ and $y$ privately. If $x \ge y$, they will make a bargain; otherwise, they do not waste time trying to agree on a price. The GT problem has also been applied to secure multiparty computation (SMC), a focus of the international cryptography community [2]. Goldwasser [3] predicted that SMC will become an integral part of our computing reality because of its rich theory and its potential as a powerful tool.

Motivated by Goldwasser’s predictions, cryptographic researchers have studied many SMC problems, including private sorting problems [4], private determination of the relationships among sets [5] and geometry [6], private voting problems [7], and private data mining [8]. Goldreich [9] studied theoretical SMC problems and established a simulation paradigm that has been extensively used to evaluate the security of SMC protocols.

The GT problem is a foundational component of many SMC protocols [10–15] and should be solved efficiently for many applications. Secure two-party computation of the GT problem can be applied to secure rational interval computation [16], which in turn can decide inclusion problems between a point and a ring, a point and an infinite region, a point and a segment, and so on, and can even be used to reduce the cost of real commodity transactions. Protocols for the GT problem can be used to determine graphics similarity [17]. SMC protocols for the GT problem have been used for secure computation of skyline queries in MapReduce [18], secure stable matching at scale [19], private large-scale databases [20], association rule mining [21], and so forth. In addition, Bogdanov et al. [22] proposed a tool for cryptographically secure statistical analysis based on a sorting method. Aly and Van Vyve [23] proposed a secure sorting protocol for secure single-commodity multimarket auctions. Blanton and Aguiar [24] applied a sorting protocol to compute multiset operations.

Cryptographic researchers have proposed several GT protocols. Yao [1] used garbled circuits for solving the GT problem. Grigoriev and Shpilrain [30] used various laws of classical physics to offer several solutions of Yao’s millionaires’ problem. Chang et al. [31] proposed a pioneering quantum private comparison protocol for some users.

Many researchers have proposed GT protocols based on homomorphic encryption schemes. Li and Wang [32] used the Paillier encryption scheme to compare two encoded numbers. Blake and Kolesnikov [25] used the additive homomorphism of the Paillier encryption scheme to construct a two-round GT protocol with the computational cost of modular multiplications, where is the bit number of the private inputs and is the modulus of the Paillier encryption scheme. Lin and Tzeng [26] proposed a two-round protocol using the multiplicative homomorphism of the ElGamal encryption scheme, with the computational cost of modular multiplications. Cao and Liu [33] improved the Lin-Tzeng solution to Yao’s millionaires’ problem by having the two participants alternately perform the original Lin-Tzeng protocol twice.

In addition, some researchers have proposed SMC protocols for the GT problem based on linear secret sharing. Nishide and Ohta [29] constructed a simplified bit-decomposition protocol for the GT problem, whose communication cost is 15 rounds. The protocol cannot distinguish from . Damgård et al. [27] proposed a novel technique to convert a polynomial sharing of a secret into a sharing of its bits in constant rounds. Its computation complexity is ( is the bit number of the shared value) modular multiplications.

Although several protocols have been proposed, their efficiency is not very high and more efficient SMC protocols are desirable. Therefore, this paper proposes two efficient solutions to both the two-party and multiparty GT problems.

Our Contributions. (1) Using the vectorization method, we encode a number into a vector and transform the GT problem into a vector-element-selection problem.

(2) Protocol 1 securely solves the two-party GT problem based on the vectorization method and the Paillier encryption scheme. For private inputs $x$ and $y$, it determines whether $x > y$, $x = y$, or $x < y$ in one execution. The computational cost is modular multiplications ( is the modulus of the Paillier encryption scheme; is the encoding vector dimension), and the communication cost is one round. We compare the computational and communication complexities between Protocol 1 and the existing solutions in Section 4.

(3) Protocol 2 securely computes the multiparty GT problem (i.e., secure sorting problem) based on the vectorization method and the secret splitting method without any encryption scheme based on computational assumptions. The computational cost is negligible and the communication cost is rounds ( is the number of parties). It can resist collusion attacks and is information-theoretically secure. We compare the computational and communication complexities between Protocol 2 and the existing solutions in Section 4.

Organization. Section 2 introduces related definitions and methods, including the SMC model, the semihonest party, the simulation paradigm, the Paillier encryption scheme, and the vectorization method. Section 3 proposes the two-party and multiparty GT protocols, analyzes the protocols’ accuracy and security, and demonstrates their privacy-preserving properties using the simulation paradigm. Section 4 compares the protocols’ computational and communication complexities with the existing solutions, and Section 5 concludes this work.

2. Preliminary

2.1. Secure Multiparty Computation Model
2.1.1. Secure Two-Party Computation

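A two-party computation is a random computation process that maps a pair of inputs to a pair of outputs, denoted by
$$f: \{0,1\}^{*} \times \{0,1\}^{*} \rightarrow \{0,1\}^{*} \times \{0,1\}^{*},$$
where $f(x, y) = (f_1(x, y), f_2(x, y))$: on inputs $x$ and $y$, the first party obtains $f_1(x, y)$ and the second party obtains $f_2(x, y)$.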

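2.1.2. Secure $n$-Party Computation

The above two-party computation can be extended to $n$-party computation. That is,
$$f(x_1, \ldots, x_n) = (f_1(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n)),$$
where the $i$-th party supplies $x_i$ and obtains $f_i(x_1, \ldots, x_n)$.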

2.1.3. Ideal SMC Model

The ideal SMC model is the most secure SMC model. It requires a trusted third party (TTP) that always tells the truth and never lies. If such a TTP exists, Alice (holding $x$) and Bob (holding $y$) can privately compute $f(x, y) = (f_1(x, y), f_2(x, y))$ as follows:
(1) Alice sends $x$ to the TTP.
(2) Bob sends $y$ to the TTP.
(3) The TTP computes $f(x, y)$.
(4) The TTP sends $f_1(x, y)$ to Alice and $f_2(x, y)$ to Bob.

Theoretically, this protocol can solve any SMC problem, but a TTP is not easily implemented in practice. Therefore, SMC protocols that do not need a TTP must be developed.

2.2. Semihonest Party

At present, SMC protocols are studied in either the semihonest model or the malicious model. A semihonest party follows the protocol faithfully but may retain all intermediate computations and try to deduce other parties’ private inputs from this record. Studying SMC in the semihonest model is the basis of studying SMC in the malicious model. The semihonest model is not only an important methodological tool but also an appropriate model for many settings, and in some settings it is sufficient to prove that a protocol is secure in the semihonest model. Furthermore, Goldreich et al. [2] have proven that, given a protocol that privately computes a function in the semihonest model, a protocol that privately computes the function in the malicious model can be produced automatically by introducing a bit commitment macro that forces each party either to behave in the semihonest manner or to be detected. Therefore, this work focuses on solutions in the semihonest model.

2.3. Simulation Paradigm

A protocol is considered secure if whatever a party can obtain from the execution of the protocol can also be computed from that party’s input and output alone. This is formalized by the simulation paradigm [9], in which a party’s view is simulated from its input and output of the protocol. In this case, the parties learn nothing beyond what follows from the output.

The simulation paradigm is a commonly used and widely accepted proof method for secure multiparty computation. Its principle is to compare the security of an SMC protocol with that of the ideal SMC protocol: the protocol is secure if it discloses no more information than the ideal SMC protocol does. The simulation paradigm is therefore regarded as the formal expression of this principle for evaluating the security of SMC protocols.

2.3.1. Simulation Paradigm for the Two-Party Case

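Suppose that $f$ is a probabilistic polynomial-time function. Alice holds $x$, and Bob holds $y$. They want to privately compute $f(x, y) = (f_1(x, y), f_2(x, y))$. Let $\pi$ be a protocol that computes $f$.

In the execution of $\pi$, Alice and Bob obtain the message sequences $\mathrm{view}_1^{\pi}(x, y) = (x, r^1, m_1^1, \ldots, m_t^1)$ and $\mathrm{view}_2^{\pi}(x, y) = (y, r^2, m_1^2, \ldots, m_t^2)$, respectively, where $r^i$ is the result of her (his) internal coin tosses and $m_j^i$ is the $j$-th message she (he) received. Their outputs are $\mathrm{output}_1^{\pi}(x, y)$ and $\mathrm{output}_2^{\pi}(x, y)$.

Definition 1 (see [9]). For a function $f$, $\pi$ privately computes $f$ if there exist probabilistic polynomial-time algorithms $S_1$ and $S_2$ such that
$$\begin{aligned} \{(S_1(x, f_1(x, y)), f_2(x, y))\}_{x, y} &\stackrel{c}{\equiv} \{(\mathrm{view}_1^{\pi}(x, y), \mathrm{output}_2^{\pi}(x, y))\}_{x, y}, \\ \{(f_1(x, y), S_2(y, f_2(x, y)))\}_{x, y} &\stackrel{c}{\equiv} \{(\mathrm{output}_1^{\pi}(x, y), \mathrm{view}_2^{\pi}(x, y))\}_{x, y}, \end{aligned} \tag{4}$$
where $\stackrel{c}{\equiv}$ denotes computational indistinguishability.
To prove that a multiparty computation protocol is secure, we must construct simulators $S_1$ and $S_2$ such that (4) holds.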

2.3.2. Simulation Paradigm for the $n$-Party Case

A semihonest party is one that follows the protocol properly, with the exception that it keeps a record of all its intermediate computations. Loosely speaking, a multiparty protocol privately computes if whatever a set (or a coalition) of semihonest parties can obtain after participating in the protocol could be essentially obtained from the input and output of these parties. Thus, the only difference between the current definition and the one used in the two-party case is that we consider the gain of a coalition (rather than of a single party) from participating in the protocol [9].

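Let $f: (\{0,1\}^{*})^{n} \rightarrow (\{0,1\}^{*})^{n}$ be an $n$-ary function, where $f_i(x_1, \ldots, x_n)$ denotes the $i$-th element of $f(x_1, \ldots, x_n)$. For $I = \{i_1, \ldots, i_t\} \subseteq \{1, \ldots, n\}$, we let $f_I(x_1, \ldots, x_n)$ denote the subsequence $f_{i_1}(x_1, \ldots, x_n), \ldots, f_{i_t}(x_1, \ldots, x_n)$. Let $\pi$ be an $n$-party protocol for computing $f$. The view of the $i$-th party during an execution of $\pi$ on $\bar{x} = (x_1, \ldots, x_n)$, denoted by $\mathrm{view}_i^{\pi}(\bar{x})$, is defined as in Definition 1, and for $I = \{i_1, \ldots, i_t\}$, we let $\mathrm{view}_I^{\pi}(\bar{x}) = (I, \mathrm{view}_{i_1}^{\pi}(\bar{x}), \ldots, \mathrm{view}_{i_t}^{\pi}(\bar{x}))$.

Definition 2 (see [9]). In case $f$ is a deterministic $n$-ary function, we say that $\pi$ privately computes $f$ if there exists a probabilistic polynomial-time algorithm, denoted by $S$, such that for every $I \subseteq \{1, \ldots, n\}$, it holds that
$$\{S(I, (x_{i_1}, \ldots, x_{i_t}), f_I(\bar{x}))\}_{\bar{x}} \stackrel{c}{\equiv} \{\mathrm{view}_I^{\pi}(\bar{x})\}_{\bar{x}}. \tag{5}$$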

2.4. Paillier Encryption Scheme

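The Paillier public-key cryptosystem [34] is probabilistic and additively homomorphic. The cryptosystem consists of the following three algorithms.

KeyGen. This algorithm generates two large primes $p$ and $q$ and sets $N = pq$. Let $\lambda = \mathrm{lcm}(p - 1, q - 1)$. The algorithm chooses $g \in \mathbb{Z}_{N^2}^{*}$ such that $\gcd(L(g^{\lambda} \bmod N^2), N) = 1$, where
$$L(u) = \frac{u - 1}{N}.$$

The public key is $(g, N)$, and the private key is $\lambda$.

Encryption. To encrypt a plaintext $m < N$, select a random number $r \in \mathbb{Z}_{N}^{*}$ and compute
$$c = E(m) = g^{m} r^{N} \bmod N^2.$$

Decryption. To decrypt the ciphertext $c$, compute
$$m = D(c) = \frac{L(c^{\lambda} \bmod N^2)}{L(g^{\lambda} \bmod N^2)} \bmod N.$$

Additive Homomorphism. For plaintexts $m_1$ and $m_2$,
$$E(m_1) \cdot E(m_2) \bmod N^2 = E(m_1 + m_2 \bmod N),$$
and consequently $E(m)^{k} \bmod N^2 = E(km \bmod N)$ for any integer $k \ge 0$.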
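As a rough illustration of key generation, encryption, decryption, and the additive homomorphism, the following Python sketch implements a toy Paillier instance. It is a sketch under stated assumptions, not a reference implementation: it assumes the common choice $g = N + 1$ (which the text does not prescribe), it stores the precomputed inverse `mu` next to $\lambda$ for convenience, it uses tiny primes that offer no security, and all function names are hypothetical.

```python
import math
import random

def L(u, N):
    # L(u) = (u - 1) / N, defined for u congruent to 1 modulo N.
    return (u - 1) // N

def keygen(p, q):
    N = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = N + 1                                          # assumed choice of g
    mu = pow(L(pow(g, lam, N * N), N), -1, N)          # inverse of L(g^lam mod N^2) mod N
    return (g, N), (lam, mu)

def encrypt(pub, m):
    g, N = pub
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:                         # r must be invertible modulo N
        r = random.randrange(1, N)
    return pow(g, m, N * N) * pow(r, N, N * N) % (N * N)

def decrypt(pub, priv, c):
    _, N = pub
    lam, mu = priv
    return L(pow(c, lam, N * N), N) * mu % N

pub, priv = keygen(61, 53)          # toy primes, far too small for real use
N2 = pub[1] ** 2
c1, c2 = encrypt(pub, 12), encrypt(pub, 30)
sum_plain = decrypt(pub, priv, c1 * c2 % N2)        # product of ciphertexts adds plaintexts
scaled_plain = decrypt(pub, priv, pow(c1, 5, N2))   # ciphertext power scales the plaintext
```

Multiplying two ciphertexts adds the underlying plaintexts, and raising a ciphertext to a power multiplies the plaintext by that power; both operations need only the public key.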

2.5. Vectorization Method

We use a vectorization method to solve the GT problem privately. The vectorization method encodes a number into a vector. Suppose that , is encoded into an -dimensional vector as follows:

The computational complexity of our protocol depends on . Note that . If is smaller, no matter how large is, it can be encoded into a low dimension vector. For example, suppose that and , ; then, the vector representation of is . The vector dimension is 7, which is much less than the bit number of . With this vectorization method and the additive homomorphism of the Paillier encryption scheme, we can easily solve the GT problem.

Yao is the pioneer of secure multiparty computation research. In his seminal paper entitled “Protocols for Secure Computations” [1] which introduced the famous millionaire’s problem and further ignited the secure multiparty computation research, he assumed that the numbers to be compared are members of a set . Noting that the domain size is usually small in many practical applications [35], this assumption is reasonable and it does not leak any information.

In many cases, is known to all parties in reality. For example, if some students want to know their score sorting, then is known to all parties; if two persons want to compare their ages, then may be ; if two workers of a company want to compare their wages, may be , but the wages of a company are often sparsely rather than densely distributed on . The wages can only be a few scales, and is small. Therefore, is known to them and, in all these cases, does not leak any information about their private data. Generally speaking, all the numbers compared in secure multiparty computation are comparable. If the numbers are comparable, the parties know their range. A common worker will never compare his wages with that of Bill Gates because they are not comparable. But we have to say that though is known, if is large, the computational complexity of the protocol will become high, and the protocol therefore becomes impractical.

3. Our Protocols

In this section, we propose protocols to solve the secure two-party and multiparty GT problems. In the two-party case, we propose Protocol 1, which uses the vectorization method to encode a plaintext number into a vector and the Paillier encryption scheme to encrypt the vector’s components. Protocol 1 determines whether $x > y$, $x = y$, or $x < y$ in one execution, where $x$ and $y$ are the two parties’ private inputs.

Although we can expand the two-party GT protocol to parties, the communication complexity is high, and it will disclose more information. Therefore, we propose Protocol 2 by using the vectorization method and the secret splitting method to solve the multiparty GT problem without using any encryption scheme based on computational assumptions. Protocol 2 can resist collusion attacks and is information-theoretically secure.

3.1. Secure Two-Party Computation for the GT Problem
3.1.1. A Protocol for the Two-Party GT Problem

Yao’s scheme [1] determines only whether $x \ge y$ or $x < y$ and therefore cannot distinguish $x > y$ from $x = y$. Our protocol distinguishes $x > y$, $x = y$, and $x < y$ in one execution.

In our solution, Alice (holding $x$) uses the vectorization method to encode $x$ into a vector and uses the Paillier encryption scheme to encrypt the vector. Bob (holding $y$) receives the encrypted vector, selects the element whose position corresponds to $y$, transforms that element into another ciphertext using the additive homomorphism of the Paillier encryption scheme, and returns the ciphertext to Alice. Alice decrypts the ciphertext and tells Bob the result.

The solution for two-party GT problem is as follows.

Protocol 1. Secure two-party computation of the GT problem.

Input. Alice’s input is , and Bob’s input is . So

Output. The relationship of and is denoted by
(1) With the Paillier encryption scheme, Alice generates the public key and private key and sends the public key to Bob.
(2) Following the vectorization method, Alice encodes into a vector: where
(3) Alice selects random numbers , and encrypts the vector using the Paillier encryption scheme: where ().
(4) Alice sends to Bob.
(5) Bob selects a random number . He chooses from such that and computes
(6) Bob sends to Alice.
(7) Alice decrypts and tells Bob the result : If , and . If , and . If , and .
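The message flow of Protocol 1 can be sketched in Python as follows. The concrete formulas of the protocol are not legible in this text, so the sketch fills them with loudly labeled assumptions: the encoding puts $1$, $0$, or $-1$ at each position depending on whether that domain element is below, equal to, or above Alice's number; Bob blinds the selected ciphertext by raising it to a random power below $N/2$ and multiplying by a fresh encryption of $0$; Alice reads the relation from the sign of the decrypted value. The helper names, the toy primes, and the choice $g = N + 1$ are likewise hypothetical.

```python
import math
import random

def keygen(p, q):
    N = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, N)          # with g = N + 1, L(g^lam mod N^2) = lam mod N
    return N, (lam, mu)

def encrypt(N, m):
    N2 = N * N
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(N + 1, m % N, N2) * pow(r, N, N2) % N2

def decrypt(N, priv, c):
    lam, mu = priv
    return (pow(c, lam, N * N) - 1) // N * mu % N

def alice_encode(domain, x):
    # Assumed encoding: 1 before x's position, 0 at it, -1 after it.
    k = domain.index(x)
    return [1 if i < k else 0 if i == k else -1 for i in range(len(domain))]

def bob_select(N, domain, enc_vector, y):
    # Bob picks the ciphertext at y's position and blinds it homomorphically.
    j = domain.index(y)
    r = random.randrange(1, N // 2)          # positive blinding factor below N/2
    return pow(enc_vector[j], r, N * N) * encrypt(N, 0) % (N * N)

def alice_decide(N, priv, reply):
    d = decrypt(N, priv, reply)              # d is 0, r, or N - r
    if d == 0:
        return "x = y"
    return "x > y" if d < N // 2 else "x < y"

def compare(domain, x, y, p=61, q=53):
    N, priv = keygen(p, q)                   # toy primes, no real security
    enc_vector = [encrypt(N, v) for v in alice_encode(domain, x)]
    reply = bob_select(N, domain, enc_vector, y)
    return alice_decide(N, priv, reply)

domain = [3, 8, 20, 45, 70]                  # public, sorted set of possible values
results = [compare(domain, 20, y) for y in (8, 20, 70)]
```

Under these assumptions Alice performs one encryption per vector component and one decryption, Bob needs only the public key, and a single exchange yields one of three outcomes.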

3.1.2. Accuracy and Security

(1)Alice computes ; that is, such that . Since , , if which implies , so . If which implies , then . If which implies , , then .(2)When Alice receives from Bob, she does not know how to compute because she does not know , so is private for Bob. Therefore, is private for Bob.When Bob obtains the result or , he cannot know which equals or , so is private for Alice.(3)Using Protocol 1, we can determine the relationship between and in one execution, especially that Yao’s solution [1] cannot determine directly.

The following theorem refers to the privacy-preserving property of this protocol.

Theorem 3. Protocol 1 for the two-party GT problem is secure in the semihonest model.

Proof. We begin by constructing simulators $S_1$ and $S_2$ such that (4) is satisfied.
In the protocol, where are inputs, is Alice’s encryption result, is Alice’s decryption result, is Bob’s computation result, and is the output.
proceeds as follows:(1)By , randomly selects a number such that . constructs the vector .(2)Using the Paillier encryption scheme, encrypts using different random numbers : (3) selects a random number and computes , ().(4) decrypts . By comparing with , determines .Let Since , and , then Similarly, we can construct , such that

3.2. Secure Multiparty Computation Protocol for the GT Problem (Secure Sorting Problem)

In practice, more than two parties want to privately compare all of their numbers. For example, some companies want to compare their turnovers, but they do not want to disclose their data, so they prefer a secure protocol that can sort their turnovers. As another example, students desire to know their own score’s rank without publishing their scores, and thus they must privately determine the order of their own score.

To solve the secure multiparty GT problem (i.e., secure sorting problem), we may use Protocol 1 to compare pairwise, but this process is of high complexity and will disclose too much information to other parties. Therefore, we propose a more efficient protocol as follows.

3.2.1. Secure Sorting Problem Model

The set is defined as , where (, and is the number of the parties) holds his own number , .

Consider , where is the order of .

A protocol is proposed to compute privately. In the protocol, only knows and cannot know other parties’ orders and any other information.

3.2.2. A Protocol for the Multiparty GT Problem

In our protocol, , encodes his own number into a vector by the vectorization method as follows. where

All parties’ form a matrix (Table 1). If wants to know his order, he can add the components of th column vector. For example, Table 1 shows the vectorization method and all parties’ orders.

However, if sends to other parties directly, will be disclosed. We utilize the secret splitting method to guarantee his privacy. splits into vectors . He keeps secret and sends to other parties. In the end, will receive vectors () that other parties send to him, as shown in Figure 1. computes .

All parties publish their . If wants to know his order, he adds all of , that is, , and .

Protocol 2. Secure multiparty computation of the GT problem.

Input. The inputs of are .

Output. Consider .
(1) All parties encode their own numbers with the vectorization method. Take for example: where
(2) randomly splits into vectors {}, as follows: where all are random numbers that satisfy
(3) keeps his secret and sends to other parties separately. In the end, each party holds vectors; that is, holds and adds the corresponding column vectors, denoted by .
(4) All parties publish their . According to , adds all () of , denoted by .
(5) computes as his order.
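The steps of Protocol 2 can be sketched in Python as follows. Because the encoding and splitting formulas are not legible in this text, the sketch rests on explicit assumptions: each party's vector has a 1 at every domain position that is larger than its own number, the splitting is additive modulo an assumed constant `MOD`, and a party's order is one plus the total of its own column, so ranks ascend from the smallest number. All names are hypothetical.

```python
import secrets

MOD = 2**31 - 1  # assumed share modulus; any value above the party count works

def encode(domain, value):
    # Assumed encoding: entry j is 1 when domain[j] is larger than the value.
    return [1 if u > value else 0 for u in domain]

def split(vector, parts):
    # Additive splitting into `parts` (at least 2) vectors that sum to `vector` mod MOD.
    shares = [[secrets.randbelow(MOD) for _ in vector] for _ in range(parts - 1)]
    last = [(v - sum(col)) % MOD for v, col in zip(vector, zip(*shares))]
    return shares + [last]

def add(vectors):
    # Component-wise sum of several vectors modulo MOD.
    return [sum(col) % MOD for col in zip(*vectors)]

def secure_sort(domain, values):
    n = len(values)
    # shares[i][j] is the piece party i hands to party j; party i keeps shares[i][i].
    shares = [split(encode(domain, v), n) for v in values]
    # Party j adds the piece it kept and the pieces it received, then publishes the sum.
    published = [add([shares[i][j] for i in range(n)]) for j in range(n)]
    total = add(published)
    # Party i reads only the column that matches its own number.
    return [total[domain.index(v)] + 1 for v in values]

domain = [10, 20, 30, 40, 50]        # public, sorted set of possible values
ranks = secure_sort(domain, [30, 10, 50, 20])
```

The only arithmetic is modular addition, and no single published vector equals any party's encoded vector, because each one mixes random pieces from every party.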

3.2.3. Accuracy and Security

(1) In this protocol, we use the secret splitting method to split a vector into vectors. Each party keeps his secret. Without , others cannot conspire to compute .
(2) Protocol 2 does not require any encryption scheme based on computational assumptions, so it is information-theoretically secure.
(3) adds the -th column vectors of , and other parties do not know which column vector selects.
(4) In a particular case, if holds the number and his order is also , he confirms that someone holds the number , but he does not know who holds the number . This situation does not imply that the protocol is insecure, because this information is exactly what the computed function leaks. Even in the ideal SMC protocol, when the TTP tells that his rank is 2, the same information leaks, because it follows from the output of the function.
(5) The protocol cannot be applied to the two-party situation, because, in this case, in the execution of Protocol 2 for two parties, Alice splits to vectors , and Bob splits to vectors . Alice sends to Bob, and Bob sends to Alice. Bob computes and sends to Alice. Because Alice has the vectors and , she can compute and compute , so she obtains Bob’s . Therefore, Protocol 2 is not applicable to the two-party GT problem.
(6) In this protocol, we do not consider attacks during transmission. If necessary, we can use the Paillier encryption scheme [34] to encrypt and , although this increases the computational and communication costs.
(7) In many cases, only a limited number of parties take part in sorting their numbers. In the cryptographic literature, "multiparty" usually refers to 3 to 5 parties, and few works consider more than 10 parties. In practice, one ranks at most the 100 richest people in the world, the top 100 banks in the world, or the top 500 companies worldwide. Therefore, the number of parties in the above cases does not matter much.
(8) Protocol 2 can resist collusion attacks, which is proved by the following Theorem 4.
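The two-party leakage described in item (5) can be checked with a short Python sketch. It assumes additive splitting modulo an arbitrary constant `MOD` (the exact splitting formula is not legible in this text), and the function names are hypothetical. The argument does not depend on how the numbers are encoded: Alice subtracts the piece she sent from Bob's published sum to obtain the piece Bob kept, then adds the piece Bob sent her.

```python
import secrets

MOD = 2**31 - 1  # assumed share modulus

def split_in_two(vector):
    # Two random-looking vectors that sum to `vector` modulo MOD.
    first = [secrets.randbelow(MOD) for _ in vector]
    second = [(v - f) % MOD for v, f in zip(vector, first)]
    return first, second

def bob_published_sum(bob_kept, from_alice):
    # Bob adds the piece he kept to the piece Alice sent him and publishes it.
    return [(b + a) % MOD for b, a in zip(bob_kept, from_alice)]

def alice_recovers_bob(published, sent_to_bob, received_from_bob):
    # Removing Alice's own piece from the published sum exposes Bob's kept piece;
    # adding the piece Bob sent her then rebuilds Bob's whole vector.
    bob_kept = [(s - a) % MOD for s, a in zip(published, sent_to_bob)]
    return [(k + r) % MOD for k, r in zip(bob_kept, received_from_bob)]

alice_vec = [0, 0, 1, 1, 1]          # illustrative encoded vectors
bob_vec = [0, 1, 1, 1, 1]
alice_kept, alice_sent = split_in_two(alice_vec)
bob_kept, bob_sent = split_in_two(bob_vec)
published = bob_published_sum(bob_kept, alice_sent)
recovered = alice_recovers_bob(published, alice_sent, bob_sent)
```

With three or more parties this reconstruction fails for any coalition that excludes the target, because the published sum also contains pieces the coalition never sees.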

The following theorem states the privacy-preserving property of the protocol.

Theorem 4. Protocol 2 for the multiparty GT problem is secure in the semihonest model.

Proof. In Protocol 2, even though some adversaries obtain ’s vectors , they cannot compute . Because all vectors sum to , while keeps secret, the adversaries cannot obtain . Therefore, the security of Protocol 2 is the same as the security of a protocol with a TTP in the ideal SMC model. We prove the privacy-preserving property of Protocol 2 using the following formal proving method (Definition 2).
We consider the following cases.
Case 1. does not participate in collusion, while other participants conspire to compute ’s . By executing Protocol 2, the set can obtain and . Since and are private, the coconspirators cannot obtain and . So they do not conspire . We can construct a simulator such that (5) holds.(i)By the result of the protocol, the coconspirators choose and such that(ii)The set will execute the protocol , determining their as follows:(iii)Let Therefore, in this case, the protocol is secure.
Case 2. and the participants conspire to obtain the information of other participants .
Case 2.1. If , suppose , ’s order except can be obtained, which is the same as the case mentioned above in part () and part () of Section 3.2.3. Even in the ideal SMC protocol, the TTP tells the orders of and , and and can conspire ’s order . Therefore, the protocol does not leak more information than that the ideal SMC model leaks, and the protocol is secure.
Case 2.2. If , and are private, and and cannot know and , which is the same as Case 1. In this case, the protocol is secure. This is true for .
To sum up, in any case, there exists a simulator such that (5) holds, so Protocol 2 is secure.

To help the readers understand the protocol, we give a toy example.

Example

Input. , .

Output. .(1) encode their number’s vectors as follows (set ):(2) split their own vector into four random vectors, respectively:(3) keeps secret and sends to , respectively. keeps secret and sends to , respectively. keeps secret and sends to , respectively. keeps secret and sends to , respectively.(4) holds . holds . holds . holds .(5) computes . computes . computes . computes .(6) publish their , respectively.(7) By , selects .By , selects .By , selects .By , selects .(8) computes his rank . computes his rank . computes his rank . computes his rank .

4. Computational and Communication Complexity

4.1. Computational Complexity

The computational complexity is an important measure for evaluating the efficiency of an SMC protocol.

For secure two-party computation of the GT problem, we compare the computational complexity of Protocol 1 with that of the following typical solutions. The computational cost of [1] is exponential, which is impractical if the two numbers are very large. Furthermore, the protocol only compares or . In [25], which also computes only or , the computational cost is modular multiplications, where is the bit number of inputs and is the modulus of the Paillier encryption scheme. In [26], Lin and Tzeng used the multiplicative homomorphism of the ElGamal encryption scheme to solve the GT problem for or , the computational cost of which is modular multiplications. In addition, some researchers proposed SMC protocols for the GT problem based on linear secret sharing. Nishide and Ohta [29] constructed a simplified bit-decomposition protocol for the GT problem, which cannot determine .

In the proposed Protocol 1, Alice encodes her number into an -dimensional vector. She encrypts components of the encoding vector and computes -time Paillier encryption and one-time Paillier decryption. Bob computes one-time Paillier encryption. The computational cost of each Paillier encryption or decryption is modular multiplications. Therefore, the computational overhead of Protocol 1 is modular multiplications, where is the dimension of the encoding vector and is the modulus of the Paillier encryption scheme. Table 2 compares the computational complexities of the proposed protocol and the existing protocols for the two-party GT problem.

Table 2 shows that Protocol 1 can determine whether or in one execution, whereas the other three solutions can only determine whether or . In addition, is much smaller than , because is the dimension of the encoding vector, whereas is the bit number of inputs. For example, and , , , , , , ; thus, its encoding vector is , so . But the bit number of is . Therefore, Protocol 1 is more efficient than these three existing solutions.

For secure multiparty computation of the GT problem (secure sorting problem), the solution of [1] is not suitable for the multiparty GT problem, and we compare the computational cost of Protocol 2 with that of [27, 28] which are typical protocols for the secure sorting problem. Paper [27] proposed an SMC protocol for the sorting problem based on the secret sharing scheme. The computational cost of [27] is ( is the length of the shared values) modular multiplications. Paper [28] proposed a sorting protocol based on the secret sharing method, with a computational cost of modular multiplications, where is the number of parties and is the threshold value.

In Protocol 2, we simply use the vectorization and the secret splitting methods, which requires additions, where is the encoding vector dimension and is the number of parties. For example, when , the computational cost is 243 additions, which is much smaller than that using any public-key encryption scheme. Therefore, the computational cost of Protocol 2 is negligible. Table 3 summarizes these computational complexities for the multiparty GT problem.

4.2. Communication Complexity

The communication complexity, measured by the number of communicating rounds, is another important measure for evaluating the efficiency of an interactive protocol. For secure two-party computation of the GT problem, [1] requires two rounds, and [25] requires four rounds. Nishide and Ohta [29] constructed a simplified bit-decomposition protocol for the GT problem, whose communication cost is 15 rounds, whereas Protocol 1 requires only one round.

For secure -party computation of the GT problem, [25] requires rounds, and [28] requires rounds. Protocol 2 also requires rounds.

Table 4 summarizes the communication complexities of these schemes for both types of the GT problem.

5. Conclusion

In this work, we propose SMC protocols for the two-party and multiparty GT problems. Using the Paillier encryption scheme and the vectorization method, we construct an SMC protocol for the two-party GT problem that determines whether $x > y$, $x = y$, or $x < y$ in one execution. To efficiently solve the multiparty GT problem (secure sorting problem), we use the vectorization method to transform private numbers into vectors and use the secret splitting method to resist collusion attacks without any encryption scheme based on computational assumptions. This protocol is information-theoretically secure. The privacy-preserving property of the protocols is proved in the semihonest model using the simulation paradigm. A comparison of the computational and communication complexities with those of the existing solutions shows that our protocols are efficient for practical applications and easy to implement.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research is supported by the Natural Science Foundation of China (Grant nos. 61272435, 61272514, 61261028, and 61562065), the Natural Science Foundation of Inner Mongolia (Grant no. 2017MS0602), the University Scientific Research Project of Inner Mongolia (Grant no. NJZY17166), and Fundamental Research Funds for the Central Universities (Grant no. 2016TS061). The authors thank the sponsors for their support.