Abstract

Wireless networks suffer from many security problems, and computation in a wireless network environment may fail to preserve privacy as well as correctness when adversaries conduct attacks through backdoors, steganography, kleptography, etc. Secure computation ensures execution security in such an environment; compared with computation on plaintext, its performance is bounded by the underlying cryptographic algorithms and the network environment between the involved parties. Besides, the Chinese cryptography laws require cryptographic algorithms used in the commercial market to be authorized. In this work, we show how to implement oblivious transfer (OT), an important primitive in secure multiparty computation (MPC), using the Chinese government-approved SM2 and SM3 algorithms. The SM2 algorithm is based on elliptic curve cryptography and is much faster than the discrete logarithm-based solutions. Moreover, by adopting the standard OT extension technique, we can extend the number of OTs efficiently with one more round of communication and invocations of the SM3 and SM4 algorithms. The OT primitive can be used in Beaver multiplication triple generation and other MPC protocols, e.g., private set intersection. Therefore, we can utilize the SM series cryptography, specifically the SM2, SM3, and SM4 algorithms, to build highly efficient secure computation frameworks which are suitable for the wireless network environment and for commercial applications in China. The experimental evaluation shows that our protocols have performance comparable to existing protocols; in particular, they are well suited to poor network conditions.

1. Introduction

A wireless network (WLN) enables devices to communicate with each other without cable connections, and it is a major component of the modern Internet. With the advancement of Internet of things (IoT) techniques, a vast number of wireless networks are being deployed [1]. However, wireless networks can be quite vulnerable [2, 3]: an adversary may eavesdrop on or alter the communication in the network. When several parties want to perform a joint computation in a wireless network with potential adversaries, the correctness of the computation and the privacy of the inputs can easily break down. Although efforts have been made to avoid or mitigate the security threats in wireless networks [4–6], many security issues remain. Therefore, we need privacy-enhancing technologies to ensure security in such a network environment.

In this work, we leverage a cryptographic technique called secure multiparty computation (MPC) [7, 8] to construct an efficient and provably secure computation framework that protects parties’ privacy in a network potentially controlled by the adversary. The goal of MPC is to design protocols that enable several mutually untrusted parties to jointly compute a function on their private inputs without revealing anything except the function output. Typically, the computational security of an MPC protocol relies on some computational or setup assumptions. Therefore, if there is an adversary that can break the security of the protocol, either the adversary has unbounded computation power or the computational assumption does not hold.

We focus on the oblivious transfer (OT) primitive, which is complete for MPC [9] and is a fundamental building block for many MPC protocols. Naor and Pinkas [10] provide the first efficient OT protocol, and Chou and Orlandi [11] propose the simplest OT construction. In [12], Masny and Rindal initiate the study of a new OT primitive called endemic OT. Endemic OT is weaker than the commonly used OT in the sense that a corrupted party is able to control its output message; the same definition has also been considered by Garg et al. [13]. As demonstrated by Masny and Rindal [12], endemic OT can be constructed from suitable key agreement protocols in the programmable random oracle model. Later, McQuoid et al. [14] improve the efficiency of the 1-out-of-N endemic OT protocol using a programmable-once public function (POPF). They also notice some security issues in the batch setting and provide a proper treatment in [15].

In an implementation, the main factor affecting the performance of an MPC protocol is the cryptographic primitives involved, whose performance depends on the underlying cryptographic assumptions. A wide range of computational MPC protocols are built on asymmetric-key cryptosystems. Among asymmetric-key cryptosystems, Rivest-Shamir-Adleman (RSA), based on integer factorization, and elliptic curve cryptography (ECC), based on the discrete logarithm problem, are two of the most important in use. Evaluation results show that ECC has great advantages over RSA in both computation time [16] and resource consumption [17]. In 2010, the Chinese State Cryptography Administration announced the public key cryptographic algorithm SM2 [18], which is based on elliptic curve cryptography, along with several other SM series algorithms. According to the Chinese cryptography laws [19], it is mandatory to adopt cryptographic algorithms that have been authorized for commercial use in China, e.g., the SM algorithms.

1.1. Our Contribution

The contributions of this work can be summarized as follows. First, we construct an endemic OT protocol based on the SM2 key agreement protocol. Moreover, we build several MPC protocols on top of the endemic OT protocol, including a two-party secure computation protocol on Boolean circuits, a multiparty secure computation protocol on arithmetic circuits, and a two-party private set intersection (PSI) protocol. Our constructions consider both efficiency and availability and only use the SM series cryptography. The security of our protocols can be proved in the random oracle model and in the public key infrastructure (PKI) setting. Since a PKI is used, the parties can communicate without the secure channel functionality, and in implementation, the parties can transmit messages without the TLS protocol. To the best of our knowledge, we are the first to propose a secure computation framework that complies with the Chinese national standards and regulations.

2. Related Work

Several secure computation solutions have been proposed for wireless networks. We categorize them into low-communication MPC and hardware-based computation.

Garay et al. [20] investigated the feasibility of designing protocols with sublinear communication complexity. Their work enables large-scale secure computation in a communication-restricted environment. Moreover, Gentry et al. proposed a communication model called YOSO [21], in which each party only sends one message to others. In YOSO MPC, only a small fraction of parties execute and communicate in each round; therefore, its communication complexity is also sublinear in the total number of involved parties. Fully homomorphic encryption (FHE) is a widely used primitive in low-communication MPC protocol design. Asharov et al. [22] constructed an MPC protocol using threshold FHE. The proposed protocol only needs two or three communication rounds depending on the underlying assumption, and its communication cost is independent of the function to be computed. López-Alt et al. [23] and Mukherjee and Wichs [24] considered using multikey FHE and an untrusted third-party server in their constructions; the proposed protocols achieve a minimal communication complexity which is independent of the function and the number of parties. All these FHE-based protocols have the property that the communication size only depends on the input/output size; however, using FHE dramatically increases the computation burden. Another way to reduce communication in the protocol execution is to offload the major communication workload to a preprocessing phase. Damgård et al. proposed the celebrated SPDZ computation framework [25], in which a batch of authenticated triples is generated in the preprocessing phase and consumed in the online phase. The authenticated garbling protocols proposed by Wang et al. [26, 27] also use authenticated triples to speed up the online computation. Specifically, these garbled circuit-based protocols have constant round complexity, which makes them more suitable for wireless networks. Carter et al. [28] noticed that a remote server can be used to instantiate the preprocessing phase. However, in such a server-aided setting, the adversaries are allowed to corrupt the cloud server.

Hardware-based computation has much in common with server-aided computation, and the efficiency of the protocols varies with the parties’ trust in the hardware. Since the hardware can have a fast connection with the computing parties, hardware-based computation is typically much faster than other solutions [29–31]. Generally, hardware-based computation uses a hardware token or trusted hardware issued by the parties or a hardware manufacturer, and the parties can use the hardware to generate preprocessing information or even directly compute the function. These hardware-based protocols assume that the hardware is tamper-resistant or tamper-proof, while in practice, there have been works that successfully break the security of some commonly used hardware [32, 33].

3. Preliminaries

3.1. Notations

Throughout this paper, we use the following notations and terminologies. Let be the computational security parameter, and be the statistical security parameter. Denote a binary matrix with rows and columns as . When and are two-bit strings, is the concatenation of them. When is a bit string, a vector, or an array, is its -th element. Denote the set by , let denote , and let denote the empty set. When a set is used, we assume the elements are arranged by the indexes as . When is a set, stands for sampling uniformly at random from , and stands for the size of in terms of the number of elements. When is a matrix, denotes its -th column, and denotes its -th row. When is a randomized algorithm, stands for running on input with a fresh random coin ; when needed, we denote as running on input with the explicit random coin . Let and be a polynomially bounded function and negligible function, respectively. We assume each party has a unique PID. For readability, we refer as the PID for the party . We abbreviate “probabilistic polynomial time” as PPT and “interactive Turing machine” as ITM.

3.1.1. Elliptic Curve Cryptography Notation

In this work, we work on the finite field , and the elliptic curve is defined by two elements . The set of all the points on is denoted as . is the base point of with order , and is the cofactor.

3.2. Security Definition

Our security model is based on the universal composability (UC) framework [34], which lays down a solid foundation for designing and analyzing protocols that are secure against attacks in an arbitrary network execution environment (therefore, it is also known as a network-aware security model). We refer to the original work [34] for details.

Roughly speaking, in the UC framework, protocols are carried out over multiple interconnected machines; to capture attacks, a network adversary is introduced, which is allowed to partially control the communication network and corrupt some machines (i.e., have full control of all physical parts of some machines). Then, a protocol $\Pi$ is a UC-secure implementation of a functionality $\mathcal{F}$ if, for every network adversary $\mathcal{A}$ attacking an execution of $\Pi$, there is another adversary $\mathcal{S}$, known as the simulator, attacking the ideal process that uses $\mathcal{F}$ (by corrupting the same set of machines) such that the execution of $\Pi$ with $\mathcal{A}$ and that of $\mathcal{F}$ with $\mathcal{S}$ make no difference to any network execution environment $\mathcal{Z}$.

3.2.1. The Ideal World

In the ideal world, the parties only communicate with an ideal functionality during the execution. As depicted in Figure 1, the functionality waits for each party to provide its input, and when all parties’ inputs have been received, it computes the function and sends each party its output. Besides, the functionality interacts with the simulator. When a party sends its input to the functionality, the simulator receives a notification. Before the functionality delivers an output, it asks the simulator for permission, and it only sends the output to the party once the simulator approves.

3.2.2. Adversary Models

There are two main adversary models. A semihonest adversary follows the protocol description and a malicious adversary can deviate from the protocol description arbitrarily. Both adversaries try to help the environment distinguish between the ideal world and the real world by learning more information from the protocol execution.

3.2.3. Model of Protocol Execution

In the protocol execution, an environment provides inputs to the parties and receives outputs from them. Moreover, it can interact with the adversary freely. At the end of the protocol, outputs a binary variable. Let denote the output variable of in an execution of protocol with environment and adversary on input , and denote the ensemble . We use when protocol is in the -hybrid model, i.e., can be invoked in . We slightly abuse notation and use for the ideal execution.

3.2.4. Random Oracle

A random oracle [35] is an idealized hash function that can be publicly accessed. In the random oracle model, the random oracle maintains a table of the previous queries. For a query with input , the random oracle first checks if is recorded. For an unrecorded , the random oracle chooses an element from its output domain uniformly at random and responds with this element, and it then records and the corresponding response; for a recorded , the random oracle simply responds with the recorded response.
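The lazy-sampling behavior described above can be sketched in a few lines of Python. This is only an illustration: the class name `RandomOracle`, the method name `query`, and the 32-byte output length are assumptions made for the example, not notation from this paper.

```python
import os


class RandomOracle:
    """Lazily sampled random oracle with a fixed output length (in bytes)."""

    def __init__(self, out_len: int = 32):
        self.out_len = out_len
        self.table = {}  # previous queries and their recorded responses

    def query(self, x: bytes) -> bytes:
        # Unrecorded input: sample a uniformly random response and record it.
        if x not in self.table:
            self.table[x] = os.urandom(self.out_len)
        # Recorded input: answer with the recorded response.
        return self.table[x]
```

Repeated queries on the same input return the same value, while fresh inputs receive independent uniform values, which is exactly the table-based behavior used in the random oracle model.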

3.2.5. Public Key Infrastructure

A public key infrastructure (PKI) links a party’s public identity with its public key. In this work, we use a PKI to guarantee the authenticity and validity of a party’s public key and further ensure the security of the communication. In such a PKI setting, the parties can communicate with each other without requiring an underlying secure channel functionality [36]. In implementation, the parties can send messages without the TLS protocol, even over an insecure network that may be eavesdropped on or tampered with.

3.3. One-Round Key Agreement Protocol

Key agreement (KA) protocols allow two parties and to jointly establish a key known to no one else. We use a similar notation as in [14], which considers two-round key agreement protocols, while we focus on one-round key agreement protocols. An illustrative example is provided in Figure 2, and the protocol involves the following parameters:
(i) is the set of randomness used by the parties
(ii) and are ’s message generation function and ’s message generation function, respectively
(iii) and are the set of ’s protocol message and the set of ’s protocol message, respectively
(iv) and are ’s key generation function and ’s key generation function, respectively
(v) is the set of output keys

In a one-round key agreement protocol , party picks random and computes , and it sends to ; in the meanwhile, party also picks random and computes , and it sends to . At the end, and establish the same key by computing and , respectively. Throughout the protocol, the computational security parameter is used implicitly as a parameter of the algorithms.
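As a minimal sketch of the one-round message flow, the following Python example uses plain Diffie-Hellman over the integers modulo the Mersenne prime 2^127 - 1. This toy group, the generator 3, and the function names `sample_randomness`, `msg`, and `key` are assumptions chosen for illustration only; the protocol in this paper is instantiated with SM2 instead, and the toy parameters offer no real security.

```python
import secrets

# Toy group for illustration only: integers modulo the Mersenne prime 2^127 - 1.
P = 2**127 - 1
G = 3


def sample_randomness() -> int:
    """Pick a party's private randomness."""
    return secrets.randbelow(P - 2) + 1


def msg(r: int) -> int:
    """Message generation; both parties use the same function in this toy."""
    return pow(G, r, P)


def key(r: int, other_msg: int) -> int:
    """Key generation from own randomness and the other party's message."""
    return pow(other_msg, r, P)
```

Because neither message depends on the other, both parties can send at the same time (one round), and `key(a, msg(b)) == key(b, msg(a))` holds for any randomness `a` and `b`, which is the correctness requirement.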

We require the protocol to have the following properties:

Definition 1 (Correctness). A one-round KA protocol is correct if for any , and ,

Definition 2 (Security). A one-round KA protocol is secure if for any PPT distinguisher ,

Definition 3 (Uniformity). A one-round KA protocol is -uniform if for any PPT distinguisher , Likewise, we can define uniformity.

Definition 4 (Robustness). A one-round KA protocol is robust if for any PPT distinguisher ,

3.4. Programmable-Once Public Function

The primitive programmable-once public function (POPF) is proposed by McQuoid et al. [14]. Later in [15], they fixed some issues in the definition and formally defined a batch 2-POPF. In our endemic OT protocol, we use a N-POPF, and in its multi-instance variant, we use a batch N-POPF; therefore, we provide a formal definition of a batch N-POPF here.

A batch N-POPF consists of two algorithms: and . Programmable-once means that one can compute for an and , but for any other , the value of such that should be unpredictable, i.e., looks like random. This unpredictability is defined with respect to a 1-weak random oracle which produces a pseudorandom when is only allowed to be accessed once.

Definition 5 (1-weak random oracle). A function is a 1-weak random oracle if for any PPT distinguisher , can only access through this experiment.

Now we can formally define a batch N-POPF. Generally, a batch N-POPF makes use of some local setups , which can consist of random oracles, common reference strings, etc. We use and to denote the algorithms when they access . Besides, a batch N-POPF should include two alternative local setups:

: this setup provides the same interface as and an additional method .

: this setup provides the same interface as and an additional method .

We require the batch N-POPF to have the following properties:

Definition 6 (Correctness). A batch N-POPF is correct if for any ,

Definition 7 (Honest simulation). A batch N-POPF has honest simulation if for any PPT distinguisher and PPT adversary ,

The honest simulation property captures the batch N-POPF’s ability of hiding : when , is generated using and when , is generated from random . If and are indistinguishable even when is given, we can say that does not leak the information of .

Definition 8 (Uncontrollable outputs). A batch N-POPF has uncontrollable outputs if for any 1-weak random oracle any PPT distinguisher and PPT adversary ,

The uncontrollable output property restricts the adversary to only be able to program once. Given any produced by the adversary , the method finds an such that for , the value of is unpredictable.

Moreover, when the batch N-POPF has an honest simulation and uncontrollable outputs, and the interface of and looks indistinguishable from the adversary, we can say that the batch N-POPF is secure.

In this work, we use a correct and secure batch N-POPF which is constructed by McQuoid et al. in [15]. For simplicity, we denote as . The hash functions are defined as and are modeled as random oracles. We provide the details of the batch N-POPF in Figure 3, and we have the following theorem from [15]:

Theorem 9 (See [15]). Figure 3 defines a correct and secure batch N-POPF.

3.5. Oblivious Transfer

Oblivious transfer (OT) is a cryptographic primitive that allows a receiver to choose and obtain some messages from a set of messages held by a sender, while the sender is not aware of the receiver’s choice, and the receiver learns nothing about the unchosen messages. The messages held by the sender can be contributed by the sender itself or generated by the OT functionality. We refer to an OT functionality in which the sender decides the messages as sender OT, and to an OT functionality which generates the messages for the sender as random OT. In [10], Naor and Pinkas provide the first efficient OT implementation; in [11], Chou and Orlandi propose the simplest OT protocol for random OT and show how to transform it into a sender OT protocol. However, neither [10] nor the random OT protocol in [11] provides full simulation-based security.

In this work, we make extensive use of a special notion of OT called 1-out-of-N endemic OT, which is proposed in [12]. Endemic OT is essentially the same as random OT, except that the endemic OT functionality allows the adversary to determine the corrupted party’s messages. As depicted in Figure 4, the 1-out-of-N endemic OT functionality waits for a request from the sender and a request from the receiver, where the latter contains the receiver’s choice. After both requests are obtained, the functionality picks N uniformly random messages. However, when the adversary corrupts the sender, it is allowed to determine all the messages, and when the adversary corrupts the receiver, it is allowed to determine the chosen message. At the end, the functionality sends all N messages to the sender and the chosen message to the receiver. We also provide the standard random OT functionality in Appendix A.1.
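The behavior of the endemic OT functionality can be sketched as a single Python function. This is an illustrative model only: the function name `endemic_ot`, its parameters, and the way corruption is expressed through optional arguments are assumptions for the example and do not reproduce the message interface of Figure 4.

```python
import secrets


def endemic_ot(n, choice, msg_len=16, sender_msgs=None, receiver_msg=None):
    """Toy ideal 1-out-of-n endemic OT.

    sender_msgs:  if given, models a corrupted sender fixing all n messages.
    receiver_msg: if given, models a corrupted receiver fixing its chosen message.
    Returns (sender_output, receiver_output).
    """
    if not 0 <= choice < n:
        raise ValueError("choice must be in [0, n)")
    # Honest case: the functionality samples every message uniformly at random.
    msgs = [secrets.token_bytes(msg_len) for _ in range(n)]
    if sender_msgs is not None:
        if len(sender_msgs) != n:
            raise ValueError("a corrupted sender must supply exactly n messages")
        msgs = list(sender_msgs)
    if receiver_msg is not None:
        msgs[choice] = receiver_msg
    return msgs, msgs[choice]
```

In the honest case the outputs are uniformly random, as in random OT; the two optional arguments capture the extra power that endemic OT grants to a corrupted sender or receiver.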

3.6. Private Set Intersection

Private set intersection (PSI) is a specialized MPC problem. In PSI, two parties want to compute the intersection of their input sets without revealing the content of their inputs. As described in Figure 5, the two parties send their input sets to the functionality, which computes the intersection and sends it to the party designated to receive the output. PSI can be solved using generic MPC techniques, like the GMW protocol [37] and Yao’s garbled circuit protocol [38], while there are also custom protocols for this problem that are more efficient.

4. SM Series Cryptography

4.1. SM3 Hash Function and Key Derivation Function

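Let $H$ denote the SM3 hash function that maps an arbitrary-length string to a 256-bit hash digest, i.e., $H: \{0,1\}^* \to \{0,1\}^{256}$. Let $\mathsf{KDF}$ denote the key derivation function that takes as input a string $Z$ and a key length $klen$, and outputs a $klen$-bit key string $K$. In this work, we implement $\mathsf{KDF}$ with the SM3 hash function $H$, and the details are shown in Algorithm 1.

1. Set $ct := \texttt{0x00000001}$ (a 32-bit counter);
2. For $i = 1, \ldots, \lceil klen/256 \rceil$:
 (a) Compute $\mathsf{Ha}_i := H(Z \,\|\, ct)$;
 (b) Set $ct := ct + 1$;
3. If $256 \nmid klen$, set $\mathsf{Ha}'$ as the top $klen - 256 \cdot \lfloor klen/256 \rfloor$ bits of $\mathsf{Ha}_{\lceil klen/256 \rceil}$; otherwise, set $\mathsf{Ha}' := \mathsf{Ha}_{\lceil klen/256 \rceil}$;
4. Set $K := \mathsf{Ha}_1 \,\|\, \cdots \,\|\, \mathsf{Ha}_{\lceil klen/256 \rceil - 1} \,\|\, \mathsf{Ha}'$;
5. Output $K$.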

The security of directly follows the security of .
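The counter-mode structure of the key derivation function can be sketched as follows. Two assumptions are made loudly here: SHA-256 from the Python standard library is used as a stand-in for SM3 (both have 256-bit digests, but they are different hash functions, so the outputs do not match real SM3-based keys), and the function name `kdf` and its parameters are chosen for the example.

```python
import hashlib


def kdf(z: bytes, klen_bits: int, hash_fn=hashlib.sha256) -> bytes:
    """Counter-mode KDF sketch; hash_fn is a stand-in for SM3 (assumption)."""
    if klen_bits <= 0:
        raise ValueError("klen_bits must be positive")
    v = hash_fn().digest_size * 8       # digest length in bits (256 here)
    n_blocks = -(-klen_bits // v)       # ceil(klen / v)
    ct = 1                              # 32-bit counter, starts at 0x00000001
    blocks = []
    for _ in range(n_blocks):
        blocks.append(hash_fn(z + ct.to_bytes(4, "big")).digest())
        ct += 1
    stream = int.from_bytes(b"".join(blocks), "big")
    # Keep only the top klen bits, i.e. truncate the last block if necessary.
    key = stream >> (n_blocks * v - klen_bits)
    # Returned big-endian; if klen is not a multiple of 8, the first byte is
    # zero-padded on the left.
    return key.to_bytes((klen_bits + 7) // 8, "big")
```

For example, `kdf(z, 128)` returns the first 16 bytes of the hash of `z` concatenated with the counter value 1, and `kdf(z, 512)` concatenates the hashes for counter values 1 and 2.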

4.2. SM4 Block Cipher Algorithm

Let denote the block cipher that takes a -bit seed and -bit plaintext as input and output an -bit ciphertext. The SM4 block cipher algorithm has an electronic codebook (ECB) mode and a cipher block chaining (CBC) mode. In ECB mode, is instantiated by repeatedly invoking , in other words, ; in CBC mode, each block of ciphertext depends on the previous ciphertext block and an initialization vector is used, more specifically, , where and for .
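To make the difference between the two modes concrete, the sketch below encrypts a list of 16-byte blocks in ECB and in CBC mode. The block cipher `toy_encrypt_block` is a hypothetical 4-round Feistel permutation built from SHA-256 and is only a stand-in for SM4 (which is not available in the Python standard library); all function names are assumptions for the example.

```python
import hashlib

BLOCK = 16  # bytes, matching the 128-bit block size of SM4


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher.
    return hashlib.sha256(key + bytes([i]) + half).digest()[: BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """4-round Feistel permutation; a hypothetical stand-in for SM4."""
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def ecb_encrypt(key: bytes, blocks):
    # ECB: every block is encrypted independently under the same key.
    return [toy_encrypt_block(key, m) for m in blocks]


def cbc_encrypt(key: bytes, iv: bytes, blocks):
    # CBC: each plaintext block is XORed with the previous ciphertext block
    # (the IV for the first block) before encryption.
    out, prev = [], iv
    for m in blocks:
        prev = toy_encrypt_block(key, _xor(m, prev))
        out.append(prev)
    return out
```

With two identical plaintext blocks, ECB produces two identical ciphertext blocks, whereas in CBC the second ciphertext block depends on the first one.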

4.3. SM2 Key Agreement Protocol

The SM2 key agreement protocol is defined in the PKI setting, and it works on elliptic curve defined over the field . The parties and first agree on the output key length and an elliptic curve system where the elliptic curve discrete logarithm problem (ECDLP) is hard. The system parameters include , and is used in the computation.

We assume each party knows the other party’s distinguishing identifier and a PKI distributes the public key computed from . Therefore, the parties can compute the identifier hash value , where is the length of and outputs a 256-bit string. Therefore, has , and has . The public key/secret key pairs are used to prevent the man-in-the-middle attack, and the identifier hash values are used to identify the parties executing the protocol. In a nutshell, the SM2 key agreement protocol consists of the following PPT algorithms, which use the elliptic curve system parameters implicitly:
(i) is the message generation algorithm that outputs a fresh random private message and the corresponding public message
(ii) is the point generation algorithm that takes as input the party’s private message , both parties’ public messages , the party’s secret key , and the other party’s public key , and it outputs a point that can be used to derive the shared key
(iii) is the key generation algorithm that takes as input the point , and ’s identification message , and it outputs a shared key

We slightly modify the order of message delivery in the original SM2 key agreement protocol to achieve the one-round property. In the original protocol, party sends the messages to after it obtains the key , which indicates sends only after is received, while we notice that this step can be done right after is generated, and it does not raise any security issues. In our modified protocol, first invokes to generate , and it sends to . After receiving from , computes the point and derives . The execution of is exactly symmetric. The process of the protocol is illustrated in Figure 6, and the details of the algorithms can be found in Figure 7.
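The one-round flow (message generation, point generation, key generation) can be sketched on a toy curve. Everything concrete below is an assumption made for illustration: the curve y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1) of prime order 19 is a textbook example and not the SM2 curve, SHA-256 stands in for the SM3-based key derivation, and the function names `gen_msg`, `gen_point`, and `gen_key` are hypothetical. The point computation follows the structure of the SM2 key exchange standard as we understand it; Figure 7 remains the authoritative description.

```python
import hashlib

# Toy curve y^2 = x^3 + 2x + 2 over F_17, base point G of prime order 19,
# cofactor 1. Illustration only; NOT the SM2 curve parameters.
P_MOD, A, N, COFACTOR = 17, 2, 19, 1
G = (5, 1)
W = 2  # w = ceil(ceil(log2(n)) / 2) - 1 for n = 19


def add(p, q):
    """Affine point addition; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return x3, (lam * (x1 - x3) - y1) % P_MOD


def mul(k, p):
    """Double-and-add scalar multiplication."""
    acc = None
    while k > 0:
        if k & 1:
            acc = add(acc, p)
        p = add(p, p)
        k >>= 1
    return acc


def x_bar(x):
    # Truncated x-coordinate: 2^w + (x AND (2^w - 1)).
    return (1 << W) + (x & ((1 << W) - 1))


def gen_msg(r):
    """Message generation: private r in [1, n-1] and public R = [r]G."""
    return r, mul(r, G)


def gen_point(r, own_R, other_R, own_sk, other_pk):
    """Point generation: [h * t](P_other + [x_bar(R_other)] R_other)."""
    t = (own_sk + x_bar(own_R[0]) * r) % N
    return mul(COFACTOR * t, add(other_pk, mul(x_bar(other_R[0]), other_R)))


def gen_key(point, z_a, z_b, klen_bytes=16):
    """Key generation from the shared point and both identifier hashes."""
    if point is None:
        raise ValueError("key agreement failed: point at infinity")
    data = bytes(point) + z_a + z_b  # toy coordinates fit in one byte each
    return hashlib.sha256(data + (1).to_bytes(4, "big")).digest()[:klen_bytes]
```

Since each public message depends only on the sender's own randomness, both messages can be sent at the same time; each party then calls `gen_point` with its own secret key and the other party's public key, and both arrive at the same point and hence the same key.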

When key confirmation is needed, an augmented SM2 key agreement protocol can be used which contains one more PPT algorithm and several more steps.

is the verification algorithm that takes input as a string , a point , both parties’ public messages , and both parties’ identifier hash values , and it outputs a hash value .

As in Figure 6, after generating and , invokes as a proof that it obtains a correct point and is able to derive a correct key, and it sends to . When it receives from , it checks if obtains the correct point . The process of is likewise.

Claim 10. If the key derivation function is modeled as a random oracle, and the ECDLP is hard in , then the SM2 key agreement protocol is a one-round KA protocol with perfect correctness, security, perfect uniformity, perfect uniformity, and robustness.

Proof. It has been shown in [39] that the SM2 key agreement protocol is secure in the well-known Bellare-Rogaway model [40, 41] when the key derivation function is modeled as a random oracle and the ECDLP is hard in . Security in the Bellare-Rogaway model means perfect correctness and security of the KA protocol. Moreover, as illustrated in Figure 7, , where is a uniformly random element from ; since is generated by a bijective function, it should be perfectly indistinguishable from a random element in , which indicates the perfect uniformity of the protocol, and the perfect uniformity can be obtained in the same way. At the end, the parties and already hold after is invoked, which can be used as a shared key, and the function only converts the point to a bit string. As proved in [14], this gives the robustness of the SM2 key agreement protocol when is modeled as a random oracle, since a random oracle outputs uniformly random strings.

5. Construct Oblivious Transfer Using SM

In this section, we show how to construct an oblivious transfer protocol using the SM series cryptography and the batch N-POPF. Moreover, we illustrate how to extend the number of OT instances using OT extension protocols. Before presenting the constructions, we first provide the descriptions of the symbols used in Table 1.

5.1. Oblivious Transfer from SM2 Key Agreement

Our one-round 1-out-of-N endemic oblivious transfer protocol is constructed from the SM2 key agreement protocol and the batch N-POPF defined in Figure 3. As depicted in Figure 8, the sender and the receiver first run a setup phase to determine the protocol parameters and to exchange their public keys along with their distinguishing identifiers. This setup phase only needs to be run once between the two parties, and the parameters can be reused in multiple instances.

When receives the instruction from the environment , it invokes to generate , and it sends to . Meanwhile, receives the instruction from , and it invokes . After that, generates by and sends to . Upon receiving from , for , sets , computes , and sets . then returns to . Upon receiving , computes and sets . At the end, returns to .
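The idea of deriving OT from key agreement can be illustrated with a toy 1-out-of-2 example in the spirit of the construction of Masny and Rindal [12]. This sketch is not the protocol of Figure 8: it replaces the SM2 key agreement with Diffie-Hellman modulo the prime 2^127 - 1, replaces the batch N-POPF with a simple random-oracle mask, and uses SHA-256 instead of SM3. All names are assumptions for the example, and the parameters offer no real security.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus (illustration only)
G = 3


def hash_to_group(tag: int, x: int) -> int:
    """Toy random-oracle mask mapping (tag, x) into [1, P - 1]."""
    d = hashlib.sha256(bytes([tag]) + x.to_bytes(16, "big")).digest()
    return int.from_bytes(d, "big") % (P - 1) + 1


def derive_key(shared: int) -> bytes:
    return hashlib.sha256(b"key" + shared.to_bytes(16, "big")).digest()[:16]


def receiver_message(choice: int):
    """Receiver hides its real KA message in slot `choice`."""
    b = secrets.randbelow(P - 2) + 1
    m_r = pow(G, b, P)                              # real KA message
    r = [0, 0]
    r[1 - choice] = secrets.randbelow(P - 1) + 1    # random filler slot
    mask = hash_to_group(choice, r[1 - choice])
    r[choice] = m_r * pow(mask, -1, P) % P          # unmasks to m_r
    return b, r


def sender_message():
    a = secrets.randbelow(P - 2) + 1
    return a, pow(G, a, P)


def sender_output(a: int, r):
    """Sender unmasks both slots and derives one key per slot."""
    return [
        derive_key(pow(r[j] * hash_to_group(j, r[1 - j]) % P, a, P))
        for j in (0, 1)
    ]


def receiver_output(b: int, m_s: int) -> bytes:
    return derive_key(pow(m_s, b, P))
```

Both messages are independent of each other, so they can be exchanged in one round. The receiver's key matches the sender's key in the chosen slot only; the other slot unmasks to a group element whose discrete logarithm the receiver does not know.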

The correctness of the protocol directly follows the correctness of the SM2 key agreement protocol and the batch N-POPF. For the security proof, intuitively, since the SM2 key agreement has perfect uniformity, the message should be indistinguishable from a random element from ; by the honest simulation property of the batch N-POPF, should be indistinguishable from other . Therefore, ’s choice remains private to . Besides, because of the robustness of the SM2 key agreement protocol and the uncontrollable output property of the batch N-POPF, the output messages should be unpredictable to .

More formally, we use a theorem from [15]:

Theorem 11 (See [15]). If the KA protocol has security, uniformity, and robustness, and the batch N-POPF is secure, then the protocol described in Figure 8 securely realizes the endemic 1-OPRF functionality in the random oracle model.

Specifically, 1-OPRF is essentially 1-out-of-N OT. Therefore, we have the following result:

Theorem 12. If the ECDLP is hard in the underlying elliptic curve group, and the hash functions and the key derivation function are modeled as random oracles, then the protocol described in Figure 8 securely realizes the endemic OT functionality described in Figure 4 against any PPT malicious adversary corrupting the sender and/or the receiver.

The proof follows directly from Theorems 9 and 11.

The main focus of [12, 14, 15] is the malicious setting. In the semihonest setting, we notice that our protocol securely realizes the standard random OT functionality as well as the endemic OT functionality. This is because the power of the adversary is limited to observing the protocol messages. We provide the results below.

Theorem 13. If the ECDLP is hard in the underlying elliptic curve group, and the hash functions and the key derivation function are modeled as random oracles, then the protocol described in Figure 8 securely realizes the standard random OT functionality described in Figure 9 against any PPT semihonest adversary corrupting the sender and/or the receiver.

The proof can be found in Appendix B.1.

Corollary 14. If the ECDLP is hard in the underlying elliptic curve group, and the hash functions and the key derivation function are modeled as random oracles, then the protocol described in Figure 8 securely realizes the endemic OT functionality described in Figure 4 against any PPT semihonest adversary corrupting the sender and/or the receiver.

The proof can be found in Appendix B.2.

5.2. Oblivious Transfer Extension

Although the endemic OT protocol is quite efficient, the exponentiation operations in the protocol can still be too expensive when we need millions of OTs in applications. In such cases, a technique called oblivious transfer extension can be adopted to generate OTs much faster. An OT extension protocol takes a small number of “base” OTs to initiate the protocol and then extends them to polynomially many OTs using only symmetric primitives instead of asymmetric primitives. Beaver [42] first introduced the idea of OT extension, and the following works [43, 44] proposed several highly efficient OT extension protocols. In this work, we use the well-optimized 1-out-of-2 OT extension protocol from [45] in the semihonest setting; in the malicious setting, we consider the protocol from [46] with endemic OT as the base OT and apply the result of [47] to reduce the number of communication rounds. For 1-out-of-N OT extension, the results of [48, 49] can be adopted. The OT extension protocols securely realize the multi-instance versions of the OT functionalities, and we provide the multi-instance 1-out-of-2 endemic OT functionality in Figure 10, which is similar to the single-instance functionality. The multi-instance 1-out-of-2 uniform OT functionality can be found in Appendix A.1.

5.2.1. Batching Base OT

The base OTs can be obtained by repeatedly invoking the single-instance functionality or ; however, the more efficient way is to design a protocol that directly realizes the multi-instance functionality. Our construction is essentially the same as , where , except that generates multiple message pairs at once, and each message pair is used to generate one OT instance. The details can be found in Figure 11. This batching method saves from repeatedly generating message pairs , thus reducing the computation and communication costs. In [15], McQuoid et al. showed that this batching preserves the security of the original protocol when a tag is used in the generation of the KA protocol output to produce different OT results for each OT instance. In our protocol, we preserve the structure of the SM2 key agreement protocol, and we add the tag in the key derivation function : we set the tag as for the -th invocation. Therefore, the protocol in Figure 11 is secure, and we have the following theorem:

Theorem 15. If the ECDLP is hard in the underlying elliptic curve group, and the hash functions and the key derivation function are modeled as random oracles, then the protocol described in Figure 11 securely realizes the multi-instance endemic OT functionality described in Figure 10 against any PPT malicious adversary corrupting the sender and/or the receiver.

5.2.2. OT Extension in Semihonest Setting

In the semihonest setting, the protocol securely realizes the random OT functionality when , so the protocol securely realizes the multi-instance random OT functionality . Therefore, we take 1-out-of-2 random OT as the base OT of the OT extension protocol. As depicted in Figure 12, the OT extension protocol needs base OTs to start the extension; typically, is used considering both security and performance. To generate the base OTs, the sender of the outer protocol acts as the receiver, and it picks random select bits and sends to ; the receiver of the outer protocol acts as the sender and sends to . The random OT functionality picks random , and it sends to and sends to . After obtaining the base OTs, forms the choice bits as a column vector. For , computes the to generate and parses as a column vector, and it sets . After that, forms a matrix . For , computes as its random OT message where is the -th row of the matrix . Subsequently, sends and outputs. After obtaining the base OT results and , sets for . It then forms a matrix and a row vector . For , computes and as its OT output.
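The extension step can be sketched end to end in Python. The following is a minimal semihonest random-OT extension in the style of [45], with several assumptions that differ from the paper's instantiation: the base OTs are modeled by sampling the seeds and select bits directly (an ideal random OT with reversed roles), SHAKE-128 stands in for the SM4-based pseudorandom generator, SHA-256 stands in for SM3, and all function names are hypothetical.

```python
import hashlib
import secrets

KAPPA = 128  # number of base OTs


def prg(seed: bytes, m: int) -> int:
    """Expand a seed into an m-bit column, returned as an integer."""
    out = hashlib.shake_128(seed).digest((m + 7) // 8)
    return int.from_bytes(out, "big") >> ((-m) % 8)


def h(j: int, row: int) -> bytes:
    """Hash used to derive the OT outputs from a matrix row."""
    data = j.to_bytes(4, "big") + row.to_bytes(KAPPA // 8, "big")
    return hashlib.sha256(data).digest()[:16]


def transpose(cols, m):
    """Turn KAPPA columns of m bits into m rows of KAPPA bits."""
    rows = []
    for j in range(m):
        row = 0
        for i, col in enumerate(cols):
            row |= ((col >> j) & 1) << i
        rows.append(row)
    return rows


def ot_extension(choices):
    """Return (sender_pairs, receiver_msgs) for the given choice bits."""
    m = len(choices)
    r = sum(bit << j for j, bit in enumerate(choices))  # choice vector

    # Base OTs with reversed roles: the outer sender holds select bits s_i and
    # learns seed k_i^{s_i}; the outer receiver holds both seeds.
    s_bits = [secrets.randbelow(2) for _ in range(KAPPA)]
    seeds = [(secrets.token_bytes(16), secrets.token_bytes(16))
             for _ in range(KAPPA)]

    # Outer receiver: t^i = G(k_i^0), u^i = t^i xor G(k_i^1) xor r.
    t_cols = [prg(k0, m) for k0, _ in seeds]
    u_cols = [t ^ prg(k1, m) ^ r for t, (_, k1) in zip(t_cols, seeds)]
    receiver_msgs = [h(j, row) for j, row in enumerate(transpose(t_cols, m))]

    # Outer sender: q^i = (s_i * u^i) xor G(k_i^{s_i}).
    q_cols = [(u if s else 0) ^ prg(seeds[i][s], m)
              for i, (s, u) in enumerate(zip(s_bits, u_cols))]
    s_row = sum(bit << i for i, bit in enumerate(s_bits))
    sender_pairs = [(h(j, q), h(j, q ^ s_row))
                    for j, q in enumerate(transpose(q_cols, m))]
    return sender_pairs, receiver_msgs
```

For every index `j`, the receiver's message equals the sender's message at position `choices[j]`, while only hash and PRG calls are used after the 128 base OTs.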

Now we examine the correctness of . For each column of the matrix , , we can write as before, and this gives us . Therefore, for each row of the matrix, we have , and thus, the protocol is correct. As for security, Asharov et al. [45] prove that securely realizes . Moreover, we can obtain a result similar to Corollary 14 that securely realizes the multi-instance endemic OT functionality .
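The correctness argument above can be checked with a small simulation. The following is a minimal sketch, assuming the standard semihonest extension structure of Asharov et al. [45]; the base OTs are simulated locally instead of being run as a protocol, SHAKE-256 and SHA-256 are stand-ins for the pseudorandom generator and the SM3 hash, and all names (`KAPPA`, `prg`, `row_hash`, `extend`) are illustrative and not taken from Figure 12.

```python
import hashlib
import secrets

KAPPA = 128  # number of base OTs (assumed value)


def prg(seed: bytes, nbits: int) -> int:
    """Expand a seed into an nbits-bit integer (SHAKE-256 as a stand-in PRG)."""
    nbytes = (nbits + 7) // 8
    stream = hashlib.shake_256(seed).digest(nbytes)
    return int.from_bytes(stream, "big") >> (8 * nbytes - nbits)


def row_hash(j: int, row: int) -> bytes:
    """Hash row j of a matrix (SHA-256 as a stand-in for SM3)."""
    data = j.to_bytes(8, "big") + row.to_bytes(KAPPA // 8, "big")
    return hashlib.sha256(data).digest()


def transpose(cols, m):
    """Turn KAPPA columns of m bits each into m rows of KAPPA bits each."""
    return [sum(((col >> j) & 1) << i for i, col in enumerate(cols))
            for j in range(m)]


def extend(choice_bits):
    """Extend KAPPA simulated base OTs to len(choice_bits) random OTs."""
    m = len(choice_bits)
    # Simulated base OTs: the outer sender holds select bits s and one seed
    # per pair; the outer receiver holds both seeds of every pair.
    s = secrets.randbits(KAPPA)
    seeds = [(secrets.token_bytes(16), secrets.token_bytes(16))
             for _ in range(KAPPA)]

    # Receiver: build the columns t_i and the messages u_i sent to the sender.
    r = sum(bit << j for j, bit in enumerate(choice_bits))
    t_cols = [prg(k0, m) for k0, _ in seeds]
    u_cols = [t ^ prg(k1, m) ^ r for t, (_, k1) in zip(t_cols, seeds)]
    received = [row_hash(j, row) for j, row in enumerate(transpose(t_cols, m))]

    # Sender: column q_i equals t_i XOR (s_i * r), so row q_j = t_j XOR (r_j * s).
    q_cols = []
    for i, u in enumerate(u_cols):
        s_i = (s >> i) & 1
        q_cols.append(prg(seeds[i][s_i], m) ^ (u if s_i else 0))
    sent = [(row_hash(j, q), row_hash(j, q ^ s))
            for j, q in enumerate(transpose(q_cols, m))]
    return sent, received
```

For every index, the receiver's output equals the sender's message selected by the corresponding choice bit, which is the row-wise relation used in the correctness argument.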

Theorem 16. The protocol described in Figure 12 securely realizes described in Figure 13 against any PPT semihonest adversary corrupting or/and .

The proof is done by Asharov et al. [45].

Corollary 17. The protocol described in Figure 12 securely realizes described in Figure 10 against any PPT semihonest adversary corrupting or/and .

The proof is similar to the proof of Corollary 14.

5.2.3. OT Extension in Malicious Setting.

The malicious setting is more difficult to handle. When the base OTs are endemic OTs, we can only extend them to more endemic OTs, and the sender needs to check the consistency of the messages sent by the receiver . We provide the malicious-setting OT extension protocol in Figure 14. The main process of extending oblivious transfer remains the same as in the semihonest-setting protocol , and so does the correctness argument. However, a malicious can violate the requirement that the same choice bits be used when computing the vectors . In [46], needs to prove its honesty in zero knowledge. Generally speaking, the consistency check uses a random linear combination of the row vectors of the matrices, and the coefficients of the linear combination should be unpredictable to , e.g., they are randomly picked by . In [47], Doerner et al. use the Fiat-Shamir heuristic [50] to make the zero-knowledge proof noninteractive: the coefficients are generated by a hash function taking the matrix as input. The formal security proof of the protocol can be found in [12].

Theorem 18. The protocol described in Figure 14 securely realizes described in Figure 10 against any PPT malicious adversary corrupting or/and .

The proof is done by Masny and Rindal [12].

6. Generate the Beaver Triple

A wide range of MPC protocols working on circuits require heavy computation and communication to compute AND gates and multiplication gates. To speed up these protocols, a common approach is to split the protocol into a preprocessing phase that is independent of the parties’ inputs and an online phase where the computation proceeds using the actual inputs and the data from the preprocessing phase. Therefore, parties can run the preprocessing phase whenever they are available and respond instantly when the computation needs to proceed.

As for secret-sharing-based MPC protocols, Beaver introduced the notion of the Beaver triple (or multiplication triple) in [51]. Basically, a Beaver triple consists of shared triples , where are chosen randomly such that always holds. The length of the Beaver triple can vary with applications. In this work, we refer to a Beaver triple with a length larger than 1 as a multiplication triple, and a Beaver triple with a length of 1 is simply called a Beaver triple.

As depicted in Figure 15, the triple-generation functionality first waits for all parties to send a instruction. After that, it allows the adversary to determine the content of the triple received by the corrupted by sending . Subsequently, as for the parties not corrupted, picks uniformly random . However, to ensure that , chooses the party with the smallest index among the uncorrupted parties and sets . At the end, sends back to .
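The behavior of the triple-generation functionality described above can be sketched as follows. This is a minimal illustration, assuming additive shares modulo 2^ell (with ell = 1 giving the Boolean case); the function name `triple_functionality`, the dictionary interface for the adversary's choices, and the parameter `ell` are hypothetical and not taken from Figure 15.

```python
import secrets


def triple_functionality(n, corrupted, ell=1):
    """Return {party index: (a_i, b_i, c_i)} with sum(c) = sum(a) * sum(b) mod 2^ell.

    `corrupted` maps a corrupted party's index to the shares fixed by the adversary.
    """
    mod = 1 << ell
    shares = {}
    for i in range(n):
        if i in corrupted:
            # The adversary determines the shares of corrupted parties.
            shares[i] = tuple(v % mod for v in corrupted[i])
        else:
            # Honest parties receive uniformly random shares.
            shares[i] = tuple(secrets.randbelow(mod) for _ in range(3))
    honest = [i for i in range(n) if i not in corrupted]
    if honest:
        # Fix the c-share of the honest party with the smallest index so that
        # the product relation holds.
        h = min(honest)
        a = sum(s[0] for s in shares.values()) % mod
        b = sum(s[1] for s in shares.values()) % mod
        others = sum(s[2] for i, s in shares.items() if i != h) % mod
        shares[h] = (shares[h][0], shares[h][1], (a * b - others) % mod)
    return shares
```

When at least one party is honest, the returned shares always satisfy the product relation, regardless of the values the adversary picks for the corrupted parties.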

6.1. Two-Party Beaver Triple Generation

We first introduce how to generate the Beaver triple among two parties using endemic OT. In an OT execution, the receiver sends to obtain , and the sender obtains two messages . Notice that can be represented as , and if we set , then . Therefore, invoking the endemic OT twice with and playing the sender in turn is directly a Beaver triple-generation protocol , which can be found in Figure 16. In , party first picks random , and then, it invokes the endemic OT functionality as sender and receiver, respectively. When sends , and back to , sets and .
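The two-OT construction above can be sketched over single bits as follows. This is an illustration under stated assumptions, not a transcription of Figure 16: `random_ot` is an idealized random OT that picks both messages itself (an endemic OT would additionally let a corrupted party fix its own output), and the variable names are hypothetical.

```python
import secrets


def random_ot(choice):
    """Idealized 1-out-of-2 random OT on bits: returns ((m0, m1), m_choice)."""
    m0, m1 = secrets.randbelow(2), secrets.randbelow(2)
    return (m0, m1), (m1 if choice else m0)


def beaver_triple():
    """Return the shares (a0, b0, c0) of party 0 and (a1, b1, c1) of party 1."""
    # Each party picks a random bit and uses it as its OT choice bit.
    b0, b1 = secrets.randbelow(2), secrets.randbelow(2)
    # First OT: party 0 is the sender, party 1 is the receiver.
    (m0, m1), m_b1 = random_ot(b1)
    # Second OT: the roles are swapped.
    (n0, n1), n_b0 = random_ot(b0)
    # The sender's a-share is the XOR of its two OT messages, so the
    # receiver's output satisfies m_b = m0 XOR (a AND b).
    a0 = m0 ^ m1
    a1 = n0 ^ n1
    # Each c-share combines the local product with both cross-term shares.
    c0 = (a0 & b0) ^ m0 ^ n_b0
    c1 = (a1 & b1) ^ n0 ^ m_b1
    return (a0, b0, c0), (a1, b1, c1)
```

XORing the two c-shares yields (a0 XOR a1) AND (b0 XOR b1), since the cross terms a0 AND b1 and a1 AND b0 are exactly the values shared by the two OT invocations.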

In the implementation, the main cost of the protocol is to invoke twice. When is instantiated with , which is a one-round protocol, the protocol also has round complexity 1. Besides, only needs to transfer 3 group elements in total, which means only needs 6. Moreover, we can use the multi-instance OT functionality to generate a bunch of OT instances in advance, which further reduces the computation and communication costs. Therefore, can be extremely suitable for lightweight devices in extreme network environments with high delay and low bandwidth.

Although endemic OT is a weak version of general random OT, the protocol is still secure in the malicious setting, and we provide the theorem together with the proof below.

Theorem 19. The protocol described in Figure 16 securely realizes described in Figure 15 with in the -hybrid model against any PPT malicious adversary corrupting or .

The proof can be found in Appendix B.3.

Given this two-party Beaver triple-generation protocol , we can use it to generate sufficiently many Beaver triples in the preprocessing phase and carry out a GMW-style two-party computation protocol [37] in the online phase, where the XOR gates can be computed locally and the AND gates consume one Beaver triple each and need communication. The details of the protocol can be found in Figure 17. This provides an efficient solution to a generic two-party computation over the Boolean circuits.
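The online phase described above can be illustrated on XOR-shared bits. The sketch below assumes the usual Beaver-style AND-gate evaluation; `dealer_triple` is a trusted-dealer stand-in for the preprocessing phase so that the example is self-contained, and all function names are hypothetical rather than taken from Figure 17.

```python
import secrets


def share(bit):
    """Split a bit into two XOR shares."""
    r = secrets.randbelow(2)
    return r, bit ^ r


def dealer_triple():
    """Trusted-dealer stand-in for the preprocessing phase."""
    a, b = secrets.randbelow(2), secrets.randbelow(2)
    (a0, a1), (b0, b1), (c0, c1) = share(a), share(b), share(a & b)
    return (a0, b0, c0), (a1, b1, c1)


def xor_gate(x, y):
    """XOR gates are local: each party XORs its own shares."""
    return x[0] ^ y[0], x[1] ^ y[1]


def and_gate(x, y, triple):
    """AND gates consume one triple and open two masked bits."""
    (a0, b0, c0), (a1, b1, c1) = triple
    # Each party publishes its share of d = x XOR a and e = y XOR b.
    d = (x[0] ^ a0) ^ (x[1] ^ a1)
    e = (y[0] ^ b0) ^ (y[1] ^ b1)
    # Only one party adds the public term d AND e.
    z0 = c0 ^ (d & b0) ^ (e & a0) ^ (d & e)
    z1 = c1 ^ (d & b1) ^ (e & a1)
    return z0, z1
```

The opened values d and e are masked by the random bits of the triple, so they reveal nothing about the gate inputs, while the output shares XOR to the AND of the inputs.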

Theorem 20. The protocol described in Figure 17 securely realizes described in Figure 1 in the -hybrid model against any PPT semihonest adversary corrupting or .

The proof is done by combining the result of [37, 51].

6.2. Multiparty Multiplication Triple Generation

We can extend the Beaver triple-generation protocol to the multiparty setting and generate multiplication triples of with one more round of communication. Therefore, we obtain the protocol described in Figure 18, which is secure in the semihonest setting even when the endemic OT functionality is used, which gives more power to the adversary .

The core idea is still using the OT functionality to generate correlated messages, and again, we can simply invoke the multi-instance OT functionality and use the OT extension technique to generate polynomially many OT instances in advance. In , all computations are on the ring , while now, we consider . Assume an endemic OT outputs to sender and to receiver for choice bit , and it holds that .

Consider the simple case where , we have and we can extend Equation (9) to Equation (10), which proves the correctness of the protocol .
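One common way to turn random OTs into additive shares of a product over a ring is a Gilboa-style bit decomposition with one correction message per OT. The sketch below uses that technique as an assumption; it may differ in detail from the protocol in Figure 18. The ring size `ELL`, the idealized `random_ot`, and the function names are hypothetical.

```python
import secrets

ELL = 32          # assumed bit length of the ring elements
MOD = 1 << ELL


def random_ot(choice):
    """Idealized random OT on ring elements: returns ((m0, m1), m_choice)."""
    m0, m1 = secrets.randbelow(MOD), secrets.randbelow(MOD)
    return (m0, m1), (m1 if choice else m0)


def share_product(a, b):
    """Return (sender_share, receiver_share) that sum to a * b mod 2^ELL."""
    sender_share = receiver_share = 0
    for k in range(ELL):
        bit = (b >> k) & 1
        (m0, m1), m_bit = random_ot(bit)
        # Correction sent in the extra round: the receiver can now compute
        # m0 + bit * a without learning a.
        d = (m0 - m1 + a) % MOD
        sender_share -= m0 << k
        receiver_share += (m_bit + bit * d) << k
    return sender_share % MOD, receiver_share % MOD


def multiplication_triple(n):
    """Return a list of per-party shares (a_i, b_i, c_i)."""
    a = [secrets.randbelow(MOD) for _ in range(n)]
    b = [secrets.randbelow(MOD) for _ in range(n)]
    c = [(a[i] * b[i]) % MOD for i in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Parties i and j share the cross term a_i * b_j.
                s_i, s_j = share_product(a[i], b[j])
                c[i] = (c[i] + s_i) % MOD
                c[j] = (c[j] + s_j) % MOD
    return list(zip(a, b, c))
```

Summing the c-shares gives the local products plus all pairwise cross terms, which equals the product of the sums of the a-shares and the b-shares modulo 2^ELL.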

Now, we proceed to provide the security of the protocol .

Theorem 21. The protocol described in Figure 18 securely realizes described in Figure 15 with in the -hybrid model against any PPT semihonest adversary corrupting no more than parties.
The proof can be found in Appendix B.4.

Given this multiparty multiplication triple-generation protocol , we can use it in the semihonest SPDZ-style multiparty computation protocol [25], where each multiplication gate consumes one multiplication triple. Note that the original SPDZ protocol is designed for the malicious setting, and it includes information-theoretic MACs to ensure correctness, while in the semihonest setting, these checks can be removed. The details of the resulting MPC protocol can be found in Figure 19. provides an efficient solution to generic multiparty computation over the arithmetic circuits, which are more powerful than the Boolean circuits.
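A multiplication gate in the semihonest setting, without MACs, can be sketched on additive shares as follows. This is an illustration only: the modulus, the trusted-dealer `dealer_triple` used in place of the preprocessing phase, and the function names are assumptions and are not taken from Figure 19.

```python
import secrets

MOD = 1 << 32  # assumed ring size


def share(value, n):
    """Split a value into n additive shares modulo MOD."""
    parts = [secrets.randbelow(MOD) for _ in range(n - 1)]
    return parts + [(value - sum(parts)) % MOD]


def reveal(shares):
    """Open a shared value by summing all shares."""
    return sum(shares) % MOD


def dealer_triple(n):
    """Trusted-dealer stand-in for the multiplication-triple preprocessing."""
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    return share(a, n), share(b, n), share((a * b) % MOD, n)


def multiply(x, y, triple):
    """Multiply two shared values, consuming one multiplication triple."""
    a, b, c = triple
    # The parties open the masked differences d = x - a and e = y - b.
    d = reveal([(xi - ai) % MOD for xi, ai in zip(x, a)])
    e = reveal([(yi - bi) % MOD for yi, bi in zip(y, b)])
    z = [(ci + d * bi + e * ai) % MOD for ai, bi, ci in zip(a, b, c)]
    # Exactly one party adds the public term d * e.
    z[0] = (z[0] + d * e) % MOD
    return z
```

Addition gates need no interaction, since each party simply adds its own shares; only multiplication gates consume preprocessed material.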

Theorem 22. The protocol described in Figure 19 securely realizes described in Figure 1 in the -hybrid model against any PPT semihonest adversary corrupting up to parties.

The proof is done by combining the result of [37, 51].

7. Private Set Intersection from OT

Apart from generic MPC protocols, there are also protocols dedicated to specific tasks, e.g., PSI. The PSI protocol (Figure 5) is taken from the work of Chase and Miao [52], and we instantiate the OT functionality with as another application of this endemic OT protocol. When the required OT number is large, we can use the OT extension technique to extend the number of OT instances.

The security of in the semihonest setting directly follows the result of [52] since according to Theorem 13, the protocol (as well as the protocol ) securely realizes the random OT functionality in this setting. Moreover, Chase and Miao consider one-sided malicious security, where the can be maliciously corrupted by the adversary. However, we notice that does not meet the security requirement of , and in such a case, the protocol can be insecure. As stated above, in the malicious setting, only realizes , which allows the adversary to control the OT messages. Specifically, can influence the protocol output by changing the OT messages, enabling the environment to distinguish between the real world and the ideal world. We provide the result along with its proof below.

Theorem 23. The protocol described in Figure 20 is not secure against a PPT malicious adversary corrupting when the endemic OT functionality is used even if is a secure PRF, and are modeled as random oracles, and parameters are chosen properly.

Proof. To prove Theorem 23, we construct an adversary and an environment such that for any PPT simulator , can distinguish between (i) the real execution , where the parties run protocol in the -hybrid model and the corrupted is controlled by , and (ii) the ideal execution , where the parties and interact with the functionality in the ideal world, and corrupted is controlled by the simulator .

7.1. Adversary

The adversary instructs to run the protocol faithfully except for the following steps.

For , upon receiving from , the adversary sends to on behalf of .

For , upon receiving from , sends to .

makes random queries to after receiving from on behalf of .

The environment outputs 1 if sends back where , and it outputs 0 otherwise.

This can be seen as a drawback of the endemic OT functionality in that its application scenarios are limited, and sometimes, we need other types of OT functionalities. As depicted in Figure 21, we can adopt the transformation of [53] and obtain at the cost of one more round of communication. After the transformation, our endemic OT protocol can be used to construct this highly efficient PSI protocol even in the malicious setting. We can also obtain and following the protocols in [12].

8. Implementation and Benchmarks

In our protocols, we instantiate all the hash functions involved with SM3 hash function . When the required output length is not to , we adopt a similar technique as used in the construction of the key derivation function (cf. Section 4.1). The pseudorandom function and pseudorandom number generator function can be instantiated with the SM4 block cipher algorithm . Roughly speaking, to implement , we use . To implement , where , we use . When , we truncate the extra bits as in Section 4.1. As for other mentioned protocols, we instantiate the hash function, and functions as described in their works, e.g., SHA256, for the hash function, and AES for .
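A counter-mode expansion of a fixed-length hash, in the style of the SM2 key derivation function, can be sketched as follows. The 32-bit big-endian counter starting at 1 is an assumption about the details of Section 4.1; the code uses SM3 through `hashlib` when the local OpenSSL build provides it and otherwise falls back to SHA-256 as a stand-in with the same output length. The names `new_hash` and `expand` are hypothetical.

```python
import hashlib


def new_hash():
    """Return an SM3 object if available, else SHA-256 as a stand-in."""
    try:
        return hashlib.new("sm3")
    except ValueError:
        return hashlib.sha256()


def expand(data: bytes, nbytes: int) -> bytes:
    """Derive nbytes of output by hashing data with an increasing counter."""
    out = b""
    counter = 1
    while len(out) < nbytes:
        h = new_hash()
        h.update(data + counter.to_bytes(4, "big"))
        out += h.digest()
        counter += 1
    # Truncate the extra bytes of the last block.
    return out[:nbytes]
```

For example, `expand(seed, 80)` concatenates three 32-byte digests and keeps the first 80 bytes, so shorter outputs are always prefixes of longer ones derived from the same input.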

8.1. Experimental Setup

We perform the experiments on Dell OptiPlex 7080 equipped with an Intel Core 8700 CPU @ 3.20 GHz with 32.0 GB RAM, running Ubuntu 18.04 LTS. We evaluate all protocols in two simulated network settings: (i) a LAN setting with 1 Gbps bandwidth and 1 ms delay and (ii) a WAN setting with 100 Mbps bandwidth and 50 ms delay. All test results are the average of 10 tests.

8.2. Oblivious Transfer Evaluation

We first compare the performance of our multi-instance OT protocol with several state-of-the-art OT protocols [10–12]. Note that the OT protocol in [10] is a sender OT protocol, and it needs an additional round to transfer messages, whereas [11, 12] and our protocol only generate random correlated messages. Besides, our protocol is based on the SM series cryptography, especially the SM2 key agreement protocol, while the other three protocols are inspired by the Diffie-Hellman key agreement protocol [54].

In Figure 22, we show the running time of the protocols in the LAN setting. Since the SM2 key agreement protocol needs more exponentiation operations than the Diffie-Hellman key agreement protocol, our protocol is slower than the other protocols when the number of OT instances is large. However, as shown in Figure 23, in the WAN setting, our protocol is faster than the protocol of Naor and Pinkas [10] as well as the protocol of Chou and Orlandi [11] because, like the protocol of Masny and Rindal [12], our protocol needs only one round. Therefore, our protocol is particularly suitable for bad network environments, e.g., the wireless network. We also provide the detailed running time in Table 2. In the LAN setting, [11] is the fastest protocol for a large number of OT instances since it requires the fewest exponentiation operations. In the WAN setting, [12] is the fastest protocol because it needs only one round of communication, and our protocol is slightly slower than [12]. We note that our protocol is the only one that is based on the SM series cryptography and can be legally used for commercial purposes in China.
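The effect of round complexity on the WAN results can be illustrated with a rough cost model in which the network time is the number of rounds times the delay plus the payload size divided by the bandwidth. This model ignores computation and is not how the reported running times were obtained; the function name and the example payload are hypothetical, while the delay and bandwidth values are those of the LAN and WAN settings in Section 8.1.

```python
def network_time(rounds, payload_bits, delay_s, bandwidth_bps):
    """Rough network cost in seconds: latency per round plus transfer time."""
    return rounds * delay_s + payload_bits / bandwidth_bps


# Settings from Section 8.1.
LAN = {"delay_s": 0.001, "bandwidth_bps": 1e9}
WAN = {"delay_s": 0.050, "bandwidth_bps": 100e6}

# Hypothetical payload of 1 MB, sent in one round or in two rounds.
payload = 8 * 10**6
extra_round_lan = network_time(2, payload, **LAN) - network_time(1, payload, **LAN)
extra_round_wan = network_time(2, payload, **WAN) - network_time(1, payload, **WAN)
```

Under this model, each additional round costs one delay, i.e., about 1 ms in the LAN setting and about 50 ms in the WAN setting, independently of the payload.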

In real-life applications, OT extension techniques are often used to generate hundreds of thousands of OT instances at high speed. In such cases, the performance of the base OT protocols has only a minor impact on the performance of the overall protocol. To illustrate this, we provide the test results in Table 3. We use the OT protocols of [10–12] and our multi-instance OT protocol as the base OT for the semihonest-setting OT extension. As the number of OT instances increases, the running times of the different protocols converge. In the LAN setting, our protocol takes only 1.563 seconds to generate 10 million OT instances. In the WAN setting, our protocol can generate the same number of OT instances in 15.41 seconds. Therefore, our protocol is comparable with other OT protocols in many application scenarios.

8.3. Triple-Generation Evaluation

One of the applications of our OT protocol is to generate Beaver triples, which can be used in many MPC applications. Our Beaver triple-generation protocol invokes the endemic OT functionality. We use different OT protocols as the base OT protocols for the OT extension protocol, and we use the resulting OT extension protocols in the triple-generation protocols. The test results can be found in Table 4. As one can see, the performance of the triple-generation protocol mainly depends on its underlying OT extension protocol. In the LAN setting, our protocol needs 3.238 seconds to generate 10 million Beaver triples. In the WAN setting, our protocol can generate the same number of Beaver triples in 31.814 seconds.

9. Conclusion

In this work, we investigate the problem of secure computation from the SM series cryptography, which complies with the Chinese cryptographic laws and is authorized for commercial use in China. We show how to generate OT using the SM2 and SM3 algorithms. Moreover, we instantiate the OT extension protocols in the semihonest setting and the malicious setting with the SM3 and SM4 algorithms, which can efficiently extend some base OTs to a polynomial number of OTs. With the generated OTs, we can securely realize the Beaver multiplication triple-generation functionality and further construct generic MPC protocols. Besides, we show that a special-purpose MPC protocol, PSI, can also be implemented using the SM2, SM3, and SM4 algorithms. The proposed protocols are secure in the random oracle model and the public key infrastructure setting. The evaluation results indicate that our constructions are comparable to existing protocols and especially suitable for the wireless network environment. Therefore, we provide an efficient secure computation solution from SM series cryptography, and it is the first solution that can be used for commercial purposes in China.

Appendix

A. Functionalities

A.1. Random Oblivious Transfer

As depicted in Figure 9, the 1-out-of- endemic OT functionality waits for from and from , where denotes ’s choices. After both messages are obtained, picks uniformly random messages . At the end, sends to and to .

Figure 13 depicts the multi-instance version of 1-out-of-2 .

B. Proof of Theorems

B.1. Proof of Theorem 13

Proof. To prove Theorem 13, we construct a simulator such that for any nonuniform PPT environment , the following ensembles are indistinguishable: (i) the real execution , where the parties run protocol and the corrupted party is controlled by a dummy adversary who simply forwards messages from/to , and (ii) the ideal execution , where the parties and interact with functionality in the ideal world and the corrupted party is controlled by the simulator . We consider the following cases.

Case 1. is corrupted; is honest.
Simulator. The simulator internally runs , forwarding messages to/from the environment . simulates the interface of as well as honest . In addition, the simulator simulates the following interactions with :
(i) Upon receiving from the external functionality and receiving from the environment for , the simulator sends to , it then receives and from . For , picks random , and it sends to .
(ii) For , when queries the for the -th time, returns .
Indistinguishability. The indistinguishability is proven through a series of hybrid worlds .
Hybrid : it is the real protocol execution.
Hybrid : is the same as except that in , the simulator receives for corrupted from . The view of is not changed since behaves exactly the same.
Hybrid : is the same as except that in , picks random , for . In , , where should be indistinguishable from a random element and serve as a one-time pad (OTP); therefore, in and should be indistinguishable.
Hybrid : is the same as except that in , the random oracle returns for the -th query. Because of the key indistinguishability of the SM2 key agreement protocol, and should be indistinguishable.
The adversary’s view of is identical to the simulated view. Therefore, and are indistinguishable.

Case 2. is corrupted; is honest.
Simulator. The simulator internally runs , forwarding messages to/from the environment . simulates the interface of as well as honest . In addition, the simulator simulates the following interactions with :
(i) Upon receiving from the external functionality and receiving from the environment for , the simulator sends to , and it then receives and from . invokes , and it sends to .
(ii) When queries the , returns .
Indistinguishability. The indistinguishability is proven through a series of hybrid worlds .
Hybrid : it is the real protocol execution.
Hybrid : is the same as except that in , the simulator receives for corrupted from . The view of is not changed since behaves exactly the same.
Hybrid : is the same as except that in , the random oracle returns when queries it. Because of the key indistinguishability of the SM2 key agreement protocol, and should be indistinguishable.
The adversary’s view of is identical to the simulated view. Therefore, and are indistinguishable.

Case 3. Both and are corrupted.
Simulator. The simulator internally runs , forwarding messages to/from the environment .
Indistinguishability. This is a trivial case, since both and are controlled by the adversary .

B.2. Proof of Corollary 14

Proof. The simulator used to prove Corollary 14 is exactly the same as the simulator used to prove Theorem 13. Although the functionality allows the simulator to fix the corrupted party’s message, we never invoke the instruction.

B.3. Proof of Theorem 19

Proof. To prove Theorem 19, we construct a simulator such that for any nonuniform PPT environment , the following ensembles are indistinguishable: (i) the real execution , where the parties run protocol in the -hybrid model and the corrupted party is controlled by a dummy adversary who simply forwards messages from/to , and (ii) the ideal execution , where the parties and interact with functionality in the ideal world and the corrupted party is controlled by the simulator . Since the protocol is symmetric, we only consider the case where is corrupted.

Simulator. The simulator internally runs , forwarding messages to/from the environment . simulates the interface of as well as honest . In addition, the simulator simulates the following interactions with :
(i) Upon receiving from the external , the simulator sends and to the adversary on behalf of . also sends to and receives .
(ii) Upon receiving from via the interface of , sends to on behalf of . Upon receiving from via the interface of , sets , and it sends to on behalf of .
(iii) Upon receiving from via the interface of , sends to on behalf of . Upon receiving from via the interface of , sends to on behalf of .
(iv) sets , and it sends to . It outputs whatever outputs.

Indistinguishability. The indistinguishability is proven through a series of hybrid worlds .

Hybrid : it is the real protocol execution.

Hybrid : is the same as except that in , the simulator simulates the functionality , and it receives from and from the adversary . The view of is not changed since behaves exactly the same as .

Hybrid : is the same as except that in , the simulator sets and , and it sends to the external to modify the triple. The view of is not changed since no message sent to is changed.

Hybrid : is the same as except that in , the output of is directly from . The output distribution remains the same since (1) the simulator modifies in to the values obtained by ; (2) in both and , is randomly picked; (3) in , , where one of the values is randomly picked, and in , is randomly picked; and (4) in both and , it holds that .

The adversary’s view of is identical to the simulated view . Therefore, it is perfectly indistinguishable.

B.4. Proof of Theorem 21

Proof. To prove Theorem 21, we construct a simulator such that for any nonuniform PPT environment , the following ensembles are indistinguishable: (i) the real execution , where the parties run protocol in the -hybrid model and the corrupted party is controlled by a dummy adversary who simply forwards messages from/to , and (ii) the ideal execution , where the parties and interact with functionality in the ideal world and the corrupted party is controlled by the simulator . We consider the extreme case where only is not corrupted.

Simulator. The simulator internally runs , forwarding messages to/from the environment . simulates the interface of as well as honest . In addition, the simulator simulates the following interactions with :
(i) Upon receiving from the external , the simulator sends and to the adversary on behalf of , for and . also sends to and receives , for .
(ii) For :
(a) Upon receiving from via the interface of , sends to on behalf of . Upon receiving from via the interface of , sends to on behalf of .
(b) Upon receiving from via the interface of , sends to on behalf of . Upon receiving from via the interface of , sends to on behalf of .
(c) Upon receiving from via the interface of , sends to on behalf of . Upon receiving from via the interface of , sends to on behalf of . Upon receiving from via the interface of , sends to and to on behalf of .
(iii) picks random , for . It then sends to , for .
(iv) For , upon receiving from for , computes . After that, computes , for . Subsequently, computes , for . At the end, computes , and it sends to , for . It outputs whatever outputs.

Indistinguishability. The indistinguishability is proven through a series of hybrid worlds .

Hybrid : it is the real protocol execution.

Hybrid : is the same as except that in , the simulator simulates the functionality to extract and obtains and . The view of is not changed since behaves exactly the same as .

Hybrid : is the same as except that in , the simulator computes using the knowledge of , , , and . It then sends to the external to modify the triple, for . The view of is not changed since no message sent to is changed.

Hybrid : is the same as except that in , the simulator picks random , for , instead of computing . The views of the other parties in and have the same distribution since one of is uniformly random.

Hybrid : is the same as except that in , the output of is directly from . The output distribution remains the same since (1) the simulator modifies in to the values obtained by ; (2) in both and , is randomly picked; (3) in both and , is randomly picked; and (4) in both and , it holds that . The adversary’s view of is identical to the simulated view. Therefore, it is perfectly indistinguishable.

Data Availability

The data used in the submitted manuscript are available from the corresponding author upon request by email.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work is supported by the National Key R&D Program of China (No. 2021YFB3101601) and the National Natural Science Foundation of China (Grant No. 62072401 and No. 62232002). It is also supported by the “Open Project Program of Key Laboratory of Blockchain and Cyberspace Governance of Zhejiang Province”. This project is supported by Input Output (iohk.io).