Yu Zhang, Wei He, Yin Li, "Efficient Boolean Keywords Search over Encrypted Cloud Data in Public Key Setting", Mobile Information Systems, vol. 2020, Article ID 2904861, 15 pages, 2020. https://doi.org/10.1155/2020/2904861
Efficient Boolean Keywords Search over Encrypted Cloud Data in Public Key Setting
Searchable public key encryption (SPE) supporting keyword search plays an important role in protecting data confidentiality in cloud computing. Current SPE schemes mainly support conjunctive or disjunctive keywords search, which are very basic query operations. In this paper, we propose an efficient and secure SPE scheme that supports Boolean keywords search, which is more expressive than conjunctive and disjunctive keywords search. We first develop a keyword conversion method that transforms the index and the Boolean keywords query into a group of vectors. Then, by applying a technique called dual pairing vector space to encrypt the obtained vectors, we propose a concrete scheme that is proven secure under chosen keyword attack. Finally, we present a detailed theoretical and experimental analysis to demonstrate the efficiency of our scheme.
Currently, thousands of information retrieval systems, such as e-mail systems, database management systems, and document management systems, are operating successfully in both the government and private sectors. As the data stored in these systems grow rapidly, more and more people want to migrate these data to the cloud. To preserve data privacy, users often encrypt the data before uploading them to the cloud. Since encrypted data are difficult to retrieve, how to execute keyword search over encrypted data has attracted tremendous research attention over the past few years. Among these research efforts, searchable encryption (SE) is one of the most important techniques for searching over encrypted data [1, 2].
SE enables data users to retrieve the encrypted data of interest from a cloud server without decrypting the data. Commonly, SE is divided into two categories: searchable symmetric key encryption (SSE) and searchable public key encryption (SPE). In recent years, many SSE schemes have been proposed to support keyword search over encrypted data [3–6]. In SSE, the key for encrypting data is the same as the key for generating search trapdoors. By contrast, in SPE, the key for encrypting data is public, while the key for generating search trapdoors is given only to the authorized data receivers. Compared with SSE, SPE is more suitable for situations in which there are many data senders and only a few data receivers, e.g., e-mail systems, personal health records, and wireless sensor networks. As illustrated in Figure 1, in the e-mail scenario, the security requirements can be summarized as follows: (1) any data sender can generate encrypted e-mail data; (2) only the data receiver can query and decrypt the encrypted e-mail data; (3) apart from the data receiver, no other entity, including the cloud server, can learn the content of the encrypted e-mail data. Since the security characteristics of SPE satisfy all these requirements, SPE is well suited to this application. Therefore, how to construct an efficient and secure SPE scheme supporting keyword search has long been a hot topic in the field of SE.
The first SPE scheme supporting keyword search was introduced by Boneh et al. and is called public key encryption with keyword search (PEKS). However, their work only supports single keyword search. To support more expressive queries, many SPE schemes [10–12, 16] were proposed to realize advanced search, for example, public key encryption with conjunctive keywords search (PECK) and with disjunctive keywords search (PEDK). In practice, most applications need a more advanced search function than conjunctive and disjunctive keywords search. More precisely, many applications require Boolean keywords search. For example, in an e-mail system, users may want to issue a Boolean query over four keywords. A naive idea is that a Boolean query can be realized by adapting a PECK or PEDK scheme, i.e., by combining the query results of conjunctive or disjunctive keywords searches. However, we argue that this simple method has many drawbacks. To illustrate our motivation, we construct a naive scheme, based on a PECK or PEDK scheme, that supports a Boolean keywords search over three keywords. We then briefly review this simple solution and explain why it is unsatisfactory.
The approach is to first execute each conjunctive subquery separately by using the PECK scheme and then take the union of the results. However, this method leaks the trapdoor of each subquery. By using these trapdoors, the search results of the individual subqueries are also leaked. Over time, an adversary may combine this information to infer the contents of the user’s documents. Alternatively, we can execute each disjunctive subquery separately by using the PEDK scheme and then take the intersection of the results. However, this method has the same drawback.
In this paper, we seek to construct a secure and efficient SPE scheme supporting Boolean keyword search that is not based on the PECK and PEDK schemes. We define a Boolean keywords search Q as a combination of conjunctive normal form (CNF) and disjunctive normal form (DNF), denoted by $Q = Q_1 \vee Q_2 \vee \cdots \vee Q_m$, where $Q_i$ is defined as $w_{i,1} \wedge w_{i,2} \wedge \cdots \wedge w_{i,m_i}$. Here, each $w_{i,j}$ is a keyword. This Boolean keywords search is more expressive than conjunctive and disjunctive keywords search. The contributions of our work are summarized as follows:
(1) Inspired by a keyword conversion method introduced in prior work, we create a novel keyword conversion method which transforms the index keyword set and the Boolean query into an attribute vector and a group of predicate vectors, respectively. These vectors can efficiently realize Boolean keywords search by an inner product operation. Moreover, the vector dimension is much smaller than that generated by the previous method.
(2) By carefully applying the existing technique called dual pairing vector space (DPVS) to encrypt the attribute and predicate vectors, we propose a secure and efficient SPE scheme supporting Boolean keywords search (SPE-BKS), which can accomplish Boolean keywords search over encrypted data with better search efficiency than previous schemes.
Moreover, we introduce a formal security definition for SPE-BKS and give a detailed proof to demonstrate that our scheme is secure against chosen keyword attack. To verify the efficiency of the proposed scheme, we conduct an experiment comparing our scheme with some recent schemes over a real-world dataset (Enron Email Dataset).
1.3. Related Work
The first SPE scheme supporting keyword search was introduced by Boneh et al. They called it public key encryption with keyword search (PEKS), which only supports single keyword search. To support multikeyword search, Park et al. proposed an SPE scheme supporting conjunctive keyword search, which is called public key encryption with conjunctive keywords search (PECK). In their scheme, each keyword is associated with a keyword field. The keyword field mechanism is based on two assumptions: one is that the keywords in a keyword field must be arranged in a preset order; the other is that the same keyword never appears in two different keyword fields of the same document. However, in many applications, keyword fields make multikeyword search impractical. For instance, in an e-mail system, the keyword fields usually contain “From,” “To,” and “Title.” Many e-mails may have the same keyword in different keyword fields, e.g., “From: LeBron James” and “To: James Harden.” Moreover, the keywords in the keyword field “Title” may be organized in alphabetical order. To address this issue, subsequent work aimed to create PECK schemes without keyword fields. Boneh and Waters proposed a public key encryption scheme called hidden vector encryption, which can efficiently support conjunctive keywords search without keyword fields. After this, some PECK schemes with better performance were proposed in [12–15]. To support disjunctive keyword search over encrypted data without keyword fields, Katz et al. introduced a novel encryption scheme called predicate encryption supporting inner product, which is also known as inner product encryption (IPE). By converting the index and query into an attribute vector and a predicate vector, respectively, a public key encryption with disjunctive keywords search (PEDK) scheme can be built on top of the IPE scheme.
Considering that previous SPE schemes cannot use one trapdoor to realize conjunctive and disjunctive keywords search simultaneously, Zhang et al. proposed two public key encryption with conjunctive and disjunctive keyword search (PECDK) schemes [17, 18], which can efficiently support conjunctive and disjunctive keyword search at the same time. To support expressive queries over encrypted data, based on the Paillier cryptosystem with threshold decryption (PCTD), Yang et al. proposed an SPE scheme supporting versatile search query patterns, such as range, conjunctive, disjunctive, and Boolean keywords search. Miao et al. presented a hybrid keyword-field search scheme that supports both keyword search and range search simultaneously. In addition, their scheme provides an efficient key management mechanism to reduce the storage cost of keys. For fuzzy keyword search, Yang et al. designed a method to segment keywords according to the position of wildcards and proposed an SPE scheme supporting wildcard keyword search by combining the segmentation method and PCTD. To support keyword search over arbitrary languages, Yang et al. devised a general method that converts a variety of languages into a uniform big integer. By using this conversion method and PCTD, they constructed an SPE scheme supporting multikeyword rank search in arbitrary languages. To add an access control mechanism to SE, Li et al. created an attribute-based encryption (ABE) scheme which supports not only keyword search but also update operations for users’ ciphertexts and secret keys. Then, they presented an outsourced ABE scheme supporting keyword search, which can partially transfer the operations of decryption and key issuing to the cloud server. He et al. proposed an SPE scheme which can control users’ search permissions according to an access control policy. Miao et al. proposed an attribute-based keyword search scheme under a shared multiowner setting.
Zhang et al. proposed an SPE scheme achieving both Boolean keywords search and fine-grained search permission. For the problem of tensor decomposition over encrypted data, by carefully combining homomorphic encryption and blockchain techniques, Feng et al. designed several schemes to implement different types of tensor decomposition, such as high-order Bi-Lanczos and Tucker decomposition [29–31]. To improve the efficiency of SPE, Hwang et al. created a more efficient SPE scheme by replacing the bilinear pairing operation with the ElGamal encryption system. Lu et al. proposed a certificateless encryption scheme supporting keyword search under a multirecipient setting. To obtain better efficiency, their scheme avoids the costly bilinear pairing operation. Considering the scenario in which devices have limited resources, two secure and efficient energy-saving platforms were proposed to protect users’ sensitive data [34, 35]. To resist DoS attacks, Li et al. gave an efficient remote user authentication and privacy-preserving scheme by adopting a technique called extended chaotic maps. To improve search accuracy, Zhang et al. proposed an SPE scheme supporting semantic keywords search by adopting a method called “Word2vec.”
This paper is organized as follows. In Section 2, we give the framework of SPE-BKS and its security definition. Some basic tools are also provided in the section. In Section 3, the construction of SPE-BKS is given, and its security proof is also presented. The experimental and theoretical analysis is provided in Section 4. We conclude this paper in Section 5.
In this section, we will give a formal definition of the framework and security model of SPE-BKS. In addition, we also briefly introduce some basic ingredients used in our scheme, including dual pairing vector space (DPVS), two important lemmas, and complexity assumption.
2.1. Framework of SPE-BKS
The SPE-BKS consists of three roles: data sender, data receiver, and cloud server. The responsibilities of these three roles are listed as follows:
(1) The data receiver generates the public key (pk) and secret key (sk) and publishes pk. The data receiver also generates the trapdoor for any query of his/her interest and sends the trapdoor to the cloud server.
(2) For a message M with a keyword set $W$, the data sender encrypts $W$ to create the encrypted index $I_W$ by using pk. Moreover, the data sender produces the encrypted message C for M. After this, the data sender sends $I_W$ and C to the cloud server.
(3) When the cloud server receives the trapdoor generated by the data receiver, the server tests the trapdoor against each encrypted index and returns the matched messages to the receiver.
According to the responsibilities of these three roles, we give a formal definition of the framework of SPE-BKS.
Definition 1. SPE-BKS consists of four polynomial-time algorithms (KeyGen, IndexBuild, Trapdoor, and Test) as follows:
(1) KeyGen($1^{\lambda}$): this algorithm is run by the data receiver. It takes a security parameter $\lambda$ as input and outputs pk and sk.
(2) IndexBuild(pk, $W$): this algorithm is executed by the data sender to encrypt the keyword set $W$. It produces a searchable encrypted index $I_W$ by using pk and $W$.
(3) Trapdoor(pk, sk, Q): this algorithm is executed by the receiver to construct a trapdoor of Q. It takes pk, sk, and Q as input and outputs a trapdoor $T_Q$.
(4) Test(pk, $T_Q$, $I_W$): this algorithm is run by the cloud server. For the query $Q = Q_1 \vee \cdots \vee Q_m$ and the index keyword set $W$, we say that Q matches $W$ if there exists some $Q_i$ such that the keyword set in $Q_i$ is a subset of $W$. The algorithm takes a trapdoor $T_Q$, a secure index $I_W$, and pk as input and outputs 1 if Q matches $W$, or 0 otherwise.
For a query Q and a keyword set $W$, let pk, sk, $I_W$, and $T_Q$ be correctly generated by the algorithms KeyGen($1^{\lambda}$), IndexBuild(pk, $W$), and Trapdoor(pk, sk, Q), respectively. The correctness property requires that the following two conditions be met:
(1) If some conjunctive clause $Q_i$ of Q has all of its keywords in $W$, Test(pk, $T_Q$, $I_W$) outputs 1
(2) Otherwise, Test(pk, $T_Q$, $I_W$) outputs 1 with negligible probability
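The predicate that Test is meant to evaluate can be stated on plaintext data, which is useful as a reference when checking an implementation. The following is a minimal sketch, assuming the query is given as a list of conjunctive clauses with one keyword set per clause; the function name `matches` is hypothetical and not part of the scheme.

```python
from typing import Iterable, List, Set


def matches(query: List[Set[str]], keywords: Iterable[str]) -> bool:
    """Plaintext reference for the Boolean predicate that Test evaluates.

    `query` is a list of conjunctive clauses (one keyword set per clause).
    The query matches when at least one clause is a subset of `keywords`.
    """
    keyword_set = set(keywords)
    return any(clause <= keyword_set for clause in query)


if __name__ == "__main__":
    index_keywords = {"cloud", "search", "encryption"}
    # (cloud AND privacy) OR (search AND encryption)
    query = [{"cloud", "privacy"}, {"search", "encryption"}]
    print(matches(query, index_keywords))
```

The encrypted Test algorithm should agree with this predicate on every input, except for the negligible error allowed by the correctness property.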
In practice, data senders will send a message M with a keyword set $W$. The above algorithms aim to construct a secure and searchable index for $W$. For the message M, we can apply a symmetric encryption scheme, e.g., AES or triple DES, to protect M. Like previous SPE schemes, we concentrate only on the searchable encryption part.
2.2. Security Definition of the SPE-BKS
In this section, we present a formal security definition for SPE-BKS, in which an adversary can adaptively query the trapdoors of chosen keyword queries and issue two challenge keyword sets. The essence of the security of SPE-BKS is that the adversary fails to distinguish the encrypted indices of the two challenge keyword sets based on the given trapdoors. Inspired by the security definitions of previous SPE schemes, the security definition of SPE-BKS is given as follows.
Definition 2. An SPE-BKS scheme is adaptively index-hiding against chosen keyword attack if, for all probabilistic polynomial-time (PPT) adversaries $\mathcal{A}$, the advantage of $\mathcal{A}$ in the following game is negligible in the security parameter $\lambda$:
(1) Setup: the challenger runs the KeyGen algorithm to generate pk and sk and gives pk to the attacker $\mathcal{A}$.
(2) Phase 1: the attacker $\mathcal{A}$ can adaptively ask the challenger for the trapdoor $T_Q$ for any query Q of his/her choice.
(3) Challenge: $\mathcal{A}$ first selects two keyword sets $W_0$ and $W_1$ and sends them to the challenger. Suppose that $Q^{(1)}, \ldots, Q^{(t)}$ are the keyword queries for which trapdoors were requested in Phase 1; the only restriction is that these queries cannot distinguish the two challenge keyword sets, i.e., each of them either matches both $W_0$ and $W_1$ or matches neither. Then, the challenger randomly chooses a bit $\beta \in \{0, 1\}$ and generates $I_{W_\beta}$. Finally, $I_{W_\beta}$ is sent to $\mathcal{A}$.
(4) Phase 2: $\mathcal{A}$ continues to ask for the trapdoor $T_Q$ for any query Q of his/her choice under the restriction mentioned in the Challenge phase.
(5) Response: the attacker $\mathcal{A}$ outputs $\beta'$ and wins the game if $\beta' = \beta$.
Based on the above game, the advantage of $\mathcal{A}$ is defined as follows:
$$\mathrm{Adv}_{\mathcal{A}}(\lambda) = \left| \Pr[\beta' = \beta] - \frac{1}{2} \right|.$$
2.3. Prime Order Bilinear Group
Let $G$, $G_T$ be two cyclic groups of prime order p. A bilinear pairing map $e: G \times G \rightarrow G_T$ has the following three properties:
(1) Bilinear: $e(g^a, h^b) = e(g, h)^{ab}$, where $a, b \in \mathbb{Z}_p$ and $g, h \in G$
(2) Nondegenerate: if $g$ is a generator of $G$, then $e(g, g) \neq 1$
(3) Computable: for any $g, h \in G$, $e(g, h)$ can be efficiently computed
An efficient bilinear map can be obtained by applying the Weil pairing or the Tate pairing.
2.4. Dual Pairing Vector Space
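Suppose that $\vec{v} = (v_1, \ldots, v_l) \in \mathbb{Z}_p^l$ and $g \in G$; we have $g^{\vec{v}} = (g^{v_1}, \ldots, g^{v_l})$. We can perform scalar multiplication and vector addition in the exponent. For any $a \in \mathbb{Z}_p$ and $\vec{v}, \vec{w} \in \mathbb{Z}_p^l$, we have $g^{a\vec{v}} = (g^{a v_1}, \ldots, g^{a v_l})$ and $g^{\vec{v}+\vec{w}} = (g^{v_1+w_1}, \ldots, g^{v_l+w_l})$. We can also define $e_l(g^{\vec{v}}, g^{\vec{w}}) = \prod_{i=1}^{l} e(g^{v_i}, g^{w_i})$ and obtain $e_l(g^{\vec{v}}, g^{\vec{w}}) = e(g, g)^{\vec{v} \cdot \vec{w}}$. Here, the dot product is taken modulo $p$.
We employ the concept of DPVS and follow the notation used in prior work on DPVS. Suppose that $\mathbb{B} = (\vec{b}_1, \ldots, \vec{b}_l)$ and $\mathbb{B}^* = (\vec{b}^*_1, \ldots, \vec{b}^*_l)$ are two random bases of $\mathbb{Z}_p^l$, where l is a fixed dimension; if $\vec{b}_i \cdot \vec{b}^*_j = 0 \pmod p$ whenever $i \neq j$ and $\vec{b}_i \cdot \vec{b}^*_i = \psi \pmod p$ for all $i$, where $\psi$ is a random element in $\mathbb{Z}_p$, then we call $\mathbb{B}$ and $\mathbb{B}^*$ dual orthonormal bases. Obviously, for a generator $g \in G$, $e_l(g^{\vec{b}_i}, g^{\vec{b}^*_j}) = 1$ whenever $i \neq j$, where 1 can be seen as the identity element of $G_T$.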
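The dual orthonormal property can be checked numerically. The sketch below samples a random invertible matrix over Z_p, uses its rows as one basis, and takes a scaled transpose of its inverse as the dual basis. The prime `P`, the helper names, and the use of Gauss-Jordan elimination are assumptions made for illustration, and the vectors are handled directly rather than in the exponent of a pairing group, so this shows only the linear algebra and offers no security.

```python
import random

P = 2**61 - 1  # toy prime modulus (assumption); a real scheme uses the pairing group order


def inverse_mod_p(matrix, p):
    """Invert a square matrix over Z_p by Gauss-Jordan elimination; None if singular."""
    n = len(matrix)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] % p != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = pow(aug[col][col], -1, p)  # modular inverse (Python 3.8+)
        aug[col] = [(v * scale) % p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] % p != 0:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % p for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dual_orthonormal_bases(l, p=P, rng=None):
    """Return (B, B*, psi) with b_i . b*_j = psi if i == j, else 0 (mod p)."""
    rng = rng or random.Random()
    while True:
        b = [[rng.randrange(p) for _ in range(l)] for _ in range(l)]
        b_inv = inverse_mod_p(b, p)
        if b_inv is not None:
            break
    psi = rng.randrange(1, p)
    # Row j of B* is psi times column j of B^{-1}.
    b_star = [[(psi * b_inv[k][j]) % p for k in range(l)] for j in range(l)]
    return b, b_star, psi


def dot(u, v, p=P):
    """Inner product modulo p."""
    return sum(x * y for x, y in zip(u, v)) % p


if __name__ == "__main__":
    basis, dual, psi = dual_orthonormal_bases(4, rng=random.Random(1))
    ok = all(dot(basis[i], dual[j]) == (psi if i == j else 0)
             for i in range(4) for j in range(4))
    print("dual orthonormal:", ok)
```

Because row i of B times column j of the inverse is 1 when i = j and 0 otherwise, scaling the dual vectors by psi gives exactly the two conditions stated above.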
2.5. Two Important Lemmas
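We will introduce two important lemmas used in the security proof of our scheme. The first lemma and the accompanying notation follow prior work on DPVS. Let t, l be two fixed positive integers where $t \le l$, $A \in \mathbb{Z}_p^{t \times t}$ be an invertible matrix, and $S_t \subseteq [l]$ be a subset of size t. Suppose that $\mathbb{B}$ and $\mathbb{B}^*$ are random dual orthonormal bases; a new pair of dual orthonormal bases $\mathbb{B}_A$ and $\mathbb{B}^*_A$ is defined as follows.
Let $B_t$ be an $l \times t$ matrix over $\mathbb{Z}_p$ whose columns are the vectors $\vec{b}_i \in \mathbb{B}$ such that $i \in S_t$. We can easily find that $B_t A$ is also an $l \times t$ matrix. By keeping all of the vectors $\vec{b}_i$ for $i \notin S_t$ and exchanging $\vec{b}_i$ for $i \in S_t$ with the columns of $B_t A$, $\mathbb{B}_A$ is then constructed. Similarly, let $B^*_t$ be the $l \times t$ matrix whose columns are the vectors $\vec{b}^*_i \in \mathbb{B}^*$ such that $i \in S_t$. Because $B^*_t (A^{-1})^{T}$ is also an $l \times t$ matrix, $\mathbb{B}^*_A$ can be constructed by using the same method.
For a fixed dimension l and prime p, we denote randomly choosing a pair of dual orthonormal bases $\mathbb{B}$ and $\mathbb{B}^*$ of $\mathbb{Z}_p^l$ by $(\mathbb{B}, \mathbb{B}^*) \leftarrow Dual(\mathbb{Z}_p^l)$. $Dual(\mathbb{Z}_p^l)$ can be viewed as a dual orthonormal bases set.
The first lemma is described as follows.
Lemma 1. For any fixed positive integers $t \le l$, any fixed invertible $A \in \mathbb{Z}_p^{t \times t}$, and any set $S_t \subseteq [l]$ of size t, if $(\mathbb{B}, \mathbb{B}^*) \leftarrow Dual(\mathbb{Z}_p^l)$, then $(\mathbb{B}_A, \mathbb{B}^*_A)$ is also distributed as a random sample from $Dual(\mathbb{Z}_p^l)$. In particular, the distribution of $(\mathbb{B}_A, \mathbb{B}^*_A)$ is independent of $A$.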
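The second lemma, which appears as Lemma 23 in prior work, is described as follows.
Lemma 2. Let $C = \{(\vec{x}, \vec{v}) \mid \vec{x} \cdot \vec{v} \neq 0\} \subset V \times V^*$, where $V$ is the l-dimensional vector space $\mathbb{Z}_p^l$ and $V^*$ is its dual. For all $(\vec{x}, \vec{v}) \in C$ and all $(\vec{r}, \vec{w}) \in C$,
$$\Pr[\vec{x} U = \vec{r} \wedge \vec{v} Z = \vec{w}] = \frac{1}{\#C},$$
where $Z \leftarrow GL(l, \mathbb{Z}_p)$ and $U = (Z^{-1})^{T}$.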
2.6. Complexity Assumption
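For a fixed dimension $n'$ and a prime $p$, randomly choosing a pair of dual orthonormal bases $\mathbb{B}$ and $\mathbb{B}^*$ is denoted by $(\mathbb{B}, \mathbb{B}^*) \leftarrow Dual(\mathbb{Z}_p^{n'})$. $Dual(\mathbb{Z}_p^{n'})$ can be seen as a dual orthonormal bases set. For a positive integer $k \le n'/3$, the definition of this assumption is described as follows.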
Definition 3 (subspace complexity). Given a group generator , we define the following distribution:We assume that, for any PPT algorithm A with output in , the advantage of defined by is negligible in the security parameter .
3. The Proposed SPE-BKS Scheme
In this section, we first introduce a keyword conversion method which converts the index and query keywords into a group of vectors. Then, through taking advantage of DPVS to encrypt these vectors, the construction of SPE-BKS is given. Finally, the security proof of our scheme is presented.
3.1. Keyword Conversion Method
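Before describing the method, some notations will be introduced. Suppose that any keyword can be expressed as a string in $\{0, 1\}^*$; we define a function $H: \{0, 1\}^* \rightarrow \mathbb{Z}_p$. Since p is a large prime and is larger than the number of all words, $H$ can be collision-resistant. This means that if $w_i \neq w_j$, then $H(w_i) \neq H(w_j)$, where $w_i$ and $w_j$ are two distinct keywords.
For the index keyword set $W = \{w_1, \ldots, w_n\}$, we construct an equation of degree n with one unknown:
$$f(x) = (x - H(w_1))(x - H(w_2)) \cdots (x - H(w_n)) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0.$$
According to the coefficients of $f(x)$, the vector $\vec{x} = (a_0, a_1, \ldots, a_n)$ for $W$ is obtained.
For the query $Q = Q_1 \vee \cdots \vee Q_m$, we first split Q into a group of keyword sets. For each $Q_i$, we obtain a keyword set $\{w_{i,1}, \ldots, w_{i,m_i}\}$, where $1 \le i \le m$. For each such keyword set, we create a vector $\vec{y}_i$ of dimension $n + 1$ from the powers of the hashed keywords, such that $\vec{x} \cdot \vec{y}_i = 0 \pmod p$ when every keyword of $Q_i$ is a root of $f(x)$, i.e., belongs to $W$.
As a result, we can test each $\vec{y}_i$ of Q against $\vec{x}$ to make a Boolean keywords search. If some $Q_i$ has all of its keywords in $W$, there is at least one $\vec{y}_i$ such that $\vec{x} \cdot \vec{y}_i = 0$. Based on this property, a concrete SPE-BKS scheme will be proposed in the next section.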
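The conversion can be illustrated on plaintext vectors. The sketch below builds the coefficient vector of the polynomial whose roots are the hashed index keywords and, for each conjunctive clause, a vector of randomized power sums of the hashed query keywords, so that the inner product equals a random combination of the polynomial values at the query keywords. The use of SHA-256 for the hash, the toy prime `P`, the random coefficients `r`, and all function names are assumptions made for illustration; they are one plausible instantiation, not necessarily the exact vectors used by the scheme.

```python
import hashlib
import random

P = 2**61 - 1  # toy prime modulus (assumption)


def hash_keyword(word, p=P):
    """Map a keyword to Z_p (SHA-256 reduced mod p is an assumed instantiation of H)."""
    return int.from_bytes(hashlib.sha256(word.encode("utf-8")).digest(), "big") % p


def attribute_vector(keywords, n, p=P):
    """Coefficients (a_0, ..., a_n) of f(x) = prod (x - H(w)) over the index keywords."""
    coeffs = [1]
    for word in sorted(set(keywords)):
        root = hash_keyword(word, p)
        nxt = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] = (nxt[k] - root * c) % p   # term -root * c * x^k
            nxt[k + 1] = (nxt[k + 1] + c) % p  # term c * x^(k+1)
        coeffs = nxt
    if len(coeffs) > n + 1:
        raise ValueError("more than n index keywords")
    return coeffs + [0] * (n + 1 - len(coeffs))  # pad when fewer than n keywords


def predicate_vector(clause, n, rng, p=P):
    """Vector y with y_k = sum_j r_j * H(w_j)^k for random nonzero r_j (assumed form)."""
    y = [0] * (n + 1)
    for word in clause:
        h, r = hash_keyword(word, p), rng.randrange(1, p)
        for k in range(n + 1):
            y[k] = (y[k] + r * pow(h, k, p)) % p
    return y


def boolean_search(query, keywords, n, rng=None, p=P):
    """True if some clause vector has zero inner product with the attribute vector."""
    rng = rng or random.Random()
    x = attribute_vector(keywords, n, p)
    return any(
        sum(a * b for a, b in zip(x, predicate_vector(clause, n, rng, p))) % p == 0
        for clause in query
    )


if __name__ == "__main__":
    index_keywords = ["alice", "budget", "q3"]
    query = [["alice", "bob"], ["budget", "q3"]]  # (alice AND bob) OR (budget AND q3)
    print(boolean_search(query, index_keywords, n=3))
```

With this choice, the inner product for a clause is the sum of r_j times f(H(w_j)), which is zero when every clause keyword is a root of f. If some keyword is not a root, the sum is zero only with probability about 1/p over the random r_j, which matches the negligible error allowed by the correctness property.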
According to Definition 1, we present a concrete construction of our SPE-BKS scheme:(i)KeyGen: choosing a bilinear group G of a prime order and setting n′ = 3n + 3, the algorithm randomly selects a pair of dual orthonormal bases from the dual orthonormal bases set , where , and (mod p), where . The algorithm outputs pk and sk as follows:where .(i)IndexBuild: given a keyword set , the algorithm constructs an n-degree polynomial, where are n roots of the equation f (x) = 0. Choosing two random elements , for the vector , this algorithm creates the index as follows:(i)Trapdoor: given a query Q, this algorithm first generates a group of vectors , by using the keyword conversion method introduced in Section 3.1. Then, it randomly chooses and an invertible matrix . Suppose that and in which and , where , for each , the trapdoor generation algorithm computes(i)The trapdoor of Q is .(ii)Test: the test algorithm first computes for each . Suppose that ; it outputs where and . Based on , the test algorithm works as follows:(1)Choose a counter , and set .(2)If , then go to step (3); otherwise, the algorithm computes. If , the algorithm outputs 1 and ends. Otherwise, it sets and goes to step (2).(3)The algorithm outputs 0 and ends.
Suppose that and are correctly generated by the “IndexBuild” and “Trapdoor” algorithms, respectively, then we have the following equation:where and .
Owing to , based on the equation above, we have the following equation:
If there exists some such that , it has , which makes , and, thus, the test algorithm outputs 1.
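The reason a zero inner product makes Test accept can be seen from a simplified derivation. It assumes, purely for illustration, that the index and the i-th trapdoor component encode only the attribute vector $(a_0, \ldots, a_n)$ and the predicate vector $(y_{i,0}, \ldots, y_{i,n})$ over the first $n + 1$ vectors of dual orthonormal bases $\vec{b}_k$, $\vec{b}^*_k$ with $\vec{b}_k \cdot \vec{b}^*_k = \psi$, using hypothetical blinding scalars $\delta$ and $\sigma_i$. The additional randomizing components that the actual construction places in the remaining dimensions are ignored, and $e_{n'}$ denotes the componentwise product of pairings.

```latex
\begin{aligned}
I_W &= g^{\delta \sum_{k=0}^{n} a_k \vec{b}_{k+1}}, \qquad
T_i = g^{\sigma_i \sum_{k=0}^{n} y_{i,k} \vec{b}^{*}_{k+1}}, \\
e_{n'}(I_W, T_i)
  &= e(g, g)^{\delta \sigma_i \sum_{k=0}^{n} \sum_{k'=0}^{n} a_k \, y_{i,k'} \,
       (\vec{b}_{k+1} \cdot \vec{b}^{*}_{k'+1})} \\
  &= e(g, g)^{\psi \, \delta \, \sigma_i \, (\vec{x} \cdot \vec{y}_i)} .
\end{aligned}
```

Under these assumptions, the cross terms vanish because the bases are dual orthonormal, so the pairing result is the identity element of $G_T$ exactly when $\vec{x} \cdot \vec{y}_i = 0 \pmod p$ (for nonzero $\psi$, $\delta$, and $\sigma_i$).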
According to the user’s identity, the proposed scheme works as follows:
(1) Data Receiver. The data receiver runs the “KeyGen” function to generate pk and sk, and pk is made public. When the data receiver wants to perform a Boolean keywords search, the “Trapdoor” function is called to generate a trapdoor by using sk and a Boolean query condition. After this, the trapdoor is sent to the cloud server.
(2) Data Sender. For a document set, the data sender builds the secure index by calling the “IndexBuild” function and sends the index to the cloud server.
(3) Cloud Server. Upon receiving a trapdoor generated by the data receiver, the cloud server launches the “Test” function and returns the documents associated with the query to the data receiver.
In the real world, any practical application that needs ciphertext retrieval can integrate our scheme to realize the function of searching on encrypted data.
To prove the security of our SPE-BKS system, we adopt the dual system encryption method proposed in [41, 42]. According to this method, we give the construction of semifunctional index and trapdoor in our scheme. The semifunctional index and trapdoor will not be implemented in the real system but used in the proof:(i)Semifunctional Index. Let , where i and is introduced in “KeyGen” algorithm. A normal index is constructed by the “IndexBuild” algorithm. Choosing random values , the semifunctional index is created as follows:(i)Semifunctional Trapdoor. Let , where i. A normal trapdoor is constructed by the “Trapdoor” algorithm. Choosing random values where , the semifunctional trapdoor is created as follows:
When using the semifunctional trapdoor to test the semifunctional index, the additional factors will be generated, where .
The security proof of our SPE-BKS scheme relies on subspace complexity assumption which is presented in Section 2.6. We will prove security by using a hybrid method which consists of a sequence of games. These games are described as follows:(1): this game is the real security game.(2): for each , is similar to except that the index given to is semifunctional and the first k trapdoors are semifunctional. The remaining trapdoors are normal. In , all the trapdoors given to are normal and the index is semifunctional. In , the index and all trapdoors are semifunctional.(3): suppose that a keyword set is the challenge keyword set; we construct an n-degree polynomial by using the function , where are n roots of the equation f (x) = 0. Then, we define this game. For each , is similar to except that index is a semifunctional encryption of a vector in which the first k + 1 elements are random and the remaining elements are . is a game such that the index is a semifunctional encryption of a real challenge keyword set, which is identical to . is a game such that the index is a semifunctional encryption of a random keyword set. We will show that these games are indistinguishable in the following lemmas.
Lemma 3. Suppose that there exists a PPT algorithm such that is nonnegligible. Then, we can build a PPT algorithm with nonnegligible advantage in breaking subspace complexity assumption, with n′ = 3n + 3, k = n + 1.
Lemma 4. Suppose that there exists a PPT algorithm such that is nonnegligible. Then, we can build a PPT algorithm with nonnegligible advantage in breaking subspace complexity assumption, with n′ = 3n + 3, k = n + 1.
Lemma 5. Suppose that there exists a PPT algorithm such that is nonnegligible. Then, we can build a PPT algorithm with nonnegligible advantage in breaking subspace complexity assumption, with n′ = 6, k = 2.
Considering the length of the article and the coherence of its structure, the proofs of Lemmas 3–5 are given in the Appendix.
Theorem 1. If subspace complexity assumption holds, then our SPE-BKS scheme is secure.
Proof. If subspace complexity assumption holds, the real security game is indistinguishable from based on the previous lemmas. In , the value of is information-theoretically hidden from the attackers. Hence, we can state that the attackers can attain no advantage in breaking our SPE-BKS scheme.
4. Performance Evaluation
In this section, we present a detailed experiment to demonstrate that our scheme can efficiently perform Boolean keywords search over encrypted data. We implement our scheme in Java with the Java Pairing-Based Cryptography (JPBC) library. In our implementation, the bilinear map is instantiated as a Type A pairing (base field size is 128 bits), which offers a level of security equivalent to 1024-bit DLOG. Our experiment is run on an Intel® Core™ i7 CPU at 2.90 GHz with 16 GB of memory, over a real-world e-mail dataset called the Enron Email Dataset. In our experiment, we randomly choose 1000 e-mails from the Enron Email Dataset and denote the number of documents by d (d = 1000). To show the efficiency of our scheme, we compare it with three previous SPE schemes in terms of key generation, index building, trapdoor generation, and search. For simplicity, we denote the three schemes introduced in [17, 18, 20] by PECDK-1, PECDK-2, and YY18, respectively. These three SPE schemes can perform conjunctive, disjunctive, and Boolean keywords search over encrypted data.
4.1. Key Generation
From Figure 2(a), the time costs of key generation in PECDK-1 and our scheme both grow as $O(n^2)$, while that in PECDK-2 grows as $O(n)$. The reason is that both our scheme and PECDK-1 adopt DPVS, which requires generating a number of group elements in G that is quadratic in n. Because the dimension of DPVS in our scheme is 3n while that in PECDK-1 is 4n, the time cost of key generation in our scheme is less than that in PECDK-1. In addition, since the key generation algorithm in YY18 is independent of n, its time cost is not related to n. Although the time cost of key generation in our scheme is higher than that in PECDK-2 and YY18, it has little impact on practical applications since this algorithm only runs during system initialization and key pair replacement.
As shown in Figures 3(a) and 3(b), because both pk and sk contain group elements in G, the space costs of the key pair in our scheme and PECDK-1 both grow with the square of n. By contrast, the space cost of the key pair in PECDK-2 grows as $O(n)$. Besides, for YY18, since both pk and sk contain a constant number of big integers, the space cost of the key pair is not related to n. Though the storage cost of keys in our scheme is more than that in the other three schemes, our scheme still does not need much space to store the keys, since only a few copies of these keys are stored.
4.2. Index Building
From Figure 2(b), the time costs of index building in PECDK-1, PECDK-2, and our scheme all grow as $O(n^2)$, while that in YY18 grows as $O(n)$. For PECDK-2, the index building algorithm needs to convert the keywords into a matrix and then needs exponentiations in G to encrypt the keywords. The proposed scheme and PECDK-1 also require exponentiations in G owing to DPVS. More precisely, compared with PECDK-1, our scheme needs less time for index building since the dimension of DPVS in our scheme is smaller than that in PECDK-1. Besides, the time cost of index building in our scheme is slightly higher than that in PECDK-2 since our scheme needs more exponentiations than PECDK-2. The reason is that, compared with PECDK-2, our scheme needs more group elements to support a more complex search function. Compared with YY18, our scheme needs more index building time since our scheme needs a number of exponentiations quadratic in n while YY18 only runs the encryption algorithm of PCTD n times.
For the storage cost of indices, the number of group elements of G in the index for our scheme is linear in n. For YY18, since each document’s index contains n ciphertexts generated by PCTD, the space cost of the index grows as $O(n)$. By contrast, the numbers of group elements in the indices for PECDK-1 and PECDK-2 both grow with the square of n since the index structures for PECDK-1 and PECDK-2 are both matrices. As shown in Figure 3(c), the storage costs of indices in our scheme and YY18 grow as $O(n)$, while those in PECDK-1 and PECDK-2 both grow as $O(n^2)$.
4.3. Trapdoor Generation
As shown in Figure 2(c), the time costs of trapdoor generation in PECDK-1, PECDK-2, YY18, and the proposed scheme grow as $O(n^2)$, $O(m)$, $O(m)$, and $O(mn^2)$, respectively. More precisely, for PECDK-1, the keywords in the query are first converted into a vector whose dimension is n. Then, this vector is encrypted by using DPVS. Since the encryption operation needs a number of exponentiations in G that is quadratic in n, the time cost of trapdoor generation in PECDK-1 grows as $O(n^2)$. For PECDK-2, suppose that the number of keywords in the query is m; the query is converted into a vector whose dimension is m, and each dimension needs one exponentiation in G. Thus, the time cost of trapdoor generation in PECDK-2 is linear in m. For YY18, if the query contains m keywords, the trapdoor algorithm performs the encryption algorithm of PCTD m times, so the time cost of trapdoor generation grows as $O(m)$. For the proposed scheme, the query is converted into m vectors, each of dimension n. After this, each vector is encrypted by making use of DPVS, and thus, the time cost of trapdoor generation in our scheme grows as $O(mn^2)$.
From Figure 3(d), the space costs of the trapdoor for PECDK-1, PECDK-2, YY18, and our scheme are linear in n, m, m, and mn, respectively. The reason is that the trapdoors in PECDK-1, PECDK-2, and our scheme contain n, m, and mn group elements of G, respectively, and the trapdoor in YY18 involves m ciphertexts of PCTD.
4.4. Search
As shown in Figure 2(d), the time cost of search in PECDK-1 grows as $O(n^2)$, while that in PECDK-2, YY18, and our scheme is linear in mn. More precisely, for PECDK-1, the index of W contains n ciphertexts, and each ciphertext needs n pairing operations. For PECDK-2, the index is a matrix, and the trapdoor is a vector whose dimension is m. The test algorithm in PECDK-2 performs mn pairing operations between the first m rows of the matrix and the vector. For YY18, since the index and trapdoor hold n and m ciphertexts of PCTD, respectively, the test algorithm runs the secure less or equal (SLE) protocol and the secure multiplication protocol across domains (SMD) mn times. For the proposed scheme, the trapdoor has m items, and the test algorithm performs n pairing operations between each item and the index. Thus, the total number of pairing operations in our scheme is linear in mn. Since PECDK-2 and our scheme need nearly 2mn and 3mn pairing operations, respectively, while the number needed by PECDK-1 is quadratic in n, the time cost in our scheme is slightly more than that in PECDK-2 and less than that in PECDK-1. Moreover, since the time cost of a pairing operation is less than that of SLE and SMD, our scheme is more efficient than YY18 in the test phase.
4.5. More Comments
As shown in the experimental results, when n = 5, d = 1000, and m = 5, the time cost of index building in our scheme is 331 s, the generation time of a single trapdoor is 1.7 s, and the search time is 142 s. According to the statistical data given in [17, 45], the number of keywords in a document (n) is usually less than 20 (e.g., a scientific paper typically has only 3 to 5 keywords), and the number of keywords in a query (m) is often less than 10. We therefore argue that our scheme is suitable for applications with few keywords, such as keywords of scientific literature, e-mail titles and summaries, and medical data summaries.
Figure 2 shows that the time complexity of our scheme is comparable to that of PECDK-2, yet our scheme also supports Boolean keywords search, which is more advanced than the conjunctive and disjunctive keywords search. Compared with YY18, which also supports Boolean keywords search, our scheme needs less search time, although it increases the index building time. In practice, index building is usually a one-time activity, while queries are performed frequently. Thus, we consider it worthwhile to sacrifice index building time to reduce retrieval time. As for the space complexity, Figure 3 shows that our scheme needs less space for index storage, though it requires more storage space for the trapdoor and keys. Considering that the trapdoor and keys often require much less storage space than the index, we argue that our scheme is practicable in the real world.
In this paper, by applying DPVS and the bilinear pairing, we proposed a searchable public key encryption scheme supporting Boolean keywords search, which is proven to be secure under chosen keyword attack. Compared with previous SPE schemes supporting conjunctive and disjunctive keywords search, the proposed scheme supports a more advanced search function. Moreover, a detailed experiment over a real-world dataset indicates that the efficiency of our scheme is suitable for practical applications with few keywords. Since the efficiency of our scheme still needs to be improved, we will construct a more efficient scheme in future work.
A. Proof of Lemma 3
Proof. Given , needs to decide whether are , distributed as , or , .
By using , C can simulate or with . To create pk, first randomly selects an invertible matrix . Then, we define dual orthonormal bases F and by , , , , , and , , , , , .
implicitly sets and where the matrix is applied as a change of basis matrix to and is applied as a change of basis matrix to , as described in Section 2.5. Note that the first 2n + 2 basis vectors are unchanged. According to Lemma 2, and are properly distributed.
Choosing a function , computes and sends it to . Each time asks to provide a key for a keyword query , creates a normal trapdoor of . Choosing and an invertible matrix and , computes and sends to , where , , .
At some point, sends two challenge keyword sets, and . By randomly choosing and computing an n-degree polynomial , sets where implicitly sets , .
Then, gives the index to . If are equal to , , then this is a properly distributed normal index. In this case, has properly simulated . If are equal to , , then there is an additional term of in the exponent part of the index. The coefficients in the basis are the vector . In order to acquire the coefficients in the basis , we multiply the matrix by the transpose of these vectors and obtain . Since is random, these coefficients are uniformly random. Therefore, in this case, has properly simulated . So, if can distinguish from with nonnegligible advantage, then can use the output of to break subspace assumption with nonnegligible advantage.
B. Proof of Lemma 4
Proof. Given , needs to decide whether are distributed as follows: , or ,