Abstract

Outsourcing data to cloud services is a good solution for users with limited computing resources. However, the privacy and confidentiality of data are jeopardized when data is transferred and shared in the cloud. Searchable encryption offers a way to solve these problems. Symmetric searchable encryption (SSE) is popular among researchers because it is efficient and secure, but it usually requires the data sender and data receiver to use the same key to generate keyword ciphertexts and trapdoors, which causes a key management problem. Searchable encryption based on public keys simplifies key management. A public key encryption scheme with keyword search (PEKS) allows multiple senders to encrypt keywords under the receiver’s public key. However, PEKS is vulnerable to keyword guessing attacks (KGA) because the keyword space is small. Public key authenticated encryption with keyword search (PAEKS) was proposed mainly to resist inside keyword guessing attacks (IKGA). Previous security models do not cover the indistinguishability of the same keywords, so the user’s search pattern is easily leaked. The essential reason is that the trapdoor generation algorithm is deterministic. Moreover, most existing schemes are built on bilinear pairings, which greatly reduces their efficiency. To address these problems, this paper introduces an improved PAEKS model. We design a lightweight public key authenticated encryption scheme based on the Diffie-Hellman protocol. We then prove the ciphertext indistinguishability and trapdoor indistinguishability of the scheme in the improved security model. Finally, we demonstrate its security and computational efficiency by comparing it with previous PAEKS schemes, and we conduct an experimental evaluation based on a cryptographic library. Experimental results show that, compared with other schemes, our scheme reduces the computational overhead of the ciphertext generation, trapdoor generation, and test algorithms by factors of 274, 158, and 60, respectively.

1. Introduction

Cloud storage services have become increasingly important and popular with the rapid development of cloud computing. Because of its high efficiency and low cost, cloud computing has attracted great attention from academia and industry. Users can upload their own data, such as text, audio, and video, which not only saves local storage costs but also allows data access at any time. In addition, cloud storage services make it possible to share data. However, cloud servers are not fully trusted and may be bribed or otherwise compromised by third parties, so transferring and storing plaintext data risks a privacy breach. Moreover, the data owner no longer directly manages and controls the data. Ensuring the security and privacy of data during sharing is therefore the first challenge in cloud storage.

Encrypting sensitive data before sending it to a cloud server is the most natural method. However, when ciphertext is stored directly on the cloud server, finding a specific file among hundreds or thousands of encrypted files becomes challenging. The most straightforward approach is for users to download and decrypt all their files and then locate the needed files or data in the resulting plaintext. Considering the high communication overhead between users and cloud storage and the cost of decrypting data on user-owned devices, this solution is clearly unrealistic. Ideally, the cloud server would search the ciphertext according to each user’s request and return only the matching ciphertext, which the user would then decrypt. Searchable encryption (SE) was proposed to solve this problem.

Searchable encryption falls into two types: symmetric searchable encryption (SSE) and public key encryption with keyword search (PEKS). In 2004, Boneh et al. [1] were the first to study the problem of searching data encrypted under a public key system. Any data sender (DS) can use the public key of the data receiver (DR) to encrypt keywords. The DR can generate a trapdoor and send it to the server through a secure channel to search the encrypted data. They also discussed the relationship between PEKS and identity-based encryption (IBE) and pointed out that a PEKS scheme that is semantically secure under adaptive chosen keyword attacks implies an IBE scheme with IND-ID-CCA security [2].

Byun et al. [3] observed that the keyword space is very small compared with the key space, showed that offline keyword guessing attacks are therefore possible, and successfully attacked Boneh’s scheme. The adversary could intercept the trapdoor and perform an unlimited number of tests on it, which allowed keyword guessing attacks to succeed. External attacks can be prevented by protecting the trapdoor from leakage, e.g., by establishing a secure communication channel between the receiver and the server so that only the server can access the trapdoor. Another solution is to limit the testing capability of the adversary, that is, to use PEKS with a designated tester. However, neither method works against malicious servers. How to resist malicious servers is another problem that PEKS must solve.

Huang and Li [4] introduced authentication into PEKS, namely public key authenticated encryption with keyword search (PAEKS), to resist malicious server attacks. In PAEKS, the DS both encrypts and authenticates keywords, and trapdoors can only be generated by legitimate receivers, which prevents adversaries from performing brute-force tests. Qin et al. [5] later questioned the PAEKS model of [4] because it fails to capture a realistic threat: an outsider mounting chosen multiciphertext attacks, i.e., determining whether two encrypted files share some keywords. The PAEKS scheme proposed by Pan and Li [6] follows scheme [5] and achieves multiciphertext indistinguishability (MCI-security) and multitrapdoor indistinguishability (MTI-security), but with high computational complexity. Soon afterwards, Qin et al. [7] discovered a security flaw in scheme [6]. Moreover, the ciphertext indistinguishability (CI-security) models of these schemes all fail to resist fully chosen keyword attacks.

1.1. Motivation and Contributions

This study is motivated by the scheme proposed by Qin et al. [7] and their analysis of the security of PAEKS. In their analysis of keyword guessing attacks, the authors of [7] pointed out that most schemes only consider ciphertext indistinguishability, i.e., semantic security under chosen keyword attacks. In other words, provided that the adversary does not obtain the trapdoors of the two challenge keywords, the adversary is only required to distinguish which of the two keywords the challenge ciphertext encrypts. Qin et al. also took into account a basic fact: a document usually contains multiple keywords, and the same keyword may appear several times.

Therefore, [7] reconsidered the ciphertext indistinguishability security model of PAEKS and proposed multiciphertext indistinguishability. It is worth noting that the adversary can request the ciphertext of any keyword, including the challenge keywords. This means that if a scheme resists fully chosen keyword attacks, it can protect the user’s access pattern. However, [7] only considers full ciphertext security, not full trapdoor security: the trapdoor generation algorithm in [7] is deterministic. A malicious adversary can therefore accurately record how often a user submits each trapdoor and thereby gain insight into the user’s search pattern. The search pattern of users must therefore also be protected. In addition, most existing PAEKS schemes are constructed from bilinear maps, which results in high computing overhead on the client side and poor practicability. A lightweight PAEKS scheme is needed for devices with limited computing power so that the technique can be applied more widely. Our contributions are as follows:
(i) Firstly, we discuss different types of adversary capabilities, analyze the security of PAEKS schemes, and describe the security model for hiding the matching relationship.
(ii) Secondly, we design a PAEKS scheme without bilinear pairings. Our scheme satisfies both trapdoor indistinguishability and ciphertext indistinguishability before the cloud server runs the Test algorithm. In other words, our basic scheme protects the user’s search pattern and access pattern against Type-II adversaries.
(iii) Finally, we provide an extended keyword approach that hides the matching relationship between keyword ciphertexts and trapdoors against Type-III adversaries even after the cloud server executes the Test algorithm. We also analyze the security of the scheme and compare its efficiency with that of previous schemes.

1.2. Organization

Section 2 reviews related work. Section 3 presents preliminary cryptographic notations and definitions. Section 4 analyzes previous PAEKS schemes. Section 5 presents our PAEKS scheme and a searchable encryption scheme derived from it. Section 6 gives the security proof of the scheme. Section 7 compares the time consumption of our scheme with that of other schemes. Finally, Section 8 concludes the paper.

2. Related Works

KGA is a serious concern for public key searchable encryption. One of the main reasons that keywords can be attacked under KGA is that the keyword space is of polynomial size. There have been several attempts in the literature to achieve security against inside keyword guessing attacks (IKGA). Xu et al. [8] proposed public key encryption with fuzzy keyword search (PEFKS), where each keyword corresponds to an exact keyword search trapdoor and a fuzzy keyword search trapdoor. Although the keyword space of PEFKS is very small, malicious parties are unable to learn the exact keyword. Later, Chen et al. [9] pointed out that, although the server in [8] cannot guess the exact keyword, keyword privacy is still not well protected once the server knows which small set the underlying keyword belongs to. Chen et al. [9] also analyzed another main cause of keyword guessing attacks.

This vulnerability arises because anyone who knows the receiver’s public key can generate the PEKS ciphertext of any keyword. Given a trapdoor, a malicious server can select a guessed keyword, generate its PEKS ciphertext, and test it against the trapdoor, repeating the process until the correct keyword is found. Chen et al. [9] proposed dual-server public key encryption with keyword search, which prevents this attack as long as the two servers do not collude. Sun et al. [10] considered malicious servers and explored whether a PEKS scheme secure against inside KGA can be built on different public key cryptosystems, such as PKI-based, identity-based, or certificateless cryptosystems. To improve efficiency, researchers proposed constructions of PAEKS based on word-independent smooth projective hash functions (SPHFs) and PEKS [11–19].
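The guessing loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: `ToyPEKS` is a hypothetical stand-in built from HMAC, not a real PEKS construction and not secure, and the keyword dictionary is invented for the example. The only property it models is the one the attack relies on, namely that the party holding a trapdoor can also produce ciphertexts for arbitrary guessed keywords.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Sequence, Tuple


class ToyPEKS:
    """Hypothetical stand-in for an unauthenticated PEKS scheme (NOT secure).

    In real PEKS anyone can encrypt with the receiver's public key; here the
    object itself plays that role so the attack logic can run end to end.
    """

    def __init__(self) -> None:
        self._secret = secrets.token_bytes(32)  # models the receiver's key

    def _keyword_tag(self, keyword: str) -> bytes:
        return hmac.new(self._secret, keyword.encode(), hashlib.sha256).digest()

    def encrypt(self, keyword: str) -> Tuple[bytes, bytes]:
        # Modeled as public: no sender secret is needed to encrypt.
        nonce = secrets.token_bytes(16)
        tag = hmac.new(self._keyword_tag(keyword), nonce, hashlib.sha256).digest()
        return nonce, tag

    def trapdoor(self, keyword: str) -> bytes:
        # Receiver-only operation.
        return self._keyword_tag(keyword)

    @staticmethod
    def test(ciphertext: Tuple[bytes, bytes], trapdoor: bytes) -> bool:
        # Server-side check: does this ciphertext match this trapdoor?
        nonce, tag = ciphertext
        expected = hmac.new(trapdoor, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)


def keyword_guessing_attack(
    scheme: ToyPEKS, trapdoor: bytes, dictionary: Sequence[str]
) -> Optional[str]:
    """Encrypt each guess and test it against the observed trapdoor."""
    for guess in dictionary:
        if scheme.test(scheme.encrypt(guess), trapdoor):
            return guess
    return None
```

Because the loop needs only `encrypt` and `test`, its cost is linear in the dictionary size, which is why a small keyword space makes the attack practical. Authenticating the ciphertext with a sender secret, as PAEKS does, removes the server's ability to run the `encrypt` step.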

Combining attribute-based encryption (ABE) with searchable encryption (SE) yields fine-grained search schemes suitable for multiuser scenarios. In ciphertext-policy ABE (CP-ABE), the ciphertext is linked to an access policy, and the user’s key is associated with a set of attributes. Attribute-based searchable schemes can therefore resist IKGA, with security resting mainly on that of the underlying attribute-based encryption. Zheng et al. [20] developed the first fine-grained searchable encryption scheme by using both variants of ABE. Some schemes [21–23] aim to reduce the amount of computation, while others [24–28] try to reduce the storage and computation cost of keyword ciphertexts, keys, and search tokens. In this paper, we mainly improve the computational and retrieval efficiency of keyword ciphertexts and trapdoors in the smallest basic unit of a searchable encryption system, namely, the case where only two parties share data. A clear understanding of this basic unit will in turn benefit multiuser scenarios.

3. Preliminaries

In this section, we introduce some basic concepts of cryptography, the definition of schemes, security models, threat models, and design goals.

3.1. Notations

The notations are described in Table 1.

3.2. Pattern Leakage

A query set consists of pairs, each made up of the timestamp of a query and the queried keyword. The access pattern and search pattern [29] are as follows:
(i) Access pattern. It reveals which documents containing a given keyword appear in the search results. For a queried keyword, the access pattern is defined by the identifiers of the documents containing that keyword.
(ii) Search pattern. It reveals the links between particular keywords and search queries, i.e., whether two queries are for the same keyword.

3.3. Inverted Index

A database index stores a mapping from content (such as words or numbers) to its locations in a table, a file, or a group of files.
(i) Create inverted index. First, all raw documents are numbered to form a document list. Then, keywords are extracted from the documents and used as index entries, and for each entry the numbers of the documents containing it are recorded. The resulting structure, which can also be viewed as an index matrix, is depicted in Figure 1.
(ii) Search process. The user enters a keyword, which is matched against the entries of the inverted index. Looking up the matching entry yields the numbers of all documents containing the keyword, and the documents are then retrieved from the document list by these numbers.
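The two steps above can be sketched as follows. This is a minimal plaintext illustration; the function names and the sample documents are assumptions made for the example, and no encryption is involved at this stage.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each keyword to the set of document numbers that contain it.

    `documents` maps a document number to its list of extracted keywords.
    """
    index = defaultdict(set)
    for doc_id, keywords in documents.items():
        for keyword in keywords:
            index[keyword].add(doc_id)
    return index


def search(index, keyword):
    """Return the sorted numbers of all documents containing `keyword`."""
    return sorted(index.get(keyword, set()))


documents = {
    1: ["cloud", "privacy"],
    2: ["cloud", "search"],
    3: ["search"],
}
index = build_inverted_index(documents)
```

With this sample data, `search(index, "cloud")` yields documents 1 and 2. In a searchable encryption system, the entries of such an index are what the data sender encrypts as keyword ciphertexts.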

3.4. The Adversary Attack Capability

We divide adversary capabilities into three types.
(i) Type-I: The adversary collects some keyword ciphertexts and trapdoors but does not know the correspondence between ciphertexts and plaintexts.
(ii) Type-II: The adversary collects all keyword ciphertexts and some of the trapdoors submitted by the data receiver, and knows only part of the matching relationship between the submitted trapdoors and the ciphertexts. In particular, the adversary does not know the matching relationship between the challenge trapdoor and the ciphertexts.
(iii) Type-III: The adversary stores all keyword ciphertexts and all trapdoors submitted by the data receiver, and knows the matching relationship between all trapdoors and ciphertexts, including the trapdoors of the challenge keywords.

3.5. The Computational Diffie-Hellman Assumption (CDH)

Computational Diffie-Hellman assumption (CDH): let $G_1$ be a cyclic group of prime order $q$, and let $P$ be a generator of $G_1$. Given a tuple $(P, aP, bP)$ of elements of $G_1$, where $a, b \in_R \mathbb{Z}_q^*$, there is no probabilistic polynomial time (PPT) algorithm that computes $abP \in G_1$ with non-negligible probability. We define the advantage of an algorithm $\mathcal{A}$ as $Adv^{CDH}_{\mathcal{A}} = \Pr[\mathcal{A}(P, aP, bP) = abP]$.

In other words, given the two elements $aP$ and $bP$ of $G_1$, the element $abP$ needs to be calculated, and this problem is assumed to be hard to solve in $G_1$.

3.6. The Hash Diffie-Hellman Assumption (HDH)

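Let $G_1$ be a cyclic group of prime order $q$ with generator $P$, and let $H$ be a secure cryptographic hash function defined on $G_1$. The HDH assumption states that, given $(P, aP, bP)$ with $a, b \in_R \mathbb{Z}_q^*$, the advantage of any polynomial-time adversary $\mathcal{A}$ in distinguishing $H(abP)$ from a random string is negligible. The advantage is $Adv^{HDH}_{\mathcal{A}} = |\Pr[\mathcal{A}(P, aP, bP, H(abP)) = 1] - \Pr[\mathcal{A}(P, aP, bP, R) = 1]|$, where $R$ is chosen uniformly at random from the range of $H$.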
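To make the objects in the CDH and HDH assumptions concrete, the following sketch computes a hashed Diffie-Hellman value from both sides of an exchange. It is a toy under stated assumptions: it uses the multiplicative group of integers modulo the Mersenne prime 2^127 - 1 rather than the additive group notation of the text, the base 3 is an arbitrary choice, and the parameters are far too weak for real use.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration only, not secure).
P_MOD = 2**127 - 1  # a Mersenne prime
GEN = 3             # arbitrary base element


def keygen():
    """Return (private exponent a, public value GEN^a mod P_MOD)."""
    sk = secrets.randbelow(P_MOD - 2) + 1
    return sk, pow(GEN, sk, P_MOD)


def hashed_dh(own_sk, other_pk):
    """Hash of the shared Diffie-Hellman value; the HDH assumption says this
    looks random to anyone who sees only the two public values."""
    shared = pow(other_pk, own_sk, P_MOD)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


a, pk_a = keygen()
b, pk_b = keygen()
key_from_a = hashed_dh(a, pk_b)
key_from_b = hashed_dh(b, pk_a)
```

Both parties obtain the same 32-byte value because (GEN^b)^a and (GEN^a)^b are equal modulo `P_MOD`. An eavesdropper holding only `pk_a` and `pk_b` would have to solve the CDH problem to recompute it.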

3.7. The Definition of PAEKS

A PAEKS scheme consists of five polynomial-time algorithms: Setup, KeyGen, PAEKS, Trapdoor, and Test. They are specified as follows:
(i) Setup: the algorithm for generating global parameters. It takes a security parameter as input and outputs the global public parameter.
(ii) KeyGen: the algorithm responsible for providing each participant with a public/private key pair. It consists of three parts:
(a) the sender’s key generation algorithm, which generates the sender’s public and private keys;
(b) the receiver’s key generation algorithm, which generates the receiver’s public and private keys;
(c) the cloud server’s key generation algorithm, which generates the cloud server’s public and private keys.
(iii) PAEKS: performed by the DS. Given a keyword, it returns the corresponding ciphertext.
(iv) Trapdoor: performed by the DR. Given a keyword, it returns the corresponding trapdoor.
(v) Test: performed by the cloud server. Given a keyword ciphertext and a trapdoor, it outputs 1 if they correspond to the same keyword; else, it outputs 0.

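Correctness: given a keyword ciphertext and a trapdoor generated honestly from two keywords, we formulate consistency as follows: if the two keywords are equal, Test outputs 1; otherwise, it outputs 0 with overwhelming probability.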
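The interface and the correctness condition can be exercised with a toy instantiation. The sketch below is not the construction proposed in this paper: it is a simplified, hypothetical scheme built from a Diffie-Hellman shared key and HMAC, it omits the cloud server's key pair, the module-level constants stand in for Setup, and its trapdoor is deterministic, which is precisely the weakness discussed later. The group parameters are toy values.

```python
import hashlib
import hmac
import secrets

# Toy group parameters standing in for Setup (assumptions, not secure).
P_MOD = 2**127 - 1
GEN = 3


def keygen():
    """KeyGen: return a (private, public) Diffie-Hellman key pair."""
    sk = secrets.randbelow(P_MOD - 2) + 1
    return sk, pow(GEN, sk, P_MOD)


def _shared_key(own_sk, other_pk):
    # Sender and receiver derive the same key from each other's public key.
    shared = pow(other_pk, own_sk, P_MOD)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def _keyword_tag(key, keyword):
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()


def paeks(keyword, sender_sk, receiver_pk):
    """PAEKS: randomized keyword ciphertext that only the sender can create."""
    tag = _keyword_tag(_shared_key(sender_sk, receiver_pk), keyword)
    nonce = secrets.token_bytes(16)
    return nonce, hmac.new(tag, nonce, hashlib.sha256).digest()


def trapdoor(keyword, receiver_sk, sender_pk):
    """Trapdoor: deterministic in this toy, so repeated searches are linkable."""
    return _keyword_tag(_shared_key(receiver_sk, sender_pk), keyword)


def paeks_test(ciphertext, td):
    """Test: True exactly when ciphertext and trapdoor share a keyword."""
    nonce, mac = ciphertext
    expected = hmac.new(td, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, mac)
```

For example, after `s_sk, s_pk = keygen()` and `r_sk, r_pk = keygen()`, the call `paeks_test(paeks("cloud", s_sk, r_pk), trapdoor("cloud", r_sk, s_pk))` returns `True`, while a trapdoor for a different keyword makes it return `False`. Since creating a ciphertext requires the sender's private key, the server cannot encrypt its own guesses, which is the intuition behind authenticated keyword search.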

3.8. System Model

The system includes the following parties: the cloud server (CS), the data sender (DS), and the data receiver (DR), as depicted in Figure 2.
(i) Data sender (DS): The DS generates its own public and private keys from the security parameter. It then extracts keywords from files, builds the index matrix, and computes the searchable keyword ciphertexts. Finally, the DS stores the ciphertexts on the CS.
(ii) Data receiver (DR): The DR uses the target keywords to generate search trapdoors and sends them to the CS.
(iii) Cloud server (CS): The CS has almost unlimited storage and computing power in the PAEKS system. It stores the searchable ciphertexts received from the DS, processes search queries, and returns the matching ciphertexts to the DR.

3.9. Threat Model

In this paper, we assume that the DS honestly follows the PAEKS algorithm to produce searchable ciphertexts for the DR, and that the DR honestly follows the Trapdoor algorithm to produce trapdoors. The CS is assumed to be honest-but-curious: it honestly performs the Test algorithm but is interested in the query results and the frequency information of ciphertexts.

3.10. The Security Models of PAEKS

The PAEKS semantic security model comprises full cipher-keyword indistinguishability (fully CI-security) and full trapdoor indistinguishability (fully TI-security). Our fully CI-security and fully TI-security models use the same settings as [4, 5, 7]. In addition, we propose that a PAEKS scheme facing a Type-II adversary should also hide the matching relationship between ciphertexts and trapdoors, i.e., satisfy the hidden matching relationship security model (HMR-security).

3.11. Fully CI-Security Model

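(i) Let $\mathcal{A}$ be an adversary and $\lambda$ the security parameter.
(ii) Initialization. The challenger $\mathcal{C}$ first runs the Setup algorithm to generate the system parameter. Then, $\mathcal{C}$ runs KeyGen to generate the key pairs of the DS, the DR, and the CS, respectively, and gives the corresponding public values to $\mathcal{A}$.
(iii) Phase 1. $\mathcal{A}$ makes polynomially many queries to the ciphertext oracle and the trapdoor oracle.
(iv) Challenge. After Phase 1, $\mathcal{A}$ outputs two challenge keywords $w_0^*$ and $w_1^*$. $\mathcal{C}$ picks a random coin $b \in \{0, 1\}$ and sends the ciphertext of $w_b^*$ to $\mathcal{A}$.
(v) Phase 2. $\mathcal{A}$ can continue to query the oracles as in Phase 1.
(vi) Guess. Finally, $\mathcal{A}$ returns a bit $b'$ as its guess of $b$.

$\mathcal{A}$ wins the above game if $b' = b$. The advantage of $\mathcal{A}$ in winning this game is defined as $|\Pr[b' = b] - 1/2|$.

Definition 1 (Fully CI-Security). A PAEKS scheme satisfies cipher-keyword indistinguishability against chosen keyword attacks if the advantage of succeeding in the above game is negligible for any polynomial-time adversary $\mathcal{A}$.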
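The role of the ciphertext oracle in this game can be illustrated with a small simulation. It is a sketch under stated assumptions: `det_encrypt` and `rand_encrypt` are hypothetical keyed toy encryptions built from HMAC, not PAEKS algorithms, and the adversary shown is one simple strategy rather than the best possible one. Because the fully CI model allows ciphertext queries on the challenge keywords themselves, a deterministic keyword encryption loses the game outright.

```python
import hashlib
import hmac
import random
import secrets


def det_encrypt(key, keyword):
    """Deterministic toy encryption: same keyword, same ciphertext."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()


def rand_encrypt(key, keyword):
    """Randomized toy encryption: a fresh nonce for every ciphertext."""
    nonce = secrets.token_bytes(16)
    mac = hmac.new(key, nonce + keyword.encode(), hashlib.sha256).digest()
    return nonce + mac


def play_ci_game(encrypt, rng):
    """Play one game; return True if the comparison adversary guesses b."""
    key = secrets.token_bytes(32)  # held by the challenger only

    def ciphertext_oracle(keyword):
        return encrypt(key, keyword)

    w0, w1 = "alpha", "beta"
    b = rng.randrange(2)                        # challenger's coin
    challenge = ciphertext_oracle((w0, w1)[b])  # challenge ciphertext
    # Adversary: ask the oracle for w0 and compare with the challenge.
    guess = 0 if ciphertext_oracle(w0) == challenge else 1
    return guess == b


def win_rate(encrypt, trials, seed=0):
    rng = random.Random(seed)
    return sum(play_ci_game(encrypt, rng) for _ in range(trials)) / trials
```

Against `det_encrypt` the comparison succeeds whenever the coin is 0 and fails whenever it is 1, so the adversary is always right. Against `rand_encrypt` the comparison never matches, the guess is effectively constant, and the win rate stays close to one half.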

3.12. Fully TI-Security Model

(i) Let $\mathcal{A}$ be an adversary and $\lambda$ the security parameter.
(ii) Initialization. The challenger $\mathcal{C}$ first runs the Setup algorithm to generate the system parameter. Then, $\mathcal{C}$ runs KeyGen to generate the key pairs of the DS, the DR, and the CS, respectively, and gives the corresponding public values to $\mathcal{A}$.
(iii) Phase 1. $\mathcal{A}$ adaptively queries the ciphertext oracle and the trapdoor oracle and obtains the ciphertexts and trapdoors of the queried keywords.
(iv) Challenge. After Phase 1, $\mathcal{A}$ outputs two challenge keywords $w_0^*$ and $w_1^*$, with the restriction that neither has been queried to the ciphertext oracle or the trapdoor oracle in Phase 1. Then, $\mathcal{C}$ picks a random coin $b \in \{0, 1\}$ and sends the trapdoor of $w_b^*$ to $\mathcal{A}$.
(v) Phase 2. As in Phase 1, $\mathcal{A}$ can continue to query the oracles.
(vi) Guess. Finally, $\mathcal{A}$ returns a bit $b'$ as its guess of $b$.

$\mathcal{A}$ wins the above game if $b' = b$. The advantage of $\mathcal{A}$ in winning this game is defined as $|\Pr[b' = b] - 1/2|$.

Definition 2 (Fully TI-Security). A PAEKS scheme satisfies trapdoor indistinguishability against chosen keyword attacks if, for any PPT adversary $\mathcal{A}$, the advantage of succeeding in the above game is negligible.

3.13. HMR-Security Model

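Let $\mathcal{A}$ be an adversary and $\lambda$ be the security parameter.
(i) Initialization. The challenger $\mathcal{C}$ first runs the Setup algorithm to generate the system parameter. Then, it runs KeyGen to generate the key pairs of the DS, the DR, and the CS, respectively, and gives the corresponding public values to $\mathcal{A}$.
(ii) Phase 1. $\mathcal{A}$ makes polynomially many queries to the cipher-keyword, trapdoor, and test oracles, and receives the ciphertexts and trapdoors of the queried keywords and the test results, as follows:
(a) Ciphertext oracle: for any keyword, $\mathcal{C}$ runs the PAEKS algorithm and sends the resulting ciphertext to $\mathcal{A}$.
(b) Trapdoor oracle: for any keyword, $\mathcal{C}$ runs the Trapdoor algorithm and sends the resulting trapdoor to $\mathcal{A}$.
(c) Test oracle: for any keyword ciphertext and trapdoor, $\mathcal{C}$ runs the Test algorithm and outputs 1 if they match; else, it outputs 0.
(iii) Challenge. After Phase 1, $\mathcal{A}$ outputs two challenge keywords $w_0^*$ and $w_1^*$. Then, $\mathcal{C}$ picks two random coins, one for the challenge ciphertext and one for the challenge trapdoor, and sends the resulting challenge ciphertext and challenge trapdoor to $\mathcal{A}$.
(iv) Phase 2. $\mathcal{A}$ can continue to query the oracles as in Phase 1, but cannot make queries involving the challenge values.
(v) Guess. Finally, $\mathcal{A}$ returns two bits as its guesses of the two coins.

$\mathcal{A}$ wins the above game if both guesses are correct. The advantage of $\mathcal{A}$ in this game measures how far its winning probability exceeds that of random guessing.

Definition 3 (HMR-Security). A PAEKS scheme satisfies the hidden matching relationship security model for ciphertexts and trapdoors (HMR-security) if the advantage of succeeding in the above game is negligible for any polynomial-time adversary $\mathcal{A}$.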

3.14. Design Goals

Our goal is to design an efficient PAEKS scheme that resists both external and internal keyword guessing attacks and prevents the leakage of search and access patterns to Type-II adversaries.

4. Previous PAEKS Schemes

In this section, we first analyze the relationship between keyword ciphertext indistinguishability and the access pattern, and the relationship between trapdoor indistinguishability and the search pattern. Second, we analyze whether previous PAEKS schemes really achieve ciphertext and trapdoor indistinguishability against external adversaries and malicious servers. Finally, we give suggestions and methods for adversaries with different capabilities.

4.1. Why Should We Define the Indistinguishability of Ciphertexts and Trapdoors of Keywords?

(i) TI-Security and Search Pattern. It is well known that disclosure of the search pattern reveals users’ search habits. If a PAEKS scheme does not provide trapdoor indistinguishability, trapdoors for the same keyword are identical. This allows the adversary to classify keywords by their trapdoors, which reveals the user’s search pattern. Combined with some prior knowledge, the adversary can then guess the keywords accurately.
(ii) CI-Security and Access Pattern. Disclosure of the access pattern reveals information about the data files that match the keywords, such as the frequencies of different keywords and documents. If a PAEKS scheme does not provide keyword ciphertext indistinguishability, ciphertexts of the same keyword are identical. The adversary can therefore guess the keywords from the matched file information and prior knowledge.
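The search pattern leakage described in item (i) can be made concrete by looking at what a server observes in a query log. The sketch below rests on assumptions: both trapdoor functions are hypothetical HMAC-based toys rather than the algorithms of any scheme discussed here, and the randomized variant only models what an observer sees, since a server could not test it without the key.

```python
import hashlib
import hmac
import secrets
from collections import Counter


def det_trapdoor(key, keyword):
    """Deterministic trapdoor: repeated searches produce identical values."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()


def rand_trapdoor(key, keyword):
    """Randomized trapdoor: a fresh nonce makes every query look different."""
    nonce = secrets.token_bytes(16)
    mac = hmac.new(key, nonce + keyword.encode(), hashlib.sha256).digest()
    return nonce + mac


def observed_frequencies(trapdoors):
    """Frequency profile an observer can build by grouping equal trapdoors."""
    return sorted(Counter(trapdoors).values(), reverse=True)


key = secrets.token_bytes(32)
queries = ["flu", "flu", "flu", "tax", "loan", "flu"]
det_profile = observed_frequencies(det_trapdoor(key, w) for w in queries)
rand_profile = observed_frequencies(rand_trapdoor(key, w) for w in queries)
```

With deterministic trapdoors the observer recovers the profile `[4, 1, 1]` without knowing any keyword, and matching it against public keyword statistics is the prior-knowledge step mentioned above. With randomized trapdoors every query appears once, so the log alone reveals no repetition.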

Based on the above analysis, we believe that a PAEKS scheme must satisfy at least these two properties to be considered secure if the search and access patterns are to remain uncompromised.

4.2. Security Analysis of Previous PAEKS Schemes

We discuss the close relationship between ciphertext indistinguishability and trapdoor indistinguishability from the following three aspects for different types of adversaries.
(i) CI-Security and Not TI-Security. Suppose the keyword ciphertexts in PAEKS are indistinguishable but the trapdoors are not, as shown in Figure 3(a). Three different ciphertexts of the same keyword are represented by ellipsoids of three different colors, showing that the ciphertexts are indistinguishable, whereas the trapdoor is not. If the adversary obtains the trapdoor, then after running the Test algorithm he learns that the three ciphertexts encrypt the same keyword, so ciphertext indistinguishability loses its effect and the access pattern is compromised. The Test algorithm therefore plays a key role in this process. It is worth noting that this observation also suggests a way to solve the problem of access pattern disclosure.
(ii) Not CI-Security and TI-Security. As shown in Figure 3(b), three boxes of different colors indicate that the three trapdoors for the same keyword are different, so trapdoor indistinguishability holds, whereas ciphertext indistinguishability does not. If the adversary obtains the ciphertext, then after running the Test algorithm he learns that the three trapdoors belong to the same keyword, which leaks the user’s search pattern.
(iii) CI-Security and TI-Security. As shown in Figure 3(c), and by the analysis of Figures 3(a) and 3(b), the adversary can still obtain relevant information. Specifically, for each trapdoor the adversary obtains, the Test algorithm reveals its matching ciphertexts. By comparing the matching results of different trapdoors, the adversary learns which trapdoors correspond to the same keyword. The solution is the same as above.

Based on the above analysis, for Type-I and Type-II adversaries, the disclosure of access and search patterns can be prevented by designating an honest server. For a Type-III adversary, however, the existing schemes [4–7] do not protect the access pattern or the search pattern. In other words, none of them can resist leakage abuse attacks [30]. One possible solution is to distribute the data across several servers so that the statistics are not completely compromised as long as one server is honest. In the following, we address this problem for Type-III adversaries from another angle.

5. Construction

In this section, we first introduce a PAEKS scheme without bilinear pairings. Then, based on this scheme, we pad the keywords so that, from the ciphertext perspective, each padded keyword is used only once, in the manner of a one-time pad. In addition, appropriate processing of the documents containing a keyword makes it possible to protect the access pattern.

5.1. Basic Construction

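Our scheme consists of five PPT algorithms, namely, Setup, KeyGen, PAEKS, Trapdoor, and Test. The formal construction is as follows:
(i) Setup: Take a cyclic group of prime order. Select two hash functions and choose a random generator of the group. Output the public parameters.
(ii) KeyGen: It generates the public/private key pairs of the participants.
(a) The DS chooses a random element as its private key and sets the corresponding public key.
(b) The DR chooses a random element as its private key and sets the corresponding public key.
(c) The CS chooses a random element as its private key and sets the corresponding public key.
(iii) PAEKS: Choose a random value and compute the ciphertext components from the keyword and the keys. The ciphertext consists of these components.
(iv) Trapdoor: Compute the intermediate values for the keyword. Then, choose a random value and compute the trapdoor components. The trapdoor consists of these components.
(v) Test: Compute the test values from the ciphertext and the trapdoor. If the test condition holds, output 1; otherwise, output 0.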

5.2. Derived Construction

In the basic construction, a PAEKS scheme by itself cannot hide the matching relationship between ciphertexts and trapdoors from a Type-III adversary, who can always obtain this relationship through the Test algorithm. If the matching relationship is to be hidden within a plain PAEKS scheme, the solution is to pad the keyword or attach some state information to it, so that each keyword achieves security in the manner of a one-time pad.

Specifically, we must prevent the malicious server from classifying ciphertexts and trapdoors through the Test algorithm, since this classification discloses the access and search patterns. We therefore extend the external form of keywords so that the same keyword has different ciphertext and trapdoor forms, as shown in Figure 3(d). Moreover, to solve the leakage of search and access patterns at its root, we have to operate on the original index matrix. The purpose of this operation is to make every keyword form unique.

How to construct different forms of a keyword can be chosen according to the keyword and document frequency information and the actual situation; here we give an example based on Figure 1. The uniqueness of keywords is achieved as follows:
(i) Maximum Number of Searches I. Before uploading the ciphertexts, the DS specifies the maximum number of times the same keyword can be searched without disclosing the search pattern and broadcasts it on a public channel. If a user searches for a keyword more than this number of times, there is a risk of revealing the search pattern.
(ii) Number of Parts J. To handle the fact that different keywords return different numbers of documents, we suggest dividing the returned documents into parts so that the number of documents returned each time is roughly the same. Here, the DS needs to determine the maximum number of parts and broadcast it to the DR.
(iii) Keyword Form. A keyword is extended with two indices, one indicating the i-th search and one indicating the j-th part of the search results. For the example in Figure 2, fixing concrete values for these parameters yields the keyword extension shown in Table 2.
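The keyword extension can be sketched as follows. This is one possible reading of the three steps above, not the authors' exact procedure: the `||` separator, the function names, the even splitting rule, and the sample values of I and J are all assumptions made for illustration.

```python
def extend_keyword(keyword, max_searches, parts):
    """All extended forms of a keyword: one per (search index, part index)."""
    return [
        f"{keyword}||{i}||{j}"
        for i in range(1, max_searches + 1)
        for j in range(1, parts + 1)
    ]


def split_documents(doc_ids, parts):
    """Split document numbers into `parts` chunks whose sizes differ by at most one."""
    base, extra = divmod(len(doc_ids), parts)
    chunks, start = [], 0
    for j in range(parts):
        size = base + (1 if j < extra else 0)
        chunks.append(doc_ids[start:start + size])
        start += size
    return chunks


def build_extended_index(index, max_searches, parts):
    """Map every extended keyword form to the j-th chunk of matching documents."""
    extended = {}
    for keyword, doc_ids in index.items():
        chunks = split_documents(sorted(doc_ids), parts)
        for i in range(1, max_searches + 1):
            for j, chunk in enumerate(chunks, start=1):
                extended[f"{keyword}||{i}||{j}"] = chunk
    return extended


extended = build_extended_index({"cloud": [1, 2, 3, 4, 5]}, max_searches=2, parts=2)
```

Under this reading, the i-th search for a keyword uses the forms with first index i, so no extended form is ever queried twice until the limit I is exhausted. The index grows by a factor of I, which matches the storage cost noted in the remark that follows.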

Next, we apply the scheme proposed in Section 5.1 to encrypt the padded keywords, which achieves our purpose: resisting keyword guessing by malicious servers and preventing the disclosure of search and access patterns.

Remark. Once the data receiver searches for a keyword more than the predetermined number of times, the search pattern and access pattern may be disclosed. Another drawback is that the storage cost is multiplied. The balance between security and efficiency is left to the user to decide.

6. Security Proof

In this section, we demonstrate that our basic scheme meets the design goals in the sense that it provides soundness and confidentiality. It resists IKGA and reduces the possibility of access pattern and search pattern disclosure to Type-II adversaries.

6.1. Correctness

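Let a ciphertext of one keyword be generated by the sender and a trapdoor of a second keyword be generated by the receiver, following the PAEKS and Trapdoor algorithms of Section 5.1. Substituting these values into the quantities computed by the Test algorithm shows the following.

If the two keywords are equal, then Test outputs 1 with probability 1; otherwise, Test outputs 0 with overwhelming probability.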

6.2. Confidentiality

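The following theorems show that our scheme achieves IND-CKA and IND-KGA security. The IND-CKA game proceeds as follows:
(i) Initialization: The challenger $\mathcal{C}$ runs Setup and KeyGen to generate the public parameters and the public/private key pairs of the DS, the DR, and the CS. Then, $\mathcal{C}$ sends the public values to the adversary $\mathcal{A}$. We assume that the two hash functions are secure and collision-resistant.
(ii) Phase 1: $\mathcal{A}$ sends queries to the ciphertext oracle and the trapdoor oracle, which are simulated as follows:
(a) Ciphertext oracle: choose a random value and compute the ciphertext of the queried keyword as in the PAEKS algorithm of Section 5.1.
(b) Trapdoor oracle: choose a random value and compute the trapdoor of the queried keyword as in the Trapdoor algorithm of Section 5.1.
(iii) Challenge: $\mathcal{A}$ selects two challenge keywords and sends them to $\mathcal{C}$. $\mathcal{C}$ picks a random bit $b$ and encrypts the corresponding challenge keyword as in the PAEKS algorithm, using fresh randomness.
(iv) Phase 2: $\mathcal{A}$ continues to query the oracles as in Phase 1.
(v) Guess: $\mathcal{A}$ outputs a guess $b'$. If $b' = b$, then $\mathcal{A}$ wins the game. The advantage of $\mathcal{A}$ is $|\Pr[b' = b] - 1/2|$.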

Theorem 5. Our PAEKS scheme achieves IND-CKA security if the HDH assumption holds.

Lemma 6. The advantage in the IND-CKA game is negligible for any polynomial-time adversary $\mathcal{A}$.

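Proof. Let $S_i$ ($i = 0, 1, 2$) denote the event that the adversary guesses the challenge keyword correctly in Game $i$. We define the games as follows:
Game 0. This is the original IND-CKA game.

Game 1. This game is the same as Game 0, except that $\mathcal{C}$ chooses a random value instead of computing the corresponding value honestly, and sends the challenge ciphertext formed from this random value. We have $|\Pr[S_1] - \Pr[S_0]| \le \epsilon_1$, where $\epsilon_1$ is negligible if the HDH assumption holds.

Game 2. This game is the same as Game 1, except that $\mathcal{C}$ randomly selects a further component of the challenge ciphertext instead of computing it. Given that both values are random, the challenge ciphertexts in Game 1 and Game 2 have the same distribution from the adversary’s perspective.

We have that $|\Pr[S_2] - \Pr[S_1]|$ is negligible.

In Game 2, $\mathcal{A}$ can only win with probability $1/2$ since the challenge ciphertext is independent of $b$. Thus, the advantage of $\mathcal{A}$ in Game 2 is $|\Pr[S_2] - 1/2| = 0$.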
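The way such a sequence of games yields the final bound can be written out with the triangle inequality. The following is a generic sketch of the standard game-hopping step, where $S_i$ denotes the event that the adversary wins Game $i$ and $\epsilon_1$, $\epsilon_2$ are assumed names for the negligible differences between consecutive games; it does not reproduce the exact quantities of the original proof.

```latex
\begin{align}
\left|\Pr[S_0] - \tfrac{1}{2}\right|
  &\le \left|\Pr[S_0] - \Pr[S_1]\right|
     + \left|\Pr[S_1] - \Pr[S_2]\right|
     + \left|\Pr[S_2] - \tfrac{1}{2}\right| \\
  &\le \epsilon_1 + \epsilon_2 + 0 .
\end{align}
```

The last term vanishes because the challenge in the final game is independent of the hidden bit, so the adversary wins it with probability exactly one half.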

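Finally, combining Game 0, Game 1, and Game 2, the advantage of $\mathcal{A}$ is bounded by a sum of negligible terms. Therefore, the advantage of $\mathcal{A}$ in the IND-CKA game is negligible.

The IND-KGA game proceeds as follows:
(i) Initialization: The challenger $\mathcal{C}$ runs Setup and KeyGen to generate the public parameters and the public/private key pairs of the DS, the DR, and the CS. Then, $\mathcal{C}$ sends the public values to the adversary $\mathcal{A}$. We assume that the two hash functions are secure and collision-resistant.
(ii) Phase 1: $\mathcal{A}$ adaptively issues queries to the ciphertext oracle and the trapdoor oracle, which are simulated as follows:
(a) Ciphertext oracle: choose a random value and compute the ciphertext of the queried keyword as in the PAEKS algorithm of Section 5.1.
(b) Trapdoor oracle: choose a random value and compute the trapdoor of the queried keyword as in the Trapdoor algorithm of Section 5.1.
(iii) Challenge: $\mathcal{A}$ chooses two challenge keywords and sends them to $\mathcal{C}$. $\mathcal{C}$ chooses a random bit $b$ and generates the trapdoor of the corresponding challenge keyword as in the Trapdoor algorithm, using fresh randomness.
(iv) Phase 2: $\mathcal{A}$ continues to query the oracles as in Phase 1.
(v) Guess: $\mathcal{A}$ outputs a guess $b'$. If $b' = b$, then $\mathcal{A}$ wins the game. The advantage of $\mathcal{A}$ is $|\Pr[b' = b] - 1/2|$.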

Theorem 7. Our PAEKS scheme achieves IND-KGA security if the HDH assumption holds.

Lemma 8. The advantage of is negligible for any polynomial-time adversary .

Proof. Suppose the adversary guesses the keyword in the game, and the event that the guess is correct is recorded as (). We define games as follows:
. It is the original IND-KGA game.

. Let be the same game as , except that chooses instead of computing . sends the trapdoor is , where , , ,

We have where is negligible if the HDH assumption holds.

. Let be the same game as , except that randomly chooses instead of , where , ,, . Because and are random, the of and have the same distribution from the adversary's view.

We have We know that can only win with probability in because is independent of . Thus, the advantage is

Finally, according to the , , we have where are negligible. Therefore, the advantage of in winning the IND-KGA game is negligible.

(i): The challenger runs and to generate and , , and . Then, sends to . We assume that the hash functions and are secure and collision-resistant.

(ii): adaptively sends queries to , and , and is simulated as follows:
(a): Choose a random , compute , , where , . The ciphertext is .
(b): Compute , for keyword , . Then, choose a random , compute , , . The trapdoor is .
(c): For any ciphertext and trapdoor of keywords and , runs , and outputs 1 if ; otherwise, it outputs 0.

(iii): selects . Sends them to . selects a random bit , and encrypts . picks a random bit , and generates the trapdoor of the challenge keyword as follows:
(a) Choose a random , compute , , where , . The ciphertext is .
(b) Compute , for keyword , . Then, choose a random , compute , , . The trapdoor is .

Theorem 9. Our PAEKS scheme achieves HMR-security if the DL assumption holds.

Lemma 10. The advantage is negligible for any polynomial-time adversary .

Proof. We define games as follows:
. It is the original HMR game.

Finally, sends and to the adversary.

(iv): continues to be issued by as queries to the Oracles. The only restriction is that neither or to .

(v) Guess. Finally, returns as the guess of and wins the game if and

According to the definition of the HMR game, the advantage of adversary is

According to Theorem 5 and Theorem 7, we know that the distribution of the ciphertext and trapdoor of keywords is the same as that of random tuples from the adversary's view. So, has to run the test algorithm to verify whether the ciphertext matches the trapdoor. Given and , calculating the value of is the DL problem.

Finally, we have where are negligible. Therefore, the advantage of in winning the HMR game is negligible.
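
The IND-CKA and IND-KGA proofs above follow the usual game-hopping pattern, which can be summarized as a sketch. The notation here is an assumption made for illustration and may differ from the paper's: $S_i$ denotes the event that the adversary guesses correctly in the $i$-th game, and $\epsilon_1$, $\epsilon_2$ bound the gaps between consecutive games. In the proofs as described, the first gap is negligible under the HDH assumption, the second is zero because the two games are identically distributed from the adversary's view, and the last game is won with probability exactly $1/2$.

```latex
\begin{align*}
\mathrm{Adv}_{\mathcal{A}}(\lambda)
  &= \left| \Pr[S_0] - \tfrac{1}{2} \right| \\
  &\le \left| \Pr[S_0] - \Pr[S_1] \right|
     + \left| \Pr[S_1] - \Pr[S_2] \right|
     + \left| \Pr[S_2] - \tfrac{1}{2} \right| \\
  &\le \epsilon_1 + \epsilon_2 + 0 .
\end{align*}
```

The middle line is just the triangle inequality, so the overall advantage is negligible whenever each individual gap is.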

6.3. Security Analysis

(i) Resisting IKGA. In PAEKS, only the DS is able to generate a legal ciphertext. Even if the adversary executes the algorithm, it cannot generate . Similarly, the adversary cannot generate the trapdoor of a keyword. The algorithm cannot provide any information to the adversary. Thus, our scheme can resist IKGA.

(ii) Access pattern and search pattern. For adversary Type-I, the ciphertexts and trapdoors of keywords are indistinguishable according to Theorem 5 and Theorem 7. Therefore, the access pattern and search pattern are not compromised. However, adversary Type-III can run the test algorithm, which reveals some of the user's search information. The adversary can then carry out leakage-abuse attacks by combining this with prior knowledge. In addition, we cannot hide the matching relationship between ciphertexts and trapdoors from this type of powerful adversary.
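
The intuition behind item (i) can be illustrated with a plain Diffie-Hellman exchange. The following is a minimal sketch, not the scheme's actual algorithms: the modulus `P`, the base `G`, and the function names are assumptions chosen for illustration, and the group is far too small for real use. It only shows that the sender and receiver derive the same secret from their own private key and the other party's public key, a value that a party holding neither private key cannot compute directly.

```python
import secrets

# Toy parameters (assumptions for illustration only, not from the paper).
P = 2**127 - 1  # a Mersenne prime used as the modulus
G = 3           # hypothetical base element


def keygen():
    """Return a (private, public) key pair in the toy group."""
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)


def shared_secret(own_sk, peer_pk):
    """Diffie-Hellman shared value: peer_pk raised to own_sk, modulo P."""
    return pow(peer_pk, own_sk, P)


sk_sender, pk_sender = keygen()
sk_receiver, pk_receiver = keygen()

# Both honest parties arrive at the same value G^(sk_sender * sk_receiver).
k_sender = shared_secret(sk_sender, pk_receiver)
k_receiver = shared_secret(sk_receiver, pk_sender)
```

A server or outsider sees only `pk_sender` and `pk_receiver`, so forging a value bound to this shared secret would require solving a Diffie-Hellman-type problem in the group.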

7. Comparison and Analysis

We first compare the security of our scheme with that of the authentication-based searchable encryption schemes HL17 [4], QC+20 [5], QC+21 [7], and PL21 [6]. Then, we count the operations required by each scheme and conduct an empirical performance evaluation using the Relic and GMP libraries.

7.1. Comparative Analysis of Security

Table 3 presents the comparison results among the PAEKS schemes. Again, we emphasize that since the server can perform test operations, the indistinguishability of ciphertexts and of trapdoors for the same keyword has no practical significance unless the keywords are padded as in the extended version of the scheme. Therefore, we only compare indistinguishability against external adversaries or curious servers before a retrieval token is given. We adopt the security description of [7], and the table compares security only against adversary Type-II. Qin et al. [7] introduced the definitions of fully CI-security and fully TI-security. mDLIN, DBDH, and BDHI stand for the modified Decision Linear, Decisional Bilinear Diffie-Hellman, and Bilinear Diffie-Hellman Inversion assumptions, respectively. BDH and CODH stand for Bilinear Diffie-Hellman and Computational Oracle Diffie-Hellman. HL17 [4], QC+20 [5], QC+21 [7], and PL21 [6] share a common feature: they use the DR's public key in ciphertext generation and the DR's private key in trapdoor generation, which naturally resists IKGA. Table 3 shows that only our scheme achieves both fully CI-security and fully TI-security. Meanwhile, it protects the user's search pattern.

7.2. Time Complexity

Table 4 shows the number of operations of each algorithm. is a symbol for exponentiation in group . is a symbol for the pairing operation. is a symbol for the hash-to-point operation, which maps any string to . As Table 4 shows, the other schemes employ bilinear pairing operations, which greatly reduces their efficiency. The computational costs of and are similar across the schemes because they use the same algorithms. Therefore, we consider only the , , and algorithms. Qin et al.'s schemes [5, 7] need to compute 3 exponentiation operations, one hash-to-point operation, and one pairing operation in the ciphertext phase, and 2 exponentiation operations and one hash-to-point operation in the trapdoor phase. The schemes of Huang et al. [4] and Pan et al. [6] require the same number of operations in the ciphertext phase and the trapdoor phase. Notice that these schemes all use bilinear pairing operations, which leads to a large computational overhead. Our scheme is the fastest in the ciphertext, trapdoor, and test phases because it uses only exponentiation operations, which are computationally less expensive than pairings. Our scheme is therefore more suitable for lightweight devices.
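
Operation counts of this kind can be turned into a rough cost estimate by weighting each operation type. The sketch below is illustrative only: the counts for Qin et al.'s schemes are the ones stated above, but the unit costs are arbitrary placeholders (not measurements from the paper or from any library), and the function and variable names are hypothetical.

```python
def estimated_cost(op_counts, unit_costs):
    """Weighted sum of operation counts, in the same unit as unit_costs."""
    return sum(count * unit_costs[op] for op, count in op_counts.items())


# Operation counts for Qin et al.'s schemes [5, 7], as stated in the text.
QIN_CIPHERTEXT = {"exp": 3, "hash_to_point": 1, "pairing": 1}
QIN_TRAPDOOR = {"exp": 2, "hash_to_point": 1}

# Placeholder relative costs (assumption): they only encode that a pairing
# is much more expensive than an exponentiation.
PLACEHOLDER_UNIT_COSTS = {"exp": 1.0, "hash_to_point": 2.0, "pairing": 10.0}

qin_ciphertext_cost = estimated_cost(QIN_CIPHERTEXT, PLACEHOLDER_UNIT_COSTS)
qin_trapdoor_cost = estimated_cost(QIN_TRAPDOOR, PLACEHOLDER_UNIT_COSTS)
```

With real per-operation timings measured on the target device substituted for the placeholders, the same function gives a first-order prediction to compare against the measured results.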

7.3. Evaluation

We evaluate the efficiency of the various schemes (HL17, QC+20, QC+21, and PL21) using the Relic and GMP libraries. The experimental platform is Ubuntu 18.04.5 LTS with an Intel(R) Xeon(R) CPU E5-2620 [email protected] and 16.00 GB of RAM. The pseudorandom permutation is computed using the AES algorithm (CBC mode, 128-bit key). The hash functions are computed using the SHA-256 algorithm. We choose the real Enron Email Dataset (Version 20150307, about 423 MB), which contains data from about 150 users [31], to demonstrate the practical performance of our proposed scheme. We select about 2000 keywords that are at least 5 characters long and occur more than 20 times in total.
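
The keyword selection rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' preprocessing code: the tokenization (lowercased alphabetic runs), the ordering by frequency, and the function name `select_keywords` are all hypothetical choices; only the thresholds (length at least 5, more than 20 occurrences, about 2000 keywords) come from the text.

```python
import re
from collections import Counter


def select_keywords(documents, min_length=5, more_than=20, limit=2000):
    """Pick up to `limit` words with length >= min_length and count > more_than."""
    counts = Counter()
    for text in documents:
        # Assumed tokenization: lowercase runs of ASCII letters.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    eligible = [
        (word, count)
        for word, count in counts.items()
        if len(word) >= min_length and count > more_than
    ]
    # Assumed ordering: most frequent first, ties broken alphabetically.
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in eligible[:limit]]


# Tiny synthetic corpus standing in for the email bodies.
sample_docs = ["meeting schedule"] * 21 + ["energy"] * 20 + ["call"] * 30
sample_keywords = select_keywords(sample_docs)
```

In the synthetic corpus, "energy" is excluded because it occurs exactly 20 times, and "call" is excluded because it has fewer than 5 characters.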

We compare the proposed scheme with the schemes in HL17, QC+20, QC+21, and PL21 in terms of , , and Test. The keywords are chosen randomly for this experiment. As shown in Figures 4 to 6, our scheme has the lowest computation cost among the five schemes for generating ciphertexts, generating trapdoors, and testing. A key reason for this efficiency is that our scheme requires no bilinear pairing operations, which saves a large amount of computing overhead. In the ciphertext generation algorithm, the computational overhead of Pan and Li's scheme [6] is about 274 times that of ours. In the trapdoor generation algorithm, the computational overhead of Qin et al.'s scheme [5] is about 158 times that of ours. In the test algorithm, the computational overhead of Qin et al.'s scheme [7] is estimated to be about 60 times that of ours.

To evaluate the efficiency of the PAEKS algorithm, we give the time consumption of each algorithm for 2000 keywords, as shown in Figure 7. Communication and network effects are excluded from the experimental results. In our scheme, generating the ciphertexts, generating the trapdoors, and running the test for 2000 keywords take 315.004 ms, 452.788 ms, and 204.079 ms, respectively. The average time cost of generating one ciphertext is therefore 0.1575 ms, and that of generating one trapdoor is 0.2264 ms. The search algorithm is efficient, with an average time cost of about 0.102 ms per match. Therefore, our scheme is well suited to lightweight or computationally constrained devices.
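
The per-keyword averages follow directly from the reported totals. The short sketch below reproduces that arithmetic; the variable names are illustrative, and the totals and the keyword count are the figures stated above.

```python
# Total running times reported for 2000 keywords, in milliseconds.
totals_ms = {"ciphertext": 315.004, "trapdoor": 452.788, "test": 204.079}
NUM_KEYWORDS = 2000

# Average cost of one operation of each kind, in milliseconds.
per_keyword_ms = {
    name: total / NUM_KEYWORDS for name, total in totals_ms.items()
}

for name, value in per_keyword_ms.items():
    print(f"{name}: {value:.4f} ms per keyword")
```

The three averages are about 0.1575 ms, 0.2264 ms, and 0.1020 ms, which agree with the per-operation costs quoted in the text.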

8. Conclusion

In this paper, we defined different types of adversary capabilities and first analyzed the relationship between ciphertext indistinguishability, trapdoor indistinguishability, the access pattern, and the search pattern. Then, we discussed the security of existing PAEKS schemes. Based on [7], we added the full TI-security model so that the scheme can resist leakage of the search pattern. For adversary Type-II, we designed a PAEKS scheme without bilinear pairings, which greatly improves efficiency compared with previous schemes. Because the scheme does not use bilinear pairings at all, it is more suitable for edge servers and clients with limited computing power. In addition, we put forward a solution for protecting the access pattern against adversary Type-III. Unfortunately, its memory expansion is too large. Future work could construct a PAEKS scheme that protects both the search pattern and the access pattern with less memory overhead.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The work was supported by the Shandong Provincial Key Research and Development Program (No. 2020CXGC010107, 2021CXGC010107), the Special Project on Science and Technology Program of Hubei Province (No. 2020AEA013), the Natural Science Foundation of Hubei Province (No. 2020CFA052), and the Wuhan Municipal Science and Technology Project (No. 2020010601012187).