Abstract

Cloud computing can provide users with sufficient computing resources, storage, and bandwidth to meet their needs. However, data security and privacy protection are among the new threats that users face. Searchable encryption combines search technology with encryption technology: the user’s data is uploaded to the cloud server after special encryption and can still be retrieved by keyword. Compared with symmetric searchable encryption (SSE), public key encryption with keyword search (PEKS) greatly simplifies key management. However, most existing public key authenticated encryption with keyword search (PAEKS) schemes are based on bilinear pairings, which makes them computationally expensive. In addition, complex retrieval requirements and the integrity of the results have not been considered. To address these problems, we propose a blockchain-based PAEKS scheme supporting multi-keyword queries and integrity verification. We also provide security proofs for the PAEKS scheme under the decisional oracle Diffie-Hellman (DODH) assumption. The scheme requires less storage and computational power than other schemes of the same kind.

1. Introduction

With the rapid development of the mobile Internet, cloud computing is gradually changing people’s lifestyles. A service provider aggregates a large amount of computing and storage resources in the cloud and provides on-demand services to users with limited resources, facilitating a variety of diverse and personalized applications. Because of its high efficiency and low cost, cloud computing has attracted a great deal of attention from academia and industry over the past decade. However, third-party service providers are not completely trustworthy, and storing data in plaintext poses a serious threat to data privacy.

Data must therefore be encrypted before it is stored on an untrusted server. Encryption ensures that nobody can access the information without the key. However, the data owner also loses the ability to search the data. A conventional approach is to download all the files, decrypt them locally, and then search them, which is not practical for users. Another approach is to delegate retrieval to the server, which finds the file and returns it to the user. However, it is a challenge for the server to find what it needs in the encrypted files: to query an encrypted file, the server would need the user’s key to decrypt it, which renders the encryption ineffective. The goal is therefore to give the server as much search functionality as possible without letting it decrypt the data. To this end, researchers proposed a technique called searchable encryption (SE), which supports both private encryption and search.

There are two types of searchable encryption: symmetric searchable encryption (SSE) and public key encryption with keyword search (PEKS). Song et al. [1] introduced the first practical scheme for searchable encryption, allowing text search without compromising confidentiality. In 2004, Boneh et al. [2] published the first study of the search problem for data encrypted under a public key system, which serves as the prototype of all PEKS schemes. It mainly addressed how to find the mail required by users in a massive mail system. It is therefore worth describing the PEKS model clearly. As depicted in Figure 1, a PEKS scheme works as follows:(i)The user Alice encrypts the file with a symmetric key and encrypts the keyword with Bob’s public key. She then sends the encrypted file and the keyword ciphertext to the mail server.(ii)The user Bob generates a search trapdoor using his secret key and sends it to the mail server as a query.(iii)By matching the trapdoor against the keyword ciphertexts, the mail server delivers the search results to Bob.

SSE is unsuitable, however, when a secret key cannot be shared securely, as in e-mail. PEKS effectively solves the key transfer and key management problems of symmetric searchable encryption, and researchers have strived to design efficient and effective mechanisms for searching encrypted data. However, PEKS has two inherent shortcomings.(i)An ever-growing dataset, combined with the many public key operations required, limits the usefulness of PEKS.(ii)Keyword guessing attacks (KGA) and inside keyword guessing attacks (IKGA) become possible when the keyword space is smaller than the key space.

To resist keyword guessing attacks (KGA) from outside, Rhee et al. [3] constructed a public key encryption scheme with keyword search for a designated searcher. By introducing random numbers in the trapdoor, the test algorithm can be run only on a specified server, which prevents KGA by external adversaries. Researchers then considered the case where the adversary is a malicious server. The resulting methods fall roughly into two categories. One solution is to expand the range of keywords, which makes keyword guessing more difficult [4]. The other is to limit the adversary’s ability to create ciphertexts and trapdoors, so that keyword guesses cannot be tested in large numbers, as in the scheme of Huang and Li [5]. These methods resist IKGA, but they ignore the fact that trapdoors are generated deterministically, which can give away the user’s search habits. Therefore, the first challenge is to design a searchable encryption scheme that resists IKGA and protects the user’s search and access patterns.

As far as practicality and efficiency are concerned, Xu et al. [6] presented a lightweight PEKS scheme whose search performance was quite similar to that of some practical SSE schemes. Similarly, the schemes in [7–9] designed the storage structure of the ciphertext to improve retrieval efficiency. These approaches were effective in terms of efficiency. However, we take another perspective: reducing the cost of public key operations, thus improving retrieval efficiency and enhancing utility. In addition, current PAEKS schemes do not take multi-keyword search into account, which can affect the user’s experience. The second challenge is therefore to design an effective searchable encryption scheme that supports search by multiple keywords.

In practice, third-party servers benefit from providing retrieval services, but there is a risk that ciphertext data may be tampered with by malicious parties or that the server may fail. As a consequence, the integrity of search results is also a concern that cannot be overlooked. Azraoui et al. [10] combined polynomial-based accumulators and Merkle trees to implement conjunctive keyword verification. Wan and Deng [11] verified the results of multi-keyword searches using a homomorphic MAC. Search results can also be verified in a flexible and feasible manner by using a blockchain, whose non-repudiation property supports the reliability and fairness of the process [12–15]. Therefore, designing a PAEKS scheme that supports integrity verification of the retrieved results is the third challenge.

1.1. Motivation and Contributions

We analyzed the problems in existing schemes to establish the basis for achieving the above goals. First, the cloud server can perform unlimited trapdoor testing and ciphertext matching. The server is too powerful for users to monitor, which can reveal the user’s access patterns. Therefore, we prefer that the recipient sends the file identifier to the server itself. In addition, trapdoors for the same keyword are identical, which may reveal the user’s search pattern. We introduce random numbers in trapdoor generation to blind trapdoors, and we can also use multiple keywords to reduce the risk of leakage. Second, most existing PAEKS algorithms are based on bilinear maps. The high overhead of ciphertext generation and trapdoor computation can affect the experience of computationally constrained users. At the same time, the cloud server has to perform a matching test on each ciphertext one by one in the trapdoor retrieval phase, which places a large computational burden on the cloud server and lowers the overall utility of the solution. Besides, most existing PAEKS schemes either do not consider multi-keyword search or can be extended to multiple keywords only at high cost. Therefore, we use a computationally inexpensive public key cryptographic primitive with support for multi-keyword search. Finally, the integrity of the search results is not considered, so new tools are needed to solve this problem efficiently. Following the analysis above, we constructed a new PAEKS scheme based on our design goals. The contributions of this research are as follows:(i)First, we design a blockchain-based public key authenticated encryption scheme. The data sender and data receiver can naturally share a common initial key based on Diffie-Hellman key agreement, which resists IKGA. The search and access patterns are not compromised.(ii)Second, our scheme greatly reduces the computational cost and improves practicability. Specifically, it reduces the computational cost of ciphertexts and trapdoors by avoiding bilinear pairings. In the testing phase, we use the inner product of two vectors to determine set inclusion.(iii)Finally, our scheme supports integrity verification of the results. It ensures the reliability of search results by using the tamper-proof characteristics of the blockchain. We demonstrate that the scheme is adaptively secure against chosen keyword attacks (CKA), and we perform a series of experiments to evaluate its performance.

1.2. Organization

We review related work in Section 2. Preliminaries, including cryptographic notations and the system model of our proposed system, are introduced in Section 3. We present the basic BB-PAEKS scheme in Section 4 and give its security proof in Section 5. We compare its time consumption with that of other schemes in Section 6. Finally, we conclude this paper in Section 7.

2. Related Work

In the cloud environment, searchable encryption provides a robust solution for retrieving ciphertexts under privacy protection. The cloud server can search among the ciphertext data uploaded by users and return the ciphertext data matching the search criteria even though the plaintext information is unavailable to it. Boneh et al. [2] introduced public key encryption with keyword search, which is of great significance, although the performance of SSE is generally better than that of PEKS. Baek et al. [16] pointed out that the scheme of [2] relied on a secure channel between the communicating parties. The cost of building a secure channel made it unsuitable for certain applications, while an adversary could intercept a trapdoor transmitted over a non-secure channel, creating security problems. To avoid these drawbacks, [16] used the server’s public key to encrypt the trapdoor and proposed an improved public key encryption scheme with keyword search that secures the trapdoor during transmission. Only one server can perform the search, and this was the first searchable public key encryption with designated tester (dPEKS), which improved the security of PEKS.

Byun et al. [17] observed that the keyword space was much smaller than the ciphertext space, so keywords with low entropy could be brute-forced more easily. Based on this observation, [17] pointed out that [2] could not resist (outside) keyword guessing attacks (KGA) and provided specific methods for obtaining keyword information from any captured query message. Yau et al. [18] built on the attack in [17] and presented an attack against the scheme in [16]. These analyses showed that simply combining keywords and secret keys is not enough to generate a trapdoor that resists keyword guessing attacks: any inside or outside attacker can associate the combination with the public key through pairing, which leads to an offline keyword guessing attack.

To resist KGA, Rhee et al. [3] constructed a designated-searcher public key encryption scheme with keyword search, which prevented external adversaries from running the test algorithm. Hu and Liu [19] pointed out that [3] could not resist the outside keyword guessing attack of the server, and described the specific attack method and two improvements. They preserved the property that only the designated server can carry out the keyword search and extended the dPEKS scheme to a bidirectional searchable proxy re-encryption scheme with a designated tester (Re-dPEKS).

To achieve security against inside keyword guessing attacks (IKGA), Huang and Li [5] introduced the concept of public key authenticated encryption with keyword search (PAEKS). In PAEKS, the data sender not only encrypts the keyword but also authenticates it, so that the verifier is convinced that the encrypted keyword can only have been generated by the sender. Qin et al. [20, 21] concluded that the security model of [5] was not adequate to capture a realistic threat, namely outside chosen multi-ciphertext attacks, and provided a new PAEKS model that captures both (outside) chosen multi-ciphertext attacks and (inside) keyword guessing attacks. However, the trapdoor is fixed, which reveals the user’s retrieval patterns. On the basis of the above research, we considered several issues.

Additionally, some researchers have suggested encryption schemes that support complex retrieval such as conjunctive keyword search, subset search, range query, and semantic keyword search [22–25], which improve retrieval accuracy and provide more complex retrieval expressions for the user. Abdelraheem et al. [26] presented a query evaluation scheme that combined SSE with bitmap indexes. However, it required two rounds of interaction with the cloud server. Katz et al. [27] constructed a predicate-based scheme for evaluating inner products, which supports disjunctions, polynomials, thresholds, and more. Query efficiency can be increased by reducing the number of bilinear pairings in the search process [28–30]. Zhang et al. [31] proposed a public key encryption scheme based on a tree-based index structure. However, it assumed by default that the data sender and the data receiver share a keyword arrangement table; otherwise, the probability of successful retrieval was very low.

Regarding integrity verification, Cheng et al. [32] proposed a symmetric searchable encryption scheme with verifiable integrity, which applies indistinguishability obfuscation techniques to counter server attacks. It ensured that a malicious server cannot tamper with the search results arbitrarily, but indistinguishability obfuscation is too idealized, so the scheme is difficult to implement in practical applications. Wang and Fan [33] proposed a lightweight symmetric searchable encryption scheme that supports integrity checking of search results and dynamic updating of the ciphertexts corresponding to search keywords. They mainly used a tree structure to improve update efficiency, but the scheme only supports uploading and retrieval of ciphertexts by a single user.

3. Preliminaries

As part of this section, we describe notations, cryptography materials, blockchains, the Bloom Filter, and system goals.

3.1. Notations

There is a description of the notations in Table 1.

3.2. Blockchain

Decentralization, public verification, transparency, open audit, and antitampering are characteristics of blockchain, a technology that is gaining attention from academia and industry alike. All nodes have access to the data stored on the blockchain since blockchain runs without a central server. Each node participates and generates calculations, which are stored on the blockchain. Blockchains maintain the same data between all nodes thanks to the consensus mechanism, meaning that no single node can change the recorded data. Given these characteristics, the blockchain can act as a trusted third party for fair verification.

3.3. Bloom Filter

The Bloom filter (BF) is a probabilistic data structure. It is very useful for performing set membership tests quickly and in a space-saving manner, but it has the drawback that false positives can occur [34–36]. In many applications, false positives are rare enough that the benefits outweigh this disadvantage. In particular, a BF consists of three polynomial-time algorithms for generation, update, and checking:(i)Generation: It takes two integers $m, k$ as inputs and selects a collection of keyed hash functions $H = \{h_1, \ldots, h_k\}$, where each $h_j$ maps its input to the set $[m] = \{1, \ldots, m\}$. Finally, it outputs $H$ and an initial $m$-bit array $B$ with each bit $B[i]$ for $i \in [m]$ set to 0.(ii)Update: It takes $H$, $B$, and an element $x$, updates the current array by setting $B[h_j(x)] = 1$ for all $j \in [k]$, and finally outputs the updated $B$.(iii)Check: It takes $H$, $B$, and an element $x$, and checks whether $B[h_j(x)] = 1$ for all $j \in [k]$. If true, it outputs 1; otherwise it returns 0.
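The three Bloom filter algorithms can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name `KeyedBloomFilter`, the use of HMAC-SHA256 as the keyed hash family, and the demo key are all assumptions made for the example.

```python
import hashlib
import hmac


class KeyedBloomFilter:
    """Bloom filter whose k hash functions are modelled with HMAC-SHA256 (an assumption)."""

    def __init__(self, m: int, k: int, key: bytes):
        self.m = m            # number of bits in the array
        self.k = k            # number of hash functions
        self.key = key        # secret key shared by the hash functions
        self.bits = [0] * m   # generation: every bit starts at 0

    def _positions(self, element: str):
        # The j-th hash function is modelled as HMAC(key, j || element) mod m.
        for j in range(self.k):
            msg = j.to_bytes(4, "big") + element.encode("utf-8")
            digest = hmac.new(self.key, msg, hashlib.sha256).digest()
            yield int.from_bytes(digest, "big") % self.m

    def update(self, element: str) -> None:
        # Update: set the bit selected by every hash function.
        for pos in self._positions(element):
            self.bits[pos] = 1

    def check(self, element: str) -> int:
        # Check: 1 only if every selected bit is set (false positives are possible).
        return int(all(self.bits[pos] == 1 for pos in self._positions(element)))


bf = KeyedBloomFilter(m=20000, k=5, key=b"demo-key")
bf.update("cloud")
bf.update("blockchain")
print(bf.check("cloud"))  # prints 1: inserted elements are always reported as present
```

An element that was never inserted usually yields 0, but may yield 1 with small probability; that is the false-positive behaviour described above.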

3.4. Set Containment Search and Vector Scalar Product

One solution for performing set containment search on outsourced data is to transform the set records into binary vectors with fixed dimensions. The set containment problem can then be transformed into a vector search problem based on the scalar product [37].
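As a small illustration of this transformation, the sketch below encodes sets as 0/1 vectors over a fixed universe and tests containment with a scalar product: a query set is contained in a record set exactly when their scalar product equals the number of ones in the query vector. The helper names and the sample universe are hypothetical.

```python
def to_binary_vector(items, universe):
    """Encode a set as a 0/1 vector indexed by a fixed, ordered universe."""
    return [1 if u in items else 0 for u in universe]


def contains(record_vec, query_vec):
    """Query set is a subset of the record set iff <record, query> equals the query's weight."""
    dot = sum(r * q for r, q in zip(record_vec, query_vec))
    return dot == sum(query_vec)


universe = ["cloud", "privacy", "search", "storage"]
record = to_binary_vector({"cloud", "privacy", "search"}, universe)
q1 = to_binary_vector({"cloud", "search"}, universe)    # contained in the record
q2 = to_binary_vector({"cloud", "storage"}, universe)   # "storage" is missing

print(contains(record, q1))  # True
print(contains(record, q2))  # False
```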

3.5. Pattern Leakage

Let be a q-query set whose element is a pair , where denotes the timestamp of a query, and denotes a keyword. The leakage can be represented as follows [38]:(i)Access Pattern. An access pattern describes whether a document contains a query keyword. For each query keyword , an access pattern is defined as , where represents the document’s number.(ii)Search Pattern. It shows the query patterns for every keyword . The search pattern for each keyword is .
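To make the two kinds of leakage concrete, the following sketch computes them on a toy plaintext view. It assumes the common reading in which the access pattern of a keyword is the set of matching document numbers and the search pattern records at which timestamps the same keyword was queried; the data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical plaintext view, used only to show what each pattern exposes.
index = {"cloud": {1, 3}, "privacy": {2}, "search": {1, 2, 3}}
queries = [(1, "cloud"), (2, "search"), (3, "cloud")]  # (timestamp, keyword) pairs


def access_pattern(keyword):
    """Document numbers that contain the queried keyword."""
    return index.get(keyword, set())


def search_pattern(query_log):
    """For each keyword, the timestamps at which it was queried."""
    pattern = defaultdict(list)
    for t, w in query_log:
        pattern[w].append(t)
    return dict(pattern)


print(access_pattern("cloud"))   # {1, 3}
print(search_pattern(queries))   # {'cloud': [1, 3], 'search': [2]}
```

If trapdoors are deterministic, a server can recover the search pattern simply by comparing trapdoors, which is why randomized trapdoors matter.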

3.6. Inverted Index

An inverted index is a database index that stores a mapping from content (such as words or numbers) to its locations in a table, a file, or a group of files.(i)Create Inverted Index. First, all the raw data are numbered to form a list of documents. Then the document data is processed to extract a large number of keywords, which are indexed by entry. For each entry, the numbers of the documents containing it are kept. The result, which can also be referred to as an index matrix, is depicted in Figure 2.(ii)Search Process. The user enters any keyword, which is matched against the inverted index list. By looking up these terms, the numbers of all documents containing them can be found. The documents are then found in the document list based on these numbers.
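The two steps above can be sketched on plaintext data as follows. The sample documents and the whitespace tokenizer are assumptions for the example; the scheme itself applies the same structure to encrypted keywords.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each keyword to the set of document numbers that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, documents, keyword):
    """Look up the keyword, then fetch the documents by their numbers."""
    ids = sorted(index.get(keyword.lower(), set()))
    return [(i, documents[i]) for i in ids]


documents = {
    1: "cloud storage privacy",
    2: "searchable encryption",
    3: "cloud encryption",
}
index = build_inverted_index(documents)
print(search(index, documents, "encryption"))
# [(2, 'searchable encryption'), (3, 'cloud encryption')]
```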

3.7. The Computational Diffie-Hellman Assumption (CDH)

Let $G$ be a cyclic group of order $q$ with generator $g$. Given two elements of $G$, $g^a$ and $g^b$, the CDH problem is to compute $g^{ab}$. The CDH assumption states that this problem is intractable in the underlying group $G$.
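The value that CDH asks an outsider to compute is exactly the one that the two key holders obtain for free from their own exponents. The sketch below shows this with toy parameters: the modulus `2**127 - 1` (a Mersenne prime) and base `3` are illustrative assumptions, the base is not checked to be a generator, and a real deployment would use a standardized prime-order group or an elliptic curve.

```python
import secrets

# Toy parameters for illustration only (not a vetted cryptographic group).
p = 2**127 - 1
g = 3

a = secrets.randbelow(p - 2) + 1   # sender's secret exponent
b = secrets.randbelow(p - 2) + 1   # receiver's secret exponent
A = pow(g, a, p)                   # public value g^a
B = pow(g, b, p)                   # public value g^b

shared_sender = pow(B, a, p)       # (g^b)^a, computed with the sender's secret
shared_receiver = pow(A, b, p)     # (g^a)^b, computed with the receiver's secret

print(shared_sender == shared_receiver)  # True: both equal g^(ab) mod p
```

An eavesdropper sees only `A` and `B`; recovering the shared value from them is the CDH problem.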

3.8. The Decisional Oracle Diffie-Hellman (DODH)

The is a group. The order is . The is a random generator of . The is hash function. The DODH assumption means that the advantage of is negligible i where for any PPT adversary .

3.9. The Definition of BB-PAEKS

A BB-PAEKS scheme consists of eight PPT algorithms, namely Setup, KeyGen, PAEKS, Trapdoor, Search, Verify, Dec, and Update.

The formal constructions are as follows:(i)Setup: This algorithm generates the global parameters. Given a security parameter, it outputs a global public parameter.(ii)KeyGen: It is responsible for generating the public and private keys of the participants.(a)Sender key generation: It produces a public/private key pair for the sender.(b)Receiver key generation: It produces a public/private key pair for the receiver.(iii)PAEKS: The algorithm is run by a data sender. It returns the ciphertext of the corresponding keyword.(iv)Trapdoor: The algorithm is run by a data receiver. It returns the trapdoor of the corresponding keyword.(v)Search: The algorithm is run by the SP.(vi)Verify: The algorithm is run by the blockchain.(vii)Dec: The algorithm is run by the DR.(viii)Update: The algorithm is run by the DR.

3.9.1. Correctness

For a PAEKS scheme, Given and , it can be converted into two vectors . We formulate consistency as below:

If , then ; else , then .

3.10. The Security Models of BB-PAEKS

For , the semantic security model incorporates both the indistinguishability of cipher-keywords and the indistinguishability of trapdoors. Our ciphertext indistinguishable security (CI-security) and Trapdoor indistinguishable security (TI-security) definition is identical to that of [5, 20, 21] in settings.

3.10.1. CI-Security Model

Let $\mathcal{A}$ be an adversary and $\lambda$ the security parameter.(i)Initialization. A challenger generates the system parameter by running the Setup algorithm. Then, it runs KeyGen to generate the public/private key pairs of the target DS and DR, respectively. The challenger provides the system parameter and the two public keys to $\mathcal{A}$.(ii)Phase 1. The adversary adaptively issues polynomially many queries to the ciphertext oracle and the trapdoor oracle, and receives the ciphertexts and trapdoors for the queried keywords.(iii)Challenge. After Phase 1, $\mathcal{A}$ outputs two challenge keywords $w_0$ and $w_1$. The challenger flips a coin $b \in \{0, 1\}$ and sends the cipher-keyword of $w_b$, generated by PAEKS, to $\mathcal{A}$.(iv)Phase 2. The adversary may continue to query the oracles as in Phase 1, but may not query the ciphertext oracle or the trapdoor oracle with $w_0$ and $w_1$.(v)Guess. Finally, $\mathcal{A}$ returns a bit $b'$ as the guess of $b$ and wins the game if $b' = b$.

The adversary wins the above game if its guess satisfies $b' = b$. The advantage of $\mathcal{A}$ in winning this game is defined as $\left| \Pr[b' = b] - \frac{1}{2} \right|$.

Definition 1. (CI-security) A scheme satisfies cipher-keyword indistinguishability against chosen keyword attacks if, for any PPT adversary, the advantage of succeeding in the above game is negligible.

3.10.2. TI-Security Model

Let $\mathcal{A}$ be an adversary and $\lambda$ the security parameter.(i)Initialization. The challenger generates the system parameter by running the Setup algorithm. Then, it runs KeyGen to generate the public/private key pairs of the target DS and DR, respectively. The challenger provides the system parameter and the two public keys to $\mathcal{A}$.(ii)Phase 1. The adversary adaptively issues polynomially many queries to the cipher-keyword oracle and the trapdoor oracle, and obtains the ciphertexts and trapdoors for the queried keywords.(iii)Challenge. After the first phase, $\mathcal{A}$ outputs two challenge keywords $w_0$ and $w_1$, subject to the constraint that neither was queried to the cipher-keyword oracle or the trapdoor oracle in Phase 1. The challenger then flips a coin $b \in \{0, 1\}$ and sends the trapdoor of $w_b$, generated by Trapdoor, to $\mathcal{A}$.(iv)Phase 2. As in Phase 1, the adversary can continue to query the oracles, but cannot query the cipher-keyword oracle or the trapdoor oracle with $w_0$ and $w_1$.(v)Guess. Finally, $\mathcal{A}$ returns a bit $b'$ as the guess of $b$ and wins the game if $b' = b$.

The adversary wins the above game if its guess satisfies $b' = b$. The advantage of $\mathcal{A}$ in winning this game is defined as $\left| \Pr[b' = b] - \frac{1}{2} \right|$.

Definition 2. (TI-security) A scheme satisfies trapdoor indistinguishability against chosen keyword attacks if, for any PPT adversary, the advantage of succeeding in the above game is negligible.

3.11. System Model

The system includes the following parties: Storage Provider (SP), Data Sender (DS), Data Receiver (DR) and Blockchain as depicted in Figure 3. The characteristic and function of each party are depicted as follows.(i)Storage Provider (SP): Service providers have strong computing and storage capabilities. The SP stores encrypted data on behalf of the data sender, and SP retrieves the corresponding ciphertext when DR submits a retrieval request. The final step involves returning the retrieval result to the DR and blockchain.(ii)Data Sender (DS): The DS collects documents with identifiers . Each document contains a set of keywords , which is a subset of the collection of all keywords , i.e., . As a measure of improving query efficiency, DS creates an inverted index database DB for his keywords. For different keywords , there is a set of document identifiers, denoted by . After that, DS encrypts all files, keyword and . The DS outsources the keyword ciphertext and document ciphertext to the SP. At the same time, the DS performs a hash operation on the encrypted file identifier set and sends it to the blockchain.(iii)Data Receiver (DR): The DR generates search token according to the keywords and then sends the trapdoor to SP and BlockChain, respectively.(iv)Blockchain: The blockchain is composed of various nodes (such as recruitment companies, candidates, and other identities). Its primary responsibility is to maintain the blockchain network that supports smart contracts, which can be utilized for storing user data and inspecting documents.

The detailed procedure is as follows: ① The DS extracts keyword set from documents and builds a checklist from files. At the same time, the DS encrypts the keyword index and documents . ② The DS sends the encrypted keyword index to the SP. Meanwhile, the DS sends the checklist to the blockchain. ③ The SP registers and to the database. ④ The DR computes the trapdoor of the queried keywords . ⑤ The DR sends the trapdoor to the SP and the blockchain. ⑥ The SP searches the keyword index in the database by using the trapdoor . ⑦ The SP returns the result to the DR and the blockchain for verification. ⑧ As soon as the SP returns the results, the blockchain calculates the hash value of every result. From the identity information of the DS in the checklist , the blockchain gets the corresponding hash values of the files. Finally, the blockchain compares the two hash values to generate the proof. ⑨ The blockchain sends the proof to the DR.
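Steps ② and ⑧ amount to registering file hashes in advance and recomputing them over whatever the SP returns. The sketch below illustrates that comparison under stated assumptions: SHA-256 stands in for the scheme's hash function, plain dictionaries stand in for on-chain storage, and the identifiers and helper names are hypothetical.

```python
import hashlib


def h(data: bytes) -> str:
    """Hash of an encrypted file (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


# Checklist registered by the DS: file identifier -> hash of the encrypted file.
encrypted_files = {"id1": b"ciphertext-1", "id2": b"ciphertext-2"}
checklist = {fid: h(ct) for fid, ct in encrypted_files.items()}


def verify(result: dict, checklist: dict) -> bool:
    """True only if every returned ciphertext matches its registered hash."""
    return all(
        fid in checklist and h(ct) == checklist[fid]
        for fid, ct in result.items()
    )


honest = {"id1": b"ciphertext-1"}
tampered = {"id1": b"ciphertext-X"}
print(verify(honest, checklist))    # True
print(verify(tampered, checklist))  # False
```

This check detects modified or unknown files in the result; it does not by itself show that the SP returned every matching file.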

3.12. Threat Model

In this paper, we suppose that the DS honestly follows the PAEKS algorithm to produce searchable ciphertexts for DO and transmits these ciphertexts to the SP. The DR honestly follows the Trapdoor algorithm to produce trapdoors and transmits them to the SP. The SP is supposed to be honest but curious: it honestly performs the Test algorithm, but is interested in the query results and the frequency information of ciphertexts. The blockchain is trusted and executes the protocols in the system honestly.

3.13. Design Goals

According to the most basic requirements of public key searchable encryption, our scheme must meet the following characteristics:(i)Correctness: If the DR provides the correct trapdoor and the scheme is executed in the correct way, the received search results must be correct.(ii)Confidentiality: The scheme needs to protect the confidentiality of keywords, files, indexes and trapdoors. In other words, the scheme can resist KGA and IKGA.(iii)Integrity: The scheme supports multi-keyword retrieval and integrity verification of search results.

4. Construction

This section presents a basic construction of PAEKS, followed by an extension of the base version that supports multi-keyword retrieval. Inverted index-based data structures, as illustrated in Figure 2, are well suited to our construction. How to encrypt the files themselves is not the focus of this paper and is not discussed.

4.1. Basic Construction

Our scheme consists of eight PPT algorithms, namely Setup, KeyGen, PAEKS, Trapdoor, Search, Verify, Dec, and Update. The formal constructions are as follows:
(i)Setup: Choose a cyclic group with prime order . Select three hash functions , , and a collection of hash functions with key , where . Choose a random generator of . Output the public parameters .
(ii)KeyGen: It generates the public and private key pairs of the participants.
(a)The DS chooses a random element , and sets to , .
(b)The DR selects randomly , and sets to , .
(iii)PAEKS:
(a)Data Sender. Version serial number . DS runs . Compute , and , where the key is updated with the version information. Input and , run algorithm, and output the updated bloom filter , where a collection of hash functions with key and keyword for . Choose random , , , . Send the to SP, and send the to Blockchain.
(b)Storage Provider. Store to .
(c)Blockchain. Store to .
(iv)Trapdoor: The algorithm is run by a DR.
(a)Data Receiver. DR runs . Compute , and , where the key is updated with the version information. Input and , run algorithm, and output the updated bloom filter , where a collection of hash functions with key and keyword for . Randomly select random numbers , and set the corresponding positions of the bloom filter to 1, where means the number of pseudo-random functions in the bloom filter. Send to SP and Blockchain.
(v)Search: The algorithm is run by a SP.
(a)If , the search algorithm returns 1. Otherwise, it returns 0.
(b)Puts the matching ciphertext into the map .
(c)Sends the result to the DR and Blockchain.
(vi)Verify: The algorithm is run by Blockchain.
(a)Gets the search result , obtains the hash value of each file in the result from , and gets the .
(b)For all ciphertexts , blockchain computes the hash value of , gets .
(c)Blockchain compares and ; if they are equal, the proof is true, otherwise false.
(d)The search results and are sent to DR.
(vii)Dec: The algorithm is run by DR.
(a)Gets the search result .
(b)Compute .
(viii)Update: The algorithm is run by DR.
(a)Data Receiver. The ciphertext update algorithm here is the same as , except for the key generation method. Version serial number . DS runs . Compute , and , . Input and , run algorithm, and output the updated bloom filter . Choose random , , , . Send the to SP, and send the to Blockchain.
(b)Storage Provider. Store to .
(c)Blockchain. Store to .

4.2. Derived Constructions

The basic scheme can be easily extended to public key authenticated encryption with multi-keyword search. The Setup, KeyGen, PAEKS, Search, Verify, and Update algorithms are the same as those of the basic scheme; only the Trapdoor and Dec algorithms differ. To save space, we refer to the basic scheme for the details that are unchanged.
(i)Trapdoor:
(a)Data Receiver. DR runs . For each keyword in the set , compute , perform the following steps, and finally add the results to the bloom filter. Input and , run algorithm, and output the updated bloom filter . Choose random , and set to 1, where means the number of pseudo-random functions in the bloom filter. Send to SP and Blockchain.
(ii)Dec: The algorithm is run by DR.
(a)Gets the search result .
(b)Run , get .
(c)It can obtain the results of different keyword combinations (conjunction and disjunction) locally.
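Step (c) of Dec can be pictured as ordinary set operations on the decrypted identifier sets. The sketch below assumes that, after decryption, the DR holds one set of document identifiers per queried keyword; the sample data and function names are hypothetical.

```python
# Hypothetical decrypted results: one identifier set per queried keyword.
results = {"cloud": {1, 3, 5}, "privacy": {3, 4}, "search": {3, 5}}


def conjunction(results, keywords):
    """Documents that contain all of the given keywords."""
    sets = [results.get(w, set()) for w in keywords]
    return set.intersection(*sets) if sets else set()


def disjunction(results, keywords):
    """Documents that contain at least one of the given keywords."""
    return set().union(*(results.get(w, set()) for w in keywords))


print(conjunction(results, ["cloud", "search"]))    # {3, 5}
print(disjunction(results, ["privacy", "search"]))  # {3, 4, 5}
```

Evaluating the combination locally means the server never learns which Boolean expression the receiver was interested in.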

4.2.1. Remark
(1)The ciphertext and trapdoor sizes in our scheme do not grow linearly with the number of keywords. The multi-keyword trapdoor is the same size as the single-keyword trapdoor.(2)The index can be searched linearly, but a binary tree search is more efficient because it does not have to test all indexes.(3)Several factors need to be considered before using the bloom filter, including its length and the number of pseudo-random functions. We can determine these parameters by using and , where means the maximum number of characters and represents the maximum likelihood of a false positive [35, 39]. Following the parameter standard provided by [40], we select 2000 keywords, a mapping bit array of length 20000, and 5 hash functions, which gives an error rate of 0.00943.
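The quoted error rate can be checked with the standard Bloom filter false-positive approximation $(1 - e^{-kn/m})^k$, where $n$ is the number of inserted elements, $m$ the array length, and $k$ the number of hash functions. The function name below is an illustrative assumption; the formula is the usual textbook estimate rather than one taken from the paper.

```python
import math


def false_positive_rate(n: int, m: int, k: int) -> float:
    """Standard approximation of the Bloom filter false-positive probability."""
    return (1 - math.exp(-k * n / m)) ** k


rate = false_positive_rate(n=2000, m=20000, k=5)
print(round(rate, 5))  # 0.00943
```

With 2000 keywords, 20000 bits, and 5 hash functions, the estimate agrees with the error rate reported in the remark.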

5. Security Proof

In this section, we show that our basic scheme is compatible with the design goals, in other words, our construction is reliable. It can resist IKGA and reduce the possibility of access pattern and search pattern disclosure.

5.1. Correctness
5.1.1. Correctness

The sender and receiver compute , , . They run and algorithm and get and about keyword . Then, it can be proved to be correct according to bloom filter and set intersection property.
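The key-agreement step behind this argument can be written out as a short derivation. The notation is assumed for illustration, since the original symbols are not available here: the DS holds $(pk_S, sk_S) = (g^{x}, x)$ and the DR holds $(pk_R, sk_R) = (g^{y}, y)$.

```latex
\begin{align*}
  pk_R^{\,sk_S} &= (g^{y})^{x} = g^{xy}, \\
  pk_S^{\,sk_R} &= (g^{x})^{y} = g^{xy}, \\
  \text{hence}\quad pk_R^{\,sk_S} &= pk_S^{\,sk_R}.
\end{align*}
```

Both parties therefore derive the same key for the bloom filter hash functions, so the same keyword selects the same positions on the sender's and the receiver's side, and the containment test succeeds for matching keywords.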

5.1.2. Remark

Our scheme does not consider man-in-the-middle attacks or malicious tampering by external adversaries. Protection against these can be provided by a public key encryption system and the associated algorithms.

5.2. Confidentiality

The following theorems illustrate how our scheme satisfies IND-CKA and IND-KGA security.

Theorem 1. Our scheme implements IND-CKA security if the DODH assumption holds.

Lemma 1. The advantage is negligible for any PPT adversary .

Proof. Suppose the adversary guesses the keywords in the game, and the event that the guess is correct is recorded as . We define games as follows:
. It is the original IND-CKA game.
(i) : The challenger runs and to generate the public parameter and public/private key pair of participants , of DS, , of DR. Then, the challenger sends to the adversary . Hash functions and should be collision resistant and secure.
(ii) : In this adaptive attack, the adversary issues a polynomial number of queries to the cipher-keyword oracle and the trapdoor oracle , which are simulated as follows:
(a) : Run . Compute and . Input and , run algorithm, and output the updated bloom filter . Send the to .
(b) : Run . Compute , and . Input and , run algorithm, and output the updated bloom filter . Choose random , and set to 1, where means the number of pseudo-random functions in bloom filter. Send trapdoor to .
(iii) : The adversary chooses keywords , and sends them to . The challenger picks a random bit , and encrypts the keyword . The ciphertext of the challenge keyword is generated as follows:
(a) Choose a random , and . , output .
(b) Finally, the challenger sends to the adversary.
(iv) : The adversary continues to query the oracles during . It cannot query the ciphertext and trapdoor of .
(v) : If then wins the game. The advantage of the adversary is.
Let be the same game as , except that the challenger chooses instead of computing . The challenger sends the ciphertext . We have , where is negligible if the DODH assumption holds.
. Let be the same game as , except that the challenger randomly chooses instead of . Since and are random, the ciphertexts of and are identically distributed from the adversary's view.
We have,
We conclude that the adversary can only win with probability in because is independent of . Thus, the advantage,
Finally, according to the , we have,
where are negligible. Therefore, the advantage of the adversary in the IND-CKA game is negligible.
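The structure of this game-hopping argument can be summarized by a triangle inequality. The following is only a sketch under assumed notation: $S_i$ denotes the event that the adversary guesses the bit correctly in the $i$-th game, and $\epsilon_{\mathrm{DODH}}$ denotes the advantage of a DODH distinguisher. These symbols are chosen here for illustration and may not match the paper's own.

```latex
\begin{align}
\left|\Pr[S_0]-\Pr[S_1]\right| &\le \epsilon_{\mathrm{DODH}}, \\
\Pr[S_1] = \Pr[S_2] &= \tfrac{1}{2}, \\
\left|\Pr[S_0]-\tfrac{1}{2}\right|
  &\le \left|\Pr[S_0]-\Pr[S_1]\right|
     + \left|\Pr[S_1]-\Pr[S_2]\right|
     + \left|\Pr[S_2]-\tfrac{1}{2}\right|
  \le \epsilon_{\mathrm{DODH}}.
\end{align}
```

The first line is the hop justified by the DODH assumption, the second reflects that the last two games are identically distributed and that the final challenge is independent of the chosen bit, and the third combines them into the bound on the adversary's advantage.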

Theorem 2. Our scheme implements IND-KGA security if the DODH assumption holds.

Lemma 2. The advantage is negligible for any PPT adversary .

Proof. Suppose the adversary guesses the keywords in the game, and the event that the guess is correct is recorded as . We define games as follows:
. It is the original IND-KGA game.
(i) : The challenger runs and to generate the public parameter and public/private key pair of participants of DS, of DR, of of SP. Then, the challenger sends to the adversary . The hash functions and are assumed to be secure and collision resistant.
(ii) : In this adaptive attack, the adversary issues a polynomial number of queries to the cipher-keyword oracle and the trapdoor oracle , which are simulated as follows:
(a) : Run . Compute and . Input and , run algorithm, and output the updated bloom filter . Send the to .
(b) : Run . Compute , . Input and , run algorithm, and output the updated bloom filter . Choose random , and set to 1, where means the number of pseudo-random functions in bloom filter. Send trapdoor to .
(iii) : The adversary selects two keywords and sends them to . The challenger picks a random bit , and encrypts the keyword . The trapdoor of the challenge keyword is generated as follows:
(a) Choose a random and . , output . Choose random , and set to 1, where means the number of pseudo-random functions in bloom filter. Finally, the challenger sends to the adversary.
(iv) : The adversary continues to issue queries to the oracles. The only restriction is that cannot be chosen for the ciphertext and trapdoor queries.
(v) : If then wins the game. According to the IND-KGA definition, the advantage of the adversary is.
Let be the same game as , except that the challenger chooses instead of computing . The challenger sends the ciphertext . We have , where is negligible if the DODH assumption holds.
. Let be the same game as , except that the challenger randomly chooses instead of . Since and are random, the ciphertexts of and are identically distributed from the adversary's view.
We have,
As we know, the adversary can only win with probability since is independent of . Thus, the advantage,
Finally, according to the , we have,
where are negligible. Therefore, the advantage of the adversary in the IND-KGA game is negligible.

5.3. Security

In this study, the adversary is mainly the curious server, because resistance to external adversaries can easily be achieved by introducing the server public key. In other words, IKGA is mainly considered. In addition, the and algorithms are the most likely to disclose keyword information, so we focus the analysis on the and algorithms.
(i) Resistance against IKGA. In PAEKS, only the DS can generate a legal keyword ciphertext. If the adversary executes the algorithm, it cannot generate . Similarly, the adversary cannot generate the trapdoor of a keyword. Finally, the adversary cannot obtain any information by running the algorithm. Thus, our scheme can resist IKGA.
(ii) Access pattern. We use DR's public key to encrypt the file identifiers that match the keyword . The SP returns the result to DR, who decrypts the file identifier information with his private key. Finally, DR submits the file identifiers to SP according to his actual needs. By adding a round of communication, we ensure the privacy of the matching relationship between keywords and file identifiers. Thus, our scheme protects the access pattern of DS.
(iii) Search pattern. The trapdoors of our scheme are randomized; that is, two trapdoors for the same keyword are different. Thus, our scheme protects the search pattern.
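The search pattern argument can be illustrated with a simplified sketch that only models the blinding step, in which additional random bloom filter positions are set to 1. Everything here is an assumption made for illustration: HMAC-SHA256 stands in for the pseudo-random functions, the filter is represented as the set of positions equal to 1, and `extra`, `positions`, and `blinded_trapdoor` are hypothetical names. It omits the public key operations of the actual trapdoor, and a real implementation would draw randomness from a cryptographic source rather than a seeded generator.

```python
import hashlib
import hmac
import random

M_BITS = 20000   # bit-array length quoted in the parameter remark
K_HASHES = 5     # number of pseudo-random functions quoted in the remark


def positions(key: bytes, keyword: str) -> list:
    """Map a keyword to K_HASHES bit positions, using HMAC-SHA256 as the PRF."""
    out = []
    for i in range(K_HASHES):
        digest = hmac.new(key, bytes([i]) + keyword.encode(), hashlib.sha256).digest()
        out.append(int.from_bytes(digest[:8], "big") % M_BITS)
    return out


def blinded_trapdoor(key: bytes, keyword: str, rng: random.Random, extra: int = K_HASHES) -> set:
    """Return the set of 1-positions: the keyword's bits plus `extra` random ones."""
    bits = set(positions(key, keyword))
    bits.update(rng.randrange(M_BITS) for _ in range(extra))
    return bits


key = b"hypothetical shared key"
# Seeded generators are used only to make the example reproducible.
t1 = blinded_trapdoor(key, "invoice", random.Random(1))
t2 = blinded_trapdoor(key, "invoice", random.Random(2))
same_keyword_bits = set(positions(key, "invoice"))
```

Both `t1` and `t2` contain every position of the keyword, yet they differ in the randomly added positions, so a server comparing the two cannot tell directly that they encode the same query.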

6. Comparison and Analysis

In this section, we compare our scheme with the authentication-based searchable schemes (HL17 [5], QC + 20 [20], QC + 21 [21] and PL21 [41]), focusing first on security. Then, we count the number of operations required by each scheme and conduct an empirical performance evaluation using the Relic and GMP libraries.

6.1. Comparative Analysis of Security

A comparison of the security guarantees provided by these PAEKS schemes is shown in Table 2. Reference [21] introduced the definitions of fully CI-security and fully TI-security. mDLIN, DBDH and BDHI stand for the modified Decision Linear assumption, the Decisional Bilinear Diffie-Hellman assumption and the Bilinear Diffie-Hellman Inversion assumption, respectively. BDH and CODH stand for Bilinear Diffie-Hellman and Computational Oracle Diffie-Hellman. HL17 [5], QC + 20 [20], QC + 21 [21] and PL21 [41] share a common feature: they use the DR's public key in ciphertext generation and the DR's private key in trapdoor generation, which naturally resists IKGA. Table 2 shows that only our scheme achieves both fully CI-security and fully TI-security in PAEKS. Specifically, in our scheme, ciphertexts of the same keyword are encrypted with different keys, and the trapdoor is blinded with random numbers. In their CI-security model, HL17 and PL21 still prohibit adversaries from querying the cipher-keyword oracle with the challenge keywords. QC + 20 and QC + 21 are fully CI-secure, but their trapdoors are generated by deterministic algorithms, so they are not fully TI-secure. In other words, the trapdoor generation algorithm of these schemes leaks the user's retrieval habits. We reduce the possibility of access pattern leakage by adding a round of communication, because the server does not learn which specific index entries match the keyword. In addition, only our scheme supports result integrity verification.

6.2. Time Complexity

Table 3 shows the number of operations of each algorithm. , , , denote exponentiation, the pairing operation, the hash function, and the pseudo-random function, respectively. In the table, we disregard operations with low costs, such as normal hashing. Considering that Setup and KeyGen of different schemes have no significant difference in computational costs and the common algorithm, we only consider , , and algorithms. The schemes of HL17 and PL21 are slightly faster than those of QC + 20 and QC + 21, as they only compute the hash-to-point operation in the keyword encryption algorithm. However, our solution is much faster than theirs. Our scheme is the fastest compared with the schemes of HL17, QC + 20, QC + 21 and PL21 in the algorithm, because it does not need pairing. For the generation algorithm, our scheme requires a single inner product operation. This is almost optimal among these schemes.

6.3. Evaluation

Using the Relic and GMP libraries, we evaluate the efficiency of our scheme and of the other schemes (HL17, QC + 20, QC + 21 and PL21). The experiments were run on Ubuntu 18.04.5 LTS with an Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10 GHz and 16.00 GB of RAM. The pseudo-random permutation was computed using the AES algorithm (CBC mode, 128-bit key). The hash functions were computed using the SHA-256 algorithm. We chose the real Enron e-mail dataset (Version 20150307, about 423 MB), which contains data from about 150 users, to demonstrate the practical performance of our proposed scheme. We chose about 2000 keywords whose lengths are not less than 5 characters and whose total number of occurrences is higher than 20. We compared our proposed scheme with HL17, QC + 20, QC + 21 and PL21 with respect to the ciphertext generation, trapdoor generation, and test algorithms. Furthermore, this experiment also used random keywords. As Figures 4-6 show, our scheme has the smallest computation cost of the five schemes. The main reason is that our scheme does not need the bilinear pairing operation, which saves a lot of computing overhead. In the ciphertext generation algorithm, the computational overhead of Pan and Li [41] is about 420 times that of ours. In the trapdoor generation algorithm, the computational overhead of Qin et al. [20] is about 282 times that of ours. In the test algorithm, the computational overhead of Qin et al. [21] is about 25 times that of ours.
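The keyword selection rule described above (length of at least 5 characters, more than 20 occurrences, about 2000 keywords) can be sketched as follows. The tokenization by lowercase alphabetic runs and the function name `select_keywords` are assumptions for illustration; the preprocessing actually used in the experiment is not specified here.

```python
import re
from collections import Counter


def select_keywords(documents, min_length=5, more_than=20, limit=2000):
    """Pick the most frequent words that are long enough and frequent enough."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if len(w) >= min_length)
    # most_common() sorts by descending count; keep words seen more than `more_than` times.
    frequent = [w for w, c in counts.most_common() if c > more_than]
    return frequent[:limit]


# Tiny stand-in corpus with a lowered threshold, in place of the e-mail dataset.
sample = ["alpha alpha beta gamma", "alpha gamma delta"]
chosen = select_keywords(sample, min_length=5, more_than=1, limit=10)
```

With the defaults, the same function applies the thresholds stated in the text to any iterable of message bodies.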

7. Conclusion

This paper proposes a blockchain-based public key authenticated encryption with keyword search scheme for cloud computing. The scheme not only resists IKGA and minimizes the leakage of retrieval information, but also supports multi-keyword retrieval and result integrity verification. We carry out a security analysis and a theoretical analysis of BB-PAEKS, and then evaluate its performance through simulation experiments. The disadvantage of our scheme is that it uses a bloom filter, which leads to a certain number of false positives. However, the error rate can be reduced as much as possible by selecting appropriate parameters. Our future work will focus on the design of lightweight multi-user public key searchable encryption schemes.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The work was supported by the National Key Research and Development Program of China (No. 2021YFA1000600) and the National Natural Science Foundation of China (Nos. U21A20466, 62172307, 61972294, 61932016).