Abstract

Due to capacity limitations, large amounts of data generated by IoT devices are often stored on cloud servers. These data are usually encrypted to prevent the disclosure, which significantly affects the availability of this data. Searchable encryption (SE) allows a party to store his data created by his IoT devices or mobile in encryption on the cloud server to protect his privacy while retaining his ability to search for data. However, the general SE techniques are all pay-then-use. The searchable encryption service providers (SESP) are considered curious but honest, making it unfair and unreliable. To address these problems, we combined ciphertext-policy attribute-based encryption, Bloom filter, and blockchain to propose a blockchain-based fair and reliable searchable encryption scheme (BFR-SE) in this paper. In BFR-SE, we constructed an attribute-based searchable encryption model that can provide fine-grained access control. The data owner stores the indices on SESP and stores some additional auxiliary information on the blockchain. After a data user initiates a request, SESP must return the correct and integral search results before the deadline. Otherwise, the data user can send an arbitration request, and the blockchain will make a ruling. The blockchain will only perform arbitrations based on auxiliary information when disputes arise, saving the computing resources on-chain. We analyzed the security and privacy of BFR-SE and simulated our scheme on the EOS blockchain, which proves that BFR-SE is feasible. Meanwhile, we provided a thorough analysis of storage and computing overhead, proving that BFR-SE is practical and has good performance.

1. Introduction

With the continuous development of Mobile Internet, 5G, and some other advanced technologies, especially the Internet of Things, people and machines are always generating massive amounts of data. Most IoT devices produce large amounts of data with a limited storage capacity, so the data owners like to use cloud storage services to reduce the burden of maintenance costs and local storage overhead. Cloud services provide users with great convenience, enabling users to access their data anytime and anywhere, instead of using a specific machine. But these data, especially the data generated by specific IoT devices such as smart homes and intelligent wristbands, often contains sensitive information to the user. To prevent the disclosure, users encrypt their data before uploading it to the cloud server [18]. However, encryption will weaken the ability of users to search for data.

Searchable encryption technology was first proposed by Song et al. [9], which allows a party to store his data in encryption on the cloud server to protect his privacy while retaining his ability to search for data. A searchable encryption scheme typically includes three participants: the data owner (DO), the data user (DU), and the cloud server. The DO encrypts his data together with the corresponding keywords and uploads them to the cloud server. The cloud server maintains these ciphertexts and provides search services for data users. A data user will initiate a search request using a search token generated based on the keywords, and the matching search results will be sent to him by the cloud server. Finally, the data user can decrypt the ciphertext locally to obtain the data. The whole process will not expose any information related to the data itself. Nowadays, many researchers have proposed various searchable encryption algorithms, such as asymmetric searchable encryption [10, 11], multikeyword searchable encryption [12, 13], and fuzzy keyword searchable encryption [14, 15]. Most of the above studies focus on searchable encryption’s privacy and performance in different scenarios and assume that the cloud server is curious but honest. However, this is not the case, which will cause problems in the fairness and reliability of searchable encryption: (1)On the one hand, after the user pays, the cloud server cannot provide satisfactory search services, resulting in the user’s economic losses. On the other hand, after the user obtains the desired search results, he will slander and deny the cloud server’s service and deceptively refuse to pay the service fee(2)The cloud server is not always reliable. To save costs, it may delete data that is not often used at ordinary times to save space. When users search, it will send part of the search results or even send fake data to users

According to the above point of view, in addition to the privacy of keywords and search algorithms’ efficiency, practical searchable encryption is highly expected to be fair and reliable. To solve these problems, we urgently need such a searchable encryption scheme, in which the service provider is always to provide reliable search service, and the users pay for it. There is no credible third party in this scheme but will not cause any economic disputes. Fortunately, with the emergence and development of Bitcoin [16], as a decentralized cryptocurrency, its underlying technology blockchain can gracefully help us to achieve this goal. In this paper, we proposed a fair and reliable searchable encryption scheme (BFR-SE) based on blockchain. The main contributions of our research are as follows: (1)We constructed an attribute-based searchable encryption algorithm (ABSE) and combined it with blockchain and Bloom filter to propose a fair and reliable searchable encryption scheme. While the DO stores the data indices in the SESP, some additional auxiliary information used for verification is uploaded to the blockchain. In the event of disputes between DU and SESP, the blockchain will arbitrate, and the dishonest participant will be punished financially(2)BFR-SE supports users’ multikeyword search for ciphertexts. By utilizing ABSE, the DO realizes fine-grained access control for their data search, which means that only the users whose attributes satisfy the specific policy can search and obtain the correct search results(3)Not the same as other blockchain-based searchable encryption schemes, BFR-SE only stores a small amount of auxiliary information on-chain and performs possible arbitration when disputes occur, which dramatically saves storage and computing resources on-chain(4)We simulated and implemented BFR-SE on the EOS blockchain and showed implementation details of smart contracts and algorithms. Together with the security analysis, it proves that our scheme is feasible(5)We used 6 MacBook Pros to build an EOS private chain in a laboratory environment and simulated our scheme. The storage and computing overhead proves that BFR-SE is practical and has good performance

The rest of this paper is organized as follows: Section 2 consists of related works. Section 3 reviews some preliminary knowledge used throughout this paper. In section 4, we have an overview of our scheme. Section 5 describes specific implementation details. In Section 6, we analyze the security and performance. Finally, we present the conclusion and future direction.

2.1. Verifiable Searchable Encryption

To ensure the reliability of searchable encryption and prevent the cloud server from returning partial or even wrong search results, users need to have the ability to verify the correctness of search results. Earliest in 2012, Chai and Gong [17] proposed a verifiable keyword search scheme, in which the cloud server needs to prove that the returned results are correct. Kurosawa and Ohtaki [18] proposed the first UC-secure verifiable symmetric searchable encryption, which can verify whether the search results are modified or deleted, and the computational cost of verification has a linear relationship with the number of files. Zhu et al. [19] constructed a verifiable fuzzy keyword search scheme to support dynamic data using Bloom filter and locality sensitive hash function. The single-keyword verifiable searchable encryption will return many irrelevant results that cause the waste of transmission bandwidth and computing resources, so the verifiable searchable encryption proposed by Azraoui et al. [20] supports multikeyword search or combined search. However, the above verifiable searchable encryption schemes are only suitable for a small number of users, and it is challenging to meet the user’s dynamic requirements in the cloud environment. The number of users growing will cause the burden of key management and cannot achieve fine-grained access control. In 2014, Zheng et al. [21] proposed a novel cryptographic primitive named verifiable attribute-based keyword search. This primitive allows DO to control the search and outsource his encrypted data to the cloud server based on an access policy. Simultaneously, it allows legitimate users to outsource the search operation (usually expensive) to the cloud server and verify whether the cloud server loyalty performs it. Ameri et al. [22] combined hierarchical identity-based multidesignated verifier signature (HIB-MDVS), hierarchical identity-based broadcast encryption (HIBBE) and Bloom filter to propose a generic construction for verifiable attribute-based keyword search. The VBKS scheme proposed by Sun et al. [23] can realize the revocation of user attributes and utilize proxy reencryption and lazy reencryption to transfer the heavy update work during attribute revocation to a semitrusted cloud server while supporting the multikeyword search.

As mentioned above on verifiable searchable encryption, the research enables users to verify the search results’ correctness and integrity. However, as a more practical searchable encryption scheme, this is far from enough because when a dishonest server is detected, it cannot continue to punish the dishonest server without a third-party trusted organization, which cannot be genuinely reliable.

2.2. Blockchain-Based Searchable Encryption

In recent years, some researchers utilized blockchain to solve the fairness problem in searchable encryption. In 2017, Li et al. [24] used blockchain to construct symmetric searchable encryption (SSE-using-BC). In their scheme, users publicly store all data on the Bitcoin through transactions. As long as the participant does not execute honestly, he will lose his BTC. In their subsequent work [25], they also improved the scheme and adopted the Fabric blockchain, which significantly improved the performance. Hu et al. [26] explored the potential capabilities of the Ethereum blockchain and constructed a decentralized, privacy-protected search model. The scheme designed a financially fair smart contract to replace the centralized server so that all participants are treated equally and motivated to perform correct operations. Cai et al. [27] also used a smart contract to record encrypted search records on the blockchain and designed a fair protocol to deal with disputes and payment issues. They used a dynamic, efficient searchable encryption scheme, which retained the search capability and inspired the service provider to make a real effort. Tang [28] extended the original searchable encryption, storing some necessary information on-chain, in which blockchain only serves as a proper judicial function. If there is no dispute, it will perform little operations on-chain, reducing the blockchain’s burden. Chen et al. [29] stored the indices and complex logical structure of EHRs on the blockchain. They believed that only utilizing blockchain for propagation can the data owner have complete control over their data. Blockchain ensures the integrity, unforgeability, and traceability of the indices. Jiang et al. [30] proposed a Bloom filter-enabled multikeyword search protocol with enhanced efficiency and privacy preservation. In the protocol, a low-frequency keyword is selected using the Bloom filter to filter the database when performing a multikeyword search.

In summary, although the above searchable encryption based on the blockchain can solve the fairness problem in the payment process, there are still some shortcomings: (1)These schemes are products of the combination of blockchain and symmetric searchable encryption that can only achieve a single one-to-one scenario that is difficult for a large number of users and meeting dynamic requirements in a cloud environment, not to mention fine-grained access control(2)The main idea of these schemes is to store index information of the encrypted data on-chain. Although encrypted, the symmetric searchable encryption is generally a deterministic function. It will be noticed when the user searches for the same keyword multiple times. It will lead to the establishment of some statistics, making it possible to infer some private information(3)Both file storage and search algorithm execution are all processed on-chain, which increases the storage and computing overhead of blockchain. Compared with the traditional way, because the blockchain requires parallel storage and calculation of multiple miners, resource waste is bound to become noticeable. Individual schemes put this part off-chain but caused functional defects, such as the participants need a lot of offline communications

3. Preliminary

3.1. Bilinear Pairing

Let and be cyclic groups of order and be a generator of . We call is a bilinear paring if it is a map with the following properties: (1)Bilinear: for all and , there will be (2)Nondegenerate: there exists , such that (3)Computable: there is an efficient algorithm to compute for all and

3.2. Linear Secret Sharing Scheme (LSSS)

Let be a set of participants, and be an access structure. In the structure, is an matrix with mapping its rows. An LSSS is composed of two polynomial-time algorithms: (1): to share a secret value , it selects randomly . Let , and be the vector as the th row of matrix , then, the secret share belongs to party (2): it takes and corresponding secret shares as inputs. If any satisfies the access structure, a set of recovery coefficients can be calculated so that

3.3. Bloom Filter

Bloom filter is a space-efficient probabilistic data structure, proposed by Burton Howard Bloom in 1970 [31] that through an individual error rate in exchange for space-saving and query efficiency. A standard -bit Bloom filter includes a vector with a length of , all bits of which are initialized to 0, independent hash functions , and each hash function has a uniform value in the range of . For each element in set , set the corresponding position of in the vector to 1. It is only necessary to determine whether an element all in are 1 or not to judge whether an element is in the set . If not, there must be ; otherwise, there is a high probability . (It should be noted that a high probability means that there is a false-positive rate of nonzero, but this possibility can be minimized by appropriate setting the value of and k.) A Bloom filter is composed of two algorithms: (1): this algorithm generates an -bit Bloom filter by hashing the data set (2): this algorithm verifies whether the element is in the set . If it returns 1, . Otherwise,

3.4. Blockchain

The blockchain concept originated from Nakamoto’s Bitcoin white paper [16], whose foundation is cryptography and P2P networks. Then, it organizes the data with a specific structure into blocks in a certain way and links these blocks into a chain in chronological order. Cryptography and consensus mechanisms together ensure the security and unforgeability of data. In short, as the underlying technology of cryptocurrency like Bitcoin, the blockchain is a trusted ledger with distributed computing capabilities that can process business credibly without a third-party organization.

3.5. Smart Contract

Initially, when it comes to blockchain, the only well-known applications are cryptocurrencies such as Bitcoin and Litecoin. What brings a qualitative change to the blockchain is that in 2013 Vitalik Buterin established the first public chain platform named Ethereum with a built-in Turing complete language [32] and inaugurated smart contract for the blockchain. Szabo defined smart contract as “a computerized transaction protocol that executes the terms of a contract” [33]. The smart contract in the blockchain is a piece of program code stored on the chain, which can be executed securely and reliably. On the one hand, the blockchain can utilize a programmable smart contract to implement more complex business logic. On the other hand, the blockchain can provide a trusted environment for executing a smart contract. The operating mechanism of the smart contract in the blockchain is shown in Figure 1.

As shown in the figure, the blockchain can be seen as a state machine triggered by transactions, and the ledger is a public world state starting from the Genesis Block. Users can create a transaction and broadcast it to the blockchain network from any node. All block producers will perform corresponding operations after receiving the transaction, and the consensus mechanism makes all nodes finally get a consistent result and update the world state.

Blockchain provides the following support for the execution of smart contract on-chain: (i)Public status: every participant can inspect the smart contract’s current world state on the public ledger(ii)Timestamp server: the block height can be seen as a trusted timestamp that never stops(iii)Trusted propagation channel: the sender can utilize the blockchain to spread the message, and the receiver will reliably receive the message shortly. The delivery traces will be recorded on-chain for auditing, and the records are credible and cannot be tampered with by anyone

3.6. Transactions of EOS

Account, address, and transaction are three essential components in the EOS blockchain [34]. Each user has an account that corresponds to multiple ECDSA key pairs expressed as , and each key pair represents different operation permission of the account. The private and public keys are used by users to sign and verify a transaction. Our definition of a transaction is consistent with our previous work [35, 36]:

where refers to the height and id of a recently generated block, which prevents the transaction from being packaged on the fork chain and is the expiration time of the transaction. is the signature signed by the sender. Action is the operation performed by the transaction in which Code is the name of the smart contract, Name is the method to be invoked in the smart contract, is used to verify whether the sender has the authority to call the method, and Data are the parameters. There may be multiple actions in one transaction. Smart contracts can also send actions to each other to call methods of other contracts, which is called inline communication, and the corresponding execution authority is the same as the original transaction.

3.7. Data Persistence of EOS

After the smart contract is executed, the occupied memory will be released, and all variable data in the program will be lost, so it is necessary to persist the data in smart contract. In the smart contact of Ethereum, data can only be stored in key-value pairs, which is difficult to meet more complex requirements. EOS imitates multi-index containers in Boost library and develops a C++ class: eosio::multi_index (from now on referred to as multi_index). Each multi_index can be regarded as a table in the traditional database. Each row of the table can store an object, and the object’s attributes can be any C++ data type. Therefore, the table constructed by multi_index in EOS is no less flexible than traditional databases. A significant feature of multi_index is that a primary key can be set as the main index and 16 secondary indices. Users can obtain any of these indices and use the emplace, erase, modify, and find functions of the index to insert, delete, update, and select data.

4. Overview of Proposed Scheme

This section will give an overview of our proposed scheme, including the system model and scheme design. The meanings of the symbols and abbreviations used in this paper are shown in Table 1.

4.1. System Model of BFR-SE

The scheme proposed in this paper is composed of four components: data owner (DO), data user (DU), searchable encryption service provider (SESP), and blockchain. The keywords and their corresponding index structure are encrypted and uploaded to the SESP after DO extracts the keyword set from the outsourced data set to prevent privacy disclosure. DO distribute keys for DUs through blockchain, and only the DUs whose attributes satisfy the access policy can search and obtain the original data. DU uses his private key to generate a search token according to the keywords he wants to retrieve. According to the search token provided by DU, SESP performs complex search calculation operations, then returns the search results to DU and obtains the revenue. The traces and additional evidence of each participant will be recorded on the blockchain, which cannot be destroyed or denied. DUs need to pay SESP for the service they use. If the SESP does not provide the correct result before the predetermined block height, DU can apply for arbitration, and the blockchain will make a judgment according to the auxiliary information and the additional evidence on-chain during the search. Then, the charge fee will be returned to the DU as compensation, together with a penalty on SESP. The specific functions and responsibilities of the four components are as follows: (1)DO: the owner of the IoT device is also the owner of the data. Responsible for the system’s initialization, including creating and deploying smart contracts in the scheme. DO needs to generate and distribute private keys for registered DUs according to their attributes. Besides, DO extracts keywords from the outsourced data files, generates the corresponding indices, and sets a reasonable access policy for the indices. DO is honest by default(2)DU: according to the keywords he wants to retrieve, DU generates a search token with his private key and sends a search request to the SESP. At the same time, DU can judge whether the search results returned by SESP are correct or not(3)SESP: it has powerful computing capabilities and stores the ciphertext of indices provided by DO. It will perform complex search calculations on the indices according to DU’s search token, then return the search results to DU. SESP may be malicious and may return partial or even wrong search results to save resources and defraud profits(4)Blockchain: it stores the auxiliary information of the indices and intermediate evidence information. In the event of a dispute, arbitration can be conducted according to this information, and the malicious parties can be punished economically. In the absence of a third-party authoritative and trusted organization, the blockchain is the cornerstone of trust in the scheme. Additionally, blockchain can provide a reliable broadcast channel for each participant, which can be used by each party for information dissemination

The system model of our proposed scheme is shown in Figure 2.

Our scheme’s searchable encryption algorithm is inspired by the scheme named VABKS (verifiable attribute-based keyword search) proposed by Zheng et al. [21]. We have optimized and extended it to support a multikeyword search. The detailed description of each step in the flowchart is as follows: (i)DO creates and deploys smart contracts on the blockchain. BFR-SE includes two smart contracts: PMContract and SEContract(ii)DO generates the system master key and public parameters, as well as a pair of signature keys. Then, DO publishes the public parameters and public key for the signature to the smart contract, while the system master key and the private key for signature keep secret(iii)SESP is registered in the SEContract, and a definite amount of deposit is required when registering. If SESP has fraudulent behavior, it will deduct part of the deposit as punishment(iv)DU applies for registration in the PMContract, and he needs to provide his EOS account and a public key of ECC in which the EOS account is used for receiving compensation when SESP is dishonest(v)DO generates the attribute key for DU according to his attributes set, then uses his public key to encrypt it and broadcast it to the blockchain. The ciphertext of the attribute key will be stored in the PMContract(vi)DU obtains the ciphertext of his attribute key from PMContract and decrypts it locally by the corresponding private key(vii)DO encrypts his data files and outsources them. (It can be uploaded to the cloud server, or IPFS, which is beyond the scope of this paper.) The returned address and the corresponding decryption key constitute the identity of the shared data. DO extracts the keywords set from the data files and builds indices for the keywords matching data files’ identification, then generates auxiliary information simultaneously. DO uploads the indices to SESP(viii)DO uploads the auxiliary information to SEContract(ix)DU uses his private attribute key locally to generate the corresponding search token and additional commitment for his search request. Before searching, DU can check the amount of SESP deposit in PMContract. If the deposit is not enough to pay the penalty when doing evil, DU can choose not to continue(x)SESP obtains the search request from SEContract, uses DU’s search token and the indices uploaded by DO to execute the search algorithm, and returns the search results(xi)SESP uploads search results to SEContract. It should be noted that SESP needs to complete the search and upload the results before the preagreed block height(xii)DU gets his search results from SEContract(xiii)If SESP does not provide any results before the specified block height, DU can withdraw his service fee through SEContract without any loss(xiv)If DU believes that the search results returned by SESP are incorrect, DU could initiate a request for arbitration, and the blockchain will determine whether SESP has performed correctly(xv)SESP gets his deposit from the contract

4.2. Detail Design of BFR-SE

BFR-SE consists of the following phase: initialization phase, apply and register phase, build index phase, token generation phase, search phase, verification phase, and withdraw phase. This section will describe each phase’s detailed design in BFR-SE and give the corresponding relationship with the process steps in the flowchart. (1)Initialization phase

The main work of the initialization phase is that DO creates smart contracts and deploys them on the blockchain. Then, DO generates the system master key MSK and public parameters PK locally. The core algorithm of this phase is , which is run by DO locally. The algorithm’s input is a security parameter , and the outputs are MSK, PK, and a pair of keys for signature. After that, the DO publishes PK and to smart contracts and keep MSK, and secret locally. The corresponding steps in the system flowchart are ① and ②. (2)Apply and Register phase

The apply and register phase’s primary work is to complete the registration of SESP and DUs, including that DO distributes private attribute key for each DU. SESP needs to transfer a certain amount of system tokens to SEContract when applying for registration. If the amount of deposit is less than fines when doing evil, DU can choose not to use the search service. When DU applies for registration, it needs to provide a public key of ECC. After that, the DO generates a private attribute key for the DU according to his attributes set. The core algorithm is , which is run by DO locally. The algorithm inputs MSK, PK, the attribute set of DU, and DU’s public key and outputs the private attribute key of DU. Then, DO uses the public key provided by DU when applying for registration to encrypt and obtain the ciphertext of , . DO uploads to the SEContracts, so that DU can securely obtain and decrypt it to get his private attribute key . The corresponding steps in the system flowchart are ③, ④, ⑤, and ⑥. (3)Build index phase

The main work of the build index phase is that DO encrypts the sharing data and outsource it. Take IPFS as an example, we can use the returned address and key to identify the data. After that, DO extracts the keywords set from the sharing data, builds indices for all the sharing data with the same keyword set, and generates the auxiliary information. DO sends the indices to SESP and uploads the auxiliary information to SEContract. The corresponding steps in the system flowchart are ⑦ and ⑧. It consists of the following three subalgorithms: (a)

This algorithm is run by DO. For each element in where denotes the number of the sharing data, DO encrypts it by the key and outsource it. Taking IPFS as an example, the returned address of is , and the sharing data’s identity is . The algorithm’s final output is the identity set . IDGen is an encryption function module defined by DO, which is not the focus of this paper, so the flowchart does not present it. (b)

DO runs this algorithm, and the main work is to establish the corresponding indices based on sharing data and the relevant keywords. At first, DO need to extract the keywords for each in , then . For and , if the corresponding data set is , then use all the elements in as leaf nodes to generate a Merkle Tree, and the root is . We defined that , and the final indices of the keywords set KW matching the sharing data FID is in which is the number of indices. (c)

DO runs this algorithm and the primary work is to encrypt the indices by a specific access policy to generate the ciphertext of indices CI and the auxiliary information AI. The inputs are the indices I, the access policy , and the public system parameters PK, while the outputs are CI and AI. DO encrypts the keywords of each in I to get the ciphertext of the keywords . DO signs and by his private key to ensure the integrity of the index. DO uploads the ciphertext to SESP, and the corresponding auxiliary information AI is uploaded to SEContract. For example, with four files, the data structure of an index is shown in Figure 3. (4)Token generation phase

The main work of the token generation phase is that DU uses his private attribute key to call the trapdoor function to generate a search token and commitment for the searching keywords. is the core algorithm whose inputs are the private attribute key of DU, and the keywords set retrieved, and outputs are search token TOK and the commitment COMM. The commitment is to prove that DU did search for the keywords set provided by him when arbitrating. After that, DU uploads TOK and COMM to SEContract and pays the fee simultaneously. The corresponding steps in the system flowchart is ⑨. (5)Search phase

The search phase’s primary work is to use the search token to retrieve the ciphertext of indices uploaded by DO and return the successful matching results. The core algorithm of this phase is , which is run by SESP locally. This algorithm’s inputs are the search token TOK and the ciphertext of index , and the output is 0 or 1. If the output is 1, then the match is booming, and the search result is . SESP needs to upload to SEContract before the preagreed time. Otherwise, DU can claim back the charge fee. The corresponding steps in the system flowchart are ⑩, ⑪, ⑫, and ⑬. (6)Verification phase

The verification phase’s primary work is to verify the search results returned by SESP according to the results and the auxiliary information uploaded by DO, including the verification of existence, integrity, and correctness. If the verification result is that the SESP has done evil, economic punishment will be imposed on SESP. This algorithm is executed by the blockchain, corresponding to step ⑭ in the flowchart. It can be subdivided into three subphases as follows: (a)

When the search result returned by SESP is null, the algorithm can verify the existence of the sharing data searched by DU. The algorithm’s inputs are the auxiliary information AI, a random number related to the search token and , and the commitment corresponding to the search request. The output is 0 or 1. (b)

This algorithm can verify the integrity of the search results returned by SESP and prevent SESP from returning only partial or even forged results. The input of this algorithm is the search result , and the output is 0 or 1. (c)

This algorithm can verify the correctness of the search results returned by SESP and prevent SESP from returning the wrong results. This algorithm’s inputs are the search token TOK and the search result . The output is 0 or 1. (7)Withdraw phase

The main work of this phase is that each participant withdraws their coins from smart contract. The SESP’s coin includes the deposit at the time of registration and DU’s payment for using the search service. The coin of DU is mainly compensation, which comes from the penalty of SESP. It should be noted that each fee needs a freeze period, during which DU can apply for arbitration on the blockchain. Only after the freezing period has passed can SESP withdraw this fee from the contract. The corresponding steps in the system flowchart are ⑮.

5. Implementation Details of Proposed Scheme

To achieve our goal, we constructed an attribute-based searchable encryption algorithm that supports multikeyword search and combined with the EOS blockchain platform to realize our fair and reliable scheme. This section will elaborate on the details of our smart contracts deployed on EOS and the concrete construction of BFR-SE.

5.1. Smart Contract Design

In order to make the logic clearer, we divide the smart contract in the scheme into two parts, PMContract and SEContract. We use _self to represent the account of smart contract itself and _self.asset to represent the balance in the contract. Let require_auth be a function that represents which account’s permission is needed to continue. We will describe the two smart contracts in detail in this section.

5.1.1. Participant Management Contract (PMContract)

The PMContract is composed of five interfaces: SetSPK, Register, GetPK, SetSK, and GetSK. We initialize PMContract as follows:

Let three-tuple denote a DU and create a multi_index named table_user, in which is an EOS account of DU, is a public key of DU, and is the private attribute key of DU. Let be the primary key of table_user, whose corresponding index is account_idx. Let PK denote the system public parameters. (1)SetSPK: when PMContract receives action (PMContract, SetSPK, Auth, (pk)), this function will be triggered to execute. It can only be invoked by DO to set and update the public system parameters(2)Register: when PMContract receives action (PMContract, Register, Auth, (, )), this function will be triggered to execute. It is invoked by DU to apply for registration in the system. The detail of this function can be seen in Algorithm 1(3)GetPK: when PMContract receives action (PMContract, GetPK, Auth, ()), this function will be triggered to execute(4)SetSK: when PMContract receives action (PMContract, SetSK, Auth, ()), this function will be triggered to execute. It is used for DO distributing private attribute keys to DUs. The detail of this function can be seen in Algorithm 2(5)GetSK: when PMContract receives action (PMContract, GetSK, Auth, ()), this function will be triggered to execute

Input: ,
Output: void
1 require_auth()
2 u = account_idx.find()
3 if (u == null) then
4  u.=
5  account_idx.modify(u)
6 else
7  u.=
8  u.=
9  account_idx.emplace(u)
10 end
Input: ,
Output: bool
1 require_auth(_self)
2 u = account_idx.find()
3 if u == null then
4  return false
5 else
6  u.=
7  account_idx.modify()
8  return true
9 end
5.1.2. Searchable Encryption Contract (SEContract)

The SEContract consists of 13 functions: SetPK, SetAI, Deposit, SearchRequest, SendResult, ExistenceVerify, IntegrityVerify, CorrectnessVerify, GetFeeSESP, IsVerifyRound, IsResultReady, Compensate, and CommitVerify. We initialize SEContract as follows:

Let six-tuple (, SerialNum, TOK, COMM, Height, Coin) be a search request initiated from DU and create a multi_index named search_table, in which is the account of DU, SerialNum is the serial number of the search request, TOK is the search token, COMM is the commitment of DU for the request, Height is the block height when the request is initiated, and Coin is the charge fee paid by the user for the service. Let be the primary key of search_table, and the corresponding index is search_idx. Let three-tuple (, SerialNum,) be a result returned from SESP and create a multi_index named result_table, in which and SerialNum are to match the search request in search_table, and denotes the search result. Let be the primary key of result_table, and the corresponding index is result_idx. Let be the account of SESP and Deposit be the balance of SESP in the contract. Let represent the fee that the user needs to pay for each search and the amount of penalty when SESP does evil. Let round_height represent the time required for search round and verification round, BF be the auxiliary information which is a Bloom filter, and be the public key for the signature of DO. (1)SetPK: when SEContract receives action (SEContract, SetPK, Auth, (pk)), this function will be triggered to execute(2)SetAI: when SEContract receives action (SEContract, SetAI, Auth, (AI)), this function will be triggered to execute(3)Deposit: when SEContract receives action (SEContract, Deposit, Auth, (, coin)), this function will be triggered to execute. The detail of this function can be seen in Algorithm 3(4)SearchRequest: when SEContract receives action (SEContract,SearchRequest,Auth, (,Sn,TOK, COMM)), this function will be triggered to execute. The detail of this function can be seen in Algorithm 4(5)SendResult: when SEContract receives action (SEContract, SendResult, Auth, (, Sn, )), this function will be triggered to execute. It can only be invoked by SESP. The detail of this function can be seen in Algorithm 5(6)ExistenceVerify: when SEContract receives action (SEContract, ExistenceVerify, Auth, (,random, )), this function will be triggered to execute. The detail of this function can be seen in Algorithm 6(7)IntegrityVerify: when SEContract receives action (SEContract, IntegrityVerify, Auth, ()), this function will be triggered to execute. The detail of this function can be seen in Algorithm 7

Input: , coin
Output: void
1 require_auth()
2 if== null then
3  =
4 end
5 send action (eosio.token, transfer, Auth, (,_self, coin))
6 Deposit = Deposit + coin
Input: , Sn,TOK, COMM
Output: void
1 require_auth()
2 s = search_idx.find()
3 send action (eosio.token, transfer, Auth, (,_self, d))
4 ifs == nullthen
5  s. =
6  s.SerialNum = Sn
7  s.TOK = TOK
8  s.COMM = COMM
9  s.Coin = d
10  s.Height = getCurrentHeight()
11  search_idx.emplace(s)
12 else if (getCurrentHeight() > (s.Height+2round_height)) then
13  s.SerialNum = Sn
14  s.TOK = TOK
15  s.COMM = COMM
16  s.Coin = d
17  s.Height = getCurrentHeight()
18  search_idx.modify(s)
19 end
Input: , Sn,
Output: void
1 require_auth()
2 rlt=result_idx.find()
3 if rlt!=null then
4  rlt.SerialNum=Sn
5  rlt.=
6  result_idx.modify(rlt)
7 else
8  rlt.=
9  rlt.SerialNum=Sn
10  rlt.=
11  result_idx.Emplace(rlt)
12 end
Input: , random,
Output: void
1 bool IsVr = IsVerifyRound()
2 bool IsRd = IsResultReady()
3 ifIsVr== true && IsRd==falsethen
4  Compensate()
5 else ifIsVr== true && IsRd==truethen
6  ifCommitVerify(,random, ) == true then
7    bool isExist = BFVerify(BF, SHA256())
8    ifisExist == falsethen
9      Compensate()
10    end
11  end

12 end(8)CorrectnessVerify: when SEContract receives action (SEContract, CorrectnessVerify, Auth, ()), this function will be triggered to execute. The detail of this function can be seen in Algorithm 8(9)GetFeeSESP: when SEContract receives action (SEContract, GetFeeSESP, Auth, ()), this function will be triggered to execute. The detail of this function can be seen in Algorithm 9(10)IsVerifyRound: this is a private function and can only be called internally by the contract itself. The detail of this function can be seen in Algorithm 10(11)IsResultReady: this is a private function and can only be called internally by the contract itself. The detail of this function can be seen in Algorithm 11(12)Compensate: this is a private function and can only be called internally by the contract itself. The detail of this function can be seen in Algorithm 12(13)CommitVerify: this is a private function and can only be called internally by the contract itself. The core of this function is the algorithm TEST in the search phase. For detailed implementation, see the concrete construction of BFR-SE in the next section

Input:
Output: void
1 bool IsVr = IsVerifyRound()
2 bool IsRd = IsResultReady()
3 if IsVr== true && IsRd==false then
4  Compensate()
5 else if IsVr== true && IsRd==true then
6  rlt = result_idx.find()
7  bool isMK = VerifyMerkleRoot(rlt., rlt.FID)
8  bool isSig = SigVerify(rlt.|| SHA256(rlt.), rlt., ))
9  if isMK && isSig then
10    throw
11  else
12    Compensate()
13  end
14 end
Input:
Output: void
1 bool IsVr = IsVerifyRound()
2 bool IsRd = IsResultReady()
3 ifIsVr== true && IsRd==falsethen
4  Compensate()
5 else ifIsVr== true && IsRd==truethen
   s = search_idx.find()
6  rlt = result_idx.find()
7  bool isCv = Test(s.TOK,rlt.)
9  ifisCv == false then
12    Compensate()
13  end
14 end
Input:
Output: void
1 s = search_idx.find()
2 if (getCurrentHeight() > s.Height+2round_height)& & (s.Coin>0) then
3  Deposit = Deposit + s.Coin
4  s.Coin =0
5  search_idx.modify(s)
6 end
Input:
Output: bool
1 s = search_idx.find()
2 if (getCurrentHeight() > (s.Height + round_height)) &&
  (getCurrentHeight() < (s.Height+2round_height)) then
3  returntrue
4 else
5  returnfalse
6 end
Input:
Output: bool
1 s = search_idx.find()
2 rlt = result_idx.find()
3 if s.SerialNum == rlt.SerialNum && rlt.! = null then
4  return true
5 else
6  return false
7 end
Input:
Output: void
1 s=search_idx.find()
2 send action (eosio.token, transfer, Auth, (_self, , s.Coin+d))
3 s.Coin=0
4 search_idx.modify()
5 Deposit = Deposit - d
5.2. Concrete Construction of BFR-SE

This section shows the concrete construction of BFR-SE, including the algorithms to be executed at each phase and how each participant interacts with the EOS blockchain. Our initialization is as follows:

Let and be cyclic groups of order p, and g be a generator of . Let be a bilinear pairing, be the set of all attributes, be general and distinct hash functions. and are also two hash functions. (1)

Firstly, DO picks randomly. For each attribute , compute that .

The public system parameters are PK:

The system master key is MSK:

DO randomly selects a key pair of ECDSA which be denoted , then keeps MSK and secret and sends the following two transactions to the blockchain: (2)

At first, DO sends the following transaction to the blockchain to obtain the DU’s public key:

Let the attribute set of DU be and , then randomly pick and compute , . For each , it computes that . The private attribute key of DU is :

Then, encrypt it with the public key of DU:

DO sends the following transaction to the blockchain: (3)

For , , randomly picks and computes , . Let be an access structure. It randomly chooses and sets . For each to , it calculates . Then, randomly picks and performs the following calculations for each attribute:

Let be the size of the keyword set . For each in , , perform the following calculation:

The ciphertext of will be

DO signs the ciphertext of the index by his private key:

Let be the ciphertext corresponding to :

Set , ,.

Finally, the ciphertext of the indices and the auxiliary information will be as follows:

After that, DO uploads CI to SESP, and the auxiliary information AI is uploaded to the blockchain by sending the following transaction: (4)

Firstly, DU obtains his ciphertext of private attribute key by sending the following transaction:

DU decrypts it to get the private attribute key:

Then, DU randomly picks . Let be the size of and compute it as follows:

For each , it computes .Set , and the search token will be TOK:

It calculates the commitment of the search request as follows:

Finally, DU picks randomly as the serial number of the search request and sends the following transaction: (5)

After SESP receives the TOK from DU, it will compare each row of CI. Firstly, it is judged whether the number of keywords in is the same as that in TOK. If different, compare the next row.

Assuming that the attribute set of DU satisfies the access policy, set be the recovery coefficient of the th row in and calculate as follows:

Then, determine whether the following two formulas (4) and (5) are equal. If it is, it returns 1. Otherwise, it returns 0.

If the result is found successfully, then ; otherwise, . SESP sends the following transaction to the blockchain. (6)Verification

If DU believes that there are problems with the search results returned by SESP during the verification round, DU can initiate a request to the blockchain within this period and request the blockchain to make a judgment. (a)Existence

When the result is null, DU can send the following transaction to the blockchain:

After the contract receives the transaction, it will first verify the DU’s previous commitment to prevent DU from submitting keywords that are different from that in the search request. It will be calculated as follows:

Blockchain determines whether and COMM in the contract are equal. If they are the same, the calculation will continue as follows:

If the above equation is tenable, this DU is honest. After that, compute the following formula to verify the existence of the search result. (b)Integrity

When DU believes that the result returned by SESP is not complete or corrupted, he can send the following transaction to the blockchain. (c)Correctness

When DU believes that the result returned by SESP does not include the keywords searched for, he can send the following transaction:

6. Security and Performance Analysis of Proposed Scheme

6.1. Security and Privacy Analysis of BFR-SE
6.1.1. Security and Privacy of ABSE

The ABSE model we constructed in BFR-SE is inspired by the attribute-based search model VABKS of Zheng et al. [21]. Furthermore, we have expanded it to support the multikeyword search. VABKS is proved to be secure, and the complete proof process can refer to the security analysis in [21]. The security of VABKS relies on the decisional linear assumption.

We focus on the fairness and reliability of searchable encryption, and the security of ABSE is not the main work of this paper. So, we will mainly analyze the correctness of ABSE and briefly discuss its security and privacy.

(1) Correctness. Let be the recovery coefficient of the th row in and calculate as follows:

If the attribute set of DU satisfies the access policy , it will get s by calculating , and .

Then, calculate as follows:

If the keywords set in the search token is the same as the keywords set KW in the index, it will have

(2) Security and Privacy. From a security point of view, all attribute-based cryptographic algorithms need to resist collusion attacks. In the ABSE used in BFR-SE, we pick a random number for each DU at the key generation phase, and the attribute-based part of the private key is related to it. So, different DUs cannot combine their respective attributes to launch a collision attack.

From the perspective of privacy, DO encrypts his indices and stores them on SESP without revealing any information. Moreover, DO has fine-grained access control on the search function. For DU, a random number is used every time generating a search token, making the search token generated will not be the same even if DU searches for the same keywords multiple times. The adversary cannot analyze DUs’ privacy by collecting the traces of search requests.

6.1.2. Fairness and Reliability of BFR-SE

(1) Fairness. In this paper, we proposed a pay-per-use searchable encryption scheme. We believe that both SESP and DU are not always credible, and the dishonest behavior of either party may cause economic disputes. In our scheme, the participant with substantial computing power needs to pay a certain amount of deposit before becoming SESP. We divide each search of DU into two rounds: search round and verification round. When a DU initiates a search request, he needs to transfer the fee to smart contract.

DUs can initiate a request for arbitration in the verification round when SESP does not return any result, return partial results, or return incorrect results. The blockchain will arbitrate the search results. If it determines that SESP has acted dishonestly, SESP will be subject to a financial penalty, and the fine will compensate DU. Although DU may expose a little bit of their information when applying for arbitration, they will get financial compensation.

For SESP, if he can provide the correct results before the pre-agreed time, he can take away his profits after the freezing period. When DU initiates a search request, he promises the set of keywords retrieved and stores it on-chain, ensuring that DU will not submit a different set of keywords during the verification round to defraud the SESP deposit. Moreover, a particular fine can be introduced to DU so that DU cannot apply for arbitration without any certainty.

In summary, BFR-SE is fair to both SESP and DU.

(2) Reliability. In our scheme, we draw on the idea of verifiable searchable encryption in which DU can verify the results return by SESP from three aspects, including existence, integrity, and reliability. Therefore, the reliability of verifiable searchable encryption is also available in our scheme. Unlike the verifiable searchable encryption, we utilize the blockchain to introduce a reward and punishment mechanism for the scheme. When DU finds any problem with the results returned by SESP, they apply for arbitration during the verification round. If the SESP is indeed dishonest, the blockchain will punish SESP financially and compensate DU.

Therefore, BFR-SE is more reliable than previous schemes.

6.1.3. Verifiability of Search Results

(1) Existence. BFR-SE uses Bloom filter to verify the existence of the search result. DO stores all the information corresponding to the keyword set in the indices into the Bloom filter. If the result returned by SESP is null, which means that there is no matching data for the search request, DU could verify it using the keywords searched for and Bloom Filter. If the verification result is 1, the keywords searched for having a high probability of existence. Although there is a specific false-positive rate, the research in Ref. [37] shows that the calculation method of this false-positive rate is as follows:

It can be seen that by setting the values of and , the false-positive rate can be reduced. For example, when , it can get a minimum false-positive rate of . Therefore, DUs can verify the existence of search results in BFR-SE.

(2) Integrity. For each row of indices in our scheme, DO uses his private key to sign the keywords and the MerkleRoot obtained from the data-related information as leaf nodes. Each row of the indices uploaded by DO contains the signature. DUs can use the public key of DO to verify whether the keywords and MerkleRoot are damaged. For the returned search results, DU can verify the data’s integrity according to whether the MerkleRoot can be constructed. Once the data-related information as leaf nodes are destroyed, or the SESP only returns partial results, the MerkleRoot cannot be constructed. Therefore, DUs can verify the integrity of search results in BFR-SE.

(3) Correctness. If the existence and integrity are verified, then DU can get the ciphertext of the keyword set of the search results. DU only needs to use this ciphertext and his search token as inputs and repeatedly execute the TEST function to verify the search results. Therefore, DUs can verify the correctness of search results in BFR-SE.

6.2. Security and Privacy Analysis of BFR-SE
6.2.1. Functional Comparison

We compared BFR-SE with previous verifiable searchable encryption and blockchain-based searchable encryption schemes from the following aspects: fairness, reliability, privacy protection, whether it supports multikeyword search, whether it is suitable for multi-user situation, whether it supports fine-grained access control for the search function, and practicability.

From the comparison in Table 2, comparing verifiable searchable encryption and previous blockchain-based searchable encryption schemes, the following conclusions can be drawn: (1)The former and earlier related studies did not consider the fairness of searchable encryption. With the emergence of blockchain, these blockchain-based schemes all meet the requirements of fairness(2)The former supports DUs to verify search results and has a certain degree of reliability, but because there is no subsequent sufficient punishment, the reliability is weak. However, the recent searchable encryption schemes based on blockchain have not been considered reliable(3)The former stores the indices and DU’s search records on SESP, which can better protect DU’s privacy on the premise that SESP is credible. The latter stores the indices and search records on blockchain and uses a deterministic encryption algorithm. Since the information on-chain is public, even if the keywords are encrypted, when the search records are large enough, it is not impossible to analyze some privacy of DU(4)The performance of the former depends on the capabilities of SESP and has high practicability. The latter is mostly based on low-performance blockchain platforms such as Bitcoin and Ethereum, and the design is not particularly perfect, so there are still problems in performance and security. In real enterprise applications, it does not have practicality

Compared with these two types of schemes, BFR-SE designs a relatively perfect reward and punishment mechanism with blockchain. If SESP is dishonest, it will pay the price. Therefore, our scheme has both fairness and reliability. The indices are still stored on SESP in our scheme, and the blockchain only plays the role of arbitration when there are disputes, which makes BFR-SE more efficient than the previous blockchain-based schemes. Our constructed ABSE is not a deterministic encryption algorithm. The random number makes the search token different, even for the same keyword set. Therefore, BFR-SE has strong privacy protection capabilities. We have extended the work of Ref. [21] to support a multikeyword search. The combination of attribute-based searchable encryption and blockchain makes BFR-SE meet multiuser scenarios in a distributed environment quickly and enables DO to have fine-grained access control on his sharing data. BFR-SE uses EOS blockchain, which is the current high-performance public chain. Moreover, it considers more practical scenarios, such as SESP and DU may be dishonest. So, our scheme has better practicality.

6.2.2. Storage Analysis

BFR-SE is a fair and reliable searchable encryption scheme based on the EOS blockchain. Since the storage resource on the blockchain is very valuable, and the acquisition of RAM in the EOS blockchain requires the user to mortgage the system token, it is necessary to analyze the size of the data stored on-chain.

In the beginning, we define some symbols. We set , to represent the bit length of an element in group and , respectively. Let be the bit length of an element in field , be the bit length of the signature, be the number of all attributes, be the number of DUs, |TOK| be the bit length of search token, |COMM| be the bit length of commitment, be the bit length of the result returned by SESP. Let be the size of the keyword set.

According to the experiment simulation in our scheme, we set bits, bits, bits, bits bits. The length of Account, SerialNum, blockheight, Deposit, and Coin are all 64 bits. The implementation of our Bloom filter refers to the C++ code on Github, which address is https://github.com/bbondy/bloom-filter-cpp.git. We set the length of the bloom filter to 20 KB and there are 5000 rows of ciphertext indices, so and the number of hash functions in Bloom Filter will be . Then, the false-positive will be . In our scheme, three operations that will interact with blockchain to store data in the smart contract, which are as follows: (1)Initialization

DO uploads the public system parameters and his public key for the signature to smart contract. The storage cost is as follows: (2)Registration

The information on-chain mainly includes registration-related information uploaded by DU and SESP. The DU registration includes the information submitted by DU and the private key distributed by DO. The SESP registration includes the account and the deposit. The specific storage overhead is as follows: (3)Search

The information on-chain mainly includes auxiliary information uploaded by DO, search requests initiated by DUs, and search results returned by SESP. The specific storage overhead is as follows:

The storage overhead of BFR-SE varies with the number of attributes is shown in Figure 4.

For simplicity, the figure only shows that the storage overhead varies with the number of attributes when there are only 10 DUs and 50 keywords. From the figure, we can see that the storage overhead is mainly spent in the search phase, while the initialization and registration phase are negligible. As the number of DUs and keywords grows, storage overhead will also increase linearly. However, since we only store some information necessary for verification on-chain, compared with other blockchain-based schemes [24, 2630] that store all index information on-chain, the storage overhead is reduced a lot.

Users can obtain the RAM in EOS by collateralizing system tokens, and the current price is 42EOS/MB. DO can purchase RAM according to the scale of his system. Unlike Ethereum transactions that need to consume ETH as gas, the tokens mortgaged when acquiring RAM in EOS can still be redeemed at the original price. Above all, BFR-SE is feasible and practical.

6.2.3. Performance Analysis

Before analyzing the performance, we define two primary operations’ computational cost: for bilinear pairing and for power exponent operation. Here we ignore the computational overhead of operations such as hash functions because they are very efficient than the above two. In our proposed scheme, the computational overhead of the primary operations is shown in Table 3.

There are many studies on the analysis mentioned above [3843], so we will not repeat them too much. Like storage resource, the computing resource on-chain is also very valuable. If the interaction with the blockchain is too frequent or the computational overhead is too large, it will have a terrible impact on system performance. So, we mainly focus on the execution time of BFR-SE on-chain.

We used 6 nodes to build an EOS private chain in a laboratory environment. The 6 nodes we chose were all MacBook Pro (2017) with Intel (R) Core (TM) i5 CPU that clocks at 3.1 GHz and has 8 GB of RAM. The version of the EOS blockchain we chose is v2.0.7. The computational overhead of other blockchain-based schemes [24, 26, 27, 30] is all second level, which is obviously not better than ours, and will not be analyzed anymore. We compared BFR-SE with the scheme in Ref. [25], as shown in Figure 5.

In BFR-SE, most of the interactions with the blockchain are to upload data to smart contract, such as initialization and registration. The computational overhead of this part can be ignored. The main computational overhead of BFR-SE occurs when the blockchain arbitrates. It can be seen from the figure that as the number of indices to increase, the computational overhead on-chain of BFR-SE has always remained at a stable level, about 40 ms, while the scheme in Ref. [25] will grow. It is because, in our scheme, all the time-consuming operations are executed off-chain. The EOS block producers’ configuration in the Mainnet is much higher than that of the MacBook used in our simulation environment. When our contracts are deployed on the Mainnet, the performance will be even more outstanding. The EOS blockchain generates a block in 0.5 seconds, and the transaction will be confirmed soon after execution. Therefore, compared with Ref. [25], BFR-SE has a better performance.

7. Conclusion

To achieve a fair and reliable searchable encryption scheme, we constructed an attribute-based searchable encryption ABSE that supports multiple keywords search and designed an exclusive reward and punishment mechanism by using blockchain. In our scheme, DO sends the ciphertext of indices to SESP and uploads the auxiliary information to the blockchain. SESP must return the correct search results before a preagreed block height, and the charge fee paid from DU will be frozen for a period during which DU could initiate an arbitration request to the blockchain if he disagrees with the results. As the cornerstone of trust, blockchain will punish the dishonest party economically, ensuring the scheme’s absolute fairness and reliability [4453]. Besides, ABSE can be used by DO to have fine-grained access control on the search function. Experiments and analyses show that our scheme is feasible and has better performance. However, our scheme still has many shortcomings. For example, our scheme uses an index structure, and the signature guarantees the integrity of the indices, but this significantly reduces the flexibility of the scheme, especially when adding or updating the index of sharing data. Simultaneously, due to an attribute-based encryption algorithm, topics such as the revocation or update of permission are also one of the directions that need to be studied in the future. We will continue to refine our approach in conjunction with some other research [37, 5457].

Data Availability

The raw/processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the National Natural Science Fundamental of China under Grants 61272519, 61170297, 61472258, and 61802094, China Postdoctoral Science Foundation under Grant 2021M690305, and National Natural Science Foundation of Zhejiang Province under Grant LY20F020012.