Abstract

Provable data possession (PDP) is a crucial means of protecting the integrity of data in cloud storage. In the post-quantum setting, existing lattice-based PDP schemes rely too heavily on a third-party auditor (TPA), which is not entirely trustworthy and constitutes a single point of failure. Moreover, these schemes often leak private user data while attempting to satisfy the demand for public verification. In response to these problems, this paper designs a post-quantum privacy-preserving PDP protocol and uses it to develop a scheme based on smart contracts. The proposed scheme is post quantum and satisfies the demand for public verification while preserving user privacy. The noninteractive nature of the protocol reduces the transaction fees incurred by frequent operations on the blockchain, and the smart contract with a deposit mechanism ensures fair payments to all parties. The results of a theoretical analysis and experiments show that the proposed scheme is secure and efficient.

1. Introduction

Cloud storage has emerged as the crucial infrastructure for data storage with the rapid development of cloud computing, big data, the Internet of Things, the mobile Internet, and artificial intelligence in recent years. It can deliver services on demand with just-in-time capacity and costs and eliminates the need for users to buy and manage their own infrastructure for data storage. This provides users with agile, global, and durable—“anytime, anywhere”—access to data [1].

While cloud storage is widely used, it poses new security threats because users lose physical control over their data [2]. An untrusted cloud storage provider (CSP) may reclaim storage space for economic reasons by deleting infrequently accessed data, or may even conceal data-loss events to avoid damage to its reputation. In addition, server failures, power outages, and security attacks occasionally occur and can seriously compromise the integrity and availability of data in the cloud. Owing to security incidents such as cloud storage service downtime and the leakage of private data [3, 4], the integrity of data in cloud storage is a major concern for users and has greatly restricted the development of this technology.

Researchers have proposed efficient verification schemes to protect the integrity of data in cloud storage. Data integrity can be verified by computing over a small number of data blocks without downloading all the data in the cloud. Provable data possession (PDP) is the most common scheme for verifying data integrity. The user divides the uploaded data into blocks of equal size and then calculates a tag for each block. The data blocks and tags are then sent to the CSP. When data integrity needs to be verified, the user randomly selects a few data blocks, asks the CSP to calculate a short proof based on them and their tags, and then verifies the returned proof to determine whether the CSP stores all of the user's data. Ateniese et al. [5] proposed a PDP scheme that uses RSA-based homomorphic verifiable tags to aggregate signatures, thus reducing the computational burden and I/O accesses and significantly improving the efficiency of verification. Wang et al. [6] used the BLS signature scheme [7] to design a scheme for verifying data integrity. They constructed a Merkle hash tree (MHT) that supports dynamic data operations, and the security of the system depends on the computational Diffie–Hellman (CDH) problem. Curtmola et al. [8] extended PDP to the case of replicas and proposed multiple-replica provable data possession (MR-PDP), which reduces the verification overhead for all replicas to roughly that of a single replica.

Public verification is essential for users with limited computing resources and capabilities: they can designate external auditors to check the integrity of data in cloud storage when needed. A variant of PDP has been proposed [5] to support public verification. It allows anyone (not just the data owner) to challenge the CSP and check data possession. Wang et al. [9] introduced a third-party auditor (TPA) that, instead of the users, regularly verifies the integrity of remote data without storing all the data. Once verification fails, it can immediately notify the user that the data are damaged so that timely measures can be taken to avoid larger losses. Moreover, the audit results submitted by the TPA provide some objectivity and fairness, which is conducive to accountability in disputes between users and the CSP. Therefore, most subsequent PDP schemes [10–15] tend to include a TPA.

However, we cannot base data security on the assumption of a fully trustworthy TPA. A malicious TPA may corrupt and leak user data or deceive users with fake audits. Some researchers have used the blockchain instead of the TPA in the PDP scheme. Because the blockchain is open, transparent, and tamper resistant, it provides users with a completely trusted third-party audit, and users can supervise the entire process. Zhang et al. [16] proposed a blockchain-based fair payment framework for outsourcing services in cloud computing and used it to build a blockchain-based PDP scheme. Chen et al. [17] proposed a decentralized blockchain-based PDP scheme that uses a rank-based MHT to support the dynamic operation of outsourced data, uses the blockchain to record all transaction behaviors, and deploys smart contracts combined with the deposit mechanism to ensure the automatic execution of the protocol and fair payments. Chen et al. [18] proposed the first decentralized system for proofs of data retrievability and replication—BOSSA, which is incentive-compatible for each party and realizes automated auditing by using smart contracts on the Ethereum blockchain. Wang et al. [19] proposed the concept of noninteractive public provable data possession and used it to design a blockchain-based smart contract for the public auditing of cloud storage. This scheme is based on bilinear pairing, and its security relies on the CDH problem and the discrete logarithm problem.

Since public verification allows people other than users to verify data integrity, it is necessary to ensure that the audit process preserves user privacy such that the auditors cannot obtain any original user data from the information they collect. Encrypting data before uploading them is one way to protect user privacy, but it can serve only as a supplement and does not by itself prevent data from leaking if appropriate protocols are not designed. Wang et al. [9] used random masking technology to solve the problem of privacy leakage: two random parameters are introduced into the proof generation process to prevent the adversary from obtaining private data by solving linear equations built from the proof contents. References [20–22] used the same idea to protect user privacy. Tong et al. [23] used private information retrieval and a tag repacking technique to enable the TPA to check the integrity of data without violating the privacy of the user data or the query pattern.
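To make the underlying intuition concrete in generic notation (illustrative only, not the notation of any particular scheme): without masking, each audit response is a publicly weighted combination of data blocks, and a curious auditor who collects enough responses obtains a solvable linear system; a fresh random mask in every response keeps that system underdetermined.

\mu^{(k)} = \sum_{i \in I^{(k)}} \nu_i^{(k)} m_i \quad \text{(unmasked: solvable for the } m_i \text{ after enough challenges)},
\qquad
\mu^{(k)} = r^{(k)} + \sum_{i \in I^{(k)}} \nu_i^{(k)} m_i \quad \text{(masked: each challenge introduces a new unknown } r^{(k)}\text{)}.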

With the development of quantum computing [24] in recent years, the hard problems of traditional cryptography, such as integer factorization and the discrete logarithm, can be solved by quantum computers in polynomial time. The security of all information systems designed on these problems is thus seriously affected, and PDP schemes that resist quantum attacks are needed. Lattice cryptography has the advantages of strong security proofs based on worst-case hardness, simple constructions, and easy implementation [25]. It offers fast computation and low communication overhead and can be used to construct various cryptographic algorithms and applications. It is thus considered the most efficient and promising direction in post-quantum cryptography. Zhang and Xu [26] proposed a post-quantum secure cloud storage system that supports privacy preservation and public auditing. Its security is based on the hardness of the inhomogeneous small integer solution (ISIS) problem on lattices. Lattice signatures generated by a preimage-sampleable function are used for random masking to ensure that the TPA cannot acquire knowledge about the data. Yang et al. [27] proposed an identity-based model for auditing data integrity that does not rely on a public key infrastructure, and its security is based on ring learning with errors (RLWE). Tan et al. [28] proposed a lattice-based provable data integrity scheme for cloud storage that uses a homomorphic signature technique on lattices [29] and supports dynamic data operation and batch verification. Building on [28], Tan et al. [30] used random masking technology to achieve privacy preservation such that the TPA cannot obtain the original data.

In the context of the post-quantum era, the PDP scheme based on lattice cryptography relies too heavily on the TPA, which is not entirely trustworthy and is easily affected by a single point of failure. Moreover, this scheme often leads to the leakage of private user data in the quest to satisfy the demand for public verification. In response to these problems, this paper designs a post-quantum privacy-preserving PDP protocol and uses it to propose a privacy-preserving scheme based on smart contracts. The proposed scheme is post quantum and can be used for public verification while ensuring the preservation of private user data. The characteristic of noninteraction of the protocol can reduce transaction fees incurred by the frequent operation of the blockchain, and the smart contract with a deposit mechanism can ensure fair payments to all parties.

The main contributions of this research can be summarized as follows.
(1) We design a post-quantum privacy-preserving PDP protocol. It uses linearly homomorphic signatures on lattices to calculate the signatures of different data blocks and generate a short proof. Data possession can be verified through this proof. The security of the scheme depends on the hardness of the inhomogeneous small integer solution problem on lattices, thereby ensuring resistance against quantum computing-based attacks. The protocol also uses random masking technology to add random variables to the generated proof so that the verifier cannot obtain the user's data by solving a linear system of equations composed of multiple pieces of proof, thereby preserving user privacy.
(2) We propose a post-quantum privacy-preserving PDP scheme based on smart contracts. The audit process is open and transparent and is regularly and automatically verified by deploying smart contracts on the blockchain. The smart contract stipulates the rights and obligations of all parties involved. Once the conditions have been triggered, the contract is automatically executed to ensure fair payments to all parties, which reduces the cost of handling disputes. In addition, transaction costs decrease significantly because the participants do not need to interact in the "challenge–response" phase, which would otherwise require frequent interactions on the blockchain, and this increases the verifier's enthusiasm for executing smart contracts.
(3) We analyze the correctness and privacy-preserving capability of the proposed scheme and prove that it is secure under the random oracle model through the interactive analysis of a series of games. Finally, the practicability and efficiency of the proposed scheme are verified through comparative experiments with prevalent methods in the area.

The rest of the paper is organized as follows. Section 2 recalls some preliminaries used in our scheme. Section 3 defines the model of our scheme, gives the formal definition, and presents the concrete construction. Section 4 provides the security proof of the protocol. Section 5 evaluates the performance of our scheme. Finally, we give a conclusion in Section 6.

2. Preliminary

2.1. Provable Data Possession

To reduce storage overhead, users store their original data in the cloud instead of locally. The CSP therefore needs to periodically prove possession of the data to assure users of their integrity. The TPA is introduced into the PDP scheme to verify the integrity of the data on behalf of users and reduce their computation and communication overheads. During the verification process, the CSP responds to the challenge initiated by the TPA and generates a proof for it, and the result of verification is returned to the user.

According to the basic definition of PDP, a general system model of it is shown in Figure 1. The participating entities include the users, CSP, and TPA.

The PDP scheme consists of five algorithms: key generation, tag generation, challenge generation, proof generation, and proof verification. The description of each follows.
(1) KeyGen: the key generation algorithm is run by the user. The input is the system security parameter, and the output is a public–private key pair.
(2) TagGen: the tag generation algorithm is run by the user. The inputs are the public–private key pair and a data block, and the output is the signature of the block. The user sends the data blocks and signatures to the CSP.
(3) ChalGen: the challenge generation algorithm is run by the TPA. The inputs are the identification information, a random number, and a challenge set, and the output is the challenge.
(4) ProofGen: the proof generation algorithm is run by the CSP. The inputs consist of the data blocks, the signatures, and the challenge, and the output is the proof. The CSP aggregates the challenged data blocks and their signatures using homomorphic aggregation technology and returns the result of aggregation to the TPA.
(5) ProofVerify: the proof verification algorithm is run by the TPA. The input is the proof. If the proof is verified, the output is true; otherwise, the output is false.
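As a purely illustrative interface sketch (declarations only; type names and fields are placeholders introduced here, not the notation of any concrete scheme), the five algorithms can be summarized in C++, the language also used for the experiments in Section 5:

#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Placeholder types; a concrete PDP scheme fixes their contents.
struct PublicKey  { /* scheme-dependent */ };
struct PrivateKey { /* scheme-dependent */ };
struct Block      { std::vector<uint8_t> data; };
struct Tag        { std::vector<uint8_t> bytes; };           // homomorphic verifiable tag
struct Challenge  { std::vector<size_t>   indices;            // sampled block positions
                    std::vector<uint64_t> coefficients; };    // random weights
struct Proof      { std::vector<uint8_t> aggregatedData;      // aggregation of challenged blocks
                    std::vector<uint8_t> aggregatedTags; };   // aggregation of their tags

// (1) KeyGen: run by the user; maps a security parameter to a key pair.
std::pair<PublicKey, PrivateKey> KeyGen(unsigned securityParameter);

// (2) TagGen: run by the user; one tag per data block, sent to the CSP with the block.
Tag TagGen(const PublicKey& pk, const PrivateKey& sk, const Block& block);

// (3) ChalGen: run by the TPA; picks a small random subset of block indices.
Challenge ChalGen(const std::vector<uint8_t>& fileId, uint64_t randomness, size_t numBlocks);

// (4) ProofGen: run by the CSP; aggregates the challenged blocks and tags into a short proof.
Proof ProofGen(const std::vector<Block>& blocks, const std::vector<Tag>& tags, const Challenge& chal);

// (5) ProofVerify: run by the TPA; accepts or rejects the proof.
bool ProofVerify(const PublicKey& pk, const Challenge& chal, const Proof& proof);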

The above algorithms constitute the general PDP scheme. To satisfy the requirements of different application scenarios, researchers have improved and supplemented this scheme.

2.2. Lattice-Based Homomorphic Signature

Notation 1. For any integer $q \ge 2$, we let $\mathbb{Z}_q$ denote the ring of integers modulo $q$. If $q$ is a prime number, $\mathbb{Z}_q$ is a field and is denoted by $\mathbb{F}_q$. We let $\mathbb{Z}_q^{n \times m}$ denote the set of $n \times m$ matrices with entries in $\mathbb{Z}_q$. We use standard big-$O$ notation to classify the growth of functions and say that $f(n) = \tilde{O}(g(n))$ if $f(n) = O(g(n) \cdot \log^c n)$ for some fixed constant $c$; we let $\mathrm{poly}(n)$ denote an unspecified function $f(n) = O(n^c)$ for some constant $c$. A negligible function, denoted generically by $\mathrm{negl}(n)$, is an $f(n)$ such that $f(n) = o(n^{-c})$ for every fixed constant $c$. We say that a probability is overwhelming if it is $1 - \mathrm{negl}(n)$. The base-2 logarithm is denoted $\log$.

Definition 2 (lattice). An $m$-dimensional lattice of rank $k$ is $\Lambda = \{Bc : c \in \mathbb{Z}^k\}$, where the columns of the basis $B = \{b_1, \dots, b_k\} \subset \mathbb{R}^m$ are linearly independent. For a set of vectors $S = \{s_1, \dots, s_k\}$, $\|S\|$ denotes the length of the longest vector in $S$, i.e., $\max_i \|s_i\|$, and $\tilde{S} = \{\tilde{s}_1, \dots, \tilde{s}_k\}$ denotes the Gram–Schmidt orthogonalization of the vectors $s_1, \dots, s_k$.

For any integer $q \ge 2$ and any matrix $A \in \mathbb{Z}_q^{n \times m}$, we define
$$\Lambda^{\perp}(A) = \{e \in \mathbb{Z}^m : Ae = 0 \bmod q\}, \qquad \Lambda_u^{\perp}(A) = \{e \in \mathbb{Z}^m : Ae = u \bmod q\}.$$

The lattice $\Lambda_u^{\perp}(A)$ is a coset of $\Lambda^{\perp}(A)$; namely, $\Lambda_u^{\perp}(A) = t + \Lambda^{\perp}(A)$ for any $t$ such that $At = u \bmod q$.

Lemma 3 ([31], Lemma 7.1). Let $\Lambda$ be an $m$-dimensional lattice. There is a deterministic polynomial-time algorithm that, given an arbitrary basis of $\Lambda$ and a full-rank set $S = \{s_1, \dots, s_m\}$ in $\Lambda$, returns a basis $T$ of $\Lambda$ satisfying $\|\tilde{t}_i\| \le \|\tilde{s}_i\|$ for all $i$.

Theorem 4 ([32], Theorem 3.2). Let $n$, $m$, and $q$ be positive integers with $q \ge 2$ and $m \ge 6n \log q$. There is a probabilistic polynomial-time algorithm $\mathrm{TrapGen}(q, n, m)$ that outputs a pair $(A, T)$ such that $A$ is statistically close to uniform in $\mathbb{Z}_q^{n \times m}$ and $T$ is a basis for $\Lambda^{\perp}(A)$ that satisfies $\|\tilde{T}\| \le O(\sqrt{n \log q})$ with all but negligible probability in $n$.

Definition 5 (Gaussian distributions). Let $\Lambda$ be a subset of $\mathbb{Z}^m$. For any vector $c \in \mathbb{R}^m$ and any positive parameter $s > 0$, let $\rho_{s,c}(x) = \exp(-\pi \|x - c\|^2 / s^2)$ be a Gaussian function on $\mathbb{R}^m$ with center $c$ and parameter $s$. Let $\rho_{s,c}(\Lambda) = \sum_{x \in \Lambda} \rho_{s,c}(x)$ be the discrete integral of $\rho_{s,c}$ over $\Lambda$, and let $D_{\Lambda,s,c}$ be the discrete Gaussian distribution over $\Lambda$ with center $c$ and parameter $s$. For all $y \in \Lambda$, $D_{\Lambda,s,c}(y) = \rho_{s,c}(y)/\rho_{s,c}(\Lambda)$. For notational convenience, $\rho_{s,0}$ and $D_{\Lambda,s,0}$ are abbreviated as $\rho_s$ and $D_{\Lambda,s}$, respectively.

Gentry et al. [33] constructed algorithms to sample from discrete Gaussian distributions.

Theorem 6.
(a) There is a probabilistic polynomial-time algorithm SampleGaussian that, given a basis $B$ of an $m$-dimensional lattice $\Lambda$, a parameter $s \ge \|\tilde{B}\| \cdot \omega(\sqrt{\log m})$, and a center $c \in \mathbb{R}^m$, outputs a sample from a distribution that is statistically close to $D_{\Lambda,s,c}$.
(b) There is a probabilistic polynomial-time algorithm SamplePre that, given a basis $T$ of the lattice $\Lambda^{\perp}(A)$ for $A \in \mathbb{Z}_q^{n \times m}$, a parameter $s \ge \|\tilde{T}\| \cdot \omega(\sqrt{\log m})$, and a vector $u \in \mathbb{Z}_q^n$, outputs a sample from a distribution that is statistically close to $D_{\Lambda_u^{\perp}(A),s}$.

Definition 7 ([34], Definition 3.1; smoothing parameter). For an $m$-dimensional lattice $\Lambda$ and positive real $\epsilon > 0$, the smoothing parameter $\eta_\epsilon(\Lambda)$ is defined to be the smallest positive $s$ such that $\rho_{1/s}(\Lambda^* \setminus \{0\}) \le \epsilon$, where $\Lambda^*$ is the dual lattice of $\Lambda$.

The critical property of the smoothing parameter is that if $s \ge \eta_\epsilon(\Lambda)$, then every coset of $\Lambda$ has roughly equal mass under the distribution $\rho_s$.

Lemma 8 ([34], Lemma 4.4). Let $\Lambda$ be an $m$-dimensional lattice with basis $B$, and suppose $s \ge \|\tilde{B}\| \cdot \omega(\sqrt{\log m})$. Then, for any $c \in \mathbb{R}^m$, we have $\Pr_{x \sim D_{\Lambda,s,c}}[\|x - c\| > s\sqrt{m}] \le \mathrm{negl}(m)$.

Let $\Lambda$ and $\Lambda'$ be two lattices in $\mathbb{Z}^m$ such that $\Lambda' \subseteq \Lambda$. We show by the following lemma that a discrete Gaussian sample from $\Lambda$ is close to uniform modulo $\Lambda'$.

Lemma 9 ([29], Lemma 3.9). Let $\Lambda$ and $\Lambda'$ be $m$-dimensional lattices such that $\Lambda' \subseteq \Lambda$. For any $\epsilon \in (0, 1/2)$, any $s \ge \eta_\epsilon(\Lambda')$, and any $c \in \mathbb{R}^m$, the distribution $D_{\Lambda,s,c} \bmod \Lambda'$ is within statistical distance at most $2\epsilon$ of the uniform distribution over $\Lambda/\Lambda'$.

Definition 10. The inhomogeneous small integer solution (ISIS) problem $\mathrm{ISIS}_{q,m,\beta}$ is as follows: given a matrix $A \in \mathbb{Z}_q^{n \times m}$, a vector $y \in \mathbb{Z}_q^n$, and a real $\beta$, find a nonzero integer vector $e \in \mathbb{Z}^m$ such that $Ae = y \bmod q$ and $\|e\| \le \beta$.

Lemma 11 ([33], Proposition 5.7). For any poly-bounded $m$, $\beta = \mathrm{poly}(n)$, and any prime $q \ge \beta \cdot \omega(\sqrt{n \log n})$, the average-case problem $\mathrm{ISIS}_{q,m,\beta}$ is as hard as approximating the SIVP problem on $n$-dimensional lattices in the worst case.

2.3. Blockchain and Smart Contracts

The blockchain is managed by a peer-to-peer network for maintaining a secure and decentralized distributed ledger of transactions, where nodes collectively adhere to a protocol to communicate and validate new blocks. It is best known for its vital role in cryptocurrency systems such as Bitcoin [35]. The innovation of the blockchain guarantees the fidelity and security of data records and generates trust without the need for a trusted third party.

The difference between the blockchain and other databases lies mainly in how the data are organized. The data are stored in blocks, which are generated in chronological order and connected into a chain. This data structure forms an immutable data timeline, with each block in the chain carrying a precise timestamp. As shown in Figure 2, each block contains a block header and a block body. The block header contains the hash value of the previous block, a version number, a random value (nonce), a timestamp, a Merkle root hash, and a difficulty value. The block body contains all transactions generated during block creation. Each block in the blockchain is identified by a hash value obtained by applying the SHA-256 algorithm twice to the block header, and each block can locate its predecessor through the previous-block hash contained in its header. Any change to the data in a block leads to a series of changes in all subsequent blocks, and the distributed nodes run a consensus protocol to synchronously update the hash chain. Therefore, the blockchain has the characteristics of decentralization, transparency, openness, immutability, and traceability.
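As an illustration of the double-hash block identifier, a minimal C++ sketch follows; the header layout and its serialization are simplified placeholders, and OpenSSL's SHA256 is used for the hash.

#include <openssl/sha.h>
#include <array>
#include <cstdint>
#include <cstring>

// Simplified block header; real chains use a fixed binary layout.
struct BlockHeader {
    std::array<uint8_t, 32> prevBlockHash;  // hash of the previous block
    uint32_t version;
    uint32_t nonce;                          // random value
    uint32_t timestamp;
    std::array<uint8_t, 32> merkleRoot;
    uint32_t difficulty;
};

// Block identifier = SHA-256 applied twice to the serialized header.
std::array<uint8_t, 32> blockId(const BlockHeader& h) {
    uint8_t buf[sizeof(BlockHeader)];
    std::memcpy(buf, &h, sizeof(BlockHeader));   // naive serialization for illustration
    std::array<uint8_t, 32> once{}, twice{};
    SHA256(buf, sizeof(buf), once.data());       // first hash of the header
    SHA256(once.data(), once.size(), twice.data());  // hash of the hash
    return twice;
}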

The smart contract [36] is a piece of code on the blockchain whose logic defines the contents of the contract. Smart contracts operate under a set of conditions agreed upon by all parties involved; when these conditions are triggered, the contents of the contract are automatically executed. Say that a tenant wants to use a smart contract to lease an apartment from a landlord. First, the tenant and the landlord negotiate the details of the contract, such as the duration of the lease, the rent, the deposit, and the terms of compensation. They then write the content agreed by both parties into the smart contract and deploy it on the blockchain. The contract takes effect within the specified time and is automatically executed according to its contents, which reduces the costs of notarization, mediation, and litigation in case disputes arise.

We can say that the blockchain provides a trusted execution environment for smart contracts, which in turn extends the application of the blockchain. Smart contracts have been used in many fields, such as electronic voting [37] and insurance [38], and have excellent prospects for further use.

3. Our Scheme

3.1. System Model

The system model of the post-quantum privacy-preserving provable data possession scheme based on smart contracts contains three entities: the data owner, the cloud storage provider, and the verifier (see Figure 3).
(1) Data owner (DO): to save the cost of local storage and use data conveniently and flexibly, cloud storage users pay a certain fee and store their data on a remote server owned by the cloud storage provider.
(2) Cloud storage provider (CSP): it provides computing, storage, and network bandwidth for the DO. The CSP needs to periodically perform PDP to prove that it maintains the integrity of the data. If verification fails, it pays the DO a certain fee by way of compensation.
(3) Verifier: this is the executor of the proof verification algorithm. Because the scheme is publicly verifiable, all members of the blockchain can in principle act as verifiers; in practice they are usually third-party miners. Verifiers receive a certain reward for executing the smart contracts deployed on the blockchain.

3.2. Formal Definition

Our scheme contains four algorithms: KeyGen, TagGen, ProofGen, and ProofVerify. Each is formally defined below.
(1) KeyGen: the key generation algorithm is run by the DO. The inputs are the security parameter and the number of data blocks, and the output is a public–private key pair (pk, sk).
(2) TagGen: the tag generation algorithm is run by the DO. The inputs consist of the public key pk, the private key sk, and the file. The outputs are a set of block signatures and a set of block indices.
(3) ProofGen: the proof generation algorithm is run by the CSP. The inputs are the file, the set of block signatures, the set of block indices, the hash value of the previous block, and the timestamp. The output is the proof.
(4) ProofVerify: the proof verification algorithm is run by the verifier. The inputs are the proof and the public key pk. If the proof is verified, the output is SUCCESS; otherwise, the output is FALSE.

3.3. Scheme Implementation

We first design a post-quantum privacy-preserving PDP protocol and then combine it with smart contract technology to develop our scheme.

3.3.1. Post-Quantum Privacy-Preserving PDP Protocol

We set a security parameter and a maximum file block size. Two hash functions (modeled as random oracles) are used as well.

KeyGen
(1) Choose two primes $q$ and $p$ with the required relation between them, and define the associated parameters.
(2) Set $m$. Use the trapdoor generation algorithm of Theorem 4 to generate a matrix $A \in \mathbb{Z}_q^{n \times m}$, which determines the lattice $\Lambda^{\perp}(A)$ and a short basis $T$ of $\Lambda^{\perp}(A)$. Define the derived matrix and lattice, and note that $T$ is also a basis of that lattice.
(3) Output the public key and the private key.
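Concretely, by Theorem 4 the key material has the following general shape (the scheme-specific parameter choices elided above are not reproduced here):

A \in \mathbb{Z}_q^{n \times m}\ \text{(public key, statistically close to uniform)}, \qquad T \subset \mathbb{Z}^{m \times m}\ \text{(private key: a short basis of } \Lambda^{\perp}(A) \text{ with small } \|\tilde{T}\|).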

TagGen
(1) Given the file, compute its fingerprint value as the hash of the file.
(2) Compute the index of each block from the fingerprint and the block number.
(3) Compute the target vector associated with each block index.
(4) Compute the signature of each block by preimage sampling with the private short basis.
(5) Output the set of signatures and the set of block indices.
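A minimal sketch of the tagging flow is given below, assuming SHA-256 for the hashes; samplePre stands in for the preimage sampling algorithm of Theorem 6 and is only declared here, since its internals (trapdoor basis, Gaussian parameter, target-vector mapping) are outside the scope of this sketch.

#include <openssl/sha.h>
#include <cstdint>
#include <vector>

using Vector = std::vector<long>;                      // a lattice vector (placeholder type)

// Placeholder for GPV-style preimage sampling with the private short basis (Theorem 6);
// the real algorithm also takes the basis and a Gaussian parameter.
Vector samplePre(const std::vector<uint8_t>& blockIndex);

struct TagGenOutput {
    std::vector<Vector> signatures;                    // one short lattice signature per block
    std::vector<std::vector<uint8_t>> indices;         // one hash-derived index per block
};

TagGenOutput tagGen(const std::vector<std::vector<uint8_t>>& blocks) {
    // (1) File fingerprint: hash of the whole file (concatenation of its blocks).
    std::vector<uint8_t> file;
    for (const auto& b : blocks) file.insert(file.end(), b.begin(), b.end());
    std::vector<uint8_t> fingerprint(SHA256_DIGEST_LENGTH);
    SHA256(file.data(), file.size(), fingerprint.data());

    TagGenOutput out;
    for (uint64_t i = 0; i < blocks.size(); ++i) {
        // (2) Per-block index derived from the fingerprint and the block number.
        std::vector<uint8_t> material = fingerprint;
        for (int k = 0; k < 8; ++k) material.push_back(uint8_t(i >> (8 * k)));
        std::vector<uint8_t> index(SHA256_DIGEST_LENGTH);
        SHA256(material.data(), material.size(), index.data());
        // (3)-(4) The index determines the target vector, which is signed by preimage sampling.
        out.indices.push_back(index);
        out.signatures.push_back(samplePre(index));
    }
    return out;                                        // (5) sent to the CSP together with the blocks
}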

ProofGen
(1) The CSP reads the header of the newly generated block to obtain the hash value of the previous block and the corresponding timestamp. Because the blockchain uses SHA-256, this hash value is 256 bits long.
(2) Define the challenge set from the hash value and the timestamp; it consists of the challenged block positions and the corresponding coefficients.
(3) Select two random vectors and use them to mask the values aggregated over the challenged blocks and signatures.
(4) Output the proof.
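The following sketch shows how such a proof can be assembled without any interaction: the challenge positions and coefficients are derived deterministically from the previous-block hash and the timestamp, so anyone can recompute them, and two random vectors mask the aggregates. The derivation function and the exact placement of the masks are assumptions made only for illustration, not the protocol's own formulas; r1 is assumed to have the dimension of a data block and r2 that of a signature.

#include <openssl/sha.h>
#include <array>
#include <cstdint>
#include <utility>
#include <vector>

using Vector = std::vector<long>;

struct Proof {
    Vector maskedBlocks;                 // masked aggregation of challenged data blocks
    Vector maskedSignatures;             // masked aggregation of challenged signatures
    std::array<uint8_t, 32> prevHash;    // hash value of the previous block
    uint64_t timestamp;                  // timestamp of that block
};

// Derive the c-th (block position, coefficient) pair from the previous-block hash and
// timestamp; deterministic and publicly recomputable, hence no challenge message is sent.
std::pair<size_t, long> deriveChallenge(const std::array<uint8_t, 32>& prevHash,
                                        uint64_t timestamp, uint64_t c, size_t numBlocks) {
    std::vector<uint8_t> material(prevHash.begin(), prevHash.end());
    for (int k = 0; k < 8; ++k) material.push_back(uint8_t(timestamp >> (8 * k)));
    for (int k = 0; k < 8; ++k) material.push_back(uint8_t(c >> (8 * k)));
    std::array<uint8_t, 32> d{};
    SHA256(material.data(), material.size(), d.data());
    uint64_t a = 0, b = 0;
    for (int k = 0; k < 8; ++k) { a = (a << 8) | d[k]; b = (b << 8) | d[8 + k]; }
    return { size_t(a % numBlocks), long(b % 1024) };   // block position and small coefficient
}

// r1 and r2 are the two random masking vectors chosen by the CSP for this proof.
Proof proofGen(const std::vector<Vector>& blocks, const std::vector<Vector>& signatures,
               const std::array<uint8_t, 32>& prevHash, uint64_t timestamp,
               uint64_t challengeSize, const Vector& r1, const Vector& r2) {
    Proof p{ r1, r2, prevHash, timestamp };             // start from the masks
    for (uint64_t c = 0; c < challengeSize; ++c) {
        auto [pos, coeff] = deriveChallenge(prevHash, timestamp, c, blocks.size());
        for (size_t j = 0; j < p.maskedBlocks.size(); ++j)
            p.maskedBlocks[j] += coeff * blocks[pos][j];          // + nu_i * m_i
        for (size_t j = 0; j < p.maskedSignatures.size(); ++j)
            p.maskedSignatures[j] += coeff * signatures[pos][j];  // + nu_i * sigma_i
    }
    return p;
}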

ProofVerify
(1) Verify the authenticity of the timestamp and the hash value of the previous block against the blockchain. Return FALSE if this fails, and continue if it succeeds.
(2) Verify the three conditions (a), (b), and (c) of the protocol (a bound on the norm of the aggregated signature and two verification equations; see the correctness proof in Section 4.1). If all conditions are met, output SUCCESS; otherwise, output FALSE.

3.3.2. Post-Quantum Privacy-Preserving PDP Scheme Based on Smart Contracts

In a traditional cloud storage system, the DO needs to pay the CSP to purchase storage space. If the DO's data become unavailable or are tampered with, it is difficult to protect the DO's rights and obtain economic compensation. Moreover, the laws and regulations on data security in various countries are not necessarily complete, especially in transnational disputes, and legal action often incurs additional costs in terms of money and time.

The emergence of smart contracts has helped solve the above problems. As smart contracts are immutable and automatically triggered, once all parties have agreed to the contents of a contract, it is executed immediately when its conditions are met, and no one can then change its contents.

We propose a post-quantum privacy-preserving provable data possession scheme based on smart contracts for cloud storage systems. We provide the parties with tamper-resistant integrity verification through smart contracts deployed on the blockchain, combined with a deposit mechanism to guarantee fair payment. To protect against dishonest verifiers and ensure that verifiers execute smart contracts correctly, a consensus mechanism is needed; this is not discussed in detail in this article.

The flow logic of the scheme is shown in Figure 4 and is described below. The smart contracts used are shown in Table 1.
(1) The DO, CSP, and verifier register on the blockchain to obtain public–private key pairs and account addresses. The public–private key pairs are used for signing and verification on the blockchain, and the account addresses are generated from the public keys to identify the parties to the transactions. When the scheme is applied to multiple users, different DOs produce their own key pairs and addresses, by which they are distinguished from each other and generate different smart contracts. The DO and CSP need to pay a certain deposit to ensure the smooth completion of subsequent transactions.
(2) The DO runs the algorithm KeyGen and outputs a public–private key pair (pk, sk). The algorithm TagGen is then run to output the block signatures and the block indices.
(3) The DO uploads the file, signatures, and indices to the CSP, generates the storage contract, and deploys it on the blockchain. The storage contract contains basic file information (name, hash, and upload time) and transaction-related information (storage fee, DO's address, CSP's address, and DO's signature). This contract ensures that the DO pays the CSP in time if the CSP stores all of the DO's files and the data pass proof verification.
(4) When the CSP receives the data, it generates the compensation contract and deploys it on the blockchain. The compensation contract contains basic file information (name, hash, and receive time) and transaction-related information (compensation fee, DO's address, CSP's address, and CSP's signature). This contract ensures that the CSP compensates the DO in time if it does not store the entire data set and integrity verification fails.
(5) The CSP periodically runs the algorithm ProofGen to generate the proof according to the hash value of the previous block and the corresponding timestamp of the latest generated block. It then generates the verification contract and deploys it on the blockchain. The verification contract contains basic file information (name, hash value, and proof), the related contracts (the storage contract and the compensation contract), and transaction-related information (verification fee, CSP's address, verifier's address, and CSP's signature). To obtain the reward, the verifier executes the verification contract and chooses to activate the storage contract or the compensation contract according to the result.
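For orientation only, the fields that each contract records, as listed above, can be summarized by the following structs; they are illustrative C++ mirrors of the contract contents, not the on-chain contracts themselves, which are written in the contract language of the underlying blockchain.

#include <cstdint>
#include <string>

struct StorageContract {        // deployed by the DO after uploading the file
    std::string fileName, fileHash, uploadTime;
    uint64_t    storageFee;     // paid to the CSP while verification keeps succeeding
    std::string doAddress, cspAddress, doSignature;
};

struct CompensationContract {   // deployed by the CSP after receiving the file
    std::string fileName, fileHash, receiveTime;
    uint64_t    compensationFee;   // paid to the DO if verification fails
    std::string doAddress, cspAddress, cspSignature;
};

struct VerificationContract {   // deployed by the CSP with each periodic proof
    std::string fileName, fileHash, proof;
    std::string storageContractId, compensationContractId;  // contracts it can activate
    uint64_t    verificationFee;   // reward for the verifier who executes it
    std::string cspAddress, verifierAddress, cspSignature;
};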

3.4. Brief Summary

We have developed a post-quantum privacy-preserving provable data possession scheme based on smart contracts that not only meets the essential requirements of correctness and security but also has the following characteristics.
(1) Post quantum: the scheme uses a linearly homomorphic signature on lattices based on the preimage sampling algorithm. Its security depends on the ISIS problem on lattices and enjoys a strong security proof based on worst-case hardness, so it can meet the requirement of protecting against quantum attacks.
(2) Public verification: the proof verification algorithm of the scheme is public and does not require the use of a private key. Any third-party auditor can reach a public conclusion on whether the CSP completely stores the DO's data, which liberates users from the arduous task of verification.
(3) Privacy preservation: random masking technology introduces random variables into ProofGen. The adversary cannot obtain the user's original data by solving the linear equations composed of different proofs, which satisfies the demand for privacy preservation.
(4) Noninteractive: the scheme uses the time-varying hash value of the previous block to generate the challenge set, so the parties do not need to interact in the traditional "challenge–response" phase. On the one hand, the DO and CSP do not need to stay online all the time, which makes the operation of the scheme more flexible; on the other hand, this reduces the transaction costs caused by interactions on the blockchain.
(5) Fair payment: the DO, CSP, and verifier trade according to the agreed smart contracts. The smart contracts are immutable and automatically executed, so fair payment can be guaranteed.

4. Proof of Security

4.1. Correctness

Theorem 12. Our post-quantum privacy-preserving PDP protocol proposed in Section 3.3 is correct with an overwhelming probability.

Proof. Assume a file with its block signatures, a challenge set, and the corresponding proof. We check the three conditions of ProofVerify separately.
(a) By Theorem 4 and Lemma 8, each block signature is short with overwhelming probability, and therefore so is the (masked) aggregation of the challenged signatures; the norm bound of ProofVerify thus holds with overwhelming probability.
(b) By the definition of the block signatures and of the aggregated values in ProofGen, the first verification equation holds.
(c) By the definition of the masking vectors and of the aggregated values in ProofGen, the second verification equation holds.
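At the level of the underlying linearly homomorphic signature, and ignoring the masking terms, the identity exploited in conditions (b) and (c) has the following generic shape (the symbols here are illustrative, not the protocol's own notation):

A\Big(\sum_{i \in I} \nu_i \sigma_i\Big) = \sum_{i \in I} \nu_i (A\sigma_i) = \sum_{i \in I} \nu_i u_i \pmod q,

where each $\sigma_i$ is a short preimage of the per-block target $u_i$. The aggregated signature therefore remains short (by Lemma 8) and verifies against a combination of targets that the verifier can recompute from public values.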

4.2. Soundness

We define the security of the scheme by formally describing a series of games between a challenger $\mathcal{C}$ and an adversary $\mathcal{A}$.

Setup: $\mathcal{C}$ runs the algorithm KeyGen to generate the public–private key pair (pk, sk), reserves sk for responding to $\mathcal{A}$'s queries, and then sends pk to $\mathcal{A}$.

Queries: $\mathcal{A}$ can execute adaptive signature queries by interacting with $\mathcal{C}$. $\mathcal{A}$ specifies a series of files to send to $\mathcal{C}$. $\mathcal{C}$ generates the corresponding set of signatures and set of block indices by running the algorithm TagGen and then sends them to $\mathcal{A}$.

Output: $\mathcal{A}$ outputs the proof based on the results of multiple queries. For a file with its set of signatures and set of block indices, the proof is output with respect to the given hash value of the previous block and the corresponding timestamp.

Definition 13. The advantage of an adversary $\mathcal{A}$ in the game is the probability that its output proof passes ProofVerify, and if this advantage is nonnegligible, we say that $\mathcal{A}$ wins the game.

Definition 14. The post-quantum privacy-preserving provable data possession protocol proposed in Section 3.3 is secure if there exists an efficient extraction algorithm Extr such that, whenever an adversary $\mathcal{A}$ wins the security game and outputs a proof of the file, the probability that Extr can recover the challenged blocks of the file from that proof is nonnegligible.

Theorem 15. If the algorithms used to generate the block signatures and block indices of the files are existentially unforgeable and the ISIS problem is hard for the chosen parameters, then, under the random oracle model, the probability that any adversary breaking the security of our scheme passes ProofVerify by using a proof not generated by ProofGen is negligible.

Proof. We demonstrate Theorem 15 through the interactive analysis of a series of games in which the restrictions on the adversary are gradually tightened.

Game-0: Game-0 is the first game. It is the security game defined at the beginning of this section.

Game-1: Game-1 is similar to Game-0 with one difference: the challenger $\mathcal{C}$ maintains a list of the block signatures it has returned during the query phase. If the adversary $\mathcal{A}$ submits a valid signature that is not in the challenger's signature list, $\mathcal{C}$ aborts and outputs failure.

Game-2: Game-2 is similar to Game-1 with one difference: the challenger $\mathcal{C}$ maintains a list of the block indices it has returned during the query phase. If the adversary $\mathcal{A}$ submits a valid index that is not in the challenger's index list, $\mathcal{C}$ aborts and outputs failure.

Game-3: Game-3 is similar to Game-2 with one difference: the challenger $\mathcal{C}$ maintains a list of all tag queries initiated by the adversary $\mathcal{A}$ and the corresponding responses. If $\mathcal{A}$ submits a proof that passes verification but whose components are not equal to the values expected from the recorded responses, $\mathcal{C}$ aborts and outputs failure.

Lemma 16. If there is an algorithm that can distinguish Game-1 from Game-0 with a nonnegligible probability, we can construct an algorithm that breaks the existential unforgeability of the GPV signature scheme with a nonnegligible probability.

Analysis: if the challenger aborts in Game-1 with a nonnegligible probability, then the adversary has produced a valid signature that was never issued by the challenger; i.e., we can forge a valid GPV signature, which contradicts the existential unforgeability of the GPV signature scheme.

Lemma 17. If there is an algorithm that can distinguish Game-2 from Game-1 with a nonnegligible probability, we can construct an algorithm that breaks the collision resistance of the hash function with a nonnegligible probability.

Analysis: if the challenger aborts in Game-2 with a nonnegligible probability, then the adversary has produced a valid index outside the challenger's list that hashes to the same value as a recorded one, which contradicts the collision resistance of the hash function.

Lemma 18. If there exists an algorithm that can distinguish Game-3 from Game-2 with a nonnegligible probability, we can construct an algorithm to solve the ISIS problem.

Analysis: before analyzing the above, we establish some notation. Suppose that the file that caused the failure is divided into blocks of equal length. The block signature set and the block index set are generated by TagGen. Assume that the given challenge set leads to the failure; the adversary outputs its responses, and the expected responses are those generated by an honest prover. We can ensure by Lemmas 16 and 17 that the block signatures and block indices used in the query and output phases, both generated by the challenger $\mathcal{C}$, are unforgeable. Thus, if the adversary's responses differ from the expected responses, the corresponding aggregated components must differ as well.

We now show that if the adversary fails the challenger with a nonnegligible probability in Game-3, we can construct a simulator to solve the ISIS problem on lattices.

The inputs of the simulator are the file and some parameters, and its goal is to output a solution vector for the ISIS instance. The behavior of the simulator differs from that of the challenger in Game-2 in the setup, queries, output, and hash phases:

Setup: when generating the key, create a lattice and its short basis . The other parameters are the same as in Game-2, meaning that the simulator does not know the private key .

Hash: the simulator programs the random oracle and maintains a table of queries and responses. When the adversary initiates a query, the simulator returns the recorded value if the query is in the table; otherwise, it returns a freshly generated value and records it.

Queries: when asked to sign the file, the simulator does the following:
(1) It randomly selects a tag. Since the selection space is large enough, the probability that the simulator selects a tag that has already been queried to the random oracle is negligible; if the tag has been queried, the simulator aborts.
(2) It chooses a short vector consistent with the programmed random oracle value and outputs it as the signature.
(3) It records the corresponding entry in its table.

Output: for the challenge set , compute , , and , and finally, output the vector .

We first show that the output of the simulator is distributed (up to a negligible statistical distance) as in the real signature scheme. Because the simulator chooses a random tag from a large space when signing, the probability that the simulator aborts is negligible; we may therefore assume that it does not abort. Since the Gaussian parameter exceeds the smoothing parameter, by Lemma 9 the vectors chosen in both the hash and query phases are statistically close to uniform modulo the lattice, and thus the simulator's output is indistinguishable from random. By Theorem 6, the signatures in the real scheme are distributed as a discrete Gaussian over the appropriate coset, whereas the simulated signatures are distributed as a discrete Gaussian over the full lattice conditioned on the corresponding syndrome. By a straightforward generalization of Lemma 5.2 in [32], these two distributions are identical. Therefore, sampling from the coset in the real scheme has the same distribution as sampling from the full lattice and then conditioning on the coset in the simulator.

We now show that the output of the simulator is a solution to the ISIS problem on lattices. The adversary's responses and the expected responses (generated by an honest prover) are different, but both pass verification, so they satisfy the same verification equations. Subtracting these equations yields a nonzero short vector in the kernel of the public matrix; we have thus identified a nonzero vector that is a solution to the ISIS problem on the lattice.
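In schematic form (again suppressing the masking terms and scheme-specific symbols), the extraction rests on the following implication: two distinct short vectors that verify against the same public values collapse to a short nonzero kernel vector.

A\sigma^{*} = A\sigma \pmod q, \quad \sigma^{*} \ne \sigma, \quad \|\sigma^{*}\|, \|\sigma\| \le \beta \;\Longrightarrow\; A(\sigma^{*} - \sigma) = 0 \pmod q, \quad 0 < \|\sigma^{*} - \sigma\| \le 2\beta,

so $\sigma^{*} - \sigma$ is a valid solution to the (I)SIS instance defined by $A$.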

Summary. Any proof that passes verification is generated by an honest prover and cannot be forged. The challenger thus always aborts on forged proofs and outputs failure; i.e., any adversary's advantage in Game-3 must be zero. The series of games and Lemmas 16–18 ensure that any adversary's advantage in Game-0 is negligible. Our proposed post-quantum privacy-preserving provable data possession protocol is thus secure.

4.3. Privacy Preservation

Through the proof of the following theorem, we show that the verifier cannot obtain the user's data from the information collected during the verification process.

Theorem 19. The verifier cannot obtain the user's data blocks from the proof returned by the CSP.

Proof. The hash value of the previous block and the timestamp contained in the proof reveal nothing about the data, so the argument reduces to showing that the verifier cannot obtain the data blocks from the aggregated components of the proof.
We note that because the challenge set is publicly recomputable, if the unmasked linear combination of data blocks could be obtained by the verifier, then a set of linear equations in the data blocks could be collected after multiple challenges, and the blocks could be recovered by solving this system. We now show that the verifier cannot obtain this combination from the proof.
The two random vectors chosen by the CSP during ProofGen are unknown to the verifier; thus, the unmasked combination of the data blocks cannot be derived from the masked components of the proof.
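To make the threat explicit in generic notation (the $\nu_i^{(k)}$ are the public challenge coefficients of the $k$-th audit, the $m_i$ are the data blocks, and the $\mu^{(k)}$ would be the unmasked combinations):

\begin{pmatrix} \nu_1^{(1)} & \cdots & \nu_n^{(1)} \\ \vdots & & \vdots \\ \nu_1^{(t)} & \cdots & \nu_n^{(t)} \end{pmatrix} \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix} = \begin{pmatrix} \mu^{(1)} \\ \vdots \\ \mu^{(t)} \end{pmatrix}.

Once $t \ge n$ independent challenges have been observed, this system is solvable for the $m_i$. With the random masks in place, the right-hand side is never revealed to the verifier, so the system cannot even be formed, let alone solved.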

5. Performance Evaluation

5.1. Comparative Analysis

This paper proposes a post-quantum privacy-preserving provable data possession scheme based on smart contracts that can verify whether the CSP holds all user data and is secure under the random oracle model. We compare our scheme with similar methods along five dimensions: post-quantum security, public verification, privacy preservation, noninteractivity, and fair payment. Table 2 shows that our scheme has significant advantages over the other schemes.

5.2. Experiments

We implemented a prototype of our scheme to evaluate its performance. Our experiments rely on the NTL library (version 11.3.2) for matrix operations and lattice reduction and use SHA-256 as the hash algorithm. All experiments were conducted on a laptop running Windows 10 (x64) equipped with a 3.20 GHz AMD Ryzen 7 5800H and 16 GB of DDR4 RAM.

The experiments take the running time of the scheme as the evaluation basis, including the time of KeyGen, TagGen, ProofGen, and ProofVerify, and mainly focus on the relationship between the running time and the system parameters. We then compare our scheme with two similar lattice-based schemes [28, 30] and a BLS-based scheme [15]. All reported results are averages over 30 trials.

First, we analyze the relationship between the running time of the scheme and the lattice dimension. The dimension is determined by the security parameter and the prime modulus, it is also the length of each data block and signature, and the scheme's security increases as it increases. We fix the number of file blocks; the experimental results are shown in Table 3. The running time of the scheme increases rapidly with the dimension, and most of the time is spent in the algorithm TagGen. This is because TagGen calls the preimage sampling algorithm SamplePre: as the dimension of the lattice increases, the cost of the lattice transformations and of solving the linear systems inside SamplePre grows much faster than the cost of the other algorithms in the scheme.

Second, we analyze the relationship between the running time of the scheme and the number of file blocks. We fix the other system parameters, and the experimental results are shown in Figure 5. The running time of the scheme increases linearly with the number of file blocks because the running time is dominated by TagGen and the number of signatures is determined by the number of blocks.

Next, our scheme is compared with two similar schemes [28, 30]. The main difference between the three schemes lies in ProofGen and ProofVerify. The scheme in [28] uses the linear combinations of the data blocks and block tags directly as the proof; the scheme in [30] randomly selects vectors, uses lattice techniques to construct a random vector satisfying a prescribed relation, and then uses it to hide the evidence.

As shown in Table 4, our scheme does not need to construct random vectors by solving equations, so its efficiency in ProofGen and ProofVerify is significantly better than that of [30]. Owing to the use of random masking technology, the running time of our scheme is slightly higher than that of [28], but it provides privacy preservation and therefore stronger security.

Finally, our scheme is compared with the BLS-based PDP scheme [15]. To achieve the same signature length of 160 bits as [15], we set the system parameters accordingly. As shown in Table 5, although KeyGen and TagGen in our scheme are not as efficient as in [15], ProofGen and ProofVerify take less time. The first two algorithms are completed before the data are uploaded and only need to be run once, whereas the latter two algorithms must be run for each verification and therefore matter more for the overall integrity verification, so our scheme is more efficient in verification.

6. Conclusion

This paper designed a post-quantum privacy-preserving PDP protocol and used it to propose a smart contract-based scheme. The proposed scheme is post quantum and supports public verification and privacy preservation. The noninteractive nature of the protocol reduces the cost of blockchain transactions, and the smart contract with the deposit mechanism ensures fair payments to all parties. In addition, the scheme mainly uses linear operations, thus avoiding the large number of modular exponentiations and bilinear pairing operations in traditional methods, which significantly reduces the amount of computation and improves the efficiency of verification. The main bottleneck of current lattice-based PDP schemes is that the signing process is slow and the keys and signatures are long; future research in the area should focus on ways to solve these problems.

Data Availability

The data used to support the findings of this study are included in the article.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This research is funded by grants from the National Key Research and Development Program of China (No. 2020YFB1805403) and Foundation of Guizhou Provincial Key Laboratory of Public Big Data (No. 2017BDKFJJ015, No. 2018BDKFJJ008, No. 2018BDKFJJ020, and No. 2018BDKFJJ021).