Data owners outsource their data to remote storage providers without keeping local replicas in order to save their own storage resources. However, the ownership and management of data are separated after outsourcing, so ensuring the integrity and recoverability of outsourced data becomes a significant problem. Provable Data Possession (PDP) and Proofs of Retrievability (POR) are two cryptographic protocols that enable users to verify the integrity of outsourced data. Nevertheless, state-of-the-art PDP and POR schemes require users either to perform the complicated audit tasks themselves or to delegate these tasks to a Third-Party Auditor (TPA). Moreover, these schemes are built on a centralized storage framework, which is vulnerable to a single point of failure. In this paper, we propose a blockchain-based decentralized self-auditing scheme with batch verification. Firstly, data owners outsource their data to decentralized storage nodes, which achieve self-auditing based on blockchain without a TPA. Secondly, our scheme uses Pedersen-based polynomial commitment to significantly reduce the number of authenticators. Furthermore, we propose a batch verification algorithm that verifies multiple proofs from different storage nodes at once to improve verification efficiency. Finally, we analyze the security of our scheme and implement a gas-efficient system prototype using smart contracts on the Ethereum Ropsten test network. The results demonstrate that the scheme is practical.

1. Introduction

In recent years, cloud storage has become an essential application in daily life. It is convenient and flexible for users to store, update, and retrieve their data in the cloud [1]. Thus, more and more users outsource their data to cloud storage providers to save local storage resources. However, the ownership and management of outsourced data are separated after users upload their data to the remote cloud, which has given rise to numerous security problems in cloud storage. In 2015, a European data center of Google was affected by lightning strikes and permanently lost 100 GB of data. In the same year, Tencent Cloud lost its customers’ data, which caused significant losses for the company. Therefore, cloud storage providers cannot be fully trusted, and it is important for users to ensure that their outsourced data are correct and intact.

Traditional cloud auditing schemes fall mainly into two categories: Provable Data Possession (PDP) [2] and Proofs of Retrievability (POR) [3]. Both are cryptographic protocols that allow users to verify the integrity of data without retrieving it, and the POR protocol additionally guarantees data recoverability. Concretely, the verifier randomly samples a challenge set over the original data, and the prover generates the corresponding integrity proof, which is then checked to confirm the integrity of the outsourced data. To spare data owners the online and computational burden, existing schemes introduce a Third-Party Auditor (TPA) to audit data on their behalf [4]. However, the TPA is a centralized entity and thus a single point of failure. In addition, many cloud auditing schemes assume that the TPA will never collude with storage providers, a strong and impractical assumption that may easily be broken when certain interests are at stake. Moreover, in state-of-the-art auditing schemes, the data owner needs to generate homomorphic linear authenticators whose number grows linearly with the number of file blocks. These schemes impose considerable computational overhead on data owners and storage redundancy on storage providers. In practical applications, it is also desirable to verify multiple data auditing tasks in a batch to improve audit efficiency. Therefore, it is of great practical significance to design an efficient integrity auditing scheme without a TPA.

With the rapid development of public blockchains, such as Ethereum [5] and Solana [6], more and more researchers focus on constructing decentralized storage systems on top of blockchain. In these systems, data owners can outsource their sensitive data to decentralized storage nodes, which can freely join the network to contribute their idle storage resources. The existing blockchain-based auditing schemes [7–9] for decentralized storage ensure the integrity of data without a TPA and are naturally resistant to a single point of failure thanks to their decentralized nature. However, these schemes delegate complex auditing tasks, which involve many cryptographic operations, to a smart contract deployed on the blockchain. Such operations cannot be implemented efficiently in the Ethereum Virtual Machine (EVM) and may consume large amounts of gas. Thus, handling computation-intensive operations on smart contracts is a challenging problem in blockchain-based auditing schemes for decentralized storage.

1.1. Our Contributions

In this paper, we propose a self-auditing scheme with batch verification based on blockchain for decentralized storage. Our scheme builds on blockchain technology, the PDP protocol, Pedersen-based polynomial commitment, and batch opening of polynomial commitments to achieve efficient data self-auditing without a TPA. In summary, our contributions are as follows:
(i) We propose a blockchain-based self-auditing scheme with batch verification. We remove the TPA and allow data owners to store their data on decentralized storage nodes, which interact with the blockchain to achieve self-auditing.
(ii) We adopt Pedersen-based polynomial commitment to construct the homomorphic linear authenticators, which significantly decreases storage overhead and slightly reduces authenticator computation time. In addition, we propose a batch verification algorithm that verifies multiple proofs simultaneously, which improves verification efficiency.
(iii) We implement a gas-efficient self-auditing system on the Ethereum test network and perform security analysis and performance evaluation. The results show that our scheme is efficient and practical.

1.2. Related Work
1.2.1. Centralized Outsourced Storage

In 2007, Ateniese et al. [2] proposed the PDP protocol, the first public audit scheme to verify data integrity on an untrusted server. Based on homomorphic linear authenticators constructed from the RSA signature, data owners can probabilistically verify the integrity of data without retrieving the original data. However, this scheme cannot guarantee data recoverability. In 2007, Juels and Kaliski [10] presented the POR protocol, which achieves data recoverability by using erasure coding, but their scheme supports only a fixed number of audits and cannot achieve public verifiability. In 2008, Shacham and Waters [3] improved the POR scheme with provable security. Based on BLS signatures [11] and the bilinear pairing, the verifier can publicly verify data integrity and recover the remote data at any time. Nevertheless, the above schemes cannot guarantee data privacy because the data are stored in plaintext at the storage providers. In addition, the data owners must always be online to audit the storage providers.

Subsequently, numerous protocols extended the PDP/POR system model with additional properties such as privacy, multiple replicas, and batch auditing. Curtmola et al. [12] first proposed multiple-replica provable data possession (MR-PDP) to guarantee data recoverability. The MR-PDP scheme allows a data owner that stores several replicas of a file in the cloud system to verify the integrity of all replicas. Many later schemes focus on multiple-replica auditing [13–18]. Considering data security, Wang et al. [4] proposed a privacy-preserving public auditing scheme that outsources the auditing tasks to a TPA and supports batch auditing. Following this work, several schemes [19–21] concerned with privacy for data owners were proposed. To reduce storage redundancy and communication overhead, Yuan and Yu [22] proposed an auditing scheme with constant communication cost using polynomial commitment for cloud storage. Besides, there is an increasing number of blockchain-based auditing schemes. Using the RSA signature, Wang et al. [23] proposed a blockchain-based private auditing scheme with small authenticator redundancy. Their scheme can divide data into arbitrary blocks for generating authenticators, which should be uploaded to the blockchain. Moreover, many schemes [7, 8, 24–26] store the verification proofs on the blockchain to achieve undeniable verification interactions. Furthermore, Yuan et al. [7] and Wang et al. [8] used a smart contract to replace the TPA in auditing the storage providers. Recently, Su et al. [27] proposed a self-auditing scheme for multiple cloud servers without a TPA. Their scheme stores the data on several cloud servers and achieves data integrity via interactions among these servers. However, each server requires multiple rounds of interaction to acquire the other servers’ proofs. Moreover, data owners need to be online to challenge the servers.

1.2.2. Decentralized Outsourced Storage

All of the above schemes target a centralized outsourced storage framework, which has obvious drawbacks. Firstly, centralized storage providers are vulnerable to a single point of failure, which puts users’ data at risk. Secondly, centralized storage is more expensive than decentralized storage. To address these problems, decentralized storage has become a research hot spot.

Li et al. [28] proposed the notion of IntegrityChain, a decentralized storage system supporting MR-PDP. IntegrityChain is a blockchain that mainly stores the information of storage node registration, storage transactions, and auditing proofs. Based on the schemes in [4, 22], Du et al. [9] also proposed a privacy-assured blockchain-based auditing scheme for the decentralized storage network. However, owing to the limitations of the Solidity language, smart contracts currently cannot efficiently support complex cryptographic primitives. Meanwhile, Francati et al. [29] utilized the blockchain nodes to store users’ data, with miners auditing the integrity of data when generating a new block. To ensure data recoverability, Chen et al. [30] distributed the data and its replica to cloud and decentralized storage providers, respectively. The cloud acts as a TPA and audits the replica stored in the decentralized storage nodes. With the development of blockchain technology, numerous blockchain-based decentralized storage projects have emerged. Both Swarm [31] and Storj [32] are decentralized storage networks that outsource audit services to centralized auditors. Unlike Swarm, whose incentive system is based on smart contracts on Ethereum, Storj provides an incentive layer with cryptocurrency. Because both rely on centralized auditors, storage nodes may collude with auditors to deceive data owners. Sia [33] is a fully decentralized storage platform for uploading and downloading data between users and storage nodes. To verify the integrity of data, the storage node submits Merkle proofs to the blockchain and receives Siacoins as a reward. Unlike Sia, Filecoin [34] employs proof-of-spacetime and proof-of-replication to guarantee that miners have correctly stored the committed data, which provides more robust storage security. However, the proof generation time and the computational overhead make it hard to deploy.

1.3. Organization

This paper is organized as follows. Section 2 provides some notations and the cryptographic primitives used in our scheme. Section 3 introduces the system model and security goals of our scheme. We propose our main scheme in Section 4 and provide the security analysis in Section 5. Then, we evaluate our scheme performance and show the results in Section 6. Finally, we conclude this paper.

2. Preliminaries

In this section, we describe the notations used in our scheme in Table 1. Then, we introduce the cryptographic primitives used to construct our scheme.

2.1. Bilinear Map

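Let $G_1$, $G_2$, and $G_T$ be three multiplicative cyclic groups, where the order of the groups is a prime $p$. Let $g_1$ and $g_2$ be the generators of $G_1$ and $G_2$. We use $e: G_1 \times G_2 \rightarrow G_T$ to denote a bilinear map with the following properties.
(i) Bilinear: for all $u \in G_1$, $v \in G_2$, and $a, b \in Z_p$, $e(u^a, v^b) = e(u, v)^{ab}$.
(ii) Computable: there exists an efficient algorithm to compute the map $e$.
(iii) Nondegenerate: $e(g_1, g_2) \neq 1$.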

2.2. Pedersen-Based Constant Size Polynomial Commitment

In a polynomial commitment scheme, the committer commits a polynomial to a group element, and the committed polynomial can then be opened at any point and checked by a verifier. Based on an algebraic property of a polynomial $\phi(x) \in Z_p[x]$, namely that $(x - i)$ perfectly divides the polynomial $\phi(x) - \phi(i)$ for any $i \in Z_p$, Kate et al. [35] proposed a Pedersen-based polynomial commitment scheme that commits two polynomials simultaneously with constant communication overhead. In their scheme, the proof generated by the committer proves that $\phi(i)$ is the evaluation of the committed polynomial at point $i$.

Firstly, we introduce the Pedersen commitment [36], in which a value $m$ is committed with a random number $r$ as $C = g^{m} h^{r}$, where $g$ and $h$ are two generators of a cyclic group $G$ whose order is $p$.
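To make the Pedersen commitment concrete, the following is a minimal Python sketch over a toy group. The modulus `P`, subgroup order `Q`, generators `G` and `H`, and the function names are assumptions chosen for illustration; they are far too small to be secure and are not the parameters of the scheme. The sketch commits as g^m · h^r and shows the additive homomorphism that linear authenticators rely on.

```python
# Toy Pedersen commitment in the order-Q subgroup of Z_P^* with P = 2Q + 1.
# All parameters are illustrative assumptions, not values from the paper.
P = 2039   # small safe prime
Q = 1019   # prime order of the subgroup of squares modulo P
G = 4      # 2^2 mod P, generates the order-Q subgroup
H = 9      # 3^2 mod P, second generator (log_G H must be unknown in practice)


def commit(m: int, r: int) -> int:
    """Commit to message m with blinding factor r: G^m * H^r mod P."""
    return (pow(G, m % Q, P) * pow(H, r % Q, P)) % P


def verify_opening(c: int, m: int, r: int) -> bool:
    """Check that (m, r) opens the commitment c."""
    return c == commit(m, r)


if __name__ == "__main__":
    c1 = commit(5, 17)
    c2 = commit(7, 100)
    # Multiplying commitments adds the committed values and the randomness.
    print((c1 * c2) % P == commit(5 + 7, 17 + 100))
```

The product of two commitments opens to the sum of the messages, which is the property that lets a verifier check a linear combination of committed values with a single group element.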

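Concretely, Kate’s scheme is described below:
(i) $\mathrm{Setup}(1^{\kappa}, t)$: given the security parameter $\kappa$ and the degree $t$ of the polynomial, a trusted entity generates the private key $\alpha$, selected randomly from $Z_p^{*}$, and the public key $PK = (G, G_T, g, g^{\alpha}, \ldots, g^{\alpha^{t}}, h, h^{\alpha}, \ldots, h^{\alpha^{t}})$. $G$ and $G_T$ are two multiplicative cyclic groups, and the order of group $G$ is a prime $p$. $g$ and $h$ are two generators of $G$, and $e: G \times G \rightarrow G_T$ is a symmetric bilinear pairing.
(ii) $\mathrm{Commit}(PK, \phi(x))$: given $PK$ and a polynomial $\phi(x) \in Z_p[x]$, the committer chooses a polynomial $\hat{\phi}(x)$ of degree $t$ from $Z_p[x]$. The commitment is computed as $C = g^{\phi(\alpha)} h^{\hat{\phi}(\alpha)}$.
(iii) $\mathrm{CreateWitness}(PK, \phi(x), \hat{\phi}(x), i)$: given $PK$, the two polynomials, and a point $i$, the committer calculates $\psi_i(x) = \frac{\phi(x) - \phi(i)}{x - i}$ and $\hat{\psi}_i(x) = \frac{\hat{\phi}(x) - \hat{\phi}(i)}{x - i}$. Then, the witness is calculated as $w_i = g^{\psi_i(\alpha)} h^{\hat{\psi}_i(\alpha)}$ based on $PK$. Finally, the algorithm outputs $(i, \phi(i), \hat{\phi}(i), w_i)$.
(iv) $\mathrm{VerifyEval}(PK, C, i, \phi(i), \hat{\phi}(i), w_i)$: given the output of the algorithm $\mathrm{CreateWitness}$, it is verifiable that $\phi(i)$ is the evaluation at the point $i$ of the polynomial committed by $C$ as below: $e(C, g) = e(w_i, g^{\alpha} / g^{i}) \cdot e(g^{\phi(i)} h^{\hat{\phi}(i)}, g)$.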
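The witness in a Kate-style commitment rests on the fact that a polynomial minus its value at a point is exactly divisible by the corresponding linear factor. The following Python sketch illustrates only that algebraic step over a toy prime field; the modulus `Q` and the helper names `evaluate` and `divide_by_linear` are assumptions for illustration, and no group or pairing operation is modeled.

```python
# Polynomial arithmetic over a toy prime field Z_Q (Q is an illustrative assumption).
Q = 1019


def evaluate(coeffs, x):
    """Horner evaluation; coeffs[k] is the coefficient of x^k."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc


def divide_by_linear(coeffs, i):
    """Synthetic division of phi(x) by (x - i); returns (quotient, remainder)."""
    quotient = [0] * (len(coeffs) - 1)
    carry = 0
    for k in range(len(coeffs) - 1, 0, -1):
        carry = (coeffs[k] + carry * i) % Q
        quotient[k - 1] = carry
    remainder = (coeffs[0] + carry * i) % Q
    return quotient, remainder


if __name__ == "__main__":
    phi = [3, 0, 2, 5]             # 5x^3 + 2x^2 + 3
    psi, rem = divide_by_linear(phi, 7)
    # The remainder equals phi(7), so (x - 7) divides phi(x) - phi(7) exactly.
    print(rem == evaluate(phi, 7))
```

The quotient returned here plays the role of the witness polynomial: a committer would evaluate it in the exponent, and the verifier's pairing check confirms the division identity at the secret point.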

2.3. Batch Opening Polynomial Commitment

To improve the verification efficiency of Kate’s scheme [35], Boneh et al. [37] proposed two polynomial commitment schemes which can open proof for multiple points and polynomials at the same time. The first scheme is introduced to construct our scheme. The opening proof of their first scheme is constant size as same as Kate’s scheme, but the verifier will have a large amount of computation if there are many distinct evaluation points. The concrete scheme is described as follows. (i): is the degree of the polynomial, and is the max number of opening points. Then, a trusted authority uniformly chooses as private key from denoted by and computes public key as . Let be multiplicative cyclic groups where the order of is . and are the generators, respectively, selected from and . is an asymmetric bilinear pairing(ii): the algorithm outputs (iii): given a random sent by the verifier, several polynomials , and their individual opening point subset , a prover computes the polynomial as , where is the polynomial of . The witness is the polynomial commitment of , computed as (iv): verifier computes and verifies the following equation , where is a polynomial commitment of

3. Problem Statement

3.1. System Model

We propose a blockchain-based public self-auditing scheme that ensures data integrity and recoverability for decentralized storage. The framework of our scheme is shown in Figure 1 and involves three roles: the data owner, the storage node that belongs to a decentralized network, and the blockchain.
(i) Data owner (DO): the DO outsources its data to several distributed storage nodes to save storage space. Before uploading data to the nodes, the DO preprocesses the data to guarantee data privacy and recoverability.
(ii) Storage node (SN): an SN, a peer of the decentralized network, rents out its storage resources for profit. Our scheme assumes that the nodes storing the same DO’s data do not collude with each other. Each SN should generate a proof for each auditing task and interact with the blockchain.
(iii) Blockchain (BC): owing to its transparency and tamper-proof property, the blockchain serves as a trusted third party in our scheme. In the auditing stage, the BC challenges the SNs and stores the auditing proofs generated by the SNs.

The workflow of our scheme is outlined in Figure 2. In the setup stage, the DO divides the file to be outsourced into blocks, and every $2s$ blocks form a chunk. For each chunk, the DO generates an authenticator. Then, in the storage stage, the DO distributes the file chunks and corresponding authenticators to several SNs. During the self-auditing stage, each SN first calculates the challenged set from the information in the blockchain header. Secondly, each SN generates a proof for the challenged data chunks and submits the proof to the smart contract. Finally, each SN fetches the other nodes’ proofs from the smart contract, verifies the correctness of all proofs, and submits the audit report to the smart contract.

3.2. Threat Model and Security Goals

For the sake of fairness, we describe the malicious behaviors considered in our scheme as follows. We assume that at least one of the nodes storing the DO’s data is honest. Firstly, an SN may delete data rarely accessed by the DO to save storage costs for more profit, or may leave the decentralized network outright. Secondly, owing to accidents such as hardware and software failures, the outsourced data stored on the SNs may be tampered with or deleted. To protect its reputation, the SN may hide the data loss until the time of data retrieval. Finally, the DO may send erroneous metadata to the SNs to gain more profit. Our scheme aims to achieve the following security goals.
(i) Data privacy: the DO’s data privacy should be protected so that SNs and malicious adversaries cannot extract the data contents.
(ii) Storage correctness: if SNs pass each audit, they must have correctly stored the DO’s data.
(iii) Batch auditing: SNs should be able to correctly verify multiple audit tasks at one time to improve the scheme’s efficiency. If the batch audit passes, the SNs must have correctly stored the DO’s data.
(iv) Fairness: the incentive mechanism should be fair, rewarding honest participants and punishing malicious ones.

4. Our Main Scheme

In this section, we first describe the formal definition of our scheme. Then, we propose the main idea of our scheme. In the end, we present our scheme in detail. The main algorithms of our scheme are defined below. During the following presentation, we describe our scheme from the view of DO and . (i)Setup : given the security parameters , the number of blocks in each chunk and the number of opening points . Then, this algorithm outputs private and public key pairs for DO to preprocess the uploading file, and the blockchain accounts and for and DO(ii)TagGen : this algorithm inputs the DO’s and uploading file . It outputs the processed file and corresponding authenticators (iii)FileDistribution: given the and the number of storage nodes , this algorithm outputs the file chunks and authenticators of each required for storage(iv)ChallGen : given the , the number of challenged set and the on the blockchain, this algorithm outputs the challenged set of each , denoted by (v)Self-auditing : given the , the number of challenged set and metadata, this algorithm outputs 1 or 0, where 1 represented this round audit is successful and 0 failure

4.1. Main Idea

The overview of our scheme is shown in Figure 2. We propose a blockchain-based self-auditing scheme to remove the TPA that traditional schemes rely on. The main idea is that the storage nodes act as verifiers and interact with the smart contract to complete each audit. Concretely, the data owner first encrypts and encodes the file for data privacy and retrievability. Then, the data owner selects several storage nodes and distributes a part of the file to each node. For each audit, every node generates a proof from the information on the blockchain and submits the proof to the blockchain. Each node then serves as a verifier for the proofs of the other nodes. If every proof passes verification, each node sends a success message, such as 1, to the smart contract. The smart contract automatically maintains a counter recording the number of successful audits. When the data owner retrieves the file, the smart contract sends the data owner’s deposit to the storage nodes according to the counter.
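The bookkeeping role of the smart contract in this workflow can be sketched as a small off-chain simulation. This is a minimal Python sketch, not the Solidity contract of the prototype: the class `AuditContract`, its method names, the rule that a round succeeds only when every node reports 1, and the proportional payout formula are all assumptions made for illustration.

```python
class AuditContract:
    """Hypothetical simulation of the audit counter kept by the smart contract."""

    def __init__(self, nodes, owner_deposit):
        self.nodes = list(nodes)
        self.owner_deposit = owner_deposit
        self.successful_audits = 0
        self.total_audits = 0
        self._reports = {}

    def submit_report(self, node, result):
        """A storage node reports 1 (all proofs verified) or 0 (failure)."""
        if node not in self.nodes:
            raise ValueError("unknown storage node")
        self._reports[node] = result

    def close_round(self):
        """Close the current audit round once every node has reported."""
        if set(self._reports) != set(self.nodes):
            raise RuntimeError("some nodes have not reported yet")
        self.total_audits += 1
        if all(r == 1 for r in self._reports.values()):
            self.successful_audits += 1
        self._reports = {}

    def payout_per_node(self):
        """Assumed rule: deposit scaled by the success ratio, split evenly."""
        if self.total_audits == 0:
            return 0
        total = self.owner_deposit * self.successful_audits
        return total // (self.total_audits * len(self.nodes))


if __name__ == "__main__":
    contract = AuditContract(["sn1", "sn2", "sn3"], owner_deposit=900)
    for node in contract.nodes:
        contract.submit_report(node, 1)
    contract.close_round()
    print(contract.successful_audits, contract.payout_per_node())
```

Keeping only a counter and the reports on-chain reflects the design intent of moving the cryptographic checks to the storage nodes, so the contract itself performs no pairing or exponentiation.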

4.2. The Concrete Scheme

We use three different multiplicative cyclic groups in our protocol, and . And the order of the group is prime . Let three generators , , and , respectively, selected from , , and , be a bilinear pairing. and are expressed as two collision-resistant hash functions. Besides, the stored file is denoted to separated into blocks. Each blocks form a set of data for generating authenticators and saving storage space. It should be noted that each block is the element of group for the security of the audit. Furthermore, our scheme can support batch auditing where the number of audit proofs is .

Setup. The data owner randomly selects two elements, and from group , as his private key and computes , and as part of the public key. Let and in order to conveniently express in a single audit. Therefore, the private key is , and the public key is . Meanwhile, data owner and each storage node generate the private and public keys of the blockchain account, respectively, denoted by and .

TagGen. Before uploading a file to decentralized storage network, data owner should process using symmetric encryption algorithms to protect data privacy, such as AES, and erasure-correcting code [38] to reinforce data recoverability. Then, the file is split into blocks and each blocks form a chunk. Thus, the number of chunks is represented by . The file is the form of , and each chunk can be denoted as . We use to denote the processed file.

It is noteworthy for data owner that the last chunk may need padding. For each , the data owner generates a corresponding homomorphic authenticator utilizing the Pedersen-based polynomial commitment [22, 35]. An authenticator is computed as the following operations, where is the file identifier randomly selected from and is the index information of chunk bound in the authenticator.

Using the polynomial commitment based on Pedersen, we can commit two polynomials at one time without increasing the amount of calculation, which greatly alleviates the storage redundancy and I/O overhead. As shown in Equation (3), and are bound to the authenticator in the form of polynomial commitment, and their parameters are the first and latter half part of each chunk . Besides, considering the computational cost for SNs in the self-auditing stage, the number of each chunk has an upper limit of .
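The preprocessing step, in which blocks are grouped into chunks whose two halves become the coefficients of the two committed polynomials, can be sketched as follows. This is a minimal Python sketch under stated assumptions: chunks of `2 * s` blocks, 31-byte blocks so that each block fits below a 256-bit prime, and zero padding of the last chunk are illustrative choices, and the function name `split_into_chunks` is hypothetical. Encryption and erasure coding are assumed to have been applied beforehand.

```python
BLOCK_BYTES = 31  # assumption: 31-byte blocks stay below a 256-bit prime


def split_into_chunks(data: bytes, s: int):
    """Split data into integer blocks and group 2*s blocks per chunk.

    Each chunk is returned as a pair (first_half, second_half); the halves
    are the coefficient lists of the two polynomials bound in one authenticator.
    The last chunk is zero-padded (an assumed padding rule).
    """
    blocks = [
        int.from_bytes(data[off:off + BLOCK_BYTES], "big")
        for off in range(0, len(data), BLOCK_BYTES)
    ]
    per_chunk = 2 * s
    blocks.extend([0] * ((-len(blocks)) % per_chunk))
    chunks = []
    for start in range(0, len(blocks), per_chunk):
        chunk = blocks[start:start + per_chunk]
        chunks.append((chunk[:s], chunk[s:]))
    return chunks


if __name__ == "__main__":
    chunks = split_into_chunks(bytes(range(256)) * 2, s=3)
    print(len(chunks), [len(half) for half in chunks[0]])
```

Because one authenticator covers both halves of a chunk, the number of authenticators is the number of chunks, which is half of what a scheme committing a single polynomial of the same order per authenticator would need.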

FileDistribution. In order to ensure data recoverability, the data owner selects several nodes in the decentralized distributed network, and each node stores a part of the file. In detail, there are storage nodes selected to store file chunks and accompanying authenticators, . Each storage node should store metadata pair set whose index set is computed as . Then, the data owner transfers the divided metadata pairs to each node in a secure channel and sends deposit to smart contract compiling in advance on the blockchain. After nodes receive the corresponding metadata, they also send deposit to the smart contract.

ChallGen. Utilizing the information on the blockchain, generates a challenged set using the hash function. The height of the blockchain is taken as the initial audit point where DO’s deposit is sent to the blockchain. And each audit is generated for every blocks. Concretely, generates a challenged set , where , , and .
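The idea of deriving a challenge from public blockchain data, so that no party has to send it, can be sketched as below. This is a minimal Python sketch and not the exact derivation of the scheme: the use of SHA-256, the input layout (block hash, node identifier, counter), and the function name `challenge_set` are assumptions for illustration, and only chunk indices are derived here.

```python
import hashlib


def challenge_set(block_hash: bytes, node_id: int, num_chunks: int, c: int):
    """Derive c distinct chunk indices pseudo-randomly from a block hash."""
    if c > num_chunks:
        raise ValueError("cannot challenge more chunks than are stored")
    indices, seen, counter = [], set(), 0
    while len(indices) < c:
        digest = hashlib.sha256(
            block_hash + node_id.to_bytes(4, "big") + counter.to_bytes(4, "big")
        ).digest()
        idx = int.from_bytes(digest, "big") % num_chunks
        if idx not in seen:          # skip repeats so indices stay distinct
            seen.add(idx)
            indices.append(idx)
        counter += 1
    return indices


if __name__ == "__main__":
    header_hash = bytes.fromhex("ab" * 32)   # placeholder block hash
    print(challenge_set(header_hash, node_id=1, num_chunks=1000, c=5))
```

Since every node can recompute the same set from the same block, any verifier can check that a proof answers the correct challenge without the data owner being online.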

Self-Auditing. Based on the file and authenticators, the node computes:

generates two polynomials, and , which are linear combinations of challenged chunks. Note that .

Besides, computing quotient and remainder under the polynomial , that is, , and we represent the coefficients vector of the quotient polynomial as . In the end, produces:

The form of proof is =. Then, the node sends the to the blockchain. After all nodes send proofs to the blockchain, every node verifies the correctness of other node’s proofs through the following Equation (7). If each is passed the verification equation, the node sends 1 to the smart contract; otherwise, 0. In the equation, , can, respectively, be acquired by calculating and .

The correctness of the above equation is proved as follows:

Remark. Compared with Su et al. [27], our scheme never needs interaction between SNs in the self-auditing stage. Instead, we use blockchain information to generate the challenge set for each SN, so the blockchain assists SNs in completing each audit without the DO being online. Therefore, we reduce the number of interactions and some of the computational overhead between the SNs and the DO.

4.3. Batch Verification

Our scheme can support batch auditing for improving the efficiency of data audits via polynomial commitment aggregation [37]. Consequently, our scheme can verify multiple proofs at one time. Concretely, when the last proof is transferred to the blockchain, every SN generates a random number . Based on the proofs on the blockchain, every SN generates the aggregation of the proofs as shown in Algorithm 1.

Finally, each SN sends the audit result to the smart contract. The smart contract maintains a counter for all storage nodes that records the number of successful audits. After the data stored in the decentralized network are archived, SNs are rewarded or punished according to the value of the counter.

Algorithm 1: Batch verification.
Input: The proofs shown on the blockchain:
 The challenge, which can be computed from the information of the blockchain, ;
Output: Result of integrity audit ;
1: Compute ;
2: Compute ;
3: Compute ;
4: Compute as:
5: Compute ;
6: If then
7: Set ;
8: else
9: Set ;
10: end if
11: return;
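The general principle behind aggregating several proofs with a random number can be illustrated on a much simpler statement. The following Python sketch is not Algorithm 1: it batches toy claims of the form y = g^x in a small group by weighting each claim with a power of a random value, which is the same random-linear-combination idea in miniature. The group parameters `P`, `Q`, `G` and the function names are assumptions for illustration.

```python
import secrets

# Toy group: order-Q subgroup of Z_P^* (illustrative, insecure parameters).
P, Q, G = 2039, 1019, 4


def verify_single(x: int, y: int) -> bool:
    """Check one claim y == G^x."""
    return pow(G, x, P) == y


def verify_batch(claims, gamma=None) -> bool:
    """Check all claims (x, y) at once using weights 1, gamma, gamma^2, ..."""
    if gamma is None:
        gamma = secrets.randbelow(Q - 1) + 1   # random nonzero weight base
    exponent_sum, product, weight = 0, 1, 1
    for x, y in claims:
        exponent_sum = (exponent_sum + weight * x) % Q
        product = (product * pow(y, weight, P)) % P
        weight = (weight * gamma) % Q
    # One comparison replaces one comparison per claim.
    return pow(G, exponent_sum, P) == product


if __name__ == "__main__":
    claims = [(x, pow(G, x, P)) for x in (3, 50, 777)]
    print(verify_batch(claims))
```

If all claims are valid the batch check always passes; if some are invalid, the random weights make it unlikely that the errors cancel, with a failure probability that shrinks as the group order grows.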

In the end, we show that Algorithm 1 can ensure the correctness of batch verification. There have proofs to audit at one time and the correctness of batch verification is shown as follows:

5. Analysis of our Proposed Scheme

5.1. Security Analysis

In this section, we evaluate the security of our auditing scheme according to data privacy, storage correctness, and fairness listed in Section 3.2.

Theorem 1. The scheme guarantees that a correct proof passes the verification, and that a storage node cannot forge the authenticators and proofs when it does not maintain the entire data, under the q-BSDH assumption.

5.1.1. Data Privacy

We assume that the tools and cryptographic primitives used in our scheme are secure, such as the hash function, symmetric encryption algorithm, and polynomial commitment scheme. Therefore, we can ensure the privacy of the DO’s data by using symmetric encryption algorithm before uploading data to the decentralized storage network.

5.1.2. Storage Correctness and Fairness

In Equation (8) and Equation (9), we first prove that the honest SN can always pass the integrity verification. Then, we give a proof sketch that the authenticator generated by DO and the proof generated by SN are unforgeable.

According to the description in [35], the Pedersen-based polynomial commitment scheme is security provided the t-SDH assumption in group . Therefore, if an existed probabilistic polynomial time adversary can forge an authenticator, can construct an algorithm to efficiently deal with the t-SDH problem. Specifically, assume that can forge and such as , where , , , and are known to . can construct polynomials and and gain . Therefore, to factor and , can achieve the private key and break the t-SDH assumption in the system security parameters.

We show that malicious SN cannot generate valid proof if the entire data is never honestly stored. Given two valid proof responses and . An adversary is possible to extract the knowledge of linear combination of original data chunks with overwhelming probability unless can deal with CDH and q-BSDH problems. Please refer to [3, 22, 39] for more details.

Moreover, like previous work, the scheme generates a probabilistic proof of data possession, so guaranteeing a high confidence level is important for the DO. The probabilistic analysis given in [2] shows the relationship between the storage confidence level and the number of challenged chunks. Concretely, if 1% of the data chunks have been tampered with, challenging only 300 chunks gives the DO a storage guarantee level of 95%. In the decentralized storage network, we consider this number of challenges sufficient to protect the DO’s interest.
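The relationship between the number of challenged chunks and the detection confidence can be checked numerically. The sketch below is a minimal Python illustration that assumes chunks are sampled independently, so that the probability of hitting at least one corrupted chunk is 1 - (1 - f)^c for corruption fraction f and c challenges; this is a common approximation of the sampling analysis, and the function names are hypothetical.

```python
import math


def detection_probability(corrupt_fraction: float, challenged: int) -> float:
    """Probability that at least one challenged chunk is corrupted."""
    return 1.0 - (1.0 - corrupt_fraction) ** challenged


def min_challenges(corrupt_fraction: float, confidence: float) -> int:
    """Smallest number of challenges reaching the requested confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - corrupt_fraction))


if __name__ == "__main__":
    print(detection_probability(0.01, 300))
    print(min_challenges(0.01, 0.95))
```

Under this approximation, 300 challenges at a 1% corruption rate clear the 95% level, and the required number of challenges depends on the corruption fraction rather than on the total file size.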

Remarks on Fairness. We further analyze the fairness of our scheme. First, consider a DO that generates incorrect metadata to gain more benefit. After the SN receives this erroneous metadata, it can run the audit protocol locally in advance to check the correctness of the metadata with overwhelming probability. If the SN finds that the metadata are wrong, it can abort the transaction at negligible cost. Next, consider a malicious SN that generates a wrong proof to gain more profit. When the audit fails, the honest nodes can send the malicious node’s blockchain address to the smart contract, which can then transfer the deposit of the malicious node to the other participants.

5.2. Comparison

As shown in Table 2, we compare four data auditing schemes, which consists of Yuan’s scheme [22], Du’s scheme [9], Su’s scheme [27], and our scheme, in terms of audit mode, no interactions between SNs, data owner offline, decentralized storage, and batch auditing. Firstly, compared with Yuan’s scheme, other schemes remove TPA from the traditional system model, which defends against single-point-of-failure and collusion between storage providers and TPA. Secondly, only Su’s scheme has interactions between SNs in the self-auditing stage because each node needs to obtain other nodes’ proofs to complete each auditing task where the parameter is the number of storage nodes.

Finally, Su’s scheme and our scheme support decentralized storage and batch auditing, improving data recoverability and reducing the computation overhead for SNs.

Both Su’s scheme and our scheme use self-auditing to remove the TPA from the traditional system model. However, our scheme uses blockchain information to generate the challenged set for each SN. Thus, data owners never need to be online to challenge each SN, and each SN only interacts with the blockchain, without heavy communication overhead. In contrast, Su’s scheme requires interactions between SNs, which becomes infeasible as the number of storage nodes grows.

6. Performance Evaluation

This section gives a performance evaluation of our auditing scheme in terms of off-chain storage and computational overhead and on-chain gas fee overhead. Besides, we mainly compare the experimental result with Du’s scheme [9] which is the first auditing framework to consider on-chain privacy and efficiency.

6.1. Implementation and Experiment Setup

We implement the off-chain part of our scheme in Golang using the BN256 curve [40] and the KZG [41] library, where the secure parameter is bit. The BN256 curve library implements the elliptic-curve-related operations in Golang, and the KZG library implements addition, subtraction, multiplication, and division of polynomials in Golang. On the blockchain side, we use the Ethereum test network Ropsten and use Solidity and the Remix IDE to deploy our smart contract.

To simulate the decentralized storage network, we use three virtual machines as our storage nodes, each with an Intel(R) Core(TM) i5-9500 CPU at 3.00 GHz, 4 GB RAM, and a 20 GB SSD, running Ubuntu 18.04 LTS. The data owner uses a desktop PC with Windows 10 (AMD Ryzen 5 4600U CPU at 2.1 GHz, 16 GB RAM). Besides, we use the same configuration and parameters to implement Du’s scheme [9]. We set the group order to a 256-bit prime, the stored file size to 1 GB, and the number of challenged chunks from 240 to 440. The evaluation results are the average of 10 experiments.

6.2. Off-Chain Overhead

In the off-chain part, we test the authenticator generation time, the authenticator storage overhead, and the proof generation time, as shown in Table 3. To simulate the data owner in the preprocessing stage, we use four goroutines to process a 1 GB file with the parameter ranging from 50 to 350. Note that our authenticator generation time includes other factors such as key pair generation, I/O overhead, and the transformation of data chunks into polynomial coefficients.

The parameter is negatively correlated with the number of chunks and authenticators. As illustrated in Figure 3, owing to the Pedersen-based polynomial commitment, the number of authenticators in our scheme is always half of that in Du’s scheme [9]. As shown in Figure 4, we also slightly decrease the authenticator generation time because the original file is accessed less frequently; as the parameter increases, we spend less time on I/O than Du’s scheme. However, is the order of two committed polynomials in the TagGen stage, so the computational overhead for the polynomials becomes more expensive as the parameter increases. We can see that the case of takes the least time to generate the auditing authenticators.

Then, we test the overhead of proof generation as the number of challenged chunks increases from 240 to 440. In the proof generation phase, our scheme needs to handle an extra polynomial of order . Therefore, as shown in Figure 5, our scheme is about 20-30% slower than Du’s scheme in generating proofs. Moreover, the file size dramatically affects the proof generation time, which is mainly dominated by the I/O cost. Because our authenticators introduce less redundancy, our scheme has less I/O overhead. For a 1 MB file, where the I/O cost is negligible, we are about 30% slower than the comparison scheme; for a 1 GB file, we are about 20% slower. Thus, our scheme is better suited to larger files.

Then, setting the parameter to 150, we evaluate the verification time as the number of challenged chunks increases from 240 to 440. As shown in Figure 6, the time consumption increases linearly with the number of challenged chunks. Compared with Du’s scheme, our scheme requires additional multiplication and addition operations on the elliptic curve, but this time is negligible. For example, with 240 challenged chunks, the verification time of the two schemes is almost equal, at about 45 ms.

Lastly, we test the time cost of batch verification as the number of storage nodes increases. We set the number of blocks to 150, the number of challenged chunks to 240, and the number of storage nodes from 2 to 7. As shown in Figure 7, we compare our batch verification with Du’s -time single verification. In theory, we can decrease bilinear pairing operations, while the other computational overhead is unchanged when is small. The figure shows that our batch verification improves verification efficiency by 30%.
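The general idea behind batching, namely replacing several independent checks with one check on a random linear combination, can be sketched in a much simpler setting than ours. The example below is an analogy only: it is not our pairing-based batch verification algorithm, and it uses a toy discrete-logarithm group (p = 1019, q = 509, g = 4) chosen purely for readability. Each claim states that y_i = g^{x_i} mod p, and the batch check tests g^{sum r_i x_i} against the product of y_i^{r_i} for random nonzero coefficients r_i.

```go
package main

import (
	"fmt"
	"math/rand"
)

const (
	p int64 = 1019 // toy safe prime, p = 2q + 1
	q int64 = 509  // prime order of the subgroup generated by g
	g int64 = 4    // generator of the order-q subgroup
)

// modExp computes base^exp mod m by square-and-multiply (exp >= 0).
func modExp(base, exp, m int64) int64 {
	result := int64(1)
	base %= m
	for ; exp > 0; exp >>= 1 {
		if exp&1 == 1 {
			result = result * base % m
		}
		base = base * base % m
	}
	return result
}

// verifyEach checks every claim y_i == g^{x_i} mod p separately.
func verifyEach(xs, ys []int64) bool {
	if len(xs) != len(ys) {
		return false
	}
	for i := range xs {
		if modExp(g, xs[i], p) != ys[i] {
			return false
		}
	}
	return true
}

// verifyBatch checks all claims at once with coefficients rs in [1, q-1]:
// g^{sum r_i*x_i mod q} must equal prod y_i^{r_i} mod p.
func verifyBatch(xs, ys, rs []int64) bool {
	if len(xs) != len(ys) || len(xs) != len(rs) {
		return false
	}
	exponent, product := int64(0), int64(1)
	for i := range xs {
		exponent = (exponent + rs[i]*(xs[i]%q)) % q
		product = product * modExp(ys[i], rs[i], p) % p
	}
	return modExp(g, exponent, p) == product
}

func main() {
	xs := []int64{12, 345, 77}
	ys := make([]int64, len(xs))
	rs := make([]int64, len(xs))
	for i, x := range xs {
		ys[i] = modExp(g, x, p)
		rs[i] = 1 + rand.Int63n(q-1) // random coefficient in [1, q-1]
	}
	fmt.Println("each:", verifyEach(xs, ys), "batch:", verifyBatch(xs, ys, rs))
	ys[1] = ys[1] * g % p // tamper with one claim
	fmt.Println("batch after tampering:", verifyBatch(xs, ys, rs))
}
```

In a pairing-based setting the benefit of this pattern is that the expensive operations are applied once to the aggregated values rather than once per proof.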

6.3. On-Chain Overhead

In a blockchain-based decentralized system model, the on-chain cost is a major concern for participants. We use a single smart contract to manage all participants of our scheme, so participants only need to interact with this contract rather than create their own. The proofs and counters of SNs and the public keys of DOs are all stored in this smart contract. We deploy the smart contract, written in Solidity, on the Ethereum test network Ropsten. There are two main functions in our smart contract, and .

As shown in Table 4, we compare the on-chain gas overhead in terms of the storage of public key and proofs, challenge generation, and proof verification.

After the setup phase, our scheme should send to the blockchain. Note that the size of a element is 64 bytes and is 128 bytes. To save gas, we send only the -axis coordinates of the elliptic curve points to the blockchain, plus one additional bit that records the sign of each point. By computing the -axis coordinates off-chain, SNs can recover the public keys. Compared with Du’s scheme, our scheme only adds group elements of and elements on groups to improve the size of the authenticator and to introduce batch auditing. Thus, the gas consumption for PK storage in our scheme is 0.3 million, nearly double that of Du’s scheme. However, this is a one-time storage cost for the whole auditing duration, and data owners can reuse these public keys in the future. Moreover, we compare the storage overhead of the proof on the blockchain. Compared with Du’s scheme, our proof only has an extra 256-bit large number . This increases gas consumption by about 2000 per audit in our tests on Ropsten.
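The coordinate-plus-one-bit encoding described above is a standard form of point compression. The sketch below illustrates it under stated assumptions: the x-coordinate is taken as the stored coordinate, the curve is y^2 = x^3 + 3, and the field prime is a stand-in (2^255 - 19), not the BN256 modulus. The function names are hypothetical and the code does not use the BN256 library [40].

```go
package main

import (
	"fmt"
	"math/big"
)

// fieldP is a stand-in 255-bit prime (2^255 - 19); it is NOT the BN256 modulus.
var fieldP = new(big.Int).Sub(new(big.Int).Lsh(big.NewInt(1), 255), big.NewInt(19))

// curveB is the constant of the illustrative curve y^2 = x^3 + 3 (mod fieldP).
var curveB = big.NewInt(3)

// samplePoint returns (1, 2), which satisfies 2^2 = 1^3 + 3.
func samplePoint() (x, y *big.Int) { return big.NewInt(1), big.NewInt(2) }

// negate returns the other square root, (fieldP - y) mod fieldP.
func negate(y *big.Int) *big.Int {
	n := new(big.Int).Sub(fieldP, y)
	return n.Mod(n, fieldP)
}

// compress stores x in 32 bytes and records the parity of y in the top
// bit, which is unused because fieldP < 2^255.
func compress(x, y *big.Int) []byte {
	out := x.FillBytes(make([]byte, 32))
	if y.Bit(0) == 1 {
		out[0] |= 0x80
	}
	return out
}

// decompress recomputes y off-chain from x and the parity flag.
func decompress(enc []byte) (x, y *big.Int, ok bool) {
	if len(enc) != 32 {
		return nil, nil, false
	}
	buf := append([]byte(nil), enc...)
	odd := buf[0]&0x80 != 0
	buf[0] &= 0x7f
	x = new(big.Int).SetBytes(buf)
	if x.Cmp(fieldP) >= 0 {
		return nil, nil, false
	}
	rhs := new(big.Int).Exp(x, big.NewInt(3), fieldP)
	rhs.Add(rhs, curveB).Mod(rhs, fieldP)
	y = new(big.Int).ModSqrt(rhs, fieldP) // nil if rhs is not a square
	if y == nil {
		return nil, nil, false
	}
	if (y.Bit(0) == 1) != odd {
		y = negate(y)
	}
	return x, y, true
}

func main() {
	x, y := samplePoint()
	enc := compress(x, y)
	x2, y2, ok := decompress(enc)
	fmt.Println(len(enc), ok, x2.Cmp(x) == 0 && y2.Cmp(y) == 0)
}
```

With this encoding a 64-byte point occupies 32 bytes, at the cost of one modular square root per point when it is restored off-chain.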

However, in the challenge generation and proof verification phases, our scheme, unlike Du’s scheme, incurs no on-chain overhead. We only need to maintain a counter that records the number of successful audits, without any significant computation in the smart contract. In summary, our scheme saves approximately 60% of the gas consumption.
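The on-chain state described in this subsection amounts to a few key-value stores and a counter. The following Go model is one possible shape of that state, written only to show that each operation is a constant-time write with no cryptographic computation. It is not our Solidity contract: the type and method names are hypothetical, and how a successful audit is reported to the contract is an assumption of this sketch.

```go
package main

import "fmt"

// auditRegistry is a hypothetical in-memory model of the shared contract state.
type auditRegistry struct {
	publicKeys map[string][]byte // data owner ID -> public key bytes
	lastProof  map[string][]byte // storage node ID -> most recent proof
	counters   map[string]uint64 // storage node ID -> successful audits
}

func newAuditRegistry() *auditRegistry {
	return &auditRegistry{
		publicKeys: make(map[string][]byte),
		lastProof:  make(map[string][]byte),
		counters:   make(map[string]uint64),
	}
}

// registerKey stores a data owner's public key once, after setup.
func (r *auditRegistry) registerKey(owner string, pk []byte) {
	r.publicKeys[owner] = append([]byte(nil), pk...)
}

// submitProof stores the latest proof of a storage node without verifying it.
func (r *auditRegistry) submitProof(node string, proof []byte) {
	r.lastProof[node] = append([]byte(nil), proof...)
}

// recordSuccess increments the audit counter of a storage node
// (assumed to be called once per successful audit).
func (r *auditRegistry) recordSuccess(node string) {
	r.counters[node]++
}

func main() {
	reg := newAuditRegistry()
	reg.registerKey("DO-1", []byte{0x01, 0x02})
	reg.submitProof("SN-1", []byte{0xaa})
	reg.recordSuccess("SN-1")
	fmt.Println(reg.counters["SN-1"], len(reg.lastProof["SN-1"]))
}
```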

7. Conclusion

In this paper, we proposed a blockchain-based self-auditing scheme with batch verification in a decentralized framework. Firstly, unlike previous works, our scheme removes the TPA and achieves self-auditing through the interaction between storage nodes and the blockchain. The recoverability of data can be guaranteed by distributing the data across storage nodes. Secondly, by using the Pedersen-based polynomial commitment to generate the authenticators, our scheme decreases the computational overhead for the DO and the storage overhead for SNs. Moreover, the batch verification algorithm improves verification efficiency by aggregating multiple polynomials and points. Lastly, security analysis and experiments show that our scheme achieves its security goals and is efficient and feasible to deploy in a practical blockchain environment.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 62072357), the Key Research and Development Program of Shaanxi (nos. 2022KWZ-01 and 2020ZDLGY08-03), the Shandong Provincial Key Research and Development Program of China (no. 2019JZZY020129), and the Fundamental Research Funds for the Central Universities (nos. JB211503 and YJS2212).