Abstract

Cloud storage and cloud computing technologies have developed rapidly, and many users outsource the storage of their data to the cloud to obtain more convenient storage services. Allowing users to audit the integrity of their private data has become a basic function that cloud servers must provide. In 2021, Jalil et al. proposed a secure and efficient cloud data auditing protocol based on the BLS signature and an automatic blocker protocol. The protocol realizes public audit, batch audit, and data update while protecting data privacy, and the automatic blocker protocol authenticates the identity of the auditor. The protocol is novel and innovative and has broad application potential. However, we found a security flaw in their scheme: a malicious cloud server can forge a proof that it holds users' data using only the stored labels and thereby pass the audit. Building on the original protocol, we propose an improved audit protocol that fixes this security problem and is more efficient.

1. Introduction

Recently, advanced and innovative technologies represented by cloud computing and cloud storage have matured rapidly. These technologies offer convenience, economy, and high scalability: users can store their data on the platform and manage it remotely without purchasing and maintaining local storage devices. As a result, users are increasingly inclined to use cloud storage services to handle their data quickly and easily.

Cloud server providers centrally hold massive amounts of user data, which makes them attractive targets for malicious attackers. Moreover, dishonest providers may deliberately delete users' data to reduce their own storage burden or conceal data security incidents to protect their reputation. Because users give up direct control over their data in cloud storage, the integrity of that data is constantly under threat, and verifying the integrity of cloud data has become a hot research topic.

1.1. Organization

We organize our paper as follows: in Section 1, we introduce the research background and related work. In Section 2, we describe the system model of the cloud storage audit protocol. In Section 3, we review Jalil et al.'s public audit protocol. In Section 4, we present our attack on the original protocol and show that the protocol is neither secure nor efficient. In Section 5, we introduce our improved secure auditing protocol. In Section 6, we analyze the security of the improved protocol and compare its audit efficiency with that of the original protocol. Finally, in Section 7, we conclude our work.

1.2. Related Work

Scholars have proposed many cloud storage data integrity audit protocols with different functions to meet the needs of users in different application scenarios. In 2004, based on the RSA signature, Deswarte et al. [1] designed a protocol to audit remote files. However, exponentiations over all data blocks in the file must be performed on the user side, which results in expensive computational overhead. In 2007, Ateniese et al. [2] designed a verification scheme suitable for the cloud storage environment called "provable data possession" (PDP). The protocol uses an RSA-based homomorphic linear authenticator and random sampling, and users need only download part of the file to verify its integrity. Then, Juels and Kaliski [3] designed another scheme suitable for the cloud storage environment, "proof of retrievability" (PoR), which implements data integrity detection by inserting special data blocks (generally called "sentinels") into the data file.

In the actual application of cloud storage, users may need to modify and update their data in various ways. Researchers have therefore proposed audit protocols that support dynamic data updates. In 2008, Ateniese et al. [4] first proposed an audit protocol that achieves dynamic data update with a symmetric encryption method; however, it allows only a limited number of audits and does not support public audit. In 2012, Zhu et al. [5] constructed an audit protocol that supports dynamic data update with an index hash table (IHT) based on zero-knowledge proof. In 2015, Erway et al. [6] designed an audit protocol based on a rank-based authenticated skip list, which supports fully dynamic data updates. In 2016, Jin et al. [7] introduced an index switcher to propose an audit protocol that not only provides fair arbitration but also supports dynamic data updates. In 2017, Shen et al. [8] implemented data auditing with a doubly linked list and a location array; the protocol uses global and block-free sampling verification, which also reduces computing and communication costs. In 2019, Guo et al. [9] designed a verification protocol that supports task outsourcing and dynamic data updates and provides a log audit mechanism enabling users to detect misconduct by dishonest auditors. However, that solution has a security loophole: after multiple audits of data blocks with the same index, data labels can in theory be forged by solving linearly independent equations. In 2020, the cloud audit scheme for the IoT [10] designed by Hou et al. [11] uses a chameleon authentication tree to save computational overhead during dynamic data updates and supports batch audit.

If users undertake the periodic audit work themselves, it generates a large computational overhead and consumes considerable resources [12]. In practical application scenarios, it is also important to protect the privacy of users' data [13]. Scholars therefore introduce a third-party auditor (TPA) to help users regularly check the integrity of the data stored on the cloud server. However, once users outsource the audit task, the TPA may obtain data content while carrying it out [14]. In 2013, Wang et al. [15] designed a public verification scheme that supports privacy protection based on random masking technology and batch audit based on the homomorphic linear authenticator; the protocol ensures that the TPA cannot obtain the user's real data during the data integrity audit. In 2014, Worku et al. [16] used random masking technology to propose an efficient public audit protocol with a data privacy protection function. Wang et al. [17] designed a shared data audit protocol that uses ring signature technology and can protect users' identity privacy. In 2015, Xiong et al. [18] used an ID-based encryption algorithm to design a privacy protection protocol that employs a distributed hash table network to protect sensitive data. In 2016, Li et al. [19] used online/offline signatures to design a lightweight public audit protocol with a data privacy protection function.

Traditional cloud audit protocols are mostly designed on the PKI cryptosystem, which brings complicated certificate management issues. In 2013, the first public identity-based audit scheme was designed by Zhao et al. [20]. The protocol minimizes the information carried in the verification process and the information obtained or stored by the TPA, which simplifies key management and reduces communication and computation overhead. In 2014, Wang et al. [21] proposed an ID-based data audit scheme that formally defines the ID-based remote file verification model; the protocol gave the first security proof of an identity-based audit protocol based on the hardness of the CDH problem. In 2016, Wang et al. [22] designed a proxy-oriented ID-based remote data audit protocol; according to the user's authorization, the protocol can realize three modes: private audit, delegated audit, and public audit. In the same year, Yu et al. [23] used zero-knowledge proof to propose an ID-based cloud audit protocol that supports the privacy protection of users' data; the protocol formalizes the identity-based audit protocol and its security model and realizes zero-knowledge privacy protection against the TPA. In 2019, to solve the complex key management problem in cloud data integrity verification, Li et al. [24] used fuzzy identity to design an audit protocol. Xue et al. [25] designed an ID-based audit protocol that uses a blockchain to construct the random challenge messages; in their protocol, the TPA cannot forge audit results to deceive users [26]. Peng et al. [27] designed a new ID-based data possession verification protocol using compressed authentication arrays, which can efficiently support batch verification for multiple users simultaneously in terms of computation and communication. Rabaninejad et al. [28] used the online/offline signature to design an ID-based PDP protocol that supports privacy protection, batch audit, and fully dynamic data update [26].

However, ID-based cloud audit protocols suffer from the key escrow problem, so many cloud audit protocols based on certificateless signatures have been proposed. In a certificateless signature system, the user and the key generation center (KGC) cooperate to produce the user's private key, which avoids the system's security depending strongly on the security of the KGC [29]. In 2013, Wang et al. [30] designed a certificateless cloud audit protocol, but He et al. [31] later pointed out its security problems. In 2015, Zhang et al. [32] designed a certificateless cloud data verification protocol that can resist malicious auditors. In 2017, Kang et al. [33] applied a certificateless cloud audit protocol to wireless body area networks; the proposed protocol can resist malicious auditors and protect data content. The certificateless cloud audit protocol proposed by He et al. [34] can protect users' privacy, but security problems have been pointed out in it as well. He et al. [35] applied a certificateless data audit protocol to the data management system of the smart grid, reducing the computational overhead. In 2018, Yang et al. [36] designed a certificateless cloud audit scheme for file sharing among group users, which supports the protection of data content and users' identity privacy. In 2019, Wu et al. [37] defined the security model of the certificateless cloud audit protocol with privacy protection; the proposed protocol supports identity privacy protection for multiuser groups. In 2020, Huang et al. [38] designed a certificateless data verification protocol supporting batch audit, which realizes efficient key updates based on the Chinese remainder theorem.

1.3. Our Contribution

Recently, Jalil et al. [39] proposed an efficient cloud data public audit protocol based on the BLS signature to realize public audit and protect the privacy of file contents. The protocol implements batch audit and dynamic update. Their scheme also uses an automatic blocker protocol (ABP) to prevent unauthorized TPAs from participating in the audit work, which is highly innovative; the ABP is essentially an access control facility [40] that can detect threats from auditors [41]. However, we found that their protocol has a security flaw: even if the cloud server does not hold the stored data, it can still produce a proof of possession of the user's data that passes verification. We therefore propose an improved protocol with higher security. Our analysis shows the safety and effectiveness of the improved scheme in practical environments.

2. System Model

To facilitate understanding, we define and explain the various symbols and variables that appear in the original scheme and the improved scheme in Table 1.

Existing cloud audit systems generally include three interacting entities. The cloud server provider (CSP) offers users data storage services in exchange for payment; CSPs are not fully trustworthy and may delete cloud data for profit or pry into users' data privacy. Users are the owners of the data; they upload files to the cloud to save local storage costs. The third-party auditor (TPA) is an auditor entrusted by users but is not entirely trusted: on the one hand, the TPA performs the audit task faithfully; on the other hand, the TPA is curious and may attempt to learn the content of users' data.

The entities interact as follows: the user preprocesses the data to be stored and uploads it to the CSP. When data integrity needs to be verified, the TPA generates a challenge with the relevant parameters and sends it to the CSP. Based on the challenge parameters, the CSP uses the cloud data to generate a proof that it holds the user's data in full and sends the proof to the TPA. The TPA uses the proof to audit the data's integrity and sends the result to the user.

3. Review of Jalil et al.’s Protocol

Jalil et al.'s scheme involves three entities. Jalil et al. used the BLS signature to achieve public audit and protect data content privacy. The scheme also supports batch audit and dynamic update. In addition, it strengthens authentication through an ABP that protects the system from unauthorized TPAs. Specifically, their scheme contains the following algorithms.

3.1. DataProtection Protocol

To protect data privacy, the data file is encrypted first. The user divides the file into data blocks and then encrypts each block with the AES encryption algorithm to obtain the encrypted data blocks.
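
For illustration, here is a minimal sketch of this step in Python using AES-GCM from the `cryptography` package; the library choice, block size, and nonce handling are our assumptions, since the original paper only states that AES is used.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

BLOCK_SIZE = 4096  # assumed block size; the paper does not fix one

def split_and_encrypt(data: bytes, key: bytes) -> list[tuple[bytes, bytes]]:
    """Split the file into fixed-size blocks and AES-GCM-encrypt each block.

    Returns (nonce, ciphertext) pairs; the user keeps the key locally,
    so the CSP never sees the plaintext blocks.
    """
    aead = AESGCM(key)
    out = []
    for i in range(0, len(data), BLOCK_SIZE):
        nonce = os.urandom(12)  # fresh 96-bit nonce per block
        out.append((nonce, aead.encrypt(nonce, data[i:i + BLOCK_SIZE], None)))
    return out

key = AESGCM.generate_key(bit_length=256)  # user-held AES key
blocks = split_and_encrypt(b"example file contents" * 1000, key)
```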

3.2. Setup Protocol

The user takes the security parameter as input and, for each data block, outputs the corresponding private key and calculates the corresponding public key.

3.3. SignatureGen Protocol

For each data block, the user generates a random value and calculates the corresponding label as in equation (1), where the signed input is the name of the relevant block and the hash function is SHA-256; this computation also defines the intermediate parameters used later.

Then, the user uploads the verification metadata to the auditor, uploads the encrypted data blocks together with their labels to the cloud, and deletes the local data.
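
Equation (1) itself did not survive extraction, but the analysis in Section 4 establishes that each label depends only on the private key and the block name. The following toy Python sketch reproduces a tag of that shape over the multiplicative group Z_p* in place of the paper's pairing group; the prime, group, and key sizes are illustrative assumptions.

```python
import hashlib
import random

# Toy stand-in: Z_p* with a small Mersenne prime replaces the paper's
# pairing-friendly group; sizes here are for illustration only.
p = 2**127 - 1

def H(name: str) -> int:
    """SHA-256 hash of the block NAME, mapped into the toy group."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big") % p

def sign_block(name: str, sk: int) -> int:
    """Tag shape of the original scheme: it depends on the block name and
    the per-block key only -- the encrypted block content never enters the
    computation, which is exactly the flaw exploited in Section 4."""
    return pow(H(name), sk, p)

sk = random.randrange(2, p - 1)        # per-block private key
tag = sign_block("file1_block7", sk)   # same tag whatever the block stores
```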

3.4. ChallGen Protocol

When the user needs to verify the integrity of the cloud data, the user sends an audit request to the TPA. The TPA first randomly selects a number of block indexes to form a challenge subset and, for each challenged index, selects a random coefficient. The TPA then sends all the chosen indexes and coefficients to the CSP.
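
A minimal Python sketch of this challenge generation (our own illustration; the coefficient range is an assumption):

```python
import random

def gen_challenge(n: int, c: int) -> list[tuple[int, int]]:
    """TPA side: pick c of the n block indexes and attach a fresh random
    coefficient to each; the (index, coefficient) pairs go to the CSP."""
    indexes = random.sample(range(n), c)
    return [(i, random.randrange(1, 2**128)) for i in indexes]

# e.g., challenge 46 blocks out of 1,000 stored blocks
challenge = gen_challenge(n=1000, c=46)
```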

3.5. Response Protocol

When the CSP receives an audit challenge from the TPA, it first asks the user whether the user actually issued an audit request, thereby confirming the authenticity of the challenge; this step is implemented through the ABP. After receiving the user's affirmative reply, the CSP accepts the challenge and proceeds. The CSP then calculates the aggregate tag according to equation (2) and sends the resulting evidence to the auditor:

3.6. CheckProof Protocol

When the TPA receives the evidence generated by the CSP for the challenge, it verifies the integrity of the data by checking the following equation:

If equation (3) holds, the CSP has faithfully performed the service and the integrity of the cloud data is assured.

3.7. BatchAuditing Protocol

Each user divides the original file into data blocks, encrypts the blocks with different encryption keys, generates private and public keys for the different blocks, and uses equation (1) to generate the data tags. All users send their encrypted blocks and tags to the cloud and upload the metadata, including each user's identifier, to the TPA. When data integrity needs to be verified, the TPA randomly selects the data block indexes to be challenged and sends them to the CSP. After the CSP receives the challenge and confirms its authenticity, it calculates, based on each user's label set, the aggregate label over all challenged data blocks:

CSP generates evidence and sends it to TPA. After receiving the evidence, the TPA verifies whether the following equation holds:

If equation (5) holds, the integrity of the data has not been damaged.

4. Our Attack

In Jalil et al.'s audit protocol, passing the audit does not guarantee that the CSP actually holds the data: even if the user's data held by the CSP are incomplete, the CSP can pass the audit. In the SignatureGen protocol, the user calculates the signatures as in equation (1), where each signature is determined only by the private key and the name of the data block; the signatures do not depend on the block content at all. In the Response protocol, the CSP uses only equation (2) to aggregate the signatures and never aggregates the data content, so the integrity proof generated by the CSP has nothing to do with the content of the data blocks. The CSP can therefore use the stored signatures to generate the integrity evidence and pass the audit, storing only the block names locally instead of the content. In addition, the original scheme requires an extremely large number of public and private keys, proportional to the number of data blocks, which makes both certificate management and the storage overhead of the three entities complicated and cumbersome. Finally, the CheckProof protocol uses bilinear pairings, whose computational cost is relatively high. In this section, we show that the CSP can generate an integrity proof that passes the TPA's audit without storing the data blocks.

The relevant data stored by CSP include the following:

The user needs to store the following:

The data stored by TPA include the following:

We can see that the storage costs of all three entities are proportional to the number of data blocks, which is relatively large and violates the original intention of cloud storage. In addition, the CSP and the TPA need to store the public keys, and the user needs to store as many private and public keys as there are data blocks, which requires a large number of certificates and makes certificate management complicated.

In the Response protocol of Jalil et al.'s scheme, the CSP only generates the aggregation of the signatures. Since the CSP stores the signatures, it can generate the aggregate tags according to equation (2) regardless of whether it still stores the data. As long as the stored signatures are correct, the CSP can generate a data audit proof that verifies correctly.
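
To make the attack concrete, the following toy Python demonstration (continuing the Z_p* stand-in from Section 3.3, with all symbols our own) shows the CSP answering a challenge using only the stored tags after deleting every data block:

```python
import hashlib
import random

p = 2**127 - 1

def H(name: str) -> int:
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big") % p

# Setup: per-block keys and tags as in the original scheme (names only).
names = [f"file1_block{i}" for i in range(8)]
sks = [random.randrange(2, p - 1) for _ in names]
tags = [pow(H(nm), sk, p) for nm, sk in zip(names, sks)]

deleted_blocks = None  # a malicious CSP discards the actual content

# Challenge: random subset of indexes with random coefficients.
chal = [(i, random.randrange(1, 2**64)) for i in random.sample(range(8), 4)]

# Response (equation (2) shape): aggregate of tags raised to the
# coefficients -- no data block is needed anywhere in this computation.
proof = 1
for i, v in chal:
    proof = (proof * pow(tags[i], v, p)) % p

# Verifier-side check (the pairing is replaced by direct exponent
# bookkeeping in the toy group): it can only ever compare quantities
# derived from names and keys, so it accepts.
expected = 1
for i, v in chal:
    expected = (expected * pow(pow(H(names[i]), sks[i], p), v, p)) % p
assert proof == expected  # audit passes although the data no longer exist
```

Because neither the proof nor the check ever touches block content, the audit accepts even though the data are gone.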

In the CheckProof stage, after the auditor receives the proof, it must verify whether equation (3) holds, which requires computing bilinear pairings. The bilinear pairing is computationally expensive and reduces the audit efficiency.

5. Improvements to the Secure Auditing Protocol

Based on the above analysis, we improve the original protocol here to enhance its security and efficiency. Figure 1 shows the differences between the original scheme and the improved scheme.

5.1. DataProtection Protocol

The user encrypts the data blocks divided from the data file using the AES encryption algorithm and obtains the encrypted data blocks, which protects data privacy.

5.2. Setup Protocol

The CSP takes the security parameter as input and outputs the public parameters, which include a multiplicative cyclic group, a generator of the group, a bilinear map, and a hash function. The user randomly generates a private key and calculates the corresponding public key.

5.3. SignatureGen Protocol

For each data block, the user calculates the corresponding label according to equation (9):

Each tag is computed from the secret key, the data block content, and the data block index, so the tag binds the stored content. After uploading the data and tags to the cloud, the user deletes the local copies.
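
Equation (9) also did not survive extraction; the sketch below shows a tag with the stated dependencies (secret key, block content, block index) in the toy group, following the standard BLS-style PDP tag structure, which we assume here.

```python
import hashlib
import random

p = 2**127 - 1   # toy group Z_p*, as in the Section 3.3 sketch
u = 5            # assumed public group element from the public parameters

def H(label) -> int:
    return int.from_bytes(hashlib.sha256(str(label).encode()).digest(), "big") % p

x = random.randrange(2, p - 1)   # the user's single private key

def sign_block(index: int, block: bytes) -> int:
    """Improved tag shape: binds the key x, the block INDEX, and the block
    CONTENT m, so a CSP that discards the block can no longer reproduce
    the tag's contribution to an aggregate proof."""
    m = int.from_bytes(block, "big") % p   # block interpreted as an integer
    return pow((H(index) * pow(u, m, p)) % p, x, p)

tag7 = sign_block(7, b"encrypted block seven")
```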

5.4. ChallGen Protocol

To verify whether the data are complete, the user first sends an audit request message to the TPA. The TPA randomly selects block indexes to form a challenge subset and then randomly selects a coefficient for each chosen index. Finally, all the indexes and coefficients are sent to the CSP.

5.5. Response Protocol

When the CSP receives an audit challenge from the TPA, it first confirms the authenticity of the challenge by querying the user; this step is implemented through the ABP. Once the challenge is confirmed, the CSP accepts it, randomly generates a masking value, uses the following equations to calculate the proof, and then sends the proof to the auditor:
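
A toy Python sketch of the masked response (the exact response equations were lost in extraction, so this follows the standard random-masking construction that the analysis in Section 6.1 describes, with symbols of our own choosing):

```python
import random

p = 2**127 - 1   # toy group Z_p*; exponents are taken modulo the order p - 1
u = 5            # assumed public group element, as in the SignatureGen sketch

def csp_respond(chal, blocks_as_ints, tags):
    """CSP side: aggregate the challenged tags, then blind the data aggregate.

    sigma -- product of challenged tags raised to the challenge coefficients
    mu    -- linear combination of challenged blocks, blinded by a fresh mask r
    R     -- commitment to r; the TPA uses R (never r itself) to cancel the
             mask inside the verification equation
    """
    sigma, combo = 1, 0
    for i, v in chal:
        sigma = (sigma * pow(tags[i], v, p)) % p
        combo = (combo + v * blocks_as_ints[i]) % (p - 1)
    r = random.randrange(1, p - 1)   # fresh per-audit mask, kept secret by CSP
    mu = (combo + r) % (p - 1)       # TPA only ever sees the blinded aggregate
    R = pow(u, r, p)                 # sent in place of r
    return sigma, mu, R
```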

5.6. CheckProof Protocol

When the CSP sends the evidence to the TPA, the TPA verifies whether equation (14) holds:

If equation (14) holds, the data are complete and uncorrupted. The correctness of equation (14) is proved as follows:
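
The following toy check, paired with the response sketch in Section 5.5, illustrates how verification can cancel the mask; it is our stand-in for equation (14), not the paper's actual pairing equation.

```python
import hashlib

p = 2**127 - 1
u = 5   # same assumed public element as in the earlier sketches

def H(label) -> int:
    return int.from_bytes(hashlib.sha256(str(label).encode()).digest(), "big") % p

def toy_verify(chal, sigma, mu, R, x):
    """Stand-in for the equation (14) check. Since u**mu = u**combo * R,
    dividing out R recovers the unblinded aggregate without ever revealing
    the mask r. The real scheme does this comparison with the PUBLIC key via
    the bilinear map; exponentiating with the secret x is a toy-group shortcut."""
    base = (pow(u, mu, p) * pow(R, p - 2, p)) % p   # u^mu * R^(-1), Fermat inverse
    for i, v in chal:
        base = (base * pow(H(i), v, p)) % p         # fold in H(i)^(v_i)
    return sigma == pow(base, x, p)
```

Dividing out R removes the mask without ever revealing r, which is exactly the property the privacy analysis in Section 6.1 relies on.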

5.7. BatchAuditing Protocol

A number of users each use different encryption keys to encrypt the data blocks belonging to them among the data blocks divided from the original files, generate their private and public keys, and then use equation (9) to generate the data tags. All users delete their local data after the transfer to the cloud server is completed. To check the completeness of the data, the TPA randomly selects the data block indexes to be challenged, sending the indexes and the corresponding random coefficients to the CSP. After the CSP receives the challenge and confirms its authenticity, it randomly generates a masking value for each user and calculates:

Based on the set of each user, the aggregate tag is calculated for all challenged data blocks:

The CSP generates the evidence and sends it to the TPA as the basis for verification. Upon receipt, the TPA determines whether the cloud data are complete by verifying the following equation:

If equation (20) holds, data integrity has not been compromised. The correctness of equation (20) is proved as follows:
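
As a rough illustration of the batch aggregation (the batch equations were lost in extraction, so the structure below is our reconstruction of the described flow in the toy group):

```python
p = 2**127 - 1   # toy group Z_p*, as in the earlier sketches

def batch_respond(per_user_chals, per_user_tags):
    """CSP side: one aggregate per user, folded into a single group element.

    per_user_chals : {user_id: [(block_index, coefficient), ...]}
    per_user_tags  : {user_id: [tag_0, tag_1, ...]}
    The TPA then checks every user's data with one verification equation
    instead of one pairing-based check per user.
    """
    folded = 1
    for uid, chal in per_user_chals.items():
        agg = 1
        for i, v in chal:
            agg = (agg * pow(per_user_tags[uid][i], v, p)) % p
        folded = (folded * agg) % p
    return folded
```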

6. Analysis of the Improved Protocol

We first analyze the security of the improved protocol, covering resistance to forgery attacks by the CSP and to attempts by the TPA to steal the privacy of the data content. Then, we compare the storage and computation overhead of the improved protocol with those of the original protocol to show that the improved protocol is safe and efficient.

6.1. Security Analysis
(1) Anti-Forgery Attack: if the CSP generates a forged audit proof while the stored user data are corrupted or tampered with, then the discrete logarithm problem in the group can be solved with nonnegligible probability (the group order being a large prime).

Suppose the CSP generates a forged data possession proof over incorrect data, and define the forged evidence accordingly. Because the evidence is forged, it must differ from the correct evidence in at least one challenged component. Assume that the CSP's forged proof of data possession can pass the TPA's audit; the forged proof then satisfies the verification equation (equation (23)).

The correct proof also passes the TPA's audit and therefore satisfies the verification equation as well (equation (24)).

Dividing equation (23) by equation (24) shows that the aggregates of the two proofs must coincide. Since the group is cyclic, the public element can be written in terms of the generator and the discrete logarithm instance; substituting this representation and simplifying yields an explicit expression, equation (25), for the desired discrete logarithm.

Equation (25) is meaningless only when its denominator is zero, which happens with probability inversely proportional to the large prime group order. We conclude that if the CSP can successfully forge the proof for a data block, then it can solve the discrete logarithm problem with overwhelming probability; since the discrete logarithm problem is hard, the CSP cannot forge a proof that passes the audit.

(2) Privacy Protection: first, an authentication protocol (the ABP) is used to prevent unauthorized adversaries from entering the system.

Second, in the DataProtection protocol, the user's original data are encrypted with AES before upload, so the data stored in the cloud are ciphertext. The CSP does not hold the AES encryption and decryption keys, so it cannot learn the user's real data content, which prevents the leakage of data privacy.

Finally, with respect to the TPA, the improved protocol uses random masking technology to protect the data. Suppose the TPA is curious about the content of the challenged data blocks and audits the same data blocks several times, using fresh random masking parameters in each audit. Collecting the random challenge values and the corresponding pieces of evidence from these audits, the TPA obtains the following system of equations:

In the above equations, the TPA knows the returned evidence and the challenge coefficients, but it does not know the random masks or the data blocks. Equation (26) therefore always has more unknowns than equations: no matter how many times the TPA audits the same data blocks, the number of equations stays below the number of unknowns, so the TPA cannot solve equation (26) and cannot learn the content of the data blocks. The sketch below makes this counting argument concrete.
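
The display below is our reconstruction of the system in equation (26); since the original symbols were lost in extraction, the labels are hypothetical.

```latex
% Our reconstruction of the system in equation (26); \mu_t, r_t, v_i^{(t)},
% and m_i are hypothetical labels for the lost original symbols.
% After t = 1, ..., k audits over the same challenged index set I:
\mu_t \;=\; r_t + \sum_{i \in I} v_i^{(t)} m_i \pmod{p},
\qquad t = 1, \dots, k .
% TPA knows each \mu_t and the coefficients v_i^{(t)}, but the fresh masks
% r_t and the blocks m_i are unknown: k equations in k + |I| unknowns, so
% the system stays underdetermined for every k and the m_i are unrecoverable.
```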

6.2. Efficiency Analysis

In the original protocol, the user needs to generate a corresponding public key and private key for every data block. After uploading the data blocks and tags to the CSP, the user must still store his own public and private keys, so the user-side storage cost grows linearly with the number of data blocks. In addition to the data blocks, the CSP also needs to store the tags, and the TPA stores the per-block public keys, so their storage overheads grow linearly with the number of blocks as well.

In the improved protocol, the user holds only a single key pair, so the user-side storage overhead is constant. The CSP needs to store the data blocks and their tags. When the TPA verifies the evidence, it needs only the user's public key in addition to the challenge information, so its storage cost is small. The storage cost comparison between the original protocol and the improved protocol is shown in Table 2; the storage overhead of the improved scheme is lower than that of the original scheme.

Because multiplication and addition operations in the group have minimal computational overhead compared with the other operations, we omit them. In the original protocol, the user needs to compute the per-block keys and signatures, the CSP needs to compute the aggregate tag, and the TPA needs to evaluate equation (3), whose cost is dominated by the bilinear pairing computations.

In the improved protocol, the user needs to compute only the tags, the CSP needs to compute the aggregate tag and the masked proof, and the TPA needs to evaluate equation (14). The calculation cost comparison between the original protocol and the improved protocol is shown in Table 3. Among the entities of the improved scheme, only the CSP's computation overhead is slightly higher than in the original scheme; the computation overhead of the TPA and the user is significantly reduced.

7. Conclusion

The analysis in this study makes clear that the protocol of Jalil et al. is insecure. We pointed out the security loophole in the original protocol and presented a concrete attack against it, and we then proposed an audit scheme with higher security and efficiency along the identified directions for improvement.

Data Availability

The data supporting this systematic review were taken from previously reported studies and datasets, which have been cited. The processed data are available from the corresponding author upon request.

Conflicts of Interest

There are no potential conflicts of interest.

Authors’ Contributions

Ruifeng Li is responsible for the writing of the article and the construction of the improved scheme, Xu An Wang is responsible for the derivation of the formulas in the article and gives some significant ideas, Haibin Yang is responsible for the polishing of the language of the article and the collecting of the information related to this article, Zhengge Yi is responsible for the verification of the security of this article, and Ke Niu revised the finished manuscript.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (No. 2017YFB0802000), National Natural Science Foundation of China (No. 62172436 and No. 62102452), State Key Laboratory of Public Big Data (No. 2019BDKFJJ008), Engineering University of PAP’s Funding for Scientific Research Innovation Team (No. KYTD201805), and Engineering University of PAP’s Funding for Key Researcher (No. KYGG202011).