Research Article | Open Access

Jian Mao, Wenqian Tian, Yan Zhang, Jian Cui, Hanjun Ma, Jingdong Bian, Jianwei Liu, Jianhong Zhang, "Co-Check: Collaborative Outsourced Data Auditing in Multicloud Environment", *Security and Communication Networks*, vol. 2017, Article ID 2948025, 13 pages, 2017. https://doi.org/10.1155/2017/2948025

# Co-Check: Collaborative Outsourced Data Auditing in Multicloud Environment

**Academic Editor:** Qing Yang

#### Abstract

With the increasing demand for ubiquitous connectivity, wireless technology has significantly improved our daily lives. Meanwhile, together with cloud-computing technology (e.g., cloud storage services and big data processing), new wireless networking technology becomes the foundation infrastructure of emerging communication networks. Particularly, cloud storage has been widely used in services, such as data outsourcing and resource sharing, among the heterogeneous wireless environments because of its convenience, low cost, and flexibility. However, users/clients lose the physical control of their data after outsourcing. Consequently, ensuring the integrity of the outsourced data becomes an important security requirement of cloud storage applications. In this paper, we present Co-Check, a collaborative multicloud data integrity audition scheme, which is based on BLS (Boneh-Lynn-Shacham) signature and homomorphic tags. According to the proposed scheme, clients can audit their outsourced data in a one-round challenge-response interaction with low performance overhead. Our scheme also supports dynamic data maintenance. The theoretical analysis and experiment results illustrate that our scheme is provably secure and efficient.

#### 1. Introduction

With the increasing demand for ubiquitous connectivity, wireless technology has significantly improved our daily lives. Meanwhile, together with cloud-computing technology (e.g., cloud storage services and big data processing), heterogeneous wireless networking technology has become a foundation infrastructure widely adopted by emerging communication networks, for instance, IoT (Internet of Things), C-RAN (cloud radio access network), and body-area network, as shown in Figure 1. Particularly, the cloud storage technique has been widely used in services, such as wireless data outsourcing and resource sharing, thanks to its convenience, low cost, and flexibility. Nowadays, online service providers, such as Amazon and Baidu, operate large data centers and offer unlimited storage capacity for users, relieving their burden of local data management and maintenance [1, 2]. In addition, cloud storage enables universal data access in any place. However, users lose the physical control of their outsourced data, while the cloud storage service provider is not always trustworthy. Dishonest service providers may conceal the fact that users’ data have been damaged due to some misoperations or unexpected accidents. Even worse, malicious service providers also may delete the data seldom accessed by users to gain more benefits. How to ensure the integrity of their remotely outsourced data becomes a serious concern for users selecting cloud storing services.

Traditional data integrity verification solutions [3, 4], which are based on hash functions and digital signatures, are impractical for auditing cloud data remotely because retrieving the outsourced files incurs unacceptable communication and computational overhead. To check remote data integrity effectively without retrieving the whole outsourced document, Ateniese et al. presented the first probabilistic verification model, called *provable data possession* (PDP), based on a homomorphic cryptography algorithm and *sampling* techniques [5]. Taking public verifiability into account, Ateniese et al. improved their approach [6]; Wang et al. also proposed a publicly verifiable cloud data audition scheme that supports dynamic data maintenance by using the Merkle Hash Tree data structure [7]. Juels et al. introduced error-correcting coding techniques and proposed *Proof of Retrievability* (POR) mechanisms [16] to audit cloud data and to recover the data if corruption occurs.

Most of these previous works mainly target the problem of data integrity audition in a single-cloud storage environment rather than a heterogeneous cloud infrastructure that collaborates multiple internal (private) and/or external (public) cloud resources [8, 9]. In the multicloud environment, users split their data, duplicate file blocks, and outsource them to different CSP (Cloud Service Provider) servers. The solutions above cannot enforce the data integrity checking efficiently in such an environment where data spread over multiple servers. Aiming at this problem, Zhu et al. propose a cooperative provable data possession (CPDP) scheme [8, 10] in the multicloud environment. However, in the CPDP scheme, the security parameter is independent of other parameters; and thus servers can bypass the authentication by forging the parameter in the response sequence. Moreover, in the process of third-party public verification, the third party needs to know where every data block is exactly stored. It poses a threat to users’ data storage privacy and increases the operation overhead for the third auditing party to maintain the storing state of file blocks. Moreover, besides the effectiveness, efficiency is also a significant concern for a data integrity auditing solution in the multicloud storage environment.

In this paper, we present *Co-Check*, a collaborative multicloud data integrity audition scheme, which is based on BLS signature and homomorphic tags. According to the proposed scheme, users can audit their outsourced data in one challenge-response interaction with low communication cost. Our scheme also enables public verification and supports dynamic data maintenance, so that users can modify and delete the data with low performance overhead. The contributions made by this paper are summarized as follows.

(i) We propose an effective collaborative multicloud data audition scheme enabling users to conduct data integrity checking among multiple CSP servers simultaneously in a one-round challenge-response procedure.

(ii) The audition procedure of our scheme is stateless and supports unlimited challenge-response interactions. Moreover, the proposed scheme supports dynamic data maintenance efficiently.

(iii) We prototype our scheme and conduct system evaluation. The theoretical analysis and experiment results illustrate that our scheme is provably secure and efficient.

*Paper Organization*. The rest of this paper is organized as follows. Section 2 describes the security goals, system model, and the overall architecture of our approach; Section 3 presents the collaborative multicloud data integrity audition scheme; in Section 4, we make the theoretical analysis and evaluate our protocol on security and performance aspects; Section 5 discusses the related work; and Section 6 concludes the paper.

#### 2. Approach Overview

##### 2.1. System Framework

As shown in Figure 2, the general multicloud storage system includes three types of network entities.

*(i) Client (or User)*. (We use the terms *user* and *client* interchangeably in this paper.) Clients outsource data to reduce local storage overhead and make use of the computation resources provided by the cloud service providers in the multicloud storage system.

*(ii) Cloud Service Provider (CSP)*. CSPs that possess a large quantity of hardware and software resources are clustered to provide remote data storing services. We assume that there is an organizer in the CSP cluster, a mediation node that interacts with users and other CSPs.

*(iii) Third-Party Authority (TPA)*. TPA is an optional entity being partially trusted in the multicloud scenario.

In the multicloud storage system shown in Figure 2, the user splits her/his documents into several file blocks. The file blocks are distributed across the cloud storage servers deployed by different cloud service providers. In addition, to improve access efficiency and ensure data retrievability, users might also duplicate the file blocks and spread the copies over several cloud servers.

##### 2.2. Challenges and Goals

Because the CSPs in a multicloud system cannot always be trusted, users need an integrity audition mechanism that ensures their outsourced data are stored correctly and are not accessed without authorization by CSP servers or other entities. A further challenge is efficiency: the audition should support parallel checking, that is, verifying the integrity of file blocks stored on different CSP servers simultaneously. Moreover, securely supporting dynamic maintenance is also a major concern of multicloud data audition.

Aiming to address the above challenges, the goal of this paper is to propose an effective multicloud data integrity audition mechanism satisfying the following requirements.

(i) Correctness: benign servers will prove themselves successfully and none of the misbehaved servers can bypass the checking.

(ii) Batch verification: the client can simultaneously verify the integrity of the file blocks distributed in different CSP servers without retrieving the file.

(iii) Stateless and unbounded checking: the audition procedure is stateless and supports unlimited challenge-response interactions.

##### 2.3. Collaborative Data Integrity Audition Model

Our collaborative data audition model consists of three stages, as defined in our preliminary version [11]: *initialization*, *challenge-response*, and *integrity checking*. Motivated by the sampling technique introduced by Ateniese et al. [5], users split their files and distribute the file blocks among the cloud service providers (CSPs) in the *initialization and preprocessing* stage. Meanwhile, users keep the corresponding metadata for future audition. Here we use the BLS signature to create the *homomorphic tags* because of its homomorphic property. Instead of retrieving the whole file to verify its correctness, in stages II and III users generate the audition challenges from part of the metadata stored at the client side. This improves audition efficiency and ensures, with a high confidence rate, that malicious CSPs cannot bypass the check. Additionally, our scheme designates a subprocedure to support dynamic maintenance. The procedure of our scheme is shown in Figure 3.
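The "high confidence rate" of a sampling-based check can be made concrete with a short calculation. The sketch below is not taken from the paper: it assumes the challenged blocks are drawn uniformly at random without replacement, and the function names `detection_probability` and `min_challenges` are illustrative. It computes the probability that a challenge touches at least one damaged block, which is one minus the hypergeometric probability of missing all of them.

```python
from math import comb


def detection_probability(n_blocks: int, n_corrupted: int, n_challenged: int) -> float:
    """Chance that a uniform sample of distinct blocks hits at least one corrupted block."""
    if not (0 <= n_corrupted <= n_blocks and 0 <= n_challenged <= n_blocks):
        raise ValueError("counts must satisfy 0 <= value <= n_blocks")
    # P(miss every corrupted block) = C(n - t, c) / C(n, c).
    # math.comb returns 0 when c > n - t, so the miss probability is then 0.
    miss = comb(n_blocks - n_corrupted, n_challenged) / comb(n_blocks, n_challenged)
    return 1.0 - miss


def min_challenges(n_blocks: int, n_corrupted: int, target: float) -> int:
    """Smallest number of challenged blocks whose detection probability reaches target."""
    for c in range(n_blocks + 1):
        if detection_probability(n_blocks, n_corrupted, c) >= target:
            return c
    raise ValueError("target cannot be reached (no corrupted blocks?)")


if __name__ == "__main__":
    # Hypothetical workload: 10,000 blocks, 1% of them damaged.
    print(detection_probability(10_000, 100, 460))
    print(min_challenges(10_000, 100, 0.99))
```

Under these assumptions the number of challenged blocks needed for a given confidence depends on the fraction of damaged blocks rather than on the file size, which is why sampling keeps the audition cheap for large files.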

*(1) Stage I: Initialization and Preprocessing*. Stage I consists of steps (1)-(2) in Figure 3. In step (1), the user selects system parameters and generates keys for the BLS algorithm used in the successive steps. Meanwhile, the user splits the file into a set of file blocks, each of which consists of several file sectors. Then the user computes the homomorphic tags corresponding to the file sectors. After preprocessing the outsourced file, the user distributes the file blocks, together with the metadata for audition, to the cloud servers belonging to the different CSPs and keeps the secret parameter locally.

*(2) Stage II: Challenge-Response*. Stage II includes steps (3)–(6) in Figure 3. When the user wants to audit her/his outsourced file, she/he computes a challenge sequence corresponding to the file blocks under test. The user sends the challenge sequence to the *organizer*, and the *organizer* forwards the challenges to the target CSP servers that hold the user's file blocks. The CSP servers calculate their proofs and return them to the *organizer*. The *organizer* aggregates the proofs received and sends the corresponding answer to the user.

*(3) Stage III: Integrity Checking*. Based on the response received from the *organizer*, the user verifies the data integrity in step (7) shown in Figure 3. If the data are stored correctly, the algorithm outputs "*TRUE*"; otherwise, it outputs "*FALSE*", which means that there exist misbehaved CSP servers.

*Dynamic Maintenance*. When users need to conduct dynamic operations on their outsourced data, they recreate tags corresponding to the new file sectors and send them to the organizer for updating.

All the symbols used in this paper are listed in Notation.

#### 3. Collaborative Multicloud Data Integrity Audition Scheme

In this section, we present our collaborative multicloud data integrity audition scheme in detail. The notations and concepts employed in our work are illustrated below.(i) is the system parameter. is a big prime number and is the order of the cyclic group ; is a nondegenerate bilinear map. is the generator of .(ii) is the number of the CSPs, and the CSP set is represented as .(iii) is user’s file and is the file name. The file is separated into blocks, each of which contains sectors, , where .(iv) is the challenge generated by users.(v) is a hash function.

As shown in Figure 3, our scheme includes three entities, a* user*, CSP servers, and an* organizer*, which is also one of the CSP servers. The integrity checking scheme is fulfilled by the following eight steps.

*Step 1 (user setup). *(1)KeyGen: . The user selects secure parameter and system parameters and . She/he randomly selects an as the private key. The public key is . Then the user gets .(2)File preprocessing: . The user splits the file into blocks, each of which contains parts. The file is represented as follows: We assume that is the total number of copies corresponding to each data block stored in different CSPs, and represents how many times each data is updated. The initial value of is 0 for all the elements. We use to represent it. represents concatenation.(3)TagGen: . The user randomly selects parameters and computes the tags for corresponding to each data block and thus the set of all tags is obtained. As shown in Figure 4, represents data blocks from the file; each block is separated into parts and every part of a block is represented as ; represent tags corresponding to .

*Step 2 (data outsourcing). *The user sends the file and corresponding tags to the organizer, and the organizer distributes data blocks with corresponding tags to different CSP servers (as shown in Figure 5). If a file block is stored with several copies, every copy of the file block has a tag. For instance, data block is stored with copies, then there are tags, which means the CSPs should store data along with the tag from the labels. The user computes the public parameter () and sends it to the trusted third party for storage. The user keeps the private key at the client side.

*Step 3 (challenge creation, challenge (chal)). *When the user wants to audit the outsourced data, he or she computes a challenge, , and sends it to the organizer.

*Step 4 (challenge delivery, forward (chal)). *The organizer forwards the received challenge to the CSP servers, . Without losing generality, we assume there are CSP servers that store the blocks challenged by the user.

*Step 5 (proof creation and delivery, ). *, the service provider computes the evidence according to the following formula: returns the proofs shown in (3) to the organizer:

*Step 6 (proof aggregation and response, ). *The organizer computes . The organizer returns the aggregated proof , , to the user, where .

*Step 7 (user verification). *After the user received the data sent by the organizer, she/he gets the parameter from the trusted third party and verifies the response according to the formula If formula (4) holds, it means the outsourced data are stored correctly and the output is “*TRUE*”; otherwise, the output is “*FALSE*.”
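The message flow of Steps 3 to 7 can be illustrated with a deliberately simplified stand-in. The sketch below does not implement the BLS-based tags or the pairing check of this scheme. It swaps in a privately verifiable linear tag built from an HMAC-based pseudorandom function (in the style of the private-verification construction of Shacham and Waters [17]), and it omits block copies, sectors, version counters, and the third party. All names (`prf`, `tag_gen`, `csp_prove`, `organizer_aggregate`, `verify`, `demo`) and the modulus `P` are assumptions made for illustration. What it does show is how per-CSP partial proofs add up at the organizer so that the user checks a single aggregated response in one round.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # toy prime modulus (assumption); the paper works in a pairing group


def prf(key: bytes, index: int) -> int:
    """Keyed pseudorandom value for a block index (HMAC-SHA256 reduced mod P)."""
    digest = hmac.new(key, index.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P


def tag_gen(key: bytes, alpha: int, blocks: list[int]) -> list[int]:
    """Linear tag per block: t_i = prf(i) + alpha * m_i (mod P)."""
    return [(prf(key, i) + alpha * m) % P for i, m in enumerate(blocks)]


def csp_prove(store: dict[int, tuple[int, int]], chal: dict[int, int]) -> tuple[int, int]:
    """A CSP combines only the challenged blocks it actually stores."""
    mu = sigma = 0
    for i, coeff in chal.items():
        if i in store:
            block, tag = store[i]
            mu = (mu + coeff * block) % P
            sigma = (sigma + coeff * tag) % P
    return mu, sigma


def organizer_aggregate(proofs: list[tuple[int, int]]) -> tuple[int, int]:
    """The organizer adds the partial proofs into one response."""
    return sum(p[0] for p in proofs) % P, sum(p[1] for p in proofs) % P


def verify(key: bytes, alpha: int, chal: dict[int, int], proof: tuple[int, int]) -> bool:
    """User check: sigma == alpha * mu + sum(coeff_i * prf(i)) (mod P)."""
    mu, sigma = proof
    expected = (alpha * mu + sum(c * prf(key, i) for i, c in chal.items())) % P
    return sigma == expected


def demo() -> tuple[bool, bool]:
    key = secrets.token_bytes(32)
    alpha = secrets.randbelow(P - 1) + 1           # nonzero secret multiplier
    blocks = [secrets.randbelow(P) for _ in range(9)]
    tags = tag_gen(key, alpha, blocks)
    csps: list[dict[int, tuple[int, int]]] = [{} for _ in range(3)]
    for i, (m, t) in enumerate(zip(blocks, tags)):
        csps[i % 3][i] = (m, t)                    # each block lives on exactly one CSP
    chal = {i: secrets.randbelow(P - 1) + 1 for i in (0, 4, 5, 8)}
    honest = verify(key, alpha, chal, organizer_aggregate([csp_prove(s, chal) for s in csps]))
    m, t = csps[1][4]                              # block 4 is held by CSP 1
    csps[1][4] = ((m + 1) % P, t)                  # silent corruption of a challenged block
    tampered = verify(key, alpha, chal, organizer_aggregate([csp_prove(s, chal) for s in csps]))
    return honest, tampered


if __name__ == "__main__":
    print(demo())
```

In this toy, each block is stored on exactly one CSP so that the sums are not double counted. Handling several copies per block, as the scheme does, would need the per-copy tags described in Step 2.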

We summarize the interactions of collaborative auditing in Figure 6.

*Dynamic Update*. When users need to update data , they should make a modification from , compute the new label , , and send the updated along with the corresponding label to the organizer. After that, the organizer conducts the distributed storing operation. Because each label is bound to the position of its data block in the sequence, the scheme supports only part of the update operations, namely, data modification and deletion.

#### 4. Evaluation

##### 4.1. Security Analysis

In this section, we prove two properties to ensure data integrity under our scheme.

Theorem 1. *Correctness*. If all CSP servers keep user’s data correctly, they can successfully pass the challenge-response verification procedure initiated by the user.

*Proof.* To verify the data correctness, according to Step 7, the user computes . It can be noticed in Steps 5 and 6 that and , where and .

According to the bilinear property of the Weil pairing, we get This completes our proof.
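For readers who want to see how bilinearity makes such a check go through, the following derivation shows the single-sector, publicly verifiable construction of Shacham and Waters [17], which is also built on BLS signatures. It is given only as an illustration and is not the Co-Check verification formula, which additionally involves block copies and update counters. The symbols are assumptions chosen for this sketch: $x$ is the private key, $v = g^{x}$ the public key, $u$ a public group element, $H$ a hash onto the group, $m_i$ the $i$-th block, and $Q = \{(i, \nu_i)\}$ the challenge.

```latex
\begin{aligned}
\sigma_i &= \bigl(H(i)\cdot u^{m_i}\bigr)^{x}, \qquad v = g^{x},\\[2pt]
\mu &= \sum_{(i,\nu_i)\in Q} \nu_i\, m_i, \qquad
\sigma = \prod_{(i,\nu_i)\in Q} \sigma_i^{\nu_i},\\[2pt]
e(\sigma, g)
 &= e\Bigl(\prod_{(i,\nu_i)\in Q} \bigl(H(i)\cdot u^{m_i}\bigr)^{x\nu_i},\; g\Bigr)\\
 &= e\Bigl(\prod_{(i,\nu_i)\in Q} H(i)^{\nu_i}\cdot u^{\sum_{(i,\nu_i)\in Q}\nu_i m_i},\; g\Bigr)^{x}\\
 &= e\Bigl(\prod_{(i,\nu_i)\in Q} H(i)^{\nu_i}\cdot u^{\mu},\; v\Bigr).
\end{aligned}
```

The last line uses $e(a^{x}, g) = e(a, g)^{x} = e(a, g^{x})$, so a verifier who knows only $v$, $u$, $H$, and the challenge can check the aggregated pair $(\mu, \sigma)$ without the private key.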

Theorem 2. *If there exists a probabilistic polynomial time adversary adv and it is able to successfully convince the TPA to accept the fake proof information for a corrupted file in nonnegligible probability, then it is possible to construct a polynomial algorithm to solve the computational Diffie-Hellman (CDH) problem by invoking adv with nonnegligible probability.*

*Proof. *Suppose that the algorithm is given an instance of the CDH problem tuple shown as follows: and its goal is to compute . The algorithm will execute an interactive game with adv in the following game of security model.*Setup*. Let be the public key of the user, and choose a hash function which acts as random oracle in the following security proof. And for to , it randomly selects to set . Finally, it returns the public parameter to the adversary .*Hash Query*. At any time, the adversary is able to adaptively query hash oracle for the string it submits. And to respond to these queries, the algorithm maintains an -list which is initially empty and responds as follows: (1)If exists in the -list, retrieves the tuple and sends the adversary .(2)Otherwise, chooses a bit according to a bivariate distribution function and , where is a fixed probability value which will be determined later. Then answers as follows:(a)If , chooses a random number to compute and return to the adversary. Then inserts the tuple to the -list.(b)If , chooses a random number to compute and return to the adversary. Then inserts the tuple to the -list.*TagGen Oracle*. At any time, the adversary can adaptively query the TagGen oracle with message . To respond to it, executes as follows: (1)First, it divides message into .(2)Then, it checks whether the tuple exists in the -list:(a)if there exists for to , it aborts.(b)if for to , it computes (3)Otherwise, it makes a Hash Query with and executes as above.*Challenge*. The adversary chooses a subset of indices of the data blocks such that at least one index in set satisfies in the tuple . And one has queried the hash oracle before.

The challenge sets the challenge information and sends it to the adversary.

*Proof. *Finally, the adversary outputs the response as .

If the adversary wins the above game, the returned proof information can pass the verification, which means that should satisfy where for to .

Without loss of generality, we assume that there is index whose in the -list and the other index satisfies . For simplicity, we assume all of .

Thus, we have It means that It means that the solution of the CDH problem can be solved.

From the above simulation, we know whether could output the correct solution of CDH problem depends on whether the simulation aborts during the TagGen Query and Challenge phases and whether the adversary could output a valid proof information for the challenge information. The adversary is allowed to make the Hash Query at most times. Nonabort probability during TagGen Query phase requires that all for to ; thus its probability is . Nonabort probability during Challenge phase requires that at least one index ’s ; thus, its probability is at least , where is the size of subset . Thus, its success probability is When , then This completes our proof.

##### 4.2. Performance Analysis

We prototyped our algorithm and the evaluation is conducted on a desktop with Intel Core 2 Duo CPU @2.66 GHz, running Ubuntu 10.10 in Oracle VM VirtualBox Version 4.2.10 configured with 2 GB memory, and adopted PBC library to implement the crypto primitives. The security parameter of the bilinear pairing function is configured as , which means the prime number is 160 bits. In the evaluation, we set the file size as 80 KB, 160 KB, and 320 KB, respectively. The result of evaluation is illustrated in Table 1.

*Table 1 (values not preserved). Columns: time cost for preprocessing; time cost for generating the challenge; time cost for generating the proof; time cost for verification.*

The experiment results shown in Table 1 illustrate that the time cost of preprocessing and challenge generating will not be influenced by the number of file blocks. The time cost of proof generating decreases with the decline of , the number of file blocks; in contrast the time cost of verification will increase when decreases. The time cost of preprocessing increases proportionally with the increase of file size. When file size increases, the challenge generation time cost almost remains unchanged and the time cost of proof generating and verification increases.

#### 5. Related Work

Based on different properties of the proposed models or schemes, related work can be classified as static data verification schemes, integrity verification schemes supporting dynamic operation on data, and verification schemes in multicloud environments. In this section, we discuss the related work in detail.

##### 5.1. Static Data Integrity Verification

Early research of outsourced verification focuses on static archive data. Deswarte et al. [3] are the first to propose remote data integrity verification. They proposed two solutions to this problem, one is to precompute hash value of files and compare whether the hash value returned by server is equivalent to that of the local storage; this solution could significantly reduce the communication bandwidth between users and the server to . Another solution, which is based on RSA signature, requires users to sign the data before it is outsourced with the labels at the server side. Challenges could be issued randomly in the process of verification, and the bandwidth is . The computational cost of the server is ( is the number of the file blocks), which increases linearly with respect to the file size.

Gazzoni Filho and Barreto [4] proposed a remote data integrity verification scheme by combining RSA signature and hash function techniques. Their method could verify the same file for unlimited times, but the whole package of data is required to conduct a specific verification.

Sebe et al. [12] proposed a new integrity verification scheme based on Diffie-Hellman key exchange. In their scheme, the computational overhead at the user and server sides is , while the storage cost at the user side, , increases linearly with respect to the entire data size. Their follow-up work [13] combines Diffie-Hellman key exchange and RSA signature to realize remote data integrity verification.

To reduce computational overhead, Ateniese et al. [5, 14] propose a probabilistic remote data integrity checking scheme called* provable data possession* (PDP) by using homomorphic verification tags and* sampling* technique.

Ateniese et al. [6] proposed a framework to adopt homomorphism identification protocol in data integrity verification and they demonstrated this under the instance of homomorphism identification authentication protocol by Shoup [15]. The authors define the model of homomorphism identification authentication, the model of data integrity verification, and the corresponding attack models.

The schemes above can only detect whether data are properly stored; they cannot correct errors or recover the data. Another branch of remote data integrity checking therefore focuses on error correction and retrievability along with cloud data audition.

Juels and Kaliski Jr. [16] combined data possession checking with error-correcting coding and were the first to propose the POR (proof of retrievability) model for remote data storage. This model adds indistinguishable sentinels to the encoded file, which both preserves data integrity and ensures data availability. Their scheme is designed to handle encrypted data.

Shacham and Waters [17] proposed two types of POR schemes: one is a public authentication scheme based on BLS signature, the other is private authentication on the basis of pseudorandom function, and both of the schemes have low interactions and computations. Bowers et al. [18, 19] introduced POR scheme in distributed static data storage system and realized and practiced it.

Naor and Rothblum [20] study how to determine whether files stored on a remote server are badly damaged. They first apply an error-correcting code to the entire file and then compute a message authentication code (MAC) for every data block to verify its integrity. When the integrity is damaged but the damage is within the error-correcting range, the errors can be detected and corrected.

Xu and Chang [21] proposed a high efficiency POR scheme, in which data block is involved with group elements and child data blocks, the storage overhead is of the file block, and computational costs is .

In addition, for the static data in cloud, multiple integrity verification schemes have been proposed which support public verification and users’ privacy preservation. In the cloud storage users worry that data in the cloud server is damaged; on the other hand, they worry about the leakage of their data to the unauthorized third party especially for the sensitive information such as personal health report, corporation financial report. Therefore, to preserve privacy the most direct method is that users preprocess the data to encrypt it before they store the sensitive data into the cloud. With data integrity detection scheme, they could verify the data at any time.

Shah et al. [22] considered the problem of verifying the integrity of stored data after it has been encrypted and proposed an effective solution. A trusted third-party arbiter is introduced for two reasons: on one hand, servers may conceal data loss to protect their reputation and attract more users; on the other hand, users may mistake their own loss of data for a fault of the servers, a situation the servers want to avoid. Considering that average users cannot preserve their secret information for a long time, Shah et al. proposed storing the users' data along with the keys in the cloud, and hence key authentication is needed as well as data integrity verification. To prevent the key from leaking to the third party, the article adopts a multiple zero-knowledge proof scheme [22] to conduct key authentication.

Wang et al. [23] proposed a scheme to ensure public authentication and privacy preservation. The scheme is based on a discrete-logarithm zero-knowledge proof combined with a bilinear pairing signature. The trusted third party, which is assumed to be semi-honest, cannot access any outsourced data. Moreover, their scheme supports batch verification.

##### 5.2. Integrity Verification That Supports Dynamic Data Operation

Ateniese et al. [24] presented an EPDP solution based on symmetric cryptographic algorithms. They use a MAC to compute the hash values of data blocks and keep them locally. They illustrate that even if the users save 70 trillion outsourced data, the local users just need to save 128 M bit data. If the integrity is verified every 15 minutes, the saved hash data are enough to last 16 years. They divide the data into blocks so that when the cloud data are modified by operations such as updating, deleting, and adding, they do not need to download all the data to recalculate the hash values and only need to operate on certain blocks.

Wang et al. [25] presented a solution that adds a preprocessing by RS codes. When users find that the data are incorrect, they can retrieve their data and correct errors, in which is the minimum code distance. Although EPDP and Wang’s solutions can support some dynamic data operation, they still can not achieve full dynamic maintenance and their performance overheads are relatively high for data addition operation.

Erway et al. [26] were the first to discuss the fully dynamic operation problem. They used memory checking [27] and skip-list [28] techniques to support DPDP and improve security. They proved that, in the standard model, this solution is more complete and robust than the PDP solution, which is based on the random oracle model. This solution also incurs performance overhead: its computation and communication overheads depend on the file size.

Wang et al. [7] also proposed a solution that supports DPDP. However, their solution is limited to data updating, deleting, and appending, and inserting data becomes very complicated. In their follow-up work, they combined the bilinear pairing BLS signature [29] and the Merkle Hash Tree integrity verification technique [30]. They organized the data in a binary tree and signed the leaf nodes to realize dynamic operations on data blocks. Their scheme supports public authentication, and its computational and communication overhead are , where is the file size. Yuan and Yu [31] proposed a public integrity auditing scheme to support dynamic data sharing with multiuser maintenance.

Hao et al. [32] proposed a privacy protected solution that supports dynamic data operations. In this solution, the interaction data size is while both the local saved data size and the server computing overhead are (in which is the saving data size).

Zheng and Xu [10] presented the FD-POR model, which supports dynamic operations. Their model is based on 2-3 trees, an authenticated data structure, combined with an incremental signature method, which is also called hash signature. However, this solution cannot support public authentication.

##### 5.3. Privacy Preserving in Cloud Data Checking

Ensuring the data auditing without any unnecessary information leakage is a critical concern in the practical application. Yu et al. [33] introduced the term,* zero-knowledge privacy*, to define the goal of privacy preserving in data integrity verification, which ensures that the TPA cannot obtain any additional information of file content from all the auxiliary verification information available. Fan et al. [34] proposed an indistinguishability-game-based definition,* IND-Privacy*, to evaluate the cloud data privacy preserving. They point out that many approaches are not theoretically secure according to the IND-Privacy definition. They also presented their example protocol that ensures content-integrity checking and satisfies the IND-Privacy.

##### 5.4. Integrity Verification on Multicloud

With the extensive use of cloud storage, people have started to store their data with more than one cloud service provider. Integrity verification in the multicloud setting therefore becomes especially important [10].

Zhu et al. [9] proposed a scheme called CPDP that can achieve the integrity authentication of multicloud. The security of CPDP mechanism is based on zero-knowledge proof system. The verifier connects with the organizer, which may reduce the communication overhead and provide computing flexibility for the verifier. However, the protocol is found to be vulnerable by Wang and Zhang [35]. Any malicious CSP or organizer can generate response that can pass the authentication, even when it has already deleted all the data. Therefore, it does not have soundness guarantee. Wang [36] presented ID-DPDP (identity-based distributed provable data possession) scheme. Under the standard CDH problem assumption, the scheme is provably secure and can support regular verification, delegate verification, and public verification as well.

We present a theoretical comparison of various schemes in Table 2. In summary, our scheme has the following features: fast computation, low storage overhead, low bandwidth requirement, and support for sampling and for an unlimited number of challenge-response interactions.

*Table 2: Theoretical comparison of various schemes (table body not recoverable).* Table notes: Type means the guarantee type provided by the scheme; Det. and Pr. represent deterministic guarantee and probabilistic guarantee, respectively. The notes also define two symbols, now missing, as the verification time allowed by the scheme and the number of file blocks.
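The "support for sampling" feature can be illustrated with the standard hypergeometric argument used for probabilistic guarantees in sampling-based auditing. The sketch below is not taken from the paper: the function names `detection_probability` and `blocks_needed` are hypothetical, and it assumes the verifier challenges distinct blocks chosen uniformly at random.

```python
from math import comb


def detection_probability(n_blocks: int, n_corrupted: int, n_sampled: int) -> float:
    """Chance that a uniform sample of distinct blocks hits a corrupted one.

    Assumption (not from the paper): blocks are sampled without replacement,
    so P(miss every corrupted block) = C(n - t, c) / C(n, c).
    """
    if not 0 <= n_corrupted <= n_blocks:
        raise ValueError("n_corrupted must lie in [0, n_blocks]")
    if not 0 <= n_sampled <= n_blocks:
        raise ValueError("n_sampled must lie in [0, n_blocks]")
    miss = comb(n_blocks - n_corrupted, n_sampled) / comb(n_blocks, n_sampled)
    return 1.0 - miss


def blocks_needed(n_blocks: int, n_corrupted: int, target: float) -> int:
    """Smallest sample size whose detection probability reaches `target`."""
    for c in range(n_blocks + 1):
        if detection_probability(n_blocks, n_corrupted, c) >= target:
            return c
    raise ValueError("target probability cannot be reached")


if __name__ == "__main__":
    # Hypothetical workload: 10,000 blocks, 1% of them damaged.
    print(detection_probability(10_000, 100, 460))
    print(blocks_needed(10_000, 100, 0.99))
```

Under these assumptions, challenging 460 of 10,000 blocks when 1% are damaged already gives a detection probability above 0.99, which is why a sampled challenge can stay small relative to the file size.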

*Redundancy Analysis*. Basically, there are three ways to support remote cloud storage auditing, that is, hash-function-based solutions, homomorphic-authentication-tag-based solutions, and network-coding-based solutions. All these approaches introduce unavoidable redundancy and thus additional storage overhead at the server side. Furthermore, there are potential security vulnerabilities in hash-function-based solutions, while network-coding-based solutions cause higher storage overhead than homomorphic-authentication-tag-based solutions. In our scheme, we create the authentication tags based on the Boneh-Lynn-Shacham (BLS) signature, whose additional redundancy is bits for one file block. In contrast, in Zhu’s scheme [9], the storage redundancy for the authentication tag per file block is also bits; the redundancy caused by Wang’s approach [35] is bits per file block; and the approach proposed by Yuan and Yu [31] introduces -bit additional storage overhead at the server side to verify one file block. Compared with existing related solutions, the redundancy rate introduced by our scheme is relatively low.
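To make the notion of a homomorphic authentication tag concrete, the following toy sketch shows how per-block tags can be aggregated so that one short response covers several challenged blocks. It is an illustration under loud assumptions, not the Co-Check construction: it works in the multiplicative group modulo a small prime `P` instead of a pairing-friendly group, it verifies with the secret key rather than through a bilinear pairing (so it is not publicly verifiable), and the names `hash_to_group`, `tag_block`, `prove`, and `verify` are hypothetical.

```python
import hashlib

# Toy modulus (the Mersenne prime 2^127 - 1). Assumption for illustration
# only; a real scheme would use a pairing-friendly elliptic-curve group.
P = 2**127 - 1


def hash_to_group(name: str, index: int) -> int:
    """Map (file name, block index) to a nonzero element modulo P."""
    digest = hashlib.sha256(f"{name}|{index}".encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def tag_block(sk: int, u: int, name: str, index: int, block: int) -> int:
    """Tag for one block: (H(name, index) * u^block)^sk mod P."""
    return pow(hash_to_group(name, index) * pow(u, block, P) % P, sk, P)


def prove(blocks, tags, challenge):
    """Server side: fold the challenged blocks and tags into (mu, sigma).

    `challenge` is a list of (block index, random coefficient) pairs.
    """
    mu = sum(v * blocks[i] for i, v in challenge)
    sigma = 1
    for i, v in challenge:
        sigma = sigma * pow(tags[i], v, P) % P
    return mu, sigma


def verify(sk: int, u: int, name: str, challenge, mu: int, sigma: int) -> bool:
    """Check sigma == (prod H(name, i)^v_i * u^mu)^sk mod P."""
    expected = pow(u, mu, P)
    for i, v in challenge:
        expected = expected * pow(hash_to_group(name, i), v, P) % P
    return pow(expected, sk, P) == sigma


if __name__ == "__main__":
    sk, u, name = 65537, 5, "file-1"  # fixed toy key material, not secure
    blocks = [11, 22, 33, 44]         # file blocks encoded as integers
    tags = [tag_block(sk, u, name, i, m) for i, m in enumerate(blocks)]
    challenge = [(0, 3), (2, 7)]
    mu, sigma = prove(blocks, tags, challenge)
    print(verify(sk, u, name, challenge, mu, sigma))
```

The response consists of a single combined block value and a single combined tag regardless of how many blocks are challenged, which is the property that keeps the bandwidth and per-block storage overhead of tag-based schemes low.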

#### 6. Conclusion

Together with cloud-computing technology, heterogeneous wireless networking technology has become a critical infrastructure adopted by emerging communication networks. Owing to their convenience, low cost, and flexibility, cloud storage techniques have become widely used in remote services such as wireless data outsourcing and resource sharing. However, users lose physical control of their outsourced data, while the cloud storage service provider is not always trustworthy. Consequently, ensuring the integrity of remotely outsourced data becomes a serious concern for users selecting cloud storage services. In this paper, we present a collaborative multicloud data integrity auditing scheme based on BLS signatures and homomorphic tags. With the proposed scheme, users can audit their outsourced data in a one-round challenge-response process. Our scheme also supports dynamic data maintenance (e.g., data modification, insertion, and deletion). The theoretical analysis demonstrates the effectiveness of our scheme: the probability that a dishonest CSP server bypasses the check is negligible if the one-way hash function is collision-resistant and the computational Diffie-Hellman (CDH) assumption holds.

#### Notation

: | The th file sector of the th file block |

: | A large prime |

: | A cyclic group with order and generator |

: | is a nondegenerate bilinear map |

: | The system parameter and |

: | The number of CSPs |

: | The file outsourced by a user |

: | The file name |

: | Challenge created by users |

: | Hash function |

: | Public key |

: | Private key |

: | Security parameter |

: | The tag for the th copy of the file block |

: | The number of copies of the file block |

: | The number of times the file block has been updated |

: | The initial representation of the file block |

: | The th CSP server |

: | The number of file blocks to be sampled |

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (no. 61402029), the National Key R&D Program of China (no. 2017YFB0802400), the National Natural Science Foundation of China (no. 61379002, no. 61370190), Beijing Natural Science Foundation (no. 4162020), the Funding Project of Education Ministry for the Development of Liberal Arts and Social Sciences (no. 12YJAZH136), and the Funding Project of Shanghai Key Laboratory of Integrated Administration Technologies for Information Security (no. AGK201708).

#### References

- L. Yu, L. Chen, Z. Cai, H. Shen, Y. Liang, and Y. Pan, “Stochastic Load Balancing for Virtual Resource Management in Datacenters,” *IEEE Transactions on Cloud Computing*, pp. 1–14, 2016.
- L. Yu and Z. Cai, “Dynamic scaling of virtual clusters with bandwidth guarantee in cloud datacenters,” in *Proceedings of the 35th Annual IEEE International Conference on Computer Communications, IEEE INFOCOM 2016*, IEEE, San Francisco, CA, USA, April 2016.
- Y. Deswarte, J.-J. Quisquater, and A. Sadane, “Remote integrity checking,” *Integrity and Internal Control in Information Systems VI*, vol. 140, p. 11, 2004.
- D. L. Gazzoni Filho and P. S. L. M. Barreto, “Demonstrating data possession and uncheatable data transfer,” *Cryptology ePrint Archive*, vol. 2006, pp. 150–158, 2006.
- G. Ateniese, R. Burns, R. Curtmola et al., “Provable data possession at untrusted stores,” in *Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS '07)*, pp. 598–609, Virginia, Va, USA, November 2007.
- G. Ateniese, S. Kamara, and J. Katz, “Proofs of storage from homomorphic identification protocols,” in *Proceedings of International Conference on the Theory and Application of Cryptology and Information Security: Advances in Cryptology*, pp. 319–333, Springer-Verlag, London, UK, 2009.
- Q. Wang, C. Wang, K. Ren, W. Lou, and J. Li, “Enabling public auditability and data dynamics for storage security in cloud computing,” *IEEE Transactions on Parallel and Distributed Systems*, vol. 22, no. 5, pp. 847–859, 2011.
- Y. Zhu, H. Hu, G. Ahn, Y. Han, and S. Chen, “Collaborative Integrity Verification in Hybrid Clouds,” in *Proceedings of the 7th International Conference on Collaborative Computing: Networking, Applications and Worksharing*, pp. 191–200, Miami, Fla, USA, October 2011.
- Y. Zhu, H. Hu, G.-J. Ahn, and M. Yu, “Cooperative provable data possession for integrity verification in multicloud storage,” *IEEE Transactions on Parallel and Distributed Systems*, vol. 23, no. 12, pp. 2231–2244, 2012.
- Q. Zheng and S. Xu, “Fair and dynamic proofs of retrievability,” in *Proceedings of the first ACM conference on Data and application security and privacy*, pp. 237–248, ACM, Tempe, AZ, USA, February 2011.
- J. Mao, J. Cui, Y. Zhang, H. Ma, and J. Zhang, “Collaborative outsourced data integrity checking in multi-cloud environment,” in *Proceedings of the International Conference on Wireless Algorithms, Systems, and Applications (WASA '16)*, pp. 511–523, 2016.
- F. Sebe, A. Martinez-Balleste, Y. Deswarte, J. Domingo-Ferrer, and J. Quisquater, “Time-bounded remote file integrity checking,” Tech. Rep., July 2004, 04429, LAAS.
- F. Sebé, J. Domingo-Ferrer, A. Martínez-Ballesté, Y. Deswarte, and J. Quisquater, “Efficient remote data possession checking in critical information infrastructures,” *IEEE Transactions on Knowledge and Data Engineering*, vol. 20, no. 8, pp. 1034–1038, 2008.
- G. Ateniese, R. Burns, R. Curtmola et al., “Remote data checking using provable data possession,” *ACM Transactions on Information and System Security*, vol. 14, no. 1, article 12, 2011.
- S. Shoup, “On the security of a practical identification scheme,” *Journal of Cryptology*, vol. 12, no. 4, pp. 247–260, 1999.
- A. Juels and B. S. Kaliski Jr., “Pors: proofs of retrievability for large files,” in *Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS '07)*, pp. 584–597, ACM, Alexandria, VA, USA, November 2007.
- H. Shacham and B. Waters, “Compact proofs of retrievability,” *Journal of Cryptology. The Journal of the International Association for Cryptologic Research*, vol. 26, no. 3, pp. 442–483, 2013.
- K. D. Bowers, A. Juels, and A. Oprea, “Proofs of retrievability: theory and implementation,” in *Proceedings of the ACM Workshop on Cloud Computing Security (CCSW '09)*, pp. 43–53, ACM, Chicago, IL, USA, November 2009.
- K. D. Bowers, A. Juels, and A. Oprea, “HAIL: a high-availability and integrity layer for cloud storage,” in *Proceedings of the 16th ACM conference on Computer and Communications Security*, pp. 187–198, ACM, Chicago, IL, USA, November 2009.
- M. Naor and G. N. Rothblum, “The complexity of online memory checking,” in *Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2005*, pp. 573–582, Pittsburgh, PA, USA, October 2005.
- J. Xu and E.-C. Chang, “Towards efficient proofs of retrievability,” in *Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security, ASIACCS 2012*, pp. 79-80, Seoul, Republic of Korea, May 2012.
- M. A. Shah, R. Swaminathan, and M. Baker, “Privacy-preserving audit and extraction of digital contents,” *IACR Cryptology EPrint Archive*, vol. 2008, pp. 186–206, 2008.
- C. Wang, Q. Wang, K. Ren, and W. Lou, “Privacy-preserving public auditing for data storage security in cloud computing,” in *Proceedings of the 29th Annual IEEE International Conference on Computer Communications (INFOCOM 2010)*, pp. 1–9, Toronto, Canada, March 2010.
- G. Ateniese, R. D. Pietro, L. V. Mancini, and G. Tsudik, “Scalable and efficient provable data possession,” in *Proceedings of the 4th International Conference on Security and Privacy in Communication Networks (SecureComm '08)*, pp. 1–10, ACM, Istanbul, Turkey, September 2008.
- C. Wang, Q. Wang, K. Ren, and W. Lou, “Ensuring data storage security in cloud computing,” in *Proceedings of the 17th International Workshop on Quality of Service (IWQoS '09)*, pp. 1–9, IEEE, July 2009.
- C. C. Erway, A. Küpçü, C. Papamanthou, and R. Tamassia, “Dynamic provable data possession,” *ACM Transactions on Information and System Security (TISSEC)*, vol. 17, no. 4, article 15, 2015.
- D. Clarke, S. Devadas, M. Van Dijk, B. Gassend, and G. E. Suh, “Incremental multiset hash functions and their application to memory integrity checking,” in *Proceedings of the 9th International Conference on the Theory and Application of Cryptology and Information Security*, pp. 188–207, Springer, 2003.
- M. T. Goodrich, R. Tamassia, and A. Schwerin, “Implementation of an authenticated dictionary with skip lists and commutative hashing,” in *Proceedings of the DARPA Information Survivability Conference and Exposition II, DISCEX 2001*, pp. 68–82, USA, June 2001.
- D. Boneh, B. Lynn, and H. Shacham, “Short signatures from the Weil pairing,” in *Proceedings of the ASIACRYPT 2001*, pp. 514–532, Springer, Berlin, 2001.
- R. C. Merkle, “Protocols for public key cryptosystems,” in *Proceedings of the 1980 IEEE Symposium on Security and Privacy, SP 1980*, pp. 122–134, Oakland, Calif, USA, April 1980.
- J. Yuan and S. Yu, “Public integrity auditing for dynamic data sharing with multiuser modification,” *IEEE Transactions on Information Forensics and Security*, vol. 10, no. 8, pp. 1717–1726, 2015.
- Z. Hao, S. Zhong, and N. Yu, “A privacy-preserving remote data integrity checking protocol with data dynamics and public verifiability,” *IEEE Transactions on Knowledge and Data Engineering*, vol. 23, no. 9, pp. 1432–1437, 2011.
- Y. Yu, M. H. Au, Y. Mu et al., “Enhanced privacy of a remote data integrity-checking protocol for secure cloud storage,” *International Journal of Information Security*, vol. 14, no. 4, pp. 307–318, 2015.
- X. Fan, G. Yang, Y. Mu, and Y. Yu, “On indistinguishability in remote data integrity checking,” *The Computer Journal*, vol. 58, no. 4, pp. 823–830, 2013.
- H. Wang and Y. Zhang, “On the knowledge soundness of a cooperative provable data possession scheme in multicloud storage,” *IEEE Transactions on Parallel and Distributed Systems*, vol. 25, no. 1, pp. 264–267, 2014.
- H. Wang, “Identity-based distributed provable data possession in multi-cloud storage,” *IEEE Transactions on Services Computing*, 2014.
- Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, “Enabling public verifiability and data dynamics for storage security in cloud computing,” in *Proceedings of the 2009 European Conference on Research in Computer Security*, pp. 355–370, Springer, Saint-Malo, France, 2009.

#### Copyright

Copyright © 2017 Jian Mao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.