Abstract

As information technology develops, cloud storage has been widely adopted for keeping large volumes of data. A remote data auditing scheme enables a cloud user to confirm the integrity of her outsourced file by auditing the cloud storage, without downloading the file from the cloud. In view of the significant computational cost of the auditing process, the outsourced auditing model was proposed so that the user can delegate the heavy auditing task to a third party auditor (TPA). Although the first outsourced auditing scheme can protect against a malicious TPA, it gives the TPA read access to the user’s outsourced data, which is a potential risk to the user’s data privacy. In this paper, we introduce the notion of User Focus for outsourced auditing, which emphasizes the idea of letting the user dominate her own data. Based on User Focus, our proposed scheme not only prevents the user’s data from leaking to the TPA without depending on data encryption but also avoids the use of an additional independent random source, which is very difficult to obtain in practice. We also describe how to make our scheme support dynamic updates. According to the security analysis and experimental evaluations, our proposed scheme is provably secure and highly efficient.

1. Introduction

In recent years, cloud computing has triggered profound technology changes in the field of information industry, promoting the rapid development of IoT (Internet of things) and big data that have gained so much attention in our daily social and economic activities [1]. As one of the vital services of cloud computing, cloud storage offers many attractive advantages, including the location-independent resources, ubiquitous network access, and on-demand storage space [2], motivating more and more enterprises and individuals to outsource their own data to cloud. Benefiting from the big data that is gathered together into the cloud, all kinds of data-driven techniques, such as data mining [3, 4] and data signal processing [5, 6], can be deployed upon the cloud storage environment to play their effective roles for creating more information wealth.

Despite the fact that many potential gains can be achieved with cloud storage, there also exist new threats from the cloud user’s point of view. After user uploads all of her own data to cloud, one of the most pressing issues for user is how to verify the integrity of outsourced data stored at remote cloud side. Note that user loses the physical possession over her data in the context of data outsourcing, so it is clearly not feasible to directly apply traditional local data verification techniques that require access to the entire data, since neither user nor cloud servers can afford the heavy communication cost of frequently transferring all the outsourced data across a network to perform the data integrity verification. In this case, a variety of remote data auditing schemes [7–23] are designed, which can support the periodic integrity verifications upon outsourced data and simultaneously avoid transferring all these data, keeping the communication overhead to a minimum. In addition, as an important feature to further reduce the burden on the user, public auditing is first proposed by Ateniese et al. [7] and has been adopted extensively by the subsequent improved schemes [13–22], which enables a third party auditor (TPA) to audit cloud servers on behalf of user for ensuring the outsourced data integrity. However, introducing TPA gives rise to the following security risk.

Malicious TPA. TPA is considered as a trusted (or semitrusted, i.e., honest but curious) entity who cannot violate the auditing protocols in existing public auditing schemes [13–22]. But actually TPA might be untrusted [23]. Obviously, if an irresponsible TPA is lazy and does nothing, then for user there is no difference between entrusting a malicious TPA and discarding all prior public auditing schemes.

In order to protect against the above malicious TPA, Armknecht et al. [23] first presented the outsourced auditing scheme Fortress. Meanwhile, Fortress can protect the honest TPA from a malicious user, which is also a potential security issue that has not been considered in existing public auditing models. However, during the data preprocessing step, Fortress gives TPA read access to all of user’s outsourced data in cloud, which is a significant limitation for practical applications. On the one hand, since Fortress exposes all outsourced data to TPA, the only way in Fortress to protect data privacy against a curious TPA is to encrypt user’s files before outsourcing. Nevertheless, as shown in [13, 14], although data encryption is one approach to relieving the privacy concern in cloud storage, encryption alone is often not enough to prevent user’s data from leaking to TPA during the auditing process. On the other hand, in the era of big data, user’s outsourced data is a core business asset of the cloud service provider (CSP) [15], representing its wealth and future. Thus, CSP is selfish and has no incentive to reveal user’s outsourced data to TPA under any circumstance. Besides, user is also often reluctant to expose her data to a third party [24]. In this case, for the various online cloud storage applications (e.g., online videos) where user cannot encrypt her data prior to outsourcing and relies solely on CSP to protect against outsourced data leakage, it is clear that directly extending Fortress to these online applications is impractical, since revealing outsourced data to TPA is inevitable in Fortress. Therefore, it is necessary for an outsourced auditing scheme to include a privacy-preserving mechanism that is independent of data encryption to defend against a curious TPA.

Furthermore, Fortress argues that the challenges for auditing cannot depend on any of the three involved entities since they might be malicious. So Fortress requires the aid of an additional independent random source to produce secure challenges for protecting against any malicious entity. However, as shown in [25], in the cloud storage environment, requiring additional independent servers is already a strong assumption that is very difficult to meet in commercial contexts, and the similar assumption in Fortress of requiring an additional independent random source is equally difficult to meet.

To address the above problems, in this paper, we introduce a novel notion for the outsourced auditing model, User Focus, which emphasizes the idea of restoring the user’s data autonomy lost in the cloud storage setting. As shown in Sections 2.2 and 2.3, User Focus means letting user control all challenges throughout the process of outsourced auditing, avoiding the limitation of introducing an additional Bitcoin-based pseudo-random source for generating challenges as in the existing Fortress scheme. Furthermore, the user’s autonomy enabled by User Focus is also reflected in that the data only needs to be preprocessed by user herself, avoiding the unfavorable situation in Fortress that TPA must fetch all user’s data from cloud for initialization. By introducing User Focus, we propose an efficient and secure outsourced auditing scheme, which not only can defend against any malicious entity but also can protect user’s outsourced data from curious TPA without depending on data encryption. In general, the contributions of this paper can be summarized as follows.

By empowering user to play the leading role, we propose a formal User Focus outsourced auditing model along with the security definitions, which do not depend on any additional pseudo-random source. Our model extends the model of Fortress and takes into account the problem of preserving user’s data privacy when introducing TPA, which is not covered in Fortress.

Based on our proposed model, we design a concrete User Focus outsourced auditing scheme, the security of which is analyzed. Although the notion of User Focus empowers user to generate the challenges, it does not mean that a malicious user can do whatever she wants to do, since our scheme can also defend against the malicious user. In addition, under the environment of outsourced auditing, our scheme can enable user to predefine enough challenges for avoiding keeping user online all the time and also support the dynamic data updates by relying on the MHT authenticated data structure.

Our scheme applies the RSA public key cryptography rather than the symmetric cryptography technology as utilized in Fortress and thus enables TPA to complete his preparatory work for auditing without requiring access to user’s outsourced data at cloud side, which solves the significant performance problem faced by Fortress. We evaluate the run time of our scheme through concrete implementation when compared to Fortress. The evaluation results show that our solution is promising according to the improved performance.

The rest of this paper is organized as follows: Section 2 introduces the notion of User Focus and extends the outsourced auditing model along with the security definitions. In Section 3, we propose the User Focus outsourced auditing scheme, followed by the security analysis. In Section 4, the concrete algorithms for supporting dynamic updates are described. Section 5 gives the implementation results with the performance evaluation. Section 6 overviews the related work. Finally, Section 7 gives the concluding remark of this paper.

2. Problem Statement

In this section, we introduce the notion of User Focus, which should be an important requirement for user in the setting of storage outsourcing. Then we propose a formal User Focus outsourced auditing model and the corresponding security definitions.

2.1. Outsourced Auditing for Cloud Storage

Various remote data auditing schemes [7–23] give a cloud user the ability to confirm that her outsourced data is intact at the cloud, with the advantage that there is no need to fetch the data from cloud. The private auditing schemes [8, 12] only include two entities: a user and CSP, where user has to audit CSP regularly by herself to ascertain that CSP holds the stored data all the time. In view of user’s limited resources and the expensive computation cost incurred by the frequent audits, the public auditing schemes are proposed [13–22], which introduce a trusted TPA to perform the above auditing task. By employing TPA, user is relieved of the auditing burden. However, a trusted TPA is just an ideal hypothesis in the real world.

Based on the prior auditing solutions, the first outsourced auditing scheme [23] is proposed to defend against the malicious TPA. Compared with the public auditing model, although there are also three entities included in outsourced auditing setting, the major difference is that any one of the three entities might be dishonest, as described as follows:

(i) User might be a dishonest entity. She uploads her data to the cloud servers and remotely updates the outsourced data as necessary. However, user might maliciously deny that the auditing work performed by TPA against CSP is correct in order to claim compensation from TPA.

(ii) CSP might be a dishonest entity. It owns the cloud servers (so CSP and the cloud servers are not distinguished in our paper) and holds a large amount of resources to store and maintain outsourced data. CSP might try to cheat the auditor when data loss or data corruption occurs in cloud.

(iii) TPA might be a dishonest entity. He has the capabilities and expertise to regularly audit CSP on behalf of user, confirming the intactness of user’s data in cloud. But TPA might be lazy and fail to perform the auditing task required by user. In addition, TPA might be curious and try to deduce user’s outsourced data while performing his auditing task against CSP.

2.2. User Focus

“Customer Focus” is a marketing term that means keeping the customer in mind and giving the customer a good service experience. Clearly, “the customer is God” is not only a truth in business but also applies to our cloud storage environment, where the user is the target customer of CSP and of all kinds of auditing solutions. User experience is a determining factor in whether an auditing scheme is accepted in practice. If the user experience of a scheme is poor, then no matter how sophisticated its technology is, the scheme cannot achieve wide practical adoption.

Although various auditing schemes have been proposed to cover many critical issues, user experience has been ignored. On the one hand, every private auditing scheme is based on the design that the auditing protocol needs to be frequently executed by user, resulting in nonnegligible computation overhead at user side. Obviously, this is a terrible experience for a user who holds only limited resources, such as a smartphone. On the other hand, within public auditing schemes, the assumption of a “trusted” TPA is also a bad experience for user, since it is impractical for every ordinary user to find an idealistic “trusted” TPA.

Note that the purpose of remote data auditing is to provide user with a mechanism for confirming the security of her outsourced data. Here, user is regarded as the demander. Therefore, user ought to be put at the center when designing the auditing scheme, and her experience should not be ignored. For this reason, we introduce the notion of User Focus, which is defined as follows:

User is the initiator of the auditing protocols and controls all the challenges. She can receive timely notification of any exception affecting her outsourced data without frequently working herself, since both CSP and TPA have to frequently provide all proofs around user’s needs.

We stress that User Focus does not mean a malicious user can do whatever she wants. As shown in Section 3.2.5, although user controls all the challenges, our scheme can still ensure the security for honest TPA to defend against the malicious user.

Actually, the essential concept of cloud services is the centralized management of user’s data in cloud, which casts a psychological shadow on user. To save the storage space or availably access data without restriction of time and location, user is required to outsource her data to cloud, which means that user is no longer able to physically possess all her data. In other words, data outsourcing makes user lose the physical ownership and autonomy of her data, which is one of the main obstacles for the application and promotion of cloud storage.

When outsourcing data to cloud is inevitable, the notion of User Focus can help user step out of the psychological shadow brought by cloud storage, enabling user to enjoy cloud services more confidently. User Focus expresses the idea of letting user dominate her own data, which is realized in our model in the way that “the gain offsets the loss.” By gaining control of the challenges, user gets back the autonomy that is lost after data outsourcing and is able to proactively check the intactness of specific data just by adjusting challenges. This gives user the feeling that, as far as intactness assurance is concerned, there is really no difference between data stored on local disks and data outsourced to cloud, since everything is under control from the user’s perspective. User Focus is therefore an attractive property for user. Especially when our proposed scheme is implemented as a cloud service and CSP hopes that this service will be broadly accepted by potential customers, User Focus can be a compelling feature for persuading customers to try the service.

2.3. User Focus Outsourced Auditing Model

Now, we begin with the description of User Focus outsourced auditing model, as shown in Figure 1. To avoid requiring user to interact online during TPA’s frequent auditing against CSP, user pregenerates enough challenges to support running the auditing protocol for a long time. Since the size of a challenge can be very small (e.g., only 88 bytes for a challenge as shown in Section 5), all these pregenerated challenges can be stored in user’s email box (e.g., only 8.5 MB email box memory is required for storing 100,000 pregenerated challenges). In this case, after user uploads her data to cloud and delegates the auditing work to TPA, based on the built-in timer of email box, each challenge will be periodically issued from user’s email box to automatically trigger TPA’s auditing against CSP without involving user herself. Furthermore, TPA must produce the corresponding log when he finishes each auditing against CSP. Based on the contract established by three entities, TPA has to immediately inform user (e.g., by giving user a phone call) as soon as any exceptional situation about user’s outsourced data is detected. If TPA is lazy and hence does not find out the data corruption happening upon the challenged data blocks, once user launches her auditing to TPA by checking TPA’s logs, the lazy TPA will be identified with deterministic evidence. Finally, when all the pregenerated challenges are exhausted, user will add new challenges to her email box. However, note that adding challenges and auditing TPA’s logs are only rarely executed by user, so user can go offline most of the time throughout our model.

In contrast to the existing outsourced auditing model of [23], one major difference in our model is that the notion of User Focus is introduced, enabling user to play the leading role over her outsourced data with minimal effort. So user is the only one who possesses the additional secret key, besides a signing key pair, to preprocess the data. Relying on a modern legal society that honors the spirit of contract, our model ensures that no honest entity will be wronged and that any malicious entity can be caught. More specifically, User Focus outsourced auditing model consists of five protocols Setup, Preproccess, AuditCSP, AuditTPA, and IdentifyMalice, which are described as follows.

(1) The Setup Protocol. For each involved entity, this randomized protocol produces a public-private key pair used for signing subsequently. For simplicity, we always imply that every entity uses as inputs its own private key and the public keys of the other entities. Moreover, user randomly generates the secret key , which will be used to preprocess the data before uploading it to cloud servers.

(2) The Preproccess Protocol. This randomized protocol, launched by user, takes as input user’s secret key and a file owned by user. The output marks the outsourced data that will be uploaded to CSP. Observe that should include not only the file , but also some metadata . The role of metadata is reflected in three aspects which (i) enables TPA to frequently verify the response from CSP, (ii) prevents TPA from deducing any data information of the file for privacy-preserving, and (iii) enables user to effectively audit the log files produced by TPA about his past auditing work. Formally, the following holds:

Furthermore, three entities need to agree on the financial contract that defines their respective liabilities and liquidated damage, as well as the system parameters set that will be used throughout the outsourced auditing scheme. Let denote that both entities and establish agreement on the data . Then the contract consists of three agreements as follows. Formally,

If all the above agreements succeed, the Preproccess protocol runs to completion, meaning that is outsourced and the contract is established by three entities. At last, user deletes from local.

(3) The AuditCSP Protocol. This protocol is launched by a challenge , which is generated from user and sent to both CSP and TPA with user’s signature. After receiving , CSP takes as input and TPA takes as input, running this protocol, respectively. At the end, TPA must produce a binary value that expresses whether CSP’s running results pass his audit or not. If , TPA must immediately report this exception to user. In addition, the auditing corresponding to must be generated by TPA for each challenge. Formally, the following holds:

If , indicating that CSP passes TPA’s audit for the current challenge, so user will not receive the emergency report from TPA. Then after a certain time interval, this AuditCSP protocol will be repeated again by a new challenge.

(4) The AuditTPA Protocol. As long as user wants to check if TPA correctly executes AuditCSP protocol for all past challenges, the AuditTPA protocol can be launched by user. This protocol is a deterministic algorithm with one-time process, which takes as input user’s secret key , the metadata , and the log files stored by TPA. The output for user is a binary value that is either or . indicates that TPA is honest. Otherwise, TPA is malicious. Formally,

In practice, AuditTPA will be executed much less frequently. If TPA is lazy when data loss occurs, once user executes AuditTPA, the output must be , which can be taken as evidence for user to claim compensations from TPA based on their financial contract. Since AuditTPA can be launched without requiring the real-time output, user has enough time to execute this protocol. To avoid the breach of contract and financial penalties, TPA has to be honest all the time.

(5) The IdentifyMalice Protocol. If all entities are honest, this protocol is not necessary. This protocol will not be invoked until any dishonest entity exists. IdentifyMalice is a deterministic algorithm that models the scenario of “forensic debate.” Various proofs about all the malicious activities can be produced during the operations of previous protocols in our model. As a way for the honest entity to claim compensations or prove its innocence, all the proofs will be presented with nonrepudiation and taken as input by the last IdentifyMalice.

The output is a binary value indicating whether the entity is honest or malicious. Obviously, the existence of IdentifyMalice provides deterrence against all possible misconduct. Since any malicious activity is bound to be identified, every involved entity has to be honest to avoid financial penalties or even legal liability.

2.4. Security Definitions

For security, a User Focus outsourced auditing scheme should be correct and sound, simultaneously, supporting privacy protection against a curious TPA. The definitions are explained as below.

Definition 1. The correctness of User Focus outsourced auditing scheme requires that if all the involved entities are honest, for all keys output by Setup protocol and for all files and the corresponding outsourced data output by Preproccess protocol, TPA always accepts with probability 1 at the end of each AuditCSP protocol run and likewise the user at the end of each AuditTPA protocol. In addition, correctness can be viewed from another aspect: IdentifyMalice protocol will never (i.e., with probability 0) be invoked, since if everything goes well then none of the entities will abort and invoke the last protocol.

To define the soundness of our model, we start by the notion of liability in [23]. In contrast to the traditional public auditing security models, providing security to TPA is also an important aspect that should not be ignored in outsourced auditing setting. Especially in situations where some problems occur (e.g., the outsourced data has been lost in CSP), TPA must not be blamed as long as he can prove that he has fulfilled the auditing obligation according to the protocols. Namely, by providing his log files, if TPA convinces user with a high probability that he is taking over guarantees about the service quality of user’s outsourced data, then TPA is actually honest to finish his auditing task for each past challenge against CSP. This is the definition of liability which is formalized in [23]. Although the security model of [23] splits the definition of soundness into two parts, extractability and liability, we stress that the soundness of our User Focus outsourced auditing model can be defined as a whole, since liability becomes an intrinsic property and implicitly exists in our model without having to be proved for each instantiation.

More precisely, as shown in Section 2.3, throughout the AuditCSP protocol, TPA does not use any secret key when he performs his auditing work, which means that all operations conducted by TPA are deterministic in AuditCSP protocol. Firstly, the metadata and the public parameters set are confirmed by the contract after Preproccess protocol is over. Secondly, the challenge is individually generated by user, which is incontestable for user because of her signature. Therefore, TPA has no influence on any of the parameters that are applied to complete his auditing work. The IdentifyMalice protocol can reconstruct the results that should have occurred for TPA during each execution of AuditCSP protocol and compare these results with TPA’s logs to judge whether TPA is lazy or not. Clearly, these results are destined to be objective and incontestable proofs once the AuditCSP protocol is executed. For this reason, no matter whether any malicious entity exists, a User Focus outsourced auditing scheme is bound to produce the objective proofs to protect an honest TPA who correctly performs his work, while guaranteeing that any lazy TPA can be identified as long as TPA’s logs do not conform to the objective proofs as above.
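The argument above, that a deterministic audit can be replayed and compared with the logs, can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: the function names `expected_decision` and `tpa_is_honest`, the log layout, and the per-block hash check are assumptions made for illustration only. In particular, the real scheme verifies an aggregated proof and never shows raw blocks to TPA.

```python
import hashlib

def expected_decision(metadata, challenge, response):
    """Deterministic stand-in for TPA's check (assumption: a plain hash check).

    No secret key is involved, so anyone holding the same inputs
    recomputes exactly the same accept/reject decision.
    """
    return all(
        hashlib.sha256(response[i]).hexdigest() == metadata[i]
        for i in challenge
    )

def tpa_is_honest(metadata, log):
    """Replay every logged audit and compare with the decision TPA recorded."""
    return all(
        expected_decision(metadata, challenge, response) == recorded
        for challenge, response, recorded in log
    )

# Toy data: three blocks and the metadata fixed by the contract.
blocks = {0: b"alpha", 1: b"beta", 2: b"gamma"}
metadata = {i: hashlib.sha256(b).hexdigest() for i, b in blocks.items()}

good_response = {0: b"alpha", 2: b"gamma"}
corrupted_response = {0: b"alpha", 2: b"damaged"}

# An honest TPA logs the true outcome of each audit.
honest_log = [([0, 2], good_response, True),
              ([0, 2], corrupted_response, False)]
# A lazy TPA logs "pass" without actually checking.
lazy_log = [([0, 2], corrupted_response, True)]

assert tpa_is_honest(metadata, honest_log)
assert not tpa_is_honest(metadata, lazy_log)
```

The design point is that the decision depends only on inputs fixed by the contract or signed by user, so a mismatch between the replayed decision and the log is attributable to TPA alone.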

In conclusion, our model is naturally sound for TPA in the case that TPA is honest, since honest TPA has been implicitly protected from the malicious user. So in the following we can define the soundness for User Focus outsourced auditing scheme by excluding the situation that user is corrupted.

Now, we start by using a game similar to that employed in [23]. Firstly, consider an adversary who corrupts one of the entities. The adversary takes over the role of the corrupted entity and plays the training game with an environment as follows:

(1) By running the Setup protocol, the environment generates all the public-private keys for the involved entities and the secret key for user. The adversary obtains all the keys of the corrupted entity.

(2) To learn the various outputs provided to the corrupted entity, the adversary adaptively interacts with the environment, which plays the role of the honest entity. Given any file, the adversary can request the execution of Preproccess protocol, which outputs the outsourced data. Afterwards, the adversary can request the executions of AuditCSP and AuditTPA protocols upon any such outsourced data stored in cloud. All these protocol executions can be arbitrarily interleaved with each other.

(3) Finally, after finishing all the learning, the adversary outputs a challenge file that should be stored and the description of a cheating prover corresponding to this file.

The cheating prover is ε-admissible if the probability that the honest entity interacts with it without aborting the protocols is at least ε. Here, the probability is over the coins of the honest and malicious entities. To formalize the definition of soundness, we adopt the notion of an extractor algorithm, which takes as inputs all the information provided to the honest entity and the description of the cheating prover, and outputs the file. In particular, the extractor is given non-black-box access to the cheating prover and can rewind it. Formally we have the following.

Definition 2. A User Focus outsourced auditing scheme is ε-sound with respect to any corrupted entity if there exists an extractor algorithm such that, for any honest entity establishing the contract on a file and interacting with the ε-admissible cheating prover, if the whole system remains running without being aborted by the honest entity, the extractor can recover the file with overwhelming probability.

Based on the above definition, we can informally say that if the ε-admissible prover convinces the honest entity with a sufficient level of probability, then the challenged data is actually intact and extractable.

Moreover, since the user is often reluctant to reveal her outsourced data to TPA, privacy protection is necessary, as defined in the following.

Definition 3. A privacy-preserving User Focus outsourced auditing scheme requires that, no matter what running results are obtained by TPA during the operation of this scheme, TPA cannot deduce any private information about user’s outsourced file, except possibly with negligible probability.

3. The Proposed Scheme

In this section, based on the proposed model, we present a concrete User Focus outsourced auditing scheme, the security of which is analyzed according to our security definitions.

3.1. Preliminaries

Given that the user wants to outsource her file to CSP. The file can be seen as a set of blocks: . We first introduce several necessary techniques, which are important under the environment of remote data auditing.

Blockless Verification and Homomorphic Tags [7]. Blockless verification technique enables verifier to audit whether the cloud servers possess certain file blocks, without having to retrieve these actual file blocks from the cloud. Blockless verification is essential since it is expensive and impracticable to retrieve all specified file blocks for frequent audits. Homomorphic tags can meet the requirement of blockless verification. Given two file blocks along with their corresponding homomorphic tags, the two tags can be combined into a value that corresponds to the tag of the sum of the two blocks.
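The homomorphic property can be made concrete with a toy example. The sketch below assumes a deliberately simplified tag, namely a public base raised to the block value modulo a small composite. This is not the tag construction of the proposed scheme, which additionally involves user’s secret values. The names `N`, `g`, and `tag` are hypothetical, and the parameters are far too small for real use.

```python
# Toy illustration of a homomorphic tag: tag(m) = g^m mod N.
# Assumption: this simplified tag is NOT the scheme's construction; it only
# shows the algebraic property that blockless verification relies on.

N = 61 * 53   # toy modulus (a real deployment would use 2048 bits or more)
g = 2         # toy public base, coprime to N

def tag(block: int) -> int:
    """Return the toy homomorphic tag of an integer-encoded block."""
    return pow(g, block, N)

m1, m2 = 1234, 5678
t1, t2 = tag(m1), tag(m2)

# Combining the two tags by multiplication yields the tag of the block sum,
# so a verifier can check a sum of blocks without seeing each block.
combined = (t1 * t2) % N
assert combined == tag(m1 + m2)
```

Because the combination works on tags alone, a prover can return one aggregated value for many challenged blocks instead of the blocks themselves.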

Merkle Hash Tree (MHT) [26]. It is a kind of authenticated data structure, which can be used to efficiently and securely prove that, in a given set of elements, the value of each element and the order of all elements are both undamaged. Based on a collision-resistant cryptographic hash function, MHT can be constructed as a binary tree by the rule that the value of each parent node is the hash of the concatenation of its left and right children, where the leaf nodes are the hash values of authentic file blocks. In order to realize the authentication of each leaf node in MHT, Leaves Auxiliary Authentication Information (LAAI) is defined as the siblings of the nodes on the path from the leaf nodes to the MHT root. An example of MHT is shown in Figure 2. Assume that auditor possesses the root and wants to authenticate the appointed leaf nodes provided by the adversary. According to the LAAI given by the adversary, auditor can recompute the intermediate nodes in order and finally computes a candidate root. If the candidate root equals the root that auditor possesses, auditor accepts all the appointed leaf nodes; otherwise, it rejects them. In our paper, the order of leaf nodes in MHT is treated from left to right. By following the given order, any leaf node can be located and authenticated by the root with its corresponding LAAI. Obviously, all the leaves of MHT can be authenticated just by the root.
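The root computation and the leaf authentication described above can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 stands in for the collision-resistant hash, a parent is the hash of its left child concatenated with its right child, and an odd-sized level duplicates its last node. The duplication rule and the function names `merkle_root`, `auth_path`, and `verify_leaf` are choices made for this sketch, not details taken from the scheme.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Collision-resistant hash (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Compute the MHT root by hashing pairs level by level, left to right."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # sketch convention for odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def auth_path(leaves, index):
    """Return the sibling nodes on the path from leaf `index` up to the root.

    Each entry is (sibling_hash, sibling_is_left), which is the auxiliary
    authentication information needed to recompute the root.
    """
    path, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def verify_leaf(leaf, path, root):
    """Recompute a candidate root from a leaf and its path, then compare."""
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

blocks = [b"block-0", b"block-1", b"block-2", b"block-3"]
leaves = [h(b) for b in blocks]   # leaf nodes are hashes of the file blocks
root = merkle_root(leaves)

path = auth_path(leaves, 2)
assert verify_leaf(leaves[2], path, root)
assert not verify_leaf(h(b"tampered"), path, root)
```

The verifier only needs the root and a path whose length is logarithmic in the number of blocks, which is why holding the root alone suffices to authenticate every leaf.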

3.2. Scheme Construction

Now, we detail the specifications of User Focus outsourced auditing scheme, comprising five protocols Setup, Preproccess, AuditCSP, AuditTPA, and IdentifyMalice.

3.2.1. Design of Setup

For each entity , a corresponding public-private key pair is generated for signature by executing the signature key generation algorithm . Based on these key pairs, the secure communication links between any two entities can be established and authenticated in terms of the Transport Layer Security protocols.

In order to construct the homomorphic tags, the user relies on the RSA algorithm to output the public key , where is the product of two large primes and ; is a random large prime chosen by user. Let ; then user obtains the private key . The will be sent to TPA, and is kept by user as her secret.
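The key generation step can be sketched with textbook RSA. The sketch assumes that the private exponent is the inverse of the public prime exponent modulo (p − 1)(q − 1). The symbol names `p`, `q`, `N`, `e`, `d` are the conventional ones and are assumptions here, and the primes are toy values chosen only so the example runs instantly.

```python
from math import gcd

# Toy primes (assumption: illustrative sizes only; real primes are 1024+ bits).
p, q = 1009, 1013
N = p * q                    # public modulus
phi = (p - 1) * (q - 1)

e = 65537                    # a prime public exponent chosen by the user
assert gcd(e, phi) == 1      # e must be invertible modulo phi

d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

# Sanity check: applying d and then e returns the original value.
x = 424242
assert pow(pow(x, d, N), e, N) == x
```

Only the holder of `d` can produce a value that verifies under the public pair, which is what allows TPA to check tags with the public key while user alone can create them.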

3.2.2. Design of Preproccess

This protocol, comprising the following four phases, is initiated and dominated by user who holds the file .

(1) Generating Metadata . Let be a full-domain hash. For each block , user calculates as the corresponding leaf node of MHT. When the leaf nodes of all the blocks are generated, user can compute the of MHT by iteratively hashing as shown in Section 3.1. Note that user just needs to compute the , so it is not necessary for user to construct and store the whole MHT .

Then user randomly chooses a public element and a secret element and computes . In terms of her secret key , for each block , user computes the corresponding tags . Let denote the processed file, described as follows:

(2) Uploading to CSP. After receiving from user, for each , CSP recomputes and compares with the corresponding . If , which signifies that there exists inconsistent data within , thus will be rejected by CSP at once. If passes CSP’s verification, based on all the , CSP can reconstruct the whole MHT, denoted by with . After this, CSP uses its signature to respond to user as follows:

Upon receiving , user verifies CSP’s signature and then checks whether is equal to stored in her local. If , indicating that the same MHT has been constructed at CSP side, then user stores and sends out her acceptance to CSP. Otherwise, user aborts the protocol.

(3) Authorizing TPA’s Auditing. When receiving the acceptance from user, CSP immediately sends all leaf nodes to TPA. Similarly, TPA also reconstructs the whole MHT, denoted by with the and sends his response to user:

User executes the same verification process for as the way for . If , is accepted and stored in local by user. At last, user sends the public key to , meaning that the authorization of auditing CSP is provided to TPA.

It is worth mentioning that, in the above process, we let CSP transfer all leaf nodes of the MHT to TPA on behalf of user. This reduces the bandwidth cost of user, who might have limited bandwidth resources. We stress that user need not worry about any security issue arising from this arrangement. If a malicious CSP tries to provide any false leaf node to TPA, then, as long as the hash function remains collision-resistant, the root reconstructed by TPA will not be equal to the root held at user side with overwhelming probability. Such a mismatch will be detected by user, who will terminate the auditing authorization immediately.

(4) Agreeing on the Contract. A constant number needs to be assigned by user for denoting the amount of challenged blocks during each TPA’s auditing to CSP. Let denote the set of system parameters, which is signed by user and sent to both CSP and TPA. Afterwards, CSP and TPA must respond to user with their own signatures upon , respectively, which means that all entities sign the contract for reaching an agreement. Note that CSP and TPA will store user’s signature in their locals, respectively, to protect themselves from a dishonest user.

Finally, user deletes her local copy of the file to complete this protocol. Unlike the corresponding Store protocol of [23], which enables TPA to access the whole of user’s outsourced data in cloud, our Preproccess protocol lets TPA obtain from CSP only the hash values of the file blocks, which helps protect user’s data privacy from a curious TPA even though the outsourced data is not encrypted.

3.2.3. Design of AuditCSP

When each pregenerated challenge is released from user’s email box, this protocol will be launched, described as follows.

(1) User’s Challenge. Given a specific point-in-time , the form of corresponding challenge is defined as     , where is the key for public pseudo-random function , is the key for public pseudo-random permutation , and is user’s signature.(i): .(ii): .

Both and can be randomly or selectively generated by user. The meaning of selectively lies in that user can cover any specific block in a challenge by selectively generating the particular .

Let denote keyed with key applied on the input . According to public and , after receiving the challenge and checking user’s signature, for each , CSP can calculate the locations of challenged blocks by and the associated values such as . Here, we use to denote the -elements set , which can be generated based on at time .
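
The expansion of a challenge into block locations and associated values can be sketched as follows. This is only an illustration under stated assumptions: HMAC-SHA256 stands in for the public pseudo-random function, the pseudo-random permutation is realized by sorting block indices by their keyed PRF output, and the names `prf`, `prp`, and `expand_challenge` as well as the example keys are hypothetical.

```python
import hashlib
import hmac

def prf(key: bytes, x: int) -> int:
    """Keyed PRF: HMAC-SHA256 truncated to 64 bits (a stand-in choice)."""
    mac = hmac.new(key, x.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(mac[:8], "big")

def prp(key: bytes, n: int) -> list:
    """Keyed permutation of range(n): sort the indices by their PRF output."""
    return sorted(range(n), key=lambda i: (prf(key, i), i))

def expand_challenge(k_prf: bytes, k_prp: bytes, n_blocks: int, c: int) -> list:
    """Return c (location, coefficient) pairs derived from the two keys."""
    order = prp(k_prp, n_blocks)
    return [(order[j], prf(k_prf, j)) for j in range(c)]

# Anyone who knows the two keys of a challenge derives the same pairs.
pairs = expand_challenge(b"prf-key-at-time-t", b"prp-key-at-time-t",
                         n_blocks=1000, c=10)
```

Because the derivation is deterministic in the two keys, CSP, TPA, and user can each recompute the same set independently, which is what the later protocols rely on.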

(2) CSP’s Proof. After computing , by randomly choosing a secret interfering element corresponding to the challenge , CSP computes and signs its proof as below, where both and are the public parameters:

Then CSP transfers its proof to TPA. Since the combined block within the proof is roughly the size of a single file block, CSP incurs a much smaller communication cost than it would by transferring all the challenged blocks. The key attribute of the proof is that no private data can be deduced from it and leaked to TPA, as will be analyzed in Section 3.3.

(3) TPA’s Verification. Upon receiving challenge from user’s email box, TPA can compute by the same way as CSP. Then TPA calculates his auditing parameter as below, where all the hash values can be found from the leaf nodes of the MHT stored at TPA side.

Subsequently, TPA parses CSP’s proof to obtain , and . Now TPA can use user’s public key to check whether

If so, TPA outputs . Otherwise, according to the signed contract, TPA must report to user that CSP does not pass the integrity auditing upon the challenged blocks. It is clear that the interfering element , generated and kept secretly by CSP, has no influence on the correctness of TPA’s verification. But plays an important effect on the privacy protection of outsourced data, as will be discussed in Section 3.3.
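
To illustrate the kind of check TPA performs, the following sketch shows a generic RSA-based homomorphic tag verification in which the verifier needs only block hashes and the public key. It is not the paper's verification equation: the tag form, the generator `G`, the toy primes, and all function names are assumptions of this sketch, and the secret interfering element is deliberately omitted, so unlike the actual scheme this simplified version reveals the combined block to the verifier.

```python
import hashlib

# Toy RSA parameters built from two well-known primes (far too small for real use).
P, Q = 1_000_000_007, 998_244_353
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)
G = 3                                  # public element (assumed)

def hash_to_int(block: bytes) -> int:
    return int.from_bytes(hashlib.sha256(block).digest(), "big") % N

def block_to_int(block: bytes) -> int:
    return int.from_bytes(block, "big")

def make_tag(block: bytes) -> int:
    """User side: tag = (H(m) * G^m)^D mod N, which requires the private key."""
    return pow(hash_to_int(block) * pow(G, block_to_int(block), N) % N, D, N)

def prove(blocks, tags, challenge):
    """CSP side: aggregate the challenged tags and blocks into one short proof."""
    mu = sum(a * block_to_int(blocks[i]) for i, a in challenge)
    t = 1
    for i, a in challenge:
        t = t * pow(tags[i], a, N) % N
    return t, mu

def verify(leaf_hashes, challenge, proof) -> bool:
    """TPA side: uses only the block hashes and the public key (N, E)."""
    t, mu = proof
    expected = pow(G, mu, N)
    for i, a in challenge:
        expected = expected * pow(leaf_hashes[i], a, N) % N
    return pow(t, E, N) == expected

blocks = [b"block-%d" % i for i in range(4)]
tags = [make_tag(b) for b in blocks]
leaf_hashes = [hash_to_int(b) for b in blocks]   # what TPA stores
challenge = [(0, 3), (1, 5), (3, 7)]             # (location, coefficient)
ok = verify(leaf_hashes, challenge, prove(blocks, tags, challenge))
```

The check succeeds because raising the aggregated tag to the public exponent cancels the private exponent, leaving the product of the weighted block hashes times `G` raised to the combined block.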

In addition, to prove that the auditing work related to the challenge has been performed correctly, TPA must generate the corresponding , which records the necessary information for the auditing against TPA launched by user.

3.2.4. Design of AuditTPA

In our scheme, we stress that TPA has to inform user as soon as anything goes wrong during his auditing. If TPA is lazy and does nothing, then he cannot detect corruption of the outsourced data in time, and user will discover this misbehavior when executing the AuditTPA protocol upon a batch of TPA’s past logs.

Specifically, user randomly selects a -elements point-in-time set , where each indicates the time when a past challenge is issued from user’s email box. User sends to TPA. According to , TPA parses his local logs to obtain the and accumulates these values into the corresponding single value, respectively:

Note that, for each , TPA can also read the key from within local and then compute the locations of all challenged blocks related to the time set by using the pseudo-random permutation Let denote the set of all specified locations. Since some blocks might be challenged more than once, the repetitive locations will be taken away to ensure that every location in is unique.

For each , TPA reads from his local MHT all the corresponding leaf nodes that constitute the set . In addition, according to , TPA generates the corresponding Leaves Auxiliary Authentication Information, (as shown in Section 3.1). Now, TPA constructs and signs his proof   ≔  , which is sent to user as TPA’s response.

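Upon receiving TPA’s proof and checking his signature, user first uses the returned leaf nodes and their LAAI to compute the root of the MHT and then compares it with the root held by herself. User accepts the returned leaf nodes only when the two roots are equal, which indicates that all the returned hash values pass user’s verification. If the roots differ, a malicious TPA can be identified immediately.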

For each , by accessing all the challenges stored in her email box, user can use the same approach as CSP/TPA to regenerate each by herself. Then user computes the following parameter in terms of all hash values within the accepted :

By comparing with given by TPA within the proof , if , there is no doubt that TPA is indolent and thus user outputs . If , with her secret element , user further checks whether

If this check fails, user outputs , confirming that CSP tries to misconduct by submitting the incompatible proof to TPA. Otherwise, based on her secret key , user finally checks whether

If the last check fails, user learns that (i) TPA has been negligent in his past auditing work and (ii) the outsourced data has been damaged at cloud side. Note that if there is something wrong with the challenged file blocks stored at CSP, the corresponding auditing by TPA is bound to fail, and therefore TPA should report to user immediately according to the AuditCSP protocol. So the execution of the AuditTPA protocol launched by user means that TPA has implicitly committed to user that all his past auditing against CSP was conscientious. In this context, once user’s last check fails, TPA is malicious without any doubt, and user can take actions such as claiming compensation from TPA and CSP in terms of their signed contract.

In practice, as long as user’s check upon (15) passes, user can delete all past challenges from her email box. Meanwhile, an authorization that allows TPA to delete all the corresponding logs, signed by user herself, is generated and sent to TPA. In this case, both user and TPA need only constant storage space for the pregenerated challenges and the corresponding logs, respectively. In this way the AuditTPA protocol also gains a significant performance improvement, since the total number of points in time within the challenged time set can be kept under a reasonable upper limit.

3.2.5. Design of IdentifyMalice

Obviously, the AuditCSP protocol produces enough proofs to identify a malicious CSP that tries to conceal the corruption of outsourced data, and the AuditTPA protocol likewise produces undeniable proofs to identify a malicious TPA that is lazy or does not perform the required auditing tasks correctly. Now, we show how our scheme can defend against a malicious user who wants to put an honest TPA in the wrong by purposely denying TPA’s correct auditing work.

As shown in the AuditCSP protocol, firstly, the validity of TPA’s auditing parameter is undeniable for the malicious user. Since all the elements used for computing are incontestable from the user’s perspective, such as all the derived from the challenge signed by user herself, all the hash values can be authenticated based on the public MHT , and the element is public. Secondly, the required auditing work for TPA is to use and CSP’s proof to check (11), where the authenticity of can be verified by verifying CSP’s signature, and the other two elements and within (11) are also public. Therefore, in case of dispute or litigation, honest TPA can reveal his own MHT and the (or the corresponding authorization of allowing TPA to delete the past logs, signed by user, as described in the AuditTPA protocol). In this case, provided that TPA always performs his auditing work correctly, then the malicious user has no chance to frame TPA up, since after authenticating by the public , anyone can use the open to reconstruct and check (11) again to prove that honest TPA has actually fulfilled his auditing responsibility.

3.3. Security Analysis

According to the definitions in Section 2.4, we analyze the security of the proposed scheme in this section. The correctness of (11) within the AuditCSP protocol can be elaborated as follows:

Furthermore, the correctness of (15) within the AuditTPA protocol is elaborated as follows:

Soundness. As shown in Section 2.4, our scheme is naturally sound for an honest TPA because of the implicit protection. Now, we prove soundness for the other two cases, where either user or CSP is honest. The soundness of our scheme for an honest user is based on the Knowledge of Exponent Assumption (KEA), which was introduced by Damgård [27], formalized by Bellare and Palacio [28], and shown to hold in the generic group model by Abe and Fehr [29]. Formally, KEA is described with an extractor as follows.

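KEA. For any adversary $\mathcal{A}$ that takes input $N$, $g$, $g^{s}$ and returns $(C, Y)$ with $Y = C^{s}$, there exists an “extractor” $\bar{\mathcal{A}}$ which, given the same inputs as $\mathcal{A}$, returns $x$ such that $C = g^{x}$.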

Case  1. Honest user interacts with adversary without aborting the protocol run. In this case, note that and are the public parameters taken as input by adversary , who cannot deduce any information about user’s secret element since the discrete logarithm problem is hard. We define as . As described in the AuditTPA protocol, user will receive the returned from adversary . So the following two elements and satisfying that are the implicit output of adversary :

Based on the KEA, the extractor is able to extract . Assume that the time set has elements; then is a linear equation built upon blocks set , since user can control the challenge , which means that can be selectively determined by user. By depending on the honest user to generate independent coefficients for the repetitive executions of AuditTPA protocol upon the same file blocks set , extractor can obtain a system of independent linear equations for . At last, extractor can recover each file block of just by solving the system of these linear equations. In terms of Definition 2 in Section 2.4, our scheme is -sound.

Case  2. Honest CSP interacts with adversary . At this point, the same security argument as in [23] can be adopted here. Throughout our scheme, as long as user’s outsourced data can pass CSP’s verification during the execution of the Preproccess protocol, then honest CSP must correctly possess and follow the subsequent protocols. So it is clear that the extractor can always rely on the honest CSP to extract the file from for defending against any adversary . This completes the proof for soundness.

Privacy-Preserving against Curious TPA. Throughout our scheme, all that TPA obtains are the MHT and CSP’s proofs, which are derived from the executions of the Preproccess and AuditCSP protocols, respectively. With respect to the MHT, every leaf node is the hash value of a file block. So, based on the preimage resistance of the cryptographic hash function, no private information about an outsourced data block can be deduced from the leaf nodes of the MHT.

Now, we argue that TPA cannot deduce user’s data information from the proof output by CSP. Due to the intractability of discrete logarithm, for each , all the secret random interfering elements , generated and stored individually by CSP for the corresponding challenges , cannot be derived by TPA from and even though and are the public parameters. As a consequence, for any specified blocks set , although TPA, via a number of challenges, might gather a system of linear equations such as with the different values of and the independent coefficient , it is impossible for TPA to solve the system and deduce any data block since is unknown and will randomly change within each linear equation of the system.

With regard to , which can be expressed as

obviously, to deduce the privacy of data block , TPA has to compute the exponentiation value from , which is the same as the intractability of discrete logarithm, let alone which is the secret key stored by user and is unknown for TPA. Summing up, no matter how many running results collected by TPA, our scheme can always protect user’s data privacy from the curious TPA.

4. Dynamic Updates

Different from the outsourced auditing scheme of [23], our scheme constructs the homomorphic tag of each block without involving the block’s location. In this way, dynamic operations such as modification, insertion, and deletion can be realized by merely updating the targeted file block without affecting any others. For ease of description, we define a function that calculates the hash value of the MHT root from a leaf node and its corresponding LAAI, an example of which is described in Section 3.1. As for the insertion operation, by default the new block is inserted immediately after the appointed block location.

Now we present Algorithm 1 for modifying or inserting a block and Algorithm 2 for deleting an appointed block at cloud server side. The main idea of both algorithms is to let user control the update of the MHT root throughout the process of data dynamics, since neither CSP nor TPA can output the expected root hash value unless it has correctly performed the dynamic operation required by user. Both algorithms are launched by user, who first transmits to CSP the necessary update command, including the appointed location. After receiving user’s command, CSP must respond with the leaf node at that location and its corresponding LAAI. Note that user holds the original MHT root, which enables her to verify the authenticity of CSP’s response by checking whether this root is equal to the root recomputed from the returned leaf node and LAAI. Once this verification passes, user can apply the same computation to the LAAI and her update command to work out by herself what the new root would be after her update command is performed upon the MHT stored at CSP side. To obtain user’s authorization for actually executing the dynamic operation upon user’s outsourced file, CSP therefore has to update its MHT according to user’s command and output to user the same new root. Similarly, user sends the update command to TPA, and the same process of updating the MHT applies at TPA side. Obviously, both CSP and TPA have to behave well all the time; otherwise, their misconduct will be detected when the AuditCSP and AuditTPA protocols of our proposed outsourced auditing scheme are launched. Finally, user replaces her locally stored MHT root with the new root, meaning that the dynamic operation has been executed successfully.

Algorithm: = Modify_or_Insert_Block (, ).
Input: the appointed block location , and the new block .
Output: the new of the updated MHT.
User: compute ; ;
transmit to CSP;
CSP: compute
if then
if mark = modify then
;
else
;
end if
compute ;
send to user;
end if
User: compute ;
if then
compute ;
if then
authorize CSP to execute update operation;
end if
end if
CSP:  if mark = modify then
, , are replaced by , , in the outsourced file ;
update the MHT stored in cloud by recalculating all the nodes on the path from
the th leaf node to the ;
else
insert , into after , , respectively;
update the MHT by replacing with the th primordial leaf node that is transformed into
a parent node having the left-child leaf and the right-child leaf , and then recalculating all the
nodes on the path from to the ;
end if
User: transmit () to TPA;
TPA: according to the mark, update the MHT stored at TPA side by the same way as CSP;
send to user the root of the updated MHT , denoted by ;
User: if then
update the , stored in her local, with ;
end if
return ;
Algorithm: Delete_Block  .
Input: the location of the appointed block that will be deleted.
Output: the new of the updated MHT.
User: transmit to CSP;
CSP: based on , read and from its MHT ;
send to user;
User: compute ;
if then
authorize CSP to execute delete operation;
end if
CSP: if the sibling node of , denoted by , is a leaf node then
replace with the parent node of which is transformed into a leaf node, and then delete from MHT ;
else
replace with the parent node of , and delete from MHT ;
end if
update the MHT stored in cloud by recalculating all the nodes on the path from
to the new root of MHT , denoted by ;
delete , from the outsourced file , and send to user;
User: set ;
compute ;
if then
transmit () to TPA;
end if
TPA: update the MHT stored at TPA side by the same way as CSP;
send to user the root of the updated MHT , denoted by ;
User: if then
update the , stored in her local, with ;
end if
return ;
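
The root-control idea behind the modification case can be sketched as follows. This is a simplified illustration, not a transcription of the algorithms above: it covers only modification (insertion and deletion also change the tree shape), it assumes SHA-256 and a power-of-two number of leaves, and the helper names and the `user_check_and_predict` function are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """All tree levels from leaves to root (leaf count assumed a power of two)."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def auth_path(levels, index):
    """Siblings from the leaf up to the root, each flagged left or right."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def root_from_path(leaf, path):
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node

def user_check_and_predict(stored_root, old_leaf, path, new_leaf):
    """User side: authenticate the server's answer, then predict the new root."""
    if root_from_path(old_leaf, path) != stored_root:
        return None                        # answer does not match the stored root
    return root_from_path(new_leaf, path)  # same siblings, new leaf

# Server state; the user keeps only the root.
leaves = [h(b"block-%d" % i) for i in range(4)]
user_root = build_levels(leaves)[-1][0]

# The user asks to modify location 2; the server answers with leaf and siblings.
location = 2
old_leaf, path = leaves[location], auth_path(build_levels(leaves), location)
new_leaf = h(b"new block contents")
expected_root = user_check_and_predict(user_root, old_leaf, path, new_leaf)

# The server applies the update and reports its new root to the user.
leaves[location] = new_leaf
server_root = build_levels(leaves)[-1][0]
if expected_root is not None and server_root == expected_root:
    user_root = expected_root              # the user accepts the update
```

The design point is that the user never needs the whole tree: the siblings returned for the old leaf are enough both to authenticate the answer and to predict the root that an honest server must report after the update.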

5. Experimental Evaluation

In this section, we simulate the computations of our proposed User Focus scheme and the Fortress scheme of [23] on Inspur NF5270M4 servers with an Intel Xeon E5-2620 CPU at 2.10 GHz, 16 GB of RAM, and a 7200 RPM 1 TB Serial ATA drive with a 32 MB buffer. Our experiments are implemented in Python, and all the cryptographic functions are taken from the Python Cryptography Toolkit [30]. We employ SHA1 to produce 160-bit hash values, and the size of the RSA modulus is 1024 bits for security. As for the Fortress scheme, we also use a bitcoin block explorer [31] to access the bitcoin resource for obtaining the random challenges, and we set the sector size to 1 KB (e.g., each 64 KB file block consists of 64 sectors in Fortress).

Note that the typical block size for cloud storage is 64 KB–256 KB, as shown in [32]. Since an outsourced auditing scheme runs on top of the cloud storage, a reasonable lower limit for the block size is 64 KB. In our evaluation, the size of user’s outsourced file is set to 1 GB. We do not measure the time of uploading the outsourced file from user to CSP, since this overhead is common to the two investigated schemes. Our reported results are averages over 20 rounds.

Firstly, user’s file must be preprocessed before outsourcing. Figure 3 shows the total time required for the corresponding preprocessing phases of the two schemes. Furthermore, we also evaluate the computing time consumed by user for our scheme and Fortress, respectively. It turns out that for both schemes the computational overhead incurred at user side accounts for most of the total time when preprocessing the outsourced data, and the preprocessing of our scheme is orders of magnitude faster than that of Fortress. As shown in [23], in Fortress TPA needs to download user’s whole file from cloud and convince the user that he preprocessed it correctly. In this case Fortress requires user to carry out a time-consuming zero-knowledge proof (ZKP) with TPA, resulting in heavy computational overhead for user. Compared to Fortress, our User Focus scheme avoids such ZKP operations, since TPA is not involved in preprocessing, and thus gains a performance enhancement.

Secondly, in Figure 4, with respect to different block sizes, we measure the latency incurred by performing TPA’s auditing against CSP once in our scheme and in Fortress, respectively. In this experiment, we employ the same parameters as in [23]. During each outsourced auditing executed by TPA, the number of challenged blocks is set to 10% of all file blocks for both schemes. In addition, since Fortress involves parallel user challenges, we also set the number of user-challenged blocks to 10% of the TPA-challenged blocks as in [23] (i.e., 1% of all file blocks). The results show that during TPA’s auditing phase the performance of our scheme is slightly better than that of Fortress at the 64 KB block level, and the latency of our scheme declines faster than that of Fortress as the block size increases.
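
As a quick worked check of these parameters, the sketch below computes the number of blocks and of challenged blocks for a 1 GB file. The assumptions are that 1 GB means 2^30 bytes, that fractions are rounded down, and that 64 KB, 128 KB, and 256 KB are representative block sizes; the block sizes actually plotted in Figure 4 are not stated in the text.

```python
FILE_SIZE = 1 << 30  # 1 GB, assumed to mean 2**30 bytes

def audit_sizes(block_kb: int):
    """Return (total blocks, TPA-challenged blocks, user-challenged blocks)."""
    n_blocks = FILE_SIZE // (block_kb * 1024)
    tpa_challenged = n_blocks // 10          # 10% of all blocks
    user_challenged = tpa_challenged // 10   # 10% of the TPA-challenged blocks
    return n_blocks, tpa_challenged, user_challenged

sizes = {kb: audit_sizes(kb) for kb in (64, 128, 256)}
# sizes[64] == (16384, 1638, 163)
```

Doubling the block size halves the number of blocks, and therefore the number of challenged blocks per audit.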

Finally, we focus on evaluating the time incurred on the user when she audits a batch of TPA’s logs all at once. For a fair comparison, in this experiment we set the number of user-challenged blocks in Fortress to be the same as in our User Focus scheme (i.e., each log refers to 10% of all blocks for user). As shown in Figure 5, although the latency increases linearly in both schemes, our User Focus scheme is almost 5 times faster than Fortress. Recall that Fortress relies on bitcoin hash values to determine the challenges, so to reproduce each past challenge, user has to repeatedly interact with the bitcoin random source by sending HTTP requests and receiving the responses to obtain all past bitcoin hash values. Since each TPA log is related to a past challenge, this incurs a noticeable delay as the number of accumulated logs increases. In our scheme, based on the property of User Focus, user can directly retrieve past challenges from her email box, which saves considerable time for reconstructing challenges when compared to Fortress. Note that each challenge of our scheme is only 88 bytes (8 bytes for the point-in-time, 40 bytes for the two keys, and 40 bytes for user’s signature), and user can delete all past accumulated challenges once her auditing against TPA is passed (as shown in Section 3.2.4); thus the storage and I/O costs of past challenges are extremely low for user in practice.

Summing up, for the phases of preprocessing and user’s auditing against TPA, both of which involve the user, the experimental results show that our scheme greatly improves the performance at user side, so our scheme will provide a more pleasant user experience when compared to Fortress. In addition, our User Focus scheme achieves performance comparable to Fortress for TPA’s auditing phase at the 64 KB block level, and better performance at larger block sizes.

6. Related Work

With the popularization of storage outsourcing, the problem of remote data integrity auditing has attracted increasing attention. All kinds of provable data possession (PDP) and proof of retrievability (POR) schemes [7–23] have been proposed to defend against the untrusted remote server. Ateniese et al. provided a series of PDP schemes for the storage security of outsourced data. In [7], they first gave a formal definition of PDP and proposed the original PDP schemes by utilizing homomorphic verifiable tags that are constructed based on public key cryptographic techniques. At the same time, to allow anyone, not just the data owner, to audit the untrusted server for data possession, the concept of public auditing was first introduced in [7]. Subsequently, based on symmetric key cryptographic techniques, Ateniese et al. [10] proposed another provably secure PDP scheme that addresses the problems of scalability and data dynamics, which are not covered in the original PDP method. Besides, in [11], they also presented two more efficient PDP schemes that go one step further than the original schemes of [7]. To support fully dynamic operations in the context of remote auditing, Erway et al. [12] extended the PDP model of [7] and presented the first dynamic PDP scheme by using a rank-based authenticated skip list. In addition, Wang et al. [16] and Zhu et al. [19] also proposed other efficient dynamic schemes for public auditing, which are based on the data structures of Merkle Hash Tree (MHT) and Index-Hash Table (IHT), respectively.

Juels and Kaliski Jr. [8] first proposed a formal POR model along with the corresponding security definitions. According to the model of [8], Shacham and Waters [9] constructed two POR schemes upon the static data storage but supporting the unlimited number of challenges. The first scheme is built from the pseudorandom functions to enable private auditing, and the second scheme with public auditing is built from the BLS signature [33]. Given that TPA might be curious during the process of public auditing, Wang et al. [13] integrated the random mask technique with the BLS-based public auditing scheme to prevent user’s outsourced data from leaking to TPA, and the scheme of [13] has been further improved to support data dynamics in [15]. Moreover, under the environment of public auditing, many other schemes are also designed to meet the demands of different scenarios, such as fast data error localization [17], the auditing against shared data [18], batch auditing for multiple clouds [20], fine-grained data updates [21], and the lightweight computations for low performance end devices [22].

Recently a variety of cloud storage application schemes have been proposed, such as keyword-based data retrieval and image copy detection at cloud side. Xia et al. [34] constructed a special tree-based index structure and proposed a secure multikeyword ranked search scheme enabling dynamic updates upon outsourced encrypted data. Fu et al. [35] designed a parallel search algorithm and proposed another flexible searchable encryption scheme supporting both multikeyword ranked search and parallel search. In the setting of multikeyword fuzzy search, to solve the out-of-order problems during the ranking process, Fu et al. [36] also developed a new keyword transformation method and presented the corresponding efficient search scheme. Given that traditional keyword-based search schemes cannot completely match users’ search intention, an innovative semantic search scheme based on the concept hierarchy is proposed in [37], making personalized search more effective and context-aware. The content-based search scheme of [38] has further solved the problems of semantic search by utilizing conceptual graphs and an efficient measure of “sentence scoring.” On the other hand, to protect the images stored in cloud, Xia et al. [39] proposed a privacy-preserving and copy-deterrence CBIR scheme using encryption and watermarking techniques, which can prevent the image user from illegally distributing the retrieved images. Li et al. [40] presented a solution to detect copy-move forgery in an image by first segmenting the targeted image into semantically independent patches prior to keypoint extraction and comparison. For detecting copies of a given original image generated by arbitrary rotation, Zhou et al. [41] proposed a novel copy detection method based on two global features extracted from rotation invariant partitions. In addition, Zhou et al. [42] designed a global context verification scheme to filter false matches for copy detection, which further addresses the problems of limited discriminability and quantization errors that exist in the bag-of-visual-words (BOW) model adopted by previous detection methods. However, since all the application schemes mentioned above are built upon the outsourced data of cloud storage, it is important to first focus on how to audit and confirm the integrity of remote outsourced data. The existing public auditing schemes, however, cannot protect against a malicious TPA. As shown in [23], a malicious TPA is a potential security risk for outsourced data integrity and thus should not be ignored, which is the motivation of this paper.

7. Conclusion

Any public auditing/verification scheme can be transformed into a private scheme, just by making user perform the auditing work that would otherwise be delegated to TPA. Clearly, public auditing schemes might be adopted more easily on a large scale by cloud users in practice, since user’s heavy burden incurred by frequent auditing can be transferred to TPA. Nevertheless, how to protect user from a malicious TPA is a key problem that is not considered by the various existing public auditing schemes. The first outsourced auditing scheme, Fortress, was proposed to defend against a malicious TPA, but Fortress enables TPA to download all outsourced data and thus relies solely on data encryption to protect user’s data privacy. A secure outsourced auditing scheme against a malicious TPA should be designed to deprive TPA of access rights over user’s outsourced data in cloud, which is achieved in this paper. Although our proposed scheme is designed without relying on an additional independent random source, it still achieves security against any malicious entity. In addition, based on the MHT data structure, we extend the outsourced auditing scheme to support dynamic updates. According to the analysis and evaluations, our scheme is provably secure and significantly efficient.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to thank NSFC (Grant no. 61502044).