Abstract

Cloud storage has attracted increasing attention because it allows cloud users to store their files remotely and access them at any time, from any place, and with any device. To ensure data integrity, numerous public auditing constructions have been proposed. However, most existing constructions are built on public key infrastructure (PKI): before checking data integrity, the auditor must first verify the validity of the public key certificate (PKC), which imposes a heavy burden on the auditor. To eliminate this time-consuming certificate verification, we present an efficient identity-based public auditing scheme. Our construction is an identity-based auditing system in the true sense, in that the algorithm computing the authentication signatures is itself an identity-based signature algorithm. Extensive security evaluation and experimental testing demonstrate that our proposal is secure and efficient; in particular, it resists forgery attacks and replay attacks. Finally, compared with two state-of-the-art identity-based public auditing proposals, our scheme outperforms both when computational cost, communication overhead, and security strength are considered together.

1. Introduction

With technological progress in the communication field, the amount of generated data is growing rapidly. Many companies in the healthcare industry increasingly rely on cloud storage services. Instead of every hospital storing and maintaining medical data on its own physical servers, cloud storage has become a popular alternative, since it offers clients convenient network access, on-demand data storage, and resource sharing.

The aging population pushes healthcare services toward continuous reform so as to achieve cost-effectiveness and timeliness and to deliver higher-quality services. Many specialists believe that cloud computing can improve healthcare services by reducing electronic health record (EHR) start-up costs, such as software, equipment, personnel, and various license fees. These benefits encourage the adoption of the relevant cloud techniques. Consider one instance of cloud-assisted healthcare: a healthcare sensor system automatically collects patients' vital data from wearable devices connected to traditional medical equipment via wireless sensor networks and then uploads these data to a "medical cloud" for storage. Another typical instance is the Sphere of Care by Aossia Healthcare, launched in 2015. Such cloud-based systems automatically collect users' real-time daily data, relieving the burden of manual collection and thereby simplifying the deployment of the whole medical system. However, healthcare providers face many challenges in migrating all local health data to a remote cloud server; the paramount concerns are privacy and security, since the healthcare administrator no longer fully controls the security of those medical records. Once medical data are stored in the cloud, they may be corrupted or lost.

To ensure the intactness of data stored on a remote server, patients or healthcare service providers expect the integrity of the stored data to be checked periodically so that damage can be detected. However, for individuals, one of the greatest challenges is how to carry out such periodic integrity checks once they no longer keep a local copy of the files. Meanwhile, retrieving the entire data file for verification is also infeasible for a resource-limited individual.

To address this issue, many researchers have presented solutions targeting diverse systems and diverse security models [1–20]. Nevertheless, most existing solutions are built on public key infrastructure (PKI). It is well known that PKI-based auditing schemes suffer from a complex key management problem: the data client must handle key updating, key revocation, key maintenance, and so on. Hence, key management and certificate verification are troublesome issues in a PKI-based auditing system. Furthermore, a public key certificate (PKC) requires more storage space than a bare identity ID, since the key pair (PK, sk) must be kept locally. To guarantee data integrity, a verifier must first extract the PKC from the public key directory and then verify that it is valid, which increases the verifier's computation burden and communication overhead.

In 2014, the first so-called identity-based data integrity proposal was put forward by Wang et al. [13]. Strictly speaking, their proposal is not truly identity-based, because the algorithm generating the metadata authentication tags is not an identity-based algorithm but a PKC-based one. In 2015, Yu et al. [15] proposed a generic method for constructing identity-based public auditing systems by integrating an identity-based signature algorithm with traditional PDP protocols. Their work is significant for the study of identity-based public auditing. However, their scheme still adopts a PKI-based algorithm to produce block authentication tags. Furthermore, in the auditing phase, the auditor first verifies the validity of an identity-based signature on a public key PK and then performs data integrity verification with this public key PK, which increases the auditor's computation burden. In 2016, Zhang and Dong [16] brought forward a novel identity-based public auditing proposal. Their scheme is identity-based in the literal sense, since the algorithm producing metadata authentication tags is an identity-based signature algorithm. However, their scheme is shown to be insecure in the Appendix.

To improve the efficiency and strengthen the security of identity-based auditing protocols, this work proposes a secure and efficient identity-based auditing construction. Its original contributions are as follows:

(1) Based on the idea of homomorphic signatures in the identity-based setting, we devise a genuinely identity-based auditing proposal for data integrity. The proposal not only avoids certificate and key management but also relieves the auditor's burden.

(2) In the auditing phase, our scheme has constant communication overhead. Compared with the two schemes [15, 16], our proposal has advantages in both computational cost and communication cost.

(3) In the random oracle model, the proposed scheme admits a rigorous security proof, which is tightly reduced to the CDH problem.

2. Architecture and Security of System

In this section, to better understand our identity-based data integrity auditing protocol (ID-DIAP for short), we first describe the system model and then define the security model of our ID-DIAP for cloud storage.

2.1. System Architecture

Our ID-DIAP system in the cloud involves four entities: the private key generator (PKG), the third-party verifier/auditor (TPA), the cloud server, and the data user. The overall system architecture is shown in Figure 1.

To avoid bias in the auditing process, the TPA is recommended to implement the audit function in our system model. The function of each role in the system architecture is described below.

(i) Data User. A cloud user who possesses a number of files to be uploaded to the remote cloud server, keeping no local copy. Generally speaking, the data user may be a resource-limited entity with restricted storage and computing capability; it can flexibly access and share the outsourced data at any time.

(ii) Cloud Server. A group of distributed servers with tremendous storage and computing capability, responsible for saving and maintaining the files stored in the cloud. Nevertheless, the cloud server may be untrusted: for its own profit and commercial reputation, it might conceal data corruption incidents from its cloud users.

(iii) Third-Party Auditor. The verifier of data intactness. It has the professional expertise and practical capability to take charge of data integrity auditing on behalf of cloud users.

(iv) PKG. A trusted entity responsible for setting up the system parameters and computing the private key of every cloud user.

A cloud-based storage system aims to relieve cloud users of the burden of data storage and maintenance. Nevertheless, once data are uploaded to the remote cloud server, a potential security problem arises: the uploaded data are out of the data user's control, and the remote cloud server is generally unreliable. The data user may therefore worry whether the files stored in the cloud remain intact, and wants security measures that allow the integrity of the outsourced data to be examined regularly without a local copy.

Definition 1 (identity-based data integrity auditing protocol, ID-DIAP). In general, an ID-DIAP system consists of three phases (a sketch of the corresponding algorithm interface is given below):

(1) System Initialization Phase. The PKG is responsible for producing the system parameters. It runs the Setup(1^λ) algorithm on input a security parameter λ to obtain the system parameters Para and the PKG's key pair (mpk, mpsk). The PKG also invokes the KeyExtr(1^λ, Para, mpsk, ID) algorithm, on input its master secret key mpsk, Para, and the identity ID of a data user, to compute the private key sk_ID of that data user.

(2) Data Outsourcing Phase. The data owner (data user) runs TagGen(M, sk_ID), on input its private key sk_ID and the outsourced file M = {m_1, ..., m_n}, to generate a metadata authentication tag δ_i on each data block m_i. Finally, it uploads the file together with the metadata authentication tags to the cloud server.

(3) Data Auditing Phase. This phase is divided into three subphases: Challenging, Proving, and Verifying. First, the auditor runs the algorithm Challenging(Minfo) to compute the challenge information Chall. After receiving Chall, the cloud server runs Proof(M, δ, Chall) to compute the proof information Prf and returns it to the auditor. At last, the auditor invokes the algorithm Verifying(Chall, Prf, mpk, Minfo) to test whether the returned proof information is valid.
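The following Python sketch pins down this five-algorithm interface. All names and type choices are illustrative conventions of ours, not the paper's notation; it is a skeleton against which the concrete algorithms of Section 3 can be read.

```python
# Minimal interface sketch of an ID-DIAP system (illustrative names only).
from dataclasses import dataclass
from typing import Any, Protocol, Sequence, Tuple

@dataclass
class Chall:
    indices: Sequence[int]       # challenged block indices I
    coefficients: Sequence[int]  # random coefficients v_i in Z_q

class IDDIAP(Protocol):
    def setup(self, security_param: int) -> Tuple[Any, Any, Any]:
        """Run by the PKG: returns (Para, mpk, mpsk)."""
    def key_extract(self, para: Any, mpsk: Any, identity: str) -> Any:
        """Run by the PKG: returns the data user's private key sk_ID."""
    def tag_gen(self, blocks: Sequence[int], sk_id: Any) -> Sequence[Any]:
        """Run by the data user: one authentication tag per block."""
    def challenge(self, file_info: Any) -> Chall:
        """Run by the TPA: samples a random challenge."""
    def prove(self, blocks: Sequence[int], tags: Sequence[Any],
              chall: Chall) -> Any:
        """Run by the cloud server: aggregates blocks and tags into Prf."""
    def verify(self, chall: Chall, proof: Any, mpk: Any,
               file_info: Any) -> bool:
        """Run by the TPA: accepts or rejects Prf."""
```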

2.2. Different Types of Attack and Security Definition

In this subsection, we analyze the diverse attacks our ID-DIAP system may face, in light of the behavior of each role in the system architecture. In our architecture, the PKG is the private key generator that computes each data user's private key. It is a trusted authority, and we assume the PKG launches no attack against the other entities in the system model. The third-party auditor is deemed an honest-but-curious entity and faithfully executes every step of the auditing procedure. The cloud server, by contrast, is considered unreliable: it might deliberately delete or alter rarely accessed data files to save storage space. It is a powerful inside attacker in our security model, whose goal is to tamper with or replace the stored data without being detected by the auditor. Because the cloud server is the powerful attacker in our security model, we mainly consider the attacks [7] launched by the cloud server in this paper.

2.2.1. Forgery Attack

The malicious cloud server may produce a forged metadata authentication signature on a new data block, or fabricate fake proof information that nevertheless satisfies the auditing verification, in order to deceive the auditor.

2.2.2. Replace Attack

If a data block in the challenge set has been corrupted, the malicious cloud server may select another valid block-tag pair (m_j, δ_j) to substitute for the corrupted pair (m_i, δ_i).

2.2.3. Replay Attack

This is an efficient attack: the malicious cloud storage server may produce new proof information from previously observed proof information, without retrieving the data actually challenged by the auditor.

3. Our Public Auditing Construction

In the following, we describe our ID-DIAP system. It involves four entities: the PKG, the cloud server, the data user, and the TPA, and it consists of five PPT algorithms. The framework, covering every entity and all algorithms, is depicted in Figure 2. The algorithms are detailed below.

3.1. Setup

For the sake of readability, the notations used in our ID-DIAP system are listed in Table 1.

The PKG takes a security parameter λ as input and generates two cyclic groups G_1 and G_2 of the same prime order q. Let g_1, g_2 be two generators of G_1 satisfying g_1 ≠ g_2, and define a bilinear pairing map e: G_1 × G_1 → G_2. Next, the PKG chooses two map-to-point cryptographic hash functions H_1, H_2: {0,1}* → G_1 and a collision-resistant hash function h: {0,1}* → Z_q*. The PKG randomly chooses s ∈ Z_q* as its master secret key and computes the corresponding master public key mpk. Finally, the public parameters Para = (G_1, G_2, e, q, g_1, g_2, H_1, H_2, h, mpk) are published.

The PKG keeps its master secret key s secret.
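To make Setup concrete, the following sketch instantiates it with the open-source py_ecc pairing library. Two simplifications are assumptions of ours: py_ecc's BN128/BN254 pairing is asymmetric (e: G_1 × G_2 → G_T), whereas the scheme above assumes a symmetric pairing on a supersingular curve, and hash_to_g1 is an insecure toy stand-in for a genuine map-to-point hash.

```python
# Setup sketch (illustrative, not the paper's exact parameters).
import hashlib
import secrets

from py_ecc.bn128 import G1, G2, multiply, curve_order

def h(data: bytes) -> int:
    """Collision-resistant hash into Z_q (the scheme's h)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % curve_order

def hash_to_g1(tag: bytes, data: bytes):
    """Toy map-to-point hash into G1 (stand-in for H1/H2). Insecure:
    the discrete log of the output is publicly computable, so a real
    map-to-point construction must be used in practice."""
    return multiply(G1, h(tag + b"|" + data))

def H1(data: bytes):
    return hash_to_g1(b"H1", data)

def H2(data: bytes):
    return hash_to_g1(b"H2", data)

# PKG master key pair: mpsk = s is kept secret, mpk is published with Para.
s = secrets.randbelow(curve_order - 1) + 1
mpk = multiply(G2, s)
```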

3.2. Key Extraction

For a data user to obtain its private key, the following process is carried out between the user and the PKG, which uses its master secret key s and the user's identity ID:

(1) First, the data user delivers its identity information ID to the PKG.

(2) Next, the PKG computes the data user's private key (sk_1, sk_2) from s and the hash values H_1(ID) and H_2(ID).

The private key is then sent to the data user through a secret and secure channel.

(3) Upon receiving the private key (sk_1, sk_2), the data user can test its validity through pairing equations relating (sk_1, sk_2), H_1(ID), H_2(ID), and the master public key mpk.
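Continuing the sketch (reusing h, H1, H2, s, and mpk from the Setup sketch), the following is a hedged guess at the omitted extraction equations under a standard BLS-style assumption: sk_ID = (s·H_1(ID), s·H_2(ID)), checked with one pairing equation per component. The paper's exact formulas may differ.

```python
# Key extraction sketch under a BLS-style assumption.
from py_ecc.bn128 import G2, multiply, pairing

def key_extract(s: int, identity: str):
    """PKG side: derive the two-component private key from the identity."""
    idb = identity.encode()
    return (multiply(H1(idb), s), multiply(H2(idb), s))

def check_private_key(sk_id, identity: str, mpk) -> bool:
    """User side: e(sk_j, g2) == e(H_j(ID), mpk) iff sk_j = s * H_j(ID)."""
    sk1, sk2 = sk_id
    idb = identity.encode()
    return (pairing(G2, sk1) == pairing(mpk, H1(idb))
            and pairing(G2, sk2) == pairing(mpk, H2(idb)))
```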

3.3. TagGen Phase

To outsource a data file M, the data user first divides it into n blocks, namely M = {m_1, m_2, ..., m_n}. The data user then randomly chooses a private-public key pair (ssk, spk) of a secure signature algorithm Sig, for example, the BLS short signature. Let Name denote the identifier of data file M; the data user computes the file authentication tag τ = τ_0 ‖ Sig_ssk(τ_0), where Sig_ssk(τ_0) denotes a secure signature on τ_0 and τ_0 denotes the information string Name ‖ n ‖ spk.

Subsequently, the data user generates a metadata authentication tag on each data block. To compute the block authentication tags on all data blocks m_1, ..., m_n, the data user uniformly samples a fresh random value from Z_q* and computes from it a file-specific value used in the tags.

Next, for i = 1, ..., n, it computes the metadata authentication tag δ_i for data block m_i by the following steps:

(1) First, it computes a hash value binding the file identifier Name to the block index i.

(2) Then, it uses its private key (sk_1, sk_2) to compute the tag value.

(3) The resultant authentication tag of data block m_i is δ_i.

At last, the data user uploads all the metadata authentication tags δ_1, ..., δ_n, together with the outsourced file M, to the remote cloud server.

On obtaining all the aforementioned data, the cloud server executes the following validation procedure:

For i = 1, ..., n, it verifies the pairing relation between each block m_i and its tag δ_i. If all relations hold, it parses τ into τ_0 and Sig_ssk(τ_0) and verifies the validity of the signature Sig_ssk(τ_0) under spk.

If the signature is also valid, the cloud server stores the data in the cloud.
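Since the tag equations are omitted in this version of the text, the following sketch shows one linear tag form consistent with the two-component key above: δ_i = m_i·sk_1 + h(Name ‖ i)·sk_2. It illustrates the homomorphic structure that the auditing phase exploits, but note that it lacks the per-file randomization sampled above, without which this simple linear form would be forgeable; it is a teaching sketch, not the paper's exact TagGen.

```python
# TagGen sketch: linear, aggregatable tags (reuses h and curve types above).
from py_ecc.bn128 import add, multiply, curve_order

def tag_gen(blocks, name: str, sk_id):
    """delta_i = m_i * sk1 + h(Name||i) * sk2 -- linear in the block, so
    tags on challenged blocks can later be aggregated by the server."""
    sk1, sk2 = sk_id
    tags = []
    for i, m_i in enumerate(blocks):
        w_i = h(f"{name}||{i}".encode())  # index-binding scalar
        tags.append(add(multiply(sk1, m_i % curve_order),
                        multiply(sk2, w_i)))
    return tags
```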

3.4. Challenge Phase

To audit the integrity of the outsourced file M, the TPA first parses the file tag τ into τ_0 and Sig_ssk(τ_0) and verifies the signature. If the check fails, it terminates; otherwise, it retrieves the corresponding file identifier Name and the number of blocks n.

Later on, the auditor randomly picks a subset I ⊆ {1, 2, ..., n} and, for each i ∈ I, a random coefficient v_i ∈ Z_q*, to generate the challenge information Chall = {(i, v_i) : i ∈ I}.

Finally, it delivers Chall to the cloud server as the challenge.
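A sketch of the Challenging algorithm, following the notation above (n blocks, challenge size c = |I|):

```python
# Challenge sketch: random index subset plus random coefficients.
import secrets
from py_ecc.bn128 import curve_order

def challenge(n: int, c: int):
    """Pick c of the n block indices, each with a coefficient in Z_q*."""
    rng = secrets.SystemRandom()
    indices = sorted(rng.sample(range(n), c))
    return [(i, rng.randrange(1, curve_order)) for i in indices]
```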

3.5. Proving Phase

After obtaining the challenge information Chall = {(i, v_i)}, the cloud server calculates, for each i ∈ I, the corresponding per-block value and collects these values into a set.

Subsequently, in light of the outsourced data file and the metadata authentication tag of each challenged block, it produces the proof as follows:

Finally, the cloud storage server returns the 3-tuple Prf to the auditor as the proof information.
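For the toy tags sketched above, the Proving step reduces to aggregating the challenged blocks and tags into two values (μ, σ); the paper's actual proof is a 3-tuple because of its extra randomization, so this is a simplified sketch.

```python
# Proving sketch: the server aggregates challenged blocks and tags.
from functools import reduce
from py_ecc.bn128 import add, multiply, curve_order

def prove(blocks, tags, chall):
    """mu = sum v_i * m_i (mod q); sigma = sum v_i * delta_i (in G1)."""
    mu = sum(v * blocks[i] for i, v in chall) % curve_order
    sigma = reduce(add, (multiply(tags[i], v) for i, v in chall))
    return mu, sigma
```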

3.6. Verifying Phase

To check the outsourced data’s integrality in cloud, after receiving the responded proof information , the third auditor calculates as below:where .

Then, it checks the validity of the following equation:

If the aforementioned Equation (11) holds, the TPA outputs VerifyRes as true; otherwise, it outputs VerifyRes as false.
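The matching verification for the sketch mirrors the role of equation (11): with w = Σ v_i·h(Name ‖ i), the check e(σ, g_2) = e(μ·H_1(ID) + w·H_2(ID), mpk) holds exactly when σ aggregates genuine tags.

```python
# Verifying sketch: one pairing equation over the aggregated proof.
from py_ecc.bn128 import G2, add, multiply, pairing, curve_order

def verify(chall, proof, identity: str, name: str, mpk) -> bool:
    mu, sigma = proof
    w = sum(v * h(f"{name}||{i}".encode()) for i, v in chall) % curve_order
    idb = identity.encode()
    expected = add(multiply(H1(idb), mu), multiply(H2(idb), w))
    # For honest proofs, sigma = s * (mu*H1(ID) + w*H2(ID)).
    return pairing(G2, sigma) == pairing(mpk, expected)
```

A quick end-to-end check of the whole sketch:

```python
blocks = [h(f"block-{j}".encode()) for j in range(8)]  # toy 8-block file
sk = key_extract(s, "alice")
assert check_private_key(sk, "alice", mpk)
tags = tag_gen(blocks, "file-1", sk)
ch = challenge(len(blocks), 4)
assert verify(ch, prove(blocks, tags, ch), "alice", "file-1", mpk)
```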

4. Security Analysis

To show our proposal’s security, we will demonstrate that our proposal is proven to be secure against the above three attacks.

Theorem 1. Assume there exists a probabilistic polynomial-time (PPT) adversary Adv, the dishonest cloud storage server, that can cheat the auditor with invalid proof information Prf forged by Adv with nonnegligible probability ε. Then we can design an algorithm B that efficiently breaks the CDH assumption by invoking Adv as a subroutine.

Proof. Suppose a PPT adversary Adv is capable of computing forged proof information Prf after the data blocks or metadata authentication tags have been corrupted; then we can construct a PPT algorithm B that breaks the CDH assumption by utilizing the adversary Adv. First, let the 3-tuple (g, g^a, g^b) be a random instance of the CDH problem, for which it is hard to compute the solution g^{ab}.

In the security proof, the hash function H_1 is regarded as a random oracle, and the identity ID of each data user is submitted to the H_1-oracle only once. H_2 and h act only as one-way functions. In addition, the adversary Adv can adaptively issue queries to three oracles: the H_1-oracle, the Key-Extract oracle, and the TagGen oracle.

Setup. The algorithm B chooses two cyclic groups G_1 and G_2 of the same prime order q and sets g^a as the public key of the PKG. Let H_1, H_2, h be the three hash functions. Finally, B sends the public system parameters Para to the adversary Adv. Let i* denote the index of the challenged data user's identity.

H_1-Hash Oracle. The adversary Adv submits a query to the H_1-oracle with an identity ID_i. If the identity index satisfies i = i*, the challenger picks a random value and programs the hash output as a function of g^b. Otherwise, the challenger B uniformly samples two values from Z_q* and programs the hash output accordingly.

In the end, the 5-tuple for ID_i is added to the H_1-list, which is initially empty.

Key Extraction Oracle. For a key extraction query, Adv submits an identity ID_i to the key extraction oracle. The challenger B responds as follows:

(1) If the identity index satisfies i ≠ i*, B looks up the 5-tuple for ID_i in the H_1-list. If it exists, B sends the corresponding private key to the adversary Adv; otherwise, B first implicitly queries the H_1-oracle with identity ID_i.

(2) Otherwise, B aborts.

TagGen Oracle. The adversary Adv submits a 3-tuple (M, ID_i, Name) to the TagGen oracle for an authentication tag query, where Name is the file identifier of data file M. The challenger B responds as follows:

(1) First, it searches the H_1-list to check whether ID_i exists. If so, the corresponding 5-tuple in the H_1-list is used; otherwise, B first queries the H_1-oracle with identity ID_i.

(2) If the identity index satisfies i = i*, the challenger aborts. Otherwise, it produces authentication tags on data file M by the following process:

(a) First, for the file identifier Name, it picks a random value to compute the file tag.

(b) Next, for j = 1 to n, it computes the per-block hash values.

Then, for j = 1 to n, B computes data block m_j's authentication tag δ_j and adds the resulting tuple to the Tag list, which is initially empty.

(3) Finally, B returns the file tag and block tags to the adversary Adv.

Output. In the end, for a challenge information Chall, the adversary Adv outputs forged proof information Prf* on the corrupted file of the data user with identity ID_{i*}, with nonnegligible probability ε. Adv wins this security game if and only if the following conditions hold:

(1) The challenged identity is ID_{i*}.

(2) Prf* passes verification equation (11).

(3) Prf* ≠ Prf, where Prf is the legitimate proof information for the challenge information Chall and the data file M.

When the adversary Adv wins this game, we can equate the verification equations of the forged proof Prf* and the legitimate proof Prf. Since the aggregated hash value is computed by the verifier itself and is identical for the same data file in both proofs, the common factors cancel, and the remaining relation yields the CDH solution g^{ab}.

This indicates that the CDH assumption can be broken with nonnegligible probability ε, a contradiction, since the CDH problem is hard to solve.

Theorem 2. Our proposed auditing scheme efficiently resists the replay attack of a malicious cloud server.

Proof. The detailed proof is very similar to the security proof in [16] and is omitted due to space limitations.

5. Performance Evaluation

To evaluate our proposal's performance, we compare it with Yu et al.'s proposal [15] and Zhang and Dong's proposal [16] in terms of computational cost and communication overhead; Zhang and Dong's proposal [16] is the state-of-the-art identity-based public auditing scheme with respect to communication overhead.

5.1. Computation Costs

To evaluate the computation costs of our proposal, we contrast it with Zhang and Dong's proposal [16] and Yu et al.'s proposal [15], two recent efficient identity-based public auditing schemes. We emulate the operations used in the three schemes on an HP laptop with an Intel Core i3-6500 CPU at 2.4 GHz and 8 GB of RAM; all algorithms are implemented with the MIRACL cryptography library [21, 22], which underlies the MIRACL Authentication Server Project Wiki by CertiVox. We employ a supersingular elliptic curve over the field GF(p) with a 160-bit modulus and embedding degree 2. Moreover, all reported statistics are mean values over 10 simulation trials.

For explicit demonstration, let MulG1 denote a point multiplication in G_1, and let Hash and Pairing denote one hash-to-point operation into G_1 and one bilinear pairing operation, respectively. For a public auditing protocol, the computational cost of the TagGen phase is dominated by the generation of block authentication tags. To outsource a data file, the data user requires (3n + 1) MulG1 + n Hash to produce the block authentication tags in our construction; in Yu et al.'s proposal and Zhang and Dong's proposal, the data user needs (n + 1) Hash + (2n + 2) MulG1 and 4n MulG1, respectively, where n is the number of data blocks. Figure 3 shows the simulation results of generating the block authentication tags for various block counts over the identical data file; the operation counts are also summarized programmatically below.
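The following helper (hypothetical bookkeeping code of ours, not from the paper) encodes the TagGen operation counts just stated, which makes the asymptotics easy to reproduce:

```python
def taggen_ops(n: int) -> dict:
    """Dominant TagGen operation counts as stated above (n = #blocks)."""
    return {
        "ours":            {"MulG1": 3 * n + 1, "Hash": n},
        "Yu et al. [15]":  {"MulG1": 2 * n + 2, "Hash": n + 1},
        "Zhang-Dong [16]": {"MulG1": 4 * n,     "Hash": 0},
    }
```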

From Figure 3, we see that in the TagGen phase, Zhang and Dong's proposal is the most time-consuming and Yu et al.'s proposal is the most efficient. Our proposal is slightly slower than Yu et al.'s because the algorithm producing the block authentication tags in their proposal is a public key certificate-based signature algorithm, whereas the algorithm used in our proposal is an identity-based signature algorithm. Since the block authentication tags for a data file can be produced offline, this has little influence on the whole protocol.

In the auditing/verification phase, the computational cost mainly comes from verifying the proof information and is determined by the number of challenged data blocks. In our construction, to check the validity of the proof information, the auditor requires 3 Pairing + c Hash + (c + 2) MulG1; in the other two proposals, the TPA needs 3 Pairing + 2 MulG1 [16] and 5 Pairing + (c + 2) MulG1 + c Hash [15] to check the integrity of the stored file, where c denotes the size of the challenge subset. Table 2 compares their computation times for different challenge subset sizes.

According to Table 2, the proposal in [16] is the most efficient, and our proposal is slightly more efficient than the proposal in [15]. However, the proposal in [16] is shown to be insecure; the detailed attack is given in the Appendix. We also find that the TPA's computational cost grows linearly with the size of the challenge subset.

5.2. Communication Cost

In a data auditing system, communication costs arise in two places: the data file outsourcing phase and the auditing phase. In the outsourcing phase, the data owner uploads the data file and the corresponding metadata authentication tags. In our proposal, the data owner needs to upload (n + 1)|G_1| + |M| bits to the cloud storage server; in the proposals [15, 16], the data owner needs to upload (n + 3)|G_1| + |M| bits and 2n|G_1| + |M| bits, respectively. Here, |G_1| denotes the bit length of an element of group G_1, |M| the bit length of the data file, and n the number of data blocks.

In the auditing phase, communication costs come mainly from transmitting the challenge information and the proof information between the TPA and the cloud storage server. In our scheme, the challenge information Chall is |Z_q| + |I| bits and the proof information is 3|G_1| bits, so the total communication overhead is |Z_q| + |I| + 3|G_1| bits, where |Z_q| denotes the bit length of an element of Z_q and |I| is the size of the challenge subset. In the proposal [16], the total communication cost of the auditing phase is 2|Z_q| + |I| + 2|G_1| bits; in the proposal [15], it is (2 + |I|)|Z_q| + |I| + 5|G_1| bits. The detailed comparison is shown in Table 3 and can be evaluated as below.
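The auditing-phase totals in Table 3 follow directly from these formulas; the helper below (again hypothetical code of ours) evaluates them for any parameter sizes:

```python
def audit_comm_bits(zq: int, g1: int, i_bits: int, c: int) -> dict:
    """Auditing-phase traffic in bits; zq = |Z_q|, g1 = |G_1|,
    i_bits = encoding size of the index set I, c = |I|."""
    return {
        "ours":            zq + i_bits + 3 * g1,
        "Zhang-Dong [16]": 2 * zq + i_bits + 2 * g1,
        "Yu et al. [15]":  (2 + c) * zq + i_bits + 5 * g1,
    }
```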

As shown in Table 3, our scheme has the least communication overhead among the three schemes.

5.3. Security Comparison

According to Theorem 1, our scheme is provably secure against a malicious cloud server under the computational Diffie–Hellman assumption, with a tight security reduction. Yu et al.'s proposal [15] is also provably secure against a malicious cloud server under the CDH assumption. However, Zhang and Dong's proposal [16] is insecure against a malicious cloud server: such a server can delete the whole file without the TPA noticing, as detailed in the Appendix.

6. Conclusion

In this work, we present a novel identity-based public auditing system by merging the homomorphic authentication technique of identity-based cryptography into the auditing system. Our proposal overcomes the security and efficiency problems of existing identity-based public auditing systems. The proposal is proven secure, with its security tightly reduced to the classical CDH assumption. Compared with two efficient identity-based schemes, our scheme outperforms both when computation complexity, communication overhead, and security are considered together.

Appendix

For a malicious cloud storage server, the attack on the scheme of [16] proceeds as follows:

(1) Suppose M is an outsourced file, partitioned into n blocks m_1, ..., m_n, and δ_i is the metadata authentication signature of block m_i, for i = 1, ..., n.

(2) The malicious cloud server computes two aggregate values over all blocks and all tags.

(3) Subsequently, it uniformly samples a random number, blinds the aggregates with it, and then deletes all data blocks from the cloud server.

(4) After the challenge information is received, the malicious remote cloud server first computes the challenge-dependent coefficient.

(5) Then, it assembles the remaining proof components from the stored aggregates; note that one of them was already computed in step (2).

(6) The forged proof information satisfies the verification relation, so the TPA accepts it even though the entire file has been deleted.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

We would like to thank the anonymous referees of TrustCom 2016 for their invaluable suggestions. This research was supported by the Beijing Municipal Natural Science Foundation (no. 4162020), the Guangxi Key Laboratory of Cryptography and Information Security (no. GCIS201710), and the Research Fund of the Guangxi Key Lab of Multi-Source Information Mining & Security (no. MIMS16-01).