Abstract

Secure destruction of expired data is an important topic in cloud storage security research. Applying attribute-based encryption (ABE) and distributed hash table (DHT) technology to the process of data destruction, we propose a secure ciphertext self-destruction scheme with attribute-based encryption, called SCSD. In the SCSD scheme, the sensitive data is first encrypted under an access key, and then the ciphertext shares are stored in the DHT network along with the attribute shares. Meanwhile, the rest of the sensitive data ciphertext and the shares of the access key ciphertext constitute the encapsulated self-destruction object (EDO), which is stored in the cloud. When the sensitive data has expired, the nodes in the DHT network automatically discard the ciphertext shares and the attribute shares, which makes the ciphertext and the access key unrecoverable. Thus, we realize secure ciphertext self-destruction. Compared with current schemes, SCSD not only supports efficient data encryption, fine-grained access control during the data lifetime, and secure self-destruction after expiry, but also resists the traditional cryptanalysis attack as well as the Sybil attack in the DHT network.

1. Introduction

Cloud storage has attracted much attention from both industry and academia for its low cost, flexible deployment, and strong extensibility in recent years. The cloud storage system is composed of massive storage resource on the Internet as well as the resource management and access control mechanism for the resource accessing transparency of users [1]. With friendly user interface and strong extensibility, the cloud storage system can provide users with unlimited storing space; thus, it can form a new delivery model called storage as a service [2]. Cloud storage brings new opportunities for efficiency increasing, cost saving, and green computing in the area of information technology; however, it is also faced with some security challenges.

In the service model of cloud storage, data is outsourced to a storage server that acts as a third party. The data is therefore out of the control of the data owner, and its security depends heavily on the server. Because the cloud storage server may be dishonest, the data owner first encrypts the original sensitive data and then outsources the ciphertext to the cloud in order to keep the data confidential. The encryption key is kept privately by the data owner. However, even if the data is stored in the cloud in the form of ciphertext, some security risks remain. For example, in order to improve service reliability, the cloud may make several backups of the user’s data and distribute them to different storage servers [3]. In this case, when the data has expired and the owner needs to delete it from the storage servers, the cloud server may not destroy all the backups. Once adversaries get the encryption key and the backups of the ciphertext from the cloud, the sensitive data can be recovered and its confidentiality is lost. Therefore, the assured destruction of expired data, namely, the thorough deletion and permanent elimination of ciphertext, is an important topic in cloud storage security research [4].

In this paper, applying the attribute-based encryption and the distributed hash table (DHT) technology to the process of data destruction in the cloud storage environment, we propose a secure ciphertext self-destruction scheme with attribute-based encryption called SCSD. In SCSD scheme, the sensitive data is first encrypted under an access key, and then the access key is encrypted using an attribute-based encryption method. The ciphertext of sensitive data is extracted and transformed in order to get the ciphertext shares, which are stored in the DHT network along with the attribute shares. Meanwhile, the rest of the sensitive data ciphertext and the shares of access key ciphertext constitute the encapsulated self-destruction object (EDO), which is stored in the cloud. When the sensitive data is expired, the nodes in DHT networks can automatically discard the ciphertext shares and the attribute shares, which can make the ciphertext of sensitive data and the access key unrecoverable. Thus, we realize secure ciphertext self-destruction. Compared with the current schemes, our SCSD scheme can resist the traditional cryptanalysis attack as well as the Sybil attack in the DHT network.

The rest of the paper is organized as follows. In Section 2, we introduce some related works of the secure data destruction. Then, in Section 3, we review some preliminaries. Next, we introduce the system and security model and the detailed construction of our SCSD scheme in Section 4. In Section 5, we make an evaluation for the scheme in security analysis and scheme performance. Finally, concluding remarks and future work are given in Section 6.

2. Related Work

In a cloud storage system, some data is stored on the servers for a long time and can be compromised by adversaries, because the data may be backed up by the cloud servers and these backups may still exist after users issue a delete command. It is difficult to destroy all the backups in the cloud, and the following works are attempts to achieve the secure destruction of data.

Perlman was the first to focus on the secure deletion of documents [7], designing a system that makes expired documents unrecoverable. The encryption key is deleted when it expires; thus, the document encrypted under this key cannot be recovered. However, this system considers only the lifetime of the encryption key. Besides, it is a local-centered system and is unfit for the cloud environment. Following this idea, FADE [8], a secure overlay cloud storage system built on top of the existing cloud infrastructure, was developed. This system can assure the deletion of documents and can support different document access policies. Another feasible system is Ephemerizer [9], which needs a trusted server to store and manage the decryption key. In Ephemerizer, the data owner sets the expiration time for the decryption key. The trusted server deletes the decryption key once the key has expired. Thus, the ciphertext becomes unreadable.

The above methods follow a centralized approach, which has the following limitations. (1) The key management depends too much on the server. (2) When there is an investigation from the government, the administrator needs to give up the right of key management, so the server can no longer be trusted. (3) Additional commands and operations are needed to achieve the assured deletion of data.

To solve the problems of centralized destruction schemes, Geambasu et al. propose a data self-destruction system called Vanish [5]. The private data is encrypted under a symmetric key, which is divided into several key shares using a threshold secret sharing scheme and then distributed to a large-scale DHT P2P network. The nodes in the DHT network automatically delete the key shares periodically, which renders the ciphertext unreadable. Thus, Vanish realizes the self-destruction of data without trusted servers or additional operations. Wang et al. improve the Vanish system by extracting parts of the ciphertext and distributing them to the DHT network [6]. This improvement resists the traditional cryptanalysis attack and the brute-force attack more effectively.

However, [10] points out that the Vuze DHT network adopted by the Vanish system is vulnerable to Sybil attacks. Adversaries can get enough key shares to reconstruct the key before the ciphertext has expired. Thus, there are security problems in the schemes of [5, 6]. Besides, these decentralized solutions adopt symmetric encryption algorithms, which bring complex key management and distribution problems. To solve these problems, an improved system called SafeVanish is proposed [11]. The RSA algorithm is adopted to first encrypt the symmetric key in order to resist the Sybil attack. But this system cannot support a fine-grained access control mechanism. Applying an attribute-based encryption algorithm, Xiong et al. [12] are the first to propose a secure self-destruction scheme that supports fine-grained access control on documents. However, applying attribute-based encryption directly to documents is not efficient.

Therefore, a secure sensitive data self-destruction scheme, which supports efficient data encryption and key management, fine-grained access control in lifetime and secure self-destruction after expiry, and traditional cryptanalysis attack and Sybil attack resistance, is needed in the cloud storage environment.

3. Preliminaries

3.1. Distributed Hash Table

A distributed hash table (DHT) [13] supports a distributed database storage model. A DHT network consists of large-scale distributed infrastructure in a P2P network that supports the query, storage, retrieval, and management of data without servers. Every node in the DHT network is responsible for a small part of the routing and stores part of the data. Thus, the whole DHT network realizes the addressing and storing of data. There are many DHT networks on the Internet, such as Vuze, Chord, OpenDHT, and Pastry.

The index of every document stored in the DHT network can be expressed as a pair (key, value). The key is the hash value of the name or other descriptive information of the document; the value can be the IP address or other descriptive information of the node that stores the document in the DHT network. All of the index items compose a large document index hash table. When the key is specified, the location of the document can be determined through the corresponding relationship.

Every DHT network has the following three important characteristics, which make it suitable for constructing a data self-destruction scheme in the cloud storage environment:

(1) Data availability: a DHT network provides reliable distributed storage capacity, which assures the availability of the data stored in its nodes during the lifetime. This is the foundation of constructing a data self-destruction scheme.

(2) Automatic data deletion in the nodes: nodes in a DHT network periodically remove old data in order to store new data. Thus, the data stored in the nodes is destroyed automatically after expiry, which provides a mechanism for ciphertext self-destruction.

(3) Large scale and global distribution: for example, there are more than one million simultaneously active nodes in the Vuze network, and these nodes are distributed across more than 190 countries. These completely distributed nodes provide attack resistance for the self-destruction scheme.
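The (key, value) indexing and the automatic deletion described above can be illustrated with a small in-memory sketch. It is not the Vuze or OpenDHT protocol: the class name `ToyDHT`, the node count, the timeout value, and the explicit `now` argument (used instead of a real clock) are all assumptions made for illustration.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a DHT: hashed keys, values that expire."""

    def __init__(self, num_nodes=8, timeout=3600):
        # Each "node" is just a dictionary; num_nodes and timeout are
        # arbitrary illustrative values, not parameters of a real network.
        self.nodes = [dict() for _ in range(num_nodes)]
        self.timeout = timeout

    def _locate(self, name):
        # key = hash of the document name; the key decides the node.
        key = hashlib.sha256(name.encode()).hexdigest()
        node = self.nodes[int(key, 16) % len(self.nodes)]
        return key, node

    def put(self, name, value, now):
        key, node = self._locate(name)
        node[key] = (value, now + self.timeout)

    def get(self, name, now):
        key, node = self._locate(name)
        entry = node.get(key)
        if entry is None or now >= entry[1]:
            node.pop(key, None)  # expired data is discarded for good
            return None
        return entry[0]
```

Once a value has passed its timeout it is dropped and cannot be fetched again, which is the property a self-destruction scheme relies on.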

3.2. Attribute-Based Encryption

Attribute-based encryption (ABE), a form of public key cryptography, was first proposed by Sahai and Waters in 2005 [14]. In an ABE scheme, the identifier for a user is a set of descriptive attributes rather than the string of characters used in identity-based encryption (IBE). Every attribute can be mapped to an element in $\mathbb{Z}_p^*$ using a hash function. The ciphertext and the user’s key are both associated with attributes. ABE can support a threshold policy on attributes. Namely, a user with a set of attributes $\omega$ can successfully decrypt a ciphertext encrypted under a set of attributes $\omega'$ if and only if the number of attributes common to $\omega$ and $\omega'$ is greater than or equal to a certain threshold value $d$.

Specifically, an authority first defines a threshold value $d$ and generates the system public key, the length of which is related to the number of attributes in the attribute universe. Then, the authority generates the private key for a user with a set of attributes $\omega$. The private key is associated with a random polynomial $q(x)$ of degree $d-1$. In a decryption process, if $|\omega \cap \omega'| \ge d$, then the user chooses $d$ random attributes in the set $\omega \cap \omega'$ and reconstructs the encryption key through Lagrange’s interpolation on the associated polynomial $q(x)$. Thus, the user can decrypt the ciphertext and get the plaintext.
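The threshold policy itself, separate from any cryptography, amounts to a set-overlap test. The following sketch shows only that test; the function name `satisfies_threshold`, the parameter `d`, and the example attribute strings are hypothetical, and no actual ABE key material is involved.

```python
def satisfies_threshold(user_attrs, ciphertext_attrs, d):
    """Return True when at least d attributes appear in both sets."""
    common = set(user_attrs) & set(ciphertext_attrs)
    return len(common) >= d


# Hypothetical attribute sets for illustration only.
user = {"doctor", "cardiology", "hospital-A"}
policy = {"doctor", "cardiology", "nurse"}
allowed = satisfies_threshold(user, policy, d=2)
```

In a real ABE scheme this condition is enforced by the mathematics of the key and ciphertext rather than by an explicit check, so the sketch describes when decryption succeeds, not how.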

3.3. Threshold Secret Sharing

Threshold secret sharing was first proposed by Shamir [15]. The main idea is to divide the secret data into $n$ shares and then distribute these shares to $n$ users. If $k$ or more shares are collected from these users, then the secret data can be reconstructed. Otherwise, the secret data cannot be reconstructed. This method is called $(k, n)$ threshold secret sharing.

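Generally, a threshold secret sharing scheme can be achieved by using Lagrange’s interpolation polynomial. If there is an interpolation polynomial $L(x)$ and there are $k$ different points $(x_1, y_1), \ldots, (x_k, y_k)$ that satisfy the equation $L(x_j) = y_j$, then $L(x) = \sum_{j=1}^{k} y_j \ell_j(x)$ is called Lagrange’s polynomial, which is composed of the following basic polynomials $\ell_j(x) = \prod_{i=1, i \neq j}^{k} \frac{x - x_i}{x_j - x_i}$, where $1 \le j \le k$.

Namely, given $k$ different points satisfying $L(x_j) = y_j$, we can reconstruct a unique polynomial $L(x)$ of degree at most $k-1$.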
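A minimal sketch of threshold secret sharing with Lagrange interpolation follows, written over a prime field so that all arithmetic is exact. The modulus `PRIME`, the function names, and the choice of share abscissas 1..n are assumptions for illustration and are not taken from the SCSD construction.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime used as the field modulus (assumed)


def make_shares(secret, k, n):
    """Split secret into n shares; any k of them recover it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Evaluate the interpolating polynomial at x = 0."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for i, (xi, _) in enumerate(shares):
            if i != j:
                num = (num * (-xi)) % PRIME
                den = (den * (xj - xi)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `reconstruct(make_shares(s, 3, 5)[:3])` returns `s` for any `s` below `PRIME`, while fewer than three shares reveal nothing about it.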

4. SCSD Scheme Construction

In this section, we first describe the system model of the secure ciphertext self-destruction (SCSD) scheme. Then, the detailed algorithm descriptions and the outline of scheme are introduced as follows.

4.1. System Model

The SCSD system comprises six different entities: authority, cloud storage servers, DHT network, data owners, data consumers, and adversaries, as shown in Figure 1.

Authority. Authority provides the system with security parameters setup and key generation processes. Besides, it also assigns attributes for each user.

Cloud Storage Servers. Cloud storage servers are responsible for storing the data sent by the users and assuring that only authenticated users can get access to the data.

DHT Network. Nodes in the DHT network are responsible for storing the ciphertext shares and the attribute shares and can automatically discard the stored data.

Data Owners. A data owner generates sensitive data and then encrypts it under a random access key. Ciphertext shares are sent by data owner to the DHT network along with the attribute shares. Besides, EDO is sent to cloud by data owner.

Data Consumers. The data consumer downloads ciphertext shares and attribute shares from the DHT network and EDO from the cloud. Then, he can decrypt the EDO if his attributes satisfy the ABE threshold policy.

Adversaries. Adversaries may try to capture the data in the cloud or in DHT network.

This paper aims to prevent the leakage of sensitive data stored in the cloud after expiry. For example, sensitive information in a user’s historic archive may leak out during a government investigation. We assume that the data owner and other authenticated users trust each other. Thus, adversaries may try to compromise the EDO in the cloud after the lifetime of the EDO, or they may capture the ciphertext shares and the attribute shares stored in the DHT network within the lifetime of the EDO. So, in the security model of our scheme, we divide the behavior of adversaries into the following two kinds. (1) Adversaries compromise the EDO in the cloud after the lifetime of the EDO. The adversary tries to recover the sensitive data from the EDO. (2) Adversaries compromise the ciphertext shares and the attribute shares stored in the DHT network within the lifetime of the EDO. The adversary tries to decrypt the ciphertext and get the sensitive information according to the shares.

4.2. Algorithm Descriptions

Algorithms of our SCSD scheme are described as follows.

(1) : given a security parameter , the authority first generates the master secret parameters , which are all chosen randomly from . Then, the authority generates the public parameters , where is the set of total attributes of users and each attribute in is associated with one unique element in . is a multiplicative cyclic group with the generator . is a bilinear map. is the threshold value for the total attributes of users. is the threshold value for the total ciphertext shares. is the number of bits in each associated ciphertext extraction. is the number of extractions; is a hash function. is a symmetric encryption algorithm and is the corresponding decryption algorithm. , . Besides, the authority also generates the secret key for a user with attribute set . The authority chooses a polynomial with degree and sets . Then, the user’s secret key is generated as , where .

(2) : given sensitive data , a data owner with an attribute set first chooses a random access key and generates the ciphertext of as . Then, the data owner chooses a random value and generates the ciphertext of as , where are the attribute shares.

(3) : given a ciphertext , the data owner first divides the ciphertext into blocks of bits. If the last block is shorter than bits, then several bits of “” are added to the end until the length of the last block is bits. Suppose the ciphertext is divided as ; the data owner associates the blocks as follows:

Then, the associated ciphertext is .

(4) : this is the inverse algorithm of . Given an associated ciphertext , a data consumer performs as follows:

Then, the data consumer gets the ciphertext from the association .
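The block division, padding, association, and inverse association steps can be sketched as follows. The association rule used here (each block XORed with the previous associated block) is a stand-in chosen for illustration and is not necessarily the rule used by the scheme. Working on bytes rather than bits, padding with zero bytes, and all function names are likewise assumptions; the input is assumed to be nonempty.

```python
def split_blocks(ciphertext, block_len):
    """Cut ciphertext into block_len-byte blocks, zero-padding the last."""
    padded_len = -(-len(ciphertext) // block_len) * block_len  # round up
    padded = ciphertext.ljust(padded_len, b"\x00")
    return [padded[i:i + block_len] for i in range(0, padded_len, block_len)]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def associate(blocks):
    """Chain blocks: a_1 = c_1, a_i = c_i XOR a_(i-1) (hypothetical rule)."""
    out = [blocks[0]]
    for block in blocks[1:]:
        out.append(xor_bytes(block, out[-1]))
    return out


def deassociate(assoc):
    """Inverse of associate: c_1 = a_1, c_i = a_i XOR a_(i-1)."""
    blocks = [assoc[0]]
    for prev, cur in zip(assoc, assoc[1:]):
        blocks.append(xor_bytes(cur, prev))
    return blocks
```

With any chaining rule of this kind, removing part of the associated ciphertext prevents the blocks that depend on it from being restored, which is the effect the extraction step builds on.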

(5) : given the associated ciphertext , for , the data owner first extracts the bits located in in , where is the remaining associated ciphertext after the th extraction from . Note that . All of the extracted ciphertext is denoted by , where is the th extracted ciphertext from . The remaining associated ciphertext after the th extraction from is denoted by . Then, the data owner generates polynomials as follows:

The data owner chooses different integers and then computes the value of for . Finally, the data owner gets the ciphertext shares , where for .

(6) : given the ciphertext shares and the attribute shares from , the data owner first chooses a random index for as a seed to a pseudorandom number generator. Then, the data owner runs the generator to generate indices . For , each ciphertext share is stored in the node indexed by in the DHT network. Similarly, for the attribute shares from , the data owner first chooses a random index as a seed to a pseudorandom number generator. Then, the data owner runs the generator to generate indices . For , each attribute share is stored in the node indexed by in the DHT network.
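The seed-based index generation can be sketched as below: because the generator is deterministic, anyone holding the seed can regenerate the same list of DHT indices and locate the shares. The function name `share_indices`, the 160-bit index width, and the use of Python's `random.Random` are assumptions; `random.Random` is not cryptographically secure, so a deployment would need a keyed pseudorandom function instead.

```python
import random


def share_indices(seed, count, id_bits=160):
    """Derive count DHT indices deterministically from a seed."""
    rng = random.Random(seed)  # illustration only, not a secure generator
    return [rng.getrandbits(id_bits) for _ in range(count)]


def distribute(shares, seed, dht):
    """Store each share under its derived index in a dict-like DHT."""
    for index, share in zip(share_indices(seed, len(shares)), shares):
        dht[index] = share


def collect(seed, count, dht):
    """Fetch whatever shares are still present under the derived indices."""
    return [dht[i] for i in share_indices(seed, count) if i in dht]
```

Only the seed has to travel with the EDO; the indices themselves never need to be stored.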

(7) : given the attribute set of the data owner , from , , , and , the data owner generates the encapsulated self-destruction object and then sends the EDO to the cloud.

(8) : before the expiration timestamp of EDO, a data consumer, with a secret key and an attributes set , first gets the EDO from the cloud. Then, the data consumer runs the pseudorandom number generator to generate indices of attribute shares under the seed . Then, the data consumer gets as many , as possible from the DHT network according to the indices . In order to recover the access key , the data consumer chooses a set of attribute shares . Note that if there are no more than attribute shares in the set of , the data consumer cannot recover the access key since he cannot satisfy the ABE threshold policy. If there is a set of attribute shares , the data consumer first gets Lagrange’s coefficient and then recovers the access key as follows:

(9) : given the EDO from the cloud, the data consumer runs the pseudorandom number generator to generate indices of ciphertext shares under the seed . Then, the data consumer gets more than , from the DHT network. From these , the data consumer can reconstruct the polynomials using Lagrange’s interpolation. Then, the data consumer gets from these polynomials and generates the associated ciphertext . Finally, the original ciphertext is generated by running algorithm. The plaintext is recovered from .

4.3. Outline of SCSD Scheme

There are two main phases of SCSD scheme, namely, the data encapsulation phase and the data reconstruction phase. The outline of SCSD scheme is illustrated in Figure 2.

In data encapsulation phase (Phase I), the data owner firstly runs the algorithm to generate the ciphertext of sensitive data under ABE. Then, the data owner runs the algorithms , , and in turn to get the ciphertext shares and attribute shares and then distributes the shares to the DHT network. Besides, the data owner runs the algorithm to get the EDO and then sends the EDO to the cloud.

In data reconstruction phase (Phase II), the data consumer firstly runs the algorithm to generate the access key of ciphertext before the EDO expires. Note that if the data consumer does not satisfy the ABE threshold policy defined by the data owner, he can not recover the access key successfully. Then, the data consumer runs the algorithm to get the ciphertext and finally recovers the sensitive data.

5. Analysis and Performance

In this section, we evaluate our SCSD scheme by modularizing it into two parts, namely, security analysis and scheme performance.

5.1. Security Analysis

In the applications of our scheme, because adversaries can not specify the particular object of attack before the expiration timestamp, we assume that the copies of EDO stored in the cloud are secure during this time. Besides, because the attribute shares and ciphertext shares stored in the DHT network will be discarded after the expiry of EDO, once the DHT network is updated periodically, the contents of EDO copies will be unreadable.

There are mainly two kinds of attack aiming at our scheme. The first one is cracking the expired EDO copies stored in the cloud through cryptanalysis attack and brute-force attack. Despite the fact that the attribute shares and ciphertext shares are discarded, there are still EDO copies stored in the cloud. The other kind of attack is aiming at collecting the attribute shares and ciphertext shares in the DHT network before the expiration timestamp of EDO, and these shares will be used in the tracing attack against the EDO copies stored in the cloud.

Therefore, the security of our scheme is mainly affected by two aspects. One is the security of the encryption algorithm used to encrypt the sensitive data under the access key, which depends on its capability of resisting the cryptanalysis attack and the brute-force attack. The other is the security of the DHT network that stores the attribute shares and ciphertext shares, which depends on its capability of resisting the sniffing attack, the hopping attack, and other DHT Sybil attacks. So, we make the security analysis of our scheme based on these two aspects as follows. A brief comparison of the security properties of our SCSD scheme and the schemes of [5, 6] is summarized in Table 1.

5.1.1. The Security of Encryption Algorithm

The brute-force attack is implemented by trying every possible decryption key on the ciphertext to recover the plaintext. This kind of attack requires the complete ciphertext. So, adversaries must first obtain the complete ciphertext before implementing the brute-force attack. In our scheme, however, the sensitive data is first encrypted under the random access key, and then the ciphertext is associated and extracted. Because the blocks of the associated ciphertext are correlated with one another, once some of the blocks are extracted, the remaining blocks are no longer complete. Therefore, without the complete ciphertext, adversaries cannot recover the sensitive data by the brute-force attack.

Besides, the traditional cryptanalysis attack also requires a complete ciphertext. Because the remaining ciphertext blocks stored in the cloud are incomplete, the traditional cryptanalysis attack has no effect on our scheme.

5.1.2. The Security of DHT Network

In the following, we will discuss whether adversaries can crack the EDO copies by attacking the DHT network before the expiration timestamp of EDO. Because adversaries can not specify the particular object of attack before the expiration timestamp, the adversaries may try to get as many attribute shares and ciphertext shares as possible during this time. For example, the adversaries may keep on attacking the DHT network in order to get enough shares. However, this kind of attack will bring expensive cost to the adversaries.

Due to the characteristic of DHT network, the method of attacking the DHT network to get the attribute shares and ciphertext shares is very difficult. Reference [5] has made detailed analysis aiming at various kinds of DHT attacks by performing simulations in the Vuze DHT network. The result shows that it is impossible for the adversaries to get enough shares from DHT network by implementing sniffing attack, hopping attack, and other DHT attacks. Therefore, in the same way, the adversaries in our scheme also can not get enough attribute shares or ciphertext shares by attacking the DHT network in order to crack the EDO copies stored in the cloud.

5.2. Performance and Optimization

In this section, we first make a performance evaluation of SCSD on the time cost in both the data encapsulation phase and the data reconstruction phase, respectively. Then, we implement the parameter optimization by analyzing the tradeoff between security and availability of our scheme.

5.2.1. Performance Evaluation

In Phase I, the communication overhead is mainly caused by the distribution of ciphertext shares and attribute shares to the DHT network. The computation overhead is mainly caused by the ABE algorithm on the access key, the symmetric encryption algorithm on sensitive data, and the association and the shares generation algorithm on ciphertext. In Phase II, the communication overhead is also mainly caused by the collection of ciphertext shares and attribute shares from the DHT network. The computation overhead is mainly caused by the reconstruction of the access key and the ciphertext.

Based on the above analysis, we execute our SCSD scheme and measure the times spent in the two main phases. For the sake of simplicity, we set the total shares and the threshold for the ciphertext shares and attribute shares, respectively. The evaluation uses an Intel G2130 3.2 GHz with 4 GB of RAM, Java 1.6, and a broadband network. The times of the two main phases are shown in Figure 3.

Figure 3 shows that the data collection and reconstruction phase is relatively fast. The time cost of data encapsulation and distribution, however, is quite large. Fortunately, a simple pretreatment, pregenerating the access key and prepushing shares into the DHT network, can be implemented. As shown in Figure 3, this pretreatment can lead the time of data encapsulation phase to a fixed 1.6 s. Thus, the performance of SCSD scheme is relatively effective and efficient.

5.2.2. Parameter Optimization

Next, we assume that the adversaries have compromised 5% of the nodes in a thousand-node DHT network. We will show how the security and the availability of our scheme are affected by the number of shares and the threshold. The probability that an adversary captures sufficient shares to reconstruct the ciphertext is shown in Figure 4. It is clear that increasing the number of shares can decrease the adversary’s success probability. Furthermore, the security can also be enhanced as the threshold increases.
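The trend described above can be approximated with a simple binomial model. This is a sketch under an assumption not stated in the paper: each share is placed on an independently chosen node, so each share is captured with probability `p` (0.05 in the scenario above). The function name `capture_probability` is hypothetical, and the model is not claimed to reproduce the exact curves of Figure 4.

```python
from math import comb


def capture_probability(n, k, p=0.05):
    """P(adversary holds at least k of n shares), shares independent."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Same threshold ratio (50%), more shares: success probability drops.
few_shares = capture_probability(10, 5)
many_shares = capture_probability(20, 10)

# Same number of shares, higher threshold: success probability drops.
low_threshold = capture_probability(20, 10)
high_threshold = capture_probability(20, 15)
```

Under this model the benefit of more shares appears when the threshold ratio is held fixed; raising the number of shares while keeping the absolute threshold unchanged would instead help the adversary.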

As shown in Figure 5, the availability is also affected by the parameters. The maximum timeout gets longer as the number of shares increases. And longer timeout can also be supported by smaller threshold since the scheme can tolerate more share loss. So, the choice of threshold can represent a tradeoff between security and availability. High threshold can provide more security and low threshold can provide longer lifetime. Therefore, by choosing the proper share number and threshold, we can get a tradeoff of high security and good availability.

Besides the parameters, there are other kinds of optimizations for our scheme. Because of the adoption of ABE algorithm, our SCSD scheme can implement one-to-many authorization and access control flexibly. Moreover, the access key can be used repeatedly in the condition of timely processing huge volume of data while the security requirement is lower. And if the requirement of security is higher, the ciphertext shares and the attribute shares can also be distributed to different DHT networks, respectively, one to Vuze and the other to OpenDHT [16], which will improve the security of our scheme significantly.

6. Conclusion

In cloud storage system, secure data destruction is one of the problems that need to be addressed in data security. Many data destruction schemes have been proposed in recent years. However, there are still some limitations. In this paper, we mainly focus on the ciphertext destruction and propose a secure ciphertext self-destruction scheme with attribute-based encryption called SCSD, which applies the attribute-based encryption and the distributed hash table technology to the process of data destruction in the cloud storage environment. Compared with the current schemes, our scheme can resist the traditional cryptanalysis attack as well as the Sybil attack in the DHT network. Besides, the performance of SCSD scheme is relatively effective and efficient.

Conflict of Interests

The authors declare that they have no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the School Innovation Foundation and the Doctorial Foundation under Grant 2014JY170. The authors thank the anonymous reviewers for their useful comments and suggestions.