Abstract

Data outsourcing services have emerged with the increasing use of digital information. They can be used to store data from various devices via networks that are easy to access. Unlike existing removable storage systems, storage outsourcing is available to many users because it has no storage limit and does not require a local storage medium. However, the reliability of storage outsourcing has become an important topic because many users employ it to store large volumes of data. To protect against unethical administrators and attackers, a variety of cryptographic systems are used, such as searchable encryption and proxy reencryption. However, existing searchable encryption technology is inconvenient for use in storage outsourcing environments where users upload their data to be shared with others as necessary. In addition, some existing schemes are vulnerable to collusion attacks and are computationally inefficient. In this paper, we analyze an existing proxy reencryption with keyword search scheme.

1. Introduction

Network development has accelerated data communication, and data outsourcing services have been developed to store data on remote storage media, from which users can retrieve the data with various devices. Many companies now provide competitive high-capacity storage services, and an increasing number of people use them to store their data. However, storing sensitive data such as medical or financial information aggravates the “Big Brother problem” and increases the risk of data disclosure by attackers and unethical administrators.

One scheme for protecting user data is data encryption on the data outsourcing server. However, this approach can cause difficulties during data access. Users must download all of their own data, and decryption needs to be applied to the entire dataset before the data can be searched. This can be viewed as a major disadvantage of data outsourcing. Therefore, searchable encryption systems have been developed that can encrypt data indexes to allow index searching without exposing the data to attackers and unethical administrators.

The study of searchable encryption systems began with searchable symmetric encryption (SSE) based on symmetric key cryptography as well as the development of generic cryptographic algorithms. The first construction of SSE was proposed by Song et al. [1]. Then, a new scheme using Bloom filters was proposed by Goh [2]. To provide faster retrieval, an SSE scheme using an encrypted linked list was proposed by Curtmola et al. [3].

Next, research into searchable encryption systems based on public-key cryptography has been actively carried out. The first public key encryption with keyword search (PEKS) scheme, which uses a bilinear map, was proposed by Boneh et al. [4]. The PEKS scheme has since been extended with a variety of functions; for example, multiuser capability was proposed [5–11].

However, this scheme is difficult to apply in a cloud environment where there is frequent data sharing among users. To address this problem, proxy reencryption with keyword search (PRES) systems have been developed that reencrypt encrypted indexes and allow users to search during safe data storage outsourcing and sharing without the need for a decryption process [12–14].

However, some existing systems do not consider users who share data with other users or the storage outsourcing structure, which means that they handle the indexes and data encryption as a single process. In reality, the indexes and data are stored separately during storage outsourcing. The indexes are stored on the master server, and the data are split into chunks, which are then distributed to many chunk servers. Therefore, such searchable reencryption systems are difficult to apply to a real outsourced storage system. In addition, some existing schemes are vulnerable to collusion attacks, and some allow only one-hop data sharing. In reality, there is no longer any control after the data have been shared. If data need to be shared, the user has no choice other than to accept multihop reencryption. Finally, most searchable reencryption schemes require large volumes of computing resources for data storage and sharing.

The present study examines how PRES operates in the scenario described above and analyzes what happens to an existing scheme when the administrator of an untrusted remote storage colludes with a user with whom data are shared.

2. Preliminaries

In this section, we provide the necessary preliminary details.

2.1. Bilinear Maps

The bilinear map was proposed originally as a tool for attacking elliptic curve encryption by reducing the discrete logarithm problem on an elliptic curve to the discrete logarithm problem in a finite field, thereby reducing its complexity. However, it has been used recently as an encryption tool for information protection instead of an attacking tool. The terms bilinear pairing and bilinear map are used interchangeably. They are defined and the theory is described below.

Definition 1. Let $G_1$ be a cyclic additive group and $G_2$ a cyclic multiplicative group, both of prime order $q$. A map $e: G_1 \times G_1 \to G_2$ is an admissible bilinear map if it satisfies the following properties.
(i) Bilinear: $e(aP, bQ) = e(P, Q)^{ab}$ for all $P, Q \in G_1$ and all $a, b \in \mathbb{Z}_q^*$.
(ii) Nondegenerate: the map does not send all pairs in $G_1 \times G_1$ to the identity in $G_2$. Note that $G_1$ and $G_2$ are groups of prime order, which implies that if $P$ is a generator of $G_1$, then $e(P, P)$ is a generator of $G_2$.
(iii) Computable: there is an efficient algorithm to compute $e(P, Q)$ for any $P, Q \in G_1$.
The following definition was constructed based on the bilinear map $e$. With this map, the decisional Diffie-Hellman problem in $G_1$ can be solved readily: given $P$, $aP$, $bP$, and $cP$, the relation $c = ab \bmod q$ holds if and only if $e(P, cP) = e(aP, bP)$. Therefore, the many encryption protocols that use the bilinear map as an encryption tool base their security on the following problem instead.
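The three properties of Definition 1 and the Diffie-Hellman decision check can be tried out in a toy model. The sketch below is only an illustration and is not secure: it assumes that the element xP of the additive group is stored as the integer x modulo a prime Q, and that the target group is the order-Q subgroup generated by 1 + Q in the integers modulo Q^2. The names `Q`, `MOD`, `G`, `P`, `pairing`, and `is_dh_tuple` are hypothetical and do not come from the paper.

```python
# Toy model of an admissible bilinear map (illustration only, NOT secure).
# Assumed representation: the element xP of G1 is stored as the integer x mod Q.
Q = 2**61 - 1        # prime group order (a Mersenne prime)
MOD = Q * Q          # G2 lives inside the integers modulo Q^2
G = 1 + Q            # (1 + Q)^k = 1 + k*Q (mod Q^2), so G has order exactly Q
P = 1                # generator of G1 in this representation


def pairing(x: int, y: int) -> int:
    """e(xP, yP) = G^(x*y), computed in the order-Q subgroup generated by G."""
    return pow(G, (x * y) % Q, MOD)


def is_dh_tuple(a_p: int, b_p: int, c_p: int) -> bool:
    """Decide whether c = a*b (mod Q) from aP, bP, cP alone, via the pairing."""
    return pairing(a_p, b_p) == pairing(P, c_p)


if __name__ == "__main__":
    a, b = 7, 11
    x, y = 3 * P, 5 * P
    # Bilinearity: e(a*X, b*Y) equals e(X, Y)^(a*b).
    print(pairing(a * x % Q, b * y % Q) == pow(pairing(x, y), a * b, MOD))
    # Nondegeneracy: e(P, P) is not the identity of G2.
    print(pairing(P, P) != 1)
    # Decision problem: (7P, 11P, 77P) is a valid tuple, (7P, 11P, 78P) is not.
    print(is_dh_tuple(7, 11, 77), is_dh_tuple(7, 11, 78))
```

In this representation the discrete logarithm in the additive group is trivial, so the model only demonstrates the algebra; a real deployment would use a pairing over an elliptic curve.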

Definition 2 (BDHP, Bilinear Diffie-Hellman Problem). Given the elements $P$, $aP$, $bP$, $cP \in G_1$, the BDHP is the problem of computing $e(P, P)^{abc}$. In this study, the admissible bilinear map was used as the basis for secret number production during the key construction process between heterogeneous devices. This problem can be solved if the elliptic curve discrete logarithm problem can be solved. For example, $a$ can be calculated from $aP$, so $e(P, P)^{abc}$ can be calculated using $e(bP, cP)^{a}$.
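The closing remark of Definition 2 can be written out as a one-line derivation. It assumes the standard formulation of the BDHP (given $P$, $aP$, $bP$, $cP$, compute $e(P, P)^{abc}$) and uses only the bilinearity property of Definition 1; the symbols are the conventional ones and are an assumption here.

```latex
e(bP, cP)^{a} \;=\; \left( e(P, P)^{bc} \right)^{a} \;=\; e(P, P)^{abc}
```

Anyone who can recover $a$ from $aP$ can therefore evaluate the left-hand side from public values, which is why the BDHP is no harder than the elliptic curve discrete logarithm problem.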

2.2. Existing PRES Scheme

In this section, we review the scheme proposed by Chen and Li in 2011 [13].

2.2.1. Notation

The notation used in this scheme is as follows. (i): Prime number. (ii): Cyclic additive group of order . (iii): Cyclic multiplicative group of order . (iv): Generator of . (v): Bilinear map, . (vi): Hash function, . (vii): Hash function, . (viii): Hash function, . (ix): Hash function, .

2.2.2. Protocol

As with most PRES schemes, the protocol of Chen and Li has a total of seven phases: KGen, Enc, RKGen, REnc, TGen, Test, and Dec.

KGen Phase. Each entity that uses the remote storage generates its public/private key pair in the KGen phase:

Enc Phase. User transmits encrypted data to remote storage :

RKGen Phase. User transmits a reencryption key to in order to share data with :

REnc Phase. reencrypts the data with the reencryption key transmitted by .

TGen Phase. User transmits a produced trapdoor to in order to search the shared data from :

Test Phase. transmits the search results after searching the data using the trapdoor sent from .

Dec Phase. User verifies the contents by decrypting the data relevant to the search results:

2.2.3. An Analysis on the Protocol

We analyze the PRES scheme proposed by Chen and Li for possible security threats.

Analysis 1: Problem of Sharing Process. In the RKGen phase, produced to share his own data. This is known as producing with the similar scheme of . However, the value of is not knowable even if the data owner is . In the Enc phase, produces a different random value for each data item and does not save it separately. In addition, directly deducing the value used only in a multiplication operation from the encrypted data is not possible even if is the data owner. In other words, cannot produce an value to reencrypt the uploaded data. For RKGen to work, the value should be made public, or should save all values relevant to each data set.

Analysis 2: Collusion Problem. Let us suppose that and are in collusion. If the value is revealed, or all files are encrypted using the same values, can easily produce the reencryption key by using the public key of data owner and the private key of colluder . Then, a file that was not meant to be shared can be reencrypted as below for :

According to the above scheme, unwanted sharing is possible, not just with but with anybody that did not want.

Analysis 3: Data Encryption Problem. In the Enc phase, the process is implemented to encrypt the data. In other words, a message is encrypted by multiplying it by an element of the multiplicative group obtained in the elliptic curve setting. This multiplication is only defined for elements of the multiplicative group. Consequently, once the plaintext has been converted to an element of the multiplicative group, the original plaintext value cannot be obtained during later decryption.

2.3. Security Requirement

The following requirements should be met to ensure safe searching and sharing in an outsourced storage environment.
(i) Confidentiality: data transmitted between the outsourced storage server and the client terminal should be readable only by validated users.
(ii) Search speed: a client with limited system resources should be able to quickly search documents, including word processing files, stored in outsourced storage systems. If the data index is structured as in Figure 1, as in existing schemes, the server must go through every index to find the data containing the keyword, which is very inefficient. In addition, many previously developed search algorithms do not apply to this structure, so the storage server must perform a sequential search, and the scan speed decreases rapidly as the number of documents increases. To solve this problem, the structure of the encrypted index must be changed, as shown in Figure 2. With such a structure, the fastest previously developed search algorithms can be used for the data search.
(iii) Traffic efficiency: the communication volume between the client and server should be small for energy and network resource efficiency.
(iv) Calculation efficiency: calculation efficiency should be provided for index generation, search execution, and safe sharing of data with other users. The previous scheme is highly inefficient for encrypting variable-length data. It is more effective to encrypt the data with a symmetric key and to hide that key using a multiplication operation in the multiplicative group.
(v) Storage volume efficiency: a variety of distributed file systems have been developed to provide cloud storage services. These systems store the index in the master server’s memory for faster data retrieval, so the storage capacity available for the index is limited. For this reason, service providers merge repeated keywords to optimize the index. With existing schemes, which use the structure in Figure 1, the server cannot merge duplicated keywords, and the index capacity increases rapidly with the number of documents. If the structure shown in Figure 2 is adopted, however, index capacity management becomes more efficient.
(vi) Sharing efficiency among users: encrypted data saved on remote storage must be retrievable and must be shared securely and efficiently among users of an unreliable server. Cloud service providers should make shareable only the data that the data owner wishes to share with another user. Most PRES papers build on previously proposed proxy reencryption (PRE) schemes. These schemes provide a once-only sharing function. In other words, cannot share data with another user with a similar scheme as the one used to share the data between users and . However, is able to search and decrypt the shared data and then share it by saving it to the remote storage again through the PRES encryption process. Existing PRES schemes thus cannot pass on shared data directly, and additional decryption and encryption operations are needed to share the data again. Therefore, PRES needs to consider a re-share operation.
(vii) Prevention of a collusion attack: the administrator of the remote storage is treated as an untrusted entity who may obtain unauthorized access to the data through collusion. Therefore, future PRES schemes need to be safe from collusion attacks.
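The difference between the two index layouts discussed under search speed and storage volume efficiency can be sketched as follows. Because the figures are not reproduced here, the sketch assumes that Figure 1 corresponds to a per-document list of keyword tags and Figure 2 to one entry per distinct keyword pointing to the matching documents. It also assumes that equal keywords produce tags the server can recognize as equal; the tag strings and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample: document id -> opaque (already encrypted) keyword tags.
DOCS = {
    "d1": ["t_cloud", "t_key"],
    "d2": ["t_cloud"],
    "d3": ["t_cloud", "t_proxy"],
}


def build_per_document_index(docs):
    """Assumed Figure 1 layout: every document keeps its own tag list."""
    return {doc_id: list(tags) for doc_id, tags in docs.items()}


def search_per_document(index, tag):
    """Sequential scan: every document entry is inspected for each query."""
    return sorted(doc_id for doc_id, tags in index.items() if tag in tags)


def build_merged_index(docs):
    """Assumed Figure 2 layout: one entry per distinct tag (duplicates merged)."""
    merged = defaultdict(set)
    for doc_id, tags in docs.items():
        for tag in tags:
            merged[tag].add(doc_id)
    return dict(merged)


def search_merged(index, tag):
    """Single dictionary lookup instead of a scan over all documents."""
    return sorted(index.get(tag, ()))


def stored_tag_count_per_document(index):
    """Number of keyword tags the per-document layout has to store."""
    return sum(len(tags) for tags in index.values())


if __name__ == "__main__":
    per_doc = build_per_document_index(DOCS)
    merged = build_merged_index(DOCS)
    print(search_per_document(per_doc, "t_cloud"), search_merged(merged, "t_cloud"))
    print(stored_tag_count_per_document(per_doc), len(merged))
```

Both layouts return the same documents, but the merged layout stores each distinct tag once and answers a query with one lookup, which is the effect described in requirements (ii) and (v).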

3. Proposed Scheme

In this paper, a practical proxy reencryption scheme with a keyword search capability is proposed that considers the structural characteristics of an entrusted cloud storage center. This paper describes the steps to be taken in a secure data storage, searching, and sharing scenario (see Figure 3).

3.1. Notation

(i): Concatenation. (ii): Prime number. (iii): Number of data items. (iv): Number of keywords in the data. (v): Cyclic additive group of order . (vi): Cyclic multiplicative group of order . (vii): Generator of . (viii): Bilinear map, . (ix): ’s private key in . (x): ’s public key in . (xi): th plaintext data. (xii): th encrypted data. (xiii): th data encryption key . (xiv): th keyword in th data . (xv): Symmetric key encryption by key . (xvi): Symmetric key decryption by key . (xvii): Set of keywords in th data . (xviii): Hash function, . (xix): Hash function, . (xx): Trapdoor for searching keyword .

3.2. Definition

The detailed steps performed by the proposed scheme are as follows.
(i) KeyGen: the users of the outsourced storage generate public/private key pairs prior to using the service. The storage outsourcing server should not store the user’s private key. If the private key is leaked, an attacker can generate a trapdoor by acting as the owner of the private key. Thus, we generate a key pair based on the discrete logarithm problem (DLP).
(ii) : the data owner creates the encrypted index, , and encrypted data, , which only the owner can search by inputting his or her own private key, , and a set of keywords, , which are sent to the master server.
(iii) : to search the data safely, the user creates a trapdoor, , which does not leak information related to the keyword , which is being searched for using the private key . The trapdoor is sent to the master server. The storage outsourcing administrator should not be able to access information via a trapdoor.
(iv) “yes” or “no”: using the trapdoor generated from the user’s private key and the search keyword, the server performs a test to confirm whether the encrypted data contain the keyword. If the ciphertext contains the specified keyword, the server sends a “yes” to the user, and a “no” if it does not. Thus, the server cannot learn anything about the keywords or the data.
(v) : the data owner creates a reencryption key, , to create a data index for sharing that can search. The reencryption key is created with the data owner’s secret key , and the hashed secret key of the user who will be sharing the data.
(vi) : the data owner creates a parameter to generate a data index for sharing that can be searched by . This parameter is created using the data owner’s private key and the public key of the user who will be sharing the data. The master server creates a new index, , which can use to search via the trapdoor.
(vii) : the rightful owner of the encrypted data uses his or her private key to decrypt the encrypted data.
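The key generation, index encryption, trapdoor, and test steps can be illustrated with the PEKS construction of Boneh et al. [4], run here in a toy pairing model. This is a stand-in technique and not the construction proposed in this paper; it is also not secure, because the element xP is stored as the integer x, so the public key reveals the private key. All names (`pairing`, `h1`, `h2`, `keygen`, `encrypt_keyword`, `trapdoor`, `matches`) and the parameter choices are assumptions made for the sketch.

```python
import hashlib
import secrets

# Toy pairing (NOT secure): xP in G1 is stored as the integer x mod Q, and
# e(xP, yP) = G^(x*y) in the order-Q subgroup of the integers modulo Q^2.
Q = 2**61 - 1          # prime group order (a Mersenne prime)
MOD = Q * Q
G = 1 + Q              # (1 + Q)^k = 1 + k*Q (mod Q^2), so G has order Q


def pairing(x, y):
    return pow(G, (x * y) % Q, MOD)


def h1(keyword):
    """Hypothetical hash of a keyword onto a nonzero element of G1."""
    digest = hashlib.sha256(keyword.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def h2(g2_element):
    """Hypothetical hash of a G2 element to a fixed-length tag."""
    return hashlib.sha256(g2_element.to_bytes(16, "big")).digest()


def keygen():
    sk = secrets.randbelow(Q - 1) + 1
    pk = sk % Q            # sk * P with P = 1; only hidden on a real curve
    return sk, pk


def encrypt_keyword(pk, keyword):
    """Index entry (rP, H2(e(H1(w), pk)^r)) for one keyword."""
    r = secrets.randbelow(Q - 1) + 1
    shared = pow(pairing(h1(keyword), pk), r, MOD)
    return r % Q, h2(shared)


def trapdoor(sk, keyword):
    """Trapdoor sk * H1(w); created by the holder of the private key."""
    return sk * h1(keyword) % Q


def matches(index_entry, td):
    """Server-side test: compares H2(e(trapdoor, rP)) with the stored tag."""
    r_p, tag = index_entry
    return h2(pairing(td, r_p)) == tag


if __name__ == "__main__":
    sk, pk = keygen()
    entry = encrypt_keyword(pk, "cloud")
    print(matches(entry, trapdoor(sk, "cloud")))   # same keyword
    print(matches(entry, trapdoor(sk, "proxy")))   # different keyword
```

The server only ever sees the index entry and the trapdoor, and its answer is the “yes” or “no” of the test step; the reencryption and decryption steps of the list are not covered by this sketch.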

3.3. Storage Scenario

The proposed scheme takes the outsourced storage structure into account, so the encrypted index used for sharing and searching is stored on the master server. We assume that each user has a key pair before using the storage outsourcing service (refer to Step 1). The user encrypts the keywords needed for later searches and sends the resulting encrypted index to the master server (refer to Step 2). The master server sends chunk information to the user for data storage, and the user then divides the data into chunks and stores them on the designated chunk servers (see Figure 4).

Step 1 (key generation (KeyGen)). Each storage outsourcing service user generates a key pair: selection setting up setting up.

Step 2 (index and data encryption (Enc)). The data owner generates an encrypted index which can be used for searching securely: Alice: selection output encrypted index for the master server output encrypted data for the chunk server: .

3.4. Search Scenario

The user sends the master server a trapdoor that allows the data to be searched without exposing keyword information (refer to Step 1). The master server searches the encrypted index for the data with the keyword using the trapdoor and then sends the chunk information that corresponds to the data to the user (refer to Step 2). The retrieved data are decrypted by the legitimate user (refer to Step 3). The user acquires the data by combining the chunks received from the chunk servers that store the data (see Figure 5).

Step 1 (trapdoor generation (TGen)). A user, , who wants to search the data generates a trapdoor using the keywords and his or her secret key: Alice → Server: .

Step 2 (Test). To confirm that the data contain the keywords sought by the user, the server performs the following test with the public key, the trapdoor, and the stored ciphertext:

Step 3 (decryption (Dec)). The user can perform the following decryption using his or her private key and the ciphertext obtained from the server:

3.5. Sharing Scenario

To share data with the desired user and to allow the shared users to share data freely with another user, reencryption needs to be performed to allow the shared users to search only the encrypted index. Implementing proxy reencryption and searchable encryption as separate schemes for secure data sharing in a storage outsourcing environment requires many parameters, which reduces the storage volume efficiency. Therefore, we propose an algorithm that provides both functions simultaneously. First, parameter is generated to allow index sharing with another user, which is sent to the storage outsourcing provider by the owner of the data (refer to Step 1). Next, the storage outsourcing provider changes the owner’s index with respect to the data sharing target. Shared (reencrypted) data searching is then possible, as shown in Steps 2–5. A user who acquires the data sharing index can always search for the corresponding data using keywords and then download it (see Figure 6).
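As background for this scenario, the general idea of proxy reencryption can be sketched with the classical ElGamal-based construction of Blaze, Bleumer, and Strauss. This is a different, well-known technique and not the scheme proposed here; it is shown only because it makes the roles of the reencryption key and of the proxy concrete, and because it exhibits the collusion weakness that a PRES scheme must avoid. The group parameters are tiny textbook values, the function names are hypothetical, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
import secrets

# Small textbook group (NOT secure): P_MOD = 2*Q + 1 with Q prime, and G
# generates the subgroup of order Q in the integers modulo P_MOD.
P_MOD, Q, G = 2039, 1019, 4


def keygen():
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P_MOD)


def encrypt(pk, m):
    """Ciphertext (m * G^r, pk^r) for a message m in 1..P_MOD-1."""
    r = secrets.randbelow(Q - 1) + 1
    return m * pow(G, r, P_MOD) % P_MOD, pow(pk, r, P_MOD)


def rekey(sk_from, sk_to):
    """Reencryption key sk_to / sk_from (mod Q); needs both secrets here."""
    return sk_to * pow(sk_from, -1, Q) % Q


def reencrypt(rk, ciphertext):
    """Proxy step: turns G^(a*r) into G^(b*r) without seeing the message."""
    c1, c2 = ciphertext
    return c1, pow(c2, rk, P_MOD)


def decrypt(sk, ciphertext):
    c1, c2 = ciphertext
    g_r = pow(c2, pow(sk, -1, Q), P_MOD)      # recover G^r
    return c1 * pow(g_r, -1, P_MOD) % P_MOD


def collude(rk, sk_to):
    """A proxy holding rk and a recipient holding sk_to recover sk_from."""
    return sk_to * pow(rk, -1, Q) % Q


if __name__ == "__main__":
    sk_a, pk_a = keygen()
    sk_b, _ = keygen()
    ct = encrypt(pk_a, 42)
    rk = rekey(sk_a, sk_b)
    print(decrypt(sk_a, ct), decrypt(sk_b, reencrypt(rk, ct)))
    print(collude(rk, sk_b) == sk_a)
```

The last line shows why collusion between the storage administrator and a sharing target matters: in this classical construction the two parties together learn the owner's private key, so every file of the owner is exposed, not only the shared one.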

Step 1 (reencryption key generation (RKGen)). If the data owner wants to share data with other users, he or she can generate keys for reencryption. If user wants to share data with user , generates parameter using ’s secret key and ’s public key, as follows:Bob → Alice: Alice: Alice → Server: .

Step 2 (reencryption (REnc)). If user wants to share data with user , generates parameter using ’s secret key and ’s hashed secret key, as follows:

Step 3 (trapdoor generation (TGen)). User who wants to search the data generates a trapdoor using the keywords and his or her secret key:

Step 4 (test). To confirm that the data contain the keywords the user seeks, the server performs the following test using Bob’s trapdoor. It checks the equality . If the equality holds, the output is “yes”; otherwise, it is “no”.

Step 5 (decryption (Dec)). The user can perform the following decryption with his or her private key:

4. Analysis

The proposed scheme satisfies the following requirements.
(i) Confidentiality: using pairing, the proposed scheme makes it difficult for a malicious third party to decrypt communication contents, even if they eavesdrop on communications between the client and the server.
(ii) Search speed: a quick index search is possible with the index structure shown in Figure 2, and a user can check whether a document contains a keyword with a single pairing calculation, which increases the searching speed (see Figure 7).
(iii) Traffic efficiency: keyword search and reencryption each require only one round of communication, so the scheme keeps the communication volume low.
(iv) Storage volume efficiency: by using the new index structure, the proposed scheme can reduce the storage volume dramatically compared with traditional schemes, even as the number of indexed documents increases (see Figure 8), because it can merge identical keywords.
(v) Calculation efficiency: users can generate indexes, search documents, and perform reencryption with relatively simple pairing calculations, which increases the calculation efficiency (see Table 1).
(vi) Sharing efficiency among users: our scheme allows data encrypted and stored on an unreliable remote outsourced storage server to be shared safely and efficiently. In addition, our proposed scheme differs from existing schemes because it does not require the shared subjects to be specified in advance, and no additional devices are required to manage the subjects who receive the shared data. Finally, if users want to re-share the data shared by the owner with other users, they only require one pairing calculation in an unreliable storage outsourcing environment.
(vii) Prevention of collusion attack: in the proposed scheme, each data set is encrypted with a different random key (for symmetric encryption). Therefore, the sharing phase can be carried out only by the lawful data owner. An unethical administrator cannot mount a collusion attack because the key is known only to the lawful data owner.

5. Conclusion

The advent of storage outsourcing services has allowed many users to store and access data. Recent studies of the application of searchable encryption technologies to storage outsourcing have attempted to ensure the security of data. However, most available searchable encryption technologies are inefficient when adding data sharing objects because they are based on e-mail environments, in which the objects with which data can be shared are determined in advance. In a storage outsourcing environment, users upload data on their own and share the data in a safe manner. Therefore, the indexes and data must be handled separately for a scheme to be compatible with data storage outsourcing systems. After considering the requirements of the data storage outsourcing environment, we specified the security requirements and proposed a scheme that provides both functions simultaneously: a proxy reencryption function and a searchable encryption function. The proposed scheme provides a free sharing feature with greater calculation efficiency than existing schemes. We also adopted a new index structure for fast data searching on cloud storage. It appears that search schemes based on multiple keywords will become important for ensuring flexibility and for facilitating searches during data storage outsourcing. In the future, it will be necessary to develop a reencryption system where an index containing multiple keywords of variable lengths can be encrypted and searched flexibly.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the MKE (The Ministry of Knowledge Economy), Republic of Korea, under the Information Technology Research Center (ITRC) support program (NIPA-2013-H0301-13-1003) supervised by the National IT Industry Promotion Agency (NIPA). This work was supported by the Soonchunhyang University Research Fund.