The Scientific World Journal

The Scientific World Journal / 2015 / Article

Research Article | Open Access

Volume 2015 |Article ID 706102 | 17 pages | https://doi.org/10.1155/2015/706102

Universal Keyword Classifier on Public Key Based Encrypted Multikeyword Fuzzy Search in Public Cloud

Academic Editor: Juan M. Corchado
Received20 May 2015
Revised17 Jul 2015
Accepted29 Jul 2015
Published25 Aug 2015

Abstract

Cloud computing has pioneered the emerging world by manifesting itself as a service through internet and facilitates third party infrastructure and applications. While customers have no visibility on how their data is stored on service provider’s premises, it offers greater benefits in lowering infrastructure costs and delivering more flexibility and simplicity in managing private data. The opportunity to use cloud services on pay-per-use basis provides comfort for private data owners in managing costs and data. With the pervasive usage of internet, the focus has now shifted towards effective data utilization on the cloud without compromising security concerns. In the pursuit of increasing data utilization on public cloud storage, the key is to make effective data access through several fuzzy searching techniques. In this paper, we have discussed the existing fuzzy searching techniques and focused on reducing the searching time on the cloud storage server for effective data utilization. Our proposed Asymmetric Classifier Multikeyword Fuzzy Search method provides classifier search server that creates universal keyword classifier for the multiple keyword request which greatly reduces the searching time by learning the search path pattern for all the keywords in the fuzzy keyword set. The objective of using BTree fuzzy searchable index is to resolve typos and representation inconsistencies and also to facilitate effective data utilization.

1. Introduction

Cloud computing [1, 2] makes the infrastructure, platform, and software as a service for the worldwide users. The cloud paradigm makes the user outsource their personal data to the cloud storage [3] server which facilitates the users’ access to their data anywhere at any time. The data users of the cloud storage have to pay only for the actual storage they use. Some companies will use the cloud storage for their data backup.

Problem Formulation. We highlight here that the cloud storage server has the responsibility to keep their customer data to be available and accessible all the time. The cloud storage must facilitate their customers to access their wide range of resources, application, and data through internet service interface immediately and fast. The number of customers utilizing the cloud storage increases significantly every day. The data on the cloud storage increases dynamically due to the increasing demands of existing customer and addition of new customers. This means the cloud storage is under a state to respond to increasing customer data and effective access to their data. To retain their customers, the cloud storage must optimize its computational time for searching the requested data. It must have some efficient searching method or additional provisions to serve their customers to provide the requested data immediately. So with this observation, we propose a new searching method named Asymmetric Classifier Multikeyword Fuzzy Search which utilizes the universal keyword classifier to store the search path pattern of all the keywords of their customers data. This allows the cloud storage server to use its time effectively to perform multiprocessing of their growing customers. Our scheme also resolves typos and representation inconsistencies since the searching is done on BTree fuzzy searchable index.

Our Contributions. In this paper, we propose new Scheme Asymmetric Classifier Multikeyword Fuzzy Search (ACMFS) which greatly reduces the time spent for searching the data and delivers the requested file immediately to the users. It also utilizes the data effectively from the cloud storage through fuzzy search on BTree fuzzy searchable index. Experimental results shows effective data utilization and the search efficiency of the proposed scheme. Our contributions are summarized as follows:(i)Asymmetric Searchable Encryption allows the server to search over encrypted BTree fuzzy searchable index thereby providing effective data utilization.(ii)The cloud storage server would not disclose the files to illegal access as they do not have the information about the multiple keywords and files.(iii)As the BTree fuzzy searchable index is created from wild card fuzzy keyword set, it tolerates typos and representation inconsistencies of authorized users.(iv)Classifier search server uses universal keyword classifier for traversing the storage efficient BTree wild card fuzzy searchable index which stores all the search path pattern of the multiple keywords of the entire encrypted files.

Paper Organization. The rest of the paper is organized as follows. The related modules are discussed in Section 2 along with the limitations of the existing searching methods. In Section 3, we formulate our problem by designing the system model and goals of the proposed solution. Then we provide the detailed description of Asymmetric Classifier Multikeyword Fuzzy Search scheme in Section 4 followed by Section 5, which discusses the detailed design and implementation of algorithms of our proposed method. The Experimental results and performance analysis with output are shown in Section 6. We conclude our paper in Section 7.

2. Background

2.1. Related Work

Although Cloud Service Provider (CSP) hosts several third party data, Liu et al. [4] pointed out that managing sensitive data leads to security and privacy concerns. Cryptographic methods can be used to disclose the key only to authorized users to protect the data from untrusted CSP.

Ren et al. [5] state that users have several types of typing behaviour for keywords which are commonly termed as typos, representation inconsistencies, and typing habits. They suggested fuzzy keyword search to overcome these inconsistencies. Though the fuzzy keyword search is prevalent in popular search engines like Google, Bing, and so forth, it still poses risk in cloud storage due to inherent security and privacy obstacles. The searchable encryption [68] is recommended which takes encrypted data as files labeled with keywords and lets user securely search over the files through predefined keywords for retrieving them.

Zhou et al. [9] created k-gram based fuzzy keyword set for keywords W of the encrypted files C and Jaccard coefficient to calculate the keywords similarity.

Wang et al. [10] pointed out that keyword holds sensitive information of the files and thus keyword privacy must be protected for effective data utilization. Xu et al. [11] identified that third party could access the files by knowing the keyword search trapdoor. Xu proposed public key encryption with fuzzy keyword search (PEFKS) in which each keyword corresponds to exact keyword search trapdoor and fuzzy keyword search trapdoor.

Wang et al. [12] discusses that the search over encrypted data not only involves information retrieval techniques such as data structures for representing the searchable index but also depends on efficient search algorithms that run on the index.

2.2. Limitation of the Existing Methods

(1)Secured and privacy preserving keyword search [4]:(i)The encryption and decryption process incurs high communication and computational cost.(2)Secured fuzzy keyword search [5]:(i)It does not support fuzzy search with public key based searchable encryption.(ii)It could not carry out multiple keywords semantic search.(iii)The update operation on fuzzy searchable index is not much efficient.(3)K-gram based fuzzy keyword Ranked search [9]:(i)The k-gram based fuzzy keyword set size depends on the jaccard coefficient value.(4)Verifiable fuzzy keyword search (VFKS) [10]:(i)The symbol tree fuzzy searchable index occupies more space in this search.(5)Public key encryption with fuzzy keyword search:(i)Creating fuzzy keyword index and exact keyword index is not compatible with large database.(6)Privacy-preserving multikeyword fuzzy search [12]:(i)It demands files with relatively high score to reduce the false negative rate.

3. Methodology of Our Scheme

3.1. Cloud Data Utilization Service Architecture

In this paper, we consider our cloud data utilization service architecture which consists of four entities as data owner, cloud storage server, classifier search server, and data users and is shown in Figure 1. Here we assume that the authorization is suitably done between the data owner and data users.

Initially, the data owner generates user’s public and private key pair as (, ). Data owner has a set of data files that are encrypted with user’s public key and are outsourced to the cloud storage server. Data owner predefine multiple keywords for each file. Data owner has a set of multiple keywords , of data files. Data owner creates storage efficient wild card based fuzzy multikeyword set as , using wild card based technique with the predefined edit distance value. Data owner creates BTree wild card fuzzy searchable index from fuzzy multikeyword set. Data owner encrypts the files and index using user public key and is outsourced to the cloud storage server. Data owner sends the user private key as private secret key which is used by the data users for creating keyword trapdoor and for decrypting the file. Now the cloud storage server has the encrypted data files DF and encrypted BTree wild card fuzzy searchable index . The cloud storage server shares the encrypted BTree wild card fuzzy searchable index to the classifier search server. The data user requests the multiple search keywords which are encrypted using the private secret key to create multikeyword trapdoor which is sent to the cloud storage server. The server sends the request to the classifier search server. The universal keyword classifier receives the request to check whether the request is coming for the first time. If the request is arriving for the first time; then the keyword classifier captures and stores the path of the by searching over the encrypted BTree wild card fuzzy searchable index and sends the search path to the cloud storage server. If the request given by the user matches a previous request then it is a repeated multiple keyword. Then the classifier search server extracts the stored search path patterns of the repeated multikeyword from the universal keyword classifier and the search path is sent to the cloud storage server. After receiving the search path pattern of the multiple keywords from the classifier search server, the cloud storage server extracts the set of encrypted files from and is sent to the data users. After receiving the encrypted files, the data user decrypts the files using private secret key.

3.2. Design Goals

To effectively optimize the searching time for the multiple keywords in the cloud storage server and for tolerating the typos and representation inconsistencies of authorized users, our searching method seeks to achieve the following design goals.

Search efficiency goals are(i)to construct the universal keyword classifier for BTree wild card fuzzy searchable index for optimizing search time and for tolerating typos and representation inconsistencies of authorized users.

Security goals are(i)to avoid the cloud storage server from getting the knowledge of data files and keyword set. This is achieved by outsourcing the encrypted files and index to the cloud storage server.

Privacy goals are(i)to provide user privacy by abstracting the details of data files, keyword, and index to the cloud storage server;(ii)to support data privacy by encrypting the files and index with user public key before outsourcing to the cloud storage server;(iii)to attain keyword privacy by forming BTree wild card fuzzy searchable index from the fuzzy multikeyword set for the predefined set of multiple keywords;(iv)to achieve query privacy by sending k-gram keyword trapdoor encrypted with the private secret key;(v)to accomplish index privacy by creating encrypted BTree wild card fuzzy searchable index.

4. Asymmetric Classifier Multikeyword Fuzzy Search (ACMFS)

4.1. Notations and Preliminaries

They are as follows.(i): private secret key;(ii)edit: edit distance;(iii): set of encrypted data files ;(iv): user public key;(v): owner public key;(vi): user private key;(vii): predefined multiple keywords set of DF;(viii): fuzzy multikeyword set;(ix): BTree wild card fuzzy searchable index;(x): encrypted BTree wild card fuzzy searchable index of ;(xi): search path pattern of multiple keywords;(xii): encrypted multikeyword trapdoor search request;(xiii): set of encrypted multiple files matching search request;(xiv): decrypted multiple files of ;(xv): multikeyword search request.

4.2. Searchable Encryption

Searchable encryption [13] is a cryptographic technique where the data users search the encrypted searchable index by the following steps.(i)The encrypted tokens in the searchable index have the pointers to encrypted files. Token symbols are the encrypted keyword.(ii)If the requested token found a match in the searchable index, then it extracts the file pointer without decryption.(iii)If the token is not found in the searchable index, then it returns the null file pointer.The two types of searchable encryption are symmetric and asymmetric (public key based) searchable encryption [14, 15]. In Symmetric Searchable Encryption, the data owner who outsources the encrypted index and data and the server that searches the data share the same secret keys. The efficiency of SSE is high since it uses symmetric cryptographic methods such as block ciphers, pseudorandom functions, and hash functions. The disadvantage of SSE is that the server has high probability to learn about the owner data and keywords. In Asymmetric Searchable Encryption [16, 17], the data owner outsources the index that is encrypted by the user’s public key. The keyword trapdoor is created by the user’s private key. So only the authorized users can request the search from the server. The advantage of ASE is that it supports conjunctive or disjunctive keywords searches. The disadvantage of SSE is that it suffers from KGA.

4.3. Edit Distance

Edit distance is the method of quantifying the similarity of the two strings. The edit distance (, ) between two strings and is the smallest number of operations necessary to change one string to another. The three primitive operations are as follows:(i)insertion: inserting one character into the string;(ii)deletion: deleting one character from the string;(iii)substitution: changing one character to another in the string.The edit distance of the two strings in our system is analysed by dynamic programming. By dynamic programming strategy, the edit distance of any two strings “” and “” is defined as follows and refer to Algorithm 1.Assume the strings are and .If , then .If , then .If , then .If , then .Note that if then else .

Input: Two strings where
Output: Minimum edit distance
(1)  Declare the integer variables length1, length2, ed;
(2)  Declare the character variables char1, char2;
(3)  Declare the integer variables substitution, insertion, deletion, minimum;
(4) length1 = size of ;
(5) length2 = size of ;
(6) ed = new int[length1 + 1][length2 + 1];
(7)  for ( to length1)
(8)   ed;
(9)  end for // loop
(10) for ( to length2)
(11)    ed;
(12) end for // loop
(13)   for ( to length1)
(14)   char1 = .charAt();
(15)    for ( to length2)
(16)     char2 = .charAt();
(17)      if (char1 = = char2) then
(18)       ed = ed;
(19)      else
(20)      
(21)       substitution = ed;
(22)       insertion = ed;
(23)       deletion = ed;
(24)       minimum = (substitution > insertion ? insertion: substitution);
(25)       minimum = (deletion > minimum ? minimum: deletion);
(26)       ed = minimum;
(27)      
(28)      end if;
(29)    end for // loop
(30) end for // loop
(31)  return ed[length1, length2]

Examples for are as follows:insertion: = “SAMI” and = “SAMIY”;deletion: = “SAMI” and = “SAM”;substitution: = “SAMI” and = “SAME”.

4.4. Fuzzy Keyword Set

Since different users have various typing behaviors, they may misspell the keywords. So the fuzzy keyword set is formed to effectively utilize the data. Fuzzy keyword set can be created by wild card based technique, k-gram based technique, and symbol tree based technique. For example, fuzzy keyword set for “W = HEN” with edit distance = 1 is as follows:, .Total number of keywords = 186.

4.5. Fuzzy Keyword Search

For the set of data files with the predefined set of , the fuzzy keyword search fuzzy (w, ) is as shown in Algorithm 2.

Input: Search keyword key
Output: matching the keyword
(1)if key = then
(2)  return
(3) else if edit(key, ) ≤ edit) // edit is the user defined edit distance value
(4)  return
(5) else
(6)  print “File Not Found”
(7)end if
(8) return

4.6. Wild Card Based Technique

Wild card based technique is used to create storage efficient wild card based fuzzy keyword set. We use the wild card “#” character to represent the positions of three edit distance operations such as insertion, deletion, and substitution thereby creating tiny fuzzy keyword set. For example, the wild card based fuzzy keyword set for w = “HEN” with edit distance ed = 1 is as follows:, .Total number of keywords = 11.

The length of fuzzy keyword set is . The length of wild card based fuzzy keyword set is .

4.7. K-Gram Based Technique

K-gram based technique is used to create k-gram based fuzzy keyword set for the predefined gram value k. The keywords of k-gram based fuzzy keyword set is the subset of keyword set. For example, the k-gram based fuzzy keyword set for w = “HEN” with gram value is as follows:.Total number of keywords = 3.

4.8. Assumptions of ADKEFS

Before we start our framework design, we have the following assumptions on our proposed scheme Asymmetric Classifier Multikeyword Fuzzy Search ACMFS.(i)We assume that the cloud storage server concentrates on servicing more customers and not to leave partnership of their customers from the business.(ii)Here we assume that the authorization is suitably done between the data owner and data users.(iii)Data owner creates users private and public key pair.(iv)We assume the wild card based fuzzy multikeyword set FMKS of multikeyword set MKW contains the original keyword as the first component.(v)We assume that each file has multiple keywords and it is possible for a keyword to be the same for multiple files.

5. Implementation of Asymmetric Classifier Multikeyword Fuzzy Search (ACMFS)

Here we discuss our proposed scheme in detail with algorithm for all the functions involved.

5.1. Function Definitions of ADKEFS

The following functions are implemented to optimize the searching on the cloud storage server and to achieve the effective data utilization.

Functions on data owner are(i)CreateKeyPairsForUser(SECRET1, SECRET2),(ii)EncryptMultiKeywordDataFile(),(iii)CreateWildCardFuzzyMultiKeywordSet(MKW, edit),(iv)CreateBTreeWildcardFuzzySearchableIndex(FMKS),(v)EncryptBTreeWildcardFuzzySearchableIndex(BSIWC, PUBKEY).

Functions on cloud storage server are(i)ExtractMultipleFileUsingPattern().

Functions on classifier search server are(i)SearchBTreeWildCardFuzzySearchableIndex(, ).

Functions on data user are(i)ViewDecryptedMultiKeywordFile(, ).

5.2. Overall Framework of Asymmetric Classifier Multikeyword Fuzzy Search (ACMFS)

Our proposed method Asymmetric Classifier Multikeyword Fuzzy Search has classifier search server that creates the search path pattern for all the keywords of the encrypted set of data files. Data owner creates encrypted BTree wild card fuzzy searchable index for the fuzzy multikeyword set and is outsourced to the cloud storage server. This overcomes the problem of typos and representation inconsistencies behaviour of the data users. The overall conceptual description of Asymmetric Classifier Multikeyword Fuzzy Search is shown in Figure 2 as an activity diagram. Please refer to Algorithm 3 for the pseudo-code.

/ Variables Initialization /
(1) Initialize Secret Keys and
(2) Predefine the edit distance “edit” value for Wildcard based Fuzzy Mulikeyword Set Creation
(3) Predefine Multiple Keywords for a single file and a keyword can be shared by multiple files.
/ KeyGeneration /
(4) DataOwner Creates User Public key and User Private Key pairs
(5)   Call CreateKeyPairsForUser ,
 // User Public key and User Private Key
(6) Receive ,
(7) Encrypt the Kdata files using User’s Public key as
(8)   Call EncryptMultiKeywordDataFile
(9)   Receive the Encrypted file
(10) Send the Encrypted file to the Cloud storage server
(11)  Send the User’s Private Key as Private Secret Key to the data users
(12) Data owner creates Wildcard basedFuzzy Multikeyword set for
(13)   Call CreateWildCardFuzzyMultiKeywordSet(MKW ,edit)
(14)   Receive FM
(15) Data ownerCreates B Ttree Wildcard searchable index for FMKS
(16)   call CreateBTreeWildcardFuzzySearchableIndex FMKS
(17)   Receive B Ttree Wildcard searchable index
(18) Encrypts the B+ tree wildcard searchable index
(19) call EncryptBTreeWildcardFuzzySearchableIndex
(20) Receive Encrypted BTree wildcard searchable index
(21) Send the Encrypted BTree wildcard searchable index to the cloud storage server
(22)Cloud Storage server shares the BTree wildcard searchable index to the Classifier Search Server
(23)Data User request Multikeyword search request MSR, and is encrypted using Private Secret Key to
        create MultiKeyword Trapdoor
(24) MultiKeyword Trapdoor is sent to the cloud storage server
(25)Cloud Storage server send the to the Classifier Search Server
(26) Classifier Search Server searches BTree wildcard searchable index
(27)  If the Keyword is the new keyword then
(28)  call SearchBTreeWildCardFuzzySearchableIndex
(29)      Keyword Classifier learn and Store the Search path pattern of
(30)      Receive the Search path pattern of that keyword
(31)      Send the Search path pattern of that keyword to the cloud storage server
(32) else if is already present in the Keyword Classifier then
(33)   Classifier Search Server send the stored Search path pattern of to cloud storage server
(34) End if
(35)   Cloud Storage Server receives the Search path pattern of
(36) Cloud Storage Server extracts the encrypted multiple files of
(37)   Call ExtractMultipleFileUsingPattern SPP
(38) Receive the set of Matching Encrypted files
(39) is sent to theData user
(40) Data User receives the and decrypt the file using Private Secret Key PSK
(41)  call ViewDecryptedMultiKeywordFile ,
(42) Receive the Superset of matching Decrypted file Decrypted File MFILE
(43) Now the user can view the file needed and large number of files viewed by the user
  represents the effective data utilization.

5.3. Key Generation Algorithm

Here we use RSA public key algorithm for generating public and private key. Here this takes two secret keys SECRET1 and SECRET2 which is predefined by the data owner where both secret keys SECRET1, SECRET2 = . Algorithm 4 is executed by the data owner to generate public and private key pair. Data owner sends the user private key as private secret key which is used by the data users to create keyword trapdoor and for decrypting the file.

Input: Two Secret keys , // Predefined SECRET1, SECRET2 =
Output: User Public and Private key pair
(1) Declare the integer variables S1, S2, S3, S4, private, public and key
(2) Assign S1 = SECRET1, S2 = SECRET2
(3) Find key = S1 S2
(4) Compute S3 = (S1 − 1) (S2 − 1), S4 = S3 − (S1 + S2 − 1)
(5) Pick random integer “public”.Check gcd(public, S4) = 1
         // gcd is Greatest Common Denominator
(6) Compute private = e−1 mod S4
(7) = // Public key pair
(8) = // Private key pair
(9) return ()

5.4. File Encryption Algorithm

Data owner executes Algorithm 5 to form encrypted set of data files that are encrypted with user’s public key PUBKEY and are outsourced to the cloud storage server.

Input: User Public key
Output: Encrypted File DF
(1) Declare integer variables key and public,
(2) Assign key = , public =
(3) For to // Data files
(4)  Encrypt the file by computing DF = mod key
(5) end for // loop
(6)  return Encrypted file DF  // Encrypted set of data files

5.5. File Decryption Algorithm

After receiving the search path pattern of the multiple keywords from the classifier search server, the cloud storage server extracts the set of encrypted files from DF and is sent to the data users. After receiving the encrypted files, the data user decrypts the files using private secret key by executing Algorithm 6.

Input: Private Secret key, - Encrypted files
Output: Decrypted FileMFile
(1) Assign PSK =
(2) Assign PSK =
    // PSK − Private Secret Key
(3) For to
     // Data files
(4)   MFILE = mod PSK
      // Decrypting the file
(5) end for // loop
(6) return Decrypted file MFILE

5.6. Wild Card Based Fuzzy Multikeyword Set Algorithm

Data owner has MKW = a set of multiple keywords of data files. Data owner creates storage efficient fuzzy multikeyword set FMKS = using wild card based technique with the predefined edit distance value. Data owner executes Algorithm 7 to form fuzzy multikeyword Set.

Input: Multikeyword set MKW =
edit – Edit Distance
Output: FMKS=
(1) Declare integer variable 1, 1, 1, edit
(2) Char MultiKeywordFuzzy; Initialize FMKS to be empty;
(3) if edit > 1 then
(4)  call CreateWildCardFuzzyMultiKeywordSet(MKW ,ed-1)
(5)  end if
(6)  if edit = 0 then
(7)  FMKS = MKW
(8)  else
(9)  For to length(MKW) do
(10)   For to length(FMKSedit-1) do
(11)    For to 2 length(FMKSedit-11 ) + 1 do
(12)     if is odd value then
(13)     MultiKeywordFuzzy = FMKSedit-1
(14)      Insert “#” at ( + 1/2) position
(15)     else
(16)      Assign MultiKeywordFuzzy = length(Mkw[])
(17)      Insert “#” at () position
(18)     end if
(19)      if MultiKeywordFuzzy is not in FMKSedit-1] then
(20)       Include MultiKeywordFuzzy in FMKSedit]
(21)      end if
(22)     end for // loop
(23)    end for // loop
(24)   end for // loop
(25)  end if
(26)   return FMKS   // Fuzzy MultiKeyword Set

5.7. BTree Wild Card Fuzzy Searchable Index Algorithm

Data owner creates BTree wild card fuzzy searchable index BSIWC from fuzzy multikeyword set. Data owner executes Algorithm 8 to create BTree wild card fuzzy searchable index BSIWC for the wild card based fuzzy keyword set FKS.

Input: FMKS =
   WildCard Fuzzy Multikeyword Set
Output: : BTree Wildcard Fuzzy Searchable Index
(1) Start AddNodeBTree
(2)  Declare the ChildLimit, NumberofChild, ChildrenValue inside the class BTreeNode
(3)  Declare TreeHeight to denote height of the BTree, ;
(4)  Declare the objects KeyNumber, KeyIndex, NodeValue, NextNode, RootNode for AddNode
(5)  Declare KeyIndexNumber that denote the number of keyIndex and Nodevalue pairs in B Tree
(6)  Declare the object ChildrenNodeValue for the class AddNode
(7)  Define Constructor AddNode(KeyIndex, NodeValue, NextNode)
(8)   this.KeyIndex = KeyIndex;
(9)   this.NodeValue = NodeValue;
(10)    this.NextNode =NextNode;
(11)  End ConstrutorAddNode
(12)  Initialize RootNode = new Node()
(13)  for to Number of keywords in FMKS[] do
(14)   for to Number of Fuzzy Keywords in FMKS
(15)    call InsertNodeIntoBTree(RootNode, FMKS[], FMKS, TreeHeight)
(16)    Receive BTreeNode;
(17)    KeyIndexnumber = KeyIndexnumber + 1;
(18)   if (BTreeNode == NULL) then
(19)       return NULL;
(20)   end if
(21)    Create two AddNode object as ChildNode
(22)   ChildNode.childrenNodeValue = new AddNode(RootNode.childrenNodeValue.KeyIndex,
                                 Null, RootNode);
(23)    ChildNode. childrenNodeValue = new AddNode(u. childrenNodeValue.KeyIndex, null,
                                  BTreeNode);
(24)    RootNode = ChildNode;
(25)   TreeHeight = TreeHeight + 1;
(26) End AddNodeBTree
(27) Function AddNodeInsertNodeIntoBTree (RootNode, WFKS, WFKS, TreeHeight)
(28)  Declare the variable for processing loop
(29) for to Number of keywords in FMKS[] do
(30)  for to Number of Fuzzy Keywords in FMKS
(31)   AddNodeCurrentNode = new AddNode(FMKS, FMKS, TreeHeight);
(32)    if (TreeHeight == 0) then
(33)     For to RootNode. NumberofChild do
(34)      if (WFKS[] <RootNode. childrenNodeValue [].FMKS[])
      then
(35)       break;
(36)      end if
(37)     end for // loop
(38)    else
(39)     for to RootNode. NumberofChild do
(40)      if (( == RootNode. NumberofChild) FMKS[] <RootNode.
                  childrenNodeValue [].FMKS[]) then
(41)     AddNodeNewNode = InsertNodeIntoBTree (RootNode. childrenNodeValue[++].NextNode,
                       FMKS[], FMKS, TreeHeight -1);
(42)         if (NewNode == NULL) then
(43)          return null;
(44)         end if
(45)     CurrentNode. FMKS[] =NewNode.childrenNodeValue. FMKS[];
(46)   CurrentNode.NextNode = NewNode;
(47)    break;
(48)   end if
(49)  end for // loop
(50)  for to RootNode. NumberofChild do
(51)    RootNode. childrenNodeValue [] = RootNode. childrenNodeValue[];
(52)   RootNode. childrenNodeValue [] = CurrentNode;
(53)    RootNode. NumberofChild = RootNode. NumberofChild + 1;
(54)   if (RootNode. NumberofChild < ChildLimit) then return null;
(55)    else return SplitNodeInBTree(RootNode);
(56)   end if
(57)   end for // loop
(58)  end for // loop
(59) end for // loop
(60) End Function AddNodeInsertNodeIntoBTree
(61)Function SplitNodeInBTree(RootNode)
(62)  Declare the variable for processing loop
(63)  AddNodeSplitNode = new AddNode (ChildLimit/2);
(64)  RootNode. NumberofChild = ChildLimit/2;
(65)  for to ChildLimit/2 do
(66)   SplitNode. childrenNodeValue = RootNode. childrenNodeValue[childLimit/2 + ];        
(67)  return SplitNode;
(68)  end for // loop
(69) End Function SplitNodeInBTree

5.8. Encrypting BTree Fuzzy Searchable Index Algorithm

Data owner encrypts the BTree wild card fuzzy searchable index BSIWC using user public key and is outsourced to the cloud storage by executing Algorithm 9 to create encrypted BTree wild card fuzzy searchable index BSIE and is outsourced to the cloud storage server.

Input: B+ tree fuzzy searchable index
    - user’s public key
Output: Encrypted BTree WildCard Fuzzy searchable Index
(1) Declare integer variables key1 and public1
(2)Assign key1 =
(3)Assign public1 =
(4) Find the number of elements “Enum” in
(5) For to Enum do
    / Encrypt the fuzzy keyword present in each node /
(6)      = [] mod key1
(7)  end for // loop
(8)  return Encrypted
    // Encrypted BTreeWildCard Fuzzy searchable Index

5.9. Searching Encrypted BTree Fuzzy Searchable Index Algorithm

The data user encrypts the multiple search keywords using the private secret key to create multikeyword trapdoor MKTW to the cloud storage server. The server sends the request MKTW to the classifier search server. The universal keyword classifier receives the request MKTW to check whether the request is coming for the first time. If the request is arriving for the first time, then the keyword classifier captures and stores the path of the MKTW by searching over the encrypted BTree wild card fuzzy searchable index BSIE by executing Algorithm 10 and sends the search path to the cloud storage server. If the request given by the user matches a previous request stored then it is a repeated multiple keyword. Then the classifier search server extracts the stored search path patterns of the repeated multikeyword from the universal keyword classifier and the search path is sent to the cloud storage server.

Input: – Encrypted BTree Wildcard Fuzzy Searchable Index
   –Encrypted MultiKeyword Trapdoor
Output: Set of Matched Encrypted Files
(1) Declare the variable for processing loop
(2)Find the height of the B Tree index and assign it to TreeHeight
(3) AddNode childrenNodeValue = . childrenNodeValue;
(4)  if (TreeHeight == 0) then
(5)    for to NumberofChild do
(6)     if ([] = = childrenNodeValue []. []) then
(7)      return (NodeValue) childrenNodeValue [].NodeValue;
(8)     end if
(9)    end for // loop
(10)   else
(11)  for to .NumberofChild do
(12)   if (( = = . NumberofChild)∣∣([] < childrenNodeValue [].KeyIndex)) then
(13)    return SearchBTreeWildCardFuzzyIndex (childrenNodeValue[].NextNode,
                      [], TreeHeight − 1);
(14)   end if
(15)   end for // loop
(16)  return NULL
(17) end if
(18) End SearchBTreeWildCardFuzzyIndex

6. Implementation Results

6.1. Implementation Setup

The implementation of the proposed work was accomplished through Asymmetric Classifier Multikeyword Fuzzy Search (ACMFS) cloud data utilization service architecture using Jelastic PaaS LayerShift cloud storage provider which offers infrastructure, platform, and application as a service for the customers. The experimentation was carried out with the code programmed in JAVA for data owner, users, classifier search server, and cloud storage server. Microsoft SQL MYSQL 5.5.42 was enabled to act as the database for the proposed system. The simulation was performed with the setup of data owner, data users from our side, and classifier search server, cloud storage server on the Jelastic cloud storage. The data owner authenticates 100 users and defines the multikeyword set for each data files. Prior to evaluating the results, the data owner outsourced 1000 encrypted files to the Jelastic cloud storage. The data owner creates the wild card fuzzy multikeyword set FMKS for edit distance 1, 2 and BTree fuzzy searchable index . The data owner outsources the encrypted and 1000 encrypted files to Jelastic cloud storage server. The cloud storage now contains the encrypted 1000 files and encrypted . With this simulated setup, the authorized users are allowed to access the files in the cloud storage using their individual identity. The users are now allowed to access the files in the cloud storage by entering the multiple keyword search request.

6.2. Performance Analysis

The performance of the proposed method was evaluated taking into account the search time efficiency and data utilization from the Jelastic cloud storage by giving the multiple keyword search request from the data users with classifier search server. The experimental results obtained by ACMFS cloud data utilization system architecture are shown in Tables 15 and Figures 37. Table 1 shows the analysis of time taken for creating the wild card based fuzzy multikeyword set with different number of users and files and its analysis chart is shown in Figure 3. Here data owner predefines five keywords for each file.


Number of usersNumber of filesTime taken for creating wild card based fuzzy multikeyword set (ms)

10300420
12350450
14400453
16450530
18500650
20550655
22600710
24650730
26700810
28750980
308001173


Number of usersNumber of filesSample of correct
multikeyword search request
Number of files retrieved with edit = 1Number of files retrieved with edit = 2

10300Cloud computing2532
12350Data users1728
14400Inter college meet1926
16450Cause of concern2234
18500Web crime issues1217
20550Open access journal2943
22600The set of files3547
24650Storage server1119
26700Remote car access4552
28750Write mode1523
30800Tax benefit818


Number of usersNumber of filesSample of misspelled
multikeyword search request
Number of files retrieved with edit = 1Number of files retrieved with edit = 2

10300Claud computeng2935
12350Dataa usars2131
14400Intor collage meat2227
16450Caase of concarn2527
18500Waeb krime issuesh1419
20550Opan acceass joarnal3146
22600Thee seet of filles3849
24650Starage servor1323
26700Remoote care accesh4957
28750Writee modee1827
30800Taxe beneefit1324


Number of usersNumber of filesSample of correct multikeyword search requestSearch time with classifier search server for edit = 1 (ms)Search time with classifier search server for edit = 2 (ms)Search time without classifier search server for edit = 1 (ms)Search time without classifier search server for edit = 2 (ms)

10300Cloud computing1071543
12350Data users1171624
14400Inter college meet2873523
16450Cause of concern1321620
18500Web crime issues2963524
20550Open access journal20102587
22600The set of files2563029
24650Storage server24183525
26700Remote car access951625
28750Write mode1161622
30800Tax benefit9121516


Number of usersNumber of filesSample of misspelled multikeyword search requestSearch time with classifier search server for edit = 1Search time with classifier search server for edit = 2Search time without classifier search server for edit = 1Search time without classifier search server for edit = 2

10300Claud computeng171715847
12350Dataa usars28139832
14400Intor collage meat9293932
16450Caase of concarn21295016
18500Waeb krime issuesh11307847
20550Opan acceass joarnal25313446
22600Thee seet of filles20263832
24650Starage servor23254932
26700Remoote care accesh13152031
28750Writee modee11102847
30800Taxe beneefit891731

Table 2 shows the analysis of data utilization efficiency for correct keyword in terms of number of files retrieved from the cloud storage and its analysis chart is shown in Figure 4.

Table 3 shows the analysis of data utilization efficiency for misspelled keyword in terms of number of files retrieved from the cloud storage and its analysis chart is shown in Figure 5.

Table 4 shows the analysis of search time efficiency for correct multikeywords with and without classifier search server for edit 1, 2 and its analysis chart is shown in Figure 6.

Table 5 shows the analysis of search time efficiency for misspelled multikeywords with and without classifier search server for edit 1, 2 and its analysis chart is shown in Figure 7.

7. Conclusion

This work Asymmetric Classifier Multikeyword Fuzzy Search presented a method that can be successfully used to enhance data utilization and improved search efficiency for public cloud storage. Provided that the data owner stores the set of encrypted files in cloud storage, we showed that the user experience improved search time due to the presence of classifier search server which searches the BTree wild card fuzzy searchable index. The proposed system extracts the wild card fuzzy multikeyword set for the multiple keyword search request resulting in increased data utilization in terms of number of files retrieved for the corresponding users. The proposed system’s performance is demonstrated with the advent of classifier search server which stores the pattern of search and helps reducing the search time for repeated multiple keyword search request. The classifier search server concept adds a new paradigm to cloud storage server serving several thousands of data owners and their users.

Appendix

See Algorithms 1 and 310.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

  1. J. Geelan, “Twenty one experts define cloud computing,” Virtualization, 2008, http://virtualization.sys-con.com/node/612375. View at: Google Scholar
  2. I. Foster, Y. Zhao, I. Raicu, and S. Lu, “Cloud computing and grid computing 360-degree compared,” in Proceedings of the Grid Computing Environments Workshop (GCE '08), pp. 1–10, Austin, Tex, USA, November 2009. View at: Publisher Site | Google Scholar
  3. http://searchsmbstorage.techtarget.com/feature/Understanding-cloud-storage-services-A-guide-for-beginners.
  4. Q. Liu, G. Wang, and J. Wu, “Secure and privacy preserving keyword searching for cloud storage services,” Journal of Network and Computer Applications, vol. 35, no. 3, pp. 927–933, 2012. View at: Publisher Site | Google Scholar
  5. K. Ren, C. Wang, and Q. Wang, “Towards secure and effective data utilization in public cloud,” IEEE Transactions on Network, vol. 26, no. 6, pp. 69–74, 2012. View at: Publisher Site | Google Scholar
  6. R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: improved definitions and efficient constructions,” in Proceedings of the 13th ACM Conference on Computer and Communications Security, pp. 79–88, Alexandria, Va, USA, 2006. View at: Publisher Site | Google Scholar
  7. J. Li, Q. Wang, C. Wang, N. Cao, K. Ren, and W. Lou, “Fuzzy keyword search over encrypted data in cloud computing,” in Proceedings of the IEEE Conference on Computer Communications (INFOCOM '10), pp. 441–445, IEEE, San Diego, Calif, USA, March 2010. View at: Publisher Site | Google Scholar
  8. L. Sarga, “Cloud computing: an overview,” Journal of Systems Integration, 2012, http://si-journal.org/index.php/JSI/article/viewFile/131/101. View at: Google Scholar
  9. W. Zhou, L. Liu, H. Jing, C. Zhang, and S. Yao, “K-gram based fuzzy keyword search over encrypted cloud computing,” Journal of Software Engineering and Applications, Scientific Research, no. 6, pp. 29–32, 2013. View at: Publisher Site | Google Scholar
  10. J. Wang, H. Ma, Q. Tang et al., “Efficient verifiable fuzzy keyword search over encrypted data in cloud computing,” Computer Science and Information Systems, vol. 10, no. 2, pp. 667–684, 2013. View at: Publisher Site | Google Scholar
  11. P. Xu, H. Jin, Q. Wu, and W. Wang, “Public-key encryption with fuzzy keyword search: a provably secure scheme under keyword guessing attack,” IEEE Transactions on Computers, vol. 62, no. 11, pp. 2266–2277, 2013. View at: Publisher Site | Google Scholar
  12. B. Wang, S. Yu, W. Lou, and Y. T. Hou, “Privacy-preserving multi-keyword fuzzy search over encrypted data in the cloud,” in Proceedings of the 33rd Annual IEEE International Conference on Computer Communications (INFOCOM '14), pp. 2112–2120, IEEE, Toronto, Canada, April-May 2014. View at: Publisher Site | Google Scholar
  13. S. Kamara and K. Lauter, “Cryptographic cloud storage,” in Financial Cryptography and Data Security, vol. 6054 of Lecture Notes in Computer Science, pp. 136–149, Springer, Berlin, Germany, 2010. View at: Publisher Site | Google Scholar
  14. D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, “Public key encryption with keyword search,” in Advances in Cryptology—EUROCRYPT 2004, vol. 3027 of Lecture Notes in Computer Science, pp. 506–522, Springer, Berlin, Germany, 2004. View at: Publisher Site | Google Scholar
  15. J. Baek, R. Safavi-Naini, and W. Susilo, “Public key encryption with keyword search revisited,” in Computational Science and Its Applications—ICCSA 2008, vol. 5072 of Lecture Notes in Computer Science, pp. 1249–1259, Springer, Berlin, Germany, 2008. View at: Publisher Site |