Abstract
The arrival of the cloud computing age has made data outsourcing an important and convenient application. More and more individuals and organizations outsource large amounts of graph data to a cloud computing platform (CCP) to save cost. Because the server on CCP is not completely honest and trustworthy, the outsourced graph data are usually encrypted before they are sent to CCP. Optimal route finding on graph data is a popular operation that is frequently used in many fields. Optimal route finding with support for semantic search has stronger query capabilities: a customer can use words similar to the graph vertices as query terms. However, because the graph data are encrypted before outsourcing, it is not easy for data customers to manipulate and further use them. In this paper, we present a solution for privacy-guarding optimal route finding with support for semantic search on the encrypted graph in the cloud computing scenario (PORF). We design the scheme by building a secure query index based on the idea of searchable encryption and the stemmer mechanism. We give a formal security analysis of our scheme and evaluate its efficiency experimentally.
1. Introduction
The rapid progress of electronic devices and communication technologies has brought about the era of cloud computing, which has an important impact on all walks of life [1, 2]. Cloud computing also speeds up data outsourcing services and makes them an important and convenient application [3, 4]. The graph is a structure that is often used in various fields, such as traffic graphs [5], social graphs [6], and molecular structure graphs [7]. Because of the powerful processing capability of cloud computing and the need to save cost, enormous graph data are usually outsourced to the cloud computing platform (CCP), which is responsible for storing, managing, and processing these data. However, the server on CCP is not completely honest and trustworthy, so the privacy and security of the outsourced graph data need to be considered and handled. Encrypting the graph data before they are outsourced to CCP is an effective and commonly used method [8]. However, it is not easy for data customers to manipulate and further use the encrypted graph data. Therefore, it is meaningful to implement privacy-guarding optimal route finding with support for semantic search on the encrypted graph in the cloud computing scenario.
Optimal route finding on a graph is an operation that is frequently used in many fields, and related applications include shortest path query [9], path planning [10], and minimum spanning tree [11]. Optimal route finding with support for semantic search has stronger query capabilities: a customer can use words similar to the graph vertices as query terms. For instance, in a traffic network graph, the vertices represent locations, and the weight on an edge represents the route cost between two locations. Optimal path finding then queries for the path with the least cost between two locations. To realize semantic search, this paper adopts the Porter stemmer mechanism commonly used in information retrieval [12]. The graph vertex set is transformed by the stemmer mechanism into a new set that is semantically similar to the original set of vertices. When optimal route finding is performed, the new set serves as the source set of query terms. Our work studies optimal route finding over encrypted graph data and also takes cost savings into account, since downloading all the graph data from the remote server is expensive. In view of this, it is of great significance to implement optimal route finding with support for semantic search on the encrypted graph. However, carrying out optimal route finding is not easy once the security and privacy issues of the cloud computing scenario are considered.
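The traffic-network example above can be made concrete with a small plaintext sketch. It uses Dijkstra's algorithm as one standard way to find a least-cost route; the text does not state which route-finding algorithm the scheme uses, and the graph, the location names, and the function `optimal_route` are hypothetical.

```python
import heapq


def optimal_route(graph, source, target):
    """Return (cost, path) of a least-cost route, or (inf, []) if unreachable."""
    dist = {source: 0}          # best known cost to each vertex
    prev = {}                   # predecessor on the best known route
    heap = [(0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Hypothetical undirected traffic graph: vertices are locations,
# edge weights are route costs between two locations.
roads = {
    "station": {"market": 4, "museum": 1},
    "museum": {"station": 1, "market": 2, "harbor": 7},
    "market": {"station": 4, "museum": 2, "harbor": 3},
    "harbor": {"museum": 7, "market": 3},
}
cost, path = optimal_route(roads, "station", "harbor")
print(cost, path)
```

In this toy graph the direct-looking routes through a single intermediate location are more expensive than the three-hop route, which is why a least-cost search is needed rather than a hop count.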
To implement query operations on the remote server in the cloud computing scenario, the idea of searchable encryption is very effective [13–17]. The remote server performs queries using encrypted query terms and cannot obtain the private information of the query terms or the query results. Searchable encryption is a research hotspot in information security, and research in this field continues to advance. Dynamic and extended searchable encryption schemes have since emerged [18–22], but these schemes cannot be used to implement route finding with support for semantic search on the encrypted graph. Recently, some researchers have studied and implemented query schemes on the encrypted graph [23–27]. Chase et al. studied the query problem on the encrypted graph and proposed the structured encryption method [23]. Privacy-preserving subgraph query problems were studied in [24, 25]. Shen et al. studied cloud-based approximate constrained shortest distance queries over encrypted graphs with privacy protection [26]. Ciucanu et al. presented a secure framework for graph outsourcing and SPARQL evaluation [27]. However, these methods cannot address the problem of optimal route finding with support for semantic search on the encrypted graph.
To solve this problem on CCP, we present a solution for privacy-guarding optimal route finding with support for semantic search on the encrypted graph in the cloud computing scenario (PORF). In the PORF scheme, the server on CCP implements privacy-guarding optimal route finding with the help of the index we build. Firstly, we use the Porter stemmer mechanism to transform the graph vertices into a new set to achieve semantic search. Then, based on the new set, we build the chain tables, each node of which contains optimal route information. Finally, we build a secure index on the basis of all the chain tables. The index is stored on the server of CCP, and the server executes optimal route finding using the index and the encrypted query terms. The server cannot learn the private contents of the query results and query terms. The security analysis and the experimental evaluation show that our proposed PORF scheme is secure and efficient.
The contributions of our paper are described below.
(1) We present a scheme to address the problem of optimal route finding with support for semantic search on the encrypted graph in the cloud computing scenario.
(2) We give a formal security analysis of the scheme to ensure the privacy and security of query results and query terms.
(3) We conduct an experimental analysis to demonstrate the efficiency of our scheme.
The rest of our paper is organized as follows. Section 2 presents the related work. Section 3 gives the design and analysis process of our PORF scheme. Section 4 analyzes the security of the PORF scheme. Section 5 demonstrates the PORF scheme through experiment and comparison. Finally, Section 6 summarizes our paper.
2. Related Work
With the rapid development of computer technology and the huge increase in people's demands [28–31], privacy and security issues have increasingly become an important consideration [32–35]. In the field of data outsourcing security, searchable encryption plays an important role and makes it possible to query outsourced data without disclosing private information [36, 37]. Generally speaking, there are two kinds of searchable encryption: searchable symmetric encryption and searchable asymmetric encryption [16, 17]. Querying with symmetric encryption is usually more efficient than with asymmetric encryption. Thus, we use the symmetric encryption idea in our scheme.
Searchable encryption has become an efficient cryptographic primitive for remote data query and is of great value in data outsourcing [13–16]. The concept of searchable symmetric encryption was first presented in [13]. Goh proposed a secure index strategy in which a Bloom filter was used to address the search problem on outsourced data [14]. Curtmola et al. put forward the concepts of non-adaptive and adaptive searchable symmetric encryption [16]. Chang et al. adopted pseudorandom functions to solve the problem of privacy-preserving keyword searches on remote encrypted data and also solved the update problem [17]. Thereafter, researchers in the security field proposed many extended searchable encryption solutions [18–22]. Wang et al. investigated the problem of secure and efficient similarity search over outsourced cloud data and proposed a new symbol-based trie-traverse searching mechanism [18]. Secure and efficient ranked search over encrypted outsourced data was studied in [19, 20]. Du et al. presented a dynamic multi-client searchable symmetric encryption scheme supporting Boolean queries, which allowed the data owner to authorize multiple users to execute Boolean queries over the encrypted data [21]. Li et al. used the nearest neighbor and attribute-based encryption techniques to present a dynamic searchable symmetric encryption scheme and addressed the key sharing problem [22]. However, none of these searchable encryption solutions can be used to perform optimal route finding with support for semantic search over encrypted graph data.
In recent years, secure query problems on the encrypted graph have been studied, and some related research results have been obtained [23–27]. Chase et al. presented the idea of structured encryption and proposed the application of controlled disclosure [23]. Cao et al. adopted the "filtering-and-verification" principle to study and address the problem of subgraph query on the encrypted graph [24]. Fan et al. proposed a private subgraph query solution for large graphs, in which the query subgraph needed to be protected while the data graph did not [25]. Shen et al. proposed a graph encryption scheme to achieve the constrained shortest distance query and presented a tree-based ciphertext comparison protocol [26]. Ciucanu et al. designed and implemented a secure framework to perform queries with SPARQL evaluation on outsourced graphs [27]. However, none of the existing encrypted graph search solutions addresses privacy-guarding optimal route finding with support for semantic search on the encrypted graph.
In this paper, we propose a solution based on the idea of searchable encryption and the Porter stemmer mechanism to perform optimal route finding with support for semantic search. We first transform the graph vertices into a new word set and build the chain tables based on the new set. We next build an index that is sent to the server on CCP, and the server executes optimal route finding using the index and the encrypted query terms. We finally analyze and evaluate our solution in terms of both security and experimental performance.
3. PORF Scheme Construction
3.1. Preliminaries
Goldwasser et al. presented the concepts of semantic security and indistinguishability in [38]. A system is semantically secure if whatever an adversary can compute about the plaintext given the ciphertext, he can also compute without the ciphertext [38]. In this paper, we represent a semantically secure encryption mechanism as a set of three polynomial-size algorithms [39]: a secret key generation algorithm, an encryption algorithm, and a decryption algorithm.
To implement semantic search in our PORF scheme, we adopt the Porter stemmer mechanism from information retrieval [12], although other stemming methods could serve the same purpose. We use to represent the vertices set of outsourcing graph, and the new word set after transformation through the Porter stemmer mechanism is represented as , where . The main notations used in our paper are listed in Table 1.
3.2. PORF Overview
In the cloud computing scenario, the outsourcing query architecture illustrated in Figure 1 is mainly composed of three entities: the server on CCP, the data owner, and data customers. The server provides data owners and customers with storage, management, and query services. To enable the server to implement optimal route finding on the encrypted outsourced graph, we construct an index and encrypt the query request and then send them to the server. In this work, our main task is to study and achieve optimal route finding with support for semantic search. For query control and authentication of data customers, we follow existing searchable encryption approaches such as broadcast encryption [16].
Our PORF scheme can perform optimal route finding with support for semantic search over the encrypted graph on CCP, and it pursues the following design targets.
(1) Privacy-guarding optimal route finding functionality. The customer can achieve privacy-guarding optimal route finding with support for semantic search by means of the server on CCP.
(2) Security guarantee. We give a security guarantee for our scheme through formal analysis; the scheme prevents the server from learning the private contents of the query results and query terms.
(3) Efficiency. Our scheme implements optimal route finding with low overhead.
In the PORF scheme design, we use an index and chain tables as the data structures. Our work takes the undirected graph as the research object; directed graphs can be processed similarly. We adopt the chain tables to store the optimal route information. Query vertices need to be processed and encrypted before being sent to the server, and the index is built so that the server on CCP can complete privacy-guarding optimal route finding.
To implement optimal route finding in the cloud computing scenario, we proceed in three steps. Firstly, we build the chain tables, and each node of the chain tables consists of the vertices and optimal route information, which need to be encrypted. Secondly, we build a secure index that stores all the nodes of the chain tables at random locations. Finally, the query vertices of a customer are delivered to the server after security processing, and the server executes optimal route finding with support for semantic search on CCP with the help of the built index. Consistent with existing symmetric searchable encryption schemes [16, 18, 19], we assume that the server on CCP follows the adaptive attack model and that query customers share mutual request authentication and query control mechanisms with the data owner [16].
3.3. Scheme Design and Implementation
In the PORF scheme, our main work is to build the index and to implement optimal route finding by means of the index on CCP. The algorithms used in our scheme are described below.
(i) Genkeys (): symmetric secret key generation algorithm. The parameter is taken as the input, and the symmetric secret key is used as the output
(ii) Chaintablebuilding (, ): building the chain tables of graph vertices to store the contents of optimal routes. The graph data and the symmetric secret key set serve as inputs, and the outputs are the word set with similar meanings of graph vertices, the set of compound terms about the word set, and the chain table set
(iii) Indexbuilding (, , , ): building query index algorithm. The inputs are the set , chain table set , and the key sets and , and the output is the query index
(iv) Querybuilding (, ): building query term algorithm. The inputs include the word from the set and the key set . The encrypted query term set serves as the output
(v) Queryperforming (, ): implementing optimal route finding on CCP. The index and the query term set serve as the inputs, and the set of optimal routes is the output
In our work, we use to be the security parameter and adopt as a secure symmetric encryption scheme in our optimal route finding solution. The building process of our proposed PORF scheme is as follows.
3.3.1. Building Chain Table
To implement semantic search, we adopt the Porter stemmer mechanism to turn the graph vertex set into a new set , where . We combine any two distinct words in the set to form a new compound term, and the new term set is denoted as = . For every member of the set (), we build its chain table , and we use to represent a node of the chain table. The creation process of chain tables is described in Algorithm 1. The content of each node in the chain table consists of optimal route information, and we write it in terms of the symbol . To protect the security of the chain table contents, we need to encrypt the chain table nodes. The two symmetric key sets used in our scheme are represented as = and = . The set of all chain tables that are built related to the set is denoted as .
In the algorithm , the time complexity of new word conversion via porter stemmer is , and the time complexity of building set is . The time complexity of building every chain table is ( is the maximum number of optimal routes of compound terms), and the time complexity of building chain tables is (). Therefore, the total time complexity of the algorithm is +().
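The chain table construction described above can be sketched in plaintext as follows. This is only an illustration under several assumptions that the text does not confirm: `toy_stem` is a crude suffix stripper standing in for a real Porter stemmer, routes are found with Dijkstra's algorithm, a compound term is taken to be an unordered pair of distinct stems, and each chain table holds one least-cost route per pair of vertices mapping to those stems. Node encryption is omitted here, and all names are hypothetical.

```python
import heapq
from itertools import combinations


def toy_stem(word):
    """Crude suffix stripping; a stand-in for a real Porter stemmer."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def least_cost_route(graph, source, target):
    """Dijkstra's algorithm; returns (cost, path) or (inf, []) if unreachable."""
    best = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u].items():
            if cost + w < best.get(v, float("inf")):
                best[v] = cost + w
                prev[v] = u
                heapq.heappush(heap, (cost + w, v))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return best[target], path[::-1]


def build_chain_tables(graph):
    """Return (sorted word set, {compound term: list of route nodes})."""
    stems = {}
    for vertex in graph:                      # step 1: stem every vertex label
        stems.setdefault(toy_stem(vertex), []).append(vertex)
    tables = {}
    for s, t in combinations(sorted(stems), 2):   # step 2: compound terms
        nodes = []
        for u in stems[s]:                    # step 3: one node per optimal route
            for v in stems[t]:
                cost, path = least_cost_route(graph, u, v)
                if path:
                    nodes.append((cost, path))
        tables[(s, t)] = nodes
    return sorted(stems), tables


# Hypothetical undirected graph whose labels share stems.
graph = {
    "Markets": {"Parking": 2, "Docks": 9},
    "Market": {"Parking": 5},
    "Parking": {"Markets": 2, "Market": 5, "Docks": 3},
    "Docks": {"Markets": 9, "Parking": 3},
}
words, tables = build_chain_tables(graph)
for term, nodes in tables.items():
    print(term, nodes)
```

With n distinct words the number of compound terms grows quadratically in n, which is the dominant factor when every chain table must be built.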

3.3.2. Building Optimal Route Finding Index
To enable the server to perform optimal route finding in the cloud computing scenario, we propose to implement this by building an index. The process of creating our index is described in Algorithm 2. For each chain table (), the contents of each node contain optimal route information, and the number of nodes (that is, the number of optimal routes about the compound term ) is denoted as . For , we generate a tag for the term by concatenating and , and the tag is represented as . Then, all the tags about the term are denoted as a set . The matching optimal route information of each member in the set is placed in the index. Performing optimal route finding of the term is equivalent to looking up the corresponding members in the index via all the correlative tags in the set . To prevent the server on CCP from getting information about the number of optimal routes of each member , we need to add the extra elements (We call them the ) to pad the index such that the number of optimal routes of each member in the set is the same; that is, it is the maximum number of optimal routes in the graph . If , we need to add extra elements.
In the algorithm of generating the index, we need to calculate the storage location of index members and then assign values to each member of the index. The chain table set contains chain tables, and each table has at most nodes. All of the nodes in the chain table set are stored in the index . As a result, the time complexity of the algorithm is ().
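The tagging and padding steps above can be sketched as follows. The sketch rests on assumptions not confirmed by the text: tags are derived with HMAC-SHA256 from the compound term and a counter, padding entries are random strings of the same length as real ciphertexts, and `toy_encrypt` is an illustrative keystream cipher built from HMAC, not a vetted encryption scheme (a real implementation would use a standard authenticated cipher). All names are hypothetical.

```python
import hashlib
import hmac
import os

BLOCK = 64  # every route is padded to this many bytes before encryption


def prf(key, msg):
    """Keyed pseudorandom function (HMAC-SHA256)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def keystream(key, nonce):
    return b"".join(prf(key, nonce + bytes([i])) for i in range(BLOCK // 32))


def toy_encrypt(key, plaintext):
    """Illustrative only: XOR with an HMAC-derived keystream."""
    if len(plaintext) > BLOCK:
        raise ValueError("route does not fit in one block")
    nonce = os.urandom(16)
    padded = plaintext.ljust(BLOCK, b"\x00")
    return nonce + bytes(a ^ b for a, b in zip(padded, keystream(key, nonce)))


def toy_decrypt(key, ciphertext):
    nonce, body = ciphertext[:16], ciphertext[16:]
    plain = bytes(a ^ b for a, b in zip(body, keystream(key, nonce)))
    return plain.rstrip(b"\x00")


def tag_for(trapdoor, j):
    """Tag of the j-th route of a term, derived from the term's trapdoor."""
    return prf(trapdoor, j.to_bytes(4, "big"))


def build_index(tables, k1, k2):
    """Store every chain table node under its tag and pad each term
    up to the maximum number of routes with random dummy entries."""
    max_routes = max(len(nodes) for nodes in tables.values())
    index = {}
    for term, nodes in tables.items():
        trapdoor = prf(k1, term.encode())
        for j in range(1, max_routes + 1):
            if j <= len(nodes):
                index[tag_for(trapdoor, j)] = toy_encrypt(k2, nodes[j - 1].encode())
            else:
                index[tag_for(trapdoor, j)] = os.urandom(16 + BLOCK)  # padding
    return index, max_routes


# Hypothetical chain tables: compound term -> serialized optimal routes.
tables = {
    "dock+market": ["Docks>Parking>Markets:5", "Docks>Parking>Market:8"],
    "dock+park": ["Docks>Parking:3"],
}
k1, k2 = os.urandom(32), os.urandom(32)
index, max_routes = build_index(tables, k1, k2)
print(len(index), max_routes)
```

Because every term ends up with the same number of equally sized entries, the index size alone does not reveal how many optimal routes a given compound term really has.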

3.3.3. Performing Optimal Route Finding
After the index is built, we will consider how to execute optimal route finding through the server on CCP. To accomplish this operation, for the element from set (), we need to build the query term = . More specifically, the query term is created by the symmetric encryption algorithm , that is = = . When a customer is going to execute optimal route finding about the word , the query term is delivered to the server on CCP. With the help of the query term and the index, the server completes the operation of optimal route finding in the cloud computing scenario through Algorithm 3.
In the algorithm , for the query term (), if is not a (), we will put into . Therefore, the time complexity of the algorithm is .
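The server-side lookup can be sketched as below. The derivation of tags from the query term is an assumption made for illustration (a trapdoor computed with HMAC-SHA256, then one tag per counter value); the text states only that the query term is produced by the symmetric encryption algorithm. Index values are opaque placeholder byte strings standing in for ciphertexts, and in this sketch padding entries are returned with the rest and would be discarded by the customer after decryption. All names are hypothetical.

```python
import hashlib
import hmac


def prf(key, msg):
    """Keyed pseudorandom function (HMAC-SHA256)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def make_trapdoor(k1, term):
    """Customer side: turn a compound term into an opaque query term."""
    return prf(k1, term.encode())


def tag_for(trapdoor, j):
    """Tag of the j-th index entry belonging to a query term."""
    return prf(trapdoor, j.to_bytes(4, "big"))


def perform_query(index, trapdoor, max_routes):
    """Server side: probe the index once per possible route slot.
    The server sees only tags and ciphertexts, never the term itself."""
    results = []
    for j in range(1, max_routes + 1):
        entry = index.get(tag_for(trapdoor, j))
        if entry is not None:
            results.append(entry)
    return results


# Data owner side: a tiny index with placeholder ciphertexts.
k1 = b"\x01" * 32
max_routes = 2
index = {}
placeholder = {
    "dock+market": [b"ct-a", b"ct-b"],
    "dock+park": [b"ct-c", b"pad"],   # second slot is a padding entry
}
for term, ciphertexts in placeholder.items():
    td = make_trapdoor(k1, term)
    for j, ct in enumerate(ciphertexts, start=1):
        index[tag_for(td, j)] = ct

found = perform_query(index, make_trapdoor(k1, "dock+park"), max_routes)
missing = perform_query(index, make_trapdoor(k1, "market+park"), max_routes)
print(found, missing)
```

The number of index probes per query is bounded by the padded number of routes per term, independent of the size of the graph.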

In this paper, we propose a solution to the problem of privacy-guarding optimal route finding with support for semantic search on the encrypted graph in the cloud computing scenario. With the aid of encrypted query terms and a secure index, the server of CCP executes privacy-guarding optimal route finding and returns the encrypted query results to the query customer. Our scheme satisfies the efficiency and query security requirements, and the server cannot obtain the private information of the query terms and the retrieval results.
4. Security Analysis
Now, we analyze the security of the PORF scheme. We first give several concepts used in the security analysis of our scheme [10].
(i) History: the interaction between the server on CCP and a query customer, containing the graph data and the set of query terms, expressed as . The partial history is expressed as , where
(ii) View: given the history under the key , a view is defined as . The partial view is , where
(iii) Access Pattern: given the history under the key , the access pattern is defined as a tuple , where is the result set of optimal route finding matching the query term
(iv) Search Pattern: given the history under the key , the search pattern is defined as a binary symmetric matrix , such that if and , otherwise, for
(v) Trace: given the history under the key , the trace is defined as a tuple , where is the overall size of the outsourcing graph, and and are the access pattern and the search pattern of the history , respectively. The trace of partial history is defined as , where
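As an illustration of the search pattern, the sketch below builds the binary symmetric matrix for a sequence of queries, assuming the usual definition from the searchable symmetric encryption literature: entry (i, j) is 1 exactly when the i-th and j-th query terms are equal. The function name and the sample queries are hypothetical.

```python
def search_pattern(queries):
    """Binary symmetric matrix: entry (i, j) is 1 iff query i equals query j."""
    n = len(queries)
    return [[1 if queries[i] == queries[j] else 0 for j in range(n)]
            for i in range(n)]


# Three queries, the first and third of which repeat the same term.
sigma = search_pattern(["dock+park", "dock+market", "dock+park"])
for row in sigma:
    print(row)
```

The matrix records only which queries repeat, not what they ask for, which is the kind of leakage the trace is meant to capture.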
The server on CCP performs optimal route finding using the index and the query term and cannot learn the contents of the query results or the query term. In our work, we prove that our optimal route finding scheme achieves adaptive semantic security. In the adaptive attack model, the server on CCP can choose each query request based on the query terms and the optimal route finding results of previous queries [10, 36]. For the security analysis, we follow the security idea adopted in previous schemes [10, 36]. Under the security guarantee of our PORF scheme, the server on CCP cannot obtain any information beyond the trace, and hence our proposed optimal route finding scheme is secure. The security theorem of our PORF scheme is stated below.
Theorem 1. Our PORF scheme achieves adaptive semantic security in the sense of searchable symmetric encryption.
Proof. To prove the security of the PORF scheme, we first describe a polynomial-size simulator . For all , given the trace of a partial history, the simulator can build a view which is used to simulate the view of the adversary, such that and cannot be distinguished, where is a symmetric key and .
For , the simulator generates the simulative encrypted outsourcing graph with the same size as the real graph through random strings. In the meantime, the simulator constructs the index through randomly generating strings on the which is also used to simulate the real index and has the same size as the real index. The index will be used to simulate the real index in other partial views , where . It is very obvious that the simulative encrypted graph is indistinguishable from the real outsourcing graph, and the index is indistinguishable from the index . Otherwise, one can distinguish between the outputs of the semantically secure symmetric encryption and the random strings with the same size. Thus, is indistinguishable from .
For , the simulator can still use the index that was built before. The search pattern matrix about query terms belongs to the trace . The simulator will generate the query terms that are contained in the view . In the generation of these query terms, the query terms contained in the view may be reused. Or else, the simulator will regenerate these query terms from .
To generate , the simulator first needs to determine whether contains by checking whether , where . If does not contain , the simulator utilizes the information of about , i.e., . The simulator selects an address at random from the simulated index for , making sure that all addresses are different, and generates the query term . The simulator remembers the correlation between and . Otherwise, if contains , the simulator retrieves the query contents associated with and assigns them to . This ensures that if contains repeated query terms, then the query contents involved in are identical.
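The simulator's handling of repeated queries can be sketched as follows. Given only a search pattern matrix, the sketch reuses the earlier simulated query term when a query repeats and draws a fresh, distinct random string otherwise. Using 32-byte random strings as simulated query terms is an assumption for illustration, as are the function and variable names.

```python
import os


def simulate_query_terms(sigma, length=32):
    """Produce simulated query terms consistent with the search pattern."""
    simulated = []
    for t in range(len(sigma)):
        # Look for an earlier query that the pattern marks as identical.
        earlier = next((j for j in range(t) if sigma[t][j] == 1), None)
        if earlier is not None:
            simulated.append(simulated[earlier])   # repeated query: reuse
        else:
            fresh = os.urandom(length)
            while fresh in simulated:              # keep fresh terms distinct
                fresh = os.urandom(length)
            simulated.append(fresh)
    return simulated


# Search pattern of three queries where the first and third are equal.
sigma = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
terms = simulate_query_terms(sigma)
print([t.hex()[:8] for t in terms])
```

The simulated terms reproduce exactly the equality structure recorded in the search pattern and nothing else, which is what the indistinguishability argument relies on.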
The query terms in are indistinguishable from the query terms in . Otherwise, one could distinguish between the outputs of the semantically secure symmetric encryption and the random strings with the same size. Therefore, for , there is no polynomial-size adversary that could distinguish between and . Thus, the security theorem of the PORF scheme has been proven.
5. Experimental Evaluations
In this section, we carry out an experimental analysis of our scheme on the Enron email network graph [40, 41] and give the evaluation results. The experiments are implemented in C and run on the server on CCP and a local machine. The server on CCP runs the Linux operating system with 6 CPU cores at 3.0 GHz and 16 GB of RAM, and the local machine runs the Windows 10 operating system with an Intel Core 4 CPU at 2.6 GHz. In our experimental analysis and evaluation, the index generation, query term generation, and the decryption of query results are performed on the local machine. The operation of optimal route finding is implemented on the server of CCP.
To verify the efficiency of our scheme, we compare our PORF scheme with an optimal route finding scheme in plaintext, which is referred to as MORF. The MORF scheme is similar to the PORF scheme, and its index is constructed in a similar way, but in the MORF scheme the data and index are not encrypted. The comparison between PORF and MORF is intended to evaluate the time and memory overhead of working over the encrypted graph. For outsourced graphs with the same number of vertices, a difference in the number of edges can affect the experimental results. Therefore, we adopt five graph data sets chosen at random and consider two circumstances to compare and evaluate the performance. In the first circumstance, the outsourced graphs include more edges, and the numbers of edges in the graph sets are 7958, 16934, 35819, 73586, and 119823, respectively. In the second circumstance, the graphs include fewer edges, about half as many as in the first circumstance. In the experimental evaluation, we use PORF1 and MORF1 to denote the experiments with more edges, and PORF2 and MORF2 to denote the experiments with fewer edges. The comparative analysis of these circumstances assesses the overhead of optimal route finding and validates the efficiency of our PORF scheme.
5.1. Index Building
To perform secure optimal route finding on CCP, we first need to transform the graph vertices to complete the semantic search requirements and generate new word set. The chain tables are then built on top of the new word set to hold the optimal route information, and finally, the index is generated based on all the chain tables through the algorithm. The experiments evaluated the index building time and index size, respectively. Experimental analysis and evaluation on the experimental graph data are conducted under four conditions, and the experimental result figures of building index are given. The analysis results about the time of index generation are shown in Figure 2, where the abscissa represents the number of vertices in the graph data, and the ordinate shows the time of index generation.
From Figure 2, we can conclude that index building time and the number of vertices are closely related, and the time of index generation increases nearly linearly with the number of vertices under the four conditions PORF1, PORF2, MORF1, and MORF2. Generally, an outsourced graph with more edges can have more optimal routes. Therefore, the time of index generation under the PORF1 condition is greater than that under the PORF2 condition, and similarly the time of index generation under the MORF1 condition is greater than that under the MORF2 condition. For the encrypted graph query, we need to encrypt the graph data and build the encrypted index. As a result, the time of index generation is greater than that on the plaintext graph. Under the PORF condition, we obtain the security of private data at the cost of encryption time. After the encrypted index is built, the queries on CCP can meet the security requirements, and customers' private information cannot be compromised. Therefore, a moderate increase in index building time is an acceptable trade-off, and the encryption is done locally.
The experimental analysis of the index size is plotted in Figure 3. The abscissa of the figure shows the number of vertices in the graph data, and the ordinate represents the size of the generated index. The index size grows approximately linearly as the vertex count increases under the four conditions. For outsourced graphs with the same number of vertices, the more edges there are, the larger the index is. Therefore, the index size of PORF1 is larger than that of PORF2, and the index size of MORF1 is larger than that of MORF2. The proposed PORF scheme ensures the security of the query process at the cost of additional storage overhead. The index size of PORF is a little larger than that of the plaintext query method, but the difference is not significant.
5.2. Performing Query
In the process of optimal route finding, the server on CCP makes use of the index and query terms to implement the query task by the algorithm, and the experimental analysis includes query time evaluation and decryption time evaluation. The experimental results about query and decryption are, respectively, shown in Figures 4 and 5, where the horizontal axis represents the number of vertices in the graph data, and the vertical axis represents the time of query or decryption. From Figure 4, we can conclude that the time of performing query changes nearly linearly with the rise of vertex count. The query processes under PORF2 and MORF2 conditions, respectively, take less time than that under PORF1 and MORF1 conditions.
After the query is processed, the server on CCP sends the encrypted retrieval results to the query customer, who completes the decryption locally. The results of the experimental analysis of decryption are plotted in Figure 5. The time of the decryption process in our experiment is related to the decryption mechanism and the size of the query results. We adopt the same decryption mechanism under the two conditions PORF1 and PORF2. The decryption time of PORF1 and PORF2 increases almost linearly with the rise of vertex count. The time consumption of the decryption process under the PORF2 condition is less than that under the PORF1 condition.
In general, the index building of our PORF scheme is completed on the local machine, and the building time and size of the index are nearly linear in the vertex count of the outsourced graph. The query process is executed by the server on CCP, and the query time also increases with the vertex count. Meanwhile, the server on CCP does not learn the private contents of the retrieval results and query terms. Our PORF scheme implements optimal route finding with support for semantic search on the encrypted graph and satisfies the privacy and efficiency requirements of the query process.
6. Conclusion
In this paper, we propose a novel solution to the problem of optimal route finding with support for semantic search, in which we adopt the searchable encryption idea and the Porter stemmer mechanism. We first convert all graph vertices into a new word set by the Porter stemmer mechanism to support semantic search. Then, we build the chain tables based on the new word set to hold optimal route information and build an index based on the chain tables, which is used to execute optimal route finding. Next, we prove the security of our scheme through formal analysis. Finally, we give an experimental analysis and evaluation, and the results show that our scheme has good performance.
For our future work, we intend to build a dynamic optimal route finding scheme to meet the needs of dynamic graphs. In addition, another research direction is to combine encrypted graph query with secret key management and update to meet a wider range of query requirements.
Data Availability
All relevant data to support the findings in this study belong to all authors and will be used for our future research. Requests for access to the data should be made to the corresponding author.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The authors thank the editor and the reviewers for their comments and helpful suggestions. This work was supported in part by the Nature Science Foundation of China under Grant 61762055, Grant 61962029, Grant 61662039, and Grant 61741111, in part by the Jiangxi Provincial Natural Science Foundation of China under Grant 20181BAB202014, Grant 20202ACBL202005, Grant 20181BAB202011, and Grant 20202BAB212006, in part by the Key Scientific and Technological Research Project of Jiangxi Provincial Education Department of China under Grant GJJ190899, in part by the Jiangxi Key Natural Science Foundation under Grant 20192ACBL20031, in part by the Science and Technology Research Project of Jiangxi Education Department under Grant GJJ180904, in part by the Humanities and Social Sciences Foundation of Colleges and Universities in Jiangxi Province under Grant TQ18111, and in part by the Key Program of Zhejiang Provincial Natural Science Foundation of China under Grant LZ18F020001.