Abstract
Pervasive smartphones have driven the growth of location-based services (LBS), and the growing volume of data prompts LBS providers to outsource their LBS datasets to the cloud side. The privacy issues of LBS in the outsourced cloud scenario have attracted considerable interest recently. However, current schemes cannot provide sufficient privacy preservation against practical challenges and pay little attention to the data retrieval efficiency of the cloud side. Therefore, we present an efficient Privacy-Preserving LBS Query Scheme (PPQS). In our scheme, two cloud entities are employed to store the sensitive information of the outsourced data and provide the query service, which enhances the ability of privacy preservation for sensitive information. Besides, by using homomorphic encryption and searchable symmetric encryption, the proposed scheme supports both the type query and the range query, which can significantly improve the data retrieval efficiency of the cloud side and reduce the computation burden on the cloud side and the user side. Through detailed analysis of security and computation cost, we show the enhanced ability of privacy preservation and the lower computation cost compared to previous schemes. Based on a real dataset, extensive simulations are performed to validate the effectiveness and performance of our scheme.
1. Introduction
Along with the boom of location-aware mobile devices and the development of wireless communication, location-based services (LBS) have become prevalent and have attracted considerable interest recently. With the help of LBS, people can conveniently search for points of interest (POI), obtain route guidance, and so forth. Nowadays, owing to the advantages of the cloud in data computation and storage, the location-based service provider (LBSP) tends to outsource the storage service and the query service to the cloud side [1].
A typical scenario for LBS in the outsourced cloud is shown in Figure 1. The LBSP first outsources its database, which contains valuable and sensitive information (e.g., coordinates of POI), to the cloud side. Then, based on the outsourced database, the cloud side handles users’ LBS queries. However, this new service paradigm brings new privacy issues since the cloud side is assumed to be semitrusted (honest-but-curious). In general, the privacy issues for LBS in the outsourced cloud fall into two main parts: (1) Since the sensitive information contained in the outsourced database may be exposed to the cloud side, how can the outsourced data be kept secret from the cloud side (i.e., how can data privacy of the outsourced data be guaranteed)? (2) Since the private information contained in the LBS user’s query request, such as the current location, faces the risk of being exposed to the cloud side, how can this private information be kept secret from the cloud side (i.e., how can the LBS user’s query privacy be ensured)?

To guarantee data privacy, one straightforward method is to encrypt all the data before the LBSP outsources its database to the cloud side. Similarly, to prevent the leakage of private information contained in users’ queries, a common way is to encrypt the private data contained in users’ queries before users submit their query requests to the cloud side. Therefore, most current studies adopt encryption to solve the privacy issues for LBS in the outsourced cloud [2–6]. Although these studies can preserve data privacy and query privacy, their schemes are designed for a single cloud structure (i.e., the cloud side consists of one single cloud provider). Accordingly, both the storage service and the query service rely on a single cloud provider. However, if the single cloud provider is controlled, or the data stored in this cloud are stolen by a profit-driven insider, sensitive information is at risk of being leaked since all the outsourced data can be obtained on a one-time basis. Thus, how to design a new scheme that enhances the ability of privacy preservation for sensitive information, while still preserving data privacy and query privacy, is an urgent problem.
Current studies pay little attention to the data retrieval efficiency of the cloud side since their schemes support only the range query [2–4]. That is, a user can specify an encrypted search area in an LBS query request and then submit this query request to the cloud side. Without knowing the user’s location information, the cloud side returns all the encrypted data located in the search area as the query result to the user. Schemes that support only the range query lead to an expensive computation cost on both the cloud side and the user side. The cloud side has to retrieve all the encrypted data in its database, and the user has to decrypt all the encrypted data in the query result to find the desired data. In reality, the user’s query contains not only the search area but also the search interests (i.e., POI types). Thus, in addition to the range query based on the user’s search area, the type query based on the user’s POI query type is also an essential factor that needs to be considered for the LBS query in the outsourced cloud. Therefore, how to design a scheme that supports both the range query and the type query to improve the data retrieval efficiency of the cloud side, while preserving query privacy, is a problem to be solved.
To address the above problems, we propose a Privacy-Preserving LBS Query Scheme (PPQS) in the outsourced cloud, which provides a stronger privacy guarantee and improves the data retrieval efficiency of the cloud side under a dual cloud structure. According to the work in Ref. [7], if the sensitive knowledge is partitioned into two parts and distributed to two noncolluding clouds, the privacy can be preserved against the cloud side. Based on the idea of divide-and-conquer, the dual cloud structure consisting of two noncolluding clouds can ensure that each of the clouds knows only its own part, which effectively isolates the knowledge contained in the outsourced data. Therefore, two cloud entities (i.e., the type retrieval cloud and the location retrieval cloud) are adopted in our scheme to store the sensitive data (i.e., encrypted POI type data and encrypted location data) separately. Specifically, the contributions of our scheme can be summarized as follows.
(1) We propose a dual cloud structure to enhance the ability of privacy preservation for sensitive information in the outsourced cloud scenario, i.e., our scheme has the ability to resist the insider attack and the eavesdropping attack while preserving data privacy and query privacy.
(2) Our scheme supports both the type query and the range query. Compared to the schemes that support only the range query, our scheme can significantly improve the data retrieval efficiency of the cloud side and reduce the computation burden on the cloud side and the user side.
The remainder of the paper is organized as follows. In Section 2, after discussing the related work, the system model, security requirements, and design goal are given. Subsequently, the basic notations and concepts are introduced in Section 3. Then, the detailed scheme, the analysis about security and computation cost, and the simulation results and corresponding analysis are given in Sections 4, 5, and 6, respectively. Finally, a conclusion is presented in Section 7.
2. Related Work
Our work focuses on the issues of privacy-preserving LBS query over outsourced encrypted data. In this section, we briefly review some related works that can be used to realize privacy-preserving LBS query.
Some early works mainly focus on the issues of privacy-preserving LBS query in the nonoutsourced cloud scenario. In this scenario, users send their query requests to the semitrusted (i.e., honest-but-curious) LBSP that stores LBS data resources. To prevent the LBSP from obtaining the user’s private information (e.g., the user’s identity and location information), some well-known approaches like k-anonymity, dummy, spatial cloaking, and private information retrieval (PIR) are widely adopted. k-anonymity is a common way to preserve the LBS user’s private information [8], and the core of k-anonymity is to ensure that a user cannot be identified with a probability higher than 1/k. Nevertheless, the sensitive information of users may still be leaked if users’ queries lack diversity in the sensitive attributes [9]. The dummy approach usually adds fake users into the real user’s query request to confuse the LBSP. As the LBSP cannot distinguish the real user from the fake users in the query process, the real user’s privacy can be preserved. However, since fake users are added to the real user’s query request, the communication overhead and computation cost inevitably increase [10]. Spatial cloaking, such as transforming an LBS user’s location into an obfuscation area or a cloaked area [11], is adopted to decrease the accuracy of the user’s location and thereby confuse the LBSP. However, this approach achieves privacy preservation at the expense of location accuracy, so some nearby POI may be excluded [12]. PIR was first used to prevent the identifier of retrieved data from being leaked to the database server [13]. For the privacy-preserving LBS query, the user can also use PIR to obtain a record from the LBSP without revealing which record he/she is interested in [14]. For example, based on PIR, Xun et al. designed a framework to find the user’s requested data without revealing to the LBSP which records are retrieved [15]. However, PIR usually brings a heavy computation cost since it needs a linear scan of all the data stored in the LBSP.
In the outsourced cloud scenario, current studies focus on the issues of privacy-preserving LBS query over outsourced encrypted data. For instance, to preserve data privacy of the outsourced data and the user’s query privacy, a framework named FINE was designed for the privacy-preserving LBS query over outsourced encrypted data [16]. However, the framework supports only the rectangle range LBS query, which is not very practical since the user’s query range is usually a circle. Subsequently, a privacy-preserving LBS query scheme was proposed to support the circle range LBS query over outsourced encrypted data [2]. However, the scheme considers only the circle range LBS query, i.e., a user can specify only a circle in his/her LBS query and get all the encrypted data located in the circle from the cloud side. To improve the data retrieval efficiency of processing the circle range LBS query, Li et al. [3] put forward a privacy-preserving tree index structure in their scheme. To provide a flexible LBS query over outsourced encrypted data, Zhu et al. [4] designed a special polygon spatial query algorithm and proposed a privacy-preserving polygon spatial query scheme that allows a user to specify any polygon in an LBS query request. Afterward, to make query results more relevant, Zeng et al. [5] proposed a privacy-preserving generic LBS query scheme that combines search tokens with an inverted index. However, the user side still faces the burden of decrypting a matrix of multidimensional vectors to get the desired results.
Although the above studies can achieve the preservation of data privacy and query privacy in the outsourced cloud scenario, the storage service and the query service in these schemes rely on a single cloud provider (i.e., they use the single cloud structure), which may lead to the risk of leaking sensitive information contained in outsourced data if the single cloud is controlled or the data stored in this cloud are stolen by a profit-driven insider. Besides, most of the above schemes support only the range query and cannot support the type query based on users’ search interests, which leads to an expensive computation cost on both the cloud side and the user side. Therefore, in this work, we use a dual cloud structure to provide a stronger privacy guarantee and design a scheme that supports both the type query and the range query to improve the data retrieval efficiency of the cloud side.
2.1. System Overview
In this section, we first describe the system model and then give the security requirements and design goal.
2.2. System Model
The system consists of four entities: Location-Based Service Provider (LBSP), Type Retrieval Cloud (TRC), Location Retrieval Cloud (LRC), and LBS User (U), as shown in Figure 2.

2.2.1. Location-Based Service Provider (LBSP)
The LBSP is the owner of LBS data and is responsible for the LBS user registration. Due to the advantages of storage and computation on the cloud side, the LBSP outsources its storage service and LBS query service to the cloud side. Nonetheless, considering the value of these LBS data, the LBSP will perform some encryption operations to prevent the knowledge of data from being disclosed to the cloud side before outsourcing the LBS data to the cloud side.
2.2.2. Type Retrieval Cloud (TRC) and Location Retrieval Cloud (LRC)
In our scheme, the cloud side is separated into two cloud entities: the type retrieval cloud (TRC) and the location retrieval cloud (LRC). The type retrieval cloud is responsible for storing and retrieving the encrypted type data composed of the encrypted POI type keywords, while the location retrieval cloud is in charge of storing and retrieving the encrypted location data. Note that the two cloud entities are two different cloud providers (e.g., Azure and Amazon).
2.2.3. LBS User (U)
An LBS user can request an LBS query service to seek a certain POI type data within a specified range. Before issuing a query, an LBS user must be a registered user, i.e., the authenticity and legitimacy of the user should have been checked. Since the secure authentication mechanism of LBS users in the outsourced cloud has been proposed in [17], we assume that all the users are authenticated users in our scheme. However, to guarantee the LBS user’s query privacy, the user will perform some encryption operations on the query content before sending the query request to the cloud side.
2.3. Security Requirements
In our scheme, the LBSP and LBS users are assumed to be honest. Specifically, the LBSP provides the LBS data accurately and LBS users perform encryption operations during the process of LBS query honestly. However, the cloud side is assumed to be honest but curious as in previous works [2, 4, 5], i.e., the type retrieval cloud and the location retrieval cloud are assumed to be honest but curious in this work. Herein, honest means each cloud entity performs protocols honestly without tampering or retaining part of data on purpose. Curious means each cloud entity is interested in the data it owns or handles, and wants to know the knowledge contained in these data. However, the type retrieval cloud and the location retrieval cloud are assumed to be two noncolluding entities in our scheme. Besides, identity privacy, the collusion attack on privacy (i.e., any two parties collude to disclose the third party’s privacy), and how to prevent the above two cloud entities from collecting information from the real world to analyze the encrypted LBS data and users’ queries are beyond the scope of this paper.
Under the above assumptions, to provide privacy-preserving LBS query in the outsourced cloud, the following security requirements should be satisfied in our scheme.
2.3.1. Data Privacy
The outsourced LBS data should be kept secret from the type retrieval cloud and the location retrieval cloud, i.e., our scheme should prevent the above two clouds from obtaining any actual knowledge about the outsourced data even if these data are stored in their databases.
2.3.2. Query Privacy
The knowledge contained in the user’s query request should be kept secret from the type retrieval cloud and the location retrieval cloud, i.e., our scheme should prevent the above two clouds from obtaining any actual knowledge contained in the user’s query request even if the above two clouds are responsible for handling the user’s query and returning the query result to the user.
2.3.3. Resistance to the Insider Attack
In addition to preventing the knowledge contained in outsourced data from being leaked to the insider, our scheme should also prevent the outsourced data from being controlled or stolen on a one-time basis.
2.3.4. Resistance to the Eavesdropping Attack
In addition to preventing the knowledge contained in the user’s query request from being leaked to the eavesdropper, our scheme should also prevent the knowledge contained in the user’s query request from being acquired on a one-time basis.
2.4. Design Goal
Under the mentioned system model and security requirements, our design goal is to design an efficient and secure privacy-preserving LBS query scheme in the outsourced cloud scenario. The main objectives are as follows.
2.4.1. Guarantee Privacy Requirements
The proposed scheme should meet the defined security requirements. Since the type retrieval cloud and the location retrieval cloud are assumed to be honest but curious, the outsourced LBS data should be kept secret from the cloud providers; otherwise, the sensitive data of the LBSP could be disclosed. Similarly, the knowledge contained in the LBS user’s query request should also be kept secret from the cloud entities (i.e., the type retrieval cloud and the location retrieval cloud) even if they provide the query service.
2.4.2. Perform LBS Query Efficiently
The designed scheme should achieve high time efficiency. Although the outsourced cloud provider can offer large computing power, the data retrieval efficiency of the cloud side should be high enough to guarantee a short response time.
2.4.3. Achieve Low Computation Cost
Although the performance of smartphones has been greatly improved, the limitation of their batteries is still a problem. Moreover, the energy consumption of the cloud side should also be considered. Therefore, the proposed scheme should consider the computation cost for reducing the computation burden on the user side and cloud side.
3. Building Blocks
In this section, we give the notations and techniques used in this paper. The summary of notations is presented in Table 1.
3.1. Paillier Cryptosystem
The Paillier cryptosystem supports addition over encrypted data [18]. Owing to its additive homomorphism, an operation on ciphertexts corresponds to the addition of the underlying plaintexts. Specifically, the Paillier cryptosystem consists of three algorithms (i.e., key generation, encryption, and decryption), as shown below.
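Key generation: first, two independent large prime numbers $p$ and $q$ are randomly selected. Then, we compute $n = pq$ and $\lambda = \mathrm{lcm}(p-1, q-1)$, where $\mathrm{lcm}$ is the least common multiple function. Finally, the public key $(n, g)$ and private key $(\lambda, \mu)$ can be obtained, where $g \in \mathbb{Z}^{*}_{n^{2}}$, $\mu = \left(L(g^{\lambda} \bmod n^{2})\right)^{-1} \bmod n$, and $L(x) = (x-1)/n$.
Encryption: assume $m \in \mathbb{Z}_{n}$ is a plaintext to be encrypted. First, a random number $r \in \mathbb{Z}^{*}_{n}$ is selected. Then, the encrypted result $c$ can be computed by the following equation:
$$c = E(m) = g^{m} \cdot r^{n} \bmod n^{2}.$$
Decryption: to get the plaintext $m$, the encrypted result $c$ can be recovered with the private key $(\lambda, \mu)$ by the following equation:
$$m = D(c) = L(c^{\lambda} \bmod n^{2}) \cdot \mu \bmod n.$$
The additive homomorphic property of the Paillier cryptosystem can be shown by the following equation:
$$E(m_{1}) \cdot E(m_{2}) = g^{m_{1}+m_{2}} \cdot (r_{1} r_{2})^{n} \bmod n^{2} = E(m_{1} + m_{2}).$$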
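As an illustration of the three algorithms and the additive property, the following is a minimal Python sketch of textbook Paillier. It is not the implementation evaluated in this paper: the function names, the choice $g = n + 1$, and the two small Mersenne primes are assumptions made only to keep the example short, and the parameters are far too small for real use.

```python
import math
import secrets

# Toy primes (2^31 - 1 and 2^61 - 1), assumed for illustration only; not secure.
P, Q = 2**31 - 1, 2**61 - 1

def L(x, n):
    # L(x) = (x - 1) / n, exact for the values it is applied to here
    return (x - 1) // n

def keygen(p, q):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                                          # assumed, common choice of g
    mu = pow(L(pow(g, lam, n * n), n), -1, n)          # modular inverse mod n
    return (n, g), (lam, mu)

def encrypt(pk, m):
    n, g = pk
    while True:                                        # pick r coprime to n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    return (L(pow(c, lam, n * n), n) * mu) % n

def add(pk, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts (mod n)
    n, _ = pk
    return (c1 * c2) % (n * n)

pk, sk = keygen(P, Q)
c = add(pk, encrypt(pk, 20), encrypt(pk, 22))
print(decrypt(pk, sk, c))  # 42
```

Because encryption is randomized, two encryptions of the same plaintext are almost always different ciphertexts, yet both decrypt to the same value.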
3.2. Distance Comparison Algorithm
To guarantee the privacy of the LBS user’s location and query radius, the distance comparison algorithm is proposed based on ciphertext comparison schemes [19–21], as shown in Algorithm 1. Herein, the user’s query radius is denoted by , and represents the Euclidean distance between the user’s coordinates and the coordinates of a POI, where and are both integers. The key generation of the paillier cryptosystem is represented by , the encryption operation with public key is represented by , and the decryption operation with private key is represented by . is the space of random number . is the polynomial probability algorithm, and , where is the security parameter.
Algorithm 1: Distance comparison algorithm.
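The idea of comparing a distance with an encrypted radius can be sketched as follows. This is a generic blinded comparison built on the additive homomorphism of Paillier and is not necessarily identical to Algorithm 1: the blinding form $r_1 (R - d) + r_2$ with $0 < r_2 < r_1$, the names `blind_compare` and `within_radius`, and the toy primes are all assumptions introduced for illustration.

```python
import math
import secrets

P, Q = 2**31 - 1, 2**61 - 1  # toy primes, assumed for illustration; not secure

def keygen(p, q):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    g = n + 1
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def enc(pk, m):
    n, g = pk
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m % n, n * n) * pow(r, n, n * n)) % (n * n)

def dec(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def blind_compare(pk, enc_radius, distance, bound=2**16):
    """Run by the party that knows `distance` but holds only E(radius)."""
    n, _ = pk
    n2 = n * n
    r1 = secrets.randbelow(bound - 2) + 2   # 2 <= r1 < bound
    r2 = secrets.randbelow(r1 - 1) + 1      # 1 <= r2 < r1
    # E(radius - distance) = E(radius) * E(distance)^(-1) mod n^2
    diff = (enc_radius * pow(enc(pk, distance), -1, n2)) % n2
    # E(r1 * (radius - distance) + r2): the sign is kept, the magnitude is blinded
    return (pow(diff, r1, n2) * enc(pk, r2)) % n2

def within_radius(pk, sk, blinded):
    """Run by the key holder: residues above n/2 encode negative values."""
    n, _ = pk
    return dec(pk, sk, blinded) < n // 2

pk, sk = keygen(P, Q)
enc_R = enc(pk, 500)                                          # encrypted query radius
print(within_radius(pk, sk, blind_compare(pk, enc_R, 300)))   # True
print(within_radius(pk, sk, blind_compare(pk, enc_R, 501)))   # False
```

If $d \le R$, the decrypted value is at least $r_2 > 0$; if $d > R$, it is at most $r_2 - r_1 < 0$. The key holder therefore learns the comparison result but not the exact difference, provided the blinded value stays well below $n/2$ in magnitude.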
4. Proposed PPQS
In this section, we give the formal descriptions of the proposed scheme, which consists of the following four phases: data preparation phase, query request phase, query retrieval phase, and result filtration phase.
4.1. Data Preparation
In general, the information of data items stored in the LBSP includes identifiers, POI type keywords, coordinates, descriptions, etc. Each data item is plaintext in the format of , as shown in Table 2. Herein, the identifier and POI type keyword of each data item are converted to bit strings, i.e., and , where is the security parameter of a pseudo-random function (PRF) and a pseudo-random permutation (PRP) .
4.1.1. Secret Key Generation
As mentioned in the system model, the LBSP outsources its encrypted LBS data to the cloud side for providing LBS users with the LBS query service. The secret key generation is the preparatory work for the outsourced storage service and query service, which mainly contains the following steps. The LBSP first chooses a secret key for a secure hash function , where . Then, the LBSP selects a key for a PRF , a key for a PRP , and a key for another PRF . Subsequently, the LBSP selects a random number for a symmetric encryption algorithm (i.e., ), where . Finally, the LBSP assigns secret keys , , , and to each registered user. Besides, the TRC runs the paillier cryptosystem to generate the key pair and then opens its public key to other entities, and the LBS user runs to get the key pair .
4.1.2. Encrypted Database Generation
The LBSP runs Algorithm 2 to generate two encrypted databases. Database is generated by the searchable symmetric encryption (SSE) [22, 23] and contains the association between the encrypted POI-type keyword (i.e., search token ) and the corresponding set of encrypted (i.e., ). Database contains the relation between the encrypted (i.e., ) and the corresponding encrypted coordinates (i.e., ) and encrypted description (i.e., ). In , is generated by the pseudo-random permutation with its key , is generated by the secure hash function with its secret key , and is generated by the symmetric encryption algorithm with its secret key , as shown in Table 3. Suppose that the number of data items corresponding to the keyword is denoted as . Then, it needs times to generate the encrypted data items, where represents the number of POI-type keywords and is the max number of data items for all the POI-type keywords. Therefore, the time complexity of Algorithm 2 is .
Algorithm 2: Encrypted database generation.
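To make the structure of the keyword-to-identifier database more concrete, the following Python sketch builds a small SSE-style inverted index and looks it up with a search token. It is a simplified stand-in rather than Algorithm 2 itself: HMAC-SHA256 is assumed as the PRF, an XOR pad derived from the PRF is assumed in place of the symmetric encryption algorithm, and all function and key names are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict

def prf(key: bytes, msg: bytes) -> bytes:
    # HMAC-SHA256 used as a stand-in pseudo-random function (assumption)
    return hmac.new(key, msg, hashlib.sha256).digest()

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_token(k_token: bytes, keyword: str) -> bytes:
    # Search token for a POI-type keyword
    return prf(k_token, keyword.encode())

def build_index(k_token: bytes, k_mask: bytes, items):
    """items: iterable of (poi_id, keyword); returns {token: [masked ids]}."""
    index = defaultdict(list)
    for poi_id, keyword in items:
        token = make_token(k_token, keyword)
        pos = len(index[token])
        # Toy encryption of the identifier: XOR with a PRF-derived pad
        pad = prf(k_mask, token + pos.to_bytes(4, "big"))[:8]
        index[token].append(xor_bytes(poi_id.to_bytes(8, "big"), pad))
    return dict(index)

def type_retrieval(index, token):
    # The cloud matches on the token only; it never sees the keyword
    return index.get(token, [])

def unmask_ids(k_mask: bytes, token: bytes, masked):
    # Performed by a party that holds the masking key
    ids = []
    for pos, blob in enumerate(masked):
        pad = prf(k_mask, token + pos.to_bytes(4, "big"))[:8]
        ids.append(int.from_bytes(xor_bytes(blob, pad), "big"))
    return ids

k_token, k_mask = b"token-key-demo", b"mask-key-demo"
index = build_index(k_token, k_mask, [(1, "cafe"), (2, "bank"), (3, "cafe")])
token = make_token(k_token, "cafe")
print(unmask_ids(k_mask, token, type_retrieval(index, token)))  # [1, 3]
```

The lookup touches only the entries stored under one token, which is why the retrieval cost depends on the number of data items of the queried POI type rather than on the size of the whole database.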
4.1.3. Query Request
The user generates an original query request , where is the user’s current position coordinates, is the user’s query radius, and is the POI-query-type keyword.
As illustrated in the system model, the user needs to perform some encryption operations before sending the original query for guaranteeing query privacy. More specifically, the user uses the POI-type keywords and to generate search token , generates secret key with and , uses and the hash function to encrypt the current coordinates, and encrypts the query radius with public key . Finally, the user sends the query request to the type retrieval cloud and sends query request to the location retrieval cloud. The pseudocode of query request is shown as Algorithm 3.
Algorithm 3: Query request.
4.2. Query Retrieval
The query retrieval consists of two processes: the type retrieval and the range retrieval. In brief, the type retrieval is to find the matched data that correspond to the user’s POI query type and the range retrieval is to perform distance calculation on the ciphertext domain for the matched data.
4.2.1. Type Retrieval
After receiving the user’s query request , the TRC can find out a list from according to the user’s . In list , the POI type of each (i.e., encrypted ) is consistent with the user’s POI query type keyword . Afterward, the TRC inserts all the contained in list into an empty list and then sends to the LRC. The pseudocode of type retrieval process is shown as Algorithm 4. According to the user’s search token , the TRC requires at most times to find out list . Therefore, the time complexity of Algorithm 4 is at most.
Algorithm 4: Type retrieval.
4.2.2. Range Retrieval
After receiving the user’s query request , the LRC first gets the set of by decrypting in list with . Then, according to the set of , the LRC finds out the corresponding encrypted coordinates and encrypted description in . For each , the LRC performs the following steps: calculates the Euclidean distance between and (i.e., ), runs the distance comparison algorithm and calculates , generates by encrypting with the user’s public key , and inserts and to an empty list . The pseudocode of range retrieval process is shown as Algorithm 5. Since the number of is consistent with the number of data that correspond to the user’s POI query type keyword , the times of decrypting is . Besides, since the LRC needs to run the distance comparison algorithm once for each , the corresponding times of running the distance comparison algorithm is . Therefore, the time complexity of Algorithm 5 is at most.
Algorithm 5: Range retrieval.
4.3. Result Filtration
Based on the list , the TRC figures out the data that locate in the user’s query radius by decrypting with the private key . If , the TRC inserts the corresponding into list and sends as the query result to the user. The pseudocode of result filtration is shown in Algorithm 6. Since the TRC needs to decrypt all in list to get list and the number of is consistent with the number of , the times of decrypting is . Therefore, the time complexity of Algorithm 6 is at most.
Algorithm 6: Result filtration.
After getting , the user can obtain the desired data by decrypting with the private key and .
5. Discussion
In this section, the security analysis is first presented to check whether the defined security requirements can be satisfied in our scheme. Subsequently, in terms of the number of data that need to be processed in different phases, the computation cost of the previous schemes and our scheme are analyzed and compared.
5.1. Security Analysis
The security analysis is based on the proposed security requirements: data privacy, query privacy, resistance to the insider attack, and resistance to the eavesdropping attack. Before analyzing the security requirements, the following lemmas are introduced to state the security of the pseudo-random permutation (PRP), the pseudo-random function (PRF), the symmetric encryption algorithm, and the Paillier cryptosystem.
Lemma 1 (see [5]). For an adversary who uses probabilistic polynomial time (PPT) algorithm, if its advantages and are negligible, then the PRF and PRP are secure, where
Lemma 2 (see [6]). For an adversary who uses probabilistic polynomial time (PPT) algorithm, if its advantages is negligible, then the symmetric encryption scheme is secure, where
Lemma 3 (see [24]). For an adversary who uses probabilistic polynomial time (PPT) algorithm, if its advantages is negligible, then the paillier cryptosystem is a ()-decisional composite residuosity (DRC) problem and secure, where
5.1.1. Data Privacy
Theorem 1. Based on the security of PRP, PRF, symmetric encryption algorithm , hash function , and the paillier cryptosystem, our scheme can achieve data privacy.
Proof. In our scheme, two encrypted databases (i.e. , ) are outsourced to the TRC and the LRC, respectively. In , the relation between the POI-type keyword and encrypted index is built by search token , where and are both generated by the pseudo-random function . Moreover, the ciphertext is generated by the symmetric encryption algorithm with its secret key and an input , where is the outcome produced by the pseudo-random function and is an encrypted outcome generated by the pseudo-random permutation . Therefore, if the PRP, PRF, and symmetric encryption are secure, the TRC cannot obtain knowledge from even if the TRC owns database . In , the encrypted (i.e. ), encrypted coordinates (i.e. ), and the encrypted description (i.e. E) are generated by the pseudo-random permutation , the hash function , and the symmetric encryption algorithm , respectively. Thus, as long as the PRP, hash function, and symmetric encryption are secure, the LRC cannot get any actual knowledge from even if the LRC owns database .
Besides, the retrieved data contained in list sent from the TRC to the LRC is generated by the symmetric encryption . Similarly, the retrieved data contained in list sent from the LRC to the TRC is produced by the paillier cryptosystem. Therefore, if the symmetric encryption and the paillier cryptosystem are secure, the knowledge contained in transferred data between the two clouds (i.e., the TRC and the LRC) can also be well protected. Thus, under the assumption that the two clouds are two noncolluding entities, no single cloud can obtain the knowledge of the data stored in itself unless the other cloud provides additional information (i.e., the two clouds collude with each other and share the secret key and ).
Therefore, our scheme can provide data privacy.
5.1.2. Query Privacy
Theorem 2. Based on the security of PRP, PRF, symmetric encryption algorithm, the hash function, and the paillier cryptosystem, our scheme can guarantee the user’s query privacy.
Proof. In the user’s query request , the user’s POI query type keyword is represented by search token , where is generated by the pseudo-random function . During the process of type retrieval, the is converted to the encrypted index for finding the ciphertext in list , where is generated by the pseudo-random function with , is generated by the symmetric encryption algorithm with its secret key and an input , is the outcome produced by the pseudo-random function , and is an encrypted outcome generated by the pseudo-random permutation . Therefore, although the process of type retrieval is performed on the TRC, the TRC cannot learn any useful knowledge about the user’s query type due to the security of PRP, PRF, and symmetric encryption. In the user’s query request , the user’s coordinates and query radius are encrypted by the hash function and the paillier cryptosystem, respectively. During the process of range retrieval, the distance comparison between the user’s query radius and the Euclidean distance that indicates the user’s coordinates and the coordinates of a POI is performed on the ciphertext domain. Therefore, if the hash function and the paillier cryptosystem are secure, the LRC cannot obtain any actual knowledge about the user’s accurate location and query radius.
Therefore, our scheme can provide query privacy.
5.1.3. Resistance to the Insider Attack
Theorem 3. Based on Theorem 1 and the proposed dual cloud structure, our scheme has the ability to resist the insider attack.
Proof. As mentioned in Theorem 1, data privacy can be achieved. Thus, our scheme can prevent the knowledge stored in outsourced data from being leaked to the cloud service provider including its insider. However, under the single cloud structure, the outsourced data stored in the cloud side may be controlled or stolen by the insider on a one-time basis since there is only one cloud service provider (i.e., only one cloud entity). Nevertheless, the cloud side in our scheme is divided into two noncolluding cloud entities (i.e., the TRC and the LRC) and the sensitive information (i.e., POI-type information and POI location information) in the outsourced data is stored in the above two cloud providers separately. Therefore, unless the insider occupies the two cloud entities simultaneously, our scheme can prevent the insider from controlling or stealing all the outsourced data on a one-time basis.
Thus, compared to the schemes under the single cloud structure, our scheme provides a stronger ability of privacy preservation for sensitive information contained in the outsourced data.
5.1.4. Resistance to the Eavesdropping Attack
Theorem 4. Based on Theorem 2 and the proposed dual cloud structure, our scheme has the ability to resist the eavesdropping attack.
Proof. As mentioned in Theorem 2, the user’s query privacy can be achieved. Thus, our scheme can prevent the knowledge contained in the query request from being leaked to the eavesdropper. However, under the single cloud structure, all of the user’s private query data may be obtained by the eavesdropper on a one-time basis since these data are contained in one query request. In contrast, the user’s private query data (i.e., POI-type keyword, location, and query radius) in our scheme are sent to the TRC and the LRC in two separate query requests. Therefore, unless the eavesdropper captures the two query requests simultaneously, our scheme can prevent the eavesdropper from obtaining all of the user’s private query data on a one-time basis.
Therefore, compared to the schemes under the single cloud structure, our scheme provides a stronger ability of privacy preservation for sensitive information contained in the user’s query request.
5.2. Cost Analysis
Herein, we compare our scheme with previous schemes by analyzing the linear relationship between the computation cost and the number of data items that need to be processed in each phase. The notations used in this section are described in Table 4, and the comparison of computation cost with the previous schemes [2, 4, 5] is summarized in Table 5.
The encrypted database is generated by the LBSP, so the phase of the encrypted database generation reflects the computation cost of the LBSP. In the phase of encrypted database generation, the computation cost of each scheme in Table 5 is linear to .
The query retrieval is performed by the cloud side, so the query retrieval phase reflects the computation cost of the cloud side. In this phase, the computation costs of scheme and scheme are linear to . However, the computation cost of scheme is linear to 2 and the computation cost of our scheme is linear to . The reason is that scheme and scheme support only the range query, so all the data items need to be compared to decide whether they are within the user’s query range. Therefore, the computation costs of scheme and scheme are linear to . In contrast, scheme and our scheme support both the type query and the range query. Therefore, in scheme and our scheme, the first step of the query retrieval phase is to find the data items that match the user’s POI query type keyword, and then only the matched data items need to be compared to decide whether they are within the user’s query range. is less than since is the number of data items for one kind of POI type. Therefore, in the query retrieval phase, the computation cost of our scheme is lower than that of the other schemes.
The result filtration prepares an encrypted dataset (i.e., the query result) in which every encrypted data item satisfies the user’s query conditions, so the number of data items in the encrypted dataset reflects the computation cost of the user side. In the result filtration phase, the computation costs of scheme and scheme are linear to , where also represents the number of data items located in the user’s query range after performing the range retrieval. However, the computation costs of scheme and our scheme are linear to , where also denotes the number of data items that both correspond to the user’s query type and are located in the user’s query range after performing the type retrieval and the range retrieval. is less than since the user’s query range contains data of more than one POI type. Therefore, in the result filtration phase, the computation cost of our scheme is the same as that of scheme , but lower than that of scheme and scheme .
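The difference between the two query modes discussed above can be illustrated with a small plaintext sketch. It is not the encrypted protocol of any of the compared schemes: the function names (`build_type_index`, `range_only`, `type_then_range`), the type labels, and the synthetic data are all assumptions made for illustration. It only counts how many distance comparisons the cloud side would run and how many items would be returned to the user in each mode.

```python
import math
import random


def build_type_index(items):
    """Group item coordinates by POI type (a plaintext stand-in for a type index)."""
    index = {}
    for poi_type, x, y in items:
        index.setdefault(poi_type, []).append((x, y))
    return index


def range_only(items, center, radius):
    """Range query only: every stored item is compared against the query circle."""
    comparisons = 0
    result = []
    for poi_type, x, y in items:
        comparisons += 1
        if math.hypot(x - center[0], y - center[1]) <= radius:
            result.append((poi_type, x, y))
    return comparisons, result


def type_then_range(type_index, query_type, center, radius):
    """Type query first, then distance comparison on the matched items only."""
    comparisons = 0
    result = []
    for x, y in type_index.get(query_type, []):
        comparisons += 1
        if math.hypot(x - center[0], y - center[1]) <= radius:
            result.append((query_type, x, y))
    return comparisons, result


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical dataset: 10000 items in a 10 km x 10 km square,
    # 1000 of one assigned type and 500 of another.
    labels = ["catering"] * 1000 + ["accommodation"] * 500 + ["other"] * 8500
    items = [(t, rng.uniform(0, 10), rng.uniform(0, 10)) for t in labels]
    index = build_type_index(items)

    c_all, r_all = range_only(items, (5.0, 5.0), 1.5)
    c_typed, r_typed = type_then_range(index, "catering", (5.0, 5.0), 1.5)
    print("range only:      comparisons =", c_all, " results =", len(r_all))
    print("type then range: comparisons =", c_typed, " results =", len(r_typed))
```

In this sketch the range-only mode always performs one comparison per stored item, whereas the combined mode performs one comparison per item of the queried type, which mirrors the linear relationships compared in this section.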
6. Evaluation
In this section, we first describe the simulation setup and then give the simulation results and corresponding analysis.
6.1. Setup
Implementations: our scheme is implemented in Python and run on a Windows machine with an Intel Core i7 3.6 GHz CPU, 16 GB RAM, and Microsoft Windows 7. We call the encryption sub-package of the PyCrypto library to implement the encryption algorithms of our scheme and adopt a 512-bit Paillier cryptosystem to encrypt the coordinates of each data item. The LBS resources are collected from an open map of Beijing through the public API of the Amap service [25], and the database of the LBSP is built in the form of Table 2. Based on the database of the LBSP, we construct a dataset with 10000 data items whose coordinates are randomly distributed in a 10 km × 10 km square area . The encrypted database on the cloud side is implemented by constructing the type index table and the location data table in MySQL, so the delay of data transmission between the clouds is assumed to be 0. To evaluate the time cost, retrieval efficiency, and computation burden, the simulations are performed in the following two scenarios.
Scenario 1: based on the dataset , three original datasets (i.e., , , and ) containing data of two assigned POI types (i.e., the catering service denoted as and the accommodation service denoted as ) are formed. Specifically, the numbers of and in original dataset are 1000 and 500, respectively; the numbers of and in original dataset are 2000 and 1000, respectively; and the numbers of and in original dataset are 3000 and 1500, respectively. The total number of data items in each original dataset is 10000. This scenario is used to evaluate the time cost of generating outsourced datasets.
Scenario 2: to simulate the data retrieval service, two POI query types (i.e. , and ), 5 randomly selected locations in , and 5 specified query radii (i.e., ranging from 0.5 km to 2.5 km with step length 0.5 km) are assigned into the LBS query request. In addition, the grid structure with the grid cell size of 1 km is constructed for , and the method of related area list [2] is used to define the LBS user’s query area. This scenario is used to evaluate the time cost of the LBS query request generation, the data retrieval efficiency of the outsourced cloud, and the computation burden on the user side and cloud side.
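Since the setup relies on a Paillier cryptosystem to encrypt coordinates, a toy sketch of the textbook Paillier construction may help readers recall its additive homomorphism. This is not the implementation used in the simulations: the key here is built from two small, well-known Mersenne primes rather than a 512-bit modulus, the function names (`keygen`, `encrypt`, `decrypt`, `add`, `mul_const`) are hypothetical, and the example values merely stand in for integer-encoded coordinates. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key pair from two distinct primes p and q."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                  # common simple choice of generator
    mu = pow(lam, -1, n)       # with g = n + 1, mu is the inverse of lam mod n
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m from a ciphertext using L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


def add(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def mul_const(pub, c, k):
    """Raising a ciphertext to k multiplies the plaintext by k (mod n)."""
    n, _ = pub
    return pow(c, k, n * n)


if __name__ == "__main__":
    # Toy primes (2**31 - 1 and 2**61 - 1); far too small for real use.
    pub, priv = keygen(2147483647, 2305843009213693951)
    cx = encrypt(pub, 3250)    # e.g., an x coordinate in meters (assumed encoding)
    dx = encrypt(pub, 1200)
    print(decrypt(pub, priv, add(pub, cx, dx)))
    print(decrypt(pub, priv, mul_const(pub, cx, 3)))
```

The two helper operations show why a party holding only ciphertexts can still combine encrypted values without learning them, which is the property the scheme draws on.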
6.2. Simulation Results
6.2.1. Time Cost of Outsourced Datasets Generation
To evaluate the time cost of generating outsourced datasets, the generation of encrypted data items is run 10 times on each of the three original datasets (i.e., , , and ). Figure 3 shows the average time cost of generating the outsourced datasets (i.e., , , and ).

As can be seen from Figure 3, the time cost of generating outsourced datasets increases with the number of assigned POI-type data items in the original datasets. To generate encrypted data items, the nonassigned POI-type data only need their coordinates hashed, while the assigned POI-type data need both their coordinates hashed and searchable encrypted indexes constructed. Therefore, a larger number of assigned POI-type data items increases the time cost of generating outsourced datasets.
6.2.2. Time Cost of the LBS Query Generation
The efficiency of generating an LBS query request is important to the user side. Therefore, to evaluate the time cost of the LBS query request generation, 5 random locations in square area are selected as the LBS user’s initial coordinates based on the setting of scenario 2, and the LBS query request is generated 10 times at each location. Figure 4 shows the average time cost of generating the encrypted LBS query request at each location.

As can be seen from Figure 4, when generating the LBS query request at different locations, the time cost of generating the encrypted LBS query request is stable, and it has little relationship with the locations. The reason is that no matter which location is selected as the user’s current coordinates to generate the encrypted LBS query request, the way of encryption on the coordinates is the same (i.e., using the hash function to encrypt the user’s current coordinates). In addition, the time cost of the LBS query request generation has little to do with the POI types contained in the LBS query request. The reason is that the user’s POI query types are represented by the corresponding POI type keywords that are finally converted to search tokens by the pseudo-random function.
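The observation that token generation does not depend on which POI type keyword is queried can be illustrated with a keyed pseudo-random function. The sketch below uses HMAC-SHA256 from the Python standard library as a stand-in PRF; the concrete PRF, the key handling, and the name `search_token` are assumptions made for illustration and are not taken from the scheme's implementation.

```python
import hashlib
import hmac


def search_token(key: bytes, keyword: str) -> bytes:
    """Derive a fixed-length search token from a POI type keyword.

    HMAC-SHA256 is used here as an assumed pseudo-random function.
    """
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


if __name__ == "__main__":
    key = b"demo-key-for-illustration-only"   # hypothetical shared secret
    for keyword in ("catering service", "accommodation service"):
        token = search_token(key, keyword)
        print(keyword, "->", token.hex(), "(", len(token), "bytes )")
```

Whatever the keyword, the token has the same length and is produced by the same single keyed-hash call, so the cost of this step is essentially constant across POI types.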
6.2.3. Retrieval Efficiency of the Outsourced Cloud
The data retrieval time is related to the data retrieval efficiency of the outsourced cloud. Therefore, to evaluate the data retrieval efficiency of the outsourced cloud, the data retrieval service is performed 10 times on each outsourced dataset (i.e. , , and ) based on the setting of scenario 2. Figure 5 shows the average time cost of data retrieval service with different POI query types and query radii under different outsourced datasets.

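Figure 5: Average time cost of the data retrieval service with different POI query types and query radii; panels (a), (b), and (c).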
Based on Figure 5, we can conclude how the time cost of data retrieval service is affected by the user’s query radius, the data density of POI query type, the dataset density of POI type, and the supported query mode.
To study the relation between the time cost and the user’s query radius, the user’s query radius is varied from 0.5 km to 2.5 km. When the POI query type is fixed (e.g., ), the time cost of the data retrieval service increases with the query radius regardless of the outsourced dataset. The reason is that a larger query radius enlarges the user’s query area, which increases the number of data items located in that area. Accordingly, the cloud side needs to run more distance comparison operations, which increases the time cost of the data retrieval service.
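The growth of the query area with the radius can be made concrete on the 1 km grid from scenario 2. The sketch below lists the grid cells that a query circle touches; it is a generic circle-to-cell intersection test written for illustration and is not claimed to reproduce the related area list method of [2]. The function name `candidate_cells` and its parameters are hypothetical.

```python
import math


def candidate_cells(center, radius, cell_size=1.0, area_size=10.0):
    """Return the (col, row) grid cells that intersect the query circle."""
    cx, cy = center
    n = int(area_size / cell_size)
    lo_col = max(0, math.floor((cx - radius) / cell_size))
    hi_col = min(n - 1, math.floor((cx + radius) / cell_size))
    lo_row = max(0, math.floor((cy - radius) / cell_size))
    hi_row = min(n - 1, math.floor((cy + radius) / cell_size))

    cells = []
    for col in range(lo_col, hi_col + 1):
        for row in range(lo_row, hi_row + 1):
            # Closest point of this cell to the query center.
            nx = min(max(cx, col * cell_size), (col + 1) * cell_size)
            ny = min(max(cy, row * cell_size), (row + 1) * cell_size)
            if math.hypot(nx - cx, ny - cy) <= radius:
                cells.append((col, row))
    return cells


if __name__ == "__main__":
    for radius in (0.5, 1.0, 1.5, 2.0, 2.5):
        cells = candidate_cells((5.0, 5.0), radius)
        print("radius", radius, "km ->", len(cells), "candidate cells")
```

Because a larger circle always contains the smaller one, the set of candidate cells can only grow with the radius, and so does the number of data items whose distance must be compared.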
Since the data amount of is twice that of in each outsourced dataset, the data density of can be seen as a high-density POI type if the data density of is assumed to be a low-density POI type. From Figure 5, it can be seen that the time cost of data retrieval service for the high-density POI type (i.e. ) is higher than that for the low-density POI type (i.e.) no matter on which outsourced dataset. The reason is that the user’s query radius decides the query area, and when the query area is fixed, the amount of data of the high-density POI type is greater than that of the low-density POI type in the query area. Accordingly, a larger amount of data requires more distance comparison operations to be run, which leads to the increase of time cost. Besides, from Figure 5, the time cost on these two POI types in each outsourced dataset has similar behavior. The reason is that the time cost on the cloud side is basically spent on the distance comparison of the assigned POI-type data located in the query area during the data retrieval service. When the data amount of these two POI types in each outsourced dataset is proportional (i.e., the data amount of is twice that of in each outsourced dataset), the corresponding data amount of these two POI types in the query area is proportional. Accordingly, the time cost of data retrieval service on these two POI types (i.e., and) is proportional in each outsourced dataset, which leads to similar behavior of the time cost on these two POI types in each outsourced dataset.
Recall that the data amount of any POI type (no matter or ) in is more than that in and . Thus, in terms of the dataset density of POI type, can be seen as a high-density outsourced dataset if or is assumed to be a low-density outsourced dataset. From Figure 5, for any POI type (e.g., ), it can be seen that the time cost of data retrieval service on a high-density outsourced dataset (e.g., ) is greater than that on a low-density outsourced dataset (e.g. ,). The reason is that when the query area is fixed, the amount of POI-type data contained in a high-density outsourced dataset is more than that contained in a low-density outsourced dataset, i.e., the running times of distance comparison operation on a high-density outsourced dataset are more than that on a low-density outsourced dataset in the query area, which leads to a greater time cost of data retrieval service on a high-density outsourced dataset.
Figure 5 reflects the results about the time cost of data retrieval service with the assigned POI types (i.e., the results are obtained based on the query mode that supports both the type query and the range query), which means the cloud side only needs to run distance comparison operation for the matched POI-type data in the query area. However, if the type query is not supported, the cloud side needs to run the distance comparison for all the data rather than the matched POI-type data in the query area, which will undoubtedly lead to a bigger computation burden and a greater time cost of data retrieval service. Therefore, compared to the scheme that only supports the range query, the scheme (i.e. ) that supports both the type query and the range query can effectively reduce the computation burden on the cloud side and improve the data retrieval efficiency of the cloud side.
6.2.4. Computation Burden on the User Side
The amount of encrypted data in the query result returned to the user from the cloud side is related to the computation burden on the user side. Although different schemes use diverse decryption algorithms to enable the user to get the desired result by decrypting the encrypted data in the query result, if the user side is assumed to adopt the same decryption algorithm, the number of encrypted data contained in the query result returned to the user can be used as the basis for measuring the user’s computation burden. Therefore, the computation burden on the user side is evaluated according to the number of data in the query result. Besides, since our scheme provides the mode that supports both the type query and the range query (i.e. ), the data retrieval service is first executed 10 times on each outsourced dataset (i.e., , , and ) based on the setting of scenario 2. To compare with the mode that only supports the range query (i.e. ), under the condition of removing the type index table, the data retrieval service is then executed 10 times on each outsourced dataset. Finally, the average number of data contained in the query result returned to the user side is calculated and Figure 6 shows the average number of data contained in the query result with different POI query types and query radii under different outsourced datasets.

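Figure 6: Average number of data items contained in the query result with different POI query types and query radii; panels (a), (b), and (c).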
Based on Figure 6, we can conclude how the number of data contained in the query result is affected by the user’s query radius, the data density of POI query type, the dataset density of POI type, and the supported query mode.
When the POI query type is fixed (e.g., ), the number of data items contained in the query result increases with the user’s query radius regardless of the outsourced dataset. The reason is that a larger query radius enlarges the user’s query area, which increases the number of data items located in that area. Accordingly, the cloud side needs to insert more data into the query result.
Recall that the data density of can be seen as a high-density POI type if the data density of is assumed to be a low-density POI type. In the mode that supports both the type query and the range query (i.e., ), it can be seen that the number of data contained in the query result for the high-density POI type (i.e., ) is greater than that for the low-density POI type (i.e., ) no matter on which outsourced dataset. The reason is that when the query area is fixed, the amount of the high-density POI-type data is greater than that of the low-density POI type data in the query area. Accordingly, the cloud side needs to insert more data into the query result.
As mentioned above, can be seen as a high-density outsourced dataset if or is assumed to be a low-density outsourced dataset. In the mode that supports both the type query and the range query (i.e., ), for any POI type (e.g., ), it can be seen that the number of data contained in the query result on a high-density outsourced dataset (e.g., ) is greater than that on a low-density outsourced dataset (e.g., ). The reason is that a high-density outsourced dataset has a greater data density compared to a low-density dataset in the user query area, which leads to the increase of the data contained in the query result.
In the mode that only supports the range query (i.e., ), since the cloud side cannot select the encrypted data according to the user’s POI query type, the data inserted into the query result are all the encrypted data within the user’s query radius. Therefore, the number of data items contained in the query result behaves similarly in each outsourced dataset (i.e., it is almost the same in each outsourced dataset when the user’s query radius is fixed). However, in the mode that supports both the type query and the range query (i.e., ), the data inserted into the query result must meet two conditions: (1) the data are consistent with the user’s POI query type and (2) the data are located within the user’s query radius. As a result, regardless of the outsourced dataset, the number of data items contained in the query result under the mode that supports both the type query and the range query is smaller than that under the mode that only supports the range query. Since the number of data items contained in the query result is related to the computation burden on the user side, the scheme (i.e., ) that supports both the type query and the range query can effectively reduce the computation burden on the user side compared to the scheme that only supports the range query.
7. Conclusion
In this paper, an efficient privacy-preserving LBS query scheme (i.e., ) in the outsourced cloud scenario is proposed. Specifically, we propose a dual cloud structure to enhance the ability of privacy preservation for sensitive information in the outsourced cloud, i.e., our scheme has the ability to resist the insider attack and the eavesdropping attack while preserving data privacy and query privacy. Moreover, by using the techniques of homomorphic encryption and searchable symmetric encryption, our scheme supports both the type query and the range query, which can effectively improve the data retrieval efficiency of the cloud side and reduce the computation burden on the user side and the cloud side. Finally, the effectiveness and performance of our scheme are validated through the analysis on security and computation cost and extensive simulations.
Data Availability
In this paper, the LBS data are collected from the public API interface of Amap service. The URL is https://lbs.amap.com/api/webservice/summary.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
This research was supported in part by Yunnan Key Laboratory of Blockchain Application Technology (202105AG070005), Natural Science Foundation of China (61802005), Joint of Beijing Natural Science Foundation and Education Commission (KZ201810009011), Beijing Municipal Natural Science Foundation (M21029), and Talent Special Project (XN083).