Special Issue: Social Security and Privacy for Social IoT
A Novel Method for Location Privacy Protection in LBS Applications
Location-based services have become mainstream in people’s daily lives owing to continuous innovation in mobile networking and GPS technologies. How to enjoy these services while safeguarding the location privacy of mobile users has recently become a hot research topic. Existing works inject random noise and therefore cannot guarantee the quality of service. In this paper, we propose a novel location privacy protection model based on the loss of service quality. The model allows the user to express his/her service quality requirement by specifying the maximum service quality loss that he/she can tolerate; this tolerance can be set to 0. Our comprehensive experimental evaluation on a real-world dataset demonstrates that our method outperforms other state-of-the-art approaches.
1. Introduction
Location-based services (LBS) are growing rapidly owing to innovations in technology and the prevalence of location-aware devices [1–3]. Such services take the user’s location in a query as input, execute the query at the server, and return information about nearby points of interest (POI), such as gas stations, banks, and restaurants. LBS applications include location-aware search (Baidu Maps), e-commerce (Meituan, Dianping), location-based social recommendation (QQ, WeChat), and ordering and crowdsourcing applications (Ali).
During this process, users’ current or future whereabouts and interests are disclosed to the LBS server through their queries. Because access to all submitted information is deemed necessary to serve users best, the LBS server is entrusted with rich information [6–8]. However, many studies have revealed that service providers can be honest but curious: they aggressively collect information to profile users, identify homes, workplaces, and social relationships, or infer interests for commercial purposes [9–12]. Therefore, the central concern of LBS is to provide high-quality service while keeping the user’s location hidden from the LBS server. These two goals seem contradictory, which makes the problem challenging.
The topic of LBS privacy has been widely studied. In 2003, Beresford proposed the concept of location privacy, which pioneered the research on location privacy protection. Since then, research on discrete location privacy and trajectory privacy has been published steadily. There are two types of privacy issues in LBS: location privacy and query privacy. Location privacy covers a user’s previous, current, and future locations, whereas query privacy concerns the type of POI he/she is interested in. Query privacy matters more when the request is sensitive (e.g., a query for hospitals). In this paper, we propose a novel location privacy protection model for the former, which ensures high-quality service while the user’s location is protected.
Existing approaches to location privacy in LBS fall into three categories. The first enlarges the user’s location into a region; the representative technique is k-anonymity, in which an obfuscated region is formed by k users. The second is the dummy-based technique: a dummy location is sent to the LBS server instead of the exact location, i.e., the user replaces his/her location with another position. The major limitation of such replacement is that the quality of service degrades significantly if the user chooses a higher level of privacy. The third transforms the original query into another problem such that the user’s location cannot be inferred. This kind of approach usually employs cryptographic algorithms and spatial transformation techniques (e.g., the Hilbert curve).
Geo-indistinguishability (GeoInd), a formal notion of location privacy introduced by Andrés et al., builds on the concept of differential privacy to design user-centric location privacy protection mechanisms. GeoInd guarantees that obfuscated locations are statistically indistinguishable from other locations within a radius around the user’s real location. However, Oya et al. illustrate that GeoInd can be misleading in terms of both privacy and utility. GeoInd mechanisms sometimes generate an obfuscated location very far away from the user, leading to unsatisfactory service quality.
In this paper, a trusted third party (TTP) is placed between the user and the server; it collects the user’s real location and sends a perturbed location to the LBS server. The TTP perturbs the real location with a novel location privacy protection model based on the loss of service quality, which addresses the challenge of ensuring high-quality service while protecting location privacy. To summarize, our contributions are as follows:
(1) We propose a service quality loss function based on the real query result, which compares the real query result with the result obtained after perturbation
(2) We propose a novel location privacy protection model based on the loss of service quality. The TTP calculates a noisy area based on the maximum service quality loss specified by the user and selects one point at random to send to the server
(3) We propose a novel traversal method based on a Voronoi diagram, which exploits the geographic relationships of locations to reduce the computational complexity efficiently
2. Related Works
Owing to the paramount importance of location privacy in LBS, it has been studied extensively, and several methods have been proposed. We briefly review some typical ones. K-anonymity based on trajectory generalization has been prevalent for its good balance between privacy protection and data availability. One such anonymization algorithm was proposed for trajectory dataset publication. Based on trajectory generalization and k-anonymity, this algorithm generalizes every position in the trajectory to a circle of a given radius and makes sure that each circle contains at least k points to satisfy k-anonymity, with each group represented by a cylinder formed by these circles. Another work proposed a technique that replaces the original data with a logical one.
Differential privacy was quickly applied to privacy-preserving data publishing; building on fake-data techniques, it achieves privacy protection by adding noise to the real dataset. In data publishing, differential privacy can achieve different levels of privacy protection and publishing accuracy by adjusting the privacy parameter $\varepsilon$. In general, the larger the value of $\varepsilon$, the lower the level of privacy protection and the higher the accuracy of the published dataset. The first common mechanism for implementing differential privacy is the Laplace mechanism. It mainly targets numeric queries and adds random noise drawn from the Laplace distribution to the results of real queries [26, 27]. For nonnumeric queries, the exponential mechanism was proposed; it is the second universal mechanism for achieving differential privacy.
Since its proposal in 2013, GeoInd has drawn a lot of attention from the research community. It has been widely used because its guarantee holds regardless of the adversary’s background information. However, Oya et al. illustrate that GeoInd has notable shortcomings. GeoInd sometimes generates an obfuscated location very far away from the user, which yields worthless results, as shown in Figures 1 and 2.
To rectify this, we propose a novel location privacy protection model to replace the user’s real location with an obfuscated one based on loss of service quality.
3. Preliminaries
In this section, we give the symbols and related definitions used in this paper. As mentioned earlier, the quality of service declines dramatically after Laplace noise is added, because the obfuscated location can be far away from the real one. To fix this, we propose the loss of service quality, computed against the real query result, as a novel evaluation index. The obfuscated location is drawn at random from the noisy area, which is calculated for a specified maximum loss.
Real query result: the TTP receives the real location and the query radius of a user. The real query result is the set of points of interest (POI) within the query radius centered at the real location, sorted by distance from the real location.
LBS query result: the LBS server receives the obfuscated location from the TTP. The LBS query result is the set of POI within the same query radius centered at the obfuscated location, sorted by distance from the obfuscated location. This article uses the maximum service quality loss to constrain the LBS query result.
Loss of service quality: the loss measures how much the real query result changes after obfuscation. According to the statistics on clickthrough rates of search results released by AOL and IMN, the relationship between ranking and attention can be expressed by a power function. Therefore, the weight of each rank is set according to a power function of the rank, and the last two ranks share the same weight. Given the real query result and the obfuscated result, the loss is defined as follows: compare the ranking of each POI before and after noise is added, and if the ranking of a POI drops, count the difference between its two weights as loss. The weight of a POI is set to 0 if it is not present in the obfuscated result.
Maximal tolerance: this is the maximal loss of service quality that a user may tolerate. The smaller it is, the more similar the obfuscated result is to the real one.
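The loss definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact weight formula is not recoverable from the text, so the hypothetical `rank_weights` helper assumes a weight of rank^(-alpha) with the last two ranks sharing one weight, and a POI pushed beyond the length of the real result is treated like an absent one (weight 0). The names `rank_weights`, `service_quality_loss`, and `alpha` are ours, not the paper's.

```python
def rank_weights(n, alpha=1.0):
    """Hypothetical power-law weights; the last two ranks share one weight."""
    cap = max(n - 2, 0)  # index whose weight is reused for the final rank
    return [(min(i, cap) + 1) ** -alpha for i in range(n)]


def service_quality_loss(real, obfuscated, alpha=1.0):
    """Sum of weight drops over POIs whose rank got worse after obfuscation."""
    weights = rank_weights(len(real), alpha)
    new_index = {poi: j for j, poi in enumerate(obfuscated)}
    loss = 0.0
    for i, poi in enumerate(real):
        j = new_index.get(poi)
        # Assumption: a POI that is missing, or pushed past the length of
        # the real result, keeps no weight at all.
        if j is not None and j < len(weights):
            new_weight = weights[j]
        else:
            new_weight = 0.0
        if new_weight < weights[i]:  # only drops count, promotions are free
            loss += weights[i] - new_weight
    return loss
```

With these assumed weights, swapping the last two POI of a result costs nothing, while swapping the first two costs the difference between the first and second weights, which matches the interchangeability of the last two ranks described in the text.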
Euclidean distance: the Euclidean distance is the length of the straight line segment between two points. Given two points $(x_1, y_1)$ and $(x_2, y_2)$ in two dimensions, their Euclidean distance is defined as $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
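As a small illustration of how a query result is formed from these definitions, the following sketch filters POI by Euclidean distance and sorts them nearest first. The function names and the dictionary layout of `pois` are assumptions made for this example; the real system obtains its results from the Baidu Maps API rather than from a local list.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two planar points given as (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def range_query(center, radius, pois):
    """Return the names of POIs within `radius` of `center`, nearest first.

    `pois` is assumed to map a POI name to its planar (x, y) coordinates.
    """
    hits = []
    for name, xy in pois.items():
        d = euclidean(center, xy)
        if d <= radius:
            hits.append((d, name))
    return [name for _, name in sorted(hits)]
```

Calling `range_query` once with the real location and once with an obfuscated location gives the two ranked lists that the loss of service quality compares.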
The Voronoi diagram: the Voronoi diagram, also known as the Thiessen polygon or Dirichlet diagram, can be constructed by first generating a Delaunay triangulation and then connecting the circumcenters of adjacent triangles. Its characteristic is that each Voronoi polygon (V polygon) contains exactly one generator, and every interior point of a V polygon is closer to its own generator than to any other generator. Points on the boundary between two polygons are equidistant from the two corresponding generators. The construction of the Voronoi diagram is shown in Figure 3.
4. Our Framework of Privacy
In this section, we describe our system architecture and the novel method of location privacy protection. We use the Baidu Maps API as the trusted third party (TTP), which is placed between the user and the server; it collects the user’s real location and sends a perturbed location to the LBS server. The TTP perturbs the real location with a novel location privacy protection model based on the loss of service quality, which addresses the challenge of ensuring high-quality service while protecting location privacy. The overall system architecture is shown in Figure 4. The model allows a user to express his/her service quality requirement by specifying the maximum service quality loss that he/she would tolerate. This maximum can also be set to 0, which means the service quality is unchanged. To guarantee the quality of the query service, it is typically set to a small value.
Because it is easier to calculate the distance between points in a planar coordinate system, we convert latitude and longitude to UTM coordinates.
4.1. Nonlossy Service Quality
As mentioned in the previous section, the loss of service quality, computed against the real query result, is a novel evaluation index that measures the difference between the real query result and the obfuscated one. Zero loss arises in two situations. In the first, the number of POI and their ranking stay the same (the last two points of interest are interchangeable). In the second, the number of POI increases while the ranking of the points in the real query result is preserved up to the same interchange. For instance, the obfuscated result may contain additional POI that are generated without impacting the loss of service quality. Therefore, to ensure nonlossy service quality, postprocessing of the obfuscated result is required.
4.1.1. Generation of Obfuscated Region
Given the real location of a user, the real query result can be obtained by calling the Baidu Maps API. The obfuscated region is then calculated from the ranking of the real query result (from nearest to farthest from the real location) and the Voronoi diagram. For every point within this region, the distances to the query results preserve the true ranking.
Algorithm 1 details the generation of the obfuscated region. Given the real location of a user, Step 2 obtains the real query result by calling the Baidu Maps API. Step 4 executes the Delaunay triangulation algorithm in reverse ranking order. We compute the overlapped region at each step to find the final obfuscated region. The steps of the Delaunay-based construction are listed as (1) to (4) after Algorithm 1.
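|Algorithm 1: Generation of the obfuscated region|
|Input: real location, query radius|
|Output: generated area|
|1. Initialize the generated area|
|2. Obtain the real query result by calling the Baidu Maps API|
|3. for each POI in the real query result do|
|4. Generate the Delaunay triangulation of the POI set|
|5. Calculate the Voronoi area of the current POI|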
|7. if then|
(1) Construct the Delaunay network from the discrete points
(2) Calculate the circumcenter of each triangle and record it
(3) For the current triangle, look for the three adjacent triangles that share an edge with it
(4) If adjacent triangles are found, connect the circumcenter of each one to the circumcenter of the current triangle. If not, calculate the perpendicular bisector of the outermost edge
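The property that defines the zero-loss region can also be checked point by point, without building the polygon. The sketch below is not Algorithm 1: it swaps the Delaunay and Voronoi construction for a plain membership test plus rejection sampling, which is slower but easy to verify. The function names, the square sampling window, and the `tries` limit are assumptions made for illustration.

```python
import math
import random


def preserves_ranking(point, ranked_pois, radius):
    """True if `point` sees the POIs in the same nearest-first order.

    The last two POIs may swap, mirroring the rule that the last two
    ranks are interchangeable. All POIs must stay within the query radius.
    """
    d = [math.dist(point, poi) for poi in ranked_pois]
    if any(x > radius for x in d):
        return False
    head, tail = d[:-2], d[-2:]
    ordered_head = all(a <= b for a, b in zip(head, head[1:]))
    head_before_tail = not head or head[-1] <= min(tail)
    return ordered_head and head_before_tail


def sample_obfuscated_location(real, ranked_pois, radius, rng, tries=10000):
    """Rejection-sample a point around `real` that preserves the ranking.

    Returns None if no valid point is found within `tries` attempts.
    """
    for _ in range(tries):
        cand = (real[0] + rng.uniform(-radius, radius),
                real[1] + rng.uniform(-radius, radius))
        if preserves_ranking(cand, ranked_pois, radius):
            return cand
    return None
```

A polygon-based construction such as Algorithm 1 yields the whole region at once, so a uniform point can be drawn from it directly; the membership test here is mainly useful as a cross-check on such a construction.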
4.1.2. Postprocessing of Obfuscated Region
The previous section yields an obfuscated region that preserves the real ranking, such as the polygon in Figure 5. A closer inspection reveals that an extra POI, one that is not in the real query result, may fall inside the query range if the obfuscated location lies in certain parts of this region. Moreover, the distance from the obfuscated location to this extra POI may be less than its distance to one of the POI in the real query result. In this case, the extra POI changes the ranking, leading to a nonzero loss of service quality. To guard against this, we postprocess the region.
To guarantee zero loss, the distance from the extra POI to the obfuscated location must be larger than the distance from the last-ranking POI to the obfuscated location. The perpendicular bisector of the segment joining the extra POI and the last-ranking POI crosses the polygon at two points, and the part of the polygon on the side of the last-ranking POI is kept. The final obfuscated region is shown in Figure 6.
Algorithm 2 details the postprocessing. Given the obfuscated region calculated in the previous section, we initialize the set of its vertices and the set of extra points. In Steps 4 to 6, we compute the query ranges with the query radius; the area enclosed by the red line in Figure 7 is the union of the query ranges. We then call the Baidu Maps API again within this area to determine whether extra points appear and add them to the set of extra points (Steps 7–9). In Steps 10 to 12, for each extra point, we draw the perpendicular bisector of the segment joining it and the last-ranking POI and cut the region with it to form the new region. This ensures that the distance from the last-ranking POI to the obfuscated location is less than that from any extra point to the obfuscated location.
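|Algorithm 2: Postprocessing of the obfuscated region|
|Input: generated area, query radius, real ranking|
|Output: final area|
|1. Let V be the set of vertices of the generated area|
|2. Initialize the area A|
|3. for each vertex v in V do|
|4. Draw a circle centered at v with the query radius|
|5. Make the tangent of the circles|
|6. end for|
|7. A = all enclosed area|
|11. for each extra point do|
|12. Final area = area bounded by the perpendicular bisector|
|13. end for|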
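The constraint that postprocessing enforces can be stated as a one-line check: every extra POI must be farther from the obfuscated location than the last-ranking POI is. The sketch below tests that condition for a single candidate point; it does not cut the polygon as Algorithm 2 does, and the function and parameter names are hypothetical.

```python
import math


def unaffected_by_extras(candidate, last_ranked_poi, extra_pois):
    """True if no extra POI is at least as close to `candidate` as the
    last-ranking POI of the real query result."""
    limit = math.dist(candidate, last_ranked_poi)
    return all(math.dist(candidate, extra) > limit for extra in extra_pois)
```

Geometrically, each extra POI removes from the region the half-plane of points that are closer to it than to the last-ranking POI, which is the cut by the perpendicular bisector described in the text.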
4.2. Tolerable Quality of Services
In this section, we consider the case in which service quality may be lossy. Given the maximal tolerance, any loss of service quality within it is acceptable. By our definition in Section 3, the loss of service quality is the loss of weight with respect to the real query result. The top priority in this section is to find all possible rankings that satisfy the tolerance constraint. Naive enumeration is expensive, so we propose two enhanced enumeration algorithms that reduce the time complexity effectively.
4.2.1. Enumeration with Pruning
The first algorithm is enumeration with pruning. For a given position, we calculate the upper bound and the lower bound of the loss of the queue after a POI joins it. The POI is not allowed in that position if the lower bound exceeds the maximal tolerance, so we can prune the corresponding branch. The procedure is given in Algorithm 3.
|Algorithm 3: Enumeration with pruning|
|Input: real ranking, maximum tolerance|
|Output: ranking set|
|1. init , , ,|
|2. for each do|
|3. calculate and|
|4. if then|
|6. for to do|
|8. for each in do|
|9. for to do|
|10. if then|
|13. calculate and|
|14. if then|
Algorithm 3 details the enumeration with pruning. Given the real ranking, we consider the possibility that each point may be ranked first. In Steps 6 to 9, we add POI to the queue in turn and calculate the upper and lower bounds. Each point of the real ranking can appear only once, whereas extra points may occur several times (Steps 10-11). In Steps 13 to 15, we store the current queue in the ranking set if the lower bound is less than the maximum tolerance. Proceeding in this way yields all the rankings that meet the constraint. This method takes no account of geospatial relationships and generates many rankings that cannot form a region.
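A minimal sketch of enumeration with pruning is given below. It simplifies Algorithm 3 in several ways that should be read as assumptions: extra points are ignored, the rank weights follow a hypothetical rank^(-alpha) power law with the last two ranks sharing one weight, and the loss accumulated so far serves as the lower bound (it can only grow as more POI are placed). The names `rank_weights` and `enumerate_rankings` are ours.

```python
def rank_weights(n, alpha=1.0):
    """Hypothetical power-law weights; the last two ranks share one weight."""
    cap = max(n - 2, 0)
    return [(min(i, cap) + 1) ** -alpha for i in range(n)]


def enumerate_rankings(real, tolerance, alpha=1.0):
    """Return every permutation of `real` whose loss is within `tolerance`."""
    n = len(real)
    w = rank_weights(n, alpha)
    results = []

    def extend(prefix, used, loss_so_far):
        if len(prefix) == n:
            results.append([real[i] for i in prefix])
            return
        j = len(prefix)  # position being filled
        for i in range(n):
            if i in used:
                continue
            # Placing the POI of original rank i at position j costs
            # w[i] - w[j] when it drops (j > i); promotions cost nothing.
            lower_bound = loss_so_far + max(w[i] - w[j], 0.0)
            if lower_bound > tolerance + 1e-12:
                continue  # prune: no completion of this branch can qualify
            used.add(i)
            extend(prefix + [i], used, lower_bound)
            used.remove(i)

    extend([], set(), 0.0)
    return results
```

As the text notes, such an enumeration knows nothing about geometry, so many of the rankings it returns may correspond to an empty region on the map.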
4.2.2. Enumeration with a Voronoi Diagram
Enumeration with pruning still has a high time complexity, and it generates many useless rankings that cannot form a region. To solve this, we give an enumeration method based on a Voronoi diagram. The final obfuscated region satisfying the tolerance constraint is obtained by dividing the polygons repeatedly. This method operates on the Voronoi diagram directly, which is intuitive and yields the obfuscated region without postprocessing.
Algorithm 4 details the enumeration with the Voronoi diagram. Given the real ranking, we generate the Voronoi diagram only once. Step 6 starts the recursive function, which calculates the upper bound and the lower bound after each addition. If the condition in Step 9 is met, the current queue is added to the ranking set. If the condition in Step 11 is met instead, the points in the candidate set that are ranked lower are removed. In Steps 13 to 16, we divide the current region into multiple regions and start a new round of recursion. The candidate set is created by Algorithm 5.
|Algorithm 4: Enumeration with the Voronoi diagram|
|Input: real ranking; maximum tolerance|
|Output: ranking set|
|1. init queue , set of Candidate ;|
|2. ; 3. for each do|
|4. Generate Delaunay triangulation of|
|5. end for|
|6. function circulate():|
|7. for each do|
|9. if then|
|11. else if then|
|12. candy.remove(x) ;|
|15. calculate the candidate of ;|
|17. return ;|
|Algorithm 5: Candidate generation|
|Input: queue; Delaunay triangulation; set of candidates|
|Output: set of candidates|
|1. for each do|
|2. for each Simplex do|
|3. if then|
|5. for each do|
|6. if then|
|8. with ranking;|
|9. return ;|
Algorithm 5 details the candidate generation. Given the current queue, the Delaunay triangulation, and the current candidate set, the algorithm finds the neighboring POI of each point in the queue and adds them to the candidate set. Then, it sorts them according to the original ranking. Take Figure 8 as an example. We first generate the Voronoi diagram and evaluate, for each point, the bounds of a queue with that point on top. The lower bounds of B, C, and D are all greater than 0.1 and thus exceed the tolerance, which is why only the remaining POI can be ranked first. After that, we partition its V polygon into smaller polygons; the perpendicular bisector of a segment between two POI crosses the polygon at two points. We then calculate the upper and lower bounds of the two resulting polygons. The lower bound of one polygon exceeds the tolerance, while the upper bound of the other is below it. Therefore, we regard the gray area as the final obfuscated region.
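The polygon division used in this example can be sketched as clipping a convex polygon with a perpendicular bisector. The code below is an illustrative simplification rather than Algorithm 4 or 5: it starts from an assumed bounding polygon instead of a Voronoi cell, it does not use the Delaunay neighbors to limit candidates, and it ignores the tolerance bounds. All function names are hypothetical.

```python
def clip_closer_to(polygon, a, b):
    """Keep the part of a convex polygon at least as close to a as to b."""
    def side(p):
        # Equals dist(p, a)^2 - dist(p, b)^2, so it is <= 0 on a's side.
        return (2 * ((b[0] - a[0]) * p[0] + (b[1] - a[1]) * p[1])
                - (b[0] ** 2 + b[1] ** 2 - a[0] ** 2 - a[1] ** 2))

    clipped = []
    for k, cur in enumerate(polygon):
        nxt = polygon[(k + 1) % len(polygon)]
        f_cur, f_nxt = side(cur), side(nxt)
        if f_cur <= 0:
            clipped.append(cur)
        if f_cur * f_nxt < 0:  # this edge crosses the bisector
            t = f_cur / (f_cur - f_nxt)
            clipped.append((cur[0] + t * (nxt[0] - cur[0]),
                            cur[1] + t * (nxt[1] - cur[1])))
    return clipped


def ranking_region(bounds, ranking):
    """Region of `bounds` whose nearest-first POI order equals `ranking`.

    Constraints between consecutive POIs suffice because distance order
    is transitive. An empty list means the ranking cannot form a region.
    """
    region = list(bounds)
    for nearer, farther in zip(ranking, ranking[1:]):
        region = clip_closer_to(region, nearer, farther)
        if not region:
            break
    return region


def area(polygon):
    """Shoelace area of a simple polygon given as a list of (x, y)."""
    s = 0.0
    for k, (x1, y1) in enumerate(polygon):
        x2, y2 = polygon[(k + 1) % len(polygon)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2
```

A ranking whose region comes back empty is one of the "useless" rankings that plain enumeration would still generate, which is the motivation for working on the polygons directly.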
5. Experimental Evaluation
In all experiments, we use Harbin Station as the real location and query the banks within 400 meters. The real result is obtained by calling Baidu Maps. For simplicity, we take only the top 10 points of interest into consideration and regard the others as extra ones. We ran our experiments on a desktop computer with an Intel Core i5-7200 2.50 GHz processor and 8 GB of RAM. The real query result and the ranking are as follows.
Since translating all coordinates by the same offset preserves the shape of every triangle, we remove the common prefix of the coordinates to obtain smaller values that are more convenient for computing the triangulation. We change the top-ranked POI (14097372.321867,5711495.970734) to (372.321867,495.970734) and do the same for the others, as in Table 1. The change of coordinates also changes the value of the distance between the real location and each query result. We take the distance between the transformed last-ranked POI (710.269230,3.976522) and the transformed real location (433.443102926,417.242938906) as the query radius.
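The coordinate simplification can be reproduced with a short calculation. In this sketch the offset (14097000, 5711000) is inferred from the single example pair quoted in the text and is therefore an assumption, as is the helper name `strip_common_offset`; the two transformed points are copied from the text.

```python
import math


def strip_common_offset(points, offset):
    """Translate every point by the same offset (removes the common prefix)."""
    ox, oy = offset
    return [(x - ox, y - oy) for x, y in points]


# Offset inferred from the example in the text (an assumption).
OFFSET = (14097000.0, 5711000.0)

top_poi = (14097372.321867, 5711495.970734)
shifted_top = strip_common_offset([top_poi], OFFSET)[0]

# Transformed coordinates quoted in the text.
real_shifted = (433.443102926, 417.242938906)
last_poi_shifted = (710.269230, 3.976522)

# Query radius: distance from the real location to the last-ranked POI.
query_radius = math.dist(real_shifted, last_poi_shifted)
```

Because the same offset is subtracted from every point, distances between shifted points equal distances between the corresponding unshifted points, so the triangulation is unaffected.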
For the first situation, zero loss as defined in Section 3, the last two POI are interchangeable, and we obtain the final obfuscated region accordingly. To realize the tolerable quality of service for a given maximal tolerance, the key question is how to obtain all the ranking results. A useful lemma based on the classical triangulation is as follows.
Lemma 1. Suppose the current generator (vertex) has a given rank, and consider the set of vertices of the triangles in the triangulation that contain this generator. Then every vertex that can take the next rank belongs to this set.
Proof. The Delaunay triangulation (TIN) gathers the 3 nearest neighbors, and each generator (vertex) shares an edge with its neighbors in the Voronoi diagram. Hence, the vertices that can take the next rank must share an edge with the current generator (vertex).
As shown in Figure 9, any point in the gray area satisfies nonlossy service quality: the user’s real location is protected while he/she receives the highest quality of service. We use the theory of Voronoi diagrams and Delaunay triangulations instead of simple enumeration to reduce the time complexity of the ranking calculation. Figure 10 shows the result for a given setting of the rank weights and the maximal tolerance. The distance between each point of the obfuscated region and every query result is within the maximum query range.
5.2. Effect of the Maximal Tolerance
We studied the effect of the maximal tolerance by varying it in the range of 0.1 to 0.25, with the weighting parameter set to 2. Figure 11 presents the obfuscated region as the maximal tolerance increases from 0.1 to 0.25. As can be observed, the obfuscated region grows steadily as the tolerance increases, which provides more protection of the user’s location. The growth is slow when the tolerance increases from 0.1 to 0.2, but at 0.25 the obfuscated region nearly doubles because higher rankings are allowed to change.
5.3. Effect of the Weighting Parameter
We studied the effect of the weighting parameter by varying its two components in the ranges of 2 to 4 and 1 to 3, respectively, with the maximal tolerance fixed at 0.2. Figure 12 presents the obfuscated region as the weighting parameter increases from 1/2 to 3/4. As can be observed, the obfuscated region shrinks steadily as the weighting parameter increases, which yields better quality of service. The change is significant at the first increase of the weighting parameter, and there is no obvious change at the next one.
5.4. Errors of Perturbation
We now compare the perturbation errors of our scheme with those of Laplace noise over 10 experiments. We set the global sensitivity to 300, vary the privacy parameter $\varepsilon$ from 0.5 to 1, and vary the maximal tolerance from 0 to 0.15. Figure 13 depicts the comparison results: decreasing $\varepsilon$ and increasing the maximal tolerance both increase the perturbation error. Besides, our scheme achieves smaller errors than the Laplace noise.
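For reference, the Laplace baseline in this comparison can be approximated as follows. The text does not specify how the noise is applied to a location, so this sketch assumes independent Laplace noise on each planar coordinate with scale equal to the global sensitivity divided by the privacy parameter, and it measures the perturbation error as the Euclidean displacement. It is not the planar Laplace mechanism of GeoInd, and all function names are hypothetical.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as a difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def perturb(location, sensitivity, epsilon, rng):
    """Add independent Laplace noise to each coordinate (an assumption)."""
    scale = sensitivity / epsilon
    return (location[0] + laplace_noise(scale, rng),
            location[1] + laplace_noise(scale, rng))


def mean_error(location, sensitivity, epsilon, runs, rng):
    """Average Euclidean displacement of the perturbed location."""
    total = 0.0
    for _ in range(runs):
        noisy = perturb(location, sensitivity, epsilon, rng)
        total += math.dist(location, noisy)
    return total / runs
```

Since the noise scale is inversely proportional to the privacy parameter, halving it doubles the expected displacement, which is consistent with the trend reported for the Laplace noise in Figure 13.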
5.5. Comparison of Ranking Calculating Time
We then look at the ranking calculation time of our two enumeration algorithms: enumeration with pruning and enumeration with the Voronoi diagram. We perform 10 experiments, and the results are shown in Figure 14. The time cost of enumeration with the Voronoi diagram is only a fraction of that of enumeration with pruning. The experiments indicate that enumeration with the Voronoi diagram can effectively reduce the computation time.
6. Conclusions
In this paper, we propose a novel location privacy protection model based on the loss of service quality. A trusted third party (TTP) is placed between the user and the server; it collects the user’s real location and sends a perturbed location to the LBS server. The model allows a user to express his/her service quality requirement by specifying the maximum service quality loss that he/she would tolerate, which can also be set to 0. We find all possible rankings that satisfy the constraint to obtain the final obfuscated region and then select one point at random as the obfuscated location. To ensure excellent service, the maximum loss is usually set to a small value.
In this paper, we consider only the privacy of the user’s location, and a novel strategy based on the Voronoi diagram is used to generate the noisy location so as to improve the service quality. Query privacy is also a key concern, since query interests can damage personal privacy and even individual safety. How to ensure query privacy is another area we would like to investigate further.
Data Availability
The data used in our paper consists only of the longitude and latitude of points of interest (POI); these are public and can be obtained by anyone. The third party in our work is the Baidu Maps API, which is also public.
Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.
Acknowledgments
This article is partly supported by the National Natural Science Foundation of China under Grant No. 61370084 and No. 61872105, the Experimental Verification of the Basic Commonness and the Key Technical Standards of the Industrial Internet Network Architecture, the Key Technology of Home-Based Care Service System (2016RAXXJ013), and Fundamental Research Funds for the Central Universities (Grant No. 3072019CF0602).
References
Z. Cai and Z. He, “Trading private range counting over big IoT data,” in Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019.
J. Krumm, “Inference attacks on location tracks,” in Proceedings of the Pervasive Computing, 5th International Conference (PERVASIVE), pp. 127–143, Toronto, Canada, 2007.
Y. Matsuo, N. Okazaki, K. Izumi et al., “Inferring long-term user properties based on users location history,” in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pp. 2159–2165, Hyderabad, India, 2007.
Z. Cai and X. Zheng, “A private and efficient mechanism for data uploading in smart cyber-physical systems,” IEEE Transactions on Network Science and Engineering, 2018.
G. Ghinita, P. Kalnis, A. Khoshgozaran, C. Shahabi, and K.-L. Tan, “Private queries in location based services: anonymizers are not necessary,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 121–132, Vancouver, BC, Canada, 2008.
M. E. Andrés, N. E. Bordenabe, K. Chatzikokolakis, and C. Palamidessi, “Geo-indistinguishability: differential privacy for location-based systems,” in Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 901–914, ACM, Berlin, Germany, 2013.
C. Dwork, “Differential privacy,” in Proceedings of Automata, Languages and Programming, 33rd International Colloquium (ICALP), pp. 1–12, Venice, Italy, 2006.
S. Oya, C. Troncoso, and F. Pérez-González, “Is geo-indistinguishability what you are looking for?” in Proceedings of the 2017 Workshop on Privacy in the Electronic Society, pp. 137–140, Dallas, Tex, USA, 2017.
J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in Proceedings of the 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton), p. 1592, Allerton Park & Retreat Center, Monticello, IL, USA, 2013.
M. A. Mostafavi, C. Gold, and M. Dakowicz, “Delete and insert operations in Voronoi/Delaunay methods and applications,” Computers and Geosciences, vol. 29, no. 4, pp. 523–530, 2003.