Security and Privacy Challenges for Internet-of-Things and Fog Computing 2020
Location Privacy-Preserving Method Based on Historical Proximity Location
With the rapid development of Internet services, mobile communications, and IoT applications, Location-Based Services (LBSs) have become an indispensable part of our daily life in recent years. However, while users benefit from LBSs, the collection and analysis of their location data and trajectory information may jeopardize their privacy. To address this problem, a new privacy-preserving method based on historical proximity locations is proposed. The main idea of this approach is to substitute an existing historical location adjacent to the user for his/her current location and then submit the selected location to the LBS server. This method ensures that the user can obtain location-based services without submitting real location information to the untrusted LBS server, which improves the level of privacy preservation while reducing the computation and communication overhead on the server side. Furthermore, our scheme not only provides privacy preservation in snapshot queries but also protects trajectory privacy in continuous LBSs. Compared with other location privacy-preserving methods such as k-anonymity and dummy locations, our scheme improves the quality of LBS and query efficiency while maintaining a satisfactory privacy level.
With the development of Internet services, mobile communications, and IoT applications, Location-Based Services (LBSs) have become some of the most popular electronic applications. Users carrying mobile devices loaded with location-based applications, such as Google Maps, WeChat, and Ctrip, can send query requests to location service providers (LSPs) and obtain the corresponding service data. With such applications, mobile users can easily obtain information about various Points of Interest (POIs) in the vicinity; for example, users can acquire the bus schedule, the nearest restaurant serving their favorite cuisine, and the recreational facilities from a nearby edge server.
However, since the LSP is potentially untrustworthy, and the queries submitted by users usually include personal information such as their locations and queried interests, the LSP can easily infer who is doing what and where, which may jeopardize users’ privacy. For example, physical destinations such as medical clinics may indicate a person’s health problems. Likewise, regularly staying at certain types of places may be linked directly to one’s lifestyle or political associations. Although users may be informed of the policies regarding the collection and distribution of their location data, the execution of these policies is typically beyond the users’ control and relies solely on the service providers. Therefore, the privacy of users has not been truly protected and requires further technical attention. Furthermore, LSPs usually need to process a large number of location service requests, and this computational overload may leave them too busy to respond, resulting in denial of service.
To address the privacy issue, many technical schemes [1, 2] have been proposed in the literature over recent years. Most of them are based on location perturbation and obfuscation, which employ traditional privacy techniques such as k-anonymity [3, 4]. However, solutions based on k-anonymity have some inherent flaws. First, all mobile users, regardless of whether they request LBSs, need to frequently report their latest locations to the anonymity server, and users who do not request LBSs may not be willing to spend their resources to help others maintain anonymity. Second, excessive location updates from a large number of mobile users create overwhelming communication and processing bottlenecks on the server side. Third, the area of the cloaking regions generated by the existing approaches is highly dependent on the network density. When a user lies in an unpopulated region, the cloaking area may be very large since it needs to contain the user and at least $k - 1$ other users. Therefore, these traditional k-anonymity schemes cannot be directly applied to the protection of location privacy.
Trajectory privacy preservation is another challenge in LBSs, because the spatial and temporal information contained in the continuous queries received by the LSP may expose users’ whereabouts and other private information. It is practically impossible to support anonymity for continuous LBSs, such as GM’s OnStar services, using existing techniques. Continuous LBSs require frequent location updates from their clients. Simply ensuring that each reported location belongs to a cloaking region containing at least $k$ users cannot really achieve k-anonymity protection for the client, and it significantly increases the computation and communication load of servers. Therefore, how to design a secure and efficient location privacy protection scheme is worth exploring, especially in the continuous LBS scenario.
To address the above problems, we propose a new privacy-preserving method based on historical proximity locations. This method ensures that the user can obtain location-based services without submitting real location information to the untrusted LBS server, which improves the location privacy level and reduces the computation and communication load on the server side. The key contributions of this work are summarized as follows:
(1) To avoid the computational overhead of generating pseudolocations on the server side, this paper proposes a scheme that substitutes an existing historical location adjacent to the user for his/her current location and then submits the selected location to the LBS server.
(2) A historical proximity location query model is adopted to guarantee location privacy in both snapshot queries and continuous queries. In addition, our solution makes it more difficult for attackers to distinguish the user’s true position from historical locations, and at the same time it does not generate unreasonable positions.
(3) Compared with the existing schemes, performance analysis results show that our proposal can significantly improve query efficiency while ensuring privacy protection.
The remainder of this paper is organized as follows. Related work is reviewed in Section 2. The system model and the proposed privacy-preserving method are introduced in Section 3. Section 4 presents the experimental results, performance evaluation, and privacy analysis. Finally, we conclude this paper and present future work in Section 5.
2. Related Work
During the past decades, many promising approaches for preserving location privacy in LBSs have been proposed. We roughly divide them into two categories: centralized architecture and noncentralized architecture.
In the centralized/edge anonymity server architecture, a centralized entity [6–9] is introduced into the system to protect location privacy. Under this architecture, k-anonymity is the most popular means of protecting users’ privacy in LBSs. Gruteser and Grunwald originally employed this concept in LBSs. As an extension of the traditional k-anonymity model [10–13], they proposed to reduce the accuracy of users’ location information along spatial and/or temporal dimensions to achieve a certain level of anonymity protection. However, all these centralized schemes share some drawbacks: (1) The anonymity server has complete knowledge of users’ locations and queries, so it becomes an attractive target for the adversary, and the user’s real information is jeopardized once it is attacked. (2) All users have to continuously send their queries and update their locations to the anonymity server, which makes the anonymizer a performance bottleneck and a potential single point of failure for the entire system.
In the noncentralized architectures, users cloak their locations without relying on a trusted third party (TTP). Approaches such as obfuscation-based methods [2, 14], cryptographic methods [15, 16], and collaboration-based methods [17–19] have been proposed to protect the user’s privacy. Obfuscation is achieved by adding noise so that the exact location is not revealed to the LBS servers. For example, Ardagna et al. presented a solution aimed at preserving the location privacy of users by perturbing location information. The main drawback of obfuscation-based methods is that the quality of service (QoS) is degraded because of the low accuracy of the query answers. Cryptographic methods are also used to protect private data in LBSs; however, they are not practical for mobile devices since they require powerful computational capability and incur large overhead on the client side. In collaboration-based methods, each user communicates with his/her peers and collects their location data to generate the cloaking region. The main idea is that, before sending a request to an LSP, the mobile user forms a group with his/her peers via single-hop communication or multihop routing and generates a cloaking area including $k$ users. Shokri et al. designed a distributed location privacy-preserving algorithm for a collaborative group, called MobiCrowd, which allows users to answer LBS queries from neighboring peers so that querying users can protect their location privacy from the LSP. These approaches focus mainly on snapshot queries, and the problem of protecting location privacy in continuous LBSs is not considered in the TTP-free methods. With the rise of edge computing, Wang et al. proposed an edge-based model for data collection, in which the raw data from wireless sensor networks (WSNs) is differentially processed by algorithms on edge servers for privacy computing.
To avoid potential information leakage and misuse, the user’s exact location should not be exposed to the edge node. Tian et al. proposed a stochastic location privacy protection scheme for edge computing, in which the geographical distribution of surrounding users is obtained by analyzing the proposed long-term density map and short-term density map. This scheme is practicable in real scenarios where the edge computing server is honest but curious.
Furthermore, a few privacy-preserving techniques have attempted to use the TTP model for continuous LBSs [22–24]. Zhang et al. proposed an algorithm for trajectory k-anonymity in LBSs, the main idea of which is to continuously expand an initial cloaked area so that it includes at least the same $k$ users. This means that while a request for an LBS is in progress, no grouped user who participated in the original anonymity set of the requestor is allowed to leave the group, since this action would jeopardize the privacy of the requestor. Xu and Cai exploited historical locations to construct a k-anonymous trajectory and then presented algorithms for spatial cloaking. However, when a user moves on the cloaked path, the LBS can still easily identify the user’s actual location if no other user exists on that path.
To address the above limitations, we propose a new privacy-preserving method based on historical proximity locations to protect location privacy in both snapshot queries and continuous queries.
3. System Overview
Definition 1. The request message submitted by the user to the LSP can be expressed as a five-tuple consisting of the user’s identity information; the user’s location, which can be obtained directly from a Global Positioning System (GPS) or other positioning devices; the time at which the user sends the request; the query content the user wants to submit; and the user’s query radius $r$, so that the corresponding query area is $\pi r^2$.
Definition 2. $d_{\min}$ denotes the minimum distance allowed between the user’s current location and the historical proximity location selected to be reported to the LSP. This lower bound prevents the selected historical proximity location from being too close to the user’s current location, which better protects location privacy. Likewise, to guarantee the query quality, $d_{\max}$ denotes the maximum distance allowed between the user’s current location and the selected historical proximity location.
Definition 3. Two sets of POIs are distinguished: the set of POIs the user can obtain under ideal conditions, and the set of POIs returned by the LSP when it searches according to the submitted location, query content, and query radius.
Definition 4. The query quality is the ratio of the number of POIs that the user can obtain under ideal conditions to the number of POIs the user receives from the LSP.
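As an illustration of Definition 4, the ratio can be computed as in the following minimal Python sketch. The function name `query_quality` and the representation of POIs as hashable identifiers are assumptions made for this example only; the behavior for an empty result set is also an assumption, since the definition does not cover it.

```python
def query_quality(ideal_pois, returned_pois):
    """Ratio of |ideal POIs| to |POIs returned by the LSP| (Definition 4).

    Both arguments are iterables of hashable POI identifiers (an assumption).
    """
    ideal = set(ideal_pois)
    returned = set(returned_pois)
    if not returned:
        # The definition does not cover an empty result; we refuse to guess.
        raise ValueError("the LSP returned no POIs; the ratio is undefined")
    return len(ideal) / len(returned)


if __name__ == "__main__":
    # Two POIs are wanted, but the enlarged query region returns four.
    print(query_quality({"p1", "p2"}, {"p1", "p2", "p3", "p4"}))
```

Under this reading, a value close to 1 means that the LSP returned few POIs beyond those the user actually needs.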
3.2. Location Privacy Protection Model
Similar to existing work [25, 26], our system lets mobile users access LBSs through an anonymity server, which is regarded as a TTP. However, unlike existing centralized architectures, ours effectively reduces the computation and communication load thanks to the adopted privacy protection method.
A database that stores a large number of historical proximity locations is essential for the TTP to provide its privacy service in our model. The database has the following characteristics:
(1) Initially, the database may be empty, and users obtain the location service under k-anonymity protection. During this phase, mobile users report their locations periodically to the TTP, and the positions used in the anonymization process are subsequently added to the database as historical proximity locations. Unlike existing techniques, this periodic location update is no longer needed after the initial phase, which may last only a short time. More location data is obtained as more and more mobile users issue LBS requests.
(2) After the initial phase, enough historical locations are recorded in the database. As shown in Figure 1, suppose that a user requests location services at his/her current location. If the database contains historical proximity locations whose distance $d$ from the user’s current location satisfies $d_{\min} \le d \le d_{\max}$ (the minimum and maximum allowed distances of Definition 2), the TTP selects the nearest such historical location, substitutes it for the current location, and sends it to the LSP. The user’s current location is subsequently added to the database as a historical location after the query process. However, if no historical proximity location in the database satisfies this condition, the k-anonymity technique is activated to provide privacy protection for the user.
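The selection rule described in item (2) can be sketched as follows. This is a minimal illustration rather than the authors’ implementation: it assumes planar coordinates in meters and Euclidean distance, and the names `select_historical_location`, `d_min`, and `d_max` are introduced here for the example. A return value of `None` stands for the case in which the TTP would fall back to k-anonymity.

```python
import math


def select_historical_location(current, history, d_min, d_max):
    """Return the nearest historical location h with d_min <= dist(current, h) <= d_max.

    `current` is an (x, y) pair and `history` an iterable of (x, y) pairs,
    all in planar coordinates (an assumption of this sketch).
    Returns None when no candidate qualifies.
    """
    best, best_dist = None, math.inf
    for h in history:
        dist = math.dist(current, h)
        if d_min <= dist <= d_max and dist < best_dist:
            best, best_dist = h, dist
    return best


if __name__ == "__main__":
    history = [(10.0, 0.0), (60.0, 0.0), (300.0, 0.0)]
    reported = select_historical_location((0.0, 0.0), history, d_min=50.0, d_max=200.0)
    if reported is None:
        print("no suitable historical location: fall back to k-anonymity")
    else:
        print("report", reported, "to the LSP instead of the true location")
```

A linear scan is used only for clarity; in practice the candidates would come from the spatial index rather than from the whole database.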
Obviously, the number of historical proximity locations recorded in the database will increase continuously, and under this circumstance, k-anonymity protection is needed less and less frequently.
Furthermore, for efficient retrieval of location data, we index the database using a simple grid-based approach. The entire domain is recursively partitioned into cells in a quad-tree style. Unless a cell is already at its minimal size (our implementation enforces a minimum cell size), it is split when the number of locations inside it exceeds a predetermined threshold. Thus, given the cell corresponding to the user’s current location, we can efficiently retrieve the location data and obtain historical proximity locations.
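A quad-tree style partition of this kind can be sketched in a few lines of Python. The class name `GridCell`, the parameters `capacity` (the split threshold) and `min_size` (the minimal cell side length), and the example values are all hypothetical; the sketch also assumes that every inserted point lies inside the root cell.

```python
class GridCell:
    """A square cell that splits into four children when it holds too many points."""

    def __init__(self, x, y, size, capacity=4, min_size=1.0):
        self.x, self.y, self.size = x, y, size   # lower-left corner, side length
        self.capacity = capacity                 # split threshold (assumed value)
        self.min_size = min_size                 # minimal side length (assumed value)
        self.points = []
        self.children = None                     # None while the cell is a leaf

    def _child_for(self, p):
        half = self.size / 2
        ix = 1 if p[0] >= self.x + half else 0
        iy = 1 if p[1] >= self.y + half else 0
        return self.children[iy * 2 + ix]

    def _split(self):
        half = self.size / 2
        self.children = [
            GridCell(self.x + dx * half, self.y + dy * half, half,
                     self.capacity, self.min_size)
            for dy in (0, 1) for dx in (0, 1)
        ]
        pending, self.points = self.points, []
        for p in pending:
            self._child_for(p).insert(p)

    def insert(self, p):
        if self.children is not None:
            self._child_for(p).insert(p)
            return
        self.points.append(p)
        # Split only if the children would not fall below the minimal size.
        if len(self.points) > self.capacity and self.size / 2 >= self.min_size:
            self._split()

    def leaf_for(self, p):
        """Return the leaf cell whose area contains point p."""
        node = self
        while node.children is not None:
            node = node._child_for(p)
        return node


if __name__ == "__main__":
    root = GridCell(0.0, 0.0, 100.0, capacity=2, min_size=10.0)
    for point in [(10, 10), (20, 20), (80, 80)]:
        root.insert(point)
    print(root.leaf_for((15, 15)).points)
```

Looking up the leaf for the user’s current location yields the candidates stored in the same cell; a fuller implementation would also inspect neighboring cells, since the nearest historical location may lie just across a cell boundary.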
3.3. Privacy Preservation in Snapshot Queries
3.3.1. Query Area
As shown in Figure 2, the user is located at a given position with query radius $r$, the TTP has selected the nearest historical proximity location of that position, and $d$ is the distance between the two ($d_{\min} \le d \le d_{\max}$).
(1) As shown in Figure 2(a), when $d > r$, a circle is generated with the user’s position as the center and $r$ as the radius. Two tangent lines are drawn from the historical location to this circle, and $\theta$ (in radians) denotes the angle between them, as shown in Figure 2(b). To cover all possible target positions, the fan bounded by the two tangent lines, with the historical location as its apex and $d + r$ as its radius, suffices as the effective query region, whereas the actual query region is the whole circle centered on the historical location with radius $d + r$. The area of the fan can be computed as $S = \frac{1}{2}\theta (d + r)^2$.
(2) As shown in Figure 2(c), when $d \le r$, if the user wants to query all the target positions, we regard the entire circle centered on the historical location, with radius $d + r$, as both the effective query region and the actual query region; its area is $\pi (d + r)^2$.
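Figure 2: (b) The actual query area when the historical location lies outside the user’s query circle ($d > r$). (c) The actual query area when the historical location lies inside the user’s query circle ($d \le r$).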
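The two cases of Section 3.3.1 can be checked numerically with the short sketch below. It rests on our reading of Figure 2, which the text does not state explicitly: the actual query radius is taken to be $d + r$, and the angle between the two tangent lines is taken to be $\theta = 2\arcsin(r/d)$. The function name `query_region` is hypothetical.

```python
import math


def query_region(d, r):
    """Return (actual_radius, effective_area, actual_area) for distance d and radius r.

    Assumptions of this sketch: the LSP searches a circle of radius d + r around
    the reported historical location, and for d > r the effective region is the
    fan of angle theta = 2 * asin(r / d) with the same radius.
    """
    actual_radius = d + r
    actual_area = math.pi * actual_radius ** 2
    if d > r:
        theta = 2.0 * math.asin(r / d)          # angle between the two tangents
        effective_area = 0.5 * theta * actual_radius ** 2
    else:
        effective_area = actual_area            # the whole circle is needed
    return actual_radius, effective_area, actual_area


if __name__ == "__main__":
    radius, effective, actual = query_region(d=100.0, r=50.0)
    print(radius, effective / actual)
```

For $d = 100$ and $r = 50$, the fan covers one sixth of the actual query circle, which gives a sense of how much of the LSP’s answer the later filtering steps can discard.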
3.3.2. Query Process and Filtering of Query Results
The LSP cannot directly search an irregular area such as the fan-shaped region described above. However, it is feasible to filter the query results first on the TTP side and then on the client side, which reduces the overhead on the users’ mobile devices. As shown in Figure 3, consider the case in which the historical location lies outside the user’s query circle ($d > r$, where $d$ is the distance between the user and the selected historical location and $r$ is the query radius), and the user is searching for nearby gas stations. The query and filtering process is as follows. On receiving the request from the user, the TTP searches the database, selects a historical proximity location of the user’s position, and delivers it to the LSP. The LSP then searches the entire circle centered on the reported location with radius $d + r$, i.e., the actual query region, for target positions that meet the request. In the example of Figure 3, the LSP returns four candidate gas stations to the TTP. The TTP filters out the one that lies outside the user’s query area. The user’s mobile device then calculates the distance from the user’s location to each of the three remaining candidates with the help of a preinstalled map and compares each distance with the query radius $r$: if the distance is smaller than $r$, the corresponding information is retained; otherwise, it is deleted. One more candidate is filtered out in this way, so the location information of the two remaining gas stations is finally presented to the user.
When the historical location lies inside the user’s query circle ($d \le r$), the process is similar and is not repeated here.
It is worth noting that, because our scheme queries indirectly, any error in the distance $d$ between the user’s location and the reported historical location also leads to an error in the actual query radius, which may make the actual query area too large or too small and thus degrade the quality of service.
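The two-stage filtering of Section 3.3.2 can be sketched as follows. This is an illustration under several assumptions: planar coordinates, a fan with half-angle $\arcsin(r/d)$ and radius $d + r$ as the TTP’s test, and straight-line distance on the client in place of the map-based distance mentioned in the text. The names `in_fan`, `ttp_filter`, and `client_filter` are hypothetical.

```python
import math


def in_fan(h, u, p, r):
    """True if POI p lies in the fan with apex h that encloses the circle (u, r)."""
    d = math.dist(h, u)
    if math.dist(h, p) > d + r:
        return False                     # outside the actual query circle
    if d <= r:
        return True                      # whole circle is the effective region
    half_angle = math.asin(r / d)
    to_user = math.atan2(u[1] - h[1], u[0] - h[0])
    to_poi = math.atan2(p[1] - h[1], p[0] - h[0])
    # Smallest absolute angular difference, wrapped into [0, pi].
    diff = abs((to_poi - to_user + math.pi) % (2.0 * math.pi) - math.pi)
    return diff <= half_angle


def ttp_filter(h, u, pois, r):
    """Stage 1 (TTP): drop POIs outside the effective query region."""
    return [p for p in pois if in_fan(h, u, p, r)]


def client_filter(u, pois, r):
    """Stage 2 (client): keep POIs within the query radius of the true location."""
    return [p for p in pois if math.dist(u, p) <= r]


if __name__ == "__main__":
    user, reported, r = (0.0, 0.0), (100.0, 0.0), 50.0
    returned_by_lsp = [(0.0, 10.0), (100.0, 100.0), (-40.0, 40.0)]
    after_ttp = ttp_filter(reported, user, returned_by_lsp, r)
    print(client_filter(user, after_ttp, r))
```

With straight-line distances the client test alone would suffice; the TTP stage is kept because, as the text notes, it removes candidates before they reach the mobile device.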
3.4. Privacy Preservation in Continuous Queries
Existing techniques mostly focus on snapshot queries. However, privacy preservation in continuous LBSs is more challenging than that in snapshot queries because adversaries could use the spatial and temporal correlations on the user trajectory to infer the user’s private information. To deal with the concern, a privacy-preserving method for continuous LBSs based on historical adjacent locations is described in this section.
3.4.1. Average Query Error
As mentioned above, in snapshot queries, the distance $d$ between the user’s current location and the reported location, which the TTP selects from the historical proximity locations in the database, introduces an error that may degrade the quality of queries. The same holds for privacy preservation in continuous LBS queries. We give a formal definition of the average error degree in continuous LBSs as follows.
Definition 5. Given a trajectory $T = \{l_0, l_1, \ldots, l_{n-1}\}$ generated by the user over a period of time, where $l_i$ represents the user’s location at time point $t_i$, the TTP computes a new trajectory $T' = \{h_0, h_1, \ldots, h_{n-1}\}$ from $T$ using historical proximity locations, where $h_i$ represents the historical proximity location of $l_i$ at time point $t_i$. The average query error can be defined as
$$\bar{d} = \frac{1}{n} \sum_{i=0}^{n-1} d_i,$$
where $d_i$ is the distance between $l_i$ and $h_i$, $0 \le i \le n - 1$.
Obviously, the smaller the distance $d_i$ between each user location and its reported historical proximity location, the smaller the error degree will be. To preserve the quality of queries, each $d_i$ should be as small as possible. Therefore, as the user moves along the trajectory, it is preferable to substitute the nearest historical proximity location for each user location when reporting to the LSP.
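Read as the mean distance between each true location and the location reported in its place, the average query error can be computed as in the sketch below. The function name `average_query_error` and the use of planar Euclidean distance are assumptions of this example.

```python
import math


def average_query_error(trajectory, reported):
    """Mean distance between each true location and the location reported for it.

    `trajectory` and `reported` are equally long, non-empty sequences of (x, y)
    pairs; planar Euclidean distance is an assumption of this sketch.
    """
    if not trajectory or len(trajectory) != len(reported):
        raise ValueError("both trajectories must be non-empty and equally long")
    total = sum(math.dist(l, h) for l, h in zip(trajectory, reported))
    return total / len(trajectory)


if __name__ == "__main__":
    true_path = [(0.0, 0.0), (10.0, 0.0)]
    reported_path = [(3.0, 4.0), (10.0, 5.0)]
    print(average_query_error(true_path, reported_path))
```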
3.4.2. Trajectory Privacy-Preserving Algorithm
If there are enough historical proximity locations around the user’s trajectory, it is easy to find a corresponding historical proximity location $h_i$ for each user location $l_i$ such that $h_i$ does not coincide with any $h_j$, where $i \ne j$.
However, if the historical proximity locations around the trajectory are sparse, two selected locations may coincide. As shown in Figure 4, the directed lines denote a trajectory formed by the user over a period of time, and the solid nodes nearby denote the existing historical proximity locations. Let $l_0$ and $l_1$ be the user’s locations at the 0th and the 1st time points. Since a single historical location is the nearest historical proximity location of both $l_0$ and $l_1$, it will be selected and sent to the LSP on behalf of $l_0$ as well as $l_1$, so that the same historical proximity location is selected at different time points, i.e., $h_0 = h_1$. In this case, once the LSP receives the same location at different time points, it can easily infer that the user is wandering in the vicinity of that location during this period of time, which leaks the user’s privacy.
To solve this problem, we make further constraints and give Definition 6.
Definition 6. Given a trajectory $T = \{l_0, l_1, \ldots, l_{n-1}\}$ generated by the user over a period of time, where $l_i$ represents the location of the user at time point $t_i$, a new trajectory $T' = \{h_0, h_1, \ldots, h_{n-1}\}$ is a historical proximity trajectory (HPT) of $T$ if $h_i \ne h_j$ for all $i \ne j$, where $h_i$ represents the historical proximity location selected for location $l_i$.
Therefore, for privacy preservation in continuous LBSs, the key to our solution is to find a historical proximity trajectory (HPT) for the user’s trajectory that minimizes the average query error while satisfying both Definition 5 and Definition 6. The solution to this problem is described below.
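Given a trajectory $T = \{l_0, l_1, \ldots, l_{n-1}\}$ of the user, let $H = \{h_0, h_1, \ldots, h_{m-1}\}$ be an ordered set of historical proximity locations along the direction of trajectory $T$, satisfying $d_{\min} \le d \le d_{\max}$ ($d$ is the distance from the trajectory to any location in $H$) and $m \ge n$, where $m$ denotes the number of historical proximity locations around the trajectory $T$, and $n$ represents the number of locations the user left on trajectory $T$. Let $D(i, j)$ denote the minimum sum of error degree when the first $i$ locations of $T$ are matched, in order, to distinct locations among the first $j$ locations of $H$; the minimum sum of error degree between the historical proximity trajectory and the user’s trajectory is then $D(n, m)$. If $h_{j-1}$ is selected as the historical proximity location of $l_{i-1}$ and reported to the LSP, then the solution that minimizes $D(i, j)$ contains the solution that minimizes $D(i-1, j-1)$. If $h_{j-1}$ is not selected to substitute $l_{i-1}$, then the optimal solution for $D(i, j)$ contains the optimal solution for $D(i, j-1)$. Therefore, the recursive relationship can be written as
$$D(i, j) = \min\{D(i-1, j-1) + d(l_{i-1}, h_{j-1}),\; D(i, j-1)\},$$
where $d(l_{i-1}, h_{j-1})$ represents the distance between $l_{i-1}$ and $h_{j-1}$, $D(0, j) = 0$, and $D(i, j) = \infty$ for $j < i$.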
The pseudocode of the above procedure is given in Algorithm 1.
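Algorithm 1: selectHistoryLocation.
Algorithm 2: getHistoryTrack.
In Algorithm 1, array B holds the subscripts of the selected historical proximity locations, and the complexity of the algorithm is $O(mn)$, where $m$ is the number of historical proximity locations around the trajectory and $n$ is the number of locations on the user’s trajectory. Algorithm 2 is used to obtain the historical proximity trajectory that guarantees the minimum average query error.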
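Since only the captions of the two algorithms are available here, the following Python sketch shows one way to realize the recursive relationship described above with dynamic programming and backtracking. It is not the authors’ pseudocode: the function name `select_history_locations` is hypothetical, the candidates are assumed to be already filtered by the distance bounds and ordered along the trajectory, and planar Euclidean distance is assumed. The returned index list plays the role that the text attributes to array B.

```python
import math


def select_history_locations(trajectory, candidates):
    """Match each trajectory point to a distinct candidate, preserving order,
    so that the summed distance is minimal.

    Returns (min_total_distance, chosen_indices), or None when there are fewer
    candidates than trajectory points. Runs in O(m * n) time.
    """
    n, m = len(trajectory), len(candidates)
    if m < n:
        return None
    inf = math.inf
    # cost[i][j]: best total for the first i points using the first j candidates.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        cost[0][j] = 0.0
    for i in range(1, n + 1):
        for j in range(i, m + 1):
            use = cost[i - 1][j - 1] + math.dist(trajectory[i - 1], candidates[j - 1])
            skip = cost[i][j - 1]
            cost[i][j] = min(use, skip)
    # Backtrack to recover which candidate was assigned to each point.
    chosen, i, j = [], n, m
    while i > 0:
        if cost[i][j] == cost[i][j - 1]:
            j -= 1                       # candidate j-1 was skipped
        else:
            chosen.append(j - 1)         # candidate j-1 stands in for point i-1
            i, j = i - 1, j - 1
    chosen.reverse()
    return cost[n][m], chosen


if __name__ == "__main__":
    path = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
    history = [(1.0, 0.0), (2.0, 0.0), (11.0, 0.0), (25.0, 0.0)]
    print(select_history_locations(path, history))
```

Because every candidate is used at most once, the sparse situation of Figure 4 cannot arise: two consecutive points that share the same nearest neighbor are assigned different locations, at the price of a larger error for one of them.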
One special situation still needs to be discussed. The number of historical proximity locations recorded in the database may be insufficient for the proposed algorithm. As shown in Figure 5, when the number of historical proximity locations $m$ is much smaller than the number of locations $n$ on the user’s trajectory, every possible selection uses some historical proximity location two or more times along the trajectory. To handle this case, we propose to employ a symmetry mechanism to generate dummy locations, and the specific procedure is described as follows.
As shown in Figure 6, when no historical location is available other than one existing historical proximity location $h$ of a user location $l$, the TTP connects $h$ to $l$ and extends the connecting line to a point $h'$ such that $|l h'| = |h l|$, where $h'$ is the dummy location generated by symmetry as a historical adjacent location of $l$. However, the dummy location generated by symmetry may be unreasonable (for example, it may lie in a lake), so some adjustment is necessary. As shown in Figure 7, if the generated dummy location $h'$ is unreasonable, the TTP rotates it about $l$ and adjusts the distance from $l$ to $h'$ until it meets the rationality requirements, and the resulting reasonable dummy location is used as the historical proximity location of $l$.
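The symmetry mechanism amounts to a point reflection of the historical location through the user’s location, followed by an adjustment when the result is unreasonable. The sketch below illustrates this under stated assumptions: planar coordinates, a caller-supplied predicate `is_reasonable` standing in for the map check, and rotation only (the distance adjustment mentioned in the text is omitted). The names `symmetric_dummy`, `adjusted_dummy`, and `step_deg` are hypothetical.

```python
import math


def symmetric_dummy(location, historical):
    """Reflect `historical` through `location`, so that |l h'| = |h l|."""
    return (2 * location[0] - historical[0], 2 * location[1] - historical[1])


def adjusted_dummy(location, historical, is_reasonable, step_deg=15.0):
    """Rotate the symmetric dummy about `location` until `is_reasonable` accepts it.

    Angles 0, +step, -step, +2*step, ... are tried, stopping short of 180 degrees
    (which would reproduce the historical location itself). Returns None if no
    rotation yields a reasonable position.
    """
    x0, y0 = location
    sx, sy = symmetric_dummy(location, historical)
    vx, vy = sx - x0, sy - y0
    for k in range(int(180 // step_deg)):
        for sign in ((1,) if k == 0 else (1, -1)):
            a = math.radians(sign * k * step_deg)
            candidate = (x0 + vx * math.cos(a) - vy * math.sin(a),
                         y0 + vx * math.sin(a) + vy * math.cos(a))
            if is_reasonable(candidate):
                return candidate
    return None


if __name__ == "__main__":
    def not_in_lake(p):
        # Toy stand-in for a map lookup: everything east of x = 12 is water.
        return p[0] <= 12.0

    print(symmetric_dummy((10.0, 10.0), (7.0, 6.0)))
    print(adjusted_dummy((10.0, 10.0), (7.0, 6.0), not_in_lake))
```

Rotation keeps the dummy at the same distance from the user as the original historical location, so the distance bounds that held for the original continue to hold for the dummy.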
In addition, there remains a small possibility that the number of historical proximity locations is still smaller than $n$ (the number of locations on the trajectory), even after the symmetry mechanism has expanded the number of historical locations from $m$ to $2m$, i.e., $2m < n$. To deal with this issue, we can activate the k-anonymity technique to protect location privacy and add the user’s locations to the database.
4. Experiment and Analysis
In this section, we present an experimental evaluation of the feasibility and efficiency of the proposed method under various parameter settings. First, we analyze the influence of several parameters on the average query error. Second, we compare our method with other location privacy-preserving techniques in terms of query efficiency, query quality, and anonymity degree. The experimental region lies within 10 square kilometers of the Sanpailou Campus of Nanjing University of Posts and Telecommunications. The data used in the experiments were captured with the coordinate pickup tool provided by Google. Our experiments are implemented with the Java Development Kit (JDK) 1.7 and the Eclipse Integrated Development Environment (IDE), running on a local machine with an Intel Core i5 2.8 GHz CPU, 8 GB RAM, and the Microsoft Windows 7 OS.
4.1. Influence Factors of the Average Query Error
Within the experimental region, 10 coordinate points are generated randomly to construct a trajectory of the user, i.e., the number of locations on the trajectory is $n = 10$. Then, 20 to 40 locations from the database are selected as historical adjacent positions around this trajectory.
$d_{\min}$, the minimum allowed distance between the user’s current location and the reported historical location, is set by the user. A smaller $d_{\min}$ is more likely to be chosen in a densely populated area to ensure the quality of service; conversely, in a sparsely populated area, a larger $d_{\min}$ yields a better privacy level. As shown in Figure 8, the average query error increases with $d_{\min}$: when selecting historical adjacent locations, the distance from the user to each historical adjacent location must be no smaller than $d_{\min}$, which excludes positions that are too close to the user. The more historical proximity locations there are, the smaller the distance to the selected location will be, which results in a smaller average error degree. Moreover, the average query error approaches $d_{\min}$ as the number of historical proximity locations approaches infinity.
$d_{\max}$ is also set by the user, and it usually cannot be set too small. Since $d_{\max}$ is the maximum distance between the user’s current location and the historical proximity location reported to the LSP, a $d_{\max}$ that is too small filters out most historical adjacent locations and reduces the privacy protection level. As shown in Figure 9, for both plotted settings (50 and 100), the average query error increases as $d_{\max}$ grows in the initial phase. This is because, when $d_{\max}$ is small, the number of screened historical locations is smaller than $n$ (the number of locations on the trajectory), and dummy locations are generated as historical adjacent locations by the symmetry mechanism. There is therefore a higher probability of selecting the nearest historical locations, which leads to a smaller average query error. As $d_{\max}$ gradually grows, the screening conditions for historical locations are relaxed and the number of screened historical locations increases, so fewer locations are generated by the symmetry mechanism and the average query error rises. Once symmetrical historical locations no longer need to be generated, $d_{\max}$ no longer affects which historical locations are selected: the locations closest to the trajectory are not screened out, so the average query error remains constant as $d_{\max}$ increases.
We have discussed how the choice of the historical adjacent location parameters influences the average query error. The experimental results clearly show the specific effects of different values of $d_{\min}$ and $d_{\max}$ on the average query error. Therefore, in practical application scenarios, the values of $d_{\min}$ and $d_{\max}$ should be selected according to the specific requirements and the allowable error.
4.2. Performance Comparison
4.2.1. Performance Comparison under Snapshot Queries
In our experiments, the coordinates of 500 target positions such as hotels, hospitals, and gas stations were captured, and 5000 coordinate points were randomly selected and stored in the database to serve both as historical adjacent positions and as the other users’ positions required by k-anonymity.
(1) Query Efficiency. Query efficiency is usually evaluated by the total time cost, which comprises the time the TTP spends generating the actual query area and the time the LSP spends replying to the query. As shown in Figure 10(a), the Casper scheme and the anonymous-zone merging scheme generate the query region using k-anonymity, so their query region generation time is the sum of the time spent searching the database for other users nearby and the time spent constructing the anonymous region containing $k$ users. In most cases, however, the query region generation time of our solution is essentially the time needed to search for historical adjacent locations in the database, which is independent of $k$. Therefore, generating the query region takes less time in our scheme and does not increase linearly with $k$. As shown in Figure 10(b), when the same query radius $r$ is used and $d_{\max}$ is set to 75 m, the query region in our scheme is independent of $k$ and its area does not exceed $\pi (d_{\max} + r)^2$; the query region generated by the Casper scheme is, in theory, larger than those of our scheme and the anonymous-zone merging scheme; moreover, the query areas of the two compared schemes increase significantly with $k$. Sufficient historical locations ensure a smaller distance $d$ between the user and the reported location in our scheme and thus guarantee a smaller query area. Figure 10(c) illustrates that the query processing time is positively correlated with the query area. To facilitate the comparison between our scheme and the two k-anonymity schemes, a small anonymity degree is used and $d_{\max}$ is set to 75 m. As shown in Figure 10(d), both the query area and the query processing time gradually increase as the radius grows. However, compared with the Casper scheme and the anonymous-zone merging scheme, the query area generated by our scheme is relatively small, which results in a shorter query processing time.
(2) Query Quality. The query quality is evaluated as the ratio of the number of POIs that the user can obtain in theory to the number of positions returned by the LSP when the user issues a request with query radius $r$, as defined in Section 3. In the experiment, we randomly select 20 points as the positions from which the user sends the query. We vary the value of $r$, repeat the experiment 20 times, and then take the average query quality as the object of analysis. Furthermore, in addition to the two schemes mentioned before, we also compare our scheme with the enhanced pseudonym selection scheme. As shown in Figure 11, our scheme maintains satisfactory and stable query quality as the query radius increases. In contrast, for the Casper scheme and the anonymous-zone merging scheme, the query area increases significantly as $k$ becomes larger, so a great number of the returned POIs fall outside the user’s query area, resulting in a decline in query quality. In addition, since the enhanced pseudonym selection scheme cannot flexibly adjust the query region to cover all the target positions, it has difficulty guaranteeing high query quality.
Experimental results show that our scheme can effectively improve query efficiency while guaranteeing satisfactory query quality in snapshot queries.
4.2.2. Performance Comparison under Continuous Queries
In the experiment, we select the average query error for a user’s trajectory in continuous LBSs as our performance evaluation metric. Obviously, the smaller this error is, the better the quality of service in continuous LBSs will be. We compare our scheme with two other existing schemes, the Native scheme and the Greedy scheme, both of which are extensions of the k-anonymity method. Within the experimental region, the length of the user’s trajectory was set to 1-5 km, and 500-2500 coordinate points were captured within a radius of 200 m around the trajectory as historical proximity locations and added to the database.
We set , and the experiment results are shown in Figure 12. In the Native scheme, the cloaking area becomes increasingly large, since the traditional trajectory k-anonymity method expands an initial cloaking region to keep covering at least the same k users, who may move in different directions. This results in a sharp increase of the average query error. Moreover, the query sequence is consistent with the user’s movement direction, which may give the adversary valuable information for inferring the user’s trajectory. In the Greedy scheme, the Greedy algorithm is used to keep each node on the candidate trajectories as close to the user’s trajectory as possible; however, since a complete historical trajectory is finally selected from the candidates, it cannot guarantee that each position on the selected historical trajectory is the nearest point to the corresponding node on the user’s real trajectory.
Next, we analyze the impact of the number of historical proximity locations on the average query error of the three schemes, with the length of the user’s trajectory set to 3 km. As shown in Figure 13, for the Native scheme, the number of historical proximity locations has no effect on the average query error, since the generated cloaking area depends only on the current locations of the other users. However, the average query error declines as the number of historical proximity locations increases for the Greedy scheme and our scheme, since they utilize historical trajectories and historical proximity positions, respectively.
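The trend reported for Figure 13 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the experiment from the paper: the trajectory is a straight 3 km line, historical points are scattered uniformly within 200 m of it, and the distance from each trajectory point to its nearest historical point stands in for the query error, whose exact definition is not reproduced here. The names `average_query_error` and `simulate` are hypothetical.

```python
import math
import random


def average_query_error(trajectory, historical):
    """Mean distance from each trajectory point to its nearest historical point.

    Assumption: nearest-neighbour distance is a stand-in for the query error.
    """
    total = 0.0
    for tx, ty in trajectory:
        total += min(math.hypot(tx - hx, ty - hy) for hx, hy in historical)
    return total / len(trajectory)


def simulate(n_historical, length_m=3000.0, band_m=200.0, samples=100, seed=0):
    """Average error for a straight trajectory with n_historical random points."""
    rng = random.Random(seed)
    # Straight trajectory along the x-axis, sampled at evenly spaced points.
    trajectory = [(i * length_m / (samples - 1), 0.0) for i in range(samples)]
    # Historical points scattered uniformly in a band around the trajectory.
    # With a fixed seed, a larger set extends a smaller one point for point.
    historical = [
        (rng.uniform(0.0, length_m), rng.uniform(-band_m, band_m))
        for _ in range(n_historical)
    ]
    return average_query_error(trajectory, historical)


errors = {n: simulate(n) for n in (500, 1500, 2500)}
for n, err in errors.items():
    print(f"{n} historical points: average error {err:.1f} m")
```

Because every larger point set contains the smaller one, the simulated error can only stay the same or shrink as points are added, which mirrors the qualitative behaviour described for the schemes that rely on historical data.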
4.3. Privacy Analysis
In this section, we evaluate the privacy degree of our solution by comparing it with k-anonymity and the dummy location technology.
Under k-anonymity protection, the LSP receives the locations of k users involved with the service requestor, so the probability of identifying the user’s real location is 1/k. As shown in Figure 14, the larger the value of k, the higher the privacy degree will be; however, a larger k also leads to a decrease in query efficiency and quality. As shown in Figure 15, a user located at position 1 in the Sanpailou Campus of Nanjing University of Posts and Telecommunications wants to request the location service, while 9 other users in the vicinity also request services. Suppose that the user obtains the location service through k-anonymity with a cloaking area that also contains the users at positions 4, 5, and 6; the probability that the adversary recognizes the user is then 1/4. However, if the user adopts the scheme described in this paper, point 4 is treated as the historical adjacent position to be queried. By the description of the proposed scheme, the actual query area covers a total of 6 points, namely points 1, 3, 4, 5, 6, and 7 (including the user’s own position 1), as denoted by the big red circle in Figure 15. Therefore, the probability of identifying the user is only 1/6.
The advantages of our proposal become more obvious in continuous queries in densely populated areas. We take Xinjiekou, the commercial center of Nanjing City, as an example. As shown in Figure 16, a user sends service requests under k-anonymity, and the anonymous set changes across three successive moments. At each moment, the probability of identifying the user is 1/3. However, if an attacker obtains the user’s anonymous sets at the three moments and intersects them, the true identity of the user can be obtained. In our proposed scheme, point 2 is selected as the historical adjacent position at the first moment. At the next moment, although point 2 remains closest to the user, we choose point 7 as the historical adjacent position, according to the historical proximity selection method described in the scheme, to prevent the attacker from inferring that the user is located near point 2. Because of the population density, there are more historical adjacent locations around the user, so the possibility of the user’s real identity being exposed is lower.
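The Figure 15 comparison can be reproduced with a few lines of Python. This is a minimal sketch: the coordinates and the 80 m radius are invented so that the circle around point 4 happens to cover the six labels named in the text, and the adversary is assumed to guess uniformly among the exposed candidates. The helper names are hypothetical.

```python
from fractions import Fraction
from math import hypot


def identification_probability(candidates):
    """Chance of picking the real user when all candidates are equally likely."""
    if not candidates:
        raise ValueError("candidate set must not be empty")
    return Fraction(1, len(candidates))


def points_in_circle(points, center, radius):
    """Return the sorted labels of all points inside a circular query area."""
    cx, cy = center
    return sorted(
        label for label, (x, y) in points.items()
        if hypot(x - cx, y - cy) <= radius
    )


# Hypothetical coordinates in metres; only the labels mirror Figure 15.
points = {
    1: (0, 0), 2: (150, 20), 3: (-40, 30), 4: (30, 10), 5: (60, -20),
    6: (50, 50), 7: (10, -50), 8: (-160, -30), 9: (170, 120), 10: (-120, 140),
}

# k-anonymity: the user at position 1 is cloaked with positions 4, 5, and 6.
k_anonymity_set = [1, 4, 5, 6]

# Proposed scheme: query around historical point 4 (radius is an assumption).
query_set = points_in_circle(points, center=points[4], radius=80)

print(identification_probability(k_anonymity_set))  # 1/4
print(query_set)                                    # [1, 3, 4, 5, 6, 7]
print(identification_probability(query_set))        # 1/6
```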
Figure 16 panels: (a) at the first time point; (b) at the second time point; (c) at the third time point.
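The intersection attack on continuous k-anonymity queries discussed for Figure 16 amounts to a set intersection. The sketch below uses made-up identifiers, since the actual anonymous sets from the figure are not reproduced here; `intersection_attack` is a hypothetical helper name.

```python
def intersection_attack(anonymous_sets):
    """Return the identities that appear in every observed anonymous set."""
    if not anonymous_sets:
        raise ValueError("need at least one anonymous set")
    return set.intersection(*(set(s) for s in anonymous_sets))


# Hypothetical anonymous sets at three successive moments; "u" is the real user.
observed = [
    {"u", "a", "b"},
    {"u", "c", "d"},
    {"u", "b", "e"},
]

# Each set on its own hides the user among three candidates.
per_query_probability = [1 / len(s) for s in observed]

# Intersecting the three sets leaves only the real user.
survivors = intersection_attack(observed)
print(survivors)  # {'u'}
```

Each individual query gives the adversary only a 1/3 chance, but the user is the only identity common to all three sets, which is why per-query anonymity does not carry over to a sequence of queries.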
In addition, it is more reasonable to use historical proximity locations than the locations randomly generated by the traditional dummy location technology. As shown in Figure 17, when the user requests LBSs at position 1 with the dummy location technology, the generated pseudonym-location may fall in the lake. In this case, the adversary can easily identify it with background knowledge and filter out other unreasonable locations, which decreases the privacy degree. Using the historical proximity locations proposed in this paper addresses this problem.
Compared with most of the existing methods such as k-anonymity and dummy location, which have to report the user’s true position to the service providers, our approach submits a historical proximity location in place of the user’s current location, which improves the privacy level.
This paper proposed a solution for location privacy protection in both snapshot queries and continuous queries. With historical proximity locations around the user submitted to the LSP for query instead of the true location, the user’s location information is completely anonymous from the LSP in the whole process of requesting the LBSs, and thus a high privacy level is achieved. Compared with k-anonymity and the enhanced pseudonym selection scheme, our scheme is simple and feasible, and can achieve better query efficiency and higher query quality. Our proposal also provides a new way of balancing the privacy level, query efficiency, and quality of services. However, in continuous queries, we have not succeeded in achieving a sufficient level of anonymity protection for a user’s movement trajectory. It is still possible for an attacker to infer the general direction of the user’s movement by analyzing the changes in the user’s query range recorded by the LSP. This is a difficulty in current techniques of trajectory privacy preservation, and it is also the focus of our future research work.
No data were used to support this study.
X. Guo and W. Wang are co-first authors.
Conflicts of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
X. Guo and W. Wang contributed equally to this work.
This work was supported by the National Natural Science Foundation of China (grant number 61672297), the Key Research and Development Program of Jiangsu Province (grant number BE2017742), the Postgraduate Research & Practice Innovation Program of Jiangsu Province (grant number KYCX19_0908), and the Key Project on Anhui Provincial Natural Science Study by Colleges and Universities (grant numbers KJ2019A0579 and KJ2019A0554).
C. Yin, J. Xi, R. Sun, and J. Wang, “Location privacy protection based on differential privacy strategy for big data in industrial Internet of Things,” IEEE Transactions on Industrial Informatics, vol. 14, no. 8, pp. 3628–3636, 2017.
M. Gruteser and D. Grunwald, “Anonymous usage of location-based services through spatial and temporal cloaking,” in Proceedings of the 1st International Conference on Mobile Systems, Applications and Services, pp. 31–42, 2003.
Y. Zhang and Q. Zhang, “A k-anonymous location privacy protection method of dummy based on geographical semantics,” International Journal of Network Security, vol. 21, no. 6, pp. 937–946, 2019.
T. Nakagawa and H. Arai, “Personalized anonymization for set-valued data by partial suppression,” in 2017 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 1003–1010, IEEE, 2017.
R. Shokri, G. Theodorakopoulos, P. Papadimitratos, E. Kazemi, and J.-P. Hubaux, “Hiding in the mobile crowd: location privacy through collaboration,” IEEE Transactions on Dependable and Secure Computing, vol. 11, no. 3, pp. 266–279, 2014.