One More Accuracy k-Anonymity Framework for LBS
In k-anonymity algorithms for location-based services (LBS), the starting positions of k users are replaced with the center of the cloaked area to protect privacy. As a result, the route from a user's starting position to the center of the cloaked area is often nonlinear, which is inconvenient for the user. This paper proposes a new anonymous server framework that adds an electronic map and navigation function to the original k-anonymity framework for LBS. In addition to its original tasks, which include computing the center of the cloaked area and extracting the actual query result that corresponds to the user's query, the anonymous server in the new architecture plans the actual path from the starting query location to the center of the cloaked area and returns the new result collection to the user. Moreover, this paper improves the route-computing algorithm so that it better matches how a real road network operates. Finally, we demonstrate the value of the approach experimentally.
A user sends a query request to AS (the anonymous server). The request message contains the starting position of the request, the content of the request (namely, the destination), the anonymity level k, the temporal tolerance, and the spatial tolerance. AS finds k-1 other users, calls the cloaking algorithm, and sends LBS a message that contains the center of the cloaked area and the collection of the k users' queries. LBS returns the query results to AS. AS finds the result that corresponds to the user's query, computes the Euclidean distance from to , modifies the distance-related field in that result, and sends it to the user. Because the starting location and the query message are protected by the k-anonymity technique, LBS does not learn the user's own starting position and query, only the center of the cloaked area and the query collection.
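The straight-line computation in this baseline workflow can be illustrated with a minimal sketch. It is not the authors' implementation: taking the centroid of the k positions as the center of the cloaked area, using planar coordinates in meters, and the names `cloak_center` and `euclidean` are all assumptions made for the example.

```python
import math

def cloak_center(positions):
    """Return the centroid of the k user positions (assumed center of the cloaked area)."""
    k = len(positions)
    return (sum(x for x, _ in positions) / k, sum(y for _, y in positions) / k)

def euclidean(a, b):
    """Straight-line distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical planar coordinates (meters) of k = 4 users.
users = [(0.0, 0.0), (400.0, 0.0), (400.0, 300.0), (0.0, 300.0)]
center = cloak_center(users)
# A straight-line value like this ignores the road network entirely.
last_leg = euclidean(users[0], center)
```

Whatever the real roads look like, the value obtained this way is only a lower bound on the distance the user actually has to travel.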
With the k-anonymity algorithm, the route from the user's starting position to the center of the cloaked area is uncertain, because AS only derives a Euclidean (straight-line) distance from the query results returned by LBS. In practice, however, the actual route is often nonlinear, as shown in Figure 1. The figure shows the state after k-anonymity processing: the center of the cloaked area, the users, and the line between each user and the center. All of these lines are nonlinear. In sparsely populated areas or during off-peak hours at night, the temporal and spatial tolerances must be expanded to reach anonymity level k, and the lines tend to become even more nonlinear. If the driver is unfamiliar with the surrounding roads, planning this last line between the cloaked area and the user is very valuable.
Take a ride-hailing LBS as an example. The user sends AS a request that contains the starting location (the starting position of the request) and the destination (the content of the request). After applying the cloaking algorithm, AS sends the cloaked request to LBS. LBS plans the routes starting from the center of the cloaked area and dispatches drivers to that position to pick up the users. The drivers, however, do not know the line between the center of the cloaked area and each user's actual position. The same holds for a navigation LBS. A user in unfamiliar surroundings, away from home or the workplace, often does not know the path from the starting location to the cloaked area. As shown in Figure 2, a user inside a community, school, or factory must pass through the gate to arrive at the location . The line from to is equal to , which is obviously nonlinear.
Therefore, it is necessary to plan the last line accurately, from the starting position to the center of the cloaked area, on top of the existing k-anonymity algorithm. Because this last line involves the user’s privacy, it cannot be disclosed to LBS, so this paper proposes a framework to solve the problem. The contributions of this paper are as follows:
We propose a new framework for AS, on which an electronic map and its navigation function are deployed to plan the route of the last line from the starting location to the center of the cloaked area;
Compared with other schemes, our scheme makes it easier to achieve anonymity level k in sparsely populated areas or during off-peak periods, because it is not constrained by the spatial tolerance;
We design an architecture for AS that includes three parts: a communication part, a transaction processing part, and a map navigation part;
We identify the drawbacks and limitations of the k-anonymity algorithm on the new AS and propose an improved algorithm;
Finally, we demonstrate the value of the approach experimentally.
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 describes background knowledge on electronic maps and navigation. Section 4 proposes the new framework of AS, describes in detail the architecture of AS after the navigation function is added, and presents the approach for planning the last line from the starting location to the cloaked area; it also analyzes the problems that arise in the new approach and improves on it. Section 5 discusses the advantages of the approach in theory. Section 6 evaluates the approach experimentally. Section 7 concludes and describes future work.
2. Related Work
Gruteser and Grunwald  first introduced k-anonymity in LBS. A trusted AS finds k users to form a set so that LBS cannot distinguish which of them issued a given query, which protects the user’s privacy. The larger the anonymity level k, the stronger the privacy protection. Much of the literature therefore concentrates on raising the anonymity level k, including finding a better cloaked area under the constraint of the maximum temporal and spatial tolerance. The CliqueCloak algorithm [1, 6] locates a clique in a graph to perform location cloaking. The Casper system [7, 8] employs a quadtree-based pyramid technique to perform location anonymization, which achieves fast cloaking. The PrivacyGrid cloaking algorithm [9, 10] splits the map into cells, each of which is a rectangular area. The AS takes the position where the request originated as the candidate cloaked area. If that area satisfies the anonymity conditions, it is selected as the cloaked area; otherwise, the candidate area is enlarged. The works [1, 7, 9] aim to increase the anonymity level k by improving the cloaking algorithm. In sparsely populated areas, however, anonymity level k sometimes still cannot be achieved within the maximum temporal and spatial tolerance. In such situations, dummy locations are introduced. The idea of dummy locations [11–13] is to combine m dummy locations with c real users to achieve k-anonymity. The hierarchical clustering method uses clustering to find the k-1 users more efficiently. All of these approaches share one problem: the route from the position where the request originated to the center of the cloaked area may be more complicated than a straight line and is likely to be nonlinear.
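The expand-until-satisfied idea behind grid-style cloaking can be sketched as follows. This is only an illustration of the general idea, not the PrivacyGrid algorithm itself: the square candidate area, the step size `cell`, the limit `max_half_width` (standing in for the spatial tolerance), and the function name `expand_cloak` are assumptions made for the example.

```python
def expand_cloak(requester, others, k, cell=100.0, max_half_width=1000.0):
    """Grow a square around the requester until it holds k users or the limit is hit.

    Returns (half_width, users_inside) on success, or None if k users cannot be
    found within max_half_width (the assumed spatial tolerance).
    """
    half = cell
    while half <= max_half_width:
        inside = [p for p in others
                  if abs(p[0] - requester[0]) <= half
                  and abs(p[1] - requester[1]) <= half]
        if len(inside) + 1 >= k:          # +1 counts the requester
            return half, [requester] + inside
        half += cell                      # enlarge the candidate area by one cell
    return None
```

In a sparse area the loop runs out of room before k users are found, which is the situation that motivates dummy locations in the schemes above.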
However, in the existing approaches, AS computes the route with the Euclidean distance formula, which degrades QoS, especially for route-planning LBSs such as ride-hailing, takeout delivery, or navigation.
Some works [15–17] focus on privacy protection in the road network. Ma et al.  proposed dividing the road network according to Taylor polygon and using the SpaceTwist technique to protect privacy in continuous queries. Sun et al.  proposed a location-label-based approach to meet the l-diversity requirement, in which the road network is introduced on AS and partitioned into many cells. Although both [15, 16] introduce the road network on AS, the aim of  is to form Taylor polygon and the aim of  is to form an area of diversity. Ma et al.  mentioned using the road network to compute the actual distance from to and to rank the returned results, but did not give specific navigation or explain how to deal with problems met in navigation, such as rings or intersections. In this paper, the map and navigation functions guide the last route accurately and help find the most suitable k-1 users in the k-anonymity algorithm.
3. Background Knowledge
3.1. The Scenario of Navigation in Road Network
With the rapid development of the mobile internet and smartphones, many mobile applications have appeared and brought great convenience to daily life. LBS is one of them: it provides services such as searching for nearby POIs (points of interest) or navigation by means of an electronic map and a positioning device on the mobile client. Navigation in the road network, one of these services, includes two main parts: path planning and location guidance. Navigation has three modes: offline mode, online real-time request mode, and a combination of the two. Offline mode needs specialized equipment on which the electronic map and navigation software are installed; it does not visit LBS and therefore leaks no privacy to LBS, but it is expensive. In online real-time request mode, the mobile client issues continuous queries to LBS, which increases the burden on the server. The third mode combines the first two. Its architecture consists of clients and servers. The client is responsible for positioning, originates query requests to LBS, and performs location guidance according to the route information returned from LBS. The server receives and responds to client requests and plans the path. Once the server has planned the path and sent the guidance information to the client, the client navigates using the location guidance and positioning by GPS or IP, and the whole process does not visit the server unless the current position deviates from the planned path. The third mode is convenient and cheap and can be implemented by installing a client application on a mobile device, so it is widely used. In the online real-time request mode, the user issues continuous queries at equal intervals and forms a query trajectory , in which i represents the ith user, denotes position at time , and is the query location issued by user i.
In the third mode, the trajectory is sparse, because a request is issued only when the position deviates from the planned route. The trajectory is , in which is small and may even equal 1. Both the online real-time request mode and the third mode belong to continuous query. In this paper, we discuss the improvement of k-anonymity privacy protection in these two modes.
The work process of the two modes is as follows:
(1) The client inputs the starting position and destination and originates a query request to LBS
(2) LBS plans the route from the starting position to the destination and sends the route and location guidance information to the client
(3) The client starts navigation according to the guidance information
(4) If the current position deviates from the planned path or a fixed time slice elapses, the client sends a new request to LBS and goes to step 1
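The work process above can be sketched as a small client-side loop. This is a simplified illustration under stated assumptions: `plan_route` is a hypothetical stand-in for the request to LBS, deviation is measured against the nearest waypoint rather than the nearest road segment, the threshold `max_deviation` is invented for the example, and the fixed time slice is ignored.

```python
import math

def navigate(positions, plan_route, destination, max_deviation=30.0):
    """Follow a planned route and count how often the client must re-request.

    positions: successive client fixes (x, y); plan_route(start, dest) stands in
    for the LBS call and returns a list of (x, y) waypoints. Both are hypothetical.
    """
    route = plan_route(positions[0], destination)   # steps (1)-(2)
    requests = 1
    for pos in positions[1:]:                        # step (3): local guidance
        nearest = min(math.hypot(pos[0] - x, pos[1] - y) for x, y in route)
        if nearest > max_deviation:                  # step (4): deviation detected
            route = plan_route(pos, destination)
            requests += 1
    return requests
```

The number of requests returned corresponds to the number of points on the sparse query trajectory of the third mode: one when the user never leaves the planned route, more when deviations occur.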
3.2. Definition of Abbreviations and Variables
The abbreviations used in this paper are shown in Table 1.
The variables used in this paper are shown in Table 2.
3.3. k-Anonymity in Road Network
Because path guidance is performed on the client, which does not visit LBS until the current position deviates from the path, the location guidance process does not leak privacy to LBS. Users’ privacy may be leaked to LBS only during path planning. AS and the k-anonymity algorithm are therefore introduced between step (1) and step (2) of the above work process to cut off direct communication between the client and LBS; the architecture of AS is shown in Figure 3. The drawback of the existing approaches is that the route from the starting position to the center of the cloaked area may be nonlinear, so the center cannot be reached in a straight line. In some LBSs, such as ride-hailing, takeout delivery, or navigation, planning this last path is very important. This paper therefore proposes a new approach that plans the last path according to the real road network and provides more accurate services.
4. The New Approach
To plan the last route from the starting position to the center of the cloaked area accurately, we add an electronic map and navigation function to AS. At present, many electronic maps and their navigation functions, such as Google Map, Baidu Map, and Gaode Map, are provided to mobile internet users free of charge, or free to use for at least 10 kilometers. We can therefore introduce these functions to AS to plan the last route. Introducing the new function does not alter the interaction between AS and LBS, so the privacy protection provided by the k-anonymity algorithm is unchanged. The sole purpose of the new function is to achieve better QoS, namely, to plan the last route according to the real road network; it does not affect the effectiveness of privacy protection. It should be emphasized that the only part of the map that is retained is path planning, i.e., finding the route between a starting node and an ending node; other functions, such as searching for points of interest, are not involved.
4.1. The Architecture of AS
The architecture of AS consists of three servers: CS (communication server), TPS (transaction processing server), and MNS (map navigation server), as shown in Figure 3. CS communicates with external parties, namely the users and LBSs, and with the internal servers TPS and MNS. TPS mainly performs the k-anonymity algorithm, sends MNS requests to plan the path from the starting position to the center of the cloaked area, and receives the location guidance messages from MNS. MNS mainly receives path-planning requests from TPS and returns the guidance information to it.
4.2. k-Anonymity Algorithm in the New Architecture of AS
The work process of the k-anonymity algorithm in the new architecture is as follows:
(1) The user originates a request to AS and sends to it
(2) CS receives the request and forwards it to TPS
(3) TPS performs the k-anonymity algorithm: ① find k-1 users in the range of and and then call the location cloaking algorithm and obtain the position of the center of the cloaked area ; ② create a set which is composed of the query contents of k users and ; and ③ send the message to CS, which then submits it to LBS
(4) LBS returns the query result , in which , , denoting the route from to , and meaning the guidance information of the route
(5) MNS returns the query result, which includes the route and location guidance information, to TPS, i.e., 
(6) After receiving the query results and , TPS extracts the information about and sends it to him/her via CS. The relationship of obeys the following formulas:
(7) The user starts navigation according to and 
(8) If the current position deviates from the path or a fixed time slice is over, the user sends a new request to plan the route and goes to step 1.
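The part of this work process that happens inside AS can be sketched on a toy road graph. This is a minimal illustration, not the authors' implementation: the paper relies on a third-party map for path planning, so Dijkstra's algorithm is swapped in here as a stand-in for MNS, and the graph layout and the names `shortest_path` and `answer_user` are assumptions made for the example.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra over graph = {node: {neighbor: length_m}}; returns (length, [nodes])."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, length in graph.get(node, {}).items():
            if nxt not in settled:
                heapq.heappush(queue, (dist + length, nxt, path + [nxt]))
    return float("inf"), []

def answer_user(graph, start, center, lbs_route):
    """TPS side: join the MNS last-line route with the LBS route from the center."""
    _, last_line = shortest_path(graph, start, center)    # MNS, stays inside AS
    return last_line + lbs_route[1:]                       # center appears once

# Hypothetical road graph: the user at "s" must pass a gate to reach center "c".
roads = {"s": {"gate": 120}, "gate": {"s": 120, "c": 200},
         "c": {"gate": 200, "d": 500}, "d": {}}
full_route = answer_user(roads, "s", "c", ["c", "d"])
```

Only the center and the query collection leave AS in this sketch; the user's real starting node is used solely by the internal path-planning call.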
4.3. The Existing Problems in the New Approach
In the above approach, combining the new function on AS with the k-anonymity algorithm protects location privacy and plans the last line accurately, which ensures QoS while protecting privacy. A further advantage is that the k-anonymity algorithm is freed from the limitation of spatial tolerance. Within the temporal tolerance, the spatial tolerance can be increased without degrading QoS, so it is much easier to find k-1 users than in other schemes; this allows a larger anonymity level k or makes it easier to obtain k users.
But some problems remain in the new approach. Before discussing them, we first define a road with its starting node and ending node  and then a road is described as , in which is the starting node and is the ending node. A route is usually composed of many roads. Then, we define a route . In general, a user starts from some location on a road, not from its starting node, and travels to a destination located somewhere on another road, not at its ending node. So, assume a route starting from location via many roads to the destination and then the route is defined as
We call as the key node of the route.
According to the above definition, we can define the following routes:
Now let us begin to study the problems. One is shown in Figure 4(a), in which the red line denotes the route and the black one is the route . We take as example. According to the new approach, the user will start navigation in accordance with route :
There is no intersection between the route and . If the route is too long, it exceeds the user's temporal tolerance or spatial tolerance; namely, it obeys the following formula:where denotes all the nodes on the route , , where is the number of nodes on the route and is the speed on the line from the node to .
In formulas (7a) and (7b), and are different from and although all of them are tolerances. is the temporal tolerance on the request and response time, that is, the time from when the user originates a request to when he or she receives the returned message, while is a tolerance on the time spent traveling the route and is constrained by the navigation capability of the map deployed on AS and provided by a third party such as Google or Baidu. means the maximum straight-line distance from the starting position to the center of the cloaked area, while denotes the real length of the route . For example, in Figure 5, the straight-line distance between the blue car and the orange car is very short, but the route between them is very long, equal to the length of the expressway.
In this situation, we abandon the user and find the next user for the k-anonymity algorithm. Formulas (7a) and (7b) can therefore be used to judge whether a user can be accepted as one of the k users in the k-anonymity algorithm. Consider an example. Assume the road is an expressway with two one-way streets A and B, as shown in Figure 5. The blue points represent the cars on street B, and the orange ones are those on street A. The dashed line denotes the cloaked area. If a car on street B issues a request, then according to the above principle, the cars on street A are abandoned, because the route from the center of the cloaked area to the orange positions is too long (we assume the expressway is infinite and the center of the cloaked area is located on street B).
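The acceptance test described above can be sketched as follows. Formulas (7a) and (7b) are not reproduced here, so this is an interpretation of the surrounding prose rather than the paper's exact conditions: travel time is taken as the sum of segment length divided by segment speed, and the lookup tables `length` and `speed`, the thresholds `max_time` and `max_length`, and the function names are assumptions made for the example.

```python
def route_cost(nodes, length, speed):
    """Total length (m) and travel time (s) of a route given per-segment lookups."""
    total_len = 0.0
    total_time = 0.0
    for a, b in zip(nodes, nodes[1:]):
        total_len += length[(a, b)]
        total_time += length[(a, b)] / speed[(a, b)]
    return total_len, total_time

def accept_user(nodes, length, speed, max_time, max_length):
    """Keep a candidate only if its last-line route respects both route tolerances."""
    total_len, total_time = route_cost(nodes, length, speed)
    return total_time <= max_time and total_len <= max_length
```

In the expressway example, a car on street A would have a short straight-line distance to the center but a very large route length, so a check of this kind would reject it.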
The second is shown in Figure 4(b). In the figure, there is a ring between the route and ; namely, the route includes . In this situation, , , and the routes and are as follows:
Then, the route is
We can see that the node appears twice in route , which forms a ring on this route. In fact, we observe that the real line from to is as follows:
That is, in this situation, the part of the line before the second occurrence of is cut, and only the remaining part starting from the second is retained.
The third case is shown in Figure 4(c). Before discussing the third situation, let us review some background:
(1) The representation of the road network on the electronic map. Objects on the map are arranged in multiple levels [19–21]. Take the  mode as an example, in which there are two levels, the upper level and the lower one, as shown in Figure 6. Figure 6(a) shows the roads in the upper level, such as the main road, and Figure 6(b) shows those in the lower level, such as the secondary road. In Figure 6, Na represents a node on the upper level and Nb a node on the lower level.
(2) The rules of the path search. When navigation starts, the system obtains the current position from the navigation device on the mobile client, searches for that position on the planned path, and follows the route from there. If the current position deviates from the route, the system sends a new request to plan the path.
Now let us go back to the third case. In Figure 4(c), the main road B includes two roads H and F. Road H always keeps the same direction as the main road B, and road F leaves the main road B for the secondary road G after running a certain distance on the main road. The position is located on road H. The route and overlap on some nearby line, but the overlap does not form a ring, and the route does not continue all the way down the main road B but turns onto the secondary road G. Assume road ; then according to the above definition, we can depict the route from to in Figure 4(c), as follows: But in fact, we observe that the real path is just . According to the principle of path search in navigation, assume the current position is , and then the navigation system will start navigation from the first occurrence of that position in the line , not the second one. In formula (12), the current position is the first occurrence of , the one not covered by shadows in the figure.
But if we delete the first in , then the formula of becomes as follows: According to the rule of path search, the system will find the current position , which is covered with shadows in formula (13) and appears for the second time in formula (12), and start traveling along the remaining line , which is just the actual route.
Comparing the third case with the second one, we find that the actual routes in the two cases are both as follows: So the approach in the third case also applies to the second. Because in formula (14) belongs to the route , it is not a starting, ending, or key node and is not explicitly indicated on the route; cutting the single node that appears first in the line, as in formula (13), is simpler than cutting off the whole line before the second occurrence of the location, as in formula (14). Besides, according to the principle of path search, formula (13) is equivalent to formula (14) in the last two situations, so we use formula (13) as the route returned to the user. From the cases in Figure 4, we conclude that two approaches solve the three problems. One applies to the first case and corresponds to Algorithm 1, and the other applies to the last two cases and corresponds to Algorithm 2. Algorithm 1 judges whether the user meets the conditions and and whether he or she can be one of the k users in the cloaking algorithm, as shown in Figure 4(a). Algorithm 2 handles the situation with an intersection or a ring, as shown in Figures 4(b) and 4(c). The improved algorithms are as follows.
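The ring case can be illustrated with a generic sketch. It is not Algorithm 2 itself, which is not reproduced here, and it does not use the delete-the-first-occurrence shortcut of formula (13): it simply removes every ring from a node sequence so that the result starts from the last visit of any repeated node. The node labels and the function name `cut_rings` are assumptions made for the example.

```python
def cut_rings(route):
    """Remove every ring from a node sequence, keeping the part after each revisit."""
    result = []
    index = {}                      # node -> position in result
    for node in route:
        if node in index:
            # Revisited node: drop the ring travelled since its first visit.
            for dropped in result[index[node] + 1:]:
                del index[dropped]
            del result[index[node] + 1:]
        else:
            index[node] = len(result)
            result.append(node)
    return result

# Hypothetical joined route: start "l" to center "c", then "c" to destination "d",
# where the second part passes back through the user's own position "l".
joined = ["l", "n1", "c", "n1", "l", "n2", "d"]
actual = cut_rings(joined)
```

For this input the detour to the center and back is removed, leaving the direct route from the user's position to the destination, which matches the observation that only the part after the second occurrence is actually travelled.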
5. Analysis of Performance
This paper introduces the new function of an electronic map and its navigation on AS, which brings advantages in the following two aspects.
5.1. Planning the Last Distance Accurately
To meet the required level k, the k-anonymity algorithm enlarges the temporal and spatial tolerance, so the last stretch between the center of the cloaked area and each of the k starting positions is generally not linear; it tends to be even more nonlinear when the spatial tolerance becomes large in sparse zones. If the user is not familiar with the surroundings, knowing the definite route from the user to the center becomes more important. The new approach provides a detailed scheme to plan the last route.
5.2. Increasing the k-Anonymity Level
After the new approach is introduced, the k-anonymity technique is freed from the limitation of spatial tolerance while the temporal tolerance is kept constant. We can therefore expand the cloaked area and obtain more users in it, which means a higher anonymity level k. The new approach thus improves the ability to protect privacy.
6. The Experiment
The experimental environment is an Intel® Core™ i5-825U CPU @ 1.60 GHz with 8.00 GB RAM. We employ Thomas Brinkhoff’s object generator  and datasets based on the road network of the City of Oldenburg, which contains 5,835 nodes and 6,065 edges . Every experiment is repeated 10 times, and every indicator is reported as the average. There are 10,000 moving objects, and the temporal tolerance is 10 s. Figure 7 shows the success rate under different k values. The four lines in the figure represent spatial tolerances of 500 meters, 1000 meters, 1500 meters, and 2000 meters. The figure shows that the success rate of finding k-1 users rises as the spatial tolerance increases. When the spatial tolerance equals 2000 meters and k is less than 12, the success rate reaches 100%. On the old AS architecture, the spatial tolerance is constrained to the range of 0 to 1000 meters in order to meet QoS , so k cannot be too large, which may result in a low anonymity level. On the new AS architecture, this limitation is removed: the spatial tolerance can be raised without affecting the practical requirement for the last route, which is planned accurately, whereas in the old architecture the last route becomes fuzzier as the spatial tolerance increases. Figure 8 shows the relationship between the success rate and k on the old AS architecture and on the new one. The blue line denotes the success rate on the old architecture with the temporal tolerance equal to 10 s and the spatial tolerance equal to 1000 meters. The orange line shows that, simply by adjusting the spatial tolerance within some range, the success rate can reach 100%.
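A success-rate measure of this kind can be sketched as follows. The exact metric definition is not spelled out in the text, so this is an assumption: a request counts as successful if at least k-1 other users lie within the spatial tolerance, measured here as a straight-line distance on planar coordinates, and the temporal tolerance is ignored. The function name `success_rate` is hypothetical.

```python
import math

def success_rate(positions, k, spatial_tolerance):
    """Fraction of users that can find k-1 other users within spatial_tolerance."""
    successes = 0
    for i, p in enumerate(positions):
        nearby = sum(
            1 for j, q in enumerate(positions)
            if j != i and math.hypot(p[0] - q[0], p[1] - q[1]) <= spatial_tolerance
        )
        if nearby >= k - 1:
            successes += 1
    return successes / len(positions)
```

With fixed positions, raising `spatial_tolerance` can only keep the returned fraction the same or increase it, which is consistent with the trend reported for Figure 7.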
Figure 9 shows the relationship between the number of nonlinear routes and k. The blue line represents the average number of nonlinear routes, and the orange line represents the worst case. The figure shows that k is positively correlated with the number of nonlinear routes: the larger the k value, the more nonlinear routes there are. So, as k increases, the path from the starting position to the center of the cloaked area tends to be nonlinear and more uncertain, which makes the new AS architecture more important.
Figure 10 shows how the number of rings, the number of intersections, and the number of routes that are cut off for being too long vary with k when the spatial tolerance equals 1000 meters. The figure shows that all three numbers grow as k increases. Therefore, the improved algorithm is required in practice.
7. Conclusion and Future Work
This paper proposes a new architecture for AS to plan the otherwise uncertain path from the starting position to the center of the cloaked area. The scheme applies to more scenarios, such as taxi-hailing LBS, takeout LBS, and navigation LBS. The experiments show that nonlinear routes do occur and become more likely as k, which decides the level of anonymity, increases. Rings also occur, so the improved algorithm is necessary. Future work will concentrate on using this architecture with new privacy protection algorithms and in new application scenarios.
The data in this paper are based on the road network of the City of Oldenburg in the literature .
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Z. H. Yang, L. I. Shan-Ping, and X. Lin, “Anonymity level adaptation algorithm to meet resource constraint of k-anonymity service in LBS,” Journal of Zhejiang University, vol. 45, no. 7, pp. 1154–1160, 2011.
A. Masoumzadeh, J. Joshi, and H. A. Karimi, “LBS (k, T)-anonymity: a spatio-temporal approach to anonymity for location-based service users,” in Proceedings of the ACM Sigspatial International Conference on Advances in Geographic Information Systems, Seattle, Washington, November 2009.
M. F. Mokbel, C. Y. Chow, and W. G. Aref, “The new casper: query processing for location services without compromising privacy,” in Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 763–774, Seoul, Korea, September 2006.
T. Brinkhoff, “A framework for generating network-based moving objects,” GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.