Unchained Cellular Obfuscation Areas for Location Privacy in Continuous Location-Based Service Queries

Luo, Jia-Ning; Yang, Ming-Hour

doi:https://doi.org/10.1155/2017/7391982

Wireless Communications and Mobile Computing

Special Issue

Smart Cities: Recent Trends, Methodologies, and Applications

Research Article | Open Access

Volume 2017 | Article ID 7391982 | https://doi.org/10.1155/2017/7391982

Jia-Ning Luo¹ and Ming-Hour Yang²

Academic Editor: Christos Goumopoulos

Received 09 Feb 2017

Revised 06 Jul 2017

Accepted 10 Aug 2017

Published 28 Sept 2017

Abstract

To access location-based services (LBSs) and query surrounding points of interest (POIs), smartphone users typically use the built-in positioning functions of their phones when traveling in unfamiliar places. However, submitting a query with their real location may leak personal information. Current LBS privacy protection schemes fail to simultaneously consider real map conditions and continuous querying, and they cannot guarantee privacy protection when the obfuscation algorithm is known. To provide users with secure and effective LBSs, we developed an unchained regional privacy protection method that combines query logs and chained cellular obfuscation areas. It adopts a multiuser anonymizer architecture to prevent attackers from predicting user travel routes by using background information derived from maps (e.g., traffic speed limits). The proposed scheme is completely transparent to users when performing continuous location-based queries, and it combines the method with actual road maps to generate unchained obfuscation areas that conceal the actual locations of users. In addition to using a caching approach to enhance performance, the proposed scheme also considers popular tourist POIs to enhance the cache data hit ratio and query performance.

1. Introduction

Currently, most mobile devices feature built-in positioning functions, and smartphone users frequently use location-based services (LBSs) to query points of interest (POIs) within their vicinity (e.g., when searching for Chinese restaurants within a 10 km radius). Although using LBSs to rapidly locate places and routes is highly convenient, LBS providers may exploit the opportunity to collect the query contents and travel routes of specific users and then analyze these datasets to determine the users’ dietary habits, shopping preferences, and even personal medical histories. These behaviours are a severe breach of LBS users’ right to privacy.

Previous studies [1, 2] developed peer-to-peer (P2P) cloaking algorithms to mask the identity and location of users to guarantee location privacy. These P2P algorithms satisfy k-anonymity by sharing the user location with other users. However, the approaches proposed in those studies search for conspirators surrounding the user, which may enable attackers to triangulate a user within an obfuscation area (OA) and deploy a variance-based attack (VBA) [3]. In [3], an approach was proposed that searches for other conspirators surrounding a user. Subsequently, a random conspirator in the group is selected to search for other conspirators. This process is repeated until the k-anonymity requirement is satisfied; that is, the user cannot be triangulated within an obfuscation area. Moreover, P2P necessitates the exchange of location information between users. Therefore, users are required to trust other users in the obfuscation area. A malicious user could select different values of k to obtain the location of the other users by using the k-anonymity algorithm in [3]. Such users may even partner with LBS providers to steal personal data from regular users, increasing the risk of privacy leaks.

Recent studies have proposed methods for masking the identity [4, 5], location [6–8], and query information [9] of users by using secure third-party anonymizers to encode the location of a user or POI. Anonymizers not only protect user privacy but also reduce communication time and costs. One study [4] proposed using an anonymizer to mask the identities of a group of query users by using the identity of one random user in the group. These queries, which contain the same metadata, are transmitted to the LBS server. Another study [7] used a Hilbert curve to create an obfuscation area to mask the location of users. Anonymizers mask users by randomly selecting a representative user in proximity to a group of users. The metadata of the representative are copied to all queries before they are transmitted to the LBS server, thereby satisfying k-anonymity and obfuscation requirements. Anonymizers typically create obfuscation areas in a grid [7, 10–12] or pyramid structure to mask user locations. In [13], a method was proposed to resolve the incompatibility between the original obfuscation area and query criteria by creating an additional obfuscated query area to preserve privacy.

Even when a user is masked within an obfuscation area to satisfy k-anonymity, LBS servers can collect user queries for area information when they submit continuous LBS queries in a short period. Users are more likely to use LBSs in unfamiliar rural tourist locations (rather than in urban areas) where roads are more dispersed. The simpler road network structures of rural areas enable LBS servers to determine the locations of users by analyzing maps and road conditions. To prevent LBS servers from cross-referencing continuous queries to obtain user location information, previous studies have added reachable query routes to confuse LBS servers [14–18]. However, LBS servers can extrapolate known data, such as user habits, interests, and actual maps, to determine the most probable route of travel through an elimination process. In [19], a method was proposed to determine reasonable POIs within a user’s query area by analyzing his or her past query records. Subsequently, the user’s actual location is combined with a corresponding reasonable POI to generate a dummy query, preventing LBS servers from filtering out unreasonable dummy queries. A subsequent study [20] proposed a method that selects a nearby insensitive location from a user’s past travel routes to substitute sensitive query locations. However, this method was prone to leaking the query location because it failed to account for map data and user mobility. In response, another study [17] combined an anonymizer with map data (all intersection branches within the road network) and user mobility. To confuse LBS servers, the anonymizer used in that study generated obfuscation areas that include the section of road extending from the user’s current intersection, but they do not include blind alleys or overlapping routes according to the user’s privacy requirements.

In this study, we propose a method combining the anonymizer provided by trusted third-party servers with actual road maps and users’ movement patterns to create multiple virtual paths. When user content cannot be detected in the cache, the mechanism is applied to guarantee the privacy of user queries. The proposed method provides users with high query performance when the query volume is high while guaranteeing location privacy. The method uses the popular query characteristics of tourist locations to enhance the cache hit ratio, query performance, and protection of users’ POI and query locations. The proposed method also considers similarities between pseudoqueries and users’ actual queries, as well as cached POIs, to prevent the generated pseudoqueries from being filtered out by the LBS server, thereby increasing the cache life and hit ratio. The proposed method is suitable for continuous queries. It makes the following contributions:
(1) The privacy of users’ POIs is maintained, even during continuous querying.
(2) POIs that are difficult for LBS servers to filter out are generated by incorporating area characteristics, logs, and user queries.
(3) User privacy requirements are satisfied, even when the location obfuscation algorithm is known to the attacker.
(4) Obfuscation areas are generated from real-time maps, thereby avoiding exposing user locations.
(5) Cache data are used to reduce the communication costs and time of the anonymizer and LBS server.

The remainder of this paper is organized as follows. Section 2 describes the system architecture and initialization phase, and Section 3 discusses the development of the proposed method. The security analysis and the performance analysis are discussed in Section 4, and Section 5 presents the conclusion.

2. System Architecture

The system architecture is illustrated in Figure 1. When multiple users access LBSs to submit queries, the queries are transmitted to a trusted obfuscated server to protect user privacy. An anonymizer cross-references the query content with the cache database. If POI data matching the query content are detected, the query results are encrypted and returned to the users. In Figure 1, the queries “night market” and “super market” are returned to users from the anonymizer (indicated by the dotted line). If relevant data are not cached, the anonymizer obfuscates the user’s query and location and transmits the obfuscated query to the LBS server. In Figure 1, the user’s “fast food” query and location are obfuscated and transmitted to the LBS server (indicated by the solid line). Once the anonymizer receives the POIs within the obfuscation area from the LBS server, it updates the cache database, filters out the pseudodata, encrypts the query results, and returns the results to the user.
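The request flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class `Anonymizer`, the method `handle`, the cache key `(cell, poi_type)`, and the `lbs_query` callable standing in for the LBS server are all hypothetical names, and encryption of the returned results is omitted.

```python
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]  # hypothetical (x, y) cell number
LbsQuery = Callable[[List[Cell], str], Dict[Cell, List[str]]]


class Anonymizer:
    """Toy anonymizer: answer from the cache, else forward an obfuscated query."""

    def __init__(self, lbs_query: LbsQuery) -> None:
        self.lbs_query = lbs_query  # stand-in for the LBS server
        self.cache: Dict[Tuple[Cell, str], List[str]] = {}

    def handle(self, user_cell: Cell, poi_type: str,
               obfuscation_cells: List[Cell]) -> List[str]:
        key = (user_cell, poi_type)
        if key in self.cache:
            # Cache hit: the LBS server never sees this query.
            return self.cache[key]
        # Cache miss: only the obfuscation area is sent to the LBS server.
        results = self.lbs_query(obfuscation_cells, poi_type)
        # Cache every returned cell so that later queries can be served locally.
        for cell, pois in results.items():
            self.cache[(cell, poi_type)] = pois
        # Filter: return only the POIs of the user's own cell.
        return self.cache.get(key, [])
```

In this sketch a second query that falls inside an already retrieved obfuscation area is answered from the cache, which mirrors the dotted-line path in Figure 1.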

To reduce the computation load of the LBS server for processing user queries, the proposed method uses cell numbers instead of coordinates to represent the query range sent to the LBS server. However, this process necessitates additional computations and transmission costs to synchronize the maps, cell sizes, and cell numbers on the anonymizer and LBS server. Accordingly, we adopted a numbering system for the cellular structure to reduce the overhead costs. This method synchronizes only the center point of the map and the sides of the cells to maintain consistent map segregation and numbering between the anonymizer and LBS server.

The proposed method adopts a trusted anonymizer to protect users’ queries from being collected by the LBS server or other attackers. However, five criteria must be met to successfully implement the proposed method. First, the map on the LBS server must be divided into a cellular structure with a cell side length (Figure 2(a)), and the user’s query range must be an inscribed circle of the cellular structure (Figure 2(b)), where the query radius is the POI within the range of . Second, the LBS server cannot frequently revise the cellular structure of the map. Third, the anonymizer must be reliable for masking user locations. Fourth, the maps on the anonymizer must contain intersections, length of road sections, and speed-limit information. Fifth, the algorithm must be available to the public.

[Figure 2: (a) cellular structure of the map; (b) user query range inscribed in a cell.]

Threat models for the LBS server and general attackers are defined in this section. The effectiveness of the proposed method for guarding against these threat models is discussed in Section 4. First, the LBS servers and general attackers can continuously tap, collect, and leak user information. However, they do not alter inbound or outbound query information (e.g., query number). Second, attackers use the open obfuscation algorithm and their background knowledge on known intersections, road sections, and traffic speed limits to deduce users’ travel routes and determine their locations. Third, query results are returned from the LBS server to the anonymizer. This creates the opportunity for the LBS server or general attackers to analyze the cache data of the anonymizer by using known cache algorithms. Fourth, the LBS server and general attackers can cross-reference the obfuscation areas queried by different user IDs in different locations and at different times to identify the associations between different queries and determine the query information submitted by the same user.

Provided that the center and side lengths of the cells do not change, the initialization of the anonymizer and LBS server needs to be performed only once (procedures are presented in Section 2.1). We developed a three-phase unchained location privacy protection method for processing user queries (procedures are presented in Section 3). The following section provides the initialization model. The Notations section lists and explains the notations used in this paper.

2.1. System Initiation

Once the anonymizer obtains the center coordinates from the LBS server, it uses these coordinates to number each cell and determine their center points. The term denotes the center coordinates of each cell, where represents the - and -axes of the cell. The cellular-structure map illustrated in Figure 2(a) is used to generate a cellular structure comprising cells with side lengths = , where represents the number of layers in the structure. The cells are numbered according to the order, where the center of the map is . Assuming that the hexagonal cell has six directions, increases in increments of 1 to the right and decreases in increments of 1 to the left; increases in increments of 1 to the upper right and decreases in increments of 1 to the bottom left; increases and decreases in increments of 1 to the bottom right; and decreases and increases in increments of 1 to the upper left. Results are illustrated in Figure 2(a).
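The numbering rule can be illustrated with a short sketch. It assumes a standard axial layout for hexagons whose neighbors lie directly to the right and left: the first cell number grows to the right, the second grows to the upper right, and the map center is cell (0, 0). The function name `cell_center` and its parameters are hypothetical. Under this assumption the cells (0, 0) and (−1, 2) are stacked vertically, which agrees with the example used in Section 3.1.

```python
import math


def cell_center(x, y, side, origin=(0.0, 0.0)):
    """Center coordinates of cell (x, y) for hexagons with the given side length.

    Assumed layout: one step in x moves sqrt(3)*side to the right; one step in y
    moves half of that to the right and 1.5*side upward (upper-right neighbor).
    """
    ox, oy = origin
    cx = ox + math.sqrt(3) * side * (x + y / 2)
    cy = oy + 1.5 * side * y
    return cx, cy
```

Because only the map center and the cell side length enter the formula, both parties can derive identical cell numbers from those two values alone.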

Once all the cells in the anonymizer are numbered, set (all intersections in , which is a graph containing section length weights) and set (all sections linking two intersections in ) are matched to each cell, where represents the intersections contained in triangle Tir in cell and represents the sections with length weights contained in triangle Tir in cell .

The numbering method for triangle Tir is illustrated in Figure 3. , 2, 3, 4, 5, and 6 refer to , , , , , and , respectively. This method enables the fewest cells in the obfuscation area to be used to cover the query range. Details concerning the generation procedures and verification of the obfuscation areas are presented in Section 3.

3. Unchained Location Protection Scheme

When a user submits a query, the user transmits his or her ID, , the POI for the query, and the k-anonymity requirements to the anonymizer. The anonymizer applies the three-phase obfuscation algorithm (Figure 4) to obfuscate the user’s location prior to sending the query to the LBS server. The server then returns the queried information to the anonymizer, which filters out nonuser information before returning the POIs to the user. In Phase 1, the user’s real coordinates are used to calculate the cell number of the user location and the triangle Tir within the cell. If the cell number and POI information are already cached in the anonymizer, the algorithm skips to Phase 3. Otherwise, it continues to the next phase. In Phase 2, multiple obfuscation areas are generated according to the user’s privacy requirements. The obfuscation area that contains the query range is substituted with a pseudo-ID and a pseudoquery order before it is transmitted to the LBS server. The anonymizer then caches the information returned by the LBS server (including the user’s original query and generated pseudoquery). This information can then be used for similar queries in the future. Finally, the anonymizer uses the substituted user ID to retrieve the POI results. In Phase 3, the filtered query results are returned to the user.

Calculation of the cell numbers is explained in Section 3.1, generation of users’ obfuscation areas is described in Section 3.2, generation of multiple pseudoobfuscation areas to protect the privacy of multiple users simultaneously submitting queries and using the cache to achieve unchained location protection are presented in Section 3.3, and query submission is outlined in Section 3.4.

3.1. Calculating User Cell Number

Once the anonymizer receives the current location coordinates of the user , it applies (2) to calculate the vertical displacement of the user relative to the center coordinates of the map . The anonymizer then incorporates the vertical displacement into (3) to determine the cell number of the user. The calculation process is discussed as follows:

First, the vertical distance between the user and origin is used to calculate of the cell number for the user location. The distance between the center points of two vertically adjacent cells (e.g., (0, 0) and (−1, 2) in Figure 5(a)) is , and the cell numbers of these two cells differ by 2. When is located above the center coordinates (), on is added to . This point extends downward vertically to the cell boundary for distance . The number of cells within is calculated and incorporated into (3) to determine whether the final section smaller than crosses over to another cell. In Example (Figure 5(b)), the user is located on a random point () of above the center coordinates. can be expressed as and (see (2)). Moreover, because (see (3)). Thus, the user in Example is located in on the -axis. In Example (Figure 5(c)), because (see (2)). However, because (see (3)). Thus, the user in Example is located in . The cell number of users can be determined using (2) and (3).

[Figure 5: panels (a), (b), and (c).]

The preceding discussion indicates that must be determined before is calculated. The length of is associated with a user’s coordinate. The value of gradually increases as shifts from the cell boundary to the center of the cell . Figure 6 shows that the center distance between two cells is . In other words, one length cycle of is .

The term can be expressed as a triangular wave by using and a length cycle of . The wave equation is expressed in (4), where is the positive slope of and is the vertical displacement distance. Because each represents one cycle (see (5)), the relationship diagram between and can be illustrated by calculating the location of in a single cycle, as shown in Figure 7.

Coordinates on the -axis increase by when 1 is added to the -axis. Therefore, the offset () caused by the change in the -coordinates must be subtracted when calculating the -coordinates. Then, the current distance between and the origin on the -axis is multiplied by a unit length to obtain on the -axis, as illustrated in Figure 5(b). We know that is located in , , and the distance between and is less than . Hence, a user cell number of can be calculated using (3). Therefore, when the location of the user is known , and (2) and (3) can be used to determine the current location of the user, that is, in Figure 5(b).
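For comparison, the mapping from coordinates to a cell number can also be sketched with the cube-rounding technique commonly used for hexagonal grids. This is a swapped-in technique, not the triangular-wave derivation of (2) to (5), and it assumes the axial layout in which the second cell number grows to the upper right and the map center is the origin. The names `point_to_cell` and `side` are hypothetical.

```python
import math


def point_to_cell(px, py, side):
    """Cell number (x, y) of a point given relative to the map center.

    Uses fractional axial coordinates followed by cube rounding, so that the
    result is the cell whose center is nearest to the point.
    """
    # Fractional cell numbers: y depends only on the vertical displacement;
    # x subtracts the horizontal offset introduced by each step in y.
    yf = (2.0 / 3.0) * py / side
    xf = (math.sqrt(3) / 3 * px - py / 3) / side
    zf = -xf - yf  # third cube coordinate
    x, y, z = round(xf), round(yf), round(zf)
    dx, dy, dz = abs(x - xf), abs(y - yf), abs(z - zf)
    # Re-derive the coordinate with the largest rounding error from the others.
    if dx > dy and dx > dz:
        x = -y - z
    elif dy > dz:
        y = -x - z
    return x, y
```

As in the text, the vertical displacement fixes the second cell number first, and the horizontal offset caused by that number is subtracted before the first cell number is determined.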

3.2. Determining the Obfuscation Area

Once the cell number containing the user location and center coordinates is confirmed and if the user query information has not been cached, the anonymizer produces an obfuscation area for the user query and transmits the obfuscation area to the LBS server. The triangle Tir of the cell in which the user is located must be determined to obtain the number of cells required to encompass the query range and produce the minimum obfuscation area. First, three straight lines passing through a random cell in the cellular-structure map are conceptualized (the three red lines in the cell illustrated in Figure 3). The three lines divide the cell into six equilateral triangles. Without loss of generality, the linear equations of the three straight lines intersecting the cell containing the current location of the user can be used to determine the equilateral triangle containing the user (Figure 8): Linear equation of , Linear equation of , Linear equation of ,

[Figure 8: panels (a) and (b).]

For example, when , , and , represent that the user is either on or to the right of ; indicates that the user is on ; and means that the user is on . A combined analysis of the three lines shows that the user is in of (Figure 8(a)). Thus, four cells are selected as the obfuscation area for the user’s query. These cells are numbered , , , and . If the user’s location is at the center of the cell () and , a random equilateral triangle can be used to represent the user’s query location. In Algorithm 1, represents the obfuscation area of query in , and is the obfuscation area of the location provided by the user.

[Algorithm 1: Obfuscation Area. Input: user position, user cell number. Output: obfuscation area. Steps (1) to (19) of the listing are not reproduced here.]

Subsequently, whether the four cells encompass the user’s query range must be determined. In Algorithm 1, three cells neighboring a random triangular section of in Figure 8 are selected to form four cells. Because of the similarities among the three triangles, a random location in the upper-right triangular section of is selected, without loss of generality, to verify that the combined area of the four shaded cells is the minimum obfuscation area to encompass the user’s query range (Figure 9).
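The selection of the four cells can be sketched as follows. This is a minimal illustration under assumptions rather than a reproduction of Algorithm 1: it assumes the axial cell numbering with neighbors to the right and upper right, identifies the triangle from the angle of the user's offset instead of the three line equations, and takes the three neighboring cells to be those touching the two outer corners of that triangle. The triangle indices 0 to 5 used here are hypothetical and need not match the Tir numbering of Figure 3.

```python
import math

# Axial offsets of the six neighbors, counterclockwise starting from the right.
NEIGHBORS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def triangle_index(dx, dy):
    """Index (0..5) of the triangle containing offset (dx, dy) from the cell center.

    Triangle i faces neighbor NEIGHBORS[i] and spans 60 degrees around that
    direction. An offset of (0, 0) falls into triangle 0 here, whereas the
    text allows a random triangle for a user at the cell center.
    """
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return int(((angle + 30.0) % 360.0) // 60.0)


def obfuscation_area(cell, dx, dy):
    """The user's cell plus the three neighbors adjacent to the user's triangle."""
    i = triangle_index(dx, dy)
    x, y = cell
    picks = [NEIGHBORS[(i - 1) % 6], NEIGHBORS[i], NEIGHBORS[(i + 1) % 6]]
    return [cell] + [(x + ax, y + ay) for ax, ay in picks]
```

For instance, a user slightly to the right of the center of cell (0, 0) lies in the triangle facing cell (1, 0), and the resulting area consists of (0, 0), (1, −1), (1, 0), and (0, 1).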

Supporting Theorem 1. If the query range center is , the radius of the query range is . The user is in a random location in a triangular section with points , , and and a side length (Figure 9). Subsequently, the query range must include the cell with the triangular section and three neighboring cells to create an OA comprising four cells, specifically , , , and .

Proof. In Figure 9, if is located between and on a line expressed as , then a parallel line () with a vertical distance of can be determined. Subsequently, two points can be observed on , namely, and , which denotes that and the vertical distance from to is . Similarly, a parallel line to with a distance of can be observed (). Therefore, . A parallel line to with a distance of can be observed (). Therefore, . Subsequently, the vertical distances from to , from to , and from to are all . Thus, a circular query range with a radius of and a center point anywhere within the triangle created by , , and inevitably encompasses a section of the hexagonal section created by , , , , , and . Moreover, the polygon created by , , , , , and encompasses , , , and . Therefore, Supporting Theorem 1 holds.
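The statement of Supporting Theorem 1 can be spot-checked numerically. The sketch below assumes that the query radius equals the inradius of a cell (the inscribed circle of Figure 2(b), i.e., √3/2 times the side length), an axial cell numbering, and a four-cell area made of the user's cell and the three neighbors around the right-hand triangle. It samples user positions in that triangle and points on the query circle and tests whether each point falls in one of the four cells. All names are hypothetical, and the radius is shrunk by a tiny factor so that points lying exactly on a cell boundary do not depend on floating-point rounding. A passing run is a sanity check, not a proof.

```python
import math
import random

SQ3 = math.sqrt(3)


def point_to_cell(px, py, side):
    """Nearest-center cell number via cube rounding (assumed axial layout)."""
    yf = (2.0 / 3.0) * py / side
    xf = (SQ3 / 3 * px - py / 3) / side
    zf = -xf - yf
    x, y, z = round(xf), round(yf), round(zf)
    dx, dy, dz = abs(x - xf), abs(y - yf), abs(z - zf)
    if dx > dy and dx > dz:
        x = -y - z
    elif dy > dz:
        y = -x - z
    return x, y


def sample_in_triangle(a, b, c, rng):
    """Uniform random point inside triangle abc."""
    u, v = rng.random(), rng.random()
    if u + v > 1.0:
        u, v = 1.0 - u, 1.0 - v
    return (a[0] + u * (b[0] - a[0]) + v * (c[0] - a[0]),
            a[1] + u * (b[1] - a[1]) + v * (c[1] - a[1]))


def four_cells_cover_query(side=1.0, trials=5000, seed=0):
    rng = random.Random(seed)
    radius = SQ3 / 2 * side * (1 - 1e-9)      # inradius, shrunk slightly
    area = {(0, 0), (1, -1), (1, 0), (0, 1)}  # cell plus three neighbors
    # Right-hand triangle of cell (0, 0): center and the two right corners.
    a, b, c = (0.0, 0.0), (SQ3 / 2 * side, -side / 2), (SQ3 / 2 * side, side / 2)
    for _ in range(trials):
        px, py = sample_in_triangle(a, b, c, rng)
        t = rng.uniform(0.0, 2 * math.pi)
        qx, qy = px + radius * math.cos(t), py + radius * math.sin(t)
        if point_to_cell(qx, qy, side) not in area:
            return False
    return True
```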

Supporting Theorem 1 confirms that the obfuscation areas generated using the four cells encompass the user’s query range. We subsequently developed an additional theorem to test whether fewer cells can be used to encompass the user’s query range.

Supporting Theorem 2. If the query range center is , the radius of the query range is . The user is in a random location within a triangular section with points , , and and a side length (Figure 9). Subsequently, at least four cells are required to encompass the user’s query range.

Proof. Assume that is in a random location in and that only three cells are required to encompass the user’s query range. From Figure 9, the center point of the on can be identified. With only three cells, the diameter of the query range must be less than or equal to of (query diameter ). In actuality, the query diameter is greater than . Thus, this hypothesis is contrary to fact, verifying that at least four cells are required for the obfuscation area to encompass the user’s query range.

Theorem 1. If the user appears in a random location on the map, his or her query range is a circle with a radius of . The proposed algorithm can use the lowest number of cells to encompass the user’s query range. The algorithm can maintain one-third of the size of the obfuscation area when the obfuscation algorithm is known to the attacker.

Proof. Supporting Theorem 1 indicates that an obfuscation area comprising four cells could sufficiently encompass the user’s query range when the user is located in a random location in . Supporting Theorem 2 indicates that at least four obfuscated cells are required in order to sufficiently encompass the user’s query range when the user is located in a random location in . Naturally, the user must be in or for the LBS server to produce the shaded obfuscation areas with the algorithm (Figure 9). The combined area of the two triangles is guaranteed to be one-third of the obfuscation areas.

The user is located within a cell comprising six equilateral triangles. Therefore, the location of the user in is verified regardless of which triangle the user is located in. This suggests that if the user is located anywhere on the map, the proposed obfuscation algorithm produces an obfuscation area of at least four cells, which is the lowest number of cells required, and guarantees that the area of the cells is at least one-third of the obfuscation areas.

3.3. Producing the Obfuscation Areas of Multiple Pseudoqueries

Section 3.2 describes how an obfuscated query area is produced to prevent attackers from obtaining the locations of users in sensitive areas such as special clinics or gyms. Users largely assume that attackers have a 1 in k chance of intercepting submitted queries (k-anonymity). Thus, we developed an algorithm that can produce multiple pseudoqueries to satisfy users’ k-anonymity settings. To enhance the relevance of the pseudoqueries and reduce the number of obfuscation areas, we developed an algorithm that produces multiple pseudoqueries in batches so that individual obfuscation areas and queries can serve as pseudoqueries for other users. Finally, the algorithm replenishes inadequate queries while satisfying individual privacy requirements.

When the anonymizer receives privacy requests () in from users in different locations () and no matches are cached, these queries must be transmitted to the LBS server. The anonymizer uses the proposed algorithm (Algorithm 1) to generate different obfuscation areas for the users (). If the users collectively form an OA, then can satisfy the maximum privacy requirement of the users.

In other words, the OA collectively formed by the u users must contain four times as many cells as the maximum privacy requirement among the users.

The privacy requirements of users can be satisfied by combining the obfuscation areas of their query locations. However, additional obfuscation areas must be generated when too few users are available or when users are close together (blue area in Figure 10). For example, Users , , and in Figure 10 request a privacy strength of only 2. Therefore, . However, the three users are within the same obfuscation area generated by the anonymizer, causing . In this instance, a pseudoobfuscation area consisting of four cells must be generated (Figure 10) to meet the obfuscated cell requirement of .
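The cell-count condition can be expressed as a short check. The sketch assumes that each obfuscation area is a set of cell numbers and that the requirement is four cells per unit of the largest requested privacy strength, as stated above (a strength of 2 therefore needs eight distinct cells). The function names are hypothetical.

```python
def required_cells(privacy_levels):
    """Number of distinct cells needed: four per unit of the largest requirement."""
    return 4 * max(privacy_levels)


def covered_cells(areas):
    """Distinct cells covered by a collection of obfuscation areas."""
    return set().union(*areas)


def satisfies_requirement(areas, privacy_levels):
    return len(covered_cells(areas)) >= required_cells(privacy_levels)
```

With three users who share one four-cell area and each request a strength of 2, the check fails (4 < 8); adding one disjoint four-cell pseudoobfuscation area makes it pass.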

To generate a pseudoobfuscation area that meets the privacy requirements, we developed a method for producing multiuser pseudoobfuscation areas (Algorithm 2). The method follows three criteria to repeatedly produce obfuscation areas until the obfuscation requirement of is met:
(1) Avoid VBAs [3] in the center location of the pseudoobfuscation area generated for the user’s location.
(2) Avoid generating pseudoobfuscation areas already cached in the anonymizer. Based on the open obfuscation area generation algorithm, attackers know that the queries transmitted to the LBS server are not cached in the anonymizer. Subsequently, the LBS server deduces the cache data of the anonymizer by using the open cache algorithm. Therefore, the pseudoqueries that are detected as cached queries by the LBS server are filtered out.
(3) Create new pseudocell numbers () three layers from the user cell () to avoid generating an obfuscation area that overlaps and and reinforce obfuscation strength more rapidly. Therefore, the anonymizer randomly selects one out of six cells three layers away from , namely, , , , , , and , as to generate the obfuscation area.

[Algorithm 2: Generate Obfuscation Area (inputs include NumOfCells; steps (1) to (14)) with subroutine FindDummyCell (steps (1) to (17)). The listing body is not reproduced here.]

A cell is randomly selected from to serve as the center point (Row ()) to meet Criterion 1. Then, a pseudocell number (Row ()) is randomly selected from the cells surrounding to meet Criteria 2 and 3. Subsequently, must contain an intersection. The pseudocell number is added to so that (Row ()), and the pseudoobfuscation area generated using is added to (Rows () to ()). This process is repeated until is generated to meet the obfuscation requirement of . Finally, an obfuscation area set is produced.
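The loop described above can be sketched roughly as follows. It is an illustration under several assumptions and not a reconstruction of Algorithm 2: cells use axial numbers, the six candidate cells "three layers away" are taken to be three steps along each of the six neighbor directions, the pseudoobfuscation area is built around a randomly chosen triangle, and `cached` and `has_intersection` are hypothetical stand-ins for the anonymizer's cache and map data. A retry limit is added so that the sketch always terminates.

```python
import random

NEIGHBORS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def four_cell_area(cell, tri):
    """Cell plus the three neighbors adjacent to triangle index tri (0..5)."""
    x, y = cell
    offsets = [NEIGHBORS[(tri - 1) % 6], NEIGHBORS[tri], NEIGHBORS[(tri + 1) % 6]]
    return {cell} | {(x + ax, y + ay) for ax, ay in offsets}


def generate_areas(user_areas, k_max, cached, has_intersection, rng,
                   max_tries=1000):
    """Add pseudoobfuscation areas until 4 * k_max distinct cells are covered.

    Assumes at least one non-empty user area is supplied.
    """
    areas = [set(a) for a in user_areas]
    covered = set().union(*areas)
    tries = 0
    while len(covered) < 4 * k_max:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not find enough pseudocells")
        # Criterion 1: start from a randomly chosen cell of the existing areas.
        base = rng.choice(sorted(covered))
        # Criterion 3: candidate pseudocell three layers away (assumed offsets).
        ax, ay = rng.choice(NEIGHBORS)
        dummy = (base[0] + 3 * ax, base[1] + 3 * ay)
        # Criterion 2 and the map condition: skip cached cells, already covered
        # cells, and cells without a road intersection.
        if dummy in cached or dummy in covered or not has_intersection(dummy):
            continue
        area = four_cell_area(dummy, rng.randrange(6))
        areas.append(area)
        covered |= area
    return areas
```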

In Figure 11, Users , , and move along the red line and transmit queries at the red points at different times. The blue, yellow, and red areas represent the three obfuscation areas generated by the anonymizer for the users’ queries transmitted to the LBS server. The anonymizer receives the privacy requirements of Users and , which are and . However, and generated for the locations of Users and overlap, creating an OA with only seven cells. This fails to meet User ’s privacy requirement of 2, which requires eight cells (). The anonymizer selects a random cell three layers away from and . It identifies and uses this cell to generate a pseudoobfuscation area (). This generates an obfuscation area with 11 > (blue area in Figure 10), which meets the privacy requirements of Users and .

Then, the anonymizer separately receives the privacy requirements of , , and from Users , , and , respectively. Because the query of User is already cached in the anonymizer, it can directly respond to that query. It then generates obfuscation areas for Users and and transmits them to the LBS server. Therefore, three obfuscation areas are created to satisfy the requirement of obfuscated cells, namely, , , and (yellow area in Figure 11).

Finally, the anonymizer receives the privacy requirements of and from Users and , respectively. Because the query of User is already cached in the anonymizer, it generates an obfuscation area only for User in order to satisfy the two obfuscation requirements (red area in Figure 11).

The preceding obfuscation method has two problems. First, the anonymizer can immediately respond to the user without accessing the LBS server when a similar query is cached. Existing methods aimed at enhancing the cache hit ratio [25–27] effectively reduce the likelihood of exposing queries to the LBS server while conserving the communication cost and computation load of the anonymizer. For example, the proposed method uses a hierarchical clustering method [28–31] to group the cached queries according to popularity. These groups are then used to generate corresponding pseudoqueries to prevent attacks that exploit an uneven query distribution [32].

Second, when the anonymizer transmits user IDs to the LBS server, attackers can determine users’ travel routes by analyzing the queries of similar IDs, even when the location of the user is obfuscated. The following section proposes a method for generating unrepeated random pseudouser IDs for each .

3.4. Generating Obfuscated Query Information

We developed a method to prevent LBS servers from combining obfuscation areas and user IDs to deduce users’ travel routes. Even when a simple algorithm is applied to substitute different user IDs with the same ID, LBS servers can still combine intersection and traffic speed-limit data to deduce users’ travel range and travel routes [33–36]. To prevent this problem, when the anonymizer generates obfuscation areas for user queries to satisfy their user privacy requirements, it randomly produces the q pseudo-IDs:

Then, the anonymizer combines all obfuscation areas and the corresponding POIs to generate

The content is randomly interchanged to generate

Directly transmitting the query without changing the order of allows the LBS server to use the known algorithm to identify to be the real user location. Changes are logged with the anonymizer and used to filter user query results once they are returned by the LBS server. Finally, and are combined to transmit the protected query to the LBS server:
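As a rough illustration of the steps in this section (and not the authors' notation or message format), the sketch below draws one unrepeated random pseudo-ID per outgoing query, shuffles the order of the obfuscation areas and POIs, and keeps a log that lets the anonymizer map returned results back to real users. The tuple layout `(real_user_id, area, poi)`, the use of `None` for pseudoqueries, and the function name `protect_queries` are assumptions made for this example.

```python
import random
import secrets


def protect_queries(entries, rng=None):
    """Replace IDs with random pseudo-IDs and shuffle the query order.

    entries: list of (real_user_id_or_None, area, poi); None marks a pseudoquery.
    Returns (outgoing, log): outgoing is what would be sent to the LBS server,
    and log maps each pseudo-ID back to the real user ID (or None).
    """
    rng = rng or random.SystemRandom()
    # Draw unrepeated pseudo-IDs, one per outgoing query.
    pseudo_ids = set()
    while len(pseudo_ids) < len(entries):
        pseudo_ids.add(secrets.token_hex(8))
    # Randomly interchange the order so position does not reveal the real query.
    order = list(range(len(entries)))
    rng.shuffle(order)
    outgoing, log = [], {}
    for pid, idx in zip(pseudo_ids, order):
        real_id, area, poi = entries[idx]
        outgoing.append((pid, area, poi))
        log[pid] = real_id  # kept only at the anonymizer
    return outgoing, log
```

Because a fresh pseudo-ID is drawn for every query, two consecutive queries from the same user carry unrelated IDs in this sketch.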

4. Analysis

This section analyzes the security and performance of the proposed method and compares the results with those of previous studies. In Section 4.1, we present the security analysis items and compare past security problems. In Section 4.2, the method is applied to a map to examine the method’s real-time performance.

4.1. Security Analysis

The unchained location privacy protection method developed in the present study was based on a trusted anonymizer and existing user/anonymizer security architectures to protect information confidentiality. Therefore, this section discusses four threat models derived from attacks that occur during the communication between the trusted LBS server and anonymizer. The results verify that the proposed method can effectively guard against most LBS attacks when the algorithm is known to the attacker.

When attackers possess the background knowledge of the maps and the capacity to continuously monitor user query content, they can issue the following attacks on user privacy:

Location Homogeneity Attack (LHA). Attackers collect queries originating from a particularly sensitive area to infer user information; for example, queries from a hospital specializing in cardiology and heart surgery reveal information on heart patients.

Map Matching (MM). Attackers use background knowledge to filter out unlikely query source locations (e.g., lakes) to enhance the likelihood of identifying the actual locations of users.

When LBS servers and general attackers use known location obfuscation algorithms to analyze the queries submitted by multiple users in obfuscated locations, they can perform the following attacks on user privacy:

Known Algorithm Attack (KAA). Attackers who are aware of the obfuscation algorithm can use the algorithm to calculate the obfuscation areas generated in different locations and filter out the less likely results to reduce the obfuscation strength of user locations.

Distance VBA. Attackers calculate the center points of obfuscation areas to estimate the actual locations of users [3].
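The center-point estimate can be illustrated with a short sketch, which also shows why an obfuscation area centered on the user is risky. The square area and the function names are assumptions for illustration only.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def center_estimate(vertices: Iterable[Point]) -> Point:
    """Attacker's guess: the mean of the obfuscation area's vertices."""
    pts = list(vertices)
    if not pts:
        raise ValueError("an obfuscation area needs at least one vertex")
    x = sum(p[0] for p in pts) / len(pts)
    y = sum(p[1] for p in pts) / len(pts)
    return (x, y)


def centered_square(user: Point, half_side: float) -> List[Point]:
    """A naive obfuscation area: a square centered on the user's real location."""
    ux, uy = user
    return [
        (ux - half_side, uy - half_side),
        (ux + half_side, uy - half_side),
        (ux + half_side, uy + half_side),
        (ux - half_side, uy + half_side),
    ]
```

For a user at (3.0, 4.0), the center of centered_square((3.0, 4.0), 2.0) is exactly the user's location, whereas an area whose placement does not depend on the user's position within it gives the attacker no such shortcut.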

When LBS servers and general attackers cross-reference the obfuscation areas of queries submitted by different IDs in different locations at different times, they can perform the following attacks on user privacy.

Maximum Movement Boundary (MMB). Attackers examine the traffic speed limits of the map to calculate the maximum movement boundary of the user. They eliminate the areas that the user cannot reach to reduce the obfuscation areas of continuous queries.

Multiple Query Attack (MQA). Attackers cross-reference the members and movement of users in different obfuscation areas to filter out pseudousers and identify real users.
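The maximum movement boundary filter can be sketched as below. This is a simplified illustration: it uses straight-line distance between cell centers, whereas a real attacker would likely use road-network distance, and the function name, parameters, and cell-center table are hypothetical.

```python
import math
from typing import Dict, Set, Tuple

Point = Tuple[float, float]  # cell center in km, hypothetical coordinate system


def reachable_cells(
    previous_cells: Set[int],
    candidate_cells: Set[int],
    centers: Dict[int, Point],
    speed_limit_kmh: float,
    elapsed_s: float,
    cell_radius_km: float = 0.0,
) -> Set[int]:
    """Keep only candidate cells reachable from some previous cell.

    The movement bound is speed * time, padded by two cell radii because the
    user may start and end anywhere inside a cell, not only at its center.
    """
    max_km = speed_limit_kmh * elapsed_s / 3600.0 + 2 * cell_radius_km
    keep: Set[int] = set()
    for c in candidate_cells:
        for p in previous_cells:
            if math.dist(centers[c], centers[p]) <= max_km:
                keep.add(c)
                break
    return keep
```

With a 60 km/h limit and 120 s between queries, the bound is 2 km, so a candidate cell 10 km away from every previously reported cell would be discarded by the attacker, shrinking the effective obfuscation area.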

The results in Table 1 show that the proposed method effectively guards against all known attacks. The symbol “O” denotes that the method can defend against this type of attack, and the symbol “X” denotes that the method fails to defend against this type of attack. In [3, 21], methods were proposed to obfuscate the locations of numerous querying users. However, these methods failed to consider user locations close to sensitive areas, which enables attackers to exploit these areas by using LHAs to obtain user locations. In [3, 22], algorithms were developed to obfuscate multiquery submissions. However, these methods could not continuously obfuscate locations while the user is moving, which enables attackers to observe the routes of users by performing MQAs. In [20], a method was proposed to substitute sensitive query locations with nearby insensitive locations cached in the anonymizer. However, this method failed to consider user movement speeds, enabling attackers to filter user locations by performing MMBs. Moreover, [20] used the center location of users to generate obfuscation areas, enabling attackers to estimate the actual location of users by performing VBAs [14, 21, 23–25]. Attackers could also confirm the center location of users in an obfuscation area once the algorithm is known to them. In [22], a method was proposed for generating road network obfuscation areas by searching neighboring intersections to avoid placing users on the same road. However, systematically searching neighboring intersections enables attackers to perform KAAs to map the obfuscation method and identify user locations.

4.2. Performance Analysis

We implemented simulations in Java 8 on a computer equipped with an Intel i5-4570 CPU to create a test environment with a road map of Oldenburg, Germany [37]. Figure 12 shows that the anonymizer expanded the side length of the map from 10 to 40 km while generating cells to satisfy . Notably, reducing the -value increased the number of cells generated on maps with similar side lengths, reducing the content of each cell. The proposed method uses the same number of cells to obfuscate user query range. Therefore, lower -values reduce the user query range and decrease the amount of data required to return query results from the LBS server.

We observed intersection conditions by dividing the Oldenburg map into -sized cells (Figure 13). Results showed that smaller cells contained fewer intersections. Although Figure 12 shows that cells with shorter sides reduce the transmission load, the results in Figure 13 indicate that smaller cells reduce the number of intersections per cell. Fewer intersections increase the likelihood of attackers estimating the actual location of users. Therefore, a balance between transmission efficiency and the privacy strength must be achieved.

In Figure 14, the privacy requirement of each user is assumed to be and to compare the required average number of queries transmitted to the LBS server. We first compared our method, with a cache hit ratio of 0, against the result of [25] for the number of queries submitted by a single user to generate an obfuscation area. Without the cache, four or more users must transmit queries simultaneously for the proposed method to meet the privacy requirements with fewer pseudoqueries sent by the anonymizer to the LBS server. The proposed method can combine the user queries of similar obfuscation areas to meet various privacy requirements. In [25], a cache was used to reduce computation and transmission loads. In the present study, we adopted a cache hit ratio of 70%, similar to that used in [25]. Regardless of the number of users, we maintained a privacy protection strength equivalent to that reported in [25], and the performance of the proposed method improved as the number of users increased. In our proposed method, additional obfuscation areas must be generated when too few users are available or when users are close together. When the number of users is 2, only 3 queries are submitted to the LBS server in [25], whereas our method requires 6.185 queries at a hit rate of 0% or 3.32 queries at a hit rate of 70%. However, the obfuscation areas of the query locations can be combined as the number of users increases, which reduces the number of queries that must be sent to the LBS server. In Figure 14, when the number of users is 8, 12 queries are submitted to the LBS server in [25], whereas our method needs 8.019 queries at a hit rate of 0% and only 6.427 queries at a hit rate of 70%. In this situation, our method outperforms that of [25].
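The comparison can be made concrete by computing the relative change in query count from the figures quoted above. The snippet below is only a worked calculation on those reported numbers; the dictionary layout and the helper name relative_saving are illustrative.

```python
# Average number of queries sent to the LBS server, as quoted in the text.
reported = {
    2: {"baseline": 3.0, "hit0": 6.185, "hit70": 3.32},    # 2 users
    8: {"baseline": 12.0, "hit0": 8.019, "hit70": 6.427},  # 8 users
}


def relative_saving(baseline: float, ours: float) -> float:
    """Fraction of queries saved relative to [25]; negative means more queries."""
    return (baseline - ours) / baseline


savings = {
    users: {
        setting: relative_saving(row["baseline"], row[setting])
        for setting in ("hit0", "hit70")
    }
    for users, row in reported.items()
}

for users, row in savings.items():
    print(users, {k: f"{v:+.1%}" for k, v in row.items()})
```

With 2 users the proposed method sends more queries than [25] in both settings (negative savings), whereas with 8 users it sends roughly one-third fewer queries without the cache and roughly 46% fewer at a 70% hit rate.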

Figure 15 shows the average road lengths in the Oldenburg map that satisfy the -anonymity when the radius . In [22], the roads were simply extended to obfuscate the location of users. Niu et al.’s method [3] uses a random walk-based cloaking algorithm, and the method proposed in the present study divides the map into cells. Therefore, the average road lengths of the overall obfuscated areas using the proposed method and [3] were markedly longer than those determined using the method proposed in [22]. Moreover, we generate extra queries to simulate a multiuser environment, which requires generating additional obfuscation areas when the cells overlap. The proposed method generated 4.88% longer road length than [3] when .

5. Conclusion

We developed a privacy protection scheme that protects the real locations of moving users. The scheme produces multiuser pseudoqueries and uses obfuscation areas to prevent LBS servers from directly deducing users’ real queries and precise locations. We verified that the method produces obfuscation areas with the least number of cells and guarantees one-third of the original obfuscation area size when the algorithm is disclosed. We also considered the distinct characteristics of user queries in different areas and adopted a grouping approach coupled with actual maps to reduce the likelihood of the pseudodata being filtered out by the LBS server, thereby satisfying users’ privacy requirements. Furthermore, we incorporated a caching system to store users’ continuous queries. The cache system coupled with multiuser queries prevents the LBS server from completely deducing users’ routes. Instead, the LBS server can generate only scattered and obfuscated user locations. Therefore, the proposed method effectively protects location privacy during continuous querying. The cache approach also reduces the likelihood of user locations being transmitted to the LBS server, decreases the computation and transmission loads of the anonymizer, and enhances system performance. The proposed method is fully compatible with various user devices. Users can use their original mobile devices and Internet service providers to access the trusted anonymizer to protect their location details when submitting a query. Finally, we verified that the proposed method effectively protects users’ identities, locations, and interests and guards against most currently known attacks on location privacy. We also used a real-time road map to test the proposed method. Figure 14 shows that the proposed method uses a cache approach to greatly reduce the amount of query information exposed to the LBS server. A summary of the results illustrated in Figures 14 and 15 shows that the proposed method outperformed other existing methods.

Notations

:	Cell side length
:	Number of real users
:	Index value,
:	-anonymity requirement of the user in query
:	-anonymity requirement for the multiuser query
ID:	User ID
:	Pseudo-ID randomly generated by the anonymizer
:	Cell number
:	Cell number
:	- and -coordinates of cell
:	User’s real current location
:	The real location of the user in a multiuser query
:	Cell number of user in a query
:	Cell number set during a query
:	Number of the equilateral triangles formed by the six vertices of the cell
:	Intersection set contained in triangle Tir in cell
:	The set of all intersections
:	Road section set of triangle Tir in cell
:	All road section sets
:	Vertical distance between and the lower or upper boundary of the cell
:	Current query in a continuous query,
:	Wait time of the anonymizer before receiving a query from a user
:	Indicator of additional space between cells,
:	One obfuscation area comprising four cells, all cell numbers within the obfuscation area in of query
:	OA set of query
:	and sets (no changes to production order)
:	Content order after changing

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the National Science Council of Taiwan under Grants nos. MOST 106-2221-E-130-001, MOST 106-3114-E-011-003, and MOST 106-2221-E-033-002.

References

[1] C.-Y. Chow, M. F. Mokbel, and X. Liu, “A peer-to-peer spatial cloaking algorithm for anonymous location-based service,” in Proceedings of the 14th Annual ACM International Symposium on Advances in Geographic Information Systems (ACM-GIS '06), pp. 171–178, ACM, November 2006.
[2] C.-Y. Chow, M. F. Mokbel, and X. Liu, “Spatial cloaking for anonymous location-based services in mobile peer-to-peer environments,” GeoInformatica, vol. 15, no. 2, pp. 351–380, 2011.
[3] B. Niu, X. Zhu, Q. Li, J. Chen, and H. Li, “A novel attack to spatial cloaking schemes in location-based services,” Future Generation Computer Systems, vol. 49, pp. 125–132, 2015.
[4] A. Pfitzmann and M. Köhntopp, “Anonymity, unobservability, and pseudonymity—a proposal for terminology,” in Proceedings of International Workshop on Design Issues in Anonymity and Unobservability Berkeley, vol. 2009, pp. 1–9, Springer, Berlin, Germany.
[5] T. Rodden, A. Friday, H. Muller, and A. Dix, “A lightweight approach to managing privacy in location-based services,” Technical Report Equator-02-058, University of Nottingham, Lancaster University, University of Bristol, 2002.
[6] C. A. Ardagna, M. Cremonini, S. De Capitani Di Vimercati, and P. Samarati, “An obfuscation-based approach for protecting location privacy,” IEEE Transactions on Dependable and Secure Computing, vol. 8, no. 1, pp. 13–27, 2011.
[7] M. L. Damiani, E. Bertino, and C. Silvestri, “Protecting location privacy against spatial inferences: the PROBE approach,” in Proceedings of the 2nd SIGSPATIAL ACM GIS 2009 International Workshop on Security and Privacy in GIS and LBS, pp. 32–41, 2009.
[8] M. Duckham and L. Kulik, “A formal model of obfuscation and negotiation for location privacy,” in Proceedings of International Conference of Pervasive Computing, pp. 152–170, May 2005.
[9] D. C. Howe and H. Nissenbaum, “TrackMeNot: resisting surveillance in web search,” in Lessons from the Identity Trail: Anonymity, Privacy, and Identity in a Networked Society, pp. 417–436, 2009.
[10] P. Kalnis, G. Ghinita, K. Mouratidis, and D. Papadias, “Preventing location-based identity inference in anonymous spatial queries,” IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 12, pp. 1719–1733, 2007.
[11] J.-H. Um, H.-D. Kim, and J.-W. Chang, “An advanced cloaking algorithm using Hilbert curves for anonymous location based service,” in Proceedings of the 2nd International Conference on Social Computing, pp. 1093–1098, Minneapolis, MN, USA, August 2010.
[12] C. Zhang and Y. Huang, “Cloaking locations for anonymous location based services: a hybrid approach,” GeoInformatica, vol. 13, no. 2, pp. 159–182, 2009.
[13] C.-P. Wu, C.-C. Huang, J.-L. Huang, and C.-L. Hu, “On preserving location privacy in mobile environments,” in Proceedings of the 2011 9th IEEE International Conference on Pervasive Computing and Communications Workshops, PERCOM Workshops 2011, pp. 490–495, Seattle, WA, USA, March 2011.
[14] T. Xu and Y. Cai, “Exploring historical location data for anonymity preservation in location-based services,” in Proceedings of the 27th IEEE Conference on Computer Communications (INFOCOM '08), pp. 547–555, IEEE, April 2008.
[15] P. Shankar, V. Ganapathy, and L. Iftode, “Privately querying location-based services with sybilquery,” in Proceedings of the 11th ACM International Conference on Ubiquitous Computing, UbiComp'09, pp. 31–40, USA, October 2009.
[16] B. Palanisamy and L. Liu, “MobiMix: protecting location privacy with mix-zones over road networks,” in Proceedings of the IEEE 27th International Conference on Data Engineering, pp. 494–505, Hannover, Germany, April 2011.
[17] K.-T. Yang, G.-M. Chiu, H.-J. Lyu, D.-J. Huang, and W.-C. Teng, “Path privacy protection in continuous location-based services over road networks,” in Proceedings of the IEEE 8th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob '12), pp. 435–442, October 2012.
[18] T.-H. You, W.-C. Peng, and W.-C. Lee, “Protecting moving trajectories with dummies,” in Proceedings of the 8th International Conference on Mobile Data Management (MDM '07), pp. 278–282, Mannheim, Germany, May 2007.
[19] A. Pingley, N. Zhang, and X. Fu, “Protection of query privacy for continuous location based services,” in Proceedings of the INFOCOM, pp. 1710–1718, IEEE, Shanghai, China, 2011.
[20] C. Ardagna, G. Livraga, and P. Samarati, “Protecting privacy of user information in continuous location-based services,” in Proceedings of the IEEE 15th International Conference on Computational Science and Engineering (CSE '12), pp. 162–169, Nicosia, Cyprus, December 2012.
[21] T. Xu and Y. Cai, “Location anonymity in continuous location-based services,” in Proceedings of the 15th ACM International Symposium on Advances in Geographic Information Systems (GIS '07), pp. 300–307, November 2007.
[22] T. Wang and L. Liu, “Privacy-aware mobile services over road networks,” in Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 1042–1053, 2009.
[23] H. Lee, B.-S. Oh, H.-I. Kim, and J. Chang, “Grid-based cloaking area creation scheme supporting continuous location-based services,” in Proceedings of the 27th Annual ACM Symposium on Applied Computing (SAC '12), pp. 537–543, March 2012.
[24] D. Song, J. Sim, K. Park, and M. Song, “A privacy-preserving continuous location monitoring system for location-based services,” International Journal of Distributed Sensor Networks, vol. 11, no. 8, pp. 1–10, 2015.
[25] B. Niu, Q. Li, X. Zhu, G. Cao, and H. Li, “Enhancing privacy through caching in location-based services,” in Proceedings of the 34th IEEE Annual Conference on Computer Communications (IEEE INFOCOM '15), pp. 1017–1025, IEEE, May 2015.
[26] S. Amini, J. Lindqvist, J. Hong, J. Lin, E. Toch, and N. Sadeh, “Caché: caching location-enhanced content to improve user privacy,” in Proceedings of the 9th International Conference on Mobile Systems, Applications, and Services, pp. 197–210, ACM, 2011.
[27] X. Zhu, H. Chi, B. Niu, W. Zhang, Z. Li, and H. Li, “MobiCache: When k-anonymity meets cache,” in Proceedings of the 2013 IEEE Global Communications Conference, GLOBECOM 2013, pp. 820–825, IEEE, Atlanta, GA, USA, December 2013.
[28] P. Berkhin, “A survey of clustering data mining techniques,” in Grouping Multidimensional Data, J. Kogan, C. Nicholas, and M. Teboulle, Eds., pp. 25–71, Springer, Berlin, Germany, 2006.
[29] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data, Prentice Hall, 1988.
[30] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: a review,” ACM Computing Surveys (CSUR), vol. 31, no. 3, pp. 264–323, 1999.
[31] G. Karypis, E.-H. Han, and V. Kumar, “Chameleon: hierarchical clustering using dynamic modeling,” Computer, vol. 32, no. 8, pp. 68–75, 1999.
[32] R. Shokri, G. Theodorakopoulos, J.-Y. Le Boudec, and J.-P. Hubaux, “Quantifying location privacy,” in Proceedings of the IEEE Symposium on Security and Privacy, SP 2011, pp. 247–262, Berkeley, Calif, USA, 2011.
[33] C.-Y. Chow and M. F. Mokbel, “Trajectory privacy in location-based services and data publication,” ACM SIGKDD Explorations Newsletter, vol. 13, no. 1, pp. 19–29, 2011.
[34] E. Kaplan, T. B. Pedersen, E. Sava, and Y. Saygin, “Discovering private trajectories using background information,” Data and Knowledge Engineering, vol. 69, no. 7, pp. 723–736, 2010.
[35] T. N. Phan, T. K. Dang, and J. Küng, “User privacy protection from trajectory perspective in location-based applications,” in Proceedings of Interdisciplinary Information Management Talks, pp. 281–288, 2011.
[36] M. Wernke, P. Skvortsov, F. Dürr, and K. Rothermel, “A classification of location privacy attacks and approaches,” Personal and Ubiquitous Computing, vol. 18, no. 1, pp. 163–175, 2014.
[37] T. Brinkhoff, “Oldenburg: nodes & edges,” 2017, http://iapg.jade-hs.de/personen/brinkhoff/generator.

Copyright

Copyright © 2017 Jia-Ning Luo and Ming-Hour Yang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
