Abstract

In platforms with Location-Based Services (LBSs), such as Foursquare and Swarm, the position submitted for a share or a search exposes users’ activities. Cross-platform account linkage can aggravate this exposure, as the fusion of users’ information enhances inference attacks on a user’s next submitted location. Hence, in this paper, we propose GLPP, a personalized and continuous location privacy-preserving framework for account linked platforms with different LBSs (i.e., search-based LBSs and share-based LBSs). The key point of GLPP is to obfuscate every location submitted in search-based LBSs so as to defend against dynamic inference attacks. Specifically, first, possible inference attacks are listed through user behavioral analysis. Second, for each specific attack, an obfuscation model is proposed to minimize location privacy leakage under a given location distortion, which ensures the utility of submitted locations for search-based LBSs. Third, for dynamic attacks, a framework based on a zero-sum game is adopted to combine the specific obfuscations above and minimize location privacy leakage to a balanced point. Experiments on a real dataset demonstrate the effectiveness of our proposed attacks in Accuracy, Certainty, and Correctness and, meanwhile, also show the performance of our preserving solution in defending against attacks and guaranteeing location utility.

1. Introduction

In a platform with Location-Based Services (LBSs), a relevant position is submitted for a share or a search (e.g., a check-in or a local search), which connects the physical world with the cyber and social worlds [1]. However, users’ locations are exposed to LBS providers and stored on an LBS server during this procedure. Once misused or attacked, users’ habits or activities can be inferred from their historical trajectories. For example, a WeChat (https://web.wechat.com) user who uses the surrounding search to explore new friends nearby may disclose his current location [2, 3]. Another example is a Foursquare (https://foursquare.com) user using local search at a relatively private place (e.g., a hospital or a bank). In the above cases, location privacy protection is needed, while the user experience should also be preserved in search-based LBSs.

Existing research on location privacy protection has worked out many feasible solutions in the traditional setting, including access control [4–6], data distortion [7–9], and cryptography [10]. However, little attention has been paid to location privacy leakage in account linked mixed LBSs, where the situation becomes more complex. Account linked mixed LBSs refer to two or more LBSs of different kinds whose base platforms are linked by account. According to functionality, LBSs can be divided into two groups, namely, share-based LBSs (e.g., check-in) and search-based LBSs (e.g., local/remote search). With these mixed LBSs linked, privacy protection becomes more difficult from the view of either an individual platform or the combined platforms.

Individually, for account linked mixed LBSs, different strategies should be developed for the different situations in specific LBSs. Just as users of search-based LBSs face a trade-off between privacy protection and user experience, users of share-based LBSs can also be caught in a dilemma when some locations are considered private. Traditional obfuscation is effective in the former case but does not work in the latter: since locations published in share-based LBSs are aimed at gaining sentimental value from online friends, the user’s status and submitted locations should not be obfuscated. As a result, users need to weigh the gains and losses themselves to decide whether to make a location public.

Comprehensively, account linkage [11] shares user information across platforms, which boosts location inference attacks against both the users and their friends. For an anchor user who has linked accounts across platforms, his sequential locations are connected in time order to form a relatively complete trajectory. Moreover, the different friend circles maintained on different platforms are integrated into a new one, which provides more abundant information for location inference attacks.

Based on the challenges mentioned above, in this paper we aim to preserve users’ location privacy in account linked LBSs, which is formally defined as the “account linked LBSs caused location privacy leakage” (AL-LPL) problem. AL-LPL is a typical instance of the “account linked services caused privacy leakage” (AL-PL) problem, which is increasingly urgent given the prevalence of account linked platforms. Once a user’s location is leaked, the harm is greater than on a single platform, as more friends are exposed. The solution proposed in this paper not only solves AL-LPL but also provides a feasible method for follow-up work in this area. Furthermore, AL-LPL is a general problem and applies to a class of account linked platforms providing LBSs, like Facebook (https://www.facebook.com), Yelp (https://www.yelp.com), WeChat, and DIAN PING (http://www.dianping.com). In addition, AL-LPL is a novel problem compared with traditional privacy leakage problems. Different from traditional location privacy leakage [12, 13] in a specific kind of LBS, AL-LPL concerns the location privacy leakage caused by linking different LBSs. AL-LPL is also distinct from the work [14] discussing location privacy leakage in mixed LBSs within one platform, because the LBSs in AL-LPL are organized in more than one system, so that network alignment is needed between the two platforms.

Here, we use Foursquare-Swarm as the research case to show the effectiveness of our proposed attacks and to test the performance of the corresponding protection mechanism. For information integration, we make use of existing anchor users and adopt the location profile system of Foursquare as the knowledge base [15]. As shown in Figure 1, every Point of Interest (POI) in this system has a unique profile with detailed information, which tolerates slight deviations in coordinates and address names. Hence, a location in any form, coordinates or address, can be unified in this system and represented by a location profile with a unique location ID.

Based on the network alignment of the two platforms, a new heterogeneous social network with mixed LBSs is constructed, on which specific inference attacks are formed to guess a user’s next submitted location. These inference attacks are supported by previous analyses of user behavior in social networks [16–19], which suggest that a user’s location can be predicted both by his friends’ trajectories and by his periodic mobility pattern. Besides, to avoid disturbance from overactive users with friends or trajectories spanning the world, we limit the range of inference attacks to a predefined geographical region (e.g., a city).

To defend against the attacks above, we use a framework (shown in Figure 2) based on a noncooperative game to help a user obfuscate both his historical locations and his online location before submission in a search-based LBS. The basic hypothesis here is that “service providers and LBS location servers are untrustworthy.” Therefore, our design does not rely on any trusted third party (TTP). Every location in a search-based LBS is obfuscated through a unified method. The solution imitates a user’s random walk to a nearby Point of Interest (POI), takes the trade-off between obfuscation and submission utility into consideration, and implements optimal obfuscation to fend off dynamic inference attacks. In detail, an obfuscation model is established for each specific fixed attack, where AL-LPL is transformed into a multiobjective optimization problem to reduce the attack’s performance in Accuracy and Correctness. Based on the obfuscation solutions against the listed inference attacks, a game-based framework is adopted to defend against dynamic attacks, where AL-LPL is transformed into a minimax optimization problem to avoid the worst case of location privacy leakage under any given attack. The main contributions of this paper are as follows:

(i) To the best of our knowledge, this is the first work to give a specific protection solution for location privacy leakage caused by account linked LBSs.

(ii) New attack models are established to infer users’ next submitted location based on users’ and their friends’ historical trajectories and are shown, through information fusion, to boost attack effectiveness compared with the state of the art [20, 21].

(iii) To defend against the continuous and dynamic attacks formed from the above, an improved game-based framework is proposed to obfuscate every location in search-based LBSs, which consists of the offline combination of obfuscation solutions generated by our proposed obfuscation model and the online updating triggered whenever a new location is produced in any LBS.

(iv) The experiments, based on a real dataset, demonstrate the effectiveness of our solutions in defending against multisource continuous attacks and show that the location utility of our solutions remains under users’ control.

The rest of this paper is organized as follows. Section 2 introduces the preliminary work. Section 3 gives the corresponding solution, and empirical experiments are reported in Section 4. Section 5 discusses the related work. Section 6 draws the conclusion.

2. Preliminaries

In this section, we first introduce background information about Foursquare and Swarm (https://www.swarmapp.com) and then describe the network alignment between these platforms, which provides the foundation for the problem formulation.

2.1. Background Information

As representative LBS platforms, Foursquare and Swarm are adopted as the research case to prove the effectiveness of our proposed attacks and protection. As a companion application split off from the older Foursquare, Swarm allows users to share their current location with friends and has developed into a Location-Based Social Network (LBSN) with a share-based LBS. Meanwhile, the new Foursquare provides local search and recommendation services and has transformed into a popular mixed LBS platform (i.e., local search is a search-based LBS and the recommendation service is a share-based LBS). Here, the recommendation service is provided based on tips written in Foursquare, which reflect users’ preferences for POIs. As a result of the split, old Foursquare users exist on both new platforms, which produces sufficient cross-platform account linkage. Moreover, when new users register in Swarm, they can choose to use their existing Foursquare account. In addition, a redirection link is provided on the Foursquare web page, which directs to the Swarm web page. As a result, the two online platforms are closely linked with each other. In this typical case, the AL-LPL problem potentially exists for all these linked accounts.

2.2. Network Alignment

As the foundation of user information fusion, network alignment [22] is a mapping between the nodes of different networks, used to combine two or more networks into a new hybrid network with global information. For platforms with LBSs, the networks are composed of two kinds of nodes (i.e., user nodes and location nodes) and two kinds of edges (i.e., user-user friendship edges and user-location submission edges). As a consequence, the mapping should be established both between users and between locations.

As for the network alignment between Foursquare and Swarm, on the one hand, the sufficient account linkages provide adequate anchor users, which leads to a complete mapping between user nodes across platforms. On the other hand, Foursquare and Swarm share the same location profile system as a result of the earlier split, so they share the same mapping between a given position (e.g., coordinates or an address) and a location ID; hence, location mapping is not a problem. For platforms that do not share a unified mapping with each other, Foursquare can still be used as the mapping standard because of its well-defined location profiles: coordinates can be clustered to the nearest POI to alleviate location drift, and incomplete addresses can be matched by keywords.
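To make the location mapping step concrete, the following is a minimal sketch, in the spirit of the C++ implementation mentioned in Section 4.4, of clustering a raw coordinate to the nearest POI in a profile system; the Poi struct, the haversine helper, and all data are illustrative assumptions rather than part of the Foursquare or Swarm APIs.

```cpp
// Minimal sketch of mapping a raw (lat, lon) reading to the nearest POI in the
// unified location profile system. Poi and the sample data are illustrative.
#include <cmath>
#include <cstdio>
#include <string>
#include <vector>

struct Poi {
    std::string id;   // unique location ID in the profile system
    double lat, lon;  // POI coordinates
};

// Great-circle distance in kilometers between two (lat, lon) points.
double haversine(double lat1, double lon1, double lat2, double lon2) {
    const double kR = 6371.0, kRad = 3.14159265358979323846 / 180.0;
    double dlat = (lat2 - lat1) * kRad, dlon = (lon2 - lon1) * kRad;
    double a = std::sin(dlat / 2) * std::sin(dlat / 2) +
               std::cos(lat1 * kRad) * std::cos(lat2 * kRad) *
               std::sin(dlon / 2) * std::sin(dlon / 2);
    return 2 * kR * std::asin(std::sqrt(a));
}

// Cluster a drifting coordinate to the closest POI in the profile system.
const Poi& nearestPoi(const std::vector<Poi>& pois, double lat, double lon) {
    size_t best = 0;
    double bestDist = haversine(lat, lon, pois[0].lat, pois[0].lon);
    for (size_t i = 1; i < pois.size(); ++i) {
        double d = haversine(lat, lon, pois[i].lat, pois[i].lon);
        if (d < bestDist) { bestDist = d; best = i; }
    }
    return pois[best];
}

int main() {
    std::vector<Poi> pois = {{"4b5f0", 40.7580, -73.9855}, {"4c1a2", 40.7484, -73.9857}};
    const Poi& p = nearestPoi(pois, 40.7579, -73.9850);  // slightly drifted check-in
    std::printf("mapped to POI %s\n", p.id.c_str());
    return 0;
}
```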

2.3. Problem Formulation

In an LBS, a user $u$ submits a (longitude, latitude) pair to the location server when he requests a service. Suppose $u$'s real location is $l$; without location privacy protection, this information can be used by the adversary for inference attacks. In our solution, for the local search service, a pseudolocation $l'$ is selected from a specific location set, which is composed of all the historical locations produced by $u$ and $u$'s friends. From this location set, the $k$-nearest POIs are sorted out, each with an obfuscation probability $f(l' \mid l)$. The pseudolocation $l'$ is then chosen from these POIs according to the probability distribution $f$. Here, $k$ is called the protection level.
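As an illustration of this submission step, the sketch below samples a pseudolocation from the $k$-nearest candidate POIs according to an obfuscation distribution $f(l' \mid l)$; the candidate IDs and probabilities are placeholders, and the real framework computes $f$ with the models described in Section 3.

```cpp
// Minimal sketch of the obfuscation step: the user's real location is replaced
// by one of its k-nearest candidate POIs, drawn according to f(l' | l).
#include <cstdio>
#include <random>
#include <string>
#include <vector>

std::string samplePseudoLocation(const std::vector<std::string>& kNearest,
                                 const std::vector<double>& f,  // f(l' | l), sums to 1
                                 std::mt19937& rng) {
    std::discrete_distribution<size_t> pick(f.begin(), f.end());
    return kNearest[pick(rng)];
}

int main() {
    std::mt19937 rng(42);
    // k = 5 candidate POIs around the real location (hypothetical IDs).
    std::vector<std::string> kNearest = {"poiA", "poiB", "poiC", "poiD", "poiE"};
    std::vector<double> f = {0.40, 0.25, 0.15, 0.10, 0.10};  // obfuscation probabilities
    std::printf("submitted pseudolocation: %s\n",
                samplePseudoLocation(kNearest, f, rng).c_str());
    return 0;
}
```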

From the adversary's point of view, when the location servers are compromised, the adversary obtains three parts of user $u$'s information: (i) $u$'s mixed historical trajectory $T_u$ from the two platforms; (ii) $u$'s friend list $F_u$ in Swarm; and (iii) the mixed historical trajectory $T_v$ of any friend $v$ from the two platforms. Meanwhile, the adversary can guess the protection level $k$ if $u$'s home address has been exposed in his profile, so we suppose $k$ is known to the adversary. As a result, the adversary's background knowledge about $u$ is $K = \{T_u, F_u, \{T_v\}_{v \in F_u}, k\}$.

Armed with the above knowledge $K$, when obtaining $u$'s current pseudolocation $l'$, the adversary guesses $u$'s true location according to the candidates' probability distribution $\Pr_K(l \mid l')$. Here, all possible inference attacks are established to compute $\Pr_K(l \mid l')$, where the candidates are the $k$-nearest positions that can be obfuscated to $l'$.

Therefore, the AL-LPL problem can be defined as follows: given (i) $u$'s setting of the lower bound of location utility after obfuscation (a real number between 0 and 1, where a larger value represents better location utility), which corresponds to a maximum tolerable loss of location utility $Q_{loss}^{max}$; (ii) $u$'s current location $l$; (iii) the adversary's background knowledge $K$ about $u$; and (iv) the adversary's possible attack set $A$, how do we obtain the optimal obfuscation distribution $f^*$ under the maximum loss of location utility $Q_{loss}^{max}$?

3. Solution

In this section, our proposed solution for AL-LPL is presented by analyzing potential inference attacks, establishing a specific obfuscation model for each attack, and using the improved game-based framework to defend against continuous dynamic attacks.

3.1. Potential Attacks Analysis

Here, the analysis of potential inference attacks consists of two steps: we enumerate the commonly used guessing methods based on all the meaningful combinations of background knowledge, and we adopt two tracking methods, namely, distribution tracking and maximum likelihood tracking, to determine the final probability distribution of the guessed results.

(1) First, the ways to compute the initial candidates' probability distribution are as follows.

(1.1) Basic Guessing (E/B). Basic Guessing refers to guessing without any background knowledge. With the user's pseudolocation $l'$ given, the candidates' probability is represented as $\Pr(l \mid l')$. Here, two common conditional probability distributions are adopted for their wide usage, namely, the even distribution and the Bayesian distribution. The guessings following these distributions are denoted by E and B, and their candidates' probabilities are defined as follows:

$\Pr_E(l \mid l') = \dfrac{1}{n}, \qquad \Pr_B(l \mid l') = \dfrac{\pi_u(l)\Pr(l' \mid l)}{\sum_{j=1}^{n}\pi_u(l_j)\Pr(l' \mid l_j)},$  (1)

where $n$ is the number of candidates. $\pi_u(l)$ and $\pi_u(l_j)$ represent the probability of $u$'s appearance at locations $l$ and $l_j$, respectively; they are based on the statistics of $u$'s historical trajectory. $\Pr(l' \mid l)$ is the probability of submitting $l'$ when $l$ is the real location. Since the real value of $\Pr(l' \mid l)$ is difficult for the adversary to infer, the obfuscation is assumed to follow an even distribution; with the protection level $k$ known, $\Pr(l' \mid l)$ equals $1/k$.
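A minimal sketch of these two Basic Guessings is given below, assuming the appearance probabilities $\pi_u$ have already been estimated from the user's historical trajectory; the numbers are illustrative, and with an even obfuscation likelihood the Bayesian guess reduces to the normalized prior.

```cpp
// Sketch of the two Basic Guessings over a candidate set: E assigns equal
// probability to every candidate; B weights each candidate by pi_u(l) and the
// assumed even obfuscation likelihood 1/k, then normalizes.
#include <cstdio>
#include <vector>

std::vector<double> evenGuess(size_t n) {
    return std::vector<double>(n, 1.0 / n);
}

std::vector<double> bayesGuess(const std::vector<double>& appearanceProb,  // pi_u(l_j)
                               double obfLikelihood /* Pr(l'|l) = 1/k */) {
    std::vector<double> post(appearanceProb.size());
    double norm = 0.0;
    for (size_t j = 0; j < appearanceProb.size(); ++j) {
        post[j] = appearanceProb[j] * obfLikelihood;  // prior * likelihood
        norm += post[j];
    }
    for (double& p : post) p = (norm > 0.0) ? p / norm : 1.0 / post.size();
    return post;
}

int main() {
    // pi_u over 4 candidate POIs, estimated from the user's historical trajectory.
    std::vector<double> pi = {0.50, 0.20, 0.20, 0.10};
    int k = 5;  // protection level
    std::vector<double> e = evenGuess(pi.size());
    std::vector<double> b = bayesGuess(pi, 1.0 / k);
    for (size_t j = 0; j < pi.size(); ++j)
        std::printf("candidate %zu: E=%.3f  B=%.3f\n", j, e[j], b[j]);
    return 0;
}
```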

(1.2) Friendship-Based Guessing (F). Friendship-Based Guessing is based on the similarity between $u$ and $u$'s friends $v$. According to previous research carried out by Cho et al. [23], there is a positive correlation between social relationships and human movement: a friend with higher similarity tends to visit more common places with $u$. Suppose the similarity score used in the guessing is $s_{u,v}$; then the candidates' probability distribution is calculated as follows:

$\Pr_F(l \mid l') = \dfrac{\sum_{v \in F_u} s_{u,v}\Pr\big(T_v \cap \Phi(l) \neq \emptyset\big)}{\sum_{j=1}^{n}\sum_{v \in F_u} s_{u,v}\Pr\big(T_v \cap \Phi(l_j) \neq \emptyset\big)},$  (2)

where $\Phi(l)$ refers to the set of all the possible obfuscations when the user locates at $l$. Here, $s_{u,v}$ is used as a weight that multiplies the probability that friend $v$ has ever visited at least one position in the obfuscation set $\Phi(l)$. As a result, all friends of $u$ are taken into account to decide every candidate's probability. Additionally, $s_{u,v}$ is the similarity score between $u$ and $v$, which is calculated from two attributes, namely, the common friend ratio and the trajectory similarity.

First, the similarity of the users' friend lists is calculated with the Jaccard coefficient [24]; that is, for $u$ and his friend $v$,

$s_f(u,v) = \dfrac{|F_u \cap F_v|}{|F_u \cup F_v|},$  (3)

where $F_u$ and $F_v$ are the friend lists of $u$ and $v$, respectively.

Second, the trajectory similarity between $u$ and his friend $v$ is also measured by the Jaccard coefficient, where $s_t(u,v)$ is defined as follows:

$s_t(u,v) = \dfrac{|T_u \cap T_v|}{|T_u \cup T_v|},$  (4)

where $T_u$ and $T_v$ are the sets of historical locations generated by $u$ and $v$ on both Foursquare and Swarm in time order, respectively.

With $s_f(u,v)$ and $s_t(u,v)$ known, the similarity score between $u$ and his friend $v$ is defined as follows:

$s_{u,v} = \alpha \cdot \tilde{s}_f(u,v) + \beta \cdot \tilde{s}_t(u,v),$  (5)

where $\alpha$ and $\beta$ are weight coefficients whose optimal values could, in principle, be learned from data; however, this would make the model too complicated. To focus on the AL-LPL problem itself, in this paper we assume the common friend ratio and the trajectory similarity are equally important and assign them the same weight (i.e., $\alpha = \beta$) for simplicity. Besides, normalization (denoted by the tilde) is used to calculate a relative similarity score, so as to eliminate the effect of the different value ranges caused by the different attribute information.
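The sketch below computes the two Jaccard coefficients and an equally weighted similarity score for one friend pair; the weights of 0.5 are an illustrative choice consistent with the equal-importance assumption above, and the cross-friend normalization step is omitted for brevity.

```cpp
// Sketch of the friend similarity score s_{u,v}: Jaccard over friend lists,
// Jaccard over visited-location sets, then an equally weighted combination.
#include <algorithm>
#include <cstdio>
#include <iterator>
#include <set>
#include <string>

double jaccard(const std::set<std::string>& a, const std::set<std::string>& b) {
    if (a.empty() && b.empty()) return 0.0;
    std::set<std::string> inter;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(inter, inter.begin()));
    size_t uni = a.size() + b.size() - inter.size();
    return static_cast<double>(inter.size()) / static_cast<double>(uni);
}

double similarityScore(const std::set<std::string>& friendsU,
                       const std::set<std::string>& friendsV,
                       const std::set<std::string>& locationsU,
                       const std::set<std::string>& locationsV,
                       double alpha = 0.5, double beta = 0.5) {
    double sFriend = jaccard(friendsU, friendsV);      // common friend ratio
    double sTraj   = jaccard(locationsU, locationsV);  // trajectory similarity
    return alpha * sFriend + beta * sTraj;             // equal weights as an illustration
}

int main() {
    std::set<std::string> fu = {"a", "b", "c"}, fv = {"b", "c", "d"};
    std::set<std::string> tu = {"poi1", "poi2"}, tv = {"poi2", "poi3"};
    std::printf("s(u,v) = %.3f\n", similarityScore(fu, fv, tu, tv));
    return 0;
}
```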

(1.3) Mobility-Pattern-Based Guessing (M). Mobility-Pattern-Based Guessing is based on the periodic mobility pattern of $u$. Previous studies [23] show that a user tends to visit the same place at a regular time. Here, the POIs on $u$'s trajectory are extracted and listed by their timestamps and are classified into seven time windows, which represent the seven days of a week, from Sunday to Saturday. As a consequence, the corresponding guessing process is represented as

$\Pr_M(l \mid l') = \dfrac{\Pr\big(T_u^{w(l')} \cap \Phi(l) \neq \emptyset\big)}{\sum_{j=1}^{n}\Pr\big(T_u^{w(l')} \cap \Phi(l_j) \neq \emptyset\big)},$  (6)

where $w(l')$ is the time window of the input location $l'$ and $T_u^{w(l')}$ denotes the locations submitted by $u$ in the past within that window; the other definitions are as in expression (2). Therefore, a candidate's probability is proportional to the probability that $u$ has ever paid a visit to at least one location in the obfuscation set $\Phi(l)$ and, meanwhile, that the visit belongs to the same time window as the current pseudolocation $l'$.
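The following sketch shows one way to bucket historical visits into the seven weekday windows and score a candidate by the user's window-matching visits to its obfuscation set; the timestamps, POI IDs, and scoring details are illustrative assumptions.

```cpp
// Sketch of the seven weekday time windows used by Mobility-Pattern-Based
// Guessing: each historical visit is bucketed by its day of week, and a
// candidate is scored by the user's same-window visits to its obfuscation set.
#include <cstdio>
#include <ctime>
#include <set>
#include <string>
#include <vector>

// Map a UNIX timestamp to one of the seven windows (0 = Sunday ... 6 = Saturday).
int timeWindow(std::time_t ts) {
    std::tm* t = std::gmtime(&ts);
    return t->tm_wday;
}

struct Visit { std::string poi; std::time_t ts; };

// Fraction of the window-matching visits that fall inside the candidate's
// obfuscation set Phi(l); used as the unnormalized score of candidate l.
double windowVisitScore(const std::vector<Visit>& history, int window,
                        const std::set<std::string>& phiOfCandidate) {
    int inWindow = 0, hits = 0;
    for (const Visit& v : history) {
        if (timeWindow(v.ts) != window) continue;
        ++inWindow;
        if (phiOfCandidate.count(v.poi)) ++hits;
    }
    return inWindow == 0 ? 0.0 : static_cast<double>(hits) / inWindow;
}

int main() {
    std::vector<Visit> history = {{"poi1", 1508318400}, {"poi2", 1508923200}};
    std::set<std::string> phi = {"poi1", "poi9"};
    int w = timeWindow(1509528000);  // window of the current pseudolocation
    std::printf("score = %.3f (window %d)\n", windowVisitScore(history, w, phi), w);
    return 0;
}
```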

From all the possible combinations of the above guessings, we form four kinds of representative attacks in this paper: (i) Basic Attacks, which only adopt Basic Guessing E/B; (ii) Friendship-Based Attacks, which integrate Basic Guessing E/B with Friendship-Based Guessing F; (iii) Mobility-Pattern-Based Attacks, which integrate Basic Guessing E/B with Mobility-Pattern-Based Guessing M; and (iv) Comprehensive Attacks, which integrate Basic Guessing E/B with both Friendship-Based Guessing F and Mobility-Pattern-Based Guessing M. Here, a linear combination is used for this integration due to its simplicity and wide usage; that is,

$\Pr_{BC}(l \mid l') = \dfrac{\omega_B \Pr_B(l \mid l') + \omega_F \Pr_F(l \mid l') + \omega_M \Pr_M(l \mid l')}{\sum_{j=1}^{n}\big[\omega_B \Pr_B(l_j \mid l') + \omega_F \Pr_F(l_j \mid l') + \omega_M \Pr_M(l_j \mid l')\big]},$  (7)

where $\omega_B$, $\omega_F$, and $\omega_M$ are the weight coefficients of the respective guessings and are learned through experiments. $\Pr_{BC}(l \mid l')$ is the candidates' probability distribution of the Comprehensive Attack based on guessings B, F, and M. The denominator is used for normalization.
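A small sketch of this linear combination is given below; the mixing weights are placeholders rather than the experimentally learned values.

```cpp
// Sketch of the Comprehensive Attack's linear combination: the candidate
// distributions of guessings B, F, and M are mixed and renormalized.
#include <cstdio>
#include <vector>

std::vector<double> combine(const std::vector<double>& pB,
                            const std::vector<double>& pF,
                            const std::vector<double>& pM,
                            double wB, double wF, double wM) {
    std::vector<double> out(pB.size());
    double norm = 0.0;
    for (size_t j = 0; j < pB.size(); ++j) {
        out[j] = wB * pB[j] + wF * pF[j] + wM * pM[j];
        norm += out[j];
    }
    for (double& p : out) p = (norm > 0.0) ? p / norm : 0.0;  // normalization
    return out;
}

int main() {
    std::vector<double> pB = {0.5, 0.3, 0.2}, pF = {0.2, 0.6, 0.2}, pM = {0.1, 0.1, 0.8};
    std::vector<double> pc = combine(pB, pF, pM, 0.4, 0.3, 0.3);  // hypothetical weights
    for (size_t j = 0; j < pc.size(); ++j) std::printf("candidate %zu: %.3f\n", j, pc[j]);
    return 0;
}
```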

(2) Moreover, tracking methods are introduced as follows.

As for the tracking methods, some adversaries believe that only the locations with the maximum likelihood qualify as candidates, so they remove positions with lower likelihood when inferring $u$'s real location. Others regard all points around the pseudolocation as candidates and select the guessed location by following the candidates' probability distribution $\Pr(l \mid l')$. Figure 3 provides an example of these two tracking methods.

Suppose distribution tracking is denoted by T1 and maximum likelihood tracking is represented by T2; all possible attacks are listed as follows: ET1, ET2, EFT1, EFT2, EMT1, EMT2, ECT1, ECT2, BT1, BT2, BFT1, BFT2, BMT1, BMT2, BCT1, BCT2.
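The difference between the two tracking methods can be sketched as follows: T1 draws the guess from the candidates' distribution, while T2 returns the maximum likelihood candidate; the posterior values are illustrative.

```cpp
// Sketch of the two tracking methods applied to a candidate distribution.
#include <algorithm>
#include <cstdio>
#include <iterator>
#include <random>
#include <vector>

size_t trackT1(const std::vector<double>& posterior, std::mt19937& rng) {
    std::discrete_distribution<size_t> d(posterior.begin(), posterior.end());
    return d(rng);  // guess drawn according to the candidates' probabilities
}

size_t trackT2(const std::vector<double>& posterior) {
    return std::distance(posterior.begin(),
                         std::max_element(posterior.begin(), posterior.end()));
}

int main() {
    std::vector<double> posterior = {0.10, 0.55, 0.20, 0.15};
    std::mt19937 rng(7);
    std::printf("T1 guess: candidate %zu\n", trackT1(posterior, rng));
    std::printf("T2 guess: candidate %zu\n", trackT2(posterior));
    return 0;
}
```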

3.2. Obfuscation Model

As mentioned before, our work aims to obtain the optimal obfuscation distribution $f^*$ under the maximum loss of location utility $Q_{loss}^{max}$. Here, two targets are pursued in our obfuscation, following Reza Shokri et al.'s work in [25]. In that paper, three criteria are put forward to measure an attack's effectiveness, namely, Accuracy, Certainty, and Correctness. Accuracy is defined as the hit probability of the real position $l$, Certainty is the entropy of the candidates' probability distribution, and Correctness is the expected distance between the guessed location and the true location. As analyzed in [25], Certainty is less important than the other two factors in determining whether an attack is effective. Hence, Accuracy and Correctness are selected as the optimization targets in our protection solution.

As the opponent of the adversary, our protection solution needs to decrease the hit probability $Acc$ and increase the expected distance $Cor$ between the guessed location and the true location as much as possible. The definitions of $Acc$ and $Cor$ are as follows:

$Acc(l) = \sum_{l'} f(l' \mid l)\Pr_K(l \mid l'),$  (8)

$Cor(l) = \sum_{l'} f(l' \mid l)\sum_{\hat{l}} \Pr_K(\hat{l} \mid l')\, d(l, \hat{l}),$  (9)

where $f(l' \mid l)$ is the obfuscation probability of $l'$ given that $l$ is the real location, $\Pr_K(l \mid l')$ is the hit probability based on knowledge $K$ when $l'$ is the pseudolocation, $\Pr_K(\hat{l} \mid l')$ is the candidates' probability based on knowledge $K$ when $l'$ is the pseudolocation, and $d(l, \hat{l})$ is the distance between the real position and the guessed result. Here, all the possible pseudolocations $l'$ with their corresponding guessed results are accumulated for $Cor$, and the Euclidean distance is used to calculate $d(l, \hat{l})$.

Besides, the loss of location utility, limited by the maximum value $Q_{loss}^{max}$, is defined as follows:

$Q_{loss}(l) = \sum_{l'} f(l' \mid l)\, d(l, l') \le Q_{loss}^{max}.$  (10)
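For one real location, the three quantities above can be evaluated as in the sketch below, which assumes a single shared candidate set for all pseudolocations (a simplification) and uses made-up posterior and distance values.

```cpp
// Sketch of the three quantities the obfuscation model reasons about for one
// real location l: Acc (expected hit probability), Cor (expected distance
// between guess and truth), and Qloss (expected real-to-pseudo distance).
#include <cstdio>
#include <vector>

struct ObfProblem {
    std::vector<double> f;                       // f(l'_i | l), one entry per pseudolocation
    std::vector<std::vector<double>> posterior;  // posterior[i][j] = Pr_K(l_j | l'_i)
    std::vector<double> distToCandidate;         // d(l, l_j) for each candidate l_j
    std::vector<double> distToPseudo;            // d(l, l'_i) for each pseudolocation l'_i
    int realIndex;                               // index of the true location l among candidates
};

double accuracy(const ObfProblem& p) {
    double acc = 0.0;
    for (size_t i = 0; i < p.f.size(); ++i)
        acc += p.f[i] * p.posterior[i][p.realIndex];  // hit probability
    return acc;
}

double correctness(const ObfProblem& p) {
    double cor = 0.0;
    for (size_t i = 0; i < p.f.size(); ++i)
        for (size_t j = 0; j < p.posterior[i].size(); ++j)
            cor += p.f[i] * p.posterior[i][j] * p.distToCandidate[j];  // expected guess error
    return cor;
}

double utilityLoss(const ObfProblem& p) {
    double q = 0.0;
    for (size_t i = 0; i < p.f.size(); ++i)
        q += p.f[i] * p.distToPseudo[i];  // expected real-to-pseudo distance
    return q;
}

int main() {
    ObfProblem p;
    p.f = {0.6, 0.4};
    p.posterior = {{0.7, 0.3}, {0.4, 0.6}};
    p.distToCandidate = {0.0, 1.2};
    p.distToPseudo = {0.3, 0.9};
    p.realIndex = 0;
    std::printf("Acc=%.3f Cor=%.3f Qloss=%.3f\n",
                accuracy(p), correctness(p), utilityLoss(p));
    return 0;
}
```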

Based on the restrictions mentioned above, namely, $Acc$, $Cor$, and $Q_{loss}$, we propose an obfuscation model for the situation where the attack is identified.

Measured by the hit probability $Acc$ and the distance expectation $Cor$, the score of our obfuscation model can be represented as

$S(f) = w_A \cdot Acc(l) - w_C \cdot Cor(l),$  (11)

where $w_A$ and $w_C$ are the weights of $Acc$ and $Cor$, respectively, with $w_A, w_C \ge 0$ and $w_A + w_C = 1$. Here, we assume that Accuracy is the most important index and assign $w_A$ the larger value. Certainly, it is convenient for users to modify these factors when they expect higher performance in $Cor$ and do not care so much about the performance in $Acc$. Consequently, the optimal obfuscation distribution that minimizes the score function is

$f^* = \arg\min_{f} S(f) \quad \text{s.t.} \quad Q_{loss}(l) \le Q_{loss}^{max}, \quad \sum_{l'} f(l' \mid l) = 1, \quad f(l' \mid l) \ge 0.$  (12)

To solve this objective function, the set of $k$-nearest locations is initialized for every POI. Next, the three targets $Acc$, $Cor$, and $Q_{loss}$ are initialized according to (8), (9), and (10), respectively. After that, the simplex algorithm [26] is used with the objective function and the constraints as input. The time complexity of this algorithm depends on the number of POIs in the dataset and the protection level $k$.
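The paper solves this linear program with the simplex algorithm; purely as an illustration of the optimization, the sketch below minimizes a linear leakage objective over the small polytope $\{f \ge 0, \sum_i f_i = 1, \sum_i d_i f_i \le Q^{max}\}$ by enumerating its candidate vertices, which is sufficient because a linear objective attains its minimum at a vertex. The coefficients stand in for the per-pseudolocation leakage and distance costs and are invented.

```cpp
// Rough illustration of the per-attack obfuscation optimization: minimize a
// linear leakage objective c.f over distributions f that sum to 1 and keep the
// expected real-to-pseudo distance d.f within qMax, by enumerating candidate
// vertices of the feasible polytope (the paper uses the simplex algorithm).
#include <cstdio>
#include <limits>
#include <vector>

struct LpResult { std::vector<double> f; double value; bool feasible; };

LpResult minimizeLeakage(const std::vector<double>& c,  // leakage coefficient per pseudolocation
                         const std::vector<double>& d,  // d(l, l'_i)
                         double qMax) {
    const size_t k = c.size();
    LpResult best{{}, std::numeric_limits<double>::infinity(), false};
    auto consider = [&](const std::vector<double>& f) {
        double val = 0.0, q = 0.0;
        for (size_t i = 0; i < k; ++i) { val += c[i] * f[i]; q += d[i] * f[i]; }
        if (q <= qMax + 1e-12 && val < best.value) best = {f, val, true};
    };
    // Vertices with a single nonzero coordinate.
    for (size_t i = 0; i < k; ++i) {
        std::vector<double> f(k, 0.0);
        f[i] = 1.0;
        consider(f);
    }
    // Vertices with two nonzero coordinates lying on the utility constraint.
    for (size_t i = 0; i < k; ++i)
        for (size_t j = i + 1; j < k; ++j) {
            if (d[i] == d[j]) continue;
            double fi = (qMax - d[j]) / (d[i] - d[j]);
            if (fi < 0.0 || fi > 1.0) continue;
            std::vector<double> f(k, 0.0);
            f[i] = fi; f[j] = 1.0 - fi;
            consider(f);
        }
    return best;
}

int main() {
    // Hypothetical numbers for k = 4 candidate pseudolocations.
    std::vector<double> c = {0.52, 0.31, 0.18, 0.12};  // leakage if l'_i is submitted
    std::vector<double> d = {0.05, 0.30, 0.60, 0.95};  // distance cost of submitting l'_i
    LpResult r = minimizeLeakage(c, d, /*qMax=*/0.40);
    if (!r.feasible) { std::printf("no feasible obfuscation\n"); return 1; }
    for (size_t i = 0; i < r.f.size(); ++i) std::printf("f[%zu]=%.3f ", i, r.f[i]);
    std::printf("-> leakage %.3f\n", r.value);
    return 0;
}
```

In this toy example the optimum splits probability between two pseudolocations, which is exactly the kind of mixed obfuscation the model is designed to produce.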

As a consequence, 16 protection solutions are formed according to specific attacks proposed before. They are ET1P, ET2P, EFT1P, EFT2P, EMT1P, EMT2P, ECT1P, ECT2P, BT1P, BT2P, BFT1P, BFT2P, BMT1P, BMT2P, BCT1P, and BCT2P.

3.3. Improved Game-Based Framework

Based on the obfuscation solutions against the listed inference attacks, a game-based framework is adopted to defend against continuous and dynamic attacks. The improved game-based framework obfuscates every location in the search-based LBS; it consists of the offline model, which combines the obfuscation solutions generated by our proposed obfuscation model, and the online updating, which is triggered whenever a new location is produced in any LBS.

(1) Offline Model. For dynamic inference attacks, a mapping is established from the attack-and-defense process to a noncooperative game, with the main elements of a game, namely, the players, the strategies, and the profits. In our work, the players are the adversary and the user. The adversary's strategy space $A$ is a subset of the possible inference attack set, containing the relatively efficient attacks selected from it, while the user's strategy space $P$ is composed of the corresponding protections that defend against the attacks in $A$. For the user, the profit is measured by how effectively his locations in local search are protected, and the corresponding target function has already been analyzed in formula (11). For the adversary, the obtained profit can be regarded as the profit loss of the user, so the sum of the two profits is zero. Therefore, the profits of both sides are defined as

$U_{adv}(p, a) = w_A \cdot Acc_{p,a} - w_C \cdot Cor_{p,a}, \qquad U_{user}(p, a) = -U_{adv}(p, a),$  (13)

where $Acc_{p,a}$ and $Cor_{p,a}$ are the Accuracy and Correctness achieved when protection $p \in P$ faces attack $a \in A$, and, as before, we assume that Accuracy is the most important index and assign $w_A$ the larger value. Certainly, it is convenient for users to modify these factors when they expect higher performance in $Cor$ and do not care so much about the performance in $Acc$.

To defend against dynamic attacks, the specific obfuscation solutions generated by our proposed obfuscation model are mixed through the minimax theorem [27] as

$f^* = \sum_{p \in P} \omega_p f_p, \qquad \{\omega_p\} = \arg\min_{\{\omega_p\}} \max_{a \in A} \sum_{p \in P} \omega_p U_{adv}(p, a), \quad \sum_{p \in P}\omega_p = 1,\ \omega_p \ge 0,$  (14)

where $f_p$ is an obfuscation solution based on the obfuscation model, belonging to the protection strategy space $P$, and $\omega_p$ is its weight, which is learned through experiments. After several rounds, an equilibrium is achieved, from which we obtain the hybrid protection solution, as described below.

To produce a proper pseudolocation as the user's submission, the set of $k$-nearest locations is initialized for every POI. After that, the set of protection strategies $P$ is established corresponding to the adversary's strategy space $A$. Then, every entry of the user's profit matrix is calculated through formula (13). Furthermore, the zero-sum game is transformed into a dual linear programming problem, and the simplex optimization method is used to obtain the final obfuscation probability $f^*$. At last, a location is selected following the distribution $f^*$ to produce the pseudolocation $l'$. The time complexity of this algorithm depends on the number of POIs in the dataset, the protection level $k$, the size of the attack space $A$, and the size of the protection space $P$.
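The offline model above solves the zero-sum game by converting it into a dual linear program; the sketch below instead approximates the user's minimax mixture over protection strategies with fictitious play, which also converges to the game value for zero-sum games and is easier to show compactly. The leakage matrix is invented, with rows as protection strategies and columns as attacks.

```cpp
// Sketch of mixing the per-attack obfuscation solutions into one hybrid
// protection via fictitious play (an approximation of the minimax mixture;
// the paper uses a dual LP solved with simplex). leakage[p][a] is made up.
#include <cstdio>
#include <vector>

std::vector<double> minimaxMixture(const std::vector<std::vector<double>>& leakage,
                                   int rounds = 20000) {
    const size_t P = leakage.size(), A = leakage[0].size();
    std::vector<int> userCount(P, 0), advCount(A, 0);
    size_t userPlay = 0, advPlay = 0;
    for (int t = 0; t < rounds; ++t) {
        userCount[userPlay]++; advCount[advPlay]++;
        // User best-responds to the adversary's empirical attack frequencies.
        double bestU = 1e100;
        for (size_t p = 0; p < P; ++p) {
            double e = 0.0;
            for (size_t a = 0; a < A; ++a) e += advCount[a] * leakage[p][a];
            if (e < bestU) { bestU = e; userPlay = p; }
        }
        // Adversary best-responds to the user's empirical protection frequencies.
        double bestA = -1e100;
        for (size_t a = 0; a < A; ++a) {
            double e = 0.0;
            for (size_t p = 0; p < P; ++p) e += userCount[p] * leakage[p][a];
            if (e > bestA) { bestA = e; advPlay = a; }
        }
    }
    std::vector<double> mix(P);
    for (size_t p = 0; p < P; ++p) mix[p] = static_cast<double>(userCount[p]) / rounds;
    return mix;
}

int main() {
    // Rows: protection strategies; columns: attacks; entries: privacy leakage.
    std::vector<std::vector<double>> leakage = {
        {0.20, 0.45, 0.35},
        {0.40, 0.15, 0.30},
        {0.30, 0.30, 0.20}};
    std::vector<double> mix = minimaxMixture(leakage);
    for (size_t p = 0; p < mix.size(); ++p)
        std::printf("weight of protection %zu: %.3f\n", p, mix[p]);
    return 0;
}
```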

From the above, it is not immediately clear whether the loss of location utility is still lower than $u$'s setting $Q_{loss}^{max}$ under this game-based protection. The derivation below analyzes this issue and proves that the constraint is still satisfied: since every component obfuscation $f_p$ satisfies $Q_{loss}(f_p) \le Q_{loss}^{max}$, the hybrid solution is a convex combination of the $f_p$, and $Q_{loss}$ is linear in the obfuscation distribution, we have

$Q_{loss}\Big(\sum_{p \in P} \omega_p f_p\Big) = \sum_{p \in P} \omega_p Q_{loss}(f_p) \le \sum_{p \in P} \omega_p Q_{loss}^{max} = Q_{loss}^{max}.$

(2) Online Updating Strategy. Every time a location is submitted to a share-based LBS on either platform, the online updating is carried out for our game-based solution, and the optimal obfuscation distribution $f^*$ is recalculated. On the other hand, if a location is proposed for local search, our protection first works out its fuzzy location following $f^*$ and begins updating afterwards. Every update is thus a recalculation of $f^*$, which supports consecutive protection.
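A schematic of the online updating loop is sketched below; the event types, the recomputeHybrid placeholder for the offline model, and the obfuscate placeholder for the sampling step are all illustrative assumptions about how the pieces fit together.

```cpp
// Sketch of the online updating strategy: any new check-in or tip triggers a
// recomputation of the hybrid obfuscation distribution f*, and a local search
// first obfuscates the query and then triggers the same update.
#include <cstdio>
#include <string>
#include <vector>

enum class EventType { CheckIn, Tip, LocalSearch };
struct Event { EventType type; std::string poi; };

struct Protector {
    std::vector<std::string> history;  // mixed trajectory from both platforms

    void recomputeHybrid() {           // placeholder for the offline game-based model
        std::printf("recomputing f* over %zu historical locations\n", history.size());
    }
    std::string obfuscate(const std::string& poi) {  // placeholder for sampling from f*
        return poi + "_pseudo";
    }
    void onEvent(const Event& e) {
        std::string submitted = e.poi;
        if (e.type == EventType::LocalSearch) {
            submitted = obfuscate(e.poi);  // fuzzy location following f*
            std::printf("submit %s instead of %s\n", submitted.c_str(), e.poi.c_str());
        }
        history.push_back(submitted);      // every submission extends the trajectory
        recomputeHybrid();                 // consecutive protection: update after each event
    }
};

int main() {
    Protector guard;
    guard.onEvent({EventType::CheckIn, "poi1"});
    guard.onEvent({EventType::LocalSearch, "poi2"});
    guard.onEvent({EventType::Tip, "poi3"});
    return 0;
}
```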

4. Experiments

In this section, the dataset, compared methods, evaluation metrics, and experiment setup are introduced, followed by our experiment results and the corresponding analysis.

4.1. Dataset Description

In our experiments, we use the trajectories produced by New York users in Foursquare and Swarm. A user is considered New York-based if he specifies New York in the location field of his profile.

Here, we adopt the method mentioned in [28] to crawl users' check-ins in Swarm and obtain users' tips in Foursquare through the Foursquare API. Unfortunately, users' submitted locations in local search are unavailable through public data collection, so we use the location list drawn from tips in Foursquare to simulate the locations in local search, due to their tight coupling in time and space. Here, we make two assumptions: (i) for a common user, tips are always published after a local search to share his experience; (ii) not every user shares tips after using the local search service. As a result, the tips used for simulation in our experiment provide relatively sparse data compared with practical local search. Although the locations for local search are comparatively sparse, the attack-and-defense process is based on the same dataset, so the effect of this sparsity cannot be determined directly. Therefore, additional experiments in Section 4.5.3 discuss this problem further and examine the influence of location sparsity on effectiveness testing in the attack-and-defense process.

Here, the data from both Foursquare and Swarm are preprocessed. We filter users through their HomeCity (a tag found in the user profile and included in our initial dataset, which indicates the user's living city), eliminate inactive users with fewer than 5 positions, and obtain the initial research dataset. As a result, the users in this dataset are all New York citizens and their proposed locations are limited to New York; the tips were produced from 2008-10-14T22:53:35Z to 2017-10-18T09:24:30Z and the check-ins from 2009-04-13T06:23:53Z to 2017-10-27T20:19:39Z. As shown in Table 1, every user has nearly 19 friends and has produced about 15 tips and 155 check-ins on average. Furthermore, the mean number of check-ins per location is over 46, while the mean number of tips per location is about 2. Besides, the distribution of the initial dataset is analyzed, including the count of users with varied numbers of check-ins/tips in the corresponding LBSs and the count of locations with different numbers of check-ins/tips in the corresponding LBSs. As shown in Figure 4, all the submitted-location distributions follow a power law and have scale-free properties, which accords with the basic characteristics of LBSNs. Therefore, our research dataset is representative and general.

4.2. Compared Methods

To show the improved effectiveness of multisource attacks, we compare our proposed inference attacks with two other attacks, namely, Maximum Movement Boundary (MMB) [20] and Context-Aware Location Attack (CLA) [21].

MMB is an attack introduced in [20] based on user’s maximum movement boundaries. In this attack, user’s real location can be inferred from the overlapping area of two continuous movement boundaries. In our experiment, the user’s maximum movement speed is set to 6 km/h according to empirical human walking speed.

CLA is an attack based on the association of locations in the context and is proposed in [21]. For any two consecutive pseudolocations $l'_1$ and $l'_2$ of $u$, the corresponding reference attack targeting the AL-LPL problem can be represented as

$\Pr\big(l_2 \mid l'_1, l'_2\big) = \dfrac{\sum_{l_1 \in \Phi(l'_1)} \Pr(l_2 \mid l_1)}{\sum_{j=1}^{n}\sum_{l_1 \in \Phi(l'_1)} \Pr(l_j \mid l_1)},$

where $\Pr(l_2 \mid l_1)$ can be calculated by the sixth formula proposed in [21] and $\Phi(\cdot)$ contains all the possible obfuscations when the user locates at the input position.

To show the performance of our proposed improved game-based protection I-GAMEP, two kinds of protection are adopted for comparison, namely, I-INIP and I-BASICP.

Here, I-INIP is the continuous obfuscation solution that follows the even distribution; its offline solution is denoted by INIP. I-BASICP and I-GAMEP are both built on the initial protection I-INIP. I-BASICP refers to the corresponding obfuscation protections combined with our proposed updating strategy in defense against specific attacks. For comparison, I-GAMEP is the continuous protection solution that defends against the combination of five representative inference attacks (i.e., ET1, ECT1, ECT2, BCT1, and BCT2).

4.3. Evaluation Metrics

Here, Accuracy, Certainty, and Correctness, as proposed in [25], are used to verify the effectiveness of our attack models at a single point. Furthermore, their mean values $\overline{Acc}$, $\overline{Cer}$, and $\overline{Cor}$ are used to measure the performance of our proposed continuous protection and are defined as follows:

$\overline{Acc} = \frac{1}{T}\sum_{t=1}^{T} Acc_t, \qquad \overline{Cer} = \frac{1}{T}\sum_{t=1}^{T} Cer_t, \qquad \overline{Cor} = \frac{1}{T}\sum_{t=1}^{T} Cor_t,$

where $T$ is the number of attack-and-defense rounds in the testing set, $K$ is the adversary's background knowledge about $u$, $l$ is $u$'s real position, $l'$ is the pseudolocation, and $\hat{l}$ is the guessed location. The distance in Correctness is measured by the Euclidean distance.

When it comes to location utility, on the one hand, from the theoretical aspect, we adopt two representative protections as examples to show the tendency of privacy leakage under different values of location utility loss, aiming to verify that the location utility loss has an upper limit for the minimized privacy leakage. On the other hand, from the practical aspect, we use Precision@N to measure the Accuracy of the location list returned by the local search service when submitting the pseudolocation. Here, the list returned by local search before obfuscation is set as the benchmark. The list, as returned by Foursquare local search, contains the requested nearby locations ordered from closest to furthest within a limited distance range. Hence, Precision@N reflects the physical distance between the pseudo- and true locations in a direct way.
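A minimal sketch of the Precision@N computation is given below, assuming the local search results are available as ranked lists of POI IDs; the lists are illustrative.

```cpp
// Sketch of Precision@N for location utility: compare the top-N results
// returned for the pseudolocation against the benchmark list returned for the
// real location, and report the fraction of overlap.
#include <algorithm>
#include <cstdio>
#include <set>
#include <string>
#include <vector>

double precisionAtN(const std::vector<std::string>& benchmark,   // results for real location
                    const std::vector<std::string>& obfuscated,  // results for pseudolocation
                    size_t n) {
    n = std::min({n, benchmark.size(), obfuscated.size()});
    if (n == 0) return 0.0;
    std::set<std::string> truth(benchmark.begin(), benchmark.begin() + n);
    size_t hits = 0;
    for (size_t i = 0; i < n; ++i)
        if (truth.count(obfuscated[i])) ++hits;
    return static_cast<double>(hits) / n;
}

int main() {
    std::vector<std::string> real = {"a", "b", "c", "d", "e"};
    std::vector<std::string> pseudo = {"b", "a", "f", "c", "g"};
    std::printf("Precision@5 = %.2f\n", precisionAtN(real, pseudo, 5));  // 0.60
    return 0;
}
```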

4.4. Experiment Setup

All of our experiments were conducted on a machine running Windows 7 with an Intel® Core™ i5 processor and 8 GB of RAM. Our solutions have been implemented in C++.

In our experiments, to test the effectiveness of the attacks, the initial protection INIP following the even distribution is applied to the locations in local search. For the performance analysis of the protection solution, 400 users are selected to train the weight coefficients of the hybrid attacks, and the optimized combination of weights is selected through these cases. To evaluate the performance of our protection methods, the same method is adopted: a training set of 400 randomly selected users is used to find the optimized weight combination over the users' strategy space $P$. Then, another 100 users are randomly selected to compose the testing set to evaluate the performance of the baseline and of our proposed protection methods.

For the parameters introduced in this paper, we set the protection level $k$ and the maximum loss of location utility $Q_{loss}^{max}$. Unless otherwise specified, the default values of $k$ and $Q_{loss}^{max}$ are used.

4.5. Experiment Results

The experiment results for the AL-LPL problem are presented in Tables 2 and 3 and Figures 5–7. In this part, we test the effectiveness of our proposed inference attacks, demonstrate the advantage of our game-based protection through comparison, and measure the loss of location utility.

4.5.1. Attacks’ Effectiveness Analysis

To test the performance of our proposed 7 representative attack methods, INIP is carried out on the local search part of the dataset. Here, we fix $Q_{loss}^{max}$ and vary the protection level $k$ from 3 to 15 in steps of 2. From the results shown in Table 2, we can observe the following. (i) In most cases, BCT2 performs the best among all the compared methods in inferring the user's future location as evaluated by Accuracy and Certainty; if not the best, its performance is near the best. This demonstrates that the fusion of multisource information can deeply aggravate the leakage of location privacy. By comparison, for Correctness, the performance shows no obvious regularity. (ii) In most cases, EMT1 achieves better performance in Accuracy and Correctness than EFT1, while EFT1 performs better in Certainty, which suggests that both close friends' trajectories and the user's own historical trajectory reveal location privacy in different aspects and also confirms the supplemental role of linked platforms in location disclosure. (iii) By comparing ECT2/BCT2 with ECT1/BCT1 correspondingly, we observe that tracking method T2 is more effective than T1 in most cases when measured by Accuracy, Certainty, and Correctness. (iv) With the increase of the protection level, most attacks' Accuracy falls steadily while their Certainty and Correctness show a rising trend, which agrees with the correlation analysis in [25].

Additional experiments are performed to compare the effectiveness of the different attacks, controlled by the variables $k$ and $Q_{loss}^{max}$. In Figure 5, a comparison is made between our proposed inference attacks (i.e., ECT1 and BCT1), MMB, and CLA under the same tracking method T1. It is shown that BCT1 performs best in all cases, which reflects the general advantage of our proposed attacks for location inference in linked platforms. Furthermore, the attack effectiveness of MMB is undesirable because of the sparsity of locations in online social networks. CLA performs poorly because it only considers the correlation between locations and neglects friendship and the periodicity of user behavior. As a result, attacks that consider these factors perform better.

4.5.2. Protection Effectiveness Analysis

In this experiment, $k$ and $Q_{loss}^{max}$ are both set to their default values. Table 3 provides the performance comparison between the different protection methods; I-BCT1P and I-BCT2P are chosen as typical instances of I-BASICP. We observe that our proposed protection methods both perform better than the baseline protection I-INIP in defensive Accuracy and Correctness, while I-INIP performs better in Certainty. Compared with the typical I-BASICP, I-GAMEP shows a general defense against varied, changing attacks, so that it can always avoid the worst case. Hence, I-GAMEP is shown to have an advantage at the semantic level in preventing the adversary's accurate guessing at the POI level.

4.5.3. Location Utility Analysis

Here, from the view of our model, to verify that the location utility loss has a limit for the minimized privacy leakage, the relation between location utility and privacy leakage is analyzed for each attack and defense through an experiment where the value of $k$ is fixed as before. We use the latest attack and defense as the example and measure the privacy leakage through formula (13). As shown in Figure 6, we specify BASICP to be ECT1P and compare it with GAMEP. To find the relation between location utility and privacy leakage under the ECT1 attack for these two protections, we compute the privacy leakage under different $Q_{loss}^{max}$ (from 0.05 to 0.19 in steps of 0.01) and describe the tendency by connecting the discrete points. The corresponding conclusions are as follows. First, the leaked privacy decreases with the increase of $Q_{loss}^{max}$ and converges to a stable value; this stable value corresponds to the limit of the location utility loss beyond which the privacy leakage remains at its minimum. This is because the protection methods cannot select a pseudolocation at an infinite distance from the real position: according to our mechanism, the selected candidates are limited to the locations of the user's and his friends' daily visits. Second, for the same $Q_{loss}^{max}$, the privacy leaked under ECT1P is always smaller than under GAMEP. We suggest that this is because GAMEP also involves the defense against other localization attacks, so its performance is inferior to ECT1P in defending against the ECT1 attack specifically.

Moreover, for the specific analysis of the local search results, we compare different protection solutions, namely, BCT2P as a typical BASICP and GAMEP, again using the latest attack and defense as the example. Since each local search in Foursquare returns 30 locations at first, we use 30 as the upper bound in our precision computation. In addition, one topic, namely, "Shop & Service", is selected to provide real-world scenarios for our verification; this topic covers a relatively wide range, including ATMs, banks, and markets. With $k$ and $Q_{loss}^{max}$ at their default values and based on the three criteria, the results shown in Figure 7 suggest the following: both BASICP and GAMEP can support the local search service, since the search results partly overlap with the results produced by the user's real position; additionally, the precision of GAMEP is better than that of BASICP in most cases.

5. Related Work

Previous studies on location privacy protection have mainly paid attention to the leakage problems occurring in search-based LBSs through mobile phone data [29], and the solutions are provided from the views of LBS providers, system design, and users.

From the standpoint of LBS providers, the earliest work is a protocol named Geopriv [30], proposed by the Internet Engineering Task Force (IETF), which aims to specify how locations are presented and transferred. After that, a growing number of works on access control were carried out [4–6]; however, most provisions need to be observed conscientiously, and with no legal restraint and no benefit for LBS providers, the corresponding protocols are difficult to make effective.

Moreover, solutions that improve system design are mostly based on data distortion [8, 9, 31–33] and cryptography [10]. For data distortion, methods like cloaking [8, 31], suppression [9], and perturbation [32, 33] have been proposed. Cloaking [8, 31] requires the submission of a larger region to avoid privacy leakage, while suppression [9] aims to break the periodic mobility pattern through the removal of location data. In addition, differential privacy [32, 33], as a typical perturbation-based method, adds noise to aggregate data to prevent information disclosure between continuous queries. Besides, Imran Memon et al. propose an asymmetric-cryptography-based anonymous communication scheme for LBSs in [10]. Despite performing well, most of the works mentioned above rely on a TTP or a location anonymizer, which is not easy to establish and maintain for a developed LBS platform such as Foursquare or Yelp.

For user-centric solutions, Reza Shokri et al. use a Stackelberg Bayesian game to formalize the mutual optimization of user-adversary objectives in LBSs in [7], which is easier to put into effect. The game-based protection framework used in this paper draws on Shokri et al.'s work, combined with an updating strategy to defend against dynamic continuous inference attacks. Other user-centric solutions more or less have drawbacks, such as Shokri et al.'s other work in [34], which hides most of the user's queries through collaboration with other peers: context information is kept in a buffer and passed to whoever is seeking it. Although it does not rely on a TTP, the QoS is not measured in this situation.

Combined with the above methods, there are two common frameworks. One is Mix-Zone, proposed in [35] and improved in [36, 37], which provides a space for a set of users to enter, change pseudonyms, and exit. The key point of this framework is to establish a mapping between users' old and new pseudonyms, but this exchange does not apply to social networks. Another famous example is k-anonymity [38, 39], which needs a trusted third party to collect the information of the nearest users for obfuscation. Having realized this problem, many researchers have proposed extensions to avoid it, including Reza Shokri et al. in [7].

With the prevalence of smartphones and the development of LBS platforms, attention has turned to location privacy protection in search-based and share-based LBS platforms, where new challenges arise due to different application scenarios and varied data sparsity. For location privacy protection in search-based LBS platforms, some new possible attacks [2, 40] have been identified and addressed. Other works [41, 42] are devoted to the traditional location privacy leakage problem, such as that of Cheng and Aritsugi in [41], who designed a system with obfuscated region maps generated on mobile devices to provide user-centric protection; our work is similar to theirs in storing user-related locations on the user's mobile phone. As for share-based LBS platforms, new attacks have been proposed, including untrusted friends' inference attacks [43, 44] and destination inference attacks [45].

Above all, existing research has not paid much attention to location privacy leakage in account linked LBSs, let alone the corresponding protection solutions. As a new and challenging problem, it is addressed in this paper with a continuous game-based preserving framework.

6. Conclusions

In this paper, we focus on a new circumstance where the information linked by accounts is fused by the adversary to make more accurate inference attacks on a user's next proposed location. Our analysis shows a remarkable influence of multisource data on location privacy disclosure. To defend against the continuous multisource attacks, we propose GLPP, an improved game-based location privacy-preserving framework that obfuscates every location in local search before submission. Here, the obfuscation model provides specific obfuscation solutions for the varied inference attacks while the user experience remains under the user's control. During the attack-and-defense process, the online updating strategy in GLPP provides a dynamic obfuscation distribution, which supports consecutive protection. Experimental results show that our proposed GLPP performs better than the initial protection in Accuracy, Certainty, and Correctness. Meanwhile, GLPP is also shown in our experiments to provide a good user experience without much loss of location utility.

Additional Points

Highlights. In this paper, Location-Based Services (LBSs) are divided into two kinds of services: search-based LBSs provide search services for users and still work when the submission is obfuscated under a limited distortion; share-based LBSs refer to check-in services and other check-in-based LBSs, which cannot work with obfuscated submissions due to their functionality.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work is supported by National Natural Science Foundation of China under Grants no. 61772133, no. 61472081, no. 61402104, no. 61370207, no. 61370208, no. 61300024, no. 61320106007, no. 61272531, no. 61202449, and no. 61272054; Collaborative Innovation Center of Wireless Communications Technology; Collaborative Innovation Center of Social Safety Science and Technology; Jiangsu Provincial Key Laboratory of Network and Information Security under Grant no. BM2003201; and Key Laboratory of Computer Network and Information Integration of Ministry of Education of China under Grant no. 93K-9.