The rapid growth of location-based services (LBSs) has greatly enriched people’s urban lives and attracted millions of users in recent years. Location-based social networks (LBSNs) allow users to check-in at a physical location and share daily tips on points of interest (POIs) with their friends anytime and anywhere. Such a check-in behavior can make daily real-life experiences spread quickly through the Internet. Moreover, such check-in data in LBSNs can be fully exploited to understand the basic laws of humans’ daily movement and mobility. This paper focuses on reviewing the taxonomy of user modeling for POI recommendations through the data analysis of LBSNs. First, we briefly introduce the structure and data characteristics of LBSNs, and then we present a formalization of user modeling for POI recommendations in LBSNs. Depending on which type of LBSNs data was fully utilized in user modeling approaches for POI recommendations, we divide user modeling algorithms into four categories: pure check-in data-based user modeling, geographical information-based user modeling, spatiotemporal information-based user modeling, and geosocial information-based user modeling. Finally, summarizing the existing works, we point out the future challenges and new directions in five possible aspects.

1. Introduction

The advanced information technologies that have resulted from the rapid growth of location-based services (LBSs) have greatly enriched people’s urban lives. Location-based social networks (LBSNs) allow users to check in and share their locations, tips, and experiences about points of interest (POIs) with their friends anytime and anywhere. For example, while having lunch at a restaurant, we may take photos of the dishes on the table and immediately share these photos with our friends via LBSNs. Such check-in behavior can make real-life daily experiences spread quickly over the Internet. Moreover, such check-in data of LBSNs can be fully exploited to understand the basic laws of human daily movement and mobility [1], which can be applied to recommendation systems and location-based services. Thus, location-based social media data services are attracting significant attention from different commerce domains, for example, user profiling [1–3], recommendation systems [4, 5], urban emergency event management [6–9], urban planning [10], and marketing decisions [11].

User-generated spatiotemporal data can be collected from LBSNs and can be widely used for understanding and modeling human mobility according to the following four aspects.

1.1. Geographical Feature

The spatial features of human movement hidden in millions of check-ins have been exploited to understand human mobility. For example, people tend to move to nearby places and occasionally to distant places [2, 4]: the former is short-ranged travel, which is periodic both spatially and temporally and is not affected by social network ties, and the latter is long-distance travel, which is more influenced by social network ties [1].

1.2. Temporal Features

Because of the routines and habits of our daily lives, the probability of visiting a given location differs across hours of the day and days of the week. The check-in data of LBSNs also reveal these patterns [3, 5]. Most people go to work on weekdays, their check-in behaviors often happen at noon or at night, and the locations they choose are close to their workplaces or homes. On weekends, most check-in behaviors happen in the morning or afternoon, and the locations are close to certain POIs (e.g., a marketplace, restaurant, museum, or scenic spot).

1.3. Social Features

First, many research studies [1, 12] show that people tend to visit close places more often than distant places, but when they do visit distant places, these tend to be close to their friends’ homes or to have been checked in at by their friends. These observations have been widely used for location recommendations in LBSNs [13–15]. Second, the spatiotemporal features extracted from check-in data have been exploited to infer social ties [16] and to make friend recommendations [17–19].

1.4. Integrated Feature

As one type of global public data source about individual activity-related choices, the check-in data in LBSNs provide a new way to sense people’s spatial and temporal preferences and infer their social ties. Moreover, they also provide a new perspective from which urban structures and related socioeconomic performances can be portrayed, street networks and POI popularity can be estimated [10, 20], intraurban movement flows can be analyzed in urban areas [21, 22], urban major/emergency events can be identified [6–9], and socioeconomic impacts of cultural investments can be detected [11].

Though several surveys on POI recommendation have been published, few current studies present a formalization of user modeling for POI recommendations in LBSNs and classify existing user modeling approaches based on the type of LBSNs data. This paper focuses on reviewing how we can efficiently make use of user-generated data to model POI recommendations in LBSNs. The contributions of this paper are as follows:

(1) Briefly introduce the structure and data characteristics of LBSNs. LBSNs can be abstracted into a three-layer + one-timeline framework. There are three types of data in LBSNs and six distinct characteristics of LBSNs data.

(2) Considering the characteristics of geographical and social data in LBSNs, we present a formalization of user modeling for POI recommendations in LBSNs.

(3) According to the type of LBSN data that is fully utilized in user modeling approaches for POI recommendations, we divide user modeling algorithms into four categories: pure check-in data-based user modeling, geographical information-based user modeling, spatiotemporal information-based user modeling, and geosocial information-based user modeling.

2. Characteristics of LBSNs

2.1. Structure of LBSNs

LBSNs are based on traditional online social networks and provide location-based services that allow users to check in at physical places and share location-related information with their friends. Meanwhile, LBSNs bridge the gap between the real and virtual worlds by allowing users’ real-life geographical activities to be disseminated on the Internet. A descriptive definition of LBSNs is given by Zheng [23]. From this descriptive definition, LBSNs can be abstracted into a “3 + 1” framework [24], namely, three layers and one timeline, as shown in Figure 1. The geographical layer is composed of users’ historical check-ins, the social layer is composed of users’ friendships, and the content layer contains the media (i.e., photo, video, and text) that has been shared by users.

There are six types of relationship in LBSNs: location-location networks, user-user networks, media-media networks, user-location networks, user-media networks, and location-media networks. Traditionally, location-location networks, user-location networks, and location-media networks are the key research content, and most user modeling approaches for POI recommendations in LBSNs are designed by mining and analyzing these three networks.

2.2. The Data Characteristics of LBSNs

There are three types of data in LBSNs: (1) user check-ins: the data record users’ check-ins at different geographical locations at different times; (2) users’ social relationships: the data record users’ social relationships; and (3) social activities: the data record the social activities in which users participate at different geographical locations and at different times, or the social media information they share. Unlike users’ social relationships, which also exist in traditional online social networks, users’ check-ins and social activities are the distinct properties of LBSNs data. More precisely, they bridge the gap between the real and virtual worlds in LBSNs.

In general, LBSNs’ data characteristics can be summarized as follows:

2.2.1. Multilayer Heterogeneous Networks

As shown in Figure 1, there are three different networks in LBSNs: check-in trajectory networks, social networks, and social media networks. The nodes and edges in the three networks are entirely disparate. Furthermore, different networks also exist between any two of the abovementioned three networks.

2.2.2. Geographical-Temporal Characteristics

Geographical locations and temporal information are the main components of users’ check-ins recorded in LBSNs, and social activities and social media shared via LBSNs are also typically labeled with a location tag. For example, a tourist may share photos with his friends via WeChat (an important social media platform in China that provides users an innovative way to communicate and interact with friends through text messaging, one-to-many messaging, hold-to-talk voice messaging, photo/video sharing, location sharing, and contact information exchange (https://en.wikipedia.org/wiki/WeChat)). When he visits the Olympic Park in Beijing, first, his current geographical location and the time will be recorded by WeChat through his check-ins. Second, the shared photos also indicate this location, because some of the distinct buildings at the Olympic Park can be recognized in them.

2.2.3. Explicit Location Description

LBSNs record not only the longitude and latitude of a location but also additional textual descriptions for a popular POI, such as categories, labels, and comments. Therefore, it is easy to distinguish two adjacent stores on a street or two neighboring buildings in a park. This is also why the geographical data in LBSNs are efficiently used in some location-based services (e-commerce recommenders, trip planning, and accurate advertising).

2.2.4. Unambiguous Social Relationships

Like traditional online social networks, LBSNs allow users to add other users as friends, meaning that the social relationships between users are entirely defined by users. The social relationships of all users in LBSNs can be written as a 0-1 matrix, where 1 represents when two users are friends and 0 represents when two users are not friends.
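The 0-1 representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the user names, the friendship list, and the helper name `build_social_matrix` are hypothetical, and friendship is assumed to be mutual, so the matrix is symmetric.

```python
def build_social_matrix(users, friendships):
    """Return a symmetric 0-1 matrix; entry [i][j] is 1 when users i and j are friends."""
    index = {user: i for i, user in enumerate(users)}
    n = len(users)
    matrix = [[0] * n for _ in range(n)]
    for a, b in friendships:
        i, j = index[a], index[b]
        matrix[i][j] = 1
        matrix[j][i] = 1  # assumption: friendship is mutual
    return matrix


# Hypothetical toy data
users = ["alice", "bob", "carol"]
friendships = [("alice", "bob"), ("bob", "carol")]
social = build_social_matrix(users, friendships)
```

For real LBSN data, with far more users than this toy case, a sparse representation (for example, one set of friend indices per user) would likely be preferable to a dense matrix.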

2.2.5. User-Driven Big Data

Users’ check-in behavior is user-driven [25] in LBSNs; the user freely decides whether to check-in at a specific location depending on their personal preference. The recent rapid development of location-based services has meant that LBSNs record a large amount of user-generated geographical and social data from billions of users. For example, Foursquare (a social-driven location sharing and local search-and-discovery service mobile app (https://en.wikipedia.org/wiki/Foursquare)) had approximately 55 million monthly active users and 10 billion check-ins by December 2016 (http://expandedramblings.com/index.php/by-the-numbers-interesting-foursquare-user-stats/). WeChat had 938 million monthly active users by April 2017 and 639 million users had accessed it on a smartphone each month on average by January 2016 (http://expandedramblings.com/index.php/downloads/dmr-wechat-statistics-report/).

2.2.6. Data Sparsity

The increasing use of smart devices and popular LBSNs has meant that the total number of user check-ins in LBSNs has increased. However, because users decide for themselves whether to check in anywhere and at any time, consecutive check-ins on LBSNs are significantly sparse. For example, the average number of daily check-ins on Foursquare is 8 million (https://en.wikipedia.org/wiki/Yelp), and, taking the approximately 55 million monthly active users cited above as the user base, we can determine that the average number of daily check-ins for each user on Foursquare is approximately 8/55 ≈ 0.145.

3. Formalization of User Modeling for POI Recommendations in LBSNs

LBSNs typically consist of a set of users $\mathcal{U} = \{u_1, u_2, \ldots, u_m\}$ and a set of POIs $\mathcal{L} = \{l_1, l_2, \ldots, l_n\}$, where each POI belongs to one or more categories. Furthermore, Yelp (https://en.wikipedia.org/wiki/Yelp) (a crowd-sourced local business review and social networking site in the USA) and Dianping (a crowd-sourced local business review and social networking site in China) also provide the semantics of POIs that contain far more information than just the category. A social network matrix $S$ represents the social relationships among all users: $S_{ij} = 1$ indicates the existence of a social relationship between users $u_i$ and $u_j$, whereas $S_{ij} = 0$ indicates no social relationship between them. A user-POI matrix $R$ records users’ visits, where $R_{ij}$ represents the frequency or comment information of POI $l_j$ visited by user $u_i$.

The goal of user modeling in LBSNs is to learn users’ implicit preferences for POIs at the correct time and place. The formalization includes a bias term that represents the popularity of each POI, together with variables that represent users’ movement regions and the time.

4. The Taxonomy of User Modeling for POI Recommendation in LBSNs

In this section, we briefly review the taxonomy of user modeling in LBSNs according to which type of LBSN data is used in the user modeling approaches for POI recommendation. We divide user modeling algorithms into four categories: pure check-in data-based user modeling, geographical information-based user modeling, spatiotemporal information-based user modeling, and geosocial information-based user modeling.

4.1. Pure Check-In Data-Based User Modeling

User-POI data are usually encoded into a sparse matrix because users only visited a few locations in LBSNs; most elements of the user-location matrix are zero. If a user’s demographics and POI categories are added to the user-POI data, the user-POI data are formatted as in Figure 2.
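To make the sparsity of the user-POI matrix concrete, the following minimal sketch builds a check-in frequency matrix from a toy log and measures the fraction of zero entries. The check-in records, identifiers, and helper names are hypothetical and serve only as an illustration.

```python
from collections import Counter


def build_user_poi_matrix(checkins, users, pois):
    """Count check-ins per (user, POI) pair and lay them out as a dense matrix."""
    counts = Counter((user, poi) for user, poi, _timestamp in checkins)
    return [[counts[(user, poi)] for poi in pois] for user in users]


def sparsity(matrix):
    """Fraction of entries that are zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


# Hypothetical toy check-in log: (user, POI, timestamp)
checkins = [
    ("u1", "cafe", "2017-01-02T12:10"),
    ("u1", "cafe", "2017-01-03T12:05"),
    ("u2", "museum", "2017-01-07T10:30"),
]
users = ["u1", "u2", "u3"]
pois = ["cafe", "museum", "park", "bar"]
R = build_user_poi_matrix(checkins, users, pois)
print(R, round(sparsity(R), 3))
```

Even in this toy case, 10 of the 12 entries are zero; the models reviewed below are largely designed to cope with this property.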

Because the check-in frequencies recorded in LBSNs implicitly reveal users’ preferences for POIs, several studies have adopted a topic model [26], a location hierarchical classification model [27, 28], a latent Dirichlet allocation [29], a Gaussian kernel approach [30], matrix factorization [31], or a latent factor model [31, 32] to infer users’ preferences for POI recommendations.

Because they are effective and efficient when dealing with a large user-item rating matrix, matrix factorization techniques [33] have been successfully used in traditional recommender systems. Two low-rank matrices $P$ and $Q$ are decomposed from the user-item rating matrix $R$ (so that $R \approx P Q^{\top}$), where $P$ and $Q$ are treated as the user latent factors and the item latent factors. Matrix factorization techniques can also be employed for POI recommendations in LBSNs. For example, Berjani and Strufe [34] sought to deal with a lack of explicit ratings for POIs in LBSNs by first transforming users’ check-ins into rating information and then proposing a regularized matrix factorization-based POI recommendation algorithm.

The objective function is defined over the set of user-spot pairs and includes a regularization parameter. Wang et al. [35] proposed a new POI recommendation algorithm to model the importance of venue semantics in user check-in behavior by treating venue semantics as an additional regularizer in the objective optimization function; this regularizer depends on the semantic similarity between pairs of venues.
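As an illustration of regularized matrix factorization in general, the sketch below minimizes the squared error over observed user-POI pairs plus an L2 penalty on the latent factors, using stochastic gradient descent. It is a generic textbook formulation rather than the exact objective of [34] or [35]; the names `P`, `Q`, `train_mf`, and `predict`, the hyperparameter values, and the toy ratings are all assumptions.

```python
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Fit user factors P and item factors Q on (user, item, rating) triples.

    Minimizes sum over observed pairs of (r - p_u . q_i)^2 + reg * (|p_u|^2 + |q_i|^2).
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted preference of user u for item i."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


# Hypothetical ratings derived from check-ins: (user index, POI index, rating)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0)]
P, Q = train_mf(ratings, n_users=3, n_items=2)
print(round(predict(P, Q, 1, 1), 2))  # score for an unobserved user-POI pair
```

A semantic regularizer of the kind described for [35] would add a further penalty term to the same loop; its exact form is not reproduced here.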

Moreover, some content information and context information (e.g., POI category, user context, sentiment indication, and timestamp) in LBSNs also reveal different characteristics of users’ check-in behaviors. Many researchers [31, 32, 36–40] have proposed context-aware [36–38] and content-aware [39, 40] POI recommendation frameworks that take the abovementioned information into account.

One major advantage of these approaches is that they achieve dimensionality reduction and alleviate data sparseness. One disadvantage is that there is no standard way to transform users’ check-ins into rating data. Another disadvantage is that they do not consider the geographical, temporal, and social influences on users’ check-ins.

4.2. Geographical Information-Based User Modeling

Like traditional recommender systems, the abovementioned approaches often treat POIs as items but do not consider geographical influence, which is a unique characteristic that distinguishes POIs from items in traditional recommender systems. Therefore, leveraging the geographical information of users’ check-ins (as shown in Figure 3) can capture the spatial distribution of humans’ daily movement and enhance the performance of POI recommendation systems in LBSNs.

4.2.1. Bayesian Model-Based User Modeling

As illustrated in Figure 3, many studies [41–43] have shown the spatial clustering phenomenon of users’ check-ins in LBSNs, which results from users’ tendency to visit nearby places rather than distant ones in their daily lives. It is intuitive that the Bayesian model [44] and probabilistic methods [41–46] can be employed to model the geographical influence of user check-ins in LBSNs. For example, to model the geographical influence of users’ check-in behaviors, Ye et al. [44] utilized the power law distribution to model the geographical influence among POIs and proposed a collaborative POI recommendation algorithm that was based on a naïve Bayesian model. Zhang et al. [46] proposed a probabilistic approach to model personalized geographical influence on user check-in behavior and predict the probability of a user visiting a new location. To model the multiple centers of different users’ check-ins in LBSNs, Cheng et al. [41] computed the probability of a user checking in to a location via a multicenter Gaussian model and proposed a POI recommendation framework with a combination of user preference, geographical influence, and personalized ranking. Nguyen Pham et al. [42] proposed an out-of-town region recommendation algorithm in consideration of the spatial influence between POIs to measure a region’s attractiveness. Taking the spatial influence of users’ check-ins into account narrows the search space and thus enhances the performance of recommendation systems. The disadvantage of these approaches is that they cannot deal with user cold-start problems.
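The idea of combining a power-law distance model with a naïve Bayesian product can be sketched as follows. This is a simplified illustration, not the algorithm of [44]: the coefficients `a` and `b` are hypothetical placeholders that would normally be fitted to observed check-in distances, the coordinates are made up, and the score is an unnormalized log-likelihood.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def power_law(distance_km, a=0.1, b=-1.0, min_km=0.1):
    """Likelihood-like weight a * d^b; a and b are assumed, not fitted, values."""
    return a * max(distance_km, min_km) ** b


def geo_score(candidate, visited, a=0.1, b=-1.0):
    """Naive Bayes style score: sum of log power-law weights over visited POIs."""
    return sum(math.log(power_law(haversine_km(candidate, v), a, b)) for v in visited)


# Hypothetical coordinates: two visited POIs, one nearby and one distant candidate
visited = [(39.99, 116.39), (39.98, 116.40)]
near, far = (39.985, 116.395), (31.23, 121.47)
print(geo_score(near, visited) > geo_score(far, visited))
```

With a negative exponent `b`, candidates close to a user's visited POIs receive higher scores, which mirrors the spatial clustering phenomenon described above.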

4.2.2. Latent Factor Model

Alongside the development of the matrix factorization technique in the recommendation system, another intuitive method of modeling users based on geographical information in LBSNs is the latent factor model. The main challenge is how to combine the geographical influence of user behaviors with matrix factorization. In general, the inherent spatial feature (e.g., neighbor) of POIs and the spatial clustering phenomenon (e.g., all users who visit POIs tend to cluster together and several geographical regions are automatically formed) are the core geographical influences that are considered in the latent factor model and are usually treated as additional latent factors in matrix factorization. The state-of-the-art approaches to user modeling can be divided into two groups:

(1) Geographical neighbors

The observations that individuals tend to visit nearby POIs and their geographical neighbors in LBSNs have been effectively used in POI recommendations. For example, Hu et al. [47] proposed a latent factor model for rating predictions that combined the intrinsic characteristics of businesses and the extrinsic characteristics of their geographical neighbors. The predicted rating combines the average of all known ratings, the user bias and the item bias, the latent factors of the user, the latent factors of the item for its intrinsic characteristics, the latent factors of the item for its extrinsic characteristics, the latent factors of each category, and the latent factors of each review word.

Moreover, Li et al. [48] proposed a ranking-based geographical factorization method for POI recommendations that obtains user-preference scores and geographical neighbor scores through user-POI matrix factorization and POI-k-nearest neighbor matrix factorization. Feng et al. [49] considered both sequential influence, where the next POI is influenced by the current POI within a short period, and geographical influence, where a distant POI is less likely to be recommended, and proposed a personalized ranking metric embedding model for next new POI recommendation.

(2) Geographical region

Apart from geographical neighbors, geographical region is another geographical influence that is used in a latent factor model. Many researchers [50–56] have recently discovered spatial clustering phenomena in human mobility behavior and demonstrated their effectiveness in POI recommendations. For example, Liu et al. [54] proposed a novel location recommendation approach that exploits instance-level characteristics and region-level characteristics by incorporating two-level geographical characteristics into the learning of the latent factors of users and locations. The predicted rating and objective function involve an instance weighting parameter, the set of nearest neighboring locations of each location, a Gaussian function, and a set of assigned weights.

Furthermore, Liu et al. [55] leveraged a latent region variable to model user-mobility behaviors over different activity regions and proposed a geographical probabilistic factor model for POI recommendations. Chen et al. [56] proposed a probabilistic latent model by considering the cluster phenomenon where the users’ check-in places were automatically divided into several regions and how users’ psychological behavior could make them prefer a nearby place to a distant one.

The main challenges for the latent factor model are to incorporate geographical information into the latent factors and to reduce computational complexity.

4.3. Spatiotemporal Information-Based User Modeling

User check-ins demonstrate that short-ranged travels are successive and periodic, both spatially and temporally in LBSNs [1, 57, 58]. Although users’ check-ins exhibit a periodic pattern, which implies the users’ lifestyle, all POIs visited by users result in check-in sequences, which reveal how two successive POIs can be geographically adjacent and temporally relevant from the perspective of a venue’s function (as shown in Figure 4). Therefore, temporal information is an important contextual factor used in user modeling for POI recommendations.

4.3.1. Time-Aware User Modeling

The time factor is treated as a contextual factor and used to enhance POI recommendation systems. It affects human experiences, and our daily lives exhibit a temporal clustering phenomenon in addition to the geographical one. For example, most users visit different types of POI at different times of the day and on weekdays versus weekends: they may visit a food-related POI at noon and a nightlife spot in the evening. Most office staff commute from home to their company every weekday morning and shop at a supermarket on weekday afternoons. Some users’ temporal POI preferences may be similar, which naturally fits the underlying assumption of collaborative filtering, that is, users who have similar temporal preferences for certain POIs will likely have similar temporal preferences for other POIs [59–61]. User-based recommendation methods [5, 15, 62–65], tensor factorization [61, 66], ranking SVM [67], generative models [68–72], graph-based models [73, 74], and neural networks [75, 76] are effective methods for modeling users for POI recommendations in LBSNs. For example, Yuan et al. [62] proposed a user-based extended POI recommendation algorithm by leveraging the time factor when computing the similarity between two users and the recommendation score for a new POI. Yao et al. [63] took the compatibility between the time-varying popularity of POIs and the regular availability of users into consideration to propose a recommendation system based on temporal matching between POI popularity and user regularity. Ozsoy et al. [77] proposed a dynamic recommendation algorithm by leveraging users’ temporal preferences at different times or days of the week.

These approaches can dynamically produce recommended POIs in terms of users’ temporal preferences. However, the recommended POIs are usually those that are popular with most users, so unpopular POIs (namely, long-tail POIs) and new POIs would not be recommended to any user.
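The time-aware, user-based collaborative filtering idea discussed in this subsection can be sketched as follows. This is a deliberately simplified illustration and not the method of any cited work: each user is profiled by counts over (hour, POI) pairs, user similarity is the cosine between profiles, and a candidate POI at a given hour is scored by similarity-weighted visits of other users. All identifiers and the toy check-ins are hypothetical.

```python
import math
from collections import defaultdict


def slot_profiles(checkins):
    """Map each user to a dict of {(hour, poi): visit count}."""
    profiles = defaultdict(lambda: defaultdict(int))
    for user, poi, hour in checkins:
        profiles[user][(hour, poi)] += 1
    return {user: dict(profile) for user, profile in profiles.items()}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def time_aware_score(profiles, user, poi, hour):
    """Similarity-weighted visits of other users to `poi` at `hour`."""
    target = profiles.get(user, {})
    return sum(
        cosine(target, profile) * profile.get((hour, poi), 0)
        for other, profile in profiles.items()
        if other != user
    )


# Hypothetical check-ins: (user, POI, hour of day)
checkins = [
    ("u1", "noodle_bar", 12),
    ("u2", "noodle_bar", 12),
    ("u2", "jazz_club", 21),
    ("u3", "gym", 7),
]
profiles = slot_profiles(checkins)
print(time_aware_score(profiles, "u1", "jazz_club", 21))
```

Because the score relies entirely on what similar users have already visited, a POI with no check-ins can never receive a positive score, which illustrates the long-tail and new-POI limitation noted above.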

4.3.2. Sequential Influence-Based User Modeling

Sequential influence is another temporal influence that is utilized in user modeling for POI recommendations in LBSNs. All POIs visited by a user form a check-in sequence, and successive check-ins are typically correlated both spatially and temporally. For example, a user may habitually visit a bar after dinner in a restaurant. This observation suggests that the bar and the restaurant are geographically adjacent, and the check-in sequence implies temporal relevance from the perspective of venue functions in addition to the user’s daily habits. The Markov chain model is most often exploited to model the sequence pattern for POI recommendations in LBSNs. For example, Cheng et al. [51] took into account two prominent properties of the check-in sequence, namely, personalized Markov chain and region localization, and proposed a novel matrix factorization method for POI recommendation, which exploits the personalized Markov chain in the check-in sequence and users’ movement constraint. He et al. [32] proposed a third-rank tensor with which to model successive check-in behaviors by fusing a personalized Markov chain with a latent pattern. Zhang et al. [78] exploited a dynamic location-location transition graph to model sequential patterns and predicted the probability of a user visiting a location via an additive Markov chain; they also fused sequential influence with geographical influence and social influence into a unified recommendation framework. Furthermore, they [79] proposed a gravity model that weighs the effect of each visited location on the new location, which integrates the spatiotemporal, social, and popularity influences by estimating a power-law distribution.
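The basic sequential idea can be illustrated with a plain first-order Markov chain over check-in sequences. This sketch is neither personalized nor additive, so it is simpler than the models of [32, 51, 78]; the sequences and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Estimate P(next POI | current POI) from consecutive check-in pairs."""
    counts = defaultdict(Counter)
    for sequence in sequences:
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    return {
        current: {poi: c / sum(nexts.values()) for poi, c in nexts.items()}
        for current, nexts in counts.items()
    }


def recommend_next(transitions, current, top_n=1):
    """Return the most probable next POIs after `current` (empty if unseen)."""
    candidates = transitions.get(current, {})
    return sorted(candidates, key=candidates.get, reverse=True)[:top_n]


# Hypothetical check-in sequences, one list per user and day
sequences = [
    ["restaurant", "bar", "home"],
    ["restaurant", "bar"],
    ["restaurant", "cinema"],
]
transitions = fit_transitions(sequences)
print(recommend_next(transitions, "restaurant"))
```

In this toy data the bar follows the restaurant in two of three sequences, so it is the top recommendation after a restaurant check-in.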

Apart from the Markov chain model, matrix factorization [80], tensor factorization [81], a pairwise ranking model [82, 83], and recurrent neural networks [75] are employed to model the check-in sequential pattern. For example, Chen et al. [81] used a third-rank tensor to compute transitions between categories of users’ successive locations and proposed a graph-based location recommendation algorithm. Zhao et al. [82, 83] presented two POI recommendation algorithms via a pairwise ranking model and exploited two different methods to model the sequential influence from two different aspects. Liu et al. [75] exploited extended recurrent neural networks to model local temporal and spatial contexts and proposed a location-recommendation algorithm.

Furthermore, Yang and Eickhoff [84] used the word2Vec technique to propose a spatiotemporal embedding similarity algorithm for location recommendations by treating the time, location, and venue functions of check-in records as virtual “words,” check-in sequences as “sentences,” and the activity of a neighborhood or user as “documents.” Liao et al. [85] proposed a location prediction model by utilizing temporal regularity and sequential dependency. Zhu et al. [86] constructed a user model from location trajectory, semantic trajectory, location popularity, and user familiarity and proposed a semantical pattern mining and preference-aware POI recommendation algorithm. Liu et al. [87] developed a low-rank graph construction model to learn static user preferences and dynamic sequential preferences, and thus proposed a POI recommendation algorithm.

Most of the abovementioned approaches are content-based recommendation techniques. They make full use of the sequential influence of users’ check-ins to model users’ spatiotemporal preferences for successive POIs. However, if a user does not check in often enough, or is a new user, these approaches do not work well.

4.4. Geosocial Information-Based User Modeling

Besides geographical and temporal influences, social influence is another source of contextual information exploited in user modeling for POI recommendations. Check-ins in LBSNs show [1, 24] that users’ long-distance travel is influenced by their friends and that users are more likely to visit places that have been visited by their friends. In other words, friends tend to share more common interests than nonfriends in LBSNs (the geographical and social relationships of users’ check-ins in LBSNs are shown in Figure 5). These observations are widely exploited to model users for POI recommendations [2, 88–94]. For example, Hu and Ester [88] proposed a top-N POI recommendation algorithm by leveraging both the social and topic aspects of users’ check-ins. Zhang and Zhang [89] leveraged location, time, and social information to model users and used a weighted approximately ranked pairwise loss to achieve top-N POI recommendations. Jia et al. [90] defined several features to measure the influence of friends, ranked friends in terms of their influence by a sequential random walk with restart, and utilized a Bayesian model to characterize the dynamics of friends’ influence to predict locations. Li et al. [91] focused on the problem of predicting users’ social influence on event recommendations in event-based social networks and proposed a hybrid collaborative filtering model by incorporating both event-based and user-based neighborhood influences into matrix factorization. Gao et al. [92] presented an event recommendation algorithm by fusing social group influences and individual preferences into a Bayesian latent factor model. Zhang and Chow [93] proposed a geosocial collaborative filtering model through a combination of user preference, social influence, and personalized geographical influence that had been learned from users’ check-in behaviors by a kernel density estimation approach. Additionally, they proposed a geographical-social-categorical correlation enhanced POI recommendation approach in [2] by taking categorical correlations between POIs into consideration. In [94], an LDA-based POI recommendation model is presented that jointly mines latent communities, regions, activities, topics, and sentiments from social links between users, venue geographical locations, venue categories, and textual comments on venues.
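The observation that users tend to visit places their friends have visited can be turned into a very simple scoring rule, sketched below. It is an illustrative baseline rather than any of the cited models: the 0-1 social matrix, the check-in counts, and the function name are hypothetical, and every friend is weighted equally.

```python
def friend_based_scores(user, social, checkins):
    """Score each POI the user has not visited by friends' check-in counts there."""
    visited = checkins[user]
    scores = [0] * len(visited)
    for friend, is_friend in enumerate(social[user]):
        if not is_friend:
            continue
        for poi, count in enumerate(checkins[friend]):
            if visited[poi] == 0:  # only recommend POIs that are new to the user
                scores[poi] += count
    return scores


# Hypothetical data: 3 users, 3 POIs
social = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
checkins = [
    [2, 0, 0],
    [1, 3, 0],
    [0, 0, 4],
]
print(friend_based_scores(0, social, checkins))
```

A user with no check-ins of their own still receives scores from their friends' check-ins, which hints at why social information helps with the user cold-start problem.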

In addition to traditional recommendation systems (e.g., e-commerce recommendation systems and context-aware recommendation systems), POI recommendation systems in LBSNs also face many challenging problems such as the issue of data sparsity and the user/POI cold-start problem. Incorporating social network ties into certain mathematical models (e.g., matrix factorization and graph model) is an effective solution to cope with such challenges [9599]. For example, Zhang and Wang [96] proposed a local event recommendation approach that took Bayesian Poisson factorization as its basic unit to model events, social relations, and content text, and handled the cold-start problem by incorporating event textual content and location information into these basic units. Yin et al. [97] proposed a POI recommendation algorithm that was based on a probabilistic generative model, which considered the phenomenon of user interest drift across geographical regions, exploited social and spatial information to enhance the inference of region-dependent personal interests, and alleviated the issue of data sparsity. Yao et al. [37] presented a collaborative filtering POI recommendation method based on nonnegative tensor factorizeation and fused users’ social relations as regularization terms of the factorization to improve the recommendation accuracy. Ren et al. [98] exploited a weighted product of user latent factors and POI factors by incorporating a topic with geographical, social, and categorical information to enhance the performance of a probabilistic matrix factorization. POI recommendation accuracy can be improved and cold-start problems can be addressed by leveraging the information of friends; Li et al. [99] first defined three types of friend (social friends, location friends, and neighboring friends) and incorporated the set of locations that received individual likes and were checked-in by individual’s friends into matrix factorization. 
Their predicted formulation and objective function involve the category of each location and a tuning parameter. For each user, the locations are divided into observed locations, potential locations, and other unobserved locations, and a loss function is defined over the user’s observed, potential, and unobserved preferences for a location. Moreover, Guo et al. [100] exploited geographical information, social information, and aspects extracted from user reviews to better model user preferences, then constructed a novel heterogeneous graph by fusing three types of nodes (users, POIs, and aspects) and various relations among them, and finally transformed personalized POI recommendation into a graph node ranking problem.
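The three-way split of locations described for Li et al. [99] can be illustrated with a minimal sketch. It assumes, for illustration only, that a "potential" location is one visited by at least one friend but not yet by the user; the function name `partition_locations`, the data layout, and the toy data are hypothetical and are not taken from [99].

```python
from typing import Dict, Set, Tuple


def partition_locations(
    user: str,
    checkins: Dict[str, Set[str]],
    friends: Dict[str, Set[str]],
    all_locations: Set[str],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split all locations into (observed, potential, unobserved) for one user."""
    # Observed: locations the user has already checked in at.
    observed = set(checkins.get(user, set()))

    # Assumption: potential locations are those visited by a friend
    # but not yet by the user.
    friend_visits: Set[str] = set()
    for friend in friends.get(user, set()):
        friend_visits |= checkins.get(friend, set())
    potential = (friend_visits & all_locations) - observed

    # Everything else is treated as unobserved.
    unobserved = all_locations - observed - potential
    return observed, potential, unobserved


# Toy data (hypothetical).
checkins = {"alice": {"cafe", "museum"}, "bob": {"cafe", "park"}, "carol": {"gym"}}
friends = {"alice": {"bob"}}
all_locations = {"cafe", "museum", "park", "gym", "zoo"}

observed, potential, unobserved = partition_locations(
    "alice", checkins, friends, all_locations
)
print(sorted(observed), sorted(potential), sorted(unobserved))
```

A loss function could then weight errors differently on the three sets, which is the role the text attributes to the objective function.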

These approaches can address the user cold-start problem and alleviate the sparsity of users’ check-ins by leveraging social information in LBSNs. The major challenge is how to incorporate users’ social relations into popular models (such as matrix factorization).
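One common way to incorporate social relations into matrix factorization is to add a regularization term that keeps the latent factors of friends close to each other. The following is a minimal sketch of that generic idea, trained with stochastic gradient descent; it is not the exact model of any work cited above, and the function names, hyperparameter values, and toy data are assumptions made for illustration.

```python
import random
from typing import Dict, List, Set, Tuple

Rating = Tuple[int, int, float]  # (user index, POI index, observed preference)


def predict(P: List[List[float]], Q: List[List[float]], u: int, i: int) -> float:
    """Predicted preference: inner product of user and POI latent factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def train_social_mf(
    ratings: List[Rating],
    friends: Dict[int, Set[int]],
    n_users: int,
    n_pois: int,
    k: int = 4,
    lr: float = 0.05,
    reg: float = 0.01,
    social_reg: float = 0.1,
    epochs: int = 200,
    seed: int = 0,
) -> Tuple[List[List[float]], List[List[float]]]:
    """SGD on squared error + L2 penalty + a friend-closeness penalty."""
    rng = random.Random(seed)
    P = [[rng.random() * 0.1 for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.random() * 0.1 for _ in range(k)] for _ in range(n_pois)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient of the social term: sum over friends of (p_u - p_v).
                social = sum(pu - P[v][f] for v in friends.get(u, ()))
                P[u][f] += lr * (err * qi - reg * pu - social_reg * social)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def mse(P: List[List[float]], Q: List[List[float]], ratings: List[Rating]) -> float:
    """Mean squared error on the given (user, POI, preference) triples."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


# Toy data (hypothetical): 3 users, 3 POIs, binary check-in indicators.
ratings = [(0, 0, 1.0), (0, 1, 1.0), (1, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)]
friends = {0: {1}, 1: {0}}
P, Q = train_social_mf(ratings, friends, n_users=3, n_pois=3)
print("training MSE:", mse(P, Q, ratings))
```

In this sketch, `social_reg` controls how strongly a user's factors are pulled toward those of friends; setting it to zero recovers plain regularized matrix factorization.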

5. Statistics on the Literature

In this section, we first give some brief statistics on the literature published in well-known journals and conferences in recent years, as shown in Table 1. Inspired by [101], we further classify user modeling for POI recommendation into three main categories: context-aware techniques, content-based techniques, and collaborative filtering (CF) and hybrid techniques. The CF techniques comprise memory-based CF and model-based CF, and the memory-based CF methods include user-based CF (UCF) and item-based CF (ICF). Finally, we list them in Table 2.

6. The Challenges and New Directions in the Future

POI recommendation systems in LBSNs not only provide the basic functions of traditional recommendation systems but also have the characteristics of location-based services and mobile urban computing. In recent years, much research has been done on user modeling for POI recommendation in LBSNs. However, some challenges and new directions remain that are likely to attract considerable attention from researchers in the future.

6.1. Mining Users’ Check-Ins and Social Activities in LBSNs

Users’ check-ins and social activities recorded in LBSNs usually conceal their personalized preferences for POIs in the real world. To some extent, the spatiotemporal properties of users’ check-ins and social activities are significant assumptions in user modeling for POI recommendation [44, 47, 54, 56, 63, 84]. Therefore, mining users’ check-ins and social activities plays an important role in POI recommendation in LBSNs. The main research topics are as follows: the spatiotemporal distribution of users’ check-ins, the similarity of users’ check-in trajectories, and the tracking and recognition of users’ activities.
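Two of these research topics can be illustrated with a minimal sketch: the temporal distribution of one user's check-ins, and a simple similarity between two users' check-in histories. The Jaccard overlap of visited POI sets is used here only as a simple stand-in for trajectory similarity, since it ignores visit order and timing; the record layout and toy data are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List, Tuple

CheckIn = Tuple[str, str, datetime]  # (user id, POI id, timestamp)


def hourly_distribution(checkins: List[CheckIn], user: str) -> Dict[int, float]:
    """Fraction of the user's check-ins that fall in each hour of the day."""
    counts = Counter(ts.hour for u, _, ts in checkins if u == user)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {hour: n / total for hour, n in sorted(counts.items())}


def jaccard_similarity(checkins: List[CheckIn], user_a: str, user_b: str) -> float:
    """Overlap of the sets of POIs visited by two users (order is ignored)."""
    pois_a = {p for u, p, _ in checkins if u == user_a}
    pois_b = {p for u, p, _ in checkins if u == user_b}
    union = pois_a | pois_b
    return len(pois_a & pois_b) / len(union) if union else 0.0


# Toy data (hypothetical).
checkins = [
    ("alice", "cafe", datetime(2017, 5, 1, 8, 0)),
    ("alice", "cafe", datetime(2017, 5, 2, 8, 30)),
    ("alice", "bar", datetime(2017, 5, 2, 20, 0)),
    ("bob", "cafe", datetime(2017, 5, 1, 9, 0)),
    ("bob", "gym", datetime(2017, 5, 1, 18, 0)),
]

alice_hours = hourly_distribution(checkins, "alice")
similarity = jaccard_similarity(checkins, "alice", "bob")
print(alice_hours, similarity)
```

Order-aware measures, such as sequence alignment over timestamped visits, would capture more of what the text calls trajectory similarity, at a higher computational cost.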

6.2. The Relevance between Users’ Check-Ins and Social Relationships

In the past, it was difficult to collect users’ spatial data and social relationships through a unified platform. LBSNs provide a new way to collect these data and a new perspective for studying the relevance between users’ check-ins and social relationships. Users’ check-ins in LBSNs usually uncover users’ personal behaviors, and users’ social relationships in LBSNs usually reveal users’ social behaviors in the real world. Qualitative and quantitative analysis of this relevance can be used as an important heuristic in user modeling for POI recommendations [88–94] and in inferring new social links [17–19]. With the wide application of location-based POI recommendations, research on the relevance between users’ check-ins and social relationships will gradually attract more attention.

6.3. The Interpretation of Recommendations

Effective interpretation and clear presentation help users fully understand the recommendations, thus improving their acceptance of the recommendations and their loyalty to recommendation systems. Especially on smart devices with relatively small screens and inconvenient input, a more user-friendly interface and game-oriented interpretation are needed. There are few studies on the interpretation and presentation of POI recommendations in LBSNs. However, we think that helping users understand the recommendations, beyond merely showing a two-dimensional point or a continuous path on the map, is likely to become a hot research topic in the future.

6.4. Scalability

Scalability remains a persistent problem in recommendation systems. In a user-based collaborative filtering algorithm, the computational complexity of computing user similarity is $O(n^2 m)$, where $n$ is the number of users and $m$ is the average number of items rated by each user. As the numbers of users and items increase, the cost of the similarity computation rises sharply. At present, solutions to this problem in user-based collaborative filtering involve dimensionality reduction through factorization models and narrowing the search space by incorporating users’ contextual information. The same problem exists in POI recommendation in LBSNs, and the abovementioned solutions are effective there as well. Therefore, with the growth of data in volume and dimensionality, designing highly efficient models will be a long-term research focus in POI recommendation systems as well as in e-commerce recommendation systems.
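The cost of all-pairs user similarity, and the effect of narrowing the search space with contextual information, can be seen in a minimal sketch. Here the context is assumed to be a per-user home region, so that similarity is computed only between users in the same region; the region labels, function names, and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple

Ratings = Dict[str, Dict[str, float]]  # user -> {POI: rating}
Pair = Tuple[str, str]


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def all_pairs_similarity(ratings: Ratings) -> Dict[Pair, float]:
    """Compare every pair of users: n * (n - 1) / 2 similarity computations."""
    return {
        (u, v): cosine(ratings[u], ratings[v])
        for u, v in combinations(sorted(ratings), 2)
    }


def same_region_similarity(
    ratings: Ratings, home_region: Dict[str, str]
) -> Dict[Pair, float]:
    """Compare only users that share a home region (assumed context)."""
    by_region: Dict[str, List[str]] = {}
    for user in ratings:
        by_region.setdefault(home_region[user], []).append(user)
    sims: Dict[Pair, float] = {}
    for users in by_region.values():
        for u, v in combinations(sorted(users), 2):
            sims[(u, v)] = cosine(ratings[u], ratings[v])
    return sims


# Toy data (hypothetical).
ratings = {
    "u1": {"cafe": 1.0, "park": 1.0},
    "u2": {"cafe": 1.0, "park": 1.0},
    "u3": {"gym": 1.0},
    "u4": {"gym": 1.0, "zoo": 1.0},
}
home_region = {"u1": "north", "u2": "north", "u3": "south", "u4": "south"}

full = all_pairs_similarity(ratings)
narrowed = same_region_similarity(ratings, home_region)
print(len(full), len(narrowed))
```

With four users split evenly over two regions, the number of comparisons drops from six to two; the saving grows with the number of users, at the price of missing similar users who live in different regions.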

6.5. Privacy Preservation

Users often worry about privacy issues (e.g., location disclosure and sensitive relationship disclosure) when they check in on LBSNs [102]. For example, Gundecha et al. [103] found that privacy is the factor users are most concerned about when using location-sharing services. Users enjoy many location-based services by proactively sharing their current locations in LBSNs; at the same time, inappropriate disclosure of location information poses threats to their privacy. Privacy preservation in LBSNs has now attracted attention in both academic research [104] and industrial applications [105].

7. Conclusion

The increasing use of smart devices and LBSNs has generated vast amounts of user data in recent years, and how these data can be utilized to understand human mobility behavior and help users make better decisions has attracted the interest of many researchers from different domains. In this paper, we have focused on reviewing the taxonomy of user modeling for POI recommendation via data analysis in LBSNs. We have divided user modeling algorithms into four categories according to which type of LBSN data is fully utilized in user modeling approaches for POI recommendation: pure check-in data-based user modeling, geographical information-based user modeling, spatiotemporal information-based user modeling, and geosocial information-based user modeling. Finally, summarizing the existing works, we have pointed out future challenges and new directions in five possible aspects.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

This work was supported by the National Science Foundation of China (no. 61602518) and the Open Foundation of Hubei Key Laboratory of Intelligent Geo-Information Processing (no. KLIGIP2016A06).