Abstract

Point-of-interest (POI) recommendation, which aims to predict the locations that users may be interested in, has attracted wide attention due to the development of the Internet of Things and location-based services. Although collaborative filtering based methods and deep neural networks have achieved great success in POI recommendation, the data sparsity and cold start problems still exist. To this end, this paper proposes a session-based graph attention network (SGANet for short) for POI recommendation that makes use of regional information. Specifically, we first extract users’ features from the regional historical check-in data in session windows. Then, we use a graph attention network to learn users’ preferences for both individual POI and regional POI. We learn the long-term and short-term preferences of users by fusing the user embedding and POI ancillary information through a gated recurrent unit (GRU). Finally, we conduct experiments on two real-world location-based social network datasets, Foursquare and Gowalla, to verify the effectiveness of the proposed recommendation model. The experimental results show that SGANet outperforms the compared baseline models in terms of recommendation accuracy, especially in sparse-data and cold-start scenarios.

1. Introduction

In recent years, with the rapid development of the mobile Internet of Things and smart devices, mobile crowdsourcing has become popular as a new data collection and analysis paradigm that involves pervasive smart devices belonging to various participants [1]. Owing to the success of GPS technology, most smart devices support location-based services, which are the main tasks in mobile location-based social networks (LBSNs) such as Yelp, Foursquare, and Gowalla [2]. Unlike traditional mobile social networks, LBSNs add the dimension of location, which brings social networks back to reality and bridges the gap between the physical world and online social networking services. LBSNs track and share users’ location information in addition to person-to-person connections. A good sharing experience attracts more user-generated data, which in turn supports a deeper understanding of user behavior. Among the various location-based services, such as traffic monitoring, air pollution detection, and scenic spot recommendation, point-of-interest (POI) recommendation is one of the most important tasks in LBSNs because it helps users discover new and interesting locations [3]. POI recommendation usually recommends a list of POI at which a user is most likely to check in in the future, based on users’ check-in records, venue information, and users’ social relationships.

POI recommendation is a branch of recommendation systems and faces the following challenges: (1) Data Sparsity. In general, the larger the amount of data is, the sparser it is. When the data is sparse, most algorithms based on association analysis (such as collaborative filtering) do not work well, and it is difficult to use the existing data to make highly satisfactory recommendations to users. (2) Cold Start. It is difficult to learn a new user’s preference because the user’s historical check-in data and the contextual information about POI are lacking. (3) Implicit Feedback. The interaction between users and POI is often implicit feedback, such as check-in records and numbers of clicks. (4) Heterogeneity of POI Ancillary Information. POI recommendation involves information such as geographic location, timestamp, social relationship, POI type, and related descriptions. Related research shows that this ancillary information about users and POI can improve the quality of recommendation [4], but using such heterogeneous information effectively to recommend more satisfactory POI remains a challenge. Most traditional POI recommendation models are based on matrix factorization [5] and use only the dot product of the user vector and the item vector to model the interaction between users and items. Such models are linear and have limited expressive ability; furthermore, they usually do not make full use of ancillary information.

To solve the above problems, we propose a novel recommendation model (SGANet), which can learn user preferences in an unsupervised manner. First, SGANet uses multiple type hot spots to model the user’s historical check-in data. Then, features are further extracted from the user’s historical check-in data based on session windows. To learn the user’s preference representation, we use the graph attention network to model both user preference and regional preference. Finally, SGANet learns the representations of users and POI by fusing the ancillary information through a GRU network and makes recommendations for the target user.

To summarize, we make the following contributions:

(i) We propose a POI recommendation model for LBSN named SGANet. It takes into account not only the user’s preference for a single POI but also the user’s preference for regional hot spots. At the same time, it mines POI ancillary information to recommend the top-k POI with the highest satisfaction for users.

(ii) To incorporate the nonlinear relationship between users and POI, we construct a user POI embedding and a regional hot spot embedding based on the user’s check-in records. We introduce a graph attention network to capture the user’s preference for POI pairs and regional POI within the model, and we further learn the long-term and short-term preferences of users by fusing POI ancillary information through a GRU recurrent neural network.

(iii) We conduct experiments on two real-world datasets (Foursquare and Gowalla) to verify the effectiveness of the proposed method.

The rest of the paper is organized as follows. Section 2 reviews the related work. Section 3 introduces the proposed POI recommendation model in detail, and Section 4 presents the experiments and analyzes the results. Finally, Section 5 concludes the paper.

2.1. Traditional Recommendation Algorithms

Most traditional recommendation algorithms are based on matrix factorization [6] and probability models. In 2001, Sarwar et al. [7] proposed a classic item-based collaborative filtering recommendation algorithm. They analyzed the user-item matrix to learn the relationships between different items and then used these learned relationships to recommend items to users implicitly. Mnih and Salakhutdinov [8] proposed a probabilistic matrix factorization technique that scales linearly with the number of observations and performs well on large, sparse, and unbalanced datasets. Pan et al. [9] proposed heterogeneous implicit feedback (HIF), in which the confidence interval is learned adaptively through adaptive Bayesian personalized ranking; HIF showed better recommendation performance on various evaluation indicators. Fitzgerald [1] applied the Markov decision process model to the recommendation system, generating a recommendation list with a maximized revenue score through iterative convergence. Zhang et al. [10] proposed a model combining collaborative filtering with deep learning. This model obtains the latent features more accurately by improving upon the traditional matrix factorization algorithm and thereby improves the quality of the recommendation results. Bi et al. [11] proposed a recommendation algorithm that builds a regression model based on deep neural networks to predict user ratings. Deng et al. [12] proposed a novel K-medoids clustering recommendation algorithm based on probability distribution for collaborative filtering, which enhances the prediction accuracy and effectively deals with the sparsity problem.

2.2. POI Recommendation Algorithms

Like many other recommendation problems, POI recommendation has attracted extensive research interest. Aliannejadi and Crestani [13] proposed a probabilistic model to find the mapping between user-annotated tags and locations’ taste keywords, and demonstrated that personalized POI recommendation plays a key role in user satisfaction in LBSN. Liu et al. [14] created a geographical-temporal awareness hierarchical attention network (GT-HAN) to gain insight into user mobility for next-POI recommendation; it contains an extended attention network that uses a theory of geographical influence to simultaneously uncover the overall sequence dependence and the subtle POI-POI relationships. Li et al. [15] studied how to predict the competitive relationship between POI. They built a heterogeneous POI information network (HPIN) from POI reviews and map data and developed a graph neural network-based deep learning framework, DeepR. Hang et al. [16] proposed a heterogeneous graph-based method to encode the correlations between users, POI, and activities and to jointly learn embeddings for the vertices. Zhou et al. [17] made the first attempt to learn the distribution of user latent preference by proposing an adversarial POI recommendation (APOIR) model. Si et al. [18] found that most existing POI recommendation methods lack adaptability when making recommendations for users with different preferences, which leads to unsatisfactory results; they therefore proposed an adaptive POI recommendation method that combines user activity and spatial features. Rahmani et al. [19] proposed a POI recommendation method based on a local geographical model, which considers both users’ and locations’ points of view. By leveraging geographical information to capture both the user’s personal geographic profile and the location’s geographic popularity, and by incorporating the geographical model into matrix factorization approaches, their method improved the performance of POI recommendation. Sun et al. [20] proposed long- and short-term preference modeling (LSTPM) for POI recommendation, which consists of a nonlocal network for long-term preference modeling and a geo-dilated RNN for short-term preference learning.

The collection of POI data can be achieved using crowdsourcing. The emergence of crowdsourcing has affected and changed the traditional business model: it enables companies to subcontract work to the public through the Internet, and any participant can use the network platform to improve ideas, solve problems, and get corresponding rewards. The work in [21] proposed a novel bilayer collaborative clustering (BLCC) method for label aggregation in crowdsourcing. The work in [22] proposed a novel task bundling based incentive mechanism that dynamically bundles tasks with different popularity together to solve the participation unbalance problem. The work in [23] proposed a spatial crowdsourcing schema for the opportunistic collection of information within an interest area in a city or region. The work in [24] proposed a blockchain-based task matching scheme for crowdsourcing that provides secure and reliable matching; instead of utilizing a centralized cloud server, the scheme employs smart contracts, an emerging blockchain technology, to provide reliable and transparent matching. The work in [25] developed an online task assignment system that assigns workers appropriate tasks on the fly in order to improve answer quality. The work in [26] analyzed the privacy leaks and potential threats in task matching and proposed a single-keyword task matching scheme for multirequester/multiworker crowdsourcing with efficient worker revocation.

2.3. Graph Attention Network Recommendation

With the great success of neural networks in traditional fields such as computer vision and natural language processing, researchers have also introduced deep learning into recommendation systems. In recent years, techniques based on the graph attention network have attracted more and more attention in various fields. In 2017, Veličković et al. [27] first proposed the graph attention network. This architecture operates on graph-structured data and addresses the shortcomings of existing methods based on graph convolution through a self-attention mechanism. In 2019, Busbridge et al. [28] studied the relational graph attention network, which extends the nonrelational graph attention network to incorporate relational information so that it can be applied to a wider range of problems. In 2019, Song et al. [29] proposed session-based social recommendation via dynamic graph attention networks. The system uses recurrent neural networks to model dynamic user behaviors and graph attention networks to model context-sensitive social influences, so that the model can dynamically infer the influence according to the user’s current interest. In 2018, Zhang et al. [30] proposed multiresolution graph attention networks. The model learns multilayer representations of the vertices through a graph convolutional network (GCN). It then matches the short text fragment with the graph representation of the document and applies the attention mechanism on each layer of the GCN to learn the relevance matching between the short text and the long document. Wang et al. [31] proposed a novel heterogeneous graph neural network based on hierarchical attention, including node-level and semantic-level attention. Mohan and Pramod [32] proposed a temporal graph attention network (TempGAN), which aims to learn representations from continuous-time temporal networks by preserving the temporal proximity between nodes.

Most of the above POI recommendation methods fail to analyze the check-in records of users and ignore the fact that, although the check-in distribution characteristics of users differ, their regional hot spots may be similar. Taking regional information into account is therefore important for extracting user preferences accurately. In addition, the existing methods still suffer from the data sparsity and cold start problems because they do not make full use of ancillary information when making recommendations. Therefore, in this paper, we propose a novel recommendation model to address these shortcomings of the existing methods.

3. The Proposed SGANet Model

The overall structure of SGANet is shown in Figure 1. SGANet consists of four embedding layers: the user check-in distribution embedding layer, the regional hot spot embedding layer, the ancillary information embedding layer, and the information interaction embedding layer. We introduce the details of each layer in the following subsections.

3.1. Graph Attention Network

Graph neural networks have become one of the popular directions in the field of deep learning. As a representative network structure, the graph attention network introduces an attention mechanism to achieve better neighbor aggregation. By learning the weights of neighbor nodes, the graph attention network performs a weighted aggregation of neighbor nodes. Therefore, the graph attention network is not only robust to noisy neighbor nodes, but its attention weights also give the model a certain interpretability. Figure 2 shows a typical graph attention network [27].

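In order to calculate the attention value $e_{ij}$ between a pair of nodes $(i, j)$, the attention network needs to consider the influence of the two nodes at the same time, as shown in the following equation (notation following the standard formulation in [27]):

$$e_{ij} = a\left(W h_i, W h_j\right) \tag{1}$$

Here, $W$ is a projection matrix, and $h_i$ and $h_j$ are the representations of node $i$ and node $j$, respectively. The projected representations of nodes $i$ and $j$ are concatenated and mapped into a scalar by a learnable weight vector $\mathbf{a}$, as shown in the following equation:

$$e_{ij} = \mathrm{LeakyReLU}\left(\mathbf{a}^{\top}\left[W h_i \,\|\, W h_j\right]\right) \tag{2}$$

Then, the attention of all neighbors $\mathcal{N}_i$ of each node $i$ is normalized, and the attention weight $\alpha_{ij}$ is obtained after normalization, as shown in the following equation:

$$\alpha_{ij} = \frac{\exp\left(e_{ij}\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(e_{ik}\right)} \tag{3}$$

Equation (4) gives the complete aggregation step, where $\sigma$ is a nonlinear activation function:

$$h_i' = \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij} W h_j\right) \tag{4}$$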
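The attention computation described in this subsection can be illustrated with a minimal single-head sketch in plain Python. It assumes the standard formulation of [27]: a shared projection matrix `W`, a weight vector `a` applied to the concatenated projected features, a LeakyReLU on the raw score, and a softmax over each node's neighbors. The function names, the dictionary-based graph layout, and the LeakyReLU slope of 0.2 are illustrative choices rather than details taken from SGANet, and the final output nonlinearity is omitted.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with an assumed negative slope of 0.2."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """Single-head graph attention (illustrative sketch).

    features:  dict node -> input feature vector
    neighbors: dict node -> list of neighbor nodes (include the node itself
               if self-attention is wanted)
    W:         projection matrix, shape (out_dim, in_dim)
    a:         attention weight vector, length 2 * out_dim
    Returns a dict node -> (attention weights, aggregated vector).
    """
    projected = {i: matvec(W, h) for i, h in features.items()}
    output = {}
    for i, nbrs in neighbors.items():
        # Raw score: concatenate both projected vectors and map to a scalar.
        scores = [
            leaky_relu(sum(x * y for x, y in zip(a, projected[i] + projected[j])))
            for j in nbrs
        ]
        # Softmax over the neighborhood (shifted by the max for stability).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Weighted aggregation of the projected neighbor features.
        dim = len(projected[i])
        aggregated = [
            sum(alpha * projected[j][d] for alpha, j in zip(alphas, nbrs))
            for d in range(dim)
        ]
        output[i] = (alphas, aggregated)
    return output
```

With an all-zero weight vector `a`, every neighbor receives the same weight and the layer reduces to mean aggregation, which is a convenient sanity check.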

3.2. User Check-In Distribution Embedding

In SGANet, we use a transition matrix to map the user-POI interactions into the feature space and thereby obtain a latent representation of the POI. The input of the model is a user check-in vector characterized by multihot . When is 1, it means that user has visited the POI at time . The mapping process is shown in the following equation.

Here, represents the latent representation vector of user . and , respectively, represent the weight matrix and the bias vector.
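As a rough illustration of this mapping, the sketch below builds a multi-hot check-in vector and applies an affine transformation to it. The helper names `multi_hot` and `linear_embed` are hypothetical, POI are assumed to be indexed from 0, and no activation function is applied because the text does not state which one the model uses.

```python
def multi_hot(visited_poi_ids, num_pois):
    """Return a vector with 1.0 at every visited POI index, 0.0 elsewhere."""
    vec = [0.0] * num_pois
    for poi in visited_poi_ids:
        vec[poi] = 1.0
    return vec


def linear_embed(x, W, b):
    """Affine map W x + b; W has shape (embed_dim, num_pois)."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + bias
        for row, bias in zip(W, b)
    ]


# Example: a user who visited POI 0 and POI 2 out of four POI.
x = multi_hot([0, 2], num_pois=4)
user_vector = linear_embed(x, [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]], [0.5, -1.0])
```

Because the input is multi-hot, the product simply sums the columns of the weight matrix that correspond to visited POI, so each column acts as a learned POI embedding.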

3.3. Regional Hot Spot Embedding

By analyzing the check-in records of users in real datasets, we find that a user’s check-in history is often clustered: people are more inclined to visit POI that are near the POI they have already visited, and the overall distribution resembles a multivariate Gaussian distribution. Across the whole dataset, the check-in distribution characteristics of each user are different, but the regional hot spots may be similar. To extract these features, this paper introduces the session window mechanism. By observing users’ check-in histories, we find that user behavior occurs in segments and that the behavior within each segment is continuous and compact. The correlation of behavior within a segment is much greater than the correlation of behavior between segments. We regard each segment of user behavior as a session, and the gap between segments is called the session gap. We therefore segment the user’s behavior stream according to the session window and calculate the result of each session, which finally yields the user’s preference for regional POI.

Here, represents the user’s preference for area .

Then, we concatenate the user’s check-in distribution embedding vector and regional hot spot embedding vector:

Finally, we use the graph attention network of equation (4) to aggregate user check-in distribution embedding vector and regional hot spot embedding vector. We obtain the user’s embedded representation .
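The session-window segmentation described in this subsection can be sketched as follows. This is only an illustration under stated assumptions: check-ins are `(unix_timestamp, region_id)` pairs, a new session starts whenever the time since the previous check-in exceeds a fixed gap (six hours here, an arbitrary value), and the regional preference is taken to be the normalized number of sessions in which each region appears. The paper's actual gap value and preference formula are not reproduced here.

```python
from collections import Counter


def split_sessions(checkins, gap_seconds=6 * 3600):
    """Group (timestamp, region_id) check-ins into sessions.

    A check-in joins the current session if it occurs within gap_seconds
    of the previous check-in; otherwise it opens a new session.
    """
    sessions = []
    for ts, region in sorted(checkins):
        if sessions and ts - sessions[-1][-1][0] <= gap_seconds:
            sessions[-1].append((ts, region))
        else:
            sessions.append([(ts, region)])
    return sessions


def regional_preference(sessions):
    """Share of session appearances per region (assumed preference measure)."""
    counts = Counter()
    for session in sessions:
        # Count each region at most once per session.
        for region in {r for _, r in session}:
            counts[region] += 1
    total = sum(counts.values())
    return {region: c / total for region, c in counts.items()}
```

For example, `regional_preference(split_sessions(user_checkins))` returns a dictionary whose values sum to 1 for a hypothetical list `user_checkins`, which can then serve as a regional hot spot feature vector.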

3.4. Ancillary Information Embedding

In this section, we introduce two methods of ancillary information embedding.

(1) POI Spatiotemporal Information Embedding. To exploit the geographical distance between POI, we use a Gaussian kernel function to extract the neighbor-aware influence of the checked-in POI:

Here, and are the geographic coordinates of two POI at which the user has checked in. By calculating the Gaussian kernel value of each POI pair, we obtain the Gaussian kernel value vector.

(2) POI Score and Number of Visits Embedding. We preprocess the LBSN dataset to obtain the user scores and the numbers of visits, which are normalized by softmax so that the user’s preference for the checked-in POI is easier to characterize:

We calculate the softmax probabilities of each user’s scores and numbers of visits separately to obtain the vectors and of probability values. Combining the above ancillary information, we get the latent representation vector of the ancillary information , see the following equation:
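The two ancillary features described in this subsection can be illustrated with a short sketch. It assumes that the distance between two POI is the great-circle (haversine) distance in kilometers, that the kernel has the usual form exp(-d^2 / (2 * sigma^2)) with a hypothetical bandwidth `sigma_km`, and that scores and visit counts are normalized with a plain softmax. The distance measure and bandwidth actually used in SGANet are not stated in the text, so these are placeholders.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def gaussian_kernel(p, q, sigma_km=1.0):
    """Gaussian kernel value of a POI pair; sigma_km is an assumed bandwidth."""
    d = haversine_km(p, q)
    return math.exp(-(d * d) / (2 * sigma_km ** 2))


def softmax(values):
    """Numerically stable softmax over a list of numbers."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

The kernel equals 1 for identical coordinates and decays toward 0 as the distance grows, so nearby POI pairs contribute most to the resulting kernel vector.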

3.5. Information Interaction Embedding

Among a user’s check-ins, some POI characterize the user’s preferences better than others, and these representative POI contribute more to the user’s implicit feedback. Therefore, we introduce the GRU, a variant of the recurrent neural network, to learn the long-term and short-term preferences of users. Compared with the LSTM, the GRU combines the forget gate and the input gate into a single update gate and merges the cell state and the hidden state, which makes the model simpler and easier to train. We integrate the user embedding with the POI ancillary information embedding.

Through GRU training, we can get the implicit representation according to the following equations.

Here, is the forget gate, remembers the current state of the moment, forgets some dimensional information in the past and adds some dimensional information by the current node.
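For reference, one step of a standard GRU cell can be sketched in plain Python as follows. This is the textbook formulation with an update gate `z`, a reset gate `r`, and a candidate state, using the convention h_t = (1 - z) * h_{t-1} + z * candidate. The parameter names (`Wz`, `Uz`, `bz`, and so on) are assumptions made for this sketch, and the gating equations used in SGANet may differ in notation or convention.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, U, h, b):
    """Compute W x + U h + b for matrices given as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def gru_step(x, h_prev, p):
    """One GRU step; p maps parameter names to weight matrices and biases."""
    # Update gate: how much of the new candidate replaces the old state.
    z = [sigmoid(v) for v in affine(p["Wz"], x, p["Uz"], h_prev, p["bz"])]
    # Reset gate: how much of the old state feeds the candidate.
    r = [sigmoid(v) for v in affine(p["Wr"], x, p["Ur"], h_prev, p["br"])]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    candidate = [math.tanh(v) for v in affine(p["Wh"], x, p["Uh"], gated, p["bh"])]
    # Interpolate between the previous state and the candidate.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, candidate)]
```

Feeding a sequence of fused user and ancillary vectors through `gru_step` one element at a time carries the hidden state across the sequence, which is how a recurrent unit can retain longer-term information while still reacting to recent inputs.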

Finally, the final predicted value is obtained through the three bottleneck layers. The following equations show the computation process:

, , and are the output vector of the three bottleneck layers, is the potential representation vector of the POI ancillary information, and is the predicted value we ultimately want.

3.6. Loss Function

In this paper, we use a combined regularization term, and the objective function of the proposed model is shown in the following equation.

And,

is a regularization parameter, is a parameter learned by the aggregation layer, and is a judgment condition: . To minimize the objective function, the gradients of all parameters are computed by backpropagation and the parameters are updated by gradient descent. This paper uses the Adam optimizer, which adapts the learning rate automatically during training.

4. Experiments

In this section, we conduct a series of experiments on two real-world datasets to verify the effectiveness of the proposed method. We first introduce the datasets, the evaluation metrics, and the parameter settings. Then, we compare the proposed method with related algorithms and analyze the experimental results.

4.1. Dataset Description

In this paper, we evaluate the proposed model on two LBSN datasets, Gowalla and Foursquare. Each check-in record in the datasets includes the user ID, the POI ID, the POI latitude and longitude, the check-in timestamp, the POI score, and the number of POI visits. Users rate the visited POI on a scale from 1 to 5, and the timestamp is expressed in UNIX format. The statistics of the datasets after preprocessing are shown in Table 1.

4.2. Performance Metrics

In this paper, we use Precision@k, Recall@k, F1-score@k, and MAP@k to evaluate the model. For each user, Precision@k is the percentage of the top-k recommended POI that the user actually visited, and Recall@k is the percentage of the POI visited by the user that appear among the top-k recommended POI. F1-score@k is the harmonic mean of Precision@k and Recall@k and therefore takes both into account; a higher F1-score@k indicates a more effective model. MAP@k is the mean average precision of the top-k recommended POI, where the average precision for a user is the average of the precision values at the ranks of the correctly recommended POI.
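The ranking metrics in this subsection can be computed per user as in the following sketch. It uses the common definitions of precision, recall, and F1 at a cutoff k, and it normalizes average precision by min(k, number of relevant POI), which is one of several conventions in the literature and is an assumption here. The recommendation list is assumed to contain no duplicates, and MAP@k is obtained by averaging `average_precision_at_k` over all test users.

```python
def precision_recall_f1_at_k(recommended, relevant, k):
    """Precision@k, Recall@k, and F1@k for one user.

    recommended: ranked list of POI ids (best first, no duplicates)
    relevant:    set of POI ids the user actually visited in the test data
    """
    hits = sum(1 for poi in recommended[:k] if poi in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def average_precision_at_k(recommended, relevant, k):
    """Average precision at cutoff k for one user."""
    hits = 0
    score = 0.0
    for rank, poi in enumerate(recommended[:k], start=1):
        if poi in relevant:
            hits += 1
            score += hits / rank  # precision at this hit's rank
    denom = min(k, len(relevant))
    return score / denom if denom else 0.0
```

For a ranked list `["a", "b", "c", "d"]` and visited set `{"a", "c", "e"}` with k = 4, two of the four recommendations are hits, so precision is 0.5 and recall is 2/3.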

4.3. Parameter Setting

In our experiment, is the function, is the ReLU function, and is the sigmoid function. The learning rate and the regularization parameter are both set to 0.001. The batch size is set to 128 for the Gowalla dataset and to 256 for the Foursquare dataset. The size of the bottleneck layer is set to . The dropout rate is set to 0.6. The experimental hardware is as follows: an Intel i7-8700K CPU, 48 GB of memory, and a GeForce GTX 1080 Ti GPU. The model is implemented with the PyTorch 1.2.0 framework.

4.4. Baseline Algorithms

To demonstrate the effectiveness of the proposed method, we compare it with the following POI recommendation models:

WRMF [33]. Weighted regularized matrix factorization, which assigns different confidence values to visited and unvisited POI on top of matrix factorization and minimizes the squared error loss.

BPRMF [34]. Bayesian personalized ranking matrix factorization, which optimizes the pairwise preference ranking between visited and unvisited POI.

RankGeoFM [35]. A ranking-based geographical factorization model that learns the user’s preference ranking of POI and incorporates the geographical influence of neighboring POI.

PACE [36]. Preference and context embedding, a deep neural network architecture that jointly learns the embeddings of users and POI to predict both the user’s preference for POI and the various contexts associated with users and POI.

SAE-NAD [37]. A model with a self-attention encoder and neighbor-aware decoder for implicit feedback.

APOIR [17]. An adversarial model in which a recommender and a discriminator are trained against each other to learn the distribution of users’ implicit preferences.

SSANet [38]. A model that integrates POI ancillary information to learn user preferences through a self-attention mechanism.

4.5. Performance Analysis

Figures 3-10 show the performance comparison between our model and the other models on the Foursquare and Gowalla datasets. The top-k on the abscissa denotes the number of top-ranked POI recommended by each model, and the ordinate shows the value of the evaluation metric used in the experiments.

From these figures, we can observe that our model achieves better performance on most evaluation metrics on the two datasets. Take the Foursquare dataset as an example (Figures 3, 4, 5, and 6): (1) compared with the SSANet model [38], SGANet achieves improvements of 5.0% in , 3.6% in , 4.7% in , and 7.7% in . (2) RankGeoFM has outstanding performance among the traditional models. Compared with it, SGANet achieves improvements of 32.3% in , 36.7% in , 33.9% in , and 36.9% in . It can be seen that the performance of our model is significantly improved. This is because SGANet makes full use of the graph attention network to extract regional hotspot preferences and captures the user’s preference for POI pairs and regional POI within the model. At the same time, it uses ancillary information to learn from implicit feedback and uses the GRU recurrent neural network to capture the dependencies in the POI sequence, from which it learns the long-term and short-term preferences of users.

The experimental results also show that RankGeoFM performs best among the traditional recommendation models, even surpassing the deep learning method APOIR, and is comparable to PACE in most experiments. The results of the APOIR model are not ideal, which suggests that training a POI recommendation model with generative adversarial techniques alone cannot learn the latent intent representation in the user’s check-in sequence.

5. Conclusions

In this paper, we focus on location-based services in mobile crowdsourcing systems and mobile social networks. Specifically, we propose a novel POI recommendation model, SGANet, to improve the recommendation performance in the face of the data sparsity and cold start problems. SGANet considers not only the user’s preference for a single POI but also the user’s preference for regional hotspots. At the same time, it mines the POI ancillary information and recommends the top-k POI with the highest satisfaction for users. SGANet constructs user POI embeddings and regional hotspot embeddings from the user’s check-in records to learn the nonlinear user-POI relationship. A graph attention network is introduced to capture the user’s preference for regional POI within the model. SGANet further learns the long-term and short-term preferences of users by fusing POI ancillary information through the GRU recurrent neural network. A series of comparative experiments on two real-world datasets, Foursquare and Gowalla, show that the proposed recommendation model is accurate and effective, and they also indicate that auxiliary regional information can improve the accuracy of recommendation and effectively alleviate the data sparsity and cold start problems.

In future work, we will explore more auxiliary information, such as the social relationships among users and POI categories, to further improve the POI recommendation performance. In addition, we will extend our model to a federated learning framework to protect the location privacy of users in POI recommendation.

Data Availability

The datasets used in this paper are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Key R&D Program of China under Grant No. 2020YFB1710200.