Abstract

With the improvement of living standards, more and more people are pursuing personalized travel routes. This paper mines the points of interest (POI) of ethnic minority tourism demand groups, extracts customer data features from social networks, and constructs three data features: interest topic factors, geographic location factors, and user access frequency factors. LDA topic models and matrix decomposition models are used to vectorize user sign-in records, and a deep learning recommendation model (DLM) is built on these features. The model is compared with traditional recommendation models and with recommendation models that use a single data feature module. The experimental results show the following: (1) The fitting error of the DLM recommendation results is significantly reduced, and its recommendation accuracy rate is 50% higher than that of traditional recommendation algorithms, which indicates that the DLM constructed in this paper has good learning and training performance and a good recommendation effect. (2) The DLM performs significantly better than other POI recommendation methods in terms of both accuracy and recall rate. The accuracy rates of the top five, top ten, and top twenty recommended POIs are increased by 9.9%, 7.4%, and 7%, respectively, and the recall rates are increased by 4.2%, 7.5%, and 14.4%, respectively.

1. Introduction

As the self-service travel group continues to grow, more and more users gather the travel information they want from the internet [1]. Personalized ethnic minority tourism route recommendation refers to generating, for each user, ethnic minority tourism routes that meet the user's travel conditions based on the user's personalized factors. However, with the rapid development of technologies such as cloud computing and big data, the scale of data has grown explosively, and users faced with massive amounts of network data cannot quickly select the information they need. The typical solutions to this information overload problem are search engines and personalized ethnic minority travel route recommendation systems.

The personalized minority travel route recommendation algorithm is the core of the recommendation system. Traditional recommendation algorithms mainly include collaborative filtering, content-based recommendation, and hybrid recommendation. Collaborative filtering is the most widely used. Its principle is to use the interaction information between users and items to make recommendations. Its advantage is that there is no need for complex feature modeling of users or items; only the users' historical feedback data is required, so it is simple and efficient. Its disadvantage is that it suffers from serious data sparseness and cold start problems. The principle of a content-based recommendation algorithm is to use the items already selected by the user to find items with similar attributes for recommendation [2]. The content-based recommendation algorithm does not suffer from sparse scoring data, and it avoids the cold start problem for new items. Its disadvantage is that complex feature engineering is required for feature extraction: multimedia data such as images and audio are often difficult to extract features from, and the cold start problem for new users remains [3].

A hybrid recommendation algorithm refers to the fusion of multiple recommendation algorithms to achieve a better recommendation effect. Common combination strategies include weighted combination, result mixing, feature combination, cascade, and so on. This combination not only enhances the feature crossover ability of the model but also effectively avoids the problems of combination explosion and high computational complexity [4].

Deep learning originated from the study of artificial neural networks, and its concept was proposed by Hinton et al. in 2006 [5]. In recent years, deep learning technology has made breakthroughs in the fields of computer vision, natural language processing, and speech recognition, and it has also brought a new technological revolution to the research of recommender systems. Integrating deep learning into the recommendation system can effectively make up for the shortcomings of the traditional personalized ethnic minority travel route recommendation model and improve the quality of recommendation results. Wang et al. exploited the influence of the image features shared in tourists' historical tours on POI recommendation and proposed a POI recommendation system based on image content enhancement [6]. Zheng et al. proposed a deep CNN recommendation model that extracts different features with two parallel convolutional neural networks. One network extracts user behavior pattern features, and the other extracts features from the items that users have reviewed in the past. The features extracted by the two networks are then fused for the recommendation. This method has achieved good results in the field of item recommendation [7]. Shen et al. constructed an e-learning recommendation system by using convolutional neural networks to extract features from text information [8]. Tang et al. constructed a sequential recommendation model with a convolutional neural network [9]. Lei et al. proposed a deep learning model for image recommendation based on convolutional neural networks [10].

This chapter proposes a deep neural network recommendation framework that integrates a DNN [11] with the LDA topic model and the matrix factorization algorithm. Word embedding technology is used to integrate user preference features, geographic factor features, and probabilistic topic features from social networks [12] into the minority tourist route recommendation task. The neural network learns the high-level interactions between these features and then makes personalized recommendations to users.

2.1. LDA Topic Model

Latent Dirichlet allocation (LDA) is a probabilistic topic model proposed by Blei et al. in 2003. LDA is an unsupervised machine learning technique that can identify the hidden topic information in a large-scale document set or corpus. The method assumes that each word is drawn from a hidden topic behind it.

In Figure 1, the shaded circle represents an observable variable, an unshaded circle represents a latent variable, an arrow represents the conditional dependence between two variables, and a box represents repeated sampling, with the number of repetitions shown in its lower right corner. The diagram corresponds to the generative process of LDA.
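The generative process that the plate diagram describes can be sketched in a few lines. The following is a minimal illustration under assumed settings, not the configuration used in this paper: the function name `generate_corpus`, the prior values `alpha` and `beta`, and the corpus sizes are all hypothetical.

```python
import numpy as np

def generate_corpus(n_docs, doc_len, n_topics, vocab_size,
                    alpha=0.5, beta=0.5, seed=0):
    """Sample a toy corpus from the LDA generative process (illustrative only)."""
    rng = np.random.default_rng(seed)
    # One word distribution per topic, drawn from a symmetric Dirichlet prior.
    phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
    docs = []
    for _ in range(n_docs):
        # Per-document topic mixture (latent variable).
        theta = rng.dirichlet(np.full(n_topics, alpha))
        # Latent topic assignment for every word position.
        z = rng.choice(n_topics, size=doc_len, p=theta)
        # Observed words: each one is drawn from the word distribution of its topic.
        words = np.array([rng.choice(vocab_size, p=phi[t]) for t in z])
        docs.append(words)
    return phi, docs

phi, docs = generate_corpus(n_docs=5, doc_len=12, n_topics=3, vocab_size=20)
```

Only the word ids in `docs` are observed; inference in LDA runs this process in reverse to recover the topic mixtures and the topic-word distributions.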

2.2. Matrix Factorization Algorithm

The matrix factorization algorithm is the core algorithm behind the winning solutions of the Netflix Prize recommendation competition, which was launched in 2006. It has a pivotal position in the history of recommendation systems and has contributed to their large-scale development and industrial application. Let $A$ be the set of all pairs $(u, i)$ for which a score exists, where $u$ represents the user and $i$ represents the subject matter.

Matrix decomposition embeds the user $u$ and the subject $i$ into a $k$-dimensional implicit feature space as the vectors

$$p_u \in \mathbb{R}^{k}, \qquad q_i \in \mathbb{R}^{k}.$$

Then the prediction score of user $u$ on the subject $i$ is

$$\hat{r}_{ui} = p_u^{T} q_i.$$

The error between the true value $r_{ui}$ and the predicted value is

$$\Delta r = r_{ui} - \hat{r}_{ui}.$$

The more accurate the prediction, the smaller $\|\Delta r\|$. If the sum of these errors over all rated pairs $(u, i) \in A$ is as small as possible, there is reason to believe that the prediction is accurate. Matrix factorization can therefore be transformed into a machine learning problem, that is, an optimization problem of finding the minimum value:

$$\min_{p, q} \sum_{(u, i) \in A} \left( r_{ui} - p_u^{T} q_i \right)^{2}.$$
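As an illustration of this minimization, the sketch below fits the latent vectors with stochastic gradient descent on the squared error over the rated pairs. The optimizer, the regularization term `reg`, the learning rate, and the toy ratings are assumptions made for the example; the paper does not state which solver it uses.

```python
import numpy as np

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.01,
              epochs=500, seed=0):
    """Fit user vectors P and item vectors Q so that P[u] @ Q[i] approximates r."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # true score minus predicted score
            p_old = P[u].copy()            # keep the old value for the Q update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

def squared_error(ratings, P, Q):
    """Sum of squared prediction errors over all rated (user, item) pairs."""
    return sum((r - P[u] @ Q[i]) ** 2 for u, i, r in ratings)

# Hypothetical (user, item, score) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = factorize(ratings, n_users=3, n_items=3)
```

After training, `P[u] @ Q[i]` gives a predicted score for any pair, including pairs that were never rated.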

2.3. DNN Model

The fully connected neural network is the simplest type of neural network, but it also has the most network parameters and the largest amount of calculation. The layers inside a DNN can be divided into three categories: input layer, hidden layer, and output layer. Generally speaking, the first layer is the input layer, the last layer is the output layer, and all layers in between are hidden layers, as shown in Figure 2.

The layers are fully connected, that is, every neuron in the $i$th layer is connected to every neuron in the $(i+1)$th layer. Although the DNN looks complicated, each local unit is still the same as the perceptron, that is, a linear relationship $z = \sum_j w_j x_j + b$ followed by an activation function $\sigma(z)$. The commonly used activation functions are (1) the sigmoid function; (2) the tanh function; (3) the softmax function; and (4) the ReLU function.

2.3.1. Sigmoid Activation Function

The sigmoid function is defined as $\sigma(x) = 1/(1 + e^{-x})$ and maps any real input to the interval (0, 1). It has three main defects: (a) When the neural network uses the sigmoid activation function, a neuron whose output is close to 0 or 1 has a gradient close to 0 during backpropagation. If many neurons in a large network are in this saturated state, gradients can barely propagate backward, and the network cannot be learned and optimized. The effective range of the sigmoid function is roughly (−4, 4); outside this range the gradient vanishes. (b) The sigmoid output is not centered at zero, which easily causes the outputs of subsequent layers to shift away from the center point. (c) Compared with other nonlinear activation functions, the sigmoid function is computationally expensive.

2.3.2. Tanh Activation Function

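The tanh function is defined as $\tanh(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})$; its output lies in (−1, 1) and is centered at zero. However, the tanh function also suffers from the vanishing gradient problem and high computational cost.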

2.3.3. Softmax Activation Function
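For reference, the softmax function in its standard textbook form maps a score vector $z = (z_1, \dots, z_K)$ to a probability vector. The definition below is the general one and is not a formula recovered from this section:

$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K.$$

The outputs are positive and sum to 1, which is why the function is commonly placed in the output layer of a classification network.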

2.3.4. ReLU Function

The ReLU function is defined as $f(x) = \max(0, x)$. Like the sigmoid activation function, its output is not centered at zero, which causes the center to shift. In addition, if $x < 0$ during the forward pass, the neuron remains inactive and “kills” the gradient in the backward pass. In this way, the weights cannot be updated and the network cannot learn.
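The behavior of the four activation functions discussed in this section can be checked numerically. The sketch below uses their standard definitions and derivatives; the function names are chosen for this example only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)           # at most 0.25, close to 0 when |x| is large

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2   # also close to 0 when |x| is large

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (np.asarray(x) > 0).astype(float)   # exactly 0 for negative inputs

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())        # subtract the maximum for numerical stability
    return e / e.sum()

xs = np.array([-10.0, -4.0, 0.0, 4.0, 10.0])
print(sigmoid_grad(xs))   # gradients shrink rapidly outside roughly (-4, 4)
print(relu_grad(xs))      # negative inputs pass no gradient backward
```

The sigmoid and tanh gradients shrink toward zero for large inputs (saturation), while the ReLU gradient is exactly zero for negative inputs, which is the inactive-neuron effect described above.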

2.4. Data Vectorization

Social network data contains rich user access information and POI information. Finding effective features in the user's historical check-in data and integrating them into the user preference model is the key to improving personalized ethnic minority travel route recommendations. This section analyzes three kinds of features, namely theme features, geographic factor features, and user access features, and determines the effective features that affect users' travel preferences. Better user preferences are then extracted from these effective features to improve the accuracy of personalized ethnic minority travel route recommendations.

2.4.1. Theme Feature Analysis

In the check-in records of users on social networks, different users often have different theme preferences, that is, everyone has their own preferred themes. Even for check-in locations under the same theme, different users have their own visiting habits. For example, user $u$ loves shopping and often visits major shopping malls, while user $v$ is a food lover and is not keen on shopping. User $u$'s preference for shopping malls is then higher than that of user $v$. That is, foodies may be more interested in gourmet POI, while shopaholics are more concerned with shopping mall POI. The theme characteristics of a user's visits are analyzed from the user's historical check-in and visit records in the social network, and the resulting theme preference features can fully reflect the user's personalized preferences. Therefore, the topic features can be integrated into the personalized recommendation model, and the user's preference for different topics of POI can be used as one of the criteria for measuring the user's preference for a POI. From the set of POI names in the user's historical visit records, the topic vector features of each user are obtained; they characterize the user's preference for different topics of POI and enable deep learning to better learn the topic characteristics. The probabilistic topic model mines the personal preference for POI from the user's historical visit frequency to different topics. Given the theme set $T$ and the set $H$ of all keywords appearing in the check-in records, the LDA model is used to obtain the keyword distribution $\phi_t$ corresponding to each theme $t \in T$.

Here $\phi_t$ represents the distribution over the keywords in $H$ that belong to the topic $t$. For a user $u$, every social network record has its own probability distribution over the topic set $T$. This distribution describes the probability that the record belongs to each theme in $T$ and represents the degree of relevance between the point of interest (POI) and the theme.

For a check-in record $c$ in the check-in set of a certain user in the data set, the probability $P(t \mid c)$ that the check-in record belongs to the topic $t$ is obtained according to formula (11). The topic vector feature of the check-in record reflects the user's personalized access preferences for different topics.

$$P(t \mid c) = \frac{n_{c,t} + \alpha}{n_{c} + |T|\,\alpha} \quad (11)$$

where $n_{c,t}$ represents the number of keywords belonging to $t$ in the check-in record $c$, $\alpha$ represents the symmetric Dirichlet prior, generally $\alpha = 0.1$, and $n_{c}$ and $|T|$ represent the number of keywords in the check-in record $c$ and the number of potential topics, respectively.
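The topic probability of a check-in record can be illustrated as a Dirichlet-smoothed share of keywords per topic. The sketch below assumes that the probability is the smoothed count of keywords of a topic divided by the smoothed total number of keywords in the record, which is how the symbols are described above; the function name and the keyword counts are hypothetical.

```python
def topic_probabilities(topic_counts, alpha=0.1):
    """Smoothed probability that a check-in record belongs to each topic.

    topic_counts[t] is the number of keywords in the record assigned to topic t;
    alpha is the symmetric Dirichlet prior (0.1 is the value quoted in the text).
    """
    n_topics = len(topic_counts)
    total = sum(topic_counts)
    return [(n + alpha) / (total + n_topics * alpha) for n in topic_counts]

# A record with 4 keywords: 3 on a "shopping" topic, 1 on "food", 0 on "parks".
probs = topic_probabilities([3, 1, 0])
```

The smoothing term keeps topics with zero keywords at a small nonzero probability, so the resulting topic vector is always a valid distribution that sums to 1.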

2.4.2. Geographical Factor Analysis

In the check-in records of social networks, the user's historical TOP-K check-in POI are first calculated from all the historical check-in records of each user. The user's access frequency to a POI can indirectly reflect the user's personal preference. The TOP-K POI and all historical records are visualized and analyzed using ArcGIS tools. All points of interest in the user's TOP-10 are distributed around the centers of the user's activity, that is, the user's activities are basically carried out around these centers. It can be concluded that the user's historical activities are concentrated in the user's frequent area and that the user tends to visit POI close to this area. Therefore, geographic factor features can be integrated into the personalized recommendation model, and the user's preference for POI in different geographic locations can be used as one of the criteria for measuring the user's preference for a POI.

Geographic distance affects people's choice of places to visit. The closer a POI is to the center of the user's frequent activity area, the more likely the user is to choose it [13]. According to formula (12), the longitude and latitude in the user sign-in data are each normalized to an appropriate range and input into the model as the geographic factor feature of each POI.

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \quad (12)$$

where $x$ is the longitude or latitude of the POI, $x_{\max}$ and $x_{\min}$ are the largest and smallest values of the longitude or latitude in the location information, and $x'$ is the normalized value that is fed to the neural network as the POI geographic factor feature. Normalizing the latitude and longitude in the social network data also eliminates the influence of singular data on model training and increases the learning ability of the model.
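The normalization of coordinates can be sketched as a min-max rescaling applied separately to longitude and latitude. The target range [0, 1] and the sample coordinates are assumptions for illustration; the text only states that the values are scaled to an appropriate range using the largest and smallest values.

```python
import numpy as np

def min_max_normalize(values):
    """Rescale a 1-D array of longitudes or latitudes to the range [0, 1]."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # All POI share the same coordinate; avoid division by zero.
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

# Hypothetical POI coordinates (degrees).
latitudes = [39.90, 39.95, 40.00]
longitudes = [116.30, 116.40, 116.50]
geo_features = np.column_stack([min_max_normalize(longitudes),
                                min_max_normalize(latitudes)])
```

Each row of `geo_features` is then a two-dimensional geographic factor feature for one POI.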

2.4.3. Analysis of User Access Characteristics

In social networks, users do not explicitly rate POI as they would rate movies, so a check-in record does not directly show which POI the user likes. Even if the user has visited a POI, the user may not necessarily like it. User preference for a POI is therefore usually expressed as implicit feedback [14]. In social networks, the more frequently a user visits a certain POI, the more likely the user is to prefer it. User visit characteristics can thus be determined from visit frequency, and the relationship between users can still be analyzed based on the idea of collaborative filtering, that is, users with the same visit pattern may have similar interests and preferences. Therefore, the shared interests of similar users can be used as one of the criteria for measuring a user's preference for a POI.

Singular value decomposition is used to vectorize the user's sign-in data. After the user's sign-in record matrix is decomposed, the high-dimensional sparse data is transformed into low-dimensional user latent vectors. For users with similar sign-in records, the latent vectors obtained by matrix decomposition are closer in the vector space.

According to the user records in the social network, the user-POI sign-in matrix $R \in \mathbb{R}^{N \times M}$ is formed, where $N$ represents the number of all users in the city, $M$ represents the number of all POI in the city, and the entry $r_{ij}$ of the check-in matrix represents the total frequency of visits of user $i$ at POI $j$. According to formula (13), the historical user-POI visit matrix is decomposed.

$$R \approx U_k \Sigma_k V_k^{T} \quad (13)$$

where $k$ is the dimension of the latent space, the rows of $U_k$ correspond to users, and the rows of $V_k$ correspond to POI.
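A truncated singular value decomposition of the check-in frequency matrix can be sketched as follows. Splitting the singular values evenly between the user and POI factors is a convention chosen for this example and is not stated in the paper; the toy matrix is hypothetical.

```python
import numpy as np

def latent_vectors(checkins, k):
    """Return k-dimensional user and POI latent vectors from a truncated SVD."""
    U, s, Vt = np.linalg.svd(checkins, full_matrices=False)
    root = np.sqrt(s[:k])              # split each singular value between factors
    user_vecs = U[:, :k] * root        # shape (N users, k)
    poi_vecs = Vt[:k, :].T * root      # shape (M POI, k)
    return user_vecs, poi_vecs

# Rows are users, columns are POI, entries are visit frequencies.
R = np.array([[5.0, 0.0, 1.0],
              [5.0, 0.0, 1.0],
              [0.0, 4.0, 0.0]])
user_vecs, poi_vecs = latent_vectors(R, k=2)
```

In this toy matrix the first two users have identical check-in rows, so their latent vectors coincide, which illustrates the statement that users with similar sign-in records end up close together in the vector space.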

3. Personalized Minority Travel Route Recommendation Algorithm

3.1. DLM Recommendation Algorithm Framework

The DNN can adaptively learn high-level features and their interactions from the input of a specific task. Therefore, this chapter proposes to use a DNN deep learning model for personalized ethnic minority travel route recommendation. The overall framework of the model is shown in Figure 3. The DLM personalized recommendation model is divided into a feature vectorization module and a network training module. The feature vectorization module uses word embedding technology to extract and vectorize the POI features in social networks. The network training module includes a network connection layer and a network layer: the connection layer fuses the extracted feature vectors, and the network layer carries out DLM model training and predicts the user's preference score for ethnic POI.

The feature vectorization module uses word embedding technology to extract topic feature vectors, geographic factor feature vectors, and user-visit feature vectors from the user's social network data. The network connection layer in the network training module concatenates the vectors extracted by the feature vectorization module and sends them to the DNN training network. The connection layer ensures the scalability of the entire model: if other relevant context information needs to be added, it can be concatenated in the same way and sent to the network for training as part of the input layer. The network layer in the network training module has two functions: training and prediction. In the training stage, the DNN extracts and learns implicit features, and the feature vectors produced by the hidden layers are input to the softmax layer for classification learning. The DLM model transforms personalized ethnic minority travel route recommendation into a binary classification task, in which a user's check-in records are defined as positive samples and the personalized ethnic minority travel routes that the user has not visited are defined as negative samples. The output of the softmax layer is a two-dimensional probability vector P = [Q1, Q2], where Q1 represents the user's preference probability for a personalized ethnic minority travel route and Q2 represents the user's non-preference probability. A cross-entropy loss function is constructed from the network outputs and the positive and negative samples, and the gradient descent method is used to optimize it. In the prediction stage, user and personalized ethnic minority travel route information is input, the network outputs the probability vector P, the candidates are ranked by Q1, and the top k are recommended to the user.

3.2. DLM Model Learning Optimization

The connection layer of the DLM learning module concatenates the existing features and sends them to the neural network for learning. For any user-POI pair $\langle u, l \rangle$, its concatenated vector representation is shown in formula (14).

$$x_0 = \mathrm{Merge}(p_u, q_l, t_{ul}, g_l) \quad (14)$$

Among them, the features include the user-POI latent vectors $\langle p_u, q_l \rangle$, which carry the user preference feature, the topic feature $t_{ul}$, and the geographic factor feature $g_l$. Merge connects all the feature vectors into a one-dimensional vector and sends it to the model. The hidden layers of the model are calculated according to formula (15).

$$x_j = \mathrm{ReLU}(W_j x_{j-1} + b_j), \quad j = 1, \dots, L \quad (15)$$

where $L$ represents the number of hidden layers in the model, $W_j$ and $b_j$ are the weight and bias of the $j$th hidden layer, and the activation function used is ReLU. ReLU mitigates the problem of gradient disappearance in multi-layer network training, reduces its impact on model training, and keeps the model stable during iteration [15]. In addition, dropout is applied in the training of each hidden layer to prevent over-fitting. This effectively increases the generalization ability of the model, so that the model still has strong adaptability in deep network training. In the output layer of the model, the predicted access probability of the user to the POI is obtained, as shown in formula (16).

$$[\hat{y}_{ul},\; 1 - \hat{y}_{ul}] = \mathrm{softmax}(W_{out} x_L + b_{out}) \quad (16)$$

Among them, $W_{out}$ is the weight value of the output layer, and $b_{out}$ is the bias value of the output layer. The output of the softmax layer is two probability values, which respectively represent the possibility of the user accessing the POI and the possibility of the user not accessing it. Here, cross-entropy is used as the loss function of model adjustment, as shown in formula (17).

$$\mathcal{L} = -\sum_{\langle u, l \rangle} \left[ y_{ul} \log \hat{y}_{ul} + (1 - y_{ul}) \log (1 - \hat{y}_{ul}) \right] \quad (17)$$

where $y_{ul} = 1$ for a positive sample and $y_{ul} = 0$ for a negative sample.
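The forward computation described in this section (feature concatenation, ReLU hidden layers with dropout, a two-way softmax output, and a cross-entropy loss) can be sketched with NumPy. This is a minimal illustration under assumed layer sizes, initialization, and feature dimensions, not the authors' implementation; all function names and the random inputs are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def init_params(sizes, seed=0):
    """One (weight, bias) pair per layer; sizes = [input, hidden..., 2]."""
    rng = np.random.default_rng(seed)
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, latent, topic, geo, drop_rate=0.0, rng=None):
    # Connection layer: merge all feature vectors into one input vector per pair.
    h = np.concatenate([latent, topic, geo], axis=1)
    for W, b in params[:-1]:
        h = relu(h @ W + b)
        if drop_rate > 0.0 and rng is not None:    # dropout, training only
            mask = rng.random(h.shape) >= drop_rate
            h = h * mask / (1.0 - drop_rate)
    W, b = params[-1]
    return softmax(h @ W + b)      # column 0: visit, column 1: no visit

def cross_entropy(probs, labels):
    """labels[i] is 0 for a visited (positive) pair and 1 for a negative pair."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + 1e-12))

rng = np.random.default_rng(1)
latent = rng.standard_normal((4, 4))   # user and POI latent vectors, concatenated
topic = rng.random((4, 3))             # topic feature per user-POI pair
geo = rng.random((4, 2))               # normalized longitude and latitude
params = init_params([9, 8, 4, 2])
probs = forward(params, latent, topic, geo)
loss = cross_entropy(probs, np.array([0, 1, 0, 1]))
```

Training would minimize `loss` with gradient descent; the backward pass is omitted here to keep the sketch short.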

The number of terms in the summation of the loss function depends on the dimension of the input data. The model is optimized by minimizing formula (17). After training, the candidate POI are sorted by the predicted probability that the user may visit them; the greater the probability, the more likely the user is to visit. Once the value of k, the number of recommended POI, is determined, the corresponding top-k prediction results are recommended to the user.
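The final ranking step amounts to sorting candidate POI by predicted visit probability and keeping the first k. A minimal sketch, with hypothetical POI identifiers and probabilities:

```python
import numpy as np

def recommend_top_k(poi_ids, visit_probs, k):
    """Return the k POI with the highest predicted visit probability."""
    # Negate the probabilities so that argsort yields descending order;
    # the stable sort keeps the input order among ties.
    order = np.argsort(-np.asarray(visit_probs), kind="stable")[:k]
    return [poi_ids[i] for i in order]

top2 = recommend_top_k(["a", "b", "c", "d"], [0.2, 0.9, 0.5, 0.7], k=2)
```

Here `top2` contains the two candidates with the largest probabilities, in descending order of probability.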

The model training procedure of the DLM personalized minority travel route recommendation algorithm is shown in Table 1.

4. Simulation Experiment

This chapter designs and implements two groups of experiments on real data sets to verify the effectiveness of the proposed method: a comparison of the DLM algorithm's own feature combinations, and a comparison with existing methods. In the feature comparison experiment, different feature combinations within the DLM algorithm are compared. In the comparison with existing methods, the DLM algorithm is compared with existing POI recommendation algorithms. Finally, the results of the DLM algorithm and the other recommendation algorithms are analyzed.

4.1. Description of the Experimental Data Set

Users whose check-in locations are in Beijing were selected from the real Foursquare data set as the experimental data. Before use, the data was denoised and filtered: inconsistent records were deleted by checking the distance between the check-in point and the positioning point of the social network. The historical check-in records of the users were then randomly divided in a ratio of 8 : 2, with 80% forming the training set used for model training and 20% forming the test set used for model evaluation.
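The random 8 : 2 division can be sketched as a shuffled split. Whether the split is performed per user or over the pooled records is not stated in the text; the sketch below simply splits whatever list of records it is given, and the fixed seed and record format are assumptions for reproducibility of the example.

```python
import random

def split_checkins(records, train_ratio=0.8, seed=42):
    """Randomly split check-in records into a training and a test set."""
    shuffled = list(records)                 # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical (user id, POI id) check-in records.
records = [("u1", f"poi{i}") for i in range(10)]
train, test = split_checkins(records)
```

With ten records this yields eight training records and two test records, and every record appears in exactly one of the two sets.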

4.2. Description of the Comparison Algorithm

The traditional recommendation algorithms UCF, PMF, LCARS, Rank-GeoFM, and SGFM are selected for comparison with the recommendation model of this article. To study the optimization effect of the proposed algorithm, the control variable method is used to construct recommendation algorithms that each lack a certain data feature module, and the importance of each data feature is evaluated by comparing the recommendation effects.

UCF: This is a user-based collaborative filtering method that improves the efficiency of personalized ethnic minority travel route recommendations by taking into account the influence of interest among similar users.

PMF: This method explains the feasibility of matrix decomposition from the perspective of a probabilistic generative process and then makes recommendations to users.

LCARS: This method integrates topic features into the recommendation system and recommends travel routes to users by jointly considering personal interests and user preferences [16].

Rank-GeoFM: This is a ranking-based matrix decomposition method. It increases the amount of usable data in the user access matrix, which alleviates data sparseness, and uses matrix decomposition to recommend travel routes to users [17].

SGFM: Based on the social relationship and geographic influence between users, this method designs a travel route recommendation method based on social geographic factors [18].

DLM: This is the method proposed in this chapter. User preference features, topic factor features, and geographic factor features are all included in the feature fusion.

DLM_MF: This is a variant of the method proposed in this chapter in which only the user preference features obtained by matrix decomposition are included in the feature fusion.

DLM_MF + Geo: This is a variant of the method proposed in this chapter in which only the user preference features obtained by matrix decomposition and the geographic factor features are included in the feature fusion.

4.3. Evaluation Criteria

In this chapter, two widely used indicators are used to evaluate the performance of the different recommendation algorithms, namely, accuracy (precision) and recall (denoted Pre@N and Rec@N, respectively) [19], as shown below:

$$\mathrm{Pre@}N = \frac{1}{|U|} \sum_{u \in U} \frac{|\text{Top-}N(u) \cap K(u)|}{N}, \qquad \mathrm{Rec@}N = \frac{1}{|U|} \sum_{u \in U} \frac{|\text{Top-}N(u) \cap K(u)|}{|K(u)|}$$

Among them, $|U|$ represents the number of users, $N$ represents the number of recommended points, Top-$N(u)$ represents the list of the top $N$ points of interest recommended by the recommendation model to the target user $u$, and $K(u)$ represents the actual check-in list of the user in the test set, that is, the set of POI that the user has actually visited.
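The two indicators can be computed as user-averaged hit ratios. The sketch below assumes the usual definitions, in which the hits of a user are the recommended POI that also appear in the user's test check-ins, divided by N for Pre@N and by the number of test check-ins for Rec@N. The dictionaries and identifiers are hypothetical, and every user is assumed to have at least one test check-in.

```python
def precision_recall_at_n(recommended, visited, n):
    """Average Pre@N and Rec@N over all users.

    recommended[u] is the ranked recommendation list for user u;
    visited[u] is the non-empty set of POI the user visited in the test set.
    """
    users = list(recommended)
    pre = rec = 0.0
    for u in users:
        hits = len(set(recommended[u][:n]) & visited[u])
        pre += hits / n
        rec += hits / len(visited[u])
    return pre / len(users), rec / len(users)

recommended = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
visited = {"u1": {"a", "x"}, "u2": {"e", "f", "g", "h"}}
pre_at_2, rec_at_2 = precision_recall_at_n(recommended, visited, n=2)
```

In this toy case each user has one hit in the top two, so Pre@2 is 0.5, while Rec@2 is the mean of 1/2 and 1/4, that is, 0.375.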

4.4. Experimental Results and Analysis

By comparing the recommendation results of the three models DLM, DLM_MF, and DLM_MF + Geo, the impact of each social network feature on the recommendation results was studied. The comparison was made for Pre@5, Pre@10, and Pre@20, and the experimental results are shown in Figure 4.

Experiments show that the DLM method that incorporates all three features is better than the DLM_MF + Geo model. The recommendation effect of the DLM_MF + Geo model is in turn significantly better than that of DLM_MF, which shows that in the recommendation system, the selection of geographic features is more important than the selection of features that are of interest to users. The fact that DLM performs best shows that adding topic features to the recommendation system is more conducive to accurately positioning user needs.

The DLM method is compared with the other methods in terms of accuracy and recall rates on the Foursquare data set. The recommendation results are shown in Figures 5 and 6.

The results show that on the Foursquare data, the DLM algorithm is significantly better than the other personalized ethnic minority tourism route recommendation algorithms, which shows that a recommendation system based on deep learning is effective for recommending ethnic minority tourism routes. The user-based collaborative filtering algorithm (UCF) and the matrix factorization algorithm (PMF) have the worst recommendation effect, since neither makes use of geographic influence or other social network characteristics to make effective personalized recommendations. Compared with the UCF and PMF algorithms, the LCARS algorithm adds theme factor features, and its recommendation effect is effectively improved: in the case of Pre@5, its recommendation accuracy rate is 50% higher than that of UCF. Rank-GeoFM and SGFM also improve significantly owing to the addition of geographic factors: in the case of Pre@5, the recommendation accuracy rate is improved by 110% compared with UCF. In the task of personalized travel route recommendation, how to better integrate the characteristics of social networks is the key to improving the recommendation performance. The DLM model based on deep learning outperforms the traditional algorithms for two reasons: it integrates three data characteristics, and it benefits from the advantages of deep learning, which together make the recommendation more accurate.

5. Conclusion

This paper extracts features from user check-in data in social networks to construct data features of interest topic factors, geographic location factors, and user access frequency factors. The LDA topic model and the matrix decomposition model are used to vectorize the user check-in records. Word embedding technology and the DNN model are combined into the DLM model, which is compared with traditional recommendation models. The experimental results show the following: (1) In the personalized recommendation system of ethnic minority tourism routes, the geographic factor features of user records are more important to the recommendation effect than the topics and frequency of user visits. (2) The DLM model performs significantly better than other POI recommendation methods in terms of both accuracy and recall rate. The accuracy rates of the top five, top ten, and top twenty recommended POIs are increased by 9.9%, 7.4%, and 7%, respectively, and the recall rates are increased by 4.2%, 7.5%, and 14.4%, respectively.

Data Availability

The dataset can be accessed upon request.

Conflicts of Interest

The author declares that there are no conflicts of interest.

Acknowledgments

This study was supported by the Liaoning University of Science and Technology Fund (Research on the New Development of Rural Tourism Industry under the “Two-Wheel Drive” Integration of Agriculture and Tourism) (Item no. 2019RW08), Liaoning Provincial Department of Education Project (Research on the Composition and Influencing Factors of Retired People’s Tourism Happiness) (Item no. LJKR0127), and Liaoning Provincial Department of Education Project (Research on the Formation Mechanism and Influence of Tourist Awe Emotional Experience Based on Embodied Cognition) (Item no. 2020LNJC10).