Abstract

This paper proposes a personalized tourist interest demand recommendation model based on a deep neural network (DNN). First, the basic information and comment text of tourism service items are obtained by crawling relevant websites. Next, the text is segmented with the Jieba word segmentation tool and converted into word vectors with the Skip-gram model, which captures the semantic information in the data and addresses the problem of very high vector sparsity. The DNN then extracts the corresponding features, and the model predicts each user’s score on tourism service items and generates a personalized recommendation list. Finally, simulation experiments compare the recommendation accuracy and average reciprocal ranking of the proposed model with those of two other algorithms on three different data sets. The results show that the overall performance of the proposed algorithm is better than that of the two comparison algorithms.

1. Introduction

A service robot is an autonomous or semiautonomous robot [1] that performs useful service activities in place of human beings but does not engage in production work. Its role is to replace service personnel and provide the services people require. Service robotics draws on many fields, including mechanical engineering, automation, computer science, and control engineering [2-4]. With the continuous development of artificial intelligence, service robots are gradually becoming intelligent [5, 6]. As users increasingly choose personalized tourism services, the intelligent tourism service robot has become a research hot spot within intelligent service robotics [7, 8] and an innovative, forward-looking application of robots in the tourism industry.

At present, the traditional approach of relying on service personnel cannot meet people’s personalized needs [9, 10]. Reference [11] developed a robot partner for information support and proposed a method that flexibly recommends various kinds of information according to human intention. However, this method does not divide the user’s access sequence into interest segments according to time, so its recommendation accuracy is low. For the dynamic traveling repairman problem and the dynamic vehicle routing problem, Reference [12] considered time-varying requirements, sent available robots to the nearest service request location, dispatched multiple robots for each service request arriving in the system, and proposed a model-free operation strategy that is independent of the load factor. However, because this method has no model algorithm, it cannot be applied to more complex situations. To realize long-term autonomous operation, Reference [13] proposed a modular general software framework for intelligent mobile robots that can interact with humans through complex voice commands. However, this method considers only home service robots. Reference [14] introduced local attention and nonlinear attention to capture local and global item information at the same time and, on this basis, proposed a nonlinear attention similarity model (NASM) for item-based collaborative filtering through local attention embedding. However, the algorithm cannot accurately capture high-order sequential behavior, which makes complex recommendation difficult. Reference [15] proposed a personalized robot service system centered on robot thinking that migrates with the user’s geospatial movement and so continues to grow with the user. However, this method does not consider the user’s environment, and the personalized growth cycle is long.

Reference [16] constructed an intelligent robot control system based on the principle of human-computer interaction and designed a corresponding model-based control algorithm to identify the dynamic model of the robot. However, with this method it is difficult to obtain the prior distribution and to characterize the high-dimensional semantics of users. Reference [17] addressed the path-planning problem of hospital service robots in drug delivery, medical insurance orders, and other services. Based on an automatically controlled robot with visual recognition ability, it combined three-dimensional reconstructed images with a route area shunting method using edge computing and proposed an image edge detection algorithm based on three-dimensional features. However, this method suits only particular groups of people in specific areas and does not provide in-depth personalized service.

Based on the above analysis, this paper addresses personalized travel route recommendation for intelligent service robots in scenic spots and proposes a personalized tourist interest demand recommendation model based on deep learning and word embedding technology. The basic idea is to (1) reduce the sparsity of the data vectors and improve recommendation accuracy by preprocessing the original data and (2) build a deep prediction model that mines the relationship between users and scenic spots. Compared with traditional travel route recommendation methods for service robots, the contributions of the proposed method are as follows:
(1) The Skip-gram model of the word2vec word embedding method is used to convert the data into word vectors, which enables effective extraction of the topic feature vector, the geographic factor feature vector, and the user access feature vector.
(2) The proposed model uses a deep neural network to transform the recommendation of tourists’ interests and needs into a binary classification task, which improves the extraction of features from the original data and enhances the prediction of users’ ratings.

The rest of this paper is organized as follows: Section 2 introduces the personalized tourist interest demand recommendation model based on a deep neural network; Section 3 compares the proposed model with existing recommendation models to verify its feasibility and effectiveness; Section 4 concludes the paper.

2. Proposed Model

2.1. Overall Framework

Collaborative filtering is the most widely used technique in recommendation systems; it can be applied in many settings, including tourism service recommendation, with good results [18]. Although collaborative filtering has many advantages, such as good handling of unstructured data and a high degree of personalization and automation, it also suffers from data sparsity [19]. In a typical recommendation system, the items that users have scored are only a small fraction of all the data. Under these conditions, collaborative filtering often performs poorly: an item with few scores is difficult to recommend to other users, and a user who has scored very few items is difficult to generate recommendations for. Traditional collaborative filtering therefore struggles to achieve excellent results.

Therefore, this paper proposes a tourism service recommendation model based on deep learning to address these problems. The proposed model is divided into four modules: data acquisition and preprocessing, construction of the deep prediction model, network training, and generation of the final recommendation list. The principle and function of each module are shown in Figure 1:
(1) Data acquisition and preprocessing: this module obtains the basic information and comment text of tourism service items by crawling relevant websites and then preprocesses these data.
(2) Construction of prediction model: this module uses deep learning to predict users’ scores on tourism service items.
(3) Training network: this module uses sample data to train the model network, mine the latent relationship between users and tourism service items, and learn the interaction between them, so as to obtain a model capable of prediction.
(4) Generate personalized recommendation list: this module tests the experimental data. It feeds the experimental data into the trained model, which predicts users’ scores on tourism service items, sorts the items by score, and generates a personalized recommendation list for each user.

2.2. Data Acquisition and Preprocessing

The data come from three sources: a self-built database, data from Foursquare, and data from Tokyo [20, 21]. The self-built database is a data set built by collecting a large amount of travel information from MaFengWo, with about 390000 user travel access records. The Foursquare data consist of long-term (about 10 months) check-in data collected in New York from May 2013 to March 2014; users with fewer than 8 access records and locations with fewer than 8 visits are filtered out. This leaves 1158 users and 5092 locations for the experiment, with a total of 257221 check-in records. The Tokyo data are similar to the Foursquare data, and the same filtering is applied, leaving 2095 users and 8246 locations for the experiment, with a total of 605893 check-in records.
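The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `filter_checkins`, the `(user_id, location_id)` record layout, and the choice to count once on the raw data (rather than repeating the filter until it is stable) are all assumptions.

```python
from collections import Counter

def filter_checkins(checkins, min_user_records=8, min_location_visits=8):
    """Drop check-ins whose user or location is too rare.

    `checkins` is a list of (user_id, location_id) pairs. Counts are taken
    once on the raw data; whether the original study filtered once or
    iterated until stable is not stated, so a single pass is an assumption.
    """
    user_counts = Counter(user for user, _ in checkins)
    location_counts = Counter(location for _, location in checkins)
    return [
        (user, location)
        for user, location in checkins
        if user_counts[user] >= min_user_records
        and location_counts[location] >= min_location_visits
    ]

# Toy data: u1 has 10 records, u2 has 3; location "a" has 11 visits, "b" has 2.
sample = [("u1", "a")] * 8 + [("u2", "a")] * 3 + [("u1", "b")] * 2
kept = filter_checkins(sample)
```

Only the eight `("u1", "a")` records survive in this toy example, because `u2` and location `b` both fall below the threshold of 8.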

The specific statistics of the three data sets are shown in Table 1.

2.3. Vector Representation of Text

Preprocessing the data in the above three databases mainly includes the following steps:
(1) Word segmentation. Chinese word segmentation is performed with the Jieba word segmentation tool [22]. Jieba is a Python implementation of the NShort Chinese word segmentation algorithm. It is simple in principle, easy to understand, light on model resources, and easy to train. In addition, the NShort algorithm is efficient in large-scale word segmentation scenarios, is widely used in commercial applications, and supports incremental expansion. It is a mainstream Chinese word segmentation algorithm and is used by most search engine companies.
(2) Word vector conversion. The segmented data are converted into word vectors. A word vector is a text feature representation in which each word is expressed as a feature vector [23]. A common approach represents text with feature vectors over a word list, for example using word frequency or TF-IDF values. A typical representation is one-hot encoding, which represents each word uniquely by a vector containing a single 1 with all other entries 0, so its dimension equals the number of words in the whole vocabulary. Because the vocabulary is very large, the resulting feature vectors are very high-dimensional, which leads to the problem of very high vector sparsity. Moreover, such methods cannot express the relationships between words or reflect the deep semantic information between words and text. For these reasons, the word embedding method is used here for word vector conversion [24].

Unlike traditional lexical feature representations, the word embedding method represents words as dense real-valued vectors in a low-dimensional space. It not only represents words in vector form but also allows the distance between two words to be computed, which describes the semantic information between them. It is a very effective way to process text. At present, word2vec is the most widely used word embedding technique. The word2vec model comes in two variants, the Skip-gram model and the CBOW model [25]. The Skip-gram model predicts the probability of the context words from the target word, whereas the CBOW model predicts the probability of the target word from its context words. This paper uses the Skip-gram model for word vector representation; its basic structure is shown in Figure 2.

As can be seen from Figure 2, the Skip-gram model is a neural network with an input layer, a hidden layer, and an output layer. First, each word is converted to one-hot form and fed into the input layer, and the result is computed in the hidden layer. The output layer outputs the probability of the context words of the target word. Once training is complete, the weights from the input layer to the hidden layer can be used to represent the target word. This is because only the weights at the position of the “1” in the one-hot vector are activated, and the number of these weights equals the number of hidden layer nodes, so the vector formed by these weights can represent the target word. Because the position of the 1 differs for each word, every word is uniquely mapped to a low-dimensional dense vector.
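The row-selection argument above can be checked with a small sketch. The vocabulary, the weight values, and the helper names `one_hot` and `hidden_layer` are hypothetical and chosen only for illustration; a trained model would supply the real input-to-hidden weights.

```python
def one_hot(index, vocab_size):
    """Return a vector with a single 1.0 at `index` and 0.0 elsewhere."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec

def hidden_layer(one_hot_vec, weights):
    """Multiply a 1 x V one-hot vector by a V x N weight matrix."""
    n_hidden = len(weights[0])
    return [
        sum(one_hot_vec[i] * weights[i][j] for i in range(len(weights)))
        for j in range(n_hidden)
    ]

# Hypothetical 4-word vocabulary and 4 x 2 input-to-hidden weight matrix.
vocab = ["mountain", "museum", "lake", "temple"]
W = [[0.1, 0.3], [0.7, -0.2], [0.0, 0.5], [-0.4, 0.9]]

idx = vocab.index("museum")
embedding = hidden_layer(one_hot(idx, len(vocab)), W)
# The product simply selects row `idx` of W, i.e. the dense word vector.
```

In practice the multiplication is replaced by a direct table lookup of row `idx`, which gives the same vector without touching the zero entries.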

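The Skip-gram model predicts the probability of the context words of a target word using formula (1):

$$p(\mathrm{Context}(w) \mid w) = \prod_{c \in \mathrm{Context}(w)} p(c \mid w) \tag{1}$$

where $\mathrm{Context}(w)$ represents the context vocabulary of the target word $w$. When softmax is used to calculate $p(c \mid w)$, the calculation efficiency is often very low, so existing models often use hierarchical softmax for efficient calculation. Combined with hierarchical softmax, formula (1) is expanded into formula (2):

$$p(\mathrm{Context}(w) \mid w) = \prod_{c \in \mathrm{Context}(w)} \prod_{j=2}^{l^{c}} p\left(d_{j}^{c} \mid v(w), \theta_{j-1}^{c}\right) \tag{2}$$

where $l^{c}$ represents the path length of the context word $c$ in the output hierarchy tree, $c \in \mathrm{Context}(w)$ represents a context word of the target word $w$, $v(w)$ represents the output vector of the target word, and $\theta_{j-1}^{c}$ represents the output word vector at the corresponding level on the path to the context word.

$d_{j}^{c} \in \{0, 1\}$ represents the logistic output indicator variable: when $d_{j}^{c} = 0$, the factor is expressed as $\sigma\left(v(w)^{\top}\theta_{j-1}^{c}\right)$, and when $d_{j}^{c} = 1$, it is expressed as $1 - \sigma\left(v(w)^{\top}\theta_{j-1}^{c}\right)$, where $\sigma(\cdot)$ is the sigmoid function.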
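A minimal sketch of the hierarchical softmax product for a single context word is given below. The tree layout, the vectors, and the function name `hierarchical_softmax_prob` are hypothetical, and the branch convention (code 0 contributes the sigmoid, code 1 contributes one minus the sigmoid) is an assumption about the indicator variable rather than something the text fixes.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hierarchical_softmax_prob(path, target_vec, node_vecs):
    """Probability of one context word given the target word vector.

    `path` lists (inner_node_index, branch_code) pairs from the root to the
    leaf of the context word. Assumed convention: branch code 0 contributes
    sigmoid(.), branch code 1 contributes 1 - sigmoid(.).
    """
    prob = 1.0
    for node, code in path:
        s = sigmoid(dot(node_vecs[node], target_vec))
        prob *= s if code == 0 else 1.0 - s
    return prob

# Hypothetical 4-word vocabulary arranged as a complete binary tree with
# inner nodes 0 (root), 1 (left child), and 2 (right child).
PATHS = {
    "mountain": [(0, 0), (1, 0)],
    "museum": [(0, 0), (1, 1)],
    "lake": [(0, 1), (2, 0)],
    "temple": [(0, 1), (2, 1)],
}
NODE_VECS = [[0.2, -0.1], [0.5, 0.4], [-0.3, 0.8]]
target_vec = [1.0, 0.5]

probs = {
    word: hierarchical_softmax_prob(path, target_vec, NODE_VECS)
    for word, path in PATHS.items()
}
```

Each word costs only as many sigmoid evaluations as its path is long, yet the leaf probabilities still sum to 1, which is why the tree can replace the full softmax over the vocabulary.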

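In the training of the word embedding model, negative samples are often added to speed up training and improve the quality of the resulting word vectors. In this case, the objective function of Skip-gram model training can be expressed as formula (3):

$$\mathcal{L} = \sum_{w \in C} \sum_{c \in \mathrm{Context}(w)} \sum_{u \in \{w\} \cup NEG(w)} \left\{ L^{w}(u) \log \sigma\left(v(c)^{\top}\theta^{u}\right) + \left[1 - L^{w}(u)\right] \log\left[1 - \sigma\left(v(c)^{\top}\theta^{u}\right)\right] \right\} \tag{3}$$

where $\theta^{u}$, $v(c)$, and $L^{w}(u)$ represent the parameter vector, the context word embedding vector, and the logistic indicator variable, respectively, and the sampled (negative) vocabulary set of word $w$ is represented by $NEG(w)$. Here $C$ is the corpus, $\sigma(\cdot)$ is the sigmoid function, and $L^{w}(u) = 1$ when $u = w$ and $0$ otherwise. In deep learning tasks based on neural networks, the word embedding vectors generated by the Skip-gram model serve as good input data, so this method is used to map words to vectors.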
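The contribution of one target and context pair to a negative-sampling objective of this kind can be sketched as follows. The function names and the toy vectors are hypothetical; the sketch evaluates the objective only and leaves out the sampling distribution and the gradient updates.

```python
import math

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def negative_sampling_objective(context_vec, target_param, negative_params):
    """Log-likelihood contribution of one (target, context) pair.

    The true target word is the positive sample (indicator 1) and adds
    log sigmoid(v . theta). Each sampled word is a negative sample
    (indicator 0) and adds log(1 - sigmoid(v . theta)) = log sigmoid(-v . theta).
    """
    objective = log_sigmoid(dot(context_vec, target_param))
    for theta in negative_params:
        objective += log_sigmoid(-dot(context_vec, theta))
    return objective

context_vec = [1.0, 0.0]
# Parameters that agree with the positive sample and disagree with the negative one.
well_fit = negative_sampling_objective(context_vec, [2.0, 0.0], [[-2.0, 0.0]])
# The opposite situation, which training should move away from.
poorly_fit = negative_sampling_objective(context_vec, [-2.0, 0.0], [[2.0, 0.0]])
```

Training maximizes this quantity, so `well_fit` is closer to zero than `poorly_fit`. Only the positive word and a handful of sampled words are touched per update, which is where the speed-up over the full softmax comes from.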

For the comment information that has been segmented into words, the ultimate goal is to express it as a word vector matrix that can be input into the neural network. To this end, all comments written by user $u$ are merged into a single document, denoted $d^{u}$, which consists of $n$ words in total. The word vector matrix is built as shown in equation (4):

$$V^{u}_{1:n} = \phi\left(d^{u}_{1}\right) \oplus \phi\left(d^{u}_{2}\right) \oplus \cdots \oplus \phi\left(d^{u}_{n}\right) \tag{4}$$

where $d^{u}_{1}$ represents the first word in comment document $d^{u}$, $d^{u}_{2}$ represents the second word, and so on. The function $\phi\left(d^{u}_{k}\right)$ maps the $k$-th word to a low-dimensional dense real-valued word vector through word embedding technology. The symbol $\oplus$ is a concatenation operation that combines the word vectors row by row to form the word vector matrix $V^{u}_{1:n}$. Compared with the traditional bag-of-words technique, this method preserves the order of the words in the sentence, so the word order information is retained in $V^{u}_{1:n}$, which is an advantage for further processing.

This method is also used to process the comment information of tourism service items. All comment texts for tourism service item $i$ are likewise merged into a single document $d^{i}$, which is then transformed into the word vector matrix $V^{i}_{1:n}$ by the above method.
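The construction of an order-preserving word vector matrix for a user or item document can be sketched as below. The embedding table, the token lists, and the zero-vector fallback for unknown words are hypothetical choices made for illustration; in the described pipeline the embeddings would come from the trained Skip-gram model.

```python
def build_word_matrix(tokens, embeddings, dim):
    """Stack the embedding of each token as one row, keeping token order.

    Tokens missing from `embeddings` map to a zero vector; this fallback is
    an assumption, not something the text specifies.
    """
    unknown = [0.0] * dim
    return [list(embeddings.get(token, unknown)) for token in tokens]

# Hypothetical 2-dimensional embeddings for already segmented words.
EMBEDDINGS = {
    "scenery": [0.9, 0.1],
    "beautiful": [0.2, 0.8],
    "crowded": [-0.5, 0.3],
}

doc_a = ["beautiful", "scenery", "crowded"]
doc_b = ["crowded", "scenery", "beautiful"]

matrix_a = build_word_matrix(doc_a, EMBEDDINGS, dim=2)
matrix_b = build_word_matrix(doc_b, EMBEDDINGS, dim=2)
```

The two documents contain the same words, so a bag-of-words representation would treat them as identical, whereas `matrix_a` and `matrix_b` differ because the row order follows the word order.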

Information other than the comments of users and tourism service items is called other information; it includes the basic information of users and of tourism service items. In the user’s basic information, the age is normalized, the gender is mapped to a real value, and the occupation and city are converted directly into word embedding vectors. In addition to the comment text, the user’s historical evaluation information and the names of the evaluated items also need to be considered, because the name of a tourism service item can reflect its characteristics. For example, a scenic spot called “XX mountain” is probably natural scenery, and one called “XX Museum” probably belongs to buildings or museums. Therefore, the names of the items a user has evaluated are transformed into word embedding vectors. For tourism service items, the names and locations in the basic information can be transformed directly into word embedding vectors. Tags are a few short phrases that summarize a tourism service item, so they are processed in the same way as comment information: word segmentation is carried out first, and the resulting words are then converted into word embedding vectors. Through these operations, the other information is also transformed into vector form.

2.4. Network Construction

This section uses a DNN-based deep learning model for personalized point-of-interest (POI) recommendation. The network model, shown in Figure 3, consists of two modules: a feature extraction module and a network learning module. The feature extraction module uses word embedding technology to extract and construct the features of location-based social networks. The network learning module includes a network connection layer and a network layer. The connection layer fuses the extracted feature vectors, and the network layer trains the proposed model and predicts users’ preference scores for interest demand recommendation.

In building effective features for tourist interest demand recommendation, the feature extraction module extracts the topic feature vector, the geographic factor feature vector, and the user access feature vector through word embedding technology [26]. The network connection layer in the network learning module fully connects the vectors extracted by the feature extraction module and sends them into the deep neural network. The connection layer also makes the model extensible: if relevant context information needs to be added in an application, it can be joined through the fully connected layer and fed into network training together with the other input features. The network layer in the network learning module has the following two functions:
(1) Training function. In the training stage, the network layer uses the deep neural network to learn and extract implicit features and obtain the high-order interactions between features. The high-order features extracted by the hidden layers are then used as the input of the softmax layer to learn the classification task. The proposed model transforms tourist interest demand recommendation into a binary classification task in which a user’s check-in record is defined as a positive sample and a tourist interest demand point not visited by the user is regarded as a negative sample. The output of the softmax layer is a two-dimensional probability vector $(\hat{y}_{1}, \hat{y}_{0})$, where $\hat{y}_{1}$ represents the user’s preference probability for the tourist interest demand point and $\hat{y}_{0}$ represents the user’s nonpreference probability. The cross-entropy loss function is then selected, and the gradient descent method is used to optimize it.
(2) Prediction function. In the prediction stage, the information of the user and the tourist interest demand point is input, the network outputs the probability vector $(\hat{y}_{1}, \hat{y}_{0})$, and recommendations are made to the user according to the ranking of $\hat{y}_{1}$.
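The two-class softmax output and the cross-entropy loss used for the binary classification task can be sketched as follows. The function names and the ordering of the output vector (preference probability first, nonpreference probability second) are assumptions made for this illustration.

```python
import math

def softmax(logits):
    """Normalized exponential, shifted by the maximum for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, is_positive):
    """Cross-entropy loss for one sample.

    probs[0] is the assumed preference probability and probs[1] the
    nonpreference probability. A check-in record is a positive sample;
    an unvisited point is a negative sample.
    """
    return -math.log(probs[0] if is_positive else probs[1])

uncertain = softmax([0.0, 0.0])      # no evidence either way
confident = softmax([3.0, -1.0])     # strong preference signal

loss_uncertain = cross_entropy(uncertain, is_positive=True)
loss_confident = cross_entropy(confident, is_positive=True)
loss_wrong = cross_entropy(confident, is_positive=False)
```

A confident and correct prediction gives the smallest loss and a confident but wrong one the largest, so gradient descent on this loss pushes the preference probability up for check-ins and down for unvisited points.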

2.5. Network Training

Taking the text features constructed in the previous stage as the input of the model, the final prediction results of tourists’ interest and demand points can be obtained after training.

The connection layer of the network learning module fully connects the existing features. The fully connected vector representation of any pair of user and tourist interest demand point is shown in equation (5):

$$x_{0} = \mathrm{concat}\left(e^{T}, e^{G}\right) \tag{5}$$

where $e^{T}$ and $e^{G}$ represent the user topic features and the geographical factor features, respectively, given by the latent vectors of the user and the tourist interest demand point, and $\mathrm{concat}(\cdot)$ represents connecting all feature vectors into a one-dimensional vector that is sent into the model. The hidden layers of the model are then computed according to equation (6):

$$x_{l} = f\left(W_{l}\, x_{l-1} + b_{l}\right) \tag{6}$$

where $l$ represents the number (index) of the hidden layer in the model, $W_{l}$ represents the weight of that hidden layer, $b_{l}$ represents its offset value, and $f(\cdot)$ indicates the activation function.

In addition, a dropout layer is added after each hidden layer during training to prevent overfitting.
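A forward pass through fully connected hidden layers with dropout applied during training can be sketched as below. The ReLU activation, the inverted-dropout scaling, the dropout rate, and all weights are assumptions for illustration; the text does not state which activation function or rate was used.

```python
import random

def relu(x):
    return max(0.0, x)

def dense(x, weights, bias, activation=relu):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]

def dropout(h, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in h]

def forward(x, layers, rate=0.5, training=False, rng=None):
    """Apply each (weights, bias) layer in turn, with dropout only in training."""
    if rng is None:
        rng = random.Random(0)
    h = x
    for weights, bias in layers:
        h = dense(h, weights, bias)
        if training:
            h = dropout(h, rate, rng)
    return h

# Hypothetical single hidden layer with two units and a two-feature input.
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.1])]
eval_out = forward([2.0, 1.0], layers, training=False)
train_out = forward([2.0, 1.0], layers, rate=0.5, training=True, rng=random.Random(42))
```

At evaluation time the layer is deterministic, while in training each unit is either dropped or scaled by 1 / (1 - rate) so that the expected activation stays the same.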

In the output layer of the model, the predicted probability of the user’s interest in a tourist interest demand point is obtained as shown in formula (7):

$$\hat{y} = \mathrm{softmax}\left(W_{o}\, x_{L} + b_{o}\right) \tag{7}$$

where $x_{L}$ is the output of the last hidden layer, $W_{o}$ represents the weight of the output layer, and $b_{o}$ represents the offset value of the output layer. $\mathrm{softmax}(\cdot)$ represents the normalized exponential function; its output is two probability values, the probability that the user visits the tourist interest demand point and the probability that the user does not.

The proposed model uses a combined regularization, and the adjusted loss function is shown in formula (8):where, represents the final desired prediction result. represents the regularization parameter. represents the learning parameters of the aggregation layer. represents the learning parameters of the layer of interest, and represents the input data. can be calculated by formula (9).

The model is optimized through formula (8), and the sorted top-$k$ list is output as the recommendation result, where $k$ represents the number of recommended tourist interest demand points.

Finally, the recommendation results are determined from the model’s predictions according to the user access probability: the higher the probability, the more likely the user is to visit. Once the value of $k$ is determined, the first $k$ prediction results in the probability ranking are recommended to the user.
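The final ranking step can be sketched in a few lines. The function name `recommend_top_k` and the probabilities are hypothetical; in the described model the probabilities would be the predicted visit probabilities from the softmax output.

```python
def recommend_top_k(visit_probability, k):
    """Return the k candidate points with the highest predicted visit probability."""
    ranked = sorted(visit_probability.items(), key=lambda item: item[1], reverse=True)
    return [point for point, _ in ranked[:k]]

# Hypothetical predicted visit probabilities for one user.
predicted = {"lake": 0.91, "museum": 0.35, "temple": 0.78, "mountain": 0.60}
top_two = recommend_top_k(predicted, k=2)
```

Choosing a larger `k` lengthens the list without changing the order of the items already in it.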

3. Experiment and Analysis

The software environment of the experiment is Python 3 and the TensorFlow deep learning platform installed under the 64-bit Ubuntu 16.04 operating system. The computer hardware used in the experiment is an Intel(R) Core(TM) i7 CPU and a single NVIDIA GTX Titan GPU.

3.1. Evaluating Indicator

To verify the effectiveness of the proposed method, two evaluation indexes are used: accuracy $Acc$ and average reciprocal ranking $MRR$. They are calculated as shown in equations (10) and (11), respectively:

$$Acc = \frac{N_{c}}{N} \tag{10}$$

$$MRR = \frac{1}{M} \sum_{i=1}^{M} \frac{1}{rank_{i}} \tag{11}$$

where $N_{c}$ represents the number of correct results among the predictions, $N$ represents the total number of prediction results, $M$ represents the total number of nodes in the network model, and $rank_{i}$ represents the position of the $i$-th result in the predicted result sequence.
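The two indexes can be computed as in the sketch below, which uses the standard definitions: accuracy as the share of correct predictions, and average reciprocal ranking as the mean of 1 / rank of the true item in each predicted sequence. Treating a missing item as contributing 0 and averaging over the number of test cases are assumptions of this sketch.

```python
def accuracy(num_correct, num_total):
    """Share of correct results among all prediction results."""
    return num_correct / num_total if num_total else 0.0

def mean_reciprocal_rank(ranked_lists, ground_truth):
    """Average of 1 / rank of the true item in each predicted sequence.

    A true item that does not appear in its sequence contributes 0
    (an assumption of this sketch).
    """
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, truth in zip(ranked_lists, ground_truth):
        if truth in ranked:
            total += 1.0 / (ranked.index(truth) + 1)
    return total / len(ranked_lists)

# Three hypothetical test cases: true item ranked 1st, ranked 2nd, and missing.
predictions = [["a", "b"], ["b", "a"], ["c"]]
truths = ["a", "a", "z"]
mrr = mean_reciprocal_rank(predictions, truths)
acc = accuracy(3, 4)
```

Here the reciprocal ranks are 1, 1/2, and 0, so the average is 0.5; unlike accuracy, the index rewards placing the correct item near the top of the list.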

3.2. Experimental Results and Analysis

The algorithms in Refs. [11] and [14] are selected for comparative analysis. First, the self-built data set is used as the training set, and the number of nodes in the network model is set to 15 and 30, respectively. The results are shown in Table 2.

Then, the Foursquare data set is used as the training set, and the number of nodes in the network model is set to 15 and 30, respectively, to verify the three different algorithms. The results are shown in Table 3.

Finally, based on the data set of Tokyo, the number of nodes in the network model is set to 15 and 30, respectively, to verify three different algorithms. The results are shown in Table 4.

As can be seen from Tables 2-4, the proposed algorithm achieves the best accuracy and average reciprocal ranking on all three data sets. This is because the proposed algorithm uses the Skip-gram model to improve the extraction of the user access feature vector and uses a deep neural network to process tourists’ interest demand recommendation, which better captures sequence relationships and better tracks users’ footprints and points of interest, greatly improving recommendation performance. In the process of intelligent recommendation, the method in Ref. [11] does not consider the sequence of a tourist’s visits and does not divide the user access sequence into interest segments according to time. Although the method in Ref. [14] can model local sequence behavior, it is limited in capturing high-order sequence behavior, so it cannot extract features for tourists’ complex choices of points of interest.

4. Conclusion

This paper proposes a personalized tourist interest demand recommendation model based on deep learning and compares the proposed algorithm model with the other two algorithms through simulation experiments. The proposed model uses the deep neural network to transform the recommendation of tourists’ interests and needs into the task of binary classification, which improves the ability of extracting the features of the original data and effectively enhances the performance of predicting users’ ratings.

Future work will study real-time personalized travel route recommendation by improving the speed of the algorithm, as well as personalized route recommendation for group travel.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.