Abstract

Personalized travel experience and service for tourists has been a hot research topic in the tourism service supply chain. In this paper, we take context into consideration and propose a context-based method for analyzing tourists. First, we analyze the contexts that influence tourist behavior patterns, select the main context factors, and construct a tourist behavior pattern model based on them. Then, we calculate the interest degree of each tourist behavior pattern and use an association rule algorithm to mine the rules with high interest degree, so that we can recommend services that give tourists a better personalized travel experience. Finally, we conduct an experiment to show the feasibility and effectiveness of our method.

1. Introduction

With economic development and rising living standards, more and more people pay attention to the quality of personalized travel experience and service. In recent years, more and more personalized ways of travelling have emerged, such as FIT travel and independent travel. The traditional mode of travel service limits the diversity of service options and cannot fully meet the personalized needs of tourists. How to find the laws and features of tourist behavior by mining tourist behavior patterns, and thereby offer tourists better services, has become a problem in the tourism service supply chain.

There are many studies on tourist behavior patterns. Qing analyzed the characteristics of tourism services and the structural properties, constituent elements, and operation mechanism of the tourism service supply chain in the context of modern information technology, and he put forward a new conceptual model of the tourism service supply chain based on tourists' personalized demand [1]. Farmaki took Troodos (Cyprus) as a case to study tourist motivation [2]; Martin and Witt proposed a tourism demand forecasting model to represent tourists’ cost of living [3]; Smallman and Moore studied tourists’ decision making [4]; Kim et al. studied Japanese tourists’ shopping preferences with the decision tree analysis method [5].

These studies analyzed the tourist only from the viewpoint of psychology and behavioral science and did not consider the set of contexts that influence tourist behavior patterns. In this paper, we therefore take context into consideration and propose a context-based method for analyzing tourists, in order to find the relationship between travel services and context and to identify the important contexts that influence tourist behavior. To mine rules with high interest degree with an association rule algorithm and to recommend services that give tourists a better personalized travel experience, we propose a method based on a network diagram, which clearly reflects the relationships among the contexts that influence tourist behavior. With this method, we first delete the tourist behavior patterns with low interest degree; then, we use the Apriori algorithm to mine association rules from the tourist behavior patterns with high interest degree. Finally, we conduct an experiment to show the feasibility and effectiveness of our method.

2.1. Context

There are many definitions of context, and many researchers work on it. Schilit et al. defined context as the identities and changes of location, people, and objects around them [6]. Brown et al. thought that context should be defined as the symbols around people or other objects, such as location, time, season, and temperature [7]. In [8], the definition of context was extended to the feature information that describes the situation of objects such as people and locations. Snowdon and Grasso defined context as a multilevel structure, mainly including the individual layer, the project layer, the group layer, and the organization layer [9]. Gu thought that context reflects the shift from taking computers as the center to taking people as the center; he defined context as a spectrum, as shown in Figure 1, and divided it into computing context (such as communication bandwidth), user context (such as location), physical context (such as weather and temperature), time context (such as hour), and social context (such as law) [10].

In this paper, we regard context as the set of factors that influence the tourist behavior pattern; different contexts lead the tourist to different behavior patterns. We take the following contexts into consideration: user, location, time, device, and service type.

2.2. Association Rule and Apriori Algorithm

There are many association rule algorithms, and they can be divided into two classes. The first class focuses on improving the efficiency of association rule analysis. The second class pays more attention to the application of association rule algorithms: it addresses how to deal with numeric variables and extends associations from a single concept layer to multiple concept layers, which further reveals the inner structure of objects.

The Apriori algorithm is one of the classical association rule algorithms; the earliest version was proposed by Agrawal et al. [11]. The algorithm mainly includes two parts: producing frequent itemsets and producing association rules from the frequent itemsets. The algorithm scans the database, accumulates the count of each item, collects the items that meet the minimum support (min_sup), and thereby finds the frequent 1-itemsets, denoted $L_1$. Then, the algorithm uses $L_1$ to find the frequent 2-itemsets $L_2$, uses $L_2$ to find the frequent 3-itemsets $L_3$, and so on, until it cannot find any frequent $k$-itemsets. A rule derived from these frequent itemsets is defined as a strong association rule if it reaches the minimum confidence [12]. Since the association rule algorithm was proposed, it has been improved and applied in many fields. For example, Kang et al. applied the association rule algorithm to the smart home [13], and Zhang et al. used an improved association rule algorithm in university teaching management [14].
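The level-wise search described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper; the function names `apriori_frequent_itemsets` and `strong_rules`, and the representation of a transaction as a set of strings, are assumptions made only for illustration.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_sup):
    """Return {itemset: support} for every itemset with support >= min_sup."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Frequent 1-itemsets (L1).
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        candidate = frozenset([item])
        sup = support(candidate)
        if sup >= min_sup:
            current[candidate] = sup

    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            sup = support(c)
            if sup >= min_sup:
                current[c] = sup
        frequent.update(current)
        k += 1
    return frequent


def strong_rules(frequent, min_conf):
    """Return (antecedent, consequent, support, confidence) for strong rules."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                # Subsets of a frequent itemset are frequent, so the lookup is safe.
                conf = sup / frequent[antecedent]
                if conf >= min_conf:
                    rules.append((antecedent, itemset - antecedent, sup, conf))
    return rules
```

The search stops as soon as one level produces no frequent itemsets, which corresponds to the stopping condition in the description above.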

3. Modeling and Mining Method for Tourist Behavior Pattern Based on Context

3.1. The Context Influence Factors Analysis of Tourist Behavior Pattern

We can consider a tourist as a mobile customer because the tourist moves anytime and anywhere. At present, only a few researchers work on the mobile customer behavior pattern. Tseng and Lin thought that service and location are the factors that influence customer behavior in a mobile service environment; they proposed a method named SMAP-Mine to mine customer behaviors [15]. Ma et al. took the time context into consideration and constructed a context-aware model for mining temporal sequential mobile access patterns [16]. Chen et al. studied the problem of mining matching mobile access patterns by joining four kinds of characteristics: user, location, time, and service [17]. So in this paper, we think that the context influence factors of the mobile customer behavior pattern include mobile user, location, time, and service type.

At the same time, we take into consideration the different capabilities of the mobile devices that customers use, such as screen size, battery durability, and access bandwidth. We consider that these capabilities influence the mobile customer behavior pattern directly or indirectly. To verify this, we conducted the following experiment. In a particular context, we observed the behavior patterns of three customers who used different devices and recorded their movement trajectories, the service types they requested, and the time of each request. We obtained the customer movement trajectories shown in Figure 2 and the service request information shown in Table 1. We can conclude from Figure 2 that customers have different behavior patterns when they use different mobile devices; for example, the movement trajectory of the same user changed when he switched from one device to another. We can conclude from Table 1 that the same customer requested different services at the same time when he used different devices, and requested the same service at different locations when he used different devices. Through these analyses, we conclude that a mobile customer follows different movement trajectories, requests different services at the same time, and requests the same service in different places when he or she uses different devices. So we take the mobile device as a context influence factor of the mobile customer behavior pattern.

Other context factors also influence the mobile customer behavior pattern, such as the physical environment in which the customer stays (weather, temperature, humidity, and so on) and the social situation in which the customer is involved (e.g., manners, customs, and laws).

We use a questionnaire to determine the main context factors. The questionnaire contains nine questions, each of which involves a context factor that may influence the tourist behavior pattern. From these questions, we can study which contexts influence the tourist behavior pattern most. A total of 102 individuals participated in the survey; they are all tourists. After collecting the questionnaires, we use SPSS to analyze the results. We assign each answer option a weight from 1 to 5 and average the weights per question to obtain the weight with which each context influences behavior. The results are shown in Figure 3. So in this paper, we choose the following five context factors as the main context factors: tourist (user), device, location, time, and service.
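The weighting and averaging of questionnaire answers can be sketched as follows. This is only an illustration, not the SPSS procedure used in the survey; the mapping of options (A) to (E) onto the weights 1 to 5, and the names `OPTION_WEIGHT`, `factor_scores`, and `top_factors`, are assumptions.

```python
from statistics import mean

# Assumed mapping: (A) Strongly disagree = 1 ... (E) Strongly agree = 5.
OPTION_WEIGHT = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}


def factor_scores(responses):
    """Average the option weights per context factor.

    responses: list of dicts, one per respondent, mapping a factor name
    to the chosen option letter.
    """
    factors = responses[0].keys()
    return {f: mean(OPTION_WEIGHT[r[f]] for r in responses) for f in factors}


def top_factors(scores, k):
    """Return the k factors with the highest average weight."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

With scores computed this way, the main context factors would be the highest-ranked entries returned by `top_factors`.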

3.2. Modeling the Tourist Behavior Pattern Based on Context

The preceding section briefly analyzed the context factors that influence the tourist behavior pattern; we now build a model based on these context factors. We first give the definitions related to tourist behavior patterns and then construct a context-based model of the tourist behavior pattern.

Definition 1 (tourist user). The tourist user set is the set of all users; every user denotes a person who uses a mobile device to request mobile service messages from the mobile service supplier while travelling.

Definition 2 (devices used by the tourist). The device set is the set of devices that users use to request mobile services.

Definition 3 (location). The location set is the set of places that the tourist moves through over time.

Definition 4 (service). The service set is the set of messages with which the tourist requests tourism services from the suppliers.

Definition 5 (timestamp, sojourn time, and service request time). To represent approximately the time period in which a tourist behavior pattern forms, this paper divides the 24 hours of a day into 24 time intervals, as shown in Table 2; every time interval is one hour, and each hour is one timestamp. Sojourn time denotes the length of time the user stays at a place; service request time denotes the length of time the tourist spends requesting a tourism service.

According to the previous definitions, this paper represents one tourist behavior as a tuple of seven elements: an element of the tourist user set, an element of the device set, an element of the time set, an element of the location set, the time the tourist sojourns at that location, an element of the service set, and the time the tourist spends requesting that tourism service.
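As an illustration of this tuple, a single tourist behavior could be held in a small Python record. The class name `TouristBehavior`, the field names, the use of minutes as the time unit, and the numbering of timestamps from 0 to 23 are all assumptions for this sketch; the paper's own interval numbering is given in Table 2.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TouristBehavior:
    """One tourist behavior, in the element order given in the text."""
    user: str
    device: str
    timestamp: int           # one-hour interval of the day (assumed 0..23)
    location: str
    sojourn_minutes: float   # time spent at the location (unit assumed)
    service: str
    request_minutes: float   # time spent requesting the service (unit assumed)


def timestamp_of(moment: datetime) -> int:
    """Map a clock time to its one-hour interval (assumed numbering 0..23)."""
    return moment.hour
```

For instance, `TouristBehavior("u1", "d1", timestamp_of(datetime(2013, 5, 1, 14, 35)), "l1", 45.0, "s1", 5.0)` would describe a hypothetical user who stayed 45 minutes at a location during the interval starting at 14:00.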

In graph theory, a network is a structure composed of nodes and edges in which every edge carries a quantitative index related to the nodes or edges; this index is normally called the weight and can denote distance, expense, carrying capacity, and so on [18]. Taking advantage of this structure, this paper uses the context factors that influence the tourist behavior pattern as the nodes of the network, the connection relationships among the context factors as the edges of the network, and the connect coefficients among different context factors as the weights of the edges (the specific connection relationships and connect coefficients are described in detail later in this paper). In this way, the behavior pattern of a tourist can be clearly portrayed in the network. Figure 4 illustrates the network structure of the behavior patterns of two different mobile users.

3.3. Tourist Behavior Pattern Mining Method Based on the Network

The preceding section modeled the network structure of tourist behavior; in this section, we first give the related definitions and then the specific steps of the network-based method for mining tourist behavior patterns.

3.3.1. Basic Definitions

To explain the mining method more clearly, we first give the related definitions.

Definition 6 (connect coefficient). The connect coefficient denotes the connection relationship between two different attributes. There are four connect coefficients: the number of connections between a mobile user and a device; the number of connections between a device and a location; the time a user sojourns at a location; and the time a mobile user spends requesting a service.

Definition 7 (interesting locations and interesting services). When the length of time a tourist sojourns at a place is larger than the threshold value we set, we consider the tourist to be interested in this place. Similarly, when the length of time a mobile user spends requesting a service is larger than the threshold value, or the number of connections is larger than a threshold value, we consider the mobile user to be interested in this service. Usually the time threshold is set to 30 minutes and the connection threshold is set to 10 times.
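The two tests in Definition 7 can be sketched in a few lines of Python. The function and constant names are hypothetical; the default thresholds are the usual values stated in the definition (30 minutes and 10 connections).

```python
TIME_THRESHOLD_MIN = 30      # usual time threshold from Definition 7
CONNECTION_THRESHOLD = 10    # usual connection-count threshold from Definition 7


def is_interesting_location(sojourn_minutes, threshold=TIME_THRESHOLD_MIN):
    """A location is interesting if the sojourn time exceeds the threshold."""
    return sojourn_minutes > threshold


def is_interesting_service(request_minutes, connection_times,
                           time_threshold=TIME_THRESHOLD_MIN,
                           count_threshold=CONNECTION_THRESHOLD):
    """A service is interesting if either the request time or the number of
    connections exceeds its threshold."""
    return request_minutes > time_threshold or connection_times > count_threshold
```

Both comparisons are strict, following the wording "larger than the threshold value".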

Definition 8 (repeated edge). A tourist may have the same connection edge in two different behavior patterns; such an edge is called a repeated edge in this paper. For example, two behavior patterns of the same tourist may share two repeated edges.

Definition 9 (connect edge value). The connect edge value is the standardized form of the connect coefficient (Definition 6); standardization is used because the different magnitudes of the input variables would otherwise affect the final mining result. Each of the four connection relationships between attributes has its own edge weight; the computational formulas of the edge weights are as follows.

Connect Edge Value of User and Device. The connect edge value of a mobile user and a device equals the ratio of the number of connections between that user and that device to the total number of connections between the user and all devices in the device set. Similarly, the connect edge value of a device and a location equals the ratio of the number of connections between that device and that location to the total number of connections between devices and locations in the same behavior pattern of the mobile user.

The Connect Edge Value of where and denote the time in which a mobile user requests services at somewhere in his behavior pattern.

The Connect Edge Value of where denotes the amount of the connect service set and and denote the time in which a mobile user requests for services in his behavior pattern. An edge will be deleted if its connection edge value is smaller than a threshold value. A behavior pattern will not be involved in the calculation of the connect edge value if it contains interesting locations or interesting services.
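The ratio form of the connect edge value and the deletion of weak edges can be sketched as follows. The sketch covers only the case stated explicitly in the text, namely one user's connect coefficients divided by their sum; applying the same denominator to the other edge types is an assumption. The names `connect_edge_values` and `prune_edges` are hypothetical.

```python
def connect_edge_values(coefficients):
    """Standardize one user's connect coefficients for one edge type.

    coefficients: dict mapping a target node (e.g. a device) to its connect
    coefficient. Each value is divided by the sum over all targets.
    """
    total = sum(coefficients.values())
    if total == 0:
        raise ValueError("at least one connect coefficient must be non-zero")
    return {node: value / total for node, value in coefficients.items()}


def prune_edges(edge_values, threshold):
    """Drop every edge whose connect edge value is smaller than the threshold."""
    return {node: v for node, v in edge_values.items() if v >= threshold}
```

For example, hypothetical coefficients of 6, 3, and 1 for three devices give edge values of 0.6, 0.3, and 0.1, and a threshold of 0.2 removes the third edge.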

Definition 10 (connect edge coefficient). When a repeated edge appears, the value of this edge is made up of contributions from several behavior patterns; the connect edge coefficient denotes the influence that one behavior pattern has on this edge. Its value equals the ratio of the contribution of this behavior pattern to the sum of the contributions of all behavior patterns of the same mobile user at this edge.

Definition 11 (interesting degree id). The interesting degree id is an index that reflects how interesting a mobile user behavior pattern is. Specifically, it equals the sum of the edge weights along the five-element tuple of the behavior pattern. If the interesting degree id is smaller than a threshold value th1, we regard this behavior pattern as having a low interest level and delete it from the network. If the interesting degree id is larger than another threshold value th2, we regard this behavior pattern as having a high interest level. In this way, we divide mobile user behavior patterns into three parts, namely, low level of interest, common level of interest, and high level of interest. The thresholds th1 and th2 can be set as needed; the larger they are, the higher the interest degree of the resulting rules. As illustrated in Figure 5, the result can be used as a behavior prediction model to predict the future behavior pattern of a mobile user. If a behavior pattern contains interesting locations or interesting services, we regard it as a behavior pattern with a high interest level without calculating the specific value of its interesting degree.
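The three-way split in Definition 11 can be sketched as below. The function names are hypothetical, and the edge weights passed in are assumed to be the connect edge values along one behavior pattern.

```python
def interest_degree(edge_weights):
    """Interesting degree: the sum of the edge weights of one pattern."""
    return sum(edge_weights)


def interest_level(degree, th1, th2, has_interesting_item=False):
    """Classify a pattern as 'low', 'common', or 'high'.

    Patterns that contain an interesting location or service are treated
    as 'high' without looking at the computed degree.
    """
    if has_interesting_item:
        return "high"
    if degree < th1:
        return "low"
    if degree > th2:
        return "high"
    return "common"
```

With th1 = 0.8 and th2 = 1, a pattern with degree 0.9 would be classified as common, and a pattern with degree 0.5 would be deleted as low.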

3.3.2. Mining Steps

First Step (collecting data). To mine tourist behavior patterns, we must collect data about the tourist. By collecting user data, we obtain the information table shown in Table 3, which mainly includes the tourist information, the mobile device, the location, the connection times, the time, the time the user stays at the location, the service type the user requests, and the time the user spends requesting the service.

Second Step. Extract the number of context attributes in the context set that influences the mobile customer behavior pattern and design a network diagram with the corresponding number of layers. In this paper, the network diagram has five layers, one for each context attribute; the nodes of a layer are the values of that attribute, so the number of nodes in a layer equals the number of values of the attribute, as shown in Figure 6.

Third Step. Connect the nodes of adjacent layers and mark each edge with its connect coefficient according to Definition 6; when the same edge appears several times, its connect coefficients are added together. For example, if a device connects to the same location in two situations, with coefficients 4 and 2, the connect coefficient of that edge equals $4 + 2 = 6$.
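The accumulation of coefficients on repeated edges in the third step can be sketched as follows; the function name `accumulate_coefficients` and the node labels in the usage note are hypothetical.

```python
from collections import Counter


def accumulate_coefficients(observations):
    """Sum the connect coefficients of edges that appear more than once.

    observations: iterable of ((source, target), coefficient) pairs.
    """
    totals = Counter()
    for edge, coefficient in observations:
        totals[edge] += coefficient
    return dict(totals)
```

For example, `accumulate_coefficients([(("d1", "l1"), 4), (("d1", "l1"), 2)])` yields a single edge with coefficient 6, matching the example in the step.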

Fourth Step. Because different customers have different behavior patterns, we put each user into a separate group and calculate the connection weight (connect edge value) according to Definition 9; when the connection weight is smaller than the threshold, the edge is deleted.

Fifth Step. Calculate the interest degree of the remaining customer behavior patterns according to Definition 11, and set the low interest degree threshold th1 and the high interest degree threshold th2. When the interest degree of a customer behavior pattern is smaller than th1, the pattern is ignored; the patterns with common and high interest degree are passed to the next step.

Sixth Step. Use the Apriori algorithm to mine frequent patterns from the patterns with common and high interest degree, and mine the association rules with high support and confidence; we can use these rules to forecast customers’ future behavior or to recommend services to mobile customers.

To show the validity of our method, we propose the concept of “coverage,” which is the ratio of the number of rules produced both by our model and by direct mining to the number of rules produced by direct mining. If the coverage is larger than a threshold, we say that the proposed method is valid. Generally, the larger the threshold is, the stronger the evidence of validity is. In this paper, we set the threshold to 80%.
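The coverage measure can be sketched as a set computation. Representing a rule as a hashable value (for example, a string or a pair of frozensets) and the names `coverage` and `is_valid` are assumptions for this illustration.

```python
def coverage(model_rules, direct_rules):
    """Share of directly mined rules that the model also produces."""
    direct = set(direct_rules)
    if not direct:
        raise ValueError("direct_rules must not be empty")
    return len(set(model_rules) & direct) / len(direct)


def is_valid(cov, threshold=0.8):
    """The method passes if coverage is larger than the threshold (80% here)."""
    return cov > threshold
```

For instance, if the model reproduces 3 of 4 directly mined rules, the coverage is 0.75, which does not exceed the 80% threshold.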

4. Experiment and Analysis

4.1. Example and Analysis

We take West Lake in Hangzhou as an example to illustrate the application of the model, in which GPS and RFID are used to provide personalized services that match the requirements and preferences of the user. We select part of the tourist behavior data from the West Lake of Hangzhou Scenic Area Management Committee, as shown in Table 4.

To verify the effects of the proposed method, we use two standard metrics: interest degree and coverage.

We can conclude from Table 4 that some patterns contain interesting locations or interesting services, and we treat these patterns as behavior patterns with high interest degree. To show the process of our method, we choose the patterns of three users, design a network diagram with five layers, connect the adjacent layers, and then calculate the connect coefficients according to Definition 6, as shown in Figure 7.

Then, we put each user into a separate group and calculate the connection weight according to Definition 9. We delete the edges whose weight is smaller than 0.2, together with the patterns that contain these edges, as shown in Figures 8, 9, and 10.

For the patterns that remain, we calculate the interesting degree according to Definition 11, as is shown in the following expressions:

In this paper, we set the low interesting degree threshold th1 to 0.8 and the high interesting degree threshold th2 to 1. So the patterns whose interesting degree is smaller than 0.8 are the low interesting patterns, the patterns whose interesting degree is larger than 1 are the high interesting patterns, and the patterns whose interesting degree lies between 0.8 and 1 are the common patterns. We classify the remaining patterns accordingly, delete the low interesting degree patterns, and obtain the patterns with high interesting degree shown in Table 5.

Then, we use the Apriori algorithm to mine rules from the high interesting degree patterns, with the minimum support set to 20% and the minimum confidence set to 80%; the results are shown in Table 6.

The lift denotes the ratio of the confidence of a rule to the support of its consequent item: $\mathrm{lift}(A \Rightarrow B) = \mathrm{confidence}(A \Rightarrow B) / \mathrm{support}(B)$. The lift reflects how strongly the antecedent item influences the appearance of the consequent item. Generally, the lift value should be larger than 1, which means that the antecedent item has a positive influence on the appearance of the consequent item. The larger the lift value is, the better the rule is.
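As a small illustration of this measure, the support, confidence, and lift of one rule can be computed directly from a list of transactions. The function names and the item labels used in the usage note are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= frozenset(t) for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) of the rule antecedent => consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    sup = support(transactions, both)
    conf = sup / support(transactions, antecedent)
    # Lift: confidence divided by the support of the consequent.
    lift = conf / support(transactions, consequent)
    return sup, conf, lift
```

With four hypothetical transactions `[{"t1", "s1", "l1"}, {"t1", "s1", "l1"}, {"x"}, {"y"}]`, the rule with antecedent `{"t1", "s1"}` and consequent `{"l1"}` has confidence 1.0 and consequent support 0.5, so its lift is 2.0, which indicates a positive influence.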

From Table 6, we can see that the proposed method yields 39 association rules. These rules were obtained from the high interesting patterns, so we regard them as interesting rules. The rule with the maximum lift has a time and a service as its antecedent and a location as its consequent; its lift is 4.5. This means that this association rule has the highest practical guidance value, so it is considered first when the resulting rules are used. We can use these association rules to recommend services to tourists and thereby offer them better services. For example, with a rule whose antecedent is a location and a time and whose consequent is a service, we can recommend that service when the tourist is at that location at that time. In this paper, the service is the tourism route guide, so we can send the tourism route guide to the tourist, as shown in Figure 11.

4.2. Comparison and Discussion

To verify the effects of the proposed method, we run the Apriori algorithm, the GRI algorithm, the CARAMA algorithm, and the Predictive-Apriori algorithm on the original data (again with the minimum support set to 20% and the minimum confidence set to 80%) and obtain the rules shown in Tables 7, 8, 9, and 10.

Comparing Table 6 with Tables 7 and 8, 11 rules from Table 7 appear in Table 6 (the rules marked in yellow in Table 6), and all rules in Table 8 appear in Table 6. So the coverage of our method is about 91.67% (11/12) with respect to the Apriori algorithm and 100% (6/6) with respect to the GRI algorithm; both values are larger than the threshold we set before, which means that the proposed method for mining the mobile customer behavior pattern is feasible and effective. Besides the 11 rules shared with Table 7, Table 6 contains 28 other rules with high interest, which provide more choices to the service provider and more services to the mobile customer. The rule with the maximum lift is the same in Tables 6 and 7 (a time and a service as the antecedent, a location as the consequent), which means that our method agrees with the classical Apriori algorithm on the strongest rule. Finally, the rule with ID = 1 in Table 7, which has a location as its antecedent and a user as its consequent, is the only rule that is not included in Table 6. Although this rule meets the minimum support and the minimum confidence, the pattern it comes from is a low interesting pattern as defined before, so the rule is an uninteresting rule. Our method can reject uninteresting rules like this. This analysis indicates that the proposed method is feasible and compares favorably with the Apriori algorithm.

5. Conclusion

In this paper, we comprehensively considered the context factors that influence the tourist behavior pattern, such as the device the tourist uses, time, location, and service type, and obtained the context set that influences the tourist behavior pattern. Then we proposed a method to mine tourist behavior patterns based on a network diagram. The method first constructs a network diagram, then extracts the behavior patterns with high interesting degree, and finally mines association rules from these patterns. We conducted an experiment to show the feasibility and effectiveness of our method. In the experiment, we set the low interest degree threshold th1 to 0.8 and the high interest degree threshold th2 to 1 and deleted the low interest patterns; then we applied association mining with the Apriori algorithm to the remaining patterns and obtained 39 rules, which can be used to make recommendations to tourists. Compared with the results obtained without this method, it has the following advantages: (1) it keeps the interesting rules and deletes the uninteresting rules; (2) it produces many other interesting rules, which can be used to make more recommendations to the tourist; (3) it produces the same highest-lift rule as the result obtained without this method. That is, the method used in this paper is feasible and superior.

In future work, we will further study the context factors that influence the tourist behavior pattern and expand the context set; we will also analyze the performance of the proposed method and optimize it.

Appendix

Questionnaire

Your age: ______

Gender: male/female

(1) To what extent do you think the user will influence the behavior?
(A) Strongly disagree (B) Disagree (C) Neutral (D) Agree (E) Strongly agree

(2) To what extent do you think the location will influence the behavior?
(A) Strongly disagree (B) Disagree (C) Neutral (D) Agree (E) Strongly agree

(3) To what extent do you think the time will influence the behavior?
(A) Strongly disagree (B) Disagree (C) Neutral (D) Agree (E) Strongly agree

(4) To what extent do you think the device will influence the behavior?
(A) Strongly disagree (B) Disagree (C) Neutral (D) Agree (E) Strongly agree

(5) To what extent do you think the service will influence the behavior?
(A) Strongly disagree (B) Disagree (C) Neutral (D) Agree (E) Strongly agree

(6) To what extent do you think the network bandwidth will influence the behavior?
(A) Strongly disagree (B) Disagree (C) Neutral (D) Agree (E) Strongly agree

(7) To what extent do you think the temperature will influence the behavior?
(A) Strongly disagree (B) Disagree (C) Neutral (D) Agree (E) Strongly agree

(8) To what extent do you think the law will influence the behavior?
(A) Strongly disagree (B) Disagree (C) Neutral (D) Agree (E) Strongly agree

(9) Other factors will influence the behavior, such as ______.

Acknowledgments

This research is supported by the National Natural Science Foundation of China (Grant nos. 71071140 and 71301070005), the National Natural Science Foundation of Zhejiang Province (Grant no. Y1090617), the Key Innovation Team of Zhejiang Province (Grant no. 2010R50041), the Soft science key research project of Zhejiang Province (Grant no. 2013C25053), the Zhejiang Gongshang University Graduate Student Scientific Research Project (1130XJ1512168), and the Modern Business Centre of Zhejiang GongShang University.