Abstract

This article explores a prediction method better suited to the complex tourism environment, with the aim of improving the accuracy of tourism forecasts and uncovering the development pattern of China’s domestic tourism, so as to better serve domestic tourism management and decision-making. The study uses grey system theory, BP neural network theory, and the combination model method to model and forecast tourism demand. Firstly, grey theory is introduced and the GM (1, 1) model is established: the irregular original data series is transformed into a regular one, on which the prediction model is built. Secondly, following the structure and algorithm of the BP neural network, a BP neural network model is established using the data series of travel time and the number of tourists. Then, the BP neural network is combined with the grey model to establish a grey neural network combination model for forecasting the number of tourists, and the prediction accuracy of the model is analyzed against the actual time series of tourist numbers. Finally, the experimental analysis shows that combination forecasting makes full use of the information provided by each forecasting model and yields a better forecasting result, thereby improving forecasting accuracy and reliability.

1. Introduction

In recent years, tourism has grown rapidly. Many regions treat tourism as a pillar industry, hoping that it will drive the development of the whole social economy. Accurate prediction of tourism demand therefore plays an important role in the formulation of national tourism development policy and strategic planning, the optimal allocation of tourism market resources, and the strategic planning and decision-making of tourism enterprises [1, 2]. Establishing a scientific and operable tourism demand prediction model and making accurate and effective predictions is a precondition for the sustainable and healthy development of tourism. At present, research on tourism demand forecasting models has made considerable progress both at home and abroad, but from the perspective of forecasting accuracy, there is still no universally applicable method [3]. The main reason is that tourism demand is itself a complex system, which is constrained by many factors.

With the rapid development of modern basic science and computer technology, a large number of new research results have emerged in the theory of prediction models. Based on current methods of combining the grey model with the neural network, some scholars have put forward three forecasting models: the parallel grey neural network combined model, the series grey neural network combined model (SGNN), and the embedded grey neural network combined model (EGNN) [4, 5]. The combination forecasting of the parallel combination model takes a weighted average of the predicted values obtained by different methods so as to get a more accurate prediction value [1]. Its core problem is how to calculate the weighted average combination coefficients so that the combination prediction model improves the prediction accuracy more effectively. Combination forecasting can make full use of all useful information, so that the component models divide the work among themselves and complement each other [6, 7]. Research on tourism demand forecasting involves the natural environment, politics, economy, and cultural concepts of the tourist source country and is also affected by various aspects of the destination country, so it places higher requirements on accurate forecasting. Domestic and foreign scholars mainly focus on tourism income and tourism reception, which involve many factors. Prediction in the tourism industry mainly uses traditional prediction models based on mathematical statistics, which are suitable for linear regression problems but have no advantage for nonlinear prediction. This article mainly uses grey system theory, BP neural network theory, and the combination model method to model and predict tourism demand. The influencing factors of tourism demand involve many aspects such as social economy and physical geography; their relationships are complex and mostly nonlinear, so prediction with an artificial neural network has more advantages. Combination forecasting makes full use of the information provided by each forecasting model to obtain the combined forecasting model and the best forecasting result so as to improve the forecasting accuracy and reliability.

The main research content of this article is to find a better model method suitable for tourism demand forecasting. The grey model and the neural network are first used to forecast separately, and the weighted combination of their forecast results is then used as the actual forecast value. Starting from the introduction of grey theory, the GM (1, 1) model is established [8]: through a transformation of the irregular original sample data series, a relatively regular new data series is obtained, and the prediction model is established on it. Following the structure and algorithm of the BP neural network, the single-variable BP neural network model is first established using the data series of the number of tourists, and the multivariable BP neural network model is then established from the time series of several factors that affect the number of tourists, and the prediction results are analyzed. Compared with a single prediction model, the grey model and the neural network can complement each other in tourism demand prediction and improve the prediction accuracy. Combined forecasting makes full use of the information provided by each forecasting model, synthesizes their results in an appropriate way, and obtains the combined forecasting model and the best forecasting result so as to improve the forecasting accuracy and reliability. The organizational structure of the article is as follows: the second section is the literature review, which introduces the research status at home and abroad in detail; the third section introduces grey theory and the BP neural network in detail. The fourth section constructs the grey neural network combination tourism demand forecasting model, which is divided into the parallel combination model and the series combination model. In Section 5, the related experiments and analysis are carried out, and the univariate model and the multivariate model are used to predict and analyze the tourism demand, respectively, achieving good results.

2. Literature Review

Research on tourism demand forecasting involves the natural environment, politics, economy, and cultural concepts of the tourist source country and is also affected by various aspects of the destination country, so it places higher requirements on accurate forecasting [9]. Early research on tourism demand forecasting focused mainly on the accuracy of forecasting, so the main research direction was to use different technical methods to forecast tourism demand and to discuss the performance of different forecasting methods. With the rapid development of computer technology, artificial intelligence methods are increasingly applied to research on tourism demand prediction [10, 11]. Their biggest advantage is that they require little information beyond the probability distribution of the data, while achieving higher prediction accuracy. Since the invention of the first generation of single-layer perceptron, the neural network has been continuously improved, including BP neural network, principal component neural network, radial basis function neural network, SVM neural network, convolution neural network, Boltzmann machine neural network, and deep residual artificial neural network.

Vetitnev used the error correction model and time series model to study the seasonal change of inbound tourism demand in Australia [12]. The results show that the prediction error of the time series model is small, but the study also points out that the prediction accuracy of the time series model differs considerably when predicting the tourism demand of different regions. Jun W et al. pointed out that the prediction accuracy of the ARIMA model will not exceed that of the traditional Naive model, and that the difference in prediction accuracy may be due to the different data used for testing and comparison in the research process [13]. It is also possible that differences in tourism demand structure across regions lead to different patterns in the historical data, and that these differences in the data lead to the differences in prediction accuracy. Song H developed a new type of tourism demand forecasting system by combining the fuzzy clustering method (FCM) with least squares support vector regression technology in order to forecast tourism demand more accurately [14]. The empirical results show that the forecasting system performs better than other methods and achieves higher forecasting accuracy. Croce used the artificial neural network model and the time series method to predict Catalonia's tourism demand, respectively, considering the seasonality and volatility of tourism [15]. The results show that the neural network model is more suitable for the prediction of nonlinear data and the time series model is more suitable for linear data. Neither is absolutely better than the other; the key lies in the degree and technique of data preprocessing.

The above review shows that different forecasting models have different forecasting accuracy and adaptability in different situations, and many empirical studies show that no single forecasting model is generally accepted for tourism demand forecasting. More and more foreign scholars try to use the combination forecasting model to forecast tourism demand in order to improve the forecasting accuracy. Sakhuja combined four time series models to forecast tourism demand and found that the combined forecasting model has higher forecasting accuracy and smaller forecasting error than a single forecasting model [9]. Silva et al. used a variety of combination methods to study the combination of long-term and short-term forecasts and demonstrated the effectiveness of the combination method by forecasting the monthly number of inbound tourists in Egypt [16]. However, some scholars hold different views on whether the prediction accuracy of the combination prediction model is better than that of the single prediction model. For example, Wong et al. studied whether combination models can improve prediction accuracy in the tourism context, constructed three different combination methods, and concluded that not every combination prediction model is more accurate than every single prediction model, although the accuracy of most combination models is not lower than that of the worst single prediction model. They also pointed out that the choice of combination method has a very important impact on the prediction accuracy of combined forecasting models.

3. Grey Theory and BP Neural Network

3.1. Grey Prediction Model

Grey system theory was introduced in 1982 and has been developed for more than 30 years [17]. Its theoretical system and structure have been gradually established and improved, and it has been successfully and widely used in many scientific fields. Grey system theory mainly takes uncertain systems with “poor information and small samples” as its research object. By generating and developing the partial information that is available, valuable information can be extracted to accurately describe and effectively control the operating behavior and evolution law of the system. Grey system theory is mainly composed of the grey prediction model, grey clustering analysis, grey correlation analysis, the grey decision-making method, and the grey sequence operator. The main technical contents involved include data processing and analysis, model establishment, decision-making on major issues, and prediction of development trends [18]. Prediction means exploring the past in order to infer and understand the future. Grey prediction refers to using grey system theory to process the original data and establish a prediction model in order to study, discover, and master the development law of the system and to make a scientific quantitative prediction of its future trend and state.

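Tourism demand is a random event shaped by both fuzziness and contingency, which matches the conditions under which grey theory applies, so it can be analyzed and studied using grey system theory. The GM (1, 1) prediction model is the basic model of grey system theory. Its basic idea is to transform the irregular original sample data series into a relatively regular new data series and then establish a prediction model on it. The definition and modeling steps of the grey GM (1, 1) model are as follows:

(1) Data preprocessing. Because most original data are irregular and random, they cannot be used directly to build the model, so the original data sequence needs to be processed first. Record the original data sequence:

$$x^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$$

The level ratio is calculated as follows:

$$\lambda(k) = \frac{x^{(0)}(k-1)}{x^{(0)}(k)}, \quad k = 2, 3, \ldots, n$$

When every $\lambda(k)$ is within the tolerance coverage $Y = \left(e^{-2/(n+1)}, e^{2/(n+1)}\right)$, the original data sequence $x^{(0)}$ can be used to build a model for prediction. Otherwise, the original data sequence needs to be modified appropriately, typically by a translation with a constant $c$, so that its level ratios fall within the acceptable coverage, and the modified data sequence is then used for modeling and prediction:

$$y^{(0)}(k) = x^{(0)}(k) + c, \quad k = 1, 2, \ldots, n$$

A new data sequence is obtained after one accumulation:

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n$$

(2) Constructing matrix $B$ and vector $Y_N$. The new data sequence generated by accumulation enhances the regularity of the original data sequence and weakens its randomness, and the weakening of randomness becomes more obvious as the amount of accumulation grows. The cumulative matrix $B$ and the constant term vector $Y_N$ are constructed as follows:

$$B = \begin{bmatrix} -\frac{1}{2}\left(x^{(1)}(1) + x^{(1)}(2)\right) & 1 \\ -\frac{1}{2}\left(x^{(1)}(2) + x^{(1)}(3)\right) & 1 \\ \vdots & \vdots \\ -\frac{1}{2}\left(x^{(1)}(n-1) + x^{(1)}(n)\right) & 1 \end{bmatrix}, \quad Y_N = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}$$

(3) The basic form of the GM (1, 1) model is established:

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b$$

Here $a$ is the development coefficient and $b$ is the grey action quantity. Both are calculated by the least squares method according to the following formula:

$$\begin{bmatrix} a \\ b \end{bmatrix} = \left(B^{T} B\right)^{-1} B^{T} Y_N$$

The prediction model is obtained by substituting the values of $a$ and $b$ into the time response function:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right) e^{-ak} + \frac{b}{a}, \quad k = 0, 1, 2, \ldots$$

Applying the inverse accumulation to $\hat{x}^{(1)}$ gives the following result:

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k) = \left(1 - e^{a}\right)\left(x^{(0)}(1) - \frac{b}{a}\right) e^{-ak}$$

where $\hat{x}^{(0)}(k+1)$ is the predicted value.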
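The GM (1, 1) steps above can be sketched in a few lines of Python. This is a minimal illustration of the textbook formulation (level-ratio test, one accumulation, least squares on the averaged background values, exponential time response, inverse accumulation), not the authors' MATLAB program. The function names `level_ratio_ok`, `gm11_fit`, and `gm11_predict` are hypothetical, and the sketch assumes a positive input series and a nonzero development coefficient a.

```python
import math


def level_ratio_ok(x0):
    """Check that every level ratio x0(k-1)/x0(k) lies in the tolerance coverage."""
    n = len(x0)
    low, high = math.exp(-2.0 / (n + 1)), math.exp(2.0 / (n + 1))
    return all(low < x0[k - 1] / x0[k] < high for k in range(1, n))


def gm11_fit(x0):
    """Estimate the development coefficient a and the grey action quantity b."""
    # One accumulation (1-AGO): x1(k) = x0(1) + ... + x0(k)
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values z(k) = 0.5 * (x1(k) + x1(k-1)) and targets x0(k), k = 2..n
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Closed-form least squares for y = -a * z + b (rows of B are [-z, 1])
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zk * yk for zk, yk in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b


def gm11_predict(x0, steps_ahead=0):
    """Return fitted values for x0 followed by steps_ahead forecasts."""
    a, b = gm11_fit(x0)
    c = x0[0] - b / a  # assumes a != 0

    def x1_hat(k):  # time response; k = 0 corresponds to the first sample
        return c * math.exp(-a * k) + b / a

    fitted = [x0[0]]
    for k in range(1, len(x0) + steps_ahead):
        fitted.append(x1_hat(k) - x1_hat(k - 1))  # inverse accumulation
    return fitted
```

Calling `gm11_predict(series, steps_ahead=3)` returns the fitted values for the observed years followed by three forecasts. If `level_ratio_ok(series)` is false, the series would first need the kind of modification described in step (1).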

3.2. BP Neural Network

Since the invention of the first generation of single-layer perceptron, the artificial neural network has experienced decades of development and has become an important part of machine learning research. The BP (backpropagation) neural network is one of the most widely used neural networks and a key link in the development history of the artificial neural network. It is a multilayer feedforward network (multilayer perceptron) trained by the error backpropagation algorithm. It has powerful self-healing and association capabilities and good nonlinear adaptability for dealing with all kinds of information. It also processes large amounts of data in parallel and is a general approximator. The basic principle is as follows: when the mapping relationship between input and output is unknown, gradient descent search is used to continuously modify the parameters of the network through the feedback of the error backpropagation algorithm, so that the mean square error between the actual output and the expected output of the network is gradually reduced, and the mapping relationship is adjusted to reduce the internal error of the system and achieve better convergence. As shown in Figure 1, the BP neural network is generally composed of an input layer, a hidden layer, and an output layer, with at least one hidden layer. The establishment of a neural network starts from basic neurons. An artificial neuron is the basic unit of the neural network, and each layer is composed of a different number of neurons. Neurons within the same layer are not connected, while neurons in adjacent layers are connected with each other. Input layer neurons mainly quantify the required index factors, and their number depends on the samples. The hidden layer expresses the internal logical relationship in the neural network, and its size is mostly set by researchers according to the research purpose and their own experience. The output layer gives the data corresponding to the input layer after continuous logical operations, and its size is determined by the research object.

As shown in Figure 2, the network structure includes an input layer, whose input data are denoted by $x_i$; a hidden layer, whose node outputs are denoted by $y_h$; and an output layer, whose node outputs are denoted by $z_j$, with the target output signal denoted by $t_j$. $w_{ih}$ denotes the weights from the input layer to the hidden layer, $w_{jh}$ denotes the weights from the hidden layer to the output layer, and $N_1$, $N_2$, and $N_3$ are the numbers of input, hidden layer, and output nodes, respectively. The figure shows that the error of the BP neural network is propagated backward, and the error backpropagation function, denoted by $\Delta_n$, can also be called the loss function.

The main process of the BP learning algorithm is as follows:
(1) According to the samples, the network structure is designed, the data are initialized, and the target error and the maximum number of iterations are set, with the number of iterations less than the number of samples.
(2) The node outputs of the hidden layer and the output layer are computed, and the sum of squared errors is calculated and tested.
(3) Backpropagation of the error: the error vector of the neurons in the output layer is calculated. The error backpropagation function is defined first, and the gradient descent algorithm based on the backpropagated error is then used to correct the error continuously, starting from the output layer and proceeding layer by layer, iterating the weights.
(4) The weights and thresholds are adjusted continuously. In the network, the weight vectors and thresholds are adjusted according to the update rules.
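The four steps above can be illustrated with a small three-layer network written in plain Python. This is only a sketch under stated assumptions: a sigmoid hidden layer, a single linear output node, a half-squared-error loss, per-sample gradient-descent updates, and thresholds stored as an extra bias weight. None of these choices, nor the hyperparameter values or the names `forward` and `train_bp`, are taken from the source.

```python
import math
import random


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def forward(x, w_ih, w_ho):
    """Return the hidden outputs y_h and the single network output z."""
    y = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + row[-1]) for row in w_ih]
    z = sum(w * yh for w, yh in zip(w_ho, y)) + w_ho[-1]
    return y, z


def train_bp(samples, targets, n_hidden=3, lr=0.2, max_iter=5000,
             target_error=1e-5, seed=0):
    """Train a one-hidden-layer network; returns (w_ih, w_ho, last_sse)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Step 1: initialise weights; the last entry of each row is the threshold (bias)
    w_ih = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_ho = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    sse = float("inf")
    for _ in range(max_iter):
        sse = 0.0
        for x, t in zip(samples, targets):
            # Step 2: forward pass and squared-error accumulation
            y, z = forward(x, w_ih, w_ho)
            err = t - z
            sse += 0.5 * err * err
            # Step 3: backpropagate the error (linear output, sigmoid hidden layer)
            delta_o = err
            delta_h = [delta_o * w_ho[h] * y[h] * (1.0 - y[h]) for h in range(n_hidden)]
            # Step 4: gradient-descent update of weights and thresholds
            for h in range(n_hidden):
                w_ho[h] += lr * delta_o * y[h]
                for i in range(n_in):
                    w_ih[h][i] += lr * delta_h[h] * x[i]
                w_ih[h][-1] += lr * delta_h[h]
            w_ho[-1] += lr * delta_o
        if sse < target_error:
            break
    return w_ih, w_ho, sse
```

Inputs and targets are assumed to be scaled to a small range such as [0, 1] before training. With unscaled tourist counts, the fixed learning rate used here would likely be unstable.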

3.3. Fusion of Grey Model and BP Neural Network

Grey model and BP neural network model have their own unique advantages and characteristics, but through in-depth analysis, we can find that there are many similarities and complementarities between them. First of all, the output of the BP model is approximated to a fixed constant with the set expected error, which makes the output of the BP network fluctuate around the constant. Therefore, if BP is put in the grey system, its output is actually a grey number; that is, the BP network itself contains the content of grey system theory.

At the same time, the grey model takes the uncertainty of “small sample” and “poor information” as the modeling object, and its modeling process needs fewer sample data and does not need to consider its distribution law and change trend, so it has the characteristics of simple modeling and convenient operation, but it lacks the ability of self-learning, self-organization, and self-adaptivity, and its ability to deal with nonlinear information is weak. The characteristics of the neural network model can just supplement the grey method.

Therefore, the “poor information” of the grey model is used to replace the “large sample” needed by the neural network model, and the nonlinear processing ability of the BP neural network is used to make up for the bad nonlinear fitting of the grey model. The grey neural network model with better performance can be established by combining the simple modeling of the grey model with the ability of error feedback adjustment of the neural network, as shown in Figure 3. According to the following steps, the tourism demand combination forecast is carried out:
Step 1: determine the tourism demand data according to the actual situation and sort out the data preliminarily.
Step 2: establish two or more single tourism demand forecasting models and forecast the tourism demand in the next stage according to the established forecasting models.
Step 3: judge the prediction result in step 2; if the prediction result is normal, go to step 4; if the prediction result is abnormal, analyze the reason and return to step 2 to repredict, repeating until the result is normal.
Step 4: use the qualified individual prediction results from step 2 to establish a combined prediction model and predict.
Step 5: test and analyze the prediction results to draw a conclusion.

In the prediction of practical problems, it is often necessary to choose different prediction methods according to different data structures and to combine them in different forms. Compared with single model forecasting, the advantages of combination forecasting are as follows:
(1) Combined forecasting can obtain different system information through different models, according to different sample data and from different angles. Because it has the advantages of multiple models, it can synthesize the prediction results of each model more comprehensively, which not only improves the accuracy of model prediction but also greatly increases the stability and reliability of prediction results.
(2) Combined forecasting makes model selection relatively easy. Theoretical and applied research shows that a combination prediction using different models often has higher prediction accuracy than a single model. At the same time, the combined model is more robust to changes in the data signal.
(3) Combined forecasting can make up for the deficiency of a single model. In essence, combination forecasting integrates the prediction results of various single models to disperse the uncertainty of the prediction results of a single model so as to improve the prediction accuracy.

Based on this, aiming at the problem of tourism demand forecasting, this article carries out the research of using the grey model and BP neural network to establish an agent model for combined forecasting.

4. Grey Neural Network Combined Tourism Demand Forecasting Model

The combination forecasting method uses two or more different forecasting methods for the same problem. It may be a combination of several quantitative methods, a combination of several qualitative methods, or a combination of qualitative and quantitative methods. Different forecasting methods can provide different useful information. The main purpose of combination forecasting is to make full use of this information and improve forecasting accuracy as much as possible.

At present, single forecasting methods are mostly used for tourism demand forecasting, such as grey theory, genetic algorithm theory, and support vector machine theory. These new theoretical methods overcome the shortcomings of traditional forecasting methods in solving nonlinear, uncertain, and time-varying system forecasting and improve the accuracy of tourism demand forecasting. However, tourism demand will be affected by many internal and external factors. The use of a single forecasting method can only grasp some of the main influencing factors but can not include the comprehensive and effective information of tourism demand, which makes the accuracy of forecasting results low. The use of a combined forecasting method to forecast tourism demand can make use of the useful information of each single forecasting model to improve the accuracy of forecasting. The purpose of this article is to open up a new perspective of this work through the combination of tourism demand forecasting research and to provide a new reference for the future.

In this article, the grey model and the neural network are used to predict separately, and the weighted combination of their prediction results is then used as the actual prediction value. Several factors affect the prediction performance of the parallel combination prediction model. The first is the quality of each component prediction method for the same prediction object; in particular, a prediction method with excellent performance should be included among the component methods. The second is the number of component methods: generally speaking, the prediction performance of parallel combination forecasting improves as the number of component forecasting methods increases, but the gain from each additional method becomes smaller. The last is the structure of the combination forecasting model, that is, the weight of each component forecasting method in the combination model.

The series combination model takes the prediction results of multiple grey models as the input of the neural network and uses the nonlinear fitting ability of the neural network to calculate the weight of each grey model. Because the series grey neural network only takes the prediction result of the grey prediction model as the input, it ignores the influence of other factors on the prediction result. In fact, the influence of other factors on the prediction result is very important. Therefore, the prediction results of the grey model and the main factors affecting the tourism demand are taken as the input of the neural network at the same time. Through the prediction error of the grey model to correct the weights in the training process of the neural network and the nonlinear effect of the influencing factors on the tourism demand, the best fitting between the predicted value and the observed value can be achieved.

4.1. Parallel Combination Model

The so-called parallel combination forecasting is to comprehensively use various forecasting methods to obtain the combination forecasting model in the form of an appropriate weighted average. The core problem of parallel combination forecasting is how to calculate the weighted average coefficient to make the combination forecasting model more effective. The weight of each prediction method should reflect the contribution of each method to the total prediction results; that is, the greater the error, the smaller the weight in combination prediction; the smaller the prediction error, the greater the weight in combination prediction.

Based on the above results, the prediction values of the grey Markov GM (1, 1) model, the single-variable BPNN, and the multivariable BPNN from 2010 to 2020 are obtained. The statistical results are shown in Table 1 and Figure 4.

The prediction error information matrix is obtained by MATLAB programming:

The optimal weight coefficient vector is [0.0642, 0.6378, 0.3283]. According to this combination weight, the parallel combination prediction model is established as follows:

$$y = 0.0642\,Y_1 + 0.6378\,Y_2 + 0.3283\,Y_3$$

where $y$ is the combination prediction value; $Y_1$ is the grey Markov correction model prediction value; $Y_2$ is the single-variable BP neural network prediction value; $Y_3$ is the multivariable BP neural network prediction value.
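The source does not state how the optimal weights are derived from the prediction error information matrix. One common approach, assumed here purely for illustration, minimizes the sum of squared combined errors subject to the weights summing to one, which gives w = E⁻¹R / (RᵀE⁻¹R) with E the error information matrix and R a vector of ones. The sketch below implements that assumption and the weighted combination itself; the helper names `solve_linear`, `optimal_weights`, and `combine` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def optimal_weights(errors):
    """errors[i][t] is the prediction error of model i at time t."""
    n = len(errors)
    # Error information matrix: E[i][j] = sum over t of e_i(t) * e_j(t)
    info = [[sum(ei * ej for ei, ej in zip(errors[i], errors[j]))
             for j in range(n)] for i in range(n)]
    u = solve_linear(info, [1.0] * n)  # u = E^{-1} R
    total = sum(u)                     # R^T E^{-1} R
    return [v / total for v in u]


def combine(predictions, weights):
    """Weighted combination y(t) = sum over i of w_i * Y_i(t)."""
    horizon = len(predictions[0])
    return [sum(w * p[t] for w, p in zip(weights, predictions))
            for t in range(horizon)]
```

With the weights reported above, `combine([y1, y2, y3], [0.0642, 0.6378, 0.3283])` reproduces the parallel combination formula for three prediction series. Note that weights computed this way can be negative when the model errors are strongly correlated.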

According to the linear formula of the weighted combination model, we can get the prediction value of the parallel combination model from 2010 to 2020. In order to compare the combination prediction effect, we use the mean absolute deviation (MAD), the mean square error (MSE), and the mean absolute percentage error (MAPE) to evaluate the models.

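Let the original data sequence be $x^{(0)}(k)$ and the predicted data series be $\hat{x}^{(0)}(k)$, $k = 1, 2, \ldots, n$. The three indicators are calculated as follows:

$$\mathrm{MAD} = \frac{1}{n} \sum_{k=1}^{n} \left| x^{(0)}(k) - \hat{x}^{(0)}(k) \right|$$

$$\mathrm{MSE} = \frac{1}{n} \sum_{k=1}^{n} \left( x^{(0)}(k) - \hat{x}^{(0)}(k) \right)^{2}$$

$$\mathrm{MAPE} = \frac{1}{n} \sum_{k=1}^{n} \left| \frac{x^{(0)}(k) - \hat{x}^{(0)}(k)}{x^{(0)}(k)} \right| \times 100\%$$

The results show that the error indices of the combined model are smaller than those of the other three models, and the lower the index, the better the prediction effect of the model. Therefore, the combination model can make full use of the advantages of each model, and the component models complement each other.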
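The three evaluation indicators can be computed directly from paired actual and predicted values. The sketch below uses the usual definitions of MAD, MSE, and MAPE; the function names are illustrative, and MAPE assumes that no actual value is zero.

```python
def mad(actual, predicted):
    """Mean absolute deviation between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)
```

Evaluating each single model and the combined model on the same years with these functions gives the comparison described above, where lower values indicate a better fit.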

4.2. Series Combination Model

The traditional series grey neural network takes only the prediction result of the grey prediction model as its input and ignores the influence of other factors, even though these factors have a very important influence on the prediction result. Therefore, we take the prediction results of the grey model and the main factors affecting the tourism demand as the input of the neural network at the same time. By using the prediction error of the grey model to correct the weights during the training of the neural network, and by capturing the nonlinear effect of the influencing factors on the tourism demand, we can achieve the best fit between the predicted value and the observed value, as shown in Figure 5.

The model structure of the series grey neural network is as follows: the input layer of the neural network has five nodes, of which the first four are the main factors affecting the tourism demand and the fifth is the prediction result of the GM Markov model. The number of input nodes should not be too large because the time series used for training is limited. Too many nodes will lead to insufficient training of the neural network, which then cannot describe the complex relationship between input and output well. There is one node in the output layer, which is the number of tourists. The sample selection table is shown in Table 2.
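To make the five-node input concrete, the sketch below assembles one input row per year from four factor values and the grey model prediction for that year. The min-max scaling of each column to [0, 1] is an assumption added for illustration, since the source does not describe its normalization, and the names `min_max_scale` and `build_series_inputs` are hypothetical.

```python
def min_max_scale(column):
    """Scale a list of values to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def build_series_inputs(factors, gm_predictions):
    """factors[t] holds the four factor values for year t.

    Returns one five-value input row per year: the four scaled factors
    followed by the scaled grey model prediction.
    """
    rows = [list(f) + [g] for f, g in zip(factors, gm_predictions)]
    columns = [min_max_scale(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

The resulting rows, paired with the scaled number of tourists as the target, form the training samples for a network with five input nodes and one output node.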

The number of neurons in the hidden layer is calculated by the empirical formula. After trial calculation, it is finally selected as 8. The neural network is trained in 7638 steps and the trained neural network is used to predict the data of the next three years. The results are shown in Table 3 and Figure 6.

In the series combination model, the prediction results of the grey GM (1, 1) model and the factors affecting tourism demand are used as inputs of the neural network, and the number of tourists is used as the output to train the network. The trained network is used to predict the data in the past three years. The result shows that its mean square error is less than that of the single model, so it can be used for the prediction of tourism demand.

5. Results and Discussion

In the design of the three-layer BP neural network structure, the number of output layer nodes is 1, since the output is the data series of the number of tourists. When the input is the lagged (iterated) data series of the number of tourists, the number of input layer nodes is selected according to experience; when the input consists of the four influencing factors of tourism demand, the number of input layer nodes is equal to 4. The two kinds of BP neural network models are distinguished by whether the input layer and the output layer use the same type of time series. The single-variable BP neural network model (abbreviated as “single BPNN”) takes the tourist number time series as the input, and the multivariable BP neural network model (abbreviated as “multi BPNN”) takes the influencing factors as the input.

5.1. Univariate Model Prediction

For the time series of the total number of tourists, the input layer adopts the method of time series iteration to establish a single-variable BP neural network. The number of neurons in the hidden layer is calculated by the empirical formula. After trial calculation, it is finally selected as 8. After 8738 steps, the network meets the training requirements. The trained BP neural network is used for prediction, and the results are shown in Table 4 and Figure 7.
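The "time series iteration" used for the input layer can be read as a sliding window: each training sample takes the previous few values of the tourist series as inputs and the next value as the target. The sketch below shows that reading; the function name `make_windows` and the window length are illustrative assumptions, since the source does not state how many lagged values are used.

```python
def make_windows(series, n_lags):
    """Build (inputs, targets) pairs from a univariate time series.

    Each input is the n_lags values preceding the target value.
    """
    inputs, targets = [], []
    for k in range(n_lags, len(series)):
        inputs.append(series[k - n_lags:k])
        targets.append(series[k])
    return inputs, targets
```

For a series of length n, this yields n minus n_lags training samples, so a longer window leaves fewer samples for training on a short annual series.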

It can be seen from Table 4 that the average relative error of the BP neural network is 2.51%, which is 2.45 percentage points smaller than that of the GM (1, 1) model and 2.1 percentage points smaller than that of the GM Markov model. Its fitting effect is also better than that of both models.

The comparison clearly shows the excellent prediction effect of the BP neural network. However, the structure design of a neural network (such as the selection of the number of hidden layer neurons) depends on the prior knowledge and experience of the designer and lacks a rigorous design procedure with a theoretical basis. In addition, the trained weights and thresholds cannot be reused when the sample set changes: if new samples are added, the network needs to be retrained.

5.2. Multivariate Model Prediction

When the input layer inputs four influencing factors of tourism demand, a multivariable BP neural network can be established. A three-layer BP neural network with four input nodes and one output node is selected. The number of neurons in the hidden layer is calculated by the empirical formula. After trial calculation, 9 is selected. After 3956 steps, the network training is completed, and its performance curve is shown in Figure 8.
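A three-layer network with four input nodes, nine hidden nodes, and one output node can be sketched in plain Python as follows. This is a minimal illustration of back-propagation training, not the configuration used in this study: the sigmoid hidden layer, linear output node, learning rate, error goal, step limit, and synthetic data are all assumptions, and the inputs are assumed to be scaled to [0, 1].

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_network(n_in, n_hidden, seed=0):
    """Small random weights; one linear output node."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return w1, b1, w2, 0.0

def forward(net, x):
    w1, b1, w2, b2 = net
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, output

def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)

def train(net, data, lr=0.1, goal=1e-3, max_steps=5000):
    """Per-sample gradient descent; stops when the MSE reaches the goal."""
    w1, b1, w2, b2 = net
    for step in range(1, max_steps + 1):
        for x, y in data:
            hidden, out = forward((w1, b1, w2, b2), x)
            delta_out = out - y                      # error at the output node
            for j, h in enumerate(hidden):
                # Back-propagate through the sigmoid before updating w2[j].
                delta_h = delta_out * w2[j] * h * (1.0 - h)
                w2[j] -= lr * delta_out * h
                b1[j] -= lr * delta_h
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * delta_h * xi
            b2 -= lr * delta_out
        if mean_squared_error((w1, b1, w2, b2), data) <= goal:
            return (w1, b1, w2, b2), step
    return (w1, b1, w2, b2), max_steps

# Synthetic samples: four scaled factors and a made-up target (illustration only).
rng = random.Random(1)
data = []
for _ in range(20):
    x = [rng.random() for _ in range(4)]
    data.append((x, 0.4 * x[0] + 0.3 * x[1] + 0.2 * x[2] + 0.1 * x[3]))

net = init_network(n_in=4, n_hidden=9)
mse_before = mean_squared_error(net, data)
net, steps = train(net, data)
mse_after = mean_squared_error(net, data)
```

The number of steps needed to reach the error goal depends on the data, the initial weights, and the learning rate, so the step count from this sketch is not comparable with the 3956 steps reported above.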

The trained network is used to predict the samples. The fitting and prediction results are shown in Table 5 and Figures 9 and 10.

The multivariate BP neural network model predicts the number of tourists with an average relative error of 2.35%, compared with 2.51% for the single-variable BP neural network model. This shows that multifactor analysis can reflect the real situation of tourism demand more comprehensively than single-factor analysis. In theory, the number of influencing factors equals the number of neurons in the input layer, but if the network has too many inputs, the number of weights between the input layer and the hidden layer multiplies. Experience shows that the number of training samples should be 5–10 times the total number of network connection weights to achieve better mapping accuracy.
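The 5–10 times rule can be made concrete by counting the connection weights of a three-layer network. The sketch below uses hypothetical helper names; whether thresholds (biases) are counted as weights is an assumption left as an option.

```python
def connection_weights(n_in, n_hidden, n_out, include_biases=False):
    """Number of connection weights in a fully connected three-layer network."""
    count = n_in * n_hidden + n_hidden * n_out
    if include_biases:
        # Optionally count one threshold per hidden and output node (assumption).
        count += n_hidden + n_out
    return count

def recommended_samples(n_weights, low=5, high=10):
    """Sample-size range implied by the 5-10 times rule of thumb."""
    return low * n_weights, high * n_weights

# Multivariable network from this section: 4 inputs, 9 hidden nodes, 1 output.
multi_weights = connection_weights(4, 9, 1)        # 4*9 + 9*1 = 45
low, high = recommended_samples(multi_weights)     # 225 to 450 samples
```

Each additional input factor adds as many weights as there are hidden nodes, which is why a large input layer quickly raises the number of training samples the rule calls for.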

For the artificial neural network models analyzed above, the relevant performance evaluation indexes are calculated after training. After comparing the learning results of the networks, the evaluation indexes of each model on the test data are listed. The specific values are shown in Figure 6 and Table 6.

The MSE index is the mean square error of the network, which measures the average squared deviation between the predicted and actual values. R2 is the coefficient of determination, which indicates how well the model explains the variation in the sample data. Taken together, the two indicators show that the improved support vector machine network has the best learning and training effect and the highest test accuracy, followed by the SVM artificial neural network, and the BP artificial neural network also performs well. In contrast, the test results of the RBF artificial neural network are acceptable but not excellent, and its prediction accuracy is comparatively poor. We therefore use the grey model and the BP neural network model for prediction, which show excellent classification and prediction ability in processing the samples.
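Both indicators can be computed directly from the actual and predicted values. The sketch below assumes that R2 denotes the usual coefficient of determination, 1 - SS_res / SS_tot; the function names and the sample values are hypothetical.

```python
def mse(observed, fitted):
    """Mean square error between observed and fitted values."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed)

def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.0, 2.0, 3.0, 5.0]

model_mse = mse(observed, fitted)       # one squared error of 1 over 4 samples
model_r2 = r_squared(observed, fitted)  # SS_res = 1, SS_tot = 5
```

A lower MSE and an R2 closer to 1 both indicate a better fit, so the two indexes are read together when ranking the models.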

6. Conclusion

The particular nature of tourism products means that many factors influence tourism demand, so the prediction of tourism demand is complex and uncertain. A single-model method cannot meet the accuracy and stability requirements of tourism demand prediction because of its limitations, and the general combination forecasting model fails to take account of changes in the forecasting accuracy of each single model in different periods. BP neural network and grey prediction theory are widely used in tourism demand prediction, and with this wide application the limitations of the original models are becoming more and more prominent. Therefore, to improve the accuracy and stability of tourism demand forecasting, it is of great theoretical significance to study the improvement of the BP neural network and the grey prediction model and their combination prediction model. In this article, grey system theory, BP neural network theory, and the combination model method are used to model and predict tourism demand. Combining the two methods, the grey neural network combination model is established. The prediction accuracy of the model is analyzed using the actual time series data of the number of tourists. Finally, the models are compared using three accuracy indexes, namely, absolute error, mean square error, and mean absolute percentage error. The results show that the combined model retains the advantages of the single models; that is, it requires little information and achieves higher accuracy.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.