Abstract

In rural tourism, accurate visitor flow forecasts can help managers make better decisions. They help reduce crowding and material waste and can improve visitor safety, which makes them important for the sustainable development of tourism. However, the regional tourist flow of rural tourism is highly volatile, exhibits complex nonlinearity, and is susceptible to seasonal influences. Moreover, a single neural network model cannot learn both temporal and spatial correlation. Therefore, by examining the variables that affect regional tourist flow and integrating residual networks with a fully connected network, this research proposes an improved Quad-ResNet model for forecasting the regional tourist flow of rural tourism. Specifically, the model learns spatial correlation through deep convolution; combines four residual networks to learn temporal proximity, similarity, periodicity, and tendency; and uses one fully connected network to learn seasonal effects. The Quad-ResNet model is compared with LSTM, CNN, and ST-ResNet models on the same dataset in regional tourist flow prediction experiments. The results show that the Quad-ResNet model has lower error and is substantially simpler to train and predict with than the LSTM model, making it better suited to predicting regional visitor flows in rural tourism. The resulting model may serve as a useful decision-making tool for relevant stakeholders.

1. Introduction

Nowadays, people face increasing pressure from both work and daily life. Against this background, rural tourism has emerged as a new form of tourism and has become an important component of tourism products. Rural tourism is not limited to agricultural tourism activities; it also includes multifaceted activities such as walking, mountain climbing, horse riding, adventure, sport and health tourism, hunting and fishing, educational travel, cultural and traditional trips, and regional folk tourism activities. It gives stressed people a good opportunity to get in touch with nature and relax. More importantly, it can be an economic driving force for rural areas and contributes significantly to regional development [1]. Rural tourism can also bring jobs and support retail development [2]. In other words, tourism development is one of the effective ways to promote rural development [3]. Developing rural tourism is therefore significant for the construction of the new rural economy, and increasing importance is attached to rural tourism space as a suitable tourism carrier [4].

Rural tourism has a bright future in China. According to the latest estimates of the National Tourism Administration, the total annual tourist flow of rural tourism in China has reached 300 million, with revenue of more than 40 billion yuan, accounting for nearly one-third of total tourism. There are reportedly about 400 rural tourism demonstration sites in China, covering 31 provinces, autonomous regions, and municipalities in the mainland. In the three golden weeks each year (the Spring Festival, May Day, and National Day), about 70% of urban residents choose rural tourism, forming a rural tourism market of about 60 million person-times. From 2009 to 2013, China’s rural tourism revenue increased at an average annual rate of 43%. In October 2018, the National Development and Reform Commission and 13 other departments jointly issued the Action Plan for Promoting the Development and Upgrading of Rural Tourism (2018-2020), proposing to “encourage and guide social capital to participate in the development of rural tourism” and to increase policy support for rural tourism development. Earlier, the 2018 No. 1 document of the CPC Central Committee explicitly called for “implementing high-quality projects for leisure agriculture and rural tourism.” These developments indicate that China’s rural tourism will usher in a new round of investment and consumption growth. Rural tourism has clearly become a new growth point of tourism [5].

Rural tourism-related attractions and enterprises are usually small in scale, and many new sites open and begin operating within a short time. Managers often lack business management experience in the tourism industry. This tends to lead to attraction saturation, tourist congestion, and other problems, which in turn affect tourists’ travel experience and the sustainable development of rural tourism. Forecasting tourist flow is therefore an important aspect of the tourism economy in rural areas [6]. Accurate prediction of the tourist flow of rural tourism can provide decision-making references for managers. It helps to avoid tourist crowding and material waste, and it can also improve the safety of tourists. Thus, it is of great significance for further promoting the sustainable development of tourism [7].

With the development of computational techniques, specifically machine learning and deep learning, more and more algorithms can be applied to tourist flow prediction. Traditional methods include linear regression, the grey forecasting method, ARIMA, time series models, and the Markov forecasting model [8–18]. Although these methods have been applied in various studies with some success, they fail to accurately predict regional tourist flow with strong fluctuation and complex nonlinearity, and they cannot represent spatial correlation. Advances in deep learning provide a new solution for prediction problems. Algorithms such as the convolutional neural network, the long short-term memory neural network, and the residual network make it possible to predict tourist flow while taking both time and space into account [19–27].

The regional tourist flow of rural tourism is susceptible to seasonal effects and shows strong short-term correlation and strong volatility. To address this problem, this paper presents an improved Quad-ResNet model for forecasting the tourist flow of rural tourism. A rural tourism site, Jiayuguan in Gansu Province, was selected for the case study. A comparison of the prediction results with those of other models, namely LSTM, CNN, and ST-ResNet, shows that the developed model has better accuracy. The findings contribute to the body of knowledge on tourism management, and the model can also serve as a decision tool for managers of rural tourism.

2. Tourist Flow

2.1. Overview

This research uses pedestrian positioning data together with weather and holiday data to study the changing pattern of the regional tourist flow of rural tourism and to predict it for the next five minutes. The tourist area is divided by longitude and latitude into an $I \times J$ grid map, where each grid cell represents a region; the regional tourist flow of the grid cell in the $i$th row and $j$th column at time $t$ can then be expressed as follows:

$$x_t^{i,j} = \left|\left\{\, p \in P_t : p \in g_{i,j} \,\right\}\right|$$

where $P_t$ is the set of location points of pedestrians in the tourist scene at moment $t$, $p$ is a location point in the set $P_t$, $p \in g_{i,j}$ denotes that $p$ is in the grid cell $g_{i,j}$, and $|\cdot|$ refers to the cardinality of the set.
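The counting step described above can be illustrated with a minimal Python sketch. The function name `grid_flow`, the bounding-box arguments, and the choice to ignore points outside the tourist area are assumptions made for illustration; they are not taken from the original implementation.

```python
from typing import List, Tuple


def grid_flow(points: List[Tuple[float, float]],
              lon_range: Tuple[float, float],
              lat_range: Tuple[float, float],
              n_rows: int, n_cols: int) -> List[List[int]]:
    """Count the location points that fall in each grid cell at one time step."""
    lon_min, lon_max = lon_range
    lat_min, lat_max = lat_range
    flow = [[0] * n_cols for _ in range(n_rows)]
    for lon, lat in points:
        # Assumption: points outside the tourist area are ignored.
        if not (lon_min <= lon < lon_max and lat_min <= lat < lat_max):
            continue
        row = int((lat - lat_min) / (lat_max - lat_min) * n_rows)
        col = int((lon - lon_min) / (lon_max - lon_min) * n_cols)
        # Clamp to guard against floating-point rounding at the upper edge.
        flow[min(row, n_rows - 1)][min(col, n_cols - 1)] += 1
    return flow


# Hypothetical (longitude, latitude) points in a unit square split into 2 x 2 cells.
example = grid_flow([(0.1, 0.1), (0.1, 0.2), (0.9, 0.9), (1.5, 0.5)],
                    (0.0, 1.0), (0.0, 1.0), 2, 2)
```

Applying the function to the points recorded in each time interval would yield one flow matrix per moment, which is the form of input the model description below works with.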

2.2. Influence Factor Analysis
2.2.1. Spatial Effect

According to the first law of geography, everything is related to everything else, but near things are more related than distant things. Therefore, the tourist flow in areas neighboring a tourist attraction can be influenced by the movement of people between them, which reflects spatial proximity. Changes in tourist flow in two similar areas also show some correlation, which reflects spatial similarity.

2.2.2. Seasonal Effect

The term “seasonality” is often used in studies of the regional visitor flow of rural tourism. It describes the uneven distribution of flow over time as a result of weather and scheduling choices, which produces distinct low seasons in the tourism business. For example, in winter, when the weather is cold, regional tourist flow may show a downward trend. In the case study of the regional tourist flow of rural tourism, the factors considered are weather and official holidays.

2.2.3. Time Effect

The tourist flow to an attraction at a certain moment in time is influenced by the previous moments and can be specifically classified as similarity, proximity, tendency, and periodicity depending on the length of the time interval.

The difference in regional tourist flow of rural tourism between 12:00 and 11:00, 10:00 and 9:00 indicates that the majority of the areas have a small difference in tourist flow, reflecting the similarity in time.

Figure 1 shows the total tourist flow in all zones at five-minute intervals. The flow is low and decreasing from 0:00 to 5:00, increases from 5:00 to 11:00, peaks for the day around 12:00, and then oscillates until 21:00, after which it gradually decreases. At each moment of the day, the tourist flow is influenced by the neighboring moments and in turn influences the flow in the following moments, reflecting temporal proximity.

Figure 2 shows the 9:00 am tourist flow over a period of 6 months, sampled at intervals of 15 days. The flow at this same time of day increases as the temperature warms, reflecting the tendency of the regional tourist flow of rural tourism.

In Figure 3, the daily flow trends are broadly similar, and the tourist flow at the same time of day from 2019.9.3 to 2019.9.5 is similar, suggesting the periodicity of the regional tourist flow of rural tourism. The overall trend is also relatively stable, which suggests periodicity as well.

2.3. Quad-ResNet Model

The structure of the Quad-ResNet model is shown in Figure 4. The model is divided into five parts: four residual networks that model temporal proximity, similarity, periodicity, and tendency, and a two-layer fully connected network that models seasonal effects. Each residual network consists of one convolutional layer, a stack of residual cells, and one further convolutional layer, a structure that models spatial proximity and similarity. The outputs of the four residual networks, $X_c$, $X_s$, $X_p$, and $X_q$, are fused into $X_{Res}$ by a parametric matrix, and $X_{Res}$ is then fused with the output $X_{Ext}$ of the fully connected network. Finally, the fused output is mapped to $[-1, 1]$ by the tanh function.

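The structures of the four residual networks are the same. Taking the proximity part as an example, the residual cells based on BN (batch normalization) can be calculated as follows [28]:

$$X_c^{(l+1)} = X_c^{(l)} + f\left(\sigma\left(\mathrm{BN}_{\gamma,\beta}\left(X_c^{(l)}\right)\right)\right)$$

where $X_c^{(l)}$ is the input of the $l$th residual cell, $f$ refers to the convolution operations, $\sigma$ is the activation function ReLU, and $\gamma$ and $\beta$ are learnable parameters.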
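A residual cell of this kind can be sketched in plain Python as follows. This is a simplified illustration, not the authors' implementation: it assumes a single-channel feature map, one 3 × 3 convolution with zero padding, and batch normalization computed from the statistics of a single map rather than a mini-batch. The names `batch_norm`, `conv3x3_same`, and `residual_cell` are hypothetical.

```python
import math
from typing import List

Grid = List[List[float]]


def batch_norm(x: Grid, gamma: float, beta: float, eps: float = 1e-5) -> Grid:
    """Normalize one feature map by its own mean and variance, then scale and shift.

    Assumption: real batch normalization uses mini-batch statistics.
    """
    values = [v for row in x for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = gamma / math.sqrt(var + eps)
    return [[(v - mean) * scale + beta for v in row] for row in x]


def relu(x: Grid) -> Grid:
    return [[max(0.0, v) for v in row] for row in x]


def conv3x3_same(x: Grid, kernel: Grid) -> Grid:
    """3 x 3 convolution with zero padding, so the output keeps the input size."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += kernel[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = total
    return out


def residual_cell(x: Grid, kernel: Grid, gamma: float = 1.0, beta: float = 0.0) -> Grid:
    """Skip connection: x + Conv(ReLU(BN(x)))."""
    h = conv3x3_same(relu(batch_norm(x, gamma, beta)), kernel)
    return [[a + b for a, b in zip(rx, rh)] for rx, rh in zip(x, h)]
```

With an all-zero kernel the cell returns its input unchanged, which illustrates why stacking residual cells does not hinder the propagation of information through a deep network.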

The length of the input sequence for the proximity part is expressed as $l_c$, the time interval for proximity is $c$, and the input sequence is $X_{t-l_c \cdot c}$, $X_{t-(l_c-1) \cdot c}$, …, $X_{t-c}$. Then, the final output of the proximity part is denoted as $X_c$.

The length of the input sequence for the similarity part is expressed as $l_s$, the time interval for similarity is $s$, and the input sequence is $X_{t-l_s \cdot s}$, $X_{t-(l_s-1) \cdot s}$, …, $X_{t-s}$. Then, the final output of the similarity part is denoted as $X_s$.

The length of the input sequence for the periodicity part is expressed as $l_p$, the time interval for periodicity is $p$, and the input sequence is $X_{t-l_p \cdot p}$, $X_{t-(l_p-1) \cdot p}$, …, $X_{t-p}$. Then, the final output of the periodicity part is denoted as $X_p$.

The length of the input sequence for the tendency part is expressed as $l_q$, the time interval for tendency is $q$, and the input sequence is $X_{t-l_q \cdot q}$, $X_{t-(l_q-1) \cdot q}$, …, $X_{t-q}$. Then, the final output of the tendency part is denoted as $X_q$.
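The way the four input sequences are drawn from the history can be sketched as follows. The helper names `history_indices` and `quad_inputs` are hypothetical. The intervals in the example follow the parameter setting reported in Section 3.1.2, while the sequence lengths are placeholders chosen only for illustration.

```python
from typing import Dict, List, Tuple


def history_indices(t: int, length: int, interval: int) -> List[int]:
    """Time indices t - length*interval, ..., t - 2*interval, t - interval."""
    return [t - k * interval for k in range(length, 0, -1)]


def quad_inputs(t: int, parts: Dict[str, Tuple[int, int]]) -> Dict[str, List[int]]:
    """Map each temporal part to the time indices of its input frames.

    `parts` maps a part name to (sequence length, time interval).
    """
    earliest = t - max(length * interval for length, interval in parts.values())
    if earliest < 0:
        raise ValueError("not enough history before time step t")
    return {name: history_indices(t, length, interval)
            for name, (length, interval) in parts.items()}


# Intervals as reported in Section 3.1.2; sequence lengths are hypothetical.
parts = {"proximity": (3, 1), "similarity": (2, 14),
         "periodicity": (1, 312), "tendency": (1, 2018)}
inputs = quad_inputs(5000, parts)
```

Each list of indices selects the flow matrices that are stacked and passed to the corresponding residual network.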

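The proximity, similarity, periodicity, and tendency components are fused by a parametric matrix. The output after fusion is shown in the following equation:

$$X_{Res} = W_c \circ X_c + W_s \circ X_s + W_p \circ X_p + W_q \circ X_q$$

where $\circ$ refers to the Hadamard product; $X_c$, $X_s$, $X_p$, and $X_q$ are the outputs of the proximity, similarity, periodicity, and tendency parts; and $W_c$, $W_s$, $W_p$, and $W_q$ are learnable parameters, which are used to adjust the influence level of proximity, similarity, periodicity, and tendency.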

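After combining the fused residual output $X_{Res}$ and the output $X_{Ext}$ of the fully connected network, the predicted value $\hat{X}_t$ can be obtained through the tanh function:

$$\hat{X}_t = \tanh\left(X_{Res} + X_{Ext}\right)$$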
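The fusion and output steps can be illustrated with a small sketch that operates on nested lists. The function names are hypothetical, the weights in the example are fixed numbers rather than learned parameters, and the external output is assumed to have already been reshaped to the grid size.

```python
import math
from typing import List

Grid = List[List[float]]


def hadamard(w: Grid, x: Grid) -> Grid:
    """Element-wise (Hadamard) product of two grids of equal shape."""
    return [[wv * xv for wv, xv in zip(wr, xr)] for wr, xr in zip(w, x)]


def add(*grids: Grid) -> Grid:
    """Element-wise sum of any number of grids of equal shape."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*grids)]


def fuse_and_predict(outputs: List[Grid], weights: List[Grid], x_ext: Grid) -> Grid:
    """Weight each temporal output, sum them, add the external output, apply tanh."""
    x_res = add(*[hadamard(w, x) for w, x in zip(weights, outputs)])
    return [[math.tanh(v) for v in row] for row in add(x_res, x_ext)]


# Hypothetical 1 x 2 outputs of the four residual networks with equal weights.
outputs = [[[1.0, 2.0]] for _ in range(4)]
weights = [[[0.25, 0.25]] for _ in range(4)]
prediction = fuse_and_predict(outputs, weights, [[-1.0, 0.0]])
```

Because tanh maps its input to the open interval (-1, 1), the observed flow values would need to be scaled to the same range before the loss is computed; the scaling scheme itself is not specified in the text.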

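The loss can be calculated through MSE:

$$L(\theta) = \frac{1}{N} \sum_{n=1}^{N} \left(x_n - \hat{x}_n\right)^2$$

where $x_n$ and $\hat{x}_n$ are the observed and predicted tourist flow values, $N$ is the number of values, and $\theta$ denotes the learnable parameters of the model.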

3. Experimental Design and Analysis

3.1. Model Tuning Experiments
3.1.1. Dataset

The raw tourist flow data were sourced from the Tencent Location Big Data website by crawling the app location data of Jiayuguan in Gansu Province from February 3, 2021 to August 2, 2021 at a time interval of 5 minutes. These data were converted into the regional tourist flow of rural tourism, containing a total of 47,828 moments.

The historical weather data of Jiayuguan were used as the weather data for this attraction. The weather data contain 13 periods with four attributes: weather conditions, temperature, wind, and wind direction. The holiday data were sourced from an open API, with weekdays marked as 0, weekend rest days marked as 1, and legal holidays marked as 2.
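The holiday coding can be sketched as follows. The function name `day_type` and the holiday set are hypothetical; in the study the labels come from an open API, and the date listed below is only an illustrative placeholder.

```python
from datetime import date
from typing import Set


def day_type(d: date, legal_holidays: Set[date]) -> int:
    """Return 2 for a legal holiday, 1 for a weekend rest day, 0 for a weekday."""
    if d in legal_holidays:
        return 2
    if d.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday.
        return 1
    return 0


# Illustrative placeholder, not the holiday calendar used in the study.
holidays = {date(2021, 5, 3)}
codes = [day_type(d, holidays)
         for d in (date(2021, 2, 3), date(2021, 2, 6), date(2021, 5, 3))]
```

Checking the holiday set before the weekday test means that a legal holiday falling on a weekend is still coded as 2, which is an assumption about how overlaps are resolved.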

3.1.2. Parameter Setting

The Quad-ResNet model is trained using the Adam optimization algorithm with batch size set to 22, learning rate set to 0.0002, and loss function set to MSE. Interval is in units of 2 weeks. Since the time interval for the data set is 5 minutes, the proximity interval is 1, the similarity interval is 14, the periodic interval is 312, and the trend interval is 2,018.

3.1.3. Model Evaluation Metric

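The RMSE (root mean squared error) is used to evaluate the model and is calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left(x_n - \hat{x}_n\right)^2}$$

where $x_n$ and $\hat{x}_n$ are the observed and predicted values and $N$ is the number of predicted values. A smaller RMSE means that the model has less error and higher accuracy.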
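For reference, the MSE loss and the RMSE metric can be computed with a few lines of Python. The function names are illustrative, and the inputs are assumed to be flattened sequences of observed and predicted values of equal length.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error, in the same unit as the flow values."""
    return math.sqrt(mse(actual, predicted))
```

Since the RMSE is the square root of the MSE, minimizing the training loss also minimizes the evaluation metric on the training data.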

3.1.4. Model Selection

The parameter selection experiments mainly focus on the selection of proximity sequence length, periodic sequence length, similarity sequence length, tendency sequence length, and the number of residual cells.

Taking the proximity sequence length selection experiment as an example, the experimental results are shown in Figure 5. The RMSE reaches its minimum at one particular proximity sequence length, at which the accuracy of the model is highest. For longer proximity sequences, the RMSE is larger, indicating that a longer proximity sequence not only fails to improve the accuracy of the model but may also decrease it. Therefore, the proximity sequence length with the lowest error and the highest accuracy was chosen for the model. According to Figure 6, the length of the periodic sequence, the length of the similarity sequence, and the length of the tendency sequence were selected in the same way, and the number of residual cells is 3.

3.2. Comparative Experiment
3.2.1. Experiment Design

To verify the validity of the proposed model, the ST-ResNet, LSTM, and CNN models are compared with Quad-ResNet, with these three baseline models using the same parameter values as Quad-ResNet. All three baselines are widely applied deep learning models for prediction.

A residual neural network (ResNet) is a kind of artificial neural network (ANN) that builds on constructs known from pyramidal cells in the cerebral cortex. Residual neural networks use skip connections, or shortcuts, to jump over certain layers. Typical ResNet models are implemented with double- or triple-layer skips that contain nonlinearities (ReLU) and batch normalization in between. An additional weight matrix may be used to learn the skip weights; these models are referred to as HighwayNets. Models with several parallel skips are referred to as DenseNets. In the context of residual neural networks, a nonresidual network is referred to as a plain network. There are two primary reasons for adding skip connections to a sufficiently deep model: to mitigate vanishing gradients and to mitigate degradation (accuracy saturation). During training, the weights adapt to mute the upstream layer and amplify the previously skipped layer. In the simplest case, only the weights for the connection to the adjacent layer are adapted, with no explicit weights for the upstream layer. This works best when a single nonlinear layer is skipped or when all of the intermediate layers are linear. If not, an explicit weight matrix should be learned for the skipped connection (a HighwayNet should be used). Skipping effectively simplifies the network, so that fewer layers are used in the early training phases [29]. Because there are fewer layers to propagate through, the effect of vanishing gradients is reduced. The network then gradually restores the skipped layers as it learns the feature space. Toward the end of training, when all layers are expanded, it stays closer to the manifold and thus learns faster. A neural network without residual parts explores more of the feature space, which makes it more vulnerable to perturbations that cause it to leave the manifold and means that it requires more training data to recover.

The long short-term memory (LSTM) architecture is a kind of recurrent neural network (RNN) used in deep learning. In contrast to conventional feedforward neural networks, LSTMs have feedback connections. They can process not only individual data points (such as images) but also entire data sequences (such as speech or video). For instance, LSTMs may be applied to tasks such as unsegmented, connected handwriting recognition, speech recognition, and anomaly detection in network traffic or IDSs (intrusion detection systems). A typical LSTM unit is made up of four components: a cell, an input gate, an output gate, and a forget gate. The cell retains values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell. LSTM networks are well suited to classifying, processing, and forecasting time series data, since there may be lags of unknown duration between important events in a time series. LSTMs were designed to address the vanishing gradient problem that may occur when traditional RNNs are trained. Their relative insensitivity to gap length is an advantage of LSTMs over RNNs, hidden Markov models, and other sequence learning methods in many applications.
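The interplay of the cell and the three gates can be made concrete with a scalar sketch of one LSTM time step, using the standard gate equations. The function `lstm_step` and the parameter layout are hypothetical and reduced to a single hidden unit for readability; they do not reproduce the baseline configuration used in the experiments.

```python
import math
from typing import Dict, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x: float, h_prev: float, c_prev: float,
              params: Dict[str, Tuple[float, float, float]]) -> Tuple[float, float]:
    """One time step of a single-unit LSTM.

    `params` maps 'forget', 'input', 'output', and 'candidate' to
    (input weight, recurrent weight, bias).
    """
    def pre(name: str) -> float:
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value to write into the cell
    c = f * c_prev + i * g            # updated cell state
    h = o * math.tanh(c)              # updated hidden state
    return h, c
```

The additive update of the cell state is what lets information persist across many time steps, which relates to the mitigation of vanishing gradients mentioned above.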

The name “convolutional neural network” indicates that the network employs a mathematical operation called convolution. Convolutional networks are a specialized type of neural network that uses convolution in place of general matrix multiplication in at least one of its layers. Convolutional neural networks (CNNs or ConvNets) are a kind of artificial neural network most often used in deep learning for image analysis. They are also referred to as shift invariant or space invariant artificial neural networks (SIANN), owing to the shared-weight design of the convolution kernels or filters, which slide along input features and produce translation-equivariant outputs known as feature maps. Counterintuitively, most convolutional neural networks are only equivariant, rather than invariant, to translation. They are used in image and video recognition, recommender systems, image classification, image segmentation, medical image analysis, natural language processing, brain-computer interfaces, and financial time series analysis. CNNs are specialized multilayer perceptrons. The term multilayer perceptron usually refers to fully connected networks, in which each neuron in one layer is connected to every neuron in the next layer. Because of this full connectivity, these networks are prone to overfitting data. Regularization techniques commonly used to avoid overfitting include penalizing parameters during training (such as weight decay) or trimming connectivity (skipped connections, dropout, etc.). CNNs take a different approach to regularization: they exploit the hierarchical structure of data and assemble patterns of increasing complexity from smaller and simpler patterns embossed in their filters. Thus, on a scale of connectivity and complexity, CNNs are at the lower extreme. Convolutional networks were inspired by biological processes, in that their connectivity pattern resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only within a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap so that they cover the entire visual field. Compared with other image classification methods, CNNs need relatively little preprocessing. This means that the network learns to optimize the filters (or kernels) through automated learning, whereas in conventional techniques these filters are hand-engineered. This independence from prior knowledge and human intervention in feature extraction is a significant advantage.
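The translation equivariance of a shared-weight kernel can be checked with a small one-dimensional sketch. Circular (wrap-around) indexing is assumed here so that the property holds exactly at the borders; the function names are illustrative.

```python
from typing import List


def circular_conv1d(signal: List[int], kernel: List[int]) -> List[int]:
    """Slide one shared kernel along the signal, wrapping around at the end."""
    n = len(signal)
    return [sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
            for i in range(n)]


def shift(signal: List[int], steps: int) -> List[int]:
    """Circularly shift the signal to the right by `steps` positions."""
    steps %= len(signal)
    return signal[-steps:] + signal[:-steps] if steps else list(signal)


x = [1, 2, 3, 4, 0, 0]
k = [1, 0, -1]
# Equivariance: filtering a shifted input gives the shifted feature map.
shifted_then_filtered = circular_conv1d(shift(x, 2), k)
filtered_then_shifted = shift(circular_conv1d(x, k), 2)
```

The two results coincide, so the feature map moves together with the input. The feature map itself still changes when the input is shifted, which is the difference between equivariance and invariance noted above.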

3.2.2. Experiment Result Analysis

The results of each model can be seen in Figure 6. The RMSE of Quad-ResNet is the smallest, meaning that its accuracy is the highest. The RMSEs of LSTM-3, LSTM-6, and LSTM-14 are larger than that of Quad-ResNet but lower than those of the CNN and LSTM-2018 models, suggesting that the LSTM model can capture short-term temporal correlation effectively and obtain good predictions. However, spatial dependence also has an important influence on the predicted results. A model that considers only temporal dependence therefore cannot improve its accuracy any further.

The RMSEs of the LSTM-3, LSTM-6, LSTM-14, LSTM-144, and LSTM-312 models are relatively similar and significantly smaller than those of the LSTM-2018 model, while the RMSEs of the models decrease and then increase as the lookback increases, suggesting that the LSTM models may have difficulty capturing very long-term temporal correlations.

The RMSE of the CNN model is significantly larger than that of the Quad-ResNet, LSTM-3, LSTM-6, LSTM-14, LSTM-144, and LSTM-312 models, and only slightly smaller than that of the LSTM-2018 model, indicating that the shallow CNN may have difficulty in capturing sufficient spatial correlation, resulting in poor prediction accuracy of the model.

The RMSE of the ST-ResNet model is larger than that of the Quad-ResNet model. This is because the ST-ResNet model considers only temporal proximity, periodicity, and trend, but not temporal similarity, which prevents further improvement in its accuracy. The impact of short-term temporal correlation on the accuracy of regional tourist flow prediction for rural tourism is significantly greater than that of long-term temporal correlation. Because the Quad-ResNet model also considers temporal similarity, its experimental results are better than those of the ST-ResNet model.

4. Conclusion

People nowadays are under increasing pressure from the demands of both their professional and personal lives. Rural tourism is emerging as a new kind of tourism and is becoming a significant part of the overall product mix, and China’s rural tourism industry has a bright future. However, managers in the sector often lack business management expertise. This may lead to overcrowding, congestion, and other issues that negatively affect visitors’ travel experiences and the long-term viability of rural tourism. For rural tourism businesses, forecasting regional tourist flow is therefore an essential part of planning and strategy. This paper proposes a method for predicting the regional tourist flow of rural tourism using spatiotemporal residual networks. The approach predicts the regional tourist flow of rural tourism from pedestrian location data together with weather and holiday data and can thus identify the hotspots of tourist attractions. Three models are selected for comparison with the Quad-ResNet model, and the Quad-ResNet model is shown to be more suitable for predicting the regional tourist flow of rural tourism. However, some aspects still need to be improved, as the regional tourist flow of rural tourism in this study was obtained from pedestrian data. The factors considered in this paper do not represent all influences and do not account for unexpected events or unknown causes. If other influences can be explored and analyzed, the prediction model can be enhanced and its accuracy further improved.

Data Availability

The labeled dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest

The authors declare no competing interests.

Acknowledgments

This work was supported in part by the Heilongjiang Art and Science Planning Project “Research on the Integrated Development of Music Culture and Tourism Industry in Heilongjiang Province” (No. 2021B009).