Abstract

To help operators allocate and dispatch shared bikes and to guide the placement of electronic fences, this paper proposes a spatiotemporal graph convolution network prediction model (SGCNPM) that incorporates multiple factors to improve the accuracy of bike-sharing demand prediction. First, we consider time, built environment, and weather. We use a multigraph convolution network (GCN) to model the built environment, a long short-term memory (LSTM) network to extract temporal features, and a fully connected network (FCN) to model the influence of weather. We then construct SGCNPM, which fuses the GCN, LSTM, and FCN into a single prediction method that accounts for multiple factors. Results from a real case in Tianjin, China, show that the proposed model improves prediction accuracy. We also analyze the influence of each factor on the prediction results in different periods.

1. Introduction

Over the past few decades, with increasing concerns about global warming and rapid urbanization, many efforts have been put into cities to promote bike-sharing as a viable green mobility solution. The successful implementation of bike-sharing systems could provide alternative solutions to many urban problems such as traffic congestion, air pollution, energy shortages, and deterioration in human health. Taking a shared bike as an example, it is estimated that it will generate about 76 kg of carbon dioxide emissions in the whole life cycle of production, transportation, and scrapping. However, during its service life, the average cycling distance is more than 4,000 km, which is estimated to reduce carbon emissions by about 105 kg, enough to achieve “zero carbon” [1]. Bike-sharing promotes comprehensive resource conservation and recycling, meets people’s green travel needs with lower resource input and higher operation efficiency, and contributes to the realization of carbon peaking and carbon neutrality goals.

Dockless bike-sharing has been introduced in China since 2014 and has grown rapidly in the past few years, spreading to overseas countries, including the US and the UK. In 2022, the number of bike-sharing in China reached 23 million, far surpassing docked bike-sharing and becoming one of the most important modes of shared transportation. The bike-sharing service platforms, e.g., Hellobike and Mobike, are emerging technologies with the mobile Internet boom. The platforms dispatch and transfer bike-sharing to match supply with demand. At present, bike-sharing has problems such as low utilization rate, low turnover rate, and limited distribution. According to the report by the Beijing Municipal Commission of Transportation, in Beijing, the average daily turnover rate of bike-sharing is only 1.1, the average daily active bike-sharing accounts for only 16% of the total reported bike-sharing, and the average weekly active bike-sharing accounts for only 30% [2]. Understanding the short-term passenger demand in different spatial regions can help the platform and the operator solve this problem effectively. Then, bike-sharing can be dispatched to regions with more potential passenger demand to enhance the utilization and turnover rates.

The selection of factors is one of the most important steps in demand forecasting. In addition to time, weather and the built environment are also important factors. Weather factors include temperature and weather state. Regarding temperature, Heinen et al. found that travel demand was positively correlated with temperature between 0 and 20°C and reached its highest level between 20 and 30°C [3]. Regarding the weather state, Hyland and Hong found that rainfall and snowfall were the most unfavorable weather conditions and were negatively correlated with bike-sharing demand [4]. The built environment refers to the artificial environment provided for human activities, including the large-scale urban environment. It is generally characterized by diversity (or land use type), design (particularly design for public transportation), and so on [5, 6]. Regarding land use type, Eren and Uz concluded that bike-sharing demand was 15 times higher in business districts than in residential areas and 3–5 times higher in parks than in schools and subways [6]. Regarding the design for public transportation, Kabak et al. found that bike-sharing can effectively solve the last-kilometer connection problem and that bike-sharing demand was high near transportation hubs such as train, bus, and subway stations [7].

Although many scholars have carried out relevant research on short-term forecasting of dockless bike-sharing demand based on considering multiple factors, few of them considered the impact of both weather and the built environment. Sun et al. demonstrated positive associations between taxi demand and built environment variables as well as weather conditions [8].

This paper intends to propose a short-term forecasting method with multiple factors to enhance the accuracy of predicting the demand for bike-sharing. The contributions are summarized as follows. First, this paper considers time, built environment, and weather. This paper uses a multigraph convolution network (GCN) to model the built environment, utilizes a long short-term memory (LSTM) network to extract temporal features, and utilizes a fully connected network (FCN) to model weather influence. This paper constructs a spatiotemporal graph convolution network prediction model (SGCNPM) which can effectively fuse GCN, LSTM, and FCN, thus creating a prediction method considering the influence of multiple factors. The results of the real case in Tianjin, China, show that the proposed model can perform well in improving prediction accuracy. Also, this paper analyzes the influence of factors on model prediction results in different periods.

The rest of this paper is organized as follows. Section 2 reviews the literature and introduces the explanatory variables. Section 3 presents the modeling framework of this research. Section 4 describes the enterprise bike-sharing data and compares the predictive performance of SGCNPM with that of other models. Section 5 summarizes the research results and proposes future research directions.

2. Literature Review

There is much research on demand forecasting for bike-sharing. According to the research object, these studies can be divided into those on station-based systems and those on dockless systems.

Focusing on station-based bike-sharing, Sathishkumar et al. discussed five models for hourly rental demand prediction, including linear regression, gradient boosting machine, support vector machine, boosted trees, and extreme gradient boosting trees [9]. Li et al. proposed a spatial-temporal memory network to predict short-term bike-sharing usage [10]. Sohrabi et al. proposed generalized extreme value count models which can predict hourly arrivals and departures at each station [11]. Reynaud et al. proposed a panel mixed generalized ordered logit model to estimate hourly bicycle availability; the model accommodated exogenous variables and a station-level model [12]. Hu et al. proposed a set of generalized additive models to delineate temporal interactions between station-level daily bike-sharing usage and independent variables, including land use, station characteristics, and COVID-19 [13]. Collini et al. used bidirectional long short-term memory (Bi-LSTM) networks to predict the number of available bikes and free bike slots in bike-sharing stations [14]. Mehdizadeh Dastjerdi and Morency first used the Louvain algorithm to identify six communities in the bike-sharing network and utilized CNN-LSTM to predict pickup demand in each community [15]. Unlike station-based bike-sharing, dockless sharing bikes can be parked by users in any appropriate place. This characteristic improves the availability of bicycles and service coverage but increases prediction difficulty. The forecast method for station-based bicycles cannot be used directly for dockless bike-sharing systems.

For the dockless bike-sharing system, Chang et al. utilized deep learning algorithms to predict the number and location of shared bikes, adopting an encoder-decoder architecture with an embedded attention mechanism to further enhance prediction ability [16]. Shang et al. used big data to analyze the impacts of COVID-19 on the user behaviors and environmental benefits of bike-sharing [17]. They employed topological indices from complex network theory to analyze the transformation of user behavior patterns. Yang et al. analyzed dockless bike-sharing in Nanchang, China, over a period when a new metro line came into operation, using spatial statistics and graph-based approaches to quantify changes in travel behaviors [18]. Ai et al. proposed a convolutional long short-term memory (Conv-LSTM) network to forecast the short-term distribution of dockless bike-sharing, which addressed both spatial and temporal dependence [19]. Li and Shuai proposed a deep learning model called CLTFP to predict the travel distance and OD distribution of bike-sharing under different conditions of time and space [20]. Previous studies have mainly considered temporal correlation when modeling dependencies, whereas this paper finds that the built environment is also important for bike-sharing demand forecasting. For example, suppose area A and area B are far apart. If the land use types of the two areas are similar, or if the two areas are connected by urban rail, the two areas will influence each other, and their demand for bike-sharing may be similar.

Considering the built environment, Xu et al. used a four-month GPS dataset to reconstruct the temporal usage patterns of shared bikes at different places and applied an eigendecomposition approach to uncover their hidden structures [21]. Li et al. applied ordinary least squares (OLS) regression and geographically weighted regression (GWR) models to explore how the built environment and social-demographic characteristics influence bike-sharing utilization [22]. Dong et al. proposed DestiFlow based on points of interest (POIs) clustering to predict the demand for dockless bike-sharing [23]. Yan et al. investigated the travel distance distributions of dockless bike-sharing near metro stations to provide the basis for the service area of dockless bike-sharing [24]. Li et al. utilized the data of the dockless bike-sharing service Mobike to quantify short-trip transportation patterns and analyze the comprehensive view of mobility patterns [25]. Li et al. proposed a framework based on the gravity model and Bayesian rules to infer the purpose of dockless bike-sharing trips at the individual level [26]. In addition to bike-sharing, many scholars have also made short-term forecasts for other fields. Zeng et al. proposed a DWT-Bi-LSTM model to predict parking space availability based on historical parking data [27]. Ma et al. proposed a short-term traffic flow prediction model to improve the accuracy of short-term traffic flow prediction [28]. Ziheng et al. proposed a deep learning model MOS-BiAtten to predict ride-hailing demand during COVID-19 in Beijing [29]. Zhu et al. proposed a deep learning model to achieve accurate and stable taxi demand in dynamic areas [30].

Many studies have analyzed demand and demonstrated the importance of weather and the built environment for demand forecasting. However, few researchers have considered both the built environment and the weather as variables. This paper summarizes the variables involved in the main literature, as shown in Table 1. The purpose of this paper is to explore bike-sharing demand forecasting that accounts for both the built environment and the weather.

2.1. Preliminaries

The short-term bike-sharing demand forecasting is essentially a time-series forecast problem. The nearest historical demand can provide valuable information for forecasting future demand. This paper also observes that the built environment influences the short-term bike-sharing demand. In this paper, the built environment is characterized by two factors, land use type and accessibility of public transportation.

When forecasting a region’s demand, other regions with similar functions can be intuitively referred to. If both region A and region B are residential areas, the spatiotemporal characteristics of bike-sharing demand in the two regions are comparable, and mutual reference can be made in predicting demand. Public transportation accessibility is also an essential factor in spatiotemporal prediction. Objectively speaking, geographically remote but accessible areas can be correlated. This connection is caused by public transportation such as buses and subways. Furthermore, the attributes of time-of-day, day-of-week, and weather conditions also impact the short-term bike-sharing demand. In this paper, the research area is divided into several grids consistently. Each grid refers to a zone and is represented by (m, n). Also, a day is divided into different periods according to the same time interval. Then, the related variables are defined as follows.

2.1.1. Demand Intensity

The demand at the t-th time slot (hour) in the grid (m, n) is defined as the number of demands within the grid during this time slot, which is denoted by $d_t^{m,n}$. The bike-sharing demand in all grids at the t-th time slot is defined as the matrix $D_t \in \mathbb{R}^{M \times N}$ ($\mathbb{R}$ is the real set, and $M \times N$ is the size of the grid), in which the (m, n)-th element is $d_t^{m,n}$.
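As a minimal sketch of this definition, the following Python snippet counts trips per grid cell and hourly slot. The record layout `(t, m, n)`, the helper name `build_demand_tensor`, and the toy grid size are illustrative assumptions, not the preprocessing code used in this paper.

```python
import numpy as np

def build_demand_tensor(trips, n_slots, n_rows, n_cols):
    """Count trips per (time slot, grid row, grid column).

    `trips` is an iterable of (t, m, n) integer triples (hypothetical layout):
    t is the hourly slot index and (m, n) is the grid cell where the trip starts.
    """
    demand = np.zeros((n_slots, n_rows, n_cols), dtype=np.int64)
    for t, m, n in trips:
        demand[t, m, n] += 1
    return demand

# Toy example: three trips on a 3 x 3 grid over two hourly slots.
trips = [(0, 0, 1), (0, 0, 1), (1, 2, 2)]
demand = build_demand_tensor(trips, n_slots=2, n_rows=3, n_cols=3)
```

Here `demand[t]` plays the role of the demand matrix at the t-th time slot.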

2.1.2. Land Use Type

The number of POIs of each land use type is used to measure this factor. Let $p_i^{m,n}$ denote the number of POIs of land use type i in the grid (m, n). Land use is divided into nine categories: subway stations, parks, shopping centres, training institutions, office buildings, schools, well-known enterprises, housing estates, and comprehensive restaurants. The number of POIs of land use type i in all grids is defined as the matrix $P_i \in \mathbb{R}^{M \times N}$ ($\mathbb{R}$ is the real set), in which the (m, n)-th element is $p_i^{m,n}$.

2.1.3. Accessibility of Public Transportation

The public transport system is the foundation of public transport accessibility. The spatial layout of the public transport network, the actual operation scheme, and the cooperation between the modes of public transport and rail transit within the system all affect travellers' choice of paths and modes in the public transport system, and thus affect public transport accessibility. Let $c^{m,n}$ indicate the accessibility of public transport in the grid (m, n), representing the number of public transportation modes from area m to area n. The accessibility of public transportation in all grids is defined as the matrix $C \in \mathbb{R}^{M \times N}$ ($\mathbb{R}$ is the real set), in which the (m, n)-th element is $c^{m,n}$.

2.1.4. Day

By empirically examining the distribution of demand intensity with respect to time in the training set, day-of-month, day-of-week, and time-of-day are selected. Let $x_{da} \in [1, 31]$ denote the day-of-month, $x_{we} \in [1, 7]$ denote the day-of-week, and $x_{tp} \in [1, 24]$ denote the time-of-day. In addition, a dummy variable $x_{ho}$ is introduced to capture the differences between rest days (including holidays and weekends) and working days (i.e., weekdays). It takes one value on rest days and another on working days.

2.1.5. Weather

This paper considers five categories of weather variables: weather state, maximum temperature, minimum temperature, wind level, and air quality index. Let $x_{wt}$ denote the weather state, a categorical variable that takes one value per weather condition.

Let $x_{ma} \in [-50, 50]$, $x_{mi} \in [-50, 50]$, $x_{wl} \in [0, 5]$, and $x_{aq} \in [0, \infty)$ denote the maximum temperature, minimum temperature, wind level, and air quality index, respectively. The weather state, the maximum temperature, and the minimum temperature each take a single value for a time interval, while the wind level and the air quality index are averaged over each time interval.
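The day variables of Section 2.1.4 can be derived from a timestamp roughly as follows. This is a hedged sketch: the function name `calendar_features`, the mapping of hour 0 to 23 onto 1 to 24, the Monday-first weekday numbering, and the coding of the dummy variable (1 for rest days, 0 for working days) are assumptions made for illustration.

```python
from datetime import date, datetime

def calendar_features(ts, holidays=frozenset()):
    """Return (x_da, x_we, x_tp, x_ho) for a timestamp.

    Assumed coding: x_we runs from 1 (Monday) to 7 (Sunday), x_tp from 1 to 24,
    and x_ho is 1 on rest days (weekends or listed holidays) and 0 otherwise.
    """
    x_da = ts.day
    x_we = ts.isoweekday()
    x_tp = ts.hour + 1
    x_ho = 1 if (x_we >= 6 or ts.date() in holidays) else 0
    return x_da, x_we, x_tp, x_ho

# A Monday morning, and a public holiday that falls on a Wednesday.
workday = calendar_features(datetime(2019, 5, 6, 8))
holiday = calendar_features(datetime(2019, 5, 1, 8), holidays={date(2019, 5, 1)})
```

The weather variables can be appended to the same feature vector once they are aligned to the same hourly time slots.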

3. Methodology

3.1. Framework

SGCNPM consists of two submodels: (1) the spatiotemporal and built environment variable model based on multigraph convolution network and LSTM and (2) the weather variable model based on fully connected network. The output results of the two submodels are fused with different weights to obtain the bike-sharing demand of each region, as shown in Figure 1.

First, we encode the relationships between regions through multiple graphs that capture distance, POI similarity, and public transport accessibility. In the first submodel, the multigraph convolution network captures these non-Euclidean relationships, the LSTM extracts temporal features, and the submodel outputs its prediction. In the second submodel, the historical demand and the weather labels are fed into the fully connected network, which outputs its own prediction. Lastly, the two results are weighted and fused to produce the forecast value.

At the spatial level, multigraph convolution network is used to model the non-Euclidean relationship [32].

At the temporal level, the LSTM models the historical travel characteristics of bike-sharing and thereby captures the temporal pattern of bike-sharing demand.

3.2. Spatial Modeling
3.2.1. Distance Map

Each grid is regarded as a vertex of the graph, the centre of the grid is the vertex's location, and the distance between vertices is the distance between the centres of the grids. If two grids are close to each other, their demand for bike-sharing tends to be similar. The set of distances can be expressed as $\{\mathrm{dis}(a_m, a_n)\}$, where $a_m$ and $a_n$ represent the centres of region m and region n, respectively, and $\mathrm{dis}(a_m, a_n)$ denotes the Euclidean distance between region m and region n.
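One way to turn the centre-to-centre distances into a graph is sketched below. The thresholding rule in `distance_graph` is an assumption for illustration only; this section does not state how distances are converted into edge weights, and the helper names and coordinates are hypothetical.

```python
import numpy as np

def distance_matrix(centres):
    """Pairwise Euclidean distances between grid centres, shape (V, V)."""
    centres = np.asarray(centres, dtype=float)
    diff = centres[:, None, :] - centres[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_graph(centres, threshold):
    """Assumed rule: connect two regions when their centres lie within `threshold`."""
    dist = distance_matrix(centres)
    adj = (dist <= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

# Three cells in a row on a hypothetical 500 m grid (coordinates in metres).
centres = [(0, 0), (500, 0), (1000, 0)]
adj = distance_graph(centres, threshold=500)
```

A smooth alternative, such as an inverse-distance or Gaussian kernel weight, would fit the same interface.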

3.2.2. POI Map

When forecasting a region's demand, one can refer to other regions with similar POIs. The function of a region can be represented by the categories of POI in the region, and the similarity of POI between two regions can be represented by $\mathrm{sim}(b_m, b_n)$, where $b_m$ and $b_n$ represent the POI vectors of region m and region n, respectively, the vector dimension is the number of POI categories, and each entry is the number of POIs of the corresponding category.
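The POI similarity between regions can be sketched as follows. Cosine similarity is used here as an assumed similarity measure, since the exact formula is not recoverable from this section; the function name `poi_similarity` is hypothetical.

```python
import numpy as np

def poi_similarity(poi_counts):
    """Cosine similarity between rows of a (V, n_categories) POI count matrix."""
    b = np.asarray(poi_counts, dtype=float)
    norms = np.linalg.norm(b, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # regions without POIs get zero similarity
    unit = b / norms
    return unit @ unit.T

# Four toy regions with two POI categories each.
sim = poi_similarity([[1, 0], [2, 0], [0, 3], [0, 0]])
```

Regions 0 and 1 have the same POI mix at different scales, so their similarity is 1, while regions 0 and 2 share no category and score 0.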

3.2.3. Interconnection Map

The transportation system is an important factor that affects spatiotemporal prediction. Although some areas are geographically far apart, they can be connected by public transportation. Public transport interoperability can be expressed by $\mathrm{con}(m, n)$, which indicates the accessibility of public transport between region m and region n. If there are k kinds of public transport connections between the two regions, the value equals k.
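A minimal sketch of the interconnection graph is given below. The input format (one `(m, n)` pair per bus or subway connection serving both regions) and the function name `connectivity_graph` are assumptions for illustration.

```python
import numpy as np

def connectivity_graph(links, n_regions):
    """Count public transport connections between region pairs.

    `links` is an iterable of (m, n) region index pairs, one per connection
    that serves both regions (hypothetical input format).
    """
    con = np.zeros((n_regions, n_regions))
    for m, n in links:
        if m != n:
            con[m, n] += 1
            con[n, m] += 1  # connections are treated as undirected
    return con

# Regions 0 and 2 share two connections; regions 1 and 2 share one.
con = connectivity_graph([(0, 2), (0, 2), (1, 2)], n_regions=3)
```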

3.2.4. Modeling Spatiotemporal Dependency by Multigraph Convolution

This paper uses the following formula to perform convolution over the three types of graphs above:

$$X_{l+1} = \sigma\left( \coprod_{A \in \mathbb{A}} f(A; \theta_i)\, X_l W_l \right),$$

where $X_l$ and $X_{l+1}$ are the feature vectors of the |V| regions in layers l and l + 1, σ represents the activation function, ∐ represents the aggregation function, $\mathbb{A}$ represents the graph set, $f(A; \theta_i)$ represents the aggregation matrix of different samples based on graph A and parameterized by $\theta_i$, and $W_l$ represents the feature transformation matrix. For instance, if $f(A; \theta_i)$ is a polynomial function of the Laplacian matrix L, the layer becomes ChebNet; if $f(A; \theta_i)$ is the identity matrix, the layer reduces to a fully connected network. In this paper, $f(A; \theta_i)$ is a K-order polynomial function of the graph Laplacian matrix L. Figure 2 shows an example in which a central region transforms its value by graph convolution. Suppose the entries of the adjacency matrix are 0 or 1; then $L^k_{ij} \neq 0$ represents that region i can reach region j in k hops. In the convolution operation, k defines the size of the receptive field when spatial features are extracted.

Figure 2 describes the graph convolution operation. Left: the centralized region is black, the adjacent area is yellow, and the peripheral area is blue. Middle: as the Laplacian degree of the graph increases, more areas change from blue to green. Right: the output layer is the sum of graph transformations in which the degree increases from 1 to K.

The spatial variables involved in this paper include distance, POI, and public transport connection; variables such as intersections can also be considered in future research. The multigraph convolution models spatial correlation by extracting features from the relationships between regions. When the receptive field is small, feature extraction concentrates on nearby regions. Increasing the degree of the graph Laplacian polynomial or stacking multiple convolutional layers enlarges the receptive field and captures more global correlations.
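The multigraph convolution described in this subsection can be sketched in NumPy as follows. This is an illustrative simplification under stated assumptions, not the implementation used in this paper: each polynomial order gets a single scalar coefficient, the aggregation function is a plain sum over graphs, the Laplacian is the symmetric normalized one rescaled for Chebyshev polynomials, and ReLU is the activation. All function names are hypothetical.

```python
import numpy as np

def scaled_laplacian(adj):
    """Rescaled normalized Laplacian 2L/lambda_max - I of a symmetric graph."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    return 2.0 * lap / lam_max - np.eye(len(adj))

def cheb_conv(adj, x, thetas):
    """Chebyshev graph convolution: sum_k thetas[k] * T_k(L~) @ x."""
    lap = scaled_laplacian(adj)
    t_prev, t_curr = np.eye(len(lap)), lap
    out = thetas[0] * (t_prev @ x)
    if len(thetas) > 1:
        out = out + thetas[1] * (t_curr @ x)
    for theta in thetas[2:]:
        # Chebyshev recurrence: T_k = 2 L~ T_{k-1} - T_{k-2}
        t_prev, t_curr = t_curr, 2.0 * lap @ t_curr - t_prev
        out = out + theta * (t_curr @ x)
    return out

def multigraph_conv(adjs, x, thetas_per_graph, weight):
    """Sum the per-graph convolutions, transform features, apply ReLU."""
    agg = sum(cheb_conv(a, x, th) for a, th in zip(adjs, thetas_per_graph))
    return np.maximum(agg @ weight, 0.0)

# Toy example: 3 regions, 1 input feature, K = 2 (three coefficients per graph).
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
x = np.array([[1.0], [2.0], [3.0]])
hidden = multigraph_conv([adj], x, [[0.5, 0.3, 0.2]], np.ones((1, 4)))
```

With K = 2, each output row mixes information from regions up to two hops away in each graph, which is the receptive field discussed above.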

3.3. Temporal Modeling

LSTM network is a variant of recurrent neural network (RNN), which was proposed to solve the long-term dependence problem. As can be seen from the cellular structure diagram of LSTM in Figure 3, LSTM adopts three kinds of gating mechanisms to solve the long-term dependence problem in traditional RNN.

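The three types of LSTM gating units are the forget gate f, the input gate i, and the output gate o. The forget gate controls the information to be discarded from the cell state. The state update formula of the forget gate is as follows:

$$f(t) = \sigma\left(W_f\left[h(t-1), d(t)\right] + b_f\right).$$

The input gate controls how much of the input d(t) of the current network will remain in the cell state C(t), and the state update formulas for the input gate are

$$i(t) = \sigma\left(W_i\left[h(t-1), d(t)\right] + b_i\right),$$

$$\tilde{C}(t) = \tanh\left(W_c\left[h(t-1), d(t)\right] + b_c\right),$$

$$C(t) = f(t) \odot C(t-1) + i(t) \odot \tilde{C}(t).$$

The output gate controls how much information can be output from the current cell state. The state update formulas of the output gate are

$$o(t) = \sigma\left(W_o\left[h(t-1), d(t)\right] + b_o\right),$$

$$h(t) = o(t) \odot \tanh\left(C(t)\right).$$

Here, σ represents the sigmoid function, which maps its input to a vector with entries in [0, 1]; h(t) represents the hidden state output by the cell; $\tilde{C}(t)$ represents the candidate cell information; ⊙ denotes element-wise multiplication; $W_f$, $W_i$, $W_o$, and $W_c$ represent the weight coefficient matrices in the LSTM cell state update process; and $b_f$, $b_i$, $b_o$, and $b_c$ represent the biases in the state update process.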
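The gate updates described above correspond to the standard LSTM cell, which can be sketched in NumPy as a single time step. The function name `lstm_step`, the parameter dictionary layout, and the hidden-state symbol `h` are illustrative choices, not part of this paper's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(d_t, h_prev, c_prev, params):
    """One LSTM update; `params` maps 'f', 'i', 'o', 'c' to (W, b) pairs."""
    concat = np.concatenate([h_prev, d_t])                        # [h(t-1), d(t)]
    f = sigmoid(params["f"][0] @ concat + params["f"][1])         # forget gate
    i = sigmoid(params["i"][0] @ concat + params["i"][1])         # input gate
    o = sigmoid(params["o"][0] @ concat + params["o"][1])         # output gate
    c_tilde = np.tanh(params["c"][0] @ concat + params["c"][1])   # candidate cell
    c_t = f * c_prev + i * c_tilde                                # new cell state
    h_t = o * np.tanh(c_t)                                        # new hidden state
    return h_t, c_t

# Toy call: 3 input features, 2 hidden units, all weights and biases zero.
zero_params = {g: (np.zeros((2, 5)), np.zeros(2)) for g in "fioc"}
h_t, c_t = lstm_step(np.ones(3), np.zeros(2), np.array([2.0, -2.0]), zero_params)
```

With zero weights every gate equals 0.5, so the cell state is simply halved, which makes the role of the forget gate easy to see.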

This paper introduces an LSTM with a contextual gate mechanism to model the correlation between observations at different times, as shown in Figure 4.

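First, the information in the relevant regions is considered as contextual information and is convolved by a K-order graph convolution $F_G^K$ using the corresponding graph Laplacian matrix, as in (10). The contextual gating mechanism performs graph convolution before pooling so that the pooled result contains topological information:

$$\hat{X}^{(t)} = \left[X^{(t)},\ F_G^{K}\left(X^{(t)}\right)\right], \quad t = 1, \ldots, T, \tag{10}$$

where $X^{(t)}$ is the observation at time step t and T is the number of time steps.

Second, this paper uses global average pooling $F_{pool}$ to aggregate the information of all nodes into one node:

$$z^{(t)} = F_{pool}\left(\hat{X}^{(t)}\right) = \frac{1}{|V|}\sum_{i=1}^{|V|}\hat{X}^{(t)}_{i,:}. \tag{11}$$

Third, the weight s is generated by a nonlinear transformation of the vector z using the attention operation in (12), where δ and σ are the ReLU and sigmoid functions, respectively, and $W_1$ and $W_2$ are learnable weight matrices:

$$s = \sigma\left(W_2\,\delta\left(W_1 z\right)\right). \tag{12}$$

Finally, s is applied to weight each time step:

$$\tilde{X}^{(t)} = X^{(t)} \circ s^{(t)}, \quad t = 1, \ldots, T. \tag{13}$$

After obtaining multiple graphs weighted by time steps, multiple graphs are fused into one graph using weight-sharing LSTM:

$$H_{i,:} = \mathrm{LSTM}\left(\tilde{X}^{(1)}_{i,:}, \ldots, \tilde{X}^{(T)}_{i,:};\ W_3\right), \quad i = 1, \ldots, |V|. \tag{14}$$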
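The pooling and reweighting steps of the contextual gate can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions: the graph-convolved context term is omitted, so pooling uses the raw observations only, and the names `contextual_gating`, `w1`, and `w2` are hypothetical.

```python
import numpy as np

def contextual_gating(x, w1, w2):
    """Reweight the time steps of x, shape (T, V, P), with a learned gate.

    Simplification: the graph-convolved context is left out, so the pooled
    vector is computed from the raw observations only.
    """
    z = x.mean(axis=(1, 2))                    # global average pooling -> (T,)
    hidden = np.maximum(w1 @ z, 0.0)           # ReLU(W1 z)
    s = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid(W2 ReLU(W1 z)) -> (T,)
    return x * s[:, None, None], s

# Toy call: T = 4 time steps, V = 3 regions, P = 2 features, zero weights.
x = np.ones((4, 3, 2))
gated, s = contextual_gating(x, np.zeros((5, 4)), np.zeros((4, 5)))
```

The gated tensor would then be passed, region by region, to an LSTM whose weights are shared across regions.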

3.4. Environment Modeling

This submodel reflects the influence of environmental factors on the demand for bike-sharing. Its main structure is a fully connected network applied to time-series variables, including weather state, maximum temperature, minimum temperature, wind level, and air quality index. The submodel outputs its own demand prediction.

3.5. Model Fusion

The results of the two submodels are weighted and fused to obtain the demand prediction of bike-sharing, and the calculation formula is as follows:

$$\hat{y} = w_1 \hat{y}_1 + w_2 \hat{y}_2,$$

where $\hat{y}$ is the demand prediction of bike-sharing, $\hat{y}_1$ and $\hat{y}_2$ represent the outputs of the two submodels, and $w_1$ and $w_2$ represent the corresponding fusion weights.
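The weighted fusion, together with a simple search over candidate weights, can be sketched as follows. The sketch assumes that the two fusion weights sum to 1 and that the weight is chosen by the lowest MAE on held-out data; both are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np

def fuse(pred_st, pred_weather, w_st):
    """Weighted sum of the two submodel outputs (weights assumed to sum to 1)."""
    return w_st * pred_st + (1.0 - w_st) * pred_weather

def best_weight(pred_st, pred_weather, target, candidates):
    """Pick the candidate weight with the lowest MAE on held-out data."""
    maes = [np.abs(fuse(pred_st, pred_weather, w) - target).mean()
            for w in candidates]
    return candidates[int(np.argmin(maes))]

# Toy data: the target lies 60% of the way from the first to the second output.
pred_st = np.array([10.0, 10.0])
pred_weather = np.array([20.0, 20.0])
target = np.array([16.0, 16.0])
w = best_weight(pred_st, pred_weather, target, [0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
```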

4. Experiments and Results

Since bike-sharing entered Tianjin in February 2017, the development of the industry has gone through four stages: early stage, development stage, boom stage, and maturing stage. Initially, Tianjin also suffered from excessive and indiscriminate deployment of shared bikes, with about 1 million bikes in the city. Accurate prediction of bike-sharing demand can help the Tianjin government deploy and dispatch shared bikes rationally, help bike-sharing better fulfil its role in green travel and slow transportation, and keep Tianjin among the leading cities in China in the management of bike-sharing companies.

In this section, this paper first preprocesses the data, then predicts the bike-sharing demand with SGCNPM, and finally compares its prediction performance with that of other models.

4.1. Data Collection

In this paper, three types of data collection work are involved: spatiotemporal variables, built environment variables, and weather variables.

4.1.1. Spatiotemporal Variables

The dataset utilized in this paper is extracted from Hellobike and Mobike, the top two bike-sharing service platforms in China, for two months, from May 1, 2019, to June 30, 2019. As shown in Figure 5, the studied site is in the Heping District, surrounded by solid black lines. The dataset is partitioned into 1-hour time intervals, and the investigated region is partitioned into 53 regions by 500 m × 500 m grids; this paper also studies grids connected to regions, so the studied region includes 53 grids, which are filled with yellow.

The demand dataset is divided into a 70% training set comprising the observations between May 1 and June 11, a 10% validation set comprising the observations between June 12 and June 17, and a 20% test set comprising the observations between June 18 and June 30. Figure 6 shows the total demand of all grids on different days within the training period (before the red dashed line), the validation period (between the red dashed line and the green dashed line), and the testing period (after the green dashed line).
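The chronological split described above can be sketched as follows. The function name `split_by_date` is hypothetical; the boundary dates are taken from the text, and the hourly timestamps are generated only to illustrate the resulting set sizes.

```python
from datetime import datetime, timedelta

def split_by_date(timestamps, val_start, test_start):
    """Return index lists for train, validation, and test in time order."""
    train = [i for i, ts in enumerate(timestamps) if ts < val_start]
    val = [i for i, ts in enumerate(timestamps) if val_start <= ts < test_start]
    test = [i for i, ts in enumerate(timestamps) if ts >= test_start]
    return train, val, test

# Hourly slots from May 1 to June 30, 2019 (61 days).
start = datetime(2019, 5, 1)
hours = [start + timedelta(hours=h) for h in range(61 * 24)]
train, val, test = split_by_date(hours, datetime(2019, 6, 12), datetime(2019, 6, 18))
```

This yields 1,008 training hours, 144 validation hours, and 312 test hours, which is close to the 70%, 10%, and 20% shares stated above.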

Figure 7, which is based on the training set, shows the mean and variance of bike-sharing demand at different hours of the day. The demand on both working days and rest days exhibits a double peak. However, the peaks on working days are steeper and higher than those on rest days. Demand is lower on rest days, which presents a challenge for short-term demand forecasting.

To verify whether the spatiotemporal variables exhibit spatiotemporal correlations, this paper uses the Pearson correlation to examine the correlations between the demand at the t-th time interval and the spatiotemporal variables ahead of the t-th time interval, given by

$$\rho_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y},$$

where X and Y are two random variables with the same number of observations, cov(X, Y) is their covariance, and $\sigma_X$ and $\sigma_Y$ are their standard deviations.

First, this paper calculates the Pearson correlations between the demand at time t in the grid (m, n) and the demand at time t − k in the grid (m′, n′), for all grids (m′, n′) and look-back intervals k. Second, this paper averages the correlations partitioned by distances and look-back time intervals. The distance between grid (m, n) and grid (m′, n′) is defined as the Euclidean distance between the central points of the two grids.

Figure 8 shows the average correlations between the dependent variables (the demand at time t in the grid (m, n)) and the explanatory variables (the demand at time t − k in the grid (m′, n′)). It can be observed from the figure that the average correlations decrease gradually as the distance increases, which verifies that each area is spatially correlated with its neighbours. On the other hand, the smaller the look-back time interval is, the more relevant the variables are. The Pearson correlation thus confirms that the spatiotemporal variables have spatiotemporal dependencies.
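The lagged correlation used in this check can be sketched for a single pair of grids as follows. The function name `lagged_pearson` and the toy series are illustrative; averaging over grid pairs by distance would be layered on top of this helper.

```python
import numpy as np

def lagged_pearson(target, source, k):
    """Pearson correlation between target[t] and source[t - k], for k >= 1."""
    x = np.asarray(target, dtype=float)[k:]
    y = np.asarray(source, dtype=float)[:-k]
    return np.corrcoef(x, y)[0, 1]

# Toy series: the target repeats the source with a one-step delay.
source = [1, 2, 3, 4, 5, 6]
target = [0, 1, 2, 3, 4, 5]
rho = lagged_pearson(target, source, k=1)
```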

4.2. Built Environment Variables
4.2.1. Land Use Type

This paper crawls the Baidu Maps’ POI data through Python, which includes longitude, latitude, name, address, and administrative region. There are nine categories of POI: subway stations, parks, shopping centres, training institutions, office buildings, schools, well-known enterprises, housing estates, and comprehensive restaurants.

4.2.2. Public Transport Accessibility

This paper gathers public transportation data from Baidu Maps and investigates how many public transportation methods link one region to another, including subways and buses.

4.3. Weather Variables

This paper also gathers one-hour aggregated weather variables, including temperature, weather state, wind speed, and air quality index, during the same period from the China Meteorological Administration. This paper also gathers variables, including date, week, and holiday.

4.4. Prediction Result

The SGCNPM with full variables is trained on the training set and validated on the test set. The aggregation matrix f(A; θi) is chosen as the Chebyshev polynomial function with K equal to 2. The model consists of five layers: an input layer, three hidden layers, and an output layer. Each hidden layer has eight hidden cells. To mitigate overfitting, this paper introduces L2 parameter regularization. The model uses ReLU as the activation function of the graph convolution network. This paper predicts the demand for the 24 hours in a day, and the output is a 24 × 53 matrix. The paper analyzes the influence of epoch, time step, and batch size on the results in advance, as shown in Figure 9. Then, this paper sets the epoch, time step, and batch size to 100, 12, and 24, respectively, because these values provide the best prediction. There are 53 regions in this paper, and each node represents a map region, so this paper sets the number of nodes to 53.

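The model is evaluated via four measures of effectiveness: mean absolute error (MAE), mean absolute percentage error (MAPE), root mean squared error (RMSE), and coefficient of determination (R2), given by

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y^{(i)} - \hat{y}^{(i)}\right|,$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y^{(i)} - \hat{y}^{(i)}}{y^{(i)}}\right| \times 100\%,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)} - \hat{y}^{(i)}\right)^2},$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y^{(i)} - \hat{y}^{(i)}\right)^2}{\sum_{i=1}^{n}\left(y^{(i)} - \bar{y}\right)^2},$$

where $y^{(i)}$ and $\hat{y}^{(i)}$ are the i-th ground truth and predicted demand value, respectively, $\bar{y}$ is the mean of $y^{(i)}$, and n is the size of the test set. This paper uses MAE, RMSE, and R2 to measure the overall forecast accuracy on the entire test data and MAPE to measure the forecast performance of the model in high-demand areas and periods.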
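The four measures can be computed with a few lines of NumPy, as in the sketch below. The function name `evaluate` is hypothetical, and MAPE is returned as a fraction rather than a percentage; it is undefined when the ground truth contains zeros, which is one reason to restrict it to high-demand areas and periods.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return MAE, MAPE (as a fraction), RMSE, and R2 for two equal-length arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.abs(err).mean()
    mape = np.abs(err / y_true).mean()  # undefined if y_true contains zeros
    rmse = np.sqrt((err ** 2).mean())
    r2 = 1.0 - (err ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}

# Toy example with three observations.
metrics = evaluate([10, 20, 30], [12, 18, 30])
```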

To investigate the fusion weights, the results under different weights are analyzed, and the bike-sharing demands in the 53 regions are predicted. As shown in Figure 10, the model achieves the best prediction when the two fusion weights equal 0.4 and 0.6, respectively.

This paper applies SGCNPM to forecast the demand for bike-sharing in the 53 regions; the results are shown in Figure 11 and Table 2. Figure 11 shows the MAE of each region: region 7 has the smallest MAE and thus the best forecast, and region 8 has the largest MAE and thus the worst forecast. Table 2 shows the average values of the predicted results over all regions: MAE, MAPE, RMSE, and R2 are 8.209, 37.12%, 11.527, and 0.737, respectively.

Figure 12 shows the error heat maps, where deeper color and denser dots represent greater demand. The heat maps show that the error during peak hours (e.g., 8–9 AM and 6–7 PM) is much higher than that during 0–1 AM and 11–12 PM, so the heat maps reflect the distribution of the error clearly. Combining short-term demand forecasts with visual analysis can help operators quickly identify areas with large errors and adjust accordingly.

4.5. Model Comparisons
4.5.1. Comparison with Other Models

In addition to the proposed model, other algorithms are tested. These include four traditional time-series forecasting models (i.e., HA, MA, ARIMA, and Holt exponential smoothing) and several learning/deep learning methods (i.e., ANN, LSTM, and GRU).
(1) HA: The historical average model predicts future demand in the test set based on empirical statistics in the training set. For instance, the average demand during 7–8 AM in the grid (m, n) is estimated from all historical demand during 7–8 AM in the grid (m, n).
(2) MA: The moving average model is widely used in time-series analysis. It predicts the future demand as the average of several recent historical demands. This paper uses the average of the 12 previous demands in the grid (m, n) to predict the future demand in the grid (m, n).
(3) ARIMA: The autoregressive integrated moving average model integrates the autoregressive (AR), integrated (I), and MA parts and takes into account the trend, periodicity, and nonstationary characteristics of the dataset [33].
(4) Holt exponential smoothing: The Holt exponential smoothing model adds a trend smoothing coefficient β to the simple exponential smoothing coefficient α and is therefore also called the two-parameter smoothing method.
(5) ANN: The artificial neural network [34, 35] uses all the variables of a specific grid (m, n), including historical demand, travel time rate, hourly state, weekly state, and weather variables with retrospective time windows, to predict the future demand in the grid (m, n). The neural network cannot distinguish variables from different time steps, so it cannot capture temporal correlation.
(6) LSTM: In LSTM [36], all variables in the grid (m, n) are reconstructed into a matrix, where one axis is the time step (the size of which equals the look-back time window K = 12) and the other axis is the feature category. In this way, all the features used in SGCNPM are sent to LSTM for training. LSTM considers temporal dependencies but does not capture spatial dependencies.
(7) GRU: The gated recurrent unit is a recurrent neural network that performs similarly to LSTM but is computationally cheaper.
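
As a rough illustration of the three simplest baselines above (HA, MA, and Holt exponential smoothing), the following sketch implements them for the demand series of a single grid. The function names, the smoothing coefficients alpha = 0.5 and beta = 0.3, and the synthetic demand series are assumptions made for illustration only; they are not taken from this paper.

```python
import numpy as np

def historical_average(train, train_hours, hour):
    """HA: mean of all training demands observed in the same hour slot."""
    return train[train_hours == hour].mean()

def moving_average(series, window=12):
    """MA: mean of the `window` most recent demands (12 in this paper)."""
    return series[-window:].mean()

def holt_forecast(series, alpha=0.5, beta=0.3, horizon=1):
    """Holt two-parameter smoothing: alpha smooths the level, beta the trend.
    The coefficient values are illustrative assumptions."""
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend

# Synthetic hourly demand for one grid over three days (not real data).
rng = np.random.default_rng(0)
hours = np.tile(np.arange(24), 3)
demand = 20 + 15 * np.sin(2 * np.pi * (hours - 2) / 24) + rng.normal(0, 2, hours.size)

print("HA forecast for the 7-8 AM slot:", historical_average(demand, hours, 7))
print("MA forecast for the next hour:  ", moving_average(demand, window=12))
print("Holt forecast for the next hour:", holt_forecast(demand))
```

On a perfectly linear series, the Holt sketch extrapolates the line exactly regardless of the coefficient values, which is a convenient sanity check.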

The aforementioned deep learning methods use the same input features (same categories and look-back time window) as SGCNPM, and the four time-series models (HA, MA, ARIMA, and Holt exponential smoothing) use the same time series. Moreover, the deep learning models (ANN, LSTM, and GRU) are likewise trained for 100 epochs. Before model training and validation, all data are scaled to the range [0, 1] using the same scaling procedure.
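
The preprocessing described here can be sketched as follows. The sketch assumes min-max scaling, which is one common way to map data to [0, 1]; the text does not state which scaling formula is used or on which subset its parameters are fitted. The helper names and the random feature matrix are likewise hypothetical.

```python
import numpy as np

def fit_min_max(train):
    """Learn the per-feature minimum and range (assumed: from training data)."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # guard against constant features
    return lo, span

def apply_min_max(x, lo, span):
    """Map each feature to [0, 1] using the fitted minimum and range."""
    return (x - lo) / span

def make_windows(features, target, k=12):
    """Stack k consecutive time steps per sample; the label is the next demand."""
    xs = np.stack([features[i:i + k] for i in range(len(features) - k)])
    ys = target[k:]
    return xs, ys

# Synthetic feature matrix: 200 time steps, 5 feature categories (not real data).
rng = np.random.default_rng(1)
raw = rng.random((200, 5)) * 50
lo, span = fit_min_max(raw)
scaled = apply_min_max(raw, lo, span)

# Column 0 plays the role of historical demand in this illustration.
x, y = make_windows(scaled, scaled[:, 0], k=12)
print(x.shape, y.shape)  # (samples, time steps, features) and (samples,)
```

Each sample has the (time step, feature category) layout described for the LSTM input, with the look-back window K = 12.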

The research uses Python 3.7 with TensorFlow 1.14.0 [37], Keras [38], and scikit-learn [39] for comparing the models.

For all methods, this paper makes predictions on the validation set. Table 3 compares the results of the different forecasting methods over twenty runs. The following phenomena can be observed in the results. (1) The prediction results of the deep learning methods, including ANN, LSTM, GRU, and SGCNPM, are superior to those of the other models. (2) SGCNPM achieves the best results on all metrics on the same dataset. (3) The deep learning methods take longer than the other methods, and SGCNPM takes the longest. (4) The MAPE obtained by all methods is high.

4.5.2. SGCNPM with Different Factors

To verify the need to consider multiple factors, this paper rebuilds SGCNPM with the weather variable, the land use type variable, and the public transport accessibility variable removed in turn. The results are shown in Table 4. Removing the public transport accessibility variable reduces the prediction accuracy the most: MAE increases from 8.209 to 11.223, and RMSE increases from 11.527 to 14.017. As shown in Table 4, removing any variable leads to an increase in prediction error, reflecting the importance of each variable.

Furthermore, to analyze the accuracy of SGCNPM in different periods, this paper predicts the demand for bike-sharing separately for the daytime and the night, recalibrating the model for each period. The results are shown in Tables 5 and 6. Several conclusions are put forward here.

It is necessary to consider all variables. The prediction results considering all variables are the best both in the daytime and at night. The MAE, MAPE, RMSE, and R² are 8.377, 20.21%, 10.648, and 0.738, respectively, in the daytime; the MAE, MAPE, RMSE, and R² are 0.866, 32.19%, 1.203, and 0.811, respectively, at night. The MAE and RMSE at night are lower than those in the daytime, while the MAPE is higher. This is because bike-sharing demand is greater in the daytime than at night: MAPE measures each error relative to the actual demand, so even a small absolute error is a large percentage of the low nighttime demand.
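
The opposite behavior of MAE and MAPE between day and night can be reproduced with a small numerical sketch. The metric functions below follow the standard textbook definitions, which are assumed (not confirmed) to match those used in this paper, and the demand values are made up purely for illustration.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no zero values in y_true."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up demands: high in the daytime, low at night (not real data).
day_true, day_pred = np.array([40.0, 50.0, 60.0]), np.array([48.0, 42.0, 68.0])
night_true, night_pred = np.array([2.0, 3.0, 4.0]), np.array([3.0, 2.0, 5.0])

for name, t, p in [("day", day_true, day_pred), ("night", night_true, night_pred)]:
    print(f"{name}: MAE={mae(t, p):.3f} RMSE={rmse(t, p):.3f} MAPE={mape(t, p):.2f}%")
```

In this toy case the nighttime absolute errors are eight times smaller than the daytime ones, yet the nighttime MAPE is higher because each error is divided by a much smaller actual demand.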

5. Conclusions

This paper first analyzes the travel behaviors of bike-sharing in the Heping District of Tianjin. SGCNPM forecasts the demand for bike-sharing in various urban areas, providing technical support for bike-sharing dispatching. SGCNPM is a fusion of two models: the built environment variable model and the weather environment variable model. The built environment variable model integrates the distance map, POI map, and interconnection map. The weather environment variable model considers the influence of weather state, maximum temperature, minimum temperature, wind level, and air quality index. The model can better reflect the influence of spatiotemporal correlation, built environment, and weather on the demand for bike-sharing. Comparing the forecasting results, this paper finds that the forecasting accuracy of SGCNPM is better than that of HA, MA, ARIMA, Holt exponential smoothing, ANN, LSTM, and GRU. In addition, the forecast accuracy declines in every period, whether daytime or night, when built environment variables or weather variables are removed from the model. This demonstrates that the built environment variables and weather variables are crucial for forecasting the demand for bike-sharing. Future research will examine in depth how to allocate and dispatch shared bikes accurately based on demand forecasts and will continue to improve the model.

Data Availability

The demand data extracted from Hellobike and Mobike to support the results of this study have not been made available because of the requirements in the confidentiality agreement.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This study was supported by the Fundamental Research Funds for the Central Universities (2021PT207), the National Natural Science Foundation of China (52172312 and 71931003), and the Research Foundation for Youth Scholars of Beijing Technology and Business University.