Abstract

To provide an efficient demand-responsive transport (DRT) service, we established a model for predicting regional movement demand that reflects spatiotemporal characteristics. DRT facilitates the movement of passengers with restrictions. However, such passengers are highly dependent on transportation services, and their travel demand is intermittent and fluctuates widely by region and time. Without regional demand predictions, the gaps between the desired and actual boarding times of passengers widen significantly, resulting in inefficient transportation services with minimal movement at maximum cost. Therefore, it is necessary to establish a regional demand prediction model that reflects temporal features for efficient DRT service operations. In this study, a graph convolutional network model that performs demand prediction using spatial and temporal information was developed. The proposed model considers a region’s unique characteristics and the influence between regions through spatial information, such as the proximity between regions, convenience of transportation, and functional similarity. In addition, three types of temporal characteristics (closeness, period, and trend) were defined to reflect past demand patterns. With the proposed demand forecasting model, measures can be taken, such as having empty vehicles move to areas where demand is expected or encouraging adjustment of drivers’ rest times to avoid congestion. Thus, fast and efficient transportation satisfying the movement demand of passengers with restrictions can be achieved, resulting in sustainable transportation.

1. Introduction

The right to travel refers to citizens’ right to move freely and safely. Because it is a fundamental right that is indispensable to human life, continuous efforts to ensure the right to move are needed [1]. Although transportation patterns have changed over the past few decades, mainstream passenger transport (e.g., buses and taxis) has not changed sufficiently to keep pace. In particular, fixed-route, fixed-schedule modes, such as buses, incur fixed operating costs. If no passengers are picked up, a loss occurs; if passenger demand changes, the utilization rate decreases, and eventually, the burden of the fixed cost increases. This leads to a vicious cycle in which use decreases further because the service does not adequately satisfy passengers’ travel requirements. If this phenomenon persists, supply is concentrated on major routes, which can create barriers to passengers’ travel rights. Moreover, socially disadvantaged people (elderly people, disabled people, residents of vulnerable areas, etc.) can experience severe isolation. Demand-responsive transport (DRT) services have emerged to solve this problem.

A DRT service refers to a transportation service that responds to the movement demand of passengers without a predetermined route or operation plan. It combines the low fares that are the advantage of fixed-route buses with the convenient boarding and disembarking and the speed that are the advantages of taxis. Therefore, relative to buses and taxis, DRT services achieve a tradeoff in terms of efficiency and cost. DRT services have the following advantages over fixed-route operations. First, the demand resolution is optimized: for DRT services, the driving distance of a vehicle divided by the number of passengers onboard is approximately half that for a fixed-route operation. Second, DRT services allow efficient operational cost management; they are economical because the fixed cost incurred when there is no demand is low. Third, DRT services are environmentally superior. They have a shorter empty-running distance than fixed-route vehicles, and because they use small vehicles, they are ecofriendly with regard to greenhouse-gas emissions and fuel consumption. Finally, passengers are highly satisfied with DRT services. DRT services operating in a door-to-door manner achieve higher levels of passenger satisfaction than fixed-route operations, where passengers must travel to the station themselves [2].

DRT is applied to the movement of passengers with restrictions, e.g., in areas where demand is intermittent or transportation services are insufficient and vulnerable [3]. Real-time response to the travel demand is crucial for efficient DRT service operations, requiring a system and demand forecasting model to allocate requests to vehicles quickly and efficiently when travel requests are received from passengers [4]. The demand forecasting field for mainstream passenger transport continues to improve with the development of deep-learning technologies such as long short-term memory (LSTM). For example, in [5], LSTM was utilized to predict future demand according to past demand through traffic card data analysis. However, DRT services are designed for the movement of passengers with restrictions; therefore, they exhibit a different demand pattern from general mainstream passenger transportation. Because the existing mainstream passenger transportation model cannot be applied, a model that reflects the movement characteristics of passengers with restrictions is required.

It is crucial to consider the demand at previous times in the region, but it is also essential to reflect spatial characteristics. Each region has spatial characteristics, such as commercial districts and suburban areas [6, 7]. Because spatial characteristics affect temporal trends, spatiotemporal factors must be considered. In this study, three types of components that reflect spatial, temporal, and spatiotemporal characteristics were constructed and reflected in the model. Because DRT services are subject to spatiotemporal influences, the data are sparse. LSTM-family models are not well suited to sparse data. To solve this problem, we used channel-wise attention and temporal means to alleviate the sparsity of the data to the greatest extent possible and then applied ConvLSTM.

The main contributions of this study are as follows.
(i) First, we improved the interpretability of the model by identifying the causes of spatiotemporal demand and reflecting them in the model.
(ii) Second, we used channel-wise attention and temporal means to alleviate the sparsity of the DRT demand data as far as possible.
(iii) Finally, a graph convolutional network (GCN) was used for the first time to reflect spatial factors in region-level demand prediction for a DRT service.

The remainder of this paper is organized as follows. Section 2 presents related research and basic deep-learning models related to DRT service demand prediction. Section 3 presents the proposed method. Section 4 presents the results of applying the proposed method to actual data. Section 5 presents conclusions and suggestions for further research.

2. Related Research

This section introduces DRT service demand prediction research and deep-learning methods.

2.1. DRT Service Demand Prediction

Because demand prediction must precede the efficient operation of DRT services, many studies have recently been conducted using various methodologies. For example, in [8], after the entire region was divided into grids, the demand for a DRT service was predicted using a convolutional neural network (CNN), LSTM, and ConvLSTM, along with exogenous variables such as weather. In [9], an appropriate DRT type was identified by estimating the average number of people getting on and off at bus stops in a regular pattern identified through cluster classification of time-by-time boarding points for the efficient placement of DRT.

Recent studies focus on spatial dependence, traveler heterogeneity, sparse uncertainty, and demand prediction quality requirements. Reference [10] mentioned that variables representing factors related to the characteristics of service supply, demographic characteristics, land use, and accessibility should be discovered and fused to reflect the direct impact and ripple effect on demand. Their research uses a model structure (attention and ConvLSTM) that can capture the demand patterns of call taxis for the disabled as a service supply characteristic. In addition, to reflect demographic characteristics, the administrative region, which is a population-based division of an area, was used as the spatial unit, and variables representing factors related to land use and accessibility were discovered and utilized as a functional similarity adjacency matrix of the GCN method. However, that paper is aimed at developing an optimal bus route rather than a DRT service. The call taxi for the disabled is a short-distance transportation service for people who cannot go to the appropriate stop owing to severe disabilities, and there is a separate long-distance customized bus service for the disabled in Seoul. Therefore, the usage of the call taxi for the disabled is different. Reference [11] studies the error distribution, rather than specific parameters, learning methods, or hyperparameter adjustments, of a transportation demand prediction model for adequate public transportation (PT) operation. To build an accurate model, it is necessary to examine the error distribution considered in that study. References [12, 13] utilized H-ConvLSTM, which applies convolution based on a hexagonal shape rather than conventional square pixels. They improved the performance by using an ensemble for postaggregation, similar to bagging. To reflect the interregional relationships between hexagons, they additionally used the GCN.

In particular, the traffic demand was predicted using call taxi data for people with disabilities in Seoul. In [14], the waiting times for disabled people in Seoul were predicted using SARIMA and LSTM and compared. In [15], the call taxi latency for the disabled was predicted using several hyperparameters of LSTM. However, in these studies, only past temporal characteristics were considered; spatial characteristics were omitted or reflected only in the Euclidean space. Furthermore, because the spatial relationship is not based only on the location in Euclidean distance, it is necessary to reflect various spatial structures based on non-Euclidean distance in the model.

2.2. Spatiotemporal Prediction

Demand prediction and urban traffic prediction, such as traffic volume prediction and congestion distribution estimation, are tasks that reflect spatiotemporal factors. Previous studies on urban traffic prediction can be classified into two categories according to the input data format. Grid-based inflow and outflow prediction is based on images, whereas graph-based traffic speed prediction is based on graphs.

2.2.1. Grid-Based Inflow and Outflow Prediction

The demand forecasts for DRT services and taxis are highly similar [8]. To predict the general taxi demand, the entire area is divided into a grid of a specific size and treated as an image. In [16], exogenous variables such as the weather and whether the day is a weekend were added in a fully connected layer, and the values before a certain point in time, grouped as distant, near, and recent for each grid cell, were learned through convolution. In [6], values at predetermined past points in time (e.g., by hour, day, and week) and point-of-interest (POI) characteristics were learned for each grid cell through convolution and combined through ResPlus to predict the regional taxi demand at the next time step. Collecting exogenous variables that may be related to future demand can improve the predictive performance. Although weather, POI, or traffic flow was used in the foregoing studies, the performance improvement was insignificant relative to the increase in the number of parameters, because the improvement through exogenous variables was orthogonal to capturing complex spatiotemporal dependence in the data [7].

2.2.2. Graph-Based Traffic Speed Prediction

Graph structures—not images—are used to solve various urban problems. In contrast to the grid-based method, research is focused on solving various urban problems, such as predicting traffic speed, rather than predicting movement demand. For example, in [17], a GCN with three adjacency matrices was used. Spatial characteristics were adopted, along with a contextual gated recurrent neural network [14, 18, 19] and temporal characteristics with values prior to a certain point in time of closeness, period, and trend. In [20], a three-part model that predicts the travel demand at the next point in time was proposed. The first part is a long-term encoder for encoding the past moving demand. The second part is a short-term encoder for deriving next-step predictions from generated multistep predictions. The third part is an attention-based output module for modeling dynamic temporal and channel-wise information. In [21], ST-Conv block—a combination of temporal-gated convolution and spatial graph convolution—was used to predict the traffic speed at the next point in time. In this study, we predicted the demand for a DRT service using the graph-based method.

2.3. GCN

The GCN applies to a graph $G = (V, A)$, where $V$ refers to the vertices and $A$ is a matrix with edges expressing the relationships between the vertices. The GCN can extract local features from a non-Euclidean structure with different receptive fields. For example, to utilize convolution in the graph structure, the Fourier transform [22] can be used. To obtain the basis of the Fourier transform, we compute the Laplacian matrix $L = D - A$, where $D$ denotes the degree matrix. We denote $X_l$ as the features of the $l$th layer, $\alpha_k$ as trainable coefficients, $L^k$ as the $k$th power of the graph Laplacian matrix, and $\sigma$ as an activation function. The graph convolution operation [23] using a Laplacian matrix is defined as follows:

$$X_{l+1} = \sigma\left(\sum_{k=0}^{K-1} \alpha_k L^k X_l\right)$$

We learn the relationships between adjacent vertices by updating the features $X_l$ through multiple layers. Moreover, because the GCN shares the CNN's properties of weight sharing and local feature learning, it is possible to obtain a node feature reflecting the connection information of the adjacent ($K$-hop) nodes of each node.
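As a minimal sketch of the polynomial graph convolution described above (not the authors' implementation), the following NumPy code applies a weighted sum of Laplacian powers to vertex features. It assumes the unnormalized Laplacian $L = D - A$; the function name `graph_conv`, the toy path graph, and the coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

def graph_conv(X, A, alphas, activation=np.tanh):
    """Polynomial graph convolution: activation(sum_k alphas[k] * L^k @ X)."""
    D = np.diag(A.sum(axis=1))          # degree matrix
    L = D - A                           # unnormalized graph Laplacian (assumption)
    out = np.zeros_like(X, dtype=float)
    L_power = np.eye(A.shape[0])        # L^0
    for alpha in alphas:
        out += alpha * (L_power @ X)    # add the k-th order term
        L_power = L_power @ L           # advance to the next power of L
    return activation(out)

# Toy path graph 0 - 1 - 2 with one feature per vertex (illustrative values).
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.array([[1.], [0.], [0.]])

# Two terms (orders 0 and 1) with an identity activation for readability.
H = graph_conv(X, A, alphas=[0.5, 0.25], activation=lambda v: v)
```

With only the zeroth- and first-order terms, the signal placed on vertex 0 reaches its direct neighbor (vertex 1) but not vertex 2, which illustrates how the polynomial order bounds the number of hops that influence each node.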

2.4. Channel-Wise Attention

Given an input $X \in \mathbb{R}^{H \times W \times C}$, channel-wise attention [17, 18] learns a weight for each channel to find and highlight the most important frames with larger weights. Here, $H$, $W$, and $C$ refer to the height, width, and channel number of the image, respectively. The channel-wise attention is defined as follows. First, a summary $z_c$ of each channel is obtained. Then, we obtain the attention $s$; the algorithm learns to assign a large weight to the important channels. Finally, the attention value is applied to the original input values channel-wise as follows:

$$z_c = F_{pool}(X_{:,:,c}) = \frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W} X_{i,j,c}, \quad c = 1, \ldots, C$$

$$s = \sigma\left(W_2\,\delta(W_1 z)\right)$$

$$\tilde{X}_{:,:,c} = X_{:,:,c} \circ s_c, \quad c = 1, \ldots, C$$

Here, $F_{pool}$ is a global average pooling operation, and $W_1$ and $W_2$ are the corresponding weights. $\delta$ and $\sigma$ are the nonlinear functions ReLU (rectified linear unit) and sigmoid, respectively.
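The three steps of channel-wise attention (pooling, weight computation, and reweighting) can be sketched in NumPy as follows. This is an illustrative sketch only: the weight matrices are random rather than learned, and the bottleneck size of 2 and the input shape are assumptions that do not come from the paper.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def channel_attention(X, W1, W2):
    """X: (H, W, C). Returns the reweighted input and the channel weights s."""
    z = X.mean(axis=(0, 1))              # global average pooling -> (C,)
    s = sigmoid(W2 @ relu(W1 @ z))       # attention weights in (0, 1) -> (C,)
    return X * s, s                      # broadcast s over height and width

rng = np.random.default_rng(0)
X = rng.random((4, 5, 3))                # toy input: H=4, W=5, C=3
W1 = rng.standard_normal((2, 3))         # hypothetical bottleneck of size 2
W2 = rng.standard_normal((3, 2))
X_tilde, s = channel_attention(X, W1, W2)
```

Because the sigmoid keeps every weight strictly between 0 and 1, the operation rescales each channel without changing the shape of the input.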

3. Method

3.1. Description of Dataset

In this study, DRT service data of call taxis for the disabled in Seoul for two years (from 00:00 on January 1, 2018, to 24:00 on December 31, 2019) were used. The call taxis were primarily operated in Seoul but sometimes moved to areas adjacent to Seoul, depending on the passenger demand. However, we limited the spatial range to Seoul; that is, we included only data whose departure and destination were both within Seoul. The call taxi data for the disabled included the following information. For each call, the variables were the type of call (regular reception, full-day reservation, and direct call), reception, hope, dispatch, boarding, departure, destination, departure coordinates, customer number, purpose of use, and number of boarding vehicles. Of the 424 administrative districts, Wirye-dong, which had no passenger demand in 2018 or 2019, was excluded, leaving 423 districts. In addition, we excluded data corresponding to hours other than the primary operating hours. There were 1,699,614 data points within the 11 primary operating hours (7–17 h).

By aggregating the number of demands by administrative district in each date-time period, the data were arranged so that each time zone contained one row of demand, giving 8030 rows (time zones; 730 days × 11 h) and 423 columns (administrative districts). The number of demand cases was continuous data; however, as mentioned previously, it had a sparse and intermittent distribution. Zero accounted for 62% of the cases (1,064,141 of 1,699,614), one accounted for 21%, and the others accounted for only 17%. The class imbalance problem was alleviated by treating multiple demands as one demand (0/1), i.e., by recording only whether demand occurred. For example, for 5 p.m. on November 1, 2019, the raw data exhibited a wide variety of demand counts, as shown in Figure 1(a), whereas after conversion each district was marked only according to whether there was demand, as shown in Figure 1(b).
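The aggregation and 0/1 conversion described above can be sketched as follows. The call records here are hypothetical toy values, and `demand_matrix` is an illustrative helper rather than the authors' preprocessing code; in the paper's setting, the matrix would have 8030 rows (730 days × 11 h) and 423 columns.

```python
import numpy as np

def demand_matrix(slot_idx, district_idx, n_slots, n_districts):
    """Count calls per (time slot, administrative district) cell."""
    counts = np.zeros((n_slots, n_districts), dtype=int)
    np.add.at(counts, (slot_idx, district_idx), 1)   # accumulates repeated index pairs
    return counts

# Hypothetical call records: one (time slot, district) pair per call.
slot_idx = np.array([0, 0, 0, 1, 2, 2])
district_idx = np.array([1, 1, 2, 0, 1, 1])

counts = demand_matrix(slot_idx, district_idx, n_slots=3, n_districts=3)
binary = (counts > 0).astype(int)    # 1 if any demand occurred, else 0
```

`np.add.at` is used instead of plain fancy-index assignment so that several calls falling into the same cell are all counted before the matrix is thresholded.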

3.2. Proposed Method

The proposed method consists of three steps. The first step is to encode the spatial dependency, the second step is to use ConvLSTM [24] to reflect the temporal dependency, and the third step is to use a GCN [23] to reflect the spatiotemporal dependency. Figure 2 illustrates the overall process. Furthermore, pseudocode is presented in Algorithm 1.

Algorithm 1: Proposed demand prediction method.
Require: past demands $X$; future true demand $y$; adjacency matrices $A$, degree matrices $D$, hop $K$
Ensure: future prediction demand $\hat{y}$
while training do
  for all adjacency matrices $A$ do
    (1) Spatial dependency: apply Chebyshev polynomials to each adjacency matrix
    rescale (normalize) the Laplacian $L$ to $\tilde{L}$
    for all $k = 0, \ldots, K - 1$ do
      compute the Chebyshev polynomial $T_k(\tilde{L})$
    end for
    (2) Temporal dependency: apply contextual gating (CG) and ConvLSTM to $X$
  end for
  (3) Spatial-temporal dependency: apply the GCN (graph convolutional network) and FC (fully connected) layer to obtain $\hat{y}$
  compute loss: binary cross-entropy between $y$ and $\hat{y}$
end while

3.3. Encoding Spatial Dependency

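The proposed method utilizes several types of adjacency matrices to reflect the spatial dependency. The first adjacency matrix $A_N$ reflects the neighborhood between administrative districts:

$$A_{N,ij} = \begin{cases} 1, & \text{if administrative districts } i \text{ and } j \text{ are adjacent} \\ 0, & \text{otherwise} \end{cases}$$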

Figure 3(a) shows a heat map of the adjacency matrix for adjacent connections between administrative districts. The second adjacency matrix was designed to reflect the real travel distance between the administrative districts.

Figure 3(b) shows a heat map of the adjacency matrix for the transportation convenience connection between administrative districts. The third adjacency matrix $A_F$ reflects the assumption that administrative districts that are more functionally similar have more similar demand patterns:

$$A_{F,ij} = \mathrm{sim}(P_i, P_j)$$

Here, $\mathrm{sim}(\cdot, \cdot)$ denotes cosine similarity, and $P_i$ is a vector of the medical location quotient (LQ), disability LQ, number of registered residents with disabilities, and demand movements for administrative district $i$. The LQ measures how concentrated a specific industry is in a region relative to a reference area. We calculate the sufficiency of medical care and disability facilities in each administrative district by comparing it with Seoul as a whole. A higher coefficient can be interpreted as a district being better supplied with such facilities than other administrative districts, and a lower coefficient as being undersupplied. LQ, a quantitative indicator, was used to compare the functional similarity between two administrative districts. The adjacency matrix and normalized Laplacian matrix for the functional similarity between administrative districts are expressed as heat maps in Figure 3(c).
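To make the functional similarity construction concrete, the sketch below computes a location quotient and a cosine-similarity adjacency matrix from per-district feature vectors. The LQ helper uses the standard definition (local share divided by reference-area share); which counts and base quantities the authors used is not stated, so the arguments and all numbers here are hypothetical.

```python
import numpy as np

def location_quotient(local_count, local_base, total_count, total_base):
    """Standard LQ: local share divided by the reference-area share."""
    return (local_count / local_base) / (total_count / total_base)

def cosine_similarity_adjacency(P):
    """P: (n_districts, n_features). Returns the (n, n) cosine-similarity matrix."""
    norms = np.linalg.norm(P, axis=1, keepdims=True)
    P_unit = P / np.where(norms == 0, 1.0, norms)   # guard against all-zero rows
    return P_unit @ P_unit.T

# Hypothetical feature vectors for three districts.
P = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0],
              [0.0, 0.0, 3.0]])
A_F = cosine_similarity_adjacency(P)

# Hypothetical example: 10 facilities per 1000 residents locally
# versus 50 facilities per 10000 residents city-wide.
lq = location_quotient(local_count=10, local_base=1000,
                       total_count=50, total_base=10000)
```

Districts 0 and 1 have proportional feature vectors and therefore a similarity of 1, whereas district 2 is orthogonal to both and receives a similarity of 0. In practice, features on very different scales would likely need standardizing before the similarity is computed.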

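Chebyshev polynomials [25] were used to embed the configured adjacency matrices. We transformed each adjacency matrix $A$ into a Laplacian matrix as follows:

$$L = I - D^{-1/2} A D^{-1/2}$$

where $D$ is the degree matrix, $L$ is the normalized graph Laplacian matrix, and $I$ is the identity matrix.

Using $K$-order Chebyshev polynomials [25],

$$T_0(\tilde{L}) = I, \quad T_1(\tilde{L}) = \tilde{L}, \quad T_k(\tilde{L}) = 2\tilde{L}\,T_{k-1}(\tilde{L}) - T_{k-2}(\tilde{L}),$$

we obtain the spatial encoding, where $\tilde{L} = 2L/\lambda_{max} - I$ is the Laplacian rescaled by its largest eigenvalue $\lambda_{max}$.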
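The Chebyshev encoding of an adjacency matrix can be sketched as follows. This sketch assumes the symmetric normalized Laplacian $L = I - D^{-1/2} A D^{-1/2}$ and the usual rescaling $\tilde{L} = 2L/\lambda_{max} - I$; the function names and the toy graph are hypothetical, and the graph is assumed to be undirected with at least one edge.

```python
import numpy as np

def normalized_laplacian(A):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2}."""
    deg = A.sum(axis=1)
    safe_deg = np.where(deg > 0, deg, 1.0)               # avoid division by zero
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(safe_deg), 0.0)
    return np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def chebyshev_basis(A, K):
    """Return [T_0(L~), ..., T_{K-1}(L~)] for the rescaled Laplacian L~."""
    n = A.shape[0]
    L = normalized_laplacian(A)
    lam_max = np.linalg.eigvalsh(L).max()                # L is symmetric
    L_tilde = 2.0 * L / lam_max - np.eye(n)              # eigenvalues now in [-1, 1]
    basis = [np.eye(n), L_tilde]
    for _ in range(2, K):
        basis.append(2.0 * L_tilde @ basis[-1] - basis[-2])   # Chebyshev recurrence
    return basis[:K]

# Toy path graph 0 - 1 - 2 (illustrative only).
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
T = chebyshev_basis(A, K=3)
```

The recurrence avoids an explicit eigendecomposition of the filter itself: each higher-order term is built from the two preceding terms, so a $K$-term basis costs $K - 2$ matrix products.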

3.4. Learning Temporal Dependency

Contextual gating and ConvLSTM are used to learn the temporal dependencies. We use input values based on closeness, period, and trend. For closeness, we consider the demands from 1, 2, and 3 h in the past. For the period, we consider the demands at the same time 1, 2, and 3 d in the past. The trend is the demand a week in the past. Contextual gating is performed as shown in Figure 4.
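The closeness, period, and trend inputs can be sketched as index lookups into the time-by-district demand matrix. The sketch assumes that the rows are stored contiguously with 11 operating-hour slots per day, so that one day corresponds to 11 rows and one week to 77 rows. How the authors handle closeness lags that cross a day boundary (e.g., the slot before 7 a.m.) is not stated, so this code simply steps back row by row; `temporal_inputs` and the toy data are hypothetical.

```python
import numpy as np

SLOTS_PER_DAY = 11   # operating hours 7-17 h, as described in Section 3.1

def temporal_inputs(demand, t):
    """demand: (n_slots, n_districts); t: index of the slot to be predicted."""
    closeness = demand[[t - 1, t - 2, t - 3]]                       # 1, 2, 3 h before
    period = demand[[t - d * SLOTS_PER_DAY for d in (1, 2, 3)]]     # 1, 2, 3 d before
    trend = demand[[t - 7 * SLOTS_PER_DAY]]                         # 1 week before
    return closeness, period, trend

# Toy demand matrix: row i holds the values [2i, 2i + 1].
demand = np.arange(100 * 2).reshape(100, 2)
c, p, w = temporal_inputs(demand, t=90)
```

The target index must be at least 77 so that the week-old observation exists; earlier slots would have to be dropped from training or padded.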

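We first compute $\mathrm{GCN}(X)$ by applying the GCN to the original value $X$. In the model, the GCN is applied as follows. Multigraph convolution, as in equation (10), is used to reflect the spatial dependency by utilizing the several graphs configured in Section 3.3. $X_l$ and $X_{l+1}$ are the feature vectors of the regions in layers $l$ and $l+1$, $\sigma$ is the activation function, and $\bigsqcup$ is the aggregation function, which here is the sum. $\mathbb{A}$ is the set of graphs, and $f(A; \theta_i)$ is the aggregation matrix of other samples based on graph $A$. If $W_l$ is the feature transformation matrix, $X_l$ is updated to

$$X_{l+1} = \sigma\left(\bigsqcup_{A \in \mathbb{A}} f(A; \theta_i)\, X_l W_l\right) \tag{10}$$

Then, we apply global average pooling over all regions:

$$z^{(t)} = F_{pool}\left(\mathrm{GCN}(X^{(t)})\right) = \frac{1}{|V|}\sum_{i=1}^{|V|} \mathrm{GCN}(X^{(t)})_{i,:} \tag{11}$$

Let $\sigma$ be the sigmoid function and $\delta$ be the GeLU (Gaussian error linear unit) function. Equation (11) produces the summary $z^{(t)}$ for each of the temporal observations, from which the weights are computed:

$$s = \sigma\left(W_2\,\delta(W_1 z)\right)$$

We multiplied the calculated weights by the original value:

$$\tilde{X}^{(t)} = X^{(t)} \circ s^{(t)}$$

Through the contextual gating mechanism, we obtain reweighted observations with weights over time.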

However, the LSTM architecture may not learn well from sparse data. To resolve this, we applied ConvLSTM after the temporal mean, as shown in Figure 5. For each of the three inputs (closeness, period, and trend), the temporal mean is as follows:

$$\bar{X} = \frac{1}{T}\sum_{t=1}^{T} \tilde{X}^{(t)}$$

where $T$ is the number of reweighted observations $\tilde{X}^{(t)}$ in that input.

We learn the temporal characteristics of each region by applying ConvLSTM to the temporal means of the reweighted observations. Because the demand is intermittent, averaging converts the sparse data into denser data; therefore, we average the features for closeness, period, and trend and then apply ConvLSTM. ConvLSTM is applied across all regions to the values of the reweighted observations. This results in a single vector that aggregates the learned spatiotemporal information.
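The effect of the temporal mean on sparse 0/1 observations can be illustrated with a short NumPy sketch. The array layout (time, districts, features) and the toy values are assumptions made for illustration, and the averaging here is taken within a single input such as closeness.

```python
import numpy as np

def temporal_mean(observations):
    """observations: (T, n_districts, n_features) -> (n_districts, n_features)."""
    return observations.mean(axis=0)

# Hypothetical reweighted closeness input: 3 time steps, 2 districts, 1 feature.
closeness = np.array([[[0.], [1.]],
                      [[0.], [1.]],
                      [[1.], [0.]]])
dense = temporal_mean(closeness)

zero_share_before = float((closeness == 0).mean())
zero_share_after = float((dense == 0).mean())
```

In this toy input, half of the entries are zero before averaging and none are zero afterward, which is the sense in which averaging over time turns an intermittent signal into a denser one.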

Finally, a multigraph GCN is applied to the result of the ConvLSTM to learn spatiotemporal characteristics simultaneously. We then apply a fully connected layer for aggregation.

4. Spatiotemporal Characteristics of DRT Service

In the proposed method, the regional demand for DRT services is predicted via graph-based deep learning using the spatiotemporal characteristics of the demand in the past two years. Therefore, it is necessary to investigate the cause of the presence or absence of demand. Accessibility to the DRT service is influenced mainly by time and space, as shown in Figures 6 and 7. In this section, the factors that affect the demand for transportation services are identified through temporal and spatial characteristic analyses.

4.1. Analysis of Spatial Characteristics

To visually validate the spatial dependency embedded vectors of the functional similarity adjacency matrix, we used t-distributed stochastic neighbor embedding (t-SNE) [26] to project them onto a low-dimensional space. Then, we applied k-means clustering [27] to the lower dimensions. We performed dimension reduction with t-SNE for visualization and observed five clusters, as shown in Figure 6. Table 1 presents the mean feature vectors for each group.

Group 0 consists of residential areas where passengers board most often to commute, with moderate demand for the other purposes. Group 1 has a relatively large number of garages and consists of residential areas where people board most often for returning home and for religious purposes. In group 2, the medical LQ and disability LQ are low, and boardings for business and treatment purposes are infrequent. In group 3, many people used the DRT service for returning home, rehabilitation, and shopping, and use for work purposes was relatively high. Finally, in group 4, the medical LQ and disability LQ are high, and these residential areas show the highest share of returning-home trips.

4.2. Analysis of Temporal Characteristics

As shown in Figure 7, aggregating the demand status for the two years by the hour revealed that demand was highest at 7 a.m. and showed a decreasing trend at 8 a.m. and 9 a.m. It increased again from 10 a.m. and then decreased to 20% from 1 p.m. to 5 p.m. Because most administrative districts exhibited a demand of >50% at 7 a.m., it is particularly crucial to predict the demand in the periods when the demand plummets. These results are attributed to the purpose of passenger use.

Figure 8 shows the usage purpose pattern: the number of people returning home increased by 12 p.m., and the demand for treatment, rehabilitation, and commuting/work increased in the morning. Because movement for these purposes is often regular, it is possible to predict the demand location using this pattern. The functional similarity adjacency matrix can explain this pattern.

According to the ratio of call types by time, the shares of direct calls and full-day reservations were inversely proportional. Therefore, we infer that call taxis for the disabled operate on a regular pattern. We make three policy suggestions. First, the demand should be identified on the previous day by expanding the operating time zone of the full-day reservation. Currently, the full-day reservation is only operated at 7 a.m., 8 a.m., and 10 a.m. The demand should be predicted by expanding these operating hours or establishing a system that can flexibly receive reservations in advance at any time. Second, movement should be encouraged by utilizing measures such as deploying additional temporary vehicles at 7 a.m., when the demand is the highest. Third, maximum movement should be achieved at the minimum cost by adjusting drivers’ rest times to avoid the period between 10 a.m. and 12 p.m., when the demand increases again.

4.3. Model Performance Comparison

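In this section, we compare the two aforementioned models. Let $p(x) = P(y = 1 \mid x)$ be the conditional probability of demand given an input $x$. For the loss over the observations, we used the binary cross-entropy loss:

$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log p(x_i) + (1 - y_i)\log\left(1 - p(x_i)\right)\right]$$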
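For reference, the binary cross-entropy loss for 0/1 demand labels can be computed as in the following sketch. The clipping constant `eps` is an implementation detail added here for numerical stability and is not taken from the paper; the labels and predicted probabilities are toy values.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)     # avoid log(0)
    losses = y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(losses))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_prob = np.array([0.9, 0.1, 0.5, 0.5])
loss = binary_cross_entropy(y_true, y_prob)
```

Confident correct predictions (0.9 for a positive, 0.1 for a negative) contribute little to the loss, whereas uninformative predictions of 0.5 contribute $\log 2$ each.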

The training dataset included data from January 1, 2018, to October 31, 2019. Twenty percent of the data were used for validation. Data from November 1, 2019, to December 31, 2019, were used as test data. To maintain chronological order, the data were not shuffled. ConvLSTM had a hidden size of four and three layers, and the GCN had a hidden size of 64.
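A chronological split of this kind can be sketched as follows. The paper does not state which 20% of the data served as validation; this sketch assumes, purely for illustration, that it is the most recent 20% of the pre-November 2019 slots, and that each day contributes 11 hourly slots.

```python
import numpy as np

SLOTS_PER_DAY = 11   # operating hours 7-17 h

# One entry per day from 2018-01-01 through 2019-12-31, repeated for each hourly slot.
days = np.arange(np.datetime64("2018-01-01"), np.datetime64("2020-01-01"))
slot_day = np.repeat(days, SLOTS_PER_DAY)

# Test period: November 1 to December 31, 2019.
test_mask = slot_day >= np.datetime64("2019-11-01")
trainval_idx = np.flatnonzero(~test_mask)
test_idx = np.flatnonzero(test_mask)

# Assumption: validation is the last 20% of the remaining slots, with no shuffling.
n_val = int(round(0.2 * len(trainval_idx)))
train_idx, val_idx = trainval_idx[:-n_val], trainval_idx[-n_val:]
```

Because the indices are sliced rather than sampled, every training slot precedes every validation slot, and every validation slot precedes every test slot, so no future information leaks into training.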

The performance of the proposed method was compared with that of other methods, and the results are presented in Table 2. Compared with the existing time series and classification models, the proposed method achieved significantly better performance. In contrast to the other methodologies, previous time zones, i.e., the closeness, period, and trend, were used as inputs, and the characteristics of each administrative district (medical LQ, disability LQ, etc.) were added. Three adjacency matrices were used, and the results of the experiment are presented in Table 3. The first row presents the results obtained using only the neighborhood adjacency matrix. The second row presents the results obtained using two adjacency matrices: the neighborhood and transportation adjacency matrices. The third row presents the results obtained using all three adjacency matrices, i.e., neighborhood, transportation, and functional similarity.

As shown, the method exhibited the best performance when all three adjacency matrices were used. However, in the case of the second row, the performance was inferior to that achieved using only the neighborhood adjacency matrix.

Table 4 presents the performance with respect to the type of temporal correlation. ConvLSTM outperformed vanilla LSTM, which did not reflect the spatial information. Max pooling also showed lower performance.

The performance differences for different combinations of closeness, period, and trend are presented in Table 5. As more past time steps were used, the performance increased. In the case of call taxi data for the disabled, the demand is very intermittent; thus, the fewer time steps are used, the more strongly the prediction is affected by sparse values. In addition, in the case of the trend (the demand a week ago), using more steps would require an excessive amount of past information; therefore, the number of trend steps was fixed to 1. The performance difference according to the number of hops $K$ is presented in Table 6.

In the GCN, problems such as oversmoothing occur as the number of layers increases excessively [29]. Similarly, in this study, when the number of hops $K$ increased to four or more, the performance was degraded. In the case of Seoul, if a district is influenced by too many hops, the performance is degraded because irrelevant administrative district relationships are reflected.

At the time of demand generation, we investigated the average difference in waiting time between the case where an empty car was waiting and the case where none was waiting. Table 7 shows the mean waiting time depending on whether a vacant vehicle was available. When an empty car is on standby, we expect that the waiting time could be reduced by about 16 minutes on average.

5. Conclusions

The proposed method can resolve unequal waiting times between regions by predicting the demand location for efficient operation of DRT services, which can support minimum-cost, maximum-movement operation. The objective of this study was to reduce the waiting time by efficiently rearranging nearby empty cars on the basis of predicted regional demand for Seoul’s call taxi service for the disabled, which has intermittent call characteristics. After configuring various subgraphs, the GCN was used to reflect the spatial characteristics between regions, and the model was constructed using the temporal mean and ConvLSTM to reflect temporal characteristics. The real data analysis showed that using various subgraphs improved the results in terms of accuracy and interpretability. We expect improved convenience of movement and satisfaction with public transportation through reduced waiting times. In addition, DRT services can replace public buses, increase the efficiency of subsidies for various types of public transportation, and generate profits and employment inducement effects for transportation companies, revitalizing the local economy and increasing the modal share of public transportation.

Data Availability

Data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

Acknowledgments

This study was supported by the Basic Study and Interdisciplinary R&D Foundation Fund of the University of Seoul (2021). The authors express their gratitude for this support.