Journal of Advanced Transportation


Research Article | Open Access

Volume 2021 | Article ID 9929622

Chuqiao Chen, Simon Hu, Washington Y. Ochieng, Na Xie, Xiqun (Michael) Chen, "Understanding City-Wide Ride-Sourcing Travel Flow: A Geographically Weighted Regression Approach", Journal of Advanced Transportation, vol. 2021, Article ID 9929622, 15 pages, 2021.

Understanding City-Wide Ride-Sourcing Travel Flow: A Geographically Weighted Regression Approach

Academic Editor: Zhi Chun Li
Received 04 Mar 2021
Revised 19 Apr 2021
Accepted 11 Jun 2021
Published 24 Jun 2021


The emerging ride-sourcing service has become an important element of urban mobility. A challenging question underlying the provision of such service is how and to what extent the built environment affects origin-destination (OD) travel flows. This paper employs the geographically weighted regression (GWR) model to analyze the OD-based ride-sourcing travel flow and compares it with the existing ordinary least squares (OLS) model and spatial autocorrelation model (SAM). We have collected ride-sourcing order data in Hangzhou, China, to provide an accurate source for acquiring ride-sourcing travel flow. We investigate the effects of the residential area, points of interest (POIs), and transit stations on ride-sourcing travel flow among traffic analysis zones (TAZs). The results show the following: (a) GWR has better goodness-of-fit than SAM and OLS. (b) Residential area, enterprise, and bus stations have positive correlations with ride-sourcing OD flows, but education and subway stations have negative correlations. Further investigation shows that the relationship between bus stations and OD flow is not causal but arises from collinearity between the two variables. Bus stations are built at locations with high demand, but their capacity is not large enough to reduce the ride-sourcing flow to a low level, which results in a positive coefficient. (c) Based on the estimated coefficients, the prediction of ride-sourcing flows is feasible, supporting the impact analysis for urban land use and transportation planning. This paper contributes to understanding OD-based ride-sourcing travel flow distributions and provides a framework of long-term OD flow prediction for urban land use and transportation planning.

1. Introduction

Government agencies and urban planning departments attach great importance to the prediction and understanding of large-scale origin-destination (OD) flow. The OD flow pattern reflects the distribution of travel demand and reveals the human mobility pattern. It helps plan traveling routes [1, 2], discover commuting regularity [3], and analyze land use properties [4]. An accurate estimation of OD matrices helps decision-makers better coordinate urban resources and mobility demand.

The advanced technologies of intelligent transportation systems (ITS) have offered extensive data collection methods to address the trip pattern identification problem. These technologies and data, including the automatic vehicle identification (AVI) technologies [5], radio frequency identification (RFID) and the license plate-based AVI [6], taxi GPS trace data [35], mobile phone data [7, 8], and smart card fare data [9], have been studied extensively in the past. Apart from transit and taxi, ride-sourcing services offered by platforms like Uber and Lyft have brought tremendous changes to transportation systems and have become an essential component of urban transportation [10]. For instance, by 2020, Uber offered ride-sourcing services in more than 900 cities in 70 countries [11]. The number of rides served by Lyft had already reached one billion in the US and Canada, including Toronto, by September 2018 [12].

1.1. Objective

There have been numerous studies in the literature to investigate the influences of the built environment on travel demand. However, the analysis of OD-based ride-sourcing flows is inadequate. This paper aims to fill the research gap of ride-sourcing OD flow analyses in the existing research by applying the geographically weighted regression (GWR) model and two baseline models. We analyze and explain the relationship between OD flow and independent variables (built environment) on the origin and destination levels and then illustrate the marginal effect of the residential area and subway on ride-sourcing OD flow based on urban land use and transportation planning policies.

1.2. Methodology

We adjust the GWR model to analyze the OD-based ride-sourcing travel flow and take the ordinary least squares (OLS) model and the spatial autocorrelation model (SAM) as comparisons. The implications of model coefficients are analyzed. We collect the emerging ride-sourcing order data that can accurately reflect the OD flow distribution of on-demand ride services. We have also collected the built environment data for 70 traffic analysis zones (TAZs) in Hangzhou City, China, including the residential area, ten types of POIs (i.e., beauty, restaurant, education, enterprise, medicine, hotel, house, entertainment service, tourist spots, and shopping), and transit stations (i.e., bus and subway). We further analyze the policy influence of subway station development in 2020 and the increase of the residential areas in 2050 on TAZs. The plausible reasons for future OD flow changes are discussed and interpreted. The results could provide insights into the importance of existing factors and a better understanding of ride-sourcing flow distribution.

1.3. Results

For the total OD flow, we find that the residential area, enterprise, and bus station variables are positively related to OD flow at both the origin and destination levels. However, education and the subway are always negatively related. When analyzing the results for different periods, we find that enterprises attract people in the morning rush hour, while in the evening rush hour, people travel from zones with dense enterprise buildings. There are some differences between weekdays and weekends. Entertainment service and tourist spots play a more important role on weekends. We find that the residential area's marginal effect is positive in most TAZs at the inflow, outflow, and intraflow levels. In contrast, the subway station has negative effects on most TAZs.

1.4. Contribution

The paper’s major contributions are threefold: (a) This study is among the first attempts to utilize ride-sourcing order data to explore the influence of the built environment on OD flows. (b) We explore the influence of the built environment on ride-sourcing OD flow by applying three models (OLS, SAM, and modified GWR). The results show that GWR has better goodness-of-fit than SAM and OLS. (c) The future influences of built environment changes due to local government policies on ride-sourcing OD flows are estimated based on marginal analysis.

The rest of the paper is organized as follows: Section 2 reviews the related work on OD flow estimation in terms of models used to analyze travel flows over time. Section 3 introduces the three models (i.e., GWR, OLS, and SAM). Section 4 introduces the data collected for this study. Section 5 presents the results of the three models. Finally, Section 6 concludes the findings and provides the future direction of research.

2. Literature Review

There has been a vast body of literature exploring the relationship between the built environment and traffic flow. However, when it comes to ride-sourcing data, only a handful of papers exist. Sabouri et al. [13] analyzed how Uber demand in 24 diverse US regions was affected by 5D variables with the multilevel modeling (MLM) method. They found that demand was negatively related to intersection density and destination accessibility variables. Yu and Peng [14] explored the relationship between ride-sourcing demand (from the RideAustin company) and built environment and socioeconomic factors. The factors they chose covered income, age, education, population, and transit accessibility. By applying the geographically weighted regression model, Bao et al. [15] established the relationships between ride-sourcing usage and various environmental factors such as commercial, residential, and parking areas. Gerte et al. [16] used a linear panel model to estimate the relationship between ridesharing adoption and time, built environment, and demographic variables.

However, the studies above aggregated the ride-sourcing flow into origin-based demand flow rather than OD flow among different regions. The latter reflects travel mobility in a more detailed way. By analyzing the relationship between OD flow and built environment, we may find that the built environment factors affect inflow, outflow, and intraflow, which is a research gap worth filling.

Many studies have used spatial analysis approaches to understand the effect of the built environment on various transportation usages. The spatial autoregressive model (SAM) is one of them. LeSage and Pace [17] proposed the standard SAM, which considered the interaction among regions. Many researchers applied the model to predict traffic flow [18–21]. The GWR model differs from SAM by allowing the coefficients of explanatory variables to vary over space [22]. Plenty of researchers applied the GWR model to transit data. Cardozo et al. [23] explored the station-level transit ridership. The prediction of ridership at the Madrid subway stations showed that GWR outperformed traditional ordinary least squares multiple regression. Chiou et al. [24] identified the major factors for the public transit usage rate in Taiwan. The results showed that the GWR model had better accommodation of spatial autocorrelation and better prediction accuracy than the Tobit regression model. Ma et al. [25] explored the relationship between the built environment and transit ridership in Beijing using one-month transit card data and POI data. Other studies utilized data such as taxi trajectory, walking demand, and daily activities. Li et al. [26] used taxi trajectory data within one week and POI information to estimate transportation factors such as pick-ups and drop-offs. Qian and Ukkusuri [27] modeled the spatial taxi ridership distribution through the GWR model with various sociodemographic and built-environment variables. They further investigated how the rise of transportation network companies (TNCs) affected traffic states and emissions [28]. Yang et al. [29] studied walking travel demand at intersections using walking counts over ten years in Chittenden County, Vermont, USA. Lucas et al. [30] studied the influence of travel disadvantage on travel amount with personal surveys. They found that the level of bus services, street connectivity, and neighborhood safety were all significant factors for the daily trips undertaken. Shen et al. [31] collected the car license plate recognition data and analyzed the spatial-temporal automobile travel demand with a geographically and temporally weighted regression model.

Our study estimates how different factors influence the ride-sourcing OD flow, which benefits urban land use and transportation planning. The model can predict the changes in flow once the future built environment change is determined. To the best of our knowledge, this is the first study that applies GWR to ride-sourcing OD flow analyses.

3. Models

OLS is a traditional linear regression model that maps the independent variables linearly to dependent variables. Compared with the traditional OLS model, the SAM introduces origin, destination, and OD dependence to capture the spatial autocorrelation. The GWR model is different from the above models by allowing the coefficients of explanatory variables to vary over space. The following subsections will introduce these models.

The traditional OLS has often been applied to estimate flows in population migration, transportation, and trade. However, the model assumes that observations are independent of each other, which could potentially cause inaccuracy in treating spatial problems. In this paper, we formulate the traditional OLS model as follows:

$$y = \alpha \iota + X_o \beta_o + X_d \beta_d + \gamma g + \varepsilon,$$

where $y$ represents OD flows from origin to destination. It is an $n^2 \times 1$ vector in which $n$ represents the number of TAZs in our study region. Thus, there are $n^2$ OD pairs in total. We stack the vector first by origins and then by destinations. $X_o$ and $X_d$ represent independent variables in the origin and destination side. They are $n^2 \times k$ matrices in which $k$ is the number of independent variables. $\beta_o$ and $\beta_d$ are the associated coefficients. $g$ represents the travel distance between each OD pair. It is an $n^2 \times 1$ vector and $\gamma$ is the associated scalar coefficient. $\alpha \iota$ is the constant term, in which $\iota$ is a vector of ones with the size of $n^2 \times 1$, $\alpha$ is the associated scalar coefficient, and $\varepsilon$ is the random disturbance.
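The stacking described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the zone attributes, distances, and coefficients are synthetic, the helper name `stack_od_design` is hypothetical, and the OD pairs are ordered origin by origin (row `i * n + j` is the pair from origin i to destination j).

```python
import numpy as np

def stack_od_design(X, dist):
    """Build the stacked OD design matrix [1, X_o, X_d, g].

    X    : (n, k) zone-level attributes (hypothetical input)
    dist : (n, n) travel distance between zones (hypothetical input)
    Rows are ordered origin by origin: row i * n + j is the pair (i, j).
    """
    n, k = X.shape
    X_o = np.repeat(X, n, axis=0)   # attributes of origin i for every destination
    X_d = np.tile(X, (n, 1))        # attributes of destination j for every origin
    g = dist.reshape(-1, 1)         # distance of each OD pair, same ordering
    ones = np.ones((n * n, 1))      # constant term
    return np.hstack([ones, X_o, X_d, g])

# Synthetic example: 6 zones, 2 attributes, noise-free flows.
rng = np.random.default_rng(0)
n, k = 6, 2
X = rng.normal(size=(n, k))
dist = rng.uniform(1.0, 10.0, size=(n, n))
design = stack_od_design(X, dist)

# Assumed coefficients: [constant, origin (k), destination (k), distance].
true_coef = np.array([0.5, 0.1, -0.2, 0.3, -0.1, -0.05])
y = design @ true_coef

# Ordinary least squares fit on the stacked system.
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
```

With 70 TAZs the same construction would give 4,900 rows, one per OD pair.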

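The spatial autocorrelation model extends the traditional OLS by using three spatial weight matrices for origin, destination, and origin-to-destination dependence. It can be formulated by adding three correlated terms [17]:

$$y = \rho_o W_o y + \rho_d W_d y + \rho_w W_w y + \alpha \iota + X_o \beta_o + X_d \beta_d + \gamma g + \varepsilon,$$

where $W_o$, $W_d$, and $W_w$ represent the origin, destination, and OD dependence. They are $n^2 \times n^2$ spatial weight matrices built from an $n \times n$ matrix $W$ whose elements are relevant to the distance between regions, that is, they depend on $d_{ij}$, the distance between origin i and destination j. $I_n$ is an identity matrix with the size of $n \times n$, and, following [17], $W_o = W \otimes I_n$, $W_d = I_n \otimes W$, and $W_w = W \otimes W$. $\rho_o$, $\rho_d$, and $\rho_w$ are the associated scalar coefficients representing the effect strength of $W_o$, $W_d$, and $W_w$, respectively.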
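As a rough illustration of the three dependence structures, the sketch below builds them with Kronecker products in the manner of LeSage and Pace [17]. Several things here are assumptions rather than details taken from this paper: the row-standardized inverse-distance form of the zone-level weights, the helper names, and the origin-by-origin stacking of OD pairs.

```python
import numpy as np

def row_standardized_inverse_distance(dist):
    """Zone-level weights w_ij proportional to 1 / d_ij (assumed form)."""
    W = np.zeros_like(dist, dtype=float)
    off_diagonal = ~np.eye(dist.shape[0], dtype=bool)
    W[off_diagonal] = 1.0 / dist[off_diagonal]
    return W / W.sum(axis=1, keepdims=True)   # each row sums to one

def od_weight_matrices(W):
    """Origin, destination, and origin-to-destination weight matrices."""
    identity = np.eye(W.shape[0])
    W_o = np.kron(W, identity)   # neighbors of the origin, same destination
    W_d = np.kron(identity, W)   # neighbors of the destination, same origin
    W_w = np.kron(W, W)          # neighbors of both ends
    return W_o, W_d, W_w

# Toy example with three zones on a line.
dist = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 1.0],
                 [2.0, 1.0, 0.0]])
W = row_standardized_inverse_distance(dist)
W_o, W_d, W_w = od_weight_matrices(W)

Y = np.arange(9, dtype=float).reshape(3, 3)   # Y[i, j]: flow from i to j
y = Y.reshape(-1)                             # stacked origin by origin
origin_lag = (W_o @ y).reshape(3, 3)          # average flow from neighboring origins
destination_lag = (W_d @ y).reshape(3, 3)     # average flow to neighboring destinations
```

In matrix form the origin lag equals `W @ Y` and the destination lag equals `Y @ W.T`, which is a convenient check on the stacking order.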

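The GWR model is different from the above models by allowing the coefficients of explanatory variables to vary over space. We change the model to estimate the OD flow. The model can be formulated as follows:

$$y_{ij} = \beta_0(u_i, v_i) + \sum_{k} \beta^{o}_{k}(u_i, v_i)\, x^{o}_{ik} + \sum_{k} \beta^{d}_{k}(u_i, v_i)\, x^{d}_{jk} + \varepsilon_{ij},$$

where $(u_i, v_i)$ represents the location of the centroid of the ith TAZ. $y_{ij}$ is the OD flow from origin i to destination j. $x^{o}_{ik}$ and $x^{d}_{jk}$ represent the kth independent variables on origin i and destination j. $\beta^{o}_{k}$ and $\beta^{d}_{k}$ are the associated coefficients using the same bandwidth. $\beta_0$ is the constant term. $\varepsilon_{ij}$ is the random disturbance.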

Algebraically, the GWR estimates can be expressed as follows:

$$\hat{\beta}(u_i, v_i) = \left(X^{T} W(u_i, v_i) X\right)^{-1} X^{T} W(u_i, v_i)\, y,$$

where $\hat{\beta}(u_i, v_i)$ is the associated coefficient vector. $X = [X_o, X_d]$ represents independent variables on the origin and destination side. $X_o$ and $X_d$ are $n^2 \times k$ matrices in which $k$ is the number of independent variables. $W(u_i, v_i)$ is a diagonal matrix, in which the diagonal elements are the weights $w_{ij}$. $w_{ij}$ denotes the allocated weight for neighboring TAZ i and TAZ j. It is determined by the adaptive Gaussian kernel function of $d_{ij}$ and $h$, where $d_{ij}$ refers to Euclidean distance between TAZ i and TAZ j. $h$ is an adaptive bandwidth. We use the Akaike information criterion (AIC) to choose the best specification of $h$.
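The local estimation step can be sketched as a weighted least-squares solve at each focal location. The sketch below is a simplified illustration, not the estimation code used in this study: it treats each observation as a point with coordinates, assumes the Gaussian kernel form exp(-0.5 (d / h)^2), and takes the adaptive bandwidth h as the distance to the `n_neighbors`-th nearest observation. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def gwr_local_coefficients(coords, X, y, focal, n_neighbors):
    """Weighted least-squares estimate of the coefficients at one focal point.

    coords      : (m, 2) observation coordinates
    X           : (m, p) design matrix, including a column of ones
    y           : (m,) dependent variable
    focal       : index of the focal observation
    n_neighbors : adaptive bandwidth, counted in nearest observations (>= 2)
    """
    d = np.linalg.norm(coords - coords[focal], axis=1)
    # Adaptive bandwidth: distance to the n_neighbors-th nearest observation
    # (the focal observation itself counts as the first).
    h = np.sort(d)[n_neighbors - 1]
    w = np.exp(-0.5 * (d / h) ** 2)   # assumed Gaussian kernel form
    Xw = X * w[:, None]               # rows of X scaled by their weights
    # Solve (X^T W X) beta = X^T W y without forming the diagonal matrix W.
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Synthetic check: when the data follow one global linear relation,
# every local estimate should recover the same coefficients.
rng = np.random.default_rng(1)
m = 30
coords = rng.uniform(0.0, 10.0, size=(m, 2))
X = np.column_stack([np.ones(m), rng.normal(size=(m, 2))])
beta_true = np.array([1.0, 0.5, -0.3])
y = X @ beta_true
local = np.array([gwr_local_coefficients(coords, X, y, i, 10) for i in range(m)])
```

In practice the bandwidth would be chosen by minimizing AIC over candidate values, as described above; the fixed value of 10 here is arbitrary.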

4. Data

4.1. Ride-Sourcing Passenger Flow Pattern

The city-wide ride-sourcing order data are collected during March 6–12, 2017, from Didi company in Hangzhou, China, as the ride-sourcing passenger flow input to our models. Figure 1 illustrates the hourly orders during the period. There are two peaks on weekdays in Figure 1(a) (7:00–9:00 and 17:00–19:00), while there is only one peak from 16:00 to 18:00 on weekends. As shown in Figures 1(b) and 1(c), we analyze the relationship between travel time and the number of orders. There is no significant difference between the two distributions. The spatial distributions of the trip origin in the morning and evening peak hours (7:00–9:00 and 17:00–19:00) are shown in Figures 1(d) and 1(e). Most trips originate from the city center. The distribution of ride-sourcing trips reflects the urban mobility pattern to some extent.

4.2. Data Statistics in TAZs

The statistics (i.e., minimum, mean, maximum, and standard deviation) of the data at the TAZ level are shown in Table 1. The TAZ and residential area information is offered by the Hangzhou Planning Bureau, which is the government agency in charge of urban planning. We have collected the information on POIs from AMAP, one of the largest map service companies in China. There are ten types of POIs in the dataset: beauty (barbershop and beauty salon), restaurant, education, enterprise, medicine, hotel, house, entertainment service (government bodies, gym, and places for recreation and entertainment), tourist spots, and shopping (grocery store, supermarket, furniture, and computer market). The numbers of subway stations and bus stations are also collected. There were only three subway lines in 2019, and the largest number of subway stations in one TAZ was only six. As shown in Figure 2, there are 70 TAZs in Hangzhou. The study region covers from 119.89 to 120.57 degrees longitude and from 30.07 to 30.5 degrees latitude, which contains the urban center.

| Variables | Std. | Min | Mean | Max | Number of observations |
| Residential area (m²) | 1,496,802.67 | 0 | 685,900 | 6,439,900 | 70 |
| Tourist spots | 118.46 | 1 | 49.56 | 897 | 70 |
| Flow (daily) | 7,937.51 | 0 | 24.32 | 7172 | 4,900 |

Std.: standard deviation.
4.3. Explanatory Variables

We choose the residential area, ten types of POIs, transit stations, and travel distance as the explanatory variables, all of which are defined at the TAZ level. Since we divide the study area into 70 TAZs, there are 70 observations for each variable. There are 4,900 (70 by 70) observations of ride-sourcing passenger flow and travel distance variables defined for each OD pair.
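The step from individual orders to the 4,900 OD observations can be illustrated with a small aggregation sketch. The order records below are made up, the function name is hypothetical, and zones are assumed to be labeled with zero-based indices.

```python
import numpy as np

def build_od_matrix(origins, destinations, n_zones):
    """Count trips for every (origin TAZ, destination TAZ) pair."""
    od = np.zeros((n_zones, n_zones), dtype=int)
    # np.add.at accumulates correctly when the same OD pair appears many times.
    np.add.at(od, (origins, destinations), 1)
    return od

# Hypothetical toy orders: one entry per trip, values are TAZ indices.
origins = np.array([0, 0, 1, 2, 2, 2])
destinations = np.array([1, 1, 1, 0, 2, 2])
od = build_od_matrix(origins, destinations, n_zones=3)

# One observation per OD pair, ordered origin by origin.
flow_vector = od.reshape(-1)
```

With `n_zones=70` the flattened vector has 70 × 70 = 4,900 entries, matching the number of flow observations stated above.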

4.3.1. Residential Area

As the origin of travel demand, the residential area is a critical variable for the ride-sourcing passenger flow analysis. We have collected the land use data, including the base year of 2019 and the future year of 2050. It contains the size of residential land use as well as the commerce and residence land in each TAZ.

As shown in Figure 3(a), more than 80 thousand residents live in TAZs 1, 17, 27, 33, 38, 47, and 52. Since the residential area in these 70 TAZs varies significantly, we normalize the area size between 0 and 1. The distribution of the residential area in 2050 is consistent with that in 2019, as shown in Figure 3(b).
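Scaling the area size to the range from 0 to 1 is commonly done with min-max normalization. The sketch below assumes that this is the scheme meant here; the function name is hypothetical and the three sample values are simply the minimum, mean, and maximum residential areas reported in Table 1.

```python
import numpy as np

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All zones have the same value; return zeros to avoid dividing by zero.
        return np.zeros_like(values)
    return (values - values.min()) / span

# Residential areas in square meters (min, mean, and max from Table 1).
areas = [0.0, 685_900.0, 6_439_900.0]
normalized = min_max_normalize(areas)
```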

4.3.2. POIs

As attractions for passengers, POIs have significant impacts on the ride-sourcing flow. We have collected the POI data from AMAP, one of China's biggest map service companies. Among the ten types of POIs, medicine facilities include pharmacies, clinics, and hospitals. Beauty facilities contain shops such as barbershops and beauty salons where people improve their looks. Entertainment service facilities contain government bodies, gyms, and places for recreation and entertainment. Shopping facilities contain grocery stores, supermarkets, and furniture and computer markets. Education includes universities, middle schools, primary schools, and kindergartens. Enterprise mainly covers office buildings. We illustrate the zonal distribution of each type of POI in Figure 4.

4.3.3. Transit Stations

The transit station is another important factor influencing ride-sourcing passenger flows. This paper collects the public transit data, including the numbers of bus stations and subway stations in each TAZ in 2019. Generally, the ride-sourcing flow is expected to rise with the number of bus stations and subway stations. As shown in Figure 5(a), the city center has the highest subway station density. There were 76 subway stations and three subway lines in Hangzhou in 2019. As shown in Figure 5(b), there were 4,227 bus stations in 2019, most of which are located in the central, northern, and southern parts of Hangzhou.

4.3.4. Multicollinearity Problem

The existence of multicollinearity will lead to bias in the experiment results. To address this problem, we calculated the Pearson correlations among all the variables. The results show that some variables are strongly related to others. The beauty variable is strongly related to the restaurant, medicine, house, and entertainment variables with correlation coefficients of 0.910, 0.908, 0.910, and 0.910. The restaurant variable is closely related to the education, medicine, and house variables with correlation coefficients of 0.917, 0.905, and 0.911. Medicine is closely related to the house, entertainment, and shopping variables with correlation coefficients of 0.901, 0.952, and 0.918. The house variable is related to entertainment and shopping with correlation coefficients of 0.877 and 0.869. The shopping variable is related to entertainment, with a correlation coefficient of 0.885. The remaining coefficients are below 0.7. Following Qian et al. [28], variables with correlation coefficients higher than 0.7 are deleted. Thus, the beauty, restaurant, medicine, house, and shopping variables are removed from the explanatory variables in the experiment. The Pearson correlation coefficients among the other explanatory variables are listed in Table 2. We have further calculated the variance inflation factor (VIF) for the rest of the variables to ensure no multicollinearity issues. The VIF indices of these variables are as follows: residential area: 1.093, education: 3.011, enterprise: 5.675, hotel: 4.581, entertainment: 3.425, tourist spots: 2.930, bus: 1.885, and metro: 1.598. None of them exceeds 6. Thus, there is no multicollinearity issue among the rest of the variables.
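The two screening steps above (a pairwise correlation threshold followed by VIF) can be sketched as follows. This is an illustration on a tiny synthetic dataset, not the study's data; the column names and helper names are hypothetical. The VIF is computed from the diagonal of the inverse correlation matrix, which is equivalent to 1 / (1 - R²) from regressing each variable on the others.

```python
import numpy as np

def high_correlation_pairs(X, names, threshold=0.7):
    """List variable pairs whose absolute Pearson correlation exceeds the threshold."""
    corr = np.corrcoef(X, rowvar=False)
    pairs = []
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            if abs(corr[a, b]) > threshold:
                pairs.append((names[a], names[b], corr[a, b]))
    return pairs

def variance_inflation_factors(X):
    """VIF of each column: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic columns: "a" and "b" are uncorrelated, "c" is strongly tied to "a".
a = np.array([1.0, -1.0, 1.0, -1.0, 0.0, 0.0])
b = np.array([1.0, 1.0, -1.0, -1.0, 0.0, 0.0])
c = a + 0.5 * np.array([0.0, 0.0, 0.0, 0.0, 1.0, -1.0])
X = np.column_stack([a, b, c])

pairs = high_correlation_pairs(X, ["a", "b", "c"])
vif = variance_inflation_factors(X)
```

In this toy case only the pair ("a", "c") exceeds 0.7, and dropping one of the two would bring the remaining VIF values back to 1.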

| | Residential area | Education | Enterprise | Hotel | Life | Tourist spots | Bus | Subway |
| Residential area | 1.000 | −0.150 | −0.041 | −0.073 | −0.088 | −0.118 | 0.002 | 0.011 |
| Tourist spots | −0.118 | 0.281 | 0.172 | 0.722 | 0.321 | 1.000 | 0.165 | −0.026 |

4.3.5. Spatial Autocorrelation Test

Spatial autocorrelation of an explanatory variable means that its value in one zone depends on its values in neighboring zones. The existence of spatial autocorrelation is the basis of the GWR model. Thus, before applying the GWR model, the spatial autocorrelation should be tested. We adopted Moran's I to test for spatial autocorrelation, as it is the most commonly used index in the literature. Moran's I values of all explanatory variables are summarized in Table 3. A Z test value larger than 1.64 or smaller than −1.64 means that the variable is statistically significant and has a strong spatial autocorrelation. As can be seen, most variables are significant except for the shopping, house, and subway variables. The subway variable is not significant, since it is very sparse. Only 76 subway stations had been built in Hangzhou by 2019. However, it is a rather important variable for transportation planning. Thus, we keep the subway station variable and remove the shopping and house variables.
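For readers unfamiliar with the statistic, the sketch below computes Moran's I and a Z value for a zonal variable. The paper does not state which variance assumption or weight matrix was used, so the normality-based variance and the binary adjacency weights in the example are assumptions, and the function name is hypothetical.

```python
import numpy as np

def morans_i(x, W):
    """Moran's I and its Z value under the normality assumption.

    x : (n,) values of one variable, one per zone
    W : (n, n) spatial weights with a zero diagonal
    """
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    n = x.size
    dev = x - x.mean()
    s0 = W.sum()
    moran = (n / s0) * (dev @ W @ dev) / (dev @ dev)

    expected = -1.0 / (n - 1)                       # mean under no autocorrelation
    s1 = 0.5 * ((W + W.T) ** 2).sum()
    s2 = ((W.sum(axis=0) + W.sum(axis=1)) ** 2).sum()
    variance = (n**2 * s1 - n * s2 + 3 * s0**2) / ((n**2 - 1) * s0**2) - expected**2
    z = (moran - expected) / np.sqrt(variance)
    return moran, z

# Four zones on a line, each adjacent to its immediate neighbors.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

trend_i, trend_z = morans_i([1.0, 2.0, 3.0, 4.0], W)      # smooth gradient
checker_i, checker_z = morans_i([1.0, -1.0, 1.0, -1.0], W)  # alternating pattern
```

A smooth gradient gives a positive index (similar values cluster), while the alternating pattern gives a negative one (neighbors are dissimilar).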

| Explanatory variable | Moran's I | Z test | p value |
| Residential area | −1.136 | −1.64 | 0.05 |
| Tourist spots | −0.046 | 1.92 | 0.03 |

5. Results

5.1. Model Estimation Results

We use the overall OD data during March 6–12, 2017, and normalize the dependent variable and independent variables before fitting the models. Table 4 shows the coefficients of SAM and OLS. The −2 log-likelihood, AIC (Akaike information criterion), BIC (Bayesian information criterion), and AICc (second-order AIC) indicate that SAM is more accurate than OLS. This indicates that taking origin, destination, and OD dependence into consideration is essential. Most coefficients in SAM are statistically significant, which means that most variables have considerable influences on ride-sourcing flows. The results of the GWR are summarized in Table 5. The criterion for the optimal bandwidth is AIC, and the chosen bandwidth is 342. Because the GWR coefficients vary by location, they cannot all be presented in one table. Hence, the average, minimum, maximum, and standard deviation of the coefficients are presented. The −2 log-likelihood, AIC, BIC, and AICc show that the GWR model fits the data better than SAM. Hence, we will choose GWR to predict OD flow in Section 5.2.
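The relation between the −2 log-likelihood and the three information criteria can be made concrete with the standard textbook formulas. In the sketch below, only the −2 log-likelihood values and the 4,900 observations come from the text; the parameter counts are hypothetical placeholders, since the paper's counts are not reproduced here (for GWR an effective number of parameters would normally be used).

```python
import math

def information_criteria(neg2_loglik, n_params, n_obs):
    """AIC, BIC, and AICc from a -2 log-likelihood value (lower is better)."""
    aic = neg2_loglik + 2 * n_params
    bic = neg2_loglik + n_params * math.log(n_obs)
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}

n_obs = 4900  # OD pairs (70 x 70)

# Hypothetical parameter counts, used only to show the mechanics.
sam = information_criteria(-14933.8, n_params=21, n_obs=n_obs)
ols = information_criteria(-13559.3, n_params=18, n_obs=n_obs)

# The model with the lower criterion value is preferred.
preferred = "SAM" if sam["AIC"] < ols["AIC"] else "OLS"
```

Because the penalty terms are small relative to the gap in −2 log-likelihood, the ranking here is driven almost entirely by the likelihood values.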

| Independent variables | | SAM | | OLS | |
| Residential area variables | O_pop | 0.103 | 1.688 | 0.147 | −0.541 |
| POIs variables | O_education | −0.243 | 1.696 | −0.277 | 0.584 |
| Bus station variables | O_bus | 0.450 | 1.856 | 0.898 | −6.016 |
| Subway station variables | O_subway | −0.156 | 0.877 | 0.040 | −0.948 |
| Spatial variables | Distance | −0.221 | 0.913 | −1.312 | 2.259 |
| −2 log-likelihood | | −14,933.8 | | −13,559.3 | |

0.05 level; 0.1 level; NA: not applicable.

| Explanatory variable | | Average coefficient | Min coefficient | Max coefficient | STD coefficient |
| Residential area variables | O_pop | 0.102 | −0.061 | 0.729 | 0.164 |
| POIs variables | O_education | −0.245 | −1.298 | 0.497 | 0.402 |
| Bus station variables | O_bus | 0.527 | −0.866 | 2.015 | 0.657 |
| Subway station variables | O_subway | −0.202 | −0.596 | 0.209 | 0.175 |
| −2 log-likelihood | | −52,874.39 | | | |

5.1.1. Influences of Residential Area and POIs

The coefficients provide some insights into how these variables influence ride-sourcing flows. As shown in Table 4, the residential area at both origin and destination has a positive coefficient in OLS and SAM. This result is consistent with the GWR model in Table 5, which is reasonable, since the residential area is the source and the strong attraction of traffic flow.

However, in the GWR model, education plays a negative role, since most Chinese parents would prefer kindergarten, primary school, and middle school close to their home locations, and their children do not need to take ride-sourcing service. A walk or bike ride is enough to cover the distance. For high school or college/university in China, most students live on campus, indicating that this group of people does not need to take the ride-sourcing service often either. That is why the education variable does not have a positive coefficient, although it is supposed to be a strong attraction of flow. Enterprise has a positive effect on the ride-sourcing OD flow. Since office buildings are places with a large population density (hundreds of working people gather in one office building) and high access frequency (people access their workplace almost every weekday), their influences on flow are positive at both origin and destination levels.

Hotels are dense around transportation hubs, such as high-speed railway stations and bus terminals. Many people would choose ride-sourcing services to travel from a hotel to a transportation hub and vice versa, resulting in a positive coefficient. As for entertainment services, people go to these places for entertainment. Thus, they tend to choose the most relaxed transportation mode, like taxis or ride-sourcing services. The coefficient is mainly positive. Tourist spots at the destination have negative correlations with ride-sourcing flows, and the spots at the origin have little effect.

5.1.2. Effects of Transit Stations

In OLS and SAM (Table 4), bus stations at origin and destination have a strong positive effect on OD flow. The results are consistent with the GWR model in Table 5. The result is surprising, since the bus, as another mode of transportation, is supposed to reduce the pressure on the roadway. Figures 1 and 5(b) show that the distribution of bus stations is consistent with the trip origin/destination distribution. Thus, we infer that the relationship between the bus station and OD flow is correlational rather than causal. Bus stations are built at locations with high demand to reduce the traffic pressure, but their capacity is not large enough to reduce the ride-sourcing flow to a low level, which results in a large coefficient.

The effect of subway stations is consistent in GWR and SAM, where the average coefficient is negative. Some studies [25, 26, 31] find that subway stations were positively related to walking demand, transit ridership, or automobile travel demand, since subway stations do not compete with these travel modes. However, in our case, subway stations relieve traffic pressure on the roadway by attracting passengers to the subway. Thus, it reduces the ride-sourcing OD flow and obtains a negative coefficient.

5.1.3. Effects of Explanatory Variables in Different Periods

Since the GWR model outperforms SAM, we further explore the effects of explanatory variables in different periods with the GWR model. As shown in Table 6, the signs of the residential area, education, hotel, bus, and subway station coefficients on weekdays and weekends are the same as those in Table 5. The residential area, hotel, and bus variables always play a positive role in attracting and generating traffic flow, and the education and subway station variables always play a negative role. On weekdays, the enterprise variable has a positive effect on OD flow, while on weekends, its effect is weaker and a negative coefficient even appears at the destination level. On weekends, entertainment and tourist spots are more attractive to passengers, since the weekend is a time for recreation and outings.

| Explanatory variable | | Weekday | Weekend | AM peak | PM peak | Off-peak |
| Residential area variables | O_pop | 0.098 | 0.108 | 0.097 | 0.094 | 0.107 |
| POIs variables | O_education | −0.234 | −0.270 | −0.257 | −0.276 | −0.227 |
| Bus station variables | O_bus | 0.533 | 0.519 | 0.195 | 0.543 | 0.631 |
| Subway station variables | O_subway | −0.196 | −0.216 | −0.217 | −0.206 | −0.195 |

For the coefficients at different times of day, the effects of the residential area, education, hotel, and subway station are consistent with those on weekdays and weekends. Nevertheless, for enterprise, the OD flow going to zones with more enterprise buildings is larger in the morning rush hour, whereas the OD flow from zones with more enterprise buildings is larger in the evening rush hour. This is related to the activity of commuters who go to enterprise buildings in the morning rush hour and leave them in the evening rush hour.

5.2. Marginal Effects of Policy Implementation
5.2.1. Marginal Effects of Residential Area Change

In 2050, the residential area is predicted to increase by 48.30%. The residential area distribution is shown in Figure 3(b). The increase in residential areas is not even across all TAZs. With the coefficients estimated in GWR, we predict the marginal effects of residential area change in 2050. The change ratio (local flow difference divided by local flow in 2019) of OD flow is illustrated in Figure 6.

As shown in Figure 6, we divide the flow into three types, that is, intraflow (trips that depart from and arrive at the same TAZ), inflow (summation of trips arriving at the TAZ), and outflow (summation of trips departing from the TAZ). As shown in Figure 6(b), the outflow changes are consistent with residential area changes. TAZs with a significant residential size increase (15, 12, 24, 54, 58, and 67) will have an increase in the outflow. TAZs with a residential size decrease, like 1 and 47, will have a decrease in the outflow. As shown in Figure 6(a), for most TAZs, the changes of outflow and inflow are similar. Meanwhile, for TAZ 47, a decrease in residential areas is accompanied by an increase in inflow. The possible explanation is that TAZ 54, which has a large increase in residential areas, influences its neighbor TAZ 47, causing many people to enter TAZ 47. A similar situation happens to TAZ 33 and TAZ 52, which have no increase in the residential area but have an increase in inflow, since their neighbors TAZ 25 and TAZ 24 have a rising residential area size. The intraflow change in Figure 6(c) is much more even. The intraflow of TAZ 69 and TAZ 17 decreases. It may be caused by the increase of residential areas in TAZ 58 and TAZ 15, which attract part of the intraflow. Overall, the outflow changes are consistent with residential area changes.
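The three flow types and the change ratio can be computed directly from a base-year and a predicted OD matrix. The sketch below uses made-up 2 × 2 matrices and hypothetical helper names; it assumes that inflow and outflow are plain column and row sums (so they include the intraflow) and leaves the ratio undefined where the base flow is zero.

```python
import numpy as np

def zone_flows(od):
    """Intraflow, inflow, and outflow of every TAZ from an OD matrix."""
    od = np.asarray(od, dtype=float)
    intraflow = np.diag(od).copy()   # trips that start and end in the same TAZ
    inflow = od.sum(axis=0)          # trips arriving at each TAZ (column sums)
    outflow = od.sum(axis=1)         # trips departing from each TAZ (row sums)
    return intraflow, inflow, outflow

def change_ratio(base, future):
    """(future - base) / base per TAZ; NaN where the base flow is zero."""
    base = np.asarray(base, dtype=float)
    future = np.asarray(future, dtype=float)
    ratio = np.full(base.shape, np.nan)
    nonzero = base != 0
    ratio[nonzero] = (future[nonzero] - base[nonzero]) / base[nonzero]
    return ratio

# Hypothetical OD matrices for two zones (rows: origins, columns: destinations).
od_base = np.array([[10, 5],
                    [3, 0]])
od_future = np.array([[12, 5],
                      [6, 0]])

intra_b, in_b, out_b = zone_flows(od_base)
intra_f, in_f, out_f = zone_flows(od_future)

outflow_ratio = change_ratio(out_b, out_f)
inflow_ratio = change_ratio(in_b, in_f)
intraflow_ratio = change_ratio(intra_b, intra_f)
```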

5.2.2. Marginal Effects of Transit Station Change

The local government planned to construct ten new subway lines by 2022. Figure 7 illustrates the newly built subway stations. There will be 260 subway stations in Hangzhou, an increase of 242.11% compared to the base year. Based on the coefficients acquired in the GWR model above, the changes in ride-sourcing OD flows across the whole city can be estimated.
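The quoted growth rate follows from the 76 stations in the base year and the 260 planned stations. A short check, with a hypothetical helper name:

```python
def percent_increase(base, future):
    """Relative increase from base to future, in percent."""
    return (future - base) / base * 100.0

stations_base = 76       # subway stations in 2019
stations_planned = 260   # subway stations after the planned construction

increase = percent_increase(stations_base, stations_planned)
print(f"{increase:.2f}%")
```

The 184 additional stations amount to roughly 2.42 times the base-year count, which is where the 242.11% figure comes from.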

We show the OD flow change ratio in Figure 8. Since the average coefficient of the subway station is negative, the intraflow, outflow, and inflow of most TAZs will decrease. In Figure 8(a), TAZ 56 has the largest outflow decrease, since its number of subway stations increases dramatically from 0 to 11. For TAZ 38 and TAZ 66, which have no newly built subway stations, the outflow increase is caused by new stations in nearby TAZs like 42, 10, 6, and 57. People might flow out of the zone to take the subway. A similar conclusion can be drawn in the inflow case, where TAZ 38 and TAZ 66 have an increase in inflow. Since TAZ 38 and TAZ 66 did not contain dense subway stations in 2019, people had to take the ride-sourcing service to enter these TAZs, causing an increase in inflow. The intraflow is consistent with the outflow.

6. Conclusions

This paper explores the influence of several built environment variables (e.g., residential area, POIs, and transit stations) on ride-sourcing OD flow. This study differs from related research by analyzing ride-sourcing OD flow rather than only origin-based demand (outflow) or destination-based inflow, which offers more detailed spatial information. The results of the OLS, SAM, and GWR models are compared. The GWR model differs from the other models by allowing the coefficients of explanatory variables to vary over space. The results show that the GWR model outperforms both the SAM and OLS models. On average, an increase in the residential area, enterprise, and bus station variables increases OD flow at both the origin and destination levels. An increase in the education and subway variables has the opposite effect, since students are not the main group of ride-sourcing passengers and the subway competes with ride-sourcing services. Regarding the time of day, we find that enterprises attract people in the morning rush hour and that people depart from enterprise buildings in the evening rush hour. This pattern reflects the activity of commuters, who go to work in the morning and leave work in the evening. Entertainment services and tourist spots play different roles on weekends and weekdays. We also calculate and illustrate the changes in OD flow based on the residential area and the subway line construction plan. The findings and the modeling approach in this study help to better understand ride-sourcing flow and provide planners and policymakers with scientific guidance on urban land use and transportation planning.
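The idea that GWR lets coefficients vary over space can be sketched as one kernel-weighted least-squares fit per location. The code below is a minimal illustration, not the implementation used in this study: the Gaussian kernel, the fixed bandwidth, and the function name `gwr_fit` are assumptions, and the data are synthetic.

```python
import numpy as np

def gwr_fit(coords, X, y, bandwidth):
    """Estimate one coefficient vector per location by kernel-weighted least squares.

    coords: (n, 2) location of each observation
    X:      (n,) or (n, p) explanatory variables (an intercept is added here)
    y:      (n,) response
    Returns an (n, p + 1) array; row i holds the local [intercept, slopes] at location i.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    Xd = np.column_stack([np.ones(n), np.asarray(X, dtype=float)])
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)   # distance to regression point i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)          # Gaussian kernel weights (assumed)
        XtW = Xd.T * w                                   # X^T W without forming diagonal W
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)    # local weighted least squares
    return betas

# Synthetic example: the slope grows from west to east, which a global OLS fit cannot capture.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 1, size=(200, 2))
x = rng.normal(size=200)
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x + rng.normal(scale=0.1, size=200)
local_betas = gwr_fit(coords, x, y, bandwidth=0.2)
print(local_betas[:, 1].min(), local_betas[:, 1].max())
```

With a very large bandwidth all weights approach one and every local fit collapses to the global OLS estimate, which is why bandwidth selection matters in practice.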

This paper has a few limitations. All independent variables are treated equally, without considering their scale; for example, large shopping malls should carry more weight than small ones. In future work, multisource datasets such as subway transaction data could be introduced to compare the impact of the built environment on different transportation modes. Service characteristics (e.g., price and type of car) should also be considered for mode choice evaluations.

Data Availability

The POI information can be accessed by utilizing the AMAP API service by visiting

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This research was financially supported by the National Key Research and Development Program of China (2018YFB1600900) and the National Natural Science Foundation of China (71922019, 71961137005, 71771198, and 71772195). This work was also supported in part by the Zhejiang University/University of Illinois at Urbana-Champaign Institute and was led by Principal Supervisor Simon Hu. The authors thank Didi Chuxing for offering the ride-sourcing data.


  1. C. Chen, D. Zhang, N. Li, and Z.-H. Zhou, “B-planner: planning bidirectional night bus routes using large-scale taxi GPS traces,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 4, pp. 1451–1465, 2014. View at: Publisher Site | Google Scholar
  2. C. Chen, D. Zhang, B. Guo, X. Ma, G. Pan, and Z. Wu, “Trip planner: personalized trip planning leveraging heterogeneous crowdsourced digital footprints,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 3, pp. 1259–1273, 2015. View at: Publisher Site | Google Scholar
  3. C. Peng, X. Jin, K. C. Wong, M. Shi, and P. Liò, “Collective human mobility pattern from taxi trips in urban area,” PLoS One, vol. 7, no. 4, Article ID e34487, 2012. View at: Publisher Site | Google Scholar
  4. G. Pan, G. Qi, Z. Wu, D. Zhang, and S. Li, “Land-use classification using taxi GPS traces,” IEEE Transactions on Intelligent Transportation Systems, vol. 14, no. 1, pp. 113–123, 2013. View at: Publisher Site | Google Scholar
  5. M. P. Dixon and L. R. Rilett, “Population origin-destination estimation using automatic vehicle identification and volume data,” Journal of Transportation Engineering, vol. 131, no. 2, pp. 75–82, 2005. View at: Publisher Site | Google Scholar
  6. C. Chen, L. Zheng, C. Cui, and W. Liu, “Estimating origin-destination flows using radio frequency identification data,” in Green, Pervasive, and Cloud Computing, S. Li, Ed., pp. 215–225, Springer International Publishing, Cham, Switzerland, 2019. View at: Publisher Site | Google Scholar
  7. M. G. Demissie, F. Antunes, C. Bento, S. Phithakkitnukoon, and T. Sukhvibul, “Inferring origin-destination flows using mobile phone data: a case study of Senegal,” in Proceedings of the 2016 13th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, pp. 1–6, IEEE, Chiang Mai, Thailand, June 2016. View at: Publisher Site | Google Scholar
  8. L. Alexander, S. Jiang, M. Murga, and M. C. González, “Origin-destination trips by purpose and time of day inferred from mobile phone data,” Transportation Research Part C: Emerging Technologies, vol. 58, pp. 240–250, 2015. View at: Publisher Site | Google Scholar
  9. A. Alsger, B. Assemi, M. Mesbah, and L. Ferreira, “Validating and improving public transport origin-destination estimation algorithm using smart card fare data,” Transportation Research Part C: Emerging Technologies, vol. 68, pp. 490–506, 2016. View at: Publisher Site | Google Scholar
  10. T. Cetin and E. Deakin, “Regulation of taxis and the rise of ridesharing,” Transport Policy, vol. 76, pp. 149–158, 2019. View at: Publisher Site | Google Scholar
  11. Uber Technologies Inc., “Use uber in cities around the world,” 2020, View at: Google Scholar
  12. CNBC, “Lyft has now delivered 1 billion rides,” 2018, View at: Google Scholar
  13. S. Sabouri, K. Park, A. Smith, G. Tian, and R. Ewing, “Exploring the influence of built environment on uber demand,” Transportation Research Part D: Transport and Environment, vol. 81, Article ID 102296, 2020. View at: Publisher Site | Google Scholar
  14. H. Yu and Z.-R. Peng, “Exploring the spatial variation of ridesourcing demand and its relationship to built environment and socioeconomic factors with the geographically weighted poisson regression,” Journal of Transport Geography, vol. 75, pp. 147–163, 2019. View at: Publisher Site | Google Scholar
  15. J. Bao, P. Liu, H. Yu, and J. Wu, “Spatial analysis for the usage of ride-sourcing services, an application of geographically weighted regression,” in Proceedings of the 17th COTA International Conference of Transportation Professionals, Shanghai, China, July 2017. View at: Google Scholar
  16. R. Gerte, K. C. Konduri, and N. Eluru, “Is there a limit to adoption of dynamic ridesharing systems? evidence from analysis of uber demand data from New York city,” Transportation Research Record: Journal of the Transportation Research Board, vol. 2672, no. 42, pp. 127–136, 2018. View at: Publisher Site | Google Scholar
  17. J. P. LeSage and R. K. Pace, “Spatial econometric modeling of origin-destination flows,” Journal of Regional Science, vol. 48, no. 5, pp. 941–967, 2008. View at: Publisher Site | Google Scholar
  18. K. Kerkman, K. Martens, and H. Meurs, “A multilevel spatial interaction model of transit flows incorporating spatial and network autocorrelation,” Journal of Transport Geography, vol. 60, pp. 155–166, 2017. View at: Publisher Site | Google Scholar
  19. K. Kerkman, K. Martens, and H. Meurs, “Predicting travel flows with spatially explicit aggregate models,” Transportation Research Part A: Policy and Practice, vol. 118, pp. 68–88, 2018. View at: Publisher Site | Google Scholar
  20. L. Ni, X. Wang, and D. Zhang, “Impacts of information technology and urbanization on less-than-truckload freight flows in China: an analysis considering spatial effects,” Transportation Research Part A: Policy and Practice, vol. 92, pp. 12–25, 2016. View at: Publisher Site | Google Scholar
  21. L. Ni, X. Wang, and X. Chen, “A spatial econometric model for travel flow analysis and real-world applications with massive mobile phone data,” Transportation Research Part C: Emerging Technologies, vol. 86, pp. 510–526, 2018. View at: Publisher Site | Google Scholar
  22. C. Brunsdon, A. S. Fotheringham, and M. E. Charlton, “Geographically weighted regression: a method for exploring spatial nonstationarity,” Geographical Analysis, vol. 28, pp. 281–298, 1996. View at: Google Scholar
  23. O. D. Cardozo, J. C. García-Palomares, and J. Gutiérrez, “Application of geographically weighted regression to the direct forecasting of transit ridership at station-level,” Applied Geography, vol. 34, pp. 548–558, 2012. View at: Publisher Site | Google Scholar
  24. Y.-C. Chiou, R.-C. Jou, and C.-H. Yang, “Factors affecting public transportation usage rate: geographically weighted regression,” Transportation Research Part A: Policy and Practice, vol. 78, pp. 161–177, 2015. View at: Publisher Site | Google Scholar
  25. X. Ma, J. Zhang, C. Ding, and Y. Wang, “A geographically and temporally weighted regression model to explore the spatiotemporal influence of built environment on transit ridership,” Computers, Environment and Urban Systems, vol. 70, pp. 113–124, 2018. View at: Publisher Site | Google Scholar
  26. B. Li, Z. Cai, L. Jiang, S. Su, and X. Huang, “Exploring urban taxi ridership and local associated factors using GPS data and geographically weighted regression,” Cities, vol. 87, pp. 68–86, 2019. View at: Publisher Site | Google Scholar
  27. X. Qian and S. V. Ukkusuri, “Spatial variation of the urban taxi ridership using GPS data,” Applied Geography, vol. 59, pp. 31–42, 2015. View at: Publisher Site | Google Scholar
  28. X. Qian, T. Lei, J. Xue, Z. Lei, and S. V. Ukkusuri, “Impact of transportation network companies on urban congestion: evidence from large-scale trajectory data,” Sustainable Cities and Society, vol. 55, Article ID 102053, 2020. View at: Publisher Site | Google Scholar
  29. H. Yang, X. Lu, C. Cherry, X. Liu, and Y. Li, “Spatial variations in active mode trip volume at intersections: a local analysis utilizing geographically weighted regression,” Journal of Transport Geography, vol. 64, pp. 184–194, 2017. View at: Publisher Site | Google Scholar
  30. K. Lucas, I. Philips, C. Mulley, and L. Ma, “Is transport poverty socially or environmentally driven? comparing the travel behaviours of two low-income populations living in central and peripheral locations in the SAMe city,” Transportation Research Part A: Policy and Practice, vol. 116, pp. 622–634, 2018. View at: Publisher Site | Google Scholar
  31. X. Shen, Y. Zhou, S. Jin, and D. Wang, “Spatiotemporal influence of land use and household properties on automobile travel demand,” Transportation Research Part D: Transport and Environment, vol. 84, Article ID 102359, 2020. View at: Publisher Site | Google Scholar

Copyright © 2021 Chuqiao Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
