Understanding the Effect of an E-Hailing App Subsidy War on Taxicab Operation Zones
Understanding taxicab operation behaviors under various management or market policies (e.g., subsidies) is critical to making informed operating decisions for e-hailing companies and for government surveillance. This paper investigates the change of taxicab operation zones in the context of an e-hailing app subsidy war in China, an important perspective on changes in taxicab behavior: how the operation zones of taxicabs change under the subsidy war and how this change affects their trip distance and cruising time. To investigate this issue, this paper utilizes three indexes to elucidate the change of taxicab operation zones, namely, the repetition ratio of operation zone pairs, the area, and the degree of dispersion in the spatial distribution. A case study using taxicab trajectories during all of the important periods of the e-hailing app subsidy war in Shenzhen, China, produced several valuable findings. For example, with respect to taxicabs as a whole, the proportion of habitual operation zone pairs among operation zone pairs in neighboring periods is relatively stable under any subsidy policy, and changes in the operation zones have little effect on changes in the average daily trip distance and average daily cruising time. Four groups of taxicabs, divided according to their initial change patterns in the operation zones, present different change patterns during the subsidy war. By comparing these changes before and after the subsidy war, this paper finds that the subsidy war influences the taxicabs in groups I and II, while it has little influence on the taxicabs in groups III and IV, although all groups were affected during the subsidy war. For all taxicab groups, the average daily trip distance and average daily cruising time decreased in the period with the highest subsidy, whereas they presented different patterns in other periods.
Taxicabs perform an indispensable function in urban transportation systems. Cabdrivers’ mobility patterns and operating strategies have been widely studied since large-scale taxi GPS trajectories became obtainable due to the rapid development of information and communication technologies (ICT). For example, previous studies analyzed the distance and direction distributions of intraurban trips, urban cabdrivers’ activity distribution, spatial variations in urban taxi ridership, cabdrivers’ operation strategies with respect to revenue [4, 5], and alternative taxi services strategies.
E-hailing services (e.g., Didi in China and Uber in the US) connect passengers and cabdrivers directly using communication technology, so that cabdrivers know the passengers’ origins and destinations beforehand. However, several disturbing issues are commonly reported in the taxi industry, including unfair competition, safety concerns, longer working hours, surge pricing, request rejection, and increased difficulty for senior citizens in catching taxis. The e-hailing app subsidy war that occurred from January to August 2014 in China had a dramatic effect on the taxi industry and makes it feasible to analyze how cabdrivers react to an external policy stimulus and what consequences these different reactions cause. From the perspective of taxicabs, the most direct reflection of the spatial effect of a subsidy policy is the change of operation zones. Do the operation zones of taxicabs change under the stimulation of a subsidy war? If so, how does this change affect cabdrivers’ trip distance and cruising time? This paper primarily focuses on how the operation zones of taxicabs changed under the e-hailing app subsidy war and how this change affected their trip distance and cruising time. This paper investigates this question in three dimensions: the repetition ratio of operation zone pairs, the changes in the area, and the changes in the degree of dispersion in the spatial distribution of operation zones. Together, these three dimensions describe the change of taxicab operation zones from the perspective of space. Answering these questions can provide insights into the effects of subsidy policies on taxicabs’ operation tactics, helping taxi management authorities to develop precise and targeted management tactics for cabdrivers.
To address the research question, three indexes are calculated to indicate the changes in the operation zones, namely, the repetition ratio, the changes in the area, and the changes in the degree of dispersion in the spatial distribution of operation zones. Moreover, this paper calculates the average daily trip distance and the average daily cruising time for each taxicab to investigate the correlation between the spatial change and these two variables. Selected taxis are used to conduct the investigation of the primary research question.
The remainder of this paper is organized as follows. Section 2 reviews previous studies on spatial patterns of taxicabs and e-hailing services. Section 3 introduces the subsidy war and the taxi GPS trajectory data, gives details on the three indexes that are used to measure the change of taxicab operation zones, and presents the method used to assess the effect of this change. Section 4 presents the analysis of the experimental results. Section 5 concludes the paper and suggests directions for future study.
2. Related Work
2.1. Spatial Pattern of Taxicabs
Researchers have great interest in studying location data with spatiotemporal information due to the rapid development and increasing availability of various location-acquisition technologies [7, 8]. The large-scale collection of taxicab trajectory data provides a promising tool to investigate operation behavior patterns among cabdrivers.
Researchers have analyzed the spatial patterns of taxicabs using three approaches. The first approach is to uncover pattern variations; for example, Chen et al.  discovered frequent trajectories between a given origin and destination. Liu et al.  utilized k-means clustering to uncover the spatiotemporal preferences of high-earning drivers and average-earning drivers in Shenzhen. A comparison between these two groups shows that the operation scale of high-earning drivers is much greater than that of ordinary drivers and that ordinary drivers prefer to stay in traditional city centers with the greatest travel demand but ignore potential congestion, while high-earning drivers choose to operate in areas other than the central business district when traffic is congested. Liu et al.  investigated the temporal variation in taxi pick-up and drop-off patterns and identified six traffic ‘source-sink’ areas, which were associated with land use types.
The second approach is to discover the aggregate patterns of taxicabs. For example, Lee et al.  performed a temporal analysis to create a time-dependent pick-up pattern within each area and recommended that empty taxis go to the nearest cluster locations derived by the k-means method to pick up customers. Wang et al.  investigated the travel patterns to and from hotspots. Kang and Qin  studied the habitual operation behaviors of taxicabs in Wuhan City and extracted six dominant types of spatial patterns by utilizing a nonnegative matrix factorization method. Fang et al.  discussed the aggregate convergence and divergence patterns of crowd dynamics, which could discover aggregate patterns of various objects, including taxicabs, private cars, and human beings.
The third approach is to discover the spatial operation knowledge of taxicabs. For example, by analyzing the distance between the mean center of each taxi driver’s pick-up locations and the mean center of that driver’s drop-off locations, Hu et al. found that taxi drivers tend to cruise and search for the next passenger near the previous drop-off location. Dong et al. classified drivers into three groups based on revenue and compared different driver groups in terms of spatiotemporal and strategy aspects. Their conclusion was that top drivers tend to maximize their usage of a limited working time to serve more trips and cover more mileage, which is the main reason that they earn more revenue.
In short, these studies seldom introduce effective indexes to indicate changes in the operation zones, such as the repetition ratio of operation zone pairs and the spatial changes (i.e., changes in the area and the changes in the degree of dispersion in the spatial distribution) of the operation zones.
2.2. E-Hailing Services of Taxicabs
Passengers usually hail a taxi on the street in the traditional manner. However, with advanced information and communication technologies, hailing a taxi through smartphones has become very popular; this approach offers passengers a high level of comfort and efficiency, especially during rush hours and rainy days. With the advent of smartphones, an increasing number of smartphone-based e-hailing applications have emerged in recent years, such as Uber and Lyft in the US and Didi and Kuaidi in China. These e-hailing applications provide an information platform that makes communication between drivers and passengers more efficient and convenient. The e-hailing service of taxicabs, as a new form of communication between drivers and passengers, has aroused scholarly interest.
One group of prior studies investigated the market equilibrium of taxi or related policies. Wang et al.  incorporated theories of two-sided markets into studies of the taxi industry, introduced matching functions to model the cross-group externalities between customers and taxi drivers on an e-hailing platform, and proposed a system of nonlinear equations to describe the taxi market equilibrium under hybrid modes of e-hailing and roadside hailing to reveal the impacts of the pricing strategy of an e-hailing platform on taxi service. Zha et al.  proposed equilibrium models under different behavioral assumptions about the labor supply in a ride-sourcing market. Qian and Ukkusuri  modelled the taxi market as a multiple-leader-follower game at the network level and investigated the equilibrium of the taxi market with competition between traditional taxi services and app-based third-party taxi services. Their conclusion was that the fleet size and pricing policy are closely associated with the level of competition in the market and can have significant impacts on the total passenger cost, average waiting time, and fleet utilization. He and Shen  proposed a spatial equilibrium model for balancing the supply and demand of taxi services and depicted the possible adoption of e-hailing applications for drivers and customers in a well-regulated taxi market. Garza et al.  analyzed two different government support programs within a dynamic equilibrium model of imperfectly competitive industries in order to investigate their effectiveness.
The second group of studies focuses on pricing strategy. Salnikov et al.  compared the cost between Yellow Cab and Uber X, which is the cheapest version of the Uber taxi, in New York City. The authors observed that Yellow Cab appears on average 1.4 US dollars cheaper than Uber X, while Uber X begins to become cheaper only above the threshold of 35 dollars. However, further study by Noulas et al.  showed that the Uber X service appeared to be considerably cheaper than Black Cab in London. Interestingly, it was observed that Uber X is much cheaper in London than in New York. Another key issue in e-hailing apps is the surge pricing phenomenon; namely, a trip price fluctuates with time and varies from one area to another in a city. For example, prices can reach 7.5 times the base rate on New Year’s Eve . Previous studies have shown that the majority of surge pricing lasts less than 10 minutes and that surge prices have a strong negative impact on passenger demand and a weak positive impact on car supply . Chen  also determined that surge pricing increases the supply of rides on Uber. Noulas et al.  reported that the driving factor of a higher fee for Uber X in New York is surge pricing.
In addition, a few studies have investigated the effects of subsidy wars on e-hailing services. For example, Wen et al.  showed that if subsidies were given to passengers, the difficulty in hailing taxis would not be alleviated, whereas this problem would be alleviated if subsidies were given to the taxi drivers. Leng et al.  compared taxi service data collected during a period in a subsidy war with that from another period without subsidies and concluded that, during the period with subsidies, the average number of trips increases, the idle time decreases, and the number of short-distance trips increases, whereas the distribution of long-distance trips is unchanged and the distribution of pick-up and drop-off locations does not change significantly. Su et al.  discovered that the changes in pick-up and drop-off location varied greatly in quantity and spatial distribution during the subsidy war.
However, few studies have focused on the question of how the operation zones of taxicabs under an e-hailing app subsidy war change and how this change affects their trip distance and cruising time. Therefore, the present paper investigates these questions.
3. Materials and Methods
This section introduces the subsidy war and experimental data, the measurements of the change of taxicab operation zones, and the average daily trip distance and cruising time derived from taxicab trajectories.
3.1. Subsidy War and Experimental Data
A fierce subsidy war between two e-hailing apps, Didi and Kuaidi, was triggered in 2014 in China. Based on the different subsidy policies available to drivers and passengers, this paper divides the time period of the subsidy war into six subperiods, which are listed in Table 1. No subsidy policy was implemented up to Jan. 9th; this interval is defined as the first period. In the second period, which is from Jan. 10th to Feb. 16th, the subsidies for both cabdrivers and passengers were 10 yuan, since the subsidy war had just begun. The third period extends from Feb. 17th to Mar. 4th, which was a peak period in the subsidy war. During this period, both companies provided the highest subsidies for passengers, which ranged from 10 to 20 yuan. In addition, Didi gave 50 yuan to new drivers who used its app. The fourth period extends from Mar. 22nd to May 16th, during which cabdrivers still received higher subsidies; however, passengers received lower amounts. During the fifth period, which extends from May 17th to July 8th, both e-hailing apps cancelled the subsidies for passengers; however, cabdrivers still received subsidies. Finally, in the sixth period, which begins after Aug. 10th, neither of these apps provided any subsidies to anyone.
Human travel patterns on holidays vary greatly from those on nonholidays. For example, an enormous number of people leave Shenzhen City to go back to their hometowns to celebrate Spring Festival, which has a considerable effect on the level of taxi service in Shenzhen City. In addition, bad weather, such as rainstorms, can significantly influence the demand for taxi services. To reduce the influence of such factors, holidays and bad-weather days are excluded from our datasets. Because the third period is short, with only 16 days left after excluding holidays and bad-weather days, we select the same number of days (16) from each of the other periods. Finally, Dec. 23-31 in 2013 and Jan. 2-6, 8, and 9 in 2014 were chosen from the first period; Jan. 20, 21, and 23-29 and Feb. 10-16 were selected from the second period; Feb. 17-28 and Mar. 1-4 were chosen from the third period; Mar. 23-28 and Apr. 4, 9, 10, 16, 17, 19-22, and 27 were selected from the fourth period; May 24, 26, 27, and 29 and June 3-5, 11-14, and 26-30 were chosen for the fifth period; and Aug. 11, 15-18, 21, 23-26, and 29 and Sep. 2, 3, 9, 10, and 18 were selected for the sixth period. Let Periods 1, 2, 3, 4, 5, and 6 denote these six periods, respectively.
The data were collected from 9,648 taxicabs equipped with GPS devices that operated in the city of Shenzhen, China, during the selected days in these six periods. The trajectory data of all of these taxicabs cover all six periods. Each record includes the plate number, spatial location (i.e., longitude and latitude), timestamp, operation status (i.e., vacant or occupied), driving direction, and velocity of a taxicab. The data do not identify which taxicab drivers use the Didi or Kuaidi app. Therefore, this analysis reflects the change in the taxicab operation zones in the context of different subsidy policies; it cannot reflect changes in taxicab behavior that are directly caused by use of the Didi or Kuaidi app.
In this paper, we extract two datasets from the raw data, which are a passenger carrying dataset called DATASET #1 and a passenger hunting dataset called DATASET #2. For DATASET #1, the plate number, start time and end time of a trip, travel time, pick-up location and drop-off location of a trip, and the travel distance were collected. For DATASET #2, each record has the plate number, end time of the current trip, start time of the next trip, cruising time, drop-off location of the current trip, pick-up location of the next trip, and cruising distance.
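The text does not spell out how the two datasets are cut from the raw GPS records. One plausible reading, given that each record carries a timestamp and a vacant/occupied status, is to split each taxi's record sequence at the status transitions. The sketch below follows that assumption; the record layout `(timestamp, status, taz)` and the function name `split_segments` are hypothetical, not the authors' procedure.

```python
def split_segments(records):
    """Split one taxi's time-sorted records into trips and cruising segments.

    records: list of (timestamp_s, status, taz) with status in
    {"occupied", "vacant"} (hypothetical layout).
    Returns (trips, cruises); each item is (start_time, end_time, start_taz, end_taz).
    """
    trips, cruises = [], []
    start = None        # (time, taz) of the last observed status transition
    prev_status = None
    for t, status, taz in records:
        if prev_status is not None and status != prev_status:
            # A segment is only kept when its start was an observed transition,
            # so the partial segment at the beginning of the log is dropped.
            if start is not None:
                segment = (start[0], t, start[1], taz)
                if prev_status == "occupied":
                    trips.append(segment)      # candidate DATASET #1 record
                else:
                    cruises.append(segment)    # candidate DATASET #2 record
            start = (t, taz)
        prev_status = status
    return trips, cruises
```

Travel time and cruising time then follow as `end_time - start_time`; distances would need the intermediate GPS points, which this sketch leaves out.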
Abnormal data are excluded from the dataset, such as, for DATASET #1, (i) records with a travel time less than 0.5 min, based on the reasonable hypothesis that a taxi trip lasts longer than 0.5 min; (ii) records with a travel distance shorter than 500 m based on the reasonable hypothesis that a taxi trip should be longer than 500 m; and (iii) records with the pick-up and drop-off latitude and longitude value outside of the territory of Shenzhen City. For DATASET #2, the following data are excluded: (i) records with a cruising time of more than 60 min based on the reasonable hypothesis that a taxi can find a passenger within 60 min and (ii) records with the pick-up and drop-off latitude and longitude value outside of the territory of Shenzhen City. After these steps, we prepare two preliminary datasets, both of which contain the same 9,605 taxis.
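The exclusion rules above can be sketched as simple predicates. The thresholds (0.5 min, 500 m, 60 min) come from the text; the field names are hypothetical, and the rectangular bounding box is only a rough stand-in for the actual territory of Shenzhen City, which would require a boundary polygon.

```python
from dataclasses import dataclass

# Rough bounding box used as a placeholder for the city boundary (assumption).
LON_MIN, LON_MAX = 113.7, 114.7
LAT_MIN, LAT_MAX = 22.4, 22.9


def in_city(lon, lat):
    """Approximate 'inside Shenzhen' test; a real check would use the polygon."""
    return LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX


@dataclass
class Segment:
    """One record of DATASET #1 (trip) or DATASET #2 (cruising); names are illustrative."""
    duration_min: float
    distance_m: float
    start: tuple  # (lon, lat)
    end: tuple    # (lon, lat)


def keep_trip(s):
    """DATASET #1: drop trips under 0.5 min, under 500 m, or outside the city."""
    return (s.duration_min >= 0.5 and s.distance_m >= 500
            and in_city(*s.start) and in_city(*s.end))


def keep_cruise(s):
    """DATASET #2: drop cruising segments over 60 min or outside the city."""
    return s.duration_min <= 60 and in_city(*s.start) and in_city(*s.end)
```

A dataset would then be filtered with, for example, `[s for s in trips if keep_trip(s)]`.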
The traffic analysis zone (TAZ) is the analysis unit in this study. In transportation planning, a TAZ is usually constructed from census block information and serves as the unit in which trips begin and end. This study uses TAZs to represent the operation zones of taxicabs from the perspective of taxicab service and the requirement of policy making for transportation planning. As shown in Figure 1, there are 1,067 TAZs in the city of Shenzhen, and each TAZ is given a unique ID. The pick-up and drop-off locations in DATASET #1 and DATASET #2 are aggregated into the corresponding TAZs.
3.2. The Measurements of the Change of Taxicab Operation Zones
First, we introduce the definitions of operation zone and operation zone pair. An operation zone is a zone in which a taxi has picked up or dropped off passengers; the operation zones of a taxi can be described as a sequence of zones. A taxi’s operation zone pair is the pair of origin and destination zones of a completed trip; the operation zone pairs of a taxi can be described as a sequence of such pairs. Figure 2 gives an example of the change of a taxicab’s operation zones between neighboring periods, showing the sequence of operation zones and the sequence of operation zone pairs of one taxi in Period p and in Period p+1.
In this section, three indexes are introduced to uncover the change of taxicab operation zones in the context of the e-hailing app subsidy war, namely, the repetition ratio of operation zone pairs, the area, and the degree of dispersion in the spatial distribution of operation zones.
3.2.1. Repetition Ratio of Taxis’ Operation Zone Pairs
A repetition ratio indicator is introduced to depict the proportion of repeatedly appearing operation zone pairs of each taxi between two neighboring periods. Each relation among origins and destinations of a taxicab can be described by the adjacency matrix $A$. The numbers of rows and columns of this matrix are both 1,067, which is the same as the total number of analysis units (TAZs) in this paper. If the taxi has trips that start from TAZ $i$ and end in TAZ $j$, we set $a_{ij}$, which denotes an element of matrix $A$, equal to 1; otherwise, $a_{ij} = 0$. To obtain the repeated operation zone pairs between two neighboring periods, the following matrix calculation is implemented. First, we define a matrix-based AND operation, which is denoted by $\wedge$. Consider two matrices, $A^{p}$ and $A^{p+1}$, which denote the adjacency matrices of two neighboring periods. If both elements in the compared position are 1, then the result is 1, namely, $1 \wedge 1 = 1$; otherwise, the result is 0 ($0 \wedge 0 = 0 \wedge 1 = 1 \wedge 0 = 0$). Thus, the repeated operation zone pairs can be derived by applying the matrix-based AND operation to the two adjacency matrices, as follows:
$$R = A^{p} \wedge A^{p+1},$$
where $R$ denotes the adjacency matrix of repeated operation zone pairs.
The repetition ratio of the taxis’ operation zone pairs is calculated in two different ways, i.e., unweighted and weighted. The unweighted repetition ratio of the taxis’ operation zone pairs can be derived by calculating the ratio of the number of repeated operation zone pairs to the total number of distinct pairs:
$$RR_{u} = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} r_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n} a^{p}_{ij} + \sum_{i=1}^{m}\sum_{j=1}^{n} a^{p+1}_{ij} - \sum_{i=1}^{m}\sum_{j=1}^{n} r_{ij}},$$
where $RR_{u}$ denotes the unweighted repetition ratio of the taxis’ operation zone pairs, $r_{ij}$ denotes the element of the matrix $R$, $a^{p}_{ij}$ represents the element of the matrix $A^{p}$, $a^{p+1}_{ij}$ denotes the element of the matrix $A^{p+1}$, and $m$ and $n$ are the numbers of rows and columns of matrices $A^{p}$ and $A^{p+1}$, which are both equal to 1,067 in this case. A taxi’s operation zone pairs are similar between two neighboring periods when $RR_{u}$ is large, which means that the taxi prefers to stay in the previous operation zone pairs.
Considering that operation zone pairs with higher visiting frequency should make crucial contributions to the repetition ratio of operation zone pairs, we also calculate a weighted repetition ratio of the taxis’ operation zone pairs. The weight of each pair is its visiting frequency, i.e., the number of trips that start from TAZ $i$ and end in TAZ $j$ in the current period and the number of trips that start from TAZ $i$ and end in TAZ $j$ in the next period.
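As an illustration, the matrix-based AND operation and the repetition ratios can be sketched with sets of (origin, destination) pairs instead of full 1,067 × 1,067 matrices: a set intersection plays the role of the AND, and a set union gives the distinct pairs. The unweighted ratio follows the verbal definition (repeated pairs over distinct pairs). The weighted variant is an assumption about how the trip counts enter, namely the share of trips in both periods that fall on repeated pairs; the function name is hypothetical.

```python
from collections import Counter


def repetition_ratios(trips_p, trips_q):
    """Return (unweighted, weighted) repetition ratios for one taxi.

    trips_p, trips_q: lists of (origin_taz, destination_taz) for the
    current and the next period, one entry per completed trip.
    """
    w_p, w_q = Counter(trips_p), Counter(trips_q)   # trips per zone pair
    repeated = set(w_p) & set(w_q)                  # matrix-based AND
    distinct = set(w_p) | set(w_q)                  # all distinct pairs
    if not distinct:
        return 0.0, 0.0
    unweighted = len(repeated) / len(distinct)
    # Assumed weighting: trips on repeated pairs over all trips in both periods.
    total_trips = sum(w_p.values()) + sum(w_q.values())
    weighted = sum(w_p[k] + w_q[k] for k in repeated) / total_trips
    return unweighted, weighted


# One repeated pair out of three distinct pairs; 3 of 5 trips use it.
print(repetition_ratios([(1, 2), (1, 2), (2, 3)], [(1, 2), (3, 4)]))
```

Under this weighting a taxi with few repeated pairs can still score high if most of its trips run on those pairs, which matches the motivation given for the weighted index.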
For ease of understanding, we calculate this index for the example in Figure 2. First, the origin-destination adjacency matrices of the two neighboring periods are constructed. Next, the adjacency matrix of repeated operation zone pairs is derived by the matrix-based AND operation defined above. Finally, the unweighted and weighted repetition ratios are derived by the method defined above.
3.2.2. Area of the Taxi Operation Zones
The taxi-origin and taxi-destination relations can be described by two adjacency matrices $O$ and $D$, respectively. The numbers of columns of these two matrices are both equal to 1,067, which is the number of TAZs in this study. If taxi $k$ has trips that start from TAZ $j$, we set $o_{kj} = 1$; otherwise, $o_{kj} = 0$. Analogously, we set $d_{kj} = 1$ if taxi $k$ has trips that end in TAZ $j$ and $d_{kj} = 0$ otherwise. Since an operation zone is a zone in which a taxi has picked up or dropped off passengers, duplicate zones should be counted only once when all of the operation zones of a taxi are collected. To exclude duplicate zones, the following matrix calculation is implemented. First, we define a matrix-based OR operation between two matrices, which is denoted by $\vee$. Given two matrices, the result in each position is 0 if both elements are 0 ($0 \vee 0 = 0$), while, otherwise, the result is 1 ($0 \vee 1 = 1 \vee 0 = 1 \vee 1 = 1$). After this definition, the operation zones can be obtained by performing the matrix-based OR operation on matrix $O$ and matrix $D$ as follows:
$$Z = O \vee D,$$
where $Z$ denotes the adjacency matrix that describes the relation between the taxis and operation zones.
The area of the taxis’ operation zones is calculated in two different ways, i.e., unweighted and weighted. First, the unweighted area of the operation zones of taxi $k$ is derived by summing the area of each operation zone as follows:
$$S_{k} = \sum_{j=1}^{n} z_{kj}\, s_{j},$$
where $S_{k}$ denotes the unweighted area of the operation zones of taxi $k$, $s_{j}$ denotes the area of operation zone $j$, $z_{kj}$ denotes an element in matrix $Z$, and $n$ denotes the number of columns, which is equal to 1,067 in this study.
Considering that operation zones with higher visiting frequency play a more crucial role than other operation zones, this paper also calculates a weighted area of the operation zones of each taxi: the number of pick-ups and drop-offs within each operation zone is summed, and this sum is used as the weight of that zone in the calculation.
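The matrix-based OR operation and the unweighted area can be sketched for a single taxi with sets: the union of origin zones and destination zones removes duplicates, and the area is the sum over the remaining zones. The helper `visit_counts` returns the number of pick-ups and drop-offs per zone that serves as the weight; how exactly the weight enters the weighted area is not reproduced here, so the sketch stops at the counts. Function names and the `zone_area` lookup are hypothetical.

```python
from collections import Counter


def operation_zones(trips):
    """Distinct zones with a pick-up or drop-off (matrix-based OR as a set union)."""
    origins = {o for o, _ in trips}
    destinations = {d for _, d in trips}
    return origins | destinations


def unweighted_area(trips, zone_area):
    """Sum of the areas of the distinct operation zones of one taxi.

    zone_area: dict mapping TAZ id -> area (e.g., in square kilometers).
    """
    return sum(zone_area[z] for z in operation_zones(trips))


def visit_counts(trips):
    """Number of pick-ups plus drop-offs within each operation zone."""
    counts = Counter()
    for origin, destination in trips:
        counts[origin] += 1
        counts[destination] += 1
    return counts
```

For example, with trips `[(1, 2), (2, 3), (1, 2)]` the operation zones are `{1, 2, 3}`, and zone 2 has the highest visiting frequency.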
For ease of understanding, we calculate this index for the example in Figure 2. First, for Period p, the adjacency matrices of the taxi-origin and taxi-destination relations are constructed. Next, the adjacency matrix between the taxi and its operation zones is derived by the matrix-based OR operation defined above. Finally, the unweighted and weighted areas are derived by the method defined above from the areas of the five operation zones in the example.
3.2.3. Degree of Dispersion in the Spatial Distribution of the Taxi Operation Zones
Several indexes are widely used to measure spatial dispersion in landscape studies; for example, the proximity index measures the degree of patch isolation and the degree of fragmentation of the corresponding patch type within the specified neighborhood of the focal patch. Pearce proposed an indicator to assess the degree of dispersion of a forest patch, which considers only the number of specified patch types and ignores the impact of the area of a patch. Nearest-neighbor standard deviation (NNSD) is also a measure of patch dispersion, where a small value implies a more uniform or regular distribution of patches and a large value indicates a more irregular or uneven distribution of patches. This index should be interpreted together with the mean nearest-neighbor distance; relying on it alone can be misleading. In addition, Ripley’s K-function is a widely used tool for the analysis of spatial patterns of populations. It is used to determine whether features exhibit statistically significant clustering or dispersion over a range of distances. However, its results are not convenient to compare across cases. The nearest-neighbor index (NNI) proposed by Clark and Evans provides a precise measure of the spatial distribution of a pattern, which is more suitable in our case since it accounts for the area, quantity, and spatial locations of the patches comprehensively and can be compared among different cases. For a random arrangement of patches, NNI = 1; NNI < 1 represents more clustered, contiguous conditions that form patches, whereas NNI > 1 indicates a dispersed distribution of patches. Therefore, the NNI is used to investigate the degree of dispersion in the spatial distribution of the taxis’ operation zones, as follows:
$$\mathrm{NNI} = \frac{\bar{d}_{O}}{\bar{d}_{E}}, \qquad \bar{d}_{O} = \frac{1}{n}\sum_{i=1}^{n} d_{i}, \qquad \bar{d}_{E} = \frac{1}{2}\sqrt{\frac{S}{n}},$$
where $\mathrm{NNI}$ denotes the value of NNI, $\bar{d}_{O}$ denotes the mean observed nearest-neighbor distance, $\bar{d}_{E}$ represents the mean expected nearest-neighbor distance, and $d_{i}$ is the edge-to-edge straight line distance between zone $i$ and its nearest neighbor.
Based on the spatial proximity among operation zones, adjacent operation zones are merged first to avoid a result of zero for $d_{i}$. $S$ represents the area of the merged operation zones, and $n$ is the number of the merged operation zones.
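A minimal sketch of the Clark and Evans nearest-neighbor index is given below. It assumes that the nearest-neighbor distances between merged operation zones have already been measured (the edge-to-edge geometry is not computed here) and that the expected distance takes the standard Clark-Evans form, half the square root of area per zone. The function and parameter names are hypothetical.

```python
import math


def nearest_neighbor_index(nn_distances, area):
    """Clark-Evans NNI from precomputed nearest-neighbor distances.

    nn_distances: one distance per merged operation zone, to its nearest neighbor.
    area: area used for the expected distance, in squared distance units.
    """
    n = len(nn_distances)
    if n == 0 or area <= 0:
        raise ValueError("need at least one zone and a positive area")
    mean_observed = sum(nn_distances) / n
    mean_expected = 0.5 * math.sqrt(area / n)   # expectation under randomness
    return mean_observed / mean_expected


# Four zones in an area of 16: expected distance is 0.5 * sqrt(4) = 1.
print(nearest_neighbor_index([1.0, 1.0, 1.0, 1.0], 16.0))   # random-like
print(nearest_neighbor_index([0.5, 0.5, 0.5, 0.5], 16.0))   # more clustered
```

Values below 1 indicate that the zones lie closer together than a random arrangement would, and values above 1 indicate dispersion.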
Similarly, we consider that operation zones with higher visiting frequency should make greater contributions to the degree of dispersion in the spatial distribution of the taxi operation zones. We measure the visiting frequency of each merged zone by summing the numbers of internal pick-ups and drop-offs, and we use the visiting frequency as the weight to calculate the weighted nearest-neighbor index (WNNI). The WNNI is defined analogously to the NNI from the weighted mean observed nearest-neighbor distance and the weighted mean expected nearest-neighbor distance, where the weight of a merged operation zone is the number of pick-ups and drop-offs within it.
3.3. The Average Daily Trip Distance and Cruising Time Derived from Taxicab Trajectories
This paper assesses the effect of changes in the taxicab operation zones on their cost (average daily cruising time) and income (average daily trip distance). The average daily trip distance of each taxi in a period is derived by summing its trip distances in that period of the subsidy war and averaging the sum over the days of the period. The change in the average daily trip distance of each taxi between two neighboring periods is obtained by subtracting its average daily trip distance in the current period from that in the next period. Similarly, this paper derives the average daily cruising time and the change in the average daily cruising time.
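The two quantities can be sketched as follows. The sketch assumes that the daily average is taken over the 16 selected days of a period; averaging over only the days on which a taxi has records would be an equally plausible reading. The function names are hypothetical.

```python
def average_daily(values, n_days=16):
    """Average daily total of per-trip values (distance or cruising time).

    values: per-trip distances (or per-segment cruising times) of one taxi
    in one period; n_days: number of selected days in the period (assumed 16).
    """
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    return sum(values) / n_days


def change_between_periods(current_avg, next_avg):
    """Change between neighboring periods: next period minus current period."""
    return next_avg - current_avg


# A taxi driving 3,200 km in Period 2 and 2,880 km in Period 3.
avg_2 = average_daily([3200.0])
avg_3 = average_daily([2880.0])
print(change_between_periods(avg_2, avg_3))   # negative: daily distance fell
```

A negative change means the taxi travelled less per day in the later period.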
4. Experimental Results
In this section, the overall changes in the taxicab operation zones in Shenzhen City within the context of the e-hailing app subsidy war are analyzed. Moreover, the change of operation zones in different groups of taxicabs and the influence of this change on their average daily cruising time and average daily trip distance are investigated, and finally, the changes in the patterns of weighted and unweighted areas and the NNI in different periods of this subsidy war are elucidated.
4.1. Overall Repetition Ratio Changes in the Taxicabs’ Operation Zones
Figure 3 shows the distributions of the unweighted repetition ratio and weighted repetition ratio of the taxi operation zone pairs between different neighboring periods. When the weight, which is the number of trips within an operation zone pair, is considered in calculating the repetition ratio, the importance of frequently visited operation zone pairs is highlighted. For about half of the taxis, the unweighted repetition ratios are less than 10% for all neighboring periods, and the weighted repetition ratios are less than 20%. These findings indicate that the operation zone pairs of the majority of taxis change greatly. In addition, the results calculated by the weighted method are greater than those calculated by the unweighted method. The reason is that a considerable number of trips occur within the repeated operation zone pairs, although the ratio of repeated operation zone pairs is relatively low. Additionally, the distribution of the weighted repetition ratio is flatter than the distribution of the unweighted repetition ratio, which indicates that although the proportions of repeated operation zone pairs are close among all taxicabs, the numbers of trips within the operation zone pairs vary from cab to cab. Moreover, the repetition ratios change only slightly across periods regardless of whether the unweighted or weighted calculation method is used, which indicates that the proportion of habitual operation zone pairs among all of the operation zone pairs in the neighboring periods is relatively stable under all subsidy policies in this subsidy war. Therefore, the majority of taxicabs change their operation zone pairs substantially in all periods, and the extent of this change is not easily affected by different subsidy policies.
4.2. Change Patterns in Different Groups of Taxicabs
This paper analyzes the spatial change in the operation zones from the perspectives of the change in the area and the change in the degree of dispersion in the spatial distribution. To analyze the change of operation zones of taxicabs in the context of the subsidy war, the selected taxicabs are divided into four groups (see Table 2) according to the initial increase or decrease pattern of the area and the degree of dispersion in the spatial distribution of the operation zones of taxicabs during Period 1-2. Here, Period 1 did not have any subsidy, while Period 2 was the first period with a subsidy; therefore, this division can reflect the direct stimulus of the subsidy policy on the taxicabs’ operation zones. Table 2 shows the grouping results, which indicate that the four groups account for relatively close percentages of the taxicabs.
To investigate the change patterns of the taxicabs in each group during the other neighboring periods, this paper lists the taxicab percentages of the four patterns within each group in Table 3. Specifically, pattern A represents taxicabs with increased values for both the area and the NNI, and patterns B, C, and D represent the other combinations of increase or decrease in the area and the NNI. The table yields three findings. (i) The patterns with the largest proportions of taxicabs in Period 3-4, Period 4-5, and Period 5-6 are the same for groups I, II, and IV. For example, when the subsidy for passengers decreased from Period 3 to 4, approximately 50% of the taxicabs in these groups enlarged their operation zones and served passengers across more dispersed operation zones (pattern A). When the subsidy for passengers was cancelled from Period 4 to 5, approximately 42% of the taxicabs in these groups still enlarged their operation zones, but these zones were relatively less dispersed than during the previous neighboring periods (pattern B). With the further cancellation of the subsidy for taxi drivers from Period 5 to 6, 42%-48% of the taxicabs reduced the area of their operation zones and travelled within these zones in a relatively less dispersed state (pattern D). (ii) The taxicabs in group III have a different pattern distribution from the other groups. From Period 2 to 3, 40.35% of the taxicabs reduced their operation zone areas and travelled within concentrated areas (pattern C). From Period 3 to 4, 53.62% of the taxicabs enlarged their operation zones and travelled within less dispersed areas (pattern B) than during the previous periods because of the reduced subsidy for passengers. From Period 4 to 5, 42.84% of the taxicabs readjusted their operation zones within more dispersed areas because of the subsidy cancellation policy for passengers (pattern C). From Period 5 to 6, 40.53% of the taxicabs still needed to travel across larger and more dispersed operation zones to sustain their income level (pattern A). (iii) From Period 2 to 3, the taxicabs in group III are less sensitive to the subsidy policy than the taxicabs in the other groups: only 8.21% (group I), 13.11% (group II), and 12.79% (group IV) of the taxicabs in those groups kept their initial changes, whereas 40.35% of the taxicabs in group III did. After this period, very few taxicabs kept their initial changes. In short, the operation zones of the taxicabs in all groups changed in ways that were sensitive to the subsidy policy.
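The pattern labelling used for Table 3 can be illustrated with a minimal sketch. Only pattern A (both the area and the NNI increase) is defined explicitly in the text; the assignment of B, C, and D to the remaining sign combinations is an assumption inferred from the discussion above, and the function names and sample values are hypothetical.

```python
from collections import Counter


def change_pattern(delta_area, delta_nni):
    """Label one taxicab's change between two neighboring periods.

    Pattern A is stated in the text. The B/C/D assignment is an assumption.
    A change of exactly zero is treated as a decrease in this sketch.
    """
    if delta_area > 0 and delta_nni > 0:
        return "A"  # larger area, higher NNI (more dispersed)
    if delta_area > 0:
        return "B"  # larger area, lower NNI (assumed)
    if delta_nni > 0:
        return "C"  # smaller area, higher NNI (assumed)
    return "D"      # smaller area, lower NNI (assumed)


def pattern_shares(changes):
    """Return the percentage of taxicabs falling into each pattern.

    `changes` is an iterable of (delta_area, delta_nni) pairs, one per taxicab.
    """
    counts = Counter(change_pattern(da, dn) for da, dn in changes)
    total = sum(counts.values())
    return {p: 100.0 * counts.get(p, 0) / total for p in "ABCD"}


# Made-up changes for five hypothetical taxicabs (not data from the study).
example = [(1.2, 0.05), (0.4, -0.02), (-0.7, 0.01), (-0.3, -0.04), (2.0, 0.10)]
shares = pattern_shares(example)
print(shares)
```

Applying `pattern_shares` separately to the taxicabs of each group and each pair of neighboring periods would give a table with the same layout as Table 3.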
To further investigate the operation changes of these four groups, this paper lists the changes in the operation zones, the average daily trip distance, and the average daily cruising time in each group between all neighboring periods in Table 4. The changes in the operation zone areas show that taxicab groups I and II exhibited the same change tendency, as did groups III and IV; for the changes in the NNI, groups I and III exhibited the same change tendency, as did groups II and IV. The table reveals four findings. (i) There is an obvious stimulus of the subsidy policy in Period 1-2, as the changes are larger than in the other periods. (ii) In Period 3, with the highest subsidy, all of the selected taxicabs operated within smaller operation zones and spent less time hunting for passengers, and they also had a decreased average daily trip distance. This finding indicates that the taxicabs could find their passengers efficiently and did not require a greater trip distance than in other periods to maintain their incomes because they could receive high subsidies from the e-hailing apps. Figure 4(a) illustrates the change trends of each group’s average daily cruising distance compared with that in Period 1. The figure shows an obvious decrease in the average daily cruising distance in Period 3, which is consistent with the finding about shorter passenger hunting time. This finding therefore supports the interpretation that cabdrivers were more likely to pick up nearby passengers during this period. (iii) As the subsidy decreased, all taxicabs had increased operation zone areas in Periods 3-4 and 4-5, and their operation zones were highly dispersed, especially in Period 4; as a result, they increased their average daily trip distance to maintain their incomes. (iv) The average daily cruising time steadily decreased after Period 2. This result indicates that the e-hailing apps improved the taxicabs’ passenger hunting efficiency.
To reflect the changes before and after the subsidy war, this paper compares the change trends of each group’s operation zone area (Figure 4(b)) and NNI (Figure 4(c)) with those in Period 1. Figure 4(b) presents two interesting results. The first result is that the taxicabs in groups I and II had larger operation zone areas after the subsidy war than before it. This result indicates that the subsidy war enlarged their operation zones. The second result is that the taxicabs in groups III and IV first reduced their operation zone areas and then increased them to maintain their income, but they finally returned to the same operation zone area as before. These results indicate that when the subsidy war ended, these cabdrivers possibly returned to their prewar work status. In terms of the dispersion of the operation zones, two results can also be drawn from Figure 4(c). The first result is that the operation zones of groups I and III became more dispersed (a higher NNI) after the beginning of the subsidy war than before it. The second result is that the NNI of the operation zones of groups II and IV became smaller than before the war. This result indicates that the distance between these taxicabs’ operation zones became shorter. Combining these results yields further findings about these groups. For example, (i) the taxicabs in group I increased both their operation zone areas and the NNI of their operation zones. This finding suggests that the subsidy war required them to take on a higher workload to change their operation zones. (ii) The taxicabs in group II increased their operation zone areas while decreasing the NNI of their operation zones. This finding suggests that the subsidy war made them pay greater attention to certain potential service zones.
4.3. Comparing Weighted and Unweighted Areas and NNIs in Different Periods
To assess the effect of the number of pick-ups and drop-offs on the spatial changes in the operation zones, the weighted area and the WNNI are compared with the unweighted area and the NNI in different periods. The number of pick-ups and drop-offs helps reflect the habitual operating preferences of cabdrivers. Figure 5 shows the histograms of the unweighted areas of the operation zones (in blue) and the weighted areas of the operation zones (in red) of all of the selected taxis in each period. There are three findings in this figure: (i) The mean value of the weighted areas of the operation zones is smaller than that of the unweighted areas in every period. (ii) The mean values of the weighted areas of the operation zones in Periods 1, 2, 3, and 6 are relatively close to the unweighted values. In other words, at the aggregate level, there is little difference between the weighted and unweighted calculation methods in these periods, which implies that the weight distributions of the different operation zones are similar. (iii) The span between the red and blue vertical lines is considerable in Periods 4 and 5, when the subsidies given to the passengers were reduced, which indicates a relatively heterogeneous distribution of the number of pick-ups and drop-offs in the operation zones during these two periods.
The spatial dispersion of the operation zones with a high number of pick-ups and drop-offs is another important characteristic that reflects the change in the operation zones. It is reasonable to use the operation zones with high weights to describe the degree of dispersion in the spatial distribution of all of the operation zones. Figure 6 shows the histograms of the maximum number of pick-ups and drop-offs within an operation zone of each taxi in every period. The results show that, for the majority of taxis, this maximum is less than 150. Thus, this paper establishes thresholds for the number of pick-ups and drop-offs within an operation zone to filter out the dominant operation zones, namely, 0, 25, 50, 75, 100, and 125. Notably, the higher the threshold is, the more dominant the retained operation zones are. Figure 7 provides the statistical results. For example, (i) when the threshold is 0, the WNNI changes only slightly across periods and the values are all extremely low (Figure 7(a)). These low values cannot be explained by decreased dispersion of the spatial distribution of the taxicabs’ operation zones; they are more likely an effect of merging adjacent operation zones when calculating the WNNI. (ii) Between Period 1 and Period 2, when the subsidy war began, the dominant operation zones became more dispersed for thresholds of 25 or higher. (iii) The value of the WNNI in Period 1 is equal to that in Period 6 when the threshold is 100 or 125, which means that the degree of dispersion of the dominant operation zones returned to the original level when the subsidy war ended. These operation zones should be highly visited habitual zones. These results indicate that the distances between the taxicabs’ highly visited habitual operation zones were similar before and after the subsidy war.
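The thresholding step can be sketched as follows. This is a simplified illustration only: it uses the standard Clark-Evans nearest neighbor index on zone centroids and does not reproduce the paper's WNNI weighting or its merging of adjacent operation zones. The function names, the `(x, y, count)` zone representation, the fixed `study_area`, and the inclusive threshold comparison are all assumptions.

```python
import math


def nearest_neighbor_index(points, study_area):
    """Clark-Evans index: observed mean nearest-neighbor distance divided by
    the mean distance expected for a random pattern of the same density."""
    n = len(points)
    if n < 2:
        return None  # undefined with fewer than two points
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 * math.sqrt(study_area / n)
    return observed / expected


def dominant_zone_nni(zones, threshold, study_area):
    """NNI over the zones whose pick-up/drop-off count reaches `threshold`.

    `zones` is a list of (x, y, count) tuples with hypothetical centroid
    coordinates; the `>=` comparison is an assumption of this sketch.
    """
    kept = [(x, y) for x, y, count in zones if count >= threshold]
    return nearest_neighbor_index(kept, study_area)


# Made-up zones for one hypothetical taxicab (not data from the study).
zones = [(0.0, 0.0, 10), (1.0, 0.0, 30), (0.0, 1.0, 60), (5.0, 5.0, 120)]
for threshold in (0, 25, 50, 75, 100, 125):
    print(threshold, dominant_zone_nni(zones, threshold, study_area=100.0))
```

Raising the threshold keeps fewer, more dominant zones, so the index becomes undefined once fewer than two zones remain; an index above 1 suggests dispersion and one below 1 suggests clustering.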
4.4. The Correlation Analysis between Changes in the Operation Zones and the Change in the Average Daily Trip Distance or Average Daily Cruising Time
Pearson’s correlation coefficient R was used to investigate the relationship between the spatial change in the operation zones and the change in the average daily trip distance or the change in the average daily cruising time. The results in Table 5 show that the correlations between the change in the unweighted area of the operation zones and the change in the average daily trip distance or the change in the average daily cruising time are significant. The correlation coefficients are relatively low, except for that between the change in the average daily trip distance and the change in the unweighted area of the operation zones in Period 1-2, which indicates a moderate correlation between these two factors. In addition, the correlations between the change in the NNI of the operation zones and the change in the average daily trip distance or the change in the average daily cruising time are significant. However, these correlation coefficients are extremely low, indicating a weak correlation. Therefore, from the aggregate perspective of the entire set of taxicabs, the changes in the operation zones have little effect on the changes in the average daily trip distance and the average daily cruising time.
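For reference, Pearson's coefficient can be computed as in the minimal sketch below. The function name and the sample values are hypothetical, and the significance test reported in Table 5 is not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-taxicab changes between two periods (not data from the study).
delta_area = [1.0, 2.0, 3.0, 4.0]
delta_trip_distance = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(delta_area, delta_trip_distance)
print(round(r, 2))
```

A coefficient near 0 indicates little linear association even when the large number of taxicabs makes it statistically significant, which matches the reading of Table 5 given above.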
This paper uses taxicab trajectories in Shenzhen City to elucidate the change of taxicab operation zones in the context of an e-hailing app subsidy war. The results reveal interesting findings at the aggregate level of taxicabs: (i) the subsidy policies for taxicabs and passengers influence the operation zones of taxicabs; (ii) the proportion of habitual operation zone pairs among operation zone pairs in neighboring periods is relatively stable under any subsidy policy during this subsidy war; and (iii) the changes in the operation zones have little effect on the changes in the average daily trip distance and average daily cruising time.
From the perspective of taxicab groups, there are also important findings: (i) the initial increase or decrease patterns of the area and of the degree of dispersion in the spatial distribution of the taxicabs’ operation zones at the beginning of the subsidy war could be used to reflect the differences in further operation zone changes between groups during the subsidy war. This group division rule is helpful for similar investigation tasks. (ii) In Period 3, which had the highest subsidy, all of the selected taxicabs operated within smaller operation zones and spent less time hunting for passengers, and they also had a decreased average daily trip distance. In this period, taxicabs could find their passengers efficiently and did not require a greater trip distance to maintain their incomes than in other periods because they could receive high subsidies from the e-hailing apps. (iii) When the subsidy to the passengers decreased, all of the groups of taxicabs increased their operation zone areas in Periods 3-4 and 4-5 and had highly dispersed operation zones, especially in Period 4; as a result, they increased their average daily trip distance to maintain their incomes. (iv) When the subsidy war ended, some taxicabs may have returned to their prewar work status.
Based on the findings from all taxicabs and grouped taxicabs, this study could inform the operating decisions of e-hailing companies and government surveillance. For example, when e-hailing companies plan their future specific subsidy policies, the impact of the planned policies on taxis can be estimated to some extent. For the taxi management authority, the derived knowledge of the changes in the taxicabs’ operation zones and habitual operating preferences under the stimulus of policies could help to develop precise and targeted management tactics for different groups of taxicabs to improve urban mobility and public transportation services.
This paper focuses only on the influence of the subsidy war on cabdrivers, whereas, in future research, it is necessary to deeply investigate the interactions between cabdrivers and passengers, since the subsidy war not only stimulated cabdrivers’ behaviors but also influenced passengers’ daily travel patterns. To better understand these patterns, it is necessary to integrate the taxi trajectory data with the detailed order data of the e-hailing apps.
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare no conflicts of interest.
The research was supported in part by the National Natural Science Foundation of China (Grants nos. 41771473 and 41231171), the National Key Research and Development Program of China (2017YFB0503802), and the Innovative Research Funding of Wuhan University (2042015KF0167).