Abstract

Passenger demand plays an important role in railway operation and organization. This paper aims to estimate time-varying passenger demand by simulating the ticket-booking process of a High Speed Rail (HSR) system. The ticket-booking process of each OD pair can be partitioned into discrete booking phases by the times at which the tickets of any itinerary sell out. Using the rooftop model, the ticket booking volume of each itinerary is reversely assigned to its corresponding expected departure intervals to obtain the time-varying demand in each booking phase, and the total time-varying demand is estimated by summing the time-varying demand distributions of all booking phases. Because only itinerary flow data are available, a precedence relationship is introduced to constrain the ticket sold-out order of all itineraries for each OD pair. Based on the precedence relationships of itineraries, two typical situations are proposed, for which the Single Booking Phase Reverse Assignment (SBPRA) algorithm and the Multiple Booking Phases Reverse Assignment (MBPRA) algorithm, respectively, are designed to estimate the time-varying demand. A case analysis on the OD pair Beijing-Shanghai is presented, and the validity analysis demonstrates that the error rates of the SBPRA and MBPRA algorithms are 8.64% and 6.37%, respectively.

1. Introduction

Passenger demand plays an important role in railway operation and organization. Conventional methods of line planning and scheduling generally aim to meet the total passenger demand volume of each OD pair [1–4]. However, High Speed Rail (HSR) is characterized by train operations with high speed and high frequency, and the transportation capacity of HSR is much larger than that of ordinary-speed railway. For example, the distance from Beijing to Shanghai is about 1300 kilometers, and the service frequency between them on the ordinary-speed railway (speed ≤ 160 km/h) was 10 times a day in 2006. With the construction and operation of HSR (250 km/h ≤ speed ≤ 350 km/h), the service frequency from Beijing to Shanghai reached 42 times a day in 2018 (https://www.12306.cn). The HSR system can therefore not only meet the total passenger demand volume with its large transportation capacity, but also meet the expected departure/arrival times of passengers with its high frequency of train operations. For a given OD pair, the demand rate may differ across expected departure/arrival times within a day, which is defined as the time-varying demand. With the improvement of HSR systems, more and more studies focus on meeting the time-varying demand in line planning and scheduling [5–8]. These studies adopt the time-varying demand as input data, so how to obtain that data has become an important common issue. However, little research has focused on this problem, and this paper aims to fill this gap in the estimation of HSR time-varying demand. This paper focuses on the time-varying demand over the expected departure time; one may alternatively estimate the demand against the expected arrival time. If train travel time is constant (without delays and uncertainties), departure-based demand can be converted into arrival-based demand. If delays and uncertainties are considered, however, the two cannot simply be converted into each other. We leave this for future research.

At present, different countries adopt different organization modes for HSR, which lead to distinct estimation problems for the time-varying demand. In some countries, such as China, passengers must book in advance and sit according to the seat number on their ticket. The passenger flow on a train is therefore equal to the ticketing volume of that train. All passenger flows of an OD pair can be obtained from the Railway Ticketing System (RTS), and the transport volume of this OD pair is the sum of all these passenger flows. In China, the ticket fare is fixed throughout the pre-sale period and is not discounted for multiple purchases by a single passenger or for group purchases. In other countries, such as Japan, HSR tickets are of two types: free/non-reserved seats and reserved seats. Passengers who hold a free/non-reserved seat ticket can board any train during the valid time, so the passenger flow on each train cannot be calculated from the ticketing data. In these countries, passengers who book tickets in advance or purchase round-trip or group tickets may enjoy discounts, and ticket discounts therefore affect the choices of HSR passengers. In this paper, we focus on the estimation problem of HSR time-varying demand in situations like that in China: all passengers must book in advance and sit according to the seat numbers marked on their tickets, i.e., the flow of each itinerary can be obtained from its corresponding ticketing volume; and the ticket fare for each itinerary is fixed during the whole pre-sale time window (booking time independent fare).

For an HSR system, the time-varying demand of each OD pair has two features: the total demand volume in one day and its time-varying distribution over the operation period of that day. At present, the HSR system in China has a large transportation capacity and a high service frequency, so HSR passengers do not need to shift to other transport modes because of insufficient capacity (except on some important holidays). Therefore, in our estimation problem, we assume that the HSR system has enough capacity over the service time window to serve all passengers of each OD pair, and flow shifting between transport modes is not considered. The total demand volume of each OD pair can then be obtained from the RTS. Because HSR passengers in China must book tickets in advance, the real departure time of each passenger and the ticketing volume of each itinerary can also be obtained from the RTS. However, the real departure time of a passenger may deviate from his or her expected departure time. For instance, if no train departs at the expected departure time, or the tickets of that train have sold out, the passenger has to adjust his or her departure time to that of another train. Hence, the passenger flows at their real departure times recorded in the RTS cannot be regarded directly as the time-varying distribution. In this paper, we tackle the following issue: given the total demand volume of the OD pair and the ticketing volume of each itinerary, how can the discrete ticketing volumes be reversed into a continuous time-varying distribution?

At present, there are few studies on the estimation of HSR passenger time-varying demand. Over the past few decades, studies on time-varying demand estimation have mainly focused on airline demand forecasting, dynamic traffic OD estimation and public transit OD estimation.

In the aviation industry, accurate forecasts of passenger demand are the heart of a successful revenue management system [9]. The objective of revenue management or yield management is “selling the right seats to the right customers at the right prices” [10]. The forecasts are usually based on historical booking data. Thus, one of the main objectives of airline booking data analytics is to estimate the unconstrained demand for each fare class using censored historical booking data [11]. The booking data are called censored because, after a booking limit is reached, further booking attempts are rejected and not recorded by the system [9, 12]. Weatherford and Pölt [9], McGill [13], Mukhopadhyay et al. [14] and Ratliff et al. [15] have developed various remedial approaches for estimating unconstrained demand. Additionally, low-cost airlines irregularly launch ticket promotions, where fares may differ by day of the week and departure date. The timing of air ticket purchases is thus closely associated with fares. Passengers often do not buy airline tickets immediately when they determine their itinerary, and may choose to wait for fare promotions before making reservations [16]. Therefore, Wen and Chen [16] and Chiou and Liu [17, 18] study the advance purchase behavior of air passengers using booking data. The results of Wen and Chen [16] indicate that lower fares increase the number of bookings and that heterogeneous preferences in booking timing are present: some travelers, the price-sensitive customers, tend to book flights earlier than other groups. The results of Chiou and Liu [18] indicate that advance purchase timing is associated with airfare, uncertainty of airfare, time of day, day of the week, month of the year and consecutive holidays. Diego [19] uses an original dataset with posted prices and sales to estimate the dynamic demand for airlines, and finds that consumers become more price sensitive as the time to departure nears, which is consistent with their having lower valuations, and that the number of active consumers increases closer to departure. However, the HSR time-varying demand estimation problem is different from airline demand forecasting. Airline ticket fares change dynamically during the pre-sale period, and the fare changes are associated with the demand and the advance purchase timing of passengers. The main purpose of airline demand forecasting is to obtain the demand corresponding to different fares during the pre-sale period in a segmented market. For the estimation of time-varying demand in an HSR system, by contrast, the ticket fare is fixed throughout the pre-sale period and the effect of fare changes on demand does not need to be considered. Passengers usually purchase their tickets as early as possible once they have determined their itinerary, in order to obtain tickets as close as possible to their expected departure time. Under the circumstance that HSR has enough capacity over the service time window to serve all passengers of each OD pair, we want to estimate the time-varying demand distribution over the operation period of each OD pair.

Dynamic traffic OD estimation mainly uses observed information, including link volumes, traffic counts and various forms of exogenous information, in the form of either a priori knowledge or structural assumptions. A common approach is to use an autoregressive process to describe the evolution of demand [20, 21]. Along this line, instead of an autoregressive process, Zhou and Mahmassani [22] developed a polynomial trend filter to capture possible structural deviations in real-time demand. To improve the unknowns/equations ratio, Marzano and Papola [23] and Cascetta et al. [24] proposed a “quasi-dynamic” estimation framework. Djukic et al. [25] used principal component analysis to reduce the dimensionality of the estimation problem. In addition to within-day dynamic demand estimation, day-to-day dynamics have also received much attention. For instance, Zhou and Mahmassani [22] explicitly modelled a day-to-day evolution process using a Kalman filter. Hazelton [26] used statistical estimation theory to estimate day-to-day OD matrices. Shao et al. [27] estimated the mean and covariance of peak hour OD demands from day-to-day traffic counts. However, the HSR time-varying demand estimation problem is different from the above problems. Firstly, HSR trains operate according to a timetable, so the impact of the timetable should be taken into consideration. Secondly, HSR trains operate during time-of-day periods; therefore, it is only necessary to analyze the within-day time-varying demand.

In transit networks, Wang et al. [28] and Chan et al. [29] used the boarding counts at every station from the Automatic Fare Collection system to formulate the estimation problem, and other researchers have used boarding and alighting data from Automatic Passenger Count systems, together with certain assumptions and principles, to estimate transit station-to-station OD matrices [30–36]. Although a transit network also operates according to a timetable, there are still differences between transit network OD estimation and HSR time-varying demand estimation. In a transit network, passengers do not need to book in advance; they purchase tickets when they arrive at the station, so their arrival or boarding time can be regarded as their expected departure time. In an HSR system, however, and specifically in China, passengers must book tickets in advance, and only passengers who hold tickets are allowed to board. All passengers compete for tickets and occupy train capacity, and their choices are affected by several factors, including the timetable, travel cost and train capacity. The real departure time of passengers therefore cannot be regarded as their expected departure time. In general, a proper method is needed to solve the HSR time-varying demand estimation problem.

The highlights of this paper are as follows. We utilize the ‘rooftop’ model to determine the relationship between itineraries and expected departure intervals, and then reversely assign the ticketing volume of each itinerary to its corresponding expected departure intervals to obtain the time-varying demand. By simulating the ticket-booking process of HSR, a precedence relationship is introduced to constrain the ticket sold-out order of all itineraries for each OD pair. Based on the precedence relationships of itineraries, we propose two typical situations for the ticket sold-out order of all preferable itineraries: in the first, the tickets of any itinerary are sold out within its first booking phase; in the second, the sale of its tickets lasts from its first booking phase to its last booking phase. For these two typical situations, two algorithms are proposed to estimate the time-varying demand, respectively. Case analyses on the OD pair Beijing-Shanghai are presented, and the validity of the two methods is further examined.

The rest of the paper is organized as follows. We propose the assumptions and state the details of HSR time-varying demand estimation problem in Section 2. Section 3 develops the Single Booking Phase Reverse Assignment (SBPRA) algorithm and the corresponding case analysis is presented. The Multiple Booking Phases Reverse Assignment (MBPRA) algorithm is proposed and the corresponding case analysis is given in Section 4. In Section 5, validity analysis is presented. Finally, Section 6 concludes the paper.

2. Problem Statement and Overview of Proposed Approach

In this section, we first summarize the major assumptions for the estimation problem of time-varying demand. Then, we describe the estimation problem of time-varying demand. After that, the rooftop model and simulated ticket-booking process will be introduced, respectively. At last, based on the booking phases, the reverse assignment method will be introduced.

2.1. Assumptions

The following assumptions are made for the demand estimation problem. Assumption (A4) reflects the current practice of HSR system operations in China; however, different fares can be readily incorporated in our modeling framework.

(A1) The HSR system has enough capacity over the service time window to serve all passengers for each OD pair, and flow shifting between transport modes is not considered.

(A2) All passengers have the same value of time (homogeneous passengers).

(A3) Each passenger chooses the itinerary that minimizes his or her travel cost (rational passengers).

(A4) The ticket fare for each itinerary is fixed during the whole pre-sale time window (booking time independent fare).

2.2. Problem of Time-Varying Demand Estimation

The major notations are listed in Appendix A. The time-varying demand estimation problem can be described as follows: given the flow $f_r$ of each itinerary $r$ between each OD pair $(i, j)$, we need to estimate the time-varying demand $q_{ij}(t)$, $t \in T_{ij}$, where $t$ is the expected departure time of passengers and $T_{ij}$ is the operation period of OD pair $(i, j)$.

For OD pair $(i, j)$, let $r$ denote an itinerary, i.e., a travel scheme adopted by passengers, including the trains and transfer stations used from station $i$ to station $j$. Denote $R_{ij}$ as the itinerary set of OD pair $(i, j)$. For any itinerary $r \in R_{ij}$, its cost is defined as $c_r$, which includes the in-train time cost, transfer time cost and ticket fee. The flow of $r$ is expressed as $f_r$, which can be obtained from the RTS. With the large capacity and high-frequency trains of the HSR network, the total demand volume can be obtained by summing all itinerary flows of OD pair $(i, j)$, i.e., $\sum_{r \in R_{ij}} f_r$. Therefore, the problem of time-varying demand estimation is how to reversely assign all itinerary flows $f_r$ to the operation period $T_{ij}$ to obtain the time-varying distribution.

Next, we will describe the simulation of the ticket-booking process, and then reversely assign each itinerary flow to its corresponding expected departure time interval to estimate the time-varying demand.

2.3. Rooftop Model and Ticket-Booking Process of HSR Passengers

Because HSR passengers must book tickets in advance, which distinguishes the problem from conventional traffic assignment models of public transit, the ticket-booking process needs to be analyzed in order to model the passenger assignment.

In HSR system, passengers of each expected departure time book tickets of their preferable itineraries in the set of available itineraries. As the ticket-booking process goes on, some itineraries’ tickets will be sold out. Then, some passengers have to book the tickets of their preferable itineraries in the set of remaining available itineraries. Hence, the ticket-booking process means that the above process is repeated until all passengers have booked their tickets.

From the above ticket-booking process, it is known that the set of available itineraries would be updated after the time division points when the tickets of any preferable itinerary had sold out. These time division points partition the pre-sale period into several pre-sale time intervals. In each pre-sale time interval, which also can be regarded as a booking phase, passengers choose their preferable itineraries in the current set of available itineraries. Thus, the continuous ticket-booking process can be partitioned into several discrete booking phases following the above method. In each booking phase, passengers’ choice behaviors can be described as a rooftop model which will be introduced afterwards.

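The rooftop model [37, 38] can be described based on Assumptions (A2) and (A3) (homogeneous and rational passengers). Consider a passenger with expected departure time $t$. If there is no itinerary departing at time $t$, he/she adjusts his/her departure time to that of another itinerary. Let $d_r$ denote the departure time of itinerary $r$ at the origin station $i$, and let $c_r$ be the cost of itinerary $r$. Define $\theta$ as the unit time fee for passengers who adjust their expected departure time; the travel cost of this passenger choosing itinerary $r$ is then $c_r + \theta |t - d_r|$. Therefore, the preferable itinerary chosen by passengers with expected departure time $t$ can be expressed as

$$r^*(t) = \arg\min_{r \in \bar{R}_{ij}} \left\{ c_r + \theta\,|t - d_r| \right\}, \quad t \in T_{ij}, \tag{1}$$

where $\bar{R}_{ij}$ is the current set of available itineraries of OD pair $(i, j)$ and $T_{ij}$ is its operation period.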
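The rooftop choice rule can be illustrated with a minimal Python sketch. The `Itinerary` class, the three itineraries and the unit time fee of 0.65 RMB per minute are hypothetical values chosen for illustration only; they are not taken from the case data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Itinerary:
    name: str
    departure: float  # departure time at the origin station, minutes after midnight
    cost: float       # itinerary cost (in-train time, transfers, fare), RMB


def travel_cost(it, t, theta):
    """Itinerary cost plus the cost of shifting from expected departure time t."""
    return it.cost + theta * abs(t - it.departure)


def preferable_itinerary(available, t, theta):
    """Rooftop rule: pick the available itinerary with minimum travel cost."""
    return min(available, key=lambda it: travel_cost(it, t, theta))


# Hypothetical data, for illustration only.
itineraries = [
    Itinerary("A", departure=480, cost=300),
    Itinerary("B", departure=540, cost=330),
    Itinerary("C", departure=600, cost=310),
]
theta = 0.65  # assumed unit time fee, RMB per minute

choice = preferable_itinerary(itineraries, t=530, theta=theta)
print(choice.name)  # A: 300 + 0.65 * 50 = 332.5 beats B: 330 + 0.65 * 10 = 336.5
```

In this example a passenger who wants to leave at minute 530 prefers the cheaper itinerary A over the closer itinerary B; once A is removed from the available set, the same call returns B, which is how the available set drives the booking phases described next.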

A simple example of preferable itineraries which are calculated by rooftop model is shown in Figure 1. For a given OD pair , there are 6 itineraries , departure times are respectively, and the cost of each itinerary is respectively, which are shown by the height of the black vertical solid lines in Figure 1. For passengers who want to depart during , the extra cost of adjusting their expected departure times for each itinerary is illustrated by red dotted line in Figure 1, and the slopes of those lines are and . Based on Assumption () and (), the set of preferable itineraries which calculated by Eq. (1) is . Passengers of all expected departure times would only book tickets with their current minimum travel costs in the preferable itinerary set . For instance, passenger who wants to depart at would book tickets of rather than due to the reason that the current travel cost of is lower than for him/her.

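In addition, for OD pair $(i, j)$, define $\hat{R}_{ij}$ as the preferable itinerary set, which includes all itineraries obtained from Eq. (1) for any $t \in T_{ij}$. We sort the preferable itineraries in $\hat{R}_{ij}$ by departure time and still express the sorted set as $\hat{R}_{ij}$, i.e., $\hat{R}_{ij} = \{r_1, r_2, \dots, r_n\}$, with the departure times $d_{r_1}, d_{r_2}, \dots, d_{r_n}$ in increasing order.

In the above analysis, we can see that the preferable itineraries in $\hat{R}_{ij}$ divide the total expected departure time period $T_{ij}$ into several intervals. For any preferable itinerary $r_k \in \hat{R}_{ij}$, we denote $T_{r_k}$ as the expected departure interval of $r_k$, which means that passengers whose expected departure times are within $T_{r_k}$ would only choose the preferable itinerary $r_k$, and their actual departure time is $d_{r_k}$. Hence, $\hat{R}_{ij}$ divides $T_{ij}$ into $n$ expected departure intervals as follows:

$$T_{r_k} = [\tau_{k-1}, \tau_k), \quad k = 1, 2, \dots, n. \tag{2}$$

In Eq. (2), $\tau_k$ is the division point between the expected departure intervals $T_{r_k}$ and $T_{r_{k+1}}$. As shown in Figure 2, for the intersection point of the cost lines of $r_k$ and $r_{k+1}$, its abscissa value is $\tau_k$, and its ordinate value $y_k$ satisfies the following equation, where $c_r$ is the cost of itinerary $r$ and $\theta$ is the unit time fee:

$$y_k = c_{r_k} + \theta\,(\tau_k - d_{r_k}) = c_{r_{k+1}} + \theta\,(d_{r_{k+1}} - \tau_k). \tag{3}$$

Then, $\tau_k$ can be calculated by the following equation:

$$\tau_k = \frac{c_{r_{k+1}} - c_{r_k}}{2\theta} + \frac{d_{r_k} + d_{r_{k+1}}}{2}, \quad k = 1, 2, \dots, n-1. \tag{4}$$

Besides, let $\tau_0$ and $\tau_n$ be the start and the end of $T_{ij}$, respectively.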
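The division points between consecutive preferable itineraries follow from equating their travel costs at the boundary. The sketch below is an illustration under stated assumptions: the function name, the `(departure, cost)` tuples, the operation period of 360 to 1320 minutes and the unit time fee of 0.65 are hypothetical, and the input is assumed to contain only itineraries that are preferable for some expected departure time.

```python
def division_points(preferable, theta, t_start, t_end):
    """Boundaries of the expected departure intervals.

    preferable: list of (departure, cost) pairs; every itinerary is assumed
    to be preferable for some expected departure time.
    Returns n + 1 boundaries for n itineraries.
    """
    ordered = sorted(preferable)  # sort by departure time
    points = [t_start]
    for (d_a, c_a), (d_b, c_b) in zip(ordered, ordered[1:]):
        # Solve c_a + theta * (tau - d_a) = c_b + theta * (d_b - tau) for tau.
        points.append((c_b - c_a) / (2 * theta) + (d_a + d_b) / 2)
    points.append(t_end)
    return points


# Hypothetical data: (departure in minutes after midnight, cost in RMB).
preferable = [(480, 300), (540, 330), (600, 310)]
points = division_points(preferable, theta=0.65, t_start=360, t_end=1320)
intervals = list(zip(points, points[1:]))
print([round(p, 2) for p in points])  # [360, 533.08, 554.62, 1320]
```

The more expensive middle itinerary ends up with a narrow interval (roughly 533 to 555), because passengers slightly further away from its departure time are better off shifting to a cheaper neighbour.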

The ticket-booking process also can be partitioned into several booking phases. In the example of Figure 1, at the beginning of pre-sale period, denoted as the booking phase I, passengers can book tickets in the preferable itineraries set which is calculated by Eq. (1). According to Eq. (2) and (4), would be divided into 3 expected departure intervals by , and respectively. Passengers who want to depart at would book tickets of preferable itinerary . As the booking process proceeds, when any preferable itinerary’s tickets had sold out, such as , this itinerary would be unavailable for passengers. The set of preferable itineraries would be updated to by Eq. (1). Then the booking process moves on to booking phase II. In the booking phase II, would be partitioned by the new preferable itinerary set . The ticket-booking process is similar to the above process, and repeat it until all passengers have booked their tickets.

2.4. Reverse Assignment Based on the Booking Phases

As we analyzed in Section 2.3, ticket-booking process can be partitioned into several booking phases, and in each booking phase, passengers book tickets in the set of preferable itineraries. Hence, according to the booking phases, we adopt reverse assignment method to estimate the time-varying demand of HSR network.

The ticket booking volume of each preferable itinerary is reversely assigned to its corresponding expected departure interval in each booking phase, and the time-varying demand distribution of each booking phase can be calculated. The total time-varying demand can be obtained by summing all the time-varying demand distributions of all booking phases.

In this paper, without the data about the time division points when the tickets of each preferable itinerary had sold out, we only have the data of each itinerary flow for each OD pair to estimate the time-varying demand. Therefore, the key point of this problem is how to partition ticket-booking process into discrete booking phases, i.e., how to get the ticket sold-out order of all preferable itineraries, and how to determine the ticketing volume of each preferable itinerary in each booking phase.

The sold-out order of all preferable itineraries’ tickets is various, and the ticketing volume of each preferable itinerary may be different in each booking phase. However, some itineraries’ ticket sold-out order can be determined by their costs with Assumption ().

For itinerary $r$ with cost $c_r$ and departure time $d_r$, if itinerary $r'$ satisfies the following Eq. (5) for any expected departure time $t \in T_{ij}$, where $\theta$ is the unit time fee, then the tickets of $r'$ would not be sold until the tickets of $r$ had sold out:

$$c_r + \theta\,|t - d_r| \le c_{r'} + \theta\,|t - d_{r'}|. \tag{5}$$

Denote this precedence relationship as $r \succ r'$, i.e., $r$ takes precedence over $r'$. Under the precedence relationship constraint, the tickets of $r$ sell out earlier than those of $r'$. Because $|t - d_r| - |t - d_{r'}|$ attains its maximum value $|d_r - d_{r'}|$ at $t = d_{r'}$, Eq. (5) is equivalent to the following:

$$c_{r'} - c_r \ge \theta\,|d_r - d_{r'}|. \tag{6}$$

Thus, Eq. (6) can be used to easily check the precedence relationship between any two itineraries.
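A pairwise precedence check of this kind takes only a few lines. In the sketch below the function name, the `(departure, cost)` tuples and the unit time fee of 0.65 are hypothetical, and a non-strict inequality is assumed: the cost gap between the two itineraries must be at least the unit time fee times the gap between their departure times.

```python
def takes_precedence(r, s, theta):
    """True if r takes precedence over s, i.e. s is never cheaper than r
    for any expected departure time (non-strict inequality assumed)."""
    if r == s:
        return False  # an itinerary does not precede itself
    d_r, c_r = r
    d_s, c_s = s
    return c_s - c_r >= theta * abs(d_r - d_s)


# Hypothetical itineraries: (departure in minutes after midnight, cost in RMB).
a, b, d = (480, 300), (540, 330), (500, 320)
theta = 0.65  # assumed unit time fee, RMB per minute

print(takes_precedence(a, d, theta))  # True:  20 >= 0.65 * 20
print(takes_precedence(a, b, theta))  # False: 30 <  0.65 * 60
```

Since `a` takes precedence over `d`, no passenger would book `d` while tickets of `a` remain; `a` and `b` have no precedence relationship, so both can be preferable in the same booking phase.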

The precedence relationship is transitive, i.e., if $r_1 \succ r_2$ and $r_2 \succ r_3$, then $r_1 \succ r_3$. With the precedence relationship, $r_1 \succ r_2 \succ \dots \succ r_m$ can be obtained and regarded as a precedence relationship chain if there is no itinerary $r$ satisfying $r \succ r_1$ or $r_m \succ r$. Besides, if an itinerary has no precedence relationship with any other itinerary, this single itinerary can also be regarded as a precedence relationship chain. Hence, there are many precedence relationship chains in the itinerary set $R_{ij}$. The ticket sold-out order of all itineraries is constrained by the precedence relationship. The number of itineraries in a precedence relationship chain is regarded as its length. Denote the longest precedence relationship chain in $R_{ij}$ as $L^*$, and denote the itinerary set and the number of itineraries of $L^*$ as $R^*$ and $K$, respectively. To further estimate the time-varying demand with the precedence relationship, we propose the following assumptions.

(A5) For each expected departure time, the ticket-booking process is continuous and lasts the entire pre-sale time period, and for all expected departure times, the ticket-booking processes are synchronized during the pre-sale time period.

(A6) The booking of tickets for itineraries in the longest precedence relationship chain lasts the entire pre-sale time period. The ticket-booking process is partitioned into only $K$ booking phases, where $K$ is the number of itineraries in the longest precedence relationship chain, by the ticket sold-out time points of the itineraries in that chain.

For the Assumptions () and (), it should be noted that the tickets of each itinerary which is not in would also have sold out in one of the above ticket sold-out time points. For instance, in Figure 1, is . Based on Assumption (), () and (), the ticket-booking process would only be partitioned into 3 booking phases by the ticket sold-out time points of and . For precedence relationship chain , the ticket sold-out time points of would be at the end of booking phase I or II or III.

Hence, the tickets of the itinerary in would be sold only in booking phase . For other itineraries not in , the sale of their tickets may last for more than one booking phase. For instance, in Figure 1, for itinerary , its tickets may be sold for more than one booking phase, and its ticket sold-out time point may be the end of any booking phase.

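For any itinerary $r$ in the itinerary set of the OD pair, denote the first and the last booking phase in which its tickets can be sold as $b_r$ and $e_r$, respectively. In the following, $b_r$ and $e_r$ are referred to briefly as the first and the last booking phase of $r$. Obviously, if $r$ belongs to the longest precedence relationship chain, its tickets are sold out in one booking phase, i.e., $b_r = e_r$. If $r$ does not belong to the longest precedence relationship chain, the sale of its tickets may last from its first booking phase $b_r$ to booking phase $e_r$. For any itinerary $r$, we designed Algorithms 1 and 2, which are shown in Appendix B and C, to calculate $b_r$ and $e_r$, respectively.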

Algorithm 1: First booking phase partition algorithm.
Input: The effective operation period and the itinerary set of OD pair ; the cost , and the departure time of itinerary ; the unit time fee for adjusting expected departure time for passengers
Output: ; , for
Begin
    For any
        calculate the precedence relationship between and using Eq. (6);
        do ;
    For any
        do ;
        if
            do ;
            do ;
    while , do
    Begin 1
        for any
            if
                do ;
            otherwise
                do ;
        , ;
    Return 1
End

Algorithm 2: Last booking phase partition algorithm.
Input: The effective operation period and the itinerary set of OD pair ; the cost , and the departure time of itinerary ; the unit time fee for adjusting expected departure time for passengers; the first booking phase scheme ;
Output: for
Begin
    For any
        calculate the precedence relationship between and using Eq. (6);
        do ;
    For any , do ;
    For , do
    Begin 1
        do ;
        For any
            do ;
        For any ,
            if there is no satisfying
                do ,
                ;
    Return 1
End

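It should be noted that the first booking phase $b_r$ of every itinerary $r$ is calculated by Algorithm 1. We denote the resulting first booking phase scheme of the itinerary set as $B = \{b_r\}$, which is used in Algorithm 2 to calculate the last booking phase $e_r$ of every itinerary $r$.

In Algorithm 2, the last booking phase $e_r$ is calculated for every itinerary $r$. We denote the last booking phase scheme of the itinerary set as $E = \{e_r\}$.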
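One plausible way to compute first and last booking phases from the precedence relationships is a longest-chain layering, sketched below. This is an assumption-laden illustration, not a transcription of Algorithms 1 and 2: the first phase of an itinerary is taken as one plus the length of the longest chain of itineraries preceding it, and the last phase as the number of phases minus the length of the longest chain of itineraries it precedes. The function name, the itinerary names `r1` to `r6` and the precedence relationships are hypothetical.

```python
from functools import lru_cache


def booking_phases(itineraries, successors):
    """successors[r] is the set of itineraries that r takes precedence over.

    The precedence relationship is assumed to be acyclic.
    Returns (number of phases, first phase per itinerary, last phase per itinerary).
    """
    predecessors = {r: set() for r in itineraries}
    for r, succ in successors.items():
        for s in succ:
            predecessors[s].add(r)

    @lru_cache(maxsize=None)
    def first(r):
        # Earliest phase: all predecessors must have sold out before r is booked.
        return 1 + max((first(p) for p in predecessors[r]), default=0)

    n_phases = max(first(r) for r in itineraries)

    @lru_cache(maxsize=None)
    def last(r):
        # Latest phase: r must sell out before any itinerary it precedes is booked.
        return min((last(s) for s in successors.get(r, ())), default=n_phases + 1) - 1

    return (n_phases,
            {r: first(r) for r in itineraries},
            {r: last(r) for r in itineraries})


# Hypothetical precedence relationships: r1 > r2 > r3 (longest chain), r4 > r5, r6 isolated.
names = ["r1", "r2", "r3", "r4", "r5", "r6"]
succ = {"r1": {"r2", "r3"}, "r2": {"r3"}, "r4": {"r5"}}
n_phases, first_phase, last_phase = booking_phases(names, succ)
print(n_phases, first_phase, last_phase)
```

Under this reading, itineraries on the longest chain have identical first and last phases, while an isolated itinerary such as `r6` may be sold in any phase from 1 to 3.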

The example of the calculations of the first and the last booking phase for each itinerary in Figure 1 can be described as follows. According to the first and the last booking phase partition algorithm, we can obtain the following first and the last booking phase scheme.

The ticket sold-out order of itineraries in each precedence relationship chain is constrained by the precedence relationship. For instance, in Figure 1, for precedence relationship chain , due to the reason that , the tickets of would only be sold in booking phase I. Due to the reason that , the tickets of can be sold in booking phase II or in booking phase II and III. In conclusion, due to the reason that the ticket sold-out time point of preferable itinerary may be at the end of different booking phases, we proposed 2 typical situations of all preferable itineraries’ tickets sold out order to estimate the time-varying demand respectively.

Typical Situation 1. For any itinerary $r$, its tickets would be sold out in its first booking phase $b_r$.

Typical Situation 2. For any itinerary $r$, the sale of its tickets would last from its first booking phase $b_r$ to its last booking phase $e_r$.

For each typical situation, the HSR passenger time-varying demand estimation can be described as follows: Firstly, partition the ticket-booking process into several booking phases and figure out the set of preferable itineraries in each booking phase. Secondly, in each booking phase, divide the total expected departure time period into expected departure intervals based on the set of preferable itineraries. Thirdly, ticket booking volume of each preferable itinerary is reversely assigned to its corresponding expected departure interval to obtain the time-varying demand distribution in each booking phase. At last, sum the time-varying demand distributions of all booking phases to obtain the time-varying demand.

The following sections design two time-varying demand estimation algorithms, one for each of Typical Situations 1 and 2.

3. Single Booking Phase Reverse Assignment Algorithm

3.1. Framework of Single Booking Phase Reverse Assignment Algorithm

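According to Typical Situation 1, every itinerary in the itinerary set of the OD pair is preferable only in its first booking phase. Denote $R^k$ as the set of preferable itineraries in booking phase $k$. Based on the above assumptions and Typical Situation 1, the booking phases can be obtained by the first booking phase partition algorithm. Hence, the booking phase scheme is $\{R^1, R^2, \dots, R^K\}$, and $R^k = \{r \mid b_r = k\}$, where $b_r$ is the first booking phase of itinerary $r$ and $K$ is the number of booking phases.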

In booking phase $k$, for any preferable itinerary $r \in R^k$, denote its corresponding expected departure interval as $T_r^k$. According to Typical Situation 1, the ticket booking volume of preferable itinerary $r$ in booking phase $k$ is equal to its flow $f_r$. We reversely assign the ticketing volume of $r$ to its expected departure interval $T_r^k$: $f_r$ is evenly assigned to $T_r^k$, which leads to a uniform distribution over $T_r^k$, i.e.,

$$q^k(t) = \frac{f_r}{|T_r^k|}, \quad t \in T_r^k, \tag{8}$$

where $|T_r^k|$ is the length of $T_r^k$.
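The even assignment within one booking phase, and the summation over phases, can be sketched as follows. The function names, interval boundaries and flows are hypothetical; each phase is represented as a list of `(start, end, rate)` pieces, with times in minutes after midnight and rates in passengers per minute.

```python
def phase_density(boundaries, flows):
    """Piecewise-constant demand rate of one booking phase.

    boundaries: n + 1 sorted division points of the expected departure intervals.
    flows: n ticketing volumes, one per preferable itinerary (same order).
    """
    return [(lo, hi, f / (hi - lo))
            for lo, hi, f in zip(boundaries, boundaries[1:], flows)]


def demand_rate(phases, t):
    """Total time-varying demand at time t: sum of the rates of all booking phases."""
    return sum(rate for pieces in phases for lo, hi, rate in pieces if lo <= t < hi)


# Hypothetical example with two booking phases.
phase_1 = phase_density([360, 600, 1320], [480, 720])  # two preferable itineraries
phase_2 = phase_density([360, 1320], [192])            # one preferable itinerary
phases = [phase_1, phase_2]

print(demand_rate(phases, 500))  # about 2.2 passengers per minute
print(demand_rate(phases, 700))  # about 1.2 passengers per minute
```

Integrating the resulting rate over the operation period returns the total of all itinerary flows (here 480 + 720 + 192 = 1392), which is a convenient sanity check for any implementation.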

For the example in Figure 1, the single booking phase reverse assignment can be described as follows. Based on Assumption (), (), () and Typical Situation 1, the booking phase scheme is expressed as follows.

In booking phase I, using Eq. (8), evenly assign the flows and to their corresponding expected departure intervals and to obtain the time-varying distribution of booking phase I. In booking phase II, evenly assign the flows and to their corresponding expected departure interval and to obtain the time-varying distribution of booking phase II. Then in the booking phase III, evenly assign the flow of preferable itinerary to its corresponding expected departure interval and obtain the time-varying distribution of booking phase III. At last, sum the time-varying distributions of all booking phases to obtain the time-varying demand.

Based on Assumptions (), (), () and Typical Situation 1, we can use the first booking phase partition algorithm to get the booking phase scheme . In the booking phase , for any preferable itinerary , its corresponding expected departure interval can be obtained by Eq. (2) and (4). The flow of is reversely assigned to by Eq. (8). In general, we design the Single Booking Phase Reverse Assignment (SBPRA) algorithm, which is shown in Appendix D, to estimate HSR time-varying demand. The flow diagram of the SBPRA algorithm is shown in Figure 3.
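The overall flow under Typical Situation 1 can be condensed into a short end-to-end sketch. It is an illustration under stated assumptions rather than the Appendix D listing: itineraries are layered into booking phases by repeatedly taking those with no remaining predecessor, division points within a phase come from equating the travel costs of neighbouring itineraries, and each flow is spread evenly over its interval. All names and data below are hypothetical.

```python
def sbpra(itins, theta, t_start, t_end):
    """itins: dict name -> (departure, cost, flow).

    Returns a list of (start, end, rate) pieces; the demand rate at time t
    is the sum of the rates of all pieces that contain t.
    """
    def precedes(r, s):
        (d_r, c_r, _), (d_s, c_s, _) = itins[r], itins[s]
        return r != s and c_s - c_r >= theta * abs(d_r - d_s)

    # Step 1: first booking phase of every itinerary (layering by precedence).
    first, remaining, k = {}, set(itins), 0
    while remaining:
        k += 1
        layer = {s for s in remaining
                 if not any(precedes(r, s) for r in remaining)}
        if not layer:
            raise ValueError("cyclic precedence, e.g. two identical itineraries")
        for s in layer:
            first[s] = k
        remaining -= layer

    # Steps 2 and 3: intervals and even reverse assignment in each booking phase.
    pieces = []
    for phase in range(1, k + 1):
        layer = sorted((n for n in itins if first[n] == phase),
                       key=lambda n: itins[n][0])
        bounds = [t_start]
        for a, b in zip(layer, layer[1:]):
            (d_a, c_a, _), (d_b, c_b, _) = itins[a], itins[b]
            bounds.append((c_b - c_a) / (2 * theta) + (d_a + d_b) / 2)
        bounds.append(t_end)
        for n, lo, hi in zip(layer, bounds, bounds[1:]):
            pieces.append((lo, hi, itins[n][2] / (hi - lo)))
    return pieces


# Hypothetical data: name -> (departure in minutes, cost in RMB, flow in passengers).
itins = {"A": (480, 300, 500), "B": (540, 330, 300),
         "C": (600, 310, 400), "D": (500, 320, 200)}
pieces = sbpra(itins, theta=0.65, t_start=360, t_end=1320)
for lo, hi, rate in pieces:
    print(round(lo, 1), round(hi, 1), round(rate, 3))
```

In this example itinerary D is preceded by A, so D is booked only in the second phase and its 200 passengers are spread over the whole operation period, while A, B and C share the first phase.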

3.2. Case Analysis

We apply the data (shown in Appendix E) of OD pair Beijing-Shanghai on December 2015 from the RTS into the SBPRA algorithm. There are 34 itineraries for OD pair Beijing-Shanghai, and the departure time, cost and flow of each itinerary are given in Table 1. The total effective operation period of this OD pair . The average monthly residential incomes of Beijing and Shanghai are 7086 RMB and 6504 RMB in 2015 respectively [39, 40]. Based on 22 working days in a month and 8 working hours in a day, the average income can be expressed as 0.67 RMB per minute and 0.62 RMB per minute respectively. We use average residential income to express the unit time fee of adjusted expected departure time, i.e., RMB per minute.
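The per-minute income figures quoted above follow directly from the stated monthly incomes and working time, as this short check shows (the variable names are illustrative):

```python
# 22 working days per month, 8 working hours per day, 60 minutes per hour.
working_minutes_per_month = 22 * 8 * 60  # 10560 minutes

income_beijing = 7086 / working_minutes_per_month   # RMB per minute
income_shanghai = 6504 / working_minutes_per_month  # RMB per minute

print(round(income_beijing, 2), round(income_shanghai, 2))  # 0.67 0.62
```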

We calculate the passenger time-varying demand for OD pair Beijing-Shanghai with the SBPRA algorithm. First, use Algorithm 1 to calculate the first booking phase scheme . Second, do to obtain the booking phase scheme . Third, for , we calculate the expected departure interval for all by Eq. (2) and (4), shown in Figure 4; then the ticket booking volume of each preferable itinerary in booking phase , shown in Figure 5, is reversely assigned to its corresponding expected departure interval; the distribution of the reverse assignment in booking phase is illustrated in Figure 6; finally, the accumulated time-varying demand distribution is shown in Figure 7.

In Algorithm 1, for any , we use Eq. (6) to calculate the precedence relationship between and shown in Table 2. According to the precedence relationship in Table 2, first, we set , and calculate the of by . For any itinerary , if , then and put it in . The and of are shown in column of Table 3. For example, G11G105, then is 1; and G1G115 and G13G115, then is 2. The itinerary set is shown in Table 4.

Secondly, set . For any itinerary , if , then and put this itinerary in ; otherwise, set . The and of are shown in column of Table 3. For example, G11G105, , , and , then . G1G117, , , and , then ; G13G117 and , then . The itinerary set is shown in Table 4.

Thirdly, set . For any itinerary , if , then and put this itinerary in ; otherwise, set . The and of are shown in column of Table 3. For example, G115G117, , , and , then . The itinerary set is shown in Table 4.
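The three steps above read like a layer-by-layer peeling of a precedence graph. The following Python sketch shows one way such a partition could be computed. It is an interpretation, not a transcription of Algorithm 1: the function name `first_booking_phases`, the use of a predecessor count per itinerary, and the direction of each precedence pair are assumptions, and the example uses only the pairs mentioned in the text rather than the full Table 2.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def first_booking_phases(itineraries: Iterable[str],
                         precedes: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Assign each itinerary the index of the layer in which it becomes free.

    `precedes` holds (earlier, later) pairs; an itinerary enters phase k once
    all of its predecessors have been placed in phases before k.
    """
    successors: Dict[str, List[str]] = defaultdict(list)
    remaining = {i: 0 for i in itineraries}  # unplaced predecessors
    for earlier, later in precedes:
        successors[earlier].append(later)
        remaining[later] += 1

    phase: Dict[str, int] = {}
    current = [i for i, n in remaining.items() if n == 0]
    k = 1
    while current:
        for i in current:
            phase[i] = k
        next_layer: List[str] = []
        for i in current:
            for j in successors[i]:
                remaining[j] -= 1
                if remaining[j] == 0:
                    next_layer.append(j)
        current = next_layer
        k += 1
    return phase


# Pairs taken from the worked examples in the text (direction assumed).
pairs = [("G11", "G105"), ("G1", "G115"), ("G13", "G115"),
         ("G1", "G117"), ("G13", "G117"), ("G115", "G117")]
phases = first_booking_phases(
    ["G1", "G11", "G13", "G105", "G115", "G117"], pairs)
```

Under these assumptions G105 and G115 fall in the second layer and G117 in the third, which matches the order in which the counts reach zero in the worked examples.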

Figure 4 illustrates the expected departure interval and travel cost of each preferable itinerary in each booking phase. From booking phase I to III, the number of preferable itineraries decreases rapidly, the expected departure interval of each preferable itinerary becomes wider, and the height of the vertical solid line, which represents the cost of each preferable itinerary, rises.

Figure 5 shows the change in the ticketing volume of each preferable itinerary. From booking phase I to III, the height of the vertical solid lines, which represents the ticketing volume of each preferable itinerary, declines drastically. Based on this information, the features of the ticket-booking process simulated by the SBPRA algorithm can be described as follows:

For OD pair , passengers choose their preferable itineraries and book their corresponding tickets with the minimum travel cost, so the sold-out order of all preferable itineraries’ tickets is from low cost to high cost.

For OD pair , the tickets of every itinerary sell out in its first booking phase. As a result, most passengers book tickets in an early booking phase, where travel costs are relatively low and the ranges over which expected departure times must be adjusted are relatively narrow. In contrast, the small percentage of passengers who book tickets in a late booking phase have to choose itineraries with higher costs and adjust their expected departure times over a wider range.

We reversely assign each preferable itinerary’s ticket booking volume in Figure 5 to its corresponding expected departure interval in Figure 4 to obtain the time-varying demand distribution in each booking phase in Figure 6. We then sum the distributions in Figure 6 from booking phase I to III to get the accumulated time-varying demand distribution in Figure 7. The red solid line in Figure 7 is the time-varying demand of OD pair Beijing-Shanghai calculated by the SBPRA algorithm, and Table 5 shows the numerical results of the time-varying demand in detail.

Polynomial fitting is applied to the above distribution of time-varying demand, and the result is shown in Figure 8. It can be seen that travel demand before about 7:30 and after 17:30 is relatively low. There are two demand peaks, from 9:00 to 11:00 and from 14:00 to 16:00. The drop in travel demand around 12:00 is probably due to the approaching lunch time.

For the sensitivity analysis of parameter in the SBPRA algorithm, we also calculate the numbers of booking phases and the time-varying demand distributions for different values , shown in Table 6 and Figure 9 respectively. From Table 6, it can be seen that with a larger value of , passengers are more concerned about the cost of adjusting the expected departure time, which decreases the number of booking phases. From Figure 9, as parameter increases, the fluctuation of the time-varying demand distribution increases.

4. Multiple Booking Phases Reverse Assignment Algorithm

The SBPRA algorithm above is an estimation method based on Typical Situation 1, in which each itinerary flow is reversely assigned to the corresponding expected departure interval in its first booking phase only. In this section, based on Typical Situation 2, each itinerary flow is reversely assigned to the corresponding expected departure intervals in every booking phase from its first to its last. The resulting MBPRA algorithm is introduced next.

4.1. Framework of the Multiple Booking Phases Reverse Assignment Algorithm

According to Typical Situation 2, itinerary would be preferable for passengers in the booking phase . As a result, the set of preferable itineraries in the booking phase can be calculated as follows.

Hence, the booking phase scheme can be expressed as , and .
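A minimal Python sketch of how the preferable set of each booking phase could be assembled under Typical Situation 2 is given below. It assumes, following the description of this section, that an itinerary is preferable in every phase from its first to its last booking phase; the function name `booking_phase_scheme` and the toy itineraries A, B and C are hypothetical, and the sketch is not a reproduction of Eq. (10).

```python
from typing import Dict, List


def booking_phase_scheme(first_phase: Dict[str, int],
                         last_phase: Dict[str, int]) -> Dict[int, List[str]]:
    """Map each booking phase k to the itineraries with first <= k <= last."""
    n_phases = max(last_phase.values())
    return {
        k: sorted(i for i in first_phase
                  if first_phase[i] <= k <= last_phase[i])
        for k in range(1, n_phases + 1)
    }


# Toy input: A is only on sale in phase 1, B in phases 1-3, C in phases 2-3.
scheme = booking_phase_scheme(
    first_phase={"A": 1, "B": 1, "C": 2},
    last_phase={"A": 1, "B": 3, "C": 3},
)
```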

For any itinerary , its flow is evenly assigned to all of its corresponding expected departure intervals in booking phase , i.e., the ticket booking volume of preferable itinerary in booking phase is a proportion of this itinerary’s flow . The proportion is equal to the ratio of the time range of its expected departure interval in booking phase to the total time range of all its corresponding expected departure intervals over all booking phases . Hence, the ticket booking volume is evenly assigned to its expected departure interval in booking phase , which leads to a distribution in . This distribution can be expressed as: Besides, is the time range of .
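The proportional split described above can be sketched in Python as follows. This is a reading of the verbal description, not of Eq. (11) itself; the function name `allocate_flow`, the minute-based intervals and the toy numbers are assumptions, and every interval is assumed to have positive length.

```python
from typing import Dict, Tuple


def allocate_flow(flow: float,
                  intervals_by_phase: Dict[int, Tuple[float, float]]
                  ) -> Dict[int, Tuple[float, float]]:
    """Split one itinerary's flow across its booking phases.

    Returns, per phase, (ticket booking volume, uniform density per minute).
    The volume share of a phase is its interval length divided by the sum
    of the interval lengths over all phases of the itinerary.
    """
    lengths = {k: end - start for k, (start, end) in intervals_by_phase.items()}
    total_length = sum(lengths.values())
    result = {}
    for k, length in lengths.items():
        volume = flow * length / total_length
        result[k] = (volume, volume / length)
    return result


# Toy input: a flow of 90 passengers, a 30-minute interval in phase 1
# and a 60-minute interval in phase 2.
shares = allocate_flow(90.0, {1: (480.0, 510.0), 2: (470.0, 530.0)})
```

One consequence of this rule is that the per-minute density of an itinerary is the same in every phase (its flow divided by the total interval length), so only the width of the interval changes from phase to phase.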

The example of the multiple booking phases reverse assignment in Figure 1 can be described as follows. Based on Assumptions (), (), () and Typical Situation 2, the booking phase scheme, which is calculated by Eq. (10), is expressed as follows.

In booking phase I, the preferable itinerary set . Using Eq. (11), we obtain the time-varying distribution of booking phase I, i.e., we evenly assign the ticket booking volumes and of and to their corresponding expected departure intervals and respectively. In booking phase II, the preferable itinerary set . Using Eq. (11), we obtain the time-varying distribution of booking phase II, i.e., we evenly assign the ticket booking volumes and to their corresponding expected departure intervals and respectively. Then, in booking phase III, the preferable itinerary set . Evenly assigning the ticket booking volumes and to their corresponding expected departure intervals and respectively gives the time-varying distribution of booking phase III. Finally, sum the time-varying distributions of all booking phases to obtain the time-varying demand.

Based on Assumptions (), (), () and Typical Situation 2, we can obtain the first booking phase scheme and the last booking phase scheme by the first and the last booking phase partition algorithms respectively. After that, we can obtain the booking phase scheme by Eq. (10). In the booking phase , for any preferable itinerary , its corresponding expected departure interval can be calculated by Eq. (2) and (4). Then, the ticket booking volume of is reversely assigned to by Eq. (11). In general, we design the Multiple Booking Phases Reverse Assignment (MBPRA) algorithm, which is shown in Appendix F, to estimate HSR time-varying demand, and the flow diagram of the MBPRA algorithm is shown in Figure 10.

4.2. Case Analysis

We apply the data (shown in Appendix E) of Beijing-Shanghai HSR on December 1st, 2015 from the RTS to the MBPRA algorithm, and the parameter settings are the same as in Section 3.2. The first and the last booking phase schemes, calculated by Algorithms 1 and 2 respectively, are shown in Tables 4 and 7. Based on the first and last booking phase schemes, the booking phase scheme is obtained by Eq. (10), shown in Table 8. We can see that the continuous ticket-booking process is partitioned into 3 booking phases.

Figure 11 illustrates the expected departure interval of each preferable itinerary in each booking phase. From booking phase I to III, the cost of each preferable itinerary rises, and there is no other obvious trend. Figure 12 shows the change in the ticketing volume of each preferable itinerary. From booking phase I to III, the ticketing volume of each preferable itinerary declines gradually, and more slowly than in the SBPRA algorithm.

For OD pair , passengers choose their preferable itineraries with the minimum travel cost and book their corresponding tickets, and each itinerary’s tickets remain on sale from its first to its last booking phase. Under these conditions, most passengers book tickets in an early booking phase at a relatively low travel cost, and as the booking process goes on, few passengers purchase tickets in a late booking phase at a relatively high travel cost. The decline in ticketing volume and the increase in travel cost simulated by the MBPRA algorithm are gentler than those simulated by the SBPRA algorithm. In conclusion, comparing the two solutions, passengers are less sensitive to changes in travel cost in the MBPRA algorithm than in the SBPRA algorithm.

Figure 13 shows the time-varying demand distribution in each booking phase, and the accumulated time-varying demand distribution of the MBPRA is illustrated in Figure 14. Table 9 shows the numerical results of time-varying demand by the MBPRA algorithm.

Polynomial fitting is applied to the above time-varying demand results, and we obtain the demand distribution curve of Beijing-Shanghai, shown in Figure 15. We can see that the fluctuation trends of the time-varying demand distributions from the MBPRA algorithm and the SBPRA algorithm are similar. This means that the solution space between the MBPRA algorithm and the SBPRA algorithm is relatively narrow.

For the sensitivity analysis of parameter in the MBPRA algorithm, the time-varying demand distributions with different parameter values are shown in Figure 16. The SBPRA and MBPRA algorithms have the same number of booking phases. From Figure 16, it can be seen that the trend of the time-varying demand distribution has the same characteristics as in Figure 9.

5. Validity Analysis

In order to analyze the validity of the above two time-varying demand estimation algorithms, we compare their results with real time-varying demands. However, because real time-varying demand is difficult to obtain, we adopt some special data from the RTS, which are close to the real time-varying demand, to analyze the validity. We choose some special OD pairs of HSR which are served by high train frequencies (headway of less than 1 hour) with sufficient train capacities, so that the transport volume in each hour can be regarded as the real hourly demand of the OD pair. By comparing these volumes with the time-varying demands calculated by the SBPRA algorithm and the MBPRA algorithm respectively, we can approximately test the accuracy of the two algorithms.

We apply the data (shown in Appendix G) of OD pair Beijing-Tianjin on December 1st, 2015 to analyze the validity of the SBPRA algorithm and the MBPRA algorithm. There are 129 itineraries of OD pair Beijing-Tianjin on that day. The effective operation period of this OD pair is . The average monthly residential incomes of Beijing and Tianjin in 2015 are 7086 RMB and 4944 RMB respectively [40, 41], which correspond to 0.67 RMB per minute and 0.47 RMB per minute respectively. We set RMB per minute. The comparison between the hourly transport volumes from the RTS and the results from the SBPRA algorithm and the MBPRA algorithm is shown in Table 10.

From Table 10, we can see that the error rates of the SBPRA algorithm and the MBPRA algorithm are 8.64% and 6.37% respectively, which are relatively low and support the validity of the two algorithms. Besides, the MBPRA algorithm has a lower error rate than the SBPRA algorithm, which implies that the ticket-booking process of this OD pair on that day is closer to Typical Situation 2.

6. Conclusion and Further Studies

This paper focuses on the problem of HSR time-varying demand estimation. By simulating the ticket-booking process, we reversely assign the ticketing volume of each preferable itinerary to its corresponding expected departure interval in each booking phase, and then sum the demand distributions of all booking phases to obtain the time-varying demand. Because the tickets of the preferable itineraries can sell out in many different orders and only the itinerary flow data are available, the precedence relationship is introduced to constrain the ticket sold-out order of all itineraries for each OD pair. Based on the precedence relationship of itineraries, two typical situations are proposed, and the SBPRA algorithm and the MBPRA algorithm are designed. The case analysis shows that the results of the two algorithms reflect the time-varying characteristics of HSR passenger demand, and the fluctuations of the two distributions are similar, but the SBPRA results are more sensitive to differences in itinerary cost. Numerical analysis shows that the error rates of the SBPRA algorithm and the MBPRA algorithm are 8.64% and 6.37% respectively, which indicates rather good estimation accuracy and supports the validity of the two algorithms.

The current research, as a first step towards estimating time-varying demand in HSR, can be extended along several avenues. First, this paper only considers the travel cost for passengers with the same unit time fee, but the unit time value may vary across passengers. Further studies can classify passengers into several categories with different socio-economic characteristics (e.g., income level). Besides, different seat classes could be considered. Second, this paper uses a simulation method to estimate time-varying demand by partitioning the continuous ticket-booking process into discrete booking phases according to two typical situations. If more detailed information is accessible, such as the ticket-booking time of each passenger, then the time-varying demand can be estimated from the actual sold-out order of preferable itineraries. Third, we will study the estimation problem of day-to-day dynamic demand for the HSR system in future research.

Appendix

A. Notation

Sets:
Set of OD pair
Set of itineraries of OD pair
Current set of available itineraries of OD pair
Set of preferable itineraries for passengers of OD pair
Set of preferable itineraries whose first booking phase is
Set of preferable itineraries whose last booking phase is
Set of preferable itineraries in the booking phase
The itinerary set of the longest precedence relationship chain

Indexes:
Index of OD pair,
Index of booking phase,
Index of preferable itinerary,
Index of itinerary,

Parameters:
Effective operation period of OD pair
Itinerary of OD pair ,
Preferable itinerary of OD pair ,
Cost of itinerary
Depart time of itinerary from station
Flow of itinerary
Expected departure time for passenger,
Unit time fee for passengers who adjust expected departure time
Time division point between and
The first booking phase of
The last booking phase of
Expected departure interval of
Expected departure interval of in booking phase

Variables:
The itinerary number of the longest precedence relationship chain
Time-varying demand of OD pair ,

B. The First Booking Phase Partition Algorithm

See Algorithm 1.

C. The Last Booking Phase Partition Algorithm

See Algorithm 2.

D. Single Booking Phase Reverse Assignment Algorithm

See Algorithm 3.

Input  The effective operation period and the itinerary set of OD pair , the cost ,
the flow , and the departure time of itinerary , the unit time fee of adjusting expected
departure time for passengers;
Output  Passenger time-varying demand for OD pair .
Begin
Calculate and for by the first booking phase partition algorithm;
do , obtain ;
;
For , do
Begin 1
Obtain the expected departure interval for all by Eq. (2) and (4);
Obtain the distribution of reverse assignment for all by Eq. (8);
do ;
Return 1
End

E. The Ticket Booking Data of Each Itinerary of OD Pair Beijing-Shanghai on December 1st 2015

See Table 11.

F. Multiple Booking Phases Reverse Assignment Algorithm

See Algorithm 4.

Input  The effective operation period and the itinerary set of OD pair , the cost ,
the flow , and the departure time of itinerary , the unit time fee of adjusting expected
departure time for passengers;
Output  Passenger time-varying demand for OD pair .
Begin
Calculate and for by the first booking phase partition algorithm;
Calculate for by the last booking phase partition algorithm;
Based on the first and the last booking phase scheme to obtain the booking phase scheme
by Eq. (10);
;
For , do
Begin 1
Obtain the expected departure interval by Eq. (2) and (4);
Obtain the distribution by Eq. (11);
do ;
Return 1
End

G. The Ticket Booking Data of Each Itinerary of OD Pair Beijing-Tianjin on December 1st 2015

See Table 12.

Data Availability

The relevant data used to support the findings of this study are in the main body and Appendix of the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Funding

This study was supported by the National Natural Science Foundation of China [grant numbers 71701216, U1334207] and the Humanities and Social Sciences Foundation of Jiangxi, China [grant number GL1516].