Research Article | Open Access
Floyd-A∗ Algorithm Solving the Least-Time Itinerary Planning Problem in Urban Scheduled Public Transport Network
We consider an ad hoc Floyd-A∗ algorithm to determine the a priori least-time itinerary from an origin to a destination given an initial time in an urban scheduled public transport (USPT) network. The network is bimodal (i.e., USPT lines and walking) and time dependent. The modified USPT network model yields more reasonable itineraries. An itinerary is connected through a sequence of time-label arcs. The proposed Floyd-A∗ algorithm is composed of two procedures designated as Itinerary Finder and Cost Estimator. The A∗-based Itinerary Finder determines the time-dependent, least-time itinerary in real time, aided by the heuristic information precomputed by the Floyd-based Cost Estimator, in which each time-dependent arc travel time is preestimated by an associated static lower bound. The Floyd-A∗ algorithm is proven to guarantee optimality in theory and is demonstrated, through a real-world example in the Shenyang City USPT network, to be more efficient than previous procedures. The computational experiments also reveal the time-dependent nature of the least-time itinerary. On the premise that lines run punctually, “just boarding” and “just missing” cases are identified.
When a traveler plans to travel from one place (origin) to another (destination) beginning at a given initial time (or imposed by a deadline) in a real-world urban scheduled public transport (USPT) network, it can be difficult to determine the a priori least-time (LT) itinerary. Typically speaking, the itinerary should specify the USPT services (e.g., metro lines, bus lines) that combine the vehicle trips to take, the roads to walk, and the stops at which to transfer in order to arrive at a destination consuming the least time. The aforementioned least-time itinerary planning problem in an urban scheduled public transport network (LTIP-USPT) is a common decision problem for travelers in a city travelling from an origin to a destination to make a date, attend a conference, participate in a party, and, most often, to go to work.
Unlike the well-studied shortest path problems in static networks, LTIP-USPT is more difficult to address for the following reasons: (i) the topological structures and timetables of the public transport services in a city are prescheduled; (ii) one or more transfers are unavoidable in most cases, resulting in walking time and waiting time as penalties; (iii) in addition to spatial connectivity and distance, travelers must also consider temporal and operational factors. In short, public-transport travelers, as opposed to private-vehicle travelers, cannot move through a USPT network freely without constraints (for more detailed explanation, please see ). Moreover, in real-world applications, the solution procedure should be as fast as possible. This is a challenging task because, in the time-dependent USPT network, various nodes and lines are interconnected, leading to numerous combinations of lines, vehicle trips, walks, and transfers and thus to a difficult combinatorial optimization problem.
Headway-based public transport services were considered in early research (see [2–4]). Recent years have witnessed a boom in the development of schedule-based public transport. Correspondingly, the focus of the transportation research community has shifted from headway-based services to schedule-based ones. The main difference lies in the evaluation of each transfer waiting time. For headway-based services, the transfer waiting time is typically assumed to be half of the headway. For schedule-based services, however, it can be evaluated precisely from the combined timetables. Thus the network travel times (especially transfer waiting times) shift from fixed to time dependent, resulting in the need for methodological changes in many transportation problems, for example, itinerary planning (see [5–8]) and traffic assignment (see [9–11]). These methodologies appear more promising than, and compete with, those developed for traditional headway-based services. Note that in reality, the “schedule-based lines” are a generalized concept that might include bus lines and metro lines. In this sense, the bimodal (USPT lines and walking) network considered in this paper could also be interpreted as a multimodal one, which is a typical urban public transport network in the real world.
There are numerous related works in the existing literature. Tong and Richardson  were among the first to study scheduled public transport; they developed a branch-and-bound type algorithm for the minimum itinerary. In this vein, the research community extended the itinerary planning problem by introducing different real-world considerations. Horn introduced multimodal transport services and considered a Dijkstra-based algorithm for minimizing generalized travel costs . Tan et al.  developed a recursive algorithm for finding reasonable paths that satisfy the defined acceptable time criterion and transfer-walk criterion, where travelers’ preferences are required as input. Travelers may need not only the LT itinerary but also the k-LT itineraries. Accordingly, Xu et al.  and Canca et al.  studied the k-shortest path problems in schedule-based transit networks. Androutsopoulos and Zografos considered a dynamic programming based algorithm [14, 15] for the itinerary planning problem in which the traveler imposes time windows on nodes; multiple objectives were also considered. These works have contributed significantly to itinerary planning problems derived from the real-world requirements of different types of travelers. However, they focused mainly on real-world factors. With regard to algorithmic efficiency, they were concerned only with whether a query could be completed within a short time. In query-intensive scenarios, each query should be as fast as possible, which has motivated researchers in recent years to examine heuristic methods to speed up computing.
Fu et al. reviewed the heuristic shortest path algorithms for transportation applications , noting that A*-based algorithms were widely used. The heuristic A* algorithm was first proposed by Hart et al. , typically for the shortest path problems in static networks. The performance of the A* algorithm depends primarily on the strategy for estimating the travel time of a partial path. A well-designed strategy yields considerable savings in computational time while still assuring an optimal solution. Chabini and Lan adapted A* for the fastest path in deterministic discrete-time dynamic networks ; they proposed three estimating strategies. Subsequently, numerous works appeared in the literature that primarily applied A*-based algorithms to LT path problems in unscheduled time-dependent road networks (see [19, 20]). Chen et al. proposed an A*-based integrated approach combining offline precomputation and online path retrieval for a road navigation system . This work introduced precomputation (see [22, 23]) technology and significantly improved the efficiency. However, determining the LT itinerary in a scheduled network is more challenging than in an unscheduled network, and relatively few related works were found.
In fact, two notable works have studied the itinerary planning problem in scheduled public transport networks, focusing especially on speed-up technologies [24, 25]. They showed that the time-dependent model is superior to the time-expanded model in terms of itinerary-finding efficiency. A*-based and other strategies were demonstrated to be capable of speeding up computing. Following these works, this paper goes further in the following aspects. (i) A modified bimodal (i.e., USPT services and walking), time-dependent USPT network is modelled. This model is observed to be more applicable in that the itineraries found over the modified network model intrinsically contain fewer transfers. (ii) An ad hoc Floyd-A* algorithm is developed to solve the LTIP-USPT where transit vehicles are assumed to run punctually. A novel approach to estimating the travel time of the partial itinerary is embodied in Floyd-A*. To implement the approach, we generate a slacked network and let each arc travel time be a static tight lower bound of the associated real time-dependent arc travel time. Precomputing technology is also used. The algorithm is proven to be optimal in theory and is demonstrated with a real-world example to be very applicable. The Floyd-A* procedure outperforms the previous procedures: it reduces the average computational time by 63.9% compared with a conventional Dijkstra-like procedure. (iii) From the management perspective, an illustrated example reveals the time-dependent nature of the least-time itinerary. On the premise that lines run punctually, the solution aids travelers in avoiding “just missing” cases.
The remainder of this paper is organized as follows. Section 2 formulates the modified USPT network model and itinerary structure, develops the formula for each time-dependent time-label arc, and subsequently formulates the LTIP-USPT followed by hypotheses. In Section 3, we propose an ad hoc Floyd-A* algorithm composed of two procedures, that is, Floyd-based Cost Estimator and A*-based Itinerary Finder. The Cost Estimator precomputes the estimated travel time of destination-ended partial itineraries as heuristic information. The Itinerary Finder heuristically determines the LT itinerary in real-time. Floyd-A* is mathematically proven to be admissible and efficient. Furthermore, an illustrated example is presented in Section 4 that reveals the time-dependent nature of an LT itinerary and provides guidance for travelers in determining the initial time to begin travel. Meanwhile, through both a numerical example and a real-world case, the Floyd-A* procedure is proven to be more efficient than two other conventional procedures, that is, Dijkstra-like and Plain-A* procedures. Finally, concluding remarks and future works are discussed in Section 5.
2. Formulation of the Least-Time Itinerary Planning Problem in Urban Scheduled Public Transport Network (LTIP-USPT)
In modeling the urban scheduled public transport (USPT) network, both USPT lines and walking modes should be considered. An itinerary should encompass spatial, temporal, and operational features. In a USPT network, a passenger who plans to travel from an origin to a destination at a given initial time will select the USPT lines to take, the roads to walk, and the stops at which to transfer in order to arrive at the destination as quickly as possible. We identify this problem hereafter as LTIP-USPT. To solve an LTIP-USPT, a challenge is to model a more applicable USPT network and a reasonable itinerary structure; these are developed in Sections 2.1 and 2.2, respectively. In Section 2.3, we formulate the LTIP-USPT as the least-time itinerary planning problem in a deterministic bimodal time-dependent scheduled network.
2.1. A Modified Scheduled Network Model
Figure 1(a) shows an example of a physical USPT network, and the associated modified network model is shown in Figure 1(b). The advantage of this model over previous scheduled network models is explained in Remark 1. Table 1 gives an associated timetable example.
Let denote the scheduled directed USPT line that operates on the USPT network. The USPT line typically refers to (but is not restricted to) a bus line or metro line that runs on fixed roads and passes through a predetermined series of nodes based on a timetable . There are numerous vehicle trips within a single day. , element of represents the scheduled time when the th vehicle trip of arrives or departs at node . Table 1 shows a timetable example, where, for example, .
Any move along a specific line is not necessarily between two adjacent nodes but may pass through one or more intermediate nodes. Correspondingly, arcs can be generated by line . The set of these arcs is formulated as , where denotes the sequence number that line passes through node . Let if does not pass through . Obviously, for any node , . For example, of Figure 1 associates with the set of arcs . In this model, it should be noted that only one arc in , rather than two or more connected arcs, is traversed from the traveler’s boarding to alighting a vehicle of line .
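As a minimal sketch of this arc-generation rule (not the authors' implementation), the following Python snippet enumerates the arcs of one line, assuming the line is given as an ordered list of distinct stops. The function names `line_arcs` and `adjacent_arcs` and the stop labels are hypothetical. It contrasts the modified model, in which any earlier stop is connected to any later stop of the same line, with the adjacent-only model used in earlier works.

```python
from itertools import combinations


def line_arcs(stops):
    """Arcs of the modified model: one arc from each stop to every later stop.

    Assumes `stops` lists the nodes of a single directed line in running
    order and that the line visits each node at most once.
    """
    # combinations() preserves input order, so the tail always precedes the head.
    return list(combinations(stops, 2))


def adjacent_arcs(stops):
    """Arcs of the earlier model: only between consecutive stops."""
    return list(zip(stops, stops[1:]))


stops = ["v1", "v2", "v3", "v4"]      # hypothetical stop labels
modified = line_arcs(stops)           # 4 * 3 / 2 = 6 arcs
previous = adjacent_arcs(stops)       # 3 arcs
```

With the modified arc set, a ride from the first to the last stop is a single arc, which is what allows a label-setting search to treat boarding-to-alighting as one move.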
Remark 1. One of the challenges of this problem is the existence of multiple solutions, and it is fairly easy for an algorithm to become trapped in a poor local minimum. This remark elaborates on this phenomenon and on the remedy. In previous related works, the arcs of a USPT line typically exist only between adjacent nodes; that is, . The practical disadvantage is shown by the following example. Consider the two USPT network models in Figure 2, where Figure 2(a) shows a previous network model and Figure 2(b) a modified one. A traveler goes from to starting at 9:00. There are the following two alternative itineraries.
Itinerary 1. Wait 5 minutes and start at origin , traveling by directly to the destination .
Itinerary 2. First travel by line to , and wait 5 minutes and then travel by to .
They both arrive at destination at 9:50, so they are both least-time itineraries in theory. However, a real-world traveler typically prefers Itinerary 1, because Itinerary 2 contains a transfer. In the model of Figure 2(a), a label-setting algorithm (see Dijkstra 1959) will obviously choose Itinerary 2, because 9:25 is earlier than 9:30 regarding node ; that is to say, dominates . In comparison, executing a label-setting algorithm in the model of Figure 2(b) will lead to the choice of Itinerary 1. In fact, once the is searched, it will never be dominated, because there is no way to reach earlier than 9:50. We observed that this phenomenon occurs widely in our experiments. The itineraries found over the modified network model intrinsically contain fewer transfers; thus, the model is more applicable.
In most real-world cases, the destination node cannot be reached by using only one line from the origin node , so a transfer is necessary. Transferring does not always occur at just one node; a traveler may have to walk some distance to another node in order to transfer. The tolerable walking distance is constrained by a constant upper bound . Thus, the set of walking arcs is formulated as . Using the USPT network of Figure 1 as an example, while because . For notational convenience, let USPT service s denote either a line or the walk ; that is, .
In summary, with regard to USPT network , , and . In general, a node in this network represents a bus stop or metro station. An shows an available move. For any given arc, the travel time does not always remain constant but depends on the initial start time. This makes the USPT network a single-layer, bimodal, and time-dependent network.
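The walking-arc rule can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: nodes carry hypothetical planar coordinates in metres, distance is measured as a straight line (real walking distance would follow the road network), and the 500 m bound is an arbitrary illustrative value. The name `walking_arcs` is hypothetical.

```python
from math import dist


def walking_arcs(coords, max_walk):
    """Return all ordered node pairs whose distance is within max_walk.

    coords: {node: (x, y)} in metres (assumed data layout).
    A walking arc is generated in both directions, since walking is
    assumed to be possible either way.
    """
    arcs = []
    for i, p in coords.items():
        for j, q in coords.items():
            if i != j and dist(p, q) <= max_walk:
                arcs.append((i, j))
    return arcs


coords = {"a": (0, 0), "b": (0, 300), "c": (0, 1000)}  # hypothetical stops
arcs = walking_arcs(coords, 500)  # only a and b are within walking range
```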
2.2. Itinerary Structure and Timing
In a static network, an itinerary is represented as an ordered sequence of arcs or nodes, but some adaptation is needed in a USPT network context. An arc alone cannot describe the temporal factor, so we define the time-label arc in Definition 2. An itinerary in a USPT network can then be represented as a sequence of time-label arcs.
Definition 2. A time-label arc (t-arc for short) is defined as a 4-tuple link representing a passenger’s move from a tail node to a head node by means of a transport service at a given initial time . This representation is legitimate if and only if there exists an available transport service for a passenger who is located at node at time (maybe with some waiting time) to move towards node . Using Figure 1 as an example, denotes that a passenger arrives at node at the initial time 7:10:00 and travels to node by line . In addition, subscripts of the service and the initial time are kept consistent with the tail node . If necessary, superscripts are used to distinguish the different services and initial times.
When a passenger travels from to at a given initial time , there can be numerous eligible itineraries. The set of these itineraries is denoted by , whose elements can be represented by a sequence of connected t-arcs shown in where is the number of t-arcs that compose .
A passenger may be concerned about the total travel time of itinerary , which is the accumulated travel time of each component t-arc. The elapsed travel time therefore acts as the cost (weight) of each t-arc. There are three components of travel time as follows:
(1) in-vehicle time: elapsed during vehicular travel on the line,
(2) walking time: elapsed during walking between two nodes for transfer purposes,
(3) waiting time: elapsed at a node waiting for the arriving transfer vehicle.
Let be an operator that times each t-arc or itinerary. The method for timing a t-arc depends on the associated transport service . If , the t-arc is traversed by walking. The operator then returns the fixed walking time, as shown in
The arrival time at is then easily calculated in
If , the t-arc is traversed by line . Both the in-vehicle time and the waiting time must be considered. Therefore, the associated travel time is not fixed but time dependent, as calculated in
Because the passenger will board the first arriving vehicle of the transferred line , of Formula (4) is determined by Formula (5); the waiting time and the in-vehicle time in this process are and , respectively. The corresponding arrival time at is calculated with Formula (6):
In any specific USPT network, once the initial time of each t-arc is known, the travel time of this t-arc and the associated arrival time can be easily calculated. With respect to any itinerary , the first initial time is predetermined by the passenger, and the subsequent times can be calculated recursively by Formula (3) or (6). In other words, the initial time of a specific t-arc is equal to the arrival time of the upstream t-arc. In this case, the travel time of the itinerary formulated in Expression (1) can be written as
Along with spatial and temporal features, practical operability should also be considered from the passenger’s perspective. Some properties of the itinerary that describe operability are given below.
Property 1. Two t-arcs that are traversed by walking cannot be adjacent, due to the hypothesis that a walking distance between two nodes cannot be larger than . In other words, when , we have , where .
Property 2. During the travel process, if a line has been already used as a transport service, a passenger will not likely reuse this line or its inverted line (see Definition 3) in his/her subsequent travel process. In other words, when , we have and , where and .
Property 3. In reality, a passenger is not likely to travel an itinerary that goes through a specific node twice. Therefore, we have , where , and .
Definition 3. With regard to a specific line , there usually exists an inverted line that runs on almost the same road segments of but in inverted directions. is also the inverted line of ; that is, . Intuitively, is the inverted line of in the USPT network shown in Figure 1 ().
Take the USPT network of Figure 1 as an example, whose associated timetables are provided in Table 1. A passenger arrives at at 6:10, waits for 4 minutes, takes the first available vehicle trip of towards , arrives at 6:25, walks to using , waits for 3 minutes, boards the vehicle on the 5th trip of at 6:30, and finally arrives at at 6:45. This itinerary is represented by and consumes 35 minutes in total; thus, .
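The timing rules of this section can be sketched in Python and checked against the 35-minute example above. This is an illustrative sketch, not the paper's formulas: times are minutes after 6:00, the function names and the tuple layout of a t-arc are hypothetical, and the timetable entries are invented except for the values that can be read off the example (boarding at 6:14 and 6:30, arrivals at 6:25 and 6:45, the second boarding being the 5th trip, and a 2-minute walk inferred from 6:25 plus 3 minutes of waiting before 6:30). It also assumes that vehicle trips of a line do not overtake one another.

```python
from bisect import bisect_left


def time_walk(minutes, t0):
    """Walking t-arc: fixed travel time. Returns (travel_time, arrival_time)."""
    return minutes, t0 + minutes


def time_line(dep, arr, t0):
    """Line t-arc: board the first vehicle trip departing at or after t0.

    dep[k] and arr[k] are the k-th trip's departure time at the tail node
    and arrival time at the head node, both sorted in ascending order.
    Returns (travel_time, arrival_time), or None if no trip remains.
    """
    k = bisect_left(dep, t0)
    if k == len(dep):
        return None
    # travel time = waiting (dep[k] - t0) + in-vehicle (arr[k] - dep[k])
    return arr[k] - t0, arr[k]


def time_itinerary(t_arcs, t0):
    """Chain t-arcs: each initial time equals the upstream arrival time."""
    t = t0
    for arc in t_arcs:
        if arc[0] == "walk":
            result = time_walk(arc[1], t)
        else:
            result = time_line(arc[1], arc[2], t)
        if result is None:
            return None
        t = result[1]
    return t - t0


example = [
    ("line", [4, 14, 24], [15, 25, 35]),                 # board 6:14, arrive 6:25
    ("walk", 2),                                         # 6:25 to 6:27
    ("line", [0, 10, 20, 25, 30], [15, 25, 35, 40, 45]), # 5th trip: 6:30 to 6:45
]
total = time_itinerary(example, 10)  # passenger arrives at the origin at 6:10
```

Arriving at the first stop at 6:15 instead of 6:14 illustrates a "just missing" case: `time_line` then returns the 6:24 trip, and the whole downstream chain shifts.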
2.3. Problem Formulation
In any specific USPT network, a passenger decides to travel from an origin to a destination , at an initial time . The problem is determining a connected itinerary among the large volume of available choices that requires a minimum of travel time. This can be mathematically formulated as follows:
The travel time of the t-arc traversed by walking is fixed, while that traversed by a line is time dependent, thus leading to a time-dependent rather than static USPT network. The LTIP-USPT pertains to the least-time itinerary planning problem in a bimodal, time-dependent scheduled network. The traditional shortest path algorithms do not apply. Through the adaptation of the A* algorithm, Section 3 develops an ad hoc Floyd-A* algorithm to address the LTIP-USPT. The following hypotheses are assumed.
(1) Line vehicles run punctually.
(2) The vehicle capacities are infinite.
(3) The road network is noncongested.
(4) The vehicle departs immediately after arriving at a specific node.
(5) The origins and destinations are all located just at nodes.
(6) One walking distance cannot be greater than the tolerable upper bound .
The findings of this research can be widely used; they can assist passengers in arranging their travel and be integrated into traffic assignment models. They can also verify the accessibility of a USPT network and help in the design of timetables, contributing both theoretically and practically.
3. Floyd-A* Algorithm for LTIP-USPT
To solve the LTIP-USPT efficiently, an ad hoc Floyd-A* algorithm is developed that is composed of two procedures, that is, an A*-based Itinerary Finder and a Floyd-based Cost Estimator. The basic scheme of the Floyd-A* algorithm is shown in Figure 3.
The Cost Estimator precalculates the estimated travel times of itineraries between any two nodes in a slacked USPT network, where each static arc travel time is a lower bound of the associated time-dependent actual travel time. These values are stored in Table H. This is accomplished by a Floyd-based algorithm , which is a well-known all-to-all shortest paths algorithm. Once complete, the Cost Estimator is no longer required unless there is an update to the USPT network. The A*-based Itinerary Finder uses the Table H obtained by the Cost Estimator as heuristic information to determine the least-time itinerary. When a traveler inputs a triad of , only the Itinerary Finder conducts a real-time computation. These two procedures are expounded in detail in Sections 3.1 and 3.2, respectively. Section 3.3 mathematically proves the admissibility of the algorithm and analyzes its computing efficiency by comparing it with the Plain-A* and Dijkstra-like procedures.
Remark 4. Speed-up technologies such as “Avoiding Binary Search” and “Further Speedup When Modeling with Train Routes” discussed in the work of Pyrga et al.  may further contribute to a higher efficiency. However, this paper only concerns a more efficient A*-based search (also known as goal-directed search), which could coexist with other speed-up technologies to further speed up computing.
3.1. Least-Time Itinerary-Finder Procedure
Because some readers may not be familiar with the A* algorithm, the searching process is explained in detail. Given a triad of origin, destination, and initial time to determine an LT itinerary , the Itinerary Finder expands promising origin-rooted partial itineraries (partial itinerary for short) in a node-to-node manner. Beginning with , each successor is expanded by searching for each t-arc in the first round. Each of these t-arcs (partial itineraries) may contribute to the LT itinerary. During the second round, we must determine which terminal node of partial itinerary among several candidates is the most promising one.
Let each node be associated with a state denoted by state(). There are three states of node .
(1) NEW: node has not been expanded up to now.
(2) OPEN: node has been expanded and acts as a candidate to expand to another node in the next searching process. That is to say, for each partial itinerary thus far, state() = OPEN.
(3) CLOSED: node has been expanded and has already been expanded to another node. In other words, for any node that has been passed through by any current partial itinerary , state() = CLOSED.
As defined above, the nodes associated with the state OPEN are candidates for expanding partial itineraries. For convenience, we use relative time rather than absolute time hereafter. Using the USPT network of Figure 1 as an example, let , and (minutes after 6:00). Figure 4 combined with Table 2 shows part of the searching process.
The node to be CLOSED in the next searching round; the node whose labels are updated in the searching round.
We first initialize the state of origin as OPEN and others as NEW by default (see Figure 4(a)). In the first expansion round (see Figure 4(b)), , , and are expanded by searching for t-arcs , , , and . At the same time, becomes CLOSED; , , and turn from NEW to OPEN. The next paragraph shows that is the most promising partial itinerary and ; thus, we should continue the second expansion round for the most promising node , and only is expanded by searching for a t-arc . This time, becomes CLOSED and turns to OPEN (see Figure 4(c)). The searching process continues by similar means. Note that the state of a node may turn from NEW to OPEN, from OPEN to CLOSED, or remain the same. However, a CLOSED node can never re-OPEN (see Theorem 10); for example, is searched in the 3rd searching round (see Figure 4(d)), but the state of unconditionally remains CLOSED.
The exposition above focuses on the changing states of nodes during the expansion of partial itineraries. To determine the most promising OPEN node among several candidates, is defined as the estimated travel time of an LT itinerary . For each partial itinerary , the terminal node(s) whose is/are the minimum one(s) among those of all OPEN nodes is/are identified as the most promising one(s). If there is more than one, the first expanded one may be chosen.
The actual travel time of the LT itinerary can be the summation of two parts calculated as
However, it is difficult to calculate and in real-time within an acceptable computing time. Because of the time-dependence factor, they are not able to be precalculated and stored as fixed values. This is a different situation from a static network context. Therefore, and are defined to estimate them, respectively. is their summation, calculated as
The A*-based Itinerary Finder utilizes the minimum travel time of the partial itinerary , determined to this point as ; the strategy for estimating will be addressed in Section 3.2. To illustrate the process for selecting the most promising node, we again use the USPT network of Figure 1 as an example. A traveler first predetermines . In the first searching round (see Figure 4(b)), , , , and can be easily determined with Formula (4). One can easily determine that , , and . As for the heuristic information yielded by the Cost Estimator, , , and . Thus, . Similarly, we have and . Dijkstra-based approaches only consider the performances of origin-rooted partial itineraries and thus identify as the most promising partial itinerary due to . The A*-based approaches, however, are goal-directed by the heuristics, and the Itinerary Finder selects as a result of . Similarly, the second searching round selects the terminal node of , that is, , as the most promising node, and so forth. Note that in the 3rd searching round, the t-arc is searched. We have calculated by Formula (4), meaning that of OPEN node could potentially be updated to . However, because the previous value of is 12 and , the value of is not updated but remains 12. In other words, the partial itinerary dominates , referring to the partial itinerary from to . The destination is also expanded in this searching round; the associated state turns to OPEN. The searching process will continue, however, until state() = CLOSED.
By recursively expanding, comparing, and selecting promising partial itineraries, the algorithm proceeds until the state of the destination turns to CLOSED, at which point it terminates. If an algorithm is guaranteed to determine an optimal itinerary from origin to destination, we designate it as admissible. The Itinerary Finder is proven to be admissible in Section 3.3. The Itinerary Finder places OPEN nodes in an OPEN list and CLOSED nodes in a CLOSED list. A node that is in neither the OPEN nor the CLOSED list is regarded as NEW by default. Summarizing the above analysis, the outline of the Itinerary Finder is presented in Algorithm 1.
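The search scheme described above can be sketched as a time-dependent A* in Python. This is a simplified illustration and not a reproduction of Algorithm 1: it keeps one label per node, omits Properties 1 to 3 (no adjacent walks, no line reuse, no repeated nodes), and assumes trips of a line do not overtake one another. The arc dictionaries, the names `arrival` and `itinerary_finder`, the toy network, and the heuristic values `h` (worked out by hand as static lower bounds for this toy network) are all hypothetical.

```python
import heapq
from bisect import bisect_left


def arrival(arc, t):
    """Earliest arrival time over an arc entered at absolute time t, or None."""
    if arc["kind"] == "walk":
        return t + arc["minutes"]
    k = bisect_left(arc["dep"], t)          # first trip departing at or after t
    return arc["arr"][k] if k < len(arc["dep"]) else None


def itinerary_finder(arcs, h, origin, dest, t0):
    """Return (least travel time, node sequence), or None if dest is unreachable.

    arcs: {tail: [(head, arc), ...]}; h: {node: lower bound on time to dest}.
    """
    g = {origin: 0}                         # best travel time found so far
    pre = {}                                # predecessor of each labelled node
    open_heap = [(h[origin], origin)]       # OPEN list ordered by f = g + h
    closed = set()                          # CLOSED list
    while open_heap:
        _, i = heapq.heappop(open_heap)
        if i in closed:
            continue                        # stale entry of an already CLOSED node
        closed.add(i)
        if i == dest:
            break                           # terminate once the destination is CLOSED
        for j, arc in arcs.get(i, []):
            if j in closed:
                continue                    # a CLOSED node is never re-OPENed
            t_arr = arrival(arc, t0 + g[i])
            if t_arr is None:
                continue
            if t_arr - t0 < g.get(j, float("inf")):
                g[j] = t_arr - t0
                pre[j] = i
                heapq.heappush(open_heap, (g[j] + h[j], j))
    if dest not in closed:
        return None
    path = [dest]
    while path[-1] != origin:
        path.append(pre[path[-1]])
    return g[dest], path[::-1]


# Toy network (hypothetical), times in minutes after the start of service.
arcs = {
    "o": [("a", {"kind": "line", "dep": [0, 10], "arr": [5, 15]}),
          ("b", {"kind": "walk", "minutes": 3})],
    "a": [("d", {"kind": "line", "dep": [6, 20], "arr": [16, 30]})],
    "b": [("d", {"kind": "line", "dep": [5], "arr": [25]})],
}
h = {"o": 15, "a": 10, "b": 20, "d": 0}     # static lower bounds to "d"
at_0 = itinerary_finder(arcs, h, "o", "d", 0)
at_1 = itinerary_finder(arcs, h, "o", "d", 1)
```

In this toy network, starting one minute later makes the traveler just miss the first trip towards `a`, so the least-time itinerary switches to the walking alternative via `b`; this mirrors the time-dependent nature of the problem.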
If more detailed information (e.g., waiting time, in-vehicle time) is required with respect to a specific arc, this can be obtained by simply adding to pre() in the associated iteration.
3.2. Cost-Estimator Procedure for a Tighter Lower Bound
The Itinerary-Finder procedure must be well informed when making a choice to expand partial itineraries. Expanding an unlikely part of an LT itinerary is a waste of computational time, while missing a promising partial itinerary may lead to a failure in determining the LT itinerary. Therefore, the strategy of estimating the travel time of a destination-ended partial itinerary is viewed as the key to improving the efficiency of the Itinerary Finder. Meanwhile, the estimated travel time must be a lower bound of the real travel time. Note that a tighter lower bound results in higher efficiency.
The travel time of an itinerary is composed of the travel time during walking between two nodes, waiting at nodes for a transfer, and traveling in vehicles. The walking time between two specific nodes is fixed. The waiting time varies in different cases. If fortunate, a traveler can transfer without waiting time. The in-vehicle time depends on the timetable of different lines combined with their different vehicle trips. This paper proposes a strategy to estimate the travel time between two nodes as a tight static lower bound of this real time-dependent value. The basic concept is shown by generating an associated slacked network (see Definition 5) of the USPT network; the minimum travel time of itinerary in this SUSPT network is the associated estimated value in the USPT network.
Definition 5. A slacked USPT network (SUSPT network for short) is defined to share the same topological structure as the USPT network. However, each arc of the SUSPT network is assigned a static travel time as a lower bound of the associated real travel time of the arc in the USPT network. The arc in the SUSPT network is timed by explicitly slacking the associated real travel time by using the following 3 rules. Figure 5 shows the associated SUSPT network of the USPT network of Figure 1.
Rule 1. Walking times remain the same.
Rule 2. Ignore all waiting times.
Rule 3. Let the minimum travel time among those traversed by different lines combined with different vehicle trips between two specific nodes be the estimated travel time.
Obviously, there exist no temporal concepts in the static SUSPT network; therefore, let each initial time of t-arc in the SUSPT network be nil. Let denote the operator to time the t-arc in the SUSPT network. Rule 1 can be reflected in Formula (11). Rules 2 and 3 are interpreted in Formula (12):
Any typical all-to-all shortest path algorithm is suitable for this problem. This paper uses the well-known Floyd algorithm. We assume that readers are already familiar with it, so no detailed exposition is given here.
The outline of the procedure is shown in Algorithm 2.
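As a hedged sketch of this procedure (not a transcription of Algorithm 2), the snippet below applies Rules 1 to 3 to build static arc times and then runs the standard Floyd all-to-all shortest path recursion. The arc dictionary layout, the names `slack_time` and `cost_estimator`, and the toy network are hypothetical.

```python
INF = float("inf")


def slack_time(arc):
    """Static lower bound of one arc's travel time."""
    if arc["kind"] == "walk":
        return arc["minutes"]                                   # Rule 1
    # Rule 2: ignore waiting; Rule 3: minimum in-vehicle time over all trips.
    return min(a - d for d, a in zip(arc["dep"], arc["arr"]))


def cost_estimator(nodes, arcs):
    """Return H with H[i][j] = least static travel time from i to j.

    arcs: list of (tail, head, arc); parallel arcs (different lines between
    the same two nodes) are reduced to their minimum, as in Rule 3.
    """
    H = {i: {j: (0 if i == j else INF) for j in nodes} for i in nodes}
    for i, j, arc in arcs:
        H[i][j] = min(H[i][j], slack_time(arc))
    for k in nodes:                                             # Floyd recursion
        for i in nodes:
            for j in nodes:
                if H[i][k] + H[k][j] < H[i][j]:
                    H[i][j] = H[i][k] + H[k][j]
    return H


nodes = ["o", "a", "b", "d"]                                    # toy network
arcs = [
    ("o", "a", {"kind": "line", "dep": [0, 10], "arr": [5, 15]}),
    ("a", "d", {"kind": "line", "dep": [6, 20], "arr": [16, 30]}),
    ("o", "b", {"kind": "walk", "minutes": 3}),
    ("b", "d", {"kind": "line", "dep": [5], "arr": [25]}),
]
H = cost_estimator(nodes, arcs)
```

The column `H[i]["d"]` for a fixed destination is what a goal-directed search would look up as heuristic information; since it only depends on the network and timetables, it can be computed once offline.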
Remark 6. In this work, the Itinerary Finder obtains from table output by the Cost Estimator, while the previous related works substituted with Formula (13), which was calculated online. The strategy proposed in this work is proven to generate a tighter lower bound and thus makes the Floyd-A* algorithm more efficient both in theory and in computational experiments (see Sections 3.3, 4.1, and 4.2):
3.3. Admissibility and Efficiency Analysis
The admissibility and efficiency of the Floyd-A* algorithm are discussed in this section. Hart et al.  established how to determine the admissibility of an A* algorithm, which is primarily affected by the travel time estimating strategy of the destination-ended partial itineraries shown in Lemma 7. On this basis, Theorem 8 establishes the admissibility of the Itinerary Finder.
Lemma 7. If , then A* is admissible .
Theorem 8. The Cost Estimator guarantees that the Itinerary Finder is admissible.
Proof. The Itinerary Finder is A*-based, where is computed by the Cost Estimator. By Lemma 7, proving Theorem 8 is equivalent to proving that the Cost Estimator guarantees each .
Let be a destination-ended partial itinerary of the actual LT itinerary . Therefore,
denotes the LT itinerary in the SUSPT network. Note that the topological structures of and are not necessarily the same.
If , then for any -arc , ;
In contrast, can be calculated by Formula (4) combined with Formula (5); that is, where
In summary, for any -arc , .
Thus, Theorem 8 is proven.
We say that the consistency assumption holds for the Itinerary Finder if Inequality (20) is satisfied. This assumption explains why the Itinerary Finder never re-OPENs a CLOSED node, as established in Theorem 10:
Lemma 9. If the consistency assumption is satisfied, A* never needs to re-OPEN a CLOSED node.
Theorem 10. The Cost Estimator ensures that the Itinerary Finder never needs to re-OPEN a CLOSED node.
Proof. By Lemma 9, proving Theorem 10 is equivalent to proving that the estimating strategy proposed in the Cost Estimator satisfies the consistency assumption:
One can prove that . Thus, In other words, the consistency assumption is satisfied. Theorem 10 is thus proven.
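The consistency argument can be checked numerically on a toy network. The sketch below assumes the standard A* form of the consistency condition, h(i) ≤ c(i, j) + h(j) on every arc, which may differ in notation from Inequality (20); the network, the 3-minute waiting penalty, and all function names are hypothetical.

```python
import math

def least_static_times(nodes, lower):
    """Floyd algorithm over static lower-bound arc times `lower`."""
    d = {(i, j): 0.0 if i == j else lower.get((i, j), math.inf)
         for i in nodes for j in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                d[i, j] = min(d[i, j], d[i, k] + d[k, j])
    return d

def is_consistent(h, arc_costs):
    """True if h[i] <= cost(i, j) + h[j] holds on every arc."""
    return all(h[i] <= c + h[j] for (i, j), c in arc_costs.items())

nodes = ["o", "a", "b", "d"]
lower = {("o", "a"): 5.0, ("a", "d"): 5.0, ("o", "b"): 2.0, ("b", "d"): 12.0}
d = least_static_times(nodes, lower)
h = {n: d[n, "d"] for n in nodes}  # estimates toward destination "d"

# Real arc times are never below their lower bounds (waiting only adds time).
real = {arc: c + 3.0 for arc, c in lower.items()}

print(is_consistent(h, lower), is_consistent(h, real))  # True True
h_bad = dict(h, o=h["o"] + 10.0)  # an overestimate at the origin
print(is_consistent(h_bad, lower))  # False
```

The check passes for the real arc times as well, because adding a nonnegative waiting time to an arc can only loosen the inequality.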
Previous related works developed two variants of the Itinerary-Finder procedure: the Plain-A* procedure, in which the estimate is calculated by Formula (13), and the Dijkstra-like procedure, in which the estimate is replaced by the constant 0. Similarly, it is not difficult to prove that the Dijkstra-like and Plain-A* procedures both satisfy the consistency assumption and are thus admissible. We compare the Floyd-A* procedure with these two procedures as follows.
Lemma 11. Consider the set of lower bounds satisfying the consistency assumption. If a node is selected by the A* algorithm for a given lower bound, then this node will be selected by the A* algorithm using any smaller lower bound.
Theorem 12. .
Proof. To prove Theorem 12, one can equivalently prove that the estimated travel time values of Floyd-A*, Plain-A*, and Dijkstra-like are each a lower bound of the real cost that is no smaller than the next; that is, .
Theorem 8 has proved , and it is obvious that because and are both positive.
We therefore need only prove that where .
can be calculated as .
With regard to , the numerator is obviously not greater than the real distance of any itinerary from to , and the denominator is not less than any velocity observed by walking, bus, and metro. Then,
Thus, the theorem is proven.
Corollary 13. .
Corollary 13 follows directly from Theorem 12 together with Lemma 11: the total numbers of nodes expanded by the Dijkstra-like, Plain-A*, and Floyd-A* algorithms are each no less than the next. Correspondingly, their efficiency increases in that order.
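The effect of a tighter lower bound on the number of expanded nodes can be seen in a minimal time-dependent A* sketch. Everything here is an assumption for illustration: vehicles are taken to depart at integer multiples of a fixed headway, the arc data are made up, and, for brevity, the static lower bounds toward a single destination are computed by repeated relaxation rather than by the full Floyd table used in the paper.

```python
import heapq
import math

def arrival(t, ride, headway):
    """Arrival time when boarding the first departure at or after time t.

    Assumption: vehicles leave at integer multiples of `headway` minutes.
    """
    depart = -(-t // headway) * headway  # round t up to the next departure
    return depart + ride

def lower_bounds_to(arcs, dest):
    """Static lower bounds to `dest`: waiting ignored, ride times only."""
    nodes = {n for arc in arcs for n in arc}
    h = {n: math.inf for n in nodes}
    h[dest] = 0
    for _ in range(len(nodes) - 1):  # repeated relaxation until stable
        for (i, j), (ride, _headway) in arcs.items():
            h[i] = min(h[i], ride + h[j])
    return h

def a_star(arcs, origin, dest, t0, h):
    """Return (arrival time at dest, number of expanded nodes)."""
    open_heap = [(t0 + h[origin], t0, origin)]
    closed = set()
    expanded = 0
    while open_heap:
        _, t, node = heapq.heappop(open_heap)
        if node in closed:
            continue
        if node == dest:
            return t, expanded
        closed.add(node)
        expanded += 1
        for (i, j), (ride, headway) in arcs.items():
            if i == node and j not in closed:
                tj = arrival(t, ride, headway)
                heapq.heappush(open_heap, (tj + h[j], tj, j))
    return math.inf, expanded

# (ride minutes, headway minutes) per arc; all values are made up.
arcs = {("o", "a"): (5, 10), ("a", "d"): (5, 10),
        ("o", "x"): (2, 5), ("x", "y"): (2, 5), ("y", "d"): (20, 5)}
h_static = lower_bounds_to(arcs, "d")
h_zero = {n: 0 for n in h_static}  # Dijkstra-like: estimate is always 0

print(a_star(arcs, "o", "d", 0, h_zero))    # (15, 4)
print(a_star(arcs, "o", "d", 0, h_static))  # (15, 2)
```

Both runs return the same arrival time, but the search guided by the static lower bounds never expands the detour through `x` and `y`.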
4. Example Illustration and Analysis
A numerical example and a real-world USPT network instance are presented to demonstrate the suitability and efficiency of the proposed Floyd-A* algorithm, as well as its instructive significance for travelers. The experiments are composed of four parts. Section 4.1 shows the efficiency of the Floyd-A* algorithm through a numerical example, in comparison with the two conventional procedures, that is, the Dijkstra-like and Plain-A* procedures. Section 4.2 tests a real-world instance to demonstrate the applicability and efficiency of the Floyd-A* algorithm when solving a large-scale network instance. The experiments on the time-dependent nature of the least-time itinerary and on the “just missing” and “just boarding” cases are presented in Sections 4.3 and 4.4, respectively. The experiments ran in a MATLAB environment on an HP Compaq 8280 Elite CMT PC with an Intel Core i5-2400 CPU @ 3.1 GHz and 4 GB of memory (RAM).
The USPT network of the numerical example shown in Figure 7 is formed by 30 nodes and 103 arcs. There are 10 lines, including 2 metro lines and 8 bus lines, where ; 10 corresponding timetables are also provided. Node is traversed by , , and ; each of another ten nodes is simultaneously traversed by 2 lines. Specific data are omitted due to the limited space.
4.1. High Efficiency of the Floyd-A* Algorithm
An itinerary planning assistant must determine the LT itinerary through real-time querying, so efficiency is critical. For testing, 1000 triads of origin, destination, and initial time are randomly generated, with the distance between each origin and destination pair no less than 5000 meters; all retained pairs are connectable. Note that, during generation, cases occurred in which no itinerary from the origin to the destination was found; the Floyd-A* algorithm can therefore also be used to verify the connectivity of a USPT network. Given each triad, each of the three procedures (i.e., Floyd-A*, Plain-A*, and Dijkstra-like, introduced in Section 3.3) is used to solve the LTIP-USPT. The Dijkstra-like and Plain-A* procedures are traditional methods for solving these types of problems. We use two indicators: the average running time a procedure needs to calculate the LT itinerary, and the average number of node expansions during the search. Using the performance of the Dijkstra-like procedure as a reference, the relative reductions of the two indicators are shown in Table 3 as well. Because the three procedures are all admissible, the results output by the different procedures for a given instance are exactly the same. We learn from Table 3 that the Floyd-A* procedure reduced the running time by 33.3% and the number of node expansions by 61.58% compared with the Dijkstra-like procedure, while the corresponding savings of the Plain-A* procedure were 12.84% and 25.34%. The Floyd-A* procedure proposed in this paper is thus more efficient than both the Plain-A* and the Dijkstra-like conventional procedures.
4.2. Applicability of the Floyd-A* Algorithm for Real-World Instance
To verify the applicability and efficiency of the Floyd-A* procedure in a real-world network, we implement and test the three procedures in a Visual Studio 2010 environment on the aforementioned PC, using the real-world public transport data of Shenyang City, the central city of northeastern China. The main urban zone of Shenyang City covers more than 700 square kilometers and had a population of more than 5 million as of 2010. There are 446 directed USPT lines in total, comprising 2 metro lines and 444 bus lines. The modeled Shenyang City USPT network (within the main urban zone) consists of 2812 nodes (after aggregation) and 184178 arcs. As in the experiments of Section 4.1, 1000 triads of origin, destination, and initial time are randomly generated; the performances are shown in Table 4. The Floyd-A* procedure solves the real-world LTIP-USPT more efficiently: it reduces the average running time by 63.9% compared with the Dijkstra-like procedure. Therefore, we conclude that, in terms of efficiency, the Floyd-A* procedure is significantly superior to the previous related work, that is, both the Plain-A* and the Dijkstra-like procedures.
In reality, faced with such a large network, it is difficult for local citizens and tourists to determine an optimal itinerary without an itinerary planning system. To benefit travelers, the Floyd-A* algorithm module is implemented and embedded in a Shenyang City Public Transport Query System, shown in Figure 8. The system is implemented in a Visual Studio 2010 environment, combined with the geographic information system TransCAD. For example, when a traveler wants the least-time itinerary from the Bainaohui Stop to the Wanquan Park Stop given the initial time 9:10, the system returns the following solution: the traveler spends 24 minutes (including in-vehicle time and waiting time) traveling from the Bainaohui Stop to the EPA Stop by Line 222, walks 1 minute to another EPA Stop, and finally arrives at the Wanquan Park Stop by Line 118 in 15 minutes. Note that the two EPA Stops are geographically distinct but close. Figure 8 shows the interface and the LT itinerary of this example. Numerous experiments indicate that the system is applicable and efficient, which suggests that Floyd-A* can potentially be used for LT itinerary planning in many large-scale real-world USPT networks. It also has the potential to be applied in an interurban context, provided that all services are schedule-based.
4.3. Time-Dependent Nature
In a static public transport network that does not consider a timetable, the optimal itinerary (also referred to as a path) for a given origin and destination pair is determined by objectives such as the least transfer time and the lowest financial expense. In other words, the solution does not depend on the departure time. The situation is different when a timetable is considered.
In the case of a specified origin and destination, when given different initial time , the proposed computation method returns a different LT itinerary and corresponding travel time . Using the USPT network of Figure 7 as an example, and are predetermined; when given a different initial time, for example, and , the itinerary and itinerary found with the Itinerary-Finder procedure are LT itineraries in these two cases, respectively. These results are shown in Figure 9, where the horizontal axis represents the time of day and the vertical axis represents the accumulated travel distance of the itinerary. The circles represent nodes and the links are explained in the legend. It is not difficult to see that the slope of the link represents the corresponding velocity and the curve must be monotonically increasing:
costs 33.1 minutes, traveling 12336 meters; the itinerary costs 37.6 minutes, traveling 11974 meters.
The spatial itinerary is defined as an itinerary with the temporal factors deleted. The spatial itineraries of itineraries and are represented as and , respectively, as shown in Figure 10. If we neglect the waiting time at transfer, the static itinerary intuitively appears more likely to cost less time than because about half the distance of is traversed by the metro, which is much faster than a bus, and the total distances of the two itineraries are very close. How can sometimes cost less time than , for example, when ?
To answer this question, the corresponding itineraries of and , both given an initial time of 60, are compared in Figure 11. The waiting time of the former itinerary is 3.7 minutes longer than that of the latter, while its total travel time is only 2.4 minutes longer. Similar results can be found in other cases. Therefore, we conclude that the waiting times at transfers, which are determined by the complex timetables, are variable and almost uncontrollable, and that they are the primary cause of the time-dependent nature of an LT itinerary in a USPT network. Obviously, these results could not be determined without considering timetables.
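The role of waiting time can be reproduced with a small sketch that evaluates two fixed spatial itineraries at different initial times. The timetable model (a first departure plus a constant headway), the leg encoding, and all numbers are assumptions for illustration and are not taken from the example of Figure 7.

```python
def ride_arrival(t, first, headway, ride):
    """Arrival time when boarding the first vehicle departing at or after t."""
    if t <= first:
        depart = first
    else:
        # Round the elapsed time up to a whole number of headways.
        depart = first + -(-(t - first) // headway) * headway
    return depart + ride

def travel_time(itinerary, t0):
    """Total travel time (waiting included) of a fixed spatial itinerary."""
    t = t0
    for leg in itinerary:
        if leg[0] == "walk":
            t += leg[1]
        else:  # ("ride", first departure, headway, in-vehicle time)
            _, first, headway, ride = leg
            t = ride_arrival(t, first, headway, ride)
    return t - t0

# Fast but infrequent metro (after a short walk) versus a slow, frequent bus.
metro = [("walk", 2), ("ride", 0, 12, 20)]
bus = [("ride", 0, 4, 26)]

for t0 in (10, 11):
    print(t0, travel_time(metro, t0), travel_time(bus, t0))
# 10 22 28  -> the metro itinerary is faster
# 11 33 27  -> one minute later, the bus itinerary is faster
```

Ignoring waiting, the metro itinerary always takes 22 minutes against 26 for the bus; only the timetable makes the ranking depend on the initial time.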
4.4. Just Missing and Just Boarding
Recall that the USPT lines are assumed to run punctually. Under this premise, the phenomena of “just missing” and “just boarding” can be evaluated with the proposed algorithm. Again, we let and