Abstract

The Train Design Optimization Problem regards making optimal decisions on the number and movement of locomotives and crews through a railway network, so as to satisfy requested pick-up and delivery of car blocks at stations. In a mathematical programming formulation, the objective function to minimize is composed of the costs associated with the movement of locomotives and cars, the loading/unloading operations, the number of locomotives, and the crews’ return to their departure stations. The constraints include upper bounds for number of car blocks per locomotive, number of car block swaps, and number of locomotives passing through railroad segments. We propose here a heuristic method to solve this highly combinatorial problem in two steps. The first one finds an initial, feasible solution by means of an ad hoc algorithm. The second step uses the simulated annealing concept to improve the initial solution, followed by a procedure aiming to further reduce the number of needed locomotives. We show that our results are competitive with those found in the literature.

1. Introduction

A wide range of investigations has been undertaken to develop optimization algorithms for various problems encountered in rail systems, such as routing [1], scheduling [2–4], crew assignment [5, 6], and blocking [7, 8], among others [9–12]. However, to our knowledge, the Train Design Optimization Problem as defined below is the only attempt to deal simultaneously with block-to-train assignment, train routing, and crew assignment [13]. It arises as one of the most fundamental and difficult combinatorial optimization problems formulated in the railroad industry [13–17], with a huge potential to benefit from the application of operations research techniques.

In rail systems, a railroad car, railcar, or train car (car, for short), is a vehicle used for the carrying of cargo. Such cars, when coupled together and hauled by one or more locomotives, form a train. A car block (block, for short) is a semipermanently arranged formation of cars. Trains are then built of one or more blocks coupled together as needed. Also, in the operation of rail systems a block swap can occur, namely, when a locomotive delivers a block at a station distinct from the block’s destination, and another locomotive picks up the block afterwards. Each time a train stops en route at a station to pick up and/or to deliver blocks, a work-event takes place.

In this paper we address the Train Design Optimization Problem (TDOP) arising in the operation of railroad freight transport as part of the logistics chain. It consists in determining, at minimal total cost, and subject to capacity and operational constraints, the number of locomotives and crews, together with the logistics related to the movement of locomotives, blocks, and crews through a railway network, so as to transport goods from a set of shippers to a set of destinations.

The total cost depends on the number of assigned locomotives, the distance traveled by locomotives and cars, the distance traveled by crews to return to their departure stations, the number of block swaps and work-events, the number of blocks not arriving at their destination, and the difference between the number of locomotives arriving at and departing from stations. The constraints include the maximum number of blocks per locomotive, the maximum number of block swaps, the maximum number of work-events per locomotive, the maximum length, weight, and number of trains passing through a railroad segment, and crew limitations as to the route to follow.

Only a handful of approaches have been developed for the solution of the TDOP. In [18] a mixed integer programming model is proposed; block routes are first generated together with train routes to cover them, then a matching is sought to minimize the objective function, and finally several greedy and local search rules are iteratively invoked to update the block-paths and train routes so as to improve the solution quality. In [19, 20] the problem is formulated as one of integer programming where the number of variables and constraints is exponential, proposing for its solution a column-row generation heuristic followed by clever tabu search methods. An iterative procedure is suggested in [21] to solve two subproblems of the TDOP: train design and block-to-train assignment, where the former consists in determining train routes to be operated, and the latter deals with the routing decision for blocks; both subproblems are solved by integer programming techniques. In [22, 23] a column-generation approach is designed: first, a set of promising train routes is generated based on the crew segments; then an integer linear programming model is developed for the subsequent decisions including train route selection and block-to-train assignment.

We propose hereafter a method to heuristically solve the TDOP in two steps. By means of an ad hoc procedure the first step aims to produce an initial feasible solution, namely, a solution satisfying all the constraints. The second step uses the simulated annealing method to improve the initial solution, followed by simple, specialized procedures that attempt to reduce the number of needed trains without increasing the overall cost. To test our proposal we solved the only three published instances to date, finding it competitive with other approaches in the literature. Also, we randomly generated 20 synthetic instances as well as (improvable) lower bounds for their corresponding optimal solutions; our heuristic came up with results averaging an error close to 25% on these bounds.

The rest of the paper is organized as follows: Section 2 provides the necessary terminology and makes a detailed description of constraints and objective function of the TDOP; a toy example is also furnished to help understand the problem. In Section 3 the TDOP is formulated in mathematical programming terms. Our solution approach—basically combining an ad hoc algorithm to find an initial feasible solution followed by the metaheuristic known as simulated annealing—is explained in Section 4. Further, Section 5 is devoted to the computational experiments that we conducted to test our procedures; for this experimentation we used the three available known instances as well as a new set of 20 randomly generated ones. Finally, Section 6 presents our conclusions and a proposal for future work.

2. Problem Description

This section describes the TDOP, borrowing some terminology from [20]. An alternative description can be found in [13]. The TDOP terminology, constraints, and objective function are provided in Section 2.1. To help in grasping our description, a toy instance of the TDOP is presented in Example 1, followed by one of its feasible solutions in Section 2.2.

2.1. Terminology, Constraints, and Objective Function

Terminology
(i) A block is a formation of cars sharing origin and destination. Trains are built of one or more blocks coupled together as needed.
(ii) A train is a locomotive carrying or not carrying blocks.
(iii) A block swap occurs when a block is moved from one train to another.
(iv) A block-path is a sequence of railroad segments through which a block can be feasibly routed from its origin to its destination.
(v) A work-event occurs each time a train stops en route at a station to pick up and/or deliver blocks. The train stop at its destination (or origin) station is not considered as a work-event.
(vi) A crew segment is a minimal length route between two stations, called end points, on which crews operate trains in either direction.
(vii) The crew imbalance on a crew segment is the absolute difference between the number of crews traversing it from one end point to the other and the number of crews traversing it in the opposite direction.
(viii) The train imbalance at a station is the absolute difference between the number of trains originating at that station and the number of trains terminating there.
(ix) A car is missed if it is not transported from its origin to its destination.

Constraints
(i) Blocks per train: trains are constrained by an upper bound on the number of blocks they carry.
(ii) Swaps per block: each block is constrained by an upper bound on the number of times it can be swapped.
(iii) End points: crews must start and end traveling at end points.
(iv) Crew-to-train assignment: every train has to be assigned to a crew on each crew segment, and it must originate and terminate at the end points of a crew segment, even if the train has to move part of the way along a crew segment not carrying any blocks (the entire train routes should be decomposed in crew segments).
(v) Upper bounds on segments: railroad segments are constrained by upper bounds on the number, length, and weight of trains traversing them in either direction.

The objective of the TDOP is to minimize the sum of eight components:
(1) Locomotive cost: product of the number of scheduled locomotives and the unit locomotive cost.
(2) Locomotive travel cost: product of the total miles traveled by all scheduled locomotives and the per mile locomotive travel cost.
(3) Work-event cost: product of the total number of work-events of all scheduled trains and the unit work-event cost.
(4) Car travel cost: product of the total miles traveled by all cars and the per mile car travel cost.
(5) Block swap cost: each station has its own unit block swap cost; the (total) block swap cost of a given block is the sum of the unit block swap costs of the stations where the block is swapped.
(6) Crew imbalance cost: product of the number of all crew imbalances and the unit crew imbalance cost.
(7) Train imbalance cost: product of the number of all train imbalances and the unit train imbalance cost.
(8) Missed car cost: product of the total number of missed cars and the cost per missed car.
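The eight components above can be added up mechanically once a solution has been summarized. The following is a minimal Python sketch of that sum; the names `UnitCosts`, `SolutionSummary`, and `total_cost`, as well as every numeric value in the usage lines, are hypothetical and are not taken from the instance data of this paper.

```python
from dataclasses import dataclass, field


@dataclass
class UnitCosts:
    locomotive: float        # cost per scheduled locomotive
    locomotive_mile: float   # cost per mile traveled by a locomotive
    work_event: float        # cost per work-event
    car_mile: float          # cost per mile traveled by a car
    crew_imbalance: float    # cost per crew imbalance
    train_imbalance: float   # cost per train imbalance
    missed_car: float        # cost per missed car
    swap: dict = field(default_factory=dict)  # station -> unit block swap cost


@dataclass
class SolutionSummary:
    locomotives: int
    locomotive_miles: float
    work_events: int
    car_miles: float
    swap_stations: list      # one entry per block swap: the station where it occurs
    crew_imbalances: int
    train_imbalances: int
    missed_cars: int


def total_cost(s, c):
    """Sum of the eight cost components of a summarized solution."""
    swap_cost = sum(c.swap[station] for station in s.swap_stations)
    return (c.locomotive * s.locomotives
            + c.locomotive_mile * s.locomotive_miles
            + c.work_event * s.work_events
            + c.car_mile * s.car_miles
            + swap_cost
            + c.crew_imbalance * s.crew_imbalances
            + c.train_imbalance * s.train_imbalances
            + c.missed_car * s.missed_cars)


# Hypothetical numbers, for illustration only.
costs = UnitCosts(locomotive=400, locomotive_mile=10, work_event=350,
                  car_mile=0.75, crew_imbalance=600, train_imbalance=1000,
                  missed_car=5000, swap={"B": 60})
summary = SolutionSummary(locomotives=2, locomotive_miles=100, work_events=3,
                          car_miles=1000, swap_stations=["B"],
                          crew_imbalances=1, train_imbalances=2, missed_cars=0)
print(total_cost(summary, costs))
```

Keeping the swap cost per station, rather than as a single unit cost, mirrors component (5), where the unit block swap cost depends on the station.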

Example 1. Consider a railroad network with five stations A, B, C, D, and E and six railroad segments as schematically depicted in Figure 1, where distances (in miles) are assumed symmetrical and all segments are bidirectional. Seven blocks must be delivered. Table 1 furnishes the relevant data.
The only nonadjacent end points defining a crew segment are B and D. The corresponding crew segment is BCD because its length is minimal among all possible paths connecting B and D. Clearly, the other crew segments are trivially found. Thus, if the route of some train is, say, D → C → B → A, then at D a crew from crew segment BCD could be assigned to this train in the subroute D → C → B, and subsequently at B a crew from crew segment BA could be assigned to the train in the subroute B → A. When a train crosses over from one crew segment to another, the onboard crew gets off the train and a new crew gets onboard. Further, crew segments are bidirectional. Hence, crews in crew segment BA can take a train from A to B or vice versa. It is assumed that crews always travel along the shortest path between any pair of stations.
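The decomposition of a train route into crew segments described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `decompose` is hypothetical, crew segments are represented as tuples of station names, and only the two crew segments named in the text (BCD and BA) are used in the usage lines.

```python
def decompose(route, crew_segments):
    """Split a train route into consecutive crew segments.

    route: list of stations visited by the train, in order.
    crew_segments: iterable of station tuples; each may be traversed
    in either direction. Returns the list of segments as traversed,
    or None if the route cannot be covered end to end.
    """
    oriented = set()
    for seg in crew_segments:
        oriented.add(tuple(seg))
        oriented.add(tuple(reversed(seg)))

    def solve(i):
        # The whole route is covered once the last station is reached.
        if i == len(route) - 1:
            return []
        for seg in sorted(oriented):
            k = len(seg)
            if tuple(route[i:i + k]) == seg:
                rest = solve(i + k - 1)  # next crew boards at seg's last station
                if rest is not None:
                    return [seg] + rest
        return None

    return solve(0)


segments = [("B", "C", "D"), ("B", "A")]
print(decompose(["D", "C", "B", "A"], segments))
# [('D', 'C', 'B'), ('B', 'A')]
print(decompose(["D", "C"], segments))
# None: C is not an end point of any crew segment
```

The recursion backtracks when a chosen segment leads to a dead end, so a valid decomposition is found whenever one exists.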

2.2. A Feasible Solution of Example 1

Consider the solution shown in Table 2. Two trains are scheduled to transport blocks. Note that one of them picks up 63 cars at station D and then goes to C, where it delivers 63 cars and picks up 5 cars. From station C this train travels to B to pick up 42 cars; then it goes to station A to deliver 47 cars and to pick up 13 cars. Finally, it travels to station B to deliver 13 cars.

The total distance traversed by the two trains is 1,273 miles. The train that starts at D stops to pick up or drop blocks in stations C, B, and A, while the other train stops to pick up or drop blocks in stations D, B, and C. Hence, the total number of work-events produced by both trains is six.

Table 3 indicates the train on which each block travels between stations. For instance, one of the blocks travels first on one train from A to B and then on the other train from B to D, yielding one block swap (with cost 60).

The column labeled “Car miles” is computed as the product of the distance traveled by a block on a segment, and the number of cars in the block. Note that the block-to-train assignment satisfies the constraints on the maximum number of block swaps, and on the maximum number of blocks per train (see Tables 1(a) and 1(b)).

In Table 4 the crew-to-train assignment information is provided. It is shown, for instance, that a crew travels from D to B on one of the trains. Then another crew travels from B to A on that train. Finally, this crew travels from A to B on the same train.

Note that there is one train imbalance in station B because it is the origin of no train, and one train terminates there. Similarly, there is one train imbalance in station E. Hence, this solution yields two train imbalances.
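The train imbalance count can be checked with a few lines of Python. In this sketch the function name `train_imbalances` is hypothetical, and each train is reduced to its (origin, destination) pair. The first pair, (D, B), follows the train described in the text; the second pair, (E, D), is an inference from the stated imbalances at B and E and is an assumption, since Table 2 is not reproduced here.

```python
from collections import Counter


def train_imbalances(trains):
    """Map each unbalanced station to |#trains terminating - #trains originating|.

    trains: iterable of (origin, destination) pairs.
    """
    net = Counter()
    for origin, destination in trains:
        net[origin] -= 1       # one train originates here
        net[destination] += 1  # one train terminates here
    return {station: abs(v) for station, v in net.items() if v != 0}


imbalances = train_imbalances([("D", "B"), ("E", "D")])
print(imbalances)                # {'B': 1, 'E': 1}
print(sum(imbalances.values()))  # 2
```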

The objective function value of this solution is 47,603, computed in Table 5 (costs in $).

3. The Model

We consider the railroad network as a directed graph $G = (N, A)$ whose set of nodes $N$ corresponds to the set of stations, and whose set of arcs $A$ is derived from the railroad segments between stations, associating a pair of arcs with opposite directions to each rail segment. The length $\ell_{ij}$ of arc $(i, j) \in A$ is equal to the distance between stations $i$ and $j$. Also, for $(i, j) \in A$, let $ML_{ij}$, $MW_{ij}$, and $MN_{ij}$ denote, respectively, the maximum train length, the maximum weight, and the maximum number of trains allowed to traverse arc (or segment) $(i, j)$ in either direction.

Denote by $K$ the set of crew segments, where each crew segment $k \in K$ is a path through a set of stations $N_k \subseteq N$, including end points, and let $N_K = \bigcup_{k \in K} N_k$. We assume crews are always traveling on crew segments.

Let $B$ be the set of blocks to be delivered. For block $b \in B$, its origin, destination, number of cars, length, and weight are denoted $o_b$, $d_b$, $n_b$, $L_b$, and $W_b$, respectively.

Let $T$ be the set of all possible trains, and let $s$ be a generic sequence of arcs of $G$, with $\ell(s)$ denoting its length. Thus, the arc sequence followed by train $t \in T$ is $s_t$, and that of block $b$ is $s_b$; namely, $s_b$ is a block-path of $b$. In case train $t$ is used, let $K_t$ and $e_t$ be, respectively, the set of crews traveling on train $t$ and the number of work-events of train $t$. Our mathematical formulation below closely follows the one proposed in [20].

Let $c_t$ denote the cost of train $t$, which consists of the unit locomotive cost $C^{L}$, plus the train travel cost, plus the cost of work-events. Hence, with $C^{M}$ the per mile locomotive travel cost and $C^{W}$ the unit work-event cost,

$$c_t = C^{L} + C^{M}\,\ell(s_t) + C^{W} e_t. \tag{1}$$

Each train $t$ can be seen, in fact, as a sequence of crew segments, forming a path $s_t$. The subset of trains passing through arc $(i, j)$ is $T_{ij}$, and $m_{t,ij}$ is the number of times train $t$ passes through $(i, j)$. From set $T$ and graph $G$, one can obtain a multigraph $MG = (N, A')$, where $A'$ is obtained by replacing each arc $(i, j) \in A$ with $\sum_{t \in T_{ij}} m_{t,ij}$ copies of it. If a train passes several times through arc $(i, j)$, then each passage induces a different copy.

The binary variable $x_t$ states whether train $t$ is used or not in the solution. Let $a_{ti}$ be equal to $-1$ if train $t$ starts its route in node $i$, and equal to $+1$ if train $t$ ends its route in $i$. Thus, $\bigl|\sum_{t \in T} a_{ti} x_t\bigr|$ is the train imbalance in node $i$.

Let $u$ and $v$ be the end points of a crew segment $k$. Then $\delta_{tk}$ denotes the difference between the number of times train $t$ goes through $k$ from $u$ to $v$, and the number of times it goes through $k$ from $v$ to $u$. So, $\bigl|\sum_{t \in T} \delta_{tk} x_t\bigr|$ is the crew imbalance of crew segment $k$.

Let $P_b$ denote the set of all block-paths associated with block $b$, which includes a dummy block-path of one arc from $o_b$ to $d_b$ with cost $C^{X} n_b$, where $C^{X}$ is the cost per missed car. Further, let $P_b^{t}$ be the subset of paths in $P_b$ that use train $t$, and let $P = \bigcup_{b \in B} P_b$ be the whole set of block-paths.

For $p \in P_b$, let $c_p$ denote the block-path cost for block $b$; namely, in case of a dummy block-path, $c_p = C^{X} n_b$; otherwise

$$c_p = C^{C}\, n_b\, \ell(p) + \sum_{i \in S_p} C^{S}_{i}, \tag{2}$$

whose two terms correspond to block travel cost and block swap cost, respectively. Here $C^{C}$ is the per mile car travel cost. Recall that $n_b$ is the number of cars of block $b$, each block swap at node $i$ costs $C^{S}_{i}$, and $S_p$ is the set of stations where $b$ is block swapped.

Let $P_a$ be the set of block-paths that use arc $a \in A'$ of multigraph $MG$. Note that there is one train corresponding to $a$. Binary variable $y_p$ states whether block-path $p \in P$ is used or not in the solution.

Therefore, our mathematical model of the TDOP is

The terms in the objective function (3) correspond to the costs of locomotives, blocks, crew imbalances, and train imbalances, in that order. Constraint (4) imposes the existence of a block-path, real or dummy, for each block. Constraint (5) forces the selection of a train if any of the selected block-paths uses it. Constraints (6)–(10) stand for upper bounds on the number of blocks that a train can transport, the length and weight of trains passing through arcs, the number of trains passing through arcs, and the number of work-events per train, respectively. We emphasize that the arc sequence of every train must be decomposed into a sequence of crew segments.
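As a reading aid, the objective and the constraints described in the preceding paragraph can be sketched in LaTeX as follows. The notation is an assumption made for this sketch and is not necessarily that of [20]: $x_t$ and $y_p$ are the binary train and block-path variables; $c_t$ and $c_p$ are the train and block-path costs; $P_b$ is the set of block-paths of block $b$, $P^{t}$ the set of block-paths using train $t$, and $P_a$ the set of block-paths using copy $a$ of arc $(i, j)$ in the multigraph; $T_{ij}$ is the set of trains passing through arc $(i, j)$, each $m_{t,ij}$ times; $a_{ti}$ and $\delta_{tk}$ are the train and crew imbalance coefficients; $C^{TI}$ and $C^{CI}$ are the unit train and crew imbalance costs; $L_{b(p)}$ and $W_{b(p)}$ are the length and weight of the block routed on $p$; $e_t$ is the number of work-events of train $t$; and $MB$, $ML_{ij}$, $MW_{ij}$, $MN_{ij}$, and $ME$ are the upper bounds on blocks per train, train length, train weight, trains per arc, and work-events per train. Equation numbers are omitted on purpose, and the exact form of each constraint in the original model may differ.

```latex
\begin{align*}
\min\quad
 & \sum_{t \in T} c_t x_t + \sum_{p \in P} c_p y_p
   + C^{CI} \sum_{k \in K} \Bigl| \sum_{t \in T} \delta_{tk} x_t \Bigr|
   + C^{TI} \sum_{i \in N} \Bigl| \sum_{t \in T} a_{ti} x_t \Bigr| \\
\text{s.t.}\quad
 & \sum_{p \in P_b} y_p = 1
   && \forall b \in B
   && \text{(one block-path, real or dummy, per block)} \\
 & y_p \le x_t
   && \forall t \in T,\ p \in P^{t}
   && \text{(a used block-path selects its trains)} \\
 & \sum_{p \in P^{t}} y_p \le MB
   && \forall t \in T
   && \text{(blocks per train)} \\
 & \sum_{p \in P_a} L_{b(p)}\, y_p \le ML_{ij}
   && \forall a \in A' \text{ copy of } (i,j)
   && \text{(train length on an arc)} \\
 & \sum_{p \in P_a} W_{b(p)}\, y_p \le MW_{ij}
   && \forall a \in A' \text{ copy of } (i,j)
   && \text{(train weight on an arc)} \\
 & \sum_{t \in T_{ij}} m_{t,ij}\, x_t \le MN_{ij}
   && \forall (i,j) \in A
   && \text{(trains per arc)} \\
 & e_t\, x_t \le ME
   && \forall t \in T
   && \text{(work-events per train)} \\
 & x_t \in \{0,1\},\ y_p \in \{0,1\}
   && \forall t \in T,\ p \in P
\end{align*}
```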

4. Algorithm

This section presents the algorithm developed by us to tackle the TDOP. It consists of two main steps.

Step 1 (initial feasible solution). Composed of substeps (1.1)–(1.3), this step aims to produce from scratch a feasible solution of the TDOP, namely, one where constraints (6)–(11) are satisfied.

(1.1) Crew Selection. For every block $b$ determine a minimum length path $\pi_b$ from its origin $o_b$ to its destination $d_b$, where each arc can be operated by at least one crew. The set of crews $Q_b$ to operate along path $\pi_b$ is chosen following substeps (1.1.1)–(1.1.5) below. For every $Q_b$ a minimum set of trains is constructed, so that only one train corresponds to every pair of crews such that one ends where the other starts. Note that with this rule the corresponding trains to carry $b$ can be easily deduced. Let $Q$ be a set of crew segments covering the arcs of $\pi_b$, and let $\Omega_b$ denote the family of all such sets.

In substep (1.1.4) every possible effort is made to avoid the overlapping of crews in $Q_b$; therefore, even if $Q_b$ contained overlapping crews, this would not be an issue, since the probability of this happening is truly negligible. In the sequel $\Omega_b$ will be simply written $\Omega$. Although $\Omega$ is theoretically of exponential size, in all our experiments we were able to enumerate it completely.

(1.1.1) For $Q \in \Omega$, let $\sigma(Q)$ be the number of swaps of block $b$ when transported through the crew segments in $Q$. Determine $\sigma^{*} = \min_{Q \in \Omega} \sigma(Q)$ and reduce $\Omega$ by eliminating from it every $Q$ such that $\sigma(Q) > \sigma^{*}$. If $\Omega$ has a unique element, say, $Q$, make $Q_b = Q$; otherwise go to (1.1.2).

(1.1.2) If in $\Omega$ there is a crew set starting in $o_b$, then reduce $\Omega$ by eliminating from it every $Q$ not starting in $o_b$. If $\Omega$ has a unique element, say, $Q$, make $Q_b = Q$; otherwise go to (1.1.3).

(1.1.3) If in $\Omega$ there is a crew set ending in $d_b$, then reduce $\Omega$ by eliminating from it every $Q$ not ending in $d_b$. If $\Omega$ has a unique element, say, $Q$, make $Q_b = Q$; otherwise go to (1.1.4).

(1.1.4) Let $D(Q)$ be the total distance traveled by the trains corresponding to the crew segments in $Q$, and let $D^{*} = \min_{Q \in \Omega} D(Q)$. Reduce $\Omega$ by eliminating from it every $Q$ such that $D(Q) > D^{*}$. If $\Omega$ has a unique element, say, $Q$, then make $Q_b = Q$; otherwise go to (1.1.5). For example, assume we have a block with origin A and destination D, together with crews A-B-C, B-C-D, and A-B. One solution is to use crews A-B-C and B-C-D, while another uses crews B-C-D and A-B; with this criterion the second solution is chosen.

(1.1.5) Let . Reduce by eliminating from it every such that . If has a unique element, say, , make ; otherwise make equal to an arbitrarily selected .
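Substeps (1.1.1)–(1.1.5) share one pattern: keep only the candidates that are best for the current criterion, stop as soon as a single candidate remains, and break any remaining tie arbitrarily. The Python sketch below illustrates that pattern only. The function name `select_crew_set` and the scoring functions are hypothetical, the swap-count criterion and the criterion of substep (1.1.5) are left to the caller, and the usage lines assume one distance unit per hop in order to reproduce the example of substep (1.1.4).

```python
import random


def select_crew_set(candidates, criteria, rng=random):
    """Filter candidates criterion by criterion, keeping the lowest scores.

    candidates: nonempty list of crew sets.
    criteria: list of functions mapping a crew set to a number (lower is better).
    A yes/no rule such as "starts at the origin" is encoded as 0 (yes) or 1 (no):
    if no candidate satisfies it, all of them tie and none is eliminated.
    """
    remaining = list(candidates)
    for score in criteria:
        best = min(score(c) for c in remaining)
        remaining = [c for c in remaining if score(c) == best]
        if len(remaining) == 1:
            return remaining[0]
    return rng.choice(remaining)  # arbitrary selection among the survivors


# Block with origin A and destination D; crews A-B-C, B-C-D, and A-B.
candidates = [("ABC", "BCD"), ("AB", "BCD")]
criteria = [
    lambda c: 0 if c[0][0] == "A" else 1,     # a crew set starting at the origin
    lambda c: 0 if c[-1][-1] == "D" else 1,   # a crew set ending at the destination
    lambda c: sum(len(seg) - 1 for seg in c), # total distance, one unit per hop
]
print(select_crew_set(candidates, criteria))  # ('AB', 'BCD')
```

Encoding each rule as a score keeps the filter generic, so a further criterion can be appended to `criteria` without touching the loop.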

(1.2) Block Ordering. An order on the set of blocks is constructed as follows. Let be the length of path . If then block precedes block in . However, if , then precedes in whenever , where denotes the number of blocks with origin and destination . In case and hold together, block precedes whenever . Finally, if , , and , this tie between and is arbitrarily broken to produce .

(1.3) Train Assignment. Blocks are considered one after the other, according to the order constructed in Step (1.2). First, assign the first block to the crews selected in Step (1.1), together with the required train(s). For each subsequent block, if its path is contained in the path of some block preceding it in the order, then assign it to that path whenever the upper bounds on length (feet), weight (tons), and maximum number of blocks per train allow. Otherwise, assign the block to the crews selected in Step (1.1), together with the required train(s). This step produces a feasible solution.

Step 2 (simulated annealing). Proposed more than three decades ago [24, 25], Simulated Annealing (SA, for short) is one of the most successful metaheuristics for finding good solutions to many combinatorial optimization problems (see [26] and the references therein), including the railroad freight transportation design problem [27]. In its theoretical formulation, SA converges with probability 1 to a global minimum under certain assumptions on the control parameters. In practice, these assumptions cannot be implemented, but adequate cooling schemes increase the likelihood of obtaining a near optimal solution [28].

Central to SA is the neighborhood concept. For our proposal we hand-tailored it as follows. Let $\mathcal{S}$ denote the set of feasible solutions of an instance of the TDOP. The neighborhood of a solution $s \in \mathcal{S}$ is $N(s) = N_1(s) \cup N_2(s) \cup N_3(s)$, where
(i) $N_1(s)$ contains every solution obtainable from $s$ by adding one or more trains to deliver one block alone, say $b$, through its minimum length path.
(ii) $N_2(s)$ contains the solutions obtainable from $s$ by removing one block, say $b$, from the set $T_b$ of trains carrying it and delivering it through a minimal length route from its origin to its destination without using any train of $T_b$, using other trains in $s$.
(iii) $N_3(s)$ contains every solution obtainable from $s$ by the removal of one block, say $b$, from the set $T_b$ of trains carrying it, and delivering it through a minimal length route from its origin to its destination without using any train of $T_b$, with at least one added train, and using at least one existing train.

We implemented a simulated annealing procedure with a geometrical cooling scheme as shown below, where the variable $\tau$ stands for the system temperature, and the parameters $\tau_0$, $\tau_f$, $\alpha$, and $L$ denote, respectively, the initial and the final temperature, the cooling factor, and the internal cycle length. The objective function value of a solution $s$ is written $f(s)$.

Procedure NEIGHBOR randomly produces (with uniform distribution) a feasible solution in the neighborhood of the current solution $s$, and RAND delivers a uniformly distributed random number in the interval $(0, 1)$. SA starts with the solution $s$ delivered by Step 1 above (see Algorithm 1).

τ ← τ0; s* ← s;
While τ > τf do
  k ← 0;
  While k < L do
   s′ ← NEIGHBOR(s);
   Δ ← f(s′) − f(s);
   If RAND < exp(−Δ/τ) or Δ < 0 then
       s ← s′;
   EndIf
   k ← k + 1;
   If f(s) < f(s*) then
      s* ← s; k ← 0;
   EndIf
  EndWhile
  τ ← ατ;
EndWhile
call FUSION.

Throughout the process, the variable $s^{*}$ holds the best solution so far, and the counter $k$ helps to control the number of iterations of the internal cycle. Thus, the system temperature drops if and only if $L$ iterations occur without any improvement on $s^{*}$.
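The control flow just described (geometric cooling, with an internal counter that is reset whenever the best solution improves) can be sketched in Python as follows. This is a generic illustration, not the authors' g++ implementation: the function and parameter names are hypothetical, the neighborhood and the cost function are supplied by the caller, the final FUSION call is omitted, and the usage lines minimize a toy one-dimensional function instead of a TDOP instance.

```python
import math
import random


def simulated_annealing(s0, cost, neighbor, t0, tf, alpha, cycle_len, rng=None):
    """Simulated annealing with geometric cooling.

    The temperature is multiplied by alpha only after cycle_len consecutive
    iterations without improvement of the best solution found so far.
    """
    rng = rng or random.Random()
    current, current_cost = s0, cost(s0)
    best, best_cost = current, current_cost
    temp = t0
    while temp > tf:
        k = 0
        while k < cycle_len:
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept worsenings with prob. exp(-delta/temp).
            if delta < 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = candidate, candidate_cost
            k += 1
            if current_cost < best_cost:
                best, best_cost = current, current_cost
                k = 0  # an improvement restarts the internal cycle
        temp *= alpha
    return best, best_cost


# Toy usage: minimize (x - 7)^2 over the integers, moving one unit at a time.
rng = random.Random(0)
best, best_cost = simulated_annealing(
    s0=20,
    cost=lambda x: (x - 7) ** 2,
    neighbor=lambda x, r: x + r.choice((-1, 1)),
    t0=10.0, tf=0.01, alpha=0.9, cycle_len=50, rng=rng,
)
print(best, best_cost)
```

Because the counter is reset on every improvement, the time spent at one temperature is not fixed in advance; it grows while the search keeps finding better solutions.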

Added at the end of the external cycle, procedure FUSION attempts to improve the best solution found by successively applying the following four operations in the order shown. Before passing to the next operation, for each operation all possibilities are considered, repeating it exhaustively as long as the cost is lowered and feasibility is preserved.
(1) When two trains have identical routes, one locomotive is eliminated once its blocks are passed to the other.
(2) If there are two stations such that more trains terminate than originate at the first, and more trains originate than terminate at the second, then a train with no blocks is added starting in the first and ending in the second.
(3) If a train starts its route in a station where another train terminates its travel, one locomotive is eliminated once its blocks are passed to the other.
(4) If there are two trains such that the route of the first is contained in the route of the second, then the locomotive of the first is eliminated once its blocks are passed to the second.

Example 2. For the sake of clarity, consider a railroad network as schematically depicted in Figure 2. Assume a block $b$ with origin A and destination D, and let the crew segments be AF, AD, AI, AP, FD, IL, LD, QR, and DP. Also, let $s$ be a feasible solution with six trains in correspondence to routes A → E → F, F → G → D, A → H → I, I → J → F → K → L, A → N → O → P, and L → M → D, where block $b$ is delivered by the trains assigned to routes A → E → F and F → G → D; hence 63 is their total traveled distance. Thus, a neighbor of type (i) is identical to $s$ with the exception that block $b$ is delivered by a new train through route A → B → C → D, with length 42.
A neighbor of type (ii) is identical to $s$ with the exception that block $b$ is delivered by the trains with routes A → H → I, I → J → F → K → L, and L → M → D, with total length 91, and without adding a new train.
A neighbor of type (iii) is identical to $s$ with the exception that block $b$ is delivered through route A → N → O → P, adding a new train for segment PD; the total length of this route is 47.
In regard to simulated annealing parameters, we tested diverse settings, modifying one parameter at a time. Among the possible combinations of the tested cooling factors and internal cycle lengths (the latter including 1,000, 1,500, 2,000, 4,000, 6,000, 8,000, and 10,000), for every pair we ran the algorithm ten times on a PC with Windows 8.
Table 6 displays the average results obtained for instance Data Set 2, where computer times are shown in italics. Thus, we chose a cooling factor of 0.9 and an internal cycle length of 1,000, since together they yielded a good trade-off between computer time and solution quality. As initial temperature we chose 30,000, because on average it yielded an acceptance rate of around 95%. The final temperature was selected so as to yield an acceptance rate of around 1% on average.

5. Numerical Results

Our simulated annealing approach to the TDOP, as described in Section 4, was implemented on a computer with a Xeon E5-2643 v2 3.5 GHz processor (we ran on a single processor), 64 GB RAM, and the g++ compiler. Two experiments were conducted to investigate its efficiency.

In the first experiment—see Section 5.1—we tested SA on three instances available to us: Example (a toy, synthetic instance), Data Set 1 (real), and Data Set 2 (real), from [13].

The second experiment was designed to evaluate the performance of SA from the point of view of the quality of the results and the required computer time. We did extensive testing on randomly generated instances of various sizes; they are dealt with in Section 5.2.

5.1. Testing SA on Specific Instances

Employing an internal cycle length of 1,000 and a cooling factor of 0.9, we compared SA results on instances Example, Data Set 1, and Data Set 2, as can be seen in Table 7, where computation times are also shown to give an idea of the performance of the implementation, although the platforms used were not the same. So far, [18] had provided, using a mixed integer programming model, the best results for the two real instances, but at the expense of very high computing time. On the other hand, the network-oriented formulation in [21] yielded the fastest algorithm to date.

Other approaches include column-generation [23], with better results than those found in [22], and column-generation combined with tabu search [20], improving on [19]. In the case of instance Example the only published results come from [23]. The best results found with SA were 1,999,315 with an internal cycle length of 8,000 and a cooling factor of 0.99, in 19,819 seconds for instance Data Set 1, and 3,155,267 with an internal cycle length of 10,000 and a cooling factor of 0.95, in 41,233 seconds for instance Data Set 2. Compared with the more relevant results found in the literature, our results yield a lower cost for Data Set 1, the same cost for Example, and a higher cost for Data Set 2.

5.2. Testing SA on Random Instances

A set of 20 random instances was generated as follows with the number of stations and number of blocks shown in column II of Table 8. All random choices were made with uniform distribution.

For each pair (number of stations, number of blocks) we first construct a complete graph whose vertex set $V$ corresponds to a set of integer points randomly generated in a square; the length of each edge is equal to the Euclidean distance between its end vertices. Also, we determine the set $E_T$ of edges in a minimum length spanning tree on this complete graph, as well as the set $E_H$ of edges in the convex hull of $V$. Finally, we make the railroad network of the instance correspond to graph $(V, E_T \cup E_H)$.
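The network construction described above can be sketched in Python as follows, under the assumption that the network consists of the spanning tree edges together with the convex hull edges. The function name `random_network` and the parameter `side` (the side of the square) are hypothetical, and the usage values are arbitrary. The sketch uses a simple version of Prim's algorithm and the monotone chain method for the convex hull; it requires that the number of stations does not exceed the number of integer points in the square.

```python
import math
import random


def random_network(n, side, rng):
    """Return (points, edges): n distinct integer points in [0, side]^2 and
    a dict {(i, j): length} with the edges of a minimum spanning tree plus
    the edges of the convex hull (indices refer to the sorted point list)."""
    chosen = set()
    while len(chosen) < n:
        chosen.add((rng.randint(0, side), rng.randint(0, side)))
    pts = sorted(chosen)

    def dist(i, j):
        return math.dist(pts[i], pts[j])

    edges = set()

    # Minimum spanning tree of the complete graph (simple Prim, cubic time).
    in_tree = {0}
    while len(in_tree) < n:
        i, j = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: dist(*e))
        in_tree.add(j)
        edges.add((min(i, j), max(i, j)))

    # Convex hull by the monotone chain method (pts is already sorted).
    def cross(o, a, b):
        return ((pts[a][0] - pts[o][0]) * (pts[b][1] - pts[o][1])
                - (pts[a][1] - pts[o][1]) * (pts[b][0] - pts[o][0]))

    def half_hull(indices):
        hull = []
        for i in indices:
            while len(hull) >= 2 and cross(hull[-2], hull[-1], i) <= 0:
                hull.pop()
            hull.append(i)
        return hull

    hull = half_hull(range(n))[:-1] + half_hull(range(n - 1, -1, -1))[:-1]
    for a, b in zip(hull, hull[1:] + hull[:1]):
        if a != b:
            edges.add((min(a, b), max(a, b)))

    return pts, {e: dist(*e) for e in edges}


points, network = random_network(n=8, side=100, rng=random.Random(1))
print(len(points), len(network))
```

The spanning tree guarantees that the network is connected, while the hull edges add cycles, so blocks generally have more than one possible route.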

Then, crew segments are randomly created so that every edge of the network belongs to one crew segment. The swap cost in every station is also randomly assigned within an integer range.

We next create a set of blocks, one after the other, verifying that each does not exceed the maximum number of swaps allowed to reach its destination when traveling along a minimal length path. The number of cars, the length, and the weight of every block are randomly chosen in integer ranges similar to those found in real instances. Each time we tentatively form a block, Step 1 of Section 4 is performed for feasibility verification.

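Also, for every edge, the maximum train length and the maximum train weight are randomly chosen in the ranges [7,000, 14,000] and [9,000, 18,000], respectively; the maximum number of trains is likewise randomly chosen in its own range.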

Once the random instances were generated, SA was run 50 times on each. Results are displayed in columns III–V of Table 8 (indeed, instance p5_7D was taken as Example 1 in Section 2).

5.3. Lower Bounds

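Establishing good lower bounds for the optimal solution of the TDOP seems a very difficult task. However, to assess the SA performance, we propose here a lower bound for each solved instance; see Tables 7 and 8. To this aim we computed, as explained below, lower bounds for total travel cost ($LB_1$), train start cost ($LB_2$), train travel cost ($LB_3$), work-event cost ($LB_4$), and missed car cost ($LB_5$). Thus, $LB = LB_1 + LB_2 + LB_3 + LB_4 + LB_5$.

In regard to travel cost let $LB_1 = C^{C} \sum_{b \in B} n_b\, \ell^{*}(o_b, d_b)$, where $C^{C}$ is the per mile car travel cost, $n_b$ is the number of cars of block $b$, and $\ell^{*}(o_b, d_b)$ is the length of the shortest route from the origin $o_b$ to the destination $d_b$ of $b$. In the case of the two real instances $LB_1$ is identical to the bound computed in [23].

For the train start cost a trivial lower bound is $LB_2 = C^{L} \lceil |B| / MB \rceil$, where $C^{L}$ is the unit locomotive cost and $MB$ stands for the maximum number of blocks per train.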
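The two bounds just discussed are simple to compute. The Python sketch below assumes that the travel cost bound is the per mile car travel cost times the sum, over all blocks, of the number of cars multiplied by the shortest route length, and that the train start bound is the unit locomotive cost times the number of blocks divided by the maximum number of blocks per train, rounded up. All function names and the data in the usage lines are hypothetical, and every block destination is assumed to be reachable from its origin.

```python
import heapq
import math


def shortest_lengths(adj, source):
    """Dijkstra's algorithm; adj maps a station to a list of (neighbor, miles)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, miles in adj.get(u, ()):
            nd = d + miles
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def car_travel_bound(blocks, adj, car_mile_cost):
    """blocks: list of (origin, destination, number_of_cars)."""
    cache = {}
    car_miles = 0.0
    for origin, destination, cars in blocks:
        if origin not in cache:
            cache[origin] = shortest_lengths(adj, origin)
        car_miles += cars * cache[origin][destination]
    return car_mile_cost * car_miles


def train_start_bound(num_blocks, max_blocks_per_train, locomotive_cost):
    """At least ceil(num_blocks / max_blocks_per_train) locomotives are needed."""
    return math.ceil(num_blocks / max_blocks_per_train) * locomotive_cost


# Hypothetical data: a path network A - B - C with segments of 10 and 5 miles.
adj = {"A": [("B", 10)], "B": [("A", 10), ("C", 5)], "C": [("B", 5)]}
blocks = [("A", "C", 4), ("C", "B", 2)]
print(car_travel_bound(blocks, adj, car_mile_cost=0.5))  # 35.0
print(train_start_bound(7, 3, locomotive_cost=400))      # 1200
```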

Let be an order on , with , for . Thus, for the train travel cost our proposed lower bound is , where . Bound slightly improves on that from [20].

For every station , let be the number of blocks such that or . Then , where is the set of stations belonging to a crew segment with the exclusion of its end points. As far as we know this lower bound for the work-event cost is proposed here for the first time.

Let $U$ denote the set of blocks whose origin or destination does not belong to the vertex set defined in Section 3. Thus, it is impossible to deliver any block in $U$. Our lower bound for the missed car cost is then the cost per missed car times the total number of cars of the blocks in $U$. Although in practice it is not likely to get a nonempty $U$, instance Data Set 2 contains some blocks belonging to $U$.

Thus, in regard to the best solutions found for instances Example, Data Set 1, and Data Set 2 (see Table 7), our lower bound yields differences of 36.14%, 20.79%, and 14.57%, respectively, namely, 23.83% on average. Column VII of Table 8 shows a similar behavior of the lower bound for the synthetic instances. These facts lead us to believe that our lower bound, although the highest known to us, is far from the true optimum.

6. Conclusion

In this paper we proposed a simulated annealing approach for the Train Design Optimization Problem. This approach was computationally tested with three instances well known in the specialized literature, and with 20 instances randomly generated by us, of sizes up to 320 stations and 640 car blocks. Its results show superiority, in regard to the objective function value, over those obtained elsewhere for instance Data Set 1, and competitiveness (in fact the runner-up) for instance Data Set 2, both results using reasonable computing resources.

We think however that much research, be it theoretical or empirical, must still be carried out to successfully deal with this most challenging, combinatorial optimization problem. The TDOP being so fundamental in the railroad industry, there is a need to develop new heuristics or metaheuristics, which, working alone or hybridized, improve the best results obtained so far. Also, a pending and difficult task is to produce efficient procedures that yield good lower bounds for the TDOP objective function, so as to help assess the performance of proposed approaches.

Disclosure

David Romero is on sabbatical leave at Laboratorio Nacional de Informática Avanzada, Xalapa, Veracruz, Mexico.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.