Abstract

Effective railway freight transportation relies on a well-designed train service network. This paper investigates the train service network design problem at the tactical level for the Chinese railway system. It aims to determine the types of train services to be offered, how many trains of each service are to be dispatched per day (service frequency), and by which train services shipments are to be transported. An integer programming model is proposed to address this problem. The optimization model considers both through train services between nonadjacent yards and two classes of service between two adjacent yards (i.e., shuttle train services that run directly from one yard to its adjacent yard, and local train services that make at least one intermediate stop). The objective of the model is to transport all the shipments at minimal cost. The costs consist of accumulation costs, classification costs, train operation costs, and train travel costs. The NP-hard nature of the problem prevents an exact solution algorithm from finding the optimal solution within a reasonable time, even for small-scale cases. Therefore, an improved genetic algorithm is designed and employed here. To demonstrate the proposed model and the algorithm, a case study on a real-world sub-network in China is carried out. The computational results show that the proposed approach can obtain high-quality solutions with satisfactory speed. Moreover, a comparative analysis against a case that assumes all the shuttle train services between any two adjacent yards are provided without optimization reveals some interesting insights.

1. Introduction

Railway freight is an important logistical system supporting both global and regional economies. Millions of railcars are continuously moving every day and night. In 2018, around 2.6 trillion ton-km of goods were transported on the Chinese railway network, which comprises 131,000 kilometers of operating mileage [1]. The demand for transportation is growing more rapidly than infrastructure investment and construction, thus motivating the optimization of current railway operations.

A physical railway network consists of terminals and links with restricted capacity. A freight shipment is defined as being between terminals. A common method of planning railway transportation is first to arrange direct services between high-demand or high-priority origin–destination (O–D) pairs. Scheduled train services are then dispatched hauling the high-value shipments. The remaining shipments undergo consolidation operations. At the appropriate yard, shipments with different final destinations are assembled into blocks, each of which is an arbitrary unit to be considered. The blocks are eventually attached to a train, and may be transferred from one train to another. When a block arrives at its destination, it is separated from the train, and its cars are sorted. Generally, this type of railway operation consists of the following sub-problems: the blocking problem, the block-to-train problem, and the train routing and scheduling problem. The blocking problem is the foremost sub-problem; its aim is to determine the overall blocks to be built at each yard and the specific shipments that should be placed into each block to reduce the amount of intermediate handling as they travel from their origins to their respective destinations. Once a shipment is placed in a block, it will not be reclassified until it reaches that block’s destination. After a blocking policy is developed, the next step is to identify which trains should carry which blocks to their destinations. On top of these two sub-problems, the train routing and scheduling problem is considered, which determines the trains’ routes and timetables so as to minimize the total costs of carrying the cars [2–5].

There is a rich literature on the blocking problem and the block-to-train problem. Bodin et al. [6] established an arc-based mixed integer programming model that considered the capacity constraints at each yard to calculate the maximum number of blocks and the maximum volume of cars that can be handled. Barnhart et al. [7] formulated the railway blocking problem as a network design problem with maximum degree and flow constraints on the nodes; they proposed a heuristic Lagrangian relaxation approach to solve the problem. Ahuja et al. [8] proposed a model for the railway blocking problem using a very large-scale neighborhood search technique, as adopted by many railway companies. Jha et al. [9] formulated both arc-based and path-based time-space network models to solve the block-to-train problem at the operational level. Their models assigned a blocking plan for a given train schedule with respect to minimizing global transportation costs; they also developed greedy and Lagrangian relaxation heuristic algorithms to solve the model. Xiao et al. [10] also solved the block-to-train problem at the tactical level, which determines both the supplied services and the transportation strategy for each block established in the railway network. Yue et al. [11] introduced a model which can comprehensively describe the blocking policy and various combinations of multi-route O–D pairs in large-scale railway networks, and proposed an improved ant colony algorithm to solve the problem. Lin et al. [12] formulated and solved the problem of train connection services in the Chinese network to determine the freight train services and their frequencies, considering the differences between the freight rail networks in China and North America. However, they only considered single-block trains. Fügenschuh et al. [13] presented a linear mixed-integer model for the car routing problem arising from Deutsche Bahn’s operations. The model sought the most economical car routing, and considered train and car travel kilometers and the number of sorting tracks used. For a railway network only using single-block trains, the blocking and train make-up problems are naturally combined.

Given that the blocking, the block-to-train, and the train routing and scheduling sub-problems are interrelated, some researchers have considered several of the issues as an integrated optimization problem. Zhu et al. [14] presented a model integrating service selection and scheduling, car classification, and blocking based on a cyclic three-layer space-time network. Crainic et al. [15] presented a general optimization model that considered the interactions between routing freight traffic, scheduling train services, and allocating classification work on a rail network. Gorman [16] developed a heuristic approach based on the genetic algorithm and Tabu search to plan the freight railway operation by integrating train scheduling and demand-flow problems. Kwon et al. [17] developed an algorithm to improve a given blocking plan and block-to-train assignment, formulated the problem as a linear multi-commodity flow problem, and used the column generation technique to solve it. Keaton [18] formulated the train operating problem as a mixed integer programming model to determine which pairs of terminals are to be provided with direct train connections and the frequencies of service. The objective was to minimize the sum of the costs of the train, car time, and classification yard, while not exceeding limits on train size and yard volumes.

The voluminous literature on the blocking and block-to-train problems focuses mainly on the railway networks of North America and Europe, which are significantly different from that of China. Specifically, railway operation in China is supervised by the China Railway Corporation, whose hierarchical management system is different from the parallel management by different companies in North America. As the demand for passenger and freight transportation exceeds the capacity of the system, the emphasis in China is less on “scheduling” and more on managing the train and line operations to operate freight trains between passenger traffic. Freight trains are arranged by waiting until enough cars have been collected that are traveling either to the destination yard or farther. Therefore, the models and algorithms applicable to North American and European systems cannot reflect the situation of the Chinese railway network, and the methods proposed in the literature cannot be applied directly in this study. The literature on the Chinese railway network (e.g., Lin et al. [12]) generally assumes that all the shuttle train services between any two adjacent yards are provided without optimization. This simplifies the problem and reduces the modeling complexity, but also leads to suboptimal solutions. Therefore, in order to find globally optimal solutions, it is necessary to relax the assumption and reformulate the mathematical model, which constitutes the motivation of this paper. Moreover, the train service network design problem is a typical combinatorial optimization problem and has an inherent NP-hard nature, making it difficult for an exact solution algorithm to find the optimal solution within a reasonable time. Therefore, it is of great significance to develop heuristic algorithms (e.g., evolutionary algorithms [19–23]) to obtain high-quality solutions quickly.

This paper primarily addresses the problem of train service network design optimization at the tactical level. Its contributions are summarized as follows. (1) Two classes of train services between two adjacent yards are considered, which is different from the traditional method that assumes all the services between any two adjacent yards are shuttle services without optimization. (2) An optimization model for the problem of train service network design is proposed with respect to minimizing car-hour consumption across all the yards in the railway network. (3) An improved genetic algorithm is designed to solve the optimization model. (4) The proposed model and algorithm were tested in a realistically sized railway network. The computational results show the proposed approach can obtain reasonable-quality solutions with satisfactory speed. Moreover, a comparative analysis against a method with traditional assumptions reveals some interesting insights.

The remainder of this paper is organized as follows. Section 2 further describes the problem. An integer programming formulation is presented in Section 3. Section 4 presents a heuristic approach based on the improved genetic algorithm to solve the model. Section 5 tests the model and the approach to solving it. Finally, Section 6 summarizes the research.

2. Description of the Problem

There are various types of railway service including local train services, shuttle train services and through train services. These services can be used in different ways to transport a given shipment. The overall transportation process is outlined using the example in Figure 1.

Figure 1 shows a simple line network consisting of four yards (, , , ; red circles) and six railway logistics centers (, , , , , ; green circles) located between pairs of adjacent yards. The green arcs (labeled 1–3) represent shuttle train services formed at one yard and broken up at the adjacent yard. The blue arcs (labeled 4–6) are through train services that are formed at one yard, pass through one or more yards, and are finally broken up at a relatively distant yard. These two service types are collectively called direct train services. The red dotted arcs (labeled 7–9) denote local train services, which pick up and deliver shipments between yards and logistics centers and carry shipments among the logistics centers. The local and shuttle train services both run between two adjacent yards; the former makes intermediate stops, while the latter goes directly from one yard to the other. The local train service therefore takes longer than the shuttle train service. We consider the methods of transporting shipments , , and .

The shipment can only be shipped by the local train service Arc 7, which forms at yard , stops at intermediate stations and , and finally breaks up at yard . There are two options for the shipment : one is by Arc 1, a direct shuttle train service from to , the other is the local train service Arc 7, which stops both at the logistics centers and . These two intermediate stops make the travel time longer than that for direct shipping, despite both the trains following the same path. These train services can be represented as and , respectively. In summary, shipments between two adjacent yards can be carried either by a shuttle or local train service. The shipment can be carried directly by the through train service Arc 4 from its origination to the destination without classification: i.e.,(1),Other strategies that are possible include (but are not limited to),(2),(3),(4),(5),(6),(7).

Strategies (2), (3), (4), and (5) involve the shipment being classified once, while two classification operations are needed in strategies (6) and (7).

The above analysis shows that different types of shipments have different transportation strategies, which involve different costs incurred from accumulation, reclassification, train operation, and train travel. Selecting the best shipping strategy for each shipment while minimizing the total costs is a combinatorial optimization problem.
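The combinatorial growth of strategies can be illustrated with a minimal sketch. It assumes a fixed yard path (as in the line network of Figure 1), treats every subset of intermediate yards as a possible set of classification yards, and allows either a shuttle or a local service on each leg between adjacent yards. The yard labels `Y1` to `Y4` and the function names are hypothetical placeholders, not notation from the paper.

```python
from itertools import product

def enumerate_strategies(path):
    """Enumerate train-service chains along a fixed yard path.

    Each chain is a list of legs (origin, destination, service_type).
    A leg between adjacent yards may be a 'shuttle' or a 'local' service;
    a leg that skips at least one yard is a 'through' service.
    """
    n = len(path)
    chains = []
    # Every subset of the intermediate yards is a candidate set of
    # classification yards for the shipment.
    for mask in product([False, True], repeat=n - 2):
        stops = [0] + [i + 1 for i, used in enumerate(mask) if used] + [n - 1]
        legs = list(zip(stops[:-1], stops[1:]))
        options = [("shuttle", "local") if b - a == 1 else ("through",)
                   for a, b in legs]
        for types in product(*options):
            chains.append([(path[a], path[b], t)
                           for (a, b), t in zip(legs, types)])
    return chains

def count_by_classifications(chains):
    """Group chains by the number of classification operations (legs - 1)."""
    counts = {}
    for chain in chains:
        k = len(chain) - 1
        counts[k] = counts.get(k, 0) + 1
    return counts

if __name__ == "__main__":
    chains = enumerate_strategies(["Y1", "Y2", "Y3", "Y4"])
    print(count_by_classifications(chains))
```

Under these assumptions, a four-yard path yields one strategy without classification, four strategies with one classification, and eight with two, which is consistent with the strategies listed above being a non-exhaustive selection.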

3. Mathematical Model

This section proposes an integer programming formulation for this problem. The model aims to minimize the total costs while satisfying constraints on train and yard capacity. Two cost factors are considered: the economic costs of train operation and the time delay cost of cars at yards during the journey. The model determines the types and frequencies of train services to run between yards and which cars are to be consolidated into a given train service. The following assumptions facilitate the model formulation.

The first assumption is that the shipment routing is given in advance. Each shipment is restricted to one transportation strategy (train service chain), and will not be split during the shipping. Each freight train is arranged when the collected cars reach its size. In general, local trains are unique in their size and operation costs compared with other types; their distinctive details are given in this paper. The positive shipment volumes among the logistics centers mean that local train services running between two adjacent yards must be provided without optimization.

3.1. Notations

The following notations are defined:

: Set of yards in the railway network. , , , , and refer to any yard belonging to .

: Set of yards along the path from yard to , including , and .

: Set of yards adjacent to yard .

: Set of the intermediate logistics centers along the path from yard to yard , . represents the th logistics center, .

: Average car accumulation parameter at yard per day. The accumulation cost of a train is the product of the term and the number of cars in the train (, defined below). This parameter represents the influences of all factors other than the size of the train on the accumulation cost. It depends on the arrived car flow and the level of organizational work at the accumulation yard. Its detailed description and calculation can be found in the literature [24].

: Utilization coefficient of classification capacity at yard .

: Average extra time cost (in hours) per car at the yard compared with when a train passes through the yard without classification.

: Number of shunting lines at yard .

: Classification capacity in terms of the number of cars at yard .

: The original shipment from yard to yard in terms of the number of cars.

: Shipment originating from yard and destined to the intermediate logistics centers located between yard and yard , .

: Average number of cars per train (excluding local trains).

: Average number of cars for local trains.

: Number of intermediate logistics centers along the path from yard to yard , .

: Shipments from logistics center to , , .

: Fixed train operation costs of train (excluding local trains).

: Fixed operation cost of a local train , .

: Average accumulation time in hours for a local train at yard without running a shuttle train.

: Average accumulation time in hours for a local train at yard with running a shuttle train.

: Average travel time in hours of a shuttle train , .

: Average travel time in hours of a local train , .

: Coefficient of converting the train operation cost to the equivalent car-hour consumption.

Three groups of decision variables are defined:

: Its value is 1, if the through train service , is dispatched. Otherwise, it is zero.

: Its value is 1, if the shuttle train service , is dispatched. Otherwise, it is zero.

: Its value is 1, if the cars whose destination is yard take the direct train service at yard . Otherwise, it is zero.

The formulation also requires the following intermediate variables to be defined.

: Actual number of cars from yard to , including the original shipment demand and cars from other yards to the yard that are classified at yard .

: Number of cars allocated to the train service .

: Number of cars classified at yard .

: Operational frequency of local train service , .

: Number of cars from the yard to , .

3.2. Objective Function

The objective of the model is to minimize the total costs comprising those of accumulation, classification, train travel, and train operation. Cars traveling between two adjacent yards should be distinguished from those traveling among other yards. The final objective function is expressed as follows.

where represents accumulation car-hour costs when a shuttle and a local train service are dispatched simultaneously between two adjacent yards; in this case, the shipments between two adjacent yards are transported by the shuttle train service. represents accumulation car-hour costs when only the local train service runs between two adjacent yards, and thus shipments between two adjacent yards are transported by the local train. In these two cases, the cars traveling between two adjacent yards take different types of train generating different travel costs, which can be calculated as . The operation costs of the local train are defined as , while those of shuttle train are . For cars moving between two nonadjacent yards, the costs include those of accumulation , classification , and train operation .

3.3. Constraints

The formulation includes the following constraints:

(1) The cars from yard to yard are either assigned to a through train service without classification or carried indirectly by a sequence of train services.

(2) Cars going from yard to yard can be classified at the intermediate yard only if a train service is arranged.

(3) The number of cars classified at each yard should not exceed its classification capacity.

The value of is calculated as

The value of is calculated as

(4) The number of occupied sorting tracks should be less than the number of available tracks.

Here is a step function reflecting the utilization of the tracks. Assuming that one track can service a maximum of 200 cars, it is determined as follows:

where is calculated as

(5) The operation frequency of the local train needs to meet the transportation demands.

The value of is calculated as

(6) The values of the decision variables , , and can be either 0 or 1.
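The yard-level constraints (3) and (4) can be sketched as a simple feasibility check. This is an illustration under stated assumptions rather than the paper's exact formulation: the step function is taken to be a ceiling of the car flow divided by the 200-car track limit stated above, the usable classification capacity is taken to be the utilization coefficient multiplied by the capacity, and the track limit is treated as "not more than". All function and parameter names are hypothetical.

```python
CARS_PER_TRACK = 200  # from the text: one track serves at most 200 cars

def tracks_needed(cars):
    """Assumed step function: sorting tracks occupied by one car flow."""
    if cars <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return -(-cars // CARS_PER_TRACK)

def yard_is_feasible(flows, classified_cars, capacity, utilization, num_tracks):
    """Check the classification-capacity and sorting-track limits of one yard.

    flows           -- car counts of the train services formed at the yard
    classified_cars -- total number of cars classified at the yard
    capacity        -- classification capacity of the yard (cars)
    utilization     -- utilization coefficient of that capacity (assumed 0..1)
    num_tracks      -- number of available sorting tracks
    """
    occupied = sum(tracks_needed(c) for c in flows)
    within_capacity = classified_cars <= utilization * capacity
    within_tracks = occupied <= num_tracks
    return within_capacity and within_tracks

if __name__ == "__main__":
    print(tracks_needed(201))
    print(yard_is_feasible([150, 250], 400, 500, 0.9, 3))
```

A check of this kind is what a penalty or repair step in a heuristic would evaluate for each yard of a candidate solution.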

4. Solution Approach

The optimization of the train service network design model can be regarded as a combinatorial optimization problem of the decision variables , , and . The total number of decision variables increases exponentially with the number of yards in the network. This is a typical multi-variable NP-hard problem, which can be feasibly solved by heuristic methods such as the genetic algorithm, particle swarm optimization, and simulated annealing. Most many-objective evolutionary algorithms use a one-by-one selection strategy to solve many-objective optimization problems because of their incapability to balance convergence and diversity in the high-dimensional objective space [25]. Considering that the first classification yard of cars between two yards is an integer value, it is convenient to implement encoding in a genetic algorithm, which is therefore applied here to solve the model. The genetic algorithm, first proposed by Holland [26], is an adaptive heuristic search algorithm based on the evolutionary ideas of natural selection and genetics. It is highly general and robust, and is especially suitable for solving combinatorial optimization problems, because the decision variables are easy to represent during the encoding process. It has been used successfully to solve train service network design problems [27–30].

The genetic algorithm includes the following elements: encoding, population size, initial population, fitness function, crossover, mutation, population regeneration, penalty function, and ending conditions. Figure 2 shows the framework of using the genetic algorithm to solve constrained combinatorial optimization problems.

The genetic algorithm is set up as follows to solve this specific optimization model.

(1) Encoding: We set the variable to indicate the first classification yard of the cars from yard to yard as a gene to encode. The value range of each is . Note that the chosen value of the cars between two adjacent yards can be zero to indicate that only the local train service is dispatched. In a railway network with yards, each is set with a specific value and arranged in a fixed sequence, as shown in Figure 3, which forms a chromosome. Each chromosome is a possible solution for the model. Changing the value of each gene in the chromosome yields different chromosomes that constitute the solution space of the problem.

(2) Population size and initialization: The population size has an important influence on the performance of the algorithm. Its preferred value usually depends on the size of the problem. The initial population is obtained by randomly choosing the first classification yard for each car flow.

(3) Fitness function: The fitness function is the same as the objective function:

where represents a chromosome in the population.

(4) Penalty function: The penalty function guarantees that the constraints (formulae 5 and 8) are satisfied.

(5) New population generation: For each chromosome in POP(t), a selection probability is calculated as

The new chromosomes in NewPOP() are selected from the chromosomes in POP() by combining rank selection and roulette wheel selection. The chromosomes initially in POP() are ranked from the largest to the smallest based on the values of . The top chromosomes are chosen directly. The remaining chromosomes are chosen by roulette wheel selection.

(6) Crossover: Two chromosomes in NewPOP() are randomly selected as the parents, and are recombined with probability . This process is performed multiple times to generate multiple offspring; then the fittest offspring of each parent is selected. The offspring will inherit some characteristics of each parent. After the crossover operation, the newly generated chromosomes might not satisfy the constraints in formulae (3) and (4). Therefore, we use the following formula to evaluate the feasibility of the new chromosomes and adjust the value of the genes where necessary to guarantee that the chromosomes in CrossPOP() represent feasible solutions.

(7) Dynamic mutation: Each gene in the chromosomes of CrossPOP() is mutated with probability , which changes at each iteration. The initially chosen value for is decreased by multiplying it by a decay factor until it reaches a specified minimum value . Similarly, after mutation, formula (17) is used to evaluate the feasibility of the newly generated chromosomes and to adjust the values of the genes where necessary to guarantee that the chromosomes in MutPOP() represent feasible solutions. An operation that filters identical individuals retains only one of them in the population; the remaining repeated individuals are updated by the mutation operation.

(8) Ending conditions: The calculation terminates when t equals the defined maximum iteration number.
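Steps (5) and (7) above can be sketched in a few lines. The selection-probability formula is not reproduced here, so this sketch assumes a weight of 1/cost for the roulette wheel (lower cost means higher selection chance); the function names, the `candidates` structure, and the elite count `n_elite` are hypothetical. The feasibility repair and the crossover step are omitted.

```python
import random

def select(population, cost, n_elite, rng):
    """Combine rank selection (keep the best n_elite) with roulette wheel.

    Assumption: roulette weights are 1/cost, so costs must be positive.
    """
    ranked = sorted(population, key=cost)
    elite = ranked[:n_elite]
    weights = [1.0 / cost(c) for c in population]
    rest = rng.choices(population, weights=weights,
                       k=len(population) - n_elite)
    return elite + rest

def mutation_schedule(p0, decay, p_min, generations):
    """Yield the mutation probability used in each generation.

    The probability starts at p0 and is multiplied by `decay` after each
    generation until it reaches the floor p_min.
    """
    p = p0
    for _ in range(generations):
        yield p
        p = max(p_min, p * decay)

def mutate(chromosome, candidates, p, rng):
    """Resample each gene from its candidate values with probability p.

    candidates[i] lists the admissible first classification yards of gene i.
    """
    return [rng.choice(candidates[i]) if rng.random() < p else gene
            for i, gene in enumerate(chromosome)]

if __name__ == "__main__":
    rng = random.Random(0)
    pop = [[1, 2], [3, 4], [5, 6], [7, 8]]
    print(select(pop, sum, n_elite=2, rng=rng)[:2])
    schedule = list(mutation_schedule(0.1, 0.99, 0.01, 300))
    print(schedule[0], schedule[-1])
```

With an initial probability of 0.1, a decay factor of 0.99, and a floor of 0.01, the schedule reaches its floor after a few hundred generations, so mutation is exploratory early on and conservative for the remainder of the run.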

5. Model Testing

5.1. Data Preparation

This section tests the formulation and the proposed algorithm on a realistic railway sub-network in Northeast China, as shown in Figure 4. The network contains 21 yards, numbered 1–21. Table 1 gives the original shipment demands between any two yards. The fixed train operation costs are listed in Table 2. The technical parameters of each yard are shown in Table 3. Note that the operation cost values here are not actual currency costs, but a relative value for comparison purposes. The value of α is defined as 3, the average train size is 55, and the average local train size is 40. The network diagram does not show logistics centers for clarity. The shipments from the yard to the logistics center and between two logistics centers are listed in Appendix A. The Appendix also contains both the average accumulation time for local train at yard with and without a shuttle train running and , the number of intermediate logistics centers between two adjacent yards , the average travel time of a shuttle train , the average travel time of a local train and the fixed operation cost of a local train . The travel path of each shipment is defined in Appendix B. The row header indicates the origin yard, and the column header indicates the destination yard. The cell value is the yard sequence along the route: i.e., . Note that real-world data are processed for some parameters. The dummy data used here are solely for testing the model and algorithm.

5.2. Computational Results and Algorithm Performance

The parameters of the proposed algorithm are set as follows: the population size is 10, the crossover probability is 0.5, the initial mutation probability is 0.1, the decay factor is 0.99, and the minimum mutation probability is 0.01. We executed the algorithm 10 times and observed the variation of the results of each trial. Figure 5 shows the optimization process of all 10 trials. The algorithm converges after 12,000 generations, and the solutions vary from 660,632 to 670,671 (a deviation of less than 2%). The solution time is 447 s. The minimum result of 660,632 is selected as the best solution. Table 4 shows the dispatched train services and frequencies. In total, 163 direct services, including 39 shuttle train services, are arranged. In addition, given that the arcs in the network are directed, 62 local train services are arranged without optimization; they and their frequencies are listed in Table 5. Table 6 shows the value of . The row header indicates the origin yard and the column header indicates the destination yard . The cell value is the first classification yard of the cars going from to . Clearly, if the cell value is equal to , a train service from to is provided. Note that for a pair of adjacent yards and , if the cell value is zero, only the local service is dispatched between them. Consider as an example the cars going from yard 2 to yard 9: they are first classified at yard 3, where they are merged with cars traveling from yard 3 to yard 9; these cars are then reclassified at yard 8, and merged with cars from that yard also going to yard 9; the final leg is the local service from yard 8 to yard 9. Table 7 shows the classified cars at each yard and the utilized sorting tracks at each yard.
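The way a service chain is read off the first-classification-yard table can be sketched as follows. The dictionary below contains only the three entries implied by the yard 2 to yard 9 example above; the rest of Table 6 is not reproduced, and the function name and the `"direct"`/`"local"` labels are hypothetical.

```python
def decode_chain(first_yard, origin, destination):
    """Recover the sequence of legs for cars going from origin to destination.

    first_yard[(i, j)] is the first yard at which cars from i to j are
    classified. A value equal to j means a direct service i -> j; a value
    of 0 (adjacent yards only) means the local service carries the cars.
    """
    legs = []
    current = origin
    visited = set()
    while current != destination:
        if current in visited:
            raise ValueError("first-classification table contains a cycle")
        visited.add(current)
        nxt = first_yard[(current, destination)]
        if nxt == 0:
            legs.append((current, destination, "local"))
            current = destination
        else:
            legs.append((current, nxt, "direct"))
            current = nxt
    return legs

if __name__ == "__main__":
    # Entries taken from the worked example in the text, not from Table 6.
    first_yard = {(2, 9): 3, (3, 9): 8, (8, 9): 0}
    print(decode_chain(first_yard, 2, 9))
```

For the example entries this gives a direct leg from yard 2 to yard 3, a direct leg from yard 3 to yard 8, and a final local leg from yard 8 to yard 9, matching the description above.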

The performance of the proposed improved genetic algorithm (IGA) is highlighted through comparison with two other state-of-the-art algorithms widely used in combinatorial optimization problems: the simulated annealing (SA) [31] and particle swarm optimization (PSO) [32] algorithms. Like the proposed IGA, the SA and PSO algorithms were each run 10 times on the same railway network. The SA solutions (objective values) vary from 668,287 to 681,763, with a variance of 13,580,513 and an average solution time of 501 s. The PSO solutions vary from 661,896 to 673,411, with a variance of 13,288,476 and an average solution time of 487 s.

Table 8 presents the computational results of the three methods, IGA, SA, and PSO.

Table 8 shows that all three meta-heuristic methods find similar solutions for the same problem instance. Among them, the proposed IGA outperforms the others in terms of both solution quality and solution time. Therefore, it represents a good choice for solving the train service network design problem.

5.3. Comparison with the Traditional Assumption-Based Method

Consider another case in which all the shuttle services between two adjacent yards are provided without optimization, which is a widely adopted assumption in the existing literature (see, for example, [12]). The components of the objective function in each case are listed in Table 9. Compared with this situation, the optimization model proposed here reduces the total car-hour costs by 15,363 car-hours and the total number of shuttle train services by 23, which decreases the accumulation costs by 11.2%. The classification, train operation, and train travel costs increase by 4.7%, 0.5%, and 0.4%, respectively. This comparison suggests that assuming beforehand that shuttle train services must be provided between any two adjacent yards tends to result in suboptimal solutions in most cases.

6. Conclusions

This paper proposes a model for the tactical optimization problem of train service network design. It aims to determine the type and frequency of each service provided among yards. The constraints are formulated in terms of classification capacity, number of tracks, and actual operational requirements. The objective function considers the train operation costs and the car-hour consumption due to accumulation, classification, and train travel. An improved genetic algorithm was developed to overcome the difficulties in solving the model, which includes an enormous number of decision variables and complicated constraints. It was tested on a real-world 21-yard Chinese railway sub-network. The approach produced relatively stable results within a reasonable computation time. The results show that, compared with the case in which all shuttle train services between pairs of adjacent yards are dispatched, the optimization model saves cost by reducing the number of services and the total number of dispatched trains. In particular, the decrease in accumulation car-hour consumption makes the largest contribution to the final saving in total cost, although the costs of classification, train operation, and train travel increase slightly. Future research will focus on timetabling based on the train operation plans obtained here.

Appendix

See Tables 10 and 11.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was funded by China Railway Eryuan Engineering Group Co. Ltd. (grant number KYY2019100(19-21)).