A program using a genetic algorithm and automated lookup features was developed to design an efficient overnight passenger rail system for the eastern half of the United States, connecting large cities (metropolitan populations greater than two million). The program predicted that a passenger starting at the farthest point of the system boards the train at 16:02 on average and arrives at a different point of the system at 07:57 on average the following day, assuming the train travels at an average speed of 70 mph. The design used actual distances by train track where possible. The system was modeled with six trains that meet at a hub, exchange passengers, and continue on to their destinations. The optimal solution had a total one-way minimum distance of 4334 km (2693 miles). Assuming the same ridership that currently exists on a popular train route, ticket prices would average $62 (USD) for a one-way ticket. For this system to be feasible, the government would need to own or lease one set of tracks for all the routes determined, build a hub near Charleston, WV, where passengers can transfer between trains, and ensure the trains are unimpeded by other trains. Installing tracks that bypass cities where the trains do not stop would also be a great benefit. Given advances in communication, GPS, and train control technology, this article points out the benefits of publicly available tracks that form a transportation network similar to those for road, air, and water traffic.

1. Introduction

This article explores the possibility of overnight train travel between major cities east of the Mississippi River in the USA. Overnight train travel is common between major European cities, and some distances between major cities in the eastern half of the United States are similar to those between European cities. However, current train travel times in the eastern United States are long. For example, traveling from St. Louis to New York takes more than 33 hours, and traveling from Memphis to Washington, DC, takes more than 37 hours. Detroit to Philadelphia is a driving distance of 940 km (584 miles) and a driving time of 8.5 hours, yet the trip requires nearly 38 hours by train. For a ticket purchased 25 days ahead of travel, the price between St. Louis and New York is the same by rail and by air. Between Memphis and Washington, DC, airfare on a major airline that allows a carry-on and two checked bags is 35% less than rail fare. Between Detroit and Philadelphia, airfare on the same airline is 25% cheaper.

Because the routing problem is challenging, possible solutions were determined using genetic algorithms, which are well suited to difficult optimization problems. In addition, train transportation is the most cost-effective method of transporting passengers anywhere and freight overland.

The energy used to transport cargo per ton-mile has been reported [1], as has the energy used to transport passengers per passenger-mile (accounting for average occupancy in each form of transportation). These results are summarized in Table 1 [2]. Several assumptions are needed to successfully design and implement an efficient passenger rail system; many of them are discussed in the next section.

2. Assumptions

For a successful efficient passenger rail system to be designed and ultimately constructed, some assumptions were needed in the design process. First, the US government would need to own one set of tracks for the routes shown and control train traffic on those tracks. This is similar to what the government does for road, air, and water travel. Another option would be that the government make a financial lease agreement with the private owners of the tracks to allow the passenger trains to travel uninterrupted along their route. Second, a train hub near Charleston, WV, would need to be built to accommodate six trains arriving at nearly the same time and exchanging passengers.

3. Literature Review

Genetic algorithms have been used to determine solutions for complex problems. A genetic algorithm starts from a group (population) of candidate solutions to a problem, often randomly generated. The candidate solutions are evaluated based on their value of an objective function, typically a quantity to be maximized or minimized. The candidate solutions with the best objective function values are recombined, and possibly randomly mutated, to form a new generation of solutions. The new generation of candidate solutions is then used in the next iteration. This process continues until a stopping criterion is met, which is usually a maximum number of iterations or no improvement for a certain number of iterations. The solution found is not necessarily the best or global optimum; other solutions may exist depending on the initial conditions of the candidate solutions of the genetic algorithm.
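The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in this study: the objective (the length of an open path through points in a plane), the order crossover, the swap mutation, and every name and parameter value (genetic_search, pop_size, patience, and so on) are choices made here for illustration only.

```python
import math
import random


def tour_length(order, points):
    """Length of an open path that visits the points in the given order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def order_crossover(p1, p2, rng):
    """Keep a slice of parent 1 in place; fill the rest in parent 2's order."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    kept = p1[i:j]
    kept_set = set(kept)
    rest = [g for g in p2 if g not in kept_set]
    return rest[:i] + kept + rest[i:]


def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, swap two positions (illustrative mutation)."""
    order = order[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(order)), 2)
        order[a], order[b] = order[b], order[a]
    return order


def genetic_search(points, pop_size=40, max_iters=300, patience=50, seed=0):
    """Minimize tour_length; stop at max_iters or after `patience` stale rounds."""
    rng = random.Random(seed)
    n = len(points)
    # Random initial population of candidate orderings.
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    best, best_cost, stale, history = None, math.inf, 0, []
    for _ in range(max_iters):
        # Evaluate and rank candidates by the objective function.
        population.sort(key=lambda o: tour_length(o, points))
        cost = tour_length(population[0], points)
        if cost < best_cost - 1e-12:
            best, best_cost, stale = population[0][:], cost, 0
        else:
            stale += 1
        history.append(best_cost)
        if stale >= patience:  # no improvement for `patience` iterations
            break
        # The better half survives unchanged; children are recombined and mutated.
        parents = population[: pop_size // 2]
        children = [
            swap_mutation(order_crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return best, best_cost, history
```

Because the better half of each generation is carried over unchanged, the best cost never gets worse from one iteration to the next. Different seeds can still end at different orderings, which is the local-optimum behavior noted above.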

Lesiak and Bojarczak [3] explained that genetic algorithms were developed at the University of Michigan by John Holland in the 20th century. They try to imitate the processes of natural selection, in which the fittest individuals are likely to survive and create offspring. This cycle is repeated, and each new generation tends to be better than its parents. These properties of genetic algorithms allow them to solve difficult and complex optimization problems when traditional methods fail. Genetic algorithms use objective function information without the gradient information that most traditional methods use. By doing this, genetic algorithms can search an entire design space and are less likely to get “stuck” in a local minimum.

Ngamchai and Lovell [4] proposed a new model showing how genetic algorithms can be used to optimize bus routing design and frequency for each route. Their new genetic algorithm had seven proposed genetic operators to improve the search in a reasonable amount of time. The genetic operators were route-merge, route-break, route-sprout, add-link, remove-link, route-crossover, and transfer-location genetic operators. Their model was applied to a benchmark network and it was determined that the genetic algorithm presented is more efficient than a typical binary coded genetic algorithm.

Chung et al. [5] used a hybrid genetic algorithm for train sequencing in the Korean railway. The algorithm attempted to even out wear on train engines and ensure that each depot, which has a finite overnight stay and maintenance capacity, is not overloaded. They concluded that the practical and operational considerations involved in operating a railway are difficult and complex and that research into this topic should continue. They also recommended using the genetic algorithm in lower-level train routing problems.

Zhou et al. (2017) [6] used genetic algorithms to optimize the train-set circulation problem. The model optimized the number of required train-sets and their maintenance times in a high-speed rail system. A multiple-population genetic algorithm was designed to solve the train-set circulation problem. The model and algorithm were tested on the Beijing-Shanghai high-speed rail system, and the results showed the approach is both feasible and efficient for producing a good-quality train-set circulation plan. Based on the results, a new maintenance mode was proposed that could greatly improve train-set utilization efficiency.

Gholami and Sotskov [7] developed a genetic algorithm that routed and scheduled trains to achieve efficient and robust train routes. The algorithm was allowed to change the start times of the trains to reduce the number and total duration of train delays, where a delay is the time a train is idle waiting for another train on the tracks. The algorithm can also change some parts of a train's path to its destination for more efficient travel; this is particularly useful when there is only one set of tracks between stations. The genetic algorithm was tested with data from a railway company as a benchmark. The numerical testing included five to fifteen trains. The results show the delay time was reduced from 57 to 32 minutes with five trains and from 190 to 85 minutes with fifteen trains. With fifteen trains, allowing the genetic algorithm to change a portion of the train path significantly reduced the delay time.

Li et al. (2013) [8] produced a compound model for train routing and scheduling problems for direct transport in a single-track railway network, formulated as a mixed-integer nonlinear programming model. The model considered constraints of headway, trip time, meeting/crossing/overtaking between trains, and the capacity of side-tracking. The model results showed only a small error relative to the optimal solution. The optimal solution produced by the model changed slightly depending on the initial condition. The results showed that different departure times of the fast train exerted a large impact on the choice of the train route and train delay.

Wang et al. (2019) [9] addressed train-set utilization in high-speed railway transportation hubs by finding the optimal train-set circulation plan. A train-set circulation plan model was established to capture the relationships between train-sets, trip tasks, and maintenance. A genetic algorithm was designed to solve the model. A case study based on the Nanjing and Shanghai high-speed rail transportation hubs showed that a more efficient train-set circulation plan can be created by dispatching train-sets among different train stations in the same hub.

Xu et al. (2015) [10] improved the utilization rate of railway tracks and reduced train delays by developing a high-efficiency train routing approach for double-track railway corridors where trains are allowed to travel on reverse-direction tracks. They designed an improved switching policy that analyzes possible delays caused by different path choices. They tested their switching policy for scheduling of heterogeneous trains on the Beijing–Shanghai high-speed railway. Their improved switching policy reduced the total delay of trains by 44% and 73% compared to the original switching policy and no switching policy, respectively.

Khaled et al. (2015) [11] created an optimization model for routing trains in a disruptive situation to minimize the system-wide total cost. The optimization model determined the number of trains, their routes, and associated blocks subject to various capacity and operational constraints at rail yards. A case study was conducted for a major US Class-I railroad based on publicly available data. The results identified the minimum increase in total cost and, from the viewpoint of strategic planning, the critical infrastructure in the case study.

Xu et al. [12] considered service quality, i.e., passenger transit time, and energy efficiency to develop a multiobjective timetable optimization approach for subway systems. First, they analyzed the passenger flow at stations and developed a profile of speed and energy efficiency. Then, they formulated a multiobjective optimization to minimize total passenger time (waiting and traveling time) and energy consumption. They used a linear weighted compromise approach and a fuzzy linear programming approach to find a suboptimal solution utilizing a genetic algorithm. They conducted a case study with data from the Beijing Yizhuang and 4-Daxing subway lines in China to test their model. The results showed that passenger waiting time and energy consumption can be reduced during both peak and off-peak hours.

4. Model Development

Major cities in the eastern half of the US with a metropolitan population of 2 million or more people were chosen. Cities farther north than New York City, Detroit, and Chicago were not chosen because the objective was overnight travel arriving around 08:00 the next morning, and cities too far away would not meet that objective. By the same reasoning, cities farther south than Atlanta, GA, and cities west of the Mississippi River were not chosen. The one exception is St. Louis, MO, which is just west of the Mississippi River but essentially lies on it.

A central hub was chosen near the average latitude and longitude of the selected cities, with each city equally weighted. This average fell approximately at Charleston, WV; hence, that city was chosen as the hub. It is anticipated that the trains meet at the hub at 30 minutes past midnight, and passengers can change trains if needed or remain on their train if it is going to their destination. The best solution would be for each train to continue through the hub to a final destination on the opposite side of the hub from its origin, allowing some passengers to remain on their original train. Each train would then return to its origin city during the overnight trip that begins the following day.
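The equal-weight averaging used to place the hub can be written as a short function. The sketch below is illustrative only: the name mean_lat_lon is hypothetical, and the coordinates are placeholders, not the coordinates of the cities in this study. A plain arithmetic mean of latitude and longitude is an approximation that is reasonable over a region of this size.

```python
def mean_lat_lon(cities):
    """Equal-weight arithmetic mean of (latitude, longitude) pairs in degrees."""
    lats = [lat for lat, _ in cities]
    lons = [lon for _, lon in cities]
    return sum(lats) / len(lats), sum(lons) / len(lons)


# Placeholder coordinates for illustration, NOT the study's city list.
example_cities = [(40.0, -80.0), (30.0, -90.0), (35.0, -85.0)]
hub_lat, hub_lon = mean_lat_lon(example_cities)
print(hub_lat, hub_lon)
```

Weighting every city equally, rather than by population, places the hub at the geographic middle of the network and so tends to even out the travel distance of the trains.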

In the algorithm that determined the optimum routing of trains, distances between the current train stations of the selected cities were obtained from the Distance Matrix API by requesting both the distance by passenger train, with appropriate starting times, and the driving distance [13]. If the passenger train distance was less than 1.2 times the driving distance, the train distance was used; otherwise, the driving distance was used. This rule allowed distances of current passenger train travel to be used where appropriate.
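The 1.2 rule can be expressed as a small function. This is a sketch rather than the code used in the study: the function name, the returned label, and the handling of a missing rail result (rail_km is None) are assumptions made here, and the API request itself is deliberately left out.

```python
def select_distance(rail_km, driving_km, max_ratio=1.2):
    """Pick the rail distance if it is below max_ratio times the driving distance.

    rail_km may be None when no passenger rail result is available
    (an assumption of this sketch); the driving distance is then used.
    """
    if rail_km is not None and rail_km < max_ratio * driving_km:
        return rail_km, "rail"
    return driving_km, "driving"


# Illustrative values only, not distances from the study.
print(select_distance(1000.0, 940.0))   # rail is within 1.2x of driving
print(select_distance(1500.0, 940.0))   # rail route is too indirect
```

The threshold keeps existing rail distances where the track roughly follows the direct route and falls back to the driving distance where the only existing rail connection is a long detour.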

The number of trains was chosen to allow passengers to travel to their destination by about 08:30 the following morning. Too few trains would lead to travel times so long that passengers would not reach their destination the following morning. Too many trains add expense to the system. The model was run with 4, 5, 6, and 7 trains, and it was determined that 6 trains were acceptable for passengers to arrive the next morning.

5. Solution

The solution is shown graphically using an acronym for each city. Table 2 lists each acronym and the full name of the city.

The model used was a multiple traveling salesperson genetic algorithm with a fixed starting point and open ending point [14]. Each train was considered to be a salesperson in the model. The model required that each salesperson stop in at least two cities in addition to the central hub. Three solutions were routinely obtained. Solution 1, shown in Figure 1, has approximately the same distance to be traveled by all trains and had the shortest total one-way distance of any solution, 2693 miles (4334 km). Figure 2 shows the same optimal solution and indicates which sections were determined from existing passenger train travel; the other sections were determined from driving distances.
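One common way to represent a candidate solution for a multiple traveling salesperson problem is a permutation of the cities plus a list of break points that split it into one route per train. The sketch below assumes that encoding and takes the hub as the fixed starting point of every route; the implementation in [14] may differ, and all names (decode, is_feasible, total_distance) are hypothetical.

```python
def decode(perm, breaks):
    """Split a city permutation into routes at the given break indices."""
    cuts = [0, *breaks, len(perm)]
    return [perm[a:b] for a, b in zip(cuts, cuts[1:])]


def is_feasible(routes, min_cities=2):
    """Each train must stop in at least min_cities cities besides the hub."""
    return all(len(route) >= min_cities for route in routes)


def total_distance(routes, dist, hub):
    """Sum of one-way route lengths; each route starts at the hub and is open-ended."""
    total = 0.0
    for route in routes:
        stops = [hub, *route]
        total += sum(dist[a][b] for a, b in zip(stops, stops[1:]))
    return total


# Toy data for illustration: five made-up cities, every leg 1.0 unit long.
names = ["HUB", "A", "B", "C", "D", "E"]
dist = {a: {b: 1.0 for b in names} for a in names}
routes = decode(["A", "B", "C", "D", "E"], breaks=[2])
print(routes, is_feasible(routes), total_distance(routes, dist, "HUB"))
```

A genetic algorithm would then evolve the permutation and the break points together, rejecting or penalizing candidates for which is_feasible returns False.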

Genetic algorithms often start with a random solution set that is improved through iterations; this means that there could be more than one solution and that each solution may be only a local optimum. Two other solutions sometimes occurred; they are shown in the appendix. These other solutions had longer total distances and had one train traveling a much farther distance, which does not lead to all trains meeting at a hub simultaneously and also does not lead to passengers arriving at the destination by 08:30 the following day.

6. Discussion and Analysis of Solution

Table 3 shows the results of the calculations performed from the solution of the mass transit system. It was assumed the train averages 70 mph. Currently, most train tracks are rated for 80 mph, but trains lose time slowing down before a station, stopping at the station, and speeding back up after the station. The data in Table 3 were calculated assuming all trains arrive at 30 minutes past midnight and passengers switch trains without any loss of time. The time needed for switching trains is instead absorbed by assuming the trains average 70 mph, slower than the 80 mph rating of most tracks. Table 3 also shows that the trains must leave their first city between 14:50 and 16:57 to arrive at the hub at 30 minutes past midnight. The table shows that trains arrive at their final city between 06:49 and 09:18 the following day. The times shown are local times; all cities are in the US eastern time zone except CHI, STL, MEM, and NASH, which are in the US central time zone. Table 4 shows the arrival and departure times of the train at each city, in local time.
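The timing logic behind such a schedule can be sketched as follows, assuming a constant average speed and a hub time expressed in eastern time. The function names, the local_offset_hours parameter (-1 for a central time zone city), the calendar date, and the example distances are all hypothetical; they are not values from Tables 3 or 4.

```python
from datetime import datetime, timedelta


def departure_time(miles_to_hub, hub_arrival, avg_mph=70.0, local_offset_hours=0):
    """Local time a train must leave to reach the hub at hub_arrival."""
    travel = timedelta(hours=miles_to_hub / avg_mph)
    return hub_arrival - travel + timedelta(hours=local_offset_hours)


def arrival_time(miles_from_hub, hub_departure, avg_mph=70.0, local_offset_hours=0):
    """Local time a train reaches a city after leaving the hub at hub_departure."""
    travel = timedelta(hours=miles_from_hub / avg_mph)
    return hub_departure + travel + timedelta(hours=local_offset_hours)


# Trains meet at the hub at 00:30 eastern time (the date is arbitrary).
hub_meet = datetime(2024, 1, 2, 0, 30)

# Made-up distances for illustration only.
print(departure_time(560, hub_meet))                          # eastern city
print(departure_time(560, hub_meet, local_offset_hours=-1))   # central city
print(arrival_time(490, hub_meet))
```

Stops at intermediate cities are not modeled separately here; as in the text, they are assumed to be covered by using an average speed of 70 mph instead of the 80 mph track rating.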

Table 5 shows the cost per passenger to ride this mass transit system. This initial cost calculation is difficult because it is unknown how many passengers are traveling and what their destinations are. The first attempt at cost assumed that each passenger would pay the same amount. The cost to operate the trains was obtained from written communication from a passenger train company that stated their cost to operate a train was $25 per mile. The total miles of the system were obtained by summing the distance of all routes in both directions, recognizing that all trains will travel the same total distance after meeting at the hub, even if the same train does not return on the same tracks. Ticket prices were calculated for ridership varying from 600 to 1200 riders per train. This range was chosen because it was reported that a popular passenger train that currently travels in this region averages 900 riders per journey. Ticket prices assuming only 600 passengers per train were approximately $93, which is considerably less than airfare between the East Coast and the Midwest. In addition, the train system would not require parking costs or transportation costs to the airport. If there were 1200 passengers per train, the ticket price would only be about $46, which is a significant saving compared to one-way airfare. If the ridership could be 900 or more passengers per train, this would be a cost-effective way to travel, especially for people who can easily get to the downtown of a major city and wish to stay in the downtown of the city they arrive in.
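Under the equal-share assumption, the ticket price is the operating cost divided by the number of riders, so it scales inversely with ridership. The sketch below illustrates this. The function names and the train_miles argument are hypothetical, and the exact mileage accounting behind Table 5 is not reproduced here; only the $25 per mile figure and the reported prices at 600, 900, and 1200 riders are taken from the text.

```python
def ticket_price(cost_per_mile, train_miles, riders):
    """Equal-share ticket price: operating cost split evenly among riders."""
    if riders <= 0:
        raise ValueError("riders must be positive")
    return cost_per_mile * train_miles / riders


def rescale_price(price, riders_from, riders_to):
    """Price at a new ridership when the operating cost is unchanged."""
    return price * riders_from / riders_to


# Hypothetical mileage, used only to show the formula.
print(ticket_price(25.0, 1000.0, 500))

# Starting from the reported price of about $93 at 600 riders per train:
print(rescale_price(93.0, 600, 900))    # consistent with the $62 average
print(rescale_price(93.0, 600, 1200))   # consistent with about $46
```

The inverse scaling shows why ridership is the dominant uncertainty in the cost estimate: doubling the riders per train halves the equal-share fare.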

7. Conclusions

It appears that overnight train travel is currently feasible between major US cities east of the Mississippi River, given that the infrastructure is provided as stated in the Assumptions. The results also show that the new system is time-effective because trains stop only at major cities and travel through the night, and passengers leave the train downtown, likely close to where they wish to be for the day. The new system is also cost-effective, as fares would be considerably lower than airfare. One way to improve transit time would be to build tracks that bypass cities where the train does not stop. Trains must slow down through cities even if they are not stopping, so bypassing these cities without significantly slowing down would save time. Another way to improve transit time is to convert certain tracks to high-speed travel, particularly tracks that are currently relatively straight and flat, as these are the most cost-effective to convert. Extremely fast high-speed rail is not needed since the trains are traveling overnight; speeds could remain less than 100 mph, but raising speeds on sections of track where that is easy would especially help the routes that travel the farthest to reach the hub, such as those of the trains leaving NYC and MEM.

If the entire system is too much to implement at one time, a starting point could be the most popular route, likely based on the cities with the greatest population, with the system expanding once that route becomes popular. As the system becomes more popular, an outer loop could be constructed that allows trains to travel around an outer loop of cities overnight. Passengers on these trains would not connect with other trains. Possible cities well suited for overnight travel around an outer loop would be NYC – PHL – DC – CHAR – ATL; NYC – CLE – CHI; and CHI – STL – MEM – ATL.

One downside of this travel system that might be mentioned is that the travel system only assists passengers living in large cities. However, if the system becomes successful, transportation would likely become available from small cities, or groups of small cities, to a large city arriving there shortly before the train arrives. Then, the passengers living in small cities could access this network. The current setup of train transportation that stops in many small cities does not greatly benefit the potential passengers of the small cities, because the train takes so long to reach the final destination that many potential passengers drive their own vehicle to their destination or to a major city and fly to their destination.


See Tables 6, 7, and 8 and Figures 3, 4, 5, 6, and 7.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.