#### Abstract

The introduction of customized bus (CB) service intends to expand and elevate existing transit service, which offers an efficient and sustainable alternative to serve commuters. A probabilistic model is proposed to optimize CB service with mixed vehicle sizes in an urban setting considering stochastic bus arrival time and spatiotemporal demand, which minimizes total cost subject to bus capacity and time window constraints. The studied optimization problem is combinatorial with many decision variables including vehicle assignment, bus routes, timetables, and fleet size. A heuristic algorithm is developed, which integrates a hybrid genetic algorithm (HGA) and adaptive destroy-and-repair (ADAR) method. The efficiency of HGA-ADAR is demonstrated through numerical comparisons to the solutions obtained by LINGO and HGA. Numerical instances are carried out, and the results suggested that the probabilistic model considering stochastic bus arrival time is valuable and can dramatically reduce the total cost and early and late arrival penalties. A case study is conducted in which the proposed model is applied to optimize a real-world CB service in Xi’an, China. The relationship between decision variables and model parameters is explored. The impacts of time window and variance of bus arrival time, which significantly affect service reliability, are analysed.

#### 1. Introduction

Public transportation agencies are experimenting with on-demand and shared technologies to augment traditional fixed-route bus services, such as flexible-route bus services [1], demand responsive transit (DRT) [2], ridesharing, microtransit [3], and customized bus (CB) service [4]. These new, flexible transit solutions have tremendous potential to expand agencies’ service areas, attract new riders, fill transportation gaps, and provide more effective and sustainable ways to reach low-density communities and other traditionally hard-to-serve situations.

As an emerging transportation mode, CB is similar to microtransit, which offers another option between the pricey convenience of taxis and slow, cheaper public transit. CB service is focused on providing commuters with fast and comfortable commuting buses in morning and evening rush hours. It allows commuters to make reservations for trips by submitting the information related to spatiotemporal demand (i.e., origination and destination (OD) and desired arrival time) and arranges buses to serve the shared commuters. The service plan is updated on a short-term basis subject to the passenger demand change. To ensure service quality, CB does not take walk-in riders without reservation [5].

The major challenges that need to be addressed by operators are the following problems: how many buses to be used and how to determine bus types, assignments, routes, and timetables to serve passengers. To comprehensively assess the feasibility and cost-effectiveness of CB service, the operators need to take a holistic approach that can jointly determine vehicle assignment, routes, timetables, and fleet size with the spatiotemporal passenger demand.

In addition, bus travel time on the route as well as bus arrival time at the stop is affected by recurring and non-recurring congestion in urban settings. Optimizing service planning with deterministic travel time might underestimate the cost and result in an unexpected level of service. Moreover, the CB customers would expect to arrive at destinations within the acceptable time windows, which makes it challenging for optimizing bus routing, scheduling, and timetabling.

This paper aims to optimize the CB service with a mixed bus fleet to minimize total cost, considering stochastic bus arrival time, which is known to be an NP-hard problem. A hybrid heuristic algorithm that integrates the features of genetic algorithm (GA), simulated annealing (SA), and adaptive destroy-and-repair (ADAR) is proposed to efficiently search for the approximate optimal solution. The HGA hybridizing GA and SA [5] performs well in global exploration but spends much time converging to the optimal solution. In contrast, local search methods, such as ADAR, can find the optimal solution quickly. So, we develop an HGA-ADAR algorithm by incorporating ADAR into HGA to improve its performance to reach the optimal solution in the region of convergence.

In summary, the main contributions of this study are as follows:(1)An important consideration of CB service planning is addressed and integrated in our model, namely, stochastic bus arrival time. Furthermore, some practical issues, such as mixed fleet size and multi-terminal, are also considered. The proposed model can optimize bus assignment, routes and timetables, and mixed fleet size simultaneously.(2)A hybrid heuristic algorithm that integrates the features of GA, SA, and ADAR is proposed to efficiently search for the approximate optimal solution.(3)The relationship between model parameters and decision variables is explored, which may provide some decisions and strategy supports for operating CB service in the future.

The rest of this paper is structured as follows. Section 2 summarizes an overview of related existing literature. Section 3 describes the developed probabilistic model, including the objective function and associated constraints. The developed HGA-ADAR is discussed in Section 4, while the model and algorithm performance and efficiency are assessed in Section 5. Section 6 presents a case study, in which the applicability of the developed model is presented, and the optimized results are discussed. The relationship between decision variables and model parameters is explored in the results of sensitivity analysis. Finally, the research findings with a future study are presented in Section 7.

#### 2. Literature Review

The discussion of the literature review covers previous studies in transit planning with deterministic bus arrival time, vehicle routing problem with stochastic travel time, bus arrival time distribution, and feasible solution algorithms.

##### 2.1. Transit Planning with Deterministic Bus Arrival Time

Considering deterministic bus arrival time, Chien [6] optimized route choice, headway, and bus capacity for a feeder bus system, which minimized total cost. Ulusoy et al. [7] and Ulusoy and Chien [8] optimized the integration of all-stop, short-turn, and express service considering heterogeneous demand. Later, Qu et al. [9] enhanced the model by taking time-dependent demand into account. Chowdhury and Chien [10] optimized fare and headway considering demand elasticity that maximized total profit. Chen et al. [11] simultaneously optimized bus routing, operating frequency, wireless power device locations, and battery capacity for a multi-route bus network, which minimized total cost.

Several previous studies focusing on optimizing CB service considered deterministic bus arrival time. Tong et al. [12] optimized passenger assignment and routes considering spatiotemporal windows that maximized ridership. Guo et al. [13] proposed a model for optimizing passenger assignment and routing that minimized the total cost, and later they enhanced the model by considering time window [14] and time-dependent bus arrival time and path flexibility between stops [15]. Lyu et al. [16] introduced a planning framework to optimize stop location, routing, timetabling, and passengers’ probability of choosing CB simultaneously, which maximized total profit. Huang et al. [17] modelled the decision-making process for CB service and optimized passenger assignment and bus routing that maximized total profit. Sun et al. [5] optimized trip assignment, routing, timetabling, and bus fleet size for CB service considering the mixed fleet size and multi-terminal, which minimized total cost.

In summary, to the best of our knowledge, the optimization of CB service considering stochastic bus arrival time has not been studied in the above studies. This variant constitutes an important extension to the classical CB service planning that has not been investigated in the literature for which we believe it is worth exploring.

##### 2.2. Vehicle Routing Problem with Stochastic Travel Time

In the field of operational research, the optimization of CB service considering stochastic bus arrival time is an extension of stochastic vehicle routing problem (SVRP) [18, 19]. Gendreau et al. [20, 21] presented some comprehensive surveys on SVRP and made a classification of the articles relying on different stochastic parameters. The authors argued that the common stochastic parameters considered are stochastic demand, stochastic customer, and stochastic travel time. In this paper, we focus on the consideration of stochastic travel time. Thus, we reviewed the literature related to vehicle routing problem with stochastic travel time (VRPSTT) emphatically.

Kenyon and Morton [18] optimized VRPSTT considering two objectives: minimizing the expected completion time that all vehicles will return to the depot and maximizing the probability that the trip is complete on or before a prespecified deadline. Li et al. [19] optimized an VRPSTT, which minimized total transportation cost. It was found that evaluating the objective function of stochastic problems would consume lots of computation time. Adulyasak and Jaillet [22] studied an VRPSTT with soft time windows, where the service at each customer should be within a predetermined time window or a violation occurs. The problem aimed to construct the routes to minimize the sum of probability of time window violations. Errico et al. [23] solved an VRPSTT with hard time windows, where a penalty is paid whenever the service at a customer is skipped. The objective function was to minimize the total expected travel cost and penalty cost.

The optimization of CB service is a more complex problem than the conventional VRP because it incorporates several specific model constraints, such as time windows, bus capacity, pairing, and precedence [24]. However, previous models discussed above focused on general VRPSTT, which are not suitable for CB service. Besides, the trade-off between operator cost and user cost is the main consideration in CB service, which is different from general VRPSTT.

##### 2.3. Bus Arrival Time Distribution

Bus arrival time inherently fluctuates due to various interruptions caused by traffic signals, accidents, unplanned road works, and adverse weather [25–27]. The distributions of bus arrival time were investigated and assumed that they follow normal [18, 19, 22, 28–30], log-normal [31–33], and gamma [34, 35] distributions.

Assuming normally distributed bus arrival time, Chen et al. [36] optimized limited-stop bus service (i.e., stop-skipping), which minimized total cost. Results found that the randomness of bus arrival time should not be ignored for transit planning. Xiao et al. [37] optimized scheduled arrival time and slack time for a transfer hub of a bus transit considering the probabilistic bus travel time and walking speed, which minimized total system cost.

Considering log-normally distributed bus arrival time, Chien et al. [31] proposed a probabilistic model for optimizing disseminated bus arrival time for pretrip passengers, which minimized total wait time. Prakash [32] studied on determining the most reliable routes on the stochastic and time-dependent network, which maximized on-time arrival probability.

Taş et al. [34] developed a model for optimizing an SVRPSTT with soft time windows considering bus arrival time following a gamma distribution, which minimized the sum of transportation and service costs. Later, they enhanced that model, considering time-dependent stochastic bus arrival time [38].

##### 2.4. Solution Algorithm

As discussed above, the studied CB optimization problem is an extension of VRP. Thus, it belongs to the NP-hard problem. An efficient algorithm is desired to search for the optimal or approximate optimal solution. Genetic algorithm (GA) is generally employed to deal with the NP-hard problem [5, 39]. Sun et al. [5] used a hybrid GA to optimize the routing and scheduling for a CB service. Chen et al. [11] developed a nested GA for planning an electric feeder system. GA has a great ability of global search in solution space, but it does not employ a local search and spends much time to converge to the optimal solution. So, it needs some modifications to improve the convergence and reduce the computation time.

The neighbourhood search method, such as ADAR, which is a local search-based strategy, has shown excellent performance in solving routing problems [40] and finds the optimal solution very quickly. Kitjacharoenchai et al. [41] developed an ADAR method that employs three destroy operators and three repair operators to solve a VRP. Dong et al. [42] optimized stop plans and timetables for commuter railways with an ADAR method by employing multiple destroy and repair operators. It was found that the local search-based strategy is easy to integrate with other algorithms to effectively find the optimal solution [43–45]. Soto et al. [43] proposed a local search hybridized with a tabu search (TS) to deal with a multi-depot VRP. Moshref-Javadi et al. [45] proposed a metaheuristic algorithm, in which several local search strategies were used to improve the solution obtained by SA, and it was proved to be able to solve VRP efficiently. Masmoudi et al. [46] introduced a hybrid GA to deal with a DARP by integrating local search. Belhaiza [44] designed a heuristic that integrated GA and ADAR to solve a DARP. Baniamerian et al. [47] proposed a hybrid heuristic algorithm based on variable neighbourhood search (VNS) and GA.

Based on the above literature review, we proposed a probabilistic optimization model to assist a transit agency in planning a CB service in the context of the stochastic environment. To the best of our knowledge, there are no studies on CB service optimization problems considering stochastic bus arrival times. Table 1 presents a comparison of our study with previous studies. It is found that the studied optimization problem is combinatorial with many decision variables including vehicle assignment, bus routes, timetables, and fleet size. Unlike previous studies, some practical issues, such as mixed fleet, multi-terminal, time window, and stochastic bus arrival time, are also considered. None of the previous research has covered all the above issues. Therefore, this study will fill the gap in comprehensive optimization for CB service and provide a significant basis for new areas of research to explore in the future.

#### 3. Methodology

In this paper, a probabilistic model is formulated to optimize a CB service (i.e., vehicle assignment, routing, scheduling, and mixed fleet size) considering stochastic bus arrival time, which minimizes the total cost including operator cost, user cost, and early and late arrival penalties subject to some practical constraints. It is worth noting that the passengers need to be classified into groups before the CB service is optimized, and we adopt the passenger partition approach given in Sun et al. [5]. In the research of Sun et al., the authors partitioned passengers into groups based on the information submitted by passengers (i.e., OD information and desired arrival time) and a prespecified time interval.

##### 3.1. Assumptions

To formulate the developed model, the following assumptions were made:(1)The spatiotemporal passenger demand of the study area is known.(2)Bus sizes and associated information are given (i.e., capacity and operating cost).(3)Buses are dispatched from a terminal and return to a different one after finishing a trip.(4)In the advent of real-time information of bus arrivals, passengers’ wait time at pick-up stops can be ignored.

The model is described on a complete graph , which consists of a set of nodes denoted as *N* and a set of links denoted as *A*. *N* consists of a set of terminals denoted as , a set of pick-up stops denoted as *N*^{+}, and a set of drop-off stops denoted as *N*^{−}. , *N*^{+}, and *N*^{−} are mutually exclusive. Table 2 lists the parameters and variables used to formulate the proposed model.

##### 3.2. Objective Function

The objective function minimizes total cost denoted as ZT which consists of operator cost denoted as ZO, user in-vehicle cost denoted as ZU, and early and late arrival penalties caused by stochastic bus arrival time at drop-off stops denoted as ZP. Thus,

ZO consists of variable cost denoted as ZV and fixed cost denoted as ZF. ZV is the product of average variable cost per bus-hour *υ*_{b} and travel time , while ZF is the product of fixed cost per bus-hour *f*_{b}, number of buses used , and service time horizon *o*.where *x*_{ijb} is a binary variable, and *x*_{ijb} = 1 if a bus of type *b* serves link (*i*, *j*); otherwise, *x*_{ijb} = 0.

ZU is the product of value of time *λ*_{3}, in-vehicle time, and the number of passengers served *q*_{ij}. The in-vehicle time from pick-up stop *i* to drop-off stop *j* by a bus of type *b* is denoted as .where *y*_{ib} is a binary variable, and *y*_{ib} = 1 if stop *i* is served by a bus of type *b*; otherwise, *y*_{ib} = 0. is the arrival time of a bus of type *b* at stop *j*, which follows a probability distribution (i.e., ), as shown in Figure 1.

It is assumed that the expected bus arrival time at stop *j* is , and the acceptable arrival time window is tw. The earliest and latest arrival times within the time window are *e*_{j} and *l*_{j}, respectively (i.e., *e*_{j} = − tw/2, *l*_{j} = + tw/2). Figure 1 shows a probability distribution of arrival time for a bus of type *b* at stop *j* (i.e., ). Bus arrivals outside the time window are classified into (1) early arrival represented by area I where the bus arrives before the start of time window *e*_{j} (e.g., ) and (2) late arrival represented by area II where the bus arrives after the end of time window *l*_{j} (e.g., ).

The penalty incurred by early and late bus arrivals is denoted as ZP that can be determined based on the probability of bus arrival time at drop-off stops. For example, penalty caused by the early arrival of a bus of type *b* at stop *j*, denoted as PE_{jb}, is the product of unit penalty for early arrival *λ*_{1}, number of passengers alighting from stop *j* denoted as |*Q*_{jb}|, the probability of early arrival , and early arrival time. The early arrival time is the elapsed time from the actual arrival time to the start of the time window denoted as . Thus,

Similarly, the penalty for late arrival of a bus of type *b* at stop *j* denoted as *PL*_{jb} is formulated aswhere *λ*_{2} is the unit penalty for late arrival.

Finally, ZP represents the total early and late arrival penalties. Thus,

To minimize ZT, several constraints (e.g., vehicle assignment, routing, scheduling, and bus capacity) are taken into account. Vehicle assignment constraints are formulated in equations (7)–(9). Equation (7) states that each stop might be served by at least one bus, while equations (8) and (9) ensure that the dispatched bus of type *b* must be linked to at most one start (and end) terminal.

Bus routing constraints are stated in equations (10)–(12). Equations (10) and (11) make sure that each route begins from and ends at a terminal. Equation (12) represents a route continuity constraint.

Equation (13) is subtour elimination constraint. *u*_{ib} and *u*_{jb} are auxiliary variables, while *n* represents the number of all stops. If stop *j* is an immediate downstream stop of stop *i* served by a bus of type *b*, *x*_{ijb} is equal to 1, which implies that *u*_{jb} ≥ *u*_{ib} + 1 > *u*_{ib}. A subtour (*i*, *j*, …, *i*) without a terminal will be prevented by *u*_{ib} *u*_{jb} > *u*_{ib}.

Bus capacity constraints are defined in equations (14)–(17). Equation (14) represents that number of in-vehicle passengers is zero as the bus departs from a terminal. *Q*_{jb} in equation (15) is positive if stop *j* is a pick-up stop; otherwise, negative if stop *j* is a drop-off stop. Equation (16) tracks the number of in-vehicle passengers according to the sequence of the visited stops and the number of passengers boarding/alighting at each stop. Equation (17) makes sure that the number of in-vehicle passengers shall not exceed bus capacity.

Equation (18) restricts that bus arrival time at node *j* must greater than that at the previous node *i*, while equation (19) ensures that each pick-up and drop-off stop pair is served by the same bus and the pick-up stop is visited before its drop-off stop.

Finally, decision variables on routing and scheduling are defined in equations (20)–(22).

#### 4. Solution Algorithm

The studied CB problem is an extension of VRP. Thus, it belongs to the NP-hard problem which is difficult to solve by exact algorithms, especially for large networks. An HGA-ADAR algorithm is developed, which starts with an initial solution generated by HGA followed by performing the developed ADAR to effectively find the solution.

Genetic algorithm (GA) has been widely and successfully applied to solve various vehicle routing and scheduling problems, which is an important stochastic global search algorithm but suffers some issues with premature convergence and local optimum. Simulated annealing (SA) is a metaheuristic to approximate global optimization in a large discrete space (e.g., VRP), which can be integrated with GA, called HGA [5], to improve the performance for searching the optimal solution. However, the computation time consumed by the HGA significantly escalates in a transportation network considering stochastic travel time. The adaptive destroy-and-repair (ADAR) method is a local search-based method, employing destroy and repair operators that satisfy both diversify and intensify searching for the optimal solution, which can find the optimal solution very quickly. The advantage of hybridizing global search in form of HGA with local search in form of ADAR is the improvement in the convergence speed to the optimal or near-optimal solution [47].

##### 4.1. Solution Evaluation

A fitness function *F* (*S*) is considered to evaluate a solution *S*, which is formulated as the sum of the objective total cost as equation (1) and the penalty of bus capacity violation, denoted as ZT (*S*) and PC (*S*), respectively. Thus,

Note that if solution *S* meets the bus capacity constraint, PC (*S*) is equal to 0.

##### 4.2. Chromosome Encoding

Chromosome is used to represent the solution, which is an integer string, consisting of a set of substrings. Each substring is a sequence of nodes visited by a bus. For example, for a solution with 2 routes and 5 pick-up and drop-off stop pairs (PD pairs), the chromosome coding is as follows: 0-2^{+}-4^{+}-2^{−}-4^{−}-0-1^{+}-1^{−}-3^{+}-5^{+}-5^{−}-3^{−}-0, where 1^{+}, 2^{+}, 3^{+}, 4^{+}, and 5^{+} represent pick-up stops, while 1^{−}, 2^{−}, 3^{−}, 4^{−}, and 5^{−} represent drop-off stops. 0 represents a terminal, which divides the chromosome into multiple routes. The start and end terminals are, respectively, contingent on the proximity of the nearest terminals to the first stop and the last stop. Note that the chromosome is randomly generated in this study.

##### 4.3. Simulated Annealing (SA)

SA is integrated to prevent HGA-ADAR from getting trapped into a local optimum, which conditionally accepts a slightly worse solution based on a probability, denoted as *P*, estimated bywhere *F* (*S*) and *F* (*S*″) are the fitness values of current solution *S* and new solution *S*″, respectively. Note that *γT* represents the current temperature, the product of temperature in the previous iteration (*T*) and temperature change rate (*γ*) (i.e., 0 < *γ* < 1). Note that the probability of accepting a worse solution decreases as the temperature decreases at a rate of *γ*.

##### 4.4. Destroy Operator

A set of three destroy operators, denoted as *D*, are applied in ADAR. Destroy operator I removes a number of requests (i.e., PD pairs) denoted as *φ*_{1}, which are randomly selected from the current solution *S*. The idea of randomly selecting requests helps to diversify the search space. Destroy operator II randomly removes one route, consisting of multi-PD pairs, from *S*. This idea helps to reduce the number of buses used to decrease the operator cost. Destroy operator III removes *φ*_{2} requests resulting in greatest early and late arrival penalties. The idea is to prevent great deviations between excepted bus arrival time and actual arrival time. Note that the removed requests are placed in a destroy list, denoted as *L*.

The process of destroy operation is shown in Figure 2 using a simplified scenario. Two requests (i.e., the 2nd and 5th requests) are chosen and removed from the incumbent solution *S*. Then, the two requests are stored in the destroy list *L* and a temporary solution *S*′ is produced.

##### 4.5. Repair Operator

In this study, a set of two repair operators is employed by ADAR, which is denoted as *R*. Repair operator I (as shown in Figure 3) sequentially removes the requests one by one from the destroy list *L* (i.e., the first is the 2nd request and then the 5th request) and inserts them into the best positions of the existing routes. The best position is the one that incurs the least increase of fitness value (i.e., the positions marked in orange).

The illustration of repair operator II is shown in Figure 4. In each iteration, this operator repeatedly inserts every request in destroy list *L* (i.e., the 2nd and 5th requests) into the best position (i.e., the position marked in orange) of the existing routes. For each request, an increase in fitness value (calculated by equation (23)) is obtained. The request that causes the least increase in fitness value is chosen (i.e., the 5th request) and inserted into the best position of the existing routes. This process continues until the destroy list *L* is exhausted.

##### 4.6. Operator Weight

Let *ω*_{d} (*d* *D*) and *ω*_{r} (*r* *R*) be weights of destroy and repair operators *d* and *r*, respectively, which are initially set equal to 1. In each iteration, operators *d* and *r* are randomly chosen according to the associated weights to produce new solution *S*″ based on current solution *S*. Note that *ω*_{d} and *ω*_{r} are justified iteratively with respect to reaction factor *ρ* [0, 1] and scores of the operators denoted as *ψ*_{d} and *ψ*_{r}.

*ψ*_{d} and *ψ*_{r} are determined by equations (26) and (27), which increase by *σ*_{µ}. Thus,

If *F* (*S*″) is the least fitness value found in previous iterations, *μ* = 1. If *F* (*S*″) < *F* (*S*), *μ* = 2. If *F* (*S*″) > *F* (*S*) but *S*″ is accepted by SA, *μ* = 3. Note that *σ*_{μ} varies within 0 and 1, and *σ*_{1} + *σ*_{2} + *σ*_{3} = 1.

##### 4.7. HGA-ADAR

Based on an initial population with GA, three operators (i.e., selection, crossover, and mutation) are applied to reproduce new solutions iteratively. SA is integrated with GA, called HGA, which permits conditionally accepting a worse solution. The HGA process terminates as the maximum number of generations is yielded, and the solution obtained is regarded as the initial solution of ADAR. A detailed procedure about the proposed solution algorithm HGA-ADAR is discussed and illustrated in Figure 5.

#### 5. Performance Analysis

The effectiveness of HGA-ADAR is evaluated using different numbers of PD pairs in a CB network. The network scales range from 4 to 48 PD pairs. The minimized total cost yielded by the optimized solution was found by LINGO (Global Optimal Solver, version 18.0), HGA (MATLAB R2018b), and HGA-ADAR (MATLAB R2018b) on a personal laptop (Intel Core i5, 8G, 3.0 GHz).

##### 5.1. Distribution of Bus Arrival Time

The proposed model and algorithm can virtually be applied to any distribution of bus arrival time. In this analysis, bus arrival time at stop *j* denoted as follows a gamma distribution with shape parameter *α*_{jb} and scale parameter *θ*_{j} [52–54]. The probability density function denoted as is given below:where . Based on the theory of gamma distribution, the mean and variance of denoted as and are formulated in equations (29) and (30), respectively.

As is well known, the sum of gamma-distributed variables that have the same scale parameter leads to a gamma-distributed variable. To make the problem tractable, we assume the same scale parameter (i.e., *θ*_{j}) for all stops. In addition, can be obtained from the mean of travel times on links and dwell times at stops covered by the bus until stop *j*. Thus, *α*_{jb} and can be determined by equations (29) and (30). Then, *α*_{jb} and *θ*_{j} are applied in equation (28) to calculate .

##### 5.2. Parameter Setting

As discussed earlier, *φ*_{1} and *φ*_{2} represent the number of requests to be destroyed, which are between 1 and *ζH*, where *ζ* is the destroy rate and *H* is the number of PD pairs. The scale parameter *θ*_{j} is set as 1. The value of in-vehicle time is 6 $/pass-hr, while the unit penalty of early arrival and late arrival is 9 and 90 $/pass-hr, respectively. The parameters of HGA and HGA-ADAR summarized in Table 3 were tested, which yielded acceptable solutions with the least computation time.

##### 5.3. Solutions Obtained by LINGO and HGA-ADAR

LINGO solver is a well-known commercial software program, which is applied to optimize CB networks with deterministic bus arrival time (deterministic model). On the other hand, the proposed HGA-ADAR algorithm is applied to optimize the same CB networks considering both deterministic and stochastic bus arrival time (probabilistic model).

Ten simulation runs per experiment are conducted, and the results are listed in Table 4; the computation time is shown in Figure 6. Note that the computation time for LINGO is limited to 10,800 seconds. Considering deterministic bus arrival time, the difference between the minimized costs found by HGA-ADAR (e.g., *F* (*S*_{A})) and LINGO (e.g., *F* (*S*_{L})) is denoted as “*ε*_{1},” which is calculated by

The results indicate that LINGO can be applied to optimize small-scale networks (i.e., 4 and 6 PD pairs) within a given time limit. As the number of PD pairs increases, the computation time exponentially escalates and LINGO fails to find the solution. On the other hand, HGA-ADAR performs well, which efficiently finds the solutions for different network scales considering deterministic and stochastic bus arrival time. Note that the computation time consumed by optimizing stochastic cases is significantly longer.

##### 5.4. Solutions Obtained by HGA and HGA-ADAR

To further assess the efficiency of HGA-ADAR, HGA and HGA-ADAR are both applied to optimize the network with stochastic bus arrival time (probabilistic model). The results are shown in Table 5. The difference between the minimized costs found by HGA-ADAR (e.g., *F* (*S*_{A})) and HGA (e.g., *F* (*S*_{H})) is denoted as “*ε*_{2},” which is calculated by equation (32). The improved computation time after applying ADAR is recorded in the “CPU^{3}” column, which is calculated bywhere CPU^{1} and CPU^{2} are computation times of using HGA and HGA-ADAR, respectively. In addition, the total costs found by HGA and HGA-ADAR over iterations for 24 PD pairs are illustrated in Figure 7.

From Table 5, we can see that HGA-ADAR outperforms HGA in terms of shorter computation time as well as less minimized total cost. In addition, for the network with 24 PD pairs, the solution found by HGA shown in Figure 7 converges at the 620th generation. After applying the initial solution produced by HGA at the 100th generation, the solution found by HGA-ADAR converges at the 300th iteration. It indicates that HGA-ADAR increases convergence speed greatly when compared to HGA.

##### 5.5. Impact of Stochastic Bus Arrival Time

To demonstrate the performance of the probabilistic model, routing and scheduling for various sizes of networks with different PD pairs are optimized considering deterministic (scenario I) and stochastic (scenario II) bus arrival time. The minimized costs summarized in Table 6 are determined based on ten simulation runs. The results show that minimized total cost reduced by 4.5–13.3% for networks with various PD pairs if stochastic bus arrival time is considered, in which early and late arrival penalties reduced by 21.5–63.7%, user cost decreased by 0–10.2%, and operator cost increased by 0–17% because of increased fleet size.

#### 6. Case Study

A real-world CB network in Xi’an, China, is employed as a case study to evaluate the effectiveness and applicability of the proposed model. The study network includes 31 stops and 7 terminals as illustrated in Figure 8. An instance with 275 passenger requests is presented, which is generated based on the real-world passenger travel data in Xi’an. The service time horizon of CB service ranges between 6:00 am and 10:00 am. Based on the method described in the paper of Sun et al. [5], the passengers are partitioned into 35 groups which are shown in Table 7.

The fixed and variable costs as well as capacities of various bus types are shown in Table 8, which are provided by the transit agency, Xi’an Public Transport Corporation. Fixed cost per bus-hour *f*_{b} is determined by the expense on vehicle depreciation which is purchase cost divided by useful life. Average variable cost per bus-hour *υ*_{b} is determined by expenses on fuel/energy, labour, maintenance, management, and social costs. The parameters of HGA and HGA-ADAR are summarized in Table 3.

##### 6.1. Result Analysis

The results discussed in this section yield the minimum total costs in ten simulation runs. The optimized results under scenario I (deterministic bus arrival time) and scenario II (stochastic bus arrival time) are summarized in Table 9, while the associated CB routing and scheduling under scenario II are listed in Table 10.

In Table 9, the minimized total cost in scenario I is 3,622.7 $, consisting of user cost of 1,090.2 $, operator cost of 1,881.2 $, and early and late arrival penalties of 651.3 $. The average passenger travel time is 39.6 min, while the average travel time per bus is 92.3 min. 12 buses with different capacities are employed, in which the numbers of 10-, 15-, 20-, and 25-seat buses are 1, 1, 4, and 6, respectively. Under scenario II considering stochastic bus arrival time, the minimized total cost can be reduced by 8.4% (from 3,622.7 $ to 3,317.4 $). The user cost reduced by 10.8% (from 1,090.2 $ to 972.6 $), while early and late arrival penalties significantly reduced by 43.7% (from 651.3 $ to 366.9 $). The operator cost slightly increased mainly due to the increase of fleet size (from 12 to 14 vehicles). The results indicate that considering stochastic bus arrival time shall yield a better solution.

Table 10 illustrates the optimal CB routing and scheduling, vehicle assignment, average passenger travel time, and early and late arrival penalties. The results suggest 14 routes, in which route 1 is the shortest one consisting of 2 stops (excluding terminals), route 14 is the longest one consisting of 7 stops, and the average number of stops per route is 4. The suggested CB service begins at 6:20 am and ends at 9:10 am. It is found that the early arrival penalty is always greater than the late arrival penalty for all bus routes, indicating that the probability of passengers arrive destinations early is much greater than late arrivals because of the huge late arrival penalty. The probability distributions of bus arrival time at drop-off stops for route 2 are illustrated in Figure 9. Route 2 is 32-14-27-14-22-34. The bus starts from terminal 32, which takes 9 passengers at stop 14, drops them off at stop 27, then returns back to stop 14, takes 9 passengers, drops them off at stop 22, and then parks at terminal 34.

Two curves in Figure 9 represent the probability distributions of bus arrival time at stops 27 and 22. The scheduled bus arrival times are 7:44 am and 8:40 am with variance of 5.5 min and 9.2 min, respectively. Note that the shaded areas represent the probability of passengers will arrive at the stops within the desired time windows. The left and right parts of the shaded area represent the probabilities of early and late bus arrivals, respectively. The probabilities of late arrivals are 0.0005 for stop 27 and 0.1408 for stop 22, and those for early arrivals are 0.9706 and 0.5144, respectively. The early (and late) penalty is the product of early (and late) arrival probability and the number of passengers alighting from the drop-off stop multiplied by the unit penalty, which is 15.09 $ (and 0.01 $) at stop 27 and 4.98 $ (and 9.96 $) at stop 22.

##### 6.2. Impact of Distribution of Bus Arrival Time

As discussed above, the most commonly used distributions of bus arrival time are normal, log-normal, and gamma distributions. To verify the application and rationality of the proposed model and algorithm, the three distributions of bus arrival time are applied in the case study. Ten simulation runs per experiment are conducted, and the optimal results under the different distributions are shown in Table 11. To make the problem comparable and tractable, we assume that the mean and variance of bus arrival time with different distributions are all generated by the same standard. Thus, the parameters ( and ) of the log-normal distribution are calculated as

The probability density function denoted as is given below:

The parameters ( and *κ*) of the normal distribution are calculated as

The probability density function denoted as is given below:

From Table 11, we can see that the proposed model and algorithm can be applied to the three distributions of bus arrival time, and the difference under different distributions only lies in the optimal results. It is worth noting that the optimal results under the three distributions of bus arrival time are close, indicating that it is reasonable to use gamma distribution in this paper.

##### 6.3. Sensitivity Analysis

A sensitivity analysis is performed to explore the relationship between model parameters (i.e., time window tw and scale parameter *θ*_{j}) and optimal results (i.e., fleet size, travel time, and minimized cost).

Figure 10 suggests that as the time window increases from 5 to 20 minutes, smaller buses (10- and 15-seat) decrease from 12 to 5, while larger buses (20- and 25-seat) increase from 3 to 8. It indicates that a loose time window could make it more flexible for the operator to arrange the CB service, and larger buses are preferred to reduce operator cost. However, the decreased fleet size would increase the number of stops per route as well as passenger travel time.

Figure 11 suggests that an increased time window may reduce early and late arrival penalties and increase the probability of on-time passenger arrival. It is worth noting that the early arrival penalty is always greater than the late arrival penalty, indicating that there are more passengers who arrive earlier than those who arrive later from the desired arrival time, due to a greater unit penalty for late arrival.

Figure 12 shows that as the time window increases, minimized total cost consisting of user cost, operator cost, and early and late arrival penalties decreases. A decrease in operator cost due to the decreased fleet size would increase average passenger in-vehicle time and user cost.

Figure 13 indicates that as *θ*_{j} increases (i.e., a greater variance of bus arrival time), fleet size increases, and smaller buses are preferable. It is worth noting that the number of buses is equal to the number of routes. Therefore, the increased fleet size would increase the number of routes and decrease the number of stops per route. This would result in fewer passengers per route with less travel time, and smaller buses with lower operation cost are preferable.

As *θ*_{j} increases as shown in Figure 14, the early and late arrival penalties increase because of the increasing variance of bus arrivals at stops and the increasing probability of early and late arrival.

Figure 15 suggests that as *θ*_{j} increases, minimized total cost increases. Operator cost is expected to increase because of increased fleet size. A minor decrease in user cost results from reduced passenger travel time. The increased probability of early and late bus arrivals could increase the early and late arrival penalties.

#### 7. Conclusions

In this paper, we propose a probabilistic model for optimizing CB service considering stochastic bus arrival time and some practical conditions, including time window and capacity in mixed bus fleet. An HGA-ADAR algorithm is applied to efficiently search for the solution. The developed model and algorithm can deal with the impact of probabilistic bus arrival time on the optimization of CB service planning. The objective total cost, consisting of user cost, operator cost, and early and late arrival penalties, is minimized subject to several practical constraints. The decision variables include vehicle assignment, bus routing, timetabling, and fleet size.

According to the performance analysis, the developed model and HGA-ADAR have demonstrated themselves effective in minimizing the total cost with the least computation time. The reduced total cost is mainly from saving early and late penalties by elevating on-time passenger arrivals by routing and scheduling a mixed bus fleet. The solution suggested by the proposed model with stochastic bus arrival time is superior to that of the deterministic model. The minimized total cost can be reduced by 13.3%, and early and late penalties can be reduced by 63.7% in networks with various PD pairs.

The proposed model is applied to optimize the service for a CB network in Xi’an, China. Three distributions of bus arrival time (i.e., gamma, normal, and log-normal distributions) are used to verify the application and rationality of the proposed model and algorithm. The results of sensitivity analysis suggest that as the time window increases, minimized total cost decreases because of reduced fleet size and operator cost. At this moment, larger buses shall be employed to meet the demand, albeit user cost slightly increases due to the increase in passenger travel time. Results also suggest that as scale parameter *θ*_{j} increases, minimized total cost increases because of increased fleet size. Smaller buses are advisable for operator. In addition, early and late arrival penalties tend to increase as the variance of bus arrival time increases, and they decrease as time window increases.

The immediate extension of this study will focus on enhancing the model to deal with time-dependent traffic conditions which would permit the probabilistic model to be applied for cost-effective CB operation (e.g., dynamic bus dispatching and holding control). In the future, it is desirable to model the problem as a multi-objective optimization one and compute the Pareto front. The behaviour of HGA-ADAR shall be further explored, especially on the sensitivity of the initial solution and parameters (i.e., numbers and weights of destroy and repair operators, destroy rate, etc.), to expedite the converging speed and ensure the quality of the solution. In addition, the computation for solving the stochastic problem is time consuming, and HGA-ADAR still has much room for improvement in the computing speed of solving such stochastic problems.

#### Data Availability

The data used in the case study to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This study was in part sponsored by the Young Scientists Fund of the Natural Science Foundation of Shaanxi Province, China (grant no. 2020JQ-399), the Special Scientific Research Project of Shaanxi Provincial Department of Education (grant no. 20JK0356), and the Fundamental Research Funds for the Central Universities (300102220101).