Abstract

A closed-form mixed Logit approach is proposed to model the stochastic route choice behaviours. It combines both the advantages of Probit and Logit to provide a flexible form in alternatives correlation and a tractable form in expression; besides, the heterogeneity in alternative variance can also be addressed. Paths are compared by pairs where the superiority of the binary Probit can be fully used. The Probit-based aggregation is also used for a nested Logit structure. Case studies on both numerical and empirical examples demonstrate that the new method is valid and practical. This paper thus provides an operational solution to incorporate the normal distribution in route choice with an analytical expression.

1. Introduction

Route choice is one of the crucial issues in transportation analysis because it models travel behaviour and thereby supports predictions of future demand. Drivers try to maximize their travel welfare when choosing a path between a given origin-destination (OD) pair. However, not all of them choose the best alternative, because their knowledge of the network is imperfect. To model this perception error and the resulting stochastic route choice behaviour, Probit and Logit models are two of the most widely used methods. The utility of each alternative is decomposed into a deterministic and a random portion. Assume that there are $n$ paths between an OD pair and the route choice set is $C$; the utility (welfare) of an alternative path $i \in C$ can be represented as

$$U_i = V_i + \varepsilon_i, \tag{1}$$

where $U_i$ is the utility of path $i$, $V_i$ is the deterministic part, which is composed of attributes such as length and cost that can be explicitly captured, and $\varepsilon_i$ is the random term that captures the perception error. A rational traveller would select the path with the maximum utility among the alternatives in $C$.

Probit assumes that the random portion is normally distributed; besides, it provides a highly flexible structure for correlation. However, it is limited due to the computation burden. It does not have a closed-form formula when there are more than two alternatives. Generally, the computation of multinomial Probit requires either Clark’s approximation [1, 2], Monte Carlo simulation [3], or numerical integration [4]. Yai et al. [5] used the multinomial Probit model in the context of route choice in the Tokyo rail network, but the maximum number of alternatives is limited to four. On the other hand, Logit is more popular for its analytical tractability. Logit assumes that the error term is type I extreme value (EV) distributed. Moreover, it assumes each of the error terms is independently identically distributed (IID), which leads to a closed-form mathematical structure to simplify the computation in estimation and prediction. As a consequence, Logit has two main disadvantages because of the IID assumption: (1) it cannot represent the path correlation which leads to enlarged probabilities of the overlapped paths, namely, the overlapping problem and (2) it cannot represent the heterogeneity in perception errors which would produce unreasonable results, namely, the scaling problem.

There are several modified methods to address the two drawbacks of Logit in the context of route choice. Regarding the first disadvantage, the overlapping problem, the improved models are classified into two types.

(i) Modifications of multinomial Logit (MNL), such as path size Logit (PSL) [6–9], C-Logit [10], and the implicit availability/perception (IAP) model [11]: in these models, an additional term is introduced in the utility function to capture the correlations of paths, so as to decrease the attraction of the overlapped paths. This method maintains the simple form of MNL. Besides, the log-likelihood function of this method is globally concave, so it guarantees a global optimum for parameter estimation. However, the additional terms are convenient approximations. Previous research shows that they might be too sensitive to the composition of the choice set [9, 12].

(ii) Generalized extreme value (GEV) models, proposed by McFadden [13]: the most widely used methods of this type in route choice are the link-based cross-nested Logit (CNL) [14, 15] and the paired combinatorial Logit (PCL) [16–19] models. Both models have a tree structure to represent the link-path relation, where alternatives with shared attributes are classified into the same nest so that the correlation can be explicitly captured. The link-based CNL model treats each link as a nest, and each path, which uses several links, belongs to the corresponding nests. The PCL model compares paths by paired combinations, and each path pair is a nest. The CNL model has a large set of parameters that need to be estimated, so some studies provide approximate formulas [14, 20]. Besides, some researchers suggest that the parameters can be obtained by solving a system of equations of the correlation and constraints [21, 22]. Likewise, the PCL model also requires a parameter to represent the correlation, and specifications are provided by Gliebe [23] and Prashker and Bekhor [24].

As for the second drawback of Logit, the scaling problem, Pravinvongvuth and Chen [18] propose an origin-destination-specific scaling factor to represent the different scales of diverse networks. Chen et al. [25] examine the scaling effect when applying route choice models in stochastic equilibrium models. Miwa et al. [26] examine how to set the scale parameter (dispersion parameter) and apply a multiclass stochastic user equilibrium (SUE) assignment model to account for differences in drivers’ perception errors.

Some studies combine the advantages of both Probit and Logit, and the most representative model is the mixed Logit [27], also known as Logit kernel, error component, or hybrid Logit. It incorporates distributions other than type I EV to provide a flexible and tractable form that represents the correlation across alternatives, the alternative-specific variances, and taste heterogeneity. Frejinger and Bierlaire [8] use the error component to model subnetworks so as to represent path overlap in an abstract network. Bekhor et al. [28] estimate an error component model based on the Boston route choice data. However, the mixed Logit does not have a closed-form expression; consequently, both estimation and prediction require simulation-based methods. Studies [20, 29] show that the simulation-based method requires a large number of draws to achieve stable predictions. Besides, there is currently no efficient path-based SUE traffic assignment for solving the route choice model with the mixed Logit model [18].

To fully use the advantages of Probit and Logit, this paper proposes a mixed Logit method with a closed-form to model the stochastic route choice behaviours. With a closed-form expression, the computation burden in estimation and prediction would be relieved. Moreover, the closed-form formula alleviates the difficulties in the path-based SUE assignment with a mixed Logit model. The paper is organized as follows. Section 2 describes the methodology, including the nested model structure and the Probit-based aggregation. The validation from a numerical example is presented in Section 3. The new method is applied to real data in Section 4. Finally conclusions and discussions for future study are given in Section 5.

2. Methodology

2.1. Model Structure

The proposed model has a structure similar to that of the PCL model: paths are compared by paired combination. Consider $n$ alternatives in the choice set between an OD pair; paired combination yields $n(n-1)/2$ path pairs in total. The new model has a two-level nested structure. Each path pair is a nest, and within each nest the choice is binary. The expected maximum utility [7] of each path pair is used as the utility of the nest. The upper level is a multinomial choice model with $n(n-1)/2$ nests. Consider a three-alternative case, as shown in Figure 1, with paths $i$, $j$, and $k$; the path pairs are $ij$, $ik$, and $jk$. The probability that path $i$ is chosen among the three paths combines the marginal probability of each nest containing $i$ with the conditional probability within that nest, which is

$$P(i) = P(ij)\,P(i \mid ij) + P(ik)\,P(i \mid ik). \tag{2}$$

In order to relax the constraints of the Logit model, the error part in (1) is decomposed into two parts, $\eta$ and $\xi$. The first error $\eta$, which is IID EV distributed, captures the differences between nests; the second error $\xi$, which follows a normal distribution, captures the differences within the nest. For the first nest, which includes paths $i$ and $j$, the utilities are

$$U_i = V_i + \eta_i + \xi_i, \qquad U_j = V_j + \eta_j + \xi_j, \tag{3}$$

where $V_i = \beta X_i$ and $V_j = \beta X_j$; $X_i$ and $X_j$ are the attributes of paths $i$ and $j$; $\beta$ is a vector of parameters to be estimated. $\eta_i$ and $\eta_j$ are nest specific: they capture the unobserved attributes shared by alternatives in the same nest, so they are the same, that is, $\eta_i = \eta_j = \eta_{ij}$. The variance of $\eta_{ij}$ is $\pi^2/(6\mu^2)$, where $\mu$ is the scale parameter. $\xi_i$ and $\xi_j$ are alternative specific, and they capture the unobserved attributes specific to alternatives $i$ and $j$. $\xi_i$ and $\xi_j$ are assumed to be normally distributed with expectation zero and variances $\sigma_i^2$ and $\sigma_j^2$, respectively. Sheffi [30] argues that the variance can be assumed to be proportional to the deterministic utility of paths, so as to link the perception error to the path attributes. Hence the variance-covariance matrix of paths $i$ and $j$ is

$$\Sigma_{ij} = \begin{pmatrix} \sigma_i^2 & \sigma_{ij} \\ \sigma_{ij} & \sigma_j^2 \end{pmatrix} = \theta \begin{pmatrix} V_i & V_{ij} \\ V_{ij} & V_j \end{pmatrix}, \tag{4}$$

where $V_{ij}$ is the correlated utility of paths $i$ and $j$, such as the overlapped links; $\theta$ is the scale parameter of the lower level and is to be estimated.

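The probability that a traveller chooses path $i$, given that the nest $ij$ is chosen, depends only on the alternative-specific normal errors $\xi_i$ and $\xi_j$, because the nest-specific error is common to both paths and cancels in the comparison:

$$P(i \mid ij) = P(U_i \ge U_j) = P(\xi_j - \xi_i \le V_i - V_j). \tag{5}$$

Because $\xi_i$ and $\xi_j$ are both normally distributed with zero means and variances $\sigma_i^2$ and $\sigma_j^2$, the difference $\xi_j - \xi_i$ is also normally distributed with expectation zero but with variance $\sigma_i^2 + \sigma_j^2 - 2\rho_{ij}\sigma_i\sigma_j$, where $\rho_{ij}$ is the correlation. We have

$$P(i \mid ij) = \Phi\!\left(\frac{V_i - V_j}{\sqrt{\sigma_i^2 + \sigma_j^2 - 2\rho_{ij}\sigma_i\sigma_j}}\right), \tag{6}$$

where $\Phi$ is the standard cumulative normal distribution function. Since the Probit model is only used in a binary choice case, where it has a closed-form expression, the advantages of Probit, such as the flexible form of correlation and alternative-specific variance, can be fully exploited in our model.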
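The within-nest step can be sketched in a few lines of Python. This is a minimal illustration of a binary Probit probability with correlated normal errors, not the authors' implementation; the function names and the example numbers are assumptions made here for illustration.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binary_probit(v_i: float, v_j: float,
                  var_i: float, var_j: float, cov_ij: float = 0.0) -> float:
    """Probability of choosing path i over path j inside one nest.

    v_i, v_j     : deterministic utilities of the two paths
    var_i, var_j : variances of the normal error terms
    cov_ij       : covariance of the two error terms
    """
    var_diff = var_i + var_j - 2.0 * cov_ij  # variance of the error difference
    if var_diff <= 0.0:
        # Degenerate case (perfectly correlated errors with equal variance):
        # the comparison becomes deterministic, ties are split evenly.
        if v_i > v_j:
            return 1.0
        if v_i < v_j:
            return 0.0
        return 0.5
    return std_normal_cdf((v_i - v_j) / math.sqrt(var_diff))


# Illustrative values only: two paths with utilities -9 and -10, unit variances.
p_i = binary_probit(-9.0, -10.0, 1.0, 1.0)
p_j = binary_probit(-10.0, -9.0, 1.0, 1.0)
print(p_i, p_j, p_i + p_j)
```

A positive covariance shrinks the variance of the error difference, so the same utility gap produces a probability further from one half.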

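The expected maximum utility of each nest is the aggregation of the paths within the nest; define it as $S_{ij}$ for nest $ij$. The utility of nest $ij$ is

$$U_{ij} = S_{ij} + \eta_{ij}, \tag{7}$$

where $S_{ij} = E[\max(V_i + \xi_i,\; V_j + \xi_j)]$.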

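The probability that nest $ij$ is chosen among the $n(n-1)/2$ nests is

$$P(ij) = P(U_{ij} \ge U_{kl},\ \forall\, kl \ne ij), \tag{8}$$

where $kl$ ranges over the path pairs in the choice set and $U_{kl} = S_{kl} + \eta_{kl}$ is the utility of nest $kl$, with $S_{kl}$ its expected maximum utility; the nest error $\eta_{kl}$ is IID EV distributed with scale parameter $\mu$. According to the properties of the EV distribution [7], we have

$$P(ij) = \frac{\exp(\mu S_{ij})}{\sum_{k<l} \exp(\mu S_{kl})}. \tag{9}$$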

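The probability that a traveller chooses path $i$ in the choice set $C$ is the sum, over all nests that contain $i$, of the nest probability times the conditional probability within the nest:

$$P(i) = \sum_{j \in C,\; j \ne i} P(ij)\,P(i \mid ij). \tag{10}$$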

2.2. Calculation of the Nest Utility

The nest utility in (7) is represented by the expected maximum utility of the two paths $i$ and $j$ in the nest. In order to fully use the advantage of the Probit model, this paper incorporates the Probit-based aggregation to represent the utility of the nest.

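Clark [1] provided an approximation method with which the expected maximum utility of paths $i$ and $j$, with deterministic utilities $V_i$ and $V_j$ and normal errors $\xi_i$ and $\xi_j$, is

$$E[\max(V_i + \xi_i,\; V_j + \xi_j)] = V_i\,\Phi(\alpha) + V_j\,\Phi(-\alpha) + \sigma\,\phi(\alpha), \tag{11}$$

where $\Phi$ is the standard cumulative normal distribution function, $\phi$ is the standard normal density function, and $\alpha$ is given by

$$\alpha = \frac{V_i - V_j}{\sigma}, \qquad \sigma = \sqrt{\sigma_i^2 + \sigma_j^2 - 2\rho_{ij}\sigma_i\sigma_j}, \tag{12}$$

with $\sigma_i^2$ and $\sigma_j^2$ the variances of $\xi_i$ and $\xi_j$ and $\rho_{ij}$ their correlation.

Noticing that $\Phi(-\alpha) = 1 - \Phi(\alpha)$, the right-hand side of (11) can be simplified as

$$V_j + (V_i - V_j)\,\Phi(\alpha) + \sigma\,\phi(\alpha). \tag{13}$$

The Probit-based aggregation which is used in (9) is

$$S_{ij} = V_j + (V_i - V_j)\,\Phi(\alpha) + \sigma\,\phi(\alpha). \tag{14}$$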
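Putting the two levels together, the following Python sketch computes path probabilities for a small choice set. It is a hedged illustration under stated assumptions rather than the authors' code: the function names, the covariance-matrix input format, and the example numbers are hypothetical, and the nest utility is taken to be Clark's expected maximum of two correlated normal utilities.

```python
import math
from itertools import combinations


def normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pair_terms(v_i, v_j, var_i, var_j, cov_ij):
    """Return (P(i | nest), nest utility) for one path pair."""
    sigma = math.sqrt(max(var_i + var_j - 2.0 * cov_ij, 0.0))
    if sigma == 0.0:
        # Fully correlated pair: the within-nest choice is deterministic.
        p_i = 1.0 if v_i > v_j else 0.0 if v_i < v_j else 0.5
        return p_i, max(v_i, v_j)
    alpha = (v_i - v_j) / sigma
    p_i = normal_cdf(alpha)                            # binary Probit
    s_ij = v_j + (v_i - v_j) * p_i + sigma * normal_pdf(alpha)  # expected max
    return p_i, s_ij


def path_probabilities(v, cov, mu=1.0):
    """Choice probabilities of all paths.

    v   : list of deterministic utilities
    cov : covariance matrix of the normal errors (list of lists)
    mu  : scale parameter of the upper-level Logit (assumed value)
    """
    terms = {(i, j): pair_terms(v[i], v[j], cov[i][i], cov[j][j], cov[i][j])
             for i, j in combinations(range(len(v)), 2)}
    s_max = max(s for _, s in terms.values())          # for numerical stability
    weights = {k: math.exp(mu * (s - s_max)) for k, (_, s) in terms.items()}
    total = sum(weights.values())
    prob = [0.0] * len(v)
    for (i, j), (p_i, _) in terms.items():
        p_nest = weights[(i, j)] / total               # upper-level Logit
        prob[i] += p_nest * p_i
        prob[j] += p_nest * (1.0 - p_i)
    return prob


# Illustrative inputs: three paths with equal utility and unit error variance.
v = [-10.0, -10.0, -10.0]
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
overlapping = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.8], [0.0, 0.8, 1.0]]
print(path_probabilities(v, independent))
print(path_probabilities(v, overlapping))
```

With independent and identical paths every nest has the same utility, so each path receives one third. When the second and third paths are positively correlated, their shared nest has a lower expected maximum utility, which shifts probability toward the first path.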

3. Numerical Example

The new model is tested in a three-path network with one independent path and two correlated paths, as shown in Figure 2. The three paths have the same length of 10. A single parameter, shown in Figure 2, varies from 0 to 10 and controls the overlap between the middle and lower paths. At one extreme, the middle and lower paths are completely correlated, so the choosing probabilities for the upper, middle, and lower paths are $1/2$, $1/4$, and $1/4$, respectively. At the other extreme, the three paths are all independent, so the probabilities are $1/3$ for all of them. Between the two extremes, as the overlap shrinks, the probabilities for the middle and lower paths should monotonically increase from $1/4$ to $1/3$, while the probability for the upper route should decrease from $1/2$ to $1/3$. The purposes of this numerical example are, first, to check whether the models have reasonable outcomes in the two extreme cases and, second, to check whether the curves of the models between the extremes have the expected shapes, that is, whether they are concave or convex and whether they increase or decrease monotonically.

The proposed model is compared with the Probit, PCL, CNL, MNL, C-Logit, PSL, and Logit kernel (LK) models. The results of Probit, calculated by Clark’s approximation, are assumed to be the standard. The new model is actually an improved version of the PCL model with mixed distribution. The LK model is compared because it is also a mixed Logit model but without a closed-form. The CNL is compared because it is also a nested structure as the proposed one. The PSL and C-Logit models are the two most widely used models for their simplicity. The MNL, without addressing the overlapping or scaling issues, is supposed to be the worst. Assume that length is the only attribute in the deterministic utility and its parameter is set to be one. The choosing probability of the middle path in MNL, namely, , iswhere is the length of path . The scale parameter is normalized to one.
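To make the MNL baseline concrete, here is a minimal Python sketch. It assumes that utility is the negative of path length (the negative sign of the coefficient is an assumption made here; only its magnitude of one is stated above) and that the scale parameter is one; the function name is hypothetical.

```python
import math


def mnl_probabilities(lengths, beta=-1.0, mu=1.0):
    """Multinomial Logit probabilities with utility V_k = beta * length_k.

    beta = -1.0 treats length as a disutility (assumed sign);
    mu is the Logit scale parameter, normalized to one.
    """
    utilities = [mu * beta * length for length in lengths]
    u_max = max(utilities)                      # subtract max for stability
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Three paths of length 10, as in the numerical example.
print(mnl_probabilities([10.0, 10.0, 10.0]))
```

Because the overlap between the middle and lower paths never enters the utility, the result is one third for each path whatever the overlap, which is why MNL serves as the worst-case reference.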

The link-based CNL model [14] is chosen because it is the most widely used CNL model in the route choice field because it systematically captures the route-link relations in the road network, where each link corresponds to a nest and the paths that share the same link belongs to the same nest. Therefore we choose the formula as follows:where is the GEV generating function, is its partial derivative with respect to , and is the inclusive parameter and is defined as , where is the length of link and is the length of path . is the root scale parameter and is normalized to 1; is the scale parameter for nest and is set as 1.5 in our case. Since this case is just illustrative, a larger or a smaller setting of the parameter would not change the major properties of the model, which means that it would not change its value when or when and it would not change the concave, convex, or monotonic properties. Without loss of generality, in this case we set the nest parameter .

The choice probability of the middle path in the LK model is calculated bywhere is the length of link , is the scale parameter and is set to be 1 in this case just for illustration, and is a random number from the standard normal distribution. Simulation-based method [28] is used with one million draws.

The formulas of PSL, C-Logit, and PCL are corresponded to the researches of Ben-Akiva and Bierlaire [6], Cascetta et al. [10], and Koppelman and Wen [17], respectively. Probit employs the scale parameter , whereas the MNL, PCL, CNL, PSL, and C-Logit models employ the parameter . To ensure consistency among different models, we assume that perception errors are the same, which is . In this case all the paths lengths are the same so the variances are the same.

The choice probabilities for the middle route calculated by different models are provided in Figure 3. When and , the new model has the expected results, so do the PCL, PSL, and C-Logit models; however both the CNL and the LK models fail to produce reasonable value when . When , the curve of the new model is close to the Probit and they both show a downward concave shape. Besides, the result from the new model demonstrates a substantial improvement over the PCL model. The results from PSL and C-Logit also have reasonable performances. However, the results from the CNL model have bizarre behaviours. It overpredicts the choosing probabilities while the paths are partly correlated. The curves of Probit, PSL, PCL, and the new model are upward concave, while the curves for C-Logit and LK are downward concave. The results from MNL are regardless of the variation of . It fails to provide logical prediction while paths are overlapped, as expected. According to the results and the comparisons in this case, the new model is valid and capable to produce reasonable outcomes.

4. Empirical Results

In order to evaluate the performance of the proposed method, we apply the new model to real data. A case study of taxi drivers choosing routes in the city center is presented. The studied city, Guangzhou, is situated in southern China and has approximately ten million inhabitants. Only the central business district (CBD), the Tianhe region highlighted in Figure 4, is studied. The data set for the estimation comes from GPS-equipped taxis while they were carrying passengers. The data were collected by a management company for monitoring rather than navigation purposes, so the route choice behavior reflects the drivers’ own judgment. The vehicles were monitored within a radius of 5 km in the CBD, and 5786 trips from 473 ODs were collected for the case study. The information on the studied network is shown in Table 1. Three data sets are collected: the first is for estimating the parameters of the new model and the compared models, which is presented in Sections 4.1, 4.2, and 4.3; the last two are for validating the forecasts, that is, using the estimated models to predict the choosing probabilities and comparing them with the actual choosing behaviors, with the results shown in Section 4.4.

4.1. Model Specification

Three attributes, namely, length, artery road ratio, and the number of signal-controlled intersections, are included in the utility function, as shown in Table 2. Length and time are two highly similar and correlated attributes, so including only one of them in the utility function is sufficient. However, a precise actual travel time is difficult to obtain before departure. When drivers decide which route to choose, they usually rely on the information that is more stable in their perception, and in this case length is more stable than time. Therefore we use length rather than time in the utility. The unit of length is the kilometer, so that its magnitude is similar to that of the other attributes, which is convenient for the estimation. The artery road ratio is the length of the artery roads (major roads and arterial streets) divided by the total length of the trip. We assume that artery roads have a significant and positive impact on the utility because, compared with minor streets, the artery roads have more lanes, higher capacity, and even fewer traffic lights in the studied region, which means a higher level of service when driving. Therefore a larger artery road ratio is supposed to be more attractive to the drivers, and this attribute is included. The number of signal-controlled intersections is expected to have a significant and negative impact when driving in the city center. More traffic lights mean more chances of stopping and delay; therefore a path with more signal-controlled intersections is expected to have a lower utility. The deterministic utility is a linear combination of the three attributes:

$$V = \beta_{\text{length}}\,X_{\text{length}} + \beta_{\text{artery}}\,X_{\text{artery}} + \beta_{\text{signal}}\,X_{\text{signal}}.$$

4.2. Route Choice Set

According to Ramming [20] and Frejinger and Bierlaire [8], it is difficult to generate a choice set that includes all the actually chosen paths. At best, 84% of the observed paths are found by combining all the choice set generation algorithms that Ramming [20] tested. In order to avoid generating a choice set that misses important chosen paths, we employ a data mining method to build the choice set for each individual. Assume that we observe the trips between a given OD pair for a long enough time period $T$; if there are $n$ paths in total that are actually chosen by the travellers, we conclude that these $n$ paths form the choice set of this OD. Since the GPS data set is large and is continuously provided, it is possible to find the choice sets for all the ODs. The advantage of this method is that it does not miss the important and actually chosen paths; the shortcoming is that it may require a very long observation time to determine a stable choice set.
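The observation-based choice set can be sketched as a simple grouping of observed trips by OD pair. The record layout below (an OD label plus a sequence of link identifiers) and the sample trips are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict


def build_choice_sets(trips):
    """Group observed trips by OD pair and keep the distinct paths.

    trips : iterable of (od_pair, path), where path is a sequence of link ids.
    Returns a dict mapping each OD pair to the set of observed paths.
    """
    choice_sets = defaultdict(set)
    for od_pair, path in trips:
        choice_sets[od_pair].add(tuple(path))   # tuples are hashable
    return dict(choice_sets)


# Made-up observations for illustration only.
observed = [
    (("A", "B"), [1, 2, 3]),
    (("A", "B"), [1, 4, 3]),
    (("A", "B"), [1, 2, 3]),   # repeated path, counted once
    (("C", "D"), [7, 8]),
]
sets = build_choice_sets(observed)
for od, paths in sets.items():
    print(od, len(paths))
```

The size of each set can only grow as more trips are observed, which is why the observation period needs to be long enough for the set to stabilize.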

Based on the 5786 trips from 473 ODs made by taxi drivers in a period of one week, this paper analyzes the size of the choice set, denoted by $n$. As shown in Table 3, the number of actually chosen paths between any OD pair is no larger than 12, and the average is just 4 paths. This suggests that it is reasonable to use the paired combination model in route choice, because the magnitude of $n$ does not lead to a heavy computation burden.
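As a quick check of the computational load, the number of nests implied by a choice set of a given size can be counted directly. The helper name below is hypothetical; the sizes 4 and 12 are the average and maximum reported above.

```python
import math


def n_path_pairs(n_paths: int) -> int:
    """Number of path pairs (nests) for a choice set with n_paths paths."""
    return math.comb(n_paths, 2)   # n * (n - 1) / 2


for n_paths in (3, 4, 12):
    print(n_paths, n_path_pairs(n_paths))
# 3 paths -> 3 nests, 4 paths -> 6 nests, 12 paths -> 66 nests
```

Even the largest observed choice set therefore needs only 66 binary comparisons per evaluation.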

The objective of this paper is purely illustrative. It does not provide a full analysis to determine how long the observation time should be or how large the number of the observed trips should be. More tests on this subject would be desirable.

4.3. Model Estimation

This paper uses the maximum likelihood estimation method to calibrate the parameters. Five models are estimated and compared in this section: the proposed model; the MNL, which is expected to have a poor result; the two most widely used models with a tree structure, the PCL and the CNL; and the LK model with 100 draws, estimated by a simulation-based method [31]. Ramming [20] points out that the estimation of the Logit family either normalizes the scale parameter as $\mu = 1$ or, alternatively, actually yields joint estimates of $\mu\beta$. To facilitate the comparison among different models, a scaled parameter estimate is also provided. The scaling is based on the estimated length parameter in the MNL model. The magnitude of the scaled estimates is consequently the same among the models.

The signs of all the estimated parameters are as expected, as shown in Table 4. The positive sign of the parameter suggests that taxi drivers tend to travel on the artery roads. The scaled estimates of and in models MNL and PCL and the new model have the same magnitude, and the magnitude of these parameters is approximately ten times smaller. The magnitude of the scaled and nonscaled estimates is the same for models PCL and CNL, but not for the new model. Actually, the log-likelihood function of the new model is not globally concave; when searching for the optimum it is easily “trapped” in a local point. The selection of the initial values, upper and lower bounds, is important in estimation. It would be a reasonable approach to estimate a MNL model first in order to gain adequate information on the parameters. The parameters estimates of all the models but LK are all significantly different from zero. Two parameters of LK, and , are not significantly estimated. The reason may be that the draws are not enough (only 100 draws). The estimation time for MNL, PCL, CNL, and the new model is within five minutes thanks to the closed-form expression. On the other hand, the LK model uses 19 hours, with only 100 draws, and still cannot attain good estimates.

Table 5 provides the likelihood ratio tests between the models. The test statistic is asymptotically distributed as $\chi^2$ with $k$ degrees of freedom, where $k$ is the difference in the number of estimated parameters between two models. Results from this test show that the new model and the LK and CNL models are significantly better than the PCL and MNL at the 95% confidence level. Since the number of estimated parameters for CNL and the new model is the same, their goodness of fit can be compared directly by the final log-likelihood. The data show that the new model is better than LK, but the CNL model has a better model fit than the new model.
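For readers who want to reproduce such a comparison, the following sketch computes a likelihood ratio statistic and its $\chi^2$ p-value using only the Python standard library. The log-likelihood values in the example are made up for illustration and are not taken from Table 5; the function names are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0.0:
        return 1.0
    z = x / 2.0
    # Start from the closed form for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # ... and use the recurrence Q(a + 1, z) = Q(a, z) + z^a e^(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(ll_restricted: float, ll_unrestricted: float, df: int):
    """Return the statistic -2 (LL_r - LL_u) and its p-value."""
    stat = -2.0 * (ll_restricted - ll_unrestricted)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods; one extra parameter in the larger model.
stat, p_value = likelihood_ratio_test(-1050.0, -1040.0, df=1)
print(stat, p_value, p_value < 0.05)
```

The test applies only when one model has more parameters than the other (`df >= 1`); models with the same number of parameters are compared by their final log-likelihoods instead.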

4.4. Forecasting

Route choice models are important for predicting individual behavior; therefore the comparison of models should focus not only on model fit but also on the forecasting results. Two out-of-sample data sets are chosen to validate the forecasting ability of the models. “Out-of-sample” means that these two data sets have not been used for estimating the models. Each data set includes 200 OD pairs that are randomly selected in the studied region, and all observations associated with these OD pairs are included in the forecasting data sets. The first and second data sets contain 3345 and 3536 observations, respectively.

To compare the forecasting power of models, we define the out-of-sample error aswhere is the out-of-sample forecasting error of model , and a smaller value would suggest a better forecasting ability in this case; is the predicted probability of the actual chosen path computed by model with estimated parameter , between OD pair , and for the observation obs. is the probability of the chosen path in reality, which is 1. is the number of observations in the data set.

The out-of-sample forecasting errors of the models are shown in Table 6. The MNL model uses the least computation time in each data set; however, its errors are the second largest among all the compared models. Although the LK model provides a highly flexible structure by incorporating the multivariate normal distribution, its errors in both data sets are the largest, and its computation time is also the longest. Although the CNL model has a good model fit in the estimation, it does not provide the most promising forecasting results in this case. The PCL model and the new model, which is also a PCL-based model, provide the smallest forecasting errors in both data sets. Because it incorporates the Probit model, the new model uses more computation time than the PCL, but it still uses less time than the traditional mixed Logit model (the LK model), because the new model is closed-form and does not require a simulation-based method in computation. Summarizing the numerical example, the estimation with real data, and the out-of-sample forecasting results, this paper concludes that the new model is a promising approach in route choice analysis.

5. Conclusions

This article proposes incorporating the normal distribution to model the perception error of travellers in route choice while retaining the advantage of Logit, namely, its closed-form expression. The traditional mixed Logit model requires a simulation-based method for estimation and prediction. This paper provides a new mixed model with an analytical formula, which relieves the computation burden. Besides, the closed-form expression makes it practical to use the new model in path-based SUE traffic assignment. The Probit model is employed in two places: first, paths are compared by pairs, where the superiority of the binary Probit can be fully used; second, the Probit-based aggregation is used for the nested Logit structure. The results of the numerical example demonstrate that the proposed approach is valid, and the application to real data, including the parameter estimates, shows that it is practical. The likelihood ratio tests indicate the superiority of the new model over the multinomial Logit and paired combinatorial Logit models. Moreover, the out-of-sample forecasting results suggest that the new model has promising forecasting power.

The proposed model provides new perspectives for the mixed Logit model in route choice modelling. It is a reasonable trade-off between the flexibility in correlation and analytical tractability. A natural extension is to loosen the assumption that the scale parameters of the normal distribution are all the same, because travellers are supposed to have different perception errors on different path pairs.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This research was funded by the National Natural Science Foundation of China (no. 51178475).