Journal of Advanced Transportation

Volume 2017, Article ID 5069824, 9 pages

https://doi.org/10.1155/2017/5069824

## Bayesian Nonparametric Model for Estimating Multistate Travel Time Distribution

^{1}Department of Civil and Environmental Engineering, FAMU-FSU College of Engineering, Tallahassee, FL, USA

^{2}School of Engineering, University of North Florida, Jacksonville, FL, USA

Correspondence should be addressed to Emmanuel Kidando; ude.usf.ym@f51ke

Received 15 October 2016; Revised 18 December 2016; Accepted 28 December 2016; Published 20 February 2017

Academic Editor: Yuchuan Du

Copyright © 2017 Emmanuel Kidando et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Multistate models, that is, models that combine two or more distributions, are preferred over single-state probability models for modeling the distribution of travel time. The literature indicates that finite multistate models of travel time based on the lognormal distribution outperform those based on other probability functions. In this study, we extend the finite multistate lognormal model of the travel time distribution to a mixture with an unbounded number of lognormal components. In particular, a nonparametric Dirichlet Process Mixture Model (DPMM) with a stick-breaking process representation was used. The strength of the DPMM is that it chooses the number of components dynamically during parameter estimation. To reduce computational complexity, the modeling process was limited to a maximum of six components. The Markov Chain Monte Carlo (MCMC) sampling technique was then employed to estimate the posterior distributions of the parameters. Speed data from nine links of a freeway corridor, aggregated at 5-minute intervals, were used to calculate the corridor travel time. The results demonstrated that this model offers significant flexibility to account for complex mixture distributions of travel time without specifying the number of components in advance. The DPMM modeling further revealed that freeway travel time is characterized by multistate or single-state models depending on whether the onset and offset of congestion periods are included.

#### 1. Introduction

Modeling travel time distribution is essential for measuring the consistency of the traffic performance of a highway system. Moreover, the distribution of the travel time is useful in simulation and theoretical derivations regarding different traffic performance measures such as travel time reliability and variability. The accurate estimation and prediction of travel time are essential for traffic operators, planners, and traveler information systems [1].

This study develops a nonparametric Bayesian model to estimate the travel time distribution for freeways. The model is based on the Dirichlet process, extended with a hierarchical structure to account for the mixture/multistate characteristics of a given dataset. During the modeling process, the proposed model is truncated at an upper bound of six mixture components to reduce computational cost. Unlike a parametric model, this model does not require the true number of components to be specified; instead, the number of components grows with the dataset and is inferred automatically within the Bayesian posterior inference framework. The posterior distributions of the model parameters are derived using the Metropolis-Hastings Markov Chain Monte Carlo (MCMC) sampler. For this study, an Interstate 295 freeway corridor located in Jacksonville, Florida, was studied using 2015 traffic data.

The next section reviews relevant studies, followed by the methodological framework used in this research. The dataset and the method used to estimate the travel time are then discussed. Next, the results and a model evaluation using simulated data with known parameters are presented, after which conclusions and recommendations for possible future research are given.

#### 2. Literature Review

The literature indicates that models for estimating the travel time distribution fall into two groups: single probability (unimodal) models and multistate/mixture models. Unimodal distributions commonly used to estimate the travel time distribution are the Gaussian, lognormal, gamma, Weibull, and Burr distributions [2]. Findings from several comparative studies of unimodal distribution functions suggest that the travel time distribution is skewed, which makes the lognormal, gamma, Burr, and Weibull distributions more accurate than the Gaussian distribution. For example, using hourly-based data, Kieu et al. [2] compared Gaussian, lognormal, gamma, Burr, and Weibull models and concluded that the lognormal function fits the travel time distribution better than the other models. Similar findings are reported by Arroyo and Kornhauser [3], Rakha et al. [4], and Emam and Al-Deek [5]. On the other hand, Pu [6] reported that, during congested and free flow conditions, the travel time distribution is close to symmetrical, which suggests that a Gaussian distribution is appropriate. At the onset and offset of congestion, however, the distribution is skewed, and Pu [6] suggested that the lognormal distribution fits these conditions well.
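The kind of comparison described above can be sketched in a few lines. The example below is a minimal illustration, not the procedure used in the cited studies: it fits a Gaussian and a lognormal model by their closed-form maximum likelihood estimates and compares the maximized log-likelihoods. The sample is synthetic and right-skewed by construction, and the function names are hypothetical.

```python
import math
import random

def normal_loglik(x):
    """Maximized Gaussian log-likelihood using the closed-form MLEs."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def lognormal_loglik(x):
    """Maximized lognormal log-likelihood.

    Equivalent to a Gaussian fit on log(x) plus the Jacobian term -sum(log x).
    """
    logs = [math.log(v) for v in x]
    return normal_loglik(logs) - sum(logs)

# Synthetic, right-skewed travel times in minutes (illustrative values only).
rng = random.Random(0)
sample = [rng.lognormvariate(math.log(12.0), 0.4) for _ in range(500)]

gaussian_fit = normal_loglik(sample)
lognormal_fit = lognormal_loglik(sample)
```

Both models have two free parameters, so the log-likelihoods can be compared directly; on skewed data such as this sample, the lognormal value is expected to be the larger one.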

Multistate/mixture models are models comprising two or more distributions. In mixture modeling, the individual component distributions are combined linearly as a weighted sum, and the weights are the mixing probabilities of the model. Studies comparing mixture models with single models have shown that mixture models fit the travel time distribution better than single models [1, 7–9]. Using field data collected on the Interstate 35 (I-35) freeway in San Antonio, Texas, Guo et al. [7] compared different multistate models and found that the lognormal multistate distribution outperforms the other models in modeling the travel time distribution. This finding is consistent with the results of Yang and Wu [10]. As a result, our study also adopts the lognormal distribution in the analysis. It should be understood that, for the same road geometric characteristics (e.g., lane width, pavement condition, posted speed limit, and number of lanes), the multistate characteristic of travel time on freeways is attributed to differences in vehicle types, traffic conditions, incidents, and driving characteristics. In addition to these factors, travel time on arterial roads is influenced by traffic signals, conflicts with pedestrians, and other factors [9, 11, 12].
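The weighted-sum construction can be made concrete with a short sketch. The code below evaluates the density of a lognormal mixture as the sum of component densities scaled by their mixing probabilities. The two-state parameter values (a free flow state and a congested state) are hypothetical and chosen only for illustration.

```python
import math

def lognormal_pdf(t, mu, sigma):
    """Lognormal density at t; mu and sigma are on the log scale."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(t, weights, mus, sigmas):
    """Weighted sum of lognormal component densities."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing weights must sum to 1")
    return sum(w * lognormal_pdf(t, m, s)
               for w, m, s in zip(weights, mus, sigmas))

# Hypothetical two-state example; travel times in minutes.
weights = [0.7, 0.3]                      # mixing probabilities
mus = [math.log(10.0), math.log(18.0)]    # log of each state's median
sigmas = [0.10, 0.25]                     # log-scale spread of each state
density_at_12 = mixture_pdf(12.0, weights, mus, sigmas)
```

Because each component integrates to one and the weights sum to one, the mixture is itself a valid probability density.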

In multistate modeling, two methods are commonly used to find the model parameters: maximum likelihood estimation with expectation maximization (MLE-EM) and the Bayesian approach (BA) [13]. The MLE-EM method treats the component memberships of the mixture as missing variables and alternates iteratively between the E-step and the M-step to find the parameters of the model [14]. The method starts from a random initial guess, and the parameters converge after sufficient iterations. Compared to the BA, the MLE-EM method is computationally less expensive. However, it is susceptible to becoming trapped in local maxima, which could result in overfitting of the resulting model [14]. Unlike the MLE-EM method, the BA treats the model parameters as distributions that can be updated as new data become available. The BA also incorporates prior knowledge regarding the travel time distribution [15], which can be obtained from previously observed characteristics of the data distribution. Moreover, studies indicate that, by using informative priors, the BA can estimate the posterior distributions from smaller samples than the MLE-EM approach requires [15, 16].
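The alternation between the E-step and the M-step can be sketched as follows for a lognormal mixture. This is a minimal illustration under stated assumptions, not the implementation used in the cited studies: EM is run on the log travel times (where each lognormal component becomes Gaussian), the initial means are drawn at random from the data, and a small floor on the spread guards against degenerate components. The function names and the synthetic data are hypothetical.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_lognormal_mixture(times, k, iters=300, seed=0):
    """EM for a k-component lognormal mixture, run on log travel times."""
    rng = random.Random(seed)
    y = [math.log(t) for t in times]
    n = len(y)
    mean_y = sum(y) / n
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    # Random initial guess: means sampled from the data, equal weights.
    mus = rng.sample(y, k)
    sigmas = [sd_y / k] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for v in y:
            p = [w * normal_pdf(v, m, s)
                 for w, m, s in zip(weights, mus, sigmas)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: responsibility-weighted updates of the parameters.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            mus[j] = sum(r[j] * v for r, v in zip(resp, y)) / nj
            var = sum(r[j] * (v - mus[j]) ** 2 for r, v in zip(resp, y)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)  # floor avoids collapse
    return weights, mus, sigmas

# Synthetic two-state data in minutes (illustrative values only).
rng = random.Random(1)
times = ([rng.lognormvariate(math.log(10.0), 0.08) for _ in range(300)]
         + [rng.lognormvariate(math.log(20.0), 0.15) for _ in range(200)])
weights, mus, sigmas = em_lognormal_mixture(times, k=2)
medians = sorted(math.exp(m) for m in mus)
```

Note that the number of components `k` must be supplied by the user, and that a different seed can lead to a different local maximum when the states overlap more than in this example.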

Taken together, the probability distributions discussed above are parametric, with either single-model or multistate characteristics, whereby the multistate model consists of a fixed number of mixture components that is specified as an input to the model. Information criteria, cross-validation, and the Bayes factor are procedures commonly used to select the best model among a set of candidates [13]. However, these selection procedures sometimes yield a model that suffers from overfitting or underfitting, depending on the amount of data available and on the bound on model complexity [17, 18].
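As an illustration of selection by information criterion, the sketch below scores candidate mixtures with the standard AIC and BIC formulas and returns the number of components with the lowest score. The parameter count assumes a k-component lognormal mixture (k log-scale means, k log-scale spreads, and k − 1 free weights). The log-likelihood values in the usage lines are hypothetical placeholders, not results from any dataset.

```python
import math

def n_free_params(k):
    """Free parameters of a k-component lognormal mixture."""
    return 3 * k - 1  # k means + k sigmas + (k - 1) weights

def aic(log_lik, k):
    return 2 * n_free_params(k) - 2 * log_lik

def bic(log_lik, k, n):
    return n_free_params(k) * math.log(n) - 2 * log_lik

def select_k(log_liks, n, criterion="bic"):
    """log_liks maps k to its maximized log-likelihood; lowest score wins."""
    scores = {}
    for k, ll in log_liks.items():
        scores[k] = bic(ll, k, n) if criterion == "bic" else aic(ll, k)
    return min(scores, key=scores.get)

# Hypothetical maximized log-likelihoods for k = 1, 2, 3 (placeholders).
candidate_fits = {1: -120.0, 2: -100.0, 3: -99.5}
best_k = select_k(candidate_fits, n=200)
```

Each candidate must be fitted separately before it can be scored, which is the step a nonparametric mixture avoids.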

However, two methods can be used in modeling to avoid the overfitting and underfitting problems. The use of the infinite Dirichlet Process Mixture Model (DPMM) with a truncated number of mixture components overcomes the underfitting problem [17–20]. The overfitting problem can be overcome by using a BA to estimate the posterior distribution of the parameters [18]. In this study, both the DPMM and the BA were used in modeling the travel time distribution. The infinite number of mixture components is achieved by applying the stick-breaking process to build the mixing weights of the mixture. Because of this infinite set of mixture components, the model is considered a typical nonparametric model [21, 22]. Although the model is taken as infinite, only a few nonempty components are drawn, depending on the actual characteristics of the given dataset [23]. Generally, the number of nonempty components is smaller than the sample size considered in the analysis.
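The stick-breaking construction of the mixing weights can be sketched as below. This is a minimal illustration under assumptions: the break fractions are drawn from Beta(1, α), the process is truncated at `k_max` components with the last component absorbing the leftover mass, and the values α = 1 and 200 observations are hypothetical choices rather than settings taken from this study.

```python
import random

def stick_breaking_weights(alpha, k_max, rng):
    """Truncated stick-breaking: k_max mixing weights that sum to 1."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(k_max - 1):
        beta = rng.betavariate(1.0, alpha)  # fraction broken off
        weights.append(beta * remaining)
        remaining *= 1.0 - beta
    weights.append(remaining)  # last component takes the leftover mass
    return weights

rng = random.Random(42)
weights = stick_breaking_weights(alpha=1.0, k_max=6, rng=rng)

# Assign observations to components and count the nonempty ones.
labels = rng.choices(range(len(weights)), weights=weights, k=200)
n_nonempty = len(set(labels))
```

Smaller values of α concentrate the mass on the first few sticks, so fewer components tend to be nonempty; larger values spread the mass over more components.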

Bayesian nonparametric mixture models have been implemented in a wide range of applications, including topic modeling, image analysis, and lifetime distribution [21, 24–26]. Their attractiveness includes the ability to handle randomness in the mixing distribution of a noisy dataset. The randomness of the mixing components is estimated using infinite-dimensional priors, whereby, during sampling, the true mixture components are built automatically and the rest die out. This study constructed priors using the stick-breaking process [21]. This process represents an infinite discrete distribution in which values from previous draws can be repeated with positive probability. This characteristic makes the stick-breaking process appropriate for clustering data with multistate characteristics. However, handling an infinite-dimensional posterior distribution can be computationally expensive [27]. To mitigate this problem, the literature suggests using truncated priors of finite dimension [27].
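A short calculation indicates how much prior mass a truncation leaves out. The notation here is an assumption for illustration and may differ from the article's: the break fractions are $\beta_k \sim \mathrm{Beta}(1, \alpha)$, drawn independently, and $w_k$ is the weight of component $k$.

$$
w_k = \beta_k \prod_{j<k} (1 - \beta_j), \qquad
R_K = 1 - \sum_{k=1}^{K} w_k = \prod_{k=1}^{K} (1 - \beta_k), \qquad
\mathbb{E}[R_K] = \left( \frac{\alpha}{1 + \alpha} \right)^{K}.
$$

The last equality follows from $\mathbb{E}[1 - \beta_k] = \alpha / (1 + \alpha)$ and the independence of the breaks. For example, with a hypothetical $\alpha = 1$ and $K = 6$, the expected mass beyond the first six components is $(1/2)^6 = 1/64 \approx 0.016$, so the truncated prior retains most of the mass of the infinite one.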

#### 3. Model Framework

The Dirichlet distribution is the multivariate generalization of the Beta distribution to more than two outcomes. The distribution is parameterized by a concentration parameter and mixture components. Its probability density function is given by (1). The definitions of the terms in (1) through (4) are given in the Abbreviations.
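For reference, the Dirichlet density in its standard textbook form is shown below. The symbols $K$, $x_k$, and $\alpha_k$ are assumptions made here for illustration and may differ from the notation the article uses in (1).

$$
f(x_1, \ldots, x_K; \alpha_1, \ldots, \alpha_K)
= \frac{\Gamma\!\left( \sum_{k=1}^{K} \alpha_k \right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)}
\prod_{k=1}^{K} x_k^{\alpha_k - 1},
\qquad x_k \ge 0, \quad \sum_{k=1}^{K} x_k = 1, \quad \alpha_k > 0.
$$

With $K = 2$ this reduces to the $\mathrm{Beta}(\alpha_1, \alpha_2)$ density in $x_1$, which is the sense in which the Dirichlet distribution generalizes the Beta distribution.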

The Dirichlet process can be described as a distribution over distributions on an infinite sample space [21]. A mixture model with a hierarchical structure can be constructed using the Dirichlet process; this model is referred to as the DPMM [21, 28]. Figure 1 shows a graphical representation of the hierarchical mixture model.