Mathematical Problems in Engineering

Volume 2015, Article ID 615468, 11 pages

http://dx.doi.org/10.1155/2015/615468

## Estimation of Bimodal Urban Link Travel Time Distribution and Its Applications in Traffic Analysis

^{1}The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, Shanghai 201804, China

^{2}Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Sipailou No. 2, Nanjing 210096, China

^{3}Department of Civil and Environmental Engineering, University of California, Davis, CA 95616, USA

Received 30 December 2014; Accepted 20 February 2015

Academic Editor: Gisele Mophou

Copyright © 2015 Yuxiong Ji et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Vehicles travelling on urban streets are heavily influenced by traffic signal controls, pedestrian crossings, and conflicting traffic from cross streets, which can result in bimodal travel time distributions, with one mode corresponding to travel without delays and the other to travel with delays. A hierarchical Bayesian bimodal travel time model is proposed to capture the interrupted nature of urban traffic flows. The travel time distributions obtained from the proposed model are then used to analyze traffic operations and to estimate the travel time distribution in real time. The advantage of the proposed bimodal model is demonstrated using empirical data, and the results are encouraging.

#### 1. Introduction

Travel time is an important piece of information for transportation planners, traffic operators, and road users. It has been widely used in the studies of route choice, origin-destination (OD) flow estimation, and transportation system reliability [1]. Travel time is also an essential input for the design and implementation of advanced route guidance and traveler information systems.

Loop detectors are the most common data source for travel time estimation, particularly on freeways. Since loop detectors provide traffic information, such as volume, speed, and occupancy, at fixed locations, additional assumptions need to be made to estimate vehicle travel times [2, 3]. Recently, probe vehicles utilizing automatic vehicle location (AVL) technologies such as those based on the global positioning system (GPS) have been used to collect vehicle travel times directly. Data from probe vehicles have been considered in various applications, such as congestion identification and incident detection [4–6].

Travel times are traditionally modeled with unimodal distributions, such as normal, lognormal, Gamma, and Burr distributions [7–9]. Since the mean is a commonly used summary statistic for variables that are unimodally distributed, some studies specifically focused on the estimation of the mean travel time (or speed) [4, 10–12]. For example, Sen et al. [4] and Hellinga and Fu [10] analyzed probe link travel times and concluded that the sample mean does not asymptotically approach the population mean. Hellinga and Fu [11] developed a method based on stratified sampling techniques to reduce the bias in the mean travel time estimation. Pu et al. [12] developed a Bayesian updating procedure to infer the mean speed of general vehicles from bus probe data.

However, recent studies have revealed that travel time distribution tends to be bimodal and even multimodal. Jintanakul et al. [13] argued that travel time distributions on freeways would not be unimodal due to the mixes of driving patterns and of vehicle types. They developed a bimodal model and illustrated it using simulation data. Guo et al. [14] proposed a multistate model to assess travel time distributions on freeways. They explained that travel time distributions would not be unimodal since multiple traffic states could exist in the same period.

Travel time distributions on urban streets are more complex than those on freeways. Vehicles travelling on urban streets are heavily influenced by traffic signal controls, pedestrian crossings, and conflicting traffic from cross streets. The interrupted nature of urban traffic flows would likely result in large fluctuations in observed travel times. For example, a vehicle passing a signal at the end of the green would experience a quite different travel time from the vehicle following behind it that must stop for the red, although the two traveled next to each other. Taylor and Somenahalli [1] have demonstrated empirically that urban link travel times are bimodally distributed. The bimodal property of urban link travel time distributions has also been studied analytically in [15, 16].

This paper develops methodologies to analyze urban link travel times from probe vehicles. Unlike previous travel time studies, this paper not only develops methodologies to capture important characteristics of the travel time distributions, but also explores potential applications of the travel time distributions resulting from the developed model. Specifically, the resulting travel time distributions are used to analyze traffic operations and estimate travel time distribution in real time. The advantage of the developed methodologies is demonstrated using empirical data.

The rest of the paper starts with the travel time model and the algorithms to estimate model parameters. Then we introduce the dataset used in the empirical study. Next we demonstrate how to use the travel time distributions to analyze traffic operations and estimate travel time distribution in real time. Finally, we conclude the paper with a summary of the key findings and possible future research.

#### 2. The Methodologies

##### 2.1. A Hierarchical Bayesian Bimodal Travel Time Model

Probe travel times collected on a given urban link in a given time-of-day period over multiple days are considered in this study. A hierarchical Bayesian bimodal model is developed to fit travel time distributions. Travel time distributions on urban streets are affected by various factors. Some of the factors, such as geometric characteristics (length, lane width, speed limits, etc.), are time-invariant, while others, such as traffic volume, traffic mix, and signal timing, could vary over days. Therefore, it is natural to model travel times hierarchically, with observed travel times modeled conditionally on certain parameters, which themselves are given a probabilistic specification in terms of further parameters, known as hyperparameters. The associations among travel times in different days are captured by using a joint probability distribution for model parameters in different days.

Given that travel times are positive, that their distributions are positively skewed, and that urban link travel time distributions tend to be bimodal, it is assumed that each component of the bimodal travel time distribution follows a lognormal distribution. That is, the model is fitted to the logarithms of the travel time observations. Specifically, the log travel times in a day are modeled as a two-component mixture: one component is referred to as “fast vehicle” and the other is referred to as “slow vehicle.” A travel time observation is considered to come from the “fast vehicle” distribution with probability $1 - \lambda_j$ ($0 < \lambda_j < 1$) and from the “slow vehicle” distribution with probability $\lambda_j$. In a day $j$, the “fast vehicle” distribution is assumed to be a normal distribution with mean $\mu_{1j}$ and variance $\sigma_{1j}^2$, and the “slow vehicle” distribution is assumed to be a normal distribution with mean $\mu_{2j}$ and variance $\sigma_{2j}^2$. The variation of the travel times between days is modeled by having the mean $\mu_{1j}$ following a normal distribution with mean $m_1$ and variance $\tau_1^2$ and $\mu_{2j}$ following a normal distribution with mean $m_2$ and variance $\tau_2^2$. Let $y_{ij}$ represent the log travel time for an observation $i$ in a day $j$; the model can be written in the following hierarchical form:

$$y_{ij} \mid z_{ij}, \mu_{1j}, \mu_{2j}, \sigma_{1j}^2, \sigma_{2j}^2 \sim N\big((1 - z_{ij})\mu_{1j} + z_{ij}\mu_{2j},\; (1 - z_{ij})\sigma_{1j}^2 + z_{ij}\sigma_{2j}^2\big), \tag{1}$$

$$\mu_{1j} \mid m_1, \tau_1^2 \sim N(m_1, \tau_1^2), \tag{2}$$

$$\mu_{2j} \mid m_2, \tau_2^2 \sim N(m_2, \tau_2^2), \tag{3}$$

$$z_{ij} \mid \lambda_j \sim \text{Bernoulli}(\lambda_j), \tag{4}$$

where $z_{ij}$ is an unobserved indicator variable that equals one if an observation $i$ in a day $j$ belongs to the “slow vehicle” group and zero if it belongs to the “fast vehicle” group. The parameter $\mu_{1j}$ represents the expected log travel time in a day $j$ if a vehicle is not delayed. The parameter $\mu_{2j}$ represents the expected log travel time in a day $j$ when a vehicle encounters a long delay. The mixture probability $\lambda_j$ represents the probability that a vehicle encounters a long delay. The expected travel times for vehicles in the “slow vehicle” and “fast vehicle” groups and the expected delay in a day $j$ can be obtained by, respectively,

$$E(T_j^{\text{slow}}) = \exp\!\left(\mu_{2j} + \frac{\sigma_{2j}^2}{2}\right), \tag{5}$$

$$E(T_j^{\text{fast}}) = \exp\!\left(\mu_{1j} + \frac{\sigma_{1j}^2}{2}\right), \tag{6}$$

$$E(D_j) = E(T_j^{\text{slow}}) - E(T_j^{\text{fast}}). \tag{7}$$
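To make the mixture structure concrete, the following minimal sketch draws one day of travel times from a two-component lognormal mixture and evaluates the expected travel time of each component using the lognormal mean, exp(mean + variance / 2), of a normal variable on the log scale. The function names and all numerical values are hypothetical and chosen only for illustration, and the delay is taken here to be the gap between the two expected travel times, which is one reading of "expected delay."

```python
import math
import random


def simulate_day(n, lam, mu_fast, sd_fast, mu_slow, sd_slow, rng):
    """Draw n travel times for one day from a two-component lognormal mixture.

    lam is the probability that an observation belongs to the slow component;
    the mu/sd arguments are means and standard deviations of LOG travel time.
    """
    times = []
    for _ in range(n):
        is_slow = rng.random() < lam  # indicator z = 1 with probability lam
        mu, sd = (mu_slow, sd_slow) if is_slow else (mu_fast, sd_fast)
        times.append(math.exp(rng.gauss(mu, sd)))
    return times


def expected_times(mu_fast, sd_fast, mu_slow, sd_slow):
    """Return (fast mean, slow mean, slow minus fast) on the travel time scale."""
    fast = math.exp(mu_fast + sd_fast ** 2 / 2.0)
    slow = math.exp(mu_slow + sd_slow ** 2 / 2.0)
    return fast, slow, slow - fast


# Hypothetical values: about 25 s when not delayed, about 60 s when delayed.
rng = random.Random(0)
sample = simulate_day(50, 0.4, math.log(25.0), 0.1, math.log(60.0), 0.2, rng)
fast_mean, slow_mean, delay = expected_times(math.log(25.0), 0.1, math.log(60.0), 0.2)
```

Because the component means are set on the log scale, the expected travel times come out slightly above 25 s and 60 s; the variance term in the exponent accounts for the right skew of the lognormal distribution.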

##### 2.2. Posterior Distributions of the Model Parameters

Weak prior distributions (i.e., the prior densities are diffuse) are used in this paper such that the posterior distributions are dominated by the data. The prior distribution of the mixture probability $\lambda_j$ is taken to be uniform on (0.01, 0.99), as values of zero or one would not correspond to mixture distributions. A Gamma distribution with shape and rate parameters of 0.001 is used for the prior distributions of the precisions $1/\sigma_{1j}^2$, $1/\sigma_{2j}^2$, $1/\tau_1^2$, and $1/\tau_2^2$. A normal distribution with mean 0 and variance 10000 is used for the prior distributions of the hyperparameters $m_1$ and $m_2$. One parameter is restricted to be positive so that the two mixture components cannot exchange labels and the model is identifiable [17].

Given the distributional assumptions of (1)–(4) and the assumptions of the prior distributions, the joint posterior distribution of the model parameters and the unobserved indicator variables is given by

$$p(\Theta, Z \mid Y) \propto p(\Theta)\, p(Z \mid \Theta)\, p(Y \mid Z, \Theta), \tag{8}$$

where $\Theta$ and $Z$ represent the sets of the model parameters and of the unobserved indicator variables, respectively, and $Y$ represents the collection of the travel time observations across all days.

Even though the proposed model is developed based on standard distributions, the marginal posterior distributions of the model parameters are analytically intractable [17]. We use the Gibbs sampler [17] to simulate the posterior distributions. The Gibbs sampler divides the parameter vector into several components and draws samples from a joint posterior distribution (e.g., (8)) by sampling each component in turn from its posterior distribution conditional on the current values of the other components. This method is useful when it is difficult to sample from the marginal posterior distributions directly but easy to sample from the conditional posterior distributions.
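The component-by-component sampling idea can be illustrated with a deliberately simplified Gibbs sampler. The sketch below is not the hierarchical model or the JAGS implementation used in this paper: it treats a single day, assumes the two component standard deviations are known, uses flat priors on the two means and a uniform prior on (0, 1) for the mixture probability, and does not enforce any ordering constraint. The function names and the synthetic data are hypothetical.

```python
import math
import random


def normal_pdf(y, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def gibbs_mixture(y, sd_fast, sd_slow, n_iter=600, burn_in=100, seed=1):
    """Gibbs sampler for a two-component normal mixture with known sds.

    Cycles through three conditional draws: indicators, mixture probability,
    and component means. Returns post-burn-in draws (mu_fast, mu_slow, lam).
    """
    rng = random.Random(seed)
    mu_fast, mu_slow, lam = min(y), max(y), 0.5  # crude starting values
    draws = []
    for it in range(n_iter):
        # 1) Indicators given the current means and mixture probability.
        z = []
        for yi in y:
            p_slow = lam * normal_pdf(yi, mu_slow, sd_slow)
            p_fast = (1.0 - lam) * normal_pdf(yi, mu_fast, sd_fast)
            z.append(1 if rng.random() < p_slow / (p_slow + p_fast) else 0)
        n_slow = sum(z)
        n_fast = len(y) - n_slow
        # 2) Mixture probability given the indicators: Beta(1 + n_slow, 1 + n_fast).
        lam = rng.betavariate(1 + n_slow, 1 + n_fast)
        # 3) Each mean given its group (flat prior): normal around the group mean.
        if n_fast > 0:
            mean_fast = sum(yi for yi, zi in zip(y, z) if zi == 0) / n_fast
            mu_fast = rng.gauss(mean_fast, sd_fast / math.sqrt(n_fast))
        if n_slow > 0:
            mean_slow = sum(yi for yi, zi in zip(y, z) if zi == 1) / n_slow
            mu_slow = rng.gauss(mean_slow, sd_slow / math.sqrt(n_slow))
        if it >= burn_in:
            draws.append((mu_fast, mu_slow, lam))
    return draws


# Synthetic log travel times: 70% around 3.0 and 30% around 4.0 (hypothetical).
data_rng = random.Random(0)
log_times = [data_rng.gauss(4.0, 0.1) if data_rng.random() < 0.3
             else data_rng.gauss(3.0, 0.1) for _ in range(200)]
draws = gibbs_mixture(log_times, sd_fast=0.1, sd_slow=0.1)
post_mu_fast = sum(d[0] for d in draws) / len(draws)
post_mu_slow = sum(d[1] for d in draws) / len(draws)
post_lam = sum(d[2] for d in draws) / len(draws)
```

Each pass only requires draws from standard distributions (Bernoulli, Beta, and normal), which is what makes the conditional approach convenient even when the marginal posteriors have no closed form.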

We adopt the “rjags” package (an R interface to JAGS, short for “just another Gibbs sampler”) in the statistical software “R” to simulate the marginal posterior distributions of the model parameters [18]. The implementation is straightforward for readers familiar with “rjags” and, therefore, the algorithm is not discussed further in the paper.

##### 2.3. Real-Time Estimation of Travel Time Distribution

Real-time estimation of the travel time distribution is an important component of advanced traveler information systems. Conditional on the travel times observed in real time, we develop a methodology to estimate the parameters of the travel time distribution for a given period of a given day. Historical data enter the methodology through the prior distribution of the model parameters.

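The parameter set ($\theta$) of the travel time distribution in a given period of a given day consists of the component means $\mu_1$ and $\mu_2$, the component variances $\sigma_1^2$ and $\sigma_2^2$, and the mixture probability $\lambda$ (see (1) and (4)). Note that the subscript $j$ representing a day is omitted for simplicity in the following without loss of generality. The prior information and the data available in real time ($y$) are combined to produce the posterior distribution of the parameter set ($\theta$) and the set of the unobserved indicator variables ($z$):

$$p(\theta, z \mid y) \propto p(\theta)\, p(z \mid \theta)\, p(y \mid z, \theta). \tag{9}$$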

The posterior distributions of the model parameters obtained based on historical data are used as the prior distributions in (9). Conjugate prior distributions are adopted to summarize the samples of the posterior distributions produced by the MCMC algorithm. The conjugacy property means that the posterior distribution follows the same parametric form as the prior distribution [17]. Specifically, the prior distributions of the means $\mu_1$ and $\mu_2$ are assumed to be normal distributions ((2) and (3), resp.). The prior distributions of the precisions $1/\sigma_1^2$ and $1/\sigma_2^2$ are assumed to be Gamma($a_1$, $b_1$) and Gamma($a_2$, $b_2$), respectively. The prior distribution of the mixture probability $\lambda$ is assumed to be Dirichlet($c_1$, $c_2$). The parameters of the prior distributions are estimated by matching their theoretical mean and variance to the corresponding values estimated from the MCMC samples.
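The moment-matching step can be sketched as follows. A Gamma distribution with shape $a$ and rate $b$ has mean $a/b$ and variance $a/b^2$, and a two-parameter Dirichlet (that is, a Beta distribution) with parameters $c_{\text{slow}}$ and $c_{\text{fast}}$ has mean $c_{\text{slow}}/(c_{\text{slow}} + c_{\text{fast}})$; solving these for the parameters gives the functions below. The function names, the slow/fast labelling of the Dirichlet parameters, and the sample values are assumptions made for illustration only.

```python
import statistics


def gamma_from_moments(mean, var):
    """Shape and rate of the Gamma distribution with the given mean and variance."""
    shape = mean ** 2 / var
    rate = mean / var
    return shape, rate


def beta_from_moments(mean, var):
    """Parameters (c_slow, c_fast) of the Beta distribution for the probability
    of the slow component, matched to the given mean and variance.

    Requires var < mean * (1 - mean), otherwise no Beta distribution fits.
    """
    total = mean * (1.0 - mean) / var - 1.0  # c_slow + c_fast
    return mean * total, (1.0 - mean) * total


def prior_from_samples(samples, matcher):
    """Apply a moment matcher to the sample mean and sample variance of MCMC draws."""
    return matcher(statistics.mean(samples), statistics.variance(samples))


# Hypothetical posterior draws of a precision and of the mixture probability.
precision_draws = [95.0, 102.0, 110.0, 98.0, 105.0, 90.0]
lambda_draws = [0.28, 0.31, 0.35, 0.30, 0.26, 0.33]
a1, b1 = prior_from_samples(precision_draws, gamma_from_moments)
c_slow, c_fast = prior_from_samples(lambda_draws, beta_from_moments)
```

A tightly concentrated set of draws yields large parameter values, so the historical information weighs more heavily against the few observations available in real time.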

The MCMC algorithm can be used to produce the posterior distributions of the model parameters. Nevertheless, the MCMC algorithm may not be suitable for real-time application due to its relatively high computational requirement. Therefore, we developed an expectation conditional maximization (ECM) algorithm [19] to produce the maximum likelihood estimates (MLE) of the model parameters (i.e., the parameter set $\theta$). The ECM algorithm, which includes an E- (expectation-) step and a CM- (conditional maximization-) step, is an iterative method for finding the marginal posterior mode from the joint posterior distribution [19]. In the context of the problem of interest, the ECM algorithm finds the mode of the marginal posterior distribution of the parameter set $\theta$ from the joint posterior distribution of (9). The ECM algorithm, as applied to the problem at hand, consists of the following.

(1) Start with some (likely crude) estimates of the model parameters $\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$, and $\lambda$.

(2) For iteration $k = 1, 2, \ldots$, apply the following two steps iteratively:

(2.1) E-step: determine the expected value of each indicator $z_i$ given the current parameter estimates:

$$E(z_i) = \frac{\lambda f_2(y_i)}{(1 - \lambda) f_1(y_i) + \lambda f_2(y_i)}, \tag{10}$$

where $f_1(y_i)$ represents the density of an observation $y_i$, assuming that it follows the “fast vehicle” distribution $N(\mu_1, \sigma_1^2)$, and $f_2(y_i)$ represents the density of an observation $y_i$, assuming that it follows the “slow vehicle” distribution $N(\mu_2, \sigma_2^2)$.

(2.2) CM-steps: with the indicators replaced by their expected values, update each parameter in turn to the value that maximizes the joint posterior while the other parameters are held at their most recent values:

(a) update the probability $\lambda$;

(b) update the expected parameter $\mu_1$;

(c) update the expected parameter $\mu_2$;

(d) update the variance $\sigma_1^2$;

(e) update the variance $\sigma_2^2$.

Steps (2.1) and (2.2) are applied iteratively until some stopping criteria are satisfied.
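The exact form of each CM update depends on how the priors are parameterized. As a rough sketch, assuming normal priors on the two component means, Gamma (shape, rate) priors on the two precisions, and a two-parameter Dirichlet (Beta) prior on the mixture probability, each update below is the value that maximizes the expected joint log posterior in one parameter with the others held fixed. The function name `ecm_map`, the dictionary keys, the starting values, and the synthetic data are hypothetical, and the variance updates take the mode in the precision parameterization, which is one choice among several.

```python
import math
import random


def normal_pdf(y, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-0.5 * (y - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def ecm_map(y, prior, init, n_iter=200, tol=1e-8):
    """ECM iterations for the posterior mode of a two-component normal mixture.

    init = (lam, mu1, mu2, var1, var2); index 1 is fast, index 2 is slow.
    """
    lam, mu1, mu2, var1, var2 = init
    n = len(y)
    for _ in range(n_iter):
        old = (lam, mu1, mu2, var1, var2)
        # E-step: expected indicator (probability of slow) for each observation.
        w = []
        for yi in y:
            slow = lam * normal_pdf(yi, mu2, var2)
            fast = (1.0 - lam) * normal_pdf(yi, mu1, var1)
            w.append(slow / (slow + fast))
        n2 = sum(w)
        n1 = n - n2
        # CM (a): mode of the Beta posterior for the mixture probability.
        lam = (n2 + prior["c_slow"] - 1.0) / (n + prior["c_slow"] + prior["c_fast"] - 2.0)
        # CM (b), (c): precision-weighted average of prior mean and weighted data.
        s1 = sum((1.0 - wi) * yi for wi, yi in zip(w, y))
        s2 = sum(wi * yi for wi, yi in zip(w, y))
        mu1 = (s1 / var1 + prior["m1"] / prior["tau1_sq"]) / (n1 / var1 + 1.0 / prior["tau1_sq"])
        mu2 = (s2 / var2 + prior["m2"] / prior["tau2_sq"]) / (n2 / var2 + 1.0 / prior["tau2_sq"])
        # CM (d), (e): inverse of the Gamma posterior mode of each precision.
        ss1 = sum((1.0 - wi) * (yi - mu1) ** 2 for wi, yi in zip(w, y))
        ss2 = sum(wi * (yi - mu2) ** 2 for wi, yi in zip(w, y))
        var1 = (2.0 * prior["b1"] + ss1) / (2.0 * (prior["a1"] - 1.0) + n1)
        var2 = (2.0 * prior["b2"] + ss2) / (2.0 * (prior["a2"] - 1.0) + n2)
        new = (lam, mu1, mu2, var1, var2)
        # Stop once no parameter moves by more than tol.
        if max(abs(p - q) for p, q in zip(old, new)) < tol:
            break
    return {"lam": lam, "mu1": mu1, "mu2": mu2, "var1": var1, "var2": var2}


# Hypothetical priors (as if summarized from historical data) and synthetic data.
prior = {"m1": 3.0, "tau1_sq": 0.25, "m2": 4.0, "tau2_sq": 0.25,
         "a1": 2.0, "b1": 0.02, "a2": 2.0, "b2": 0.02,
         "c_slow": 2.0, "c_fast": 2.0}
rng = random.Random(0)
y = [rng.gauss(4.0, 0.15) if rng.random() < 0.3 else rng.gauss(3.0, 0.1)
     for _ in range(300)]
estimate = ecm_map(y, prior, init=(0.5, 2.8, 4.2, 0.05, 0.05))
```

Each iteration is a handful of weighted sums over the observations, which is far cheaper than running an MCMC chain and is the reason an ECM-type scheme suits real-time use. With very few real-time observations the updates stay close to the prior values, and they move toward the data as more observations arrive.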

#### 3. Empirical Study

The methodologies proposed in Section 2 are general and can be applied to any type of probe vehicle data. This paper evaluates the proposed methodologies using probe bus data. Even though the driving patterns of buses are different from those of general vehicles, studies have confirmed that bus travel times are highly correlated with those of general vehicles and that bus travel times can be used to infer the travel times of general vehicles [12, 20, 21]. The empirical results presented below would provide insight for further methodology developments and potential applications of the travel time distributions obtained by the proposed model.

##### 3.1. The Data

The probe bus data are provided by the Campus Area Bus Service (CABS) at the Ohio State University (OSU). The CABS serves approximately four million passengers annually on seven routes on and in the vicinity of the OSU Campus. GPS-based AVL systems have been used on all CABS buses since 2009. The AVL system records bus statuses (e.g., location and velocity) at a frequency of 1 Hz. This study considers AVL data collected by buses serving the Campus Loop South (CLS), Campus Loop North (CLN), and North Express (NE) bus routes. The advertised headways of the CLS, CLN, and NE bus routes are 9, 9, and 5 minutes, respectively.

Two links are considered in the empirical demonstration. The lengths of Links 1 and 2 are 248.1 and 216.8 meters, respectively. Link 1 contains a four-way signalized intersection, and Link 2 contains a four-way signalized intersection and a pedestrian crossing. So that the travel times capture the total time lost to vehicle acceleration and deceleration caused by signal controls or pedestrians crossing the street, links are defined such that the intersection and pedestrian crossing are located inside the links. In addition, although it is possible to remove the increase in travel time due to stopping at bus stops for passengers alighting and boarding [8], bus stops are excluded from the defined links to avoid the possible errors introduced by such a travel time correction.

Travel times collected in the a.m. (7:30 a.m.–7:45 a.m.) and p.m. (4:30 p.m.–4:45 p.m.) periods of 40 weekdays are considered for the empirical demonstration. Summary statistics of the travel time observations and the signal information for the corresponding travel direction are provided in Table 1. The three numbers in the third row represent the minimum, mean, and maximum numbers of travel time observations in the given period of a day.