Research Article  Open Access
Modeling Lane-Keeping Behavior of Bicyclists Using Survival Analysis Approach
Abstract
Bicyclists may leave the bicycle lane and occupy the adjacent motor lanes for various reasons. The resulting mixed traffic of cars and bicycles shows very complicated dynamic patterns and a higher accident risk. To investigate the reasons behind this phenomenon, the lifetime analysis method is adopted to examine observed data on bicycles crossing the bicycle lane and occupying the adjacent motor lanes. Two concepts, valid volume and probability of lane-keeping behavior, are introduced to evaluate the influence of external factors such as lane width and curb parking, and a semiparametric method is used to estimate the model with censored data. Six variables are used to accommodate the effects of traffic conditions. After the model estimation, the effects of the selected variables on the lane-keeping behavior are discussed. The results are expected to give a better understanding of bicyclist behavior.
1. Introduction
Our daily life and work are closely related to traffic and mobility. Nowadays, as a consequence of dramatically increasing traffic demand, traffic congestion has an immense negative impact on daily life and modern society [1, 2]. In many developing countries, such as China, a typical traffic phenomenon is “mixed traffic,” which consists of cars (representing motorized vehicles) and bicycles (representing nonmotorized vehicles). Such inhomogeneous traffic flow is considered an important cause of traffic congestion and accidents. Additionally, bicycles are widely used as a green mode of transport, so improving their traffic conditions is important. Therefore, researchers have turned their attention to the traffic characteristics of bicycles and of mixed traffic composed of bicycles and cars, from both theoretical and practical points of view. In research on bicycles and mixed traffic, traffic simulation is a popular method. Jiang et al. proposed a stochastic multi-value cellular automata model for bicycle flow [3]. Faghri and Egyháziová developed a microscopic simulation model of mixed motor vehicle and bicycle traffic over an entire urban network [4]. Zhao et al. described mixed traffic flow by combining the NaSch model and the Burger cellular automata (BCA) model and investigated the mixed traffic system near a bus stop [5, 6]. Mallikarjuna and Rao extended the cellular automata (CA) model to study heterogeneous traffic [7]. On the other hand, theoretical models derived from empirical data have also been proposed. Oketch proposed a microscopic model that describes mixed traffic flow by combining a car-following model with lateral movement [8]. Yang et al. presented a road capacity model for mixed traffic flow at curbside stops based on queuing theory and gap acceptance theory [9]. Guo et al. used the PLM model and the Weibull distribution to analyze the lane-crossing behavior of nonmotorized vehicles under the influence of curb parking [10].
In an urban street without segregation facilities, bicyclists may ride in the motor lane because the bicycle lane is blocked. Once the traffic conditions no longer satisfy bicyclists, they will change their travel route arbitrarily and may even occupy the motor lane. Particularly near bus stops or parking areas, the occupancy of the motor lane has a strong impact on traffic performance and safety [9]. The above-mentioned literature mainly focused on the influence of mixed traffic flow on traffic performance, and most of it adopted microsimulation methods. However, research on the traffic behavior of bicyclists themselves is limited. In this paper, the behavior of bicyclists staying in the bicycle lane (called lane-keeping behavior) is considered, and the question of why bicyclists cross the bicycle lane is discussed. With this aim, a survival-analysis-based approach is used to model the lane-keeping behavior of bicyclists under various external factors. To give a quantitative analysis of the lane-keeping behavior, the probability of lane-keeping behavior is studied using field data. A concept named valid volume is proposed and a semiparametric method is used to perform the analysis. The work is intended to provide a reference frame for evaluating the influence of traffic conditions on bicyclist behavior.
2. Method
2.1. Analysis of the Bicyclist Behavior
Some bicyclists are apt to travel in the adjacent motor lanes in order to obtain their desired riding conditions, for example, speed and space. When the bicycle lane is blocked for some reason, the probability of lane-crossing behavior increases markedly. Assume that the lane-crossing behavior occurs if the bicycle volume $q$ is higher than a critical value $Q$, yielding $F(q) = P(Q < q)$ (1), where $F(q)$ is the distribution function of the lane-crossing behavior.
In this paper, the critical volume $Q$ is defined as the valid volume; it represents the maximum volume at which bicycles would still travel in the bicycle lane. That is, the lane-crossing behavior would not occur if the bicycle volume is lower than the valid volume ($q < Q$). Here, another definition, the probability of lane-keeping behavior, is used: $S(q) = P(Q \ge q) = 1 - F(q)$ (2).
Under these definitions, the influence of external factors on bicyclist behavior is reflected by the distribution function of the lane-crossing behavior or, equivalently, of the lane-keeping behavior. Therefore, the influential factors (e.g., a narrowed or obstructed lane) can be analyzed from a macroscopic perspective. Using the volume as the input variable also simplifies data acquisition and processing, because some variables describing the behavior of an individual bicyclist are difficult to quantify.
2.2. Modeling Bicyclist Behavior Based on Lifetime Analysis
Survival analysis models (also called lifetime analysis) have been used extensively for several decades in biometrics and industrial engineering as a means of determining causality in lifetime data [11, 12]. In recent years, they have been applied in the field of transportation [13, 14], including the analysis of activity participation and scheduling, vehicle transaction analysis, and incident-duration analysis. These models concern the distribution of the lifetime $T$: $F(t) = P(T < t) = 1 - S(t)$ (3), where $F(t)$ is the distribution function of the lifetime data, representing the probability that an individual fails before $t$, and $S(t) = P(T \ge t)$ is the survival function, representing the probability that an individual survives longer than $t$. $S(t)$ is also called the reliability function.
The behavior of bicyclists traveling in bicycle lanes can be considered a valid state under particular conditions (e.g., lane widths, traffic volume, and curb parking). Such a valid state persists as the bicycle volume increases. If the volume becomes greater than the valid volume, the lane-crossing behavior will occur, meaning that the particular conditions can no longer satisfy the travel demands of bicyclists. The continuation of the valid state is analogous to the continuation of life. If the lane-crossing behavior is regarded as the termination of life, the methods for lifetime data analysis can be used to estimate the valid volume $Q$, which is the analog of the lifetime $T$. In addition, owing to the randomness of traffic behavior and the influence of curb parking, the same volume may correspond to two contrary behaviors, that is, crossing the lane or not. In this case, lifetime analysis models can handle the problem through censoring, whereas general statistical methods are no longer applicable [10]. The whole analogy between bicyclist behavior analysis and lifetime data analysis is given in Table 1.

Firstly, an important concept, the hazard function, is introduced. The hazard function at a specified volume $q$ is mathematically defined as $h(q) = \lim_{\Delta q \to 0} P(q \le Q < q + \Delta q \mid Q \ge q) / \Delta q$ (4).
The value of the hazard function is the hazard rate (or hazard), which is the instantaneous rate at which lane-crossing occurs within an infinitesimally small volume increment $\Delta q$ beyond volume $q$, given that it has not occurred below $q$. $h(q)\,\Delta q$ is the approximate probability of the lane-crossing behavior in $[q, q + \Delta q)$.
According to the mathematical relation between the hazard function and the survival function, the probability of lane-keeping can be obtained: $S(q) = \exp\left(-\int_0^q h(u)\,du\right)$ (5).
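The step from the hazard function to the lane-keeping probability can be sketched as follows. The notation is an assumption made for this sketch: $Q$ denotes the valid volume, $q$ a specified volume, $F$ and $S$ the lane-crossing and lane-keeping probabilities, and $f(q) = F'(q)$ the density of $Q$.

```latex
h(q) = \frac{f(q)}{S(q)} = -\frac{d}{dq}\ln S(q)
\quad\Longrightarrow\quad
\ln S(q) - \ln S(0) = -\int_0^q h(u)\,du
\quad\Longrightarrow\quad
S(q) = \exp\!\left(-\int_0^q h(u)\,du\right), \qquad S(0) = 1 .
```

The condition $S(0) = 1$ simply states that no lane-crossing is attributed to zero bicycle volume.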
To accommodate the effects of external factors, the hazard function can be written as $h(q \mid X) = h_0(q)\, g(\beta, X)$ (6), where $h_0(q)$ is the baseline hazard function, $g(\beta, X)$ is a known function representing the effects of covariates, $X$ is a column vector of covariates that is independent of duration time, and $\beta$ is a row vector of unknown parameters. The form of (6) is one of the popular mathematical models used for duration analysis and is known as the proportional hazard (PH) model.
In this study, a framework with a nonparametric baseline hazard, which was proposed by Cox using $g(\beta, X) = \exp(\beta X)$, is adopted [15]. With this parameterization, the hazard function is $h(q \mid X) = h_0(q) \exp(\beta X)$ (7).
The endurance probability function combining (5) and (7) can be written as $S(q \mid X) = \exp\left(-\int_0^q h_0(u) \exp(\beta X)\,du\right) = [S_0(q)]^{\exp(\beta X)}$ (8), where $S_0(q) = \exp\left(-\int_0^q h_0(u)\,du\right)$ is the baseline probability function for the lane-keeping behavior. It represents the probability without any external influence.
The shape of $h_0(q)$ in the PH model has important implications for data analysis. A parametric shape could also be chosen according to the data distribution. In this paper, the nonparametric baseline hazard is used to avoid the error that arises when an assumed parametric form is incorrect. The parameters can be estimated by the partial likelihood method. Other methods are described in [11, 12].
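As a rough illustration of the proportional hazard form, in which the baseline lane-keeping probability is raised to the power $\exp(\beta X)$, the sketch below evaluates the lane-keeping probability for a given covariate vector. The function name `lane_keeping_probability`, the baseline value, and the coefficients are hypothetical placeholders chosen for illustration; they are not the estimates obtained in this paper.

```python
import math


def lane_keeping_probability(s0, beta, x):
    """Return S(q | X) = S0(q) ** exp(beta . X) for one baseline value s0."""
    if not 0.0 <= s0 <= 1.0:
        raise ValueError("baseline probability must lie in [0, 1]")
    if len(beta) != len(x):
        raise ValueError("beta and x must have the same length")
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return s0 ** math.exp(linear_predictor)


# Hypothetical values, for illustration only (not the paper's estimates).
baseline = 0.80        # assumed S0(q) at some volume q
beta = [-0.5, 0.3]     # assumed coefficients: lane width, curb parking
wide_no_parking = lane_keeping_probability(baseline, beta, [1.0, 0.0])
narrow_parking = lane_keeping_probability(baseline, beta, [0.0, 1.0])
print(wide_no_parking > baseline > narrow_parking)  # prints True
```

A negative linear predictor pushes the probability above the baseline, and a positive one pushes it below, which is the multiplicative effect on the hazard expressed on the probability scale.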
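The partial likelihood idea can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the estimation procedure used in this study: bicycle volume plays the role of the time axis, ties are handled with the Breslow convention, and the function name and toy data are hypothetical.

```python
import math


def neg_log_partial_likelihood(beta, volumes, events, covariates):
    """Negative Cox log partial likelihood (Breslow handling of ties).

    volumes    : observed bicycle volume per interval (plays the role of time)
    events     : 1 if lane-crossing occurred (uncensored), 0 if censored
    covariates : list of covariate lists, one per interval
    """
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    total = 0.0
    for i, (q_i, d_i) in enumerate(zip(volumes, events)):
        if d_i != 1:
            continue  # censored intervals only enter through the risk sets
        # Risk set: all intervals whose volume is at least q_i.
        risk = sum(math.exp(scores[j])
                   for j, q_j in enumerate(volumes) if q_j >= q_i)
        total += scores[i] - math.log(risk)
    return -total


# Toy data (hypothetical, not the survey data of this paper).
volumes = [12, 18, 18, 25, 30]
events = [0, 1, 0, 1, 1]
covariates = [[3.0], [2.0], [3.5], [2.5], [2.0]]  # e.g., effective width in m
print(neg_log_partial_likelihood([0.0], volumes, events, covariates))
```

Minimizing this function over `beta` with a numerical optimizer would give the coefficient estimates; the baseline hazard never has to be specified, which is what makes the approach semiparametric.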
2.3. External Factors Selection
The selection of external factors takes into account the previous researches and arguments regarding the effects of the exogenous variables and human factors on bicyclist behavior. Three broad sets of variables may influence the bicyclist behavior: personal characteristics, traffic conditions, and trip characteristics. In this paper, the traffic conditions are considered. The following factors, as shown in Table 2, are adopted to construct the model.
 
Note to Table 2: “veh” is the abbreviation of vehicle.
3. Survey and Data
The field survey was conducted on urban roads with no isolation facilities between the bicycle lane and the motor lane. The selected survey sites were monitored by video cameras, so that bicycle volumes in lanes with different effective widths could be acquired. According to [10], the effective width of a bicycle lane is defined as the physical width minus a safety margin (0.5 m). This safety margin reflects the influence of curb parking: bicycles keep a safe distance from parked cars. The data related to the external factors were also derived from the video survey. Data acquisition was performed by manual counting and recording with the assistance of a video processing tool.
The length of the observed section is 25 m, and the section is not influenced by bus stops or pedestrian crosswalks. Because bicycles arrive discretely and the volume is nonuniform, a short observation interval may not include enough samples, while a long interval may blur the definition of the data status. Therefore, the observation interval is set to 30 s. The status of each interval is defined as (a) censored data if no bicycle enters the motor lane in the interval and (b) distinct (uncensored) data if the lane-crossing behavior occurs in the interval [10]. The field surveys were conducted on four typical urban roads in Beijing: East Jiaoda Road, North Yufang Road, West Tucheng Road, and Da Liushu Road. The basic features of the observed sections are shown in Table 3.
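The status definition above maps directly onto the (duration, event) records used in survival analysis. The following is a minimal sketch of that coding step; the `Interval` class, the function name, and the counts are hypothetical and only illustrate the rule that an interval is censored when no bicycle enters the motor lane.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    volume: int     # bicycles counted in the 30 s interval
    crossings: int  # bicycles that entered the motor lane in the interval


def to_survival_record(interval):
    """Return (volume, event): event = 1 if uncensored, 0 if censored."""
    event = 1 if interval.crossings > 0 else 0
    return interval.volume, event


# Hypothetical counts, for illustration only.
observed = [Interval(10, 0), Interval(22, 3), Interval(27, 1)]
print([to_survival_record(item) for item in observed])
# prints [(10, 0), (22, 1), (27, 1)]
```

In this coding, a censored interval only says that the valid volume is at least the observed volume, which is exactly the information a censored lifetime carries.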

4. Model Estimation and Discussion
4.1. Model Estimation
Table 4 shows the model estimation of the lane-keeping behavior of bicyclists. The LR statistic of the estimated model clearly indicates the overall goodness-of-fit (the LR statistic is 98.4, which is greater than the critical value of the chi-squared distribution with 6 degrees of freedom at any reasonable level of significance). The statistical significance of each variable is examined by a test statistic that is asymptotically normally distributed with mean zero and variance one. The significance level corresponding to each covariate is given by the $p$ value in Table 4. From the results, most of the included variables are statistically significant at the 0.05 level, meaning that these covariates are significantly related to the violation (lane-crossing) behavior. Two covariates (car volume and safe gap) have relatively low significance. The effects of the variables on lane-crossing behavior are discussed in the next section.
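The comparison of the LR statistic with the chi-squared distribution can be checked with a short calculation. The sketch below uses the closed-form upper-tail probability that holds for an even number of degrees of freedom, so it needs only the standard library; the function name is a hypothetical helper, while the value 98.4 and the 6 degrees of freedom are taken from the text.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with even df.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{i < m} (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))


lr_statistic = 98.4  # value reported in the text
p_value = chi2_sf_even_df(lr_statistic, 6)
print(p_value < 0.001)  # prints True
```

For reference, the same function returns about 0.05 at 12.592, the usual 5% critical value for 6 degrees of freedom, so a statistic of 98.4 lies far in the tail.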

Figure 1 shows the lane-keeping probability given by the proposed model with all variables set to their average values. The curve in Figure 1 therefore reflects the average lane-keeping probability of a typical bicycle flow that has an average value for every external factor. The curve is monotonically decreasing. The median of the distribution is 25 veh/30 s, indicating that more than half of the observed intervals would result in lane-crossing behavior if the bicycle volume is greater than 25 veh/30 s under average conditions (2.5 m lane, bicycle speed of 12 km/h, car volume of 450 pcu/h/lane, and about 30% of the bicyclists affected by curb parking). If a variable changes (representing a change in an external factor), the probability of lane-keeping changes correspondingly.
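Reading the median off a lane-keeping curve amounts to finding the first volume at which the probability drops to 0.5 or below. The sketch below shows this on a tabulated curve; the function name and the curve values are hypothetical and are not the curve of Figure 1.

```python
def median_valid_volume(volumes, probabilities):
    """Smallest volume at which the lane-keeping probability is <= 0.5.

    volumes must be increasing and probabilities non-increasing;
    returns None if the curve never drops to 0.5.
    """
    for q, s in zip(volumes, probabilities):
        if s <= 0.5:
            return q
    return None


# Hypothetical curve, for illustration only (not the curve of Figure 1).
volumes = [10, 15, 20, 25, 30]
probabilities = [0.95, 0.85, 0.70, 0.50, 0.30]
print(median_valid_volume(volumes, probabilities))  # prints 25
```

The same lookup with a threshold other than 0.5 gives the valid volume at any chosen valid probability.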
4.2. Effects of External Factors
In the proportional hazard model, the effects of variables are multiplicative on the baseline hazard function. A negative coefficient on a variable implies that an increase in that variable decreases the hazard rate or, equivalently, increases the valid volume. A greater valid volume means that lane-crossing behavior occurs less often. The effects of the external factors are analyzed in the following.
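The multiplicative effect can be made explicit with a short worked equation. The symbols are assumptions of this sketch: $h_0(q)$ is the baseline hazard at volume $q$, $x_j$ is the $j$th covariate, $\beta_j$ is its coefficient, and all other covariates are held fixed.

```latex
\frac{h(q \mid x_j + 1)}{h(q \mid x_j)}
  = \frac{h_0(q)\,\exp\bigl(\beta_j (x_j + 1) + \sum_{k \ne j} \beta_k x_k\bigr)}
         {h_0(q)\,\exp\bigl(\beta_j x_j + \sum_{k \ne j} \beta_k x_k\bigr)}
  = \exp(\beta_j)
```

A unit increase in $x_j$ thus scales the hazard by $\exp(\beta_j)$ at every volume: the factor is below one for a negative coefficient and above one for a positive coefficient.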
The effective width of the bicycle lane shows a significant negative effect on the hazard: the wider the bicycle lane, the less often lane-crossing behavior occurs. Figure 2 shows the distributions of the lane-keeping behavior for different lane widths. According to the empirical data and the estimated model, if the effective width of a bicycle lane decreases from 3.5 m to 2.0 m, the valid volume declines from 28 veh/30 s to 20 veh/30 s (at a valid probability of 0.8). Assuming that the average volume is 15 veh/30 s, the probability that all bicycles travel in the bicycle lane would be 0.6 under the narrow-lane condition. Meanwhile, the valid volume would decrease by 40%.
Curb parking shows a positive effect on the hazard; that is, curb parking increases the hazard or, equivalently, decreases the probability of lane-keeping behavior. The effect of curb parking on the probability of lane-keeping behavior is shown in Figure 3. The road sections used for the comparative analysis differ in the distribution of valid volume: the probabilities of lane-keeping behavior in sections with curb parking are lower than those in sections without curb parking. It should be noted that, when the bicycle volume is low, the probability of lane-crossing behavior is low and the influence of curb parking is insignificant. When the bicycle volume is high, the relation between the occurrence of lane-crossing behavior and curb parking is also insignificant. According to the results, the influence of curb parking on the valid probability is significant when the bicycle volume lies in the middle range. Taking a width of 3.0 m as an example, the influence of curb parking on the lane-keeping behavior is significant when the volume ranges between 22 and 32 veh/30 s.
The travel speed of bicycles has a positive effect on the hazard function: the faster the bicycles travel, the more likely they are to cross the bicycle lane. In this paper, electric bicycles are counted as bicycles. According to the field survey, a certain number of electric bicycles cross the bicycle lane and travel in the motor lane. Electric bicycles travel faster than other bicycles and therefore seek more favorable travel space. In particular, when slow bicycles or curb-parked cars are in front of faster electric bicycles, the riders change direction and overtake the blockage via the motor lane without the least hesitation.
Retrograde motion (riding against the direction of traffic) has a positive effect on the hazard function, like the travel speed of bicycles. As shown in Figure 4, the presence of retrograde bicycles decreases the probability of lane-keeping behavior. This effect is more pronounced at higher bicycle volumes. Retrograde bicycles obstruct the travel routes of other bicycles, which easily motivates them to change their route or even change lane.
According to the estimation results, the effects of car volume and safe gap are not statistically significant. However, car volume can still influence bicyclist behavior. If the car volume is high, bicyclists may have little chance to travel in the motor lane. Additionally, the safe gap variable reflects the opportunity and the safety for a bicyclist to travel in the motor lane. According to the field survey and the estimated results, a certain number of bicycles travel in the motor lane even when the car volume is very high. These bicyclists neglect the accident risk caused by the lane-crossing behavior. It is dangerous for bicyclists to travel in the motor lane in heavy car flow. Moreover, the lane-crossing behavior can block moving cars, so that traffic performance deteriorates noticeably.
5. Conclusion
This paper proposed a model to describe the lane-keeping behavior of bicyclists in urban streets by using survival analysis. A concept of valid volume is also proposed to describe the relation between the lane-crossing behavior and the bicycle volume. The volume data are classified as censored data and uncensored data. The proportional hazard method is used to estimate the model from field data that include censored observations. In order to capture the effects of external factors involving traffic conditions, six variables are selected to construct the PH model. The results show that the effective width of the bicycle lane, travel speed, curb parking, and retrograde motion have significant effects on the lane-keeping behavior. Two variables (car volume and safe gap) show relatively low significance. It is concluded that the lane-keeping behavior results from various related factors such as personal features, traffic conditions, and environmental factors, and any change in the influential factors can modify the lane-keeping behavior. Therefore, the planning and design of urban streets should consider these influential factors comprehensively.
Future work will focus on the influential factors. More factors will be introduced into the model, and more survey sites will be added to obtain more empirical data. For example, the average speed of bicycles traveling in the car lane could be an important influential factor for the lane-keeping behavior of cyclists. The significance of the variables and their effects on bicyclist behavior will also be discussed in greater depth.
Acknowledgments
The authors would like to thank the anonymous editor and referees for their valuable comments. This research is supported by the Programme of Introducing Talents of Discipline to Universities under Grant no. B12022 and the Programme of International Science and Technology Cooperation in the Beijing Institute of Technology.
References
[1] W. H. Wang, W. Zhang, H. W. Guo, H. Bubb, and K. Ikeuchi, “A safety-based behavioural approaching model with various driving characteristics,” Transportation Research C, vol. 19, no. 6, pp. 1202–1214, 2011.
[2] W. Wang, Y. Mao, J. Jin et al., “Driver's various information process and multi-ruled decision-making mechanism: a fundamental of intelligent driving shaping model,” International Journal of Computational Intelligence Systems, vol. 4, no. 3, pp. 297–305, 2011.
[3] R. Jiang, B. Jia, and Q. S. Wu, “Stochastic multi-value cellular automata models for bicycle flow,” Journal of Physics A, vol. 37, no. 6, pp. 2063–2072, 2004.
[4] A. Faghri and E. Egyháziová, “Development of a computer simulation model of mixed motor vehicle and bicycle traffic on an urban road network,” Transportation Research Record, no. 1674, pp. 86–93, 1999.
[5] X. M. Zhao, B. Jia, Z. Y. Gao, and R. Jiang, “Traffic interactions between motorized vehicles and nonmotorized vehicles near a bus stop,” Journal of Transportation Engineering, vol. 135, no. 11, pp. 894–906, 2009.
[6] X. M. Zhao, B. Jia, Z. Y. Gao, and R. Jiang, “Congestions and spatiotemporal patterns in a cellular automaton model for mixed traffic flow,” in Proceedings of the 4th International Conference on Natural Computation (ICNC '08), pp. 425–429, October 2008.
[7] C. Mallikarjuna and K. R. Rao, “Cellular Automata model for heterogeneous traffic,” Journal of Advanced Transportation, vol. 43, no. 3, pp. 321–345, 2009.
[8] T. G. Oketch, “New modeling approach for mixed-traffic streams with nonmotorized vehicles,” Transportation Research Record, no. 1705, pp. 61–69, 2000.
[9] X. Yang, Z. Gao, X. Zhao, and B. Si, “Road capacity at bus stops with mixed traffic flow in China,” Transportation Research Record, no. 2111, pp. 18–23, 2009.
[10] H. W. Guo, Z. Y. Gao, X. M. Zhao, and X. B. Yang, “Traffic behavior analysis of nonmotorized vehicle under influence of curb parking,” Journal of Transportation Systems Engineering and Information Technology, vol. 11, no. 1, pp. 79–84, 2011.
[11] J. F. Lawless, Statistical Models and Methods for Lifetime Data, John Wiley & Sons, New Jersey, NJ, USA, 2002.
[12] C. R. Bhat, “Duration modeling,” in Handbook of Transport Modelling, K. J. Button and D. A. Hensher, Eds., pp. 91–111, Elsevier Science, Amsterdam, The Netherlands, 2000.
[13] D. A. Hensher and F. L. Mannering, “Hazard-based duration models and their application to transport analysis,” Transport Reviews, vol. 14, no. 1, pp. 63–82, 1994.
[14] D. Nam and F. Mannering, “An exploratory hazard-based analysis of highway incident duration,” Transportation Research Part A, vol. 34, no. 2, pp. 85–102, 2000.
[15] D. R. Cox, “Regression Models and Life-Tables,” Journal of the Royal Statistical Society B, vol. 34, no. 2, pp. 187–220, 1972.
Copyright
Copyright © 2013 Hongwei Guo et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.