Abstract

Bicycle traffic accounts for a large proportion of all travel modes in some developing countries, so bicycle detection is crucial for urban traffic control and management as well as facility design. This paper proposes a real-time, video-based algorithm for detecting multiple bicycles. First, an effective feature called the multiscale block local binary pattern (MBLBP) is extracted to represent the moving object; it discriminates well between bicycles and nonbicycles. Then, a cascaded bicycle classifier trained by the AdaBoost algorithm is proposed, which is computationally efficient. Finally, the method is tested on video sequences captured from real-world traffic scenarios. The bicycles in the test scenarios are successfully detected.

1. Introduction

In the past two decades, China’s motorization has developed rapidly along with economic growth, and the motor vehicle has become one of the most important means of travel. However, the national economy and per capita GDP are still lower than those of most developed countries. In some underdeveloped cities in China, many people still choose bicycles as their means of travel [1]. Motor vehicles and bicycles share the same roadway, and cycling accounts for a large proportion of all travel modes in these cities. In developed countries, bicycle travel is recognized as “green traffic”: it has low energy consumption, is healthy for its users, and does not damage the health of others. It is relatively fast over short distances and provides a reliable and affordable form of transport for most sectors of the population [2]. Therefore, cycling remains one of the most sustainable travel modes [3] around the world.

Information on bicycle traffic is crucial for the control and management of mixed traffic as well as for facility design, and scholars have done much research on it. Unfortunately, deficiencies and limitations in the existing data sources often hamper these efforts. Several current data collection techniques for bicycle studies still depend on observer-based manual operations, which remain time-consuming and resource intensive. With the emergence of a wide variety of automated detection technologies, a few applications have been developed for bicycle detection in recent years, including inductive loop, microwave, infrared, and vision-based systems. In previous works, most scholars [4–6] utilized inductive loops or their improved forms to design detection systems for acquiring bicycle parameters. However, because of the limited detection range, it is difficult to handle multiple bicycles passing together or vehicles and bicycles passing together. In addition, although a microwave detector can detect the area occupied by an object, the affordable automotive detector can only detect reflectors. As a result, it cannot reliably recognize an object’s category, and the same is true of the infrared detector.

Compared with other methods, the vision-based method has the advantages of a large detection range and high scalability and is therefore one of the most reliable techniques for bicycle detection. However, in the existing literature on vision-based detection methods in intelligent transportation systems (ITS), vehicles and pedestrians remain the primary objects of interest [7, 8], and work on bicycles has so far been limited. Messelodi et al. [9] presented a feature-based bicycle recognition algorithm. The algorithm extracts visual projective features focusing on the wheel regions of the targets, and a support vector machine (SVM) is then applied to distinguish bicycles from motorcycles in real-world traffic scenes. Rogers and Papanikolopoulos [10] detected objects moving through the scene by means of a background differencing technique. To recognize bicycles, they localized the wheels by searching for ellipses in the edge map using the generalized Hough transform. Similarly, in Dukesherer and Smith’s method [11], the Hough transform is utilized to locate the wheel regions of bicycles, and the Hausdorff distance is then used to match the candidates with simple bicycle templates. A bicycle is recognized as two arcs of a circle separated by an approximately known distance. David et al. [12] developed and tested a bicycle detection and classification algorithm based on active-infrared overhead vehicle imaging sensor technology. In their method, several message concepts are defined, derived from four stages of the movement of a target underneath the sensor. A bicycle can be accurately detected and classified using the sequence of messages.

As the literature review shows, some progress in vision-based bicycle detection has already been made. Nevertheless, most methods consider an individual bicycle or a small number of bicycles as the object to be detected and do not adapt to large numbers of bicycles. Since cycling accounts for a large proportion of all travel modes in China, there is always a great volume of bicycles in rush hour. In addition, owing to the low speed, small space occupancy, and flexibility of bicycle travel, cyclists often move together in groups. Therefore, it is necessary to design a multiple bicycle detection method that can provide real-time bicycle traffic information (volume, velocity, etc.) for traffic control and management.

This paper proposes a real-time, video-based algorithm for detecting multiple bicycles. The remainder of the paper is arranged as follows. Section 2 introduces an effective feature, the multiscale block local binary pattern, for bicycle feature representation. Section 3 proposes a cascaded classifier trained by the AdaBoost algorithm for the bicycle recognition task. Section 4 tests the validity of the proposed approach on video sequences captured from realistic traffic scenarios, and Section 5 draws conclusions.

2. MBLBP Feature Representation

The LBP operator was first introduced as a complementary measure for local image contrast [13]. It is a gray-scale invariant texture primitive statistic, which has shown excellent performance in the classification of various kinds of textures [14]. A texture $T$ in a local neighborhood of a monochrome texture image is defined as the joint distribution of the gray levels of image pixels:

$$T = t(g_c, g_0, \ldots, g_{P-1}),$$

where $g_c$ corresponds to the gray value of the center pixel of the local neighborhood and $g_p$ ($p = 0, \ldots, P-1$) correspond to the gray values of $P$ equally spaced pixels on a circle of radius $R$ ($R > 0$) that form a circularly symmetric neighbor set. If the coordinates of $g_c$ are $(x_c, y_c)$, then the coordinates of $g_p$ are given by $(x_c + R\cos(2\pi p/P),\; y_c - R\sin(2\pi p/P))$. Much of the information in the joint gray level distribution about the textural characteristics can be conveyed by the joint difference distribution [15]:

$$T \approx t(g_0 - g_c, \ldots, g_{P-1} - g_c).$$

For each $g_p$, a binary code can be produced by thresholding its neighborhood with the value of $g_c$:

$$T \approx t(s(g_0 - g_c), \ldots, s(g_{P-1} - g_c)),$$

where

$$s(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0. \end{cases}$$

By assigning a binomial factor $2^p$ for each $s(g_p - g_c)$, a unique $\mathrm{LBP}_{P,R}$ can be constructed that characterizes the spatial structure of the local texture:

$$\mathrm{LBP}_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p.$$

When $P = 8$ and $R = 1$, $\mathrm{LBP}_{8,1}$ is obtained, which is the basic LBP descriptor. An illustration of the basic LBP operator is shown in Figure 1. In this way, a 256-bin histogram can be created to collect the occurrences of different binary patterns over an image.
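The basic operator can be illustrated with a minimal NumPy sketch. It assumes the common 3 × 3 square approximation of the $P = 8$, $R = 1$ neighborhood (no interpolation on the circle), and the bit order of the eight neighbors is an arbitrary choice made here; the function names `lbp_8_1` and `lbp_histogram` are hypothetical and do not come from the paper.

```python
import numpy as np

def lbp_8_1(image):
    """Basic 3x3 LBP code for every interior pixel of a 2-D gray image."""
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    # Neighbor offsets (dy, dx) for p = 0..7, clockwise from the top-left.
    # This ordering is an assumption; any fixed ordering gives a valid code.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for p, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # s(g_p - g_c) is 1 when the neighbor is >= the center; weight it by 2^p.
        codes += (neighbor >= center).astype(np.int32) << p
    return codes

def lbp_histogram(codes):
    """256-bin histogram of LBP codes over an image."""
    return np.bincount(codes.ravel(), minlength=256)

patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
print(lbp_8_1(patch)[0, 0])  # 241 with the bit order chosen above
```

In the example patch the neighbors at bit positions 0, 4, 5, 6, and 7 are at least as large as the center value 6, so the code is 1 + 16 + 32 + 64 + 128 = 241.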

The basic LBP is defined for each pixel by thresholding the pixel values of its 3 × 3 neighborhood with the center pixel value. MBLBP extends the basic LBP to neighborhoods of different sizes. In MBLBP, the comparison between single pixels in LBP is replaced with a comparison between the average intensities of subregions. Each subregion is a block containing neighboring pixels. An MBLBP descriptor is composed of 9 blocks, as shown in Figure 2. In this way, the output value of the MBLBP can be obtained:

$$\mathrm{MBLBP} = \sum_{p=0}^{7} s(g_p - g_c)\, 2^p,$$

where $g_c$ is the average gray value of the center block (of size $w \times h$, where $w$ is the width of the block and $h$ is its height) and $g_p$ ($p = 0, \ldots, 7$) are those of its neighboring blocks. In particular, when $w = 1$ and $h = 1$, MBLBP is in fact the basic LBP. Compared with the basic LBP, MBLBP can capture large-scale structures that may be the dominant features of images. In addition, MBLBP can be calculated quickly using the integral image method [15], which incurs only slightly more cost than the basic 3 × 3 LBP operator.
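The block comparison and the integral image speed-up can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names are hypothetical, the bit order of the eight neighboring blocks is an assumption, and block sums are compared instead of block averages, which is equivalent because all nine blocks have the same size.

```python
import numpy as np

def integral_image(image):
    """Summed-area table with a zero first row and column."""
    img = np.asarray(image, dtype=np.int64)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def block_sum(ii, y, x, h, w):
    """Sum of the h x w block with top-left pixel (y, x), in four lookups."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def mblbp_code(ii, y, x, h, w):
    """MBLBP code of the 3 x 3 grid of h x w blocks whose top-left pixel is (y, x)."""
    center = block_sum(ii, y + h, x + w, h, w)
    # (row, col) of the eight neighboring blocks, clockwise from the top-left.
    # The ordering is an assumption; it only fixes which bit each block sets.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for p, (r, c) in enumerate(order):
        if block_sum(ii, y + r * h, x + c * w, h, w) >= center:
            code |= 1 << p
    return code

patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
# With 1 x 1 blocks the descriptor reduces to the basic LBP of the center pixel.
print(mblbp_code(integral_image(patch), 0, 0, 1, 1))
```

Because each block sum costs four table lookups regardless of the block size, the cost of one code does not grow with $w$ and $h$ once the integral image has been built.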

Figure 3 gives some examples of MBLBP with different block sizes for bicycle and nonbicycle images. The figure shows that, at a small scale, the local micropatterns of the bicycle structure are well represented, which may be beneficial for discriminating local details. At larger scales, using average values over the blocks reduces noise and makes the representation more robust.

3. AdaBoost Learning

A cascaded classifier is constructed to obtain possible bicycle candidates. It can effectively remove most nonbicycle subimages and thereby accelerate the detection algorithm. The MBLBP features are adopted as the basic elements for constructing the cascaded classifier, and each layer is trained by the AdaBoost algorithm [16].
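The speed-up from a cascade comes from early rejection, which a short sketch makes concrete. The stage classifiers here are stand-in callables rather than trained AdaBoost stages, and the function name `cascade_predict` is hypothetical.

```python
from typing import Callable, Sequence, Tuple

# A stage is any callable that returns True to pass a window on to the next stage.
Stage = Callable[[object], bool]

def cascade_predict(stages: Sequence[Stage], window) -> Tuple[bool, int]:
    """Return (accepted, number_of_stages_evaluated) for one subimage."""
    evaluated = 0
    for stage in stages:
        evaluated += 1
        if not stage(window):
            # Rejected: later (usually more expensive) stages are never run.
            return False, evaluated
    # Only a window that passes every stage is reported as a detection.
    return True, evaluated

# Toy stages on numbers, standing in for trained stage classifiers.
stages = [lambda v: v > 0, lambda v: v > 5, lambda v: v % 2 == 0]
print(cascade_predict(stages, -1))  # (False, 1): rejected by the first stage
print(cascade_predict(stages, 8))   # (True, 3): passed all stages
```

Since most subimages in a traffic frame contain no bicycle, most of them would be expected to stop at an early stage, so the average cost per window stays far below the cost of evaluating every stage.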

The basic idea of the AdaBoost algorithm is to combine a large number of weak classifiers, each with only limited classification ability, into a strong classifier. The cascade structure containing $N$ stages is illustrated in Figure 4, where $C_i$ denotes the AdaBoost classifier in the $i$th stage. As can be seen from the structure, the cascade classifier is a degenerate decision tree. At each stage, a classifier is trained to detect almost all bicycle candidates while rejecting a certain fraction of nonbicycle objects. Therefore, negative subimages that do not contain bicycles can be discarded in the early stages of the cascade [17, 18]. Only a subimage that passes all stages is identified as a bicycle. The detailed procedure for training an AdaBoost classifier is as follows.

(a) Give $n$ training samples $(x_1, y_1), \ldots, (x_n, y_n)$, where $y_i \in \{0, 1\}$ corresponds to the types (nonbicycle and bicycle, resp.). The training set consists of $m$ nonbicycle samples and $l$ bicycle samples.

(b) Initialize the $i$th sample’s weight $w_{1,i}$. If the sample is a nonbicycle object, the weight is $w_{1,i} = 1/(2m)$; if the sample is a bicycle, the weight is $w_{1,i} = 1/(2l)$.

(c) For each training stage $t = 1, \ldots, T$ (where $T$ is the number of training stages):

① Normalize the weights:

$$w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{k=1}^{n} w_{t,k}}.$$

② For each feature $j$, the corresponding weak classifier $h_j$ is trained as

$$h_j(x) = \begin{cases} 1, & p_j f_j(x) < p_j \theta_j, \\ 0, & \text{otherwise}, \end{cases}$$

where $p_j$ denotes the direction of the inequality sign and takes only the value 1 or −1, $f_j(x)$ denotes the feature value, and $\theta_j$ is the threshold.

③ Choose the simple classifier $h_t$ with the lowest error $\varepsilon_t$:

$$\varepsilon_t = \min_{j} \sum_{i=1}^{n} w_{t,i} \left| h_j(x_i) - y_i \right|.$$

④ Update the weights according to the best simple classifier $h_t$:

$$w_{t+1,i} = w_{t,i}\, \beta_t^{\,1 - e_i}, \qquad \beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t},$$

where $e_i = 0$ if sample $x_i$ is classified correctly; otherwise, $e_i = 1$.

(d) At last, a strong classifier is formed as

$$H(x) = \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t, \\ 0, & \text{otherwise}, \end{cases} \qquad \alpha_t = \log \frac{1}{\beta_t}.$$

As the process above shows, the boosting idea constructs a strong classifier from a set of weak classifiers by reweighting the training samples. At each round of boosting, the feature-based classifier that best classifies the weighted training samples is selected. Because the classifier must achieve the desired false alarm rate at a given hit rate, the number of weak classifiers is increased until this target is met. Lastly, all the weak classifiers are combined with different weights to form a strong classifier [16].
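The training procedure can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes labels 0 (nonbicycle) and 1 (bicycle), initial weights $1/(2m)$ and $1/(2l)$, and the update $\beta_t = \varepsilon_t/(1 - \varepsilon_t)$ of the common Viola-Jones-style formulation. Thresholds are searched exhaustively over the observed feature values, which is simple but slow. The names `train_adaboost` and `predict` are hypothetical.

```python
import numpy as np

def train_adaboost(F, y, T):
    """F: (n_samples, n_features) feature values; y: labels in {0, 1}; T: rounds."""
    F = np.asarray(F, dtype=float)
    y = np.asarray(y, dtype=int)
    m, l = np.sum(y == 0), np.sum(y == 1)
    # Step (b): each class starts with half of the total weight.
    w = np.where(y == 0, 1.0 / (2 * m), 1.0 / (2 * l))
    model = []
    for _ in range(T):
        w = w / w.sum()                                   # step 1: normalize
        best = None
        for j in range(F.shape[1]):                       # step 2: one stump per feature
            for theta in np.unique(F[:, j]):
                for p in (1, -1):
                    h = (p * F[:, j] < p * theta).astype(int)
                    err = np.sum(w * np.abs(h - y))
                    if best is None or err < best[0]:
                        best = (err, j, theta, p, h)      # step 3: keep the lowest error
        err, j, theta, p, h = best
        err = min(max(err, 1e-10), 1 - 1e-10)             # guard against log(0), division by 0
        beta = err / (1.0 - err)
        e = (h != y).astype(int)                          # 0 if correct, 1 if wrong
        w = w * beta ** (1 - e)                           # step 4: shrink correct samples
        model.append((np.log(1.0 / beta), j, theta, p))
    return model

def predict(model, F):
    """Step (d): weighted vote of the selected weak classifiers."""
    F = np.asarray(F, dtype=float)
    score = np.zeros(F.shape[0])
    total = 0.0
    for alpha, j, theta, p in model:
        score += alpha * (p * F[:, j] < p * theta)
        total += alpha
    return (score >= 0.5 * total).astype(int)

# Toy data: feature 0 separates the classes, feature 1 is noise.
F = [[1, 5], [2, 1], [3, 4], [4, 2]]
y = [0, 0, 1, 1]
model = train_adaboost(F, y, T=2)
print(predict(model, F))
```

In a cascade, the voting threshold of each stage would typically be lowered from the default $\frac{1}{2}\sum_t \alpha_t$ so that the stage keeps almost all bicycles at the price of more false alarms, which later stages then remove; that adjustment is not shown here.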

4. Experimental Results

For bicycle recognition in the urban road environment, a number of bicycle images were selected manually to construct a bicycle sample dataset. In the sample dataset, the positive samples are typical bicycles that differ in size, pose, and the cyclist’s clothing. Some preliminary experiments showed that the selection of negative samples is particularly important for reducing false alarms. Thus, tree trunks, trash cans, telegraph poles, and bushes, which are likely to be mistaken for bicycles, as well as some ordinary objects such as roads, vehicles, and other infrastructure, were selected as negative samples. A total of 4650 hand-labeled samples were used to train the cascaded classifiers: 1650 positive samples and 3000 negative samples. In the bicycle dataset, each sample image is normalized to 16 × 64 pixels for training. Figure 5 shows some samples from the bicycle dataset.

Using the cascaded classifier with 50 stages, 600 test samples (the first 300 samples are positives; the rest are negatives) are classified. Figure 6 shows that the feature outputs of bicycles and nonbicycles are clearly different, which indicates that MBLBP is an effective feature for bicycle detection.

A 9 × 9 mode MBLBP is used for feature representation of the training data, and multiple bicycles in the test set are then recognized by the trained AdaBoost classifier. The results (2 examples) in Figure 7 show a relatively high false positive rate. To achieve better detection performance, a two-layer detection strategy is adopted in this paper. In the first layer, 9 × 9 mode MBLBP features of the samples are calculated. The samples that are not recognized well (false positives and false negatives) are selected to compose a new training set. In the second layer, MBLBP features with both 3 × 3 and 9 × 9 modes are calculated for feature representation, and an additional AdaBoost classifier is constructed for the second round of recognition. In this way, the false positive rate is reduced (Figure 8). In addition, two-layer detection is faster than one-layer detection with 3 × 3 mode MBLBP. In our test, the average detection time is around 0.1 s with 10 detection scales, which is sufficient for real-time applications in intelligent urban traffic management and control.
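Detection at several scales is commonly implemented by sliding a window of growing size over the frame, and the sketch below enumerates such windows. Everything except the 16 × 64 base size and the 10 scales is an assumption made for illustration: the paper does not state the scale factor or the stride, so `scale_step=1.2` and `stride_frac=0.25` are hypothetical values, and 16 is assumed to be the window width and 64 its height.

```python
def sliding_windows(frame_w, frame_h, base_w=16, base_h=64,
                    n_scales=10, scale_step=1.2, stride_frac=0.25):
    """Yield (x, y, w, h) candidate windows over a frame at up to n_scales scales."""
    for s in range(n_scales):
        scale = scale_step ** s
        w = int(round(base_w * scale))
        h = int(round(base_h * scale))
        if w > frame_w or h > frame_h:
            break  # larger scales no longer fit inside the frame
        step = max(1, int(w * stride_frac))  # stride grows with the window
        for y in range(0, frame_h - h + 1, step):
            for x in range(0, frame_w - w + 1, step):
                yield x, y, w, h

windows = list(sliding_windows(320, 240))
print(len(windows), windows[0])
```

Each candidate window would be resized to the training size and passed to the classifier. At roughly 0.1 s per frame, the reported detection time corresponds to about 10 frames/s.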

5. Conclusions

In this paper, a multiple bicycle detection algorithm for intelligent urban traffic management and control has been developed. The conclusions are as follows. An extended LBP descriptor called MBLBP is proposed for feature representation, which discriminates well between bicycles and nonbicycles. A cascaded bicycle classifier is then constructed based on the AdaBoost algorithm and tested on real-world traffic scenarios. The method shows reliable and fast performance in vision-based bicycle recognition. The processing speed can reach 10 frames/s, which satisfies the real-time requirement. In future work, we will further study the key technologies for analyzing the motion characteristics and behaviors of cyclists.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is partly supported by National Science Foundation of China (nos. 51108208 and 51278220), Postdoctoral Science Foundation funded project of China (no. 2013T60330), Science and Technology Development Project of Jilin Province (no. 20130522121JH), and Fundamental Research Funds for the Central Universities of China (no. 201103146).