Abstract

The detection and recognition of arrow markings is a basic task in autonomous driving. To achieve all-day detection and recognition of arrow markings in complex environments, we propose a hybrid model that exploits the advantages of the biologically visual perceptual model and the discriminative model. First, the arrow markings are extracted from the complex background in the region of interest (ROI) by the biologically visual perceptual model using the frequency-tuned (FT) algorithm. Candidate road markings are then detected as maximally stable extremal regions (MSER). In the recognition stage, the biologically visual perceptual model calculates the sparse solution of arrow markings using sparse learning theory. Finally, the discriminative model classifies the arrow markings with an Adaptive Boosting (AdaBoost) classifier trained on the sparse solutions. Experimental results show that the hybrid model detects and recognizes arrow markings in complex road conditions with a precision, recall, and F-measure of 0.966, 0.88, and 0.92, respectively. The hybrid model is robust and compares favorably with other state-of-the-art methods. It has both theoretical significance and practical value for all-day detection and recognition in complex environments.

1. Introduction

Arrow markings painted on road surfaces in China take several forms: Forward, Left, Right, Forward-Left, Forward-Right, and so forth. Because road markings provide traffic guidance and navigation information, their detection and recognition has been an active research topic in recent years for both autonomous driving and road intelligence [1].

The detection and recognition of road markings is challenging not only because of the intrinsic complexity of the driving environment but also because many environmental parameters cannot be controlled [2], for example, day/night, sun/streetlight illumination, temperature, poor visibility, rain/snow, and other meteorological conditions, all of which must be handled by the sensing devices [2, 3]. Much progress has been made in recent years toward all-day detection and recognition of arrow markings in complex environments [3–16]. The detection and recognition algorithms are generally classified into discriminative models [3–6] and biologically visual perceptual models [7–16].

Discriminative models detect and recognize arrow markings with a classifier (SVM (Support Vector Machine) [3, 4], Random Forest [5], ANN (Artificial Neural Network) [6], etc.). In [3], arrow markings are recognized by an SVM using HOG (Histogram of Oriented Gradient) features under shadows and varying illumination. Reference [4] obtained higher recognition accuracy than [3] by fusing HOG and LBP (Local Binary Patterns) features. Although both achieve high recognition accuracy, they are time-consuming because of the high dimension of the feature vector, and whether they are suitable at night remains to be verified. Reference [6] greatly reduces the dimension by using a 37-element feature vector of arrow markings, such as the width, length, maximum length, and the number of right angles, to achieve rapid recognition; however, with so few features, the recognition accuracy is easily affected by worn road markings, poor visibility, and other factors.

Biologically visual perceptual models simulate the human eye to detect and recognize targets in complex environments. First, a visual saliency detection model based on the human visual attention mechanism [9] quickly extracts the target from a complex background; examples are the HC (Histogram Contrast), LC (Luminance Contrast), CA (Context-Aware), and FT models proposed in [10–12]. Next, the target is represented by its sparse solution using a sparse learning model based on the sparse sensing properties [13] of the human eye. Finally, a classifier based on the minimum residual method is used to recognize the target [14–16]. However, for lack of a strong classifier that can effectively use the sparse solution to classify the input samples, the accuracy of target recognition based on SR (sparse representation) needs to be improved.

Biologically visual perceptual models thus have a speed advantage but lower recognition accuracy than discriminative models, whereas discriminative models have an accuracy advantage but longer recognition time than biologically visual perceptual models. Both shortcomings are detrimental to autonomous driving [1, 2]. To address them, we propose a hybrid model.

In our work, at the detection stage, biologically visual perceptual models accurately extract the foreground from the complex background, which improves the anti-interference ability of discriminative models when detecting arrow markings in complex environments. At the recognition stage, biologically visual perceptual models provide a sparse feature vector for discriminative models. This avoids both the time-consuming classification caused by an excessively high feature dimension [4] and the sensitivity of recognition accuracy to the external environment caused by too few features [6]. At the same time, discriminative models provide a powerful classifier for biologically visual perceptual models.

Based on the above analysis, a hybrid model is constructed. First, image frames are captured from an on-board camera and cropped to a defined ROI; then FT + MSER is used to extract the TBI (targets to be identified) in the ROI; finally, each TBI is sparsely represented and classified using the trained AdaBoost strong classifier. Experimental results show that the proposed model can quickly and accurately detect and recognize arrow markings in complex road environments.

In Section 2, we will introduce the hybrid model including arrow markings detection and recognition. Section 3 presents the comparison experiment between the proposed hybrid model and other models. Finally, in Section 4, conclusions including the actual limits in application and future work are given.

2. The Proposed Hybrid Model

Arrow markings detection and recognition algorithms are generally divided into three steps: preprocessing, road marking description, and arrow marking recognition. The preprocessing step aims to improve detection accuracy in a complex external environment (a complex driving environment with vehicles, pedestrians, and other symbols, and weather conditions such as varying illumination, rain, and snow) by using Inverse Perspective Mapping (IPM) [3] or a predefined ROI [17] (Figure 2). The road marking description step extracts candidate targets from the background, generally using MSER [3, 17]. The arrow marking recognition step uses a classifier [3–6] such as SVM, Random Forest, or Neural Network to classify the arrows. The flowchart of the algorithm is shown in Figure 1.

2.1. Arrow Markings Detection

Because arrow markings are painted on the road surface, numerous unavoidable noise sources (sun/headlight/streetlight reflection, road surface debris, reflection caused by surface water, decay, etc.) interfere with the detection algorithm and reduce detection accuracy. Effectively extracting the arrow markings in such a complex road environment is therefore a challenge. The MSER algorithm, which can detect extremal regions such as lane lines, vehicles, and road markings, is currently the most commonly used method for detecting arrow markings. However, its detection results are often unsatisfactory because of the numerous noise sources on roads. Salient object detection models based on biologically visual perceptual theory can quickly find objects of interest in the field of vision even with numerous noise sources [10–12], which benefits the extraction of arrow markings in complex environments.

Drawing on the above methods, this paper combines a salient object detection model with the MSER method to improve the anti-interference ability of the detection algorithm. To verify the validity of the proposed method, the LC and FT models are each combined with MSER. In the following sections, we use the combination of the FT model (one of the salient object detection models) and MSER as an example; the experimental results comparing FT + MSER with MSER are shown in Figures 3(d), 3(h), and 3(l). It can be seen from Figure 3 that the combined FT + MSER method significantly improves the anti-interference ability of MSER under different environmental parameters. Experimental results of the other combination, using the LC model, are shown in Figures 3(c), 3(g), and 3(k).
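The saliency step that precedes MSER can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a grayscale image stored as nested lists rather than on a Lab color image, uses a 3 × 3 binomial blur, and binarizes with a threshold of twice the mean saliency. The function names `ft_saliency` and `binarize`, the kernel size, and the threshold factor are all assumptions made for illustration, and the MSER stage itself is omitted.

```python
def ft_saliency(gray):
    """Frequency-tuned style saliency: |global mean - blurred pixel value|.

    `gray` is a 2-D list of grayscale intensities (a simplification; the
    FT model is usually described on Lab color vectors).
    """
    h, w = len(gray), len(gray[0])
    mean = sum(sum(row) for row in gray) / (h * w)
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # 3x3 binomial blur, weights sum to 16
    sal = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)  # replicate border pixels
                    cc = min(max(c + dc, 0), w - 1)
                    acc += kernel[dr + 1][dc + 1] * gray[rr][cc]
            sal[r][c] = abs(mean - acc / 16.0)
    return sal


def binarize(sal, factor=2.0):
    """Keep pixels whose saliency is at least `factor` times the mean saliency.

    The factor of 2 is an assumed adaptive threshold, not a value from the paper.
    """
    mean_sal = sum(map(sum, sal)) / (len(sal) * len(sal[0]))
    thr = factor * mean_sal
    return [[1 if v >= thr else 0 for v in row] for row in sal]


# Toy ROI: dark road surface with a bright 2x2 "marking" in the middle.
image = [[10] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        image[r][c] = 200

mask = binarize(ft_saliency(image))
```

In a full pipeline, the resulting mask (or the saliency map) would restrict the area in which extremal regions are searched, so that reflections and debris outside the salient foreground are less likely to become candidates.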

Although the FT + MSER method eliminates some of the interfering targets, it cannot eliminate them completely, which causes trouble for subsequent recognition. MSERs have particular characteristics such as position, orientation, and size. Our analysis shows that the MSERs corresponding to arrow markings have consistent positions, orientations, and sizes in the images. Using these discriminative features, regions that do not show the properties of arrow markings can be eliminated. Figure 4 shows the detection results after elimination. As shown in Figure 4(b), when more than one MSER remains, the largest MSER is chosen as the candidate target.
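The elimination rule can be sketched as a simple geometric filter followed by a largest-region choice. This is a sketch under stated assumptions: the `Region` class, the function `select_candidate`, and the three thresholds (`min_area_frac`, `max_area_frac`, `center_band`) are hypothetical, since the paper does not give its values, and the orientation test is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """Axis-aligned bounding box of one MSER, in ROI pixel coordinates."""
    x: int
    y: int
    w: int
    h: int

    @property
    def area(self) -> int:
        return self.w * self.h


def select_candidate(regions: List[Region], roi_w: int, roi_h: int,
                     min_area_frac: float = 0.01,
                     max_area_frac: float = 0.5,
                     center_band: float = 0.6) -> Optional[Region]:
    """Drop regions with implausible size or position, then keep the largest.

    All thresholds are hypothetical placeholders, not values from the paper.
    """
    roi_area = roi_w * roi_h
    lo = roi_w * (1 - center_band) / 2  # left edge of the accepted horizontal band
    hi = roi_w - lo                     # right edge of the accepted horizontal band
    kept = []
    for r in regions:
        frac = r.area / roi_area        # region size relative to the ROI
        cx = r.x + r.w / 2              # horizontal center of the region
        if min_area_frac <= frac <= max_area_frac and lo <= cx <= hi:
            kept.append(r)
    # When more than one region survives, the largest one is the candidate.
    return max(kept, key=lambda r: r.area) if kept else None


regions = [
    Region(90, 40, 5, 5),     # too small: likely noise
    Region(0, 10, 30, 40),    # too far to the left of the lane
    Region(80, 20, 30, 60),   # plausible, largest
    Region(100, 30, 20, 40),  # plausible, smaller
]
best = select_candidate(regions, roi_w=200, roi_h=100)
```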

2.2. Arrow Markings Recognition

After arrow markings have been detected, they need to be recognized. The recognition process includes feature extraction and classification. Target recognition algorithms built on biologically visual perceptual theory are based on sparse representation. In the feature extraction stage, sparse representation describes a specific target using as little data as possible, which in theory accelerates classification. However, for lack of a strong classifier, the accuracy of target recognition based on sparse representation needs to be improved [15, 16]. By contrast, the biggest advantage of target recognition based on discriminative models is higher accuracy in the classification stage, but classification takes a long time because of feature redundancy [3, 4].

By synthetically analyzing the advantages of the two target recognition models, a method of arrow markings recognition is proposed, which combines the sparse representation and discriminative models. Through the effective combination of both, we achieved a faster and higher accuracy target recognition algorithm.

2.2.1. Arrow Markings Sparse Representation

The core idea of sparse representation theory is to approximate the TBI by a linear combination of atoms in an overcomplete dictionary while using as few atoms as possible, which is the sparsity requirement. The sparse representation of arrow markings is expressed mathematically as follows.

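Suppose that $y \in \mathbb{R}^{m}$ is a target to be recognized, where $m$ is the dimension of the target, and that an overcomplete dictionary of arrow markings $D = [d_1, d_2, \ldots, d_n] \in \mathbb{R}^{m \times n}$ is known, where $n$ is the number of samples in the dictionary and $d_i$ represents the $i$th sample in the dictionary $D$. Then the linear representation of $y$ is expressed as

$$y = Dx, \tag{1}$$

where $x \in \mathbb{R}^{n}$ is the coefficient vector of $y$. When $m < n$, (1) is an underdetermined ill-posed equation with infinitely many solutions. Therefore, how to solve for $x$ is a hard problem. However, if we assume that $x$ is sparse, that is, the number of nonzero elements in $x$ is as small as possible, we can solve for $x$ using the following formula:

$$\hat{x} = \arg\min_{x} \|x\|_{0} \quad \text{s.t.} \quad y = Dx, \tag{2}$$

where $\|\cdot\|_{0}$ is the $\ell_0$-norm. However, finding this sparse solution is an NP-hard (Nondeterministic Polynomial hard) problem. Later, Tao and Candès [20] proved that the $\ell_1$-norm optimization problem has the same solution as the $\ell_0$-norm optimization problem under the RIP (Restricted Isometry Property) condition. Then formula (2) can be expressed as

$$\hat{x} = \arg\min_{x} \|x\|_{1} \quad \text{s.t.} \quad y = Dx. \tag{3}$$

Because of the presence of noise, formula (3) is rewritten as

$$\hat{x} = \arg\min_{x} \|x\|_{1} \quad \text{s.t.} \quad \|y - Dx\|_{2} \le \varepsilon, \tag{4}$$

where $\varepsilon$ is the noise term and $\|\cdot\|_{2}$ is the $\ell_2$-norm.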
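To make the ℓ1 relaxation concrete, the following minimal sketch computes a sparse code for one target with the iterative shrinkage-thresholding algorithm (ISTA). Note the substitutions: ISTA is used here in place of the LARS-Lasso solver the paper relies on, and it solves the penalized form, minimizing 0.5‖y − Dx‖² + λ‖x‖₁, rather than the constrained form. The names `ista` and `soft_threshold`, the weight `lam`, the iteration count, and the toy dictionary are all assumptions for illustration.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(D, y, lam=0.05, n_iter=1000):
    """Approximate argmin_x 0.5 * ||y - D x||_2^2 + lam * ||x||_1.

    D is an m x n dictionary given as a list of rows; y has length m.
    """
    m, n = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of
    # D^T D, so 1/L is a safe (conservative) step size.
    L = sum(D[i][j] ** 2 for i in range(m) for j in range(n))
    x = [0.0] * n
    for _ in range(n_iter):
        residual = [y[i] - sum(D[i][j] * x[j] for j in range(n)) for i in range(m)]
        corr = [sum(D[i][j] * residual[i] for i in range(m)) for j in range(n)]  # D^T r
        x = [soft_threshold(x[j] + corr[j] / L, lam / L) for j in range(n)]
    return x


# Toy overcomplete dictionary: m = 2 dimensions, n = 3 unit-norm atoms (columns).
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
y = [2.0, 0.0]          # the target is exactly twice the first atom
x_hat = ista(D, y)
```

In this toy case only the first coefficient should remain nonzero, slightly shrunk below 2 by the ℓ1 penalty, which is the sparsity behavior the formulas above describe.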

If formula (4) is solved directly with the initial overcomplete dictionary consisting of arrow markings samples, the precise sparse solution may not be obtained because the initial dictionary contains noisy data. To obtain the precise sparse solution, we need to train the initial dictionary.

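The first step is fixing the initial dictionary $D$. Then the training set $Y = [y_1, y_2, \ldots, y_N]$ consisting of arrow markings samples is used to solve the sparse coding $X = [x_1, x_2, \ldots, x_N]$:

$$\hat{x}_i = \arg\min_{x_i} \|x_i\|_{1} \quad \text{s.t.} \quad \|y_i - D x_i\|_{2} \le \varepsilon, \quad i = 1, 2, \ldots, N.$$

The results can be obtained by the LARS-Lasso algorithm [21]. Next, the sparse coding $X$ is fixed to train the dictionary $D$. The K-SVD method [22] is commonly used to solve the following formula:

$$\hat{D} = \arg\min_{D} \|Y - DX\|_{F}^{2},$$

where $\hat{D}$ is a well-trained dictionary.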

2.2.2. Discriminative Models Arrow Markings Classification

Recognition algorithms based on discriminative models treat recognition as a binary classification problem, that is, using a classifier to decide whether the TBI is foreground or background. SVM and AdaBoost, two commonly used classifier algorithms, are chosen in our work to verify the validity of the proposed hybrid model. We take AdaBoost as an example to demonstrate the process of obtaining a strong classifier.

The AdaBoost classifier is a strong classifier consisting of multiple weak classifiers [23]. Compared with other classifiers, AdaBoost can achieve higher recognition accuracy and efficiency with only a small number of training samples and can be updated online to adapt to target changes.

The main task of arrow markings classification is utilizing training samples to train a classifier with strong classification ability. At present, it is a very time-consuming task to train a classifier with good performance due to the high dimension of the feature vector and the large number of samples. In [19] training an AdaBoost classifier using 100–200 training samples takes 1 hour and when the number of training samples is 900–1,000, training time is 6 hours. Training samples of the new method proposed in this paper are sparse solutions of all the samples and the sparse solution contains only a few nonzero elements, which greatly reduces the classifier training time. The process of utilizing sparse solutions to obtain an AdaBoost classifier is shown as follows.

Step 1. After sparse representation, sparse solutions of all training samples are obtained. Then we combine the sparse solutions of all samples into the training set $T = \{(x_1, c_1), (x_2, c_2), \ldots, (x_N, c_N)\}$, where $N$ is the number of samples in the training set, $x_i$ represents the sparse solution of the $i$th training sample in the training set, and $c_i \in \{1, -1\}$ is the category label, where “1” denotes a positive sample and “-1” a negative sample. Assume that after $M$ iterations a highly accurate AdaBoost classifier can be obtained.

Step 2. Initialize the weights of all training samples. At first, the weights are uniformly distributed; that is, $W_1 = (w_{1,1}, w_{1,2}, \ldots, w_{1,N})$ and $w_{1,i} = 1/N$, $i = 1, 2, \ldots, N$.

Step 3. For $m = 1, 2, \ldots, M$, a weak classifier $G_m(x)$ is trained on the training set with weight distribution $W_m$. Calculate the classification error rate of $G_m(x)$ on the training set:

$$e_m = \sum_{i=1}^{N} w_{m,i}\, I\big(G_m(x_i) \ne c_i\big).$$

Then calculate the coefficient of the weak classifier $G_m(x)$:

$$\alpha_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m}.$$

Update the training set weight distribution:

$$w_{m+1,i} = \frac{w_{m,i}}{Z_m} \exp\big(-\alpha_m c_i G_m(x_i)\big), \quad i = 1, 2, \ldots, N,$$

where $Z_m = \sum_{i=1}^{N} w_{m,i} \exp\big(-\alpha_m c_i G_m(x_i)\big)$ is the normalization factor that guarantees $\sum_{i=1}^{N} w_{m+1,i} = 1$.

Step 4. The $M$ weak classifiers obtained from the $M$ iterations are combined into a strong classifier:

$$f(x) = \sum_{m=1}^{M} \alpha_m G_m(x).$$

The final result is

$$G(x) = \operatorname{sign}\big(f(x)\big).$$

This completes the theoretical analysis of the combination of the biologically visual perceptual model and the discriminative model. Next, we verify the classification accuracy and classification time of the proposed model by experiments.
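Steps 1 to 4 can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: threshold decision stumps stand in for the weak classifiers (the paper does not name its weak learner), the training data is a one-dimensional toy set rather than sparse solutions of arrow markings, and the names `stump_predict`, `train_stump`, `adaboost`, and `predict` are hypothetical.

```python
import math


def stump_predict(x, j, thr, pol):
    """Weak classifier: returns pol if feature j is <= thr, otherwise -pol."""
    return pol if x[j] <= thr else -pol


def train_stump(X, c, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, ci, wi in zip(X, c, w)
                          if stump_predict(xi, j, thr, pol) != ci)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def adaboost(X, c, M=10):
    """Train M weak classifiers; returns a list of (alpha, j, thr, pol)."""
    N = len(X)
    w = [1.0 / N] * N                                  # Step 2: uniform weights
    model = []
    for _ in range(M):                                 # Step 3
        err, j, thr, pol = train_stump(X, c, w)
        err = min(max(err, 1e-10), 1 - 1e-10)          # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)        # weak-classifier coefficient
        model.append((alpha, j, thr, pol))
        w = [wi * math.exp(-alpha * ci * stump_predict(xi, j, thr, pol))
             for xi, ci, wi in zip(X, c, w)]
        Z = sum(w)                                     # normalization factor
        w = [wi / Z for wi in w]
    return model


def predict(model, x):
    """Step 4: sign of the weighted vote of all weak classifiers."""
    f = sum(alpha * stump_predict(x, j, thr, pol) for alpha, j, thr, pol in model)
    return 1 if f >= 0 else -1


# Toy data: one feature, labels that no single stump can separate.
X = [[float(i)] for i in range(10)]
c = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(X, c, M=3)
```

On this toy set no single stump reaches zero error, but the weighted vote of three stumps classifies every training sample correctly, which illustrates how weak classifiers combine into a strong one. In the hybrid model, each `x` would instead be the sparse solution of a sample.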

3. Experiments

In order to verify the validity of the proposed model, the test is performed on a dataset, consisting of 4,000 frames at a frame rate of 30 fps and a resolution of 1920 × 1080. These test data are captured all day at different vehicle speeds, different urban and suburban roads, and different weather conditions including sunny, cloudy, and rainy conditions. The experimental results are obtained on a 3.10 GHz Intel Core i5 CPU under MATLAB 2014a. As the detection and recognition theory of all kinds of arrow markings is the same, the experiment focused on two kinds of arrow markings (Forward and Forward-Right) for detection and recognition. This section mainly includes the following contents: experimental parameter setting and experimental results comparative analysis.

3.1. Experimental Parameters Setting
3.1.1. Sample Data

There is currently no published standard database in this area. We collected a large number of images with an on-board camera in different places, weather, and road conditions, from which 1,500 samples were cut out, including 750 Forward and 750 Forward-Right, and each sample was normalized to a fixed size. Figures 5(a) and 5(b) show part of the database.

3.1.2. Dictionary Training and Classifier Training

500 Forward and 500 Forward-Right samples of the 1,500 samples are selected as the initial dictionary. The trained dictionary is shown in Figure 6. After the dictionary is trained, the sparse solutions of the other samples are used to form the AdaBoost and SVM training samples. When training the AdaBoost classifier, the choice of the number of iterations is crucial because the recognition accuracy depends on it. Too many or too few iterations lead to worse recognition accuracy, and the more iterations are set, the more training time is needed. Recognition accuracy can be calculated using formula (14). Figure 7 shows the AdaBoost classifier trained with different numbers of iterations. It can be seen from Figure 7 that the overall recognition accuracy is highest when the number of iterations is 600 and decreases for other values.

3.2. Experimental Results Analysis

To evaluate the detection performance of the hybrid model, we use the most common performance evaluation metrics, precision, recall, and F-measure, for comparison with other methods. These metrics are calculated as

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad F\text{-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}},$$

where TP is true positive; FP is false positive; FN is false negative; TN is true negative; F-measure is the harmonic mean of the precision and recall. The closer precision, recall, and F-measure all are to 1, the better the detection performance of the hybrid model.
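As a quick check of these definitions, the short sketch below computes the three metrics from raw counts and confirms that the reported precision of 0.966 and recall of 0.88 give an F-measure of about 0.92. The function names and the example counts are illustrative assumptions; only the 0.966, 0.88, and 0.92 figures come from the paper.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_metrics(tp, fp, fn):
    """Return (precision, recall, F-measure) from raw detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical counts, chosen only to exercise the function.
p, r, f = detection_metrics(tp=8, fp=2, fn=2)

# Consistency check on the values reported for the hybrid model.
reported_f = f_measure(0.966, 0.88)
```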

Table 1 compares the hybrid model with other methods. The hybrid model gives different detection results with different classifiers, depending on the performance of the classifier: the hybrid model using SVM is superior to the hybrid model using AdaBoost. Overall, the detection performance of the proposed hybrid model is better than that of the other methods, among which the baseline method [3] is the worst, followed by KB2010 [18] and K-NN (K-Nearest Neighbor) [5]. Although Random Forest has a higher precision, its overall detection performance is poor because of its lower recall. Although Table 1 shows only a slight improvement in recognition for the proposed hybrid model over the SVM in [3], the proposed method is tested under complex road environments such as nighttime and rainy days, unlike [3].

In order to prove whether the proposed hybrid model has fully exploited the advantages of the biologically visual perceptual models and the discriminative models, we use the classification precision and classification time for comparison with the two models, respectively. Table 2 shows the comparison between the proposed hybrid model and biologically visual perceptual models. Table 3 shows the comparison between the proposed hybrid model and discriminative models.

Tables 2 and 3 compare the hybrid model with the commonly used methods of the two kinds of models. Table 2 shows that, although the sparse learning model needs less classification time, its classification precision is lower than that of the hybrid model. Comparing the hybrid model with SVM against discriminative models with SVM (HOG + SVM, MBLBP + SVM, and HOG-MBLBP + SVM) [4] in Table 3, we find that, although the classification precisions are much the same, the hybrid model has an obvious advantage in classification time. When the AdaBoost classifier is used in the hybrid model, the result is superior to the other methods. In an ideal road environment, Haar + AdaBoost [19] reduces the dimension of the Haar feature by clipping the image size of the test sample to achieve rapid and accurate recognition. However, in a complex road environment, its accuracy drops to 0.71 and the robustness of the algorithm degrades. Histogram + ANN [6] manually selects 37 features such as the width, length, maximum length, and number of right angles of arrow markings to achieve rapid recognition. However, these 37 features are easily affected by the external environment, leading to unreliable recognition accuracy, so this method is not appropriate for complex road environments.

As the number of training samples increases, the training time of classifier becomes a crucial metric to evaluate the quality of a model [24]. In order to verify the superiority of the hybrid model in training time, Table 4 gives the comparative results of the training time of the hybrid model and discriminative models. It can be seen from the experimental results that, under the same hardware and the same number of the training samples, the hybrid model has the shortest training time. When the number of the training samples is 200, the training time of the traditional Haar + AdaBoost method is 1 hour. However, the proposed hybrid model needs only 8.3 seconds under even 500 training samples. When training SVM, the training time of hybrid model is also shortened greatly. The hybrid model improves the learning efficiency of discriminative models significantly.

The above experimental results show that the proposed hybrid model exploits the advantages of each model, the high speed of the biologically visual perceptual model and the high accuracy of the discriminative model, and outperforms previous methods. At present, the precision of the hybrid model reaches 96.6%, and the model runs at an average rate of 5 fps. The traditional SVM [3] and AdaBoost [19] methods run at 3 fps and 2 fps, respectively.

In practice, however, the number of training samples for classifiers is usually quite large. To study whether the hybrid model can be used on larger datasets, this paper analyzes the performance of the hybrid model trained on a progressively enlarged training dataset. Figures 8(a) and 8(b) show the detection performance and training time, respectively, of the hybrid method using the AdaBoost classifier for different numbers of training samples. Figures 8(c) and 8(d) show the same metrics as Figures 8(a) and 8(b), respectively, with the classifier changed from AdaBoost to SVM. Figures 8(a) and 8(c) show that the detection performance of the classifiers remains robust as the number of training samples increases. As shown in Figures 8(b) and 8(d), although the training time of the classifier increases with the number of samples, it is still shorter than that of the traditional Haar + AdaBoost [19] method. In summary, we infer that the hybrid model can be used on larger datasets.

As for computational cost [25, 26], time complexity [27, 28] is an important criterion for evaluating algorithms in practical applications, which describes the speed of algorithms. Table 5 shows the comparison of the time cost of the hybrid model and other models. Methods in Table 5 are all tested on a computer with an Intel Core i5-2400 CPU at 3.10 GHz and 4.0 GB RAM. From Table 5, we can see that although our method takes much less computation time than the traditional AdaBoost and SVM methods, it still takes more computational time than sparse coding. So accelerating our proposed method will be our future research focus.

We then perform our experiment on real-world scenario. Figure 9 shows the experimental results where we use hybrid model to detect and recognize arrow markings in complex road environment. For each image, bottom left is the extracted ROI. Bottom right is the arrow markings detected by FT + MSER. Top is the final recognition results using AdaBoost. As can be seen from Figure 9, the proposed hybrid model can effectively detect and recognize arrow markings under the influence of different illumination (sunlight, vehicle lights, and street lights) and different weather (sunny, cloudy, and rainy weather), achieving all-day detection and recognition of arrow markings.

4. Conclusions

To detect and recognize arrow road markings effectively in complex road environments, this paper proposes a hybrid model that combines biologically visual perceptual models and discriminative models. In the detection stage, the biologically visual perceptual model using the salient object detection algorithm is combined with the MSER algorithm, improving the anti-interference ability of MSER and effectively detecting the arrow markings in the ROI. In the recognition stage, the biologically visual perceptual model using the sparse representation algorithm is combined with the discriminative model using the AdaBoost algorithm, improving the learning efficiency of the AdaBoost classifier and achieving rapid and accurate classification of arrow markings. The precision, recall, and F-measure of the hybrid model reach 0.966, 0.88, and 0.92, respectively, and the processing time is about 0.2 s per frame (5 fps). Future work has two parts. First, because the performance of the hybrid model relies on the performance of the classifier and of the salient object detection algorithm, we need to combine other algorithms from biologically visual perceptual models and discriminative models to verify universal applicability. Second, we need to enlarge the training datasets to improve recognition accuracy and to adapt detection and recognition to all kinds of arrow markings.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

All authors declare that there are no conflicts of interest regarding the publication of this paper.