Abstract

Vehicle detection is one of the important technologies in intelligent video surveillance systems. Owing to the perspective projection imaging principle of cameras, traditional two-dimensional (2D) images usually distort the size and shape of vehicles. In order to solve these problems, traffic scene calibration and inverse projection construction are used to relate the 2D image to the three-dimensional (3D) scene. In addition, a vehicle target can be characterized by several components, so vehicle detection can be fulfilled by combining these components. The salient characteristics of vehicle targets differ between day and night; for example, headlight brightness is most significant at night, while taillight and license plate colors are much more prominent in the daytime. In this paper, by using the background subtraction method and a Gaussian mixture model, we realize accurate detection of vehicle headlights at night. In the daytime, the detection of the license plate and taillights of a vehicle is fulfilled by exploiting the background subtraction method and a Markov random field, based on the spatial geometric relation between the corresponding components. Further, by utilizing Kalman filters to track the vehicles, detection accuracy is improved. Finally, experimental results demonstrate the effectiveness of the proposed methods.

1. Introduction

With the rapid development of intelligent traffic control, computer vision has attracted much attention from designers of intelligent transportation systems (ITSs) [1], due to its importance in information collection in the real-time environment. Monitoring systems based on computer vision technology have become very important in the development of ITSs, and a detailed introduction to vehicle monitoring methods and video monitoring system frameworks is available in [2].

At present, video-based vehicle monitoring systems can be divided into two categories according to two different kinds of vehicle features: vehicle appearance and vehicle motion characteristics. In methods based on vehicle appearance [2, 3], a vehicle target is detected by means of geometric structures, color information, and texture features of the entire vehicle or its parts, such as the symmetry of the vehicle structure, the outline of the vehicle, the local car lights, and license plates. There are a variety of feature representation operators to describe vehicle appearance, such as histogram of oriented gradients (HOG) features, Haar-like operators, scale-invariant feature transform (SIFT) features, and speeded-up robust features (SURF). These feature descriptors are combined with classification algorithms, such as artificial neural networks, support vector machines, AdaBoost, and sparse representation classification, for vehicle monitoring systems in ITSs. In methods based on motion features [4, 5], the entire vehicle or its parts are tracked, and the corresponding tracking trajectories are used to detect vehicle targets and analyze vehicle behaviors.

The method based on a 3D vehicle model [6] comprises building a 3D model of the vehicle target and then detecting the vehicle using a vehicle identification method. The difficulty of this method lies in building a good vehicle model. Vehicle sizes and shapes vary across models and types, and it is not easy to put forward a unified 3D model covering all kinds of vehicles in real life. When the traffic situation is complex, detection with a 3D model is usually ineffective. Although the 3D information of the vehicle is useful, occlusion [7], adhesion, and other issues still cannot be resolved.

The method based on features [8] differs from the above methods. Those methods treat the vehicle as the smallest unit of target detection, but vehicles themselves have many distinctive features [9, 10], such as headlights, taillights, license plates, and vehicle symmetry. An organic combination of these features can often represent a vehicle target, so we can detect a vehicle's local characteristics instead of detecting the entire vehicle. How to detect these local characteristics is the key and challenging aspect of this method: images obtained by video cameras distort the shape and size of the vehicle due to the perspective relation, which makes it difficult to extract the local features of the vehicle from the 2D image.

In this paper, we aim to detect local vehicle features or components, to avoid the occlusion and adhesion problems caused by complex scenes, and to establish the relationship between the 2D image and the 3D scene, using the inherent size and shape of the vehicle components while avoiding the image distortion caused by the camera.

Under normal illumination and the basic rules of the road, a human being can quickly locate a target vehicle's headlights and their color information. However, designing a robust computer vision algorithm to accurately detect vehicle headlights, license plates, and taillights is extremely challenging.

2. Contributions of the Paper

In this paper, we mainly study the target recognition algorithm in traffic monitoring systems. Based on the probabilistic model of the spatial relation, the detection of the target vehicle’s components is exploited instead of detection of the entire vehicle.

There are two main innovations in our paper. On one hand, the shape and size of the vehicle components are distorted by the camera's projection transformation. In this paper, therefore, an inverse projection algorithm is proposed to construct an inverse projection map. The inherent shape and size of the vehicle can be recovered on the inverse projection map, and the salient components of the vehicle can be detected using these inherent attributes. This step is common to detection during both day and night; details are presented in Section 3.1. On the other hand, detection of the target vehicle can be replaced by detecting organic combinations of vehicle parts. At night, the headlights are selected as the detection objects and are first detected according to their geometric characteristics. A Gaussian mixture model is established over the distance between the headlights and their height, and the resulting probability model is used to achieve accurate detection. Details of this process appear in Section 3.2. In the daytime, the taillights and license plate are selected as the objects of detection, using a color model to detect them. Then, from the geometric relationship between the parts, a Markov random field is established to complete vehicle detection. Details of this process appear in Section 3.3.

2.1. Summary of the Proposed Nighttime Algorithm

In the evening, since sunlight is limited, only the headlight information of the vehicles can be used. The target vehicle components at night are detected by the following steps:
(1) Gather real data according to car design and manufacturing standards, and establish a Gaussian mixture model (GMM) based on these data [11].
(2) Detect the dominant information of the vehicle components by using the inverse projection map.
(3) Evaluate the dominant information under the trained GMM and obtain its probability value.
(4) Threshold the probability value to achieve the final vehicle detection.

2.2. Summary of the Proposed Daytime Algorithm

During the daytime, solar illumination is adequate, and thus the color information of the vehicle’s parts can be utilized to detect the vehicle license plate and taillights. In the daytime, the detection of the vehicle’s license plate and taillight can be realized by using background subtraction and a Markov random field (MRF) along with the spatial geometry relation between the corresponding components.

The main flow of the proposed algorithm is shown in Figure 1. In this paper, we introduce separate algorithms for day and night.

3. Vehicle Detection Algorithm Based on Components

3.1. Inverse Projection Plane and Inverse Projection Map

In this paper, an inverse projection plane is first set up in a traffic scene that has been calibrated [12]. Once the position of the inverse projection plane in space is determined, the correspondence between points on the inverse projection plane and pixel positions in the projected image is also determined. The data of the projected image can then be projected onto the inverse projection plane to obtain an inverse projection map, which copies the spatial information of the inverse projection plane. The inverse projection [5] process consists of two parts: the design of the inverse projection plane and the construction of the inverse projection map.

The proposed method relies on the space information with three dimensions; therefore, the calibration procedure for traffic scenes is necessary, and there are many calibration methods for traffic scenes. In this paper, the direct linear transformation (DLT) as proposed by Abdel-Aziz and Karara [12] is used for scene calibration.

3.1.1. Design of Inverse Projection Plane

The inverse projection plane [13] is designed according to the characteristics and spatial position of the target to be detected. Depending on the specific circumstances, it can be set parallel to the road, perpendicular to the road, or at a certain angle to the road, and one or more such planes can be used.

Part of the vehicle's surface can be approximated as a plane with certain geometric features. If the vehicle in three-dimensional (3D) space is regarded as a polyhedron, then, when the characteristics of different faces of the vehicle body are selected as detection objects, the inverse projection plane is attached to the corresponding face of the vehicle body, so that the data constructed by inverse projection can effectively show the apparent characteristics of that face (see Figure 2).

3.1.2. Construction of Inverse Projection Map

According to the above-described method, an inverse projection plane that fits a certain local surface of the target is arranged in space and divided into a grid with a certain resolution (such as 1 cm × 1 cm). Through the camera's perspective relation, the information contained in each grid cell is projected onto a pixel of the corresponding projection area in the image. This determines the inverse projection relationship from the image projection area back to the inverse projection plane; that is, each small grid cell of the inverse projection plane corresponds to a pixel in the image.

The inverse projection map is a pixel representation of the inverse projection plane: a small grid cell of the inverse projection plane is represented by one pixel of the inverse projection map. The map is built by copying the information of each grid cell of the inverse projection plane to the corresponding pixel of the inverse projection map, so each pixel of the map carries the information of one grid cell of the plane.

Suppose that g represents a small grid cell of the inverse projection plane, p(g) the image pixel onto which g is projected, and q(g) the pixel of the inverse projection map corresponding to g; the inverse projection process is then the mapping of the image pixel p(g) to the inverse projection map pixel q(g). The inverse projection map construction principle is illustrated in Figure 3.

It can be seen from Figure 3 that an inverse projection plane is placed on the spatial plane of the target surface. The inverse projection map reconstructed from the data is a copy of the target surface, which not only eliminates the camera's perspective distortion (in the captured image, some shape features of the target surface are geometrically deformed) but also well reflects the true dimensions of the local features of the target surface.
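To make the construction concrete, the following Python sketch (not the authors' code) builds an inverse projection map from one frame, assuming a 3 × 4 projection matrix P obtained from the DLT calibration of Section 3.1 and a plane given by an origin and two unit axis vectors; all names and the nearest-pixel sampling are illustrative choices.

```python
import numpy as np

def build_inverse_projection_map(image, P, origin, u_axis, v_axis,
                                 width_m, height_m, cell_m=0.01):
    """Copy image data onto a planar grid in world space (a sketch).

    image   : H x W x 3 source frame
    P       : 3 x 4 camera projection matrix from DLT calibration
    origin  : 3-vector, world position of the plane's lower-left corner
    u_axis, v_axis : unit 3-vectors spanning the plane
    width_m, height_m : physical plane size in meters
    cell_m  : grid resolution (1 cm x 1 cm by default)
    """
    nu = int(round(width_m / cell_m))
    nv = int(round(height_m / cell_m))
    inv_map = np.zeros((nv, nu, 3), dtype=image.dtype)
    for j in range(nv):
        for i in range(nu):
            # World point at the center of grid cell (i, j).
            X = origin + (i + 0.5) * cell_m * u_axis + (j + 0.5) * cell_m * v_axis
            x = P @ np.append(X, 1.0)          # project into the image
            u, v = x[0] / x[2], x[1] / x[2]    # perspective division
            ui, vi = int(round(u)), int(round(v))
            if 0 <= vi < image.shape[0] and 0 <= ui < image.shape[1]:
                inv_map[nv - 1 - j, i] = image[vi, ui]  # copy pixel to grid cell
    return inv_map
```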

In the experiments, the calibrated road traffic scene is equipped with two inverse projection planes perpendicular to the road, as shown in Figure 4. The former, of size 2 × 3.75 m, is denoted inverse projection plane 1; the latter, of size 2 × 5 m, is denoted inverse projection plane 2. The red borders in the figure are the projection areas of the two inverse projection planes. After the tail plane and the side of a target vehicle are completely fitted to the two inverse projection planes, the inverse perspective transform is applied to the projection area data to construct the inverse projection maps. As the inverse projection maps show, when the vehicle tail plane is fitted to its inverse projection plane, the characteristics of the tail plane are recovered (blue boxes), whereas an unfitted tail plane remains deformed. Similarly, when the other inverse projection plane is fitted to the vehicle side (blue box), a frontal view of the vehicle side in real space is constructed. In the inverse projection map experiments, 1 pixel represents a 1 × 1 cm square in the world coordinate system.

3.2. Algorithm for Nighttime Vehicle Detection

Nighttime vehicle detection can use the center surround extreme as in [10]. In this paper, we detect headlights on the inverse projection map. Vehicle headlights, as a significant feature of a vehicle at night, have an apparent brightness, and on the inverse projection map targets retain their real structural characteristics, approximating the real vehicle lights. The height of a shadow on the projection plane is zero, whereas the headlights carry height information, which can be used to remove the interference of headlight shadows. The main detection algorithm flow is shown in Figure 5. In the evening, our method utilizes the shape characteristics of the headlights rather than color information, and thus grayscale video can be adopted.

3.2.1. Segmentation of the Headlights

The background difference method [14] is a commonly used foreground extraction method for static cameras. Its principle is as follows: a background extraction algorithm obtains the video background; because the gray values of foreground moving objects differ from those of the background, a differential operation is performed between each video pixel value and the background pixel value at the same position, and if the difference is greater than a threshold, the pixel is considered foreground.

The mathematical representation of the background subtraction method is as follows. Suppose the image size is M × N, the gray value of point (x, y) in the current frame is f(x, y), and the gray value of the corresponding pixel in the background image is b(x, y). After differencing and binarization, the foreground value F(x, y) is

F(x, y) = 1, if |f(x, y) − b(x, y)| > T; F(x, y) = 0, otherwise,

where T is the preset threshold for binarization.
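A minimal numpy sketch of this differencing-and-binarization step, with an illustrative threshold value:

```python
import numpy as np

def foreground_mask(frame_gray, background_gray, T=30):
    """Binarize |f(x, y) - b(x, y)| > T, as in the formula above.

    frame_gray, background_gray : 2D uint8 arrays of equal size
    T : preset binarization threshold (the value here is illustrative)
    """
    diff = np.abs(frame_gray.astype(np.int16) - background_gray.astype(np.int16))
    return (diff > T).astype(np.uint8)  # 1 = foreground, 0 = background
```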

The background extraction environment in this paper is nighttime traffic scenes, so the impact of weather and shadows is relatively small, and the brightness of vehicle headlights differs markedly from the background after segmentation, so the background extraction requirements are not difficult to meet. As shown in Figure 6, in this step the proposed algorithm utilizes the gray information of the headlights rather than color information, and thus either gray or color video can be adopted.

3.2.2. Headlight Pairing

As the most obvious cue for nighttime vehicle detection [15], headlights have geometric features of their own, such as area, circularity, and the pairwise similarity of the two headlights, as shown in Figure 7.

Computing these characteristics for each foreground object block allows nonheadlight blocks to be excluded. The mathematical definitions of these geometric features, together with their thresholds, are as follows. The area S is computed as

S = Σ_{x=x_0}^{x_0+w} Σ_{y=y_0}^{y_0+h} I(x, y),

where I(x, y) ∈ {0, 1} is the value of the pixel located at (x, y) in the binary image and S is the area of the foreground object, that is, the number of white pixels inside the bounding rectangle of the foreground object. Here (x_0, y_0) are the lower-left vertex coordinates of the bounding rectangle, h is the height of the rectangle, and w is its width. According to vehicle manufacturers' production specifications, headlight sizes lie in a certain range: if a foreground object block is too large or too small, it is not a headlight target. The upper and lower thresholds T_max and T_min are determined by the pixel-to-size conversion ratio used when the inverse projection map is constructed. When a foreground object block satisfies T_min ≤ S ≤ T_max, it is marked as a candidate headlight block.

The degree of circularity is computed as

C = P² / S,

where S is the area of the foreground object and P is the contour perimeter of the foreground object. By this measure, among all geometries a circular object has the minimum circularity (4π ≈ 12.57), a square has circularity 16, and a rectangle has circularity greater than 16, increasing with the ratio of length to width. A convex polygon's circularity is small, but not smaller than a circle's; a concave polygon has a greater circularity. According to prior knowledge of headlight shape, a headlight is usually circular or rectangular with a moderate aspect ratio, and, owing to light scattering, the bright headlight blocks are close to circular, so headlight circularity is not large. In this paper, a circularity threshold T_C is set: when a foreground block's circularity satisfies C > T_C, it is not a headlight block; otherwise, the block is marked as a headlight block.

The geometric similarity of two headlights is computed as

R_S = min(S_i, S_j) / max(S_i, S_j),  R_C = min(C_i, C_j) / max(C_i, C_j),

where i and j are foreground object block labels, S_i and S_j are the areas of blocks i and j, and C_i and C_j are their degrees of circularity. The two headlights of a vehicle have obvious symmetry, meaning they have the same area and geometry; since the proposed inverse projection map effectively restores the true shape of the target, area and circularity ratios tending to 1 can be used as the prior knowledge required to identify possible headlight pairs and as the matching condition for foreground object blocks. In this paper, thresholds T_S and T_R are set; when the area ratio and circularity ratio of object blocks i and j satisfy R_S > T_S and R_C > T_R, blocks i and j are possibly a pair of headlights.

Using the above judgment conditions to detect headlights, the rough matching results are shown in Figure 8. In this step, our algorithm utilizes the shape characteristics of the headlights rather than the color information, and thus gray or color video can be adopted.
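The area, circularity, and pairing tests above can be sketched with OpenCV contours as follows; the threshold values are illustrative stand-ins for the paper's settings, and areas are in inverse-projection-map pixels (1 cm² per pixel).

```python
import cv2

def headlight_candidates(mask, s_min, s_max, c_max=20.0):
    """Filter foreground blocks by area and circularity C = P^2 / S."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    blocks = []
    for cnt in contours:
        S = cv2.contourArea(cnt)
        if S <= 0:
            continue
        P = cv2.arcLength(cnt, True)        # closed contour perimeter
        C = P * P / S
        if s_min <= S <= s_max and C <= c_max:
            x, y, w, h = cv2.boundingRect(cnt)
            blocks.append({"area": S, "circ": C,
                           "cx": x + w / 2.0, "cy": y + h / 2.0})
    return blocks

def pair_headlights(blocks, t_s=0.7, t_r=0.7):
    """Pair blocks whose area ratio and circularity ratio both tend to 1."""
    pairs = []
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            a, b = blocks[i], blocks[j]
            r_s = min(a["area"], b["area"]) / max(a["area"], b["area"])
            r_c = min(a["circ"], b["circ"]) / max(a["circ"], b["circ"])
            if r_s > t_s and r_c > t_r:
                pairs.append((i, j))
    return pairs
```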

3.2.3. Modeling the Spatial Relation of the Headlights

At this stage of the proposed method, rough matching of the vehicle headlights has been completed. Next, using the headlight components and their spatial relation for recognition and positioning, as a substitute for whole-vehicle identification and localization, is a very important part of the process. From the real-world spatial relationships of vehicle components, the spatial relationship features of the headlights can be derived: the two headlights of the same vehicle lie on the same horizontal line, are at the same height above the ground, and are separated by a distance fixed by the vehicle's practical production dimensions. The spatial relationship is shown in Figure 9.

The mathematical expression of the spatial relationship between the headlights is

V = (Δx, Δy, h),

where Δx and Δy denote the distance differences between the two headlights in the x direction and the y direction, respectively, and h denotes the height of the headlights above the ground.

In this paper, for a sample set of vehicles, the distance differences between the two headlights in the x direction and the y direction, as well as the heights of the headlights above the ground, were statistically calculated. The values of the spatial-relationship variables of the headlight components for a portion of the total sample set of 500 vehicles are shown in Table 1.

From the table, we can see that the difference between the two headlights in the y direction (i.e., Δy) is close to zero for all samples. Therefore, in modeling these samples we take Δy = 0, construct the probability distribution over the other two values (i.e., Δx and h), and ignore the y-direction data. It is thus easy to build a spatial relationship model of the headlights. The result of fitting a GMM to these samples is shown in Figure 10: a 2D GMM is established over two attributes, the height of the headlights and the distance between them.
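As a sketch of this modeling step, the following fits a two-component GMM over (Δx, h) with scikit-learn; the synthetic samples stand in for the paper's 500-vehicle measurements.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in samples: each row is (headlight separation dx, height h) in meters.
rng = np.random.default_rng(0)
dx = rng.normal(1.4, 0.08, 300)   # hypothetical separations
h = rng.normal(0.65, 0.04, 300)   # hypothetical heights
samples = np.column_stack([dx, h])

gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(samples)

# Density of a candidate pair under the fitted model.
density = np.exp(gmm.score_samples([[1.42, 0.66]]))[0]
print(density)
```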

3.3. Algorithm for Daytime Vehicle Detection

We studied color-based detection of the target vehicle components using the blue component of vehicle license plates and the red component of taillights, respectively. In RGB color mode, the blue component of the license plate background is much larger than the red and green components, which are both small; the red component of the taillights is much greater than the blue and green components, which are likewise small. Based on these features, a color space conversion of the video sequence can locate the license plate and the taillights. The detection flowchart is shown in Figure 11. In the daytime, since our method depends on the color information of the blue license plate and the red taillights, color video is utilized.

3.3.1. License Plate Detection

(1) Color Conversion Model. In the RGB color model, we mainly used the background color of the license plate: we analyzed the components of the blue-and-white license plate and obtained each component's histogram. The results are shown in Figure 12, where it is obvious that, for license plate background pixels, the blue component is much larger than the other two components, which are both relatively small.

Therefore, by analyzing the apparent characteristics of the license plate, we convert the video to a special color space that enhances the blue license plate region while suppressing non-license-plate regions; the converted pixel value is computed from the red, green, and blue components R, G, and B of the original pixel. The conversion results are shown in Figure 13. From the transformed image sequence, we can see that the gray gradient of the license plate region is obviously enhanced, while that of the non-license-plate region is significantly suppressed.
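The paper's exact conversion expression is not reproduced above; the sketch below uses one plausible form, f = 2B − R − G (clipped to [0, 255]), which enhances pixels whose blue component dominates, as the text describes. The analogous taillight conversion of Section 3.3.2 would swap the roles of red and blue.

```python
import numpy as np

def plate_channel(frame_bgr):
    """Enhance blue plate regions; one plausible form of the conversion,
    f = 2B - R - G (the paper's exact formula may differ)."""
    # OpenCV stores channels in B, G, R order.
    b, g, r = [frame_bgr[..., k].astype(np.int16) for k in (0, 1, 2)]
    f = 2 * b - r - g
    return np.clip(f, 0, 255).astype(np.uint8)
```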

In this step, our algorithm utilizes the blue information of the license plate, and thus color video is adopted (gray-level video has no color information).

After the transformation of the image sequence, a single-channel image is obtained in which the license-plate-region characteristics are quite obvious: its brightness is prominent, but bright pixels also exist around the target region. In order to exclude the interference of the surrounding pixel brightness, the histogram of the single-channel image can be used, as shown in Figure 14(a). In the figure, pixel values greater than 50 form a peak corresponding to the vicinity of the license plate area. After threshold segmentation and further processing, the results shown in Figure 14(c) were obtained.

(2) Gradient Extraction of License Plate Region. After color conversion, the gray gradient [16] of the license plate region is enhanced, so it can be expressed by the pixel-value differences between neighboring pixels in the image sequence. The gray gradient is calculated as

G(x, y) = Σ_{(x', y') ∈ N(x, y)} |f(x, y) − f(x', y')|,

where G(x, y) is the gradient value of pixel (x, y) in the converted image sequence and f(x', y') is the value of a neighborhood pixel of (x, y). The gradient extraction result is shown in Figure 15. A sliding window is used to scan the gradient image, and the average gray gradient of the window region is calculated. Because the image processing described in this paper operates on the 3D inverse projection map, the size of the license plate area in the map is in a fixed proportion to the actual size of the license plate; when the target vehicle coincides with the inverse projection plane, the width and height of the license plate in the image can be calculated and used to set the width W and height H of the scanning window. The average gray gradient is

Ḡ(x, y) = (1 / (W × H)) Σ_{(x', y') ∈ window(x, y)} G(x', y'),

where Ḡ(x, y) is the average gray gradient of the window centered at (x, y) and W and H are the width and height of the scanning window.
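The following sketch computes a neighborhood-difference gradient (one plausible reading of the definition above) and the per-window averages via an integral image; the window dimensions would come from the plate size on the inverse projection map.

```python
import numpy as np

def gradient_map(f):
    """Sum of absolute differences with the right and lower neighbors."""
    f = f.astype(np.float32)
    g = np.zeros_like(f)
    g[:-1, :] += np.abs(f[:-1, :] - f[1:, :])
    g[:, :-1] += np.abs(f[:, :-1] - f[:, 1:])
    return g

def window_mean_gradient(g, win_w, win_h):
    """Average gradient over every win_w x win_h window, via an integral image."""
    ii = np.pad(g, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    sums = (ii[win_h:, win_w:] - ii[:-win_h, win_w:]
            - ii[win_h:, :-win_w] + ii[:-win_h, :-win_w])
    return sums / float(win_w * win_h)
```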

(3) License Plate Location. Using the average gradient, we apply non-maxima suppression (NMS) [17] to find local maxima of the average-gradient image of the current frame: if a local maximum is larger than a preset threshold, the corresponding region is a candidate license plate region.
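A direct, unoptimized non-maxima suppression over the average-gradient map might look like the following; the neighborhood radius and threshold are assumptions.

```python
import numpy as np

def local_maxima(avg_grad, threshold, radius=5):
    """Keep points that are the maximum of their (2*radius+1)^2 neighborhood
    and exceed the preset threshold."""
    H, W = avg_grad.shape
    peaks = []
    for y in range(H):
        for x in range(W):
            v = avg_grad[y, x]
            if v <= threshold:
                continue
            y0, y1 = max(0, y - radius), min(H, y + radius + 1)
            x0, x1 = max(0, x - radius), min(W, x + radius + 1)
            if v >= avg_grad[y0:y1, x0:x1].max():
                peaks.append((x, y))
    return peaks
```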

In the video image sequence, areas other than the license plate may also exhibit a strong gray gradient and match the color feature, which can lead to license plate detection errors. Through observation, it is found that the license plate region has not only a significant gray gradient but also a fairly uniform distribution. Therefore, we further improve the validity of the algorithm via the texture consistency of the license plate region [18], expressed as

U = Σ_{i=0}^{L−1} p(z_i)²,

where U represents the consistency of the image region, z_i is the ith of L gray levels, and p(z_i) is the normalized gray-level histogram. In this experiment, the consistency threshold for the license plate area is 0.7, and the license plate location results are shown in Figure 16.
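The uniformity measure can be computed directly from the normalized gray-level histogram, for example:

```python
import numpy as np

def uniformity(region, levels=256):
    """Texture consistency U = sum_i p(z_i)^2 over the normalized histogram."""
    hist, _ = np.histogram(region, bins=levels, range=(0, levels))
    p = hist / max(region.size, 1)
    return float(np.sum(p ** 2))
```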

3.3.2. Taillight Detection

Taillights are a significant feature of a vehicle, and in our experiments the taillight color is red; we therefore use the red color to detect the taillights.

(1) Color Conversion Model. As with license plate location, taillight detection is based on the RGB color model; the statistical histogram shows that the red component is significantly greater than the other two components, as shown in Figure 17.

After applying the color conversion model, the vehicle taillight area is enhanced and the nontaillight area is suppressed. In RGB color mode, the red component of the vehicle taillight is greater than the blue and green components, and the difference between the green and blue components is small. Based on this feature, we obtained the color conversion model, with results shown in Figure 18; as in the license plate conversion, the converted pixel value is computed from the red, green, and blue components R, G, and B of the original pixel. In this step, our algorithm utilizes the red information of the taillights, and thus color video is adopted (gray-level video has no color information).

(2) Threshold Segmentation. The quality of target segmentation greatly influences subsequent target detection, and holes still exist in the target after threshold segmentation. A morphological closing operation can fill the holes in the target image, connect adjacent target blocks, and smooth the target boundary; the results of the closing operation are shown in Figure 19.

3.3.3. Vehicle Detection Based on MRF

Considering the spatial geometric relationship between the license plate and the taillights, the Markov random field (MRF) model was used to detect the target.

(1) MRF Principle. Markov random fields [19] are a class of stochastic models rooted in the Markov chain introduced by the Russian mathematician A. A. Markov, who in 1907 described the defining characteristic of the process: given the current state, the future state does not depend on the past states. The Brownian motion of particles in a liquid and a frog jumping in a lotus pond are Markov processes. According to whether time and state are continuous, Markov processes can be divided into four types: (a) time and state are both continuous, (b) time is continuous and state is discrete, (c) time is discrete and state is continuous, and (d) time and state are both discrete. A Markov process with discrete time and state is called a Markov chain.

An MRF model mainly comprises a Markov property and a random field:
(1) The Markov property: for a sequence of random variables arranged in time, the state at time t + 1 depends only on the state at time t.
(2) A random field assigns a value, according to a certain distribution, to each location. It contains two elements: the locations and the phase space.

A 1D Markov stochastic process is a sequence of random variables X_1, X_2, ..., X_t, ..., where X_t represents the state at time t, and the set of values the variables can take is called the state space. The Markov property can be expressed with a probability distribution function:

P(X_{t+1} = x_{t+1} | X_1 = x_1, ..., X_t = x_t) = P(X_{t+1} = x_{t+1} | X_t = x_t),    (11)

where x_t represents a state of the stochastic process; note that (11) is the most basic property of the MRF model.

(2) Relationship between the MRF Model and an Image. An image is a collection of points on a 2D plane and can also be regarded as a 2D Markov random field [20]. Let S denote the set of locations (sites) of the random field and Λ denote the state space, that is, the phase space of the random field. F = {F_s, s ∈ S} represents a random field defined on S, where each F_s is a random variable taking values in the state space Λ.

In image processing applications, for convenience of modeling, the MRF model introduces the concepts of the neighborhood system and the clique, which define the relationship between a pixel and its surrounding pixels. Common neighborhood systems include the first-order and second-order systems. The first-order system consists of the four pixels above, below, left, and right of the current pixel; the second-order system additionally includes the four diagonal pixels, as shown in Figure 20. A clique is a subset of a neighborhood.

(3) Model Representation. In the MRF model, the concept of a label is introduced. Let L represent the entire label set, S represent a frame of the image, and s represent an element (pixel) of the image; f is then the mapping f: S → L, that is, every pixel in the image is assigned a corresponding label value. If there are m pixels in S and n kinds of values in L, then f has n^m possible configurations.

We define on the image a set of random variables F = {F_1, ..., F_m}, where F_i takes a value f_i in the label set L, and (F_i = f_i) denotes an event. The joint event (F_1 = f_1, ..., F_m = f_m), with f = {f_1, ..., f_m}, can be abbreviated as F = f, with joint probability P(F = f).

A graph is a data structure consisting of a nonempty vertex set and a set of pairwise relations (edges) between vertices. In this paper, we use the concept of graphs to construct the MRF model: we represent the vehicle license plate and taillights as graph nodes, and the relationships between the nodes are shown in Figure 21. First, the license plate in the current frame is used as a node of the graph, and then the adjacent vehicle taillights are added as candidate nodes.

Let G = (V, E) represent the vehicle component model, where V represents the vehicle license plate and taillight nodes and E represents the component relationships between the license plate and the taillights. G is a complete graph, so each pair of nodes is joined by an edge. Each node in the graph corresponds to a random variable X_i taking a value x_i in the label set L = {l_1, l_2, l_3, l_4}, and there are four types of nodes in the MRF diagram presented in this paper:
l_1: vehicle license plate node.
l_2: left taillight node.
l_3: right taillight node.
l_4: detection-error node.

In this paper, we use the probability distribution expression of the MRF model:

P(x) = (1/Z) Π_{i ∈ V} φ(x_i) Π_{(i,j) ∈ E} ψ(x_i, x_j),

where Z is the normalization function, φ(x_i) is the node potential, representing the detection confidence of each node, and ψ(x_i, x_j) is the edge potential, representing the relationship between two nodes.
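As a sketch of this factorization, the following evaluates the unnormalized probability of a labeling from given node and edge potentials; the node names and dictionary layout are hypothetical.

```python
# Labels as in the text: license plate, left taillight, right taillight, error.
LABELS = ["plate", "tail_L", "tail_R", "error"]

def mrf_score(labeling, node_pot, edge_pot, edges):
    """Unnormalized P(x): product of node potentials phi and edge potentials psi.

    labeling : dict node -> label, e.g. {"n0": "plate", "n1": "tail_L"}
    node_pot : dict (node, label) -> phi, the detection confidence of the node
    edge_pot : dict (label_a, label_b) -> psi, compatibility of the two labels
    edges    : list of (node_a, node_b) pairs of the complete graph
    """
    p = 1.0
    for node, label in labeling.items():
        p *= node_pot[(node, label)]
    for a, b in edges:
        p *= edge_pot[(labeling[a], labeling[b])]
    return p
```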

3.4. Vehicle Tracking and Detection

Vehicle tracking can be used to predict the position of the vehicle and to match it in the video image, and the Kalman filter (KF) [21] is suitable for vehicle tracking. For the daytime scene, we selected the license plate center and the vehicle velocity as the state vector; for the nighttime scene, we selected the midpoint between the headlights and the vehicle velocity as the state vector:

X = (x, y, v_x, v_y)^T,

where x and y are the vehicle coordinates in the x and y directions, respectively, and v_x and v_y are the speeds of the vehicle in the x and y directions, respectively. Kalman filtering can be divided into two steps:

(1) Prediction. Here, we predict the state vector and the state covariance matrix of the current time, as follows:

X'_k = A X_{k−1},
P'_k = A P_{k−1} A^T + Q,

where X_{k−1} is the state vector at time k − 1, A is the state transition matrix, X'_k is the predicted state vector at the current time k, P'_k and P_{k−1} are the covariance matrices at times k and k − 1, and Q is the process noise matrix.

(2) Update. Here, we select the vehicle nearest to the forecast position. If the distance between the predicted position and the detected position of the vehicle is less than the set threshold, the detected position is taken as the observed value Z_k; if no accurate observation is available, the update phase is skipped. The update phase can be described by the Kalman gain K_k:

K_k = P'_k H^T (H P'_k H^T + R)^{−1},

where H is the measurement matrix and R is the measurement noise covariance matrix. We can then update the state vector and covariance matrix, as follows:

X_k = X'_k + K_k (Z_k − H X'_k),
P_k = (I − K_k H) P'_k.
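A self-contained numpy sketch of one predict/update cycle for this constant-velocity model follows; the noise magnitudes q and r are illustrative, and a missing observation (z = None) skips the update, as described above.

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=1e-2, r=1.0):
    """One predict/update cycle for the state (x, y, vx, vy).

    x : 4-vector state, P : 4x4 covariance, z : 2-vector measured position
    (or None to skip the update, as when no nearby vehicle is found).
    """
    A = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)   # state transition matrix
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)   # measure position only
    Q = q * np.eye(4)                           # process noise (illustrative)
    R = r * np.eye(2)                           # measurement noise (illustrative)

    # Prediction: X'_k = A X_{k-1},  P'_k = A P A^T + Q
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    if z is None:
        return x_pred, P_pred

    # Update with Kalman gain K = P' H^T (H P' H^T + R)^{-1}
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new
```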

4. Experiment

4.1. Nighttime Vehicle Detection Algorithm

In Section 3.2.3, based on real-world vehicle specifications, the GMM of the spatial relationship between the two headlight components was built, and the GMM brings spatial position and 3D size information into the detection process. By including the distance between the two headlights, our method can eliminate incorrect matches between adjacent headlights on the same horizontal line. By including the headlight heights, our method can judge whether the vehicle front plane fits the inverse projection plane, avoiding multiple recognitions of the same vehicle. The GMM of the headlights' spatial relationship thus contains the horizontal position, vertical position, and height information of the headlights; testing candidates against the GMM therefore yields correct headlight matching results, allowing the matched headlights to represent the overall vehicle recognition results.

Assume that the spatial probability model of the headlights is G and that the spatial relationship variables of a candidate headlight pair are V = (Δx, h), so that G(V) indicates the probability that the detected headlights meet the established model. The vehicle detection rule is

detection = 1 (vehicle), if G(V) > T; 0, otherwise,

where T is a preset threshold with a typical value of 0.2.

When the probability of a candidate headlight pair under the spatial probability model is greater than this threshold, we take the candidate headlight parts to represent a target vehicle. The nighttime vehicle detection results are shown in Figure 22.
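Given a GMM fitted as in Section 3.2.3 (see the earlier sketch), the decision rule reduces to a density threshold; the function name is illustrative.

```python
import numpy as np

def detect_vehicle(gmm, dx, h, T=0.2):
    """Accept a candidate pair when its density under the fitted GMM exceeds T.

    gmm   : a fitted sklearn GaussianMixture over (dx, h) samples
    dx, h : headlight separation and height of the candidate pair (meters)
    T     : threshold, with the typical value 0.2 from the text
    """
    density = np.exp(gmm.score_samples(np.array([[dx, h]])))[0]
    return density > T
```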

4.2. Daytime Vehicle Detection Algorithm

After building the model as described in Section 3.3.3, the probability distribution function must be solved; from the confidence [22] of the nodes and edges, the target can then be detected.

4.2.1. Energy Function of Node

The energy function of a node is expressed in terms of the detection probability of the current node; its parameters are obtained by learning, and the taillight component detection factor is set to 0.8.

4.2.2. Energy Function of Edge

The energy function of an edge is expressed in terms of the geometric relationship between the two components it connects.

Using the GMM method, the angle between a taillight and the license plate, as well as the horizontal angle between the two taillights, is modeled; the resulting distribution of the taillights and license plate is shown in Figure 23. In the parameter training process, the Expectation Maximization algorithm [23] is used to estimate the GMM parameters.

The confidence of the nodes and edges determines the MRF. In this paper, the maximum a posteriori (MAP) labeling is computed over all possible labelings, x* = arg max_x P(x). Pearl [24] proposed a probabilistic inference method (belief propagation) to solve this problem: for an acyclic graph, the method obtains an exact solution; for a cyclic graph, it obtains an approximate solution. After this calculation, each node in the model has its best label, and the target vehicle can therefore be detected, as shown in Figure 24.
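For the small graphs used here (a handful of nodes, four labels), the MAP labeling can even be found by exhaustive enumeration, as sketched below; Pearl's belief propagation is the scalable alternative the text cites. Node names and potentials are hypothetical.

```python
import itertools

LABELS = ["plate", "tail_L", "tail_R", "error"]

def map_labeling(nodes, edges, node_pot, edge_pot):
    """Exact MAP by enumerating all |LABELS|^n labelings (fine for n ~ 3-5)."""
    best, best_p = None, -1.0
    for combo in itertools.product(LABELS, repeat=len(nodes)):
        labeling = dict(zip(nodes, combo))
        p = 1.0
        for n in nodes:                      # node potentials (confidences)
            p *= node_pot[(n, labeling[n])]
        for a, b in edges:                   # edge potentials (compatibility)
            p *= edge_pot[(labeling[a], labeling[b])]
        if p > best_p:
            best, best_p = labeling, p
    return best, best_p
```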

A blue car is handled as follows: analysis shows that, after the color model transformation, the gray gradient of the blue body is significantly enhanced, and there is a large target area in the transformed image.

According to the results of several experiments, if the total area of the target is greater than 1/5 of the total area of the inverse projection map, we can regard the target as the rear of the vehicle, as shown in Figure 25.

Furthermore, the vehicle taillights are red, and the threshold segmentation results are shown in Figure 26; note that the brightness of the taillight areas is quite distinct. After extracting the target areas and analyzing the two regions, since the body and the taillights belong to one vehicle target, if the two regions intersect, they are judged to be a vehicle target.

4.3. Tracking and Testing Experiment

In the nighttime scene, we selected the midpoint of the line connecting the headlights and the vehicle speed as the state vector, and the experimental results are shown in Figure 27. Figures 27(a) and 27(c) depict the initial detections of the vehicles; Figures 27(b) and 27(d) show the yellow trajectory lines produced by tracking.

5. Conclusions

This paper mainly introduced a method for recognizing a target by means of its components, using a spatial relationship model: the recognition result for the whole target is replaced by the recognition of some of its parts. In the real world, the spatial relationships among the components of a target have inherent characteristics that enhance the ability to describe and identify the target. The proposed component-based vehicle detection algorithm, using multiple local components, achieved a good detection effect, but some missed and false detections still occurred. Therefore, further improvement of the proposed method is required, mainly in the following aspects:
(1) At night, headlight targets are detected using the GMM; if only one headlight is turned on, or none are, a detection error results. Since the model is built from statistical samples, more samples are needed to adapt the method to more vehicle models.
(2) In the daytime, the vehicle is detected via the color features of the taillights and the license plate. If the license plate is the same color as the car body, the detection results are affected; similarly, a red vehicle body affects taillight detection. The detection algorithm is therefore still relatively imprecise and must be improved.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was funded by the Project of Shaanxi Provincial Science and Technology Program (Grant nos. 2014JM8351, 2015JZ018, and 2016JQ6011), the Fundamental Research Funds for the Central Universities (Grant nos. 2013G1241109 and 310824173603), and the National Natural Science Foundation of China (Grant nos. 61501058 and 61572083).