Abstract

An important aspect of the perception system for intelligent vehicles is the detection and signal measurement of vehicle taillights. In this work, we present a novel vision-based measurement (VBM) system, using an event-based neuromorphic vision sensor, which is able to detect and measure the vehicle taillight signal robustly. To the best of our knowledge, this is the first time a neuromorphic vision sensor has been applied to vehicle taillight signal measurement. The event-based neuromorphic vision sensor is a bioinspired sensor that records pixel-level intensity changes, called events, in addition to whole frames of the scene. The events naturally respond to illumination changes (such as the ON and OFF states of taillights) with very low latency. Moreover, its high dynamic range increases sensor sensitivity and performance in poor lighting conditions. In this paper, we consider an event-driven solution to measure vehicle taillight signals. In contrast to most existing work, which relies purely on standard frame-based cameras for taillight signal measurement, the presented mixed event/frame system extracts frequency domain features from the spatial and temporal signals of each taillight region and measures the taillight signal by combining active-pixel sensor (APS) frames and dynamic vision sensor (DVS) events. A thresholding algorithm and a learned classifier are proposed to jointly achieve brake-light and turn-light signal measurement. Experiments with real traffic scenes demonstrate the performance of measuring taillight signals under different traffic conditions with a single event-based neuromorphic vision sensor. The results show the high potential of event-based neuromorphic vision sensors for optical signal measurement applications, especially in dynamic environments.

1. Introduction

Vehicle safety technology is playing an increasingly important role in intelligent vehicles [1]. Advanced driver assistance systems have been developed to assist in driving and avoid potential hazards by warning drivers based on environmental perception [2–7]. During driving, deceleration of a preceding vehicle can cause rear-end collisions, especially when drivers are distracted. For human drivers, taillights are critical warning signals of the deceleration of preceding vehicles. Therefore, taillight signal measurement is a promising approach to collision avoidance and mitigation. Current taillight signal measurement algorithms are based on standard frame-based cameras. Images acquired from standard frame-based cameras suffer from poor quality, limited resolution, and motion blur when confronted with rapid movement. Standard frame-based cameras also struggle to measure taillight signals under sudden illumination changes, such as when entering a tunnel, or in the presence of a strong light source such as the sun.

The event-based neuromorphic vision sensors, such as the dynamic vision sensor (DVS) [8–10], can overcome the above-mentioned limitations of standard frame-based cameras. Different from standard frame-based cameras, in which measurements arrive at fixed time intervals, event-based sensors generate data asynchronously according to relative light intensity changes. By registering these changes on the order of tens of microseconds, event-based sensors provide almost instant feedback, with a high temporal resolution and much less motion blur. Another important feature of event-based sensors is their much higher dynamic range (120 dB), whereas the dynamic range of standard frame-based cameras is usually about 60 dB. These features make event-based sensors particularly suitable for daytime taillight signal measurement under different light conditions [11], e.g., at noontime or dusk, in different traffic scenes. As shown in Figure 1, standard frame-based cameras sample their environment at a fixed frequency and produce a series of frames, losing all the information between two adjacent frames. In contrast, event-based sensors asynchronously respond to pixel-level brightness changes with microsecond latency and report nothing when everything is at rest. In the taillight signal measurement system, we assume that the vision sensor and the detected vehicle are relatively stationary. Therefore, the event stream is generated only when the taillight state changes, i.e., ON → OFF or OFF → ON. From the point of view of event generation, the density of the event stream closely follows the transition of the taillight state: a large number of dense events are produced when the state changes, and only a small number of sparse events when the state remains unchanged. Therefore, event-based sensors are well suited for taillight signal measurement.

However, events do not provide absolute brightness values and contain no RGB information, which makes robust, long-term taillight signal measurement difficult. Considering these advantages and drawbacks, in this paper, we propose a robust taillight signal measurement system based on a novel event-based neuromorphic vision sensor named the dynamic and active-pixel vision sensor (DAVIS) [12–14]. The DAVIS contains an active-pixel sensor and a dynamic vision sensor in the same pixel array, which generates fixed frame-rate APS frames and asynchronous DVS events. It inherits the advantages of the event-based sensor while ensuring consistency with the standard frame-based camera.

Figure 2 illustrates the proposed taillight signal measurement system. We define the five states of the preceding vehicle taillights as braking-on, braking-off, turning-off, turning-right-on, and turning-left-on. The following vehicle is equipped with a single DAVIS sensor that captures APS frames and DVS events while following the preceding vehicles. The taillight signal measurement system first locates the taillight regions of the preceding vehicles based on the APS frame. Then, the frequency domain features of each taillight region are extracted from the corresponding DVS events, which are used for taillight signal measurement. As the first attempt to use event-based sensors in a taillight signal measurement system, we believe that event-based sensors can be of interest for many vision-based measurement system research efforts. The contributions of this work are summarized as follows:
(1) We present a novel taillight signal measurement system that combines APS frames with DVS events by using a single event-based neuromorphic vision sensor.
(2) We extract frequency domain features from the spatial and temporal signals of each taillight region based on DVS events, thus restraining the influence of low-frequency backgrounds.
(3) A thresholding algorithm and a learned AdaBoost classifier are proposed to jointly achieve brake-light and turn-light signal measurement.

The remainder of this paper is organized as follows. Section 2 provides a review of the traditional frame-based vehicle taillight signal measurement system and describes the event-based neuromorphic vision sensor. Section 3 specifies the details of the proposed taillight signal measurement system. Section 4 discusses the experimental results and analysis. Section 5 presents the conclusions drawn from this paper.

2. Related Work

2.1. Frame-Based Vehicle Taillight Signal Measurement

Existing works of the frame-based vehicle taillight signal measurement can be classified into three categories based on the information used.

2.1.1. Temporal Information-Based Methods

Temporal information is often used for tracking taillights [15–18]. [15] applies a detection-tracking model and uses a trained WaldBoost detector to obtain the new tracker; tracking is then performed by a flock of trackers. [16] proposes a perceptive algorithm to track candidate vehicles, and the turn signals are detected by analyzing the continuous intensity variation of the vehicle box sequences. [19–22] focus on extracting invariant features from the tracked taillight regions in the frequency domain. [19, 20] train an AdaBoost classifier for turn signal detection. [21, 22] use a measure function, in which the current frame value is normalized by the values of the current and previous frames, for brake signal measurement.

2.1.2. Color Information-Based Methods

The color information-based methods often extract features via morphology [23, 24] and color/intensity thresholds [25–27]. Different color spaces are used, such as RGB [28], HSV [29], YCbCr [27], YUV [30], and Lab [31]. [25] introduces a lamp response function for rear lamp detection and a high-pass mask for rear lamp signal measurement. Based on the Nakagami-m distribution, [26] adopts color thresholds to detect turn signals at night by scattering modeling of taillights; it utilizes a contrast of reflectance to describe the turn direction. [27] uses the YCbCr color space as the feature space to detect braking behavior.

2.1.3. Mixed Information-Based Methods

Other methods combine the advantages of the above two categories to increase reliability and efficiency by combining temporal and color information for detecting and tracking vehicle taillights [17, 32–34]. [34] utilizes both luminance and radial symmetry features for brake-light state determination, in which a detection refinement process using temporal information is employed to recover misses. Most of the above methods make use of the color/brightness information and the symmetry of vehicle taillights for object localization, combined with a trained classifier for object confirmation. Recently, deep learning approaches have also been applied to learn features for vehicle taillight detection. [35] first uses Fast-RCNN to detect vehicles, then segments the vehicle taillight regions using an FCN to extract features, and detects the brake lights using an SVM classifier. [36] measures the brake-light signal using brake-light patterns learned from the vehicle taillight appearance by a fine-tuned AlexNet model. [37] uses a brake-light classifier based on chromatic and CNN features. [38, 39] propose a CNN-LSTM framework paired with a spatiotemporal attention model for taillight signal measurement, where the networks for brake and turn signals are trained separately.

Although frame-based sensors provide rich RGB information, which is beneficial for taillight signal measurement, their low temporal resolution and limited dynamic range reduce sensor performance in challenging environments, e.g., motion blur in fast-moving scenes or poor image quality under sudden changes of light. To meet the requirements of taillight signal measurement in different traffic conditions and fast-response applications, it is essential to consider a sensor with a higher sampling rate and sensitivity while maintaining rich RGB information.

2.2. Event-Based Neuromorphic Vision Sensor
2.2.1. Dynamic Vision Sensor

The standard frame-based cameras output visual information in the form of frames at a constant rate. In contrast, event-based neuromorphic vision sensors, such as the dynamic vision sensor (DVS) [8], exhibit a far more efficient encoding scheme. Each independent pixel in a DVS only outputs data in response to log-intensity brightness changes [40–42]. Given a static scene, these pixels produce no output, and therefore the data rate from the device depends on the activity in the scene. Each pixel emits an AER (address event representation) [43] event $e = (x, y, t, p)$ containing the physical location $(x, y)$ of the pixel in the array, the timestamp $t$, and a single bit of information $p$ indicating whether the illumination on the pixel increased or decreased at time $t$. The direction of the change in illumination is encoded as $p \in \{+1, -1\}$, in which $p = +1$ is conventionally referred to as an ON event, representing an increase in illumination, and $p = -1$ as an OFF event, in which a decrease in illumination occurs. The temporal resolution is limited by the rate at which events can be read from the physical hardware (usually on the order of microseconds). Unlike standard frame-based cameras, there is no concept of frames for event-based sensors, as the data arrives entirely asynchronously. In summary, event-based sensors offer multiple advantages over standard frame-based cameras, mainly (1) high temporal resolution, which allows the capture of multiple events within microseconds; (2) high dynamic range, which allows information capture in difficult lighting environments, such as night or very bright scenarios; and (3) low power and bandwidth requirements.
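To make the event data model concrete, the following minimal Python sketch represents AER events as $(x, y, t, p)$ tuples and separates a stream by polarity. The Event type and its field names are illustrative assumptions for this paper, not the API of any particular DVS driver.

```python
from typing import NamedTuple, List

class Event(NamedTuple):
    """A single AER event: pixel coordinates, timestamp, and polarity."""
    x: int    # pixel column in the sensor array
    y: int    # pixel row in the sensor array
    t: float  # timestamp in microseconds
    p: int    # polarity: +1 for an ON event, -1 for an OFF event

def split_by_polarity(events: List[Event]):
    """Separate an event stream into ON and OFF events."""
    on_events = [e for e in events if e.p > 0]
    off_events = [e for e in events if e.p < 0]
    return on_events, off_events

# Example: two events at the same pixel, one brightness increase and one decrease.
stream = [Event(120, 64, 1_000.0, +1), Event(120, 64, 1_850.0, -1)]
on, off = split_by_polarity(stream)
print(len(on), len(off))  # 1 1
```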

In this work, a specific event-based sensor named the dynamic and active-pixel vision sensor (DAVIS) [12] is used, which implements a standard frame-based camera and an event-based sensor in the same array of pixels. Therefore, the output consists of a stream of asynchronous high-rate events together with a stream of synchronous color frames (APS frames) acquired at a low rate. An example of the output of the DAVIS sensor is shown in Figure 3. It is important to note that the notion of frames is absent from the event-based acquisition process. Event frames can be reconstructed, when needed, by buffering the events generated over a given period. As can be seen from Figure 3(b), for representation, the DVS events are collected every 20 ms to form the accumulated event frame.

2.2.2. Algorithms for Event Processing

Since the output of an event-based sensor is an asynchronous stream of events, existing computer vision techniques designed for standard frame-based cameras cannot be directly applied to process events. Consequently, many algorithms have been specifically tailored to leverage events, either by processing the event stream in an event-by-event fashion or by building intermediate, “image-like” representations from event data [44]. The former methods can achieve minimal latency but are sensitive to parameter tuning and are computationally intensive because they perform an update step for each event. In contrast, methods operating on event images trade off latency for computational efficiency and performance [45]. Despite their differences, both paradigms have been successfully applied to recognition tasks [46–50]. [46] describes a real-time hand gesture recognition system based on a stereo pair of DVSs and an event-driven processing technique based on LIF neurons. [47, 48] introduce novel event-based feature representations, a hierarchy of time surfaces (HOTS) [47] and histograms of average time surfaces (HATS) [48], for object recognition. Unlike [47, 48], where pure event counts are measured and summed for each pixel and polarity to generate an event count image, [49, 50] use event timestamps to construct the surface of active events (SAE) for each pixel and polarity.

The event-based sensor applications mentioned above mainly focus on computer vision tasks, but event-based sensors have also been used in vision-based measurement (VBM) systems. Event-based neuromorphic vision sensors have become significantly popular recently and introduce a new paradigm for VBM systems [11, 51–56]. Thanks to the low latency and low power consumption of the event-based sensor, event-frame-based approaches have been proposed to measure the contact force in grasping applications by attaching the event-based sensor to an elastic material [11, 51] and to detect incipient slip [53, 56]. [11, 54, 55] use the event-based sensor for force estimation. The results show the high potential of the event-based sensor for manipulation applications, especially in dynamic environments. Different from tactile sensing, in this paper, we propose a VBM approach that measures the frequency characteristics of each taillight region for taillight signal measurement using an event-based sensor mounted on the following vehicle, where APS frames and DVS events are captured by the camera.

3. Method

3.1. System Architecture

The overview architecture of the proposed system is illustrated in Figure 4, and it consists of four stages: vehicle detection, taillight localization, feature extraction, and signal measurement.

Because DVS events provide dense temporal information about changes in the scene, a simple but effective vehicle taillight signal measurement method becomes achievable in the wild. In this work, we explore and study how to use the event-based neuromorphic sensor in a vehicle taillight signal measurement task. Specifically, building on the DVS events triggered asynchronously by per-pixel brightness changes in the scene, we measure the taillight signals based on the frequency characteristics of the sequential events within the taillight regions. The DVS event density changes drastically when the taillight state changes, especially within the taillight regions; i.e., the number of DVS events increases when a preceding vehicle changes its taillight state from braking-off to braking-on, whereas the number of DVS events fluctuates within a limited range in the holding phase. The trend in the number of DVS events is thus consistent with the changes in the brightness of the taillight. Therefore, we can utilize the frequency characteristics of the number of DVS events within one taillight region to measure the vehicle taillight signals. In the end, a simple but effective preceding vehicle taillight signal measurement method can be achieved, with lower computation requirements compared with conventional vision-based techniques where only APS frames are used [39].

3.2. Vehicle Detection

The proposed preceding vehicle taillight signal measurement system mainly pays attention to vehicles in front of the DAVIS sensor. Therefore, we reduce the processing area for vehicle detection by presetting an ROI on the APS frame. The ROI is the centered two-thirds area at the bottom of an APS frame, and vehicles outside the ROI are ignored. This not only greatly reduces computation but also avoids unnecessary disturbances from the taillights of nearby vehicles.

Vehicle detection is a mature technology. A lot of work has been done in the field of vehicle detection [57], and different studies in object detection and classification [58] have also attempted to automatically detect and classify different classes of vehicles. For vehicle detection, in this paper, we adopt a multistage object detection architecture, the Cascade R-CNN [59]. With its simple and effective detection architecture and released source code, Cascade R-CNN is suitable for a taillight measurement system that requires lightweight and rapid processing. Specifically, this architecture is composed of a sequence of detectors trained with increasing intersection-over-union (IOU) thresholds to minimize overfitting and eliminate quality mismatches at inference. It shows good performance in general object detection tasks. We build the Cascade R-CNN on the FPN framework and utilize ResNet-101 as the backbone. A sample result for preceding vehicle detection is shown in Figure 5(a). Then, the bounding boxes of the detected vehicles are grouped into tracks via a simple IOU tracker [60]. The IOU tracker relies on the overlap area between targets in adjacent frames for tracking; with fast tracking speed and low computational cost, it is suitable for use in the proposed taillight signal measurement system. Because robust vehicle detection and tracking results are the basis for the following steps, we compute the average recall rate of the detected vehicles in the ROI. Our result shows that the average recall rate is close to 100%. The vehicle detection method is not the main focus of this work. Considering that Cascade R-CNN has achieved satisfactory detection results in this work, we do not set up a dedicated comparative experiment to analyze the impact of different vehicle detection methods on the experimental performance.
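The IOU tracker [60] links detections across frames purely through bounding-box overlap. The sketch below illustrates that idea under assumed track and detection structures; it is not the released implementation, and the iou_threshold value is a placeholder.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, xb - xa) * max(0.0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, iou_threshold=0.5):
    """Greedily extend each track with the detection that overlaps its last box most."""
    assigned = set()
    for track in tracks:
        best, best_iou = None, iou_threshold
        for i, det in enumerate(detections):
            if i in assigned:
                continue
            overlap = iou(track["boxes"][-1], det)
            if overlap > best_iou:
                best, best_iou = i, overlap
        if best is not None:
            track["boxes"].append(detections[best])
            assigned.add(best)
    # Unmatched detections start new tracks.
    for i, det in enumerate(detections):
        if i not in assigned:
            tracks.append({"boxes": [det]})
    return tracks
```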

It is important to note that the success of our method depends on the fact that the taillight region is unique and that different signals share a pair of taillights (refer to Section 3.3.1). However, trucks and buses violate these assumptions, as different signals correspond to different taillight regions. Therefore, in this work, we only measure the taillight signals of cars and do not address other vehicle types, such as trucks and buses. To achieve this, we have made two efforts: (1) we focus on acquiring car taillight signals during experimental data acquisition, and (2) after vehicle detection, we only select vehicles labeled “car” by Cascade R-CNN for subsequent steps. On the other hand, in order to apply the proposed measurement system to trucks and buses, a unique correspondence between signal-taillight region pairs would first need to be established before the taillight localization step. Considering that establishing this uniqueness depends on the position and configuration of the taillights, it is beyond the scope of this study and is not discussed in detail here.

3.3. Taillight Localization

We introduce a subwindow search method to locate the taillight position. The subwindow search method uses the lamp response (LR) [25] as a quality function and performs clustering and IOU searching operations to achieve taillight localization.

3.3.1. Lamp Response

The LR is defined in [25]. It measures the relative intensity of the red component of a pixel in an APS frame compared with its blue and green components; we denote the resulting value for pixel $i$, computed from its RGB color values $R_i$, $G_i$, and $B_i$, as $\mathrm{LR}(i)$ (Equation (1)). A taillight is displayed in high red chromaticity: its red component is larger than the green and blue components, while the blue and green components are close to each other [25]. Based on these characteristics, we use the LR to isolate pixels with large red components in an APS frame for taillight localization. Figures 5(b) and 5(c), respectively, show the APS frame of a detected vehicle and its LR isolation result. As shown in Figure 5(c), pixels with large red components are highlighted while other pixels are suppressed, and most of the highlighted regions correspond to the taillight regions of the detected vehicle in Figure 5(b).
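As an illustration of how LR-based isolation could be implemented, the following Python sketch computes a per-pixel red-dominance map. Since Equation (1) is not reproduced above, the formula used here is an assumed placeholder consistent with the description, not the exact definition from [25].

```python
import numpy as np

def lamp_response(frame_rgb: np.ndarray) -> np.ndarray:
    """Per-pixel red-dominance map for an APS frame of shape (H, W, 3).

    Placeholder for the lamp response LR of [25]: the value is large when the
    red component clearly exceeds the green and blue components (assumed form).
    """
    rgb = frame_rgb.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.maximum(0.0, r - np.maximum(g, b))

# Pixels with large red components are highlighted; others are suppressed.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[1, 1] = (220, 40, 50)   # taillight-like pixel
print(lamp_response(frame)[1, 1] > 0)  # True
```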

3.3.2. Subwindow Search

Based on the LR defined in Equation (1), we formulate taillight localization as a clustering search problem. Specifically, we introduce a subwindow search method (see Figure 6) to find the taillight position. This method consists of three main steps. First, the bounding box of the detected vehicle is traversed by a rectangle of fixed width and height in each dimension of the APS frame using a sliding window algorithm, with overlapping blocks between adjacent sliding windows, and a quality function (Equation (2)) is defined to measure the intensity of each rectangular subregion (subwindow) on the APS frame. Second, the intensity values of all the subwindows in each dimension are fed into a K-means clustering algorithm to merge these subwindows. Finally, an IOU searching operation is applied to find the taillight position. The process of the subwindow search method is illustrated in Figure 6, and the details are described below.

3.3.3. Subwindow Selection

We first divide the bounding box of the detected vehicle into five equal horizontal stripes. Because the taillights tend to appear in the middle of the vehicle, we only use the middle three horizontal stripes as the search space (SS) for the taillight position to reduce the computational cost. Then, we generate candidate subwindows $sw_1, sw_2, \ldots, sw_N$ (where $N$ is the number of subwindows) with overlapping blocks in each dimension of the SS using a sliding window algorithm. These subwindows have a fixed width and height; suppose the size of a subwindow is $w \times h$. For subwindows in the horizontal direction, $w$ is one-tenth of the width of the SS and $h$ is the height of the SS; for subwindows in the vertical direction, $w$ is the width of the SS and $h$ is one-tenth of the height of the SS. The quality function of a subwindow is defined as

$$Q(sw_j) = \frac{1}{n_j} \sum_{i \in sw_j} \mathrm{LR}(i), \tag{2}$$

where $n_j$ is the number of pixels within the subwindow $sw_j$.
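A minimal sketch of the horizontal sliding-window pass is given below, assuming the quality function is the mean LR over the pixels of a subwindow, as in Equation (2). The window size, overlap fraction, and dictionary layout are illustrative choices; the vertical pass is analogous with the roles of width and height exchanged.

```python
import numpy as np

def horizontal_subwindows(lr_map: np.ndarray, n: int = 10, overlap: float = 0.5):
    """Slide a fixed-size window across the search space in the horizontal direction.

    lr_map: LR values of the search space (middle three stripes), shape (H, W).
    Each subwindow spans the full height and one-tenth of the width; adjacent
    windows overlap by the given fraction.
    """
    h, w = lr_map.shape
    win_w = max(1, w // n)
    step = max(1, int(win_w * (1.0 - overlap)))
    windows = []
    for x0 in range(0, w - win_w + 1, step):
        sub = lr_map[:, x0:x0 + win_w]
        quality = float(sub.mean())  # assumed quality: mean LR per pixel (Eq. (2))
        windows.append({"x0": x0, "x1": x0 + win_w, "quality": quality})
    return windows
```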

3.3.4. Clustering

Based on the intensity value defined in Equation (2), we aggregate the subwindows in each direction into $K$ clusters through a K-means clustering algorithm, where $K$ is the number of clusters expected to be generated. The input of the K-means algorithm is the intensity values of all subwindows, and its output is $K$ subwindow clusters. The subwindows belonging to the cluster with the lowest intensity value usually correspond to the background, so we filter out background interference by removing the subwindows in the lowest cluster. After that, the adjacent remaining subwindows are merged into groups $G_k$. The quality function of a group is expressed as

$$Q(G_k) = \frac{1}{m_k} \sum_{sw_j \in G_k} Q(sw_j), \tag{3}$$

where $m_k$ is the number of subwindows within the group $G_k$.
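The clustering step can be sketched as follows, using scikit-learn's KMeans as a stand-in for the K-means algorithm. The cluster count, the background-removal rule, and the group data structure are assumptions consistent with the description above rather than the authors' exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def filter_and_group(windows, k: int = 3):
    """Cluster subwindows by quality, drop the lowest-quality cluster, and
    merge the remaining adjacent subwindows into groups (k is assumed)."""
    qualities = np.array([[w["quality"]] for w in windows])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(qualities)
    # The cluster whose mean quality is lowest is treated as background.
    lowest = min(range(k), key=lambda c: qualities[labels == c].mean())
    kept = [w for w, lab in zip(windows, labels) if lab != lowest]
    kept.sort(key=lambda w: w["x0"])
    # Merge touching or overlapping subwindows into groups.
    groups = []
    for w in kept:
        if groups and w["x0"] <= groups[-1]["x1"]:
            groups[-1]["x1"] = max(groups[-1]["x1"], w["x1"])
            groups[-1]["qualities"].append(w["quality"])
        else:
            groups.append({"x0": w["x0"], "x1": w["x1"], "qualities": [w["quality"]]})
    for g in groups:
        g["quality"] = float(np.mean(g["qualities"]))  # assumed group quality (Eq. (3))
    return groups
```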

3.3.5. IOU Searching

The best taillight position is found by calculating the intersection area between the best-matching groups in the two directions (horizontal and vertical). Considering that vehicle taillights are usually located on the left-hand and right-hand sides of the vehicle, the largest group region in the vertical direction and the two largest group regions in the horizontal direction are searched.
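Below is a sketch of the IOU searching step under the group structures assumed above. Ranking groups by their spatial extent follows the "largest group region" wording; the key names and box format are illustrative.

```python
def taillight_boxes(horizontal_groups, vertical_groups):
    """Intersect the largest vertical group with the two largest horizontal groups.

    Assumes horizontal groups carry x-extents ("x0", "x1") and vertical groups
    carry y-extents ("y0", "y1"), produced analogously to the grouping sketch above.
    """
    best_v = max(vertical_groups, key=lambda g: g["y1"] - g["y0"])
    top_two_h = sorted(horizontal_groups,
                       key=lambda g: g["x1"] - g["x0"], reverse=True)[:2]
    # Each candidate taillight is the rectangle where a horizontal group and the
    # best vertical group overlap, returned as (x1, y1, x2, y2).
    return [(g["x0"], best_v["y0"], g["x1"], best_v["y1"]) for g in top_two_h]
```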

3.4. Feature Extraction

After locating the taillight positions, another challenging task is to measure their ON or OFF signals. Different from existing methods, where APS frames are used for vehicle taillight signal measurement, we use DVS events to perform the signal measurement task. To extract features of the brake-light and turn-light signals, we first convert the DVS events of each taillight region into an event frame. We then extract the spatial and temporal characteristics of the signal of each taillight region and transform them into the frequency domain for feature extraction.

3.4.1. Events to Frame Conversion

In order to convert asynchronous DVS events into a synchronous event frame, we accumulate the DVS events within a time interval $\Delta t$ in a pixel-wise manner to generate a 2D event frame $F$. Similar to [61], we simply use the number of events triggered at pixel location $(x, y)$ as the intensity value of that pixel, which is expressed as follows:

$$F(x, y) = \sum_{t_k \in \Delta t} \delta(x, x_k)\,\delta(y, y_k), \tag{4}$$

where $\delta$ is the Kronecker delta function (the function is 1 if its arguments are equal, and 0 otherwise), $(x_k, y_k, t_k)$ are the coordinates and timestamp of the $k$-th event, and $F(x, y)$ is the total number of events triggered at pixel location $(x, y)$ within the time interval $\Delta t$. Because event-based sensors naturally respond to illumination changes and moving edges, the raw events output by the sensor encode both changes in taillight brightness and relative motion between the sensor and the vehicle, and they cannot be processed directly for feature extraction by prevalent algorithms. Thus, before generating the event frame, we apply a thresholding algorithm to the event counts to filter out disturbances from events caused by motion. The events-to-frame conversion process mainly consists of the following two steps:

(1) Event Accumulation. The DVS events are accumulated at a constant time interval of 20 ms. Accurate synchronization between DVS events and an APS frame only holds near the timestamp of the corresponding APS frame. For a given APS frame timestamp $t_f$, we therefore generate the synchronized event frame from the DVS events within the corresponding 20 ms interval. In addition, accumulating events into a frame alleviates the impact of noise and yields a high signal-to-noise ratio (SNR).

(2) Filtering. To generate motion-corrected event frames and establish reliable taillight features, outliers in the distribution of event counts are removed by clipping values of less than three. Figure 7 shows an example of an APS frame and the corresponding synchronized event frame.
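The two steps above can be sketched as follows. The placement of the 20 ms accumulation window relative to the APS timestamp and the interpretation of the clipping rule (suppressing counts below three) are assumptions made for illustration.

```python
import numpy as np

def events_to_frame(events, height, width, t_ref, window_us=20_000, min_count=3):
    """Accumulate DVS events from a 20 ms window into a 2D event-count frame
    and suppress sparse, motion-induced counts (values below min_count).

    events: iterable of (x, y, t, p) tuples with timestamps in microseconds.
    t_ref:  timestamp of the corresponding APS frame (assumed window placement).
    """
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, t, p in events:
        if t_ref - window_us <= t <= t_ref:   # events near the APS timestamp
            frame[y, x] += 1.0                # per-pixel event count (Eq. (4))
    frame[frame < min_count] = 0.0            # filtering step
    return frame
```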

3.4.2. Feature Extraction for Brake-Light Signal

According to [21, 22], the brake-light signal can be measured in the frequency domain based on the fact that an activated brake light possesses higher-frequency content than a nonactivated one. In a DVS frame, the pixel value depends on the number of events, and the number of events depends on the magnitude of the change in taillight brightness. The brake lights spread wider and brighter when activated, which means that different levels of event streams are generated within this region. In other words, the DVS frame encodes the scattering of the taillight as a gradient change in pixel values through the event counts. Figure 7(c) demonstrates this: different levels of events are generated at different locations in the taillight region. Therefore, we propose to use the DVS frame instead of the APS frame to extract frequency domain features for brake-light signal measurement. Specifically, we extract the spatial characteristics of the signal of each taillight region and transform them into the frequency domain. Because of its computational effectiveness for online implementation, the fast Fourier transform (FFT) is a technique often used for frequency analysis [62]; moreover, it has important properties such as separability, translation invariance, and rotation invariance. Hence, we adopt a 2D-FFT algorithm to obtain the high-frequency components of the event frame. Assuming that the event frame $f(x, y)$ is of size $M \times N$, it is transformed into the frequency domain using the 2D-FFT as follows:

$$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2 \pi \left( \frac{u x}{M} + \frac{v y}{N} \right)}, \tag{5}$$

where $u = 0, 1, \ldots, M-1$ and $v = 0, 1, \ldots, N-1$, and $F(u, v)$ represents the frequency domain value of the event frame $f(x, y)$. After the transformation, the real and imaginary parts are combined by

$$|F(u, v)| = \sqrt{\operatorname{Re}\!\big(F(u, v)\big)^{2} + \operatorname{Im}\!\big(F(u, v)\big)^{2}}. \tag{6}$$

Then, we find the maximum value $F_{\max} = \max_{u, v} |F(u, v)|$ as the feature for the subsequent brake-light signal measurement task.
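A minimal sketch of the brake-light feature computation using NumPy's FFT is given below. Any normalization of the magnitude spectrum applied before thresholding in the paper is not reproduced here and is left as an assumption.

```python
import numpy as np

def brake_feature(event_frame: np.ndarray) -> float:
    """Maximum magnitude of the 2D-FFT of a taillight event frame.

    Note: any normalization used before thresholding would need to be added.
    """
    spectrum = np.fft.fft2(event_frame)   # 2D-FFT of the event frame
    magnitude = np.abs(spectrum)          # sqrt(Re^2 + Im^2), as in Eq. (6)
    return float(magnitude.max())
```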

3.4.3. Feature Extraction for Turn-Light Signal

We take into account the frequency of the changes in taillight brightness for the turn-light signal measurement. According to [20], the frequency of the turn-light signal is on the order of 1–2 Hz. To be both fast and robust, we calculate the temporal characteristics of the taillight signal within a 2 s time window to extract features. This involves two steps, i.e., feature extraction and feature transformation.

(1) Feature Extraction. The temporal characteristics of the signal of each taillight region are used for feature extraction. Specifically, we first calculate the average of all pixel values for each frame over all frames in the past 2 s; here, a frame means the event frame generated in Section 3.4.1. Then, we concatenate all the averages into one feature vector. With a frame rate of 20 fps, the size of the feature vector is 40, and we represent it as $\mathbf{v} = (v_1, v_2, \ldots, v_{40})$, where $v_{40}$ refers to the average of the most recent frame.

(2) Feature Transformation. We transform the feature vector into the frequency domain by using a 1D-FFT algorithm, and the resulting frequency domain features are used for the subsequent turn-light signal measurement task. The 1D-FFT is defined as

$$V(k) = \sum_{n=1}^{40} v_n\, e^{-j 2 \pi (k-1)(n-1)/40}, \tag{7}$$

where $k = 1, 2, \ldots, 40$. The transformed feature vector is $\mathbf{V} = \big(|V(1)|, |V(2)|, \ldots, |V(40)|\big)$.
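The turn-light feature pipeline can be sketched as follows, assuming 40 event frames (2 s at 20 fps) per taillight region and taking the magnitude of the 1D-FFT as the transformed feature vector, consistent with Equation (7).

```python
import numpy as np

def turn_feature(event_frames) -> np.ndarray:
    """Frequency-domain feature for the turn-light signal.

    event_frames: the 40 event frames of one taillight region from the last
    2 s (20 fps). The per-frame mean pixel value forms a 40-D temporal vector,
    whose 1D-FFT magnitude spectrum is returned as the feature.
    """
    v = np.array([frame.mean() for frame in event_frames], dtype=np.float32)
    assert v.size == 40, "expects a 2 s window at 20 fps"
    spectrum = np.fft.fft(v)
    return np.abs(spectrum)   # transformed feature vector
```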

3.5. Signal Measurement
3.5.1. Brake-Light Signal Measurement

We utilize a thresholding algorithm for the brake-light signal measurement. It determines the taillight state based on a threshold $\tau$ as follows:

$$S_{\text{brake}} = \begin{cases} \text{braking-on}, & F_{\max} > \tau, \\ \text{braking-off}, & F_{\max} \leq \tau. \end{cases} \tag{8}$$

The measured taillight signal state is braking-on when $F_{\max} > \tau$ and braking-off otherwise. Ideally, the threshold $\tau$ is close to 0: as shown in the top row of Figure 7(c), there is almost no gradient change in the pixel values of the DVS frame when the brake-light state is OFF, i.e., $F_{\max} \approx 0$. However, in real traffic scenarios, environmental factors such as noise, vehicle speed, and the distance between the sensor and the vehicle affect the measured value. Therefore, we set $\tau$ based on analysis of real data to attenuate these effects. It is worth noting that a fixed value of $\tau$ can meet the measurement requirements of most scenarios, since the DVS event stream automatically filters out the effect of nontarget regions on the frequency features. The exception is extreme scenarios, such as fog or snow, which can severely interfere with the event-based sensor's response to changes in the brightness of the brake light; these extreme scenarios are beyond the applicable scope of the proposed system. To improve the brake-light signal measurement accuracy, we combine the measured states of a pair of taillights: only when both the left and the right taillight signals are measured as braking-on is the final returned state braking-on.
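A sketch of the thresholding rule and the pairwise combination is shown below. The default tau = 0.3 follows the experimental setting reported in Section 4.1; the function signature itself is illustrative.

```python
def measure_brake(left_feature: float, right_feature: float, tau: float = 0.3) -> str:
    """Thresholding rule for the brake-light signal (Eq. (8)).

    Both taillights must exceed the threshold for the final braking-on state.
    """
    left_on = left_feature > tau
    right_on = right_feature > tau
    return "braking-on" if (left_on and right_on) else "braking-off"
```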

3.5.2. Turn-Light Signal Measurement

The transformed feature vectors from Section 3.4.3 are fed to an AdaBoost classifier for the turn-light signal measurement. The AdaBoost classifier linearly combines different weak classifiers into a single strong one as follows:

$$H(\mathbf{V}) = \operatorname{sign}\!\left( \sum_{m=1}^{M} \alpha_m h_m(\mathbf{V}) \right), \tag{9}$$

where $\alpha_m$ is the weight of the weak classifier $h_m$. In our case, a weak classifier is a simple threshold on one component of the feature vector that splits the feature space into two disjoint sets. The state of the vehicle's taillights is identified by integrating the measurement results of a pair of taillights: if and only if one of the pair is measured as turning-on, the final returned state is turning-on. In addition, a flashing turn-light signal is measured only after 2 s, because the feature extraction of the turn-light signal is based on a 2 s time window.
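The classification stage can be sketched with scikit-learn's AdaBoostClassifier as a stand-in for the authors' AdaBoost implementation; its default depth-1 decision trees act as the simple-threshold weak classifiers described above. The training data here are random placeholders so the snippet runs, and the pairwise rule follows the "exactly one taillight flashing" reading of the text.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Default base estimators are depth-1 decision trees, i.e., single-feature
# thresholds, matching the weak classifiers h_m described above.
clf = AdaBoostClassifier(n_estimators=50, random_state=0)

# X: 40-D transformed frequency features of single taillight regions (Section 3.4.3);
# y: 1 for a flashing (turning-on) taillight, 0 otherwise.
# Random placeholder data is used purely so this snippet is runnable.
rng = np.random.default_rng(0)
X, y = rng.random((200, 40)), rng.integers(0, 2, 200)
clf.fit(X, y)

def measure_turn(left_feat, right_feat) -> str:
    """Combine per-taillight predictions: turning-on iff exactly one side flashes."""
    left_on = bool(clf.predict([left_feat])[0])
    right_on = bool(clf.predict([right_feat])[0])
    if left_on and not right_on:
        return "turning-left-on"
    if right_on and not left_on:
        return "turning-right-on"
    return "turning-off"
```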

4. Experimental Results and Discussions

In this section, we conduct experiments to evaluate the performance of the proposed system and discuss its general validity. The experimental data are captured using a front-mounted event-based neuromorphic vision sensor, the Color-DAVIS346, in different traffic environments under various light conditions. The recordings contain both APS frames and DVS events. Specifically, the recordings cover open-road driving scenarios ranging from urban to highway and suburban, as well as different illumination conditions, including morning, afternoon, and dusk. All annotations are done on APS frames. Manual bounding box annotations of the cars and taillights contained in the recordings are provided at a frequency of 20 Hz. We annotated a total of 60 sequences with 26,614 car examples, where 20 sequences are labeled for brake-light signal measurement and the remaining 40 sequences are labeled for turn-light signal measurement. Each sequence lasts approximately 20 s. For brake-light signal measurement, all 20 sequences are used as the test set. For turn-light signal measurement, we divide the 40 sequences into training and test sets at a 7 : 3 ratio. The values of the parameters used in the experiments are listed in Table 1; all these parameters are set based on experience and experiments. The performance of the measurement is evaluated using the accuracy metric shown below:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{10}$$

where TP and TN are the numbers of true positives and true negatives, respectively, and FP and FN are the numbers of false positives and false negatives, respectively.

4.1. Brake-Light Measurement Results

For the brake-light signal measurement, the spatial characteristics of the signal of each taillight region are extracted and transformed into the frequency domain, and the maximum value $F_{\max}$ in the frequency domain is used as the extracted feature. Figure 8 shows the variation of $F_{\max}$ in a testing sequence. There are clearly two dominant levels for $F_{\max}$ in this sequence. Although $F_{\max}$ fluctuates significantly due to noise, we can still correctly measure the brake-light signal based on Equation (8) by setting the threshold $\tau$ to 0.3 (the red line in Figure 8). As shown in Figure 8, the braking-on state occurs at the 71st frame (the green line in Figure 8), where $F_{\max}$ first exceeds the threshold. Table 2 lists the brake-light signal measurement results of the proposed system on all testing sequences. Each row of the matrix represents the measurement accuracy of the corresponding state, and the true positives are in italics. As indicated in Table 2, the brake-light signal can be measured with an accuracy higher than 93.4%.

To verify this statement, we analyze the measurement results of a testing sequence in which a preceding vehicle changes its taillight state twice: first from the braking-off state to the braking-on state, and then from the braking-on state back to the braking-off state. In this experiment, we separately record the feature intensity variations of the two taillights (left and right) extracted from APS frames and from DVS events, as shown in Figure 9. Figure 9(a) shows the features extracted from APS frames: the blue (left_aps) and green (right_aps) lines indicate the feature levels of the left and right taillights, respectively. Figure 9(b) shows the features extracted from DVS events: the olive (left_dvs) and Indian red (right_dvs) lines indicate the feature levels of the left and right taillights, respectively. Figure 9 also presents the actual state of the vehicle's taillights, represented by the red line (ground truth), and the signal state measured by the proposed system, indicated by the purple line (detected brake). As we can see from Figure 9(b), the brake light is activated from 0.7 s to 6.3 s, and the measured signal state closely matches this. However, if we were to use the features extracted from APS frames (blue and green lines in Figure 9(a)) for brake-light signal measurement, it would be hard to select a suitable threshold. Because the event-based sensor naturally responds to changes in the brightness of a vehicle's taillights, the features extracted from DVS events reflect the characteristics of the brake-light signal more directly than those extracted from APS frames.

Figure 10 shows the qualitative measurement results of the above testing sequence. The top row represents the beginning of the testing sequence, where the vehicle's brake lights are not activated. The middle two rows show the frames in which the vehicle's brake lights are first and last activated, respectively. The bottom row is the end of the testing sequence, where the vehicle's brake lights are again not activated. Figure 10(a) shows the final measurement results for the vehicle taillights, and Figures 10(b)–10(d) show the intermediate processing steps: Figure 10(b) shows the detected preceding vehicle, Figure 10(c) shows the located taillight positions, and Figure 10(d) shows the DVS events of each taillight region. The regions labeled by green bounding boxes are the ROI, the red boxes represent the detected preceding vehicle, and the yellow boxes mark the positions of the taillights. From the middle two rows of Figure 10(a), we see that the brake lights are activated and measured correctly. From Figure 10(d), we observe a higher event density when the brake lights are activated than when they are not. Comparing the top and bottom rows of Figure 10(d), as well as the second and third rows, we observe that the event density attenuates as the vehicle moves away from the sensor. This attenuation affects the quality of the brake-light signal features and can result in false alarms. Therefore, we mainly measure the brake-light signals of vehicles in the forward direction.

4.2. Turn-Light Measurement Results

For the turn-light signal measurement, the temporal characteristics of the signal of each taillight region are extracted and transformed for feature extraction. Figure 11 shows the features of the turn-light signal extracted from APS frames and DVS events. The top row represents the feature before the transformation, and the bottom row represents the feature after the transformation. Figure 11(a) shows the features extracted from APS frames, and Figure 11(b) shows the features extracted from DVS events. As can be seen from Figure 11(b), the three peaks within the 2 s time window (top row) correspond to the maximum frequency value at 1.5 Hz (3 cycles within 2 s) in the bottom row for the feature extracted from DVS events. However, there is no clear frequency characteristic in the feature extracted from APS frames (see Figure 11(a)). This indicates that the transformed feature from DVS events provides more effective information for the turn-light signal measurement than the feature from synchronous APS frames. Therefore, we use the transformed feature from DVS events for the turn-light signal measurement. Table 3 presents the measurement results obtained with the AdaBoost classifier. Turning-on in Table 3 means either the turning-right-on or the turning-left-on state. As indicated in Table 3, we can measure the turn-light signal 2 s after it starts to flash, with an accuracy of 93.7%.

For qualitative evaluation, the measurement results of a testing sequence are shown in Figure 12. In this testing sequence, a preceding vehicle switches its taillight state from turning-off (top row) to turning-right-on (bottom row). Each column in Figures 10 and 12 has the same meaning. From the bottom row of Figure 12(d), we clearly see that the event density in the right taillight region is higher than that in the left taillight region. After analyzing the measurement results of all the testing sequences, we find that the measurement accuracy is low when the frequency of the turn-light signal is less than 1 Hz. Because the feature extraction of the turn-light signal depends on the dynamic characteristics of the turn-light brightness within 2 s, the number of flashes plays an important role in the measurement accuracy.

4.3. Discussion for General Validity

By integrating APS frame-based taillight localization with DVS event-based feature extraction (the frequency characteristics of the number of DVS events within one taillight region) on the DAVIS neuromorphic vision sensor, the proposed system targets daytime preceding vehicle taillight measurement in real environments, as shown in the experimental results. This is mainly due to the property of event-based sensors that events naturally respond to illumination changes asynchronously, which better reveals the brightness changes of the taillight. Moreover, since the event-based sensor has the advantages of low motion blur and high dynamic range, the proposed method can guarantee good performance under complicated driving scenarios, e.g., highways or tunnel exits and entrances, and different light conditions, e.g., noontime or dusk. In these scenes, the event-based sensor can effectively capture taillight brightness changes where a standard frame-based camera may fail. On the other hand, due to the microsecond temporal resolution of events, some noise sources, such as measurement distance, camera vibrations, and relative motion, may lead to measurement uncertainty. We alleviate these noise impacts by presetting a measurement ROI, filtering events, and accumulating event frames. Although these techniques reduce the measurement uncertainty to a certain extent, other uncertainty factors (e.g., ambient light) are not addressed in this work and will be one of the key research questions in our future work.

Although the advantages of event sensors have been explored for taillight signal measurement, low spatial resolution, such as the 346 × 260 resolution of the DAVIS346, is certainly a limitation for its application. From the point of view of the measurement distance that the system can accommodate, the low spatial resolution of the sensor used in this work limits the measurement performance to some extent. However, from the point of view of vision-based measurement system research, a low resolution does not preclude the use of the event-based neuromorphic vision sensor in autonomous vehicles. The competitiveness of event-based sensors in autonomous vehicles lies in their main features, not high image quality [63]: low power and bandwidth consumption and the ability to respond to dynamically changing scenarios. This is also the purpose of this paper, which is to explore how to use these features of event-based sensors to measure vehicle taillight signals rather than focusing only on high measurement performance. Taking a broader perspective, event-driven measurement represents an exciting opportunity to enable power-efficient intelligent robots. On the other hand, the trend toward increased spatial resolution in event-based sensors [64, 65] also offers the potential to further improve the measurement performance of the proposed system.

5. Conclusion

In this paper, we propose a novel vision-based autonomous measurement system that measures the daytime preceding vehicle taillight signal using an event-based neuromorphic vision sensor by analyzing signals in the frequency domain. Unlike traditional approaches that employ only APS frames, we focus on combining APS frames with DVS events and explore the potential of the event-based neuromorphic vision sensor for the vehicle taillight signal measurement task. Experiments with real traffic scenes demonstrate the performance of the system: the accuracies of the brake-light and turn-light signal measurements are 93.4% and 93.7%, respectively, which verifies its feasibility in real-world environments. The results suggest that the event-driven paradigm is a promising line of enquiry. From a research perspective, this paper focuses on exploring how to leverage and amplify the advantages of event sensors to address the limitations of standard frame-based cameras, i.e., low dynamic range, interframe information loss, and motion blur, while state-of-the-art performance is a secondary concern. We hope this work will spur researchers to explore more applications of the event-based sensor in VBM systems, such as optical signal measurement for optical communication applications, and encourage researchers to add more visual perception technologies to VBM systems.

Data Availability

The datasets generated in this work are temporarily unavailable for public release due to privacy concerns. The authors can be contacted if necessary, and the data can be provided after evaluation.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the Shanghai Automotive Industry Sci-Tech Development Program under Grant 1838 and in part by the European Union’s Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreement 945539 (Human Brain Project SGA3).