Abstract

A vision based vehicle speed measurement method is presented in this paper. The proposed intraframe method calculates speed estimates based on a single frame of a single camera. With a special double exposure, a superimposed image can be obtained, in which motion blur appears significantly only in the bright regions of the otherwise sharp image. This motion blur carries information about the movement of bright objects during the exposure. Most papers in the field of motion blur aim at removing this image degradation effect; in this work, we utilize it for a novel speed measurement approach. An applicable sensor structure and exposure-control system are also shown, as well as the applied image processing methods and experimental results.

1. Introduction

Nowadays, an increasing tendency can be noticed for automation and the integration of information and communication technologies into conventional services and solutions, in nearly every aspect of our lives. The car industry is one of the leading sectors of this evolution with the intelligent vehicle concept, as there are several existing solutions for assisting the vehicle's operator (e.g., parking assist systems). Improved sensing technologies could also be used in the smart cities of the future, to improve traffic management and to provide real-time information to each individual vehicle for better traffic load balancing. An imager sensor utilizing the proposed, novel speed measurement concept could be used as a sensing node of a distributed sensor network, as it is based on a low-cost sensor module.

Conventional speed measurement systems are usually based on either RADAR or LIDAR speed guns [1]. Both techniques use active sensing technologies, which are more complicated and expensive than passive camera systems. On the other hand, there are methods in the literature aiming at producing reliable speed estimates based on optical information only [2–5]. Scientific studies in this field can be divided into two major research directions: optical flow (interframe) and motion-blur (intraframe) based displacement calculation methods; however, there are only a few papers related to the latter case [3]. Besides speed measurement, it would be profitable for many possible applications to be capable of identifying cars by number plate recognition. Therefore, it is an essential feature of these systems to provide adequate image quality. The most important drawback of the motion-blur based methods is that the measurement concept itself is based on the degradation of the image, which conflicts with precise number plate identification, although a deblurring method is presented in [3] that is capable of providing appropriate image quality, using a sensor with fast shutter speed and high resolution. Our approach is based on a completely different measurement principle, using a low-end imager sensor. In this paper, we propose a novel double-exposure method for intraframe speed measurement, based on a special imager chip, which meets the mentioned requirements. A suitable sensor structure is shown along with hardware-level control for the imager.

The paper is organized as follows. In Section 2, the fundamental concept is described, and the speed estimation based on the displacement is formulated. Section 3 presents a suitable pixel-level control method for the measurements and the requirements related to the image itself. The applied image processing algorithms and compensation methods are described in Sections 4 and 5, and the paper ends with a conclusion.

2. Concept Formulation

The amount of incident light reaching the imager sensor is determined by the camera's shutter speed ($t_s$), the lens's relative aperture ($N$), and the luminance of the scene ($L$). Considering a measurement situation where $N$ and $L$ are given, the intraframe behavior of fast moving objects on the image plane can be controlled through the shutter speed. The motion blur appearing on an image is proportional to the speed of the object and to the shutter speed.

2.1. Measurement Concept

Our speed measurement concept is based on a special control method of the sensor shutter. The proposed method ensures adequate image quality, while still holding information describing the intraframe motion of certain objects with very bright spots. The classical shutter cycle of the CMOS sensor (open, close) is expanded with an intermediate, semiopen state. We defined a double-exposure scheme (Figure 1), with each phase having different quantum efficiency (QE) values. Quantum efficiency describes the responsivity of an image sensor, and it is defined as the number of photogenerated carriers per incident photon [6, 7], as described in (1):

$$\mathrm{QE} = \frac{J_{ph}/q}{P_{opt}/(h\nu)}, \quad (1)$$

where $J_{ph}$ is the light-induced current density, $P_{opt}$ is the optical power per unit area of the incident light, $q$ is the elementary charge, and $h\nu$ is the energy of the incident photons. The QE of a specific sensor with respect to wavelength can be found in its datasheet.
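
As a quick numerical illustration of (1), the sketch below evaluates the quantum efficiency from a measured photocurrent density and the incident optical power density; the function and its example values are ours and are not taken from the paper.

```python
# Illustrative evaluation of the QE definition in (1); the example values are
# assumptions, not measurements from the paper.
from scipy.constants import h, c, e  # Planck constant, speed of light, elementary charge

def quantum_efficiency(j_ph, p_opt, wavelength):
    """j_ph       -- light-induced current density [A/m^2]
    p_opt      -- optical power per unit area of the incident light [W/m^2]
    wavelength -- wavelength of the (monochromatic) incident light [m]
    """
    carriers_per_area_time = j_ph / e                       # photogenerated carriers
    photons_per_area_time = p_opt / (h * c / wavelength)    # incident photons
    return carriers_per_area_time / photons_per_area_time

# Example: 2 mA/m^2 photocurrent under 10 mW/m^2 of 550 nm light -> QE of about 0.45
print(quantum_efficiency(2e-3, 10e-3, 550e-9))
```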

The first phase of the double exposure is denoted with $t_1$. This is a short interval, when the electronic shutter is fully open. During this time, the dominant component of the integrated image is collected. Since $t_1$ is small, even the moving objects will not be blurred. Then, in the semiopen phase, the process continues with a significantly longer exposure $t_2$, but with a lower QE. This means that a much smaller portion of the incident light will generate charge carriers in the photodiode in a time unit, reducing the responsivity of the sensor. Assuming that we can control the lengths of the double-exposure phases ($t_1$ and $t_2$), we can generate a superimposed image, consisting of a sharp image and a blurred image. On the blurred image, only the high intensity regions of the scene appear, which typically drive the pixels to saturation or to a near saturation value. In the case of a fast moving object with a light source (e.g., a car on a highway with its headlights on), this implies that a light trace appears on the image plane (Figure 1) along the movement path of the light source during the exposure, and the length of the trace is proportional to the speed of the object.
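
To make the superimposition concrete, the following sketch models the double-exposed frame as a weighted sum of a sharp irradiance map and a time-averaged (motion-blurred) irradiance map; the variable names, QE values, and clipping model are our own simplification, not the sensor's actual charge-integration behavior.

```python
# Simplified model of the double-exposure image of Section 2.1 (our own
# illustration; not the actual pixel-level charge integration of the sensor).
import numpy as np

def simulate_double_exposure(scene_sharp, scene_blurred, t1, t2, qe_open, qe_semi,
                             full_well=1.0):
    """scene_sharp   -- irradiance map during the short, fully open phase
    scene_blurred -- time-averaged irradiance during the long, semiopen phase
                     (bright moving sources smear into a trace here)
    t1, t2        -- durations of the open and semiopen phases [s]
    qe_open, qe_semi -- responsivities of the two phases (qe_semi << qe_open)
    """
    charge = t1 * qe_open * scene_sharp + t2 * qe_semi * scene_blurred
    return np.clip(charge / full_well, 0.0, 1.0)  # pixels saturate at the full well
```

Because qe_semi is much smaller than qe_open while t2 is much longer than t1, only regions that are very bright in the blurred component (the headlights) contribute visibly to the second term; the rest of the frame remains dominated by the sharp first exposure.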

2.2. Calculation of Speed Estimates

The measurement geometry is presented in Figure 2. Considering that the spatial geometry (camera position, distance from the lane, and calibration parameters) is known prior to the measurement, one can derive (2) and (3) from the given geometry, where $\alpha$ is the angle between the image plane and the movement direction of the measured object, $s$ is the displacement of the object during the secondary exposure, and the corresponding trace length on the image plane can be derived from the image, assuming that the calibration parameters of the camera are known. After substituting (2) and (4) into (3) and eliminating the intermediate variables, we obtain the displacement $s$ as a function of the measured trace length and the known geometry (5). If the interval of the secondary exposure (or intraframe time) is denoted as $t_2$, the movement speed of the measured object can be obtained as follows:

$$v = \frac{s}{t_2}. \quad (6)$$

As a result, the expected accuracy of the speed measurement is proportional to the accuracy with which the light trace is measured on the image plane (again, provided that the spatial geometry and camera parameters are known). Hence, the longer the light trace is, the more accurately its length can be measured. The lateral movement of the measured vehicle within its lane was considered negligible.
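
The derivation in (2)–(6) depends on the specific measurement geometry of Figure 2; as a rough illustration of the final step only, the sketch below converts a measured trace length into a speed estimate under a simple pinhole-camera model of our own (pixel pitch, focal length, distance, and angle are assumed inputs, not values from the paper).

```python
# Sketch of the speed estimate of Section 2.2 under a simple pinhole-camera model.
# The projection step (pixel pitch, focal length, distance, angle) is our own
# assumption; the paper derives the displacement from its own geometry (2)-(5).
import math

def speed_from_trace(trace_len_px, pixel_pitch_m, focal_len_m, distance_m,
                     alpha_rad, t_intra_s):
    """trace_len_px -- measured light-trace length on the image plane [pixels]
    pixel_pitch_m -- pixel size of the imager [m]
    focal_len_m   -- focal length of the lens [m]
    distance_m    -- distance between camera and vehicle [m]
    alpha_rad     -- angle between image plane and movement direction [rad]
    t_intra_s     -- intraframe (secondary-exposure) time [s]
    """
    trace_len_m = trace_len_px * pixel_pitch_m                     # trace on the sensor
    displacement = trace_len_m * distance_m / (focal_len_m * math.cos(alpha_rad))
    return displacement / t_intra_s                                # v = s / t2, as in (6)

# Example: 80 px trace, 3.75 um pixels, 16 mm lens, 20 m distance, alpha = 0, 30 ms
v = speed_from_trace(80, 3.75e-6, 16e-3, 20.0, 0.0, 0.03)
print(f"{v:.1f} m/s ({v*3.6:.0f} km/h)")  # ~12.5 m/s (~45 km/h)
```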

3. Implementation

CMOS sensor technology enables the implementation of various pixel-level control or computation circuits. Therefore, special electronic shutters can be implemented with pixel-level exposure-control circuitry. This section presents a novel exposure-control concept for CMOS sensors, to implement the described double-exposure method.

Most CMOS imagers apply a rolling shutter [6], where the exposure starts in a slightly delayed manner for every row of the sensor. This causes geometrically incoherent images when capturing moving objects; therefore, rolling shutter cameras are not applicable in some machine vision applications. This calls for the other type of CMOS sensor, featuring global shutter pixels (Figure 3). In this case, the integration of the pixels in the entire array is performed simultaneously, and the readout is performed in a row-by-row manner.

3.1. Description of a Global Shutter Pixel

A fundamental component of a global shutter (or snapshot) pixel [6] is a sample-and-hold (S/H) switch with analog storage (all parasitic capacitances at the amplifier input) and a source follower amplifier, which acts as a buffer to isolate the sensing node and performs the in-pixel amplification. The row select (RS) transistor plays an important role in the readout phase of the exposure cycle. The schematic of a common 4T global shutter pixel (a pixel realization using 4 transistors) is shown in Figure 3. The charge generated by the incident light is transferred from the photodiode (PD) and stored in the in-pixel parasitic capacitance after integration.

3.2. Pixel Control

To ensure that the sensor operates in accordance with the double-exposure schedule, the S/H stage could be replaced with suitable control circuitry that implements the functionality of the semiopen state of the shutter.

One important issue related to charge storage is the global shutter efficiency (GSE). According to [8–10], CMOS imager manufacturers show an increasing tendency to achieve better GSE values. GSE is defined as the ratio of the photodiode sensitivity during the open state to the parasitic sensitivity of the pixel storage during the closed state; in other words, it is the ratio of the QE in the open state to the QE in the closed state of the shutter. The storage parasitic sensitivity has many components, including charge formed in the storage node by direct photons, parasitic diffusion charge originating outside the photodiode (PD), and direct PD to analog storage leakage. The GSE of the specific CMOS sensor [8] (Aptina MT9M021) used in this study is shown in Figure 3. Maintaining sensor performance while reducing the pixel size requires higher quantum efficiency and a lower noise floor. Electrical and optical isolation of the in-pixel storage nodes is also becoming more and more difficult with shrinking pixel size. Aptina's recent 3.75 and 2.8 μm (3rd and 4th generation) global shutter pixel arrays implement [8] extra features such as row-wise noise correction and correlated double sampling (CDS) to reduce the impact of dark current (thermal generation of electron-hole pairs in the depleted region) and readout noise and to improve GSE. On the other hand, increasing pixel-level functionality, and with it the transistor count, conflicts with sensitivity, since the fill factor decreases.

In our experiments, we exploit the relatively low GSE of the Aptina MT9M021 sensor. At short integration times and low scene luminance, the PD to analog storage leakage during the readout phase can emulate the low-QE phase of the double-exposure method proposed in Section 2.1 (assuming that $t_1$ and $t_2$ are represented by the exposure time and the read-out time, respectively). In our experiments, we used custom test hardware (described in Section 3.3), where we can control not only the integration time of the sensor but also the readout time (through the readout frequency). The qualitative characteristics of the secondary blurred image, which is superimposed on the initial sharp image, depend on the read-out time ($t_{ro}$). Since the pixel storage is read out row by row, the read-out time of a pixel in row $y$ can be calculated as follows:

$$t_{ro}(y) = \frac{y \cdot n_{col}}{f_{ro}}, \quad (7)$$

where $f_{ro}$ denotes the readout frequency and $n_{col}$ is the number of clock cycles required to read out one row. As a result, (6) can be rewritten into the following form:

$$v = \frac{s \cdot f_{ro}}{y \cdot n_{col}}. \quad (8)$$

Notice that (7) implies that the readout time of a detected object depends on its vertical position on the image.
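
The exact row timing of (7) depends on the sensor's blanking intervals; the sketch below assumes a plain row-sequential readout with a fixed number of clock cycles per row, so the numbers are illustrative only.

```python
# Rough model of the intraframe (leakage) time seen by an object at image row y,
# following (7); the clocks-per-row figure is an assumption, not the MT9M021's
# documented row timing.
def readout_time(row_index, clocks_per_row, f_readout_hz):
    """Approximate time between the global shutter closing and row `row_index`
    being read out, i.e., the time during which leakage blur accumulates."""
    return row_index * clocks_per_row / f_readout_hz

# Example: object at row 480 (middle of a 960-row frame), ~1650 clocks per row
# (active pixels plus horizontal blanking), 22 MHz readout clock -> ~36 ms
print(readout_time(480, 1650, 22e6))
```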

The capabilities of our hardware enable us to specify the intervals of $t_1$ and $t_2$, based on the QE and the GSE of the specific sensor. During the measurements, we made an empirical observation: the trade-off between license plate readability and the contrast between the background and the light trace is balanced when the following statement holds:

$$t_1 \cdot \mathrm{QE}_{open} \approx t_{ro} \cdot \mathrm{QE}_{closed}. \quad (9)$$

This needs further investigation, but, in this case, the stored charge of the primary exposure and the charge accumulation caused by the leakage (until readout) are of the same order of magnitude (9). As a result, a bright trace will appear on the image, which represents the movement of the headlight during the readout. Technical details of the imager setup are described in Section 4.1.
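
Our reading of the balance condition (9) can be turned into a simple sizing rule for the primary exposure; the GSE value in the example below is a placeholder, not a measured figure for the MT9M021.

```python
# Sizing the primary exposure from the balance condition (9): t1 ~= t_ro / GSE,
# so that the directly integrated charge and the leakage charge are of the same
# order of magnitude. The GSE value below is an assumed placeholder.
def primary_exposure_for_balance(t_readout_s, gse):
    return t_readout_s / gse

# Example: ~36 ms effective readout time and an assumed average GSE of 180
print(primary_exposure_for_balance(0.036, 180))  # ~0.2 ms, in line with Section 4.1
```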

3.3. Test Hardware

In our experiments, we used custom test hardware, described in detail in [11, 12]. Figure 4 shows the camera module and the image capturing device. The system consists of a camera module, an interface card, and an FPGA development board. The camera module utilizes the previously mentioned Aptina MT9M021 sensor, which is operated in trigger mode, so that multiple cameras can be synchronized at the hardware level. The interface card is responsible for deserializing the camera data and providing the FPGA board with input; it is designed to be compatible with a series of FPGA development boards. In our experiments, the FPGA board used was Xilinx's SP605 Evaluation Kit, based on a Spartan-6 FPGA. As stated in Section 3.2, we can control the readout frequency of the sensor, which makes this platform ideal for the measurements.

4. Light Trace Detection

Let us consider the measurement geometry (Figure 2) to be known. As described in Section 2.2, the expected accuracy of the method inherently depends on the accuracy of the light trace length measurement on the captured images. Hence, a crucial point of the whole system is a precise trace detection method. To specify the requirements of such a system, the related regulations and the specificities of the possible applications have to be taken into consideration. The first obvious application could be to use the system as a speed camera. The regulations in this regard vary between countries; for example, in the United States, the Unit Under Test must display the speed of a target vehicle within +2, −3 km/h, according to the US Department of Transportation National Highway Traffic Safety Administration [13]. We will use this figure as a reference benchmark during the research, just for initial proof-of-concept measurements, without any approved validation process. Notice that the specification is more tolerant at lower speed ranges in terms of relative accuracy. This absolute precision requirement actually fits our speed measurement concept well, because a pixel-level accuracy of the light trace detection translates into a fixed absolute speed tolerance.

Besides speed cameras, there can be other applications with less strict requirements, especially in a smart city environment, such as traffic statistics and traffic monitoring.

4.1. Input Image Requirements and Description of the Gathered Database

To achieve the best results in the light trace detection process, the input image has to be captured with appropriate imager sensor settings. The integration time and the readout time of the sensor fundamentally change the effectiveness of the trace detection method. The trace measurement method would require as short an integration time as possible. This would ensure maximum contrast between the light trace and the background, making the detection much easier and more accurate. In contrast, image segmentation for license plate recognition needs a brighter image, with a longer integration time. On the other hand, it would be profitable to prolong the readout time, because the longer the trace is, the more accurate the measurement becomes. But as the secondary exposure becomes longer, more charge is accumulated, blurring the image in the lower intensity regions as well (even with lower QE), making car identification more difficult. In our experiments, we observed the best results at a relatively low illumination range of 100–1700 lx, and all of the images presented in the paper were captured in these lighting conditions. If the illumination exceeds this level and the lower limit of the integration time of the sensor does not allow further compensation, a neutral density filter should be used to maintain the quality of the results. During our measurements, we used integration times around 0.2 ms with a 22 MHz readout frequency, which applies to the previously mentioned illumination level and satisfies assumption (9) in the following way. Consider that the measured object is in the middle of the frame. After rewriting (9), we get

$$t_1 \approx \frac{t_{ro}}{\mathrm{GSE}}, \quad (10)$$

where GSE is an average efficiency value in the visible spectrum. Combining (10) with (7) and substituting the specific values of our sensor, we get the applied settings (11).

We captured image sets for the image processing methods in a real measurement scenario. After selecting a suitable location for the measurement, we observed the passing traffic. Numerous images were captured with our test platform, with a wide variety of vehicle and headlight combinations: passenger cars and vans with LED and halogen lamps. A collection of such images can be seen in Figure 5. The speed of the vehicles was around 40–60 km/h, since the measurements were performed in an urban area. Two separate image databases have been captured: a single camera set and a stereo set, consisting of about 200 and 50 images, respectively. The single camera database has been separated into an evaluation set and a learning set. The learning image set consists of about 30 images of vehicles with different headlight geometries, used for parameter tuning of the image processing methods.

4.2. Detection Algorithm

This section summarizes the image processing algorithm implemented for the light trace extraction. An example input image can be seen in Figure 6, which was captured with our test hardware (Section 3.3). The light traces, arising from the headlights, can be clearly seen. There are some universal features of the light traces on the images, which can be utilized during the detection process. First, regardless of the vehicle and the headlight itself, a trace is typically a saturated, or nearly saturated, area on the image. In most cases, the headlight itself and the first section of the trace are saturated, and, depending on the headlight type, the intensity of the trace decreases towards its endpoint. Second, if the sensor is aligned horizontally and the camera is mounted at a height of ~0.6 m from the ground, approximately where the headlights are expected to be, the traces will appear as horizontal edges on the images.

As a first step, we apply a histogram transformation to highlight the bright regions of the image and to suppress other parts of the scene, so that less processing takes place in the irrelevant regions of the image in the later steps. This is followed by an anisotropic edge enhancement, to highlight horizontal edges. Thresholding is the next step. As described previously, the regions in question are typically nearly saturated; therefore, after edge enhancement, a universal high binarization threshold can be used. This results in a binary image, from which we can extract and label blob (binary large object) boundaries. Then, we filter the blobs based on boundary length. Blobs with boundary length above a maximum or below a minimum threshold are discarded, and the remaining ones are considered candidate objects. These minimum and maximum thresholds have been defined based on the learning image set and tested to ensure maximum reliability. On each image, we will get a number of candidate objects. If the input image were the one on the left side of Figure 6, we would get the candidate objects indicated on the right side of Figure 6. The remaining blobs are filtered again, according to the ratio of the horizontal to the vertical size of their bounding box. Selection of the final object from the candidates is based on morphological features. As can be seen in Figure 6, reflection on the car body can modulate the shape of the light trace that is geometrically closer to the camera, making the measurement problematic, so we always prefer the farther trace in the selection process. The output of the algorithm is the full horizontal size of the selected blob, including the saturated area of the headlight. The algorithm described above is capable of detecting the light traces with a 91.46% success rate (based on the previously mentioned evaluation image set), if the input images are captured in the previously described way. The flowchart of the algorithm is shown in Figure 7.
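
A minimal sketch of this pipeline is given below using OpenCV; the individual thresholds and kernel sizes are placeholders (the paper tunes its thresholds on the learning image set), so the snippet only illustrates the sequence of steps in Figure 7, not the exact implementation.

```python
# Minimal sketch of the trace-detection pipeline (Figure 7) using OpenCV.
# All thresholds are placeholders; the paper tunes them on its learning image set.
import cv2
import numpy as np

def detect_trace_candidates(gray):
    # 1. Histogram transformation: stretch the bright range, suppress the rest.
    stretched = cv2.normalize(np.clip(gray, 180, 255), None, 0, 255, cv2.NORM_MINMAX)
    # 2. Anisotropic edge enhancement: emphasise horizontal edges (vertical gradient).
    edges = np.abs(cv2.Sobel(stretched.astype(np.float32), cv2.CV_32F, 0, 1, ksize=3))
    edges = cv2.convertScaleAbs(edges)
    # 3. High global threshold (traces are near-saturated).
    _, binary = cv2.threshold(edges, 200, 255, cv2.THRESH_BINARY)
    # 4. Blob boundary extraction, filtering by boundary length and aspect ratio.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    candidates = []
    for c in contours:
        perimeter = cv2.arcLength(c, closed=True)
        x, y, w, h = cv2.boundingRect(c)
        if 60 < perimeter < 1200 and w > 3 * h:   # keep long, flat, horizontal blobs
            candidates.append((x, y, w, h))
    return candidates
```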

5. Trace Length Measurement and Correction

After the light trace has been detected on the input image, we have to measure its length precisely, in order to get a precise speed estimate for the movement. The output of the trace detection is the horizontal size of the selected blob (denoted with $l_{blob}$ in Figure 8). To measure the intraframe movement of the headlight, we need to identify both endpoints of the trace. Identification of the starting point of the trace is difficult, because there is a saturated area around it, as can be seen in Figures 6 and 8. In this section, we summarize the methods which we developed for the trace length correction.

5.1. Acquiring Ground Truth with a Stereo Image Pair

As described in Section 4.2, the proposed image processing method calculates speed estimates based on properties of saturated or nearly saturated regions of the image. As there is information loss in those areas due to the saturation, the localization of the starting point for the trace length measurement needs to be done in a different way. Consider a second, auxiliary camera synchronized to the primary sensor, which applies the same exposure settings but is equipped with a dark neutral density filter that cuts out 90% of the incoming light. As a result, only the brightest points of the scene will appear on this second, correction image (Figure 8). Our test platform is capable of synchronizing multiple cameras, where the sensor control signals are driven by an FPGA [11, 12]. With stereo correspondence methods, we can pinpoint the position of the light source based on this compensation image.

Let $p$ and $p'$ be the projections of the starting point of the trace ($P$), as shown in Figure 8. The intrinsic projective geometry between the two views is defined as follows:

$$p'^{\top} F p = 0,$$

where $F$ is the fundamental matrix [14], which maps points in one view to lines in the other view in pixel coordinates as $l' = F p$, where $l'$ is the epipolar line. Consider the detected trace starting point to be a point-like object on the secondary image and the fundamental matrix of the stereo rig to be known from extrinsic and intrinsic calibration. In this case, the intersection of the epipolar line and the major axis of the detected trace (described as a blob) defines the starting point of the trace on the primary image. After that, the length of the trace can be measured. Later on, we consider this as the ground truth. As the saturated region on the compensation image is, in most cases, a point-like object to a good approximation, the uncertainty caused by the size of the detected blob on the secondary image is negligible.
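
As a numerical sketch of this step (the variable names and the homogeneous-coordinate representation are ours), the epipolar line of the headlight point detected on the ND-filtered image can be intersected with the line fitted to the trace blob's major axis:

```python
# Sketch of the stereo starting-point localisation: the epipolar line of the
# headlight point seen by the ND-filtered camera is intersected with the major
# axis of the detected trace blob. F and the axis line are assumed to be known
# (homogeneous pixel coordinates); variable names are ours.
import numpy as np

def trace_start_from_stereo(F, p_secondary, trace_axis_line):
    """F               -- 3x3 fundamental matrix mapping secondary-image points
                          to epipolar lines in the primary image (l = F p)
    p_secondary     -- homogeneous point [u, v, 1] of the headlight on the
                       compensation (ND-filtered) image
    trace_axis_line -- homogeneous line [a, b, c] fitted to the trace blob's
                       major axis on the primary image
    Returns the trace starting point on the primary image in pixel coordinates.
    """
    epiline = F @ p_secondary                     # epipolar line in the primary image
    start_h = np.cross(epiline, trace_axis_line)  # line-line intersection (homogeneous)
    return start_h[:2] / start_h[2]
```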

To verify the obtained results, we applied an Inertial Measurement Unit (IMU) on a vehicle to log the speed of the car in a real situation. Our solution offers 1.3% error compared to the IMU measurement, which encourages us to use this stereo method for acquiring the ground truth. The description of the proof-of-concept measurement and the related figures can be found in [15].

5.2. Statistical Trace Starting Point Localization Based on a Single Camera

When using a single camera for capturing the images, the best option is a statistics-based estimation of the starting point of the trace for the trace length measurement. According to Figure 8, we estimate the length of the trace in the following way:

$$l_{trace} = l_{blob} - \frac{w_{hl}}{2},$$

where $l_{blob}$ is the length of the detected blob, $l_{trace}$ is the length of the trace itself, and $w_{hl}$ is the horizontal size of the headlight. This is based on the assumption that the starting point of the trace is in the middle of the headlight. For this calculation, we developed an algorithm to separate the beam originating from the headlight from the saturated region of the headlight itself, based on the vertical profile of the detected blob along the horizontal axis. According to our stereo database, the mean value of the difference between the calculated headlight center and the light trace starting point obtained through the stereo correspondence method is 3.2 pixels, with a standard deviation of 1.6 pixels. Using the described method for trace starting point estimation, we ran the trace detection algorithm and evaluated the results against the ground truth. Figure 9 shows the error of the detection method. The whole detection and measurement algorithm, using only a single double-exposed frame of a single camera, resulted in an overall error of 4.1%. As a result, it could be used, for example, as a sensing node of a smart city sensor network for traffic surveillance and monitoring, but not for precise speed measurement.
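
A minimal sketch of the single-camera estimate above (names and example numbers are ours):

```python
# Single-camera trace length estimate, assuming the trace starts at the
# horizontal centre of the headlight; names and example values are ours.
def trace_length_single_camera(blob_width_px, headlight_width_px):
    """blob_width_px      -- full horizontal size of the detected blob (headlight + trace)
    headlight_width_px -- horizontal size of the saturated headlight region
    """
    return blob_width_px - headlight_width_px / 2.0

# Example: a 110 px wide blob with a 24 px wide headlight region -> 98 px trace
print(trace_length_single_camera(110, 24))
```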

As the calculated trace length is proportional to the movement speed and inversely proportional to the distance between the camera and the vehicle, a smaller camera-to-vehicle distance and a higher movement speed result in higher accuracy. Notice that, in Figure 9, different error values can be observed for samples with similar trace lengths. This effect comes from the difference between the estimated and the real headlight center in the case of different headlight geometries.

5.3. Accuracy Improvement Possibilities Based on a Novel Sensor Design Concept

In this subsection, a slightly modified exposure-control scheme is proposed to improve the accuracy and reliability of the measurement method. With a novel pixel architecture and a modification of the shutter cycle, inserting one additional short closed state after the primary exposure (open, close, semiopen, close), one can obtain an image in which the light trace is separated from the saturated area of the headlight, which greatly simplifies the measurement of its length and makes it much more accurate. This method would require a dual-pixel sensor architecture with a truly controllable shutter, as well as a modified in-pixel charge storage approach. Hence, the aim of future research is to develop a custom VLSI design capable of performing this separation at the hardware level.

6. Conclusion

To summarize the results, a novel vision based speed estimation method was developed, capable of measuring the speed of specific objects based on a single double-exposed image of a single imager sensor. The measurement results are encouraging, as the published intraframe speed measurement solution [3] reached an average accuracy of 5% in an outdoor environment. The method presented in that paper is based on assumptions which require high-quality, high-frame-rate, and hence expensive cameras. Our solution offers similar accuracy with a low-end sensor and much better accuracy with a stereo pair, which can match the requirements of a speed camera sensor in good lighting conditions.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The support of the KAP-1.5-14/006 Grant and the advice of Laszlo Orzo are gratefully acknowledged.