Abstract

This study explores a detection algorithm to improve dynamic vision sensors (DVS), combines DVS with computer digital technology to build a DVS network, and completes the monitoring and tracking of targets. The underlying problem to be solved is the poor transmission quality of traditional communication sensor data, which DVS can improve. Firstly, the structure and function of the network are described through an analysis of dynamic visual perception requirements. Secondly, by introducing a target tracking algorithm that combines the event stream and grayscale images, two methods are proposed: an event stream noise reduction method based on event density and a feature tracking algorithm based on optical flow detection. Finally, experiments verify the tracking and detection effect of the optical flow detection algorithm on target objects in dark environments, high-speed motion scenes, and reflective environments. The results show that the average error of target detection and tracking is 3.2 pixels in a dark environment, while the average tracking errors in high-speed motion scenes and reflective environments are 4.86 pixels and 2.88 pixels, respectively. This research has practical reference value for the digital and intelligent development of video surveillance systems.

1. Introduction

As one of the most influential technological changes of the new century, sensor networks [1–3] have provided strong technical support for target detection and tracking tasks since their inception. In a specific and complex environment, it is difficult to track specific targets with human resources alone; large amounts of manpower, material, and financial resources are needed for coordination and deployment. Instead, appropriate sensor nodes are deployed in the target environment, the location information of each node is coordinated and controlled by the overall network, and tasks such as monitoring and tracking of the target are completed through the coordinate information of each sensor. The basic function of the network system is therefore achieved through appropriate measurement methods. Even in a complex environment with severe occlusion, the visual sensor network can realize the perceptual positioning of its visual nodes.

The construction of a dynamic vision sensor (DVS) network requires adding vision sensor nodes (VSN) [4–6] to a general sensor network to meet its high-speed monitoring requirements. The sensor network contains intelligent sensors that can perceive a variety of physical quantities. It can collect target-related information in complex and high-risk environments, monitor and track environmental data through visual sensor nodes, realize detection of a designated area, and then transmit the data to the data center through the transmission channel [7]. Additionally, traditional vision sensors suffer from motion blur when collecting high-speed moving targets and perform poorly in reflective and dark environments. DVS can solve these problems thanks to their high dynamic range and therefore have great potential in target tracking. A DVS performs tracking and detection based on the shape of the target, which not only realizes feature tracking and monitoring in multiple scenarios but also relays and forwards data between sensor nodes, reducing the data-access burden on the data center.

However, the traditional sensor network has an uncertain scale: a sensor network that is too small has an insufficient detection and tracking range for large-scale regional environmental monitoring [8], while excessively large sensor networks are often limited by factors such as cost, power consumption, and flexibility. Some nodes may lack global positioning devices, so real-time positioning of all nodes cannot be achieved, which in turn affects target monitoring and tracking [9, 10]. Therefore, based on an analysis of DVS requirements and by introducing a target tracking algorithm that combines the event stream and grayscale images, an event stream noise reduction method based on event density and a feature tracking algorithm based on optical flow detection are proposed. The research realizes the tracking and detection of target objects in dark environments, high-speed motion scenes, and reflective environments. The results have important reference value for the intelligent application of vision sensors and for improving the accuracy of target detection algorithms.

2. Related Work

After years of development, sensor types have become increasingly diverse, and the target monitoring and recognition algorithms they use have attracted growing attention. Sensors using traditional recognition algorithms perform poorly in monitoring and data transmission, whereas deep learning algorithms have achieved good results in sensor-based target monitoring, tracking, and recognition. Deep-learning-related algorithms are therefore widely used in the field of sensor vision. Quaid and Jalal [11] studied human behavior pattern recognition algorithms for wearable smart sensors and proposed a new variant of the genetic algorithm to solve the problem of complex feature selection and classification using sensor data. The proposed system relies on the statistical dependence between behaviors and the corresponding signal data, which maximizes the possibility of obtaining the best feature values. Jalal et al. [12] used depth sensors to recognize human interaction via a maximum entropy Markov model; in extensive experiments, the average accuracy of the proposed feature extraction algorithm and cross-entropy optimization model reached 91.25%. Ma et al. [13] used support vector machines (SVM) to study human activity recognition, and their experimental results verified the reliability of the proposed tensor-based feature representation model and weighted support tensor machine algorithm. Cheng-Bing and Xi-hao [14] proposed a fuzzy C-means clustering algorithm (FCM) and an adaptive neuro-fuzzy inference system (ANFIS) for sensor array pattern recognition, greatly improving the calculation and convergence speed of the inference system. Khomami and Shamekhi [15] studied Persian sign language recognition with a sign language recognition sensor system; the average recognition accuracy reached 96.13%, providing satisfactory results for 20 gestures. There have been many studies on the application of sensors to target monitoring, tracking, and recognition, but relatively few on using bionic smart sensors to construct a DVS network. Therefore, this study combines computer digital technology with DVS to construct a DVS network for target monitoring and tracking, providing a reference for the digital and intelligent development of vision sensors.

3. Feature Tracking Algorithm Based on Optical Flow Detection

3.1. Dynamic Vision Sensing Demand Analysis

With the advancement of semiconductor technology, photoelectric imaging devices [16–18] have significantly improved their performance indicators and imaging quality in recent years. Compared with the working mode of existing photoelectric imaging devices, which expose at a fixed frequency and output grayscale images frame by frame, the DVS is a biologically inspired intelligent camera. It captures only the dynamics of the scene, thereby reducing data redundancy and latency. In modern cities, the growth of population and vehicles puts tremendous pressure on road traffic. Intelligent sensor interaction technology is used to collect road vehicle information, and a data collection system collects and forwards this information to user terminals. On this basis, the Internet of Things (IoT) digitizes all information elements of the vehicles in the system, and a virtual intelligent road traffic service platform is reconstructed in network space, so that the road system in the physical dimension and the digital traffic control center in the information dimension coexist and merge. During vehicle detection and identification, the system can provide historical vehicle data queries, display road traffic information and traffic flow in real time, support statistical queries of historical status data, and offer stable and scalable data storage. After the data are collected by the DVS, they are transmitted to the data storage center through the central node. The overall network structure is shown in Figure 1.

3.2. Target Tracking Algorithm Combining Event Stream and Grayscale Image

In a DVS, the edge of a moving object generates events. In a short period of time $\Delta t$, the increment of the logarithmic brightness can be recorded as $\Delta L(\mathbf{x}, t)$, which is calculated as

$$\Delta L(\mathbf{x}, t) = \sum_{t_k \in (t - \Delta t,\, t]} p_k C\, e(\mathbf{x}, t_k), \qquad (1)$$

where $e(\mathbf{x}, t_k)$ represents the event flow function (equal to 1 at a pixel that fires an event at time $t_k$ and 0 elsewhere), $p_k \in \{-1, +1\}$ is the event polarity, and $C$ is the contrast threshold. Assuming that the brightness along the motion is constant, differentiating this constancy condition yields

$$\frac{\partial L}{\partial t} + \nabla L \cdot \mathbf{v} = 0. \qquad (2)$$

Meanwhile, through the Taylor formula, the increment of the logarithmic brightness can be expressed as

$$\Delta L(\mathbf{x}, t) \approx \frac{\partial L}{\partial t}(\mathbf{x}, t)\, \Delta t. \qquad (3)$$

By combining the above equations, where $\mathbf{v}$ represents the speed, which can also be denoted as $(u, v)^{T}$, and solving with the gradient of the grayscale image, the increment of the logarithmic brightness is

$$\Delta L(\mathbf{x}, t) \approx -\nabla L(\mathbf{x}, t) \cdot \mathbf{v}\, \Delta t. \qquad (4)$$

After calculating the brightness increment model, the target tracking algorithm is designed using the relationship between the event flow and the brightness increment calculated by the grayscale image. The overall process framework of the algorithm is shown in Figure 2.

In a period $\Delta t$, the events are integrated pixel by pixel to obtain the integral graph $\Delta E(\mathbf{x})$. Let $N_e$ be the number of events to be integrated in this period; the generation principle of the integral graph is

$$\Delta E(\mathbf{x}) = \sum_{k=1}^{N_e} p_k\, \delta(\mathbf{x} - \mathbf{x}_k), \qquad (5)$$

where $\delta(\cdot)$ represents the Kronecker delta. According to the operating principle of the DVS, in a short time $\Delta t$, the optical flow can be regarded as a constant $\mathbf{v}$. The brightness increment produced by the optical flow in time $\Delta t$, calculated from the grayscale image, is

$$\Delta \hat{L}(\mathbf{x}) = -\nabla L(\mathbf{x}) \cdot \mathbf{v}\, \Delta t. \qquad (6)$$
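As a minimal illustration of equations (5) and (6), the sketch below accumulates event polarities into an integral image and predicts the brightness increment from the gradient of a log-intensity frame. The function names and the (x, y, polarity) event layout are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def event_integral_image(events, height, width):
    """Accumulate event polarities pixel by pixel over one period (eq. (5)).

    `events` is assumed to be an iterable of (x, y, polarity) rows with
    polarity in {-1, +1}, all falling inside the integration window.
    """
    delta_e = np.zeros((height, width), dtype=np.float64)
    for x, y, p in events:
        delta_e[int(y), int(x)] += p     # Kronecker delta picks the event pixel
    return delta_e

def predicted_increment(log_frame, flow, dt):
    """Brightness increment predicted from the grayscale frame (eq. (6)):
    delta_L_hat = -grad(L) . v * dt, with the flow v constant over dt."""
    gy, gx = np.gradient(log_frame)      # gradients along rows (y) and cols (x)
    u, v = flow
    return -(gx * u + gy * v) * dt
```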

In the process of tracking the target feature points, in order to simplify the calculation, mainly the translation and rotation transformations of the target image are considered. The transformation of the image is written as the warp $W(\mathbf{x}; \mathbf{p})$, where $\mathbf{p}$ collects the translation and rotation parameters. The integral map is transformed by this affine warp to obtain the prediction map $\Delta \hat{E}$, calculated as

$$\Delta \hat{E}(\mathbf{x}) = \Delta E\big(W(\mathbf{x}; \mathbf{p})\big). \qquad (7)$$
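A sketch of this affine prediction step, assuming the warp is parameterized as a rotation angle plus a translation and applied with OpenCV's warpAffine; during tracking, these parameters and the flow would be optimized jointly so that the prediction map matches the increment measured from the events:

```python
import numpy as np
import cv2

def affine_prediction_map(integral_map, angle_rad, tx, ty):
    """Warp the event integral map by a rotation plus translation (eq. (7))."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    warp = np.float32([[c, -s, tx],
                       [s,  c, ty]])          # 2x3 affine matrix for W(x; p)
    h, w = integral_map.shape
    return cv2.warpAffine(integral_map.astype(np.float32), warp, (w, h))
```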

3.3. Research on the Event Stream Noise Reduction Method Based on Event Density

During the use of the sensor, the thermal noise of the reset switch transistor generates background activity noise, which affects the stability of sensor data transmission [19]. The event stream of a fixed scene is collected; statistical analysis shows that the background activity noise follows a Poisson distribution:

$$P(N_{ba} = k) = \frac{(\lambda \Delta t)^{k}}{k!}\, e^{-\lambda \Delta t}, \qquad (8)$$

where $N_{ba}$ represents the number of background activity noise events, $\Delta t$ represents the accumulation time of the events, $\lambda$ represents the average rate at which noise events are generated, and $P$ represents the probability that $k$ noise events are generated in the time period $\Delta t$. Based on an analysis of the noise generation mechanism and the temporal and spatial characteristics of the event stream, an event-density-based noise reduction method is adopted. For each newly arrived event, the time neighbourhood is set to $\Delta t$ and the spatial neighbourhood size to $(2r+1) \times (2r+1)$, and the number of pixel output events within this neighbourhood is accumulated to obtain the density matrix $D$. Each matrix element $D(i, j)$, with $i, j \in [-r, r]$, is calculated as

$$D(i, j) = \sum_{t_k \in (t_0 - \Delta t,\, t_0]} f(x_k - x_0 - i,\; y_k - y_0 - j), \qquad (9)$$

where $(x_0, y_0)$ is the spatial coordinate of the newly arrived event and $f$ is a binary function, calculated as

$$f(a, b) = \begin{cases} 1, & a = 0 \text{ and } b = 0, \\ 0, & \text{otherwise}. \end{cases} \qquad (10)$$

In a fixed time interval $T$, the larger the amount of background noise generated by a non-hot pixel, the lower its probability of occurring. In a fixed spatio-temporal neighbourhood, the amount of background activity noise will therefore stay below a certain threshold, whereas the number of events generated by a real target will likely exceed it; that is, a real event satisfies

$$\sum_{i=-r}^{r} \sum_{j=-r}^{r} D(i, j) \geq \eta, \qquad (11)$$

where $\eta$ is the threshold on the number of events. The threshold is related to the switching comparator threshold of the DVS and to the target size.
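Operationally, the test in equations (9)–(11) amounts to counting recent events in a small spatio-temporal neighbourhood of each new event. Below is a minimal sketch; the event layout (x, y, timestamp in microseconds) and the default window sizes and threshold are assumptions for illustration.

```python
def is_real_event(event, recent_events, r=1, dt_us=5000, eta=3):
    """Event-density filter (equations (9)-(11)): keep the event if at least
    `eta` events fell in its (2r+1)x(2r+1) spatial neighbourhood within the
    last `dt_us` microseconds."""
    x0, y0, t0 = event
    count = 0
    for x, y, t in recent_events:
        # each matching neighbour fills one entry of the density matrix D
        if t0 - dt_us <= t < t0 and abs(x - x0) <= r and abs(y - y0) <= r:
            count += 1
    return count >= eta
```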

Since hot pixel noise occurs at high frequency at a fixed position, the result of the event density threshold filtering is further refined. After filtering, the noise reduction effect on the event stream is evaluated through the probability that an event is real, using the event stream generated by the periodic movement of a target to determine whether an event is a real event or noise. The probability of a real event is calculated as

$$P_{r} = 1 - \frac{N_{n}}{N}, \qquad (12)$$

where $T$ is the turntable period of the sensor motor, $N_{n}$ represents the number of noise events generated by the pixel within $T$, and $N$ is the total number of events generated by the pixel. When an event is related to other events through its degree of temporal and spatial correlation, the probability that the event belongs to the target can be quantified as

$$P_{t}(e_k) = P_{r}\, \rho(e_k), \qquad (13)$$

where $\rho(e_k)$ denotes the normalized spatio-temporal correlation of event $e_k$ with its neighbouring events and $P_{t}(e_k)$ represents the probability that the event is a true event. When this probability is greater than a certain threshold, the event is recognized as a true event. The determination of this threshold requires statistics over the noise events. The average true-event probability of the noise events gives

$$P_{th} = \frac{\alpha}{M} \sum_{k=1}^{M} P_{t}(e_k) + \beta, \qquad (14)$$

where $P_{th}$ is the threshold for judging noise, $M$ is the total number of statistical noise events, and $\alpha$ and $\beta$ are constants. If the correlation degree of an event in the event stream is higher than $P_{th}$, the event is deemed to be a real event; otherwise, it is judged to be a noise event and eliminated during noise reduction [20].
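Because the exact forms of equations (12)–(14) are reconstructed here, the following sketch shows only the shape of the computation: a per-pixel real-event probability and a noise threshold built from the average over identified noise events. All names and constants are assumptions.

```python
def real_event_probability(n_noise, n_total):
    """Equation (12): fraction of a pixel's events that are not noise,
    estimated over one turntable period of the calibration target."""
    return 1.0 - n_noise / n_total

def noise_threshold(noise_event_probs, alpha=1.0, beta=0.05):
    """Equation (14): scaled average true-event probability over the
    statistically identified noise events; alpha and beta are tuning
    constants (values here are placeholders)."""
    mean_p = sum(noise_event_probs) / len(noise_event_probs)
    return alpha * mean_p + beta
```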

3.4. Feature Tracking Algorithm Based on Optical Flow Detection

In high-speed motion scenes [21], reflective scenes [22], and dark environments [23], it is difficult for traditional visual sensors to achieve timely and effective tracking of feature points. The feature tracking algorithm based on optical flow therefore extracts feature points from the integral map of the event stream. The optical flow constraint equation [24–26] uses the brightness change of a pixel during the movement of the target. Assuming brightness constancy,

$$I(x, y, t) = I(x + \Delta x,\, y + \Delta y,\, t + \Delta t), \qquad (15)$$

where $I(x, y, t)$ represents the brightness of the pixel at position $(x, y)$ in the image at time $t$, and $(x + \Delta x, y + \Delta y)$ represents the position of the pixel after the time interval $\Delta t$. Expanding the right-hand side with a first-order Taylor series gives

$$I(x + \Delta x, y + \Delta y, t + \Delta t) \approx I(x, y, t) + \frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t. \qquad (16)$$

Assume that the speed of the target along the horizontal direction is $u$ and the velocity along the vertical direction is $v$, and let $I_x$ and $I_y$ represent the partial derivatives of the image in the two directions, with $I_t$ the temporal derivative. The optical flow constraint equation is then

$$I_x u + I_y v + I_t = 0. \qquad (17)$$
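Equation (17) provides one constraint per pixel, so in frame-based practice the flow is commonly solved over a small window by least squares (the Lucas-Kanade scheme). A minimal NumPy sketch of that conventional solve, for comparison with the event-based version below; the window size is an assumption:

```python
import numpy as np

def lk_flow(prev, curr, cx, cy, half=7):
    """Solve I_x*u + I_y*v = -I_t by least squares over one window (eq. (17))."""
    gy, gx = np.gradient(prev.astype(np.float64))
    it = curr.astype(np.float64) - prev.astype(np.float64)   # temporal derivative
    win = np.s_[cy - half:cy + half + 1, cx - half:cx + half + 1]
    a = np.stack([gx[win].ravel(), gy[win].ravel()], axis=1) # one row per pixel
    b = -it[win].ravel()
    (u, v), *_ = np.linalg.lstsq(a, b, rcond=None)
    return u, v
```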

Among the data collected by sensors, the optical flow constraint equation above applies to images from traditional sensors but cannot be used directly for optical flow calculation on event data: the constraint equation for the event stream must account for the discrete data collection of the DVS [27, 28]. In three-dimensional coordinates, the position change of the event stream over a period can be mapped to a coordinate change on the initial plane, and the mappings of events generated at the same point coincide. From the time intervals at which the optical flow mappings are generated at different points, the position movement of the detected object can be recovered. The flow chart of the feature tracking algorithm based on optical flow detection is shown in Figure 3.

Additionally, an image window is established centred on the pixel whose optical flow is to be determined. Taking the time of the initial plane as 0, the events in the space-time window centred on the pixel can be regarded as a set, and the optical flow in the window is a constant. For the many events in the window, a simple probabilistic model cannot judge the correspondence between events, so further connections between the data need to be established.

For any two events $e_i = (x_i, y_i, t_i)$ and $e_j = (x_j, y_j, t_j)$ in the space-time window, there must be the relationship

$$\begin{cases} x_i - x_j = u\,(t_i - t_j), \\ y_i - y_j = v\,(t_i - t_j). \end{cases} \qquad (18)$$

Assuming there are $n$ events in the window, the optical flow in the window is obtained by minimizing

$$E(u, v) = \sum_{i=1}^{n} \Big[ \big(x_i - \bar{x} - u\,(t_i - \bar{t}\,)\big)^2 + \big(y_i - \bar{y} - v\,(t_i - \bar{t}\,)\big)^2 \Big], \qquad (19)$$

where $\bar{x}$, $\bar{y}$, and $\bar{t}$ are the mean coordinates of the events in the window. According to the principle of the least squares method, the simplified optical flow calculation after the iterative solution of the constants is

$$u = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(t_i - \bar{t}\,)}{\sum_{i=1}^{n} (t_i - \bar{t}\,)^2}, \qquad v = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(t_i - \bar{t}\,)}{\sum_{i=1}^{n} (t_i - \bar{t}\,)^2}. \qquad (20)$$
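Under this reconstruction, estimating the window flow reduces to fitting straight lines x(t) and y(t) through the events, for which equation (20) is the closed-form least-squares solution. A short sketch, assuming 1-D NumPy arrays of event coordinates and timestamps:

```python
import numpy as np

def event_window_flow(xs, ys, ts):
    """Closed-form least-squares flow for one space-time window (eq. (20)).

    Fits x = x0 + u*t and y = y0 + v*t through the events under the
    constant-flow assumption.
    """
    t = ts - ts.mean()
    denom = np.dot(t, t)
    if denom == 0.0:                 # all events simultaneous: flow undefined
        return 0.0, 0.0
    u = np.dot(xs - xs.mean(), t) / denom
    v = np.dot(ys - ys.mean(), t) / denom
    return u, v
```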

4. Results and Discussion

4.1. The Target Tracking Result of Event Stream Combined with Grey Image

When the optical flow tracking algorithm is used to track and recognize the target, the sensor should first accurately detect the feature points according to the event flow and collect effective object information from the grayscale images in different environments. The target tracking detection results are shown in Figure 4.

The DVS is used to track and detect target objects. The result for a target in a high-speed motion scene is shown in Figure 4(a). Figure 4(b) shows the result of static object detection in a dark environment: large black areas appear in the grayscale image and cannot be detected and identified from the frame alone. Figure 4(c) shows the result of tracking and detecting static objects in a reflective environment. The optical flow detection algorithm can effectively extract the feature points of dynamic objects and identify them accurately even in a dark environment.

4.2. Data Processing Results of Event Stream Noise Reduction Method

The data obtained after noise reduction with the event stream method are evaluated, and the original event stream is time-sliced. Each time slice is 10 ms long, giving 286 time slices in total. The average true rate (PARE) is calculated from the number of noise events, as shown in Figure 5. Additionally, the number of noise events among the real events (NIR) and the number of real events among the filtered events (NIF) are evaluated as two parameter values, as shown in Figures 6 and 7.
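The slicing used in this evaluation is straightforward to reproduce: sorted timestamps are cut into consecutive 10 ms bins, and per-slice statistics are then computed. A small sketch, assuming microsecond timestamps:

```python
import numpy as np

def slice_event_stream(timestamps_us, slice_ms=10):
    """Cut a sorted event stream into consecutive fixed-length time slices,
    returning (start, end) index pairs for per-slice statistics."""
    width = slice_ms * 1000                       # slice length in microseconds
    n_slices = int(np.ceil((timestamps_us[-1] - timestamps_us[0] + 1) / width))
    edges = timestamps_us[0] + width * np.arange(n_slices + 1)
    idx = np.searchsorted(timestamps_us, edges)
    return list(zip(idx[:-1], idx[1:]))
```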

The average true rate curve of each event stream slice in Figure 5 shows that, although the average true rate of each segment fluctuates, it is generally maintained at about 1.2. This demonstrates the authenticity and reliability of the event stream slicing method.

Figures 6 and 7 compare the event-density-based method with the background activity filter. The average NIF values of the two methods are 171.355 and 158.222, respectively, so the event-density-based noise reduction method is more effective than background activity filtering. Because the event density method makes its judgments from the event density, its accuracy is higher.

Additionally, a spatiotemporal filter is used to process the experimental data collected by the sensor, with sampling factors set to 2, 4, and 6, respectively. The experimental data processing results are shown in Figures 8–10.

Figures 8–10 show that different sampling factor settings of the spatiotemporal filter lead to large differences in the NIR and NIF values of the processed data. Higher NIF values occur under all sampling factors because events are assigned to different groups in time under different sampling. When the sampling factor is 2, the maximum NIR value after filtering can reach above 8000, and the maximum NIF value is about 1200. As the sampling factor increases, both the NIR and NIF values decrease: when the sampling factor is 6, the NIR value reaches only about 2500 and the NIF value only about 500. At the peaks of NIR and the troughs of NIF, the number of real events decreases as noise increases, because when the latest events are judged, the number of supporting events is insufficient.

4.3. Performance Analysis Results of Optical Flow Detection and Feature Tracking Algorithms

In order to verify the performance advantages of the optical flow detection method over traditional algorithms, it is used to track and detect target objects in a dark environment, a high-speed motion scene, and a reflective environment. The average errors of detection and tracking are shown in Figures 11–13.

Figures 11–13 show that the average error of target detection and tracking is 3.2 pixels in the dark environment, 4.86 pixels in the high-speed motion scene, and 2.88 pixels in the reflective environment. The experiments show that the proposed optical flow detection method, applied to the tracking and detection of target objects, can effectively extract image feature points and achieves good tracking and detection results even in a dark environment.

5. Conclusion

The accuracy and speed with which sensors detect target objects are gradually improving, but traditional sensors still perform poorly in high-speed motion scenes, reflective scenes, and dark environments. In this study, DVS technology and computer digital technology are used to study and improve sensor target detection and recognition algorithms, and an event stream target detection algorithm and an optical flow target tracking algorithm are proposed. These algorithms have important reference value for the intelligent application of vision sensors. Nevertheless, some limitations remain. Firstly, the event-stream-based target detection algorithm could build a deeper detection network to further improve detection accuracy. Secondly, with higher-precision sensor devices, the accuracy of the experimental data would be higher. In the future, more real-world event datasets can be added, which will make object detection and validation more general.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the research on the Training Model of Computer Professionals in Higher Vocational Colleges under the certificate system of “1 + X” (project no. 2020XHY223).