Abstract

Moving object detection has become an important research subject because of its broad application prospects. This paper combines the interframe difference algorithm with the background difference algorithm to update the background, which compensates for the sensitivity of background difference to environmental changes. Mathematical morphology is employed to denoise the binary image, which helps improve detection accuracy. The Pyramid algorithm is used to compress each frame of the video sequence. The detection and tracking of moving objects are then tested on the DM643 hardware platform with the RF5 software framework. With Pyramid compression, interframe difference runs about 3 times as fast as before. The results show that the accuracy required for detection is met. The method can serve as a useful reference for similar applications.

1. Introduction

Moving target detection separates valuable moving targets from the background [1, 2]. It is a very important topic in pattern recognition and computer vision, and it is widely used in other fields such as artificial intelligence, security surveillance, human-computer interaction, and intelligent machines, which has made it a hot topic in recent years.

At present, the three main methods of moving target detection are the optical flow method [3], background difference [4], and interframe difference [5]. The optical flow method is often used when no prior information about the scene is available, and it can detect independently moving targets. However, it is time-consuming and can hardly meet real-time requirements, so background difference and interframe difference are usually used instead. Background difference provides relatively complete feature data and extracts the target accurately, but it is sensitive to changes in the external environment such as lighting. Interframe difference is insensitive to such changes. However, when the target surface contains a large region of uniform gray level, holes appear inside the target area, so that the target may be split into multiple regions.

Interframe difference is simple and insensitive to lighting changes in the scene; it adapts to dynamic environments and has good stability. Although the algorithm is simple, every pixel of each frame has to be differenced, which reduces the running speed. In this paper, the Pyramid algorithm is used to compress the frame data of the video sequence so that each frame contains a quarter of the original number of pixels. This accelerates the algorithm and improves the performance of interframe difference.

The moving object detection system is implemented on the TMS320DM643 digital signal processor [6] as the hardware platform. The feasibility of the algorithm is verified through the target tracking results.

Because background difference is sensitive to changes in the external environment such as lighting, the background is updated in this paper by combining interframe difference with background difference. In addition, a large number of false moving targets are eliminated using mathematical morphology [7], so that the foreground of the moving objects is acquired accurately.

2. The Outline of System

2.1. Hardware Platform

The platform contains a CCD camera, a decoder (TVP5150) [8], a video processor (DM643), an encoder (SAA7105) [9], an LCD monitor, and other peripherals. The architecture of the system is shown in Figure 1.

Real-time video information is captured by CCD camera. The information is transmitted to the decoder (TVP5150) through the coaxial cable. And then, the processor (DM643) [10] implements the function of identifying moving targets. At last, the real-time results after encoding (SAA7105) can be displayed on the LCD monitor.

The DM643 processor is based on the second-generation high-performance, advanced VelociTI very-long-instruction-word (VLIW) architecture (VelociTI.2) developed by Texas Instruments (TI), making these DSPs an excellent choice for digital media applications. The DM643 DSP possesses the operational flexibility of high-speed controllers and the numerical capability of array processors. The core processor has 64 general-purpose registers of 32 bit word length and eight highly independent functional units: two multipliers for a 32 bit result and six arithmetic logic units (ALUs). These features make the DM643 particularly suitable for digital video processing applications. The models of the other devices are as follows: camera model: PNT-696; the specification of image acquisition:  cm ( in); monitor model: 7 inch TFT LCD with the resolution: ; and the output format: PAL. The system hardware platform is shown in Figure 2.

2.2. The Software Structure

The software flow chart of the system is shown in Figure 3. First, the system performs initialization, which mainly includes CSL and BIOS [11] initialization, setting the cache to 64 K, mapping the cache to the CE0 and CE1 spaces, and setting the DMA priority sequence and length. Second, the RF5 module is initialized, which initializes the SCOM module that transfers messages to the internal units [12]. Third, the task module is initialized; it allocates and manages the storage space of the tasks. After that, the real-time video image is captured from the input device by the TSK-input thread. Here, the video sequence is formatted in the 4 : 2 : 2 compressed interleaved mode by the FVID_exchange function [13]. Next, the acquisition-ending message from the input device is sent to the output device; when the video output task detects this message, the TSK-process thread, which processes the video data with the detection algorithm, is activated. After processing, the video data are written to the output device by the TSK-output thread, and the FVID_exchange function is invoked to switch on the display. Finally, the output task sends the ending message to the video acquisition task, and the system returns to a waiting state until the next acquisition-ending message.
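The hand-off between the three tasks can be pictured with ordinary threads and queues. The following is only an analogy in Python, not RF5 or DSP/BIOS code: the queue names, the `run_pipeline` helper, and the placeholder processing step are all hypothetical, and the message flow is simplified to a linear chain in which the output task releases the input task for the next frame.

```python
import queue
import threading


def run_pipeline(frames):
    """Pass each frame through input -> process -> output, one at a time."""
    to_process = queue.Queue()   # input task -> processing task
    to_output = queue.Queue()    # processing task -> output task
    frame_done = queue.Queue()   # output task -> input task ("ending message")
    displayed = []

    def tsk_input():
        for frame in frames:
            to_process.put(frame)
            frame_done.get()     # wait until the output task has finished
        to_process.put(None)     # sentinel: no more frames

    def tsk_process():
        while (frame := to_process.get()) is not None:
            to_output.put(frame.upper())  # placeholder for the detection step
        to_output.put(None)

    def tsk_output():
        while (result := to_output.get()) is not None:
            displayed.append(result)
            frame_done.put(True)  # release the input task for the next frame

    threads = [threading.Thread(target=task)
               for task in (tsk_input, tsk_process, tsk_output)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return displayed


print(run_pipeline(["frame-1", "frame-2"]))  # ['FRAME-1', 'FRAME-2']
```

Because the input task blocks until the output task reports completion, only one frame is in flight at a time, which mirrors the waiting state described in the text.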

The data processing task mainly includes five modules: interframe difference, background difference, image denoising, object detecting, and object tracking. The flow chart of the data processing task is shown in Figure 4. The interframe difference module detects moving objects and provides information for background updating. The background difference module extracts the complete area of the objects. The image denoising module removes noise from the binary image by mathematical morphology. The object detecting and tracking modules mark the objects in a timely manner.

3. Improved Interframe Difference

Although the existing interframe difference is simple, every pixel of each frame has to be differenced, which reduces the running speed. In this paper, the Pyramid algorithm is used to compress each frame of the video sequence so that each frame contains a quarter of the original number of pixels. This accelerates the algorithm and improves the performance of interframe difference.

3.1. The Existing Interframe Difference

The method of interframe difference is mainly to detect and extract target by the difference between two successive frames in the video sequence [14, 15]. The basic process of interframe difference is shown in Figure 5.

The algorithm is expressed as

$$D_k(x, y) = \left| f_k(x, y) - f_{k-1}(x, y) \right|, \tag{1}$$

$$R_k(x, y) = \begin{cases} 0, & D_k(x, y) > T, \\ 255, & \text{otherwise}, \end{cases} \tag{2}$$

where $f_{k-1}$ and $f_k$ indicate two successive images of the sequence, $x$ and $y$ are the coordinates, $k$ is the frame number, $D_k$ is the difference image, $R_k$ is the binary image, and $T$ is the threshold.

First, (1) is employed to calculate the difference image between the $(k-1)$th frame and the $k$th frame. Then, according to (2), if a pixel grayscale value in $D_k$ is greater than the given threshold $T$, the pixel belongs to the foreground and is set to 0 in the binary image $R_k$. Otherwise, the pixel is considered a background pixel and is set to 255 in $R_k$.
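The two steps above (absolute difference, then thresholding) can be sketched in a few lines. This is a minimal illustration in plain Python, assuming grayscale frames stored as lists of rows; the function name `frame_difference` and the sample values are hypothetical, while the 0/255 convention follows the text.

```python
def frame_difference(prev_frame, curr_frame, threshold):
    """Return (difference image, binary image) for two equally sized frames."""
    diff = [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]
    # Convention from the text: foreground -> 0, background -> 255.
    binary = [[0 if d > threshold else 255 for d in row] for row in diff]
    return diff, binary


prev_frame = [[10, 10, 10],
              [10, 10, 10]]
curr_frame = [[10, 80, 10],
              [12, 10, 90]]
diff, binary = frame_difference(prev_frame, curr_frame, threshold=20)
print(diff)    # [[0, 70, 0], [2, 0, 80]]
print(binary)  # [[255, 0, 255], [255, 255, 0]]
```

The small change of 2 gray levels stays below the threshold and is treated as background, which is how the threshold suppresses sensor noise.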

3.2. Interframe Difference Based on Pyramid Algorithm

If all pixels of each frame are differenced, the computational complexity of the algorithm is high. However, an image usually contains a great deal of redundant information. If a reasonable compression algorithm is applied and the compressed image is differenced instead of the original image, the computational complexity can be reduced considerably.

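Here, the Pyramid algorithm is used to compress the image data. The algorithm is expressed as

$$g_k(i, j) = \frac{1}{4}\left[ f_k(2i, 2j) + f_k(2i+1, 2j) + f_k(2i, 2j+1) + f_k(2i+1, 2j+1) \right],$$

$$D_k(i, j) = \left| g_k(i, j) - g_{k-1}(i, j) \right|,$$

where the pixels in the original image are described by $f_k(2i, 2j)$, $f_k(2i+1, 2j)$, $f_k(2i, 2j+1)$, and $f_k(2i+1, 2j+1)$; $g_k(i, j)$ indicates the pixel after compression; and $D_k$ is the difference image. According to the algorithm, an image with the size of $M \times N$ can be compressed to the size of $M/2 \times N/2$.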
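As a rough illustration of the compression step, the sketch below halves each dimension by averaging non-overlapping 2 x 2 blocks, so the pixel count drops to a quarter. The averaging kernel is an assumption (it is one common pyramid reduction, and the paper's exact kernel may differ), and the name `pyramid_downsample` is hypothetical.

```python
def pyramid_downsample(frame):
    """Halve width and height by averaging each non-overlapping 2x2 block."""
    rows, cols = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * i][2 * j] + frame[2 * i][2 * j + 1]
              + frame[2 * i + 1][2 * j] + frame[2 * i + 1][2 * j + 1]) // 4
             for j in range(cols)]
            for i in range(rows)]


frame = [[10, 10, 20, 20],
         [10, 10, 20, 20],
         [30, 30, 40, 40],
         [30, 30, 40, 40]]
small = pyramid_downsample(frame)
print(small)  # [[10, 20], [30, 40]]
```

Differencing two such reduced frames touches a quarter as many pixels as differencing the originals, which is the source of the saving described above.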

3.3. Comparison and Analysis

The number of clock cycles required by each of the two algorithms to process one frame is measured. Under the same conditions, interframe difference with the Pyramid algorithm takes about 31198334 cycles, while the existing method takes about 100642094 cycles. Interframe difference with the Pyramid algorithm is therefore about 3.2 times as fast as the existing method (100642094 / 31198334 ≈ 3.23).

3.4. Dynamic Threshold Calculation

If the gray threshold in interframe difference and background difference is fixed, it has to be adjusted manually whenever the environment or scene changes; otherwise, these changes degrade the result of moving object detection. This problem can be solved by the OTSU algorithm [16], which is adopted to determine the threshold adaptively. The algorithm is based on maximizing the variance between two clusters. The process of the algorithm is as follows.

First, set an initial threshold $t$ to separate the foreground image from the background image. Let $w_0$ be the proportion of foreground pixels in the total number of pixels, $u_0$ the average gray value of the foreground image, $w_1$ the proportion of background pixels in the total number of pixels, $u_1$ the average gray value of the background image, $u$ the total average gray value of the image, and $g$ the variance between the two clusters. Consider

$$u = w_0 u_0 + w_1 u_1,$$

$$g(t) = w_0 (u_0 - u)^2 + w_1 (u_1 - u)^2,$$

$$T = \arg\max_{0 \le t \le L} g(t),$$

where $w_0$, $u_0$, $w_1$, and $u_1$ are computed for each candidate threshold $t$ from the image $f(x, y)$ of size $M \times N$, $x$ and $y$ are the coordinates, $0 \le t \le L$, $L$ is the maximum grayscale value, and $T$ is the obtained dynamic threshold.
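The search for the threshold that maximizes the between-cluster variance can be sketched as follows. This is a generic histogram-based Otsu implementation in plain Python, given as an illustration rather than as the authors' DSP code; the name `otsu_threshold`, the 256-level default, and the use of the equivalent form w_low * w_high * (mean_low - mean_high)^2 for the variance are assumptions of this sketch.

```python
def otsu_threshold(image, levels=256):
    """Return the gray level t that maximizes the between-cluster variance.

    Pixels with value <= t form one cluster, pixels with value > t the other.
    """
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    count_low, sum_low = 0, 0
    for t in range(levels - 1):
        count_low += hist[t]
        sum_low += t * hist[t]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue  # one cluster is empty, so the variance is undefined
        w_low, w_high = count_low / total, count_high / total
        mean_low = sum_low / count_low
        mean_high = (total_sum - sum_low) / count_high
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


image = [[10, 10, 200, 200],
         [10, 200, 10, 200]]
t = otsu_threshold(image)
print(t)  # 10: every pixel above 10 falls into the bright cluster
```

When several thresholds give the same variance, as in this two-valued example, the sketch keeps the lowest one; any value from 10 to 199 separates the two clusters equally well.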

4. Improved Background Difference

In order to meet the real-time processing demand of a video monitoring system, this paper combines the interframe difference algorithm with the background difference algorithm to update the background, and an improved background difference is used to extract the motion region. This addresses the problem that the background cannot be updated automatically in a dynamic scene. The algorithm mainly includes three steps, background extraction, motion region extraction, and background updating, which are as follows.

(1) Background Extraction. The initial background image is extracted by averaging the pixels of the first 100 frames of the video sequence. Let the initial background be $B_0(x, y)$, where $x$ and $y$ are the coordinates, as shown in

$$B_0(x, y) = \frac{1}{100} \sum_{k=1}^{100} f_k(x, y),$$

where $f_k(x, y)$ is the video sequence formed by the first 100 frames and $k$ is the frame number, $k = 1, 2, \ldots, 100$.

(2) Motion Region Extraction. After background extraction, the difference between every frame and the background is computed to obtain the motion region in the current frame. Motion region extraction includes two steps: differencing and binarization of the difference image. The output is a binary image whose grayscale value is 255 (indicating the background area) or 0 (indicating the motion region). Let $B_k(x, y)$ be the background image; the change measure can be written as

$$D_k(x, y) = \left| f_k(x, y) - B_k(x, y) \right|,$$

$$R_k(x, y) = \begin{cases} 0, & D_k(x, y) > T, \\ 255, & \text{otherwise}, \end{cases}$$

where $f_k(x, y)$ is the $k$th frame and the OTSU algorithm is adopted to determine the threshold $T$ adaptively.

(3) Background Updating. First, the binary image $M_k$ of the foreground is acquired by subtracting the current frame from the previous frame based on interframe difference. Second, let $M_k(x, y)$ be a pixel of $M_k$. If the value of $M_k(x, y)$ is 0, the background is updated; otherwise, the background keeps its original value. The variable $\alpha$ is the updating speed. Background updating can be expressed as

$$B_{k+1}(x, y) = \begin{cases} (1 - \alpha) B_k(x, y) + \alpha f_k(x, y), & M_k(x, y) = 0, \\ B_k(x, y), & \text{otherwise}. \end{cases}$$

(4) If the value of all pixels in the binary image of the foreground is 0, the program ends; otherwise, the updated background is assigned to $B_k$. Go back to step (2).
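The background extraction and updating steps can be sketched as below. The text states only that the initial background is the pixel-wise mean of the first 100 frames and that pixels whose interframe binary value is 0 are updated at some updating speed; the linear blend used here, the parameter name `alpha`, and both function names are assumptions made for illustration.

```python
def initial_background(frames):
    """Pixel-wise mean of the given frames (the text uses the first 100)."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(frame[y][x] for frame in frames) / n for x in range(cols)]
            for y in range(rows)]


def update_background(background, frame, mask, alpha=0.1):
    """Blend the frame into the background where mask == 0 (assumed rule).

    Pixels whose mask value is not 0 keep their previous background value.
    """
    return [[(1 - alpha) * b + alpha * f if m == 0 else b
             for b, f, m in zip(b_row, f_row, m_row)]
            for b_row, f_row, m_row in zip(background, frame, mask)]


frames = [[[10, 20]], [[30, 40]]]          # two tiny 1x2 frames
background = initial_background(frames)
print(background)                          # [[20.0, 30.0]]

updated = update_background(background, [[40, 50]], [[0, 255]], alpha=0.5)
print(updated)                             # [[30.0, 30.0]]
```

A larger `alpha` makes the background follow the scene more quickly, at the cost of absorbing slow-moving objects sooner.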

5. Image Denoising

Because of noise, some pixels that belong to the background may be misclassified as foreground pixels; conversely, some pixels that belong to the foreground may be mistaken for background pixels. In addition, because of the slight flutter of objects, some background pixels may be falsely identified as pixels of a moving target. To counter these effects, the binary image should be processed before the moving target area is detected. In this paper, the mathematical morphology method [17] is selected to denoise the binary image.

Mathematical morphology analyses an image based on the morphology of its regions. The main purpose of the method is to obtain the topology and structure information of objects through the interaction between the objects and the structural elements. The basic operations of mathematical morphology are dilation, erosion, opening, and closing. The basic idea is shown in Figure 6.

In this paper, the binary image is denoised using the erosion operation, where the structure element is . The effects of the method are shown in Figure 7.
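To make the effect of erosion concrete, the sketch below erodes the foreground of a binary image with a square structuring element. The 3 x 3 size is an assumption chosen for illustration, as are the function name `erode` and the treatment of pixels outside the image as background; the 0 (foreground) and 255 (background) values follow the binarization convention used earlier in the paper.

```python
def erode(binary, fg=0, bg=255, size=3):
    """Keep a foreground pixel only if its whole size x size window is foreground."""
    rows, cols = len(binary), len(binary[0])
    r = size // 2

    def window_is_foreground(y, x):
        # Pixels outside the image are treated as background.
        return all(
            0 <= y + dy < rows and 0 <= x + dx < cols
            and binary[y + dy][x + dx] == fg
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
        )

    return [[fg if window_is_foreground(y, x) else bg for x in range(cols)]
            for y in range(rows)]


B = 255
noisy = [[0, B, B, B, B],   # isolated noise pixel in the corner
         [B, 0, 0, 0, B],   # 3x3 target in the middle
         [B, 0, 0, 0, B],
         [B, 0, 0, 0, B],
         [B, B, B, B, B]]
for row in erode(noisy):
    print(row)
# Only the centre pixel (row 2, column 2) remains 0.
```

Erosion removes the isolated noise pixel but also shrinks the target, which is acceptable here because the target is later marked with a frame rather than segmented exactly.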

6. Multiobject Detecting and Tracking

After denoising, the targets are detected and tracked. Many methods exist for this purpose. The existing methods and the proposed method are introduced below.

6.1. The Existing Methods

The existing method of detecting and tracking targets is as follows.

(1) Scan the binary image line by line. Pixels that the connectivity analysis algorithm finds to belong to the same connected region are marked with the same number; pixels of different regions are marked with different numbers.

(2) Find the maximum and minimum values of the $x$-coordinate and $y$-coordinate in each target region marked with the same number.

(3) The maximum and minimum values of the $y$-coordinate determine the top and bottom positions of a frame, and the maximum and minimum values of the $x$-coordinate determine its left and right positions. Targets are marked by this frame.
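The three steps of the existing method can be sketched with a flood fill that labels each connected region and records its coordinate extremes. This is one possible realization, not a specific published implementation: 4-connectivity, the foreground value 0, and the name `bounding_boxes` are assumptions of the sketch.

```python
from collections import deque


def bounding_boxes(binary, fg=0):
    """Return one (left, top, right, bottom) box per 4-connected foreground region."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y in range(rows):
        for x in range(cols):
            if binary[y][x] != fg or seen[y][x]:
                continue
            # New region: flood-fill it while tracking min/max coordinates.
            left = right = x
            top = bottom = y
            pending = deque([(y, x)])
            seen[y][x] = True
            while pending:
                cy, cx = pending.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and binary[ny][nx] == fg):
                        seen[ny][nx] = True
                        pending.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


B = 255
binary = [[0, 0, B, B],
          [0, B, B, B],
          [B, B, B, 0],
          [B, B, 0, 0]]
print(bounding_boxes(binary))  # [(0, 0, 1, 1), (2, 2, 3, 3)]
```

Every foreground pixel is visited and queued once, so the cost grows with the size of the regions; this labeling work is what the proposed method avoids.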

6.2. The Proposed Method

Pixels of the binary image are scanned line by line for targets. When a target pixel is found, the target is marked with a fixed frame. The concrete procedure is as follows.

(1) Scan the binary image line by line. If the value of a pixel is 1, this pixel belongs to a target.

(2) Take this pixel as a reference point. A fixed frame whose size is  will mark the pixel; then, taking the pixel as a starting point, skip 40 pixels to reach the next scanning point.

(3) Repeat steps (1) and (2) until the whole binary image has been scanned.
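A minimal sketch of this scan-and-skip marking is given below. Several details are assumptions, since the text does not fix them: the frame is anchored with the reference pixel at its top-left corner, the skip is applied along the current row only, and `frame_size` is a hypothetical parameter standing in for the fixed frame size. Only the 40-pixel skip and the target value 1 come from the text.

```python
def mark_fixed_frames(binary, fg=1, frame_size=40, skip=40):
    """Return (left, top, right, bottom) frames placed at detected target pixels."""
    boxes = []
    for y, row in enumerate(binary):
        x = 0
        while x < len(row):
            if row[x] == fg:
                # Anchor a fixed-size frame at the reference pixel (assumed top-left).
                boxes.append((x, y, x + frame_size - 1, y + frame_size - 1))
                x += skip  # jump ahead instead of testing neighbouring pixels
            else:
                x += 1
    return boxes


# Tiny example with a 2-pixel frame and a 3-pixel skip so it fits on screen.
binary = [[0, 1, 1, 0, 0, 1],
          [0, 0, 0, 0, 0, 0]]
print(mark_fixed_frames(binary, frame_size=2, skip=3))
# [(1, 0, 2, 1), (5, 0, 6, 1)]
```

No connectivity analysis is performed, so the cost is a single pass over the image; the sketch does not clip frames that extend past the image border.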

6.3. Comparison and Analysis

The proposed method has three advantages over the existing methods.

(1) Denoising. The method uses only the erosion operation of mathematical morphology for denoising, which is easy to implement.

(2) Marking Targets. The connectivity analysis algorithm is omitted. The target is marked based on individual pixels instead of the whole region of a target.

(3) Real-Time Performance. Because the algorithms in this method are simpler and fewer than those in the existing methods, a system using this method has good real-time performance.

7. Results and Analysis

The results of detecting and tracking objects are shown in Figures 8 and 9. The original image sequence, which includes moving targets, is shown in Figure 8. As shown in Figure 9, the moving targets are detected and tracked in real time. This shows that the algorithm presented in this paper is feasible for multiobject detecting and tracking.

While the system is operating, the thread scheduling analysis of the task running sequences on the CPU (DM643) is presented in Figure 10. Each task passes mainly through three running states (“not ready”, “ready,” and “running”), which are indicated by different colors in Figure 10: white means “not ready”, light blue means “ready”, and blue means “running”. KNL-swi, which is created automatically by DSP/BIOS, is a software interrupt used to initialize the task scheduler and the program [18]. KNL-swi has the highest priority. It must be executed before the HWI event is activated. The program includes three threads (“tskOutput”, “tskInput,” and “tskVideoProcess”). Among them, “tskOutput” has the highest priority, “tskInput” has the next highest, and “tskVideoProcess” has the lowest.

8. Conclusion

The results of the study cover the following six aspects.

(1) The DM643 processor is chosen as the platform. This processor, with a VLIW architecture based on the VelociTI technology, is suitable for digital media applications.

(2) The video processing algorithm is implemented on RF5. System tasks are scheduled automatically by the RF5 framework. The framework lets the three threads (TSK-input, TSK-output, and TSK-process) synchronize and exchange messages through the SCOM message queue, which makes their cooperation more convenient and efficient.

(3) The interframe difference algorithm with the Pyramid algorithm lowers the computational complexity of detecting and tracking moving targets. It allows the system to meet the application demands of a dynamic environment with high stability.

(4) The improved background difference algorithm is implemented by combining interframe difference with background difference. It updates the background and overcomes the shortcoming that the existing background difference is sensitive to environmental change.

(5) Noise in the binary foreground image is removed by mathematical morphology. This algorithm is simple and fast, which helps the real-time performance of the system.

(6) Multiple targets are marked by a fixed frame. Although the fixed frame cannot adapt to changes in the size of the target, this method has great advantages in real-time performance.

Conflict of Interests

The authors declare that they have no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is sponsored by the National Natural Science Foundation of China (31171775, 11201120, 61201389, and 61174056) and the National High Technology Research and Development Program of China (2012AA101608). Research project was supported by the Key Laboratory of Grain Information Processing and Control.