Mathematical Problems in Engineering

Volume 2015 (2015), Article ID 376494, 12 pages

http://dx.doi.org/10.1155/2015/376494

## Robust Object Tracking Based on Simplified Codebook Masked Camshift Algorithm

^{1}Information Research Institute, Shandong Academy of Sciences, Jinan 250014, China
^{2}Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
^{3}School of Information Science and Engineering, Shandong University, Jinan 250100, China

Received 26 January 2015; Revised 5 June 2015; Accepted 10 June 2015

Academic Editor: Fernando Torres

Copyright © 2015 Yuanyuan Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Moving target detection and tracking is a basic and important issue in intelligent video surveillance. This paper simplifies the classical Codebook algorithm by introducing the average intensity into the Codebook model in place of the original minimal and maximal intensities, and proposes a hierarchical matching method between the current pixel and a codeword that treats the high and low intensity regions differently according to the average intensity. Building on the simplified Codebook algorithm, the paper then proposes a robust object tracking algorithm called the Simplified Codebook Masked Camshift (SCMC) algorithm, which combines the simplified Codebook algorithm with the Camshift algorithm. It is designed to overcome the sensitivity of the traditional Camshift algorithm to background color interference: the simplified Codebook algorithm detects moving objects, its result is used to mask the color probability distribution image, and Camshift then predicts the centroid and size of the objects on the masked image. Experimental results show that the proposed simplified Codebook algorithm simultaneously improves detection accuracy and computational efficiency, and that the SCMC algorithm significantly reduces the possibility of false convergence and achieves a higher correct tracking rate than the traditional Camshift algorithm.

#### 1. Introduction

Moving object detection and tracking is the basis of object recognition and behavior understanding and has very broad application and research prospects. Object detection algorithms fall into three main categories: interframe difference methods [1], optical flow methods [2], and background subtraction methods. Background subtraction methods are the most popular in real-world applications because of their high detection accuracy and moderate computational complexity. Classical background subtraction algorithms include kernel density estimation [3], Gaussian mixture background modeling [4], and Codebook background modeling [5].

The Codebook algorithm was first proposed in 2004 by Kim et al. [5] and has become one of the most advanced motion detection methods because of its high memory utilization, high computational efficiency, and strong robustness. Many improvements have been built on it. For example, Wu and Peng [6] proposed a modified Codebook algorithm based on spatiotemporal context, which improves detection accuracy by exploiting the correlation of spatiotemporally neighboring pixels; however, it also increases the computational complexity of the whole algorithm. Tu et al. [7] accelerated computation by introducing a box-based Codebook model in RGB space to represent the matching field of the codewords, but this simplification decreases detection accuracy. Most improvements to Codebook thus raise either the detection accuracy or the computational efficiency, but not both.

Camshift is a classical object tracking algorithm that evolved from the Mean Shift algorithm. It tracks an object using its color information and offers very good real-time performance and high robustness. The Mean Shift algorithm was first proposed in 1975 by Fukunaga and Hostetler [8]. Cheng [9] extended the algorithm and enlarged its range of application, after which Comaniciu and Meer [10] successfully applied it to image segmentation and object tracking. Bradski [11] built the Camshift algorithm on Mean Shift; it can not only predict the centroid position of an object but also adaptively alter the size of the object frame. Current improvements to the Camshift algorithm address the following aspects: improving accuracy through better histogram features [12–14], reducing computation time by increasing the convergence velocity [15, 16], increasing robustness to object rotation [17], and handling background color interference. The improvement in this paper concentrates on the issue of background color interference. In the literature, the combined Camshift and Kalman filter algorithms [18–20] fail easily when the object motion is nonlinear, and the tracking accuracy of combined Camshift and interframe difference algorithms [21, 22] suffers when the underlying interframe difference motion detector performs poorly.

To simultaneously improve the detection accuracy and computational efficiency of the Codebook algorithm, this paper first proposes a simplified Codebook algorithm, called the hierarchical matching 5-tuple-based Codebook algorithm, which modifies the original 6-tuple Codebook algorithm: the average intensity replaces the minimal and maximal intensities as a variable in the Codebook model, and different matching methods between the current pixel and a codeword are adopted in the high and low intensity regions, respectively. Based on the simplified Codebook algorithm, the paper then proposes a concise and robust object tracking algorithm called the Simplified Codebook Masked Camshift (SCMC) algorithm, which combines the simplified Codebook algorithm with the Camshift algorithm. Similar work by Wang [23] uses the results of Codebook moving object detection to mask the manually initialized search box; our experimental results show that better tracking performance is obtained when the color probability distribution images themselves are masked by the simplified Codebook result.

#### 2. Simplified Codebook Algorithm

Compared with the original Codebook algorithm [5], our simplified Codebook algorithm makes two improvements. First, the maximum and minimum brightness in the codeword model are replaced by the average brightness, which simplifies the codeword model and increases the computation speed. Second, different processing methods for the high and low brightness regions are applied when matching the current pixel against a codeword, which improves detection accuracy and reduces the probability of false detection in the low brightness region. We call the resulting algorithm the hierarchical matching 5-tuple-based Codebook algorithm.

This section presents how the proposed simplified Codebook algorithm detects moving objects. First, we show the process of building a codebook for a single pixel; repeating the same process for every pixel completes the detection for a whole image.

##### 2.1. Initialization

We build a codebook $\mathcal{C} = \{c_1, c_2, \ldots, c_L\}$ containing several codewords for every pixel, where $L$ is the number of codewords. The $i$th codeword $c_i$ consists of two parts: an RGB vector $\mathbf{v}_i = (\bar{R}_i, \bar{G}_i, \bar{B}_i)$ and a 5-tuple $\mathrm{aux}_i = \langle \bar{I}_i, f_i, \lambda_i, p_i, q_i \rangle$. The 5-tuple is composed of the average brightness $\bar{I}_i$, the codeword access frequency $f_i$, the maximal nonrepeatable time interval $\lambda_i$, the time of the initial codeword access $p_i$, and the time of the most recent codeword access $q_i$. Except for the maximum and minimum brightness being replaced by the average brightness $\bar{I}_i$, all elements remain the same as in the original Codebook algorithm.
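As a concrete illustration, the per-pixel codeword could be represented as a small Python data class; the field names below are ours, not the paper's, and brightness is taken as the RGB vector norm as in the original Codebook model:

```python
import math
from dataclasses import dataclass

@dataclass
class Codeword:
    """One codeword of the simplified 5-tuple Codebook model."""
    rgb: tuple      # learned RGB vector (R, G, B)
    avg_i: float    # average brightness (replaces min/max brightness)
    freq: int       # how many training pixels matched this codeword
    mnrl: int       # maximal nonrepeatable time interval (lambda)
    first_t: int    # frame index of the first access
    last_t: int     # frame index of the most recent access

def brightness(rgb):
    """Brightness as the vector norm, as in the original Codebook model."""
    r, g, b = rgb
    return math.sqrt(r * r + g * g + b * b)

def new_codeword(x_t, t):
    """Codeword created from the first observation x_t at frame t."""
    return Codeword(rgb=x_t, avg_i=brightness(x_t), freq=1,
                    mnrl=t - 1, first_t=t, last_t=t)
```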

##### 2.2. Training Background Model

Assume the first $N$ frames of the video are used to train the background model. For a particular pixel, the sequence of training pixel values is $X = \{x_1, x_2, \ldots, x_N\}$, where each element $x_t = (R_t, G_t, B_t)$ is the RGB vector extracted from the $t$th image frame. Taking this pixel as an example, the codebook training process is as follows.

(1) Build a codebook: we build a codebook $\mathcal{C}$ for the pixel and initialize it with an empty set (let $L = 0$).

(2) Train the codebook: the following steps are executed in a loop while $t$ changes from 1 to $N$.

(a) A new pixel value $x_t$ is read from the sequence $X$. Its brightness is calculated through $I_t = \sqrt{R_t^2 + G_t^2 + B_t^2}$.

(b) Match the pixel value against the codebook: find a codeword $c_m$ in $\mathcal{C}$ matching $x_t$ based on the following two conditions.

Color distortion:
$$\mathrm{colordist}(x_t, \mathbf{v}_m) = \sqrt{\|x_t\|^2 - p^2} \le \varepsilon,$$
where $\mathbf{v}_m$ is the RGB vector of the $m$th codeword, $\varepsilon$ is the threshold of color distortion matching, and $p$ is the projection of $x_t$ on $\mathbf{v}_m$, which can be calculated by
$$p = \frac{\langle x_t, \mathbf{v}_m \rangle}{\|\mathbf{v}_m\|}.$$

Brightness:
$$I_{\mathrm{low}} \le I_t \le I_{\mathrm{hi}},$$
where $\bar{I}_m$ is the average brightness of the $m$th codeword and $I_{\mathrm{hi}}$ and $I_{\mathrm{low}}$ are the upper and lower bounds of the brightness matching scope. $\varepsilon$, $I_{\mathrm{hi}}$, and $I_{\mathrm{low}}$ can be calculated by the following formulas:
$$\varepsilon = \begin{cases} \gamma \bar{I}_m, & I_t \ge I_s, \\ \varepsilon_{\mathrm{low}}, & I_t < I_s, \end{cases} \qquad
I_{\mathrm{hi}} = \begin{cases} \alpha \bar{I}_m, & I_t \ge I_s, \\ \bar{I}_m + \delta, & I_t < I_s, \end{cases} \qquad
I_{\mathrm{low}} = \begin{cases} \beta \bar{I}_m, & I_t \ge I_s, \\ \bar{I}_m - \delta, & I_t < I_s, \end{cases}$$
where $I_s$ is the threshold that determines whether the current pixel belongs to the low brightness region, $\bar{I}_m$ is the average brightness of the $m$th codeword, $\gamma$ is a variable used to calculate the threshold of color distortion matching, whose value is between 0 and 1, $\varepsilon_{\mathrm{low}}$ is a constant threshold of color distortion matching in the low brightness region, $\alpha$ is the ratio of the upper bound of brightness matching to the average brightness in the high brightness region, $\beta$ is the ratio of the lower bound of brightness matching to the average brightness in the high brightness region, and $\delta$ is half of the brightness matching range when the brightness of the current pixel is lower than $I_s$. $\varepsilon_{\mathrm{low}}$ and $\delta$ jointly guarantee that the ranges of color distortion and brightness matching are not too small in the low brightness area, so as to avoid false detections.

(c) If $L = 0$ or there is no matching codeword, let $L \leftarrow L + 1$ and create a new codeword $c_L$, whose color vector is $\mathbf{v}_L = x_t$ and whose 5-tuple is $\mathrm{aux}_L = \langle I_t, 1, t - 1, t, t \rangle$.

(d) Otherwise, if $x_t$ matches $c_m$, update the matched codeword. Update the color vector by
$$\mathbf{v}_m \leftarrow \frac{f_m \mathbf{v}_m + x_t}{f_m + 1}$$
and also update the 5-tuple by
$$\mathrm{aux}_m \leftarrow \left\langle \frac{f_m \bar{I}_m + I_t}{f_m + 1},\; f_m + 1,\; \max\{\lambda_m,\, t - q_m\},\; p_m,\; t \right\rangle.$$

(3) Regulate $\lambda_i$ for every codeword $c_i$, and let
$$\lambda_i \leftarrow \max\{\lambda_i,\; (N - q_i) + (p_i - 1)\}.$$

(4) Delete the nonbackground codewords. Assume the probability of background occurrence is larger than 50%. Let $\mathcal{M}$ denote the background model, which is the codebook after this temporal filtering step:
$$\mathcal{M} = \{c_i \mid c_i \in \mathcal{C},\; \lambda_i \le T_M\},$$
where generally $T_M = N/2$. $\mathcal{M}$ is the codeword set describing the background, $c_i$ is the $i$th codeword in $\mathcal{M}$, and $\lambda_i$ is the maximal nonrepeatable time interval in its 5-tuple.
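The two matching conditions with their hierarchical thresholds might be coded as follows. The parameter values are illustrative assumptions for the sketch; the paper tunes such thresholds experimentally and does not fix them to these numbers:

```python
import math

# Illustrative parameter values (assumptions, not taken from the paper):
I_S = 60.0       # brightness threshold separating low/high brightness regions
GAMMA = 0.2      # color-distortion factor in the high brightness region
EPS_LOW = 20.0   # constant color-distortion threshold, low brightness region
ALPHA = 1.2      # upper brightness bound ratio in the high brightness region
BETA = 0.8       # lower brightness bound ratio in the high brightness region
DELTA = 15.0     # half brightness matching range in the low brightness region

def color_distortion(x, v):
    """Distance from pixel x to the line through the origin and codeword
    vector v (the colordist measure of the Codebook model)."""
    xx = sum(c * c for c in x)
    vv = sum(c * c for c in v)
    xv = sum(a * b for a, b in zip(x, v))
    return math.sqrt(max(xx - xv * xv / vv, 0.0))

def matches(x, v, avg_i):
    """Hierarchical match of pixel x against a codeword with RGB vector v
    and stored average brightness avg_i."""
    i_t = math.sqrt(sum(c * c for c in x))   # brightness of the current pixel
    if i_t >= I_S:                           # high brightness region
        eps, hi, low = GAMMA * avg_i, ALPHA * avg_i, BETA * avg_i
    else:                                    # low brightness region: fixed ranges
        eps, hi, low = EPS_LOW, avg_i + DELTA, avg_i - DELTA
    return color_distortion(x, v) <= eps and low <= i_t <= hi
```

For example, a slightly perturbed pixel such as `(102, 101, 99)` matches a codeword stored as `(100, 100, 100)`, while a strongly off-color pixel fails the color distortion test.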

##### 2.3. Foreground Detection

We match the current pixel against the codewords of the background model using the same method as in codebook training. If a match exists, we update the matched codeword and classify the current pixel as background; otherwise, we classify it as foreground.
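The detection rule can be sketched as a single per-pixel test that also performs the codeword update on a match. For brevity this sketch uses flat thresholds rather than the hierarchical ones of Section 2.2, which plug in the same way; the thresholds and the list-based codeword layout are our assumptions:

```python
import math

def brightness(x):
    return math.sqrt(sum(c * c for c in x))

def is_background(x, background_model, eps=20.0, delta=15.0):
    """Pixel x is background iff it matches some codeword of the model.
    Each codeword is a mutable list [rgb_vector, avg_brightness, freq];
    a matched codeword is updated exactly as during training."""
    i_t = brightness(x)
    for cw in background_model:
        v, avg_i, f = cw
        # color distortion: distance from x to the line spanned by v
        xx = sum(c * c for c in x)
        vv = sum(c * c for c in v)
        xv = sum(a * b for a, b in zip(x, v))
        dist = math.sqrt(max(xx - xv * xv / vv, 0.0))
        if dist <= eps and avg_i - delta <= i_t <= avg_i + delta:
            # update the matched codeword with running averages
            cw[0] = tuple((f * a + b) / (f + 1) for a, b in zip(v, x))
            cw[1] = (f * avg_i + i_t) / (f + 1)
            cw[2] = f + 1
            return True
    return False     # no matching codeword: foreground
```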

#### 3. Classical Camshift Algorithm

The original Camshift algorithm [11] takes the color histogram of an object as its characteristic model. Video frames are converted into color probability distribution images, on which the centroid of the object is searched for and the size of the object box is predicted.
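The conversion from a frame to a color probability distribution image is a histogram back-projection; a minimal hue-only sketch follows, where the bin count and the OpenCV-style 0–179 hue range are illustrative assumptions:

```python
def hue_histogram(hues, bins=16):
    """Normalized histogram of hue values (assumed in the range 0..179)."""
    hist = [0.0] * bins
    for h in hues:
        hist[h * bins // 180] += 1.0
    total = sum(hist) or 1.0
    return [v / total for v in hist]

def backproject(image_hues, hist, bins=16):
    """Replace every hue with the histogram probability of its bin,
    yielding the color probability distribution image."""
    return [[hist[h * bins // 180] for h in row] for row in image_hues]
```

Pixels whose hue is common inside the object box receive high probability, which is what the subsequent Mean Shift search climbs.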

The implementation of the classical Camshift algorithm can be depicted as follows:
(1) Initialize the position of the centroid and the size of the bounding box of the object.
(2) Compute the color histogram of the bounding box.
(3) Compute the color probability distribution image for the current frame.
(4) Predict the position of the centroid with the Mean Shift algorithm.
(5) Predict the size of the bounding box.
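Steps (4) and (5) rest on image moments of the probability image. A minimal pure-Python sketch of one Mean Shift search with the Camshift window rescaling might look like this; the square-window parametrization is an illustrative simplification:

```python
def window_moments(prob, x0, y0, w, h):
    """Zeroth and first moments of a probability image inside a window."""
    m00 = m10 = m01 = 0.0
    for y in range(y0, min(y0 + h, len(prob))):
        for x in range(x0, min(x0 + w, len(prob[0]))):
            p = prob[y][x]
            m00 += p
            m10 += p * x
            m01 += p * y
    return m00, m10, m01

def camshift_step(prob, x0, y0, w, h, iters=10):
    """Mean Shift: move the window onto the centroid until convergence,
    then rescale the window from the zeroth moment (Camshift step 5)."""
    for _ in range(iters):
        m00, m10, m01 = window_moments(prob, x0, y0, w, h)
        if m00 == 0:
            break
        cx, cy = m10 / m00, m01 / m00          # centroid of the window
        nx0 = max(int(round(cx - w / 2)), 0)   # recenter the window
        ny0 = max(int(round(cy - h / 2)), 0)
        if (nx0, ny0) == (x0, y0):
            break                              # converged
        x0, y0 = nx0, ny0
    # adapt window size: side proportional to the total probability mass
    s = int(round(2 * (m00 ** 0.5)))
    return x0, y0, max(s, 1), max(s, 1)
```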

The original Camshift algorithm easily converges the bounding box to the object position when there are significant differences between the object color and the background color; in that case, the pixel values of the object area are much higher than those of the background on the color probability distribution image. However, when the object color is similar to the background color, the pixel values of the object area are no longer distinguishable from those of the background on the color probability distribution image, and because the algorithm is driven by color alone, it cannot guarantee that the bounding box converges correctly to the object position. This phenomenon is shown in Figures 1 and 2, respectively.