Abstract

The compressive tracking (CT) algorithm is prone to tracking errors, including loss of the target, under pose variation, illumination change, and occlusion. To address these problems, a novel tracking algorithm that combines angular point matching with compressive tracking (APMCCT) is proposed. A sparse measurement matrix is adopted to extract Haar-like features. The offset of the predicted target position is combined with the result of angular point matching, and the new target position is calculated. Furthermore, the template updating mechanism is optimized. Experiments on different video sequences show that the proposed APMCCT outperforms the CT algorithm in accuracy and robustness and adapts better to pose variation, illumination change, and occlusion.

1. Introduction

Target tracking is an important problem in computer vision, with applications in intelligent transportation, security monitoring, visual navigation, and other civil and military fields. It has been widely studied in recent years and has important practical significance [1–3]. Moving target tracking estimates the motion parameters of an object, such as location, velocity, acceleration, and pose, in consecutive image sequences. Approaches to moving target tracking fall into two categories: (a) detect and identify the target directly in the image sequence without relying on prior knowledge and then locate the target of interest; (b) first build a model from prior knowledge and then find the target accurately and in real time in the subsequent frames [4]. A variety of effective target tracking algorithms have been derived from these two ideas. However, current tracking algorithms still suffer from problems such as tracking drift and tracking error, so further research is needed to improve accuracy, real-time performance, and robustness [5–7].

In recent years, tracking algorithms based on compressive sensing have attracted wide attention [8, 9]. The concept of compressive sensing was proposed by Candes and Donoho in 2006; its main idea is to randomly sample a compressible signal at a rate far below the Nyquist rate and then to reconstruct the original signal within a certain error range through a nonlinear reconstruction algorithm [10, 11]. In [12], Mei and Ling proposed a robust tracking algorithm in the particle filter framework. They treated target tracking as a sparse approximation problem and took the candidate with the minimum projection error, obtained through an $\ell_1$-norm least squares solution, as the tracked target. This method handles occlusion and noise well; however, its heavy computation makes real-time requirements hard to meet. Zhang et al. [13] put forward a simple and effective compressive tracking method. A sparse random measurement matrix is generated under certain relaxed conditions. The original features are compressed with this matrix to obtain a model based on sparse representation, and the target is distinguished from the background by a naive Bayes classifier. However, the compressive tracking algorithm produces tracking errors, and may even lose the target, under target deformation, illumination change, and occlusion.

To address the problems of the method in [13], this paper proposes an algorithm that combines angular point matching with compressive tracking (APMCCT). Firstly, a sparse measurement matrix is adopted to extract Haar-like features, and the classification result of each sample is calculated. Secondly, the target position offset given by the sample with the maximum classifier response is fused with the offset obtained from angular point matching. Finally, to make the APMCCT algorithm more robust, the template updating mechanism is optimized.

2. Compressive Tracking Algorithm

2.1. Feature Extraction Based on Compressive Sensing

In [13], a sparse matrix satisfying the restricted isometry property (RIP) [14] was used to project the original feature space onto a low-dimensional subspace, and the compressed low-dimensional subspace preserves the information of the original high-dimensional feature space. The projection is expressed by

$$v = Rx, \tag{1}$$

where $x \in \mathbb{R}^{m}$ is the original high-dimensional feature, $v \in \mathbb{R}^{n}$ is the compressed low-dimensional feature, $R \in \mathbb{R}^{n \times m}$ is a random measurement matrix, and $n \ll m$. Ideally, the low-dimensional feature fully retains the information of the high-dimensional signal, or at least the distance relationships between samples in the original space.

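To generate the high-dimensional feature, each sample $z \in \mathbb{R}^{w \times h}$ was convolved with a set of rectangle filters $\{h_{1,1}, \ldots, h_{w,h}\}$, defined as

$$h_{i,j}(x, y) = \begin{cases} 1, & 1 \le x \le i,\ 1 \le y \le j, \\ 0, & \text{otherwise}, \end{cases} \tag{2}$$

where $i$ and $j$ are the width and height of a rectangle filter, respectively. Each filtered sample was then represented as a column vector in $\mathbb{R}^{wh}$. Afterwards, these vectors were concatenated into a very high-dimensional feature vector $x = (x_1, \ldots, x_m)^{T} \in \mathbb{R}^{m}$, where $m = (wh)^{2}$. The dimension $m$ is very high, typically on the order of $10^{6}$ to $10^{10}$. In [13], a very sparse random measurement matrix was used; each element of the matrix is defined as

$$r_{ij} = \sqrt{s} \times \begin{cases} 1 & \text{with probability } \frac{1}{2s}, \\ 0 & \text{with probability } 1 - \frac{1}{s}, \\ -1 & \text{with probability } \frac{1}{2s}. \end{cases} \tag{3}$$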

Achlioptas [15] proved that this type of matrix with $s = 2$ or $3$ satisfies the Johnson-Lindenstrauss lemma. Such a sparse matrix is very easy to construct, since it requires only a uniform random generator. The values 1 and −1 occur with the same probability, $1/(2s)$ each. Li et al. [16] showed that for $s = O(m)$, this matrix is asymptotically normal. Therefore, each new feature is a weighted sum of the original features with $r_{ij}$ as the weights; that is, each compressed feature is a weighted sum of rectangular-region features of different sizes.
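As a rough illustration of the projection described in this section, the following sketch draws a sparse matrix whose entries are $\sqrt{s}$ times 1, 0, or −1 and compresses one feature vector. The sizes `n` and `m`, the choice `s = 3`, the random stand-in feature, and the function name are assumptions made for the example; they are not values taken from [13] or from this paper.

```python
import numpy as np

def sparse_measurement_matrix(n, m, s, rng):
    """Draw an n x m matrix with entries sqrt(s) * {+1, 0, -1}.

    +1 and -1 each occur with probability 1/(2s); 0 occurs with
    probability 1 - 1/s.
    """
    probs = [1.0 / (2 * s), 1.0 - 1.0 / s, 1.0 / (2 * s)]
    signs = rng.choice([1.0, 0.0, -1.0], size=(n, m), p=probs)
    return np.sqrt(s) * signs

rng = np.random.default_rng(0)   # fixed seed, for illustration only
n, m, s = 50, 10_000, 3          # hypothetical sizes, not the paper's
R = sparse_measurement_matrix(n, m, s, rng)

x = rng.random(m)                # stand-in for a high-dimensional feature
v = R @ x                        # compressed feature, v = R x
```

With `s = 3`, about two thirds of the entries are zero, so most multiplications in `R @ x` can be skipped in a sparse implementation.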

2.2. Classifier Construction and Update

In the current frame, candidate samples were searched around the previous target location within radius $\gamma$, and the sparse features of the candidate samples were then extracted with the random measurement matrix. Finally, the sample with the maximum classifier response was selected by a naive Bayes classifier. All elements of $v$ were assumed to be independently distributed, and a naive Bayes classifier [17] was established as

$$H(v) = \log\left(\frac{\prod_{i=1}^{n} p(v_i \mid y = 1)\, p(y = 1)}{\prod_{i=1}^{n} p(v_i \mid y = 0)\, p(y = 0)}\right) = \sum_{i=1}^{n} \log\left(\frac{p(v_i \mid y = 1)}{p(v_i \mid y = 0)}\right), \tag{4}$$

where $y = 1$ represents a positive sample and $y = 0$ represents a negative sample. The two prior probabilities are equal, $p(y = 1) = p(y = 0) = 0.5$. Diaconis and Freedman [18] proved that almost all projections of high-dimensional vectors are Gaussian distributed, so the conditional probabilities $p(v_i \mid y = 1)$ and $p(v_i \mid y = 0)$ are also assumed to be Gaussian. Each can therefore be described by a mean and a standard deviation; namely,

$$p(v_i \mid y = 1) \sim N(\mu_i^{1}, \sigma_i^{1}), \qquad p(v_i \mid y = 0) \sim N(\mu_i^{0}, \sigma_i^{0}). \tag{5}$$

Here, $\mu_i^{1}$ and $\sigma_i^{1}$ are the mean and standard deviation of the $i$th feature of the positive samples, and $\mu_i^{0}$ and $\sigma_i^{0}$ are the mean and standard deviation of the $i$th feature of the negative samples.

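The parameters are updated as follows:

$$\mu_i^{1} \leftarrow \lambda \mu_i^{1} + (1 - \lambda)\mu^{1}, \qquad \sigma_i^{1} \leftarrow \sqrt{\lambda (\sigma_i^{1})^{2} + (1 - \lambda)(\sigma^{1})^{2} + \lambda(1 - \lambda)(\mu_i^{1} - \mu^{1})^{2}}, \tag{6}$$

where $\mu^{1}$ and $\sigma^{1}$ are the mean and standard deviation of the $i$th feature over the positive samples of the current frame, $\lambda$ is the learning factor, and $\lambda > 0$; the update is faster when $\lambda$ is smaller. The parameters $\mu_i^{0}$ and $\sigma_i^{0}$ of the negative samples are updated in the same way.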
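The classifier response and the parameter update of this section can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the toy data, the feature dimension, the `eps` guard, and all function names are hypothetical, and the standard deviation is used as the second Gaussian parameter.

```python
import numpy as np

def log_gaussian(x, mu, sigma, eps=1e-12):
    """Log density of N(mu, sigma) evaluated elementwise at x."""
    sigma = np.maximum(sigma, eps)   # guard against a zero deviation
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - (x - mu)**2 / (2.0 * sigma**2)

def classifier_response(v, mu1, sigma1, mu0, sigma0):
    """Sum over features of log(p(v_i | y=1) / p(v_i | y=0))."""
    return np.sum(log_gaussian(v, mu1, sigma1) - log_gaussian(v, mu0, sigma0), axis=-1)

def update_parameters(mu_old, sigma_old, samples, lam):
    """Blend the stored parameters with those of the current samples."""
    mu_new = samples.mean(axis=0)
    sigma_new = samples.std(axis=0)
    mu = lam * mu_old + (1.0 - lam) * mu_new
    sigma = np.sqrt(lam * sigma_old**2 + (1.0 - lam) * sigma_new**2
                    + lam * (1.0 - lam) * (mu_old - mu_new)**2)
    return mu, sigma

# Toy data: 10 compressed features per sample (hypothetical sizes).
rng = np.random.default_rng(0)
n = 10
pos = rng.normal(1.0, 1.0, size=(45, n))    # stand-in positive samples
neg = rng.normal(-1.0, 1.0, size=(50, n))   # stand-in negative samples
mu1, sigma1 = pos.mean(axis=0), pos.std(axis=0)
mu0, sigma0 = neg.mean(axis=0), neg.std(axis=0)

candidates = np.vstack([np.full(n, 1.0), np.full(n, -1.0)])
scores = classifier_response(candidates, mu1, sigma1, mu0, sigma0)
best = int(np.argmax(scores))               # index of the selected candidate
```

The candidate with the largest response is taken as the new target; a response above zero means the positive model explains the sample better than the negative one.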

3. Harris Corner Detection Principle

An angular point (corner) is an important local feature of an image. Such pixels carry rich two-dimensional structural information, and the gray level around them changes strongly in all directions [19, 20]. The Harris corner detector is effective for feature extraction and uses only first-order gray-level differences and filtering, so it is relatively stable and robust to rotation, noise, and viewpoint change [21]. This paper adopts the Harris operator for corner extraction.

The basic idea of Harris corner detection is to place a local window in the image and calculate the change in window energy when the window is shifted in any direction. The formula is as follows:

$$E(u, v) = \sum_{x, y} w(x, y)\,[I(x + u, y + v) - I(x, y)]^{2}, \tag{7}$$

where $(x, y)$ ranges over all points in the local window, $I$ is the image gray level, $E(u, v)$ is the energy change of the image in the local window for a shift $(u, v)$, and $w(x, y)$ is the window weighting function. Usually, the weights at the center of the window are higher than those near the window border. A Gaussian function is often used as the window, which also filters noise.

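Formula (7) in matrix form is

$$E(u, v) \cong [u \;\; v]\, M \begin{bmatrix} u \\ v \end{bmatrix}, \tag{8}$$

where

$$M = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^{2} & I_x I_y \\ I_x I_y & I_y^{2} \end{bmatrix}. \tag{9}$$

$I_x$ and $I_y$ are the derivatives of the image along the $x$- and $y$-axis, respectively. Both eigenvalues of the matrix $M$ are large in regions where corner points exist. After $M$ is calculated by (8) and (9), the corner points in the image are detected through the corner response function

$$R = \det(M) - k\,(\operatorname{trace}(M))^{2}, \tag{10}$$

where $\operatorname{trace}(M)$ and $\det(M)$ are the trace and determinant of the matrix $M$, respectively. The value of $k$ was 0.04. Given a threshold $T$, the corner points are those satisfying $R > T$.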
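The corner response described above can be sketched with NumPy alone. The Gaussian window width, the synthetic test image, and the relative threshold are assumptions chosen for the illustration; only the constant 0.04 comes from the text.

```python
import numpy as np

def gaussian_smooth(a, sigma=1.5, radius=4):
    """Separable Gaussian weighting window (hypothetical sigma and radius)."""
    t = np.arange(-radius, radius + 1)
    g = np.exp(-t**2 / (2.0 * sigma**2))
    g /= g.sum()
    a = np.apply_along_axis(np.convolve, 0, a, g, mode="same")
    return np.apply_along_axis(np.convolve, 1, a, g, mode="same")

def harris_response(img, k=0.04):
    """Corner response det(M) - k * trace(M)^2 at every pixel."""
    iy, ix = np.gradient(img.astype(float))   # first-order differences
    sxx = gaussian_smooth(ix * ix)            # windowed I_x^2
    syy = gaussian_smooth(iy * iy)            # windowed I_y^2
    sxy = gaussian_smooth(ix * iy)            # windowed I_x * I_y
    det = sxx * syy - sxy**2
    trace = sxx + syy
    return det - k * trace**2

# Synthetic image: a bright square on a dark background.
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0

response = harris_response(img)
threshold = 0.1 * response.max()              # hypothetical threshold choice
corners = np.argwhere(response > threshold)   # (row, col) of detected corners
```

On this image the response is positive near the four corners of the square, negative along its edges, and zero in flat regions, which is the behavior the threshold test relies on.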

4. Angular Point Matching Combined with Compressive Tracking

4.1. Angular Point Matching

Region-based local matching methods mainly include the Sum of Absolute Differences (SAD), the Sum of Squared Differences (SSD), and normalized cross correlation (NCC). SAD and SSD select the minimum of the similarity measure, whereas NCC selects the maximum. NCC has the best noise resistance and matching precision of the three, so this paper adopts NCC for angular point matching:

$$\mathrm{NCC} = \frac{\sum_{i} \left(W_1(x_i, y_i) - u_1\right)\left(W_2(x_i, y_i) - u_2\right)}{\sqrt{\sum_{i} \left(W_1(x_i, y_i) - u_1\right)^{2} \sum_{i} \left(W_2(x_i, y_i) - u_2\right)^{2}}}, \tag{11}$$

where $W_1$ and $W_2$ are two windows of the same size centered on angular point $p_1$ in image 1 and angular point $p_2$ in image 2, $u_1$ and $u_2$ are the mean pixel values in the windows $W_1$ and $W_2$, respectively, and $i$ ranges over the pixels of the window.
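A minimal sketch of NCC-based angular point matching follows. The window half-size, the acceptance score `min_score`, the helper names, and the synthetic shifted image are assumptions for illustration; corners are assumed to lie far enough from the image border for a full window to be cut out.

```python
import numpy as np

def ncc(w1, w2):
    """Normalized cross correlation of two equally sized windows."""
    a = w1 - w1.mean()
    b = w2 - w2.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def window(img, center, half):
    """Square window of side 2 * half + 1 centered on (row, col)."""
    r, c = center
    return img[r - half:r + half + 1, c - half:c + half + 1]

def match_corners(img1, img2, corners1, corners2, half=3, min_score=0.8):
    """For each corner in image 1, pick the image 2 corner with maximum NCC."""
    matches = []
    for i, p1 in enumerate(corners1):
        w1 = window(img1, p1, half)
        scores = [ncc(w1, window(img2, p2, half)) for p2 in corners2]
        j = int(np.argmax(scores))
        if scores[j] >= min_score:        # hypothetical acceptance threshold
            matches.append((i, j))
    return matches

# Image 2 is image 1 shifted by 2 rows and 3 columns.
rng = np.random.default_rng(0)
img1 = rng.random((40, 40))
img2 = np.roll(img1, shift=(2, 3), axis=(0, 1))
corners1 = [(10, 10), (20, 15)]
corners2 = [(22, 18), (12, 13)]           # same points, listed in swapped order
matches = match_corners(img1, img2, corners1, corners2)
```

Because the mean is subtracted and the result is normalized, the score is unchanged by a gain or offset applied to one window, which is why NCC tolerates moderate illumination change.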

4.2. RANSAC Fine Matching

Mismatches occur no matter which feature descriptor and similarity measure are used. This paper adopts RANSAC [22] to remove mismatches from the candidate matching points. RANSAC estimates the parameters of a mathematical model from a sample set that contains abnormal data and thereby identifies the valid samples.

The basic idea of the RANSAC algorithm used in this paper is as follows.

(1) Consider a model $M$ and a sample set $P$. The cardinality of the minimum sample set of the model is $n$ ($n$ is the minimum number of samples needed to determine the model), and the size of $P$ is larger than $n$. A subset $S$ of samples is randomly extracted from $P$, with initial size $n$, and the model $M$ is estimated from $S$.

(2) Let $SC = P \setminus S$ be the complementary set of $S$. The samples from $SC$ whose error with respect to $M$ is less than the threshold $t$, together with the subset $S$, compose the set $S^{*}$. $S^{*}$ is regarded as the set of inliers, and these inliers form the consensus set of $S$.

(3) If $\#(S^{*}) \ge N$, where $N$ is a preset minimum consensus size, the correct model parameters are considered to have been obtained. A new model $M^{*}$ is recalculated from $S^{*}$ by least squares; then a new subset $S$ is extracted, and the above process is repeated.

(4) If no consensus set is found after a certain number of samplings, the algorithm fails. Otherwise, the largest consensus set is chosen and used to distinguish inliers from outliers, and the algorithm ends.
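The four steps above can be sketched for the simplest motion model, a pure translation between matched corner points, for which one match is a minimal sample and the least squares fit is the mean displacement. The translation model, the thresholds `t` and `N`, the iteration count, and the synthetic matches are all assumptions; the paper does not state which model or values were used.

```python
import numpy as np

def ransac_translation(p1, p2, t=2.0, N=5, iters=100, rng=None):
    """Estimate a translation from matches p1[i] -> p2[i] with RANSAC.

    Returns (offset, inlier_mask), or (None, None) if no consensus set
    with at least N members is found.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = p2 - p1                               # displacement of every match
    best = None
    for _ in range(iters):
        s = rng.integers(len(d))              # step 1: minimal sample (n = 1)
        model = d[s]
        err = np.linalg.norm(d - model, axis=1)
        inliers = err < t                     # step 2: consensus set (includes s)
        # step 3: accept only large enough sets, keep the largest one
        if inliers.sum() >= N and (best is None or inliers.sum() > best.sum()):
            best = inliers
    if best is None:                          # step 4: failure
        return None, None
    return d[best].mean(axis=0), best         # least squares refit on inliers

# Synthetic matches: 20 inliers near offset (3, -2) and 8 gross mismatches.
rng = np.random.default_rng(0)
p1 = rng.uniform(0, 100, size=(28, 2))
d_true = np.tile([3.0, -2.0], (28, 1))
d_true[:20] += rng.uniform(-0.3, 0.3, size=(20, 2))
d_true[20:] = [[15 + 5 * i, -20 - 4 * i] for i in range(8)]
p2 = p1 + d_true

offset, inliers = ransac_translation(p1, p2, rng=rng)
```

The returned `offset` plays the role of the angular point offset; the mismatches are excluded because none of them gathers a consensus set of size `N`.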

4.3. Template Update Mechanism

In [13], the template was modeled by Gaussian distributions, and the mean and variance of the samples were updated after tracking in every frame. However, occlusion, illumination change, and pose variation produce tracking errors. In such frames the similarity between the template and the target is low, and updating the template at that moment results in greater deviation. Therefore, in this paper, the target at the tracked position is compared with the previous template through NCC, and the template with the larger similarity is taken as the template of the current frame.

A threshold on the normalized cross correlation is set. If the normalized cross correlation is greater than the threshold, the template is updated; otherwise, the template is left unchanged.
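The gated update can be sketched as a small wrapper around an NCC score. The threshold value 0.7, the function names, and the toy patches are hypothetical; the paper does not give them.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def gated_update(template, patch, params, new_params, threshold=0.7):
    """Accept new_params only if the tracked patch resembles the template.

    Returns the parameters to keep and a flag telling whether an update
    took place.
    """
    if ncc(template, patch) > threshold:
        return new_params, True
    return params, False

rng = np.random.default_rng(0)
template = rng.random((16, 16))
similar = 1.2 * template + 0.1        # same content, different gain and offset
occluded = rng.random((16, 16))       # unrelated content, e.g. an occluder

kept, updated_similar = gated_update(template, similar, "old", "new")
held, updated_occluded = gated_update(template, occluded, "old", "new")
```

Skipping the update for dissimilar patches keeps an occluder or a drifted box from being learned into the model.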

4.4. Process of APMCCT

The steps of angular point matching combined with compressive tracking (APMCCT) are as follows.

(1) Initialization: select the target manually and set the initial parameters; create the projection matrix; set the number of features and the initial parameters of the classifier.

(2) Calculation: read the current frame, and then collect several positive samples within the region $D^{\alpha}$ (distance to the target location less than $\alpha$) and several negative samples within the region $D^{\zeta,\beta}$ (distance to the target location between $\zeta$ and $\beta$). Calculate the compressed-domain feature vector of each sample in the candidate area. Extract the angular points of the current image with the Harris operator. The template of the first frame is calculated within the target area, whereas in the remaining frames, to account for target motion, the calculation is carried out over an area enlarged by 10%.

(3) Tracking: calculate the response value of each candidate sample by (4), and use the sample with the maximum response to calculate the target position offset; calculate the angular point offset; then use the two offsets to calculate the target position.

(4) Update classifier: calculate the normalized cross correlation between the current location and the initial template; if the normalized cross correlation is greater than the threshold, update the template according to (6).

(5) Return to step (2) to process the next frame.

5. Experiment Results and Analysis

The experiments were implemented in MATLAB R2012a and ran at 35 frames per second (FPS) on an AMD 2.10 GHz CPU with 4 GB RAM. The MATLAB R2012a software was obtained from the Internet. To verify the robustness of the new algorithm, the performance of APMCCT was compared with that of the original compressive tracking algorithm. Tracking results on the Girl, SUV, and Fish sequences are shown in Figures 1, 2, 4, 5, 7, and 8, and the tracking error curves are plotted in Figures 3, 6, and 9. These challenging test sequences are from [1] and are available at http://visultracking.net.

5.1. Experiment Parameter Settings

According to the target location in the current frame, the search radius $\alpha$ of the positive sample set $D^{\alpha}$ was set so that 45 positive samples were generated. The negative sample set $D^{\zeta,\beta}$ was bounded by an inner radius $\zeta$ and an outer radius $\beta$, and 50 negative samples were selected randomly from it. The search radius $\gamma$ of the candidate sample set was set so that 1100 candidate samples were generated. The dimension $n$ of the projection space and the learning parameter $\lambda$ were each set to a fixed value.
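To make the sampling scheme concrete, the sketch below enumerates integer pixel offsets around the target. The radii 4, 8, and 30 are illustrative assumptions and not values taken from this paper; a radius of 4 with a strict inequality happens to yield exactly 45 offsets, which matches the number of positive samples quoted above.

```python
import numpy as np

def offsets_in_ring(r_outer, r_inner=0.0):
    """Integer offsets (dx, dy) with r_inner <= ||(dx, dy)|| < r_outer."""
    r = int(np.ceil(r_outer))
    dx, dy = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
    dist = np.hypot(dx, dy)
    keep = (dist >= r_inner) & (dist < r_outer)
    return np.stack([dx[keep], dy[keep]], axis=1)

positives = offsets_in_ring(4)            # assumed positive radius

rng = np.random.default_rng(0)
ring = offsets_in_ring(30, 8)             # assumed outer and inner radii
picked = rng.choice(len(ring), size=50, replace=False)
negatives = ring[picked]                  # 50 randomly selected negatives
```

Each offset is added to the current target location to obtain the top-left corner (or center) of one sample box, depending on the convention used by the tracker.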

5.2. Results and Analysis

Center Location Error (CLE) was used to evaluate the APMCCT and CT algorithms. CLE is the Euclidean distance between the center of the real target location and the center of the tracking result. The specific formula is expressed by

$$\mathrm{CLE} = \sqrt{(x_r - x_t)^{2} + (y_r - y_t)^{2}}, \tag{12}$$

where $x_r$ and $x_t$ are the $x$-coordinates of the real location center and the tracking location center, respectively. Likewise, $y_r$ and $y_t$ are the $y$-coordinates of the real location center and the tracking location center, respectively.
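A per-frame CLE curve of the kind plotted in the error figures can be computed as below. The coordinates are made-up numbers used only to show the calculation, and the function name is hypothetical.

```python
import numpy as np

def center_location_error(truth, tracked):
    """Euclidean distance between real and tracked centers, per frame."""
    truth = np.asarray(truth, dtype=float)
    tracked = np.asarray(tracked, dtype=float)
    return np.hypot(truth[..., 0] - tracked[..., 0],
                    truth[..., 1] - tracked[..., 1])

# Made-up (x, y) centers for three frames.
truth = [(10, 10), (12, 11), (15, 13)]
tracked = [(10, 10), (15, 15), (15, 8)]

cle = center_location_error(truth, tracked)   # one error value per frame
mean_cle = float(cle.mean())                  # summary over the sequence
```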

The Girl video sequence consists of 500 color frames of size 128 × 96. The background of the sequence is relatively simple. During tracking, the head and body of the target rotate through 360 degrees, and people who look similar to the target appear. Representative frames (the 40th, 86th, 182nd, 297th, 442nd, and 500th) were selected to show the performance of the APMCCT and compressive tracking (CT) algorithms, as shown in Figures 1 and 2. When the target rotates, the deviation of CT grows as errors accumulate, so its tracking result is poor. APMCCT, by contrast, is stable. When the target rotates and changes in appearance in the 86th, 182nd, and 297th frames, APMCCT still tracks accurately. When people similar to the target appear in the 442nd and 500th frames, APMCCT remains relatively accurate. Moreover, APMCCT quickly finds the target again after it has been lost, whereas CT fails to recover it. However, because APMCCT does not handle scale change, its performance drops slightly in the 90th to 110th frames and in the 180th to 240th frames. The change of scale alters the area used for corner detection, so angular point matching is no longer focused on the target area. This issue will be addressed in future work. To compare the performance of the two algorithms in each frame, the tracking error curve of the Girl sequence is plotted in Figure 3.

The SUV video sequence consists of 945 frames of size 320 × 240. During tracking, the scale of the target changes and various obstructions appear. Representative frames (the 87th, 280th, 400th, 504th, 674th, and 745th) were selected to show the performance of the APMCCT and compressive tracking (CT) algorithms, as shown in Figures 4 and 5. Both algorithms track accurately until the obstructions appear. Once the target is obstructed, the deviation of CT grows as errors accumulate and eventually leads to loss of the target. APMCCT, by contrast, still tracks accurately, as shown in the 504th, 674th, and 745th frames, and quickly finds the target again after drifting or losing it. To compare the performance of the two algorithms in each frame, the tracking error curve of the SUV sequence is plotted in Figure 6.

The Fish video sequence consists of 476 frames of size 320 × 240. Representative frames (the 6th, 116th, 180th, 215th, 362nd, and 445th) were selected to show the performance of the APMCCT and compressive tracking (CT) algorithms, as shown in Figures 7 and 8. Because of illumination change and camera shake, the deviation of CT grows as errors accumulate and eventually leads to loss of the target. The tracking error of APMCCT, by contrast, stays relatively small. When the illumination changes in the 362nd and 445th frames, APMCCT still tracks accurately. To compare the performance of the two algorithms in each frame, the tracking error curve of the Fish sequence is plotted in Figure 9.

6. Conclusion

To address the problem of object tracking, a novel algorithm that combines angular point matching with compressive tracking (APMCCT) was proposed, and the template updating mechanism was optimized. Experiments on different video sequences show that APMCCT adapts to changes of the target and the background and tracks the target accurately. Its accuracy and robustness are clearly improved compared with the CT algorithm. However, APMCCT is still unable to adjust the scale of the tracking box adaptively, which will be studied in future work.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61203302 and no. 61403277) and the Tianjin Research Program of Application Foundation and Advanced Technology (14JCYBJC18900).