Journal of Electrical and Computer Engineering

Volume 2016 (2016), Article ID 7975951, 7 pages

http://dx.doi.org/10.1155/2016/7975951

## Object Tracking via 2DPCA and ℓ2-Regularization

^{1}College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
^{2}Aviation Information Technology R & D Center, Binzhou University, Binzhou 256603, China

Received 10 March 2016; Accepted 13 July 2016

Academic Editor: Jiri Jan

Copyright © 2016 Haijun Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We present a fast and robust object tracking algorithm based on 2DPCA and ℓ2-regularization in a Bayesian inference framework. Firstly, we model the challenging appearance of the tracked object using 2DPCA bases, which exploit the strength of subspace representation. Secondly, we adopt ℓ2-regularization to solve the proposed representation model and remove the trivial templates used in sparse tracking methods, which yields much faster tracking. Finally, we present a novel likelihood function that considers the reconstruction error computed from the orthogonal left-projection matrix and the orthogonal right-projection matrix. Experimental results on several challenging image sequences demonstrate that the proposed method achieves favorable performance against state-of-the-art tracking algorithms.

#### 1. Introduction

Visual tracking is one of the fundamental topics in computer vision and plays an important role in numerous research areas and practical applications such as surveillance, human-computer interaction, robotics, and traffic control. Existing object tracking algorithms can be divided into two categories: discriminative and generative. Discriminative methods treat tracking as a binary classification problem with local search, estimating the decision boundary between an object image patch and the background. Babenko et al. [1] proposed an online multiple instance learning (MIL) method, which treats ambiguous positive and negative samples as bags to learn a discriminative classifier. Zhang et al. [2] proposed a fast compressive tracking algorithm which employs nonadaptive random projections that preserve the structure of the image feature space.

Generative methods typically learn a model to represent the target object and incrementally update the appearance model, searching for the image region with minimal reconstruction error. Inspired by the success of sparse representation in face recognition [3], super-resolution [4], and inpainting [5], sparse representation based visual tracking [6–9] has recently attracted increasing interest. Mei and Ling [10] first extended sparse representation to object tracking, casting the tracking problem as determining the likeliest patch with a sparse representation of templates. Their method can handle partial occlusion by treating the error term as sparse noise. However, it requires solving a series of ℓ1-norm minimization problems many times, so its time complexity is significant. Although some modified ℓ1-norm methods have been proposed to speed up the tracker, they are still far from being real time.

Recently, many object tracking algorithms have been proposed to exploit the power of subspace representation from different perspectives. Ross et al. [11] presented a tracking method that incrementally learns a low-dimensional PCA subspace representation, efficiently adapting online to changes in the appearance of the target. However, this method is sensitive to partial occlusion. Zhong et al. [8] proposed a robust object tracking algorithm via a sparse collaborative appearance model that exploits both holistic templates and local representations to account for appearance changes. Zhuang et al. [12] cast the tracking problem as finding the candidate that scores highest in an evaluation model based upon a discriminative sparse similarity map. Qian et al. [13] exploit an appearance model based on extended incremental nonnegative matrix factorization for visual tracking. Wang and Lu [14] presented a novel online object tracking algorithm using 2DPCA and ℓ1-regularization. This method achieves good performance in many scenes. However, the coefficients and the sparse error matrix used in this method must be computed by an iterative algorithm, and the space and time complexity are too high for real-time tracking.

Motivated by the aforementioned work, this paper presents a robust and fast ℓ2-norm tracking algorithm with an adaptive appearance model. The contributions of this work are threefold: (1) we exploit the strength of 2DPCA subspace representation using ℓ2-regularization; (2) we remove the trivial templates from the sparse tracking method; (3) we present a novel likelihood function that considers the reconstruction error computed from the orthogonal left-projection matrix and the orthogonal right-projection matrix. Both qualitative and quantitative evaluations on video sequences demonstrate that the proposed method handles occlusion, illumination changes, scale changes, and nonrigid appearance changes effectively with lower computational complexity and can run in real time.

#### 2. Object Representation via 2DPCA and ℓ2-Regularization

Principal component analysis (PCA) is a classical feature extraction and data representation technique widely used in pattern recognition and computer vision. Compared with PCA, two-dimensional principal component analysis (2DPCA) [15] is based on 2D matrices rather than 1D vectors, so an image matrix does not need to be transformed into a vector beforehand. As a result, extracting image features is computationally more efficient with 2DPCA than with PCA. In this paper, we represent the object by 2D basis matrices. Given a series of image matrices $T_1, T_2, \ldots, T_n$, the projection coefficient matrices $B_i$ can be obtained by solving

$$\min_{B_i} \left\| T_i - U B_i V^{\top} \right\|_F^2, \quad i = 1, \ldots, n, \tag{1}$$

where $\|\cdot\|_F$ denotes the Frobenius norm, $U$ represents the orthogonal left-projection matrix, and $V$ represents the orthogonal right-projection matrix.
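Since $U$ and $V$ have orthonormal columns, the unregularized projection above has the familiar closed form $B = U^{\top} T V$. A minimal NumPy sketch (the dimensions and the random bases are illustrative assumptions; in the tracker $U$ and $V$ would come from 2DPCA of training images):

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 8, 6      # image size (illustrative)
k, l = 3, 2      # subspace dimensions (illustrative)

# Orthonormal left/right projection matrices; here obtained from QR
# factorizations of random matrices purely for illustration.
U, _ = np.linalg.qr(rng.standard_normal((m, k)))
V, _ = np.linalg.qr(rng.standard_normal((n, l)))

T = rng.standard_normal((m, n))   # an image matrix

# Least-squares minimizer of ||T - U B V^T||_F^2 is B = U^T T V
B = U.T @ T @ V
T_rec = U @ B @ V.T               # low-rank 2DPCA reconstruction
```

The optimality of `B` can be checked from the normal equations: the residual `T - T_rec` is orthogonal to the subspace, that is, `U.T @ (T - T_rec) @ V` vanishes.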

The cost function is set as an ℓ2-regularized quadratic function:

$$\min_{B} \left\| O - U B V^{\top} \right\|_F^2 + \lambda \left\| B \right\|_F^2. \tag{2}$$

Here, $\lambda$ is a constant. The solution of (2) is easily derived as follows:

$$\operatorname{vec}(B) = \left( (V \otimes U)^{\top} (V \otimes U) + \lambda I \right)^{-1} (V \otimes U)^{\top} \operatorname{vec}(O) = \frac{1}{1+\lambda} (V \otimes U)^{\top} \operatorname{vec}(O). \tag{3}$$

Here, $I$ means the identity matrix, $\otimes$ stands for the Kronecker product, and $\operatorname{vec}(\cdot)$ means the vector version of a matrix; the second equality holds because $(V \otimes U)^{\top}(V \otimes U) = (V^{\top}V) \otimes (U^{\top}U) = I$. Therefore, we can get the projection coefficient matrix $B$. Let $P = \frac{1}{1+\lambda}(V \otimes U)^{\top}$. Obviously, the projection matrix $P$ is independent of the candidate, so we can precalculate it once per frame before the loop over all candidates. When a new candidate $O$ comes, we simply calculate $\operatorname{vec}(B) = P \operatorname{vec}(O)$, which makes the proposed method very fast.

Here, we abandon the trivial templates completely, so that the target is represented entirely by the 2DPCA subspace. After the projection coefficient matrix $B$ is obtained from (3), the error matrix $E$ can be computed by

$$E = O - U B V^{\top}. \tag{4}$$

Thus, the error matrix can be calculated in a single step, with no iteration.
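The closed-form solution and the error matrix can be sketched directly in NumPy. This is a toy verification (dimensions and $\lambda$ are illustrative assumptions); note that column-stacking `vec` corresponds to `order="F"`:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k, l = 8, 6, 3, 2          # illustrative sizes
lam = 0.05                       # regularization constant (assumed value)

U, _ = np.linalg.qr(rng.standard_normal((m, k)))   # left-projection matrix
V, _ = np.linalg.qr(rng.standard_normal((n, l)))   # right-projection matrix
O = rng.standard_normal((m, n))                    # candidate observation

vec = lambda M: M.flatten(order="F")               # column-stacking vec(.)

# Precomputable projection matrix P = (V kron U)^T / (1 + lam):
P = np.kron(V, U).T / (1.0 + lam)

b = P @ vec(O)                   # vec(B) for a new candidate: one mat-vec
B = b.reshape((k, l), order="F")

# Equivalent matrix form, since (V kron U) has orthonormal columns:
B_direct = (U.T @ O @ V) / (1.0 + lam)
assert np.allclose(B, B_direct)

E = O - U @ B @ V.T              # error matrix, computed once
```

The assertion confirms the Kronecker-product route and the direct matrix route agree, which is what allows `P` to be precomputed per frame and applied cheaply to every candidate.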

#### 3. Tracking Framework Based on 2DPCA and ℓ2-Regularization

Visual tracking is treated as a Bayesian inference task in a Markov model with hidden state variables. Given a series of observations $O_{1:t} = \{O_1, O_2, \ldots, O_t\}$, we aim to estimate the hidden state variable $s_t$ recursively:

$$p(s_t \mid O_{1:t}) \propto p(O_t \mid s_t) \int p(s_t \mid s_{t-1})\, p(s_{t-1} \mid O_{1:t-1})\, \mathrm{d}s_{t-1}, \tag{5}$$

where $p(s_t \mid s_{t-1})$ is the motion model that represents the state transition between two consecutive states and $p(O_t \mid s_t)$ is the observation model, which indicates the likelihood function.
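In practice this recursion is approximated with a particle filter. A generic skeleton follows (the state layout, variance values, and the toy likelihood are placeholder assumptions; the real observation model is the 2DPCA reconstruction-error likelihood described below):

```python
import numpy as np

rng = np.random.default_rng(2)

N = 600                                   # number of particles (as in the paper)
state_dim = 6                             # six affine parameters
psi = np.array([4.0, 4.0, 0.01, 0.01, 0.002, 0.001])  # assumed variances

def propagate(particles):
    # random-walk motion model: p(s_t | s_{t-1}) = N(s_t; s_{t-1}, Psi)
    return particles + rng.standard_normal(particles.shape) * np.sqrt(psi)

def likelihood(particles):
    # placeholder for p(O_t | s_t); returns one weight per particle
    return np.exp(-np.sum(particles[:, :2] ** 2, axis=1) * 1e-3)

particles = np.zeros((N, state_dim))      # initialized from the labeled frame
particles = propagate(particles)          # predict
w = likelihood(particles)                 # weight by the observation model
w /= w.sum()                              # normalized importance weights
best = particles[np.argmax(w)]            # tracking result for this frame
idx = rng.choice(N, size=N, p=w)          # resample for the next frame
particles = particles[idx]
```

One predict-weight-resample pass per frame implements the integral in the recursion by Monte Carlo averaging over particles.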

*Motion Model*. We apply an affine image warp to model the target motion between consecutive frames. Six parameters of the affine transform are used to model the state of a tracked target. Let $s_t = (x_t, y_t, \theta_t, \sigma_t, \alpha_t, \phi_t)$, where $x_t$, $y_t$, $\theta_t$, $\sigma_t$, $\alpha_t$, and $\phi_t$ denote the $x$ and $y$ translations, rotation angle, scale, aspect ratio, and skew, respectively. The state transition is formulated as a random walk; that is, $p(s_t \mid s_{t-1}) = \mathcal{N}(s_t; s_{t-1}, \Psi)$, where $\Psi$ is a diagonal covariance matrix whose entries are the variances of the affine parameters.
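One common way to assemble the 2×3 warp matrix from these six parameters is sketched below; this particular factorization (rotation times scale/aspect/skew) is an assumption for illustration, not necessarily the authors' exact parameterization:

```python
import numpy as np

def affine_matrix(x, y, theta, s, alpha, phi):
    """Build a 2x3 affine warp from (x, y, theta, scale, aspect, skew)."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])     # rotation
    S = s * np.array([[1.0, phi],                       # skew phi
                      [0.0, alpha]])                    # aspect ratio alpha
    A = R @ S
    return np.hstack([A, [[x], [y]]])                   # append translation

W = affine_matrix(10, 20, 0.0, 1.0, 1.0, 0.0)
# With identity rotation/scale/aspect/skew the warp reduces to a pure
# translation by (10, 20).
```

Each particle's six numbers are converted to such a matrix to crop and normalize its candidate image patch.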

*Observation Model*. If no occlusion occurs, an image observation can be generated by a 2DPCA subspace (spanned by $U$ and $V$ and centered at the mean observation $\mu$). Here, we consider partial occlusion in the appearance model for robust tracking. Thus, we assume that a centered image matrix can be represented by a linear combination of the projection matrices $U$ and $V$. Then, we draw $N$ candidates in the state $s_t$. For each of the observed image matrices, we solve an ℓ2-regularization problem:

$$\min_{B^i} \left\| O^i - U B^i V^{\top} \right\|_F^2 + \lambda \left\| B^i \right\|_F^2, \tag{6}$$

where $O^i$ denotes the $i$th sample of the state $s_t$. Thus, we obtain $B^i$, and the likelihood can be measured by the reconstruction error:

$$p\left(O_t \mid s_t^i\right) = \exp\left( -\left\| O^i - U B^i V^{\top} \right\|_F^2 \right). \tag{7}$$

However, it is noted that additionally penalizing the level of the error matrix benefits the precise location of the tracked target. Therefore, we present a novel likelihood function, which considers both the reconstruction error and the level of the error matrix:

$$p\left(O_t \mid s_t^i\right) = \exp\left( -\left( \left\| O^i - U B^i V^{\top} \right\|_F^2 + \gamma \left\| E^i \right\|_1 \right) \right), \tag{8}$$

where $\gamma$ is a weighting constant and $E^i$ can be calculated by (9):

$$E^i = O^i - U B^i V^{\top}. \tag{9}$$

Here, $B^i$ is calculated by (3).
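Scoring a batch of candidates then amounts to one closed-form projection and one exponential per candidate. A sketch under illustrative assumptions (dimensions, $\lambda$, the weighting constant `gamma`, and random data are all placeholders):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k, l, N = 8, 6, 3, 2, 600               # illustrative sizes
lam, gamma = 0.05, 0.1                        # assumed constants

U, _ = np.linalg.qr(rng.standard_normal((m, k)))
V, _ = np.linalg.qr(rng.standard_normal((n, l)))
cands = rng.standard_normal((N, m, n))        # centered candidate patches

scores = np.empty(N)
for i, O in enumerate(cands):
    B = (U.T @ O @ V) / (1.0 + lam)           # closed-form coefficients
    E = O - U @ B @ V.T                       # error matrix
    rec = np.sum(E ** 2)                      # squared Frobenius norm
    scores[i] = np.exp(-(rec + gamma * np.abs(E).sum()))

best = int(np.argmax(scores))                 # most likely candidate
```

Because no iterative solver is needed per candidate, the cost of evaluating all $N$ particles is dominated by small matrix multiplications.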

*Online Update*. In order to handle appearance changes of the tracked target, it is necessary to update the observation model. If imprecise samples are used for the update, the tracked model may degrade. Therefore, we present an occlusion-ratio-based update mechanism. After obtaining the best candidate state of each frame, we compute the corresponding error matrix $E$ and the occlusion ratio $\eta$. Two thresholds, $tr_1 = 0.1$ and $tr_2 = 0.6$, are introduced to define the degree of occlusion. If $\eta < tr_1$, the tracked target is not occluded, or only a small part of it is corrupted by noise; therefore, the model is updated directly with the sample. If $tr_1 \le \eta < tr_2$, the tracked target is partially occluded; the occluded part is replaced by the average observation, and the recovered candidate is used for the update. If $\eta \ge tr_2$, most of the tracked target is occluded; therefore, the sample is discarded without an update. After enough samples are accumulated, we use an incremental 2DPCA algorithm to update the tracker (the left- and right-projection matrices).
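The three-way update rule can be sketched as a small decision function. The pixel-wise occlusion test (comparing $|E|$ against a threshold `eps`) and all constants below are illustrative assumptions:

```python
import numpy as np

def update_decision(E, mean_obs, O, eps=0.5, tr1=0.1, tr2=0.6):
    """Return ("update", sample) or ("discard", None) from the error matrix.

    E        -- error matrix of the best candidate
    mean_obs -- average observation (same shape as O)
    O        -- best candidate patch
    """
    occluded = np.abs(E) > eps                 # pixels judged occluded
    eta = occluded.mean()                      # occlusion ratio
    if eta < tr1:                              # no or negligible occlusion
        return "update", O
    elif eta < tr2:                            # partial occlusion
        recovered = np.where(occluded, mean_obs, O)
        return "update", recovered             # occluded pixels replaced
    else:                                      # heavy occlusion
        return "discard", None
```

Accumulated "update" samples would then feed the incremental 2DPCA step that refreshes the left- and right-projection matrices.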

#### 4. Experiments

The proposed tracking algorithm is implemented in MATLAB and runs on a computer with an Intel i5-3210 CPU (2.5 GHz) and 4 GB memory. The regularization constant $\lambda$ is set to 0.05. Each image observation is resized to a fixed size for the proposed 2DPCA representation. For each sequence, the location of the tracked target object is manually labeled in the first frame. 600 particles are adopted for the proposed algorithm, a trade-off between effectiveness and speed. Our tracker is incrementally updated every 5 frames.

To demonstrate the effectiveness of the proposed tracking algorithm, we select six state-of-the-art trackers for comparison: the ℓ1 tracker [10], the PN tracker [16], the VTD tracker [17], the MIL tracker [1], the Frag tracker [18], and the 2DPCA tracker [14], on several challenging image sequences including *Occlusion 1*, *David Outdoor*, *Caviar 2*, *Girl*, *Car 4*, *Car 11*, *Singer 1*, *Deer*, *Jumping*, and *Lemming*. The challenging factors include severe occlusion, pose change, motion blur, illumination variation, and background clutter.

##### 4.1. Qualitative Evaluation

*Severe Occlusion*. We test four sequences (*Occlusion 1*, *DavidOutdoor*, *Caviar 2*, and *Girl*) with long-time partial or heavy occlusion and scale change. Figure 1(a) demonstrates that the ℓ1 algorithm, the Frag algorithm, the 2DPCA algorithm, and our algorithm perform better, since these methods take partial occlusion into account. The ℓ1, 2DPCA, and our algorithms can handle occlusion by avoiding updating occluded pixels into the PCA basis and the 2DPCA basis, respectively. The Frag algorithm works well on some simple occlusion cases (e.g., Figure 1(a), *Occlusion 1*) via its part-based representation. However, it performs poorly on more challenging videos (e.g., Figure 1(b), *DavidOutdoor*). The MIL tracker is not able to track the occluded target in *DavidOutdoor* and *Caviar 2*, since the Haar-like features it adopts are less effective in distinguishing similar objects. For the *Girl* video, the in-plane and out-of-plane rotation, partial occlusion, and scale change make tracking difficult. It can be seen that the Frag tracker and the proposed tracker work better than the other methods.