Abstract

The object tracking problem is an important research topic in computer vision. For real applications such as vehicle tracking and face tracking, there are many efficient real-time algorithms. In this study, we focus on the Lucas-Kanade (LK) algorithm for object tracking. Although this method is time consuming, it offers good tracking accuracy and adapts well to changing environments. The standard LK method uses the sum of squared errors as the cost function, whereas this study adopts least trimmed squares. The resulting estimator is robust against outliers caused by noise and occlusion in the tracking process. Simulations show that the proposed algorithm outperforms the standard LK method in the sense that it is robust against outliers in object tracking problems.

1. Introduction

The Lucas-Kanade (LK) algorithm, originally proposed by Lucas and Kanade in 1981 [1], makes use of the spatial intensity gradient of the images to find a good match using a type of Newton-Raphson iteration. The goal of the standard LK algorithm is to minimize the sum of squared errors (SSE) between the template and the warped image region by adjusting the warping parameters.

Based on the theory of robust statistics, various types of cost functions have been adopted in many practical applications such as regression. One of the main approaches to robust regression involves M-estimation [2–6]. The M-estimator is robust to outliers in the response variable but is not resistant to outliers in the explanatory variables, called leverage points. In fact, when there are outliers in the explanatory variables, the method has no advantage over least squares. In the 1980s, several alternatives to M-estimation were proposed to overcome this lack of resistance. Least trimmed (sum of) squares (LTS) is a viable alternative [7–10].

Several robust LK algorithms have been proposed in the literature. M-estimators were adapted to LK models for tracking problems in [11–15]. The basic idea is to replace the SSE cost function with the Huber or Hampel function so that the effects of outliers are reduced. In [11], the authors generalized the LK approach to a histogram-based tracking algorithm, establishing a closer link between template matching and histogram-based tracking methods. Schreiber [12] proposed a novel variant of the LK algorithm for tracking bilaterally symmetric planar objects from a moving platform. This algorithm is capable of coping with any warping transformation and can be generalized to objects possessing higher symmetry. Based on an analysis of data distributions, Senst et al. [13] proposed a robust local optical flow approach based on a modified Hampel estimator. Fan et al. [14] proposed robust template tracking with weighted active drift correction. The minimization for active drift correction is achieved by the inverse compositional algorithm and involves a tracking term and a drift correction term. In [15], a robust LK template matching algorithm was based on evidence accumulated over many frames; it incorporates drift correction and uses robust weights that are updated from frame to frame. These papers emphasize robust inverse compositional algorithms using robust Huber or Hampel cost functions. There are also many other robust LK methods, for example for robust measurement of ocular torsion [16], visual tracking and learning [17], and graph-based transductive learning [18].

The LTS estimator is highly resistant to leverage points and is robust to outliers in the response. When we expect some observations in the data to which we wish to give no weight in the modeling, the LTS estimator is usually a good choice. The percentage of the data given no weight is termed the trimming percentage or trimming parameter, and it is usually prespecified for the data at hand. In this study, the main idea is to exploit the resistance of the LTS estimator in the LK algorithm (LTS-LK) for object tracking problems in which outliers are caused by, for instance, noise and occlusion. The updating rules for the warping parameters under the LTS-LK approach are derived. Three simulations are provided, two for book-object tracking and one for face-object tracking. The results show that the proposed method can effectively track the object when salt-and-pepper noise corruption or occlusion appears.

The rest of this paper is organized as follows. Section 2 outlines the standard LK algorithm. Section 3 presents the details of the proposed LTS-LK method. Section 4 provides experimental results to verify the performance of the proposed method. Finally, Section 5 concludes the paper.

2. Lucas-Kanade Algorithm

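Template tracking of an object in a video sequence is performed by extracting a template in the first frame and then finding the region that matches the template as closely as possible in the remaining frames. The goal of the LK method is to align a template image $T(\mathbf{x})$ to an input image $I(\mathbf{x})$, where $\mathbf{x} = (x, y)^T$ is a column vector representing the pixel coordinates. Let $W(\mathbf{x}; \mathbf{p})$ denote the parameterized set of allowed warps, where $\mathbf{p} = (p_1, \ldots, p_6)^T$ is a vector of warping parameters. Here, we consider the following set of affine warps:

$$W(\mathbf{x}; \mathbf{p}) = \begin{pmatrix} (1 + p_1)x + p_3 y + p_5 \\ p_2 x + (1 + p_4)y + p_6 \end{pmatrix}. \tag{1}$$

The best match to the template in the new frame is found by minimizing the following SSE function:

$$E(\mathbf{p}) = \sum_{\mathbf{x}} \left[ I(W(\mathbf{x}; \mathbf{p})) - T(\mathbf{x}) \right]^2, \tag{2}$$

where $W(\mathbf{x}; \mathbf{p})$ is defined in (1) and the sum is performed over all of the pixels $\mathbf{x}$ in the template image $T$. Based on the SSE function in (2), the goal of the LK algorithm is therefore to find

$$\mathbf{p}_t = \arg\min_{\mathbf{p}} \sum_{\mathbf{x}} \left[ I(W(\mathbf{x}; \mathbf{p})) - T(\mathbf{x}) \right]^2, \tag{3}$$

where $t$ is the frame number. Suppose that a current estimate of $\mathbf{p}$ is known and we wish to compute an appropriate increment $\Delta\mathbf{p}$. The minimization problem in (3) can then be converted to

$$\min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \left[ I(W(\mathbf{x}; \mathbf{p} + \Delta\mathbf{p})) - T(\mathbf{x}) \right]^2. \tag{4}$$

Performing a first-order Taylor series expansion on $I(W(\mathbf{x}; \mathbf{p} + \Delta\mathbf{p}))$ at $\mathbf{p}$ in (4) yields

$$\sum_{\mathbf{x}} \left[ I(W(\mathbf{x}; \mathbf{p})) + \nabla I \frac{\partial W}{\partial \mathbf{p}} \Delta\mathbf{p} - T(\mathbf{x}) \right]^2, \tag{5}$$

where $\nabla I = (\partial I / \partial x, \partial I / \partial y)$ is the gradient of the image $I$ evaluated at $W(\mathbf{x}; \mathbf{p})$. The term $\partial W / \partial \mathbf{p}$ is the Jacobian of the warp given by

$$\frac{\partial W}{\partial \mathbf{p}} = \begin{pmatrix} x & 0 & y & 0 & 1 & 0 \\ 0 & x & 0 & y & 0 & 1 \end{pmatrix}. \tag{6}$$

An approximate solution to (4) can be obtained by taking the partial derivatives of (5) with respect to $\Delta\mathbf{p}$ and setting them to zero. It is easy to derive that this approximate solution is given by

$$\Delta\mathbf{p} = H^{-1} \sum_{\mathbf{x}} \left[ \nabla I \frac{\partial W}{\partial \mathbf{p}} \right]^T \left[ T(\mathbf{x}) - I(W(\mathbf{x}; \mathbf{p})) \right], \tag{7}$$

where $H$ is the Gauss-Newton approximation to the Hessian matrix

$$H = \sum_{\mathbf{x}} \left[ \nabla I \frac{\partial W}{\partial \mathbf{p}} \right]^T \left[ \nabla I \frac{\partial W}{\partial \mathbf{p}} \right]. \tag{8}$$

Finally, the parameters are updated as

$$\mathbf{p} \leftarrow \mathbf{p} + \Delta\mathbf{p}, \tag{9}$$

and the process is iterated until the parameter estimates converge or the pre-specified stopping criterion is met.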
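The Gauss-Newton increment described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the implementation used in this study: the function name `lk_increment` is hypothetical, the affine warp is assumed to be parameterized as $((1 + p_1)x + p_3 y + p_5,\; p_2 x + (1 + p_4)y + p_6)$, and the warped image, its gradients, and the template are assumed to have been sampled already at the template pixels.

```python
import numpy as np

def lk_increment(grad_x, grad_y, xs, ys, template, warped):
    """One Gauss-Newton increment delta_p for the SSE cost (a sketch).

    All arguments are 1-D arrays with one entry per template pixel:
    image gradients at the warped positions, template pixel coordinates,
    template intensities, and warped-image intensities.
    """
    # Steepest-descent rows grad(I) * dW/dp, one 6-vector per pixel.
    # The column order follows the assumed affine parameterization.
    sd = np.stack([grad_x * xs, grad_y * xs,
                   grad_x * ys, grad_y * ys,
                   grad_x, grad_y], axis=1)
    hessian = sd.T @ sd                   # Gauss-Newton Hessian (6 x 6)
    rhs = sd.T @ (template - warped)      # sum of sd^T * (T - I(W))
    return np.linalg.solve(hessian, rhs)  # delta_p
```

The caller would then apply the additive update to the warping parameters and resample the warped image before the next iteration.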

3. LK Algorithm Using LTS Approach

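Let $\mathbf{x} \in \mathbb{R}^n$ denote an input and $\mathbf{d} \in \mathbb{R}^m$ the corresponding desired output. Suppose we are given the training set $S = \{(\mathbf{x}_q, \mathbf{d}_q)\}_{q=1}^{l}$. In the following, we will use the subscript $q$ to denote the $q$th observation. For instance, $x_{qi}$ denotes the $i$th component of the $q$th input $\mathbf{x}_q$, $q = 1, \ldots, l$, $i = 1, \ldots, n$. The residual (or error) at the $j$th component of the difference between the desired output $\mathbf{d}_q$ and the predicted output $\mathbf{y}_q$ due to the $q$th observation is defined by

$$e_{qj} = d_{qj} - y_{qj}. \tag{10}$$

The LTS approach is to choose parameters that minimize the total sum of trimmed squared errors given by

$$E_{\mathrm{LTS}} = \sum_{q=1}^{l} w_q e_q^2, \tag{11}$$

where $e_q^2 = \sum_{j=1}^{m} e_{qj}^2$ and the penalizing weight $w_q$ is defined by

$$w_q = \begin{cases} 1, & R_q \le h, \\ 0, & \text{otherwise}. \end{cases} \tag{12}$$

Here, $R_q$ denotes the rank of the residual $e_q^2$, $q = 1, \ldots, l$, and $e_{(1)}^2 \le e_{(2)}^2 \le \cdots \le e_{(l)}^2$ are the ordered values of $e_q^2$. A popular choice of $h$ is

$$h = \lfloor l(1 - \alpha) \rfloor + 1, \tag{13}$$

where $\lfloor a \rfloor$ indicates the largest integer less than or equal to $a$. In the following, the term $\alpha$ is referred to as the trimming percentage.

As given in the previous section, the values of $\Delta\mathbf{p}$ in (7) are computed from the inverse compositional algorithm. To adapt the notation for the LTS model, we regard the template image and subpixel location as one-dimensional vectors $T(\mathbf{x}_q)$ and $\mathbf{x}_q$, $q = 1, \ldots, l$, respectively, where $l$ is the number of pixels in the template image. Now the training set becomes

$$S = \{(\mathbf{x}_q, T(\mathbf{x}_q))\}_{q=1}^{l}. \tag{14}$$

The predictive function, a two-dimensional image, is given by

$$I(W(\mathbf{x}; \mathbf{p})). \tag{15}$$

It can be regarded as a one-dimensional vector $I(W(\mathbf{x}_q; \mathbf{p}))$, $q = 1, \ldots, l$. The residuals are given by

$$e_q = T(\mathbf{x}_q) - I(W(\mathbf{x}_q; \mathbf{p})). \tag{16}$$

One commonly used cost function is the squared $\ell_2$-norm of the residuals, $\sum_{q=1}^{l} e_q^2$. This is the cost function used in (4). To address object tracking problems in which outliers are caused by, for instance, noise and occlusion, we use the sum of trimmed squared errors instead of the $\ell_2$-norm. Consequently, the goal of the LTS estimator is to find the parameters of the LK algorithm that minimize the cost function defined by

$$E_{\mathrm{LTS}}(\mathbf{p}) = \sum_{q=1}^{l} w_q \left[ T(\mathbf{x}_q) - I(W(\mathbf{x}_q; \mathbf{p})) \right]^2. \tag{17}$$

It is not difficult to derive that the variable $\Delta\mathbf{p}$ can be calculated as

$$\Delta\mathbf{p} = H_w^{-1} \sum_{q=1}^{l} w_q \left[ \nabla I \frac{\partial W}{\partial \mathbf{p}} \right]^T \left[ T(\mathbf{x}_q) - I(W(\mathbf{x}_q; \mathbf{p})) \right], \tag{18}$$

where the robust Hessian matrix is

$$H_w = \sum_{q=1}^{l} w_q \left[ \nabla I \frac{\partial W}{\partial \mathbf{p}} \right]^T \left[ \nabla I \frac{\partial W}{\partial \mathbf{p}} \right]. \tag{19}$$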
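The rank-based weighting just described can be illustrated with a short Python sketch. The names `lts_weights` and `lts_cost` are hypothetical, and the number of retained residuals is assumed to be $h = \lfloor l(1 - \alpha) \rfloor + 1$ (capped at $l$), which is one common textbook choice and may differ from the exact rule used here.

```python
import numpy as np

def lts_weights(residuals, trim):
    """0/1 LTS weights: weight 1 for the h smallest squared residuals.

    `trim` is the trimming percentage as a fraction in [0, 1).
    Assumes h = floor(l * (1 - trim)) + 1, capped at l.
    """
    r2 = np.asarray(residuals, dtype=float) ** 2
    l = r2.size
    h = min(l, int(np.floor(l * (1.0 - trim))) + 1)
    keep = np.argsort(r2, kind="stable")[:h]  # indices of the h smallest
    weights = np.zeros(l)
    weights[keep] = 1.0
    return weights

def lts_cost(residuals, trim):
    """Sum of trimmed squared errors: only retained residuals contribute."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(lts_weights(r, trim) * r ** 2))
```

With ten residuals and a trimming fraction of 0.25, this rule retains eight of them, so the two largest squared residuals receive zero weight.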

The flowchart of the proposed LTS-LK algorithm is shown in Figure 1. Note that the linearized formulation (18) may be regarded as a linear weighted least squares regression problem. In practice, inverting the Hessian matrix in six-dimensional space may pose some numerical difficulties. For higher dimensions, a recursive weighted least squares algorithm can be applied to solve this optimization problem. Here, we compute the inverse directly.
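The weighted least squares view noted above can be sketched as follows. This is a hedged illustration rather than the code used in this study: `lts_lk_increment` is a hypothetical name, the steepest-descent rows $\nabla I \, \partial W / \partial \mathbf{p}$ are assumed to be precomputed as an $l \times 6$ array, and the number of retained pixels is assumed to be $h = \lfloor l(1 - \alpha) \rfloor + 1$.

```python
import numpy as np

def lts_lk_increment(sd, errors, trim):
    """One LTS-weighted Gauss-Newton increment (a sketch).

    sd     : (l, 6) steepest-descent rows, grad(I) * dW/dp per pixel
    errors : (l,) residuals, template minus warped image
    trim   : trimming percentage as a fraction in [0, 1)
    """
    l = errors.size
    h = min(l, int(np.floor(l * (1.0 - trim))) + 1)      # assumed choice of h
    w = np.zeros(l)
    w[np.argsort(errors ** 2, kind="stable")[:h]] = 1.0  # keep h smallest
    hessian = sd.T @ (w[:, None] * sd)                   # weighted Hessian
    rhs = sd.T @ (w * errors)
    # Solving the 6 x 6 system avoids forming the explicit inverse.
    return np.linalg.solve(hessian, rhs)
```

The sketch solves the linear system with `np.linalg.solve`; `np.linalg.inv(hessian) @ rhs` would mirror the direct inverse more literally but is generally less stable numerically.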

4. Experimental Simulation

In this section, we compare the performance of the standard LK and LTS-LK algorithms for object tracking, with particular emphasis on robustness against outliers. The simulation programs are implemented in Borland C++ Builder 6.0 and run on a platform with Microsoft Windows 7, an Intel Core 2 Quad CPU, and 4 GB of RAM.

To demonstrate the robustness of the proposed algorithm, we present three object-tracking experiments, consisting of two “book-tracking” video sequences and one “face-tracking” video sequence. In Example 1, the frames after the 100th of the “book” sequence are corrupted by salt-and-pepper noise, in which a corrupted pixel has an intensity of 0 or 255. In Example 2, we add another type of corruption, a severe occlusion of about 1/3 of the template area at the 35th frame. Example 3 shows the same situation on face tracking with occlusion at the 240th frame. The trimming percentages used in our simulations are 5%, 20%, and 35% in the three examples, respectively. In the simulations, we set the maximal number of iterations to 20 and the minimal change of the warping parameters to 0.0001. The iteration stops when either of the two conditions is met.
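The two stopping conditions can be sketched as a simple loop. The function `iterate_warp` and the callable `increment_fn` are hypothetical names, and the "minimal change" is interpreted here, as an assumption, as the Euclidean norm of the parameter increment.

```python
import numpy as np

def iterate_warp(p0, increment_fn, max_iter=20, min_change=1e-4):
    """Repeat p <- p + delta_p until either stopping condition holds.

    `increment_fn(p)` is a hypothetical callable returning delta_p.
    Stops after `max_iter` iterations or once ||delta_p|| < min_change.
    Returns the final parameters and the number of iterations used.
    """
    p = np.asarray(p0, dtype=float).copy()
    iteration = 0
    for iteration in range(1, max_iter + 1):
        delta_p = increment_fn(p)
        p += delta_p
        if np.linalg.norm(delta_p) < min_change:
            break
    return p, iteration
```

Any routine that maps the current parameters to an increment, such as a standard or trimmed Gauss-Newton step, could be passed as `increment_fn`.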

Example 1. The “book” sequence contains 250 frames of size 320 × 240, in which the object to be tracked is the book. In this experiment, the initial bounding box is 60 × 80 pixels and covers about 6.25% of the frame. Simulation results are shown in Figure 2. The first and second rows show the results of the standard LK and the proposed LTS-LK algorithms, respectively. To demonstrate the robustness of the LTS-LK algorithm, we randomly corrupt the video sequence with 20% noise, about 15,360 pixels per frame, in the frames after the 100th frame. As observed, the standard LK algorithm fails to track the object in the 160th frame (the 3rd column), whereas our LTS-LK algorithm can effectively track the “book” object.
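The corruption level in this example can be reproduced with a short sketch. The helper `add_salt_and_pepper` is a hypothetical name, the frame is a placeholder gray image rather than the actual video data, and assigning 0 and 255 with equal probability is an assumption, since the split between the two values is not stated.

```python
import numpy as np

def add_salt_and_pepper(frame, fraction, rng):
    """Return a copy of `frame` with `fraction` of its pixels set to 0 or 255."""
    noisy = frame.copy()
    n_corrupt = int(round(fraction * frame.size))
    # Pick distinct pixel positions, then assign 0 or 255 (assumed equally likely).
    idx = rng.choice(frame.size, size=n_corrupt, replace=False)
    values = rng.choice(np.array([0, 255], dtype=frame.dtype), size=n_corrupt)
    noisy.flat[idx] = values
    return noisy

frame = np.full((240, 320), 128, dtype=np.uint8)   # placeholder gray frame
noisy = add_salt_and_pepper(frame, 0.20, np.random.default_rng(0))
n_changed = int(np.count_nonzero(noisy != frame))  # 20% of 320 * 240 pixels
box_share = (60 * 80) / (320 * 240)                # bounding box share of frame
```

Here `n_changed` equals 15,360 and `box_share` equals 0.0625, matching the pixel count and the 6.25% figure quoted in the example.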

Example 2. This experiment deals with severe occlusions in another “book” sequence containing 60 frames of size 320 × 240. The initial bounding box is 50 × 70 pixels, covering about 4.5% of the frame. The results are shown in Figure 3. As can be seen, the area of the LK bounding box shrinks to 50% of its original size at Frame 25 once the occlusion appears. This is because the standard LK algorithm is based on the least-squares principle and is therefore sensitive to outliers. In the second row, we can see that the proposed LTS-LK algorithm produces much better results on this sequence when the “book” object suffers from severe occlusions. The mean squared error (MSE) of the tracked area from the LK algorithm and the trimmed MSE from the LTS-LK algorithm at the 35th frame are 8.95 and 1.76, respectively. Consequently, the LTS-LK algorithm demonstrates significant robustness under severe occlusion.

Example 3. This experiment deals with severe occlusions in a “face-tracking” sequence containing 350 frames of size 320 × 240. The results are shown in Figure 4. In the LK algorithm, a numerical error occurs in the parameter update at the 240th frame, which means that the standard LK algorithm diverges. Therefore, the parameters are no longer updated after the 240th frame. As observed, the LK algorithm fails after the 240th frame, whereas our proposed algorithm can effectively track the “face” object. Again, the proposed LTS-LK algorithm is quite robust against severe occlusion.

The illustrative examples above show that, when the LTS approach is applied to object tracking problems, the proposed LTS-LK model is robust against outliers. This is achieved by choosing an appropriate trimming percentage. Unfortunately, there is no systematic way to select a proper trimming percentage for a general object tracking problem. The choice is usually problem dependent.

5. Conclusion

The main issue addressed in this paper is robustness against outliers caused by noise and occlusion in video object tracking problems. The proposed robust algorithm adopts the LTS scheme for the LK algorithm. The standard approach uses the SSE minimization criterion to search for the optimal parameters; however, parameter estimation based on the least squares criterion is known to be vulnerable to outliers. Experimental results show that the proposed LTS-LK algorithm outperforms the standard LK algorithm in the presence of such outliers. The main contribution of this paper is the use of the robust LTS criterion, instead of the standard least squares criterion, in the search for the LK parameters, which provides robust object tracking.

Acknowledgment

The research reported here was supported by the National Science Council, Taiwan, under Grant no. NSC 101-2221-E-214-074.