Mathematical Problems in Engineering

Volume 2018, Article ID 8284123, 8 pages

https://doi.org/10.1155/2018/8284123

## Reliable Recognition of Partially Occluded Objects with Correlation Filters

^{1}Department of Mathematics, Chelyabinsk State University, Chelyabinsk, Russia

^{2}Department of Computer Science, CICESE, Carretera Ensenada-Tijuana 3918, 22860 Ensenada, BC, Mexico

^{3}Facultad de Ciencias, Universidad Autonoma de Baja California, Carretera Tijuana-Ensenada, No. 3917, 22860 Ensenada, BC, Mexico

Correspondence should be addressed to Vitaly Kober; vkober@cicese.mx

Received 24 August 2017; Revised 5 March 2018; Accepted 27 March 2018; Published 14 May 2018

Academic Editor: Paolo Lonetti

Copyright © 2018 Alexey Ruchay et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Design of conventional correlation filters requires explicit knowledge of the appearance and shape of a target object, so the performance of correlation filters is significantly affected by changes in the appearance of the object in the input scene. In particular, the performance of correlation filters worsens when objects to be recognized are partially occluded by other objects and the input scene contains a cluttered background and noise. In this paper, we propose a new algorithm for the design of a system consisting of a set of adaptive correlation filters for recognition of partially occluded objects in noisy scenes. Since the input scene may contain different fragments of the target, false objects, and background to be rejected, the system is designed so as to guarantee equally high correlation peaks corresponding to parts of the target in the scenes. The key points of the system are as follows: (i) it consists of a bank of composite optimum filters, which yield the best performance for different parts of the target; (ii) it includes a fragmentation of the target into a given number of parts in the training stage to provide equal intensity responses of the system for each part of the target. With the help of computer simulation, the performance of the proposed algorithm for recognition of partially occluded objects is compared with that of common algorithms in terms of objective metrics.

#### 1. Introduction

Recognition and tracking of objects in observed scenes degraded by additive noise, in the presence of cluttering backgrounds, geometric modifications such as pose changing and scaling, nonuniform illumination, and eventual object occlusions are challenges that a modern recognition algorithm must solve. In this paper, we deal with partial occlusion of objects to be recognized, in other words, when only some parts of the target are visible. Recent works have paid much attention to this problem [1–5].

Nowadays, object recognition based on correlation filters receives much research interest due to its high impact in real-life activities, such as video surveillance, human-computer interaction, robotics, biometrics, and target tracking [6–12]. Correlation filtering is a powerful technique for object recognition because of its ability to perform two essential tasks simultaneously: detection of a target within an observed scene and computation of the exact position of the detected object [13, 14]. Another advantage of correlation filters is their ability to detect multiple objects in a single scene simultaneously [15–17].
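To make the detection-and-location property concrete, the following minimal sketch correlates a scene with a plain matched template through the FFT and reads the positions of two copies of the target from the two largest correlation values. The template, the scene size, and the helper name `correlate_fft` are illustrative assumptions and do not come from the cited works.

```python
import numpy as np

def correlate_fft(scene, template):
    """Circular cross-correlation of a scene with a template via the FFT."""
    S = np.fft.fft2(scene)
    T = np.fft.fft2(template, s=scene.shape)  # zero-pad the template to the scene size
    return np.real(np.fft.ifft2(S * np.conj(T)))

rng = np.random.default_rng(0)
template = rng.uniform(0.5, 1.0, size=(8, 8))  # hypothetical textured target

# Place two copies of the target on an empty 64x64 scene.
scene = np.zeros((64, 64))
positions = [(10, 12), (40, 30)]
for r, c in positions:
    scene[r:r + 8, c:c + 8] = template

corr = correlate_fft(scene, template)

# The two largest correlation values mark the top-left corners of both copies.
top2 = np.argsort(corr.ravel())[-2:]
rows, cols = np.unravel_index(top2, corr.shape)
peaks = sorted((int(i), int(j)) for i, j in zip(rows, cols))
print(peaks)
```

A single correlation plane therefore answers both questions at once: the peak height indicates whether a target is present, and the peak coordinates give its position.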

The performance of correlation pattern recognition may be improved either by discarding noise components from the output of a linear system [18] or by using an adaptive approach to the filter design [19]. The former approach is suitable for classification problems [20], whereas the latter is preferable for detection and tracking applications. For nonstationary noise such as a cluttered background, the statistical parameters of the noise are space-variant. The frequency response of a correlation filter is therefore locally adapted to the parameters estimated in small spatially homogeneous fragments of the input scene. The locally adaptive filter reduces location errors in a noisy environment, which is important for accurate target detection.

Conventional correlation filters without training may perform poorly when recognizing a target partially occluded by other objects [21], for example, a pedestrian partially hidden by a tree or a man wearing sunglasses. There are several proposals to treat partial occlusions with correlation filters [22–29]. All of them use independent parts of the target to synthesize a composite correlation filter. However, no study was carried out on an augmented division of the object into parts.

Campos et al. [22] studied the performance of several correlation filters for discriminating occluded objects. They compared the phase-only filter, the inverse filter, and the trade-off filter between the minimum variance and the minimum average correlation energy. All of these filters enhance the edges of the object in order to achieve good discrimination. The target is divided uniformly into seven parts without any justification. Moreover, the performance of the filters in the presence of noise and geometric distortions was not analyzed.

Adaptive correlation filters for recognition of fragmented objects embedded into real-life scenes in the presence of additive noise were presented in [23, 24]. The target is divided into independent fragments for the design of an adaptive filter. It was assumed that at least one of the fragments corresponds to a visible part of the target embedded into the scene. Additionally, the algorithm uses available contour and texture information to improve recognition of partially occluded objects. Recent work [28] improves recognition of partially occluded objects embedded into a known cluttered background with an adaptive composite filter. The proposed filters are able to discriminate noisy similar objects even when only about 19% of the target information is available.

Khoury et al. [30] developed several optimal correlation algorithms for detection of obscured targets embedded into a disjoint background. It was noted that the boundary between obscuring and obscured objects makes a significant contribution to the correlation peak. Therefore, blurring of the boundaries was utilized for detection of obscured targets.

Recently, masked correlation filters (MCFs) were designed [31] to handle partial occlusions in face images. MCFs utilize prior knowledge of the location of partial occlusions in test images as well as the zero-aliasing correlation filtering (ZACF) [27]. Since in real-life applications the location of partial occlusions is usually unknown, the filters cannot be widely used.

Finally, note that, in the design of common correlation-based methods, the target is arbitrarily divided into a number of parts, which are used for the design of composite filters. One of the motivations of this research is to determine a reasonable way for the target division to guarantee a high level of the overall system performance. In order to obtain a good recognition of each target part in noisy input scenes, the optimum correlation filters are also utilized [32].

The paper is organized as follows. Section 2 recalls the design of composite correlation filters. Section 3 describes the proposed algorithm for target fragmentation and robust recognition of partially occluded objects with multiple composite filters. Section 4 presents, with the help of computer simulation, the performance of the proposed algorithms in terms of detection efficiency. The results are discussed and compared with those obtained with common correlation filters. Finally, Section 5 presents our conclusions.

#### 2. Composite Correlation Filters

We are interested in the design of a correlation filter that is able to recognize a fragment of the target embedded into a disjoint background in a scene corrupted with additive noise. The designed filter should also be able to recognize geometrically distorted versions of the target. Let $\mathcal{T} = \{t_i(x,y);\ i = 1, \ldots, N\}$ be an image set containing geometrically distorted versions of the target $t(x,y)$. The input scene $f(x,y)$ is assumed to be composed of the target embedded into a disjoint background $b(x,y)$ at unknown coordinates $(x_0, y_0)$, and the scene is corrupted with additive noise $n(x,y)$, as follows:

$$f(x,y) = t(x - x_0, y - y_0) + b(x,y)\,\bar{w}(x - x_0, y - y_0) + n(x,y), \tag{1}$$

where $\bar{w}(x,y)$ is a binary function defined as zero inside the target area and unity elsewhere. The optimum filter for detecting the target, in terms of the maximum of the signal-to-noise ratio (SNR) and the minimum variance of location errors (LE), is the generalized matched filter (GMF) [13], whose frequency response is given by

$$H(u,v) = \frac{\left[T(u,v) + \mu_b \bar{W}(u,v)\right]^{*}}{P_b(u,v) \otimes \left|\bar{W}(u,v)\right|^{2} + P_n(u,v)}, \tag{2}$$

where $T(u,v)$ and $\bar{W}(u,v)$ are the Fourier transforms of $t(x,y)$ and $\bar{w}(x,y)$, respectively; $\mu_b$ is the mean value of the background $b(x,y)$; and $P_b(u,v)$ and $P_n(u,v)$ denote the power spectral densities of $b_0(x,y) = b(x,y) - \mu_b$ and $n(x,y)$, respectively. The symbol $\otimes$ denotes convolution, and the asterisk denotes complex conjugation.
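The scene model and the GMF can be illustrated with a small numerical sketch. It assumes a white background and white noise (so both spectral densities are flat), circular shifts, and the unnormalized DFT convention of NumPy; the sizes, noise levels, and scale factors are illustrative choices rather than values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
size, tsize = 64, 16
y0, x0 = 20, 33                        # target position (unknown to the filter)
mu_b, sigma_b, sigma_n = 0.5, 0.05, 0.01

# Target t(x, y) and the inverse support w_bar, both placed at the frame origin.
t = np.zeros((size, size))
t[:tsize, :tsize] = rng.uniform(0.6, 1.0, size=(tsize, tsize))
w_bar = np.ones((size, size))
w_bar[:tsize, :tsize] = 0.0            # zero inside the target area, unity elsewhere

def shift(img):
    """Circularly move an image from the origin to (y0, x0)."""
    return np.roll(img, shift=(y0, x0), axis=(0, 1))

# Scene model: shifted target + background outside the target + additive noise.
b = mu_b + sigma_b * rng.standard_normal((size, size))
n = sigma_n * rng.standard_normal((size, size))
f = shift(t) + b * shift(w_bar) + n

# Flat spectral densities of the zero-mean background and of the noise
# (assumed DFT scaling: E|B0(k)|^2 = size^2 * sigma_b^2).
P_b = np.full((size, size), size**2 * sigma_b**2)
P_n = np.full((size, size), size**2 * sigma_n**2)

T = np.fft.fft2(t)
W = np.fft.fft2(w_bar)

# Frequency-domain circular convolution of P_b with |W|^2, evaluated as a
# spatial-domain product; the division by size^2 matches the DFT scaling above.
conv = np.real(np.fft.fft2(np.fft.ifft2(P_b) * np.fft.ifft2(np.abs(W) ** 2))) / size**2
H = np.conj(T + mu_b * W) / (conv + P_n)

corr = np.real(np.fft.ifft2(np.fft.fft2(f) * H))
peak = tuple(int(i) for i in np.unravel_index(np.argmax(corr), corr.shape))
print(peak)
```

Under these assumptions the denominator is constant, so the filter reduces to a matched filter for the target plus the mean background outside it, and the correlation maximum falls at the target position.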

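Let $h_i(x,y)$ be the impulse response of a GMF constructed for the $i$th available view of the target $t_i(x,y)$. Let $\mathcal{H} = \{h_i(x,y);\ i = 1, \ldots, N\}$ be the set of all GMF impulse responses constructed for all training images $t_i(x,y) \in \mathcal{T}$. Additionally, let $\mathcal{S} = \{s_i(x,y);\ i = 1, \ldots, M\}$ be an image set containing unwanted patterns to be rejected. In order to recognize all target views in $\mathcal{T}$ and reject the false patterns in $\mathcal{S}$, we synthesize a composite correlation filter by combining the optimal filter templates contained in $\mathcal{H}$ with the patterns in $\mathcal{S}$. The filter $p(x,y)$ can be constructed as follows [33]:

$$p(x,y) = \sum_{i=1}^{N} a_i h_i(x,y) + \sum_{i=N+1}^{N+M} a_i s_{i-N}(x,y), \tag{3}$$

where the coefficients $\{a_i;\ i = 1, \ldots, N+M\}$ are chosen to satisfy prespecified output values for each pattern in $\mathcal{U} = \mathcal{T} \cup \mathcal{S}$. Using vector-matrix notation, we denote by $\mathbf{R}$ a matrix with $N+M$ columns and $d$ rows, where $d$ is the number of image pixels and each column is the vector version of an element of $\mathcal{H} \cup \mathcal{S}$. Let $\mathbf{a} = [a_1, \ldots, a_{N+M}]^{T}$ be a vector of coefficients. Thus, (3) can be rewritten as

$$\mathbf{p} = \mathbf{R}\mathbf{a}. \tag{4}$$

Let us denote by

$$\mathbf{u} = [\underbrace{1, \ldots, 1}_{N}, \underbrace{0, \ldots, 0}_{M}]^{T} \tag{5}$$

the desired responses to the training patterns and denote by $\mathbf{Q}$ the matrix whose columns are the elements of $\mathcal{U}$. The response constraints can be expressed as

$$\mathbf{u} = \mathbf{Q}^{+}\mathbf{p}, \tag{6}$$

where superscript $+$ denotes conjugate transpose. Substituting (4) into (6), we obtain

$$\mathbf{a} = \left(\mathbf{Q}^{+}\mathbf{R}\right)^{-1}\mathbf{u}. \tag{7}$$

Finally, substituting (7) into (4), the solution for the composite filter is given by

$$\mathbf{p} = \mathbf{R}\left(\mathbf{Q}^{+}\mathbf{R}\right)^{-1}\mathbf{u}. \tag{8}$$

Note that the value of the correlation peak obtained with (8) is expected to be close to unity for true-class objects and close to zero for false-class objects.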
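The constrained solution can be checked numerically. The sketch below assumes the standard synthetic discriminant function form, in which the filter is a linear combination of the columns of a basis matrix `R` and the coefficients are obtained from the response constraints on the columns of a pattern matrix `Q`. Random vectors stand in for real images and GMF templates, and all names and sizes are hypothetical.

```python
import numpy as np

def composite_filter(R, Q, u):
    """Return p = R (Q^+ R)^{-1} u, the composite filter in vector form."""
    a = np.linalg.solve(Q.conj().T @ R, u)  # coefficients a = (Q^+ R)^{-1} u
    return R @ a

rng = np.random.default_rng(2)
d, N, M = 256, 3, 2                   # pixels, true-class views, false-class patterns

# Hypothetical stand-ins: random vectors instead of real images and templates.
targets = rng.standard_normal((d, N))         # vectorized target views
templates = rng.standard_normal((d, N))       # vectorized filter templates
false_patterns = rng.standard_normal((d, M))  # vectorized patterns to reject

R = np.hstack([templates, false_patterns])    # basis used to build the filter
Q = np.hstack([targets, false_patterns])      # patterns with constrained responses
u = np.concatenate([np.ones(N), np.zeros(M)]) # unity for targets, zero for false patterns

p = composite_filter(R, Q, u)
responses = Q.conj().T @ p
print(np.round(responses, 6))
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly; the printed responses reproduce the prescribed values, unity for the target views and zero for the rejected patterns.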

The MACE filter [34] minimizes the average correlation energy of the correlation outputs for the training images while simultaneously satisfying the correlation peak constraints at the origin. As a result, the correlation planes are close to zero everywhere except at the location of a trained object, where an intense peak is produced. In the Fourier domain, the MACE filter can be expressed in vector form as follows:

$$\mathbf{h}_{\mathrm{MACE}} = \mathbf{D}^{-1}\mathbf{X}\left(\mathbf{X}^{+}\mathbf{D}^{-1}\mathbf{X}\right)^{-1}\mathbf{u}, \tag{9}$$

where the columns of $\mathbf{X}$ are the vectorized Fourier transforms of the training images, $\mathbf{u}$ is the vector of desired peak responses, and the diagonal matrix $\mathbf{D}$ contains along its diagonal the average power spectrum of the training images (i.e., the average of the squared magnitudes of the columns of $\mathbf{X}$).

The Optimal Trade-off Synthetic Discriminant Function (OTSDF) filter [14] is a correlation filter that is similar to the MACE filter. In the OTSDF formulation, matrix $\mathbf{D}$ is replaced with $\alpha\mathbf{D} + \sqrt{1-\alpha^{2}}\,\mathbf{I}$, where $\mathbf{I}$ is an identity matrix and $0 \le \alpha \le 1$. The inclusion of the identity matrix improves noise tolerance.
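Both filters can be sketched with one routine. The code below assumes the common trade-off form in which the diagonal average power spectrum D is replaced by alpha*D + sqrt(1 - alpha^2)*I, so that alpha = 1 gives the MACE filter; this parametrization, the function name `otsdf_filter`, and the random training images are assumptions made for illustration. The peak constraints are imposed in the frequency domain, that is, up to the DFT scale factor.

```python
import numpy as np

def otsdf_filter(images, u, alpha=1.0):
    """Frequency-domain OTSDF filter; alpha = 1 reduces to the MACE filter.

    images: array of shape (K, rows, cols); u: desired peak values, length K.
    """
    K = images.shape[0]
    # Columns of X are the vectorized 2-D Fourier transforms of the images.
    X = np.fft.fft2(images).reshape(K, -1).T
    D = np.mean(np.abs(X) ** 2, axis=1)           # diagonal of the average power spectrum
    # Assumed trade-off: diagonal of alpha*D + sqrt(1 - alpha^2)*I.
    T = alpha * D + np.sqrt(1.0 - alpha**2)
    Tinv_X = X / T[:, None]                       # T^{-1} X for a diagonal T
    h = Tinv_X @ np.linalg.solve(X.conj().T @ Tinv_X, u.astype(complex))
    return h, X

rng = np.random.default_rng(3)
images = rng.uniform(size=(3, 16, 16))            # hypothetical training images
u = np.ones(3)

h_mace, X = otsdf_filter(images, u, alpha=1.0)
h_otsdf, _ = otsdf_filter(images, u, alpha=0.9)
print(np.round(np.abs(X.conj().T @ h_mace), 6))
print(np.round(np.abs(X.conj().T @ h_otsdf), 6))
```

Because D is diagonal, its inverse is applied as an element-wise division, which keeps the cost linear in the number of pixels apart from the small K-by-K system.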

The discrimination capability (DC) is a measure of the ability of the filter to distinguish a target from unwanted objects; it is defined as follows [33]:

$$\mathrm{DC} = 1 - \frac{|c_b|^{2}}{|c_t|^{2}}, \tag{10}$$

where $c_b$ is the value of the maximum correlation sidelobe in the background area and $c_t$ is the value of the correlation peak generated by the target. A DC value close to unity indicates that the filter has a good capability to distinguish between the target and any false object. Negative values of the DC indicate that the filter is unable to detect the target. If the obtained DC is greater than a prespecified threshold, the target is considered detected; otherwise, the target is rejected.
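As a minimal sketch, the DC can be computed from a correlation plane and a boolean mask of the target area, assuming the usual form DC = 1 - |c_b|^2 / |c_t|^2. The toy correlation plane, the mask, and the threshold value of 0.5 are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def discrimination_capability(corr, target_mask):
    """DC = 1 - |c_b|^2 / |c_t|^2 for a correlation plane and a target-area mask."""
    power = np.abs(corr) ** 2
    peak_power = power[target_mask].max()       # squared correlation peak of the target
    sidelobe_power = power[~target_mask].max()  # squared maximum sidelobe in the background
    return 1.0 - sidelobe_power / peak_power

# Hypothetical correlation plane: peak 1.0 at the target, largest sidelobe 0.5.
corr = np.full((8, 8), 0.1)
corr[3, 4] = 1.0
corr[6, 1] = 0.5
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:6] = True                           # assumed target area

dc = discrimination_capability(corr, mask)
detected = dc > 0.5                             # illustrative threshold only
print(dc, detected)
```

A sidelobe that exceeds the target peak makes the ratio larger than one and the DC negative, which corresponds to the failure case described above.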

#### 3. Recognition of Partially Occluded Objects

In this section, we describe the proposed algorithm for recognition of partially occluded objects using a new target fragmentation procedure and a bank of composite correlation filters. To improve the detection performance of correlation filters, an adaptive approach to the filter design is utilized [35].

The proposed algorithm for automatic fragmentation of the target into parts is shown in Figure 1.