EURASIP Journal on Image and Video Processing
Volume 2008 (2008), Article ID 374528, 19 pages
doi:10.1155/2008/374528
Research Article

Integrated Detection, Tracking, and Recognition of Faces with Omnivideo Array in Intelligent Environments

Kohsia S. Huang and Mohan M. Trivedi

Computer Vision and Robotics Research (CVRR) Laboratory, University of California, San Diego, 9500 Gilman Drive MC 0434, La Jolla, CA 92093, USA

Received 1 February 2007; Revised 11 August 2007; Accepted 25 November 2007

Academic Editor: Maja Pantic

Copyright © 2008 Kohsia S. Huang and Mohan M. Trivedi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We present a multilevel system architecture for intelligent environments equipped with omnivideo arrays. To gain unobtrusive human awareness, such environments need real-time 3D human tracking as well as robust video-based face detection, tracking, and recognition algorithms. We first propose a multiprimitive face detection and tracking loop that crops face videos as the front end of our face recognition algorithm. Both skin-tone and elliptical detections are used for robust face searching, and view-based face classification is applied to the candidates before the Kalman filters for face tracking are updated. For video-based face recognition, we propose three decision rules on the facial video segments. The majority rule and the discrete HMM (DHMM) rule accumulate single-frame face recognition results, while the continuous density HMM (CDHMM) rule works directly with the PCA facial features of the video segment to make an accumulated maximum likelihood (ML) decision. The experiments demonstrate the robustness of the proposed face detection and tracking scheme and of the three streaming face recognition schemes, with the CDHMM rule reaching 99% accuracy. We then experiment on system interactions with a single person and with groups of people through the integrated layers of activity awareness. We also discuss the speech-aided incremental learning of new faces.
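
As an illustration of the simplest of the three decision rules, the majority rule can be sketched as a vote over the single-frame recognition results of one facial video segment. This is a minimal sketch, not the authors' implementation: the function name `majority_rule`, the use of string identity labels, and the tie-breaking behavior (the label seen first wins) are assumptions made for the example.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_rule(frame_labels: Sequence[Hashable]) -> Hashable:
    """Return the identity label that occurs most often in a video segment.

    `frame_labels` holds one single-frame recognition result per frame
    (hypothetical representation; the abstract does not specify one).
    """
    if not frame_labels:
        raise ValueError("segment has no frame-level decisions")
    counts = Counter(frame_labels)
    # most_common(1) returns [(label, count)]; among equal counts,
    # the label encountered first in the segment is returned.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Five frames of one segment, with two frames misclassified.
    segment = ["alice", "bob", "alice", "carol", "alice"]
    print(majority_rule(segment))
```

Accumulating decisions over a segment in this way lets occasional single-frame errors be outvoted, which is the motivation the abstract gives for segment-level rules.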