Advances in Optical Technologies

Volume 2008, Article ID 583687, 8 pages

http://dx.doi.org/10.1155/2008/583687

## Benefits of Using Decorrelated Color Information for Face Segmentation/Tracking

Department of Electronic Engineering, National University of Ireland, Galway, Nun's Island, Galway, Ireland

Received 1 October 2007; Accepted 18 February 2008

Academic Editor: María Millan

Copyright © 2008 Mircea C. Ionita and Peter Corcoran. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We analyze in this paper the benefits that can be derived from employing color image alignment techniques in the context of face segmentation or tracking based on texture template matching, where texture is defined as the patch of intensities. By making full use of the decorrelated color information, improvements in the accuracy of the segmentation are demonstrated. The aim is to make the face segmentation algorithm more robust to differences between images caused by various image acquisition devices or settings, or by variations in the ambient illumination conditions.

#### 1. Introduction

The use of color information is becoming increasingly important in today's image processing applications as inexpensive color image acquisition devices become widely available. Color image processing permits a richer image representation, which is expected to lead to better results.

We deal in this paper with the specific case of face segmentation which employs face modeling techniques. This can also be viewed in the more general context of deformable template matching, using for this purpose a statistical model of shape variations. Extensive work has been carried out in the area of face modeling and face segmentation using statistical models [1–5]. These techniques were initially developed for gray level images. Extensions were later proposed for color images [6, 7]. Some advantages of using the color extension have been demonstrated, but mostly for a controlled image acquisition environment. Processing color information can thus be challenging, especially when designing more general applications that are supposed to work under unconstrained image acquisition conditions. We demonstrate in this paper some positive results when using the decorrelated color information for applications which include face segmentation and face tracking, intended to work under no predefined constraints.

The outline of this paper is as follows. In Section 2, we briefly describe several decorrelated color spaces in terms of their transforms from the common RGB color space; we also compare these color spaces in terms of how well they decorrelate the image channels on a series of test images. In Section 3, a face segmentation method is described, based on a statistical shape model and a fixed face texture template. The limitations of the application described in Section 3 are addressed in Section 4, which introduces texture alignment and color transfer techniques in order to adapt the texture template to the color distribution of the current image. These operations are facilitated by converting the texture data to one of the decorrelated color spaces presented in Section 2. In Section 5, we present our experiments, performed on a general face image database built as a mixture of images gathered mostly from various standard face image and video databases, together with a set of comparative results. Finally, in Section 6 we draw the conclusions of our work.

#### 2. Image Decorrelation with Respect to Color Information

Colorwise image decorrelation is useful for applying color image processing operations independently on each image channel.

#### 2.1. Karhunen-Loève Transform

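The *Karhunen-Loève transform (KLT)* is optimal in terms of energy compaction and mean-squared error minimization for a truncated representation. Applied to a color image, the KLT creates image basis vectors which are orthogonal, and it thus achieves complete decorrelation of the image channels [8–12] as follows:

$$\mathbf{y} = A^{T}\mathbf{x}, \tag{1}$$

where $\mathbf{x} = [R, G, B]^{T}$ contains the image color signals and $\boldsymbol{\mu} = E\{\mathbf{x}\}$ is their mean; $E\{\cdot\}$ is the mathematical expectation. $C$ is the covariance matrix of the image color signals:

$$C = [c_{ij}], \qquad c_{ij} = E\{(x_i - \mu_i)(x_j - \mu_j)\}, \tag{2}$$

with $i, j \in \{1, 2, 3\}$. $A$ is the transformation matrix formed by the eigenvectors $\mathbf{a}_k$ of the covariance matrix $C$:

$$A = [\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3], \qquad C\,\mathbf{a}_k = \lambda_k \mathbf{a}_k. \tag{3}$$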

Yet, KLT is data dependent, meaning that the transformation matrix must be recalculated for each set of data (e.g., each new image).
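As an illustration of this data dependence, the following minimal sketch computes a per-image KLT with NumPy. The function name `klt_decorrelate` and the synthetic test image are hypothetical and serve only this example; the projection is applied without subtracting the mean, which would merely shift the result and does not affect the decorrelation.

```python
import numpy as np

def klt_decorrelate(image):
    """Rotate the color channels of an H x W x 3 image onto the
    eigenvectors of their 3 x 3 covariance matrix."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    cov = np.cov(pixels, rowvar=False)        # 3 x 3 channel covariance
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigendecomposition of a symmetric matrix
    order = np.argsort(eigvals)[::-1]         # strongest component first
    A = eigvecs[:, order]                     # columns are the eigenvectors
    rotated = pixels @ A                      # y = A^T x for every pixel
    return rotated.reshape(image.shape), A

# Hypothetical test image with strongly correlated channels.
rng = np.random.default_rng(0)
base = rng.random((32, 32, 1))
demo_image = base * np.array([1.0, 0.8, 0.5]) + 0.05 * rng.random((32, 32, 3))

klt_image, A = klt_decorrelate(demo_image)
klt_cov = np.cov(klt_image.reshape(-1, 3), rowvar=False)  # expected to be diagonal
```

The matrix `A` returned here is valid only for `demo_image`; a different image yields a different covariance matrix and therefore a different transform.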

#### 2.2. $I_1I_2I_3$ Color Space

An interesting color space is $I_1I_2I_3$, proposed by Ohta et al. [13], which realizes a statistical minimization of the interchannel correlations (decorrelation of the $RGB$ components) for natural images. The conversion from $RGB$ to $I_1I_2I_3$ is given by the simple linear transformation in (4) as follows:

$$I_1 = \frac{R + G + B}{3}, \qquad I_2 = \frac{R - B}{2}, \qquad I_3 = \frac{2G - R - B}{4}. \tag{4}$$

$I_1$ stands as the achromatic (intensity) component, while $I_2$ and $I_3$ are the chromatic components. We remark that the simple numeric transformation from $RGB$ to $I_1I_2I_3$ enables simple and efficient transformation of datasets between these two color spaces.

$I_1I_2I_3$ was designed as an approximation of the KLT of the $RGB$ data, to be used for region segmentation on color images. As the transformation to $I_1I_2I_3$ represents a good approximation of the KLT for a large set of natural images, the resulting color channels are almost completely decorrelated.
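Because the transform of Ohta et al. is a fixed 3 x 3 matrix, converting an image in either direction is a single matrix product. The sketch below assumes the standard Ohta coefficients $I_1 = (R+G+B)/3$, $I_2 = (R-B)/2$, $I_3 = (2G-R-B)/4$; the function names are hypothetical.

```python
import numpy as np

# Rows give I1, I2, I3 as linear combinations of R, G, B (Ohta et al.).
RGB_TO_I1I2I3 = np.array([
    [1 / 3, 1 / 3, 1 / 3],
    [1 / 2, 0.0, -1 / 2],
    [-1 / 4, 1 / 2, -1 / 4],
])

def rgb_to_i1i2i3(image):
    """Convert an (..., 3) RGB array to I1I2I3."""
    return image.astype(np.float64) @ RGB_TO_I1I2I3.T

def i1i2i3_to_rgb(image):
    """Invert the linear transform to recover RGB."""
    return image @ np.linalg.inv(RGB_TO_I1I2I3).T

gray_pixel = rgb_to_i1i2i3(np.array([0.5, 0.5, 0.5]))  # chromatic components vanish
red_pixel = rgb_to_i1i2i3(np.array([1.0, 0.0, 0.0]))
```

A neutral gray pixel maps to zero on both chromatic channels, which is the behavior expected of an intensity/chromaticity split.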

In the previous work of Ohta et al., the discriminating power of linear combinations of $R$, $G$, and $B$ was tested on eight different color scenes. The selected linear combinations were gathered such that they could successfully be used for segmenting important (large area) regions of an image, based on a histogram threshold. It was found that one group of the linear combinations had all positive weights, corresponding mainly to an intensity component which is best approximated by $I_1$; another group showed opposite signs for the weights of $R$ and $B$, representing the difference between the $R$ and $B$ components, which is best approximated by $I_2$; finally, the remaining linear combinations could be approximated by $I_3$. Thus it was shown that the $I_1$, $I_2$, and $I_3$ components in (4) are effective for discriminating between different regions and that they are significant in this order [13]. Accordingly, the share of color features which are well discriminated decreases from the first to the third channel.

$I_1I_2I_3$ is also found in [14] to perform better than other color space implementations such as YIQ and CIELAB for segmentation of color images based on *Markov random field (MRF)* processing. In [15], the $I_1I_2I_3$ color space was used for color image segmentation based on an MRF model and simulated annealing, due to its effectiveness in terms of the quality of the segmentation and the reduced complexity of the transformation.

#### 2.3. $l\alpha\beta$ Color Space

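Assuming that the human visual system is ideal for processing natural scenes, Ruderman et al. [16] developed the $l\alpha\beta$ color space, which also minimizes the correlation between channels for natural images. The conversion from $RGB$ is realized by means of an initial transform to the $LMS$ cone space, followed by a conversion of the data to logarithmic space (used to reduce skewness):

$$\begin{bmatrix} L \\ M \\ S \end{bmatrix} = \begin{bmatrix} 0.3811 & 0.5783 & 0.0402 \\ 0.1967 & 0.7244 & 0.0782 \\ 0.0241 & 0.1288 & 0.8444 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}, \qquad \mathbf{L} = \log L, \quad \mathbf{M} = \log M, \quad \mathbf{S} = \log S.$$

Finally, the $l\alpha\beta$ data is obtained from

$$\begin{bmatrix} l \\ \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{3}} & 0 & 0 \\ 0 & \frac{1}{\sqrt{6}} & 0 \\ 0 & 0 & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & -2 \\ 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} \mathbf{L} \\ \mathbf{M} \\ \mathbf{S} \end{bmatrix}.$$

This color space has successfully been used in [17, 18] for image color transfer operations, which will be described in Section 4.1.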
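The two-stage conversion can be sketched as follows. The matrix coefficients are those commonly used with the color transfer method of [17]; the base-10 logarithm, the small clamping constant `eps`, and the function names are assumptions made for this illustration.

```python
import numpy as np

# RGB -> LMS cone space (coefficients as commonly used with [17]).
RGB_TO_LMS = np.array([
    [0.3811, 0.5783, 0.0402],
    [0.1967, 0.7244, 0.0782],
    [0.0241, 0.1288, 0.8444],
])
# log-LMS -> l-alpha-beta.
LOGLMS_TO_LAB = np.diag([1 / np.sqrt(3), 1 / np.sqrt(6), 1 / np.sqrt(2)]) @ np.array([
    [1.0, 1.0, 1.0],
    [1.0, 1.0, -2.0],
    [1.0, -1.0, 0.0],
])

def rgb_to_lalphabeta(image, eps=1e-6):
    """Convert an (..., 3) RGB array to l-alpha-beta."""
    lms = image.astype(np.float64) @ RGB_TO_LMS.T
    log_lms = np.log10(np.maximum(lms, eps))   # clamp to avoid log(0)
    return log_lms @ LOGLMS_TO_LAB.T

def lalphabeta_to_rgb(lab):
    """Invert the conversion (exact only where no clamping occurred)."""
    log_lms = lab @ np.linalg.inv(LOGLMS_TO_LAB).T
    return (10.0 ** log_lms) @ np.linalg.inv(RGB_TO_LMS).T

rng = np.random.default_rng(0)
demo_rgb = rng.uniform(0.1, 1.0, size=(8, 8, 3))
demo_lab = rgb_to_lalphabeta(demo_rgb)
restored = lalphabeta_to_rgb(demo_lab)
```

Unlike a purely linear transform, the logarithm makes the conversion nonlinear, so statistics gathered in this space are not simple linear functions of the $RGB$ statistics.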

#### 2.4. Comparison between the Different Color Image Representations

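The correlation between two image channels is given by

$$\rho_{ij} = \frac{E\{(x_i - \mu_i)(x_j - \mu_j)\}}{\sigma_i \sigma_j},$$

where $x_i$ and $x_j$ represent the $i$th and $j$th image channel signals, respectively (with $i, j \in \{1, 2, 3\}$, $i \neq j$), for a certain color image representation; $\mu_i$ and $\sigma_i$ denote the mean and the standard deviation of channel $i$.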

The total interchannel correlation is calculated as follows:

The correlation coefficients have been measured for several test images (see Figure 1) in the discussed color image representations using the above formulae [19]. Results are summarized in Table 1.

It can be observed that the $RGB$ representation presents a very high interchannel correlation, while the $I_1I_2I_3$ and $l\alpha\beta$ image representations significantly reduce this correlation. As stated above, the KLT, which is adapted to each particular image, achieves total decorrelation of the image channels.
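A measurement of this kind can be sketched with `numpy.corrcoef`, which returns the matrix of pairwise correlation coefficients between channels. The summary value computed by `mean_abs_offdiagonal` is only one possible way of condensing the matrix into a single number and is an assumption of this example, not necessarily the total correlation measure used for Table 1.

```python
import numpy as np

def interchannel_correlation(image):
    """Return the C x C matrix of correlation coefficients between
    the channels of an H x W x C image."""
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    return np.corrcoef(pixels, rowvar=False)

def mean_abs_offdiagonal(corr):
    """Hypothetical summary: mean absolute correlation over channel pairs."""
    upper = np.triu_indices(corr.shape[0], k=1)
    return float(np.abs(corr[upper]).mean())

# Synthetic image whose channels are exact linear functions of one another.
ramp = np.linspace(0.0, 1.0, 64).reshape(8, 8)
demo_image = np.stack([ramp, 2.0 * ramp + 1.0, -ramp], axis=-1)

corr = interchannel_correlation(demo_image)
summary = mean_abs_offdiagonal(corr)
```

Applying the same function to an image before and after conversion to a decorrelated color space gives a direct comparison of the kind reported above.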

#### 3. Face Segmentation Using Deformable Template Matching

Note that the term *texture*, frequently used in this paper, refers in the context of this work to the set of pixel intensities across an object, possibly after a suitable normalization.

#### 3.1. Statistical Shape Models

We are interested in designing a shape model robust to head pose variations. The shape is defined as the set of positions of some fiducial points on the face. The model is statistically built from a training dataset which contains image examples, annotated with a fixed set of landmark points. The sets of 2-D coordinates of the landmark points define the shapes inside the image frame. These shapes are aligned using the generalized *Procrustes analysis* [20], a technique for removing the differences in translation, rotation, and scale among the shapes of the training set. This defines the shapes in the normalized frame.

Let $N$ be the number of training examples. Each shape example is represented as a vector $\mathbf{s} \in \mathbb{R}^{2L}$ of concatenated coordinates of its points, where $L$ is the number of landmark points. *Principal components analysis (PCA)* is then applied to the set of aligned shape vectors, reducing the initial dimensionality of the data. It can be noted that PCA is very similar to KLT. In a geometric interpretation, KLT can be viewed as a rotation of the coordinate system, while for PCA, the rotation of the coordinate system is preceded by a shift of the origin to the mean point [21]. Shape variability is thus linearly modeled as a base (mean) shape plus a linear combination of shape eigenvectors:

$$\mathbf{s}_m = \bar{\mathbf{s}} + \Phi_s \mathbf{b}_s,$$

where $\mathbf{s}_m$ represents a modeled shape, $\bar{\mathbf{s}}$ is the mean of the aligned shapes, and $\Phi_s$ is a matrix having $m$ shape eigenvectors as its columns; finally, $\mathbf{b}_s$ defines the set of parameters of the shape model. $m$ is chosen so that a certain percentage of the total variance of the data is retained.

The standard deviation of each parameter of the face model, as obtained from the training dataset, provides its dynamic range. Altering the model parameters only within their dynamic range helps ensure that only plausible instances of the modeled object are generated. A description of the way in which the optimal model parameters for a new image can automatically be estimated follows in Section 3.2.
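The construction of such a linear shape model and the restriction of its parameters to a plausible range can be sketched as follows. The shapes are assumed to be already aligned; the 95% variance threshold, the limit of three standard deviations, and all names are assumptions chosen for illustration.

```python
import numpy as np

def build_shape_model(shapes, variance_kept=0.95):
    """shapes: N x 2L array of aligned shape vectors.
    Returns the mean shape, the retained eigenvectors (as columns),
    and the corresponding eigenvalues (parameter variances)."""
    mean_shape = shapes.mean(axis=0)
    centered = shapes - mean_shape
    # The right singular vectors of the centered data are the PCA eigenvectors.
    _, singular, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = singular ** 2 / (shapes.shape[0] - 1)
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    m = min(int(np.searchsorted(cumulative, variance_kept)) + 1, eigvals.size)
    return mean_shape, vt[:m].T, eigvals[:m]

def synthesize_shape(mean_shape, eigvecs, eigvals, b, limit=3.0):
    """Generate a shape, clipping each parameter to +/- limit standard deviations."""
    bound = limit * np.sqrt(eigvals)
    return mean_shape + eigvecs @ np.clip(b, -bound, bound)

# Synthetic training set: one dominant mode of variation plus small noise.
rng = np.random.default_rng(0)
base_shape = np.array([0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0])
mode = np.ones(8) / np.sqrt(8.0)
coeffs = rng.normal(size=20)
shapes = base_shape + coeffs[:, None] * mode + 1e-3 * rng.normal(size=(20, 8))

mean_shape, eigvecs, eigvals = build_shape_model(shapes)
clipped = synthesize_shape(mean_shape, eigvecs, eigvals, np.array([100.0]))
```

In this toy set a single eigenvector is retained, and an out-of-range parameter value is pulled back to the edge of the allowed interval before the shape is generated.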

#### 3.2. Face Texture Template Optimization Algorithm

In order to optimize the face model parameters, a texture template is also required. The separation between shape and texture is realized using a reference shape. Based on this reference shape, the so-called *texture examples* can be extracted. The reference shape is usually chosen as the pointwise mean of the shape examples. The texture examples are defined in the normalized frame of the reference shape. Each image example is then distorted so that the points that define its attached shape, used as control points, match the reference shape while the topology is preserved. An image warping method is employed for this purpose. Image warping methods are discussed in Section 3.3.

Subsequent to the warping stage, all shape differences between the image examples have been removed. The texture across each image object is thus mapped into a shape-normalized representation. The resulting images are also called the image examples in the normalized frame. For each of these images, the corresponding pixel values across their common shape are scanned to form the texture vectors $\mathbf{t} \in \mathbb{R}^{M}$, where $M$ is the number of texture samples.

Based on previous experiments, we remark that the variability of the shape component of the face is much more important than the variability of the texture component in terms of a successful segmentation of the face. For this reason, we consider in the following a simplified formulation of a model-based face segmentation technique, where the modeled image is represented by a fixed texture template; extensions could be made so as to include texture variability, yet that was beyond the purpose of the current work. Thus, during an optimization stage (fitting the model to a query image), the parameters to be found are $\mathbf{p} = [\mathbf{g}^{T}, \mathbf{b}_s^{T}]^{T}$, where $\mathbf{g}$ are the shape 2-D position, 2-D rotation, and scale parameters inside the image frame, and $\mathbf{b}_s$ are the shape model parameters. The optimization of the parameters is realized by minimizing the reconstruction error between the query image and the modeled image. The error is evaluated in the coordinate frame of the model, that is, in the normalized texture reference frame, rather than in the coordinate frame of the image. The difference between the query image and the modeled image is thus given by the difference between the (normalized) image texture $\mathbf{t}_{\mathrm{im}}(\mathbf{p})$ and the (normalized) template texture $\mathbf{t}_0$ as follows:

$$\mathbf{r}(\mathbf{p}) = \mathbf{t}_{\mathrm{im}}(\mathbf{p}) - \mathbf{t}_0,$$

and $E(\mathbf{p}) = \|\mathbf{r}(\mathbf{p})\|^{2}$ is the reconstruction error, with $\|\cdot\|$ marking the Euclidean norm.

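A first-order Taylor expansion of the texture difference $\mathbf{r}(\mathbf{p})$, for a displacement $\delta\mathbf{p}$ of the parameters, is given by

$$\mathbf{r}(\mathbf{p} + \delta\mathbf{p}) \approx \mathbf{r}(\mathbf{p}) + \frac{\partial \mathbf{r}}{\partial \mathbf{p}}\,\delta\mathbf{p}.$$

$\delta\mathbf{p}$ should be chosen so as to minimize $\|\mathbf{r}(\mathbf{p} + \delta\mathbf{p})\|^{2}$. It follows that

$$\delta\mathbf{p} = -\left(\frac{\partial \mathbf{r}}{\partial \mathbf{p}}^{T} \frac{\partial \mathbf{r}}{\partial \mathbf{p}}\right)^{-1} \frac{\partial \mathbf{r}}{\partial \mathbf{p}}^{T}\, \mathbf{r}(\mathbf{p}).$$

Normally, the gradient matrix $\partial\mathbf{r}/\partial\mathbf{p}$ should be recomputed at each iteration. Yet, as the error is estimated in a normalized texture frame, it was shown that this gradient matrix may be considered fixed, which makes it possible to precompute it from a training dataset. These techniques, introduced in [22] and extended so as to also incorporate a statistical texture variation model (as opposed to the fixed texture template described above), are called *active appearance models (AAMs)*. Using this technique, each parameter in $\mathbf{p}$ is systematically displaced from its known optimal value, and the resulting normalized texture differences are retained. The resulting matrices are then averaged over several displacement amounts and over several training images. The update direction of the model parameters is then given by

$$\delta\mathbf{p} = -R\,\mathbf{r}(\mathbf{p}),$$

where $R = \left(\frac{\partial \mathbf{r}}{\partial \mathbf{p}}^{T} \frac{\partial \mathbf{r}}{\partial \mathbf{p}}\right)^{-1} \frac{\partial \mathbf{r}}{\partial \mathbf{p}}^{T}$ is the pseudoinverse of the determined gradient matrix, which can be precomputed as part of the training stage. The parameters $\mathbf{p}$ continue to be updated iteratively until the error can no longer be reduced and convergence is declared.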
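The fixed-gradient iteration can be illustrated on a toy problem. In the sketch below the residual function is synthetic (a mildly nonlinear function of the parameter error, standing in for the normalized texture difference), the gradient matrix is estimated by displacing each parameter around a known optimum, and a single "training image" is used; all names and constants are hypothetical.

```python
import numpy as np

def estimate_gradient(residual_fn, p_opt, deltas=(0.1, 0.2)):
    """Estimate the gradient matrix by displacing each parameter from its
    known optimum and averaging central differences over the displacements."""
    columns = []
    for k in range(p_opt.size):
        estimates = []
        for delta in deltas:
            step = np.zeros_like(p_opt)
            step[k] = delta
            diff = residual_fn(p_opt + step) - residual_fn(p_opt - step)
            estimates.append(diff / (2.0 * delta))
        columns.append(np.mean(estimates, axis=0))
    return np.stack(columns, axis=1)

def fit(residual_fn, p0, R, max_iter=50):
    """Apply p <- p - R r(p) until the squared error stops decreasing."""
    p = p0.copy()
    error = float(np.sum(residual_fn(p) ** 2))
    for _ in range(max_iter):
        candidate = p - R @ residual_fn(p)
        new_error = float(np.sum(residual_fn(candidate) ** 2))
        if new_error >= error:      # no further reduction: declare convergence
            break
        p, error = candidate, new_error
    return p, error

# Synthetic stand-in for the texture difference.
rng = np.random.default_rng(0)
W = rng.normal(size=(30, 4))
p_true = np.array([0.3, -0.2, 0.1, 0.4])
residual = lambda p: W @ np.tanh(p - p_true)

J = estimate_gradient(residual, p_true)   # training stage, optimum known
R = np.linalg.pinv(J)                     # precomputed pseudoinverse
p_fit, final_error = fit(residual, np.zeros(4), R)
```

Since `R` is computed once, each iteration costs only one residual evaluation and one matrix-vector product, which is the practical appeal of treating the gradient as fixed.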

#### 3.3. A TPS-Based Model Fitting Technique

Piecewise affine warping is extensively used in techniques like AAM due to its reduced computational costs. A triangulation (e.g., Delaunay) is used to partition the convex hull of the control points. The points inside triangles are then mapped via an affine transformation which uniquely assigns the corners of a triangle to their new positions. Although the assumption that the face patches are piecewise affine within the triangles is a satisfactory solution when there is a sufficiently large number of landmark points, it also has an important drawback: when modeling large face pose variations, corners of some triangles tend to get reversed due to occlusions of the corresponding landmark points. This affects the image warping outcome by creating erroneous face patches. The errors are further propagated into the fitting algorithm, resulting in an incorrect fit. That is why the piecewise affine warping method works well mostly for modeling frontal or nearly frontal faces.

A more advanced and accurate warping method is obtained by employing *thin plate splines (TPSs)*, introduced in [23]. A short description of this warping method is also given in the appendix. An initial drawback of thin plate splines was that they were quite expensive to calculate. The solution requires the inversion of an $N \times N$ matrix (the bending energy matrix), which has a computational complexity of $O(N^{3})$, where $N$ is the number of points in the dataset (i.e., the number of pixels in the image); furthermore, the evaluation process is $O(N^{2})$. Fortunately, important progress has been made in speeding this process up. An approximation approach was shown in [24] to be very efficient in dealing with the first problem, greatly reducing the computational burden. As far as the evaluation process is concerned, the *multilevel fast multipole method (MLFMM)* framework was described in [25] for the evaluation of two-dimensional polyharmonic splines, while in [26] this work was extended to the specific case of TPS, showing that a reduction of the computational complexity from $O(N^{2})$ to $O(N \log N)$ is indeed possible. Thus the computational difficulties involved in the use of TPS have been removed to an important extent.

We show in Figures 2 and 3 an example of fitting the model based on TPS warping. The error is evaluated relative to the number of available data points after the deformation.

#### 4. Improved Model Fitting by Means of Local Color Transfer

A face detection algorithm is first applied to the current image. We used here the Viola-Jones face detector [27], which is based on the AdaBoost algorithm [28]. A statistical relation between the face detector estimates for the face position and size (rectangle region) and the position and size of the reference shape inside the image frame is initially learnt (offline) from a set of training images. This relation is then used to obtain a more accurate initialization for the reference shape, tuned to the employed face detection algorithm. An initialization reasonably close to the real values is important in order to ensure the convergence of the fitting algorithm described in Section 3. Color statistics are then extracted across the convex hull of the landmark points of the initialized reference shape.

#### 4.1. Image Color Transfer

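According to [17], color can be transferred between two images (global color transfer) using the formula in (15), applied in the $l\alpha\beta$ color space:

$$c' = \frac{\sigma_s}{\sigma_t}\,(c_t - \mu_t) + \mu_s, \tag{15}$$

where $c_t$ is the value of a pixel of the target image in one color channel and $c'$ is its value after the transfer; $\mu$ and $\sigma$ are, respectively, the mean and standard deviation of the Gaussian distribution in the considered color space, with the subscripts $s$ and $t$ denoting the source and the target image.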
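A minimal sketch of such a global transfer is given below. It assumes that both images have already been converted to a decorrelated color space, that the target is the image being recolored, and that the source provides the desired color statistics; the function name and the small constant `eps` are hypothetical.

```python
import numpy as np

def transfer_color(target, source, eps=1e-12):
    """Match the per-channel mean and standard deviation of `target`
    to those of `source` (both H x W x C, in a decorrelated space)."""
    t = target.reshape(-1, target.shape[-1]).astype(np.float64)
    s = source.reshape(-1, source.shape[-1]).astype(np.float64)
    mu_t, sigma_t = t.mean(axis=0), t.std(axis=0)
    mu_s, sigma_s = s.mean(axis=0), s.std(axis=0)
    scale = sigma_s / np.maximum(sigma_t, eps)   # guard against flat channels
    return ((t - mu_t) * scale + mu_s).reshape(target.shape)

rng = np.random.default_rng(1)
target_img = rng.normal(0.2, 0.05, size=(16, 16, 3))
source_img = rng.normal(0.6, 0.20, size=(16, 16, 3))
result = transfer_color(target_img, source_img)
```

After the transfer, each channel of `result` has the mean and standard deviation of the corresponding source channel, while the spatial structure of the target is unchanged. Processing the channels independently is reasonable only because the channels are assumed to be decorrelated.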

For local color transfer between two images, color
statistics (e.g., mean and variance of the Gaussian-modeled color
distribution) are gathered from the target and source image, respectively, and
used to calculate the *color influence map (CIM)*. CIM contains the
weights for each pixel in the target image, determined based on their proximity
to the color range in the source image.

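Consider the distance $d$ between a pixel $\mathbf{c}$ and the center $\boldsymbol{\mu}$ of the color distribution. For three-dimensional color data this is the Mahalanobis distance given by

$$d(\mathbf{c}) = \sqrt{(\mathbf{c} - \boldsymbol{\mu})^{T}\, \Sigma^{-1}\, (\mathbf{c} - \boldsymbol{\mu})}, \tag{16}$$

where $\Sigma$ is the covariance matrix of the three-variate color texture vector.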

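Yet, if a decorrelated color space is used, then the covariance matrix $\Sigma$ is close to being diagonal and (16) reduces to the *normalized Euclidean distance* (17):

$$d(\mathbf{c}) = \sqrt{\sum_{k=1}^{3} \frac{(c_k - \mu_k)^{2}}{\sigma_k^{2}}}, \tag{17}$$

where $\boldsymbol{\sigma} = [\sigma_1, \sigma_2, \sigma_3]^{T}$ is the standard deviation vector of $\mathbf{c}$ over the sample set.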
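The equivalence of the two distances for a diagonal covariance matrix can be checked numerically. The values of the mean, standard deviations, and pixel below are arbitrary and chosen only for illustration.

```python
import numpy as np

def mahalanobis(c, mu, cov):
    """Mahalanobis distance of color c from a distribution (mu, cov)."""
    diff = c - mu
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

def normalized_euclidean(c, mu, sigma):
    """Per-channel normalized Euclidean distance (diagonal covariance)."""
    return float(np.sqrt(np.sum(((c - mu) / sigma) ** 2)))

mu = np.array([0.5, 0.0, 0.0])
sigma = np.array([0.2, 0.05, 0.02])
pixel = np.array([0.7, 0.05, -0.02])     # one standard deviation off on each channel

d_full = mahalanobis(pixel, mu, np.diag(sigma ** 2))
d_diag = normalized_euclidean(pixel, mu, sigma)
```

The normalized form avoids the matrix inversion and needs only three standard deviations per distribution, which is one practical benefit of working in a decorrelated color space.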

The weights in CIM are calculated using a function of the above distance $d$, for which the following conditions should be met:

The function below was proposed in [18] to be used with the $l\alpha\beta$ color space:

The color transfer equation in (15) was also extended in [18] to a form weighted by the CIM or, if a single color is used as source for color transfer, to a form in which that color replaces the source statistics.
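The idea of a weighted, local transfer can be sketched as follows. The weighting function of [18] is not reproduced here: the Gaussian falloff `exp(-d^2 / 2)` and the linear blend between the transferred and the original pixel values are stand-in assumptions, as are all names and numeric values, so this is an illustration of the principle rather than the method of [18].

```python
import numpy as np

def color_influence_map(image, mu, sigma):
    """Weight in [0, 1] per pixel, decreasing with the normalized Euclidean
    distance from a reference color distribution (assumed Gaussian falloff)."""
    d = np.sqrt(np.sum(((image - mu) / sigma) ** 2, axis=-1))
    return np.exp(-0.5 * d ** 2)

def local_color_transfer(target, weights, mu_t, sigma_t, mu_s, sigma_s):
    """Blend globally transferred colors with the original ones
    according to the per-pixel weights (assumed linear blend)."""
    transferred = (target - mu_t) * (sigma_s / sigma_t) + mu_s
    w = weights[..., np.newaxis]
    return w * transferred + (1.0 - w) * target

mu_t = np.array([0.4, 0.0, 0.0])
sigma_t = np.array([0.1, 0.02, 0.02])
mu_s = np.array([0.6, 0.05, -0.03])
sigma_s = np.array([0.2, 0.04, 0.01])

# One pixel at the center of the reference distribution, one far outside it.
target = np.array([[mu_t, mu_t + 100.0 * sigma_t]])
weights = color_influence_map(target, mu_t, sigma_t)
result = local_color_transfer(target, weights, mu_t, sigma_t, mu_s, sigma_s)
```

Pixels close to the reference distribution receive the full transfer, while pixels far from it are left essentially untouched, which is the behavior a color influence map is meant to provide.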

#### 4.2. Adaptive Texture Template Matching

Using a decorrelated color space (see Section 2), the color of the texture template (see Figure 4(a)) can be adapted to the current image, increasing the chance of a correct fit of the face model (i.e., a correct face segmentation). Experimental results to support this premise and to confirm the benefits of employing color adaptation techniques with the template matching algorithm follow next.

#### 5. Experiments

The experiments have been performed on a randomly chosen subset of 16 images from the database in Figure 1. The images have been semiautomatically annotated and the set of annotations has been used as the ground truth for calculating the boundary errors, which give an objective measure of the fitting quality of the face model. The boundary errors are measured between the exact shape in the image frame (obtained from the ground truth annotations) and the optimized model shape in the image frame. The boundary error is calculated as the *point-to-point* (*Pt-Pt*) error, which is given by the Euclidean distance between the two shape vectors of concatenated $x$ and $y$ coordinates of the landmark points. The mean and standard deviation of the *Pt-Pt* errors are used to evaluate the boundary errors over a whole set of images. The results are summarized in Table 2.
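The error measure described above can be sketched in a few lines. The shape vectors below are made-up coordinates used only to show the computation; they are not data from the experiments.

```python
import numpy as np

def pt_pt_error(shape_a, shape_b):
    """Euclidean distance between two shape vectors of concatenated coordinates."""
    a = np.asarray(shape_a, dtype=float)
    b = np.asarray(shape_b, dtype=float)
    return float(np.linalg.norm(a - b))

def summarize_errors(ground_truth, fitted):
    """Mean and standard deviation of the Pt-Pt error over a set of images."""
    errors = np.array([pt_pt_error(g, f) for g, f in zip(ground_truth, fitted)])
    return float(errors.mean()), float(errors.std())

# Hypothetical three-landmark shapes, stored as (x1, y1, x2, y2, x3, y3).
truth = [np.array([0.0, 0.0, 10.0, 0.0, 10.0, 10.0])] * 2
fits = [np.array([3.0, 4.0, 10.0, 0.0, 10.0, 10.0]),   # first landmark off by (3, 4)
        np.array([0.0, 0.0, 10.0, 0.0, 10.0, 10.0])]   # perfect fit

mean_error, std_error = summarize_errors(truth, fits)
```

Because the distance is taken between whole shape vectors, it grows with the number of landmarks; comparisons are therefore meaningful only between models that share the same annotation scheme.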

An implementation based only on the intensity (gray scale) component has also been tested. The gray scale images have been obtained by applying the standard mix of $RGB$ components in (22):

$$Y = 0.299\,R + 0.587\,G + 0.114\,B. \tag{22}$$

The initial results (no color adaptation) show a slight gain in the fitting accuracy over the gray scale implementation when color information is added. However, significant increase in face segmentation accuracy can be observed when adapting the color of the texture template using color transfer techniques. It can also be noted that the implementation based on color space performs slightly better in terms of segmentation accuracy, although subjectively better color adaptation results have been observed when using the color space. This can be explained by the fact that the color space representation is more suitable to be used together with the fitting algorithm which is implemented in the color space.

The robustness to changes in the illumination conditions was also tested using the Oulu face image database [29]. An example of color adaptation of the texture template for this database is shown in Figure 5.

#### 6. Discussion and Conclusions

We analyzed in this paper the possibility of enhancing a face segmentation/tracking method based on texture template matching by means of color image alignment. We also presented a model parameter optimization approach which minimizes the error between the texture template and the warped image texture across the current shape. We employed here the TPS-based warping method, which is more robust to head pose variations.

The color alignment techniques make use of the decorrelated color statistics of the current image and template image. Improvements of the accuracy of the segmentation have been demonstrated.

From our experiments, we can conclude that the color adaptation method for the texture template can also be useful in face tracking applications which employ face modeling techniques similar to the one described in Section 3. In particular, significant improvements and increased robustness were shown for the case of tracking a face under changes in the illumination conditions, such as a change in the type of illuminant. This may be a real change of the illuminant, or it could be caused by a wrong white balance setting of the image acquisition device.

#### Appendix

#### Image Warping: Principal Warps

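The thin plate spline (TPS)-based warping method, also named principal warps, was first introduced in [23]. It represents a nonrigid registration method, built upon an analogy with a theory in mechanics. Namely, the analogy is made with minimizing the bending energy of a thin metal plate on which pressure is exerted using some point constraints. The bending energy is then given by a quadratic form; the spline is represented as a linear combination (superposition) of eigenvectors of the bending energy matrix:

$$f(x, y) = a_1 + a_x x + a_y y + \sum_{i=1}^{n} w_i\, U\big(\|(x_i, y_i) - (x, y)\|\big), \tag{A.1}$$

where $U(r) = r^{2} \log r$; $(x_i, y_i)$, $i = 1, \dots, n$, are the initial control points. $\mathbf{a} = [a_1, a_x, a_y]^{T}$ defines the affine part, while $\mathbf{w} = [w_1, \dots, w_n]^{T}$ defines the nonlinear part of the deformation.

The total bending energy is expressed as

$$I_f = \iint_{\mathbb{R}^{2}} \left[ \left(\frac{\partial^{2} f}{\partial x^{2}}\right)^{2} + 2\left(\frac{\partial^{2} f}{\partial x\, \partial y}\right)^{2} + \left(\frac{\partial^{2} f}{\partial y^{2}}\right)^{2} \right] dx\, dy. \tag{A.2}$$

The surface is deformed so as to have minimum bending energy. The conditions that need to be met so that (A.1) is valid (so that $f$ has square-integrable second-order derivatives) are given by

$$\sum_{i=1}^{n} w_i = 0, \qquad \sum_{i=1}^{n} w_i x_i = \sum_{i=1}^{n} w_i y_i = 0. \tag{A.3}$$

Adding to this the interpolation conditions $f(x_i, y_i) = v_i$, (A.1) can now be written as the linear system in (A.4):

$$\begin{bmatrix} K & P \\ P^{T} & O \end{bmatrix} \begin{bmatrix} \mathbf{w} \\ \mathbf{a} \end{bmatrix} = \begin{bmatrix} \mathbf{v} \\ \mathbf{o} \end{bmatrix}, \tag{A.4}$$

where $K_{ij} = U\big(\|(x_i, y_i) - (x_j, y_j)\|\big)$, $O$ is a $3 \times 3$ matrix of zeros, $\mathbf{o}$ is a $3 \times 1$ vector of zeros, and the $i$th row of $P$ is $(1, x_i, y_i)$; $\mathbf{w}$ and $\mathbf{v}$ are the column vectors formed by $w_i$ and $v_i$, respectively, while $\mathbf{a} = [a_1, a_x, a_y]^{T}$.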
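A direct (non-accelerated) solution of such a linear system can be sketched with a dense solver. The kernel $U(r) = r^{2}\log r$, the block layout of the system matrix, and all names below are assumptions of this example; the code fits a single scalar function, so a 2-D warp would need one fit for the $x$ displacements and one for the $y$ displacements.

```python
import numpy as np

def tps_kernel(r):
    """U(r) = r^2 log r, with U(0) defined as 0."""
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def fit_tps(points, values):
    """Solve the (n + 3) x (n + 3) TPS system for control points (n x 2)
    and target values (n,). Returns the weights w and affine part a."""
    n = points.shape[0]
    K = tps_kernel(np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1))
    P = np.hstack([np.ones((n, 1)), points])
    system = np.zeros((n + 3, n + 3))
    system[:n, :n] = K
    system[:n, n:] = P
    system[n:, :n] = P.T
    rhs = np.concatenate([values, np.zeros(3)])
    solution = np.linalg.solve(system, rhs)
    return solution[:n], solution[n:]

def evaluate_tps(points, w, a, query):
    """Evaluate the fitted spline at query locations (m x 2)."""
    r = np.linalg.norm(query[:, None, :] - points[None, :, :], axis=-1)
    return a[0] + query @ a[1:] + tps_kernel(r) @ w

control = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
affine_values = 2.0 + 3.0 * control[:, 0] - control[:, 1]   # purely affine data
w_aff, a_aff = fit_tps(control, affine_values)

bumpy_values = np.array([0.0, 1.0, 0.0, 1.0, 2.0])          # requires bending
w_bump, a_bump = fit_tps(control, bumpy_values)
```

For purely affine data the nonlinear weights vanish and only the affine part remains, while for general data the spline passes exactly through the prescribed values at the control points. The dense solve shown here has the cubic cost discussed in Section 3.3, which is what the approximation and fast evaluation methods cited there aim to reduce.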

#### Acknowledgments

This research was jointly sponsored by Enterprise Ireland and FotoNation (Ireland) Ltd. under the Innovation Partnership Scheme, Grant no. IP/06/361, part of the National Development Program of Ireland. In addition to financial support the authors also wish to express their appreciation for advice and access to facilities provided by the industrial sponsor.

#### References

1. T. F. Cootes, G. J. Edwards, and C. J. Taylor, “Active appearance models,” in *Proceedings of the 5th European Conference on Computer Vision (ECCV '98)*, pp. 484–498, Freiburg, Germany, June 1998.
2. T. F. Cootes, G. J. Edwards, and C. J. Taylor, “A comparative evaluation of active appearance model algorithms,” in *Proceedings of the 9th British Machine Vision Conference (BMVC '98)*, vol. 2, pp. 680–689, British Machine Vision Association, Southampton, UK, September 1998.
3. X. Hou, S. Z. Li, H. Zhang, and Q. Cheng, “Direct appearance models,” in *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '01)*, vol. 1, pp. 828–833, Kauai, Hawaii, USA, December 2001.
4. R. Donner, M. Reiter, G. Langs, P. Peloschek, and H. Bischof, “Fast active appearance model search using canonical correlation analysis,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 28, no. 10, pp. 1690–1694, 2006.
5. J. Matthews and S. Baker, “Active appearance models revisited,” *International Journal of Computer Vision*, vol. 60, no. 2, pp. 135–164, 2004.
6. M. B. Stegmann and R. Larsen, “Multi-band modelling of appearance,” *Image and Vision Computing*, vol. 21, no. 1, pp. 61–67, 2003.
7. G. J. Edwards, T. F. Cootes, and C. J. Taylor, “Advances in active appearance models,” in *Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV '99)*, vol. 1, pp. 137–142, Kerkyra, Greece, September 1999.
8. R. K. Kouassi, J.-C. Devaux, P. Gouton, and M. Paindavoine, “Application of the Karhunen-Loeve transform for natural color images analysis,” in *Proceedings of the 31st Asilomar Conference on Signals, Systems & Computers (ACSSC '97)*, vol. 2, pp. 1740–1744, Pacific Grove, Calif, USA, November 1997.
9. J.-C. Devaux, P. Gouton, and F. Truchetet, “Aerial color image segmentation by Karhunen-Loeve transform,” in *Proceedings of the 15th International Conference on Pattern Recognition (ICPR '00)*, vol. 1, pp. 309–312, Barcelona, Spain, September 2000.
10. J.-C. Devaux, P. Gouton, and F. Truchetet, “Application of the Karhunen-Loeve transform to aerial color image segmentation,” in *Proceedings of the 4th International Conference on Knowledge-Based Intelligent Engineering Systems and Allied Technologies (KES '00)*, vol. 1, pp. 373–376, Brighton, UK, August 2000.
11. J.-C. Devaux, P. Gouton, and F. Truchetet, “Karhunen-Loeve transform applied to region-based segmentation of color aerial images,” *Optical Engineering*, vol. 40, no. 7, pp. 1302–1308, 2001.
12. M. Kirby and L. Sirovich, “Application of the Karhunen-Loeve procedure for the characterization of human faces,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 12, no. 1, pp. 103–108, 1990.
13. Y. Ohta, T. Kanade, and T. Sakai, “Color information for region segmentation,” *Computer Graphics and Image Processing*, vol. 13, no. 3, pp. 222–241, 1980.
14. J. Mukherjee, “MRF clustering for segmentation of color images,” *Pattern Recognition Letters*, vol. 23, no. 8, pp. 917–929, 2002.
15. P. Mohapatra, P. Nanda, and S. Panda, “Color image segmentation using MRF model and simulated annealing,” in *Proceedings of Soft Computing Technique for Engineering Applications (SCT '06)*, NIT, Rourkela, India, March 2006.
16. D. L. Ruderman, T. W. Cronin, and C.-C. Chiao, “Statistics of cone responses to natural images: implications for visual coding,” *Journal of the Optical Society of America A*, vol. 15, no. 8, pp. 2036–2045, 1998.
17. E. Reinhard, M. Ashikhmin, B. Gooch, and P. Shirley, “Color transfer between images,” *IEEE Computer Graphics and Applications*, vol. 21, no. 5, pp. 34–41, 2001.
18. A. Maslennikova and V. Vezhnevets, “Interactive local transfer between images,” in *Proceedings of the International Conference on Computer Graphics & Vision (GraphiCon '07)*, pp. 75–78, Moscow, Russia, June 2007.
19. J.-H. Lee, B.-H. Chang, and S.-D. Kim, “Comparison of colour transformations for image segmentation,” *Electronics Letters*, vol. 30, no. 20, pp. 1660–1661, 1994.
20. C. Goodall, “Procrustes methods in the statistical analysis of shape,” *Journal of the Royal Statistical Society B*, vol. 53, no. 2, pp. 285–339, 1991.
21. J. J. Gerbrands, “On the relationships between SVD, KLT and PCA,” *Pattern Recognition*, vol. 14, no. 1–6, pp. 375–381, 1981.
22. G. J. Edwards, C. J. Taylor, and T. F. Cootes, “Interpreting face images using active appearance models,” in *Proceedings of the 3rd IEEE International Conference on Face & Gesture Recognition (FG '98)*, pp. 300–305, Nara, Japan, April 1998.
23. F. L. Bookstein, “Principal warps: thin-plate splines and the decomposition of deformations,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 11, no. 6, pp. 567–585, 1989.
24. G. Donato and S. Belongie, “Approximate thin plate spline mappings,” in *Proceedings of the 7th European Conference on Computer Vision (ECCV '02)*, pp. 21–31, Copenhagen, Denmark, May 2002.
25. R. K. Beatson and W. A. Light, “Fast evaluation of radial basis functions: methods for two-dimensional polyharmonic splines,” *IMA Journal of Numerical Analysis*, vol. 17, no. 3, pp. 343–372, 1997.
26. A. Zandifar, S.-N. Lim, R. Duraiswami, N. A. Gumerov, and L. S. Davis, “Multi-level fast multipole method for thin plate spline evaluation,” in *Proceedings of the International Conference on Image Processing (ICIP '04)*, vol. 3, pp. 1683–1686, Singapore, October 2004.
27. P. A. Viola and M. J. Jones, “Robust real-time face detection,” *International Journal of Computer Vision*, vol. 57, no. 2, pp. 137–154, 2004.
28. Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” in *Proceedings of the 2nd European Conference on Computational Learning Theory (EuroCOLT '95)*, vol. 904, pp. 23–37, Barcelona, Spain, March 1995.
29. M. Soriano, E. Marszalec, and M. Pietikäinen, “Color correction of face images under different illuminants by RGB eigenfaces,” in *Proceedings of the 2nd International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA '99)*, pp. 148–153, Washington DC, USA, March 1999.