Abstract

We present symbolic kernel discriminant analysis (symbolic KDA) for face recognition in the framework of symbolic data analysis. Classical KDA extracts single-valued features to represent face images. Such single-valued variables may not capture the variation of each feature across all the images of the same subject, which leads to loss of information. The symbolic KDA algorithm extracts the most discriminating nonlinear interval-type features, which optimally discriminate among the classes represented in the training set. The proposed method has been successfully tested for face recognition on two databases, the ORL database and the Yale face database. Its effectiveness is shown through comparative performance against popular face recognition methods, namely the kernel Eigenface and kernel Fisherface methods. Experimental results show that symbolic KDA yields an improved recognition rate.

1. Introduction

Kernel principal component analysis (KPCA) [1, 2] and kernel Fisher discriminant analysis (KFD) [3] have aroused considerable interest in the face recognition problem. KPCA was originally developed by Schölkopf et al. in 1998, while KFD was first proposed by Mika et al. in 1999 [3]. Subsequent research saw the development of a series of KFD algorithms. KFD-based algorithms use the pixel intensity values in a face image as the features for face recognition, and these intensities are represented by single-valued variables. However, in many situations the same face is captured in different orientations, lighting conditions, expressions, and backgrounds, all of which lead to image variations, and the pixel intensities change accordingly. Single-valued variables may therefore fail to capture the variation of feature values across the images of the same subject. In such a case, we turn to symbolic data analysis (SDA) [4–7], in which interval-valued data are analyzed.

In this paper, a new appearance-based method is proposed in the framework of symbolic data analysis, namely symbolic KDA for face recognition, which generalizes classical KDA to symbolic objects. In the first step, we represent the face images as symbolic objects (symbolic faces) of interval-type variables. Each symbolic face summarizes the variation of feature values across the different images of the same subject. This representation also drastically reduces the dimension of the image space without losing a significant amount of information.

In the second step, we apply the symbolic KDA algorithm to extract interval-type nonlinear discriminating features. In the first phase of this algorithm, a kernel function is applied to the symbolic faces, so that each pattern in the original input space is mapped to a potentially much higher-dimensional feature vector in the feature space, where the subspace dimension is then chosen carefully. In the second phase, we extract interval-type nonlinear discriminating features, which are robust to variations in illumination, orientation, and facial expression. Finally, a minimum distance classifier with a symbolic dissimilarity measure [4] is employed for classification.

The remainder of this paper is organized as follows. Section 2 describes the construction of symbolic faces. Symbolic KDA is developed in Section 3. In Section 4, experiments are performed on the ORL and Yale face databases, whereby the proposed algorithm is evaluated and compared with other methods. Finally, conclusions and discussion are given in Section 5.

2. Construction of Symbolic Faces

Let $\Omega=\{\Gamma_1,\ldots,\Gamma_n\}$ be the collection of $n$ face images in the database, each of size $N \times M$; these are first-order objects. Each object $\Gamma_l \in \Omega$, $l=1,\ldots,n$, is described by a feature vector $(Y_1,\ldots,Y_p)$ of length $p = NM$, where each component $Y_j$, $j=1,\ldots,p$, is a single-valued variable representing an intensity value of the face image $\Gamma_l$. An image set is a collection of face images of $m$ different subjects (face classes), each subject having several images with varying expressions, orientations, and illuminations. Thus there are $m$ second-order objects (face classes), denoted $E=\{c_1,c_2,\ldots,c_m\}$, each consisting of the individual images $\Gamma_l \in \Omega$ of one subject. We assume that the images of each face class are arranged from right-side view to left-side view. The view range of each face class is partitioned into $q$ subface classes, and each subface class contains $r$ images. The feature vector of the $k$th subface class $c_i^k$ of the $i$th face class $c_i$, where $k=1,2,\ldots,q$, is described by a vector of $p$ interval variables $Y_1,\ldots,Y_p$ and is of length $p = NM$. The interval variable $Y_j$ of the $k$th subface class $c_i^k$ is

$$Y_j(c_i^k) = [\underline{x}_{ij}^k, \overline{x}_{ij}^k], \qquad (1)$$

where $\underline{x}_{ij}^k$ and $\overline{x}_{ij}^k$ are, respectively, the minimum and maximum intensity values among the $j$th pixels of all the images of subface class $c_i^k$. This interval captures the variability of the $j$th feature inside the $k$th subface class $c_i^k$.

We denote

$$X_i^k = \bigl(Y_1(c_i^k),\ldots,Y_p(c_i^k)\bigr), \qquad i=1,\ldots,m,\; k=1,\ldots,q. \qquad (2)$$

The vector $X_i^k$ of interval variables recorded for the $k$th subface class $c_i^k$ of the $i$th face class is called a symbolic face and is represented as

$$X_i^k = \bigl(\alpha_{i1}^k,\ldots,\alpha_{ip}^k\bigr), \qquad (3)$$

where $\alpha_{ij}^k = Y_j(c_i^k) = [\underline{x}_{ij}^k, \overline{x}_{ij}^k]$ for $j=1,\ldots,p$, $k=1,\ldots,q$, and $i=1,2,\ldots,m$.

We represent the $qm$ symbolic faces by a matrix $X$ of size $p \times qm$, whose columns are the vectors $X_i^k$, $i=1,\ldots,m$, $k=1,\ldots,q$.
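As an illustration, the interval construction of Eqs. (1)–(3) can be sketched in a few lines of Python; the helper name `symbolic_face` and the toy 2 × 2 images below are ours, not from the paper:

```python
import numpy as np

def symbolic_face(images):
    """Build one symbolic face from the r images of a subface class:
    flatten each N x M image to a length-p intensity vector and take
    the per-pixel minimum and maximum across the r images (Eq. (1))."""
    flat = np.stack([np.asarray(img, dtype=float).ravel() for img in images])
    return flat.min(axis=0), flat.max(axis=0)  # lower and upper bound vectors

# Toy subface class: r = 3 images of size 2 x 2, so p = 4.
imgs = [np.array([[10, 20], [30, 40]]),
        np.array([[12, 18], [33, 41]]),
        np.array([[11, 25], [29, 39]])]
lower, upper = symbolic_face(imgs)
# lower = [10, 18, 29, 39], upper = [12, 25, 33, 41]
```

Stacking the lower/upper pairs of all $qm$ subface classes column-wise gives the $p \times qm$ matrix $X$ described above.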

3. Acquiring Nonlinear Subspace Using Symbolic KDA Method

Let us consider the matrix $X$ containing the $qm$ symbolic faces pertaining to the given set $\Omega$ of images belonging to $m$ face classes. The centers $x_{ij}^{kC}$ of the intervals $\alpha_{ij}^k = [\underline{x}_{ij}^k, \overline{x}_{ij}^k]$ are given by

$$x_{ij}^{kC} = \frac{\underline{x}_{ij}^k + \overline{x}_{ij}^k}{2}, \qquad (4)$$

where $k=1,\ldots,q$, $i=1,\ldots,m$, and $j=1,\ldots,p$.

Let $X^C$ be the $p \times qm$ data matrix containing the centers $x_{ij}^{kC}$ of the intervals for the $qm$ symbolic faces. The $p$-dimensional vectors $X_i^{kC}=(x_{i1}^{kC},\ldots,x_{ip}^{kC})$, $\underline{X}_i^k=(\underline{x}_{i1}^k,\ldots,\underline{x}_{ip}^k)$, and $\overline{X}_i^k=(\overline{x}_{i1}^k,\ldots,\overline{x}_{ip}^k)$ represent, respectively, the centers, lower bounds, and upper bounds of the $qm$ symbolic faces $X_i^k$.
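The centers matrix of Eq. (4) is a simple element-wise midpoint; a minimal sketch, with illustrative array shapes and values of our own:

```python
import numpy as np

# Hypothetical bounds for qm = 2 symbolic faces with p = 3 features,
# stored row-wise as qm x p arrays.
lower = np.array([[10., 40., 0.], [20., 10., 6.]])
upper = np.array([[14., 60., 8.], [30., 30., 10.]])

centers = (lower + upper) / 2.0   # Eq. (4): midpoint of each interval
X_C = centers.T                   # p x qm matrix of interval centers
# X_C = [[12., 25.], [50., 20.], [4., 8.]]
```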

Let $\Phi : \mathbb{R}^p \to F$ be a nonlinear mapping between the input space and the feature space.

The nonlinear mapping $\Phi$ is usually defined implicitly through a kernel function. Let $K \in \mathbb{R}^{qm \times qm}$ be the kernel matrix defined by dot products in the feature space:

$$K_{ij} = \bigl(\Phi(X_i) \cdot \Phi(X_j)\bigr). \qquad (5)$$

In general, the Fisher criterion in the feature space $F$ can be defined as

$$J(V) = \frac{V^T S_b^\Phi V}{V^T S_w^\Phi V}, \qquad (6)$$

where $V$ is a discriminant vector and $S_b^\Phi$ and $S_w^\Phi$ are the between-class and within-class scatter matrices, respectively. These scatter matrices in the feature space $F$ are defined as

$$S_b^\Phi = \frac{1}{m} \sum_{i=1}^{m} q_i \bigl(m_i^\Phi - m_0^\Phi\bigr)\bigl(m_i^\Phi - m_0^\Phi\bigr)^T, \qquad
S_w^\Phi = \frac{1}{qm} \sum_{i=1}^{m} \sum_{k=1}^{q} \bigl(\Phi(X_i^{kC}) - m_i^\Phi\bigr)\bigl(\Phi(X_i^{kC}) - m_i^\Phi\bigr)^T, \qquad (7)$$

where $X_i^{kC}$ denotes the center vector of the $k$th symbolic face of the $i$th face class, $q_i$ is the number of training symbolic faces in face class $i$, $m_i^\Phi$ is the mean of the mapped symbolic faces in face class $i$, and $m_0^\Phi$ is the mean across all $qm$ mapped symbolic faces. From the above definitions, we have $S_t^\Phi = S_b^\Phi + S_w^\Phi$. The discriminant vectors with respect to the Fisher criterion are the eigenvectors of the generalized eigenvalue problem $S_b^\Phi V = \lambda S_t^\Phi V$. According to the theory of reproducing kernels, $V$ is an expansion of all symbolic faces in the feature space; that is, there exist coefficients $b_L$ ($L=1,2,\ldots,qm$) such that

$$V = \sum_{L=1}^{qm} b_L \Phi(X_L^C) = HA, \qquad (8)$$

where $X_L^C$ runs over the centers of all $qm$ symbolic faces, $H = \bigl(\Phi(X_1^{1C}),\ldots,\Phi(X_1^{qC}),\ldots,\Phi(X_m^{1C}),\ldots,\Phi(X_m^{qC})\bigr)$, and $A = (b_1, b_2, \ldots, b_{qm})^T$.

Substituting (8) into (6), we obtain

$$J(A) = \frac{A^T K W K A}{A^T K K A}, \qquad (9)$$

where $K$ is the kernel matrix, $W = \mathrm{diag}(W_1,\ldots,W_m)$, and $W_i$ is a $q_i \times q_i$ matrix whose elements are all $1/q_i$. From the definition of $W$, it is easy to verify that $W$ is a $qm \times qm$ block-diagonal matrix. In practice, it is often necessary to find $s$ discriminant vectors, denoted $a_1, a_2, \ldots, a_s$, to extract features. Let $V = [a_1, a_2, \ldots, a_s]$. The matrix $V$ should satisfy

$$V = \arg\max_V \frac{\bigl|V^T \tilde{S}_b V\bigr|}{\bigl|V^T \tilde{S}_t V\bigr|}, \qquad (10)$$

where $\tilde{S}_b = KWK$ and $\tilde{S}_t = KK$.
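As a numerical sketch of solving Eq. (10), the generalized eigenproblem $KWKa = \lambda KKa$ can be reduced to an ordinary eigenproblem after a small ridge regularization of $KK$; the function name, the toy 1-D data, and the regularization constant below are our illustrative choices, not the paper's:

```python
import numpy as np

def kda_coefficients(K, class_sizes, s, eps=1e-6):
    """Return the s leading expansion-coefficient vectors a_1,...,a_s
    solving K W K a = lambda K K a (Eq. (10)).  W is block diagonal,
    with each q_i x q_i block filled with 1/q_i (Eq. (9))."""
    W = np.zeros_like(K)
    start = 0
    for q in class_sizes:
        W[start:start + q, start:start + q] = 1.0 / q
        start += q
    S_b = K @ W @ K
    S_t = K @ K + eps * np.eye(K.shape[0])  # ridge term keeps S_t invertible
    vals, vecs = np.linalg.eig(np.linalg.solve(S_t, S_b))
    order = np.argsort(-vals.real)
    return vecs[:, order[:s]].real

# Toy kernel matrix: qm = 4 symbolic faces (m = 2 classes, q = 2 each)
# built from 1-D points with a linear kernel.
x = np.array([0.0, 0.2, 1.0, 1.2])
K = np.outer(x, x)
A = kda_coefficients(K, [2, 2], s=1)  # shape (4, 1): one discriminant vector
```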

Since each symbolic face $X_i^k$ lies between the lower bound symbolic face $\underline{X}_i^k$ and the upper bound symbolic face $\overline{X}_i^k$, it is possible to find the most discriminating nonlinear interval-type features $[\underline{B}_i^k, \overline{B}_i^k]$.

The lower bound features of each symbolic face $X_i^k$ are given by

$$\underline{B}_i^k = V_l^T \Phi(\underline{X}_i^k), \qquad l=1,2,\ldots,s, \qquad (11)$$

where

$$\Phi(\underline{X}_i^k) = \Bigl[\bigl(\Phi(X_1^{1C}) \cdot \Phi(\underline{X}_i^k)\bigr),\ldots,\bigl(\Phi(X_1^{qC}) \cdot \Phi(\underline{X}_i^k)\bigr),\ldots,\bigl(\Phi(X_m^{1C}) \cdot \Phi(\underline{X}_i^k)\bigr),\ldots,\bigl(\Phi(X_m^{qC}) \cdot \Phi(\underline{X}_i^k)\bigr)\Bigr]. \qquad (12)$$

Similarly, the upper bound features of each symbolic face $X_i^k$ are given by

$$\overline{B}_i^k = V_l^T \Phi(\overline{X}_i^k), \qquad l=1,2,\ldots,s. \qquad (13)$$

Let $c_{\text{test}} = [\Gamma_1, \Gamma_2, \ldots, \Gamma_l]$ be a test face class containing face images of the same subject with different expressions, lighting conditions, and orientations. The test symbolic face $X_{\text{test}}$ is constructed for $c_{\text{test}}$ as explained in Section 2. Its lower bound is $\underline{X}_{\text{test}} = (\underline{x}_{\text{test},1}, \underline{x}_{\text{test},2}, \ldots, \underline{x}_{\text{test},p})$, and its upper bound is $\overline{X}_{\text{test}} = (\overline{x}_{\text{test},1}, \overline{x}_{\text{test},2}, \ldots, \overline{x}_{\text{test},p})$.

The interval-type features $[\underline{B}_{\text{test}}, \overline{B}_{\text{test}}]$ of the test symbolic face $X_{\text{test}}$ are computed as

$$\underline{B}_{\text{test}} = V_l^T \Phi(\underline{X}_{\text{test}}), \qquad \overline{B}_{\text{test}} = V_l^T \Phi(\overline{X}_{\text{test}}), \qquad (14)$$

where $l=1,2,\ldots,s$.
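Putting Eqs. (11)–(14) together: the lower and upper bound vectors of a symbolic face are projected into the discriminant subspace through kernel evaluations against the training centers. A sketch, where the coefficient matrix `A`, the centers, and the test bounds are made-up values for illustration:

```python
import numpy as np

def interval_features(A, kernel, train_centers, lower, upper):
    """Map a symbolic face to interval-type features (Eqs. (11)-(14)):
    evaluate the kernel between every training center and the lower /
    upper bound vectors, then project with the expansion coefficients A
    (a qm x s matrix)."""
    k_lo = np.array([kernel(c, lower) for c in train_centers])
    k_hi = np.array([kernel(c, upper) for c in train_centers])
    return A.T @ k_lo, A.T @ k_hi  # lower- and upper-bound features

poly3 = lambda u, v: (np.dot(u, v) + 1.0) ** 3   # degree-3 polynomial kernel
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
A = np.array([[1.0], [0.5]])                     # hypothetical coefficients, s = 1
B_lo, B_hi = interval_features(A, poly3, centers,
                               np.array([0.2, 0.1]), np.array([0.4, 0.3]))
# B_lo = [2.3935], B_hi = [3.8425]
```

A minimum distance classifier with a symbolic dissimilarity measure then compares $[\underline{B}_{\text{test}}, \overline{B}_{\text{test}}]$ against the stored interval features of each class.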

4. Experimental Results

The proposed symbolic KDA method is evaluated on face images from the ORL and Yale databases. Its effectiveness is shown through comparative performance against two face recognition methods: the kernel Eigenface [8] and kernel Fisherface [9] methods.

4.1. Experiments Using ORL Database

We assess the feasibility and performance of the proposed symbolic KDA on the face recognition task using the ORL database. The ORL face database is composed of 400 images, with ten different facial views representing various expressions and orientations for each of the 40 distinct subjects, as shown in Figure 1. We arranged the images of each face class from right-side view to left-side view. All 400 images from the ORL database are used to evaluate the face recognition performance of the proposed method. For each trial, six images are randomly chosen from the ten available for each subject for training, while the remaining images are used to construct the test symbolic face. We conducted the experiments using two kernel functions, namely the polynomial kernel and the Gaussian kernel.

Our goal is to find an appropriate kernel function and the corresponding optimal kernel parameters (i.e., the order of the polynomial kernel and the width of the Gaussian kernel) for the proposed method. The experimental results show that, with respect to a minimum distance classifier, the order of the polynomial kernel should be three and the width of the Gaussian kernel should be four for the proposed symbolic KDA.
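For concreteness, the two kernels used in the experiments can be written as follows, with the reported optimal ORL parameters as defaults. The function names are ours, and since the paper does not spell out the exact parameterization of the Gaussian width, the common $2\sigma^2$ form is assumed:

```python
import numpy as np

def polynomial_kernel(u, v, d=3):
    """Polynomial kernel of order d; d = 3 was reported optimal on ORL."""
    return (np.dot(u, v) + 1.0) ** d

def gaussian_kernel(u, v, sigma=4.0):
    """Gaussian (RBF) kernel of width sigma; sigma = 4 was reported
    optimal on ORL (2*sigma^2 parameterization assumed)."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))
```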

After determining the optimal kernel parameters, we set out to select the dimension of the discriminant subspace with respect to the two kernels. Table 1 shows the optimal subspace dimension and the optimal parameters for each method with respect to the different kernels. We also find that symbolic KDA features appear more effective than the features of the other methods.

After selecting the optimal parameters and optimal subspace for each method with respect to the different kernels, all three methods are reevaluated using the same set of training and testing samples. The average recognition rates for the best case are shown in Table 2. The best performance of the symbolic KDA method exceeds the best performance of the kernel Eigenface and kernel Fisherface methods. We note that symbolic KDA outperforms the kernel Eigenface and kernel Fisherface methods while using a smaller number of features.

To examine whether symbolic KDA is statistically significantly better than the other methods in terms of recognition rate, we evaluate the experimental results presented in Table 2 using McNemar's significance test. McNemar's test is essentially a null hypothesis statistical test based on a Bernoulli model: if the resulting p-value is below the desired significance level, the null hypothesis is rejected and the performance difference between the two algorithms is considered statistically significant. By this test, we find that symbolic KDA is statistically significantly better than the kernel Eigenface and kernel Fisherface methods, with $p = 1.012 \times 10^{-6}$.
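As a sketch, an exact two-sided McNemar test needs only the two discordant counts: trials that one method classified correctly while the other did not. The counts below are made up for illustration; the paper's own counts are not reported:

```python
from math import comb

def mcnemar_p(n01, n10):
    """Exact two-sided McNemar p-value from the discordant counts:
    n01 = trials only method A got right, n10 = trials only method B
    got right.  Under the null hypothesis each discordant trial is a
    fair Bernoulli draw, so the p-value is a two-sided binomial tail."""
    n = n01 + n10
    k = min(n01, n10)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2.0 ** n
    return min(1.0, 2.0 * tail)

# Illustrative counts: A right / B wrong on 8 trials, the reverse on 1.
p = mcnemar_p(8, 1)   # 0.0390625 -> significant at the 0.05 level
```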

The receiver operating characteristic (ROC) curve in Figure 2 reports results for a verification scenario. The equal-error rate (EER) is the ROC point at which the false-accept rate equals the false-reject rate. The EER for our approach ranges from approximately 0.12 to approximately 0.23, while the other methods show much greater performance degradation.

4.2. Experiments on the Yale Face Database

Experiments were conducted using the Yale database to evaluate the performance of symbolic KDA for frontal face recognition under different nondark backgrounds. The Yale face database consists of 165 images of 15 different people, with 11 images per person. Figure 3 shows some typical images of one subject of the Yale face database. We preprocessed these images by aligning and scaling them so that the distance between the eyes was the same for all images, ensuring that the eyes occurred at the same coordinates of the image; the resulting images were then cropped. The images were not manually arranged as was done in the previous set of experiments on the ORL database (Section 4.1). In our experiments, nine images were randomly chosen from each class for training, while the remaining two images were used to construct the test symbolic face for each trial.

The experiments were conducted using two different kernels, namely the polynomial kernel and the Gaussian kernel. For the proposed symbolic KDA with respect to a minimum distance classifier, the order of the polynomial kernel should be two and the width of the Gaussian kernel should be four.

After finding the optimal kernel parameter (polynomial degree two) for the symbolic KDA method, experiments were conducted to find the optimal subspace for the proposed symbolic KDA, kernel Fisherface, and kernel Eigenface methods. The recognition rates, training times, and optimal subspace dimensions for kernel Fisherface, kernel Eigenface, and symbolic KDA are listed in Table 3. From Table 3, the symbolic KDA method with polynomial degree two achieves a recognition rate of 89.00% using only 15 features, whereas the kernel Fisherface method requires 42 features to achieve an 87.15% recognition rate. This is because the first few eigenvectors account for the highest variance of the training samples, and these few eigenvectors are enough to represent an image for recognition purposes. Hence, improved recognition results can be achieved at less computational cost by using symbolic KDA, by virtue of its low dimensionality. The experimental results obtained using the Yale face database show that the proposed symbolic KDA performs well on images with nondark backgrounds.

5. Conclusions

In this paper, we propose a novel symbolic KDA method for face recognition. Symbolic data representation of face images using interval-type features yields facial features that cope well with variations due to illumination, orientation, and facial expression changes. The feasibility of symbolic KDA has been tested successfully on frontal face images of the ORL and Yale databases. Experimental results show that the symbolic KDA method with a polynomial kernel leads to an improved recognition rate at reduced computational cost.

The proposed symbolic KDA has several advantages over other popular appearance-based methods. The drawback of those methods is that, in order to recognize a face seen from a particular pose and under a particular illumination, the face must have been previously seen under the same conditions. Symbolic KDA overcomes this limitation by representing faces with interval-type features, so that faces previously seen in different poses, orientations, and illuminations are still recognized. Another important merit is that more than one probe image, with the inherent variability of a face, can be used for recognition, which yields an improved recognition rate; this is clearly evident from the experimental results. We also observe that the proposed symbolic KDA method yields an improved recognition rate with reduced time and fewer features compared to the other kernel-based methods.

The main drawback of the proposed symbolic KDA method is that pose variation is limited to about 20 degrees of orientation, and the performance of the method decreases on face images with pose variation greater than 20 degrees. The proposed method did not achieve 100% accuracy; this is because, while constructing the symbolic faces, the coordinates of the eyes, mouth, and nose may be misaligned owing to the different facial expressions in the training images. This can be observed in the experimental results obtained using the Yale face database, which contains face images with different facial expressions. Moreover, the performance of the proposed symbolic KDA method decreases on the Yale face database images, which show more variation in facial expression, compared to its performance on the ORL face database images, which show less.

Acknowledgment

The authors are indebted to the referees for their helpful comments and suggestions, which improved the earlier version of the paper.