Research Article  Open Access
Geometric Distribution Weight Information Modeled Using Radial Basis Function with Fractional Order for Linear Discriminant Analysis Method
Abstract
Fisher linear discriminant analysis (FLDA) is a classic linear feature extraction and dimensionality reduction approach for face recognition. The geometric distribution weight information of image data is known to play an important role in machine learning approaches. However, FLDA does not employ the geometric distribution weight information of facial images in the training stage, which can degrade its recognition accuracy. To enhance the classification power of the FLDA method, this paper utilizes a radial basis function (RBF) with fractional order to model the geometric distribution weight information of the training samples and proposes a novel geometric distribution weight information based Fisher discriminant criterion. Subsequently, a geometric distribution weight information based LDA (GLDA) algorithm is developed and applied to face recognition. Two publicly available face databases, namely, the ORL and FERET databases, are selected for evaluation. Experimental results show that the GLDA approach outperforms the LDA-based algorithms it is compared with.
1. Introduction
Over the past two decades, face recognition (FR) has made great progress with the increasing computational power of computers and has become one of the most important biometric-based authentication technologies. The key issue for an FR algorithm is dimensionality reduction for facial feature extraction. According to how facial features are extracted, face recognition algorithms can generally be divided into two classes, namely, (local) geometric feature based and (holistic) appearance based [1]. The geometric feature-based approach relies on the shape and location of facial components (such as eyes, eyebrows, nose, and mouth), which are extracted to form a geometric feature vector of the face. The appearance-based approach, in contrast, depends on the global facial pixel features, which are exploited to form a whole facial feature vector for face classification. Principal component analysis (PCA) [2] and linear discriminant analysis (LDA) [3] are two famous appearance-based approaches for linear feature extraction and dimensionality reduction. In face recognition they are also called the Eigenface method and the Fisherface method, respectively. The objective of PCA is to find the orthogonal principal component (PC) directions and preserve the maximum variance information of the training data along the PC directions. PCA can reconstruct each facial image using all Eigenfaces. Since PCA takes no account of discriminant information, it is not well suited to classification tasks. LDA is a supervised learning method that seeks the optimal projection under the Fisher criterion such that the ratio of between-class distance to within-class distance attains its maximum. Therefore, from the classification point of view, LDA should give better performance than PCA. LDA is theoretically sound. However, it still has two issues.
First, LDA often encounters the small sample size (3S) problem, which occurs when the dimension of the input sample space is greater than the number of training facial images. In this situation, LDA cannot be performed directly. To solve the 3S problem, a large number of LDA-based approaches have been proposed [4–16]. Among them, the Fisher linear discriminant analysis (FLDA) method, also called the Fisherface method in FR, is a two-stage algorithm. It first employs PCA for dimensionality reduction to guarantee that the within-class scatter matrix is full rank, and then LDA can be implemented in the PCA-mapped low dimensional feature space. Direct LDA [6] (DLDA) is another LDA-based approach, which uses the simultaneous diagonalization technique [17] to solve the 3S problem. The basic idea of DLDA is to first discard the null space of the between-class scatter matrix and then keep the null space of the within-class scatter matrix. Although DLDA is computationally efficient, its performance is limited, especially when the number of training images increases. This is because discarding the null space of the between-class scatter matrix also indirectly discards the null space of the within-class scatter matrix. Literature [5] shows that the null space of the within-class scatter matrix contains the most discriminant information. Second, these LDA-based methods are based on the classic Fisher criterion, which does not consider the geometric distribution weight information of the training data, so their recognition performance is degraded.
To enhance the discriminant power of the LDA-based approach, this paper presents a novel Fisher criterion that takes into account the geometric distribution weight information of the training facial data. It is natural to assume that within-class samples near their class center are more important for representing the features of the class. So, the proposed method imposes a penalty weight (small weight) on a within-class sample if it is far from its own class center. Meanwhile, if two different class centers are close to each other, they are given a small weight as well. To this end, we should extract the geometric distribution weight information of the training data. In recent years, many fractional order based methods [18–25] have been proposed in the areas of dynamic systems, image processing, face recognition, and so on. This paper adopts a radial basis function (RBF) with fractional order [21–23] to model the geometric distribution weight information of the training samples, and thus we are able to establish a new Fisher criterion that incorporates the geometric distribution weight information of the data. Based on the modified Fisher discriminant criterion, a geometric distribution weight information based linear discriminant analysis (GLDA) method is proposed for face recognition. Our GLDA approach is tested on two face databases, namely, the ORL database and the FERET database. Experimental results show that the proposed GLDA method outperforms the FLDA and DLDA methods.
The rest of this paper is organized as follows. Section 2 briefly introduces the related works. In Section 3, RBF with fractional order is exploited to model the data geometric distribution weight information. The new Fisher criterion is then established using geometric distribution weight information of the training data, and GLDA algorithm is designed. Experimental results on two face databases are reported in Section 4. Finally, Section 5 draws the conclusions.
2. Related Works
In this section, we will introduce some related linear feature extraction and dimensionality reduction algorithms for face recognition.
2.1. Some Notations
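Let $d$ be the dimension of the original sample space and let $C$ be the number of sample classes. The $i$th class contains $N_i$ training samples, and the total number of training samples is $N = \sum_{i=1}^{C} N_i$, where $x_j^{(i)}$ denotes the $j$th sample in class $i$. Assume $m_i$ is the center of class $i$; that is, $m_i = \frac{1}{N_i}\sum_{j=1}^{N_i} x_j^{(i)}$, and the entire mean is $m = \frac{1}{N}\sum_{i=1}^{C}\sum_{j=1}^{N_i} x_j^{(i)}$. In the PCA algorithm, the total scatter matrix $S_t$, also called the covariance matrix, is defined by
$$S_t = \frac{1}{N}\sum_{i=1}^{C}\sum_{j=1}^{N_i}\left(x_j^{(i)} - m\right)\left(x_j^{(i)} - m\right)^T. \tag{1}$$
In the LDA algorithm, the within-class scatter matrix $S_w$ and the between-class scatter matrix $S_b$ are defined, respectively, as follows:
$$S_w = \frac{1}{N}\sum_{i=1}^{C}\sum_{j=1}^{N_i}\left(x_j^{(i)} - m_i\right)\left(x_j^{(i)} - m_i\right)^T, \qquad S_b = \frac{1}{N}\sum_{i=1}^{C} N_i\left(m_i - m\right)\left(m_i - m\right)^T. \tag{2}$$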
The radial basis function with fractional order is given as follows
The previous RBF can be viewed as the normalized radial kernel of fractional order .
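To make the role of the fractional order concrete, the following is a minimal sketch that assumes a kernel of the form exp(-||x - y||^(2α) / σ²). This form, the function name `frac_rbf`, and the parameters `alpha` and `sigma` are illustrative assumptions and should not be read as the definition in formula (3).

```python
import math

def frac_rbf(x, y, alpha=0.5, sigma=1.0):
    """Hypothetical fractional-order RBF: exp(-||x - y||^(2*alpha) / sigma^2).

    alpha = 1 gives the usual Gaussian RBF. For distances greater than 1,
    a smaller alpha makes the kernel decay more slowly. The exact form is
    an assumption made for illustration.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-(dist ** (2.0 * alpha)) / sigma ** 2)

origin = [0.0, 0.0]
far_point = [3.0, 4.0]  # Euclidean distance 5 from the origin

print(frac_rbf(origin, origin))                # identical points give weight 1.0
print(frac_rbf(origin, far_point, alpha=1.0))  # Gaussian case: exp(-25)
print(frac_rbf(origin, far_point, alpha=0.5))  # fractional case: exp(-5)
```

Under this assumed form, the fractional order controls how quickly the weight falls off with distance, which is the degree of freedom the method tunes.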
2.2. PCA
The principal component analysis algorithm is also known as the Karhunen-Loève transformation. It aims to find orthogonal principal component directions such that the scatter of all projected samples along the leading principal component directions is maximal. PCA is theoretically based on the total scatter matrix $S_t$, which can be calculated via formula (1). The PCA projection matrix $W_{PCA}$ is determined by the following criterion:
$$W_{PCA} = \arg\max_{W} \operatorname{tr}\left(W^T S_t W\right), \tag{4}$$
where $W \in \mathbb{R}^{d \times m}$ and $W^T W = I$.
Problem (4) is equivalent to solving the eigensystem $S_t U = U \Lambda$, where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_d)$ with $\lambda_1 \geq \cdots \geq \lambda_d \geq 0$ and $U = [u_1, \ldots, u_d]$. The PCA projection matrix can be chosen as $W_{PCA} = [u_1, \ldots, u_m]$ ($m \leq d$). The column vectors $u_k$ are called the eigenfaces in face recognition. PCA does not use the class label information, so it is an unsupervised learning method, and its performance in classification tasks is limited.
2.3. Fisher LDA
The goal of linear discriminant analysis is to find a low dimensional feature space in which samples of the same class are tightly clustered and samples of different classes are far from each other. Therefore, LDA should acquire an optimal projection matrix $W_{LDA}$ that maximizes the ratio of the between-class scatter to the within-class scatter; namely,
$$W_{LDA} = \arg\max_{W} \frac{\left|W^T S_b W\right|}{\left|W^T S_w W\right|}, \tag{5}$$
where $S_b$ and $S_w$ are the between-class and within-class scatter matrices.
The previous problem is equivalent to solving the following eigensystem:
$$S_w^{-1} S_b W = W \Lambda, \tag{6}$$
where $\Lambda$ is a diagonal eigenvalue matrix with its eigenvalues sorted in decreasing order. The projection matrix $W_{LDA}$ is formed with the eigenvectors corresponding to the largest eigenvalues. In face recognition, the column vectors of $W_{LDA}$ are called Fisherfaces. However, LDA often suffers from the small sample size problem when the number of training samples is smaller than the dimension of the sample vector. In this situation, the within-class scatter matrix $S_w$ is singular (not invertible), and the eigensystem (6) cannot be solved. This means that LDA cannot be performed directly. So, Fisher LDA (FLDA) uses PCA for dimensionality reduction in advance.
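For intuition, in the two-class case the Fisher direction has the well-known closed form w ∝ Sw⁻¹(m₁ − m₂), where Sw is the within-class scatter matrix and m₁, m₂ are the class means. The sketch below works this out on a toy 2-D dataset; the data and the helper names (`mean`, `scatter`, `fisher_direction`) are hypothetical, and this is not the multi-class face recognition pipeline.

```python
def mean(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(2)]

def scatter(points, center):
    """Sum of outer products (p - center)(p - center)^T as a 2x2 nested list."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - center[0], p[1] - center[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s

def fisher_direction(class_a, class_b):
    """Two-class Fisher direction: inverse within-class scatter times mean difference."""
    m_a, m_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, m_a), scatter(class_b, m_b)
    sw = [[s_a[r][c] + s_b[r][c] for c in range(2)] for r in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if abs(det) < 1e-12:
        # Mirrors the small sample size problem: a singular Sw cannot be inverted.
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

class_a = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.5]]
class_b = [[6.0, 5.0], [7.0, 8.0], [8.0, 7.0], [7.0, 6.0]]
w = fisher_direction(class_a, class_b)
proj_a = [w[0] * p[0] + w[1] * p[1] for p in class_a]
proj_b = [w[0] * p[0] + w[1] * p[1] for p in class_b]
print(w)
print(max(proj_b) < min(proj_a))  # the 1-D projections of the two classes do not overlap
```

The explicit singularity check shows why a preliminary dimensionality reduction step is needed when there are fewer samples than dimensions.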
2.4. Direct LDA
Yu and Yang [6] proposed a direct LDA (DLDA) approach using the simultaneous diagonalization technique [17]. Direct LDA is a subspace approach to overcome the 3S problem of LDA. It attempts to obtain the optimal projection matrix $W$ in the subspace $N(S_b)^{\perp}$ that satisfies the following equations:
$$W^T S_b W = I, \qquad W^T S_w W = D_w, \tag{7}$$
where $S_b$ and $S_w$ are the between-class and within-class scatter matrices, $N(S_b)$ means the null space of $S_b$, $N(S_b)^{\perp}$ denotes the complement subspace of $N(S_b)$, and $I$ is an identity matrix. The diagonal matrix $D_w$ may contain 0s and some small eigenvalues in its diagonal. Details can be found in [6].
We can see that some useful discriminant information will be discarded in the intermediate PCA stage of FLDA or simultaneous diagonalization stage of DLDA. Moreover, both FLDA method and DLDA method do not exploit the geometric distribution weight information of the training samples. These factors will affect their recognition performance.
3. Proposed GLDA Method
This section will propose a novel discriminant criterion, which will use the geometric distribution weight information of the training samples. Based on the new discriminant criterion, our GLDA method is proposed. Details are discussed as follows.
3.1. Proposed Discriminant Criterion
To take advantage of the geometric distribution weight information of the face pattern space, we redefine the within-class scatter matrix and between-class scatter matrix , respectively, as follows: where and are radial basis functions defined by (3). and are fractional order parameters, which can be flexibly adjusted to obtain the optimal parameters. It can be seen from (8) that if the distance between the samples and is large, a penalty weight is imposed. Similarly, if the class center is near the center , then it is also given a small weight. Otherwise, it has a large weight.
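The weighting idea can be sketched as follows. This is a minimal illustration under loudly stated assumptions: the within-class weight is taken as exp(-||x - m_i||^(2α)) and the between-class weight as 1 - exp(-||m_i - m||^(2β)), where m_i is a class center and m is the overall mean. These weight forms, the 1/N normalization, and all function names are hypothetical choices that merely reproduce the qualitative behaviour described above; they are not the definitions in (8).

```python
import math

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def centroid(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]

def add_outer(acc, diff, weight):
    """In place: acc += weight * diff diff^T."""
    for r in range(len(diff)):
        for c in range(len(diff)):
            acc[r][c] += weight * diff[r] * diff[c]

def weighted_scatters(classes, alpha=0.5, beta=0.5):
    """Return (Sw, Sb) built with hypothetical distance-based weights.

    Within-class weight:  exp(-||x - m_i||^(2*alpha)), small when x is far
    from its own class center.
    Between-class weight: 1 - exp(-||m_i - m||^(2*beta)), small when the
    class center is near the overall mean.
    """
    dim = len(classes[0][0])
    all_points = [p for cls in classes for p in cls]
    n_total = len(all_points)
    overall = centroid(all_points)
    sw = [[0.0] * dim for _ in range(dim)]
    sb = [[0.0] * dim for _ in range(dim)]
    for cls in classes:
        m_i = centroid(cls)
        for x in cls:
            w_in = math.exp(-(dist(x, m_i) ** (2 * alpha)))
            add_outer(sw, [a - b for a, b in zip(x, m_i)], w_in / n_total)
        w_out = 1.0 - math.exp(-(dist(m_i, overall) ** (2 * beta)))
        add_outer(sb, [a - b for a, b in zip(m_i, overall)], len(cls) * w_out / n_total)
    return sw, sb

toy_classes = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]],
]
sw, sb = weighted_scatters(toy_classes)
print(sw)
print(sb)
```

Because every weight lies in (0, 1], each weighted matrix is a down-scaled version of its classical counterpart, with the largest reduction applied to the terms the criterion is meant to penalize.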
Based on the previous analysis, our geometric distribution weight information based Fisher criterion function is defined by
To obtain the following optimal projection matrix: we can equivalently solve the following eigensystem: where is a diagonal eigenvalue matrix with its eigenvalues sorted in decreasing order. The projection matrix is formed with eigenvectors corresponding to the largest eigenvalues.
3.2. Algorithm Design
This subsection will develop our GLDA algorithm based on geometric distribution weight information Fisher discriminant criterion (9). Details are as follows.
It is easily seen that two scatter matrices and can be rewritten in the following forms, respectively: where where
Since the total scatter matrix , if we define , then can be written as
To solve the eigensystem (11) and to compare the proposed GLDA with the FLDA algorithm under the same conditions, this paper also uses PCA for dimensionality reduction, which guarantees that the geometric information based within-class scatter matrix is nonsingular. This means that GLDA can be carried out in the PCA-transformed low dimensional feature space. The GLDA algorithm is therefore designed as follows.
Step 1. Performing singular value decomposition on , we have , where is an orthonormal matrix, with . Denote , , and then let .
Step 2. Perform singular value decomposition where and are orthonormal matrices, with and .
Step 3. If , then let , and , and go to Step 4. Otherwise, update according to the rule , let , and go to Step 2.
Step 4. Perform an eigenvalue decomposition , where is a diagonal eigenvalue matrix of with its diagonal elements in a decreasing order and is an orthonormal eigenvector matrix. Let .
Step 5. The final GLDA optimal projection matrix is
4. Experimental Results
This section will evaluate the performance of the proposed GLDA method for face recognition. Two LDAbased algorithms, namely FLDA [3] and DLDA [6] algorithms, are chosen for comparisons under the same experimental conditions. In the following experiments, the values of fractional order parameters are given as and . They are manually determined using full search method.
4.1. Human Face Image Databases
Two popular and publicly available databases, namely, the ORL database and the FERET database, are selected for the evaluation. The ORL database contains 40 persons, and each person has 10 images with different facial expressions and small variations in scale and orientation. The resolution of each image is 92 × 112 pixels with 256 gray levels per pixel. Image variations of one person in the database are shown in Figure 1.
For the FERET database, we select 120 people, with 6 images for each individual. The six images are extracted from 4 different sets, namely, Fa, Fb, Fc, and duplicate. Fa and Fb are sets of images taken with the same camera on the same day but with different facial expressions. Fc is a set of images taken with different cameras on the same day. Duplicate is a set of images taken around 6–12 months after the day the Fa and Fb photos were taken. Details of the characteristics of each set can be found in [26]. All images are aligned by the centers of the eyes and mouth and then normalized to the same resolution as the images in the ORL database. Images from two individuals are shown in Figure 2.
For all facial images, the following preprocessing steps are performed.
(i) All images are aligned with the centers of the eyes and mouth. The orientation of the face is adjusted (on-the-plane rotation) such that the line joining the centers of the eyes is parallel to the horizontal axis.
(ii) The dimension of the images is reduced by one-fourth using Daubechies’ D4 wavelet filter. The resolution for all images in the following experiments is .
(iii) Each facial image sample is normalized using the following formula:
In the recognition stage, the nearest neighbor approach is employed for face classification, which is based on the Euclidean distance between the testing image and the class center.
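The classification rule described above can be sketched in a few lines. The feature vectors are assumed to be already projected into the low dimensional feature space; the function names and the toy data are hypothetical.

```python
import math

def class_centers(train):
    """train maps a label to a list of feature vectors; returns label -> mean vector."""
    centers = {}
    for label, vectors in train.items():
        n = len(vectors)
        centers[label] = [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]
    return centers

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def classify(x, centers):
    """Return the label whose class center is closest to x in Euclidean distance."""
    return min(centers, key=lambda label: euclidean(x, centers[label]))

train = {
    "subject_1": [[0.0, 0.0], [1.0, 1.0]],
    "subject_2": [[10.0, 10.0], [11.0, 9.0]],
}
centers = class_centers(train)
print(classify([0.4, 0.8], centers))
print(classify([9.0, 9.5], centers))
```

Comparing against one center per class, rather than against every training image, keeps the number of distance evaluations equal to the number of classes.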
4.2. Comparisons on ORL Database
The experimental setting on the ORL database is as follows. For each run, a fixed number of images (from 2 to 9) is randomly selected from each individual for training, and the rest of the images are used for testing. To ensure a fair comparison, all methods use the same training and testing facial images. Moreover, the experiments are repeated 10 times, and the accuracies are averaged to reduce statistical variation. The average accuracies are tabulated in Table 1 and plotted in Figure 3. TN in Table 1 denotes the number of training samples. The recognition accuracy of each approach increases with the number of training images. The recognition accuracy of the GLDA method increases from 79.98% with 2 training images to 99.00% with 9 training images. For the FLDA and DLDA methods, the accuracies increase from 66.13% and 78.69% with 2 training images to 97.75% and 96.25% with 9 training images, respectively. These results show that our GLDA method gives the best performance on the ORL database.
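The evaluation protocol (random per-subject split, repeated runs, averaged accuracy) can be sketched as below. The `evaluate` callback stands in for training and testing any of the compared methods and is a hypothetical placeholder, as are the function names and the seed handling.

```python
import random

def split_per_subject(images_by_subject, n_train, rng):
    """Randomly pick n_train images per subject for training; the rest are for testing."""
    train, test = {}, {}
    for subject, images in images_by_subject.items():
        shuffled = list(images)
        rng.shuffle(shuffled)
        train[subject] = shuffled[:n_train]
        test[subject] = shuffled[n_train:]
    return train, test

def average_accuracy(images_by_subject, n_train, evaluate, runs=10, seed=0):
    """Average the accuracy returned by evaluate(train, test) over repeated random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        train, test = split_per_subject(images_by_subject, n_train, rng)
        scores.append(evaluate(train, test))
    return sum(scores) / runs

# Toy stand-in for a database with 40 subjects and 10 images each (identifiers only).
data = {s: ["s%d_img%d" % (s, i) for i in range(10)] for s in range(40)}
train, test = split_per_subject(data, 3, random.Random(0))
print(len(train[0]), len(test[0]))

# A dummy evaluator that always reports perfect accuracy, only to show the call shape.
print(average_accuracy(data, 3, lambda tr, te: 1.0))
```

For a fair comparison, the same `train` and `test` dictionaries produced in each run would be passed to every method being compared.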

Figure 3: Average recognition accuracy versus the number of training images, panels (a) and (b).
We would also like to see the detailed performance of every method, which is graphically illustrated using the cumulative match characteristic (CMC) curve and the receiver operating characteristic (ROC) curve. The CMC curve shows the recognition accuracy against the rank, and the ROC curve displays the false acceptance rate (FAR) versus the genuine acceptance rate (GAR). High accuracy or high GAR with low FAR means good performance.
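As a minimal sketch of how these two curves are computed, the functions below derive CMC values from ranked candidate lists and a single (FAR, GAR) point from match distances at one threshold. The function names, the input layout, and the toy numbers are hypothetical.

```python
def cmc_curve(ranked_labels, true_labels, max_rank):
    """ranked_labels[i] lists candidate labels for probe i, best match first.

    Entry r-1 of the result is the fraction of probes whose true label
    appears among the top r candidates.
    """
    hits = [0] * max_rank
    for ranking, truth in zip(ranked_labels, true_labels):
        if truth in ranking[:max_rank]:
            first = ranking.index(truth)
            for r in range(first, max_rank):
                hits[r] += 1
    n = len(true_labels)
    return [h / n for h in hits]

def far_gar(genuine_dists, impostor_dists, threshold):
    """Accept a comparison when its distance is at most the threshold.

    GAR is the fraction of genuine pairs accepted; FAR is the fraction of
    impostor pairs accepted. Sweeping the threshold traces the ROC curve.
    """
    gar = sum(d <= threshold for d in genuine_dists) / len(genuine_dists)
    far = sum(d <= threshold for d in impostor_dists) / len(impostor_dists)
    return far, gar

ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "a", "b"]]
truth = ["a", "a", "b"]
cmc = cmc_curve(ranked, truth, max_rank=3)
print(cmc)

far, gar = far_gar([0.2, 0.4, 0.9], [0.5, 1.2, 1.5, 2.0], threshold=0.6)
print(far, gar)
```

The rank-1 value of the CMC curve is the ordinary recognition accuracy, so the curve extends the tabulated accuracies to looser matching criteria.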
For each number of training images, the CMC curves and the ROC curves are plotted in Figure 4 (TN = 2 to TN = 9) and Figure 5 (TN = 2 to TN = 9), respectively. It can be seen that our method gives the best performance in all cases.
Figure 4: CMC curves on the ORL database, panels (a)–(h) for TN = 2 to TN = 9.
Figure 5: ROC curves on the ORL database, panels (a)–(h) for TN = 2 to TN = 9.
4.3. Comparisons on FERET Database
The experimental setting for the FERET database is similar to that of the ORL database. As the number of images for each person is 6, the number of training images ranges from 2 to 5. The experiments are repeated 10 times and the average accuracy is then calculated. The average accuracy is tabulated in Table 2 and plotted in Figure 3. When 2 images are used for training, the recognition rate of our method is 72.94%, while those of the FLDA and DLDA methods are 62.85% and 70.25%, respectively. The performance of each method improves when the number of training images increases. When the number of training images is equal to 5, the accuracy of the GLDA method increases to 89.83%, while those of the FLDA and DLDA methods are 89.42% and 85.58%, respectively. It can be seen that the proposed method outperforms the FLDA and DLDA methods on the FERET database as well.

Like the ORL database, the detailed performance of each approach is shown using CMC and ROC curves. They are plotted in Figure 6 and Figure 7, respectively, with the number of training images ranging from 2 to 5.
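Figure 6: CMC curves on the FERET database, panels (a)–(d) for 2 to 5 training images.
Figure 7: ROC curves on the FERET database, panels (a)–(d) for 2 to 5 training images.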
Figures 6 and 7 demonstrate that our GLDA method has superior performance on the FERET database.
5. Conclusions
To enhance the discriminant power of traditional LDA-based FR algorithms, this paper proposed to integrate the geometric distribution weight information of the training samples into the Fisher criterion and developed a novel geometric distribution weight information based LDA (GLDA) face recognition approach. The geometric distribution weight information is learnt using a radial basis function with fractional order. The proposed GLDA method is tested on two face databases, namely, the ORL and FERET face databases. Compared with the FLDA and DLDA methods, experimental results demonstrate that our GLDA method has the best performance.
Acknowledgments
This paper is partially supported by the NSFC (61272252) and the Science & Technology Planning Project of Shenzhen City (JC201105130447A, JCYJ20130326111024546). The authors would like to thank the Olivetti Research Laboratory and the Army Research Laboratory for providing the face image databases.
References
[1] R. Brunelli and T. Poggio, “Face recognition: features versus templates,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 10, pp. 1042–1052, 1993.
[2] M. A. Turk and A. P. Pentland, “Face recognition using eigenfaces,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 586–591, June 1991.
[3] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. fisherfaces: recognition using class specific linear projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720, 1997.
[4] A. M. Martinez and A. C. Kak, “PCA versus LDA,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 228–233, 2001.
[5] L.-F. Chen, H.-Y. M. Liao, M.-T. Ko, J.-C. Lin, and G.-J. Yu, “New LDA-based face recognition system which can solve the small sample size problem,” Pattern Recognition, vol. 33, no. 10, pp. 1713–1726, 2000.
[6] H. Yu and J. Yang, “A direct LDA algorithm for high-dimensional data - with application to face recognition,” Pattern Recognition, vol. 34, no. 10, pp. 2067–2070, 2001.
[7] W.-S. Chen, P. C. Yuen, and J. Huang, “A new regularized linear discriminant analysis method to solve small sample size problems,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 19, no. 7, pp. 917–935, 2005.
[8] P. Howland, J. Wang, and H. Park, “Solving the small sample size problem in face recognition using generalized discriminant analysis,” Pattern Recognition, vol. 39, no. 2, pp. 277–287, 2006.
[9] W.-S. Chen, P. C. Yuen, J. Huang, and B. Fang, “Two-step single parameter regularization fisher discriminant method for face recognition,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 20, no. 2, pp. 189–208, 2006.
[10] J. D. Tebbens and P. Schlesinger, “Improving implementation of linear discriminant analysis for the high dimension/small sample size problem,” Computational Statistics & Data Analysis, vol. 52, no. 1, pp. 423–437, 2007.
[11] K. Das and Z. Nenadic, “An efficient discriminant-based solution for small sample size problem,” Pattern Recognition, vol. 42, no. 5, pp. 857–866, 2009.
[12] H. Hu, P. Zhang, and F. De la Torre, “Face recognition using enhanced linear discriminant analysis,” IET Computer Vision, vol. 4, no. 3, pp. 195–208, 2010.
[13] D. Chu and G. S. Thye, “A new and fast implementation for null space based linear discriminant analysis,” Pattern Recognition, vol. 43, no. 4, pp. 1373–1379, 2010.
[14] W.-S. Chen, P. C. Yuen, and X. Xie, “Kernel machine-based rank-lifting regularized discriminant analysis method for face recognition,” Neurocomputing, vol. 74, no. 17, pp. 2953–2960, 2011.
[15] A. Sharma and K. K. Paliwal, “A two-stage linear discriminant analysis for face-recognition,” Pattern Recognition Letters, vol. 33, no. 9, pp. 1157–1162, 2012.
[16] F. Dornaika and A. Bosaghzadeh, “Exponential local discriminant embedding and its application to face recognition,” IEEE Transactions on Cybernetics, vol. 43, no. 3, pp. 921–934, 2013.
[17] K. Fukunaga, Introduction to Statistical Pattern Recognition, Computer Science and Scientific Computing, Academic Press, Boston, Mass, USA, 2nd edition, 1990.
[18] C. Liu, “Gabor-based kernel PCA with fractional power polynomial models for face recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 5, pp. 572–581, 2004.
[19] C. Cattani and A. Ciancio, “Separable transition density in the hybrid model for tumor-immune system competition,” Computational and Mathematical Methods in Medicine, vol. 2012, Article ID 610124, 6 pages, 2012.
[20] C. Cattani, A. Ciancio, and B. Lods, “On a mathematical model of immune competition,” Applied Mathematics Letters, vol. 19, no. 7, pp. 678–683, 2006.
[21] M. Li and W. Zhao, “Representation of a stochastic traffic bound,” IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 9, pp. 1368–1372, 2010.
[22] M. Li, “Approximating ideal filters by systems of fractional order,” Computational and Mathematical Methods in Medicine, vol. 2012, Article ID 365054, 6 pages, 2012.
[23] M. Li, W. Zhao, and C. Cattani, “Delay bound: fractal traffic passes through servers,” Mathematical Problems in Engineering. In press.
[24] S. X. Hu, Z. W. Liao, and W. F. Chen, “Sinogram restoration for low-dosed X-ray computed tomography using fractional-order Perona-Malik diffusion,” Mathematical Problems in Engineering, vol. 2012, Article ID 391050, 13 pages, 2012.
[25] S. X. Hu, “External fractional-order gradient vector Perona-Malik diffusion for sinogram restoration of low-dosed X-ray computed tomography,” Advances in Mathematical Physics, vol. 2013, Article ID 516919, 14 pages, 2013.
[26] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evaluation methodology for face-recognition algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000.
Copyright
Copyright © 2013 Wen-Sheng Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.