International Journal of Distributed Sensor Networks
Volume 2014 (2014), Article ID 242105, 7 pages
http://dx.doi.org/10.1155/2014/242105
Research Article

Distributed Face Recognition Using Multiple Kernel Discriminant Analysis in Wireless Sensor Networks

Xiao-Zhang Liu and Guan Yang

1School of Computer Science, Dongguan University of Technology, Dongguan 523808, China
2School of Computer Science, Zhongyuan University of Technology, Zhengzhou 450007, China

Received 17 October 2013; Accepted 18 October 2013; Published 23 January 2014

Academic Editor: Fatos Xhafa

Copyright © 2014 Xiao-Zhang Liu and Guan Yang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

This paper proposes a module-based distributed wireless face recognition system by integrating multiple kernel discriminant analysis with face recognition in wireless sensor networks. For each module, we separately perform an iterative scheme for kernel parameter optimization by maximizing the maximum margin criterion (MMC). Simulations on the FERET and CMU PIE face databases show that our multiple kernel framework and the optimization procedure achieve high recognition performance compared with single-kernel-based KDDA.

1. Introduction

Face recognition (FR) is one of the most important biometric techniques and is used in a wide range of security applications such as access control, identification systems, and surveillance [1]. FR is a contactless biometric technique and has the advantages of being natural and passive over other biometric techniques that require cooperative subjects, such as fingerprint recognition and iris recognition [2]. A typical framework of an FR system is shown in Figure 1, comprising the enrollment and identification procedures [3].

Figure 1: A typical framework of an FR system.

In recent years, FR systems combined with wireless sensor networks (WSNs) [4] have attracted great interest, as WSNs are very helpful for contactless biometric security applications. For example, Kim et al. implement a wireless face recognition system based on the ZigBee protocol and the principal component analysis (PCA) method with low energy consumption [5]. Muraleedharan et al. propose the use of a specific evolutionary algorithm to optimize routing in a distributed time-varying network for face recognition [6]. Chang and Aghajan focus on recovering face orientation for more robust face recognition in wireless image sensor networks [7]. Zaeri et al. propose the application of face recognition to wireless surveillance systems [8].

As there exist many image variations such as pose, illumination, and facial expression, face recognition is a highly complex and nonlinear problem that cannot be sufficiently handled by linear methods such as principal component analysis (PCA) [9] and linear discriminant analysis (LDA) [10]. It is therefore reasonable to assume that a better solution to this inherently nonlinear problem can be achieved using nonlinear methods, such as the so-called kernel machine techniques [11]. Following the success of the kernel trick in support vector machines (SVMs) [12], many kernel-based PCA and LDA methods have been developed and applied to pattern recognition tasks, such as kernel PCA (KPCA) [13], kernel Fisher discriminant (KFD) [14], generalized discriminant analysis (GDA) [15], and kernel direct LDA (KDDA) [16].

It has been shown that kernel-based LDA is a feasible approach to the nonlinear problems in face recognition. However, its performance is sensitive to the choice of kernel function and its parameters. To date, kernel parameter selection is mainly achieved by cross-validation [17], which is computationally expensive, and the selected parameters are not guaranteed to be optimal. Furthermore, a single fixed kernel can only characterize the geometrical structure of some aspects of the input data and is thus not always suitable for applications that involve data from multiple, heterogeneous sources [18, 19].

Recent applications and developments based on SVMs [20, 21] have shown that using multiple kernels (i.e., a combination of several “base kernels”) instead of a single fixed one can enhance classifier performance, which gave rise to the so-called multiple kernel learning (MKL) method. With kernels, input data can be mapped into feature spaces, where each feature space can be regarded as one view of the original input data [19]. Each view is expected to exhibit some geometrical structure of the original data from its own perspective, so that all the views complement one another for the subsequent learning task. It has been shown that MKL offers needed flexibility and handles well the case of multiple, heterogeneous data sources [18, 22, 23]. However, MKL was proposed for SVMs, and there have been few reports on the performance of kernel-based LDA with multiple kernels. Liu and Feng propose multiple kernel Fisher discriminant analysis (MKFD) with an iterative scheme for weight optimization [24], in which the constructed kernel is a linear combination of several base kernels with a constraint on their weights.

In this paper, we integrate multiple kernel discriminant analysis with face recognition in wireless sensor networks and propose a module-based distributed wireless face recognition system. We assign a separate cluster head to each module, that is, forehead, eyes, nose, and lips. Only the local cluster is responsible for the internal processing of its module, for both training and recognition.

The rest of this paper is organized as follows. Section 2 describes the module-based distributed wireless face recognition system. Section 3 presents the optimization scheme for the multiple kernels. Simulation results are reported in Section 4, and conclusions are drawn in Section 5.

2. Module Based Distributed Wireless Face Recognition

We present a face recognition system in wireless sensor networks where both training and recognition are performed in a distributed environment. The face image is divided into four submodules, that is, forehead, eyes, nose, and lips, as shown in Figure 2. For face recognition tasks, enrollment and identification of each submodule are performed in separate cluster heads, and the computations are carried out in a kernel feature space [12]. Each cluster head is responsible for processing its submodule and communicating with the sink cluster, which performs the score-level fusion.

Figure 2: Submodules of a face image.
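As an illustration of the module division, the sketch below slices a normalized face image into the four submodules; the row boundaries are illustrative assumptions, since the paper does not specify the exact pixel ranges of each region.

```python
import numpy as np

def split_face(img):
    """Split a normalized face image into the four submodules used by the
    cluster heads. The row fractions below are illustrative assumptions."""
    h = img.shape[0]
    return {
        "forehead": img[: int(0.25 * h), :],
        "eyes":     img[int(0.25 * h): int(0.50 * h), :],
        "nose":     img[int(0.50 * h): int(0.75 * h), :],
        "lips":     img[int(0.75 * h):, :],
    }

# Example: a 49 x 59 wavelet feature face (see Section 4.1)
modules = split_face(np.random.rand(49, 59))
```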

The following describes the score-level fusion criterion at the sink node. Suppose there are $N$ images belonging to $c$ subjects. Denote the membership degree of the $i$th image belonging to the $j$th subject as $u_{ij}$, $i = 1, \ldots, N$, $j = 1, \ldots, c$. $u_{ij}$ is obtained as follows:

$$u_{ij} = \sum_{m=1}^{4} w_m\, s_{ij}^{(m)}, \tag{1}$$

where $s_{ij}^{(m)}$ denotes the score of the $m$th submodule of the $i$th image with regard to the $j$th subject, and $w_m$ is the corresponding weight value, $\sum_{m=1}^{4} w_m = 1$. The $i$th image is assigned to the $j^{*}$th subject if and only if $u_{ij^{*}} = \max_{1 \le j \le c} u_{ij}$.
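For concreteness, here is a minimal sketch of the fusion rule (1), assuming the four per-module score matrices have already been computed (how each score is obtained is described next); the weight values are placeholders.

```python
import numpy as np

def fuse_scores(scores, weights):
    """Score-level fusion at the sink node, as in (1).

    scores  : list of 4 arrays of shape (N, c); scores[m][i, j] is the score
              of the m-th submodule of image i for subject j.
    weights : four weights summing to one.
    Returns the (N, c) membership matrix u and the predicted subject indices.
    """
    u = sum(w * s for w, s in zip(weights, scores))
    return u, np.argmax(u, axis=1)

# Illustrative usage with random scores for N = 5 images and c = 3 subjects
rng = np.random.default_rng(0)
scores = [rng.random((5, 3)) for _ in range(4)]
u, predicted = fuse_scores(scores, weights=[0.2, 0.3, 0.3, 0.2])
```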

We now explain the score $s_{ij}^{(1)}$, taking the forehead module as an example. In the forehead module cluster, there are also $N$ samples belonging to $c$ classes. For the $i$th sample $x_i$, we can compute the squared kernel distance between $\phi(x_i)$ and the center of the $j$th class as follows:

$$d_{ij}^{2} = \Big\| \phi(x_i) - \frac{1}{N_j}\sum_{x \in C_j}\phi(x) \Big\|^{2}, \tag{2}$$

where $\phi$ is a nonlinear mapping implicitly defined by a Mercer kernel function $k(\cdot,\cdot)$ [12, 25], $C_j$ denotes the $j$th class, and $N_j$ is the number of its samples. Then we sort the distances $d_{i1}^{2}, \ldots, d_{ic}^{2}$ in ascending order and denote by $r_{ij}$ the rank of $d_{ij}^{2}$ after sorting, so that $r_{ij} = 1$ corresponds to the smallest squared kernel distance. Let

$$s_{ij}^{(1)} = \frac{c - r_{ij} + 1}{c}; \tag{3}$$

then we get $s_{ij}^{(1)} \in \big[\tfrac{1}{c}, 1\big]$. Thus, given the $i$th sample, the smaller the kernel distance between it and the center of the $j$th class, the greater $s_{ij}^{(1)}$ is.

Scores $s_{ij}^{(2)}$, $s_{ij}^{(3)}$, and $s_{ij}^{(4)}$ can be obtained from the eyes, nose, and lips modules, respectively, in a similar way to $s_{ij}^{(1)}$.
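The squared kernel distance in (2) can be evaluated from kernel values alone, since $\|\phi(x_i) - \bar{\phi}_j\|^{2} = k(x_i, x_i) - \frac{2}{N_j}\sum_{x \in C_j} k(x_i, x) + \frac{1}{N_j^{2}}\sum_{x, x' \in C_j} k(x, x')$. The sketch below follows this expansion together with the rank-based score reconstructed in (3); the Gaussian kernel parameter and the data are placeholders.

```python
import numpy as np

def rbf(a, b, theta):
    """Gaussian RBF kernel k(a, b) = exp(-||a - b||^2 / theta)."""
    return np.exp(-np.sum((a - b) ** 2) / theta)

def module_scores(x, train, labels, theta):
    """Scores s_ij of one submodule for a single test vector x, as in (2)-(3).

    train : (N, d) training vectors of this submodule; labels : (N,) class ids.
    The squared kernel distance is expanded in kernel evaluations only.
    """
    classes = np.unique(labels)
    d2 = np.empty(len(classes))
    for idx, j in enumerate(classes):
        Xj = train[labels == j]
        k_xx = rbf(x, x, theta)                               # = 1 for the RBF kernel
        k_xj = np.mean([rbf(x, z, theta) for z in Xj])
        k_jj = np.mean([rbf(z1, z2, theta) for z1 in Xj for z2 in Xj])
        d2[idx] = k_xx - 2.0 * k_xj + k_jj                    # squared kernel distance (2)
    ranks = np.argsort(np.argsort(d2)) + 1                    # 1 = smallest distance
    c = len(classes)
    return (c - ranks + 1) / c                                # rank-based score (3)

# Toy usage: 9 training vectors in 3 classes, one test vector
rng = np.random.default_rng(4)
train, labels = rng.random((9, 20)), np.repeat([0, 1, 2], 3)
print(module_scores(rng.random(20), train, labels, theta=20.0))
```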

3. Optimization of Multiple Kernels

From Section 2, we can see that the computations in the proposed distributed wireless face recognition framework are based on kernel techniques. The framework has four computing modules, for the forehead, eyes, nose, and lips, respectively. These four kinds of data are so different that excellent classification can hardly be expected from a single kernel, so we propose the use of multiple kernels for the four modules. Specifically, we use the Gaussian RBF kernel but with a different value of the kernel parameter $\theta$ for each computing module.

Every module should have its own optimal kernel parameter. For each module, we separately perform the following kernel parameter optimization procedure.

3.1. Some Notations on Kernel Discriminant Analysis

For a certain module (the module index is omitted for notational simplicity), there are $N$ samples $x_1, \ldots, x_N$ belonging to $c$ classes. Assume the $j$th class $C_j$ contains $N_j$ samples, $j = 1, \ldots, c$, so $\sum_{j=1}^{c} N_j = N$. Denote the kernel matrix ($N \times N$) as

$$K = \big(k_{pq}\big)_{N \times N}, \qquad k_{pq} = k(x_p, x_q) = \exp\!\left(-\frac{\|x_p - x_q\|^{2}}{\theta}\right), \tag{4}$$

where $\theta > 0$ is the Gaussian RBF kernel parameter; the kernel matrix $K$ corresponds to the nonlinear mapping $\phi$ through $k(x_p, x_q) = \phi(x_p)^{T}\phi(x_q)$.
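A minimal NumPy sketch of the Gaussian RBF kernel matrix of (4); the sample matrix and the value of $\theta$ are placeholders.

```python
import numpy as np

def kernel_matrix(X, theta):
    """Gaussian RBF kernel matrix K with k_pq = exp(-||x_p - x_q||^2 / theta)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    return np.exp(-np.maximum(d2, 0.0) / theta)

# Example: N = 12 flattened 49 x 59 submodule samples
X = np.random.rand(12, 49 * 59)
sq = np.sum(X ** 2, axis=1)
theta0 = np.mean(sq[:, None] + sq[None, :] - 2.0 * X @ X.T)  # average squared distance (Section 3.4)
K = kernel_matrix(X, theta0)
```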

Under the nonlinear mapping $\phi$, the $j$th mapped class and the mapped sample set are, respectively, given by

$$\bar{Z}_j = \{\phi(x) : x \in C_j\}, \qquad \bar{Z} = \bigcup_{j=1}^{c}\bar{Z}_j. \tag{5}$$

Also, the mean of the $j$th mapped class and that of the mapped sample set are, respectively, given by

$$\bar{\phi}_j = \frac{1}{N_j}\sum_{x \in C_j}\phi(x), \qquad \bar{\phi} = \frac{1}{N}\sum_{p=1}^{N}\phi(x_p). \tag{6}$$

In the kernel feature space $\mathcal{F}$ (let $f$ be the dimensionality of $\mathcal{F}$), the within-class scatter matrix and the between-class scatter matrix are, respectively, defined as

$$S_w = \frac{1}{N}\sum_{j=1}^{c}\sum_{x \in C_j}\big(\phi(x) - \bar{\phi}_j\big)\big(\phi(x) - \bar{\phi}_j\big)^{T} = \Phi_w\Phi_w^{T}, \tag{7}$$

$$S_b = \frac{1}{N}\sum_{j=1}^{c} N_j\big(\bar{\phi}_j - \bar{\phi}\big)\big(\bar{\phi}_j - \bar{\phi}\big)^{T} = \Phi_b\Phi_b^{T}, \tag{8}$$

where

$$\Phi_w = \frac{1}{\sqrt{N}}\Big[\ldots,\ \phi(x_p) - \bar{\phi}_{j(p)},\ \ldots\Big]_{f \times N}, \qquad \Phi_b = \Big[\sqrt{\tfrac{N_1}{N}}\big(\bar{\phi}_1 - \bar{\phi}\big), \ldots, \sqrt{\tfrac{N_c}{N}}\big(\bar{\phi}_c - \bar{\phi}\big)\Big]_{f \times c}, \tag{9}$$

with $j(p)$ denoting the class label of sample $x_p$.

The kernel Fisher criterion is defined as

$$J(A) = \frac{\big|A^{T} S_b A\big|}{\big|A^{T} S_w A\big|}, \tag{10}$$

where $A$ is a projection matrix. Kernel discriminant analysis seeks an optimal projection matrix $A^{*}$ in the mapped feature space $\mathcal{F}$, such that $A^{*} = \arg\max_{A} J(A)$.

3.2. Diagonalization Strategy

We use the same diagonalization strategy as KDDA [16] to deal with the small sample size (SSS) problem; that is, we first diagonalize $S_b$ to the identity matrix $I$ and then diagonalize $S_w$ to a diagonal matrix $\Lambda_w$. The procedure is briefly expressed as follows.

3.2.1. Eigenanalysis of $S_b$ in the Feature Space

The $c \times c$ matrix $E_b = \Phi_b^{T}\Phi_b$, which has the same nonzero eigenvalues as $S_b$, can be expressed using the kernel matrix $K$ as follows:

$$E_b = \Phi_b^{T}\Phi_b = H^{T} K H, \qquad H = \Big(A_{N \times c} - \frac{1}{N}\mathbf{1}_{N \times c}\Big)B, \tag{11}$$

where $B = \mathrm{diag}\big(\sqrt{N_1/N}, \ldots, \sqrt{N_c/N}\big)$ (a $c \times c$ diagonal matrix), $\mathbf{1}_{N \times c}$ is an $N \times c$ matrix with terms all equal to one, $A_{N \times c} = \mathrm{diag}\big(a_{N_1}, \ldots, a_{N_c}\big)$ is an $N \times c$ block diagonal matrix, and $a_{N_j}$ is an $N_j \times 1$ vector with all terms equal to $1/N_j$.

Let $\lambda_i$ and $e_i$ be the $i$th largest eigenvalue and the corresponding eigenvector of $E_b$, $i = 1, \ldots, c$. Let $m$ be the rank of $E_b$ (also the rank of $S_b$). Denote $E_m = [e_1, \ldots, e_m]$ and $\Lambda_b = \mathrm{diag}(\lambda_1, \ldots, \lambda_m)$. It can be derived that $(\Phi_b E_m)^{T} S_b (\Phi_b E_m) = \Lambda_b^{2}$, with $\Lambda_b$ a nonsingular diagonal matrix. Let $U = \Phi_b E_m \Lambda_b^{-1}$. Then $U^{T} S_b U = I$.
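A compact sketch of (11) and of the eigenanalysis above, under the notation reconstructed in this section; the toy data, the class sizes, and the rank tolerance are illustrative.

```python
import numpy as np

def build_H(labels):
    """H = (A_{Nxc} - (1/N) 1_{Nxc}) B, so that Phi_b = Phi H, see (11)."""
    classes, counts = np.unique(labels, return_counts=True)
    N, c = len(labels), len(classes)
    A = np.zeros((N, c))
    for j, (cls, Nj) in enumerate(zip(classes, counts)):
        A[labels == cls, j] = 1.0 / Nj
    B = np.diag(np.sqrt(counts / N))
    return (A - np.ones((N, c)) / N) @ B

# Toy data: N = 12 samples, c = 3 classes, Gaussian kernel with a placeholder theta
rng = np.random.default_rng(0)
X, labels = rng.random((12, 20)), np.repeat([0, 1, 2], 4)
theta = X.shape[1]
sq = np.sum(X ** 2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / theta)

H = build_H(labels)
E_b = H.T @ K @ H                        # c x c, same nonzero eigenvalues as S_b, see (11)
lam, E = np.linalg.eigh(E_b)             # eigenvalues in ascending order
order = np.argsort(lam)[::-1]            # largest first
keep = lam[order] > 1e-10                # numerical rank m
E_m = E[:, order[keep]]
Lam_b = np.diag(lam[order][keep])
```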

3.2.2. Eigenanalysis of $S_w$ in the Feature Space

Based on the analysis in Section 3.2.1, it can be seen that

$$U^{T} S_w U = \Lambda_b^{-1} E_m^{T}\big(\Phi_b^{T} S_w \Phi_b\big) E_m \Lambda_b^{-1}, \tag{12}$$

where $\Phi_b^{T} S_w \Phi_b$ can be expressed using the kernel matrix $K$, with details similar to those in [16].

Let $p_i$ be the eigenvector of $U^{T} S_w U$ corresponding to the $i$th smallest eigenvalue $\omega_i$, $i = 1, \ldots, m$. Denote $P = [p_1, \ldots, p_M]$ ($M \le m$). Defining $\Lambda_w = \mathrm{diag}(\omega_1, \ldots, \omega_M)$, it can be derived that $(UP)^{T} S_w (UP) = \Lambda_w$, with $(UP)^{T} S_b (UP) = I$.

Based on the derivation presented in Sections 3.2.1 and 3.2.2, an optimal projection matrix for kernel discriminant analysis is obtained as

$$A = U P \Lambda_w^{-1/2} = \Phi_b E_m \Lambda_b^{-1} P \Lambda_w^{-1/2}. \tag{13}$$

Certainly, as the nonlinear mapping $\phi$ is implicitly defined by the kernel function (or matrix), $\Phi_b$ (defined by (9)) remains unknown, and $A$ cannot be evaluated explicitly. The real meaning of (13) is obtaining the coefficient matrix $\Theta = E_m \Lambda_b^{-1} P \Lambda_w^{-1/2}$, such that $A = \Phi_b \Theta$, which can be computed from the kernel matrix $K$. This is the core result of the diagonalization process.
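The whole diagonalization can be carried out on the kernel matrix alone. The sketch below computes $\Theta$ under the notation reconstructed in Section 3.2; it is not the authors' reference implementation, and the small ridge `eps` used to invert $\Lambda_w$ is a numerical safeguard not specified in the paper.

```python
import numpy as np

def kda_coefficients(K, labels, M=None, eps=1e-8):
    """Compute Theta = E_m Lambda_b^{-1} P Lambda_w^{-1/2}, so that A = Phi_b Theta."""
    classes, counts = np.unique(labels, return_counts=True)
    N, c = len(labels), len(classes)
    # H with Phi_b = Phi H, and the block diagonal class-averaging matrix W
    A = np.zeros((N, c)); W = np.zeros((N, N))
    for j, (cls, Nj) in enumerate(zip(classes, counts)):
        idx = np.where(labels == cls)[0]
        A[idx, j] = 1.0 / Nj
        W[np.ix_(idx, idx)] = 1.0 / Nj
    H = (A - np.ones((N, c)) / N) @ np.diag(np.sqrt(counts / N))   # (11)

    E_b = H.T @ K @ H
    lam, E = np.linalg.eigh(E_b)
    order = np.argsort(lam)[::-1]
    kept = lam[order] > 1e-10
    E_m, lam_b = E[:, order[kept]], lam[order][kept]
    Lb_inv = np.diag(1.0 / lam_b)

    G_w = H.T @ K @ (np.eye(N) - W) @ K @ H / N        # Phi_b^T S_w Phi_b in kernel form
    S_w_tilde = Lb_inv @ E_m.T @ G_w @ E_m @ Lb_inv    # U^T S_w U, see (12)
    omega, P_full = np.linalg.eigh(S_w_tilde)          # ascending: smallest first
    M = P_full.shape[1] if M is None else M
    P, omega_M = P_full[:, :M], np.maximum(omega[:M], 0.0)
    return E_m @ Lb_inv @ P @ np.diag(1.0 / np.sqrt(omega_M + eps))   # Theta, see (13)

# Toy usage: Gaussian kernel on random data
rng = np.random.default_rng(1)
X, labels = rng.random((12, 20)), np.repeat([0, 1, 2], 4)
sq = np.sum(X ** 2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / X.shape[1])
Theta = kda_coefficients(K, labels)
```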

3.3. Optimization Criterion and Objective

We adopt the maximum margin criterion (MMC) [26] as the objective function to optimize the kernel parameter for each specific module (forehead, eyes, nose, or lips):

$$\theta^{*} = \arg\max_{\theta} J(A, \theta), \qquad J(A, \theta) = \mathrm{tr}\big(A^{T}(S_b - S_w)A\big), \tag{14}$$

where $A$ is a projection matrix and $\theta$ is the parameter of the Gaussian RBF kernel as in (4).

Based on the result of (13) in Section 3.2, the optimal projection matrix is $A = \Phi_b \Theta$, with $\Theta = E_m \Lambda_b^{-1} P \Lambda_w^{-1/2}$, which can be computed from the kernel matrix $K$. Then the objective function (14) can be reformulated as

$$J(\theta) = \mathrm{tr}\big(\Theta^{T}(G_b - G_w)\Theta\big), \tag{15}$$

where $G_b = \Phi_b^{T} S_b \Phi_b$ and $G_w = \Phi_b^{T} S_w \Phi_b$ can be expressed in terms of the kernel matrix $K$ as follows:

$$G_b = \big(H^{T} K H\big)^{2}, \tag{16}$$

$$G_w = \frac{1}{N} H^{T} K (I_N - W) K H, \tag{17}$$

with $H$, $B$, $A_{N \times c}$, and $\mathbf{1}_{N \times c}$ defined the same as in (11); $I_N$ is the $N \times N$ identity matrix, $W = \mathrm{diag}(W_1, \ldots, W_c)$ is an $N \times N$ block diagonal matrix, and $W_j$ is an $N_j \times N_j$ matrix with all terms equal to $1/N_j$.
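Under this reformulation, $J(\theta)$ can be evaluated directly from the data of one module for any candidate $\theta$, holding $\Theta$ fixed. A sketch, with placeholder data and a placeholder $\Theta$:

```python
import numpy as np

def mmc_objective(theta, X, labels, Theta):
    """J(theta) = tr(Theta^T (G_b - G_w) Theta), see (15)-(17); Theta is held fixed."""
    classes, counts = np.unique(labels, return_counts=True)
    N, c = len(labels), len(classes)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / theta)   # (4)
    A = np.zeros((N, c)); W = np.zeros((N, N))
    for j, (cls, Nj) in enumerate(zip(classes, counts)):
        idx = np.where(labels == cls)[0]
        A[idx, j] = 1.0 / Nj
        W[np.ix_(idx, idx)] = 1.0 / Nj
    H = (A - np.ones((N, c)) / N) @ np.diag(np.sqrt(counts / N))       # (11)
    E_b = H.T @ K @ H
    G_b = E_b @ E_b                                                    # (16)
    G_w = H.T @ K @ (np.eye(N) - W) @ K @ H / N                        # (17)
    return np.trace(Theta.T @ (G_b - G_w) @ Theta)

# Toy usage with a placeholder Theta of shape (c, M)
rng = np.random.default_rng(2)
X, labels = rng.random((12, 20)), np.repeat([0, 1, 2], 4)
print(mmc_objective(theta=20.0, X=X, labels=labels, Theta=rng.random((3, 2))))
```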

Therefore, regarding $\Theta$ as constant, the objective function $J(\theta)$ is an explicit function of the kernel parameter $\theta$. To maximize the objective function with $\theta_0$ as the initial value of the parameter, an iterative procedure based on Newton's method is developed in our method to update the kernel parameter, as shown in the following section.

3.4. Solving the Optimization Problem

Assume that $\Theta$ is constant. To obtain the extremum of the objective function $J(\theta)$, we need to differentiate it with respect to $\theta$.

For the sake of clarity, we denote the derivative of the kernel matrix $K$ with respect to $\theta$ as $K'$, which is an $N \times N$ matrix expressed as follows:

$$K' = \frac{\partial K}{\partial \theta} = \big(k'_{pq}\big)_{N \times N}. \tag{18}$$

Each element of $K'$ is the derivative of the corresponding element of the kernel matrix $K$ with respect to $\theta$ and can be formulated as

$$k'_{pq} = \frac{\partial k_{pq}}{\partial \theta} = \frac{\|x_p - x_q\|^{2}}{\theta^{2}} \exp\!\left(-\frac{\|x_p - x_q\|^{2}}{\theta}\right) = \frac{\|x_p - x_q\|^{2}}{\theta^{2}}\, k_{pq}. \tag{19}$$

From (16) and (17), the matrices $G_b$ and $G_w$ can be differentiated with respect to $\theta$ as follows:

$$\frac{\partial G_b}{\partial \theta} = \big(H^{T} K' H\big)\big(H^{T} K H\big) + \big(H^{T} K H\big)\big(H^{T} K' H\big), \tag{20}$$

$$\frac{\partial G_w}{\partial \theta} = \frac{1}{N}\Big(H^{T} K' (I_N - W) K H + H^{T} K (I_N - W) K' H\Big), \tag{21}$$

where $K'$ is an $N \times N$ matrix and $H$ is an $N \times c$ matrix.
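A NumPy sketch of (18)-(21), again under the reconstructed notation; the data and the value of $\theta$ are placeholders.

```python
import numpy as np

def derivative_matrices(theta, X, labels):
    """Return dG_b/dtheta and dG_w/dtheta, following (19)-(21)."""
    classes, counts = np.unique(labels, return_counts=True)
    N, c = len(labels), len(classes)
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T       # pairwise ||x_p - x_q||^2
    K = np.exp(-D2 / theta)                              # (4)
    Kp = (D2 / theta ** 2) * K                           # K' elementwise, see (19)
    A = np.zeros((N, c)); W = np.zeros((N, N))
    for j, (cls, Nj) in enumerate(zip(classes, counts)):
        idx = np.where(labels == cls)[0]
        A[idx, j] = 1.0 / Nj
        W[np.ix_(idx, idx)] = 1.0 / Nj
    H = (A - np.ones((N, c)) / N) @ np.diag(np.sqrt(counts / N))
    E_b, E_bp = H.T @ K @ H, H.T @ Kp @ H
    dG_b = E_bp @ E_b + E_b @ E_bp                       # (20)
    Mw = np.eye(N) - W
    dG_w = (H.T @ Kp @ Mw @ K @ H + H.T @ K @ Mw @ Kp @ H) / N   # (21)
    return dG_b, dG_w

# Toy usage
rng = np.random.default_rng(5)
dG_b, dG_w = derivative_matrices(20.0, rng.random((12, 20)), np.repeat([0, 1, 2], 4))
```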

Then, the derivative of $J(\theta)$ with respect to $\theta$ can be formulated as $\frac{\partial J(\theta)}{\partial \theta} = \mathrm{tr}\big(\Theta^{T}\big(\frac{\partial G_b}{\partial \theta} - \frac{\partial G_w}{\partial \theta}\big)\Theta\big)$. Thus the derivative of $J(\theta)$ with respect to $\theta$ can be expressed in terms of the matrix $K$ and the matrix $K'$. To achieve the maximum of $J(\theta)$, we set the derivative to zero:

$$\frac{\partial J(\theta)}{\partial \theta} = \mathrm{tr}\!\left(\Theta^{T}\Big(\frac{\partial G_b}{\partial \theta} - \frac{\partial G_w}{\partial \theta}\Big)\Theta\right) = 0. \tag{22}$$

We use Newton's method to solve (22), with the initial value of the kernel parameter

$$\theta_0 = \frac{1}{N^{2}}\sum_{p=1}^{N}\sum_{q=1}^{N}\|x_p - x_q\|^{2}; \tag{23}$$

that is, $\theta_0$ is the average squared distance between the samples in the given module. The iteration formula is $\theta_{t+1} = \theta_t - J'(\theta_t)/J''(\theta_t)$, where $J'$ and $J''$ denote the first and second derivatives of $J(\theta)$.
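A minimal sketch of the Newton iteration for (22), with $\theta_0$ set to the average squared pairwise distance as in (23). The analytic first derivative above could be plugged in directly; here both derivatives are approximated by central differences as a simple stand-in, which is an implementation choice and not part of the paper.

```python
import numpy as np

def newton_optimize_theta(J, theta0, tol=1e-4, max_iter=50):
    """Newton iteration theta <- theta - J'(theta) / J''(theta) for a stationary
    point of J; derivatives are estimated by central differences."""
    theta = float(theta0)
    for _ in range(max_iter):
        step = 1e-4 * max(theta, 1.0)
        j_minus, j_0, j_plus = J(theta - step), J(theta), J(theta + step)
        d1 = (j_plus - j_minus) / (2.0 * step)
        d2 = (j_plus - 2.0 * j_0 + j_minus) / step ** 2
        if abs(d2) < 1e-12:
            break
        theta_new = theta - d1 / d2
        if theta_new <= 0:                      # keep the kernel parameter positive
            break
        if abs(theta_new - theta) < tol * theta:
            theta = theta_new
            break
        theta = theta_new
    return theta

# theta_0 from (23): average squared pairwise distance of the module's samples
rng = np.random.default_rng(3)
X = rng.random((12, 20))
sq = np.sum(X ** 2, axis=1)
theta0 = np.mean(sq[:, None] + sq[None, :] - 2.0 * X @ X.T)
# In practice J would be mmc_objective(theta, X, labels, Theta) from Section 3.3;
# a toy concave J is used here so that the example is self-contained.
theta_star = newton_optimize_theta(lambda t: -(t - theta0) ** 2, theta0)
```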

3.5. Identification Using Multiple Kernels

For each of the four modules (forehead, eyes, nose, and lips), the optimization of the Gaussian kernel parameter is run separately and finds the optimal value as described above. Then the four submodules of a test image are fed into the corresponding kernel discriminant classifiers to compute the membership degrees of the image with respect to every subject according to (1). Finally, the image is assigned to the subject with the greatest membership degree.

4. Simulation

To evaluate the performance of our multiple kernel framework for distributed wireless face recognition, we carry out experimental comparisons with KDDA based on a single Gaussian RBF kernel, in terms of recognition accuracy. Images are taken from two face databases, namely, the FERET and the CMU PIE databases.

In our experiments, the four weight values $w_m$ in the fusion criterion (1) are set so that larger weights are given to the nose module and the eyes module.

4.1. Face Image Datasets

From the FERET database [27], we select 72 people, with 6 frontal-view images per individual. Face image variations in these 432 images include illumination, facial expression, wearing glasses, and aging. All the images are aligned by the centers of the eyes and the mouth and then normalized to a resolution of 92 × 112. The pixel values of each image are normalized to the range [0, 1]. The original 92 × 112 images are reduced to wavelet feature faces of resolution 49 × 59 by a 1-level Daubechies-4 (Db4) wavelet decomposition. Images of one individual are shown in Figure 3.
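The 1-level Db4 decomposition that yields the 49 × 59 wavelet feature faces can be reproduced, for example, with PyWavelets; taking the approximation subband as the feature face is our reading of the standard practice rather than something stated explicitly in the paper.

```python
import numpy as np
import pywt

# 1-level Daubechies-4 (db4) 2-D wavelet decomposition of a 92 x 112 face image.
# With PyWavelets' default symmetric padding, the approximation subband cA has
# shape (49, 59), matching the wavelet feature faces described above.
face = np.random.rand(92, 112)            # placeholder for a normalized face image
cA, (cH, cV, cD) = pywt.dwt2(face, 'db4')
feature_face = cA                          # low-frequency "wavelet feature face"
print(feature_face.shape)                  # (49, 59)
```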

Figure 3: Images of one person from the FERET database.

The CMU PIE face database [28] contains 68 people in total; each person has 13 pose variations, ranging from the full right profile to the full left profile, and 43 different lighting conditions (21 flashes with ambient light on or off). In our experiments, for each person we select 56 images, including the 13 poses with neutral expression and the 43 different lighting conditions in the frontal view. For all frontal-view images, we apply alignment based on the two eye centers and the nose center point; no alignment is applied to the other pose images. All the segmented images are rescaled to a resolution of 92 × 112 and then reduced to wavelet feature faces of resolution 49 × 59 by a 1-level Daubechies-4 (Db4) wavelet decomposition. Some images of one person are shown in Figure 4.

Figure 4: Some images of one person from the CMU PIE face database.
4.2. Recognition Results

This section reports the recognition results of the proposed multiple kernel framework and of KDDA with a single Gaussian RBF kernel on the FERET and CMU PIE datasets. For KDDA, the parameter of the Gaussian RBF kernel is optimized via grid search. For each subject in the FERET dataset, we randomly select $n$ out of the 6 images ($n$ up to 5) for training, with the rest for testing. In the CMU PIE dataset, the number of randomly selected training images per individual ranges from 10 to 18 out of 56, while the rest are used for testing. The average recognition accuracies over 10 runs on the FERET and CMU PIE datasets are shown in Figures 5(a) and 5(b), respectively.

Figure 5: Comparison of accuracies obtained by the multiple kernel framework and KDDA with a single kernel.

Table 1 shows the average and standard deviation of the accuracies for FERET ($n = 4$: 4 images per subject for training, with the rest for testing) and CMU PIE ($n = 14$: 14 images per subject for training, with the rest for testing).

Table 1: Performance comparison between the multiple kernel framework and KDDA with a single kernel.

From the results in Figure 5 and Table 1, it can be seen that the proposed multiple kernel framework achieves higher accuracies than KDDA with an optimized single-kernel parameter.

5. Conclusion

In this paper, on the assumption that multiple kernels can characterize the geometrical structures of the original data from multiple views that complement one another to improve recognition performance, we integrate multiple kernel discriminant analysis with face recognition in wireless sensor networks and propose a module-based distributed wireless face recognition system. For each module, we separately perform an iterative scheme based on Newton's method for kernel parameter optimization, maximizing the maximum margin criterion. The multiple kernel framework and the optimization procedure yield high recognition accuracy on the FERET and CMU PIE face databases compared with a single kernel.

Conflict of Interests

The authors declare that they have no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is partially supported by the Basic and Frontier Technology Research Project of Henan Province in China under Grant no. 122300410321 and the Science and Technology Development Project of Henan Province under Grant no. 132102210186.

References

1. W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, “Face recognition: a literature survey,” ACM Computing Surveys, vol. 35, no. 4, pp. 399–458, 2003.
2. X. Zhang and Y. Gao, “Face recognition across pose: a review,” Pattern Recognition, vol. 42, no. 11, pp. 2876–2896, 2009.
3. Q.-M. Lin, J.-W. Yang, N. Ye, R.-C. Wang, and B. Zhang, “Face recognition in mobile wireless sensor networks,” International Journal of Distributed Sensor Networks, vol. 2013, Article ID 890737, 7 pages, 2013.
4. I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “Wireless sensor networks: a survey,” Computer Networks, vol. 38, no. 4, pp. 393–422, 2002.
5. I. Kim, J. Shim, J. Schlessman, and W. Wolf, “Remote wireless face recognition employing ZigBee,” in Proceedings of the ACM SenSys Workshop on Distributed Smart Cameras (DSC '06), Boulder, Colo, USA, 2006.
6. R. Muraleedharan, Y. Yan, and L. A. Osadciw, “Increased efficiency of face recognition system using wireless sensor network,” Systemics, Cybernetics and Informatics, vol. 4, no. 1, pp. 38–46, 2006.
7. C. C. Chang and H. Aghajan, “Collaborative face orientation detection in wireless image sensor networks,” in Proceedings of the ACM SenSys Workshop on Distributed Smart Cameras (DSC '06), Boulder, Colo, USA, 2006.
8. N. Zaeri, F. Mokhtarian, and A. Cherri, “Efficient face recognition for wireless surveillance systems,” in Proceedings of the 9th IASTED International Conference on Computer Graphics and Imaging (CGIM '07), E. Gobbetti, Ed., pp. 132–137, ACTA Press, Innsbruck, Austria, February 2007.
9. M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71–86, 1991.
10. P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces versus fisherfaces: recognition using class specific linear projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720, 1997.
11. A. Ruiz and P. E. López-de-Teruel, “Nonlinear kernel-based statistical pattern analysis,” IEEE Transactions on Neural Networks, vol. 12, no. 1, pp. 16–32, 2001.
12. V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, NY, USA, 1995.
13. B. Schölkopf, A. Smola, and K.-R. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Tech. Rep. 44, MPI für Biologische Kybernetik, Tübingen, Germany, 1996.
14. S. Mika, G. Rätsch, J. Weston, B. Schölkopf, and K.-R. Müller, “Fisher discriminant analysis with kernels,” in Proceedings of the IEEE Signal Processing Society Workshop on Neural Networks for Signal Processing IX, pp. 41–48, Madison, Wis, USA, 1999.
15. G. Baudat and F. Anouar, “Generalized discriminant analysis using a kernel approach,” Neural Computation, vol. 12, no. 10, pp. 2385–2404, 2000.
16. J. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, “Face recognition using kernel direct discriminant analysis algorithms,” IEEE Transactions on Neural Networks, vol. 14, no. 1, pp. 117–126, 2003.
17. O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, “Choosing multiple parameters for support vector machines,” Machine Learning, vol. 46, no. 1–3, pp. 131–159, 2002.
18. S. Sonnenburg, G. Rätsch, and C. Schäfer, “A general and efficient multiple kernel learning algorithm,” in Proceedings of Neural Information Processing Systems, 2005.
19. Z. Wang, S. Chen, and T. Sun, “MultiK-MHKS: a novel multiple kernel learning algorithm,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 348–353, 2008.
20. J. Bi, T. Zhang, and K. P. Bennett, “Column-generation boosting methods for mixture of kernels,” in Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04), pp. 521–526, New York, NY, USA, August 2004.
21. G. R. G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M. I. Jordan, “Learning the kernel matrix with semidefinite programming,” Journal of Machine Learning Research, vol. 5, pp. 27–72, 2004.
22. F. R. Bach, G. R. G. Lanckriet, and M. I. Jordan, “Multiple kernel learning, conic duality, and the SMO algorithm,” in Proceedings of the 21st International Conference on Machine Learning (ICML '04), pp. 41–48, New York, NY, USA, July 2004.
23. K. P. Bennett, M. Momma, and M. J. Embrechts, “MARK: a boosting algorithm for heterogeneous kernel models,” in Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '02), pp. 24–31, New York, NY, USA, July 2002.
24. X.-Z. Liu and G.-C. Feng, “Multiple kernel learning in Fisher discriminant analysis for face recognition,” International Journal of Advanced Robotic Systems, vol. 10, article 142, 2013.
25. J. Mercer, Functions of Positive and Negative Type and Their Connection with the Theory of Integral Equations, vol. 209 of Philosophical Transactions of the Royal Society of London, Series A, The Royal Society, 1909.
26. H. Li, T. Jiang, and K. Zhang, “Efficient and robust feature extraction by maximum margin criterion,” in Advances in Neural Information Processing Systems 16, S. Thrun, L. Saul, and B. Schölkopf, Eds., pp. 157–1165, The MIT Press, Cambridge, Mass, USA, 2004.
27. P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evaluation methodology for face-recognition algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000.
28. T. Sim, S. Baker, and M. Bsat, “The CMU pose, illumination, and expression (PIE) database,” in Proceedings of the 5th IEEE International Conference on Automatic Face and Gesture Recognition, Washington, DC, USA, May 2002.