Abstract

Face recognition has received great attention from researchers in computer vision, pattern recognition, and human-machine interaction in recent years. Designing a face recognition system is a complex task because of the wide variability in illumination, pose, and facial expression. Many approaches have been developed to find a feature space in which face descriptors are well distinguished and separated. Face representation using Gabor features and the discrete wavelet transform has attracted considerable attention in computer vision and image processing. In this paper we describe a face recognition system based on artificial neural networks, namely the multilayer perceptron (MLP) and the radial basis function (RBF) network, in which Gabor and discrete wavelet based feature extraction methods are used to extract features from facial images of two databases: the ORL and computer vision databases. A good recognition rate was obtained with Gabor and DWT parameterization combined with the MLP classifier applied to the computer vision dataset.

1. Introduction

Face recognition has received great attention from scientists in computer vision, pattern recognition, and human-machine interaction in recent years.

The face is the most common biometric identifier used by humans; this domain is motivated by the increased interest in commercial applications of automatic face recognition (AFR) as well as by the emergence of real-time processors. Automatic face recognition is also one of the most visible and challenging research topics in computer vision, machine learning, and biometrics [1].

Face recognition has become important because of its potential applications and its theoretical challenges. Today, face recognition technology is used to combat passport fraud, support law enforcement, identify missing children, and detect identity fraud. Face recognition still faces many challenges, among them (1) illumination variations, (2) occlusions, and (3) differences in expression and pose [2].

There are many approaches to evaluating face images. The methods used in face recognition can be broadly classified into image feature based and geometry feature based methods. In the geometry feature based approach, recognition relies on the relationships between facial features such as the eyes, mouth, nose, and face boundary, whereas the subspace analysis approach attempts to capture and describe the face as a whole. The subspace method is the most popular technique for face recognition; the face is treated as a two-dimensional pattern of intensity variation [3].

Although facial images have a high dimensionality, they usually lie on a lower dimensional subspace or submanifold. Therefore, subspace learning and manifold learning methods have been dominantly and successfully used in appearance based FR. The classical Eigenface and Fisherface algorithms consider only the global scatter of training samples and they fail to reveal the essential data structures nonlinearly embedded in high-dimensional space [1].

Face representation using Gabor features has attracted considerable attention in computer vision, image processing, and pattern recognition. The principal motivation for using Gabor filters is their biological relevance: the receptive field profiles of neurons in the primary visual cortex of mammals are oriented and have characteristic spatial frequencies.

Gabor filters can exploit salient visual properties such as spatial localization, orientation selectivity, and spatial frequency characteristics [4].

The Gabor filter was first introduced by Dennis Gabor in 1946 and was later shown to model simple cell receptive fields. Since Gabor features are extracted in local regions, they are less sensitive to variations in illumination, expression, and pose than holistic features such as Eigenface and Randomface [1].

Face recognition using Gabor filters was first introduced by Yang and Zhang and soon proved to be a very effective means of extracting human facial features [1].

In this work we develop a recognition system using Gabor filters and the discrete wavelet transform for feature extraction. Multilayer perceptron (MLP) and radial basis function (RBF) neural network classifiers are used to identify individuals from two face databases: the ORL and computer vision databases. Figure 1 describes our system.

We then investigate the best way to characterize the facial images; efficiency is evaluated by computing the recognition rate (RR%) for each parameterization and classifier.

The paper is organized as follows: Section 2 presents face characterization using Gabor filters and the discrete wavelet transform (DWT), Section 3 presents the simulation results, Section 4 discusses them and compares with other approaches, and the conclusion is drawn in Section 5.

2. Feature Face Extraction and Face Parameterization

The most important step in a face recognition system is feature extraction. It reduces dimensionality and characterizes the face image by a descriptor vector that minimizes intraperson variations and maximizes interperson differences. This task is made difficult by variations in pose, age, expression, illumination, and so forth.

In our work, we use Gabor filters and the discrete wavelet transform to describe the face and build a face recognition system using MLP and RBF neural network classifiers. A comparison is made for each classifier and facial database to decide on the optimal characterization. We first describe the Gabor function and the discrete wavelet transform used to characterize our facial images.

2.1. Gabor Filter Representation
2.1.1. Gabor Wavelet: A Mathematical Overview

The Gabor filter, defined by Dennis Gabor, is widely used in image analysis and pattern recognition. Gabor filters present two interesting properties: frequency localization and selectivity in orientation.

Many studies have demonstrated that Gabor wavelet representations for image analysis are robust to variations in illumination and facial expression.

The 2D Gabor filter was introduced into biometric research by Daugman for iris recognition. Lades et al. used Gabor filters for face recognition in their "Dynamic Link Architecture."

The Gabor wavelet, which captures the properties of orientation selectivity and spatial localization and is optimally localized in the space and frequency domains, has been extensively and successfully used in face recognition [5]. Daugman pioneered the use of the 2D Gabor wavelet representation in computer vision in the 1980s [6, 7].

The frequency and orientation characteristics of Gabor wavelets (filters) are quite similar to those of the human visual system, and they have been found appropriate for texture representation and discrimination. Gabor wavelet based extraction of features directly from gray-level images has been successful and widely applied to texture segmentation and fingerprint recognition. The Gabor filters commonly used in face recognition [3, 4] are defined as follows [6]:

$$\psi_{\mu,\nu}(z) = \frac{\|k_{\mu,\nu}\|^{2}}{\sigma^{2}} \exp\!\left(-\frac{\|k_{\mu,\nu}\|^{2}\,\|z\|^{2}}{2\sigma^{2}}\right)\left[\exp\!\left(i\,k_{\mu,\nu}\cdot z\right)-\exp\!\left(-\frac{\sigma^{2}}{2}\right)\right]. \tag{1}$$

$z = (x, y)$ is the point of coordinates $(x, y)$ in image space, $k_{\mu,\nu}$ defines the orientation and Gabor frequency, and $\sigma$ is the standard deviation of the Gaussian.

$\mu$ and $\nu$ define the orientation and the scale of the Gabor filters, and the wave vector $k_{\mu,\nu}$ is defined as the following form:

$$k_{\mu,\nu} = k_{\nu}\, e^{i\phi_{\mu}}, \qquad k_{\nu} = \frac{k_{\max}}{f^{\nu}}, \qquad \phi_{\mu} = \frac{\pi\mu}{8}. \tag{2}$$

$k_{\max}$ is the maximum frequency, and $f$ is the spacing factor between kernels in the frequency domain. Usually $k_{\max} = \pi/2$, $f = \sqrt{2}$, and $\sigma = 2\pi$.

The Gabor wavelet representation of a face image is obtained by convolving the image with a family of Gabor filters, as described by (3). The convolution of an image $I(z)$ and a Gabor filter $\psi_{\mu,\nu}(z)$ can be defined as follows [6]:

$$G_{\mu,\nu}(z) = I(z) * \psi_{\mu,\nu}(z), \tag{3}$$

where $z = (x, y)$, $*$ denotes the convolution operator, and $G_{\mu,\nu}(z)$ is the Gabor filter response of the image with orientation $\mu$ and scale $\nu$ [6]. Figure 2 illustrates this principle.
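As an illustration of (1)-(3), the following Python sketch builds the kernel bank with the usual parameter choices ($k_{\max} = \pi/2$, $f = \sqrt{2}$, $\sigma = 2\pi$, 5 scales and 8 orientations) and convolves it with an image. It is a minimal re-implementation for clarity, not the code used in our experiments; the kernel size of 31 pixels is an arbitrary choice.

```python
# Illustrative sketch (not the authors' code): Gabor kernel bank of (1)-(2)
# convolved with a face image as in (3). Uses NumPy/SciPy only.
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(mu, nu, k_max=np.pi / 2, f=np.sqrt(2), sigma=2 * np.pi, size=31):
    """Gabor kernel psi_{mu,nu} of (1) with wave vector k_{mu,nu} of (2)."""
    k = (k_max / f**nu) * np.exp(1j * np.pi * mu / 8)          # k_{mu,nu}
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    z2 = x**2 + y**2                                           # ||z||^2
    k2 = np.abs(k) ** 2                                        # ||k_{mu,nu}||^2
    envelope = (k2 / sigma**2) * np.exp(-k2 * z2 / (2 * sigma**2))
    carrier = np.exp(1j * (k.real * x + k.imag * y)) - np.exp(-sigma**2 / 2)
    return envelope * carrier

def gabor_responses(image, scales=5, orientations=8):
    """Convolution (3): one complex response G_{mu,nu} per kernel (40 in total)."""
    return [fftconvolve(image, gabor_kernel(mu, nu), mode='same')
            for nu in range(scales) for mu in range(orientations)]
```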

The Gabor kernels in (1) are all self-similar since they can be generated from the same filter, the mother wavelet, by scaling and rotation via the wave vector $k_{\mu,\nu}$.

Each kernel is the product of a Gaussian envelope and a complex plane wave and can be separated into real and imaginary parts. The resulting representations display scale, locality, and orientation properties corresponding to those of the Gabor wavelets [6]. A bank of Gabor filters is thus generated by a set of scales and rotations; the design of the Gabor filters is detailed later.

The Gabor wavelet transform of an image $I$ is then the convolution of $I$ with a set of Gabor filters of different frequencies and orientations. The convolution of the image with the Gabor filter bank is given by

$$G_{\mu,\nu}(x, y) = I(x, y) * \psi_{\mu,\nu}(x, y). \tag{4}$$

Figure 3 presents an example of facial representation in Gabor wavelet using 40 Gabor filters.

2.1.2. Design of Gabor Filter

A Gabor filter works as a band-pass filter for the local spatial frequency distribution, achieving an optimal resolution in both the spatial and frequency domains. The 2D Gabor filter can be represented as a complex sinusoidal signal modulated by a Gaussian kernel function as follows [4, 8]:

$$\psi_{f,\theta_k}(x, y) = \exp\!\left[-\frac{1}{2}\left(\frac{x_{\theta_k}^{2}}{\sigma_x^{2}} + \frac{y_{\theta_k}^{2}}{\sigma_y^{2}}\right)\right]\exp\!\left(2\pi i f x_{\theta_k}\right), \tag{5}$$

where

$$\begin{pmatrix} x_{\theta_k} \\ y_{\theta_k} \end{pmatrix} = \begin{pmatrix} \cos\theta_k & \sin\theta_k \\ -\sin\theta_k & \cos\theta_k \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}. \tag{6}$$

$\sigma_x$ and $\sigma_y$ are the standard deviations of the Gaussian envelope along the $x$- and $y$-dimensions, $f$ is the central frequency of the sinusoidal plane wave, and $\theta_k$ is the orientation. The rotation of the $x$-$y$ plane by an angle $\theta_k$ results in a Gabor filter at the orientation $\theta_k$. The angle $\theta_k$ is defined by

$$\theta_k = \frac{\pi}{n}(k - 1), \tag{7}$$

for $k = 1, 2, \ldots, n$ and $n \in \mathbb{N}$, where $n$ denotes the number of orientations.

Designing Gabor filters amounts to tuning each filter to a specific band of spatial frequency and orientation by appropriately selecting the filter parameters: the spread of the filter $\sigma_x$, $\sigma_y$, the radial frequency $f$, and the orientation of the filter $\theta_k$ [4].

The important issue in the design of Gabor filters for face recognition is the choice of filter parameters.

The Gabor representation of a face image is computed by convolving the face image with the Gabor filters. Let $I(x, y)$ be the intensity at the coordinate $(x, y)$ in a grayscale face image; its convolution with a Gabor filter $\psi_{f,\theta}(x, y)$ is defined as [4]

$$G_{f,\theta}(x, y) = I(x, y) * \psi_{f,\theta}(x, y), \tag{8}$$

where $*$ denotes the convolution operator. The response to each Gabor kernel is a complex function with a real part $\operatorname{Re}\!\left(G_{f,\theta}(x, y)\right)$ and an imaginary part $\operatorname{Im}\!\left(G_{f,\theta}(x, y)\right)$. The magnitude response is expressed as

$$\left\| G_{f,\theta}(x, y) \right\| = \sqrt{\operatorname{Re}^{2}\!\left(G_{f,\theta}(x, y)\right) + \operatorname{Im}^{2}\!\left(G_{f,\theta}(x, y)\right)}. \tag{9}$$

We use the magnitude response to represent our facial images.
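For instance, the magnitude of (9) can be obtained from the real and imaginary responses of a single Gabor filter; the short sketch below uses scikit-image's gabor function as a stand-in (an assumption: our experiments used our own filter bank, and the frequency and orientation values here are arbitrary).

```python
# Minimal sketch of (8)-(9) using scikit-image's built-in Gabor filter.
import numpy as np
from skimage import data
from skimage.filters import gabor

image = data.camera().astype(float)                         # stand-in for a grayscale face image
real, imag = gabor(image, frequency=0.25, theta=np.pi / 4)  # Re and Im parts of (8)
magnitude = np.sqrt(real**2 + imag**2)                      # magnitude response of (9)
```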

The Gabor filter is basically a Gaussian modulated by a complex sinusoid described by (5) and (6).

All the filters can be produced by rotation and dilation of a mother wavelet. Most face recognition approaches based on Gabor filters use only the magnitude of the response, or a fusion of the magnitude and the real part, because the phase varies strongly with local patterns and can be considered irrelevant.

The response of the Gabor filter depends on the following parameters:
(a) the orientation $\theta$: it describes the orientation of the wavelet and characterizes the angles of contours and lines in the image;
(b) the frequency $f$: it specifies the wavelength of the function; wavelets with a large wavelength are sensitive to progressive illumination changes in the image, whereas small wavelengths are sensitive to contours;
(c) the standard deviation $\sigma$: this parameter defines the radius of the Gaussian; the size of the Gaussian determines the number of image pixels taken into account in the convolution.

The set of Gabor filters is characterized by a certain number of resolutions, orientations, and frequencies known as “characteristics.”

Figure 4 shows the response of the Gabor filter when the different parameters are varied. After extensive experiments, a particular choice of orientations, frequencies, and standard deviation proved relevant for face parameterization and is used to build the face recognition system described in the following sections.

Gabor filters are widely used in texture characterization and contour detection; their responses can be used to differentiate one individual from another, and they also capture the characteristic points of the human face.

A simple Gabor filter can be expressed as the product of a Gaussian and a complex exponential. The resulting filtered image IG is illustrated, for a given choice of orientation and standard deviations, in Figure 5.

We use only the magnitude information, without the phase; the modulus of the filtered image carries local information about the details of the image (frequency decomposition) and about orientation (directional information). Magnitude analysis allows us to detect local structures of interest. The filtered image highlights the characteristic features of a person, such as the lips, eyes, and so forth.

Face recognition using Gabor filters was first introduced by Lades et al. [9] and soon proved to be a very effective means of extracting human facial features. Wang et al. [10] proposed a face recognition algorithm combining feature vectors consisting of the Gabor magnitude with PCA and an SVM classifier [6].

The Gabor wavelet is a continuous filter, making the filter process computationally demanding. A wavelet is a waveform of limited duration that has an average value of zero. The limited duration of the wavelet enables preservation of spatial information in the signal [11].

The following paragraph presents an overview of discrete wavelet analysis followed by the application for face parameterization.

2.2. Wavelet Analysis

The discrete wavelet transform is analogous to the Fourier transform with the exception that the DWT uses scaled and shifted versions of a wavelet. It decomposes a signal into a sum of shifted and scaled wavelets. The DWT kernels are very similar to Gabor kernels and exhibit properties of horizontal, vertical, and diagonal directionality. Also, the DWT possesses the additional advantages of sparse representation and nonredundant computation, which make it advantageous to employ the DWT over the Gabor transform for facial feature extraction [11]. DWT has been successfully used for denoising, compression, and feature detection applications in image processing to provide faster performance times and achieve results comparable to those employing Gabor wavelets [11].

Following [11], we present a brief review of the continuous and discrete wavelet transforms relevant to our analysis; the 2D discrete wavelet decomposition is then described and applied to face recognition.

The continuous wavelet transform (CWT) between a signal $f(t)$ and a wavelet $\psi$ is mathematically defined as [11]

$$C(S, P) = \frac{1}{\sqrt{S}}\int_{-\infty}^{+\infty} f(t)\, \psi\!\left(\frac{t - P}{S}\right) dt, \tag{10}$$

where $S$ is the scale, $t$ is the time, and $P$ is the shift.

The DWT is obtained by restricting the scale $S$ to powers of 2 and the position $P$ to integer multiples of the scales, and is given by

$$C_{j,k} = \int_{-\infty}^{+\infty} f(t)\, \psi_{j,k}(t)\, dt, \tag{11}$$

where $j$ and $k$ are integers and $\psi_{j,k}$ are orthogonal baby wavelets, defined as

$$\psi_{j,k}(t) = 2^{-j/2}\, \psi\!\left(2^{-j} t - k\right). \tag{12}$$

Substituting (12) into (11) yields

$$C_{j,k} = 2^{-j/2}\int_{-\infty}^{+\infty} f(t)\, \psi\!\left(2^{-j} t - k\right) dt. \tag{13}$$

The restricted choice of scale and position results in a subsample of coefficients. Baby wavelets have an associated baby scaling function given by

$$\phi_{j,k}(t) = 2^{-j/2}\, \phi\!\left(2^{-j} t - k\right). \tag{14}$$

The scaling function, or dilation equation, can be expressed in terms of the low-pass filter coefficients $h_0(n)$ as

$$\phi(t) = \sum_{n} h_0(n)\, \sqrt{2}\, \phi(2t - n). \tag{15}$$

In addition, the wavelet function itself can be expressed in terms of the high-pass filter coefficients $h_1(n)$ as

$$\psi(t) = \sum_{n} h_1(n)\, \sqrt{2}\, \phi(2t - n). \tag{16}$$

Also, a signal $f(t)$ can be represented by the scaling and wavelet functions:

$$f(t) = \sum_{k} c_{a_1}(k)\, \phi_{1,k}(t) + \sum_{k} c_{d_1}(k)\, \psi_{1,k}(t). \tag{17}$$

Here, $c_{a_1}(k)$ are the approximation coefficients at level 1 and $c_{d_1}(k)$ are the detail coefficients at level 1. Equation (17) represents the analysis of a signal, which can be repeated numerous times to decompose the signal into lower subspaces as shown in Figure 6 [11].

The detail and approximation coefficients are found using the filter coefficients $h_0(n)$ and $h_1(n)$:

$$c_{a_1}(k) = \sum_{n} h_0(n - 2k)\, c_{a_0}(n), \qquad c_{d_1}(k) = \sum_{n} h_1(n - 2k)\, c_{a_0}(n). \tag{18}$$

Equation (18) shows that the level 1 approximation coefficients $c_{a_1}$ can be found by convolving $c_{a_0}$ with the low-pass filter $h_0(n)$ and downsampling by 2 (indicated by ↓2), which simply means disregarding every second output coefficient. Similarly, the level 1 detail coefficients $c_{d_1}$ can be found by convolving $c_{a_0}$ with the high-pass filter $h_1(n)$ and downsampling by 2. Therefore, the DWT can be performed using the filter coefficients $h_0(n)$ and $h_1(n)$. The process is applied repeatedly to produce higher levels of approximation and detail coefficients, as shown in Figure 6. Each time the filter outputs are downsampled, the number of samples is halved. The original signal $f(t)$ is assumed to start in the subspace $V_0$, so the level 0 approximation coefficients are the discrete values of the signal [11].
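A minimal sketch of this one-dimensional analysis step, using the PyWavelets library as a stand-in for the toolbox used in our experiments; it simply exposes the filter pair $h_0$, $h_1$ and the level 1 approximation and detail coefficients of (18).

```python
# Short sketch of the 1-D analysis step described above, using PyWavelets.
import numpy as np
import pywt

x = np.random.rand(256)                 # a 1-D signal standing in for the level 0 coefficients
w = pywt.Wavelet('haar')
print(w.dec_lo, w.dec_hi)               # low-pass h0 and high-pass h1 of (15)-(16)

cA1, cD1 = pywt.dwt(x, w)               # level 1 approximation and detail, cf. (18)
print(len(cA1), len(cD1))               # each half the length of x (downsampling by 2)
```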

2.3. Discrete Wavelet Transform (DWT) Applied for Face Recognition

Discrete wavelet transform (DWT) is a suitable tool for extracting image features because it allows the analysis of images on various levels of resolution. Typically, low-pass and high-pass filters are used for decomposing the original image.

The low-pass filter results in an approximation image and the high-pass filter generates a detail image. The approximation image can be further split into a deeper level of approximation and detail according to different applications [12].

Suppose that the size of the input image is $N \times M$. After the first filtering in the horizontal direction and downsampling, the size of the image is reduced to $N \times M/2$. After further filtering and downsampling in the vertical direction, four subimages are obtained, each of size $N/2 \times M/2$. The outputs of these filters are given by [12]

$$a_{j+1}(x, y) = \sum_{m}\sum_{n} l(m)\, l(n)\, a_{j}(2x - m,\, 2y - n),$$
$$d^{1}_{j+1}(x, y) = \sum_{m}\sum_{n} l(m)\, h(n)\, a_{j}(2x - m,\, 2y - n),$$
$$d^{2}_{j+1}(x, y) = \sum_{m}\sum_{n} h(m)\, l(n)\, a_{j}(2x - m,\, 2y - n),$$
$$d^{3}_{j+1}(x, y) = \sum_{m}\sum_{n} h(m)\, h(n)\, a_{j}(2x - m,\, 2y - n), \tag{19}$$

where $l(\cdot)$ and $h(\cdot)$ are the coefficients of the low-pass and high-pass filters, respectively.

Accordingly, we obtain four images denoted LL, HL, LH, and HH. The LL image is generated by two successive low-pass filters; HL is filtered by a high-pass filter first and a low-pass filter later; LH is created using a low-pass filter followed by a high-pass filter; HH is generated by two successive high-pass filters [12].

Figure 7 presents the first level of decomposition applied to a facial image, where LL describes the approximation (ap in Figure 7), LH and HL describe the horizontal and vertical details, and HH describes the diagonal detail.

After first level wavelet decomposition, the output images become input images of second level decomposition.

Every subimage can be decomposed further into smaller images by repeating the above procedure. The main feature of the DWT is the multiscale representation of a function: by using wavelets, a given image can be analyzed at various levels of resolution. Since the LL part contains the most important information and discards the effect of noise and irrelevant parts, we extract features from the LL part of the first level decomposition. The LL part keeps the necessary information while the dimensionality of the image is reduced sufficiently for computation at the next stage [12].
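A short sketch of this feature extraction step, again with PyWavelets (an assumption; image loading is omitted): the level 1 LL subband is kept and flattened into the face descriptor.

```python
# Sketch of the feature extraction described above: keep only the LL subband
# of a first-level 2-D DWT as the face descriptor.
import numpy as np
import pywt

def dwt_ll_feature(face, wavelet='haar'):
    """Return the level-1 approximation (LL) subband flattened into a vector."""
    LL, (LH, HL, HH) = pywt.dwt2(face, wavelet)   # approximation + horizontal, vertical, diagonal details
    return LL.ravel()

face = np.random.rand(112, 92)                    # stand-in for an ORL-sized image
print(dwt_ll_feature(face).shape)                 # (2576,) for the Haar wavelet
```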

One major advantage afforded by wavelets is the ability to perform local analysis, that is, to analyze a localized area of a larger signal. In wavelet analysis, we often speak about approximations and details. The approximations are the high-scale, low-frequency components of the signal. The details are the low-scale, high-frequency components.

Many tests have been carried out; we adopt the Haar, db2, db4, db8, and bior2.2 wavelets with first level decomposition. A DWT feature vector is assigned to each individual and to the whole database of facial images. The choice of the wavelet function is a very important task in feature extraction, and the effectiveness of the recognition system depends on this selection.

Figure 8 shows the first level decomposition using the Haar wavelet; a facial image from the computer vision database is reduced to a quarter of its original size with the 1st level decomposition, and to a sixteenth with the 2nd level decomposition.

The graph in Figure 9 describes the distribution of the resulting images (approximation and details) for the first level decomposition using the Haar wavelet. It is clear that most of the energy of the approximation is concentrated in the region shown in red, whereas the horizontal and vertical details carry little energy and appear in blue.

Our experiments show that the first level decomposition alone is sufficient to describe the face image correctly and to build a face recognition system with a good recognition rate.

The DWT is able to extract the horizontal and vertical details of a facial image. However, unlike Gabor wavelets, which can be oriented to any angle, the 2D DWT is restricted to horizontal, diagonal, and vertical directionality (0°, 45°, and 90°). Since the features of a face in a 2D grayscale image are a combination of infinitely many oriented lines, the ability to filter at several orientations is necessary to achieve acceptable face recognition performance [11].

The DWT analysis applied to face recognition has several advantages over Gabor analysis: the first is the dimensionality reduction, which leads to lower computational complexity; the second is the decomposition of the image in the frequency domain into subbands at different scales.

In this paper we examine the effects of different feature extraction methods, namely DWT and Gabor, on the face recognition system and compare them with other approaches, such as DCT feature extraction with PCA or LDA spatial feature reduction, realized and discussed in a previous work [13].

3. Simulation Results

3.1. Face Databases

Our experiments were performed on two face databases: the computer vision and ORL databases.

First, we used the computer vision dataset; it contains frontal images of 395 individuals, with 20 frontal images per person. This dataset contains images of people of various racial origins, mainly first year undergraduate students, so the majority of individuals are between 18 and 20 years old, but some older individuals are also present. The images of a given person differ from each other in lighting and facial expression. The images are 256-colour level; some individuals wear glasses or beards. The total number of images is 7900. Samples of the database are shown in Figure 10.

Wavelets have received great attention in face recognition; the wavelet transform has emerged as a cutting-edge technology. The discrete wavelet transform (DWT) is a linear, invertible, and fast orthogonal operation. Wavelets are a good mathematical tool for hierarchically decomposing functions. Image compression minimizes the size in bytes of a graphics file without degrading the quality of the image to an unacceptable level; the reduction in file size permits more images to be stored in a given amount of disk or memory space [3].

We also used the ORL database to analyze the performance of our classifier; this database contains a set of face images taken at the Olivetti Research Laboratory (ORL) in Cambridge, UK. We use 200 images of 20 individuals. For some subjects, the images were taken at different times, with quite a high degree of variability in lighting, facial expression (open/closed eyes, smiling/nonsmiling, etc.), pose (upright, frontal position, etc.), and facial details (glasses/no glasses). All images are 8-bit grayscale with a size of 112 × 92 pixels. Samples of the ORL database are shown in Figure 11.

3.2. Face Characterization

Our face recognition system is first tested on the computer vision database. We use 200 images of 10 subjects. In the following experiments, ten images per subject are randomly selected as training samples and another ten as test images. Therefore, a total of 100 images (10 per individual) are used for training and another 100 for testing, with no overlap between the training and test sets.

For ORL database, we select randomly 100 samples (5 for each individual) for training. The remaining 100 samples are used as the test set.

In our face recognition experiments, we extract features from the facial images with different wavelet transforms (Haar, Daubechies, and Biorthogonal) by decomposing each face image into its LL subband at the first level. The wavelets used are Haar, db2, db4, db8, and bior2.2. In addition, we apply the Gabor wavelet to characterize the facial images. The feature vectors extracted by each method are fed into MLP and RBF neural network classifiers to assess the efficiency of our face recognition system.

Before applying our face recognition system based on MLP and RBF neural networks, we applied a correlation criterion between training and test features, first for Haar DWT and then for Gabor features, as a preliminary analysis.

Figure 12 shows the variation of the correlation for some individuals over the 10 test images from the computer vision dataset. A threshold of 0.8 was chosen for the correlation coefficient to decide on a correct classification. We notice a good separation (correlation coefficient greater than the threshold) for the first, second, and fifth individuals and a bad separation for the third and sixth individuals. The discrimination is clear for some individuals, but the overall results were not very satisfactory; a recognition rate of 72.5% was obtained.
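The correlation test can be sketched as follows (illustrative Python; the function and variable names are ours): a test feature vector is assigned to the training vector with the highest correlation, provided the coefficient exceeds the 0.8 threshold.

```python
# Sketch of the preliminary correlation criterion described above.
import numpy as np

def correlation_match(test_vec, train_vecs, threshold=0.8):
    """Return the index of the best-matching training vector, or -1 if below threshold."""
    scores = [np.corrcoef(test_vec, tv)[0, 1] for tv in train_vecs]
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else -1
```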

We computed the correlation between training and test feature vectors on the computer vision dataset using ten classes and features from Gabor and Haar DWT. Figure 13 shows the recognition rate obtained for each class; a global recognition rate of 72 to 73% was obtained. We noticed bad classification for the third, sixth, and eighth classes.

In addition, varying the number of training images from 5 to 10 for the computer vision dataset shows that a better recognition rate is obtained with 10 facial images per class in the training and testing phases. Figure 14 shows the correct classification for each class; a global recognition rate of 72% was obtained with 10 facial images, whereas only 67% was obtained with 5 training images per class.

However, the correct classification rate depends on the number of samples in each class; it improves with a larger number of training images per individual.

We conclude from this analysis that the choice of 10 images for training and testing characterizes the whole database more efficiently and yields an appreciable correct classification rate using the correlation criterion alone.

We investigate now the use of a neural network classifier to build a face recognition system based on discrete wavelet and Gabor characterization. The following paragraph describes a brief theory on the classifiers used in our work and discusses the simulation results obtained.

3.3. Face Recognition System Using Artificial Neural Network

The neural networks used are feedforward and trained with the backpropagation algorithm; we employ fully connected feedforward networks. While the inputs are fed forward through the ANN, the "back" in backpropagation refers to the direction in which the error is propagated [16].

Face recognition is achieved with a multilayer perceptron trained by backpropagation and with a radial basis function classifier. For the MLP classifier, the first layer receives the Gabor features or the DWT coefficients of the first level decomposition. The number of nodes in this layer is equal to the dimension of the feature vector (DWT or Gabor features). The number of output nodes is equal to the number of registered individuals. Finally, the number of hidden nodes is chosen by the user.

The feature matrices are associated with a matrix of variables called targets; the target matrix has one column per sample of the training matrix. The network is trained to output a 1 in the correct position of the output vector and to fill the rest of the output vector with 0's.
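A rough Python analogue of this set-up is sketched below, with scikit-learn standing in for the neural network toolbox used in our experiments (an assumption: scikit-learn builds the 1-of-N target coding internally and does not use scaled conjugate gradient); the one-hot target matrix T is shown explicitly to mirror the description above, and the data here are random placeholders.

```python
# Rough analogue of the MLP set-up described above (illustrative only).
import numpy as np
from sklearn.neural_network import MLPClassifier

n_classes = 10
X_train = np.random.rand(100, 2576)            # one DWT/Gabor feature vector per row
y_train = np.repeat(np.arange(n_classes), 10)  # class label of each training image

# Target matrix as described above: a 1 in the correct position, 0 elsewhere.
T = np.eye(n_classes)[y_train]

# scikit-learn takes the integer labels directly and builds such targets internally.
mlp = MLPClassifier(hidden_layer_sizes=(60,), activation='logistic', max_iter=2000)
mlp.fit(X_train, y_train)
```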

The trained MLP NN presents fast convergence, and the training process terminated within 4000 to 6000 epochs, depending on the facial database and the size of the feature descriptors, with the summed squared error (SSE) reaching the prespecified goal ($10^{-3}$).

We used log-sigmoid and tan-sigmoid transfer functions at all neurons (in the hidden layer and the output layer); the log-sigmoid function proved best suited to our face recognition system using Gabor filters and wavelet decomposition.

In order to show the importance of the number of processing elements, we trained our MLP classifier with the number of hidden units varying from 10 to 100. For a small number of neurons (10 to 20) in the hidden layer, we observed a large mean squared error (MSE) and thus low accuracy: the MLP generalizes poorly. Beyond about 60 to 70 neurons, the MSE returned to the level of a system with only 10 hidden neurons; with too many neurons, the poor performance is a direct effect of overfitting: the system overfits the training data and does not perform well on novel patterns.

In addition, we used an RBF classifier to decide on the correct classification. An RBF neural network, shown in Figure 15, can be considered as a mapping $\mathbb{R}^{r} \rightarrow \mathbb{R}^{s}$.

Let $P \in \mathbb{R}^{r}$ be the input vector and let $C_i \in \mathbb{R}^{r}$ ($1 \le i \le u$) be the prototypes of the input vectors [16]. The output of each RBF unit is as follows:

$$R_i(P) = R_i\!\left(\left\| P - C_i \right\|\right), \qquad i = 1, \ldots, u, \tag{20}$$

where $\|\cdot\|$ indicates the Euclidean norm on the input space. Usually, the Gaussian function (Figure 15) is preferred among all possible radial basis functions due to the fact that it is factorizable:

$$R_i(P) = \exp\!\left(-\frac{\left\| P - C_i \right\|^{2}}{\sigma_i^{2}}\right), \tag{21}$$

where $\sigma_i$ is the width of the $i$th RBF unit. The $j$th output $y_j(P)$ of the neural network is

$$y_j(P) = \sum_{i=1}^{u} R_i(P)\, w(j, i), \tag{22}$$

where $w(j, i)$ is the weight or strength of the $i$th receptive field to the $j$th output [16].
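Equations (20)-(22) translate directly into code; the sketch below (illustrative Python, with the second-layer weights fitted by least squares, which is one possible choice rather than the method prescribed above) computes the RBF design matrix and the network outputs.

```python
# Direct sketch of (20)-(22): Gaussian RBF units followed by a linear output layer.
import numpy as np

def rbf_design_matrix(X, centers, sigma):
    """R_i(P) of (21) for every input P (rows of X) and every prototype C_i."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)   # ||P - C_i||^2
    return np.exp(-d2 / sigma**2)

def fit_output_weights(R, T):
    """Least-squares estimate of the weights w(j, i) of (22); T holds one target row per sample."""
    return np.linalg.lstsq(R, T, rcond=None)[0]

def rbf_predict(X, centers, sigma, W):
    """y_j(P) of (22) for each input; the arg max over j gives the predicted class."""
    return rbf_design_matrix(X, centers, sigma) @ W
```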

According to [16], (21) and (22) show that the outputs of an RBF neural classifier are characterized by a linear discriminant function: they generate linear decision boundaries (hyperplanes) in the output space. Consequently, the performance of an RBF neural classifier strongly depends on the separability of the classes in the $u$-dimensional space generated by the nonlinear transformation carried out by the $u$ RBF units [16].

According to Cover's theorem on the separability of patterns, a complex pattern classification problem cast nonlinearly into a high-dimensional space is more likely to be linearly separable than in a low-dimensional space; hence the number of Gaussian nodes is taken as $u \ge r$, where $r$ is the dimension of the input space. On the other hand, increasing the number of Gaussian units may result in poor generalization because of overfitting, especially in the case of small training sets [16]. It is therefore important to analyze the training patterns for an appropriate choice of the number of RBF hidden nodes.

Geometrically, the key idea of an RBF neural network is to partition the input space into a number of subspaces in the form of hyperspheres. Accordingly, clustering algorithms ($k$-means clustering, fuzzy $c$-means clustering, and hierarchical clustering), which are widely used in RBF neural networks, are a logical approach to this problem [16].

However, it should be noted that these clustering approaches are inherently unsupervised learning algorithms, as no category information about the patterns is used. When the category information of the training patterns is taken into account, it should be emphasized that the class memberships depend not only on the distances between patterns but also on the Gaussian widths [16].

To assess the efficiency of our proposed method on the ORL and computer vision (CV) databases, we compared the recognition rate (RR) obtained for each database and each parameterization. Section 3.4 describes the face identification experiments using the MLP classifier, whereas Section 3.5 presents the simulation results for the RBF classifier.

3.4. MLP Neural Network Classifier for Face Recognition

Once the features have been extracted, the resulting vectors are fed into MLP and RBF classifiers to determine the correct classification rate.

After Gabor filtering, we construct the training matrix from the moduli of the filtered images; only the magnitude information is used for the Gabor characterization. These characteristic feature vectors are fed into the MLP neural classifier.

Using the Haar discrete wavelet and the ORL database, for example, with images of size 112 × 92 pixels, we retain the approximation subimage of size 56 × 46; reshaping it (with the Matlab reshape function) yields a vector of size 2576 for each image in the ORL database. The approximation vector sizes for db2, db4, db8, and bior2.2 are 2679, 2891, 3339, and 2784, respectively. Table 1 summarizes the feature vector sizes obtained with each wavelet for the two databases used in our work.
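These descriptor sizes can be reproduced with a few lines of Python using PyWavelets and symmetric boundary extension (a sketch under that assumption; our experiments used Matlab's dwt2):

```python
# Sketch reproducing the descriptor sizes quoted above for the ORL image format
# (112 x 92 pixels), using PyWavelets with symmetric boundary extension.
import numpy as np
import pywt

face = np.random.rand(112, 92)                       # stand-in for an ORL image
for name in ['haar', 'db2', 'db4', 'db8', 'bior2.2']:
    LL, _ = pywt.dwt2(face, name)
    print(name, LL.shape, LL.size)                   # haar -> (56, 46) 2576, bior2.2 -> (58, 48) 2784
```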

After computing the DWT or Gabor transform, the feature vectors of the training set are used to train the neural network; the resulting architectures are called DWT-MLP (DWT-RBF) and Gabor-MLP (Gabor-RBF).

The performance of the Gabor-MLP or DWT-MLP approach is evaluated using the standard recognition rate (RR), defined by

$$\mathrm{RR} = \frac{N_{c}}{N_{t}} \times 100\%, \tag{23}$$

where $N_{t}$ is the total number of test samples and $N_{c}$ is the number of test images recognized correctly.

The choice of the feature extraction method is a very important step; the subsequent application of a classifier such as MLP or RBF then allows good results to be obtained in the development of an individual identification process.

Various transfer functions were tested for training the network and the average minimum MSE on training was measured; the log-sigmoid function is the most suitable transfer function. The MLP neural network is trained with the scaled conjugate gradient (SCG) learning rule: TRAINSCG is a network training function that updates weight and bias values according to the scaled conjugate gradient method. Finally, the network is tested on the training and testing datasets.

3.4.1. Searching for the Optimal NN Topology

Number of Hidden Neurons. In our experiments we demonstrate how we decided on the optimal number of neurons for the hidden layer.

We attempted to find the optimal NN topology or architecture by training and testing MLPs whose number of hidden neurons varies from 10 to 100, keeping the maximum number of iterations fixed (between 3000 and 6000) and the training MSE goal as presented in Table 2. We used sigmoid functions as transfer functions.
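The sweep can be sketched as follows (illustrative Python, with scikit-learn standing in for the toolbox we used; the solver and iteration limit are assumptions):

```python
# Illustrative sweep over the hidden-layer size, mirroring the experiment described above.
from sklearn.neural_network import MLPClassifier

def sweep_hidden_units(X_train, y_train, X_test, y_test, sizes=range(10, 101, 10)):
    results = {}
    for n in sizes:
        mlp = MLPClassifier(hidden_layer_sizes=(n,), activation='logistic',
                            max_iter=3000, random_state=0)
        mlp.fit(X_train, y_train)
        results[n] = 100.0 * mlp.score(X_test, y_test)   # recognition rate in %
    return results
```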

For a small number of neurons (10 to 20) in the hidden layer we observed a large MSE, low accuracy, and generally a large bias: the MLPs were not flexible enough to capture the underlying process and, due to the large bias, generalized poorly. We observed similarly poor generalization when we used too many neurons in the hidden layer: beyond about 90 neurons, the MSE returned to the level of a system with only 10 hidden neurons.

By adding more and more units in the hidden layer, the training error can be made as small as desired, but each additional unit generally produces less and less benefit. With too many neurons, poor performance is a direct effect of overfitting: the system overfits the training data and does not perform well on novel patterns.

Table 2 summarizes the optimal neural network topology or architecture found in our experiments to achieve best recognition rate.

Figure 16 shows the recognition efficiency obtained for both the computer vision and ORL databases with Gabor wavelet characterization and the MLP neural network classifier. A good recognition rate of about 96% to 97% was obtained for more than 60 hidden neurons for the two databases.

Figure 17 shows the variation of recognition rate when varying the number of neurons in hidden layer with db2, db4, db8, and bior2.2 DWT with MLP classifier.

The best recognition rate, about 96%, was obtained for the db8 and bior2.2 wavelets. From this preliminary result we conclude that db8 and Biorthogonal 2.2 perform better than the Haar wavelet; a good parameterization can be achieved with these two discrete wavelets with only 20 to 30 hidden neurons.

In addition, the results obtained for the computer vision dataset are superior to those obtained for the ORL dataset, where the best recognition rate is about 81%. Figure 18 shows the efficiency of our recognition system for the different discrete wavelets studied in our work, obtained with the ORL dataset.

We also noticed that, for 20 to 40 hidden neurons, the MLP algorithm does not converge for the ORL dataset; this can be explained by the fact that the number of neurons is not sufficient to train the MLP classifier. In addition, the facial images in the ORL dataset present a lot of variability in orientation and facial expression.

All these parameters complicate the task of recognition and classification, whereas the main advantage of the DWT parameterization is the reduction of the feature size and computation time compared to the Gabor wavelet. The classification time required by the face recognition system is presented in Figure 19 in order to compare the speed of the different discrete wavelet characterizations. Simulation results were computed on a PC with an Intel Core i3 1.2 GHz processor.

Recognition is faster for the different DWT transforms when the number of hidden neurons exceeds 80; we notice from Figure 19 that this corresponds to fast convergence of the MLP classifier. We thus observe a relationship between the optimal convergence of the neural network classifier and the time required for this task.

3.5. RBF Neural Network Classifier for Face Recognition

Simulations were carried out with MATLAB 7.6, using some predefined functions to train our RBF NN classifier. For the RBF neural network simulation, many tests were carried out. First, we create an RBF classifier using the newrbe function, which designs a radial basis network with zero error on the design vectors. This function creates a two-layer network: the first layer has radbas neurons and calculates its weighted inputs with dist and its net input with netprod; the second layer has purelin neurons and calculates its weighted input with dotprod and its net input with netsum. Both layers have biases.

Newrbe sets the first-layer weights to X', where X is the matrix of our training vector descriptors (Gabor-filtered images or discrete wavelet decompositions), and the first-layer biases are all set to 0.8326/spread, resulting in radial basis functions that cross 0.5 at a weighted input of ±spread. The larger the spread, the smoother the function approximation (Neural Network Toolbox).
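The behaviour described above can be approximated outside the toolbox as follows (a sketch under the stated assumptions: training vectors as centres, bias 0.8326/spread, and a linear output layer solved by least squares; it is not the newrbe source code):

```python
# Approximate re-creation of the newrbe-style exact RBF design described above.
import numpy as np

def newrbe_like(X_train, T_train, spread):
    centers = X_train.copy()                        # first-layer "weights" = the training vectors
    b = 0.8326 / spread                             # radial basis crosses 0.5 at +/- spread
    d = np.sqrt(((X_train[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2))
    R = np.exp(-(b * d) ** 2)                       # radbas applied to dist * bias
    W = np.linalg.lstsq(R, T_train, rcond=None)[0]  # second-layer weights solved exactly
    return centers, b, W
```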

For our simulation, we compute the recognition rate while varying the spread parameter from 10 to 250 for the computer vision database and from 200 to 5600 for the ORL database. We note that the spread parameter depends on the quality of the analyzed images and on the database. Figures 20 and 21 show the variation of the recognition rate with the spread parameter.

The recognition rate is about 95% for the computer vision database, obtained with a spread parameter equal to 90, whereas it is about 83% for the ORL database, obtained with a spread equal to 5200.

We conclude the following:
(1) The use of wavelets is a good means of characterizing the data with high speed compared to Gabor filters.
(2) The Gabor wavelet can extract object features at different orientations and different scales, whereas the two-dimensional wavelet subband coefficients extract information in only two directions.
(3) The MLP neural network is a good classifier for face identification when trained with the conjugate gradient method.

Table 3 summarizes the efficiency of the characterization methods and classifiers used in our work. Recognition performance is highly dependent on the facial dataset.

4. Discussion

Simulation results showed the efficiency of the proposed method; about 94 to 96% was obtained with Gabor filters and MLP/RBF classifier for computer vision database, whereas about 85% was obtained for ORL dataset.

We conclude that the best results are obtained for computer vision database; in addition, MLP can perform well with Gabor parameterization.

We also conclude that the MLP classifier performs better than the RBF classifier: 94% correct classification was obtained with the Gabor parameterization, whereas the bior2.2 and db8 wavelet decompositions reach 96% with the two neural network classifiers.

These results can be explained by the fact that the computer vision database was acquired in a controlled environment with little variation in illumination and pose, whereas the ORL database presents a lot of variation in pose, illumination, and facial expression. All these parameters decrease the efficiency of our method.

The results obtained show the importance of designing the neural network for face recognition. In addition, feature extraction by Gabor filters and wavelet decomposition performs well over the whole database. We observe an improvement in face recognition when using the Gabor parameterization for the ORL dataset, but the time required for classification is long compared with the discrete wavelet. This can be explained as follows: the Gabor wavelet can extract object features at different orientations and different scales, whereas the two-dimensional wavelet subband analysis extracts information in only two directions. Both Gabor filters and wavelet decomposition characterize the facial images well; the best recognition rate was obtained with the Biorthogonal 2.2 wavelet and the MLP classifier.

4.1. Comparisons with Other Approaches

Many face recognition approaches have been performed on the computer vision database. In order to compare the recognition performance, we choose some recent approaches tested under similar conditions for comparison. Approaches are evaluated on recognition rate.

The performance of the Gabor-MLP and DWT-RBF algorithm in comparison with some existing algorithms for face recognition is listed in Table 4. It is evident from the table that our proposed method achieves high recognition rate as well as high training and recognition speed.

Table 4 presents results in terms of correct classification rate obtained in our previous work [13] and compared with our approach for computer vision dataset.

It is clear from Table 4 that our proposed approach achieves high recognition rate. It can be seen that the DWT-MLP algorithm exhibits better performance than other linear classifiers like PCA and LDA.

We conclude that the MLP neural network with only one hidden layer performs well compared to the RBF classifier for the two databases. We also find that the methods extracting DCT features are good for the recognition system, although the time required is long (about 5 min for 100 training images) compared to DWT extraction and classification.

Many studies have investigated the use of discrete wavelets and other statistical reduction methods such as PCA on the ORL, computer vision, and FERET databases, and the reported results vary. Zhang et al. [17] present a face recognition system combining two recently proposed neural network models, namely the Gabor wavelet network (GWN) and the kernel associative memory (KAM), into a unified structure called Gabor wavelet associative memory (GWAM), using three popular face databases: the FERET database, the Olivetti-Oracle Research Lab (ORL) database, and the AR face database. A recognition rate of about 98% was obtained on the ORL database and 96% on the AR dataset [17].

AlEnzi et al. [18] apply the discrete wavelet transform (DWT) at different levels together with two-dimensional principal component analysis (2DPCA) to compute face recognition accuracy on the ORL image database. In their experiments, combining the 2-level DWT with the 2DPCA method gave the highest recognition accuracy (94.5%) with a processing time of 4.28 s.

The authors of [19] propose a face recognition scheme combining the wavelet transform and 2DPCA, applied to the ORL dataset. They compared the recognition performance of various wavelets at 8 levels (1 to 8). The accuracy with 2DPCA alone is 85%, whereas the average accuracy of 2DPCA with wavelet subbands 2 and 3 is 94.5%.

Jain and Bhati [20] develop a face recognition system using neural network with PCA to reduce the dimensionality in wavelet domain. For face detection Haar wavelet is used to form the coefficient matrix and PCA is used for extracting features. These features were used to train the artificial neural networks classifier applied for ORL dataset. The best recognition rate obtained was about 81.11%.

From our overall analysis and comparison, we agree with Jain and Bhati: a similar correct classification rate of about 81% to 83% was obtained for the ORL dataset, whose facial images vary in facial expression and illumination. PCA reduction can be applied in future work to investigate the effect of spatial reduction.

5. Conclusion

In this paper, Gabor and discrete wavelet based feature extraction methods are proposed for the extraction of features from facial images. Face recognition experiments were carried out using MLP and RBF classifiers on two facial databases, the ORL and computer vision databases. We have implemented and studied two neural network architectures trained with different characteristic feature vectors: discrete wavelet transforms (Haar, db2, db4, db8, and bior2.2) and the Gabor wavelet.

Simulation results showed the efficiency of the proposed method: about 94 to 96% was obtained with Gabor filters and the MLP classifier for the computer vision database, whereas about 83 to 85% was obtained with the discrete wavelet decomposition using the approximation feature descriptor for the ORL database. The results depend first on the quality of the stored images, on the classifier, and also on the characterization method.

The presented MLP model for face recognition improves performance by reducing the input features and compares favourably with the RBF classifier. Simulation results on the computer vision face database show that the proposed method achieves high training and recognition speed, as well as a high recognition rate of about 96%. Other classifiers, such as SVM, could be applied for comparison with our method, and other parameterizations, such as contourlets or other discrete wavelets, could be investigated to achieve a better recognition rate. In future work, we plan to apply curvelet analysis, which is better at handling curve discontinuities.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.