Abstract

Automatic authentication systems based on biometric technology are becoming increasingly important as the need for person verification in daily life grows. A few years ago, fingerprint verification was used only in criminal investigations. Now fingerprints and face images are widely used at bank counters, airports, and building entrances. Face images are easy to obtain, but successful recognition depends on the orientation and illumination of the probe image matching those of the image taken at registration time. Facial features change heavily with illumination and orientation angle, leading to increased false rejection as well as false acceptance. Registering face images for all possible angles and illuminations is impossible. In this work, we propose a memory-efficient way to register (store) face image data over multiple angles and changing illumination, and a computationally efficient authentication technique, using a multilayer perceptron (MLP). Though the MLP is trained using only a few registered images with different orientations, its generalization property makes interpolation of features for intermediate orientation angles possible. The algorithm is further extended into an illumination-robust authentication system. Results of extensive experiments verify the effectiveness of the proposed algorithm.

1. Introduction

The need for personal identification has grown enormously in the last two decades. Previously, biometric identification using fingerprints or face images was restricted to criminal investigation, and a few experts could serve the demand. With increased terrorist activities, stricter security requirements for entering buildings, and other related applications, the need for automatic biometric machine-authentication systems is becoming more and more important.

Recognizing people from the face (face image) is the most natural and widely used method; we humans do it constantly and effortlessly. Because face images can be collected easily without disturbing the subject, face recognition is one of the most popular ways of automatic machine authentication. An excellent survey of face-recognition algorithms is available in [1].

In automatic face recognition, the first step is to identify the boundary of the face and separate it from the photographed image. Next, recognition algorithms extract feature vectors from the input (probe) image. These features are then compared with the set of such features stored in the database. The database (gallery) contains the same set of features, extracted and stored during the registration phase, for all persons who need to be authenticated.

There are two classes of algorithms to extract features from the image: model based and appearance based. Model-based algorithms use explicit 2D or 3D models of the face; geometrical features such as the relative positions of important facial components (eyes, nose, mouth, and so forth) and their shapes are used as features. These features are robust to lighting conditions but weak against changes in the orientation of the face. We used a subset of such features as the "angle feature" in our previous work [2]. In appearance-based methods, the pattern of light and shade distribution in the facial image is used to derive features. Being computationally simpler, the appearance-based paradigm is more popular; one of the significant works is the eigenface approach [3] by Turk and Pentland. We also use appearance-based algorithms to extract facial features.

Though automated face recognition for frontal face images taken under controlled lighting conditions is more or less successful, recognition in an uncontrolled environment is an extremely complex and difficult task. Many researchers are trying to develop unconstrained face recognition systems [4], especially pose- and illumination-invariant face recognition [5], for a wide variety of real-time applications.

For most biometric applications, we need to authenticate a particular person in real time from his/her quickly taken face image, and the face image has to be registered in the system beforehand. For proper verification, the input image (probe image) should match the registered image (gallery image) of that particular person (to avoid false rejection of the genuine person) and not anyone else's face image (to avoid false acceptance). The algorithm has to be efficient enough to work in real time. The task becomes difficult because the quickly taken probe image may differ in illumination and pose (and therefore in features) from the image of the individual registered in the database.

Even though the person is the same, the automatic authentication system may fail because the orientation angle, ambient lighting, age, make-up, glasses, facial expression, and so forth differ from those of the stored gallery image of the individual. It is said that about 75% of authentication failures occur because the orientation angle of the probe face image differs from that of the stored image. Storing the images (i.e., image features) of an individual taken at all possible angles and illuminations in the gallery is impossible and very inefficient, yet that information is needed for correct recognition. In this work, we focus on angle-aware face recognition and then extend the proposed algorithm to ambient-light-aware face recognition. In the proposed angle- and illumination-aware face recognition, we store the available (training) information in a trained artificial neural network (ANN). Retrieval of the features for any intermediate angle and illumination from the trained ANN is very efficient, so the algorithm can be used in real time. We experimented with a benchmark database, and our system achieved excellent results for both the false-acceptance rate (FAR) and the false-rejection rate (FRR).

In the next section we briefly discuss related work on orientation- and illumination-robust face recognition. In Section 3 we present our proposed idea for angle-aware face recognition and its extension to illumination-aware recognition, followed by Section 4, which contains the simulation experiments and results. Section 5 contains the conclusion and discussion.

2. Related Works

According to FERET and FRVT [6] test reports, the performance of face recognition systems drops significantly when large pose variations are present in the input images. Though the registration image is a frontal face image, the probe image is more often than not an imperfect frontal image. Angle-aware face recognition is therefore a major research issue. Approaches to address the pose variation problem are mainly classified into three categories.

(1) Single-view approaches, in which invariant features or 3D model-based methods are used to produce a canonical frontal view from various poses. In [7] a Gabor wavelet-based feature extraction method is proposed which is robust to small angle variations. This approach did not receive much attention due to its high computational cost.

(2) Multiview face recognition, an extension of appearance-based frontal image recognition. Here, gallery images of every subject at many different poses are needed. Earlier works on pose-invariant appearance-based multiview algorithms are reported in [8–10]. Most algorithms in this category require several images of each subject in the database and consequently require much more computation for searching and memory for storage.

(3) Class-based hybrid methods, in which multiview training images are available during training but only one gallery image per person is available for recognition. The popular eigenface approach [3] has been extended in [11] in order to achieve pose invariance. In [12] a robust face recognition scheme based on graph matching has been proposed.

More recent methods to address pose and illumination are proposed in [2, 13–21].

The simplest approach is to look for a feature which is invariant to pose variation, but no such feature has been found so far. The method of [7] works only for a very small range of angle variation, and the algorithm is too heavy to be used in real time. Geometrical features are very weak against angle variation. The variation of image-pattern-based features due to angle variation exceeds the variation of features across individuals, jeopardizing the recognition process and leading to high FAR and FRR. Prince and Elder [22] presented a heuristic algorithm to construct a single feature which does not vary with pose. Murase and Nayar [23] used principal components of many views to visualize the change due to pose variation. Graham and Allison [24] sampled input sequences of varying pose to form an eigensignature when projected into an eigenspace. A good review of these approaches can be found in [5, 25].

3. Angle- and Illumination-Aware Face Recognition

Our approach is to store multiple-pose image features in a single trained MLP, so that both storage and searching for intermediate angles are efficient. We do not overload the database by adding features for the same face at different angles; instead, we train an artificial neural network to store them all as a function of the orientation angle. Due to the good generalization property of the MLP, it can deliver feature values at intermediate angles, and very efficiently too. Through experiments, we realized that geometrical features are fragile to angle variation, so we use a subset of geometrical features to express the pose angle. The following important aspects were investigated while selecting efficient angle features:

(1) low computational complexity of extracting the angle feature, so that the algorithm can run in real time;
(2) the pose-angle feature contains enough information about the angle;
(3) the feature values vary smoothly with angle variation, so that the MLP can be trained easily and with little error.

The two main contributions regarding pose-invariant face authentication, over our previous work [17], are to automate the angle-feature extraction from the face image and to enrich the angle-feature vector with more relevant features. We also verified that the artificial neural network achieves good generalization for intermediate orientation angles for which no data were available during the training phase. A brief description of the whole algorithm, with an emphasis on the new contributions, is presented in this section.

Figure 1 shows the block diagram of the proposed angle-invariant face recognition system. The system consists of two phases, a registration phase and a recognition phase. In the registration phase, a set of face images is taken from an equal distance but at different angles.

If the number of cameras is N, we get N training samples to train the individual person's MLP at the time of registration. From all the photographs taken by the N cameras, the training data set is first created to train that individual's MLP. The input vector of each training sample is the angle-feature vector, and the output vector is the image feature. Procedures to extract angle features and image features are explained in Sections 3.1 and 3.2, respectively.

A person's identification (ID) and the corresponding trained MLP (trained using her/his face-image angle features and image features) are stored as a pair. Such ID-MLP pairs form the gallery "DATA BASE." In the recognition phase, the individual's face image (probe image) is presented along with her/his ID. From the gallery "DATA BASE" of MLPs, the trained MLP for the claimed ID is retrieved. Angle features extracted from the probe image are used as input to that person's MLP retrieved from the database. The image feature extracted from the probe image and the one obtained as output from the MLP are then compared. If the distance between the two feature vectors is below some predefined threshold, the decision is accept; otherwise, reject. The implicit assumption here is that the MLP delivers the correct image feature for any intermediate face orientation due to its good interpolation (generalization) property.
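To make the decision procedure concrete, a minimal Python sketch of the recognition-phase flow is given below; the function names (extract_angle_features, extract_image_features) and the gallery dictionary are hypothetical placeholders for the components described in Sections 3.1 and 3.2, not part of the original implementation.

```python
import numpy as np

# Hypothetical gallery: maps a person's ID to his/her trained MLP.
# Each MLP maps a 12-element angle-feature vector to an 8-element image-feature vector.
gallery = {}  # e.g., {"person_01": trained_mlp, ...}

def authenticate(probe_image, claimed_id, threshold,
                 extract_angle_features, extract_image_features):
    """Accept/reject decision as described in Section 3 (sketch)."""
    mlp = gallery[claimed_id]                        # retrieve the claimed person's MLP
    angle_vec = extract_angle_features(probe_image)  # 12-element angle feature (Section 3.1)
    probe_feat = extract_image_features(probe_image)  # 8-element image feature (Section 3.2)
    predicted_feat = mlp.predict(angle_vec.reshape(1, -1))[0]  # gallery feature at this pose
    distance = np.linalg.norm(probe_feat - predicted_feat)     # Euclidean distance, Eq. (2)
    return distance < threshold                      # accept if below the preset threshold
```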

In the following section, we will discuss how angle features are extracted from the face image. We will also show what angle features are finally selected for our system and why.

3.1. Angle Feature Extraction

The angle feature should contain information about the orientation angle of the face image. Geometrical features of a face image, which use distances between important parts of the face and the angles between connecting lines, are capable of expressing the orientation of the face image. We used cues from those feature-extraction approaches. In our previous work [17], we used three points: the left and right eye locations and the middle of the mouth. The distances between them and the slopes of the lines connecting them were used as elements of the feature vector. The distance between the two eyes decreases as the orientation angle increases; similarly, the slope of the line connecting the eyes and the mouth changes as the face turns towards the right or left. These six elements of the angle-feature vector gave reasonably good results. But in our previous work, the three points on the face image were identified manually, and the feature vectors were evaluated manually from all face images under investigation. In total we used 10 facial images, each at 21 different angles, so 210 angle-feature vectors were hand calculated.

In the present work, we wrote an algorithm to automatically identify the important points on the face, which facilitated working with a larger data set. After filtering, we could always identify the eyebrows, eyes, nostrils, and mouth. The possible identifying points are the two end points of the left eyebrow, the two end points of the right eyebrow, the two end points of the left eye, the two end points of the right eye, the nostril (sometimes two), and the two end points of the mouth. This is clear from the picture after binary conversion (the best result was obtained with a threshold of 0.75), as shown in Figure 3. We used a database with oriental faces only. Many angle-vector elements can be identified whose values change as the angle changes; we tried different combinations, taking care that the procedure remains simple and efficient.

The important parts of the face image are separated as follows. At first, a minimum value filter is applied. The minimum value filter emphasizes the parts where the image is dark, because the important parts of the face are darker than the surrounding skin. Through experiments, we confirmed that this technique is effective for identifying the locations of the eyebrows, eyes, nostrils, and mouth. After the minimum value filter, binarization is performed to clearly identify the important parts of the face. In addition to our targeted facial parts, the hair is also filtered out; the hair region is detected first and, though it too is an important element of the face profile, we do not use it, so the hair and the background are deleted. We then identify the eyebrows, eyes, nose, and mouth with a heuristic algorithm using knowledge of their relative positions. As we do not use the eyebrows to create the angle-feature vector, they are also deleted after identification. Once both eyes, the nose, and the mouth are located on the face image, we generate the angle features.
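A minimal sketch of this preprocessing step is given below, assuming a grayscale image with pixel values in [0, 1]; the binarization threshold of 0.75 is taken from the text, while the filter size is an assumed value.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def locate_dark_parts(gray_img, filter_size=5, threshold=0.75):
    """Emphasize dark facial parts (eyebrows, eyes, nostrils, mouth) and binarize.

    gray_img: 2-D float array in [0, 1]. filter_size is an assumed value.
    Returns a boolean mask where True marks candidate facial-part pixels.
    """
    # Minimum value filter: each pixel takes the darkest value in its neighbourhood,
    # which spreads and emphasizes dark regions such as the eyes and mouth.
    filtered = minimum_filter(gray_img, size=filter_size)
    # Binarization: pixels darker than the threshold (0.75 in the text) are kept.
    return filtered < threshold
```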

First we give the details of the elements of the angle-feature vector and then explain the rationale for choosing them. The angle-feature vector is

$F = (w_{\mathrm{LE}}, w_{\mathrm{RE}}, w_{\mathrm{M}}, d_1, \ldots, d_5, g_1, \ldots, g_4),$  (1)

which consists of 12 elements. Here w_LE, w_RE, and w_M are the widths of the left eye, the right eye, and the mouth. Next, we find the centers of the left eye, the right eye, the nostrils, and the mouth. Let us denote the coordinates of these four points as left eye LE = (x_LE, y_LE), right eye RE = (x_RE, y_RE), mouth M = (x_M, y_M), and nose N = (x_N, y_N). Taking any two of these four points gives six distances; except for the distance between N and M, all of them change with the face orientation angle. We use the five distances d_1, ..., d_5 shown in Table 1 as components of the angle-feature vector. The remaining four features g_1, ..., g_4 are the gradients of the lines described in Table 2.

All these features are easy to calculate and change more or less smoothly with angle variation. We did not include the distance between the mouth and the nose, the gradient of the line joining the mouth and the nose, or the gradient of the line joining the two eyes, because these quantities hardly change as the angle changes.
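The sketch below illustrates how such a 12-element vector could be assembled from the detected end points; the landmark argument names are hypothetical, and the particular distances and gradients are inferred from the exclusions stated above rather than copied from Tables 1 and 2.

```python
import numpy as np

def angle_feature_vector(le_l, le_r, re_l, re_r, m_l, m_r, nose):
    """Build a 12-element angle-feature vector (sketch).

    le_l/le_r: end points of the left eye, re_l/re_r: of the right eye,
    m_l/m_r: of the mouth, nose: nostril center. Each point is (x, y).
    """
    le_l, le_r, re_l, re_r, m_l, m_r, nose = map(
        np.asarray, (le_l, le_r, re_l, re_r, m_l, m_r, nose))

    # Centers of left eye (LE), right eye (RE), and mouth (M); nose (N).
    LE, RE, M, N = (le_l + le_r) / 2, (re_l + re_r) / 2, (m_l + m_r) / 2, nose

    # Three widths: left eye, right eye, mouth.
    widths = [np.linalg.norm(le_r - le_l),
              np.linalg.norm(re_r - re_l),
              np.linalg.norm(m_r - m_l)]

    # Five of the six pairwise distances (the N-M distance is excluded,
    # since it hardly changes with the orientation angle).
    pairs = [(LE, RE), (LE, N), (LE, M), (RE, N), (RE, M)]
    dists = [np.linalg.norm(a - b) for a, b in pairs]

    # Four gradients of connecting lines (the eye-to-eye and nose-to-mouth
    # lines are excluded, following the text above).
    def grad(a, b):
        return (b[1] - a[1]) / (b[0] - a[0] + 1e-9)  # avoid division by zero
    grads = [grad(LE, N), grad(RE, N), grad(LE, M), grad(RE, M)]

    return np.array(widths + dists + grads)  # 3 + 5 + 4 = 12 elements
```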

To verify how our angle-feature vector changes with the orientation of the face image, we plotted the Euclidean distance between angle vectors against the orientation angle, as shown in Figure 4(a). Although the face-image feature has not been discussed yet, Figure 4(b) shows the corresponding Euclidean distance between face-image feature vectors as the orientation angle changes.

The plots are over all training samples and show smooth, nonlinear but monotonic changes. From these plots, we can confirm that our angle feature is suitably chosen and that an MLP can be trained in a small number of epochs. Of course, this training is done off-line during the registration period, so a longer training time is permissible. At the time of authentication, the MLP outputs the face-image feature from the input angle feature instantly, which enables real-time application.

In summary, compared to our previous work, we have improved our angle-feature extraction technique not only by automating it but also by adding six more elements to the angle-feature vector to capture the orientation-angle information more faithfully. This also enables us to work with a larger data set of face images.

3.2. Image Feature Extraction

The image feature captures the characteristics of the entire image, that is, the spatial distribution of the pixel values. The most widely used method is the eigenface approach, first proposed by Turk and Pentland in [3], which is based on principal component analysis (PCA). The first few principal components are used as features, and every face image is expressed as a vector of these principal-component values. We used the same technique to create the image feature vector.

In our experiments, 8 principal components, which carry 99% of the image information, were used. We further extended our experiments using independent component analysis (ICA) on the image features. As the independent-component features of the image gave better results, only those results are presented in this paper.
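As an illustration, the 8-component image feature could be extracted as sketched below, using scikit-learn's PCA and FastICA as assumed stand-ins for the original implementation.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

def fit_image_feature_extractor(training_faces, n_components=8, use_ica=True):
    """training_faces: (n_samples, height*width) array of flattened face images."""
    if use_ica:
        model = FastICA(n_components=n_components, random_state=0)
    else:
        model = PCA(n_components=n_components)  # eigenface-style features
    model.fit(training_faces)
    return model

def image_feature(model, face):
    """Project one flattened face image onto the 8-dimensional feature space."""
    return model.transform(face.reshape(1, -1))[0]
```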

3.3. Neural Network for Mapping Angle Feature to Image Feature

A multilayer neural network, trained with error backpropagation, is used as a mapping function to map an individual's face-orientation angle to his/her face-image features for that particular angle. As the angle-feature vector consists of 12 elements, the MLP has 12 input nodes plus one bias node. We use a single hidden layer with 15 hidden nodes; experiments were also performed with different numbers of hidden nodes. The training is fast and quickly converges to a very low mean square error (MSE). Even with 10 hidden nodes, it is possible to reach a low error after training, but more training epochs are needed. The number of output nodes is eight, equal to the number of image features obtained by independent component analysis.

As already mentioned, we have a separate MLP for every individual. For every registered individual, we have face images taken at orientation angles from −50 degrees to +50 degrees, at intervals of 5 degrees, giving 21 images per individual. Out of these 21 images, those taken at orientations −50, −40, −30, −20, −10, 0, +10, +20, +30, +40, and +50 degrees, 11 in total, are used for training the MLP. The remaining 10 images, taken at angles −45, −35, −25, and so forth, are used for testing the trained MLP. Figure 5 shows the result after averaging over all images against a single self-image. Very good generalization is obtained; at the testing points the error is a little larger than at the points where the network was trained, yet the distance between self and non-self images remains quite large, ensuring low values for both FAR and FRR when the threshold is properly chosen.
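The sketch below shows how one person's MLP could be trained on the 11 registration angles and checked at the 10 intermediate angles; scikit-learn's MLPRegressor is used as an assumed stand-in for the backpropagation-trained network of Section 3.3, and the feature arrays are random placeholders for the real angle and image features.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# 21 poses from -50 to +50 degrees in 5-degree steps for one registered person.
angles = np.arange(-50, 51, 5)
train_idx = np.where(angles % 10 == 0)[0]   # -50, -40, ..., +50 -> 11 training poses
test_idx = np.where(angles % 10 != 0)[0]    # -45, -35, ..., +45 -> 10 intermediate poses

# Placeholders for the real features (12-element angle features, 8-element ICA features).
rng = np.random.default_rng(0)
angle_feats = rng.random((21, 12))
image_feats = rng.random((21, 8))

# 12 inputs -> 15 hidden nodes -> 8 outputs, as described in Section 3.3
# (the bias node is handled internally by the library).
mlp = MLPRegressor(hidden_layer_sizes=(15,), activation='logistic',
                   max_iter=5000, random_state=0)
mlp.fit(angle_feats[train_idx], image_feats[train_idx])

# Generalization check: the distance between the true image feature and the MLP
# output should stay small at the intermediate (untrained) angles.
pred = mlp.predict(angle_feats[test_idx])
intermediate_error = np.linalg.norm(pred - image_feats[test_idx], axis=1)
print(intermediate_error.mean())
```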

3.4. Systems Robust to Illumination Variation

In this work, we also propose an extension of our system to include correction for illumination variation. Two alternative systems are proposed, as shown in Figures 6 and 7. In System I, only one MLP is used, as in the angle-invariant system; the only difference is that one input node is added to the MLP to include image-brightness information, and the rest of the algorithm remains the same. In System II, two MLPs are used, trained separately. The first MLP (MLP1) outputs the image feature using the angle feature as input; it is trained with images at the base brightness, that is, 0% darkness. When darker images are input to this MLP, the output image features will be incorrect. The second MLP (MLP2) takes the output of MLP1 and the brightness information and is trained to give the correct image feature for the darker image. Finally, the output of MLP2 is compared with the image feature of the probe face image to make the authentication decision.
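A hedged sketch of the two variants is given below; MLPRegressor is again an assumed stand-in, the hidden-layer size is borrowed from Section 3.3, and the array names are placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# System I: a single MLP with the 12 angle features plus one brightness value as input.
def train_system1(angle_feats, brightness, image_feats):
    X = np.hstack([angle_feats, brightness.reshape(-1, 1)])      # (n, 13)
    mlp = MLPRegressor(hidden_layer_sizes=(15,), max_iter=5000, random_state=0)
    return mlp.fit(X, image_feats)

# System II: MLP1 maps angle features to image features at base brightness (0% darkness);
# MLP2 corrects MLP1's output using the brightness information.
def train_system2(angle_feats, brightness, image_feats_base, image_feats_dark):
    mlp1 = MLPRegressor(hidden_layer_sizes=(15,), max_iter=5000, random_state=0)
    mlp1.fit(angle_feats, image_feats_base)                       # trained on 0%-darkness images
    X2 = np.hstack([mlp1.predict(angle_feats), brightness.reshape(-1, 1)])  # (n, 9)
    mlp2 = MLPRegressor(hidden_layer_sizes=(15,), max_iter=5000, random_state=0)
    mlp2.fit(X2, image_feats_dark)                                # target: features of darker images
    return mlp1, mlp2
```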

4. Simulation Experiments and Results

As already mentioned, the system consists of two stages: learning of the MLP, that is, the registration phase, and using the learned MLP in the authentication phase.

4.1. Registration Phase

When person "A" is to be registered, face photographs of person "A" are taken using multiple cameras set at different angles, as shown in Figure 2. We use the database [26] from Softopia, Japan, which contains images taken at intervals of 5 degrees. For registration, we use face-image data at intervals of 10 degrees, from −50 degrees to +50 degrees. The registration system is shown in Figure 8. First, the image is converted to a grey-scale image, the face part is cut out, and the angle features and the independent-component features of the face image are extracted. The angle feature is used as the input to the MLP and the image features as the teacher signal. From the database, 11 such samples are used for training. The training converges within 5000 epochs, with a very low mean square error.

4.2. Authentication Phase

In the authentication phase, the person announces his/her identity and lets his/her image be taken. The angle is arbitrary, depending on how the person poses in front of the camera; we assume this angle to be within −50 to +50 degrees, so the mapping task of the MLP is to interpolate. The layout of the authentication system is shown in Figure 9. From the camera image, the face part is cut out. The angle features are extracted and input to the MLP trained for that person, retrieved from the database according to the declared identity. The image feature taken from the image and that obtained as output of the MLP are compared by calculating the Euclidean distance. If the distance is below a threshold value, the person is accepted, and otherwise rejected. The Euclidean distance is calculated by (2) as follows:

$d = \sqrt{\sum_{i=1}^{8} \left(f_i - \hat{f}_i\right)^2}.$  (2)

Here, f_i is the ith element of the image-feature vector extracted from the input image, and \hat{f}_i is the corresponding element of the image-feature vector output by the MLP. Judgment of the proper threshold value is important. If the threshold is set too high, the false-acceptance rate (FAR) will increase; on the other hand, if it is set too low, the false-rejection rate (FRR) will be high. Depending on the application, the threshold is fixed. For a heavily secured place, where false acceptance is not tolerable even at the cost of a few false rejections, the threshold is kept low. In general, the threshold is kept at the value where FAR is equal to FRR.
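As an illustration of this threshold choice, the sketch below scans candidate thresholds and returns the one where FAR and FRR are approximately equal; genuine_dists and impostor_dists are hypothetical arrays of Euclidean distances from genuine and impostor comparisons.

```python
import numpy as np

def equal_error_threshold(genuine_dists, impostor_dists, steps=200):
    """Return the threshold where FAR is (approximately) equal to FRR."""
    candidates = np.linspace(0, max(genuine_dists.max(), impostor_dists.max()), steps)
    best_t, best_gap = candidates[0], np.inf
    for t in candidates:
        frr = np.mean(genuine_dists >= t)    # genuine comparisons wrongly rejected
        far = np.mean(impostor_dists < t)    # impostor comparisons wrongly accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t
```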

4.3. Experimental Setup and Results

Compared to our previous work, in the present work the angle-feature vector has grown from 6 elements to 12 elements, and the image feature has changed from PCA to ICA, the number of elements remaining the same (8). As the number of input nodes increased, we increased the number of hidden nodes to 16 for faster training. We used face-image data taken under the same illumination conditions, with orientation angles from −50 degrees to +50 degrees at intervals of 5 degrees. Image data at intervals of 10 degrees were used for training, and the intermediate angles for testing. In total, face images of 15 individuals were used. Experiments were performed by varying the threshold in steps.

Experimental results for angle variation from −50 degrees to +50 degrees are summarized in Figure 10, where the average FAR and FRR over all images are plotted. The FAR and FRR at the proper threshold improved from about 20% in our previous work to about 10%, that is, an overall improvement of 10% in recognition rate over the whole range of angle variation. We attribute this to our improved angle-feature vector. It is also important to note that the optimum threshold value has increased from 9 to 12, and the slope around that threshold is lower. In the previous work, because shifting the threshold value greatly changed FAR and FRR, it was difficult to select a proper threshold, as it would differ from individual to individual. The new result shows that FAR and FRR do not change much when the threshold is changed.

4.4. Experiments with Changing Illumination

The image feature also changes with the illumination condition. We performed a preliminary experiment to investigate how the image feature changes with brightness and, on the basis of this investigation, proposed the illumination-robust systems presented in the earlier section.

To investigate the pattern of change, we varied the brightness of the face image in steps of 4% (of the original brightness) down to −80% of the original value; the maximum brightness is taken as 0%. The image features at different illumination levels are compared, in terms of Euclidean distance, with those of the brightest (0%) image. The results are summarized in Figure 11. Though the variation of the image feature differs from image to image, its nature is the same.

As shown, the Euclidean distance grows as the brightness decreases, and this pattern of variation is easy for an ANN to learn. From this, we conclude that the proposed system can be extended to perform well under illumination variation as well.
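A minimal sketch of this preliminary experiment is given below, assuming grayscale images in [0, 1] and a simple multiplicative darkening model; how darkening was actually applied in the original experiment is not specified, and feature_model stands for the ICA feature extractor of Section 3.2.

```python
import numpy as np

def brightness_sweep(face, feature_model, step=0.04, max_dark=0.80):
    """Distance between the feature of a darkened face and of the original (0%) face."""
    base_feat = feature_model.transform(face.reshape(1, -1))[0]
    results = []
    for dark in np.arange(step, max_dark + 1e-9, step):           # 4%, 8%, ..., 80% darkness
        darker = face * (1.0 - dark)                              # assumed darkening model
        feat = feature_model.transform(darker.reshape(1, -1))[0]
        results.append((dark, np.linalg.norm(feat - base_feat)))  # distance to the 0% feature
    return results
```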

4.5. Experiments with Extended System and Results

We compared the results of System I and System II. The image brightness is varied at intervals of 4%, from −80% to 0%. The image features are the same ICA features used in Section 4.

All the experimental results are summarized in Figure 12 and Table 3. Figure 12 shows the average misidentification rate under variation of both orientation angle and brightness. The lowest misidentification rate remains almost unchanged when the brightness is reduced by up to 20%, which shows that when illumination is strong there is no need to correct the original system. When the image is dark, the misidentification of System I and System II is much lower than that of the original system. System II takes more time and memory to train, but it gives much better results; its performance is consistent, and its advantage in correct authentication rate grows steadily as the image brightness decreases further.

5. Conclusion

In this work we have proposed an efficient technique for angle-aware face recognition and extended the same technique to handle the effect of illumination variation. Though many angle-invariant and illumination-invariant face recognition methods have been proposed in the literature, there are very few works in which the same framework takes care of both problems simultaneously. Our proposed system can handle angle variation from −50 degrees to +50 degrees and, at the same time, a wide range of illumination variation. The results indicate that the approach can reliably work with larger data sets. We used only one data set and are currently engaged in using other data sets for simulation experiments.

In this work, we considered changes in orientation angle in the horizontal plane, but the orientation in the vertical plane may also vary and affect face recognition. We would like to extend our work to handle changes in orientation in the vertical plane. Further experiments with more benchmark data sets are also a future target.