Abstract

This paper presents a novel Gabor phase based illumination invariant extraction method that aims to eliminate the effect of varying illumination on face recognition. Firstly, varying illumination on face images is normalized, which reduces its effect to some extent. Secondly, a set of 2D real Gabor wavelets with different orientations is used for image transformation, and the multiple Gabor coefficients are combined into one whole, considering both spectrum and phase. Lastly, the illumination invariant is obtained by extracting the phase feature from the combined coefficients. Experimental results on the Yale B and CMU PIE face databases show that our method obtains a significant improvement over other related methods for face recognition under large illumination variations.

1. Introduction

Face recognition has attracted much interest because of its wide applications. Although great progress has been made in related research [1, 2], many issues remain unresolved, including varying illumination, pose, and expression. The varying illumination problem is intractable yet crucial and has to be dealt with: illumination changes can alter face appearance dramatically and thus seriously degrade the performance of a face recognition system [3]. To address this problem, researchers have proposed many approaches, which are mainly classified into three groups.

(1) Illumination Preprocessing. These approaches use image processing to remove lighting effects from face images and obtain illumination-normalized face images [4, 5]. Tan and Triggs [6] proposed a preprocessing chain that combines Gamma correction (GC) and difference of Gaussians (DoG) with contrast equalization; it eliminates most illumination effects while preserving the essential appearance details. Fan and Zhang [7] presented a homomorphic filtering (HF) based illumination normalization algorithm and obtained promising results. Recently, Lee et al. [8] compensated illumination using oriented local histogram equalization (OLHE), which encodes rich information on edge orientations. Illumination preprocessing methods are simple, effective, and efficient; however, they cannot completely resolve extreme uneven illumination variations [3].

(2) Face Modeling. Illumination variations are mainly generated by the 3D shape of human faces under various lighting directions, so generative 3D face models have been constructed to render face images with different poses and illuminations. Belhumeur et al. [9, 10] proposed a generative model named the illumination cone, which uses a convex illumination cone to represent the set of images of a face under changing illumination with fixed pose. They first construct the cone from a large number of images with varying lighting and then approximate it with a low-dimensional linear subspace. Basri and Jacobs [11] found that a 9D linear subspace can approximate the set of images of a convex Lambertian object under varying lighting very well. These methods need images of the same subject under varying lighting, as well as 3D shape information, for training; such requirements can seldom be met in the real world, so the application of these methods is limited.

(3) Illumination Invariant Extraction. This kind of approach is the mainstream; it tries to extract illumination-robust facial features. Many methods are based on the Lambertian illumination model, in which a face image is generally regarded as a product $I(x, y) = R(x, y)\,L(x, y)$, where $R(x, y)$ is the reflectance component and $L(x, y)$ is the illumination component at each point [12]. The objective is to extract the reflectance component $R(x, y)$, which is considered the intrinsic information specific to each class. However, it is difficult to compute the reflectance and the illumination component from real images. A common assumption is that $L$ varies slowly and mainly lies in the low frequencies, while $R$ can change abruptly and typically lies in the high frequencies. Under this assumption, Jobson et al. [13] proposed the multiscale retinex (MSR) method, which estimates the reflectance component as the ratio of the image to its low-pass version, the latter serving as an estimate of the illumination component. Wang et al. [14] used a similar idea (with a different local filter, namely, a weighted Gaussian filter) in the Self Quotient Image (SQI), which is very simple and can be applied to any single image. However, the weighted Gaussian filter can hardly keep sharp edges in low-frequency illumination fields, and selecting proper parameters requires experience and time. To solve this problem, Chen et al. [15] improved SQI by replacing the weighted Gaussian filter with the Logarithmic Total Variation (LTV) model. In 2009, Zhang et al. [16] presented a wavelet-based illumination invariant method (WD), which extracts the denoised high-frequency component in the wavelet domain as the reflectance component. Inspired by this, Cheng et al. [17] and Xie et al. [18] presented two similar illumination invariant extraction methods in the nonsubsampled Contourlet transform (NSCT) domain. In 2011, Chen et al. [19] utilized the scale-invariant property of natural images to derive a Wiener filter approach that best separates the illumination invariant features from an image. In 2012, Cao et al. [20] proposed a wavelet-based illumination invariant extraction approach that takes the correlation of neighboring wavelet coefficients into account. Recently, Song et al. [21] presented a novel illumination-invariant histogram-based descriptor, and Faraji and Qi [22] proposed a novel illumination invariant using logarithmic fractal dimension-based complete eight local directional patterns. Experiments show that these methods achieve very good results. Chen et al. [23] revealed that the direction of the image gradient is insensitive to changes of illumination. Based on this, Zhang et al. [24] introduced the Gradientfaces method, which uses the arctangent of the ratio between the $y$- and $x$-gradients of an image as the Gradientfaces feature. Chen and Zhang [25] improved Gradientfaces by proposing the multidirectional orthogonal gradient phase faces method.
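To make the reflectance-ratio idea behind MSR and SQI concrete, the following minimal sketch (our own illustration, not code from [13] or [14]) estimates the illumination component with a Gaussian low-pass filter and recovers the reflectance as a log-ratio; the function names and scale values are illustrative assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(image, sigma=15.0):
    """Estimate log R from I = R * L, assuming the illumination L is the
    low-pass (Gaussian-smoothed) part of I: log R = log I - log L."""
    image = image.astype(np.float64) + 1.0           # avoid log(0)
    illumination = gaussian_filter(image, sigma) + 1.0
    return np.log(image) - np.log(illumination)

def multiscale_retinex(image, sigmas=(15.0, 80.0, 250.0)):
    """MSR-style estimate: average the single-scale outputs over several scales."""
    return np.mean([single_scale_retinex(image, s) for s in sigmas], axis=0)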

Recent studies confirm that, compared with the magnitude, the phase also contains much effective information for image feature extraction [26]. Based on this, Sao and Yegnanarayana [27] presented a 2D Fourier phase based face image representation, and Cheng et al. [28] presented a novel illumination invariant method, namely, multiscale principal contour direction (MPCD). Inspired by the above, and building on the Gabor wavelet's excellent visual-physiology background and its power as a feature descriptor, we present a novel illumination invariant extraction method based on the Gabor wavelet phase (GF) in this paper. We first preprocess the face image using a homomorphic filtering (HF) based illumination normalization algorithm [7]. Then a set of 2D real Gabor wavelets with different orientations is used for image transformation. Finally, the multiple Gabor coefficients are combined into one whole, considering both spectrum and phase information, and the illumination invariant is obtained by extracting the phase feature from the combined coefficients. The 2D symmetric real Gabor wavelet is chosen in our method, not only to avoid the complexity of complex-valued calculations but also to fit the symmetry of the face image itself.

The rest of this paper is organized as follows. Section 2 presents the proposed method in detail. The experimental results and our conclusions are shown in Sections 3 and 4, respectively.

2. Algorithm Description

Researchers have found that Gabor functions have the capability of modeling simple cells in the visual cortex of mammalian brains [29]. Thus, image analysis using Gabor functions is similar to perception in the human visual system. Frequency and orientation representations of Gabor filters are similar to those of the human visual system, and they are particularly appropriate for texture representation and discrimination.

In recent years, the Gabor wavelet transform has been widely used as an effective element in face recognition [26, 30–34]. The Gabor wavelet transform is insensitive to external environment factors such as illumination, facial expressions, gestures, and occlusion [35], and for this reason it has been widely used to extract robust facial features. Most existing Gabor feature-based methods use the Gabor magnitude features and discard the phase features. However, studies have shown that the phase information contains a number of effective image features and is insensitive to illumination variation. Inspired by this, the Gabor phase features are extracted as illumination invariants in this paper.

The proposed illumination invariant extraction method consists of three steps. Firstly, a homomorphic filtering based illumination normalization method [7] is used to preprocess the face images. Secondly, a set of 2D real Gabor wavelets with different orientations is used for image transformation. Lastly, the multiple Gabor coefficients are combined into one whole, considering both spectrum and phase information, and the illumination invariant is obtained by extracting the phase feature from the combined coefficients.

2.1. Illumination Normalization

We use the method presented in [7], which combines homomorphic filtering and histogram equalization; in this paper, this illumination normalization method is called HF + HQ for short. This preprocessing greatly corrects uneven illumination effects.
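As a rough illustration of the HF + HQ idea (a minimal sketch under our own assumptions, not the exact algorithm of [7]): in the log domain the product $I = R\,L$ becomes a sum, so a high-frequency-emphasis filter can suppress the slowly varying illumination; histogram equalization then stretches the global contrast. The filter shape and all parameter values below are illustrative.

import numpy as np

def homomorphic_filter(image, gamma_l=0.5, gamma_h=2.0, c=1.0, d0=30.0):
    """Log -> FFT -> high-frequency emphasis -> IFFT -> exp.
    Attenuates low-frequency illumination, boosts high-frequency detail."""
    rows, cols = image.shape
    log_img = np.log1p(image.astype(np.float64))
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    d2 = u[:, None] ** 2 + v[None, :] ** 2           # squared distance from DC
    h = (gamma_h - gamma_l) * (1.0 - np.exp(-c * d2 / d0 ** 2)) + gamma_l
    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(h * spectrum)))
    return np.expm1(filtered)

def histogram_equalize(image, levels=256):
    """Map intensities through the empirical CDF to flatten the histogram."""
    scaled = (image - image.min()) / (np.ptp(image) + 1e-12)
    flat = np.round(scaled * (levels - 1)).astype(int)
    hist = np.bincount(flat.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / flat.size
    return cdf[flat] * (levels - 1)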

2.2. 2D Gabor Wavelet Transform

Different Gabor wavelets can be obtained by using different kernel functions. In order to avoid the complexity of complex-valued calculations, and to fit the symmetry of the face image itself, the 2D symmetric real Gabor wavelet is chosen in our method. The kernel function used in this paper is

$$\psi(x, y; f, \theta_k) = \exp\left(-\left(\frac{x'^2}{2\sigma_x^2} + \frac{y'^2}{2\sigma_y^2}\right)\right)\cos(2\pi f x'),$$

where $x' = x\cos\theta_k + y\sin\theta_k$ and $y' = -x\sin\theta_k + y\cos\theta_k$, $f$ is the frequency of the sinusoidal function, $\sigma_x$ and $\sigma_y$ represent the spatial scaling coefficients along the $x$- and $y$-axes, respectively, and $\theta_k$ is the orientation of the Gabor filter, defined by

$$\theta_k = \frac{k\pi}{n}, \quad k = 0, 1, \ldots, n - 1.$$

Here, $n$ determines the number of filter orientations, and we set $n = 8$ in this paper. If $f$, $\sigma_x$, and $\sigma_y$ are selected, after the Gabor wavelet transform of an $M \times N$ gray face image $I(x, y)$ we have

$$G_k(x, y) = I(x, y) * \psi(x, y; f, \theta_k), \quad k = 0, 1, \ldots, n - 1.$$

Here, $*$ indicates convolution of the two functions. $G_k(x, y)$ is denoted as $G_k$ for short. All the Gabor wavelet transformed coefficients are denoted as

$$\mathcal{G} = \{G_k \mid k = 0, 1, \ldots, n - 1\}.$$
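A compact sketch of this transform as reconstructed above (the kernel form follows our reconstruction; the window size and parameter defaults are our assumptions, not settings reported in this paper):

import numpy as np
from scipy.signal import fftconvolve

def real_gabor_kernel(size, f, sigma_x, sigma_y, theta):
    """2D symmetric real (cosine-phase) Gabor kernel at orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the sinusoid runs along orientation theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 / (2 * sigma_x ** 2) + y_r ** 2 / (2 * sigma_y ** 2)))
    return envelope * np.cos(2 * np.pi * f * x_r)

def gabor_transform(image, f=0.1, sigma_x=4.0, sigma_y=4.0, n=8, size=21):
    """Convolve the image with n real Gabor kernels, theta_k = k*pi/n,
    and return the list of responses G_k."""
    return [fftconvolve(image,
                        real_gabor_kernel(size, f, sigma_x, sigma_y, k * np.pi / n),
                        mode="same")
            for k in range(n)]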

Figure 1 illustrates the spectrograms of a face image under the same frequency $f$ and 8 different orientations ($\theta_k$, $k = 0, 1, \ldots, 7$).

2.3. Illumination Invariant Extraction

After the 2D Gabor wavelet transform, the set $\mathcal{G}$ includes all the spectrum information under the different orientations. In order to take the phase information into account as well, we define the complex wavelet coefficients as

$$Z_k(x, y) = G_k(x, y)\,e^{i\theta_k} = G_k(x, y)\left(\cos\theta_k + i\sin\theta_k\right),$$

so that each real response $G_k$ serves as the magnitude and its orientation $\theta_k$ as the phase.

Then, summing up all the complex coefficients under the same frequency $f$, aiming at reducing the feature dimension, we have

$$S(x, y) = \sum_{k=0}^{n-1} Z_k(x, y).$$

The phase feature is calculated by

$$\Phi(x, y) = \arctan\left(\frac{\operatorname{Im}(S(x, y))}{\operatorname{Re}(S(x, y))}\right).$$

Here, $\operatorname{Re}(S(x, y))$ and $\operatorname{Im}(S(x, y))$ are the real and imaginary parts of $S(x, y)$, respectively. In this paper, the phase feature $\Phi(x, y)$ is considered as the illumination invariant. Figure 2 shows the illumination normalized face images and the obtained illumination invariant.
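Under the reconstruction given above, the combination and phase extraction reduce to a few lines; coeffs denotes the list of real responses $G_k$ produced by the transform sketch in Section 2.2, and np.arctan2 is used so the quadrant is handled explicitly (equivalent to the arctangent form up to range).

import numpy as np

def gabor_phase_invariant(coeffs):
    """Combine real Gabor responses G_k into complex coefficients
    Z_k = G_k * exp(i * theta_k), sum over orientations, keep the phase."""
    n = len(coeffs)
    combined = sum(g * np.exp(1j * k * np.pi / n) for k, g in enumerate(coeffs))
    return np.arctan2(combined.imag, combined.real)

# Hypothetical end-to-end usage (helpers from the earlier sketches):
# invariant = gabor_phase_invariant(
#     gabor_transform(histogram_equalize(homomorphic_filter(img))))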

3. Experimental Results

In this section, the performance of the proposed method (GF) is compared with existing methods including MSR, WD, LTV, Gradientfaces, and MPCD on the Yale B [36] and CMU PIE [37] databases. Firstly, we present the different illumination invariants in image form. Then, we compare their recognition performance using Eigenfaces under the same experimental conditions. Following the FERET testing protocol [38], the Top 1 and Top 3 recognition rates are tested.
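For concreteness, here is a minimal sketch of the evaluation protocol as we understand it (Eigenfaces, i.e., PCA, followed by nearest-neighbor ranking to obtain Top-k rates); the number of components and the Euclidean distance are our assumptions, not settings from [38].

import numpy as np

def top_k_accuracy(train_x, train_y, test_x, test_y, n_components=50, k=1):
    """Eigenfaces (PCA) + nearest-neighbor ranking; returns the Top-k rate.
    Rows of train_x / test_x are flattened illumination-invariant images."""
    mean = train_x.mean(axis=0)
    _, _, vt = np.linalg.svd(train_x - mean, full_matrices=False)  # PCA via SVD
    basis = vt[:n_components].T
    train_p = (train_x - mean) @ basis
    test_p = (test_x - mean) @ basis
    train_y = np.asarray(train_y)
    hits = 0
    for x, label in zip(test_p, np.asarray(test_y)):
        order = np.argsort(np.linalg.norm(train_p - x, axis=1))
        hits += label in train_y[order[:k]]    # correct class among k nearest?
    return hits / len(test_p)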

3.1. Comparison of the Different Illumination Invariants

To illustrate the effectiveness of the different methods, Figure 3 shows some original images from the Yale B database and their corresponding illumination invariants. As can be seen from the images, the GF method removes most effects of the illumination variation and greatly reduces the intraclass difference.

3.2. Experimental Results for the Yale B

The Yale B face database contains 10 different persons; each person has 9 poses, and each pose is captured under 64 different illumination conditions. In our experiments, the frontal images are used. Based on the angle of the light source direction, these images are classified into five subsets: subset 1 (0°–12°), subset 2 (13°–25°), subset 3 (26°–50°), subset 4 (51°–77°), and subset 5 (the rest) [39]. All images are cropped and rescaled to the same size with strict alignment. Images of one person from each of the five subsets (one subset per row) are shown in Figure 4, together with their illumination invariants extracted by the GF method.

Firstly, we select subset 1 as the training set and the others as the testing sets. As can be seen from Figures 5 and 6, the proposed method (GF) outperforms MSR, WD, and LTV, and it obtains outstanding results similar to those of Gradientfaces and MPCD. The average Top 1 recognition rate is nearly 99%.

Secondly, subset 4 is selected as the training set, and the others are used as testing sets. Figures 7 and 8 show the recognition rates. It is clearly seen that the performance of GF is far better than that of the other methods; GF achieves a 100% recognition rate on each testing subset.

Thirdly, we randomly choose 10 images of each person as the training set, and the others are used for testing. To obtain a credible result, the recognition rate is averaged over 50 random splits. The experimental results are presented in Figures 9 and 10. It can be observed that the recognition rate of GF is higher than that of the other methods and quite similar to those of Gradientfaces and MPCD, reaching a 100% recognition rate on every testing subset except subset 2.

These experiments use different training sets, and the proposed method obtains excellent results under all conditions, which demonstrates its robustness to illumination.

3.3. Experimental Results for the CMU PIE

The CMU PIE face database [37] contains images of 68 persons with various poses, illuminations, and expressions. In our experiments, the illumination subset (C27) is used; C27 has 21 different illuminations for each person. All images used in our experiments are cropped and rescaled to the same size. The 21 differently illuminated images of one person in CMU PIE and their illumination invariants extracted by GF are shown in Figure 11.

The experiments on CMU PIE are divided into two parts. In the first part, the first 3, 4, and 5 images of each person are selected as the training set and the others as the testing set, respectively; Table 1 shows the corresponding recognition results. In the second part, we randomly choose 3, 4, and 5 images of each person as the training set and use the others for testing, respectively; to obtain a credible result, the recognition rate is averaged over 50 random splits, and the results are tabulated in Table 2. From Tables 1 and 2, it can be seen that the proposed method outperforms all other methods and consistently achieves a high recognition rate, which strongly demonstrates its efficacy under varying illumination.

Run time is also critical in real applications. To evaluate the computational complexity of each method, the run time each method needs to process one face image is presented in Table 3. The hardware platform is a 2.6 GHz Pentium 4 with 2 GB of memory. Table 3 shows that the proposed method needs only 37 ms to process a face image, so it can process face images in real time and is thus able to handle large face databases. MSR, LTV, and MPCD are slower than our method.

4. Conclusion

In this paper, we propose an efficient Gabor phase based illumination invariant extraction method. We first normalize face images using a homomorphic filtering based preprocessing method to pre-eliminate the effects of illumination changes. Then, a set of 2D real Gabor wavelets with different orientations is used for image transformation, and the multiple Gabor coefficients are combined into one whole, considering both spectrum and phase. Lastly, the illumination invariant is obtained by extracting the phase feature from the combined coefficients. The proposed method needs neither 3D face shape information nor a bootstrap set for training, and the extracted illumination invariant retains the essential discriminant information while greatly reducing the effect of illumination changes. Experimental results show its effectiveness and robustness under different illumination variations.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant 17KJB520021, the Jiangsu Government Scholarship for Overseas Studies, the Training Projects of Undergraduate Practice Innovation funded by Nanjing University of Information Science and Technology, and a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.