Abstract

At present, face recognition methods based on the deep belief network (DBN) have the advantages of automatically learning abstract information from face images and being only slightly affected by active factors, so they have become mainstream in the face recognition area. However, because the DBN ignores the local information of face images, the recognition rate of DBN-based methods suffers. To solve this problem, a face recognition method based on the center-symmetric local binary pattern (CS-LBP) and the DBN (FRMCD) is proposed in this paper. Firstly, the face image is divided into several subblocks. Secondly, CS-LBP is used to extract texture features from each image subblock. Thirdly, texture feature histograms are formed and input into the DBN visible layer. Finally, face classification and recognition are completed through deep learning in the DBN. Through experiments on the ORL, Extend Yale B, and CMU-PIE face databases, the best partitioning of the face image and the best number of hidden units per DBN hidden layer are obtained for the proposed method (FRMCD). Then, comparative experiments between FRMCD and traditional methods are performed. The results show that the recognition rate of FRMCD is superior to those of traditional methods; the highest recognition rate reaches 98.82%. When the number of training samples is small, the advantage of FRMCD is even more significant. Compared with the method based on the local binary pattern (LBP) and the DBN, FRMCD is also less time-consuming.

1. Introduction

Face recognition is a biometric method for automatic identification based on facial features. Since the 1990s, face recognition has gradually become a research focus in the fields of pattern recognition and artificial intelligence [1]. According to whether feature extraction and recognition depend on manual rules, face recognition methods are divided into methods based on statistical features (FRMSF) and methods based on machine learning (FRMML) [2]. FRMML includes shallow machine learning face recognition methods (SMLFRM) and deep learning face recognition methods (DLFRM). FRMSF relies on manual feature selection, which can be disturbed by too many active factors, so this class of methods has a low recognition rate [3, 4]. Because SMLFRM suffers from complex theoretical analysis and high optimization complexity, its recognition rate is also low when applied to multiclass classification problems [5, 6].

Recently, the DLFRM has become the main method in the face recognition field, because it learns the essential features of face images and reduces the difficulty of sample training. By learning a deep nonlinear network structure, deep learning achieves complex function approximation, exploits large amounts of high-dimensional training data, and abstracts facial image features to improve the classification accuracy [7]. On this basis, the backpropagation (BP) neural network, with its generalization ability, self-adaptive ability, and nonlinear mapping ability, is applied to face recognition by Gan [8]. Radu proposes an improved BP-based face recognition method to enhance the recognition effect [9], in which an image binarization step is added to effectively obtain the face position and size. However, the BP neural network has randomly initialized weight parameters, which cause it to be easily trapped in local minima, so the face recognition rate of the BP neural network is not obviously improved. To resolve this problem of BP, the DBN deep learning algorithm was proposed by Hinton in 2006 [10]. In the DBN, the contrastive divergence algorithm is applied to effectively train the Restricted Boltzmann Machine (RBM) and obtain initial weights instead of the random initialization used by BP, and the wake-sleep algorithm is applied to fine-tune the whole network. Therefore, the DBN solves the problem that BP easily falls into local minima. At present, the DBN has already been applied in pattern recognition. Partha uses the high self-adaptability and the excellent cross-intersection feature of the DBN to determine the fuzzy matrix [11].
In sound event recognition, in order to solve the problem of unclear semantics extracted by a poor waveform generation mode, Wang proposes combining Gabor-filtered signals with the DBN deep processing mechanism [12]. In face recognition, the DBN has the advantages of automatically learning abstract features at different levels and obtaining nonlinear descriptions of face features, but it cannot extract the local texture features of face images, which reduces the recognition rate. To solve this problem, the face recognition method based on LBP and DBN (FRMLD) is proposed by Liang [13]. LBP is a local texture feature operator that extracts local micropatterns and distributions of bright spots, dark spots, and edges [14]. The face features extracted by LBP are input into the DBN, which enables the DBN to learn local texture features and achieve a high recognition rate. Because LBP is invariant to illumination and rotation [15], the FRMLD also suppresses the effects of illumination and rotation to a certain degree. However, the local texture features extracted by LBP are sparse and high-dimensional with poor antinoise ability, which leads to a large amount of computation and a long computation time, so the global optimum of the network is not easily reached.

To solve the FRMLD problems of high computational dimension and poor antinoise ability, the FRMCD is proposed in this paper. The CS-LBP operator not only retains the advantages of the LBP operator but also encodes the luminance difference between center-symmetric points; thus the CS-LBP operator increases the local gradient information, shortens the encoding length, reduces the dimension, and enhances the antinoise ability [16]. Therefore, the face features extracted by CS-LBP have strong discriminating ability. The CS-LBP texture feature vectors are used as the DBN input so that the DBN can obtain the local information of face images, which diversifies the information available for face recognition. Meanwhile, for a certain range of light intensity and face rotation angle changes, the FRMCD also achieves a good recognition effect on face images.

2. Face Recognition Based on CS-LBP and DBN

The face recognition based on the CS-LBP statistical method can extract the texture features of face images but easily ignores the position information of feature images, which results in the loss of face recognition information [17]. The DBN can extract the abstract features of face images but cannot extract their local features; for this reason, the network cannot learn bright spots, dark spots, and other local micropatterns in face images. In view of the reasons above, the FRMCD is proposed.

2.1. Center-Symmetric Local Binary Patterns (CS-LBP)

The CS-LBP, based on the center-symmetric idea, can extract local texture features. The texture features of face images do not change much when the light, the posture, and the facial expression change [18–20]; therefore, using the CS-LBP operator to extract local face texture features is effective.

The CS-LBP operator compares n_i (i = 0, 1, 2, 3) with n_{i+4} (n_i and n_{i+4} are center-symmetric pixels). When n_i is larger, the pair is labelled 1; otherwise it is labelled 0. The resulting binary string is converted to the decimal CS-LBP value [21–23]. The calculation process of CS-LBP features is shown in Figure 1:

CS-LBP_N(x_c, y_c) = Σ_{i=0}^{N/2−1} s(n_i − n_{i+N/2}) 2^i   (1)

s(x) = 1 if x > 0, and s(x) = 0 otherwise   (2)

In (1), (x_c, y_c) is the center pixel, N is the number of neighbourhood pixels around the center pixel, and s is the sign function shown in (2). n_i is the gray value of the i-th neighbourhood pixel around the central pixel.
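The coding rule of (1) and (2) can be sketched for a single 3 × 3 neighbourhood as below; this is a minimal illustration, where the clockwise neighbour ordering starting from the top-left corner and the zero threshold are assumptions made for the sketch, not details taken from the paper.

```python
import numpy as np

def cs_lbp_code(patch, threshold=0.0):
    """CS-LBP code of one 3x3 neighbourhood: each of the N/2 = 4
    center-symmetric neighbour pairs contributes one bit, weighted 2^i."""
    # the 8 neighbours of the centre pixel, clockwise from the top-left
    n = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
         patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for i in range(4):                    # compare n_i with n_{i+4}
        if n[i] - n[i + 4] > threshold:   # s(n_i - n_{i+4}) in eq. (2)
            code += 2 ** i                # weight the bit by 2^i, eq. (1)
    return code                           # value in [0, 15]: 16 possible codes

patch = np.array([[90, 30, 30],
                  [30, 50, 30],
                  [30, 30, 30]])
print(cs_lbp_code(patch))  # only the (n_0, n_4) pair differs -> code 1
```

Note that the centre pixel value never enters the comparison; only the four symmetric neighbour differences do, which is what keeps the code length at N/2 bits.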

Compared with the LBP operator in face recognition, the advantages of the CS-LBP operator are as follows.

The CS-LBP feature dimension is low. As shown in (1), when a histogram is used to describe local texture features, the CS-LBP histogram dimension 2^{N/2} is lower than the LBP histogram dimension 2^N; with N = 8 neighbours this is 16 bins versus 256 bins. This large decrease in dimension greatly reduces the computational complexity.

The LBP operator is calculated by comparing the gray values of the eight neighbouring pixels with the center pixel, while the CS-LBP operator is calculated only by comparing n_i with n_{i+4}, which are symmetric about the center pixel. Thus, the CS-LBP operator is a compact description operator that captures gradient information and enriches the useful information.

The CS-LBP face texture features have strong antinoise ability. The calculation schematics of the CS-LBP operator and the LBP operator are shown in Figure 2(a). Suppose the red pixel value is larger than the yellow one; then the feature value extracted by CS-LBP is 1000 and that extracted by LBP is 00000000. Possible changes of the pixel values caused by noise, such as a slight shake of the camera, are shown in Figures 2(b) and 2(c). In these two figures, the feature values extracted by the CS-LBP operator are 1000 in both cases, while the feature values extracted by the LBP operator are 00000000 and 00111000, respectively. The CS-LBP feature values are the same in Figures 2(a), 2(b), and 2(c), while the LBP feature values differ, so CS-LBP has stronger antinoise ability than LBP.
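The effect illustrated in Figure 2 can be reproduced numerically. In the sketch below the patch values are illustrative, not taken from the figure: a single grey level of noise on the centre pixel flips every LBP bit, while the CS-LBP code, which never references the centre, is unchanged.

```python
import numpy as np

def lbp_code(patch):
    """Classic LBP: threshold the 8 neighbours against the centre pixel."""
    c = patch[1, 1]
    n = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
         patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum(2 ** i for i, g in enumerate(n) if g >= c)

def cs_lbp_code(patch):
    """CS-LBP: compare center-symmetric neighbour pairs (centre unused)."""
    n = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
         patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum(2 ** i for i in range(4) if n[i] - n[i + 4] > 0)

flat = np.full((3, 3), 100)
noisy = flat.copy()
noisy[1, 1] += 1          # one grey level of noise on the centre pixel

print(lbp_code(flat), lbp_code(noisy))        # 255 0: LBP code flips completely
print(cs_lbp_code(flat), cs_lbp_code(noisy))  # 0 0: CS-LBP code is unchanged
```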

2.2. Deep Belief Network (DBN)

The DBN is composed of multiple layers of RBMs with fast learning ability [24, 25]. For a DBN with n hidden layers, the joint distribution of its visible units and hidden units is

P(v, h^1, h^2, ..., h^n) = P(v | h^1) P(h^1 | h^2) ... P(h^(n−2) | h^(n−1)) P(h^(n−1), h^n)   (3)

In (3), v is the vector of DBN visible units and h^k (k = 1, 2, ..., n) is the vector of hidden units on the k-th hidden layer. The distribution between the k-th layer and the (k+1)-th layer is

P(h^k | h^(k+1)) = ∏_j P(h_j^k | h^(k+1))   (4)

P(h_j^k = 1 | h^(k+1)) = sigm(b_j^k + Σ_i W_ij^(k+1) h_i^(k+1))   (5)

In (4) and (5), b^k is the k-th layer bias and W^(k+1) is the weight matrix between the k-th layer and the (k+1)-th layer. In the DBN network, each pair of adjacent layers is regarded as an RBM. When the input is v, the hidden layer h is obtained through P(h | v); when the input is h, the visible layer v is reobtained through P(v | h). This is the deep learning process.

The DBN structure model is shown in Figure 3. The DBN network structure is inspired by the organization of human brain tissue. The DBN extracts input data features from simple to complex and from low level to high level. Standard category labels of the input data are obtained by the Soft-Max classifier on the DBN top layer [26, 27]. The DBN does not rely on manual feature selection; it learns from the input data actively and automatically digs out the rich information hidden in the known data [28].

2.3. The Face Recognition Method Based on CS-LBP and DBN (FRMCD)

The FRMCD is proposed in this paper, and its schematic diagram is shown in Figure 4. The face image is divided into several subblocks. The texture features of each subblock extracted by CS-LBP are input into the DBN, which uses them to complete feature learning and face recognition. The FRMCD is divided into two parts: face image training and face image testing.

The specific steps of the FRMCD are as follows:

The database of face images is divided into a training set and a testing set; each image in both sets is divided into subblocks.

According to the coding rule of (1), the CS-LBP texture features extracted from each subblock are denoted by d.

A statistical histogram of the CS-LBP texture features is established; each subblock is described by one histogram. The histogram of the k-th subblock is represented as

H_k = (h_k(0), h_k(1), ..., h_k(t − 1))   (6)

In (6), h_k(r) is the frequency of the CS-LBP value r in the k-th subblock, r = 0, 1, ..., t − 1, and t is the number of histogram bins, t = 2^{N/2} = 16.

The CS-LBP feature histograms of the subblocks are concatenated in order to form one feature vector, which is the CS-LBP feature vector of the entire image: V = (H_1, H_2, ..., H_K), where K is the number of subblocks.
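The two steps above, per-block histograms followed by concatenation, can be sketched as follows. The 4 × 4 partition, the per-block normalisation, and the random toy code image are illustrative choices for the sketch, not prescriptions from the paper.

```python
import numpy as np

def block_feature_vector(codes, blocks=(4, 4), bins=16):
    """Split a CS-LBP code image into subblocks, build a 16-bin histogram
    per block (CS-LBP codes lie in 0..15), and concatenate them in order."""
    feats = []
    for rows in np.array_split(codes, blocks[0], axis=0):
        for blk in np.array_split(rows, blocks[1], axis=1):
            hist, _ = np.histogram(blk, bins=bins, range=(0, bins))
            feats.append(hist / blk.size)   # per-block normalisation
    return np.concatenate(feats)            # length blocks[0]*blocks[1]*bins

codes = np.random.randint(0, 16, size=(112, 92))  # toy code image (ORL size)
V = block_feature_vector(codes)
print(V.shape)  # (256,): 4 x 4 blocks x 16 bins each
```

A 256-dimensional vector per image is what the DBN visible layer then receives, far smaller than the 4 × 4 × 256 = 4096 dimensions the same partition would produce with plain LBP histograms.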

The texture feature vector V obtained in the steps above is input into the DBN visible layer. The joint distribution of the DBN visible units and hidden units is

P(V, h^1, ..., h^n) = P(V | h^1) P(h^1 | h^2) ... P(h^(n−1), h^n)   (7)

In (7), h^1, ..., h^n are higher-level features that are learned layer by layer. The number of hidden layers is set to 2 in this paper. According to (7), the joint distribution of the visible layer and the two hidden layers is

P(V, h^1, h^2) = P(V | h^1) P(h^1, h^2)   (8)

In (8), V is the visible layer, h^1 is the first hidden layer, and h^2 is the second hidden layer. According to (5), the activation probability of hidden units on the first hidden layer is

P(h_j^1 = 1 | V) = sigm(b_j + Σ_{i=1}^{s} w_ij v_i)   (9)

In (9), v_i is a visible unit, s is the number of visible units, h_j^1 is a hidden unit, sigm is the activation function, and w_ij is the weight of the connection between the i-th visible unit and the j-th hidden unit. The first hidden layer is regarded as the visible layer of the second hidden layer, and the activation probability of hidden units on the second hidden layer is calculated from (9) in the same way.
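Applying the activation rule of (9) layer by layer gives the forward pass through the stacked network. In the sketch below, the 256-dimensional input and the two 200-unit hidden layers mirror the configuration used in the experiments, but the random weights and the deterministic (probability-valued) propagation are placeholders for illustration.

```python
import numpy as np

def sigm(x):
    return 1.0 / (1.0 + np.exp(-x))

def dbn_forward(v, layers):
    """Propagate a visible vector through stacked layers: the activation
    probability of hidden unit j is sigm(b_j + sum_i w_ij v_i), eq. (9),
    and each hidden layer serves as the visible layer of the next."""
    h = v
    for W, b in layers:
        h = sigm(h @ W + b)
    return h

rng = np.random.default_rng(0)
v = rng.random(256)                                  # CS-LBP feature vector
layers = [(rng.normal(0, 0.01, (256, 200)), np.zeros(200)),
          (rng.normal(0, 0.01, (200, 200)), np.zeros(200))]
h2 = dbn_forward(v, layers)
print(h2.shape)  # (200,): top-layer features fed to the Soft-Max classifier
```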

The DBN iterative algorithm is used to optimize the weights and obtain the optimal training network; the iteration number is m. The optimal network is judged by the criterion that the likelihood of the training set is maximized:

W* = arg max_W Σ_t log P(V_t; W)   (10)

In (10), W is the weight matrix and V_t is the CS-LBP texture feature vector of the t-th training sample. The iteration number m is set to 3000. After tuning, the learning rate is set to 0.001.

From the step above, the optimal network is obtained. The category labels of the testing samples are then obtained by the classifier at the top of the DBN network.

The FRMCD pseudocode is shown in Algorithm 1.

input: image matrix A; image sample number p; block number n
compute sub-block size (row/n, col/n)
for all image samples q do
  for all sub-blocks k do
    compute CS-LBP codes d_k; compute histogram H_k (for t = 0, ..., 15, h_k(t) = frequency of code t)
  end for
  X_q = [H_1, H_2, H_3, ..., H_(n×n)]
end for
V = [X_1, X_2, X_3, ..., X_p]
DBN
input: sample CS-LBP vectors V; hidden unit number m
for all visible units j do
  sample v_0j from V
end for
for all hidden units i do
  compute Q(h_0i = 1 | v_0) (for binomial units, sigm(b_i + Σ_j w_ij v_0j))
  sample h_0i ∈ {0, 1} from Q(h_0i = 1 | v_0)
end for
for all visible units j do
  compute P(v_1j = 1 | h_0) (for binomial units, sigm(c_j + Σ_i w_ij h_0i))
  sample v_1j ∈ {0, 1} from P(v_1j = 1 | h_0)
end for
for all hidden units i do
  compute Q(h_1i = 1 | v_1) (for binomial units, sigm(b_i + Σ_j w_ij v_1j))
  sample h_1i ∈ {0, 1} from Q(h_1i = 1 | v_1)
end for
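The Gibbs-sampling loops in the DBN half of Algorithm 1 correspond to one step of contrastive divergence (CD-1) for a single RBM. A minimal NumPy sketch follows; the 0.001 learning rate matches the paper, while the variable names, the binary visible units, and the fixed random seed are illustrative assumptions.

```python
import numpy as np

def sigm(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr=0.001, rng=None):
    """One CD-1 update for a binary RBM: sample h0 from v0, reconstruct
    v1 from h0, compute h1 probabilities from v1, then move the weights
    toward the data statistics and away from the reconstruction's."""
    rng = rng or np.random.default_rng(0)
    q_h0 = sigm(v0 @ W + b_hid)                  # Q(h0 = 1 | v0)
    h0 = (rng.random(q_h0.shape) < q_h0) * 1.0   # sample hidden states
    p_v1 = sigm(h0 @ W.T + b_vis)                # P(v1 = 1 | h0)
    v1 = (rng.random(p_v1.shape) < p_v1) * 1.0   # sample reconstruction
    q_h1 = sigm(v1 @ W + b_hid)                  # Q(h1 = 1 | v1)
    # positive phase minus negative phase, scaled by the learning rate
    W += lr * (np.outer(v0, q_h0) - np.outer(v1, q_h1))
    b_vis += lr * (v0 - v1)
    b_hid += lr * (q_h0 - q_h1)
    return W, b_vis, b_hid
```

Repeating such steps over the training histograms (m = 3000 iterations in the paper) yields the pretrained weights that initialise the network, replacing the random initialisation of a plain BP network.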

3. Results and Discussion

In order to verify the performance of the proposed approach, the ORL, Extend Yale B, and CMU-PIE face databases were chosen. As shown in Figure 5, the ORL face database, created by the Cambridge AT&T Labs in the UK, contains 400 face images from 40 persons, with 10 images per person under different gestures and varying facial expressions; each image is 112 × 92 pixels. As shown in Figure 6, the Extend Yale B face database, created by the Computational Vision and Control Center at Yale, is one of the most widely used databases and is usually used to study lighting effects in face recognition. It contains 2432 frontal face images of 38 volunteers under different lighting conditions; each image is 192 × 168 pixels. As shown in Figure 7, the CMU-PIE face database was collected under tightly controlled conditions by Carnegie Mellon and contains 41,268 face images of 68 people with different local changes including facial expression, gesture, lighting, and background; each image is 64 × 64 pixels.

3.1. Experimental Study on Different Partitioning Ways

The first step of FRMCD is to divide the face image before extracting CS-LBP texture features. If the number of subblocks is too small, the local feature information extracted by CS-LBP is too little to recognize face images accurately [29, 30]. If the number of subblocks is excessive, the large amount of feature information extracted by CS-LBP makes the training samples input into the DBN network too sparse, degrading the classifier performance and decreasing the recognition rate [31, 32]. Therefore, a best partitioning of the face image must exist that yields the highest FRMCD recognition rate on ORL, Extend Yale B, and CMU-PIE.

On ORL, the number of hidden units per DBN hidden layer is 200, and the partitioning ways of each image are 2 × 2, 2 × 4, 4 × 2, 4 × 4, and 8 × 4. The face recognition experiments are performed 20 times under each partitioning way; 7 face images and 3 face images of each person are randomly selected as the training set and the testing set, respectively. The average and variance of the recognition rate under different partitioning ways are shown in Figure 8.

On Extend Yale B, the number of hidden units per DBN hidden layer is 200, and the partitioning ways of each image are 2 × 2, 2 × 4, 4 × 2, 4 × 4, 8 × 4, and 8 × 8. The face recognition experiments are performed 20 times under each partitioning way, in which 20 face images and 5 face images of each person are randomly selected as the training set and the testing set, respectively. The average and variance of the recognition rate under different partitioning ways are shown in Figure 9.

On CMU-PIE, the number of hidden units per DBN hidden layer is 250, and the partitioning ways of each image are 2 × 2, 4 × 2, 4 × 4, 8 × 4, 8 × 8, 16 × 8, and 16 × 16. The face recognition experiments are performed 20 times under each partitioning way, in which 20 face images and 5 face images of each person are randomly selected as the training set and the testing set, respectively. The average and variance of the recognition rate under different partitioning ways are shown in Figure 10.

As shown in Figures 8, 9, and 10, the partitioning way of the face image has an obvious influence on the FRMCD recognition rate. As the number of partitions increases, the FRMCD recognition rate first increases and then decreases. This is consistent with the theoretical analysis that there is a partitioning way achieving the highest FRMCD recognition rate. On ORL, the recognition rate reaches 98.08% and the best scanning subwindow is 28 × 23 when the image is divided into 4 × 4 subblocks. On Extend Yale B, the recognition rate reaches 98.82% and the best scanning subwindow is 24 × 42 when the image is divided into 8 × 4 subblocks. On CMU-PIE, the recognition rate reaches 93.79% and the best scanning subwindow is 16 × 16 when the image is divided into 4 × 4 subblocks.

3.2. Experimental Study on Different Hidden Units

Too many hidden units in the DBN hidden layer lead to overfitting, which decreases the recognition rate [33]. Too few hidden units lead to too few extracted features, which results in a low FRMCD recognition rate. Therefore, to achieve the highest FRMCD recognition rate, the best number of hidden units must be found. The following experiments are performed to find the best hidden unit number for the FRMCD.

The experimental selections of the training set and testing set on ORL, Extend Yale B, and CMU-PIE are exactly the same as the experimental selections above. The FRMCD experiments are performed with the best partitioning ways (4 × 4 on ORL, 8 × 4 on Extend Yale B, and 4 × 4 on CMU-PIE). The hidden unit number is set to 100, 150, 200, 250, and 300, respectively, with both hidden layers having the same number of units. The face recognition experiments are performed 20 times under each hidden unit number. The average and variance of the recognition rate under different hidden unit numbers are shown in Figures 11, 12, and 13.

As shown in Figures 11, 12, and 13, the number of hidden units has obvious influence on the FRMCD recognition rate. With the hidden unit number increasing, the recognition rate firstly increases and then decreases, which is consistent with the theoretical analysis; i.e., there is a hidden unit number to achieve the highest recognition rate. On ORL, the best FRMCD recognition rate is 98.08% when the number of the hidden units is 200. On Extend Yale B, the best FRMCD recognition rate is 98.82% when the number of the hidden units is 200. On CMU-PIE, the best FRMCD recognition rate is 93.79% when the number of the hidden units is 250.

3.3. Experimental Study on Recognition Rate between Different Face Recognition Methods

The FRMCD recognition rate is compared with those of the methods based on LBP and DBN, DBN alone, LBP and SVM, and PCA and SVM. The experimental selections of training sets and testing sets on ORL, Extend Yale B, and CMU-PIE are exactly the same as in the previous experiments. The experiments on FRMCD and FRMLD use the best partitioning way and the best hidden unit numbers. For the methods that do not use LBP or CS-LBP features, the entire face images are input directly. The face recognition experiments are performed 20 times for each face recognition method, and the average and variance of the recognition rate are shown in Table 1.

As shown in Table 1, on the three databases ORL, Extend Yale B, and CMU-PIE, the FRMCD not only achieves the highest recognition rate but also generalizes across the face databases. The FRMCD recognition rate reaches 98.08% on ORL, 98.82% on Extend Yale B, and 93.79% on CMU-PIE. The recognition rate of FRMCD is also more stable than those of the other methods.

3.4. Comparative Experiment of FRMCD and FRMLD
3.4.1. Comparison Experiment of Recognition Rate

In the process of face recognition, the fewer the training samples are, the lower the recognition rate is. To verify which method achieves the superior recognition rate when training samples are insufficient, the two methods with the higher recognition rates in the above experiments, FRMCD and FRMLD, are compared. The comparative experiments of the two methods are performed under the best partitioning way and the best hidden unit number.

The comparative experiments are performed on the three face databases. On ORL, 3, 5, and 7 images per person are randomly selected as the training set and the remaining images form the testing set; on Extend Yale B, 3, 5, 10, 15, and 20 images per person are randomly selected as the training set and 5 images per person form the testing set; on CMU-PIE, 5, 7, 10, 15, and 20 images per person are randomly selected as the training set and 5 images per person form the testing set. The face recognition experiments are performed 20 times under each training sample number. Tables 2, 3, and 4, respectively, show the average and variance of the recognition rate under different training sample numbers. In these tables, TSN denotes the number of training samples and UPR represents the improvement in recognition rate.

As shown in Tables 2, 3, and 4, the recognition rate increases as the number of training samples increases, while the UPR of FRMCD over FRMLD increases as the number of training samples decreases. The experimental results show that, as the training samples become fewer, the advantage of FRMCD over FRMLD becomes more obvious. Therefore, FRMCD is effective when the training samples are insufficient.

3.4.2. Comparison Experiment of Time-Consuming

For the time-consuming experiments, MATLAB 2014b is used as the platform on a 64-bit operating system with an Intel Core i5-6200 CPU and 4 GB of memory. On ORL, 7 images per person are randomly selected as the training set and the remaining images form the testing set; on Extend Yale B, 20 images per person are randomly selected as the training set and 5 images per person form the testing set; on CMU-PIE, 20 images per person are randomly selected as the training set and 5 images per person form the testing set. The time-consuming experiments are performed 20 times; the average and variance of the time consumption of FRMCD and FRMLD are shown in Table 5. FD is the face database, and PTR denotes the percentage of time reduction.

As shown in Table 5, the time consumption of FRMCD is significantly lower than that of FRMLD, because the dimension of the features extracted by CS-LBP is smaller than that of LBP, as explained above.

Compared with the average time consumption of FRMLD, that of FRMCD is reduced by 48.29%, 53.93%, and 50.89% on ORL, Extend Yale B, and CMU-PIE, respectively. The time consumption of FRMCD is also more stable than that of FRMLD.

4. Conclusions

In this paper, the FRMCD is proposed. The method applies the CS-LBP operator to extract texture features, which are then learned by the DBN network to complete classification and recognition. The FRMCD enables the DBN to obtain the local features of face images, so the useful information input into the DBN is more abundant, ensuring a high recognition rate.

On the ORL, Extend Yale B, and CMU-PIE face databases, a number of face recognition experiments are performed to verify the FRMCD recognition rate. Firstly, the best face image partitioning way and the best DBN hidden unit number for the FRMCD are obtained from experiments. Secondly, under the best partitioning way and the best hidden unit number, the recognition rates of different face recognition methods are compared. Finally, the time-consuming comparison between FRMCD and FRMLD and the recognition comparison under different training sample numbers are performed. As the experimental results show, the FRMCD recognition rate is superior to those of other traditional face recognition methods, and FRMCD is robust on face databases with different lighting, gestures, and expressions. Compared with the FRMLD, the FRMCD achieves a good recognition effect when there are enough training samples and an even more significant advantage when there are fewer training samples. Meanwhile, the FRMCD has low computational complexity and short running time.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors acknowledge the support from the National Natural Science Foundation of China under Grant no. 61671190 and University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province under Grant no. 2015043.