Abstract

Pulse coupled neural networks (PCNN) have been widely used in image processing. The 3D binary map series (BMS) generated by a PCNN effectively describes image feature information such as edges and regional distribution, so the BMS can serve as the basis for extracting a 1D oscillation time series (OTS) for an image. However, traditional methods using the BMS consider neither the correlation between the binary maps in the series nor the spatial structure of each map. By further processing the BMS, a novel facial feature extraction method is proposed. First, to account for the correlation among the maps in the BMS, a method is put forward to transform the BMS into a frequency map series (FMS), which lessens the influence of noncontinuous feature regions in the binary images on the OTS-BMS. Then, by computing the 2D entropy of every map in the FMS, the 3D FMS is transformed into a 1D OTS (OTS-FMS), which has good geometric invariance for the facial image and contains the spatial structure information of the image. Finally, based on an analysis of the OTS-FMS, the standardized Euclidean distance is used to measure distances between OTS-FMS features. Experimental results verify the effectiveness of the OTS-FMS in facial recognition, where it shows better recognition performance than other feature extraction methods.

1. Introduction

Face recognition is an important research field in pattern recognition, with promising applications in biometric recognition, security systems, and so on. Sensors now produce large numbers of face images, so good algorithms are needed to process them. Because the face data space suffers from the curse of dimensionality, facial feature extraction methods that also reduce the spatial dimension have become a key technology of face recognition. Over the past several decades, researchers have proposed many methods to extract facial features, such as geometric characteristics, subspace analysis [1–5], and neural network methods [6].

The geometric characteristics method [7] uses computed geometric parameters as the face features; it adapts well to illumination changes but poorly to pronounced changes in facial expression, posture, rotation, and so forth. At present, subspace analysis is a popular approach to face recognition; it employs a linear or nonlinear transform so that the data in the projected space exhibit an explicit feature pattern from which the key features are extracted. Examples are PCA [1], LDA [2], and ICA [3], based on linear transformation, and KPCA [4], KFDA [5], and KICA, based on nonlinear transformation. However, these methods adapt poorly to rotation and distortion of the image. Neural network methods rely on the nonlinear transform ability of the network structure: they learn a nonlinear transforming space of the data from the training samples and then obtain the facial features according to that space, for instance, using an SOM neural network or a fuzzy RBF neural network. However, because they learn from the samples under the empirical risk minimization principle, neural network methods are prone to overfitting. Besides, both the subspace analysis methods and the traditional neural network methods need face samples to learn from; if the training samples change, the projection transformation space of the data must also change. The amount of calculation in large-scale face feature extraction is therefore too large, which limits their application where real-time demands are high.

In 1993, Johnson and Ritter proposed the pulse coupled neural network (PCNN) [8] based on Eckhorn's research on the cat visual cortex. It has been widely used in image segmentation [9, 10], image fusion [11–13], image retrieval [14], and so forth. In a PCNN, groups of similar neurons issue synchronous pulses under the effect of mutual pulse coupling. This pulse information constitutes a 3D binary map series (BMS), which effectively describes the edge and regional distribution information of the image. However, the BMS is large and cannot be used directly as image features; for this reason, Johnson [15] transformed the BMS into a 1D oscillation time sequence (OTS) by calculating the area of each binary image, and the OTS has good invariance to geometric transformations such as rotation, translation, and zoom. Based on the pulse mechanism of the PCNN neuron, the pulses have been divided into capture pulses and self-excitation pulses, and the OTS correspondingly into C-OTS and S-OTS, which serve as facial features in face recognition [16]. Reference [14] extracted a 1D OTS from the BMS by calculating the normalized rotational inertia (NRI) of each binary image and applied it to image retrieval. However, these BMS-based OTS feature extraction methods do not fully consider the correlation between the binary images in the BMS, so discontinuous feature regions degrade the pattern classification capability of the OTS features. In addition, Johnson's OTS is a global statistic of each binary image; it does not consider the spatial structure of the image, although spatial structure information often plays an important role in pattern classification.

In view of the above analysis, this paper proposes a novel OTS-based face feature extraction method built on the BMS output by the PCNN. Unlike the traditional subspace analysis and neural network methods, its results do not change when the sample space changes.

2. Pulse Coupled Neural Network

The PCNN model consists of the receptive field, the modulation field, and the pulse generator. In the receptive field, the neuron receives the coupling pulse input from neighboring neurons and the external stimulus input through two channels, the feeding channel $F$ and the linking channel $L$, which are described by (1). In the $F$ and $L$ channels, the neuron links with its neighborhood neurons via the synaptic linking weights $M$ and $W$, respectively; the two channels accumulate input and decay exponentially at the same time; the decay exponentials of the $F$ and $L$ channels are $\alpha_F$ and $\alpha_L$, while the channel amplitudes are $V_F$ and $V_L$:

In the modulation field, a constant positive bias is added to the linking input, and the result is multiplied by the feeding input; the bias is unity, $\beta$ is the linking strength, and the total internal activity $U$ is the result of the modulation, which is described by

The pulse generator consists of a threshold adjuster, a comparator, and a pulse former. Its function is to generate the pulse output $Y$ and to adjust the threshold value $\theta$; $V_\theta$ is the threshold range coefficient, which is described by (3). When the internal state $U$ is larger than the threshold $\theta$, the neuron generates a pulse, which is described by

In the above equations, the subscripts $i$ and $j$ denote the neuron location in a PCNN and $n$ denotes the current iteration (discrete time step), where $n$ varies from 1 to $N$ ($N$ is the total number of iterations).
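For reference, the discrete PCNN model is commonly written as below. This is a sketch in the conventional notation of the PCNN literature, assumed here to correspond, in order, to the receptive field (1), the modulation field (2), the threshold adjustment (3), and the pulse output (4). The symbol names ($S_{ij}$ for the external stimulus, $M$ and $W$ for the weights, $\alpha_\theta$ for the threshold decay) are the conventional ones and are assumptions; details such as whether the threshold update uses $Y_{ij}[n-1]$ or $Y_{ij}[n]$ vary between formulations.

```latex
\begin{align*}
F_{ij}[n] &= e^{-\alpha_F} F_{ij}[n-1] + V_F \sum_{k,l} M_{ijkl}\, Y_{kl}[n-1] + S_{ij}, \\
L_{ij}[n] &= e^{-\alpha_L} L_{ij}[n-1] + V_L \sum_{k,l} W_{ijkl}\, Y_{kl}[n-1], \\
U_{ij}[n] &= F_{ij}[n]\,\bigl(1 + \beta L_{ij}[n]\bigr), \\
\theta_{ij}[n] &= e^{-\alpha_\theta}\, \theta_{ij}[n-1] + V_\theta\, Y_{ij}[n-1], \\
Y_{ij}[n] &= \begin{cases} 1, & U_{ij}[n] > \theta_{ij}[n], \\ 0, & \text{otherwise}. \end{cases}
\end{align*}
```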

The PCNN used for image processing is a single-layer two-dimensional array of laterally linked pulse coupled neurons, as shown in Figure 1, and all neurons are identical. The number of neurons in the network is equal to the number of pixels in the input image. There is a one-to-one correspondence between the image pixels and the network neurons, and the gray value of a pixel is taken as the external input stimulus of the corresponding neuron in the $F$ channel. The output of each neuron has two states, namely, pulse (1 state) and nonpulse (0 state), so the output states of the neurons comprise a binary map.
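The iteration described in this section can be sketched in a few lines of NumPy. This is a minimal illustration of a conventional PCNN, not the authors' implementation: the function names `neighbor_sum` and `pcnn_bms`, the 3x3 weight kernel (shared by both channels), the initial threshold, the stimulus scaling, and every default parameter value are assumptions chosen only so that the example runs.

```python
import numpy as np


def neighbor_sum(y, kernel):
    """Weighted sum of each neuron's neighbors (zero padding at the border)."""
    h, w = y.shape
    r = kernel.shape[0] // 2
    padded = np.pad(y.astype(float), r, mode="constant")
    out = np.zeros((h, w))
    for di in range(kernel.shape[0]):
        for dj in range(kernel.shape[1]):
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w]
    return out


def pcnn_bms(image, n_iter=10, alpha_f=0.1, alpha_l=1.0, alpha_t=0.5,
             v_f=0.5, v_l=0.2, v_t=20.0, beta=0.1):
    """Return the binary map series, shape (n_iter, H, W), values 0 or 1."""
    s = np.asarray(image, dtype=float)
    if s.max() > 0:
        s = s / s.max()  # stimulus scaled to [0, 1] (an assumption)
    # Same 3x3 weights for both channels (M = W), for illustration only.
    kernel = np.array([[0.5, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.5]])
    f = np.zeros_like(s)      # feeding channel
    l = np.zeros_like(s)      # linking channel
    theta = np.ones_like(s)   # dynamic threshold (initial value assumed)
    y = np.zeros_like(s)      # pulse output of the previous iteration
    maps = []
    for _ in range(n_iter):
        coupling = neighbor_sum(y, kernel)         # pulses from iteration n-1
        f = np.exp(-alpha_f) * f + v_f * coupling + s
        l = np.exp(-alpha_l) * l + v_l * coupling
        u = f * (1.0 + beta * l)                   # modulation
        theta = np.exp(-alpha_t) * theta + v_t * y
        y = (u > theta).astype(float)              # fire where U exceeds theta
        maps.append(y.astype(np.uint8))
    return np.stack(maps)
```

Calling `pcnn_bms(gray_image, n_iter=N)` on a 2D gray-level array returns one binary map per iteration, stacked along the first axis.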

3. Face Feature Extraction Using PCNN

In a PCNN, each neuron is connected with the neighboring neurons within its linking range, so a neuron receives pulse inputs from its adjacent neurons. If adjacent neurons have an external stimulus similar to that of the current neuron, the neuron will issue a pulse because of the pulse coupling action; that is to say, the neuron and its similar adjacent neurons emit synchronous pulses. Groups of similar PCNN neurons thus possess a pulse synchronization characteristic, which benefits feature clustering for an image.

3.1. Feature Extraction Based on Binary Map Series of PCNN

The PCNN outputs a binary map in each iteration, so a binary map series (BMS) containing $N$ binary maps is generated after $N$ iterations; an example is shown in Figure 2.

Figure 2 shows a face image in the upper left corner; the other images are the BMS of this face image generated by the PCNN. The results in Figure 2 show that the BMS effectively reflects the edge details and the regional feature distribution and that, along the time axis of the image sequence, it demonstrates how the image features change through the pulse coupling of neighborhood neurons. In addition, some regional components repeat with a certain cycle in the BMS, while some feature components contract along the time axis and others expand. This phenomenon shows that the PCNN clusters feature regions smoothly.

In a PCNN, an image of size $P \times Q$ generates a BMS of size $P \times Q \times N$; the BMS therefore holds $N$ times as much data as the original image, so it cannot be applied directly to pattern classification. If each 2D binary map in the BMS can be reduced to a single 0D data point by some means, the 3D BMS is reduced to a 1D oscillation time sequence (OTS) of length $N$; the OTS thus realizes both feature extraction and data dimension reduction for the image. Because this kind of OTS is generated from the BMS, it is denoted as OTS-BMS. Johnson [15] put forward a method that reduces a binary image to a 0D data point by calculating the area of the binary map; this kind of OTS-BMS is described by (5), and an example is shown in Figure 3:
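The area-based reduction described above amounts to counting the firing neurons in each binary map. A minimal sketch follows, assuming the BMS is stored as an array of shape (N, height, width) with values 0 and 1; the function name `ots_bms` is hypothetical.

```python
import numpy as np


def ots_bms(bms):
    """Area (number of firing neurons) of each binary map, shape (N,)."""
    bms = np.asarray(bms)
    # Flatten every map and sum its 1-state pixels.
    return bms.reshape(bms.shape[0], -1).sum(axis=1)
```

Because the area only counts pulses, it is unchanged when a map is rotated by a multiple of 90 degrees or mirrored, which illustrates on a small scale the geometric invariance attributed to this kind of OTS, and also why it carries no spatial structure.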

3.2. Frequency Map Series and Feature Extraction

The OTS-BMS defined by (5) is a global statistic of each binary image, so, within a binary map, pulses (1 state) at different locations and in different feature regions play the same role in the OTS-BMS feature description. Thus, this kind of OTS-BMS neither properly describes the correlation among the binary maps in chronological order nor describes the spatial structure of a binary map well, although this information is very important for effective pattern classification using the BMS. In view of this, we define a frequency map series (FMS) based on the above BMS, which is described by (6), and an example is shown in Figure 4:
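One plausible reading of a frequency map, offered here only as an assumption because definition (6) is not reproduced in this text, is the number of times each neuron has fired from a chosen starting iteration up to the current one. Under that assumption the FMS is a cumulative sum of the BMS along the time axis; the function name `frequency_map_series` and the `start` parameter are hypothetical.

```python
import numpy as np


def frequency_map_series(bms, start=0):
    """Cumulative firing counts per neuron, beginning at iteration `start`.

    ASSUMPTION: a frequency map is taken to be the running sum of the
    binary maps; the paper's own definition (6) may differ.
    """
    bms = np.asarray(bms, dtype=np.int64)
    return np.cumsum(bms[start:], axis=0)
```

With this reading, a pulse contributes to every later frequency map, so neighboring maps in the series are correlated by construction and the gray-level range of the maps grows over time.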

In the FMS, if each 2D frequency map can be transformed into a 0D feature point, a 1D time sequence signature, the OTS-FMS, can be extracted from the 3D FMS. In addition, so that the 0D feature points include as much of the spatial structure information of a frequency map as possible, we use the two-dimensional entropy of an image to extract the 0D feature information of a frequency map. The two-dimensional entropy of an image is defined by (8):

In (8), the summation runs over the gray levels of the image. In (9), the first index represents the pixel gray value, and the second represents the neighbor mean value of this pixel. The two-dimensional entropy not only transforms a 2D image into a 0D data point but, because it reflects the relation between each pixel gray value and its neighbor mean value, also contains the spatial structure information of the image. By calculating the two-dimensional entropy of each frequency map, the 3D FMS can be transformed into a 1D OTS-FMS with one value per frequency map; as a result, this method greatly reduces the feature dimension. In addition, the OTS-FMS also has good pattern classification ability for face images: as shown in Figure 5, the OTS-FMS curves of two images of the same person, S1_1 and S1_7, are highly similar, whereas that of a different person, S3_9, clearly differs.
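The two-dimensional entropy can be sketched as follows, using the usual definition $H = -\sum_{i}\sum_{j} p_{ij}\log_2 p_{ij}$, where $p_{ij}$ is the fraction of pixels whose gray value is $i$ and whose neighbor mean is $j$. The 8-neighbor window, the edge-replicated border, the rounding of the mean, and the base-2 logarithm are assumptions made for this illustration, as are the function names.

```python
import numpy as np


def neighbor_mean(img):
    """Rounded mean of the 8 neighbors of each pixel (edge-replicated border)."""
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    total = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            if (di, dj) != (1, 1):  # skip the center pixel itself
                total += padded[di:di + h, dj:dj + w]
    return np.rint(total / 8.0).astype(np.int64)


def entropy_2d(img):
    """Entropy of the joint histogram of (gray value, neighbor mean) pairs."""
    img = np.asarray(img, dtype=np.int64)   # non-negative integer gray levels
    levels = int(img.max()) + 1
    means = neighbor_mean(img)
    # Encode each (gray, mean) pair as one integer and count occurrences.
    pair_index = img.ravel() * levels + means.ravel()
    counts = np.bincount(pair_index, minlength=levels * levels)
    p = counts[counts > 0] / img.size
    return float(-(p * np.log2(p)).sum())
```

Applying `entropy_2d` to each frequency map in turn would give one value per map, that is, a 1D sequence in the sense of the OTS-FMS described above.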

Because the two-dimensional entropy of an image is independent of the rotation and translation of the image and adapts well to image scaling, this kind of OTS-FMS is invariant to image rotation and translation and, with a small error, to image scaling. In Figure 6, the face image is rotated by 30 degrees, zoomed by a factor of 0.6, and moved 15 pixels vertically and horizontally, respectively. Although the image undergoes these geometric changes, its OTS-FMS characteristic curves differ little.

4. Distance Measure Selection of OTS-FMS

After feature extraction and dimensionality reduction of the face image, the common Euclidean distance (ED) can be used as the distance measure in pattern classification for face recognition. However, ED requires that the distribution of each dimension component of the data be consistent. For the OTS-BMS feature data of an image of size $P \times Q$, each component is a global pulse count, so the mapping space of each dimension component is $[0, P \times Q]$, and ED can be used to measure the distance between OTS-BMS features.

However, the OTS-FMS feature data of an image are monotonically increasing overall, as shown in Figure 5; that is, the frequency level of each dimension component of the data also increases overall. Therefore, even though the same entropy method is applied to every frequency map, the mapping spaces of the frequency maps are inconsistent; that is to say, the two-dimensional entropy results are not consistent in scale across dimensions. ED is therefore inappropriate as the distance measure for OTS-FMS feature data, and a scale-independent distance measure should be selected. Here, the standardized Euclidean distance (SED) is a suitable choice. For feature vectors $x$ and $y$, with $s_k$ denoting the standard deviation of the $k$th component, SED is defined by

$$d(x, y) = \sqrt{\sum_{k} \left(\frac{x_k - y_k}{s_k}\right)^{2}}$$
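A short sketch of the standardized Euclidean distance and its use in nearest-neighbor matching is given below. Estimating the per-component standard deviation from the training features, the guard for zero deviation, and the function names are assumptions of this illustration; the text above does not state how the deviation is obtained.

```python
import numpy as np


def standardized_euclidean(x, y, std):
    """Euclidean distance after dividing each component by its deviation."""
    x, y, std = (np.asarray(a, dtype=float) for a in (x, y, std))
    return float(np.sqrt((((x - y) / std) ** 2).sum()))


def nearest_neighbor(train, labels, query):
    """Label of the training feature vector closest to `query` under SED."""
    train = np.asarray(train, dtype=float)
    # ASSUMPTION: deviations are estimated per component over the training set.
    std = train.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid division by zero for constant components
    dists = [standardized_euclidean(row, query, std) for row in train]
    return labels[int(np.argmin(dists))]
```

Dividing by the deviation puts every component on a comparable scale, so later OTS-FMS components, whose values are larger overall, do not dominate the distance.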

5. Experimental Results and Analysis

To evaluate the performance of the proposed facial feature extraction in face recognition, experimental studies are carried out on the ORL and MIT-CBCL face databases. The ORL database contains 400 images of 40 individuals (4 females and 36 males). Each individual has 10 images varying in position, rotation, zoom, and expression. The MIT-CBCL database comprises 2000 images from 200 people; each person has 10 images with different illumination, pose, and rotation. The parameters of the PCNN are set as , , , , , , , , , the default iterations , and the default statistics starting point .

First of all, to test the classification performance of OTS-BMS and OTS-FMS under ED and SED, we ran experiments on the ORL database by randomly leaving out a number of images per person for testing and using the remaining images of that person for training. This was repeated 30 times for each number of left-out images. The experimental results listed in Table 1 are the averages over the 30 runs.

Table 1 shows that, under different training sets, the recognition rate with the SED measure is significantly higher than with the ED measure for OTS-FMS, whereas the recognition rate with the ED measure is slightly higher than with the SED measure for OTS-BMS. This shows that the OTS-FMS, obtained by calculating the two-dimensional entropy of each frequency map in the FMS, is suited to the SED measure, while the OTS-BMS, obtained by calculating the area of each binary map in the BMS, is suited to the ED measure. In addition, the recognition rate of OTS-FMS under the SED measure is significantly higher than that of OTS-BMS under the ED measure, which shows that the proposed OTS-FMS can effectively extract facial image features.

Then, to investigate how the number of iterations in the PCNN and the statistical starting point in the FMS influence the facial recognition performance, experiments were carried out on the ORL face database by randomly choosing 6 images per person each time for testing, with the remaining 4 images per person used for training. This was repeated 30 times for each combination of the number of iterations and the starting point. The average face recognition rates in these experiments are shown in Table 2.

Table 2 shows that, for a fixed starting point, the recognition rate first increases with the number of iterations and then gradually stabilizes; hence, the number of iterations can be set as to decrease the amount of calculation for the PCNN. For a fixed number of iterations, the recognition rate decreases as the starting point increases, so, to increase the recognition rate, the starting point is set as .

On the ORL database, taking the frontal images of each person as the training set and the rest of the images as the test set, we compare the recognition rate of our method (OTS-FMS + SED) with those of other methods, namely the common subspace face recognition algorithms (PCA, 2DPCA, and KPCA) and the PCNN method of [16], as shown in Table 3. Under the condition , the recognition rate of the proposed method is significantly higher than that of the other methods; under the condition , it is lower than that of the subspace methods but still higher than that of the PCNN method [16]. Combined with the experimental results of Table 1 (, the average recognition rate is 92.67%), this indicates that a few influential samples exist in the training sets.

To verify the validity of this method for face recognition in a larger data space, on MIT-CBCL we randomly select face images of each person at rate to compose the training set, with the remaining images as the test set; the experiment was repeated 30 times, and the average recognition rates are shown in Table 4.

As shown in Table 4, under the same training set, the recognition rate of the proposed OTS-FMS is higher than that of the BMS-based method of [16], which illustrates the validity of the proposed method. In addition, the experimental results on the MIT-CBCL database are consistent with those on the ORL database: the recognition rate using the SED measure is higher than that using the ED measure. This again supports the analysis of the characteristics of the OTS-FMS in this paper.

On the ORL and MIT-CBCL face databases above, the faces have been cropped manually from the original images, but in a practical face recognition system the face must be obtained from an image by a face detection algorithm such as the method in [17], and the recognition performance is easily affected by the detection accuracy. To test how well OTS-FMS and OTS-BMS adapt to the face detection precision, we set up a custom face database, which includes 208 images of 13 individuals; each individual has 16 images varying in facial expression, illumination, pose, background, and facial details (with/without glasses). Part of the database is shown in Figure 7.

The images in the custom database contain a large proportion of background. We first apply the face detection algorithm [17] (as shown in Figure 8) and then compare the face recognition performance before and after face detection for both OTS-FMS and OTS-BMS; the results are shown in Table 5. Here, experimental repetitions are 30, . Table 5 shows that the average recognition rates of both OTS-FMS and OTS-BMS improve after face detection, which suggests that an efficient face detection method is important for improving the identification performance of the proposed method. In addition, both before and after face detection, the average recognition rate of OTS-FMS is better than that of OTS-BMS, which again suggests that OTS-FMS, the improvement on OTS-BMS for facial feature extraction, is effective.

In addition, after face detection on the custom face database, we add Gaussian noise with mean value 0 and variance to each of the detected faces shown in Figure 8; here, the variance satisfies with uniform distribution. We then investigate how well OTS-FMS and OTS-BMS adapt to noise; the results are shown in Table 6. Here, experimental repetitions are 30, . Tables 5 and 6 show that, after the faces are polluted by noise, the average recognition rate decreases for both OTS-FMS and OTS-BMS, but noise affects OTS-FMS identification more seriously than OTS-BMS. This is mainly because OTS-BMS is a global statistic on the BMS of the facial image, whereas OTS-FMS places more emphasis on local facial detail. Therefore, when a face is polluted by Gaussian noise, the damage to the local facial details is severe, while the damage to the global statistics is lighter. This shows that the face recognition algorithm (OTS-FMS + SED) is sensitive to noise.

6. Conclusions

In this paper, a novel method was proposed to extract facial features based on the PCNN. By analyzing the limitations of the one-dimensional oscillation time sequence (OTS) extracted from the BMS and considering the correlation between the binary images in the BMS, we proposed a method to transform the BMS into a frequency map series (FMS), which reduces the influence of discontinuous values in the binary images on the effectiveness of the OTS. To transform each 2D frequency map into a 0D data point while reflecting the spatial structure information in the frequency map, the two-dimensional entropy of the image is employed; it transforms the 3D FMS into a 1D OTS-FMS feature, which has good invariance to geometric changes of the face image. Finally, based on an analysis of the characteristics of the OTS-FMS features, the standardized Euclidean distance is used as their distance measure. The experimental results show that the recognition rate of OTS-FMS is significantly higher than that of PCA, KPCA, and so forth, and also better than that of the method of [16], which is based on OTS-BMS features extracted from the BMS. Moreover, compared with traditional subspace analysis and neural network methods, the face features extracted by this method do not change when the sample space changes.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was financially supported by the National Natural Science Foundation of China (nos. 61365001 and 61463052), the Natural Science Foundation of Yunnan Province (no. 2012FD003), and the Science and Technology Plan of Yunnan Province (no. 2014AB016).