Abstract

Face recognition in special scenes requires a large number of training samples labeled with identity markers, and features are difficult to extract accurately when only a small number of samples is available. To address this, a fast face recognition method based on double decision subspace is proposed. A feature recognition structure based on double decision subspace is constructed to preprocess the face image and separate the local features of the corresponding face images. The local binary pattern is used to extract the local texture features of the face, and a fast face recognition model based on a deep convolution network is constructed. Weight sharing, pooling, and downsampling in the convolution network reduce the complexity of the model. The constructed model is then used to recognize the features of the face image and complete fast face recognition. The experimental results show that the proposed method achieves high recognition accuracy and a short average recognition time.

1. Introduction

With the rapid development of information technology, accurately identifying a person’s identity and protecting information security has become a key social problem [1, 2]. Traditional authentication based on passwords, cards, and certificates increasingly fails to meet the needs of society because such credentials are easy to forge and lose. Biometric technology, in contrast, has become one of the most secure and complete authentication technologies because of its uniqueness, concealment, anticounterfeiting, stability, and universality [3, 4]. Biometric identification refers to technology that uses the physiological or behavioral characteristics of human beings to uniquely identify them [5]. Physiological characteristics are inherent to individuals and include the face, fingerprint, palm print, retina, and iris. Behavioral characteristics are habits acquired by individuals and include handwriting, gait, voice, and keystroke [6, 7]. To some extent, these characteristics are possessed by everyone, are unique to each person, and can distinguish different individuals. The corresponding recognition technologies include face recognition, fingerprint recognition, palmprint recognition, retina recognition, iris recognition, and speech recognition [8, 9]. Among them, face recognition corresponds to the most important way in which humans recognize each other.

The main applications of face recognition include the following aspects:

(1) Access control system: in areas requiring security protection, face recognition identifies people trying to enter and prevents unauthorized people from entering.

(2) Criminal investigation: after obtaining the photos or facial features of a suspect, staff use network services and a face recognition system to search for fugitives across the country so that they can be arrested quickly.

(3) Video surveillance: crowds in public places such as banks, airports, stadiums, and shopping malls are monitored to prevent terrorist activities.

(4) Network applications: face recognition assists online credit card payment to prevent people other than the card owner from using stolen credit cards.

(5) Human computer interaction: face recognition is used to log in to a personal computer, unlock a mobile phone, and drive realistic virtual games.

Face recognition technology has attracted growing attention in academic circles and has very wide application prospects because it is nonmandatory, noncontact, intuitive, and simple. Reference [10] proposed cancelable face recognition based on a fractional Lorenz chaotic system and Haar wavelet fusion. Cancelable biometric recognition generates distorted or encrypted versions of the original biometric templates. Its development is motivated by advanced hacking techniques that can capture the original stored biometrics from a database. The basic idea is to generate user-specific random keys, which are different from the red, green, and blue components of color face images. These keys are generated by the fractional Lorenz chaotic system. Reference [11] proposed a face recognition method based on probabilistic cooperative dictionary learning. Although the classifier based on sparse representation (SRC) is a nonparametric model and can obtain interesting pattern recognition results, it lacks a reasonable explanation for its classification mechanism. In addition, the training samples are used directly as a ready-made dictionary in SRC, which makes it difficult to identify the features hidden in the training samples. At the same time, because there are too many atoms in the dictionary, the complexity of the algorithm increases. Reference [11] first explains the classification mechanism of SRC in detail from the perspective of probabilistic cooperative subspace and shows how joint probability improves the stability of the algorithm in the case of multiple subspaces. Dictionary learning and the Fisher criterion are then introduced into the model to further improve the discrimination ability of the coding coefficients. To ensure the convexity of the discriminant term and further enhance its discriminant ability, an L21 norm term is added to the Fisher discriminant term, and its convexity is proved. 
Finally, the experimental results on a series of benchmark databases show that pcddl is superior to the existing classical classification models. Although the above research has made some progress, fast face recognition still faces great challenges, mainly due to changes in posture, expression, illumination, and occlusion. In recent years, the convolution neural network algorithm, widely used in pattern recognition and image processing, has shown a certain degree of invariance to these effects. Therefore, a fast face recognition method based on double decision subspace is proposed. The feature recognition structure based on double decision subspace is used to preprocess the face image and separate the local features. The local binary pattern is used to extract the local texture features of the face, and a fast face recognition model based on a deep convolution network is constructed to recognize the features of the face image and complete fast face recognition. The proposed fast face recognition technology has theoretical and practical significance.

2. Fast Face Recognition Method Based on Double Decision Subspace

2.1. Feature Recognition Based on Double Decision Subspace

The complete kernel discriminant analysis algorithm is a feature recognition algorithm based on double decision subspace. It combines kernel principal component analysis with Fisher linear discriminant analysis and proceeds in three stages. First, the initial feature space is obtained by applying kernel principal component analysis to the input. Second, the Fisher linear discriminant algorithm is used to analyze this feature space in the double decision subspace (kernel space and range space), which yields two kinds of decision feature information: unconventional features and conventional features. Finally, the two kinds of feature information are fused according to certain rules, and the fused features are used for recognition.

With the progress of artificial intelligence technology, many traditional fast face recognition algorithms have been developed and improved [12]. The rapid development of software applications based on face recognition has brought great convenience to human life and work. The double decision subspace is studied here because it can learn from fewer data points while still completing the fast face recognition task accurately [13]. The feature recognition structure based on double decision subspace is shown in Figure 1.

In fast face recognition, an unknown face image is first preprocessed, the local features of the corresponding face images are separated, and the projection vector of each feature is obtained. On this basis, a classification discriminant function is defined. The function is expressed as the distance between a group of local features of the unknown face image and the corresponding local features in the face database [14, 15]. Based on this function, the minimum classification error rate between face features is calculated for feature fusion, and the fusion results are used to complete the fast recognition of face features.

Let each unknown face image be described by a set of local feature vectors. Each feature vector is projected onto its corresponding subspace, the projection vector of each feature is calculated, and a feature classification discriminant function is defined by the following formula:

Formula (1) combines the distance between each feature vector and the other features with the weighting coefficient of each feature vector. In fast face recognition based on double decision subspace [16], the minimum classification error rate between face features is then calculated from formula (1) as follows:

Formula (2) uses the maximum eigenvalue in the vector of face features [17]. The features of the face images can then be fused to collect the information of the face images. The expression is

Formula (3) gives the comprehensive function of face feature discrimination.
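As a rough illustration of the idea behind formula (1), the sketch below scores a probe face by a weighted sum of distances between its local features and the corresponding local features of each enrolled face, then returns the closest identity. The Euclidean distance, the function names, the weights, and the toy database are assumptions made for illustration only; the exact form of formulas (1) to (3) is not reproduced here.

```python
import math

def weighted_distance(probe, enrolled, weights):
    """Weighted sum of Euclidean distances between matching local features.

    probe, enrolled: lists of feature vectors (one per local face region).
    weights: one weighting coefficient per local feature (assumed values).
    """
    return sum(w * math.dist(p, e) for p, e, w in zip(probe, enrolled, weights))

def classify(probe, database, weights):
    """Return the identity whose enrolled features are closest to the probe."""
    return min(database, key=lambda name: weighted_distance(probe, database[name], weights))

# Hypothetical toy data: two local features per face, two enrolled identities.
database = {
    "subject_a": [(0.0, 1.0), (1.0, 1.0)],
    "subject_b": [(3.0, 4.0), (1.0, 1.0)],
}
probe = [(0.0, 0.0), (1.0, 1.0)]
print(classify(probe, database, weights=[0.5, 0.5]))
```

Keeping one weight per local feature makes it easy to down-weight regions that are unreliable, which is the role the weighting coefficient plays in the description above.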

According to the synthesis function of face feature discrimination, the obtained results can effectively complete face recognition [18, 19]. Face image preprocessing serves the subsequent fast face recognition. When the original face image is acquired, various conditions and random interference limit the image quality, which is often not high. Therefore, image preprocessing must be carried out in the early stage, including image graying, noise filtering, and image enhancement [20].

2.1.1. Image Graying

The collected original face image is a color image. Color adds information that is not needed for recognition and increases the later recognition workload, so the original face image is converted to grayscale [21]. Image graying refers to changing a color image into a grayscale image with pixel values in the range 0 to 255. At present, there are four main processing methods, as follows:

(1) Component Method. The brightness of the three components in the color image is taken as the gray value of the three gray images.

Formula (4) involves the three components of the color image (red, green, and blue) and the gray value coefficient corresponding to each of them: one for red, one for green, and one for blue.

(2) Maximum Method. The maximum value of the three component brightness in the color image is taken as the gray value of the gray image.

(3) Average Method. The three-component brightness in the color image is averaged to obtain a gray value.

(4) Weighted Average Method. According to the importance and other indicators, the three components are weighted and averaged with different weights.
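The four graying methods above can be sketched for a single pixel as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the default weights in the weighted average are the common ITU-R BT.601 luma coefficients, which are an assumption because the text does not state which weights were used.

```python
def gray_component(r, g, b):
    """Component method: each channel becomes the gray value of its own gray image."""
    return r, g, b

def gray_maximum(r, g, b):
    """Maximum method: the brightest of the three components."""
    return max(r, g, b)

def gray_average(r, g, b):
    """Average method: arithmetic mean of the three components, rounded."""
    return round((r + g + b) / 3)

def gray_weighted(r, g, b, weights=(0.299, 0.587, 0.114)):
    """Weighted average method; the default weights are assumed (BT.601 luma)."""
    wr, wg, wb = weights
    return round(wr * r + wg * g + wb * b)

pixel = (200, 100, 50)  # hypothetical RGB pixel
print(gray_component(*pixel), gray_maximum(*pixel),
      gray_average(*pixel), gray_weighted(*pixel))
```

Applying one of these functions to every pixel of a color image yields a grayscale image with values between 0 and 255.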

2.1.2. Noise Filtering

The image is disturbed by noise during acquisition, which leaves noise points in the face image and reduces its sharpness. Therefore, noise filtering is required [22–24]. A median filtering method is used for image denoising. Its principle is to set the gray value of each pixel to the median of the gray values of all pixels in a neighborhood window around that pixel, so that pixel values are brought close to those of their surroundings and isolated noise points are eliminated. The specific process is shown in Figure 2.
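A minimal sketch of median filtering on a grayscale image stored as a list of rows is given below. The window size of 3, the clipping of the window at the image border, and the choice of the upper median for windows with an even number of pixels are assumptions; the text does not specify them.

```python
def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighborhood.

    The window is clipped at the image border (an assumed border policy).
    """
    height, width = len(image), len(image[0])
    radius = size // 2
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = sorted(
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            )
            # Middle element; for an even count this is the upper median.
            result[y][x] = window[len(window) // 2]
    return result

# A flat patch with one isolated noise point in the center.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy))
```

Because the median ignores extreme values, the isolated bright pixel is replaced by the value of its neighbors while flat regions are left unchanged.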

2.1.3. Image Enhancement

Fast face recognition based on double decision subspace relies on the distinctive features of the face. To highlight these features and speed up recognition, image enhancement is carried out to enrich the amount of information and improve the visual effect of the image [25, 26]. There are many image enhancement methods. Here, an image enhancement method based on the principle of histogram compactness transformation is adopted. The basic process is as follows.

Step 1. Count the gray frequency histogram of the input image and calculate the eigenvalue of the image histogram.

Step 2. Calculate the search starting point and left and right reference frequency.

Step 3. Histogram compactness transformation.

Step 4. Establish gray mapping relationship.

Step 5. Input image gray reconstruction.

Carry out face image recognition processing. The specific process is shown in Figure 3.

3. Implementation of the Fast Face Recognition Method Based on Double Decision Subspace

3.1. Face Local Texture Feature Recognition Based on Local Binary Pattern

The local binary pattern is a method for extracting texture features locally from images. It has strong classification ability, high computational efficiency, rotation invariance, and gray invariance, and it is widely used in image recognition, machine vision, and other fields. The selected facial features should fully reflect the important characteristics of the face [27] and should be easy to extract. They should contain enough information, but not so much that the operation speed is reduced.

Therefore, seven prominent feature points are selected: forehead, eyebrows, ears, cheeks, jaw, nose, and eyes. The basic structure of the human face is shown in Figure 4:

Figure 4 marks the forehead, eyebrow, ear, cheek, jaw, nose, and eye of the face. The gray features at these seven feature points are clearly different from those of other parts of the face. Therefore, face features can be collected quickly through integral projection [28, 29]. Integral projection can be applied directly to the gray image, or the image can first be binarized and then projected. The binary image has only two gray levels, 0 and 1, namely,

In formula (8), each pixel of the binary image is assigned one of the two levels by comparing its gray value with the threshold.
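Binarization followed by integral projection can be sketched as below. The convention that pixels at or above the threshold map to 1 is an assumption (the opposite convention is equally common), and the function names and the sample image are hypothetical.

```python
def binarize(image, threshold):
    """Map each gray value to 1 if it is at or above the threshold, else 0 (assumed convention)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]

def integral_projections(image):
    """Return (horizontal, vertical) projections: row sums and column sums."""
    horizontal = [sum(row) for row in image]
    vertical = [sum(column) for column in zip(*image)]
    return horizontal, vertical

gray = [[10, 200, 30],
        [220, 240, 50],
        [0, 5, 250]]
binary = binarize(gray, threshold=128)
print(binary)
print(integral_projections(binary))
```

Peaks and valleys in the two projection curves indicate rows and columns where the gray distribution changes sharply, which is how feature points such as the eyes or nose can be located quickly.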

In face recognition, expression changes have a great impact on the recognition results, and overcoming them is an important research topic in fast face recognition [30]. The mouth area is the most affected by changes of expression. When people laugh, the mouth opens, which changes the structure of the face and therefore the face topology [31]. When the change of facial expression is not obvious, the position change of each feature point can be regarded as an isometric transformation of the neutral expression. Therefore, under the isometric hypothesis, this study avoids the mouth area and represents the face shape by the isometric geodesic contour centered on the nose. This contour passes through the forehead, cheek, nose, and eye areas, which are less affected by expression changes.

In face recognition, the local binary pattern is used to extract the local texture features of the face. Its basic principle is to compare the gray value of the central pixel with the gray values of its neighborhood pixels. The calculation formula is as follows:

Formula (9) involves the gray values of the pixels sampled on a neighborhood of a given radius, the gray value of the central pixel, and the number of sampling points in the neighborhood. The gray value of the local binary pattern is obtained from this formula.

The calculation above shows that the number of binary patterns produced by a local binary pattern operator depends on the number of sampling points in the neighborhood. A pattern is defined as an equivalent pattern when its binary sequence, connected end to end, changes from 0 to 1 or from 1 to 0 at most two times. Restricting the operator to equivalent patterns effectively reduces the face feature dimension while limiting the loss of feature information [32].
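The comparison with the central pixel and the equivalent pattern test can be sketched for the common case of 8 sampling points at radius 1. The neighbor ordering, the use of "greater than or equal" in the comparison, and the function names are assumptions; the authors' exact operator is not reproduced here.

```python
# Offsets (dy, dx) of the 8 neighbors at radius 1, clockwise from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, y, x):
    """Local binary pattern code of the pixel at (y, x), which must not lie on the border."""
    center = image[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if image[y + dy][x + dx] >= center:  # neighbor at least as bright as the center
            code |= 1 << bit
    return code

def transitions(code, bits=8):
    """Number of 0/1 changes when the bit sequence is read circularly."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")

def is_equivalent_pattern(code, bits=8):
    """Equivalent (uniform) pattern: at most two circular transitions."""
    return transitions(code, bits) <= 2

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch, 1, 1)
print(code, format(code, "08b"), is_equivalent_pattern(code))
```

With 8 sampling points there are 256 possible codes but only 58 of them are equivalent patterns, which is why grouping all remaining codes into a single bin shortens the histogram considerably.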

If the statistical histogram of the local binary pattern over the whole image is used as the face feature, the resulting feature is fuzzy. The idea of blocking is therefore adopted [33]: the image is divided into blocks, the local binary pattern operator is applied to each block, and the resulting histograms are connected to form a new face feature vector. The steps for extracting the block local binary pattern histogram are as follows.

Step 1. First, block the face image.

Step 2. Secondly, extract the local binary mode from the partitioned subgraph and generate the local binary mode histogram, as shown in Figure 5:

Step 3. Finally, the obtained multiple local binary pattern histograms are connected in order to form a new face local texture feature.
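The three steps above can be sketched as follows, starting from a map of local binary pattern codes that has already been computed. The block size, the number of histogram bins, and the function name are assumptions chosen to keep the example small.

```python
def block_histograms(codes, block_height, block_width, bins=256):
    """Split a code map into non-overlapping blocks and concatenate their histograms.

    codes: 2D list of integer pattern codes in the range [0, bins).
    Rows or columns that do not fill a whole block are ignored (assumed policy).
    """
    height, width = len(codes), len(codes[0])
    feature = []
    for top in range(0, height - block_height + 1, block_height):
        for left in range(0, width - block_width + 1, block_width):
            histogram = [0] * bins
            for y in range(top, top + block_height):
                for x in range(left, left + block_width):
                    histogram[codes[y][x]] += 1
            feature.extend(histogram)  # Step 3: connect the histograms in order
    return feature

# Toy code map with 4 possible codes and four 2 x 2 blocks.
codes = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [2, 2, 3, 3],
         [2, 2, 3, 3]]
print(block_histograms(codes, 2, 2, bins=4))
```

Concatenating per-block histograms keeps a coarse record of where each texture occurs, which a single whole-image histogram discards.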

According to the extraction process of local binary pattern histogram, the local texture features of human face are extracted by using local binary pattern operator [34], which lays a foundation for the optimization of face recognition and creates conditions for the construction of deep convolution network face fast recognition model.

3.2. Construction of Fast Face Recognition Model Based on Deep Convolution Network

Deep convolution network face recognition is generally composed of multiple single-layer convolution neural structures. The recognition process exposes the source of the features and improves performance under unconstrained conditions. In general, object recognition can take advantage of features from different granularities and different abstraction layers. For the input image, multiple convolution networks with different scales perform convolution at different granularities, and the different convolution networks do not interfere with each other. The features used in the final recognition come from the feature extraction results of the different single layers. Here, the local texture features of the face are extracted with the local binary pattern and clustered. The results are used as the input of each layer of the deep network, and the parameters of the deep convolution network are trained layer by layer. The deep convolution network is a multilayer neural network structure composed of several convolution layers, pooling layers, and full connection layers. The clustering step groups the given data samples by a chosen distance so as to minimize the intraclass distance and maximize the interclass distance. In face recognition, the specific operation steps are as follows.

Step 1. Select some of the face local texture features as the initial clustering centers [35], calculate the distance from each of the other face local texture features to the clustering centers, and assign each texture feature to the nearest class. Then, update the clustering center of each new class. If the clustering centers do not change between two consecutive iterations, the clustering is complete.

Step 2. Input the clustered face local texture features into the convolution layer (visual layer). The input is convolved with different convolution kernels, and each convolution produces an output feature map, which is given by formula (10). In formula (10), the convolution weights of the convolution kernel and the bias vector are the training parameters, which together form a synthesis function of the clustered face local texture features; the formula also involves a training coefficient.

Step 3. Input the clustered face local texture features into the pooling layer, which reduces the spatial resolution of the convolution layer output through downsampling. This reduces the number of face feature weights to be trained and also reduces the influence of face rotation, expression, illumination, and other factors. Because the pooling operation summarizes each fixed-size window, the clustered face local texture features become more representative.

Step 4. Input the clustered face local texture features into the full connection layer. The output feature vector of the layer is obtained from the weight matrix, the bias vector, and the activation function, which gives formula (11). According to the synthesis function in formula (11), the final deep convolution network is combined with the local binary pattern network structure. In the deep convolution network, the pixels at the boundary of the input image are scanned by the convolution kernel only a few times, whereas the pixels in the middle of the image are scanned many times. In practice, the image boundary is therefore extended, and the operation is carried out from the new boundary. This handles input images of inconsistent size and keeps the input and output sizes consistent after the convolution operation. Based on the double decision subspace, the construction of the deep convolution network face recognition model is completed.
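Steps 1 to 4 can be illustrated with small, library-free building blocks. This is a sketch under stated assumptions rather than the authors' model: k-means with squared Euclidean distance stands in for the clustering, the convolution is unpadded, ReLU is assumed as the activation, max pooling with a 2 x 2 window is assumed for downsampling, and all names and values are hypothetical.

```python
def kmeans(points, centers, max_iter=100):
    """Step 1: assign points to the nearest center until the centers stop changing."""
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:  # unchanged between consecutive iterations
            break
        centers = new_centers
    return centers, clusters

def conv2d(image, kernel, bias=0.0):
    """Step 2: unpadded 2D convolution (no kernel flip, as in most deep learning
    libraries) followed by an assumed ReLU activation."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [max(0.0, bias + sum(image[y + i][x + j] * kernel[i][j]
                             for i in range(kh) for j in range(kw)))
         for x in range(len(image[0]) - kw + 1)]
        for y in range(len(image) - kh + 1)
    ]

def max_pool(feature_map, size=2):
    """Step 3: downsample by keeping the maximum of each size x size window."""
    return [
        [max(feature_map[y + i][x + j] for i in range(size) for j in range(size))
         for x in range(0, len(feature_map[0]) - size + 1, size)]
        for y in range(0, len(feature_map) - size + 1, size)
    ]

def dense(x, weights, bias):
    """Step 4: full connection layer, activation(W x + b), with ReLU assumed."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

centers, clusters = kmeans([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)])
print(centers)
print(conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]]))
print(max_pool([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]))
print(dense([1, 2], [[1, 1], [-1, -1]], [0, 0]))
```

Weight sharing is visible in `conv2d`: the same small kernel is reused at every position, so the number of trainable values does not grow with the image size, and pooling then shrinks the feature map that later layers must process.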

3.3. Fast Recognition of Face Image Features

With the fast face recognition model based on the deep convolution network constructed on the double decision subspace, the face image features are recognized quickly by combining the deep convolution network with the local binary pattern [36]. The specific process of fast recognition of face image features is as follows:

Two different function models are constructed: a similarity measurement function within a face feature class and a similarity measurement function between different face feature classes. They are given by the following two formulas:

Formulas (12) and (13) involve the mean value of the class samples, the final output value of the network, the gap between the output value of the network and the label, the connection parameters between the units of each layer (that is, the weights), and the energy function with the criterion added. The next step of image feature recognition is carried out according to the two classification functions. To recognize the image features accurately, reasonable constraints must be established [37].

According to the function models in formulas (12) and (13), the mean value of the class face samples can be calculated, which gives the constraint function (14) on the mean value:

Given formula (14), to make the deep convolution network model more conducive to the recognition of face image features, the two function models in formulas (12) and (13) are further restricted. A new functional constraint on the residual parameters is formed as follows:

Formula (15) gives the residual parameters of each output unit of the output layer. Its main purpose is to make the recognition effect more pronounced [38, 39]. Under this constraint, the above formula reduces the intraclass spacing of face feature samples and adjusts the weights of each layer in a direction more conducive to face recognition [40].

To sum up, under the fast face recognition model based on the deep convolution network and the double decision subspace, the residuals of the units in each face image feature class are calculated with the above formulas, and the face image feature recognition value is obtained. With the deep convolution network model, fast face recognition can be completed effectively. Updating the deep convolution network model with the local binary pattern algorithm yields the fast face recognition method based on double decision subspace.

4. Experimental Analysis

To verify the feasibility of the fast face recognition method based on double decision subspace, an experiment is designed. Face images of 10 subjects are captured, and the 20 face images of each subject are stored in a folder with the same identification, so that each subject has a unique storage folder. The whole data set is divided into a training set and a test set. There are 120 sample images in the training set and 80 sample images in the test set. The experimental platform is a PC with a dual-core Intel Core i5 CPU, a main frequency of 2.93 GHz, and 8 GB RAM. The operating environment of the experimental platform is shown in Table 1.

All images are normalized to the same size, the face images of each person are used for training, and the full connection layer is trained with a triplet loss function. After training, a face image captured by the webcam is cropped and used for verification and recognition. In the first step, one image is selected from one of the previously stored subject folders as the positive image. In the second step, three subject folders are selected at random, and one image from each folder is taken as a negative image. The mean square error is calculated, and the process is repeated until the error obtained in the second step is less than the error obtained in the first step, which indicates that recognition has been completed quickly.

The calculation formula of average precision is

Formula (16) averages the recognition accuracy over the 10 subjects. The greater the average accuracy, the higher the recognition accuracy.

The calculation formula of average recognition time is

Formula (17) is based on the total time taken to recognize the faces of the 10 subjects. The average accuracy and average recognition time are calculated according to formula (16) and formula (17), and the results are shown in Table 2:

It can be seen from Table 2 that the average accuracy of face recognition for the 10 subjects with this method is 96.68% and the average recognition time is 354.2 s. These results are better than those of reference [10] and reference [11], whose average accuracy is 89.36% and 88.25%, respectively, and whose average recognition time is 410.5 s and 432.4 s, respectively. This shows that this method can complete the task of fast face recognition more quickly and accurately. The reason is that this method defines a classification discriminant function, expressed as the distance between a group of local features of the unknown face image and the corresponding local features in the face database, and uses it to calculate the minimum classification error rate between face features. Feature fusion based on this result then completes the fast recognition of face features effectively.
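The two metrics are arithmetic means over the subjects, which can be sketched as follows. The per-subject values below are hypothetical placeholders chosen for illustration; they are not the measurements behind Table 2.

```python
def average(values):
    """Arithmetic mean of a non-empty list of per-subject measurements."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)

# Hypothetical per-subject results, for illustration only.
accuracies = [0.95, 0.97, 0.98, 0.96]
recognition_times_s = [350.0, 360.0, 348.0, 358.0]

print(f"average accuracy: {average(accuracies):.4f}")
print(f"average recognition time: {average(recognition_times_s):.1f} s")
```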

For the training samples, 5 face images of each person are selected at random, for a total of 50. For the test samples, 6 face images of each person that were not used for training are selected. In the experiment, different subgraphs have different weights. The low-frequency components carry the main features of the human face, and their results are relatively stable. Among human facial expressions, the mouth has the largest range of change, so its low-frequency weight is low; the part above the nose changes relatively little, so the corresponding low-frequency weight is high. The vertical component describes the eye position and contour well, so its weight is also high. The horizontal component mainly reflects the mouth, which changes very obviously, so its weight is low. The diagonal component shows no obvious information in the experiment, so the low-frequency weight is high. Two groups of training samples are randomly selected and given different weights. The specific experimental results are shown in Table 3.

Table 3 compares the method in this paper with the methods in reference [10] and reference [11]. The method in this paper can efficiently recognize expression changes and increase the recognition rate. In the case of training set 1, the weights of this method, the reference [10] method, and the reference [11] method are 0.93, 0.94, and 0.936, respectively. In the case of training set 2, the weights of this method, the reference [10] method, and the reference [11] method are 0.88, 0.91, and 0.946, respectively. Expression changes are more prominent in training set 2. After the horizontal weight is increased, the recognition rate improves greatly.

In the experiment, three different face images are selected as the test object. The original image is shown in Figure 6:

The method in this paper, the reference [10] method, and the reference [11] method are each used for performance testing. The specific experimental test steps are as follows.

Step 1. Preprocess the face image, use the classifier to obtain the required face image information, and establish the face pose target matrix.

Step 2. Calculate the similarity of the best recognition target between the neighborhood of the target matrix, and plan the similar neighborhood set according to the fit degree of the target to the face image.

Step 3. Establish the face image recognition information model to realize the selection of recognition performance. The performance comparison results of different methods are shown in Figure 7.

By analyzing the experimental data in Figure 7, it can be seen that this method can accurately recognize different pose information of human face, while the other two methods can only recognize local information, which fully proves that this method can quickly obtain high accuracy recognition results. In conclusion, this method has good performance.

To sum up, the fast face recognition method based on double decision subspace in this paper has good recognition effect.

5. Conclusions and Prospects

Fast face recognition plays a key role in many fields, such as security, finance, and public transportation. Therefore, a fast face recognition method based on double decision subspace is studied. The conclusions are as follows:

(1) The double decision subspace has better operability than most face verification algorithms that rely on identity tags.

(2) In the experimental environment, the average accuracy of this method is 96.68%, and the average recognition time is 354.2 s.

(3) The method can efficiently recognize expression changes, increase the recognition rate, and accurately recognize different pose information of the face.

However, this experiment has some deficiencies. Data acquisition and image preprocessing are carefully controlled in the experimental environment, whereas practical applications involve much larger amounts of data, so the computational efficiency of the network model falls short of practical requirements and the recognition rate needs to be improved. Fast face recognition is a very challenging subject, and at present no single method achieves very good results on its own. How to combine the method with other technologies, improve the recognition rate and speed, reduce the amount of calculation, and implement it in hardware requires long-term research and exploration.

Data Availability

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Conflicts of Interest

The authors declared that they have no conflicts of interest regarding this work.

Acknowledgments

This work was supported by the Bureau Level Heilongjiang Provincial Higher Education Institution Basic Scientific Research Business Expenses Projects “Research on target tracking method based on convolutional neural network ensemble learning” (2021-KYYWF-0580), and Heilongjiang Provincial Natural Fund “Mechanism research of rare earth toughening and modification of Mg2Si and fracture behavior of MG-Al-Si alloy at chamber/high temperature” (LH2021E115).