Abstract

Facial makeup significantly changes the perceived appearance of the face and reduces the accuracy of face recognition. To meet the needs of smart city applications, in this study we introduce a novel joint subspace and low-rank coding method for makeup face recognition. To exploit more discriminative information in face images, we use feature projection to find a proper subspace and learn a discriminative dictionary in that subspace. In addition, we impose a low-rank constraint in the dictionary learning. We then design a joint learning framework and use an iterative optimization strategy to obtain all parameters simultaneously. Experiments on a real-world dataset show good performance and demonstrate the validity of the proposed method.

1. Introduction

Digital technologies represented by artificial intelligence, the Internet of Things (IoT), and cloud computing are developing vigorously in support of smart cities. A smart city aims to use various kinds of information technology to integrate the systems and services of the city, which improves the utilization efficiency of resources and the quality of life of residents [1, 2]. The number of IoT devices and sensors is expected to reach 40 billion by 2025 [3]. As the amount of data increases, the IoT industry is expanding from initial connectivity toward intelligence and autonomy. Simultaneously, artificial intelligence, as a powerful tool, provides intelligence for smart cities, and a large number of machine learning algorithms have been put into practical use so that equipment can collect and process data autonomously. In this setting, artificial intelligence helps to collect relevant data, identify alternatives, make choices among them, review decisions, and make predictions [4, 5]. Automatic face recognition is considered one of the important techniques for realizing smart cities. It plays an interactive role in human-computer interaction and intelligent transportation, for example in access control systems, community management information systems, and person-of-interest monitoring [6, 7]. For example, face recognition technology enables monitoring in crowded places such as bus and railway stations: faces recognized in video in real time are compared with a database of people of key concern to public security, and real-time alarms can be raised. In smart cities, face recognition technology can also be applied to examinations in schools: at the examination centre, candidates verify their identity through a face recognition system to ensure fairness and prevent test substitution.

Due to variations in illumination, face angle, pose, and cameras, face images belonging to the same person may look very different. In particular, in real-world applications, facial makeup significantly changes the perceived appearance of the face and reduces the accuracy of face recognition. The literature [8-10] indicates that facial makeup has a negative impact on the performance of the majority of face recognition algorithms. Figure 1 shows example face image pairs from the Disguised Faces in the Wild (DFW) dataset [11]: the left image in each pair is without makeup, and the right image is with makeup. These before-and-after pairs show intuitively how significantly makeup changes facial appearance. For these reasons, makeup face recognition has become a difficult problem in face classification. To develop a powerful face recognition system, the influence of cosmetics on face verification needs to be addressed. Yan [12] introduced multiple feature descriptors into metric learning, learning multiple distance metrics by combining different facial features from visual and audio information. Chen et al. [13] developed a method for the automatic detection of makeup in face images. This method extracts a feature vector that captures the shape, texture, and color of face images and uses SVM and Adaboost to determine whether makeup is present. In addition to extracting features from the whole face, the method also uses parts of the face associated with the left eye, right eye, and mouth. Kose et al. [14] developed a facial makeup detector to reduce the impact of makeup in face recognition. Their method exploits the shape and texture information of the face and uses SVM and Alligator as classifiers. Wang and Kumar [15] developed a framework for facial makeup detection and removal, in which locality-constrained low-rank dictionary learning is used for makeup detection and locality-constrained coupled dictionary learning is used for makeup removal. Although there have been some research results on makeup face recognition, the performance of these methods in real-world applications still needs to be improved.

Recently, dictionary learning has achieved great success in the field of face recognition. Traditional dictionary learning learns the sparse representation and the dictionary in the original data space. However, makeup face verification is affected not only by cosmetics but also by illumination and pose. In this study, we develop a joint subspace and low-rank coding method for makeup face recognition (JSLC). We find a feature projection space and project the face images into it. At the same time, we learn a discriminative dictionary in this feature subspace, and each face image is encoded by a discriminative code. To solve for the subspace and the dictionary simultaneously, we build a joint learning model for them. In addition, to obtain more discriminative information in the subspace, we impose a low-rank constraint in the dictionary learning. The optimal subspace projection matrix, dictionary, and sparse coefficients can be obtained simultaneously by an alternating iterative optimization strategy.

We organize the rest of this paper as follows. Related work on makeup face recognition is reviewed in Section 2. The proposed method is introduced in Section 3. The results of the comparison experiments are presented in Section 4. Finally, conclusions and future work are summarized in Section 5.

2. Related Work

From the perspective of AI, makeup face recognition comprises two stages: feature extraction and classification. The commonly used feature extraction methods for face recognition are geometric methods and appearance methods [9]. Geometric methods use the geometric shape of facial components, representing facial characteristics by predefined marker positions on salient facial features; appearance methods use the textures of facial images, including creases and furrows. Since geometric methods express facial characteristics through a limited set of fiducial points on the human face, they usually require accurate facial feature detection. Thus, appearance methods often perform better in face recognition. The commonly used local binary patterns (LBP) and Gabor filters are both appearance methods. There are many successful classification methods for face recognition, such as SVM, metric learning, dictionary learning, Adaboost, and so on [16, 17]. Owing to its sparsity and noise alleviation, dictionary learning has demonstrated its advantages in image processing tasks.

Dictionary learning methods can approximate each sample by using a linear combination of a few atoms from the learned dictionary [18, 19]. Given training samples $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{d \times n}$, the dictionary $D \in \mathbb{R}^{d \times m}$ and the corresponding sparse coefficients $A \in \mathbb{R}^{m \times n}$ can be trained by the following formula:

$$\min_{D, A} \; \|X - DA\|_F^2 + \lambda \|A\|_1, \tag{1}$$

where $\|\cdot\|_F$ is the Frobenius norm, $\|A\|_1$ is the sparsity regularization, and $\lambda$ is the balance parameter.

Equation (1) was originally designed for reconstruction tasks. In order to use dictionary learning for classification tasks, more discriminative or supervision information is incorporated into the dictionary learning. Its optimization problem can then be written as

$$\min_{D, A} \; \|X - DA\|_F^2 + \lambda \|A\|_1 + \eta\, f(A), \tag{2}$$

where the function $f(\cdot)$ can be a classifier, a discrimination criterion, or a label consistency term, and $\eta$ is a balance parameter.
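For intuition, the following is a minimal sketch of learning a dictionary and sparse codes in the spirit of equation (1), using scikit-learn's DictionaryLearning on random placeholder data. Note that scikit-learn stores samples as rows, so the roles of D and A appear transposed relative to equation (1).

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Minimal sketch of equation (1): learn a dictionary and sparse codes so that
# X is approximated by the codes times the dictionary, with an l1 penalty on
# the codes. Random data stands in for face features.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 64))          # 200 samples, 64-dimensional features

dl = DictionaryLearning(n_components=32,    # number of dictionary atoms m
                        alpha=1.0,          # balance parameter (lambda in eq. (1))
                        max_iter=100,
                        random_state=0)
A = dl.fit_transform(X)                     # sparse coefficients, shape (200, 32)
D = dl.components_                          # dictionary atoms, shape (32, 64)

recon_err = np.linalg.norm(X - A @ D)       # Frobenius reconstruction error
print(recon_err, np.mean(A != 0))           # error and fraction of nonzero codes
```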

3. Joint Subspace and Low-Rank Coding Method for Makeup Face Recognition

3.1. Objective Function of JSLC

Because the appearance of a person's face changes significantly after makeup, in this study we use subspace learning to project the original data samples and preserve the discriminative information in the feature subspace. Subspace learning embedded into dictionary learning can be represented as

$$J_1(W, D, A) = \|WX - DA\|_F^2 + \lambda \|A\|_1 + \alpha \|X - W^{\top}WX\|_F^2, \tag{3}$$

where $W \in \mathbb{R}^{p \times d}$ is the projection matrix, $p$ is the dimension of the subspace, and $\lambda$ and $\alpha$ are two positive parameters. $J_1$ has three terms. The first two are the dictionary learning terms in the subspace, and their goal is to minimize the representation error. The third is the regularization term, which plays the role of principal component analysis (PCA), by which the discriminant information in the original space can be preserved in the projection subspace [20].
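To make the roles of the three terms concrete, the following minimal sketch evaluates an objective of the form in equation (3) for given W, D, and A. The concrete regularizers are assumptions based on the description above, and the data are random placeholders.

```python
import numpy as np

def j1_objective(X, W, D, A, lam, alpha):
    """Evaluate a subspace-embedded dictionary learning objective of the
    assumed form in equation (3):
      ||W X - D A||_F^2 + lam * ||A||_1 + alpha * ||X - W^T W X||_F^2.
    X: d x n data, W: p x d projection, D: p x m dictionary, A: m x n codes."""
    recon = np.linalg.norm(W @ X - D @ A) ** 2           # representation error in the subspace
    sparse = lam * np.abs(A).sum()                       # sparsity regularization
    pca = alpha * np.linalg.norm(X - W.T @ W @ X) ** 2   # PCA-like preservation term
    return recon + sparse + pca

# toy shapes: d=64 original dim, p=16 subspace dim, m=32 atoms, n=100 samples
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 100))
W = rng.standard_normal((16, 64))
D = rng.standard_normal((16, 32))
A = rng.standard_normal((32, 100))
print(j1_objective(X, W, D, A, lam=1.0, alpha=1.0))
```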

Then, we consider using an affinity matrix $Q$ to measure the discriminant ability of the sparse codes; i.e., if two face images are from the same person and look similar, the difference between their sparse codes is minimized; if two face images are from different persons but look similar, the difference between their sparse codes is maximized, so that discriminative information can be exploited. This idea can be represented as

$$J_2(A) = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} Q_{ij}\,\|a_i - a_j\|_2^2, \tag{4}$$

where $a_i$ is the sparse code of image $x_i$.

The element $Q_{ij}$ of matrix $Q$ can be written as

$$Q_{ij} = \begin{cases} 1, & x_j \in N_k(x_i) \ \text{and} \ l(x_i) = l(x_j), \\ -1, & x_j \in N_k(x_i) \ \text{and} \ l(x_i) \neq l(x_j), \\ 0, & \text{otherwise}, \end{cases} \tag{5}$$

where the function $N_k(x_i)$ returns the $k$-nearest neighbors of image $x_i$, $l(x_i) = l(x_j)$ means that images $x_i$ and $x_j$ are from the same person, and $l(x_i) \neq l(x_j)$ means that images $x_i$ and $x_j$ are from different persons.

We denote by $S$ the diagonal matrix whose diagonal elements are the sums of the row elements of $Q$, i.e., $S_{ii} = \sum_{j} Q_{ij}$. The term $J_2(A)$ can be simplified as

$$J_2(A) = \operatorname{tr}\!\left(A L A^{\top}\right), \tag{6}$$

where $L = S - Q$.
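The construction of Q and the simplified trace form can be sketched as follows. The neighbor search, the symmetrization of Q (added so the trace identity holds exactly), and the helper names are illustrative choices, not taken from the source.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_affinity(X, labels, k=5):
    """Build the affinity matrix Q described above: +1 for k-nearest neighbors
    with the same label, -1 for k-nearest neighbors with a different label,
    0 otherwise. X is d x n with columns as images."""
    n = X.shape[1]
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X.T)   # +1 because each point is its own neighbor
    _, idx = nn.kneighbors(X.T)
    Q = np.zeros((n, n))
    for i in range(n):
        for j in idx[i, 1:]:                            # skip the point itself
            Q[i, j] = 1.0 if labels[i] == labels[j] else -1.0
    return (Q + Q.T) / 2                                # symmetrize

def trace_term(A, Q):
    """Compute tr(A L A^T) with L = S - Q, where S holds the row sums of Q."""
    L = np.diag(Q.sum(axis=1)) - Q
    return np.trace(A @ L @ A.T)

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 60))
labels = np.repeat(np.arange(6), 10)                    # 6 persons, 10 images each
A = rng.standard_normal((32, 60))
print(trace_term(A, build_affinity(X, labels)))
```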

In order to obtain more discriminative information in the subspace, we impose a low-rank constraint on $A$ in the dictionary learning. Following [21], we represent the rank of $A$ through the factorization $A = EH$, where $E \in \mathbb{R}^{m \times r}$ and $H \in \mathbb{R}^{r \times n}$ with $r \ll \min(m, n)$. Thus, the objective function of the rank minimization can be written as

$$J_3(A, E, H) = \|A - EH\|_F^2. \tag{7}$$
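If the factorization surrogate is read as in equation (7) (this concrete reading is an assumption; the source does not show the formula), then with A fixed the subproblems over E and H are least-squares problems with closed-form solutions, which can be sketched as alternating updates:

```python
import numpy as np

def low_rank_factorize(A, r, n_iter=50):
    """Alternating least-squares sketch of the rank-r surrogate A ≈ E H
    (one possible reading of the factorization described above):
    E is m x r, H is r x n, and each subproblem has a closed-form solution."""
    m, n = A.shape
    rng = np.random.default_rng(0)
    E = rng.standard_normal((m, r))
    H = rng.standard_normal((r, n))
    eps = 1e-8                                    # tiny ridge for numerical stability
    for _ in range(n_iter):
        # closed-form update of H with E fixed
        H = np.linalg.solve(E.T @ E + eps * np.eye(r), E.T @ A)
        # closed-form update of E with H fixed
        E = A @ H.T @ np.linalg.inv(H @ H.T + eps * np.eye(r))
    return E, H

A = np.random.default_rng(1).standard_normal((32, 100))
E, H = low_rank_factorize(A, r=5)
print(np.linalg.norm(A - E @ H))                  # residual of the rank-5 approximation
```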

We combine $J_1$, $J_2$, and $J_3$ together and obtain the objective function of JSLC, i.e.,

$$\min_{W, D, A, E, H} \; J_1(W, D, A) + \beta J_2(A) + \gamma J_3(A, E, H), \tag{8}$$

where $\beta$ and $\gamma$ are positive balance parameters.

Equation (8) is clearly a joint learning formulation for subspace learning and dictionary learning. During the optimization process, the subspace gradually enhances the discriminative ability of the learned dictionary, and the learned dictionary in turn improves the discriminative ability of the subspace.

3.2. Optimization

In this subsection, we solve equation (8) using an alternating optimization strategy. After introducing suitable auxiliary variables, the objective function of JSLC can be rewritten as equation (9).

(1) Update W: with B, A, E, and H fixed, we obtain the subproblem in equation (10), which has a closed-form solution. We obtain R from equation (11) and solve Z from equation (12), which also admits a closed-form solution. Then, W is obtained by equation (13).

(2) Update D: with A, W, E, and H fixed, we obtain the subproblem in equation (14). We use the Lagrange dual approach to solve equation (14); the closed-form solution of D is given by equation (15), in which a very small diagonal matrix is involved. Then, the matrix B is obtained from D through the pseudo-inverse operation.

(3) Update A: with W, D, E, and H fixed, the objective function is rewritten as equation (16). Since each term in equation (16) is quadratic, setting the derivative with respect to A to zero yields equation (17). Equation (17) is a standard Sylvester equation, which we solve with the Bartels-Stewart algorithm [22] (a sketch of this step is given after the list).

(4) Update E and H: with W, D, and A fixed, equation (9) reduces to a subproblem over E and H, whose closed-form solutions are given by equations (18) and (19).
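As an illustration of step (3), the following is a minimal sketch of solving a Sylvester equation of the form PA + AQ = R with SciPy's Bartels-Stewart-based solver. The matrices P, Q, and R here are random placeholders, not the actual coefficient matrices of equation (17), which depend on the derivation omitted above.

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Sketch of the A-update in step (3), assuming the stationarity condition can
# be arranged as a Sylvester equation P A + A Q = R. P, Q, and R below are
# random placeholders, not the coefficient matrices of equation (17).
rng = np.random.default_rng(0)
m, n = 50, 200                        # m: dictionary atoms, n: training samples
P = rng.standard_normal((m, m))
P = P @ P.T + m * np.eye(m)           # make P symmetric and well conditioned
Q = rng.standard_normal((n, n))
Q = Q @ Q.T + n * np.eye(n)
R = rng.standard_normal((m, n))

A = solve_sylvester(P, Q, R)          # Bartels-Stewart algorithm [22]
print(np.allclose(P @ A + A @ Q, R))  # verify the residual is negligible
```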

When we obtain the optimal dictionary $D$ and projection matrix $W$, we can obtain the sparse coding of a testing image $x_t$ by

$$\min_{a_t} \; \|W x_t - D a_t\|_2^2 + \lambda \|a_t\|_1. \tag{20}$$

Finally, we use the closest-distance strategy to perform the testing task.
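A minimal sketch of the test stage is given below: project the probe image, encode it over the learned dictionary (here with scikit-learn's Lasso as a stand-in l1 solver for equation (20)), and match by the smallest distance between codes. The function and variable names are illustrative, not from the source.

```python
import numpy as np
from sklearn.linear_model import Lasso

def encode(x, W, D, lam=0.1):
    """Sparse coding of a single image x over dictionary D in the subspace W.
    Uses Lasso as a stand-in l1 solver; the paper's exact coding formulation
    may differ."""
    y = W @ x                                        # project into the subspace
    lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
    lasso.fit(D, y)                                  # y ≈ D a with an l1 penalty on a
    return lasso.coef_

def match(x_probe, gallery_codes, gallery_ids, W, D):
    """Return the identity whose gallery code is closest to the probe code."""
    a = encode(x_probe, W, D)
    dists = np.linalg.norm(gallery_codes - a, axis=1)
    return gallery_ids[int(np.argmin(dists))]

# toy usage with random placeholders
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64))
D = rng.standard_normal((16, 32))
gallery = rng.standard_normal((10, 64))
ids = np.arange(10)
codes = np.stack([encode(g, W, D) for g in gallery])
print(match(gallery[3], codes, ids, W, D))           # probe matches gallery identity 3
```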

Based on the above analysis, the proposed JSLC method is presented in Algorithm 1.

 Input: a dataset of facial images X, including images with makeup and images without makeup;
 Output: dictionary D and projection matrix W.
 Initialization: random matrix B; construct R's columns using the eigenvectors corresponding to the top p eigenvalues of C;
 Repeat
   Update W using equation (13) with D, A, E, and H fixed;
   Update D using equation (15) with A, W, E, and H fixed;
   Update A using equation (17) with W, D, E, and H fixed;
   Update E and H using equations (18) and (19) with W, D, and A fixed;
 Until converged

4. Experiments

4.1. Datasets and Experimental Settings

In the experiment, we use the widely used DFW face dataset [11]. The DFW dataset contains 11,155 images of 1,000 people collected from the Internet, including face images of movie stars, singers, athletes, and politicians. Each person has one face image without makeup and multiple face images with makeup, and there are differences in pose, age, lighting, and expression. Wearing glasses or hats is also treated as a category of makeup. Example face images from the DFW dataset are shown in Figure 2. In this paper, we use the histogram of oriented gradients (HOG) [23], local binary patterns (LBP) [24], and three-patch LBP (TPLBP) [25] to extract features from facial images. The HOG algorithm uses an image block size of 16 × 16, and the extracted features have 1764 dimensions. LBP divides each face image into 16 non-overlapping regions of 16 × 16 pixels and extracts 3776-dimensional features. The TPLBP algorithm uses an image block size of 16 × 16, and the extracted features have 4096 dimensions. We randomly select 2000 images of 200 people. We reduce the extracted features to 500 dimensions by principal component analysis (PCA).
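A rough sketch of this feature pipeline with scikit-image and scikit-learn is shown below. The parameter choices, the omission of TPLBP (which has no scikit-image implementation), and the random placeholder images are assumptions; the resulting feature dimensions will not match the figures quoted above.

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.decomposition import PCA

def hog_feature(img):
    """HOG descriptor on a grayscale image with 16x16 pixel cells
    (an approximation of the setting described above)."""
    return hog(img, orientations=9, pixels_per_cell=(16, 16),
               cells_per_block=(2, 2), feature_vector=True)

def lbp_feature(img, grid=4, n_bins=59):
    """Uniform LBP histograms over a grid of non-overlapping regions."""
    lbp = local_binary_pattern(img, P=8, R=1, method="nri_uniform")
    h, w = img.shape
    feats = []
    for i in range(grid):
        for j in range(grid):
            block = lbp[i * h // grid:(i + 1) * h // grid,
                        j * w // grid:(j + 1) * w // grid]
            hist, _ = np.histogram(block, bins=n_bins, range=(0, n_bins))
            feats.append(hist)
    return np.concatenate(feats).astype(float)

# stack features of all images and reduce to 500 dimensions with PCA
rng = np.random.default_rng(0)
images = (rng.random((600, 64, 64)) * 255).astype(np.uint8)   # placeholder face crops
F = np.stack([np.concatenate([hog_feature(im), lbp_feature(im)]) for im in images])
F500 = PCA(n_components=500).fit_transform(F)
print(F500.shape)                                             # (600, 500)
```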

To validate the effectiveness of our approach, we compare its performance with the following methods: LLC [26], LMNN [27], PRDC [28], NCA [29], and RDML-CCPVL [30]. We set the subspace dimension in the grid {100, 200, 300, 400, 450} and the number of dictionary atoms in the grid {200, 300, ..., 600}. The parameters λ, α, β, and γ are set in the grid {0.5, 1, ..., 5}. All parameters of the compared methods are set according to their default settings. We use 5-fold cross-validation to obtain the optimal parameters, and the average results over the five folds are taken as the final result; a sketch of this search procedure is given below.
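The parameter search can be sketched as follows. Here `train_and_score` is a hypothetical stand-in for training JSLC with one parameter setting and reporting the Rank-1 matching rate, and the grid spacing is an assumption.

```python
import itertools
import numpy as np
from sklearn.model_selection import KFold

def train_and_score(X_tr, y_tr, X_va, y_va, lam, alpha, beta, gamma):
    """Hypothetical placeholder: train JSLC with the given parameters on the
    training fold and return the Rank-1 matching rate on the validation fold.
    Here it only returns a dummy score so the sketch runs end to end."""
    return float(np.random.default_rng(0).random())

def grid_search(X, y, grid=(0.5, 1, 2, 3, 4, 5), n_splits=5):
    """5-fold cross-validated grid search over (lam, alpha, beta, gamma)."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    best_params, best_score = None, -np.inf
    for lam, alpha, beta, gamma in itertools.product(grid, repeat=4):
        scores = [train_and_score(X[tr], y[tr], X[va], y[va], lam, alpha, beta, gamma)
                  for tr, va in kf.split(X)]
        if np.mean(scores) > best_score:
            best_params, best_score = (lam, alpha, beta, gamma), float(np.mean(scores))
    return best_params, best_score
```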

4.2. Experimental Results

Table 1 shows the comparison of JSLC, based on HOG features, with the comparison algorithms in terms of matching rate. The results show the following: (1) JSLC achieves the best results at Rank 1, Rank 5, Rank 10, and Rank 15 of the matching rate. JSLC uses a dictionary learning framework and combines subspace and low-rank learning, which can effectively mine the discriminative information of different face images. (2) The comparison algorithm PRDC is mainly based on relative distance comparison, and LMNN mainly uses large-margin information between samples; neither can make full use of the discriminative information in the images, so they still show relatively poor performance. Although RDML-CCPVL uses a deep discriminative metric learning method, the clustering method it uses cannot exploit all the effective information in the images, so its performance does not reach ideal results. Tables 2 and 3 show the comparison of JSLC with the comparison algorithms in terms of matching rate based on LBP and TPLBP features, respectively. Results similar to those on the HOG features are obtained: JSLC achieves the best matching performance compared with the other methods. The results in Tables 1-3 also indicate that HOG, LBP, and TPLBP features are suitable for extracting makeup face feature vectors. Bold values indicate the best results in the tables.

Figures 3 and 4 show the Rank-1 values of JSLC using HOG features with different subspace dimensions and numbers of dictionary atoms. The results in Figures 3 and 4 show that setting the subspace dimension to 400 and the number of dictionary atoms to 450 is feasible. In the JSLC method, the parameters λ, α, β, and γ are related to the performance of the model. Next, we analyze these four parameters. Using LBP features and with the other parameters fixed, Figure 5 shows the average Rank-1 value of the JSLC method for different values of λ, α, β, and γ.

First, we discuss the effect of λ in JSLC. The parameter λ controls the role of the sparse regularization term. The results in Figure 5(a) show that when λ = 1, the average Rank 1 achieves the best performance; in addition, the differences in model performance for different values of λ are modest. The parameter α controls the role of the PCA regularization term: the larger the value of α, the larger the proportion of the PCA term in the objective function. The results in Figure 5(b) show that different values of α lead to different performance of JSLC, but we cannot find a clear relationship between α and the matching rate; therefore, determining the optimal value by grid search is feasible. Next, we consider the effect of β. The parameter β controls the role of the affinity matrix in JSLC. The results in Figure 5(c) show that the matching rate of JSLC is sensitive to β; when β = 4, the matching rate is highest, so grid search for β is feasible. Finally, we discuss the effect of γ in JSLC. The results in Figure 5(d) show that the matching rate of JSLC is also sensitive to γ. The parameter γ controls the role of the low-rank term; when its value is too small or too large, the low-rank term cannot exploit the intrinsic data structure of the face images.

5. Conclusion

In this study, a joint subspace and low-rank coding method is proposed for makeup face recognition. Based on the dictionary learning framework, subspace learning and low-rank coding are performed jointly, so that the discriminative information of face images can be exploited. Experimental results on DFW show the good performance of our method. In the future, we will carry out makeup face recognition and verification on more complex datasets and in more scenes, such as under various illumination, pose, and expression conditions. How to incorporate deep features of face images into our method is also our next step.

Data Availability

The labeled datasets used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research activities described in this paper have been conducted within the Qinglan Project of Jiangsu Province under Grant no. Q019001, the Scientific Research Project of Changzhou Institute of Technology under Grant no. YB201813101005, Youth Innovation Fund Project of Changzhou Institute of Technology under Grant nos. QN202013101002 and HKKJ2020-37, National Natural Science Foundation of China under Grant no. 61806026, Natural Science Foundation of Jiangsu Province under Grant no. BK20180956, and Project of Jiangsu Education Science in the 13th Five-Year Plan in 2018 under Grant no. B-a/2018/01/41.