Kinship Verification Using Facial Images by Robust Similarity Learning
Kinship verification from face images is a new and challenging problem in pattern recognition and computer vision, and it has many potential real-world applications, including social media analysis and child adoption. Most existing methods for kinship verification assume that each positive pair of face images (with kin relationship) has a greater similarity score than every negative pair (without kin relationship) under a distance metric to be learned. In practice, however, this assumption is usually too strict for real-life kin samples. Instead, we propose in this paper learning a robust similarity model, under which the similarity score of each positive pair is greater than the average similarity score of some negative ones. In addition, we develop an online similarity learning algorithm for more scalable application. We empirically evaluate the proposed methods on benchmark datasets, and experimental results show that our method outperforms some state-of-the-art kinship verification methods in terms of verification accuracy and computational efficiency.
The aim of kinship verification via biometrics is to determine whether a given pair of face images of two people has a kin relationship. Recent evidence in psychology has demonstrated that facial appearance is a key cue for identifying kinship [1–6], because two people who are biologically related usually have higher facial similarity than unrelated people. Inspired by this, many researchers in the computer vision community have investigated the problem of kinship verification from facial images in recent years, and some encouraging progress has been made in this area. However, most existing similarity learning methods for kinship verification face two critical difficulties: (i) compared to face verification, kinship verification is even more difficult, because its objective is to determine whether there is a kinship relation between two images of two people, rather than whether two images show the same individual; (ii) the problem is still extremely challenging, because there are large variations in lighting, pose, expression, background, and ethnicity in the face images, especially when they are captured under uncontrolled conditions.
To address the above difficulties, we propose an online similarity learning with average strategy (OSL-A) method for kinship verification from facial images, which applies a good similarity strategy to learn a sparse linear similarity measurement for kinship verification. Figure 1(a) shows several sampled positive pairs with kinship relation from the KinFaceW datasets. Compared to the state-of-the-art metric learning based kinship verification methods, OSL-A has the following advantages:
(1) An online similarity learning algorithm is presented for kinship verification in this paper. Unlike the batch training used in previous studies, our approach scales well to ever-growing kin datasets.
(2) To the best of our knowledge, very little attention has been devoted to the problem that interclass samples often have higher similarity than intraclass samples in kinship verification. To address this challenge, we propose in this work a good similarity strategy: the similarity of intraclass samples should be larger than the average similarity of some randomly selected interclass samples. This relaxed rule has demonstrated its robustness in kinship verification in our experiments.
(3) From the computational viewpoint, OSL-A is clearly superior to most existing metric learning methods. We design a sparse linear similarity measurement model for kinship verification using a diagonal matrix to replace a square matrix ( is the dimensionality of the face descriptor). Consequently, computational complexity of OSL-A is drastically reduced from to .
(4) Although our approach has a lower computational cost, OSL-A achieves performance competitive with state-of-the-art metric learning based kinship verification methods.
The remainder of this paper is organized as follows. Section 2 briefly reviews related work. Section 3 details our proposed method. Section 4 presents the experimental results, and the last section concludes the paper.
2. Related Work
In this section, we review two related topics: (1) kinship verification and (2) metric learning.
2.1. Kinship Verification
To the best of our knowledge, Fang et al. first proposed a computational method to tackle the challenge of kinship verification from facial images. Since then, kinship verification has become an active research topic in computer vision, and some seminal research results [9–16] have been obtained over the past five years. Fang et al. classified pairs of face images as “related” or “unrelated” by using novel feature extraction and selection methods. Guo and Wang suggested a method for familial trait extraction and a stochastic combination scheme. Zhou et al. [12, 17] proposed a spatial pyramid learning based (SPLE) feature descriptor and a Gabor-based Gradient Orientation Pyramid (GGOP) feature to represent facial images, integrated with a support vector machine (SVM) classifier for kinship verification. The above-mentioned methods are all feature-based: they aim to extract discriminative features so that each face image is represented as a compact feature vector, with intraclass variations reduced and interclass variations increased as much as possible. The other category of methods for kinship verification is learning based. Recently some discriminative learning algorithms [10, 11, 14, 16, 18] have been proposed to learn an effective classifier from facial images in order to verify kinship. Xia et al. [10, 11, 14] constructed the UB KinFace and FamilyFace datasets and bridged the great discrepancy between children and their old parents using a transfer subspace learning method. Lu et al. collected two kinship datasets named KinFaceW-I and KinFaceW-II and proposed a new neighborhood repulsed metric learning method. More recently, Yan et al. jointly learned multiple distance metrics with multiple features to exploit complementary and discriminative information for verification. Zhou et al. proposed a similarity learning method for the verification problem by formulating similarity learning in the ensemble learning framework to enhance the generalization ability of the prediction model.
However, two limitations remain in previous studies [9–16, 18]. (i) In some cases, two people without kin relationship may have higher facial similarity than kin-related individuals, yet few studies have attempted to tackle this problem. In this work, we propose a relaxed rule in place of the overly strict strategy: the similarity of kin pairs (with kinship relation) should be higher than the average similarity of some randomly selected nonkin pairs (without kinship relation), rather than higher than the similarity of every nonkin pair. (ii) Previous studies use batch learning, which often scales poorly and may not even scale up to medium-scale applications; in contrast, we present an online learning algorithm that is fast and easy to train.
2.2. Metric Learning
In recent years, much attention has been paid to metric learning, and some representative algorithms [8, 15, 18–28] have been proposed and applied to many computer vision problems, such as neighborhood component analysis (NCA), large margin nearest neighbor (LMNN), information theoretic metric learning (ITML), cosine similarity metric learning (CSML), large scale metric learning (LSML), sparse pairwise constraints metric learning (SPCML), neighborhood repulsed metric learning (NRML), and discriminative multimetric learning (DMML). However, most existing distance metric learning methods [8, 18–28] are designed to learn a full square distance metric matrix. If the feature descriptor has large dimensionality, such a metric learning algorithm has a very high computation cost (). In contrast, OSL-A uses a diagonal sparse matrix with lower computation cost (). The experimental results clearly show that our algorithm is highly competitive with state-of-the-art metric learning based kinship verification methods.
3. Proposed Approach
The resemblance between human faces is generally accepted as an important cue in recognizing the kinship between parents and children. However, some nonkin pairs have higher similarity than kin pairs, and these highly similar nonkin pairs are a major obstacle in the kinship verification task. To address this, we propose a novel OSL-A method for kinship verification, based on a relaxed rule: the similarity of a kin pair should be larger than the average similarity of a group of randomly selected nonkin pairs.
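The difference between the strict assumption and the relaxed rule can be illustrated with a few numbers. The following is a minimal sketch; the function names and the similarity scores are hypothetical and invented for illustration only, and are not taken from our experiments.

```python
import numpy as np


def strict_rule_holds(pos_score, neg_scores):
    """Strict assumption: the kin pair must score above every nonkin pair."""
    return bool(pos_score > np.max(neg_scores))


def relaxed_rule_holds(pos_score, neg_scores):
    """Relaxed rule: the kin pair must score above the average nonkin score."""
    return bool(pos_score > np.mean(neg_scores))


# Hypothetical scores: one look-alike nonkin pair (0.75) beats the kin pair.
pos_score = 0.60
neg_scores = np.array([0.75, 0.40, 0.35, 0.30])

strict_ok = strict_rule_holds(pos_score, neg_scores)    # violated by the look-alike
relaxed_ok = relaxed_rule_holds(pos_score, neg_scores)  # satisfied: mean is 0.45
```

In this toy case a single look-alike nonkin pair breaks the strict constraint, while the averaged constraint is still satisfied, which is the situation the relaxed rule is meant to tolerate.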
We aim to learn a pairwise similarity function , according to supervision on the relative similarity between two pairs of images. First formally, let be the training set of pairs of kinship images, where are the th parent and child image vector, respectively. The target of OSL-A is to achieve a good metric and then compute similarity function in (1) such that the distance between and is as small as possible; simultaneously, the distance between and (, ) is as large as possible, where with . We assign higher similarity value to the positive sample pairs with a certain margin as follows:
To learn a similarity measurement that obeys the constraints in (2), we define a loss function for a single quadruple:
To minimize the loss , our OSL-A is based on the online passive-aggressive algorithm . Hence, we formulate OSL as the following convex optimization problem with a certain margin:
To address the problem in (5), we define the Lagrangian as where . We can achieve the optimal solution when the gradient vanishes:where is the gradient matrix:
In our approach, we iteratively update the similarity matrix with a newly sampled random quadruple until the maximum number of iterations is reached. The proposed OSL-A algorithm is summarized in Algorithm 1.
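The online procedure can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a transcription of Algorithm 1: it assumes a diagonal bilinear similarity S(p, c) = sum_k w_k p_k c_k, a hinge loss with unit margin against the average of a few randomly selected nonkin children, and the standard passive-aggressive (PA-I) step size min(C, loss / ||v||^2). The names `similarity`, `hinge_loss`, `pa_update`, and `osl_a_fit`, the initialization to ones, and all default values are hypothetical, and no explicit sparsity term is included.

```python
import numpy as np


def similarity(w, p, c):
    """Diagonal bilinear similarity (assumed form): sum_k w_k * p_k * c_k."""
    return float(np.sum(w * p * c))


def hinge_loss(w, p, c, c_bar):
    """Loss is zero once the kin pair beats the averaged nonkin child by margin 1."""
    return max(0.0, 1.0 - similarity(w, p, c) + similarity(w, p, c_bar))


def pa_update(w, p, c, c_bar, C=1.0):
    """One passive-aggressive step on the diagonal weights."""
    loss = hinge_loss(w, p, c, c_bar)
    if loss == 0.0:
        return w  # passive: the constraint is already satisfied
    v = p * (c - c_bar)  # gradient of S(p, c) - S(p, c_bar) with respect to w
    sq_norm = float(v @ v)
    if sq_norm == 0.0:
        return w  # degenerate tuple, nothing to learn from
    tau = min(C, loss / sq_norm)  # aggressive step, capped by C
    return w + tau * v


def osl_a_fit(parents, children, n_neg=4, C=1.0, n_iter=2000, seed=0):
    """Online training: each iteration samples one kin pair and a few nonkin children."""
    rng = np.random.default_rng(seed)
    n, d = parents.shape
    w = np.ones(d)  # assumption: start from the identity (all-ones diagonal)
    for _ in range(n_iter):
        i = int(rng.integers(n))
        others = np.delete(np.arange(n), i)
        neg_idx = rng.choice(others, size=min(n_neg, n - 1), replace=False)
        # For a bilinear similarity, the average of the nonkin similarities
        # equals the similarity to the averaged nonkin child.
        c_bar = children[neg_idx].mean(axis=0)
        w = pa_update(w, parents[i], children[i], c_bar, C=C)
    return w
```

Each step touches only a length-d weight vector rather than a full square matrix, which is where the per-iteration saving of a diagonal model comes from in this sketch.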
4. Experiments and Results
To evaluate the proposed OSL-A approach for kinship verification, we perform experiments on two public datasets. In this section, we first describe the experimental datasets and settings and then discuss and analyze the experimental results.
4.1. Datasets and Experimental Settings
KinFaceW-I and KinFaceW-II (http://www.kinfacew.com) are two publicly available face kinship datasets. The difference between the two datasets is that the two facial images of each pair in KinFaceW-I come from different photos, while those in KinFaceW-II were collected from the same photo. Four kinship relation subsets exist in both the KinFaceW-I and KinFaceW-II datasets: Father-Son (FS), Father-Daughter (FD), Mother-Son (MS), and Mother-Daughter (MD). In the KinFaceW-I dataset, there are 134, 156, 127, and 116 pairs of facial images for the FS, FD, MS, and MD kinship relations, respectively. In the KinFaceW-II dataset, there are 250 pairs of facial images for each kinship relation. Figure 1(a) shows some positive samples of kinship images.
In our experiments, we align the face images and crop them into pixels based on the provided eyes positions in each dataset. We adopt a fivefold cross-validation strategy on the two kinship datasets, where the face image pairs of each subset are divided uniformly into five folds. We use pairs of face images with kinship relation as positive samples. In addition, we generate each negative pair by first choosing one parent image and then generating a child image computed from a few child images that have no kin relation to that parent.
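The fold split and the negative pair generation can be sketched as follows. This is a minimal sketch under assumptions: the "computed" child is taken to be the mean of a few randomly chosen unrelated child feature vectors, and the names `five_fold_indices` and `make_negative_children`, the parameter `n_avg`, and the random shuffling are hypothetical choices rather than details stated above.

```python
import numpy as np


def five_fold_indices(n_pairs, seed=0):
    """Shuffle the pair indices and split them into five nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_pairs), 5)


def make_negative_children(children, n_avg=4, seed=0):
    """For parent i, average n_avg child vectors drawn from the other pairs."""
    rng = np.random.default_rng(seed)
    n = children.shape[0]
    negatives = np.empty_like(children, dtype=float)
    for i in range(n):
        others = np.delete(np.arange(n), i)  # exclude the parent's own child
        picked = rng.choice(others, size=min(n_avg, n - 1), replace=False)
        negatives[i] = children[picked].mean(axis=0)
    return negatives
```

Row i of the returned array is then paired with parent i to form one negative sample.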
In our experiments, we apply two face descriptors, histogram of oriented gradients (HOG) and SIFT, to extract face features. For HOG, each face image is first divided into nonoverlapping blocks with the size of pixels and then divided into nonoverlapping blocks with size of pixels. We obtain a 9-dimensional HOG feature for each block, and these features are concatenated to form a 2880-dimensional vector. For SIFT, one 128-dimensional feature is computed over each patch, where the spacing between two neighboring patches is 8 pixels. Finally, the SIFT features are concatenated into a 6272-dimensional vector.
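The stated descriptor lengths can be checked with simple arithmetic. The sketch below assumes 64 × 64 crops, two HOG grids with block sizes of 4 × 4 and 8 × 8 pixels, and 16 × 16 SIFT patches; these sizes are assumptions chosen because they reproduce the 2880 and 6272 dimensions quoted above, and other configurations could give the same totals. The function names are hypothetical.

```python
def hog_dim(image_size=64, block_sizes=(4, 8), bins=9):
    """Total HOG length: one 9-bin histogram per nonoverlapping block, per grid."""
    return sum((image_size // b) ** 2 * bins for b in block_sizes)


def dense_sift_dim(image_size=64, patch_size=16, stride=8, desc_len=128):
    """Total SIFT length: one 128-d descriptor per patch on a regular grid."""
    per_side = (image_size - patch_size) // stride + 1
    return per_side ** 2 * desc_len


hog_total = hog_dim()          # (16*16 + 8*8) blocks * 9 bins
sift_total = dense_sift_dim()  # 7*7 patches * 128 dimensions
```

Under these assumptions the HOG grids contribute 256 + 64 = 320 blocks and the SIFT grid contributes 49 patches.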
4.2. Results and Analysis
4.2.1. Comparisons of the Single Similarity and the Average Similarity
To better understand the basic idea behind the proposed OSL-A approach, we perform an experiment comparing the similarity of a positive pair, a single negative pair (without kin relationship but with a high similarity score), and the average over several negative pairs. As shown in Figure 1(b), for each row, the face images in columns 1 and 2 are a pair of faces with kinship relation, the face images in columns 1 and 3 are a pair without kin relation but with high similarity, and the face images in columns 4 to 7 are randomly selected facial images without kin relation to the face image in column 1. We list the similarity of these examples in Table 1. It can be observed from the table that the similarity scores of some positive pairs are lower than those of the single negative pairs, while they are greater than the average scores of the negative pairs. These results indicate that the proposed OSL-A approach effectively addresses the problem that negative pairs sometimes show greater resemblance than positive ones.
4.2.2. Comparisons with OSL Method
We summarize the verification rates of OSL-A and of the OSL method without the average strategy (OSL) on the KinFaceW-I and KinFaceW-II datasets in Tables 2 and 3. As shown in these two tables, OSL-A surpasses the OSL method on all subsets. On the KinFaceW-I dataset, the gains in accuracy are 4.18% and 3.21% on the F-S subset, 4.07% and 2.28% on the F-D subset, 3.92% and 0.46% on the M-S subset, 12.69% and 4.26% on the M-D subset, and 6.22% and 2.55% on the mean accuracy. On the KinFaceW-II dataset, the gains are 4.80% and 0.80% on the F-S subset, 4.00% and 1.80% on the F-D subset, 1.20% and 0.40% on the M-S subset, 4.60% and 2.40% on the M-D subset, and 3.65% and 1.35% on the mean accuracy.
4.2.3. Comparisons with NRML Method
We also tabulate the verification rates of OSL-A and the NRML method on the KinFaceW-I and KinFaceW-II datasets in Tables 4 and 5. On the KinFaceW-I dataset, the proposed OSL-A is slightly worse than NRML on the F-S subset, slightly better on the F-D and M-S subsets, and clearly better on the M-D subset. Furthermore, it is significantly superior to NRML on all subsets of the KinFaceW-II dataset and surpasses NRML in mean accuracy on both datasets.
To better present and compare the verification performance of the different methods, we provide the receiver operating characteristic (ROC) curves in Figures 2(a) and 2(b). From the two figures, we can conclude that OSL-A achieves performance highly competitive with NRML on KinFaceW-I and better performance on KinFaceW-II. In addition, OSL-A consistently outperforms OSL on the two datasets.
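For reference, an ROC curve for a verification task is obtained by sweeping a decision threshold over the similarity scores of the test pairs. The following is a minimal sketch in plain NumPy; the function names are hypothetical, tied scores are not merged into a single threshold, and at least one positive and one negative pair are assumed.

```python
import numpy as np


def roc_curve_points(scores, labels):
    """Return (fpr, tpr) arrays, starting at (0, 0); labels are 1 for kin, 0 for nonkin."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-scores, kind="stable")  # highest similarity first
    sorted_labels = labels[order]
    tp = np.cumsum(sorted_labels)   # kin pairs accepted so far
    fp = np.cumsum(~sorted_labels)  # nonkin pairs accepted so far
    tpr = np.concatenate([[0.0], tp / labels.sum()])
    fpr = np.concatenate([[0.0], fp / (~labels).sum()])
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

A curve closer to the top-left corner, or equivalently a larger area under the curve, indicates that kin pairs tend to receive higher similarity scores than nonkin pairs.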
4.2.4. Parameter Analysis
We also study the impact of the parameters and on the proposed OSL-A method. The mean verification accuracy of OSL-A versus different values of parameter and parameter on the KinFaceW-I and KinFaceW-II datasets is shown in Figures 3(a) and 3(b), respectively. From Figure 3, we can observe that OSL-A achieves the best recognition performance when parameters , are set to 1 and 1000, respectively. In addition, OSL-A shows stable recognition performance across different values of parameters and .
4.2.5. Computational Complexity
First, let us briefly analyze the computational complexity of our proposed OSL-A method, which includes iterations. In addition to solving a standard eigenvalue equation, the two matrices and ν need to be updated during each iteration. The time complexity of computing the two tasks is and in each iteration, respectively. Therefore, the computational complexity of OSL-A is .
We then compare the average running time of the proposed OSL-A method and the NRML method. The hardware configuration used for our experiments is an i5-4210U dual-core CPU with 4 GB RAM, and both methods are implemented in MATLAB. The running times of NRML and OSL-A are shown in Table 6.
5. Conclusions and Future Work
In this paper, we have proposed a robust similarity learning method for kinship verification. Our basic idea is that each positive pair of face images (with kin relationship) should have a greater similarity score than the average similarity score of some negative pairs (without kin relationship) under the similarity measure to be learned. Extensive experimental results on widely used kinship datasets demonstrate that our method can achieve considerable improvements over state-of-the-art metric learning based kinship verification methods. In the future, we plan to combine our OSL-A method with more discriminative features to further improve its performance for kinship verification.
Conflict of Interests
The authors declare no conflict of interests.
The authors of the paper were responsible for leading the design and content of the paper.
This work was partially supported by the grant from the Colleges and Universities Youth Talent Program in Beijing (YETP1634) and National Natural Science Foundation of China (no. 61373090).
A. Bellet, A. Habrard, and M. Sebban, “Similarity learning for provably accurate sparse linear classification,” in Proceedings of the 29th International Conference on Machine Learning (ICML ’12), Edinburgh, UK, June-July 2012.
M. Shao, S. Xia, and Y. Fu, “Genealogical face recognition based on UB KinFace database,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’11), pp. 60–65, Colorado Springs, Colo, USA, June 2011.
X. Zhou, J. Lu, J. Hu, and Y. Shang, “Gabor-based gradient orientation pyramid for kinship verification under uncontrolled environments,” in Proceedings of the 20th ACM International Conference on Multimedia (MM ’12), pp. 725–728, ACM, Nara, Japan, November 2012.
J. Goldberger, G. E. Hinton, S. T. Roweis, and R. Salakhutdinov, “Neighbourhood components analysis,” in Advances in Neural Information Processing Systems, pp. 513–520, 2004.
K. Q. Weinberger, J. Blitzer, and L. K. Saul, “Distance metric learning for large margin nearest neighbor classification,” in Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS ’05), pp. 1473–1480, December 2005.
D. Tran and A. Sorokin, “Human activity recognition with metric learning,” in Computer Vision – ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12–18, 2008, Proceedings, Part I, vol. 5302 of Lecture Notes in Computer Science, pp. 548–561, Springer, Berlin, Germany, 2008.
H. V. Nguyen and L. Bai, “Cosine similarity metric learning for face verification,” in Computer Vision – ACCV 2010, pp. 709–720, Springer, Berlin, Germany, 2011.
M. Kostinger, M. Hirzer, P. Wohlhart, P. M. Roth, and H. Bischof, “Large scale metric learning from equivalence constraints,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’12), pp. 2288–2295, IEEE, Providence, RI, USA, June 2012.