Abstract
Kinship verification from face images is a new and challenging problem in pattern recognition and computer vision, and it has many potential real-world applications, including social media analysis and child adoption. Most existing methods for kinship verification assume that each positive pair of face images (with kin relationship) has a greater similarity score than every negative pair (without kin relationship) under a distance metric to be learned. In practice, however, this assumption is usually too strict for real-life kin samples. Instead, we propose in this paper learning a robust similarity model, under which the similarity score of each positive pair is greater than the average similarity score of some negative ones. In addition, we develop an online similarity learning algorithm for more scalable application. We empirically evaluate the proposed methods on benchmark datasets, and experimental results show that our method outperforms some state-of-the-art kinship verification methods in terms of verification accuracy and computational efficiency.
1. Introduction
The aim of kinship verification via biometrics is to determine whether a given pair of face images of two people has a kin relationship. Recent evidence in psychology has demonstrated that facial appearance is a key clue for identifying kinship [1–6], because two people who are biologically related usually have higher facial similarity than other people. Inspired by this, many researchers in the computer vision community have investigated the problem of kinship verification from facial images in recent years, and some encouraging progress has been made in this area. However, most existing similarity learning methods for kinship verification suffer from two critical difficulties: (i) compared to face verification, kinship verification is even more difficult, because its objective is to determine whether there is a kinship relation between two images of two people, rather than whether two images show the same individual; (ii) the problem is still extremely challenging, because there are large variations in lighting, pose, expression, background, and ethnicity in the face images, especially when they are captured under uncontrolled conditions.
To address the above difficulties, we propose an online similarity learning with average strategy (OSLA) method for kinship verification from facial images, which applies a good similarity strategy [7] to learn a sparse linear similarity measurement for kinship verification. Figure 1(a) shows several sampled positive pairs with kinship relation from the KinFaceW datasets [8]. Compared to the state-of-the-art metric learning based kinship verification methods, OSLA has the following advantages:
(1) An online similarity learning algorithm is presented for kinship verification in this paper. Different from the batch training patterns usually used by previous studies, our approach scales well to the ever-growing kin dataset.
(2) To the best of our knowledge, very little attention has been devoted to the problem that interclass samples often have higher similarity than intraclass samples in kinship verification. To address this challenge, we propose in this work a good similarity strategy under which the similarity of intraclass samples should be larger than the average similarity of some randomly selected interclass samples. This relaxed rule has demonstrated its robustness in kinship verification in our experiments.
(3) From the computational viewpoint, OSLA is clearly superior to most existing metric learning methods. We design a sparse linear similarity measurement model for kinship verification using a diagonal matrix to replace a $d \times d$ square matrix ($d$ is the dimensionality of the face descriptor). Consequently, the computational complexity of OSLA is drastically reduced from $O(d^2)$ to $O(d)$.
(4) Although our approach has a lower computational cost, OSLA achieves performance competitive with state-of-the-art metric learning based kinship verification methods.
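[Figure 1: (a) sampled positive pairs with kinship relation from the KinFaceW datasets; (b) example rows comparing a kin pair, a similar-looking face without kin relation, and randomly selected faces without kin relation.]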
The remainder of this paper is organized as follows. Section 2 briefly reviews related work. Section 3 details our proposed method. Section 4 presents the experimental results, and the last section concludes the paper.
2. Related Work
In this section, we focus on reviewing two related topics: (1) kinship verification and (2) metric learning.
2.1. Kinship Verification
To the best of our knowledge, Fang et al. [9] first proposed a computational method to tackle the challenge of kinship verification from facial images. Since then, kinship verification has become an active research topic in computer vision, and some seminal research results [9–16] have been obtained over the past five years. Fang et al. [9] classified pairs of face images as “related” or “unrelated” by using novel feature extraction and selection methods. Guo and Wang [13] suggested a method for familial trait extraction and a stochastic combination scheme. Zhou et al. [12, 17] proposed a spatial pyramid learning based (SPLE) feature descriptor and a Gabor-based Gradient Orientation Pyramid (GGOP) feature to represent facial images, combined with a support vector machine (SVM) classifier for kinship verification. The above-mentioned methods are all feature-based: they aim to extract discriminative features so that each face image is represented as a compact feature vector, while intraclass variations are reduced and interclass variations are increased as much as possible. The other category of methods for kinship verification is learning-based. Recently, some discriminative learning algorithms [10, 11, 14, 16, 18] have been proposed to learn an effective classifier from facial images in order to verify kinship. Xia et al. [10, 11, 14] constructed the UB KinFace and FamilyFace datasets and bridged the great discrepancy between children and their old parents using a transfer subspace learning method. Lu et al. [16] collected two kinship datasets named KinFaceW-I and KinFaceW-II and proposed a new neighborhood repulsed metric learning method. More recently, Yan et al. [18] jointly learned multiple distance metrics with multiple features to exploit complementary and discriminative information for verification. Zhou et al. [19] proposed a similarity learning method for the verification problem by formulating similarity learning in the ensemble learning framework to enhance the generalization ability of the prediction model.
However, there still exist two limitations in previous studies [9–16, 18]. (i) In some cases, two people without a kin relationship may have higher facial similarity than kin-related individuals, yet few studies have tackled this problem. In this work, we propose a relaxed rule instead of the overly strict strategy: the similarity of kin pairs (with kinship relation) should be higher than the average similarity of some randomly selected non-kin pairs (without kinship relation), rather than higher than the similarity of every non-kin pair. (ii) Unlike previous studies that use batch learning, which is often poorly scalable and may not even scale up to medium-scale applications, we present an online learning algorithm that is fast and easy to train.
2.2. Metric Learning
In recent years, much attention has been paid to metric learning, and a number of representative algorithms [8, 15, 18–28] have been proposed and applied to many computer vision problems, such as neighborhood component analysis (NCA) [20], large margin nearest neighbor (LMNN) [21], information theoretic metric learning (ITML) [22], cosine similarity metric learning (CSML) [26], large scale metric learning (LSML) [29], sparse pairwise constraints metric learning (SPCML) [28], neighborhood repulsed metric learning (NRML) [8], and discriminative multimetric learning (DMML) [18]. However, most existing distance metric learning methods [8, 18–28] are designed to learn a full square distance metric matrix. If the feature descriptor has a large dimensionality $d$, the metric learning algorithm incurs a very high computation cost ($O(d^2)$). In contrast, OSLA uses a sparse diagonal matrix with a lower computation cost ($O(d)$). The experimental results clearly show that our algorithm is highly competitive with state-of-the-art metric learning based kinship verification methods.
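The cost difference between a full metric matrix and a diagonal one can be illustrated with a minimal sketch. The function names and toy values below are hypothetical and only meant to show that a diagonal matrix reduces the bilinear form to a single pass over the $d$ features.

```python
def full_similarity(p, M, c):
    """Bilinear form p^T M c with a full d x d matrix: d * d terms."""
    d = len(p)
    return sum(p[i] * M[i][j] * c[j] for i in range(d) for j in range(d))


def diagonal_similarity(p, w, c):
    """Same form with M = diag(w): only the d diagonal terms survive."""
    return sum(wk * pk * ck for wk, pk, ck in zip(w, p, c))


# Toy 3-dimensional descriptors (hypothetical values).
p = [1.0, 2.0, 3.0]
c = [0.5, -1.0, 2.0]
w = [2.0, 0.0, 1.0]  # a zero weight simply switches a feature off
M = [[w[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

# Both routes agree when M is diagonal; only the amount of work differs.
assert abs(full_similarity(p, M, c) - diagonal_similarity(p, w, c)) < 1e-12
```

Storing only the diagonal also cuts the number of learned parameters from $d^2$ to $d$, which matters for descriptors with thousands of dimensions.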
3. Proposed Approach
The resemblance between human faces has been generally accepted as an important cue in recognizing the kinship between parents and children. However, some non-kin pairs have higher similarity than kin pairs, and these highly similar non-kin pairs become a major obstacle in the kinship verification task. To address this, we propose a novel OSLA method for kinship verification, whose relaxed rule requires that the similarity of a kin pair be larger than the average similarity of a group of randomly selected non-kin pairs.
We aim to learn a pairwise similarity function from supervision on the relative similarity between two pairs of images. Formally, the training set consists of pairs of kinship images, where the $i$th pair comprises a parent image vector and a child image vector. The goal of OSLA is to learn a good metric and then compute the similarity function in (1) such that the distance between a parent and its own child is as small as possible while, simultaneously, the distance between that parent and children that are not its own is as large as possible. We assign a higher similarity value to the positive sample pairs, with a certain margin, as follows:
To learn a similarity measurement that obeys the constraints in (2), we define a loss function for a single quaternion:
To minimize this loss, our OSLA builds on the online passive-aggressive algorithm [29]. Hence, we formulate OSL as the following convex optimization problem with a certain margin:
To solve the problem in (5), we define the corresponding Lagrangian. The optimal solution is attained when the gradient of the Lagrangian vanishes, which gives an expression in terms of the following gradient matrix:
Finally, we obtain the solution according to (5)–(7):
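One way to write out the steps described in this section is sketched below. This is a reconstruction under assumed notation, not the paper's own equations, and it does not claim to match the numbering (1)–(7): $p_i$ and $c_i$ denote the $i$th parent and child descriptors, $w$ holds the diagonal of the similarity matrix, $N_i$ indexes $m$ randomly chosen children unrelated to $p_i$, and $C$ is an assumed cap on the step size. Any additional sparsity-inducing term the method may use is not modelled here.

```latex
% Assumed notation: p_i, c_i in R^d; w in R^d is the diagonal of the
% similarity matrix; N_i indexes m random non-kin children; C > 0.
\begin{align*}
S_w(p, c) &= p^\top \operatorname{diag}(w)\, c = \sum_{k=1}^{d} w_k\, p_k\, c_k, \\
\bar{c}_i &= \frac{1}{m} \sum_{j \in N_i} c_j, \qquad
S_w(p_i, \bar{c}_i) = \frac{1}{m} \sum_{j \in N_i} S_w(p_i, c_j), \\
\ell_w &= \max\bigl(0,\; 1 - S_w(p_i, c_i) + S_w(p_i, \bar{c}_i)\bigr)
        = \max\bigl(0,\; 1 - w^\top v_i\bigr), \qquad
        v_i = p_i \odot (c_i - \bar{c}_i), \\
w^{t} &= \arg\min_{w}\; \tfrac{1}{2}\lVert w - w^{t-1}\rVert^2 + C\xi
        \quad \text{s.t.}\quad 1 - w^\top v_i \le \xi,\; \xi \ge 0, \\
L(w, \xi, \tau, \lambda) &= \tfrac{1}{2}\lVert w - w^{t-1}\rVert^2 + C\xi
        + \tau\,(1 - \xi - w^\top v_i) - \lambda\xi, \qquad \tau, \lambda \ge 0, \\
\frac{\partial L}{\partial w} &= w - w^{t-1} - \tau v_i = 0
        \;\Rightarrow\; w^{t} = w^{t-1} + \tau v_i, \\
\frac{\partial L}{\partial \xi} &= C - \tau - \lambda = 0
        \;\Rightarrow\; 0 \le \tau \le C, \\
\tau &= \min\!\left(C,\; \frac{\ell_{w^{t-1}}}{\lVert v_i \rVert^2}\right).
\end{align*}
```

Because the similarity is linear in its second argument, the similarity to the averaged child $\bar{c}_i$ equals the average similarity over the selected non-kin children, which is exactly the quantity the relaxed rule compares against.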
In our approach, we iteratively update the similarity matrix with a new randomly drawn quaternion until the maximal number of iterations is reached. The proposed OSLA algorithm is summarized in Algorithm 1.
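The iterative procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: the similarity is taken to be a diagonal bilinear form, each update is a standard passive-aggressive step against the average of a few randomly drawn unrelated children, and the names `osla_step`, `train_osla`, `n_neg`, and `C`, as well as the all-ones initialization, are hypothetical.

```python
import random


def similarity(w, p, c):
    """Diagonal bilinear similarity: sum_k w[k] * p[k] * c[k]."""
    return sum(wk * pk * ck for wk, pk, ck in zip(w, p, c))


def average_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def osla_step(w, parent, child, negatives, C=1.0):
    """One passive-aggressive update against the averaged negative child.

    Returns the new weights and the hinge loss measured before the update.
    """
    c_bar = average_vector(negatives)
    # v is the direction along which the margin S(p, c) - S(p, c_bar) grows.
    v = [pk * (ck - nk) for pk, ck, nk in zip(parent, child, c_bar)]
    margin = sum(wk * vk for wk, vk in zip(w, v))
    loss = max(0.0, 1.0 - margin)
    norm_sq = sum(vk * vk for vk in v)
    if loss == 0.0 or norm_sq == 0.0:
        return list(w), loss          # passive: constraint already met
    tau = min(C, loss / norm_sq)      # aggressive step, capped by C
    return [wk + tau * vk for wk, vk in zip(w, v)], loss


def train_osla(parents, children, n_neg=4, n_iter=1000, C=1.0, seed=0):
    """Run online updates on randomly drawn samples (needs >= 2 pairs)."""
    rng = random.Random(seed)
    n, d = len(parents), len(parents[0])
    w = [1.0] * d                     # assumed start: plain inner product
    for _ in range(n_iter):
        i = rng.randrange(n)
        others = [j for j in range(n) if j != i]
        picked = rng.sample(others, min(n_neg, len(others)))
        negatives = [children[j] for j in picked]
        w, _ = osla_step(w, parents[i], children[i], negatives, C)
    return w
```

Each step touches every feature once, so one update costs time linear in the descriptor dimensionality, and new kin pairs can be folded in without retraining from scratch.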

4. Experiments and Results
To evaluate the proposed OSLA approach for kinship verification, we perform experiments on two public datasets. In this section, we first describe the experimental datasets and settings and then discuss and analyze the experimental results.
4.1. Datasets and Experimental Settings
KinFaceW-I [18] and KinFaceW-II [18] (http://www.kinfacew.com) are two publicly available face kinship datasets. The difference between the two datasets is that the two facial images of each pair in KinFaceW-I come from different photos, while those in KinFaceW-II were collected from the same photo. Both datasets contain four kinship relation subsets: Father-Son (FS), Father-Daughter (FD), Mother-Son (MS), and Mother-Daughter (MD). In the KinFaceW-I dataset, there are 134, 156, 127, and 116 pairs of facial images for the FS, FD, MS, and MD kinship relations, respectively. In the KinFaceW-II dataset, there are 250 pairs of facial images for each kinship relation. Figure 1(a) shows some positive samples of kinship images.
In our experiments, we align the face images and crop them to a fixed size based on the eye positions provided with each dataset. We adopt a fivefold cross-validation strategy on the two kinship datasets, where the face pairs of each subset are divided uniformly into five folds. We use pairs of face images with kinship relation as positive samples. In addition, we generate each negative pair by first choosing one parent image and then generating a child image computed from a few child images that have no kin relation to that parent.
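The fold assignment and the construction of negative pairs can be sketched as below. The random shuffle, the number of averaged children `n_avg`, and the use of a plain element-wise mean are assumptions made for illustration; the datasets may ship with their own predefined folds.

```python
import random


def five_fold_indices(n_pairs, n_folds=5, seed=0):
    """Shuffle pair indices and deal them into n_folds near-equal folds."""
    rng = random.Random(seed)
    order = list(range(n_pairs))
    rng.shuffle(order)
    return [order[k::n_folds] for k in range(n_folds)]


def make_negative_child(i, children, n_avg=4, rng=None):
    """Average a few child vectors that are unrelated to parent i."""
    rng = rng or random.Random(0)
    others = [j for j in range(len(children)) if j != i]
    picked = rng.sample(others, min(n_avg, len(others)))
    dim = len(children[0])
    return [sum(children[j][k] for j in picked) / len(picked)
            for k in range(dim)]


# Example: 134 pairs are dealt into folds of 27, 27, 27, 27, and 26 pairs.
folds = five_fold_indices(134)
print([len(f) for f in folds])
```

In each round, four folds would be used for training and the remaining fold for testing, and the negative child for parent `i` never includes that parent's own child.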
In our experiments, we apply two face descriptors, histogram of oriented gradients (HOG) [30] and SIFT [31], to extract face features. For HOG, each face image is divided into non-overlapping blocks at two scales: first into blocks of one size and then into blocks of a second size. We obtain a 9-dimensional HOG feature for each block, and these features are finally concatenated to form a 2880-dimensional vector. As for SIFT, one 128-dimensional feature is computed over each patch, where the spacing between two neighboring patches is 8 pixels. Finally, the SIFT features are concatenated into a 6272-dimensional vector.
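The stated dimensionalities can be checked with a short calculation. The geometry used here is an assumption chosen because it reproduces the reported 2880 and 6272 dimensions: 64 x 64 pixel crops, HOG blocks of 4 x 4 and 8 x 8 pixels, and 16 x 16 pixel SIFT patches. Only the 9 bins, the 128-dimensional SIFT descriptor, and the 8-pixel spacing come from the text.

```python
def hog_dim(image_side, block_sides, bins=9):
    """Total HOG length when the image is tiled at each block size in turn."""
    return sum(bins * (image_side // b) ** 2 for b in block_sides)


def dense_sift_dim(image_side, patch_side, stride, descriptor_len=128):
    """Total dense-SIFT length for square patches on a regular grid."""
    per_axis = (image_side - patch_side) // stride + 1
    return descriptor_len * per_axis ** 2


# Assumed geometry (not stated in the text): 64x64 crops,
# HOG blocks of 4x4 and 8x8 pixels, SIFT patches of 16x16 pixels.
print(hog_dim(64, [4, 8]))        # 9 * (256 + 64) = 2880
print(dense_sift_dim(64, 16, 8))  # 128 * 7 * 7 = 6272
```

Other geometries could give the same totals, so this should be read as a consistency check rather than a statement of the settings actually used.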
4.2. Results and Analysis
4.2.1. Comparisons of the Single Similarity and the Average Similarity
To better understand the basic idea behind our proposed OSLA approach, we perform an experiment to compare the similarity of a positive pair, a single negative pair (without kin relationship but with a high similarity score), and averaged negative pairs. As shown in Figure 1(b), in each row, the face images in columns 1 and 2 are a pair of faces with kinship relation, the face images in columns 1 and 3 are a pair without kin relation but with higher similarity, and the face images in columns 4 to 7 are randomly selected facial images without kin relation to the face image in column 1. We list the similarity scores of these examples in Table 1. It can be observed from the table that the similarity scores of some positive pairs are lower than those of single negative pairs, while they are greater than those of the averaged negative pairs. These results indicate that our proposed OSLA approach effectively addresses the problem that negative pairs sometimes show greater resemblance than positive ones.
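The contrast between the strict rule and the relaxed rule can be made concrete with a small sketch. The scores below are hypothetical and are not taken from Table 1; they only illustrate a kin pair that loses to one look-alike stranger but still beats the average over several non-kin faces.

```python
def satisfies_strict_rule(pos_score, neg_scores):
    """Strict rule: the kin pair must beat every non-kin pair."""
    return pos_score > max(neg_scores)


def satisfies_average_rule(pos_score, neg_scores):
    """Relaxed rule: the kin pair must beat the mean non-kin score."""
    return pos_score > sum(neg_scores) / len(neg_scores)


# Hypothetical scores: one look-alike stranger (0.81) and four random ones.
pos = 0.74
negs = [0.81, 0.35, 0.42, 0.29, 0.38]
print(satisfies_strict_rule(pos, negs))   # False
print(satisfies_average_rule(pos, negs))  # True
```

Under the strict rule this kin pair would be treated as a violated constraint during training, whereas the relaxed rule accepts it.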
4.2.2. Comparisons with OSL Method
We summarize the verification rates of OSLA and of the OSL method, which does not use the average strategy, on the KinFaceW-I and KinFaceW-II datasets in Tables 2 and 3. As shown in these two tables, OSLA consistently surpasses the OSL method on all subsets. On the KinFaceW-I dataset, the gains in accuracy are 4.18% and 3.21% on the FS subset, 4.07% and 2.28% on the FD subset, 3.92% and 0.46% on the MS subset, 12.69% and 4.26% on the MD subset, and 6.22% and 2.55% on the mean accuracy. On the KinFaceW-II dataset, the gains are 4.80% and 0.80% on the FS subset, 4.00% and 1.80% on the FD subset, 1.20% and 0.40% on the MS subset, 4.60% and 2.40% on the MD subset, and 3.65% and 1.35% on the mean accuracy.
4.2.3. Comparisons with NRML Method
We also tabulate the verification rates of OSLA and the NRML [8] method on the KinFaceW-I and KinFaceW-II datasets in Tables 4 and 5. We can observe from these two tables that, on the KinFaceW-I dataset, the proposed OSLA is slightly worse than NRML on the FS subset, slightly better on the FD and MS subsets, and clearly better on the MD subset. Furthermore, it is significantly superior to NRML on all subsets of the KinFaceW-II dataset and surpasses NRML on the mean accuracy of both datasets.
To better present and compare the verification performance of the different methods, we provide the receiver operating characteristic (ROC) curves in Figures 2(a) and 2(b). From the two figures, we can conclude that OSLA achieves performance highly competitive with NRML on KinFaceW-I and better performance on KinFaceW-II. Moreover, OSLA consistently outperforms OSL on the two datasets.
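For reference, ROC points can be derived from similarity scores by sweeping a decision threshold, as in the minimal sketch below. The function names and the convention that a pair is accepted as kin when its score is at least the threshold are assumptions; the inputs must contain at least one kin and one non-kin pair.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    scores: similarity per test pair; labels: 1 for kin, 0 for non-kin.
    A pair is accepted as kin when its score is >= the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def accuracy_at(scores, labels, threshold):
    """Verification accuracy for one fixed decision threshold."""
    hits = sum(1 for s, y in zip(scores, labels)
               if (s >= threshold) == (y == 1))
    return hits / len(labels)
```

A curve that lies closer to the top-left corner corresponds to a method that accepts more true kin pairs at the same false acceptance rate.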
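[Figure 2: ROC curves of the different methods on the KinFaceW-I and KinFaceW-II datasets, shown in panels (a) and (b).]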
4.2.4. Parameter Analysis
We also study the impact of the two parameters of our proposed OSLA method. The mean verification accuracy of OSLA versus different values of the two parameters on the KinFaceW-I and KinFaceW-II datasets is shown in Figures 3(a) and 3(b), respectively. From Figure 3, we can observe that OSLA achieves the best recognition performance when the two parameters are set to 1 and 1000, respectively. In addition, OSLA shows stable recognition performance across different values of the two parameters.
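[Figure 3: Mean verification accuracy of OSLA versus different parameter values on (a) KinFaceW-I and (b) KinFaceW-II.]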
4.2.5. Computational Complexity
First, let us briefly analyze the computational complexity of our proposed OSLA method, which includes iterations. In addition to solving a standard eigenvalue equation, the two matrices and ν need to be updated during each iteration. The time complexity of computing the two tasks is and in each iteration, respectively. Therefore, the computational complexity of OSLA is .
Then, we list and compare the average running time of the proposed OSLA method and the NRML [8] method. The hardware configuration used for our experiment is an i5-4210U dual-core CPU with 4 GB of RAM. Both methods are implemented in MATLAB. The running times of NRML and OSLA are shown in Table 6.
5. Conclusions and Future Work
In this paper, we have proposed a robust similarity learning method for kinship verification. Our basic idea is that each positive pair of face images (with kin relationship) should have a greater similarity score than the average similarity score of some negative pairs (without kin relationship), rather than a greater score than every negative pair. Extensive experimental results on widely used kinship datasets demonstrate that our method achieves considerable improvements over state-of-the-art metric learning based kinship verification methods. In the future, we plan to combine our OSLA method with more discriminative features to further improve its performance for kinship verification.
Conflict of Interests
The authors declare no conflict of interests.
Authors’ Contribution
The authors of the paper were responsible for leading the design and content of the paper.
Acknowledgments
This work was partially supported by the grant from the Colleges and Universities Youth Talent Program in Beijing (YETP1634) and National Natural Science Foundation of China (no. 61373090).