#### Abstract

Biometric template protection is indispensable for protecting personal privacy in large-scale deployments of biometric systems. Accuracy, changeability, and security are three critical requirements for template protection algorithms, yet existing algorithms do not satisfy all three well. In this paper, we propose a hybrid approach that combines random projection and fuzzy vault to improve performance on all three requirements. A heterogeneous space is designed so that random projection and fuzzy vault can be combined properly in the hybrid scheme. A new chaff point generation method is also proposed to enhance the security of the heterogeneous vault. Theoretical analyses of the proposed hybrid approach in terms of accuracy, changeability, and security are given. Experimental results on a palmprint database support the theoretical analyses and demonstrate the effectiveness of the proposed hybrid approach.

#### 1. Introduction

Biometric-based authentication is more convenient and reliable than password- or token-based authentication. However, biometric technology requires large-scale capture and storage of biometric data, which raises serious concerns about privacy leakage and identity theft. Unlike passwords or tokens, biometric characteristics are inherent to a person; once compromised, they cannot be reissued or refreshed. For these reasons, biometric template protection techniques [1] have attracted much attention recently.

Broadly, biometric template protection techniques can be categorized into two classes: cancelable biometrics and biometric cryptosystems. A typical biometric template protection scheme is expected to satisfy three critical requirements [2].

*(1) Accuracy Requirement*. The discriminability of original biometric features should be preserved in a biometric template protection scheme, so that the accuracy of biometric system is not degraded.

*(2) Security Requirement*. The protected objects (the biometric features and, in biometric cryptosystems, the cryptographic key) should be computationally hard for attackers to recover even when the sketch is public.

*(3) Cancelability Requirement*. Cancelability means revocability and diversity. Different applications hold different templates of the same user, and these templates cannot be matched against each other. Once a template is compromised, a new, different template can be generated to replace it.

However, neither cancelable biometrics nor biometric cryptosystems satisfy all of these requirements well, and each approach has its own advantages and disadvantages [3].

Cancelable biometrics typically uses a transform-based approach to generate new templates. This approach offers good cancelability, but its security level is often lower than that of biometric cryptosystems, and in general no independent cryptographic key can be bound for cryptographic applications.

Biometric cryptosystems (BC) [4] output an encrypted sketch, so their security level is relatively high. BC uses biometric features to protect a cryptographic key, which provides a new solution to the key management problem. However, the error correcting code (ECC) used in this technique is not strong enough to handle large intraclass variations in biometric data, so the accuracy of BC degrades sharply, and changeability is often not provided.

Considering the limitations of the available approaches, a hybrid approach [5, 6] is a promising way to meet the increasing demands on biometric template protection.

In this paper, a novel hybrid approach is proposed that compensates for the shortcomings of each single approach while retaining the advantages of each within the hybrid scheme.

The proposed hybrid scheme combines the fuzzy vault scheme (FVS) [7] and random projection [8] to meet the above three requirements for biometric template protection.

The fuzzy vault scheme is one of the most popular biometric cryptosystems [9–11]; it provides an effective security mechanism that protects the cryptographic key and the biometric template simultaneously. However, its accuracy in terms of false accept rate (FAR) and false reject rate (FRR) often degrades sharply because the error correcting code it uses cannot handle sufficiently large intraclass variations, and FVS does not provide cancelability. Random projection, a transform-based template protection approach, has good cancelability. By combining random projection with the fuzzy vault scheme, the proposed hybrid scheme aims to improve accuracy and security and to provide good changeability at the same time.

To combine random projection with fuzzy vault effectively, a heterogeneous space is first defined; raw biometric features are projected into the heterogeneous space by random projection, and a sufficiently long cryptographic key can be bound to the projected features in that space. A new chaff point generation method is also proposed to ensure security even when the projection matrices are lost. The three requirements are then analyzed theoretically for the proposed hybrid approach. Promising experimental results on a palmprint database show the validity of the proposed hybrid approach.

The rest of this paper is organized as follows. The proposed hybrid approach is described in Section 2. The three requirements are analyzed in Section 3. Experimental results are reported in Section 4. Section 5 concludes the paper.

#### 2. Proposed Hybrid Approach

The flow chart of the proposed hybrid approach is shown in Figure 1; it includes two main modules. The first is multispace random projection, which is used not only to provide cancelability but also to provide different representations of the original palmprint feature vectors in random subspaces for generating different genuine points. The second is the proposed heterogeneous fuzzy vault scheme, which is used to enhance security and to bind a cryptographic key for cryptographic applications. Since the cryptographic key is generated independently, its randomness is guaranteed, and in the heterogeneous space the bound key can be made long enough to meet high security requirements in cryptographic applications. The following subsections introduce how these two modules work.

##### 2.1. Multispace Random Projection

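Assuming the fixed-length feature vector is $\mathbf{x} \in \mathbb{R}^{N}$, the multispace random projection is defined as follows [8]:

$$\mathbf{y} = \frac{1}{\sqrt{m}} R^{T} \mathbf{x}, \tag{1}$$

where $R$ is a random matrix of size $N \times m$ and $T$ represents matrix transposition.

In order to generate multiple genuine points from a single feature vector, the feature vector is projected into a set of random subspaces by using different projection matrices:

$$\mathbf{y}_{i} = \frac{1}{\sqrt{m}} R_{i}^{T} \mathbf{x}, \quad i = 1, 2, \ldots, r. \tag{2}$$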
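The projection step can be sketched as follows. This is a minimal illustration, assuming Gaussian matrix entries and a $1/\sqrt{m}$ scaling of $R^{T}\mathbf{x}$; the function names, the fixed seed, and the toy dimensions are hypothetical and are not taken from the paper.

```python
import math
import random


def random_matrix(n_rows, n_cols, rng):
    """Return an n_rows x n_cols matrix with i.i.d. N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]


def project(x, matrix):
    """Compute (1 / sqrt(m)) * R^T x for an N x m matrix R (assumed scaling)."""
    n_cols = len(matrix[0])
    scale = 1.0 / math.sqrt(n_cols)
    return [
        scale * sum(matrix[i][j] * x[i] for i in range(len(x)))
        for j in range(n_cols)
    ]


def multispace_project(x, matrices):
    """Project one feature vector with every matrix, giving one subvector each."""
    return [project(x, matrix) for matrix in matrices]


rng = random.Random(0)  # fixed seed only to make the demo repeatable
x = [rng.gauss(0.0, 1.0) for _ in range(12)]              # toy feature vector, N = 12
matrices = [random_matrix(12, 4, rng) for _ in range(3)]  # 3 subspaces, m = 4
subvectors = multispace_project(x, matrices)
print(len(subvectors), len(subvectors[0]))
```

Each matrix yields a different low-dimensional representation of the same feature vector, which is what allows one enrolled sample to produce several genuine vectors.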

##### 2.2. Generation of Heterogeneous Vault

The heterogeneous vault is a set of points in a *heterogeneous space*. The *heterogeneous space* is defined as $\mathbb{R}^{m} \times \mathrm{GF}(q)$; each point is a pair $(\mathbf{v}, s)$, where $\mathbf{v}$ is a real-valued vector and $m$ is its length; $s$ is an element from the finite field $\mathrm{GF}(q)$, where $q$ is the cardinality of the finite field. A heterogeneous vault contains two subsets, genuine points and chaff points. The following subsections introduce how these two parts are generated.

###### 2.2.1. Generation of Genuine Points

*(a) Feature Vector Mapping. *We have

The high dimensional palmprint feature vector is mapped into low dimensional subvectors using (2); that is, . is named genuine vector.

In genuine vector generation, projection matrices are used on one original feature vector to generate different genuine vectors.

*(b) Key Encoding. *We have

The key to be protected is independent of genuine vectors, so that it can be generated randomly; therefore, the randomness of the key is guaranteed.

In this step, the key to be protected is encoded into -symbol sequence using ECC encoding algorithm. If the key is very long, it can be segmented into multiple shorter sequences, and then each shorter sequence is encoded into -symbol sequence; that is, , where is the number of segmented sequences.

*(c) Pairwise Conjugation. *We have

Given the genuine vectors obtained in step (a) and the symbol sequence obtained in step (b), genuine points in the heterogeneous space are generated by pairing genuine vectors and symbols in order. If a longer key needs to be bound, each genuine vector can be combined with multiple symbols. Because of this pairwise conjugation, recognition errors of genuine vectors during vault unlocking are transformed into symbol errors in the symbol sequence, so that they can be corrected by the ECC decoding algorithm.

###### 2.2.2. Generation of Chaff Points

The chaff points are generated to protect genuine points against attacks such as clustering attack and compromised projection matrices attack.

The chaff points have the same components as genuine points; that is, chaff vector and chaff symbol . Since secret symbols in genuine points are generated randomly, the chaff symbols can be selected randomly from Galois field .

The idea of chaff vector generation is shown in Figure 2, where genuine matching distances are concentrated in the smallest circle, impostor matching distances are in the largest circle, and chaff vectors are added in the middle circle, so as to prevent the adversary from knowing which are genuine vectors, even though the adversary has impostor biometric features.

The chaff vectors are generated as follows: $\mathbf{c} = \mathbf{g} + \lambda \mathbf{r}$, where $\mathbf{g}$ is a genuine vector and $\mathbf{r}$ is a random vector; each element of $\mathbf{r}$ is independent and identically distributed (i.i.d.) according to the standard normal distribution $N(0, 1)$. Then, $\|\mathbf{r}\|^{2}$ follows a chi-square distribution with $m$ degrees of freedom, and its expectation is $E[\|\mathbf{r}\|^{2}] = m$. To control the distance between a chaff point and a genuine point, $\lambda$ is used as a scaling factor. The value of $\lambda$ is set to be $\delta / \sqrt{m}$, where $\delta$ is selected according to the genuine and impostor distributions of matching distances of projected feature vectors.

Although the distances between one genuine vector and its chaff vectors are concentrated around $\delta$, the distances are distributed randomly; a small number of chaff vectors may be very close to some genuine vectors, which leads to failures of genuine point filtration in the vault decoding phase. Therefore, a minimum distance threshold $d_{\min}$ and a maximum distance threshold $d_{\max}$ are set for all points in the vault to reduce filtration errors and to prevent attackers from recognizing chaff points by distance analysis. The minimum distance threshold $d_{\min}$ is less than $\delta$ and the maximum distance threshold $d_{\max}$ is greater than $\delta$; like $\delta$, both $d_{\min}$ and $d_{\max}$ are selected according to the genuine and impostor matching distance distributions of projected feature vectors. An example of a 2D vault generated with the proposed genuine and chaff point generation methods is illustrated in Figure 3.

After adding chaff points, all points in heterogeneous space are sorted according to the value of the first elements in real-valued vectors; after that, the vault can be stored in smartcard or central database.
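The vault construction described in this subsection can be sketched as follows. This is only an illustration under stated assumptions: the chaff vector is drawn as a genuine vector plus scaled standard normal noise, with the scale written here as `delta / sqrt(m)`; the acceptance rule (no chaff closer than `d_min` to any genuine vector, none farther than `d_max` from its source) is one possible reading of the distance thresholds; and all names and numeric values are hypothetical.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def make_chaff_vector(source, genuine_vectors, delta, d_min, d_max, rng, max_tries=10000):
    """Sample source + (delta / sqrt(m)) * r with r ~ N(0, I), rejecting bad draws."""
    scale = delta / math.sqrt(len(source))
    for _ in range(max_tries):
        chaff = [value + scale * rng.gauss(0.0, 1.0) for value in source]
        # Assumed acceptance rule: stay within d_max of the source genuine vector
        # and at least d_min away from every genuine vector.
        if distance(chaff, source) > d_max:
            continue
        if all(distance(chaff, g) >= d_min for g in genuine_vectors):
            return chaff
    raise RuntimeError("no admissible chaff vector found; relax the thresholds")


def build_vault(genuine_vectors, symbols, chaff_per_genuine, delta, d_min, d_max,
                field_size, rng):
    """Return (vector, symbol) points sorted by the first vector element."""
    vault = list(zip(genuine_vectors, symbols))
    for source in genuine_vectors:
        for _ in range(chaff_per_genuine):
            chaff = make_chaff_vector(source, genuine_vectors, delta, d_min, d_max, rng)
            vault.append((chaff, rng.randrange(field_size)))  # random chaff symbol
    vault.sort(key=lambda point: point[0][0])
    return vault


rng = random.Random(1)  # fixed seed only to make the demo repeatable
genuine_vectors = [[0.0, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0]]  # toy data, m = 4
symbols = [3, 12]                                                # toy codeword symbols
vault = build_vault(genuine_vectors, symbols, chaff_per_genuine=5, delta=2.0,
                    d_min=1.0, d_max=3.0, field_size=16, rng=rng)
print(len(vault))
```

Sorting by the first coordinate removes any ordering information that would otherwise separate genuine points from the chaff points appended after them.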

##### 2.3. Decoding of Heterogeneous Vault

*(1) Query Subvectors Generation. *We have

Firstly, the query feature vector is projected into query subvectors using the projection matrices according to (2).

*(2) Filtration of Genuine Points by Distance Measure.* Genuine vector filtration is carried out between the query subvectors and the vault. For each query subvector, the distances between it and the real-valued vectors of all points in the vault are computed, and the vault point with the minimum distance is taken as the genuine point.

One point is filtered out of the vault for each query subvector, in order; the symbols are then extracted from the filtered points and concatenated in the same order to form a symbol sequence for ECC decoding.

*(3) Correcting Error Symbols Using ECC Decoding Algorithm*. Given the symbol sequence obtained in the previous step, a proper ECC decoding algorithm is applied to it to recover the key. False filtration of genuine points results in symbol errors in the sequence, and the number of error symbols equals the number of falsely recognized genuine points. If the number of error symbols is within the error-correcting capability of the ECC, the original key is recovered successfully by the ECC decoding algorithm.
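The filtration step can be sketched with a nearest-neighbour search over the vault. The example below is a toy illustration with hypothetical names and data; it does not implement an ECC decoder. Instead it counts symbol errors against the known codeword, purely to show how a wrongly filtered point turns into one symbol error that the ECC would then have to correct.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def filter_symbols(query_subvectors, vault):
    """For each query subvector, return the symbol of its nearest vault point."""
    recovered = []
    for query in query_subvectors:
        _, symbol = min(vault, key=lambda point: distance(query, point[0]))
        recovered.append(symbol)
    return recovered


def count_symbol_errors(recovered, codeword):
    """Number of positions where the filtered sequence differs from the codeword."""
    return sum(1 for a, b in zip(recovered, codeword) if a != b)


# Toy vault: two genuine points (symbols 5 and 7) and two chaff points.
vault = [([0.0, 0.0], 5), ([3.0, 0.0], 9), ([10.0, 10.0], 7), ([13.0, 10.0], 2)]
codeword = [5, 7]
good_query = [[0.2, -0.1], [10.1, 9.8]]  # close to both genuine vectors
bad_query = [[2.9, 0.1], [10.1, 9.8]]    # first subvector lands on a chaff point

print(filter_symbols(good_query, vault))                                # [5, 7]
print(count_symbol_errors(filter_symbols(bad_query, vault), codeword))  # 1
```

In a real deployment the codeword is not available at unlocking time; the decoder itself determines whether the number of errors is within its correcting capability.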

#### 3. Analysis of Proposed Hybrid Approach

In this section, the accuracy, changeability, and security of proposed hybrid approach are analyzed theoretically.

##### 3.1. Accuracy Analysis

###### 3.1.1. Nonorthogonal Matrix Case

If the projection matrices are nonorthogonal, random projection preserves pairwise distances to a certain degree; this property is stated by the Johnson-Lindenstrauss (J-L) Lemma [2].

*J-L Lemma*. For any $0 < \varepsilon < 1$ and any integer $n$, let $k$ be a positive integer such that $k \geq 4\left(\varepsilon^{2}/2 - \varepsilon^{3}/3\right)^{-1} \ln n$. Then, for any set $V$ of $n$ points in $\mathbb{R}^{d}$, there is a map $f: \mathbb{R}^{d} \rightarrow \mathbb{R}^{k}$, such that for all $\mathbf{u}, \mathbf{v} \in V$,

$$(1 - \varepsilon)\|\mathbf{u} - \mathbf{v}\|^{2} \leq \|f(\mathbf{u}) - f(\mathbf{v})\|^{2} \leq (1 + \varepsilon)\|\mathbf{u} - \mathbf{v}\|^{2}. \tag{7}$$

According to the J-L Lemma, an original set of $n$ points in $d$-dimensional Euclidean space can be embedded into another Euclidean space of dimension $k$, while the pairwise squared distances of the points are preserved up to a factor of $1 \pm \varepsilon$. Arriaga and Vempala [12], Achlioptas [13], and Li et al. [14] have proved that such a mapping can be achieved by random projections.

This property states that we can change the form of real-valued biometric feature vectors while the discriminability of the feature vectors is still preserved. This property can therefore be used to generate multiple genuine vectors in vault generation.
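The distance-preserving behaviour can be checked empirically. The sketch below assumes a Gaussian projection matrix scaled by $1/\sqrt{m}$ and uses arbitrary toy dimensions (400 reduced to 200); the names and the seed are hypothetical. It reports the ratio of projected to original squared distance for every pair of points, which should stay close to 1.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def gaussian_projection(vectors, m, rng):
    """Project all vectors with one shared N x m Gaussian matrix, scaled by 1/sqrt(m)."""
    n = len(vectors[0])
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    scale = 1.0 / math.sqrt(m)
    return [
        [scale * sum(matrix[i][j] * v[i] for i in range(n)) for j in range(m)]
        for v in vectors
    ]


def distance_ratios(original, projected):
    """Ratio of projected to original squared distance for every pair of points."""
    ratios = []
    for a in range(len(original)):
        for b in range(a + 1, len(original)):
            ratios.append(
                squared_distance(projected[a], projected[b])
                / squared_distance(original[a], original[b])
            )
    return ratios


rng = random.Random(7)  # fixed seed only to make the demo repeatable
points = [[rng.gauss(0.0, 1.0) for _ in range(400)] for _ in range(6)]
ratios = distance_ratios(points, gaussian_projection(points, 200, rng))
print(min(ratios), max(ratios))
```

Increasing the target dimension tightens the spread of the ratios around 1, in line with the role of the embedding dimension in the lemma.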

###### 3.1.2. Orthogonal Matrix Case

In this case, the projection matrix $R$ is a square matrix; that is, $m = N$. Since each entry of $R$ is an independent and identically distributed random variable, the Gram-Schmidt orthonormalization method [13] can be applied to transform the projection matrix into an orthogonal matrix satisfying $R R^{T} = R^{T} R = I$, where $I$ is an identity matrix. In this case, the random projection becomes an orthogonal transformation.

Suppose that $\mathbf{x}, \mathbf{y}$ are two different real-valued feature vectors and $R$ is an orthogonal matrix; then [15], we have

$$\|R^{T}\mathbf{x} - R^{T}\mathbf{y}\|^{2} = (\mathbf{x} - \mathbf{y})^{T} R R^{T} (\mathbf{x} - \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|^{2}. \tag{8}$$

The above equation demonstrates that the pairwise Euclidean distances of feature vectors are precisely preserved after orthogonal random projection.
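A small numerical check of this property is sketched below. It orthonormalizes a random Gaussian square matrix with the modified Gram-Schmidt procedure and compares a pairwise distance before and after the transformation; the function names, the 6-dimensional toy size, and the seed are hypothetical choices made for the illustration.

```python
import math
import random


def gram_schmidt(matrix):
    """Orthonormalize the rows of a square matrix (modified Gram-Schmidt)."""
    basis = []
    for row in matrix:
        vector = list(row)
        for q in basis:
            coeff = sum(a * b for a, b in zip(vector, q))
            vector = [a - coeff * b for a, b in zip(vector, q)]
        norm = math.sqrt(sum(a * a for a in vector))
        basis.append([a / norm for a in vector])
    return basis


def matvec(matrix, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


rng = random.Random(3)  # fixed seed only to make the demo repeatable
n = 6
q_matrix = gram_schmidt([[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)])
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 1.0) for _ in range(n)]
before = distance(x, y)
after = distance(matvec(q_matrix, x), matvec(q_matrix, y))
print(before, after)
```

The two printed distances agree up to floating-point rounding, whereas a nonorthogonal projection only preserves distances approximately.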

##### 3.2. Changeability Analysis

The changeability of proposed scheme is provided by the random projection module. By refreshing the projection matrices, the projected feature vector can be updated. In this subsection, the statistical properties [16] of random projection are used for changeability analysis.

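Let $\mathbf{x}, \mathbf{y} \in \mathbb{R}^{N}$ be two feature vectors of the same user, and let $R_{1}, R_{2}$ be two different $N \times m$ random matrices, each entry of which follows the standard normal distribution $N(0, 1)$. If the same projection matrix is applied to both vectors, that is, $\mathbf{u} = \frac{1}{\sqrt{m}} R_{1}^{T}\mathbf{x}$ and $\mathbf{v} = \frac{1}{\sqrt{m}} R_{1}^{T}\mathbf{y}$, the mean and variance of the squared Euclidean distance between $\mathbf{u}$ and $\mathbf{v}$ are as follows [16]:

$$E\left[\|\mathbf{u} - \mathbf{v}\|^{2}\right] = \|\mathbf{x} - \mathbf{y}\|^{2}, \tag{9}$$

$$\mathrm{Var}\left[\|\mathbf{u} - \mathbf{v}\|^{2}\right] = \frac{2}{m}\|\mathbf{x} - \mathbf{y}\|^{4}. \tag{10}$$

According to (9), after projection, the mean of the squared Euclidean distance equals the squared distance between the two original feature vectors. According to (10), the variance is inversely proportional to the dimension $m$ of the new space. The higher the dimension, the smaller the variance, which means better preservation of pairwise distances between original feature vectors.

If the projection matrices are different, that is, $\mathbf{u} = \frac{1}{\sqrt{m}} R_{1}^{T}\mathbf{x}$ and $\mathbf{v} = \frac{1}{\sqrt{m}} R_{2}^{T}\mathbf{y}$, the corresponding mean and variance are as follows [16]:

$$E\left[\|\mathbf{u} - \mathbf{v}\|^{2}\right] = \|\mathbf{x}\|^{2} + \|\mathbf{y}\|^{2}, \tag{11}$$

$$\mathrm{Var}\left[\|\mathbf{u} - \mathbf{v}\|^{2}\right] = \frac{2}{m}\left(\|\mathbf{x}\|^{2} + \|\mathbf{y}\|^{2}\right)^{2}. \tag{12}$$

According to (9) and (11), since $\|\mathbf{x}\|^{2} + \|\mathbf{y}\|^{2} > \|\mathbf{x} - \mathbf{y}\|^{2}$ whenever $\mathbf{x}^{T}\mathbf{y} > 0$ (as is typical for two feature vectors of the same user), the squared Euclidean distances of pairwise vectors in the new space gather around a larger center when different projection matrices are applied than when the same projection matrix is applied. According to (10) and (12), a larger $m$ means smaller variances, which leads to a clear separation of the two kinds of distance distributions, so that stronger changeability can be provided.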
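The separation between the two cases can be illustrated with a small Monte Carlo simulation. The sketch assumes Gaussian projection matrices scaled by $1/\sqrt{m}$; the two toy vectors, the dimension, the trial count, and all names are hypothetical. It estimates the mean squared distance when both vectors share one matrix and when they use independent matrices, and prints each estimate next to $\|\mathbf{x}-\mathbf{y}\|^{2}$ and $\|\mathbf{x}\|^{2}+\|\mathbf{y}\|^{2}$, respectively.

```python
import math
import random


def gaussian_matrix(n, m, rng):
    """Return an n x m matrix with i.i.d. N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]


def project(x, matrix, m):
    """Compute (1 / sqrt(m)) * R^T x for an N x m matrix R."""
    scale = 1.0 / math.sqrt(m)
    return [scale * sum(matrix[i][j] * x[i] for i in range(len(x))) for j in range(m)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def simulate(x, y, m, trials, rng):
    """Return mean squared distances for the same-matrix and different-matrix cases."""
    same_total = 0.0
    diff_total = 0.0
    for _ in range(trials):
        r1 = gaussian_matrix(len(x), m, rng)
        r2 = gaussian_matrix(len(x), m, rng)
        u = project(x, r1, m)
        same_total += squared_distance(u, project(y, r1, m))
        diff_total += squared_distance(u, project(y, r2, m))
    return same_total / trials, diff_total / trials


rng = random.Random(11)  # fixed seed only to make the demo repeatable
x = [1.0, 2.0, 0.5, -1.0, 0.0, 1.5]
y = [1.1, 1.8, 0.7, -0.9, 0.1, 1.4]  # toy "same user" sample, close to x
mean_same, mean_diff = simulate(x, y, 16, 2000, rng)
print(mean_same, squared_distance(x, y))
print(mean_diff, sum(v * v for v in x) + sum(v * v for v in y))
```

With these toy vectors the different-matrix mean is roughly two orders of magnitude larger than the same-matrix mean, which is the gap that refreshing the projection matrices relies on.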

##### 3.3. Security Analysis

Assuming that an attacker has obtained the vault and all parameters of the vault, that is, the number of genuine points , the number of chaff points , and the number of symbol errors that can be corrected in vault decoding phase, the security of the vault is considered in four different circumstances.

###### 3.3.1. The Attacker Has No Information about Projection Matrices and Impostor Features

In this condition, what an attacker can do is to employ brute force attack to decode the vault. Min-entropy [17] is used to measure the security of the vault: where “” means the number of combinations and “” means the number of permutations.

###### 3.3.2. The Attacker Has Genuine Query Feature Vector

In this case, the attacker will use randomly generated random matrices and legitimate query feature vector to decode the vault. The security of the vault can be measured by the false accept probability .

Assuming projection matrices used in enrollment are and , enrolled feature vector and lost legitimate feature vectors are and , respectively. The transformed features are and , respectively.

Since each entry in and is generated randomly, they can be full column rank matrices, and therefore and can be decomposed [18] as follows: and , where and . and . Since , there are and , and columns of and are almost orthonormal. Then, the projected features can be reformulated as and .

These two equalities imply that original feature vectors are first projected by the same matrix and then transformed using different orthonormal matrices, which is equivalent to the rotation of a point in hyperspace; the rotation radius is the length (norm) of the point.

According to geometric-based analysis in [18], the false accept probabilities are obtained in two cases: where is a controlling threshold in chaff vector generation, is the dimension of projected feature vectors, and and are length of and , respectively.

From the above two cases, the total false accept probability can be expressed as

The total false accept probability depends on dimension of projected feature vector and the threshold .

###### 3.3.3. The Attacker Has the Projection Matrices

When the attacker has only the projection matrices, we consider a scenario in which a random vector is generated as the query feature vector and, after projection, is used to decode the vault. The probability that the projected vector falls into the region where its distance to a genuine vector is less than a threshold is used to measure the security in this case.

Suppose Euclidean distance is used to measure the distance between two vectors; the probability can be written as follows:

Assuming that entries in are uniformly and independently distributed in a given value range , to simplify the calculation, we transform the above probability to the probability that each random generated element in falls into a small value range; that is,

Since the entries are assumed to be uniformly distributed in a given value range, the probability that each entry falls into the small value range is as follows:

Substituting (18) into (17), we get where is the length of .

###### 3.3.4. The Attacker Has Projection Matrices and Impostor Feature Vector

This case is the user-independent scenario; all users use the same projection matrices. The attacker may take as a center to determine a hypersphere to find genuine points. According to proposed chaff point generation method, chaff vectors are added much closer to genuine vector than query vectors projected from impostor feature vectors, even though genuine projection matrices are used. So for each genuine vector, there will be lots of chaff vectors in the hypersphere in which the attacker does not know which one is exactly the genuine vector.

From the fuzzification phase in vault generation, we know that a vault of $n$ points contains $r$ genuine points and $n - r$ chaff points. On average, there are $n/r$ points in a hypersphere, only one of which is a genuine point. Assuming $t$ symbol errors can be corrected by the ECC, the security of the vault can be computed as follows:

$$H = \log_{2}\left(\frac{n}{r}\right)^{r - t} = (r - t)\log_{2}\frac{n}{r}. \tag{20}$$

Of the above four scenarios, the last one is the most severe, since the attacker has obtained the most information. In (20), there are three variables: the total number of points in the vault $n$, the number of genuine points in the vault $r$, and the number of symbols $t$ corrected by the ECC. The quantified bits and the trend of security under different parameter values are discussed in the next section.

#### 4. Experimental Results and Discussion

In this section, the proposed hybrid scheme is evaluated based on palmprint database. Concrete experimental results in terms of accuracy, changeability, and security are presented to support the proposed hybrid approach.

##### 4.1. Palmprint Database and Experimental Parameters

The Handmetric Authentication Beijing Jiao Tong University database (HA-BJTU) [19] is used in experiments. In HA-BJTU, there are 1973 palmprints of 98 people. The palmprints are resampled to 128 × 128, and the resolution of palmprint image is 72 dpi.

The classic principal component analysis (PCA) and linear discriminant analysis (LDA) are used to extract features from the palmprints. For feature extraction (PCA and LDA), five palmprint images of each person are used for training and the remaining 1483 palmprint images are used for testing.

In the experiments, the number of genuine points is set to 31; for each genuine point, 20 chaff points are generated for fuzzification using the proposed chaff point generation algorithm. The ECC is configured to correct one symbol error.

##### 4.2. Accuracy Experiments

Similar to biometric verification system, receiver operating characteristic (ROC) curve (which includes two kinds of error rates, that is, the false accept rate (FAR) and the false reject rate (FRR)) and equal error rate (EER) (when FAR = FRR) are used to evaluate the accuracy of proposed hybrid system. ROC curves are obtained by varying the controlling distance between chaff vectors and genuine vectors. EER curves are obtained under different dimensionality of projected feature vectors.

In the random projection module of proposed hybrid system, random matrices and biometric templates are needed for feature transformations, so it is a two-factor scheme. Three different scenarios, that is, stolen-key, stolen biometrics, and both legitimate cases, should be considered.

For the stolen-key case, the impostor uses the genuine projection matrices and impostor biometrics for vault unlocking. This is equivalent to the user-independent (UI) scenario, in which different users use the same projection matrices for vault unlocking, and it characterizes the system accuracy when user-independent transformations are used. For the stolen-biometrics scenario, randomly generated projection matrices and genuine biometrics are used for vault unlocking. In the both-legitimate case, each user uses different projection matrices for vault locking and unlocking. This is the user-dependent (UD) scenario.

Let $V = \mathrm{Gen}(B, R, K)$, where “Gen” represents the vault generation algorithm, $B$ represents the biometric features used for vault generation, $R$ represents the projection matrices used for feature transformations, and $K$ is the secret to be protected by the vault. Let “Unlock” represent the vault unlocking algorithm. Given genuine query biometrics $B'$ and the legal query matrix $R'$, if $\mathrm{Unlock}(V, B', R') \neq K$, this is a false reject case. Given impostor query biometrics $B_{I}$ and an impostor query matrix $R_{I}$, if $\mathrm{Unlock}(V, B_{I}, R_{I}) = K$, this is a false accept case.
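Given these definitions, the two error rates follow directly from counting unlocking outcomes. The sketch below uses a hypothetical helper and made-up toy outcomes, where `True` means the unlocking attempt returned the enrolled secret.

```python
def error_rates(genuine_unlocked, impostor_unlocked):
    """Return (FRR, FAR) from lists of booleans, one per unlocking attempt."""
    frr = sum(1 for ok in genuine_unlocked if not ok) / len(genuine_unlocked)
    far = sum(1 for ok in impostor_unlocked if ok) / len(impostor_unlocked)
    return frr, far


# Toy outcomes: True means Unlock returned the enrolled key.
genuine_unlocked = [True, True, False, True]            # one false reject out of four
impostor_unlocked = [False, False, False, True, False]  # one false accept out of five
print(error_rates(genuine_unlocked, impostor_unlocked))  # (0.25, 0.2)
```

Sweeping a system parameter and recomputing both rates at each setting gives the points of an ROC curve.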

Figure 4 shows the ROC curves in the user-independent scenario. The dimensionality of the genuine vector is 100. The LDA features outperform the PCA features because, in the user-independent case, random projection can only preserve the discriminability of features but cannot enhance it, and LDA features are known to have better discriminability than PCA features. The user-dependent scenario is not shown in Figure 4; in that scenario, the FRR decreases as the distances between chaff vectors and genuine vectors are enlarged, and vice versa, while the FAR remains at zero in the experiments.

The EER curves in Figure 5 are obtained by varying the dimensionality of the projected vectors. In the user-independent case, the EER decreases as the dimension increases, but zero EER is not reached. In the user-dependent scenario, the EER decreases to zero when the dimensionality is greater than or equal to 80. The zero EER of the hybrid system benefits from the random projection module, in which user-dependent projection matrices enhance the discriminability of the transformed biometric features.

##### 4.3. Changeability Experiments

The changeability of proposed hybrid scheme is provided by the random projection module, where different enrolling features can be generated for different applications by applying random projection with different projection matrices.

Let $V = \mathrm{Gen}(B, R, K)$, where $B$ is the enrolled biometric features, $R$ is the enrolled projection matrix, and $K$ is the protected key. When randomly generated projection matrices $R'$ and genuine biometric features $B'$ are used to unlock the vault, $\mathrm{Unlock}(V, B', R') = K$ is a false accept case; the obtained FAR is used to measure the changeability of the proposed scheme.

In the experiments, each test palmprint feature vector is paired with five groups of randomly generated matrices to unlock the corresponding vault. With 1483 test palmprints, 7415 unlocking attempts are performed in total.

The experimental results are shown in Figure 6. The FAR is zero for every projection dimension tested, which means that the proposed hybrid algorithm can provide strong changeability.

##### 4.4. Security Experiments

According to the theoretical analysis of security in Section 3.3, in this section we consider the quantized security bits in the worst case (i.e., the attacker has known projection matrices and has impostor biometrics) based on the experimental parameters.

In our experiments, the number of genuine points is $r = 31$. In fuzzification, 20 chaff points are added around each genuine point, so the total number of points is $n = 31 \times 21 = 651$. One symbol error can be corrected by the ECC; that is, $t = 1$. Substituting these parameters into (20), the obtained security is 131.77 bits, which is higher than the values typically reported in the literature [9, 10, 20–22].

Figures 7–9 show how the security bits change when the number of genuine points, the number of chaff points per genuine point, and the number of correctable symbol errors are varied. From Figure 7 we can see that the security bits increase rapidly as the number of genuine points increases. From Figure 8 we can see that the security also increases when more chaff points are added around each genuine point, but the growth rate decreases as the number of chaff points increases. From Figure 9 we can see that the security decreases as the number of corrected error symbols increases; this indicates the tradeoff between accuracy and security; that is, correcting more symbol errors can decrease the FRR of the system, but the security also decreases, and vice versa.
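The reported figure and the three trends can be reproduced with a few lines of arithmetic. The sketch below assumes that the worst-case measure has the form (genuine points minus correctable errors) times the base-2 logarithm of the average number of points per hypersphere; this form is an inference that matches the 131.77 bits quoted above, and the function and variable names are hypothetical.

```python
import math


def security_bits(n_total, n_genuine, n_correctable):
    """(r - t) * log2(n / r): assumed form of the worst-case security measure."""
    points_per_sphere = n_total / n_genuine
    return (n_genuine - n_correctable) * math.log2(points_per_sphere)


r = 31            # genuine points
n = r * (20 + 1)  # 20 chaff points per genuine point, 651 points in total
t = 1             # correctable symbol errors
baseline = security_bits(n, r, t)
print(round(baseline, 2))  # 131.77

# Trends: more genuine points, more chaff per genuine point, more corrected errors.
more_genuine = security_bits(41 * 21, 41, t)
chaff_40 = security_bits(r * 41, r, t)
chaff_60 = security_bits(r * 61, r, t)
more_errors = security_bits(n, r, 2)
print(more_genuine > baseline, chaff_40 - baseline > chaff_60 - chaff_40,
      more_errors < baseline)
```

The second comparison shows the diminishing return from extra chaff points: going from 20 to 40 chaff points per genuine point adds more bits than going from 40 to 60.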

#### 5. Conclusions

To better satisfy the accuracy, changeability, and security requirements for biometric template protection, this paper has proposed a hybrid approach for protecting real-valued palmprint feature vectors. The proposed hybrid approach includes two modules: random projection and a fuzzy vault scheme. A heterogeneous space was proposed for the fuzzy vault to enhance its tolerance of intraclass variations and to allow a cryptographic key of any required length to be bound. To improve the security of the fuzzy vault in the heterogeneous space, a chaff point generation method was also proposed.

Theoretical analyses from the accuracy, changeability, and security perspectives were presented. For the accuracy analysis, orthogonal and nonorthogonal projections were considered. For the changeability analysis, the statistical properties of projected feature vectors obtained with the same and with different projection matrices showed that a higher dimension of the projected feature vectors provides stronger cancelability. For the security analysis, we considered four scenarios in which the attacker knows different information.

Experiments on the HA-BJTU palmprint database have provided concrete data that support the proposed hybrid approach in terms of accuracy, changeability, and security.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

This work is supported by NSFCs (nos. 61201158 and 61201203), PCSIRT (no. IRT201206), and the Key Laboratory of Advanced Information Science and Network Technology of Beijing.