Abstract

Accuracy and efficiency are essential topics in current biometric recognition and security research. This paper proposes a deep neural network that uses bidirectional feature extraction and transfer learning to improve finger-vein recognition performance. First, we build a new finger-vein database whose position information is opposite to that of the original one and adopt transfer learning to make the network suitable for our overall recognition framework. Next, the feature extractor is constructed by tuning the network parameters on each unidirectional database, so that vein features are captured from top to bottom and vice versa. We then concatenate these two features to form the bidirectional feature of the finger veins, which is trained and classified by a Support Vector Machine (SVM) to realize recognition. Experiments are conducted on the published database of Malaysian Polytechnic University (FV-USM) and on the finger-vein database of the Signal and Information Processing Laboratory (FV-SIPL). The accuracy of the proposed algorithm reaches 99.67% and 99.31%, respectively, which is significantly higher than that of unidirectional recognition on each database. Compared with the algorithms cited in this paper, the proposed model based on the bidirectional feature offers higher accuracy, faster recognition, and excellent practical value.

1. Introduction

The fast development of biometric identification technology has made machine vision applications more extensive and in-depth. Meanwhile, with the continual improvement of modern science and technology, the security requirements for identity authentication are becoming higher and higher. Among biometric technologies, finger-vein recognition has been widely applied in information security, network payment, and other fields because it requires a living body and is hard to counterfeit [1, 2]. Owing to these advantages, finger-vein recognition has attracted increasing attention from researchers. Finger-vein identification systems usually consist of two processes, namely, feature extraction and matching. Finger veins contain much irregular texture information, shaded parts, and noise. Finger-vein images of the same finger have similar information, but there is a significant variance between different fingers. Therefore, researchers usually select useful patterns from finger veins and suitable matching strategies for recognition.

The existing finger-vein models can be roughly separated into two categories: nonlearning and learning models. In the nonlearning model, Gabor filters [3] were mostly applied for finger-vein feature extraction. When extracting binary vein texture information, the adopted algorithm is Local Binary Patterns (LBP) [4] or the improved LBP, called Line Local Binary Patterns (LLBP) [5]. Generally, the feature point extraction exploits the Scale-Invariant Feature Transformation (SIFT) approach [6]. Researchers often employ the above ones as feature extractors and Euclidean distance [7] as the final matching strategy. Since each point of the SIFT algorithm corresponds independently to another, it is usually robust to finger-vein deformation.

Nonlearning methods are still easily affected by the quality of finger-vein pictures, missing texture, and other problems, which leads to low robustness. To solve these problems, researchers began to introduce algorithms based on learning models [8]. Such methods are less sensitive to image quality and location information. He et al. [9] introduced Principal Component Analysis (PCA) to obtain the principal components of finger veins and then applied networks for classification and identification. Khellat-Kihel et al. [10] proposed using the Gabor filter for feature acquisition and SVM [11] for classification and matching. Wu et al. [12] designed a finger-vein recognition network based on SVM. After extracting and resizing the region of interest (RoI), they used PCA [13] and Linear Discriminant Analysis (LDA) [14] to reduce the feature dimension and SVM to classify the images.

Although these learning methods perform well, they still depend on a manually designed feature extraction algorithm, and the complexity of executing each block needs to be reduced. Presently, the Convolutional Neural Network (CNN) is widely used in biometric recognition. Its deep feature extraction and strong robustness have advanced finger-vein identification systems. Ahmad Radzi et al. [15] exploited a CNN with multiple layers to identify finger veins. Meng et al. [16] also applied a CNN to extract feature information, but they identified the final output features according to the Euclidean distance. Huang et al. [17] used VGG16 as the underlying network to learn normalized binary features. Hu et al. [18] proposed FV-Net based on deep learning, which obtained the recognition result by matching the finger-vein feature subregions extracted by the CNN. Since then, more and more scholars have applied CNNs to extract in-depth finger-vein features and have built models with strong robustness and high classification accuracy [19].

However, the abovementioned finger-vein identification algorithms only exploit a unidirectional database for feature capturing and recognition experiments. The obtained accuracy is not ideal, which can easily cause security risks in subsequent practical applications. The emergence of transfer learning [20] alleviates the problems of massive parameters and slow convergence during CNN training. These studies have laid a theoretical foundation for applying transfer learning to enhance biometric recognition.

Multimodal fusion technology, which introduces more meaningful information, has always been a research hotspot in biometric recognition tasks. Miao [21] realized an effective information fusion algorithm for iris and face at the feature level and the score level, respectively. Yang et al. [22] proposed a feature-level cancellable multibiometric system based on fingerprints and finger veins, which provided template protection and revocation. Inspired by these works, biometric features can be combined by score-level, pixel-level, or feature-level fusion to enhance performance. This paper proposes a bidirectional feature extraction algorithm that uses transfer learning and feature concatenation to enhance overall finger-vein recognition.

In this research, taking the published database of Malaysian Polytechnic University (FV-USM) [23] as an example, we create a new database by rotating the original finger-vein images by 180°, so that it contains the opposite location information. The original FV-USM is named the forward database (A FV-USM), and the rotated database is named the reverse database (B FV-USM). The same is done for the database collected by our group in the Signal and Information Processing Laboratory (FV-SIPL): the original FV-SIPL is called A FV-SIPL, and the rotated database is called B FV-SIPL. The forward and reverse databases are trained separately in a unidirectional way. Then, we tune the pretrained parameters to detect the in-depth information of each direction and fuse the results by concatenation to form the bidirectional feature for each database. Finally, we complete the experiments with the SVM classifier.
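
The 180° rotation that produces the reverse database can be sketched in a few lines. This is only an illustration under stated assumptions: the image is represented as a plain list of pixel rows, the helper name `rotate_180` and the pixel values are hypothetical, and the paper does not state which tool was actually used for the rotation.

```python
def rotate_180(image):
    """Return a copy of a 2-D image (list of rows) rotated by 180 degrees."""
    # Reversing the row order flips the image vertically;
    # reversing each row flips it horizontally. Together: a 180-degree rotation.
    return [row[::-1] for row in image[::-1]]


# Tiny 2x3 stand-in for a grayscale finger-vein image (hypothetical pixel values).
forward_image = [
    [1, 2, 3],
    [4, 5, 6],
]
reverse_image = rotate_180(forward_image)
print(reverse_image)  # [[6, 5, 4], [3, 2, 1]]
```

Applying the rotation twice returns the original image, so the forward and reverse databases carry the same pixels in opposite positions.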

The experimental results indicate that the accuracy of bidirectional feature extraction on FV-USM and FV-SIPL is 99.67% and 99.31%, respectively, significantly higher than that of the unidirectional models. Meanwhile, compared with most existing finger-vein recognition algorithms cited in this paper, our model captures richer detailed information and achieves higher accuracy with less time consumption.

2. Materials and Methods

2.1. Transfer Learning and Learning Optimization

Deep learning developed from the initial perceptron neural network. With the development of science and technology, hardware equipment improves the computing capability and calculation training of complex parameter networks. Large-scale data training enhances the network’s intelligence and reduces the reliance on prior knowledge while avoiding overfitting. At present, deep learning has become a research hotspot. It is the most crucial component in the field of artificial intelligence.

Our work utilizes deep CNN to extract finger-vein features and provide more input features for subsequent SVM recognition, effectively improving the model’s accuracy and generalization. Transfer learning [24] is a crucial branch of machine learning, making significant progress so far, especially in image recognition. Transfer learning is the technology that applies the pretrained network from a particular task to another one through parameter adjustment. The issue is to find the correlation between the new and previous problems.

In deep learning, CNN optimization is a complex but essential process. The following two optimization methods are commonly applied. To improve recognition performance, we adopt the first method in this paper.
(1) Data enlargement: The CNN training process usually requires a large amount of data for fitting. In the experiments, it is necessary to reasonably expand the finger-vein databases to fit the network and improve robustness.
(2) Regularization: The purpose of regularization is to obtain small training errors together with robust testing results. Generally, CNN models need to adopt corresponding optimization methods to reduce test-set errors, which are collectively called regularization operations. Regularization can improve the robustness of algorithms and prevent overfitting.

2.2. Feature-Level Fusion

There are two different types of fusion in our experiments, shown in Figure 1. Our proposed feature concatenation method is carried out at feature level. Feature-level fusion [22] is an intermediate-level fusion method. Through specific algorithms, extracted features are simplified to the characteristic with large differences between classes and minor differences within classes for subsequent feature matching and classification decisions.

Additionally, score-level fusion [21] is another fusion strategy in matching level. This fusion method enjoys fast implementation, simple fusion rules, and positive effects. After feature extraction and corresponding matching, there are different matching distances or scores, which are standardized to achieve a unified calculation criterion. According to some specific score-fusion rules, fusion weights contribute to the final result. This method has apparent advantages in multimodal biometric recognition.
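
The standardize-then-weight procedure described above can be sketched as follows. This is a minimal illustration, not the fusion rule used in the experiments: min-max normalization and a weighted sum are assumed choices, higher scores are assumed to mean better matches, and the function names and score values are hypothetical. The default weights of 0.5 and 0.5 correspond to an equal (5 : 5) score division.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(scores_a, scores_b, weight_a=0.5, weight_b=0.5):
    """Weighted-sum fusion of two per-class score lists after normalization."""
    norm_a = min_max_normalize(scores_a)
    norm_b = min_max_normalize(scores_b)
    return [weight_a * a + weight_b * b for a, b in zip(norm_a, norm_b)]


# Hypothetical matching scores for three classes from two matchers
# that report on different scales.
forward_scores = [2.0, 6.0, 4.0]
reverse_scores = [10.0, 40.0, 30.0]

fused = fuse_scores(forward_scores, reverse_scores)
# The class with the largest fused score is taken as the decision.
predicted_class = max(range(len(fused)), key=fused.__getitem__)
print(predicted_class)  # 1
```

Normalizing first matters because the two matchers may produce scores on different scales; without it, the matcher with the larger range would dominate the weighted sum.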

In a nutshell, our proposed feature concatenation method can fuse several different feature sets to form more representative feature vectors. The score-level fusion method has the advantages of low implementation difficulty, low complexity, high recognition accuracy, and fast recognition. The experiments are carried out at both the feature level and the score level to verify our proposed network based on the concatenated feature.

3. Proposed Approach

Our proposed methodology based on bidirectional feature extraction takes into account both the rationality and the feasibility of the scheme. The standard CNN image feature extraction process proceeds from left to right and from top to bottom. The bidirectional feature extraction in this paper applies this process to both the original image and its reversed counterpart. After this step, we acquire more features from the same finger-vein images, which facilitates the subsequent recognition experiments. In theory, the more information is obtained, the better the recognition effect, which is confirmed by the subsequent experimental results. In addition, the extracted features of the two images have fixed positional relations. They can be saved by the image registration method, which provides feasibility for further practical applications. The implementation steps are as follows, and the overall flowchart is shown in Figure 2.

Step 1. The first step is acquiring finger-vein images, making a new finger-vein dataset with reverse positions. Before inputting to the framework, they are preprocessed by extracting the region of interest (RoI), normalizing, and image enhancement, detailed in Section 4.1.

Step 2. Following the pretrained structure and parameter migration of VGG19 and ResNet50, we construct and save the feature extraction framework. Then, together with a pooling layer and a 2048-dimensional fully connected layer, it is regarded as the proposed finger-vein feature extractor, as described in Section 3.1.

Step 3. The A FV-USM, B FV-USM, A FV-SIPL, and B FV-SIPL datasets are fed into the model separately. The network then outputs 2048-dimensional vectors in the feed-forward pass, which are used as the unidirectional finger-vein features, as described in Section 3.2.

Step 4. We can concatenate the two finger-vein features from the same database and feature extraction method, generating a bidirectional feature. Finally, training and testing processes are completed through the SVM classifier, as shown in Section 3.3.

3.1. Pretrained Model Selection

LeNet5 [25] marks the beginning of CNN research, and AlexNet [26] and VGGNet [8] are improvements on it that belong to the nonbranching networks. As networks became deeper, the training process began to show gradient dispersion, overfitting, and other phenomena. To solve these issues, researchers did not limit themselves to deepening the networks but also considered their width. As a result, several excellent networks such as ResNet [27], DenseNet [28], and ResNeXt [29] were successively put forward. As in [30], various researchers introduced residual calculation to deepen the network and shortcut connections to help fitting.

Meanwhile, learning based on residuals can extract better image features. Based on VGGNet, Hong et al. [8] took the test and training sets as input and learned their correlations through training. This method achieved good results in the finger-vein recognition experiment. Das et al. [31] presented a network with high accuracy and stable performance. When tested on public databases, the model ensures stable recognition of finger-vein images of different quality, rotation, scaling, and translation. In summary, we choose VGG19, with its nonbranching structure, and ResNet50, with its residual module, as the most suitable networks for training the unidirectional finger-vein databases in our task. More details are discussed in the experiment section.

3.2. Unidirectional Finger-Vein Recognition Model Using Transfer Learning

The next step is to apply the pretrained VGG19 and ResNet50 to the four databases of A FV-USM, B FV-USM, A FV-SIPL, and B FV-SIPL and tune the parameters suitable for finger-vein recognition. Our work adopts transfer learning for training the unidirectional finger-vein model, shown in Figure 3.

Firstly, the model weights obtained from ImageNet training are applied to initialize the training parameters. The training images of A FV-USM, B FV-USM, A FV-SIPL, and B FV-SIPL are then input into the model separately for parameter tuning. After multiple iterations and parameter adjustment of the CNN, we obtain the model corresponding to each unidirectional finger-vein database, in preparation for the subsequent bidirectional feature concatenation experiments. During classification and recognition, the similarity between the sample features of the test set and the training set is calculated to obtain probabilities, and the decision is made according to these probabilities. To be specific, the Softmax function returns the probability values of all finger-vein categories, and the category corresponding to the maximum value is taken as the recognized identity.
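
The Softmax decision rule described above can be illustrated with a short sketch. The logit values below are hypothetical placeholders for the outputs of the final network layer, and the function is a generic numerically stable softmax rather than the Keras implementation used in the experiments.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing
    # without changing the resulting probabilities.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical final-layer outputs for three finger-vein categories.
logits = [1.0, 3.0, 0.5]
probabilities = softmax(logits)

# The category with the maximum probability is the recognized identity.
predicted = max(range(len(probabilities)), key=probabilities.__getitem__)
print(predicted)  # 1
```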

3.3. Proposed Finger-Vein Recognition Model Based on Bidirectional Feature

As shown in Figure 2, the experiment first extracts features from the original finger-vein image and from the image after the position transformation. This process aims to simplify the original complex information and obtain meaningful features with large differences between classes and small differences within classes. We then form a bidirectional feature by concatenation for matching, decision-making, and the final classification. For example, the bidirectional feature representation consists of the feature vectors extracted by VGG19 after parameter tuning on A FV-USM and B FV-USM, and likewise for ResNet50. After this processing, the model obtains a 4096-dimensional finger-vein feature vector. Similarly, we capture the bidirectional feature of A FV-SIPL and B FV-SIPL. Finally, we complete training and testing through the SVM classifier, whose results are presented in Section 5.
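
The concatenation step can be sketched as follows. The vectors are placeholders standing in for the 2048-dimensional CNN outputs, the function name is hypothetical, and placing the forward feature before the reverse feature is an assumption, since the text does not specify the order.

```python
FEATURE_DIM = 2048  # per-direction feature length stated in the text


def concatenate_features(forward_feature, reverse_feature):
    """Join the forward and reverse feature vectors into one bidirectional vector."""
    if len(forward_feature) != FEATURE_DIM or len(reverse_feature) != FEATURE_DIM:
        raise ValueError("each unidirectional feature must have 2048 entries")
    # Assumed order: forward (A) feature first, reverse (B) feature second.
    return list(forward_feature) + list(reverse_feature)


# Placeholder vectors standing in for the CNN outputs of one finger
# in the forward and reverse databases.
forward_feature = [0.1] * FEATURE_DIM
reverse_feature = [0.2] * FEATURE_DIM

bidirectional = concatenate_features(forward_feature, reverse_feature)
print(len(bidirectional))  # 4096
```

The resulting 4096-dimensional vectors are what the SVM classifier is trained and tested on. Whatever order is chosen, it has to be the same for every sample so that each position in the vector keeps a fixed meaning.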

4. Experiments

The experiments are conducted on FV-USM [23] and FV-SIPL to verify our proposed finger-vein recognition framework using bidirectional feature and transfer learning. We discuss the experimental details and training iterations in this section and the results in Section 5.

4.1. Data Preprocessing

The CNN-based finger-vein recognition process is generally made up of four parts: data loading, image preprocessing, feature extraction, and classification and recognition. The acquisition module is responsible for collecting biometric images. Image preprocessing aims to eliminate noise in the acquired image, extract the region of interest (RoI) [32], and improve image quality. Our experiments include two databases: the finger-vein database of Malaysian Polytechnic University (FV-USM) and that of our group in the Signal and Information Processing Laboratory (FV-SIPL). The finger-vein databases used in our experiments are summarized in Table 1.

The finger-vein collection device built by our group in the Signal and Information Processing Laboratory adopts the direct light collection method. The device is closed, so it produces high-quality images and is not easily disturbed by external light. Moreover, even without RoI extraction, the collected finger-vein images do not contain any part of the instrument. The principle of the FV-SIPL collector, the collection device, and some of the collected finger-vein images are shown in Figure 4.

The training database in our work includes 150 classes of finger veins from FV-USM and all images in FV-SIPL. The specific division of the databases is shown in Table 2.

Within the datasets, FV-USM contains many instrument regions and different thickness of people’s fingers, which will affect the subsequent recognition [33]. Therefore, we should conduct a series of operations on the collected images, such as RoI extraction, size normalization, and image enhancement, as illustrated in Figure 5.

As shown in Section 2, image enhancement is a critical step for data preprocessing. This work applies the contrast-limited adaptive histogram equalization (CLAHE) [34] algorithm to improve the finger-vein images. Taking the FV-SIPL database as an example, the image enhancement effect through CLAHE is shown in Figure 6.

4.2. Environment Settings

The training environment is a 64-bit Ubuntu operating system with 64 GB of memory, an Intel Core i5-5200U CPU, and a GeForce GTX Titan-X GPU. Our experiments are based on Python 3.6 with the Keras and TensorFlow libraries. In addition, we expand the database to avoid overfitting during training. Specifically, the finger-vein images are stretched, randomly cropped, and transformed by other operations. The training parameter settings are shown in Table 3, where SGD represents Stochastic Gradient Descent [35].
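
As an illustration of one of the augmentation operations mentioned above, the following sketch performs a random crop on an image stored as a list of pixel rows. The function name, the crop size, and the seeded generator are assumptions for demonstration; the paper does not give the crop parameters, and the experiments themselves relied on Keras and TensorFlow rather than this code.

```python
import random


def random_crop(image, crop_height, crop_width, rng=random):
    """Cut a crop_height x crop_width window at a random position of a 2-D image."""
    height, width = len(image), len(image[0])
    if crop_height > height or crop_width > width:
        raise ValueError("crop window is larger than the image")
    # randint is inclusive on both ends, so every valid offset can be drawn.
    top = rng.randint(0, height - crop_height)
    left = rng.randint(0, width - crop_width)
    return [row[left:left + crop_width] for row in image[top:top + crop_height]]


# Hypothetical 4x6 image whose pixel value encodes its (row, column) position.
image = [[10 * r + c for c in range(6)] for r in range(4)]

# A seeded generator makes the example reproducible.
crop = random_crop(image, 3, 4, rng=random.Random(0))
print(len(crop), len(crop[0]))  # 3 4
```

Each call with a fresh random state yields a slightly different view of the same finger, which is how cropping enlarges the training set without collecting new images.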

4.3. Experiment Iterations

After adopting the VGG19 and ResNet50 models to train the above four unidirectional finger-vein datasets, the loss variation curve and recognition accuracy of the FV-USM and FV-SIPL verification sets are shown in Figures 7(a)–7(d) and 8(a)–8(d), where all the horizontal axes of Figures 7 and 8 are iteration steps.

The vertical coordinates refer to the loss of verification set of A FV-USM and B FV-USM in Figures 7(a) and 7(c), respectively, and the recognition accuracy in Figures 7(b) and 7(d). Similarly, the vertical axes represent the verification loss of A FV-SIPL and B FV-SIPL in Figures 8(a) and 8(c) and accuracy rate in Figures 8(b) and 8(d).

According to these results, the loss of ResNet50 converges to a stable state faster than that of VGG19. Our proposed model based on ResNet50 achieves higher accuracy and converges to a lower loss than the one based on VGG19. Compared with ResNet50, the accuracy and loss of VGG19 fluctuate over a wider range during training because of its simple and shallow structure.

5. Experimental Results

First, Table 4 shows the testing recognition accuracy obtained by the unidirectional finger-vein recognition models based on VGG19 and ResNet50. Overall, the highest accuracy is obtained on our self-built dataset FV-SIPL, which requires no RoI extraction and contains less marginal information. With its deeper structure, ResNet50 achieves a higher recognition accuracy than VGG19, a simple nonbranching network of hierarchically stacked layers.

We then compare our proposed model based on bidirectional feature extraction and transfer learning with the unidirectional models. It can be seen from Tables 4 and 5 that, for both VGG19 and ResNet50, the finger-vein feature concatenation algorithm improves the accuracy to varying degrees compared with the unidirectional finger-vein databases under the respective models.

In the FV-USM recognition experiment, the most noticeable improvement in the recognition effect comes from the feature concatenation experiment based on VGG19, which reaches 98.00%, 1.73% higher than that of A FV-USM. The recognition rate of ResNet50, which is based on the residual network [30], is 99.67%, 1.36% higher than that of A FV-USM. Similarly, in the FV-SIPL recognition experiment, the most obvious improvement in the recognition effect is still the feature concatenation based on VGG19. Its recognition rate reaches 99.07%, 1.05% higher than that of the B FV-SIPL experiment. ResNet50 has a recognition rate of 99.31% in the feature concatenation experiment, 0.24% higher than that of A FV-SIPL.

Furthermore, we extend the experiments to the score-fusion method and select the results with the best accuracy or the minimal time consumption for comparison with our proposed concatenated feature. At the cost of slightly more time, score fusion achieves improved accuracy with a score division of 5 : 5, as shown in Table 6. However, it is difficult to determine the proper score weights, and the performance differs greatly across score divisions. An inappropriate score division may cause large time consumption. Overall, the accuracy and time consumption of the score-fusion version do not outperform our proposed method based on feature concatenation.

Additionally, Table 7 shows the comparison between the proposed and state-of-the-art efficiency of finger-vein recognition algorithms. The proposed CNN method mentioned in [39] denotes an improved structure of LeNet5. The references cited in the table adopt different image feature extraction algorithms or different CNN models.

According to the comparative experimental results in Table 7, the model based on bidirectional feature extraction presented in this paper has clear recognition advantages over the methods in the existing literature. Meanwhile, in contrast to traditional image feature extraction algorithms, the proposed model has more evident advantages in both finger-vein recognition effect and time consumption.

The experimental results indicate that the algorithm achieves high accuracy for finger veins. This is attributable to the image feature extraction of this approach, which acquires more abundant information. The method in this paper makes up for the lack of in-depth information in previous methods and has a positive impact on the subsequent recognition steps. In conclusion, the algorithm in this paper achieves a state-of-the-art recognition level with little time consumption and has practical value.

6. Conclusions

We propose a novel approach based on bidirectional feature extraction to enhance the accuracy and efficiency of current finger-vein recognition algorithms. The model is constructed from finger-vein preprocessing, forward and reverse pretrained recognition modules, feature extractors, and an SVM classifier. The proposed two-direction extraction detects more detailed information and meaningful relations, which improves the recognition effect. Experimental results show that the framework in this paper achieves high accuracy, reaching 99.67% when tested on FV-USM and 99.31% on FV-SIPL. Our method outperforms most state-of-the-art and classical finger-vein frameworks, as shown in Table 7. Future research will be devoted to combining existing theoretical research with practical applications and developing a high-accuracy finger-vein recognition system based on bidirectional feature extraction. How to enhance performance and robustness without increasing complexity is a long-term research topic. Building on our current work, more attention can be paid to the acquisition and representation of meaningful information in learning methods. This solution can be applied to deal with the leakage of biometric information and to increase identification reliability, availability, and security.

Data Availability

The FV-USM database used to support the findings of this study may be released upon application to Dr. Bakhtiar Affendi Rosdi, who can be contacted at [email protected]. The FV-SIPL database is available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Key R&D Program of China funded by the Ministry of Science and Technology of China (no. 2018YFB1403303).