#### Abstract

Fingerprint identification is an efficient biometric technique for authenticating people in real-time Big Data analytics. In this paper, we propose an efficient Finite State Machine (FSM) based reconfigurable architecture for fingerprint recognition. The fingerprint image is resized, the Compound Local Binary Pattern (CLBP) is applied to it, and a histogram is computed to obtain histogram CLBP features. Discrete Wavelet Transform (DWT) Level 2 features are obtained from the same resized image. A novel CLBP matching score is computed from the histogram CLBP features of the test image and of the fingerprint images in the database. Similarly, the DWT matching score is computed from the DWT features of the test image and of the fingerprint images in the database. The CLBP and DWT matching scores are then fused with an arithmetic equation using an improvement factor. Performance parameters such as TSR (Total Success Rate), FAR (False Acceptance Rate), and FRR (False Rejection Rate) are computed from the fused scores with a correlation matching technique on the FVC2004 DB3 database. The proposed fusion based VLSI architecture is synthesized on a Virtex xc5vlx30T-3 FPGA board using a Finite State Machine, resulting in optimized parameters.

#### 1. Introduction

Reliable personnel authentication [1, 2] based on biometrics is of significant importance in the present digital world and can be achieved through human and computer interface activities. The evolution of biometrics in recent years from single mode to multiple mode closed systems has made it possible to consider biometrics for Big Data processing [3, 4]. The development of new algorithms and parallel processing architectures affects the response time of Big Data processing. Interoperability and the many sources and avenues for collecting biometric samples have made the evolution of biometrics towards Big Data possible. The physiological or behavioural biometric samples are captured using sensors or devices and are further processed in the next level of vetting through the Office for Personal Management (OPM), which can be either verification or identification. Fingerprint based identification is one of the most important biometric technologies and has drawn a substantial amount of attention recently, since the process of acquiring fingerprint samples is easy and simple. A fingerprint is seen as a set of interleaved ridges and valleys on the surface of the finger. Most fingerprint matching approaches rely on the fact that the uniqueness of a fingerprint can be determined by minutiae, which are represented by either bifurcations or terminations of ridges. The quality and enhancement of minutiae [5–7], which influence recognition rates, are discussed in the literature.

The features of a fingerprint can be derived using the following:

(i) *Spatial domain*: the features are computed directly on the pixel values of an image. Examples are Local Binary Pattern [8], Complete Local Binary Pattern [9], and Singular Value Decomposition [10].

(ii) *Transform domain*: a transform is applied to the original image to obtain a transformed image on which further processing is done. Examples are Fast Fourier Transform [11], Discrete Cosine Transform [12], Discrete Wavelet Transform [13], and Dual Tree Complex Wavelet Transform [14].

(iii) *Fusion*: this technique [15, 16] combines the advantages of both the spatial and the transform domain.

Automated fingerprint recognition systems are used for both identification and verification: law enforcement agencies match against a standard database to identify a crime suspect, and attendance systems verify a claimed identity. The speed of a fingerprint system is a critical factor when dealing with large databases. Real-time processing in a fingerprint recognition system is its ability to process large amounts of data and produce results within time constraints in the order of milliseconds, and sometimes microseconds, depending on the application and the user requirements. In this category, Field Programmable Gate Arrays (FPGAs) outperform other processors. FPGAs are specially built hardware optimized for speed and are suitable for real-time biometric data processing. Multicores and HPC clusters have reasonable real-time processing capabilities, but are not as efficient as FPGAs with many processing cores and high bandwidth memory.

In real time, the speed of the algorithm becomes crucial, which in turn defines the throughput. Efficient FPGA architectures [17–20] for fingerprint processing and existing algorithms to identify a fingerprint based on minutiae [21], ridges, multiresolution features, and the Hough transform have been discussed in the literature.

Vatsa et al. [22] proposed a Redundant Discrete Wavelet Transform based local image quality assessment algorithm followed by a feature extraction algorithm using Level 3 features. These features are combined with Level 1 and Level 2 features in the fingerprint identification scheme. Finally, the matching performance was improved by using quality based likelihood ratios. Govan and Buggy [23] proposed an effective matching solution that addresses security and privacy issues. This technique eliminates the requirement to release biometric template data into an open environment and uses embedded applications such as smart cards. An effective disturbance rejection methodology, which is able to differentiate between equivalent and insignificant structure models, was discussed.

Nain et al. [24] proposed an algorithm to classify fingerprint images into four different classes using a two-stage High Ridge Curvature (HRC) algorithm. In the first stage, the HRC region was extracted, which avoids core point detection. In the second stage, ridges inside the HRC region were considered for matching. The global distribution structure and the local matching similarities [25] between fingerprints were considered for matching using a Hidden Markov Model (HMM) [26]. Nikam and Agarwal [27] proposed spoof fingerprint detection using the ridgelet transform. Individual ridgelet energy and cooccurrence signatures were compared, and testing was done using diverse classifiers. Masmoudi et al. [28] proposed an algorithm which used a rotation invariant measure of local phase combined with Local Binary Pattern features to improve accuracy. Stewart et al. [29] proposed a test technique to determine the effects of outdoor and cold weather conditions on chip versus optical fingerprint scanners, fingerprint recognition quality, and device interaction. The results suggested that performance has no dependence on temperature and humidity. Cao and Dai [30] proposed fingerprint segmentation for online processing using a frame difference technique. The segmented foreground was then used for identification.

Umamaheswari et al. [31] proposed fingerprint classification and recognition using a neuro-nearest neighbour based method which improves the classification rate. It consists of different stages such as image enhancement, line detector based feature extraction, and neural network classification using back propagation networks. The results showed accurate estimation of orientation and ridge frequency, which helps in better recognition. Conti et al. [32] proposed pseudo-singularity points based fingerprint recognition. This technique uses additional parameters, such as relative distance and orientation around the standard singularity points (core and delta), which enhance the matching performance. Ahmed et al. [33] proposed the Compound Local Binary Pattern (CLBP) for rotation invariant texture classification. It combines magnitude information of the difference between two grey values with the original LBP pattern and provides robustness. Paulino et al. [34] proposed an alignment algorithm (descriptor-based Hough transform) for latent fingerprint matching. This technique measures similarity between fingerprints by considering both minutiae and orientation field information. The proposed transform was compared with the generalized Hough transform on a large database.

Feng et al. [35] proposed a technique using orientation field estimation based on prior knowledge of fingerprint structure. The dictionary of reference for orientation patches was constructed using a true set of orientation fields. The approach was applied to the overlapped latent fingerprint database to achieve better performance compared to conventional algorithms.

*Contribution*. The contributions and novel aspects of the proposed technique are as follows: (i) the computation of novel matching scores for CLBP and DWT features; (ii) matching score values that are computed adaptively from the characteristics of the images; (iii) the fusion of the matching scores with an improvement factor; (iv) the implementation of an FSM based VLSI architecture to improve hardware performance.

#### 2. Proposed Fingerprint Recognition System

An efficient fingerprint recognition model using histogram of CLBP scores, DWT feature scores, and fusion of both scores is given in Figure 1.

##### 2.1. Fingerprint Database

The DB3 of FVC2004 fingerprint database [36] is considered for performance analysis. The size of each fingerprint image is 300 × 480 with 512 dpi. The fingerprint samples of ten different persons are shown in Figure 2.

##### 2.2. Preprocessing

The original fingerprint image of size 300 × 480 is resized to 256 × 256, which is suitable for hardware implementation.

##### 2.3. Complete Local Binary Pattern (CLBP)

CLBP is an extension of the Local Binary Pattern (LBP) [37] texture operator. For each pixel, the CLBP operator derives a sign component CLBP_S and a magnitude component CLBP_M from its neighbouring pixels. If $P$ is the number of neighbours of a centre pixel, then the CLBP operator uses $2P$ bits to code the centre pixel. The first $P$ MSB bits represent the sign and the next $P$ LSB bits represent the magnitude.

Binary bit patterns are generated for the sign and magnitude components of each pixel. The fingerprint image is scanned from left to right and top to bottom, and each pixel is considered together with its 8 neighbouring pixels, that is, a 3 × 3 matrix. Let the centre pixel intensity value be $g_c$ and the neighbouring pixel intensity values be $g_p$, $p = 0, \ldots, 7$, and let $d_p$ denote the difference between $g_p$ and $g_c$. The sign bit pattern for a 3 × 3 matrix is generated using

$$s_p = \begin{cases} 1, & d_p \ge 0 \\ 0, & d_p < 0 \end{cases} \tag{1}$$

The magnitude bit pattern is generated using

$$t_p = \begin{cases} 1, & m_p > M_{\mathrm{avg}} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

where $M_{\mathrm{avg}} = \frac{1}{8}\sum_{p=0}^{7} m_p$ and $m_0$ to $m_7$ are the magnitude values $|d_p|$ of the difference between the respective $g_p$ and $g_c$.

Each neighbourhood pixel is represented by two bits; that is, the MSB represents the sign and the LSB represents the magnitude. Each centre pixel is therefore represented by eight sign bits and eight magnitude bits. An example of CLBP is shown in Figure 3. Arbitrary values for a 3 × 3 matrix are considered in Figure 3(a). The values of the neighbouring pixels are subtracted from the centre pixel value and are given in Figure 3(b). The sign of each coefficient in Figure 3(b) is represented in Figure 3(c) as the sign component of CLBP. The magnitude components of CLBP are shown in Figure 3(d) by considering only the magnitudes of Figure 3(b). The average value of the CLBP magnitude component is computed and compared with the neighbouring CLBP magnitude coefficient values, and binary values are assigned using (2) to generate the CLBP magnitude pattern given in Figure 3(f). With a 3 × 3 window, an image of size 256 × 256 has 254 × 254 = 64516 centre pixels. The eight sign bits and eight magnitude bits of each pixel are converted into decimal values for feature extraction. If the CLBP sign and magnitude coefficients are used directly as features for an image of size 256 × 256, the algorithm requires 64516 features for sign and 64516 for magnitude; that is, the total number of features is 129032.
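The sign and magnitude coding of a single 3 × 3 neighbourhood can be sketched in Python as below. This is only an illustration, not the authors' implementation: the function name `clbp_codes`, the clockwise neighbour ordering, the choice of neighbour minus centre for the difference, and the strict comparison against the average magnitude are all assumptions made for the example.

```python
def clbp_codes(window):
    """Return (sign_code, magnitude_code) for a 3x3 window given as 3 rows.

    Assumptions (not fixed by the text): neighbours are read clockwise from
    the top-left corner, the difference is neighbour minus centre, and a
    magnitude bit is 1 only when the magnitude exceeds the average.
    """
    centre = window[1][1]
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    diffs = [window[r][c] - centre for r, c in coords]
    mags = [abs(d) for d in diffs]
    avg = sum(mags) / len(mags)

    sign_bits = [1 if d >= 0 else 0 for d in diffs]
    mag_bits = [1 if m > avg else 0 for m in mags]

    def to_int(bits):
        # Interpret the eight bits as one decimal value (first bit is the MSB).
        return int("".join(str(b) for b in bits), 2)

    return to_int(sign_bits), to_int(mag_bits)


if __name__ == "__main__":
    example = [[6, 5, 2],
               [7, 6, 1],
               [9, 8, 7]]
    print(clbp_codes(example))
```

Each centre pixel thus yields two 8-bit values, one for the sign pattern and one for the magnitude pattern, which is what makes a 256-bin histogram per component possible.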

###### 2.3.1. Histogram of CLBP Features

The features obtained directly from CLBP are large in number; they increase the matching time and are a disadvantage in hardware implementation. The histogram of CLBP produces only 256 features each for sign and magnitude. Hence the number of features is reduced from 129032 to 512, that is, approximately 0.4% of the direct CLBP features. The advantage of the histogram of CLBP is that the number of features is reduced and the features are also more distinctive. The histograms of the original fingerprint and of the sign and magnitude components of CLBP are shown in Figure 4.
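The feature reduction can be checked with a few lines of Python. This is a sketch under the assumption that each CLBP component is an 8-bit code per centre pixel; the helper names `histogram256` and `feature_counts` are hypothetical.

```python
def histogram256(codes):
    """Count how often each 8-bit code (0..255) occurs."""
    hist = [0] * 256
    for code in codes:
        hist[code] += 1
    return hist


def feature_counts(height, width):
    """Return (centre pixels, direct CLBP features, histogram features)."""
    centres = (height - 2) * (width - 2)   # a 3x3 window cannot be centred on the border
    direct = 2 * centres                   # one sign and one magnitude code per centre
    hist_features = 2 * 256                # 256 bins each for sign and magnitude
    return centres, direct, hist_features


if __name__ == "__main__":
    centres, direct, reduced = feature_counts(256, 256)
    print(centres, direct, reduced, round(100 * reduced / direct, 1))
```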

**(a) Histogram of fingerprint**

**(b) Magnitude histogram**

**(c) Sign histogram**

###### 2.3.2. Proposed CLBP Matching Score

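The CLBP histograms of the test and database images are compared componentwise to compute the CLBP match score. The absolute sign component difference CLBP_S_diff between the sign component CLBP_S_test of the test fingerprint and the sign component CLBP_S_db of the fingerprint images in the database is computed using

$$\mathrm{CLBP\_S\_diff}_j(i) = \left|\mathrm{CLBP\_S\_test}(i) - \mathrm{CLBP\_S\_db}_j(i)\right| \tag{3}$$

where $i$ is the intensity value (0 to 255) and $j = 1, \ldots, N$, with $N$ = number of persons in the database × number of images per person.

The CLBP sign histogram coefficient match is computed based on a threshold on the sign difference value (14 for best match), given in

$$\mathrm{S\_match}_j(i) = \begin{cases} 1, & \mathrm{CLBP\_S\_diff}_j(i) < 14 \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

The absolute magnitude component difference CLBP_M_diff between the magnitude component CLBP_M_test of the test fingerprint and the magnitude component CLBP_M_db of the fingerprint images in the database is computed using

$$\mathrm{CLBP\_M\_diff}_j(i) = \left|\mathrm{CLBP\_M\_test}(i) - \mathrm{CLBP\_M\_db}_j(i)\right| \tag{5}$$

where $i$ is the intensity value (0 to 255) and $j = 1, \ldots, N$, with $N$ = number of persons in the database × number of images per person.

The CLBP magnitude histogram coefficient match is computed based on a threshold on the magnitude difference value (18 for best match), given in

$$\mathrm{M\_match}_j(i) = \begin{cases} 1, & \mathrm{CLBP\_M\_diff}_j(i) < 18 \\ 0, & \text{otherwise} \end{cases} \tag{6}$$

The overall CLBP match count, considering both sign and magnitude histogram coefficients, is

$$\mathrm{CLBP\_Match\_count}_j = \sum_{i=0}^{255} \left[\mathrm{S\_match}_j(i) + \mathrm{M\_match}_j(i)\right] \tag{7}$$

The CLBP match score is computed from the CLBP match count and the number of histogram levels using

$$\mathrm{CLBP\_Match\_score}_j = \frac{\mathrm{CLBP\_Match\_count}_j}{2 \times 256} \times 100\% \tag{8}$$

The first and eighth samples of the same person are considered as the database and test image. The original fingerprint, CLBP magnitude component, and CLBP sign component images of the database and test image are shown in Figures 5(a)–5(c) and 5(d)–5(f), respectively. The CLBP_Match_score computed between the database image and the test image of the same person yields a high value, that is, 67.9%.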

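**Figure 5: (a)–(c) original fingerprint, CLBP magnitude component, and CLBP sign component of the database image; (d)–(f) the same for the test image of the same person.**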

The first and eighth samples of different persons are considered as the database and test image. The original fingerprint, CLBP magnitude component, and CLBP sign component images of the database and test image are shown in Figures 6(a)–6(c) and 6(d)–6(f), respectively. The CLBP_Match_score computed between the database image and the test image of different persons yields a low value, that is, 51.9531%.

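**Figure 6: (a)–(c) original fingerprint, CLBP magnitude component, and CLBP sign component of the database image; (d)–(f) the same for the test image of a different person.**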
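The CLBP match score described in this section can be sketched as follows. The function name `clbp_match_score` and the use of a strict "less than" comparison against the thresholds are assumptions; the thresholds 14 and 18 and the 512 total histogram features come from the text.

```python
def clbp_match_score(sign_test, mag_test, sign_db, mag_db, t_sign=14, t_mag=18):
    """Percentage of sign and magnitude histogram bins that match.

    Each argument is a 256-bin histogram (list of counts). A bin matches
    when the absolute difference is below the threshold (assumed strict).
    """
    sign_matches = sum(1 for a, b in zip(sign_test, sign_db) if abs(a - b) < t_sign)
    mag_matches = sum(1 for a, b in zip(mag_test, mag_db) if abs(a - b) < t_mag)
    total_bins = len(sign_test) + len(mag_test)   # 2 x 256 = 512
    return 100.0 * (sign_matches + mag_matches) / total_bins


if __name__ == "__main__":
    # Synthetic histograms chosen so that 128 sign bins and 138 magnitude bins match.
    sign_test, sign_db = [0] * 256, [0] * 128 + [100] * 128
    mag_test, mag_db = [0] * 256, [17] * 138 + [18] * 118
    print(clbp_match_score(sign_test, mag_test, sign_db, mag_db))
```

With 266 matching bins out of 512 the score is 51.953125%, which is numerically consistent with the 51.9531% reported above for samples of different persons; the synthetic histograms themselves are made up for the example.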

##### 2.4. DWT Algorithm

The DWT [38] provides spatial and frequency characteristics of an image. It has an advantage over Fourier transform in terms of temporal resolution where it captures both frequency and location information. The signal is translated into shifted and scaled versions of the mother wavelet to generate DWT bands. The fingerprint image is decomposed into multiresolution representation using DWT. The LL subband gives overall information of the original fingerprint image, the LH subband represents horizontal information of the fingerprint image, HL gives vertical characteristics of the fingerprint image, and HH gives diagonal details.

Haar wavelets are orthogonal and provide the simplest useful energy compression process. The Haar transformation of a one-dimensional input $x = [x(1), x(2)]^T$ leads to a 2-element vector $y = [y(1), y(2)]^T$ using

$$y = T x, \qquad T = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \tag{9}$$

where $T$ is the Haar operator and $y(1)$ and $y(2)$ are the sum and difference of $x(1)$ and $x(2)$, which produce low pass and high pass filtering, respectively, scaled by $1/\sqrt{2}$ to preserve energy. The Haar operator $T$ is an orthonormal matrix since its rows are orthogonal to each other (their dot products are zero) and have unit length; therefore $T^{-1} = T^T$. Hence we may recover $x$ from $y$ using

$$x = T^T y \tag{10}$$

For a 2D image, let $x$ be a 2 × 2 matrix of an image; the transformation $y$ is obtained by multiplying the columns of $x$ by $T$ and then multiplying the rows of the result by $T$, using

$$y = T x T^T \tag{11}$$

The original values are recovered using

$$x = T^T y T \tag{12}$$

An example of DWT is as follows.

If is the original matrix, then DWT is given in (13).

Then

The level 2 DWT features can be obtained by applying Haar wavelet on LL subband of Level 1. The decomposition of fingerprint using DWT at two levels is shown in Figure 7.

**(a) Original fingerprint image**

**(b) One-level decomposition**

**(c) Two-level decomposition**

The DWT bands correspond to the following filtering processes:

(i) LL: low pass filtering in both the horizontal and the vertical direction.

(ii) HL: high pass filtering in the horizontal direction and low pass filtering in the vertical direction.

(iii) LH: low pass filtering in the horizontal direction and high pass filtering in the vertical direction.

(iv) HH: high pass filtering in both the horizontal and the vertical direction.

To apply this transform to a complete image, the pixels are grouped into 2 × 2 blocks and the transformation is obtained using (13) for each block. The 2-level DWT is applied to the fingerprint image of size 256 × 256 to obtain 128 × 128 coefficients after the first level and 64 × 64 coefficients after the second level. The 64 × 64 LL subband coefficients are considered as the DWT features.
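The block-wise Haar transform and the two-level LL feature extraction can be sketched in plain Python. The function names are hypothetical, and the code uses the closed form of the 2 × 2 orthonormal Haar transform (multiplying by the Haar operator on both sides), in which the top-left output is half the sum of the four pixels.

```python
def haar2x2(x):
    """Orthonormal 2D Haar transform of a 2x2 block [[a, b], [c, d]]."""
    (a, b), (c, d) = x
    return [[(a + b + c + d) / 2, (a - b + c - d) / 2],
            [(a + b - c - d) / 2, (a - b - c + d) / 2]]


def inverse_haar2x2(y):
    """The Haar operator is symmetric and orthonormal, so the inverse is the same map."""
    return haar2x2(y)


def ll_subband(image):
    """Keep the low-low coefficient of every non-overlapping 2x2 block."""
    rows, cols = len(image), len(image[0])
    return [[haar2x2([[image[r][c], image[r][c + 1]],
                      [image[r + 1][c], image[r + 1][c + 1]]])[0][0]
             for c in range(0, cols, 2)]
            for r in range(0, rows, 2)]


def dwt_level2_features(image):
    """Flattened LL subband after two decomposition levels."""
    ll2 = ll_subband(ll_subband(image))
    return [value for row in ll2 for value in row]


if __name__ == "__main__":
    block = [[1, 2], [3, 4]]
    print(haar2x2(block), inverse_haar2x2(haar2x2(block)))
    flat = [[1] * 256 for _ in range(256)]
    print(len(dwt_level2_features(flat)))
```

For a 256 × 256 input the feature vector has 64 × 64 = 4096 entries, matching the feature count stated above.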

###### 2.4.1. Proposed DWT Matching Score

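The LL subband coefficients of the Level 2 DWT of the test fingerprint, LL_test, are compared with the LL subband coefficients of the fingerprint images present in the database, LL_db, using the difference between coefficients

$$\mathrm{DWT\_diff}_j(k) = \left|\mathrm{LL\_test}(k) - \mathrm{LL\_db}_j(k)\right|, \quad k = 1, \ldots, N \tag{14}$$

where $N$ is the number of second-level LL subband coefficients, that is, 4096 for an original image size of 256 × 256.

The DWT coefficient match is given by

$$\mathrm{DWT\_match}_j(k) = \begin{cases} 1, & \mathrm{DWT\_diff}_j(k) < T_{\mathrm{DWT}} \\ 0, & \text{otherwise} \end{cases} \tag{15}$$

where $T_{\mathrm{DWT}}$ is the difference threshold. The DWT_Match_count over the Level 2 coefficients is given in

$$\mathrm{DWT\_Match\_count}_j = \sum_{k=1}^{N} \mathrm{DWT\_match}_j(k) \tag{16}$$

The DWT_Match_score is computed from the DWT match count and the total number of DWT coefficients using

$$\mathrm{DWT\_Match\_score}_j = \frac{\mathrm{DWT\_Match\_count}_j}{N} \times 100\% \tag{17}$$

The first and eighth samples of the same person are considered as the database image and the test image. The database image and its LL subband image, and the test image and its LL subband image, are shown in Figures 8(a)-8(b) and 8(c)-8(d), respectively. The matching score of 21.6768% is high, since the score is computed between two samples of the same person.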

**(a) Sample image of database**

**(b) DWT Level 2 LL subband image of database**

**(c) Sample of test image**

**(d) DWT Level 2 LL subband image of test image**

The first and eighth samples of different persons are considered as the database image and the test image. The database image and its LL subband image, and the test image and its LL subband image, are shown in Figures 9(a)-9(b) and 9(c)-9(d), respectively. The matching score of 14.2129% is low, since the score is computed between two samples of different persons.

**(a) Sample image of database**

**(b) DWT Level 2 LL subband image of database**

**(c) Sample of test image**

**(d) DWT Level 2 LL subband image of test image**
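The DWT match score can be sketched in the same style as the CLBP score. The function name `dwt_match_score` is hypothetical, and the threshold is left as a parameter because its value is not stated in this section; the strict comparison is also an assumption.

```python
def dwt_match_score(ll_test, ll_db, threshold):
    """Percentage of Level 2 LL coefficients whose absolute difference is below threshold.

    ll_test and ll_db are flattened LL subbands of equal length
    (4096 values for a 256 x 256 image).
    """
    if len(ll_test) != len(ll_db):
        raise ValueError("feature vectors must have the same length")
    matches = sum(1 for a, b in zip(ll_test, ll_db) if abs(a - b) < threshold)
    return 100.0 * matches / len(ll_test)


if __name__ == "__main__":
    test = [0] * 4096
    database = [0] * 1024 + [50] * 3072   # synthetic: one quarter of the coefficients agree
    print(dwt_match_score(test, database, threshold=10))
```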

##### 2.5. Fusion

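The percentage CLBP match score is fused with the percentage DWT matching score [39] to improve the performance of the proposed algorithm using

$$\mathrm{Fusion\_score} = \alpha \times \mathrm{CLBP\_Match\_score} + (1 - \alpha) \times \mathrm{DWT\_Match\_score} \tag{18}$$

where $\alpha$ is an improvement factor which varies from 0 to 1.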
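A weighted-sum fusion of the two percentage scores can be sketched as below. Which score is multiplied by the improvement factor and which by its complement is an assumption of this example, as are the names `fuse_scores` and `alpha`.

```python
def fuse_scores(clbp_score, dwt_score, alpha):
    """Weighted sum of two percentage scores.

    Assumption: alpha weights the CLBP score and (1 - alpha) the DWT score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * clbp_score + (1.0 - alpha) * dwt_score


if __name__ == "__main__":
    # Scores reported above for two samples of the same person.
    for alpha in (0.0, 0.5, 0.7, 1.0):
        print(alpha, fuse_scores(67.9, 21.6768, alpha))
```

At the two ends of the range the fused score reduces to one of the individual scores, which is how a CLBP-only or DWT-only evaluation can be obtained from the same expression.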

#### 3. Algorithm

The proposed efficient algorithm is shown as follows.

*Proposed Algorithm*

*Input*. This includes fingerprint database and test image.

*Output*. This includes fingerprint authentication.

(i) The DB3 of FVC2004 fingerprint database is considered.

(ii) Each fingerprint image is resized to 256 × 256.

(iii) The CLBP is applied to the fingerprints to obtain the CLBP sign and magnitude coefficients.

(iv) The histograms of the CLBP sign and magnitude are obtained to form features.

(v) The 2-level DWT is applied to the fingerprint and the second-level LL band coefficients are considered as features.

(vi) The CLBP sign and magnitude histograms of the test and database fingerprint images are compared using the difference formula to compute the CLBP match score.

(vii) The LL subband coefficients of the test and database fingerprint images are compared using the difference formula to compute the DWT match score.

(viii) The matching scores of CLBP and DWT are fused using an arithmetic equation.

(ix) The performance parameters are computed using the fused matching scores.

The fingerprint identification to authenticate a person effectively on FPGA with optimized parameters is discussed. The spatial domain CLBP and the transform domain DWT are used to extract features. Arithmetic fusion is applied to the CLBP and DWT match scores to compute the performance parameters. The algorithm is implemented on a Virtex 5 FPGA board. The main objective is to increase TSR, decrease FRR and FAR, and improve the hardware optimization parameters.

#### 4. Performance Analysis and Results

In this section, the definitions of performance parameters and performance analysis are discussed.

##### 4.1. Definitions of Performance Parameters

###### 4.1.1. False Rejection Rate (FRR)

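False Rejection Rate (FRR) is the measure of the number of authorized persons rejected. It is computed using

$$\mathrm{FRR} = \frac{\text{number of authorized persons rejected}}{\text{total number of persons in the database}} \times 100\% \tag{19}$$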

###### 4.1.2. False Acceptance Rate (FAR)

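False Acceptance Rate (FAR) is the measure of the number of unauthorized persons accepted and is computed using

$$\mathrm{FAR} = \frac{\text{number of unauthorized persons accepted}}{\text{total number of persons outside the database}} \times 100\% \tag{20}$$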

###### 4.1.3. Total Success Rate (TSR)

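Total Success Rate (TSR) is the measure of the number of authorized persons successfully matched in the database and is computed using

$$\mathrm{TSR} = \frac{\text{number of authorized persons correctly matched}}{\text{total number of persons in the database}} \times 100\% \tag{21}$$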

###### 4.1.4. Equal Error Rate (EER)

Equal error rate (EER) is the point of intersection of FRR and FAR values at particular threshold value. The EER is the tradeoff between FRR and FAR. The value of EER must be low for better performance of an algorithm.
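The four performance parameters can be sketched as a threshold sweep over match scores. This is a simplified illustration: it assumes one best fused score per test person, treats every accepted authorized person as correctly matched, and accepts a score when it is greater than or equal to the threshold. The names `rates` and `equal_error_rate` and the sample scores are hypothetical.

```python
def rates(genuine_scores, impostor_scores, threshold):
    """Return (FRR, FAR, TSR) in percent for one decision threshold.

    genuine_scores: best score of each person inside the database.
    impostor_scores: best score of each person outside the database.
    """
    inside, outside = len(genuine_scores), len(impostor_scores)
    rejected = sum(1 for s in genuine_scores if s < threshold)
    accepted = sum(1 for s in impostor_scores if s >= threshold)
    frr = 100.0 * rejected / inside
    far = 100.0 * accepted / outside
    tsr = 100.0 * (inside - rejected) / inside
    return frr, far, tsr


def equal_error_rate(genuine_scores, impostor_scores, thresholds):
    """Pick the threshold where FRR and FAR are closest and report their mean."""
    def gap(t):
        frr, far, _ = rates(genuine_scores, impostor_scores, t)
        return abs(frr - far)
    best = min(thresholds, key=gap)
    frr, far, _ = rates(genuine_scores, impostor_scores, best)
    return best, (frr + far) / 2


if __name__ == "__main__":
    genuine = [70, 65, 60, 40]     # made-up scores for the example
    impostor = [55, 45, 35, 30]
    print(rates(genuine, impostor, 50))
    print(equal_error_rate(genuine, impostor, range(0, 101, 10)))
```

Raising the threshold lowers FAR and raises FRR, so the sweep locates the crossing point that the EER definition describes.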

##### 4.2. MATLAB Experimental Results

The performance parameters are computed by computer simulation using MATLAB version 12.1, and the performance improvements are explained in this section. The DB3 of FVC2004 fingerprint database is considered for performance analysis. The DB3_A database has one hundred persons with eight samples per person. The size of each fingerprint image is 300 × 480 with 512 dpi. The database is created with the number of persons inside the database varied from 30 to 50 and 7 fingerprint samples per person; that is, the number of database images varies between 210 and 350 when computing FRR and TSR. The eighth sample of each person is considered as the test fingerprint.

###### 4.2.1. CLBP Algorithm

In this section, the performance is analysed for features extracted using only CLBP, by setting the improvement factor in (18) so that the fused score reduces to the CLBP match score.

The variations of FRR and FAR with threshold for combinations of persons inside the database (PID) and persons outside the database (POD) of PID : POD = 30 : 30, 40 : 30, 45 : 35, and 50 : 40 are shown in Figure 10. It is observed that for lower threshold values FAR is high and FRR is low. As the threshold value increases, FAR decreases from higher values, whereas FRR increases from lower to higher values. The computed values of EER in percentage for the PID and POD combinations of 30 : 30, 40 : 30, 45 : 35, and 50 : 40 are 13.33, 10, 14.29, and 22.5, respectively.

**(a) FAR and FRR versus threshold for 30 : 30**

**(b) FAR and FRR versus threshold for 40 : 30**

**(c) FAR and FRR versus threshold for 45 : 35**

**(d) FAR and FRR versus threshold for 50 : 40**

The variations of percentage TSR with threshold for different combinations of PID and POD are given in Table 1. The value of % TSR decreases from higher values to zero as threshold increases. The value of TSR is zero for higher threshold value since the correlation technique is used for matching. It is also observed that as PID increases, the % TSR decreases.

###### 4.2.2. DWT Algorithm

The performance parameters are computed by considering only DWT features, by setting the improvement factor in (18) so that the fused score reduces to the DWT match score.

The variations of FRR and FAR with a threshold for PID : POD of 30 : 30, 40 : 30, 45 : 35, and 50 : 40 are shown in Figure 11. It is observed that for lower threshold values FAR is high and FRR is low. As threshold value increases, FAR decreases from higher values, whereas FRR increases from lower to higher values. The computed values of EERs in percentage for different PID and POD combinations of 30 : 30, 40 : 30, 45 : 35, and 50 : 40 are 33.33, 33.33, 37.14, and 42.5, respectively. The variations of percentage TSR with threshold for different combinations of PID and POD are given in Table 2. The value of % TSR decreases from higher values to zero as threshold increases. The value of TSR is zero for higher threshold value since the correlation technique is used for matching. It is also observed that as PID increases the percentage TSR value decreases.

**(a) FAR and FRR versus threshold for 30 : 30**

**(b) FAR and FRR versus threshold for 40 : 30**

**(c) FAR and FRR versus threshold for 45 : 35**

**(d) FAR and FRR versus threshold for 50 : 40**

###### 4.2.3. Fusion of CLBP and DWT

The performance parameters are computed using the fused match score given by (18).

The variations of FRR and FAR with a threshold for PID : POD of 30 : 30, 40 : 30, 45 : 35, and 50 : 40 are shown in Figure 12. It is observed that for lower threshold values FAR is high and FRR is low. As threshold value increases, FAR decreases from higher values, whereas FRR increases from lower to higher values. The computed values of EERs in percentage for different PID and POD combinations of 30 : 30, 40 : 30, 45 : 35, and 50 : 40 are 0, 0, 0, and 20, respectively.

**(a) FAR and FRR versus threshold for 30 : 30**

**(b) FAR and FRR versus threshold for 40 : 30**

**(c) FAR and FRR versus threshold for 45 : 35**

**(d) FAR and FRR versus threshold for 50 : 40**

The variations of percentage TSR with threshold for different combinations of PID and POD are given in Table 3. The values of % TSR decrease from higher values to zero as threshold increases. The value of TSR is zero for higher threshold value since the correlation technique is used for matching. It is also observed that as PID increases the percentage TSR value decreases.

###### 4.2.4. Comparison between CLBP, DWT, and Proposed Model

The values of EER and TSR with different combinations of PID and POD are tabulated in Table 4. The value of %TSR decreases and EER increases with increase in PID and POD. The proposed model achieves reduced EER and increased TSR compared to individual technique of CLBP and DWT implementation. The performance parameters such as EER and TSR are compared with existing techniques published by Karki and Sethu Selvi [40], Bartunek et al. [41], Ouzounoglou et al. [42], and Medina-Pérez et al. [43] for FVC2004 DB3 Database given in Table 5. The proposed model achieves reduced EER and increased TSR.

#### 5. FPGA Implementation of Proposed Model

The proposed architectures are implemented on an FPGA device, a Virtex xc5vlx30T [44] with speed grade 3, and are designed to work with external SRAM memory [45], which is used to store the database and test images. The SRAM is required since the on-chip memory of the FPGA is too small to store the database and test images during algorithm execution.

##### 5.1. CLBP Architecture

The CLBP algorithm is synthesized using the CLBP VLSI architecture shown in Figure 13. Nine shift registers of eight bits, along with two shift registers each of length 2008 bits, are used to form a First In First Out (FIFO) architecture that implements the 3 × 3 matrix. The outputs p0, p1, and p2 are three pixels of the first row; p3, p4, and p5 are three pixels of the second row, exactly below the first three pixels; and p6, p7, and p8 are three pixels of the third row, exactly below the second row pixels. Together they form the 3 × 3 matrix for the sign and magnitude computations of CLBP. At the next rising edge of the clock, the pixels are shifted to the right by one to form a new matrix. The control unit, along with 10-bit and 8-bit counters, is used to create new matrices, which are sent to the CLBP_S and CLBP_M blocks to obtain the CLBP sign and magnitude components. These components are further used to obtain the histogram magnitude and sign CLBP features using two counter banks.

###### 5.1.1. Finite State Machine (FSM) of CLBP

The MODELSIM FSM view window is used to display the state diagram. The FSM of the control unit to compute CLBP_S and CLBP_M is shown in Figure 14. The initial state of the control unit is st0, and the control unit remains in this state until the 10-bit counter has counted 515 clock cycles, which allows the FIFO architecture to store the pixel values of the first row, the second row, and the first three pixels of the third row. The state shifts from st0 to st1 after 515 clock cycles. In st1, the first 3 × 3 matrix of the first, second, and third rows is considered, and q1 is sent from the 10-bit counter to the control unit to activate s1 to compute CLBP_S and s2 to compute CLBP_M. The sign and magnitude of CLBP for successive 3 × 3 matrices of the first three rows are computed in st1 until the 8-bit counter count reaches 253, and the state then shifts to st2. In st2, the 8-bit counter is reset and the 10-bit counter is incremented to store the fourth row while eliminating the first row; the state then shifts to st3, which inserts a delay in clock cycles before the state shifts back to st1. CLBP_S and CLBP_M for every 3 × 3 matrix over all rows of an image are computed in the st1, st2, and st3 states. Once all rows are processed, the state returns to st0 to read the next image.
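The data path behind this state machine can be modelled in software as a line buffer that holds two full rows plus three pixels. The sketch below is a behavioural model only, not the RTL: the name `stream_windows` is hypothetical, one pixel is assumed to arrive per clock in raster order, and the skipped positions at the start of each row stand in for the row-change states.

```python
from collections import deque


def stream_windows(pixels, width):
    """Yield (clock, 3x3 window) for a raster-order pixel stream.

    The buffer holds two full rows plus three pixels, so the first
    window is available after 2 * width + 3 clocks.
    """
    depth = 2 * width + 3
    fifo = deque(maxlen=depth)
    for clock, pixel in enumerate(pixels, start=1):
        fifo.append(pixel)
        if len(fifo) < depth:
            continue                      # still filling the buffer (initial state)
        column = (clock - 1) % width      # column of the newest pixel
        if column < 2:
            continue                      # window would wrap across a row boundary
        buf = list(fifo)
        yield clock, [buf[0:3],
                      buf[width:width + 3],
                      buf[2 * width:2 * width + 3]]


if __name__ == "__main__":
    first_clock, _ = next(stream_windows(range(256 * 256), 256))
    print(first_clock)
    print(sum(1 for _ in stream_windows(range(6 * 8), 8)))
```

For a row width of 256 the first window appears at clock 515, in line with the 515 clock cycles spent in the initial state, and each row produces 254 windows.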

##### 5.2. DWT Architecture

The DWT algorithm is synthesized using the DWT architecture shown in Figure 15. The DWT requires nonoverlapping 2 × 2 matrices. Four shift registers of eight bits, along with one shift register of length 254 bits, are used in a FIFO architecture to form the 2 × 2 matrix. The outputs p0 and p1 are two pixels of the first row, and p2 and p3 are the pixels exactly below them, forming the 2 × 2 matrix for the DWT computation. The control unit controls all the timing using a 9-bit counter. Its operation is based on the state diagram shown in Figure 16. In st0, the entire first row and two pixels of the second row of an image are read using the 9-bit counter, and the state shifts to st1. The LL coefficient of the 2 × 2 matrix is computed in st1, and the computation continues over the nonoverlapping pixels of the second row. Once the first and second rows are completed, st1 shifts to st2 and, after a delay of two clock cycles, back to st1 to compute the LL coefficients for the third and fourth rows; this continues until all rows of the image are completed. The entire image of 256 × 256 gives rise to 128 × 128 DWT coefficients in 65536 clock cycles.

The algorithm for DWT Level 2 is the same as for DWT Level 1 but is applied only to the LL component of the DWT Level 1 coefficients. The LL coefficients generated by Level 1 are processed again to generate the Level 2 coefficients. Instead of waiting for Level 1 to complete its processing before executing Level 2, the two stages are pipelined to achieve better speed. The LL coefficients for every overlapping and nonoverlapping 2 × 2 matrix in an image are generated in Level 1 using the moving window architecture and are connected to the registers of the Level 2 architecture shown in Figure 17. Only the LL coefficients of nonoverlapping 2 × 2 matrices of Level 1 are considered in Level 2 for further decomposition. The architecture uses a 10-bit counter, an 11-bit adder, and a 1-bit right shift register. The controller uses the counter to keep track of time, and the LL coefficient values P0, P1, P2, and P3 are added using the 11-bit adder, since each value is about 9 bits wide. The result of the addition is scaled down by 2 using a one-bit right shift operation.

The FSM of the Level 2 DWT to generate LL coefficients is shown in Figure 18. In state st0, the 2 × 2 LL coefficients of Level 1 are read and the state jumps to st1. The LL coefficients of Level 2 are computed in st1 by the add and shift technique. The states st2, st3, and st4 are used to create a delay before computing the Level 2 coefficient of the next nonoverlapping 2 × 2 window. The process continues until all nonoverlapping 2 × 2 windows are exhausted.
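The add and shift computation of one Level 2 LL coefficient, and the reason an 11-bit adder suffices, can be illustrated as follows. The function name `ll_level2` is hypothetical and the inputs are assumed to be unsigned 9-bit Level 1 coefficients.

```python
def ll_level2(p0, p1, p2, p3):
    """LL coefficient of a 2x2 block of 9-bit values using an add and a 1-bit right shift."""
    for p in (p0, p1, p2, p3):
        if not 0 <= p < (1 << 9):
            raise ValueError("expected an unsigned 9-bit coefficient")
    total = p0 + p1 + p2 + p3      # at most 4 * 511 = 2044, which fits in 11 bits
    return total >> 1              # divide by 2; an odd sum is truncated


if __name__ == "__main__":
    print(ll_level2(511, 511, 511, 511), (4 * 511).bit_length())
    print(ll_level2(1, 2, 3, 5))
```

The right shift discards the least significant bit, so odd sums are rounded down, which is the usual trade-off of replacing a divider with a shift.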

##### 5.3. Matching Score Architecture for CLBP and DWT

The architecture for computing the matching score in percentage for both CLBP and DWT is shown in Figure 19. The feature of the test image is subtracted from the corresponding database feature, and if the difference is less than the threshold, it is considered a match and the counter is updated. After all the features have been compared, the control unit asserts the cnt_out signal to use the content of the counter to calculate the match score; this signal is also used to reset the counter to zero. The match score in percentage is obtained by multiplying the number of matched features by 100 and then dividing it by the total number of features, as given in (8) and (17). This would normally require a dedicated multiplier and divider, which consume more hardware and decrease the speed. In our architecture, the multiplication and division are performed using shift registers, reducing the area and increasing the speed. The number of matched features is shifted left by 6, 5, and 2 bits and the results are added to achieve multiplication by 100. Similarly, the division is performed by shifting right by 9 bits and 14 bits for the CLBP score and the DWT score, respectively.
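The shift-based arithmetic can be verified with a short Python sketch. The helper names are hypothetical; the shift amounts are those given in the text, and `log2_total` is the number of right-shift bits used for the division.

```python
def times_100(n):
    """Multiply by 100 using shifts: 64n + 32n + 4n = 100n."""
    return (n << 6) + (n << 5) + (n << 2)


def match_score_percent(match_count, log2_total):
    """Integer percentage: multiply by 100, then divide by 2**log2_total with a right shift."""
    return times_100(match_count) >> log2_total


if __name__ == "__main__":
    print(times_100(7))
    print(match_score_percent(266, 9))   # CLBP: 512 features, shift right by 9 bits
```

Because the right shift truncates, the hardware score is the integer part of the percentage.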

##### 5.4. Architecture for Fusion of CLBP and DWT Match Scores

The architecture for fusion of the CLBP and DWT match scores with the improvement factor, as given in (18), is shown in Figure 20. Multiplication by fractional constants such as 0.7 and 0.3 is carried out with shift operations in three steps. To obtain 0.7, the outputs of 1-bit, 3-bit, and 4-bit right shift registers are combined in parallel, which yields 0.5 + 0.125 + 0.0625 = 0.6875, an error of about 1.8%. Similarly, to obtain 0.3, the outputs of 2-bit, 5-bit, and 6-bit right shift registers are combined, which yields 0.25 + 0.0313 + 0.0156 = 0.2969, an error of about 1.0%. These errors are negligible since the threshold used in the decision circuit is not hard. Finally, the fusion match score is compared with a threshold to decide whether the test sample matches the database or not. This method of implementation eliminates dedicated floating point and fixed point multiplier and divider circuits, which consume more clock cycles and area.
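A minimal sketch of the shift-and-add weighting follows. The function names are hypothetical, and which score receives the 0.7 weight is an assumption made only for illustration; equation (18) defines the actual assignment. Integer right shifts truncate, so the result can be slightly below the exact weighted sum.

```python
def approx_0_7(x):
    """Approximate 0.7 * x as x/2 + x/8 + x/16 (= 0.6875 * x before truncation)."""
    return (x >> 1) + (x >> 3) + (x >> 4)


def approx_0_3(x):
    """Approximate 0.3 * x as x/4 + x/32 + x/64 (= 0.296875 * x before truncation)."""
    return (x >> 2) + (x >> 5) + (x >> 6)


def fuse(score_a, score_b):
    """Fused score using only shifts and adds.

    Assumption: score_a is weighted by about 0.7 and score_b by about 0.3.
    """
    return approx_0_7(score_a) + approx_0_3(score_b)


def is_match(score_a, score_b, threshold):
    """Decision step: compare the fused score with a threshold."""
    return fuse(score_a, score_b) >= threshold


print(fuse(80, 64))
```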

##### 5.5. Hardware Results

The FPGA-based performance parameters for the CLBP, DWT, and fusion based architectures are given in Table 6. The limitation of the fusion technique is that it requires more slice registers and LUTs than either individual technique.

The RTL schematic of the proposed fusion based design, with the CLBP and DWT architectures running in parallel, is shown in Figure 21. The CLBP system consists of the CLBP module and the CLBP matching module, whose output is the CLBP match score. The DWT system consists of two decomposition modules followed by the DWT matching module, whose output is the DWT match score. The two levels of DWT decomposition are pipelined in order to achieve high speed. Finally, the fusion module combines both match scores using strength factors, and a threshold is applied to the fusion match score to decide whether a match has occurred or not.

The routing design on the FPGA connecting several CLBs and block RAM is shown in Figure 22. The blue streaks indicate the connection between the logical blocks. Similarly, a schematic showing all interconnections between the LUTs, BLOCK RAM, and IOBs (input/output buffers) is shown in Figure 23.

The floor plan of our proposed design on the Virtex 5 device is shown in Figure 24. This snapshot is taken from a tool known as Xilinx PlanAhead. The small violet rectangular boxes represent CLBs (configurable logic blocks). Each CLB consists of two slices, and each slice has 4 LUTs, 3 multiplexers, 1 dedicated arithmetic logic block (two 1-bit adders and a carry chain), and four 1-bit registers that can be configured either as flip-flops or as latches, as shown in Figure 25. This technique of hardware implementation does not require a dedicated multiplier or divider; hence it consumes less hardware and is faster.

###### 5.5.1. Comparison between Existing Fingerprint Architectures and Proposed Architecture

The area and total execution time estimated on the FPGA for the proposed algorithm and existing algorithms are presented in Table 7. The proposed architecture compares favourably with previous related work in the following respects.

(a) In [46], the authors presented a hardware-software codesign of a fingerprint recognition system. Coprocessors were used to speed up the execution of the algorithm, resulting in an execution time of 988 ms. The MicroBlaze soft core processor together with the coprocessor limits the speed of the entire system. In our proposed method, the fully flexible parallel and pipelined architecture implemented on the on-chip slices of the FPGA improves the system matching speed.

(b) In [47], the authors proposed a fingerprint recognition solution using a combination of ARM and FPGA. The full FPGA reconfiguration used in our proposed method on Virtex 5 with SRAM is faster than the reconfiguration latencies achieved using a combination of ARM and FPGA (EPXA10 DDR).

(c) In [48], a sensor was prototyped on an FPGA to improve the speed of the system, with a best elaboration time of 183.32 ms at a working frequency of 22.5 MHz. In our method, an elaboration time of 1.644 ms is achieved at a working frequency of 68 MHz, since the external SRAM is ported via the FPGA.

*Limitations*. Despite the improvement in speed, the proposed method has some limitations: it requires more area because the CLBP and DWT techniques are fused, and the on-chip moving window FIFO architecture has initial clock latencies.

#### 6. Conclusion

In this paper, an efficient FSM based reconfigurable architecture for fingerprint recognition, implemented on a Virtex 5 FPGA board, is proposed. The novel matching score of CLBP is computed using histogram CLBP features of the test image and the fingerprint images in the database. Similarly, the DWT matching score is computed using DWT features of the test image and the fingerprint images in the database. The arithmetic fusion equation with improvement factor is used to combine the matching scores generated by the histogram CLBP features and the DWT features. The performance parameters are computed using fusion scores with the correlation technique. It is observed that the values of EER, FAR, FRR, and TSR, as well as hardware parameters such as area and delay, are better for the proposed method than for existing methods.

#### Competing Interests

The authors declare that they have no competing interests.