Abstract

Palmprint has become one of the biometric modalities that can be used for personal identification. This modality contains critical identification features such as minutiae, ridges, wrinkles, and creases. In this research, features from creases are our focus. Crease features are special salient features of the palmprint, yet crease-based identification is still not common. We propose a method to extract crease features from two regions. The first region of interest (ROI) is in the hypothenar region, whereas the other ROI is in the interdigital region. To speed up the extraction, most of the processing is performed on an image that has been downsampled by a factor of 10. The method involves segmentation through thresholding, morphological operations, and the Hough line transform. Experimental results on 101 palmprint input images show that the proposed method successfully extracts the ROIs from both regions, achieving an average sensitivity, specificity, and accuracy of 0.8159, 0.9975, and 0.9951, respectively.

1. Introduction

A biometric identification system automatically identifies an individual based on his or her physical or behavioral characteristics [1, 2]. Palmprint is one of the most important physiological modalities, and it can be used for personal identification [3, 4]. The palmprint refers to a photograph of a hand or the impression the hand leaves on a surface [5]. Its region extends from the wrist to the root of the fingers [5]. The palmprint provides a lot of information that can be used in personal identification [4, 6].

Recently, the use of the palmprint for identification has attracted increasing attention from researchers [2, 7]. Palmprint has several advantages. Compared with the finger, the palmprint area is larger [2, 8] and thus contains more features [1, 9]. Palmprint-based biometric systems are low-cost [8] and user-friendly [7], and they provide high accuracy and efficiency [2]. Although many researchers have observed the effectiveness and usefulness of the palmprint in identification, only a few works have been reported on improving palmprint identification for individualization purposes [10].

Various unique features can be extracted from the palmprint, such as crease features, point features, and texture features. Features on the palmprint are stable, as most of them remain unchanged throughout an individual's lifespan [1, 7]. Among these features, the crease feature has attracted our attention for biometric recognition. Creases are discontinuities in the epidermal ridge patterns [11] that develop during gestation [12]. They are formed during the embryonic skin development stage [11], and they are permanent and unique [13]. These creases divide the palm into several regions, as shown in Figure 1. The main regions of the palm are the hypothenar, thenar, and interdigital regions.

The crease feature is a stable and important feature that can provide rich information for palmprint recognition [6, 7]. Identification methods based on creases have shown promising recognition rates that are comparable with methods based on other features such as the face and fingerprint [6]. Creases attract attention from researchers because they are the most clearly observable feature even when captured at low resolution (e.g., at 100 dpi) [2]. Crease features are special salient features of the palmprint, but unfortunately, identification based on creases is still not very common [13, 14]. In palmprint examination, creases commonly serve as supporting information to identify or eliminate distorted latent palmprints [13].

Palmar flexion creases are important crease features on palmprint images [6]. They are one of the external anatomical landmarks of the hand [15]. They represent regions of firmer attachment of the skin to the basal skin structure (i.e., the dermis) [11, 15]. Most of the existing works have focused on the principal lines of the palmprint [6, 9, 10, 16–18]. Unfortunately, these studies did not utilize significant portions of the palmprint. Principal lines are genetically dependent, whereas most of the other creases are not [19]. These nongenetically deterministic features are still very useful, and for this reason, our study focuses on the other creases of the palm.

In our project, two regions of palmprint creases, each of size 2 cm × 2 cm, are used as our features. One region is located in the hypothenar region, while the other is in the interdigital region. As the manual extraction of these regions is tedious, time-consuming, inconsistent, and prone to error, an automatic region of interest (ROI) extraction technique is proposed in this manuscript. An automatic technique not only makes the ROI extraction simple but also standardizes the features and reduces the problems caused by palmprints that are not correctly aligned. This paper is organized as follows. Section 2 presents our methodology. Then, the experimental results are presented in Section 3. Finally, Section 4 concludes our findings.

2. Methodology

The palmprint images used in our research are acquired with a general-purpose scanner, a Canon E400 Series. The images are scanned at 300 dpi × 300 dpi and saved in JPEG format. They are 24-bit-per-pixel color images with a size of 2488 × 3484 pixels (i.e., around 8.6 Mpixels).

Examples of input images used in this experiment are shown in Figure 2. As shown in this figure, in addition to the palm, there are two rulers in the image. These rulers are used to measure the geometry of the palm. One ruler is always located on the top portion of the image, while the other is located either on the left or the right side. There are also six markers or pegs in each image; the color of each marker varies between images. These markers were used to help the person align their hand during image acquisition.

We asked the volunteers to place their hand on the glass surface of the scanner. Then, to minimize the effects of external light, the hand was covered with a plain black cloth. The lighting comes solely from the internal lighting of the scanner. However, some of the images, such as the one in Figure 2(a), appear brighter than others. This is due to improper positioning of the hand or improper covering with the black cloth, which allows ambient light to penetrate the system and interfere with the images.

Figure 3 shows the block diagram of the proposed method. The input of the system is a color image F. The outputs are two ROIs: the ROI on the hypothenar region, ROI1, and the ROI on the interdigital region, ROI2. The method has been implemented in C# in Microsoft Visual Studio. As shown in the figure, the proposed method consists of 12 main blocks, which are explained in the following subsections.

2.1. Image Downsampling by Factor 10

Because the resolution of the original input image F is around 8.6 Mpixels (i.e., 2488 × 3484 pixels), a long processing time is required to process the whole image. Thus, in this research, to reduce the computational burden, image F is downsampled by a scaling factor of 10. The output of this process is a downsampled image f with dimensions of 248 × 348 pixels (i.e., floor(2488/10) × floor(3484/10) pixels), corresponding to a resolution of around 0.086 Mpixels. Therefore, the area of image f is only 1% of the area of the original image F. The downsampling by a factor of 10 is given by the following equation:

f(x, y, c) = F(10x, 10y, c), (1)

where x and y are the spatial coordinates and c is the color channel (i.e., red (R), green (G), or blue (B)). Coordinates (x, y) = (0, 0) are located at the top left of the image. The resolution of the image now becomes 30 dpi × 30 dpi.
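For illustration, the downsampling of equation (1) can be sketched in Python/NumPy as follows (our actual implementation is in C#; the function name and the assumption of a row-major RGB array are illustrative only):

```python
import numpy as np

def downsample_by_10(F: np.ndarray) -> np.ndarray:
    """Keep every 10th pixel, i.e., f(x, y, c) = F(10x, 10y, c).

    F is assumed to be a (height, width, 3) RGB array; for the 3484 x 2488
    input described above, the output is 348 x 248 pixels (rows x columns).
    """
    h, w = F.shape[0] // 10, F.shape[1] // 10
    return F[: h * 10 : 10, : w * 10 : 10, :].copy()
```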

2.2. Conversion from Color to Grayscale Image

To further reduce the processing time, the color image f is then converted to a grayscale image g. With this conversion, the three-channel image becomes a one-channel image. Although there are many methods that can be employed for this conversion, in this research, we perform the conversion by keeping only the red color channel (R). This is because we assume that the human skin (i.e., the palm) is more dominant in the red channel (R) than in the blue (B) or green (G) channels. Besides, to reduce the unwanted effect of high-intensity values from the rulers in the image, which might deteriorate the performance of the subsequent segmentation process, the regions of the rulers are cropped out. By inspecting image f, we found that the rulers' areas occupy only a 25-pixel strip of rows from the top and a 25-pixel strip of columns from the left. Therefore, this area is given intensity 0, which is similar to the background intensity value. Image g is given as

g(x, y) = { f(x, y, R), if 25 ≤ x < Wf and y ≥ 25; 0, otherwise } (2)

where R is the red color channel, and Wf is the width of image f (i.e., in this case, Wf is 248 pixels).
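A minimal sketch of this step, assuming an RGB channel order and the 25-pixel ruler strips described above (the function name is illustrative only):

```python
import numpy as np

def to_gray_red_channel(f: np.ndarray, border: int = 25) -> np.ndarray:
    """Keep only the red channel of f and zero out the assumed ruler strips
    (the first `border` rows and the first `border` columns)."""
    g = f[:, :, 0].astype(np.uint8)   # channel 0 assumed to be red (RGB order)
    g[:border, :] = 0                 # top strip containing the horizontal ruler
    g[:, :border] = 0                 # left strip containing the vertical ruler
    return g
```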

2.3. Segmentation of the Hand Region

In this research, the hand region is identified by simple thresholding. As the hand region is brighter than its background, the mask of the hand, M1, is defined as

M1(x, y) = { 1, if g(x, y) > T1; 0, otherwise } (3)

where T1 is the threshold level. The value of T1 is between 80 and 120; T1 = 100 works in most cases. From equation (3), the value of M1(x, y) is 1 for the hand region and 0 for the background area.
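The thresholding of equation (3) reduces to a single comparison. A sketch with the default T1 = 100 from the text (function name illustrative only):

```python
import numpy as np

def hand_mask(g: np.ndarray, t1: int = 100) -> np.ndarray:
    """Mask M1: 1 where the grayscale palm image is brighter than T1, else 0."""
    return (g > t1).astype(np.uint8)
```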

2.4. Removal of the Fingers’ Region

In this research, the creases that we are interested in are located on the palm. Therefore, the regions of the fingers are deleted from M1. We assume that the area of the palm is bigger than 50 × 50 pixels on image g. To remove the fingers, we use two temporary masks, m1 and m2. By inspecting M1 row by row, only horizontal runs of value 1 longer than 50 pixels are copied to m1. Similarly, by inspecting M1 column by column, only vertical runs of value 1 longer than 50 pixels are copied to m2. The output of this process is M2, which is defined as

M2(x, y) = m1(x, y) × m2(x, y). (4)
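A possible sketch of this step is given below. It assumes that equation (4) combines the two temporary masks as a logical AND; the helper names are illustrative only.

```python
import numpy as np

def remove_fingers(m1_mask: np.ndarray, min_run: int = 50) -> np.ndarray:
    """Keep only pixels lying on both a horizontal and a vertical run of 1s
    longer than `min_run` pixels (masks m1 and m2 in the text)."""

    def long_runs(line: np.ndarray) -> np.ndarray:
        """Mark runs of 1s longer than `min_run` in a 1D binary array."""
        out = np.zeros_like(line)
        start = None
        for i, v in enumerate(line):
            if v and start is None:
                start = i
            elif not v and start is not None:
                if i - start > min_run:
                    out[start:i] = 1
                start = None
        if start is not None and len(line) - start > min_run:
            out[start:] = 1
        return out

    m_rows = np.array([long_runs(row) for row in m1_mask])      # row-by-row scan
    m_cols = np.array([long_runs(col) for col in m1_mask.T]).T  # column-by-column scan
    return (m_rows & m_cols).astype(np.uint8)                   # M2 = m1 AND m2 (assumed)
```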

2.5. Extraction of the Hand Region

The mask of the palm, M2, is then used to create the palm image h. Image h is the segmented image, where the regions of the background and the fingers are given intensity value 0. This image is defined as

h(x, y) = g(x, y) × M2(x, y). (5)

2.6. Segmentation of the Major Creases

From image h, we can observe that the creases have lower intensity values than the other regions of the palm. We assume that the creases occupy less than 25% of the area defined by M2. Therefore, the mask of the creases, M3, is defined as

M3(x, y) = { 1, if h(x, y) < T2 and M2(x, y) = 1; 0, otherwise } (6)

where T2 is the first quartile of the values of h(x, y) inside the region defined by M2(x, y).
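A sketch of this crease segmentation, computing the palm image h = g × M2 of equation (5) and taking T2 as the 25th percentile of the palm intensities (function name illustrative only):

```python
import numpy as np

def crease_mask(g: np.ndarray, m2: np.ndarray) -> np.ndarray:
    """Mask M3: palm pixels darker than the first quartile of the palm intensities."""
    h = g * m2                           # palm image; background and fingers set to 0
    t2 = np.percentile(h[m2 == 1], 25)   # first quartile inside the palm mask
    return ((h < t2) & (m2 == 1)).astype(np.uint8)
```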

2.7. Detection of Palmar Crease

In this research, we assume that the palmar crease is a straight line that passes through the palm from the left side to the right side. Therefore, we use a Hough line transform to find this straight line. The Hough image (i.e., a 2D accumulator matrix) used for this transformation is indexed by the radius r and the angle ϕ. In the beginning, all bins in this Hough image are empty. The range of r considered in this experiment is integer values from −300 to 300. Next, for each foreground pixel in M3 (i.e., M3(x, y) = 1), the value of r is calculated as

r = x cos ϕ + y sin ϕ (7)

for ϕ from 0° to 359.5°, with a step size of 0.5°. For each such pair, the value in the corresponding bin of the Hough image, at coordinates (ϕ, r), is incremented by one.
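The voting stage of equation (7) can be sketched as follows; the accumulator layout (angle bins by radius bins) and the variable names are illustrative only.

```python
import numpy as np

def hough_accumulate(m3: np.ndarray, r_min: int = -300, r_max: int = 300,
                     phi_step: float = 0.5) -> np.ndarray:
    """Fill a (phi, r) accumulator: one vote per foreground pixel per angle bin,
    with r = x*cos(phi) + y*sin(phi) rounded to the nearest integer."""
    phis = np.deg2rad(np.arange(0.0, 360.0, phi_step))   # 720 angle bins
    cos_p, sin_p = np.cos(phis), np.sin(phis)
    acc = np.zeros((len(phis), r_max - r_min + 1), dtype=np.int32)
    ys, xs = np.nonzero(m3)                              # rows are y, columns are x
    for x, y in zip(xs, ys):
        r = np.rint(x * cos_p + y * sin_p).astype(int)
        valid = (r >= r_min) & (r <= r_max)
        acc[np.nonzero(valid)[0], r[valid] - r_min] += 1
    return acc
```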

After all points with value 1 in M3 have been transformed, we find the bin with the maximum value, restricting the search to ϕ between 70° and 80° only. This bin gives us the coordinates (ϕmax, rmax). We then track back all points of M3 that have contributed to this bin. These points are given value 1 in a temporary mask M4, while all other points are given value 0. As these points might not form one single straight line, we refine them using a morphological closing operation with a square structuring element of size 5 × 5 pixels for both the dilation and the erosion.
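The peak search and back-projection might look like the sketch below. The 70° to 80° band and the 5 × 5 closing follow the text; the tolerance used to decide which pixels contributed to the peak bin is our assumption.

```python
import numpy as np
from scipy.ndimage import binary_closing

def crease_line_mask(acc: np.ndarray, m3: np.ndarray, r_min: int = -300,
                     phi_step: float = 0.5, phi_band=(70.0, 80.0),
                     tol: float = 0.5):
    """Find (phi_max, r_max) inside the allowed angle band, mark the M3 pixels
    that voted for that bin (mask M4), and close small gaps with a 5 x 5 square.
    Returns the refined mask M4 and phi_max in degrees."""
    phis = np.arange(0.0, 360.0, phi_step)
    allowed = (phis >= phi_band[0]) & (phis <= phi_band[1])
    banded = np.where(allowed[:, None], acc, -1)          # ignore bins outside the band
    i_phi, i_r = np.unravel_index(np.argmax(banded), banded.shape)
    phi_max_deg, r_peak = phis[i_phi], i_r + r_min
    ys, xs = np.nonzero(m3)
    r = xs * np.cos(np.deg2rad(phi_max_deg)) + ys * np.sin(np.deg2rad(phi_max_deg))
    voters = np.abs(r - r_peak) <= tol                    # pixels that voted for the peak bin
    m4 = np.zeros_like(m3)
    m4[ys[voters], xs[voters]] = 1
    m4 = binary_closing(m4, structure=np.ones((5, 5), bool)).astype(np.uint8)
    return m4, phi_max_deg
```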

It is also possible to use the information from ϕmax to correct the alignment of the hand so that the palmar crease lies horizontally. We also rotate the image by an additional 180° so that it is easier to analyze by visual inspection. Therefore, the rotation angle θ obtained from this step is defined as

θ = ϕmax + 90°. (8)

2.8. Rotation of Palmar Crease

To rotate the image, we consider the center of the image, located at (Wf/2, Hf/2), as the new origin (0, 0). Here, Wf is the width of image f, and Hf is the height of image f. Then, for each coordinate (x, y) on the rotated mask M5, the corresponding coordinates (xo, yo) on mask M4 are found using the following formulas:

xo = (x − Wf/2) cos θ − (y − Hf/2) sin θ + Wf/2, (9)
yo = (x − Wf/2) sin θ + (y − Hf/2) cos θ + Hf/2. (10)

From here, the rotated mask M5 is defined as

M5(x, y) = { M4(xo, yo), if (xo, yo) lies inside the image; 0, otherwise } (11)

where xo and yo are rounded to the nearest integers.
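A sketch of this nearest-neighbour inverse-mapping rotation (function name illustrative only); the same routine can be reused for M6 in the next subsection.

```python
import numpy as np

def rotate_nearest(mask: np.ndarray, theta_deg: float) -> np.ndarray:
    """Rotate a mask about the image centre (Wf/2, Hf/2) by theta using the
    inverse mapping of formulas (9) and (10) with nearest-neighbour sampling."""
    h, w = mask.shape
    cx, cy = w / 2.0, h / 2.0
    t = np.deg2rad(theta_deg)
    ys, xs = np.mgrid[0:h, 0:w]                              # destination grid
    xo = np.rint((xs - cx) * np.cos(t) - (ys - cy) * np.sin(t) + cx).astype(int)
    yo = np.rint((xs - cx) * np.sin(t) + (ys - cy) * np.cos(t) + cy).astype(int)
    out = np.zeros_like(mask)
    inside = (xo >= 0) & (xo < w) & (yo >= 0) & (yo < h)     # source points inside the image
    out[inside] = mask[yo[inside], xo[inside]]
    return out
```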

2.9. Rotation of Hand Region

The mask of the hand region, M1, is rotated using the same method explained in the previous subsection. For each coordinate (x, y) on the rotated mask M6, the corresponding coordinates (xo, yo) on mask M1 are found using formulas (9) and (10). Then, the rotated mask M6 is defined as

M6(x, y) = { M1(xo, yo), if (xo, yo) lies inside the image; 0, otherwise }. (12)

2.10. Detection of Two End Points

The two end points (i.e., P1 and P2) of the line defined in M5 are found at this stage, using information from M6 as well. Point P1 = (x1, y1) is located on the left side of the line. It is defined as the left-most point (x, y) in M5 for which M6(x − 1, y − 1) = 0, M6(x − 1, y) = 0, and M6(x − 1, y + 1) = 0. On the other hand, P2 = (x2, y2) is defined as the right-most point (x, y) in M5, located more than 50 pixels from P1, for which M6(x + 1, y − 1) = 0, M6(x + 1, y) = 0, and M6(x + 1, y + 1) = 0.
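A sketch of the end-point search; the neighbourhood tests follow the description above, while the bounds checks and variable names are our additions.

```python
import numpy as np

def crease_end_points(m5: np.ndarray, m6: np.ndarray, min_gap: int = 50):
    """Return P1 (left-most crease point with empty left neighbourhood in M6)
    and P2 (right-most crease point, more than min_gap pixels away, with empty
    right neighbourhood in M6). Points are returned as (x, y)."""
    pts = np.argwhere(m5 == 1)                 # (y, x) pairs
    pts = pts[np.argsort(pts[:, 1])]           # sort left to right by x
    h, w = m6.shape
    p1 = p2 = None
    for y, x in pts:                           # scan from the left for P1
        if 1 <= y < h - 1 and x >= 1 and m6[y - 1:y + 2, x - 1].sum() == 0:
            p1 = (int(x), int(y))
            break
    for y, x in pts[::-1]:                     # scan from the right for P2
        if 1 <= y < h - 1 and x + 1 < w and m6[y - 1:y + 2, x + 1].sum() == 0:
            if p1 is None or x - p1[0] > min_gap:
                p2 = (int(x), int(y))
                break
    return p1, p2
```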

2.11. Points Refinement

The ROIs that we need to find are of size 2 cm × 2 cm. Therefore, in terms of pixels at the downsampled resolution of 30 dpi × 30 dpi, the side length of each ROI is

2 cm × (30 pixels/inch) / (2.54 cm/inch) ≈ 23.6 pixels ≈ 24 pixels. (13)

Point P1 corresponds to the hypothenar area. Based on this point, an ROI of size 24 × 24 pixels is defined, covering the coordinate range x1 ≤ x < x1 + 24 and y1 ≤ y < y1 + 24. Within this region, based on mask M6, the number of nonpalm pixels is calculated as

n = Σ (1 − M6(x, y)), summed over all (x, y) in the ROI. (14)

If the number of nonpalm pixels is greater than 48, the procedure is repeated by shifting P1 one pixel to the right. The location of P1 where this requirement is fulfilled is denoted as Pa = (xa, ya).

Similarly, point P2 corresponds to the interdigital area. Based on P2, an ROI of size 24 × 24 pixels is defined, covering the coordinate range x2 − 24 < x ≤ x2 and y2 − 24 < y ≤ y2. Within this region, based on mask M6, the number of nonpalm pixels is calculated using equation (14). If the number of nonpalm pixels is greater than 48, the procedure is repeated by shifting P2 one pixel to the left. The location of P2 where this requirement is fulfilled is denoted as Pb = (xb, yb).
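A sketch of the refinement loop, usable for both points; the shift direction (+1 for P1, −1 for P2) and the treatment of out-of-image pixels as non-palm are our assumptions.

```python
import numpy as np

def refine_point(p, m6: np.ndarray, step: int, roi: int = 24,
                 max_nonpalm: int = 48):
    """Shift point p horizontally (step = +1 for P1, -1 for P2) until the
    roi x roi window anchored at it contains at most max_nonpalm non-palm pixels."""
    x, y = p
    for _ in range(m6.shape[1]):                       # safety bound on the search
        if step > 0:                                   # window extends right/down from P1
            window = m6[y:y + roi, x:x + roi]
        else:                                          # window extends left/up from P2
            window = m6[max(0, y - roi + 1):y + 1, max(0, x - roi + 1):x + 1]
        nonpalm = roi * roi - int(window.sum())        # missing pixels count as non-palm
        if nonpalm <= max_nonpalm:
            return (x, y)
        x += step
    return (x, y)                                      # fallback if never satisfied
```

For example, Pa would be obtained as refine_point(P1, M6, +1) and Pb as refine_point(P2, M6, -1).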

2.12. Define Actual Regions

As the ROI on the mask is of size 24 × 24 pixels, this ROI must be upsampled by a factor of 10 for image F. Therefore, the actual size of the ROI is 240 × 240 pixels. The points Pa and Pb are also multiplied by 10 in order to find the corresponding points at the original resolution. These new points are denoted as PA and PB, respectively.

A region on the hypothenar is defined based on point PA = (xA, yA). This region covers the range xA ≤ x < xA + 240 and yA ≤ y < yA + 240. For each point (x, y) in this region, the following relative coordinates are defined:

xr = x − xA, (15)
yr = y − yA. (16)

Therefore, the ROI on the hypothenar region is defined as

ROI1(xr, yr, c) = F(x, y, c). (17)

Similarly, a region on the interdigital area is defined based on point PB = (xB, yB). This region covers the range xB − 240 < x ≤ xB and yB − 240 < y ≤ yB. For each point in this region, equations (15) and (16) are calculated, with the top-left corner of the region used as the reference point in place of (xA, yA). Therefore, the ROI on the interdigital region is defined as

ROI2(xr, yr, c) = F(x, y, c). (18)
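A sketch of the final cropping at the original resolution, with PA taken as the top-left corner of ROI1 and PB as the bottom-right corner of ROI2, following the coordinate ranges above (function name illustrative only):

```python
import numpy as np

def extract_rois(F: np.ndarray, p_a, p_b, size: int = 240):
    """Cut the two size x size ROIs out of the full-resolution image F, with
    PA anchoring the hypothenar ROI and PB anchoring the interdigital ROI."""
    xa, ya = p_a
    xb, yb = p_b
    roi1 = F[ya:ya + size, xa:xa + size].copy()                   # hypothenar region
    roi2 = F[yb - size + 1:yb + 1, xb - size + 1:xb + 1].copy()   # interdigital region
    # The arrays can then be written to BMP files, e.g., with Pillow:
    # Image.fromarray(roi1).save("ROI1.bmp")
    return roi1, roi2
```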

ROI1 and ROI2 are then saved in separate files in BMP format for further identification processes.

3. Results and Discussions

This section is divided into two subsections. Subsection 3.1 presents qualitative (subjective) evaluations, together with results from some intermediate stages of the method. Subsection 3.2 gives quantitative (objective) evaluations.

3.1. Qualitative Evaluation

After the image is downsampled by a factor of 10, it is converted to a grayscale image by keeping only the red color channel. Figure 4 shows each of the color components of one of the images used in this experiment. As shown in this figure, the hand has the best contrast in the red channel compared with the other two color channels. Therefore, the use of the red channel to represent g is appropriate in this research.

The images then undergo thresholding to define the mask of the hand region, M1. The results obtained with different values of T1 in equation (3) are shown in Figure 5. As shown in this figure, the input image in Figure 5(a) requires a high threshold value, T1 = 120, to correctly separate the hand from the background. This is because, during image acquisition, the scanner lid was not completely closed, which caused higher illumination at the base of the hand. Thus, T1 values of 80 and 100 fail to segment the hand in this case. Therefore, for this input image, T1 = 120 is selected.

For the input images shown in Figures 5(b) and 5(c), although all tested threshold values separate the hand region, T1 = 100 gives the best result. At T1 = 80, the segmented region is bigger, and some background regions at the hand edges might be included in the hand region. At T1 = 120, the defined region is smaller, and some of the hand regions at the edges might be excluded. Therefore, for these images, T1 = 100 is selected.

As there are intensity variations among the input images, in this project, the user is given the flexibility to choose the threshold level T1. Three options are given, namely 80, 100, and 120, which can be set from a graphical user interface. The default value is 100. As the M1 mask is not shown to the user, the selection of the threshold value is mostly based on which T1 value gives the best output ROIs.

The fingers' region on mask M1 is then removed using equation (4). Figure 6 shows the M2 obtained from this step, together with its temporary masks m1 and m2. As shown in this figure, some nonpalm regions are also included in M2. However, these regions are relatively small compared with the palm region.

Some of the results are shown in Figures 7–9. Subfigures 7(b), 8(b), and 9(b) show the detected palmar crease (i.e., M5), indicated by the red line, overlaid on image g. The green squares indicate the detected ROIs. As shown in these figures, both ROIs have been located correctly in all test images. It is worth noting that the method also performs well for hands wearing a ring or other ornaments, as shown in Figure 8.

3.2. Quantitative Evaluations

In this experiment, 101 palmprint images have been used. Three measures have been utilized for the evaluation: sensitivity, specificity, and accuracy. These measures are defined by the following equations:

sensitivity = TP / P, (19)
specificity = TN / N, (20)
accuracy = (TP + TN) / (P + N), (21)

where TP is the number of true positive pixels, TN is the number of true negative pixels, P is the number of real positive pixels in the image, and N is the number of real negative pixels in the image. These measures are used to evaluate the extracted ROI1 (hypothenar region), the extracted ROI2 (interdigital region), and both regions together. The definitions of TP, TN, P, and N for these three cases are given in Table 1. As given by these equations, all measures require information from the ground truth. Therefore, in this experiment, we created the ground truth data by manually segmenting the 101 palmprint images using our previously developed ROI segmentation tool [20]. The ROIs from this manual segmentation are considered the ground truth.
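The three measures can be computed pixel-wise from binary masks of the detected and ground-truth ROIs, for example as in the following sketch (function name illustrative only):

```python
import numpy as np

def roi_metrics(pred: np.ndarray, truth: np.ndarray):
    """Sensitivity, specificity and accuracy of a predicted ROI mask against
    the ground-truth ROI mask (both binary images of the same shape)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.sum(pred & truth)        # ROI pixels correctly detected
    tn = np.sum(~pred & ~truth)      # background pixels correctly rejected
    p = np.sum(truth)                # real positive pixels
    n = truth.size - p               # real negative pixels
    return tp / p, tn / n, (tp + tn) / (p + n)
```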

Table 2 shows the sensitivity, specificity, and accuracy values obtained from the 101 palmprint images. As shown in this table, the proposed method performs well in terms of specificity and accuracy, with values near 1 for all input images. However, the sensitivity ranges from 0.4389 to 0.9716 for ROI1, from 0.5086 to 0.9782 for ROI2, and from 0.5033 to 0.9327 for both ROIs together. This indicates that the extraction of ROI1 is more difficult than that of ROI2.

Figure 10 shows some of the differences between the ROIs detected by the proposed method and their corresponding ground truths. Figure 10(a) presents a case where the extraction of both ROIs is not good. For this figure, the sensitivity of ROI1 is 0.4389, the sensitivity of ROI2 is 0.5678, and the sensitivity for both regions is 0.5033. Figure 10(b) shows a case with a good extraction of ROI2 (i.e., the interdigital region) but not of ROI1 (i.e., the hypothenar region). For this figure, the sensitivity of ROI1 is 0.4650, the sensitivity of ROI2 is 0.9158, and the sensitivity for both regions is 0.6904. Figure 10(c) shows a good extraction of both ROIs. For this figure, the sensitivity of ROI1 is 0.8769, the sensitivity of ROI2 is 0.7865, and the sensitivity for both regions is 0.8317. Figure 10(d) presents a case where the extracted ROIs are almost the same as the ground truth. For this figure, the sensitivity of ROI1 is 0.9590, the sensitivity of ROI2 is 0.9064, and the sensitivity for both regions is 0.9327.

4. Conclusion

This paper presents a new technique to extract two regions from a palm image. The technique is fully automatic. However, as the threshold value T1 in equation (3) plays an important role in this method, the user is still given the freedom to change this value if the obtained results are unsatisfactory. In addition, compared with manual extraction, the technique simplifies feature extraction and, by aligning the image based on the detected palmar crease, makes the data more standardized. The extracted features can be fed into a machine learning algorithm for biometric identification.

Data Availability

The image data used to support the findings of this study are available from the corresponding author upon request.

Disclosure

The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of the data; in the writing of the manuscript; and in the decision to publish the results.

Conflicts of Interest

The authors declare no conflict of interest.

Acknowledgments

This work was supported in part by the Universiti Sains Malaysia: Research University Grant 1001/PPSK/812125 and Biasiswa Yang Di-Pertuan Agong (BYDPA) from the Public Service Department of Malaysia.