Abstract

Face recognition using thermal imaging has the main advantage of being less affected by lighting conditions than recognition using images in the visible spectrum. However, factors such as human thermoregulation cause variations in the surface temperature of the face, and these variations make recognition systems less effective. In particular, alcohol intake changes the surface temperature of the face. It is highly relevant to determine not only whether a person is drunk but also who that person is. In this paper, we present a technique for face recognition based on thermal face images of drunk people. For the experiments, the Pontificia Universidad Católica de Valparaíso-Drunk Thermal Face database (PUCV-DTF) was used. The recognition system was built using local binary patterns (LBPs). The LBP features were obtained from a bioheat model representation of the thermal image, fused with a vascular network extracted from the same image. The feature vector for each image is formed by concatenating the LBP histogram of the thermogram preprocessed with an anisotropic filter and the LBP histogram of the fused image. The proposed technique reaches an average Rank-10 cumulative recognition rate of 99.63%; this performance is superior to that of LBP applied to thermal images without the bioheat model.

1. Introduction

The recognition of individuals based on thermal images of the face (thermograms), captured by long-wave infrared (LWIR) thermal cameras in the range of 8 to 15 μm, has considerable advantages over facial recognition systems that work in the visible spectrum (VS) of 390–700 nm. In thermal face images, it is possible to extract unique marks that can be used to identify people [1]. These marks are more tolerant to pose changes [2] and to variable illumination conditions [3, 4], and they can even be obtained in complete darkness. These advantages have motivated researchers to propose different models that capture the information needed to identify individuals.

Despite these advantages, some factors may prevent the marks from being generated with exactly the same morphology for the same person. These factors can be internal or external and alter the thermoregulatory system. For example, caffeine and alcohol ingestion may increase the surface temperature of the face [5, 6]. These temperature changes activate the heat dispersion zones of the skin, which in turn can be captured as an image by a thermal camera. The morphology of the heat zones has been used to design person recognition systems.

In this study, a thermal facial recognition system was designed considering the effects of temperature variations caused by alcohol consumption. For training, only the thermograms of the individuals before they had consumed alcohol were used as a reference. This setup is intended to simulate a real-life application scenario. Blood perfusion models were used to mitigate the effects of the temperature increase [7] caused by alcohol intake. The perfusion model was applied as an image preprocessing step. This feature is ideal for scenarios where conditions that change the surface temperature of the face can occur.

2. Materials and Methods

2.1. Thermal Face Database

The methodology was evaluated using the Pontificia Universidad Católica de Valparaíso-Drunk Thermal Face database (PUCV-DTF) [8]. It consists of 46 participants: 40 men and 6 women with an average age of 24 years (between 18 and 29 years). Regarding general health, participants were required to have no alcohol-related health problems and not to be regular consumers. There are 250 face thermograms for each individual: 50 captured before drinking any alcohol and 50 captured after each of 1, 2, 3, and 4 beers. The drink ingested was a 355 mL can of 5.5° beer. The authors established a procedure for image capture. First, participants rested for 30 minutes after arriving at the image capture site so that their metabolism stabilized to the environmental conditions. Each can of beer was consumed within 5 minutes. Each set of 50 images was captured in an average time of 1 minute. After each consumption there was a stabilization time of 30 minutes.

2.2. Theory
2.2.1. Blood Perfusion Models

There are endogenous and exogenous conditions that can affect the surface temperature of an individual's face. Endogenous factors are those that stimulate the thermoregulatory system. Alcohol consumption [6], pathologies [9], physical activity [10], and time lapse [11, 12] are some of the causes that can stimulate the body and, in turn, disperse heat. Exogenous conditions are related to items worn on the face, such as glasses [13]. Recognition systems that do not take these variations, which are captured by the thermal camera, into account lose effectiveness. The first heat dispersion model using LWIR face images was proposed by Wu et al. [7]. The main purpose of this model was to mitigate the effects of heat redistribution on thermal images (Figure 1). To establish the model, the authors started from the thermal equilibrium equation:

$$Q_r + Q_e + Q_f = Q_c + Q_m + Q_b, \tag{1}$$

where $Q$ represents the heat flux per unit area and the subscripts stand for radiation ($Q_r$), evaporation ($Q_e$), convection ($Q_f$), body conduction ($Q_c$), metabolism ($Q_m$), and blood flow convection ($Q_b$).

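In a later work, Wu et al. [14] proposed a simplified model based on the reduced equation from which equation (2) is derived:

$$\omega = \frac{\varepsilon\sigma\left(T^4 - T_e^4\right)}{\alpha c_b \left(T_a - T\right)}, \tag{2}$$

where $\omega$ is the blood perfusion, $\varepsilon$ stands for skin emissivity, $\sigma$ is the Stefan–Boltzmann constant, $T$ is the temperature of the pixel to analyze, $T_e$ is the environment temperature, $\alpha$ is the tissue-skin exchange ratio, $c_b$ is the specific heat of blood, and $T_a$ is the average artery-vein temperature.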
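The simplified model can be illustrated with a short sketch. It assumes that equation (2) has the form ω = εσ(T⁴ − Tₑ⁴) / (α c_b (T_a − T)), with temperatures in kelvin. The function names and every default parameter value below are hypothetical placeholders chosen for illustration; they are not values reported in this paper.

```python
STEFAN_BOLTZMANN = 5.670374419e-8  # W m^-2 K^-4 (physical constant)


def wu_perfusion(t_skin, t_env, emissivity=0.98, alpha=0.8,
                 c_blood=3780.0, t_artery=310.15):
    """Blood perfusion value for one pixel (assumed form of equation (2)).

    Temperatures are in kelvin. All default parameter values are
    illustrative placeholders, not values taken from the paper.
    """
    if t_skin >= t_artery:
        raise ValueError("skin temperature must be below artery-vein temperature")
    # Heat radiated by the skin towards the environment.
    radiated = emissivity * STEFAN_BOLTZMANN * (t_skin ** 4 - t_env ** 4)
    # Heat that blood can deliver per unit of perfusion.
    return radiated / (alpha * c_blood * (t_artery - t_skin))


def perfusion_image(temps, t_env, **kwargs):
    """Apply wu_perfusion to a 2D list of skin temperatures (kelvin)."""
    return [[wu_perfusion(t, t_env, **kwargs) for t in row] for row in temps]
```

Under this form, the perfusion value grows faster than the temperature itself as a pixel approaches the artery-vein temperature, which is why warm regions are emphasized in the transformed image. The sketch also makes explicit that the ambient temperature `t_env` must be known at capture time.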

Xie et al. [15] proposed a blood perfusion model based on the bioheat transfer model. Equation (3) is a discrete version of the bioheat transfer model obtained from the Pennes equation; in it, qm stands for the blood dispersion rate and λ is a constant related to pixel distance. This model attenuates surface temperature variations caused by internal or external conditions.

One of the main advantages of these models is that the variables needed to solve the equations are predefined as constants or were obtained empirically. Wu's model, represented by equation (2), requires knowing the ambient temperature at the time of thermogram capture. This variable is not necessary for the model proposed by Xie, described in equation (3). The Xie model is therefore a better alternative in applications where it is not possible to know the ambient temperature. With thermal cameras, it is possible to obtain the surface temperature values of the face with acceptable error ranges, which are 2% on average.

2.2.2. Vascular Networks

The heat patterns of each individual can be useful in recognition systems. Ghiass et al. [16] mentioned that high-reliability studies have not been carried out to demonstrate the direct correspondence of vascular networks with heat zones. However, an individual's heat zones can function as a biometric mark [17], regardless of whether they have a direct correspondence with vascular networks. Buddharaju et al. [18] used heat zones for the generation of thinned vascular networks, which in turn can be used as unique identification marks.

Cho et al. [19] proposed preprocessing techniques to extract vascular networks: (1) application of a salt-and-pepper noise filter to the image; (2) enhancement of the image contrast; and (3) extraction of the vascular network using the top-hat algorithm represented by equations (4) and (5):

$$f_{open} = (f \ominus s) \oplus s, \tag{4}$$

$$f_{top} = f - f_{open}, \tag{5}$$

where $f_{open}$ is the opening of $f$, obtained by applying erosion ($\ominus$) followed by dilation ($\oplus$) with structuring element $s$, and $f_{top}$ is the resulting top-hat image. Their recognition system is based on the location of bifurcation points obtained from the thinned vascular network.
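The top-hat step can be sketched as follows, taking the top-hat image to be the input minus its morphological opening (erosion followed by dilation). This is a minimal pure-Python illustration on a 2D list of gray levels; the flat square structuring element of half-width `k` and the function names are assumptions made for the example, not the configuration used by Cho et al. [19].

```python
def _window(img, r, c, k):
    """Pixels of the (2k+1)x(2k+1) neighborhood of (r, c), clipped at borders."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - k), min(rows, r + k + 1))
            for j in range(max(0, c - k), min(cols, c + k + 1))]


def erode(img, k=1):
    """Gray-level erosion: each pixel becomes the minimum of its window."""
    return [[min(_window(img, r, c, k)) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img, k=1):
    """Gray-level dilation: each pixel becomes the maximum of its window."""
    return [[max(_window(img, r, c, k)) for c in range(len(img[0]))]
            for r in range(len(img))]


def white_top_hat(img, k=1):
    """Image minus its opening; keeps bright structures thinner than the window."""
    opened = dilate(erode(img, k), k)
    return [[p - o for p, o in zip(row, orow)]
            for row, orow in zip(img, opened)]


# A thin bright vertical line on a flat background survives the top-hat.
image = [[10, 10, 50, 10, 10] for _ in range(5)]
vessels = white_top_hat(image, k=1)
```

The structuring element size controls which structures are kept: bright ridges narrower than the window remain, while the smooth background is removed.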

Hermosilla et al. [20] fused the thermogram with its vascular network to improve recognition rates. Several feature extraction algorithms were used: LBP histograms, Gabor descriptors, WLD histograms, and SIFT and SURF descriptors. The experiments were performed on two databases: EQUINOX and the UCH thermal face database [17]. In general, the method improved the recognition percentages by 1.89%.

Nguyen et al. [21] recreated a three-dimensional model of the individuals' heads, from the fusion of video images. The vascular network is isolated, generating a 3D model. This representation would be closer to a real vascular network of an individual.

A limitation of using vascular networks for identification is that they are affected by changes in the surface temperature of the face. For example, in people who have consumed alcohol, the areas of the face that disperse heat change their morphology compared to when they have not consumed it. In one experiment, Koukiou [22] showed that the morphology of the vascular networks extracted from the thermogram was different after a person drank alcohol. The effect of the morphological modification of the vascular network is shown in Figure 2. This phenomenon is caused by the individual’s thermoregulation process in response to alcohol intake; the networks of veins and arteries themselves do not undergo noticeable structural changes.

Changes in surface temperature modify the morphology of the vascular networks extracted from thermograms, which poses challenges in the design of identification systems for real-life scenarios. Several situations alter the surface temperature of the face in uncontrolled environments. Vascular networks have given good recognition results in controlled environments; the question is how they perform when these situations appear in thermograms captured in uncontrolled environments, in particular, for this study, in people who have consumed alcohol.

2.3. Methodology

Recognition systems that use infrared face images are classified by Arya et al. [23] as (1) based on face recognition techniques in images with classical (holistic) methods; (2) based on feature extraction, and (3) based on the multimodal analysis. The system proposed in this study is based on feature extraction and was designed in three stages denoted as (1) generation of the bioheat transfer model; (2) fusion of thermograms with vascular networks; and (3) feature extraction.

2.3.1. Bioheat Transfer Model

The blood dispersion models increase the intensity of the image pixels corresponding to the areas associated with the tissue with higher surface temperature and attenuate those with a relatively lower temperature. In Figures 3(a) and 3(b), this augmentation process and attenuation are shown, respectively.

Initially, the histogram of the thermogram f is equalized. The objective is to obtain a resulting image with a uniform intensity distribution. The discrete Pennes equation, defined in equation (3), is applied to the resulting image. This transformation yields the image Wb, which is the dispersion model representation of the thermogram f. Applying the bioheat transfer model mitigates the effects of the increase in facial surface temperature that is caused by alcohol consumption and captured in the thermogram.
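The equalization step that precedes the bioheat transformation can be sketched as below. This is a generic histogram equalization on a 2D list of integer gray levels, assuming 8-bit images; the function name is hypothetical. The subsequent application of equation (3) is not reproduced here.

```python
def equalize_histogram(img, levels=256):
    """Histogram-equalize a 2D list of integer gray levels in [0, levels)."""
    hist = [0] * levels
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    # Cumulative distribution function of the gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: nothing to spread
        return [row[:] for row in img]
    scale = (levels - 1) / (total - cdf_min)
    # Lookup table mapping each input level to its equalized level.
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in img]
```

Equalization stretches a narrow band of temperatures across the full intensity range, so the darkest level present maps to 0 and the brightest to `levels - 1`.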

2.3.2. Image Fusion

Image fusion is used so that both the vascular network and the bioheat model features are available for recognition. The fused image (WFUS) is obtained from Wb and the vascular network extracted from the thermogram (WVN). The image WVN is obtained by applying equations (4) and (5) to Wb. In the work of Hermosilla et al. [20], image fusion is performed on a smoothed thermogram and the vascular network from the same image. In this work, the operations defined in equation (6) are used instead of the ordered weighted averaging (OWA) operator [24]; Wb and WVN must have the same size. Figure 4 shows the fusion process applied to the same individual with different levels of alcohol consumption.

2.3.3. Image Features

Feature extraction is performed by using local binary patterns (LBPs). LBP (equation (7)) has been used in the processing of facial images for recognition purposes in the VS [25], in the near-infrared (NIR) [26], and in LWIR [27]. LBP can be used in several configurations. In particular, studies have been carried out on the configurations that are better suited for face recognition systems [17].

$$LBP_{P,R} = \sum_{p=0}^{P-1} s(f_p - f_c)\,2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{7}$$

where $P$ is the number of adjacent pixels to compare, $R$ is the radius of the neighborhood, and $f_p$ and $f_c$ stand for the intensity values of the pixels at position $p$ and central position $c$, respectively.

In this research, the value of P is 8 with a radius of 2 pixels. Only binary patterns with at most two transitions in the binary chain generated by s(x) were taken into account. These patterns are called uniform binary patterns [28], and a rotation-invariant version was used. Equation (8) is a reformulated LBP for the extraction of uniform local binary patterns:

$$LBP_{P,R}^{riu2} = \begin{cases} \sum_{p=0}^{P-1} s(f_p - f_c), & U(LBP_{P,R}) \le 2 \\ P + 1, & \text{otherwise} \end{cases} \tag{8}$$

where the function $U$ gives the number of transitions in the binary string generated by $s(x)$. The image to be analyzed is divided into blocks of 20 lines by 4 columns. For each of these blocks, the LBP histogram is calculated, which generates a vector of values for each block. The concatenation of all block vectors forms a single vector, which functions as a global descriptor of the image.
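The descriptor construction described above can be sketched in pure Python. The example assumes the standard rotation-invariant uniform LBP with circular sampling and bilinear interpolation, P = 8 and R = 2. The function names are hypothetical, and the block grid is left as a parameter because the exact block layout is not reproduced here. As a simplification, each block is processed independently, so pixels closer than R to a block border are skipped.

```python
import math


def _bilinear(img, y, x):
    """Bilinearly interpolated intensity at real-valued position (y, x)."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, len(img) - 1), min(x0 + 1, len(img[0]) - 1)
    dy, dx = y - y0, x - x0
    top = img[y0][x0] + (img[y0][x1] - img[y0][x0]) * dx
    bottom = img[y1][x0] + (img[y1][x1] - img[y1][x0]) * dx
    return top + (bottom - top) * dy


def lbp_riu2(img, r, c, points=8, radius=2):
    """Rotation-invariant uniform LBP code of pixel (r, c), in 0..points+1."""
    center = img[r][c]
    bits = []
    for p in range(points):
        angle = 2 * math.pi * p / points
        # Rounding removes floating-point noise such as sin(pi) != 0.
        y = round(r - radius * math.sin(angle), 6)
        x = round(c + radius * math.cos(angle), 6)
        bits.append(1 if _bilinear(img, y, x) >= center else 0)
    # Number of 0/1 changes when the bit string is read circularly.
    transitions = sum(bits[p] != bits[p - 1] for p in range(points))
    return sum(bits) if transitions <= 2 else points + 1


def lbp_histogram(block, points=8, radius=2):
    """Normalized histogram of riu2 codes over the interior of one block."""
    hist = [0] * (points + 2)
    for r in range(radius, len(block) - radius):
        for c in range(radius, len(block[0]) - radius):
            hist[lbp_riu2(block, r, c, points, radius)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def global_descriptor(img, grid_rows, grid_cols):
    """Concatenate the block histograms of a grid_rows x grid_cols partition."""
    bh, bw = len(img) // grid_rows, len(img[0]) // grid_cols
    descriptor = []
    for i in range(grid_rows):
        for j in range(grid_cols):
            block = [row[j * bw:(j + 1) * bw]
                     for row in img[i * bh:(i + 1) * bh]]
            descriptor.extend(lbp_histogram(block))
    return descriptor
```

With P = 8 the riu2 code takes one of 10 values (0 to 8 for uniform patterns, 9 for all others), so each block contributes 10 bins and the global descriptor has 10 times as many entries as there are blocks.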

3. Results and Discussion

The experiment design was based on supervised learning. For each image of the individuals, the identity of the person and the number of beers ingested are known. “Class 0” corresponds to those individuals who have not ingested alcohol. “Class 1”, “Class 2”, “Class 3,” and “Class 4” indicate the number of beers ingested by each individual. The main objective of the experiment was to establish the conditions that would happen in a real-life system. “Class 0” in all experiments is the training dataset. In a recognition system applied to real life, it is most likely that thermograms of individuals are available when they are not intoxicated.

For the development of the experiment, 3 representations of the thermograms were taken for comparative purposes. The first representation was of the image applied in a traditional model where thermograms are preprocessed only with an anisotropic filter (ANISOFF). The second representation used the preprocessing with the bioheat transfer model and the vascular network extracted from the same thermogram (BIOHEATRV). Finally, a concatenation of the ANISOFF global LBP histogram with the BIOHEATRV global LBP (ABFUSED) was used to build a global vector. This representation is described in Figure 5.

For each thermogram, a vector formed by the global histogram obtained from the concatenation of the LBP histograms was obtained. All values for each vector were normalized. Each vector was labeled with an identifier of the subject and the amount of alcohol consumed. For the classification phase, the nearest neighbor algorithm was used. Histogram intersection was used as a similarity measure. In all cases, data on subjects without consuming alcohol were used as a training set. Table 1 shows the cumulative recognition results for the 3 methods. For each class, the values Rank-1, Rank-5, and Rank-10 are shown. In these classifications, the percentage of cumulative recognition was indicated by taking subsets of 1, 5, and 10 subjects. Subsets were formed with the most similar individuals with regard to the subject to be analyzed.
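The classification and the cumulative (Rank-k) evaluation can be sketched as follows. Histogram intersection is taken here as the sum of bin-wise minima of two normalized vectors, and a subject's score is that of its best-matching training vector. The function names and the tiny data set are hypothetical and serve only to illustrate the procedure.

```python
def histogram_intersection(a, b):
    """Similarity of two histograms: sum of bin-wise minima (higher is closer)."""
    return sum(min(x, y) for x, y in zip(a, b))


def rank_subjects(probe, gallery):
    """Subject ids ordered from most to least similar to the probe.

    gallery is a list of (subject_id, vector) pairs; each subject is
    scored by its best-matching vector (nearest neighbor).
    """
    best = {}
    for subject, vector in gallery:
        score = histogram_intersection(probe, vector)
        if score > best.get(subject, float("-inf")):
            best[subject] = score
    return sorted(best, key=best.get, reverse=True)


def cumulative_match(probes, gallery, max_rank):
    """Fraction of probes whose true subject is in the top k, for k = 1..max_rank."""
    hits = [0] * max_rank
    for true_subject, vector in probes:
        ranking = rank_subjects(vector, gallery)
        if true_subject in ranking[:max_rank]:
            position = ranking.index(true_subject)
            for k in range(position, max_rank):
                hits[k] += 1
    return [h / len(probes) for h in hits]


# Toy example: the gallery plays the role of the "Class 0" training set.
gallery = [("a", [1.0, 0.0, 0.0]), ("b", [0.0, 1.0, 0.0]), ("c", [0.0, 0.0, 1.0])]
probes = [("a", [0.9, 0.1, 0.0]), ("b", [0.5, 0.4, 0.1]), ("c", [0.2, 0.3, 0.5])]
cmc = cumulative_match(probes, gallery, max_rank=3)
```

In the toy example, subject "b" is only retrieved in second place, so the Rank-1 value is 2/3 while the Rank-2 and Rank-3 values are 1. A fused descriptor in the sense of ABFUSED would be obtained by concatenating two such vectors before matching.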

For ANISOFF, Class 2 had the worst cumulative recognition percentages of all the classes, both for first-attempt recognition (Rank-1) and for Rank-10, with 78.69% and 96.47%, respectively. The best results were obtained for Class 1, with 92.21% and 99.82% for Rank-1 and Rank-10, respectively. The average ANISOFF recognition percentage for Rank-1 and Rank-10 was 84.22% and 98.70%, respectively. The performance of the ANISOFF method is shown in Figure 6.

In contrast, BIOHEATRV had its worst performance with class 3, for Rank-1 and Rank-10 of 83.16% and 94.17%, respectively. Like ANISOFF, the best BIOHEATRV performance was obtained in class 1 with 91.51% and 99.30% recognition for Rank-1 and Rank-10, respectively. The cumulative recognition results are shown in Figure 7.

The behavior of ANISOFF for Class 3 and of BIOHEATRV for Class 2, where the other method performed worst, indicated that combining both methods could improve the recognition percentages. ABFUSED was designed on the basis of this behavior. Combining methods does not always improve recognition percentages; however, this combination proved to be viable for this study. The cumulative percentages of ABFUSED recognition for each of the classes are shown in Figure 8.

CMC curves were plotted for each of the classes (Figures 9–12). In each of the classes, ABFUSED obtained the best recognition percentages, except for a segment between Rank-3 and Rank-4 of Class 3, where ANISOFF had better recognition percentages.

The average recognition percentage of each class was calculated. The average cumulative recognition value was computed for each rank and each class. ABFUSED had the highest recognition percentages. It was noted that the use of BIOHEATRV had a better percentage of recognition in Rank-1 compared to ANISOFF. However, ANISOFF had a better percentage in the remaining subsets.

4. Conclusions

The recognition results reported in this paper are evidence of the effectiveness of bioheat transfer methods. These models may contain enough information for the process of identifying individuals. In the case of the BIOHEATRV method, the results show that it provides complementary features with respect to the classical anisotropic filter. These complementary characteristics were the reason why ABFUSED increased the recognition percentages for all classes.

Alcohol consumption, in particular, tends to generate metabolic alterations that in turn raise the temperature in different parts of the surface of the face. These temperature changes reduce the effectiveness of recognition systems. In particular, this study showed how the recognition percentages declined for classes 2 to 4. The methods described in this work did not take into account the particular conditions of each individual. This matters because neither the metabolism nor the heat dispersion of the face is the same for all individuals. However, the experimental conditions were designed to resemble an everyday recognition system. That is, thermograms of people acquired in a controlled environment were taken as the training base. In this scenario, images of the subjects with alterations caused by alcohol would not be included. Additionally, the Rank-1 recognition percentages obtained for each of the classes suggested the feasibility of implementation in real-life situations.

The cumulative recognition analysis was performed to assess the feasibility of applying these methods in two-phase identification systems. The average Rank-10 percentage of 99.63% for the ABFUSED method suggested that a second recognition method could be applied afterwards. This is because the first phase narrows the population to a subset of 10 individuals, with ∼100% certainty that the individual to be identified is in that subset.

Theoretically, heat dispersion models, such as the bioheat transfer model, mitigate global face temperature changes in the image. It is necessary to design studies where this can be measured quantitatively, as other factors that increase the surface temperature of the face, such as physical activity, substance use, or some pathologies, also affect the recognition percentages. In the database used for this investigation, people who are active consumers of alcohol were excluded. It is necessary to broaden the inclusion criteria when building datasets to determine how these characteristics change heat dispersion and how they may decrease the effectiveness of face recognition systems. Other conditions must also be considered; for example, it is common for people who consume alcohol to also eat food.

The authors intend to carry out tests using the output of the bioheat transfer model as, for example, input variables to a deep neural network. Additionally, future work seeks to incorporate NIR face images to establish a recognition system based on information fusion with LWIR images. Research is underway in these areas and will be reported in subsequent papers.

Data Availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to acknowledge the financial support provided by Consejo Nacional de Ciencia y Tecnología (CONACyT) of Mexico, which contributed to academic grants for ASP (414351/449144).