The application of biometric recognition in personal authentication has enabled this technology to be employed in various domains. Biometric recognition systems can be based on physical or behavioral characteristics, such as the iris, voice, fingerprint, and face. Currently, attendance tracking systems based on biometric recognition are still underutilized in the education sector, which provides a good opportunity for research in this area. In a typical classroom, educators tend to take the attendance of their students by conventional methods such as calling out names or passing around an attendance sheet for signing. Yet, these methods have proved to be time consuming and tedious, and they are sometimes open to fraud. As a result, significant progress has been made in marking attendance automatically by means of biometric recognition, and new and more advanced biometric-based attendance systems have been developed over the past ten years. Setting up a biometric-based attendance system requires both software and hardware components. Since the software and hardware sections are too broad to be discussed in one paper, this literature survey only provides an overview of the types of hardware used. Emphasis is placed on the microcontroller platform, biometric sensor, communication channel, database storage, and other components in order to assist future researchers in designing the hardware part of biometric-based attendance systems.

1. Introduction

Personal identification is considered an important aspect in recognizing the identity of a particular individual. A person’s identity can be validated through the traditional or biometric methods. There are two types of traditional methods which are token-based and knowledge-based identifications [1]. Examples of the token-based method include possession of a passport, driving license, and different types of cards such as identity (ID) card and credit card. Although it is convenient to carry these identity documents, these documents can be reproduced, stolen, or lost. On the other hand, the knowledge-based method is related to a password or personal identification number (PIN) created by each individual for authentication. Nonetheless, it tends to be forgotten easily, especially if the person has several passwords or PINs for different applications. Another alternative method is through biometric adoption, which considers the physical or behavioral characteristics distinctive to an individual. Physical characteristics refer to inherent features of the human body part. These include the face, fingerprints, and iris. On the contrary, behavioral characteristics deal with features observed from human action. Examples of human action are gait, voice, and signature [2]. By using biometric methods, the problems faced in traditional methods as mentioned above can be solved.

Currently, biometrics are employed in a wide variety of domains. According to a 2018 report by German and Barber from the Center of Identity, University of Texas at Austin [3], the top three sectors which embrace biometric methods are financial services, technology, and government. These are followed by the workplace, recreation, and healthcare, with the least usage in the education domain. Figure 1 depicts these domains along with the biometric application percentage of each.

In the financial services and technology domains, a person can use a mobile wallet to purchase goods, because most current mobile phones are integrated with a biometric scanner. By adding a credit card to a mobile wallet, payment for in-store or web purchases can be made through Apple Pay or Samsung Pay. Apple Pay [4] uses Face ID or Touch ID, while Samsung Pay [5] uses the fingerprint or iris for authentication. In addition, the banking industry is also adopting biometric measures to authenticate customers at ATMs. For instance, almost 90% of the ATMs in Macau were installed with “Know-Your-Customer” facial recognition technology [6]. Asserting that facial recognition is less secure, Bank of China (Hong Kong) equipped its ATMs with finger vein identification [7]. As of December 2017, 160 such ATMs had been installed across the bank’s branches in the city.

Governments all over the world are quick to adopt biometrics for different purposes. In 1998, Malaysia became the world’s first country to use an electronic passport with a thumbprint as the biometric security feature [8]. Subsequently, the passport was further enhanced with an additional security feature using face recognition [9]. The fingerprint is also incorporated inside the chip on Malaysia’s identity card, MyKad [10]. In China, a vast network of surveillance cameras enabled with face recognition technology was installed to help in criminal detection and law enforcement [11]. Moreover, a number of international airports are seeking to enhance efficiency and improve passenger experience by deploying biometric technology. The biometric processing of passengers also helps the respective governments in controlling border checkpoint security. Face recognition technology is employed in Singapore’s Changi Airport [12] and Atlanta’s Hartsfield-Jackson International Airport in the US [13]. A new “smart tunnel” check-in system using iris recognition was implemented in Dubai International Airport [14].

The workplace and education domains apply biometric technology for attendance recording or tracking [15-17], access permission [18, 19], and behavioral analysis [16, 20, 21]. Attendance records are used for an employee’s payroll and to determine whether a student is eligible to sit for an exam. Access permission ensures that only authorized personnel can enter a premise. Furthermore, behavioral biometrics is used to track the concentration level of each individual either in an office or in a classroom. The recreation domain has also embraced biometric technology, as can be seen in Universal Studios Hollywood, which required guests to provide a fingerprint in order to enter the theme park [22]. Another example is Wuzhen, a historic tourist park in China, where visitors need to register using facial biometrics to gain entry into various attractions [23]. In the healthcare domain, biometrics has been introduced for patient identification and electronic medical record management [24]. In the e-health system [25], patients at home use biometric features to identify themselves in order to receive medical services remotely. A wearable device with built-in sensors is used to transfer real-time data of the patient to the doctors in the hospital. As a result, doctors can provide accurate and timely treatment based on the data and the electronic medical record.

Due to the low percentage of biometric applications in the education sector, this paper focuses on the biometric-based attendance recording or tracking system. In academic institutes, both teaching and training are delivered to transfer knowledge and skills from educators to students. For a conducive learning environment, students need to attend classes so that they can seek knowledge and learn skills from their educators. Although there are many online courses available, these courses do not offer the opportunity for direct face-to-face interaction with the educator. Through continuous interaction, students can get immediate feedback regarding any doubt about a particular topic. Such interaction also helps to strengthen the relationship between educators and students; thus, students tend to be more motivated to study.

Attendance marking is considered one of the crucial parts of a class, as it ensures that students participate in the class activities and learn from their educators. In some institutes, students must attain an attendance of 80% and above for the whole semester to be eligible to sit for the final examination [26]. Currently, most attendance is marked using conventional methods such as calling out names or signing an attendance sheet on paper. Unfortunately, these methods are not suitable for a large class. Calling out every student’s name takes time that could be used more efficiently for teaching and learning. Signing an attendance sheet causes distraction because each student needs to sign the list and pass it on to the next student during class. Moreover, it can be compromised by a student who signs on behalf of a friend who does not attend the class. Therefore, biometric systems have been developed as an alternative way to mark attendance in class. One of the advantages is that students cannot manipulate the attendance, as each individual has different biometric characteristics. In addition, such systems improve efficiency and reduce the educator’s burden as the attendance is marked automatically. Besides marking attendance, some systems can determine the students’ seating positions [15, 27-29], while another classifies the gender of students using facial features [30].

There are several aspects and specifications to be considered in developing this type of system. The first is the number of students. The biometric-based attendance system can be designed and implemented for a small or large class. In the surveyed literature, class sizes range from five to two hundred students. Raghuwanshi and Swami [31] developed their system with the least number of students, which is only five. On the other hand, there were also systems implemented for a large classroom with 200 students [32, 33]. The second factor to be considered is the time taken for attendance marking. Generally, the time taken for attendance marking is directly proportional to the number of students in a class. According to Adeniji et al. [34], the conventional method of taking attendance using paper is time consuming, at around 22.6 seconds per student. In comparison, the time taken by the biometric-based attendance systems ranges from less than one second [35, 36] to 26 seconds [37]. The voice-based attendance system developed by Dey et al. [37] took more time than the conventional method because students were asked several questions to ensure that a sufficient number of speech frames was obtained for verification. Other biometric-based methods [15, 34, 35, 38-41] took less than 10 seconds to take the attendance of one student. Another point to consider is when the attendance is recorded. Most of the systems took the attendance once at the beginning of a class, with the date and entrance time [19, 39, 42-46]. However, cheating can still occur if students leave the class once their attendance has been taken. In order to solve this problem, taking attendance more than once was suggested. Normally, students’ attendance was taken twice, at the beginning and at the end of a class, with the entrance and exit time, respectively [40, 47, 48].
Besides that, continuous or periodic observation can be implemented by tracking facial images from video frames to mark students’ attendance [27, 49-52].
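To put the per-student figures quoted above in perspective, the following minimal sketch estimates the total time needed to mark a whole class. It assumes that students are processed strictly one after another, which is a simplification. The function name and the choice of a 200-student class are illustrative only; the per-student times are those reported in [34] and [37], and 10 seconds is used as an upper bound for the other biometric methods.

```python
def total_marking_time(seconds_per_student: float, num_students: int) -> float:
    """Return the total marking time in seconds, assuming sequential marking."""
    if seconds_per_student < 0 or num_students < 0:
        raise ValueError("inputs must be non-negative")
    return seconds_per_student * num_students


# Per-student times quoted in the survey, in seconds.
PAPER_SECONDS = 22.6            # conventional paper method [34]
VOICE_SECONDS = 26.0            # slowest biometric system, voice-based [37]
OTHER_BIOMETRIC_SECONDS = 10.0  # upper bound for most other biometric systems

CLASS_SIZE = 200  # largest class size reported in the surveyed systems

for label, seconds in [
    ("paper", PAPER_SECONDS),
    ("voice-based", VOICE_SECONDS),
    ("other biometric (upper bound)", OTHER_BIOMETRIC_SECONDS),
]:
    minutes = total_marking_time(seconds, CLASS_SIZE) / 60
    print(f"{label}: {minutes:.1f} min for {CLASS_SIZE} students")
```

Contactless face-based systems that capture several students in one video frame would not follow this sequential model, so the estimate applies mainly to sensors that handle one student at a time.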

The operation stages of the biometric-based attendance system can be viewed from the user’s context or the developer’s perspective, as shown in Figure 2. From the user’s context, there are two operation stages, which are the enrollment stage and the authentication stage. In the enrollment stage, all the students’ biometric characteristics are captured and labelled with their respective name or roll number [27, 34, 39, 48-50, 53-57]. These biometric characteristics can be in the form of the iris, voice, fingerprint, or face. Generally, the iris or face images are acquired using cameras, voice is recorded with microphones, and fingerprints are obtained through a scanner. At the beginning of a semester, the students are required to register their biometric traits during the enrollment stage. These biometric data are stored in the database and serve as templates for the authentication stage. The number of templates required for every student differs from system to system. In the speech-based attendance system [56], three spoken text templates are required each for the voice password (VP) and text-dependent (TD) modules. For the fingerprint-based attendance systems, the number of templates varies from one [55], two [32, 34, 43, 58], and four [47] to ten [40]. With regard to the attendance systems based on face recognition, the minimum number of templates is five [31] while the maximum is 1000 [59]. The other face recognition systems use fewer than sixty templates [57, 60, 61], with most of them having between six and twenty-one [29, 33, 46, 53, 62-69]. Next is the authentication stage, which involves matching captured biometric data with the templates in the database. During a class, the students’ biometric characteristics are captured once again, either with or without the students’ awareness.
For the iris, voice, and fingerprint, students are aware because they need to be in contact with the devices to capture their biometric traits. However, most of the face-based systems are contactless, and hence, students do not know when the attendance will be taken by the camera [30, 31, 49, 50, 57, 63, 64, 68, 70, 71]. Nevertheless, there are some cases where the students are mindful that their facial images are being captured, because they are required to face the camera directly [46, 65, 72]. In order to authenticate these new data, either an identification or a verification process follows. For identification, these new data are compared to all the existing templates in the database (one-to-many). Conversely, verification involves comparison only to the templates of the claimed identity (one-to-one) [73].
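The difference between identification (one-to-many) and verification (one-to-one) can be illustrated with a minimal sketch. It assumes that each biometric template has already been reduced to a fixed-length feature vector and that two vectors match when their Euclidean distance falls below a threshold. The feature values, the threshold, the roll numbers, and all function names are hypothetical; the surveyed systems use their own feature extractors and matchers.

```python
import math
from typing import Dict, List, Optional, Sequence

# Hypothetical template database: roll number -> enrolled feature vectors.
Database = Dict[str, List[List[float]]]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(db: Database, claimed_id: str, sample: Sequence[float],
           threshold: float) -> bool:
    """One-to-one: compare the sample only with the claimed identity's templates."""
    templates = db.get(claimed_id, [])
    return any(distance(sample, t) <= threshold for t in templates)


def identify(db: Database, sample: Sequence[float],
             threshold: float) -> Optional[str]:
    """One-to-many: compare the sample with every template in the database."""
    best_id, best_dist = None, float("inf")
    for student_id, templates in db.items():
        for t in templates:
            d = distance(sample, t)
            if d < best_dist:
                best_id, best_dist = student_id, d
    # Reject the closest candidate if it is still too far away.
    return best_id if best_dist <= threshold else None


DATABASE: Database = {
    "S001": [[0.1, 0.2, 0.3]],
    "S002": [[0.9, 0.8, 0.7]],
}
SAMPLE = [0.12, 0.19, 0.31]  # toy sample close to the S001 template
THRESHOLD = 0.1              # illustrative value only

print(identify(DATABASE, SAMPLE, THRESHOLD))
print(verify(DATABASE, "S002", SAMPLE, THRESHOLD))
```

Identification scans every enrolled template, so its cost grows with the number of students and templates, whereas verification only touches the templates of the claimed identity but requires the student to state that identity first, for example through a card or keypad.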

Developers play an important role in research and design of a biometric-based attendance system to facilitate in attendance marking. With the aim of developing an efficient and user-friendly system, the tasks of the developers are to ensure that the system could take attendance automatically and consume less time in attendance marking as well as matching with the correct identity. Therefore, numerous operation stages need to be taken into account from the developer’s perspective. Normally, it can be classified into two broad categories which are the hardware and software sections.

Hence, this paper provides an insight into developing a decent biometric-based attendance system, focusing on the hardware part. Hardware is required as the biometric-based attendance system is constructed from several components. From the literature survey, hardware can be further divided into five categories, as shown in Figure 3. They are the microcontroller platform, biometric sensor, communication channel, database storage, and other components. These components will be discussed in Sections 2-6. Finally, a conclusion is made in Section 7.

2. Microcontroller Platform

The microcontroller is the “brain” of the entire biometric-based attendance system that controls all the operations between components or devices. Generally, the microcontroller can be categorized based on the bus width, memory architecture, and instruction set architecture (ISA). The bus width refers to the number of bits which can be transmitted by the data bus at a time. Thus, the microcontroller can be classified as having 8 bits, 16 bits, 32 bits, or 64 bits of bus width [74]. Higher speed and greater precision operations require a microcontroller with wider bus. Next, the memory architecture can be differentiated as having a Von Neumann architecture or a Harvard architecture. For the Von Neumann architecture, the data and program are stored in the same memory; hence, a single set of address and data buses are shared between the processor and memory. Conversely, the data and program memories are separated in the Harvard architecture, thus resulting in two sets of isolated address and data buses. Finally, there are two variants of ISA, namely, reduced instruction set computer (RISC) and complex instruction set computer (CISC). RISC shortens the operation time by executing an instruction in one machine cycle that leads to a faster speed. On the contrary, CISC allows the combination of multiple simple instructions into only one instruction, albeit with varying machine cycle times.

For the biometric-based attendance system, the microcontroller initiates and terminates the process of attendance marking. Moreover, it receives the biometric data from the sensor and sends this data to be stored in the database. In this system, the microcontroller is connected to a biometric sensor, card reader, keypad, power supply, and clock circuit as inputs. On the other hand, the output devices connected to the microcontroller could be a computer, LCD, wireless connection, and memory module. All the operations between various devices are controlled by the microcontroller. Apart from that, the microcontroller is also used for image processing tasks such as preprocessing, feature extraction, matching, and recognition of biometric features [45, 46, 75]. However, most of the sensor modules come with built-in image processing capability so as not to overload the main microcontroller [35, 38, 42-44, 76]. There are several options in choosing a suitable type of microcontroller for implementing the attendance system, as shown in Figure 4. In addition, Table 1 outlines the requirements for the microcontrollers used along with their respective functions.

The 80C51 microcontroller is based on the 8051 central processing unit (CPU) manufactured with complementary metal-oxide-semiconductor (CMOS) technology as shown in the letter “C” in between the name of the microcontroller. The cost for 80C51 is in the range of $4 to $9 with the prices quoted in US dollars. Intel Corporation invented the first 8051 microcontroller in 1981 [77]. This is an 8-bit microcontroller with CISC processor. The 80C51 family has separated program memory and data memory, which is based on the Harvard architecture. However, some 80C51 can be tuned into the Von Neumann architecture to enable writes to the program memory [78]. As shown in Table 1, there are some projects developed by using the 80C51 microcontroller [43, 76]. Kadry and Smaili used Atmel board AT89C5122 which is based on the 80C51 microcontroller to develop an iris recognition system [76]. The memory of this board consists of 32 KB of flash and 768 B of RAM. Besides that, Purohit et al. adopted the 80C51 microcontroller by using P89V51RD2 board by Philips semiconductor to develop a fingerprint-based attendance system [43]. This board has a bigger memory size with 64 KB of flash and 1024 B of random-access memory (RAM).

Next, the peripheral interface controller (PIC) is made by Microchip Technology Incorporated and retails for between $4 and $5. PIC can be classified as an 8-bit, 16-bit, or 32-bit microcontroller [77]. It is considered an RISC processor with the Harvard architecture. Said et al. used PIC16F876A in their fingerprint-based attendance system [79]. A more advanced PIC18F4550 was used for fingerprint recognition by Basheer and Raghu to develop an attendance system as well [54]. This advanced PIC incorporates a flexible oscillator structure that reduces power consumption, especially for portable devices. Both PIC16F876A and PIC18F4550 are 8-bit microcontrollers equipped with 256 B of electrically erasable programmable read-only memory (EEPROM) for data. However, PIC18F4550 has a larger flash program memory of 32 KB compared to PIC16F876A with just 8 KB. This is because PIC18F4550 supports 77 instructions with a 16-bit word while PIC16F876A only has 35 instructions that are 14 bits wide. Moreover, PIC18F4550 provides direct support for a universal serial bus (USB) interface, such as with the host PC and power supply as in the system developed by Basheer and Raghu [54].

The ATmega microcontroller was designed and manufactured by Atmel Corporation, which was acquired by Microchip Technology Incorporated in 2016 [80]. The price tag for ATmega is around $1 to $13. From the literature review, 8-bit RISC-based ATmega microcontrollers are normally used in the attendance systems. Moreover, ATmega microcontrollers use the Harvard architecture with separate memories and buses for program and data. Mittal et al. [18] along with Soniya et al. [51] implemented their systems using Arduino Uno with an ATmega 328 core. Saxena et al. [81] also selected the ATmega 328 core to develop their system but did not specify which Arduino board they were using. Besides the Arduino Uno board, Zainal et al. [38] implemented their system using Arduino Mega equipped with an ATmega 1280 core. In addition, some authors used the Arduino Mega board with a different processor, the ATmega 2560 [35, 39]. The same ATmega 2560 processor chip, without an Arduino board, was also used by Mazhar et al. [82] to develop their project. Among the three types of cores, ATmega 328 has the lowest memory specifications, namely, 32 KB of flash, 1 KB of EEPROM, and 2 KB of static random-access memory (SRAM). In contrast, both ATmega 1280 and 2560 have larger memory sizes. These two types of microcontrollers have the same 4 KB of EEPROM and 8 KB of SRAM. Nevertheless, ATmega 2560 has a larger flash memory size of 256 KB as opposed to ATmega 1280 with only 128 KB.

ARM Holdings does not manufacture microcontrollers, as it only licenses the technology to other semiconductor companies [83]. An advanced RISC machine, commonly known as an ARM microcontroller, usually costs around $3 to $8 and is built upon a 32-bit or 64-bit core [84]. The architecture of an ARM is either the Harvard architecture or the Harvard first level memory system. From Table 1, the recent trend in selecting the type of microcontroller tends to favor the ARM microcontroller. Gadhave and Kore [42] developed their system using a Raspberry Pi board with a Broadcom BCM2835 ARM 11 processing core. Besides, an Atmel SAMA5D31 equipped with an ARM Cortex-A5 (“A” profile for sophisticated application [83]) microprocessor was used in the system by Dhanalakshmi et al. [48]. Other models of Raspberry Pi boards were also used, such as the Raspberry Pi 2 Model B with a Broadcom BCM2836 ARM Cortex-A7 processing core [45]. In addition, another two systems were developed using the Raspberry Pi 3 Model B [46] and the Raspberry Pi 3 Model B+ [62]. Both of these boards were based on ARM Cortex-A53, using the Broadcom BCM2837 and Broadcom BCM2837B0 chips, respectively. Furthermore, an STMicroelectronics STM32F103C8 integrated with an ARM Cortex-M3 (“M” profile for microcontroller application [83]) microprocessor was used by Gagandeep et al. [44] to implement the attendance system. Generally, ARM 11, Cortex-A5, Cortex-A7, and Cortex-A53 are suited for high-end applications with much greater performance, in contrast to Cortex-M3, which has lower functionality [83]. All of the ARM microprocessors discussed previously are based on a 32-bit core except Cortex-A53, which has a 64-bit core.

A digital signal processor (DSP) is a specialized microprocessor designed for signal processing operations. The price of a DSP ranges between $14 and $29. Unlike the other microcontrollers, which have an 8-bit, 16-bit, or 32-bit core, a DSP usually has uncommon bus widths [85]. Besides that, DSPs utilize an RISC processor for better performance in signal processing applications. The DSPs used in the biometric-based attendance systems support a modified Harvard architecture. In this context, the original Harvard architecture is modified so that one memory bank handles program instructions and data while the other handles data only [86]. As a result, two memories can be accessed concurrently within one multiply/accumulate (MAC) instruction [87]. Moreover, these DSPs contain a 40-bit arithmetic logic unit (ALU). Li et al. [88] used the TMS320C5409 DSP by Texas Instruments Incorporated in their system. Likewise, Kamaraju and Kumar [75] used another type of DSP, the ADSP-BF532 by Analog Devices Incorporated. The reason for choosing a DSP was that the image processing, feature extraction, and fingerprint recognition tasks were all carried out by the processor, whereas in other systems these processes were done in the fingerprint scanners [75].

A hybrid type of board, which encompasses both a microcontroller and a system on chip (SoC), was implemented in a face recognition-based attendance system using deep learning to mark the attendance of students [52]. This hybrid board, known as the UDOO X86 Ultra single-board computer, consists of an Intel Curie microcontroller and an embedded SoC that integrates an Intel Pentium N3710 processor. The microcontroller part provides a platform compatible with Arduino 101, with a similar pin layout. On the other side, the 64-bit SoC has 4 cores and 8 GB of dual-channel RAM. Besides, the SoC is embedded with a graphics processing unit (GPU) using the Intel HD Graphics 405 controller, which has 16 execution units for visual computing [89]. The GPU is capable of handling parallelized operations; therefore, the execution time for the deep learning algorithm can be reduced tremendously.

Almost all the microcontrollers mentioned previously have small size, low cost, and low power consumption along with high performance. Hence, these microcontrollers are suited for a wide variety of general-purpose applications besides the attendance system. Nevertheless, DSPs can be selected as well by developers who are concerned with signal processing and real-time execution [75]. Developers employing a deep learning algorithm have the option of choosing a board with an embedded GPU for faster processing speed. Table 2 summarizes the performance of each of the microcontroller platforms. Besides the bus width, memory architecture, and instruction set architecture, other performance measures or characteristics are outlined as well. In terms of speed, 80C51 needs the most clock cycles to execute an instruction. Moreover, 80C51 has high power consumption and fewer input/output (I/O) ports that can be connected to other components. Thus, poor performance is the trade-off for a lower price, considering that 80C51 only costs about $4 to $9. Similarly, PIC is inexpensive at around $4-$5 and has low power consumption. However, compromises have to be made in terms of performance, as it is slow and has fewer I/O ports. In contrast, ATmega comes with a price tag between $1 and $13; although the upper end of this range is slightly higher, it is still affordable. Performance wise, ATmega has higher speed, lower power consumption, and more I/O pins. Likewise, the cost of an ARM is reasonable, retailing from $3 to $8. In addition, ARM is fast, needing fewer clock cycles to execute an instruction, as well as power efficient, and has more I/O ports. Therefore, ATmega and ARM offer good value for money by providing the right balance between cost and performance. This is further supported by the adoption of ATmega and ARM in most of the biometric attendance systems.
On the other hand, a DSP is able to implement the recognition process [75] besides being a controller to handle other components. Hence, a DSP is a standalone system suitable for complex tasks. In comparison, PIC needs an additional processor to run the recognition process, as in the matching task implemented in a computer [79]. With a retail price from $14 to $29, DSP is the most expensive among all the microcontroller platforms. The higher cost reflects its good performance with regard to speed and power consumption.
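The characteristics discussed in this section can be collected into a small data structure to support a first-pass selection of a platform. The sketch below is only an illustration: the `Platform` class and the helper functions are hypothetical, the price ranges and architectural labels are the ones quoted in this section, and the DSP bus width is left empty because the text only describes it as uncommon.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Platform:
    name: str
    bus_widths: Tuple[int, ...]   # core widths in bits, as quoted in the text
    isa: str                      # "RISC" or "CISC"
    memory_architecture: str
    price_usd: Tuple[int, int]    # (lowest, highest) quoted price


PLATFORMS: List[Platform] = [
    Platform("80C51", (8,), "CISC", "Harvard", (4, 9)),
    Platform("PIC", (8, 16, 32), "RISC", "Harvard", (4, 5)),
    Platform("ATmega", (8,), "RISC", "Harvard", (1, 13)),
    Platform("ARM", (32, 64), "RISC", "Harvard", (3, 8)),
    # The text gives no specific bus width for DSPs, so none is listed.
    Platform("DSP", (), "RISC", "modified Harvard", (14, 29)),
]


def within_budget(platforms: List[Platform], max_price: int) -> List[str]:
    """Names of platforms whose highest quoted price does not exceed max_price."""
    return [p.name for p in platforms if p.price_usd[1] <= max_price]


def with_bus_width(platforms: List[Platform], bits: int) -> List[str]:
    """Names of platforms available with the requested core width."""
    return [p.name for p in platforms if bits in p.bus_widths]


print(within_budget(PLATFORMS, 10))
print(with_bus_width(PLATFORMS, 32))
```

Price and bus width alone do not capture speed, power consumption, or the number of I/O ports, so such a filter can only narrow the candidates before the qualitative comparison is applied.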

3. Biometric Sensor

Biometric sensors are used to capture the biometric characteristic of an individual. In the biometric-based attendance system, the sensors are used for two purposes. First, sensors are used to capture the biometric data to be stored as templates. After that, each template is tagged with the corresponding name or roll number of a student. All of the templates along with the students’ information are then stored in a database as references for comparison with new biometric data captured subsequently. This process is done only once at the start of an academic semester. The next purpose of sensors is to capture another copy of new biometric data from each student whenever there is a class. In order to mark the class attendance, the identity of each student must be recognized. Hence, the new biometric data captured are compared to the templates to record the correct name, date, and time of those students who are present in the class. Other than discussing the purpose and importance of sensors in the biometric-based attendance system, the following subsections will deliberate about the different types of sensors along with the prerequisite condition or environment for capturing error-free biometric data.

3.1. Iris Sensor

Iris image capture devices or sensors are actually just cameras [90]. There are certain criteria to consider when choosing an iris sensor. The first is the illumination condition, because without ample lighting, the iris image tends to be dark. However, a visible light source with a wavelength between 400 and 760 nm is not suitable as it will cause discomfort to the user. Thus, a near infrared (NIR) light source (wavelength between 700 and 900 nm) is usually used for capturing iris images [91]. In addition, NIR light is able to enhance the iris image with a perceivable structural pattern [92]. The next criterion is related to the focus of the camera’s lens. An autofocus camera is able to adjust its lens over varying distances to bring the iris image into focus [93]. Hence, it is more convenient for the users as they do not need to be at close range to the iris sensor. A fixed-focus camera confines the users within a specific distance from the lens for successful iris acquisition [93]. Thus, users are expected to be highly cooperative or to stay still during the capturing process. Performance wise, an autofocus camera is more flexible, as it can easily capture sharp iris images at varying distances by selecting the best focus. In contrast, with its fixed focal setting, the fixed-focus camera occasionally produces blurred images when the iris is located too near to the camera. Nevertheless, the autofocus camera is more expensive than the fixed-focus camera. The other criterion is the resolution of the iris sensor. According to Daugman [94], the sensor should be able to capture at least 70 pixels in iris radius for good recognition. Alternatively, the National Institute of Standards and Technology recommended a minimum of 60 pixels across the iris radius [95].
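The illumination and resolution criteria above can be expressed as simple checks, as in the following minimal sketch. The function and constant names are hypothetical; the numeric limits are those quoted in this subsection from [91], [94], and [95], and measuring the iris radius in pixels is assumed to be done by a separate segmentation step that is not shown.

```python
# Limits quoted in the text; the constant names are illustrative.
DAUGMAN_MIN_RADIUS_PX = 70   # minimum iris radius suggested by Daugman [94]
NIST_MIN_RADIUS_PX = 60      # minimum recommended by NIST [95], as cited
NIR_RANGE_NM = (700, 900)    # NIR band usually used for iris capture [91]


def meets_resolution(iris_radius_px: float,
                     minimum_px: int = DAUGMAN_MIN_RADIUS_PX) -> bool:
    """True if the measured iris radius satisfies the chosen pixel minimum."""
    return iris_radius_px >= minimum_px


def is_nir(wavelength_nm: float) -> bool:
    """True if the light source wavelength lies in the quoted NIR band."""
    low, high = NIR_RANGE_NM
    return low <= wavelength_nm <= high


# Example: a hypothetical capture with a 65-pixel iris radius under 850 nm light.
radius_px, wavelength_nm = 65, 850
print("Daugman criterion met:", meets_resolution(radius_px))
print("NIST criterion met:", meets_resolution(radius_px, NIST_MIN_RADIUS_PX))
print("NIR illumination:", is_nir(wavelength_nm))
```

A capture can therefore satisfy the more lenient recommendation while failing the stricter one, which matters when a low-resolution webcam is used as the iris sensor.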

For the iris-based attendance system, Teh and Mohamed [96] and Khatun et al. [97] implemented their systems using a webcam. In terms of system design, Teh and Mohamed [96] installed the webcam inside a box while Khatun et al. [97] attached the webcam above a laptop. The system by Teh and Mohamed [96] caters to low-resolution iris images captured through a webcam. None of the authors [76, 96, 97] provided information regarding the model of webcam or sensor used; thus, a comparison between the equipment cannot be made. Generally, a webcam is preferred in an iris-based attendance system because of its lower cost and acceptable overall image quality. Nonetheless, a low-end webcam produces a low-quality image. The trade-off for a low-cost webcam is that researchers need to perform additional image preprocessing: in order to increase the recognition rate, the quality of the captured image is improved through techniques such as contrast enhancement or image denoising. In addition, a lot of time may be spent taking the attendance of all the students, especially in large classes, because students need to queue to scan their iris as the camera can only capture one iris at a time.

3.2. Voice Sensor

Voice can be captured via a microphone. Typically, a microphone is a type of transducer that converts sound waves into an electrical signal. Variation in sound pressure causes the diaphragm in the microphone to vibrate, producing mechanical energy which is in turn converted to electrical energy in the form of an alternating voltage. Hence, microphones can be classified according to either their mechanical or their electrical characteristics [98].

Nevertheless, according to Thomas and Govindaraju [93], the two main types of sensors used for voice biometrics are acoustic and nonacoustic sensors. Acoustic sensors are found in most widely used microphones and capture the acoustic signal of the voice. For example, a dynamic microphone operates on the principle of electromagnetic induction, while a condenser microphone works on the basis of changing capacitance. The other type is the nonacoustic sensor, which measures glottal excitation and vocal-tract articulation movements. The electromagnetic motion sensor is a nonacoustic sensor that measures glottal movement using microwave radar. Others include the electroglottography (EGG) sensor, which reproduces a speech signal from measurements of the vocal fold contact area; the piezoelectric sensor, which produces a voltage in response to changes in pressure; and sensors that exploit bone movement. In terms of performance, the acoustic sensor in an ordinary microphone is prone to background noise. By contrast, the nonacoustic sensor is less susceptible to surrounding noise because it detects only the muscle movements in the vocal-tract area. However, the nonacoustic sensor comes at a higher cost and may cause discomfort since it has to be attached to the user.

In order to choose a suitable microphone, a few specifications should be considered [98]. The first is sensitivity, which measures the output voltage generated for a given sound pressure. Next is the inherent noise, the voltage level present at the microphone output even when there is no sound source. The dynamic range is the difference between the maximum sound pressure level that the microphone can capture and its inherent noise. In addition, the frequency response defines the range of frequencies (the interval between the upper and lower limits) over which the microphone operates. Directivity describes the sensitivity to the direction of the sound, which depends on the structural shape and directivity pattern of the microphone. Apart from the technical specifications, microphone placement plays an important role in capturing quality voice data. Typically, the microphone is placed either in the near field or in the far field of the sound source [99].
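Two of these specifications reduce to simple arithmetic, sketched below. The conversion of sensitivity from mV/Pa to decibels relative to 1 V/Pa and the subtraction used for dynamic range follow the definitions given above; the example figures (10 mV/Pa, 120 dB SPL, 30 dB SPL) are assumed for illustration and do not describe any microphone in the surveyed systems.

```python
import math


def sensitivity_dbv(sensitivity_mv_per_pa):
    """Convert microphone sensitivity from mV/Pa to dB relative to 1 V/Pa."""
    return 20.0 * math.log10(sensitivity_mv_per_pa / 1000.0)


def dynamic_range_db(max_spl_db, inherent_noise_db):
    """Dynamic range: maximum sound pressure level minus inherent noise."""
    return max_spl_db - inherent_noise_db


if __name__ == "__main__":
    # Assumed example values, not taken from a datasheet.
    print(sensitivity_dbv(10.0))          # sensitivity of a 10 mV/Pa microphone
    print(dynamic_range_db(120.0, 30.0))  # range between noise floor and clipping
```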

From the literature review, almost all of the voice-based attendance systems [15, 37, 56, 100] used the built-in microphones of mobile phones to record voice. Compared with common microphones, which are relatively large, the microphones in mobile phones are much smaller. Modern smartphones are equipped with more than one miniature microphone to detect surrounding sound as well as to filter out unwanted noise. Nowadays, most students own a smartphone; thus, voice can be recorded on it to mark attendance. This in turn helps academic institutions lower the expense of system development since the hardware costs are reduced. However, this cost saving may decrease the recognition rate, mainly because the voice captured by different types of smartphones lacks uniformity. Some authors suggested an attendance app that can be downloaded to any smartphone belonging to the students [15, 100]. On the other hand, other systems confined use to a few predefined mobile phones [37, 56]. Based on this discussion, a smartphone is a good option for the voice-based attendance system. However, the disadvantage is sensitivity to background noise, such as the sounds of other students chatting while a particular student is recording a voice sample for attendance marking. Moreover, the quality of the recorded voice may be degraded by the microphone.

3.3. Fingerprint Sensor

Various types of sensors are used to capture the fingerprint image. Fingerprint image sensors can be categorized into three types: optical, solid state, and ultrasound [101]. Optical sensors can be implemented in different ways. The earliest and most commonly used optical sensors are based on the working principle of frustrated total internal reflection (FTIR). The components of an FTIR-based optical sensor include a light source, a glass or plastic prism, a lens, and a charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) camera. As the user touches the top side of the prism, the CCD or CMOS camera captures the light reflected from the prism via the focusing lens. A fingerprint is characterized by different patterns of ridge and valley features [102]. Because the ridges and valleys present different refractive indexes when the finger is placed on top of the prism, the light is absorbed at the ridges and reflected at the valleys, as shown in Figure 5(a). As a result, the ridges appear dark and the valleys appear bright in the captured fingerprint image [101]. An optical sensor has several advantages. It cannot be tricked easily by a duplicated fingerprint image, as it only senses a finger with a three-dimensional structure. Besides, it can capture images at varying resolutions and produce high-quality images. Despite these benefits, the FTIR-based optical sensor is susceptible to a dry or wet finger, which yields a saturated or weak impression, respectively [103]. Moreover, it cannot be miniaturized easily, because reducing the optical path length can cause optical distortion in the image.

Solid state sensors, also known as silicon sensors, are built from an array of tiny sensing elements, each of which represents a pixel. There are four kinds of solid state sensors: capacitive, thermal, electric field, and piezoelectric [101]. By far, the most common is the capacitive type. A capacitive sensor consists of two plates. One plate includes a two-dimensional array of capacitors placed underneath the finger sensing surface, and the other plate is the skin of the finger. Capacitances of varying magnitudes are created depending on the distance between the two plates, as shown by the vertical double-headed arrow in Figure 5(b). Hence, ridges and valleys can be distinguished by their different capacitance values [104]. Solid state sensors address some of the problems associated with optical sensors: they have a smaller footprint, and their electrical parameters can be adjusted to cope with a dry or wet finger. In addition, these sensors cannot be duped easily by a copied or fake fingerprint image since a three-dimensional finger surface is sensed based on the distance measurement. In spite of these advantages, solid state sensors are sensitive to electrostatic discharge (ESD). Moreover, white blobs become noticeable in the fingerprint image after prolonged use of the sensor [105]. Furthermore, frequent cleaning is required to obtain a good and clean fingerprint image.
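The dependence of capacitance on plate distance can be illustrated with the ideal parallel-plate formula C = ε₀·εᵣ·A/d. This is a deliberately simplified sketch: real sensors involve fringing fields, passivation layers, and skin properties, and the pixel size and ridge and valley distances below are assumed values chosen only to show the trend.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity in F/m


def plate_capacitance(area_m2, distance_m, relative_permittivity=1.0):
    """Ideal parallel-plate capacitance in farads."""
    return EPSILON_0 * relative_permittivity * area_m2 / distance_m


# Assumed geometry (illustrative only).
PIXEL_AREA_M2 = (50e-6) ** 2  # 50 um x 50 um sensing element
RIDGE_DISTANCE_M = 2e-6       # ridge nearly touching the surface
VALLEY_DISTANCE_M = 60e-6     # valley further away from the surface

if __name__ == "__main__":
    c_ridge = plate_capacitance(PIXEL_AREA_M2, RIDGE_DISTANCE_M)
    c_valley = plate_capacitance(PIXEL_AREA_M2, VALLEY_DISTANCE_M)
    print(f"ridge:  {c_ridge:.3e} F")
    print(f"valley: {c_valley:.3e} F")
    print(f"ratio:  {c_ridge / c_valley:.1f}")
```

Because capacitance is inversely proportional to distance in this model, a ridge produces a markedly larger capacitance than a valley, and thresholding the per-pixel values separates the two.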

Ultrasound sensors are used to sense the difference between acoustic reflection depth of the ridges and valleys [103]. The sensor consists of a transmitter which produces an acoustic signal as well as a receiver that detects the corresponding reflected or echo signal from the fingerprint surface. Figure 5(c) illustrates that the fingerprint structure is captured by computing the reflection depth of successive echo signals. Ultrasound sensors are able to capture a good quality image without being affected by dirt, oil, or other contaminants on the finger [101]. However, these sensors are bulky, expensive, and require more time for fingerprint image acquisition [102].

Important criteria for choosing a fingerprint scanner are the acquisition area, output image resolution, and geometric accuracy [101]. The optimum value of the fingerprint sensing area is inch2 (). However, most commercial scanners are smaller in order to reduce size and cost. Besides, the minimum image resolution is around 500 dots or pixels per inch (dpi). Geometric accuracy is measured by the structural distortion introduced by the fingerprint scanner. Other parameters should also be considered, such as the I/O interface, frames per second, automatic finger detection, encryption, and supported operating system. Table 3 lists the specifications of the sensors used for finger impression in the attendance systems.

From Table 3, it is observed that optical sensors are the most popular type of fingerprint scanner for the biometric-based attendance system [18, 34, 35, 38–40, 42–44, 54, 75, 79, 82]. Solid state sensors were used occasionally [48, 58, 88], while none of the attendance systems used ultrasound sensors. Other than that, Adal et al. [17] proposed using mobile phones because nowadays most of these devices are equipped with a built-in fingerprint sensor at a much lower price. In addition, most of the sensors capture the fingerprint image at a resolution of around 500 dpi. Among the optical sensors, the GT-511C3 [82] and GT511C1R [35] have the smallest sensing area at only 175 mm², whereas other sensors, such as the R305 [42–44], have a larger area of 270 mm². By comparison, the sensing areas of the solid state sensors are 192 mm² and 230.4 mm² for the FPS200 and SFM3050-TC1, respectively. A larger sensing area enables more fingerprint information to be captured. Generally, the cost of a sensor tends to increase with the sensing area. For example, the GT511C1R with its smaller sensing area costs only $18, compared with $21 for the R305 with its larger area. The number of pixels ranges from 51840 to 107016 for optical sensors, whereas a fingerprint image captured with solid state sensors has around 76800 to 92160 pixels. As a result, optical sensors at the upper end of this range capture images containing more pixel information. The trade-off for better performance is higher cost; for example, the U.are.U optical sensor offers 107016 pixels but retails for $80.
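The relationship between sensing area, resolution, and pixel count can be approximated as below. The sketch assumes a resolution of exactly 500 dpi and a uniform pixel grid, so the result is only an estimate and will not exactly match the pixel counts listed for real sensors. The areas and prices are the ones quoted in the text; the function names are hypothetical.

```python
MM_PER_INCH = 25.4


def approx_pixel_count(area_mm2, dpi):
    """Approximate pixel count for a sensing area at a given resolution."""
    px_per_mm = dpi / MM_PER_INCH
    return area_mm2 * px_per_mm ** 2


def cost_per_mm2(price_usd, area_mm2):
    """Price normalized by sensing area."""
    return price_usd / area_mm2


if __name__ == "__main__":
    # FPS200: 192 mm^2, assumed here to be sampled at exactly 500 dpi.
    print(round(approx_pixel_count(192.0, 500.0)))
    # GT511C1R ($18, 175 mm^2) versus R305 ($21, 270 mm^2).
    print(cost_per_mm2(18.0, 175.0), cost_per_mm2(21.0, 270.0))
```

The estimate for the FPS200 lands close to the lower bound of 76800 pixels reported for solid state sensors. The normalized prices also show that, for these two optical sensors, cost rises with area but less than proportionally.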

The previous part reviewed the types of fingerprint sensors and their application in attendance systems. Table 4 further compares these fingerprint sensors in terms of performance and characteristics. Optical and solid state sensors generate 2D images, while the ultrasound sensor is capable of producing 3D images with more biometric information. However, the ultrasound sensor is slightly slower in acquiring a fingerprint image than the other two types of sensors. In terms of power consumption, the solid state sensor is the most power efficient, whereas optical and ultrasound sensors have high power requirements. In descending order of security, the solid state sensor ranks first with high resilience to spoofing, followed by the ultrasound sensor and finally the optical sensor. Regarding size, the solid state sensor is miniature and is found in most smartphones. In contrast, large ultrasound and FTIR-based optical sensors are hard to integrate into smartphones. Nevertheless, with technological improvement, optical and ultrasound sensors are beginning to serve as alternatives for smartphone adoption. The cost of optical and solid state sensors is a function of the performance and sensing area as discussed previously, but overall, the price is considered low. Meanwhile, the ultrasound sensor is expensive and thus is not used in any of the fingerprint-based attendance systems. In essence, faster speed coupled with lower cost is one of the reasons that optical and solid state sensors are the preferred choice for fingerprint-based attendance systems. Nevertheless, this system shares the disadvantage of the iris-based system, because students have to queue and take turns to scan their fingers.

3.4. Face Sensor

Facial image acquisition devices, or sensors, are usually cameras used to capture images or record video frames. Acquired face data can take the form of a two-dimensional (2D) intensity image, a three-dimensional (3D) representation consisting of intensity and depth information, or an infrared image [106]. The crucial parts of a camera are the image sensor and the lens [107]. Generally, the image sensor converts light travelling through the camera lens into electric charge and then into electronic signals. The electric charge in each pixel is proportional to the illumination intensity, so brighter images contain more charge than dim images. There are two types of image sensors: the charge-coupled device (CCD) and the complementary metal-oxide-semiconductor (CMOS) sensor. Both accumulate charge in proportion to the intensity of light striking each pixel. In a CCD sensor, charge is moved sequentially from one pixel to the next until it reaches a common output node for voltage conversion. Moreover, the CCD sensor has an analogue output. In a CMOS sensor, charge-to-voltage conversion happens directly in each pixel [108], and the output is in digital bits. Historically, the CCD was the prevailing sensor for capturing images because of its high sensitivity to light and its ability to produce quality images with low noise. However, its power consumption and production cost are high, and no other functions can be integrated inside the CCD chip. In the early 1990s, with the growth of semiconductor technology, the CMOS sensor emerged as an alternative to challenge the CCD sensor. The advantages of the CMOS sensor are low cost, lower power consumption, and integration of both the sensor and image processing functions on the same chip. As a result, the CMOS sensor is widely used in mobile phone cameras, webcams, and digital single-lens reflex cameras (DSLRs) [109]. Nevertheless, its drawbacks are low sensitivity and susceptibility to noise. A newer type of CMOS sensor known as scientific CMOS (sCMOS) was introduced in 2009. Generally, the sCMOS sensor offers high sensitivity, low noise, high resolution, high dynamic range, and high frame rate [110]. Concurrently, CCD technology has also progressed in reducing cost and power dissipation while increasing the level of integration [111]. With such improvements, there will eventually be little or no distinction between CMOS and CCD sensors.

Several properties are associated with the image sensor, such as the frame rate, spatial resolution, and pixel sensitivity [106]. A video consists of images taken in sequence, also known as frames. Therefore, the frame rate is defined as the number of images captured each second, measured in frames per second (fps) [112]. Resolution is often related to the sharpness of an image. The spatial resolution of a video image is determined by the number of pixels within the image or, in other words, the pixel and line counts in the horizontal and vertical directions, respectively [113]. In addition, pixel sensitivity denotes the sensitivity of the sensor to light. In order to increase the light sensitivity, the back-illuminated sensor was developed, which uses the opposite surface of the silicon for more efficient light absorption [114]. Other image sensor properties, such as responsivity, uniformity, and shuttering, are beyond the scope of this paper; readers can refer to the background literature for more details [107]. The camera properties of each face recognition attendance system are listed in Table 5.

As indicated in Table 5, webcams are the most popular type of camera used for the face recognition-based attendance system, followed by mobile phone cameras. The reason is that these cameras are widely available at an affordable price. Kawaguchi et al. [27] suggested using a fish-eye lens camera (sensing camera) mounted on the ceiling of a classroom to obtain a spherical view of the students’ seating area from above. Besides, a pan-tilt-zoom camera (capturing camera) was also used to capture the face of each student. This type of camera allows movement control and zoom adjustment of the lens. DSLR CMOS cameras with superior imaging technology by Nikon were used in two attendance systems [53, 63]. However, DSLR cameras are expensive compared with normal webcams. A Kinect camera, which is associated with 3D imaging [106], was used by Islam et al. [61], but the authors did not mention whether 3D face images were applied in their attendance system. Furthermore, the Raspberry Pi camera serves as another option as it is compatible with all models of Raspberry Pi boards, as implemented in the system by Salim et al. [46]. In terms of the number of cameras, most attendance systems were implemented using only one camera to cover the entire class. Some attendance systems, however, used two cameras with different capturing tasks. Examples include the sensing and capturing cameras of Kawaguchi et al. described previously [27], separate cameras for enrollment and recognition [49], and two nonintersecting cameras for students seated in two columns [30]. In addition, six cameras were installed across the whole class for better resolution [57].

With respect to the resolution of the cameras, the value ranges from around 300 kilopixels to 900 kilopixels (). From the video frame or image captured, the crop size of the face images varies between the minimum value pixels to maximum value pixels. The typical frame rate of the cameras used in the attendance system is 30 fps, although there are other values within 2-60 fps. In regard to the camera format, some systems used video frame while others used images of the students’ faces for further image processing tasks. The locations of the cameras are also important in order to obtain faces in a clear view. The cameras can be placed at the entrance of the door, on top of the blackboard, or in front or at the center of the classroom. Other than those properties, for better image processing results, Chintalapati and Raghunadh recommended that the face images be captured at a distance of 4 to 7 feet [66] whereas Raghuwanshi and Swami proposed around 1-3 feet [31].
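To relate the kilopixel figures and frame rates above to concrete numbers, the sketch below computes the pixel count of a frame and the uncompressed data rate of a video stream. The two resolutions used (640 × 480 and 1280 × 720) and the 24 bits per pixel are assumptions chosen as common examples; they happen to fall near the ends of the quoted range but are not stated in the source for any particular system.

```python
def kilopixels(width, height):
    """Pixel count of one frame, in kilopixels."""
    return width * height / 1000.0


def raw_data_rate_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed video data rate in megabits per second."""
    return width * height * fps * bits_per_pixel / 1e6


if __name__ == "__main__":
    # Assumed example resolutions at the typical 30 fps.
    for w, h in ((640, 480), (1280, 720)):
        print(w, h, kilopixels(w, h), raw_data_rate_mbps(w, h, 30))
```

The raw rates illustrate why systems that work on full video frames, rather than on individual face images, place heavier demands on the communication channel and storage.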

Typically, the image sensor in a camera is either a CCD or a CMOS sensor. Table 6 presents the performance of each image sensor. In terms of image quality, the CCD sensor produces superior images with low noise while the CMOS sensor is prone to noise, resulting in lower-quality images. Moreover, the CCD sensor is sensitive to the intensity of light illumination and has a higher signal-to-noise ratio, whereas the CMOS sensor has several transistors located next to each pixel that reduce the light sensitivity. From the perspective of speed, charge-to-voltage sampling at each pixel in the CMOS sensor is faster than shifting charge to a common output for voltage conversion in the CCD sensor. Besides, owing to lower power dissipation, the CMOS sensor consumes less power than the CCD sensor. As opposed to the CCD sensor, the CMOS sensor is small, which makes it suitable for integrating most camera functions on a chip. In addition, the cost of the CMOS sensor is much lower than that of the CCD sensor, thanks to high-volume production on standard silicon manufacturing lines. As shown in Table 6, although the CCD sensor captures high-quality images and is more sensitive to lighting, the CMOS sensor outshines the CCD sensor in other prominent aspects such as faster speed, smaller size, and lower power consumption and cost. Hence, the CMOS sensor is generally preferred for integration in webcams, digital cameras, and DSLR cameras. In spite of that, there is a trade-off between the cost and performance of the CMOS sensors found in most cameras. The CHD 20.0 webcam [50] retails for $21, but a compromise has to be made whereby the image is less sharp with only 720p resolution, also known as high-definition (HD) ready. Hence, as with the iris sensor, an image preprocessing stage is required for a better recognition rate. On the other hand, the DSLR 5200 camera [63] captures crisp images with 1080p resolution (full HD); nonetheless, it is expensive at $545. The advantages of the face recognition-based attendance system are speed (no queue) and nonintrusiveness, which make it suitable for attendance marking particularly in a large class. However, the recognition process may be affected by the orientation of the faces, occlusion, and poor lighting.

4. Communication Channel

After the sensor captures the biometric traits, the users’ biometric data in the form of images or voice recordings, identifiers (IDs), roll numbers, date, and time are transferred from the transmitter module to the receiver module. If the sensor only captures the biometric traits, these data need to be transmitted to the receiver module so that the matching process can be done on the server [48, 79, 88]. Alternatively, if the matching of biometric identity is done on the sensor module, only the user’s ID or roll number along with the date and time is transmitted [42–44, 58, 75, 76]. These data are necessary for user authentication and attendance management in an institute.
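The two transmission modes described above differ mainly in what the payload contains. The following sketch contrasts them using JSON messages; the field names, the JSON encoding, and the 30 000-byte sample size are all hypothetical choices for illustration and are not taken from any of the surveyed systems.

```python
import base64
import json
from datetime import datetime


def matched_payload(student_id: str, when: datetime) -> bytes:
    """Payload when matching is done on the sensor module: ID plus date and time."""
    message = {"id": student_id, "timestamp": when.isoformat()}
    return json.dumps(message).encode("utf-8")


def raw_payload(sample: bytes, when: datetime) -> bytes:
    """Payload when matching is done on the server: the captured sample itself."""
    message = {
        "sample_b64": base64.b64encode(sample).decode("ascii"),
        "timestamp": when.isoformat(),
    }
    return json.dumps(message).encode("utf-8")


if __name__ == "__main__":
    when = datetime(2024, 1, 15, 9, 0, 0)
    small = matched_payload("S001", when)
    large = raw_payload(bytes(30000), when)  # placeholder for an image or recording
    print(len(small), len(large))
```

Sending only the ID and timestamp keeps the message to a few dozen bytes, whereas sending the raw sample scales with the size of the image or recording, which matters when a low data rate channel is used.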

Wired connections such as a serial port or universal serial bus (USB) are used to transfer the data in some systems [35, 54, 97]. Another option is wireless communication, which is widely used for data transmission in biometric-based attendance systems. Apart from that, the location of a student can be detected using a wireless signal [15]. In addition, the number of preregistered devices connected to the wireless signal can be determined to count the number of students in a class [81]. Furthermore, attendance information can be viewed on a website [15, 18, 34, 40, 47, 48, 55, 115] or received through email [47, 55, 97, 100, 116] or short message service (SMS) [47, 48, 55, 115]. Wireless technology was chosen in most of the biometric attendance systems because it saves cost: unlike a wired network that requires physical connections, there is no need to lay cables and install ports in every classroom. Moreover, in terms of mobility, a wireless network can be expanded to other locations more easily without extra physical infrastructure. As such, authors in the literature proposed using different wireless protocol standards in their projects. The typical wireless standards used for biometric attendance systems include Wi-Fi, ZigBee, global system for mobile communication (GSM), and other types of radio frequency (RF), as shown in Table 7. These wireless technologies are compared in terms of the device used, data rate, frequency band, and coverage range.

From Table 7, it is clear that Wi-Fi, with its higher data rate, can transmit more data in a given time frame than ZigBee, GSM, and RF wireless networks. Besides, Wi-Fi supports a broader frequency band. The papers in Table 7 and the corresponding datasheets do not indicate the coverage range for Wi-Fi and GSM. Hence, from other references, the maximum range of Wi-Fi is 100 m [117] while GSM covers up to 35 km [118]. GSM and RF wireless (using PTR2000+) are thus intended for long-distance transmission. Other wireless characteristics, such as the topology and mode of operation of the ZigBee network, are also considered. In their system, Simao et al. used a mesh topology in the nonbeacon mode [58], whereas Li et al. implemented a cluster tree topology in the beacon mode [88]. Both of these topologies enable the network to be extended via routers. Moreover, the devices in the network are synchronized in the beacon mode and not synchronized in the nonbeacon mode. A more detailed explanation of the ZigBee topologies can be found in the literature [58, 88]. In addition, information related to the mode of operation can be obtained from other sources [119, 120]. In summary, Wi-Fi is the best option for a system that requires a high data rate, efficient energy consumption, and security [117]. On the other hand, low data rate and lower power consumption are features of the ZigBee and GSM protocols [117, 121].

The typical performance of the wireless protocols is summed up in Table 8. With a high data rate, Wi-Fi is capable of handling large quantities of data compared with ZigBee and GSM, which have low data rates. In terms of frequency band, Wi-Fi operates at 2.4 GHz and 5 GHz; ZigBee works at 868 MHz, 915 MHz, and 2.4 GHz; and GSM makes use of 850/900 MHz and 1800/1900 MHz. In addition, GSM has a long coverage distance of 35 km versus a short distance of 100 m for Wi-Fi and ZigBee. Both Wi-Fi and GSM have high power requirements, in contrast to ZigBee with its very low power consumption. Security-wise, Wi-Fi is the best, relying on Wi-Fi-protected access (WPA) encryption as opposed to the poor encryption in ZigBee and GSM. Moreover, Wi-Fi and ZigBee chipsets are inexpensive, in contrast to the higher chipset cost for GSM. From Table 8, Wi-Fi is the most sought-after communication channel for transferring attendance data. This is due to its high data speed and bandwidth as well as its low chipset cost, as in the ESP 8266, which retails at around $3 to $7. However, these advantages come at the price of high power consumption. The cost of a ZigBee chipset is also low, ranging between $2 and $7; however, the trade-off is a low data rate. Among all the wireless technologies, the chipset for the SIM5360E GSM module is the most expensive, with a price tag of around $20–$26. Nonetheless, GSM has the longest coverage range, albeit with a low data rate and high power consumption.
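The practical effect of the data rate can be estimated as payload size divided by link rate. The sketch below uses nominal peak rates that are assumptions rather than values read from Tables 7 and 8: 54 Mbit/s for Wi-Fi (IEEE 802.11g), 250 kbit/s for ZigBee (IEEE 802.15.4 in the 2.4 GHz band), and 9.6 kbit/s for GSM circuit-switched data. Real throughput is lower because of protocol overhead, so the results are best-case figures.

```python
# Assumed nominal peak rates in bits per second (not from Tables 7 and 8).
NOMINAL_RATE_BPS = {
    "Wi-Fi": 54_000_000,  # IEEE 802.11g
    "ZigBee": 250_000,    # IEEE 802.15.4, 2.4 GHz band
    "GSM": 9_600,         # circuit-switched data
}


def transfer_time_s(payload_bytes, rate_bps):
    """Best-case time to send a payload, ignoring protocol overhead."""
    return payload_bytes * 8 / rate_bps


if __name__ == "__main__":
    payload = 30_000  # hypothetical 30 kB biometric image
    for name, rate in NOMINAL_RATE_BPS.items():
        print(f"{name}: {transfer_time_s(payload, rate):.3f} s")
```

For a payload of a few dozen bytes (ID plus timestamp) all three links are adequate, which is consistent with the use of ZigBee and GSM in systems that perform matching on the sensor module.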

5. Database Storage

Data normally kept in the storage of the attendance system consist of biometric templates, attendance information, course information, email addresses, and mobile phone numbers. The templates are stored for comparison with newly captured biometric traits so as to authenticate the identity and mark the attendance of a particular student. Attendance information such as the student’s ID, date, and entrance and exit times is recorded to facilitate attendance tracking. In addition, course information regarding the course code, schedule, and venue is saved too. Besides, the email addresses and mobile phone numbers of parents and educators are kept in the database for attendance status notification. Figure 6 depicts the common types of devices for database storage in the biometric-based attendance system.

Typically, servers are used to store these data. Most of the servers were hosted on a computer workstation for easy access to attendance records, which helps in smoothing attendance management [15, 17, 18, 32, 35, 40, 43, 44, 47, 48, 54, 58, 70, 76, 88, 97, 115]. Furthermore, a mobile internal database was also implemented whereby the user stored the attendance and course information on the server using mobile phones [100]. Cloud servers hosted over the internet were used to store attendance information that can be accessed remotely and at any time [62, 69, 116]. Additionally, web servers were used in some attendance systems [29, 46]. González-Agulla et al. [72] along with Gadhave and Kore [42] stored and managed the attendance activities with a modular object-oriented dynamic learning environment (Moodle) server, which is an online management system. Besides using a server to store data, some biometric sensors are equipped with built-in memory to store the templates. For example, in a small classroom, a GT511C1R fingerprint sensor was used to store up to 20 templates [35]. On the other hand, for a larger classroom, the GT-511C3 [82] and R305 [42–44] fingerprint scanners support up to 200 and 980 templates, respectively. In addition, each student was provided with a near-field communication (NFC) card with 4K memory to store the fingerprint templates and identifier in a distributed manner [32]. The advantage of this approach is that, for security reasons, no storage device other than the NFC card contains the personal data. Memory ICs by Atmel such as the AT24C512 and AT24C1024, which provide 512K and 1024K bits of electrically erasable and programmable read-only memory (EEPROM), respectively, were used to store attendance information [54, 76]. Moreover, Purohit et al. employed the ST24C04 by STMicroelectronics with 4K bits of EEPROM [43]. Other storage devices for the attendance system include the secure digital (SD) card [38, 39, 82], random-access memory (RAM) [42, 75, 88], and NAND [48]. With the intention of preventing the loss of attendance information, some systems offered double storage capability to back up the original copy [32, 43, 48, 54, 76]. In essence, the choice of storage device for the attendance system depends on the size of the class and the amount of information required. For a classroom with a few students, or where only the students’ ID and attendance status (i.e., present or absent) need to be recorded, a small-capacity storage device is sufficient. However, for a large classroom with many students or a huge volume of data to be saved, a large storage device is preferable.
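The sizing rule above can be made concrete with a short calculation. The sensor template capacities (20, 200, and 980) and the EEPROM sizes are the ones quoted in the text, with "K" read as 1024 bits. The 512-byte template size is an assumption for illustration only, since actual template sizes depend on the sensor and algorithm.

```python
# Built-in template capacities quoted in the text.
SENSOR_CAPACITY = {"GT511C1R": 20, "GT-511C3": 200, "R305": 980}

# EEPROM sizes quoted in the text, in kilobits (1 Kbit = 1024 bits).
EEPROM_KBITS = {"ST24C04": 4, "AT24C512": 512, "AT24C1024": 1024}

ASSUMED_TEMPLATE_BYTES = 512  # hypothetical template size


def sensors_that_fit(class_size):
    """Sensors whose built-in memory can hold one template per student."""
    return [name for name, cap in SENSOR_CAPACITY.items() if cap >= class_size]


def eeprom_bytes(kbits):
    """Convert an EEPROM size in kilobits to bytes."""
    return kbits * 1024 // 8


def templates_that_fit(kbits, template_bytes=ASSUMED_TEMPLATE_BYTES):
    """Number of whole templates that fit in an EEPROM of the given size."""
    return eeprom_bytes(kbits) // template_bytes


if __name__ == "__main__":
    print(sensors_that_fit(150))
    for chip, kbits in EEPROM_KBITS.items():
        print(chip, eeprom_bytes(kbits), "bytes,", templates_that_fit(kbits), "templates")
```

Under the assumed template size, the small EEPROMs hold at most a few hundred templates, which suggests why such chips were used for attendance information (IDs and timestamps) rather than for template storage.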

Table 9 shows the performance of various types of database storage. As both are based on flash memory, the SD card and memory IC are combined in the table. In terms of capacity, the server is able to store large amounts of data, such as facial images, which have larger file sizes. On the other hand, the NFC card and memory IC have less capacity and suit small files, namely, fingerprint and iris images. With respect to data transfer speed, the server and RAM are the fastest while the sensor, NFC card, and memory IC are the slowest. The server and RAM require high power to function, in contrast to the low energy consumption of the others. As the NFC card is passive, its operating power is drawn from the reader. In addition, all of the storage devices have high durability. Most of the storage devices are affordable, with the exception of the server and RAM. Although the server performs best, with greater security and data backup, the trade-off is its high cost, as the price of a server ranges from $150 up to $4000. It is also affordable to store data in the built-in memory of some fingerprint sensors, which cost around $18 to $28; however, this memory is difficult to replace when larger storage is needed or the memory is defective. The NFC card is the cheapest with a price tag of around $0.20 to $1, but it has limited storage capacity and students may forget to bring their cards to class. Moreover, although the cost per piece is low, the NFC card is not suitable for large classes. This is because, for a class of 100 students, the overall cost rises to $100 at the maximum unit price of $1. Another inexpensive option is a memory IC, which costs around $1 to $3, yet regular handling may result in broken IC pins. The costs of the SD card and NAND are reasonable, with prices ranging between $10 and $20. Nevertheless, the attendance record on an SD card can only be viewed by transferring the data to a computer, and the card is prone to damage if inserted and removed frequently. The price of RAM is slightly higher, from $10 to $50. Despite delivering good performance, RAM is volatile, so additional storage such as flash [88] is needed. Among all the storage devices, the server is the most popular choice as it is utilized in the iris, voice, fingerprint, and face recognition-based attendance systems. Besides offering larger storage capacity and faster speed, the server centralizes the data, which gives easier access to attendance records than storing data locally on a memory card.
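The NFC cost argument is a simple multiplication of class size by unit price, sketched below with the price ranges quoted above. The comparison against the lowest quoted server price is an added illustration and ignores reader hardware, maintenance, and the fact that one server can serve many classes.

```python
NFC_UNIT_PRICE_USD = (0.20, 1.00)  # price range per card quoted in the text
SERVER_MIN_PRICE_USD = 150.0       # lowest server price quoted in the text


def nfc_total_cost(class_size, unit_price_usd):
    """Total card cost when every student needs one NFC card."""
    return class_size * unit_price_usd


def cards_matching_server_cost(unit_price_usd, server_price_usd=SERVER_MIN_PRICE_USD):
    """Number of cards whose total price equals the given server price."""
    return server_price_usd / unit_price_usd


if __name__ == "__main__":
    low, high = NFC_UNIT_PRICE_USD
    print(nfc_total_cost(100, low), nfc_total_cost(100, high))
    print(cards_matching_server_cost(high))
```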

6. Other Components

Besides the hardware already mentioned, other components were used in the biometric-based attendance systems. A liquid crystal display (LCD) or thin film transistor (TFT) touch screen was used to display information to the users [35, 39, 41, 43, 44, 48, 51, 54, 58, 75, 76, 79, 82, 122]. The information shown may be the user’s ID, roll number, date, time, course code, welcome message, or authentication status. In order to determine when the attendance was taken, a real-time clock (RTC) kept track of the time, date, and day. The timekeeping chips used were the DS1302 or DS1307 [54, 76, 82]. For a portable attendance system, batteries were used to power the devices [38, 44, 48, 54]. Lithium-ion (Li-ion) batteries were normally used for portable devices because their high energy density enables a longer operating time after each charge. In addition, regulators were used to obtain fixed output voltages for the portable device [44, 54]. Another attendance system, which was powered by a power supply circuit, used three different types of regulators (i.e., LM117, 7805, and 7812) to provide the desired output voltages for the processor, LCD, keypad, and fingerprint sensor [75]. A keypad or keyboard was used to enter the user’s ID, roll number, course code, or password [35, 48, 75, 76, 82]. Besides that, buttons and switches enabled the user to choose the desired operation, such as entrance and exit options, menu, reset, and power [54, 58, 79]. For multiple-layer authentication, an RFID or NFC card was used along with the biometric sensor to enhance security and achieve better performance in the biometric-based attendance system [32, 33, 35, 36, 39, 41, 55, 115]. The RFID or NFC reader allowed data, such as a user’s identifier or biometric template, to be collected from the card. A high-frequency RFID reader, which supports a longer range, was able to read the card automatically from a significant distance [55]. 
In contrast, students needed to hold their card close to a low-frequency RFID reader, which works only at short range [33, 39]. Furthermore, LEDs of different colors indicated various states, namely, fingerprint status, attendance status, and power status [54, 79]. Buzzers were used to alert the user or to signal unsuccessful attendance marking [39, 75, 76, 79]. Moreover, a speaker was incorporated in the attendance system to announce events happening in the institution [48]. Access control was implemented using a servo motor to open and close the door [18, 46]. In an effort to counter spoofing threats, a blink detector was used to detect eye blinks [66]. Apart from that, an ultrasonic sensor was used along with the camera to track student movement when entering or leaving the classroom [41].
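
As an illustration of how a card reader, a biometric sensor, and a clock might work together, the following Python sketch marks attendance only when the card identifier is enrolled and the live biometric sample matches the stored template. It is a minimal sketch under stated assumptions, not the method of any surveyed system: the names `AttendanceRecord` and `mark_attendance`, the `matcher` callback, and the use of the host clock in place of an RTC chip are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Optional


@dataclass
class AttendanceRecord:
    user_id: str
    timestamp: datetime  # on a real device this would come from the RTC


def mark_attendance(
    card_id: str,
    live_template: bytes,
    enrolled: Dict[str, bytes],
    matcher: Callable[[bytes, bytes], bool],
    now: Callable[[], datetime] = datetime.now,
) -> Optional[AttendanceRecord]:
    """Return a record only if the card is enrolled AND the biometric matches."""
    stored = enrolled.get(card_id)
    if stored is None:
        return None  # unknown card; a device might sound the buzzer here
    if not matcher(live_template, stored):
        return None  # biometric mismatch; attendance is not marked
    return AttendanceRecord(user_id=card_id, timestamp=now())


if __name__ == "__main__":
    # Placeholder matcher (assumption): exact byte equality. Real systems
    # compare templates with a similarity score and a threshold instead.
    exact_match = lambda live, stored: live == stored
    enrolled = {"S001": b"template-of-S001"}
    print(mark_attendance("S001", b"template-of-S001", enrolled, exact_match))
    print(mark_attendance("S001", b"someone-else", enrolled, exact_match))
```

Passing the matcher and the clock as parameters keeps the sketch independent of any particular sensor or timekeeping chip.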

7. Conclusion

Choosing suitable hardware for a biometric-based attendance system remains a challenge. Thus, this paper provides a review, along with tables summarizing the properties and characteristics of each type of hardware component. There are various types of microcontroller boards, biometric sensors, communication channels, database storage, and other components for researchers to select based on their own needs and requirements. For the microcontroller, 80C51 and PIC were used initially, between 2010 and 2014, followed by ATmega from 2014. After that, in 2016, ARM became popular and started to be used widely in attendance systems. In general, 80C51 and PIC are selected for tasks with low processing demands, such as iris and fingerprint recognition, while ATmega and ARM are suited to tasks that require higher processing capability, for instance, face recognition. The choice of sensor depends on the biometric trait to be captured. The fingerprint and face are the two most common biometric traits in attendance systems because of their convenience and high acceptability compared with the iris and voice. A sensor in the form of a camera is required for iris, fingerprint, or face image acquisition. CMOS is by far the most prevalent sensor, found in nearly all cameras. On the other hand, voice is recorded through a microphone. Time is another factor in sensor selection. A face recognition-based attendance system saves time because students do not need to queue or make contact with the sensor. The communication channel plays an important role in transferring attendance data. Wi-Fi is a good option since large amounts of data can be transferred in a given time frame, which is useful for real-time attendance monitoring. Attendance data can be kept in various types of database storage. 
The server is suitable for large-capacity storage and ease of access, while memory cards can only store a limited amount of data locally. Additionally, the number of students and the size of the biometric template also determine the type of database storage. Finally, other components are needed to complete the biometric attendance system. In a large classroom, researchers may opt for a more powerful microcontroller, a contactless sensor, larger database storage, and a communication channel with a high data rate. To sum up, the choice of hardware devices or components comes down to three important criteria: cost, power consumption, and speed.

Furthermore, biometric-based attendance systems are being taken to the next level by the widespread use of mobile devices in the internet of things (IoT) era. Modern smartphones come with built-in hardware components such as a camera, iris sensor, or fingerprint sensor. Consequently, an opportunity arises for future research in which attendance is marked using only smartphones instead of setting up a system with separate hardware components. Once the biometric traits are captured, the data can be uploaded over the smartphone’s wireless connection to a cloud server for authentication. Moreover, the attendance data can be saved in a cloud database. Nevertheless, in a world of increasing mobile connectivity, researchers should be cautious about the security and privacy of biometric data so that it does not fall prey to cyber criminals.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.


Acknowledgments

This work was supported in part by the Universiti Sains Malaysia: Research University Grant 1001/PELECT/8014052.