Academic Editor: N. V. Boulgouris
Abstract
A new iris recognition method for mobile phones based on corneal specular reflections (SRs) is discussed. We present the following three novelties over previous research. First, for users with glasses, many noncorneal SRs may occur on the surface of the glasses, making it very difficult to detect the genuine SR on the cornea. To overcome this problem, we propose a successive on/off dual illuminator scheme to detect genuine SRs on the corneas of users with glasses. Second, to detect SRs robustly, we estimated the size, shape, and brightness of the SRs based on eye, camera, and illuminator models. Third, the detected eye (iris) region was verified again using the AdaBoost eye detector. Experimental results with 400 face images captured from 100 persons with a mobile phone camera showed that the rate of correct iris detection was 99.5% (for images without glasses) and 98.9% (for images with glasses or contact lenses). The resulting iris authentication accuracy, based on the detected iris images, was an equal error rate (EER) of 0.05%.
1. Introduction
Instead
of traditional security features such as identification tokens, passwords, or
personal identification numbers (PINs), biometric systems have been widely used
in various kinds of applications. Among these biometric systems, iris
recognition has been shown to be a highly accurate method of identifying people
by using the unique patterns of the human iris [1].
Some
recent additions to mobile phones have included traffic cards, mobile banking
applications, and so forth. This means that it is becoming increasingly important to
protect the security of personal information on mobile phones. In this sense, fingerprint
recognition phones are already being manufactured. Other recent additions to
these phones have been megapixel cameras. Our final goal is to develop an iris
recognition system that uses only these built-in cameras and iris recognition software
without requiring any additional hardware components such as DSP chips.
In
addition to other factors such as image quality, illumination variation, angle
of capture, and eyelid/eyelash obfuscation, the size of the iris region must be
considered to ensure good authentication performance. This is because “the
image scale should be such that irises with diameters will show at least 100 pixels
diameter in the digital image to meet the recommended minimum quality level”
[2]. In the past, it was necessary to use large zoom and focus lens cameras to
capture images, so large iris images could not be obtained with small cheap
mobile phones. However, a megapixel camera can make it possible to capture magnified iris images with no need for
large zoom and focus cameras.
Even when facial images are
captured relatively far away (30 ~ 40 cm), the captured regions possess
sufficient pixel information for iris recognition. In addition, the
camera-viewing angle is larger than in conventional iris cameras and, consequently, the depth of field
(DOF) in which focused iris images can be captured is also larger.
With captured facial images, eye regions must be detected for
iris recognition. So, in this paper we propose a new iris detection method
based on corneal specular reflections (SRs). However, for users with glasses, there
may be many noncorneal SRs on the glasses and it can be very difficult to
detect genuine SRs on the cornea. To overcome these problems, we also propose a
successive on/off dual illuminator scheme.
Existing eye detection methods can
be classified into two categories. Methods in the first category detect eyes
based on the unique intensity distribution or the shape of the eyes under
visual light [3–9]. Methods in the second category exploit the spectral
properties of pupils under near IR illumination [10–12].
All the research discussed in
[3–6] used a deformable template method to locate the human eye. The method
discussed in [7] used multicues for detecting rough eye regions from facial
images and performed a thresholding process. Rowley et al. [8]
developed a neural network-based upright frontal facial feature (including the eye
region) detection system. The face detection method proposed by Viola and Jones
[9] used a set of simple features, known as an “integral image.” Through the
AdaBoost learning algorithm, these features were simply and efficiently classified
and then a cascade of classifiers was constructed [13, 14].
In the method discussed in [10],
eye detection was accomplished by simultaneously utilizing the bright/dark
pupil effect under IR illumination and the eye appearance pattern under ambient
illumination via the support vector machine (SVM). Ebisawa and Satoh [11]
generated bright/dark pupil images based on a differential lighting scheme that
used two IR light sources (one on and one off the camera axis). However, it is difficult to
use this method for mobile applications because the power of the light source must
be very strong to produce a bright/dark pupil image (this increases the power
consumption of mobile phones and reduces battery life). Also, large SRs can
hide entire eye regions for users with glasses.
Suzaki [12] detected eye regions
and checked the quality of eye images by using specular reflections for
racehorse and human identification. However, the magnified eye images were captured
close to the object in an illuminator-controlled harness place. This led to
small noncorneal SR regions in the input image. Also, these researchers did
not consider users with glasses. In addition, they only used heuristic experiments
to determine and threshold the size and pixel intensity value of the SR in the
image. In [15], the activation/deactivation illuminator scheme was proposed to
detect eye regions based on corneal SRs. However, because these researchers
used a single illuminator, detection accuracy was degraded when there were many
noncorneal SRs on the surface of glasses. In addition, because eye regions
were determined only based on detected SRs, there were many false acceptance
cases, which meant that noneye regions were falsely regarded as eye regions. Also,
only the iris detection accuracy and processing times were shown. In [16], the
researchers also used the on/off illuminator scheme, but it was used for
detecting rough eye positions for face recognition.
In [17], the researchers proposed a
method for selecting good quality iris images from a sequence based on the
position and quality of the SR relative to the pupil. However, they did not
solve the problem of detecting corneal SRs when there were many noncorneal SRs
when users wore glasses. In addition, they did not show the theoretical size
and brightness of corneal SRs.
To overcome these problems, we
propose a rapid SR-based iris detection method for use in mobile phones.
To determine the size and pixel intensity values of the SRs in the image theoretically,
we considered the eye model and the geometry of the camera, the eye, and the
illuminator. In addition, we used a successive on/off dual illuminator to detect
genuine SRs (in the pupil region) for users with glasses. Also, we excluded the
floating-point operation to reduce processing time, since the ARM CPU used in
mobile phones does not have floating-point coprocessors.
2. Proposed Iris Detection Algorithm
2.1. Overview of the Proposed Method and the Illuminator On/Off Scheme
An overview of the proposed method
is shown in Figure 1 [16]. First, the user initiates the iris recognition process
by clicking the “start” button of a mobile phone. Then, the camera
microcontroller alternately turns on and off the dual (left and right) infrared (IR)
illuminators. When only the right IR illuminator is turned on, two facial images
(Frames #1 and #2) are captured, as shown in Figure 2. Then, another image (Frame
#3) is captured when both illuminators are turned off. After that, two additional
facial images (Frames #4 and #5) are captured when only the left IR illuminator
is turned on. In this way, we obtained five successive images as shown in Figures 1(1) and 2. This scheme was iterated successively as shown in Figure 2. When Frames
#1–#5 did not meet our predetermined threshold for motion and optical
blurring (as shown in Figure 1(2), (3)), another five images (Frame #6–#10) were
used (Figure 1(4)).
Figure 1: Flowchart of the proposed method.
Figure 2: The alternating on/off scheme of the dual IR illuminators [16].
The size of the original captured
image was
pixels. To reduce processing time,
we used the eye region in a predetermined area of the input image. Because we
attached a cold mirror (to pass the IR light through and reflect the visible
light) in front of the camera lens and the eye-aligning region was indicated on
the mirror as shown in Figure 5, the user was able to align his or her eye with the
camera. So, the eye existed in the restricted region of any given captured
image. This kind of eye-aligning scheme has been adopted by conventional
iris recognition cameras such as the LG IrisAccess 3000 or the Panasonic
BM-ET300. By using the eye-aligning region in the cold mirror, we were
able to determine that eye regions existed in the area from (0,566) to
(2048,1046) in the input image. So, it was not necessary to process the whole input image,
and we were able to reduce processing time. For
this, the captured eye region images (2048 × 480 pixels, from (0,566) to
(2048,1046)) were 1/6 down-sampled and we checked the amount of motion
blurring in the input image as shown in Figure 1(2).
In general, the motion blur amount
(MBA) can be calculated from the difference
image between two illuminator-on images. If the calculated MBA was greater than the predetermined threshold (Th1 in Figure 1; we used 4), we determined that the input
image was too blurred to be recognized. After that, our system checked the
optical blurring amount (OBA) by
checking the focus values of the A2 and A4 images in Figure 2, as shown in Figure 1(3).
In general, focused images contain more high-frequency components than defocused
images [18]. We used the focus checking method proposed by Kang and Park [19]. The calculated focus value was
compared to the predetermined threshold. If the focus values of A2 and A4 were
below the threshold (Th2 in Figure 1; we used 70), we regarded the input image as defocused and captured five
other images as shown in Figure 1(4), as mentioned before.
Next, our system calculated the
environmental light amount (ELA) of the
illuminator-off image (the average gray level of A3 shown in Figure 2) to check
whether external sunlight was present in the input image,
as shown in Figure 1(5). As shown in
Figure 5, we attached a cold mirror
with an IR-Pass filter in front of the camera lens so that image brightness was
not affected by visible light. In indoor environments, the average gray level
of the illuminator-off image (A3) was very low (our experiments showed that it
was below 50 (Th3)).
However,
sunlight includes a large amount of IR light and in outdoor environments, the
average gray level of the illuminator-off image (A3) increases (more than 50 (Th3)).
The American Conference of
Governmental Industrial Hygienists (ACGIH) exposure limit for infrared radiation is
defined as follows. For exposures greater than 1,000
seconds, irradiance must be limited to less than 10 mW/cm2; for shorter exposures, the limit is given by (1) [20]:

$$\sum_{770\,\mathrm{nm}}^{3000\,\mathrm{nm}} E_{\lambda}\,\Delta\lambda \le \frac{1.8}{t^{3/4}}\ \mathrm{W/cm^2}, \quad (1)$$

where $\lambda$ represents the wavelength of incident light, $E_{\lambda}$ represents the irradiance onto the eye in watts/cm2, and $t$ represents the exposure time in
seconds. In our iris recognition system, the exposure time ($t$)
was a maximum of five seconds (time-out) for enrollment or recognition. We obtained
the maximum ACGIH exposure limit for infrared radiation as 540 mW/cm2 based on (1). As shown in Section 2.2, the
Z-distance between the illuminator
and the eye in our system was 250–400 mm. Experimental results showed that
the infrared radiation power (0.44 mW/cm2) of our system was much less
than the limit (540 mW/cm2), so it met the safety requirements.
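The safety check above can be reproduced with a few lines of arithmetic. The sketch below assumes the commonly quoted form of the ACGIH near-infrared limit, 1.8·t^(−3/4) W/cm2 for exposures up to 1,000 seconds and 10 mW/cm2 beyond that; the function name is hypothetical and is not part of the described system.

```python
def acgih_ir_limit_mw_per_cm2(exposure_s: float) -> float:
    """Return the assumed ACGIH near-IR irradiance limit in mW/cm^2."""
    if exposure_s <= 0:
        raise ValueError("exposure time must be positive")
    if exposure_s > 1000.0:
        # Long exposures: fixed limit of 10 mW/cm^2.
        return 10.0
    # Short exposures: 1.8 * t^(-3/4) W/cm^2, converted to mW/cm^2.
    return 1.8 * exposure_s ** -0.75 * 1000.0


# Values taken from the text: 5 s time-out, 0.44 mW/cm^2 measured irradiance.
limit = acgih_ir_limit_mw_per_cm2(5.0)
measured = 0.44
print(f"limit = {limit:.0f} mW/cm^2, measured = {measured} mW/cm^2, safe = {measured < limit}")
```

For the five-second time-out this evaluates to roughly 538 mW/cm2, which the text rounds to 540 mW/cm2.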
2.2. Detecting Corneal SRs by Using the Difference Image
After that, our system detected
the corneal specular reflections in the input image. For indoor environments (ELA < Th3 shown in Figure 1(6)),
corneal SR detection was performed using the difference image between A2 and A4
in Figures 1(6) and 3. In general, large numbers of noncorneal SRs (with
gray levels similar to those of genuine SRs on the cornea) occurred for users with
glasses, which made it difficult to detect genuine SRs on the cornea (inside the
pupil region, as shown in Figure 3). So, we used a difference image to detect the
corneal SRs easily. That is because the genuine corneal SRs had horizontal pair
characteristics in the difference image as shown in Figure 3(c) and, since the
curvature radius of the cornea was much smaller than that of glasses, their
interdistance in the image was much smaller than that of other noncorneal SRs
on the surface of glasses. However, in outdoor environments, SR detection was
performed using the difference image between ((A2−A3)/2+127) and
((A4−A3)/2+127), as shown in Figure 1(7).
Figure 3: The captured eye images for users with glasses. (a) Eye image with right illuminator on, (b) eye image with
left illuminator on, and (c) difference image between (a) and (b).
In outdoor environments, the
reason we used A2−A3 and A4−A3 was to get rid of the effect of sunlight. A3 was
illuminated only by sunlight. So, by obtaining the difference image between A2
and A3 (or A4 and A3), we were able to reduce the effect of sunlight. In
detail, in outdoor environments, sunlight increased the ELA. So, in addition to the
corneal SR, the brightness of other regions such as the sclera and facial skin
became so high (similar to that of the corneal SR) that
it was very difficult to discriminate those regions from the corneal SR only by
using the difference image of A2 and A4 as in Figure 1(6).
In this case, because the effect
of sunlight was included in both A2 and A4, we subtracted the brightness of A3
(which was captured with the camera illuminators off, so its brightness was determined
only by external sunlight) from A2 and A4, giving ((A2−A3)/2+127) and ((A4−A3)/2+127), and thereby
got rid of the effect of sunlight in A2 and A4. Consequently, the brightness
of other regions such as sclera or facial skin regions became much lower compared
to that of the corneal SR and we were easily able to discriminate the corneal
SR from other regions.
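The offset difference operation used above, (A − B)/2 + 127, can be sketched with integer arithmetic only. The function names and the tiny one-row frames below are illustrative assumptions, not the authors' implementation; halving truncates toward zero so that the result always stays within 0–254.

```python
def _half_toward_zero(d):
    """Integer halving that truncates toward zero (C-style), avoiding floats."""
    return d // 2 if d >= 0 else -((-d) // 2)


def offset_half_difference(image_a, image_b):
    """Per-pixel (a - b) / 2 + 127 for two equally sized 8-bit gray images.

    Images are lists of rows of integers in 0..255; the result lies in 0..254.
    """
    return [
        [_half_toward_zero(a - b) + 127 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


# Hypothetical one-row frames: pixel 0 holds the SR of the right illuminator (A2),
# pixel 1 holds the SR of the left illuminator (A4), pixel 2 is ordinary skin.
a2 = [[253, 3, 90]]
a3 = [[2, 2, 40]]      # both illuminators off
a4 = [[3, 253, 92]]

indoor = offset_half_difference(a2, a4)
print(indoor)  # [[252, 2, 126]]

# Outdoor case: each illuminator-on frame is first compensated with A3.
a2_comp = offset_half_difference(a2, a3)
a4_comp = offset_half_difference(a4, a3)
print(a2_comp, a4_comp)
```

In the indoor result, the two SRs appear as one very bright and one very dark pixel, while regions lit equally in both frames stay near the mid-gray offset of 127.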
Based on that, we used the
following three pieces of information to detect the genuine SRs inside the pupil
region. First, the corneal SR is small and its size can be estimated from the camera,
eye, and illuminator models (details are shown in Section 3). Second, genuine
corneal SRs have horizontal pair characteristics in the difference image that
are different from other noncorneal SRs on the surface of glasses because they
are made by left and right illuminators. Since we knew the curvature radius of
the cornea (7.8 mm) based on Gullstrand’s eye model [21], the distance (50 mm)
between the left and right illuminators, and the
Z-distance (250–400 mm, the operating range of our iris camera between the eye
and the camera), we were able to estimate the pixel distance (on the horizontal
axis)
between the left and right genuine SRs in the image based on the perspective
projection model [22].
In particular, because the curvature
radius of the cornea is much smaller than that of the surface of the glasses,
the distance between the corneal left and right SRs is shorter than that
between the noncorneal ones in the image, as shown in Figure 3. However, because
there was a time difference between the left and right SR images (as shown in
Figure 2, the time difference of A2 and A4 is 66 milliseconds) and there was also hand
vibration, there was also a vertical disparity of the left and right SR
positions. Experimental results showed a maximum of
pixels
in the image (which corresponds to the movement of 0.906 mm per 66 milliseconds as
measured by the Polhemus FASTRAK [23]) and we used it as the vertical margin of
the left and right SR positions.
Third, because genuine SRs occur in
the dark pupil region (whose gray level is below 5) and the gray level of the SR is
higher than 251 (see Section 4), the difference value (= (A2−A4)/2+127 in
indoor environments) of the genuine SR is higher than 250 or lower than 4.
Also, using a similar method, we estimated the difference value (=
((A2−A3)/2+127) − ((A4−A3)/2+127)) of the genuine corneal SRs in outdoor
environments. Based on that, we discriminated the genuine SR from the noncorneal
ones. From the difference image, we obtained the accurate center position of the
genuine SRs based on the edge image obtained by the
Prewitt operator, component labeling, and circular edge detection.
Based on the detected position of the genuine SR in the 1/6 down-sampled image,
pupil detection, iris detection, and iris recognition were performed in the original image
(details are shown in Sections 5 and 6).
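The pairing logic described above can be sketched as follows. This is a minimal illustration, assuming candidate blobs have already been extracted from the indoor difference image; the blob representation and the parameters `min_dx`, `max_dx`, and `max_dy` are hypothetical, and their values would come from the geometric model and the measured vertical margin rather than from this example.

```python
from typing import List, Tuple

Blob = Tuple[int, int, int]  # (x, y, difference value at the blob center)

BRIGHT_MIN = 250  # difference values above this: SR present in A2 only
DARK_MAX = 4      # difference values below this: SR present in A4 only


def find_sr_pairs(blobs: List[Blob], min_dx: int, max_dx: int, max_dy: int):
    """Return (bright, dark) blob positions that form a plausible corneal SR pair.

    A pair is accepted when its horizontal distance lies in [min_dx, max_dx]
    and its vertical disparity does not exceed max_dy (all in pixels).
    """
    bright = [(x, y) for x, y, v in blobs if v > BRIGHT_MIN]
    dark = [(x, y) for x, y, v in blobs if v < DARK_MAX]
    pairs = []
    for bx, by in bright:
        for kx, ky in dark:
            if min_dx <= abs(bx - kx) <= max_dx and abs(by - ky) <= max_dy:
                pairs.append(((bx, by), (kx, ky)))
    return pairs


# Hypothetical candidates: one close pair (cornea) and one distant pair (glasses),
# plus a blob whose difference value is not extreme enough to be an SR.
candidates = [(100, 40, 254), (104, 41, 0), (30, 38, 254), (70, 39, 0), (60, 60, 200)]
print(find_sr_pairs(candidates, min_dx=2, max_dx=8, max_dy=3))
# [((100, 40), (104, 41))]
```

Only the closely spaced bright/dark pair survives, which mirrors the argument that SRs on the glasses are separated by a larger distance than those on the cornea.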
3. Estimating the Size of Corneal Specular Reflections in Images
3.1. Size Estimation of SRs in Focused Images
In this section, we estimate the size of the genuine SRs on
the cornea based on eye, camera, and illuminator models as shown in Figure 4 [15].
Previous researchers [12] have used only heuristic experiments to determine and
threshold the size and pixel intensity values of the SRs in images. Also, in
this section, we discuss why the SRs are brighter than the reflection of the skin.
Figure 4: A corneal SR and the camera, illuminator, and eye model [15]. (a) The camera, illuminator, and eye model, (b) a corneal SR in the convex mirror.
Figure 5: Mobile phone used for iris recognition.
By using the Fresnel formula $\rho = (n_1 - n_2)/(n_1 + n_2)$, where $\rho$ is the reflection coefficient, $n_1$ is the refractive index of the air (=1), and $n_2$ is that of the cornea (=1.376) [21] (or facial skin (=1.3) [24]), we obtained the reflection coefficients of the cornea as about −0.158 (here, the reflectance rate is 2.5%) and the skin as about −0.13 (here, the reflectance rate is 1.69%). So, we found
that the SRs are brighter than the reflection of the skin.
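As a quick numerical check of these values, the sketch below assumes the normal-incidence Fresnel relation, in which the reflection coefficient is (n1 − n2)/(n1 + n2) and the reflectance is its square; the function names are illustrative.

```python
def reflection_coefficient(n1: float, n2: float) -> float:
    """Normal-incidence Fresnel amplitude reflection coefficient."""
    return (n1 - n2) / (n1 + n2)


def reflectance_percent(n1: float, n2: float) -> float:
    """Reflected intensity fraction (coefficient squared), in percent."""
    return 100.0 * reflection_coefficient(n1, n2) ** 2


N_AIR = 1.0
for name, n in (("cornea", 1.376), ("skin", 1.3)):
    rho = reflection_coefficient(N_AIR, n)
    print(f"{name}: coefficient = {rho:.3f}, reflectance = {reflectance_percent(N_AIR, n):.2f}%")
```

The unrounded skin reflectance comes out at about 1.70%; the 1.69% quoted in the text follows from squaring the rounded coefficient −0.13.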
We then tried to estimate the size of the SR in the image.
In general, the cornea is shaped like a convex mirror and it can be modeled as
shown in Figure 4 [25]. In Figure 4, C is the center of the eyeball. The line that
passes from the cornea’s surface through C is the principal axis. The cornea has
a focal point F, located on the principal axis. According to Gullstrand’s
eye model [21] and the fact that C and F are located on the opposite side from
the object, the radius of the cornea $R$
is −7.8 mm and the corneal focal length $f$
is −3.9 mm (because $f = R/2$ in the convex mirror). Based on that
information, we obtained the image position $b$
of the reflected illuminator by $1/f = 1/a + 1/b$. Here, $a$
represents the distance between the cornea surface and the
camera illuminator. Because our iris camera in the mobile phone had an
operating range of 25–40 cm, we defined $a$
as 250–400 mm. From that, $b$
was calculated as −3.84 to −3.86 mm and we used −3.85 mm as
the average value of $b$.
From that calculation, we obtained the image size of the reflected illuminator
($h'$, as shown in Figure 4) as
0.096–0.154 mm, because $h$
(the diameter of the camera
illuminator) was 10 mm. We then adopted the perspective model between the eye
and the camera and obtained the image size (X) of the SR in the camera, as
shown in Figure 4(a) ($(a + |b|) : h' = f_c\,p : X$, so X is
1.4–3.7 pixels in the image). Here, $f_c$
(the camera focal length) was 17.4 mm and $p$
(the CCD cell density) was 349 pixels/mm. $f_c$
and $p$
were
obtained by camera calibration [22]. Consequently, we determined the size (diameter)
of the SR as 1.4–3.7 pixels in the focused input image and used that value as
a threshold for size filtering when detecting the genuine SR on the cornea.
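The size estimate can be retraced numerically. The sketch below is an illustration under stated assumptions: it uses the standard mirror equation 1/f = 1/a + 1/b for the convex corneal surface, the usual lateral magnification |b|/a, and a pinhole projection in which the virtual image sits at distance a + |b| from the camera. The constant and function names are hypothetical.

```python
CORNEA_RADIUS_MM = -7.8                  # Gullstrand eye model, convex mirror
CORNEA_FOCAL_MM = CORNEA_RADIUS_MM / 2   # f = R / 2 = -3.9 mm
ILLUMINATOR_DIAMETER_MM = 10.0
CAMERA_FOCAL_MM = 17.4
PIXELS_PER_MM = 349.0


def virtual_image_position(object_distance_mm: float) -> float:
    """Solve 1/f = 1/a + 1/b for b (negative: virtual image behind the cornea)."""
    return 1.0 / (1.0 / CORNEA_FOCAL_MM - 1.0 / object_distance_mm)


def sr_diameter_pixels(object_distance_mm: float) -> float:
    """Project the reflected illuminator image into the camera, in pixels."""
    b = virtual_image_position(object_distance_mm)
    # Size of the illuminator's virtual image formed by the cornea.
    image_size_mm = ILLUMINATOR_DIAMETER_MM * abs(b) / object_distance_mm
    # Assumed pinhole projection from distance a + |b| onto the sensor.
    distance_to_image_mm = object_distance_mm + abs(b)
    return image_size_mm * CAMERA_FOCAL_MM * PIXELS_PER_MM / distance_to_image_mm


for a in (250.0, 400.0):
    print(f"a = {a:.0f} mm: b = {virtual_image_position(a):.2f} mm, "
          f"SR diameter = {sr_diameter_pixels(a):.2f} pixels")
```

Over the 250–400 mm operating range this lands close to the 1.4–3.7 pixel range quoted above; small differences arise because the text uses the averaged image position of −3.85 mm.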
However, because the user identifies his or her iris
while holding the mobile phone in the hand, image blurring occurs. This blurring caused by hand
vibration occurs frequently, and it increases the image size of the SR (through optical
and motion blurring). Therefore, we also need to consider the blurring
when determining the image size of the SR.
The benefit of estimating the size
of the corneal SR is as follows. Based on Figure 4, we were able to estimate the size
of the corneal SR theoretically, without capturing actual eye images that include the
corneal SR. Of course, the size of the corneal SR could also be estimated by
heuristic methods. But for that, we would have had to obtain many images and analyze
the size of the corneal SR intensively. In addition, most conventional iris
cameras include a
Z-distance measuring sensor with which the distance between the eye and the camera
in Figure 4 can be obtained automatically. In this way, the size
of the corneal SR can be estimated easily without requiring intensive and
heuristic analysis of many captured images. The obtained size information can
be used for size filtering in order to detect the corneal SR among many
noncorneal SRs.
In order to verify our theoretical
model, we used 400 face images captured from 100 persons (see Section 7). Among
them, we extracted the images which were identified by our iris recognition
algorithm (because the estimated SR size of 1.4–3.7 pixels applies to focused images).
Then, we measured the size of the SR manually and found that the measured size
of the SR was almost the same as that obtained theoretically.
Because the corneal SR was
generated on the cornea mirror surface as shown in Figure 4 and it was not
reflected on the surface of the glasses, the actual size of the SR did not
change regardless of whether glasses were worn. Of course, many noncorneal SRs occurred
on the surface of the glasses. To verify this, we analyzed the actual SR size
in the images with glasses among the 400 face images and we found that the size of
the SR did not change when glasses were worn.
3.2. Optical Blur Modeling of SRs
In general, optical blurring can be modeled as $G(u,v) = H(u,v)F(u,v) + N(u,v)$, where $G(u,v)$ represents the Fourier transform of the blurred iris image caused by defocusing, $H(u,v)$ represents that of the degradation function (2-D PSF), $F(u,v)$ represents that of the clear (focused) image, and $N(u,v)$ represents that of noise [22]. In general, $N(u,v)$ is much smaller than the other terms and can be excluded. Because the point spread function (PSF) ($H(u,v)$) of optical blurring can be represented by the Gaussian function [22], we used the Gaussian function for it.
To determine an accurate Gaussian model, we obtained the SR images at a distance of 25–40 cm (our operating range) in the experiment. We then selected the best focused SR image as $F(u,v)$ and the least focused one as $G(u,v)$. With those images, we determined the mask size and variance of the Gaussian function ($H(u,v)$) based on inverse filtering [22]. From that, we determined that the maximum size (diameter) of the SR was increased to 4.4–6.7 pixels in the blurred input image (1.4–3.7 pixels in the focused image). We used those values as a threshold for size filtering when detecting the genuine SR [15].
3.3. Motion Blur Modeling of SRs
In addition, we considered motion blurring of the SRs. In general, motion blurring
is related to the camera shutter time. The longer the shutter time,
the brighter the input image, but the more severe the degree of motion
blurring. In these cases, the SRs are represented by ellipses instead of circles.
To reduce motion blurring, we could have reduced the shutter time, but the
input image would then have been too dark to be used for iris recognition. We could also have used
a brighter illuminator, but this would have increased system costs.
For these reasons, we set our shutter time to 1/30 second (33 milliseconds).
To measure the amount of motion
blurring caused by a typical user, we used a 3D position tracker sensor (Polhemus
FASTRAK [23]). Experimental results showed that translations in the directions
of the X, Y, and Z axes were 0.453 mm per 33 milliseconds. From that information, and
based on the perspective model between the eye and the camera as shown in Figure
4(a), we estimated the ratio between the vertical and horizontal diameters of the
SR, the maximum length of the major SR axis, that of the minor axis, and the
maximum SR diameter in the input image. We used those values as the threshold
for shape filtering when detecting the genuine SR. Even if we used another kind
of iris camera, we would know the camera, illuminator, and eye model parameters used in Section
3.1 (as obtained by camera calibration or from the camera and illuminator specifications).
So, the above size and shape information of the SR can be obtained irrespective of
the kind of iris camera [15].
4. Estimating the Intensity of Corneal Specular Reflections in Images
The Phong model identifies two kinds of light (ambient
light and point light) [26]. However, because we used a cold mirror (IR pass
filter) in front of the camera as shown in Figure 5, we were able to exclude the
effect of ambient light when estimating the brightness of the SR. Although point
light has been reported to produce both diffuse elements and SRs, only SRs can
be considered in our modeling of corneal SRs, as shown in (2):

$$L = K_s\,(\mathbf{R}\cdot\mathbf{V})^{n}\,\frac{I_p}{d + d_0}, \quad (2)$$

where $L$ is the reflected brightness of the SR, $\mathbf{R}$ is the reflected direction of incident light, and $\mathbf{V}$ is the camera viewing direction. $K_s$ is the SR coefficient, as determined by the incident angle and the characteristics of the surface material. Here, the distance between the camera and the illuminator was much smaller than the distance between the camera and the eye as shown in Figure 4(a). Due to that, we supposed that the incident angle was about 0 degrees. Also, the angle between $\mathbf{R}$ and $\mathbf{V}$ was 0 degrees (so, $\mathbf{R}\cdot\mathbf{V} = 1$). From that, $K_s$ was represented only by the reflection coefficient $\rho$ of the cornea, about −0.158. This value was obtained in Section 3.1. $I_p$ represents the power of incident light (camera illuminator), measured as 620 lux. $d$ is the operating range (250–400 mm) and $d_0$ is the offset term (we used 5 mm) to ensure that the divisor did not become 0. $n$ represents the constant value, as determined by the characteristics of the surface. From that, we obtained $L$ (the SR reflected brightness on the cornea surface) as 0.242–0.384 lux/mm. From (2) and (3), we obtained the radiance $L'$ (0.0006–0.0015 lux/mm2) of the SR into the camera:

$$L' = \frac{L}{Z + d_0}, \quad (3)$$

where $Z$ is the distance between the camera and the eye, and $d_0$ is the offset term of 5 mm. We then obtained the image irradiance value $E$ of the SR [27]:

$$E = L'\,\frac{\pi}{4}\left(\frac{D}{f_c}\right)^{2}\cos^{4}\alpha, \quad (4)$$

where $f_c$ and $D$ represent the camera focal length (17.4 mm) and aperture of the lens (3.63 mm), respectively [28]. $\alpha$ is the angle between the optical axis and the ray from the center of the SR to the center of the lens. Because the distance between the optical axis and the SR is much smaller than the distance between the camera and the eye as shown in Figure 4(a), we supposed $\alpha$ was 0 degrees. From that, we obtained $E$ in lux/mm2. Finally, we obtained the image brightness of the corneal
SR
[27]:
(5)where
is the camera shutter time, and
is the auto gain control (AGC) factor.
In general,
can be assumed to be 1. In our camera
specifications,
is
33 milliseconds and
is
mm2/Lux
milliseconds. From those values, we
obtained the minimum intensity of the corneal SR in the image as 251 and used
it as the threshold value to detect the corneal SR. Even for another
kind of iris camera, the above camera and illuminator parameters can be obtained by
camera calibration or from the camera and illuminator specifications. Therefore, the
minimum intensity of the corneal SR in the image can be obtained irrespective of
the kind of iris camera hardware [15].
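The first two steps of this brightness estimate can be checked numerically. The sketch below assumes that the reflected brightness is the magnitude of the corneal reflection coefficient times the illuminator power, divided by the distance plus the 5 mm offset (with the specular term equal to 1 because the angle between the reflected and viewing directions is 0 degrees), and that the radiance reaching the camera is that brightness divided once more by the distance plus offset. These forms are inferred from the quoted ranges, and all names are illustrative.

```python
RHO_CORNEA = 0.158        # magnitude of the corneal reflection coefficient (assumed sign handling)
ILLUMINATOR_LUX = 620.0   # measured power of the camera illuminator
OFFSET_MM = 5.0           # offset term that keeps the divisor away from 0


def sr_reflected_brightness(distance_mm: float, cos_rv: float = 1.0, shininess: float = 1.0) -> float:
    """Specular brightness on the cornea surface, in lux/mm."""
    return RHO_CORNEA * (cos_rv ** shininess) * ILLUMINATOR_LUX / (distance_mm + OFFSET_MM)


def sr_radiance_at_camera(distance_mm: float) -> float:
    """Radiance of the SR arriving at the camera, in lux/mm^2."""
    return sr_reflected_brightness(distance_mm) / (distance_mm + OFFSET_MM)


for d in (250.0, 400.0):
    print(f"d = {d:.0f} mm: brightness = {sr_reflected_brightness(d):.3f} lux/mm, "
          f"radiance = {sr_radiance_at_camera(d):.4f} lux/mm^2")
# d = 250 mm: brightness = 0.384 lux/mm, radiance = 0.0015 lux/mm^2
# d = 400 mm: brightness = 0.242 lux/mm, radiance = 0.0006 lux/mm^2
```

Both ranges agree with the values quoted in the text (0.242–0.384 lux/mm and 0.0006–0.0015 lux/mm2), which supports the assumed form of the two relations.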
5. Pupil and Iris Detection and Verification with the AdaBoost Classifier
Based on the size, shape, and
brightness of the SR obtained from theoretical analysis in Sections 3 and 4, we
were able to detect the accurate SR position of the pupil in the difference
image by the method mentioned in Section 2.2. After that, before detecting the pupil
region based on the detected SR, we verified the detected eye region by using
the AdaBoost algorithm [9]. That is because when there are large SRs on the
surface of glasses caused by the left or right illuminators, the accurate SR
positions in the pupil may not be detected.
The original AdaBoost classifier
used a boosted cascade of simple classifiers with Haar-like features capable of
detecting faces in real time at both high detection rates and very low false
positive rates [13, 14]. In essence, the AdaBoost classifier represents a
sequential learning method based on a one-step greedy strategy. It is
reasonably expected that postglobal optimization processing will further improve
AdaBoost performance [13]. A cascade of classifiers is a degenerate decision tree where at
each stage a classifier is trained to detect almost all objects
while rejecting a certain percentage of background areas. Those image windows
not rejected by a stage classifier in the cascade sequence will be processed by
the succeeding stage classifiers [13]. The cascade architecture can dramatically
increase the speed of the detector by focusing attention on promising regions.
Each stage classifier was trained by the AdaBoost algorithm [13, 29]. The idea
of boosting refers to selecting a set of weak learners to form a strong
classifier [13].
We modified the original AdaBoost
classifier for verification of detected eye regions by using corneal SRs. For
training, we used 200 face images captured from 70 persons and in each image,
we selected the eye and noneye regions manually for classifier training. Because
we applied the AdaBoost classifier only to the detected eye candidate region by
using the SRs, it did not take much processing time (less than 0.5 milliseconds when using
a Pentium-IV PC (3.2 GHz)). Then, if the detected eye region was correctly
verified by the AdaBoost classifier, we defined the pupil candidate box as
160 × 160 pixels based on the detected SR position. Here, the box size was determined
by the human eye model. The pupil diameter typically varies from 2 mm
to 8 mm depending on the level of extraneous environmental light [30]. The
magnification factor of our camera was 19.3 pixels/mm. Consequently, we
estimated the pupil diameter to be from 39 to 154 pixels in the input image (2048 × 480 pixels).
The size of the pupil candidate box was therefore set to 160 × 160 pixels
(in order to cover the pupil at the maximum size).
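The box-size arithmetic can be written out as a short sketch. The pupil diameter range and the magnification factor come from the text; centering the box on the SR, scaling the SR position by 6 from the down-sampled image, and clamping the box to the 2048 × 480 eye region are assumptions made for illustration, and the function names are hypothetical.

```python
PIXELS_PER_MM = 19.3  # magnification factor of the camera


def pupil_diameter_range_px(min_mm: float = 2.0, max_mm: float = 8.0):
    """Pupil diameter range in pixels for a 2-8 mm pupil."""
    return round(min_mm * PIXELS_PER_MM), round(max_mm * PIXELS_PER_MM)


def pupil_candidate_box(sr_x_small, sr_y_small, scale=6, box=160, width=2048, height=480):
    """Return (left, top, right, bottom) of a box centered on the SR.

    The SR position is given in the 1/6 down-sampled image and mapped back to
    the full-resolution eye region; the box is clamped to stay inside it.
    """
    cx, cy = sr_x_small * scale, sr_y_small * scale
    left = min(max(cx - box // 2, 0), width - box)
    top = min(max(cy - box // 2, 0), height - box)
    return left, top, left + box, top + box


print(pupil_diameter_range_px())      # (39, 154)
print(pupil_candidate_box(100, 40))   # (520, 160, 680, 320)
```

Since the largest expected pupil is 154 pixels wide, a 160 × 160 box leaves only a few pixels of margin, so the SR must lie close to the pupil center for the box to cover the whole pupil.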
Then, in the pupil candidate box,
we applied circular edge detection to detect accurate pupil and iris regions
[1, 31]. To enhance processing speed, we used an integer-based circular edge
detection method, which excluded the floating-point operation [32].
6. Iris Recognition
To isolate iris regions from eye images, we performed pupil
and iris detection based on the circular edge detection method [31, 33]. For
iris (or pupil) detection, the integro-difference values between the inner and
outer boundaries of the iris (or pupil) were calculated in the input iris image
with the changing radius values and the different positions of the iris (or
pupil). The position and radius at which the calculated integro-difference value was
the maximum were taken as the detected iris (or pupil) position and
radius.
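The search described above can be sketched as a brute-force scan over candidate centers and radii. This is a simplified illustration, not the authors' implementation: it compares the summed gray levels on two adjacent circles, samples circle points with floating-point trigonometry for readability (the text describes an integer-only version), and uses a synthetic test image. All names are hypothetical.

```python
import math


def circle_points(cx, cy, r, n=64):
    """Sample n integer pixel positions on a circle of radius r."""
    return [(cx + round(r * math.cos(2 * math.pi * k / n)),
             cy + round(r * math.sin(2 * math.pi * k / n))) for k in range(n)]


def ring_sum(img, cx, cy, r, n=64):
    """Sum of gray levels on the circle, or None if it leaves the image."""
    h, w = len(img), len(img[0])
    total = 0
    for x, y in circle_points(cx, cy, r, n):
        if not (0 <= x < w and 0 <= y < h):
            return None
        total += img[y][x]
    return total


def detect_circle(img, centers, radii):
    """Return (score, cx, cy, r) maximizing outer-ring minus inner-ring sum."""
    best = None
    for cx, cy in centers:
        for r in radii:
            inner = ring_sum(img, cx, cy, r)
            outer = ring_sum(img, cx, cy, r + 1)
            if inner is None or outer is None:
                continue
            score = outer - inner  # large where a dark disk meets a bright surround
            if best is None or score > best[0]:
                best = (score, cx, cy, r)
    return best


def make_test_image(size=41, cx=20, cy=20, radius=8, dark=10, bright=200):
    """Synthetic 'pupil': a dark disk on a bright background."""
    return [[dark if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else bright
             for x in range(size)] for y in range(size)]


image = make_test_image()
print(detect_circle(image, [(20, 20)], range(4, 13)))
```

No threshold is needed: the boundary is simply the center and radius at which the ring difference peaks, which for this synthetic disk falls at the transition around radius 7 to 8.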
The upper and lower eyelids were also located by an eyelid
detection mask and the parabolic eyelid detection method [33–35]. Since the
eyelid line was regarded as a discontinuity area between the eyelid and iris regions,
we first detected the eyelid candidate points by using an eyelid detection mask
based on the first-order derivative. Because there were detection errors in the
located candidate points, the parabolic Hough transform was applied to detect
accurate positions of the eyelid line.
Then, we determined the eyelash candidate region based on
the detected iris and pupil area and located the eyelash region [33, 36]. The
image focus was measured by the focus checking mask. Then, with the measured
focus value of the input iris image, an eyelash-checking mask based on the first-order derivative was determined. If the image was defocused, a larger mask was
used, and vice versa. The eyelash points were detected where the calculated
value of the eyelash-checking mask was maximum and this was based on the
continuous characteristics of the eyelash.
In circular edge detection, we did not use any threshold. By
finding the position and radius with which the difference value was maximized,
we were able to detect the boundaries of the pupil and the iris.
For eyelid detection masking and parabolic eyelid detection,
we did not use any kind of threshold either. In the predetermined searching
area as determined by the localized iris and pupil positions, the masking value
of the eyelid detection mask was calculated vertically and the position with
which the masking value was maximized was determined as the eyelid candidate
position. Based on these candidate positions, we performed the parabolic Hough
transform, which had four control points: the curvature value of the parabola, the
X and Y positions of the parabola apex, and the rotational angle of the parabola.
In this case, because we detected one parabola with which the maximum value of
curve fitting was obtained, we did not use any threshold. In order to reduce
the processing time of the parabolic Hough transform, we restricted the
searching dimensions of four control points by considering the conventional
shape of the human eyelid.
For eyelash detection, because the eyelash points were
detected at the maximum position, we again did not use any kind of user-defined
threshold.
After that, the detected circular iris region was normalized
into rectangular polar coordinates [1, 37, 38]. In general, each iris image
has variations in terms of the length of the outer and inner boundaries. The
reason for these variations is that there are size variations between people’s
irises (the diameter of any iris can range from about 10.7–13 mm). Another
reason is that the captured image size of any given iris may change
according to the zooming factor caused by the
Z-distance between the camera and the eye. A third reason
is the dilation and contraction of the pupil (known as hippus movement).
In order to reduce these variations and obtain normalized
iris images, we adjusted the lengths of the inner and outer iris boundaries to
256 pixels by stretching and linear interpolation. In conventional iris
recognition, low- and mid-frequency components are mainly used for
authentication instead of high-frequency information [1, 37, 38]. Consequently,
linear interpolation did not degrade recognition accuracy. Experimental results
with the captured iris images (400 images from 100 classes) showed that the
accuracy of iris recognition when using linear interpolation was the same as
when using bicubic interpolation and B-spline interpolation. So, we used
linear interpolation to reduce processing time and system complexity.
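The normalization step can be sketched as below: for each of 256 angles, samples are taken along the line joining the pupil boundary to the iris boundary, and gray levels at fractional positions are obtained by linear interpolation. The function names and the number of radial samples (`n_radii`) are assumptions for illustration; the text above fixes only the angular length of 256 pixels.

```python
import math

def bilinear(img, x, y):
    """Linearly interpolate the gray level at a fractional (x, y) position."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the image area
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def unwrap_iris(img, pupil, iris, n_angles=256, n_radii=8):
    """Map the ring between the pupil and iris circles to an n_radii x n_angles grid.

    pupil and iris are (center_x, center_y, radius) tuples.
    """
    (px, py, pr), (ix, iy, ir) = pupil, iris
    out = []
    for j in range(n_radii):
        frac = (j + 0.5) / n_radii  # 0 at the pupil boundary, 1 at the iris boundary
        row = []
        for k in range(n_angles):
            t = 2.0 * math.pi * k / n_angles
            xs, ys = px + pr * math.cos(t), py + pr * math.sin(t)  # inner boundary point
            xe, ye = ix + ir * math.cos(t), iy + ir * math.sin(t)  # outer boundary point
            row.append(bilinear(img, xs + frac * (xe - xs), ys + frac * (ye - ys)))
        out.append(row)
    return out
```

Sampling between the two boundaries at fixed fractions makes the output size independent of the iris diameter, the camera distance, and the pupil dilation, which are the three sources of variation listed above.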
Then,
the normalized iris image was divided into 8 tracks and 256 sectors [1, 37, 38]. In each track and sector, the weighted
mean of the gray level based on a 1D Gaussian kernel was calculated vertically
[1, 37, 38]. By using
the weighted mean of the gray level, we were able to reduce the effect caused
by the iris segmentation error and obtain a 1D iris signal according to each
track. We obtained eight 1D iris signals (256 pixels wide, resp., based
on 256 sectors) from eight tracks. Consequently, we obtained a normalized iris
region of 256 × 8 pixels, from 256 sectors and 8 tracks. Then, long and
short Gabor filters were applied to generate the iris phase codes as shown in (6) [33]:
$$G(x) = A\,e^{-\pi x^{2}/\sigma^{2}} \cos\left(2\pi u_0 x\right), \quad (6)$$
where $A$ is the amplitude of the Gabor filter, and $\sigma$ and $u_0$
are the kernel size and the frequency of the Gabor filter, respectively [33].
Here, the long Gabor filter had a long kernel and was
designed with a low frequency value, so it passed the low-frequency component of
the iris textures. The short Gabor filter, in contrast, was designed with a short
kernel and a mid-frequency value, so it passed the mid-frequency component.
The optimal parameters of each Gabor filter were determined
to obtain the minimum equal error rate (EER) by testing with test iris images. The
EER is the error rate when the false acceptance rate (FAR) is the same as that
of the false rejection rate (FRR). The FAR is the error rate of accepting
imposter users as genuine ones. The FRR is the error rate of rejecting genuine users
as imposters [33].
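For illustration, the EER can be estimated from two lists of matching distances by sweeping the decision threshold and taking the point where the FAR and the FRR are closest. The sketch below assumes distance scores in [0, 1] for which a smaller value indicates a better match; the function names and the number of threshold steps are hypothetical.

```python
def far_frr(genuine, imposter, threshold):
    """FAR and FRR when a distance below the threshold is accepted."""
    far = sum(d < threshold for d in imposter) / len(imposter)  # imposters accepted
    frr = sum(d >= threshold for d in genuine) / len(genuine)   # genuine users rejected
    return far, frr

def equal_error_rate(genuine, imposter, steps=1000):
    """Sweep thresholds in [0, 1]; return (eer, threshold) where |FAR - FRR| is smallest."""
    best = None
    for i in range(steps + 1):
        t = i / steps
        far, frr = far_frr(genuine, imposter, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]
```

With finite test sets the two error curves rarely cross exactly, so the sketch reports the average of the FAR and the FRR at the closest point.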
For the long Gabor filter, the filter size was 25 pixels and the frequency
parameter of (6) was 1/20. For the short Gabor filter, the filter size was
15 pixels and the frequency parameter of (6) was 1/16. The sign of the calculated
Gabor filtering value was then checked. If the value was positive (including 0),
it was quantized to 1. If it was negative, it was quantized to 0 [1, 37, 38].
This was called iris code quantization,
and we used the iris phase information from that. The Gabor filter was applied
on every track and sector, and we obtained an iris code of 2,048 bits (= 256
sectors × 8 tracks), each of which was either 1 or 0. Consequently, 2,048 bits were obtained
from long Gabor filtering and another 2,048 bits were obtained from short Gabor
filtering [33].
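A minimal sketch of the code generation step is given below, using the filter sizes and frequencies stated above. Several details are assumptions rather than the method of [33]: the kernel is taken as a Gaussian envelope multiplied by a cosine carrier, its mean is subtracted so that a constant gray level gives a zero response, and each track is filtered circularly because the sectors wrap around the iris. All names are hypothetical.

```python
import math

def gabor_kernel(size, freq, amplitude=1.0):
    """Real 1D Gabor kernel (Gaussian envelope times cosine), made zero-mean."""
    half = size // 2
    k = [amplitude * math.exp(-math.pi * (x / size) ** 2) * math.cos(2 * math.pi * freq * x)
         for x in range(-half, half + 1)]
    mean = sum(k) / len(k)
    return [v - mean for v in k]  # assumption: remove the DC response

def iris_code(tracks, kernel):
    """Filter each circular 1D track and keep one sign bit per sector."""
    half = len(kernel) // 2
    bits = []
    for signal in tracks:
        n = len(signal)
        for i in range(n):
            resp = sum(kernel[j] * signal[(i + j - half) % n] for j in range(len(kernel)))
            bits.append(1 if resp >= 0 else 0)  # positive (including 0) -> 1, negative -> 0
    return bits

LONG_KERNEL = gabor_kernel(25, 1 / 20)   # low-frequency filter
SHORT_KERNEL = gabor_kernel(15, 1 / 16)  # mid-frequency filter
```

Applied to eight tracks of 256 samples each, `iris_code` returns 2,048 bits per filter, matching the code length given above.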
In this case, the iris code bits which were extracted from the
eyelid, eyelash, and SR occluded areas were regarded as unreliable and were not
used for code matching [33]. After pupil, iris, eyelid, and eyelash detection,
the noise regions were depicted as unreliable pixels (255). With Gabor
filtering, even if one unreliable pixel was included in the range, the extracted
bit on that position was regarded as an unreliable code bit. Only when the
number of reliable codes exceeded the predetermined threshold (we used 1000 as
the threshold to obtain the highest iris authentication accuracy with the iris
database) could they be used as an enrolled template with high confidence [33].
The extracted iris code bits of the recognition image were
compared with the enrolled template based on the Hamming distance (HD) [1, 37, 38]. The HD was calculated based on the exclusive-OR (XOR) operation between
two code bits. If the two bits were the same, the XOR value was 0. If they were
different, the value was 1. Consequently, the XOR values between two iris codes
from the same genuine user were highly likely to be 0. All the
reliable code bits of the recognition image were therefore compared with those of the enrolled
one based on the HD. If the calculated HD was below the threshold (we used 0.3),
the user was accepted as genuine. If not, he or she was rejected as an imposter.
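The matching step can be sketched as a masked Hamming distance, in which only bit positions that are reliable in both codes contribute. The sketch assumes the usual convention that a smaller HD indicates a better match, and it applies the reliable-bit count of 1000 to the jointly reliable positions, which is an assumption; the function names are hypothetical.

```python
def hamming_distance(code_a, mask_a, code_b, mask_b, min_reliable=1000):
    """Fraction of differing bits over positions that are reliable in both codes.

    A mask value of 1 marks a reliable bit; 0 marks a bit taken from an
    eyelid, eyelash, or SR occluded area. Returns None if too few bits remain.
    """
    diff = valid = 0
    for a, ma, b, mb in zip(code_a, mask_a, code_b, mask_b):
        if ma and mb:
            valid += 1
            diff += a ^ b  # XOR is 1 only when the two bits differ
    if valid < min_reliable:
        return None
    return diff / valid

def is_genuine(hd, threshold=0.3):
    """Accept the user when a valid HD falls below the decision threshold."""
    return hd is not None and hd < threshold
```

Normalizing by the number of jointly reliable bits keeps the HD comparable between image pairs with different amounts of occlusion.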
7. Experimental Results
Figure 5 shows the mobile phone that we used. It was a
Samsung SPH-S2300 with a 2048 × 1536 pixel CCD sensor and a 3X optical zoom. To
capture detailed iris patterns, we used IR-illuminators and an IR pass filter
[19]. In front of the camera lens, as shown in Figure 5, we attached a cold mirror
(with an IR pass filter), which allowed IR light to pass through and reflected
visible light. Also, we attached dual IR-LED illuminators to detect genuine SRs
easily (as mentioned in Section 2.2).
In the first test, we measured the accuracy (hit ratio) of
our algorithm. Tests were performed on 400 face images captured from 100
persons (70 Asians, 30 Caucasians). These face images were not used for
AdaBoost training. The test images consisted of the following four categories:
images with glasses or contact lenses in indoor environments (100 images);
images without glasses or contact lenses in indoor environments (100 images);
images with glasses or contact lenses in outdoor environments (100 images); and
images without glasses or contact lenses in outdoor environments (100 images).
The illuminance was 223 lux indoors and 1,394 lux outdoors.
Experimental results showed that the pupil detection rate
was 99.5% (for images without glasses or contact lenses in indoor and outdoor
environments) and 99% (for images with glasses or contact lenses in indoor and
outdoor environments). The iris detection rate was 99.5% (for images without
glasses or contact lenses in indoor and outdoor environments) and 98.9% (for
images with glasses or contact lenses in indoor and outdoor environments). The
detection rate was not degraded, regardless of the conditions, owing to the illuminator
mechanism mentioned in Section 2.2. Though performance was slightly lower
for users with glasses, contact lenses did not affect performance.
When we measured performance using
only the AdaBoost algorithm, the detection rate was almost 98%. But there were
also many false alarms (e.g., when noneye regions such as eyebrows or
frames of glasses were detected as correct eye regions). Experimental results
with 400 face images showed that the false alarm rate using only the AdaBoost
eye detector was almost 53%. So, to solve these problems, we used both the information
of the corneal SR and the AdaBoost eye detector. These results showed that the
correct eye detection rate was more than 99% (as mentioned above) and the false
alarm rate was less than 0.2%.
Also, we measured the accuracies of the
detected pupil (iris) center and radius by the pixel RMS error
between the detected and the manually picked values. The RMS error of the
detected pupil center was about 2.24 pixels (1 pixel on the X axis and 2 pixels
on the Y axis). The RMS error of the pupil radius was about 1.9
pixels. The RMS error of the detected iris center
was about 2.83 pixels (2 pixels on the X axis and 2 pixels on the Y axis).
The RMS error of the iris radius was about 2.47 pixels. All the
above localization accuracy figures were determined by manually assessing each
image.
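The reported center errors are consistent with combining the two per-axis errors in quadrature, assuming that this is how the overall figures were obtained:

$$\sqrt{1^{2} + 2^{2}} \approx 2.24 \text{ pixels (pupil center)}, \qquad \sqrt{2^{2} + 2^{2}} \approx 2.83 \text{ pixels (iris center)}.$$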
In the second test, we checked the correct detection rate
of the pupil and the iris according to the size of the pupil detection box (as
mentioned in Section 5) and as shown in Tables 1–4.
Table 1: Correct pupil detection rate for images without glasses (unit: %).
Table 2: Correct pupil detection rate for images with glasses or contact lenses (unit: %).
Table 3: Correct iris detection rate for images without glasses (unit: %).
Table 4: Correct iris detection rate for images with glasses or contact lenses (unit: %).
In the next experiments, we
measured recognition accuracy with the captured iris images; detailed
explanations of the recognition algorithm are presented in Section 6. Results
showed that the EER was 0.05% when using 400 images (from 100 classes), which
meant that the captured iris images could be used for iris recognition. Figure 6
and Table 5 show examples of the captured iris images and the FRR according to
the FAR. In this case, the FAR refers to the error rate of accepting an imposter user as a
genuine one, and the FRR refers to the error rate of rejecting a genuine user as
an imposter. Here, an imposter means a user who did not enroll a biometric template
in the database [33].
Table 5: FRR according to FAR (unit: %).
Figure 6: Examples of captured iris images.
Then, we applied our iris recognition algorithm (as mentioned
in Section 6) to the CASIA database version 1 [39] (using 756 iris images from
108 classes), the CASIA database version 3 [39] (a total of 22,051 iris images
from more than 700 subjects), the iris images captured by our handmade iris
camera based on the Quickcam Pro-4000 CCD camera [40] (using 900 iris images
from 50 classes [33]) and those by the AlphaCam-I CMOS camera [41] (using 450
iris images from 25 classes [33]). Results showed that the iris authentication
accuracies (EER) of the CASIA version 1, the CASIA
version 3, the iris images captured with the CCD camera, and the iris images
captured with the CMOS camera were 0.072%, 0.074%, 0.063%, and 0.065%, respectively. From
these results, the authentication accuracy with the iris images
captured by the mobile phone (EER of 0.05%) was superior, and the iris images captured on
the mobile phone were of sufficient quality to be used for iris authentication.
Figure 7 shows the ROC curves for all datasets: the iris images obtained by
our mobile phone camera, the CASIA version 1, the CASIA
version 3, those by the Quickcam Pro-4000 CCD camera, and those by the AlphaCam-I
CMOS camera.
Figure 7: ROC curves for all datasets.
In order to evaluate the robustness of
our method to noise and show the degradation in the recognition accuracy as the
amount of noise in the captured iris images increased, we increased the Gaussian
noise in the iris images captured by our mobile phone camera. To measure the
amount of inserted noise, we used the signal-to-noise ratio (SNR =
10 log10(Ps/Pn), in dB), where Ps represents the variance of the original
image and Pn represents that of the noise image.
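The noise insertion can be sketched as follows, assuming the decibel form of the SNR, 10 log10(Ps/Pn), with Ps and Pn taken as variances. Given a target SNR, the noise variance is derived from the image variance and zero-mean Gaussian noise of that variance is added to each pixel. The function names and the flat pixel list are assumptions for illustration, and the result is not clipped to the 8-bit range because clipping would change the measured SNR.

```python
import math
import random

def variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def add_gaussian_noise(pixels, target_db, seed=0):
    """Add zero-mean Gaussian noise so that 10*log10(Ps/Pn) is about target_db."""
    rng = random.Random(seed)
    ps = variance(pixels)                 # signal power (image variance)
    pn = ps / (10 ** (target_db / 10.0))  # noise power required for the target SNR
    sigma = math.sqrt(pn)
    return [p + rng.gauss(0.0, sigma) for p in pixels]

def measure_snr_db(original, noisy):
    """SNR in dB between an image and its noisy version."""
    noise = [n - o for o, n in zip(original, noisy)]
    return 10.0 * math.log10(variance(original) / variance(noise))
```

A lower target value corresponds to stronger noise; at 0 dB the noise variance equals the image variance.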
Results showed that if the SNR exceeded
10 dB, there was no iris segmentation or recognition error. If the SNR was
between 5 and 10 dB, the RMS error of the detected pupil and iris increased by
4.8% relative to the original RMS error. However, even in that case, the
recognition error did not increase. If the SNR was between 0 and 5 dB, the RMS
error of the detected pupil and iris increased by 6.2% relative to the original
RMS error. However, again, the recognition error did not increase.
That is because in conventional iris
recognition, the low- and mid-frequency components of iris texture are mainly
used for authentication instead of high-frequency information, as mentioned before
[1, 33, 37, 38]. Based on that, both long and short Gabor filters were
applied to generate iris phase codes [33]. The long Gabor filter had a long
kernel and was designed with a low frequency value, so it passed the low-frequency
component of the iris textures, whereas the short Gabor filter, designed with a short
kernel and a mid-frequency value, passed the mid-frequency
component.
In the next test, we measured processing times
on a mobile phone, a desktop PC, and a PDA. The mobile phone (SPH-S2300) used a Qualcomm MSM6100 chip (ARM926EJ-S CPU (150 MHz),
4 MB memory) [28, 42]. To port our algorithm to the mobile
phone, we used the wireless internet platform for interoperability (WIPI) 1.1 [43] without an additional DSP chip. For the PDA,
we used an HP iPAQ hx4700 (with an Intel PXA270 CPU (624 MHz), 135 MB memory, and a
Pocket PC 2003 (WinCE 4.2) OS). The desktop PC had a Pentium-IV CPU (3.2 GHz), with
1 GB memory and a Windows XP OS.
Experimental results showed that
the total processing times for iris detection and recognition on the desktop
PC, PDA, and mobile phone were 29.32, 107.7, and 524.93 milliseconds, respectively. In previous
research, the face detection algorithm proposed by Viola and Jones [44] was
also tested on mobile phones such as the Nokia 7650 (with a CPU clock of 104 MHz)
and the Sony-Ericsson P900 (with a CPU clock of 156 MHz) with an input image of
344 × 288 pixels. Results showed that the processing time on these mobile phones was
210 milliseconds and 160 milliseconds, respectively. Though these methods showed faster processing
speed, they only included a face detection procedure and did not address recognition.
In addition, they used an additional DSP chip, which increased their total costs.
8. Conclusions
In this paper, we have proposed a
real-time pupil and iris detection method appropriate for mobile phones. This
research has presented the following three advantages over previous works.
First, for users with glasses, there may be many noncorneal SRs on the surface
of the glasses and it is very difficult to detect genuine SRs on the cornea. To
overcome these problems, we proposed a successive on/off scheme for the dual
illuminators. Second, to detect SRs robustly, we proposed a theoretical way of
estimating the size, shape, and brightness of SRs based on eye, camera, and illuminator
models. Third, the eye (iris) regions detected by using the SRs were verified
again by using the AdaBoost eye detector.
Results with 400 face images
captured from 100 persons showed that the rate of correct iris detection was 99.5%
(for images without glasses) and 98.9% (for images with glasses or contact lenses).
The consequent accuracy of iris authentication with 400 images from 100 classes was
an equal error rate (EER) of 0.05% based on the detected iris images.
In future work, more field tests will
be required. Also, to reduce processing time in mobile phones, we plan to port
our algorithm into the ARM CPU of mobile phones. In addition, we plan to
restore optically blurred and motion-blurred iris images and use them for recognition instead of
rejecting them and recapturing new images. This may reduce total processing time and
enhance recognition accuracy.
Acknowledgments
This work was
supported by the Korea Science and Engineering Foundation (KOSEF) through the
Biometrics Engineering Research Center (BERC) at Yonsei University.
References
- J. G. Daugman, “High confidence visual recognition of persons by a test of statistical independence,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1148–1161, 1993.
- American National Standards Institute Inc., “Initial Draft for Iris Image Format Revision (“Iris Image Interchange Format”),” February 2007.
- A. L. Yuille, D. S. Cohen, and P. W. Hallinan, “Feature extraction from faces using deformable templates,” in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '89), pp. 104–109, Rosemont, Ill, USA, June 1989.
- K.-M. Lam and H. Yan, “Locating and extracting the eye in human face images,” Pattern Recognition, vol. 29, no. 5, pp. 771–779, 1996.
- F. Zuo and P. H. N. de With, “Real-time face detection and feature localization for consumer applications,” in Proceedings of the 4th PROGRESS Embedded Systems Symposium, pp. 257–262, Utrecht, The Netherlands, October 2003.
- J. Rurainsky and P. Eisert, “Template-based eye and mouth detection for 3D video conferencing,” in Visual Content Processing and Representation, vol. 2849 of Lecture Notes in Computer Science, pp. 23–31, Springer, Berlin, Germany, 2003.
- G. C. Feng and P. C. Yuen, “Multi-cues eye detection on gray intensity image,” Pattern Recognition, vol. 34, no. 5, pp. 1033–1046, 2001.
- H. A. Rowley, S. Baluja, and T. Kanade, “Neural network-based face detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23–38, 1998.
- P. Viola and M. J. Jones, “Robust real-time face detection,” International Journal of Computer Vision, vol. 57, no. 2, pp. 137–154, 2004.
- Z. Zhu and Q. Ji, “Robust real-time eye detection and tracking under variable lighting conditions and various face orientations,” Computer Vision and Image Understanding, vol. 98, no. 1, pp. 124–154, 2005.
- Y. Ebisawa and S.-I. Satoh, “Effectiveness of pupil area detection technique using two light sources and image difference method,” in Proceedings of the 15th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 1268–1269, San Diego, Calif, USA, October 1993.
- M. Suzaki, “Racehorse identification system using iris recognition,” IEICE Transactions on Information and Systems, vol. J84-D2, no. 6, pp. 1061–1072, 2001.
- Z. Ou, X. Tang, T. Su, and P. Zhao, “Cascade AdaBoost classifiers with stage optimization for face detection,” in Proceedings of International Conference on Biometrics (ICB '06), vol. 3832 of Lecture Notes in Computer Science, pp. 121–128, Hong Kong, January 2006.
- Y. Freund and R. Schapire, “A short introduction to boosting,” Journal of Japanese Society for Artificial Intelligence, vol. 14, no. 5, pp. 771–780, 1999.
- H. A. Park and K. R. Park, “A study on fast iris detection for iris recognition in mobile phone,” Journal of the Institute of Electronics Engineers of Korea, vol. 43, no. 2, pp. 19–29, 2006.
- S. Han, H. A. Park, D. H. Cho, K. R. Park, and S. Y. Lee, “Face recognition based on near-infrared light using mobile phone,” in Proceedings of the 8th International Conference on Adaptive and Natural Computing Algorithms (ICANNGA '07), Lecture Notes in Computer Science, pp. 11–14, Warsaw, Poland, April 2007.
- S. Rakshit and D. M. Monro, “Iris image selection and localization based on analysis of specular reflection,” in Proceedings of IEEE Workshop on Signal Processing Applications for Public Security and Forensics (SAFE '07), Washington, DC, USA, April 2007.
- K. Choi, J.-S. Lee, and S.-J. Ko, “New autofocusing technique using the frequency selective weighted median filter for video cameras,” IEEE Transactions on Consumer Electronics, vol. 45, no. 3, pp. 820–827, 1999.
- B. J. Kang and K. R. Park, “A study on iris image restoration,” in Proceedings of the 5th International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA '05), vol. 3546 of Lecture Notes in Computer Science, pp. 31–40, Hilton Rye Town, NY, USA, July 2005.
- American Conference of Government Industrial Hygienists, “Eye Safety with Near Infra-Red Illuminators,” 1981.
- A. Gullstrand, “The optical system of the eye,” in Physiological Optics, H. von Helmholtz, Ed., 3rd edition, 1909.
- R. C. Gonzalez, Digital Image Processing, Prentice-Hall, Englewood Cliffs, NJ, USA, 1992.
- http://www.polhemus.com/?page=Motion_Fastrak.
- P. Sandoz, D. Marsaut, V. Armbruster, P. Humbert, and T. Ghabi, “Towards objective evaluation of the skin aspect: principles and instrumentation,” Skin Research and Technology, vol. 10, no. 4, pp. 263–270, 2004.
- E. C. Lee, K. R. Park, and J. Kim, “Fake iris detection by using purkinje image,” in Proceedings of International Conference on Biometrics (ICB '06), vol. 3832 of Lecture Notes in Computer Science, pp. 397–403, Hong Kong, January 2006.
- R. L. Cook and K. E. Torrance, “Reflectance model for computer graphics,” in Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '81), pp. 307–316, Dallas, Tex, USA, August 1981.
- K. Shafique and M. Shah, “Estimation of the radiometric response functions of a color camera from differently illuminated images,” in Proceedings of International Conference on Image Processing (ICIP '04), vol. 4, pp. 2339–2342, Singapore, October 2004.
- http://downloadcenter.samsung.com/content/UM/200411/20041126110547406_SPH-S2300_Rev3.0.pdf.
- Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119–139, 1997.
- S.-W. Shih and J. Liu, “A novel approach to 3-D gaze tracking using stereo cameras,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 1, pp. 234–245, 2004.
- D.-H. Cho, K. R. Park, D. W. Rhee, Y. Kim, and J. Yang, “Pupil and iris localization for iris recognition in mobile phones,” in Proceedings of the 7th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, Including the 2nd ACIS International Workshop on Self-Assembling Wireless Networks (SNPD/SAWN '06), vol. 2006, pp. 197–201, Las Vegas, Nev, USA, June 2006.
- D. H. Cho, K. R. Park, and D. W. Rhee, “Real-time iris localization for iris recognition in cellular phone,” in Proceedings of the 6th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing and the 1st ACIS International Workshop on Self-Assembling Wireless Networks (SNPD/SAWN '05), vol. 2005, pp. 254–259, Towson, Md, USA, May 2005.
- H.-A. Park and K. R. Park, “Iris recognition based on score level fusion by using SVM,” Pattern Recognition Letters, vol. 28, no. 15, pp. 2019–2028, 2007.
- Y. K. Jang, et al., “Robust eyelid detection for iris recognition,” Journal of the Institute of Electronics Engineers of Korea, vol. 44, no. 1, pp. 94–104, 2007.
- Y. K. Jang, et al., “A study on eyelid localization considering image focus for iris recognition,” submitted to Pattern Recognition Letters.
- B. J. Kang and K. R. Park, “A robust eyelash detection based on iris focus assessment,” Pattern Recognition Letters, vol. 28, no. 13, pp. 1630–1639, 2007.
- J. Daugman, “How iris recognition works,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 21–30, 2004.
- J. G. Daugman, “Demodulation by complex-valued wavelets for stochastic pattern recognition,” International Journal of Wavelets, Multi-Resolution and Information Processing, vol. 1, no. 1, pp. 1–17, 2003.
- http://www.cbsr.ia.ac.cn/IrisDatabase.htm.
- http://www.logitech.com/index.cfm/webcam_communications/webcams/&cl=us,en.
- http://www.avtech.co.kr/html/camera_etc.html.
- http://www.arm.com/products/CPUs/ARM926EJ-S.html.
- http://www.wipi.or.kr.
- http://www.idiap.ch/pages/contenuTxt/Demos/demo29/face_finderfake.html.