
Master Forensic Science

Automatic iris recognition on uncontrolled images

Sander Hansen

10995080

December 30, 2020

Supervisor: dr. A.C.C. Ruifrok
Examiner: Prof. Z.J.M.H. Geradts

Forensic Science
University of Amsterdam


Abstract

In this research, we compare the literature on four different iris recognition algorithms designed to recognise iris patterns in uncontrolled images taken in natural light, in order to answer the question: what is the performance of different iris recognition algorithms on uncontrolled iris images? The algorithms compared are designed to include all three steps of the iris recognition process: image pre-processing, feature extraction and template matching. The four algorithms are tested on two different databases of uncontrolled and noisy iris images taken in natural light. We find, based on the literature, that while some algorithms perform better than others, the overall performance of automatic uncontrolled iris recognition is not good enough to be used as a reliable form of recognition.


Contents

1 Introduction 7
2 Iris recognition 9
2.1 Basics . . . 9
2.1.1 Image pre-processing . . . 9
2.1.2 Feature extraction . . . 9
2.1.3 Template matching . . . 10
2.1.4 Comparison techniques . . . 10
2.2 Comparison . . . 11
2.2.1 Datasets . . . 12

2.2.2 UBIRIS.v2 and NICE-II . . . 12

2.2.3 MICHE-I . . . 13

2.2.4 Overview . . . 14

3 Algorithms 15
3.1 Accurate Iris Recognition at a Distance Using Stabilized Iris Encoding and Zernike Moments Phase Features . . . 15

3.1.1 Image pre-processing . . . 15

3.1.2 Feature extraction . . . 16

3.1.3 Template matching . . . 17

3.2 Multi-patch deep sparse histograms for iris recognition in visible spectrum using collaborative subspace for robust verification . . . 17

3.2.1 Image pre-processing . . . 17

3.2.2 Feature extraction . . . 17

3.2.3 Template matching . . . 18

3.3 The Impact of Preprocessing on Deep Representations for Iris Recognition on Unconstrained Environments . . . 19

3.3.1 Image pre-processing . . . 19

3.3.2 Feature extraction . . . 20

3.3.3 Template matching . . . 20

3.4 Conditional Generative Adversarial Network Based Data Augmentation for Enhancement of Iris Recognition Accuracy . . . 21

3.4.1 Image pre-processing . . . 21

3.4.2 Feature extraction . . . 22

3.4.3 Template matching . . . 22

4 Results 25
4.1 Accurate Iris Recognition at a Distance Using Stabilized Iris Encoding and Zernike Moments Phase Features . . . 25

4.2 Multi-patch deep sparse histograms for iris recognition in visible spectrum using collaborative subspace for robust verification . . . 26

4.3 The Impact of Preprocessing on Deep Representations for Iris Recognition on Unconstrained Environments . . . 29


4.4 Conditional Generative Adversarial Network based Data Augmentation for Enhancement of Iris Recognition Accuracy . . . 30

4.5 Comparison . . . 32

5 Discussion and Conclusion 33
5.1 Critical discussion . . . 33
5.1.1 Problems . . . 33
5.1.2 Personal opinion . . . 34
5.2 Conclusion . . . 34
Appendices 41
A Search strategy 43


CHAPTER 1

Introduction

The iris is seen as a unique and stable biometric feature [44]. For this reason, it can be of great use for the identification of people, especially if this is done in an automated manner. As early as 2001, Schiphol airport launched the Privium program [33], which allowed frequent flyers to go through boarding with only an iris scan. Since then several large projects have been launched, with, at this moment, two of the biggest examples being the Aadhaar project in India and the border-crossing control system in the United Arab Emirates. As of November 2020, the Aadhaar project has more than 1.25 billion registered users [42], of whom an iris image of both eyes has been taken; together with other biometric data, it is linked to a unique identification number for the Indian government [17]. In the United Arab Emirates border-crossing control system, 7.2 billion iris comparisons are conducted daily for 6000 persons [6].

Both applications use images which are taken in a controlled environment with near infra-red light. Used in this way, iris recognition is a reliable source of biometric identification. Research by Daugman has shown that, using images from the United Arab Emirates border-crossing control system, a false-match rate in a cross-comparison experiment has been observed of lower than 1 in 200 billion (2 × 10^11) and theoretically even as low as 1 in 5 quadrillion (5 × 10^15) [5].

Iris recognition could be of great use as forensic evidence, for example in matching deceased and missing persons. While it is possible to retrieve usable iris images up to 500 hours post-mortem under the most ideal circumstances [41], a high-quality image of the iris of a missing person is not always available. Iris recognition could also be useful in the search for a perpetrator: high-quality images of the irises of suspects can be retrieved, but photographic evidence is most of the time not created under ideal circumstances.

In the Dutch legal system, no legal or identification cases could be found in which iris recognition has been used. One of the reasons for this is that forensic evidence is most of the time uncontrolled: images are not taken with iris recognition in mind. The high-quality images needed to match the iris of a perpetrator, or of a missing person who needs to be found, simply do not exist.

There is quite some research available on iris recognition algorithms which are tested on, or even designed for, uncontrolled images. However, no recent comparison has been made between those different algorithms. Therefore this literature review will compare different algorithms which have been tested on uncontrolled images and answer the following question:

What is the performance of different iris recognition algorithms on uncontrolled iris images? This literature review begins by explaining the basics of iris recognition, after which the implementation of four different algorithms is discussed with the help of two different databases of uncontrolled iris images taken in natural light. I will reflect on the results and give my own educated opinion on the uses of iris recognition on uncontrolled images.


CHAPTER 2

Iris recognition

2.1 Basics

As described in the introduction, most iris recognition systems require illumination by near infra-red light. These light sources reveal patterns in the iris which are not visible to the naked eye. The first complete [11] and nowadays most implemented [25] iris recognition algorithm was proposed by Daugman in 1993 [8]. It consists roughly of three steps: image pre-processing, feature extraction and template matching. While the Daugman algorithm is designed for images made in a controlled environment with near infra-red light, most algorithms take these same three steps in iris recognition [24]. Therefore, to better understand the iris recognition process, we explain the basics of these three steps and some common ways they are executed.

2.1.1 Image pre-processing

During the image pre-processing we have to make sure we only keep the part of the eye in which patterns can be discovered: the iris. Both the outer and the inner edge of the iris can be detected. The outer edge of the iris is adjacent to the sclera, the white part of the eye, and in most algorithms contrast is used to detect this edge. The inner edge of the iris is the outer edge of the black pupil; again, contrast is used to detect this edge. Both edges can then be approximated by a circle [24].

When taking an image of an eye, the eyelids and eyelashes can prevent the whole iris from being visible. The algorithm has to remove these from the images. Texture, edges or contrast can be used to detect and remove the eyelids and eyelashes.

The last step in the image pre-processing is the normalisation of the image. Normalisation is done so that images of different sizes or taken under different lighting conditions can still be compared. Normalisation makes sure that the same features can be extracted from multiple images of the same eye in the next step. The different steps and the result of the image pre-processing can be seen in figure 2.1.
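As an illustration of what such a normalisation step can look like, the sketch below remaps the annular iris region to a fixed-size rectangular grid, in the spirit of Daugman's rubber sheet model [5]. It is a minimal sketch that assumes circular pupil and iris boundaries and uses hypothetical grid sizes; it is not the exact procedure of any of the reviewed algorithms.

```python
import numpy as np

def normalise_iris(image, pupil_xy, pupil_r, iris_xy, iris_r,
                   radial_res=64, angular_res=512):
    """Map the annular iris region onto a fixed rectangular grid
    (minimal rubber-sheet sketch; boundaries assumed circular)."""
    thetas = np.linspace(0, 2 * np.pi, angular_res, endpoint=False)
    radii = np.linspace(0, 1, radial_res)
    out = np.zeros((radial_res, angular_res), dtype=image.dtype)
    for i, r in enumerate(radii):
        for j, t in enumerate(thetas):
            # Interpolate between the pupil and iris boundary points.
            xp = pupil_xy[0] + pupil_r * np.cos(t)
            yp = pupil_xy[1] + pupil_r * np.sin(t)
            xi = iris_xy[0] + iris_r * np.cos(t)
            yi = iris_xy[1] + iris_r * np.sin(t)
            x = int(round((1 - r) * xp + r * xi))
            y = int(round((1 - r) * yp + r * yi))
            if 0 <= y < image.shape[0] and 0 <= x < image.shape[1]:
                out[i, j] = image[y, x]
    return out
```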

2.1.2 Feature extraction

After the image has been pre-processed, we can extract features from the image. The image itself holds too much information to be compared directly, so we want to create a template which holds only the relevant information, in this case the iris pattern. The template consists of features extracted by a certain algorithm. In this way, the information of the iris pattern is extracted and represented in a lower-dimensional space, which makes it ready to be compared [19].


Figure 2.1: (a) Original image of the eye (b) Iris localised in image (c) Normalised iris image. Simplified version of figure from [24].

2.1.3 Template matching

Because of the normalisation and the template creation in the previous steps, we can compare the created templates to look for similarities. Two templates of different images of the same eye will not be exactly the same, but if a good feature extraction algorithm is used they will strongly resemble each other.

An iris recognition algorithm can look at the correlation between points in, or the distance between, two templates. A certain threshold t can then be set, and if the similarity exceeds this threshold we say that the two templates match.

2.1.4 Comparison techniques

In biometrics we distinguish two different recognition purposes: verification and identification [18]. In the Aadhaar project, the purpose is to verify the identity of a person. The iris is linked to an identification number, and when a person claims to be the owner of this identification number, this claim can be checked by iris recognition. We call this verification. In the border control of the United Arab Emirates, the iris of an individual is searched against a database of people who are on a negative watch list [7]; this is what we call identification. In the case of using iris recognition as forensic evidence, matching the iris of a missing person with that of a deceased body can be seen as verification. Finding a perpetrator can be seen as identification, since the found images of a perpetrator can be matched against a dataset of known suspects.

For both verification and identification, there are four possible outcomes: a true positive, a true negative, a false positive and a false negative. However, different metrics are of interest depending on which purpose we want to measure the performance of.

For verification, we use a so-called Receiver Operating Characteristic (ROC) curve, which plots the false positive rate on the x-axis against 1 minus the false negative rate (the true positive rate) on the y-axis while varying the threshold t. An example of a ROC curve can be seen in figure 2.2. The ROC of a perfect classifier would bend squarely into the top left corner, while a random classifier would run straight through the middle from the bottom left to the top right corner. The more the curve bows towards the upper left corner, the better the classifier.
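As an illustration, the sketch below computes a ROC curve and the equal error rate directly from genuine (same-eye) and impostor (different-eye) similarity scores. The variable names and the score convention (higher means more similar) are assumptions of this example, not part of any of the reviewed algorithms.

```python
import numpy as np

def roc_and_eer(genuine_scores, impostor_scores, n_thresholds=1000):
    """Sweep a threshold over similarity scores and return FPR, TPR and EER."""
    genuine_scores = np.asarray(genuine_scores)
    impostor_scores = np.asarray(impostor_scores)
    scores = np.concatenate([genuine_scores, impostor_scores])
    thresholds = np.linspace(scores.min(), scores.max(), n_thresholds)
    fpr, tpr = [], []
    for t in thresholds:
        tpr.append(np.mean(genuine_scores >= t))   # true positive rate
        fpr.append(np.mean(impostor_scores >= t))  # false positive rate
    fpr, tpr = np.array(fpr), np.array(tpr)
    # EER: the point where the false positive rate equals the false
    # negative rate (1 - TPR).
    eer_idx = np.argmin(np.abs(fpr - (1 - tpr)))
    return fpr, tpr, fpr[eer_idx]
```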

When measuring the performance of identification we use a Cumulative Match Characteristic (CMC) curve. When we test an iris image of subject x against a database containing N reference iris images, we get N probability scores, which we rank from high to low. If we do this for M subjects we get M sets of N ranked scores. We remove the sets of subjects which do not have a reference image among the N reference images, so we are left with M1 ranked sets. By doing this we only have information about the true positive rate, which is the measurement used in the CMC. The graph shows on the y-axis what percentage of the M1 sets contain the true positive at rank K (K ≤ N), which is displayed on the x-axis. For example, in figure 2.2 we can see that at rank 5 a bit more than 95% of the identifiable subjects have actually been identified when t = 0 [10].

Figure 2.2: (a) ROC graph with the diagonal line to find the EER (b) CMC graph. Adjusted version of figure from [10].
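A minimal sketch of how a CMC curve (and therefore the rank-one recognition rate) could be computed from a matrix of probe-versus-gallery scores; the variable names are illustrative, and it is assumed that every probe's true identity is present in the gallery, as in the description above.

```python
import numpy as np

def cmc_curve(score_matrix, probe_labels, gallery_labels):
    """CMC sketch: score_matrix[i, j] is the similarity of probe i to gallery
    image j; returns, for each rank K, the fraction of probes whose true
    identity appears within the top-K ranked gallery entries."""
    n_probes, n_gallery = score_matrix.shape
    hits_at_rank = np.zeros(n_gallery)
    gallery_labels = np.asarray(gallery_labels)
    for i in range(n_probes):
        order = np.argsort(-score_matrix[i])            # rank high to low
        ranked = gallery_labels[order]
        first_hit = np.where(ranked == probe_labels[i])[0][0]
        hits_at_rank[first_hit:] += 1                   # cumulative count
    return hits_at_rank / n_probes

# The rank-one recognition rate is simply the first value of the curve:
# rank_one = cmc_curve(scores, probe_ids, gallery_ids)[0]
```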

Besides graphical evaluation, we can also use numerical metrics to describe the performance. Three metrics which are often used in papers about iris recognition algorithms are the equal error rate (EER), rank-one recognition and the decidability index.

The EER can be calculated or read from the ROC curve and is the point where the false negative rate and the false positive rate are equal to each other; it is expressed as a percentage. In figure 2.2 the EER is located where the blue line intersects the diagonal line. The lower this rate, the better the performance of the algorithm. Some ROC curves do not have x and y axes of the same scale; in that case the EER does not lie on the diagonal line.

Rank-one recognition can be calculated or found in the CMC curve. It is simply the percentage of subjects which are correctly recognised at the first rank.

Another metric is the decidability, which is based on the mean and standard deviation of the measured distances between iris templates. The metric takes the mean and standard deviation of the distances between iris templates of the same subject, µ1 and σ1 [4], and between templates of different subjects, µ2 and σ2. The difference between the means is divided by the square root of half the sum of the corresponding variances, as can be seen in the following formula for the decidability d′:

d' = \frac{|\mu_1 - \mu_2|}{\sqrt{(\sigma_1^2 + \sigma_2^2)/2}}

The decidability is reflected in the ROC by how curved the line is and does not depend on the threshold t. The better the performance, the higher the decidability index and the more the ROC approaches the perfect bow into the upper left corner.
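A minimal sketch of how this index could be computed from sets of genuine (same-subject) and impostor (different-subject) distance scores, assuming the definition with the variances of the two score distributions as given above; the variable names are illustrative.

```python
import numpy as np

def decidability(genuine_dists, impostor_dists):
    """Decidability d': separation between the same-subject and
    different-subject distance distributions."""
    mu1, mu2 = np.mean(genuine_dists), np.mean(impostor_dists)
    var1, var2 = np.var(genuine_dists), np.var(impostor_dists)
    return abs(mu1 - mu2) / np.sqrt((var1 + var2) / 2)
```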

In the forensic applications described in chapter 1, another metric is often used: the likelihood ratio. The likelihood ratio gives information about the strength of evidence. However, the numbers and error rates reported in the reviewed studies cannot be used in combination with the likelihood ratio, since those error rates are based on posterior probabilities. According to [23], the strength of the likelihood-ratio framework is that no prior odds that could influence a case are needed. According to [13] it is possible under certain conditions to calculate a likelihood ratio from metrics like the equal error rate, but we are not able to do this with the data given in the studied research.

2.2 Comparison

As described in the introduction, there has been quite some research in recent years into iris recognition algorithms for uncontrolled images. However, if we want to compare these different algorithms we need comparable results. When using images taken in uncontrolled conditions, different problems for iris recognition can arise in the image.


First of all, if we do not have control over the environment we can assume that no near infra-red light source has been used to take the image. According to [12] the difference between an image taken in natural light and one taken in near infra-red light is of great importance for the design of the algorithm; this holds for both the iris localisation in the image pre-processing and the feature extraction. It makes a difference because in natural light fewer features are visible, especially with darker iris pigments [5], which makes it harder not only to detect these features but also to detect the pupil.

Besides the light source, we also have no control over the distance and the angle at which a photo is taken. There are also different circumstances which can occlude the iris; we consider this noise [30].

There are different datasets with iris images which are noisy, taken at a distance or captured in non-ideal conditions. In the next sections, we describe the datasets we used for the comparison.

2.2.1 Datasets

2.2.2 UBIRIS.v2 and NICE-II

The UBIRIS.v2 [30] dataset is a dataset of images captured in visible light, created by the University of Beira Interior. The dataset was created around three principles: the images should be taken at different distances, the subjects should be moving, and noise should be present. The images were taken while the subjects were walking at a distance between 4 and 8 metres and are manually cropped to a region of interest of 800 by 600 pixels at a resolution of 72 dots per inch. This region corresponds to the periocular region, the region around the eye.

Most of the subjects are of Caucasian (90%) ethnicity and a small percentage are black (8%) or Asian (2%). The dataset is divided into five age subsets with the following representations:

1. 0-20: 6.6%
2. 21-25: 32.9%
3. 26-30: 23.8%
4. 31-35: 21.0%
5. 36-99: 15.7%

Of every subject there is in the dataset an iris image of good quality, and there are multiple images with at least one of the following causes of noise:

1. The iris is off-angle
2. The iris is out of focus
3. The iris is rotated
4. The iris is motion blurred
5. The iris is obstructed by an eyelid
6. The iris is obstructed by eyelashes
7. The iris is obstructed by glasses
8. The iris is obstructed by contact lenses
9. The iris is obstructed by hair
10. The iris is photographed in poor lighting
11. The iris has lighting reflections from glasses
12. The iris has lighting reflections
13. The iris is only partly captured
14. The iris is not present in the image

In total the UBIRIS.v2 dataset contains 11102 images of 261 subjects. The UBIRIS.v2 dataset contains a subset made for the NICE-II competition [26]. NICE-II is short for Noisy Iris Challenge Evaluation Part II, and the subset was made so that different algorithms could test their performance on the same set. The NICE-II training subset contains 1000 images of 71 subjects.

The dataset also contains already segmented iris images; these can be used to compare new comparison techniques or can be used directly in an iris recognition algorithm. They are retrieved by the solution described in [38], which was the winner of the previous edition of the NICE competition, NICE-I [29].

2.2.3 MICHE-I

MICHE-I [9] is a dataset of uncontrolled images captured with mobile phones. The images are taken with the idea in mind that the subject whose iris image is needed is holding the mobile phone that takes the image. Since a mobile phone is used, the images are taken in natural light conditions.

The subjects of the MICHE-I dataset are all of Caucasian ethnicity and between 20 and 60 years old. There are 66 male subjects and 26 female subjects.

The subjects took the iris images of themselves and were asked to do this with the idea of mobile recognition in mind: the image had to be taken the way they would take it if an application asked them to show their iris to verify their identity. This makes the capturing process as genuine and as close to a possible application as possible. Each subject had to take at least four different images per camera of each device, both indoors and outdoors. Three different mobile devices were used to take the images, an iPhone 5, a Samsung Galaxy IV and a Samsung Galaxy Tablet II, with resolutions of 1536 by 2048 pixels, 2322 by 4128 pixels and 640 by 480 pixels respectively. In contrast to the UBIRIS.v2 [30] database, the images are not cropped to a certain region of interest, since they are already created with the purpose of capturing the periocular region.

The paper identifies the following kinds of noise present in the images:

1. The iris is off-angle
2. The iris is out of focus
3. The iris is motion blurred
4. The iris is obstructed by an eyelid
5. The iris is obstructed by eyelashes
6. The iris is obstructed by glasses
7. The iris is obstructed by hair
8. The iris is obstructed by shadows
9. The iris has reflections, either from lighting, people or objects
10. Artifacts created by the mobile device
11. Different forms of lighting
12. Different colours are dominant

In total there are 92 different subjects, of whom 3732 images had been taken at the moment of publication of [9].


2.2.4 Overview

The main difference between the two datasets is the way the images are taken. [9] contains images taken up close with a low-resolution mobile camera by the subjects themselves, while [30] contains images taken with a high-resolution camera from a distance while the subject was moving. The [30] dataset is bigger with 11102 iris images, while [9] has 3732 images. It is good to realise, however, that most algorithms use the [26] subset of [30], which contains only 1000 images.

The different noise factors are largely the same, with some extra noise factors created by the devices in the [9] dataset and some partial irises in [30]. Not every image of [9] necessarily has a noise factor present, while [30] does have some form of noise in every image.


CHAPTER 3

Algorithms

In the past decade, a lot of research has been conducted on iris recognition algorithms. Because of this there are also a lot of newly proposed methods, and for this research we do not have the resources to compare them all.

The choice of algorithms was made on several grounds. Since there is a big performance difference between images taken in natural and in near infra-red light, we focus on algorithms tested on images taken in natural light, as these images are more likely to occur in the case of uncontrolled images. All of the datasets of subsection 2.2.1 reflect this choice by containing images taken in natural light, so the algorithms must have been tested on at least one of these datasets. The chosen algorithms all use different techniques, so even with the limited time available for this research we can see the performance of unique aspects of the algorithms. We also want the proposed algorithm to be a complete and automatic solution, so all three steps, image pre-processing, feature extraction and template matching, need to be included.

3.1 Accurate Iris Recognition at a Distance Using Stabilized Iris Encoding and Zernike Moments Phase Features

3.1.1 Image pre-processing

The algorithm of [36] is designed to give good recognition performance on images taken both in visible and in near infra-red light. It needs an image of the periocular region as input and uses a previously proposed iris segmentation algorithm [35]. Visible light can be a problem when detecting the actual iris. For example, a round light source can create reflections which can be interpreted as an iris, since they are round and have a high contrast with their surroundings. To correct for the variation in light, high dynamic range compression is used to reduce highlights, after which a Gaussian filter is applied to reduce them even further.

To find the iris, every pixel is labelled as either iris or non-iris. This is done with a random walker algorithm. The output of the random walker algorithm is a mask of the calculated iris region, with the pupil included; this is called coarse segmentation. The mask is used to calculate the centre of the iris.

Using the mask and the centre, the actual inner and outer edges of the iris can be estimated. As stated in section 2.1.1, the edges are assumed to be circular. A Canny edge detector is applied to the computed mask, and the circular edges are placed as close as possible to the edges calculated by the Canny edge detector.

According to [35] the outer edge is not always circular in uncontrolled images, so it should be further refined. This is done by comparing the mean intensity of pixels close to the inner edge with pixels close to the outer edge; the pixels close to the outer edge whose difference with the mean inner-edge intensity is too big are removed.

At this point, there is still the possibility that eyelids or eyelashes are in the way. For the eyelid, the already known edges are used to create a model, and the eyelids are reconstructed using a second-degree polynomial curve. Eyelashes are removed by again using intensity differences, but this time between the upper and bottom half of the masked iris: if the difference between a pixel in the upper half and the mean of the bottom half is too big, the pixel is removed from the mask. The iris and mask are normalised using a method proposed by [5] to create a usable normalised iris image and mask. The whole pre-processing stage is illustrated in figure 3.1.

Figure 3.1: Summary of the image pre-processing step from [36]. Figure from [36].

3.1.2 Feature extraction

The algorithm uses two different feature extraction and template matching techniques which are combined to decide if the iris matches with another iris. The first technique makes use of global iris features while the other uses local features.

The technique which uses global features describes the whole image as one feature. The iris image is first further smoothed to get rid of noise. To get a global feature the image has to be converted to an iris code, which is done using a 1-dimensional log-Gabor filter; this creates an iris code consisting of bits. To cope with the noise which is still present in the image, the algorithm tries to find fragile bits in the iris code. Fragile bits are bits which are inconsistent in the iris code: if we take different images of the same iris, the fragile bits differ between the codes generated from those images [15]. Fragile-bit weights are calculated for the reference iris codes, i.e. the irises to which a new iris image is compared. If there is only one image of a subject in the reference dataset, the weights have no influence. But if there are multiple iris images of the same subject in the reference dataset, we can calculate which bits differ between the global feature templates and assign a lower weight to the fragile bits.

To get the local features, Zernike moments are used. According to [36], Zernike moments have been used in other algorithms, but those only use the magnitude information; the approach of this algorithm makes use of the phase information. In section 4.1 the performance difference between the use of magnitude and phase information can be seen. The mask created in section 3.1.1 is used in this step to cancel out noise effects.


So the global features only use the normalized iris image while the local features also use the mask created in the image pre-processing.

3.1.3 Template matching

The global technique calculates a modified version of the Hamming distance, the number of positions at which the bits of two iris codes differ [43]. The bit differences between the image x we want to test and reference image y are multiplied by the fragile-bit weights of image y; the sum is then divided by the sum of the weights to get the modified version of the Hamming distance.
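A minimal sketch of such a weighted Hamming distance, assuming binary iris codes stored as numpy arrays and per-bit weights for the reference code; the exact weighting scheme of [36] may differ.

```python
import numpy as np

def weighted_hamming_distance(code_x, code_y, weights_y):
    """Fraction of disagreeing bits, with each bit position weighted by the
    fragile-bit weight of the reference code y (lower weight = more fragile)."""
    disagreement = (code_x != code_y).astype(float)
    return np.sum(disagreement * weights_y) / np.sum(weights_y)
```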

The similarity between images x and y for the Zernike moments is calculated using the distance between the phases. The global and local feature distances are combined with a weighted sum to create the final matching score.

3.2 Multi-patch deep sparse histograms for iris recognition in visible spectrum using collaborative subspace for robust verification

The algorithm proposed in [32] was created especially for iris recognition on smartphones, for example as an alternative to a fingerprint for unlocking the phone. Earlier works suggest that the phone must have a near infra-red camera, but this paper proposes an approach using natural light. The algorithm is created for the verification purpose, as explained in section 2.1.4.

3.2.1 Image pre-processing

The implementation includes the detection of the periocular region, so images with a large area visible around the eye can be used as input. A Haar cascade object detector is used to find this eye region. This object detector can divide the image into subsections using Haar-like features. Haar-like features are calculated by taking two neighbouring rectangles in a certain window and calculating the difference of the sums of intensities in these rectangles. In this window, many neighbouring rectangles of different sizes and at different locations are used to calculate multiple features. By sliding the window over the image, a lot of features can be detected, and these features can be used to determine where the periocular region is. One feature by itself is weak, but by combining multiple features over multiple window locations the periocular region can be detected reliably [34].

With a known eye region, the outer edge of the iris can be detected. The technique used to get the outer iris edge comes partly from an iris recognition algorithm earlier proposed by the same authors [31]. The difference in contrast, in the form of a saliency map, is used to detect this outer edge. This map reveals strong edges, which is the case at the outer edge of the iris. Diffusion is used to make small contrast edges even smaller, so only the high-contrast outer edge is visible in the map. Using a circular Hough transform the iris region is estimated as a circle. The iris region is then correlated with images with known iris radius to get the final estimate of the iris diameter.
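To make these two detection steps concrete, the sketch below uses OpenCV's stock Haar cascade for eye detection followed by a circular Hough transform on the detected eye region. The cascade file, the input file name and the Hough parameters are illustrative assumptions, not the detector or tuned values used in [32].

```python
import cv2

# Load OpenCV's bundled Haar cascade for eyes (illustrative; [32] uses its
# own detector for the periocular region).
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

image = cv2.imread("face.jpg")  # hypothetical input image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect candidate eye regions with a sliding-window Haar cascade.
eyes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in eyes:
    roi = cv2.medianBlur(gray[y:y + h, x:x + w], 5)
    # Circular Hough transform to estimate the (roughly circular) iris edge.
    circles = cv2.HoughCircles(roi, cv2.HOUGH_GRADIENT, dp=1, minDist=w,
                               param1=100, param2=30,
                               minRadius=w // 8, maxRadius=w // 2)
    if circles is not None:
        cx, cy, r = circles[0][0]
        print("iris candidate at", (x + cx, y + cy), "radius", r)
```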

The OSIRIS V4.1 algorithm [27] is used to do the rest of the segmentation; this algorithm requires the previously calculated iris diameter as input [32]. OSIRIS V4.1 applies a Viterbi algorithm to estimate the most likely edges of the iris, both inner and outer. The algorithm is applied to both a high- and a low-resolution version of the image. The high-resolution edge is used to create a mask and detect most eyelashes; the mask is further refined with the help of an adaptive filter. The low-resolution edge is fitted to a circle with least squares to create the outer edge. The normalisation technique of [5] is used to normalise the iris image.

3.2.2 Feature extraction

The feature extraction technique proposed in [32] is again a variation on the authors' own earlier work [31]. Deep sparse filtering is used to get the features. Sparse filtering is a machine learning technique which does not require the tuning of hyperparameters. The filters are trained using 200,000 randomly subdivided parts of 4212 images unrelated to eyes.

Figure 3.2: Creating four patches of a normalised iris image as used in [31]. Simplified version of figure from [31].

The algorithm of [31] extracts the features using the 256 deep sparse filters learned via the training described above. Every image is first divided into four patches, as can be seen in figure 3.2. All three colour channels of the normalised image and of the four patches are convolved with the 256 filters. Groups of 8 results of these convolutions are stacked onto each other to create 32 images per colour channel for the four patches and the normalised image. All these images are converted to histograms, which are concatenated to get the final iris template. The whole feature extraction is summarised in figure 3.3.

Figure 3.3: Summary of the feature extraction step from [31]. Figure from [31].
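A rough sketch of this multi-patch histogram template, assuming a single-channel normalised iris image for simplicity (the algorithm uses all three colour channels) and collapsing each group of filter responses by summation, which may differ from the exact stacking used in [31]; the filter bank, group size and bin count are illustrative.

```python
import numpy as np
from scipy.signal import convolve2d

def multi_patch_histogram_template(norm_iris, filters, n_bins=64, group=8):
    """Split the normalised iris into four patches, convolve every region
    with a filter bank, collapse groups of responses and concatenate the
    resulting histograms into one template."""
    h, w = norm_iris.shape
    patches = [norm_iris[:h // 2, :w // 2], norm_iris[:h // 2, w // 2:],
               norm_iris[h // 2:, :w // 2], norm_iris[h // 2:, w // 2:]]
    template = []
    for region in [norm_iris] + patches:
        responses = [convolve2d(region, f, mode="same") for f in filters]
        for i in range(0, len(responses), group):
            collapsed = np.sum(responses[i:i + group], axis=0)
            hist, _ = np.histogram(collapsed, bins=n_bins)
            template.append(hist)
    return np.concatenate(template)
```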

3.2.3 Template matching

Three different matching techniques are tested with the feature extraction described above. The first technique does not compare the templates directly with each other, but instead learns a so-called subspace. The subspace is learned on the already known features of the reference subjects, which are treated as the classes; the features of multiple images of the same subject are combined into one class. This subspace is trained using a Gaussian function [2], so that it can predict to which class a new image belongs based on the information of all the reference subjects present in the subspace and the new feature template.

The other two matching techniques use distances between the feature templates: the chi-squared distance and the cosine similarity.
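For reference, minimal implementations of these two template distances on histogram-based feature vectors; the small epsilon added to avoid division by zero is an assumption of this sketch.

```python
import numpy as np

def chi_squared_distance(a, b, eps=1e-10):
    """Chi-squared distance between two (histogram) feature vectors."""
    return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (1 = identical direction)."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```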


3.3 The Impact of Preprocessing on Deep Representations for Iris Recognition on Unconstrained Environments

[45] compares different methods using convolutional neural networks (CNNs). In particular, the influence of some steps in the image pre-processing phase is discussed. While one best algorithm is proposed in the end, it is useful for the goal of this literature review to see the effects of these steps. Therefore the different methods are explained and discussed.

3.3.1 Image pre-processing

Six image pre-processing methods are tested in [45], but they share some common steps to retrieve a mask of the iris using the technique of [38]. As discussed in section 2.2.2 these masks are already present in the dataset, but the way they are retrieved is important since it is part of the complete algorithm. The technique starts by removing the brightest points, which are seen as reflections, and filling these points with bilinear interpolation [14]. The image is then clustered into different regions using the darkest and brightest points. From the different clustered regions it has to be decided which region contains the actual iris; for example, the eyebrow is likely to be clustered as a region but should not be chosen as the iris region. An example of the result of the clustering is shown in figure 3.4. Shape information is used to decide which region to use.

Figure 3.4: (a) Original image of the eye (b) Three different clustered regions found in image (a). Simplified version of figure from [38].

Next, the inner and outer edges are located. They are modelled as two circles without a common centre and located with a so-called integro-differential operator, which is capable of detecting circular edges [1]. The edges are then refined using intensity information close to these edges.
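For context, Daugman's integro-differential operator is a standard formulation of such a circular edge detector (not necessarily the exact variant used in [38]): it searches over the centre coordinates (x0, y0) and radius r for the maximum blurred radial derivative of the contour integral of the image intensity I(x, y),

\max_{(r, x_0, y_0)} \left| G_\sigma(r) * \frac{\partial}{\partial r} \oint_{r, x_0, y_0} \frac{I(x, y)}{2 \pi r} \, ds \right|

where G_\sigma is a Gaussian smoothing function with scale \sigma.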

Eyelids are detected with a Canny edge detector, and noise is removed with the help of a model of the curve of the eyelid. This model has been trained on a training dataset with known eyelid locations and is applied to the edge calculated by the Canny edge detector. The last common step is the detection and removal of eyelashes and shadows with the help of intensity values. The image is split into two zones: the bottom, which is assumed to contain no eyelashes and shadows, and the top, which can contain eyelashes and shadows. A prediction model based on intensity histograms of these zones is used to get the threshold below which a part of the image should be masked.

Three different forms of normalisation are tested in the paper: first, the same normalisation as used in [5] with an 8:1 aspect ratio; second, the same method with a different aspect ratio, namely 4:2; and third, no normalisation at all, where only the region that is not between the inner and outer edges, calculated with the help of the mask, is removed. These three methods are used twice, once without segmentation and once with segmentation based on the mask. This way six different versions of one image are created, as can be seen in figure 3.5. To make the images work with the CNN, all images are resized to be square, since the input size of a pre-trained network should not be changed if it is to perform in the best possible way.

Figure 3.5: (a) and (b) show, respectively, non-segmented and segmented images for noise removal from the Nice.II database. From top to bottom, images with the aspect ratios of 8:1, 4:2 are shown, as well as non-normalized images. Figure from [45].

A CNN is used for the feature extraction step, but if a CNN is trained on a small image set there is a big chance of overfitting [28]. The available datasets with uncontrolled images are not big enough; therefore [45] proposes a simple data augmentation technique. Every image is rotated six times by a random angle within a certain range, creating seven times the number of images originally present.
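A minimal sketch of this kind of rotation-based augmentation, assuming images loaded with Pillow and an illustrative angle range of ±30 degrees; the actual range used in [45] may differ.

```python
import random
from PIL import Image

def augment_with_rotations(path, n_copies=6, max_angle=30):
    """Create n_copies rotated variants of one image (seven-fold including
    the original), each with a random angle in [-max_angle, max_angle]."""
    original = Image.open(path)
    variants = [original]
    for _ in range(n_copies):
        angle = random.uniform(-max_angle, max_angle)
        variants.append(original.rotate(angle, resample=Image.BILINEAR))
    return variants
```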

3.3.2 Feature extraction

For feature extraction two different convolutional neural networks are tested, VGG and ResNet-50. Both networks are pre-trained for face recognition and only the last layer is replaced by two new layers; this is called transfer learning [40]. Since the CNN performs a classification, the last layer needs to have as many neurons as there are classes the image can be labelled as; in this case, the classes are the subjects in the training dataset. The network is trained on the augmented data, and in this way the weights of the network are set. After the training, the last layer, which contains the probabilities for the classes, is removed. This is done because the subjects on whom the network is trained are not the same subjects who are tested, so the classes are useless. The output of the new last layer can be seen as a feature template.
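A sketch of this transfer-learning setup with PyTorch, assuming a recent torchvision and an ImageNet-pretrained ResNet-50 (the networks in [45] are pre-trained for face recognition, which torchvision does not ship); the number of subjects and the 256-unit intermediate layer are illustrative. The classification head is replaced for training and later stripped to obtain feature templates.

```python
import torch
import torch.nn as nn
from torchvision import models

num_subjects = 71  # e.g. the number of training subjects (assumption)

# Pre-trained backbone; [45] uses face-recognition weights instead of ImageNet.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
in_features = backbone.fc.in_features

# Replace the final layer with two new layers for subject classification.
backbone.fc = nn.Sequential(
    nn.Linear(in_features, 256),
    nn.Linear(256, num_subjects),
)

# ... fine-tune `backbone` on the augmented training images here ...

# After training, drop the classification layer and keep the 256-d output
# as the feature template for unseen subjects.
feature_extractor = backbone
feature_extractor.fc = feature_extractor.fc[0]
feature_extractor.eval()

with torch.no_grad():
    template = feature_extractor(torch.randn(1, 3, 224, 224))  # dummy input
print(template.shape)  # torch.Size([1, 256])
```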

In figure 3.6 we can see one of the CNNs used, VGG. The version shown in figure 3.6 is pre-trained on the 1000 classes of ImageNet, so its last layer consists of 1000 classes to which a softmax is applied. In the version used in [45], this last layer is replaced by two new layers during training, and of those two layers the last one is again removed to get a feature template.

3.3.3 Template matching

For the template matching the cosine similarity is used. The output of the new last layer of the network for image x is compared with the output for image y, which gives a cosine distance between x and y.


Figure 3.6: The architecture of the VGG-16 CNN. Figure from [39].

3.4 Conditional Generative Adversarial Network Based Data Augmentation for Enhancement of Iris Recognition Accuracy

The algorithm described in [21] uses three convolutional neural networks (CNNs). The problem with CNNs is that a lot of training data is required for them to work reliably; when only a small set is used for training there is a big chance of overfitting [28]. The proposed solution of [21] overcomes this problem by using a deep learning technique called a conditional Generative Adversarial Network (cGAN) to create artificial iris images to train the CNNs on. Using these artificial images, three CNNs are combined for the feature extraction.

3.4.1 Image pre-processing

The segmentation of the iris in [21] is relatively simple. A circular edge detector tries to find the inner and outer edges of the eye; this edge detector retrieves a centre point and a radius for the iris. Besides the iris image, two periocular images are used as well; the size of these images is determined by extending the calculated radius by two different weights. So we have three different circular images of the eye: one of only the iris and two of the periocular region.

The normalisation process is a slightly modified version of the one proposed in [5]. The normalised image is simplified for computational speed, since the images need to be used in the CNNs, which can be quite computationally expensive; therefore the normalised image is only 256 by 8 pixels. The normalisation is done for all three images [20].

As described earlier, to prevent overfitting a lot more data is needed than is publicly available in a dataset. Therefore artificial versions of the iris images, but not of the two periocular images, are created with the help of a conditional generative adversarial network. An already existing generative adversarial network (GAN) called pix2pix [16] is used. A GAN is capable of generating images based on a certain input and has two main components, a discriminator and a generator. The discriminator is trained to discriminate as well as it can between fake and real images, while the generator is trained to create images that are classified as real by the discriminator. Both the generator and the discriminator are neural networks; the generator outputs images and the discriminator predicts whether an image is fake or real. In this way, the generated images are trained to look as real as possible. The conditional part of a cGAN makes the GAN more controllable by adding a second input next to the images. The cGAN outputs artificially generated iris images.
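For reference, the conditional GAN objective as commonly formulated for pix2pix-style models, where the generator G and discriminator D both receive the conditioning input x in addition to the real image y or the noise z; this is the textbook form, not a statement of the exact loss used in [21]:

\min_G \max_D \; \mathbb{E}_{x,y}\left[\log D(x, y)\right] + \mathbb{E}_{x,z}\left[\log\left(1 - D(x, G(x, z))\right)\right]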

A cGAN also needs a lot of input to perform well; therefore [21] also performs augmentation before feeding the data into the cGAN. This augmentation is done by translating the iris centre and cropping the image. The set created in this way is the training set for the cGAN.

3.4.2 Feature extraction

Three CNNs are combined for the feature extraction. The structure of the CNNs was proposed in earlier research by the same authors [20]. Existing CNN structures could not be used, since most existing CNNs are trained on square images and the height of the normalised image would be too small for such CNNs; retraining the hidden layers of an existing network would not work with the proposed size. The new network uses non-square filters and has eight hidden layers and three fully connected layers. An overview of the proposed network can be seen in figure 3.7.

Figure 3.7: The architecture of the CNN as proposed in [20]. Figure from [20].

To improve the performance, not only the normalised iris image is used but also the two periocular images. The periocular images are retrieved and normalised as explained in section 3.4.1. The three images are each put through their own CNN; the structure of the three CNNs is the same, but they are trained separately for the iris and the two kinds of periocular images.

During the training of the three CNNs, only the first CNN, which uses the normalised iris images, uses the data augmented by the cGAN. The two CNNs trained on periocular images only use the original, translated and cropped images from the dataset. The training is done using the last fully connected layer, which has as many classes as there are subjects in the training dataset. However, the classes in the last fully connected layer are useless, since the classes of training and testing will be different: the subjects used for the CNN training are not present in the testing dataset. So to extract the features after the training phase, the first fully connected layer is used and the other fully connected layers are removed.

3.4.3 Template matching

We now have three different templates which have to be matched. With every check against the reference dataset, all three feature templates of subject x are tested against all three of subject y in the reference dataset. This is done by individually calculating the Euclidean distances between the feature template pairs. The scores are then combined using a Support Vector Machine (SVM), which outputs a single score. If this score is greater than a threshold, the templates match. An overview of the method can be found in figure 3.8.
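A minimal sketch of this kind of score-level fusion with scikit-learn, assuming we already have the three Euclidean distances per comparison and binary labels (1 = same subject) for a set of training comparisons; the kernel choice, the illustrative values and the threshold at zero are assumptions of the sketch, not the configuration used in [21].

```python
import numpy as np
from sklearn.svm import SVC

# Each row holds the three Euclidean distances (iris, periocular 1, periocular 2)
# for one comparison; labels indicate whether the pair is a genuine match.
train_distances = np.array([[0.8, 1.1, 0.9],   # illustrative values
                            [2.4, 2.9, 3.1],
                            [0.6, 0.7, 1.0],
                            [2.1, 2.6, 2.8]])
train_labels = np.array([1, 0, 1, 0])

# Fit an SVM that fuses the three distance scores into one decision score.
fusion_svm = SVC(kernel="rbf")
fusion_svm.fit(train_distances, train_labels)

new_comparison = np.array([[0.7, 1.0, 1.2]])
fused_score = fusion_svm.decision_function(new_comparison)[0]
is_match = fused_score > 0.0   # threshold on the fused score (assumed at 0)
print(fused_score, is_match)
```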


CHAPTER 4

Results

4.1 Accurate Iris Recognition at a Distance Using Stabilized Iris Encoding and Zernike Moments Phase Features

[36] is evaluated with the help of three different databases, one of which is the UBIRIS.v2 database of section 2.2.2. The subset for the NICE-II competition [26] is used for training and evaluation. Of the 1000 images available, the first 96 images, of 19 different subjects, are used for training the parameters. The other 904 images, of 52 subjects, are independent of the training set and are used for the evaluation.

In figure 4.1.a we can see the ROC of the algorithm when using the different features. The purple line represents the method only using global features. The blue line represents the local features of the phase component of the Zernike Moments while the green line shows the results of the Zernike Moments magnitude component. The red line uses both the global and the local phase features.

In figure 4.1.b we can see the CMC of the algorithm when using the different features. The same colour coding as in 4.1.a is used for the lines.

Figure 4.1: (a) The reported ROC of the proposed algorithms of [36] (b) The CMC of the proposed algorithms of [36]. Results to create these figures are from testing NICE-II [26] subset of the UBIRIS.v2 database [30]. Figure from [36].

From the ROC curve, the equal error rate is calculated for the algorithm using both the global features and the local phase features, which is 11.96%. The CMC curve gives a rank-one recognition rate of 63%. A decidability score of 2.5735 is achieved.


4.2 Multi-patch deep sparse histograms for iris recognition in visible spectrum using collaborative subspace for robust verification

[32] is tested on two datasets, MICHE-I of section 2.2.3 and MICHE-II, which we do not use for this research. Of the MICHE-I dataset, only the images of the two phones are used. Furthermore, only subjects of whom at least four images were available are used, giving a subset of 50 subjects. For each subject, one image is tested against the whole dataset minus that image, which should yield three matches. This is done in a circulating manner, such that every image of each subject has been in the testing position once while the other three are in the reference dataset. The results are presented separately for the phones, their cameras and the indoor and outdoor situations. In figures 4.2.a, 4.2.b, 4.2.c and 4.2.d you can find the ROC curves for the iPhone: indoor for the front and back camera and outdoor for the front and back camera respectively. In the ROC curves, a comparison is made between three other algorithms and the proposed solution (the dotted lines) with the three different matching techniques as described in section 3.2.3.

Figure 4.2: (a) ROC of [32] of iPhone front camera indoor (b) ROC of [32] of iPhone back camera indoor. (c) ROC of [32] of iPhone front camera outdoor. (d) ROC of [32] of iPhone back camera outdoor. Results to create these figures are from testing a subset of the MICHE-I [9] dataset. Figure from [32].


In figures 4.3.a, 4.3.b, 4.3.c and 4.3.d you can find the ROC curves for the Samsung phone: indoor for the front and back camera and outdoor for the front and back camera respectively. In the ROC curves, a comparison is made between three other algorithms and the proposed solution (the dotted lines) with the three different matching techniques as described in section 3.2.3.

Figure 4.3: (a) ROC of [32] of Samsung phone front camera indoor (b) ROC of [32] of Samsung phone back camera indoor. (c) ROC of [32] of Samsung phone front camera outdoor. (d) ROC of [32] of Samsung phone back camera outdoor. Results to create these figures are from testing a subset of the MICHE-I [9] dataset. Figure from [32].


In table 4.1 you can find the EER for the different cameras and template matching techniques. A significance level of α = 0.10 is used.

Phone     Location  Camera   EER% Collaborative subspace   EER% Cosine    EER% Chi-squared
iPhone    Indoor    Front    2.09 ± 0.64                    5.77 ± 1.30    5.23 ± 1.19
iPhone    Indoor    Back     1.73 ± 0.79                    10.16 ± 1.64   8.06 ± 1.47
iPhone    Outdoor   Front    0.37 ± 0.32                    4.12 ± 1.10    1.81 ± 0.72
iPhone    Outdoor   Back     1.17 ± 0.56                    7.37 ± 1.52    7.15 ± 1.50
Samsung   Indoor    Front    1.41 ± 0.65                    2.91 ± 0.91    4.70 ± 1.15
Samsung   Indoor    Back     2.46 ± 0.64                    8.43 ± 1.53    8.65 ± 1.56
Samsung   Outdoor   Front    0.66 ± 0.46                    3.48 ± 0.85    1.84 ± 0.85
Samsung   Outdoor   Back     1.71 ± 0.57                    6.97 ± 1.34    7.34 ± 1.59

Table 4.1: EER scores for the proposed solution with the different template matching techniques of [32] on a subset of the MICHE-I dataset [9], as reported by [32].

Since this algorithm is designed for verification, no figures (such as the CMC) or numbers used for identification are presented.


4.3 The Impact of Preprocessing on Deep Representations for Iris Recognition on Unconstrained Environments

The approaches in [45] are tested on the UBIRIS.v2 database only; they use the same NICE-II competition subset [26] as [36] for the training. No ROC or CMC is provided, but the paper reports a total of 18 different equal error rates and decidability indices. For both the EER and the decidability index the standard deviation is also reported. Each of the two neural networks is tested with the three different forms of normalisation, and each form of normalisation is then tested without data augmentation and without segmentation, with data augmentation and without segmentation, and with data augmentation and with segmentation. The results for VGG and ResNet-50 on the NICE-II test set can be found in tables 4.2 and 4.3 respectively. A significance level of α = 0.05 is used.

Normalisation   Augmentation   Segmentation   EER %           Decidability
8:1             No             No             26.19 ± 1.95    1.3140 ± 0.1246
8:1             Yes            No             23.63 ± 1.33    1.4712 ± 0.0881
8:1             Yes            Yes            22.58 ± 1.07    1.5437 ± 0.0697
4:2             No             No             24.77 ± 1.42    1.4127 ± 0.1001
4:2             Yes            No             18.74 ± 0.89    1.8527 ± 0.0712
4:2             Yes            Yes            18.00 ± 0.93    1.9055 ± 0.0750
None            No             No             23.32 ± 1.10    1.4891 ± 0.0740
None            Yes            No             17.49 ± 0.90    1.9529 ± 0.0760
None            Yes            Yes            17.48 ± 0.68    1.9439 ± 0.0589

Table 4.2: EER and decidability scores for the VGG convolutional neural network on the NICE-II test set [26], a subset of the UBIRIS.v2 database [30], as reported by [45].

Normalisation   Augmentation   Segmentation   EER %           Decidability
8:1             No             No             24.38 ± 1.41    1.4297 ± 0.0916
8:1             Yes            No             19.18 ± 0.75    1.7988 ± 0.0552
8:1             Yes            Yes            20.68 ± 1.39    1.6801 ± 0.1071
4:2             No             No             22.78 ± 1.22    1.5307 ± 0.0853
4:2             Yes            No             17.11 ± 0.53    1.9822 ± 0.0482
4:2             Yes            Yes            17.44 ± 0.85    1.9450 ± 0.0803
None            No             No             21.51 ± 0.97    1.6119 ± 0.0677
None            Yes            No             13.98 ± 0.55    2.2480 ± 0.0528
None            Yes            Yes            14.89 ± 0.78    2.1781 ± 0.0794

Table 4.3: EER and decidability scores for the ResNet-50 convolutional neural network on the NICE-II test set [26], a subset of the UBIRIS.v2 database [30], as reported by [45].

For the VGG network the best EER was obtained when using no normalisation but with segmentation and data augmentation. For the ResNet-50 network, the best score was obtained with no normalisation and no segmentation, but with data augmentation.


4.4 Conditional Generative Adversarial Network based Data Augmentation for Enhancement of Iris Recognition Accuracy

The proposed method of [21] uses three databases to test the performance, two of which we use for the comparison: the NICE-II training subset of the UBIRIS.v2 dataset [26] and the MICHE-I dataset [9]. The ROC curves for the NICE-II [26] dataset can be found in figure 4.4; different combinations of augmented data and numbers of CNNs are tried for this dataset. The ROC curves for the MICHE-I [9] dataset can be found in figure 4.5; on this dataset the cGAN augmentation was always used.

Two-fold cross-validation is used in [21], which can also be seen in the figures. This means that the dataset was split into two subsets, one used for training and the other for testing; the subsets are then switched to get a second result.

Figure 4.4: (a) ROC of [21] of the 1st fold cross validation with [26] dataset (b) ROC of [21] of the 2nd fold cross validation with [26] dataset.

Figure 4.5: (a) ROC of [21] of the 1st fold cross validation with [9] dataset (b) ROC of [21] of the 2nd fold cross validation with [9] dataset.


Method              Average EER %   Average Decidability
1st CNN only        12.79           2.25
2nd CNN only        11.87           2.42
3rd CNN only        11.63           2.46
All CNNs combined   8.58            3.31

Table 4.4: Average EER and decidability scores for the different solutions of [21] on the NICE-II dataset [26], as reported by [21].

In table 4.4 the EER and decidability scores can be found for four different methods: the three CNNs on their own and the proposed combined method. The scores are averaged over the two folds for the [26] dataset. The 1st CNN is the CNN which uses the iris images.

In table 4.5 the EER and decidability scores per device can be found. The scores are averaged over the two folds for the [9] dataset. The method with the 1st CNN only is used.

Device           Average EER %   Average Decidability
Samsung phone    18.08           1.72
Samsung tablet   19.24           1.98
iPhone           19.50           1.67

Table 4.5: Average EER and decidability scores when using only the 1st CNN solution of [21] on the MICHE-I dataset [9], as reported by [21].

In table 4.6 you can find the EER and decidability scores per device for all CNNs combined.

Device           Average EER %   Average Decidability
Samsung phone    15.49           1.86
Samsung tablet   16.34           2.09
iPhone           17.39           1.87

Table 4.6: Average EER and decidability scores for the proposed solution of [21] on the MICHE-I dataset [9], as reported by [21].


4.5 Comparison

Since some papers describe multiple possible solutions, we summarise the best performing proposed solutions of the four different papers in table 4.7. For the algorithm of [21], not the best solution but the one using only the iris region was chosen, since in this research we compare iris recognition without the periocular region.

Method from    Best performing solution
[36] in 3.1    Global and local features, Zernike moments phase component
[32] in 3.2    Multi-patch deep sparse histograms using collaborative subspace template matching
[45] in 3.3    ResNet-50 without normalisation and segmentation, with data augmentation
[21] in 3.4    1st CNN with cGAN data augmentation for dataset [26]

Table 4.7: The proposed and best performing solutions from [36], [32], [45] and [21].

In table 4.8 you can find the metrics for these proposed solutions. The rank-one recognition rate of 63% from [36] is not included in the table since this metric is reported in [36] only. It is important to note that, even though the same datasets are used, we cannot compare the results one by one as if they were tested and used in the same way, since different subsets or training and testing methods are used. Also, two of the papers, [36] and [21], used no significance level at all, while [32] used α = 0.10 and [45] used α = 0.05. The numbers do, however, give an indication of how well the algorithms perform relative to each other. The methods used are described in table 4.7.

Method from             Dataset                          EER %           Decidability
[36] from section 3.1   Subset of NICE-II                11.96           2.5735
[45] from section 3.3   NICE-II                          13.98 ± 0.55    2.2480 ± 0.0528
[21] from section 3.4   NICE-II                          12.79           2.25
[21] from section 3.4   MICHE-I Samsung phone            18.08           1.72
[32] from section 3.2   MICHE-I Samsung Indoor Front     1.41 ± 0.65     -
[32] from section 3.2   MICHE-I Samsung Indoor Back      2.46 ± 0.64     -
[32] from section 3.2   MICHE-I Samsung Outdoor Front    0.66 ± 0.46     -
[32] from section 3.2   MICHE-I Samsung Outdoor Back     1.71 ± 0.57     -
[21] from section 3.4   MICHE-I iPhone                   19.50           1.67
[32] from section 3.2   MICHE-I iPhone Indoor Front      2.09 ± 0.64     -
[32] from section 3.2   MICHE-I iPhone Indoor Back       1.73 ± 0.79     -
[32] from section 3.2   MICHE-I iPhone Outdoor Front     0.37 ± 0.32     -
[32] from section 3.2   MICHE-I iPhone Outdoor Back      1.17 ± 0.56     -
[21] from section 3.4   MICHE-I Samsung tablet           19.24           1.98

Table 4.8: EER and decidability metrics of the best proposed iris recognition algorithms (as described in table 4.7) of [36], [32], [45] and [21], as reported by those papers.


CHAPTER 5

Discussion and Conclusion

5.1 Critical discussion

The four proposed automatic iris recognition algorithms provide a good overview of different iris recognition techniques for uncontrolled iris images. Most of the papers do not just propose a technique, but also provide an overview of the steps taken to come to the decision of using that technique, and compare multiple similar techniques and design choices with each other.

Because of the variety of testing methods and databases, the results cannot be compared directly; they only give an indication of how well the algorithms perform relative to each other. Of the four complete iris recognition algorithms, the best-proposed solution [32] has an equal error rate of 0.37% on the most favourable subset of the [9] dataset. However, the same solution has an equal error rate of 2.46% on another part of the dataset. So the solution still does not come close to the results of iris recognition on images taken in a controlled environment. For example, the algorithm proposed in [8] achieves an equal error rate as low as 0.08% on the Casia V3 dataset, which contains images taken with near infra-red light in a constrained setting.

The solution of [32] with the collaborative subspace also requires the subspace to be relearned every time a new subject is added to the dataset of known people, which is not an ideal situation. However, with the other distance metrics it still performed better on the [9] set than the other tested algorithms, with the highest equal error rate for a subset being 10.16%.

Besides the equal error rate, we also discussed the decidability, which was not reported for every algorithm we compared. The best decidability on the [26] subset was achieved by the algorithm of [21], with a score of 3.31. Again this score does not come close to the score of 14.1 reported in [6] for the algorithm proposed in [8] under ideal circumstances.

Only the algorithm of [36] reported a rank-one recognition rate, of 63%, which means that in 63% of the tests the correct subject was placed in the first rank. This is a low score compared to the more than 96% that is achieved with algorithms on images taken in a controlled environment [22].
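To illustrate what being placed in the first rank means, the minimal sketch below computes a rank-one recognition rate from a hypothetical similarity matrix in which every probe image is compared against one gallery template per enrolled subject. The data and labels are invented for the example and do not come from any of the compared papers.

import numpy as np

def rank_one_rate(similarity, probe_labels, gallery_labels):
    # For every probe (row), the gallery subject with the highest similarity
    # is the rank-one candidate; count how often it is the correct subject.
    best_match = np.argmax(similarity, axis=1)
    return np.mean(np.asarray(gallery_labels)[best_match] == np.asarray(probe_labels))

# Hypothetical similarities for 5 probe images against 3 enrolled subjects.
similarity = np.array([[0.9, 0.2, 0.1],
                       [0.3, 0.8, 0.4],
                       [0.2, 0.1, 0.7],
                       [0.6, 0.5, 0.4],
                       [0.1, 0.9, 0.3]])
probe_labels = ["A", "B", "C", "A", "C"]
gallery_labels = ["A", "B", "C"]
print(rank_one_rate(similarity, probe_labels, gallery_labels))  # 0.8 in this toy example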

It is clear that all the algorithms perform worse than known algorithms on controlled images, but what does this tell us about the potential use of iris recognition as forensic evidence as described in section 1? With the results from these studies it is not possible to determine the likelihood ratio of the matches made during testing, so we can only judge the algorithms on the scores described earlier. Based on these numbers, for both the purpose of matching deceased and missing persons and that of finding perpetrators, the algorithms will not be discriminating enough, especially when we consider the potential consequences a decision by such an algorithm can have, as further explained in section 5.1.2.
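For reference, the likelihood ratio as commonly used in forensic evidence evaluation expresses how much more probable the observed evidence E (here, a comparison score) is under the hypothesis H_ss that the two irises come from the same source than under the hypothesis H_ds that they come from different sources:

LR = p(E | H_ss) / p(E | H_ds)

Estimating it requires the genuine and impostor score distributions, which the compared studies do not report in sufficient detail.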

5.1.1

Problems

The results we found were all obtained on datasets which are still, to some extent, controlled. The use of iris recognition for the forensic applications from section 1 would require a solution that would accept iris images from completely different sources, taken under different conditions. The algorithms [21] and [32], which used the [9] dataset, split their results up by device, which is useful for seeing how they perform in these different settings, but given the differences between the scores this does not convince us of a system that could be used with images from completely different sources.

The best-proposed solution of [21] used more than just the iris region and reported better results on those tests, which means we did not use their best solution for the comparison, since we only want to use the iris region. Other algorithms also perform better when the periocular region is taken into account; for example, [36] reported that their algorithm performs considerably better with this region included. While we know that the iris is stable over time [44], we do not have this information for the rest of the periocular region. Therefore only the performance metrics of the first CNN of [21] were used.

The algorithms of [36], [37] and [45] also still need the periocular region to be located before the iris can be extracted from it. At this moment this region is not found automatically: the algorithms expect an image of only this region as input. For an ideal automated system, the detection of the periocular region should be included in the process.

In the two datasets, [9] and [30], the subjects are mostly of Caucasian ethnicity, certain age groups are better represented than others, and males are more strongly represented than females in the [9] dataset. Even though different eye colours are taken into account, we do not know whether ethnicity makes a difference for the recognition. For a better comparison, the datasets should include people of different ethnicities, sexes and ages in equal proportions.

5.1.2

Personal opinion

In my opinion, it is at this moment not safe to use iris recognition on uncontrolled images: the metrics point to unreliable recognition, and when it is used for forensic purposes the uncertainty is far too large for the potential consequences it can have. We should not convict a person based on evidence as uncertain as the results we have seen. A false link between a found body and a missing person can also have serious consequences if the police stop searching for a person who may still be alive.

There is hope on the horizon when it comes to deep learning techniques, which can possibly be exploited better for this task. But I do not think deep learning will, or should, make it to the courtroom any time soon: we do not know the true decisive steps of a deep network, and transparency is one of the most important requirements in the forensic framework.

Right now we are also missing research on truly uncontrolled images that reflect the population. A more chaotic dataset, for example combined from different existing noisy datasets and with a better representation of ethnicity, sex and race, could be of use for research on uncontrolled images, which are more relevant in, for example, a forensic context.

This is of course only of use once the performance of the algorithms improves in general, because at this stage I do not think they are nearly discriminating enough to be of help in a forensic context.

This does not mean that iris recognition on natural-light images cannot be used at all. While the algorithms may not be discriminating enough, iris recognition can also be done in a non-automated manner, for example in forensic practice. As proposed in [3], algorithms could still extract the features, while the features are interpreted by a forensic expert, who can give a judgement on how well a specific case performs. However, no research on natural-light images with this approach could be found.

5.2

Conclusion

We can now come back to the question:

What is the performance of different iris recognition algorithms on uncontrolled iris images?

As we have seen from the test results, the performance of automatic iris recognition algorithms on uncontrolled images is at this moment not good, or at least not good enough for the purposes described in the introduction. The results of the automated algorithms are not discriminating enough and could lead to both false negatives and false positives which, depending on the use, could have serious consequences.

We could of course not compare all available algorithms designed for use on uncontrolled images, but no indications were found that other algorithms would significantly outperform the ones covered in this research, at least on publicly available datasets.

For further research into the comparison of algorithms designed for iris recognition on uncontrolled images, it would be good if a dataset were developed that consists of images from multiple sources with equal ethnicity, sex and race ratios. Partly automated methods with human-interpretable features could also be of use, for example in forensic practice, and should be investigated further.


Bibliography

[1] Z. Z. Abidin, M. Manaf, A. Shibghatullah, S. M. Yunus, S. Anawar, and Z. Ayop. Iris segmentation analysis using integro-differential and hough transform in biometric system. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 4(2):41–48, 2012.

[2] S. Cai, L. Zhang, W. Zuo, and X. Feng. A probabilistic collaborative representation based approach for pattern classification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2950–2959, 2016.

[3] J. Chen, F. Shen, D. Z. Chen, and P. J. Flynn. Iris recognition based on human-interpretable features. IEEE Transactions on Information Forensics and Security, 11(7):1476–1485, 2016.

[4] J. Daugman. Biometric decision landscapes. Technical report, University of Cambridge, Computer Laboratory, 2000.

[5] J. Daugman. How iris recognition works. In The essential guide to image processing, pages 715–739. Elsevier, 2004.

[6] J. Daugman. Probing the uniqueness and randomness of iriscodes: Results from 200 billion iris pair comparisons. Proceedings of the IEEE, 94(11):1927–1935, 2006.

[7] J. Daugman and I. Malhas. Iris recognition border-crossing system in the uae. Department of Computer Science and Technology, University of Cambridge, 2004.

[8] J. G. Daugman. High confidence visual recognition of persons by a test of statistical independence. IEEE transactions on pattern analysis and machine intelligence, 15(11):1148–1161, 1993.

[9] M. De Marsico, M. Nappi, D. Riccio, and H. Wechsler. Mobile iris challenge evaluation (miche)-i, biometric iris dataset and protocols. Pattern Recognition Letters, 57:17–23, 2015.

[10] B. DeCann and A. Ross. Relating roc and cmc curves via the biometric menagerie. In 2013 IEEE Sixth International Conference on Biometrics: Theory, Applications and Systems (BTAS), pages 1–8. IEEE, 2013.

[11] A. Gangwar and A. Joshi. Deepirisnet: Deep iris representation with applications in iris recognition and cross-sensor iris recognition. In 2016 IEEE international conference on image processing (ICIP), pages 2301–2305. IEEE, 2016.

[12] K. Grabowski, W. Sankowski, M. Zubert, and M. Napieralska. Reliable iris localization method with application to iris recognition in near infrared light. In Proceedings of the International Conference Mixed Design of Integrated Circuits and System, 2006. MIXDES 2006., pages 684–687. IEEE, 2006.

[13] R. Haraksim, D. Ramos, and D. Meuwly. Validation of likelihood ratio methods for forensic evidence evaluation handling multimodal score distributions. Iet Biometrics, 6(2):61–69, 2016.


[14] Z. He, T. Tan, Z. Sun, and X. Qiu. Toward accurate and fast iris segmentation for iris biometrics. IEEE transactions on pattern analysis and machine intelligence, 31(9):1670–1684, 2008.

[15] K. P. Hollingsworth, K. W. Bowyer, and P. J. Flynn. Using fragile bit coincidence to improve iris recognition. In 2009 IEEE 3rd International Conference on Biometrics: Theory, Applications, and Systems, pages 1–6. IEEE, 2009.

[16] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1125–1134, 2017.

[17] E. K. Jacobsen. Unique identification: Inclusion and surveillance in the indian biometric assemblage. Security Dialogue, 43(5):457–474, 2012.

[18] A. K. Jain, A. Ross, and S. Prabhakar. An introduction to biometric recognition. IEEE Transactions on circuits and systems for video technology, 14(1):4–20, 2004.

[19] G. Kumar and P. K. Bhatia. A detailed review of feature extraction in image processing systems. In 2014 Fourth international conference on advanced computing & communication technologies, pages 5–12. IEEE, 2014.

[20] M. B. Lee, H. G. Hong, and K. R. Park. Noisy ocular recognition based on three convolutional neural networks. Sensors, 17(12):2933, 2017.

[21] M. B. Lee, Y. H. Kim, and K. R. Park. Conditional generative adversarial network-based data augmentation for enhancement of iris recognition accuracy. IEEE Access, 7:122134–122152, 2019.

[22] X. Liu, K. W. Bowyer, and P. J. Flynn. Experiments with an improved iris segmentation algorithm. In Fourth IEEE Workshop on Automatic Identification Advanced Technologies (AutoID’05), pages 118–123. IEEE, 2005.

[23] G. S. Morrison. Measuring the validity and reliability of forensic likelihood-ratio systems. Science & Justice, 51(3):91–98, 2011.

[24] R. Y. F. Ng, Y. H. Tay, and K. M. Mok. A review of iris recognition algorithms. In 2008 International Symposium on Information Technology, volume 2, pages 1–7. IEEE, 2008.

[25] K. Nguyen, C. Fookes, R. Jillela, S. Sridharan, and A. Ross. Long range iris recognition: A survey. Pattern Recognition, 72:123–143, 2017.

[26] NICE. Noisy iris challenge evaluation, part ii. url: http://nice2.di.ubi.pt/.

[27] N. Othman, B. Dorizzi, and S. Garcia-Salicetti. Osiris: An open source iris recognition software. Pattern Recognition Letters, 82:124–131, 2016.

[28] K. Pasupa and W. Sunhem. A comparison between shallow and deep architecture classifiers on small dataset. In 2016 8th International Conference on Information Technology and Electrical Engineering (ICITEE), pages 1–6. IEEE, 2016.

[29] H. Proenca and L. A. Alexandre. Toward covert iris biometric recognition: Experimental results from the nice contests. IEEE Transactions on Information Forensics and Security, 7(2):798–808, 2011.

[30] H. Proenca, S. Filipe, R. Santos, J. Oliveira, and L. Alexandre. The UBIRIS.v2: A database of visible wavelength images captured on-the-move and at-a-distance. IEEE Trans. PAMI, 32(8):1529–1535, August 2010.

[31] K. B. Raja, R. Raghavendra, V. K. Vemuri, and C. Busch. Smartphone based visible iris recognition using deep sparse filtering. Pattern Recognition Letters, 57:33–42, 2015.


[32] K. B. Raja, R. Raghavendra, S. Venkatesh, and C. Busch. Multi-patch deep sparse histograms for iris recognition in visible spectrum using collaborative subspace for robust verification. Pattern Recognition Letters, 91:27–36, 2017.

[33] Schiphol. Privium, the service programme for frequent flyers at Amsterdam Airport Schiphol, celebrates its 10th anniversary, Jan 2017. url: https://news.schiphol.com/privium-the-service-programme-for-frequent-flyers-at-amsterdam-airport-schiphol-celebrates-its-10th-anniversarytes-its-10th-anniversary/.

[34] S. Soo. Object detection using haar-cascade classifier. Institute of Computer Science, University of Tartu, pages 1–12, 2014.

[35] C.-W. Tan and A. Kumar. Towards online iris and periocular recognition under relaxed imaging constraints. IEEE Transactions on Image Processing, 22(10):3751–3765, 2013.

[36] C.-W. Tan and A. Kumar. Accurate iris recognition at a distance using stabilized iris encoding and Zernike moments phase features. IEEE Transactions on Image Processing, 23(9):3962–3974, 2014.

[37] C.-W. Tan and A. Kumar. Efficient and accurate at-a-distance iris recognition using geometric key-based iris encoding. IEEE Transactions on Information Forensics and Security, 9(9):1518–1526, 2014.

[38] T. Tan, Z. He, and Z. Sun. Efficient and robust segmentation of noisy iris images for non-cooperative iris recognition. Image and vision computing, 28(2):223–230, 2010.

[39] R. Thakur. Step by step vgg16 implementation in keras for beginners, Nov 2020. Picture taken from: https://towardsdatascience.com/step-by-step-vgg16-implementation-in-keras-for-beginners-a833c686ae6c.

[40] L. Torrey and J. Shavlik. Transfer learning. In Handbook of research on machine learning applications and trends: algorithms, methods, and techniques, pages 242–264. IGI Global, 2010.

[41] M. Trokielewicz, A. Czajka, and P. Maciejewicz. Iris recognition after death. IEEE Transactions on Information Forensics and Security, 14(6):1501–1514, 2018.

[42] Unique Identification Authority of India, Nov 2020. Aadhaar Dashboard, visit on: https://uidai.gov.in/aadhaar/dashboard/.

[43] W. M. Waggener. Pulse Code Modulation Techniques (Electrical Engineering). Springer, sep 1994.

[44] R. P. Wildes. Iris recognition: an emerging biometric technology. Proceedings of the IEEE, 85(9):1348–1363, 1997.

[45] L. A. Zanlorensi, E. Luz, R. Laroca, A. S. Britto, L. S. Oliveira, and D. Menotti. The impact of preprocessing on deep representations for iris recognition on unconstrained environments. In 2018 31st SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pages 289–296. IEEE, 2018.
