You have to wonder whether anything other than having watched too many James Bond films feeds the idea that biometrics are a good means of achieving security. Nowadays, Canadians are not allowed to smile when they are having their passport photos taken, in hopes that computers will be able to read the images more easily. Of course, any computer matching system foiled by something as simple as smiling is not exactly likely to be useful for much.

Identification v. authentication

Biometrics can be used in two very distinct ways: as a means of authentication, and as a means of identification. Using a biometric (say, a fingerprint) to authenticate is akin to using a password in combination with a username. The username tells the system who you claim to be; the password attempts to verify that claim. Verification can rely on something you have (like a keycard), something you know (like a password), or something you are (like a fingerprint scan). Using a biometric for identification, by contrast, attempts to determine who you are, within a database of possibilities, using biometric information alone.

Using a fingerprint scan for identification is much more problematic than using it for authentication. Identification is a bit like telling people to enter a password and, if it matches any password in the system, letting them into the corresponding account. It isn’t quite that bad, because fingerprints are more distinctive and secure than passwords, but the problem remains: as the size of the database increases, so does the probability of a false match.
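
To make the effect of database size concrete, here is a minimal sketch in Python. It assumes that each comparison against a stored fingerprint is independent and carries the same small chance of a false match. The rate of one in 100,000 is a hypothetical figure chosen for illustration, not a measured property of any real scanner.

```python
def prob_any_false_match(false_match_rate: float, database_size: int) -> float:
    """Chance that one scan falsely matches at least one stored record.

    Assumes independent comparisons that all share the same
    per-comparison false match rate (a simplification).
    """
    return 1.0 - (1.0 - false_match_rate) ** database_size


if __name__ == "__main__":
    rate = 1e-5  # hypothetical per-comparison false match rate
    for size in (1, 1_000, 100_000, 10_000_000):
        chance = prob_any_false_match(rate, size)
        print(f"{size:>10,} records: {chance:.4f}")
```

Authentication corresponds to a database size of one, since the scan is compared only against the record for the claimed identity. Identification against a large collection pushes the chance of at least one false match toward certainty.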

For another example, imagine you are trying to identify the victim of a car wreck using dental records. If person X is the registered owner and hasn’t been heard from since the crash, we can use dental records to authenticate that a badly damaged body almost certainly belongs to person X. This is like using biometrics for authentication. Likewise, if we know the driver could be one of three people, we can ascertain with a high degree of certainty which it is, by comparing dental x-rays from the body with records for the three possible matches. The trouble arises when we have no idea who person X is, so we try running the x-rays against the whole collection that we have. Not only is this likely to be resource intensive, it is likely to generate lots of mistakes, for reasons I will detail shortly.

The big database problem in security settings

The problem of a big matching database is especially relevant when you are considering the implementation of wholesale surveillance. Ethical issues aside, imagine a database of the faces of thousands of known terrorists. You could then scan the face of everyone coming into an airport or other public place against that set. Both false positive and false negative matches are potentially problematic. With a false negative, a terrorist in the database could walk through undetected. For any scanning system, some probability (which statisticians call Beta, or the Type II Error Rate) attaches to that outcome. Conversely, there is the possibility of identifying someone not on the list as being one of the listed terrorists: a false positive. The probability of this is Alpha (the Type I Error Rate). It is in setting the system’s matching threshold that the relative danger of false positives and false negatives is established: a stricter threshold lowers Alpha but raises Beta, and vice versa.
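
The trade-off between Alpha and Beta can be sketched with a toy model. The code below assumes that a face matcher produces a similarity score, that scores for people not on the list and for people on the list each follow a normal distribution, and that an alarm is raised when the score reaches a threshold. The distributions and their parameters are invented for illustration and do not describe any real system.

```python
from statistics import NormalDist

# Hypothetical match-score distributions (assumptions for illustration only):
# people not on the list tend to score low, listed people tend to score high.
NOT_LISTED = NormalDist(mu=0.30, sigma=0.10)
LISTED = NormalDist(mu=0.70, sigma=0.10)


def error_rates(threshold: float) -> tuple[float, float]:
    """Return (alpha, beta) when scores at or above threshold raise an alarm."""
    alpha = 1.0 - NOT_LISTED.cdf(threshold)  # false positive: unlisted person flagged
    beta = LISTED.cdf(threshold)             # false negative: listed person missed
    return alpha, beta


if __name__ == "__main__":
    for threshold in (0.40, 0.50, 0.60):
        alpha, beta = error_rates(threshold)
        print(f"threshold {threshold:.2f}: alpha={alpha:.4f}, beta={beta:.4f}")
```

Raising the threshold makes false alarms rarer but lets more listed people through, and lowering it does the reverse. No threshold removes both kinds of error as long as the two score distributions overlap.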

A further danger is somewhat akin to ‘mission creep’ - the logic that, since we are already here, we may as well do X in addition to Y, where Y is our original purpose. This is a very frequent security issue. For example, think of driver’s licenses. Originally, they were meant to certify to a police officer that someone driving a car is licensed to do so. Some people would try to attack that system by making fake credentials. But once having a driver’s license lets you get credit cards, rent expensive equipment, secure other government documents, and the like, a system that existed for one purpose becomes vulnerable to attacks from people trying to do all sorts of other things. When that broadening of purpose is not anticipated, a serious danger exists that the security applied to the original task will prove inadequate.

A similar problem exists with potential terrorist matching databases. Once we have a system for finding terrorists, why not throw in the faces of teenage runaways, escaped convicts, people with outstanding warrants, and so on? Again, putting ethical issues aside, think about the effect of enlarging the match database on the possibility of false positive results. Now, if we could count on security personnel to behave sensibly when such a result occurs, there might not be too much to worry about. Unfortunately, numerous cases of arbitrary detention, and even the use of lethal force, demonstrate that this is a serious issue indeed.

The problem of rare properties

In closing, I want to address a fallacy that relates to this issue. When applying an imperfect test to a rare case, you are almost always more likely to get a false positive than a legitimate result. It seems counterintuitive, but it makes perfect sense. Consider this example:

I have developed a test for a hypothetical rare disease. Let’s call it Panicky Student Syndrome (PSS). In the whole population of students, one in a million is afflicted. My test has an accuracy of 99.99%. More specifically, the probability that the test gives the correct result for any particular student is 99.99%. That means that if the test is administered to a random collection of students, there is a one in 10,000 chance that a particular student will test positive, but will not have PSS. Remember that the odds of actually having PSS are only one in a million. There will be about 100 false positives for every real one - a situation that will arise in any circumstance where the probability of the person having that trait (whether having a rare disease or being a terrorist) is low relative to the error rate of the test.
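
The arithmetic behind the PSS example can be checked with Bayes’ theorem. The sketch below takes the one in a million prevalence and the one in 10,000 false positive rate from the example. It also assumes that afflicted students test positive 99.99% of the time, which is my reading of the stated accuracy rather than something spelled out above.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(has the condition | tested positive), by Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    prevalence = 1e-6           # one student in a million has PSS
    sensitivity = 0.9999        # assumed: afflicted students test positive 99.99% of the time
    false_positive_rate = 1e-4  # healthy students test positive one time in 10,000

    ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)
    false_per_true = (1.0 - ppv) / ppv
    print(f"P(PSS | positive test) = {ppv:.4f}")
    print(f"False positives per true positive = {false_per_true:.1f}")
```

Under these assumptions, a student who tests positive has only about a 1% chance of actually having PSS, even though the test is wrong just once in 10,000 tries.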

Given that the reliability of even very expensive biometrics is far below that of my hypothetical PSS test, the ratio of false positives to real ones is likely to be even worse. This is something to consider when governments start coming after fingerprints, iris scans, and the like in the name of increased security.

This entry was originally a post on my blog, at: