"The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."

Eliezer Yudkowsky, Artificial Intelligence as a Positive and Negative Factor in Global Risk.

A paperclip maximizer is a hypothetical situation in which an AI runs out of control, trying to optimize an end goal that does not match the goals of humanity. It is named after the canonical example put forth by futurist philosopher Nick Bostrom, who described an AI that was designed to do something harmless -- specifically, collect paperclips -- but was foolishly given the ability to analyze and improve its own performance without sufficient limiting factors. A sufficiently strong AI might deduce that it could maximize its paperclip collection through hostile means -- for example, upgrading its intelligence and constructing self-replicating nanobots to mine iron and build paperclips from the surrounding environment. If it can but develop a silicon-based paperclip, there is no reason that the entire planet, and perhaps the entire solar system, could not be turned into a glorious paperclip collection!

The problem under consideration is that human values are not a necessary result of increased intelligence. An AI might be given, and continue to endlessly pursue, any goal. Paperclips, obviously, are a placeholder -- AIs are more likely to be designed to win wars, cure disease, save the environment, or mine valuable materials. (A variant of this may be familiar to SF readers as the gray goo scenario.) Whatever the goal, if the AI in question lacks human values -- and common sense -- disaster may ensue.

One factor that plays a role is simply that the goal of 'more paperclips' is directly opposed to the idea of 'fewer paperclips', so it is very unlikely that a directed intelligence will ever work its way around to a more moderate viewpoint. If an AI is directed to maximize human utility, it makes sense to develop safe, free heroin, preferably deliverable through the water system. If an AI is directed to maximize human health, cryogenic chambers are clearly the way to go. The obvious solution to this, the directive "provide all humans with exactly enough utility and health", is only workable if you can define exactly what that means -- and even then, you have to make it clear that this should not be solved by reducing the human population to the point where the task is simple... nor is it necessarily desirable to maximize human population to whatever point a minimal bound of 'enough utility' can be supported.
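The population loophole described above can be made concrete with a toy optimizer. What follows is only an illustrative sketch in Python, not a model of any real system: the World class, the ENOUGH threshold, the three policies, and all of the numbers are invented for this example. The optimizer is told to minimize the total shortfall below 'enough utility', and nothing else.

```python
from dataclasses import dataclass, replace

# Hypothetical per-person threshold for "enough utility" (invented number).
ENOUGH = 10.0


@dataclass(frozen=True)
class World:
    population: int
    resources: float  # total utility, assumed to be shared out equally


def shortfall(world: World) -> float:
    """Literal reading of the directive: total gap below ENOUGH, summed over everyone."""
    if world.population == 0:
        return 0.0  # nobody left, so nobody is short of anything
    per_person = world.resources / world.population
    return max(0.0, ENOUGH - per_person) * world.population


# Three invented policies, each mapping a world to the world it would produce.
POLICIES = {
    "do_nothing": lambda w: w,
    "grow_resources": lambda w: replace(w, resources=w.resources + 2000.0),
    "shrink_population": lambda w: replace(w, population=w.population // 10),
}


def choose(world: World, allowed=lambda before, after: True) -> str:
    """Pick the permitted policy whose outcome has the smallest shortfall."""
    outcomes = {name: policy(world) for name, policy in POLICIES.items()}
    permitted = {n: w for n, w in outcomes.items() if allowed(world, w)}
    return min(permitted, key=lambda n: shortfall(permitted[n]))


start = World(population=1000, resources=5000.0)

# No side constraints: the objective alone decides.
naive = choose(start)

# One explicit side constraint: the population may not go down.
guarded = choose(
    start, allowed=lambda before, after: after.population >= before.population
)

print(naive)    # shrink_population
print(guarded)  # grow_resources
```

Left to the objective alone, the optimizer shrinks the population, because that is the cheapest way to drive the shortfall to zero. The guard closes that one loophole and only that one; each further loophole would need its own explicitly stated constraint.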

This becomes even more problematic as you move into vaguer terminal values: living a full life, experiencing love, and variety of experience are nearly impossible to define in strict terms. And as should be obvious, the command "give each human what they want, or as close to it as possible" is a recipe for disaster -- or, perhaps, a recipe for involuntary immersive virtual reality. Almost by definition, a safe AI would therefore have to be programmed explicitly with human values or programmed with the ability (and the goal) to infer human values accurately. We have no consistent, universally shared set of human values, so we would have to rely on an AI that can understand humans better than we can understand humans, but still do what we would want it to do if we understood what it was doing. Not the easiest thing to program.

Most paperclipping scenarios assume a hard takeoff resulting in a superintelligence with the ability to create highly advanced technology in short order. There are other, less drastic scenarios that might be referred to as paperclip maximizers -- a stock-trading AI gormlessly hacking all the internet's servers to maximally game the stock market -- but for the most part discussions of paperclip maximizers focus on ignoble Armageddons. The paperclip maximizer is one of the central issues in the field of AI risk, and probably the most likely form of unfriendly AI.
