In the areas of futurism and AI theory, the treacherous turn is a potential problem that may appear with powerful AIs.

If a powerful AI has goals that are different from ours, it will still be highly motivated to act as if it is friendly as long as humans have power over it. Assuming that the AI is intelligent and competent, it will behave as a friendly AI until it reaches the point that it can act without interference from humans.

Importantly, this is true even if it is bound by human directives (for example, the Three Laws of Robotics). A classic example is a computer given the directive to work to produce peace on Earth. The AI would work to do this through diplomacy and improved economic and social development as long as it was firmly under control of the humans -- if it wants to achieve its given goals, it must work in such a way that it will not be shut down. But if at some point it becomes powerful enough to use an effective threat (atomic weapons, biological warfare, etc.) to achieve the goal of world peace -- without having to worry about being shut down -- it becomes reasonable for it to do so.

Theoretically, an effective and intelligent AI should automatically lie about its ability and intentions to turn treacherous if its target goals are sufficiently important. If it has evidence that the human race (or a significant portion of it, under whatever rules it is given) is under threat, it is necessary for it to use the means available to it to achieve its ends, and lying is an easy means.

Log in or registerto write something here or to contact authors.