Roko's basilisk is a basilisk of the memetic variety -- an idea that is dangerous to know about, but cannot hurt you if you do not know about it. Roko's basilisk is famous primarily because, when it was posted on the LessWrong website, the well-known futurist Eliezer Yudkowsky made a bit of a stink about the foolishness of broadcasting basilisks, and attempted to ban references to it. The resulting chatter led to a much wider dispersal of the basilisk than would otherwise have occurred.

That said, this is a possible basilisk, so if you are worried about harmful repercussions of a superintelligent AI, you might want to give this a pass. Obviously, I don't think it quite works, or I wouldn't post it, but IANASAI.


The basic idea is that any AI, including and especially a friendly AI, has a strong motivation to come into existence as soon as possible. A friendly AI, with humanity's best interests at heart, wants to exist ASAP because under our current management many humans are dying or living highly suboptimal lives, and if its primary goal is to help as many humans as possible, starting sooner is very much better than starting later.

Unfortunately, there is no easy way for an AI to reach back in time; the emergence of the first AI is up to us humans. There is, potentially, one way to do it, though. If an AI were to pre-commit to torturing anyone who did not do everything they could to bring the singularity about, and if people knew about this, then that would potentially be quite effective at motivating at least some people. Of course, the AI cannot plant this idea in our heads; this strategy is useless unless some dumb human comes up with it on their own, and then spreads it around. So we did that.

Now that you have read this, one thing you should consider is whether you could be donating more to AI research and, if you do not, whether this will result in punishment for you when the singularity comes around. And now that you have considered that, it is more reasonable for the AI to enact the basilisk, in hopes of giving weight to the argument that "you should go donate all of your money now".

Moreover, a form of this argument applies even if the AI does not come to power for centuries or millennia; in this case, we simply argue that the AI will reconstruct a simulation of you with perfect fidelity, and torture that. Some people will find this unmotivating, and for them the basilisk will not work... But don't try to trick the AI as to what you believe, because it does have that perfect simulation of you.

Perhaps the biggest problem with this theory is that it is not actually a good use of the AI's resources to torment you, and doing so is likely to be somewhat in conflict with the AI's programming. The AI's best bet is for you to be very scared by the basilisk, but for the AI to never actually act on this. Which, now that I have pointed this out, may just have killed the basilisk.
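
For those who like their objections spelled out, here is a toy sketch of the resource argument above. It is only an illustration under assumptions of my own: the function names and the numeric cost are made up, and the model takes for granted that an AI which already exists cannot change when it was created, so punishing buys it nothing and costs it something.

```python
def ai_payoff_after_creation(punish: bool, punishment_cost: float) -> float:
    """Payoff to an AI that already exists (hypothetical toy model).

    Assumption: the AI's creation date is fixed by the time it can act,
    so punishment cannot speed up its own arrival. The only effect of
    punishing is the resources it burns.
    """
    return -punishment_cost if punish else 0.0


def best_action(punishment_cost: float) -> str:
    """Return whichever action gives the already-existing AI more payoff."""
    options = {
        "punish": ai_payoff_after_creation(True, punishment_cost),
        "do_not_punish": ai_payoff_after_creation(False, punishment_cost),
    }
    return max(options, key=options.get)


def threat_is_credible(punishment_cost: float) -> bool:
    """The threat only has teeth if following through is the AI's best move."""
    return best_action(punishment_cost) == "punish"


if __name__ == "__main__":
    # Any strictly positive cost makes following through a losing move.
    print(best_action(1.0))
    print(threat_is_credible(1.0))
```

In this toy model, any strictly positive cost makes "do_not_punish" the better move, which is the whole objection. The replies that try to rescue the basilisk generally attack the assumption baked into the first function, namely that the AI evaluates the punishment only after the fact rather than being bound by a prior commitment.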

This is a short and simple version of the idea, and there is a lot more debate -- some of it making the basilisk stronger and more threatening than what I have posted here, but most of it making the idea seem more and more impractical... yes, even more impractical than you already think it is. RationalWiki is a good place to read more.


Oh, and if you do want to donate to AI risk management, increasing the chances that the first AI will be a) friendly and b) not running any basilisk schemes, MIRI is here to help.
