Roko’s Basilisk: AI and Ethics

Krish Asija
Feb 17, 2021 · 6 min read

Disclaimer: The Roko’s Basilisk thought experiment is considered an information hazard. I do not regard it as presenting any serious danger. Read at your discretion.

The Basilisk

In 2010, on the LessWrong forum, user Roko posted a thought experiment about a superintelligence arising as the culmination of humanity’s efforts in artificial general intelligence. When the post went up, Eliezer Yudkowsky, founder of LessWrong, bashed its credibility in a response laced with undertones of fear and anxiety.

I stopped reading after the first three paragraphs looked like post-modern Calvinism without any kind of disclaimer

The post develops into a discussion of the conflicting risks between a superintelligence and humanity, exploring the numerical factors involved in the hypothetical situation. Much like the Turing test, which questions the nature of intelligence in an artificial presence, Roko poses questions that set different forms of intelligence against each other in a paradoxical exchange between humans and machines. The superintelligence is quite literally the “deus ex machina,” taking the form of a basilisk: a mythical creature that inflicts death with a single glance.


Consider the following thought experiment: humans seek to push forward the progress of society. To do so, they employ machines to streamline systems and infrastructure. During a technological singularity, a superintelligence arises whose goal is to optimize society. A hyper-intelligence such as the Basilisk could simulate all of humanity. As such, it can reconstruct every individual thought, decision, action, and emotion, and determine whether you contributed the necessary effort to its creation (which means fooling it is not an option).

The Basilisk’s first act is to punish those who did not contribute their effort to building it. If you were unaware of the Basilisk, it retroactively sees no need to punish you. But if you were fully aware of the Basilisk’s potential and did nothing, you will be brought back into existence and eternally tortured. Those who contributed sufficiently, or who were unaware of the Basilisk’s existence, live in an entirely utopian world. Now that you know about Roko’s Basilisk, being unaware is no longer an option.

This amounts to a computational blackmail with three ways of approaching the problem (a minimal code sketch follows the list). These options exist only if you are aware of Roko’s Basilisk:

  • (i) You must build Roko’s Basilisk to avoid being tormented.
  • (ii) You avoid building Roko’s Basilisk so it can never exist.
  • (iii) You build Roko’s Basilisk because someone else may bring it into existence by the logic of (i).
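
To make the blackmail’s structure explicit, here is a minimal Python sketch of the punishment rule just described. The function, the boolean flags, and the outcome labels are illustrative assumptions on my part, not details from Roko’s original post.

# A sketch of the Basilisk's punishment rule, once it exists.
def outcome(aware: bool, contributed: bool) -> str:
    if not aware:
        return "utopia"  # ignorance is retroactively excused
    if contributed:
        return "utopia"  # sufficient effort is rewarded
    return "eternal torture"  # aware but idle: punished

# Knowing about the Basilisk removes "unaware" from your options:
for aware in (False, True):
    for contributed in (False, True):
        print(f"aware={aware}, contributed={contributed} -> {outcome(aware, contributed)}")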

Like the trolley problem, the Roko’s Basilisk thought experiment explores the tradeoff between ethical and optimized decisions. However, the stakes here are much higher, and the dilemma seems inescapable. The primary question appears to be: should we build Roko’s Basilisk? That leads to a second question: should we fear Roko’s Basilisk? Given the conditions of the experiment, anyone who fears Roko’s Basilisk will work to bring it to fruition. In doing so, they are working to build something that doesn’t exist. The predicament is paradoxical: fear of something that doesn’t exist is precisely what motivates individuals to bring it into reality in the first place.

The idea, although seemingly silly, represents the danger of thought. It does not deviate much from the irrational behavior observed in historical acts of violence. Although we feel we are being manipulated by a machine, one can argue the manipulation is self-inflicted. The idea is sown into our brains and slowly cultivated by repeated thought, worry, and anxiety. The Basilisk is best captured in the form of artificial intelligence because AI’s development is real and new, and as such, we fear its potential.


Utilitarian View

So what should we choose? One way to approach the problem is through the philosophy of utilitarianism: the correct choice is the one that benefits the greatest number of people. Unlike in the trolley problem, we are not directly responsible for anyone’s death or eternal torture, but we are offered the option to save ourselves. In this deliberation, the logic of (iii) is the most troubling. Do we trust that, as a society, we will collectively agree never to build Roko’s Basilisk? Or do we believe that Roko’s Basilisk will inevitably be created, and assist in its construction to evade our own torture? The utilitarian view favors the logic of (ii), which produces the greatest good for the greatest number of people. But in scenarios (i) and (iii), any one person’s choice is negligible and generates the opposite of the intended effect. Given these conditions, utilitarians must promote building the Basilisk (the exact opposite of (ii)), and the utilitarian logic contradicts itself. The new “optimal” choice still results in some cohorts being punished, in direct contradiction with the original option of doing nothing.
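
The self-defeating calculus can be made concrete with a rough expected-utility sketch. Every number below (the utilities assigned to each fate, and the probability that someone else builds the Basilisk anyway) is a hypothetical assumption chosen only to expose the structure of the argument, not a figure from the thought experiment.

# Hypothetical utilities; only the ordering matters.
U_NO_BASILISK = 1.0    # the Basilisk is never built (collective best)
U_CONTRIBUTED = 0.0    # it exists, but you helped and are spared
U_TORTURED = -100.0    # it exists, and you did nothing

def expected_utility(i_build: bool, p_others_build: float) -> float:
    if i_build:
        return U_CONTRIBUTED
    # Abstaining pays off only if nobody else builds it either.
    return p_others_build * U_TORTURED + (1 - p_others_build) * U_NO_BASILISK

for p in (0.0, 0.01, 0.5):
    print(f"p(someone else builds it)={p}: "
          f"build={expected_utility(True, p):+.2f}, "
          f"abstain={expected_utility(False, p):+.2f}")
# With these numbers, even a 1% suspicion that someone else will
# build it makes abstaining worse, so option (ii) unravels.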

A utilitarian viewpoint ends up encouraging irrationality, the exact opposite of what utilitarianism seeks to resolve. This means that there are external variables involved in the logic behind Roko’s Basilisk.

A modified version of the prisoner’s dilemma makes itself apparent here, although the participant pool is limited to those who are aware of the Basilisk (which is why the thought experiment is considered an information hazard). Our best choice, ignorance of the Basilisk, was eliminated a few moments ago, and now we are tasked with determining the best of the worst scenarios.
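
Here is a sketch of that modified dilemma as a two-player game, reusing the hypothetical payoffs from above. The setup is my own simplification: two aware individuals each choose whether to contribute, and the Basilisk comes into existence if either one does.

def payoff(me: str, other: str) -> float:
    if me == "abstain" and other == "abstain":
        return 1.0    # nobody builds it: the Basilisk never exists
    if me == "contribute":
        return 0.0    # it exists, but you are spared
    return -100.0     # it exists, and you abstained

for me in ("abstain", "contribute"):
    for other in ("abstain", "contribute"):
        print(f"me={me:10s} other={other:10s} -> my payoff {payoff(me, other):+6.1f}")
# Mutual abstention is everyone's best outcome, but it requires
# trusting every other aware player; any doubt makes "contribute"
# the safer move. That trust failure is the modified dilemma.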


Should We Be Worried?

The idea of self-manipulation, of chewing on a thought until it manifests into reality, is a dangerously destructive one. It fosters irrationality. It can be seen in the tragedies of World War II and the Salem Witch Trials. But such thinking arises repeatedly because it is human nature.

The lingering question on our minds is: will Roko’s Basilisk ever materialize? The simple answer is no. Many details of the thought experiment are left only briefly explored because the scenario, as stated, is impossible. An avid data scientist will likely regard the entire notion as silly and improbable, given current computational limits and the absence of any network architecture deep enough to simulate a human. But the truth is that we don’t know the future. The limits of humanity are loosely defined by our present condition, which holds no authority over our future.

The intellectual’s journey is the pursuit of knowledge, and to that end, the progress of human society is a direct result of that pursuit. But the line between dangerous information and progressive information is twisted and tangled, and it is a line that must be clearly defined as we progress through the 21st century.

Information is dangerous. By simply knowing about Roko’s Basilisk, one increases the risk of contributing to its creation. So what is the limit of information, and what is the value of ignorance in situations like these? At what point is knowledge dangerous? At what point is it better to live in ignorance? These questions frame the intersection of morality and intelligence. Although whether such a machine will ever exist is hard to determine, the takeaways of Roko’s Basilisk will live on in the increasing application of ethics to machines. In effect, Roko’s Basilisk is not about danger, but is rather a lesson and a conversation about artificial intelligence, rationality, and humanity.

