Existential risk to humanity (X-risk, for short) refers to risks on a scale that could cause the extinction of humanity or permanently cripple its potential for future flourishing. Two familiar examples are global nuclear war and climate catastrophe, but there are several other risks that may be of a similar order of magnitude. This project is an attempt to contribute to the emerging field of X-risk research, whose aim is to identify and quantify these risks and to work out mitigation strategies. It can partly be seen as a continuation of the GoCAS (Gothenburg Chair Programme for Advanced Studies) guest researcher program Existential Risk to Humanity, which was held during September–October 2017 and featured creative discussions and many promising ideas for further work. Work in this new field may involve either (a) developing general frameworks for understanding the entire range of X-risks, or (b) studying more specific X-risks. The present project has subprojects in both categories.
One subproject of category (a) concerns how X-risk relates to so-called S-risk, the risk of creating astronomical amounts of suffering (typically via space colonization). Mitigating X-risk and mitigating S-risk can sometimes be conflicting goals, and deciding what to do in such cases requires deep philosophical and ethical consideration. A practical goal can be to look for actions that tend to decrease both kinds of risk.
Another subproject of category (a) is to develop a theory of how to take indexical information into account in Bayesian reasoning. No satisfactory theory of this kind exists to date, which, strictly speaking, puts us in a rather desperate epistemic situation (where we cannot even convincingly argue that we are not Boltzmann brains). In down-to-earth applications of Bayesian reasoning this is often ignored, but when X-risk is studied in relation to, e.g., Fermi’s paradox and the so-called Doomsday argument, the problems become obvious and impossible to ignore.
In category (b), several subprojects concern risks associated with a future breakthrough in artificial intelligence (AI) research leading to a machine that outperforms humans in what is referred to as general intelligence. One of these involves proposing strategies for so-called AI-in-a-box techniques that radically restrict the AI’s chances to influence the world – or, perhaps more realistically, showing that such strategies are unlikely to work for more than a temporary and rather brief period. Another concerns how to predict the motivations and goals of such an AI (in order ultimately to influence them, which has become known as the AI Alignment problem), and to critically scrutinize the best and pretty much only theory available for this: the Omohundro-Bostrom theory of instrumental vs final AI goals.
Professor at Chalmers, Mathematical Sciences, Applied Mathematics and Statistics
Funding Chalmers participation during 2017–2018