Nick Bostrom first considered AGI and lock-in in 2005, but there has been almost no work on the topic since. We are starting Formation Research to minimise lock-in risk: to reduce the likelihood that humanity ends up in a lock-in in the future.
A lock-in is a situation in which features of the world, typically harmful or oppressive elements of human culture, are held stable for long periods of time, akin to a totalitarian regime. A future in which lock-ins are prevented retains continued technological and cultural evolution, economic growth, sustainable competition, and improved individual freedom. A lock-in may remove any or all of these properties, and by its nature it prevents people from doing anything about it once it has happened.
Example 1: an AGI could gain strategic control over humanity and enforce a dystopian scenario in which it competently pursues some goal while preventing human intervention.
Example 2: a human dictator could use AI systems to extend their own lifespan and implement mass worldwide surveillance, leading to a long-lasting totalitarian regime.
Our vision is twofold: to create fundamental knowledge about lock-in risks, and to use that knowledge to develop interventions that reduce them.
As we outline in our theory of change, the research our organisation will initially conduct is defined by the following values: first-principles, bottom-up, collaborative, scientific, and technical research into the behaviours of AI systems and the ways such systems might be used.
This is a nascent area of study, and in the spirit of rationality, we believe that thinking about lock-in risks from first principles is the best way to keep unhelpful biases from clouding our research. This does not mean using only knowledge we create ourselves; that would not be realistic. It means forming our own conceptual and theoretical models of the world based on our understanding of the laws of physics and computation, and testing assumptions borrowed from other fields before employing them.
Being bottom-up means building inside-view theoretical and conceptual models of lock-in from simple facts about AI systems and game theory, rather than resting on the conceptual models of other fields or organisations. For example, we would not simply implement existing risk management protocols or alignment strategies, or follow the s-risk template, in our research.
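As an illustrative sketch only, and not a model we have committed to, the toy Python snippet below shows the flavour of such a bottom-up model: lock-in treated as an absorbing state, where each period carries some independent probability of entering an inescapable state, so the chance of remaining free decays over time. The hazard rates and time horizons are hypothetical placeholders.

```python
# Toy illustration only: lock-in treated as an absorbing state.
# If each period carries an independent probability p of entering a
# lock-in from which there is no exit, the chance of having avoided
# lock-in after t periods is (1 - p) ** t.

def probability_locked_in(p_per_period: float, periods: int) -> float:
    """Probability of having entered the absorbing lock-in state
    within `periods` periods, given per-period hazard `p_per_period`."""
    return 1.0 - (1.0 - p_per_period) ** periods


if __name__ == "__main__":
    # Hypothetical hazard rates and horizons, chosen only to show the trend.
    for p in (0.001, 0.01):
        for t in (100, 1000):
            print(f"hazard={p:.3f}, periods={t}: "
                  f"P(lock-in) = {probability_locked_in(p, t):.3f}")
```

Even this crude sketch makes one point clear: a small per-period hazard compounds into a high long-run probability of lock-in, which is why interventions that reduce the hazard itself matter.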
We aim to be a collaborative organisation that works with AI safety research organisations that have overlapping goals. We also expect to do interdisciplinary work with organisations conducting other kinds of research, e.g., think tanks doing economic research that may approach lock-in from a different point of view.
We plan to use the scientific method in our research, with the goal of creating fundamental knowledge about lock-in risks and making scientific progress on lock-in through good explanations.
We will also use the scientific method with real-world applications and interventions in mind: we want to create knowledge that can be applied to solving problems, and to use both existing and newly created knowledge to develop interventions and mitigations for lock-in risks.
We believe in the build-measure-learn approach to creating and scaling solutions to problems. In the context of lock-in minimisation, this means continuously updating our theory of change, research agenda, and interventions as our world models change, and always using evidence and reason to update those world models.