AI system

AI safety is concerned with the potentially unprecedented risks arising from certain kinds of upcoming technology. We want to be able to refer to these systems of interest without committing to claims about their nature, so throughout this wiki we use “AI system” as a catch-all term. We intend it to cover all machine learning systems, and any future systems that people would naturally consider AI.

We use the word “system” to connote that these systems will typically be made of parts interacting in complex ways. We expect there to be systems for which it is unclear whether they are one distributed “agent” or many cooperating “agents”.

We won’t find it useful here to define “AI” or even “intelligence”. Instead we’ll lean on concepts like prediction, world modelling, and utility maximization, which we hope are less ambiguous. We’ll also use terms like abstraction, values, optimization, and agency; a major facet of our research goals is to give convincing arguments (formal and informal) for particular formal definitions of these concepts.