Optimization is one of the core concepts of agent foundations.
A one-line gloss of what we are trying to point at is: optimization steers the future into a smaller region of the state-space.
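One way to make this gloss quantitative, not stated above but following the common convention of measuring optimization in bits, is to count how many halvings of the state-space it takes to shrink it down to the region the process actually steers into. A minimal sketch, assuming a finite state-space and a hypothetical count of "target" states:

```python
import math

def optimization_bits(total_states: int, target_states: int) -> float:
    """Bits of optimization: log2 of how much the future was narrowed,
    i.e. how many halvings of the state-space reach the target region."""
    if not 0 < target_states <= total_states:
        raise ValueError("target region must be a non-empty subset of the state-space")
    return math.log2(total_states / target_states)

# Steering a 16-state system into a 2-state region is 3 bits of optimization.
print(optimization_bits(16, 2))  # 3.0
```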
It is generally agreed that optimization is a core concept in theoretical AI safety, but there is no agreement on exactly what it is, either formally or intuitively. Different researchers define it differently, and much of the discussion so far has focused on collectively deconfusing the field. Some key readings for understanding the agent foundations concept are:
AI is a type of optimization process. An optimization process steers the future, so the value of the future depends on the nature of that process. Since we don’t yet understand optimization very well in general, we cannot currently do much to ensure that this steering goes well.
While some optimization is dangerous, not all of it is.
An agent is a special type of optimization process. Agents seem especially dangerous because, unlike a ball rolling into a valley, agents are “trying” to make the relevant changes.
The terminology around optimization has not yet settled down. At Dovetail, we try to use the following terms consistently:
An order-theoretic optimizing trajectory has two measures of strength. These two measures also apply to all the other types of optimization.
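The two measures themselves are not spelled out in this excerpt. Purely as an illustration of the order-theoretic setup, the sketch below treats a trajectory as a sequence of states, uses a hypothetical rank function as the ordering, and computes two plausible (but not Dovetail-official) strength measures: how far up the ordering the trajectory climbs, and what fraction of its steps move upward.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")

def climb(trajectory: Sequence[S], rank: Callable[[S], float]) -> float:
    """How far up the ordering the trajectory moves overall (end rank minus start rank).
    `rank` is a hypothetical stand-in for the preference ordering on states."""
    return rank(trajectory[-1]) - rank(trajectory[0])

def steadiness(trajectory: Sequence[S], rank: Callable[[S], float]) -> float:
    """Fraction of steps that do not move down the ordering."""
    steps = list(zip(trajectory, trajectory[1:]))
    if not steps:
        return 1.0
    return sum(rank(b) >= rank(a) for a, b in steps) / len(steps)

# Example: a ball-in-valley style trajectory, ordered by negative height
# (lower in the valley = higher in the ordering).
heights = [5.0, 3.0, 3.5, 1.0, 0.2]
rank = lambda h: -h
print(climb(heights, rank))       # 4.8
print(steadiness(heights, rank))  # 0.75
```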