Motivated Value Selection for Artificial Agents

Coding values (or preferences) directly into an artificial agent is a very challenging task, while value selection (or value-learning, or value-loading) allows agents to learn values interactively from their programmers, other humans, or their environments. However, there is a conflict between an agent learning its future values and following its current values: the agent is motivated to manipulate the value selection process so that the values it ends up with are favourable according to the values it holds now. This paper establishes the conditions under which motivated value selection is an issue for some types of agents, and presents an example of an `indifferent' agent that avoids it entirely. This poses and solves an issue that, to the author's knowledge, has not previously been formally addressed in the literature.
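
As a rough illustrative sketch (the notation here is generic and not necessarily that used in the body of the paper): let $u$ be the agent's current utility function, let $v$ range over candidate future utility functions, and suppose an action $a$ can influence both the world and which $v$ the value selection process returns. A naive maximizer judges everything, including changes to its own values, by $u$:
\[
a^{*} \;=\; \arg\max_{a} \; \sum_{v} P(v \mid a)\, \mathbb{E}\!\left[ u \mid a, v \right],
\]
so whenever $P(v \mid a)$ depends on $a$, the agent is rewarded for steering value selection toward those $v$ under which its future behaviour scores well by $u$, rather than for learning values faithfully. Roughly speaking, an `indifferent' agent in this sense would be one whose evaluation of actions is made insensitive to which $v$ is selected, removing that manipulation incentive.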