Motivated Value Selection for Artificial Agents

Coding values (or preferences) directly into an artificial agent is a very challenging task, while value selection (or value-learning, or value-loading) allows agents to learn values interactively from their programmers, other humans, or their environments. However, there is a conflict between an agent learning its future values and following its current values: the agent is motivated to manipulate the value selection process so that the values it ends up with are favourable according to the values it holds now. This paper establishes the conditions under which motivated value selection is an issue for some types of agents, and presents an example of an `indifferent' agent that avoids it entirely. This poses and solves an issue that, to the author's knowledge, has not previously been formally addressed in the literature.
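
As a rough illustrative sketch (the notation here is generic and not necessarily that used in the body of the paper): let $u$ be the agent's current utility function, let $v$ range over candidate future utility functions, and suppose an action $a$ can influence both the world and which $v$ the value selection process returns. A naive maximizer judges everything, including changes to its own values, by $u$:
\[
a^{*} \;=\; \arg\max_{a} \; \sum_{v} P(v \mid a)\, \mathbb{E}\!\left[ u \mid a, v \right],
\]
so whenever $P(v \mid a)$ depends on $a$, the agent is rewarded for steering value selection toward those $v$ under which its future behaviour scores well by $u$, rather than for learning values faithfully. Roughly speaking, an `indifferent' agent in this sense would be one whose evaluation of actions is made insensitive to which $v$ is selected, removing that manipulation incentive.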