Asymptotically Unambitious Artificial General Intelligence

General intelligence, the ability to solve arbitrary solvable problems, is believed by many to be artificially constructible. Narrow intelligence, the ability to solve a given, particularly difficult problem, has seen impressive recent progress; notable examples include self-driving cars, Go engines, image classifiers, and translators. Artificial General Intelligence (AGI) presents dangers that narrow intelligence does not: if something smarter than us across every domain were indifferent to our concerns, it would pose an existential threat to humanity, just as we threaten many species despite bearing them no ill will. Even the theory of how to keep an AGI's goals aligned with our own has proven highly elusive. We present the first algorithm we are aware of for asymptotically unambitious AGI, where "unambitiousness" includes not seeking arbitrary power. We thereby identify an exception to the Instrumental Convergence Thesis, which holds, roughly, that by default an AGI would seek power, including power over us.
