Aligning Superintelligence with Human Interests: A Technical Research Agenda

The property that has given humans a dominant advantage over other species is not strength or speed, but intelligence. If progress in artificial intelligence continues unabated, AI systems will eventually exceed humans in general reasoning ability. A system that is “superintelligent” in the sense of being “smarter than the best human brains in practically every field” could have an enormous impact upon humanity (Bostrom 2014). Just as human intelligence has allowed us to develop tools and strategies for controlling our environment, a superintelligent system would likely be capable of developing its own tools and strategies for exerting control (Muehlhauser and Salamon 2012). In light of this potential, it is essential to use caution when developing AI systems that can exceed human levels of general intelligence, or that can facilitate the creation of such systems.

[1] Kurt Gödel, et al. On undecidable propositions of formal mathematical systems, 1934.

[2] A. Wald. Contributions to the Theory of Statistical Estimation and Testing Hypotheses, 1939.

[3] E. Lehmann. Some Principles of the Theory of Testing Hypotheses, 1950.

[4] Claude E. Shannon, et al. XXII. Programming a Computer for Playing Chess, 1950.

[5] J. Łoś. On the axiomatic treatment of probability, 1955.

[6] Ray J. Solomonoff, et al. A Formal Theory of Inductive Inference. Part I, 1964, Inf. Control.

[7] H. Gaifman. Concerning measures in first order calculi, 1964.

[8] I. J. Good, et al. Speculations Concerning the First Ultraintelligent Machine, 1965, Adv. Comput.

[9] E. Eells. Causal Decision Theory, 1984, PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association.

[10] P. S. Tasker, et al. Department of Defense Trusted Computer System Evaluation Criteria, 1985.

[11] Vernor Vinge, et al. The Coming Technological Singularity: How to Survive in the Post-human Era, 1993.

[12] Oren Etzioni, et al. The First Law of Robotics (A Call to Arms), 1994, AAAI.

[13] Elchanan Ben-Porath, et al. Rationality, Nash Equilibrium and Backwards Induction in Perfect-Information Games, 1997.

[14] James M. Joyce. The Foundations of Causal Decision Theory, 1999.

[15] Marcus Hutter, et al. A Theory of Universal Artificial Intelligence based on Algorithmic Complexity, 2000, ArXiv.

[16] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[17] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.

[18] Jon Bird, et al. The evolved radio and its implications for modelling the evolution of novel sensors, 2002, Proceedings of the 2002 Congress on Evolutionary Computation (CEC'02).

[19] Joseph Y. Halpern. Reasoning about uncertainty, 2003.

[20] Haim Gaifman, et al. Reasoning with Limited Resources and Assigning Probabilities to Arithmetical Statements, 2004, Synthese.

[21] John McCarthy, et al. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence, August 31, 1955, 2006, AI Mag.

[22] Eliezer Yudkowsky. Artificial Intelligence as a Positive and Negative Factor in Global Risk, 2006.

[23] Shane Legg, et al. Universal Intelligence: A Definition of Machine Intelligence, 2007, Minds and Machines.

[24] Stephen M. Omohundro, et al. The Basic AI Drives, 2008, AGI.

[25] Peter de Blanc. Ontological Crises in Artificial Agents' Value Systems, 2011, ArXiv.

[26] Eliezer Yudkowsky, et al. Complex Value Systems in Friendly AI, 2011, AGI.

[27] Luke Muehlhauser, et al. Intelligence Explosion: Evidence and Import, 2012.

[28] Nick Bostrom, et al. Thinking Inside the Box: Controlling and Using an Oracle AI, 2012, Minds and Machines.

[29] Abram Demski. Logical Prior Probability, 2012, AGI.

[30] S. Brams, et al. Prisoners' Dilemma is a Newcomb Problem, 2013.

[31] Marcus Hutter, et al. Probabilities on Sentences in an Expressive Logic, 2012, J. Appl. Log.

[32] Eliezer Yudkowsky, et al. Tiling Agents for Self-Modifying AI, and the Löbian Obstacle, 2013.

[33] N. Soares. Tiling agents in causal graphs, 2014.

[34] Stuart J. Russell. Unifying Logic and Probability: A New Dawn for AI?, 2014, IPMU.

[35] Benja Fallenstein, et al. Robust Cooperation in the Prisoner's Dilemma: Program Equilibrium via Provability Logic, 2014, ArXiv.

[36] Paul Christiano. Non-Omniscience, Probabilistic Inference, and Metamathematics, 2014.

[38] Benja Fallenstein, et al. Problems of Self-reference in Self-improving Space-Time Embedded Intelligence, 2014, AGI.

[39] C. Aitken, et al. The logic of decision, 2014.

[40] Daniel Hintze. Problem Class Dominance in Predictive Dilemmas, 2014.

[41] Benja Fallenstein, et al. Questions of Reasoning Under Logical Uncertainty, 2015.

[42] Benja Fallenstein, et al. Toward Idealized Decision Theory, 2015, ArXiv.

[43] Nate Soares, et al. Formalizing Two Problems of Realistic World-Models, 2015.

[44] Benja Fallenstein, et al. Vingean Reflection: Reliable Reasoning for Self-Improving Agents, 2015.

[45] C. Robert. Superintelligence: Paths, Dangers, Strategies, 2017.