Ethical Artificial Intelligence

This book-length article combines several peer-reviewed papers and new material to analyze the issues of ethical artificial intelligence (AI). The behavior of future AI systems can be described by mathematical equations, which are adapted to analyze possible unintended AI behaviors and ways that AI designs can avoid them. This article makes the case for utility-maximizing agents and for avoiding infinite sets in agent definitions. It shows how to avoid agent self-delusion using model-based utility functions and how to avoid agents that corrupt their reward generators (sometimes called "perverse instantiation") using utility functions that evaluate outcomes at one point in time from the perspective of humans at a different point in time. It argues that agents can avoid unintended instrumental actions (sometimes called "basic AI drives" or "instrumental goals") by accurately learning human values. This article defines a self-modeling agent framework and shows how it can avoid problems of resource limits, being predicted by other agents, and inconsistency between the agent's utility function and its definition (one version of this problem is sometimes called "motivated value selection"). This article also discusses how future AI will differ from current AI, the politics of AI, and the ultimate use of AI to help understand the nature of the universe and our place in it.
