Taking Principles Seriously: A Hybrid Approach to Value Alignment

An important step in the development of value alignment (VA) systems in AI is understanding how VA can reflect valid ethical principles. We propose that designers of VA systems incorporate ethics through a hybrid approach in which both ethical reasoning and empirical observation play a role. This, we argue, avoids committing the “naturalistic fallacy,” which is an attempt to derive “ought” from “is,” and it provides a more adequate form of ethical reasoning once the fallacy is avoided. Using quantified modal logic, we precisely formulate principles derived from deontological ethics and show how they imply particular “test propositions” for any given action plan in an AI rule base. The action plan is ethical only if the test proposition is empirically true, a judgment that is made on the basis of empirical VA. This permits empirical VA to integrate seamlessly with independently justified ethical principles.
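
The division of labor described above can be illustrated with a minimal Python sketch. It is not the authors' formalization: the names `ActionPlan`, `generalization_test`, `make_empirical_va`, and `is_permitted` are hypothetical, the test proposition is a plain string rather than a formula of quantified modal logic, and the empirical VA component is stood in for by a lookup table. The sketch only shows the flow: a principle maps an action plan to a test proposition, and the plan passes only if the empirical side judges that proposition true.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class ActionPlan:
    """A rule of the form 'if condition, then do action' (hypothetical shape)."""
    condition: str
    action: str


# A principle maps an action plan to the test proposition it implies.
Principle = Callable[[ActionPlan], str]
# The empirical side judges whether a test proposition is true.
EmpiricalJudge = Callable[[str], bool]


def generalization_test(plan: ActionPlan) -> str:
    """Illustrative stand-in for a deontological principle (an assumption)."""
    return (f"'{plan.action}' would still achieve its purpose if everyone "
            f"for whom '{plan.condition}' holds did the same")


def make_empirical_va(beliefs: Dict[str, bool]) -> EmpiricalJudge:
    """Stand-in for empirical VA: a lookup table of observed judgments.

    Assumption: propositions with no evidence are treated as not established.
    """
    return lambda proposition: beliefs.get(proposition, False)


def is_permitted(plan: ActionPlan,
                 principles: Iterable[Principle],
                 judge: EmpiricalJudge) -> bool:
    """A plan passes only if every implied test proposition is judged true."""
    return all(judge(principle(plan)) for principle in principles)


honest = ActionPlan("a customer asks about a defect", "disclose the defect")
deceptive = ActionPlan("a customer asks about a defect", "deny the defect")

judge = make_empirical_va({
    generalization_test(honest): True,
    generalization_test(deceptive): False,
})

honest_ok = is_permitted(honest, [generalization_test], judge)
deceptive_ok = is_permitted(deceptive, [generalization_test], judge)
```

Keeping the principle and the empirical judge as separate callables mirrors the claim that the ethical principle is justified independently while only the truth of the test proposition is settled by observation.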
