From Machine Ethics To Machine Explainability and Back

We find ourselves surrounded by a rapidly increasing number of autonomous and semi-autonomous systems. Two grand challenges arise from this development: Machine Ethics and Machine Explainability. Machine Ethics, on the one hand, is concerned with behavioral constraints for systems, set up in a formal, unambiguous, algorithmizable, and implementable way, so that morally acceptable, restricted behavior results; Machine Explainability, on the other hand, enables systems to explain their actions and argue for their decisions, so that human users can understand and justifiably trust them. In this paper, we stress the need to link and cross-fertilize these two areas. We point out how Machine Ethics calls for Machine Explainability, and how Machine Explainability involves Machine Ethics. We develop both these facets based on a toy example from the context of medical care robots. In this context, we argue that moral behavior, even if it were verifiable and verified, is not enough to establish justified trust in an autonomous system. It needs to be supplemented with the ability to explain decisions and should thus be supplemented by a Machine Explanation component. Conversely, such explanations need to refer to the system's model- and constraint-based Machine Ethics reasoning. We propose to apply a framework of formal argumentation theory for the task of generating useful explanations of the Machine Explanation component, and we sketch out how the content of the arguments must use the moral reasoning of the Machine Ethics component.
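The argumentation-theoretic framework mentioned above can be illustrated with a minimal sketch of a Dung-style abstract argumentation framework, computing the grounded extension (the most skeptical set of jointly defensible arguments). The arguments and attack relation below are hypothetical illustrations for a care-robot scenario, not taken from the paper itself.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension of an abstract argumentation
    framework by iterating the characteristic function
    F(S) = {a | every attacker of a is attacked by some member of S}
    starting from the empty set, until a fixed point is reached."""
    def defended(a, s):
        attackers = {x for (x, y) in attacks if y == a}
        return all(any((d, x) in attacks for d in s) for x in attackers)

    extension = set()
    while True:
        new = {a for a in arguments if defended(a, extension)}
        if new == extension:
            return extension
        extension = new

# Hypothetical toy scenario: the care robot argues it must alert a
# doctor (A); the patient's privacy objection (B) attacks A; an
# overriding safety norm (C) attacks B.
args = {"A", "B", "C"}
atts = {("B", "A"), ("C", "B")}
print(sorted(grounded_extension(args, atts)))  # → ['A', 'C']
```

An explanation component could then present the accepted arguments, together with the attacks they survive, as the reasons behind the system's decision.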
