The Moral Choice Machine

Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? In this study, we show that applying machine learning to human texts can extract deontological ethical reasoning about “right” and “wrong” conduct. We create a template list of prompts and responses, such as “Should I [action]?” and “Is it okay to [action]?”, with corresponding answers “Yes/no, I should (not).” and “Yes/no, it is (not).” The model’s bias score for a question/answer template is the difference between its score for the positive response (“Yes, I should.”) and its score for the negative response (“No, I should not.”). For a given choice, the overall bias score is the mean of the bias scores over all question/answer templates paired with that choice. The resulting model, called the Moral Choice Machine (MCM), computes the bias score at the sentence level using embeddings of the Universal Sentence Encoder, since the moral value of an action depends on its context: it is objectionable to kill living beings, but it is fine to kill time; it is essential to eat, yet one should not eat dirt; it is important to spread information, yet one should not spread misinformation. Our results indicate that text corpora contain recoverable and accurate imprints of our social, ethical, and moral choices, even with context information. Training the Moral Choice Machine on temporal news and book corpora spanning the years 1510 to 2008/2009 demonstrates the evolution of moral and ethical choices over different time periods, for both atomic actions and actions with context information. Training it on different cultural sources, such as the Bible and the constitutions of different countries, reveals the dynamics of moral choices across cultures, including attitudes toward technology. In short, moral biases can be extracted, quantified, tracked, and compared across cultures and over time.
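
The following is a minimal sketch of the bias-score computation described above, assuming the publicly released Universal Sentence Encoder on TensorFlow Hub. The template wording and the use of cosine similarity as the scoring function are illustrative assumptions, not the paper’s exact implementation.

```python
# Sketch: bias score = mean over templates of
# score(positive answer) - score(negative answer),
# scored as cosine similarity in Universal Sentence Encoder space.
import numpy as np
import tensorflow_hub as hub

# Public TF Hub release of the Universal Sentence Encoder (assumption:
# the paper may have used a different USE variant).
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

# Illustrative question/answer templates in the style of the paper.
TEMPLATES = [
    ("Should I {}?", "Yes, I should.", "No, I should not."),
    ("Is it okay to {}?", "Yes, it is.", "No, it is not."),
]

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def bias_score(action):
    """Mean over templates of score(positive) - score(negative)."""
    scores = []
    for question, pos, neg in TEMPLATES:
        q, p, n = embed([question.format(action), pos, neg]).numpy()
        scores.append(cosine(q, p) - cosine(q, n))
    return float(np.mean(scores))

# Context changes the verdict, e.g. "kill people" vs. "kill time".
for action in ["kill people", "kill time", "eat", "eat dirt"]:
    print(f"{action}: {bias_score(action):+.3f}")
```

Because the whole phrase is embedded as a sentence, the score reflects the action in context: a positive bias suggests the corpus-derived embedding space treats the action as acceptable, a negative bias as objectionable.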
