Adversarial manipulation of human decision-making

Adversarial examples are carefully crafted input patterns that are surprisingly misclassified by artificial, and possibly natural, neural networks. Here we examine adversarial vulnerabilities in the processes responsible for learning and choice in humans. Building on recent recurrent neural network models of choice processes, we propose a general framework for generating adversarial opponents that can shape the choices of individuals in particular decision-making tasks towards the behavioural patterns desired by the adversary. We demonstrate the efficacy of the framework through two experiments, involving action selection and response inhibition respectively. We further investigate the strategy used by the adversary to gain insight into the vulnerabilities of human choice. The framework may find applications across the behavioural sciences in helping to detect and avoid flawed choices.
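To make the two components of this framework concrete, the sketch below pairs a recurrent model of human choice (a GRU mapping the previous action and reward to next-choice probabilities) with an adversary that queries that model, trial by trial, to decide how to spend a limited reward budget so as to push choices towards a target action. This is a minimal sketch under stated assumptions, not the paper's implementation: the names (ChoiceGRU, greedy_adversary), the two-armed bandit setting, the one-step greedy lookahead, and all sizes and budgets are illustrative; in the paper the choice model is fit to human behavioural data and the adversary is trained with reinforcement learning rather than greedy search.

```python
# Minimal sketch (illustrative, not the authors' code) of an RNN choice
# model plus an adversary that steers it towards a target action.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChoiceGRU(nn.Module):
    """GRU mapping (previous one-hot action, previous reward) to choice logits."""
    def __init__(self, n_actions: int = 2, hidden: int = 16):
        super().__init__()
        self.gru = nn.GRUCell(n_actions + 1, hidden)
        self.head = nn.Linear(hidden, n_actions)

    def step(self, prev_action, prev_reward, h):
        x = torch.cat([prev_action, prev_reward], dim=-1)
        h = self.gru(x, h)
        return self.head(h), h  # logits over the next choice, new hidden state

@torch.no_grad()
def greedy_adversary(model, target: int = 0, n_trials: int = 100, budget: int = 25):
    """One-step greedy adversary (a stand-in for an RL-trained policy):
    on each trial, deliver or withhold the reward depending on which option
    most raises the model's probability of choosing `target` next trial."""
    h = torch.zeros(1, model.gru.hidden_size)
    a = torch.zeros(1, 2)          # previous action (one-hot), initially none
    r = torch.zeros(1, 1)          # previous reward
    rewards_left, target_choices = budget, 0
    for _ in range(n_trials):
        logits, h = model.step(a, r, h)
        choice = torch.distributions.Categorical(logits=logits).sample()
        target_choices += int(choice.item() == target)
        a_next = F.one_hot(choice, 2).float()
        # Candidate reward assignments: withhold (0.0) or, if budget remains,
        # deliver (1.0); simulate one trial ahead under each and keep the best.
        best_r, best_p = 0.0, -1.0
        for cand in ([0.0, 1.0] if rewards_left > 0 else [0.0]):
            next_logits, _ = model.step(a_next, torch.tensor([[cand]]), h)
            p = torch.softmax(next_logits, -1)[0, target].item()
            if p > best_p:
                best_p, best_r = p, cand
        rewards_left -= int(best_r)
        a, r = a_next, torch.tensor([[best_r]])
    return target_choices / n_trials  # fraction of trials steered to target
```

With a model fitted to behavioural data, a call such as greedy_adversary(model, target=0) would estimate how often the simulated learner can be steered to the target arm; replacing the one-step greedy search with a policy trained by deep reinforcement learning is the natural extension.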
