The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies

Tackling real-world socio-economic challenges requires designing and testing economic policies. However, this is hard in practice because appropriate (micro-level) economic data are scarce and opportunities to experiment are limited. In this work, we train social planners that discover tax policies in dynamic economies that effectively trade off economic equality and productivity. We propose a two-level deep reinforcement learning approach to learn dynamic tax policies, based on economic simulations in which both agents and a government learn and adapt. Our data-driven approach does not make use of economic modeling assumptions, and learns from observational data alone. We make four main contributions. First, we present an economic simulation environment that features competitive pressures and market dynamics. We validate the simulation by showing that baseline tax systems perform in a way that is consistent with economic theory, including with regard to learned agent behaviors and specializations. Second, we show that AI-driven tax policies improve the trade-off between equality and productivity by 16% over baseline policies, including the prominent Saez tax framework. Third, we showcase several emergent features: AI-driven tax policies are qualitatively different from baselines, setting a higher top tax rate and higher net subsidies for low incomes. Moreover, AI-driven tax policies perform strongly in the face of emergent tax-gaming strategies learned by AI agents. Fourth, AI-driven tax policies are also effective when used in experiments with human participants. In experiments conducted on MTurk, an AI tax policy provides an equality-productivity trade-off that is similar to that of the Saez framework, along with higher inverse-income-weighted social welfare.
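The abstract reports an improved trade-off between equality and productivity but does not define the metric. As a minimal sketch, assume that equality is one minus a normalized Gini index of incomes, that productivity is total income, and that the trade-off is their product. These definitions and all function names below are illustrative assumptions, not taken from the abstract.

```python
from typing import Sequence


def gini(incomes: Sequence[float]) -> float:
    """Gini index from the mean absolute difference over all ordered pairs."""
    n = len(incomes)
    total = sum(incomes)
    if n == 0 or total == 0:
        return 0.0
    abs_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return abs_diff / (2 * n * total)


def equality(incomes: Sequence[float]) -> float:
    """Assumed equality measure: 1 - Gini, rescaled so that a single agent
    holding all income yields 0 and identical incomes yield 1."""
    n = len(incomes)
    if n < 2:
        return 1.0
    return 1.0 - gini(incomes) * n / (n - 1)


def productivity(incomes: Sequence[float]) -> float:
    """Assumed productivity measure: total income across agents."""
    return float(sum(incomes))


def tradeoff(incomes: Sequence[float]) -> float:
    """Assumed scalar objective: equality multiplied by productivity."""
    return equality(incomes) * productivity(incomes)


if __name__ == "__main__":
    for label, incomes in [("equal", [10, 10, 10, 10]),
                           ("concentrated", [0, 0, 0, 40])]:
        print(label, equality(incomes), productivity(incomes), tradeoff(incomes))
```

Under these assumptions, two economies with the same total income can score very differently: perfectly equal incomes keep the full productivity as the objective value, while fully concentrated income drives it to zero.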
