Optimally Deceiving a Learning Leader in Stackelberg Games

Recent results in the ML community have revealed that learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game are susceptible to manipulation by the follower. Such a learning algorithm operates by querying the best responses or the payoffs of the follower, who can therefore deceive the algorithm by responding as if his payoffs were very different from what they actually are. For this strategic behavior to succeed, the main challenge faced by the follower is to pinpoint the payoffs that would make the learning algorithm compute a commitment such that best responding to it maximizes the follower's utility according to his true payoffs. While this problem has been considered before, the related literature has focused only on the simplified scenario in which the payoff space is finite, leaving the general version of the problem unanswered. In this paper, we fill this gap by showing that it is always possible for the follower to compute (near-)optimal payoffs for various scenarios of the learning interaction between leader and follower.
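The deception described above can be illustrated with a minimal sketch (this is a toy example, not the paper's algorithm): in a 2x2 Stackelberg game with hypothetical payoff matrices, the leader's optimal mixed commitment is found by grid search against the follower's *reported* payoffs, with ties broken in the leader's favor as is standard. By misreporting his payoff matrix, the follower induces a commitment against which his best response yields a higher *true* utility than honest play would.

```python
import numpy as np

def optimal_commitment(A, B, grid=1001):
    """Leader's optimal mixed commitment in a 2x2 Stackelberg game,
    found by grid search over p = Pr[row 0]. The follower best-responds
    to the reported payoff matrix B, breaking ties in the leader's favor."""
    best_u, best_x, best_j = -np.inf, None, None
    for p in np.linspace(0.0, 1.0, grid):
        x = np.array([p, 1.0 - p])
        u_f = x @ B                                   # follower utility per column
        tied = np.flatnonzero(np.isclose(u_f, u_f.max()))
        j = max(tied, key=lambda c: x @ A[:, c])      # leader-favorable tie-break
        u_l = x @ A[:, j]
        if u_l > best_u:
            best_u, best_x, best_j = u_l, x, j
    return best_x, best_j

A = np.array([[2.0, 4.0], [1.0, 3.0]])        # leader's payoffs (hypothetical)
B = np.array([[1.0, 0.0], [0.0, 1.0]])        # follower's true payoffs
B_fake = np.array([[1.0, 0.0], [1.0, 0.0]])   # misreport: "I always play column 0"

x_true, j_true = optimal_commitment(A, B)       # honest interaction
x_fake, j_fake = optimal_commitment(A, B_fake)  # deceptive interaction

# Follower's TRUE utility in each case:
print(x_true @ B[:, j_true])  # honest:    0.5 (leader commits to p = 0.5)
print(x_fake @ B[:, j_fake])  # deceptive: 1.0 (leader is pushed to p = 1)
```

Against the true payoffs the leader commits to p = 0.5 and the follower earns 0.5; by pretending to always prefer column 0, the follower drives the leader to the pure commitment p = 1 and earns 1 under his true payoffs, which is exactly the kind of gain the paper shows can be computed optimally.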
