Reinforcement mechanism design

We put forward a modeling and algorithmic framework for designing and optimizing mechanisms in dynamic industrial environments, where a designer can use the data generated in the process to automatically improve future designs. Our solution, which we call reinforcement mechanism design, is rooted in game theory but incorporates recent AI techniques to remove unrealistic modeling assumptions and make automated optimization feasible. We instantiate our framework on key application scenarios at Baidu and Taobao, two of the largest mobile app companies in China. For Taobao, the framework automatically designs mechanisms that allocate buyer impressions on the e-commerce website; for Baidu, it automatically designs dynamic reserve pricing schemes for the search engine's advertisement auctions. Experiments show that, in both scenarios, our solutions outperform both state-of-the-art alternatives and the mechanisms currently deployed.
