Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions

Motivated by pricing in ad exchange markets, we consider the problem of robustly learning reserve prices against strategic buyers in repeated contextual second-price auctions. Buyers' valuations for an item depend on the context that describes the item. However, the seller does not know the relationship between the context and buyers' valuations, i.e., buyers' preferences. The seller's goal is to design a learning policy that sets reserve prices by observing past sales data, and her objective is to minimize her revenue regret, where the regret is computed against a clairvoyant policy that knows buyers' heterogeneous preferences. Given the seller's goal, utility-maximizing buyers have an incentive to bid untruthfully in order to manipulate the seller's learning policy. We propose two learning policies that are robust to such strategic behavior. These policies use the outcomes of the auctions, rather than the submitted bids, to estimate the preferences while controlling the long-term effect of the outcome of each auction on future reserve prices. The first policy, called Contextual Robust Pricing (CORP), is designed for the setting where the market noise distribution is known to the seller and achieves a $T$-period regret of $O(d\log(Td) \log (T))$, where $d$ is the dimension of the contextual information. The second policy, a variant of the first, is called Stable CORP (SCORP). It is tailored to the setting where the market noise distribution is unknown to the seller and belongs to an ambiguity set. We show that the SCORP policy has a $T$-period regret of $O(\sqrt{d\log(Td)}\;T^{2/3})$.
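
To make the auction format concrete, the following is a minimal sketch of a single second-price auction with a reserve price, using the standard rule that the highest bidder wins only if her bid meets the reserve and then pays the larger of the reserve and the second-highest bid. The function name `second_price_with_reserve`, its signature, and the example bids are illustrative assumptions, not part of the paper; the sketch does not implement CORP or SCORP, only the per-round outcome those policies observe.

```python
from typing import Optional, Sequence, Tuple


def second_price_with_reserve(
    bids: Sequence[float], reserve: float
) -> Tuple[Optional[int], float]:
    """Run one second-price auction with a reserve price.

    Hypothetical helper for illustration only.
    Returns (index of the winning bidder or None, seller revenue).
    """
    if not bids:
        return None, 0.0
    # Highest bidder; ties are broken in favor of the lowest index.
    winner = max(range(len(bids)), key=lambda i: bids[i])
    if bids[winner] < reserve:
        # Reserve not met: the item goes unsold and revenue is zero.
        return None, 0.0
    others = [b for i, b in enumerate(bids) if i != winner]
    second_highest = max(others) if others else 0.0
    # The winner pays the larger of the reserve and the second-highest bid.
    return winner, max(second_highest, reserve)


if __name__ == "__main__":
    bids = [5.0, 3.0, 1.0]
    for r in (2.0, 4.0, 6.0):
        print(r, second_price_with_reserve(bids, r))
```

With these example bids, raising the reserve from 2.0 to 4.0 raises revenue from 3.0 to 4.0, while a reserve of 6.0 leaves the item unsold. The outcome (who won and at what price) depends on the bids only through comparisons with the reserve and with each other, which is the kind of auction outcome the abstract says the policies use in place of the submitted bids.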
