Contextual Search for General Hypothesis Classes

In the contextual pricing problem, a seller repeatedly, over $T$ rounds, receives products, each described by an adversarially chosen feature vector in $\mathbb{R}^d$, and observes only the purchasing decisions of a buyer with a fixed but unknown linear valuation over the products. The regret measures the difference between the revenue the seller could have obtained knowing the buyer's valuation and the revenue obtained by the learning algorithm. We give a polynomial-time algorithm for contextual pricing with $O(d \log \log T + d \log d)$ regret, which matches the $\Omega(d \log \log T)$ lower bound up to the additive $d \log d$ term. If we replace the pricing loss by the symmetric loss, we obtain an algorithm with nearly optimal regret of $O(d \log d)$, matching the $\Omega(d)$ lower bound up to a $\log d$ factor. These algorithms are based on a novel technique of bounding the value of the Steiner polynomial of a convex region at various scales. The Steiner polynomial is a degree-$d$ polynomial whose coefficients are the intrinsic volumes. We also study a generalized version of contextual search in which the hidden linear function over Euclidean space is replaced by a hidden function $f : \mathcal{X} \rightarrow \mathcal{Y}$ from a given hypothesis class $\mathcal{H}$. We provide a generic algorithm with $O(d^2)$ regret, where $d$ is the covering dimension of this class. In particular, this leads to a $\tilde{O}(s^2)$-regret algorithm for linear contextual search when the linear function is guaranteed to be $s$-sparse. Finally, we extend our results to the noisy feedback model, in which the feedback in each round is flipped with a fixed probability $p < 1/2$.
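
For reference, the Steiner polynomial mentioned above can be written out explicitly. The display below is the standard Steiner formula from convex geometry, not a statement taken from this abstract. The symbols $K$ (a convex body in $\mathbb{R}^d$), $B^d$ (the unit ball), $\varepsilon \ge 0$ (the scale), $\kappa_m$ (the volume of the unit ball in $\mathbb{R}^m$), and $V_j$ (the $j$-th intrinsic volume) are notation assumed here for illustration.

$$\mathrm{Vol}\left(K + \varepsilon B^d\right) = \sum_{j=0}^{d} \kappa_{d-j} \, V_j(K) \, \varepsilon^{d-j}$$

Under this normalization $V_d(K)$ is the ordinary volume of $K$ and $V_0(K) = 1$, so the right-hand side is a degree-$d$ polynomial in $\varepsilon$ with leading coefficient $\kappa_d$. For example, for $d = 2$ it reads $\mathrm{Area}(K) + \mathrm{Perimeter}(K)\,\varepsilon + \pi \varepsilon^2$, since $V_1(K)$ is half the perimeter and $\kappa_1 = 2$.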
