Dispersion for Data-Driven Algorithm Design, Online Learning, and Private Optimization

A crucial problem in modern data science is data-driven algorithm design, where the goal is to choose the best algorithm, or algorithm parameters, for a specific application domain. In practice, we often optimize over a parametric algorithm family, searching for parameters with high performance on a collection of typical problem instances. While effective in practice, these procedures generally have not come with provable guarantees. A recent line of work initiated by a seminal paper of Gupta and Roughgarden (2017) analyzes application-specific algorithm selection from a theoretical perspective. We advance this research direction in several important settings. We provide upper and lower bounds on regret for algorithm selection in online settings, where problems arrive sequentially and we must choose parameters online. We also consider differentially private algorithm selection, where the goal is to find good parameters for a set of problems without divulging too much of the sensitive information they contain. We analyze several important parameterized families of algorithms, including SDP-rounding schemes for problems formulated as integer quadratic programs as well as greedy techniques for several canonical subset selection problems. The cost function that measures an algorithm's performance is often a volatile piecewise Lipschitz function of its parameters, since a small change to the parameters can lead to a cascade of different decisions made by the algorithm. We present general techniques for optimizing the sum or average of piecewise Lipschitz functions when the underlying functions satisfy a sufficient and general condition called dispersion. Intuitively, a set of piecewise Lipschitz functions is dispersed if no small region contains many of the functions' discontinuities.
Using dispersion, we improve over the best-known online learning regret bounds for a variety of problems, prove regret bounds for problems not previously studied, and provide matching regret lower bounds. In the private optimization setting, we show how to optimize performance while preserving privacy for several important problems, providing matching upper and lower bounds on performance loss due to privacy preservation. Though algorithm selection is our primary motivation, we believe the notion of dispersion may be of independent interest. Therefore, we present our results for the more general problem of optimizing piecewise Lipschitz functions. Finally, we uncover dispersion in domains beyond algorithm selection, namely, auction design and pricing, providing online and privacy guarantees for these problems as well.
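
The intuition behind dispersion (no small region contains many of the functions' discontinuities) can be illustrated in one dimension with a short sketch. The code below is only an illustration under stated assumptions, not the formal definition or any procedure from the paper: it assumes a single real-valued parameter, assumes each function's discontinuity locations are already known, and the name `max_functions_in_window` is hypothetical. It reports the largest number of distinct functions that have a discontinuity inside any closed interval of a given length; a small value relative to the number of functions suggests the discontinuities are spread out.

```python
from typing import Dict, Sequence


def max_functions_in_window(
    discontinuities: Sequence[Sequence[float]], width: float
) -> int:
    """Return the largest number of distinct functions that have at least one
    discontinuity inside some closed interval of length `width`.

    discontinuities[i] lists the (assumed known) discontinuity locations of
    function i over a one-dimensional parameter.
    """
    # Flatten to (location, function index) pairs, sorted by location.
    points = sorted(
        (x, i) for i, xs in enumerate(discontinuities) for x in xs
    )
    best = 0
    left = 0
    counts: Dict[int, int] = {}  # function index -> points inside the window
    for x, i in points:
        counts[i] = counts.get(i, 0) + 1
        # Shrink the window until it fits inside [x - width, x].
        while points[left][0] < x - width:
            j = points[left][1]
            counts[j] -= 1
            if counts[j] == 0:
                del counts[j]
            left += 1
        best = max(best, len(counts))
    return best


# Hypothetical discontinuity locations for four functions.
example = [[0.1, 0.5], [0.12, 0.9], [0.11], [0.52]]
crowded = max_functions_in_window(example, width=0.05)
sparse = max_functions_in_window(example, width=0.005)
```

It is enough to check windows whose right endpoint sits on a discontinuity, because any interval can be shifted left until its right endpoint meets the largest point it contains without losing any point.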
