Small-Data, Large-Scale Linear Optimization with Uncertain Objectives

Optimization applications often depend upon a huge number of uncertain parameters. In many contexts, however, the amount of relevant data per parameter is small, and hence, we may have only imprecise estimates. We term this setting -- where the number of uncertainties is large, but all estimates have fixed and low precision -- the "small-data, large-scale regime." We formalize a model for this regime, focusing on linear programs with uncertain objective coefficients, and prove that the small-data, large-scale regime is distinct from the traditional, large-sample regime. Consequently, methods like sample average approximation, data-driven robust optimization, regularization, and "estimate-then-optimize" policies can perform poorly. We propose a novel framework that, given a policy class, identifies an asymptotically best-in-class policy, where the asymptotics hold as the number of uncertain parameters grows large, but the amount of data per uncertainty (and hence the estimate's precision) remains small. We apply our approach to two natural policy classes for this problem: the first inspired by the empirical Bayes literature in statistics and the second by the regularization literature in optimization and machine learning. In both cases, the sub-optimality gap between our proposed method and the best-in-class policy decays exponentially fast in the number of uncertain parameters, even for a fixed amount of data. We also show that in the usual large-sample regime our policies are comparable to the sample average approximation. Thus, our policies retain the strong large-sample performance of traditional methods and additionally enjoy provably strong performance in the small-data, large-scale regime. Numerical experiments confirm the significant benefits of our methods.
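To make the setting concrete, the sketch below simulates a fractional knapsack linear program whose objective coefficients are observed only through noisy estimates of fixed, low precision. It compares the plain "estimate-then-optimize" plug-in policy with a family of shrinkage policies loosely in the spirit of empirical Bayes. Everything here is an illustrative assumption rather than the paper's actual model or algorithm: the knapsack structure, the Gaussian noise, the names `fractional_knapsack`, `shrunk_estimate`, and `oracle_performance`, and the grid of shrinkage parameters `tau` are all hypothetical. The comparison is also an oracle one, since it scores each policy against the true coefficients, which are known only in simulation. It says nothing about how a data-driven method would choose `tau`.

```python
import numpy as np


def fractional_knapsack(values, costs, budget):
    """Greedy solution of max values @ x s.t. costs @ x <= budget, 0 <= x <= 1.

    Assumes all costs are strictly positive.
    """
    x = np.zeros(len(values), dtype=float)
    order = np.argsort(-values / costs)  # best value-per-cost first
    remaining = budget
    for j in order:
        # Later items have no better ratio, so stop at the first
        # non-positive value or once the budget is exhausted.
        if values[j] <= 0 or remaining <= 0:
            break
        x[j] = min(1.0, remaining / costs[j])
        remaining -= x[j] * costs[j]
    return x


def shrunk_estimate(mu_hat, precision, tau, mu0=0.0):
    """Posterior mean under a N(mu0, 1/tau) prior and Gaussian noise.

    tau = 0 returns the raw estimate mu_hat (the plug-in policy).
    """
    return (precision * mu_hat + tau * mu0) / (precision + tau)


def oracle_performance(n=2000, tau_grid=(0.0, 0.5, 1.0, 2.0, 4.0), seed=0):
    """True objective value achieved by each shrinkage policy (simulation only)."""
    rng = np.random.default_rng(seed)
    mu = rng.normal(0.0, 1.0, size=n)            # true coefficients (unknown in practice)
    precision = rng.uniform(0.1, 2.0, size=n)    # fixed, low, heterogeneous precisions
    mu_hat = mu + rng.normal(size=n) / np.sqrt(precision)  # one noisy estimate each
    costs = rng.uniform(0.5, 1.5, size=n)
    budget = 0.1 * n
    perf = {}
    for tau in tau_grid:
        x = fractional_knapsack(shrunk_estimate(mu_hat, precision, tau), costs, budget)
        perf[tau] = float(mu @ x)                # scored against the true coefficients
    return perf


if __name__ == "__main__":
    perf = oracle_performance()
    for tau, value in perf.items():
        print(f"tau={tau:4.1f}  true objective={value:8.2f}")
```

Because the precisions differ across coefficients, shrinkage changes the ranking of items rather than merely rescaling the estimates, so different values of `tau` can select different solutions. The plug-in policy corresponds to `tau = 0` and is therefore one member of the simulated class.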
