Computational Bayes-Predictive Stochastic Programming: Finite Sample Bounds

We study stochastic programming models in which the stochastic variable is known only up to a parametrized distribution function, whose parameter must be estimated from a set of independent and identically distributed (i.i.d.) samples. We take a Bayesian approach, positing a prior distribution over the unknown parameter and computing a posterior predictive distribution over future values of the stochastic variable. A data-driven stochastic program is then solved with respect to this posterior predictive distribution. While this is the standard Bayesian decision-theoretic approach, we focus on problems where computing the predictive distribution is intractable, a typical situation in modern applications with large datasets, high-dimensional parameters, and heterogeneity due to observed covariates and latent group structure. Rather than constructing sampling approximations to the intractable distribution using standard Markov chain Monte Carlo methods, we study computational approaches to decision-making based on the optimization-based methodology of variational Bayes. We consider two approaches: a two-stage approach, in which a posterior approximation is constructed and then used to solve the decision problem, and a joint approach, in which the variational optimization and the decision problem are solved together. We analyze the finite-sample performance of the optimal value and optimal decisions of the resulting data-driven stochastic programs.
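As a rough illustration of the two-stage approach, the sketch below fits a variational posterior and then solves a decision problem against the induced predictive distribution. Everything specific in it is an assumption made for illustration and is not taken from the paper: exponentially distributed demand with unknown rate, a Gamma prior on the rate, a log-normal variational family (for which the ELBO maximizer happens to have a closed form), a newsvendor cost as the stochastic program, and all function names.

```python
import numpy as np

def fit_lognormal_vb(samples, prior_shape=1.0, prior_rate=1.0):
    """Stage 1 (illustrative): variational Bayes with q(theta) = LogNormal(m, s2).

    Assumed model: x_i ~ Exponential(rate=theta), theta ~ Gamma(prior_shape, prior_rate).
    Up to a constant, the ELBO is
        (n + a) * m - (S + b) * exp(m + s2 / 2) + 0.5 * log(s2),
    with n = len(samples), S = sum(samples), a = prior_shape, b = prior_rate.
    Setting its gradient to zero gives the closed form used below.
    """
    n = len(samples)
    total = float(np.sum(samples))
    s2 = 1.0 / (n + prior_shape)
    m = np.log((n + prior_shape) / (total + prior_rate)) - 0.5 * s2
    return m, s2

def predictive_draws(m, s2, size, rng):
    """Draw from the approximate predictive: theta ~ q, then demand ~ Exponential(theta)."""
    theta = rng.lognormal(mean=m, sigma=np.sqrt(s2), size=size)
    return rng.exponential(scale=1.0 / theta)

def newsvendor_cost(order, demand, underage, overage):
    """Sample-average newsvendor cost (hypothetical decision problem)."""
    short = np.maximum(demand - order, 0.0)
    excess = np.maximum(order - demand, 0.0)
    return float(np.mean(underage * short + overage * excess))

def solve_newsvendor(demand, underage, overage):
    """Stage 2: the sample-average minimizer is the critical-ratio quantile."""
    return float(np.quantile(demand, underage / (underage + overage)))

# Usage: simulated data, then the two stages in sequence.
rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=50)   # synthetic observations
m, s2 = fit_lognormal_vb(data)
draws = predictive_draws(m, s2, size=200_000, rng=rng)
order = solve_newsvendor(draws, underage=3.0, overage=1.0)
print("order quantity:", order)
```

The joint approach described above would instead fold the decision cost into the variational objective and is not sketched here. In this toy model the exact posterior is available in closed form, so the variational step is unnecessary in practice; it is kept only to make the two-stage structure concrete.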
