Bias Reduction in Sample-Based Optimization

Abstract We consider stochastic optimization problems that use observed data to estimate essential characteristics of the random quantities involved. Sample average approximation (SAA) and empirical (plug-in) estimation are popular ways to use data in optimization. It is well known that sample average optimization suffers from downward bias. We propose using smooth estimators rather than empirical ones in optimization problems. We establish consistency results for the optimal value and the set of optimal solutions of the new problem formulation. The performance of the proposed approach is compared to SAA theoretically and numerically. We analyze the bias of the new problems and identify sufficient conditions that ensure less biased estimation of the optimal value of the true problem, while the error of the new estimator remains controlled. We show that these conditions are satisfied for many popular statistical problems, such as regression models, classification problems, and optimization problems with Average (Conditional) Value-at-Risk. We observe that smoothing the least-squares objective in a regression problem by a normal kernel leads to ridge regression. Our numerical experience shows that the new estimators frequently also exhibit smaller variance and smaller mean-square error than those of SAA.
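The link between normal-kernel smoothing and ridge regression can be illustrated with a minimal sketch. It is not the paper's construction: it assumes a scalar regression through the origin, smooths only the covariate with a normal kernel of bandwidth h, and uses hypothetical data and function names. Under these assumptions, averaging the squared loss over x_i + hZ with Z ~ N(0, 1) adds the penalty h^2 b^2 to the empirical objective, which is a ridge objective. (Smoothing the response as well would only add a constant.)

```python
import math

# 3-point Gauss-Hermite rule for a standard normal variable Z.
# It is exact for polynomials in Z up to degree 5, so it integrates
# the quadratic loss below without sampling error.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def saa_objective(b, xs, ys):
    """Empirical (SAA) least-squares objective for a scalar slope b."""
    return sum((y - b * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def smoothed_objective(b, xs, ys, h):
    """Squared loss averaged over x_i + h*Z, Z ~ N(0, 1) (assumed smoothing scheme)."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += sum(w * (y - b * (x + h * z)) ** 2
                     for z, w in zip(NODES, WEIGHTS))
    return total / len(xs)


def ridge_objective(b, xs, ys, h):
    """SAA objective plus the ridge penalty h^2 * b^2."""
    return saa_objective(b, xs, ys) + h ** 2 * b ** 2


def ridge_slope(xs, ys, h):
    """Closed-form minimizer of ridge_objective for the scalar model."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + n * h ** 2)


# Hypothetical illustration data, not taken from the paper.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
h = 0.3

b_ridge = ridge_slope(xs, ys, h)
b_ols = ridge_slope(xs, ys, 0.0)  # h = 0 recovers ordinary least squares
```

In this sketch the smoothed and ridge objectives coincide for every slope b, and the bandwidth h plays the role of the regularization strength: a larger h shrinks the fitted slope further toward zero.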
