Efficient Sampling from Time-Varying Log-Concave Distributions

We propose a computationally efficient random walk on a convex body that mixes rapidly and closely tracks a time-varying log-concave distribution. We develop general theoretical guarantees on the required number of steps; this number can be calculated on the fly from the distance to, and the shape of, the next distribution. We then illustrate the technique on several examples. Within the context of exponential families, the proposed method produces samples from a posterior distribution that is updated as data arrive in a streaming fashion. The sampling technique can also be used to track time-varying truncated distributions and to obtain samples from a changing mixture model fitted to streaming data. In the setting of linear optimization, the proposed method achieves oracle complexity with the best known dependence on the dimension for certain geometries. In the context of online learning and repeated games, the algorithm is an efficient method for implementing no-regret mixture forecasting strategies. Remarkably, in some of these examples, only one step of the random walk is needed to track the next distribution.
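To make the setup concrete, here is a minimal sketch (not the paper's exact algorithm or step counts) of the general pattern the abstract describes: a Metropolis-filtered ball walk restricted to a convex body, warm-started on each successive log-concave target in a time-varying sequence. The step radius `delta`, the number of steps per update, and the drifting-Gaussian targets are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_body(x):
    """Membership oracle for the convex body: here, the Euclidean unit ball."""
    return np.dot(x, x) <= 1.0

def ball_walk_step(x, log_density, delta=0.1):
    """One Metropolis-filtered ball-walk step targeting exp(log_density) on the body."""
    y = x + delta * rng.standard_normal(x.shape)
    if not in_body(y):
        return x  # reject proposals that leave the body
    # Metropolis accept/reject leaves the log-concave target invariant
    if np.log(rng.random()) < log_density(y) - log_density(x):
        return y
    return x

def track(x, log_densities, steps_per_update=50):
    """Warm-start the walk on each successive target, yielding one sample per target."""
    for log_p in log_densities:
        for _ in range(steps_per_update):
            x = ball_walk_step(x, log_p)
        yield x

# Time-varying targets: Gaussians with a slowly drifting mean, restricted to the
# unit ball (each restriction is log-concave in x).
d = 3
targets = [(lambda x, m=0.1 * t: -0.5 * np.dot(x - m, x - m)) for t in range(5)]
samples = list(track(np.zeros(d), targets))
```

Because consecutive targets are close, the final state of the walk on one distribution serves as a warm start for the next; the paper's contribution is quantifying how few steps per update suffice in that regime.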
