Algorithmic Advances in Riemannian Geometry and Applications

In this chapter, we present Bayesian models for diffeomorphic shape variability in populations of images. The first model is a probabilistic formulation of the image atlas construction problem, which seeks an atlas image most representative of a set of input images. The second model adds diffeomorphic modes of shape variation, or principal geodesics. Both models represent shape variability as random variables on the manifold of diffeomorphic transformations. We define a Gaussian prior distribution for diffeomorphic transformations using the inner product in the tangent space to the diffeomorphism group. Because the inference problem has no closed-form solution, we develop a Monte Carlo Expectation Maximization (MCEM) algorithm in which the expectation step is approximated via Hamiltonian Monte Carlo (HMC) sampling of diffeomorphisms. The resulting inference produces estimates of the image atlas, the principal geodesic modes of variation, and the model parameters. We show that the Bayesian formulation has the advantage of providing a principled way to estimate both the regularization parameter of the diffeomorphic transformations and the intrinsic dimensionality of the input data.
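
The MCEM idea can be illustrated with a minimal sketch on a toy problem. This is not the diffeomorphic model of the chapter: as an assumption made purely for illustration, each latent variable v_i is a vector in R^d with Gaussian prior N(0, I/alpha), standing in for a tangent-space variable, and the observation is y_i = v_i plus Gaussian noise of known variance sigma2. The function names (log_post_and_grad, hmc_step, mcem_precision) and all numerical settings are hypothetical. The expectation step draws HMC samples of the latent variables, and the maximization step updates the precision alpha, which plays the role of the regularization parameter.

```python
import numpy as np

def log_post_and_grad(v, y, alpha, sigma2):
    """Log posterior of the latent v (up to a constant) and its gradient."""
    resid = y - v
    logp = -0.5 * alpha * np.sum(v**2) - 0.5 * np.sum(resid**2) / sigma2
    grad = -alpha * v + resid / sigma2
    return logp, grad

def hmc_step(v, y, alpha, sigma2, rng, step=0.1, n_leapfrog=5):
    """One HMC transition: leapfrog integration plus Metropolis correction."""
    p = rng.standard_normal(v.shape)            # resample momentum
    logp0, grad = log_post_and_grad(v, y, alpha, sigma2)
    h0 = -logp0 + 0.5 * np.sum(p**2)            # initial Hamiltonian
    v_new = v.copy()
    p_new = p + 0.5 * step * grad               # initial half step in momentum
    for i in range(n_leapfrog):
        v_new = v_new + step * p_new            # full step in position
        logp1, grad = log_post_and_grad(v_new, y, alpha, sigma2)
        scale = 1.0 if i < n_leapfrog - 1 else 0.5  # final half step
        p_new = p_new + scale * step * grad
    h1 = -logp1 + 0.5 * np.sum(p_new**2)        # proposed Hamiltonian
    if np.log(rng.uniform()) < h0 - h1:         # accept or reject
        return v_new
    return v

def mcem_precision(y, sigma2, n_iter=30, n_samples=20, seed=0):
    """Toy MCEM: estimate the prior precision alpha of the latent variables."""
    rng = np.random.default_rng(seed)
    alpha = 1.0                                 # initial guess
    v = np.zeros_like(y)                        # chain state, reused across iterations
    for _ in range(n_iter):
        # E-step: Monte Carlo estimate of E[sum ||v_i||^2 | y, alpha] via HMC
        sq_norm = 0.0
        for _ in range(n_samples):
            v = hmc_step(v, y, alpha, sigma2, rng)
            sq_norm += np.sum(v**2)
        sq_norm /= n_samples
        # M-step: maximizer of the expected complete-data log-likelihood in alpha
        alpha = y.size / sq_norm
    return alpha

# Synthetic data drawn from the toy model (all values are illustrative)
data_rng = np.random.default_rng(1)
true_alpha, sigma2 = 2.0, 0.1
v_true = data_rng.standard_normal((100, 3)) / np.sqrt(true_alpha)
y = v_true + np.sqrt(sigma2) * data_rng.standard_normal(v_true.shape)

alpha_hat = mcem_precision(y, sigma2)
# In this toy model the marginal maximum-likelihood estimate is available in closed form
alpha_mle = 1.0 / (np.mean(y**2) - sigma2)
print(alpha_hat, alpha_mle)
```

In this toy setting the marginal likelihood is tractable, so the MCEM estimate can be compared with the closed-form maximum-likelihood value. In the setting described above no such closed form exists, which is the stated reason for resorting to sampling in the expectation step.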
