Differential geometric MCMC methods and applications

This thesis presents novel Markov chain Monte Carlo methodology that exploits the natural representation of a statistical model as a Riemannian manifold. The methods developed generalise the Metropolis-adjusted Langevin algorithm and the Hybrid Monte Carlo algorithm for Bayesian statistical inference, and resolve many shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high-dimensional and exhibit strong correlation structure. The performance of these Riemannian manifold Markov chain Monte Carlo algorithms is rigorously assessed by performing Bayesian inference on logistic regression models, log-Gaussian Cox point process models, stochastic volatility models, and both parameter-level and model-level inference for dynamical systems described by nonlinear differential equations.
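To make the idea of a geometry-aware Langevin proposal concrete, the following is a minimal sketch and not the thesis's own implementation. It assumes a Bayesian logistic regression model with a zero-mean Gaussian prior of variance `alpha`, uses the expected Fisher information plus the negative Hessian of the log prior as the position-dependent metric, and implements only a simplified variant of the manifold Langevin proposal that drops the terms involving derivatives of the metric. All function names, the step size `eps`, and the default values are illustrative assumptions.

```python
import numpy as np

def sigmoid(f):
    # Numerically stable logistic function
    return 0.5 * (1.0 + np.tanh(0.5 * f))

def log_posterior(theta, X, y, alpha):
    # Log posterior up to a constant: Bernoulli likelihood, N(0, alpha * I) prior
    f = X @ theta
    return y @ f - np.sum(np.logaddexp(0.0, f)) - theta @ theta / (2.0 * alpha)

def gradient(theta, X, y, alpha):
    # Gradient of the log posterior
    return X.T @ (y - sigmoid(X @ theta)) - theta / alpha

def metric(theta, X, alpha):
    # Assumed metric: expected Fisher information plus negative Hessian of the log prior
    p = sigmoid(X @ theta)
    return (X.T * (p * (1.0 - p))) @ X + np.eye(X.shape[1]) / alpha

def proposal_mean(theta, eps, X, y, alpha):
    # Drift of the simplified proposal (metric-derivative terms omitted)
    G = metric(theta, X, alpha)
    return theta + 0.5 * eps**2 * np.linalg.solve(G, gradient(theta, X, y, alpha)), G

def log_proposal(to, frm, eps, X, y, alpha):
    # log q(to | frm) for N(mean(frm), eps^2 G(frm)^{-1}); constants that cancel are dropped
    mean, G = proposal_mean(frm, eps, X, y, alpha)
    diff = to - mean
    _, logdet = np.linalg.slogdet(G)
    return 0.5 * logdet - 0.5 * (diff @ G @ diff) / eps**2

def simplified_mmala(X, y, alpha=100.0, eps=1.0, n_iter=2000, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    samples = np.empty((n_iter, theta.size))
    accepted = 0
    for i in range(n_iter):
        mean, G = proposal_mean(theta, eps, X, y, alpha)
        # With G = L L^T, solving L^T v = z gives v with covariance G^{-1}
        L = np.linalg.cholesky(G)
        prop = mean + eps * np.linalg.solve(L.T, rng.standard_normal(theta.size))
        # Metropolis-Hastings correction for the asymmetric proposal
        log_ratio = (log_posterior(prop, X, y, alpha)
                     - log_posterior(theta, X, y, alpha)
                     + log_proposal(theta, prop, eps, X, y, alpha)
                     - log_proposal(prop, theta, eps, X, y, alpha))
        if np.log(rng.uniform()) < log_ratio:
            theta = prop
            accepted += 1
        samples[i] = theta
    return samples, accepted / n_iter
```

Calling `simplified_mmala(X, y)` with a design matrix `X` and a 0/1 response vector `y` returns the chain and its acceptance rate. Because the proposal covariance follows the local metric, the step adapts to the scale and correlation of the posterior without hand-tuning a separate step size per parameter, which is the practical motivation for this family of samplers.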
