Variational Bayesian Parameter Estimation Techniques for the General Linear Model

Variational Bayes (VB), variational maximum likelihood (VML), restricted maximum likelihood (ReML), and maximum likelihood (ML) are cornerstone parametric estimation techniques in the analysis of functional neuroimaging data. However, their theoretical underpinnings are rarely covered in introductory statistical texts. Because VB, VML, ReML, and ML are so widely used in the neuroimaging community, we reasoned that a theoretical treatment of their relationships, and of their application in a basic modeling scenario, may be helpful for neuroimaging novices and practitioners alike. In this technical study, we therefore revisit the conceptual and formal underpinnings of VB, VML, ReML, and ML and provide a detailed account of their mathematical relationships and implementation details. We then apply all four techniques to the general linear model (GLM) with non-spherical error covariance, as commonly encountered in the first-level analysis of fMRI data. To this end, we explicitly derive the corresponding free energy objective functions and the ensuing iterative algorithms. Finally, in the applied part of the study, we evaluate the parameter and model recovery properties of VB, VML, ReML, and ML, first in an exemplary setting and then in the analysis of experimental fMRI data acquired from a single participant under visual stimulation.
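
For orientation, the modeling scenario and objective functions named above can be written out in standard textbook form. This is a minimal sketch, not the derivation given in the study itself: the symbols $y$, $X$, $\beta$, $\lambda$, $Q_i$, $V_\lambda$, and $q$, as well as the linear covariance-component form of $V_\lambda$, are assumptions chosen for illustration, and the study may use a different parameterization.

```latex
% GLM with non-spherical error covariance (notation assumed for illustration):
% y is n x 1, X is n x p, and the Q_i are known covariance basis matrices.
\begin{align}
y &= X\beta + \varepsilon, \qquad \varepsilon \sim N(0, V_\lambda), \qquad
V_\lambda = \sum_{i} \lambda_i Q_i \\[4pt]
% ML: for fixed \lambda, the maximizer over \beta is the generalized least squares estimate
\hat{\beta}_\lambda &= \left(X^T V_\lambda^{-1} X\right)^{-1} X^T V_\lambda^{-1} y \\[4pt]
% ReML: restricted log likelihood of the covariance component parameters
\ell_{\mathrm{ReML}}(\lambda) &= -\tfrac{1}{2}\ln\lvert V_\lambda\rvert
 - \tfrac{1}{2}\ln\lvert X^T V_\lambda^{-1} X\rvert
 - \tfrac{1}{2}\,(y - X\hat{\beta}_\lambda)^T V_\lambda^{-1} (y - X\hat{\beta}_\lambda)
 + \mathrm{const.} \\[4pt]
% VB: \beta and \lambda are treated as random; for any density q, the free energy F(q)
% is a lower bound on the log model evidence
\ln p(y) &= F(q) + \mathrm{KL}\!\left(q(\beta,\lambda)\,\middle\|\,p(\beta,\lambda \mid y)\right) \;\ge\; F(q), \\
F(q) &= \int q(\beta,\lambda)\,\ln\frac{p(y,\beta,\lambda)}{q(\beta,\lambda)}\,d\beta\,d\lambda
\end{align}
```

In this form, the ReML objective differs from the ML log likelihood, evaluated at $\hat{\beta}_\lambda$, only by the term $-\tfrac{1}{2}\ln\lvert X^T V_\lambda^{-1} X\rvert$, and maximizing $F(q)$ over $q$ is equivalent to minimizing the Kullback-Leibler divergence between $q$ and the posterior.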
