Flexible Bayesian Dynamic Modeling of Covariance and Correlation Matrices

Modeling covariance (and correlation) matrices is a challenging problem because of their high dimensionality and the positive-definiteness constraint. In this paper, we propose a novel Bayesian framework based on decomposing the covariance matrix into variance and correlation matrices. Its key feature is that the correlations are represented as products of vectors on unit spheres. We propose a variety of distributions on spheres (e.g. the squared-Dirichlet distribution) to induce flexible prior distributions for covariance matrices that go beyond the commonly used inverse-Wishart prior. To handle the intractability of the resulting posterior, we introduce the adaptive $\Delta$-Spherical Hamiltonian Monte Carlo. We also extend our structured framework to dynamic cases and introduce unit-vector Gaussian process priors for modeling the evolution of correlation among multiple time series. Using a Normal-Inverse-Wishart example, a simulated periodic process, and an analysis of local field potential data (collected from the hippocampus of rats performing a complex sequence memory task), we demonstrate the validity and effectiveness of our proposed framework for (dynamic) modeling of covariance and correlation matrices.
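
The decomposition described above can be illustrated with a minimal sketch. It assumes, as one plausible reading of the abstract, that the covariance is $\Sigma = \mathrm{diag}(\sigma)\, P\, \mathrm{diag}(\sigma)$ and that the correlation matrix is $P = L L^{\top}$, where $L$ is lower triangular and each of its rows is a unit vector. Every row is drawn by taking signed square roots of a Dirichlet sample, which is what "squared-Dirichlet" is taken to mean here. The function names, the concentration parameter `alpha`, and the random signs are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def sample_unit_rows(dim, alpha=1.0, rng=None):
    """Draw a lower-triangular L whose rows are unit vectors.

    Assumption: row i has i + 1 nonzero entries whose squares follow a
    Dirichlet(alpha, ..., alpha) distribution (a "squared-Dirichlet" draw).
    """
    rng = np.random.default_rng() if rng is None else rng
    L = np.zeros((dim, dim))
    L[0, 0] = 1.0  # the first row lives on the 0-sphere; fix it at +1
    for i in range(1, dim):
        weights = rng.dirichlet(np.full(i + 1, alpha))  # squared coordinates
        signs = rng.choice([-1.0, 1.0], size=i + 1)     # hypothetical sign choice
        signs[-1] = 1.0                                 # keep the diagonal positive
        L[i, : i + 1] = signs * np.sqrt(weights)
    return L


def covariance_from_parts(sigma, L):
    """Assemble Sigma = diag(sigma) @ (L @ L.T) @ diag(sigma)."""
    P = L @ L.T  # unit diagonal because every row of L has norm 1
    return sigma[:, None] * P * sigma[None, :]


rng = np.random.default_rng(0)
L = sample_unit_rows(4, alpha=1.0, rng=rng)
sigma = np.array([0.5, 1.0, 2.0, 1.5])  # illustrative standard deviations
P = L @ L.T
Sigma = covariance_from_parts(sigma, L)
```

Because each row of `L` has unit norm, `P` has ones on its diagonal and off-diagonal entries in $[-1, 1]$ by construction, so positive semi-definiteness never has to be enforced as a separate constraint on the sampler.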
