Unification of sparse Bayesian learning algorithms for electromagnetic brain imaging with the majorization minimization framework

Methods for electro- or magnetoencephalography (EEG/MEG) based brain source imaging (BSI) using sparse Bayesian learning (SBL) have been demonstrated to achieve excellent performance in situations with a low number of distinct active sources, such as event-related designs. This paper extends the theory and practice of SBL in three important ways. First, we reformulate three existing SBL algorithms under the majorization-minimization (MM) framework. This unifying perspective not only provides a useful theoretical framework for comparing different algorithms in terms of their convergence behavior, but also provides a principled recipe for constructing novel algorithms with specific properties by designing appropriate bounds on the Bayesian marginal likelihood function. Second, building on the MM principle, we propose a novel method called LowSNR-BSI that achieves favorable source reconstruction performance in low signal-to-noise-ratio (SNR) settings. Third, because precise knowledge of the noise level is a crucial requirement for accurate source reconstruction, we present a novel principled technique to accurately learn the noise variance from the data, either jointly within the source reconstruction procedure or using one of two proposed cross-validation strategies. Numerical experiments confirm the monotonic convergence behavior predicted by MM theory. Using simulations, we further demonstrate the advantage of LowSNR-BSI over conventional SBL in low-SNR regimes, and the advantage of learned noise levels over estimates derived from baseline data. To demonstrate the usefulness of our novel approach, we show neurophysiologically plausible source reconstructions on averaged auditory evoked potential data.
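As a rough illustration of the MM view of SBL described above, the following Python sketch applies a convex-bounding update of the kind used in Champagne-style SBL to a toy problem and records the type-II loss at each iteration. It is a minimal sketch under stated assumptions, not the implementation from the paper: the function names (`sbl_loss`, `mm_sbl`), the random lead field, the fixed noise variance `lam`, and the problem sizes are all hypothetical choices made for this example.

```python
import numpy as np


def sbl_loss(L, C, gamma, lam):
    """Type-II loss log|Sigma_y| + tr(Sigma_y^{-1} C), Sigma_y = lam*I + L diag(gamma) L^T."""
    Sigma = lam * np.eye(L.shape[0]) + (L * gamma) @ L.T
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + np.trace(np.linalg.solve(Sigma, C))


def mm_sbl(L, Y, lam, n_iter=50):
    """Convex-bounding MM updates of the source variances (noise variance lam held fixed)."""
    M, N = L.shape
    T = Y.shape[1]
    C = Y @ Y.T / T                          # empirical sensor covariance
    gamma = np.ones(N)                       # initial source variances (assumed)
    losses = [sbl_loss(L, C, gamma, lam)]
    for _ in range(n_iter):
        Sigma = lam * np.eye(M) + (L * gamma) @ L.T
        Sinv_L = np.linalg.solve(Sigma, L)   # Sigma_y^{-1} L, shape (M, N)
        z = np.sum(L * Sinv_L, axis=0)       # z_n = L_n^T Sigma_y^{-1} L_n
        X = gamma[:, None] * (Sinv_L.T @ Y)  # posterior mean of the sources, shape (N, T)
        # Minimizer of the surrogate z_n*gamma_n + mean_t(x_n(t)^2)/gamma_n
        gamma = np.sqrt(np.mean(X ** 2, axis=1) / z)
        losses.append(sbl_loss(L, C, gamma, lam))
    return gamma, np.array(losses)


# Toy problem (all values are illustrative assumptions): 10 sensors, 40 sources, 3 active.
rng = np.random.default_rng(0)
L = rng.standard_normal((10, 40))
X_true = np.zeros((40, 20))
X_true[[3, 17, 31]] = rng.standard_normal((3, 20))
Y = L @ X_true + 0.1 * rng.standard_normal((10, 20))

gamma, losses = mm_sbl(L, Y, lam=0.01)
```

Because each update minimizes an upper bound that touches the loss at the current iterate, the recorded `losses` should be non-increasing, which is the monotonic behavior that MM theory predicts. Replacing the surrogate with a different bound changes the update rule while keeping the same descent argument.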
