Unification of sparse Bayesian learning algorithms for electromagnetic brain imaging with the majorization minimization framework

Methods for electro- or magnetoencephalography (EEG/MEG) based brain source imaging (BSI) using sparse Bayesian learning (SBL) have been demonstrated to achieve excellent performance in situations with a small number of distinct active sources, such as event-related designs. This paper extends the theory and practice of SBL in three important ways. First, we reformulate three existing SBL algorithms under the majorization-minimization (MM) framework. This unifying perspective not only provides a useful theoretical framework for comparing different algorithms in terms of their convergence behavior, but also provides a principled recipe for constructing novel algorithms with specific properties by designing appropriate bounds on the Bayesian marginal likelihood function. Second, building on the MM principle, we propose a novel method called LowSNR-BSI that achieves favorable source reconstruction performance in low signal-to-noise-ratio (SNR) settings. Third, because precise knowledge of the noise level is crucial for accurate source reconstruction, we present a novel principled technique to learn the noise variance from the data, either jointly within the source reconstruction procedure or using one of two proposed cross-validation strategies. Numerical experiments confirm the monotonic convergence behavior predicted by MM theory. Using simulations, we further demonstrate the advantage of LowSNR-BSI over conventional SBL in low-SNR regimes, and the advantage of learned noise levels over estimates derived from baseline data. To demonstrate the usefulness of our novel approach, we show neurophysiologically plausible source reconstructions on averaged auditory evoked potential data.
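To make the MM idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a standard Gaussian Type-II (marginal likelihood) objective with a diagonal source prior and a known noise variance, and it uses a Champagne-type convex-bounding update as the MM step; the abstract does not say whether this is one of the three algorithms the paper unifies. The function names (`type2_loss`, `mm_sbl`), the random lead field, and all problem sizes are hypothetical choices for illustration.

```python
import numpy as np

def type2_loss(L, Y, gamma, noise_var):
    """Type-II loss log|S| + tr(S^{-1} C_y), with S = noise_var*I + L diag(gamma) L^T.

    This is the negative log marginal likelihood up to additive constants
    (an assumed standard form, not quoted from the paper).
    """
    M, T = Y.shape
    S = noise_var * np.eye(M) + (L * gamma) @ L.T
    _, logdet = np.linalg.slogdet(S)
    return logdet + np.trace(np.linalg.solve(S, Y @ Y.T)) / T

def mm_sbl(L, Y, noise_var, n_iter=50):
    """Champagne-type MM iteration (illustrative sketch).

    Each step builds a surrogate that upper-bounds the loss and touches it
    at the current gamma, then minimizes that surrogate in closed form.
    """
    M, N = L.shape
    gamma = np.ones(N)
    losses = [type2_loss(L, Y, gamma, noise_var)]
    for _ in range(n_iter):
        S = noise_var * np.eye(M) + (L * gamma) @ L.T
        S_inv_L = np.linalg.solve(S, L)            # S^{-1} L, shape (M, N)
        X = gamma[:, None] * (S_inv_L.T @ Y)       # posterior mean of the sources
        z = np.sum(L * S_inv_L, axis=0)            # diag(L^T S^{-1} L): slope of the log-det bound
        gamma = np.sqrt(np.mean(X ** 2, axis=1) / z)  # minimizer of the surrogate
        losses.append(type2_loss(L, Y, gamma, noise_var))
    return gamma, np.array(losses)

# Hypothetical toy problem: 20 sensors, 50 candidate sources, 3 of them active.
rng = np.random.default_rng(0)
M, N, T = 20, 50, 30
L = rng.standard_normal((M, N))
X_true = np.zeros((N, T))
X_true[[3, 17, 41]] = rng.standard_normal((3, T))
Y = L @ X_true + 0.1 * rng.standard_normal((M, T))

gamma, losses = mm_sbl(L, Y, noise_var=0.01)
```

Because each update minimizes a surrogate that majorizes the loss and is tight at the current iterate, the recorded `losses` should be non-increasing up to floating-point error, which is the monotonic convergence property the abstract refers to.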
