Local‐aggregate modeling for big data via distributed optimization: Applications to neuroimaging

Technological advances have led to a proliferation of structured big data with matrix-valued covariates. We are specifically motivated to build predictive models for multi-subject neuroimaging data based on each subject's brain imaging scans. This is an ultra-high-dimensional problem in which each subject contributes a matrix of covariates (brain locations by time points); few methods currently exist to fit supervised models directly to such tensor data. We propose a novel modeling and algorithmic strategy for applying generalized linear models (GLMs) to massive tensor data in which one set of variables is associated with locations. Our method begins by fitting GLMs to each location separately, and then builds an ensemble by blending information across locations through regularization with what we term an aggregating penalty. Our so-called Local-Aggregate Model can be fit in a completely distributed manner over the locations using an Alternating Direction Method of Multipliers (ADMM) strategy, which greatly reduces the computational burden. Furthermore, we propose to select the appropriate model through a novel sequence of faster algorithmic solutions that is similar to regularization paths. We demonstrate both the computational and predictive modeling advantages of our methods via simulations and an EEG classification problem.
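
To make the local-then-aggregate idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the GLM family or the form of the aggregating penalty, so the sketch assumes a Gaussian GLM (least squares) at each location and uses a quadratic smoothness penalty on the coefficients of adjacent locations as a stand-in for the aggregating penalty. The function name `local_aggregate_admm` and the parameters `lam`, `rho`, and `n_iter` are hypothetical.

```python
import numpy as np


def local_aggregate_admm(X, y, lam=1.0, rho=1.0, n_iter=200):
    """Illustrative ADMM for location-wise least squares with a coupling penalty.

    X : array of shape (n_subjects, n_locations, n_timepoints)
    y : array of shape (n_subjects,)

    Assumed objective (a stand-in, not taken from the paper):
        sum_l 0.5 * ||y - X_l b_l||^2 + 0.5 * lam * ||D Z||_F^2   s.t.  B = Z,
    where D takes differences between adjacent locations.
    """
    n, L, T = X.shape
    D = np.diff(np.eye(L), axis=0)            # (L-1, L) first-difference matrix
    A = lam * D.T @ D + rho * np.eye(L)       # system matrix for the Z-update

    # Per-location quantities; each location only needs its own slice of X.
    grams = [X[:, l, :].T @ X[:, l, :] + rho * np.eye(T) for l in range(L)]
    Xty = [X[:, l, :].T @ y for l in range(L)]

    B = np.zeros((L, T))   # local coefficients, one row per location
    Z = np.zeros((L, T))   # aggregated (consensus) coefficients
    U = np.zeros((L, T))   # scaled dual variables

    for _ in range(n_iter):
        # Local step: independent across locations, so it could run in parallel.
        for l in range(L):
            B[l] = np.linalg.solve(grams[l], Xty[l] + rho * (Z[l] - U[l]))
        # Aggregate step: blend information across locations via the penalty.
        Z = np.linalg.solve(A, rho * (B + U))
        # Dual update on the constraint B = Z.
        U += B - Z
    return B, Z
```

In this sketch only the aggregate step touches all locations at once; the local step uses one location's data at a time, which is the property that would allow the fit to be distributed over locations. A non-Gaussian GLM would replace the closed-form local solve with an iterative one.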
