A transfer learning approach for network modeling

Network models are widely used in many subject areas to characterize the interactions between physical entities. A typical problem is to identify the networks for multiple related tasks that share some similarities. In this setting, a transfer learning approach that leverages the knowledge gained in modeling one task to better model another is highly desirable. This article proposes a transfer learning approach that adopts a Bayesian hierarchical model framework to characterize the relatedness between tasks and additionally uses L1-regularization to ensure robust learning of the networks when sample sizes are limited. A method based on the Expectation–Maximization (EM) algorithm is further developed to learn the networks from data. Simulation studies demonstrate that the proposed transfer learning approach outperforms single-task learning, which learns the network of each task in isolation. The proposed approach is also applied to identify brain connectivity networks associated with Alzheimer’s Disease (AD) from functional magnetic resonance imaging data. The findings are consistent with the AD literature.
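As a point of reference for the single-task baseline mentioned above, the following is a sketch of a common L1-regularized network estimate, assuming each task's network is a Gaussian graphical model (the abstract does not state the model form). The symbols $\Theta_k$ (inverse covariance of task $k$), $S_k$ (sample covariance of task $k$), and $\lambda$ (regularization weight) are illustrative assumptions, not notation taken from the article.

```latex
% Single-task L1-penalized Gaussian log-likelihood (assumed model form).
% Zeros in \hat{\Theta}_k correspond to absent edges in task k's network.
\[
  \hat{\Theta}_k
  = \arg\max_{\Theta \succ 0}
    \Bigl\{ \log\det\Theta
            - \operatorname{tr}\!\left(S_k \Theta\right)
            - \lambda \,\lVert \Theta \rVert_1 \Bigr\},
  \qquad
  \lVert \Theta \rVert_1 = \sum_{i,j} \lvert \Theta_{ij} \rvert .
\]
```

Under this assumption, single-task learning solves the problem separately for each $k$. A transfer learning approach would instead couple the $\Theta_k$ across tasks, here through a Bayesian hierarchical prior whose exact form is not given in this abstract.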
