Learning Topology and Dynamics of Large Recurrent Neural Networks

Large-scale recurrent networks have drawn increasing attention because they can model a wide variety of real-world phenomena and physical mechanisms. This paper studies how to identify all authentic connections and estimate the system parameters of a recurrent network from a sequence of node observations. The task is extremely challenging in modern network applications because the available observations are usually noisy and limited, and the associated dynamical system is strongly nonlinear. By formulating the problem as multivariate sparse sigmoidal regression, we develop simple-to-implement network learning algorithms, with rigorous theoretical convergence guarantees, for a variety of sparsity-promoting penalty forms. We also propose a quantile variant of progressive recurrent network screening, which is computationally efficient and allows direct cardinality control of the estimated network topology. Moreover, we investigate recurrent network stability conditions in Lyapunov's sense and integrate these stability constraints into sparse network learning. Experiments show excellent performance of the proposed algorithms in network topology identification and forecasting.
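To make the idea of cardinality-controlled sparse sigmoidal regression concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a squared-error loss between targets `Y` and `sigmoid(X @ B)`, plain gradient steps with a fixed step size, and a quantile-style hard threshold that keeps only the `m` largest-magnitude coefficients after each step. The function names (`quantile_threshold`, `fit_sparse_sigmoidal`), the loss, and the default step size are all illustrative assumptions; noise modeling, screening, and stability constraints are omitted.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def quantile_threshold(B, m):
    """Keep the m largest-magnitude entries of B and set the rest to zero."""
    out = np.zeros_like(B)
    if m <= 0:
        return out
    flat = np.abs(B).ravel()
    m = min(m, flat.size)
    # Indices of the m largest magnitudes (order among them is irrelevant).
    keep = np.argpartition(flat, flat.size - m)[flat.size - m:]
    out.flat[keep] = B.flat[keep]
    return out


def fit_sparse_sigmoidal(X, Y, m, step=0.1, iters=500):
    """Gradient steps on 0.5/n * ||Y - sigmoid(X B)||^2, each followed by
    quantile thresholding so that B never has more than m nonzeros.

    X: (n, p) predictor matrix, e.g. node states at time t (assumption).
    Y: (n, q) targets in (0, 1), e.g. rescaled responses (assumption).
    """
    n, p = X.shape
    B = np.zeros((p, Y.shape[1]))
    for _ in range(iters):
        P = sigmoid(X @ B)
        # Chain rule: d/dB of the squared loss through the sigmoid.
        grad = X.T @ ((P - Y) * P * (1.0 - P)) / n
        B = quantile_threshold(B - step * grad, m)
    return B
```

In this sketch, `m` plays the role of the direct cardinality control: the number of nonzero entries in the estimate, read as network connections, can never exceed it, whatever the data.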
