Predictive learning extracts latent space representations from sensory observations

Neural networks have achieved many recent successes in solving sequential processing and planning tasks. Their success is often ascribed to the emergence of the task's low-dimensional latent structure in the network activity, that is, in the learned neural representations. Similarly, biological neural circuits, and in particular the hippocampus, may produce representations that organize semantically related episodes. Here, we investigate the hypothesis that representations with low-dimensional latent structure, reflecting such semantic organization, result from learning to predict observations about the world. Specifically, we ask whether and when network mechanisms for sensory prediction coincide with those for extracting the underlying latent variables. Using a recurrent neural network model trained to predict a sequence of observations in a simulated spatial navigation task, we show that the network dynamics exhibit low-dimensional but nonlinearly transformed representations of sensory inputs that capture the latent structure of the sensory environment. We quantify these results using nonlinear measures of intrinsic dimensionality, which highlight the importance of the predictive aspect of neural representations, and we provide mathematical arguments for when and why these representations emerge. Throughout, we focus on how our results can aid the analysis and interpretation of experimental data.
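
To make the training setup concrete, the following is a minimal sketch of next-step sensory prediction in a simulated navigation task. It is an illustration under stated assumptions, not the authors' implementation: it assumes PyTorch, a random-walk agent in a unit square, and Gaussian-tuned "sensory cells" as observations; the architecture and all parameters (`n_cells`, `n_hidden`, the step size, and so on) are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical sensory environment: an agent random-walks in the unit square,
# and each observation is a vector of Gaussian "sensory cell" responses tuned
# to fixed locations. All choices here are illustrative.
n_cells, seq_len, batch = 64, 100, 32
centers = torch.rand(n_cells, 2)  # tuning-curve centers in the arena

def simulate_batch():
    start = torch.rand(batch, 1, 2)
    steps = 0.05 * torch.randn(batch, seq_len, 2)
    traj = (start + steps.cumsum(dim=1)).clamp(0.0, 1.0)  # (batch, T, 2)
    dists = torch.cdist(traj.reshape(-1, 2), centers)     # distance to each center
    obs = torch.exp(-dists ** 2 / (2 * 0.1 ** 2))         # Gaussian tuning curves
    return obs.reshape(batch, seq_len, n_cells)

# Recurrent network trained to predict the next observation from the history.
class PredictiveRNN(nn.Module):
    def __init__(self, n_in, n_hidden=128):
        super().__init__()
        self.rnn = nn.RNN(n_in, n_hidden, batch_first=True, nonlinearity="tanh")
        self.readout = nn.Linear(n_hidden, n_in)

    def forward(self, x):
        h, _ = self.rnn(x)  # hidden states: the learned representation
        return self.readout(h), h

model = PredictiveRNN(n_cells)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    obs = simulate_batch()
    pred, _ = model(obs[:, :-1])                 # predict o_{t+1} from o_{<=t}
    loss = nn.functional.mse_loss(pred, obs[:, 1:])
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The latent variable in this toy task is the agent's 2D position; the question posed above is whether the hidden states `h`, trained only on the prediction objective, come to encode that position on a low-dimensional but nonlinearly embedded manifold.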

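The "nonlinear measures of intrinsic dimensionality" mentioned in the abstract can likewise be illustrated with one standard estimator, the maximum-likelihood method of Levina and Bickel, which infers dimension from the scaling of nearest-neighbor distances. Below is a sketch assuming NumPy and scikit-learn; the neighborhood size `k` and the toy manifold used to exercise it are illustrative choices, not the paper's settings.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def intrinsic_dim_mle(X, k=10):
    """Levina-Bickel maximum-likelihood estimate of intrinsic dimension.

    X : (n_points, n_features) array, e.g. RNN hidden states over a trajectory.
    k : neighborhood size (an illustrative default).
    """
    # Ask for k+1 neighbors because each point is its own nearest neighbor.
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dists, _ = nbrs.kneighbors(X)      # (n_points, k+1), sorted ascending
    dists = dists[:, 1:]               # drop the zero self-distance
    # Local MLE at each point: inverse mean log-ratio of the k-th neighbor
    # distance to each closer neighbor distance.
    log_ratios = np.log(dists[:, -1:] / dists[:, :-1])  # (n_points, k-1)
    local_dim = (k - 1) / log_ratios.sum(axis=1)
    return local_dim.mean()

# Sanity check: a circle linearly embedded in 50 dimensions is still a
# one-dimensional manifold, so the estimate comes out near 1, whereas a
# linear measure such as PCA would report 2.
theta = 2 * np.pi * np.random.rand(2000)
ring = np.stack([np.cos(theta), np.sin(theta)], axis=1) @ np.random.randn(2, 50)
print(intrinsic_dim_mle(ring))         # approximately 1
```

Applied to hidden states like those produced by the sketch above, an estimator of this kind can distinguish a representation that is genuinely low-dimensional from one that merely appears low-dimensional under linear projections, which is the kind of distinction the nonlinear dimensionality analysis described in the abstract is meant to capture.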