The Utility of Sparse Representations for Control in Reinforcement Learning

We investigate sparse representations for control in reinforcement learning. While these representations are widely used in computer vision, their use in reinforcement learning has been limited to sparse coding, where extracting representations for new data can be computationally intensive. We begin by demonstrating that learning a control policy incrementally with a representation from a standard neural network fails in classic control domains, whereas learning with a representation from a neural network with enforced sparsity properties is effective. We provide evidence that this is because the sparse representation provides locality: it avoids catastrophic interference and, in particular, keeps the values used for bootstrapping consistent and stable. We then discuss how to learn such sparse representations. We explore the idea of distributional regularizers, where the activation of each hidden node is encouraged to match a particular distribution that results in sparse activation across time. We identify a simple but effective way to obtain sparse representations that is not afforded by previously proposed strategies, which makes further investigation into sparse representations for reinforcement learning more practical.
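As a rough illustration of a distributional regularizer, the sketch below penalizes the mismatch between each hidden unit's average activation and a target distribution. The choice of an exponential target, the KL form, the function name `exponential_kl_penalty`, and the target mean of 0.1 are all assumptions made for this example; the abstract does not specify them.

```python
import numpy as np

def exponential_kl_penalty(activations, target_mean=0.1, eps=1e-8):
    """Sum over hidden units of KL(Exp(mean=target_mean) || Exp(mean=unit_mean)).

    activations: non-negative array of shape (batch, hidden), e.g. ReLU outputs.
    target_mean: assumed desired average activation per unit (hypothetical value).
    """
    # Empirical mean activation of each hidden unit across the batch ("across time").
    unit_means = activations.mean(axis=0) + eps
    # Closed-form KL divergence between two exponential distributions given their means.
    kl = np.log(unit_means) - np.log(target_mean) + target_mean / unit_means - 1.0
    return kl.sum()

rng = np.random.default_rng(0)
# Dense activations: every unit is active on every sample (mean near 1.0).
dense = rng.uniform(0.5, 1.5, size=(256, 32))
# Sparse activations: each unit is active on roughly 10% of samples (mean near 0.1).
sparse = dense * (rng.random((256, 32)) < 0.1)

dense_penalty = exponential_kl_penalty(dense)
sparse_penalty = exponential_kl_penalty(sparse)
print(f"dense penalty:  {dense_penalty:.3f}")
print(f"sparse penalty: {sparse_penalty:.3f}")
```

In training, a term like this would be added to the main loss with a small weight, so that units whose average activation drifts away from the target are pushed back toward it. The dense batch should incur a much larger penalty than the sparse one.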
