Spectral Inference Networks: Unifying Deep and Spectral Learning

We present Spectral Inference Networks, a framework for learning eigenfunctions of linear operators by stochastic optimization. Spectral Inference Networks generalize Slow Feature Analysis to generic symmetric operators, and are closely related to Variational Monte Carlo methods from computational physics. As such, they can be a powerful tool for unsupervised representation learning from video or graph-structured data. We cast training Spectral Inference Networks as a bilevel optimization problem, which allows for online learning of multiple eigenfunctions. We show results of training Spectral Inference Networks on problems in quantum mechanics and on feature learning from synthetic video datasets. Our results demonstrate that Spectral Inference Networks accurately recover eigenfunctions of linear operators and can discover interpretable representations from video in a fully unsupervised manner.
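To make the underlying objective concrete, below is a minimal sketch in JAX, not the paper's implementation: it recovers the top-k eigenvector subspace of a small symmetric matrix by gradient ascent on the trace of a generalized Rayleigh quotient, the finite-dimensional analogue of the Spectral Inference Networks objective. The matrix A, the subspace size k, and the optimization constants are illustrative choices, and the bilevel masked-gradient machinery that orders the individual eigenfunctions is omitted.

```python
# A minimal sketch, not the paper's implementation: recover the top-k
# eigenvector subspace of a small symmetric matrix A by gradient ascent
# on the trace of the generalized Rayleigh quotient
#     tr( (U^T U)^{-1} U^T A U ),
# the finite-dimensional analogue of the Spectral Inference Networks
# objective. The names A, k, lr, and n_steps are illustrative.
import jax
import jax.numpy as jnp

n, k = 8, 3
lr, n_steps = 1e-2, 2000

# A fixed symmetric "operator" standing in for, e.g., a graph Laplacian.
M = jax.random.normal(jax.random.PRNGKey(0), (n, n))
A = (M + M.T) / 2.0

def objective(U):
    # tr(Sigma^{-1} Pi) with Sigma = U^T U and Pi = U^T A U. This trace is
    # invariant to invertible right-multiplication of U, so maximizing it
    # recovers the top-k eigenspace as a whole, not ordered eigenvectors.
    sigma = U.T @ U
    pi = U.T @ A @ U
    return jnp.trace(jnp.linalg.solve(sigma, pi))

grad_fn = jax.grad(objective)

@jax.jit
def step(U):
    return U + lr * grad_fn(U)  # ascent: we maximize the trace

U = jax.random.normal(jax.random.PRNGKey(1), (n, k))
for _ in range(n_steps):
    U = step(U)

# Check against the exact top-k eigenvectors: if the learned subspace
# matches, the Frobenius norm of Q^T v_top is close to sqrt(k).
_, eigvecs = jnp.linalg.eigh(A)      # eigenvalues in ascending order
v_top = eigvecs[:, -k:]
Q, _ = jnp.linalg.qr(U)
print(jnp.linalg.norm(Q.T @ v_top), jnp.sqrt(float(k)))
```

In the full method described by the abstract, the columns of U are replaced by a neural network evaluated on minibatches of data, and the covariance-like quantities Sigma and Pi must be estimated online; that is what motivates casting training as a bilevel optimization problem.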
