Neural Program Lattices

We propose the Neural Program Lattice (NPL), a neural network that learns to perform complex tasks by composing low-level programs to express high-level programs. Our starting point is the recent work on Neural Programmer-Interpreters (NPI), which can only learn from strong supervision that contains the whole hierarchy of low-level and high-level programs. NPL removes this limitation by providing the ability to learn from weak supervision consisting only of sequences of low-level operations. We demonstrate the capability of NPL to learn to perform long-hand addition and to arrange blocks in a grid-world environment. Experiments show that NPL performs on par with NPI while using weak supervision in place of most of the strong supervision, which indicates that it can infer the high-level program structure from examples containing only the low-level operations.
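To make the distinction between the two kinds of supervision concrete, the following is a minimal sketch and not the paper's data format. It assumes a trace can be written as a nested structure in which a high-level program call contains its children. Every program and operation name in it (`ADD`, `ADD1`, `CARRY`, `LSHIFT`, `WRITE_OUT`, and so on) is a hypothetical placeholder chosen for illustration. A strong-supervision example keeps the full call hierarchy; a weak-supervision example keeps only the low-level operations in execution order.

```python
from typing import List, Tuple, Union

# A trace node is either a low-level operation (a string) or a call to a
# high-level program, written as (program_name, [child nodes]).
# This representation is an assumption made for illustration only.
Node = Union[str, Tuple[str, list]]


def flatten(node: Node) -> List[str]:
    """Return only the low-level operations of a trace, in execution order."""
    if isinstance(node, str):
        return [node]
    _name, children = node
    ops: List[str] = []
    for child in children:
        ops.extend(flatten(child))
    return ops


# Hypothetical strong-supervision trace for one column of long-hand addition:
# the hierarchy of high-level calls is fully visible.
strong_trace: Node = (
    "ADD",
    [
        ("ADD1", [
            "WRITE_OUT",
            ("CARRY", ["MOVE_LEFT", "WRITE_CARRY", "MOVE_RIGHT"]),
        ]),
        ("LSHIFT", ["MOVE_LEFT"]),
    ],
)

# The corresponding weak-supervision trace: the same low-level operations,
# with all information about which high-level program issued them removed.
weak_trace = flatten(strong_trace)
print(weak_trace)
```

In this toy setting, learning from weak supervision amounts to recovering a plausible hierarchy like `strong_trace` when only `weak_trace` is observed.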
