Challenges of Acquiring Compositional Inductive Biases via Meta-Learning

Meta-learning is typically applied in settings where, given a distribution over related training tasks, the goal is to learn inductive biases that aid generalization to new tasks drawn from that distribution. Alternatively, we might consider the inverse scenario: given a desired inductive bias, we must construct a family of tasks such that meta-training on that family injects the bias into a parametric model (e.g., a neural network). Inspired by recent work showing that such an approach can leverage meta-learning to improve generalization on a single-task learning problem, we study various approaches to both (a) the construction of the task family and (b) the procedure for selecting support sets, for a particular single-task problem: the SCAN compositional generalization benchmark. We perform ablation experiments aimed at identifying when a meta-learning algorithm and task family can impart the compositional inductive bias needed to solve SCAN. We conclude that existing meta-learning approaches to injecting compositional inductive biases are brittle and difficult to interpret, showing high sensitivity to both the family of meta-training tasks and the procedure for selecting support sets.
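To make the setup concrete, the sketch below shows one common way to construct such a task family for SCAN-like data: each meta-training task randomly permutes which primitive command word maps to which action, and each episode pairs a support set of demonstrations with a held-out query, in the spirit of meta sequence-to-sequence learning. This is a minimal illustration with a simplified primitive vocabulary, not the full SCAN grammar; all function names here are invented for the example.

```python
import random

# Simplified SCAN-style primitives and their canonical action tokens.
# (Illustrative subset, not the full SCAN command language.)
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}

def make_task(rng):
    """Construct one meta-training task by permuting which primitive
    word maps to which action. Each permutation defines a new task in
    the family; the compositional structure is shared across tasks."""
    actions = list(PRIMITIVES.values())
    rng.shuffle(actions)
    return dict(zip(PRIMITIVES.keys(), actions))

def sample_episode(mapping, rng, support_size=3):
    """Split a task's word->action pairs into a support set (in-context
    demonstrations of the permuted mapping) and one query pair that the
    learner must infer from the support set."""
    items = list(mapping.items())
    rng.shuffle(items)
    return items[:support_size], items[support_size]

rng = random.Random(0)
task = make_task(rng)                    # one task from the family
support, query = sample_episode(task, rng)
```

A meta-learner trained over many such episodes can only succeed by reading the current mapping off the support set, which is the mechanism intended to inject the compositional bias; the abstract's point is that results are highly sensitive to exactly how `make_task` and `sample_episode` are chosen.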
