Unsupervised Learning by Program Synthesis

We introduce an unsupervised learning algorithm that combines probabilistic modeling with solver-based techniques for program synthesis. We apply our techniques to both a visual learning domain and a language learning problem, showing that our algorithm can learn many visual concepts from only a few examples and that it can recover some English inflectional morphology. Taken together, these results give both a new approach to unsupervised learning of symbolic compositional structures and a technique for applying program synthesis tools to noisy data.
