Sampling for Bayesian Program Learning

To learn programs from data, we introduce the problem of sampling programs from a posterior distribution conditioned on that data. Within this setting, we propose an algorithm that uses a symbolic solver to sample programs efficiently. The algorithm combines constraint-based program synthesis with sampling via random parity constraints. We give theoretical guarantees on how well the samples approximate the true posterior, and present empirical results showing that the algorithm is efficient in practice, evaluating our approach on 22 program learning problems in the domains of text editing and computer-aided programming.
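As a rough illustration of sampling via random parity constraints, the sketch below encodes each candidate as a bit-vector, imposes a few random XOR constraints, and draws uniformly from the candidates that survive. This is a minimal toy under stated assumptions, not the algorithm from the paper: brute-force enumeration stands in for the symbolic solver, the function names and the `is_consistent` predicate are hypothetical, and the sketch ignores posterior weighting by treating all consistent candidates as equally likely.

```python
import itertools
import random


def random_parity_constraint(n_bits, rng):
    """Include each bit with probability 1/2 and pick a random target parity."""
    subset = [i for i in range(n_bits) if rng.random() < 0.5]
    target = rng.randrange(2)
    return subset, target


def satisfies_parity(bits, constraint):
    """Check that the XOR of the selected bits equals the target parity."""
    subset, target = constraint
    return sum(bits[i] for i in subset) % 2 == target


def sample_with_parity(is_consistent, n_bits, n_constraints, rng, max_tries=1000):
    """Return one consistent bit-vector that survives random parity constraints.

    Assumption: brute-force enumeration replaces the solver call that a
    real implementation would use to list the surviving candidates.
    """
    for _ in range(max_tries):
        constraints = [
            random_parity_constraint(n_bits, rng) for _ in range(n_constraints)
        ]
        survivors = [
            bits
            for bits in itertools.product((0, 1), repeat=n_bits)
            if is_consistent(bits)
            and all(satisfies_parity(bits, c) for c in constraints)
        ]
        if survivors:
            # Each parity constraint roughly halves the candidate set, so the
            # survivors form a small, nearly random slice of the full space.
            return rng.choice(survivors)
    raise RuntimeError("no candidate survived the parity constraints")


# Hypothetical stand-in for "consistent with the data": the first bit must be 1.
rng = random.Random(0)
sample = sample_with_parity(lambda bits: bits[0] == 1, n_bits=6, n_constraints=3, rng=rng)
print(sample)
```

With 6 bits and 3 constraints, the 32 consistent candidates shrink to about 4 survivors on average, which is what makes exhaustive enumeration of the survivors cheap compared with enumerating the whole space.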
