Paper Information - Dimensionality Reduction via Program Induction

Dimensionality Reduction via Program Induction

This work proposes a machine learning algorithm for the inductive synthesis of programs. The objective of the algorithm is to learn a distribution over programs that evaluate to a given constant. The algorithm searches for programs in a bottom-up fashion, going directly from the data to the program, in contrast to recent, similar work that adopts a generate-and-test approach. I use this algorithm to perform density estimation on strings. I then show that the estimated density can be used to perform two distinct kinds of dimensionality reduction on strings: first, converting strings into real-valued vectors; and second, converting strings into compressive symbolic descriptions. The goals of this work are to demonstrate a bottom-up approach to program induction and to evaluate the two types of dimensionality reduction that this approach suggests.
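To make the contrast with generate-and-test concrete, here is a minimal Python sketch of going "from the data to the program" on strings. It is not the algorithm from the paper. The toy language (`lit`, `concat`, `repeat`), the `size` cost, and the `2 ** -size` weighting are all assumptions chosen for illustration. Rather than proposing programs and checking their output, the sketch inverts each primitive on the target string, so every program it builds evaluates to the target by construction.

```python
from functools import lru_cache

# Hypothetical toy DSL (an assumption, not taken from the paper):
#   ("lit", s)          evaluates to the string s
#   ("concat", p, q)    evaluates to eval(p) + eval(q)
#   ("repeat", p, k)    evaluates to eval(p) repeated k times

@lru_cache(maxsize=None)
def programs(s):
    """Enumerate every toy-DSL program that evaluates to the non-empty string s."""
    found = [("lit", s)]
    # Invert concat: each split point of s yields candidate sub-targets.
    for i in range(1, len(s)):
        for left in programs(s[:i]):
            for right in programs(s[i:]):
                found.append(("concat", left, right))
    # Invert repeat: s must be k copies of its own prefix.
    for k in range(2, len(s) + 1):
        if len(s) % k == 0:
            unit = s[: len(s) // k]
            if unit * k == s:
                for body in programs(unit):
                    found.append(("repeat", body, k))
    return tuple(found)

def evaluate(p):
    """Run a toy-DSL program and return the string it denotes."""
    if p[0] == "lit":
        return p[1]
    if p[0] == "concat":
        return evaluate(p[1]) + evaluate(p[2])
    if p[0] == "repeat":
        return evaluate(p[1]) * p[2]
    raise ValueError(f"unknown node: {p[0]}")

def size(p):
    """Illustrative description length: one unit per symbol or character."""
    if p[0] == "lit":
        return 1 + len(p[1])
    if p[0] == "concat":
        return 1 + size(p[1]) + size(p[2])
    return 2 + size(p[1])  # repeat: operator plus its count

def distribution(s):
    """Normalised weights over programs for s, favouring short programs (assumed prior)."""
    weights = {p: 2.0 ** -size(p) for p in programs(s)}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}

if __name__ == "__main__":
    target = "ababab"
    best = min(programs(target), key=size)
    print(best, size(best))
```

The enumeration grows quickly with string length, so this only works for short targets. The shortest program found plays the role of a compressive symbolic description in this toy setting, and the normalised weights are a crude stand-in for a distribution over programs that evaluate to the given constant.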

Joshua B. Tenenbaum | Eyal Dechter | Kevin Ellis
