Making sense of raw input

Abstract

How should a machine intelligence perform unsupervised structure discovery over streams of sensory input? One approach is to cast the problem as an apperception task [1]. Here, the task is to construct an explicit, interpretable theory that both explains the sensory sequence and satisfies a set of unity conditions, designed to ensure that the constituents of the theory are connected in a relational structure. However, the original formulation of the apperception task had one fundamental limitation: it assumed the raw sensory input had already been parsed into a set of discrete categories, so that all the system had to do was receive this already-digested symbolic input and make sense of it. But what if we do not have access to pre-parsed input? What if our sensory sequence is raw, unprocessed information? The central contribution of this paper is a neuro-symbolic framework for distilling interpretable theories out of streams of raw, unprocessed sensory experience. First, we extend the definition of the apperception task to include ambiguous (but still symbolic) input: sequences of sets of disjunctions. Next, we use a neural network to map raw sensory input to disjunctive input. Our binary neural network is encoded as a logic program, so the weights of the network and the rules of the theory can be solved jointly as a single SAT problem. In this way, we are able to jointly learn how to perceive (mapping raw sensory information to concepts) and apperceive (combining concepts into declarative rules).
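As a rough illustration of what solving for perception and theory together can look like, the toy sketch below searches jointly over the weights of a single binary neuron and a small set of candidate transition rules. Everything in it is an assumption made for illustration and is not taken from the paper: the three-dimensional ±1 input vectors, the two concepts `on` and `off`, the two candidate rules, and the brute-force enumeration, which stands in for the SAT encoding described above. It also skips the disjunctive intermediate representation and the unity conditions.

```python
from itertools import product

# Hypothetical raw sensory sequence: one vector in {-1, +1}^3 per time step.
RAW_SEQUENCE = [
    (1, 1, -1),
    (-1, -1, 1),
    (1, 1, 1),
    (-1, -1, -1),
]

# Hypothetical candidate theories: each maps the current concept to the next.
CANDIDATE_RULES = {
    "alternate": {"on": "off", "off": "on"},
    "constant": {"on": "on", "off": "off"},
}


def perceive(weights, x):
    """Binary neuron with weights and inputs in {-1, +1}.

    The input size is odd, so the activation is never zero.
    """
    activation = sum(w * xi for w, xi in zip(weights, x))
    return "on" if activation > 0 else "off"


def explains(rule, symbols):
    """True if the rule predicts every observed transition."""
    return all(rule[a] == b for a, b in zip(symbols, symbols[1:]))


def joint_search(raw, rules):
    """Enumerate (weights, rule) pairs that jointly explain the raw sequence.

    Exhaustive enumeration is used here in place of a SAT solver.
    """
    solutions = []
    for weights in product((-1, 1), repeat=len(raw[0])):
        symbols = [perceive(weights, x) for x in raw]
        for name, rule in rules.items():
            if explains(rule, symbols):
                solutions.append((weights, name, symbols))
    return solutions


if __name__ == "__main__":
    for weights, name, symbols in joint_search(RAW_SEQUENCE, CANDIDATE_RULES):
        print(weights, name, symbols)
```

The point of the sketch is only that neither the perceptual mapping nor the rule is fixed in advance: a weight vector is acceptable exactly when the symbols it produces admit some rule that explains them.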

[1] Tor Lattimore, et al. Free Lunch for optimisation under the universal distribution, 2014, 2014 IEEE Congress on Evolutionary Computation (CEC).

[2] Ryszard S. Michalski, et al. A Theory and Methodology of Inductive Learning, 1983, Artificial Intelligence.

[3] Luc De Raedt, et al. Relational Reinforcement Learning, 2001, Machine Learning.

[4] Douglas R. Hofstadter, et al. Fluid Concepts and Creative Analogies, 1995.

[5] Michael I. Jordan, et al. Factorial Hidden Markov Models, 1995, Machine Learning.

[6] Tim Miller, et al. Explanation in Artificial Intelligence: Insights from the Social Sciences, 2017, Artif. Intell.

[7] Pascal Hitzler, et al. Connectionist model generation: A first-order approach, 2008, Neurocomputing.

[8] Richard Evans, et al. Inductive general game playing, 2019, Machine Learning.

[9] Artur S. d'Avila Garcez, et al. Logic Tensor Networks: Deep Learning and Logical Reasoning from Data and Knowledge, 2016, NeSy@HLAI.

[10] Kai-Uwe Kühnberger, et al. Neural-Symbolic Learning and Reasoning: A Survey and Interpretation, 2017, Neuro-Symbolic Artificial Intelligence.

[11] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[12] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.

[13] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.

[14] Rolf Morel, et al. Typed Meta-interpretive Learning of Logic Programs, 2019, JELIA.

[15] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.

[16] Rob Fergus, et al. Composable Planning with Attributes, 2018, ICML.

[17] Song-Chun Zhu, et al. Minimax Entropy Principle and Its Application to Texture Modeling, 1997, Neural Computation.

[18] Stephen Muggleton, et al. How Does Predicate Invention Affect Human Comprehensibility?, 2016, ILP.

[19] José Hernández-Orallo, et al. Making sense of sensory input, 2019, Artif. Intell.

[20] A.S. d'Avila Garcez, et al. A connectionist inductive learning system for modal logic programming, 2002, Proceedings of the 9th International Conference on Neural Information Processing, 2002. ICONIP '02.

[21] Krzysztof R. Apt, et al. Logic Programming, 1990, Handbook of Theoretical Computer Science, Volume B: Formal Models and Semantics.

[22] Leonid Ryzhyk, et al. Verifying Properties of Binarized Deep Neural Networks, 2017, AAAI.

[23] Artur S. d'Avila Garcez, et al. The Connectionist Inductive Learning and Logic Programming System, 1999, Applied Intelligence.

[24] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[25] Joshua B. Tenenbaum, et al. Human-level concept learning through probabilistic program induction, 2015, Science.

[26] Stephen Muggleton, et al. Ultra-Strong Machine Learning: comprehensibility of programs learned with ILP, 2018, Machine Learning.

[27] Razvan Pascanu, et al. Relational Deep Reinforcement Learning, 2018, ArXiv.

[28] Joshua B. Tenenbaum, et al. Learning abstract structure for drawing by efficient motor program induction, 2020, NeurIPS.

[29] Monica S. Lam, et al. Using Datalog with Binary Decision Diagrams for Program Analysis, 2005, APLAS.

[30] Steffen Hölldobler, et al. Approximating the Semantics of Logic Programs by Recurrent Neural Networks, 1999, Applied Intelligence.

[31] Katsumi Inoue, et al. Meta-Interpretive Learning Using HEX-Programs, 2019, IJCAI.

[32] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.

[33] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.

[34] Koray Kavukcuoglu, et al. Pixel Recurrent Neural Networks, 2016, ICML.

[35] I. Levi, et al. Making It Explicit, 1994.

[36] Robert A. Kowalski, et al. Predicate Logic as Programming Language, 1974, IFIP Congress.

[37] Chung-Hao Huang, et al. Verification of Binarized Neural Networks via Inter-neuron Factoring - (Short Paper), 2017, VSTTE.

[38] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.

[39] Andriy Mnih, et al. Variational Inference for Monte Carlo Objectives, 2016, ICML.

[40] Pieter Abbeel, et al. Learning Plannable Representations with Causal InfoGAN, 2018, NeurIPS.

[41] John C. Loehlin, et al. Latent variable models: an introduction to factor, path, and structural analysis, 1986.

[42] K. Westphal. Kant and the Capacity to Judge, 2000.

[43] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, ArXiv.

[44] Christopher Burgess, et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, 2016, ICLR 2016.

[45] Fabio Viola, et al. Learning and Querying Fast Generative Models for Reinforcement Learning, 2018, ArXiv.

[46] Andrew Cropper, et al. Playgol: learning programs through play, 2019, IJCAI.

[47] Nando de Freitas, et al. Learning Compositional Neural Programs with Recursive Tree Search and Planning, 2019, NeurIPS.

[48] Zhi-Hua Zhou, et al. Meta-Interpretive Learning from noisy images, 2018, Machine Learning.

[49] Luciano Serafini, et al. Neural-Symbolic Computing: An Effective Methodology for Principled Integration of Machine Learning and Reasoning, 2019, FLAP.

[50] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.

[51] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[52] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[53] Luc De Raedt, et al. From Statistical Relational to Neuro-Symbolic Artificial Intelligence, 2020, IJCAI.

[54] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.

[55] Felix Hill, et al. Measuring abstract reasoning in neural networks, 2018, ICML.

[56] Richard Evans, et al. Learning Explanatory Rules from Noisy Data, 2017, J. Artif. Intell. Res.

[57] Hugo Larochelle, et al. RNADE: The real-valued neural autoregressive density-estimator, 2013, NIPS.

[58] Alan Ritter, et al. Adversarial Learning for Neural Dialogue Generation, 2017, EMNLP.

[59] Yoshua Bengio, et al. A Recurrent Latent Variable Model for Sequential Data, 2015, NIPS.

[60] Jürgen Schmidhuber, et al. World Models, 2018, ArXiv.

[61] Ali Farhadi, et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016, ECCV.

[62] Dov M. Gabbay, et al. Dimensions of Neural-symbolic Integration - A Structured Survey, 2005, We Will Show Them!.

[63] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.

[64] John Wylie Lloyd, et al. Foundations of Logic Programming, 1987, Symbolic Computation.

[65] Paris Smaragdis, et al. Bitwise Neural Networks, 2016, ArXiv.

[66] Armando Solar-Lezama, et al. Learning to Infer Graphics Programs from Hand-Drawn Images, 2017, NeurIPS.

[67] Cynthia Rudin, et al. Please Stop Explaining Black Box Models for High Stakes Decisions, 2018, ArXiv.

[68] R. E. Kalman, et al. A New Approach to Linear Filtering and Prediction Problems, 2002.

[69] Martin Gebser, et al. ASP-Core-2 Input Language Format, 2019, Theory and Practice of Logic Programming.

[70] Chiaki Sakama, et al. Learning from interpretation transition, 2013, Machine Learning.

[71] Chandan Singh, et al. Definitions, methods, and applications in interpretable machine learning, 2019, Proceedings of the National Academy of Sciences.

[72] Ming Li, et al. An Introduction to Kolmogorov Complexity and Its Applications, 2019, Texts in Computer Science.

[73] Yee Whye Teh, et al. Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects, 2018, NeurIPS.

[74] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.

[75] Ryan P. Adams, et al. Composing graphical models with neural networks for structured representations and fast inference, 2016, NIPS.

[76] Sergey Levine, et al. Model-Based Reinforcement Learning for Atari, 2019, ICLR.

[77] Sebastian Nowozin, et al. DeepCoder: Learning to Write Programs, 2016, ICLR.

[78] Ran El-Yaniv, et al. Binarized Neural Networks, 2016, ArXiv.

[79] Yoshua Bengio, et al. Convolutional networks for images, speech, and time series, 1998.

[80] Jürgen Schmidhuber, et al. Recurrent World Models Facilitate Policy Evolution, 2018, NeurIPS.

[81] Wei Xiong, et al. Learning to Generate Time-Lapse Videos Using Multi-stage Dynamic Generative Adversarial Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[82] Philipp J. Keller, et al. Whole-brain functional imaging at cellular resolution using light-sheet microscopy, 2013, Nature Methods.

[83] R. Smullyan. First-Order Logic, 1968.

[84] Suman V. Ravuri, et al. A Clinically Applicable Approach to Continuous Prediction of Future Acute Kidney Injury, 2019, Nature.

[85] Martin Gebser, et al. Clingo = ASP + Control: Preliminary Report, 2014, ArXiv.

[86] F. Black, et al. The Pricing of Options and Corporate Liabilities, 1973, Journal of Political Economy.

[87] John P. Cunningham, et al. Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity, 2008, NIPS.

[88] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[89] Tor Lattimore, et al. No Free Lunch versus Occam's Razor in Supervised Learning, 2011, Algorithmic Probability and Friends.