Making sense of sensory input

This paper attempts to answer a central question in unsupervised learning: what does it mean to "make sense" of a sensory sequence? Our first contribution is a formalization: making sense involves constructing a symbolic causal theory that both explains the sensory sequence and satisfies a set of unity conditions. The unity conditions require that the constituents of the causal theory (objects, properties, and laws) be integrated into a coherent whole. On our account, making sense of sensory input is a type of program synthesis, but it is unsupervised program synthesis. Our second contribution is a computer implementation, the Apperception Engine, designed to satisfy these requirements. Because of the strong inductive bias provided by the unity conditions, the system produces interpretable, human-readable causal theories from very small amounts of data. A causal theory produced by the system can predict future sensor readings, retrodict earlier readings, and impute (fill in the blanks of) missing readings, in any combination. We tested the engine on all three tasks in a diverse range of domains: cellular automata, rhythms and simple nursery tunes, multi-modal binding problems, occlusion tasks, and sequence induction intelligence tests. The engine performs well in all these domains, significantly outperforming neural net baselines. In particular, on the sequence induction intelligence tests, our system achieved human-level performance. This is notable because our system is not a bespoke system designed to solve intelligence tests, but a general-purpose system designed to make sense of any sensory sequence.
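The three tasks named above (prediction, retrodiction, and imputation) can be illustrated with a toy sketch. This is not the Apperception Engine and does not implement the unity conditions. It assumes, purely for illustration, that a "theory" is the shortest repeating pattern consistent with every observed reading, and that missing readings are marked with `None`. The function names `shortest_cycle` and `fill_blanks` are hypothetical.

```python
from typing import List, Optional

Reading = Optional[str]  # None marks a missing sensor reading


def shortest_cycle(readings: List[Reading]) -> Optional[List[str]]:
    """Toy stand-in for theory induction (an assumption, not the paper's method):
    return the shortest repeating pattern that agrees with every observed
    reading and is fully determined by the data."""
    for period in range(1, len(readings) + 1):
        pattern: List[Reading] = [None] * period
        consistent = True
        for i, value in enumerate(readings):
            if value is None:
                continue  # missing readings place no constraint on the pattern
            slot = i % period
            if pattern[slot] is None:
                pattern[slot] = value
            elif pattern[slot] != value:
                consistent = False  # two observations disagree for this period
                break
        if consistent and all(s is not None for s in pattern):
            return [s for s in pattern if s is not None]
    return None


def fill_blanks(readings: List[Reading]) -> List[str]:
    """Use the induced pattern to fill every missing reading."""
    pattern = shortest_cycle(readings)
    if pattern is None:
        raise ValueError("no fully determined repeating pattern fits the data")
    return [pattern[i % len(pattern)] for i in range(len(readings))]


# The same induced pattern serves all three tasks; only the position
# of the blanks differs.
predict = fill_blanks(["a", "b", "b", "a", "b", "b", None, None])    # blanks at the end
retrodict = fill_blanks([None, None, "b", "a", "b", "b", "a", "b"])  # blanks at the start
impute = fill_blanks(["a", "b", None, "a", None, "b", "a", "b"])     # blanks in the middle
```

In this sketch the preference for the shortest pattern plays the role of an inductive bias: it is what lets a handful of readings determine the blanks. The bias in the paper is of a different kind, namely the unity conditions on objects, properties, and laws.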
