Joint Learning of Unsupervised Object-Based Perception and Control

This paper is concerned with object-based perception and control (OPC), which allows for the joint optimization of hierarchical object-based perception and decision making. We define the OPC framework by extending the Bayesian brain hypothesis to support object-based latent representations, and we propose an unsupervised end-to-end solution method. We develop a practical algorithm and analyze the convergence of the perception-model update. Experiments in a high-dimensional pixel environment demonstrate the learning effectiveness of our object-based perception and control approach.
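To make the joint-optimization idea concrete, the following is a minimal sketch in Python/PyTorch of what such a training loop could look like. Everything in it is an assumption for illustration: the slot-based PerceptionModel, the actor-critic Policy, the random stand-ins for environment frames and rewards, and all sizes and hyperparameters are hypothetical, not the paper's actual architecture or algorithm.

    # Illustrative sketch only: an unsupervised perception model groups pixels
    # into K object latents, an actor-critic policy acts on those latents, and
    # one optimizer updates both jointly. All names and numbers are assumed.
    import torch
    import torch.nn.functional as F

    K, D, OBS, N_ACTIONS = 4, 32, 64 * 64, 6  # slots, latent size, pixels, actions

    class PerceptionModel(torch.nn.Module):
        """Maps pixels to K object latents; stands in for iterative EM-style refinement."""
        def __init__(self):
            super().__init__()
            self.encoder = torch.nn.Linear(OBS, K * D)
            self.decoder = torch.nn.Linear(K * D, OBS)

        def forward(self, obs):
            slots = self.encoder(obs).view(-1, K, D)  # one latent per object slot
            recon = self.decoder(slots.flatten(1))    # reconstruction for the
            return slots, recon                       # unsupervised objective

    class Policy(torch.nn.Module):
        """Actor-critic head over an order-invariant pooling of the object latents."""
        def __init__(self):
            super().__init__()
            self.actor = torch.nn.Linear(D, N_ACTIONS)
            self.critic = torch.nn.Linear(D, 1)

        def forward(self, slots):
            pooled = slots.mean(dim=1)                # permutation-invariant pooling
            return self.actor(pooled), self.critic(pooled)

    perception, policy = PerceptionModel(), Policy()
    opt = torch.optim.Adam(list(perception.parameters()) + list(policy.parameters()),
                           lr=1e-4)

    for step in range(100):
        obs = torch.rand(8, OBS)                      # placeholder for env frames
        slots, recon = perception(obs)
        logits, value = policy(slots)

        # Unsupervised perception loss (reconstruction) plus actor-critic terms;
        # the rewards here are random placeholders for environment feedback.
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        reward = torch.rand(8)                        # placeholder reward
        advantage = reward - value.squeeze(-1)
        loss = (F.mse_loss(recon, obs)
                - (dist.log_prob(action) * advantage.detach()).mean()
                + advantage.pow(2).mean())
        opt.zero_grad()
        loss.backward()
        opt.step()

The one design point the sketch is meant to preserve is that a single optimizer updates both modules, so gradients from the control objective flow back into the perception model alongside the unsupervised reconstruction loss; how the actual OPC algorithm structures its perception update is specified in the paper itself.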
