Efforts towards robust visual scene understanding tend to rely heavily on manual annotation. When human labels are required, collecting a dataset large enough to train a successful robot vision system is almost certain to be prohibitively expensive. However, we argue that a robot equipped with a vision sensor can learn powerful visual representations in a self-directed manner by relying on fundamental physical priors and bootstrapping techniques. For example, basic visual tracking systems have been shown to automatically label short-range correspondences in video, which can be used to train a system with capabilities analogous to object permanence in humans. Such an object permanence system can in turn automatically label long-range correspondences, enabling the training of a system able to compare and contrast objects and scenes. In the end, the agent develops a representation that encodes the persistent material properties, state, lighting, and other attributes of the various parts of a visual scene. Starting from such a strong visual representation, the agent can then learn to solve traditional vision tasks such as class and/or instance recognition using only a sparse set of labels, found on the Internet or solicited at little cost from humans. More importantly, such a representation would also enable truly robust solutions to core challenges in robotics such as global localization, loop closure detection, and object pose estimation.
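To make the first bootstrapping step concrete, the sketch below illustrates one common way tracker-derived correspondences can supervise representation learning: patches linked by a short-range track serve as positive pairs, random patches as negatives, under a triplet loss. This is a minimal illustration under stated assumptions, not the system described above; the PatchEncoder architecture, tensor shapes, and the choice of PyTorch's triplet margin loss are all illustrative.

```python
# Minimal sketch: learn patch descriptors from tracker-labeled video
# correspondences. Anchor/positive come from the two ends of a tracked
# trajectory (labels obtained "for free"); negatives are random patches.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchEncoder(nn.Module):
    """Tiny CNN that embeds an image patch into a unit-norm descriptor."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, x):
        # L2-normalize so distances in the triplet loss are well scaled.
        return F.normalize(self.net(x), dim=1)

encoder = PatchEncoder()
loss_fn = nn.TripletMarginLoss(margin=0.5)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)

# Placeholder batch: in practice `anchor` and `positive` would be the first
# and last patches of a tracked trajectory, `negative` a patch from elsewhere.
anchor, positive, negative = (torch.randn(16, 3, 64, 64) for _ in range(3))

opt.zero_grad()
loss = loss_fn(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
opt.step()
```

The same training loop could later consume long-range correspondences produced by the object permanence stage, which is what makes the bootstrapping recursive: each learned capability supplies the automatic labels for the next.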