Development of object manipulation through self-exploratory visuomotor experience

Human infants learn to interpret their visuomotor experience and to predict the effects of their actions by practicing interactions with the environment. This paper presents a computational model of this process that can be implemented in an artificial agent. We first present a mechanism for simultaneous segmentation and modeling of the agent's body, movable objects, and the visual environment. This model can explain a "sense of agency" in terms of the predictive certainty of object movements conditioned on actions. We then describe causality learning for object manipulation in detail. Our experimental setup requires a model that accounts for combinational causality beyond simple direct causality. We propose a novel strategy for causal exploration and demonstrate its effectiveness in experiments. The results show that the proposed model allows an agent to efficiently acquire object manipulation skills through self-exploratory visuomotor experience, i.e., a sequence of raw bitmap images paired with the actions taken at each time step.
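The abstract's notion of a "sense of agency" as predictive certainty can be illustrated with a minimal sketch. The code below is an illustrative assumption, not the paper's actual model: it scores agency as the information gained about an object's motion by conditioning a forward model on the agent's own action. All function names, distributions, and thresholds here are hypothetical.

```python
import numpy as np

# Hypothetical sketch of "sense of agency" as predictive certainty.
# A forward model P(object motion | action) is compared against an
# action-independent baseline P(object motion); when conditioning on
# the agent's action sharply reduces uncertainty, the motion is
# attributed to the agent itself. (Illustrative only.)

def entropy(p):
    """Shannon entropy of a discrete distribution, in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def agency_score(p_conditional, p_marginal):
    """Bits of uncertainty about object motion removed by knowing the
    action; larger values mean the action is more predictive."""
    return entropy(p_marginal) - entropy(p_conditional)

# Toy distributions over four possible object-motion outcomes.
p_marginal = [0.25, 0.25, 0.25, 0.25]      # motion unpredictable a priori
p_given_action = [0.85, 0.05, 0.05, 0.05]  # action makes motion near-certain

score = agency_score(p_given_action, p_marginal)
print(score > 1.0)  # large certainty gain -> movement attributed to self
```

Under this sketch, a movable object whose motion becomes far more predictable when conditioned on the agent's action is treated as "caused by me", while background changes that stay equally unpredictable are not.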
