No Bad Dogs: Ethological Lessons for Learning in Hamsterdam*

We present an architecture for autonomous creatures that allows learning to be combined with action selection, based on ideas from ethology. We show how temporal-difference learning may be used within the context of an ethologically inspired animat architecture to build and modify portions of the behavior network, and to set fundamental parameters including the strength associated with individual Releasing Mechanisms, the time course associated with appetitive behaviors, and the learning rates to be used based on the observed reliability of specific contingencies. The learning algorithm has been implemented as part of the Hamsterdam toolkit for building autonomous animated creatures. When implemented in Silas, a virtual dog, the algorithm enables Silas to be trained using classical and instrumental conditioning.
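
As a rough illustration of the kind of temporal-difference update the abstract refers to, the sketch below applies a generic TD(0) rule to a single cue that is always followed by a reward, as in a classical-conditioning trial. It is not the algorithm from the paper: the function names (`td0_update`, `train_cue`), the parameters `alpha` and `gamma`, and the single-cue setup are all assumptions chosen for illustration.

```python
def td0_update(value, reward, next_value, alpha=0.1, gamma=0.9):
    """Apply one generic TD(0) step to a predicted value (hypothetical helper)."""
    # TD error: what was observed (reward plus discounted next prediction)
    # minus what was predicted.
    td_error = reward + gamma * next_value - value
    return value + alpha * td_error


def train_cue(trials, reward=1.0, alpha=0.1):
    """Repeatedly pair a cue with a reward and return the cue's learned value.

    Assumes each trial ends right after the reward, so the next prediction is 0.
    """
    value = 0.0
    for _ in range(trials):
        value = td0_update(value, reward, next_value=0.0, alpha=alpha)
    return value


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(train_cue(n), 4))
```

Under these assumptions the cue's value climbs toward the reward magnitude over repeated pairings, and the learning rate `alpha` controls how quickly it does so.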

Bruce Blumberg | Peter M. Todd

[1] P. Killeen. Mathematical principles of reinforcement , 1994 .

[2] Sara J. Shettleworth,et al. CHAPTER 7 – Biological Approaches to the Study of Learning , 1994 .

[3] Ron Koppelberger,et al. Space and Time , 2021, Nature.

[4] Simon Giszter. Reinforcement tuning of action synthesis and selection in a “virtual frog” , 1994 .

[5] Ian Horswill. A simple, cheap, and robust visual navigation system , 1993 .

[6] Alex Pentland,et al. The ALIVE system: wireless, full-body interaction with autonomous agents , 1997, Multimedia Systems.

[7] B. Bernstein,et al. Animal Behavior , 1927, Japanese Marine Life.

[8] Terrence J. Sejnowski,et al. Foraging in an Uncertain Environment Using Predictive Hebbian Learning , 1993, NIPS.

[9] Rodney A. Brooks,et al. Learning to Coordinate Behaviors , 1990, AAAI.

[10] Pattie Maes,et al. Situated agents can have goals , 1990, Robotics Auton. Syst..

[11] Richard S. Sutton,et al. Reinforcement learning architectures for animats , 1991 .

[12] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .

[13] James S. Morgan,et al. A Hierarchical Network of Control Systems that Learn: Modeling Nervous System Function During Classical and Instrumental Conditioning , 1993, Adapt. Behav..

[14] Rodney A. Brooks,et al. A Robust Layered Control System For A Mobile Robot , 2022 .

[15] Chris Watkins,et al. Learning from delayed rewards , 1989 .

[16] S. Abdi. Darwin Machines and the Nature of Knowledge , 1995 .

[17] Long Lin,et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains , 1992 .

[18] Craig W. Reynolds. Flocks, herds, and schools: a distributed behavioral model , 1987, SIGGRAPH.

[19] Steven Douglas Whitehead,et al. Reinforcement learning for the adaptive control of perception and action , 1992 .

[20] Peter M. Todd,et al. Exploring adaptive agency II: simulating the evolution of associative learning , 1991 .

[21] Richard S. Sutton,et al. Time-Derivative Models of Pavlovian Reinforcement , 1990 .

[22] G. Reeke. Marvin Minsky, The Society of Mind , 1991, Artif. Intell..

[23] Toby Tyrrell,et al. Computational mechanisms for action selection , 1993 .

[24] G. Davey. Ecological Learning Theory , 1989 .

[25] P. Todd,et al. Exploring Adaptive Agency I: Theory and Methods for Simulating the Evolution of Learning , 1991 .

[26] Leonard N. Foner,et al. Paying Attention to What's Important: Using Focus of Attention to Improve Unsupervised Learning , 1994 .

[27] H. Minkowski,et al. Space and time , 1952 .

[28] Pattie Maes,et al. Modeling Adaptive Autonomous Agents , 1993, Artificial Life.

[29] Demetri Terzopoulos,et al. Artificial fishes: physics, locomotion, perception, behavior , 1994, SIGGRAPH.

[30] H. C. Plotkin,et al. Darwin machines and the nature of knowledge , 1994 .

[31] J. L. Gould,et al. The Animal Mind , 1931, Nature.

[33] Bruce Blumberg,et al. Multi-level direction of autonomous creatures for real-time virtual environments , 1995, SIGGRAPH.

[34] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell..

[35] 永福智志. The Organization of Learning , 2005, Journal of Cognitive Neuroscience.

[36] Bruce Blumberg,et al. Action-selection in hamsterdam: lessons from ethology , 1994 .

[37] K. Lorenz,et al. Man Meets Dog , 1950 .