Learning to Perceive and Act

This paper considers adaptive control architectures that integrate active sensory-motor systems with decision systems based on reinforcement learning. One unavoidable consequence of active perception is that the agent's internal representation often confounds external world states. We call this phenomenon perceptual aliasing and show that it destabilizes existing reinforcement learning algorithms with respect to the optimal decision policy. We then describe a new decision system that overcomes these difficulties for a restricted class of decision problems. The system incorporates a perceptual subcycle within the overall decision cycle and uses a modified learning algorithm to suppress the effects of perceptual aliasing. The result is a control architecture that learns not only how to solve a task but also where to focus its attention in order to collect necessary sensory information.

This work was supported in part by NSF research grant no. DCR-8602958 and in part by NSF research grant no. IRI-8903582.
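The paper's own algorithm is not reproduced here; the following is a minimal illustrative sketch, in Python, of the two ideas the abstract names: an internal state produced by an active sensor, so that distinct world states can alias to the same percept, and a decision cycle that contains a perceptual subcycle (choosing where to look) before the overt action. The corridor layout, reward, the percept function, and the two Q-tables are invented for illustration and are not taken from the paper.

```python
# Minimal sketch (not the paper's algorithm): Q-learning over an internal
# representation produced by an active sensor. Because the percept depends on
# where the agent looks, different world positions can map to the same
# internal state (perceptual aliasing). Each decision cycle runs a perceptual
# subcycle (pick a gaze) and then an overt action (move).

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.2, 0.9, 0.1

# Hypothetical world: a corridor of marked cells; the goal is the rightmost cell.
MARKS = ["start", "blank", "blank", "blank", "goal"]
GOAL = len(MARKS) - 1

GAZES = [-1, 0, +1]   # perceptual actions: look left, look here, look right
MOVES = [-1, +1]      # overt actions: step left, step right

def percept(pos, gaze):
    """Internal state = what the sensor reports at the gazed-at cell.
    Positions 1-3 all look alike when gazing at a blank cell: aliasing."""
    target = pos + gaze
    if 0 <= target < len(MARKS):
        return (gaze, MARKS[target])
    return (gaze, "wall")

Q_gaze = defaultdict(float)   # value of (resting percept, gaze) pairs
Q_move = defaultdict(float)   # value of (refined percept, move) pairs

def eps_greedy(q, state, actions):
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def episode(max_steps=50):
    pos = 0
    for _ in range(max_steps):
        # Perceptual subcycle: choose where to look; the resulting percept is
        # the internal state the overt decision is based on.
        rest = percept(pos, 0)
        gaze = eps_greedy(Q_gaze, rest, GAZES)
        state = percept(pos, gaze)

        # Overt action: move, observe reward, update both tables with the
        # same TD target (a simplification made for this sketch).
        move = eps_greedy(Q_move, state, MOVES)
        new_pos = min(max(pos + move, 0), GOAL)
        reward = 1.0 if new_pos == GOAL else 0.0

        next_rest = percept(new_pos, 0)
        next_v = max(Q_gaze[(next_rest, g)] for g in GAZES)
        target = reward + GAMMA * next_v
        Q_move[(state, move)] += ALPHA * (target - Q_move[(state, move)])
        Q_gaze[(rest, gaze)] += ALPHA * (target - Q_gaze[(rest, gaze)])

        pos = new_pos
        if pos == GOAL:
            break

if __name__ == "__main__":
    random.seed(0)
    for _ in range(500):
        episode()
    # Inspect which gaze the agent prefers from the start position.
    start = percept(0, 0)
    print({g: round(Q_gaze[(start, g)], 2) for g in GAZES})
```

The sketch only illustrates the structure of the decision cycle; it does not implement the paper's modification for suppressing the effects of perceptual aliasing, which is what distinguishes the proposed system from ordinary Q-learning on aliased states.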
