Scaling-up Knowledge for a Cognizant Robot

This paper takes a new approach to the old adage that knowledge is the key to artificial intelligence. A cognizant robot is a robot with a deep and immediately accessible understanding of its interaction with the environment, an understanding the robot can use to adapt flexibly to novel situations. Such a robot will need a vast amount of situated, revisable, and expressive knowledge to display flexible intelligent behaviors. Instead of relying on human-provided knowledge, we propose that an arbitrary robot can autonomously acquire pertinent knowledge directly from everyday interaction with the environment. We show how existing ideas in reinforcement learning can enable a robot to maintain and improve its knowledge. The robot performs a continual learning process that scales up knowledge acquisition to cover a large number of facts, skills, and predictions. This knowledge has semantics that are grounded in sensorimotor experience. We see the development of more cognizant robots as a necessary step towards broadly competent robots.
