A self-organizing developmental cognitive architecture with interactive reinforcement learning

Abstract Developmental cognitive systems can endow robots with the ability to learn knowledge incrementally and adapt autonomously to complex environments. Conventional cognitive methods often acquire knowledge through passive perception, such as observing and listening. However, this passive way of learning inevitably produces some incorrect representations and, without feedback, cannot correct them online. To tackle this problem, we propose a biologically inspired hierarchical cognitive system called Self-Organizing Developmental Cognitive Architecture with Interactive Reinforcement Learning (SODCA-IRL). The architecture introduces interactive reinforcement learning into hierarchical self-organizing incremental neural networks so that it can learn object concepts and, at the same time, fine-tune the learned knowledge by interacting with humans. To realize this integration, we equip each neural network with a memory model, designed as an exponential function controlled by two forgetting factors, that simulates the consolidation and forgetting processes of humans. In addition, an interactive reinforcement strategy provides appropriate rewards and carries out mistake correction. The feedback acts on the forgetting factors to reinforce or weaken the memory of neurons, so that correct knowledge is preserved while incorrect representations are forgotten. Experimental results show that the proposed method makes effective use of human feedback to significantly improve learning effectiveness and reduce model redundancy.
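
The memory mechanism described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that the memory model is an exponential function controlled by two forgetting factors and that human feedback acts on those factors; it does not give the functional form. Everything else here is an assumption made for illustration: the form exp(-decay * age / consolidation), the names `NeuronMemory`, `consolidation`, `decay`, `apply_feedback`, and `prune`, the additive update rule, and the pruning threshold. It should not be read as the SODCA-IRL implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class NeuronMemory:
    """Hypothetical per-neuron memory state (all names are illustrative)."""
    consolidation: float = 1.0  # assumed factor: larger value -> slower forgetting
    decay: float = 1.0          # assumed factor: larger value -> faster forgetting
    age: float = 0.0            # time elapsed since the neuron was learned

    def strength(self) -> float:
        # Assumed form: exponential decay whose rate is set by the two factors.
        return math.exp(-self.decay * self.age / self.consolidation)

    def tick(self, dt: float = 1.0) -> None:
        # Advance time; memory strength falls as age grows.
        self.age += dt

    def apply_feedback(self, reward: float, step: float = 0.5) -> None:
        # Assumed update rule: positive reward consolidates the memory,
        # negative reward accelerates forgetting, zero reward changes nothing.
        if reward > 0:
            self.consolidation += step * reward
        elif reward < 0:
            self.decay += step * (-reward)


def prune(neurons, threshold=0.1):
    """Keep only neurons whose memory strength is at least the threshold."""
    return [n for n in neurons if n.strength() >= threshold]


if __name__ == "__main__":
    good, bad = NeuronMemory(), NeuronMemory()
    for n in (good, bad):
        n.tick(3.0)
    good.apply_feedback(+1.0)   # human confirms the representation
    bad.apply_feedback(-1.0)    # human flags the representation as wrong
    print(good.strength() > bad.strength())
    print(len(prune([good, bad])))
```

In this sketch, a neuron that receives positive feedback decays more slowly and survives pruning, while a neuron that receives negative feedback decays faster and is removed. That mirrors the abstract's claim that correct knowledge is preserved and incorrect representations are forgotten, which is also how redundancy in the model would be reduced.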
