Cognitive Architecture for Video Games

There has been increasing interest in frame-oriented reinforcement learning (FORL) in recent years. However, most works in the literature draw little inspiration from the human perception-action-reward cycle (PARC) or from causation. Inspired by the human vision system and learning strategy, we propose a novel architecture for FORL that understands the content of raw frames. The architecture achieves four objectives:

1. Extracting information from the environment using only unsupervised learning and reinforcement learning.
2. Understanding the content of a raw frame.
3. Exploiting a foveal vision strategy analogous to the human vision system.
4. Establishing self-awareness and automatically collecting new training data subsets to learn new objects without forgetting previous ones.

The architecture is developed in the Super Mario Brothers video game. At first, Mario is the only object the architecture recognizes. After automatic data-subset collection and a memory update, the architecture can recognize both Goomba and Mario and classify them using incremental training. We illustrate the performance of each component of the architecture with snippets obtained from the video game.
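The abstract does not give implementation details for the foveal vision strategy. As a minimal sketch of the general idea, assuming frames are NumPy arrays, one common realization is a two-resolution glimpse: a full-resolution patch around a point of interest (the "fovea") plus a coarse, downsampled view of the whole frame (the "periphery"). The function name and parameters below are hypothetical, not from the paper.

```python
import numpy as np

def foveal_glimpse(frame, center, fovea_size=32, periphery_scale=4):
    """Extract a two-resolution glimpse from a raw frame.

    The foveal patch keeps full resolution around `center`; the
    periphery is the whole frame subsampled by `periphery_scale`.
    All names and defaults here are illustrative assumptions.
    """
    h, w = frame.shape[:2]
    cy, cx = center
    half = fovea_size // 2
    # Clamp the fovea window so it stays fully inside the frame.
    y0 = min(max(cy - half, 0), h - fovea_size)
    x0 = min(max(cx - half, 0), w - fovea_size)
    fovea = frame[y0:y0 + fovea_size, x0:x0 + fovea_size]
    # Coarse periphery: simple strided subsampling of the full frame.
    periphery = frame[::periphery_scale, ::periphery_scale]
    return fovea, periphery

# Example on an NES-sized frame (240x256 RGB), fixating near the center.
frame = np.zeros((240, 256, 3), dtype=np.uint8)
fovea, periphery = foveal_glimpse(frame, center=(120, 128))
print(fovea.shape, periphery.shape)  # (32, 32, 3) (60, 64, 3)
```

This keeps detail where the agent is attending while bounding the total pixel budget, which is the usual motivation for fovea-like processing in vision-based RL.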
