Towards Symbolic Reinforcement Learning with Common Sense

Deep Reinforcement Learning (deep RL) has made several breakthroughs in recent years, in applications ranging from complex control tasks in unmanned vehicles to game playing. Despite its success, deep RL still lacks several important capacities of human intelligence, such as transfer learning, abstraction and interpretability. Deep Symbolic Reinforcement Learning (DSRL) seeks to incorporate such capacities into deep Q-networks (DQN) by learning a relevant symbolic representation prior to using Q-learning. In this paper, we propose a novel extension of DSRL, which we call Symbolic Reinforcement Learning with Common Sense (SRL+CS), offering a better balance between generalization and specialization, inspired by principles of common sense when assigning rewards and aggregating Q-values. Experiments reported in this paper show that SRL+CS learns consistently faster than Q-learning and DSRL, while also achieving higher accuracy. In the hardest case, where agents were trained in a deterministic environment and tested in a random environment, SRL+CS achieves nearly 100% average accuracy, compared with 70% for DSRL and 50% for DQN. To the best of our knowledge, this is the first instance of near-perfect zero-shot transfer learning using reinforcement learning.
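To make the object-level decomposition referred to above concrete, the following is a minimal sketch, in Python, of tabular Q-learning over a symbolic state in which each detected object type keeps its own Q-table and action selection aggregates the per-object Q-values. The grid-world action set, the relative-position state encoding, the hyperparameters, and the equal credit assignment across objects are illustrative assumptions; they do not reproduce SRL+CS's specific common-sense rules for reward assignment and Q-value weighting.

```python
# Minimal sketch (not the authors' implementation): per-object tabular Q-learning
# with aggregated action selection, in the spirit of DSRL-style symbolic RL.
import random
from collections import defaultdict

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up (assumed grid world)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1         # illustrative hyperparameters

# One Q-table per symbolic object type, keyed by (relative position, action).
q_tables = {"plus": defaultdict(float), "minus": defaultdict(float)}

def aggregate_q(objects, action):
    """Sum per-object Q-values; objects is a list of (object_type, relative_position)."""
    return sum(q_tables[t][(rel, action)] for t, rel in objects)

def select_action(objects):
    """Epsilon-greedy selection on the aggregated Q-value of the symbolic state."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: aggregate_q(objects, a))

def update(objects, action, reward, next_objects):
    """Q-learning update applied to every object's table (equal credit, an assumption)."""
    best_next = max(aggregate_q(next_objects, a) for a in ACTIONS)
    for t, rel in objects:
        td_error = reward + GAMMA * best_next - q_tables[t][(rel, action)]
        q_tables[t][(rel, action)] += ALPHA * td_error
```

Because each Q-table is indexed by an object's position relative to the agent rather than by the full screen state, a policy learned in one object configuration can, in principle, transfer to unseen configurations, which is the kind of generalization the experiments above measure.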
