How Decisions Are Made in Brains: Unpack “Black Box” of CNN With Ms. Pac-Man Video Game

Convolutional neural networks (CNNs) are widely used in computer vision problems such as image recognition and image classification because of their powerful ability to process image data. However, a CNN is an end-to-end model that remains a “black box” for users: its internal logic is not explicitly known. Interpreting CNNs can help us better understand neural networks and the ways they benefit us as users. In this paper, we explain the contributions of the convolutional layer of a CNN using a neuroscience experiment paradigm: the Ms. Pac-Man video game. Ms. Pac-Man is a popular game that provides a complex yet natural decision-making task rather than a laboratory artifact. An analysis of the game can thus intuitively reveal the complicated decision-making process in animal brains. We sought to (1) elucidate the role of the CNN convolutional layer and (2) analyze the low-level strategies in animal brains based on high-level decisions. Using recorded videos of monkeys playing Ms. Pac-Man, we empirically demonstrate that our network is able to predict the moving direction of Pac-Man at every time step. We further find that the decision-making process at work during gameplay is driven by high rewards. A heatmap of the weighted feature map at each convolutional layer shows that the CNN makes predictions based on the most important input pattern, which in this case is the high-reward entities in the game.
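
The weighted feature-map heatmap mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the per-channel weights are obtained, so the code below assumes they are already given (for example, by a Grad-CAM-style scheme) and simply combines the channels. The function name `weighted_feature_heatmap`, the ReLU step, and the normalization to [0, 1] are illustrative assumptions, not the authors' implementation.

```python
def weighted_feature_heatmap(feature_maps, weights):
    """Combine K feature maps (each H x W) into a single H x W heatmap.

    feature_maps: list of K two-dimensional lists of numbers.
    weights: list of K per-channel importance weights (assumed given).
    """
    if not feature_maps or len(feature_maps) != len(weights):
        raise ValueError("need one weight per feature map")

    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    heatmap = [[0.0] * width for _ in range(height)]

    # Weighted sum over channels at every spatial position.
    for fmap, w in zip(feature_maps, weights):
        for r in range(height):
            for c in range(width):
                heatmap[r][c] += w * fmap[r][c]

    # Keep only positive evidence (ReLU); an assumption borrowed from Grad-CAM.
    heatmap = [[max(v, 0.0) for v in row] for row in heatmap]

    # Scale to [0, 1] so heatmaps from different layers are comparable.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Two toy 2x2 channels with equal weights.
maps = [[[1, 0], [0, 0]], [[0, 0], [0, 2]]]
print(weighted_feature_heatmap(maps, [1.0, 1.0]))  # [[0.5, 0.0], [0.0, 1.0]]
```

In such a sketch, the brightest cells mark the input regions that contribute most to the prediction; in the setting described above, those would be expected to coincide with high-reward entities in the game frame.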
