A Layered Architecture for Active Perception: Image Classification using Deep Reinforcement Learning

We propose a planning and perception mechanism for a robot (agent) that can observe the underlying environment only partially, in order to solve an image classification problem. We suggest a three-layer architecture consisting of a meta-layer that decides on intermediate goals, an action-layer that selects local actions as the agent navigates towards a goal, and a classification-layer that evaluates the reward and makes a prediction. We design and implement these layers using deep reinforcement learning, and use a generalized policy gradient algorithm to learn their parameters so as to maximize the expected reward. The proposed methodology is tested on the MNIST dataset of handwritten digits, which provides a level of explainability when interpreting the agent's intermediate goals and course of action.
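
A minimal PyTorch sketch of how such a three-layer agent might be organized is given below. The module names (MetaLayer, ActionLayer, ClassificationLayer), the glimpse size, the goal grid, and the combined loss are illustrative assumptions for this sketch, not details taken from the paper.

```python
# Hypothetical sketch of the three-layer agent, assuming 28x28 MNIST images,
# a square glimpse window as the partial observation, and a REINFORCE-style
# policy gradient. All sizes and names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

GLIMPSE = 7       # side length of the partial observation window (assumption)
N_GOALS = 9       # meta-layer picks one of a 3x3 grid of goal regions (assumption)
N_MOVES = 4       # action-layer moves the window up/down/left/right
N_CLASSES = 10    # MNIST digits

class MetaLayer(nn.Module):
    """Decides an intermediate goal (a coarse image region) from the current glimpse."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(GLIMPSE * GLIMPSE, N_GOALS)
    def forward(self, glimpse):
        return F.softmax(self.fc(glimpse.flatten(1)), dim=-1)

class ActionLayer(nn.Module):
    """Selects a local move conditioned on the glimpse and the current goal."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(GLIMPSE * GLIMPSE + N_GOALS, N_MOVES)
    def forward(self, glimpse, goal_onehot):
        x = torch.cat([glimpse.flatten(1), goal_onehot], dim=-1)
        return F.softmax(self.fc(x), dim=-1)

class ClassificationLayer(nn.Module):
    """Predicts the digit from the sequence of glimpses, summarized by a GRU state."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRUCell(GLIMPSE * GLIMPSE, 128)
        self.out = nn.Linear(128, N_CLASSES)
    def forward(self, glimpse, h):
        h = self.rnn(glimpse.flatten(1), h)
        return self.out(h), h

def episode_loss(log_probs, reward, logits, label):
    """Generalized policy-gradient objective for the sketch: reward-weighted
    log-probabilities of the meta/action choices plus a supervised term for
    the classification layer. `log_probs` is a list of per-step tensors."""
    pg = -(torch.stack(log_probs).sum(0) * reward).mean()
    ce = F.cross_entropy(logits, label)
    return pg + ce
```

In this sketch, an episode would sample a goal from MetaLayer, roll out moves from ActionLayer while feeding each glimpse to ClassificationLayer, and use the classification outcome as the terminal reward that weights the policy-gradient term.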
