Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents

Psychlab is a simulated psychology laboratory inside the first-person 3D game world of DeepMind Lab (Beattie et al. 2016). It enables classical laboratory psychology experiments to be implemented so that they work with both human and artificial agents. Psychlab has a simple and flexible API that lets users easily create their own tasks. As examples, we are releasing Psychlab implementations of several classical experimental paradigms, including visual search, change detection, random dot motion discrimination, and multiple object tracking. We also contribute a study of the visual psychophysics of a specific state-of-the-art deep reinforcement learning agent: UNREAL (Jaderberg et al. 2016). This study leads to the surprising conclusion that UNREAL learns more quickly about larger target stimuli than about smaller ones. In turn, this insight motivates a specific improvement: a simple model of foveal vision that significantly boosts UNREAL's performance, both on Psychlab tasks and on standard DeepMind Lab tasks. By open-sourcing Psychlab, we hope to facilitate a range of future studies of this kind that simultaneously advance deep reinforcement learning and improve its links with cognitive science.

[1]  L. Glass Moiré Effect from Random Dots , 1969, Nature.

[2]  R. Pérez,et al.  Perception of Random Dot Interference Patterns , 1973, Nature.

[3]  W. A. Phillips On the distinction between sensory storage and short-term visual memory , 1974 .

[4]  E. Switkes,et al.  Pattern Recognition in Humans: Correlations Which Cannot be Perceived , 1976, Perception.

[5]  A. Treisman,et al.  A feature-integration theory of attention , 1980, Cognitive Psychology.

[6]  Kunihiko Fukushima,et al.  Neocognitron: A Self-Organizing Neural Network Model for a Mechanism of Visual Pattern Recognition , 1982 .

[7]  Hugh R. Wilson Development of spatiotemporal mechanisms in infant vision , 1988, Vision Research.

[8]  Z. W. Pylyshyn,et al.  Tracking multiple independent targets: evidence for a parallel tracking mechanism. , 1988, Spatial vision.

[9]  W. Newsome,et al.  A selective impairment of motion perception following lesions of the middle temporal visual area (MT) , 1988, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[10]  W. S. McCulloch,et al.  A logical calculus of the ideas immanent in nervous activity , 1990, The Philosophy of Artificial Intelligence.

[11]  J. Movshon,et al.  Neuronal mechanisms of motion perception. , 1990, Cold Spring Harbor symposia on quantitative biology.

[12]  Neil A. Macmillan,et al.  Detection Theory: A User's Guide , 1991 .

[13]  J. Movshon,et al.  The analysis of visual motion: a comparison of neuronal and psychophysical performance , 1992, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[14]  B. Treutwein Adaptive psychophysical procedures , 1995, Vision Research.

[15]  R. Desimone,et al.  Neural mechanisms of selective visual attention. , 1995, Annual review of neuroscience.

[16]  J. Movshon,et al.  A relationship between behavioral choice and the visual responses of neurons in macaque MT. , 1996, Visual neuroscience.

[17]  Steven C. Dakin The detection of structure in glass patterns: Psychophysics and computational models , 1997, Vision Research.

[18]  Edward K. Vogel,et al.  The capacity of visual working memory for features and conjunctions , 1997, Nature.

[19]  Peripheral and central factors limiting the development of contrast sensitivity in Macaque monkeys , 1998, Vision Research.

[20]  Yoshua Bengio,et al.  Convolutional networks for images, speech, and time series , 1998 .

[21]  H. Wilson,et al.  Detection of global structure in Glass patterns: implications for form vision , 1998, Vision Research.

[22]  R. Ratcliff,et al.  Modeling Response Times for Two-Choice Decisions , 1998 .

[23]  T. Poggio,et al.  Hierarchical models of object recognition in cortex , 1999, Nature Neuroscience.

[24]  C. Koch,et al.  A saliency-based search mechanism for overt and covert shifts of visual attention , 2000, Vision Research.

[25]  W. Newsome,et al.  Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. , 2001, Journal of neurophysiology.

[26]  Z. Pylyshyn Visual indexes, preconceptual objects, and situated vision , 2001, Cognition.

[27]  D. Purves,et al.  Neuroscience. 2nd edition , 2001 .

[28]  J. Movshon,et al.  Signals in Macaque Striate Cortical Neurons that Support the Perception of Glass Patterns , 2002, The Journal of Neuroscience.

[29]  Robert A. Jacobs,et al.  Comparing perceptual learning across tasks: A review , 2002 .

[30]  J. A. Wilson,et al.  Glass pattern studies of local and global processing of contrast variations , 2004, Vision Research.

[31]  H. Barlow,et al.  Convergent evidence for the visual analysis of optic flow through anisotropic attenuation of high spatial frequencies. , 2004, Journal of vision.

[32]  P. Cavanagh,et al.  The Capacity of Visual Short-Term Memory is Set Both by Visual Information Load and by Number of Objects , 2004, Psychological science.

[33]  David R. Badcock,et al.  Interactions between luminance and contrast signals in global form detection , 2005, Vision Research.

[34]  P. Cavanagh,et al.  Tracking multiple targets with multifocal attention , 2005, Trends in Cognitive Sciences.

[35]  D. Badcock,et al.  Vernier acuity is normal in migraine, whereas global form and global motion perception are not. , 2006, Investigative ophthalmology & visual science.

[36]  J. Maunsell,et al.  Feature-based attention in visual cortex , 2006, Trends in Neurosciences.

[37]  D. Burr,et al.  The effects of opposite-polarity dipoles on the detection of Glass patterns , 2006, Vision Research.

[38]  Timothy D. Hanks,et al.  Microstimulation of macaque area LIP affects decision-making in a motion discrimination task , 2006, Nature Neuroscience.

[39]  Thomas Serre,et al.  A feedforward architecture accounts for rapid categorization , 2007, Proceedings of the National Academy of Sciences.

[40]  J. Movshon,et al.  Glass pattern responses in macaque V2 neurons. , 2007, Journal of vision.

[41]  E. Vogel,et al.  Visual Working Memory Represents a Fixed Number of Items Regardless of Complexity , 2007, Psychological science.

[42]  S. Luck,et al.  Discrete fixed-resolution representations in visual working memory , 2008, Nature.

[43]  Aude Oliva,et al.  Visual long-term memory has a massive storage capacity for object details , 2008, Proceedings of the National Academy of Sciences.

[44]  Aaron R. Seitz,et al.  What a difference a parameter makes: A psychophysical comparison of random dot motion algorithms , 2009, Vision Research.

[45]  Colin W. G. Clifford,et al.  Discrimination of the local orientation structure of spiral Glass patterns early in human visual cortex , 2009, NeuroImage.

[46]  Lawrie S. McKay,et al.  Vision in autism spectrum disorders , 2009, Vision Research.

[47]  Maryam Vaziri Pashkam,et al.  Spatial Heterogeneity in the Perception of Face and Form Attributes , 2010, Current Biology.

[48]  M. Goldberg,et al.  Attention, intention, and priority in the parietal lobe. , 2010, Annual review of neuroscience.

[49]  E. Vogel,et al.  Quantity, not quality: the relationship between fluid intelligence and working memory capacity , 2010, Psychonomic bulletin & review.

[50]  Brian J. Spiering,et al.  A Critical Review of Habit Learning and the Basal Ganglia , 2011, Front. Syst. Neurosci..

[51]  E. Vogel,et al.  Visual working memory capacity: from psychophysics and neurobiology to individual differences , 2013, Trends in Cognitive Sciences.

[52]  Daniel L. K. Yamins,et al.  Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition , 2014, PLoS Comput. Biol..

[53]  Alex Graves,et al.  Recurrent Models of Visual Attention , 2014, NIPS.

[54]  Roozbeh Kiani,et al.  A neural mechanism of speed-accuracy tradeoff in macaque area LIP , 2014, eLife.

[55]  Nikolaus Kriegeskorte,et al.  Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation , 2014, PLoS Comput. Biol..

[56]  John K. Tsotsos,et al.  On computational modeling of visual saliency: Examining what’s right, and what’s left , 2015, Vision Research.

[57]  Yoshua Bengio,et al.  Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[58]  Koray Kavukcuoglu,et al.  Multiple Object Recognition with Visual Attention , 2015, ICLR.

[59]  Alex Graves,et al.  DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.

[60]  William T. Freeman,et al.  Learning Ordinal Relationships for Mid-Level Vision , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[61]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[62]  Alex Graves,et al.  Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.

[63]  Joshua B. Tenenbaum,et al.  Building machines that learn and think like people , 2016, Behavioral and Brain Sciences.

[64]  Devendra Singh Chaplot Transfer Deep Reinforcement Learning in 3D Environments: An Empirical Study , 2016 .

[65]  Peter L. Bartlett,et al.  RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.

[66]  Wojciech Jaskowski,et al.  ViZDoom: A Doom-based AI research platform for visual reinforcement learning , 2016, 2016 IEEE Conference on Computational Intelligence and Games (CIG).

[67]  Shane Legg,et al.  DeepMind Lab , 2016, ArXiv.

[68]  Demis Hassabis,et al.  Grounded Language Learning in a Simulated 3D World , 2017, ArXiv.

[69]  Tom Schaul,et al.  FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.

[70]  J. Wolfe,et al.  Five factors that guide attention in visual search , 2017, Nature Human Behaviour.

[71]  Zeb Kurth-Nelson,et al.  Learning to reinforcement learn , 2016, CogSci.

[72]  Stephen Clark,et al.  Understanding Grounded Language Learning Agents , 2017, ArXiv.

[73]  Samuel Ritter,et al.  Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study , 2017, ICML.

[74]  Stephen Clark,et al.  Understanding Early Word Learning in Situated Artificial Agents , 2017 .

[75]  Daan Wierstra,et al.  Recurrent Environment Simulators , 2017, ICLR.

[76]  Razvan Pascanu,et al.  Learning to Navigate in Complex Environments , 2016, ICLR.

[77]  Tom Schaul,et al.  Reinforcement Learning with Unsupervised Auxiliary Tasks , 2017, ICLR.

[78]  Guillaume Lample,et al.  Arnold: An Autonomous Agent to Play FPS Games , 2017, AAAI.

[79]  Tomaso A. Poggio,et al.  Do Deep Neural Networks Suffer from Crowding? , 2017, NIPS.

[80]  Ruslan Salakhutdinov,et al.  Gated-Attention Architectures for Task-Oriented Language Grounding , 2017, AAAI.