Hierarchical Modular Optimization of Convolutional Networks Achieves Representations Similar to Macaque IT and Human Ventral Stream

Humans recognize visually-presented objects rapidly and accurately. To understand this ability, we seek to construct models of the ventral stream, the series of cortical areas thought to subserve object recognition. One tool to assess the quality of a model of the ventral stream is the Representational Dissimilarity Matrix (RDM), which uses a set of visual stimuli and measures the distances produced in either the brain (i.e. fMRI voxel responses, neural firing rates) or in models (features). Previous work has shown that all known models of the ventral stream fail to capture the RDM pattern observed in either IT cortex, the highest ventral area, or in the human ventral stream. In this work, we construct models of the ventral stream using a novel optimization procedure for category-level object recognition problems, and produce RDMs resembling both macaque IT and human ventral stream. The model, while novel in the optimization procedure, further develops a long-standing functional hypothesis that the ventral visual stream is a hierarchically arranged series of processing stages optimized for visual object recognition.

[1]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[2]  Thomas Serre,et al.  A quantitative theory of immediate visual recognition. , 2007, Progress in brain research.

[3]  Thomas Serre,et al.  A feedforward architecture accounts for rapid categorization , 2007, Proceedings of the National Academy of Sciences.

[4]  Ha Hong,et al.  The Neural Representation Benchmark and its Evaluation on Brain and Machine , 2013, ICLR.

[5]  H. Barlow Vision: A computational investigation into the human representation and processing of visual information: David Marr. San Francisco: W. H. Freeman, 1982. pp. xvi + 397 , 1983 .

[6]  David G. Lowe,et al.  University of British Columbia. , 1945, Canadian Medical Association journal.

[7]  Keiji Tanaka,et al.  Matching Categorical Object Representations in Inferior Temporal Cortex of Man and Monkey , 2008, Neuron.

[8]  N. Kanwisher,et al.  The lateral occipital complex and its role in object recognition , 2001, Vision Research.

[9]  Nicole C. Rust,et al.  Do We Know What the Early Visual System Does? , 2005, The Journal of Neuroscience.

[10]  David J. Field,et al.  How Close Are We to Understanding V1? , 2005, Neural Computation.

[11]  D. J. Felleman,et al.  Distributed hierarchical processing in the primate cerebral cortex. , 1991, Cerebral cortex.

[12]  T. Poggio,et al.  A model of V4 shape selectivity and invariance. , 2007, Journal of neurophysiology.

[13]  Tomaso Poggio,et al.  Fast Readout of Object Identity from Macaque Inferior Temporal Cortex , 2005, Science.

[14]  Peter Lennie,et al.  Coding of color and form in the geniculostriate visual pathway (invited review). , 2005, Journal of the Optical Society of America. A, Optics, image science, and vision.

[15]  Nikolaus Kriegeskorte,et al.  Relating Population-Code Representations between Man, Monkey, and Computational Models , 2009, Front. Neurosci..

[16]  T. Poggio,et al.  BOOK REVIEW David Marr’s Vision: floreat computational neuroscience VISION: A COMPUTATIONAL INVESTIGATION INTO THE HUMAN REPRESENTATION AND PROCESSING OF VISUAL INFORMATION , 2009 .

[17]  N. Kanwisher,et al.  The Fusiform Face Area: A Module in Human Extrastriate Cortex Specialized for Face Perception , 1997, The Journal of Neuroscience.

[18]  N. Kanwisher,et al.  The Human Body , 2001 .

[19]  David D. Cox,et al.  Making a Science of Model Search , 2012, ArXiv.

[20]  Z Kourtzi,et al.  Representation of Perceived Object Shape by the Human Lateral Occipital Complex , 2001, Science.

[21]  David D. Cox,et al.  A High-Throughput Screening Approach to Discovering Good Forms of Biologically Inspired Visual Representation , 2009, PLoS Comput. Biol..

[22]  Anitha Pasupathy,et al.  Transformation of shape information in the ventral pathway , 2007, Current Opinion in Neurobiology.

[23]  Keiji Tanaka,et al.  Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. , 2007, Journal of neurophysiology.

[24]  R. Desimone,et al.  Stimulus-selective properties of inferior temporal neurons in the macaque , 1984, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[25]  Eero P. Simoncelli,et al.  Metamers of the ventral stream , 2011, Nature Neuroscience.

[26]  Nicolas Pinto,et al.  Why is Real-World Visual Object Recognition Hard? , 2008, PLoS Comput. Biol..

[27]  James J. DiCarlo,et al.  How Does the Brain Solve Visual Object Recognition? , 2012, Neuron.

[28]  K. Shenoy,et al.  Temporal complexity and heterogeneity of single-neuron activity in premotor and motor cortex. , 2007, Journal of neurophysiology.

[29]  J. Gallant,et al.  Spectral receptive field properties explain shape selectivity in area V4. , 2006, Journal of neurophysiology.