A High-Throughput Screening Approach to Discovering Good Forms of Biologically Inspired Visual Representation

While many models of biological object recognition share a common set of “broad-stroke” properties, the performance of any one model depends strongly on the parameters chosen for a particular instantiation of that model, e.g., the number of units per layer, the size of pooling kernels, and the exponents in normalization operations. Since the number of such parameters (explicit or implicit) is typically large and the computational cost of evaluating one particular parameter set is high, the space of possible model instantiations goes largely unexplored. Thus, when a model fails to approach the abilities of biological visual systems, we are left uncertain whether the failure arises because we are missing a fundamental idea or because the correct “parts” have not been tuned correctly, assembled at sufficient scale, or provided with enough training. Here, we present a high-throughput approach to the exploration of such parameter sets, leveraging recent advances in stream processing hardware (high-end NVIDIA graphics cards and the PlayStation 3's IBM Cell Processor). In analogy to high-throughput screening approaches in molecular biology and genetics, we explored thousands of potential network architectures and parameter instantiations, selecting those that showed promising object recognition performance for further analysis. We show that this approach can yield significant, reproducible gains in performance across an array of basic object recognition tasks, consistently outperforming a variety of state-of-the-art purpose-built vision systems from the literature. As the scale of available computational power continues to expand, we argue that this approach has the potential to greatly accelerate progress in both artificial vision and our understanding of the computational underpinnings of biological vision.
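The screening idea described above can be illustrated with a minimal sketch, assuming plain random sampling of parameter sets followed by ranking. The search space, its parameter names and ranges, and the `toy_score` function are all hypothetical placeholders; a real run would build and train a model for each parameter set and score it on held-out object recognition tasks, which is the expensive step that stream processing hardware is meant to speed up.

```python
import random

# Hypothetical search space: names and ranges are illustrative only,
# loosely echoing the kinds of parameters mentioned in the abstract.
SEARCH_SPACE = {
    "units_per_layer": [16, 32, 64, 128],
    "pool_kernel_size": [3, 5, 7, 9],
    "norm_exponent": [1.0, 2.0, 10.0],
}


def sample_params(rng):
    """Draw one candidate parameter set uniformly from the search space."""
    return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}


def screen(evaluate, n_candidates, top_k, seed=0):
    """Sample n_candidates parameter sets, score each, and keep the top_k.

    `evaluate` maps a parameter dict to a score (higher is better).
    Returns a list of (score, params) pairs sorted from best to worst.
    """
    rng = random.Random(seed)
    candidates = [sample_params(rng) for _ in range(n_candidates)]
    scored = [(evaluate(params), params) for params in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def toy_score(params):
    """Placeholder score, NOT a vision benchmark: it simply peaks at an
    arbitrary point so that the ranking step has something to sort."""
    return -abs(params["units_per_layer"] - 64) - abs(params["pool_kernel_size"] - 5)


if __name__ == "__main__":
    for score, params in screen(toy_score, n_candidates=200, top_k=5):
        print(score, params)
```

Because each candidate is scored independently, the evaluation loop is straightforward to spread across many processors, which is what makes screening at the scale of thousands of candidates practical.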
