Sparse deep belief net model for visual area V2

Motivated in part by the hierarchical organization of the cortex, a number of algorithms have recently been proposed that try to learn hierarchical, or "deep," structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed in visual area V1 (and the cochlea), little attempt has been made thus far to evaluate these algorithms in terms of how faithfully they mimic computations at deeper levels of the cortical hierarchy. This paper presents an unsupervised learning model that faithfully mimics certain properties of visual area V2. Specifically, we develop a sparse variant of the deep belief networks of Hinton et al. (2006) [21]. We learn two layers of nodes in the network and demonstrate that the first layer, consistent with prior work on sparse coding and ICA, learns localized, oriented edge filters similar to the Gabor functions known to model V1 cell receptive fields. Further, the second layer in our model encodes correlations among the first-layer responses in the data: it picks up both collinear ("contour") features and corner and junction features. More interestingly, in a quantitative comparison, the encoding of these more complex "corner" features matches well with the results of Ito and Komatsu's study of biological V2 responses [15]. This suggests that our sparse variant of deep belief networks holds promise for modeling higher-order features.
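To make "a sparse variant of deep belief networks" concrete, the following is a minimal NumPy sketch, not the paper's implementation: one contrastive-divergence (CD-1) update for a binary-binary restricted Boltzmann machine, with a simple regularization nudge that pushes each hidden unit's mean activation toward a target sparsity. The function name, the hyperparameters p_target and lam, and the exact form of the penalty (a gradient step on the hidden biases only) are illustrative assumptions; the paper trains on real-valued image patches and uses its own form of the sparsity regularizer.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sparse_rbm_cd1_update(W, b_vis, b_hid, v0,
                              lr=0.01, p_target=0.02, lam=0.1,
                              rng=np.random.default_rng(0)):
        """One CD-1 step for a binary-binary RBM with a sparsity nudge.

        W: (n_visible, n_hidden) weights; b_vis, b_hid: biases;
        v0: (batch, n_visible) data. The lam * (p_target - mean activation)
        term is a simplified stand-in for the paper's sparsity regularizer.
        """
        # Positive phase: hidden probabilities given the data.
        h0 = sigmoid(v0 @ W + b_hid)
        # One Gibbs step: sample hiddens, reconstruct visibles, re-infer hiddens.
        h_samp = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h_samp @ W.T + b_vis)
        h1 = sigmoid(v1 @ W + b_hid)
        # Contrastive-divergence gradient estimates (positive minus negative phase).
        dW = (v0.T @ h0 - v1.T @ h1) / len(v0)
        db_vis = (v0 - v1).mean(axis=0)
        db_hid = (h0 - h1).mean(axis=0)
        # Sparsity penalty: push each unit's mean activation toward p_target.
        db_hid += lam * (p_target - h0.mean(axis=0))
        return W + lr * dW, b_vis + lr * db_vis, b_hid + lr * db_hid

    # Toy usage: shapes only; real training would use whitened image patches.
    rng = np.random.default_rng(0)
    W = 0.01 * rng.standard_normal((64, 100))   # 8x8 patches, 100 hidden units
    b_vis, b_hid = np.zeros(64), np.zeros(100)
    data = (rng.random((500, 64)) < 0.5).astype(float)
    for _ in range(10):
        W, b_vis, b_hid = sparse_rbm_cd1_update(W, b_vis, b_hid, data, rng=rng)

Stacking a second such RBM on the first layer's activations, as deep belief networks do, is what lets the model encode the contour and corner conjunctions described above.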

[1] D. H. Hubel and T. N. Wiesel. Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 1968.

[2] R. L. De Valois, et al. The orientation and direction selectivity of cells in macaque visual cortex. Vision Research, 1982.

[3] J. B. Levitt, et al. Receptive fields and functional architecture of macaque V2. Journal of Neurophysiology, 1994.

[4] B. A. Olshausen and D. J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 1996.

[5] Y. Censor and S. A. Zenios. Parallel Optimization: Theory, Algorithms, and Applications. 1997.

[6] A. J. Bell and T. J. Sejnowski. The "independent components" of natural scenes are edge filters. Vision Research, 1997.

[7] J. H. van Hateren and A. van der Schaaf. Independent component filters of natural images compared with simple cells in primary visual cortex. Proceedings of the Royal Society of London B, 1998.

[8] J. Hegdé and D. C. Van Essen. Selectivity for Complex Shapes in Primate Visual Area V2. Journal of Neuroscience, 2000.

[9] A. Hyvärinen and P. O. Hoyer. Emergence of Phase- and Shift-Invariant Features by Decomposition of Natural Images into Independent Feature Subspaces. Neural Computation, 2000.

[10] A. Hyvärinen, et al. Topographic Independent Component Analysis. Neural Computation, 2001.

[11] L. Wiskott and T. J. Sejnowski. Slow Feature Analysis: Unsupervised Learning of Invariances. Neural Computation, 2002.

[12] G. E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, 2002.

[13] P. O. Hoyer and A. Hyvärinen. A multi-layer sparse coding network learns contour coding from natural images. Vision Research, 2002.

[14] G. M. Boynton and J. Hegdé. Visual Cortex: The Continuing Puzzle of Area V2. Current Biology, 2004.

[15] M. Ito and H. Komatsu. Representation of Angles Embedded within Contour Stimuli in Area V2 of Macaque Monkeys. Journal of Neuroscience, 2004.

[16] A. Hyvärinen, et al. Statistical model of natural stimuli predicts edge-like pooling of spatial frequency channels in V2. BMC Neuroscience, 2004.

[17] B. A. Olshausen and D. J. Field. How Close Are We to Understanding V1? Neural Computation, 2005.

[18] G. E. Hinton, et al. Learning Causally Linked Markov Random Fields. AISTATS, 2005.

[19] Y. Karklin and M. S. Lewicki. A Hierarchical Bayesian Model for Learning Nonlinear Statistical Regularities in Nonstationary Natural Signals. Neural Computation, 2005.

[20] G. E. Hinton and R. R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks. Science, 2006.

[21] G. E. Hinton, S. Osindero, and Y. W. Teh. A Fast Learning Algorithm for Deep Belief Nets. Neural Computation, 2006.

[22] M. Ranzato, et al. Efficient Learning of Sparse Representations with an Energy-Based Model. NIPS, 2006.

[23] D. Walther and C. Koch. Modeling attention to salient proto-objects. Neural Networks, 2006.

[24] S. Osindero, M. Welling, and G. E. Hinton. Topographic Product Models Applied to Natural Scene Statistics. Neural Computation, 2006.

[25] H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. NIPS, 2006.

[26] Y. Bengio, et al. Greedy Layer-Wise Training of Deep Networks. NIPS, 2007.

[27] R. Raina, et al. Self-taught learning: transfer learning from unlabeled data. ICML, 2007.

[28] H. Larochelle, et al. An empirical evaluation of deep architectures on problems with many factors of variation. ICML, 2007.