Paper information - Inference via sparse coding in a hierarchical vision model

Inference via sparse coding in a hierarchical vision model

Sparse coding has been incorporated in models of the visual cortex for its computational advantages and connection to biology, but how the level of sparsity contributes to performance on visual tasks is not well understood. In this work, sparse coding is integrated into an existing hierarchical V2 model (Hosoya and Hyvärinen, 2015), replacing its independent component analysis (ICA) with an explicit sparse coding step in which the degree of sparsity can be controlled. After training, the sparse coding basis functions learned at higher degrees of sparsity resembled qualitatively different structures, such as curves and corners. The models were assessed on image classification tasks associated with mid-level vision: figure-ground classification, texture classification, and angle prediction between two line stimuli. In addition, the models were compared against a texture sensitivity measure that has been reported in V2 (Freeman et al., 2013) and evaluated on a deleted-region inference task. The experiments show that although sparse coding performed worse than ICA at classifying images, only sparse coding was able to better match the texture sensitivity level of V2 and to infer deleted image regions, and it did both as the degree of sparsity was increased. Higher degrees of sparsity allowed for inference over larger deleted image regions. The mechanism that allows for this inference capability in sparse coding is described here.
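To make the two ideas in the abstract concrete (a controllable degree of sparsity, and inference over deleted image regions), here is a minimal sketch. It is not the paper's implementation. It assumes an L1-penalized sparse coding objective solved with ISTA, a standard solver chosen here for brevity; a random dictionary `D` stands in for learned basis functions; and the names `ista`, `lam`, and `mask` are hypothetical. The code is fit only to the visible pixels, and the deleted pixels are then read off the reconstruction `D @ a`.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t (proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def objective(D, x, a, lam, mask):
    """0.5 * ||mask * (x - D a)||^2 + lam * ||a||_1 (assumed objective)."""
    r = mask * (x - D @ a)
    return 0.5 * r @ r + lam * np.abs(a).sum()

def ista(D, x, lam, mask=None, n_iter=200):
    """Sparse code for x under dictionary D, fitting only pixels where mask == 1.

    lam is the sparsity knob in this sketch: larger values give sparser codes.
    """
    if mask is None:
        mask = np.ones_like(x)
    Dm = D * mask[:, None]           # zero out rows of deleted pixels
    xm = x * mask
    L = np.linalg.norm(Dm, 2) ** 2   # Lipschitz constant of the smooth term
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = Dm.T @ (Dm @ a - xm)  # gradient of the masked squared error
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy data: a random unit-norm dictionary (assumption, not learned bases).
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(128)
a_true[[3, 40, 90]] = [1.5, -2.0, 1.0]
x = D @ a_true

# "Delete" the first 16 pixels, infer a code from the rest, then fill in.
mask = np.ones(64)
mask[:16] = 0.0
a_hat = ista(D, x, lam=0.1, mask=mask)
x_filled = D @ a_hat                 # includes estimates for deleted pixels

# Count active coefficients at a few sparsity levels.
nonzeros = {lam: int(np.count_nonzero(ista(D, x, lam, mask)))
            for lam in (0.01, 0.1, 0.5)}
```

In this toy setup, `nonzeros` can be inspected to see how `lam` changes the number of active coefficients, and `x_filled[:16]` can be compared with `x[:16]` to judge the fill-in. How well this works depends on the dictionary and the size of the deleted region, and the sketch makes no claim about the paper's reported results.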

[1] Michael S. Lewicki, et al. Emergence of complex cell properties by learning to generalize in natural scenes, 2009, Nature.

[2] Aapo Hyvärinen, et al. Estimation of Non-Normalized Statistical Models by Score Matching, 2005, J. Mach. Learn. Res.

[3] Brendan J. Frey, et al. Winner-Take-All Autoencoders, 2014, NIPS.

[4] Franck Ruffier, et al. Sparse deep predictive coding captures contour integration capabilities of the early visual system, 2019, PLoS Comput. Biol.

[5] Patrik O. Hoyer, et al. Non-negative sparse coding, 2002, Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing.

[6] Bruno A. Olshausen, et al. Learning real and complex overcomplete representations from the statistics of natural images, 2009, Optical Engineering + Applications.

[7] Terrence J. Sejnowski, et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution, 1995, Neural Computation.

[8] Anitha Pasupathy, et al. 'Artiphysiology' reveals V4-like shape tuning in a deep network trained for image classification, 2018, eLife.

[9] James A. Roberts, et al. Manipulating the structure of natural scenes using wavelets to study the functional architecture of perceptual hierarchies in the brain, 2020, NeuroImage.

[10] Garrison W. Cottrell, et al. Efficient Visual Coding: From Retina To V2, 2013, ICLR.

[11] R. Gerchberg. A practical algorithm for the determination of phase from image and diffraction plane pictures, 1972.

[12] David J. Field, et al. What Is the Other 85 Percent of V1 Doing, 2006.

[13] Emily J. Allen, et al. A massive 7T fMRI dataset to bridge cognitive and computational neuroscience, 2021, bioRxiv.

[14] David J. Field, et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images, 1996, Nature.

[15] A. Norcia, et al. Representation of Maximally Regular Textures in Human Visual Cortex, 2016, The Journal of Neuroscience.

[16] F. Attneave. Some informational aspects of visual perception, 1954, Psychological review.

[17] Martin Schrimpf, et al. Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations, 2020, bioRxiv.

[18] R. von der Heydt, et al. Mechanisms of contour perception in monkey visual cortex. I. Lines of pattern discontinuity, 1989, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[19] Alejandro F. Bujan, et al. Learning Overcomplete, Low Coherence Dictionaries with Linear Inference, 2016, J. Mach. Learn. Res.

[20] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[21] Laurenz Wiskott, et al. Slow feature analysis yields a rich repertoire of complex cell properties, 2005, Journal of vision.

[22] B. Willmore, et al. Sparse coding in striate and extrastriate visual cortex, 2011, Journal of neurophysiology.

[23] Bruno A. Olshausen, et al. Sparse coding of sensory inputs, 2004, Current Opinion in Neurobiology.

[24] Jana Reinhard, et al. Textures A Photographic Album For Artists And Designers, 2016.

[25] David J. Field, et al. What Is the Goal of Sensory Coding?, 1994, Neural Computation.

[26] Soo-Chang Pei, et al. A Novel Image Recovery Algorithm for Visible Watermarked Images, 2006, IEEE Trans. Inf. Forensics Secur.

[27] Bruno A. Olshausen, et al. Principles of Image Representation in Visual Cortex, 2003.

[28] Li Zhaoping, et al. Border Ownership from Intracortical Interactions in Visual Area V2, 2005, Neuron.

[29] Eero P. Simoncelli, et al. Natural image statistics and neural representation, 2001, Annual review of neuroscience.

[30] A. Yuille, et al. Vision as Bayesian inference: analysis by synthesis?, 2006, Trends in Cognitive Sciences.

[31] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.

[32] Aapo Hyvärinen, et al. A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images, 2001, Vision Research.

[33] M. Farah, et al. A functional MRI study of mental image generation, 1997, Neuropsychologia.

[34] Eero P. Simoncelli, et al. A functional and perceptual signature of the second visual area in primates, 2013, Nature Neuroscience.

[35] Michael S. Lewicki, et al. A Hierarchical Bayesian Model for Learning Nonlinear Statistical Regularities in Nonstationary Natural Signals, 2005, Neural Computation.

[36] J. DiCarlo, et al. Using goal-driven deep learning models to understand sensory cortex, 2016, Nature Neuroscience.

[37] Michael C. Frank, et al. Unsupervised neural network models of the ventral visual stream, 2020, Proceedings of the National Academy of Sciences.

[38] Kenichi Ohki, et al. Natural images are reliably represented by sparse and variable populations of neurons in visual cortex, 2020, Nature Communications.

[39] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.

[40] Yu Luo, et al. Removing Rain from a Single Image via Discriminative Sparse Coding, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[41] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.

[42] Matthew D. Zeiler, et al. Learning Image Decompositions with Hierarchical Sparse Coding, 2010.

[43] Eero P. Simoncelli, et al. A Parametric Texture Model Based on Joint Statistics of Complex Wavelet Coefficients, 2000, International Journal of Computer Vision.

[44] Eero P. Simoncelli, et al. Selectivity and tolerance for visual texture in macaque V2, 2016, Proceedings of the National Academy of Sciences.

[45] Leon A. Gatys, et al. Deep convolutional models improve predictions of macaque V1 responses to natural images, 2019, PLoS Comput. Biol.

[46] Jitendra Malik, et al. Local figure-ground cues are valid for natural images, 2007, Journal of vision.

[47] Rajesh P. N. Rao, et al. Probabilistic Models of the Brain: Perception and Neural Function, 2002.

[48] Md Nasir Uddin Laskar, et al. Deep neural networks capture texture sensitivity in V2, 2020, Journal of vision.

[49] Lars Muckli, et al. A Self-Supervised Deep Neural Network for Image Completion Resembles Early Visual Cortex fMRI Activity Patterns for Occluded Scenes, 2020.

[50] H. C. Longuet-Higgins, et al. Non-Holographic Associative Memory, 1969, Nature.

[51] Yann LeCun, et al. Convolutional Matching Pursuit and Dictionary Training, 2010, ArXiv.

[52] Tatyana O. Sharpee, et al. Cross-orientation suppression in visual area V2, 2017, Nature Communications.

[53] Guillermo Sapiro, et al. Non-local sparse models for image restoration, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[54] Honglak Lee, et al. Sparse deep belief net model for visual area V2, 2007, NIPS.

[55] Richard G. Baraniuk, et al. Sparse Coding via Thresholding and Local Competition in Neural Circuits, 2008, Neural Computation.

[56] O. Schwartz, et al. The impact on midlevel vision of statistically optimal divisive normalization in V1, 2013, Journal of vision.

[57] Aapo Hyvärinen, et al. A Hierarchical Statistical Model of Natural Images Explains Tuning Properties in V2, 2015, The Journal of Neuroscience.

[58] Nikolaus Kriegeskorte, et al. Deep neural networks: a new framework for modelling biological vision and brain information processing, 2015, bioRxiv.

[59] Pentti Kanerva, et al. Sparse distributed memory and related models, 1993.

[60] W. Geisler. Visual perception and the statistical properties of natural scenes, 2008, Annual review of psychology.

[61] Joshua Bowren, et al. A Sparse Coding Interpretation of Neural Networks and Theoretical Implications, 2021, ArXiv.

[62] H. B. Barlow, et al. Possible Principles Underlying the Transformations of Sensory Messages, 2012.

[63] Odelia Schwartz, et al. Stimulus- and goal-oriented frameworks for understanding natural vision, 2018, Nature Neuroscience.

[64] Chengxu Zhuang, et al. Deep Learning Predicts Correlation between a Functional Signature of Higher Visual Areas and Sparse Firing of Neurons, 2017, Front. Comput. Neurosci.

[65] József Fiser, et al. No evidence for active sparsification in the visual cortex, 2009, NIPS.

[66] Minami Ito, et al. Representation of Angles Embedded within Contour Stimuli in Area V2 of Macaque Monkeys, 2004, The Journal of Neuroscience.

[67] R. von der Heydt, et al. Mechanisms of contour perception in monkey visual cortex. II. Contours bridging gaps, 1989, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[68] Elijah D. Christensen, et al. Using deep learning to probe the neural code for images in primary visual cortex, 2019, Journal of vision.

[69] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.

[70] D. Chakrabarti, et al. A fast fixed-point algorithm for independent component analysis, 1997.

[71] Soumith Chintala, et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015, ICLR.

[72] Bruno Galerne, et al. Random Phase Textures: Theory and Synthesis, 2011, IEEE Transactions on Image Processing.