Paper information - The power of connectivity: Identity preserving transformations on visual streams in the spike domain

We investigate neural architectures for identity preserving transformations (IPTs) on visual stimuli in the spike domain. The stimuli are encoded with a population of spiking neurons; the resulting spikes are processed and finally decoded. A number of IPTs are demonstrated, including faithful stimulus recovery as well as simple transformations of the original visual stimulus such as translations, rotations and dilations (zooming). We show that if the set of receptive fields satisfies certain symmetry properties, then IPTs can be realized easily and, in addition, the same basic stimulus decoding algorithm can be employed to recover the transformed input stimulus. Using group theoretic methods, we advance two different neural encoding architectures and discuss the realization of exact and approximate IPTs. These IPTs are realized in the spike domain processing block by a "switching matrix" that regulates the input/output connectivity between the stimulus encoding and decoding blocks. For example, for a particular connectivity setting of the switching matrix, the original stimulus is faithfully recovered. For other settings, translations, rotations and dilations (or combinations of these operations) of the original video stream are obtained. We evaluate our theoretical derivations through extensive simulations on natural video scenes, and discuss the implications of our results for the problem of invariant object recognition in the spike domain.
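The role of the switching matrix can be illustrated with a small numerical sketch. It is not the paper's implementation and rests on strong simplifying assumptions: the spiking neurons are replaced by plain linear measurements (inner products of the stimulus with each receptive field), the stimulus is a single frame on a small periodic grid, and only translations are treated. The names `build_receptive_fields`, `switching_matrix`, and the random `mother` kernel are hypothetical. Because the receptive fields are all circular translates of one kernel (the symmetry property), re-routing the encoder outputs with a permutation and applying the same unchanged decoder yields a translated stimulus.

```python
import numpy as np


def build_receptive_fields(mother, n):
    """Stack all circular translates of the mother kernel as rows of a matrix."""
    rows = []
    for dy in range(n):
        for dx in range(n):
            rows.append(np.roll(mother, (dy, dx), axis=(0, 1)).ravel())
    return np.array(rows)  # shape (n*n, n*n)


def switching_matrix(n, shift):
    """Permutation routing encoder channel (dy, dx) to decoder channel (dy+sy, dx+sx)."""
    sy, sx = shift
    P = np.zeros((n * n, n * n))
    for dy in range(n):
        for dx in range(n):
            src = dy * n + dx
            dst = ((dy + sy) % n) * n + (dx + sx) % n
            P[dst, src] = 1.0
    return P


rng = np.random.default_rng(0)
n = 8

# Assumption: a random mother kernel, chosen only so that the family of
# translates is invertible; it is not a model of a real receptive field.
mother = rng.standard_normal((n, n))
Phi = build_receptive_fields(mother, n)

# One fixed decoder, reused for every connectivity setting.
decoder = np.linalg.pinv(Phi)

u = rng.standard_normal((n, n))  # stand-in for one video frame
m = Phi @ u.ravel()              # linear stand-in for the encoded spike train

# Identity connectivity: the original stimulus is recovered.
u_identity = (decoder @ m).reshape(n, n)

# Shifted connectivity: the same decoder now outputs a translated stimulus.
shift = (2, 3)
u_shifted = (decoder @ (switching_matrix(n, shift) @ m)).reshape(n, n)

print(np.allclose(u_identity, u))
print(np.allclose(u_shifted, np.roll(u, shift, axis=(0, 1))))
```

In this sketch the transformation is performed entirely by the connectivity: the decoder is computed once and never modified. Rotations and dilations would require a receptive-field family that is symmetric under those operations, which this translation-only toy example does not attempt.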
