Paper information - Frontiers in Neuroinformatics

In this paper, we suggest that perception could be modeled by assuming that sensory input is generated by a hierarchy of attractors in a dynamic system. We describe a mathematical model which exploits the temporal structure of rapid sensory dynamics to track the slower trajectories of their underlying causes. This model establishes a proof of concept that slowly changing neuronal states can encode the trajectories of faster sensory signals. We link this hierarchical account to recent developments in the perception of human action; in particular artificial speech recognition. We argue that these hierarchical models of dynamical systems are a plausible starting point to develop robust recognition schemes, because they capture critical temporal dependencies induced by deep hierarchical structure. We conclude by suggesting that a fruitful computational neuroscience approach may emerge from modeling perception as non-autonomous recognition dynamics enslaved by autonomous hierarchical dynamics in the sensorium.

Karl J. Friston | Stefan J. Kiebel | Jean Daunizeau
