Neuromorphic detection of speech dynamics

Speech and voice technologies are undergoing a profound revision as new paradigms are sought to overcome specific problems that classical approaches cannot completely solve. Neuromorphic Speech Processing is an emerging area in which research turns to the natural neural processing of speech by the Human Auditory System in order to capture the basic mechanisms that solve difficult tasks efficiently. Building on previous work, this paper takes a further step in mimicking basic neural speech processing with simple neuromorphic units: it shows how formant dynamics, and hence consonantal features, can be detected using a general neuromorphic unit that mimics the functionality of certain neurons found in the upper auditory pathways. Using these simple building blocks, a General Speech Processing Architecture can be synthesized as a layered structure. Results from different simulation stages are provided, together with a discussion of implementation details. The conclusions and future work describe the functionality to be covered in the next research steps.
