Brain-to-speech decoding will require linguistic and pragmatic data

OBJECTIVE Advances in electrophysiological methods such as electrocorticography (ECoG) have enabled researchers to decode phonemes, syllables, and words from brain activity. The ultimate aspiration underlying these efforts is the development of a brain-machine interface (BMI) that will enable speakers to produce real-time, naturalistic speech. In the effort to create such a device, researchers have typically followed a bottom-up approach whereby low-level units of language (e.g. phonemes, syllables, or letters) are decoded from articulation areas (e.g. premotor cortex) with the aim of assembling these low-level units into words and sentences.

APPROACH In this paper, we recommend that researchers supplement the existing bottom-up approach with a novel top-down approach. According to the top-down proposal, initial decoding of top-down information may facilitate the subsequent decoding of downstream representations by constraining the hypothesis space from which low-level units are selected.

MAIN RESULTS We identify types and sources of top-down information that may crucially inform BMI decoding ecosystems: communicative intentions (e.g. speech acts), situational pragmatics (e.g. recurrent communicative pressures), and formal linguistic data (e.g. syntactic rules and constructions, lexical collocations, speakers' individual speech histories).

SIGNIFICANCE Given the inherently interactive nature of communication, we further propose that BMIs be entrained on neural responses associated with interactive dialogue tasks, as opposed to the typical practice of entraining BMIs with non-interactive presentations of language stimuli.
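One way to picture how top-down information could constrain the hypothesis space is as a prior that rescales bottom-up decoder scores. The sketch below is illustrative only and is not taken from the paper: the function name `rescore`, the candidate words, and all probability values are hypothetical, and a simple Bayes-rule combination is assumed as a stand-in for whatever integration scheme a real BMI decoder would use.

```python
from typing import Dict


def rescore(likelihood: Dict[str, float],
            prior: Dict[str, float],
            floor: float = 1e-6) -> Dict[str, float]:
    """Combine bottom-up likelihoods with a top-down prior (Bayes' rule).

    Candidates missing from the prior receive a small floor probability
    instead of being ruled out entirely.
    """
    unnormalized = {w: likelihood[w] * prior.get(w, floor) for w in likelihood}
    total = sum(unnormalized.values())
    return {w: score / total for w, score in unnormalized.items()}


# Hypothetical bottom-up scores: acoustically similar words are hard to
# separate from articulatory-level decoding alone.
bottom_up = {"pass": 0.40, "past": 0.35, "bass": 0.25}

# Hypothetical top-down prior, assuming a decoded communicative intention
# (a request, e.g. at a dinner table) makes "pass" more expected.
request_prior = {"pass": 0.70, "past": 0.10, "bass": 0.20}

posterior = rescore(bottom_up, request_prior)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

In this toy setting the prior does not replace the bottom-up evidence; it only reweights candidates, so a strongly supported low-level hypothesis could still win against an unexpected context.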
