Decoding spoken English phonemes from intracortical electrode arrays in dorsal precentral gyrus

Objective. To evaluate the potential of intracortical electrode array signals for brain-computer interfaces (BCIs) to restore lost speech, we measured the performance of classifiers trained to discriminate a comprehensive basis set for speech: 39 English phonemes. We classified neural correlates of spoken-out-loud words in the “hand knob” area of precentral gyrus, which we view as a step towards the eventual goal of decoding attempted speech from ventral speech areas in patients who are unable to speak.

Approach. Neural and audio data were recorded while two BrainGate2 pilot clinical trial participants, each with two chronically-implanted 96-electrode arrays, spoke 420 different words that broadly sampled English phonemes. Phoneme onsets were identified from audio recordings, and their identities were then classified from neural features consisting of each electrode’s binned action potential counts or high-frequency local field potential power. We also examined two potential confounds specific to decoding overt speech: acoustic contamination of neural signals and systematic differences in labeling different phonemes’ onset times.

Main results. A linear decoder achieved up to 29.3% classification accuracy (chance = 6%) across 39 phonemes, while a recurrent neural network classifier achieved 33.9% accuracy. Parameter sweeps indicated that performance did not saturate when adding more electrodes or more training data, and that accuracy improved when utilizing time-varying structure in the data. Microphonic contamination and phoneme onset differences modestly increased decoding accuracy, but could be mitigated by acoustic artifact subtraction and using a neural speech onset marker, respectively.

Significance. The ability to decode a comprehensive set of phonemes using intracortical electrode array signals from a nontraditional speech area suggests that placing electrode arrays in ventral speech areas is a promising direction for speech BCIs.
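To make the classification setup concrete, the following is a minimal sketch of a linear decoder applied to binned spike counts. It is not the authors' pipeline: the data are synthetic Poisson counts, the decoder is a ridge-regularized least-squares classifier chosen only for brevity, and every function name, firing-rate range, and trial count is a hypothetical assumption. Only the 39 classes and the 192 electrodes (two 96-electrode arrays) come from the abstract. Chance is estimated here as the accuracy of always predicting the most frequent training class, which is one common convention and may differ from the one used in the study.

```python
import numpy as np


def simulate_counts(n_classes=39, n_electrodes=192, trials_per_class=20, seed=0):
    """Synthetic stand-in for binned spike counts (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    # Each class gets its own mean firing rate per electrode (hypothetical range).
    rates = rng.uniform(2.0, 8.0, size=(n_classes, n_electrodes))
    labels = np.repeat(np.arange(n_classes), trials_per_class)
    counts = rng.poisson(rates[labels]).astype(float)
    return counts, labels


def fit_linear_decoder(X, y, n_classes, ridge=1.0):
    """Ridge-regularized least squares onto one-hot class targets."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    Y = np.eye(n_classes)[y]                       # one-hot targets
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)


def predict(W, X):
    """Pick the class whose linear score is largest."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.argmax(Xb @ W, axis=1)


def evaluate(seed=0, train_fraction=0.8):
    """Return (held-out accuracy, majority-class chance level)."""
    X, y = simulate_counts(seed=seed)
    rng = np.random.default_rng(seed + 1)
    order = rng.permutation(len(y))
    n_train = int(train_fraction * len(y))
    train, test = order[:n_train], order[n_train:]
    n_classes = int(y.max()) + 1
    W = fit_linear_decoder(X[train], y[train], n_classes)
    accuracy = float(np.mean(predict(W, X[test]) == y[test]))
    # Chance: always guess the most frequent class seen in training.
    majority = np.bincount(y[train], minlength=n_classes).argmax()
    chance = float(np.mean(y[test] == majority))
    return accuracy, chance


if __name__ == "__main__":
    acc, chance = evaluate()
    print(f"held-out accuracy: {acc:.3f}, majority-class chance: {chance:.3f}")
```

The synthetic classes are far easier to separate than real neural data, so the accuracy printed by this sketch says nothing about the figures reported above; it only illustrates the shape of the train, predict, and compare-to-chance loop.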
