High-performance brain-to-text communication via imagined handwriting

Brain-computer interfaces (BCIs) can restore communication to people who have lost the ability to move or speak. To date, a major focus of BCI research has been on restoring gross motor skills, such as reaching and grasping [1–5] or point-and-click typing with a 2D computer cursor [6,7]. However, rapid sequences of highly dexterous behaviors, such as handwriting or touch typing, might enable faster communication rates. Here, we demonstrate an intracortical BCI that can decode imagined handwriting movements from neural activity in motor cortex and translate it to text in real time, using a novel recurrent neural network decoding approach. With this BCI, our study participant (whose hand was paralyzed) achieved typing speeds that exceed those of any other BCI yet reported: 90 characters per minute at >99% accuracy with a general-purpose autocorrect. These speeds are comparable to able-bodied smartphone typing speeds in our participant’s age group (115 characters per minute) [8] and significantly close the gap between BCI-enabled typing and able-bodied typing rates. Finally, new theoretical considerations explain why temporally complex movements, such as handwriting, may be fundamentally easier to decode than point-to-point movements. Our results open a new approach for BCIs and demonstrate the feasibility of accurately decoding rapid, dexterous movements years after paralysis.

[1] Hosein Hashemi, et al. Fuzzy Clustering of Seismic Sequences: Segmentation of Time-Frequency Representations, 2012, IEEE Signal Processing Magazine.

[2] R. Andersen, et al. Decoding motor imagery from the posterior parietal cortex of a tetraplegic human, 2015, Science.

[3] Robert D Flint, et al. Direct classification of all American English phonemes using signals from functional speech motor cortex, 2014, Journal of neural engineering.

[4] A. Schwartz, et al. High-performance neuroprosthetic control by an individual with tetraplegia, 2013, The Lancet.

[5] J. Wolpaw, et al. A novel P300-based brain–computer interface stimulus presentation paradigm: Moving beyond rows and columns, 2010, Clinical Neurophysiology.

[6] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.

[7] John Tran, et al. cuDNN: Efficient Primitives for Deep Learning, 2014, ArXiv.

[8] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.

[9] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.

[10] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.

[11] Mehryar Mohri, et al. Speech Recognition with Weighted Finite-State Transducers, 2008.

[12] Joseph G. Makin, et al. Machine translation of cortical activity to text with an encoder–decoder framework, 2020, Nature Neuroscience.

[13] Nicolas Y. Masse, et al. Neural Point-and-Click Communication by a Person With Incomplete Locked-In Syndrome, 2015, Neurorehabilitation and neural repair.

[14] Vikash Gilja, et al. Comparison of spike sorting and thresholding of voltage waveforms for intracortical brain–machine interface performance, 2015, Journal of neural engineering.

[15] Edward F. Chang, et al. Speech synthesis from neural decoding of spoken sentences, 2019, Nature.

[16] Tara N. Sainath, et al. Streaming End-to-end Speech Recognition for Mobile Devices, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[17] Steve Young, et al. The HTK book version 3.4, 2006.

[18] Gabriel Synnaeve, et al. Wav2Letter: an End-to-End ConvNet-based Speech Recognition System, 2016, ArXiv.

[19] Tzyy-Ping Jung, et al. High-speed spelling with a noninvasive brain–computer interface, 2015, Proceedings of the National Academy of Sciences.

[20] David Miller, et al. The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text, 2004, LREC.

[21] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.

[22] Nicholas V. Annetta, et al. Restoring cortical control of functional movement in a human with quadriplegia, 2016, Nature.

[23] Nicolas Y. Masse, et al. Virtual typing by people with tetraplegia using a self-calibrating intracortical brain-computer interface, 2015, Science Translational Medicine.

[24] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[25] J. Wolpaw, et al. A P300-based brain–computer interface for people with amyotrophic lateral sclerosis, 2008, Clinical Neurophysiology.

[26] Jacob Benesty, et al. Springer handbook of speech processing, 2007, Springer Handbooks.

[27] Surya Ganguli, et al. Accurate Estimation of Neural Population Dynamics without Spike Sorting, 2017, Neuron.

[28] N. Ramsey, et al. Fully Implanted Brain-Computer Interface in a Locked-In Patient with ALS, 2016, The New England journal of medicine.

[29] Vikash Gilja, et al. Toward optimal target placement for neural prosthetic devices, 2008, Journal of neurophysiology.

[30] Byron M. Yu, et al. Stabilization of a brain–computer interface via the alignment of low-dimensional spaces of neural activity, 2020, Nature Biomedical Engineering.

[31] Per Ola Kristensson, et al. How do People Type on Mobile Devices?: Observations from a Study with 37,000 Volunteers, 2019, MobileHCI.

[32] Jon A. Mukand, et al. Neuronal ensemble control of prosthetic devices by a human with tetraplegia, 2006, Nature.

[33] H. Alkadhi, et al. Localization of the motor hand area to a knob on the precentral gyrus. A new landmark, 1997, Brain: a journal of neurology.

[34] Surya Ganguli, et al. Time-warped PCA: simultaneous alignment and dimensionality reduction of neural data, 2016.

[35] Jonathan R Wolpaw, et al. Independent home use of a brain-computer interface by people with amyotrophic lateral sclerosis, 2018, Neurology.

[36] Surya Ganguli, et al. A theory of multineuronal dimensionality, dynamics and measurement, 2017, bioRxiv.

[37] Nicolas Y. Masse, et al. Reach and grasp by people with tetraplegia using a neurally controlled robotic arm, 2012, Nature.

[38] Steven M Chase, et al. Intracortical recording stability in human brain–computer interface users, 2018, Journal of neural engineering.

[39] Francis R. Willett, et al. High performance communication by people with paralysis using an intracortical brain-computer interface, 2017, eLife.

[40] J. Wolpaw, et al. P300-based brain-computer interface (BCI) event-related potentials (ERPs): People with amyotrophic lateral sclerosis (ALS) vs. age-matched controls, 2015, Clinical Neurophysiology.

[41] Surya Ganguli, et al. Discovering Precise Temporal Patterns in Large-Scale Neural Recordings through Robust and Interpretable Time Warping, 2020, Neuron.

[42] Francis R. Willett, et al. Hand Knob Area of Premotor Cortex Represents the Whole Body in a Compositional Way, 2020, Cell.

[43] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.

[44] Hermann Ney, et al. A comprehensive study of deep bidirectional LSTM RNNS for acoustic modeling in speech recognition, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[45] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[46] Francis R. Willett, et al. Restoration of reaching and grasping in a person with tetraplegia through brain-controlled muscle stimulation: a proof-of-concept demonstration, 2017, The Lancet.

[47] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[48] David Sussillo, et al. Making brain–machine interfaces robust to future neural variability, 2016, Nature communications.