Direct classification of all American English phonemes using signals from functional speech motor cortex

OBJECTIVE Although brain-computer interfaces (BCIs) can be used in several different ways to restore communication, communicative BCIs have not approached the rate or efficiency of natural human speech. Electrocorticography (ECoG) has precise spatiotemporal resolution that enables recording of brain activity distributed over a wide area of cortex, such as during speech production. In this study, we sought to decode elements of speech production using ECoG.

APPROACH We recorded ECoG from four subjects while they spoke words that together contain the entire set of phonemes in the General American accent. Using a linear classifier, we evaluated the degree to which individual phonemes within each word could be correctly identified from the cortical signal.

MAIN RESULTS We classified phonemes with up to 36% accuracy when classifying all phonemes and up to 63% accuracy for a single phoneme. Further, misclassified phonemes followed the articulatory organization described in the phonology literature, which aided classification of whole words. Precise temporal alignment to phoneme onset was crucial for classification success.

SIGNIFICANCE We identified specific spatiotemporal features that aid classification, which could guide future applications. Word identification was equivalent to information transfer rates as high as 3.0 bits/s (33.6 words/min), supporting pursuit of speech articulation for BCI control.
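The two rates quoted above can be related with simple arithmetic: 3.0 bits/s at 33.6 words/min corresponds to roughly 5.4 bits per identified word. The sketch below shows that conversion, alongside the Wolpaw-style formula commonly used in the BCI literature to estimate bits per selection from class count and accuracy. Whether the study computed its rate this way is an assumption; the abstract does not say, and the function names here are hypothetical.

```python
import math


def wolpaw_bits_per_selection(n_classes: int, accuracy: float) -> float:
    """Bits conveyed per selection under the Wolpaw-style estimate.

    Assumes equiprobable classes and errors spread evenly over the
    wrong classes (an assumption of the formula, not of the study).
    """
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")

    bits = math.log2(n_classes)
    # Terms with a zero probability contribute nothing (p * log2(p) -> 0).
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        error = 1.0 - accuracy
        bits += error * math.log2(error / (n_classes - 1))
    return bits


def bits_per_word(bits_per_second: float, words_per_minute: float) -> float:
    """Convert a bit rate and a word rate into bits per word."""
    words_per_second = words_per_minute / 60.0
    return bits_per_second / words_per_second


if __name__ == "__main__":
    # Figures taken from the abstract: 3.0 bits/s and 33.6 words/min.
    print(round(bits_per_word(3.0, 33.6), 2))
```

Under this estimate, chance-level accuracy (for example 25% over four classes) yields zero bits per selection, so the rate depends on how far accuracy exceeds chance and not on raw accuracy alone.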
