Decoding of Chinese phoneme clusters using ECoG

Human speech uses a finite set of phonetic units, but how the brain recognizes these units in a continuous speech stream is still largely unknown. Revealing this neural mechanism may lead to new types of speech brain-computer interfaces (BCIs) and computer speech recognition systems. In this study, we used electrocorticography (ECoG) signals recorded from the human cortex to decode phonetic units during the perception of continuous speech. By exploring wavelet time-frequency features, we identified ECoG electrodes that respond selectively to specific Chinese phonemes. The gamma and high-gamma power of these electrodes was then combined to separate sets of phonemes into clusters. The clustered organization largely coincided with phonological categories defined by place of articulation and manner of articulation. These findings were incorporated into a decoding framework for Chinese phoneme clusters. Using a support vector machine (SVM) classifier, we achieved accuracies consistently higher than chance level across five patients when discriminating specific phonetic clusters, which suggests a promising direction for implementing a speech BCI.
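As an illustration of the kind of feature described above, the following is a minimal sketch of gamma and high-gamma band power computed by complex Morlet wavelet convolution on a synthetic single-channel signal. It is not the study's actual pipeline: the function name `morlet_band_power`, the band edges (30 to 70 Hz and 70 to 110 Hz), the 1000 Hz sampling rate, and the seven-cycle wavelet are all assumptions chosen for the example.

```python
import numpy as np

def morlet_band_power(signal, fs, freqs, n_cycles=7.0):
    """Power over time, averaged across `freqs`, from Morlet wavelet convolution.

    Hypothetical helper for illustration; assumes the signal is longer
    than the longest wavelet so the output keeps the signal's length.
    """
    signal = np.asarray(signal, dtype=float)
    power = np.zeros((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2.0 * np.pi * f)        # s.d. of Gaussian envelope (s)
        half = int(np.ceil(4.0 * sigma_t * fs))       # support of +/- 4 s.d.
        t = np.arange(-half, half + 1) / fs           # symmetric time axis
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # normalize to unit energy
        analytic = np.convolve(signal, wavelet, mode="same")
        power[i] = np.abs(analytic) ** 2
    return power.mean(axis=0)

# Synthetic demo: weak white noise plus an 80 Hz burst in the second half.
fs = 1000.0                                   # assumed sampling rate (Hz)
time = np.arange(0.0, 2.0, 1.0 / fs)
rng = np.random.default_rng(0)
ecog = 0.1 * rng.standard_normal(time.size)
ecog[time >= 1.0] += np.sin(2.0 * np.pi * 80.0 * time[time >= 1.0])

gamma = morlet_band_power(ecog, fs, np.arange(30.0, 70.0, 10.0))         # assumed band
high_gamma = morlet_band_power(ecog, fs, np.arange(70.0, 111.0, 10.0))   # assumed band

# Compare high-gamma power before and during the burst, away from the edges.
baseline = high_gamma[200:800].mean()
burst = high_gamma[1200:1800].mean()
print(burst > baseline)
```

In a decoding setting, per-electrode band-power values of this kind, averaged over a window around each phoneme, could form the feature vector passed to a classifier such as an SVM.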
