Classifying phonological categories in imagined and articulated speech

This paper presents a new dataset combining three modalities (EEG, facial, and audio) recorded during imagined and vocalized phonemic and single-word prompts. We pre-process the EEG data, compute features for all three modalities, and perform binary classification of phonological categories using a combination of these modalities. For example, a deep-belief network obtains accuracies over 90% on identifying consonants, which is significantly more accurate than two baseline support vector machines. We also classify among the different states of each recording (resting, stimulus, active thinking), achieving accuracies of 95%. These data may be used to learn multimodal relationships and to develop silent-speech and brain-computer interfaces.

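The evaluation described above can be illustrated with a minimal sketch. The snippet below assumes pre-computed per-trial feature vectors and binary labels; it is not the paper's pipeline. Since scikit-learn provides no deep-belief network, `MLPClassifier` stands in for the DBN, and two `SVC` models stand in for the SVM baselines; the data shapes and hyperparameters are placeholders.

```python
# Hypothetical sketch: binary phonological classification with a neural
# network (DBN stand-in) against two SVM baselines, via cross-validation.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials, n_features = 200, 64                     # placeholder sizes
X = rng.standard_normal((n_trials, n_features))    # stand-in multimodal features
y = rng.integers(0, 2, n_trials)                   # binary label, e.g. consonant vs. not

classifiers = {
    "mlp (DBN stand-in)": MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=500),
    "svm-linear": SVC(kernel="linear"),
    "svm-rbf": SVC(kernel="rbf"),
}
for name, clf in classifiers.items():
    # Standardize features, then score each classifier with 5-fold CV.
    pipe = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

With real EEG, facial, and audio features in place of the random arrays, the same cross-validated comparison would reproduce the structure of the reported DBN-versus-SVM evaluation, though not its specific feature extraction or network training.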