Decoding Imagined, Heard, and Spoken Speech: Classification and Regression of EEG Using a 14-Channel Dry-Contact Mobile Headset

We investigate the use of a 14-channel mobile EEG device for decoding heard, imagined, and articulated English phones from brainwave data. To this end, we introduce a dataset that fills a current gap in open-access EEG datasets for speech processing: data recorded with lightweight, affordable EEG devices made for the consumer market. We evaluate two classification models, as well as a regression model that reconstructs spectral features of the original speech signal. Our classification performance is almost on a par with comparable results obtained from EEG data collected with research-grade devices. We conclude that commercial-grade devices can be used as speech-decoding BCIs with minimal signal processing.
