Decoding Imagined and Spoken Phrases From Non-invasive Neural (MEG) Signals

Speech production is a hierarchical process in which the brain and the oral articulators act in synchrony to transform intended linguistic concepts into meaningful sounds. Individuals with locked-in syndrome (fully paralyzed but aware) lose all motor ability, including articulation and even eye movement. For these patients, neural signals may be the only remaining pathway for restoring some level of communication. Current brain-computer interfaces (BCIs) rely on patients' visual and attentional correlates to establish communication, which results in a slow communication rate (a few words per minute). Decoding imagined speech directly from neural signals (and then driving a speech synthesizer) has the potential to achieve a higher communication rate. In this study, we investigated the decoding of five imagined and spoken phrases from single-trial, non-invasive magnetoencephalography (MEG) signals collected from eight adult subjects. Two machine learning approaches were used. The baseline was an artificial neural network (ANN) trained on statistical features. The other used convolutional neural networks (CNNs) applied to spatial, spectral, and temporal features extracted from the MEG signals. Experimental results indicated that imagined and spoken phrases can be decoded directly from neuromagnetic signals. CNNs were highly effective, with average decoding accuracies of up to 93% for imagined phrases and 96% for spoken phrases.
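
To illustrate the kind of input the baseline ANN works with, the following minimal sketch computes simple per-sensor statistical features from one MEG trial. The abstract does not say which statistics were used, so the four chosen here (mean, standard deviation, root mean square, peak absolute amplitude) are assumptions, as are the function name `statistical_features` and the trial shape of 204 sensors by 1000 samples.

```python
import numpy as np


def statistical_features(trial: np.ndarray) -> np.ndarray:
    """Map one trial of shape (n_sensors, n_samples) to a 1-D feature vector.

    The feature set is an illustrative assumption, not the study's own.
    """
    mean = trial.mean(axis=1)                   # average amplitude per sensor
    std = trial.std(axis=1)                     # variability per sensor
    rms = np.sqrt((trial ** 2).mean(axis=1))    # root mean square per sensor
    peak = np.abs(trial).max(axis=1)            # largest absolute amplitude
    # Concatenate the four statistics sensor-block by sensor-block.
    return np.concatenate([mean, std, rms, peak])


# Hypothetical single trial: 204 sensors, 1000 time samples of random data.
rng = np.random.default_rng(0)
trial = rng.standard_normal((204, 1000))
features = statistical_features(trial)
print(features.shape)  # (816,) = 204 sensors x 4 statistics
```

A vector of this form, one per trial, could then be passed to any five-class classifier; the CNN approach instead operates on richer spatial, spectral, and temporal representations of the same signals.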
