Real-time voice activity detection for ECoG-based speech brain machine interfaces

In this article, we investigate the performance of a real-time voice activity detection module that uses different time-frequency methods to extract signal features, in a subject with implanted electrocorticographic (ECoG) electrodes. The ECoG signals were recorded while the subject performed a syllable repetition task. The module takes ECoG data streams as input and performs feature extraction and classification on them. With this approach we were able to detect voice activity (speech onset and offset) from ECoG signals with high accuracy. The results show that different time-frequency representations carry complementary information about voice activity; the S-transform achieved 92% accuracy using the 86 best features with a support vector machine as the classifier. The proposed real-time voice activity detector may be used as part of an automated natural speech brain-machine interface (BMI) system for rehabilitating individuals with communication deficits.
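To make the feature-extraction step more concrete, the following is a minimal sketch of a discrete S-transform (Stockwell transform) computed with the standard frequency-domain formulation, followed by a simple per-frequency magnitude feature. It is an illustration under stated assumptions, not the authors' implementation: the function names `stockwell_transform` and `frame_features`, the test signal, and the choice of mean magnitude as the feature are all hypothetical, and the abstract does not specify the windowing, channel selection, or feature ranking actually used. The classification stage (a support vector machine in the abstract) is omitted.

```python
import numpy as np


def stockwell_transform(x):
    """Discrete S-transform of a 1-D real signal (illustrative sketch).

    Returns a complex array of shape (N // 2 + 1, N): rows are frequency
    indices 0..N/2, columns are time samples.
    """
    x = np.asarray(x, dtype=float)
    n_samples = len(x)
    spectrum = np.fft.fft(x)
    # Integer frequency offsets in FFT order (0, 1, ..., -2, -1), so the
    # Gaussian window below wraps around correctly.
    m = np.fft.fftfreq(n_samples) * n_samples

    s = np.zeros((n_samples // 2 + 1, n_samples), dtype=complex)
    # The zero-frequency row is defined as the signal mean.
    s[0, :] = x.mean()
    for n in range(1, n_samples // 2 + 1):
        # Frequency-dependent Gaussian: wider in frequency (narrower in
        # time) as the analysed frequency n increases.
        gauss = np.exp(-2.0 * np.pi ** 2 * m ** 2 / n ** 2)
        # Shift the spectrum so that frequency n sits at index 0, apply
        # the window, and return to the time domain.
        s[n, :] = np.fft.ifft(np.roll(spectrum, -n) * gauss)
    return s


def frame_features(frame):
    """Hypothetical feature vector: mean S-transform magnitude per frequency."""
    return np.abs(stockwell_transform(frame)).mean(axis=1)


if __name__ == "__main__":
    # Synthetic frame: a sinusoid at frequency index 8 in a 64-sample window.
    t = np.arange(64)
    frame = np.cos(2 * np.pi * 8 * t / 64)
    features = frame_features(frame)
    print(features.shape, int(features.argmax()))
```

In a streaming setting, a function like `frame_features` would be applied to successive windows of each ECoG channel, and the resulting vectors passed to a feature-selection step and a classifier. The window length and overlap are design choices that the abstract does not state.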
