Multi-Task Learning of Speech Recognition and Speech Synthesis Parameters for Ultrasound-based Silent Speech Interfaces
Gábor Gosztolya | László Tóth | Tamás Gábor Csapó | Tamás Grósz | Alexandra Markó
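The title names the paper's core idea: a single deep network trained on ultrasound tongue-image features to predict both speech-recognition targets (frame-level phone labels) and speech-synthesis parameters (e.g., spectral features for an MLSA-style vocoder, cf. refs [29] and [31] below). The following is a minimal, hypothetical sketch of such a multi-task set-up, not the authors' code; the layer widths, feature dimensions, and loss weight alpha are illustrative assumptions.

```python
# Minimal sketch of a multi-task DNN for an ultrasound-based silent speech
# interface: a shared trunk feeds two heads, one for phone classification
# (recognition) and one for vocoder-parameter regression (synthesis).
# All sizes below are assumptions for illustration, not values from the paper.
import torch
import torch.nn as nn

class MultiTaskSSINet(nn.Module):
    def __init__(self, n_inputs=820, n_phones=61, n_synth_params=25, hidden=1000):
        super().__init__()
        # Shared hidden layers: learn an articulatory representation
        # that both tasks can exploit.
        self.shared = nn.Sequential(
            nn.Linear(n_inputs, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Task 1: frame-level phone classification (recognition targets).
        self.phone_head = nn.Linear(hidden, n_phones)
        # Task 2: regression of spectral parameters for a vocoder,
        # e.g. mel-generalized cepstral features driving an MLSA filter.
        self.synth_head = nn.Linear(hidden, n_synth_params)

    def forward(self, x):
        h = self.shared(x)
        return self.phone_head(h), self.synth_head(h)

def multitask_loss(phone_logits, synth_pred, phone_targets, synth_targets, alpha=0.5):
    # Weighted sum of the two task losses; alpha is a hypothetical
    # trade-off hyperparameter balancing recognition against synthesis.
    ce = nn.functional.cross_entropy(phone_logits, phone_targets)
    mse = nn.functional.mse_loss(synth_pred, synth_targets)
    return alpha * ce + (1.0 - alpha) * mse
```

Sharing the hidden layers between the classification and regression heads is the multi-task regularization effect explored in refs [11], [20], and [32] below.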
[1] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[2] Gábor Gosztolya, et al. F0 Estimation for DNN-Based Ultrasound Silent Speech Interfaces, 2018, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Tibor Fegyó, et al. Improved Recognition of Spontaneous Hungarian Speech—Morphological and Acoustic Modeling Techniques for a Less Resourced Task, 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Phil D. Green, et al. Direct Speech Reconstruction From Articulatory Sensor Data by Machine Learning, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[5] Tara N. Sainath, et al. Deep Neural Network Language Models, 2012, WLM@NAACL-HLT.
[6] Myungjong Kim, et al. Speaker-Independent Silent Speech Recognition From Flesh-Point Articulatory Movements Using an LSTM Neural Network, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Thomas Hueber, et al. Feature extraction using multimodal convolutional neural networks for visual speech recognition, 2017, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] James T. Heaton, et al. Silent Speech Recognition as an Alternative Communication Device for Persons With Laryngectomy, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[9] Tanja Schultz, et al. Estimation of fundamental frequency from surface electromyographic data: EMG-to-F0, 2011, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] J. M. Gilbert, et al. Development of a (silent) speech recognition system for patients following laryngectomy, 2008, Medical Engineering & Physics.
[11] Rich Caruana. Multitask Learning, 1997, Machine Learning.
[12] J. M. Gilbert, et al. Silent speech interfaces, 2010, Speech Communication.
[13] Gérard Chollet, et al. Towards a Practical Silent Speech Interface Based on Vocal Tract Imaging, 2011.
[14] Tamás Gábor Csapó, et al. Convolutional neural network-based automatic classification of midsagittal tongue gestural targets using B-mode ultrasound images, 2017, The Journal of the Acoustical Society of America.
[15] Jasha Droppo, et al. Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning, 2015, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] António J. S. Teixeira, et al. Enhancing multimodal silent speech interfaces with feature selection, 2014, INTERSPEECH.
[17] Laurent Girin, et al. Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces, 2016, PLoS Computational Biology.
[18] Pierre Roussel-Ragot, et al. An Articulatory-Based Singing Voice Synthesis Using Tongue and Lips Imaging, 2016, INTERSPEECH.
[19] Tanja Schultz, et al. Direct conversion from facial myoelectric signals to speech using Deep Neural Networks, 2015, International Joint Conference on Neural Networks (IJCNN).
[20] Jasha Droppo, et al. Multi-task learning in deep neural networks for improved phoneme recognition, 2013, IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Heiga Zen, et al. Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends, 2015, IEEE Signal Processing Magazine.
[22] Matthias Janke, et al. EMG-to-Speech: Direct Generation of Speech From Facial Electromyographic Signals, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[23] Gábor Gosztolya, et al. DNN-Based Ultrasound-to-Speech Conversion for a Silent Speech Interface, 2017, INTERSPEECH.
[24] Peter Bell, et al. Regularization of context-dependent deep neural networks with context-independent multi-task training, 2015, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Gérard Chollet, et al. Statistical Mapping Between Articulatory and Acoustic Data for an Ultrasound-Based Silent Speech Interface, 2011, INTERSPEECH.
[26] Gérard Chollet, et al. Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips, 2010, Speech Communication.
[27] Li-Rong Dai, et al. Articulatory-to-Acoustic Conversion with Cascaded Prediction of Spectral and Excitation Features Using Neural Networks, 2016, INTERSPEECH.
[28] Jun Wang, et al. Preliminary Test of a Real-Time, Interactive Silent Speech Interface Based on Electromagnetic Articulograph, 2014, SLPAT@ACL.
[29] Keiichi Tokuda, et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation, 1994, ICSLP.
[30] Bruce Denby, et al. Speech synthesis from real time ultrasound images of the tongue, 2004, IEEE International Conference on Acoustics, Speech, and Signal Processing.
[31] S. Imai, et al. Mel Log Spectrum Approximation (MLSA) filter for speech synthesis, 1983.
[32] Simon King, et al. Deep neural networks employing Multi-Task Learning and stacked bottleneck features for speech synthesis, 2015, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Jun Wang, et al. Sentence recognition from articulatory movements for silent speech interfaces, 2012, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] James T. Heaton, et al. Towards a practical silent speech recognition system, 2014, INTERSPEECH.
[35] Jianwu Dang, et al. Prediction of F0 Based on Articulatory Features Using DNN, 2017, ISSP.
[36] Tanja Schultz, et al. Biosignal-Based Spoken Communication: A Survey, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.