Autoencoder-Based Articulatory-to-Acoustic Mapping for Ultrasound Silent Speech Interfaces
Gábor Gosztolya | László Tóth | Tamás Gábor Csapó | Tamás Grósz | Alexandra Markó | Ádám Pintér
[1] James T. Heaton, et al. Towards a practical silent speech recognition system, 2014, INTERSPEECH.
[2] James T. Heaton, et al. Silent Speech Recognition as an Alternative Communication Device for Persons With Laryngectomy, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[3] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[4] Tamás Gábor Csapó, et al. Error analysis of extracted tongue contours from 2D ultrasound images, 2015, INTERSPEECH.
[5] Tanja Schultz, et al. Session-Independent Array-Based EMG-to-Speech Conversion using Convolutional Neural Networks, 2018, ITG Symposium on Speech Communication.
[6] László Varga, et al. Information Content of Projections and Reconstruction of Objects in Discrete Tomography, 2013.
[7] John G. Harris, et al. A sawtooth waveform inspired pitch estimator for speech and music, 2008, The Journal of the Acoustical Society of America.
[8] Tara N. Sainath, et al. Deep Neural Network Language Models, 2012, WLM@NAACL-HLT.
[9] Bruce Denby, et al. Speech synthesis from real time ultrasound images of the tongue, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[10] Gérard Chollet, et al. Towards a Practical Silent Speech Interface Based on Vocal Tract Imaging, 2011.
[11] Pierre Roussel-Ragot, et al. An Articulatory-Based Singing Voice Synthesis Using Tongue and Lips Imaging, 2016, INTERSPEECH.
[12] Tokihiko Kaburagi, et al. Articulatory-to-speech Conversion Using Bi-directional Long Short-term Memory, 2018, INTERSPEECH.
[13] Charles A. Sutton, et al. Scheduled denoising autoencoders, 2015, ICLR.
[14] G. Widmer, et al. Learning Transformations of Musical Material using Gated Autoencoders, 2017.
[15] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[16] Gábor Gosztolya, et al. F0 Estimation for DNN-Based Ultrasound Silent Speech Interfaces, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Phil D. Green, et al. Direct Speech Reconstruction From Articulatory Sensor Data by Machine Learning, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[18] Nenghai Yu, et al. StyleBank: An Explicit Representation for Neural Image Style Transfer, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Quoc V. Le, et al. Searching for Activation Functions, 2018, ArXiv.
[20] Laurent Girin, et al. Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces, 2016, PLoS Computational Biology.
[21] Naftali Tishby, et al. Deep learning and the information bottleneck principle, 2015, 2015 IEEE Information Theory Workshop (ITW).
[22] S. Imai, et al. Mel Log Spectrum Approximation (MLSA) filter for speech synthesis, 1983.
[23] Tamás Gábor Csapó, et al. Convolutional neural network-based automatic classification of midsagittal tongue gestural targets using B-mode ultrasound images, 2017, The Journal of the Acoustical Society of America.
[24] Gérard Chollet, et al. Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips, 2010, Speech Communication.
[25] Li-Rong Dai, et al. Articulatory-to-Acoustic Conversion with Cascaded Prediction of Spectral and Excitation Features Using Neural Networks, 2016, INTERSPEECH.
[26] Gábor Gosztolya, et al. Multi-Task Learning of Speech Recognition and Speech Synthesis Parameters for Ultrasound-based Silent Speech Interfaces, 2018, INTERSPEECH.
[27] J. M. Gilbert, et al. Development of a (silent) speech recognition system for patients following laryngectomy, 2008, Medical Engineering & Physics.
[28] Jun Wang, et al. Preliminary Test of a Real-Time, Interactive Silent Speech Interface Based on Electromagnetic Articulograph, 2014, SLPAT@ACL.
[29] Kenji Doya, et al. Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning, 2017, Neural Networks.
[30] Hemant A. Patil, et al. Effectiveness of Generative Adversarial Network for Non-Audible Murmur-to-Whisper Speech Conversion, 2018, INTERSPEECH.
[31] António J. S. Teixeira, et al. Enhancing multimodal silent speech interfaces with feature selection, 2014, INTERSPEECH.
[32] Myungjong Kim, et al. Speaker-Independent Silent Speech Recognition From Flesh-Point Articulatory Movements Using an LSTM Neural Network, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[33] Thomas Hueber, et al. Feature extraction using multimodal convolutional neural networks for visual speech recognition, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] J. M. Gilbert, et al. Silent speech interfaces, 2010, Speech Communication.
[35] Gérard Chollet, et al. Statistical Mapping Between Articulatory and Acoustic Data for an Ultrasound-Based Silent Speech Interface, 2011, INTERSPEECH.
[36] Martin Andrews. Compressing Word Embeddings, 2016, ICONIP.
[37] Jianwu Dang, et al. Prediction of F0 Based on Articulatory Features Using DNN, 2017, ISSP.
[38] Tanja Schultz, et al. Biosignal-Based Spoken Communication: A Survey, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[39] Jiro Katto, et al. Deep Convolutional AutoEncoder-based Lossy Image Compression, 2018, 2018 Picture Coding Symposium (PCS).
[40] Tanja Schultz, et al. Direct conversion from facial myoelectric signals to speech using Deep Neural Networks, 2015, 2015 International Joint Conference on Neural Networks (IJCNN).
[41] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[42] Keiichi Tokuda, et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation, 1994, ICSLP.
[43] Jürgen Schmidhuber, et al. Lipreading with long short-term memory, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Stefano Ermon, et al. Learning Hierarchical Features from Generative Models, 2017, ArXiv.
[45] Stefano Ermon, et al. Learning Hierarchical Features from Deep Generative Models, 2017, ICML.
[46] Myung Jong Kim, et al. Articulation-to-Speech Synthesis Using Articulatory Flesh Point Sensors' Orientation Information, 2018, INTERSPEECH.
[47] M. Stone. A guide to analysing tongue motion from ultrasound images, 2005, Clinical Linguistics & Phonetics.
[48] Tanja Schultz, et al. Estimation of fundamental frequency from surface electromyographic data: EMG-to-F0, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Jun Wang, et al. Sentence recognition from articulatory movements for silent speech interfaces, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[50] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[51] Heiga Zen, et al. Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends, 2015, IEEE Signal Processing Magazine.
[52] Matthias Janke, et al. EMG-to-Speech: Direct Generation of Speech From Facial Electromyographic Signals, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[53] Gábor Gosztolya, et al. DNN-Based Ultrasound-to-Speech Conversion for a Silent Speech Interface, 2017, INTERSPEECH.
[54] Tanja Schultz, et al. Domain-Adversarial Training for Session Independent EMG-based Speech Recognition, 2018, INTERSPEECH.