End-to-End Whispered Speech Recognition with Frequency-Weighted Approaches and Pseudo Whisper Pre-training
暂无分享,去创建一个
Lin-Shan Lee | Hung-yi Lee | Alexander H. Liu | Heng-Jui Chang | Hung-yi Lee | Lin-Shan Lee | Heng-Jui Chang
[1] Prasanta Kumar Ghosh,et al. A Study on Robustness of Articulatory Features for Automatic Speech Recognition of Neutral and Whispered Speech , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Dorde T. Grozdic,et al. Whispered Speech Database: Design, Processing and Application , 2013, TSD.
[3] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[4] Alex Waibel,et al. Improving Sequence-To-Sequence Speech Recognition Training with On-The-Fly Data Augmentation , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] WHALETRANS: E2E WHisper to nAturaL spEech conversion using modified TRANSformer network , 2020, ArXiv.
[6] Maja Pantic,et al. Visual-Only Recognition of Normal, Whispered and Silent Speech , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Hermann Ney,et al. From Feedforward to Recurrent LSTM Neural Networks for Language Modeling , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[8] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Carlos Busso,et al. Lipreading approach for isolated digits recognition under whisper and neutral speech , 2014, INTERSPEECH.
[10] Dorde T. Grozdic,et al. Whispered Speech Recognition Using Deep Denoising Autoencoder and Inverse Filtering , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[11] Fadi Biadsy,et al. Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation , 2019, INTERSPEECH.
[12] Hiroshi Sato,et al. Neural Whispered Speech Detection with Imbalanced Learning , 2019, INTERSPEECH.
[13] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] S. Jovicic,et al. Acoustic analysis of consonants in whispered speech. , 2008, Journal of voice : official journal of the Voice Foundation.
[16] Boon Pang Lim,et al. Transfer Learning with Bottleneck Feature Networks for Whispered Speech Recognition , 2016, INTERSPEECH.
[17] Kazuya Takeda,et al. Analysis and recognition of whispered speech , 2005, Speech Commun..
[18] Mark A. Clements,et al. Reconstruction of speech from whispers , 2002, MAVEBA.
[19] Kyu J. Han,et al. State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention with Dilated 1D Convolutions , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[20] Carlos Busso,et al. Audiovisual corpus to analyze whisper speech , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[22] Bin Ma,et al. A whispered Mandarin corpus for speech technology applications , 2014, INTERSPEECH.
[23] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[24] Rajesh M. Hegde,et al. Significance of parametric spectral ratio methods in detection and recognition of whispered speech , 2012, EURASIP J. Adv. Signal Process..
[25] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[26] Carla Teixeira Lopes,et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .
[27] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[28] Liang Lu,et al. Noise-robust whispered speech recognition using a non-audible-murmur microphone with VTS compensation , 2012, 2012 8th International Symposium on Chinese Spoken Language Processing.
[29] Gabriel Synnaeve,et al. Who Needs Words? Lexicon-Free Speech Recognition , 2019, INTERSPEECH.
[30] Marius Cotescu,et al. Voice Conversion for Whispered Speech Synthesis , 2020, IEEE Signal Processing Letters.
[31] Yu Zhang,et al. Very deep convolutional networks for end-to-end speech recognition , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Myungjong Kim,et al. Recognizing Whispered Speech Produced by an Individual with Surgically Reconstructed Larynx Using Articulatory Movement Data. , 2016, Workshop on Speech and Language Processing for Assistive Technologies.
[33] John H. L. Hansen,et al. Deep neural network training for whispered speech recognition using small databases and generative model sampling , 2017, Int. J. Speech Technol..
[34] D. T. Grozdic,et al. Application of neural networks in whispered speech recognition , 2012, 2012 20th Telecommunications Forum (TELFOR).
[35] Boon Pang Lim,et al. Computational differences between whispered and non-whispered speech , 2011 .
[36] Yu Zhang,et al. Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM , 2017, INTERSPEECH.
[37] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[38] Gabriel Synnaeve,et al. Wav2Letter: an End-to-End ConvNet-based Speech Recognition System , 2016, ArXiv.
[39] Philip Chan,et al. Toward accurate dynamic time warping in linear time and space , 2007, Intell. Data Anal..
[40] Tanja Schultz,et al. Whispery speech recognition using adapted articulatory features , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[41] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[42] Gabriel Synnaeve,et al. Letter-Based Speech Recognition with Gated ConvNets , 2017, ArXiv.
[43] John H. L. Hansen,et al. Generative Modeling of Pseudo-Whisper for Robust Whispered Speech Recognition , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[44] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[45] Julius Kunze,et al. Transfer Learning for Speech Recognition on a Budget , 2017, Rep4NLP@ACL.