Domain Robust Feature Extraction for Rapid Low Resource ASR Development
[1] R. Maas, et al. A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research, 2016, EURASIP Journal on Advances in Signal Processing.
[2] Lukás Burget, et al. Three ways to adapt a CTS recognizer to unseen reverberated speech in BUT system for the ASpIRE challenge, 2015, INTERSPEECH.
[3] Steve Renals, et al. Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[4] Ebru Arisoy, et al. Turkish Broadcast News Transcription and Retrieval, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Hynek Hermansky, et al. Robust speech recognition in unknown reverberant and noisy conditions, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[6] Yifan Gong, et al. Unsupervised adaptation with domain separation networks for robust speech recognition, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[7] Dan Stowell, et al. An Open Dataset for Research on Audio Field Recording Archives: freefield1010, 2013, Semantic Audio.
[8] Ebru Arisoy, et al. Compositional Neural Network Language Models for Agglutinative Languages, 2016, INTERSPEECH.
[9] Peter Vary, et al. A binaural room impulse response database for the evaluation of dereverberation algorithms, 2009, 2009 16th International Conference on Digital Signal Processing.
[10] Steve Renals, et al. Multilingual training of deep neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[11] Siddharth Dalmia, et al. Epitran: Precision G2P for Many Languages, 2018, LREC.
[12] Patrick A. Naylor, et al. Evaluation of speech dereverberation algorithms using the MARDY database, 2006.
[13] Florian Metze, et al. Distributed learning of multilingual DNN feature extractors using GPUs, 2014, INTERSPEECH.
[14] Hervé Bourlard, et al. Fast Language Adaptation Using Phonological Information, 2018, INTERSPEECH.
[15] Sanjeev Khudanpur, et al. JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[16] Mark J. F. Gales, et al. Investigation of multilingual deep neural networks for spoken term detection, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[17] Mary Harper. The Automatic Speech Recognition In Reverberant Environments (ASpIRE) challenge, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[18] Florian Metze, et al. Extracting deep bottleneck features using stacked auto-encoders, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[19] Florian Metze, et al. Sequence-Based Multi-Lingual Low Resource Speech Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Dong Yu, et al. Improved Bottleneck Features Using Pretrained Deep Neural Networks, 2011, INTERSPEECH.
[21] Yajie Miao, et al. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[22] Tanja Schultz, et al. Fast bootstrapping of LVCSR systems with multilingual phoneme sets, 1997, EUROSPEECH.
[23] Steve Renals, et al. Differentiable pooling for unsupervised speaker adaptation, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Andreas Stolcke, et al. Cross-Domain and Cross-Language Portability of Acoustic Features Estimated by Multilayer Perceptrons, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[25] Simão Ferraz de Campos Neto. The ITU-T Software Tool Library, 1999, Int. J. Speech Technol.
[26] Hynek Hermansky, et al. Multilingual MLP features for low-resource LVCSR systems, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Sanjeev Khudanpur, et al. A time delay neural network architecture for efficient modeling of long temporal contexts, 2015, INTERSPEECH.
[28] Horacio Franco, et al. Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition, 2016, INTERSPEECH.
[29] Ngoc Thang Vu, et al. Multilingual bottle-neck features and its application for under-resourced languages, 2012, SLTU.
[30] Arindam Mandal, et al. Normalized amplitude modulation features for large vocabulary noise-robust speech recognition, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Jon Barker, et al. The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[32] Florian Metze, et al. Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[33] Tomohiro Nakatani, et al. The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech, 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[34] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[35] Yifan Gong, et al. Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[36] Wonkyum Lee, et al. Semi-supervised training in low-resource ASR and KWS, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Yusuke Shinohara, et al. Adversarial Multi-Task Learning of Deep Neural Networks for Robust Speech Recognition, 2016, INTERSPEECH.
[38] Damian Murphy, et al. OpenAIR: An Interactive Auralization Web Resource and Database, 2010.
[39] Li-Rong Dai, et al. Fast Adaptation of Deep Neural Network Based on Discriminant Codes for Speech Recognition, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[40] Chandra Sekhar Seelamantula, et al. Gammatone wavelet cepstral coefficients for robust speech recognition, 2013, 2013 IEEE International Conference of IEEE Region 10 (TENCON 2013).
[41] Sanjeev Khudanpur, et al. Audio augmentation for speech recognition, 2015, INTERSPEECH.
[42] Martin Karafiát, et al. Adaptation of multilingual stacked bottle-neck neural network structure for new language, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Martin Graciarena, et al. Damped oscillator cepstral coefficients for robust speech recognition, 2013, INTERSPEECH.
[44] Satoshi Nakamura, et al. Acoustical Sound Database in Real Environments for Sound Scene Understanding and Hands-Free Speech Recognition, 2000, LREC.
[45] Yanning Zhang, et al. An unsupervised deep domain adaptation approach for robust speech recognition, 2017, Neurocomputing.