The PyTorch-Kaldi Speech Recognition Toolkit
Titouan Parcollet | Yoshua Bengio | Mirco Ravanelli
[1] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[3] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[4] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[5] Yoshua Bengio, et al. Twin Regularization for online speech recognition, 2018, INTERSPEECH.
[6] Georges Linarès, et al. The LIA Speech Recognition System: From 10xRT to 1xRT, 2007, TSD.
[7] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[8] Maurizio Omologo, et al. Automatic context window composition for distant speech recognition, 2018, Speech Commun.
[9] Maurizio Omologo, et al. Contaminated speech training methods for robust DNN-HMM distant speech recognition, 2017, INTERSPEECH.
[10] John Salvatier, et al. Theano: A Python framework for fast computation of mathematical expressions, 2016, ArXiv.
[11] H. Bourlard, et al. Interpretation of Multiparty Meetings the AMI and Amida Projects, 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.
[12] Maurizio Omologo. A prototype of distant-talking interface for control of interactive TV, 2010, 2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers.
[13] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[14] Jon Barker, et al. The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[15] Geoffrey E. Hinton, et al. A Simple Way to Initialize Recurrent Networks of Rectified Linear Units, 2015, ArXiv.
[16] Ying Zhang, et al. Batch normalized recurrent neural networks, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Yoshua Bengio, et al. Light Gated Recurrent Units for Speech Recognition, 2018, IEEE Transactions on Emerging Topics in Computational Intelligence.
[18] Hermann Ney, et al. RASR - The RWTH Aachen University Open Source Speech Recognition Toolkit, 2011.
[19] Amit Agarwal, et al. CNTK: Microsoft's Open-Source Deep-Learning Toolkit, 2016, KDD.
[20] Peter Bell, et al. Multitask Learning of Context-Dependent Targets in Deep Neural Network Acoustic Models, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Dong Yu, et al. Automatic Speech Recognition: A Deep Learning Approach, 2014.
[22] Yoshua Bengio, et al. Interpretable Convolutional Filters with SincNet, 2018, ArXiv.
[23] Mehryar Mohri, et al. Finite-State Transducers in Language and Speech Processing, 1997, CL.
[24] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[25] Shrikanth S. Narayanan, et al. Pykaldi: A Python Wrapper for Kaldi, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[27] DeLiang Wang, et al. Joint noise adaptive training for robust automatic speech recognition, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Maurizio Omologo, et al. The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[29] John R. Hershey, et al. Hybrid CTC/Attention Architecture for End-to-End Speech Recognition, 2017, IEEE Journal of Selected Topics in Signal Processing.
[30] Yoshua Bengio, et al. Speaker Recognition from Raw Waveform with SincNet, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[31] Mirco Ravanelli, et al. Deep Learning for Distant Speech Recognition, 2017, ArXiv.
[32] Liang Lu, et al. Deep beamforming networks for multi-channel speech recognition, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Yoshua Bengio, et al. A network of deep neural networks for Distant Speech Recognition, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[35] Yoshua Bengio, et al. Batch-normalized joint training for DNN-based distant speech recognition, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[36] Tatsuya Kawahara, et al. Recent Development of Open-Source Speech Recognition Engine Julius, 2009.
[37] Inchul Song, et al. RNNDROP: A novel dropout for RNNS in ASR, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[38] Yoshua Bengio, et al. Improving Speech Recognition by Revising Gated Recurrent Units, 2017, INTERSPEECH.
[39] Maurizio Omologo, et al. Realistic Multi-Microphone Data Simulation for Distant Speech Recognition, 2016, INTERSPEECH.
[40] Petros Maragos, et al. The DIRHA simulated corpus, 2014, LREC.
[41] Mark J. F. Gales, et al. Maximum likelihood linear transformations for HMM-based speech recognition, 1998, Comput. Speech Lang.