Twin Regularization for online speech recognition