Rapid Speaker Adaptation of Neural Network Based Filterbank Layer for Automatic Speech Recognition
暂无分享,去创建一个
[1] Dong Yu,et al. Recent progresses in deep learning based acoustic models , 2017, IEEE/CAA Journal of Automatica Sinica.
[2] Alex Waibel,et al. Vocal Tract Length Normalization for Large Vocabulary Continuous Speech Recognition , 1997 .
[3] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[4] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[5] K. Maekawa. CORPUS OF SPONTANEOUS JAPANESE : ITS DESIGN AND EVALUATION , 2003 .
[6] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[7] Biing-Hwang Juang,et al. An application of discriminative feature extraction to filter-bank-based speech recognition , 2001, IEEE Trans. Speech Audio Process..
[8] Seiichi Nakagawa,et al. A deep neural network integrated with filterbank learning for speech recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Yi Jiang,et al. Auditory features based on Gammatone filters for robust speech recognition , 2013, 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013).
[10] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[11] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] Kenta Oono,et al. Chainer : a Next-Generation Open Source Framework for Deep Learning , 2015 .
[13] Jesse Engel,et al. Learning Multiscale Features Directly from Waveforms , 2016, INTERSPEECH.
[14] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[15] Li-Rong Dai,et al. Speaker Adaptation of Hybrid NN/HMM Model for Speech Recognition Based on Singular Value Decomposition , 2014, Journal of Signal Processing Systems.
[16] Hemant A. Patil,et al. Novel Unsupervised Auditory Filterbank Learning Using Convolutional RBM for Speech Recognition , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] Yun Lei,et al. Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions , 2014, INTERSPEECH.
[18] Steve Renals,et al. Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[19] R. Patterson,et al. B OF THE SVOS FINAL REPORT ( Part A : The Auditory Filterbank ) AN EFFICIENT AUDITORY FIL TERBANK BASED ON THE GAMMATONE FUNCTION , 2010 .
[20] Mark J. F. Gales,et al. Mean and variance adaptation within the MLLR framework , 1996, Comput. Speech Lang..
[21] Tara N. Sainath,et al. Learning filter banks within a deep neural network framework , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[22] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[23] Chandra Sekhar Seelamantula,et al. Gammatone wavelet Cepstral Coefficients for robust speech recognition , 2013, 2013 IEEE International Conference of IEEE Region 10 (TENCON 2013).
[24] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.