Improved Single System Conversational Telephone Speech Recognition with VGG Bottleneck Features
暂无分享,去创建一个
Roger Hsiao | Jeff Z. Ma | Man-Hung Siu | Tim Ng | William Hartmann | Francis Keith | Tim Ng | M. Siu | William Hartmann | Roger Hsiao | Francis Keith
[1] Richard M. Schwartz,et al. The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition system , 2005, INTERSPEECH.
[2] Sanjeev Khudanpur,et al. A time delay neural network architecture for efficient modeling of long temporal contexts , 2015, INTERSPEECH.
[3] Geoffrey Zweig,et al. An introduction to computational networks and the computational network toolkit (invited talk) , 2014, INTERSPEECH.
[4] Geoffrey Zweig,et al. The microsoft 2016 conversational speech recognition system , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Yiming Wang,et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI , 2016, INTERSPEECH.
[6] Martin Karafiát,et al. The language-independent bottleneck features , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[7] Richard M. Schwartz,et al. Unsupervised adaptation for deep neural network using linear least square method , 2015, INTERSPEECH.
[8] Martin Karafiát,et al. Convolutive Bottleneck Network features for LVCSR , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[9] Xiaodong Cui,et al. Network architectures for multilingual speech representation learning , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] George Saon,et al. The IBM 2016 English Conversational Telephone Speech Recognition System , 2016, INTERSPEECH.
[11] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[12] Andrew W. Senior,et al. Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition , 2014, ArXiv.
[13] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[14] Mark J. F. Gales,et al. CUED-RNNLM — An open-source toolkit for efficient training and evaluation of recurrent neural network language models , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Alexey Prudnikov,et al. Improving English Conversational Telephone Speech Recognition , 2016, INTERSPEECH.
[16] Sanjeev Khudanpur,et al. A pitch extraction algorithm tuned for automatic speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Ralf Schlüter,et al. Investigation on cross- and multilingual MLP features under matched and mismatched acoustical conditions , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Xiaodong Cui,et al. English Conversational Telephone Speech Recognition by Humans and Machines , 2017, INTERSPEECH.
[19] Daniel P. W. Ellis,et al. Tandem connectionist feature extraction for conventional HMM systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[20] Brian Kingsbury,et al. Very deep multilingual convolutional neural networks for LVCSR , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Lukás Burget,et al. Investigation into bottle-neck features for meeting speech recognition , 2009, INTERSPEECH.
[22] Jean-Marc Boite,et al. Nonlinear discriminant analysis for improved speech recognition , 1997, EUROSPEECH.
[23] Sanjeev Khudanpur,et al. Audio augmentation for speech recognition , 2015, INTERSPEECH.
[24] Richard M. Schwartz,et al. Improved Multilingual Training of Stacked Neural Network Acoustic Models for Low Resource Languages , 2016, INTERSPEECH.
[25] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[26] Stavros Tsakalidis,et al. Alternative networks for monolingual bottleneck features , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Richard M. Schwartz,et al. Comparison of Multiple System Combination Techniques for Keyword Spotting , 2016, INTERSPEECH.
[28] Roger Hsiao,et al. Unsupervised adaptation for deep neural networks using Alternating Direction Method of Multipliers , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Geoffrey Zweig,et al. Achieving Human Parity in Conversational Speech Recognition , 2016, ArXiv.
[30] Jan Silovský,et al. Sage: The New BBN Speech Processing Platform , 2016, INTERSPEECH.
[31] Tara N. Sainath,et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.