Maximum-a-Posteriori-Based Decoding for End-to-End Acoustic Models
暂无分享,去创建一个
[1] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[2] Naoyuki Kanda,et al. Maximum a posteriori Based Decoding for CTC Acoustic Models , 2016, INTERSPEECH.
[3] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] F ChenStanley,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[5] Tara N. Sainath,et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Kai Yu,et al. Phone Synchronous Speech Recognition With CTC Lattices , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Björn W. Schuller,et al. Robust speech recognition using long short-term memory recurrent neural networks for hybrid acoustic modelling , 2014, INTERSPEECH.
[8] Daniel Jurafsky,et al. First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs , 2014, ArXiv.
[9] Biing-Hwang Juang,et al. Recurrent deep neural networks for robust speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Unfolded recurrent neural networks for speech recognition , 2014, INTERSPEECH.
[11] Yajie Miao,et al. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[12] Georg Heigold,et al. Sequence discriminative distributed training of long short-term memory recurrent neural networks , 2014, INTERSPEECH.
[13] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[15] Andrew W. Senior,et al. Fast and accurate recurrent neural network acoustic models for speech recognition , 2015, INTERSPEECH.
[16] Yu Zhang,et al. Highway long short-term memory RNNS for distant speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Andrew W. Senior,et al. Flat start training of CD-CTC-SMBR LSTM RNN acoustic models , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Johan Schalkwyk,et al. Learning acoustic frame labeling for speech recognition with recurrent neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[20] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[21] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] Florian Metze,et al. An empirical exploration of CTC acoustic models , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Naoyuki Kanda,et al. Training data pseudo-shuffling and direct decoding framework for recurrent neural network based acoustic modeling , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[25] Daniel Jurafsky,et al. Lexicon-Free Conversational Speech Recognition with Neural Networks , 2015, NAACL.
[26] K. Maekawa. CORPUS OF SPONTANEOUS JAPANESE : ITS DESIGN AND EVALUATION , 2003 .
[27] Daniel Povey,et al. Revisiting Recurrent Neural Networks for robust ASR , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Andrew W. Senior,et al. Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition , 2014, ArXiv.
[29] Hagen Soltau,et al. Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition , 2016, INTERSPEECH.
[30] Tara N. Sainath,et al. Deep convolutional neural networks for LVCSR , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[31] Geoffrey Zweig,et al. Exploring multidimensional lstms for large vocabulary ASR , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Tara N. Sainath,et al. Deep Convolutional Neural Networks for Large-scale Speech Tasks , 2015, Neural Networks.
[33] S. J. Young,et al. Tree-based state tying for high accuracy acoustic modelling , 1994 .
[34] Tatsuya Kawahara. Benchmark test for speech recognition using the Corpus of Spontaneous Japanese , 2003 .
[35] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[36] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[37] Naoyuki Kanda,et al. Elastic spectral distortion for low resource speech recognition with deep neural networks , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[38] Andrew W. Senior,et al. Long short-term memory recurrent neural network architectures for large scale acoustic modeling , 2014, INTERSPEECH.
[39] Tara N. Sainath,et al. Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks , 2016, INTERSPEECH.
[40] Tao Xu,et al. Phone Synchronous Decoding with CTC Lattice , 2016, INTERSPEECH.
[41] Izhak Shafran,et al. Context dependent phone models for LSTM RNN acoustic modelling , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[42] Georg Heigold,et al. Multilingual acoustic models using distributed deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[43] Tara N. Sainath,et al. Acoustic modelling with CD-CTC-SMBR LSTM RNNS , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[44] Brian Roark,et al. Generalized Algorithms for Constructing Statistical Language Models , 2003, ACL.
[45] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.
[46] Geoffrey Zweig,et al. LSTM time and frequency recurrence for automatic speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[47] Eric Fosler-Lussier,et al. A study on the use of conditional random fields for automatic speech recognition , 2010 .
[48] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[49] Li Deng,et al. A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[50] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[51] Li Deng,et al. Ensemble deep learning for speech recognition , 2014, INTERSPEECH.