On the importance of event detection for ASR
暂无分享,去创建一个
[1] Tara N. Sainath,et al. Joint training of convolutional and non-convolutional neural networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Amit Srivastava,et al. Online speaker adaptation and tracking for real-time speech recognition , 2005, INTERSPEECH.
[3] Navdeep Jaitly,et al. Hybrid speech recognition with Deep Bidirectional LSTM , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[4] Jürgen Schmidhuber,et al. Learning to forget: continual prediction with LSTM , 1999 .
[5] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[6] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[7] Andrew W. Senior,et al. Long short-term memory recurrent neural network architectures for large scale acoustic modeling , 2014, INTERSPEECH.
[8] Jean-Luc Gauvain,et al. Partitioning and transcription of broadcast news data , 1998, ICSLP.
[9] Mohammad Hossein Moattar,et al. A review on speaker diarization systems and approaches , 2012, Speech Commun..
[10] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[11] Jean-Luc Gauvain,et al. Multistage speaker diarization of broadcast news , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[12] John H. L. Hansen,et al. Efficient audio stream segmentation via the combined T/sup 2/ statistic and Bayesian information criterion , 2005, IEEE Transactions on Speech and Audio Processing.
[13] Tara N. Sainath,et al. Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization , 2012, INTERSPEECH.
[14] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[15] Ponani S. Gopalakrishnan,et al. Clustering via the Bayesian information criterion with applications in speech recognition , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[16] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Björn W. Schuller,et al. From speech to letters - using a novel neural network architecture for grapheme based ASR , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[18] Jürgen Schmidhuber,et al. Learning Precise Timing with LSTM Recurrent Networks , 2003, J. Mach. Learn. Res..
[19] Hsin-Min Wang,et al. A sequential metric-based audio segmentation method via the Bayesian information criterion , 2003, INTERSPEECH.
[20] Jan Nouza,et al. Speaker-adaptive speech recognition using speaker diarization for improved transcription of large spoken archives , 2013, Speech Commun..
[21] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[22] Brian Kingsbury,et al. Improvements to the IBM speech activity detection system for the DARPA RATS program , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Jason W. Pelecanos,et al. Online speaker diarization using adapted i-vector transforms , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).