Memory-Enhanced Neural Networks and NMF for Robust ASR
Björn W. Schuller | Martin Wöllmer | Gerhard Rigoll | Felix Weninger | Jürgen T. Geiger | Jort F. Gemmeke
[1] Björn W. Schuller,et al. Analyzing the memory of BLSTM Neural Networks for enhanced emotion classification in dyadic spoken interactions , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Yuuki Tachioka,et al. Discriminative Methods for Noise Robust Speech Recognition: A CHiME Challenge Benchmark , 2013 .
[3] Chong Kwan Un,et al. Speech recognition in noisy environments using first-order vector Taylor series , 1998, Speech Commun..
[4] Reinhold Häb-Umbach,et al. Model-Based Feature Enhancement for Reverberant Speech Recognition , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Hynek Hermansky,et al. RASTA-PLP speech analysis technique , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[6] J.A. Bilmes,et al. Graphical model architectures for speech recognition , 2005, IEEE Signal Processing Magazine.
[7] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[8] Tuomas Virtanen,et al. Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Brian Kingsbury,et al. Boosted MMI for model and feature-space discriminative training , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[10] Tuomas Virtanen,et al. Compact long context spectral factorisation models for noise robust recognition of medium vocabulary speech , 2013 .
[11] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[12] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[13] Tara N. Sainath,et al. Exemplar-Based Processing for Speech Recognition: An Overview , 2012, IEEE Signal Processing Magazine.
[14] Louis ten Bosch,et al. Using a DBN to integrate sparse classification and GMM-based ASR , 2010, INTERSPEECH.
[15] Yongqiang Wang,et al. An investigation of deep neural networks for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Biing-Hwang Juang,et al. Recurrent deep neural networks for robust speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Björn Schuller,et al. The TUM+TUT+KUL Approach to the 2nd CHiME Challenge: Multi-Stream ASR Exploiting BLSTM Networks and Sparse NMF , 2013, ICASSP 2013.
[18] Björn W. Schuller,et al. A multi-stream ASR framework for BLSTM modeling of conversational speech , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Ning Ma,et al. The PASCAL CHiME speech separation and recognition challenge , 2013, Comput. Speech Lang..
[20] Daniel Povey,et al. Revisiting Recurrent Neural Networks for robust ASR , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[22] Björn W. Schuller,et al. Robust in-car spelling recognition - a tandem BLSTM-HMM approach , 2009, INTERSPEECH.
[23] Francesco Nesta,et al. A Flexible Spatial Blind Source Extraction Framework for Robust Speech Recognition in Noisy Environments , 2013 .
[24] Ulpu Remes,et al. Uncertainty Measures for Improving Exemplar-Based Source Separation , 2011, INTERSPEECH.
[25] Andrew C. Morris,et al. Recent advances in the multi-stream HMM/ANN hybrid approach to noise robust ASR , 2005, Comput. Speech Lang..
[26] H. Ney,et al. Linear discriminant analysis for improved large vocabulary continuous speech recognition , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[27] Tuomas Virtanen,et al. Toward a practical implementation of exemplar-based noise robust ASR , 2011, 2011 19th European Signal Processing Conference.
[28] Bhiksha Raj,et al. Techniques for Noise Robustness in Automatic Speech Recognition , 2012, Techniques for Noise Robustness in Automatic Speech Recognition.
[29] Rainer Martin,et al. Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..
[30] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[31] George Saon,et al. Maximum likelihood discriminant feature spaces , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[32] Kuldip K. Paliwal,et al. Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..
[33] Jon Barker,et al. The second ‘CHiME’ speech separation and recognition challenge: Datasets, tasks and baselines , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[34] Ulpu Remes,et al. Techniques for Noise Robustness in Automatic Speech Recognition , 2012 .
[35] Jürgen Schmidhuber,et al. Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.
[36] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[37] Yuuki Tachioka,et al. Effectiveness of discriminative training and feature transformation for reverberated and noisy speech , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[38] Tomohiro Nakatani,et al. Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling? , 2013, INTERSPEECH.
[39] Paris Smaragdis,et al. Convolutive Speech Bases and Their Application to Supervised Speech Separation , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[40] Alex Graves,et al. Supervised Sequence Labelling with Recurrent Neural Networks , 2012, Studies in Computational Intelligence.
[41] Björn W. Schuller,et al. Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise , 2012, INTERSPEECH.
[42] Bhiksha Raj,et al. Active-Set Newton Algorithm for Overcomplete Non-Negative Representations of Audio , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[43] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[44] Björn W. Schuller,et al. Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory , 2013, Comput. Speech Lang..
[45] John R. Hershey,et al. Efficient model-based speech separation and denoising using non-negative subspace analysis , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.