Robust ASR using neural network based speech enhancement and feature simulation
Antoine Liutkus | Siddharth Dalmia | Aditya Arie Nugraha | Sunit Sivasankaran | Irina Illina | Emmanuel Vincent | Juan Andres Morales-Cordovilla
[1] Björn W. Schuller, et al. Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR, 2015, LVA/ICA.
[2] Ramesh A. Gopinath, et al. Maximum likelihood modeling with Gaussian distributions for classification, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[3] David Pearce, et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.
[4] Franck Giron, et al. Deep neural network based instrument extraction from music, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Tetsuya Takiguchi, et al. Voice conversion in time-invariant speaker-independent space, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Paris Smaragdis, et al. Deep learning for monaural speech separation, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Mark J. F. Gales, et al. Maximum likelihood linear transformations for HMM-based speech recognition, 1998, Comput. Speech Lang.
[8] Dong Yu, et al. Deep Learning: Methods and Applications, 2014, Found. Trends Signal Process.
[9] Li Deng, et al. Front-End, Back-End, and Hybrid Techniques for Noise-Robust Speech Recognition, 2011, Robust Speech Recognition of Uncertain or Missing Data.
[10] Emmanuel Vincent, et al. Multi-source TDOA estimation in reverberant audio using angular spectra and clustering, 2012, Signal Process.
[11] Heiga Zen, et al. Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends, 2015, IEEE Signal Processing Magazine.
[12] Seiichi Nakagawa, et al. Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition, 2014, EURASIP J. Audio Speech Music. Process.
[13] Benedikt Loesch, et al. Adaptive Segmentation and Separation of Determined Convolutive Mixtures under Dynamic Conditions, 2010, LVA/ICA.
[14] Xabier Jaureguiberry, et al. Fusion Methods for Audio Source Separation, 2014.
[15] Paris Smaragdis, et al. Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Rémi Gribonval, et al. Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Bhiksha Raj, et al. Techniques for Noise Robustness in Automatic Speech Recognition, 2012, Techniques for Noise Robustness in Automatic Speech Recognition.
[18] Jon Barker, et al. The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[19] Bhiksha Raj, et al. Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-Field Sensors, 2012, IEEE Signal Processing Magazine.
[20] Lukás Burget, et al. Sequence-discriminative training of deep neural networks, 2013, INTERSPEECH.
[21] Emmanuel Vincent, et al. A General Flexible Framework for the Handling of Prior Information in Audio Source Separation, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Christophe Garcia, et al. An Online Backpropagation Algorithm with Validation Error-Based Adaptive Learning Rate, 2007, ICANN.
[23] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[24] Rémi Gribonval, et al. From Blind to Guided Audio Source Separation: How models and side information can improve the separation of sound, 2014, IEEE Signal Processing Magazine.
[25] Shigeo Abe. Pattern Classification, 2001, Springer London.
[26] Jun Du, et al. Speech separation based on improved deep neural networks with dual outputs of speech features for both target and interfering speakers, 2014, The 9th International Symposium on Chinese Spoken Language Processing.
[27] Thomas Hofmann, et al. Greedy Layer-Wise Training of Deep Networks, 2007.
[28] Treebank Penn, et al. Linguistic Data Consortium, 1999.
[29] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[30] DeLiang Wang, et al. Ideal ratio mask estimation using deep neural networks for robust speech recognition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[31] Antoine Liutkus, et al. Scalable audio separation with light Kernel Additive Modelling, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[33] Yoshua Bengio, et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.
[34] Jon Barker, et al. The second ‘chime’ speech separation and recognition challenge: Datasets, tasks and baselines, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[35] Christian Igel, et al. An Introduction to Restricted Boltzmann Machines, 2012, CIARP.
[36] B. Schölkopf, et al. Modeling Human Motion Using Binary Latent Variables, 2007.
[37] Petros Maragos, et al. The DIRHA simulated corpus, 2014, LREC.
[38] Antoine Liutkus, et al. Kernel Additive Models for Source Separation, 2014, IEEE Transactions on Signal Processing.
[39] Björn W. Schuller, et al. Discriminatively trained recurrent neural networks for single-channel speech separation, 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[40] John McDonough, et al. Distant Speech Recognition, 2009.
[41] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[42] Jan Cernocký, et al. Improved feature processing for deep neural networks, 2013, INTERSPEECH.