An analysis of environment, microphone and data simulation mismatches in robust speech recognition
Jon Barker | Emmanuel Vincent | Shinji Watanabe | Aditya Arie Nugraha | Ricard Marxer
[1] Daniel P. W. Ellis,et al. Model-Based Expectation-Maximization Source Separation and Localization , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Steve Renals,et al. Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[3] Jacob Benesty,et al. Speech Processing in Modern Communication: Challenges and Perspectives , 2012 .
[4] Takuya Yoshioka,et al. Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Jon Barker,et al. The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[6] Tuomas Virtanen,et al. Modelling non-stationary noise with spectral factorisation in automatic speech recognition , 2013, Comput. Speech Lang..
[7] Lukás Burget,et al. iVector-based discriminative adaptation for automatic speech recognition , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[8] Masahito Togami,et al. Unified ASR system using LGM-based source separation, noise-robust feature extraction, and word hypothesis selection , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[9] Ning Ma,et al. The PASCAL CHiME speech separation and recognition challenge , 2013, Comput. Speech Lang..
[10] Tomohiro Nakatani,et al. The reverb challenge: A common evaluation framework for dereverberation and recognition of reverberant speech , 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[11] Jonathan Le Roux,et al. The MERL/SRI system for the 3RD CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[12] Alessio Brutti,et al. On the relationship between Early-to-Late Ratio of Room Impulse Responses and ASR performance in reverberant environments , 2016, Speech Commun..
[13] Jon Barker,et al. The third 'CHiME' speech separation and recognition challenge: Analysis and outcomes , 2017, Comput. Speech Lang..
[14] Koichi Shinoda. Speaker adaptation techniques for automatic speech recognition , 2011 .
[15] Michael S. Brandstein,et al. Robust Localization in Reverberant Rooms , 2001, Microphone Arrays.
[16] Yifan Gong,et al. Robust automatic speech recognition : a bridge to practical application , 2015 .
[17] Shoko Araki,et al. Equivalence between Frequency-Domain Blind Source Separation and Frequency-Domain Adaptive Beamforming for Convolutive Mixtures , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[18] Jeff A. Bilmes,et al. The design and collection of COSINE, a multi-microphone in situ speech corpus recorded in noisy environments , 2012, Comput. Speech Lang..
[19] Antoine Liutkus,et al. Robust ASR using neural network based speech enhancement and feature simulation , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[20] Mark J. F. Gales,et al. The MGB challenge: Evaluating multi-genre broadcast media recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[21] Maurizio Omologo,et al. The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[22] Walter Kellermann,et al. Robust coherence-based spectral enhancement for distant speech recognition , 2015, ArXiv.
[23] Birger Kollmeier,et al. A CHiME-3 challenge system: Long-term acoustic features for noise robust automatic speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[24] James Glass,et al. Research Developments and Directions in Speech Recognition and Understanding, Part 1 , 2009 .
[25] Mary Harper. The Automatic Speech recognition In Reverberant Environments (ASpIRE) challenge , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[26] Ehud Weinstein,et al. Signal enhancement using beamforming and nonstationarity with applications to speech , 2001, IEEE Trans. Signal Process..
[27] Stephen Cox,et al. Some statistical issues in the comparison of speech recognition algorithms , 1989, International Conference on Acoustics, Speech, and Signal Processing.
[28] Yongqiang Wang,et al. Adaptation of deep neural network acoustic models using factorised i-vectors , 2014, INTERSPEECH.
[29] Xiaofei Wang,et al. Noise Robust IOA/CAS Speech Separation and Recognition System For The Third 'CHIME' Challenge , 2015, ArXiv.
[30] Thomas Hain,et al. The sheffield wargames corpus , 2013, INTERSPEECH.
[31] Hermann Ney,et al. Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[32] H. Bourlard,et al. Interpretation of Multiparty Meetings the AMI and Amida Projects , 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.
[33] Björn W. Schuller,et al. Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR , 2015, LVA/ICA.
[34] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[35] Lori Lamel,et al. The translanguage English database (TED) , 1994, ICSLP.
[36] Michael I. Mandel,et al. Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[37] David Pearce,et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.
[38] Emmanuel Vincent,et al. Multichannel Audio Source Separation With Deep Neural Networks , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[39] Rémi Gribonval,et al. Oracle estimators for the benchmarking of source separation algorithms , 2007, Signal Process..
[40] Reinhold Häb-Umbach,et al. BLSTM supported GEV beamformer front-end for the 3RD CHiME challenge , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[41] Roland Badeau,et al. Multichannel Audio Source Separation With Probabilistic Reverberation Priors , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[42] Dimitra Vergyri,et al. Medium-duration modulation cepstral feature for robust speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Bernd T. Meyer,et al. Mutual benefits of auditory spectro-temporal Gabor features and deep learning for the 3rd CHiME Challenge , 2015 .
[44] Henry Cox,et al. Robust adaptive beamforming , 2005, IEEE Trans. Acoust. Speech Signal Process..
[45] Naoyuki Kanda,et al. Elastic spectral distortion for low resource speech recognition with deep neural networks , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[46] Sven Fischer,et al. Suppression of coherent and incoherent noise using a microphone array , 1994 .
[47] Xavier Anguera Miró,et al. Acoustic Beamforming for Speaker Diarization of Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[48] Rémi Gribonval,et al. Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[49] Jun Du,et al. An Experimental Study on Speech Enhancement Based on Deep Neural Networks , 2014, IEEE Signal Processing Letters.
[50] Marc Moonen,et al. Superdirective Beamforming Robust Against Microphone Mismatch , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[51] John McDonough,et al. Distant Speech Recognition , 2009 .
[52] X. Mestre,et al. On diagonal loading for minimum variance beamformers , 2003, Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology (IEEE Cat. No.03EX795).
[53] Mark A. Poletti,et al. Spatially Robust Far-field Beamforming Using the von Mises(-Fisher) Distribution , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[54] Antoine Liutkus,et al. Scalable audio separation with light Kernel Additive Modelling , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[55] DeLiang Wang,et al. Noise Perturbation Improves Supervised Speech Separation , 2015, LVA/ICA.
[56] Paris Smaragdis,et al. Adaptive Denoising Autoencoders: A Fine-Tuning Scheme to Learn from Test Mixtures , 2015, LVA/ICA.
[57] R. Zelinski,et al. A microphone array with adaptive post-filtering for noise reduction in reverberant rooms , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[58] Qiang Chen,et al. Network In Network , 2013, ICLR.
[59] Sergei Aleinik,et al. Improvement of Microphone Array Characteristics for Speech Capturing , 2015 .
[60] David Lo,et al. DSM: a specification mining tool using recurrent neural network based language model , 2018, ESEC/SIGSOFT FSE.
[61] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[62] Chng Eng Siong,et al. Speech enhancement using beamforming and non negative matrix factorization for robust speech recognition in the CHiME-3 challenge , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[63] P. Stoica,et al. Robust Adaptive Beamforming , 2013 .
[64] Fengyun Zhu,et al. Noise-Robust ASR for the third 'CHiME' Challenge Exploiting Time-Frequency Masking based Multi-Channel Speech Enhancement and Recurrent Neural Network , 2015, ArXiv.
[65] Bhiksha Raj,et al. Microphone array processing for distant speech recognition: Towards real-world deployment , 2012, Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference.
[66] Emmanuel Vincent,et al. Multichannel music separation with deep neural networks , 2016, 2016 24th European Signal Processing Conference (EUSIPCO).
[67] Jonathan G. Fiscus,et al. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER) , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[68] Ulpu Remes,et al. Techniques for Noise Robustness in Automatic Speech Recognition , 2012 .
[69] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[70] Walter Kellermann,et al. Unbiased coherent-to-diffuse ratio estimation for dereverberation , 2014, 2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC).
[71] Martin Graciarena,et al. Damped oscillator cepstral coefficients for robust speech recognition , 2013, INTERSPEECH.
[72] Chengzhu Yu,et al. The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[73] Maxim Korenevsky,et al. Adaptive beamforming and adaptive training of DNN acoustic models for enhanced multichannel noisy speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[74] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[75] Yongqiang Wang,et al. An investigation of deep neural networks for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[76] John H. L. Hansen,et al. "CU-move": analysis & corpus development for interactive in-vehicle speech systems , 2001, INTERSPEECH.
[77] James R. Glass,et al. Developments and directions in speech recognition and understanding, Part 1 [DSP Education] , 2009, IEEE Signal Processing Magazine.