Keyword Based Speaker Localization: Localizing a Target Speaker in a Multi-speaker Environment
[1] R. Maas, et al. A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research, 2016, EURASIP Journal on Advances in Signal Processing.
[2] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[3] Scott Rickard, et al. Blind separation of speech mixtures via time-frequency masking, 2004, IEEE Transactions on Signal Processing.
[4] Daniel P. W. Ellis, et al. An EM Algorithm for Localizing Multiple Sound Sources in Reverberant Environments, 2006, NIPS.
[5] Francesco Nesta, et al. Cumulative State Coherence Transform for a Robust Two-Channel Multiple Source Localization, 2009, ICA.
[6] Hiroshi Sawada, et al. Grouping Separated Frequency Components by Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[7] Francesco Piazza, et al. A neural network based algorithm for speaker localization in a multi-room environment, 2016, 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP).
[8] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] G. Carter, et al. The generalized correlation method for estimation of time delay, 1976.
[10] Junichi Yamagishi, et al. Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks, 2016, INTERSPEECH.
[11] Hynek Hermansky, et al. A long, deep and wide artificial neural net for robust speech recognition in unknown noise, 2014, INTERSPEECH.
[12] Emmanuel Vincent, et al. A French Corpus for Distant-Microphone Speech Processing in Real Homes, 2016, INTERSPEECH.
[13] Yifan Gong, et al. Robust Automatic Speech Recognition, 2015.
[14] B. C. Wheeler, et al. Localization of multiple sound sources with two microphones, 2000, The Journal of the Acoustical Society of America.
[15] Guy J. Brown, et al. Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions, 2015, INTERSPEECH.
[16] C. Faller, et al. Source localization in complex listening situations: selection of binaural cues based on interaural coherence, 2004, The Journal of the Acoustical Society of America.
[17] DeLiang Wang, et al. Speech segregation based on sound localization, 2001, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222).
[18] Kazunori Komatani, et al. Discriminative multiple sound source localization based on deep neural networks using independent location model, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[19] Mark J. F. Gales, et al. Maximum likelihood linear transformations for HMM-based speech recognition, 1998, Comput. Speech Lang.
[20] Daniel Pressnitzer, et al. Predictive denoising of speech in noise using deep neural networks, 2017.
[21] Iván V. Meza, et al. Localization of sound sources in robotics: A review, 2017, Robotics Auton. Syst.
[22] Ulpu Remes, et al. Techniques for Noise Robustness in Automatic Speech Recognition, 2012.
[23] Emanuel A. P. Habets, et al. Multi-Speaker Localization Using Convolutional Neural Network Trained with Noise, 2017, ArXiv.
[24] Emmanuel Vincent, et al. Multi-source TDOA estimation in reverberant audio using angular spectra and clustering, 2012, Signal Process.
[25] Emanuel A. P. Habets, et al. Broadband DOA estimation using convolutional neural networks trained with noise signals, 2017, 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[26] John R. Hershey, et al. Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks, 2015, INTERSPEECH.
[27] Emmanuel Vincent, et al. Audio Source Separation and Speech Enhancement, 2018.
[28] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[29] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[30] Emmanuel Vincent, et al. A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[31] R. O. Schmidt, et al. Multiple emitter location and signal parameter estimation, 1986.
[32] Özgür Yilmaz, et al. On the approximate W-disjoint orthogonality of speech, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[33] Dinh-Tuan Pham, et al. A phase-based dual microphone method to count and locate audio sources in reverberant rooms, 2009, 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.