Gated Recurrent Fusion of Spatial and Spectral Features for Multi-Channel Speech Separation with Deep Embedding Representations
[1] Jont B. Allen,et al. Image method for efficiently simulating small‐room acoustics , 1979 .
[2] Bin Liu,et al. Spatial and spectral deep attention fusion for multi-channel speech separation using deep embedding features , 2020, ArXiv.
[3] Jie Li,et al. 3D Gated Recurrent Fusion for Semantic Scene Completion , 2020, ArXiv.
[4] Jonathan Le Roux,et al. Single-Channel Multi-Speaker Separation Using Deep Clustering , 2016, INTERSPEECH.
[5] Xiong Xiao,et al. Multi-Channel Overlapped Speech Recognition with Location Guided Speech Extraction Network , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[6] Dong Yu,et al. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Zhong-Qiu Wang,et al. Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[9] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[10] Jianhua Tao,et al. Deep Attention Fusion Feature for Speech Separation with End-to-End Post-filter Method , 2020, ArXiv.
[11] Bin Liu,et al. Utterance-level Permutation Invariant Training with Discriminative Learning for Single Channel Speech Separation , 2018, 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP).
[12] Paris Smaragdis,et al. Singing-Voice Separation from Monaural Recordings using Deep Recurrent Neural Networks , 2014, ISMIR.
[13] Mark D. Plumbley,et al. Combining Mask Estimates for Single Channel Audio Source Separation Using Deep Neural Networks , 2016, INTERSPEECH.
[14] Dong Yu,et al. Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information , 2019, INTERSPEECH.
[15] Jesper Jensen,et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Rémi Gribonval,et al. Performance measurement in blind audio source separation , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Jianhua Tao,et al. End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[18] Antony William Rix,et al. Perceptual evaluation of speech quality (PESQ): The new ITU standard for end-to-end speech quality a , 2002 .
[19] Zhong-Qiu Wang,et al. Integrating Spectral and Spatial Features for Multi-Channel Speaker Separation , 2018, INTERSPEECH.
[20] Masahito Togami,et al. Spatial Constraint on Multi-channel Deep Clustering , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] John J. Foxe,et al. Attentional Selection in a Cocktail Party Environment Can Be Decoded from Single-Trial EEG. , 2015, Cerebral cortex.
[22] Jiangyan Yi,et al. Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features , 2019, INTERSPEECH.
[23] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[25] Jesper Jensen,et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] DeLiang Wang,et al. Combining Spectral and Spatial Features for Deep Learning Based Blind Speaker Separation , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.