[1] Zhuo Chen, et al. Deep clustering: Discriminative embeddings for segmentation and separation, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Nima Mesgarani, et al. Deep attractor network for single-microphone speaker separation, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Liu Liu, et al. FurcaNet: An end-to-end deep gated convolutional, long short-term memory, deep neural networks for single channel speech separation, 2019, ArXiv.
[4] Rémi Gribonval, et al. BSS_EVAL Toolbox User Guide -- Revision 2.0, 2005.
[5] Takuya Yoshioka, et al. Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Peter F. Assmann, et al. The Perception of Speech Under Adverse Conditions, 2004.
[7] Paris Smaragdis, et al. Convolutive Speech Bases and Their Application to Supervised Speech Separation, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[8] DeLiang Wang, et al. Model-based sequential organization in cochannel speech, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Robert E. Yantorno, et al. Performance of the modified Bark spectral distortion as an objective speech quality measure, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[10] Kaiming He, et al. Group Normalization, 2018, ECCV.
[11] Zhong-Qiu Wang, et al. Alternative Objective Functions for Deep Clustering, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[13] Tuomas Virtanen, et al. Speech recognition using factorial hidden Markov models for separation in the feature space, 2006, INTERSPEECH.
[14] Nima Mesgarani, et al. TasNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Jonathan Le Roux, et al. Single-Channel Multi-Speaker Separation Using Deep Clustering, 2016, INTERSPEECH.
[16] Gang Wang, et al. Global Context-Aware Attention LSTM Networks for 3D Action Recognition, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Huibin Lin, et al. Furcax: End-to-end Monaural Speech Separation Based on Deep Gated (De)convolutional Neural Networks with Adversarial Example Training, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Jonathan Le Roux, et al. Universal Sound Separation, 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[19] Jesper Jensen, et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[20] Liu Liu, et al. Is CQT more suitable for monaural speech separation than STFT? An empirical study, 2019, ArXiv.
[21] Jonah Casebeer, et al. Adaptive Front-ends for End-to-end Source Separation, 2017.
[22] Nima Mesgarani, et al. Speaker-Independent Speech Separation With Deep Attractor Network, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[23] D. Wang, et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, 2008, IEEE Trans. Neural Networks.
[24] Haizhou Li, et al. Single Channel Speech Separation with Constrained Utterance Level Permutation Invariant Training Using Grid LSTM, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Liu Liu, et al. FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks, 2019, MMM.
[26] Jonathan Le Roux, et al. SDR – Half-baked or Well Done?, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Dong Yu, et al. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Nima Mesgarani, et al. TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation, 2018.
[29] Rémi Gribonval, et al. Performance measurement in blind audio source separation, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Le Roux, et al. Sparse NMF – half-baked or well done?, 2015.
[31] Jesper Jensen, et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] DeLiang Wang, et al. An Unsupervised Approach to Cochannel Speech Separation, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[33] Tong Zhang, et al. Spatial–Temporal Recurrent Neural Network for Emotion Recognition, 2017, IEEE Transactions on Cybernetics.
[34] Peng Gao, et al. CBLDNN-Based Speaker-Independent Speech Separation Via Generative Adversarial Training, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Zhuo Chen, et al. Speaker-Independent Speech Separation With Deep Attractor Network, 2018.