Demystifying TasNet: A Dissecting Approach