LaFurca: Iterative Multi-Stage Refined End-to-End Monaural Speech Separation Based on Context-Aware Dual-Path Deep Parallel Inter-Intra Bi-LSTM
暂无分享,去创建一个
[1] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Nima Mesgarani,et al. Deep attractor network for single-microphone speaker separation , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[4] Nima Mesgarani,et al. TaSNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Liu Liu,et al. Is CQT more suitable for monaural speech separation than STFT? an empirical study , 2019, ArXiv.
[6] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Yossi Adi,et al. Voice Separation with an Unknown Number of Multiple Speakers , 2020, ICML.
[8] Takuya Yoshioka,et al. Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Gang Wang,et al. Global Context-Aware Attention LSTM Networks for 3D Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Huibin Lin,et al. Furcax: End-to-end Monaural Speech Separation Based on Deep Gated (De)convolutional Neural Networks with Adversarial Example Training , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[12] Guy J. Brown,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2006 .
[13] Kaiming He,et al. Group Normalization , 2018, ECCV.
[14] Liu Liu,et al. FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks , 2019, MMM.
[15] Neil Zeghidour,et al. Wavesplit: End-to-End Speech Separation by Speaker Clustering , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] DeLiang Wang,et al. Divide and Conquer: A Deep CASA Approach to Talker-Independent Monaural Speaker Separation , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] Huibin Lin,et al. Deep Attention Gated Dilated Temporal Convolutional Networks with Intra-Parallel Convolutional Modules for End-to-End Monaural Speech Separation , 2019, INTERSPEECH.
[18] Paris Smaragdis,et al. Convolutive Speech Bases and Their Application to Supervised Speech Separation , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Jonathan Le Roux,et al. SDR – Half-baked or Well Done? , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Jonathan Le Roux,et al. Single-Channel Multi-Speaker Separation Using Deep Clustering , 2016, INTERSPEECH.
[21] Huibin Lin,et al. End-to-End Monaural Speech Separation with Multi-Scale Dynamic Weighted Gated Dilated Convolutional Pyramid Network , 2019, INTERSPEECH.
[22] Nima Mesgarani,et al. Speaker-Independent Speech Separation With Deep Attractor Network , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[23] DeLiang Wang,et al. Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Tuomas Virtanen,et al. Speech recognition using factorial hidden Markov models for separation in the feature space , 2006, INTERSPEECH.
[25] Jesper Jensen,et al. An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] DeLiang Wang,et al. An Unsupervised Approach to Cochannel Speech Separation , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[27] Tong Zhang,et al. Spatial–Temporal Recurrent Neural Network for Emotion Recognition , 2017, IEEE Transactions on Cybernetics.
[28] Jonathan Le Roux,et al. Universal Sound Separation , 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[29] DeLiang Wang,et al. Model-based sequential organization in cochannel speech , 2006, IEEE Transactions on Audio, Speech, and Language Processing.