Continuous Speech Separation: Dataset and Analysis
Zhuo Chen | Zhong Meng | Takuya Yoshioka | Jinyu Li | Liang Lu | Jian Wu | Tianyan Zhou | Yi Luo