Multi-Microphone Neural Speech Separation for Far-Field Multi-Talker Speech Recognition
Hakan Erdogan | Takuya Yoshioka | Zhuo Chen | Fil Alleva
[1] Chengzhu Yu et al., The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices, 2015, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[2] Zhuo Chen et al., Deep clustering: Discriminative embeddings for segmentation and separation, 2016, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Nima Mesgarani et al., Deep attractor network for single-microphone speaker separation, 2017, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Dong Yu et al., 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs, 2014, INTERSPEECH.
[5] Tomohiro Nakatani et al., Deep Clustering-Based Beamforming for Separation with Unknown Number of Sources, 2017, INTERSPEECH.
[6] Yifan Gong et al., Large-Scale Domain Adaptation via Teacher-Student Learning, 2017, INTERSPEECH.
[7] Takuya Yoshioka et al., Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[8] Francesco Nesta et al., Convolutive BSS of Short Mixtures by ICA Recursively Regularized Across Frequencies, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Dong Yu et al., Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Reinhold Häb-Umbach et al., Tight Integration of Spatial and Spectral Features for BSS with Deep Clustering Embeddings, 2017, INTERSPEECH.
[11] Hiroshi Sawada et al., Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[12] DeLiang Wang et al., Binaural Reverberant Speech Separation Based on Deep Neural Networks, 2017, INTERSPEECH.
[13] Reinhold Häb-Umbach et al., Blind speech separation employing directional statistics in an Expectation Maximization framework, 2010, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Reinhold Häb-Umbach et al., Source counting in speech mixtures using a variational EM approach for complex Watson mixture models, 2014, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Jonathan Le Roux et al., Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks, 2016, INTERSPEECH.
[16] Hiroshi Sawada et al., A Multichannel MMSE-Based Framework for Speech Source Separation and Noise Reduction, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Jun Du et al., A Maximum Likelihood Approach to Deep Neural Network Based Nonlinear Spectral Mapping for Single-Channel Speech Separation, 2017, INTERSPEECH.
[18] Hiroshi Sawada et al., DOA Estimation for Multiple Sparse Sources with Normalized Observation Vector Clustering, 2006, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Jacob Benesty et al., On Optimal Frequency-Domain Multichannel Linear Filtering for Noise Reduction, 2010, IEEE Transactions on Audio, Speech, and Language Processing.