暂无分享,去创建一个
Naoyuki Kanda | Zhuo Chen | Takuya Yoshioka | Yifan Gong | Shujie Liu | Yu Wu | Gang Liu | Jinyu Li | Yong Zhao | Jian Wu | Tianyan Zhou | Sanyuan Chen | Xiong Xiao | Jinyu Li | Y. Gong | Shujie Liu | Naoyuki Kanda | Zhuo Chen | Gang Liu | Jian Wu | Xiong Xiao | Yu Wu | Yong Zhao | Sanyuan Chen | Tianyan Zhou | Takuya Yoshioka
[1] Naoyuki Kanda,et al. Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR , 2019, INTERSPEECH.
[2] Yong Zhao,et al. ResNeXt and Res2Net Structure for Speaker Verification , 2020, ArXiv.
[3] Jon Barker,et al. CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings , 2020, 6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020).
[4] Jonathan G. Fiscus,et al. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER) , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[5] Erik McDermott,et al. Deep neural networks for small footprint text-dependent speaker verification , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Elizabeth Shriberg,et al. Analysis of overlaps in meetings by dialog factors, hot spots, speakers, and collection site: insights for automatic speech recognition , 2006, INTERSPEECH.
[7] Yuri Y. Khokhlov,et al. The STC System for the CHiME-6 Challenge , 2020, 6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020).
[8] Jont B. Allen,et al. Image method for efficiently simulating small‐room acoustics , 1976 .
[9] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Kenneth Ward Church,et al. The Second DIHARD Diarization Challenge: Dataset, task, and baselines , 2019, INTERSPEECH.
[11] Ming Zhou,et al. Continuous Speech Separation with Conformer , 2020, ArXiv.
[12] Sanjeev Khudanpur,et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Jian Cheng,et al. Additive Margin Softmax for Face Verification , 2018, IEEE Signal Processing Letters.
[14] James R. Glass,et al. Unsupervised Methods for Speaker Diarization: An Integrated and Iterative Approach , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Alan McCree,et al. Speaker diarization using deep neural network embeddings , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Quan Wang,et al. Speaker Diarization with LSTM , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Hervé Bredin,et al. TristouNet: Triplet loss for speaker turn embedding , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Joon Son Chung,et al. Spot the conversation: speaker diarisation in the wild , 2020, INTERSPEECH.
[20] Zhuo Chen,et al. Continuous Speech Separation: Dataset and Analysis , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Jon Barker,et al. The second ‘chime’ speech separation and recognition challenge: Datasets, tasks and baselines , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] Takuya Yoshioka,et al. Advances in Online Audio-Visual Meeting Transcription , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[23] Jon Barker,et al. The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines , 2018, INTERSPEECH.
[24] Douglas A. Reynolds,et al. An overview of automatic speaker diarization systems , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[25] Sylvain Meignier,et al. LIUM SPKDIARIZATION: AN OPEN SOURCE TOOLKIT FOR DIARIZATION , 2010 .
[26] Naoyuki Kanda,et al. End-to-End Neural Speaker Diarization with Permutation-Free Objectives , 2019, INTERSPEECH.
[27] Yifan Gong,et al. CNN with Phonetic Attention for Text-Independent Speaker Verification , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[28] Kai Zhao,et al. Res2Net: A New Multi-Scale Backbone Architecture , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Frank Rudzicz,et al. Speaker Diarization with Session-Level Speaker Embedding Refinement Using Graph Neural Networks , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[31] Andreas Stolcke,et al. Dover: A Method for Combining Diarization Outputs , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[32] Naoyuki Kanda,et al. End-to-End Neural Speaker Diarization with Self-Attention , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[33] David A. van Leeuwen,et al. Fusion of Heterogeneous Speaker Recognition Systems in the STBU Submission for the NIST Speaker Recognition Evaluation 2006 , 2007, IEEE Transactions on Audio, Speech, and Language Processing.