Rajeev Rikhye | Quan Wang | Qiao Liang | Yanzhang He | Ian McGraw
[1] Tomohiro Nakatani, et al. Learning speaker representation for neural network based multichannel speaker extraction, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[2] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[3] Ian McGraw, et al. Personalized Keyphrase Detection using Speaker and Environment Information, 2021, INTERSPEECH.
[4] Jun Du, et al. Online Speaker Adaptation for LVCSR Based on Attention Mechanism, 2018, 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[5] Hitoshi Yamamoto, et al. Attention Mechanism in Speaker Recognition: What Does it Learn in Deep Speaker Embedding?, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[6] Tara N. Sainath, et al. Automatic gain control and multi-style training for robust small-footprint keyword spotting with deep neural networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Ngoc Thang Vu, et al. End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning, 2019, INTERSPEECH.
[8] Naoyuki Kanda, et al. Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers, 2020, INTERSPEECH.
[9] Tara N. Sainath, et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home, 2017, INTERSPEECH.
[10] Yanbing Liu, et al. Deep CNNs With Self-Attention for Speaker Identification, 2019, IEEE Access.
[11] Chunlei Zhang, et al. Improving RNN Transducer with Target Speaker Extraction and Neural Uncertainty Estimation, 2020, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Shaojin Ding, et al. Personal VAD: Speaker-Conditioned Voice Activity Detection, 2019, Odyssey.
[13] Jonathan G. Fiscus, et al. DARPA TIMIT: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1, 1993.
[14] Quan Wang, et al. Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition, 2021, INTERSPEECH.
[15] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[16] Wei Rao, et al. Target Speaker Verification With Selective Auditory Attention for Single and Multi-Talker Speech, 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] Jun Wang, et al. Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures, 2018, INTERSPEECH.
[18] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Rohit Prabhavalkar, et al. On the Efficient Representation and Execution of Deep Acoustic Models, 2016, INTERSPEECH.
[20] Tomohiro Nakatani, et al. Speaker-Aware Neural Network Based Beamformer for Speaker Extraction in Speech Mixtures, 2017, INTERSPEECH.
[21] Alex Graves, et al. Sequence Transduction with Recurrent Neural Networks, 2012, ArXiv.
[22] Dong Wang, et al. CN-Celeb: A Challenging Chinese Speaker Recognition Dataset, 2019, 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Quan Wang, et al. Attention-Based Models for Text-Dependent Speaker Verification, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Turaj Zakizadeh Shabestary, et al. Hotword Cleaner: Dual-microphone Adaptive Noise Cancellation with Deferred Filter Coefficients for Robust Keyword Spotting, 2019, 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Yifan Gong, et al. Speaker Separation Using Speaker Inventories and Estimated Speech, 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Wei Li, et al. VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition, 2020, INTERSPEECH.
[27] Haizhou Li, et al. SpEx: Multi-Scale Time Domain Speaker Extraction Network, 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[29] Quan Wang, et al. Generalized End-to-End Loss for Speaker Verification, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] E. A. Martin, et al. Multi-style training for robust isolated-word speech recognition, 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[31] Tara N. Sainath, et al. Recognizing Long-Form Speech Using Streaming End-to-End Models, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[32] Quan Wang, et al. Version Control of Speaker Recognition Systems, 2020, ArXiv.
[33] B. Widrow, et al. Adaptive noise cancelling: Principles and applications, 1975.
[34] Tomohiro Nakatani, et al. Single Channel Target Speaker Extraction and Recognition with Speaker Beam, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[36] Sanjeev Khudanpur, et al. A study on data augmentation of reverberant speech for robust speech recognition, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] John R. Hershey, et al. VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking, 2018, INTERSPEECH.
[38] Liang Qiao, et al. Optimizing Speech Recognition For The Edge, 2019, ArXiv.
[39] Junichi Yamagishi, et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit, 2017.
[40] Jia Pan, et al. Speaker Adaptive Training for Speech Recognition Based on Attention-Over-Attention Mechanism, 2020, INTERSPEECH.
[41] Jesper Jensen, et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).