暂无分享,去创建一个
[1] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[2] Jon Barker,et al. The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines , 2018, INTERSPEECH.
[3] Omkar M. Parkhi,et al. VGGFace2: A Dataset for Recognising Faces across Pose and Age , 2017, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018).
[4] Pietro Laface,et al. Probabilistic linear discriminant analysis of i-vector posterior distributions , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[5] Joon Son Chung,et al. Perfect Match: Improved Cross-modal Embeddings for Audio-visual Synchronisation , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Yun Lei,et al. A novel scheme for speaker recognition using a phonetically-aware deep neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Theodoros Giannakopoulos,et al. Audio-visual speaker diarization using fisher linear semi-discriminant analysis , 2014, Multimedia Tools and Applications.
[9] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[10] Gerald Friedland,et al. The ICSI RT-09 Speaker Diarization System , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Lukás Burget,et al. Full-covariance UBM and heavy-tailed PLDA in i-vector speaker verification , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Joon Son Chung,et al. VoxCeleb2: Deep Speaker Recognition , 2018, INTERSPEECH.
[13] Sanjeev Khudanpur,et al. Deep Neural Network Embeddings for Text-Independent Speaker Verification , 2017, INTERSPEECH.
[14] Gerald Friedland,et al. Towards Audio-Visual On-line Diarization Of Participants In Group Meetings , 2008, ECCV 2008.
[15] Dan Istrate,et al. NIST RT'05S Evaluation: Pre-processing Techniques and Speaker Diarization on Multiple Microphone Meetings , 2005, MLMI.
[16] Giorgio Biagetti,et al. Robust Speaker Identification in a Meeting with Short Audio Segments , 2016 .
[17] Reinhold Häb-Umbach,et al. Fusing audio and video information for online speaker diarization , 2009, INTERSPEECH.
[18] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[19] Shinji Watanabe,et al. Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge , 2018, INTERSPEECH.
[20] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Chuohao Yeo,et al. Multi-modal speaker diarization of real-world meetings using compressed-domain video features , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] Jon Barker,et al. The second ‘chime’ speech separation and recognition challenge: Datasets, tasks and baselines , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[23] Joseph H. DiBiase. A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays , 2000 .
[24] Panayiotis G. Georgiou,et al. Multimodal Speaker Segmentation and Identification in Presence of Overlapped Speech Segments , 2010, J. Multim..
[25] Sanjeev Khudanpur,et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Xavier Anguera Miró,et al. Acoustic Beamforming for Speaker Diarization of Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[27] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[28] Richard C. Rose,et al. Deep bottleneck features for i-vector based text-independent speaker verification , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[29] Andrea Vedaldi,et al. MatConvNet: Convolutional Neural Networks for MATLAB , 2014, ACM Multimedia.
[30] Jun Du,et al. Speaker Diarization with Enhancing Speech for the First DIHARD Challenge , 2018, INTERSPEECH.
[31] Chuohao Yeo,et al. Dialocalization: Acoustic speaker diarization and visual localization as joint optimization problem , 2010, TOMCCAP.
[32] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Erik McDermott,et al. Deep neural networks for small footprint text-dependent speaker verification , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Daniel C. Burnett,et al. WebRTC: APIs and RTCWEB Protocols of the HTML5 Real-Time Web , 2012 .
[35] Nicolás Ruiz-Reyes,et al. Multimodal speaker diarization for meetings using volume-evaluated SRP-PHAT and video analysis , 2018, Multimedia Tools and Applications.
[36] Jean Carletta,et al. The AMI Meeting Corpus: A Pre-announcement , 2005, MLMI.