Speaker identification and clustering using convolutional neural networks
Oliver Durr | Thilo Stadelmann | Carlo Vogt | Yanick Lukic
[1] Honglak Lee, et al. Unsupervised feature learning for audio classification using convolutional deep belief networks, 2009, NIPS.
[2] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Andreas Stolcke, et al. Artificial neural network features for speaker diarization, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[4] Benjamin Schrauwen, et al. End-to-end learning for music audio, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Douglas A. Reynolds, et al. Speaker identification and verification using Gaussian mixture speaker models, 1995, Speech Commun.
[6] Bernd Freisleben, et al. Unfolding speaker clustering potential: a biomimetic approach, 2009, ACM Multimedia.
[7] Biing-Hwang Juang, et al. The use of cohort normalized scores for speaker verification, 1992, ICSLP.
[8] Ahmad Salman, et al. Learning Speaker-Specific Characteristics With a Deep Neural Architecture, 2011, IEEE Transactions on Neural Networks.
[9] M. Al-Akaidi. Fractal Speech Processing, 2004.
[10] Anastasios Tefas, et al. Multimodal speaker clustering in full length movies, 2015, Multimedia Tools and Applications.
[11] Jeroen Breebaart, et al. Features for audio and music classification, 2003, ISMIR.
[12] Constantine Kotropoulos, et al. Speaker segmentation and clustering, 2008, Signal Process.
[13] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[14] Dong Yu, et al. Exploring convolutional neural network structures and optimization techniques for speech recognition, 2013, INTERSPEECH.
[15] Yoshua Bengio, et al. Convolutional networks for images, speech, and time series, 1998.
[16] Figen Ertaş, et al. Fundamentals of Speaker Recognition, 2011.
[17] Yu Tsao, et al. Clustering-based i-vector formulation for speaker recognition, 2014, INTERSPEECH.
[18] Colin Raffel, et al. Lasagne: First release, 2015.
[19] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[20] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[21] Gerald Friedland, et al. The ICSI RT-09 Speaker Diarization System, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Jitendra Ajmera, et al. A robust speaker clustering algorithm, 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721).
[23] Andrew C. Morris, et al. GMM based clustering and speaker separability in the TIMIT speech database, 2005.
[24] Colin Raffel, et al. librosa: Audio and Music Signal Analysis in Python, 2015, SciPy.
[25] Yun Lei, et al. Application of convolutional neural networks to speaker recognition in noisy conditions, 2014, INTERSPEECH.
[26] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[27] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[28] Douglas A. Reynolds, et al. Speaker Verification Using Adapted Gaussian Mixture Models, 2000, Digit. Signal Process.
[29] Mickael Rouvier, et al. Speaker diarization through speaker embeddings, 2015, 2015 23rd European Signal Processing Conference (EUSIPCO).
[30] Simon King, et al. Where are the challenges in speaker diarization?, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.