Joint Learning of Speaker and Phonetic Similarities with Siamese Networks
[1] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Nic J. de Vries, et al. A smartphone-based ASR data collection tool for under-resourced languages, 2014.
[3] Aren Jansen, et al. Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline, 2013, INTERSPEECH.
[4] Themos Stafylakis, et al. Deep Neural Networks for extracting Baum-Welch statistics for Speaker Recognition, 2014, Odyssey.
[5] Maarten Versteegh, et al. A deep scattering spectrum — Deep Siamese network pipeline for unsupervised acoustic modeling, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Hynek Hermansky, et al. Evaluating speech features with the minimal-pair ABX task (II): resistance to noise, 2014, INTERSPEECH.
[7] Emmanuel Dupoux, et al. Phonetics embedding learning with side information, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[8] Tianqi Chen, et al. Empirical Evaluation of Rectified Activations in Convolutional Network, 2015, ArXiv.
[9] Emmanuel Dupoux, et al. Weakly Supervised Multi-Embeddings Learning of Acoustic Models, 2015, ICLR.
[10] Samy Bengio, et al. Large Scale Online Learning of Image Similarity through Ranking, 2009, IbPRIA.
[11] Lorenzo Rosasco, et al. On Invariance and Selectivity in Representation Learning, 2015, ArXiv.
[12] S. Chiba, et al. Dynamic programming algorithm optimization for spoken word recognition, 1978.
[13] Alta de Waal, et al. A smartphone-based ASR data collection tool for under-resourced languages, 2014, Speech Commun.
[14] Stéphane Mallat, et al. Invariant Scattering Convolution Networks, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Yann LeCun, et al. Signature Verification Using A "Siamese" Time Delay Neural Network, 1993, Int. J. Pattern Recognit. Artif. Intell.
[16] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[17] Aren Jansen, et al. The Zero Resource Speech Challenge 2015: Proposed Approaches and Results, 2016, SLTU.
[18] Kilian Q. Weinberger, et al. Stochastic triplet embedding, 2012, 2012 IEEE International Workshop on Machine Learning for Signal Processing.
[19] Yun Lei, et al. Application of Convolutional Neural Networks to Language Identification in Noisy Conditions, 2014, Odyssey.
[20] Karen Livescu, et al. Deep convolutional acoustic word embeddings using word-pair side information, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).