A hybrid dynamic time warping-deep neural network architecture for unsupervised acoustic modeling
Ewan Dunbar | Maarten Versteegh | Emmanuel Dupoux | Gabriel Synnaeve | Roland Thiollière
[1] Carla Teixeira Lopes, et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 2012.
[2] Jason Weston, et al. WSABIE: Scaling Up to Large Vocabulary Image Annotation, 2011, IJCAI.
[3] Jonathan G. Fiscus, et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT), NIST, 1993.
[4] Aren Jansen, et al. Unsupervised neural network based feature extraction using weak top-down constraints, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Abdellah Fourtassi, et al. Exploring the Relative Role of Bottom-up and Top-down Information in Phoneme Learning, 2014, ACL.
[6] Sharon Goldwater, et al. A role for the developing lexicon in phonetic category acquisition, 2013, Psychological Review.
[7] P. Jusczyk. The discovery of spoken language, 1997.
[8] Yann LeCun, et al. Dimensionality Reduction by Learning an Invariant Mapping, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[9] Etienne Barnard, et al. The NCHLT speech corpus of the South African languages, 2014, SLTU.
[10] Andrew W. Senior, et al. Improving DNN speaker independence with I-vector inputs, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Emmanuel Dupoux, et al. Phonetics embedding learning with side information, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[12] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[13] Yann LeCun, et al. Signature Verification Using A "Siamese" Time Delay Neural Network, 1993, Int. J. Pattern Recognit. Artif. Intell.
[14] Tetsuji Ogawa, et al. A new efficient measure for accuracy prediction and its application to multistream-based unsupervised adaptation, 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).
[15] Aren Jansen, et al. Efficient spoken term discovery using randomized algorithms, 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[16] George Saon, et al. Speaker adaptation of neural network acoustic models using i-vectors, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[17] Sharon Peperkamp, et al. Learning Phonemes With a Proto-Lexicon, 2013, Cogn. Sci.
[18] P. Kuhl. A new view of language acquisition, 2000, Proceedings of the National Academy of Sciences of the United States of America.
[19] Razvan Pascanu, et al. Theano: new features and speed improvements, 2012, ArXiv.
[20] Emmanuel Dupoux, et al. Weakly Supervised Multi-Embeddings Learning of Acoustic Models, 2015, ICLR.
[21] Hynek Hermansky, et al. Mean temporal distance: Predicting ASR error from temporal properties of speech signal, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] syhw. abnet: interspeech 2015 status, 2015.
[23] Aren Jansen, et al. The zero resource speech challenge 2017, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[24] Anil K. Jain, et al. On-line signature verification, 2002, Pattern Recognit.
[25] James R. Glass, et al. Unsupervised Pattern Discovery in Speech, 2008, IEEE Transactions on Audio, Speech, and Language Processing.