Multitask Feature Learning for Low-Resource Query-by-Example Spoken Term Detection
Haizhou Li | Hongjie Chen | Cheung-Chi Leung | Bin Ma | Lei Xie
[1] Rich Caruana,et al. Multitask Learning , 1997, Machine-mediated learning.
[2] Bin Ma,et al. Using parallel tokenizers with DTW matrix combination for low-resource spoken term detection , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[3] Aren Jansen,et al. A comparison of neural network methods for unsupervised representation learning on the zero resource speech challenge , 2015, INTERSPEECH.
[4] Simon King,et al. Deep neural networks employing Multi-Task Learning and stacked bottleneck features for speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Aren Jansen,et al. Unsupervised neural network based feature extraction using weak top-down constraints , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Carla Teixeira Lopes,et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .
[7] Cheung-Chi Leung,et al. Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Lukás Burget,et al. BUT QUESST 2014 system description , 2014, MediaEval.
[9] Bin Ma,et al. An acoustic segment modeling approach to query-by-example spoken term detection , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Florian Metze,et al. Query by Example Search on Speech at Mediaeval 2015 , 2014, MediaEval.
[11] Lin-Shan Lee,et al. Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder , 2016, INTERSPEECH.
[12] Hynek Hermansky,et al. Evaluating speech features with the minimal-pair ABX task (II): resistance to noise , 2014, INTERSPEECH.
[13] Karen Livescu,et al. Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings , 2017, INTERSPEECH.
[14] Yifan Gong,et al. Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[15] Lukás Burget,et al. An empirical evaluation of zero resource acoustic unit discovery , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Frédéric Bimbot,et al. Audio keyword extraction by unsupervised word discovery , 2009, INTERSPEECH.
[17] Bin Ma,et al. Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis , 2016, INTERSPEECH.
[18] James R. Glass,et al. Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[19] Karel Veselý,et al. BUT2012 Approaches for Spoken Web Search - MediaEval 2012 , 2012, MediaEval.
[20] Chng Eng Siong,et al. The NNI Query-by-Example System for MediaEval 2015 , 2014, MediaEval.
[21] Florian Metze,et al. Spoken Web Search , 2011, MediaEval.
[22] Ji Wu,et al. Rapid adaptation for deep neural networks through multi-task learning , 2015, INTERSPEECH.
[23] John W. Fisher,et al. Supplemental Material for Parallel Sampling of DP Mixture Models using Sub-Clusters Splits , 2013 .
[24] Phil D. Green,et al. Multitask learning in connectionist robust ASR using recurrent neural networks , 2003, INTERSPEECH.
[25] Mireia Díez,et al. High-performance Query-by-Example Spoken Term Detection on the SWS 2013 evaluation , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Joseph Picone,et al. A Doubly Hierarchical Dirichlet Process Hidden Markov Model with a Non-Ergodic Structure , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[27] Hung-An Chang,et al. Resource configurable spoken query detection using Deep Boltzmann Machines , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Shai Ben-David,et al. Exploiting Task Relatedness for Multiple Task Learning , 2003, COLT.
[29] Satoshi Nakamura,et al. Iterative training of a DPGMM-HMM acoustic unit recognizer in a zero resource scenario , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[30] James R. Glass,et al. Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input , 2007, ACL.
[31] Aren Jansen,et al. Weak top-down constraints for unsupervised acoustic model training , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Aren Jansen,et al. Segmental acoustic indexing for zero resource keyword search , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Florian Metze,et al. The Spoken Web Search Task , 2012, MediaEval.
[34] Giorgio Metta,et al. An auto-encoder based approach to unsupervised learning of subword units , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] John R. Hershey,et al. Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks , 2015, INTERSPEECH.
[36] James R. Glass,et al. Spoken Content Retrieval—Beyond Cascading Speech Recognition with Text Retrieval , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[37] Lukás Burget,et al. Coping with channel mismatch in Query-by-Example - BUT QUESST 2014 , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] Bin Ma,et al. Language independent query-by-example spoken term detection using N-best phone sequences and partial matching , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] James R. Glass,et al. Unsupervised Pattern Discovery in Speech , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[40] Peter Bell,et al. Regularization of context-dependent deep neural networks with context-independent multi-task training , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[41] Carl E. Rasmussen,et al. The Infinite Gaussian Mixture Model , 1999, NIPS.
[42] Ewan Dunbar,et al. A hybrid dynamic time warping-deep neural network architecture for unsupervised acoustic modeling , 2015, INTERSPEECH.
[43] Aren Jansen,et al. NLP on Spoken Documents Without ASR , 2010, EMNLP.
[44] Bin Ma,et al. Acoustic TextTiling for story segmentation of spoken documents , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[45] Simon King,et al. Unsupervised lexical clustering of speech segments using fixed-dimensional acoustic embeddings , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[46] Peter Bell,et al. Complementary tasks for context-dependent deep neural network acoustic models , 2015, INTERSPEECH.
[47] Jasha Droppo,et al. Multi-task learning in deep neural networks for improved phoneme recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[48] Joseph Picone,et al. A Nonparametric Bayesian Approach for Spoken Term Detection by Example Query , 2016, INTERSPEECH.
[49] James R. Glass,et al. A Nonparametric Bayesian Approach to Acoustic Model Discovery , 2012, ACL.
[50] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[51] Bin Ma,et al. Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection , 2016, INTERSPEECH.
[52] Bin Ma,et al. Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information , 2016, INTERSPEECH.
[53] Timothy J. Hazen,et al. Query-by-example spoken term detection using phonetic posteriorgram templates , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[54] Bin Ma,et al. A Vector Space Modeling Approach to Spoken Language Identification , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[55] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT) , 1993, NIST.
[56] Jonathan Baxter,et al. A Model of Inductive Bias Learning , 2000, J. Artif. Intell. Res..
[57] Bin Ma,et al. Parallel inference of dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study , 2015, INTERSPEECH.
[58] Aren Jansen,et al. Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline , 2013, INTERSPEECH.
[59] Ryan P. Adams,et al. Composing graphical models with neural networks for structured representations and fast inference , 2016, NIPS.
[60] Tasha Nagamine,et al. On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models , 2016, INTERSPEECH.
[61] Satoshi Nakamura,et al. Supervised Learning of Acoustic Models in a Zero Resource Setting to Improve DPGMM Clustering , 2016, INTERSPEECH.
[62] Dong Wang,et al. Collaborative Joint Training With Multitask Recurrent Model for Speech and Speaker Recognition , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[63] Lukás Burget,et al. Variational Inference for Acoustic Unit Discovery , 2016, Workshop on Spoken Language Technologies for Under-resourced Languages.
[64] Igor Szöke,et al. BUT QUESST 2015 System Description , 2015, MediaEval.
[65] Lin-Shan Lee,et al. An iterative deep learning framework for unsupervised discovery of speech features and linguistic units with applications on spoken term detection , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[66] Kai Yu,et al. Multi-task learning for text-dependent speaker verification , 2015, INTERSPEECH.