Generating Complementary Acoustic Model Spaces in DNN-Based Sequence-to-Frame DTW Scheme for Out-of-Vocabulary Spoken Term Detection
[1] Lukás Burget. Measurement of Complementarity of Recognition Systems, 2004, TSD.
[2] C. Breslin, et al. Generating Complementary System, 2006.
[3] Mark J. F. Gales, et al. Product of Gaussians for speech recognition, 2006, Comput. Speech Lang.
[4] G. Zweig, et al. The IBM 2006 Speech Transcription System, 2006.
[5] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.
[6] Kenney Ng, et al. Subword-based approaches for spoken document retrieval, 2000, Speech Commun.
[7] Shi-wook Lee, et al. Combination of diverse subword units in spoken term detection, 2015, INTERSPEECH.
[8] Karen Spärck Jones, et al. Effects of out of vocabulary words in spoken document retrieval (poster session), 2000, SIGIR '00.
[9] Brian Kingsbury, et al. Exploiting diversity for spoken term detection, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[10] Jonathan G. Fiscus, et al. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER), 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[11] Timothy J. Hazen, et al. Query-by-example spoken term detection using phonetic posteriorgram templates, 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[12] Dong Yu, et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks, 2012, ICML.
[13] Haihua Xu, et al. Minimum Bayes Risk decoding and system combination based on a recursion for edit distance, 2011, Comput. Speech Lang.
[14] Yoav Freund, et al. Experiments with a New Boosting Algorithm, 1996, ICML.
[15] Olivier Siohan, et al. Multiple classifiers by constrained minimization, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[16] Shi-wook Lee, et al. Effective combination of heterogeneous subword-based spoken term detection systems, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[17] Jing Huang, et al. Detection, diarization, and transcription of far-field lecture speech, 2007, INTERSPEECH.
[18] Bin Ma, et al. Score fusion and calibration in multiple language detectors with large performance variation, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Gunnar Evermann, et al. Posterior probability decoding, confidence estimation and system combination, 2000.
[20] Dong Yu, et al. Automatic Speech Recognition: A Deep Learning Approach, 2014.
[21] Tatsuya Kawahara, et al. Overview of the NTCIR-10 SpokenDoc-2 Task, 2013, NTCIR.
[22] Carmen García-Mateo, et al. Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion, 2015, EURASIP J. Audio Speech Music. Process.
[23] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[24] Shi-wook Lee, et al. Combining multiple subword representations for open-vocabulary spoken document retrieval, 2005, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05).
[25] James R. Glass, et al. Spoken Content Retrieval: Beyond Cascading Speech Recognition with Text Retrieval, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Mark J. F. Gales, et al. Directed decision trees for generating complementary systems, 2009, Speech Commun.
[27] S. J. Young, et al. Tree-based state tying for high accuracy acoustic modelling, 1994.
[28] Richard Sproat, et al. Lattice-Based Search for Spoken Utterance Retrieval, 2004, NAACL.
[29] Steve Renals, et al. Revisiting hybrid and GMM-HMM system combination techniques, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[30] Yu Zhang, et al. Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages, 2014, INTERSPEECH.
[31] Lars Kai Hansen, et al. Neural Network Ensembles, 1990, IEEE Trans. Pattern Anal. Mach. Intell.
[32] Dong Yu, et al. Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition, 2010.
[33] James R. Glass, et al. Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams, 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[34] K. Maekawa. Corpus of Spontaneous Japanese: Its Design and Evaluation, 2003.