Paper information - A keyword-aware grammar framework for LVCSR-based spoken keyword search

A keyword-aware grammar framework for LVCSR-based spoken keyword search

In this paper, we propose a method to realize the recently developed keyword-aware grammar for LVCSR-based keyword search using weighted finite-state automata (WFSA). The approach creates a compact and deterministic grammar WFSA by inserting keyword paths into an existing n-gram WFSA. Tested on the evalpart1 data of the IARPA Babel OpenKWS13 Vietnamese and OpenKWS14 Tamil limited language pack tasks, the proposed keyword-aware framework achieves a significant improvement, with about 50% relative gain in actual term weighted value (ATWV) for both languages. Comparisons between the keyword-aware grammar and our previously proposed n-gram LM based approximation of the grammar also show that the KWS performance of the two realizations is complementary.
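The abstract reports results in ATWV without giving its formula. As a rough illustration only, the sketch below computes a term weighted value following the NIST 2006 Spoken Term Detection definition as commonly stated (miss probability plus a weighted false-alarm probability, averaged over keywords, with weight 999.9). The names `KeywordCounts`, `term_weighted_value`, and `relative_gain` are hypothetical and not taken from the paper. "Actual" TWV is assumed here to mean the value obtained at the system's own decision threshold, so the counts passed in are taken to reflect those decisions.

```python
from dataclasses import dataclass
from typing import Iterable

# Weight on false alarms; 999.9 is the value assumed from the NIST STD 2006 setup.
BETA = 999.9


@dataclass
class KeywordCounts:
    """Per-keyword detection counts (hypothetical container for this sketch)."""
    n_true: int         # reference occurrences of the keyword
    n_correct: int      # detections that match a reference occurrence
    n_false_alarm: int  # detections with no matching reference occurrence


def term_weighted_value(counts: Iterable[KeywordCounts],
                        speech_seconds: float,
                        beta: float = BETA) -> float:
    """Return 1 minus the average of P_miss + beta * P_FA over scored keywords."""
    total_cost = 0.0
    n_scored = 0
    for c in counts:
        if c.n_true == 0:
            # Keywords with no reference occurrence have no defined P_miss
            # and are skipped in this sketch.
            continue
        p_miss = 1.0 - c.n_correct / c.n_true
        # Non-target trials are approximated as one per second of speech.
        p_false_alarm = c.n_false_alarm / (speech_seconds - c.n_true)
        total_cost += p_miss + beta * p_false_alarm
        n_scored += 1
    if n_scored == 0:
        raise ValueError("no keyword has a reference occurrence")
    return 1.0 - total_cost / n_scored


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline
```

Under this definition a single false alarm costs far more than a single miss for rare keywords. A "50% relative" gain corresponds to, for example, `relative_gain(0.2, 0.3)`. These numbers are purely illustrative and are not results from the paper.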

I-Fan Chen | Chin-Hui Lee | Chongjia Ni | Boon Pang Lim | Nancy F. Chen

[1] Herbert Gish, et al. Rapid and accurate spoken term detection, 2007, INTERSPEECH.

[2] Bin Ma, et al. Low-resource keyword search strategies for Tamil, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[3] Herbert Gish, et al. Reducing word error rate on conversational speech from the Switchboard corpus, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[4] L. Lamel, et al. Large-vocabulary continuous speech recognition: advances and applications, 2000, Proceedings of the IEEE.

[5] Bin Ma, et al. Strategies for Vietnamese keyword search, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[6] Keikichi Hirose, et al. WFST-Based Grapheme-to-Phoneme Conversion: Open Source tools for Alignment, Model-Building and Decoding, 2012, FSMNLP.

[7] R. Rosenfeld, et al. Two decades of statistical language modeling: where do we go from here?, 2000, Proceedings of the IEEE.

[8] Chin-Hui Lee, et al. Vocabulary independent discriminative utterance verification for non-keyword rejection in subword based speech recognition, 1998.

[9] Sridha Sridharan, et al. A phonetic search approach to the 2006 NIST spoken term detection evaluation, 2007, INTERSPEECH.

[10] Daben Liu, et al. Speech and language technologies for audio indexing and retrieval, 2000, Proceedings of the IEEE.

[11] I-Fan Chen, et al. A novel keyword+LVCSR-filler based grammar network representation for spoken keyword search, 2014, The 9th International Symposium on Chinese Spoken Language Processing.

[12] Myoung-Wan Koo, et al. Speech recognition and utterance verification based on a generalized confidence score, 2001, IEEE Trans. Speech Audio Process.

[13] Chin-Hui Lee, et al. Vocabulary independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition, 1996, IEEE Trans. Speech Audio Process.

[14] Mehryar Mohri, et al. Speech Recognition with Weighted Finite-State Transducers, 2008.

[15] Stanley F. Chen, et al. An Empirical Study of Smoothing Techniques for Language Modeling, 1996, ACL.

[16] Biing-Hwang Juang, et al. Key-phrase detection and verification for flexible speech understanding, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[17] S. Furui, et al. Automatic recognition and understanding of spoken language - a first step toward natural human-machine communication, 2000, Proceedings of the IEEE.

[18] Andreas Stolcke, et al. The SRI/OGI 2006 spoken term detection system, 2007, INTERSPEECH.

[19] Biing-Hwang Juang, et al. Flexible speech understanding based on combined key-phrase detection and verification, 1998, IEEE Trans. Speech Audio Process.

[20] Arto Salomaa, et al. Semirings, Automata, Languages, 1985, EATCS Monographs on Theoretical Computer Science.

[21] Lukás Burget, et al. Comparison of keyword spotting approaches for informal continuous speech, 2005, INTERSPEECH.

[22] Jonathan G. Fiscus, et al. Results of the 2006 Spoken Term Detection Evaluation, 2006.

[23] Kuldip K. Paliwal, et al. Automatic Speech and Speaker Recognition, 1996.

[24] Chin-Hui Lee, et al. Automatic recognition of keywords in unconstrained speech using hidden Markov models, 1990, IEEE Trans. Acoust. Speech Signal Process.

[25] Kuldip K. Paliwal, et al. Automatic Speech and Speaker Recognition: Advanced Topics, 1999.

[26] H Bung. Automatic speech recognition and understanding: A first step toward natural human-machine communication, 2000.

[27] Biing-Hwang Juang, et al. Discriminative utterance verification for connected digits recognition, 1995, IEEE Trans. Speech Audio Process.

[28] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.

[29] J. Rinehart. U.S. Patent, 2006.

[30] Richard Rose, et al. A hidden Markov model based keyword recognition system, 1990, International Conference on Acoustics, Speech, and Signal Processing.

[31] Bhuvana Ramabhadran, et al. Vocabulary independent spoken term detection, 2007, SIGIR.

[32] Brian Roark, et al. Generalized Algorithms for Constructing Statistical Language Models, 2003, ACL.