A keyword-aware grammar framework for LVCSR-based spoken keyword search

In this paper, we propose a method to realize the recently developed keyword-aware grammar for keyword search (KWS) based on large vocabulary continuous speech recognition (LVCSR), using weighted finite-state automata (WFSA). The approach creates a compact and deterministic grammar WFSA by inserting keyword paths into an existing n-gram WFSA. Tested on the evalpart1 data of the IARPA Babel OpenKWS13 Vietnamese and OpenKWS14 Tamil limited language pack tasks, the proposed keyword-aware framework achieves a significant improvement, with a relative gain of about 50% in actual term weighted value (ATWV) for both languages. Comparisons between the keyword-aware grammar and our previously proposed n-gram LM based approximation of the grammar also show that the KWS performance of the two realizations is complementary.
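The idea of inserting keyword paths into an existing automaton without losing determinism can be illustrated with a toy sketch. This is not the paper's construction: the dictionary-based automaton, the function names `insert_keyword_path` and `path_cost`, the English word labels, and the arc costs are all assumptions made for illustration, and n-gram details such as backoff arcs and history states are left out. The sketch only shows prefix sharing: arcs that already match a keyword are reused, and new states are created only where the keyword leaves the existing graph.

```python
from typing import Dict, List, Tuple

# Toy weighted acceptor (hypothetical representation):
# state -> {word label: (next state, arc cost as a negative log probability)}
# One dict entry per label means a state can never have two arcs with the
# same label, so the automaton stays deterministic by construction.
Wfsa = Dict[int, Dict[str, Tuple[int, float]]]


def _fresh_state(arcs: Wfsa) -> int:
    """Return a state id that is not used as a source or a target yet."""
    used = set(arcs)
    for out in arcs.values():
        used.update(target for target, _ in out.values())
    return max(used) + 1


def insert_keyword_path(arcs: Wfsa, start: int, rejoin: int,
                        keyword: List[str], cost: float) -> None:
    """Add a path spelling `keyword` from `start` back to `rejoin`.

    Existing arcs that match the keyword are reused unchanged. Missing arcs
    are created: intermediate arcs get cost 0.0 and the final arc carries
    the whole keyword cost (an illustrative choice, not the paper's).
    """
    state = start
    for i, word in enumerate(keyword):
        out = arcs.setdefault(state, {})
        if word not in out:
            is_last = i == len(keyword) - 1
            target = rejoin if is_last else _fresh_state(arcs)
            out[word] = (target, cost if is_last else 0.0)
        state = out[word][0]


def path_cost(arcs: Wfsa, start: int, words: List[str]) -> float:
    """Sum the arc costs along the unique path labelled by `words`."""
    state, total = start, 0.0
    for word in words:
        state, arc_cost = arcs[state][word]
        total += arc_cost
    return total


# Tiny stand-in for an n-gram automaton: state 0 is the empty history and
# state 1 is the history "new".
grammar: Wfsa = {0: {"new": (1, 0.5)}, 1: {}}
insert_keyword_path(grammar, 0, 0, ["new", "york"], 1.5)
insert_keyword_path(grammar, 0, 0, ["new", "jersey"], 2.0)
insert_keyword_path(grammar, 0, 0, ["los", "angeles", "county"], 1.0)
print(path_cost(grammar, 0, ["new", "york"]))
```

Both keywords starting with "new" share the existing arc out of state 0, so they add no states, while the three-word keyword adds two. Keeping one arc per label per state is what keeps this toy graph deterministic and limits its growth as keywords are added.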
