A study of lattice-based spoken term detection for Chinese spontaneous speech

We study spoken term detection in Chinese spontaneous speech using a lattice-based approach. We compare lattices generated with four units (word, character, tonal syllable, and toneless syllable) and examine methods for converting lattices from one unit to another. The best results come from toneless-syllable lattices converted from word lattices. Lattice post-processing and system combination yield further improvement. Our best system achieves an accuracy of 80.2% on a keyword spotting task.
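To make the idea of unit conversion concrete, the following is a minimal sketch of expanding a word lattice into a toneless-syllable lattice and then searching it for a query term. It is not the paper's implementation: the arc representation, the toy lexicon `LEXICON`, the helper names, and the choice to copy each word's posterior onto all of its syllable arcs are assumptions made purely for illustration.

```python
import re
from typing import Dict, List, Tuple

# Assumed arc format: (start node, end node, label, posterior).
Arc = Tuple[str, str, str, float]

# Hypothetical toy lexicon: word -> tonal syllables (pinyin with tone digits).
LEXICON: Dict[str, List[str]] = {
    "北京": ["bei3", "jing1"],
    "背景": ["bei4", "jing3"],
    "大学": ["da4", "xue2"],
}


def strip_tone(syllable: str) -> str:
    """Drop a trailing tone digit, e.g. 'bei3' -> 'bei'."""
    return re.sub(r"[1-5]$", "", syllable)


def word_to_syllable_lattice(arcs: List[Arc]) -> List[Arc]:
    """Replace every word arc with a chain of toneless-syllable arcs."""
    out: List[Arc] = []
    for i, (start, end, word, post) in enumerate(arcs):
        syllables = [strip_tone(s) for s in LEXICON[word]]
        prev = start
        for j, syl in enumerate(syllables):
            is_last = j == len(syllables) - 1
            # Intermediate nodes get a unique name per word arc.
            nxt = end if is_last else f"{start}>{end}#{i}.{j}"
            # Assumption: the word posterior is copied to each syllable arc.
            out.append((prev, nxt, syl, post))
            prev = nxt
    return out


def contains_term(arcs: List[Arc], query: List[str]) -> bool:
    """Return True if the query label sequence matches some lattice path."""
    def walk(node: str, remaining: List[str]) -> bool:
        if not remaining:
            return True
        return any(
            walk(end, remaining[1:])
            for start, end, label, _ in arcs
            if start == node and label == remaining[0]
        )

    return any(walk(node, query) for node in {a[0] for a in arcs})


word_lattice: List[Arc] = [
    ("n0", "n1", "北京", 0.6),
    ("n0", "n1", "背景", 0.4),
    ("n1", "n2", "大学", 1.0),
]
syllable_lattice = word_to_syllable_lattice(word_lattice)
```

In this toy lattice the two competing words 北京 and 背景 differ only in tone, so both collapse to the same toneless sequence `bei jing`. A query such as `["jing", "da"]` also matches across a word boundary, which a word-level search could not do.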
