Application of out-of-language detection to spoken term detection

This paper investigates the detection of English spoken terms in a conversational multi-language scenario. The speech is processed using a large vocabulary continuous speech recognition system. The recognition output is represented in the form of word recognition lattices which are then used to search required terms. Due to the potential multi-lingual speech segments at the input, the spoken term detection system is combined with a module performing out-of-language detection to adjust its confidence scores. First, experimental results of spoken term detection are provided on the conversational telephone speech database distributed by NIST in 2006. Then, the system is evaluated on a multi-lingual database with and without employment of the out-of-language detection module, where we are only interested in detecting English terms (stored in the index database). Several strategies to combine these two systems in an efficient way are proposed and evaluated. Around 7% relative improvement over a stand-alone STD is achieved.

[1]  Andreas Stolcke,et al.  The SRI/OGI 2006 spoken term detection system , 2007, INTERSPEECH.

[2]  Herbert Gish,et al.  Rapid and accurate spoken term detection , 2007, INTERSPEECH.

[3]  Alvin F. Martin,et al.  The DET curve in assessment of detection task performance , 1997, EUROSPEECH.

[4]  Petr Motlícek Automatic out-of-language detection based on confidence measures derived from LVCSR word and phone lattices , 2009, INTERSPEECH.

[5]  Lukás Burget,et al.  The AMI System for the Transcription of Speech in Meetings , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[6]  Alex Acero,et al.  Maximum Entropy Confidence Estimation for Speech Recognition , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[7]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.