Paper information - Out-of-Vocabulary Word Recovery using FST-Based Subword Unit Clustering in a Hybrid ASR System

The paper presents a new approach to extracting useful information from out-of-vocabulary (OOV) speech regions in ASR system output. The system uses a hybrid decoding network containing both words and sub-word units. In the decoded lattices, candidates for OOV regions are identified as sub-graphs of sub-word units. To facilitate OOV word recovery, we search for recurring OOVs by clustering the detected OOV candidates. The clustering metric is based on a comparison of the sub-graphs corresponding to the OOV candidates. The proposed method discovers repeating out-of-vocabulary words and finds their graphemic representation more robustly than conventional techniques that take into account only the one-best sub-word string hypothesis.
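The idea of comparing whole sub-graphs rather than one-best strings can be illustrated with a toy sketch. Everything here is an assumption made for illustration: each OOV candidate is reduced to a small dictionary of sub-word paths with posterior probabilities (a stand-in for a real lattice sub-graph), `path_overlap` is a hypothetical score that sums the probability mass on shared paths (loosely mimicking what composing two acyclic lattices would yield), and the thresholds and example data are invented. It is not the metric or the clustering procedure used in the paper.

```python
from itertools import combinations

# Toy stand-in for a lattice sub-graph: each OOV candidate maps
# sub-word strings (tuples of units) to posterior probabilities.
Candidate = dict[tuple[str, ...], float]


def path_overlap(a: Candidate, b: Candidate) -> float:
    """Probability mass on sub-word paths shared by both candidates.

    Illustrative score only (an assumption), not the paper's metric.
    """
    return sum(p * b[path] for path, p in a.items() if path in b)


def one_best_match(a: Candidate, b: Candidate) -> float:
    """Baseline: 1.0 if the single most likely paths are identical."""
    return float(max(a, key=a.get) == max(b, key=b.get))


def cluster(cands, score, threshold):
    """Single-link clustering: merge candidates scoring >= threshold."""
    parent = list(range(len(cands)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cands)), 2):
        if score(cands[i], cands[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(cands)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical candidates: 0 and 1 are the same spoken word whose
# one-best strings differ; 2 is an unrelated word.
candidates = [
    {("k", "a", "l", "d", "i"): 0.6, ("k", "o", "l", "d", "i"): 0.4},
    {("k", "o", "l", "d", "i"): 0.5, ("k", "a", "l", "d", "i"): 0.3,
     ("g", "o", "l", "d", "i"): 0.2},
    {("m", "o", "h", "r", "i"): 0.9, ("m", "o", "r", "i"): 0.1},
]

lattice_clusters = cluster(candidates, path_overlap, threshold=0.2)
one_best_clusters = cluster(candidates, one_best_match, threshold=1.0)
```

In this toy data the first two candidates disagree on their most likely string but share alternative paths, so the overlap score groups them while the one-best baseline keeps them apart. A real system would operate on weighted FSTs from the decoder rather than on enumerated path lists.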

Lukás Burget | Ekaterina Egorova

[1] Mehryar Mohri, et al. Speech Recognition with Weighted Finite-State Transducers, 2008.

[2] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[3] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.

[4] Hynek Hermansky, et al. Recovery of Rare Words in Lecture Speech, 2010, TSD.

[5] Murat Saraclar, et al. Lattice Indexing for Spoken Term Detection, 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[6] Mark J. Embrechts, et al. On the Use of the Adjusted Rand Index as a Metric for Evaluating Supervised Classification, 2009, ICANN.

[7] Slav Petrov, et al. Syntactic Annotations for the Google Books NGram Corpus, 2012, ACL.

[8] Alexander I. Rudnicky, et al. Learning better lexical properties for recurrent OOV words, 2013, IEEE Workshop on Automatic Speech Recognition and Understanding.

[9] Lukás Burget, et al. Similarity scoring for recognizing repeated out-of-vocabulary words, 2010, INTERSPEECH.

[10] Bhuvana Ramabhadran, et al. A new method for OOV detection using hybrid word/fragment system, 2009, IEEE International Conference on Acoustics, Speech and Signal Processing.

[11] Alexander I. Rudnicky, et al. Finding recurrent out-of-vocabulary words, 2013, INTERSPEECH.

[12] Richard M. Schwartz, et al. Subword and phonetic search for detecting out-of-vocabulary keywords, 2014, INTERSPEECH.

[13] Richard M. Schwartz, et al. Combination of search techniques for improved spotting of OOV keywords, 2015, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[14] Hermann Ney, et al. Joint-sequence models for grapheme-to-phoneme conversion, 2008, Speech Commun.

[15] Murat Saraclar, et al. Hybrid language models for out of vocabulary word detection in large vocabulary conversational speech recognition, 2004, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[16] Richard M. Schwartz, et al. Semi-Supervised Methods for Improving Keyword Search of Unseen Terms, 2012, INTERSPEECH.

[17] Jean-Luc Gauvain, et al. Acoustic unit discovery and pronunciation generation from a grapheme-based lexicon, 2013, IEEE Workshop on Automatic Speech Recognition and Understanding.

[18] Hermann Ney, et al. Open vocabulary speech recognition with flat hybrid models, 2005, INTERSPEECH.

[19] James R. Glass, et al. Spoken Content Retrieval—Beyond Cascading Speech Recognition with Text Retrieval, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[20] Igor Szöke. Hybrid word-subword spoken term detection, 2010.