G2P variant prediction techniques for ASR and STD

This work was supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Defense U.S. Army Research Laboratory contract number W911NF-12-C-0013. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either express or implied, of IARPA, DoD/ARL, or the U.S. Government.
