Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks
[1] Brian Roark, et al. Encoding linear models as weighted finite-state transducers, 2014, INTERSPEECH.
[2] Fuchun Peng, et al. Grapheme-to-phoneme conversion using Long Short-Term Memory recurrent neural networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] James F. Allen, et al. Pronunciation of proper names with a joint n-gram model for bi-directional grapheme-to-phoneme conversion, 2002, INTERSPEECH.
[4] Hermann Ney, et al. Structure learning in hidden conditional random fields for grapheme-to-phoneme conversion, 2013, INTERSPEECH.
[5] Fuchun Peng, et al. Fix it where it fails: Pronunciation learning by mining error corrections from speech logs, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Keikichi Hirose, et al. Failure transitions for joint n-gram models and G2P conversion, 2013, INTERSPEECH.
[7] Joseph P. Olive, et al. Text-to-speech synthesis, 1995, AT&T Technical Journal.
[8] Grzegorz Kondrak, et al. A Ranking Approach to Stress Prediction for Letter-to-Phoneme Conversion, 2009, ACL/IJCNLP.
[9] Fuchun Peng, et al. Pronunciation learning for named-entities through crowd-sourcing, 2014, INTERSPEECH.
[10] Stefan Hahn, et al. Comparison of Grapheme-to-Phoneme Methods on Large Pronunciation Dictionaries and LVCSR Tasks, 2012, INTERSPEECH.
[11] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[12] Simon King, et al. Letter-to-Sound Pronunciation Prediction Using Conditional Random Fields, 2011, IEEE Signal Processing Letters.
[13] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[14] Terrence J. Sejnowski, et al. Parallel Networks that Learn to Pronounce English Text, 1987, Complex Syst.
[15] Hermann Ney, et al. Joint-sequence models for grapheme-to-phoneme conversion, 2008, Speech Commun.
[16] Frédéric Bimbot, et al. Variable-length sequence matching for phonetic transcription using joint multigrams, 1995, EUROSPEECH.
[17] Enikö Beatrice Bilcu. Text-To-Phoneme Mapping Using Neural Networks, 2008.
[18] Richard Sproat, et al. Applications of maximum entropy rankers to problems in spoken language processing, 2014, INTERSPEECH.
[19] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[20] Keikichi Hirose, et al. Improving WFST-based G2P Conversion with Alignment Constraints and RNNLM N-best Rescoring, 2012, INTERSPEECH.