Multimodal, Multilingual Grapheme-to-Phoneme Conversion for Low-Resource Languages
Han Zhang | James Route | Steven Hillis | Alan W. Black | Isak C. Etinger
[1] Tomoki Toda, et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Goutam Saha, et al. Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition, 2012, Speech Commun.
[3] James R. Glass, et al. Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech, 2018, INTERSPEECH.
[4] Thaweesak Yingthawornsuk, et al. Speech Recognition using MFCC, 2012.
[5] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[6] Alan W. Black, et al. CMU Wilderness Multilingual Speech Dataset, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Kevin Knight, et al. Grapheme-to-Phoneme Models for (Almost) Any Language, 2016, ACL.
[8] Karen Livescu, et al. Jointly learning to align and convert graphemes to phonemes with neural attention models, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[9] Juhan Nam, et al. Multimodal Deep Learning, 2011, ICML.
[10] Alexander M. Rush, et al. OpenNMT: Open-Source Toolkit for Neural Machine Translation, 2017, ACL.
[11] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[12] Grzegorz Kondrak, et al. Applying Many-to-Many Alignments and Hidden Markov Models to Letter-to-Phoneme Conversion, 2007, NAACL.
[13] Terrence J. Sejnowski, et al. NETtalk: a parallel network that learns to read aloud, 1988.
[14] Siddharth Dalmia, et al. Epitran: Precision G2P for Many Languages, 2018, LREC.
[15] Colin Raffel, et al. librosa: Audio and Music Signal Analysis in Python, 2015, SciPy.
[16] Barnabás Póczos, et al. Found in Translation: Learning Robust Joint Representations by Cyclic Translations Between Modalities, 2018, AAAI.
[17] Barnabás Póczos, et al. Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis, 2018, ArXiv.
[18] Guoliang Zhang, et al. Comparison of Different Implementations of MFCC, 2001.
[19] Josef van Genabith, et al. Massively Multilingual Neural Grapheme-to-Phoneme Conversion, 2017, ArXiv.
[20] Vaibhava Goel, et al. Deep multimodal learning for Audio-Visual Speech Recognition, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] John C. Wells, et al. Computer-coding the IPA: a proposed extension of SAMPA, 1995.
[22] Stanley F. Chen, et al. Conditional and joint models for grapheme-to-phoneme conversion, 2003, INTERSPEECH.
[23] Prateek Verma, et al. Audio-linguistic Embeddings for Spoken Sentences, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Hermann Ney, et al. Joint-sequence models for grapheme-to-phoneme conversion, 2008, Speech Commun.
[25] Fang Zheng, et al. Comparison of different implementations of MFCC, 2001.
[26] Nikos Fakotakis, et al. Comparative Evaluation of Various MFCC Implementations on the Speaker Verification Task, 2007.