Speech Technology for Unwritten Languages
Florian Metze | Mark Hasegawa-Johnson | Graham Neubig | Shruti Palaskar | Odette Scharenborg | Alan W. Black | Laurent Besacier | Emmanuel Dupoux | Philip Arthur | Mingxing Du | Liming Wang | Lucas Ondel | Rachid Riad | Pierre Godard | Danny Merkx | Markus Mueller | Francesco Ciannella | Elin Larsen | Sebastian Stueker
[1] Matthias Sperber,et al. XNMT: The eXtensible Neural Machine Translation Toolkit , 2018, AMTA.
[2] B. Nash-Webber,et al. Semantic support for a speech understanding system , 1975 .
[3] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Kenneth Ward Church,et al. A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[5] Lalit R. Bahl,et al. Design of a linguistic statistical decoder for the recognition of continuous speech , 1975, IEEE Trans. Inf. Theory.
[6] Jiebo Luo,et al. Image Captioning with Semantic Attention , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Florian Metze,et al. Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the “Speaking Rosetta” JSALT 2017 Workshop , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] James R. Glass,et al. Deep multimodal semantic embeddings for speech and images , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[10] Christos Faloutsos,et al. Automatic image captioning , 2004, 2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763).
[11] Martine Adda-Decker,et al. Parallel Speech Collection for Under-resourced Language Studies Using the Lig-Aikuma Mobile Device App , 2016, SLTU.
[12] Kevin Duh,et al. The JHU/KyotoU Speech Translation System for IWSLT 2018 , 2018, IWSLT.
[13] Mark Hasegawa-Johnson,et al. Cross-Dialectal Data Transferring for Gaussian Mixture Model Training in Arabic Speech Recognition , 2012 .
[14] Mattia Antonino Di Gangi,et al. Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018 , 2018, IWSLT.
[15] Olivier Pietquin,et al. Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation , 2016, NIPS 2016.
[16] Mauro Cettolo,et al. The IWSLT 2018 Evaluation Campaign , 2018, IWSLT.
[17] James R. Glass,et al. Towards multi-speaker unsupervised speech pattern discovery , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] James R. Glass,et al. Unsupervised Pattern Discovery in Speech , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Graham Neubig,et al. Neural Machine Translation and Sequence-to-sequence Models: A Tutorial , 2017, ArXiv.
[20] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[21] Grzegorz Chrupala,et al. Encoding of phonology in a recurrent neural model of grounded speech , 2017, CoNLL.
[22] N. Umeda,et al. Linguistic rules for text-to-speech synthesis , 1976, Proceedings of the IEEE.
[23] David Chiang,et al. An Attentional Model for Speech Translation Without Transcription , 2016, NAACL.
[24] James R. Glass,et al. Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input , 2018, ECCV.
[25] Frantisek Grézl,et al. Multilingually trained bottleneck features in spoken language recognition , 2017, Comput. Speech Lang..
[26] Lukás Burget,et al. Variational Inference for Acoustic Unit Discovery , 2016, Workshop on Spoken Language Technologies for Under-resourced Languages.
[27] Bowen Zhou,et al. Towards Speech Translation of Non Written Languages , 2006, 2006 IEEE Spoken Language Technology Workshop.
[28] Navdeep Jaitly,et al. Sequence-to-Sequence Models Can Directly Transcribe Foreign Speech , 2017, ArXiv.
[29] Martin Karafiát,et al. The language-independent bottleneck features , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[30] Majid Mirbagheri,et al. ASR for Under-Resourced Languages From Probabilistic Transcription , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[31] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[32] Alan W. Black,et al. Random forests for statistical speech synthesis , 2015, INTERSPEECH.
[33] Elena Lloret,et al. Improving Automatic Image Captioning Using Text Summarization Techniques , 2010, TSD.
[34] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[35] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[36] A. Waibel,et al. The 2014 KIT IWSLT speech-to-text systems for English, German and Italian , 2014, IWSLT.
[37] Eric Fosler-Lussier. Contextual Word and Syllable Pronunciation Models , 1999 .
[38] Alan W. Black,et al. Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Quoc V. Le,et al. Listen, Attend and Spell , 2015, ArXiv.
[40] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res..
[41] I. A. Richards,et al. The Meaning of Meaning: a Study of the Influence of Language upon Thought and of the Science of Symbolism , 1923, Nature.
[42] James R. Glass,et al. Unsupervised Learning of Spoken Language with Visual Context , 2016, NIPS.
[43] Thierry Dutoit,et al. High-quality speech synthesis for phonetic speech segmentation , 1997, EUROSPEECH.
[44] Ian Maddieson,et al. Patterns of sounds , 1986 .
[45] Alan W. Black,et al. CLUSTERGEN: a statistical parametric synthesizer using trajectory modeling , 2006, INTERSPEECH.
[46] Sebastian Stüker,et al. Innovative technologies for under-resourced language documentation: The BULB Project , 2016 .
[47] F. Jelinek,et al. Continuous speech recognition by statistical methods , 1976, Proceedings of the IEEE.
[48] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[49] Sebastian Stüker,et al. Breaking the Unwritten Language Barrier: The BULB Project , 2016, SLTU.
[50] George R. Doddington,et al. The ATIS Spoken Language Systems Pilot Corpus , 1990, HLT.
[51] Lou Boves,et al. Experiences from the Spoken Dutch Corpus Project , 2002, LREC.
[52] Hermann Ney,et al. Cross-language bootstrapping for unsupervised acoustic model training: rapid development of a Polish speech recognition system , 2009, INTERSPEECH.
[53] James R. Glass,et al. Towards Visually Grounded Sub-word Speech Unit Discovery , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[54] Mari Ostendorf,et al. A stochastic segment model for phoneme-based continuous speech recognition , 1989, IEEE Trans. Acoust. Speech Signal Process..
[55] Mark Hasegawa-Johnson,et al. Image2Speech: Automatically generating audio descriptions of images , 2017 .
[56] Adam Lopez,et al. Pre-training on high-resource speech recognition improves low-resource speech-to-text translation , 2018, NAACL.
[57] James R. Glass,et al. Vision as an Interlingua: Learning Multilingual Semantic Embeddings of Untranscribed Speech , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[58] Sanjeev Khudanpur,et al. Unsupervised Learning of Acoustic Sub-word Units , 2008, ACL.
[59] Ngoc Thang Vu,et al. Multilingual bottle-neck features and its application for under-resourced languages , 2012, SLTU.
[60] Phil Blunsom,et al. Recurrent Continuous Translation Models , 2013, EMNLP.
[61] Tanja Schultz,et al. Experiments on cross-language acoustic modeling , 2001, INTERSPEECH.
[62] Sebastian Stüker,et al. A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments , 2017, LREC.
[63] Samy Bengio,et al. Large Scale Online Learning of Image Similarity Through Ranking , 2009, J. Mach. Learn. Res..
[64] Zhen-Hua Ling,et al. Enhancing Sentence Embedding with Generalized Pooling , 2018, COLING.
[65] A. Black,et al. Building an ASR System for a Low-resource Language Through the Adaptation of a High-resource Language ASR System: Preliminary Results , 2017 .
[66] Chng Eng Siong,et al. A comparative study of BNF and DNN multilingual training on cross-lingual low-resource speech recognition , 2015, INTERSPEECH.
[67] Mark Hasegawa-Johnson,et al. Building an ASR System for Mboshi Using A Cross-Language Definition of Acoustic Units Approach , 2018, SLTU.