[1] Danny Merkx,et al. Learning semantic sentence representations from visually grounded language without lexical knowledge , 2019, Natural Language Engineering.
[2] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[3] Holger Schwenk,et al. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data , 2017, EMNLP.
[4] Leslie N. Smith,et al. Cyclical Learning Rates for Training Neural Networks , 2015, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).
[5] James R. Glass,et al. Deep multimodal semantic embeddings for speech and images , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[6] Zhong-Qiu Wang,et al. A Joint Training Framework for Robust Automatic Speech Recognition , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Graham Neubig,et al. A Tree-based Decoder for Neural Machine Translation , 2018, EMNLP.
[8] M. Tomasello,et al. Early syntactic creativity: a usage-based approach , 2003, Journal of Child Language.
[9] Allan Jabri,et al. Learning Visually Grounded Sentence Representations , 2018, NAACL.
[10] J. Pine,et al. Reanalysing rote-learned phrases: individual differences in the transition to multi-word speech , 1993, Journal of Child Language.
[11] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[12] M. Braine. Children's First Word Combinations , 1976.
[13] M. Tomasello. First Steps toward a Usage-Based Theory of Language Acquisition , 2001.
[14] Sanja Fidler,et al. Skip-Thought Vectors , 2015, NIPS.
[15] James Glass,et al. Analysis of Audio-Visual Features for Unsupervised Speech Recognition , 2017.
[16] David J. Fleet,et al. VSE++: Improved Visual-Semantic Embeddings , 2017, ArXiv.
[17] Sebastian Stüker,et al. Multilingual shifting deep bottleneck features for low-resource ASR , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Kilian Q. Weinberger,et al. Snapshot Ensembles: Train 1, get M for free , 2017, ICLR.
[20] Grzegorz Chrupala,et al. Representations of language in a model of visually grounded speech signal , 2017, ACL.
[21] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[22] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[23] Eneko Agirre,et al. SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation , 2016, *SEMEVAL.
[24] James R. Glass,et al. Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input , 2018, ECCV.
[25] Frantisek Grézl,et al. Multilingually trained bottleneck features in spoken language recognition , 2017, Comput. Speech Lang.
[26] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[27] Gregory Shakhnarovich,et al. Visually Grounded Learning of Keyword Prediction from Untranscribed Speech , 2017, INTERSPEECH.
[28] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res.
[29] James R. Glass,et al. Unsupervised Learning of Spoken Language with Visual Context , 2016, NIPS.