Learning Words by Drawing Images
James Glass | Antonio Torralba | David Bau | Dídac Surís | Adrià Recasens | David Harwath
[1] J. Elman. Learning and development in neural networks: the importance of starting small, 1993, Cognition.
[2] H. Hayne, et al. The effect of drawing on memory performance in young children, 1995.
[3] Alex Pentland, et al. Learning words from sights and sounds: a computational model, 2002, Cogn. Sci.
[4] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[5] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[6] Shiguang Shan, et al. Self-Paced Curriculum Learning, 2015, AAAI.
[7] Yuandong Tian, et al. Simple Baseline for Visual Question Answering, 2015, ArXiv.
[8] Rob Fergus, et al. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks, 2015, NIPS.
[9] Bolei Zhou, et al. Object Detectors Emerge in Deep Scene CNNs, 2014, ICLR.
[10] Allan Jabri, et al. Revisiting Visual Question Answering Baselines, 2016, ECCV.
[11] Dan Klein, et al. Learning to Compose Neural Networks for Question Answering, 2016, NAACL.
[12] Antonio Torralba, et al. SoundNet: Learning Sound Representations from Unlabeled Video, 2016, NIPS.
[13] Soumith Chintala, et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015, ICLR.
[14] James R. Glass, et al. Unsupervised Learning of Spoken Language with Visual Context, 2016, NIPS.
[15] Andrew Owens, et al. Ambient Sound Provides Supervision for Visual Learning, 2016, ECCV.
[16] Wei Liu, et al. Multi-Modal Curriculum Learning for Semi-Supervised Image Classification, 2016, IEEE Transactions on Image Processing.
[17] Andrew Owens, et al. Visually Indicated Sounds, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Razvan Pascanu, et al. A simple neural network module for relational reasoning, 2017, NIPS.
[19] Li Fei-Fei, et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Grzegorz Chrupala, et al. Encoding of phonology in a recurrent neural model of grounded speech, 2017, CoNLL.
[21] Li Fei-Fei, et al. Inferring and Executing Programs for Visual Reasoning, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[22] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[23] James Glass, et al. Analysis of Audio-Visual Features for Unsupervised Speech Recognition, 2017.
[24] Bolei Zhou, et al. Network Dissection: Quantifying Interpretability of Deep Visual Representations, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Andrew Slavin Ross, et al. Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations, 2017, IJCAI.
[26] James R. Glass, et al. Learning Word-Like Units from Joint Audio-Visual Analysis, 2017, ACL.
[27] Gregory Shakhnarovich, et al. Visually Grounded Learning of Keyword Prediction from Untranscribed Speech, 2017, INTERSPEECH.
[28] Andrew Zisserman, et al. Look, Listen and Learn, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[29] Grzegorz Chrupala, et al. Representations of language in a model of visually grounded speech signal, 2017, ACL.
[30] Thomas Brox, et al. FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Andrew Owens, et al. Audio-Visual Scene Analysis with Self-Supervised Multisensory Features, 2018, ECCV.
[32] Kevin Wilson, et al. Looking to listen at the cocktail party, 2018, ACM Trans. Graph.
[33] Rogério Schmidt Feris, et al. Learning to Separate Object Sounds by Watching Unlabeled Video, 2018, ECCV.
[34] Jaakko Lehtinen, et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation, 2017, ICLR.
[35] James R. Glass, et al. Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input, 2018, ECCV.
[36] Felix Hill, et al. Measuring abstract reasoning in neural networks, 2018, ICML.
[37] Tae-Hyun Oh, et al. Learning to Localize Sound Source in Visual Scenes, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[38] Kristen Grauman, et al. Attributes as Operators, 2018, ArXiv.
[39] David Mascharka, et al. Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[40] Andrew Zisserman, et al. Objects that Sound, 2017, ECCV.
[41] Joon Son Chung, et al. The Conversation: Deep Audio-Visual Speech Enhancement, 2018, INTERSPEECH.
[42] Chuang Gan, et al. The Sound of Pixels, 2018, ECCV.
[43] Bolei Zhou, et al. Interpreting Deep Visual Representations via Network Dissection, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[44] Bolei Zhou, et al. Visualizing and Understanding Generative Adversarial Networks (Extended Abstract), 2019, ArXiv.