Song-Chun Zhu | Siyuan Huang | Yining Hong | Qing Li
[1] Lin Ma,et al. Multimodal Convolutional Neural Networks for Matching Image and Sentence , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[2] A. Simone,et al. Guiding new physics searches with unsupervised learning , 2018, The European Physical Journal C.
[3] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[4] Song-Chun Zhu,et al. A Numerical Study of the Bottom-Up and Top-Down Inference Processes in And-Or Graphs , 2011, International Journal of Computer Vision.
[5] Qing Li,et al. Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning , 2020, ICML.
[6] Song-Chun Zhu,et al. Learning AND-OR Templates for Object Recognition and Detection , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[7] Feng Han,et al. Bottom-Up/Top-Down Image Parsing with Attribute Grammar , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[8] R. Langacker. Foundations of cognitive grammar , 1983 .
[9] Leonidas J. Guibas,et al. StructureNet , 2019, ACM Trans. Graph..
[10] Howard Hunt Pattee,et al. Hierarchy Theory: The Challenge of Complex Systems , 1973 .
[11] Cordelia Schmid,et al. VideoBERT: A Joint Model for Video and Language Representation Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[12] Geoffrey E. Hinton,et al. How to Represent Part-Whole Hierarchies in a Neural Network , 2021, Neural Computation.
[13] Yair Neuman,et al. Literal and Metaphorical Sense Identification through Concrete and Abstract Context , 2011, EMNLP.
[14] Geoffrey E. Hinton,et al. Dynamic Routing Between Capsules , 2017, NIPS.
[15] Stephen Clark,et al. Improving Multi-Modal Representations Using Image Dispersion: Why Less is Sometimes More , 2014, ACL.
[16] Kevin Gimpel,et al. Visually Grounded Neural Syntax Acquisition , 2019, ACL.
[17] Leonidas J. Guibas,et al. PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Mark Johnson,et al. The body in the mind: the bodily basis of meaning , 1988 .
[19] Ronald W. Langacker,et al. An Introduction to Cognitive Grammar , 1986, Cogn. Sci..
[20] Kun Liu,et al. PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Furu Wei,et al. VL-BERT: Pre-training of Generic Visual-Linguistic Representations , 2019, ICLR.
[22] Armand Joulin,et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping , 2014, NIPS.
[23] Alexander M. Rush,et al. Compound Probabilistic Context-Free Grammars for Grammar Induction , 2019, ACL.
[24] Alexander M. Rush,et al. What is Learned in Visually Grounded Neural Syntax Acquisition , 2020, ACL.
[25] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[26] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[27] O. Firschein,et al. Syntactic pattern recognition and applications , 1983, Proceedings of the IEEE.
[28] Jiasen Lu,et al. Hierarchical Co-Attention for Visual Question Answering , 2016 .
[29] Ivan Titov,et al. Visually Grounded Compound PCFGs , 2020, EMNLP.
[30] Mohit Bansal,et al. LXMERT: Learning Cross-Modality Encoder Representations from Transformers , 2019, EMNLP.
[31] Yu Cheng,et al. UNITER: UNiversal Image-TExt Representation Learning , 2019, ECCV.
[32] Dan Klein,et al. A Generative Constituent-Context Model for Improved Grammar Induction , 2002, ACL.
[33] Vladimir Solmon,et al. The estimation of stochastic context-free grammars using the Inside-Outside algorithm , 2003 .
[34] Valentin I. Spitkovsky,et al. Viterbi Training Improves Unsupervised Dependency Parsing , 2010, CoNLL.
[35] Jason Eisner,et al. Inside-Outside and Forward-Backward Algorithms Are Just Backprop (tutorial paper) , 2016, SPNLP@EMNLP.
[36] Aaron C. Courville,et al. Neural Language Modeling by Jointly Learning Syntax and Lexicon , 2017, ICLR.
[37] Kewei Tu,et al. Unsupervised Structure Learning of Stochastic And-Or Grammars , 2013, NIPS.
[38] J. Rekers,et al. Defining and Parsing Visual Languages with Layered Graph Grammars , 1997 .
[39] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[40] K S Fu,et al. Syntactic Shape Recognition Using Attributed Grammars. , 1978 .
[41] Zhuowen Tu,et al. Image Parsing: Unifying Segmentation, Detection, and Recognition , 2005, International Journal of Computer Vision.
[42] Stefan Lee,et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks , 2019, NeurIPS.
[43] Aaron C. Courville,et al. Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks , 2018, ICLR.
[44] Leonidas J. Guibas,et al. Learning hierarchical shape segmentation and labeling from online repositories , 2017, ACM Trans. Graph..
[45] Luc Van Gool,et al. SCAN: Learning to Classify Images Without Labels , 2020, ECCV.
[46] Mohit Yadav,et al. Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Auto-Encoders , 2019, NAACL.
[47] Andy Schürr,et al. Defining and Parsing Visual Languages with Layered Graph Grammars , 1997, J. Vis. Lang. Comput..
[48] L. L. Cam,et al. Asymptotic Methods In Statistical Decision Theory , 1986 .
[49] J. Baker. Trainable grammars for speech recognition , 1979 .
[50] Song-Chun Zhu,et al. Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image , 2018, ECCV.
[51] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Noah A. Smith,et al. The Shared Logistic Normal Distribution for Grammar Induction , 2008 .
[53] Alexander M. Rush,et al. Unsupervised Recurrent Neural Network Grammars , 2019, NAACL.
[54] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.
[55] 知秀 柴田. Understand It in 5 Minutes!? A Quick Skim of Famous Papers: Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2020 .