Gal Chechik | Amir Globerson | Aviv Shamsian | Ofri Kleinfeld
[1] J. Piaget. The construction of reality in the child, 1954.
[2] R. Baillargeon, et al. Object permanence in young infants: further evidence, 1991, Child development.
[3] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[4] R. Baillargeon, et al. 2.5-Month-Old Infants' Reasoning about When Objects Should and Should Not Be Occluded, 1999, Cognitive Psychology.
[5] Yan Huang, et al. Tracking multiple objects through occlusions, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[6] A. Smitsman, et al. The significance of event information for 6- to 16-month-old infants' perception of containment, 2009, Developmental psychology.
[7] Antonis A. Argyros, et al. Multiple objects tracking in the presence of long-term occlusions, 2010, Comput. Vis. Image Underst.
[8] Philippe C. Cattin, et al. Tracking the invisible: Learning where the object might be, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[9] Ali Farhadi, et al. Recognition using visual phrases, 2011, CVPR 2011.
[10] Andrew Zisserman, et al. Two-Stream Convolutional Networks for Action Recognition in Videos, 2014, NIPS.
[11] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[12] Matthew J. Hausknecht, et al. Beyond short snippets: Deep networks for video classification, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Ruslan Salakhutdinov, et al. Action Recognition using Visual Attention, 2015, NIPS 2015.
[14] Ming-Hsuan Yang, et al. Object Tracking Benchmark, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Lorenzo Torresani, et al. Learning Spatiotemporal Features with 3D Convolutional Networks, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[16] Michael S. Bernstein, et al. Visual Relationship Detection with Language Priors, 2016, ECCV.
[17] Richard P. Wildes, et al. Spatiotemporal Residual Networks for Video Action Recognition, 2016, NIPS.
[18] Michael S. Bernstein, et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations, 2016, International Journal of Computer Vision.
[19] Li Fei-Fei, et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[21] Richard P. Wildes, et al. Spatiotemporal Multiplier Networks for Video Action Recognition, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Heng Tao Shen, et al. Video Captioning With Attention-Based LSTM and Semantic Consistency, 2017, IEEE Transactions on Multimedia.
[23] Alexander J. Smola, et al. Deep Sets, 2017, arXiv:1703.06114.
[24] Wenjun Zeng, et al. An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data, 2016, AAAI.
[25] Andrew Zisserman, et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Michael Felsberg, et al. The Sixth Visual Object Tracking VOT2018 Challenge Results, 2018, ECCV Workshops.
[27] Wei Wu, et al. High Performance Visual Tracking with Siamese Region Proposal Network, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Wei Wu, et al. Distractor-aware Siamese Networks for Visual Object Tracking, 2018, ECCV.
[29] Bolei Zhou, et al. Temporal Relational Reasoning in Videos, 2017, ECCV.
[30] Wei Liang, et al. Tracking Occluded Objects and Recovering Incomplete Trajectories by Reasoning About Containment Relations and Human Actions, 2018, AAAI.
[31] Shimon Ullman, et al. A model for discovering ‘containment’ relations, 2019, Cognition.
[32] Fan Yang, et al. LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Haibin Ling, et al. Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Chuang Gan, et al. CLEVRER: CoLlision Events for Video REpresentation and Reasoning, 2020, ICLR.
[35] D. Ramanan, et al. CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning, 2019, ICLR.
[36] Seyed Mojtaba Marvasti-Zadeh, et al. Deep Learning for Visual Tracking: A Comprehensive Survey, 2019, IEEE Transactions on Intelligent Transportation Systems.