Seeing What You're Told: Sentence-Guided Activity Recognition in Video
Jeffrey Mark Siskind | N. Siddharth | Andrei Barbu
[1] Cordelia Schmid, et al. Learning realistic human actions from movies, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[2] Jack K. Wolf, et al. Finding the best set of K paths through a trellis with application to multitarget tracking, 1989.
[3] Luc Van Gool, et al. The Pascal Visual Object Classes (VOC) Challenge, 2010, International Journal of Computer Vision.
[4] Yiannis Aloimonos, et al. Corpus-Guided Sentence Generation of Natural Images, 2011, EMNLP.
[5] Karl Stratos, et al. Midge: Generating Image Descriptions From Computer Vision Detections, 2012, EACL.
[6] Li Zhuo, et al. Semantic context based refinement for news video annotation, 2013, Multimedia Tools and Applications.
[7] Kunio Fukunaga, et al. Natural Language Description of Human Activities from Video Images Based on Concept Hierarchy of Actions, 2002, International Journal of Computer Vision.
[8] Andrew Zisserman, et al. Video Google: a text retrieval approach to object matching in videos, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[9] Barbara Caputo, et al. Who's Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation, 2009, NIPS.
[10] K. X. M. Tzeng, et al. Convolutional Codes and Their Performance in Communication Systems, 1971.
[11] Pau Baiget, et al. Natural Language Descriptions of Human Behavior from Video Sequences, 2007, KI.
[12] Jeffrey Mark Siskind, et al. Simultaneous Object Detection, Tracking, and Event Recognition, 2012, ArXiv.
[13] Jeffrey Mark Siskind, et al. Grounded Language Learning from Video Described with Sentences, 2013, ACL.
[14] Piji Li, et al. What is happening in a still picture?, 2011, The First Asian Conference on Pattern Recognition.
[15] Cyrus Rashtchian, et al. Every Picture Tells a Story: Generating Sentences from Images, 2010, ECCV.
[16] C. V. Jawahar, et al. Choosing Linguistics over Vision to Describe Images, 2012, AAAI.
[17] Muhammad Usman Ghani Khan, et al. Describing Video Contents in Natural Language, 2012.
[18] Sven J. Dickinson, et al. Video In Sentences Out, 2012, UAI.
[19] David A. McAllester, et al. Cascade object detection with deformable part models, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.