Unsupervised Linking of Visual Features to Textual Descriptions in Long Manipulation Activities
Eren Erdal Aksoy | Tamim Asfour | Adil Orhan | Yezhou Yang | Ekaterina Ovchinnikova