The Cognitive Dialogue: A new model for vision implementing common sense reasoning

We propose a new model for vision in which vision is one component of an intelligent system that reasons. Achieving this requires integrating perceptual processing with computational reasoning and linguistics. In this paper we present the basics of this formalism, the Cognitive Dialogue.
