Combining Deep Learning and Qualitative Spatial Reasoning to Learn Complex Structures from Sparse Examples with Noise

Many modern machine learning approaches require vast amounts of training data to learn new concepts; by contrast, human learning often requires few examples--sometimes only one--from which the learner can abstract structural concepts. We present a novel approach to introducing new spatial structures to an AI agent, combining deep learning over qualitative spatial relations with various heuristic search algorithms. The agent extracts spatial relations from a sparse set of noisy examples of block-based structures and trains convolutional and sequential models of those relation sets. To create novel examples of similar structures, the agent begins placing blocks on a virtual table. After each placement, it uses a CNN to predict the most similar complete example structure and an LSTM to predict the most likely set of remaining moves needed to complete it, then recommends one of those moves using heuristic search. We verify that the agent learned the concept by observing its virtual block-building activities, wherein it ranks each potential subsequent action toward building its learned concept. We empirically assess this approach with human participants' ratings of the block structures. Initial results and qualitative evaluations of structures generated by the trained agent show where it has generalized concepts from the training data, which heuristics perform best within the search space, and how we might improve learning and execution.
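The relation-extraction and move-ranking steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the relation vocabulary (`on`, `left_of`), the unit-sized blocks, the tolerance `tol`, and the use of Jaccard similarity as the ranking heuristic are all assumptions made for illustration, and the target relation set is given by hand where the approach above would use the CNN's predicted example structure.

```python
from itertools import permutations


def relations(blocks, tol=0.5):
    """Extract qualitative relations from noisy block centers.

    `blocks` maps a block name to an (x, y, z) center; blocks are assumed
    to be unit cubes. The two relation types are hypothetical stand-ins
    for a richer qualitative spatial vocabulary.
    """
    rels = set()
    for a, b in permutations(sorted(blocks), 2):
        (ax, ay, az), (bx, by, bz) = blocks[a], blocks[b]
        same_column = abs(ax - bx) <= tol and abs(ay - by) <= tol
        same_row = abs(az - bz) <= tol and abs(ay - by) <= tol
        if same_column and abs(az - bz - 1.0) <= tol:
            rels.add(("on", a, b))        # a rests directly on b
        elif same_row and abs(bx - ax - 1.0) <= tol:
            rels.add(("left_of", a, b))   # a is directly left of b
    return rels


def jaccard(s, t):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not s and not t:
        return 1.0
    return len(s & t) / len(s | t)


def rank_moves(placed, name, candidates, target_rels):
    """Score each candidate position for block `name` against the target.

    Returns (score, position) pairs, best first. In this sketch the
    target relation set is supplied directly by the caller.
    """
    scored = []
    for pos in candidates:
        trial = dict(placed)
        trial[name] = pos
        scored.append((jaccard(relations(trial), target_rels), pos))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Target: a three-block staircase, described only by its relations.
target = relations({"b1": (0, 0, 0), "b2": (1, 0, 0), "b3": (1, 0, 1)})

# Two blocks are already on the table, with positional noise.
placed = {"b1": (0.05, 0.0, 0.0), "b2": (0.95, 0.1, 0.0)}
candidates = [(0.0, 0.0, 1.0), (1.1, -0.1, 1.05), (2.0, 0.0, 0.0)]

ranking = rank_moves(placed, "b3", candidates, target)
best_score, best_position = ranking[0]
```

Because the comparison is made over relation sets rather than raw coordinates, small positional noise in the placed blocks does not change the score, which is the property that makes a qualitative representation attractive when examples are sparse and noisy.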
