Physics-Based Visual Understanding

An understanding of a scene's causal physics (how scene elements interact and respond to forces) is a precondition to reasoning about how the scene came to be, how it may evolve in time, and how it will respond to manipulation. We propose a computationally inexpensive method for recovering causal structure from images, in which a scene model is built incrementally through interleaved sensing and analysis. Reasoning uses generic qualitative knowledge about rigid-body interactions, reusable between domains and similar to concepts thought to be acquired or activated during child development. Causal constraint propagation reveals anomalous degrees of freedom in the scene model; prediction yields sensory plans to resolve them. Sensing operations are highly directed and local in scope, e.g., visual routines and proprioception. Inference depth and the number of pixels “touched” are bounded by the complexity of the scene. We present algorithms and semantics that have been successfully reused in several domains of highly structured scenes; in particular we detail a vision system that reverse-engineers machines.
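The interleaving of constraint propagation and directed sensing described above can be illustrated with a toy sketch. It is not the system reported here: the `Scene` class, the three planar degrees of freedom, the `propagate` and `understand` functions, and the `sense` callback (standing in for a local visual routine) are all hypothetical names chosen for illustration, and the sketch assumes a static scene in which any unconstrained degree of freedom counts as anomalous.

```python
from dataclasses import dataclass, field

# Assumed planar degrees of freedom for every part (illustrative only).
ALL_DOFS = frozenset({"x", "y", "rot"})

@dataclass
class Scene:
    parts: set = field(default_factory=set)
    grounded: set = field(default_factory=set)   # parts fixed to the world
    links: list = field(default_factory=list)    # (part_a, part_b, dofs_removed)

def propagate(scene):
    """Propagate constraints from grounded parts; return the free DOFs per part."""
    free = {p: set() if p in scene.grounded else set(ALL_DOFS) for p in scene.parts}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for a, b, removed in scene.links:
            for src, dst in ((a, b), (b, a)):
                for dof in removed:
                    # A link fixes a DOF only if the other end is already fixed in it.
                    if dof not in free[src] and dof in free[dst]:
                        free[dst].discard(dof)
                        changed = True
    return free

def understand(scene, sense, max_rounds=10):
    """Alternate analysis and directed sensing until no anomaly can be resolved."""
    for _ in range(max_rounds):
        free = propagate(scene)
        anomalies = sorted(p for p, dofs in free.items() if dofs)
        found = False
        for part in anomalies:
            # Sense only near parts with unexplained freedom.
            for link in sense(part):
                if link not in scene.links:
                    scene.parts.update(link[:2])
                    scene.links.append(link)
                    found = True
        if not found:
            return free
    return propagate(scene)

# Usage: a shelf that initially appears unsupported.
hidden = {
    "shelf": [("shelf", "bracket", ALL_DOFS)],
    "bracket": [("bracket", "wall", ALL_DOFS)],
}
scene = Scene(parts={"wall", "shelf"}, grounded={"wall"})
result = understand(scene, lambda part: hidden.get(part, []))
```

In this sketch, sensing is requested only for parts whose freedom remains unexplained, so the amount of sensing grows with the number of parts and links in the scene rather than with image size.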