Exploiting Physical Inconsistencies for 3D Scene Understanding

Reliable 3D object tracking can provide strong cues for scene understanding. In this paper, we exploit inconsistencies between measured 3D trajectories and the trajectories predicted by a physical model. In a set of proof-of-concept experiments, we show how to recover the camera rotation and translation, and how to detect surfaces that are hard to discern visually, simply by tracking a rigid object. Furthermore, we introduce a class distinction between active and passive objects, and prototype examples demonstrate that visual input is usable for this type of classification. In all the presented experiments, we obtain additional information about the scene, and a deeper understanding of it, that would not be available from the image measurements alone.
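
The core idea of comparing a measured 3D trajectory with a physical prediction can be illustrated with a minimal sketch. This is not the method of the paper; it assumes a point-mass free-flight model, metric units, gravity acting along the negative z axis, and nearly noise-free first samples. All names (`predict_ballistic`, `initial_velocity`, `first_inconsistency`) and the tolerance value are hypothetical.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2, z axis pointing up


def predict_ballistic(p0, v0, t):
    """Position after t seconds of free flight from p0 with velocity v0."""
    x, y, z = p0
    vx, vy, vz = v0
    return (x + vx * t, y + vy * t, z + vz * t - 0.5 * G * t * t)


def initial_velocity(p0, p1, dt):
    """Velocity at p0 that carries a free-flying point mass to p1 after dt."""
    return (
        (p1[0] - p0[0]) / dt,
        (p1[1] - p0[1]) / dt,
        (p1[2] - p0[2]) / dt + 0.5 * G * dt,  # compensate for gravity over dt
    )


def first_inconsistency(times, positions, tol=0.05):
    """Index of the first sample that deviates from the free-flight
    prediction by more than tol metres, or None if all samples agree."""
    t0, p0 = times[0], positions[0]
    v0 = initial_velocity(p0, positions[1], times[1] - t0)
    for i in range(2, len(times)):
        predicted = predict_ballistic(p0, v0, times[i] - t0)
        if math.dist(predicted, positions[i]) > tol:
            return i
    return None
```

Under these assumptions, a sample flagged by `first_inconsistency` marks the moment at which free flight no longer explains the measurement, for instance because the object hit a surface that is hard to see in the image or because it is actively propelled. A real system would estimate the initial state from many noisy samples instead of the first two.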
