Shape2Pose

As 3D acquisition devices and modeling tools become widely available, there is a growing need for automatic algorithms that analyze the semantics and functionality of digitized shapes. Most recent research has focused on analyzing the geometric structure of shapes. Our work is motivated by the observation that a majority of man-made shapes are designed to be used by people. Thus, in order to fully understand their semantics, one needs to answer a fundamental question: "how do people interact with these objects?" As an initial step towards this goal, we offer a novel algorithm for automatically predicting a static pose that a person would need to adopt in order to use an object. Specifically, given an input 3D shape, the goal of our analysis is to predict a corresponding human pose, including contact points and kinematic parameters. This is especially challenging for man-made objects, which commonly exhibit large variance in their geometric structure. We address this challenge by observing that contact points usually share consistent local geometric features related to the anthropometric properties of the corresponding body parts, and that the human body is subject to kinematic constraints and priors. Accordingly, our method combines local region classification with a global, kinematically constrained search to predict poses for a variety of objects. We evaluate our algorithm on six diverse collections of 3D polygonal models (chairs, gym equipment, cockpits, carts, bicycles, and bipedal devices) containing a total of 147 models. Finally, we demonstrate that the poses predicted by our algorithm can be used in several shape analysis problems, such as establishing correspondences between objects, detecting salient regions, finding informative viewpoints, and retrieving functionally similar shapes.
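The combination of local classification and constrained global search described above can be illustrated with a toy sketch. This is not the authors' implementation: the function name `best_contacts`, the per-point scores (standing in for a local region classifier's output), and the pairwise distance limits (standing in for kinematic constraints) are all assumptions made for illustration, and an exhaustive search replaces whatever search strategy the paper actually uses.

```python
from itertools import product
from math import dist


def best_contacts(candidates, max_reach):
    """Pick one contact point per body part (toy example, hypothetical names).

    candidates: dict mapping a body part to a list of (point, score) pairs,
                where score stands in for a local classifier's confidence.
    max_reach:  dict mapping (part_a, part_b) to the largest allowed distance
                between their contact points, a crude kinematic constraint.
    Returns (assignment, total_score); assignment is None if nothing is feasible.
    """
    parts = sorted(candidates)
    best_score, best_assignment = float("-inf"), None
    # Exhaustively try every combination of one candidate per body part.
    for combo in product(*(candidates[p] for p in parts)):
        assignment = {p: point for p, (point, _) in zip(parts, combo)}
        # Reject combinations that violate any pairwise reach limit.
        feasible = all(
            dist(assignment[a], assignment[b]) <= limit
            for (a, b), limit in max_reach.items()
        )
        if not feasible:
            continue
        score = sum(s for _, s in combo)
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score


# Made-up candidates: the highest-scoring points are too far apart to be
# reached by one body, so the constraint forces a lower-scoring pair.
candidates = {
    "pelvis": [((0.0, 0.0, 0.5), 0.9), ((0.0, 0.0, 2.0), 0.95)],
    "hand": [((0.3, 0.0, 0.7), 0.8), ((3.0, 0.0, 0.7), 0.99)],
}
max_reach = {("pelvis", "hand"): 0.8}
pose, score = best_contacts(candidates, max_reach)
```

In this made-up input, the two highest-scoring candidates are rejected because they lie farther apart than the reach limit, which is the kind of case where a global constraint overrides purely local evidence.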
