Learning how objects function via co-analysis of interactions

We introduce a co-analysis method that learns a functionality model for an object category, e.g., strollers or backpacks. Like previous works on functionality, we analyze object-to-object interactions and intra-object properties and relations. Unlike previous works, our model goes beyond providing a functionality-oriented descriptor for a single object; it prototypes the functionality of a category of 3D objects by co-analyzing typical interactions involving objects from the category. Furthermore, our co-analysis localizes the studied properties to the specific locations, or surface patches, that support specific functionalities, and then integrates the patch-level properties into a category functionality model. Thus our model focuses on the how, via common interactions, and the where, via patch localization, of functionality analysis. Given a collection of 3D objects belonging to the same category, with each object provided within a scene context, our co-analysis yields a set of proto-patches, each of which is a patch prototype supporting a specific type of interaction, e.g., a stroller handle held by a hand. The learned category functionality model is composed of the proto-patches, along with their pairwise relations, which together summarize the functional properties of all the patches that appear in the input object category. With the learned functionality models for various object categories serving as a knowledge base, we are able to form a functional understanding of an individual 3D object without a scene context. With patch localization in the model, functionality-aware modeling, e.g., functional object enhancement and the creation of functional object hybrids, is made possible.
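As a rough illustration of the structure described above, the following minimal sketch represents a category functionality model as a set of proto-patches plus pairwise relations between them. It is not the authors' implementation: the class names, the `properties` and `pairwise` fields, and all numeric values are hypothetical placeholders chosen only to show how such a model could be organized.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ProtoPatch:
    """A patch prototype supporting one type of interaction."""
    interaction: str  # e.g. "handle held by hand"
    # Hypothetical patch-level properties; the names are placeholders.
    properties: Dict[str, float] = field(default_factory=dict)


@dataclass
class CategoryFunctionalityModel:
    """Proto-patches of one object category plus their pairwise relations."""
    category: str
    proto_patches: List[ProtoPatch] = field(default_factory=list)
    # Relations keyed by an ordered pair (i, j), i < j, of proto-patch indices.
    pairwise: Dict[Tuple[int, int], Dict[str, float]] = field(default_factory=dict)

    def add_patch(self, patch: ProtoPatch) -> int:
        """Store a proto-patch and return its index."""
        self.proto_patches.append(patch)
        return len(self.proto_patches) - 1

    def relate(self, i: int, j: int, **relation: float) -> None:
        """Record a relation between two distinct proto-patches."""
        if i == j:
            raise ValueError("a pairwise relation needs two distinct patches")
        self.pairwise[(min(i, j), max(i, j))] = dict(relation)

    def interactions(self) -> List[str]:
        """List the interaction types the category supports."""
        return [p.interaction for p in self.proto_patches]


# Example: a toy stroller model with made-up property values.
model = CategoryFunctionalityModel(category="stroller")
handle = model.add_patch(ProtoPatch("handle held by hand", {"height": 1.0}))
seat = model.add_patch(ProtoPatch("seat supporting child", {"height": 0.5}))
model.relate(seat, handle, distance=0.6)
print(model.interactions())
```

Keying the relations by an ordered index pair keeps each relation stored once, regardless of the order in which the two proto-patches are passed.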
