Deep Virtual Markers for Articulated 3D Shapes

We propose deep virtual markers, a framework for estimating dense and accurate positional information for various types of 3D data. The framework maps 3D points of articulated 3D models, such as humans, to virtual marker labels. To realize it, we adopt a sparse convolutional neural network that classifies the 3D points of an articulated model into virtual marker labels. We propose to train the classifier with soft labels based on geodesic distance so that it learns rich and dense interclass relationships. To measure the localization accuracy of the virtual markers, we evaluate on the FAUST challenge, where our results outperform the state-of-the-art methods. We also observe outstanding performance in generalizability tests, on unseen data, and across different 3D data types (meshes and depth maps). We show additional applications of the estimated virtual markers, such as non-rigid registration, texture transfer, and real-time dense marker prediction from depth maps.
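To make the idea of geodesic soft labels concrete, the following is a minimal sketch and not the formulation used in the paper, which the abstract does not specify. It assumes that geodesic distance is approximated by shortest paths along mesh edges (Dijkstra), that each vertex's soft label is a normalized Gaussian of its distance to every marker, and that the mesh is connected. The names `geodesic_distances`, `soft_labels`, and the bandwidth `sigma` are hypothetical.

```python
import heapq
import math


def geodesic_distances(num_vertices, edges, source):
    """Shortest-path distances along mesh edges from `source` (Dijkstra).

    This is only an edge-graph approximation of true geodesic distance.
    `edges` is a list of (i, j, length) tuples for an undirected graph.
    """
    adjacency = [[] for _ in range(num_vertices)]
    for i, j, length in edges:
        adjacency[i].append((j, length))
        adjacency[j].append((i, length))

    dist = [math.inf] * num_vertices
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale queue entry
        for w, length in adjacency[v]:
            nd = d + length
            if nd < dist[w]:
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


def soft_labels(num_vertices, edges, marker_vertices, sigma=1.0):
    """Per-vertex distribution over markers (assumed Gaussian kernel).

    Row v holds one weight per marker; weights sum to 1 and decay with
    the approximate geodesic distance from vertex v to that marker.
    Assumes every vertex is connected to at least one marker.
    """
    per_marker = [geodesic_distances(num_vertices, edges, m)
                  for m in marker_vertices]
    labels = []
    for v in range(num_vertices):
        weights = [math.exp(-(d[v] ** 2) / (2.0 * sigma ** 2))
                   for d in per_marker]
        total = sum(weights)
        labels.append([w / total for w in weights])
    return labels


# Toy example: a chain of 5 vertices with unit-length edges and
# hypothetical markers placed at both ends (vertices 0 and 4).
chain_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
labels = soft_labels(5, chain_edges, marker_vertices=[0, 4], sigma=1.0)
```

In this toy chain the middle vertex is equidistant from both markers, so its soft label splits evenly between them, while vertices near an end put most of their weight on the closer marker. A classifier trained against such targets (for example with a cross-entropy loss) is told not only which marker is nearest but also how close the other markers are, which is the interclass relationship the abstract refers to.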
