Lions and Tigers and Bears: Capturing Non-rigid, 3D, Articulated Shape from Images

Animals are widespread in nature and the analysis of their shape and motion is important in many fields and industries. Modeling 3D animal shape, however, is difficult because the 3D scanning methods used to capture human shape are not applicable to wild animals or natural settings. Consequently, we propose a method to capture the detailed 3D shape of animals from images alone. The articulated and deformable nature of animals makes this problem extremely challenging, particularly in unconstrained environments with moving and uncalibrated cameras. To make this possible, we use a strong prior model of articulated animal shape that we fit to the image data. We then deform the animal shape in a canonical reference pose such that it matches image evidence when articulated and projected into multiple images. Our method extracts significantly more 3D shape detail than previous methods and is able to model new species, including the shape of an extinct animal, using only a few video frames. Additionally, the projected 3D shapes are accurate enough to facilitate the extraction of a realistic texture map from multiple frames.

[1]  Marc Alexa,et al.  As-rigid-as-possible surface modeling , 2007, Symposium on Geometry Processing.

[2]  Adrian Hilton,et al.  Surface Capture for Performance-Based Animation , 2007, IEEE Computer Graphics and Applications.

[3]  Fiora Pirri,et al.  Component-Wise Modeling of Articulated Objects , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[4]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH 2005.

[5]  C. Leung,et al.  Animating animal motion from still , 2008, SIGGRAPH 2008.

[6]  Stuart Geman,et al.  Statistical methods for tomographic image reconstruction , 1987 .

[7]  Marie-Paule Cani,et al.  Animal gaits from video , 2004, SCA '04.

[8]  Aaron Hertzmann,et al.  Eurographics/ Acm Siggraph Symposium on Computer Animation (2006) Learning a Correlated Model of Identity and Pose-dependent Body Shape Variation for Real-time Synthesis , 2022 .

[9]  D. A. Field Laplacian smoothing and Delaunay triangulations , 1988 .

[10]  Andrew W. Fitzgibbon,et al.  What Shape Are Dolphins? Building 3D Morphable Models from 2D Images , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Ronen Basri,et al.  Learning 3D Deformation of Animals from 2D Images , 2015, Comput. Graph. Forum.

[12]  Aaron Hertzmann,et al.  Nonrigid Structure-from-Motion: Estimating Shape and Motion with Hierarchical Priors , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Abhinav Gupta,et al.  Learning a Predictable and Generative Vector Representation for Objects , 2016, ECCV.

[14]  Emiliano Gambaretto,et al.  Markerless Motion Capture through Visual Hull, Articulated ICP and Subject Specific Model Generation , 2010, International Journal of Computer Vision.

[15]  Michael J. Black,et al.  Dynamic FAUST: Registering Human Bodies in Motion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Hans-Peter Seidel,et al.  A Statistical Model of Human Pose and Body Shape , 2009, Comput. Graph. Forum.

[17]  Steven M. Seitz,et al.  Photo tourism: exploring photo collections in 3D , 2006, ACM Trans. Graph..

[18]  Michael J. Black,et al.  SMPL: A Skinned Multi-Person Linear Model , 2023 .

[19]  Lourdes Agapito,et al.  Balloon Shapes: Reconstructing and Deforming Objects with Volume from Images , 2013, 2013 International Conference on 3D Vision.

[20]  Hans-Peter Seidel,et al.  Animated 3D Creatures from Single-view Video by Skeletal Sketching , 2016, Graphics Interface.

[21]  David A. Forsyth,et al.  Building models of animals from video , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Slobodan Ilic,et al.  Probabilistic Deformable Surface Tracking from Multiple Videos , 2010, ECCV.

[23]  Hans-Peter Seidel,et al.  Performance capture from sparse multi-view video , 2008, ACM Trans. Graph..

[24]  Silvio Savarese,et al.  3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction , 2016, ECCV.

[25]  Kiriakos N. Kutulakos,et al.  A Theory of Shape by Space Carving , 2000, International Journal of Computer Vision.

[26]  Michael J. Black,et al.  3D Menagerie: Modeling the 3D Shape and Pose of Animals , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Michael J. Black,et al.  Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[28]  Hans-Peter Seidel,et al.  Model-Based Outdoor Performance Capture , 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[29]  Henning Biermann,et al.  Recovering non-rigid 3D shape from image streams , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[30]  Thomas Brox,et al.  Learning to Generate Chairs, Tables and Cars with Convolutional Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.