Paper information - Fast deformable model-based human performance capture and FVV using consumer-grade RGB-D sensors

Fast deformable model-based human performance capture and FVV using consumer-grade RGB-D sensors

Abstract: In this paper, a novel end-to-end system is proposed for the fast reconstruction of human actor performances into 3D mesh sequences, using the input from a small set of consumer-grade RGB-Depth sensors. By pre-reconstructing a deformable 3D model of the actor offline and employing it to constrain the online reconstruction process, the proposed framework implicitly tracks the human motion. By handling non-rigid deformation of the 3D surface and applying appropriate texture mapping, it produces a dynamic sequence of temporally coherent textured meshes, enabling realistic Free Viewpoint Video (FVV). Given the noisy input from a small set of low-cost sensors, the focus is on fast (“quick-post”), robust and fully automatic performance reconstruction. Apart from integrating existing ideas into a complete end-to-end system, which is itself a challenging task, several novel technical advances contribute to the speed, robustness and fidelity of the system, including a layered approach for model-based pose tracking, the definition and use of sophisticated energy functions that are parallelizable on the GPU, and a new texture mapping scheme. Experimental results on a large number of challenging sequences, and comparisons with model-based and model-free approaches, demonstrate the efficiency of the proposed approach.

[1] Marc Alexa,et al. As-rigid-as-possible surface modeling , 2007, Symposium on Geometry Processing.

[2] Qionghai Dai,et al. Performance Capture of Interacting Characters with Handheld Kinects , 2012, ECCV.

[3] Takeo Kanade,et al. Virtualized Reality: Constructing Virtual Worlds from Real Scenes , 1997, IEEE Multim..

[4] Henry Fuchs,et al. Real-time volumetric 3D capture of room-sized scenes for telepresence , 2012, 2012 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON).

[5] Paolo Cignoni,et al. Metro: Measuring Error on Simplified Surfaces , 1998, Comput. Graph. Forum.

[6] Marc Levoy,et al. Zippered polygon meshes from range images , 1994, SIGGRAPH.

[7] Bodo Rosenhahn,et al. Ball joints for Marker-less human Motion Capture , 2009, 2009 Workshop on Applications of Computer Vision (WACV).

[8] William H. Press,et al. Numerical Recipes in FORTRAN - The Art of Scientific Computing, 2nd Edition , 1987 .

[9] D. Marquardt. An Algorithm for Least-Squares Estimation of Nonlinear Parameters , 1963 .

[10] O. Sorkine. Differential Representations for Mesh Processing , 2006 .

[11] David Coeurjolly,et al. Optimal Separable Algorithms to Compute the Reverse Euclidean Distance Transformation and Discrete Medial Axis in Arbitrary Dimension , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12] Andrew W. Fitzgibbon,et al. KinectFusion: Real-time dense surface mapping and tracking , 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[13] F. Sebastian Grassia,et al. Practical Parameterization of Rotations Using the Exponential Map , 1998, J. Graphics, GPU, & Game Tools.

[14] Olga Sorkine-Hornung,et al. On Linear Variational Surface Deformation Methods , 2008, IEEE Transactions on Visualization and Computer Graphics.

[15] Hans-Peter Seidel,et al. Clustered Stochastic Optimization for Object Recognition and Pose Estimation , 2007, DAGM-Symposium.

[16] Petros Daras,et al. An Integrated Platform for Live 3D Human Reconstruction and Motion Capturing , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[17] Christian Theobalt,et al. On-set performance capture of multiple actors with a stereo camera , 2013, ACM Trans. Graph..

[18] Hans-Peter Seidel,et al. Motion capture using joint skeleton tracking and surface estimation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[19] Christian Rössl,et al. Dense correspondence finding for parametrization-free animation reconstruction from video , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[20] Jitendra Malik,et al. Twist Based Acquisition and Tracking of Animal and Human Kinematics , 2004, International Journal of Computer Vision.

[21] Hans-Peter Seidel,et al. Performance capture from sparse multi-view video , 2008, SIGGRAPH 2008.

[22] John Darby,et al. Tracking human pose with multiple activity models , 2010, Pattern Recognit..

[23] Hong Zhou,et al. Accurate integration of multi-view range images using k-means clustering , 2008, Pattern Recognit..

[24] Albert Dipanda,et al. Towards a real-time 3D shape reconstruction using a structured light system , 2005, Pattern Recognit..

[25] Roberto Cipolla,et al. Multiview Stereo via Volumetric Graph-Cuts and Occlusion Robust Photo-Consistency , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26] Emiliano Gambaretto,et al. Markerless Motion Capture through Visual Hull, Articulated ICP and Subject Specific Model Generation , 2010, International Journal of Computer Vision.

[27] Aljoscha Smolic,et al. 3D video and free viewpoint video - From capture to display , 2011, Pattern Recognit..

[28] Hans-Peter Seidel,et al. Interacting and Annealing Particle Filters: Mathematics and a Recipe for Applications , 2007, Journal of Mathematical Imaging and Vision.

[29] Ruigang Yang,et al. Real-Time Simultaneous Pose and Shape Estimation for Articulated Objects Using a Single Depth Camera , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30] Xiaojun Wu,et al. Real-time dynamic 3-D object shape reconstruction and high-fidelity texture mapping for 3-D video , 2004, IEEE Transactions on Circuits and Systems for Video Technology.

[31] Kiriakos N. Kutulakos,et al. A Theory of Shape by Space Carving , 2000, International Journal of Computer Vision.

[32] Dieter Fox,et al. DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33] Wojciech Matusik,et al. Articulated mesh animation from multi-view silhouettes , 2008, ACM Trans. Graph..

[34] Alvaro Collet,et al. High-quality streamable free-viewpoint video , 2015, ACM Trans. Graph..

[35] Horst Bischof,et al. Simultaneous Shape and Pose Adaption of Articulated Models Using Linear Optimization , 2012, ECCV.

[36] Hans-Peter Seidel,et al. Optimization and Filtering for Human Motion Capture , 2010, International Journal of Computer Vision.

[37] Yee-Hong Yang,et al. Robust multi-view L2 triangulation via optimal inlier selection and 3D structure refinement , 2014, Pattern Recognit..

[38] Bodo Rosenhahn,et al. Combined Region- and Motion-Based 3D Tracking of Rigid and Articulated Objects , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39] Ilya Baran,et al. Automatic rigging and animation of 3D characters , 2007, SIGGRAPH 2007.

[40] S. Shankar Sastry,et al. A mathematical introduction to robotic manipulation , 1994 .

[41] Matthias Nießner,et al. VolumeDeform: Real-Time Volumetric Non-rigid Reconstruction , 2016, ECCV.

[42] Titus B. Zaharia,et al. FAMC: The MPEG-4 standard for Animated Mesh Compression , 2008, 2008 15th IEEE International Conference on Image Processing.

[43] Juergen Gall,et al. Optimization and Filtering for Human Motion Capture: A Multi-layer Framework , 2010, International Journal of Computer Vision.

[44] Jean Ponce,et al. Carved Visual Hulls for Image-Based Modeling , 2006, International Journal of Computer Vision.

[45] Michael M. Kazhdan,et al. Reconstruction of solid models from oriented point sets , 2005, SGP '05.

[46] Federico Tombari,et al. Semantic parametric body shape estimation from noisy depth sequences , 2016, Robotics Auton. Syst..

[47] Mark R. Stevens,et al. Methods for Volumetric Reconstruction of Visual Scenes , 2004, International Journal of Computer Vision.

[48] [author and title unreadable in source] , 1991 .

[49] Marc Levoy,et al. A volumetric method for building complex models from range images , 1996, SIGGRAPH.

[50] Petros Daras,et al. Real-Time, Full 3-D Reconstruction of Moving Foreground Objects From Multiple Consumer Depth Cameras , 2013, IEEE Transactions on Multimedia.

[51] Adrian Hilton,et al. Visual Analysis of Humans - Looking at People , 2013 .

[52] Seong-Whan Lee,et al. Reconstruction of 3D human body pose from stereo image sequences based on top-down learning , 2007, Pattern Recognit..

[53] Adrien Bartoli,et al. Fast Explicit Diffusion for Accelerated Features in Nonlinear Scale Spaces , 2013, BMVC.

[54] Ruzena Bajcsy,et al. High-Quality Visualization for Geographically Distributed 3-D Teleimmersive Applications , 2011, IEEE Transactions on Multimedia.

[55] Richard Szeliski,et al. High-quality video view interpolation using a layered representation , 2004, SIGGRAPH 2004.

[56] Petros Daras,et al. Real-time, realistic full-body 3D reconstruction and texture mapping from multiple Kinects , 2013, IVMSP 2013.

[57] Andrew W. Fitzgibbon,et al. Real-time non-rigid reconstruction using an RGB-D camera , 2014, ACM Trans. Graph..

[58] Michael M. Kazhdan,et al. Poisson surface reconstruction , 2006, SGP '06.

[59] S. Goldsack,et al. IN REAL-TIME , 2008 .

[60] Horst Bischof,et al. Rapid Skin: Estimating the 3D Human Pose and Shape in Real-Time , 2012, 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission.

[61] Petros Daras,et al. Toward Real-Time and Efficient Compression of Human Time-Varying Meshes , 2014, IEEE Transactions on Circuits and Systems for Video Technology.

[62] Xu Zhao,et al. Generative tracking of 3D human motion by hierarchical annealed genetic algorithm , 2008, Pattern Recognit..

[63] Hans-Peter Seidel,et al. Free-viewpoint video of human actors , 2003, ACM Trans. Graph..

[64] Jan-Michael Frahm,et al. Scanning and tracking dynamic objects with commodity depth cameras , 2013, 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[65] Edmond Boyer,et al. Efficient Polyhedral Modeling from Silhouettes , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[66] Prabhu Kaliamoorthi,et al. Parametric annealing: A stochastic search method for human pose tracking , 2013, Pattern Recognit..

[67] Christian Rössl,et al. Template Deformation for Point Cloud Fitting , 2006, Eurographics Symposium on Point-Based Graphics.

[68] Gloria Haro. Shape from Silhouette Consensus , 2012, Pattern Recognit..

[69] Surya Prakash,et al. A semi-supervised approach to space carving , 2010, Pattern Recognit..