Paper Information - Dynamic Appearance Modelling from Minimal Cameras

We present a novel method for modelling dynamic texture appearance from a minimal set of cameras. Previous methods to capture the dynamic appearance of a human from multi-view video have relied on large, expensive camera setups, and typically store texture on a frame-by-frame basis. We fit a parameterised human body model to multi-view video from minimal cameras (as few as 3), and combine the partial texture observations from multiple viewpoints and frames in a learned framework to generate full-body textures with dynamic details given an input pose. Key to our method are our multi-band loss functions, which apply separate blending functions to the high and low spatial frequencies to reduce texture artefacts. We evaluate our method on a range of multi-view datasets, and show that our model is able to accurately produce full-body dynamic textures, even with only partial camera coverage. We demonstrate that our method outperforms other texture generation methods on minimal camera setups.
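The multi-band idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the filters, weights, or loss terms, so everything below is an assumption made for illustration: a Gaussian low-pass filter defines the low band, the residual defines the high band, and each band gets its own per-texel weight map in an L1 loss. The names `low_pass`, `split_bands`, `multi_band_loss`, `w_low`, and `w_high` are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(sigma, radius):
    """Normalised 1D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def low_pass(tex, sigma=2.0):
    """Separable Gaussian blur of an (H, W, C) texture, with edge padding."""
    radius = int(3 * sigma)
    k = gaussian_kernel(sigma, radius)
    out = tex.astype(np.float64)
    for axis in (0, 1):  # blur rows, then columns
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="valid"), axis, padded
        )
    return out


def split_bands(tex, sigma=2.0):
    """Split a texture into low and high bands that sum back to the input."""
    low = low_pass(tex, sigma)
    high = tex - low
    return low, high


def weighted_l1(a, b, w):
    """Mean absolute error between (H, W, C) arrays, weighted per texel by (H, W) map w."""
    diff = np.abs(a - b) * w[..., None]
    denom = max(w.sum() * a.shape[2], 1e-8)  # guard against an all-zero weight map
    return diff.sum() / denom


def multi_band_loss(pred, target, w_low, w_high, sigma=2.0):
    """Assumed form: one weighted L1 term per frequency band."""
    p_low, p_high = split_bands(pred, sigma)
    t_low, t_high = split_bands(target, sigma)
    return weighted_l1(p_low, t_low, w_low) + weighted_l1(p_high, t_high, w_high)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((32, 32, 3))
    pred = target + 0.05 * rng.standard_normal(target.shape)
    soft = np.ones((32, 32))                       # smooth weights for the low band
    hard = (rng.random((32, 32)) > 0.5).astype(float)  # binary weights for the high band
    print(multi_band_loss(pred, target, soft, hard))
```

In this sketch the low band would typically use a smooth weight map, so colour and lighting differences between views blend gradually, while the high band would use a sharper, near-binary map so fine detail is not averaged into blur. That pairing follows the general logic of multi-band blending and is not a claim about the weights used in the paper.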

Jean-Yves Guillemaut | Adrian Hilton | Lewis Bridgeman
