Paper Information - Temporally Coherent General Dynamic Scene Reconstruction

Temporally Coherent General Dynamic Scene Reconstruction

Existing techniques for dynamic scene reconstruction from multiple wide-baseline cameras primarily focus on reconstruction in controlled environments, with fixed calibrated cameras and strong prior constraints. This paper introduces a general approach to obtain a 4D representation of complex dynamic scenes from multi-view wide-baseline static or moving cameras without prior knowledge of the scene structure, appearance, or illumination. The contributions of the work are: an automatic method for initial coarse reconstruction to initialize joint estimation; sparse-to-dense temporal correspondence integrated with joint multi-view segmentation and reconstruction to introduce temporal coherence; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes by introducing a shape constraint. Comparison with state-of-the-art approaches on a variety of complex indoor and outdoor scenes demonstrates improved accuracy in both multi-view segmentation and dense reconstruction. The paper demonstrates unsupervised reconstruction of complete temporally coherent 4D scene models with improved non-rigid object segmentation and shape reconstruction, and shows their use in applications such as free-view rendering and virtual reality.
