Recovering 3D Planar Arrangements from Videos

Acquiring the 3D geometry of real-world objects has many applications in 3D digitization, such as navigation and content generation in virtual environments. Images remain one of the most popular media for such visual tasks because they are simple to acquire. Traditional image-based 3D reconstruction approaches rely heavily on point-to-point correspondences among multiple images to estimate camera motion and 3D geometry. Establishing these correspondences lies at the center of the 3D reconstruction pipeline, yet it is prone to errors. In this paper, we propose an optimization framework that traces image points using a novel structure-guided dynamic tracking algorithm and estimates both the camera motion and a 3D structure model by enforcing a set of planar constraints. The key to our method is a structure model represented as a set of planes and their arrangements. Constraints derived from the structure model are used in both the correspondence establishment stage and the bundle adjustment stage of our reconstruction pipeline. Experiments show that our algorithm can effectively localize structure correspondences across dense image frames while faithfully reconstructing the camera motion and the underlying structured 3D model.
