User Directed Multi-view-stereo

Depth reconstruction from video footage and image collections is a fundamental part of many modelling and image-based rendering applications. However, real-world scenes often contain limited texture, repeated elements and other ambiguities that remain challenging for fully automatic algorithms. This paper presents a technique that combines intuitive user constraints with dense multi-view stereo reconstruction. By providing annotations in the form of simple paint strokes, a user can guide a multi-view stereo algorithm and avoid common failure cases. We show how smoothness, discontinuity and depth-ordering constraints can be incorporated directly into a variational optimization framework for multi-view stereo. Our method avoids heuristic approaches that edit a depth map in a sequential process, and it does not require the user to accurately segment object boundaries or to model geometry directly. We show how, with a small amount of intuitive input, a user can create improved depth maps in cases that are challenging for multi-view stereo.
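To make the idea of stroke-driven constraints concrete, the following is a minimal one-dimensional sketch and not the paper's formulation, which the abstract does not specify. It assumes a quadratic data term and a quadratic smoothness term solved by projected Gauss-Seidel. The function `refine_depth`, the per-edge `weights`, the `upper` bounds and the toy depth values are all hypothetical names and data introduced for illustration. A discontinuity stroke is modelled by setting an edge weight to zero, a smoothness stroke by raising it, and a depth-ordering stroke is crudely approximated by capping a pixel's depth.

```python
def refine_depth(noisy, weights, lam=5.0, upper=None, iters=500):
    """Minimise sum_i (d_i - noisy_i)^2 + lam * sum_i w_i * (d_{i+1} - d_i)^2.

    weights[i] couples pixels i and i+1 (assumed to be set from user strokes).
    upper maps a pixel index to a maximum allowed depth (a simplified stand-in
    for a depth-ordering constraint). Solved by projected Gauss-Seidel sweeps.
    """
    d = list(noisy)
    n = len(d)
    upper = upper or {}
    for _ in range(iters):
        for i in range(n):
            num, den = noisy[i], 1.0
            if i > 0:                      # coupling to the left neighbour
                num += lam * weights[i - 1] * d[i - 1]
                den += lam * weights[i - 1]
            if i < n - 1:                  # coupling to the right neighbour
                num += lam * weights[i] * d[i + 1]
                den += lam * weights[i]
            d[i] = num / den               # exact minimiser for pixel i alone
            if i in upper:                 # project onto the depth bound
                d[i] = min(d[i], upper[i])
    return d


# Hypothetical scene: a near surface (depth about 1) beside a far one (about 5).
noisy = [1.2, 0.8, 1.1, 0.9, 5.1, 4.9, 5.2, 4.8]
uniform = [1.0] * 7        # no user input: smooth across every edge
cut = list(uniform)
cut[3] = 0.0               # discontinuity stroke between pixels 3 and 4

blurred = refine_depth(noisy, uniform)   # depth edge is smeared
guided = refine_depth(noisy, cut)        # depth edge is preserved
capped = refine_depth(noisy, uniform, upper={0: 0.5})
```

Without the stroke, the smoothness term smears the depth edge across neighbouring pixels; with the edge weight set to zero, the two surfaces are smoothed independently and the step survives. A real system would work on 2D depth maps with a photo-consistency data term over multiple views.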
