Calipso: physics-based image and video editing through CAD model proxies

We present Calipso, an interactive method for editing images and videos in a physically coherent manner. Our main idea is to realize physics-based manipulations by running a full-physics simulation on proxy geometries given by non-rigidly aligned CAD models. Running these simulations allows us to apply new, unseen forces to move or deform selected objects, change physical parameters such as mass or elasticity, or even add entirely new objects that interact with the rest of the underlying scene. In our method, the user makes edits directly in 3D; these edits are processed by the simulation and then transferred to the target 2D content using shape-to-image correspondences in a photo-realistic rendering process. To align the CAD models, we introduce an efficient CAD-to-image alignment procedure that jointly optimizes the rigid and non-rigid alignment while preserving the high-level structure of the input shape. Moreover, the user can choose to exploit image flow to estimate scene motion, producing coherent physical behavior with ambient dynamics. We demonstrate physics-based editing on a wide range of examples, producing a variety of physical behaviors while preserving geometric and visual consistency.
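The transfer of 3D edits to 2D content can be illustrated with a small sketch. It is not the rendering process used in Calipso; it only assumes a simple pinhole camera and shows how proxy vertices displaced by a simulation step would map to per-vertex displacements in the image. The function names (`project`, `image_displacements`) and the camera parameters (`focal`, `cx`, `cy`) are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]


def project(p: Vec3, focal: float, cx: float, cy: float) -> Vec2:
    """Pinhole projection of a camera-space point to pixel coordinates.

    Assumption: the camera looks down +z and the point has z > 0.
    """
    x, y, z = p
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal * x / z + cx, focal * y / z + cy)


def image_displacements(
    before: List[Vec3],
    after: List[Vec3],
    focal: float,
    cx: float,
    cy: float,
) -> List[Vec2]:
    """2D displacement of each proxy vertex between two simulation states.

    `before` and `after` hold the same vertices in the same order
    (hypothetical output of a physics step on the aligned proxy).
    """
    if len(before) != len(after):
        raise ValueError("vertex lists must have the same length")
    result = []
    for p0, p1 in zip(before, after):
        u0, v0 = project(p0, focal, cx, cy)
        u1, v1 = project(p1, focal, cx, cy)
        result.append((u1 - u0, v1 - v0))
    return result


# A vertex 2 units in front of the camera moves 1 unit along x.
disp = image_displacements(
    before=[(0.0, 0.0, 2.0)],
    after=[(1.0, 0.0, 2.0)],
    focal=100.0,
    cx=50.0,
    cy=50.0,
)
```

In this sketch the resulting displacements could drive an image warp of the pixels associated with each vertex; how the correspondences are established and how the final image is rendered is left to the method itself.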
