TrackCam: 3D-aware tracking shots from consumer video

Panning and tracking shots are popular photography techniques in which the camera follows a moving object and keeps it at the same position in the frame. The result is an image in which the moving foreground is sharp while the background is blurred along the direction of relative motion, creating an artistic illustration of the foreground motion. Such shots, however, are hard to capture even for professionals, especially when the foreground motion is complex (e.g., non-linear motion trajectories). In this work we propose a system to generate realistic, 3D-aware tracking shots from consumer videos. We show how computer vision techniques such as segmentation and structure-from-motion can be used to lower the barrier and help novice users create high-quality tracking shots that are physically plausible. We also introduce a pseudo-3D approach for relative depth estimation that avoids expensive 3D reconstruction, improving robustness and widening the range of application. We validate our system through extensive quantitative and qualitative evaluations.
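To make the basic effect concrete, the following is a minimal sketch of how a tracking shot can be approximated from video frames: each frame is shifted so the tracked object stays at a fixed position, and the shifted frames are averaged. This is a purely 2D simplification for illustration and is not the 3D-aware method proposed here; it assumes the per-frame object positions are already known, and the function name `synthesize_tracking_shot` and its arguments are hypothetical.

```python
def synthesize_tracking_shot(frames, positions, ref_index=0):
    """Average frames after shifting each so the tracked object stays put.

    frames: list of 2D lists (rows of grayscale values), all the same size.
    positions: list of integer (x, y) object positions, one per frame.
    Illustrative 2D approximation only; depth is ignored.
    """
    height, width = len(frames[0]), len(frames[0][0])
    ref_x, ref_y = positions[ref_index]
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for frame, (px, py) in zip(frames, positions):
        # Shift that moves the object onto its reference position.
        dx, dy = ref_x - px, ref_y - py
        for y in range(height):
            sy = y - dy
            if not 0 <= sy < height:
                continue
            for x in range(width):
                sx = x - dx
                if 0 <= sx < width:
                    total[y][x] += frame[sy][sx]
                    count[y][x] += 1
    # Average only over the frames that actually covered each pixel.
    return [
        [total[y][x] / count[y][x] if count[y][x] else 0.0 for x in range(width)]
        for y in range(height)
    ]


# Toy one-row "video": an object (255) moves right, a background mark (100) is static.
frames = [
    [[0, 255, 0, 0, 0, 0, 100, 0]],
    [[0, 0, 255, 0, 0, 0, 100, 0]],
    [[0, 0, 0, 255, 0, 0, 100, 0]],
]
positions = [(1, 0), (2, 0), (3, 0)]
shot = synthesize_tracking_shot(frames, positions)
print(shot[0])
```

In the toy output the object keeps its full intensity at its reference position, while the static background mark is spread over several pixels, which is the sharp-foreground, blurred-background look described above. A uniform 2D shift applies the same blur to every background pixel regardless of depth, which is the limitation a 3D-aware approach is meant to address.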
