Geometry Reconstruction of Players for Novel-View Synthesis of Sports Broadcasts

In this chapter, we present two methods for geometric reconstruction of players in standard sports broadcasts, specifically designed to let the broadcast director generate novel views from locations where there is no physical camera (novel-view synthesis). This significantly broadens the director's creative freedom and enhances the viewing experience. First, we propose a data-driven method based on multi-view body pose estimation. This method can operate in uncontrolled environments with loosely calibrated, low-resolution cameras, and without restrictive assumptions on the family of possible poses or motions. Second, we propose a scalable, top-down, patch-based method that reconstructs the geometry of the players adaptively, based on the amount of detail available in the video streams. The two methods are complementary and together provide a more complete set of tools for novel-view synthesis of sports broadcasts.
