Data Driven 2-D-to-3-D Video Conversion for Soccer

Wide adoption of 3-D video is hindered by the lack of high-quality 3-D content. One promising solution is data-driven 2-D-to-3-D video conversion, in which depth maps are learned from a large dataset of 2-D+Depth images. However, current conversion methods, while general, produce low-quality results with artifacts that are not acceptable to many viewers. We propose a novel, data-driven method for 2-D-to-3-D video conversion that transfers depth gradients from a large database of 2-D+Depth images. Capturing 2-D+Depth databases, however, is complex and costly, especially for outdoor sports games. We address this problem by creating a synthetic database from computer games and showing that this synthetic database can effectively be used to convert real videos. We propose a spatio-temporal method to ensure the smoothness of the generated depth within individual frames and across successive frames. In addition, we present an object boundary detection method customized for 2-D-to-3-D conversion systems, which produces clear depth boundaries for players. We implement our method and validate it through user studies that evaluate the depth perception and visual comfort of the converted 3-D videos. We show that our method produces high-quality 3-D videos that are almost indistinguishable from videos shot by stereo cameras. In addition, our method significantly outperforms the current state-of-the-art methods. For example, it improves perceived depth by up to 20%, which translates to raising the mean opinion score from good to excellent.
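To make the idea of working with depth gradients concrete, the following is a minimal sketch of how a depth map might be recovered once a gradient field is available. It is not the authors' implementation: the abstract does not say how depth is rebuilt from the transferred gradients, so the least-squares (Poisson-style) formulation, the Gauss-Seidel solver, the single anchored pixel, and the names `gradients` and `reconstruct_depth` are all assumptions made for illustration. The database lookup, spatio-temporal smoothing, and boundary detection steps are omitted.

```python
def gradients(depth):
    """Forward differences of a 2-D list of depth values: returns (gx, gy)."""
    h, w = len(depth), len(depth[0])
    gx = [[depth[i][j + 1] - depth[i][j] for j in range(w - 1)] for i in range(h)]
    gy = [[depth[i + 1][j] - depth[i][j] for j in range(w)] for i in range(h - 1)]
    return gx, gy


def reconstruct_depth(gx, gy, anchor=0.0, iterations=5000):
    """Recover a depth map whose forward differences best match (gx, gy).

    Hypothetical helper: solves the least-squares problem with Gauss-Seidel
    sweeps. Gradients fix depth only up to a constant, so pixel (0, 0) is
    pinned to `anchor`. Assumes at least 2 rows and 2 columns.
    """
    h, w = len(gx), len(gy[0])
    depth = [[anchor] * w for _ in range(h)]
    for _ in range(iterations):
        for i in range(h):
            for j in range(w):
                if i == 0 and j == 0:
                    continue  # anchored pixel stays fixed
                total, count = 0.0, 0
                if j + 1 < w:  # right neighbour predicts depth[i][j]
                    total += depth[i][j + 1] - gx[i][j]
                    count += 1
                if j > 0:  # left neighbour
                    total += depth[i][j - 1] + gx[i][j - 1]
                    count += 1
                if i + 1 < h:  # neighbour below
                    total += depth[i + 1][j] - gy[i][j]
                    count += 1
                if i > 0:  # neighbour above
                    total += depth[i - 1][j] + gy[i - 1][j]
                    count += 1
                depth[i][j] = total / count
    return depth


# Synthetic 4x5 depth map standing in for a database depth image:
# a ramp (farther toward the top of the frame) with one raised "player" pixel.
reference = [[0.2 * (3 - i) + 0.01 * j for j in range(5)] for i in range(4)]
reference[2][3] += 0.5

gx, gy = gradients(reference)
recovered = reconstruct_depth(gx, gy, anchor=reference[0][0])
max_error = max(
    abs(recovered[i][j] - reference[i][j]) for i in range(4) for j in range(5)
)
print("max reconstruction error:", max_error)
```

Because the gradients in this example come from a real depth map, they are integrable and the reconstruction matches the reference up to solver tolerance. Gradients assembled from several database images would generally not be integrable, which is where a least-squares formulation like this one earns its keep.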
