Depth Map Enhancement with Interaction in 2D-to-3D Video Conversion

The demand for 3D video content is growing. Conventional 3D video creation requires either dedicated capture devices or many people to perform labor-intensive depth labeling. To reduce manpower and time consumption, many automatic approaches have been developed to convert legacy 2D videos into 3D. However, because of the strict quality requirements of the video production industry, most automatic conversion methods suffer from quality issues and cannot be used in actual production. As a result, manual or semi-automatic approaches remain the mainstream 3D video generation technologies. In our project, we took an automatic video generation method and introduced human-computer interaction into its processing pipeline [1], aiming to balance time efficiency against depth map quality. The novelty of the paper lies in successfully improving an automatic 3D video generation method from the perspective of the video and film industry.

[1] Antonio Torralba, et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, 2001, International Journal of Computer Vision.

[2] Jörg Stückler, et al. Efficient Dense Rigid-Body Motion Segmentation and Estimation in RGB-D Video, 2015, International Journal of Computer Vision.

[3] Ce Liu, et al. Exploring new representations and applications for motion analysis, 2009.

[4] Ce Liu, et al. Depth Extraction from Video Using Non-parametric Sampling, 2012, ECCV.

[5] Marko Heikkilä, et al. A texture-based method for modeling the background and detecting moving objects, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Ashutosh Saxena, et al. Make3D: Learning 3D Scene Structure from a Single Still Image, 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Antonio Torralba, et al. SIFT Flow: Dense Correspondence across Different Scenes, 2008, ECCV.

[8] Liang Zhang, et al. Stereoscopic image generation based on depth images for 3D TV, 2005, IEEE Transactions on Broadcasting.

[9] Meng Wang, et al. 2D-to-3D image conversion by learning depth from examples, 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[10] Rob Fergus, et al. Depth Map Prediction from a Single Image using a Multi-Scale Deep Network, 2014, NIPS.

[11] Ce Liu, et al. Depth Transfer: Depth Extraction from Video Using Non-Parametric Sampling, 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12] Stephen Gould, et al. Single image depth estimation from predicted semantic labels, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.