Paper information - Discriminatively Trained Dense Surface Normal Estimation

Discriminatively Trained Dense Surface Normal Estimation

In this work we propose a method for a largely unexplored problem in computer vision: discriminatively trained dense surface normal estimation from a single image. Our method combines contextual and segment-based cues and builds a regressor in a boosting framework by transforming the problem into the regression of the coefficients of a local coding. We apply our method to two challenging data sets containing images of man-made environments: the indoor NYU2 data set and the outdoor KITTI data set. Our surface normal predictor achieves better results than initially expected, significantly outperforming the state of the art.
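The abstract does not spell out the local coding, so the following is only a minimal sketch of the general idea: a unit normal is rewritten as a few non-negative coefficients over a codebook of reference directions, a regressor would predict those coefficients, and a normal is recovered by decoding them. The random codebook, the soft-assignment weights, and the names `make_codebook`, `encode`, `decode`, `k`, and `beta` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def make_codebook(num_codewords=20, seed=0):
    """Hypothetical codebook: random unit vectors on the sphere (an assumption)."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=(num_codewords, 3))
    return c / np.linalg.norm(c, axis=1, keepdims=True)

def encode(normal, codebook, k=3, beta=10.0):
    """Soft-assign a normal to its k most similar codewords (assumed coding)."""
    normal = normal / np.linalg.norm(normal)
    sims = codebook @ normal                      # cosine similarity to each codeword
    nearest = np.argsort(sims)[-k:]               # indices of the k closest codewords
    weights = np.exp(beta * (sims[nearest] - sims[nearest].max()))
    coeffs = np.zeros(len(codebook))
    coeffs[nearest] = weights / weights.sum()     # non-negative, sums to one
    return coeffs

def decode(coeffs, codebook):
    """Map coefficients back to a unit normal via a normalized weighted sum."""
    v = coeffs @ codebook
    return v / np.linalg.norm(v)

# Usage: the coefficients are the regression targets; decoding gives a normal.
codebook = make_codebook()
n = np.array([0.0, 0.0, 1.0])
coeffs = encode(n, codebook)
n_hat = decode(coeffs, codebook)
print("angular error (degrees):", np.degrees(np.arccos(np.clip(n_hat @ n, -1.0, 1.0))))
```

In a setup like this, the regressor is trained to predict `coeffs` for each pixel or segment rather than the three normal components directly, and `decode` guarantees that the output is a valid unit vector.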
