Neural Geometric Parser for Single Image Camera Calibration

We propose a neural geometric parser learning single image camera calibration for man-made scenes. Unlike previous neural approaches that rely only on semantic cues obtained from neural networks, our approach considers both semantic and geometric cues, resulting in significant accuracy improvement. The proposed framework consists of two networks. Using line segments of an image as geometric cues, the first network estimates the zenith vanishing point and generates several candidates consisting of the camera rotation and focal length. The second network evaluates each candidate based on the given image and the geometric cues, where prior knowledge of man-made scenes is used for the evaluation. With the supervision of datasets consisting of the horizontal line and focal length of the images, our networks can be trained to estimate the same camera parameters. Based on the Manhattan world assumption, we can further estimate the camera rotation and focal length in a weakly supervised manner. The experimental results reveal that the performance of our neural approach is significantly higher than that of existing state-of-the-art camera calibration techniques for single images of indoor and outdoor scenes.

[1]  Alan L. Yuille,et al.  Manhattan World: compass direction from a single image by Bayesian inference , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[2]  S. Shankar Sastry,et al.  An Invitation to 3-D Vision , 2004 .

[3]  S. Shankar Sastry,et al.  An Invitation to 3-D Vision: From Images to Geometric Models , 2003 .

[4]  Rafael Grompone von Gioi,et al.  LSD: A Fast Line Segment Detector with a False Detection Control , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Krista A. Ehinger,et al.  Recognizing scene viewpoint using panoramic place representation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Jean-Philippe Tardif,et al.  Non-iterative approach for fast and accurate vanishing point detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[7]  Eric Brachmann,et al.  Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[8]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Connor Greenwell,et al.  DEEPFOCAL: A method for direct focal length estimation , 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[10]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[11]  James H. Elder,et al.  Efficient Edge-Based Methods for Estimating Manhattan Frames in Urban Imagery , 2008, ECCV.

[12]  Thomas Brox,et al.  Image Orientation Estimation with Convolutional Networks , 2015, GCPR.

[13]  Scott Workman,et al.  Detecting Vanishing Points Using Global Image Context in a Non-ManhattanWorld , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Yichao Zhou,et al.  NeurVPS: Neural Vanishing Point Scanning via Conic Convolution , 2019, NeurIPS.

[15]  Pushmeet Kohli,et al.  Geometric Image Parsing in Man-Made Environments , 2010, International Journal of Computer Vision.

[16]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Matthew Fisher,et al.  UprightNet: Geometry-Aware Camera Orientation Estimation From Single Images , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[18]  Bernhard P. Wrobel,et al.  Multiple View Geometry in Computer Vision , 2001 .

[19]  Ian D. Reid,et al.  Single View Metrology , 2000, International Journal of Computer Vision.

[20]  Wei Zhang,et al.  Video Compass , 2002, ECCV.

[21]  Seungyong Lee,et al.  Automatic Upright Adjustment of Photographs With Robust Camera Calibration , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Eric Brachmann,et al.  DSAC — Differentiable RANSAC for Camera Localization , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Scott Workman,et al.  Horizon Lines in the Wild , 2016, BMVC.

[24]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Frank Dellaert,et al.  Atlanta world: an expectation maximization framework for simultaneous low-level edge grouping and camera calibration in complex man-made environments , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[26]  Anthony Hoogs,et al.  A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Allan Hanbury,et al.  Robust camera self-calibration from monocular images of Manhattan worlds , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Eric Brachmann,et al.  CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Matthias Nießner,et al.  ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[31]  Cuneyt Akinlar,et al.  EDLines: A real-time line segment detector with a false detection control , 2011, Pattern Recognit. Lett..

[32]  James H. Elder,et al.  MCMLSD: A Dynamic Programming Approach to Line Segment Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Yun-Hui Liu,et al.  Quasi-Globally Optimal and Efficient Vanishing Point Estimation in Manhattan World , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[34]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[35]  Marie-Odile Berger,et al.  A-Contrario Horizon-First Vanishing Point Detection Using Second-Order Grouping Laws , 2018, ECCV.

[36]  Yannick Hold-Geoffroy,et al.  A Perceptual Measure for Deep Single Image Camera Calibration , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.