Adversarial Transfer of Pose Estimation Regression

We address the problem of camera pose estimation in visual localization. Current regression-based methods for pose estimation are trained and evaluated scene-wise. They depend on the coordinate frame of the training dataset and generalize poorly across scenes and datasets. We identify dataset shift as an important barrier to generalization and consider transfer learning as an alternative route to better reuse of pose estimation models. We revise domain adaptation techniques for classification and extend them to camera pose estimation, which is a multi-regression task. We develop a deep adaptation network for learning scene-invariant image representations and use adversarial learning to generate such representations for model transfer. We enrich the network with self-supervised learning and use the adaptability theory to validate the existence of a scene-invariant representation of images in two given scenes. We evaluate our network on two public datasets, Cambridge Landmarks and 7-Scenes, show that it outperforms several baselines, and compare it with state-of-the-art methods.
