Leveraging Outdoor Webcams for Local Descriptor Learning

We present AMOS Patches, a large set of image cut-outs intended primarily for making trainable local feature descriptors more robust to illumination and appearance changes. The images contributing to AMOS Patches originate from the AMOS dataset of recordings from a large set of outdoor webcams. We describe the semiautomatic method used to generate AMOS Patches, which comprises camera selection, viewpoint clustering, and patch selection. For training, we provide both the registered full source images and the patches. We introduce a new descriptor, trained on the AMOS Patches and 6Brown datasets, that achieves state-of-the-art results in matching under illumination changes on standard benchmarks.
