An End-to-end Deep Convolutional Neural Network for a Multi-scale Image Matching and Localization Problem

Diverse imaging techniques are utilized in many scientific domains to acquire a rich description of the subject under study and to further discover a variety of its properties. Especially, in sample systems, by probing with optical, electron or x-ray beams, the captured images describe the sample in an extremely large range of length scales. This makes the correlation from one image to another very difficult in addition to the intrinsic appearance complexity of those scientific images. In this paper, we aim to tackle this multi-scale image matching and localization problem by proposing an end-to-end deep convolutional neural network. Our proposed network is designed to first generate different filters according to the two queried images originated from different length scales. Then, to compute the correlation map, we use these filters to predict the correspondence between the two images. For the training and evaluation, we collect a number of electron microscopy experiments to form a multi-scaled image patch dataset comprised of various material structures. We observe about 90% accuracy for multi-scale image matching and localization while a triplet-based network shows about 78% accuracy.

[1]  Gustavo Carneiro,et al.  Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimizing Global Loss Functions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[3]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[4]  Lei Liu,et al.  Learning a Rotation Invariant Detector with Rotatable Bounding Box , 2017, ArXiv.

[5]  Pascal Fua,et al.  Learning to Match Aerial Images with Deep Attentive Architectures , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Sergey Ioffe,et al.  Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.

[7]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[8]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[9]  Shin'ichi Satoh,et al.  Faster R-CNN Features for Instance Search , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[10]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Esa Rahtu,et al.  Image Patch Matching Using Convolutional Descriptors with Euclidean Distance , 2016, ACCV Workshops.

[12]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[15]  Yann LeCun,et al.  Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..

[16]  Kaiming He,et al.  Mask R-CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[17]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[18]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[19]  Shih-Fu Chang,et al.  Learning Spread-Out Local Feature Descriptors , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  Iasonas Kokkinos,et al.  Discriminative Learning of Deep Convolutional Feature Point Descriptors , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21]  Bin Fan,et al.  L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[23]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[25]  Quoc V. Le,et al.  HyperNetworks , 2016, ICLR.

[26]  Albert Gordo,et al.  Deep Image Retrieval: Learning Global Representations for Image Search , 2016, ECCV.

[27]  Rahul Sukthankar,et al.  MatchNet: Unifying feature and metric learning for patch-based matching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).