Efficient Sliding Window Computation for NN-Based Template Matching

Template matching is a fundamental problem in computer vision, with many applications. Existing methods use a sliding window to choose the image window that best matches the template. For classic methods based on SSD, SAD, and normalized cross-correlation, efficient algorithms have been developed that allow them to run in real time. Current state-of-the-art algorithms are based on nearest neighbor (NN) matching of small patches within the template to patches in the image. These algorithms yield state-of-the-art results because they deal better with changes in appearance, viewpoint, and illumination, as well as with non-rigid transformations and occlusion. However, NN-based algorithms are relatively slow, not only because of the NN computation for each image patch but also because their sliding window computation is inefficient. We therefore propose in this paper an efficient NN-based algorithm. Its accuracy is similar to (and in some cases slightly better than) that of the existing algorithms, and it runs 43–200 times faster, depending on the sizes of the images and templates used. The main contribution of our method is an algorithm for incrementally computing the score of each image window from the score computed for the previous window. This is in contrast to computing the score for each image window independently, as in previous NN-based methods. The complexity of our method is therefore O(|I|) instead of O(|I||T|), where I and T are the image and the template, respectively.
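The difference between independent and incremental window scoring can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes, purely for illustration, that the window score is a plain sum over a precomputed per-pixel score map, whereas the NN-based score described above is more involved. The names `score_map`, `naive_window_scores`, and `incremental_window_scores` are hypothetical.

```python
def naive_window_scores(score_map, th, tw):
    """Score every th-by-tw window independently: O(|I||T|) work."""
    H, W = len(score_map), len(score_map[0])
    out = []
    for y in range(H - th + 1):
        row = []
        for x in range(W - tw + 1):
            row.append(sum(score_map[y + dy][x + dx]
                           for dy in range(th) for dx in range(tw)))
        out.append(row)
    return out


def incremental_window_scores(score_map, th, tw):
    """Update each window score from the previous one: O(|I|) work.

    Assumption (illustration only): the score is additive over pixels.
    """
    H, W = len(score_map), len(score_map[0])
    # col[x] holds the sum of column x over the current band of th rows.
    col = [sum(score_map[dy][x] for dy in range(th)) for x in range(W)]
    out = []
    for y in range(H - th + 1):
        if y > 0:
            # Slide the band down: add the entering row, drop the leaving row.
            for x in range(W):
                col[x] += score_map[y + th - 1][x] - score_map[y - 1][x]
        s = sum(col[:tw])  # first window of this row
        row = [s]
        for x in range(1, W - tw + 1):
            # Slide right: add the entering column, drop the leaving column.
            s += col[x + tw - 1] - col[x - 1]
            row.append(s)
        out.append(row)
    return out


score_map = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12]]
result = incremental_window_scores(score_map, 2, 2)
print(result)
```

Both functions return the same scores on integer input; only the amount of work differs. In the incremental version each pixel is touched a constant number of times, so the cost no longer depends on the template size.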
