Object instance search in videos

In this paper, we propose a novel approach to object instance search in videos. Using a discriminative mutual information score and inferring the location of the target object's center from matched local feature descriptors by Hough voting, we achieve robust matching and per-frame localization despite variations in orientation and scale. We then apply Max-Path search [5] to efficiently find the globally optimal spatio-temporal trajectory of the object center in each video sequence. Experimental results on a collection of videos captured with mobile devices in real-world environments demonstrate the effectiveness and accuracy of our method.
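
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the match tuple layout, the grid cell size, the rotation sign convention, and the one-cell motion limit between frames are all assumptions made for illustration. It also assumes each per-frame score map is signed (positive where the object is likely, negative elsewhere), as a discriminative score would provide.

```python
import math

def hough_vote_center(matches, grid_shape, cell=8):
    """Accumulate votes for the object center on a coarse grid.

    matches: list of (x, y, scale, theta, dx, dy, score), where (x, y) is the
    matched keypoint in the frame, (dx, dy) is the offset from the keypoint to
    the object center in the query image, and scale/theta are the relative
    scale and rotation between query and frame keypoints (hypothetical layout).
    """
    rows, cols = grid_shape
    acc = [[0.0] * cols for _ in range(rows)]
    for x, y, scale, theta, dx, dy, score in matches:
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Rotate and scale the stored offset, then add it to the keypoint.
        cx = x + scale * (cos_t * dx - sin_t * dy)
        cy = y + scale * (sin_t * dx + cos_t * dy)
        r, c = int(cy // cell), int(cx // cell)
        if 0 <= r < rows and 0 <= c < cols:
            acc[r][c] += score
    return acc

def max_path(score_maps, radius=1):
    """Find the highest-scoring trajectory through a list of 2-D score maps.

    A path visits one cell per frame over consecutive frames, moves at most
    `radius` cells per step, and may start and end at any frame.
    Returns (best_score, [(t, r, c), ...]).
    """
    rows, cols = len(score_maps[0]), len(score_maps[0][0])
    prev, parents = None, []
    best_score, best_end = float("-inf"), None
    for t, frame in enumerate(score_maps):
        cur = [[0.0] * cols for _ in range(rows)]
        par = [[None] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                carry, src = 0.0, None  # carry 0 means "start a new path here"
                if prev is not None:
                    for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                        for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                            if prev[rr][cc] > carry:
                                carry, src = prev[rr][cc], (rr, cc)
                cur[r][c] = frame[r][c] + carry
                par[r][c] = src
                if cur[r][c] > best_score:
                    best_score, best_end = cur[r][c], (t, r, c)
        parents.append(par)
        prev = cur
    # Follow parent pointers back from the best end point.
    path = []
    t, r, c = best_end
    while True:
        path.append((t, r, c))
        src = parents[t][r][c]
        if src is None:
            break
        t, (r, c) = t - 1, src
    path.reverse()
    return best_score, path
```

In this sketch, a path is extended only while its accumulated score stays positive, so the dynamic program selects the start frame, end frame, and trajectory jointly in a single forward pass over the frames.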

[1] Peyman Milanfar, et al. Training-Free, Generic Object Detection Using Locally Adaptive Regression Kernels, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Eli Shechtman, et al. In defense of Nearest-Neighbor based image classification, 2008, IEEE Conference on Computer Vision and Pattern Recognition.

[3] Andrew Zisserman, et al. Video Google: a text retrieval approach to object matching in videos, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[4] Juergen Gall, et al. Class-specific Hough forests for object detection, 2009, CVPR.

[5] Junsong Yuan, et al. Optimal spatio-temporal path discovery for video event detection, 2011, CVPR.

[6] Yuning Jiang, et al. Interactive visual object search through mutual information maximization, 2010, ACM Multimedia.

[7] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints, 2011.