Paper information - Retrieval of Multiple Instances of Objects in Videos

This paper addresses the retrieval of different instances of an object of interest within a given video document or a video database. The approach relies on a semi-global image representation built from an over-segmentation of the image frames. An aggregation mechanism then groups a set of sub-regions into an object similar to the query, under a global similarity criterion. Two strategies are proposed. The first is a greedy, dynamic region construction method. The second is based on simulated annealing and aims to determine a global optimum. Experimental results are promising, with object detection rates of up to 79%.
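
The two aggregation strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the region descriptors, the similarity criterion, or the annealing schedule, so everything here is an assumption made for illustration. Sub-regions are represented by small colour histograms, the global similarity is a histogram intersection with the query, the annealing move toggles one region in or out of the selection, and the names `regions`, `adjacency`, `greedy_aggregate` and `anneal_aggregate` are hypothetical.

```python
import math
import random


def merged_histogram(regions, selected):
    """Sum the histograms of the selected sub-regions and normalise to unit mass."""
    bins = len(next(iter(regions.values())))
    total = [0.0] * bins
    for name in selected:
        for i, value in enumerate(regions[name]):
            total[i] += value
    mass = sum(total)
    return [v / mass for v in total] if mass > 0 else total


def similarity(regions, selected, query):
    """Assumed global criterion: histogram intersection (1.0 means identical)."""
    if not selected:
        return 0.0
    merged = merged_histogram(regions, selected)
    return sum(min(a, b) for a, b in zip(merged, query))


def greedy_aggregate(regions, adjacency, query):
    """Grow a region set from the best single region, one neighbour at a time."""
    seed = max(regions, key=lambda r: similarity(regions, {r}, query))
    selected = {seed}
    best = similarity(regions, selected, query)
    while True:
        frontier = {n for r in selected for n in adjacency[r]} - selected
        scored = [(similarity(regions, selected | {n}, query), n) for n in frontier]
        if not scored:
            break
        score, pick = max(scored)
        if score <= best:  # no neighbour improves the match: stop
            break
        selected.add(pick)
        best = score
    return selected, best


def anneal_aggregate(regions, query, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over region subsets with a Metropolis acceptance rule.

    Connectivity of the selection is not enforced here, to keep the sketch short.
    """
    rng = random.Random(seed)
    names = sorted(regions)
    current = {rng.choice(names)}
    cur_score = similarity(regions, current, query)
    best, best_score = set(current), cur_score
    temperature = t0
    for _ in range(steps):
        candidate = current ^ {rng.choice(names)}  # toggle one region in or out
        cand_score = similarity(regions, candidate, query)
        delta = cand_score - cur_score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, cur_score = candidate, cand_score
            if cur_score > best_score:
                best, best_score = set(current), cur_score
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


# Toy frame: four sub-regions with 3-bin histograms; the object is "a" plus "b".
regions = {"a": [4, 0, 0], "b": [0, 4, 0], "c": [0, 0, 8], "d": [0, 0, 6]}
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
query = [0.5, 0.5, 0.0]

greedy_sel, greedy_score = greedy_aggregate(regions, adjacency, query)
sa_sel, sa_score = anneal_aggregate(regions, query)
print(sorted(greedy_sel), greedy_score)
print(sorted(sa_sel), sa_score)
```

The greedy variant only ever adds a neighbouring region and stops at the first step that fails to improve the score, so it can settle on a local optimum. The annealing variant occasionally accepts a worse selection while the temperature is high, which is what lets it escape such optima at the cost of many more similarity evaluations.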
