Retrieving objects from videos based on affine regions

We present a method to (semi-)automatically annotate video material. More precisely, we focus on recognizing specific objects and scenes in keyframes. Objects are learnt simply by having the user delineate them in one (or a few) images. The basic building block to achieve this goal consists of affine invariant regions. These are local image patches that adapt their shape based on the image content so as to be invariant to viewpoint changes. Instead of simply matching the regions and counting the number of matches, we propose to gather more evidence about the presence of the object by exploring the image around the initial matches. This boosts the performance, especially under difficult, real-world imaging conditions. Experimental results on news broadcast data demonstrate the viability of the approach.

[1]  Luc Van Gool,et al.  Wide-baseline multiple-view correspondences , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[2]  Luc Van Gool,et al.  Simultaneous Object Recognition and Segmentation by Image Exploration , 2004, ECCV.

[3]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[4]  Luc Van Gool,et al.  Local Features for Image Retrieval , 1999, State-of-the-Art in Content-Based Image and Video Retrieval.

[5]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[6]  Luc Van Gool,et al.  Video shot characterization , 2004, Machine Vision and Applications.

[7]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[8]  Luc Van Gool,et al.  Edinburgh Research Explorer Simultaneous Object Recognition and Segmentation by Image Exploration , 2022 .

[9]  Cordelia Schmid,et al.  An Affine Invariant Interest Point Detector , 2002, ECCV.

[10]  Nasser M. Nasrabadi,et al.  Object Recognition Using , 1997 .

[11]  Stepán Obdrzálek,et al.  Object Recognition using Local Affine Frames on Distinguished Regions , 2002, BMVC.

[12]  Adam Baumberg,et al.  Reliable feature matching across widely separated views , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[13]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[14]  Hans-Peter Kriegel,et al.  State-of-the-Art in Content-Based Image and Video Retrieval , 2001, Computational Imaging and Vision.