Evaluation of key frame-based retrieval techniques for video

We investigate the application of a variety of content-based image retrieval techniques to the problem of video retrieval. For each key frame selected by a highly effective shot boundary detection algorithm, we generate a large number of features to support query-by-example search. The retrieval performance of two learning methods, boosting and k-nearest neighbours, is compared against a vector space model. We carry out a novel and extensive evaluation to demonstrate and compare the usefulness of these algorithms for video retrieval tasks, using a carefully created test collection of over 6000 still images in which performance is measured against relevance judgements based on human image annotations. Three types of experiment are carried out: classification tasks and category searches (both related to automated annotation and summarisation of video material), and real-world searches (for navigation and entry point finding). We also show graphical results of real video search tasks using the algorithms, which have not previously been applied to video material in this way.
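As a rough illustration of query-by-example search over key-frame features, the sketch below ranks key frames by cosine similarity (in the spirit of a vector space model) and assigns a label by k-nearest-neighbour voting. It is only a minimal sketch under stated assumptions: the coarse RGB histogram feature, the toy pixel data, and the names `colour_histogram`, `rank_by_cosine` and `knn_label` are hypothetical and are not taken from the paper, which does not specify its features or distance measures here. Boosting is omitted.

```python
import math
from collections import Counter


def colour_histogram(pixels, bins=2):
    """Normalised RGB histogram of (r, g, b) tuples with values in 0..255.

    Assumption: a coarse global colour histogram stands in for the
    (unspecified) features generated per key frame.
    """
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins-1.
        rb, gb, bb = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(rb * bins + gb) * bins + bb] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_by_cosine(query, features):
    """Return key-frame ids ordered from most to least similar to the query."""
    return sorted(features, key=lambda fid: cosine(query, features[fid]), reverse=True)


def knn_label(query, features, labels, k=3):
    """Majority label among the k key frames closest in Euclidean distance."""
    nearest = sorted(features, key=lambda fid: math.dist(query, features[fid]))[:k]
    return Counter(labels[fid] for fid in nearest).most_common(1)[0][0]


# Toy "key frames": tiny pixel lists invented purely for illustration.
frames = {
    "sky_1": [(20, 40, 220)] * 4,
    "sky_2": [(30, 60, 200)] * 3 + [(250, 250, 250)],
    "grass_1": [(30, 200, 40)] * 4,
    "grass_2": [(40, 180, 30)] * 3 + [(20, 30, 210)],
}
labels = {"sky_1": "sky", "sky_2": "sky", "grass_1": "grass", "grass_2": "grass"}
features = {fid: colour_histogram(px) for fid, px in frames.items()}

# The example image supplied by the user plays the role of the query.
query = colour_histogram([(10, 50, 230)] * 4)
ranking = rank_by_cosine(query, features)
predicted = knn_label(query, features, labels, k=3)
```

On this toy data the ranking places the two sky frames first, and the 3-nearest-neighbour vote labels the query as "sky". A real system would use far richer features and a much larger annotated collection.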
