A walk through the web’s video clips

Approximately 105 video clips are posted every day on the Web. The popularity of Web-based video databases poses a number of challenges to machine vision scientists: how do we organize, index and search such large wealth of data? Content-based video search and classification have been proposed in the literature and applied successfully to analyzing movies, TV broadcasts and lab-made videos. We explore the performance of some of these algorithms on a large data-set of approximately 3000 videos. We collected our data-set directly from the Web minimizing bias for content or quality, way so as to have a faithful representation of the statistics of this medium. We find that the algorithms that we have come to trust do not work well on video clips, because their quality is lower and their subject is more varied. We will make the data publicly available to encourage further research.

[1]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[2]  Nilesh V. Patel,et al.  Video shot detection and characterization for video databases , 1997, Pattern Recognition.

[3]  Stephen W. Smoliar,et al.  An integrated system for content-based video retrieval and browsing , 1997, Pattern Recognit..

[4]  Wei Xiong,et al.  Efficient Scene Change Detection and Camera Motion Annotation for Video Classification , 1998, Comput. Vis. Image Underst..

[5]  Riccardo Leonardi,et al.  Scene break detection: a comparison , 1998, Proceedings Eighth International Workshop on Research Issues in Data Engineering. Continuous-Media Databases and Applications.

[6]  Leonidas J. Guibas,et al.  A metric for distributions with applications to image databases , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[7]  C.-C. Jay Kuo,et al.  Rule-based video classification system for basketball video indexing , 2000, MULTIMEDIA '00.

[8]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[9]  Antonio Torralba,et al.  Statistics of natural image categories , 2003, Network.

[10]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[11]  Hugh E. Williams,et al.  Video Cut Detection using Frame Windows , 2005, ACSC.

[12]  Lihi Zelnik-Manor,et al.  Statistical analysis of dynamic actions , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Paul Over,et al.  Evaluation campaigns and TRECVid , 2006, MIR '06.

[14]  Jiebo Luo,et al.  Large-scale multimodal semantic concept detection for consumer video , 2007, MIR '07.

[15]  Jiebo Luo,et al.  Kodak consumer video benchmark data set : concept definition and annotation * * , 2008 .

[16]  Adrian Ulges,et al.  Content-based Video Tagging for Online Video Portals ∗ , 2007 .