A Survey on Visual Content-Based Video Indexing and Retrieval

Video indexing and retrieval have a wide range of promising applications, which has attracted the interest of researchers worldwide. This paper offers a tutorial and an overview of general strategies in visual content-based video indexing and retrieval. It focuses on methods for video structure analysis, including shot boundary detection, key frame extraction, and scene segmentation; feature extraction, including static key frame features, object features, and motion features; video data mining; video annotation; video retrieval, including query interfaces, similarity measures, and relevance feedback; and video browsing. Finally, we analyze future research directions.


[230]  Stuart Harvey Rubin,et al.  A Human-Centered Multiple Instance Learning Framework for Semantic Video Retrieval , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[231]  Apostol Natsev,et al.  Exploring Automatic Query Refinement for Text-Based Video Retrieval , 2006, 2006 IEEE International Conference on Multimedia and Expo.

[232]  Yaser Sheikh,et al.  On the use of computable features for film classification , 2005, IEEE Transactions on Circuits and Systems for Video Technology.

[233]  Chong-Wah Ngo,et al.  Motion-Based Video Representation for Scene Change Detection , 2004, International Journal of Computer Vision.

[234]  John R. Smith,et al.  IBM Research TRECVID-2009 Video Retrieval System , 2009, TRECVID.

[235]  Charles A. Bouman,et al.  ViBE: a compressed video database structured for active browsing and search , 2004, IEEE Transactions on Multimedia.

[236]  Aggelos K. Katsaggelos,et al.  Rate-distortion optimal video summary generation , 2005, IEEE Transactions on Image Processing.

[237]  Marcel Worring,et al.  The Semantic Pathfinder: Using an Authoring Metaphor for Generic Multimedia Indexing , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[238]  Meng Wang,et al.  Beyond Distance Measurement: Constructing Neighborhood Similarity for Video Annotation , 2009, IEEE Transactions on Multimedia.

[239]  Xiaokun Li,et al.  A hidden Markov model framework for traffic event detection using video features , 2004, 2004 International Conference on Image Processing, 2004. ICIP '04..

[240]  Masaharu Ogawa,et al.  A highlight scene detection and video summarization system using audio feature for a personal video recorder , 2005, IEEE Transactions on Consumer Electronics.

[241]  Kuo-Chin Fan,et al.  Motion Flow-Based Video Retrieval , 2007, IEEE Transactions on Multimedia.

[242]  Rainer Lienhart,et al.  The Holy Grail of Multimedia Information Retrieval: So Close or Yet So Far Away? , 2008 .

[243]  Yuchou Chang,et al.  Unsupervised Video Shot Detection Using Clustering Ensemble with a Color Global Scale-Invariant Feature Transform Descriptor , 2008, EURASIP J. Image Video Process..

[244]  Chong-Wah Ngo,et al.  Hot Event Detection and Summarization by Graph Modeling and Matching , 2005, CIVR.

[245]  Alan F. Smeaton,et al.  Using video objects and relevance feedback in video retrieval , 2005, SPIE Optics East.

[246]  Aggelos K. Katsaggelos,et al.  MINMAX optimal video summarization , 2005, IEEE Transactions on Circuits and Systems for Video Technology.

[247]  Sarah V. Porter,et al.  Video Segmentation and Indexing using Motion Estimation , 2004 .

[248]  David Doermann,et al.  Video indexing and retrieval based on recognized text , 2002, 2002 IEEE Workshop on Multimedia Signal Processing..

[249]  Zhouyu Fu,et al.  Semantic-Based Surveillance Video Retrieval , 2007, IEEE Transactions on Image Processing.

[250]  Nobuyuki Yagi,et al.  Shot Boundary Detection at TRECVID 2007 , 2007, TRECVID.

[251]  Wen-Nung Lie,et al.  Content-based video retrieval based on object motion trajectory , 2002, 2002 IEEE Workshop on Multimedia Signal Processing..

[252]  Liang-Hua Chen,et al.  Movie scene segmentation using background information , 2008, Pattern Recognit..

[253]  Marcel Worring,et al.  The challenge problem for automated detection of 101 semantic concepts in multimedia , 2006, MM '06.

[254]  Li Huan,et al.  A General Method for Shot Boundary Detection , 2008, 2008 International Conference on Multimedia and Ubiquitous Engineering (mue 2008).

[255]  Marcel Worring,et al.  Building a visual ontology for video retrieval , 2005, MULTIMEDIA '05.

[256]  C.-C. Jay Kuo,et al.  On-line knowledge- and rule-based video classification system for video indexing and dissemination , 2002, Inf. Syst..

[257]  Rong Yan,et al.  Video Retrieval Based on Semantic Concepts , 2008, Proceedings of the IEEE.

[258]  Li Zhao,et al.  Video shot grouping using best-first model merging , 2001, IS&T/SPIE Electronic Imaging.

[259]  Xindong Wu,et al.  Video data mining: semantic indexing and event detection from the association perspective , 2005, IEEE Transactions on Knowledge and Data Engineering.

[260]  Loong Fah Cheong,et al.  Affective understanding in film , 2006, IEEE Trans. Circuits Syst. Video Technol..

[261]  Werner Bailer,et al.  Comparison of content selection methods for skimming rushes video , 2008, TVS '08.

[262]  John R. Smith,et al.  Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.

[263]  Meng Wang,et al.  Efficient semantic annotation method for indexing large personal video database , 2006, MIR '06.

[264]  Alan F. Smeaton,et al.  TRECVID 2004 Experiments in Dublin City University , 2004, TRECVID.

[265]  Janko Calic,et al.  Efficient Layout of Comic-Like Video Summaries , 2007, IEEE Transactions on Circuits and Systems for Video Technology.

[266]  Shih-Fu Chang,et al.  Motion trajectory matching of video objects , 1999, Electronic Imaging.

[267]  Keiichiro Hoashi,et al.  SVM-Based Shot Boundary Detection with a Novel Feature , 2006, 2006 IEEE International Conference on Multimedia and Expo.

[268]  Nikolas P. Galatsanos,et al.  Video rushes summarization using spectral clustering and sequence alignment , 2008, TVS '08.

[269]  Andrew Zisserman,et al.  Video Google: Efficient Visual Search of Videos , 2006, Toward Category-Level Object Recognition.

[270]  Liang-Hua Chen,et al.  An Integrated Approach to Video Retrieval , 2008, ADC.

[271]  Marcel Worring,et al.  Interactive Search by Direct Manipulation of Dissimilarity Space , 2007, IEEE Transactions on Multimedia.

[272]  Alberto Del Bimbo,et al.  Automatic video annotation using ontologies extended with visual information , 2005, MULTIMEDIA '05.

[273]  Tao Mei,et al.  Modeling and Mining of Users' Capture Intention for Home Videos , 2007, IEEE Transactions on Multimedia.

[274]  Seong-Dae Kim,et al.  Iterative key frame selection in the rate-constraint environment , 2003, Signal Process. Image Commun..

[275]  Wang Jun,et al.  Traffic Jam Detection Based on Corner Feature of Background Scene In Video-Based ITS , 2008, 2008 IEEE International Conference on Networking, Sensing and Control.

[276]  Ba Tu Truong,et al.  Automatic genre identification for content-based video categorization , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[277]  Janko Calic,et al.  Efficient key-frame extraction and video analysis , 2002, Proceedings. International Conference on Information Technology: Coding and Computing.

[278]  M. Pawlewski,et al.  Motion-based classification of cartoons , 2001, Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing. ISIMP 2001 (IEEE Cat. No.01EX489).