Fast video segment retrieval by Sort-Merge feature selection, boundary refinement, and lazy evaluation

We present a fast video retrieval system with three novel characteristics. First, it uses machine learning to automatically construct a hierarchy of small feature subsets that are progressively more useful for indexing. These subsets are induced by a new heuristic method called Sort-Merge feature selection, which combines FastMap for dimensionality reduction with Mahalanobis distance for likelihood determination. Second, because the induced feature sets form a hierarchy of increasing classification accuracy, video can be segmented and categorized simultaneously in a coarse-to-fine manner that efficiently detects and progressively refines the temporal boundaries of its segments. Third, the feature set hierarchy enables an efficient implementation of query systems through lazy evaluation, in which new queries are used to refine the retrieval index in real time. We analyze the performance of these methods and demonstrate them on a 75-minute instructional video and a 30-minute baseball video.
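The coarse-to-fine boundary refinement and lazy evaluation described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the function name `refine_boundaries`, the two per-frame classifiers `coarse_label` and `fine_label`, and the `stride` parameter are all hypothetical, and the sketch assumes at most one category change inside each coarse window. A cheap classifier labels sparsely sampled frames, and only windows whose endpoints disagree are handed to the more accurate (and presumably more expensive) classifier, whose results are memoized so that each frame is evaluated at most once.

```python
from functools import lru_cache


def refine_boundaries(n_frames, coarse_label, fine_label, stride=8):
    """Return (frame_index, new_label) pairs marking category changes.

    coarse_label and fine_label are hypothetical classifiers that map a
    frame index to a category label; fine_label is assumed to be the
    more accurate and more expensive of the two.
    """
    if n_frames <= 0:
        return []

    # Lazy evaluation: the fine classifier runs only on frames that are
    # actually requested, and each result is cached for reuse.
    fine = lru_cache(maxsize=None)(fine_label)

    # Coarse pass: sample every `stride` frames, plus the last frame.
    samples = list(range(0, n_frames, stride))
    if samples[-1] != n_frames - 1:
        samples.append(n_frames - 1)

    boundaries = []
    for lo, hi in zip(samples, samples[1:]):
        if coarse_label(lo) == coarse_label(hi):
            continue  # assumed: no boundary inside this window
        if fine(lo) == fine(hi):
            continue  # the fine classifier overrules the coarse one
        # Fine pass: binary search for the first frame whose label
        # differs from the label at the left end of the window.
        left = fine(lo)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if fine(mid) == left:
                lo = mid
            else:
                hi = mid
        boundaries.append((hi, fine(hi)))
    return boundaries
```

In this sketch the number of fine-classifier calls grows with the number of detected boundaries and the logarithm of the stride rather than with the length of the video, which is the kind of saving a feature hierarchy is meant to provide.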
