Paper information - Concept-oriented indexing of video databases: toward semantic sensitive retrieval and browsing

Digital video now plays an important role in medical education, health care, telemedicine, and other medical applications. Several content-based video retrieval (CBVR) systems have been proposed, but they still face four challenging problems: the semantic gap, semantic video concept modeling, semantic video classification, and concept-oriented video database indexing and access. In this paper, we propose a novel framework that makes progress toward solving these problems. Specifically, the framework includes: 1) a semantic-sensitive video content representation that uses principal video shots to enhance the quality of features; 2) semantic video concept interpretation that uses a flexible mixture model to bridge the semantic gap; 3) a novel semantic video-classifier training framework that integrates feature selection, parameter estimation, and model selection seamlessly in a single algorithm; and 4) a concept-oriented video database organization technique that uses a domain-dependent concept hierarchy to enable semantic-sensitive video retrieval and browsing.
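To make the fourth component more concrete, the following is a minimal sketch of how a concept hierarchy could serve as a browsing index. It is not the paper's implementation: the `ConceptNode` class, its methods, the concept names, and the shot identifiers are all hypothetical and chosen only for illustration. The sketch assumes that a classifier has already assigned each video shot to a concept, and that retrieving a concept should return the shots under that concept and all of its descendants.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConceptNode:
    """One node of a hypothetical domain-dependent concept hierarchy."""

    name: str
    children: list[ConceptNode] = field(default_factory=list)
    # IDs of shots that were classified directly into this concept.
    shot_ids: list[str] = field(default_factory=list)

    def add_child(self, name: str) -> ConceptNode:
        """Create a sub-concept under this node and return it."""
        child = ConceptNode(name)
        self.children.append(child)
        return child

    def find(self, name: str) -> ConceptNode | None:
        """Depth-first search for a concept by name; None if absent."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def shots(self) -> list[str]:
        """All shot IDs under this concept or any of its descendants."""
        result = list(self.shot_ids)
        for child in self.children:
            result.extend(child.shots())
        return result


# Illustrative hierarchy; the concept names are invented for this example.
root = ConceptNode("medical_education")
clinical = root.add_child("clinical_operation")
lecture = root.add_child("lecture")
clinical.add_child("surgery").shot_ids.append("shot_001")
clinical.add_child("diagnosis").shot_ids.append("shot_002")
lecture.shot_ids.append("shot_003")

# Browsing at a higher-level concept returns shots from all sub-concepts.
print(root.find("clinical_operation").shots())
```

Browsing from the root returns every indexed shot, while descending to a sub-concept narrows the result. A real system would presumably attach feature-based indexes at each node as well, which this sketch leaves out.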
