Accessing Video Contents through Key Objects over IP

To support content-based video database access over the Internet Protocol (IP), three objectives are important: (i) video query by a representative object (key object) or by a statistical characterization of the target content, (ii) bandwidth-efficient browsing over IP, and (iii) scalable, user-centric video transmission over a heterogeneous, variable-bandwidth network. We present a video object extraction and scalable coding system designed to meet these objectives. In our system, key objects that are meaningful to video database users are generated through a human-computer interaction procedure and are tracked across frames. Given a key object, an algorithm classifies a subset of its video object planes (VOPs) as key VOPs. This subset forms the basis of a highly bandwidth-efficient base layer that supports activities such as browsing and query refinement. On top of the base layer, a number of enhancement layers can be defined to progressively increase the spatial and temporal resolution of the retrieved video. Heterogeneous users are expected to subscribe to different numbers of enhancement layers according to their own conditions, such as access authorization, available connection bandwidth, and quality preference.
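The layered subscription idea can be illustrated with a minimal sketch. It is not the system's actual algorithm: the `Subscriber` fields, the `layers_to_subscribe` function, and the bit rates in kbps are hypothetical names and units chosen for illustration. The sketch assumes that enhancement layers are strictly ordered, so that layer k is only useful once layers 1 through k-1 have been received, and that a user takes as many layers as authorization, preference, and bandwidth jointly allow.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subscriber:
    # All three fields are illustrative assumptions, not taken from the paper.
    bandwidth_kbps: float       # available connection bandwidth
    max_authorized_layers: int  # enhancement layers the user is allowed to access
    preferred_layers: int       # enhancement layers the user would like to receive


def layers_to_subscribe(base_kbps: float,
                        enhancement_kbps: List[float],
                        user: Subscriber) -> Optional[int]:
    """Return the number of enhancement layers to add on top of the base layer.

    Returns None when even the base layer exceeds the user's bandwidth.
    """
    if base_kbps > user.bandwidth_kbps:
        return None

    # Authorization and preference cap the count before bandwidth is considered.
    cap = min(user.max_authorized_layers,
              user.preferred_layers,
              len(enhancement_kbps))

    total = base_kbps
    count = 0
    for rate in enhancement_kbps[:cap]:
        # Layers are cumulative: stop at the first one that does not fit.
        if total + rate > user.bandwidth_kbps:
            break
        total += rate
        count += 1
    return count
```

With a 64 kbps base layer and enhancement layers of 128, 256, and 512 kbps, a user with 500 kbps of bandwidth and no other restriction would receive the base layer plus the first two enhancement layers under these assumptions.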
