Hierarchical visual description schemes for still images and video sequences

This paper proposes two description schemes (DSs) for the visual information of an audio-visual (AV) document. The first is devoted to still images: it describes the visual appearance of the image and its structure in terms of regions, as well as its semantic content in terms of objects. The second is devoted to video sequences: it describes the structure of the sequence as well as its semantic content in terms of events, and it includes features such as motion and camera activity. It also involves static visual representations such as key-frames, background mosaics and key-regions. These elements are treated as still images and are described by the first DS.
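The relationship between the two DSs can be pictured as a pair of nested record types. The following is a minimal sketch in Python, not the schema defined in the paper: every class, field and function name (`Region`, `StillImageDS`, `Segment`, `VideoSequenceDS`, `count_regions`) is hypothetical and chosen only for illustration. It assumes that regions and segments each form a tree, and it shows the one point the abstract states explicitly: key-frames, background mosaics and key-regions of a video are themselves described by the still-image DS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """Hypothetical node of the region hierarchy of a still image."""
    label: str
    children: List["Region"] = field(default_factory=list)


@dataclass
class StillImageDS:
    """Illustrative still-image description: structure (regions) plus semantics (objects)."""
    name: str
    regions: List[Region] = field(default_factory=list)
    objects: List[str] = field(default_factory=list)


@dataclass
class Segment:
    """Hypothetical node of the temporal structure of a video sequence."""
    label: str
    motion: str = ""            # free-text placeholder for a motion feature
    camera_activity: str = ""   # free-text placeholder, e.g. "pan"
    key_frames: List[StillImageDS] = field(default_factory=list)
    children: List["Segment"] = field(default_factory=list)


@dataclass
class VideoSequenceDS:
    """Illustrative video description: structure (segments) plus semantics (events)."""
    segments: List[Segment] = field(default_factory=list)
    events: List[str] = field(default_factory=list)
    # Static representations reuse the still-image description.
    background_mosaics: List[StillImageDS] = field(default_factory=list)
    key_regions: List[StillImageDS] = field(default_factory=list)


def count_regions(region: Region) -> int:
    """Count a region and all of its descendants."""
    return 1 + sum(count_regions(child) for child in region.children)


# A key-frame described as a still image with a two-level region hierarchy.
frame = StillImageDS(
    name="key-frame 0",
    regions=[Region("sky"), Region("ground", [Region("car")])],
    objects=["car"],
)

# A one-shot video whose key-frame is the still image above.
video = VideoSequenceDS(
    segments=[Segment("shot 1", camera_activity="pan", key_frames=[frame])],
    events=["car passes"],
)

total_regions = sum(count_regions(r) for r in frame.regions)
```

In this sketch the video description holds references to still-image descriptions rather than duplicating their fields, which mirrors the statement that the static elements "are described by the first DS".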
