Interaction between High-Level and Low-Level Image Analysis for Semantic Video Object Extraction

The task of extracting a semantic video object is split into two subproblems, namely object segmentation and region segmentation. Object segmentation relies on a priori assumptions, whereas region segmentation is data-driven and can be solved automatically. These two subproblems are not mutually independent, and each can benefit from interaction with the other. In this paper, a framework for such interaction is formulated. This representation scheme, based on region segmentation and semantic segmentation, is compatible with the view that image analysis and scene understanding problems can be decomposed into low-level and high-level tasks. Low-level tasks pertain to region-oriented processing, whereas high-level tasks are closely related to object-level processing. This approach emulates the human visual system: what one "sees" in a scene depends on the scene itself (region segmentation) as well as on the cognitive task at hand (semantic segmentation). The higher-level segmentation results in a partition corresponding to semantic video objects. Semantic video objects do not usually have invariant physical properties, and their definition depends on the application. Hence, the definition incorporates complex domain-specific knowledge and is not easy to generalize. For the specific implementation used in this paper, motion is used as a cue to semantic information. Within this framework, an automatic algorithm is presented for computing the semantic partition based on color change detection. The change detection strategy is designed to be immune to sensor noise and local illumination variations. The lower-level segmentation identifies the partition corresponding to perceptually uniform regions. These regions are derived by clustering in an N-dimensional feature space composed of static as well as dynamic image attributes. We propose an interaction mechanism between the semantic and the region partitions that makes it possible to cope with multiple simultaneous objects. Experimental results show that the proposed method extracts semantic video objects with high spatial accuracy and temporal coherence.
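
The abstract does not spell out how the two partitions interact, so the following is only a minimal sketch of one plausible reading, not the paper's algorithm. It assumes a plain per-channel threshold against a reference frame in place of the noise- and illumination-robust change detection described above, and a hypothetical majority rule that labels a whole region as object when enough of its pixels changed. The names `change_mask`, `project_regions`, `threshold`, and `min_fraction` are illustrative.

```python
from collections import defaultdict


def change_mask(frame, background, threshold):
    """Semantic cue (assumed): a pixel is 'changed' when its largest
    per-channel color difference from the background exceeds threshold."""
    return [
        [max(abs(a - b) for a, b in zip(p, q)) > threshold
         for p, q in zip(row_f, row_b)]
        for row_f, row_b in zip(frame, background)
    ]


def project_regions(region_labels, mask, min_fraction=0.5):
    """Interaction step (assumed rule): mark every pixel of a region as
    object when at least min_fraction of that region's pixels changed."""
    total = defaultdict(int)
    changed = defaultdict(int)
    for label_row, mask_row in zip(region_labels, mask):
        for label, is_changed in zip(label_row, mask_row):
            total[label] += 1
            changed[label] += int(is_changed)
    object_regions = {r for r in total if changed[r] / total[r] >= min_fraction}
    return [[label in object_regions for label in row] for row in region_labels]


# Toy 2x4 RGB frame: the right half holds an object, with one pixel
# that the thresholding step misses.
BG = (10, 10, 10)
OBJ = (200, 50, 50)
background = [[BG] * 4, [BG] * 4]
frame = [
    [BG, BG, OBJ, OBJ],
    [BG, (12, 10, 10), OBJ, (11, 10, 10)],
]
# Region partition, given here by hand; the paper obtains it by clustering.
regions = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

mask = change_mask(frame, background, threshold=20)
refined = project_regions(regions, mask)
```

In this toy case the pixel-level mask misses one pixel of region 1, and projecting the mask onto the region partition recovers it. This illustrates why region boundaries can improve the spatial accuracy of a change-based semantic partition.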
