Real time motion analysis toward semantic understanding of video content

Video motion analysis and its applications have been a classic research topic for decades. In this paper, we explore the problem of real-time semantic understanding of video based on motion information. The work divides into two parts: global (camera) motion estimation and object motion analysis. The former involves optical flow analysis and semantic meaning parsing; the latter involves object detection and tracking. Although each of these topics has been studied extensively in the literature, a complete system that combines all of them without human intervention, especially in a real-time application scenario, still merits further investigation. We develop an approach toward this goal and propose an integrated architecture. Experiments demonstrate the usability and efficiency of the proposed system. The results have numerous applications in digital entertainment, such as video and image summarization, annotation, retrieval, and editing.
