Modeling background from compressed video

Background models are widely used in video surveillance and other applications. Methods for constructing background models, and the application algorithms built on them, have mainly been studied in the spatial domain (at pixel level). Many video sources, however, arrive in a compressed format before processing. In this paper, we propose an approach that constructs background models directly from compressed video. The approach uses the information in block-level DCT coefficients to construct accurate pixel-level background models. We implemented three representative background modeling algorithms in the compressed domain, and theoretically explored their properties and their relationship to their spatial-domain counterparts. We also present some general technical improvements that make them suitable for a wider range of applications. The proposed method achieves the same accuracy as methods that construct background models in the spatial domain, at much lower computational cost (50% on average) and with more compact storage.
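The link between block-level DCT coefficients and a pixel-level background can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes 8x8 blocks, an orthonormal DCT-II, unquantized coefficients, and a simple running-average background model, and the names `dct_matrix`, `running_average`, and `alpha` are hypothetical. Because the DCT is linear, averaging the coefficients over time and then inverting gives the same result as averaging the pixels directly.

```python
import numpy as np

N = 8  # assumed block size (typical for JPEG/MPEG-style codecs)

def dct_matrix(n=N):
    """Orthonormal DCT-II basis C, so that coeffs = C @ block @ C.T."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row has its own scaling
    return c

C = dct_matrix()

def block_dct(block):
    return C @ block @ C.T

def block_idct(coeffs):
    return C.T @ coeffs @ C

def running_average(samples, alpha=0.05):
    """Hypothetical running-average background model (alpha is assumed)."""
    model = np.asarray(samples[0], dtype=float)
    for s in samples[1:]:
        model = (1 - alpha) * model + alpha * s
    return model

# Synthetic 8x8 blocks standing in for one block position over 20 frames.
rng = np.random.default_rng(0)
frames = [rng.uniform(0, 255, size=(N, N)) for _ in range(20)]

# Background built at pixel level (spatial domain).
bg_spatial = running_average(frames)

# Background built from DCT coefficients, then mapped back to pixels.
bg_dct = running_average([block_dct(f) for f in frames])
bg_from_dct = block_idct(bg_dct)

# Largest pixel-wise disagreement between the two backgrounds.
max_diff = np.abs(bg_spatial - bg_from_dct).max()
```

In this idealized setting `max_diff` is zero up to floating-point rounding. Real compressed streams add quantization and motion-compensated frames, which the sketch does not model.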
