Detection of Independently Moving Objects in Non-planar Scenes via Multi-Frame Monocular Epipolar Constraint

We present a novel approach for detecting independently moving foreground objects in non-planar scenes captured by a moving camera. We avoid the traditional assumptions that the stationary background is planar, that it can be approximated by one or more dominant planes, or that the camera is orthographic. Instead, we use a multi-frame epipolar constraint on camera motion, derived for a monocular moving camera and defined by the evolving epipolar plane between the moving camera center and the 3D scene points. The constraint is parameterized as a polynomial function of time. Compared with repeatedly computing the inter-frame fundamental matrix, it requires estimating fewer unknowns and separates moving from static objects more consistently across different noise levels. It lets us segment moving objects in general 3D scenes, where other approaches fail because their initial assumptions do not hold, and it provides a natural way to fuse temporal information across multiple frames. We combine optical flow with particle advection to capture all motion in the video over a number of frames in the form of particle trajectories. We then apply the derived multi-frame epipolar constraint to these trajectories and identify those that violate it, thereby segmenting the independently moving objects. We show superior results on a number of moving-camera sequences of non-planar scenes on which other methods fail.
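
The trajectory test described above can be illustrated with a minimal sketch. The abstract does not give the form of the polynomial multi-frame constraint, so the sketch below substitutes the standard two-frame epipolar constraint x2^T F x1 = 0, evaluated between the first point of a trajectory and each later point. The function names, the convention that `F_list[k]` relates frame 0 to frame k+1, the threshold value, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
def epipolar_residual(F, x1, x2):
    """Algebraic residual x2^T F x1 for homogeneous image points x1 and x2."""
    Fx1 = [sum(F[r][c] * x1[c] for c in range(3)) for r in range(3)]
    return sum(x2[r] * Fx1[r] for r in range(3))


def max_trajectory_residual(F_list, trajectory):
    """Largest |residual| between the first trajectory point and each later one.

    Assumed convention (hypothetical): F_list[k] is the fundamental matrix
    relating frame 0 to frame k + 1.
    """
    x0 = (trajectory[0][0], trajectory[0][1], 1.0)
    worst = 0.0
    for F, (u, v) in zip(F_list, trajectory[1:]):
        worst = max(worst, abs(epipolar_residual(F, x0, (u, v, 1.0))))
    return worst


def label_moving(F_list, trajectories, threshold=0.05):
    """Flag a trajectory as independently moving if it violates the constraint.

    The threshold is an illustrative value in normalized image coordinates.
    """
    return [max_trajectory_residual(F_list, t) > threshold for t in trajectories]


# Synthetic setup: identity intrinsics, no rotation, camera translating
# along x, so F equals the cross-product matrix of t = (1, 0, 0).
F_TX = [[0.0, 0.0, 0.0],
        [0.0, 0.0, -1.0],
        [0.0, 1.0, 0.0]]
F_LIST = [F_TX, F_TX]

# A static point only slides along its horizontal epipolar line.
static_traj = [(0.20, 0.1), (0.15, 0.1), (0.10, 0.1)]
# A moving point also drifts vertically, leaving the epipolar line.
moving_traj = [(0.20, 0.1), (0.15, 0.2), (0.10, 0.3)]

LABELS = label_moving(F_LIST, [static_traj, moving_traj])
print(LABELS)
```

In this toy setup the residual reduces to the change in the vertical coordinate, so only the second trajectory is flagged. An object that moves exactly along its epipolar line would pass this two-frame test unnoticed, which is a known degeneracy of the pairwise constraint.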
