Modeling and Visualization of Human Activities for Multicamera Networks

Multicamera networks are growing more complex and covering larger sensing areas in order to capture activities and behavior that evolve over long spatial and temporal windows. This necessitates novel methods to process the information sensed by the network and to visualize it for an end user. In this paper, we describe a system for modeling and on-demand visualization of the activities of groups of humans. Using prior knowledge of the 3D structure of the scene as well as camera calibration, the system localizes humans as they navigate the scene. Activities of interest are detected by matching models of these activities, learned a priori, against the multiview observations. The trajectories and the activity index for each individual summarize the dynamic content of the scene. These are used to render the scene with virtual 3D human models that mimic the observed activities of real humans. In particular, the rendering framework is designed to handle large displays with a cluster of GPUs and to reduce cognitive dissonance by rendering realistic weather effects and illumination. We envision the use of this system for immersive visualization as well as for summarization of videos that capture group behavior.
