Multi-camera multi-object voxel-based Monte Carlo 3D tracking strategies

This article presents a new approach to the problem of simultaneously tracking several people in low-resolution sequences from multiple calibrated cameras. Redundancy among cameras is exploited to generate a discrete 3D colored representation of the scene, which serves as the starting point of the processing chain. We review how the initiation and termination of tracks influence overall tracker performance, and present a Bayesian approach to create and destroy tracks efficiently. Two Monte Carlo-based schemes adapted to the incoming 3D discrete data are introduced. First, a particle filtering technique is proposed that relies on a volume likelihood function taking into account both occupancy and color information. Second, sparse sampling is presented as an alternative that samples the surface voxels in order to estimate the centroid of each tracked person. In this case, the likelihood function is based on local neighborhood computations, which dramatically decreases the computational load of the algorithm. A discrete 3D re-sampling procedure is introduced to propagate these samples over time. Multiple targets are tracked by means of multiple filters, and interaction among them is modeled through a 3D blocking scheme. Tests on a CLEAR-annotated database yield quantitative results showing the effectiveness of the proposed algorithms in indoor scenarios, and a fair comparison with other state-of-the-art algorithms is presented. We also consider the real-time performance of the proposed algorithms.
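To make the particle filtering idea concrete, the following is a minimal sketch of one predict-weight-resample step over a set of occupied voxels. It is an illustration under stated assumptions, not the authors' implementation: the names `occupancy_likelihood` and `pf_step`, the box-shaped target model, the Gaussian random-walk motion model, and the multinomial resampling are all hypothetical choices, and the color term of the likelihood, the track creation and deletion logic, and the 3D blocking scheme are omitted.

```python
import random


def occupancy_likelihood(center, occupied, half=(1, 1, 2)):
    """Fraction of occupied voxels inside a box centred on `center`.

    `occupied` is a set of integer (x, y, z) voxel indices. The box
    half-sizes in `half` are an assumed, simplified person model;
    color information is not used in this sketch.
    """
    cx, cy, cz = (int(round(c)) for c in center)
    hx, hy, hz = half
    total = (2 * hx + 1) * (2 * hy + 1) * (2 * hz + 1)
    hits = sum(
        (x, y, z) in occupied
        for x in range(cx - hx, cx + hx + 1)
        for y in range(cy - hy, cy + hy + 1)
        for z in range(cz - hz, cz + hz + 1)
    )
    return hits / total


def pf_step(particles, occupied, rng, sigma=1.0):
    """One bootstrap particle filter step for a single target.

    Returns the resampled particle set and the weighted-mean
    centroid estimate computed before resampling.
    """
    # Predict: Gaussian random walk (assumed motion model).
    moved = [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in particles]
    # Weight: occupancy likelihood; the small constant avoids all-zero weights.
    weights = [occupancy_likelihood(p, occupied) + 1e-9 for p in moved]
    norm = sum(weights)
    weights = [w / norm for w in weights]
    # Estimate: weighted mean of the particle positions.
    estimate = tuple(
        sum(w * p[d] for w, p in zip(weights, moved)) for d in range(3)
    )
    # Resample: multinomial resampling with replacement.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, estimate
```

In use, `pf_step` would be called once per frame with the voxel set reconstructed for that frame, feeding the returned particles back in as the next prior. Tracking several people would need one such filter per target plus some interaction model, which this sketch does not attempt.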
