Visual tracking of human visitors under variable-lighting conditions for a responsive audio art installation

We introduce a single-camera statistical segmentation and tracking algorithm for a responsive audio art installation in a skylit atrium. The algorithm combines statistical background image estimation, per-pixel Bayesian segmentation, and an approximate solution to the multi-target tracking problem that uses a bank of Kalman filters and Gale-Shapley matching. A heuristic confidence model enables selective filtering of tracks based on dynamic data. We demonstrate that our algorithm improves recall and F2-score (the F-measure with β = 2, which weights recall more heavily than precision) over existing methods in OpenCV 2.1 in a variety of situations. We further demonstrate that feedback between the tracking and segmentation systems improves recall and F2-score. The system operated effectively for 5-8 hours per day for 4 months; the algorithms are evaluated on video from the camera installed in the atrium. Source code and sample data are open source and available in OpenCV.
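To make the data-association step concrete, the following is a minimal sketch of how Gale-Shapley matching could pair per-track predictions (for example, positions predicted by each Kalman filter) with detected blob centroids. It is an illustration under stated assumptions, not the implementation evaluated here: the Euclidean-distance preference ordering, the gating threshold `max_dist`, and the function name `gale_shapley_match` are all hypothetical choices introduced for this example.

```python
import math


def gale_shapley_match(predicted, detections, max_dist=50.0):
    """Stable matching of track predictions to detections (illustrative sketch).

    predicted:  list of (x, y) positions, one per track (assumed to come
                from each track's filter prediction step).
    detections: list of (x, y) blob centroids from segmentation.
    max_dist:   hypothetical gate; pairs farther apart are never matched.
    Returns {track_index: detection_index}; unmatched tracks are omitted.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Each track ranks the detections nearest-first, dropping gated ones.
    prefs = []
    for p in predicted:
        ranked = sorted(range(len(detections)),
                        key=lambda j: dist(p, detections[j]))
        prefs.append([j for j in ranked if dist(p, detections[j]) <= max_dist])

    next_choice = [0] * len(predicted)  # index of each track's next proposal
    holder = {}                         # detection index -> track index
    free = list(range(len(predicted)))  # tracks still looking for a match

    while free:
        t = free.pop()
        if next_choice[t] >= len(prefs[t]):
            continue  # preference list exhausted; track stays unmatched
        d = prefs[t][next_choice[t]]
        next_choice[t] += 1
        current = holder.get(d)
        if current is None:
            holder[d] = t  # detection was free; accept the proposal
        elif dist(predicted[t], detections[d]) < dist(predicted[current], detections[d]):
            holder[d] = t  # detection prefers the closer track
            free.append(current)
        else:
            free.append(t)  # rejected; track will try its next choice

    return {t: d for d, t in holder.items()}


if __name__ == "__main__":
    tracks = [(0.0, 0.0), (10.0, 0.0)]
    blobs = [(9.0, 0.0), (1.0, 0.0), (100.0, 100.0)]
    print(gale_shapley_match(tracks, blobs))
```

In this sketch, tracks that end up unmatched could continue on their filter prediction alone, and unmatched detections could seed new tracks; how the installation's system handles those cases is not specified in this abstract.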
