Dynamic Processing Allocation in Video

Large stores of digital video pose severe computational challenges to existing video analysis algorithms. In applying these algorithms, users must often trade accuracy against processing speed, because many sophisticated and effective algorithms demand so much computation that applying them throughout long videos is impractical. One can save considerable effort by applying these expensive algorithms sparingly, directing their application using the results of cheaper, more limited processing. We show how to do this for retrospective video analysis by modeling a video using a chain graphical model and performing inference both to analyze the video and to direct processing. We apply our method to problems in background subtraction and face detection, and show in experiments that this leads to significant improvements over baseline algorithms.
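
As a rough illustration of the idea, the sketch below treats a video as a two-state chain (for example, "face absent" versus "face present" in each frame), runs forward-backward inference on the outputs of a cheap detector, and spends a fixed budget of expensive detector calls on the frames whose posterior is most uncertain. This is a minimal sketch under stated assumptions, not the paper's algorithm: the two-state model, the `stay` transition probability, the uncertainty-based selection rule, and the names `forward_backward`, `select_frames`, `analyze`, and `expensive_fn` are all hypothetical.

```python
def forward_backward(likelihoods, stay=0.9, prior=0.5):
    """Return P(state_t = 1 | all observations) for a two-state chain.

    likelihoods[t] is a pair (P(obs_t | state 0), P(obs_t | state 1)).
    `stay` (assumed) is the probability that the state persists between frames.
    """
    n = len(likelihoods)
    trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]

    # Forward pass, normalised at each step to avoid numerical underflow.
    fwd = []
    prev = [1.0 - prior, prior]
    for t, (l0, l1) in enumerate(likelihoods):
        if t == 0:
            pred = prev
        else:
            pred = [sum(prev[i] * trans[i][j] for i in range(2)) for j in range(2)]
        unnorm = [pred[0] * l0, pred[1] * l1]
        z = unnorm[0] + unnorm[1]
        prev = [unnorm[0] / z, unnorm[1] / z]
        fwd.append(prev)

    # Backward pass, also normalised at each step.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = likelihoods[t + 1]
        b = [sum(trans[i][j] * nxt[j] * bwd[t + 1][j] for j in range(2)) for i in range(2)]
        z = b[0] + b[1]
        bwd[t] = [b[0] / z, b[1] / z]

    # Combine both passes into per-frame marginals.
    posteriors = []
    for f, b in zip(fwd, bwd):
        u0, u1 = f[0] * b[0], f[1] * b[1]
        posteriors.append(u1 / (u0 + u1))
    return posteriors


def select_frames(posteriors, budget):
    """Pick the `budget` frames whose posterior is closest to 0.5 (assumed rule)."""
    ranked = sorted(range(len(posteriors)), key=lambda t: abs(posteriors[t] - 0.5))
    return sorted(ranked[:budget])


def analyze(cheap_likelihoods, expensive_fn, budget):
    """Infer from cheap evidence, refine the chosen frames, then infer again.

    expensive_fn(t) is a hypothetical callback returning the likelihood pair
    produced by the costly algorithm on frame t; here it simply replaces the
    cheap likelihood for that frame.
    """
    likelihoods = list(cheap_likelihoods)
    chosen = select_frames(forward_backward(likelihoods), budget)
    for t in chosen:
        likelihoods[t] = expensive_fn(t)
    return forward_backward(likelihoods), chosen
```

Calling `analyze(cheap, expensive_fn, budget)` returns the refined per-frame posteriors together with the indices of the frames that received expensive processing. Selecting by posterior uncertainty is only one simple heuristic for directing processing; other selection criteria would fit the same loop.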
