Real-time Visual Tracker by Stream Processing

In this work, we implement a real-time visual tracker that estimates the position and 3D pose of objects, specifically faces, in video sequences. Running the computations on stream processors and using efficient sparse-template-based particle filtering allows us to achieve real-time processing even when tracking multiple objects simultaneously in high-resolution video frames. Stream processing is a relatively new computing paradigm that permits data-parallel algorithms to be expressed with minimal effort and executed with great efficiency. Using a GPU (graphics processing unit, a consumer-grade stream processor) and NVIDIA CUDA™ technology, we achieve speedups of up to ten times over a similar CPU-only tracker. At the same time, the stream processing approach opens the door to other computing devices, such as the Cell/BE™ or multicore CPUs.
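
To make the particle-filtering loop concrete, the following is a minimal sketch of one predict, weight, and resample cycle using a sparse template. It is not the tracker described above: it assumes a 2D translation-only state (the tracker also estimates 3D pose), a hypothetical Gaussian likelihood with an arbitrary `sigma`, a random-walk motion model, and plain Python rather than CUDA. All function names are illustrative. The point it shows is that each particle's weight depends only on that particle and a handful of template points, which is what makes the weighting step data-parallel.

```python
import math
import random


def make_blob_image(width, height, cx, cy, spread=4.0):
    """Synthetic grayscale frame: one smooth bright blob on a dark background."""
    return [[255.0 * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * spread ** 2))
             for x in range(width)] for y in range(height)]


def make_sparse_template(image, left, top, size=16, step=4):
    """Keep only a sparse grid of (dx, dy, intensity) points from a patch."""
    return [(dx, dy, image[top + dy][left + dx])
            for dy in range(0, size + 1, step)
            for dx in range(0, size + 1, step)]


def sparse_template_weight(image, template, px, py, sigma=10.0):
    """Likelihood of one particle, evaluated only at the sparse template points."""
    height, width = len(image), len(image[0])
    err = 0.0
    for dx, dy, intensity in template:
        x, y = px + dx, py + dy
        if 0 <= x < width and 0 <= y < height:
            diff = image[y][x] - intensity
        else:
            diff = 255.0  # assumed penalty for points outside the frame
        err += diff * diff
    mse = err / len(template)
    return math.exp(-mse / (2.0 * sigma * sigma))


def particle_filter_step(image, template, particles, rng, noise=2):
    """One predict / weight / estimate / resample cycle over (x, y) particles."""
    # Predict: random-walk motion model (an assumption of this sketch).
    moved = [(x + rng.randint(-noise, noise), y + rng.randint(-noise, noise))
             for x, y in particles]
    # Weight: independent per particle, so this loop maps to one GPU thread each.
    weights = [sparse_template_weight(image, template, x, y) for x, y in moved]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / len(moved)] * len(moved)  # all weights underflowed
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the particle positions.
    est_x = sum(w * x for w, (x, _) in zip(weights, moved))
    est_y = sum(w * y for w, (_, y) in zip(weights, moved))
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, (est_x, est_y)


if __name__ == "__main__":
    rng = random.Random(0)
    frame = make_blob_image(40, 40, cx=20, cy=15)
    template = make_sparse_template(frame, left=12, top=7)
    particles = [(rng.randint(8, 16), rng.randint(3, 11)) for _ in range(200)]
    for _ in range(5):
        particles, estimate = particle_filter_step(frame, template, particles, rng)
    print("estimated top-left corner:", estimate)
```

In a stream-processing implementation the weighting loop would typically become a kernel launched over all particles, while resampling needs some coordination across particles and is usually handled separately.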
