Multi-target tracking in time-lapse video forensics

To help an officer efficiently review many hours of surveillance recordings, we develop a system for automated video analysis. We introduce a multi-target tracking algorithm that operates on recorded video. Besides being robust to visual challenges (such as partial and full occlusion, and variation in illumination and camera view), our algorithm is also robust to temporal challenges, i.e., unknown variation in frame rate. Variation in frame rate is a complication because it invalidates motion estimation, so tracking algorithms that rely on motion models show decreased performance. Appearance-based tracking, on the other hand, suffers from a plethora of false detections. Our tracking algorithm, although it relies on appearance-based detection, deals robustly with the drawbacks of both approaches. The solution rests on the fact that, because the video is recorded, we can make fully informed choices based not only on preceding frames but also on following frames. It works as follows. We assume an object detection algorithm that is able to detect all target objects present in each frame. From these detections we build a graph structure. The detections form the graph's nodes. The edges are formed by connecting each detection in one frame to all detections in the following frame. Thus, each path through the graph represents a particular selection of successive object detections. Object tracking is then reformulated as a heuristic search for optimal paths, where an optimal path contains all detections belonging to a single object and excludes any other detection. We show that this approach, without an explicit motion model, is robust to both the visual and temporal challenges.
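The graph formulation above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: each detection is represented by a normalised appearance histogram, the edge cost is the Bhattacharyya distance between histograms, and the path search is a plain uniform-cost search (a best-first search with a zero heuristic) rather than the heuristic search the abstract refers to. The function names `bhattacharyya_distance`, `build_graph`, and `cheapest_path` are hypothetical.

```python
import heapq
import math


def bhattacharyya_distance(p, q):
    """Distance between two normalised histograms (assumed appearance cue)."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    bc = min(1.0, max(bc, 1e-12))  # clamp so the cost is finite and non-negative
    return -math.log(bc)


def build_graph(frames):
    """Connect every detection in frame t to every detection in frame t + 1.

    frames: list of frames, each a list of appearance histograms.
    Returns a dict mapping node (t, i) to a list of ((t + 1, j), cost) edges.
    """
    edges = {}
    for t in range(len(frames) - 1):
        for i, hist in enumerate(frames[t]):
            edges[(t, i)] = [
                ((t + 1, j), bhattacharyya_distance(hist, other))
                for j, other in enumerate(frames[t + 1])
            ]
    return edges


def cheapest_path(frames, edges, start):
    """Uniform-cost search from a detection in frame 0 to the last frame."""
    last = len(frames) - 1
    queue = [(0.0, [start])]
    expanded = set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node[0] == last:
            return cost, path
        if node in expanded:
            continue
        expanded.add(node)
        for neighbour, edge_cost in edges.get(node, []):
            if neighbour not in expanded:
                heapq.heappush(queue, (cost + edge_cost, path + [neighbour]))
    return math.inf, []


# Toy input: two objects whose detection order swaps in the middle frame.
frames = [
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.1, 0.9], [0.9, 0.1]],
    [[0.9, 0.1], [0.1, 0.9]],
]
cost, path = cheapest_path(frames, build_graph(frames), start=(0, 0))
print(path)
```

Because no motion model is involved, the edge cost does not depend on how much time elapsed between two frames, which is why a formulation of this kind is unaffected by an unknown frame rate. Tracking several targets would require repeating the search per target and handling missed or false detections, which this sketch leaves out.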
