Video surveillance of interactions

This paper describes an automatic surveillance system that labels events and interactions in an outdoor environment. The system is designed to monitor activities in an open parking lot and consists of three components: an adaptive tracker; an event generator, which maps object tracks onto a set of predetermined discrete events; and a stochastic parser. It segments and labels surveillance video of the parking lot and identifies person-vehicle interactions such as pick-up and drop-off. The system was developed jointly by the MIT Media Lab and the MIT Artificial Intelligence Lab.
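To make the role of the middle component concrete, the following is a minimal sketch of an event generator that turns object tracks into a time-ordered stream of discrete events, which a parser could then consume. It is an illustration under stated assumptions, not the system's actual implementation: the `Track` structure, the event names (`enter`, `stopped`, `moved`, `exit`), and the `stop_speed` threshold are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Track:
    """Hypothetical tracker output: one object followed over time."""
    track_id: int
    kind: str      # assumed labels, e.g. "person" or "car"
    points: list   # [(t, x, y), ...] ordered by time


def generate_events(track, stop_speed=0.5):
    """Map one track onto discrete events (assumed event vocabulary)."""
    pts = track.points
    if not pts:
        return []
    events = [(pts[0][0], f"{track.kind}-enter")]
    stopped = False
    for (t0, x0, y0), (t1, x1, y1) in zip(pts, pts[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        speed = hypot(x1 - x0, y1 - y0) / dt
        if speed < stop_speed and not stopped:
            events.append((t1, f"{track.kind}-stopped"))
            stopped = True
        elif speed >= stop_speed and stopped:
            events.append((t1, f"{track.kind}-moved"))
            stopped = False
    events.append((pts[-1][0], f"{track.kind}-exit"))
    return events


def event_stream(tracks):
    """Merge per-track events into one time-ordered symbol stream."""
    merged = []
    for track in tracks:
        for t, name in generate_events(track):
            merged.append((t, name, track.track_id))
    # Stable sort: events with equal timestamps keep their emission order.
    merged.sort(key=lambda event: event[0])
    return merged


car = Track(1, "car", [(0, 0.0, 0.0), (1, 5.0, 0.0), (2, 5.1, 0.0),
                       (3, 5.1, 0.0), (4, 10.0, 0.0)])
person = Track(2, "person", [(2, 5.0, 1.0), (3, 7.0, 1.0)])
stream = event_stream([car, person])
```

In this sketch the merged stream plays the role of the input string for the parser: each tuple is a terminal symbol tagged with a timestamp and the identity of the track that produced it.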
