Learning the Behavior of Users in a Public Space through Video Tracking

This paper describes a video tracking system that tracks users in a public space and analyzes their behavioral patterns. With it we obtained statistical measurements of users' behavior that can be used to evaluate architectural design in terms of human spatial behavior and to model the behavior of users in public spaces. Previously, such measurements could be obtained only through costly manual processes, e.g., behavioral mapping and time-lapse filming with human examiners. Our system automates this analysis. It consists of a head detector, which detects people in each individual video frame, and a data association step, which tracks people across frames. On a small data set, we compared the system's results with those obtained by manual counting and found them to be fairly accurate. We then applied the system to a large-scale data set and obtained substantial statistical measurements, including the total number of users who entered the space, the total number of users who sat by a fountain, and the time each of them spent by the fountain. These statistics allow a fundamental rethinking of the way people use a public space. This research is a novel application of computer vision to the evaluation of architectural design in terms of human behavior.
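
The two-stage pipeline described above (per-frame head detection followed by data association) can be illustrated with a minimal sketch. The abstract does not say which association algorithm the system uses, so the greedy nearest-neighbour linking below is a stand-in, not the authors' method. All names (`Track`, `associate`, `dwell_frames`, `max_dist`) and the rectangular "fountain" region are hypothetical, and the head detections are assumed to arrive as lists of `(x, y)` image coordinates, one list per frame.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Track:
    """One person's trajectory: a list of (frame_index, x, y) points."""
    track_id: int
    points: list = field(default_factory=list)


def associate(frames, max_dist=30.0):
    """Link per-frame detections into tracks by greedy nearest-neighbour matching.

    frames: list of frames, each a list of (x, y) head detections.
    max_dist: assumed gating distance in pixels; farther detections never match.
    A track that finds no match in a frame is ended (no occlusion handling).
    """
    tracks = []   # every track ever created
    active = []   # tracks that were matched in the previous frame
    for t, detections in enumerate(frames):
        unmatched = list(detections)
        still_active = []
        for track in active:
            _, px, py = track.points[-1]
            # Closest remaining detection to the track's last known position.
            best = min(
                unmatched,
                key=lambda d: hypot(d[0] - px, d[1] - py),
                default=None,
            )
            if best is not None and hypot(best[0] - px, best[1] - py) <= max_dist:
                track.points.append((t, best[0], best[1]))
                unmatched.remove(best)
                still_active.append(track)
        # Any detection left over starts a new track (a person entering).
        for x, y in unmatched:
            track = Track(track_id=len(tracks))
            track.points.append((t, x, y))
            tracks.append(track)
            still_active.append(track)
        active = still_active
    return tracks


def dwell_frames(track, region):
    """Count the frames in which a track lies inside region = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = region
    return sum(1 for _, x, y in track.points if x0 <= x <= x1 and y0 <= y <= y1)
```

Under these assumptions, `len(associate(frames))` approximates the number of users who entered the space, and `dwell_frames(track, fountain_region)` divided by the frame rate approximates the time a user spent by the fountain. A practical system would also need to handle missed detections and occlusions, which this sketch ignores.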
