HERMES: A research project on human sequence evaluation

Human Sequence Evaluation concentrates on how to extract descriptions of human behaviour from videos in a restricted discourse domain, such as (i) pedestrians crossing inner-city roads, where pedestrians appear approaching or waiting at bus or tram stops, and (ii) humans in indoor worlds like an airport hall, a train station, or a lobby. These discourse domains allow us to explore a coherent evaluation of human movements and facial expressions across a wide variation of scale. This general approach lends itself to various cognitive surveillance scenarios at varying degrees of resolution: from wide-field-of-view, multiple-agent scenes through to more specific inferences of emotional state that could be elicited from high-resolution imagery of faces. The true challenge of the HERMES project will consist in developing a system facility which starts with basic knowledge about pedestrian behaviour in the chosen discourse domain but can cluster evaluation results into semantically meaningful subsets of behaviours. The envisaged system will comprise an internal logic-based representation which enables it to comment on each individual subset, giving natural-language explanations of why the system has created the subset in question.
