Natural Language Descriptions of Human Behavior from Video Sequences

This contribution addresses the generation of textual descriptions, in several natural languages, of human behavior evaluated from video sequences. Geometrical information extracted from videos of the scenario is converted into fuzzy-logic predicates, which serve as the internal representation of the conceptual data and allow situations to be analyzed over time in a deterministic fashion by means of Situation Graph Trees (SGTs). The results of the analysis are stored in structures proposed by Discourse Representation Theory (DRT), which facilitate the subsequent generation of natural language text. This set of tools has proved suitable for the specified purpose.
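The first step, turning a geometric measurement into fuzzy predicates, can be illustrated with a minimal sketch. It is not the authors' implementation: the trapezoidal membership functions, the speed thresholds (in m/s), the predicate name `has_speed`, and the helper names are all assumptions chosen for illustration.

```python
# Illustrative sketch only: thresholds, labels, and names are hypothetical.

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear in between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge

# Assumed speed categories in metres per second.
SPEED_LABELS = {
    "standing": (0.0, 0.0, 0.1, 0.4),
    "walking": (0.1, 0.4, 1.8, 2.5),
    "running": (1.8, 2.5, 10.0, 10.0),
}

def speed_predicates(agent, frame, speed):
    """Return (predicate, agent, frame, label, degree) tuples with degree > 0."""
    facts = []
    for label, params in SPEED_LABELS.items():
        degree = trapezoid(speed, *params)
        if degree > 0.0:
            facts.append(("has_speed", agent, frame, label, degree))
    return facts

if __name__ == "__main__":
    for fact in speed_predicates("agent_1", 42, 0.25):
        print(fact)
```

A measured speed that falls between two categories yields two predicates with partial degrees of validity, which is what lets the later temporal reasoning work on graded rather than hard-thresholded facts.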
