Video Processing and Understanding Tools for Augmented Multisensor Perception and Mobile User Interaction in Smart Spaces

In this paper, a complete Smart Space architecture and a related system prototype are presented. The system is able to analyze situations of interest in a given environment and to produce related contextual information. Experimental results show that video information plays a major role in both situation perception and personalized context-aware communications. For this reason, the proposed multisensor system automatically extracts information from multiple cameras as well as from diverse sensors describing the environment status. This information is then used to trigger personalized and context-aware video messages adaptively sent to users. A rule-based module is in charge of customizing video messages according to the user profile, the contextual situation, and the user's terminal. The system outputs graphically generated video messages consisting of an animated avatar (i.e., a Virtual Character), closing the loop with users. The proposed results validate the conceptual schema behind the architecture and the successful ...
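As a rough illustration of the rule-based customization step described in the abstract, the following minimal Java sketch maps a detected situation, a user profile, and a terminal type to a personalized message description. All names (Context, VideoMessage, the example rules, bitrates) are hypothetical and chosen only for illustration; the paper's actual rule base, message format, and avatar parameters are not reproduced here.

```java
// Hypothetical sketch of a rule-based message customization module.
// It is NOT the paper's implementation: types, rule conditions, and
// output fields are illustrative assumptions only. Requires Java 16+.

import java.util.ArrayList;
import java.util.List;

public class MessageCustomizer {

    // Simplified contextual input assumed to come from the multisensor analysis stage.
    record Context(String situation, String userRole, String terminal) {}

    // Description of the video message to be synthesized by the avatar engine.
    record VideoMessage(String text, String avatarStyle, int targetBitrateKbps) {}

    // Each rule tests the context and, if it matches, produces a message.
    interface Rule {
        boolean matches(Context ctx);
        VideoMessage apply(Context ctx);
    }

    private final List<Rule> rules = new ArrayList<>();

    public MessageCustomizer() {
        // Example rule: crowding detected, visitor on a mobile terminal
        // -> short, low-bitrate advisory message.
        rules.add(new Rule() {
            public boolean matches(Context ctx) {
                return ctx.situation().equals("crowding") && ctx.terminal().equals("mobile");
            }
            public VideoMessage apply(Context ctx) {
                return new VideoMessage(
                        "The area ahead is crowded, please use the side exit.",
                        "neutral", 128);
            }
        });
        // Example rule: intrusion detected, security staff
        // -> detailed, higher-quality alert.
        rules.add(new Rule() {
            public boolean matches(Context ctx) {
                return ctx.situation().equals("intrusion") && ctx.userRole().equals("security");
            }
            public VideoMessage apply(Context ctx) {
                return new VideoMessage(
                        "Intrusion detected in a restricted area, please verify.",
                        "alert", 512);
            }
        });
    }

    // Return the first matching rule's message, or a default notification.
    public VideoMessage customize(Context ctx) {
        for (Rule r : rules) {
            if (r.matches(ctx)) {
                return r.apply(ctx);
            }
        }
        return new VideoMessage("No relevant event for your profile.", "neutral", 128);
    }

    public static void main(String[] args) {
        MessageCustomizer customizer = new MessageCustomizer();
        VideoMessage msg = customizer.customize(new Context("crowding", "visitor", "mobile"));
        System.out.println(msg);
    }
}
```

A first-match rule list is used here purely for brevity; a production rule engine would normally handle conflict resolution and combine user-profile, situation, and terminal constraints before the avatar rendering stage.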
