Detecting Human Behavior Models From Multimodal Observation in a Smart Home

This paper addresses the learning and recognition of human behavior models from multimodal observation in a smart home environment. The proposed approach is part of a framework for acquiring a high-level contextual model of human behavior in an augmented environment. A 3-D video tracking system creates and tracks entities (persons) in the scene. A speech activity detector analyzes the audio streams coming from headset microphones and determines whether each entity is speaking, while an ambient sound detector detects noises in the environment. An individual role detector derives basic activities such as "walking" or "interacting with table" from the entity properties extracted by the 3-D tracker. From the derived multimodal observations, different situations such as "aperitif" or "presentation" are learned and detected using statistical models (hidden Markov models, HMMs). The objective of the proposed general framework is twofold: the automatic offline analysis of human behavior recordings and the online detection of learned human behavior models. To evaluate the approach, several multimodal recordings covering different situations were conducted. The results, in particular for offline analysis, are very good, showing that both multimodality and multiperson observation generation are beneficial for situation recognition.
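
To make the situation-recognition step concrete, the following is a minimal sketch (not the authors' implementation) of the general scheme the abstract describes: one HMM is trained per situation on sequences of multimodal observation vectors (individual role, speech activity, ambient sound), and a new sequence is classified by the model with the highest log-likelihood. The situation labels, the role set, the feature encoding, and the use of hmmlearn's GaussianHMM are illustrative assumptions, not details taken from the paper.

```python
# Sketch only: per-situation HMMs over multimodal observations,
# classification by maximum log-likelihood.
import numpy as np
from hmmlearn import hmm

SITUATIONS = ["aperitif", "presentation"]  # hypothetical label set


def encode_observation(role, speaking, ambient_noise):
    """Encode one multimodal observation (individual role, speech activity,
    ambient sound) as a numeric feature vector. The role set is assumed."""
    roles = ["walking", "interacting with table", "sitting"]
    vec = [1.0 if role == r else 0.0 for r in roles]
    vec.append(1.0 if speaking else 0.0)
    vec.append(1.0 if ambient_noise else 0.0)
    return vec


def train_situation_models(labeled_sequences, n_states=3):
    """labeled_sequences: dict mapping situation -> list of sequences,
    each sequence a list of (role, speaking, ambient_noise) tuples."""
    models = {}
    for situation, sequences in labeled_sequences.items():
        X = np.vstack([[encode_observation(*o) for o in seq] for seq in sequences])
        lengths = [len(seq) for seq in sequences]
        model = hmm.GaussianHMM(n_components=n_states,
                                covariance_type="diag", n_iter=50)
        model.fit(X, lengths)
        models[situation] = model
    return models


def detect_situation(models, sequence):
    """Return the situation whose HMM assigns the highest log-likelihood
    to the observed multimodal sequence."""
    X = np.array([encode_observation(*o) for o in sequence])
    return max(models, key=lambda s: models[s].score(X))
```

In this sketch, offline analysis corresponds to scoring whole recorded sequences, while online detection would score a sliding window of recent observations with the same trained models.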
