Online Recognition of Daily Activities by Color-Depth Sensing and Knowledge Models

Visual activity recognition plays a fundamental role in several research fields as a way to extract semantic meaning from images and videos. Prior work has mostly focused on classification tasks, where a label is assigned to a video clip. However, real-life scenarios require a method that can browse a continuous video stream, automatically identify relevant temporal segments, and classify them according to target activities. This paper proposes a knowledge-driven event recognition framework to address this problem. The novelty of the method lies in the combination of a constraint-based ontology language for event modeling with robust algorithms to detect, track and re-identify people using color-depth sensing (Kinect® sensor). This combination makes it possible to model and recognize longer and more complex events and to incorporate domain knowledge and 3D information into the same models. Moreover, the ontology-driven approach makes system decisions understandable to humans and facilitates knowledge transfer across different scenes. The proposed framework is evaluated on real-world recordings of seniors carrying out unscripted daily activities in hospital observation rooms and nursing homes. Results demonstrate that the proposed framework outperforms state-of-the-art methods across a variety of activities and datasets, and that it is robust to variable and low-frame-rate recordings. Further work will investigate how to extend the proposed framework with uncertainty management techniques to handle strong occlusion and ambiguous semantics, and how to exploit it to further support medicine in the timely diagnosis of cognitive disorders, such as Alzheimer’s disease.
