A Real-Time, Multiview Fall Detection System: A LHMM-Based Approach

Automatic detection of a falling person in video sequences has interesting applications in video surveillance and is an important part of future pervasive home monitoring systems. In this paper, we propose a multiview approach to achieve this goal, where motion is modeled using a layered hidden Markov model (LHMM). The posture classification is performed by a fusion unit that merges the decisions provided by the independently processing cameras in a fuzzy logic context. In each view, the fall detection is optimized in a given plane by performing a metric image rectification, which makes it possible to extract simple and robust features and is well suited to real-time processing. A theoretical analysis of the chosen descriptor enables us to define the optimal camera placement for detecting people falling in unspecified situations, and we prove that two cameras are sufficient in practice. Regarding event detection, the LHMM offers a principled way of solving the inference problem. Moreover, the hierarchical architecture decouples the motion analysis into different temporal granularity levels, making the algorithm able to detect very sudden changes and robust to errors in the low-level steps.
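To make the layered idea concrete, the following is a minimal sketch of two stacked discrete HMMs, not the model used in this paper. A lower layer decodes a per-frame posture from a quantized silhouette feature, and an upper layer decodes an event from per-window summaries of those postures. All names, window sizes, state sets, and probabilities here (`POSTURE_HMM`, `EVENT_HMM`, `window_symbol`, `detect_events`, the window of 4 frames) are illustrative assumptions, and plain Viterbi decoding stands in for whatever inference scheme the full system uses.

```python
import math


def viterbi(obs, start, trans, emit):
    """Most likely hidden-state path of a discrete HMM (log domain)."""
    n = len(start)
    # score[s]: best log-probability of any path that ends in state s.
    score = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s]: best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, step, score = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda p: prev[p] + math.log(trans[p][s]))
            step.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + math.log(emit[s][o]))
        back.append(step)
    # Backtrack from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for step in reversed(back):
        state = step[state]
        path.append(state)
    return path[::-1]


# Lower layer (assumed values): posture 0 = upright, 1 = lying.
# Observation 0 = tall silhouette, 1 = flat silhouette.
POSTURE_HMM = dict(
    start=[0.9, 0.1],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    emit=[[0.9, 0.1], [0.1, 0.9]],
)

# Upper layer (assumed values): event 0 = normal activity, 1 = fall.
# Observation 0 = window all upright, 1 = mixed window, 2 = window all lying.
EVENT_HMM = dict(
    start=[0.95, 0.05],
    trans=[[0.9, 0.1], [0.05, 0.95]],
    emit=[[0.8, 0.15, 0.05], [0.05, 0.25, 0.7]],
)


def window_symbol(postures):
    """Summarize one window of decoded postures as an upper-layer symbol."""
    if all(p == 0 for p in postures):
        return 0
    if all(p == 1 for p in postures):
        return 2
    return 1


def detect_events(features, window=4):
    """Decode postures per frame, then events per window of frames."""
    postures = viterbi(features, **POSTURE_HMM)
    symbols = [
        window_symbol(postures[i:i + window])
        for i in range(0, len(postures), window)
    ]
    return postures, viterbi(symbols, **EVENT_HMM)


if __name__ == "__main__":
    # Frame 2 carries a spurious "flat" reading before the actual fall.
    features = [0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
    postures, events = detect_events(features)
    print(postures)
    print(events)
```

With these assumed parameters, the isolated flat reading at frame 2 is smoothed out by the lower layer, and the upper layer switches to the fall state in the window that contains the upright-to-lying transition. This illustrates, on a toy scale, how separating temporal granularities can absorb low-level errors while still reacting to an abrupt change.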
