A Novel Probabilistic Modeling Framework for Person-to-Person Interaction Recognition in Video Surveillance

In this paper, we propose a novel probabilistic modeling framework for the automatic analysis and understanding of human interactions in visual surveillance tasks. Our principal assumption is that an interaction episode is composed of small, meaningful unit interactions, which we call 'sub-interactions.' We model each sub-interaction with a dynamic probabilistic model based on spatio-temporal characteristics, and propose a Modified Factorial Hidden Markov Model (MFHMM) with factored observations. A complete interaction is represented by a network of Dynamic Probabilistic Models (DPMs), formed as an ordered concatenation of sub-interaction models. The rationale for this approach is that common components, i.e., the sub-interaction models, can be reused to describe complex interaction patterns more effectively. We demonstrate the feasibility and effectiveness of the proposed method by analyzing the structure of the network of DPMs and by evaluating it on two databases: a self-collected dataset and Tsinghua University's dataset.
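
The idea of an ordered concatenation of sub-interaction models can be illustrated with a minimal sketch. It is not the MFHMM itself: each sub-interaction is reduced here to a plain Markov chain over hidden states, and the models are chained by letting the last state of one model leak probability into the first state of the next. The function name concatenate_chains, the exit probability, the sub-interaction names 'approach' and 'handshake', and all numeric values are hypothetical and chosen only for illustration.

```python
def concatenate_chains(transitions, exit_probs):
    """Chain sub-model transition matrices into one composite matrix.

    transitions: list of row-stochastic square matrices (lists of lists),
                 one per sub-interaction, in the order they occur.
    exit_probs:  exit_probs[i] is the assumed probability of leaving the
                 last state of sub-model i for the first state of i + 1.
    """
    sizes = [len(A) for A in transitions]
    total = sum(sizes)
    composite = [[0.0] * total for _ in range(total)]

    offset = 0
    for i, A in enumerate(transitions):
        n = sizes[i]
        is_final = i == len(transitions) - 1
        for r in range(n):
            scale = 1.0
            if r == n - 1 and not is_final:
                # Rescale the last state's row so it still sums to one
                # after the exit probability is moved to the next model.
                scale = 1.0 - exit_probs[i]
                composite[offset + r][offset + n] = exit_probs[i]
            for c in range(n):
                composite[offset + r][offset + c] = scale * A[r][c]
        offset += n
    return composite


# Two hypothetical two-state, left-to-right sub-interaction chains.
approach = [[0.7, 0.3],
            [0.0, 1.0]]
handshake = [[0.6, 0.4],
             [0.0, 1.0]]

network = concatenate_chains([approach, handshake], exit_probs=[0.2])
for row in network:
    print(row)
```

Because the composite matrix is built from self-contained blocks, the same sub-interaction chain could in principle appear in several interaction networks, which is the kind of component reuse the framework argues for.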