Toward creation of interaction models: simple objects-interaction approach

This paper presents a proposal for modeling simple object interactions in video surveillance systems. The proposal consists of locating a set of features in each video frame, using regional maxima of the second eigenvalue of the tensor matrix as features. Static features are then discarded (labeled as background), and dynamic features are used to represent objects in motion (foreground). Dynamic features are clustered dynamically with k-neighborhood and the EM algorithm. The centroid of each cluster locally represents a moving object, and its motion through time is given by the displacement of the cluster over several frames. The behavior of the clusters over time helps us to model simple object interactions. Cluster division, cluster fusion, and the motion of a cluster with respect to the others serve as primitives that, combined with causal dependencies across time, provide information about local dynamics, referred to here as local interactions. Based on causal dependency theory, a dependency graph of local centroid behavior can be built, and this graph can represent the local interaction model. In the experimental section, the approach is tested in several scenarios, extracting simple object interactions in controlled and uncontrolled scenarios.
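As an illustration of the first two stages described above (feature location and static/dynamic separation), the following is a minimal sketch, not the authors' implementation. It assumes the "tensor matrix" is the 2x2 structure tensor of image gradients summed over a small window, and it assumes a feature counts as static when a feature was also detected at nearly the same position in the previous frame. The function names, the window `radius`, the `threshold`, and the tolerance `tol` are all hypothetical choices made for this example. The clustering stage (k-neighborhood and EM) is not covered.

```python
import numpy as np

def min_eigenvalue_map(img, radius=1):
    """Smaller eigenvalue of the 2x2 structure tensor at every pixel."""
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)  # derivatives along rows (y) and columns (x)

    def window_sum(a):
        # Sum over a (2*radius+1)^2 neighborhood using edge-padded shifts.
        p = np.pad(a, radius, mode="edge")
        h, w = a.shape
        out = np.zeros_like(a)
        for dy in range(2 * radius + 1):
            for dx in range(2 * radius + 1):
                out += p[dy:dy + h, dx:dx + w]
        return out

    sxx, syy, sxy = window_sum(gx * gx), window_sum(gy * gy), window_sum(gx * gy)
    # Closed-form smaller eigenvalue of [[sxx, sxy], [sxy, syy]].
    return 0.5 * ((sxx + syy) - np.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2))

def local_maxima(score, threshold):
    """Return (x, y) positions that are 3x3 local maxima above threshold."""
    p = np.pad(score, 1, mode="constant", constant_values=-np.inf)
    h, w = score.shape
    neigh = np.stack([p[dy:dy + h, dx:dx + w]
                      for dy in range(3) for dx in range(3)])
    is_max = score >= neigh.max(axis=0)
    ys, xs = np.nonzero(is_max & (score > threshold))
    return np.stack([xs, ys], axis=1)

def split_static_dynamic(prev_pts, curr_pts, tol=1.0):
    """Assumed criterion: a current feature is static if some feature of the
    previous frame lies within tol pixels of it; otherwise it is dynamic."""
    if len(prev_pts) == 0:
        return curr_pts[:0], curr_pts
    d = np.linalg.norm(curr_pts[:, None, :] - prev_pts[None, :, :], axis=2)
    static = d.min(axis=1) <= tol
    return curr_pts[static], curr_pts[~static]
```

In this sketch, a flat region or a straight edge yields a score near zero, while a corner yields a clearly positive score, which is why maxima of the smaller eigenvalue are a reasonable stand-in for trackable features. The dynamic points returned by `split_static_dynamic` would be the input to the clustering stage.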
