A differential geometric approach to representing the human actions

This paper presents a novel representation for human actions that encodes the variations in the shape and motion of the performing actor. When an actor performs an action, the actor's outer boundary is projected onto the image plane as a 2D contour at each time instant. A sequence of such contours forms a 3D volume in spatiotemporal space. Differential geometric analysis of the surface of this volume yields a set of action descriptors. These descriptors constitute the action sketch, which is used to represent human actions. The action sketch captures the changes in the shape and motion of the performing actor in a unified manner. Because the action sketch is obtained from the extrema of differential geometric surface features, it is robust to viewpoint changes. We demonstrate the versatility of the action sketch in the context of action recognition, which is formulated as a view geometric similarity problem.
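The construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the surface features, so the use of Gaussian curvature K and mean curvature H here is an assumption, as are the hypothetical helper names `stack_contours` and `curvatures`, the requirement that every contour has the same number of points in corresponding order, and the finite-difference scheme. The surface is parameterized as f(s, t) = [x(s, t), y(s, t), t], where s runs along the contour and t is the frame index.

```python
import numpy as np

def stack_contours(contours):
    """Stack T closed contours (each N x 2, points in correspondence)
    into a (T, N, 3) spatiotemporal surface f(s, t) = [x, y, t]."""
    contours = np.asarray(contours, dtype=float)            # (T, N, 2)
    T, N, _ = contours.shape
    t = np.broadcast_to(np.arange(T, dtype=float)[:, None, None], (T, N, 1))
    return np.concatenate([contours, t], axis=2)

def _d_ds(f, ds):
    """Central difference along the contour; periodic because it is closed."""
    return (np.roll(f, -1, axis=1) - np.roll(f, 1, axis=1)) / (2.0 * ds)

def curvatures(surface, ds, dt=1.0):
    """Gaussian (K) and mean (H) curvature from the first and second
    fundamental forms of the sampled surface."""
    f_s = _d_ds(surface, ds)
    f_t = np.gradient(surface, dt, axis=0)
    f_ss = _d_ds(f_s, ds)
    f_st = _d_ds(f_t, ds)
    f_tt = np.gradient(f_t, dt, axis=0)

    # Unit normal of the surface.
    n = np.cross(f_s, f_t)
    n /= np.linalg.norm(n, axis=2, keepdims=True)

    # First fundamental form coefficients.
    E = np.sum(f_s * f_s, axis=2)
    F = np.sum(f_s * f_t, axis=2)
    G = np.sum(f_t * f_t, axis=2)
    # Second fundamental form coefficients.
    L = np.sum(f_ss * n, axis=2)
    M = np.sum(f_st * n, axis=2)
    N = np.sum(f_tt * n, axis=2)

    denom = E * G - F ** 2
    K = (L * N - M ** 2) / denom
    H = (E * N - 2.0 * F * M + G * L) / (2.0 * denom)
    return K, H

if __name__ == "__main__":
    # A static circle of radius 5 sweeps out a cylinder over 10 frames.
    n_pts = 200
    theta = np.linspace(0.0, 2.0 * np.pi, n_pts, endpoint=False)
    circle = np.stack([5.0 * np.cos(theta), 5.0 * np.sin(theta)], axis=1)
    surface = stack_contours([circle] * 10)
    K, H = curvatures(surface, ds=2.0 * np.pi / n_pts)
    print(np.abs(K).max(), np.abs(H).mean())
```

For the static circle, the volume is a cylinder, so K is zero everywhere and |H| equals 1/(2r) = 0.1, which gives a quick sanity check of the formulas. In a real sequence, a change in the actor's shape or motion bends the surface along the time axis, and the extrema of such features would be the candidate descriptors.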

[49]  Mubarak Shah, et al.  A differential geometric approach to representing the human actions, 2008.