3D human motion analysis framework for shape similarity and retrieval

3D shape similarity from video is a challenging problem at the heart of many research areas in computer graphics and computer vision. In this paper, we propose a new framework for 3D shape representation and shape similarity in human video sequences. Our shape representation is built on the extremal human curve (EHC) descriptor, extracted from the body surface. This representation lets us exploit Riemannian geometry in the shape space of open curves and therefore compute statistics on that space. It also allows subject poses to be compared regardless of geometric transformations and elastic surface changes. Shape similarity is computed by an efficient method that combines the compact EHC representation in the open curve shape space with an elastic distance measure. Building on these properties, we address several applications of human action analysis: shape similarity computation, video sequence comparison, video segmentation, video clustering, summarization, and motion retrieval. Experiments on both synthetic and real 3D human video sequences show that our approach provides accurate static and temporal shape similarity for pose retrieval in video, compared with state-of-the-art approaches. Moreover, local 3D video retrieval is performed using motion segmentation and the dynamic time warping (DTW) algorithm in the feature vector space. The results are promising and show the potential of this approach.
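To make the DTW-based comparison of motion segments more concrete, the following is a minimal sketch of textbook dynamic time warping between two sequences of per-frame feature vectors. It is not the authors' implementation: the abstract does not specify the feature vectors or the frame-to-frame distance, so the Euclidean distance and the names `dtw_distance` and `euclidean` are assumptions chosen for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors (assumed frame distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Classic DTW cost between two sequences of feature vectors.

    seq_a, seq_b: lists of per-frame feature vectors (hypothetical inputs).
    dist: frame-to-frame distance; Euclidean here by assumption.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a only, in seq_b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    query = [[0.0], [1.0], [2.0]]
    slower = [[0.0], [1.0], [1.0], [2.0]]  # same motion, one frame repeated
    print(dtw_distance(query, slower))
```

Because DTW may repeat frames when aligning, the two toy sequences above, which differ only in speed, align at zero cost. In a retrieval setting one would rank the segments of a database by this cost against a query segment.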
