Shape Similarity for 3D Video Sequences of People

This paper presents a performance evaluation of shape similarity metrics for 3D video sequences of people with unknown temporal correspondence. The similarity measures are compared by evaluating Receiver Operating Characteristic (ROC) curves for classification against ground truth on a comprehensive database of synthetic 3D video sequences comprising animations of fourteen people performing twenty-eight motions. Four static shape similarity metrics (shape distributions, spin images, shape histograms and spherical harmonics) are evaluated using optimal parameter settings for each approach. Shape histograms with volume sampling are found to consistently give the best performance for different people and motions. Static shape similarity is then extended over time to eliminate temporal ambiguity. Time-filtering of the static shape similarity and two novel shape-flow descriptors are evaluated against temporal ground truth. This evaluation demonstrates that shape-flow with a multi-frame alignment of motion sequences achieves the best performance, is stable for different people and motions, and overcomes the ambiguity in static shape similarity. Time-filtering of the static shape histogram similarity measure with a fixed window size achieves marginally lower performance for linear motions at the same computational cost as static shape descriptors. Performance of the temporal shape descriptors is validated on real 3D video sequences of nine actors performing a variety of movements. Time-filtered shape histograms are shown to reliably identify frames from 3D video sequences with similar shape and motion for people with loose clothing and complex motion.
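The abstract does not spell out how time-filtering is computed. One plausible reading, sketched below, is to average a frame-to-frame similarity matrix along its diagonals over a fixed window, so that two frames only score as similar when their neighbouring frames also match in the same temporal order. The function name `time_filter`, the `half_window` parameter and the border handling are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def time_filter(S, half_window):
    """Average a frame-to-frame matrix S along its diagonals (illustrative sketch).

    out[i, j] is the mean of S[i + k, j + k] for k in [-half_window, half_window],
    using only the offsets that stay inside the matrix (an assumed border rule).
    """
    S = np.asarray(S, dtype=float)
    n, m = S.shape
    total = np.zeros((n, m))
    count = np.zeros((n, m))
    for k in range(-half_window, half_window + 1):
        # Rows and columns for which (i + k, j + k) is a valid index.
        i0, i1 = max(0, -k), min(n, n - k)
        j0, j1 = max(0, -k), min(m, m - k)
        if i0 >= i1 or j0 >= j1:
            continue
        total[i0:i1, j0:j1] += S[i0 + k:i1 + k, j0 + k:j1 + k]
        count[i0:i1, j0:j1] += 1
    # k = 0 always contributes, so count is never zero.
    return total / count

# Hypothetical 3x3 static similarity matrix between two short sequences.
S = np.arange(9, dtype=float).reshape(3, 3)
filtered = time_filter(S, half_window=1)
```

With `half_window=0` the function returns the static matrix unchanged, which matches the abstract's remark that time-filtering builds directly on the static descriptor; the window size then trades temporal discrimination against smoothing.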
