Deep motifs and motion signatures

Many analysis tasks for human motion rely on high-level similarity between motion sequences that are not exact matches in joint angles, timing, or ordering of actions. Even the same movement performed by the same person can vary in duration and speed. Similar motions are characterized by similar sets of actions that appear frequently. In this paper we introduce motion motifs and motion signatures, which together form a succinct but descriptive representation of motion sequences. We first break the motion sequences into short-term movements called motion words, and then cluster the words in a high-dimensional feature space to find motifs. Hence, motifs are words that are both common and descriptive, and their distribution represents the motion sequence. To cluster words and find motifs, the challenge is to define an effective feature space in which the distances among motion words are semantically meaningful and variations in speed and duration are handled. To this end, we use a deep neural network to embed the motion words into the feature space using a triplet loss function. To define a signature, we choose a finite set of motion motifs, creating a bag-of-motifs representation for the sequence. Motion signatures are agnostic to movement order, speed, and duration variations, and can distinguish fine-grained differences between motions of the same class. We show examples of characterizing motion sequences by motifs, and demonstrate the use of motion signatures in a number of applications.
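
The pipeline described above (words, triplet-loss embedding, bag of motifs) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the window length and stride, the margin, and the use of squared Euclidean distance are all assumptions made for illustration, and the deep network that produces the embeddings is left out, so `motion_signature` simply takes already-embedded words and already-computed motif centers as input.

```python
import numpy as np

def split_into_words(sequence, word_len, stride):
    """Slice a (frames, features) array into overlapping fixed-length windows.

    Hypothetical helper: the abstract only says sequences are broken into
    short-term "motion words"; window length and stride are assumptions.
    Assumes len(sequence) >= word_len.
    """
    starts = range(0, len(sequence) - word_len + 1, stride)
    return np.stack([sequence[s:s + word_len] for s in starts])

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet hinge loss on squared Euclidean distances.

    Pulls the anchor toward the positive and pushes it away from the
    negative by at least `margin` (an assumed value).
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

def motion_signature(embedded_words, motif_centers):
    """Bag of motifs: normalized histogram of nearest-motif assignments."""
    diffs = embedded_words[:, None, :] - motif_centers[None, :, :]
    nearest = np.linalg.norm(diffs, axis=-1).argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(motif_centers))
    return counts / counts.sum()

# Toy usage: 10 frames with 2 features each, windows of 4 frames, stride 2.
sequence = np.arange(20, dtype=float).reshape(10, 2)
words = split_into_words(sequence, word_len=4, stride=2)

# Stand-in embeddings and motif centers (assumed, not learned here).
embedded = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [0.0, 0.2]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
signature = motion_signature(embedded, centers)
```

Because the signature is a histogram, reordering the embedded words leaves it unchanged, which is the sense in which such a representation ignores movement order.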