Spatiotemporal Similarity Search in 3D Motion Capture Gesture Streams

How to model spatiotemporal similarity between gestures in 3D motion capture data streams is a central question in ongoing research on human communication. Qualitative perceptual analyses of co-speech gestures, which are manual gestures that emerge spontaneously and unconsciously during face-to-face conversation, are feasible on a small-to-moderate scale, but they do not extend to larger scenarios because efficient query processing techniques for spatiotemporal similarity search are lacking. To support qualitative analyses of co-speech gestures, we propose and investigate a simple yet effective distance-based similarity model that leverages the spatial and temporal characteristics of co-speech gestures and enables query-by-example similarity search in 3D motion capture data streams. Experiments on real conversational 3D motion capture data demonstrate the appropriateness of the proposal in terms of accuracy and efficiency.
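To make the query-by-example idea concrete, the following is a minimal sketch and not the similarity model proposed here, whose details the abstract does not give. It assumes each gesture is a list of (x, y, z) positions of a single marker and substitutes plain dynamic time warping (DTW) with Euclidean point costs as the distance. The names `dtw_distance` and `query_by_example` are hypothetical, chosen for illustration only.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 3D trajectories.

    Assumption: a and b are lists of (x, y, z) tuples; the cost of
    aligning two points is their Euclidean distance.
    """
    n, m = len(a), len(b)
    # prev holds row i-1 of the DTW table; row 0 is 0 at column 0, inf elsewhere.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def query_by_example(query, gestures, k=1):
    """Return the k nearest gestures as (distance, index) pairs.

    Assumption: a linear scan over all stored gestures, with no index.
    """
    scored = sorted((dtw_distance(query, g), i) for i, g in enumerate(gestures))
    return scored[:k]


# Hypothetical toy data: a query stroke and two stored gestures.
query = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
gestures = [
    [(0, 0, 0), (1, 0, 0), (1, 0, 0), (2, 0, 0)],  # same path, one repeated frame
    [(0, 5, 0), (1, 5, 0), (2, 5, 0)],             # same shape, shifted in y
]
nearest = query_by_example(query, gestures, k=1)
```

In this toy example the first stored gesture is the nearest match, because DTW absorbs the repeated frame at no cost, whereas the shifted gesture pays for its spatial offset at every aligned point. A linear scan like this costs one quadratic-time DTW computation per stored gesture, which illustrates why efficient query processing matters at larger scales.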
