Gesture segmentation in complex motion sequences

Complex human motion sequences (such as dances) are typically analyzed by segmenting them into shorter motion sequences, called gestures. However, this segmentation process is subjective and varies considerably from one human observer to another. In this paper, we propose an algorithm called hierarchical activity segmentation. The algorithm represents the human anatomy as a dynamic, layered hierarchy whose layers correspond to different segments of the human body, and it uses low-level motion parameters to characterize the motion in each layer. This characterization is used with a naive Bayesian classifier to derive creator profiles from empirical data. These profiles are then used to predict how creators will segment gestures in other motion sequences. When tested on a library of 3D motion capture sequences segmented by two choreographers, the predictions were found to be reasonably accurate.
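The pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation, and the abstract does not specify the details it fills in. The body layers in `BODY_LAYERS`, the use of average segment speed as the low-level motion parameter, the Gaussian likelihood, the variance floor, and the toy training frames are all assumptions chosen for illustration. A "creator profile" is modelled here simply as a naive Bayes classifier fitted to one observer's boundary labels.

```python
import math
from collections import defaultdict

# Hypothetical two-layer body hierarchy (an assumption, not taken from the paper).
BODY_LAYERS = {
    "upper_body": ["left_arm", "right_arm"],
    "lower_body": ["left_leg", "right_leg"],
}

def layer_features(segment_speeds):
    """Reduce per-segment speeds to one average speed per body layer."""
    return [
        sum(segment_speeds[name] for name in members) / len(members)
        for members in BODY_LAYERS.values()
    ]

class GaussianNaiveBayes:
    """Naive Bayes with an independent Gaussian likelihood per feature."""

    def fit(self, rows, labels, var_floor=1e-2):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.stats = {}
        for label, members in grouped.items():
            prior = len(members) / len(rows)
            params = []
            for column in zip(*members):
                mean = sum(column) / len(column)
                # The floor keeps the variance positive on tiny samples.
                var = sum((v - mean) ** 2 for v in column) / len(column) + var_floor
                params.append((mean, var))
            self.stats[label] = (prior, params)
        return self

    def predict(self, row):
        def log_posterior(label):
            prior, params = self.stats[label]
            total = math.log(prior)
            for value, (mean, var) in zip(row, params):
                total += -0.5 * math.log(2 * math.pi * var)
                total -= (value - mean) ** 2 / (2 * var)
            return total
        return max(self.stats, key=log_posterior)

# Toy frames labelled by one hypothetical creator: True marks a gesture boundary.
frames = [
    ({"left_arm": 0.9, "right_arm": 1.1, "left_leg": 0.8, "right_leg": 1.0}, False),
    ({"left_arm": 1.2, "right_arm": 1.0, "left_leg": 0.9, "right_leg": 1.1}, False),
    ({"left_arm": 0.1, "right_arm": 0.3, "left_leg": 0.1, "right_leg": 0.1}, True),
    ({"left_arm": 0.1, "right_arm": 0.1, "left_leg": 0.2, "right_leg": 0.2}, True),
]
features = [layer_features(speeds) for speeds, _ in frames]
labels = [label for _, label in frames]
profile = GaussianNaiveBayes().fit(features, labels)

slow_frame = {"left_arm": 0.15, "right_arm": 0.1, "left_leg": 0.2, "right_leg": 0.1}
fast_frame = {"left_arm": 1.0, "right_arm": 1.0, "left_leg": 1.0, "right_leg": 1.0}
slow_is_boundary = profile.predict(layer_features(slow_frame))
fast_is_boundary = profile.predict(layer_features(fast_frame))
```

In this toy data the creator places boundaries where the body is nearly still, so the fitted profile labels the slow frame as a boundary and the fast frame as not a boundary. Fitting a separate profile per creator on that creator's own labels is one way the observer-specific segmentation mentioned above could be captured.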
