This paper describes a method for the detection, tracking and recognition of lower arm and hand movements from color video sequences using a linguistic approach driven by motion analysis and clustering techniques. The novelty of our method lies in (i) automatic arm detection, without any manual initialization or foreground or background modeling, (ii) gesture representation at different levels of abstraction using a linguistic approach based on signal-to-symbol mapping, and (iii) robust matching for gesture recognition using the weighted largest common sequence (of symbols). Learning vector quantization abstracts the affine motion parameters as morphological primitive units, i.e. "letters"; clustering techniques derive sequences of letters as "words" for both sub-activities and the transitions occurring between them; and, finally, the arm activities are recognized in terms of sequences of certain sub-activities. The method is evaluated on activity cycles that were not available during training, drawn from six kinds of arm movements: slow pounding, fast pounding, striking, swinging, swirling and stirring. The performance achieved is perfect (100%) if one allows, as should be the case for invariance purposes, slow and fast pounding video sequences to be recognized as one and the same type of activity.
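To illustrate the matching step, the following is a minimal sketch of weighted common-subsequence matching over symbol sequences. It is not the authors' implementation: the weighting scheme (one non-negative weight per symbol), the normalization by template weight, and the names `weighted_lcs`, `classify`, `weights` and `templates` are all assumptions made here for illustration, since the abstract does not specify them.

```python
from typing import Dict, Optional, Sequence


def weighted_lcs(a: Sequence[str], b: Sequence[str],
                 weights: Optional[Dict[str, float]] = None,
                 default_weight: float = 1.0) -> float:
    """Maximum total weight of a subsequence common to a and b.

    Assumption: each matched symbol contributes a fixed non-negative
    weight; with all weights equal to 1 this reduces to the classic
    longest-common-subsequence length.
    """
    weights = weights or {}
    n, m = len(a), len(b)
    # dp[i][j] = best score using the first i symbols of a and first j of b
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = max(dp[i - 1][j], dp[i][j - 1])  # skip a symbol
            if a[i - 1] == b[j - 1]:                # or match one
                w = weights.get(a[i - 1], default_weight)
                best = max(best, dp[i - 1][j - 1] + w)
            dp[i][j] = best
    return dp[n][m]


def classify(query: Sequence[str],
             templates: Dict[str, Sequence[str]],
             weights: Optional[Dict[str, float]] = None) -> str:
    """Return the label of the template that best matches the query.

    Assumption: scores are normalized by the total weight of the
    template so that long templates are not favored automatically.
    """
    def score(label: str) -> float:
        template = templates[label]
        total = weighted_lcs(template, template, weights)
        return weighted_lcs(query, template, weights) / total if total else 0.0

    return max(templates, key=score)


if __name__ == "__main__":
    # Hypothetical symbol sequences standing in for sub-activity "words".
    templates = {"pounding": "ABAB", "swinging": "CDCD"}
    print(classify("ABXAB", templates))
```

Because the match is a subsequence rather than a substring, spurious symbols inserted by noisy quantization (the `X` in the usage example) do not prevent a query from aligning with its template, which is one plausible reading of why such matching is described as robust.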