Self-organized Evaluation of Dynamic Hand Gestures for Sign Language Recognition

Human-computer interaction (HCI) is entering our everyday life. We are welcomed by robot guide-bots in Japan [2] and play computer games using Nintendo's Nunchuk controllers [1]. Nevertheless, the revolution is not complete, and computer vision is still under development [21]. In this paper we present an organic computing approach to the recognition of gestures performed by a single person in front of a monocular video camera. Visual gesture recognition has to deal with many well-known problems of image processing, such as camera noise, object tracking, object recognition, and the recognition of a dynamic trajectory. A gesture recognition system therefore has to provide robust feature extraction and adapt to a changing environment and signer. It requires the properties of an organic computing system, in which different autonomous modules cooperate to solve the given problem. Sign language is a good testbed for gesture recognition research because its structure allows methods to be developed and tested on sign language recognition first before applying them to general gesture recognition. We therefore restrict ourselves to signs of British Sign Language (BSL) and concentrate on their manual part. We have to consider that the projection of the 3D scene onto a 2D plane results in a loss of depth information, so the reconstruction of the 3D trajectory of the hand is not always possible. The position of the signer in front of the camera may also vary. Movements such as shifting in one direction or rotating around the body axis must be kept in mind, as well as the occlusion of some fingers or even a whole hand during signing. Despite its constant structure, each sign shows plenty of variation in time and space. Even if the same person performs the same sign twice, small changes in speed and position of the hand will occur. Generally, a sign is
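As a concrete illustration of the depth ambiguity mentioned above, the following minimal Python sketch uses a simple pinhole-camera model (the focal length f and the example trajectories are hypothetical and not part of the presented system) to show why a monocular 2D view cannot always determine the 3D trajectory of the hand.

```python
import numpy as np

def project(points_3d, f=1.0):
    """Pinhole projection: a camera-frame point (X, Y, Z) maps to (f*X/Z, f*Y/Z).

    The depth Z is folded into the image coordinates, so it cannot be
    recovered from a single monocular view.
    """
    points_3d = np.asarray(points_3d, dtype=float)
    return f * points_3d[:, :2] / points_3d[:, 2:3]

# Two different 3D hand trajectories: the second follows the same direction
# but lies twice as far from the camera.
traj_a = np.array([[0.1 * t, 0.05 * t, 1.0] for t in range(5)])
traj_b = 2.0 * traj_a

# Both project to identical 2D image trajectories, illustrating the loss
# of depth information under monocular projection.
print(np.allclose(project(traj_a), project(traj_b)))  # True
```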

[1] Andrew Zisserman et al. Vision based Interpretation of Natural Sign Languages. 2003.

[2] Jochen Triesch et al. Democratic Integration: Self-Organized Integration of Adaptive Cues. Neural Computation, 2001.

[3] Thad Starner et al. Visual Recognition of American Sign Language Using Hidden Markov Models. 1995.

[4] Rolf P. Würtz et al. Communicating Agents Architecture with Applications in Multimodal Human Computer Interaction. GI Jahrestagung, 2004.

[5] Dimitris N. Metaxas et al. Parallel hidden Markov models for American sign language recognition. Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999.

[6] Frank Buschmann et al. A System of Patterns. 1995.

[7] Lawrence R. Rabiner et al. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989.

[8] R. Gray et al. Vector quantization. IEEE ASSP Magazine, 1984.

[9] Peter Sommerlad et al. Pattern-Oriented Software Architecture Volume 1: A System of Patterns. 1996.

[10] Andrew Zisserman et al. Minimal Training, Large Lexicon, Unconstrained Sign Language Recognition. BMVC, 2004.

[11] Norbert Krüger et al. Face Recognition by Elastic Bunch Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997.

[12] Rolf P. Würtz et al. Organic Computing Methods for Face Recognition (Methoden des Organic Computing zur Gesichtserkennung). it - Information Technology, 2005.

[13] Daniel Schneider et al. Rapid Signer Adaptation for Isolated Sign Language Recognition. 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06), 2006.

[14] Olaf Kähler et al. Self-Organizing, Adaptive Data Fusion for 3D Object Tracking. ARCS Workshops, 2005.

[15] Karl-Friedrich Kraiss et al. Video-based sign recognition using self-organizing subunits. Object Recognition Supported by User Interaction for Service Robots, 2002.

[16] Richard Bowden et al. A boosted classifier tree for hand shape detection. Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004.

[17] Dimitris N. Metaxas et al. Handshapes and Movements: Multiple-Channel American Sign Language Recognition. Gesture Workshop, 2003.

[18] Jochen Triesch et al. Classification of hand postures against complex backgrounds using elastic graph matching. Image and Vision Computing, 2002.

[19] Rolf P. Würtz et al. A Flexible Object Model for Recognising and Synthesising Facial Expressions. AVBPA, 2005.

[20] Karl-Friedrich Kraiss et al. Robust Person-Independent Visual Sign Language Recognition. IbPRIA, 2005.

[21] Christoph von der Malsburg et al. Vision as an Exercise in Organic Computing. GI Jahrestagung, 2004.