ICANDO: Low cost multimodal interface for hand disabled people

This article presents ICANDO (Intellectual Computer AssistaNt for Disabled Operators), a multimodal user interface that won first prize in the Loco Mummy Software Contest in 2006. The interface is intended mainly to assist people without hands or with disabilities of their hands or arms, but it can also be useful to ordinary users for hands-free, contactless human-computer interaction. It combines a module for automatic recognition of voice commands in English, French and Russian with a head-tracking module in a single multimodal interface. ICANDO was applied to hands-free operation of a personal computer's graphical user interface in tasks such as Internet communication and work with graphical and text documents. The article describes the purpose and architecture of the interface, the methods used for speech recognition and head tracking, and the fusion and synchronization of the multimodal streams. The reported results of testing and everyday use of ICANDO confirm the high accuracy and robustness of the interface for contactless computer operation. A comparison of the multimodal and standard ways of interaction showed that the former is slower by a factor of 1.9, which is quite acceptable for hands-free interaction between a computer and an impaired person.
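The head-tracking module steers the mouse pointer from the motion of facial features between video frames (the cited prior work relies on Haar-based face detection and pyramidal Lucas-Kanade feature tracking). As a rough illustration only, not the authors' implementation, the final step of mapping tracked-point displacement to a cursor step could look like this; the gain and dead-zone values are assumptions:

```python
# Illustrative sketch: convert the average displacement of tracked facial
# feature points between two consecutive frames into a cursor delta.
# The gain and dead_zone defaults are assumed values, not from the paper.

def cursor_delta(prev_pts, curr_pts, gain=8.0, dead_zone=0.5):
    """Average point displacement -> (dx, dy) cursor step in pixels.

    prev_pts, curr_pts: equal-length lists of (x, y) feature positions.
    Motions smaller than dead_zone are ignored to suppress head jitter.
    """
    if not prev_pts or len(prev_pts) != len(curr_pts):
        return (0.0, 0.0)
    n = len(prev_pts)
    # Mean displacement over all tracked points.
    dx = sum(c[0] - p[0] for p, c in zip(prev_pts, curr_pts)) / n
    dy = sum(c[1] - p[1] for p, c in zip(prev_pts, curr_pts)) / n
    # Dead zone: treat tiny movements as no movement.
    if (dx * dx + dy * dy) ** 0.5 < dead_zone:
        return (0.0, 0.0)
    return (gain * dx, gain * dy)
```

In a real pipeline the point lists would come from a tracker such as OpenCV's pyramidal Lucas-Kanade optical flow, seeded by points detected inside a Haar-cascade face region.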
