Integration and Fusion Aspects of Speech and Handwriting Media

In this paper we discuss synchronization approaches for the fusion of speech and handwriting data at the signal representation level. Using modalities in addition to speech has several advantages; for example, bimodal signals have the potential to increase the accuracy of recognition systems. Furthermore, we intend to give users more flexibility in human-computer communication by allowing them to choose their preferred modality. After discussing these goals, we analyze different ways of synchronizing media streams. Besides approaches based on synchronized time stamp protocols carried as additional metadata, we focus on a concept for synchronization that embeds the data stream of one modality into the other by using digital watermarking techniques. Here we introduce the general concept of direct embedding and analyze the watermarking capacity (payload) required for synchronization. Finally, we look at aspects of information retrieval in multimodal documents.
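
To make the payload question concrete, the following minimal sketch estimates the raw bit rate that an uncompressed pen stream would demand from an audio watermark. All numbers (pen sampling rate, bit depths, audio rate) and both function names are illustrative assumptions, not values or methods taken from the paper.

```python
def handwriting_payload_bps(sample_rate_hz, bits_x, bits_y, bits_pressure=0):
    """Raw bits per second needed to carry an uncompressed pen stream.

    Hypothetical helper: assumes every pen sample holds an x coordinate,
    a y coordinate, and optionally a pressure value.
    """
    bits_per_sample = bits_x + bits_y + bits_pressure
    return sample_rate_hz * bits_per_sample


def payload_bits_per_audio_sample(payload_bps, audio_rate_hz):
    """Average number of watermark bits each audio sample must carry."""
    return payload_bps / audio_rate_hz


# Assumed example values: a tablet sampling at 100 Hz with 16-bit x/y
# coordinates and 10-bit pressure, embedded into 44.1 kHz audio.
pen_bps = handwriting_payload_bps(100, 16, 16, 10)
ratio = payload_bits_per_audio_sample(pen_bps, 44_100)

print(f"required payload: {pen_bps} bit/s")
print(f"bits per audio sample: {ratio:.4f}")
```

Under these assumed figures the watermark would need to carry a few kilobits per second; compressing the pen stream or embedding only synchronization markers would lower that requirement.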
