Activity detection for information access to oral communication

Oral communication is ubiquitous and carries important information, yet it is time-consuming to document. Given the development of storage media and networks, one could simply record and store a conversation for documentation. The question, however, is how an interesting piece of information would be found in a large database. Traditional information retrieval techniques use a histogram of keywords as the document representation, but oral communication may offer additional indices such as the time and place of the rejoinder and the attendance. An alternative index could be the activity, such as discussing, planning, informing, or story-telling. This paper addresses the problem of automatically detecting those activities in meeting situations and everyday rejoinders. Several extensions of this basic idea are discussed and/or evaluated: similar to activities, one can define subsets of a larger database and detect those automatically, which is shown on a large database of TV shows. Emotions and other indices, such as the dominance distribution of speakers, might be available on the surface and could be used directly. Despite the small size of the databases used, some results about the effectiveness of these indices can be obtained.
