Unity Is Strength: Coupling Media for Thematic Segmentation

In this paper we present preliminary results and an evaluation of a combined thematic segmentation of (a) meeting documents and (b) meeting speech transcripts. Our approach applies a clustering method to a 2D representation of the thematic alignment between the two, then projects the extracted clusters onto each axis, which correspond to the meeting documents and the speech transcript respectively. Finally, we evaluate our bi-modal thematic segmentation method against a mono-modal segmentation method (TextTiling).
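
As a rough illustration of the cluster-then-project idea, the following minimal sketch treats each thematic alignment as a 2D point (document block index, speech utterance index), groups nearby points, and projects each group onto the two axes to obtain one segmentation per medium. The abstract does not specify the clustering method, so the single-linkage grouping, the `max_gap` threshold, the function names, and the toy data are all assumptions made for this example and not the authors' implementation.

```python
from typing import List, Tuple

# (document block index, speech utterance index); hypothetical representation
Point = Tuple[int, int]


def cluster_points(points: List[Point], max_gap: int = 2) -> List[List[Point]]:
    """Single-linkage grouping (an assumed stand-in for the paper's method):
    two points are linked when their Chebyshev distance is <= max_gap."""
    clusters: List[List[Point]] = []
    unvisited = set(points)
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            x, y = frontier.pop()
            near = [p for p in unvisited
                    if max(abs(p[0] - x), abs(p[1] - y)) <= max_gap]
            for p in near:
                unvisited.remove(p)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return sorted(clusters)


def project(clusters: List[List[Point]], axis: int) -> List[Tuple[int, int]]:
    """Project each cluster onto one axis as a (start, end) interval."""
    return sorted((min(p[axis] for p in c), max(p[axis] for p in c))
                  for c in clusters)


# Toy alignment data, invented for illustration only.
alignment = [(0, 0), (1, 1), (1, 2), (5, 8), (6, 9), (6, 10)]
clusters = cluster_points(alignment)
doc_segments = project(clusters, axis=0)     # segments on the document axis
speech_segments = project(clusters, axis=1)  # segments on the speech axis
```

With this toy input the points fall into two groups, so each axis receives two intervals. In a real setting the intervals from different clusters could overlap on one axis, and resolving such overlaps would need a rule that this sketch does not attempt.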
