Thematic segmentation of meetings through document/speech alignment

This article proposes a multimodal approach to segmenting meeting recordings. The bi-modal method takes advantage of the alignment of the speech transcript with documents, in the context of meetings or lectures where documents are discussed. The method first represents the alignment results as a set of nodes in a 2D space whose two axes correspond to the document content and the speech transcript, respectively. The most densely connected regions of this graph are detected with a clustering method, and the final clusters are then projected onto the speech axis. The resulting sequence of segments is taken as the thematic structure of the speech transcript. In this article, we present our bi-modal method and compare it with two mono-modal thematic segmentation methods.
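The pipeline described above (alignment nodes in a 2D space, clustering, projection onto the speech axis) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the clustering method, so plain k-means stands in for it, and the function names, the toy alignment pairs, and the choice of `k = 3` are all hypothetical.

```python
from typing import List, Optional, Tuple

# One alignment node: (position on the document axis, position on the speech axis).
Point = Tuple[float, float]


def kmeans_labels(points: List[Point], k: int, iterations: int = 50) -> List[int]:
    """Plain k-means, used here as a stand-in for the unspecified clustering method."""
    # Deterministic seeding: pick k points spread evenly along the speech axis.
    ordered = sorted(points, key=lambda p: p[1])
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels: Optional[List[int]] = None
    for _ in range(iterations):
        # Assignment step: each node joins its nearest centroid.
        new_labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the clustering has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels if labels is not None else [0] * len(points)


def project_on_speech_axis(points: List[Point], labels: List[int]) -> List[Tuple[float, float]]:
    """Project each cluster onto the speech axis and merge overlapping spans."""
    spans = {}
    for (_, speech_pos), label in zip(points, labels):
        lo, hi = spans.get(label, (speech_pos, speech_pos))
        spans[label] = (min(lo, speech_pos), max(hi, speech_pos))
    segments: List[Tuple[float, float]] = []
    for start, end in sorted(spans.values()):
        if segments and start <= segments[-1][1]:
            # Overlapping projections are treated as one thematic segment.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments


# Toy alignment: (document block index, utterance index) pairs, invented for the example.
alignment = [
    (0, 0), (0, 1), (1, 2), (0, 3),
    (5, 10), (6, 11), (5, 12),
    (10, 20), (11, 21), (10, 22),
]
labels = kmeans_labels(alignment, k=3)
segments = project_on_speech_axis(alignment, labels)
print(segments)  # [(0, 3), (10, 12), (20, 22)]
```

Each tuple in `segments` is a span of utterance indices read as one thematic segment. Merging overlapping projections is one possible design choice; the abstract does not say how overlaps between projected clusters are resolved.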
