Topic segmentation of TV-streams by watershed transform and vectorization

Abstract A fine-grained segmentation of radio or TV broadcasts is an essential step for most multimedia processing tasks. Applying segmentation algorithms to the speech transcripts seems straightforward, yet most of these algorithms are ill-suited to short segments or noisy data. In this paper, we present a new segmentation technique inspired by the image analysis field and relying on a new way to compute similarities between candidate segments, called vectorization. Vectorization makes it possible to match text segments that do not share common words; this property is shown to be particularly useful when dealing with transcripts in which transcription errors and short segments make the segmentation difficult. This new topic segmentation technique is evaluated on two corpora of transcripts from French TV broadcasts, on which it substantially outperforms existing state-of-the-art approaches.
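
The abstract does not spell out how vectorization is computed, so the following is only a minimal sketch of one way such an indirect comparison could work, under a stated assumption: each segment is described by its similarity to a small set of pivot texts, and segments are then compared in that pivot space. The pivot texts, the example segments, and the helper names (`bow`, `cosine`, `vectorize`, `vec_cosine`) are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts for a whitespace-tokenized, lowercased text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def vectorize(text, pivots):
    """Represent a text by its similarity to each pivot text (assumed scheme)."""
    t = bow(text)
    return [cosine(t, bow(p)) for p in pivots]


def vec_cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Hypothetical pivot texts, one per broad theme.
pivots = [
    "election vote president government parliament",
    "match goal team football league",
]

seg_a = "the president spoke"
seg_b = "parliament will vote"
seg_c = "team scored goal"

# seg_a and seg_b share no word, so their direct similarity is zero.
direct_ab = cosine(bow(seg_a), bow(seg_b))

# In pivot space, both lean on the first pivot only, so they match.
va, vb, vc = (vectorize(s, pivots) for s in (seg_a, seg_b, seg_c))
indirect_ab = vec_cosine(va, vb)
indirect_ac = vec_cosine(va, vc)

print(direct_ab, indirect_ab, indirect_ac)
```

In this toy setting, two politics-related segments with disjoint vocabularies become comparable through the pivots, while the sports segment stays apart. How pivots are chosen and which similarity measure is used are design choices this sketch does not attempt to settle.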
