Tracker Text Segmentation Approach: Integrating Complex Lexical and Conversation Cue Features

While text segmentation is a topic that has received a great deal of attention since 9/11, most current research projects remain focused on expository texts, stories, and broadcast news. Current segmentation methods are well suited to written, structured texts because they exploit the distinctive macro-level structures of such texts. Segmenting transcribed multi-party conversation presents a different challenge, given the lack of linguistic features such as headings, paragraphs, and well-formed sentences. This paper describes an algorithm suited to transcribed meeting conversations that combines semantically complex lexical relations with conversational cue phrases to build lexical chains for determining topic boundaries.
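The general idea of combining lexical chains with cue phrases can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the lexical relations, the cue phrase inventory, or the scoring, so everything below is an assumption made for illustration. Chains here are built from plain word repetition only (a stand-in for the semantically richer relations the paper refers to), and the stopword list, `CUE_PHRASES`, `max_gap`, `cue_weight`, and `threshold` are hypothetical choices.

```python
import re
from collections import defaultdict

# Hypothetical word lists, chosen only for this illustration.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "we", "it",
             "in", "on", "for", "that", "this", "i", "you"}
CUE_PHRASES = ("okay", "so", "now", "right", "anyway", "moving on")


def content_words(utterance):
    """Lowercase the utterance and keep non-stopword tokens."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    return [w for w in tokens if w not in STOPWORDS]


def build_chains(utterances, max_gap=2):
    """Return (start, end) utterance-index spans, one per lexical chain.

    Assumption: a chain links repetitions of the same word and is broken
    when the word is absent for more than `max_gap` utterances.
    """
    positions = defaultdict(list)
    for i, utt in enumerate(utterances):
        for w in set(content_words(utt)):
            positions[w].append(i)

    chains = []
    for idxs in positions.values():
        start = prev = idxs[0]
        for i in idxs[1:]:
            if i - prev > max_gap:
                chains.append((start, prev))
                start = i
            prev = i
        chains.append((start, prev))
    return chains


def starts_with_cue(utterance):
    """True if the utterance opens with one of the cue phrases."""
    text = utterance.lower().lstrip()
    return any(re.match(rf"{re.escape(c)}\b", text) for c in CUE_PHRASES)


def boundary_scores(utterances, max_gap=2, cue_weight=2.0):
    """Score each gap; scores[i] is the gap between utterance i and i + 1."""
    n = len(utterances)
    scores = [0.0] * max(n - 1, 0)
    for start, end in build_chains(utterances, max_gap):
        if start == end:
            continue  # a single mention carries no cohesion evidence
        if end < n - 1:
            scores[end] += 1.0        # a chain ends just before this gap
        if start > 0:
            scores[start - 1] += 1.0  # a chain starts just after this gap
    for i in range(1, n):
        if starts_with_cue(utterances[i]):
            scores[i - 1] += cue_weight
    return scores


def segment_boundaries(utterances, threshold=3.0, **kwargs):
    """Return indices of utterances that open a new topic segment."""
    scores = boundary_scores(utterances, **kwargs)
    return [i + 1 for i, s in enumerate(scores) if s >= threshold]


meeting = [
    "The budget for the project needs review",
    "Yes the budget is over by ten percent",
    "We should cut the project budget",
    "Okay, moving to the hiring schedule",
    "The hiring schedule has three interviews",
    "Interviews for hiring start Monday",
]
print(segment_boundaries(meeting))
```

In this toy transcript the chains for "budget" and "project" end at the third utterance, the chains for "hiring" and "schedule" begin at the fourth, and the fourth utterance opens with a cue phrase, so the gap between them receives the highest score. A fuller implementation would replace word repetition with thesaurus-based relations and would disambiguate cue phrases from their ordinary sentential uses.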
