Feature Alignment for the Analysis of Verbatim Text Transcripts

In research on deliberative democracy, political scientists analyze the communication models of discussions, debates, and mediation processes, with the goal of extracting recurring discourse patterns from the verbatim transcripts of these conversations. To support the time-consuming manual analysis of such patterns, we introduce a visual analytics approach that enables the exploration and analysis of repetitive feature patterns over parallel text corpora using feature alignment. Our approach is tailored to the requirements of our domain experts. In this paper, we discuss our visual design and workflow, and we demonstrate the applicability of our approach using an experimental parallel corpus of political debates.
