Collective Classification of Congressional Floor-Debate Transcripts

This paper explores approaches to sentiment classification of U.S. Congressional floor-debate transcripts. We use collective classification techniques to exploit the informal citation structure present in the debates. We evaluate a range of methods based on local and global formulations and introduce novel approaches for incorporating the outputs of machine learners into collective classification algorithms. Our experimental evaluation shows that the mean-field algorithm obtains the best results on this task, significantly outperforming the benchmark technique.
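
To make the idea of combining machine-learner outputs with a collective algorithm concrete, the following is a minimal illustrative sketch of mean-field inference on a binary pairwise model. It is not the paper's implementation. It assumes each speech has a probability of the positive class from some local classifier, and that links between speeches carry a hypothetical coupling weight (positive for agreement, negative for disagreement). The function name, the weight values, and the speech identifiers are all assumptions made for illustration.

```python
import math


def mean_field(local_pos, edges, iterations=50):
    """Approximate per-node marginals with naive mean-field updates.

    local_pos: dict mapping node -> P(positive) from a local classifier
               (assumed input, not taken from the paper).
    edges:     list of (u, v, w) tuples; w > 0 pulls u and v toward the
               same label, w < 0 pushes them apart (hypothetical weights).
    Returns a dict mapping node -> approximate P(positive).
    """
    # Convert classifier probabilities to log-odds, clamping away from 0 and 1.
    unary = {}
    for node, p in local_pos.items():
        p = min(max(p, 1e-9), 1.0 - 1e-9)
        unary[node] = math.log(p / (1.0 - p))

    # Build an undirected adjacency list.
    neighbors = {node: [] for node in local_pos}
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    # Start from the local classifier's beliefs.
    q = dict(local_pos)
    for _ in range(iterations):
        for node in q:
            # The expected label of a neighbor, coded as -1/+1, is 2q - 1.
            field = unary[node] + sum(
                w * (2.0 * q[other] - 1.0) for other, w in neighbors[node]
            )
            # Numerically stable logistic function of the field.
            q[node] = 0.5 * (1.0 + math.tanh(field / 2.0))
    return q


if __name__ == "__main__":
    # Hypothetical example: speech "b" is nearly undecided locally but is
    # linked by an agreement edge to the confidently positive speech "a".
    beliefs = mean_field(
        {"a": 0.9, "b": 0.45, "c": 0.2},
        [("a", "b", 2.0)],
    )
    for speech, prob in sorted(beliefs.items()):
        print(speech, round(prob, 3))
```

In this sketch the agreement link lets the confident prediction for one speech shift the belief for a linked, uncertain speech, while an unlinked speech keeps its local prediction. How the paper derives link weights and normalises classifier outputs is not stated in the abstract, so those choices here are placeholders.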
