Detecting Forum Authority Claims in Online Discussions

This paper addresses the problem of detecting sentence-level forum authority claims in online discussions. Using a maximum entropy model, we explore several strategies for extracting lexical features in a sparse training scenario, comparing knowledge-driven methods, data-driven methods, and combinations of the two. We also investigate augmenting lexical features with parse context. We find that certain markup features perform remarkably well on their own, but are outperformed by data-driven selection of lexical features augmented with parse context.
