Mining contentions from discussions and debates

Social media has become a major source of information for many applications, and numerous techniques have been proposed to analyze its network structures and text content. In this paper, we focus on fine-grained mining of contentions in discussion/debate forums. Contentions are perhaps the most important feature of forums that discuss social, political and religious issues. Our goal is to discover contention and agreement indicator expressions, as well as contention points (topics), at both the discussion-collection level and the individual-post level. To the best of our knowledge, limited work has been done on such detailed analysis. This paper proposes three models to solve the problem. They model not only contention/agreement expressions and discussion topics but also, more importantly, the intrinsic nature of discussions/debates, i.e., interactions among discussants or debaters and topic sharing among posts through quoting and replying relations. Evaluation results using real-life discussion/debate posts from several domains demonstrate the effectiveness of the proposed models.
