Argument Extraction from News, Blogs, and Social Media

Argument extraction is the task of identifying arguments, along with their components, in text. Arguments can usually be decomposed into a claim and one or more premises that justify it. Among the novel aspects of this work is the thematic domain itself, Social Media, in contrast to traditional research in the area, which concentrates mainly on legal documents and scientific publications. The huge growth of social media communities, along with their users' tendency to debate, makes the identification of arguments in these texts a necessity. Argument extraction from Social Media is more challenging because the texts may not always contain arguments, unlike the legal documents or scientific publications usually studied. In addition, being less formal in nature, texts in Social Media may not even have proper syntax or spelling. This paper presents a two-step approach for argument extraction from social media texts. In the first step, the proposed approach classifies sentences into “sentences that contain arguments” and “sentences that don’t contain arguments”. In the second step, it uses conditional random fields to identify, within the sentences that contain arguments, the exact fragments that contain the premises. The results significantly exceed the baseline approach and, according to the literature, are quite promising.
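
As a rough illustration of the two-step structure described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes whitespace-tokenized English sentences. The cue-word list, the hand-set scores, and all function names (`contains_argument`, `viterbi`, `premise_spans`, `extract_premises`) are hypothetical. Step 1 uses a cue-word filter as a stand-in for a trained sentence classifier. Step 2 uses Viterbi decoding over B/I/O labels with fixed scores as a stand-in for a trained conditional random field, which would learn such scores from annotated data.

```python
# Toy two-step pipeline. All cue words, scores, and names are illustrative assumptions.
CUES = {"because", "since", "therefore"}  # hypothetical argument cue words
PUNCT = {".", "!", "?"}
LABELS = ("O", "B", "I")  # Outside, Begin-premise, Inside-premise
NEG = float("-inf")

# Hand-set transition scores; a real CRF learns these. "I" may not follow "O" or START.
TRANS = {
    ("START", "O"): 0, ("START", "B"): 0, ("START", "I"): NEG,
    ("O", "O"): 0, ("O", "B"): 0, ("O", "I"): NEG,
    ("B", "O"): 0, ("B", "B"): -1, ("B", "I"): 0,
    ("I", "O"): 0, ("I", "B"): -1, ("I", "I"): 0,
}

def contains_argument(tokens):
    """Step 1 stand-in: a sentence is argumentative if it has a cue word."""
    return any(t.lower() in CUES for t in tokens)

def emission(token, after_cue):
    """Hand-set label scores: tokens after a cue word look like premise tokens."""
    if token in PUNCT or not after_cue:
        return {"O": 1, "B": 0, "I": 0}
    return {"O": 0, "B": 1, "I": 1}

def viterbi(tokens):
    """Step 2 stand-in: best-scoring B/I/O label sequence for one sentence."""
    if not tokens:
        return []
    after_cue = False
    prev_score = {"START": 0.0}
    backpointers = []
    for tok in tokens:
        emit = emission(tok, after_cue)
        score, back = {}, {}
        for lab in LABELS:
            best_prev = max(prev_score, key=lambda p: prev_score[p] + TRANS[(p, lab)])
            score[lab] = prev_score[best_prev] + TRANS[(best_prev, lab)] + emit[lab]
            back[lab] = best_prev
        backpointers.append(back)
        prev_score = score
        after_cue = after_cue or tok.lower() in CUES
    # Follow backpointers from the best final label to recover the path.
    lab = max(prev_score, key=prev_score.get)
    path = [lab]
    for back in reversed(backpointers[1:]):
        lab = back[lab]
        path.append(lab)
    return path[::-1]

def premise_spans(tokens, labels):
    """Collect contiguous B/I runs as premise fragments."""
    spans, current = [], []
    for tok, lab in zip(tokens, labels):
        if lab == "I" and current:
            current.append(tok)
            continue
        if current:
            spans.append(" ".join(current))
        current = [tok] if lab == "B" else []
    if current:
        spans.append(" ".join(current))
    return spans

def extract_premises(sentences):
    """Run step 1 as a filter, then step 2 only on the sentences that pass."""
    results = {}
    for sent in sentences:
        tokens = sent.split()
        if contains_argument(tokens):
            results[sent] = premise_spans(tokens, viterbi(tokens))
    return results

SENTENCES = [
    "We should ban cars because they pollute the air .",
    "What a lovely morning .",
]
RESULT = extract_premises(SENTENCES)
```

In this sketch the second sentence is dropped in step 1, so the sequence labeler only runs on sentences already judged to contain an argument, mirroring the filter-then-segment design the abstract describes.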
