Annotating Attribution in the Penn Discourse TreeBank

An emerging task in text understanding and generation is to categorize information as fact or opinion and to attribute it to the appropriate source. Corpus annotation schemes aim to encode these distinctions for NLP applications that depend on them, such as information extraction, question answering, summarization, and generation. We describe an annotation scheme for marking the attribution of abstract objects, such as propositions, facts, and eventualities, associated with the discourse relations and their arguments annotated in the Penn Discourse TreeBank. The scheme aims to capture the source and degrees of factuality of these abstract objects. Its key aspects are the annotation of the text spans signalling the attribution and the annotation of features recording the source, type, scopal polarity, and determinacy of the attribution.
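
To make the shape of such an annotation concrete, the following is a minimal sketch of one attribution record carrying the four features and the optional text span named above. The class name, field names, feature values, and example sentence are all hypothetical illustrations. They are not the label inventory or file format of the Penn Discourse TreeBank.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (start, end) character offsets into the raw text, end exclusive.
Span = Tuple[int, int]


@dataclass(frozen=True)
class Attribution:
    """One attribution record for a discourse relation or one of its arguments.

    The four feature fields mirror the features listed in the abstract.
    Their values are free-form placeholder strings here, not the
    corpus's actual labels.
    """
    source: str            # who the abstract object is attributed to
    type: str              # kind of attribution relation
    scopal_polarity: str   # whether a negation scopes over the attributed content
    determinacy: str       # whether the attribution is presented as indeterminate
    span: Optional[Span] = None  # text signalling the attribution, if explicit

    def span_text(self, text: str) -> Optional[str]:
        """Return the text that signals the attribution, or None if implicit."""
        if self.span is None:
            return None
        start, end = self.span
        return text[start:end]


# Invented example sentence, used only to exercise the record.
text = "Analysts say that rates will rise because inflation persists."
attr = Attribution(
    source="other",           # placeholder value
    type="communication",     # placeholder value
    scopal_polarity="none",   # placeholder value
    determinacy="none",       # placeholder value
    span=(0, 12),
)
print(attr.span_text(text))
```

Keeping the span optional lets the same record describe attributions that have no explicit signalling text; whether and how the corpus represents that case is not stated in this abstract.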
