Prior-informed Distant Supervision for Temporal Evidence Classification

Temporal evidence classification, i.e., finding associations between temporal expressions and relations expressed in text, is an important part of temporal relation extraction. To capture the variations found in this setting, we employ a distant supervision approach, modeling the task as multi-class text classification. Distant supervision poses two main challenges: (1) noise generated by incorrect heuristic labeling, and (2) distribution mismatch between the target examples and the distant supervision examples. We focus on the second problem and propose a sampling approach to handle the distribution mismatch. Our prior-informed distant supervision approach improves over basic distant supervision and outperforms a purely supervised approach when evaluated on TAC-KBP data, on both classification and end-to-end metrics.
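As a rough illustration of how sampling can reduce a class-distribution mismatch, the sketch below resamples distantly labeled examples so that their class proportions follow a prior estimated from a small set of target labels. The abstract does not specify the sampling procedure, so this is an assumption and not the authors' method; the function names (`estimate_prior`, `sample_to_prior`) and the class labels (`START`, `END`, `WITHIN`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def estimate_prior(labels):
    """Estimate a class prior as relative frequencies of the given labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def sample_to_prior(examples, prior, size, seed=0):
    """Resample (text, label) pairs, with replacement, to follow `prior`.

    Hypothetical helper: each class contributes round(prior[label] * size)
    examples; classes with no distant examples are skipped.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    sampled = []
    for label, p in prior.items():
        pool = by_label.get(label, [])
        if not pool:
            continue
        sampled.extend(rng.choices(pool, k=round(p * size)))
    rng.shuffle(sampled)
    return sampled


# Illustrative labels only; a small hand-labeled target sample.
target_labels = ["START", "START", "END", "WITHIN"]
prior = estimate_prior(target_labels)

# Distantly labeled data whose class distribution is skewed towards END.
distant = (
    [(f"start sentence {i}", "START") for i in range(10)]
    + [(f"end sentence {i}", "END") for i in range(80)]
    + [(f"within sentence {i}", "WITHIN") for i in range(10)]
)

resampled = sample_to_prior(distant, prior, size=100)
resampled_counts = Counter(label for _, label in resampled)
```

Sampling with replacement keeps the sketch simple when a class is rare in the distant data, at the cost of duplicated training examples; other schemes, such as downsampling the over-represented classes, would serve the same purpose.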
