This paper describes the University of Illinois Cognitive Computation Group's (UI-CCG) submissions to three TAC tracks: Event Nugget Detection and Coreference; Entity Discovery and Linking (EDL); and Slot Filler Validation (SFV). The Event Nugget Detection and Coreference system employs a supervised model for event nugget detection with rich lexical and semantic features, while for event co-reference we experiment with both supervised and unsupervised methods. We also use ACE2005 data as an additional training source and apply several domain adaptation techniques to improve the system's performance. The Entity Discovery and Linking system focuses on the Spanish subtask: it uses Google Translate to translate Spanish documents into English and then applies the Illinois Wikifier to identify entity mentions and disambiguate them to Wikipedia entries. It outperforms the other participating systems on both the linking and the clustering evaluations. The Illinois SFV system treats the task as an entailment problem, seeking to determine, for each individual query, whether the proposed answer is valid given the information contained in the query document. The system builds on those of previous years and adds a machine learning component that tries to extract cues from unmarked relations in the context of the query relation. The three systems described here were developed independently.

1 Event Nugget Detection and Co-reference

In this section, we describe our submission to the TAC KBP event task. Our team participated in the TAC KBP Event Nugget (EN) track, which includes three sub-tasks: event nugget detection, and event co-reference based on gold and on predicted event nuggets. Our system uses a supervised model for event nugget detection with rich lexical and semantic features. For the event co-reference model, we experiment with both supervised and unsupervised methods. In the supervised setting, we train a classifier to model the similarity between each pair of event nuggets; we also use ESA representations (Gabrilovich and Markovitch, 2007; Song and Roth, 2015) to compute this similarity in an unsupervised fashion. We describe each module of our system in the following sub-sections and discuss several techniques that we employ.

1.1 Event Nugget Detection

We use a stage-wise classification approach to extract all events (Ahn, 2006; Chen and Ng, 2012). We first train a 34-class classifier (33 event subtypes plus one non-event class) to detect event nuggets and classify them into types, applying it to each token. Features for this supervised classifier include lexical features; features from the parser, Named Entity Recognition (NER), Semantic Role Labeling (SRL), entity co-reference, and WordNet; and further semantic features from Explicit Semantic Analysis (ESA) (Gabrilovich and Markovitch, 2005; Gabrilovich and Markovitch, 2007) and Brown clusters (Brown et al., 1992). We then apply a classifier with the same set of rich features to each detected event nugget to obtain its REALIS label (ACTUAL, GENERIC, or OTHER).

Features The features can be summarized in the following categories (a simplified extraction sketch follows the list):
1. Lexical features: the context (part-of-speech tags and lemmas) of a candidate token in windows of size 5 and 20, plus their conjunctions.
2. Seed features: we use 140 event trigger seeds following previous work (Bronstein et al., 2015). We record whether a candidate token is a seed (and, if so, its type), and the conjunction of the matched seed with context seeds (and their types).
3. Parse tree features: the path from a candidate token to the root, the number of its right/left siblings and their categories, and the paths connecting a candidate token with seeds or named entities.
4. NER features: named entities and their types within a window of size 20 around a candidate token.
5. SRL features: whether a candidate token is a verb-SRL/nominal-SRL predicate and its role, its conjunction with SRL relation names, and the conjunction of the SRL relation name with the NER types in the context.
6. Coreference features: entities co-referent with the candidate token, and their conjunction with both the candidate token and named entities in the context.
7. ESA features: the top 50 ESA concepts for each candidate token.
8. Brown cluster features: Brown cluster prefixes of length 4, 6, 10, and 20.
9. WordNet features: hypernyms, hyponyms, entailed words, and derivationally related words of both the candidate token and its context, as well as the WordNet relations between the candidate token and the seed words.
10. Other features: whether a candidate token appears in FrameNet/PropBank or is a deverbal noun.
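To make the token-level formulation concrete, the following minimal sketch (in Python, with a hypothetical helper name, data layout, and toy seed lexicon; the actual system is built on the Illinois NLP pipeline and uses many more feature types) shows how the lexical and seed features above might be assembled into a sparse feature map for one candidate token.

```python
# Minimal sketch of per-token feature extraction for event nugget
# detection. The function name, inputs, and two-entry seed lexicon are
# illustrative simplifications; the real system adds parse, NER, SRL,
# coreference, WordNet, ESA, and Brown cluster features on top of these.

EVENT_SEEDS = {"attack": "Conflict.Attack", "die": "Life.Die"}  # 140 seeds in practice

def token_features(lemmas, pos_tags, i, window=5):
    """Sparse binary features for the candidate token at index i."""
    feats = {}
    lo, hi = max(0, i - window), min(len(lemmas), i + window + 1)
    for j in range(lo, hi):
        off = j - i
        feats["lemma[%d]=%s" % (off, lemmas[j])] = 1.0
        feats["pos[%d]=%s" % (off, pos_tags[j])] = 1.0
        feats["lemma+pos[%d]=%s_%s" % (off, lemmas[j], pos_tags[j])] = 1.0  # conjunction
    seed_type = EVENT_SEEDS.get(lemmas[i])
    if seed_type is not None:  # the candidate itself matches a trigger seed
        feats["is_seed"] = 1.0
        feats["seed_type=" + seed_type] = 1.0
    for j in range(lo, hi):  # seeds in the context window, conjoined with the candidate
        if j != i and lemmas[j] in EVENT_SEEDS:
            feats["ctx_seed=" + lemmas[j]] = 1.0
            feats["cand+ctx_seed=%s_%s" % (lemmas[i], lemmas[j])] = 1.0
    return feats
```

Feature maps of this kind can then be vectorized (e.g., with scikit-learn's DictVectorizer) and fed to the multi-class SVM described next.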
Learning Model We use a Support Vector Machine (SVM) to train both the event nugget detection classifier and the realis classifier, with the L2 loss and C set to 0.1 after tuning on a development set. We use the Illinois NLP packages for NER (http://cogcomp.cs.illinois.edu/page/software_view/NETagger), SRL (http://cogcomp.cs.illinois.edu/page/software_view/SRL), and entity co-reference (http://cogcomp.cs.illinois.edu/page/).

Domain Adaptation Apart from the KBP training data, we use ACE2005 as an additional source of training data; the ACE event taxonomy is similar to that of the KBP task. To enable domain adaptation from ACE to KBP, we employ the following techniques:
1. We treat event triggers in the ACE annotations as event nuggets in the KBP task.
2. We apply deterministic rules to convert ACE realis information to the KBP formulation. Specifically, we combine "Genericity.Past" and "Tense.Past" in ACE into "Actual" in KBP, map "Genericity.Generic" directly to "Generic", and map "Tense.Unspecified" (and sometimes also "Tense.Future") to "Other".
3. Since ACE and KBP have different distributions over event types, we resample ACE to match the event nugget type distribution of KBP. There is also a notable mismatch in event density: each sentence contains 0.34 events on average in ACE, compared with 0.82 in KBP, which is significantly higher. We therefore also subsample the negative training examples in ACE so that the ratio of positive to negative training examples is similar to that in KBP (see the sketch at the end of this subsection).

Results We use two development sets, one drawn from ACE2005 and the other from the KBP data. From ACE2005, we select 40 newswire documents for testing and use the remainder for training; this ACE development set is used only to evaluate performance on ACE. From the KBP data, we select 30 documents (20% of the available data) as a development set; these documents cover both newswire and discussion forum genres. We use the KBP development set to evaluate models trained on the KBP data alone and on the ACE-KBP combined data with the domain adaptation techniques. Results on the two development sets are shown in Table 1. The overall score on the KBP development set shows that it is best to train on the ACE-KBP combined data without the resampling and subsampling techniques; however, the sampling techniques do improve recall.
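To illustrate the subsampling step from the domain adaptation list above, here is a minimal sketch (Python; the (features, label) layout, the "NONE" label, and the example target ratio are assumptions for illustration, not the system's actual data structures):

```python
import random

def subsample_negatives(examples, target_pos_neg_ratio, seed=0):
    """Downsample negative (non-event) tokens in a source-domain (ACE)
    training set so its positive:negative ratio approaches the one
    observed in the target domain (KBP). `examples` is a list of
    (features, label) pairs, where the label "NONE" marks a non-event
    token; both conventions are hypothetical here."""
    rng = random.Random(seed)
    positives = [ex for ex in examples if ex[1] != "NONE"]
    negatives = [ex for ex in examples if ex[1] == "NONE"]
    # keep only as many negatives as the target ratio allows
    n_keep = min(len(negatives), int(len(positives) / target_pos_neg_ratio))
    sampled = positives + rng.sample(negatives, n_keep)
    rng.shuffle(sampled)
    return sampled

# The target ratio would be estimated from the KBP training data, where
# events are denser (0.82 per sentence vs. 0.34 in ACE); for example:
# ace_train = subsample_negatives(ace_train, target_pos_neg_ratio=0.05)
```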
References
[1] Xiao Cheng and Dan Roth. Relational Inference for Wikification. EMNLP, 2013.
[2] Ran Zhao, Quang Do, and Dan Roth. A Robust Shallow Temporal Reasoning System. HLT-NAACL, 2012.
[3] Chen Chen and Vincent Ng. Joint Modeling for Chinese Event Extraction with Rich Linguistic Features. COLING, 2012.
[4] Evgeniy Gabrilovich and Shaul Markovitch. Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis. IJCAI, 2007.
[5] Vasin Punyakanok and Dan Roth. The Use of Classifiers in Sequential Inference. NIPS, 2001.
[6] David Ahn. The stages of event extraction. 2006.
[7] Eric Bengtson and Dan Roth. Understanding the Value of Features for Coreference Resolution. EMNLP, 2008.
[8] Dan Roth and Dmitry Zelenko. Part of Speech Tagging Using a Network of Linear Separators. ACL, 1998.
[9] Vasin Punyakanok and Dan Roth. Inference with Classifiers: The Phrase Identification Problem. 2004.
[10] Yangqiu Song and Dan Roth. Unsupervised Sparse Vector Densification for Short Text Similarity. NAACL, 2015.
[11] Ofer Bronstein, Ido Dagan, Qi Li, Heng Ji, and Anette Frank. Seed-Based Event Trigger Labeling: How far can event descriptions get us? ACL, 2015.
[12] Lev Ratinov and Dan Roth. Design Challenges and Misconceptions in Named Entity Recognition. CoNLL, 2009.
[13] Peter F. Brown, Vincent J. Della Pietra, Peter V. deSouza, Jenifer C. Lai, and Robert L. Mercer. Class-Based n-gram Models of Natural Language. Computational Linguistics, 1992.
[14] Evgeniy Gabrilovich and Shaul Markovitch. Feature Generation for Text Categorization Using World Knowledge. IJCAI, 2005.
[15] Pascal Denis and Jason Baldridge. Joint Determination of Anaphoricity and Coreference Resolution using Integer Programming. NAACL, 2007.
[16] Gourab Kundu et al. Overview of UI-CCG Systems for Event Argument Extraction, Entity Discovery and Linking, and Slot Filler Validation. TAC, 2014.
[17] Lev Ratinov, Dan Roth, Doug Downey, and Mike Anderson. Local and Global Algorithms for Disambiguation to Wikipedia. ACL, 2011.