Paper information - Cross-Document Coreference on a Large Scale Corpus

Abstract: In this paper, we compare and evaluate the effectiveness of different statistical methods for cross-document coreference resolution. We created entity models for different test sets and compared the following disambiguation and clustering techniques for grouping the entity models into coreference chains: Incremental Vector Space, KL-Divergence, and Agglomerative Vector Space. Coreference analysis refers to the process of determining whether or not two mentions of entities refer to the same person (Kibble and Deemter, 2000).
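
To make the first of the techniques named in the abstract concrete, here is a minimal sketch of incremental vector-space clustering. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how entity models are built or weighted, so the sketch assumes each entity model is a plain bag of words, uses raw term counts with cosine similarity, and picks an arbitrary threshold. The names `cosine`, `incremental_cluster`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def incremental_cluster(entity_models, threshold=0.4):
    """Single pass over (doc_id, text) pairs.

    Each entity model joins the most similar existing cluster if the
    similarity reaches `threshold`; otherwise it starts a new cluster.
    The threshold value and raw-count weighting are assumptions.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc_id, ...]}
    for doc_id, text in entity_models:
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["centroid"].update(vec)  # add term counts into the centroid
            best["members"].append(doc_id)
        else:
            clusters.append({"centroid": Counter(vec), "members": [doc_id]})
    return [cluster["members"] for cluster in clusters]


# Toy entity models for three mentions of the name "John Smith".
models = [
    ("d1", "john smith senator voted on the budget bill"),
    ("d2", "senator john smith budget committee vote"),
    ("d3", "john smith striker scored two goals"),
]
chains = incremental_cluster(models, threshold=0.4)
print(chains)
```

With these toy inputs, the two senator contexts end up in one chain and the striker context in another. The result depends on input order and on the threshold, which is the usual trade-off of a single-pass method compared with an agglomerative one.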

Chung Heong Gooi | James Allan

[1] Lillian Lee, et al. On the effectiveness of the skew divergence for statistical language analysis, 2001, AISTATS.

[2] Breck Baldwin. Coreference as the Foundations for Link Analysis over Free Text Databases, 1998.

[3] Breck Baldwin, et al. Algorithms for Scoring Coreference Chains, 1998.

[4] Breck Baldwin, et al. Entity-Based Cross-Document Coreferencing Using the Vector Space Model, 1998, COLING.

[5] Alan W. Biermann, et al. A Methodology for Cross-Document Coreference, 2000.

[6] Kees van Deemter, et al. Coreference Annotation: Whither?, 2000, LREC.

[7] James Allan, et al. Topic detection and tracking: event-based information organization, 2002.

[8] Breck Baldwin, et al. How Much Processing Is Required for Cross-Document Coreference?, 1995.