Cross-Document Coreference on a Large Scale Corpus

Abstract: In this paper, we compare and evaluate the effectiveness of different statistical methods for cross-document coreference resolution. We created entity models for different test sets and compared the following disambiguation and clustering techniques for grouping those entity models into coreference chains: Incremental Vector Space, KL-Divergence, and Agglomerative Vector Space. Coreference analysis refers to the process of determining whether or not two mentions of entities refer to the same person (Kibble and Deemter, 2000).