Paper Information - Term Association Analysis for Named Entity Filtering

Term Association Analysis for Named Entity Filtering

Abstract: This paper describes the participation of the Universities of Helsinki and Caen in the first round of the TREC Knowledge Base Acceleration track. The task focused on filtering a stream of documents for relevance to a set of entities. Our approach uses word co-occurrence graphs to model the named entities. We submitted two runs that achieved an average F-measure above the mean of all submitted runs. The better of the two ranked among the top 5 runs for both the central and relevant F-measures, out of a total of 43 runs submitted by 11 institutions. As our runs were the product of a first implementation of our approach, these preliminary results are very supportive of our idea of using concept graphs to model named entity relations.

Hannu Toivonen | Antoine Doucet | Oskar Gross
