Clique-Based Clustering for Improving Named Entity Recognition Systems

We propose a system that builds, in a semi-supervised manner, a resource intended to help a NER system annotate corpus-specific named entities. The system relies on a distributional approach that uses syntactic dependencies to measure similarity between named entities. What distinguishes the presented method, however, is that it combines a clique-based approach with a clustering technique, which together amount to a soft clustering method. Our experiments show that the resource built with this clique-based clustering system improves the performance of different NER systems.
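
To make the idea concrete, the following is a minimal sketch of how cliques in a distributional similarity graph can yield overlapping (soft) groups of named entities. It is not the authors' implementation: the toy context counts, the cosine measure, the 0.6 threshold, and the function names `cosine`, `similarity_graph`, and `maximal_cliques` are all assumptions chosen for illustration, and the sketch stops at clique extraction without the subsequent clustering step.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: entity -> counts of syntactic contexts it occurs in.
contexts = {
    "London":   {"live_in": 3, "travel_to": 2},
    "Paris":    {"live_in": 2, "travel_to": 3},
    "Oxford":   {"live_in": 2, "travel_to": 1, "study_at": 2, "professor_at": 1},
    "MIT":      {"study_at": 3, "professor_at": 2},
    "Stanford": {"study_at": 2, "professor_at": 3},
}

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_graph(contexts, threshold):
    """Link two entities when their context similarity reaches the threshold."""
    adj = {entity: set() for entity in contexts}
    for a, b in combinations(contexts, 2):
        if cosine(contexts[a], contexts[b]) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

adj = similarity_graph(contexts, threshold=0.6)  # threshold is an assumption
cliques = maximal_cliques(adj)

# Soft assignment: an entity belongs to every clique that contains it.
memberships = {e: [c for c in cliques if e in c] for e in contexts}
for entity, groups in memberships.items():
    print(entity, [sorted(g) for g in groups])
```

In this toy graph, "Oxford" shares contexts with both the city-like and the university-like entities, so it falls into two cliques. This overlap is what makes a clique-based grouping soft rather than a hard partition.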
