Ontology based entity disambiguation with natural language patterns

Text is still the most important carrier of information. Names are essential anchor points in text, and they are needed to link pieces of information spread across a text. Unfortunately, names are ambiguous, which makes it difficult for systems to determine whether syntactically identical names are actually semantically equivalent. Disambiguation is therefore an indispensable feature for the automatic processing of texts. We introduce a novel approach to disambiguate names using ontologies, natural language text patterns, and a customisable ranking algorithm. We evaluate our approach in the area of geographic names and demonstrate the utility of our results with a geographic news reader.
