Disambiguating entity references within an ontological model

In everyday conversation, entities (persons, companies, etc.) are referred to by natural language identifiers (NLIs). Humans draw on personal experience and situational context to interpret such identifiers, yet because NLIs are ambiguous, even humans risk misinterpreting them. In prior work, we presented a novel method for resolving ambiguous entity references in text by exploring ontological background knowledge represented as an RDF(S) graph. The different possible interpretations correspond to different subgraphs of the underlying ontology, each describing one consistent, unambiguous interpretation of the ambiguous NLIs within the ontological knowledge base. Our domain-independent approach is based on spreading activation and uses a semantic relational ranking. In this paper, we propose three extensions to the original algorithm. First, we interpret the input in two steps: rather than processing the whole input text at once, we first process smaller text windows, whose smaller local context should yield more precise reference interpretations. Second, we extend the spreading-activation algorithm on the RDF(S) graph to explore edges bidirectionally, which is intended to speed up the algorithm. Third, we use reinforcement learning to take advantage of recurring information. We present first experimental results for these extensions and derive directions for future work.
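To make the core idea concrete, the following is a minimal sketch of spreading activation over RDF-style triples, not the algorithm evaluated in the paper. The function names, the `decay` and `steps` parameters, the even split of activation among neighbours, and the `ex:` toy triples are all assumptions made for illustration. It shows only what following edges in both directions means; it does not reproduce the semantic relational ranking, the text windows, the reinforcement learning step, or any speed-up.

```python
from collections import defaultdict


def spread_activation(triples, seeds, decay=0.5, steps=2, bidirectional=True):
    """Propagate activation from seed nodes over (subject, predicate, object) triples.

    Illustrative assumption: each node passes `decay` times its incoming
    activation to its neighbours, split evenly among them.
    """
    neighbours = defaultdict(set)
    for subject, _predicate, obj in triples:
        neighbours[subject].add(obj)
        if bidirectional:
            # Also follow the edge against its direction.
            neighbours[obj].add(subject)

    activation = defaultdict(float)
    frontier = {node: 1.0 for node in seeds}
    for node, value in frontier.items():
        activation[node] += value

    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            targets = neighbours.get(node, ())
            if not targets:
                continue
            share = value * decay / len(targets)
            for target in targets:
                next_frontier[target] += share
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier

    return dict(activation)


def rank_candidates(activation, candidates):
    """Order candidate entities for an ambiguous NLI by accumulated activation."""
    return sorted(candidates, key=lambda c: activation.get(c, 0.0), reverse=True)


# Hypothetical toy knowledge base: two entities share the NLI "Paris".
TRIPLES = [
    ("ex:Paris_France", "ex:locatedIn", "ex:France"),
    ("ex:Paris_Texas", "ex:locatedIn", "ex:USA"),
    ("ex:Eiffel_Tower", "ex:locatedIn", "ex:Paris_France"),
]
CANDIDATES = ["ex:Paris_Texas", "ex:Paris_France"]

# The surrounding text also mentions "France", so it seeds the activation.
bidirectional_result = spread_activation(TRIPLES, ["ex:France"])
directed_result = spread_activation(TRIPLES, ["ex:France"], bidirectional=False)
best = rank_candidates(bidirectional_result, CANDIDATES)[0]
```

In this toy graph, `ex:France` has no outgoing edges, so a strictly directed traversal never reaches either candidate, whereas following edges in both directions lets the context mention activate `ex:Paris_France`.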
