Linked Enterprise Data for Fine Grained Named Entity Linking and Web Intelligence

To identify trends and assign metadata elements such as location and sentiment to the correct entities, Web intelligence applications require methods for linking named entities and revealing relations between organizations, persons and products. For this purpose, we introduce Recognyze, a named entity linking component that uses background knowledge obtained from linked data repositories. This paper outlines the underlying methods, provides insights into the migration of proprietary knowledge sources to linked enterprise data, and discusses the lessons learned from adapting linked data for named entity linking. A large dataset obtained from Orell Füssli, the largest Swiss business information provider, serves as the main showcase. This dataset includes more than nine million triples on companies, their contact information, management, products and brands. We identify major challenges in applying this data to named entity linking and conduct a comprehensive evaluation on several news corpora. The evaluation illustrates how Recognyze helps address these challenges and how it improves the performance of named entity linking components that draw upon linked data rather than machine learning techniques.
