Graph-based short text Entity Linking: A data integration perspective

Traditional entity linking is the task of linking name mentions in Web text to their referent entities in a knowledge base, and it plays an important role in a range of Internet services, such as online recommender systems and Web search engines. In our data integration framework, however, entity linking becomes the performance bottleneck. The reason is that the records handled by data integration systems are often short, noisy, informal texts with little context, and they frequently contain phrases with ambiguous meanings. To address the entity linking problem for data integration, a novel graph-based short text entity linking framework is proposed. First, a hybrid candidate entity selection approach that considers both text descriptions and instance data is introduced. Second, a graph-based semantic similarity metric that combines context, topic, and semantic correlation is proposed for short text entity disambiguation. Experimental results show that the proposed approaches achieve good precision and recall.
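The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the `Entity` schema, the Jaccard-based similarity functions, the weights, and all function names are assumptions made for illustration only, and a plain weighted sum stands in for the graph-based metric.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A knowledge-base entity (hypothetical schema, for illustration only)."""
    name: str
    description: str                                # text description
    instances: set = field(default_factory=set)     # known instance values
    topics: set = field(default_factory=set)        # topic labels
    neighbors: set = field(default_factory=set)     # names of related entities


def tokens(text):
    """Lowercased whitespace tokens of a string."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def select_candidates(mention, kb):
    """Hybrid selection: keep entities whose name or description shares a
    token with the mention, or whose instance data contains the mention."""
    m = tokens(mention)
    return [
        e for e in kb
        if m & tokens(e.name + " " + e.description)
        or mention.lower() in {i.lower() for i in e.instances}
    ]


def score(entity, record_tokens, record_topics, linked_names,
          weights=(0.4, 0.3, 0.3)):
    """Weighted sum of context, topic, and semantic-correlation similarity.
    The weights are arbitrary placeholders, not values from the paper."""
    context = jaccard(record_tokens, tokens(entity.description))
    topic = jaccard(record_topics, entity.topics)
    correlation = jaccard(linked_names, entity.neighbors)
    w_c, w_t, w_s = weights
    return w_c * context + w_t * topic + w_s * correlation


def link(mention, record, record_topics, linked_names, kb):
    """Return the best-scoring candidate entity, or None if there is none."""
    candidates = select_candidates(mention, kb)
    if not candidates:
        return None
    record_tokens = tokens(record)
    return max(
        candidates,
        key=lambda e: score(e, record_tokens, record_topics, linked_names),
    )
```

For a short record such as "apple phones 64gb", the token overlap alone barely separates a company entity from a fruit entity, so the topic and correlation terms carry most of the decision. This is the situation the abstract describes for short texts with little context.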
